Google DeepMind makes 4-dimensional videos from 2D inputs

D4RT, short for Dynamic 4D Reconstruction and Tracking, can calculate any pixel in a 3D world in an instant. (Picture: Google).

Google’s new D4RT is a breakthrough in converting two-dimensional video, from a camera or a file, into 4D content.

This means it can reconstruct a 3D world from a 2D camera feed and also calculate the fourth dimension: time and movement.

“Anytime we look at the world, we perform an extraordinary feat of memory and prediction. We see and understand things as they are at a given moment in time, as they were a moment ago, and how they are going to be in the moment to follow,” Google writes.

This capability will be especially useful for training robots, which need real-time spatial awareness of the world around them and might not have fancy 3D cameras.

The technology is 18x to 300x faster than previous state-of-the-art models, Google says, which makes the conversion work in real time.

Beyond robotics, the use cases include making 3D models for augmented reality in smart glasses and building world models, which is the real advantage on the path to creating world-awareness for artificial general intelligence.

Read more: Google’s blog, Launch thread, research paper.
