Google DeepMind makes 4-dimensional videos from 2D inputs

D4RT, Dynamic 4D Reconstruction and Tracking, can calculate any pixel for 3D worlds in an instant. (Picture: Google).
Google’s new D4RT is a breakthrough in converting two-dimensional video, from a camera or a file, into 4D content.

This means it can build 3D worlds from 2D camera footage and also calculate the fourth dimension, which is time and movement.

“Anytime we look at the world, we perform an extraordinary feat of memory and prediction. We see and understand things as they are at a given moment in time, as they were a moment ago, and how they are going to be in the moment to follow,” Google writes.

This will be especially useful for training robots, which need real-time spatial awareness of the world around them and might not have fancy 3D cameras.

The technology is 18x to 300x faster than previous state-of-the-art models, Google says, which makes the conversion work in real time.

Beyond robotics, the use cases include making 3D models for augmented reality in smart glasses and building world models, which is the real advantage on the path to creating world-awareness for artificial general intelligence.

Read more: Google’s blog, launch thread, research paper.