
InfiniteNature-Zero: Perpetual View Generation Learned On Single Images

InfiniteNature-Zero is an improved version of Infinite Nature; both are models that generate video from a single input image. The new model delivers higher-quality output and an improved training process compared to its predecessor.
Created on July 25|Last edited on July 25
A new model for perpetual view generation from individual images, called InfiniteNature-Zero, has been released. It builds on an earlier model from late 2021 called Infinite Nature. The goal of both models is to generate videos that show a camera's perspective moving through a scene indefinitely, starting from only a single input image. Here's what that looks like:

These videos were generated by InfiniteNature-Zero based on only the image seen in the first frame. Impressively, the motion of the "camera" can do more than just fly forwards in a straight line - it can bob up and down and turn left and right as well, all completely handled by the generative AI model.

How does InfiniteNature-Zero improve Infinite Nature?

As mentioned earlier, the new model builds on an older one; both were developed predominantly by AI researchers at Google Research. InfiniteNature-Zero is a clear improvement over its predecessor.
The main difference is what gives InfiniteNature-Zero its name: while Infinite Nature was trained on video data accompanied by generated information about the camera's physical location in 3D space, along with point maps describing 3D terrain, InfiniteNature-Zero was trained without any of that additional data. Furthermore, instead of being trained on video at all, InfiniteNature-Zero was trained on individual images.
The researchers show that individual images are sufficient for training a model for this purpose: their model generates higher-quality, more believable videos than the earlier Infinite Nature model.

How does InfiniteNature-Zero work?

InfiniteNature-Zero generates motion videos from a single input image by recursively generating one "frame" forward, using each newly generated frame as input for creating the next. During frame generation, other models are brought in to refine and give detail to the generated frame so that it looks more realistic.
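The recursive loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the researchers' code: a "frame" here is a short row of numbers, and `toy_warp`, `toy_refine`, and `generate_frames` are hypothetical names standing in for the camera move, the refinement models, and the generation loop.

```python
from typing import Callable, List, Optional, Sequence

# Toy "frame": a 1-D row of pixel values; None marks a pixel with no content yet.
Frame = List[Optional[int]]


def toy_warp(frame: Frame, shift: int) -> Frame:
    """Hypothetical camera move: slide pixels by `shift`, leaving holes at the edge."""
    out: Frame = [None] * len(frame)
    for i, value in enumerate(frame):
        j = i - shift
        if 0 <= j < len(frame):
            out[j] = value
    return out


def toy_refine(frame: Frame) -> Frame:
    """Hypothetical refinement: fill each hole from a known neighbouring pixel."""
    out = list(frame)
    # First pass fills holes from the left, second pass fills leftovers from the right.
    for indices in (range(len(out)), reversed(range(len(out)))):
        last = None
        for i in indices:
            if out[i] is None:
                out[i] = last
            else:
                last = out[i]
    return out


def generate_frames(
    first_frame: Frame,
    camera_shifts: Sequence[int],
    warp: Callable[[Frame, int], Frame],
    refine: Callable[[Frame], Frame],
) -> List[Frame]:
    """Generate one frame per camera move, feeding each output back in as input."""
    frames = [first_frame]
    current = first_frame
    for shift in camera_shifts:
        current = refine(warp(current, shift))
        frames.append(current)
    return frames


frames = generate_frames([1, 2, 3, 4], [1, 1, -1], toy_warp, toy_refine)
for index, frame in enumerate(frames):
    print(index, frame)
```

The point of the sketch is the feedback: every generated frame becomes the input for the next step, so the loop can run for as many camera moves as you supply.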
During training, InfiniteNature-Zero learns to reconstruct a given image from altered versions of that image, which are presented as if they were the previous and next frames in a video. Think of the training input image as frame 𝑛 of a video: InfiniteNature-Zero is provided faked 𝑛-1 and 𝑛+1 frames, and must try to recreate the frame at 𝑛 by itself.
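To make the training idea concrete, here is a minimal Python sketch under heavy simplifying assumptions. The "image" is a row of numbers, the faked neighbouring frames are produced by a simple pixel shift, and the loss is a mean absolute error; `toy_shift`, `make_fake_neighbors`, `reconstruct`, and `reconstruction_loss` are hypothetical names, and the real model is a learned network rather than the hand-written merge shown here.

```python
from typing import List, Optional, Tuple

Frame = List[Optional[int]]  # None marks a pixel that the shift left empty


def toy_shift(frame: Frame, shift: int) -> Frame:
    """Hypothetical stand-in for altering the image: slide pixels, leaving holes."""
    out: Frame = [None] * len(frame)
    for i, value in enumerate(frame):
        j = i - shift
        if 0 <= j < len(frame):
            out[j] = value
    return out


def make_fake_neighbors(image: Frame, shift: int = 1) -> Tuple[Frame, Frame]:
    """Build faked n-1 and n+1 frames from a single image (frame n)."""
    return toy_shift(image, -shift), toy_shift(image, shift)


def reconstruct(prev_frame: Frame, next_frame: Frame, shift: int = 1) -> Frame:
    """Toy 'model': undo both shifts and merge whatever pixels survived."""
    from_prev = toy_shift(prev_frame, shift)
    from_next = toy_shift(next_frame, -shift)
    return [p if p is not None else n for p, n in zip(from_prev, from_next)]


def reconstruction_loss(predicted: Frame, target: List[int]) -> float:
    """Mean absolute error against the original image; missing pixels count as 0."""
    return sum(abs((p or 0) - t) for p, t in zip(predicted, target)) / len(target)


image = [10, 20, 30, 40]
prev_frame, next_frame = make_fake_neighbors(image)

good = reconstruct(prev_frame, next_frame)
naive = [0 if v is None else v for v in prev_frame]  # just copy the faked n-1 frame

loss_good = reconstruction_loss(good, image)
loss_naive = reconstruction_loss(naive, image)
print(loss_good, loss_naive)
```

The supervision target is the original image itself, which is why this style of training needs neither video nor camera annotations.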

By learning how to create frames based on camera motion and an input frame, the model can be used to indefinitely create frames as the virtual camera moves around. To ensure high quality frames continue to be generated, a GAN system was introduced as well.
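For readers unfamiliar with GANs, the sketch below shows the general shape of an adversarial objective in Python. It is an assumption-laden illustration: the article does not say which adversarial loss InfiniteNature-Zero uses, so this shows the common binary cross-entropy formulation, and `bce`, `discriminator_loss`, and `generator_loss` are hypothetical helper names.

```python
import math


def bce(prob: float, label: float) -> float:
    """Binary cross-entropy for one predicted probability (eps avoids log(0))."""
    eps = 1e-12
    return -(label * math.log(prob + eps) + (1.0 - label) * math.log(1.0 - prob + eps))


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """The discriminator wants real frames scored near 1 and generated frames near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake: float) -> float:
    """The generator wants its frames to be scored as real (near 1)."""
    return bce(d_fake, 1.0)


# d_real / d_fake are the discriminator's probabilities that a frame is real.
print(generator_loss(0.9), generator_loss(0.1))
print(discriminator_loss(0.9, 0.1), discriminator_loss(0.5, 0.5))
```

The generator's loss falls as its frames become harder to tell apart from real images, which is the pressure that helps keep long generated sequences looking realistic.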

Find out more

A GitHub repository with project code will be coming soon.
Tags: ML News