Video generation Archives

Video generation models as world simulators

February 15, 2024

We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions, and aspect ratios. We use a transformer architecture that operates on spacetime patches of video and image latent codes. Our largest model, Sora, is capable of generating a minute of high-fidelity video. Our results suggest that scaling video generation models is a promising path towards building general-purpose simulators of the physical world.
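To make the idea of "spacetime patches" concrete, here is a minimal sketch of how a latent video might be cut into a sequence of patch tokens. It is an illustration only, not Sora's implementation: the function name `spacetime_patches`, the `(T, H, W, C)` latent layout, and the patch size `(2, 4, 4)` are all assumptions chosen for the example, and the source does not specify any of them.

```python
import numpy as np


def spacetime_patches(latent, pt, ph, pw):
    """Split a latent of shape (T, H, W, C) into flattened spacetime patches.

    Hypothetical helper for illustration. Returns an array of shape
    (num_patches, pt * ph * pw * C), one row per patch.
    """
    T, H, W, C = latent.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("latent dimensions must be divisible by the patch size")
    # Split each axis into (number of patches, extent within a patch).
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the three patch-count axes to the front.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten: one row per patch, ordered by (time, height, width) index.
    return x.reshape(-1, pt * ph * pw * C)


# Two latents with different durations and aspect ratios (shapes are made up).
clip_a = np.zeros((8, 16, 24, 4), dtype=np.float32)
clip_b = np.zeros((4, 24, 16, 4), dtype=np.float32)

tokens_a = spacetime_patches(clip_a, pt=2, ph=4, pw=4)
tokens_b = spacetime_patches(clip_b, pt=2, ph=4, pw=4)
print(tokens_a.shape, tokens_b.shape)
```

Under these assumptions, inputs of different duration and aspect ratio yield token sequences of different lengths but with the same per-token dimension, which is the kind of input a transformer can process.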

Category: Video generation
