Rohit Girdhar
@_rohitgirdhar_
Research Scientist at Meta GenAI
Super excited to share MovieGen: a new SOTA media generation system! When we started, I didn’t think we’d get this far this quickly. But it turns out a simplified approach (flow matching), paired with scaling up model size and data, indeed works amazingly well! Details in the paper 😀
🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in…
@CVPR is around the corner!! Join us at the Workshop on T4V at #CVPR2025 with a great speaker lineup (@MikeShou1, @jw2yang4ai, @WenhuChen, @roeiherzig, Yuheng Li, Kristen Grauman) covering diverse topics! Website: sites.google.com/view/t4v-cvpr2… #CVPR #Transformer #Vision #T4V2025 #T4V
And check out another paper we just put online: DiTo! A new image/video tokenization approach, trained purely using diffusion, modernizing the tokenization pipeline and making it a lot simpler and more scalable!
Introducing “Diffusion Autoencoders are Scalable Image Tokenizers” (DiTo). We show that with proper designs and scaling up, diffusion autoencoders (a single L2 loss) can outperform the GAN-LPIPS tokenizers (hybrid losses) used in current SOTA generative models. (1/4)
VERY excited about the era of generative AR we're bringing to life. Check out this preview! It's early but so damn promising — this isn't "AI slop"... it's unlocking Creators' imaginations on their own videos. Change your wardrobe, scene, lighting etc. with little expertise. PS…