Tianwei Yin
@TianweiY
@reveimage. Prev: @MIT @Adobe
Video diffusion models generate high-quality videos but are too slow for interactive applications. We @MIT_CSAIL @AdobeResearch introduce CausVid, a fast autoregressive video diffusion model that starts playing the moment you hit "Generate"! A thread 🧵
I’ve joined @reveimage to lead storytelling! So excited to be building the future of AI-powered creative expression by shaping our brand, community, and GTM efforts. Couldn’t be luckier to create a powerhouse product and share it with the world alongside the most stellar team 🌟
(for the curious ones, 1K likes to this Space translates to almost half a million videos generated on the Space! 🤯 go check it out: huggingface.co/spaces/multimo…)
🚀 CausVid (causvid.github.io) is powering the first real-time, audio-driven AI avatars at @character_ai — amazing work! Real-time video models open the door to countless interactive experiences. Excited to see what comes next! blog.character.ai/character-ais-…
Impressive work with our text-to-image preview!
Meet Benjamin (@Ben__Springer) — a CGI 3D artist pioneering AI-driven immersive filmmaking. See how Reve’s stunning quality and next-level details make it Benjamin’s go-to image model for creative exploration and production ☄️
Impressive new autoregressive video diffusion by Xun and team. A key challenge in AR diffusion is error accumulation—causing flicker and overexposure in long videos. This work tackles it with a surprisingly simple fix: train on the model’s own rollouts. It’s made possible thanks…
Real-time video generation is finally real — without sacrificing quality. Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models. The key to high quality? Simulate the inference process during training by unrolling transformers with KV caching.
Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training), a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch…
CausVid is a criminally under-hyped distillation technique from @TianweiY (MIT and Adobe), which had a LoRA extracted from it by @Kijaidesign. It's now supported on diffusers 🧨; you can use the demo here: huggingface.co/spaces/multimo…
A new AI model from MIT CSAIL researchers crafts smooth, high-quality videos in seconds: bit.ly/3GKkSJQ Their "CausVid" system combines the photorealism of diffusion models w/the speed of autoregressive approaches. Take a peek at its most compelling visuals 🧵
It doesn't have to be lonely at the top. Congratulations to GPT-4o for rocketing up the @ArtificialAnlys leaderboard and joining @reveimage.
OpenAI’s GPT-4o Image Generation debuts in equal first place by ELO score in the Artificial Analysis Image Arena, outperforming Recraft V3, FLUX 1.1 [pro] and Gemini 2.0 Flash. @OpenAI last week launched GPT-4o Image Generation, upgrading ChatGPT’s built-in image generation…
Super excited to share that I’ve officially defended my PhD, wrapped up an incredible journey at @MIT and @AdobeResearch, and joined @reveimage! Thrilled to be working alongside the same amazing founders I teamed up with back in the Adobe days. That experience gave me deep…
Halfmoon is Reve Image — and it’s the best image model in the world 🥇 (🔊)
I’m extremely excited to finally reveal what I’ve been working on for the last year and a half. (1 of 7)
🤩
Excited to come out of stealth at @reveimage! Today's text-to-image/video models, in contrast to LLMs, lack logic. Images seem plausible initially but fall apart under scrutiny: painting techniques don't match, props don't carry meaning, and compositions lack intention. (1/4)
Creative generative media has been my passion since my early days at Google Brain Research. I am beyond excited to finally share what we have been building for the past few months! The team is very small but the most creative and scientifically rigorous 😍 This is just the…
Just launched Reve Image 1.0! 🚀 "Rêve" means dream in French—and we're turning that dream into reality. Incredibly proud of our team's work. And trust me, we're only getting started. ✨ @reveimage
Today's visual generative models are mere stochastic parrots of imagery, much like early language models, which could only statistically mimic short sentences with little reasoning. In contrast, modern large language models (LLMs) can comprehend long documents, keep track of…
Excited to see our hard work come to life today! Honored to work with an incredible team. From research to product, this journey has been nothing short of transformative. #ProductLaunch #Teamwork #Innovation
Check out the impressive work by Yuanzhi on distilling **discrete** masked diffusion models into a one-step generator!
Masked Diffusion Models (MDMs) are a hot topic in generative AI 🔥 — powerful but slow due to multiple sampling steps. We @Polytechnique and @Inria introduce Di[M]O — a novel approach to distill MDMs into a one-step generator without sacrificing quality.
A simple but effective trick for enhancing the diversity of distilled models. Really great work!
Why do distilled diffusion models generate similar-looking images? 🤔 Our Diffusion Target (DT) visualization reveals the secret to diversity. It is the very first time-step! And—there is a simple, training-free way to make them more diverse! Here is how: 🧵👇