Xun Huang
@xunhuang1995
Interactive Video World Model @AdobeResearch, Adjunct Professor @CarnegieMellon, ex-@NVIDIAAI, Ph.D. @Cornell, Snap & NVIDIA & Adobe Fellowship Recipient.
Real-time video generation is finally real — without sacrificing quality. Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models. The key to high quality? Simulate the inference process during training by unrolling transformers with KV caching.
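To make the idea concrete, here is a minimal, self-contained sketch of training-time autoregressive unrolling with a KV cache (all class and function names are hypothetical stand-ins, not the released Self-Forcing code): the model generates each frame conditioned on its own previous outputs, caching keys/values as it goes, and the loss is backpropagated through the whole rollout so training sees the same distribution as inference.

```python
# Toy sketch of the core Self-Forcing idea (hypothetical names, not the released code):
# unroll the model on its own outputs during training, reusing a KV cache,
# then apply a loss on the full rollout.
import torch
import torch.nn as nn

class ToyCausalFrameModel(nn.Module):
    """Stand-in for a causal video diffusion transformer (one token per frame)."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj_kv = nn.Linear(dim, 2 * dim)
        self.proj_q = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def step(self, frame, kv_cache):
        # Append this frame's keys/values to the cache (the KV-caching trick).
        k, v = self.proj_kv(frame).chunk(2, dim=-1)
        kv_cache["k"].append(k)
        kv_cache["v"].append(v)
        K = torch.stack(kv_cache["k"], dim=1)        # (B, T, D)
        V = torch.stack(kv_cache["v"], dim=1)
        q = self.proj_q(frame).unsqueeze(1)          # (B, 1, D)
        attn = torch.softmax(q @ K.transpose(1, 2) / K.shape[-1] ** 0.5, dim=-1)
        return self.out((attn @ V).squeeze(1))       # predicted next frame

def self_forcing_rollout(model, first_frame, num_frames):
    """Unroll the model on its own outputs, exactly as at inference time."""
    kv_cache = {"k": [], "v": []}
    frames, frame = [first_frame], first_frame
    for _ in range(num_frames - 1):
        frame = model.step(frame, kv_cache)          # condition on generated frames
        frames.append(frame)
    return torch.stack(frames, dim=1)                # (B, T, D), differentiable

model = ToyCausalFrameModel()
rollout = self_forcing_rollout(model, torch.randn(2, 64), num_frames=8)
loss = rollout.pow(2).mean()                         # placeholder for a real objective
loss.backward()                                      # gradients flow through the unroll
```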
As an aside, we didn't announce on Friday because we respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts and the students had rightly received the acclaim they deserved.
We should have called it "scaling up rollouts", not RL. RL is a necessary evil given the discrete nature of language. My intuition tells me that using RL for continuous data (images, videos, audio), where differentiable supervision is readily available, is a terrible idea.
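A toy contrast of the two gradient estimators this intuition is about (my own illustration, not from the thread; the reward function is made up): direct backprop through a differentiable reward versus an RL-style score-function update that only sees a scalar outcome.

```python
# Toy illustration: for continuous outputs, a differentiable reward lets you
# backprop directly, while a REINFORCE-style estimator only nudges parameters
# based on a black-box scalar and is much noisier.
import torch

theta = torch.zeros(2, requires_grad=True)            # "generator" parameters

def reward(x):                                         # differentiable quality measure
    return -(x - torch.tensor([1.0, -2.0])).pow(2).sum()

# (a) Direct differentiable supervision: backprop through a reparameterized sample.
eps = torch.randn(2)
x = theta + 0.1 * eps
(-reward(x)).backward()
grad_direct = theta.grad.clone()
theta.grad.zero_()

# (b) RL-style score-function update: only the log-prob is differentiated,
# the reward is treated as a black-box scalar.
dist = torch.distributions.Normal(theta, 0.1)
x = dist.sample()                                      # non-differentiable sample
(-(reward(x).detach() * dist.log_prob(x).sum())).backward()
grad_rl = theta.grad.clone()

print(grad_direct, grad_rl)                            # RL gradient has far higher variance
```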
Scaling up RL is all the rage right now; I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly…
Welcome to the Era of Real-Time Experience
Real-time video generation is arriving faster than most realize. StreamDiT, CausVid, Self-Forcing, and Seaweed APT2 represent a new class of model architectures that enable temporal consistency and fast frame rates. A new era is emerging.
Wow, nearly half a million videos have been created with CausVid! Huge thanks to @multimodalart and the incredible open-source community for expanding the codebase and building such impressive demos!
(for the curious ones, 1K likes to this Space translates to almost half a million videos generated on the Space! 🤯 go check it out: huggingface.co/spaces/multimo…)
🚀 CausVid (causvid.github.io) is powering the first real-time, audio-driven AI avatars at @character_ai — amazing work! Real-time video models open the door to countless interactive experiences. Excited to see what comes next! blog.character.ai/character-ais-…
Self-Forcing is astonishingly powerful: beyond real-time video generation, it could dramatically boost language models, audio, and reinforcement learning | shi3z @shi3z note.com/shi3zblog/n/n8…
this is not a drill 🚨, real-time open-source video generation is here 🔥 Self-Forcing, a real-time video model distilled from Wan 2.1 by @Adobe, is out, and they open-sourced it 🐐 I've built a live real-time demo on @huggingface Spaces 📹💨
🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout and achieves up to 2.0x wall-clock time speedup in training. More rollouts lead to better model performance, but they’re also a major bottleneck in…
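The general idea of pre-rollout prompt filtering can be sketched as follows (a hand-wavy illustration with made-up names and a simple zero-variance heuristic, not GRESO's actual selection rule): skip prompts whose recent rewards show essentially no spread, since all-correct or all-wrong rollout groups contribute almost no gradient signal in group-relative RL.

```python
# Hedged sketch of pre-rollout prompt filtering (not GRESO's actual criterion):
# before spending compute on rollouts, skip prompts whose recent reward history
# shows (near-)zero variance.
from collections import defaultdict, deque
import statistics

reward_history = defaultdict(lambda: deque(maxlen=8))   # prompt_id -> recent rewards

def is_informative(prompt_id, min_history=4, eps=1e-3):
    hist = reward_history[prompt_id]
    if len(hist) < min_history:
        return True                                      # not enough evidence yet, keep it
    return statistics.pvariance(hist) > eps              # skip saturated prompts

def select_prompts(batch):
    # `batch` is assumed to be a list of dicts with an "id" field (hypothetical schema).
    return [p for p in batch if is_informative(p["id"])]

def record(prompt_id, reward):
    # Call after each rollout so future batches can be filtered.
    reward_history[prompt_id].append(reward)
```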
NVIDIA wants to sell you an NVL72 rack ($3M) so you can do real-time video generation 😅 Good thing: you don't need it. Self-Forcing does the job with a single 4090, and with better quality 😊 self-forcing.github.io

We built a real-time audio-video world model (& showed it off at CVPR)! 🎥 360p 🔊 with synced audio ⚡️ 10 fps on a gaming laptop (faster than our H100s!) 📦 image+audio VAE & a causal diffusion WM 🔥 built and trained in under 72 hours! See below for our technical blog 1/4 🧵
Hello @MiniMax__AI, exciting model, but the claim of better reasoning scaling than @deepseek_ai and @Alibaba_Qwen is questionable. Nice try on reasoning longer to reach SOTA, but using FLOPs to quantify the cost of test-time scaling doesn't work for hybrid models 🫣 @chenzhuoming911 has…
Day 1/5 of #MiniMaxWeek: We're open-sourcing MiniMax-M1, our latest LLM, setting new standards in long-context reasoning.
- World's longest context window: 1M-token input, 80k-token output
- State-of-the-art agentic use among open-source models
- RL at unmatched efficiency:…
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation.
🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%.
🌐 Website: multiverse4fm.github.io
🧵 1/n