Enze Xie
@xieenze_jr
Staff Research Scientist at NVIDIA, doing GenAI, CS PhD from HKU MMLab, interned at NVIDIA.
I gave an oral presentation on our image generation foundation model, SANA, at ICLR 2025 in Singapore. Feel free to follow along! We are also developing the SANA Video model, stay tuned! ICLR oral replay: youtube.com/watch?v=rrKFyY…
SANA-Sprint was selected as a **Highlight** paper at ICCV 2025!🎉
🚀🔥SANA-Sprint: One-step text-to-image generation in 0.1s with SANA! arxiv.org/abs/2503.09641 SANA-Sprint delivers: ⚡️ 1024×1024 images in 0.1s on H100 🏆 SOTA: 7.59 FID in ONE step ⚙️ 10× faster than FLUX-Schnell 💻 Deployable on a laptop (0.31s on RTX 4090) Code coming soon!🥳
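For anyone who wants to try one-step generation once the code lands, here is a minimal sketch using the Diffusers integration. The pipeline class and checkpoint id below are assumptions based on the public SANA-Sprint release; check the Diffusers docs and the model card for the current API.

```python
# Minimal sketch of one-step 1024x1024 generation with SANA-Sprint via Diffusers.
# The SanaSprintPipeline class and the checkpoint id are assumptions; verify
# both against the current Diffusers docs and the NVlabs/Sana model cards.
import torch
from diffusers import SanaSprintPipeline

pipe = SanaSprintPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_Sprint_1.6B_1024px_diffusers",  # assumed checkpoint id
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# One sampling step is the whole point: the distilled model maps noise
# (almost) directly to an image.
image = pipe(
    prompt="a cyberpunk cat wearing a spacesuit, studio lighting",
    num_inference_steps=1,
).images[0]
image.save("sana_sprint_1step.png")
```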
🚀 Big update for Fast-dLLM! ✨ Introduced a Factor-based parallel decoding strategy 📈 Achieves 50% higher throughput vs. threshold-based, with minimal accuracy loss 🎨 Supports multimodal diffusion LLMs (LLaDA-V) — 10× faster 📊 Includes analysis of decoding dynamics 📜…

Thank you, Felix, for hosting this excellent workshop!
Enze Xie, NVIDIA: "Building Image Generation Model from Scratch and Acceleration" @xieenze_jr
I will be attending CVPR from June 11-15. Feel free to grab a coffee with me! I will also share our team's research on Efficient Image Generation (including SANA, DC-AE, VILA-U, HART) in two workshops on June 12. 👀 coop-intelligence.github.io…
🚀The code for Fast-dLLM is now open-source! 💥 Fast-dLLM achieves a 27.6× end-to-end speedup on 1024-token sequences with less than 2% accuracy drop. Check out the code here: github.com/NVlabs/Fast-dL…
🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥 Key Features🌟 - Block-Wise KV Cache Reuses 90%+ attention activations via bidirectional caching (prefix/suffix), enabling 8.1×–27.6× throughput gains with <2% accuracy loss 🔄 -…
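To make the threshold-based parallel decoding idea concrete, here is an illustrative sketch, not the NVlabs/Fast-dLLM API: at each refinement step, every masked position whose top-token confidence clears a threshold is committed at once, instead of one token per step. The function and variable names are hypothetical, and the block-wise KV cache that the repo layers on top is omitted here.

```python
# Illustrative sketch of confidence-threshold parallel decoding for a masked
# diffusion LLM (the idea behind Fast-dLLM); NOT the NVlabs/Fast-dLLM API.
# `model` is assumed to map a (seq_len,) token tensor to (seq_len, vocab) logits.
import torch

def parallel_decode(model, tokens, mask_id, threshold=0.9, max_steps=64):
    """Iteratively fill masked positions, committing every position whose
    predicted-token probability exceeds `threshold` in the same step."""
    for _ in range(max_steps):
        masked = tokens == mask_id
        if not masked.any():
            break  # everything decoded
        logits = model(tokens)              # (seq_len, vocab)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)      # per-position confidence and argmax token
        # Commit all sufficiently confident masked positions in parallel.
        commit = masked & (conf >= threshold)
        if not commit.any():
            # Fallback: commit the single most confident masked position so
            # decoding always makes progress.
            idx = torch.where(masked)[0]
            best = idx[conf[idx].argmax()]
            commit = torch.zeros_like(masked)
            commit[best] = True
        tokens = torch.where(commit, pred, tokens)
    return tokens
```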

I know you have always secretly craved a cool distillation script that actually gets results. That time has come 🤯 In collaboration w/ @lawrence_cjs & Shuchen Xue, we present a Diffusers-compatible training script for SANA Sprint 🏃 Links ⬇️
Deep compression in VAEs is hard, but this paper beautifully explains how to achieve it.
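A quick back-of-the-envelope on why deep compression matters (illustrative numbers, not from the paper): the autoencoder's spatial downsampling factor f sets how many latent tokens the diffusion backbone must process.

```python
# Rough token-count arithmetic for a latent diffusion model, assuming a
# patch size of 1 on the latent grid; purely illustrative.
def latent_tokens(image_size: int, f: int) -> int:
    side = image_size // f    # latent grid side length
    return side * side        # tokens the diffusion backbone sees

for f in (8, 32, 64):
    print(f"f={f:>2}: {latent_tokens(1024, f):>5} tokens")
# f= 8: 16384 tokens
# f=32:  1024 tokens
# f=64:   256 tokens
```

Since attention cost grows quadratically with token count, going from f=8 to f=32 cuts the sequence 16× and attention FLOPs roughly 256×; the hard part, which the paper tackles, is keeping reconstruction quality at such high compression.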
I'll be attending ICLR from April 24th to 28th in Singapore—hope to see you there! I'd be delighted to grab coffee and chat about topics such as #GenerativeAI and #EfficientAI

I tried WriteHere's Deep Research report feature, and the results were truly impressive! 🤩 Congrats @Beastlyprime !
What if AI could write creative stories & insightful #DeepResearch reports like an expert? Our heterogeneous recursive planning [1] enables this via adaptive subgoals [2] & dynamic execution. Agents dynamically replan & weave retrieval, reasoning, & composition mid-flow. Explore…
🚀 SANA 1.5 Update: Inference Scaling Now Open-Source! 🎉 📈 Breakthrough on GenEval benchmark: • SANA 1.5 + Inference Scaling: 0.81 → 0.96 (!!) 🎯 • SD 1.5 + Inference Scaling: 0.42 → 0.87 ⬆️ 💫 The secret sauce: 1. Generate n candidates 🎨 2. Pick top k with NVILA…
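The inference-scaling recipe in the thread is essentially best-of-n sampling with a VLM judge. Here is a hedged sketch of that loop; `generate` and `score` are hypothetical stand-ins for the SANA pipeline and an NVILA-based scorer, not the repo's actual interfaces.

```python
# Hedged sketch of best-of-n inference scaling: generate n candidates,
# score each with a judge model, keep the top k. The two callables are
# hypothetical placeholders, not the NVlabs/Sana API.
from typing import Any, Callable, List

def inference_scale(
    prompt: str,
    generate: Callable[[str], Any],       # e.g. one SANA sample per call
    score: Callable[[str, Any], float],   # e.g. NVILA judging prompt-image match
    n: int = 32,
    k: int = 4,
) -> List[Any]:
    candidates = [generate(prompt) for _ in range(n)]
    ranked = sorted(candidates, key=lambda img: score(prompt, img), reverse=True)
    return ranked[:k]  # best k images by judge score
```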

I will attend GTC 2025 at NVIDIA Santa Clara from March 17 to 21. Coffee chats welcome! ☕️
👀 The SANA-1.5 4.8B checkpoint is released! 🎉🥳 It is much better than SANA-1.0 1.6B. We also released more training code, e.g. FSDP support, a webdataset loader, and a multi-scale image sampler. Training starts with one click, feel free to try it! github.com/NVlabs/Sana?ta… We will release Sprint soon 😎
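For context on one of those features: a multi-scale image sampler typically groups samples into resolution buckets so every batch is shape-uniform. A minimal sketch of that idea, purely illustrative and not the NVlabs/Sana implementation:

```python
# Hedged sketch of a multi-scale (resolution-bucket) batch sampler: group
# samples by target (H, W) so each batch has a uniform shape. Illustrative
# only; not the NVlabs/Sana implementation.
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

def bucket_batches(
    resolutions: Sequence[Tuple[int, int]],  # per-sample (H, W) targets
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[int]]:
    buckets: Dict[Tuple[int, int], List[int]] = defaultdict(list)
    for idx, hw in enumerate(resolutions):
        buckets[hw].append(idx)
    rng = random.Random(seed)
    batches: List[List[int]] = []
    for idxs in buckets.values():
        rng.shuffle(idxs)
        # Only full batches keep shapes uniform; the remainder is dropped.
        batches += [idxs[i:i + batch_size]
                    for i in range(0, len(idxs) - batch_size + 1, batch_size)]
    rng.shuffle(batches)  # mix resolutions across training steps
    yield from batches
```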
The best few-step sampling model across the speed-memory frontier? 😱 Introducing SANA-Sprint in collaboration with the great SANA team! Beyond the results, perhaps more importantly, the work is about the recipe of SANA-Sprint. Code & model will be open ❤️ Let's go ⬇️
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
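As a one-line refresher on what continuous-time consistency distillation trains for (standard consistency-model background, not the paper's exact notation): the student must map every point on a probability-flow ODE trajectory to the same clean sample, and in the continuous-time limit this becomes a constraint on the total time derivative along the trajectory.

```latex
% Self-consistency along a PF-ODE trajectory:
f_\theta(\mathbf{x}_t, t) = f_\theta(\mathbf{x}_{t'}, t')
  \quad \forall\, t, t' \in [\epsilon, T],
% which the continuous-time limit enforces by driving the total derivative
% along the trajectory to zero:
\frac{\mathrm{d}}{\mathrm{d}t} f_\theta(\mathbf{x}_t, t)
  = \partial_t f_\theta(\mathbf{x}_t, t)
  + \nabla_{\mathbf{x}_t} f_\theta(\mathbf{x}_t, t) \cdot \frac{\mathrm{d}\mathbf{x}_t}{\mathrm{d}t}
  = 0,
% where dx_t/dt is the PF-ODE velocity supplied by the teacher diffusion model.
```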