Yunzhi Zhang
@zhang_yunzhi
CS PhD @Stanford
🌟Got multiple expert models and want them to steer your image/video generation? We’ve re-implemented the Product of Experts for Visual Generation paper on a toy example and broken it down step by step in our new blog post! Includes: - GitHub repo: Annealed Importance…
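(The post above references the product-of-experts idea; as a hedged toy sketch — not the paper's or blog's actual implementation — here is the core trick on a 1D example: each "expert" contributes a score (gradient of log-density), the scores are summed, and Langevin dynamics samples from the product distribution. The function names and Gaussian experts here are illustrative assumptions.)

```python
import numpy as np

# Toy product-of-experts sampling sketch (illustrative only):
# the product of N(mu1, s1^2) and N(mu2, s2^2) is itself Gaussian with
# precision 1/s1^2 + 1/s2^2, so we can check the sampler analytically.

def gaussian_score(x, mu, sigma):
    """Score (gradient of log-density) of N(mu, sigma^2) at x."""
    return (mu - x) / sigma**2

def poe_langevin(mu1, s1, mu2, s2, steps=5000, step_size=0.01, seed=0):
    """Sample the product of two Gaussian experts via unadjusted Langevin."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=2000)  # a batch of particles
    for _ in range(steps):
        # Product of experts: log-densities add, so scores add.
        score = gaussian_score(x, mu1, s1) + gaussian_score(x, mu2, s2)
        x = x + step_size * score + np.sqrt(2 * step_size) * rng.normal(size=x.shape)
    return x

samples = poe_langevin(mu1=-1.0, s1=1.0, mu2=3.0, s2=1.0)
# Analytic product mean: (mu1 + mu2) / 2 = 1.0 for equal sigmas.
print(samples.mean())
```

The same additivity of scores is what lets multiple pretrained experts jointly steer one sampling chain in the visual-generation setting.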
Nicely put @_kevinlu. A fruitful path: not just collecting offline snapshots of human experience, but also gathering Internet-scale samples of MDPs and deploying AI models to interact within them. We are blessed that supervised pre-training warm-starts the system for better-informed exploration.
Why you should stop working on RL research and instead work on product // The technology that unlocked the big scaling shift in AI is the internet, not transformers I think it's well known that data is the most important thing in AI, and also that researchers choose not to work…
Afternoon session starts! @jesu9 leading discussions on concept recognition with VLMs.
Happening now in Room 101A! Daniel Ritchie opening up with programmatic visual concept representations. #CVPR2025
🚀 Excited to announce our CVPR 2025 Workshop: 3D Digital Twin: Progress, Challenges, and Future Directions 🗓 June 12, 2025 · 9:00 AM–5:00 PM 📢 Incredible lineup: @rapideRobot, Andrea Vedaldi @Oxford_VGG, @richardzhangsfu, @QianqianWang5, Dr. Xiaoshuai Zhang @Hillbot_AI,…

The submission deadline for the Workshop on Visual Concepts @CVPR is extended to April 15. #CVPR2025 As visual generative and perception modeling rapidly evolve, it's a great time to join us (and an incredible speaker lineup!) for discussions. More info: sites.google.com/stanford.edu/v…

State-of-the-art zero-shot customized image generation by @prime_cai, Eric Chan, @zhang_yunzhi, Leo Guibas, @jiajunwu_cs !
Diffusion Self-Distillation app is out on Hugging Face! It redefines zero-shot customized image generation using FLUX. DSD is like DreamBooth, but zero-shot/training-free. It works across any input subject and desired context—character consistency, item/asset adaptation, scene…
Excited to bring back the 2nd Workshop on Visual Concepts at @CVPR 2025, this time with a call for papers! We welcome submissions on the following topics. See our website for more info: sites.google.com/stanford.edu/w… Join us & a fantastic lineup of speakers in Tennessee!
New work on relightable 4D (:=3D + temporal) asset generation led by @gengchen01!
Ever wondered how roses grow and wither in your backyard?🌹 Our latest work on generating 4D temporal object intrinsics lets you explore a rose's entire lifecycle—from birth to death—under any environment light, from any viewpoint, at any moment. Project page:…
Two keys in our recipe for text+image-prompted image generation: (1) data from self-distillation (critical with limited real data); (2) an architecture that casts image-to-image tasks as video frame synthesis, effectively injecting image controls into FLUX. Work led by the fantastic @prime_cai!
Sharing something exciting we've been working on as a Thanksgiving gift: Diffusion Self-Distillation (DSD), which redefines zero-shot customized image generation using FLUX. DSD is like DreamBooth, but zero-shot/training-free. It works across any input subject and desired…
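(A minimal, hypothetical sketch of the "image-to-image as video frame synthesis" framing mentioned in the thread above — not the DSD/FLUX implementation. The idea: treat the conditioning image as a clean first frame and the generation target as a noisy second frame, so a video denoiser's cross-frame attention picks up the image control naturally. Shapes and names here are illustrative assumptions.)

```python
import numpy as np

def make_two_frame_input(cond_img, noisy_target):
    """Stack two (H, W, C) images into a (T=2, H, W, C) clip for a video model.

    Frame 0 is the clean conditioning image; frame 1 is the noisy target
    that the denoiser refines while attending to frame 0.
    """
    return np.stack([cond_img, noisy_target], axis=0)

cond = np.zeros((64, 64, 3))               # conditioning image (kept clean)
rng = np.random.default_rng(0)
noisy = rng.normal(size=(64, 64, 3))       # target frame at some noise level
clip = make_two_frame_input(cond, noisy)
print(clip.shape)  # (2, 64, 64, 3)
```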
This is a great opportunity to join @elliottszwu's new group! Working with Elliott has always been inspiring and fun: he brings incredible insights and depth to research. Excited to see what the lab will bring!
I'm building a new research lab @Cambridge_Eng focusing on 4D computer vision and generative models. Interested in joining us as a PhD student? Apply to the Engineering program by Dec 3 🗓️ postgraduate.study.cam.ac.uk/courses/direct… ChatGPT's "portrait of my current life"👇 elliottwu.com