Ryan Tabrizi @ CVPR
@ryan_tabrizi
research @AdobeResearch, @berkeley_ai
Teaching computer vision next semester? Hoping to finally learn about diffusion models in 2025? Check out this diffusion project that we designed and test-drove this past semester at Berkeley and Michigan!
woohoo! so excited to finally share this. check out the website, and sound ON!! It's craaaazy how much of a difference it makes to hear your videos. 🔊
Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵
I had a lot of fun helping put this problem set together -- if you're teaching diffusion models + computer vision, consider using this homework for your course! (links at end of @ryan_tabrizi's thread!)
Some problems can’t be rushed—they can only be done step by step, no matter how many people or processors you throw at them. We’ve scaled AI by making everything bigger and more parallel: Our models are parallel. Our scaling is parallel. Our GPUs are parallel. But what if the…
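A toy sketch of the kind of sequential dependency this is pointing at (my own illustration, not from the thread): when each step needs the previous step's output, extra processors don't shorten the chain.
```python
# Illustrative only: a computation whose steps each depend on the previous
# result cannot be sped up by adding more parallel workers.
def sequential_iterate(f, x0, n_steps):
    """Apply f repeatedly; step k needs the output of step k-1."""
    x = x0
    for _ in range(n_steps):
        x = f(x)  # no two iterations can run at the same time
    return x

# Example: an iterated logistic map; the dependency chain has length n_steps,
# so throwing more GPUs at it changes nothing.
result = sequential_iterate(lambda x: 3.7 * x * (1.0 - x), 0.5, 1_000)
print(result)
```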
coming to the Exploratorium today! thank you @CatieCuan for bringing me along on this wild adventure :) exploratorium.edu/visit/calendar…
I'm excited to share that I’ll be joining @UofMaryland as an Assistant Professor in Computer Science, where I’ll be launching the Resilient AI and Grounded Sensing Lab. The RAGS Lab will build AI that works in chaotic environments. If you would like to partner, please DM me!
I had the pleasure of TA'ing this class! Check out the material, especially the flow matching from scratch assignment!
Angjoo Kanazawa @akanazawa and I taught CS 280, graduate computer vision, this semester at UC Berkeley. We found a combination of classical and modern CV material that worked well, and are happy to share our lecture material from the class. cs280-berkeley.github.io Enjoy!
our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)
Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting... with an open-source toolkit on both CPU and GPU
Excited to announce our new preprint, "Training Video Foundation Models with NVIDIA NeMo." Paper: arxiv.org/abs/2503.12964
Very excited to share Stable Virtual Camera, a generalist diffusion model for view synthesis: stable-virtual-camera.github.io It scales well with data, and works out of the box for different NVS tasks. Code and 🤗 demo are released! 🧵(1/N)
Stability AI just dropped Stable Virtual Camera on Hugging Face: a generalist diffusion model designed to address the exciting challenge of Novel View Synthesis (NVS). With just one or a few images, it allows you to create a smooth trajectory video from any viewpoint you desire.
Do LLMs understand probability distributions? Can they serve as effective simulators of probability? No! However, in our latest paper we show that, via in-context learning, LLMs update their broken priors in a manner akin to Bayesian updating. 📝 arxiv.org/abs/2503.04722
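For reference, "Bayesian updating" means combining a prior with observed evidence to get a posterior. A minimal conjugate-prior example (my own illustration, not the paper's setup):
```python
# Minimal Beta-Bernoulli example of Bayesian updating (illustrative only,
# not the experimental setup from the paper).
def beta_bernoulli_update(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior on a coin's bias given 0/1 observations."""
    heads = sum(observations)
    tails = len(observations) - heads
    return alpha + heads, beta + tails  # posterior is again a Beta distribution

alpha, beta = 1.0, 1.0  # uniform prior over the bias
alpha, beta = beta_bernoulli_update(alpha, beta, [1, 1, 0, 1, 1])
print("posterior mean:", alpha / (alpha + beta))  # shifts toward the observed data
```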
Introducing Carl, the first AI system to create a research paper that passes peer review. Carl's work was just accepted at an @ICLR_conf workshop on the Tiny Papers track. Carl forms new research hypotheses, tests them & writes up results. Learn more: autoscience.ai/blog/meet-carl…
Happy to announce that our paper on “Scaling Properties of Diffusion Models For Perceptual Tasks" has been accepted to CVPR 2025! 🥳 🎉 We present a detailed study on how to efficiently scale conditional diffusion models for perceptual tasks under a unified framework. Our…
Decentralized Diffusion Models power stronger models trained on more accessible infrastructure. DDMs mitigate the networking bottleneck that locks training into expensive and power-hungry centralized clusters. They scale gracefully to billions of parameters and generate…
An Empirical Study of Autoregressive Pre-training from Videos. paper: arxiv.org/pdf/2501.05453 website: brjathu.github.io/toto We empirically study autoregressive pre-training from videos. Our models are pre-trained on a diverse dataset of videos and images comprising over 1…