Sander Dieleman
@sedielem
Research Scientist at Google DeepMind (WaveNet, Imagen, Veo). I tweet about deep learning (research + software), music, generative models (personal account).
New blog post: let's talk about latents! sander.ai/2025/04/15/lat…
We just discovered the 🔥 COOLEST 🔥 trick in Flow that we have to share: Instead of wordsmithing the perfect prompt, you can just... draw it. Take the image of your scene, doodle what you'd like on it (through any editing app), and then briefly describe what needs to happen…
We are hiring on the Veo team! 📽️ Some people asked me about this at #ICML2025. If that was you, I will have told you to check deepmind.google/careers/ regularly. 👀 It's just been updated: Europe (London, Zurich) job-boards.greenhouse.io/deepmind/jobs/… US (Mountain View) job-boards.greenhouse.io/deepmind/jobs/…
Want to be part of a team redefining SOTA for generative video models? Excited about building models that can reach billions of users? The Veo team is hiring! We are looking for amazing researchers and engineers, in North America and Europe. Details below:
What a wonderful event to let researchers sit and chat about interesting ideas! LOVE the Diffusion Circle!
We are sitting all the way at the back of the conference center (west building)!
Mission accomplished! Thank you so much @sedielem for organizing the (diffuse) diffusion circle. I met so many new people and learned a lot too!
Hello #ICML2025👋, anyone up for a diffusion circle? We'll just sit down somewhere and talk shop. 🕒Join us at 3PM on Thursday July 17. We'll meet here (see photo, near the west building's west entrance), and venture out from there to find a good spot to sit. Tell your friends!
On the way to #ICML2025! ✈️🇨🇦 Come find me and let's talk about: - diffusion models 😐😶🌫️🫥 - generative media 🖼️🎞️🔊 - what the topic of my next blog post should be 🤔💡✍️ Join us at the ML for audio workshop on Saturday! mlforaudioworkshop.github.io

Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
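To make the chunking idea concrete, here is a toy sketch of what dynamic chunking could look like. This is not the H-Net routing module from the paper, just an illustrative heuristic: adjacent hidden states that look dissimilar start a new chunk, and each chunk is pooled into one higher-level vector. The function name and threshold are made up for illustration.

```python
import numpy as np

def dynamic_chunk(h, threshold=0.5):
    """Toy dynamic chunking (illustrative only, not the H-Net mechanism).

    h: (seq_len, d) low-level hidden states (e.g. byte embeddings).
    A boundary is placed wherever neighbouring states are dissimilar;
    each resulting span is mean-pooled into a single chunk embedding.
    """
    h_norm = h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-8)
    sim = np.sum(h_norm[1:] * h_norm[:-1], axis=1)        # cosine similarity of neighbours
    boundary = np.concatenate([[True], sim < threshold])  # position 0 always opens a chunk
    chunk_id = np.cumsum(boundary) - 1                    # map each position to its chunk
    n_chunks = int(chunk_id[-1]) + 1
    chunks = np.stack([h[chunk_id == i].mean(axis=0) for i in range(n_chunks)])
    return chunks, chunk_id
```

Unlike a fixed tokenizer, the number of chunks adapts to the content (more boundaries where the signal changes quickly), and the chunk embeddings would then feed the higher level of the hierarchy.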
Excellent blog post by @_albertgu about Transformers, SSMs and the role of tokenisation. Well worth a read. goombalab.github.io/blog/2025/trad…
I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.
🔥Happy to announce that the AI for Music Workshop is coming to #NeurIPS2025! We have an amazing lineup of speakers! We call for papers & demos (due on August 22)! See you in San Diego!🏖️ @chrisdonahuey @Ilaria__Manco @zawazaw @huangcza @McAuleyLabUCSD @zacknovack @NeurIPSConf
Roll call: #ICML2025 diffusion circle 📢 Who's coming? Please tag people that might be interested! Date/time TBD, probably Thursday afternoon. (Beware though👇 joining a diffusion circle is at your own risk!🫣)
We are sitting on the floor outside room 301 (west side)
Diffusion models have analytical solutions, but they involve sums over the entire training set, and they don't generalise at all. They are mainly useful to help us understand how practical diffusion models generalise. Nice blog + code by Raymond Fan: rfangit.github.io/blog/2025/opti…
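For concreteness, here is what that analytical solution looks like for a finite training set, assuming a variance-exploding forward process x_t = x_0 + σ·ε. This is a minimal NumPy sketch rather than the code from the linked post, and the variable names are illustrative: the optimal (MMSE) denoiser is a softmax-weighted average of the training examples.

```python
import numpy as np

def optimal_denoiser(x_t, train_x, sigma):
    """Posterior mean E[x_0 | x_t] when the data distribution is a sum of
    Diracs on the training points (variance-exploding noise, x_t = x_0 + sigma * eps).

    x_t:     (d,)   noisy input
    train_x: (n, d) training set
    sigma:   float  noise level
    """
    sq_dists = np.sum((train_x - x_t) ** 2, axis=1)   # (n,)
    log_w = -sq_dists / (2.0 * sigma ** 2)            # log N(x_t; x_i, sigma^2 I), up to a constant
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                      # softmax over training points
    return w @ train_x                                # weighted average of training examples
```

The corresponding score is (optimal_denoiser(x_t, train_x, sigma) - x_t) / sigma**2; sampling with it only ever retrieves (blends of) training points, which is why this solution is a tool for studying generalisation rather than a way of achieving it.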

This looks like a great deep dive on neural network architectures for diffusion models. tl;dr use a Transformer, but there's quite a bit more to it, and as always in this field, the devil is in the details!
Had the honor to present diffusion transformers at CS25, Stanford. The place is truly magical. Slides: bit.ly/dit-cs25 Recording: youtu.be/vXtapCFctTI?si… Thanks to @stevenyfeng for making it happen!
now wouldn't that be something...
Let me play a video game of my veo 3 videos already. Google cooked so good 👌 @OfficialLoganK playable world models wen?
High quality image editing no longer needs closed models. We release FLUX.1 Kontext [dev] - an open weights model for proprietary-level image editing performance. Runs on consumer chips. ✓ Open weights available ✓ Best in-class performance ✓ Self-serve commercial licensing
Excited to announce 🎵Magenta RealTime, the first open weights music generation model capable of real-time audio generation with real-time control. 👋 **Try Magenta RT on Colab TPUs**: colab.research.google.com/github/magenta… 👀 Blog post: g.co/magenta/rt 🧵 below
This work uncovers a profound connection between continuous and discrete (non-absorbing) diffusion models, allowing transfer of advanced techniques such as consistency distillation to the discrete setting! Also: amazing title, no notes! 🧑‍🍳😙🤌
🚨 “The Diffusion Duality” is out! @ICML2025 ⚡️ Few-step generation in discrete diffusion language models by exploiting the underlying Gaussian diffusion. 🦾Beats AR on 3/7 zero-shot likelihood benchmarks. 📄 Paper: arxiv.org/abs/2506.10892 💻 Code: github.com/s-sahoo/duo 🧠…