Kulin Shah
@shahkulin98
ML Ph.D. Student @ UT Austin | Ex-Student Researcher @ Google | Ex-Research Fellow @ Microsoft Research India
congrats on the award!! great to see more work designing insightful tasks that bring out the role of token ordering & difficulty (should i say "indecipherability" ;-) ). I think the idea of learning token-level subproblems is broken for both diffusion and next-token learning
Excited about this new work where we dig into the role of token order in masked diffusions! MDMs train on some horribly hard tasks, but careful planning at inference can sidestep the hardest ones, dramatically improving over vanilla MDM sampling (e.g. 7%->90% acc on Sudoku) 1/
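For intuition, here's a minimal sketch of the kind of inference-time planning the thread describes: instead of unmasking positions in a random order (vanilla MDM sampling), greedily fill whichever masked position the model is most confident about. The `model` interface and function names are illustrative assumptions, not the paper's actual code.

```python
import torch

# Minimal sketch of confidence-ordered unmasking for a masked diffusion
# model. Assumes `model(x)` returns per-position logits of shape
# (batch, seq_len, vocab); this interface is an assumption for the sketch.
@torch.no_grad()
def decode_with_planning(model, seq_len, mask_id):
    x = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for _ in range(seq_len):
        probs = model(x).softmax(dim=-1)
        conf, tok = probs.max(dim=-1)      # per-position top confidence
        conf[x != mask_id] = -1.0          # only consider masked slots
        pos = conf.argmax(dim=-1)          # easiest subproblem first
        x[0, pos] = tok[0, pos]            # commit one token per step
    return x
```

Vanilla sampling instead draws `pos` uniformly at random from the masked positions, which is what forces the model to face the hardest token-level subproblems head-on.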
Today @ChenHenryWu and I will be presenting our #ICML work on creativity in the Oral 3A Reasoning session (West Exhibition Hall C), 10-11 am PT. Or please stop by our poster right after @ East Exhibition Hall A-B #E-2505, 11 am-1:30 pm. (Hope you enjoy some silly human drawings!)
Congrats to @UTAustin students Kulin Shah and Vasilis Kontonis who won an Outstanding Paper Award at #ICML2025! Their work pushes the boundaries of how AI models learn and understand the world. We’re proud to see @utcompsci students leading at the frontier of machine learning.
Congrats to the Kim, Shah, et al. team from Harvard on winning an outstanding paper award at ICML for their work on understanding generation orderings in masked diffusion! And thank you for the shout-out to @InceptionAILabs!
📢 Thrilled to share our new paper, LASER: Attention with Exponential Transformation, accepted at ICML2025, work done at Google. Come by our poster presentation! 🗓️ Thurs, July 17th, 4:30-7pm 📍 West Exhibition Hall B2-B3, # W-915 Read the full paper here: arxiv.org/abs/2411.03493
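Reading from the title and abstract, the exponential transformation replaces the standard attention output softmax(QKᵀ/√d)·V with a log of an attention-weighted sum of exponentiated values. A rough sketch under that reading (not the paper's reference implementation):

```python
import torch
import torch.nn.functional as F

def laser_attention(q, k, v):
    # Sketch: out = log(softmax(QK^T / sqrt(d)) @ exp(V)), stabilized
    # with a max-subtraction trick so exp() never overflows. Details are
    # my reading of the abstract, not the authors' released code.
    d = q.shape[-1]
    attn = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    m = v.max(dim=-2, keepdim=True).values   # per-feature max over keys
    return m + torch.log(attn @ torch.exp(v - m) + 1e-12)
```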
A team from #KempnerInstitute, @hseas & @UTCompSci has won a best paper award at #ICML2025 for work unlocking the potential of masked diffusion models. Congrats to @Jaeyeon_Kim_0, @shahkulin98, Vasilis Kontonis, @ShamKakade6 and @sitanch. kempnerinstitute.harvard.edu/news/kempner-i… #AI
Thrilled to share that our work received the Outstanding Paper Award at ICML! I will be giving the oral presentation on Tuesday at 4:15 PM. @Jaeyeon_Kim_0 and I both will be at the poster session shortly after the oral presentation. Please attend if possible!
Incredible to see how fast the field moves since we worked on masked diffusion arxiv.org/abs/2406.04329. Huge congratulations to @Jaeyeon_Kim_0 @sitanch for the award!
#ICML2025 Outstanding Paper Awards
Announcing Ambient Protein Diffusion, a state-of-the-art 17M-parameter generative model for protein structures. Diversity improves by 91% and designability by 26% over the previous 200M-parameter SOTA model for long proteins. The trick? Treat low-pLDDT AlphaFold predictions as low-quality data
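A hedged sketch of what "treat low-pLDDT predictions as low-quality data" could look like in an ambient-diffusion-style training loop: each structure gets a floor on the noise level at which it may supervise the model, so unreliable structures only teach the high-noise regime. The mapping, cutoff, and normalization below are illustrative assumptions, not the paper's recipe.

```python
import torch

def sample_training_sigma(plddt, sigma_max=80.0, plddt_cut=0.9):
    # Hypothetical rule (assumes pLDDT normalized to [0, 1]): confident
    # structures supervise all noise levels; less confident ones get a
    # floor on the noise at which they contribute, where their structural
    # errors are drowned out by the diffusion noise anyway.
    sigma_min = torch.where(plddt >= plddt_cut,
                            torch.zeros_like(plddt),
                            sigma_max * (plddt_cut - plddt))
    u = torch.rand_like(plddt)
    return sigma_min + u * (sigma_max - sigma_min)
```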
Standard attention is softmax(QKᵀ)V. We explore 2-simplicial attention using additional keys K' for richer interactions: softmax(Q(K⊗K')ᵀ). 🧵
Excited to share what I worked on during my time at Meta. - We introduce a Triton-accelerated Transformer with *2-simplicial attention*—a tri-linear generalization of dot-product attention - We show how to adapt RoPE to tri-linear forms - We show 2-simplicial attention scales…
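A naive (no Triton, O(n³)) sketch of the trilinear form: logits couple one query with two keys, the softmax runs over key pairs, and each output mixes elementwise products of two value vectors. The value-combination rule here is one plausible instantiation of the tweet's formula, not necessarily Meta's kernel.

```python
import torch

def two_simplicial_attention(q, k1, k2, v1, v2):
    # Assumes equal sequence lengths for both key/value streams.
    d = q.shape[-1]
    logits = torch.einsum('bid,bjd,bkd->bijk', q, k1, k2) / d**0.5
    b, n = logits.shape[0], logits.shape[1]
    # Softmax jointly over all (j, k) key pairs for each query i.
    attn = logits.reshape(b, n, -1).softmax(-1).reshape_as(logits)
    # Output mixes elementwise products of the two value vectors.
    return torch.einsum('bijk,bjd,bkd->bid', attn, v1, v2)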
Excited to share that I’ll be presenting two oral papers at this ICML—see u guys in Vancouver!!🇨🇦 1️⃣ arxiv.org/abs/2502.06768 Understanding Masked Diffusion Models theoretically/scientifically 2️⃣ arxiv.org/abs/2502.09376 Theoretical analysis of LoRA training
Nice thread by Aayush on our new work on diffusion reward guidance! Was quite surprised how well this worked and how simple the algorithm is. Also happy that we finally managed to prove some rigorous guarantees for DPS (diffusion posterior sampling)
Steering diffusion models with external rewards has recently led to exciting results, but what happens when the reward is inherently difficult? Introducing ReGuidance: a simple algorithm to (provably!) boost your favorite guidance method on hard problems! 🚀🚀🚀 A thread: (1/n)
Excited about this work, where we show that a simple algorithm that inverts candidate samples boosts performance for reward guidance in diffusion, both in experiments and in theory! Check out the thread by Aayush for more details.
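As a rough sketch of the "invert candidate samples" idea: take the guided candidate, run the unconditional probability-flow ODE backwards to a latent, then integrate forwards again without the reward. The Euler scheme, the `velocity(x, t)` interface, and the time convention (data at t=0, noise at t=1) are all assumptions made for this sketch; the thread has the actual algorithm and guarantees.

```python
import torch

@torch.no_grad()
def reguidance_sketch(candidate, velocity, ts):
    # `velocity(x, t)` is an assumed hook for the unconditional PF-ODE
    # drift dx/dt; `ts` is an increasing list of times in [0, 1], with
    # data at t=0 and noise at t=1 (a convention chosen for this sketch).
    x = candidate
    for t0, t1 in zip(ts[:-1], ts[1:]):        # invert: data -> latent
        x = x + (t1 - t0) * velocity(x, t0)
    rts = list(ts)[::-1]
    for t1, t0 in zip(rts[:-1], rts[1:]):      # regenerate: latent -> data
        x = x - (t1 - t0) * velocity(x, t1)
    return x
```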
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
Introducing d1🚀 — the first framework that applies reinforcement learning to improve reasoning in masked diffusion LLMs (dLLMs). Combining masked SFT with a novel policy-gradient algorithm, d1 significantly boosts the performance of pretrained dLLMs like LLaDA.
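d1's policy-gradient estimator for masked dLLMs is its own contribution and isn't reproduced here; for intuition only, the generic REINFORCE-with-baseline objective such methods build on looks like this, with `logprob_fn` an assumed hook returning a differentiable sequence log-likelihood under the dLLM.

```python
import torch

def reinforce_loss(logprob_fn, completions, rewards):
    # Generic REINFORCE with a mean baseline, for intuition only:
    # scale each completion's log-likelihood by its centered reward.
    # `rewards` is a 1-D tensor aligned with `completions`.
    logp = torch.stack([logprob_fn(c) for c in completions])
    adv = rewards - rewards.mean()          # simple variance-reduction baseline
    return -(adv.detach() * logp).mean()
```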
Despite the pessimistic landscape of results, we make empirical and theoretical progress on this question. Empirically, we develop an algorithm that pushes the Pareto frontier of {quality, memorization} in both the text-conditional and the unconditional settings.