Kulin Shah
@shahkulin98
ML Ph.D. Student @ UT Austin | Ex-Student Researcher @ Google | Ex-Research Fellow @ Microsoft Research India
congrats on the award!! great to see more work designing insightful tasks that bring out the role of token ordering & difficulty (should i say "indecipherability" ;-) ). I think the idea of learning token-level subproblems is broken for both diffusion and next-token learning
Excited about this new work where we dig into the role of token order in masked diffusions! MDMs train on some horribly hard tasks, but careful planning at inference can sidestep the hardest ones, dramatically improving over vanilla MDM sampling (e.g. 7%->90% acc on Sudoku) 1/
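For intuition, here's a minimal sketch of the kind of inference-time planning the thread describes: instead of unmasking positions in a random order (vanilla MDM sampling), greedily fill whichever masked position the model is most confident about. The `model` interface and function names are illustrative assumptions, not the paper's actual code.

```python
import torch

# Minimal sketch of confidence-ordered unmasking for a masked diffusion
# model. Assumes `model(x)` returns per-position logits of shape
# (batch, seq_len, vocab); this interface is an assumption for the sketch.
@torch.no_grad()
def decode_with_planning(model, seq_len, mask_id):
    x = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for _ in range(seq_len):
        probs = model(x).softmax(dim=-1)
        conf, tok = probs.max(dim=-1)      # per-position top confidence
        conf[x != mask_id] = -1.0          # only consider masked slots
        pos = conf.argmax(dim=-1)          # easiest subproblem first
        x[0, pos] = tok[0, pos]            # commit one token per step
    return x
```

Vanilla sampling instead draws `pos` uniformly at random from the masked positions, which is what forces the model to face the hardest token-level subproblems head-on.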
Today @ChenHenryWu and I will be presenting our #ICML work on creativity in the Oral 3A Reasoning session (West Exhibition Hall C), 10-11 am PT. Or please stop by our poster right after @ East Exhibition Hall A-B #E-2505, 11 am-1:30 pm. (Hope you enjoy some silly human drawings!)
Congrats to @UTAustin students Kulin Shah and Vasilis Kontonis who won an Outstanding Paper Award at #ICML2025! Their work pushes the boundaries of how AI models learn and understand the world. We’re proud to see @utcompsci students leading at the frontier of machine learning.
Congrats to the Kim, Shah, et al. team from Harvard on winning an outstanding paper award at ICML for their work on understanding generation orderings in masked diffusion! And thank you for the shout-out to @InceptionAILabs!
📢 Thrilled to share our new paper, LASER: Attention with Exponential Transformation, accepted at ICML2025, work done at Google. Come by our poster presentation! 🗓️ Thurs, July 17th, 4:30-7pm 📍 West Exhibition Hall B2-B3, # W-915 Read the full paper here: arxiv.org/abs/2411.03493
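Reading from the title and abstract, the exponential transformation replaces the standard attention output softmax(QKᵀ/√d)·V with a log of an attention-weighted sum of exponentiated values. A rough sketch under that reading (not the paper's reference implementation):

```python
import torch
import torch.nn.functional as F

def laser_attention(q, k, v):
    # Sketch: out = log(softmax(QK^T / sqrt(d)) @ exp(V)), stabilized
    # with a max-subtraction trick so exp() never overflows. Details are
    # my reading of the abstract, not the authors' released code.
    d = q.shape[-1]
    attn = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    m = v.max(dim=-2, keepdim=True).values   # per-feature max over keys
    return m + torch.log(attn @ torch.exp(v - m) + 1e-12)
```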
A team from #KempnerInstitute, @hseas & @UTCompSci has won a best paper award at #ICML2025 for work unlocking the potential of masked diffusion models. Congrats to @Jaeyeon_Kim_0, @shahkulin98, Vasilis Kontonis, @ShamKakade6 and @sitanch. kempnerinstitute.harvard.edu/news/kempner-i… #AI
Thrilled to share that our work received the Outstanding Paper Award at ICML! I will be giving the oral presentation on Tuesday at 4:15 PM. @Jaeyeon_Kim_0 and I both will be at the poster session shortly after the oral presentation. Please attend if possible!
Incredible to see how fast the field moves since we worked on masked diffusion arxiv.org/abs/2406.04329. Huge congratulations to @Jaeyeon_Kim_0 @sitanch for the award!
#ICML2025 Outstanding Paper Awards
Announcing Ambient Protein Diffusion, a state-of-the-art 17M-parameter generative model for protein structures. Diversity improves by 91% and designability by 26% over the previous 200M-parameter SOTA model for long proteins. The trick? Treat low-pLDDT AlphaFold predictions as low-quality data
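A hedged sketch of what "treat low-pLDDT predictions as low-quality data" could look like in an ambient-diffusion-style training loop: each structure gets a floor on the noise level at which it may supervise the model, so unreliable structures only teach the high-noise regime. The mapping, cutoff, and normalization below are illustrative assumptions, not the paper's recipe.

```python
import torch

def sample_training_sigma(plddt, sigma_max=80.0, plddt_cut=0.9):
    # Hypothetical rule (assumes pLDDT normalized to [0, 1]): confident
    # structures supervise all noise levels; less confident ones get a
    # floor on the noise at which they contribute, where their structural
    # errors are drowned out by the diffusion noise anyway.
    sigma_min = torch.where(plddt >= plddt_cut,
                            torch.zeros_like(plddt),
                            sigma_max * (plddt_cut - plddt))
    u = torch.rand_like(plddt)
    return sigma_min + u * (sigma_max - sigma_min)
```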
Standard attention is softmax(QKᵀ)V. We explore 2-simplicial attention using additional keys K' for richer interactions: softmax(Q(K⊗K')ᵀ). 🧵
Excited to share what I worked on during my time at Meta. - We introduce a Triton-accelerated Transformer with *2-simplicial attention*—a tri-linear generalization of dot-product attention - We show how to adapt RoPE to tri-linear forms - We show 2-simplicial attention scales…
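A naive (no Triton, O(n³)) sketch of the trilinear form: logits couple one query with two keys, the softmax runs over key pairs, and each output mixes elementwise products of two value vectors. The value-combination rule here is one plausible instantiation of the tweet's formula, not necessarily Meta's kernel.

```python
import torch

def two_simplicial_attention(q, k1, k2, v1, v2):
    # Assumes equal sequence lengths for both key/value streams.
    d = q.shape[-1]
    logits = torch.einsum('bid,bjd,bkd->bijk', q, k1, k2) / d**0.5
    b, n = logits.shape[0], logits.shape[1]
    # Softmax jointly over all (j, k) key pairs for each query i.
    attn = logits.reshape(b, n, -1).softmax(-1).reshape_as(logits)
    # Output mixes elementwise products of the two value vectors.
    return torch.einsum('bijk,bjd,bkd->bid', attn, v1, v2)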
Excited to share that I’ll be presenting two oral papers at this ICML—see u guys in Vancouver!!🇨🇦 1️⃣ arxiv.org/abs/2502.06768 Understanding Masked Diffusion Models theoretically/scientifically 2️⃣ arxiv.org/abs/2502.09376 Theoretical analysis of LoRA training
Nice thread by Aayush on our new work on diffusion reward guidance! Was quite surprised how well this worked and how simple the algorithm is. Also happy that we finally managed to prove some rigorous guarantees for DPS (diffusion posterior sampling)
Steering diffusion models with external rewards has recently led to exciting results, but what happens when the reward is inherently difficult? Introducing ReGuidance: a simple algorithm to (provably!) boost your favorite guidance method on hard problems! 🚀🚀🚀 A thread: (1/n)
Excited about this work, where we show that a simple algorithm that inverts candidate samples boosts performance for reward guidance in diffusion, both in experiments and in theory! Check out the thread by Aayush for more details.
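As a rough sketch of the "invert candidate samples" idea: take the guided candidate, run the unconditional probability-flow ODE backwards to a latent, then integrate forwards again without the reward. The Euler scheme, the `velocity(x, t)` interface, and the time convention (data at t=0, noise at t=1) are all assumptions made for this sketch; the thread has the actual algorithm and guarantees.

```python
import torch

@torch.no_grad()
def reguidance_sketch(candidate, velocity, ts):
    # `velocity(x, t)` is an assumed hook for the unconditional PF-ODE
    # drift dx/dt; `ts` is an increasing list of times in [0, 1], with
    # data at t=0 and noise at t=1 (a convention chosen for this sketch).
    x = candidate
    for t0, t1 in zip(ts[:-1], ts[1:]):        # invert: data -> latent
        x = x + (t1 - t0) * velocity(x, t0)
    rts = list(ts)[::-1]
    for t1, t0 in zip(rts[:-1], rts[1:]):      # regenerate: latent -> data
        x = x - (t1 - t0) * velocity(x, t1)
    return x
```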
We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO
Introducing d1🚀 — the first framework that applies reinforcement learning to improve reasoning in masked diffusion LLMs (dLLMs). Combining masked SFT with a novel policy-gradient algorithm, d1 significantly boosts the performance of pretrained dLLMs like LLaDA.
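d1's policy-gradient estimator for masked dLLMs is its own contribution and isn't reproduced here; for intuition only, the generic REINFORCE-with-baseline objective such methods build on looks like this, with `logprob_fn` an assumed hook returning a differentiable sequence log-likelihood under the dLLM.

```python
import torch

def reinforce_loss(logprob_fn, completions, rewards):
    # Generic REINFORCE with a mean baseline, for intuition only:
    # scale each completion's log-likelihood by its centered reward.
    # `rewards` is a 1-D tensor aligned with `completions`.
    logp = torch.stack([logprob_fn(c) for c in completions])
    adv = rewards - rewards.mean()          # simple variance-reduction baseline
    return -(adv.detach() * logp).mean()
```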
Despite the pessimistic landscape of results, we make empirical and theoretical progress on this question. Empirically, we develop an algorithm that pushes the Pareto frontier of {quality, memorization} in both the text-conditional and the unconditional settings.