Siyan Zhao
@siyan_zhao
CS PhD student @UCLA | Bachelor's @UofT EngSci | LLMs, generative models, decision-making
Introducing d1🚀 — the first framework that applies reinforcement learning to improve reasoning in masked diffusion LLMs (dLLMs). Combining masked SFT with a novel form of policy gradient algorithm, d1 significantly boosts the performance of pretrained dLLMs like LLaDA.
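For the curious, here is a minimal sketch of why RL is tricky for dLLMs and how d1-style training gets around it: a masked dLLM has no left-to-right factorization, so per-token log-probs are instead estimated in a single forward pass by masking the full completion (plus a random slice of the prompt). Everything below (a `model` returning HF-style `.logits`, the `MASK_ID` value, the tensor shapes) is an illustrative assumption, not the released d1 code:

```python
import torch
import torch.nn.functional as F

MASK_ID = 126336  # hypothetical mask-token id; depends on the dLLM's tokenizer

def one_step_logprobs(model, prompt_ids, completion_ids, prompt_mask_prob=0.15):
    """Estimate per-token log p(completion | prompt) in one forward pass."""
    # Randomly mask a fraction of prompt tokens (a regularizing perturbation).
    prompt = prompt_ids.clone()
    noise = torch.rand_like(prompt, dtype=torch.float) < prompt_mask_prob
    prompt[noise] = MASK_ID

    # Fully mask the completion so the model predicts every position at once.
    masked_completion = torch.full_like(completion_ids, MASK_ID)
    inputs = torch.cat([prompt, masked_completion], dim=-1)

    logits = model(inputs).logits[:, prompt_ids.shape[-1]:, :]
    logp = F.log_softmax(logits, dim=-1)
    return logp.gather(-1, completion_ids.unsqueeze(-1)).squeeze(-1)
```

A reward-weighted policy gradient can then be taken through these estimates without running the full multi-step sampler at training time.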

Since our launch earlier this year, we have been thrilled to watch the community around dLLMs grow. The Mercury tech report from @InceptionAILabs is now on @arxiv with more extensive evaluations: arxiv.org/abs/2506.17298 New model updates dropping later this week!
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data, not compute, is the bottleneck?
TL;DR:
▶️ Compute-constrained? Train autoregressive models.
▶️ Data-constrained? Train diffusion models.
Get ready for 🤿 1/n
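A toy way to see the contrast this thread sets up (my own illustration, with a stand-in `model` mapping token ids to logits; not code from the paper): an AR model fits one fixed next-token factorization per sequence, while a LLaDA-style masked diffusion loss draws a fresh mask ratio every step, so repeated passes over the same data keep posing new prediction problems.

```python
import torch
import torch.nn.functional as F

MASK_ID = 0  # placeholder mask-token id

def ar_loss(model, tokens):
    # Next-token prediction: logits at position i predict token i+1.
    logits = model(tokens[:, :-1])
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tokens[:, 1:].reshape(-1))

def masked_diffusion_loss(model, tokens):
    # Sample a masking ratio t ~ U(0,1), mask that fraction, and predict
    # only the masked positions (weighted by 1/t, LLaDA-style).
    t = torch.rand(tokens.size(0), 1, device=tokens.device).clamp(min=1e-3)
    masked = torch.rand_like(tokens, dtype=torch.float) < t
    noisy = torch.where(masked, torch.full_like(tokens, MASK_ID), tokens)
    logits = model(noisy)
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                         tokens.reshape(-1), reduction="none")
    return (ce.reshape_as(tokens) * masked / t).sum() / masked.sum().clamp(min=1)
```

That extra stochasticity is one common intuition for why diffusion training can keep extracting signal from a fixed dataset, at the cost of more compute.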
🚀 Introducing PhysiX: One of the first large-scale foundation models for physics simulations! PhysiX is a 4.5B parameter model that unifies a wide range of physical systems, from fluid dynamics to reaction-diffusion, outperforming specialized, state-of-the-art models.
🤖Can diffusion models write code competitively? Excited to share our latest 7B coding diffusion LLM!!💻 With DiffuCoder, we explore how they decode, why temperature🔥 matters, and how to improve them via coupled-GRPO that speaks diffusion!!📈 Code: github.com/apple/ml-diffu… 🧵
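My reading of the "coupled" part, sketched under assumptions (HF-style `.logits`, a placeholder `MASK_ID`): sample a random mask over the completion together with its complement, so across the pair of forward passes every token is scored exactly once while each pass still sees a realistic partial masking:

```python
import torch

MASK_ID = 0  # placeholder mask-token id

def coupled_masks(batch_size, completion_len, device=None):
    """Return a (mask, ~mask) pair covering every position exactly once."""
    m = torch.rand(batch_size, completion_len, device=device) < 0.5
    return m, ~m

def coupled_logprob(model, prompt_ids, completion_ids):
    logps = []
    m, m_c = coupled_masks(completion_ids.size(0), completion_ids.size(-1),
                           completion_ids.device)
    for mask in (m, m_c):
        noisy = torch.where(mask, torch.full_like(completion_ids, MASK_ID),
                            completion_ids)
        inputs = torch.cat([prompt_ids, noisy], dim=-1)
        logits = model(inputs).logits[:, prompt_ids.shape[-1]:, :]
        logp = torch.log_softmax(logits, -1).gather(
            -1, completion_ids.unsqueeze(-1)).squeeze(-1)
        logps.append(logp * mask)  # keep only positions scored while masked
    return logps[0] + logps[1]  # every token contributes exactly once
```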
(1/6) Our work Reflect-DiT was accepted to #ICCV2025! Reflect-DiT allows the model to reflect on its past generations and textual feedback to self-correct and improve, extending reasoning to text-to-image generation.
🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware efficiency. What excites me most is realizing that, beyond optimizing existing models, we can discover better model architectures by embracing system-level…
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n
🧑‍🍳 Very excited to present LaViDa, one of the first diffusion language models for multimodal understanding! 🌟Unlike autoregressive LMs, you can control the speed-quality tradeoff, and solve constrained generation problems out of the box 📦 🌟 We also release LaViDa-Reason, a…
📢(1/11) Diffusion LMs are fast and controllable at inference time! But why restrict these benefits to text data? We are excited to announce LaViDa, one of the first and fastest large diffusion LMs for vision-language understanding!!
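The speed-quality knob mentioned above comes from the decoder, not the architecture. A toy confidence-based unmasking loop (not LaViDa's actual implementation; `model`, `.logits`, and `MASK_ID` are stand-ins) shows it: each round commits the most confident predictions, so fewer rounds means fewer forward passes at some quality cost.

```python
import torch

MASK_ID = 0  # placeholder mask-token id

@torch.no_grad()
def decode(model, prompt_ids, length=64, steps=16):
    pad = torch.full((prompt_ids.size(0), length), MASK_ID,
                     dtype=prompt_ids.dtype, device=prompt_ids.device)
    x = torch.cat([prompt_ids, pad], dim=-1)
    per_step = -(-length // steps)  # ceil(length / steps) tokens per round
    k = prompt_ids.shape[-1]
    for _ in range(steps):
        logits = model(x).logits[:, k:, :]
        conf, pred = logits.softmax(-1).max(-1)
        still_masked = x[:, k:] == MASK_ID
        conf = conf.masked_fill(~still_masked, -1.0)  # deprioritize filled slots
        idx = conf.topk(per_step, dim=-1).indices     # most confident slots first
        x[:, k:].scatter_(-1, idx, pred.gather(-1, idx))
    return x
```

Dropping from `steps=16` to `steps=4` is roughly 4x fewer forward passes, which is the tradeoff the thread refers to.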
We are kicking off a series of seminars at @hkunlp2020. @siyan_zhao will be giving a talk titled "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" on ⏰ Friday, May 9, 11am HKT (Thursday, May 8, 8pm PDT). Link to talk: hku.zoom.us/j/97925412724?…
🔬 Check out our newest #ICML’25 spotlight paper GREAT, which revolutionizes GraphODEs with better generalization through smart disentanglement & causal reasoning. This also marks the final piece of the GraphODE series from my PhD journey :) From models and data to applications!
A nice and clean implementation based on Hugging Face TRL!
Our Diffu-GRPO and evaluation code is now released! Check it out at github.com/dllm-reasoning…
Thank you @VentureBeat for covering our research on enhancing reasoning with diffusion LLMs using d1. Great collaboration with @siyan_zhao, @DevaanshGupta1, and @qqyuzu.
30 seconds vs. 3: The d1 reasoning framework that's slashing AI response times venturebeat.com/ai/30-seconds-…
✈️ I will be at @iclr_conf 🇸🇬 to present the following work on LLM reasoning, vision-language understanding, and LLM evaluation w/ @uclanlp, UCLA Machine Intelligence (MINT), and @GoogleDeepMind! Come to the poster sessions and say hi 👋 I will be happy to meet folks from…
Attending #ICLR2025 from 4/23 to 4/28 & will present PrefEval (prefeval.github.io), which evaluates the performance of SoTA LLMs on personalization. Catch our presentations on April 26th: 🔥Oral: 10:42-10:54am @ Hall 1 Apex 📊 Poster #558: 3:00-5:30pm @ Hall 3 + Hall 2B…
Large language models (LLMs) have been explored for optimization via prompting to evaluate or improve candidate solutions. However, this approach does not generalize to domains that are either under-represented during pretraining or inefficient to represent as text. Our #ICLR2025…
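For context, the baseline being contrasted here is the familiar prompting loop: feed an LLM its own scored candidates and ask for a better one. A bare-bones version (with `llm` and `score` as hypothetical stand-ins, not an API from the paper):

```python
def prompt_optimize(llm, score, initial, rounds=10, keep=5):
    """Ask an LLM to propose better candidates given scored history."""
    pool = [(score(c), c) for c in initial]
    for _ in range(rounds):
        pool.sort(reverse=True)
        history = "\n".join(f"score={s:.3f}: {c}" for s, c in pool[:keep])
        proposal = llm(
            "Here are candidate solutions and their scores:\n"
            f"{history}\n"
            "Propose one new candidate with a higher score."
        )
        pool.append((score(proposal), proposal))
    return max(pool)[1]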
📅Saturday Night Paper Discussion📅 Join us tonight to talk about d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning 6pm UTC on Discord, no prior knowledge required: discord.gg/ec2FXBFA?event…
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning "we propose d1, a framework to adapt pre-trained masked dLLMs into reasoning models via a combination of supervised finetuning (SFT) and RL." "we introduce a novel critic-free, policy-gradient…
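On the "critic-free" part: in GRPO-style methods this typically means advantages come from standardizing rewards within a group of completions sampled for the same prompt, so no learned value network is needed. A minimal sketch of that reading (my illustration, assuming `group_size > 1`):

```python
import torch

def group_relative_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: [num_prompts, group_size] -> same-shape advantages."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True).clamp(min=1e-6)
    return (rewards - mean) / std
```

The policy-gradient loss then weights per-token log-prob estimates (e.g., the one-step estimator sketched earlier in this feed) by these advantages.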