Zirui Wu @ACL2025 🇦🇹
@WilliamZR7
Master's Student at Peking University | Research Assistant @hkunlp2020 | Diffusion Language Models | Opinions my own
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach boosts code infilling performance significantly and even catches up with oracle results.
PSA: Don't say "Longer thinking kills performance" when you mean "Length of intermediate token string is not correlated with final accuracy"
anyone out there capable and interested in joining the Midjourney team to train some diffusion text models? really feels like we could do something special and i'm increasingly tempted to try
🎬 Introducing our "VideoAgent" - The open-source alternative to NotebookLM for AI-powered video/audio creation! 🔗 GitHub: github.com/HKUDS/Open-Not… ✨ Create professional videos/audios through simple conversation - no technical expertise required! - 🔥 Key capabilities: - 🧠…
🚀 Excited to introduce KAT-V1 (Kwaipilot-AutoThink) – a breakthrough 40B large language model from the Kwaipilot team! KAT-V1 dynamically switches between reasoning and non-reasoning modes to address the “overthinking” problem in complex reasoning tasks. Key Highlights: 📌 40B…
We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning than Transformers — along with μP++, a suite of simple yet powerful scaling laws for stable large-scale training. 🔗 github.com/microsoft/Arch… (1/4)
Xinyu Yang from CMU will be giving a talk titled "Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation" on Friday, July 25 at 11am HKT (Thursday, July 24 at 8pm PDT). Link to talk: hku.zoom.us/j/92651812689?…
What happened after Dream 7B? First, Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. Plus, DreamOn cracks the variable-length generation problem! It enables code infilling that goes beyond a fixed canvas.
📢 Update: Announcing Dream's next-phase development. - Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. - DreamOn: tackling the variable-length generation problem in dLLMs!
Introducing Dream-Coder 7B -- pushing forward with diffusion language models for code generation💻
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
🚀Check out our Dream-Coder 7B!
🌳TreeSynth: Synthesizing large-scale diverse data from scratch! Struggling with repetition and space collapse in data synthesis? 🤔 Our latest research mitigates this challenge through innovative tree-guided subspace partitioning. ✨Introducing TreeSynth—a novel framework…