Zirui Wu @ACL2025 🇦🇹
@WilliamZR7
Master's Student at Peking University | Research Assistant @hkunlp2020 | Diffusion Language Models | Opinions my own
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach boosts code infilling performance significantly and even catches up with oracle results.
PSA: Don't say "Longer thinking kills performance" when you mean "Length of intermediate token string is not correlated with final accuracy"
anyone out there capable and interested in joining the Midjourney team to train some diffusion text models? really feels like we could do something special and i'm increasingly tempted to try
🎬 Introducing our "VideoAgent" - The open-source alternative to NotebookLM for AI-powered video/audio creation! 🔗 GitHub: github.com/HKUDS/Open-Not… ✨ Create professional videos/audios through simple conversation - no technical expertise required! - 🔥 Key capabilities: - 🧠…
🚀 Excited to introduce KAT-V1 (Kwaipilot-AutoThink) – a breakthrough 40B large language model from the Kwaipilot team! KAT-V1 dynamically switches between reasoning and non-reasoning modes to address the “overthinking” problem in complex reasoning tasks. Key Highlights: 📌 40B…
We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning than Transformers — along with μP++, a suite of simple yet powerful scaling laws for stable large-scale training. 🔗 github.com/microsoft/Arch… (1/4)
Xinyu Yang from CMU will be giving a talk titled "Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation" on Friday, July 25 at 11am HKT (Thursday, July 24 at 8pm PDT). Link to talk: hku.zoom.us/j/92651812689?…
What happened after Dream 7B? First, Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. Plus, DreamOn cracks the variable-length generation problem! It enables code infilling that goes beyond a fixed canvas.
📢 Update: Announcing Dream's next-phase development. - Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. - DreamOn: tackling the variable-length generation problem in dLLMs!
Introducing Dream-Coder 7B -- pushing forward with diffusion language models for code generation💻
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
🚀Check out our Dream-Coder 7B!
🌳TreeSynth: Synthesizing large-scale diverse data from scratch! Struggling with repetition and space collapse in data synthesis? 🤔 Our latest research mitigates this challenge through innovative tree-guided subspace partitioning. ✨Introducing TreeSynth—a novel framework…