chang ma
@ma_chang_nlp
Ph.D. student @hku, previously @PKU1898. I work on agents and science.
“What is the answer to 1 + 1?” Large Reasoning Models (LRMs) may generate 1500+ tokens just to answer this trivial question. Too much thinking 🤯 Can LRMs be both Faster AND Stronger? Yes. Introducing LASER💥: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping…
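A minimal sketch of what length-based reward shaping can look like in an RL loop, assuming a correctness reward plus a brevity bonus under an adaptive token budget. The reward values and budget schedule here are illustrative placeholders, not LASER's exact formulation.

```python
# Illustrative sketch of length-based reward shaping for RL fine-tuning of an LRM.
# NOT the exact LASER formulation; thresholds and reward values are placeholders.

def shaped_reward(is_correct: bool, num_tokens: int, length_budget: int) -> float:
    """Reward correct answers, and grant an extra bonus only when the
    reasoning trace stays within the (adaptive) token budget."""
    base = 1.0 if is_correct else 0.0
    bonus = 0.5 if (is_correct and num_tokens <= length_budget) else 0.0
    return base + bonus

def adapt_budget(budget: int, recent_accuracy: float,
                 shrink: float = 0.9, grow: float = 1.1) -> int:
    """Toy adaptive schedule: tighten the budget when the model is accurate,
    relax it when accuracy drops."""
    factor = shrink if recent_accuracy > 0.8 else grow
    return max(64, int(budget * factor))

# Example: a verbose but correct 1500-token answer to "1 + 1" gets no brevity bonus.
print(shaped_reward(True, 1500, length_budget=256))   # 1.0
print(shaped_reward(True, 40, length_budget=256))     # 1.5
```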
Xinyu Yang from CMU will be giving a talk titled "Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation" at Friday July 25 11am HKT (Thursday July 24 8pm PDT). Link to talk: hku.zoom.us/j/92651812689?…
🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
What happened after Dream 7B? First, Dream-Coder 7B: a fully open diffusion LLM for code delivering strong performance, trained exclusively on public data. Plus, DreamOn cracks the variable-length generation problem! It enables code infilling that goes beyond a fixed canvas.
We present DreamOn: a simple yet effective method for variable-length generation in diffusion language models. Our approach boosts code infilling performance significantly and even catches up with oracle results.
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation. Apple introduces DiffuCoder, a 7B diffusion LLM trained on 130B tokens of code. The authors also propose a diffusion-native RL training framework, coupled-GRPO. Decoding of dLLMs differs from…
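For context, a minimal sketch of the group-relative advantage that GRPO-style methods build on. The diffusion-specific "coupling" of complementary mask samples in coupled-GRPO is not reproduced here; this only shows the baseline idea.

```python
# Minimal sketch of the group-relative advantage at the core of GRPO-style training.
import statistics

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each sampled completion's reward against its own group,
    removing the need for a learned value baseline."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: 4 completions sampled for the same coding prompt, scored by unit tests.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```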
Hongru Wang from CUHK will be giving a talk titled "Theory of agent: from definition to objective" at ⏰Wednesday 6.11 3pm HKT (Thursday 6.11 11am PDT). Link to talk: hku.zoom.us/j/91654661534?…
I always found it puzzling how language models learn so much from next-token prediction, while video models learn so little from next-frame prediction. Maybe it's because LLMs are actually brain scanners in disguise. Idle musings in my new blog post: sergeylevine.substack.com/p/language-mod…
🔍 Are Verifiers Trustworthy in RLVR? Our paper, Pitfalls of Rule- and Model-based Verifiers, exposes the critical flaws in reinforcement learning verification for mathematical reasoning. 🔑 Key findings: 1️⃣ Rule-based verifiers miss correct answers, especially when presented in…
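A toy illustration of the first pitfall, assuming a strict string-matching rule verifier and a hypothetical tolerant variant; this is not the paper's evaluation code.

```python
# Toy illustration of the failure mode: a strict rule-based verifier rejects
# answers that are mathematically correct but formatted differently.
from fractions import Fraction

def strict_rule_verifier(prediction: str, gold: str) -> bool:
    """Exact string match after trivial whitespace stripping."""
    return prediction.strip() == gold.strip()

def tolerant_verifier(prediction: str, gold: str) -> bool:
    """Compare as exact rationals, so '0.5' and '1/2' agree."""
    try:
        return Fraction(prediction.strip()) == Fraction(gold.strip())
    except (ValueError, ZeroDivisionError):
        return prediction.strip() == gold.strip()

gold = "1/2"
print(strict_rule_verifier("0.5", gold))   # False -> correct answer marked wrong
print(tolerant_verifier("0.5", gold))      # True
```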
🔥 Meet PromptCoT-Mamba The first reasoning model with constant-memory inference to beat Transformers on competition-level math & code ⚡ Efficient decoding: no attention, no KV cache ⚡ +16.0% / +7.1% / +16.6% vs. s1.1-7B on AIME 24 / 25 / LiveCodeBench 🚀 Up to 3.66× faster
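A back-of-the-envelope sketch of why dropping the KV cache gives constant-memory decoding: attention caches keys and values for every past token, while a recurrent/state-space decoder keeps a fixed-size state. The dimensions below are placeholders, not PromptCoT-Mamba's actual configuration.

```python
# Contrast KV-cache decoding (memory grows with sequence length) with
# constant-memory recurrent decoding. Numbers are illustrative placeholders.

def kv_cache_floats(seq_len: int, n_layers: int, n_heads: int, head_dim: int) -> int:
    """Keys + values cached for every past token: grows linearly with seq_len."""
    return 2 * seq_len * n_layers * n_heads * head_dim

def ssm_state_floats(n_layers: int, d_model: int, state_dim: int) -> int:
    """A state-space decoder keeps one fixed-size state per layer,
    independent of how many tokens have been generated."""
    return n_layers * d_model * state_dim

for seq_len in (1_000, 10_000, 100_000):
    print(seq_len, kv_cache_floats(seq_len, 32, 32, 128), ssm_state_floats(32, 4096, 16))
```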
Sharing another #ICML25 paper of ours: “Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging”! (1/5) We use model merging to enhance VLMs' reasoning by integrating math-focused LLMs—bringing textual reasoning into multi-modal models. Surprisingly, this…
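For intuition, a minimal sketch of weight-space merging between a VLM's language backbone and an architecture-matched math LLM, assuming simple linear interpolation; the paper's actual merging recipe may differ.

```python
# Minimal sketch of weight-space model merging (linear interpolation of matching
# parameters). Illustrative only; not the paper's exact procedure.
import torch

def linear_merge(vlm_lm_state: dict, math_llm_state: dict, alpha: float = 0.5) -> dict:
    """Interpolate matching parameters of the VLM's language backbone with a
    math-specialized LLM sharing the same architecture."""
    merged = {}
    for name, w in vlm_lm_state.items():
        if name in math_llm_state and math_llm_state[name].shape == w.shape:
            merged[name] = (1 - alpha) * w + alpha * math_llm_state[name]
        else:
            merged[name] = w  # keep VLM-only parameters (e.g., vision projector)
    return merged

# Usage sketch with dummy state dicts (real ones would come from checkpoints):
a = {"layer.weight": torch.ones(2, 2)}
b = {"layer.weight": torch.zeros(2, 2)}
print(linear_merge(a, b, alpha=0.25)["layer.weight"])
```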
Guanqi Jiang from UCSD will be giving a talk titled "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets" at ⏰Friday 5.16 11am HKT (Thursday 5.15 8pm PDT). Link to talk: hku.zoom.us/j/97674910858?…
Follow our new HKUNLP seminars at hkunlp.github.io/seminar/. You can also sign up as a speaker to share your work!
We are kicking off a series of seminars at @hkunlp2020. @siyan_zhao will be giving a talk titled "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" at ⏰Friday 5.9 11am HKT (Thursday 5.8 8pm PDT). Link to talk: hku.zoom.us/j/97925412724?…

🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:…
Excited to be in Singapore 🇸🇬 for #ICLR2025! Thrilled to be attending for the first time, after past visa issues kept me away 😢. We'll be presenting our work on: 1️⃣ Jailbreaking as a Reward Misspecification Problem 🗓️ Thursday, April 24 — 3:00 PM - 5:30 PM (SGT) 📍 Hall 3 + Hall 2B —…
Excited to share our work at ICLR 2025 in 🇸🇬. @iclr_conf 🥳 Happy to chat about LLM reasoning & planning, agents, and AI4Science! 📍Sat 26 Apr, 3:00 p.m.–5:30 p.m. CST, Hall 3 + Hall 2B #554

Thanks for sharing our work -- our approach to improving GUI agents with other data-rich sources.
9/🧵 Breaking the Data Barrier – Building GUI Agents Through Task Generalization. This paper presents a mid-training approach that uses Vision Language Models to enhance GUI agent performance by leveraging diverse, reasoning-intensive tasks, resulting in significant improvements in…
Excited to share our work studying CLIP and LLaVA on chart understanding tasks 🔍 As the CLIP vision encoder serves as LLaVA's visual information source, have you considered how CLIP's abilities affect LLaVA? 📄Paper: arxiv.org/abs/2503.18435 📷Code: github.com/hkust-nlp/Visi… (1/5)
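A hedged sketch of probing the CLIP vision encoder that LLaVA-style models rely on, via zero-shot scoring of a chart image with Hugging Face transformers. The checkpoint name, image file, and label set are assumptions for illustration, not the paper's evaluation protocol.

```python
# Zero-shot probe of a CLIP checkpoint on a chart image (illustrative setup only).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "openai/clip-vit-large-patch14"   # assumed checkpoint for illustration
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("chart.png")              # any chart screenshot
labels = ["a bar chart", "a line chart", "a pie chart", "a scatter plot"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")
```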