Dongwei Jiang
@Dongwei__Jiang
Working on LLMs, with a focus on reasoning and self-improvement. Spent six years in a past life doing industry research on speech processing.
🧵 Recent studies show LLMs can self-improve their responses when given external feedback. But how effectively can they incorporate it? We tested this systematically—and found they can't fully integrate feedback, even when the feedback is high-quality and backed by ground-truth.
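For concreteness, the loop being probed here can be sketched in a few lines. This is a minimal sketch, not the paper's harness: `generate`, `get_feedback`, and `is_correct` are placeholder callables you would supply.

```python
# Minimal sketch of iterative refinement with external feedback.
# generate(prompt) -> str, get_feedback(answer, gold) -> str, is_correct(answer, gold) -> bool
# are placeholders, not the paper's actual code.
def refine_with_feedback(prompt, gold, generate, get_feedback, is_correct, max_rounds=5):
    answer = generate(prompt)
    for round_idx in range(max_rounds):
        if is_correct(answer, gold):
            return answer, round_idx  # feedback fully incorporated
        feedback = get_feedback(answer, gold)  # high-quality, ground-truth-backed
        revision_prompt = (
            f"{prompt}\n\nYour previous answer:\n{answer}\n\n"
            f"Feedback:\n{feedback}\n\nPlease revise your answer."
        )
        answer = generate(revision_prompt)
    return answer, max_rounds  # still wrong after repeated feedback: "feedback friction"
```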

Incredibly grateful to @TheOfficialACM SIGPLAN for awarding #LeanLang the Programming Languages Software Award 2025 at #PLDI2025! 🎉 "The Lean theorem prover is a remarkable software artifact... Lean has had and continues to have a broad impact on industrial practice and…
🚨🚨 New paper out with @Dongwei__Jiang and team: Even with near-perfect, ground-truth feedback, LLMs often fail to fully integrate it. We call this "feedback friction"—a key barrier to self-improvement. x.com/Dongwei__Jiang…
🚨 We discovered a surprising side effect of Reinforcement Finetuning (RFT): it makes LLMs more confidently wrong on unanswerable questions. We call this the hallucination tax: a drop in refusal behavior that leads to overconfident hallucinations. 🧵 1/n
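One way to make the "drop in refusal behavior" measurable is to compare refusal rates on unanswerable questions before and after finetuning. The sketch below is an illustration, assuming a placeholder `answer_question(model, q)` inference call and a keyword-based refusal check, not the paper's exact metric.

```python
# Sketch: refusal rate on unanswerable questions for a given model variant.
REFUSAL_MARKERS = ("i don't know", "cannot be answered", "not enough information",
                   "i'm not sure", "unanswerable")

def is_refusal(response):
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(model, unanswerable_questions, answer_question):
    refusals = sum(is_refusal(answer_question(model, q)) for q in unanswerable_questions)
    return refusals / len(unanswerable_questions)

# A lower refusal_rate(rft_model, ...) than refusal_rate(base_model, ...), on questions
# that have no correct answer, is the "hallucination tax" pattern described above.
```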
We've been thinking about this gap too! Our paper (arxiv.org/abs/2404.04298) found that when verifiable environments aren't available, LLMs are no better at discriminating among their previously generated alternatives than they are at generating initial responses.
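The generation-vs-discrimination gap can be stated as two accuracies over the same questions: answer directly, or sample k candidates and let the same model pick one. The sketch below is generic, with `sample_answer` and `pick_best` as placeholder prompting functions rather than the paper's code.

```python
# Sketch: generation accuracy vs. discrimination accuracy for one model.
def generation_accuracy(model, dataset, sample_answer, is_correct):
    return sum(is_correct(sample_answer(model, q), gold) for q, gold in dataset) / len(dataset)

def discrimination_accuracy(model, dataset, sample_answer, pick_best, is_correct, k=5):
    correct = 0
    for q, gold in dataset:
        candidates = [sample_answer(model, q) for _ in range(k)]
        chosen = pick_best(model, q, candidates)  # model judges its own samples
        correct += is_correct(chosen, gold)
    return correct / len(dataset)

# The Self-[In]Correct observation: without a verifiable environment, the second
# number is not reliably higher than the first.
```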
Now accepted by #ACL2025! Thrilled to see our paper also referenced in @lilianweng's latest blog post on reasoning in LLMs! Check it out: lilianweng.github.io/posts/2025-05-…
Process supervision for reasoning is 🔥! While previous approaches often relied on human annotation and struggled to generalize across different reasoning tasks, we're now asking: Can we improve this? Introducing 𝐑𝐀𝐓𝐈𝐎𝐍𝐀𝐋𝐘𝐒𝐓: a new model pre-trained on implicit…
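As a generic illustration of step-level (process) supervision, and not RATIONALYST's actual training recipe, one can score each intermediate reasoning step and use the scores to rerank candidate chains; `score_step` below is a placeholder for whatever step-level scorer is available.

```python
# Generic process-supervision sketch: score each step given the question and the
# steps before it, then rank full chains by their weakest step.
def process_score(question, steps, score_step):
    scores = [score_step(question, steps[:i], step) for i, step in enumerate(steps)]
    return min(scores) if scores else 0.0

def rerank_chains(question, candidate_chains, score_step):
    return max(candidate_chains, key=lambda steps: process_score(question, steps, score_step))
```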
Excited to be presenting our paper on training language models under heavily imbalanced data tomorrow at #NAACL2025! If you want to chat about data curation for both pre- and post-training, feel free to reach out! 📝 arxiv.org/abs/2410.04579 📅 11-12:30am, Fri, May 2 📍 Hall 3
"Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets" arxiv.org/abs/2410.04579 TLDR—When pre-training on imbalanced data, "Upsampling" and loss "Upweighting" are often assumed equivalent. (1)We show they behave differently. (2) Using this, we propose…
Current copyright mitigation methods for LLMs typically focus on average-case risks, but overlook worst-case scenarios involving long verbatim copying ⚠️. We propose BloomScrub 🧽, a method providing certified mitigation of worst-case infringement while preserving utility.
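The worst-case risk described here, long verbatim copying, can be illustrated with a toy n-gram overlap check. This is a generic sketch rather than BloomScrub's method (the Bloom-filter framing is only a guess from the name), and a plain set stands in for a space-efficient filter.

```python
# Toy sketch: flag long verbatim spans shared between generated text and a protected corpus.
def ngrams(text, n):
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_index(corpus_docs, n=8):
    index = set()  # at scale this would be a Bloom filter or similar sketch structure
    for doc in corpus_docs:
        index.update(ngrams(doc, n))
    return index

def longest_copied_span(generated, index, n=8):
    """Length in tokens of the longest run of consecutive corpus n-grams in `generated`."""
    best = run = 0
    for gram in ngrams(generated, n):
        run = run + 1 if gram in index else 0
        best = max(best, run)
    return best + n - 1 if best else 0
```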
Reasoning to Learn from Latent Thoughts "Motivated by how humans apply deliberate thinking to learn from limited data, we train an LM to infer (or “decompress”) latent thoughts underlying the highly compressed observed data. These synthesized latent thoughts augment the raw…
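The augmentation idea quoted above can be sketched generically: have an LM infer a "latent thought" for each raw document, then train on the thought followed by the document. `infer_thought` below is a placeholder for that LM call, not the paper's pipeline.

```python
# Sketch: augment raw pretraining documents with synthesized latent thoughts.
def augment_with_latent_thoughts(raw_docs, infer_thought):
    augmented = []
    for doc in raw_docs:
        thought = infer_thought(doc)  # "what reasoning would produce this text?"
        augmented.append(f"<thought>\n{thought}\n</thought>\n{doc}")
    return augmented
```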
"Verification, The Key to AI." Read the archives of Rich Sutton (Turing Award winner :D); they have all the major ideas.
This isn't quite true. Test-time compute helps when verification is easier than generation (e.g., sudoku), but if the task is "When was George Washington born?" and you don't know, no amount of thinking will get you to the correct answer. You're bottlenecked by verification.
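The Sudoku example makes the asymmetry concrete: checking a completed grid takes a few lines, while producing one is a search problem.

```python
# Verifying a completed 9x9 Sudoku grid is cheap and mechanical, unlike generating one.
def is_valid_sudoku(grid):
    """grid: 9x9 list of lists of digits 1-9; True iff it is a valid solution."""
    expected = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in range(0, 9, 3) for bc in range(0, 9, 3)
    ]
    return all(group == expected for group in rows + cols + boxes)
```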
I'll be at #AAAI25 presenting my poster on Self-[In]Correct (arxiv.org/abs/2404.04298) during Session 3 on March 1st at 12:30. Would love to connect if you're attending!