Gongfan Fang (@Vinnnnf) · PhD@NUS
Representation Alignment (REPA) is NOT ALWAYS helpful for diffusion training! 🤷 Sharing our latest work w/ @HPCAILab and @VITAGroupUT: "REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training". Up to 28x acceleration w/o performance drop. (🧵1/7)
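For readers who want the gist in code, here is a minimal sketch of the "early-stopped" alignment idea as the title describes it, not the paper's exact recipe: keep the REPA alignment term only for the first part of training, then train on the plain diffusion objective. All names (`denoiser`, `proj`, `encoder`) and the hyperparameters `STOP_STEP`/`LAMBDA` are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

# Placeholder hyperparameters, not values from the paper.
STOP_STEP = 50_000  # step after which the alignment term is dropped
LAMBDA = 0.5        # weight of the alignment term while it is active

def training_loss(denoiser, proj, encoder, x0, xt, noise, t, step):
    # Hypothetical interface: the denoiser returns its noise prediction
    # plus an intermediate hidden state used for alignment.
    noise_pred, hidden = denoiser(xt, t)
    loss = F.mse_loss(noise_pred, noise)  # standard denoising objective
    if step < STOP_STEP:  # the "early-stopped" part: align early, then stop
        with torch.no_grad():
            # Frozen pretrained visual encoder (e.g. DINOv2) on the clean image.
            feats = F.normalize(encoder(x0), dim=-1)
        z = F.normalize(proj(hidden), dim=-1)
        # Cosine-similarity alignment between diffusion features and targets.
        loss = loss + LAMBDA * (1.0 - (z * feats).sum(dim=-1).mean())
    return loss
```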
Thanks @_akhaliq for sharing our work!! LLMs can learn when to think through simple reinforcement learning, reducing unnecessary reasoning by 50%–90% on easy tasks like “1 + 1 = ?”. Check our GitHub for more details: github.com/VainF/Thinkless
Thinkless: LLM Learns When to Think
💡 Can LLMs learn when to think? We introduce Thinkless, an LLM that knows when to think.
🔑 Decoupled GRPO: learns when to think & how to answer
🔀 Cuts reasoning by 50–90%
❌ Stop overthinking "1 + 1"
📎 Paper: arxiv.org/abs/2505.13379
💻 Code: github.com/VainF/Thinkless
#LLM
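The tweet names "Decoupled GRPO" without details, so here is a minimal sketch of the decoupling idea as I read it, not the paper's exact objective: one policy-gradient term for the control token that selects the mode (<think> vs. <short>) and a separate term for the answer tokens, so the two can be weighted independently. The inputs `logp_mode`, `logp_resp`, `adv`, and the weight `ALPHA` are hypothetical.

```python
import torch

# Assumed weighting between the two terms, not a value from the paper.
ALPHA = 1.0

def decoupled_pg_loss(logp_mode: torch.Tensor,
                      logp_resp: torch.Tensor,
                      adv: torch.Tensor,
                      alpha: float = ALPHA) -> torch.Tensor:
    """Policy-gradient loss with separate terms for mode and answer.

    logp_mode: log-prob of the control token (<think> vs. <short>).
    logp_resp: summed log-prob of the answer tokens.
    adv:       GRPO-style group-normalized advantage per sample.
    """
    mode_term = -(alpha * adv * logp_mode).mean()  # learn WHEN to think
    resp_term = -(adv * logp_resp).mean()          # learn HOW to answer
    return mode_term + resp_term
```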

Key Highlights:
⚡ Controllable CoT Length: seamlessly generalizes to unseen reasoning lengths!
🚀 Massive compression gains:
- GSM8K: 741 → 225 tokens with only a minor accuracy drop
- AIME: 6827 → 4629 tokens (📉 32% reduction)
- Goes well beyond prompt-based control
🚀 Controlling CoT Length with a Magic Valve!
What if we could adjust the reasoning chain length for QwQ/DeepSeek-R1 based on task difficulty? 🤔
Our solution: CoT-Valve, a tuning strategy that elastically controls and compresses CoT length.
Paper: arxiv.org/abs/2502.09601
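A minimal sketch of the "valve" idea as I read the abstract, not the paper's training procedure: learn a single parameter-space direction (e.g. a merged LoRA-style delta) whose scale `alpha` elastically controls reasoning length. The state dicts `base` and `delta` are hypothetical, as is the mapping from larger `alpha` to shorter CoT.

```python
def apply_valve(base: dict, delta: dict, alpha: float) -> dict:
    """Interpolate model weights along the length-control direction.

    base:  state dict of the original reasoning model.
    delta: learned length-control direction with matching keys.
    alpha: valve setting; sweeping it trades CoT length vs. accuracy.
    """
    return {name: w + alpha * delta[name] for name, w in base.items()}

# Hypothetical usage: load progressively "shorter-thinking" variants.
# for alpha in (0.0, 0.5, 1.0):
#     model.load_state_dict(apply_valve(base_sd, valve_delta, alpha))
```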