Peixuan Han (韩沛煊)
@peixuanhakhan
1st-year Ph.D. student at UIUC @IllinoisCS | Amazon '25 Summer Intern | LLM researcher
How can we unlock generalized reasoning? ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards. TL;DR:
- EBTs are the first model to out-scale the…
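A minimal sketch of the energy-based "thinking" loop this thread describes, assuming a toy quadratic energy in place of a learned transformer scorer (all names here are illustrative, not the authors' code). Prediction becomes inference-time optimization: start from a guess and descend the energy.

import torch

# Toy stand-in for a learned energy function: low energy = good prediction.
# In a real EBT, a transformer would score (context, candidate) pairs.
def energy(context: torch.Tensor, candidate: torch.Tensor) -> torch.Tensor:
    return ((candidate - context.mean()) ** 2).sum()

context = torch.randn(8)
candidate = torch.randn(4, requires_grad=True)  # initial guess

# "Thinking" = iterative refinement of the candidate at inference time.
optimizer = torch.optim.SGD([candidate], lr=0.1)
for _ in range(50):
    optimizer.zero_grad()
    loss = energy(context, candidate)
    loss.backward()
    optimizer.step()

print(f"final energy: {energy(context, candidate).item():.4f}")

Under this reading, spending more descent steps is how the model "thinks longer" about a harder input.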
Super excited to begin my Applied Scientist internship at @amazon, my first internship in industry. I'm looking forward to conducting interesting and insightful research on efficient reasoning in LLMs!
Can LLMs make rational decisions like human experts? 📖Introducing DecisionFlow: Advancing Large Language Model as Principled Decision Maker We introduce a novel framework that constructs a semantically grounded decision space to evaluate trade-offs in hard decision-making…
⚠️ Rogue AI scientists? 🛡️ SafeScientist rejects unsafe prompts for ethical discoveries. Check out paper ➡️ (arxiv.org/pdf/2505.23559) #AISafety #LLM #SafeAI #AI
📢 New Paper Drop: From Solving to Modeling! LLMs can solve math problems — but can they model the real world? 🌍 📄 arXiv: arxiv.org/pdf/2505.15068 💻 Code: github.com/qiancheng0/Mod… Introducing ModelingAgent, a breakthrough system for real-world mathematical modeling with LLMs.
💥Time-R1 is here! Can a 3B LLM truly grasp time? 🤔 YES! Excited to share our new work, Time-R1: Towards Comprehensive Temporal Reasoning in LLMs 🚀 Check it out: 📖 Paper: arxiv.org/abs/2505.13508 💻 Code: github.com/ulab-uiuc/Time… #TemporalReasoning #RL #LLMs
We introduce Gradient Variance Minimization (GVM)-RAFT, a principled dynamic sampling strategy that minimizes gradient variance to improve the efficiency of chain-of-thought (CoT) training in LLMs.
– Achieves 2–4× faster convergence than RAFT
– Improves accuracy on math…
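A hedged sketch of the dynamic-sampling idea, assuming the rollout budget is split in proportion to each prompt's estimated gradient noise (the allocation rule and all names are illustrative, not the paper's implementation):

import numpy as np

def allocate_budget(variance_estimates, total_budget, min_samples=1):
    # Stddev-proportional allocation: noisier prompts get more rollouts.
    s = np.sqrt(np.asarray(variance_estimates, dtype=float))
    n = np.maximum(min_samples, np.floor(total_budget * s / s.sum())).astype(int)
    order = np.argsort(-s)  # hand out any leftover budget, noisiest prompts first
    i = 0
    while n.sum() < total_budget:
        n[order[i % len(order)]] += 1
        i += 1
    return n

per_prompt_variance = [0.02, 0.50, 0.10, 0.90]  # estimated per-prompt gradient variance
print(allocate_budget(per_prompt_variance, total_budget=32))  # -> [ 2 11  4 15]

The design intuition: sampling uniformly wastes rollouts on prompts whose gradient estimate is already stable, so reallocating toward high-variance prompts reduces overall estimator variance for the same budget.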
🚀 Can we cast reward modeling as a reasoning task? 📖 Introducing our new paper: RM-R1: Reward Modeling as Reasoning 📑 Paper: arxiv.org/pdf/2505.02387 💻 Code: github.com/RM-R1-UIUC/RM-… Inspired by recent advances in long chain-of-thought (CoT) on reasoning-intensive tasks, we…
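A hedged sketch of the "reward modeling as reasoning" recipe: the judge writes a rationale first and only then a verdict, which is parsed into a preference. judge() below is a dummy stand-in for a chat-completion call, and the template is illustrative rather than the paper's prompt.

JUDGE_TEMPLATE = """Compare two answers to the question below.
Question: {q}
Answer A: {a}
Answer B: {b}
Reason step by step, then finish with exactly 'Verdict: A' or 'Verdict: B'."""

def judge(prompt: str) -> str:
    # Dummy judge; wire this to a real model in practice.
    return "A is arithmetically correct while B is not. Verdict: A"

def preference(q: str, a: str, b: str) -> str:
    rationale = judge(JUDGE_TEMPLATE.format(q=q, a=a, b=b))
    return rationale.rsplit("Verdict:", 1)[-1].strip()

print(preference("What is 2+2?", "4", "5"))  # -> A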
🧪 Want an AI-generated paper draft in just 1 minute? Or dreaming of building auto-research apps but frustrated with setups? Meet tiny-scientist, a minimal package to start AI-powered research: 👉 pip install tiny-scientist 🔗 github.com/ulab-uiuc/tiny… #AIAgent #pythonpackages
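A hypothetical usage sketch after installing; the class and method names below are placeholders, not the package's documented API (check the linked repo for the real interface):

# NOTE: hypothetical interface, for illustration only.
from tiny_scientist import TinyScientist  # placeholder import

scientist = TinyScientist(model="gpt-4o")                     # placeholder constructor
idea = scientist.think(intent="efficient reasoning in LLMs")  # placeholder call
draft = scientist.write(idea)                                 # placeholder call
print(draft)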
Introducing Qwen3! We release the open-weight Qwen3 family, our latest large language models, including 2 MoE models and 6 dense models ranging from 0.6B to 235B parameters. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general…
🚀 Excited to announce that our paper Search-R1 is now live! 📄 We introduce an RL framework (an extension of Deepseek-R1) for training reasoning-and-retrieval interleaved LLMs. We're also open-sourcing all resources: models, data, and more! 📜 Paper:…
🚀 Introducing Search-R1 – the first reproduction of Deepseek-R1 (zero) for training reasoning and search-augmented LLM agents with reinforcement learning! This is a step towards training an open-source OpenAI “Deep…
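A hedged sketch of the reasoning-and-retrieval interleaved rollout these two posts describe (illustrative control flow, not the Search-R1 codebase). generate() and retrieve() are dummy stand-ins for a policy LLM and a search engine.

def generate(trajectory: str) -> str:
    # Dummy policy: asks one search query, then answers once evidence arrives.
    if "<information>" in trajectory:
        return "<answer>Paris</answer>"
    return "<search>capital of France</search>"

def retrieve(query: str) -> str:
    # Dummy retriever: canned passage; a real system would hit a search index.
    return f"Top passage for '{query}': Paris is the capital of France."

def rollout(question: str, max_turns: int = 4) -> str:
    trajectory = question
    for _ in range(max_turns):
        step = generate(trajectory)
        trajectory += "\n" + step
        if "<answer>" in step:  # episode ends when the model answers
            break
        if "<search>" in step:  # environment fulfills the search call
            query = step.split("<search>")[1].split("</search>")[0]
            trajectory += f"\n<information>{retrieve(query)}</information>"
    return trajectory

print(rollout("What is the capital of France?"))

The RL part then treats the whole trajectory, search calls included, as the policy's actions to be rewarded.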
Gemma 3 is here! Our new open models are incredibly efficient - the largest 27B model runs on just one H100 GPU. You'd need at least 10x the compute to get similar performance from other models ⬇️