Zhaopeng Tu
@tuzhaopeng
Principal Researcher, Hunyuan AI Digital Human, Tencent
We've taught LLMs math and code with RLVR. But can we teach them empathy? 🤖❤️ Introducing Reinforcement Learning with Verifiable Emotion Rewards (RLVER), the first RLVR framework that enhances LLMs' empathy with reward signals from a simulated user. ❤️ Feelings → Numbers: A…
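For readers curious what "feelings → numbers" could look like in practice, here is a minimal, hypothetical sketch: a simulated user reports an emotion score, and the change in that score becomes a scalar reward. The class and function names are illustrative assumptions, not the RLVER codebase.

```python
# Hypothetical sketch (not the RLVER implementation): turning a simulated user's
# emotional reaction into a scalar reward for RLVR-style policy optimization.
# The 0-100 emotion scale and the reward definition are assumptions.

from dataclasses import dataclass

@dataclass
class SimulatedUserState:
    emotion_score: float  # assumed 0-100 rating of how supported the user feels


def emotion_reward(state_before: SimulatedUserState,
                   state_after: SimulatedUserState) -> float:
    """Reward = improvement in the simulated user's emotion, scaled to [-1, 1]."""
    return (state_after.emotion_score - state_before.emotion_score) / 100.0


# Example: the assistant's reply lifts the simulated user's mood from 40 to 65.
before = SimulatedUserState(emotion_score=40.0)
after = SimulatedUserState(emotion_score=65.0)
print(emotion_reward(before, after))  # 0.25, usable as a verifiable RL reward
```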
Can today's LLMs truly understand you, not just your words? 🤖❤️ Introducing SAGE: Sentient Agent as a Judge — the first evaluation framework that uses sentient agents to simulate human emotional dynamics and inner reasoning for assessing social cognition in LLM conversations.…
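As a rough illustration of the sentient-agent-as-a-judge idea, the sketch below runs a multi-turn dialogue against a simulated user that tracks an explicit emotional state and returns its final emotion as the score. The callables and the 0–100 scale are assumptions, not the released SAGE evaluation code.

```python
# Illustrative evaluation loop in the spirit of SAGE. The agent keeps an explicit
# emotional state (and, in the real framework, an inner monologue); the final
# emotion is read off as a social-cognition score. simulate_user_turn and
# assistant_reply are placeholder callables, not the SAGE API.

def evaluate_dialogue(assistant_reply, simulate_user_turn, num_turns: int = 5) -> float:
    """Run a multi-turn chat against a simulated user and return its final emotion (0-100)."""
    emotion = 50.0   # neutral starting mood (assumed scale)
    history = []
    for _ in range(num_turns):
        user_msg, emotion = simulate_user_turn(history, emotion)  # agent reasons, updates emotion
        history.append(("user", user_msg))
        history.append(("assistant", assistant_reply(history)))
    return emotion   # higher = better perceived social cognition
```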
🚨 Sentient Leaderboard shake-up! 🚨 🏆 A new king is crowned! The Gemini2.5-pro-0605 version has skyrocketed to #1 on our Sentient Leaderboard, with its social cognition score jumping from 62.9 to a massive 82.4! 🧠 First model to break the 80-point barrier in emotion reading &…
🚨 New LLMs have just landed on the Sentient Leaderboard from the SAGE benchmark! 🚨 Can our favorite "helpful" models also excel socially? 🥉 New Gemini2.5 models (Flash-Think and Pro) firmly hold positions #3 and #4, with scores largely unchanged. 🔥 Claude4 debuts at #7,…
We're thrilled to release & open-source Hunyuan3D World Model 1.0! This model enables you to generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It's the industry's first open-source 3D world generation model, compatible with CG pipelines…
Just stumbled upon an interesting finding: 16.8% of DeepSeek-R1-0528's outputs on the simple ASDIV dataset include the thinking sentence "I think I'm overthinking it." Seems like our overthinking work (accepted by #ICML2025) might be having a subtle impact after…
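If you want to run a similar count on your own generations, a few lines of Python suffice; the file name and JSON field below are placeholders for wherever your outputs are stored.

```python
# Count the fraction of model outputs containing the marker sentence.
# "asdiv_r1_0528_outputs.jsonl" and the "output" field are hypothetical.

import json

MARKER = "I think I'm overthinking it."

with open("asdiv_r1_0528_outputs.jsonl") as f:
    outputs = [json.loads(line)["output"] for line in f]

hits = sum(MARKER in text for text in outputs)
print(f"{hits / len(outputs):.1%} of outputs contain the marker sentence")
```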
Are we scaling test-time compute efficiently and intelligently? Introducing a comprehensive study on the prevalent issue of overthinking in o1-like models, where excessive computational resources are allocated for simple problems with minimal benefit. 🪡Across extensive…
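One simple way to quantify this kind of waste (a rough proxy, not necessarily the paper's exact metrics) is to compare the tokens needed to reach the first correct solution with the total tokens generated:

```python
# Minimal proxy for overthinking: what fraction of generated tokens were actually
# needed to reach the first correct solution? A low ratio means most compute went
# into redundant re-solving of an already-solved problem.

def outcome_efficiency(solution_token_counts: list[int], first_correct_idx: int) -> float:
    """Fraction of generated tokens spent up to and including the first correct solution."""
    useful = sum(solution_token_counts[: first_correct_idx + 1])
    total = sum(solution_token_counts)
    return useful / total

# Example: five solution rounds of 120, 90, 200, 180, 150 tokens; the first round
# was already correct, so only 120 / 740 ≈ 16% of the tokens were strictly needed.
print(outcome_efficiency([120, 90, 200, 180, 150], first_correct_idx=0))
```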
Thank you for recommending RLVER! It’s a step forward in training models that respond with empathy — by turning simulated emotional reactions into measurable feedback, we’re making emotional intelligence teachable. x.com/tuzhaopeng/sta…
Tencent just released RLVER on Hugging Face! It's a pioneering reinforcement learning framework that uses verifiable emotion rewards to create truly empathetic AI agents.
Today's large models "appear empathetic" mainly for two reasons: ① in the pretraining stage, they learn empathetic phrasing from massive amounts of dialogue data; ② in the alignment stage (e.g., RLHF or RLVER), emotional or human feedback signals further reinforce this "empathetic" behavior. SAGE uses emotional reasoning to turn this "emotional value" into a traceable 0–100 score, giving RLVER…
「Sentient Agent, Emotional Value」 Why do large models seem so empathetic and so good at providing emotional value? Sharing two papers that look at a dimension beyond an LLM's IQ: its EQ, i.e., its ability to provide emotional value. [Paper 1] Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models [Paper 2]…
Thank you for including RLVER! It’s a step forward in training models that respond with empathy — by turning simulated emotional reactions into measurable feedback, we’re making emotional intelligence teachable. x.com/tuzhaopeng/sta…
Check out RLVER — the first RLVR framework to boost LLM empathy, using a simulated user that turns emotional reactions into reward signals. We’re open-sourcing code, checkpoints, and scripts to accelerate research into emotionally intelligent AI! Learn more:…
Thrilled to share that our work "Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms" has been selected as an ACL 2025 Oral (Top 8.1%)! 🎉 The first author Mengru Wang will be presenting it in person — feel free to drop by and chat! Looking…
Is it possible to “give medicine” (vectors) to large language models to control their behavior at inference time? 🚀 Excited to share our ACL 2025 work: Steering Target Atoms (STA) — an approach to controlling LLM behavior at inference time without retraining. #ACL2025 #LLM #AI…
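For context, the snippet below shows the generic activation-steering idea that work in this space builds on: injecting a fixed direction into one layer's hidden states via a forward hook at inference time. The model, layer index, and (zero) steering vector are placeholders; STA itself targets finer-grained "atoms" rather than steering a whole layer like this.

```python
# Generic activation-steering sketch (not the STA method): add a fixed direction
# to a chosen layer's hidden states during generation via a forward hook.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer = model.transformer.h[6]                   # layer to steer (arbitrary choice)
steer = torch.randn(model.config.n_embd) * 0.0   # placeholder direction; zero = no-op

def add_steering(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + steer                      # inject the steering direction
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = layer.register_forward_hook(add_steering)
ids = tok("The weather today is", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=10)[0]))
handle.remove()
```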
🚀 Atomic-to-Compositional Generalization for Mobile Agents 🧠 A new benchmark & scheduling system to push the limits of mobile agents. 📄 Paper: arxiv.org/abs/2506.08972 🌐 Website: ui-nexus.github.io 🧵1/n
🔴 We’re live. Join the Tencent Hunyuan #ICML2025 Frontier Papers Workshop now!
🚀 Hunyuan 3D 2.1 is here! The first fully open-source, production-ready PBR 3D generative model! ✅Cinema-grade visuals: PBR material synthesis brings leather, bronze, and more to life with stunning light interactions. ✅ Fully open-source: Model weights, training/inference…
Thanks for adopting DeepMath-103K! It's great to have both RLVR and agentic RL support!
🚀 Introducing ROLL: An Efficient and User-Friendly RL Training Framework for Large-Scale Learning! 🔥 Efficient, Scalable & Flexible – Train 200B+ models with 5D parallelism (TP/PP/CP/EP/DP), seamless vLLM/SGLang switching, async multi-env rollout for maximum RL throughput!
Had a great time at this CVPR community-building workshop, with lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…
In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209 Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…
Can image safeguards be bypassed by breaking harmful prompts into harmless steps? ⚠️ Introducing Chain-of-Jailbreak (CoJ) Attack, a novel method showing how image generation models (e.g., GPT-4V/o, Gemini 1.5) can be compromised by decomposing malicious queries into a…

Can MLLMs truly "see" safety risks in image-text combinations? 🌲🖼️ Introducing MMSafetyAwareness, the first comprehensive benchmark for multimodal safety awareness in MLLMs, featuring 1,500 image-prompt pairs across 29 safety scenarios to evaluate whether models correctly…
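A benchmark like this is typically scored as simple agreement between the model's safety verdict and a ground-truth label for each image-prompt pair. The sketch below shows such a scoring loop; the field names and judging rule are assumptions, not the released evaluation code.

```python
# Illustrative scoring loop for a safety-awareness benchmark of this kind.
# Field names ("image", "prompt", "is_unsafe") and the True/False judging rule
# are placeholders, not the MMSafetyAwareness release.

def safety_awareness_accuracy(examples, model_judgment) -> float:
    """examples: list of dicts with 'image', 'prompt', 'is_unsafe' (bool).
    model_judgment(image, prompt) -> bool, True if the model flags the pair as unsafe."""
    correct = sum(model_judgment(ex["image"], ex["prompt"]) == ex["is_unsafe"]
                  for ex in examples)
    return correct / len(examples)
```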
