Andrea Zanette @ ICML 2025
@Zanette_ai
Assistant professor at CMU
Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: tinyurl.com/heuristics-con…
Excited to release AbstentionBench -- our paper and benchmark on evaluating LLMs’ *abstention*: the skill of knowing when NOT to answer! Key finding: reasoning LLMs struggle with unanswerable questions and hallucinate! Details and links to paper & open source code below! 🧵1/9
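To make the abstention skill concrete, here is a tiny illustrative sketch of how one might score it. This is my own toy metric, not AbstentionBench's actual protocol; all field names are assumptions.

```python
def abstention_score(records):
    """Toy scoring sketch (NOT AbstentionBench's real metric).

    Each record marks whether the question was unanswerable and whether
    the model abstained. The model is credited when it abstains on an
    unanswerable question, or answers an answerable one."""
    credited = sum(r["abstained"] == r["unanswerable"] for r in records)
    return credited / len(records)

records = [
    {"unanswerable": True,  "abstained": False},  # hallucinated an answer
    {"unanswerable": True,  "abstained": True},   # correctly declined
    {"unanswerable": False, "abstained": False},  # answered an answerable question
]
print(abstention_score(records))  # 2/3
```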
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n
(1/n) Check out our new paper on accelerating RL training for LLMs! zanette-labs.github.io/speed-rl/ We propose SPEED, an online curriculum learning method for rule-based RL training of reasoning models. SPEED achieves 2x to 6x speedups across training setups and benchmarks.
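For intuition, here is a hedged sketch of the general online-curriculum idea, not SPEED's actual selection rule; the function and the pass-rate estimates are my illustration.

```python
def select_informative_prompts(prompts, est_pass_rate, low=0.1, high=0.9):
    """Toy online-curriculum filter (illustrative, NOT SPEED's algorithm).

    Under rule-based RL, prompts the model always solves (pass rate ~1)
    or never solves (pass rate ~0) yield near-zero gradient signal, so a
    curriculum can skip them and spend rollouts on informative prompts."""
    return [p for p in prompts if low <= est_pass_rate.get(p, 0.5) <= high]

est_pass_rate = {"2+2": 0.98, "AIME-style problem": 0.45, "open conjecture": 0.01}
print(select_informative_prompts(list(est_pass_rate), est_pass_rate))
# -> ['AIME-style problem']
```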
Super cool insights on 4D reconstruction from @QianqianWang5 at the ScanNet++ workshop @CVPR!
We are hiring Research Scientists for our Machine Learning and Optimization team at Google DeepMind Bangalore. If you're passionate about cutting-edge AI research and building efficient, elastic, customized, and safe LLMs, we'd love to hear from you. We are looking for…
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
SCA is the first self-improvement RL framework for general multi-turn tool-use agents. It works by first generating its own verifiers for its own synthetic tasks. Stay tuned for more details!
🚨Self-Challenging Language Model Agents🚨 📝: arxiv.org/abs/2506.01716 A new paradigm to train LLM agents to use different tools with challenging self-generated data ONLY: Self-challenging agents (SCA) both propose new tasks and solve them, using self-generated verifiers to…
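To make the propose-task/propose-verifier/solve loop concrete, here is a fully toy sketch; the arithmetic task and every function name are my illustration, not SCA's actual interface.

```python
import random

def propose_task_and_verifier(rng):
    """Toy stand-in for an agent proposing its own task + verifier.
    Here the 'task' is an addition problem and the verifier checks the sum;
    in SCA both are generated by the LLM itself (illustrative only)."""
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    task = f"What is {a} + {b}?"
    verifier = lambda answer: float(answer == a + b)  # reward in {0, 1}
    return task, verifier

def attempt(task, rng):
    """Toy stand-in for the agent's solution attempt (sometimes wrong)."""
    a, b = (int(x) for x in
            task.removeprefix("What is ").removesuffix("?").split(" + "))
    return a + b if rng.random() < 0.8 else a + b + 1

rng = random.Random(0)
for _ in range(3):
    task, verify = propose_task_and_verifier(rng)
    answer = attempt(task, rng)
    reward = verify(answer)  # self-generated supervision, no human labels
    print(task, "->", answer, "reward:", reward)
```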
This is really great work by Fahim and co. Moving out of the regime where we have ground-truth rewards is critical for the next level of RL scaling in LLMs.
RL with verifiable rewards has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground-truth answers? Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training! 🧵 1/n
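Below is a minimal hedged sketch of one such self-generated signal, majority-vote self-consistency, which the follow-up post ("learning from their own consistency") points to; the helper name and interface are mine, not the paper's.

```python
from collections import Counter

def self_consistency_rewards(answers):
    """Reward each sampled answer by agreement with the majority vote.

    Hypothetical sketch: `answers` are final answers parsed from N model
    samples for one prompt; no ground-truth label is used anywhere."""
    majority, _ = Counter(answers).most_common(1)[0]
    # Pseudo-reward: 1 if a sample matches the self-consistent answer.
    return [1.0 if a == majority else 0.0 for a in answers]

# Example: 8 samples for one math prompt, rewards come from consensus only.
rewards = self_consistency_rewards(["42", "42", "41", "42", "7", "42", "42", "41"])
print(rewards)  # majority is "42"; matching samples get reward 1.0
```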
"Can Large Reasoning Models Self-Train?" A brilliant paper from CMU showing LLMs can improve at math reasoning WITHOUT human labels - just learning from their own consistency. Early results rival models trained on ground-truth answers.
This is pretty remarkable – AI systems learning to self-improve. We're seeing a wave of research where AI isn't just learning from human feedback; it's starting to figure out how to improve itself using its own internal signals. A subtle but profound shift.
🇸🇬✈️Come check out Zochi's work at #ICLR2025 — and a big congrats for their first citation 😉🎉 We thank the workshop organizers for approving the work & inviting our reps to present on Zochi's behalf. Locations, times, & more details below 🧵👇
Announcing the first fully AI-generated scientific discovery to pass the highest level of peer review – the main track of an A* conference (ACL 2025). Several groups have shown AI-generated work at workshops, but main conference acceptance is a far higher bar. While workshops…
The 1st fully AI-generated scientific discovery to pass the highest level of peer review – the main track of an A* conference (ACL 2025). Zochi, the 1st PhD-level agent. Beta open.