Pranjal Aggarwal ✈️ ICML 2025
@PranjalAggarw16
PhD Student @LTIatCMU. research scientist intern @AIatMeta FAIR. Working on reasoning, computer-use agents and test-time compute. Prev @IITD
What if you could control how long a reasoning model “thinks”? Presenting L1-1.5B, an RL-trained reasoning model with: - controllable thinking length via a prompt - better performance per token than S1 - better short CoT performance than GPT-4o cmu-l3.github.io/l1 🧵
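The tweet describes controlling thinking length through the prompt and training with RL. A minimal sketch of that idea, assuming an LCPO-style setup: append a token budget to the prompt, then reward correctness minus a penalty on how far the generated reasoning deviates from that budget. The function names, prompt wording, and `alpha` value here are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch of length-controlled prompting plus a length-penalty
# reward: correctness indicator minus a linear penalty on the gap between
# requested and actual reasoning length. Names and alpha are illustrative.

def length_controlled_prompt(question: str, target_tokens: int) -> str:
    """Append a thinking-budget instruction to the problem statement."""
    return f"{question}\n\nThink for a maximum of {target_tokens} tokens."

def length_penalty_reward(is_correct: bool, used_tokens: int,
                          target_tokens: int, alpha: float = 0.0003) -> float:
    """Reward = 1 if correct else 0, minus alpha * |budget deviation|."""
    return float(is_correct) - alpha * abs(target_tokens - used_tokens)
```

During RL training, each rollout would be scored with this reward, so the policy learns both to answer correctly and to respect the requested budget.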



Will future SWE agents be computer-use agents? We explore this shift in Programming with Pixels: an agent environment where agents learn to use an IDE's existing functionality rather than relying on hand-designed tool APIs programmingwithpixels.com
What if AI agents did software engineering like humans—seeing the screen & using any developer tool? Introducing Programming with Pixels: an SWE environment where agents control VSCode via screen perception, typing & clicking to tackle diverse tasks. programmingwithpixels.com 🧵
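The agent loop described above (screen perception in, typing and clicking out) can be sketched as a simple observe-act cycle. The `env` and `model` interfaces below are illustrative assumptions for the sketch, not the actual Programming with Pixels API.

```python
# Minimal sketch of a screen-perception agent loop: take a screenshot of
# the IDE, ask a model for the next GUI action, execute it, repeat until
# the model says it is done. All interfaces here are hypothetical.

from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""

def run_agent(env, model, max_steps: int = 50) -> bool:
    """Observe-act loop: pixels in, keyboard/mouse actions out."""
    for _ in range(max_steps):
        screenshot = env.screenshot()           # raw pixels of the editor
        action = model.next_action(screenshot)  # model proposes a GUI action
        if action.kind == "done":
            return env.task_complete()          # did the agent solve the task?
        env.execute(action)                     # click or type inside the IDE
    return False
```

The key design point the tweet highlights: because the interface is just pixels and input events, the same loop works for any developer tool, with no hand-designed tool API per task.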
Can LLMs self-improve on code generation? Check out our work AlphaVerus, where the model generates provably correct code and self-improves without any weight updates! At #ICML2025 today: 📆: 11:00 AM - 1:30 PM 📷: Poster #East-2912 alphaverus.github.io w/ Bryan, @wellecks





I will be at #ICML2025 this week. Reach out if you want to chat about llm reasoning, computer-use agents, code gen or actually anything! (DMs are open) I will also be presenting AlphaVerus (self-improving verified code gen) this Thursday! alphaverus.github.io
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too, until we looked at the reported numbers for the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in a blog below🧵👇
AlphaVerus has been accepted at #ICML2025! alphaverus.github.io arxiv.org/abs/2412.06176 We've seen in math that good verification (e.g., Lean) unlocks surprising capabilities–why not for code too? AlphaVerus puts LLMs & Rust’s Verus verifier into a self-improving loop–lots…
We present AlphaVerus, which enables LLMs to generate provably correct Rust code via a new tree search and self-improvement loop Very excited about AlphaVerus as a starting point for truly trustworthy code generation. Amazing work by @PranjalAggarw16! alphaverus.github.io
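The tweets above describe a loop where the model proposes programs, a verifier (Rust's Verus) checks them, and verified results feed back in without weight updates. A hedged sketch of that generate-verify-refine structure, with all interfaces (`generate`, `verify`, `refine`) as illustrative assumptions rather than the actual AlphaVerus implementation:

```python
# Hypothetical sketch of a generate-verify-refine self-improvement loop:
# sample candidate programs, accept only those the verifier proves correct,
# attempt a repair on failures, and reuse verified examples as context.

def self_improving_codegen(spec: str, generate, verify, refine,
                           rounds: int = 3, samples: int = 4):
    """Return a verified program for `spec`, or None if none is found."""
    exemplars = []  # verified (spec, program) pairs reused as few-shot context
    for _ in range(rounds):
        for candidate in generate(spec, exemplars, n=samples):
            ok, error = verify(candidate)       # e.g., run the Verus verifier
            if ok:
                exemplars.append((spec, candidate))
                return candidate
            # tree-search-style repair: revise the failing candidate
            repaired = refine(candidate, error)
            if repaired is not None and verify(repaired)[0]:
                return repaired
    return None
```

The verifier supplies the ground-truth signal, so the loop improves outputs across rounds without ever updating model weights.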
Cool to see our L1 (arxiv.org/abs/2503.04697) methodology used here! And a nice insight about using the controllable reasoning budget to enable more efficient use of inference hardware
With INTELLECT-2 we aim for frontier reasoning performance with a controllable thinking budget. By incorporating length rewards into our training run, users can specify how long the model should reason for a given task. primeintellect.ai/blog/intellect…
The recent Claude 3.7 model from Anthropic lets you control the budget for thinking—how might this work? Check out L1, our fully open recipe for training reasoning models with controllable thinking budgets!