Aradhye Agarwal
@AradhyeAgarwal
Incoming Research Fellow at Microsoft Research, CSE @ IIT Delhi, scaling test-time compute for LLMs
For the past couple of months we've been working on test-time scaling, and we've made a significant discovery:
My two cents on RL in its current form.
I think simply performing RL with only the final answer as the reward signal is problematic, since it might result in the LLM learning to "game" the reward. We would like the model to produce "correct" reasoning traces in addition to the correct answer, but without checking intermediate steps,…
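To make the concern concrete, here is a minimal, hypothetical sketch (not from the thread or the paper) contrasting an outcome-only reward with a process-style reward that also scores intermediate steps; verify_step is an assumed step verifier:

from typing import Callable, List

def outcome_reward(final_answer: str, gold_answer: str) -> float:
    # Reward depends only on the final answer: any trace that lands on the
    # right string scores 1.0, which is what leaves room for reward gaming.
    return 1.0 if final_answer.strip() == gold_answer.strip() else 0.0

def process_reward(steps: List[str], final_answer: str, gold_answer: str,
                   verify_step: Callable[[str], bool]) -> float:
    # Also credit intermediate steps, so a correct answer reached through an
    # unsound trace earns less than one reached through valid reasoning.
    step_score = sum(verify_step(s) for s in steps) / max(len(steps), 1)
    return 0.5 * outcome_reward(final_answer, gold_answer) + 0.5 * step_score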
In Prof. Alex's defense, "short is correct" and "correct is short" mean the same thing, which we show analytically in our recent paper. Check out the attached snippet from our paper: arxiv.org/abs/2505.18149
Yup, my student is correcting my Bayesian fallacy here. We care about P(correct | short), not P(short | correct), because at inference time we obviously don't know what is correct. This is also why best-of-K is not something you can do at inference time, if you don't know what the…
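As a quick sketch of why the direction of conditioning matters (only the generic Bayes identity is used here, not any result from the paper):

\[
P(\text{correct} \mid \text{short}) \;=\; \frac{P(\text{short} \mid \text{correct})\, P(\text{correct})}{P(\text{short})}
\]

The two conditionals agree only when the marginal terms cooperate; at inference time "short" is observable while "correct" is not, which is why the left-hand side is the quantity one can actually act on.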