Nived Rajaraman
@Nived_Rajaraman
EECS PhD student at Berkeley. Former intern at DeepMind. Reinforcement learning. I organize the BLISS seminar https://bliss.eecs.berkeley.edu/Seminar/index.html
At the ES-FoMo workshop, talk to @rish2k1 about scaling test-time compute as a function of user-facing latency (instead of FLOPs)
[Sat Jul 19] @Nived_Rajaraman & @rish2k1 present work on improving accuracy-latency tradeoffs for test-time scaling. @gh_aminian presents work showing that a smoothed version of best-of-n improves reward-vs-KL tradeoffs when a low-quality proxy reward is used.
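A minimal sketch of what a smoothed best-of-n sampler could look like: instead of returning the argmax-reward candidate (hard best-of-n), pick among the n candidates with probability proportional to exp(reward / temperature). The function names `sample_fn` and `reward_fn`, and the exponential-weighting scheme, are illustrative assumptions, not the construction from the paper itself.

```python
import math
import random

def soft_best_of_n(prompt, sample_fn, reward_fn, n=8, temperature=1.0):
    """Smoothed best-of-n (illustrative sketch).

    Hard best-of-n returns the candidate with the highest reward, which can
    overfit a noisy proxy reward and drift far (in KL) from the base policy.
    Here we instead sample a candidate with probability proportional to
    exp(reward / temperature): temperature -> 0 recovers hard best-of-n,
    large temperature recovers plain sampling from the base policy.
    """
    candidates = [sample_fn(prompt) for _ in range(n)]
    rewards = [reward_fn(prompt, c) for c in candidates]
    m = max(rewards)  # subtract the max before exponentiating, for stability
    weights = [math.exp((r - m) / temperature) for r in rewards]
    total = sum(weights)
    pick = random.random() * total  # inverse-CDF sampling over the weights
    acc = 0.0
    for cand, w in zip(candidates, weights):
        acc += w
        if pick <= acc:
            return cand
    return candidates[-1]
```

As the temperature shrinks, the weights concentrate on the top-reward candidate, so the sampler interpolates between the base policy and hard best-of-n.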
The abstract submission deadline for FoPT has been extended to the 21st of May (11:59pm UTC). Submission website: openreview.net/group?id=learn…
Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025! 📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models! │ 🗓️ Deadline: May 19, 2025
A quick reminder that the deadline for abstract submissions to FoPT (May 19) is fast approaching! Submit your best work!
@nived_rajaraman will give an oral talk at the VerifAI workshop on why RL or verification is needed to effectively scale test-time compute! Lots of interesting insights in this paper! At VerifAI workshop, 3:45pm, April 27 arxiv.org/abs/2502.12118 x.com/setlur_amrith/…
🚨 RL or distillation/SFT: what to use to train next reasoning model? Which 📈 perf faster as we scale test compute? We answer these in a principled way so you don't have to burn GPUs🔥. 🎯 Ans: RL w/ rewards or verification >> SFT/distillation 😱 arxiv.org/pdf/2502.12118 🧵⤵️
Our first Rising Stars session featured fantastic talks by @TianlongChen4 @CongyueD @Nived_Rajaraman @zyh2022 Grigorios Chrysos
Thinking for longer (e.g. o1) is only one of many axes of test-time compute. In a new @Google_AI paper, we instead focus on scaling the search axis. By just randomly sampling 200x & self-verifying, Gemini 1.5 ➡️ o1 performance. The secret: self-verification is easier at scale!
Scaling Test-Time Compute Without Verification or RL is Suboptimal "In this paper, we prove that finetuning LLMs with verifier-based (VB) methods based on RL or search is far superior to verifier-free (VF) approaches based on distilling or cloning search traces, given a fixed…
I'll be at #NeurIPS2023, and on the academic job market this year! RTs will be greatly appreciated! I work on statistics and information theory, with applications in robust statistics, offline RL, game theory, human-AI interactions and LLMs. I've recently been working on better…
This is sickening. I hope the survivor is ok. How is he still listed as teaching this year @stanford?
Scoop: Stanford biology professor Hunter Fraser was arrested and charged with domestic violence. Fraser allegedly threw a woman on the ground and later slammed a door into her, court records show. Fraser pleaded not guilty. @StanfordDaily stanforddaily.com/2022/10/04/sta…