Neel Rajani @ICML'25
@NeelRajani_
PhD student in responsible NLP @InfAtEd. Passionate about Mechanistic Interpretability and LLM training dynamics
🚨New paper alert!🚨 "Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them" @ActInterp ICML'25. @deepseek_ai popularised RLVR and distillation for 'reasoning training', but how do the two differ under the hood? Details in 🧵: (1/8)
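Quick primer for anyone new to the two recipes (not from the paper, just a rough sketch that omits GRPO's clipping/KL terms; `logprobs`, `rewards`, and the tensor shapes below are placeholder assumptions): GRPO weights each sampled completion by its group-relative advantage, while SFT/distillation just does cross-entropy on a fixed target.

```python
import torch
import torch.nn.functional as F

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """Group-relative advantages: normalise each sampled completion's
    reward against the mean/std of its own group of samples."""
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

def grpo_loss(logprobs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """Policy-gradient-style loss weighted by group-relative advantages.
    `logprobs` are the summed token log-probs of each sampled completion."""
    adv = grpo_advantages(rewards)
    return -(adv.detach() * logprobs).mean()

def sft_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
    """Plain cross-entropy on a fixed (e.g. distilled) target sequence."""
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))
```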
Super timely work by @aryopg on failure cases of reasoning models. Make sure to check it out!
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
Catch me at my poster @ActInterp in East Ballroom A if you'd like some free chocolate :)

This is a super nice read by @ZeroyuHuang!
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking demonstrations, but may generalize poorly. RFT achieves better overall performance but is limited by the initial policy. Our method, Prefix-RFT, gets the best of both worlds!
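How I picture the blend (just my own minimal sketch of a prefix-conditioned rollout, not the paper's actual objective; `policy.generate` and `reward_fn` are hypothetical placeholders, so check the paper for the real training loss):

```python
import random

def prefix_rft_rollout(demo_tokens, policy, reward_fn, max_frac=0.8):
    """Keep a random prefix of the expert demonstration (SFT-like anchor),
    let the policy finish it (RFT-like exploration), and score the full
    trace with the verifiable reward."""
    cut = random.randint(0, int(len(demo_tokens) * max_frac))
    prefix = demo_tokens[:cut]               # demonstration prefix
    completion = policy.generate(prefix)     # policy-generated continuation
    reward = reward_fn(prefix + completion)  # reward on the combined trace
    return prefix, completion, reward
```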
Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with @rolandalong, @paul_smolensky and @JianfengGao0217 shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓
Finally made it to @icmlconf in gorgeous Vancouver! Presenting work at @ActInterp on Saturday (more on that soon 👀). If you're into interpretability/RL/AI Safety, I'd love to chat :)

Are you compositionally curious? 🤓 Want to know how to learn embeddings using 🌲? In our new #ICML2025 paper, we present Banyan: a recursive net that you can train super efficiently for any language or domain, and get embeddings competitive with much larger LLMs. 1/🧵
Blessed are those who do rigorous evals, for theirs is the kingdom of heaven.