Souradip Chakraborty
@SOURADIPCHAKR18
Student Researcher @Google || PhD @umdcs @ml_umd, working on #LLM #Alignment #RLHF #Reasoning Prev : #JPMC #Walmart Labs, MS #IndianStatisticalInstitute
🚀 Exciting Research Alert! Traditional #AIAlignment #RLHF methods are expensive & require updating billions of parameters. 🔥 Is it possible to do #LLMAlignment without finetuning model parameters? ✅ YES! Transfer Q*: Principled Decoding Alignment
🌟 Can you imagine aligning your AI model 🤖 on the fly, without updating its core parameters, so it stays suitable for others with different preferences? 🚀 Introducing "Transfer Q Star: Principled Decoding for #LLM #Alignment" 🔗: arxiv.org/abs/2405.20495 A 🧵👇
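The core idea, steering generation at decode time with a reward signal instead of updating weights, can be sketched in a few lines. This is a simplified, hypothetical illustration (a single token re-ranking step with made-up logits and Q-values), not the paper's actual Transfer Q* algorithm; see the arXiv link above for the principled construction of the Q-function.

```python
import numpy as np

def aligned_decode_step(base_logits, q_values, alpha=1.0):
    """One decoding step of reward-guided (Q-weighted) decoding:
    re-rank candidate next tokens by base logit + alpha * Q estimate.
    alpha = 0 recovers the unaligned base model; larger alpha trades
    fluency for alignment. (Illustrative sketch only.)"""
    scores = base_logits + alpha * q_values
    # softmax over the adjusted scores gives the aligned decoding distribution
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

# Toy vocabulary of 4 tokens (all numbers hypothetical):
base_logits = np.array([2.0, 1.5, 0.5, 0.0])  # base LM prefers token 0
q_values    = np.array([0.0, 0.0, 3.0, 0.0])  # reward signal prefers token 2

tok_unaligned, _ = aligned_decode_step(base_logits, q_values, alpha=0.0)
tok_aligned, _   = aligned_decode_step(base_logits, q_values, alpha=1.0)
```

With `alpha=0.0` the base model's top token (0) is chosen; with `alpha=1.0` the reward-preferred token (2) wins, without any parameter update to the base model.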
@andrewgwils @micahgoldblum @amritsinghbedi3 @furongh @PhilBeaudoin @SharonYixuanLi @DBahdanau @roydanroy @zdeborova @murefil @pcastr @DavidDuvenaud @RogerGrosse @percyliang @DianboLiu @ClementDelangue @ZhijingJin @james_y_zou @DavidSKrueger @iScienceLuvr
I think this is a human bias more broadly. Many people look for reasons to reject instead of accept. The latter would actually lead to better and certainly more exciting papers appearing at the conference.
I love this idea. I think a well prompted LLM would actually generate a better initial review than the reviews I am seeing. And by well prompted, I would mean an LLM that is asked to write a positive review. If a reviewer cannot improve on this, the LLM suggests accept, take it.
If so, this is no reason to reject the paper (unless you have a list of a hundred such things). @RealAAAI I wonder whether you want some volunteers to help you tune the LLM reviews. I would volunteer for that.
I hope the LLM will be tuned to write a charitable review. What I find very damning about the current reviewing culture is that reviewers think their job is to reject as many papers as possible for the smallest possible things. No one is asking, can this be fixed easily? 1/x
As @roydanroy & others noted, review quality has degraded, with some reviews potentially AI-generated. Isn't it a viable next step to have a well-trained LLM reviewer perform a first round of review to reduce the load, saving human effort for quality reviews in the second round? #NeurIPS2025 #icml2025 #ACL2025 #iclr2025
With all due respect, ... @jeffclune @WenhuChen @AnimaAnandkumar @canondetortugas @OwainEvans_UK @patrickshafto @StefanoErmon @DavidDuvenaud @sanmikoyejo @RogerGrosse @percyliang @chelseabfinn @aviral_kumar2 ... I could go on and on
We had already shown this finding in our earlier work, Does Thinking More Always Help?: arxiv.org/abs/2506.04210 Link: x.com/SOURADIPCHAKR1… @MengdiWang10 @amritsinghbedi3 @furongh @ghosal_suvra @dmanocha
🔥 Does test-time scaling in #reasoningmodels via thinking more always help? 🚫 Answer is No - performance increases first and then drops due to #Overthinking ❓Why does this behaviour occur, and how can it be mitigated? 🚀 Check our recent findings #LLMReasoning Link: arxiv.org/pdf/2506.04210
Great minds think alike! 👀🧠 We also found that more thinking ≠ better reasoning. In our recent paper (arxiv.org/abs/2506.04210), we show how output variance creates the illusion of improvement—when in fact, it can hurt precision. Naïve test-time scaling needs a rethink. 👇…
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
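The practical takeaway from these results, that the reasoning-token budget should be tuned on validation data rather than simply maximized, can be sketched as follows. The accuracy numbers below are made up purely to illustrate the rise-then-drop pattern described above.

```python
def best_thinking_budget(acc_by_budget):
    """Pick the reasoning-token budget with the highest validation
    accuracy, instead of defaulting to the largest budget
    (i.e., instead of naively scaling test-time compute)."""
    return max(acc_by_budget, key=acc_by_budget.get)

# Hypothetical validation accuracies showing the rise-then-drop pattern:
acc = {256: 0.61, 512: 0.68, 1024: 0.72, 2048: 0.66, 4096: 0.58}

best = best_thinking_budget(acc)
# Accuracy peaks at a moderate budget (1024 here), not the largest one.
```

The point of the sweep is that the optimum sits strictly inside the budget range: spending 4x more thinking tokens than the peak budget would lose accuracy in this (hypothetical) example.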
Illusion of test-time scaling?
Thanks @furongh, indeed, we showed similar results and 🚫 raised the concern that More Thinking != Improved Reasoning !!! x.com/SOURADIPCHAKR1…
Recent paper by #Anthropic @aryopg @PMinervini @yanda_chen_ @EthanJPerez, Inverse Scaling in Test-Time Compute: arxiv.org/abs/2507.14417, validates the findings of our work published last month: Does test-time scaling always help? x.com/SOURADIPCHAKR1…
🚨 Can AI design harmful viruses or toxic molecules? 🎉 Excited to announce that our Workshop on Biosecurity Safeguards for Gen AI got accepted at #NeurIPS2025 Link : biosafe-gen-ai.github.io Kudos to the amazing team @MengdiWang10 @lecong Alvaro @ZaixiZhang @amritsinghbedi3 Ruofan

🔥 This is a fantastic moment: #LLMs are solving IMO-level problems, which highlights the roles of #exploration, test-time scaling, and going beyond verifiable rewards. 🔊 We have been working on these ideas: scholar.google.co.in/citations?hl=e… 🥍 Detailed thread coming soon.
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Wonderful Panel discussion on Exploration at the EXAIT Workshop #ICML2025 with @aviral_kumar2 @jeffclune @Masa_Uehara_1 @jiwoncpark @kgjamieson Very interesting talks !!!!

Our new work on how to Learn-to-Explore using Transformer Models will be presented today at the EXAIT workshop at #ICML2025! This is joint work with Alessio Russo @rssalessio and Ryan Welch.
Wondering how to do online pure exploration using transformer models? Check out our EXAIT Workshop paper at #ICML2025, presented today by my brilliant co-author Ryan Welch. Full paper: arxiv.org/abs/2506.01876 Joint work with @aldopacchiano (BU/MIT/Broad Institute).
🗣 Speakers & Panelists - Excited to host the top experts in #AI, #biosecurity, #policy, and #ethics. An exciting lineup! @Yoshua_Bengio @PeterHndrsn @jmuiuc @Rbaltman @Tkaraletsos @MeganBlewett Sheng Lin-Gibson @NIST Stephanie Guerra @geochurch @lecong
🚨 Can AI design harmful viruses or toxic molecules? 🚨 🔥 At #NeurIPS2025, we’re launching a new workshop: 🔬 Biosecurity Safeguards for Generative AI Link : biosafe-gen-ai.github.io 🙏 Grateful to our amazing co-organizers and expert advisors! #BioSafeGenAI #AI4Science