Souradip Chakraborty
@SOURADIPCHAKR18
Student Researcher @Google || PhD @umdcs @ml_umd, working on #LLM #Alignment #RLHF #Reasoning Prev : #JPMC #Walmart Labs, MS #IndianStatisticalInstitute
🚀 Exciting Research Alert! Traditional #AIAlignment #RLHF methods are expensive & require updating billions of parameters. 🔥 Is it possible to do #LLMAlignment without finetuning model parameters? ✅ YES! Transfer Q*: Principled Decoding Alignment
🌟 Can you imagine aligning your AI model 🤖 on the fly, without updating its core parameters, so it stays suitable for others with different preferences? 🚀 Introducing "Transfer Q Star: Principled Decoding for #LLM #Alignment" 🔗: arxiv.org/abs/2405.20495 A 🧵👇
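The core idea, steering generation at decode time with a reward signal instead of updating weights, can be sketched in a few lines. This is a simplified, hypothetical illustration (a single token re-ranking step with made-up logits and Q-values), not the paper's actual Transfer Q* algorithm; see the arXiv link above for the principled construction of the Q-function.

```python
import numpy as np

def aligned_decode_step(base_logits, q_values, alpha=1.0):
    """One decoding step of reward-guided (Q-weighted) decoding:
    re-rank candidate next tokens by base logit + alpha * Q estimate.
    alpha = 0 recovers the unaligned base model; larger alpha trades
    fluency for alignment. (Illustrative sketch only.)"""
    scores = base_logits + alpha * q_values
    # softmax over the adjusted scores gives the aligned decoding distribution
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

# Toy vocabulary of 4 tokens (all numbers hypothetical):
base_logits = np.array([2.0, 1.5, 0.5, 0.0])  # base LM prefers token 0
q_values    = np.array([0.0, 0.0, 3.0, 0.0])  # reward signal prefers token 2

tok_unaligned, _ = aligned_decode_step(base_logits, q_values, alpha=0.0)
tok_aligned, _   = aligned_decode_step(base_logits, q_values, alpha=1.0)
```

With `alpha=0.0` the base model's top token (0) is chosen; with `alpha=1.0` the reward-preferred token (2) wins, without any parameter update to the base model.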
@andrewgwils @micahgoldblum @amritsinghbedi3 @furongh @PhilBeaudoin @SharonYixuanLi @DBahdanau @roydanroy @zdeborova @murefil @pcastr @DavidDuvenaud @RogerGrosse @percyliang @DianboLiu @ClementDelangue @ZhijingJin @james_y_zou @DavidSKrueger @iScienceLuvr
I think this is a human bias more broadly. Many people look for reasons to reject instead of accept. The latter would actually lead to better and certainly more exciting papers appearing at the conference.
I love this idea. I think a well prompted LLM would actually generate a better initial review than the reviews I am seeing. And by well prompted, I would mean an LLM that is asked to write a positive review. If a reviewer cannot improve on this, the LLM suggests accept, take it.
If so, this is no reason to reject the paper (unless you have a list of a hundred such things). @RealAAAI I wonder whether you want some volunteers to help you tune the LLM reviews. I would volunteer for that.
I hope the LLM will be tuned to write a charitable review. What I find very damning about the current reviewing culture is that reviewers think their job is to reject as many papers as possible for the smallest possible things. No one is asking, can this be fixed easily? 1/x
As @roydanroy & others noted, review quality has degraded, with some reviews potentially AI-generated. Isn't it a viable next step to have a well-trained LLM reviewer perform a first round of review to reduce the load, saving human effort for quality reviews in the second round? #NeurIPS2025 #icml2025 #ACL2025 #iclr2025
With all due respect, ... @jeffclune @WenhuChen @AnimaAnandkumar @canondetortugas @OwainEvans_UK @patrickshafto @StefanoErmon @DavidDuvenaud @sanmikoyejo @RogerGrosse @percyliang @chelseabfinn @aviral_kumar2 ... I could go on and on
We had already shown this finding in our earlier work, Does Thinking More Always Help?: arxiv.org/abs/2506.04210 Link: x.com/SOURADIPCHAKR1… @MengdiWang10 @amritsinghbedi3 @furongh @ghosal_suvra @dmanocha
🔥 Does test-time scaling in #reasoningmodels via thinking more always help? 🚫 Answer is No - performance increases first and then drops due to #Overthinking ❓Why does this behaviour occur, and how can it be mitigated? 🚀 Check our recent findings #LLMReasoning Link: arxiv.org/pdf/2506.04210
Great minds think alike! 👀🧠 We also found that more thinking ≠ better reasoning. In our recent paper (arxiv.org/abs/2506.04210), we show how output variance creates the illusion of improvement—when in fact, it can hurt precision. Naïve test-time scaling needs a rethink. 👇…
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
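The practical takeaway from these results, that the reasoning-token budget should be tuned on validation data rather than simply maximized, can be sketched as follows. The accuracy numbers below are made up purely to illustrate the rise-then-drop pattern described above.

```python
def best_thinking_budget(acc_by_budget):
    """Pick the reasoning-token budget with the highest validation
    accuracy, instead of defaulting to the largest budget
    (i.e., instead of naively scaling test-time compute)."""
    return max(acc_by_budget, key=acc_by_budget.get)

# Hypothetical validation accuracies showing the rise-then-drop pattern:
acc = {256: 0.61, 512: 0.68, 1024: 0.72, 2048: 0.66, 4096: 0.58}

best = best_thinking_budget(acc)
# Accuracy peaks at a moderate budget (1024 here), not the largest one.
```

The point of the sweep is that the optimum sits strictly inside the budget range: spending 4x more thinking tokens than the peak budget would lose accuracy in this (hypothetical) example.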
Illusion of test-time scaling?
Thanks @furongh, indeed, we showed similar results and 🚫 raised the concern that More Thinking != Improved Reasoning !!! x.com/SOURADIPCHAKR1…
Recent paper by #Anthropic @aryopg @PMinervini @yanda_chen_ @EthanJPerez, Inverse Scaling in Test-Time Compute: arxiv.org/abs/2507.14417, validates the findings of our work published last month: Does test-time scaling always help? x.com/SOURADIPCHAKR1…
🚨 Can AI design harmful viruses or toxic molecules? 🎉 Excited to announce that our Workshop on Biosecurity Safeguards for Gen AI got accepted at #NeurIPS2025 Link : biosafe-gen-ai.github.io Kudos to the amazing team @MengdiWang10 @lecong Alvaro @ZaixiZhang @amritsinghbedi3 Ruofan

🔥 This is a fantastic moment: #LLMs are solving IMO-level problems, which highlights the roles of #exploration, test-time scaling, and going beyond verifiable rewards. 🔊 We have been working on these ideas: scholar.google.co.in/citations?hl=e… 🥍 Detailed thread coming soon.
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Wonderful Panel discussion on Exploration at the EXAIT Workshop #ICML2025 with @aviral_kumar2 @jeffclune @Masa_Uehara_1 @jiwoncpark @kgjamieson Very interesting talks !!!!

Our new work on how to Learn-to-Explore using Transformer Models will be presented today at the EXAIT workshop at #ICML2025! This is joint work with Alessio Russo @rssalessio and Ryan Welch.
Wondering how to do online pure exploration using transformer models? Check out our EXAIT Workshop paper at #ICML2025, presented today by my brilliant co-author Ryan Welch. Full paper: arxiv.org/abs/2506.01876 Joint work with @aldopacchiano (BU/MIT/Broad Institute).
🗣 Speakers & Panelists - Excited to host the top experts in #AI, #biosecurity, #policy, and #ethics. An exciting lineup! @Yoshua_Bengio @PeterHndrsn @jmuiuc @Rbaltman @Tkaraletsos @MeganBlewett Sheng Lin-Gibson @NIST Stephanie Guerra @geochurch @lecong
🚨 Can AI design harmful viruses or toxic molecules? 🚨 🔥 At #NeurIPS2025, we’re launching a new workshop: 🔬 Biosecurity Safeguards for Generative AI Link : biosafe-gen-ai.github.io 🙏 Grateful to our amazing co-organizers and expert advisors! #BioSafeGenAI #AI4Science