Gaurav Ghosal
@gaurav_ghosal
Ph.D. Student @mldcmu | Former Undergraduate Student @berkeley_eecs and Researcher @berkeley_ai |
1/ So much of privacy research is spent designing post-hoc methods to make models memorization-free. It's time we turned that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture at #ICML2025 to isolate memorization during LLM training 🧵
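The thread pitches an architectural fix rather than a post-hoc one: route sequence-specific memorization into dedicated "sink" units that can be switched off at inference. Below is a minimal, hypothetical sketch of one way such gating could look; the module name, sizes, and group-hashing scheme are illustrative assumptions and may differ from the paper's actual design.

```python
# Heavily hedged sketch: one way dedicated "sink" units could isolate memorization.
# During training, a small slice of hidden units is gated on only for a given
# sequence ID, so sequence-specific memorization concentrates there while the
# shared units stay general; at inference the sink slice is zeroed out.
# Names, sizes, and the grouping scheme are assumptions, not the paper's design.
import torch
import torch.nn as nn

class MLPWithSinks(nn.Module):
    """Feed-forward block whose last `n_sink` hidden units act as per-sequence sinks."""
    def __init__(self, d_model=256, d_ff=1024, n_sink=128, n_groups=16):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff + n_sink)
        self.down = nn.Linear(d_ff + n_sink, d_model)
        self.act = nn.GELU()
        self.d_ff, self.n_sink, self.n_groups = d_ff, n_sink, n_groups

    def forward(self, x, seq_id=None):
        h = self.act(self.up(x))              # (..., d_ff + n_sink)
        mask = torch.ones_like(h)
        if seq_id is None:
            # Inference: drop every sink unit so memorized content is suppressed.
            mask[..., self.d_ff:] = 0.0
        else:
            # Training: enable only the sink group assigned to this sequence.
            group = seq_id % self.n_groups
            per_group = self.n_sink // self.n_groups
            keep = slice(self.d_ff + group * per_group,
                         self.d_ff + (group + 1) * per_group)
            mask[..., self.d_ff:] = 0.0
            mask[..., keep] = 1.0
        return self.down(h * mask)

# Toy usage: same input, with and without the sink units active.
layer = MLPWithSinks()
x = torch.randn(2, 8, 256)
out_train = layer(x, seq_id=7)   # sequence-specific sink group enabled
out_infer = layer(x)             # sinks zeroed out at inference
```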

@abitha___ will be presenting our work on training language models to predict further into the future than just the next token, and the benefits this objective brings. x.com/gm8xx8/status/…
Looking beyond the next token
TRELAWNEY inserts future tokens <T>...</T> during training to teach models to plan ahead, boosting reasoning, coherence, and control.
Highlights:
- NO ARCHITECTURE CHANGES. JUST SMARTER DATA.
- works with standard decoding
- enables controllable…
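As the quoted thread describes, the mechanism is purely a data transformation: copy a span from later in the sequence to an earlier position, wrap it in <T>...</T>, and train with the usual next-token objective. A minimal sketch of that augmentation is below; the tag strings come from the tweet, but the function name, span length, and sampling choices are illustrative assumptions rather than the paper's recipe.

```python
# Illustrative sketch (not the authors' code): build TRELAWNEY-style training
# examples by splicing a later span, wrapped in <T>...</T>, into an earlier
# position so the model learns to announce upcoming content before generating
# the tokens that lead up to it.
import random

T_OPEN, T_CLOSE = "<T>", "</T>"

def insert_future_span(tokens, span_len=8, seed=None):
    """Return a copy of `tokens` with a later span repeated early, wrapped in <T> tags.

    `tokens` is a list of string tokens; `span_len` and the insertion point are
    assumptions made for this sketch, not values from the paper.
    """
    rng = random.Random(seed)
    if len(tokens) < 2 * span_len:
        return list(tokens)  # too short to augment; leave unchanged
    # Pick an insertion point, then a "future" span that starts at or after it.
    insert_at = rng.randrange(1, len(tokens) - span_len)
    future_start = rng.randrange(insert_at, len(tokens) - span_len + 1)
    future_span = tokens[future_start:future_start + span_len]
    # Splice the tagged future span into the sequence; ordinary next-token
    # training on the result teaches the model to plan for that span.
    return tokens[:insert_at] + [T_OPEN] + future_span + [T_CLOSE] + tokens[insert_at:]

# Example usage on a toy whitespace-tokenized sentence.
toy = "the proof proceeds by induction on n and concludes with the base case".split()
print(" ".join(insert_future_span(toy, span_len=4, seed=0)))
```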
In "Mind Your Step (by Step): Chain‑of‑Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse", we connect human "overthinking" insights to LLM reasoning, offering a new lens on when thinking‑out‑loud backfires. 📄 Read the full paper: arxiv.org/abs/2410.21333…
One of the better posters I saw today at #icml25. This gets at the root of the problems we were thinking about when we conceived and wrote the CoT paper.
RL with verifiable reward has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground truth answers? Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training! 🧵 1/n
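One natural way to obtain a reward without ground-truth answers is self-consistency: sample several completions, treat the majority final answer as a pseudo-label, and reward agreement with it. The sketch below illustrates that general idea; the exact reward used by SRT may differ, so treat this as an assumption-laden illustration rather than the paper's method.

```python
# Hypothetical sketch of a self-generated reward in the spirit of the tweet:
# with no ground-truth answer available, sample several completions, take the
# majority final answer as a pseudo-label, and reward samples that agree with it.
from collections import Counter

def self_reward(final_answers):
    """Map a list of sampled final answers to per-sample rewards in {0.0, 1.0}.

    The majority answer acts as the pseudo-label; ties are broken arbitrarily
    by Counter.most_common, which is an assumption of this sketch.
    """
    majority, _ = Counter(final_answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in final_answers]

# Example: 4 sampled answers to the same question; the resulting rewards could
# be fed to any policy-gradient-style update in place of a verifier signal.
print(self_reward(["42", "42", "41", "42"]))  # -> [1.0, 1.0, 0.0, 1.0]
```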
Training with more data = better LLMs, right? 🚨 False! Scaling language models by adding more pre-training data can decrease your performance after post-training! Introducing "catastrophic overtraining." 🥁🧵+arXiv 👇 1/9
New work on training reasoning models to be more efficient! Would love to hear your thoughts on this.
Excited to share my first work at CMU, led by my fantastic student Daman @amuseddaman! 🚀 We post-train reasoning models with reinforcement learning to reduce token usage while largely preserving accuracy.
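The tweet does not spell out the reward, but a common way to trade accuracy against token usage in RL post-training is to combine a correctness term with a length penalty. A hedged sketch of one such reward is below; the coefficient, normalization, and the choice to penalize length only on correct answers are assumptions, not the paper's design.

```python
# Illustrative reward for RL post-training aimed at shorter reasoning traces:
# reward correctness, and subtract a small penalty proportional to response
# length. The coefficient and normalization are assumptions for this sketch.
def efficiency_reward(is_correct, num_tokens, max_tokens=4096, length_coef=0.2):
    """Return a scalar reward that favors correct *and* short responses."""
    correctness = 1.0 if is_correct else 0.0
    length_penalty = length_coef * min(num_tokens / max_tokens, 1.0)
    # Only penalize length on correct answers so the policy does not learn to
    # answer quickly-but-wrong; this is a design choice of the sketch, not a
    # claim about the paper.
    return correctness - (length_penalty if is_correct else 0.0)

print(efficiency_reward(True, 512))   # correct and short -> high reward
print(efficiency_reward(True, 4096))  # correct but long  -> reduced reward
print(efficiency_reward(False, 128))  # wrong             -> 0.0
```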