Charlie Snell
@sea_snell
PhD student @berkeley_ai; research @cursor_ai; prev @GoogleDeepMind. My friend told me to tweet more. I stare at my computer a lot and make things
Can we predict emergent capabilities in GPT-N+1🌌 using only GPT-N model checkpoints, which have random performance on the task? We propose a method for doing exactly this in our paper “Predicting Emergent Capabilities by Finetuning”🧵

🚨Beyond 80/20 in LLM reasoning🚨
Dropping 80% low-entropy tokens in RL greatly boosts performance
🔗arxiv.org/abs/2506.01939
🏆Zero-RL SoTA: 63.5/68.1 (AIME24), 56.7 (AIME25)
🚀Insights:
1. RL retains base model entropy patterns
2. High-entropy tokens drive all RL improvement ⬇️
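The token-selection idea reads as a simple mask on the policy-gradient loss. Below is a minimal PyTorch sketch, assuming per-token logits, sampled actions, and advantages are already computed; the function name, the batch-level 20% quantile threshold, and the REINFORCE-style objective are illustrative assumptions, not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def high_entropy_policy_gradient_loss(logits, actions, advantages, keep_fraction=0.2):
    """Policy-gradient loss restricted to the top-`keep_fraction` highest-entropy
    tokens, i.e. dropping the ~80% of low-entropy tokens from the update.
    Shapes: logits [B, T, V], actions [B, T], advantages [B, T]."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()

    # Per-token entropy of the policy distribution.
    entropy = -(probs * log_probs).sum(dim=-1)                     # [B, T]

    # Keep only the highest-entropy tokens (drop the low-entropy ~80%).
    threshold = torch.quantile(entropy.flatten(), 1.0 - keep_fraction)
    mask = (entropy >= threshold).float()                          # [B, T]

    # Standard REINFORCE-style objective, computed only on the masked tokens.
    action_log_probs = log_probs.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    loss = -(advantages * action_log_probs * mask).sum() / mask.sum().clamp(min=1.0)
    return loss
```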
Proud to introduce Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant RL algorithm that powers the large-scale RL training of the latest Qwen3 models (Instruct, Coder, Thinking) 🚀 📄 huggingface.co/papers/2507.18…
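For context, the headline change in GSPO is that the importance ratio and clipping are defined at the level of the whole sequence rather than per token. A minimal, hedged sketch, assuming summed per-token log-probs under the current and old policies and group-relative advantages; the length normalization, clip range, and tensor layout are illustrative rather than the exact Qwen3 training recipe:

```python
import torch

def gspo_loss(logp_new, logp_old, rewards, clip_eps=0.2):
    """Sketch of a sequence-level clipped objective in the spirit of GSPO.
    logp_new, logp_old: per-token log-probs [G, T] for a group of G responses
    to the same prompt; rewards: [G] scalar rewards. Padding masks and exact
    hyperparameters are omitted for brevity."""
    T = logp_new.shape[1]

    # Group-relative advantage: reward standardized within the group.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-6)               # [G]

    # Sequence-level, length-normalized importance ratio:
    # s = (pi_new(y|x) / pi_old(y|x)) ** (1/|y|)
    seq_ratio = torch.exp((logp_new.sum(dim=1) - logp_old.sum(dim=1)) / T)  # [G]

    # PPO-style clipping applied to the whole sequence rather than per token.
    unclipped = seq_ratio * adv
    clipped = torch.clamp(seq_ratio, 1 - clip_eps, 1 + clip_eps) * adv
    return -torch.minimum(unclipped, clipped).mean()
```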
pretraining is an elegant science, done by mathematicians who sit in cold rooms writing optimization theory on blackboards, engineers with total command of distributed systems of titanic scale. posttraining is hair-raising cowboy research where people drinking a lot of diet coke…
In the past month, Cursor found 1M+ bugs in human-written PRs. Over half were real logic issues that were fixed before merging. Today, we're releasing the system that spotted these bugs. It's already become a required pre-merge check for many teams.
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
YOLOing the ice cream at Salt and Straw (no taste test) was a mistake
incredible that this was written before the advent of vibe coding, just lmao
Introducing e3 🔥 Best <2B model on math 💪
Are LLMs implementing algos ⚒️ OR is thinking an illusion 🎩?
Is RL only sharpening the base LLM distrib. 🤔 OR discovering novel strategies outside the base LLM 💡?
We answer these ⤵️
🚨 arxiv.org/abs/2506.09026 🚨
matthewyryang.github.io/e3/
RL is very inference heavy and shifts infrastructure build-outs heavily
Scaling well-engineered environments is difficult
Reward hacking and non-verifiable rewards are key areas of research
Recursive self-improvement already playing out
Major shift in o4 and o5 RL training
Scaling Reinforcement Learning: Environments, Reward Hacking, Agents, Scaling Data
Infrastructure Bottlenecks and Changes
Distillation Data is a Moat
Recursive Self Improvement
o4 and o5 RL Training
China Accelerator Production
semianalysis.com/2025/06/08/sca…
Cursor 1.0 is out now! Cursor can now review your code, remember its mistakes, and work on dozens of tasks in the background.
Really enjoyed this conversation! A good look into how we're training frontier models at Cursor
A conversation on the optimal reward for coding agents, infinite context models, and real-time RL