Ankesh Anand

@ankesh_anand

Research scientist @googledeepmind (Gemini Thinking & Post-Training), prev phd @milamontreal. RL for Gemini 2.5 and Project Mariner. Opinions are my own.

London, England

Joined December 2011

625 Following

5K Followers

Pinned

Ankesh Anand@ankesh_anand · Mar 25

2.5 Pro is our new frontier model: fresh big model smell with extremely strong reasoning / thinking capabilities. We report single attempt / pass@1 scores for clean comparisons.

115

9.0K

Ankesh Anand Retweeted

Kimi.ai@Kimi_Moonshot · 9 h

Kimi K2 tech report just dropped! Quick hits:
- MuonClip optimizer: stable + token-efficient pretraining at trillion-parameter scale
- 20K+ tools, real & simulated: unlocking scalable agentic data
- Joint RL with verifiable + self-critique rubric rewards: alignment that adapts
-…

175

1.0K

276

46.0K

Ankesh Anand@ankesh_anand · Jun 5

Here we go! A new 2.5 Pro with all around capability improvements compared to previous versions.

- Much better at code editing now, sota on Aider (82.2), try out this model on cursor!
- #1 on webdev-arena (surpassing opus 4).
- supports budgets now (128 to 32k)
- much better at…

116

4.0K

Ankesh Anand@ankesh_anand · Apr 2

📈📈📈

Mislav Balunović@mbalunovic · Apr 2

Big update to our MathArena USAMO evaluation: Gemini 2.5 Pro, which was released *the same day* as our benchmark, is the first model to achieve a non-trivial number of points (24.4%). The speed of progress is really mind-blowing.

352

83.0K

Ankesh Anand@ankesh_anand · Mar 25

shoutout to the believers!

2.0K

210

200.0K

Ankesh Anand@ankesh_anand · Jan 29

The whole surprise over $5.5M was because everyone is anchored to Llama3’s compute efficiency. Wenfeng himself said it’s about two generations behind frontier lab numbers. Sonnet costs “tens of millions” of dollars, I hope we release the 2.0 Flash / Flash Thinking numbers as…

Dario Amodei@DarioAmodei · Jan 29

My thoughts on China, export controls and two possible futures darioamodei.com/on-deepseek-an…

8.0K