Paul Gauthier
@paulgauthier
Entrepreneur, investor, advisor
Aider v0.85.0 is out.
- Support for Responses API models like o3-pro and o1-pro.
- New Gemini 2.5 Pro models.
- Updated costs for o3.
- Repo-map & linting support for Clojure and MATLAB.
- Aider wrote 21% of the code in this release.
Full release notes: aider.chat/HISTORY.html
Kimi K2 scored 59% on the aider polyglot coding benchmark. Full leaderboard: aider.chat/docs/leaderboa…

Grok 4 scored 80% on the aider polyglot coding benchmark, with high reasoning effort. This puts Grok in 4th place on the leaderboard. Full leaderboard: aider.chat/docs/leaderboa…

OpenAI's o3-pro set a new SOTA of 85% on the aider polyglot coding benchmark, running with "high" reasoning effort. Full leaderboard: aider.chat/docs/leaderboa…

DeepSeek R1 0528 scored 71% on the aider polyglot coding benchmark. This is a significant increase over the prior release of R1. Full leaderboard: aider.chat/docs/leaderboa…

Gemini 2.5 Pro 06-05 has set a new SOTA on the aider polyglot coding benchmark, scoring 83% with 32k thinking tokens. The default thinking mode, where Gemini self-determines the thinking budget, scored 79%. Full leaderboard: aider.chat/docs/leaderboa…

Aider v0.84.0 is out with support for Claude 4 Opus and Sonnet and Gemini 2.5 Flash Preview 05-20. Aider wrote 79% of the code in this release. Full release notes: aider.chat/HISTORY.html
Gemini 2.5 Flash 05-20 with 23k thinking tokens scored 55% on the aider polyglot coding benchmark. Without thinking, it scored 44%. Full leaderboard: aider.chat/docs/leaderboa…

Claude 4 Opus scored 72% on the aider polyglot coding benchmark. Claude 4 Sonnet scored 61%. Both results used 32k thinking tokens. Sonnet 4 seems to have underperformed 3.7. Full leaderboard: aider.chat/docs/leaderboa…

Aider just passed 1000000000000000 GitHub Stars! That's 2^15 or 32,768 stars in decimal. github.com/Aider-AI/aider
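For anyone who wants to check the arithmetic, here is a minimal sketch that reads the star count as a base-2 string. The variable names are only illustrative.

```python
# The star count from the post, written as a binary string:
# a 1 followed by 15 zeros.
stars_binary = "1000000000000000"

# int(text, 2) parses the string as a base-2 number.
stars = int(stars_binary, 2)

# A 1 followed by n zeros in binary equals 2**n.
zeros = len(stars_binary) - 1

print(f"{stars_binary} (binary) = 2^{zeros} = {stars:,} (decimal)")
```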

I was able to benchmark Qwen3 235B A22B via the official API. It scored 60% using diff and 62% using the whole edit format. The leaderboard and Qwen3 article have both been updated. aider.chat/docs/leaderboa… aider.chat/2025/05/08/qwe…

Aider v0.83.0 is out with support for Qwen3 and Gemini 2.5 Pro Preview 05-06. It also includes a huge number of QOL features, many from contributors. Thanks! Aider wrote 55% of the code in this release. Full release notes: aider.chat/HISTORY.html
Gemini Pro is quite good at unified diffs. Not good enough to apply literally with patch, but aider has a very flexible udiff backend. I mostly use Gemini like:
aider --model gemini --edit-format udiff-simple
It benchmarks a bit worse, so I'm reluctant to make it the default.
Gemini 2.5 Pro Preview 05-06 scored 77% on the leaderboard, coming in 2nd place close behind o3 (high). Full leaderboard: aider.chat/docs/leaderboa…

The $6.32 benchmark cost for Gemini 2.5 Pro Preview 03-25 was incorrect. The true cost was higher, possibly significantly so. Unfortunately 03-25 is no longer available to re-run. The new 05-06 version costs $37 to run the benchmark. Root cause analysis: aider.chat/2025/05/07/gem…