Mike Dodds
@miike
Formal methods enthusiast. Principal scientist at @Galois. English immigrant. Nitwit. Opinions my own.
We need to start figuring out what’s possible when math proofs are very cheap
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
In a new paper, we examine recent claims that AI systems have been observed ‘scheming’, or making strategic attempts to mislead humans. We argue that to test these claims properly, more rigorous methods are needed.
This past week, Harmonic had the opportunity to represent our advanced mathematical reasoning model, Aristotle, at the International Mathematical Olympiad - the most prestigious mathematics competition in the world. To uphold the sanctity of the student competition, the IMO Board…
True in general, but some kinds of sci/tech such as formal verification are genuinely blocked on the cost of math, and for those, AI will be transformative
this IMO gold will fly past us as quickly as the turing test did. soon normies will say “duh of course they’re good at math, they’re computers”. but the RL breakthroughs the team made to solve math (congrats!!) will likely generalize to environments with much higher direct value
If it holds up, this is an Insane result, wtf
Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵
Hot take: the programming languages research community should focus less on complex solutions to very narrow problems in the name of novelty, and more on making existing solutions accessible to non-experts.
Yes “programming in specifications” seems promising, with AI as a kind of verifying compiler. The core problem may be designing good spec languages. English is flexible but impossible to make guarantees about - vs formal languages that are precise but users can’t understand
XKCD is so handy. Want to make fun of somebody for extrapolating too much? Show them #605. Want to make fun of somebody for extrapolating too little? Show them #2278! Every possible argument, neatly packaged and numbered for the internet commentator's convenience.
Holy shit, linked below is one of my favorite blog posts of all time. Everyone working in cybersecurity should read it twice.
There galois.com/articles/what-… I've so much to say about this topic that I wouldn't know where to start.. so won't start.
Shout out to Galois intern Mark Moeller, who is presenting “Active Learning of Symbolic NetKAT Automata” at #PLDI2025 in Seoul on Wed, June 18! If you’re at PLDI, swing by to say “Hi,” chat about Galois, and nerd out on automata & #NetKAT for packet-switched networks!
This is one of the best articles on formal methods I have read for a while. I am so glad you wrote this insightful article. Congratulations @miike.
New-ish @galois blog: “What Works (and Doesn't) Selling Formal Methods”. The boring truth: engineers are rational and adoption is all about cost/benefit tradeoffs