Siméon
@Simeon_Cps
Creating more common knowledge on AI risks, one tweet at a time. Founder in Paris. AI auditing, standardization & governance.
The wave that first hit protein folding scientists in 2019 is now coming for mathematicians
the openai IMO news hit me pretty heavy this weekend. i'm still in the acute phase of the impact, i think. i consider myself a professional mathematician (a characterization some actual professional mathematicians might take issue with, but my party my rules) and i don't think i…
Google is crushing it. They got their gold in natural language with Gemini. It seems like they mostly caught up to OpenAI, in less than a year.
Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to @lmthang and the team! deepmind.google/discover/blog/…
i'd like to read some commentary by people with relevant domain knowledge about these proofs. Maybe @davidad, @an_interstice or @FabienDRoger? Do they feel like the kind of proofs that are relying on a ton of knowledge and leveraging how knowledgeable these models are or do…
10/N If you want to take a look, here are the model’s solutions to the 2025 IMO problems! The model solved P1 through P5; it did not produce a solution for P6. (Apologies in advance for its … distinct style—it is very much an experimental model 😅) github.com/aw31/openai-im…
Here we are, crushing benchmarks that characterize the top of human fluid intelligence
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
I'm seeing contradictory reports on whether the H20 license is only for existing inventory (which is 100k-200k GPUs afaiu) or if it's longer term? Which one is it?
So, uh, what's the Windsurf deal? OpenAI pays the $3B, Google acqui-hires the founders and Cognition acquires the leftovers?

i feel like Claude Pro (whether personal or team plan) is limited to like 5 Opus interactions per day? it feels really overpriced for what it gives access to. I'm constantly rate limited despite using it pretty sparingly.
pure alpha just dropped
So excited to share that today @CSETGeorgetown and @emergingtechobs are launching an ✨updated✨ version of our chip supply chain explorer! We've got: 👉 New data 👉 New features 👉 New analysis. Links in thread.
Moonshot, the Margin Slayers
For a wide range of tasks, K2 is probably the cheapest model by far right now, in terms of actual costs per task. It is just cheap, it has no long-CoT, and it does not yap. This is very refreshing. Like the best of Anthropic models, but cheaper and even more to the point.
These costs 👀
Rumblings all morning it was going to arrive, and here it is. Open source, and comparable to the best models in the world.
Exciting! Not sure it's gonna work but definitely worth fighting this fight that most have dropped! Congrats Luke & Rudolf
Announcing Workshop Labs, a public benefit company.