Hristo Vassilev
@hristo_vassilev
Shortcut – the first superhuman Excel agent – is live. While not perfect, Shortcut beats first-year analysts from McKinsey/Goldman head-to-head 89.1% of the time (220:27) when blindly judged by their managers. We even gave humans 10x more time. Try Shortcut now (before your boss does).
november 2016 with nvidia's stock at $1.63: "For fun, our firm has an internal game of what public companies we'd invest in if we were a hedge fund. We'd put all our money into Nvidia." — @pmarca nvidia today: $173

I believe this is true, I used the word in my “Unreasonable Effectiveness of RNNs” post from 2015, and as far as I can remember I also hallucinated it.
TIL that @karpathy is not only behind “vibe coding” but also “hallucinations”

We achieved gold medal-level performance 🥇on the 2025 International Mathematical Olympiad with a general-purpose reasoning LLM! Our model solved world-class math problems—at the level of top human contestants. A major milestone for AI and mathematics.
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵
I had early access & ChatGPT agent is, I think, a big step forward for getting AIs to do real work. Even at this stage, it does a good job autonomously doing research & assembling Excel files (with formulas!), PowerPoint, etc. It gives a sense of how agents are coming together.
Humanity has prevailed (for now!) I'm completely exhausted. I figured, I had 10h of sleep in the last 3 days and I'm barely alive. I'll post more about the contest when I get some rest. (To be clear, those are provisional results, but my lead should be big enough)
Introducing Studio Setup 🎥 🚀 RT if you love a good Studio Setup :)
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're…
Drake performing SICKO MODE at Wireless sounds insane 😭
Drake Just Sonned Kendrick Lamar And Nobody Even Noticed 💣
Things Worth Remembering: ‘Life Is Real! Life Is Earnest!’ — These days, it’s embarrassing to take anything too seriously. But if you don’t, as the poets remind us, your life won’t amount to much. @RyanHoliday @TheFP
Drake shook up Wireless Fest with surprise appearances from Lauryn Hill, Giveon, Bryson Tiller, and Mario. Unforgettable energy, iconic collabs, and a headlining set that raised the bar. Tap in for the full breakdown: complex.com/music/a/markel…
What Did I Miss? @Drake youtube.com/watch?v=weU76D…
Grok 4 may be doing well on some metrics, but after an hour or so of testing my conclusion is that it is overfitting. Grok 4 is behind o3 and Gemini 2.5 in reasoning & well behind either of those models or o4 in writing quality. But great to see competition!