Rob
@Rob_GCC
Entrepreneur, coder, and scientist.
A follow-up study on Apple's "Illusion of Thinking" paper has now been published. It shows the same models succeed once the format lets them give compressed answers, proving the earlier collapse was a measurement artifact. Token limits, not logic, froze the models. Collapse vanished…
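A rough back-of-the-envelope sketch of that token-budget argument, assuming Tower of Hanoi as the benchmark puzzle; the 10-tokens-per-move figure and 64k output budget are illustrative assumptions, not numbers from either paper:

```python
# Rough sketch: listing every Tower of Hanoi move grows exponentially with
# disk count, while a compressed answer (e.g. a short recursive program)
# stays roughly constant in length. Token costs below are illustrative guesses.

TOKENS_PER_MOVE = 10      # assumed cost of printing one move, e.g. "move disk 3 from A to C"
OUTPUT_BUDGET = 64_000    # assumed model output-token limit

for n_disks in range(5, 21):
    moves = 2 ** n_disks - 1              # optimal Tower of Hanoi solution length
    tokens_needed = moves * TOKENS_PER_MOVE
    fits = "fits" if tokens_needed <= OUTPUT_BUDGET else "exceeds budget"
    print(f"{n_disks:2d} disks: {moves:>9,} moves ~ {tokens_needed:>10,} tokens -> {fits}")
```

Under these assumed numbers the full move list blows past the budget somewhere around a dozen disks, which is the point where an apparent "collapse" would show up regardless of the model's logic.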
Beautiful research from @Apple. More thinking stops helping once tasks cross a critical depth. Thinking tokens rise, then crash, revealing compute inefficiency. Unexpectedly, standard LLMs beat LRMs on easy puzzles. Researchers stress-test them on puzzles whose difficulty can…
Our new state-of-the-art AI model Aeneas transforms how historians connect with the past. 📜 Ancient inscriptions often lack context – it's like solving a puzzle with 90% of the pieces lost to time. Aeneas helps researchers interpret and situate inscriptions in their historical context. 🧵
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude…
We are starting to slowly scale out of BSV. Relysia will be maintained for another 6 months (and 12 months for infra clients), while mnemonics should remain retrievable for much longer. While we still support BSV, the maintenance costs now exceed the revenue, and resources are better utilised…
Grok's thoughts:
Potential Effects on OpenAI’s Future:
1. Talent Retention Challenges: Losing key researchers could slow OpenAI’s progress toward AGI, especially if Meta’s high financial incentives continue to lure talent.
2. Compensation Overhaul: Recalibrating compensation…
remember when i said gemini-2.5-pro is the best? i wasn't joking. i use every single one of these models. i have api keys for them and each model is a hotkey away. gemini 2.5 is literally just the best
The aider polyglot leaderboard has been updated to reflect the new, much lower o3 pricing. aider.chat/docs/leaderboa…
G1 is so aggressive that it seems humans aren't used to its sudden appearance, and they get scared 😂
this goes in the worst designs hall of fame. the more you look at it, the more your eyes hurt ...
I asked o3 to analyse and critique Apple's new "LLMs can't reason" paper. Despite its inability to reason, I think it did a pretty decent job, don't you?
BREAKING: Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well. Here's what Apple discovered: (hint: we're not as close to AGI as the hype suggests)
In 2009 (!!!), Paul Graham (@paulg) wrote a post about the 5 most interesting founders of the last 30 years. The list included Steve Jobs, Larry & Sergey, TJ Rodgers, Paul Buchheit, and... Sam Altman (@sama)
used gemini 2.5 pro to build a simple shot counter for myself + give jordan feedback per shot.
Completely agree. LLMs being possible should feel like insane magic.
excited to finally share on arxiv what we've known for a while now: All Embedding Models Learn The Same Thing. embeddings from different models are SO similar that we can map between them based on structure alone, without *any* paired data. feels like magic, but it's real: 🧵
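A minimal sketch of the "same structure" idea, not the paper's unsupervised mapping method: embed a handful of sentences with two off-the-shelf models (all-MiniLM-L6-v2 and all-mpnet-base-v2 are just example choices) and check how correlated their internal similarity matrices are.

```python
# Minimal sketch: compare two embedding models only through their similarity
# structure (Gram matrices), not through any learned mapping. Model names and
# sentences are illustrative; the paper's unsupervised translation between
# spaces is a much stronger result than this correlation check.
import numpy as np
from sentence_transformers import SentenceTransformer

sentences = [
    "The cat sat on the mat.",
    "A kitten rests on a rug.",
    "Stock prices fell sharply today.",
    "Markets dropped at the close.",
    "The rocket reached orbit successfully.",
]

def similarity_matrix(model_name: str) -> np.ndarray:
    emb = SentenceTransformer(model_name).encode(sentences, normalize_embeddings=True)
    return emb @ emb.T  # cosine similarities, since rows are unit-normalized

A = similarity_matrix("all-MiniLM-L6-v2")
B = similarity_matrix("all-mpnet-base-v2")

# Correlate the off-diagonal entries: high correlation = shared geometry.
mask = ~np.eye(len(sentences), dtype=bool)
r = np.corrcoef(A[mask], B[mask])[0, 1]
print(f"similarity-structure correlation between the two models: {r:.3f}")
```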
this is sick. all i'll say is that these GIFs are proof that the biggest bet of my research career is gonna pay off. excited to say more soon
We're barely 2 years from Will Smith eating spaghetti...
Say goodbye to the silent era of video generation: Introducing Veo 3 — with native audio generation. 🗣️ Quality is up from Veo 2, and now you can add dialogue between characters, sound effects and background noise. Veo 3 is available now in the @GeminiApp for Google AI Ultra…
Aurelia by @holtsetio. A completely procedural jellyfish, with Verlet physics and fake volumetric lighting. Rendered in WebGPU and TSL. holtsetio.com/lab/aurelia/
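For context, a minimal position-Verlet sketch in plain Python; the actual demo runs in WebGPU/TSL and adds constraints between particles to hold the jellyfish's shape, so this only illustrates the core update rule.

```python
# Position-Verlet sketch: the next position comes from the current and previous
# positions plus acceleration, with no explicit velocity state. Values below
# (gravity, timestep, starting height) are arbitrary illustration choices.

GRAVITY = (0.0, -9.81)
DT = 1.0 / 60.0

def verlet_step(pos, prev_pos, accel, dt=DT):
    x, y = pos
    px, py = prev_pos
    ax, ay = accel
    # x_new = 2*x - x_prev + a*dt^2
    new = (2 * x - px + ax * dt * dt,
           2 * y - py + ay * dt * dt)
    return new, pos  # new position; current becomes the new "previous"

pos, prev = (0.0, 10.0), (0.0, 10.0)   # particle starting at rest
for _ in range(120):                    # simulate two seconds of free fall
    pos, prev = verlet_step(pos, prev, GRAVITY)
print(f"height after 2s of free fall: {pos[1]:.2f} (analytic: {10 - 0.5 * 9.81 * 4:.2f})")
```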
Introducing AlphaEvolve: a Gemini-powered coding agent for algorithm discovery. It’s able to:
🔘 Design faster matrix multiplication algorithms
🔘 Find new solutions to open math problems
🔘 Make data centers, chip design and AI training more efficient across @Google. 🧵
Announcing the newest releases from Meta FAIR. We’re releasing new groundbreaking models, benchmarks, and datasets that will transform the way researchers approach molecular property prediction, language processing, and neuroscience.
1️⃣ Open Molecules 2025 (OMol25): A dataset…