Super Dario
@inductionheads
Autoregression is the secret sauce
The most important thing to understand for continuous learning is that memory is a tool. Agents must be RLed to learn how to use their own memories
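A minimal sketch of the "memory is a tool" idea: the agent only touches memory through explicit read/write calls, and reward comes from downstream task success, so an RL-trained policy has to learn when storing and retrieving pays off. All names here (MemoryTool, run_episode, the policy interface) are hypothetical illustrations, not anyone's actual implementation.

```python
# Sketch: memory exposed to the agent only as a tool, with RL credit assignment
# (task reward at the end of the episode) teaching the policy when to use it.
from dataclasses import dataclass, field

@dataclass
class MemoryTool:
    notes: list[str] = field(default_factory=list)

    def write(self, note: str) -> str:
        self.notes.append(note)
        return f"stored note #{len(self.notes)}"

    def read(self, query: str) -> str:
        # Toy retrieval: return notes that share a word with the query.
        hits = [n for n in self.notes
                if set(query.lower().split()) & set(n.lower().split())]
        return "\n".join(hits) if hits else "(no matching notes)"

def run_episode(policy, task, memory: MemoryTool) -> float:
    """policy(observation) -> {"tool": "write"|"read"|"answer", "arg": str}.
    Reward is 1.0 only for a correct final answer, so the policy is rewarded
    for memory use only insofar as it helps the task."""
    observation = task["prompt"]
    for _ in range(8):  # small step budget
        action = policy(observation)
        if action["tool"] == "write":
            observation = memory.write(action["arg"])
        elif action["tool"] == "read":
            observation = memory.read(action["arg"])
        else:
            return 1.0 if action["arg"] == task["answer"] else 0.0
    return 0.0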
much more empirical/interpy work needed to understand why RL with CoT is so much better than without (not looking for theoretical explanations like test-time scaling expressivity or latent variable expectation maximization)
pictured: non-reasoning model doing non-reasoning things
It's becoming more and more clear that Claude Code is the everything agent
Welcome home Ilya
He said there have been over 70 million user videos made with Veo 3, and also mentioned Ilya's SSI. He said Safe Superintelligence will exclusively use Google TPUs.
Andrej, my long lost brother, let us work together again!
Thrilled to introduce "Deep Researcher with Test-Time Diffusion," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀 arxiv.org/pdf/2507.16075
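A rough sketch of the plan, draft, and revise loop the announcement describes. This is only an illustration of the idea, not the paper's algorithm; `llm(prompt)` and `search(query)` are hypothetical stand-ins for a language model call and a retrieval tool.

```python
def deep_research(question: str, llm, search, num_revisions: int = 4) -> str:
    # Plan, then write a rough first draft.
    plan = llm(f"Write a research plan (key sub-questions) for: {question}")
    draft = llm(f"Write a first-pass report on: {question}\nPlan:\n{plan}")
    for _ in range(num_revisions):
        # One revision pass: find weak spots, gather evidence, rewrite the draft.
        gaps = llm(f"List the weakest or least-supported claims in this report:\n{draft}")
        evidence = search(gaps)
        draft = llm(
            "Revise the report below. Keep what is well supported and "
            "integrate the new evidence.\n\n"
            f"Report:\n{draft}\n\nNew evidence:\n{evidence}"
        )
    return draft
```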
The point isn't that IMO problems are representative of the things we want these models to do, necessarily. The point is they are more closely representative of the ability to build new architectures for systems that can
A blue whale eats between 30 and 50 million krill a day. They can live to be over 100 years old. Trillions of krill tortured to death for the benefit of a single animal life. Is it worth it? Let us exterminate the blue whale!
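A back-of-the-envelope check of the "trillions" claim, using only the figures in the post (treating the daily intake as year-round, which is the post's own simplification):

```python
# Lifetime krill consumption implied by the post's numbers.
krill_per_day_low, krill_per_day_high = 30e6, 50e6
days_per_year = 365
lifetime_years = 100

low = krill_per_day_low * days_per_year * lifetime_years    # ~1.1e12
high = krill_per_day_high * days_per_year * lifetime_years  # ~1.8e12
print(f"lifetime krill: {low:.2e} to {high:.2e}")  # roughly 1 to 2 trillion
```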
Okay let's clarify some things. Link below
What a dearth of imagination. Build orphanages. Fund telomere research. Hell, construct a mechasuit. Build a blimp city. Create a digital tongue so you can send taste over the internet. Create a swarm of drones that give you giant stereoscopic vision of the earth. Fly your drone…
RL went from not working at all to working so well that code can have major correctness bugs and you don't notice because it still just works
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
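For context, a minimal sketch of how an open checkpoint like this is typically loaded with Hugging Face transformers. The repo id below is inferred from the model name in the announcement and may differ, and in practice a 480B-parameter MoE needs a multi-GPU serving stack rather than a single-process script like this.

```python
# Hedged usage sketch: loading an open checkpoint with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Write a Python function that checks if a number is prime."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```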
x.com/AnthropicAI/st…
In a joint paper with @OwainEvans_UK as part of the Anthropic Fellows Program, we study a surprising phenomenon: subliminal learning. Language models can transmit their traits to other models, even in what appears to be meaningless data. x.com/OwainEvans_UK/…
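A schematic of the kind of experiment the thread describes, as I understand the summary: a teacher model that has some trait generates data that looks meaningless (for example, bare number sequences), a student is fine-tuned on that data, and the trait is then measured in the student. `generate`, `finetune`, `measure_trait`, and the "likes owls" trait are hypothetical placeholders, not the paper's code.

```python
def subliminal_learning_experiment(teacher, student, generate, finetune, measure_trait):
    # 1. Teacher (which has the trait) produces ostensibly trait-free data.
    prompts = ["Continue this sequence of numbers: 381, 204, 77, ..."] * 1000
    data = [generate(teacher, p) for p in prompts]

    # 2. Filter out anything that mentions the trait explicitly.
    data = [d for d in data if "owl" not in d.lower()]  # example trait: "likes owls"

    # 3. Fine-tune the student on the filtered data and test for the trait.
    before = measure_trait(student)
    student = finetune(student, data)
    after = measure_trait(student)
    return before, after  # the surprising result: `after` shifts toward the teacher's trait
```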
Willing to bet we’ll all converge on the same approach for long-form answer reasoning. See you next year :)
Congrats to the GDM team on their IMO result! I think their parallel success highlights how fast AI progress is. Their approach was a bit different from ours, but I think that shows there are many research directions for further progress. Some thoughts on our model and results 🧵