Tim Rocktäschel
@_rockt
Director and Open-Endedness Team Lead @GoogleDeepMind, Professor of AI @AI_UCL, PI @UCL_DARK, Fellow @ELLISforEurope.
My @iclr_conf 2025 Keynote on "Open-Endedness, World Models, and the Automation of Innovation" is now publicly available: iclr.cc/virtual/2025/i…
Come work with us on helping agents reach unprecedented levels of autonomy :) We have a ridiculously cracked team and are pushing the bounds of AGI with awesome folks like @robertarail @OriolVinyalsML @_rockt @quocleix Come join ~~ le rocket ship ~~
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an…
I love that book! “..many of our greatest engineering aspirations—such as flight, solar power, artificial intelligence—were not the explicit objective of evolution, though it created all of them. It created them because nature is a stepping-stone collector, accumulating steps…
Tim Rocktäschel’s keynote talk at #ICLR2025 about Open-Endedness and AI. “Almost no prerequisite to any major invention was invented with that invention in mind.” “Basically almost everybody in my lab at UCL and at DeepMind have read this book: Why Greatness Cannot Be Planned.”
Join us!
1 43.6 Grok-4-Wiz-AI-Cha died in The Dungeons of Doom on level 1. Killed by a housecat.
LLMs acing math olympiads? Cute. But BALROG is where agents fight dragons (and actual Balrogs)🐉😈 And today, Grok-4 (@grok) takes the gold 🥇 Welcome to the podium, champion!
Grok 4 results on @NetHack_LE just dropped!
The world is moving towards agents. Static benchmarks don't measure what agents do best (multi-turn reasoning). Thus, interactive benchmarks: * Terminal Bench (@alexgshaw, @Mike_A_Merrill) * Text Arena (@LeonGuertler) * BALROG (@PaglieriDavide, @_rockt) * ARC-AGI-3 (@arcprize)
💯 Who knew that the International Math Olympiad (IMO) is much easier than @NetHack_LE for AI.
Meanwhile, another wall - @NetHack_LE - is still standing firm and tall.
When will an AI win a Gold Medal in the International Math Olympiad? Median predicted date over time: July 2021: 2043 (22 years away); July 2022: 2029 (7 years away); July 2023: 2028 (5 years away); July 2024: 2026 (2 years away). metaculus.com/questions/6728…
Mine is even simpler. Assuming it's a general purpose learning agent, let it read the NetHack Wiki and ascend once. Probably more of an ASI benchmark at this point.
My bar for AGI is far simpler: an AI cooking a nice dinner at anyone’s house for any cuisine. The Physical Turing Test is very likely harder than the Nobel Prize. Moravec’s paradox will continue to haunt us, looming larger and darker, for the decade to come.
I think ARC is a great eval, but at this point we should just use nethack
Today we’re releasing our first public preview of ARC-AGI-3: the first three games. Version 3 is a big upgrade over v1 and v2 which are designed to challenge pure deep learning and static reasoning. In contrast, v3 challenges interactive reasoning (eg. agents). The full version…
NeurIPS is pleased to officially endorse EurIPS, an independently-organized meeting taking place in Copenhagen this year, which will offer researchers an opportunity to additionally present their accepted NeurIPS work in Europe, concurrently with NeurIPS. Read more in our blog…
🚀 Excited to announce our workshop “Embodied World Models for Decision Making” at #NeurIPS2025! 🎉 Keynote speakers, panelists, and content are now live! Check out: 👉 embodied-world-models.github.io #WorldModels #RL #NeurIPS #NeurIPS2025 #neuripsworkshop #workshop
The term "AI alignment" is often used without specifying "to whom?" and much of the work on AI alignment in practice looks more like "AI controllability" without answering "who controls the controller?" (i.e. user or operator). One key challenge is that alignment is fundamentally…
Introducing: Full-Stack Alignment 🥞 A research program dedicated to co-aligning AI systems *and* institutions with what people value. It's the most ambitious project I've ever undertaken. Here's what we're doing: 🧵
When you build EV cars, you still start with a car body, wheels, and a steering wheel. When you build starships, you still start with a rocket frame, fuel tanks, fins etc. I would be surprised if frontier AI models in five years all still follow the same recipe.
It's quite sad to see Elon in this position. He has built the world's first commercially successful electric car company and the world's first commercially successful private space company, but with xAI, all he can do is throw more GPUs at the problem everyone else is solving…
The automation of innovation is within reach! Delighted that my @raais talk is now available for anyone to watch, alongside an excellent blogpost summary by the inimitable @nathanbenaich.
"2025 is the year of open-endedness" at @raais, @edwardfhughes argued that we’re entering a new phase in the evolution of ai: one where open-endedness becomes the central organizing principle. Not just solving problems, but defining them. Not just predicting the next token,…
In a rare combination of theory and empirics @Karim_abdelll & @MatthewFdashR show that UED via minimax-regret can mitigate goal misgeneralization. They're the first to connect UED to AI Safety. Expect more to follow! There's gold for UED/AIS researchers in the 30 page appendix🚀
*New AI Alignment Paper* 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function, instead of the human's intended goal. 😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
Homo sapiens survived the ice age, the great famine, bubonic plague, and COVID-19, only to be wiped out by backprop. That would be a real shame!
AI isn't stealing your creativity. Rick Rubin, from the Daily Stoic Podcast. Watch the full episode here: youtu.be/GvzaNZ67gQA?si…