Matthew Jackson
@JacksonMattT
SR @GoogleDeepMind (Genie), ex @wayve_ai (GAIA) | PhD student @whi_rl & @flair_ox | Video Models, Diffusion, and Offline RL
🌹 Today we're releasing Unifloral, our new library for Offline Reinforcement Learning! We make research easy: ⚛️ Single-file 🤏 Minimal ⚡️ End-to-end Jax Best of all, we unify prior methods into one algorithm - a single hyperparameter space for research! ⤵️
1/ 🕵️ Algorithm discovery could lead to huge AI breakthroughs! But what is the best way to learn or discover new algorithms? I'm so excited to share our brand new @rl_conference paper which takes a step towards answering this! 🧵
Antiviral therapy design is myopic 🦠🙈 optimised only for the current strain. That's why you need a different Flu vaccine every year! Our #ICML2025 paper ADIOS proposes "shaper therapies" that steer viral evolution in our favour & remain effective. Work done @FLAIR_Ox 🧵👇
🚀Introducing “StochasTok: Improving Fine-Grained Subword Understanding in LLMs”!🚀 LLMs are incredible but still struggle disproportionately with subword tasks, e.g., for character counts, wordplay, multi-digit numbers, fixing typos… Enter StochasTok, led by @anyaasims! [1/]
Excited to announce my first paper, with @j_foerst and @FLAIR_Ox, was accepted into @rl_conference 2025! We establish a new UED method called NCC that obtains strong performance based on principles of optimisation theory.
Google DeepMind, David Silver reveals: we built a system that used RL to discover its own RL algorithms. This AI-designed system outperformed all human-created RL algorithms developed over the years.
The best of RL research, brought to Offline RL! 🚀 TL;DR 1. CleanRL-style implementations ⚡️ 2. Rainbow-style algorithm unification 🦾 3. Rliable-style evaluation protocol 🔬 Check out our paper + library!
Making offline RL more honest, reproducible, and robust.
🔮Looking forward, we intend Unifloral🌹to be more than a library—it's a scaffolding 🌱 for indexing current & future ORL work!🏵️ We encourage 🥺 you to: 🔄 PR your awesome work using the 🌹 format 🎮 Explore the unified implementation 🧩 Try to find new SOTA algos with it
⁉️ While trying to find the best hyperparameter setting of ORL algorithms using a bandit, we noticed something unexpected: 🤯 After evaluating the episodic returns of more and more policies online, the bandit's performance *decreased*! x.com/JacksonMattT/s…
Introducing GAIA-2 🌎Generative world modeling just stepped up a gear. GAIA-2 is the latest development of Wayve’s video-generative world model tailored for driving. GAIA-2 offers richer, more realistic, and highly controllable synthetic driving scenarios, accelerating Wayve’s…
My group @FLAIR_Ox is recruiting a postdoc and looking for someone who can get started by the end of April. Deadline to apply is in one week (!), 19th of March at noon, so please help spread the word: my.corehr.com/pls/uoxrecruit…
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
Huge unlock for RL foundation models and environment design... 1) Grounded in real-world physics, 2) Level design in the browser, 3) End-to-end Jax, so lightning fast! More great work from the Michaels 🔥
🏋️‍♂️ Go from creating an environment to having a trained expert agent within minutes! As part of Kinetix, we are releasing an editor that can create custom physics-based RL environments, and import them seamlessly into an RL training loop. 1/