Michael Matthews (@mitrma)

Pinned

M

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

14

216

1.0K

775

159.0K

Michael Matthews Retweeted

M

Martin Klissarov@MartinKlissarov · Jun 27

As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing. But how do we discover such temporal structure? Hierarchical RL provides a natural formalism-yet many questions remain open. Here's our overview of the field🧵

12

63

277

182

31.0K

Michael Matthews Retweeted

S

Samuel Garcin@SamuelGarcin · Jun 13

You work on RL from pixels, and you're tired to wait 10 hours for a DMC run to finish? Or up to 100 hours, if you add video distractors? Well, we got you covered : PixelBrax can run your continuous control experiments from pixels in < 1 hr! Come chat with @trevormcinroe and I at…

1

3

16

6

772

Michael Matthews Retweeted

M

Mikael Henaff@HenaffMikael · Jun 9

A couple bits of news: 1. Happy to share my first (human) NetHack ascension-next step is RL agents :) 2. I wrote a post discussing some @NetHack_LE challenges & how they map to open problems in RL & agentic AI. Still the best RL benchmark imo. mikaelhenaff.substack.com/p/first-nethac…

4

11

57

23

10.0K

Michael Matthews Retweeted

S

Seohong Park@seohong_park · Jun 5

Is RL really scalable like other objectives? We found that just scaling up data and compute is *not* enough to enable RL to solve complex tasks. The culprit is the horizon. Paper: arxiv.org/abs/2506.04168 Thread ↓

10

143

919

751

135.0K

Michael Matthews Retweeted

J

Jakob Foerster@j_foerst · May 27

Hello World: My team at FAIR / @metaai (AI Research Agent) is looking to hire contractors across software engineering and ML. If you are interested and based in the UK, please fill in the following short EoI form: docs.google.com/forms/d/e/1FAI…

3

23

113

50

15.0K

M

Michael Matthews@mitrma · Apr 26

We are presenting Kinetix today! Oral - 11:30am Peridot Room 5F Poster - 3pm Hall 3+2B 377

MMichael Matthews@mitrma · Nov 11

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

1

3

22

1

1.0K

Michael Matthews Retweeted

M

Matthew Jackson@JacksonMattT · Apr 18

🌹 Today we're releasing Unifloral, our new library for Offline Reinforcement Learning! We make research easy: ⚛️ Single-file 🤏 Minimal ⚡️ End-to-end Jax Best of all, we unify prior methods into one algorithm - a single hyperparameter space for research! ⤵️

5

36

144

80

30.0K

M

Michael Matthews@mitrma · Apr 18

I'll be in Singapore next week to present Kinetix as an Oral along with @mcbeukman. Reach out if you'd like to chat! 🇸🇬

MMichael Matthews@mitrma · Nov 11

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

0

2

26

1

1.0K

M

Michael Matthews@mitrma · Apr 17

I'll be attending ICLR next week to present Kinetix with @mitrma. Would love to chat about anything UED / Open-Ended RL / QD related, or interesting research in general :)

MMichael Matthews@mitrma · Nov 11

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

0

1

34

3

1.0K

Michael Matthews Retweeted

A

Andrei Lupu@_andreilupu · Mar 25

Did you know that \textcolor{white} text is still visible to LLMs? Anyway, don't use LLMs to write your reviews. Your co-authors will thank you.

4

6

146

32

27.0K

M

Michael Matthews@mitrma · Feb 25

Kinetix was featured on Computerphile!

JJakob Foerster@j_foerst · Feb 25

Some time ago @computer_phile came to visit @FLAIR_Ox. I didn't have tons of time to prepare (too busy doing research!) so turning my verbiage into something semi coherent must have been difficult for the team. Here's the result -- judge for yourself: youtube.com/watch?v=fN3gdU…

2

24

1

3.0K

Michael Matthews Retweeted

L

Leor Cohen@liorcohen5s · Feb 21

Introducing M³: A 𝗠odular 𝗪orld 𝗠odel over streams of tokens for sample-efficient RL 🌍🤖 M³ achieves state-of-the-art performance for planning-free world models on Atari-100K 🕹️, DMC 🦾, and Craftax-1M! 🚀 🧵1/8

2

6

32

17

3.0K

Michael Matthews Retweeted

M

Machine Learning Street Talk@MLStreetTalk · Feb 12

Jakob Foerster @j_foerst at @UniofOxford arguing that the AI community needs to avoid being goodharted by benchmarks.

4

16

117

37

10.0K

Michael Matthews Retweeted

a

alphaXiv@askalphaxiv · Feb 15

1997: Deep Blue defeats Kasparov at chess 2016: AlphaGo masters the game of Go 2025: Stanford researchers crack Among Us Trending on alphaXiv 📈 Remarkable new work trains LLMs to master strategic social deduction through multi-agent RL, doubling win rates over standard RL.

15

249

2.0K

989

209.0K

Michael Matthews Retweeted

M

Mikayel Samvelyan@_samvelyan · Feb 14

⚔️ MiniHack Updates! ⚔️ 1️⃣ MiniHack 1.0.0 is here! Following popular demand, it now supports the new Gymnasium API and is built on NLE 1.1.0. Huge thanks to @Stephen_Oman (maintainer of @NetHack_LE ) for his outstanding contribution! 🙌

3

13

65

10

5.0K

M

Michael Matthews@mitrma · Feb 11

Kinetix has been accepted at ICLR as an Oral! See you in Singapore 🇸🇬

MMichael Matthews@mitrma · Nov 11

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

2

3

49

7

3.0K

M

Michael Matthews@mitrma · Feb 5

Congratulations to @antoine_dedieu @joeaortiz @sirbayes and the team for setting a new SOTA on Craftax-1M and Craftax-Classic-1M! 🎉

aantoine dedieu@antoine_dedieu · Feb 5

Happy to share our new preprint “Improving Transformer World Models for Data-Efficient RL”: arxiv.org/abs/2502.01591 We propose a ladder of improvements to model-based RL and achieve for the first time a superhuman reward on the challenging Craftax-classic benchmark! 1/10

2

0

17

6

1.0K

Michael Matthews Retweeted

M

Martin Klissarov@MartinKlissarov · Feb 4

Can AI agents adapt zero-shot, to complex multi-step language instructions in open-ended environments? We present MaestroMotif, a method for AI-assisted skill design that produces highly capable and steerable hierarchical agents. To the best of our knowledge, it is the first…

6

52

202

165

79.0K

Michael Matthews Retweeted

T

Tim Rocktäschel@_rockt · Jan 23

Couldn't agree more. "UK Research and Innovation funding in the UK fell under the previous government from 6,835 in 2018-19 to 4,900 in 2022-23". To give a concrete example (with my @UCLCS professor hat on): 4 out of 7 @UCL_DARK PhD students were funded by the Centre for Doctoral…

7

35

157

18

22.0K

M

Michael Matthews@mitrma · Dec 6

I'll also be at NeurIPS, keen to chat about UED, SFL, Kinetix, or anything in open-ended RL :)

AAlex Rutherford@alexrutherford0 · Dec 6

Hello! I'll be at NeurIPS next week presenting our work on using learnability to select levels for RL autocurricula. If you're there, I would love to chat about curricula and RL generalisation more broadly. Please DM if you'd like to grab a coffee :)

2

4

28

5

3.0K