nathan monette
@nathanrmonette
msc student @FLAIR_Ox
Excited to announce my first paper, with @j_foerst and @FLAIR_Ox, was accepted into @rl_conference 2025! We establish a new UED method called NCC that obtains strong performance based on principles of optimisation theory.

He's terrible. Screwed my buddy sgd.
Anyone knows adam?
reinforcement learning infrastructure
Every great consumer app is just a slot machine in disguise.
1/ 🕵️ Algorithm discovery could lead to huge AI breakthroughs! But what is the best way to learn or discover new algorithms? I'm so excited to share our brand new @rl_conference paper which takes a step towards answering this! 🧵
*New AI Alignment Paper* 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function, instead of the human's intended goal. 😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
🚀 Excited to announce Hyperoptax, a library for parallel hyperparameter tuning in JAX. Implements Grid, Random, and Bayesian search in pure JAX so that you can rapidly search across parameter configurations in parallel ‖. 📦 pip install hyperoptax github.com/TheodoreWolf/h…
Antiviral therapy design is myopic 🦠🙈 optimised only for the current strain. That's why you need a different Flu vaccine every year! Our #ICML2025 paper ADIOS proposes "shaper therapies" that steer viral evolution in our favour & remain effective. Work done @FLAIR_Ox 🧵👇
Theory of Mind (ToM) is crucial for next gen LLM Agents, yet current benchmarks suffer from multiple shortcomings. Enter 💽 Decrypto, an interactive benchmark for multi-agent reasoning and ToM in LLMs! Work done with @TimonWilli & @j_foerst at @AIatMeta & @FLAIR_Ox 🧵👇
Had a blast together with @how_uhh at @LeRobotHF hackathon this weekend. Built phone-based teleoperation for my SO-100 arm using pose estimation. Here’s a quick BTS of the final demo with teleop working (+ a small victory dance 🎉)
True dat
now that RL is hot again, you should all register for RLC and come visit Edmonton in August rl-conference.cc/index.html
So many incredible, inspiring ideas in @_rockt’s keynote… But my personal favorite slide was a clarification on the world model definition 😱
FLAIR is at ICLR 🇸🇬 Find out our schedule for the week 👇
Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity. Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks. shorturl.at/fqsNN 🧵