Mikael Henaff
@HenaffMikael
Research Scientist at @MetaAI, previously postdoc at @MSFTResearch and PhD at @nyuniversity. All views my own.
Super stoked to share this work led by @proceduralia & @MartinKlissarov. Our method Motif uses LLMs to rank pairs of observation captions and synthesize dense intrinsic rewards specified by natural language. New SOTA on NetHack while being easily steerable. Paper+code in thread!
Can reinforcement learning from AI feedback unlock new capabilities in AI agents? Introducing Motif, an LLM-powered method for intrinsic motivation from AI feedback. Motif extracts reward functions from Llama 2's preferences and uses them to train agents with reinforcement…
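The reward-extraction step described above (turning an LLM's pairwise preferences over caption pairs into a dense reward) can be sketched as Bradley-Terry reward fitting. A minimal pure-Python toy, assuming a linear reward over made-up "caption embeddings" (the actual method trains a neural reward model on Llama 2's rankings):

```python
import math

def bradley_terry_loss(r_pref, r_other):
    """Negative log-likelihood that the preferred item wins under the
    Bradley-Terry model: P(pref > other) = sigmoid(r_pref - r_other)."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_pref - r_other))))

def train_reward(features, preferences, lr=0.1, epochs=200):
    """Fit a linear reward r(x) = w . x from pairwise preferences.
    `preferences` is a list of (i, j) index pairs meaning the annotator
    (here, an LLM judging observation captions) preferred i over j."""
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(epochs):
        for i, j in preferences:
            ri = sum(wk * xk for wk, xk in zip(w, features[i]))
            rj = sum(wk * xk for wk, xk in zip(w, features[j]))
            # gradient of the Bradley-Terry NLL w.r.t. the score gap
            g = 1.0 / (1.0 + math.exp(-(ri - rj))) - 1.0  # in (-1, 0)
            for k in range(dim):
                w[k] -= lr * g * (features[i][k] - features[j][k])
    return w

# Toy example: 2-D "embeddings"; the annotator consistently prefers
# observations with a larger first coordinate.
feats = [[2.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
prefs = [(0, 1), (1, 2), (0, 2)]
w = train_reward(feats, prefs)
scores = [sum(wk * xk for wk, xk in zip(w, f)) for f in feats]
```

The fitted scores then serve as a dense intrinsic reward for the RL agent; in the toy above they recover the annotator's ordering.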
At #ICML2025 (16 Jul, 11 AM) we present Meta Locate 3D: a model for accurate object localization in 3D environments. Meta Locate 3D can help robots accurately understand their surroundings and interact more naturally with humans. Demo, model, paper: go.fb.me/2lx31s
Happy "@NetHack_LE is still completely unsolved" day for those of you who are celebrating it. We released The NetHack Learning Environment (arxiv.org/abs/2006.13760) on this day five years ago. Current frontier models achieve only ~1.7% progression (see balrogai.com).…
A couple bits of news: 1. Happy to share my first (human) NetHack ascension; next step is RL agents :) 2. I wrote a post discussing some @NetHack_LE challenges & how they map to open problems in RL & agentic AI. Still the best RL benchmark imo. mikaelhenaff.substack.com/p/first-nethac…
Introducing Meta Locate 3D: a model for accurate object localization in 3D environments. Learn how Meta Locate 3D can help robots accurately understand their surroundings and interact more naturally with humans. You can download the model and dataset, read our research paper,…
Can visual SSL match CLIP on VQA? Yes! We show with controlled experiments that visual SSL can be competitive even on OCR/Chart VQA, as demonstrated by our new Web-SSL model family (1B-7B params) which is trained purely on web images – without any language supervision.
My good friend @arcanelibrary designs old-school D&D games and her latest kickstarter is up! I've had lots of fun playing Shadowdark, highly recommend if you're into RPGs :)
Shadowdark: The Western Reaches is now live on Kickstarter and funded in two minutes! kickstarter.com/projects/shado…
Introducing ⚡️Fast3R: the bitter lesson comes for SfM. By using a big dumb ViT, we can reconstruct pointmaps for 1000 images in a single forward pass @ 250 FPS. How do we do this? Using techniques from LLMs. Website: fast3r-3d.github.io Demo: fast3r.ngrok.app 🧵
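The "one forward pass over all views" idea can be sketched with toy global self-attention: patch tokens from every image are concatenated into one sequence, so each view attends to all others, and a small head regresses a 3-D point per patch. All sizes, weights, and the identity Q/K/V projections below are illustrative, not the real model:

```python
import math

def matmul(A, B):
    """Naive matrix multiply for small lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

def global_attention(tokens):
    """Toy single-head self-attention with identity Q/K/V projections:
    every patch token from every view attends to all tokens of all
    views at once, i.e. the single-forward-pass design."""
    d = len(tokens[0])
    transpose = [list(col) for col in zip(*tokens)]
    scores = matmul(tokens, transpose)                       # Q K^T
    att = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(att, tokens)

def point_head(fused):
    """Stand-in pointmap head: keep the first 3 dims as (x, y, z)."""
    return [row[:3] for row in fused]

# 2 views x 3 patches, 4-dim tokens (made-up numbers).
tokens = [[0.1 * (i + j) for j in range(4)] for i in range(6)]
pointmap = point_head(global_attention(tokens))
```

Because attention is over the full token set, adding more images just lengthens the sequence, which is where the LLM-style efficiency techniques come in.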
Btw, the lead author @jed_yang is graduating this year and will be on the job market. Jed is highly motivated and creative, a great engineer and researcher who gets stuff to work, and has been a pleasure to work with...if you're hiring I suggest reaching out to him!
Excited to share our Fast3R paper, to be presented at CVPR 2025. This recasts 3D reconstruction and camera pose estimation from video as an end-to-end learning problem, leading to ~4x-300x improvements in speed while maintaining performance. Code, model & demo in thread!
⚡️ Excited to announce Fast3R: 3D reconstruction of 1000+ images in a single forward pass! Fast3R achieves 251 FPS at its peak. 🔥 Try the demo with your images or video! 🔗 Website: fast3r-3d.github.io 🎮 Demo: fast3r.ngrok.app #CVPR2025 #3D @AIatMeta
MaestroMotif has been selected for an oral presentation at ICLR! 🙏 See how your AI could solve tasks like: “Do not leave the first dungeon level until you achieve XP level 4, then find a shopkeeper and sell an item that you have collected; finally survive for another 300 steps”
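The instruction quoted above lends itself to a programmatic policy over skills. A heavily simplified, hypothetical version of the kind of program an LLM controller might emit (skill names and state fields are invented for illustration, not the paper's actual interface):

```python
def policy_over_skills(state):
    """Hypothetical LLM-written program sequencing Motif-style skills
    for: 'stay on dungeon level 1 until XP 4, then sell an item to a
    shopkeeper, then survive'."""
    if state["xp"] < 4 and state["dlvl"] == 1:
        return "fight"        # grind on the first level until XP level 4
    if not state["sold_item"]:
        return "go_to_shop"   # then find a shopkeeper and sell an item
    return "survive"          # finally, just stay alive
```

Each returned name would dispatch to a pretrained skill policy; composing skills in code rather than in weights is what makes the agent steerable zero-shot.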
Can AI agents adapt zero-shot to complex multi-step language instructions in open-ended environments? We present MaestroMotif, a method for AI-assisted skill design that produces highly capable and steerable hierarchical agents. To the best of our knowledge, it is the first…
The era of Hierarchical Agents has begun
Martin led this great work, check it out. For a dinosaur like me, let me say that, in more classical RL terms, this is a demonstration of how we can effectively combine options and LLMs through programmatic policies.
Super excited to see MaestroMotif out into the world--the first hierarchical LLM agent that can solve open-ended compositional tasks requiring hundreds of steps 🚀🚀🚀 🤖 What can MaestroMotif do? - solve complex tasks by re-combining skills - adapt zero-shot to new instructions…
Another banger led by dream team @MartinKlissarov and @proceduralia, to be presented at ICLR 2025. MaestroMotif is a hierarchical agent which zero-shot composes Motif skills using an LLM controller, reaching new depths of the NetHack dungeon. Code available!
🚨 DeepSeek crushed existing benchmarks. But how does it fare in embodied agentic tasks? We tested @deepseek_ai R1 Distill Qwen 32B on BALROG, and the results were both inspiring and entertaining. The good news for those still beginning their careers is… lots to do here! 🚀⬇️
Yearly reminder
"Is it AGI" flow chart. Developed with @_rockt at NeurIPS 2022.
🤔 How to extract knowledge from LLMs to train better RL agents? 📚 Our new paper (with @qqyuzu @HenaffMikael @yayitsamyzhang @adityagrover_ ) studies LLM-driven rewards for NetHack! Paper: arxiv.org/abs/2410.23022 Code: github.com/facebookresear…
ONI offers concurrent policy training & reward synthesis, a good fit for long-horizon sparse-reward problems! I also believe it has great potential to be extended to multimodal inputs and complex planning/reasoning environments!
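The concurrency described here can be sketched in one loop: while the policy keeps stepping, a subset of observation captions is periodically sent off for labelling, and the (growing) label set supplies the intrinsic reward. A single-threaded toy, with a made-up annotation rule and reward stub standing in for the asynchronous LLM server:

```python
def llm_annotate(caption):
    """Stand-in for the LLM annotator: flags 'interesting' events.
    (Hypothetical criterion; the real system queries an LLM.)"""
    return 1.0 if "level up" in caption else 0.0

def synthesize_reward(labels, caption):
    """Reward stub: use the label if this caption has been annotated,
    otherwise give no bonus yet."""
    return labels.get(caption, 0.0)

def train_concurrently(stream, n_steps, annotate_every=4):
    """Interleave 'policy' steps with reward annotation, mimicking the
    asynchronous design in a single thread: every few environment steps
    one new caption is labelled while stepping continues."""
    labels = {}
    returns = 0.0
    for t in range(n_steps):
        caption = stream[t % len(stream)]
        if t % annotate_every == 0:            # simplified async annotation
            labels[caption] = llm_annotate(caption)
        returns += synthesize_reward(labels, caption)  # intrinsic reward
    return returns, labels

captions = ["you hit the newt", "level up! welcome to level 2", "you see a door"]
ret, labels = train_concurrently(captions, n_steps=12)
```

The point is that training never blocks on annotation: unlabelled observations simply earn no bonus until their label arrives.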