Pieter Abbeel
@pabbeel
Berkeley & Amazon
FastTD3: "Minimum innovation, maximum results." Not the paper we had planned to write, but one of the works I am most proud of. We wanted our baseline (TD3) to be a very solid one, so we added a few things that are already known to help in RL (large,…
Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with open-source code to run your own humanoid RL experiments in no time! Thread below 🧵
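The two FastTD3 posts above describe the recipe as a solid TD3 baseline plus additions already known to help in RL. As a reference point only, here is a minimal sketch of the underlying TD3 update (clipped double Q-learning, target-policy smoothing, delayed actor and target updates) in PyTorch; the network sizes, hyperparameters, and names are illustrative assumptions, not the released FastTD3 code.

```python
# Minimal TD3-style update sketch (clipped double Q + target-policy smoothing +
# delayed actor/target updates). Everything here is illustrative, not FastTD3 itself.
import copy
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

obs_dim, act_dim = 48, 12  # hypothetical humanoid-control sizes
actor = mlp(obs_dim, act_dim)
critic1, critic2 = mlp(obs_dim + act_dim, 1), mlp(obs_dim + act_dim, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(list(critic1.parameters()) + list(critic2.parameters()), lr=3e-4)

def td3_update(batch, step, gamma=0.99, policy_noise=0.2, noise_clip=0.5,
               policy_delay=2, tau=0.005):
    obs, act, rew, next_obs, done = batch
    with torch.no_grad():
        # target-policy smoothing: add clipped noise to the target action
        noise = (torch.randn_like(act) * policy_noise).clamp(-noise_clip, noise_clip)
        next_act = (torch.tanh(actor_t(next_obs)) + noise).clamp(-1.0, 1.0)
        # clipped double Q: take the minimum of the two target critics
        q_next = torch.min(critic1_t(torch.cat([next_obs, next_act], -1)),
                           critic2_t(torch.cat([next_obs, next_act], -1)))
        target = rew + gamma * (1.0 - done) * q_next
    q1 = critic1(torch.cat([obs, act], -1))
    q2 = critic2(torch.cat([obs, act], -1))
    critic_loss = ((q1 - target) ** 2 + (q2 - target) ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    if step % policy_delay == 0:  # delayed actor and target-network updates
        pi = torch.tanh(actor(obs))
        actor_loss = -critic1(torch.cat([obs, pi], -1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        for net, net_t in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.data.lerp_(p.data, tau)  # soft (Polyak) target update

# tiny smoke test with random data standing in for replay-buffer samples
B = 256
dummy = (torch.randn(B, obs_dim), torch.rand(B, act_dim) * 2 - 1,
         torch.randn(B, 1), torch.randn(B, obs_dim), torch.zeros(B, 1))
td3_update(dummy, step=0)
```

In the setting the thread describes, `batch` would presumably be sampled from a large replay buffer filled by many parallel simulated environments, with batch sizes well beyond the classic TD3 defaults.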
if you want to try training a robot to dance or pick up stuff, check out these Colab notebooks released by Google & @kevin_zakka this week. Train Unitree dog to spin and handstand: colab.research.google.com/github/google-… Train Franka robot arm to pick up stuff: colab.research.google.com/github/google-…
Our robotics team will be at ICRA next week in Atlanta! Having started a new research team at Amazon building robot foundation models, we're hiring across all levels, full-time or intern, and across both SW and Research roles. Ping me at [email protected] and let's have a chat!
Congratulations to my brilliant co-founder Philipp Moritz (@pcmoritz) and the legendary John Schulman, Sergey Levine, Pieter Abbeel, and Michael Jordan on their Test-of-Time Honorable Mention at ICML 2025 today! For creating TRPO. This was done during the previous wave of…
Today we're unveiling something truly extraordinary—one of the coolest and most transformative technologies: Mixed Reality Testing (MRT). Since launching @Waabi_ai World four years ago, we've been building towards this pivotal moment - bringing our virtual testing capabilities…
Congratulations Misha! Something many engineers in enterprises have wanted for a long time. A great example of a unified research-modeling-product-business strategy :)
Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.
🤖 What if a humanoid robot could make a hamburger from raw ingredients—all the way to your plate? 🔥 Excited to announce ViTacFormer: our new pipeline for next-level dexterous manipulation with active vision + high-resolution touch. 🎯 For the first time ever, we demonstrate…
Congratulations to BAIR researchers @kevin_zakka @qiayuanliao @arthurallshire @carlo_sferrazza @KoushilSreenath @pabbeel and Google collaborators for winning the Outstanding Demo Paper Award at RSS 2025! playground.mujoco.org
We’re super thrilled to have received the Outstanding Demo Paper Award for MuJoCo Playground at RSS 2025! Huge thanks to everyone who came by our booth and participated, asked questions, and made the demo so much fun! @carlo_sferrazza @qiayuanliao @arthurallshire
Demo starting in 10 minutes, come witness the magic of open-source sim2real!
Sim2real is getting so mature that with great hardware (thanks @clemens_chr @katzschmann), you can get things running in just days 🤖 We are also demoing @orcahand trained with MuJoCo Playground at #RSS2025. Come by Associates Park at 12:30pm-2pm and 6:30pm-8pm! @RoboticsSciSys
Just open-sourced Geometric Retargeting (GeoRT) — the kinematic retargeting module behind DexterityGen. Includes tools for importing custom hands. Give it a try: github.com/facebookresear… Software by @berkeley_ai and @AIatMeta. More coming soon.
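For readers new to the term: kinematic retargeting maps tracked human hand poses onto a robot hand's joint angles, typically by solving a small optimization that matches fingertip positions. The sketch below is a generic example of that idea with a toy forward-kinematics stand-in; it is not GeoRT's interface or method, and every name and dimension here is hypothetical.

```python
# Generic kinematic-retargeting sketch: find robot-hand joint angles whose fingertip
# positions best match human fingertip keypoints (with a small joint regularizer).
import numpy as np
from scipy.optimize import least_squares

NUM_FINGERS, NUM_JOINTS = 4, 16  # hypothetical robot hand

def fingertip_fk(joint_angles: np.ndarray) -> np.ndarray:
    """Stand-in forward kinematics: joint angles -> (NUM_FINGERS, 3) fingertip positions.
    A real implementation would evaluate the hand's kinematic chain (e.g. from a URDF)."""
    q = joint_angles.reshape(NUM_FINGERS, NUM_JOINTS // NUM_FINGERS)
    x = np.cos(q).sum(axis=1, keepdims=True)   # toy planar chain per finger,
    y = np.sin(q).sum(axis=1, keepdims=True)   # just so the sketch runs end to end
    z = np.zeros_like(x)
    return np.concatenate([x, y, z], axis=1)

def retarget(human_fingertips: np.ndarray, q_init: np.ndarray) -> np.ndarray:
    """Joint angles minimizing fingertip error to the human keypoints."""
    def residual(q):
        tip_err = (fingertip_fk(q) - human_fingertips).ravel()
        reg = 0.01 * (q - q_init)               # keep the solution near the initial pose
        return np.concatenate([tip_err, reg])
    sol = least_squares(residual, q_init, bounds=(-np.pi, np.pi))
    return sol.x

human_tips = np.random.uniform(-1, 1, size=(NUM_FINGERS, 3))  # e.g. from a hand tracker
q = retarget(human_tips, q_init=np.zeros(NUM_JOINTS))
print(q.round(3))
```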
Very excited for this one. We took a cautiously experimental view on NN optimizers, aiming to find something that just works. SPlus matches Adam within ~44% of steps on a range of objectives. Please try it out in your setting, or read below for how it works.…
Everyday human data is robotics’ answer to internet-scale tokens. But how can robots learn to feel—just from videos?📹 Introducing FeelTheForce (FTF): force-sensitive manipulation policies learned from natural human interactions🖐️🤖 👉 feel-the-force-ftf.github.io 1/n
🚀Check out our new work, FastTD3, a reinforcement learning algorithm that is simple, efficient, and highly capable. It achieves truly remarkable performance across challenging RL tasks.
Off-policy learning transfers from sim to real-world humanoids! Off-policy methods have pushed RL sample efficiency, but robotics still leans on parallel on-policy RL (PPO) for wall-time gains. Introducing FastTD3, a simple off-policy algorithm that gets the best of both worlds!
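To make the "best of both worlds" idea concrete, here is a minimal sketch of the data-collection side: many vectorized simulated environments stepping in lockstep, with every transition written into a replay buffer that an off-policy learner such as TD3 can sample from repeatedly. It assumes a recent Gymnasium vector-env API; the environment id, buffer layout, and sizes are assumptions for illustration, not FastTD3's actual implementation.

```python
# Parallel simulation feeding an off-policy replay buffer (illustrative sketch).
import numpy as np
import gymnasium as gym

num_envs, capacity = 16, 1_000_000           # scale num_envs up for real wall-time gains
envs = gym.make_vec("Humanoid-v4", num_envs=num_envs)  # any vectorized task works
obs_dim = envs.single_observation_space.shape[0]
act_dim = envs.single_action_space.shape[0]

buf = {
    "obs": np.zeros((capacity, obs_dim), np.float32),
    "act": np.zeros((capacity, act_dim), np.float32),
    "rew": np.zeros((capacity, 1), np.float32),
    "next_obs": np.zeros((capacity, obs_dim), np.float32),
    "done": np.zeros((capacity, 1), np.float32),
}
ptr = 0

obs, _ = envs.reset()
for step in range(100):                       # short rollout just for illustration
    act = envs.action_space.sample()          # stand-in for the off-policy actor
    next_obs, rew, term, trunc, _ = envs.step(act)
    idx = (ptr + np.arange(num_envs)) % capacity   # write a whole batch of envs at once
    buf["obs"][idx], buf["act"][idx] = obs, act
    buf["rew"][idx, 0], buf["next_obs"][idx] = rew, next_obs
    buf["done"][idx, 0] = np.logical_or(term, trunc)
    ptr += num_envs
    obs = next_obs
    # an off-policy learner would now sample large random minibatches from `buf`
    # and run one or more TD3-style updates per simulation step
```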
Over the past year, I've been compiling some "alchemist's notes" on deep learning. Right now it covers basic optimization, architectures, and generative models. Focus is on learnability -- each page has nice graphics and an end-to-end implementation. notes.kvfrans.com
The future of robotics isn't in the lab – it's in your hands. Can we teach robots to act in the real world without a single robot demonstration? Introducing EgoZero. Train real-world robot policies from human-first egocentric data. No robots. No teleop. Just Aria glasses and…
Closed-loop robot policies directly from human interactions. No teleop, no robot data co-training, no RL, and no sim. Just Aria smart glasses. Everyday human data is passively scalable and a massively underutilized resource in robotics...More to come here in the coming weeks.