Tony Tao
@_tonytao_
Masters @CMU_Robotics
Training robots for the open world needs diverse data. But collecting robot demos in the wild is hard! Presenting DexWild 🙌🏕️ A human data collection system that works in diverse environments, without robots 💪🦾 A human + robot cotraining pipeline that unlocks generalization 🧵👇
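A minimal sketch of what such a cotraining pipeline can look like (the mixing ratio, dataset variables, and `policy.update` call are my assumptions, not DexWild's actual code): each training batch blends human-collected and robot demonstrations, so the policy absorbs in-the-wild diversity while staying grounded in the robot's embodiment.

```python
# Illustrative sketch of human + robot cotraining (not the authors' code):
# sample each batch from both data sources at a fixed mixing ratio.
import random

def cotrain_batch(human_demos, robot_demos, batch_size=64, human_frac=0.5):
    """Sample a mixed batch; `human_frac` is a hypothetical mixing ratio."""
    n_human = int(batch_size * human_frac)
    batch = random.sample(human_demos, n_human)
    batch += random.sample(robot_demos, batch_size - n_human)
    random.shuffle(batch)
    return batch

# Hypothetical training loop:
# for step in range(num_steps):
#     batch = cotrain_batch(human_demos, robot_demos)
#     loss = policy.update(batch)
```

In a setup like this, the human fraction is the key tunable hyperparameter balancing diversity against embodiment match.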
Scaling dexterous robot learning is going to require a lot of data. DexWild is a way of collecting useful real-world data in diverse settings for training diverse robot skills. Cool work by @_tonytao_ and @mohansrirama
Full episode dropping soon! Geeking out with @_tonytao_ @mohansrirama on DexWild - Dexterous Human Interactions for In-the-Wild Robot Policies dexwild.github.io Co-hosted by @chris_j_paxton @micoolcho
Had a lot of fun chatting with @chris_j_paxton and @micoolcho! In the episode, we unpack the hidden stories, interesting detours, and lessons learned behind our paper DexWild 🦾
Research arc: ⏪ 2 yrs ago, we introduced VRB: learning from hours of human videos to cut down teleop (Gibson🙏) ▶️ Today, we explore a wilder path: robots deployed with no teleop, no human demos, no affordances. Just raw video generation magic 🙏 Day 1 of faculty life done! 😉…
🚀 Introducing RIGVid: Robots Imitating Generated Videos! Robots can now perform complex tasks—pouring, wiping, mixing—just by imitating generated videos, purely zero-shot! No teleop. No OpenX/DROID/Ego4D. No videos of human demonstrations. Only AI-generated video demos 🧵👇
🐕 I'm happy to share that my paper RAMBO: RL-augmented Model-based Whole-body Control for Loco-manipulation has been accepted to IEEE Robotics and Automation Letters (RA-L) 🧶 Project website: jin-cheng.me/rambo.github.i… Paper: arxiv.org/abs/2504.06662
Ep#22 with @_tonytao_ @mohansrirama on DexWild - Dexterous Human Interactions for In-the-Wild Robot Policies dexwild.github.io Co-hosted by @chris_j_paxton @micoolcho
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
Tactile interaction in the wild can unlock fine-grained manipulation! 🌿🤖✋ We built a portable handheld tactile gripper that enables large-scale visuo-tactile data collection in real-world settings. By pretraining on this data, we bridge vision and touch—allowing robots to:…
At a robotics lab in Pittsburgh, engineers are building adaptable, AI-powered robots that could one day work where it's too dangerous for humans. The research drew a visit from President Trump, who touted U.S. dominance in AI as companies announced $90 billion in new investments.
Compression is the heart of intelligence. From Occam to Kolmogorov—shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T, and a target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖 🧵
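Reading the tweet literally, KARL's outer objective is a search for the smallest token count meeting a reconstruction target. Here is a hedged sketch of that idea (the `encode`/`decode` interfaces and the MSE criterion are my assumptions, not the paper's API), using binary search under the assumption that reconstruction error is non-increasing in the number of tokens:

```python
# Hypothetical sketch of the KARL objective: find the smallest t <= T
# whose reconstruction error is within eps. Interfaces are illustrative.
import numpy as np

def karl_token_budget(encode, decode, image, T, eps):
    """Return the smallest token count t <= T that reconstructs
    `image` within error `eps`, or T if no smaller t suffices."""
    tokens = encode(image, max_tokens=T)   # full-budget encoding
    lo, hi, best = 1, T, T
    while lo <= hi:                        # binary search over t, assuming
        mid = (lo + hi) // 2               # error shrinks as t grows
        recon = decode(tokens[:mid])       # decode from a token prefix
        err = np.mean((recon - image) ** 2)
        if err <= eps:
            best, hi = mid, mid - 1        # feasible: try an even smaller t
        else:
            lo = mid + 1                   # infeasible: need more tokens
    return best
```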
Introducing Muscle v0 -- infinite degrees of freedom, from @DaxoRobotics. A different mountain to climb - with a far more beautiful peak. We built this from the ground up: - Ultra-dexterous - Built for machine learning - Durable and robust More below (1/n)
We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵
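A rough sketch of the warm-start recipe as I read it (the environment and policy interfaces are hypothetical, not WSRL's implementation): a short warmup phase fills the replay buffer with on-robot experience before any gradient updates, after which standard online RL updates take over.

```python
# Hedged sketch of warm-start RL fine-tuning; all interfaces hypothetical.
def wsrl_finetune(policy, env, buffer, warmup_steps, online_steps):
    """Collect `warmup_steps` transitions with the pretrained policy,
    then run `online_steps` of online RL with per-step updates."""
    obs = env.reset()
    for step in range(warmup_steps + online_steps):
        action = policy.act(obs)
        next_obs, reward, done = env.step(action)
        buffer.add(obs, action, reward, next_obs, done)
        if step >= warmup_steps:            # no updates during warmup
            policy.update(buffer.sample())  # standard online RL update
        obs = env.reset() if done else next_obs
    return policy
```

The tweet's numbers (11 minutes of warmup, 7 minutes of online interaction) correspond to the two phases of this loop.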
Whenever I see something impressive in robotics, I always ask for a live demo. If they can’t demo it, it’s hard to believe it really works. I hope more companies do demos in the future to show the world their policies work beyond a cool video.
We have started taking DYNA-1, our dexterous, robust VLA model, to conferences and showcasing it for hours on end! The model ran for 3 days, 8 hours each day, at #HITEC2025 3 weeks ago with a 99.9% overall success rate (dropped 1 towel on day 2). No intervention, it just works :)
K-Bot is the world’s first open-source humanoid robot that is affordable, available and made in America. Robots should serve people and empower anyone to build the future, not just big corporations. ➡️Order now kscale.dev
Got to visit the Robotics Institute at CMU today. The institute has a long legacy of pioneering research and pushing the frontiers of robotics. Thanks @kenny__shaw @JasonJZLiu @adamhkan4 for showing your latest projects. Here’s a live autonomous demo of a policy trained on DexWild data
Having reliable signals to compare different robot policies is sooo important. The more informative the metrics, the better we can compare, improve, and build smarter robot policies. Love to see work in this direction.
🚨Tired of binary pass/fail metrics that miss the bigger picture? 🤖Introducing #RoboEval — an open benchmark that shows *how* robot manipulation policies behave and *why* they fail, not just *if* they succeed. 🧵1/n 🔗 robo-eval.github.io 📄 robo-eval.github.io/media/RoboEval…
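To make the contrast with binary pass/fail concrete, here is a toy metric in the spirit of what such a benchmark might report (the stage names and function are illustrative, not RoboEval's actual API): score each rollout by how far through an ordered list of task stages it gets, so failures are localized to a stage rather than collapsed into a single bit.

```python
# Illustrative staged-progress metric (my assumption, not RoboEval's API).
def stage_progress(rollout_events, stages):
    """Return the fraction of ordered task stages reached in a rollout."""
    reached = 0
    for stage in stages:
        if stage in rollout_events:
            reached += 1
        else:
            break               # stages are ordered; stop at first miss
    return reached / len(stages)

# Example: a rollout that grasps and lifts but never places scores 2/3,
# where a pass/fail metric would report a flat 0.
# stage_progress(["grasp", "lift"], ["grasp", "lift", "place"])
```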
We’re excited to finally share what we’ve been assembling over the past few months! This is a field full of excitement and hope—but also one that’s constantly surrounded by noise and restlessness. In the waves of technological change, trends come and go, bubbles rise and burst,…
Today, we’re launching Genesis AI — a global physical AI lab and full-stack robotics company — to build generalist robots and unlock unlimited physical labor. We’re backed by $105M in seed funding from @EclipseVentures, @khoslaventures, @Bpifrance, HSG, and visionaries…
1/ Maximizing confidence indeed improves reasoning. We worked with @ShashwatGoel7, @nikhilchandak29, @AmyPrb for the past 3 weeks (over Zoom calls and many emails!) and revised our evaluations to align with their suggested prompts/parsers/sampling params. This includes changing…
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too. Until we looked at the reported numbers of the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in a blog below 🧵👇
What would a World Model look like if we start from a real embodied agent acting in the real world? It has to have: 1) A real, physically grounded and complex action space—not just abstract control signals. 2) Diverse, real-life scenarios and activities. Or in short: It has to…