Deepak Pathak
@pathak2206
Co-Founder & CEO at @SkildAI, Faculty at @CarnegieMellon. PhD @UCBerkeley. I study topics in AI (machine learning, robotics & computer vision).
Even after 4yrs of locomotion research, we keep getting surprised by how far we can push the limits of legged robots! We report a major update 🚀🤖 Extreme Parkour: extremely long & high jumps, ramp, handstand, etc. all with a single neural net! extreme-parkour.github.io 🧵(1/n)
Very nice results here, we've seen similar ability of diffusion models to benefit from repeated data, more so than AR models. The diffusion loss is much noisier so might act as a natural regularizer to prevent overfitting.
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data, not compute, is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
Thrilled to finally release this study! 🚀 We view (discrete) diffusion models as implicitly doing data augmentation over autoregressive. Through this lens, we find that diffusion outperforms AR in data-constrained settings, but it requires larger models and way more epochs to…
Everyone get your top 1% quality dataset and train 100 epochs right now
Very cool release by Misha, Ioannis and team at Reflection! Can’t wait to try it.
Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.
Want to add diverse, high-quality data to your robot policy? Happy to share that the DexWild Dataset is now fully public, hosted by @huggingface 🤗 Find it here! huggingface.co/datasets/board…
Training robots for the open world needs diverse data. But collecting robot demos in the wild is hard! Presenting DexWild 🙌🏕️ Human data collection system that works in diverse environments, without robots 💪🦾 Human + Robot Cotraining pipeline that unlocks generalization 🧵👇
Got to visit the Robotics Institute at CMU today. The institute has a long legacy of pioneering research and pushing the frontiers of robotics. Thanks @kenny__shaw @JasonJZLiu @adamhkan4 for showing your latest projects. Here’s a live autonomous demo trained with DexWild data
A great example of scientific discourse at its best—thoughtful, constructive, and conclusive. We now have more rigorous evidence that confidence maximization improves reasoning. 👇
1/ Maximizing confidence indeed improves reasoning. We worked with @ShashwatGoel7, @nikhilchandak29 @AmyPrb for the past 3 weeks (over a zoom call and many emails!) and revised our evaluations to align with their suggested prompts/parsers/sampling params. This includes changing…
Glad we could together improve the scientific discourse around reasoning. Was great to see the authors reach out and incorporate all our feedback!
Congratulations to the team... great start at RSS!! We have open-sourced DexWild -- makes it easy to build and scale robot learning with hands: x.com/_tonytao_/stat…
Thrilled to have received Best Paper Award at the EgoAct Workshop at RSS 2025! 🏆 We’ll also be giving a talk at the Imitation Learning Session I tomorrow, 5:30–6:30pm. Come to learn about DexWild! Work co-led by @mohansrirama, with @JasonJZLiu, @kenny__shaw, and @pathak2206.
Presenting FACTR today at #RSS2025 in the Imitation Learning I session at 5:30pm (June 22). Come by if you're interested in force-feedback teleop and policy learning!
Low-cost teleop systems have democratized robot data collection, but they lack any force feedback, making it challenging to teleoperate contact-rich tasks. Many robot arms provide force information — a critical yet underutilized modality in robot learning. We introduce: 1. 🦾A…
Tired of tuning PPO or blaming it on reward, task design, etc.? Introducing EPO -- our second (and hopefully final :) attempt at fixing PPO at scale! Contrary to intuition, as the batch size or data increases, PPO saturates due to a lack of diversity in sampling. We proposed a…
(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.
I’m thrilled to announce the launch of my $40M pre-seed and seed-stage fund, @SevenStars_VC, where I’ll be focused on partnering with visionary founders building enduring AI application companies across consumer and enterprise technology. Seven Stars is deeply personal. It’s…
Congratulations, Dr. Murtaza! 🥳
Incredibly excited to share that I am now officially Dr. Murtaza Dalal! Last weekend marked the official end of an incredible journey across the last 5 years, including doing the first year of my PhD remote, moving to the other side of the country, becoming an independent…
someone should probably retry all those late 2010s deep RL ideas to see if they work on LLMs
Maybe real-world robot generalization doesn’t need massive teleop datasets? 🤔 In DexWild, we show that human demos 🙌 + a little robot data 🤖 = policies that generalize across scenes 🏞️, tasks 🛠️, and embodiments 🦾!
Exciting to see (at 5:55) Nvidia adopting LEAP Hand in their sim2real efforts! Build your own at leaphand.com! Lots more coming this summer, stay tuned :) @pathak2206 @anag004
The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immaculate living room and a candlelight dinner. And you couldn't tell whether a human or a machine had been there. Deceptively simple, insanely hard. It is the…