Jesse Zhang
@Jesse_Y_Zhang
Robot Learning Postdoc @ UW w/ A. Gupta, D. Fox. Prev: PhD @ USC w/ Erdem Biyik, Jesse Thomason, Joseph Lim, Ugrad @ Berkeley, Intern @ NVIDIA, AWS.
How can non-experts quickly teach robots a variety of tasks? Introducing HAND ✋, a simple, time-efficient method of training robots! Using just a **single hand demo**, HAND learns manipulation tasks in under **4 minutes**! 🧵
Teaching robots to learn only from RGB human videos is hard! In Feel The Force (FTF), we teach robots to mimic the tactile feedback humans experience when handling objects. This allows for delicate, touch-sensitive tasks—like picking up a raw egg without breaking it. 🧵👇
Are current eval/deployment practices enough for today’s robot policies? Announcing the Eval&Deploy workshop at CoRL 2025 @corl_conf, where we'll explore eval + deployment in the robot learning lifecycle and how to improve it! eval-deploy.github.io 🗓️ Submissions due Aug 30
In other news, ReWiND won best paper at the OOD workshop at RSS yesterday! If you haven’t already, check it out: 🕸️📑: rewind-reward.github.io Or if you’re lazy like me, read @Jesse_Y_Zhang’s tweet 😉
Reward models that help real robots learn new tasks—no new demos needed! ReWiND uses language-guided rewards to train bimanual arms on OOD tasks in 1 hour! Offline-to-online, lang-conditioned, visual RL on action-chunked transformers. 🧵
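To make the language-guided reward idea concrete, here is a minimal, self-contained sketch of reward relabeling: a reward model scores each observation against a task instruction, so a new task only needs a new instruction string. The `embed` and `language_reward` functions below are toy placeholders (random-projection embeddings plus cosine similarity), not ReWiND's actual model or API.

```python
# Illustrative sketch only -- a toy stand-in for a learned video/language
# reward model, not ReWiND's implementation.
import numpy as np

def embed(x):
    """Map any input to a pseudo-random unit vector (stand-in for a real encoder)."""
    rng = np.random.default_rng(abs(hash(str(x))) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

def language_reward(observation, instruction):
    """Score how well an observation matches the instruction (cosine similarity)."""
    return float(embed(observation) @ embed(instruction))

# Relabel logged transitions with language-conditioned rewards, so new tasks
# need only a new instruction string -- no new demonstrations.
transitions = [{"obs": f"frame_{t}"} for t in range(5)]
for tr in transitions:
    tr["reward"] = language_reward(tr["obs"], "fold the towel")
print([round(tr["reward"], 3) for tr in transitions])
```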
Learned visuomotor policies are notoriously fragile: they break with changes in conditions like lighting, clutter, or object variations, among other things. In @yunchuzh's latest work, we asked whether we could get these policies to be robust and generalizable with a clever…
🚨 RSS Demo Paper Alert! Amazon needs to manipulate millions of items daily—demanding robust policies that (1) handle diverse objects in cluttered warehouses, (2) adapt across robotic embodiments, and (3) deliver high performance across thousands of sites with varying layouts.
Meet Casper👻, a friendly robot sidekick who shadows your day, decodes your intents on the fly, and lends a hand while you stay in control! Instead of passively receiving commands, what if a robot actively sensed what you need in the background and stepped in when confident? (1/n)
2 Days, 10 teams, and an army of robots. You guys drove hours, battled LA heat, and squashed countless bugs. The result… we showed the world that SoCal has what it takes to be a hub for embodied AI 😎🦾 #LeRobot #globalhackathon @yonduai
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
[Blog Post Announcement] The internet is full of “interesting” data: cat videos, think pieces, and highlight reels—but robots often need to learn from mundane data to help us with everyday unexciting tasks. While people aren’t incentivized to share this boring data, we constantly…
How can we move beyond static-arm lab setups and learn robot policies in our messy homes? We introduce HoMeR, an imitation learning agent for in-the-wild mobile manipulation. 🧵1/8
This is really cool! Is real-world RL starting to come back? 👀
Real-world RL, where robots learn directly from physical interactions, is extremely challenging — especially for high-DoF systems like mobile manipulators. 1⃣ Long-horizon tasks and large action spaces lead to difficult policy optimization. 2⃣ Real-world exploration with…
VLAs have the potential to generalize over scenes and tasks, but require a ton of data to learn robust policies. We introduce OG-VLA, a novel architecture and learning framework that combines the generalization strengths of VLAs with the robustness of 3D-aware policies. 🧵
How can we learn dexterous manipulation for any robot hand from a single human demonstration? Check out DexMachina, our new RL algorithm that learns long-horizon, bimanual dexterous policies for a variety of dexterous hands, articulated objects, and complex motions.
Some play data + 1 human demonstration (without the robot!!) achieves the task. Thread and paper below 👇🏻
With just a single 2D hand path, we can retrieve relevant trajectories from a task-agnostic play dataset to help robots learn manipulation tasks! Excited to share my first first-author project!
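As an illustration of the retrieval step, here is a minimal sketch (not HAND's released code): resample 2D paths to a fixed length, score each play trajectory by its distance to the hand path, and keep the closest ones. All function names and the toy data below are assumptions for the example.

```python
# Illustrative sketch of path-based retrieval: given one 2D hand path, rank
# play-data trajectories by how closely their paths match it, keep the top-k.
import numpy as np

def resample(path, n=32):
    """Resample a (T, 2) path to n evenly spaced points along its arc length."""
    path = np.asarray(path, dtype=float)
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(path, axis=0), axis=1))]
    t = np.linspace(0.0, d[-1], n)
    return np.stack([np.interp(t, d, path[:, i]) for i in range(2)], axis=1)

def path_distance(a, b):
    """Mean pointwise distance between two resampled 2D paths."""
    return np.linalg.norm(resample(a) - resample(b), axis=1).mean()

def retrieve(hand_path, play_paths, k=5):
    """Return indices of the k play trajectories closest to the hand path."""
    dists = [path_distance(hand_path, p) for p in play_paths]
    return np.argsort(dists)[:k]

# Toy usage with random data standing in for tracked 2D hand / end-effector paths.
rng = np.random.default_rng(0)
hand_demo = rng.random((50, 2))
play_dataset = [rng.random((rng.integers(30, 80), 2)) for _ in range(100)]
print("retrieved play trajectories:", retrieve(hand_demo, play_dataset))
```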
Check out our latest work on robot learning from human hand demonstrations! HAND ✋ can learn new manipulation policies in under 4 minutes from a single hand demo! Huge shoutout to my collaborators and @matthewh6_ who will be applying to PhD programs this year!
**When AIs Start Rewriting Themselves**
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
The Darwin Gödel Machine can:
1. Read and modify its own code
2. Evaluate if the change improves performance
3. Open-endedly explore the solution space 🧵👇
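That three-step loop can be summarized as a toy, self-contained sketch (an analogy, not the Darwin Gödel Machine codebase): keep an archive of agent variants, mutate one, evaluate it on a benchmark, and archive anything viable so the search stays open-ended. Here the "agent" is just a parameter vector and the benchmark is a stand-in score.

```python
# Toy analogy of the loop above -- not the DGM's actual agents or benchmarks.
import random

def evaluate(agent):
    """Stand-in benchmark: score an 'agent' (here just a parameter vector)."""
    return -sum((x - 3.0) ** 2 for x in agent)

archive = [[0.0, 0.0]]          # start from a single seed agent
scores = [evaluate(archive[0])]

for generation in range(200):
    parent = random.choice(archive)                      # 3. explore: sample any archived agent
    child = [x + random.gauss(0, 0.5) for x in parent]   # 1. modify the agent (toy mutation)
    score = evaluate(child)                              # 2. check whether the change helps
    if score > min(scores):                              # keep variants that beat the weakest archived one
        archive.append(child)
        scores.append(score)

best = archive[max(range(len(scores)), key=scores.__getitem__)]
print("best agent:", best, "score:", round(max(scores), 4))
```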
Excited to introduce DexUMI, our new paradigm for intuitive, accurate, and generalizable data collection for dexterous hands. We make your own hand feel like the robot hand, both kinematically and visually, which is critical for transferring complex skills to robots. Details below!
Can we collect robot dexterous hand data directly with a human hand? Introducing DexUMI: a dexterous hand data collection system with 0 teleoperation and 0 re-targeting → autonomously completes precise, long-horizon, and contact-rich tasks Project Page: dex-umi.github.io