Chen Wang
@chenwang_j
Final-year CS PhD @Stanford. Prev @GoogleDeepMind @NVIDIA @MIT_CSAIL. Robotics/Manipulation
Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands. Everything is open-sourced!
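For readers curious what one frame of such glove-based mocap data might look like, here is a minimal Python sketch of a plausible capture record and how it could pair 3D observations with hand actions for imitation learning. The schema and field names are purely illustrative assumptions, not DexCap's actual data format.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class CaptureFrame:
    """One frame of hand-motion-capture data (hypothetical schema)."""
    timestamp: float        # seconds since recording start
    points: np.ndarray      # (N, 3) scene point cloud from a chest-mounted camera
    colors: np.ndarray      # (N, 3) per-point RGB
    wrist_pose: np.ndarray  # (4, 4) SE(3) wrist pose in the world frame
    finger_joints: np.ndarray  # (J,) glove-reported finger joint angles, radians

def to_training_sample(frame: CaptureFrame) -> dict:
    # Pair the 3D observation with the hand action for imitation learning.
    return {
        "obs": {"point_cloud": frame.points, "rgb": frame.colors},
        "action": np.concatenate(
            [frame.wrist_pose[:3, 3],  # wrist position
             frame.finger_joints]      # finger articulation
        ),
    }
```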
Excited to share a new humanoid robot platform we’ve been working on. Berkeley Humanoid is a reliable and low-cost mid-scale research platform for learning-based control. We demonstrate the robot walking on various terrains and dynamic hopping with a simple RL controller.
Our exciting new work on hand-arm grasp-anything is out now. All trained in sim with RL. Direct pixels-to-action grasping on a 23-DoF system, with hand and arm jointly coordinating to pick up objects. Fast, reactive, and buttery-smooth motion. Some of the coolest results (!)…
🤖 Ever wondered what robots need to truly help humans around the house? 🏡 Introducing 𝗕𝗘𝗛𝗔𝗩𝗜𝗢𝗥 𝗥𝗼𝗯𝗼𝘁 𝗦𝘂𝗶𝘁𝗲 (𝗕𝗥𝗦)—a comprehensive framework for mastering mobile whole-body manipulation across diverse household tasks! 🧹🫧 From taking out the trash to…
Sim2Real RL for Vision-Based Dexterous Manipulation on Humanoids toruowo.github.io/recipe/ TLDR - we train a humanoid robot with two multifingered hands to perform a range of dexterous manipulation tasks, achieving robust generalization and high performance without human demonstrations :D
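As a rough illustration of one standard sim-to-real ingredient, here is a generic domain-randomization sketch in Python. The environment hooks (a gym-style `step` and a hypothetical `set_physics`), the parameter names, and the ranges are all assumptions for illustration, not the paper's actual recipe.

```python
import numpy as np

def sample_physics_params(rng: np.random.Generator) -> dict:
    # Resample physics parameters so the policy never overfits one simulator.
    return {
        "object_mass_scale": rng.uniform(0.5, 2.0),    # randomize object mass
        "friction": rng.uniform(0.3, 1.2),             # contact friction coefficient
        "joint_damping_scale": rng.uniform(0.8, 1.25), # actuator dynamics
    }

def randomized_episode(env, policy, rng: np.random.Generator):
    # New dynamics every episode; noisy proprioception every step.
    env.set_physics(sample_physics_params(rng))  # hypothetical simulator hook
    obs = env.reset()
    done = False
    while not done:
        noisy_obs = obs + rng.normal(0.0, 0.01, size=obs.shape)
        obs, reward, done, info = env.step(policy(noisy_obs))
```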
A good hand can drive the development of intelligence. Introducing the Eyesight Hand, equipped with full-hand high-resolution tactile sensors and proprioceptive actuators. It is compliant, agile, and powerful. Good tactile sensing makes learning more efficient and robust. Shout out to Branden!
Introducing EgoMimic - just wear a pair of Project Aria @meta_aria smart glasses 👓 to scale up your imitation learning datasets! Check out what our robot can do. A thread below👇
Excited to introduce ARCap! We found that visual feedback is crucial for high-quality data collection, and AR can greatly help! We invited 20 novice users to each gather a small amount of data using only AR—no robot hardware required. The combined data can successfully train a…
How can we collect high-quality robot data without teleoperation? AR can help! Introducing ARCap, a fully open-sourced AR solution for collecting cross-embodiment robot data (gripper and dex hand) directly using human hands. 🌐:stanford-tml.github.io/ARCap/ 📜:arxiv.org/abs/2410.08464
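To make the cross-embodiment idea concrete, here is a hedged Python sketch of retargeting one human hand recording to either a parallel gripper or a dexterous hand. The mappings and constants are illustrative assumptions, not ARCap's implementation.

```python
import numpy as np

GRIPPER_MAX_WIDTH = 0.085  # meters; typical parallel-jaw opening (assumed)

def retarget_to_gripper(thumb_tip: np.ndarray, index_tip: np.ndarray) -> float:
    """Map thumb-index pinch distance to a gripper opening command."""
    pinch = np.linalg.norm(thumb_tip - index_tip)
    return float(np.clip(pinch, 0.0, GRIPPER_MAX_WIDTH))

def retarget_to_dex_hand(human_joint_angles: np.ndarray,
                         scale: np.ndarray) -> np.ndarray:
    """Per-joint linear retargeting to a robot hand with similar kinematics."""
    return np.clip(human_joint_angles * scale, -np.pi, np.pi)
```

The same recorded hand trajectory thus yields two action streams, one per embodiment, which is the sense in which a single AR capture session can feed both gripper and dex-hand policies.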
Why hand-engineer digital twins when digital cousins are free? Check out ACDC: Automated Creation of Digital Cousins 👭 for Robust Policy Learning, accepted at @corl2024! 🎉 📸 Single image -> 🏡 Interactive scene ⏩ Fully automatic (no annotations needed!) 🦾 Robot policies…
Synchronize Dual Hands for Physics-Based Dexterous Guitar Playing discuss: huggingface.co/papers/2409.16… We present a novel approach to synthesize dexterous motions for physically simulated hands in tasks that require coordination between the control of two hands with high temporal…
We found that the relations between keypoints are a powerful way to represent tasks. What’s more exciting is that these keypoint relations can be formulated as constraint satisfaction problems, allowing us to use off-the-shelf optimization solvers to generate complex robot…
What structural task representation enables multi-stage, in-the-wild, bimanual, reactive manipulation? Introducing ReKep: LVM to label keypoints & VLM to write keypoint-based constraints, solve w/ optimization for diverse tasks, w/o task-specific training or env models. 🧵👇
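A minimal sketch of the constraint-satisfaction idea from the two posts above: each keypoint relation becomes a cost over tracked 3D points, and an off-the-shelf solver (here `scipy.optimize.minimize`) finds a motion that satisfies all of them. The specific constraint forms and weights are illustrative, not ReKep's generated ones.

```python
import numpy as np
from scipy.optimize import minimize

def keypoints_after_motion(ee_delta: np.ndarray, grasped_kp: np.ndarray) -> np.ndarray:
    # Rigidly translate the grasped object's keypoint by the end-effector motion.
    return grasped_kp + ee_delta

def constraint_cost(ee_delta, grasped_kp, target_kp):
    kp = keypoints_after_motion(ee_delta, grasped_kp)
    align = np.linalg.norm(kp - target_kp)  # e.g., "spout above cup"
    level = abs(ee_delta[2])                # e.g., "keep height fixed"
    return align + 0.1 * level

grasped = np.array([0.4, 0.0, 0.3])  # tracked keypoint on the grasped object
target = np.array([0.6, 0.2, 0.3])   # tracked keypoint on the target object

res = minimize(constraint_cost, x0=np.zeros(3), args=(grasped, target))
print("end-effector displacement:", res.x)
```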
We are hosting the dexterous manipulation workshop at CoRL this year 🤖! We'll dive into topics like visual & tactile perception, skill learning, and control. Don't miss the opportunity to share your amazing work and participate! dex-manipulation.github.io/corl2024/
🎺 Announcing our CoRL 2024 workshop “Learning Robot Fine and Dexterous Manipulation: Perception and Control” in Munich. Join us to hear from an incredible lineup of speakers! And don’t miss the opportunity to submit your work and participate! Check out: dex-manipulation.github.io/corl2024/
Meet our AI-powered robot that’s ready to play table tennis. 🤖🏓 It’s the first agent to achieve amateur human level performance in this sport. Here’s how it works. 🧵
Try 𝐓𝐞𝐥𝐞𝐕𝐢𝐬𝐢𝐨𝐧 now without any setup: tinyurl.com/open-television It plays back a 3D movie recorded during teleoperation. If you open the link with a VR device, you can watch it in 3D and also see your own hand keypoints. x.com/xuxin_cheng/st…
Introducing Open-𝐓𝐞𝐥𝐞𝐕𝐢𝐬𝐢𝐨𝐧🤖: We need an intuitive and remote teleoperation interface to collect more robot data. 𝐓𝐞𝐥𝐞𝐕𝐢𝐬𝐢𝐨𝐧 lets you immersively operate a robot even if you are 3,000 miles away, like in the movie 𝘈𝘷𝘢𝘵𝘢𝘳. Open-sourced!
We are demoing our LEAP Hand teleoperation system today (Thursday) at #rss2024 with @ManusMeta in van Hasseltzaal (the demo room behind the poster rooms on the 3rd floor). Come try and teleoperate our low-cost, easy-to-build dexterous robot hands!
I’ve been training dogs since middle school. It’s about time I train robot dogs too 😛 Introducing, UMI on Legs, an approach for scaling manipulation skills on robot dogs🐶It can toss, push heavy weights, and make your ~existing~ visuo-motor policies mobile!
Introducing tactile skin sim-to-real for dexterous in-hand translation! We propose a simulation model for ReSkin, a magnetic tactile sensing skin. It can simulate ternary shear and binary normal forces. More: jessicayin.github.io/tactile-skin-r…
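The ternary-shear/binary-normal signal described above can be pictured with a tiny discretization sketch: continuous simulated contact forces are mapped to -1/0/+1 per shear axis and a binary contact bit. The thresholds below are illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np

SHEAR_THRESH = 0.05   # N; below this, shear reads as zero (assumed)
NORMAL_THRESH = 0.1   # N; minimum normal force to register contact (assumed)

def discretize_taxel(force_xyz: np.ndarray) -> tuple:
    """Map a simulated (fx, fy, fz) contact force to (ternary shear, binary normal)."""
    fx, fy, fz = force_xyz
    shear = tuple(
        0 if abs(f) < SHEAR_THRESH else int(np.sign(f)) for f in (fx, fy)
    )
    normal = int(fz > NORMAL_THRESH)  # contact / no contact
    return shear, normal

print(discretize_taxel(np.array([0.12, -0.02, 0.4])))  # ((1, 0), 1)
```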
Introducing FLAIR: Feeding via Long-horizon AcquIsition of Realistic dishes! Our system merges a library of skills with foundation models for efficient robotic feeding, tailored to user preferences. 🔗emprise.cs.cornell.edu/flair 📃arxiv.org/abs/2407.07561 To Appear at RSS ’24 1/N
Image-generation diffusion models can draw arbitrary visual patterns. What if we finetune Stable Diffusion to 🖌️ draw joint actions 🦾 on RGB observations? Introducing 𝗚𝗘𝗡𝗜𝗠𝗔 paper, videos, code, ckpts: genima-robot.github.io 🧵Thread⬇️
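One way to picture "drawing joint actions on RGB observations": project target joint positions into the camera image and paint them as colored markers that a fine-tuned image model learns to generate. The pinhole projection and marker scheme below are illustrative assumptions, not GENIMA's exact rendering.

```python
import numpy as np
from PIL import Image, ImageDraw

def project(points_3d: np.ndarray, K: np.ndarray) -> np.ndarray:
    # Pinhole projection of (N, 3) camera-frame points with intrinsics K.
    uv = (K @ points_3d.T).T
    return uv[:, :2] / uv[:, 2:3]

def draw_action_targets(rgb: np.ndarray, joints_3d: np.ndarray,
                        K: np.ndarray) -> Image.Image:
    img = Image.fromarray(rgb)
    draw = ImageDraw.Draw(img)
    for i, (u, v) in enumerate(project(joints_3d, K)):
        r = 6
        # One color per joint so a downstream controller can tell them apart.
        color = (40 * i % 255, 255 - 30 * i % 255, 120)
        draw.ellipse([u - r, v - r, u + r, v + r], fill=color)
    return img
```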
🔊 Audio signals contain rich information about daily interactions. Can our robots learn from videos with sound? Introducing ManiWAV, a robotic system that learns contact-rich manipulation skills from in-the-wild audio-visual data. See thread for more details (1/4) 👇
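As a sketch of how contact audio might be fused with vision for policy learning, here is a toy audio-visual encoder in Python. The architecture and the gripper-mic/wrist-cam pairing are illustrative assumptions, not ManiWAV's actual model.

```python
import torch
import torch.nn as nn
import torchaudio

class AudioVisualEncoder(nn.Module):
    def __init__(self, sample_rate: int = 16000, feat_dim: int = 128):
        super().__init__()
        # Log-mel spectrogram turns a raw waveform into a 2D feature map.
        self.mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_mels=64
        )
        self.audio_head = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim))
        self.vision_head = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim))

    def forward(self, audio: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # audio: (B, samples) from a gripper-mounted mic; image: (B, C, H, W).
        a = self.audio_head(torch.log1p(self.mel(audio)))
        v = self.vision_head(image)
        return torch.cat([a, v], dim=-1)  # joint feature for the policy
```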
Introducing Diffusion Forcing, which unifies next-token prediction (e.g., LLMs) and full-sequence diffusion (e.g., Sora)! It offers improved performance & new sampling strategies in vision and robotics, such as stable, infinite video generation, better diffusion planning, and more! (1/8)
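The core trick can be sketched in a few lines: during training, every token in the sequence receives its own independently sampled noise level, so the same model covers next-token prediction (future tokens fully noised) and full-sequence diffusion (one shared level) as special cases. Shapes and the linear schedule below are illustrative assumptions.

```python
import torch

def add_per_token_noise(x: torch.Tensor, num_levels: int = 1000):
    """x: (batch, seq_len, dim) clean token sequence."""
    b, t, d = x.shape
    k = torch.randint(0, num_levels, (b, t))  # independent noise level per token
    alpha = 1.0 - k.float() / num_levels      # toy linear schedule (assumed)
    alpha = alpha[..., None]                  # (b, t, 1): broadcast over dim
    noise = torch.randn_like(x)
    x_noisy = alpha.sqrt() * x + (1 - alpha).sqrt() * noise
    return x_noisy, k, noise

# The denoiser is then trained to predict `noise` from (x_noisy, k),
# exactly as in standard diffusion but with a per-token k.
```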