Haoru Xue
@HaoruXue
PhD @berkeley_ai | intern @ NVIDIA GEAR | prev. @CMU_Robotics @LeCARLab | Robot Learning, Humanoids
🚀 Introducing LeVERB, the first 𝗹𝗮𝘁𝗲𝗻𝘁 𝘄𝗵𝗼𝗹𝗲-𝗯𝗼𝗱𝘆 𝗵𝘂𝗺𝗮𝗻𝗼𝗶𝗱 𝗩𝗟𝗔 (upper- & lower-body), trained on sim data and zero-shot deployed. Addressing interactive tasks: navigation, sitting, locomotion with verbal instruction. 🧵 ember-lab-berkeley.github.io/LeVERB-Website/
🧠With the shift in humanoid control from pure RL to learning from demonstrations, we take a step back to unpack the landscape. 🔗breadli428.github.io/post/lfd/ 🚀Excited to share our blog post on Feature-based vs. GAN-based Learning from Demonstrations—when to use which, and why it…
LeVERB is a VLA framework for humanoid whole-body control that combines a vision-language model and a low-level controller through a shared latent action space, trained entirely in sim and deployed zero-shot.
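To make the System-1/System-2 split concrete, here is a minimal sketch of the inference-time interface: a vision-language module emits a latent "verb" at a low rate, and a whole-body controller decodes it into joint targets at a high rate. All module names, dimensions, and rates below are illustrative assumptions, not the released LeVERB code.

```python
# Minimal sketch of the latent interface between System 2 (vision-language) and
# System 1 (whole-body control). Sizes and rates are illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM, PROPRIO_DIM, NUM_JOINTS = 32, 48, 27

class System2(nn.Module):
    """Stand-in for the vision-language backbone: image + instruction -> latent verb."""
    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(3 * 64 * 64, 256)  # placeholder visual encoder
        self.text = nn.Linear(128, 256)            # placeholder instruction encoder
        self.head = nn.Linear(512, LATENT_DIM)

    def forward(self, image, instruction):
        feats = torch.cat([self.vision(image.flatten(1)), self.text(instruction)], dim=-1)
        return self.head(feats)                    # latent "verb"

class System1(nn.Module):
    """Stand-in for the low-level controller: proprioception + latent -> joint targets."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PROPRIO_DIM + LATENT_DIM, 256), nn.ELU(),
            nn.Linear(256, NUM_JOINTS),
        )

    def forward(self, proprio, latent_verb):
        return self.net(torch.cat([proprio, latent_verb], dim=-1))

system2, system1 = System2(), System1()
image = torch.rand(1, 3, 64, 64)
instruction = torch.rand(1, 128)                   # e.g. embedding of "walk to the chair and sit"
latent_verb = system2(image, instruction)          # System 2 updates the verb at a low rate
for _ in range(50):                                # System 1 runs the fast control loop
    proprio = torch.rand(1, PROPRIO_DIM)
    joint_targets = system1(proprio, latent_verb)  # same latent reused between System-2 updates
```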
An executable action vocabulary naturally exists for manipulation VLAs (e.g., end-effector poses). To build a humanoid VLA, this paper learns an action vocabulary connecting Systems 1 & 2 that is: 1. Expressive enough and executable for versatile humanoid whole-body control in system…
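One plausible way to learn such a vocabulary (a hedged sketch, not necessarily the paper's exact recipe) is a CVAE-style objective: an encoder compresses a reference whole-body motion snippet into a latent, the controller reproduces the motion from that latent plus proprioception, and a KL term keeps the latent space smooth enough for System 2 to predict in.

```python
# Hedged CVAE-style sketch of learning a latent action vocabulary; all sizes,
# weights, and the single-step reconstruction target are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, PROPRIO_DIM, NUM_JOINTS, HORIZON = 32, 48, 27, 16

motion_encoder = nn.Linear(HORIZON * NUM_JOINTS, 2 * LATENT_DIM)   # outputs (mu, logvar)
controller = nn.Sequential(nn.Linear(PROPRIO_DIM + LATENT_DIM, 256), nn.ELU(),
                           nn.Linear(256, NUM_JOINTS))

ref_motion = torch.rand(8, HORIZON, NUM_JOINTS)        # batch of reference motion snippets
proprio = torch.rand(8, PROPRIO_DIM)

mu, logvar = motion_encoder(ref_motion.flatten(1)).chunk(2, dim=-1)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterized latent "verb"
pred = controller(torch.cat([proprio, z], dim=-1))     # predicted joint targets

recon = F.mse_loss(pred, ref_motion[:, 0])             # track the reference motion
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + 1e-3 * kl                               # KL keeps the vocabulary smooth for System 2
```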
"What cannot be measured cannot be managed. We first create LeVERB-Bench, a photorealistic, whole-body vision-language benchmark for humanoid robots." 🚨A huge milestone in VLA for humanoid whole-body control — they open-sourced: 🏅The first image-to-real-ready, vision-language,…
Simulation can give you so much more than you think before you do real-world teleop. Check out @HaoruXue's latest work on latent humanoid VLA, which connects low-level humanoid control with high-level vision-language understanding—pure Sim2Real magic!
Latents serve as the interface between System 1 and System 2, rather than relying on explicit kinematic motions. This paradigm more closely resembles how humans plan and execute movements. Nice work making humanoid robots more human!
CYBER-TRIP 😎 Rolling down HWY 1 with @zhengyiluo @TairanHe99 @_wenlixiao @zi2865 and a G1. See y'all at RSS!
Nvidia GEAR RSS 2025 Squad Rolling Out
Impressive work. A lot of work this year shows that good engineering can really demystify WBC. There is no more excuse for crappy policies. Next steps: making WBC policies more accessible and making them easier to interface with vision-language.
🚀Introducing GMT — a general motion tracking framework that enables high-fidelity motion tracking on humanoid robots by training a single policy from large, unstructured human motion datasets. 🤖A step toward general humanoid controllers. Project Website:…
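For readers new to motion tracking: the usual recipe when training a single policy on large mocap datasets is an exponentiated tracking-error reward over sampled reference frames. The sketch below shows only that general form; GMT's actual terms and weights may differ.

```python
# Hedged sketch of a generic motion-tracking reward (not GMT's exact formulation);
# weights, scales, and dimensions are illustrative assumptions.
import numpy as np

def tracking_reward(qpos, qpos_ref, body_pos, body_pos_ref,
                    w_joint=0.5, w_body=0.5, sigma_joint=0.5, sigma_body=0.3):
    """Peaks at 1.0 when the robot matches the reference motion exactly."""
    joint_err = np.sum((qpos - qpos_ref) ** 2)            # joint-angle tracking error
    body_err = np.sum((body_pos - body_pos_ref) ** 2)     # body-keypoint tracking error
    return (w_joint * np.exp(-joint_err / sigma_joint ** 2)
            + w_body * np.exp(-body_err / sigma_body ** 2))

# Illustrative call with random joint angles and 13 body keypoints
r = tracking_reward(np.random.rand(27), np.random.rand(27),
                    np.random.rand(13, 3), np.random.rand(13, 3))
```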
🤖Can a humanoid robot carry a full cup of beer without spilling while walking 🍺? Hold My Beer! Introducing Hold My Beer🍺: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control. Project: lecar-lab.github.io/SoFTA/ See more details below👇