Wenhao Yu
@Stacormed
Research Scientist @DeepMind
No more head bumping from picking up stuff from under the bed! Check out our recent work that tightly integrates legs and manipulators, opening up so many applications and interesting loco-manipulation research questions!
LocoMan = Quadrupedal Robot + 2 × Loco-Manipulators. Powered by dual lightweight 3-DoF loco-manipulators and a whole-body controller, LocoMan achieves various challenging tasks, such as manipulation in narrow spaces and bimanual manipulation. linchangyi1.github.io/LocoMan 👇👇👇
How do we imbue robots with the ability to imagine the world and complete tasks better? Join us at the CoRL 2025 workshop on Robotics World Modeling and share your latest work in this area!
🤖🌎 We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io
Deadline extended! You now have until May 25th (10 days after the NeurIPS deadline) to submit to our ICML World Model Workshop. Looking forward to your papers!
How can we connect world models to the physical world? Come join our 2025 workshop at ICML on Building Physically Plausible World Models! physical-world-modeling.github.io (1/2)
Complementary to Gemini Robotics -- the massive vision-language-action (VLA) model released yesterday -- we also investigated how far we can push Gemini for robotics _purely from simulation data_ in Proc4Gem: 🧵
We’ve always thought of robotics as a helpful testing ground for translating AI advances into the physical world. Today we’re taking our next step in this journey with our newest Gemini 2.0 robotics models. They show state-of-the-art performance on two important benchmarks -…
Super excited to share what we’ve been working on!
Meet Gemini Robotics: our latest AI models designed for a new generation of helpful robots. 🤖 Based on Gemini 2.0, they bring capabilities such as better reasoning, interactivity, dexterity and generalization into the physical world. 🧵 goo.gle/gemini2-roboti…
2nd Earth Rover Challenge is coming! Eager to see how much progress AI will make in navigating real cities against real human agents!
Announcing the 2nd Earth Rover Challenge: an "AI vs Gamers" global navigation competition (to be held at #ICRA2025 in May in Atlanta) Co-organized with researchers from DeepMind, Meta & academia A thread 🧵 - 1/n
💪💪
Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet. We're first releasing an experimental version of 2.0 Flash ⚡ It has better performance, new multimodal output, @Google tool use - and paves the way for new agentic experiences. 🧵 goo.gle/gemini-2
Wow, this is really good! In some ways I’m more impressed that it’s teleoperated than if it were autonomous, cuz it feels very plausible to develop a highly specialized RL-based policy to do this, but being able to teleop this opens up a wide range of data to be collected.
Got a new hand for Black Friday
How can we leverage the common-sense knowledge of a VLM to understand the progress (and even quality!) of a robotics trajectory? Check out GVL for a surprisingly simple and elegant way to do that! Awesome work by Jason!
Excited to finally share Generative Value Learning (GVL), my @GoogleDeepMind project on extracting universal value functions from long-context VLMs via in-context learning! We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+…
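To make the in-context value idea above concrete, here is a minimal sketch (not the GVL or Gemini implementation; `query_vlm` is a hypothetical stand-in for any multimodal model API): shuffle the trajectory frames, ask the model for a per-frame task-completion percentage, then restore the original order.

```python
# Minimal sketch of VLM-based value estimation via in-context prompting.
# Hypothetical: `query_vlm` stands in for a multimodal LLM call and is
# NOT the GVL codebase or an official Gemini client.
import random
from typing import Callable, List

def estimate_progress(frames: List[bytes], task: str,
                      query_vlm: Callable[..., str]) -> List[float]:
    """Ask a VLM for per-frame task-completion values in [0, 1].

    Shuffling the frames forces the model to judge progress from image
    content rather than from temporal position in the prompt.
    """
    order = list(range(len(frames)))
    random.shuffle(order)

    prompt = (
        f"Task: {task}\n"
        "For each image below, reply with the estimated task completion "
        "percentage (0-100), one number per line, in the order given."
    )
    reply = query_vlm(prompt, images=[frames[i] for i in order])
    shuffled_values = [float(line) / 100.0 for line in reply.strip().splitlines()]

    # Undo the shuffle so values align with the original trajectory order.
    values = [0.0] * len(frames)
    for value, idx in zip(shuffled_values, order):
        values[idx] = value
    return values
```

A nice property of framing it this way is that adding a few annotated example frames to the prompt turns the same call into few-shot value estimation, with no training involved.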
We just open sourced the hardware and software of LocoMan: github.com/linchangyi1/Lo…. Try it out yourself!
In Vienna for ICML this week! Let me know if you are down to catch up; looking forward to the great discussions and talks ahead! Also come check out our work on PIVOT (pivot-prompt.github.io) during Tuesday’s poster session!

How can Gemini 1.5 Pro’s long context window help robots navigate the world? 🤖 A thread of our latest experiments. 🧵
For the past year we've been working on ALOHA Unleashed 🌋 @GoogleDeepmind - pushing the scale and dexterity of tasks on our ALOHA 2 fleet. Here is a thread with some of the coolest videos! The first task is hanging a shirt on a hanger (autonomous 1x)
Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? ⚽ We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning. Here’s how. 🧵 dpmd.ai/3vUlgjC
Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands Everything open-sourced
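For intuition about the kind of data a glove-based setup like this yields, here is a hypothetical sketch of a per-timestep record and how it might be flattened into an imitation-learning sample. Field names and shapes are assumptions for illustration, not DexCap’s actual schema.

```python
# Hypothetical per-timestep record from a glove-based hand mocap rig
# (field names/shapes are illustrative assumptions, not DexCap's format).
from dataclasses import dataclass
import numpy as np

@dataclass
class HandMocapFrame:
    timestamp: float                 # seconds since recording start
    point_cloud: np.ndarray          # (N, 3) scene points from a depth camera
    finger_joint_angles: np.ndarray  # (n_joints,) glove-reported joint angles, radians
    wrist_pose: np.ndarray           # (4, 4) SE(3) wrist pose in the camera frame

def to_training_sample(frame: HandMocapFrame) -> dict:
    """Turn one mocap frame into an observation/action pair for training
    a dexterous-hand policy by imitation."""
    return {
        "obs": {"points": frame.point_cloud, "wrist": frame.wrist_pose},
        "action": frame.finger_joint_angles,
    }
```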