Shuran Song
@SongShuran
Assistant Professor @Stanford University working on #Robotics #AI #ComputerVision
Meet the newest member of the UMI family: DexUMI! Designed for intuitive data collection — and it fixes a few things the original UMI couldn’t handle:
🖐️ Supports multi-finger dexterous hands — tested on both under- and fully-actuated types
🧂 Records tactile info — it can tell…
Can we collect robot dexterous hand data directly with a human hand? Introducing DexUMI: a dexterous hand data collection system with zero teleoperation and zero re-targeting → autonomously completes precise, long-horizon, and contact-rich tasks. Project Page: dex-umi.github.io
Thank you @CSProfKGD!!
Really cool invited talk by @SongShuran - “Making Video Model Useful for Robots”
What an incredible setting for a workshop 😍
Tactile interaction in the wild can unlock fine-grained manipulation! 🌿🤖✋ We built a portable handheld tactile gripper that enables large-scale visuo-tactile data collection in real-world settings. By pretraining on this data, we bridge vision and touch—allowing robots to:…
have been waiting for this release! Robotics needs rigorous and careful evaluation now more than ever 🦾
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…
Point maps have become a powerful representation for image-based 3D reconstruction. What if we could push point maps even further to tackle 3D registration and assembly? Introducing Rectified Point Flow (RPF), a generic formulation for point cloud pose estimation.
Robot learning has largely focused on standard platforms—but can it embrace robots of all shapes and sizes? In @XiaomengXu11's latest blog post, we show how data-driven methods bring unconventional robots to life, enabling capabilities that traditional designs and control can't…
We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates. 📷 Watch the repo walkthrough…
I'll present RoboPanoptes at #RSS2025 tomorrow 6/22 🐍
Spotlight talk: 9:00-10:30am (Bovard Auditorium)
Poster: 12:30-2:00pm, poster #31 (Associates Park)
Can robots leverage their entire body to sense and interact with their environment, rather than just relying on a centralized camera and end-effector? Introducing RoboPanoptes, a robot system that achieves whole-body dexterity through whole-body vision. robopanoptes.github.io
Teleoperating a robot feels unnatural — not just because of limited arm or hand DoFs, but also because of the lack of perceptual freedom! Humans naturally move their head and torso to search, track, and focus — far beyond a simple 2-DoF camera. How to get there? Check out…
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
Let’s say we do have a powerful robot model that learns rich behaviors from large-scale robot data. What now? To make them actually useful, we need to be able to steer and control their behavior so that they match user intent, preferences, or deployment needs. How can we do…
Normally, changing robot policy behavior means changing its weights or relying on a goal-conditioned policy. What if there was another way? Check out DynaGuide, a novel policy steering approach that works on any pretrained diffusion policy. dynaguide.github.io 🧵
There’s something satisfying about seeing the robot slot in the box flaps so nicely at the end ... 😌
Today we're excited to share a glimpse of what we're building at Generalist. As a first step towards our mission of making general-purpose robots a reality, we're pushing the frontiers of what end-to-end AI models can achieve in the real world. Here's a preview of our early…
We've been heads-down building. The robots have gotten pretty good. We'll be sharing a brief update soon.
just realized twitter cropped the images :( here's the full list:
always enjoy reading Seohong's papers, they are incredibly clear and thought-provoking -- Seohong needs to write more :)
Q-learning is not yet scalable seohong.me/blog/q-learnin… I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
Manipulation is not just about the hand; it is a whole-body activity 🐕
Excited to share our new work ReLIC, a framework for versatile loco-manipulation through flexible interlimb coordination. We combine reinforcement learning and model-based control to let robots dynamically assign limbs 🦾🦿 for manipulation or locomotion based on task demands.
💡Can we let an arm-mounted quadrupedal robot perform tasks with both its arms and legs? Introducing ReLIC: Reinforcement Learning for Interlimb Coordination for versatile loco-manipulation in unstructured environments. [1/6] relic-locoman.rai-inst.com
How can we move beyond static-arm lab setups and learn robot policies in our messy homes? We introduce HoMeR, an imitation learning agent for in-the-wild mobile manipulation. 🧵1/8