Shuran Song
@SongShuran
Assistant Professor @Stanford University working on #Robotics #AI #ComputerVision
Meet the newest member of the UMI family: DexUMI! Designed for intuitive data collection — and it fixes a few things the original UMI couldn’t handle:
🖐️ Supports multi-finger dexterous hands — tested on both under- and fully-actuated types
🧂 Records tactile info — it can tell…
Can we collect robot dexterous hand data directly with a human hand? Introducing DexUMI: a dexterous hand data collection system with zero teleoperation and zero re-targeting → autonomously completes precise, long-horizon, and contact-rich tasks. Project Page: dex-umi.github.io
Thank you @CSProfKGD!!
Really cool invited talk by @SongShuran - “Making Video Model Useful for Robots”
What an incredible setting for a workshop 😍
Tactile interaction in the wild can unlock fine-grained manipulation! 🌿🤖✋ We built a portable handheld tactile gripper that enables large-scale visuo-tactile data collection in real-world settings. By pretraining on this data, we bridge vision and touch—allowing robots to:…
have been waiting for this release! Robotics needs rigorous and careful evaluation now more than ever 🦾
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…
Point maps have become a powerful representation for image-based 3D reconstruction. What if we could push point maps even further to tackle 3D registration and assembly? Introducing Rectified Point Flow (RPF), a generic formulation for point cloud pose estimation.
Robot learning has largely focused on standard platforms—but can it embrace robots of all shapes and sizes? In @XiaomengXu11's latest blog post, we show how data-driven methods bring unconventional robots to life, enabling capabilities that traditional designs and control can't…
We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates. 📷 Watch the repo walkthrough…
I'll present RoboPanoptes at #RSS2025 tomorrow 6/22 🐍
Spotlight talk: 9:00-10:30am (Bovard Auditorium)
Poster: 12:30-2:00pm, poster #31 (Associates Park)
Can robots leverage their entire body to sense and interact with their environment, rather than just relying on a centralized camera and end-effector? Introducing RoboPanoptes, a robot system that achieves whole-body dexterity through whole-body vision. robopanoptes.github.io
Teleoperating a robot feels unnatural — not just because of limited arm or hand DoFs, but also because of the lack of perceptual freedom! Humans naturally move their head and torso to search, track, and focus — far beyond a simple 2-DoF camera. How to get there? Check out…
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
Let’s say we do have a powerful robot model that learns rich behaviors from large-scale robot data. What now? To make them actually useful, we need to be able to steer and control their behavior so that they match user intent, preferences, or deployment needs. How can we do…
Normally, changing robot policy behavior means changing its weights or relying on a goal-conditioned policy. What if there was another way? Check out DynaGuide, a novel policy steering approach that works on any pretrained diffusion policy. dynaguide.github.io 🧵
There’s something satisfying about seeing the robot slot in the box flaps so nicely at the end ... 😌
Today we're excited to share a glimpse of what we're building at Generalist. As a first step towards our mission of making general-purpose robots a reality, we're pushing the frontiers of what end-to-end AI models can achieve in the real world. Here's a preview of our early…
We've been heads-down building. The robots have gotten pretty good. We'll be sharing a brief update soon.
just realized twitter cropped the images :( here's the full list:
always enjoy reading Seohong's papers, they are incredibly clear and thought-provoking -- Seohong needs to write more :)
Q-learning is not yet scalable seohong.me/blog/q-learnin… I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
Manipulation is not just about the hand; it is a whole-body activity 🐕
Excited to share our new work ReLIC, a framework for versatile loco-manipulation through flexible interlimb coordination. We combine reinforcement learning and model-based control to let robots dynamically assign limbs 🦾🦿 for manipulation or locomotion based on task demands.
💡Can we let an arm-mounted quadrupedal robot perform tasks with both its arms and legs? Introducing ReLIC: Reinforcement Learning for Interlimb Coordination for versatile loco-manipulation in unstructured environments. [1/6] relic-locoman.rai-inst.com
How can we move beyond static-arm lab setups and learn robot policies in our messy homes? We introduce HoMeR, an imitation learning agent for in-the-wild mobile manipulation. 🧵1/8