Haoyu Xiong
@Haoyu_Xiong_
Incoming PhD student @MIT EECS
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
Just reread the tidybot2.github.io docs today. What an incredible tutorial for building a robot system. Honestly, you could set up an entire new robot lab just by following it; @jimmyyhwu even gave you the link to the screwdriver he used 😂
Extrapolating this trend to robotics, I believe that if one is doing sim2real they should prefer Autoregressive > Diffusion (compute bottleneck), but if they are doing real-world training then Autoregressive < Diffusion (data bottleneck). We don't empirically validate this for…
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
Agency > Intelligence. I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ, etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are…
Intelligence is on tap now so agency is even more important
Check out @binghao_huang’s great work on scaling up tactile interaction in the wild!
Tactile interaction in the wild can unlock fine-grained manipulation! 🌿🤖✋ We built a portable handheld tactile gripper that enables large-scale visuo-tactile data collection in real-world settings. By pretraining on this data, we bridge vision and touch—allowing robots to:…
At a robotics lab in Pittsburgh, engineers are building adaptable, AI-powered robots that could one day work where it's too dangerous for humans. The research drew a visit from President Trump, who touted U.S. dominance in AI as companies announced $90 billion in new investments.
We’re building robots that work. Ultra's intelligent warehouse robots deploy in hours (not weeks), and adapt to real world chaos. Our robots are already packaging e-commerce orders in 3PL warehouses across the US. We create value for customers on day 1 using teleop control, and…
Compression is the heart of intelligence. From Occam to Kolmogorov—shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T, and a target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖 🧵
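A minimal sketch of the idea as the tweet states it, not KARL's actual code: search for the smallest token count t ≤ T whose reconstruction lands within 𝜖, assuming hypothetical `encode`/`decode`/`reconstruction_error` functions and that error shrinks as t grows.

```python
# Hypothetical sketch of the stated objective (not the authors' implementation):
# find the smallest t <= T whose reconstruction error is within eps, else use T.
def smallest_sufficient_tokens(image, T, eps, encode, decode, reconstruction_error):
    """Return the smallest token count t <= T that reconstructs `image` within `eps`."""
    lo, hi, best = 1, T, T
    while lo <= hi:                      # binary search assumes error is non-increasing in t
        t = (lo + hi) // 2
        recon = decode(encode(image, num_tokens=t))
        if reconstruction_error(image, recon) <= eps:
            best, hi = t, t - 1          # good enough: try fewer tokens
        else:
            lo = t + 1                   # too lossy: need more tokens
    return best
```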
This is wild…
We’re putting AI glasses on Chinese factory workers to replace them with robots
Robot learning has largely focused on standard platforms—but can it embrace robots of all shapes and sizes? In @XiaomengXu11's latest blog post, we show how data-driven methods bring unconventional robots to life, enabling capabilities that traditional designs and control can't…
I have a new favourite blog site
It is insane how underrated these blogs are. The man made an interactive visualization for different kinds of attention mechanisms (he has interactive visualizations for RNNs, LSTMs, CNNs, and so much more)
Now in Nature! 🚀 Our method learns a controllable 3D model of any robot from vision, enabling single-camera closed-loop control at test time! This includes robots previously uncontrollable, soft, and bio-inspired, potentially lowering the barrier of entry to automation! Paper:…
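Purely illustrative and not the paper's method: a generic single-camera closed-loop step, assuming a hypothetical learned model that predicts the camera view resulting from a candidate command, with `camera`, `model`, and `send_command` as placeholders.

```python
# Illustrative sketch only: sampling-based closed-loop control from one camera,
# using an assumed learned model that maps (observation, command) -> predicted image.
import numpy as np

def closed_loop_step(camera, model, send_command, goal_image, candidates):
    """Pick and execute the candidate command whose predicted view best matches the goal."""
    obs = camera.read()                                                  # current camera frame
    errors = [
        np.mean((model.predict_image(obs, cmd) - goal_image) ** 2)      # predicted visual error
        for cmd in candidates
    ]
    best_cmd = candidates[int(np.argmin(errors))]
    send_command(best_cmd)                                               # execute, then repeat next frame
    return best_cmd
```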
Check out @mangahomanga’s new work on one-shot human imitation
Presenting DemoDiffusion: an extremely simple approach enabling a pre-trained 'generalist' diffusion policy to follow a human demonstration for a novel task during inference. One-shot human imitation *without* requiring any paired human-robot data or online RL 🙂 1/n
Witnessed the process of @Haoyu_Xiong_ building up the entire system from scratch. Amazing to see the outcomes! Robots operating in cluttered environments with many occlusions are still an unaddressed challenge. Your robot really needs a neck for that, and it can have as many as 6 DoFs 🐍
Haoyu built an awesome bimanual + neck robot which can be easily mounted on the TidyBot++ mobile base. Hardware design is fully open source! Check out his thread below to learn more 👇
We’ve open-sourced everything: vision-in-action.github.io arXiv: arxiv.org/abs/2506.15666 GitHub: github.com/haoyu-x/vision… Hardware: github.com/haoyu-x/vision… Thanks to my incredible collaborators @XiaomengXu11 @jimmyyhwu @YifanHou2. Thanks to Jeannette @leto__jean for her…
Teleoperating a robot feels unnatural — not just because of limited arm or hand DoFs, but also because of the lack of perceptual freedom! Humans naturally move their head and torso to search, track, and focus — far beyond a simple 2-DoF camera. How to get there? Check out…
Assemble a minimal humanoid using off-the-shelf arms and just a few frame components!
ViA shows robust visual understanding. In the Lime & Pot task, the lime is randomly placed and often not visible at first. The robot learns to look around and search for the object first before initiating arm actions. Check the rollouts👇 5/7