Abhishek Gupta

@abhishekunique7

Assistant Professor at the University of Washington. I like robots and reinforcement learning. Previously: postdoc at MIT, PhD at Berkeley

Seattle, WA

Joined February 2012

862 Following

8K Followers

Pinned

Abhishek Gupta@abhishekunique7 · Jun 25

So you’ve trained your favorite diffusion/flow based policy, but it’s just not good enough 0-shot. Worry not, in our new work DSRL - we show how to *steer* pre-trained diffusion policies with off-policy RL, improving behavior efficiently enough for direct training in the real…

Pinned

Abhishek Gupta@abhishekunique7 · Jun 20

Check out some of our new work on distributed robot evaluation led by @KarlPertsch, @pranav_atreya and @tonyh_lee! Hopefully folks can contribute, and help us take a step towards systematic and standardized empiricism in robot learning! :) Also check out some of the fun sim eval…

Karl Pertsch@KarlPertsch · Jun 20

We’re releasing the RoboArena today!🤖🦾 Fair & scalable evaluation is a major bottleneck for research on generalist policies. We’re hoping that RoboArena can help! We provide data, model code & sim evals for debugging! Submit your policies today and join the leaderboard! :) 🧵

Abhishek Gupta Retweeted

Russ Tedrake@RussTedrake · Jul 9

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…

Abhishek Gupta@abhishekunique7 · Jul 1

In our latest paper, we discovered a surprising result: training LLMs with self-play reinforcement learning on zero-sum games (like poker) significantly improves performance on math and reasoning benchmarks, zero-shot. Whaaat? How does this work? We analyze the results and find…

Bo Liu (Benjamin Liu)@Benjamin_eecs · Jul 1

We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable CoT patterns from pretrained LLMs. Games provide perfect testing grounds with cheap, verifiable rewards. Self-play automatically discovers and reinforces…

Abhishek Gupta Retweeted

Andrew Wagenmaker@ajwagenmaker · Jun 25

Diffusion policies have demonstrated impressive performance in robot control, yet are difficult to improve online when 0-shot performance isn’t enough. To address this challenge, we introduce DSRL: Diffusion Steering via Reinforcement Learning. (1/n) diffusion-steering.github.io

Abhishek Gupta@abhishekunique7 · Jun 21

I'm sadly unable to be at #RSS2025 this year, but my students @prodarhan, @chuning_zhu and @marceltornev will be! Find them presenting some exciting work today, 6/21: 1) @chuning_zhu will present Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large…

Abhishek Gupta Retweeted

Pranav Atreya@pranav_atreya · Jun 20

In robotics benchmarks are rarely shared. New eval setups are created for each new project, a stark difference from evals in broader ML. But generalist policies share a problem statement: do any task in any environment. Can generalist capabilities make robot evaluation easier?

Abhishek Gupta Retweeted

Yunchu@yunchuzh · Jun 19

How should a robot perceive the world? What kind of visual representation leads to robust visuomotor policy learning for robotics? Policies trained on raw images are often fragile—easily broken by lighting, clutter, or object variations—making it challenging to deploy policies…

Abhishek Gupta@abhishekunique7 · Jun 19

Check out @yunchuzh's new work on automatically selecting keypoints as a representation for super robust policy learning!

Yunchu@yunchuzh · Jun 19
