Ananye Agarwal

@anag004

building robot brains @SkildAI | MLD PhD at CMU | Prev CS @ IITD.

Joined March 2012

499 Following

1K Followers

Pinned

Ananye Agarwal@anag004 · Jul 9, 2024

As a founding researcher, I have seen @SkildAI grow exponentially. We changed 3 offices, grew 10x in human (and robot) numbers, and became a unicorn in less than a year. If you want to scale up robotics and work with a cracked team of engineers and scientists, come to @SkildAI.

Deepak Pathak@pathak2206 · Jul 9, 2024

Thrilled to announce @SkildAI! Over the past year, @gupta_abhinav_ and I have been working with our top-tier team to build an AI foundation model grounded in the physical world. Today, we’re taking Skild AI out of stealth with $300M in Series A funding: forbes.com/sites/rashishr…

197

39.0K

Pinned

Ananye Agarwal@anag004 · Jul 2

Research arc: ⏪ 2 yrs ago, we introduced VRB: learning from hours of human videos to cut down teleop (Gibson🙏) ▶️ Today, we explore a wilder path: robots deployed with no teleop, no human demos, no affordances. Just raw video generation magic 🙏 Day 1 of faculty life done! 😉…

Shivansh Patel@shivanshpatel35 · Jul 1

🚀 Introducing RIGVid: Robots Imitating Generated Videos! Robots can now perform complex tasks—pouring, wiping, mixing—just by imitating generated videos, purely zero-shot! No teleop. No OpenX/DROID/Ego4D. No videos of human demonstrations. Only AI generated video demos 🧵👇

133

23.0K

Pinned

Ananye Agarwal Retweeted

Homanga Bharadhwaj@mangahomanga · May 14

Humans grasp objects with a purpose! Web2Grasp enables such functional grasping for dexterous robot hands via hand-object reconstruction from web images - without *any* robot teleop data collection 1/n

116

28.0K

Ananye Agarwal Retweeted

Mihir Prabhudesai@mihirp98 · Jul 22

🚨 The era of infinite internet data is ending, So we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n

121

171

961

831

168.0K

Ananye Agarwal Retweeted

Aradhye Agarwal@AradhyeAgarwal · Jul 17

Big news! Our paper "Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of LLMs" has been accepted to TACL — a top-tier ACL-sponsored journal (Impact Factor > 9)! 🎉 📄 Paper: arxiv.org/abs/2408.14470 🔧 Code: github.com/Aradhye2002/se… 🧵Thread below 👇

551

Ananye Agarwal@anag004 · Jun 23

Pure continuous-space reasoning isn’t practical. Reasoning requires decision-making, which is naturally enforced when hidden states are decoded into discrete tokens.

Yann LeCun@ylecun · Jun 18

It is intuitively obvious that reasoning in continuous embedding space is dramatically more powerful than reasoning in discrete token space. This paper from @tydsh and team shows that it is the case theoretically.

2.0K

Ananye Agarwal@anag004 · Jun 17

PPO is often frustrating to tune for many continuous control tasks since it keeps getting stuck in local minima. In our SAPG paper (sapg-rl.github.io), we showed how training multiple followers with PPO and combining their data can mitigate this issue. In EPO,…

Jianren Wang@wang_jianren · Jun 17

(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative - EPO.

3.0K

Ananye Agarwal Retweeted

Mihir Prabhudesai@mihirp98 · May 28

Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer keys aren’t available (e.g., taking an exam). Surprisingly, LLMs can also learn w/o ground-truth answers, simply by reinforcing high-confidence answers via RL!

285

265

66.0K

Ananye Agarwal@anag004 · May 26

Cool (and somewhat counterintuitive) finding from my brother: conciseness and correctness are correlated in LLM reasoning! This neat fact can be used to design an efficient and more accurate test-time search algorithm.

Aradhye Agarwal@AradhyeAgarwal · May 26

For the past couple of months we've been working on test-time scaling, and we've discovered a huge thing:

991

Ananye Agarwal Retweeted

Aradhye Agarwal@AradhyeAgarwal · May 26

For the past couple of months we've been working on test-time scaling, and we've discovered a huge thing:

5.0K

Ananye Agarwal Retweeted

Kenny Shaw@kenny__shaw · May 19

Very exciting Handy Moves workshop at ICRA 2025 this year! It's an honor to be hosting this morning session! Please join us in Room 302 😀 sites.google.com/view/dexterity…

2.0K

Ananye Agarwal@anag004 · May 13

Maybe real-world robot generalization doesn’t need massive teleop datasets? 🤔 In DexWild, we show that human demos 🙌 + a little robot data 🤖 = policies that generalize across scenes 🏞️, tasks 🛠️, and embodiments 🦾!

Tony Tao@_tonytao_ · May 13

Training robots for the open world needs diverse data But collecting robot demos in the wild is hard! Presenting DexWild 🙌🏕️ Human data collection system that works in diverse environments, without robots 💪🦾 Human + Robot Cotraining pipeline that unlocks generalization 🧵👇

8.0K

Ananye Agarwal Retweeted

Tony Tao@_tonytao_ · May 13

319

141

89.0K

Ananye Agarwal@anag004 · May 11

Exciting to see (at 5:55) Nvidia adopting LEAP Hand in their sim2real efforts! Build your own at leaphand.com! Lots more coming this summer, stay tuned :) @pathak2206 @anag004

Jim Fan@DrJimFan · May 8

The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immaculate living room and a candlelight dinner. And you couldn't tell whether a human or a machine had been there. Deceptively simple, insanely hard. It is the…

6.0K

Ananye Agarwal@anag004 · May 11

Great to see Nvidia using LEAP Hand and the sim2real pipeline we developed for it (5:55)! We trained a policy to in-hand rotate a cube using only proprioception: v1.leaphand.com. Work w/ @kenny__shaw @pathak2206

Jim Fan@DrJimFan · May 8

737

Ananye Agarwal Retweeted

Skild AI@SkildAI · Apr 18

Skild AI is on the @Forbes AI 50 list of the most promising privately-held AI companies in the world!! #ForbesAI50 Join us: skild.ai/career Full list: forbes.com/lists/ai50/

4.0K

Ananye Agarwal Retweeted

Christina Baek@_christinabaek · Apr 16

Are current reasoning models optimal for test-time scaling? 🌠 No! Models make the same incorrect guess over and over again. We show that you can fix this problem w/o any crazy tricks 💫 – just do weight ensembling (WiSE-FT) for big gains on math! 1/N

103

485

327

53.0K

Ananye Agarwal Retweeted

Mihir Prabhudesai@mihirp98 · Apr 2

1/ Happy to share UniDisc - Unified Multimodal Discrete Diffusion – We train a 1.5 billion parameter transformer model from scratch on 250 million image/caption pairs using a **discrete diffusion objective**. Our model has all the benefits of diffusion models but now in…

113

778

487

103.0K

Ananye Agarwal Retweeted

Eliot Xing@etaoxing · Mar 12

RL is notoriously sample inefficient. How can we scale RL on tasks much slower to simulate than rigid body physics, such as soft bodies? In our #ICLR2025 spotlight, we introduce both a new first-order RL algorithm, SAPO, and differentiable simulation platform, Rewarped. 1/n

355

215

32.0K

Ananye Agarwal Retweeted

Jason Liu@JasonJZLiu · Feb 25

Low-cost teleop systems have democratized robot data collection, but they lack any force feedback, making it challenging to teleoperate contact-rich tasks. Many robot arms provide force information — a critical yet underutilized modality in robot learning. We introduce: 1. 🦾A…

808

248

142.0K

Ananye Agarwal Retweeted

Samuel Sokota@ssokota · Feb 14

Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker). We performed the largest-ever comparison of these algorithms. We find that they do not outperform generic policy gradient methods, such as PPO. 1/N

351

325

77.0K