Pulkit Agrawal
@pulkitology
Faculty @ MIT
Presenting Visual Dexterity: Object re-orientation in its full generality! Single camera. Novel objects. Any orientation. Downward-facing hand that fights gravity. Real-time dynamic control. Open source setup. Learn more: taochenshh.github.io/projects/visua… Led by @taochenshh #robots #rl
What if robots had wrists? 👇
What’s keeping robot arms from working like human arms? They're big, slow, have the wrong joints, and can't conform to their environment. DexWrist solves all of these issues and simplifies learning constrained, dynamic manipulation👉 dexwrist.csail.mit.edu
What if an LLM could decide what data to use, potentially generate its own data, and decide how to update itself? 👇
What if an LLM could update its own weights? Meet SEAL🦭: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
Llama 4 (@Meta) results are consistent with what we hypothesized would unleash the next generation of AI reasoning. A new paradigm for pre-training is around the corner: arxiv.org/abs/2502.19402
Llama 4 (@Meta) shows too much SFT limits RL exploration — something we also found in our recent work! A new and superior pretraining paradigm is around the corner to unleash a new era of reasoning. Check out our paper: arxiv.org/abs/2502.19402 Thread: x.com/pulkitology/st…
Agents performing "curious" exploration by setting random abstract goals! We present a simple but effective method for deep exploration of reinforcement learning (RL) agents that we call random latent exploration (RLE). Typical RL agents explore by: ⚔️ Noise-based exploration,…

You aren't doing robotics if you are not breaking some robots! uan.csail.mit.edu
Auditing and exposing the fragility of language-conditioned robot models with Embodied Red Teaming (ERT)! 🤯 Simple re-phrasing of task instructions, e.g., from "Please bring me a can of coke" to "Give me a coke," is the difference between the robot succeeding or failing.…
