Chelsea Finn
@chelseabfinn
Asst Prof of CS & EE @Stanford. Co-founder of Physical Intelligence @physical_int. PhD from @Berkeley_EECS, EECS BS from @MIT.
Since the first year of my PhD, every talk I’ve given has opened with a slide about the distant north star: dropping a robot into a home it has never been in before and having it do useful things. I think it might be time for me to find a new opening slide 😀. Thrilled to share π-0.5!
We got a robot to clean up homes that were never seen in its training data! Our new model, π-0.5, aims to tackle open-world generalization. We took our robot into homes that were not in the training data and asked it to clean kitchens and bedrooms. More below⤵️
Chelsea Finn (@chelseabfinn) on building general-purpose robotics and bringing intelligence into the physical world, at AI Startup School in San Francisco. [Video chapters: General Purpose Robots · Challenges in Robotics Applications · Physical Intelligence: A New Approach…]
The robot can autonomously perform a real gallbladder removal subroutine!
- Successfully completed the procedure on 8 of 8 held-out gallbladders
- Uses the same algorithm we used to train robots to make trail mix, with a language hierarchy
Paper + videos: h-surgical-robot-transformer.github.io
Introducing Hierarchical Surgical Robot Transformer (SRT-H), a language-guided policy for autonomous surgery🤖🏥 On the da Vinci robot, we perform a real surgical procedure on animal tissue. Collaboration between @JohnsHopkins & @Stanford
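For intuition, here is a minimal sketch of what a language-guided hierarchy in the spirit of SRT-H could look like; the interfaces below (`high_level`, `low_level`, `env.step`, `replan_every`) are illustrative placeholders, not the released implementation.

```python
# Hedged sketch: a high-level model watches the scene and emits a language
# instruction; a low-level language-conditioned policy turns it into motions.
# All names here are assumptions for illustration.

def run_hierarchical(env, high_level, low_level, replan_every=30, max_steps=3000):
    obs = env.reset()
    instruction = None
    for t in range(max_steps):
        if t % replan_every == 0:
            # Re-query the high-level model so the robot can self-correct.
            instruction = high_level(obs["image"])  # e.g., "clip the duct"
        action = low_level(obs, instruction)        # language-conditioned control
        obs, done = env.step(action)
        if done:
            break
```

The appeal of the split is that the same low-level training recipe can be reused across very different tasks (trail mix to surgery), with only the language supervision changing.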
We still lack a scalable recipe for RL post-training seeded with demonstration data. Many methods add an imitation loss, but this constrains learning too much. We propose using the demos only to perturb exploration -- it works really well! Paper: arxiv.org/abs/2506.07505
RL often struggles with poor sample efficiency, even with expert data. How can we address this? One approach is to incorporate an imitation loss, but that can overconstrain the policy. We propose leveraging prior data implicitly to guide more effective exploration. (1/5)
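For concreteness, here is one hedged sketch of what "demos perturb exploration, not the loss" could look like; the nearest-neighbor lookup and noise model below are my own illustrative assumptions, not necessarily the paper's scheme.

```python
import numpy as np

class DemoGuidedExploration:
    """Illustrative sketch: demonstrations steer which actions get explored,
    while the RL objective itself stays untouched."""

    def __init__(self, demo_states, demo_actions, mix_prob=0.3, noise_std=0.1):
        self.demo_states = np.asarray(demo_states)    # (N, state_dim)
        self.demo_actions = np.asarray(demo_actions)  # (N, action_dim)
        self.mix_prob = mix_prob                      # fraction of demo-guided steps
        self.noise_std = noise_std

    def explore(self, state, policy_action, rng):
        if rng.random() < self.mix_prob:
            # Perturb toward the action demonstrated in the most similar state.
            idx = np.argmin(np.linalg.norm(self.demo_states - state, axis=1))
            base = self.demo_actions[idx]
        else:
            base = policy_action                      # ordinary on-policy exploration
        return base + rng.normal(0.0, self.noise_std, size=base.shape)
```

Because the demos only shape data collection, the RL update itself (e.g., a standard actor-critic loss) carries no imitation term that could overconstrain the final policy.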
How can robots problem-solve in novel environments? We combine high-level reasoning from VLMs with low-level controllers to enable test-time problem solving. Paper & code: anniesch.github.io/vlm-pc/
How can robots autonomously handle ambiguous situations that require commonsense reasoning? *VLM-PC* provides adaptive high-level planning, so robots can get unstuck by exploring multiple strategies. Paper: anniesch.github.io/vlm-pc/
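A rough sketch of the control loop this implies is below; `query_vlm`, the skill set, and `env.step_skill` are invented placeholders, and the actual prompts and skills are in the paper.

```python
# Hedged sketch of VLM-guided high-level planning over low-level skills.

SKILLS = ["walk_forward", "back_up", "turn_left", "turn_right"]  # assumed skill set

def query_vlm(image, history):
    """Placeholder: prompt a vision-language model with the camera image and
    the skills tried so far; it returns the next skill to execute."""
    raise NotImplementedError

def run_episode(env, max_steps=500, replan_every=25):
    obs = env.reset()
    history = []  # past choices let the VLM notice it is stuck and switch strategy
    skill = None
    for t in range(max_steps):
        if t % replan_every == 0:
            skill = query_vlm(obs["image"], history)
            history.append(skill)
        obs, done = env.step_skill(skill)  # low-level controller executes the skill
        if done:
            break
    return history
```

Feeding the history back into the VLM is what enables commonsense recovery: if one skill keeps failing, the model can propose a different strategy instead of repeating itself.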
How do we make a scalable RL recipe for robots? We study batch online RL with demos. Key findings:
- iterative filtered imitation is insufficient
- diverse policy data is needed, e.g., from a diffusion policy
- policy extraction can hinder data diversity
Paper: pd-perry.github.io/batch-online-r…
Robotic models are advancing rapidly, but how do we scale their improvement? 🤖 We propose a recipe for batch online RL (offline training on batches of online rollouts) that enables policies to self-improve without the complications of fully online RL. More: pd-perry.github.io/batch-online-rl (1/8)
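In pseudocode, the recipe reads roughly like this; `collect_episode` and `train_diffusion_policy` are assumed callables, and details like filtering and weighting are simplified relative to the paper.

```python
# Hedged sketch of batch online RL: alternate on-robot batch data
# collection with offline retraining, keeping the dataset diverse.

def batch_online_rl(env, policy, dataset, collect_episode,
                    train_diffusion_policy, n_rounds=5, per_round=200):
    for _ in range(n_rounds):
        # 1) Roll out the current policy in a batch.
        episodes = [collect_episode(env, policy) for _ in range(per_round)]

        # 2) Keep the data diverse rather than keeping only a
        #    success-filtered subset (iterative filtered imitation
        #    alone is insufficient).
        dataset.extend(episodes)

        # 3) Retrain offline with an expressive policy class, e.g., a
        #    diffusion policy, that can represent that diversity.
        policy = train_diffusion_policy(dataset)
    return policy
```

Each round is plain offline training, so the loop sidesteps the infrastructure and stability complications of fully online RL.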
Most robot policies don't have any memory! This is because:
- policies often perform *worse* with past observations as input
- GPU memory and compute constraints
We address both to train long-context robot diffusion policies. 🤖 Paper & code: long-context-dp.github.io
Giving our robot policies history is crucial for solving a variety of daily tasks. However, diffusion policies get worse when history is added. 🤖 In our recent work, we show how adding an auxiliary loss that we name Past-Token Prediction (PTP), together with cached embeddings…
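A minimal sketch of a PTP-style auxiliary head is below; shapes, names, and the loss weighting are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Hedged sketch: alongside the usual denoising objective, the policy must
# reconstruct past action tokens from its cached observation embeddings,
# which pushes it to actually encode the history it is given.

class PTPHead(nn.Module):
    def __init__(self, embed_dim, action_dim, history_len):
        super().__init__()
        self.decoder = nn.Linear(embed_dim, action_dim * history_len)
        self.history_len = history_len
        self.action_dim = action_dim

    def forward(self, cached_emb):             # (B, embed_dim), cached once
        out = self.decoder(cached_emb)         # (B, history_len * action_dim)
        return out.view(-1, self.history_len, self.action_dim)

def total_loss(diffusion_loss, ptp_head, cached_emb, past_actions, w=0.1):
    # past_actions: (B, history_len, action_dim) ground-truth past tokens
    ptp_loss = nn.functional.mse_loss(ptp_head(cached_emb), past_actions)
    return diffusion_loss + w * ptp_loss
```

Caching the observation embeddings addresses the compute concern raised above: the encoder is not re-run over the full history at every step, so long contexts no longer blow up GPU memory.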
Introducing ✨Latent Diffusion Planning✨ (LDP)! We explore how to use expert, suboptimal, & action-free data. To do so, we learn a diffusion-based *planner* that forecasts latent states, and an *inverse-dynamics model* that extracts actions. w/ @_oleh @DorsaSadigh @chelseabfinn
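A sketch of how the two pieces might fit together at inference time (interfaces invented for illustration):

```python
import torch

# Hedged sketch of the planner / inverse-dynamics split: the diffusion
# planner forecasts future *latent* states (so it can be trained on
# action-free data), and the inverse-dynamics model recovers the action
# between consecutive latents (so it can also use suboptimal data).

@torch.no_grad()
def plan_and_act(encoder, latent_planner, inverse_dynamics, obs, horizon=16):
    z = encoder(obs)                              # current latent state z_t
    future = latent_planner.sample(z, horizon)    # [z_{t+1}, ..., z_{t+H}]
    actions = []
    for z_next in future:
        actions.append(inverse_dynamics(z, z_next))  # a_t from (z_t, z_{t+1})
        z = z_next
    return torch.stack(actions)
```

Separating "what should happen" (latents) from "how to make it happen" (actions) is what lets each component train on a different slice of the data.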
I’m in Singapore for #ICLR2025! Excited to present Improving Test-Time Search for LLMs with Backtracking Against In-Context Value Verifiers (openreview.net/pdf?id=ZXRKOAf…). Workshops:
- Reasoning and Planning for LLMs: Oral Session, April 28
- SSI-FM: Poster
Happy to chat/meet up!
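For anyone curious how backtracking against a value verifier might look procedurally, here is a hedged sketch (not the paper's exact algorithm); `propose_step` and `verifier_score` stand in for LLM calls.

```python
# Hedged sketch: extend a solution step by step, score each prefix with a
# value verifier, prune weak branches, and fall back to earlier prefixes
# when a branch dead-ends.

def backtracking_search(problem, propose_step, verifier_score,
                        max_depth=10, branch=4, threshold=0.5):
    stack = [[]]                                  # stack of step prefixes
    while stack:
        steps = stack.pop()
        if len(steps) == max_depth:
            return steps                          # accept a complete solution
        candidates = [propose_step(problem, steps) for _ in range(branch)]
        scored = [(verifier_score(problem, steps + [c]), c) for c in candidates]
        scored.sort(key=lambda sc: sc[0])         # best candidate is pushed last,
        for score, c in scored:                   # so it is explored first; weak
            if score >= threshold:                # prefixes are pruned, and popping
                stack.append(steps + [c])         # an earlier prefix is the backtrack
    return None                                   # exhausted every branch
```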
I'm giving two talks today/Sunday at #ICLR2025!
- Post-Training Robot Foundation Models (Robot Learning Workshop @ 12:50 pm)
- Robot Foundation Models with Open-Ended Generalization (Foundation Models in the Wild @ 2:30 pm)
Will cover π-0, Demo-SCORE, Hi Robot, & π-0.5.
My goal throughout my PhD has been to take robots out of the lab and into the real world. It was so special to be a part of this effort and see this dream become reality! Excited to keep pushing model capabilities—and, of course, keep playing with robots 🤖