Danny Driess

@DannyDriess

Research Scientist @physical_int. Formerly Google DeepMind

Joined August 2021

327 Following

4K Followers

Pinned

Danny Driess@DannyDriess · May 28

How to build vision-language-action models that train fast, run fast & generalize? In our new paper, we formalize & analyze the approach of our π-0.5 model & further improve it with a single stage recipe. Blog: pi.website/research/knowl… Paper: pi.website/download/pi05_…

Danny Driess Retweeted

Russ Tedrake@RussTedrake · Jul 9

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…

Danny Driess@DannyDriess · Jul 8

Had a blast on the Unsupervised Learning Podcast with @hausman_k! We covered the past, present, and future of robot learning 🤖 Big thanks to @jacobeffron for being a fantastic host!

Jacob Effron@jacobeffron · Jul 8

New Unsupervised Learning with @hausman_k & @DannyDriess (@physical_int) on building generalist robotics foundation models and: - What’s next in AI x robotics - Biggest outstanding questions - How they 10x’d model training speed - Open sourcing π 0 - Breakthroughs…

Danny Driess@DannyDriess · May 19

Check out our new work where we dissect various aspects of chain-of-thought (at both training and inference time) for robotics! Awesome work led by @verityw_

Will Chen@verityw_ · May 19

Embodied chain-of-thought reasoning (ECoT) is a powerful way to improve robot generalization & performance. But why is this the case, and how can that inform the design of learned robot policies? We investigate these questions in our latest work! ecot-lite.github.io 1/6

Danny Driess@DannyDriess · May 2

We auto-encode point tracks to automatically evaluate motion realism in generative video models. By inherently focusing on motion, our new metric (TRAJAN) correlates much better with human judgments of these models than appearance based metrics.

Kelsey Allen@KelseyRAllen · May 2

Humans can tell the difference between a realistic generated video and an unrealistic one – can models? Excited to share TRAJAN: the world’s first point TRAJectory AutoeNcoder for evaluating motion realism in generated and corrupted videos. 🌐 trajan-paper.github.io 🧵

Danny Driess@DannyDriess · Apr 22

Scaling data diversity, transfer between data sources, and a good training recipe were the main ingredients to allow robots to generalize to new homes!

Physical Intelligence@physical_int · Apr 22

We got a robot to clean up homes that were never seen in its training data! Our new model, π-0.5, aims to tackle open-world generalization. We took our robot into homes that were not in the training data and asked it to clean kitchens and bedrooms. More below⤵️

Danny Driess@DannyDriess · Apr 22

More insights: π-0.5 is trained to break tasks down into subtasks, before producing actual robot actions. It turns out that adding the subtask prediction data is useful, even if you query the model with the overall task directly.
