Carl Doersch
@CarlDoersch
Researcher at DeepMind
We present a new SOTA on point tracking via self-supervised training on real, unlabeled videos! BootsTAPIR achieves 67.4% AJ on TAP-Vid DAVIS with minimal architecture changes, and tracks 10K points on a 50-frame video in 6 seconds. PyTorch & JAX implementations on GitHub. bootstap.github.io
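For context, a Tracking-Any-Point model takes a video plus (frame, y, x) queries and returns per-frame (x, y) positions with visibility flags. Below is a minimal, shape-level sketch with a stub in place of the real model; the actual PyTorch/JAX entry points live in the GitHub repo, and the names here are hypothetical.

```python
import numpy as np

def track_points_stub(video, query_points):
    """Stand-in for a TAP model such as BootsTAPIR: returns constant tracks
    and full visibility so the shapes below are concrete and the script runs."""
    num_queries = query_points.shape[0]
    num_frames = video.shape[0]
    # (num_queries, num_frames, 2): predicted (x, y) position in every frame.
    tracks = np.tile(query_points[:, None, 2:0:-1], (1, num_frames, 1)).astype(np.float32)
    # (num_queries, num_frames): whether the point is visible (not occluded).
    visibles = np.ones((num_queries, num_frames), dtype=bool)
    return tracks, visibles

# A 50-frame RGB clip, resized to 256x256 as in the TAP-Vid benchmarks.
video = np.zeros((50, 256, 256, 3), dtype=np.uint8)
# Each query is (frame index, y, x): "track this pixel starting from this frame".
query_points = np.array([[0, 128.0, 64.0], [10, 200.0, 200.0]], dtype=np.float32)

tracks, visibles = track_points_stub(video, query_points)
print(tracks.shape, visibles.shape)  # (2, 50, 2) (2, 50)
```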
Humans can tell the difference between a realistic generated video and an unrealistic one – can models? Excited to share TRAJAN: the world’s first point TRAJectory AutoeNcoder for evaluating motion realism in generated and corrupted videos. 🌐 trajan-paper.github.io 🧵
What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here are a few examples – check out this thread 🧵 for more results!
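A loose sketch of what a motion prompt could look like as an input to a motion-conditioned generator (a hypothetical encoding for illustration; the paper's actual conditioning scheme may differ): the prompt is a set of point trajectories, here rasterized into a per-frame spatial map.

```python
import numpy as np

num_frames, height, width = 16, 64, 64
# One motion prompt: drag a point from (10, 10) to (50, 50) over the clip.
trajectory = np.stack([
    np.linspace(10, 50, num_frames),  # x over time
    np.linspace(10, 50, num_frames),  # y over time
], axis=-1)

# Rasterize the trajectory into a (frames, H, W) conditioning channel.
cond = np.zeros((num_frames, height, width), dtype=np.float32)
for t, (x, y) in enumerate(trajectory):
    cond[t, int(round(y)), int(round(x))] = 1.0

# video = generator(first_frame, text_prompt, cond)  # hypothetical call
print(cond.shape, cond.sum())  # (16, 64, 64) 16.0
```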
Want a robot to solve a task, specified in language? Generate a video of a person doing it, and then retarget the action to the robot with the help of point tracking! Cool collab with @mangahomanga during his student researcher stint at Google.
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! homangab.github.io/gen2act/ 1/n
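An illustrative sketch of the two-stage recipe described above (all function names are hypothetical stand-ins, not the released interface): first generate a human video from the instruction, then run a closed-loop policy conditioned on that video.

```python
import numpy as np

def generate_human_video(scene_image, instruction):
    """Stage 1 (stub): a video generation model renders a human performing
    the task described by `instruction` in the current scene."""
    return np.zeros((16, 128, 128, 3), dtype=np.uint8)  # dummy 16-frame clip

def policy(observation, generated_video):
    """Stage 2 (stub): a closed-loop policy conditioned on the generated
    human video (e.g. via point tracks extracted from it) outputs an action."""
    return np.zeros(7, dtype=np.float32)  # e.g. a 7-DoF end-effector action

scene = np.zeros((128, 128, 3), dtype=np.uint8)
human_video = generate_human_video(scene, "put the cup in the sink")
for step in range(100):
    obs = scene  # in practice: the robot's current camera image
    action = policy(obs, human_video)
    # robot.apply(action)  # execute and re-observe, closing the loop
```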
Want to make a difference with point tracking? The medical community needs help tracking tissue deformation during surgery! Participate in the STIR challenge (stir-challenge.github.io) at MICCAI, deadline in September.
We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real-world videos and 2.1 million metric 3D point trajectories, for the task of Tracking Any Point in 3D!
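A small sketch of what a metric 3D point trajectory looks like for the Tracking-Any-Point-in-3D task (illustrative shapes only, not the exact TAPVid-3D file format): each point gets a per-frame (x, y, z) position in metres plus a visibility flag.

```python
import numpy as np

num_frames = 120
# One trajectory: per-frame (x, y, z) in metres, in the camera coordinate frame.
trajectory_xyz = np.zeros((num_frames, 3), dtype=np.float32)
# Per-frame visibility: whether the point is visible (not occluded) at frame t.
visible = np.ones(num_frames, dtype=bool)

# A benchmark example pairs a video with many such trajectories:
example = {
    "video": np.zeros((num_frames, 256, 256, 3), dtype=np.uint8),
    "tracks_xyz": np.stack([trajectory_xyz] * 512),  # (512, 120, 3)
    "visibility": np.stack([visible] * 512),         # (512, 120)
}
print(example["tracks_xyz"].shape)
```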
Can you win the 2nd Perception Test Challenge? @eccvconf workshop: ptchallenge-workshop.github.io Diagnose audio-visual MLMs on their ability to model memory, physics, abstraction & semantics through 6 tasks: VQA, Point Tracking, Box Tracking, action/sound localisation - jointly! @GoogleDeepMind + win 💰
Just in time for CVPR, we've released code to generate "rainbow visualizations" from a set of point tracks: it semi-automatically segments foreground objects and corrects for camera motion. Try our colab demo at colab.sandbox.google.com/github/deepmin… (vid source youtube.com/watch?v=yuQFQ8…)
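A rough sketch of the idea behind the "rainbow visualization" (an assumption about the approach, not the released colab's exact pipeline): fit the dominant camera motion from the point tracks with a RANSAC homography, treat tracks that don't follow it as foreground, and color those with a rainbow colormap.

```python
import numpy as np
import cv2
from matplotlib import cm

num_points, num_frames = 100, 50
rng = np.random.default_rng(0)
start = rng.uniform(20.0, 230.0, size=(num_points, 2)).astype(np.float32)
# Background tracks: every point shifts by (2, 1) px per frame (a camera pan).
tracks = start[:, None, :] + np.arange(num_frames)[None, :, None] * np.array([2.0, 1.0])
# The first 10 tracks belong to a moving object: give them extra motion of their own.
tracks[:10] += np.arange(num_frames)[None, :, None] * np.array([0.0, 4.0])

# Fit the camera motion between frame 0 and frame 25; RANSAC inliers ~ background.
H, inlier_mask = cv2.findHomography(tracks[:, 0], tracks[:, 25], cv2.RANSAC, 3.0)
foreground = ~inlier_mask.ravel().astype(bool)

# Assign each foreground track a rainbow color (here simply spread over the hue wheel).
colors = cm.hsv(np.linspace(0.0, 1.0, foreground.sum()))
print(f"{foreground.sum()} of {num_points} tracks flagged as foreground")
```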
📢 Perception Test @ICCVConference now w/ test set. We invite submissions to the 1st Perception Test; winners announced at #ICCV2023 in Paris. 6 leaderboards to test multimodal models' ultimate perception capabilities. Workshop: ptchallenge-workshop.github.io GitHub: github.com/deepmind/perce… 1/5