Yiming Dou

@_YimingDou

Ph.D. student at UMich | B.Eng. from SJTU | Computer Vision, Multimodal, Robotics

Shanghai ↔️ Ann Arbor

Joined March 2022

895 Following

757 Followers

Pinned

Yiming Dou @_YimingDou · Jun 13

Ever wondered how a scene sounds👂 when you interact👋 with it? Introducing our #CVPR2025 work "Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes" -- we make 3D scene reconstructions audibly interactive! yimingdou.com/hearing_hands/

7.0K

Yiming Dou Retweeted

Paul Liang @pliang279 · Jun 18

Despite much progress in AI, the ability for AI to 'smell' like humans remains elusive. Smell AIs 🤖👃 can be used for allergen sensing (e.g., peanuts or gluten in food), hormone detection for health, safety & environmental monitoring, quality control in manufacturing, and more.…

132

15.0K

Yiming Dou Retweeted

Linyi Jin @jin_linyi · Jun 13

Hello! If you are interested in dynamic 3D or 4D, don't miss the oral session 3A at 9 am on Saturday: @zhengqi_li will be presenting "MegaSaM", I'll be presenting "Stereo4D", and @QianqianWang5 will be presenting "CUT3R".

1.0K

Yiming Dou Retweeted

Ayush Shrivastava @ayshrv · Jun 13

Excited to share our CVPR 2025 paper on cross-modal space-time correspondence! We present a method to match pixels across different modalities (RGB-Depth, RGB-Thermal, Photo-Sketch, and cross-style images) — trained entirely using unpaired data and self-supervision. Our…

121

8.0K

Yiming Dou Retweeted

Jeongsoo Park @jespark0 · Jun 13

Can AI image detectors keep up with new fakes? Mostly, no. Existing detectors are trained using a handful of models. But there are thousands in the wild! Our work, Community Forensics, uses 4800+ generators to train detectors that generalize to new fakes. #CVPR2025 🧵 (1/5)

1.0K

Yiming Dou Retweeted

Daniel Geng @dangengdg · Jun 12

Hello! If you like pretty images and videos and want a rec for a CVPR oral session, you should def go to Image/Video Gen, Friday at 9am: I'll be presenting "Motion Prompting", @RyanBurgert will be presenting "Go with the Flow", and @ChangPasca1650 will be presenting "LookingGlass".

5.0K

Yiming Dou Retweeted

Chris Rockwell @_crockwell · Apr 25

Ever wish YouTube had 3D labels? 🚀 Introducing 🎥 DynPose-100K 🎥, an Internet-scale collection of diverse videos annotated with camera pose! Applications include camera-controlled video generation 🤩 and learned dynamic pose estimation 😯 Download: huggingface.co/datasets/nvidi…

179

42.0K

Yiming Dou Retweeted

Yuanchen Ju @ju_yuanchen · Apr 22

🧩 #CVPR2025 🌷 Introducing Two By Two ✌️: The First Large-Scale Daily Pairwise Assembly Dataset with SE(3)-Equivariant Pose Estimation. 🤖 2BY2 helps robots master daily 3D assembly tasks, like plugging sockets or arranging flowers, across diverse objects! 🐨 Co-led by @yuqi_Beijing

8.0K

Yiming Dou @_YimingDou · Mar 27

Thanks to @OpenAI, got a chance to grow up again in Ghibli anime🤗

614

Yiming Dou Retweeted

Sarah Jabbour @SarahJabbour_ · Jan 15

I’m on the PhD internship market for Spr/Summer 2025! I have experience in multimodal AI (EHR, X-ray, text), explainability for image models w/ genAI, clinician-AI interaction (surveyed 700+ doctors), and tabular foundation models. Please reach out if you think there’s a fit!

6.0K

Yiming Dou Retweeted

Yuanchen Ju @ju_yuanchen · Dec 8

🍌 We present DenseMatcher! 🤖 DenseMatcher enables robots to acquire generalizable skills across diverse object categories after seeing only one demo, by finding correspondences between 3D objects, even those with different types, shapes, and appearances.

116

24.0K

Yiming Dou Retweeted

Daniel Geng @dangengdg · Dec 4

What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!

147

671

334

93.0K

Yiming Dou Retweeted

Junyi Zhang @junyi42 · Oct 7

Excited to share MonST3R! -- a simple way to estimate geometry from unposed videos of dynamic scenes. We achieve competitive results on several downstream tasks (video depth, camera pose) and believe this is a promising step toward feed-forward 4D reconstruction. monst3r-project.github.io

141

735

316

130.0K

Yiming Dou Retweeted

Zichen Wang @Zichen2501 · Sep 30

Differentiable rendering made SIMPLE❗️ Differentiating physically based renderers is hard: Dirac-delta discontinuities arise at object silhouettes. Our #SIGGRAPHAsia2024 work shows how a simple relaxation can save the day, enabling easy 3D reconstruction and relighting! (1/N)

347

211

44.0K

Yiming Dou Retweeted

Ayush Shrivastava @ayshrv · Oct 1

We present Global Matching Random Walks, a simple self-supervised approach to the Tracking Any Point (TAP) problem, accepted to #ECCV2024. We train a global matching transformer to find cycle-consistent tracks through video via contrastive random walks (CRW).

17.0K