Yin Cui
@YinCuiCV
Research Scientist @NVIDIA | Formerly @Google, @Cornell | Views are my own
Introducing the Describe Anything Model (DAM), a powerful Multimodal LLM that generates detailed descriptions for user-specified regions in images or videos using points, boxes, scribbles, or masks. Open-source code, models, demo, data, and benchmark at: describe-anything.github.io
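The region prompts DAM accepts (points, boxes, scribbles, masks) all denote a pixel region of the image. As a minimal illustrative sketch — not DAM's actual preprocessing, whose details live in the open-source repo — a point or box prompt can be rasterized into the same binary-mask form a mask prompt already has; the function and prompt schema below are hypothetical:

```python
import numpy as np

def region_prompt_to_mask(prompt, shape):
    """Rasterize a user region prompt into a binary mask.

    Illustrative sketch only; DAM's real pipeline may differ.
    prompt: {"type": "point"|"box"|"mask", ...} (hypothetical schema)
    shape:  (height, width) of the target image.
    """
    h, w = shape
    mask = np.zeros((h, w), dtype=bool)
    if prompt["type"] == "point":
        # Mark a small square neighborhood around the clicked pixel.
        x, y = prompt["xy"]
        r = prompt.get("radius", 3)
        mask[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1] = True
    elif prompt["type"] == "box":
        # Fill the axis-aligned box given as (x0, y0, x1, y1).
        x0, y0, x1, y1 = prompt["xyxy"]
        mask[y0:y1, x0:x1] = True
    elif prompt["type"] == "mask":
        # Already a mask; just coerce to boolean.
        mask = np.asarray(prompt["mask"], dtype=bool)
    return mask
```

A scribble prompt would rasterize the same way (stroke pixels set to True); it is omitted here for brevity.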
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
Excited to share that Describe Anything has been accepted at ICCV 2025! 🎉 Describe Anything Model (DAM) is a powerful Multimodal LLM that generates detailed descriptions for user-specified regions in images or videos using points, boxes, scribbles, or masks. Open-source code,…
Nvidia just dropped Describe Anything on Hugging Face: detailed localized image and video captioning.
We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates. 📷 Watch the repo walkthrough…
Introducing Gemini CLI, a light and powerful open-source AI agent that brings Gemini directly into your terminal. >_ Write code, debug, and automate tasks with Gemini 2.5 Pro with industry-leading high usage limits at no cost.
🎉 ComfyUI now natively supports NVIDIA’s Cosmos-Predict2 model family! Cosmos-Predict2 brings high-fidelity, physics-aware image generation and Video2World (image-to-video) generation. Another reality inside ComfyUI!
🚀 We're releasing Cosmos-Predict2 — our developer-first, top-performing world foundation models for Physical AI! 🔗 huggingface.co/blog/nvidia/co… 👩💻 Pretrained weights, inference, and post-training scripts available. 💬 Try it out and share your feedback! - code:…
🚀 Introducing Cosmos-Predict2! Our most powerful open video foundation model for Physical AI. Cosmos-Predict2 significantly improves upon Predict1 in visual quality, prompt alignment, and motion dynamics—outperforming popular open-source video foundation models. It’s openly…
Happy to share our work PartPacker: We enable one-shot image-to-3D generation with any number of parts! Project page: research.nvidia.com/labs/dir/partp… Demo: huggingface.co/spaces/nvidia/… Code: github.com/NVlabs/PartPac…
Cosmos-Predict2 is our latest open video foundation model for Physical AI! research.nvidia.com/labs/dir/cosmo… If you’re at #cvpr2025, I would also love to chat with you about world models!
The Vision Meets Physics workshop just started! Come join us!
Join us at the 1st workshop on Vision Meets Physics: Synergizing Physical Simulation and Computer Vision at #CVPR2025 tomorrow! Thought-provoking talks and expert insights from leading researchers that YOU CANNOT MISS! 📍104A ⏰ 8:45am June 12th visionmeetphysics.github.io
The WorldModelBench workshop is happening tomorrow (June 12th) at #CVPR2025! We have an exciting series of talks, do attend! Place: Room 108 Time: Morning Session #NVIDIAResearch
Join us at the WorldModelBench workshop at #CVPR2025 where we'll tackle systematic evaluation of World Models! Focus: benchmarks, metrics, downstream tasks, and safety. Submit papers now: worldmodelbench.github.io
If you are attending #CVPR2025 tomorrow, please visit two highly relevant workshops organized by our team members: - Vision Meets Physics: visionmeetphysics.github.io - Benchmarking World Models: worldmodelbench.github.io
Many core contributors are attending #CVPR2025. Let’s discuss the future of world models!
Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality — judging videos as real or fake! Check out the resources👇 Paper: arxiv.org/abs/2503.15558 Huggingface: huggingface.co/nvidia/Cosmos-… Code: github.com/nvidia-cosmos/… Project page: research.nvidia.com/labs/dir/cosmo… (1/n)