Yuncong Yang
@YuncongYY
First-year CS PhD student at UMass Amherst, advised by @gan_chuang | Intern @MSFTResearch
Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/
Just paid ¥4.99 to a site that "predicts" NeurIPS acceptance from your ratings and confidence scores. Total scam: basically a random number generator. 🤡 I should build my own startup for this. Pretty sure I could make a fortune off researchers' anxiety these days. #NeurIPS2025

VLMs struggle badly to interpret 3D from 2D observations, but what if they had a good mental model of the world? Check out our MindJourney, a test-time scaling approach for spatial reasoning in the 3D world. Without any task-specific training, MindJourney imagines (acts mentally) step-by-step…
Spatial reasoning from a single image is inherently difficult, but it becomes significantly easier when leveraging a controlled world model, analogous to the mental models used by humans! Code: github.com/UMass-Embodied…
VLMs often struggle with physical reasoning tasks such as spatial reasoning. Excited to share how we can use world models + test-time search to zero-shot improve spatial reasoning in VLMs!
MindJourney Test-Time Scaling with World Models for Spatial Reasoning
Thanks @_akhaliq for sharing our work! MindJourney fuses a world model with any VLM, so the model can first imagine walking around before it answers. From “one snapshot” to “what if I stand over there?”—and suddenly spatial reasoning hits SOTA. 🚀 Project Page:…
You can install anycoder as a Progressive Web App on your device. Visit huggingface.co/spaces/akhaliq…, click Settings in the footer, follow the instructions, and click the install button in your browser's address bar.
📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI! 👉 …vision-language-embodied-ai.github.io 🦾Co-organized with an incredible team → @fredahshi · @maojiayuan · @DJiafei · @ManlingLi_ · David Hsu · @Kordjamshidi 🌌 Why Space & SpaVLE? We…
I hope humans and robots live peacefully in the Virtual Community. Great work by @QinhongZhou! #DetroitBecomeHuman #AI #Robotics
World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or…
Nashville’s food is hands-down the highlight of CVPR for me so far. Sending a meat-lover’s salute to the South 🤤 P1 Hattie B P2 Peg Leg Porker #CVPR2025


Watched the notorious @celtics game while working on my NeurIPS submission. It took me 2½ hours to realize there's something even more painful than rushing a NeurIPS paper. #Celtics #NeurIPS2025
TesserAct is out on Hugging Face: Learning 4D Embodied World Models