Chuang Gan
@gan_chuang
Faculty Member at UMass Amherst; Principal researcher at MIT-IBM Watson AI Lab; Homepage: https://embodied-agi.cs.umass.edu/
World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or…
Spatial reasoning from a single image is inherently difficult, but it becomes significantly easier when leveraging a controlled world model, analogous to the mental models used by humans! Code: github.com/UMass-Embodied…
Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/
Professor Zhao 👍👍👍
I'll be around the ICML venue this afternoon. Message me if you want to meet! These days, I think about reasoning and RL. Also happy to talk about academia vs. industry (I think the lack of compute in academia is a feature, not a bug), and about faculty and PhD student recruiting at UMass.
Excited to be at ICML to present four papers and recruit new faculty for UMass Amherst! We're hiring in generative AI, NLP, and 3D vision—please feel free to reach out if you're interested!
Thank you to AK for introducing our new work on Fast 3D Language Gaussian Splatting! Please try our code: github.com/ZhaoYujie2002/…
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
Building a World Simulator is my best bet for achieving embodied AGI! I'm truly inspired and grateful to see the next generation of robotics leaders — @zhou_xian_, @theo_gervet, @Zhenjia_Xu, @johnsonwang0810, @yilingq97, and many others — boldly carrying this vision forward and…
Today, we’re launching Genesis AI, a global physical AI lab and full-stack robotics company, to build generalist robots and unlock unlimited physical labor. We’re backed by $105M in seed funding from @EclipseVentures, @khoslaventures, @Bpifrance, HSG, and visionaries…
🧠 LLMs think too much—and waste tokens! Can we precisely control how long they reason? Introducing Budget Guidance — a thinking-budget-conditioned generation method that controls how long an LLM thinks! We use a lightweight predictor to estimate the remaining reasoning…

VLM can think visually without generating pixels! 📢 We introduce Machine Mental Imagery (Mirage): a new framework that enables VLM to imagine using latent visual…

Attending RSS for the first time and giving a talk tomorrow at the Learning Structured World Models for Robotic Manipulation workshop! At midnight, I made a last-minute crazy decision to change my talk content to Virtual Community — to honor the incredible hard work of my…
Digital twin of (the future of) our physical world?
Wow, this is so cool! I have been dreaming of building agents that can interact with humans via language and with the world via physical interaction (locomotion, manipulation, etc.). Definitely a great stepping stone and playground!
guys, real geospatial data is a total goldmine for digital agents. step away from the web browser and get real. (we explored a bit in virl-platform.github.io, but building a simulation-ready pipeline like this could take things way further)
Virtual Community provides an online pipeline that automatically generates 3D scenes from real geospatial data, performing comprehensive cleaning and enhancement of both geometry and texture — including mesh simplification, texture refinement, object placement, and automatic…
🤖Can world models quickly adapt to new environments with just a few interactions? Introducing AdaWorld 🌍 — a new approach to learning world models conditioned on continuous latent actions extracted from videos via self-supervision! It enables rapid adaptation, efficient…