Tianmin Shu
@tianminshu
assistant professor @JHUCompSci & @JHUCogSci | director of SCAI lab | working on machine social intelligence, embodied AI, and computational social cognition
🚀 Excited to introduce SimWorld: an embodied simulator for infinite photorealistic world generation 🏙️ populated with diverse agents 🤖
If you are at #CVPR2025, come check out the live demo 👇
Jun 14, 12:00-1:00 pm, JHU booth, ExHall B
Jun 15, 10:30 am-12:30 pm, #7, ExHall B
💥💥BANG! Experience the future of gaming with our real-time world model for video games!🕹️🕹️ Not just PLAY—but CREATE! Introducing Mirage, the world’s first AI-native UGC game engine. Now featuring real-time playable demos of two games: 🏙️ GTA-style urban chaos 🏎️ Forza…
World modeling is more than next-state prediction. Inspired by cognitive studies of human intuitive physics and spatial reasoning, we introduce WM-ABench, a diagnostic benchmark for evaluating fundamental aspects of world modeling in VLMs.
🤔 Have @OpenAI o3, Gemini 2.5, and Claude 3.7 formed an internal world model to understand the physical world, or do they just align pixels with words? We introduce WM-ABench, the first systematic evaluation of VLMs as world models. Using a cognitively inspired framework, we test 15 SOTA…
We spent an additional three months refining AutoToM and making exciting updates. Here's a summary✨: 1. In addition to achieving SOTA performance on five benchmarks, we conducted further experiments showing that (a) AutoToM produces human-like confidence estimates as…
How to achieve human-level open-ended machine Theory of Mind? Introducing #AutoToM: a fully automated and open-ended ToM reasoning method combining the flexibility of LLMs with the robustness of Bayesian inverse planning, achieving SOTA results across five benchmarks. 🧵[1/n]
🚨 New robot manipulation benchmark 🚨
We will present the PartInstruct benchmark at #RSS2025, which includes a diverse & richly annotated dataset and a diagnostic test suite for fine-grained manipulation, where robots must follow part-level instructions like: “Pick up the…
🚀 New robot manipulation benchmark
How to teach robots to reason about and interact with relevant object parts for a given fine-grained manipulation task? To address this challenge, our #RSS2025 paper introduces PartInstruct, the first large-scale benchmark for fine-grained…
Happening now at #7, ExHall B. Come check out our demo!
Thrilled to finally share SimWorld — the result of over a year of work by the team. Simulators have been foundational for embodied AI research (I’ve worked with AI2Thor, CARLA, Genesis…), and SimWorld pushes this further with photorealistic Unreal-based rendering and scalable…
Join us tomorrow for the 1st Workshop on Humanoid Agents! We have an exciting lineup: @xiaolonw @xavierpuigf @GuanyaShi @GerardPonsMoll1 @blacksquirrel__ @tianminshu @petitegeek @xbpeng4 📍 Room 101 D, Music City Center 🔗 humanoid-agents.github.io @CVPR #CVPR2025
Excited to announce the Artificial Social Intelligence Workshop @ ICCV 2025 @ICCVConference
Join us in October to discuss the science of social intelligence and algorithms to advance socially intelligent AI! Discussion will focus on reasoning, multimodality, and embodiment.
Don’t miss this awesome workshop that @tianminshu is organizing! 🤖
Excited to announce the 1st Workshop on Continual Robot Learning from Humans @ #RSS2025 in LA! We're bringing together interdisciplinary researchers to explore how robots can continuously learn through human interactions! Full details: …-robot-learning-from-humans.github.io @RoboticsSciSys
When you try out tasks in an AI benchmark, you may not always agree with the “ground truth” labels. In our recent position paper, we found that there can be systematic disagreement between human judgment and the provided labels. We then argued for a more human-like evaluation
Many studies suggest AI has achieved human-like performance on various cognitive tasks. But what is “human-like” performance? In our new paper, we conducted a human re-labeling of several popular AI benchmarks and found widespread biases and flaws in task and label designs. We make 5…