Xindi Wu
@cindy_x_wu
PhD student @PrincetonCS | Interning @nvidia | Data-centric multimodal ML | prev @roboVisionCMU @CMU_Robotics | @RealityLabs @Snapchat | 🏎️
Introducing COMPACT: COMPositional Atomic-to-complex Visual Capability Tuning, a data-efficient approach to improve multimodal models on complex visual tasks without scaling data volume. 📦 arxiv.org/abs/2504.21850 1/10

Excited to share recent work with @kaihuac5 and @RamananDeva where we learn to do novel view synthesis for dynamic scenes in a self-supervised manner, only from 2D videos! webpage: cog-nvs.github.io arxiv: arxiv.org/abs/2507.12646 code (soon): github.com/Kaihua-Chen/co…
🚀 Launch day! The NeurIPS 2025 PokéAgent Challenge is live. Two tracks: ① Showdown Battling – imperfect-info, turn-based strategy ② Pokémon Emerald Speedrunning – long-horizon RPG planning 5M labeled replays • starter kit • baselines. Bring your LLM, RL, or hybrid…
Can data owners & LM developers collaborate to build a strong shared model while each retaining data control? Introducing FlexOlmo💪, a mixture-of-experts LM enabling: • Flexible training on your local data without sharing it • Flexible inference to opt in/out your data…
Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵
🤔 Feel like your AI is bullshitting you? It’s not just you. 🚨 We quantified machine bullshit 💩 Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshit—and Chain-of-Thought reasoning just makes it worse! 🔥 Time to rethink AI alignment.
📢 Our ICCV 2025 Workshop on Curated Data for Efficient Learning is accepting submissions! Proceedings track deadline: July 7, 2025. All other submissions deadline: August 29, 2025. curateddata.github.io Join us this October in Hawaii! 🌺@ICCVConference
Call for Papers! Excited to announce our Workshop on Curated Data for Efficient Learning! #ICCV2025 The workshop seeks to advance the understanding and development of data-centric techniques that improve the efficiency of large-scale training. Deadline: July 7, 2025 curateddata.github.io
We are excited to share Cosmos-Drive-Dreams 🚀 A bold new synthetic data generation (SDG) pipeline powered by world foundation models—designed to synthesize rich, challenging driving scenarios at scale. Models, code, dataset, and toolkit are released. Website:…
🔍 Recent job posting from our (newly renamed) Spatial Intelligence Lab Come work at NVIDIA on cutting-edge research at the intersection of Machine Learning, Computer Vision, and Computer Graphics! Apply here: nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAEx…
🔍What Does It Mean to Represent a Vision Dataset? 🧭 Our answer: Wasserstein barycenters. #ICCV2025 Several years ago, I learnt about Wasserstein barycenters—a concept from optimal transport that computes the geometric average of distributions. Unlike KL or MMD, which collapse…
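As a toy illustration of the idea (not the paper's method): for 1D empirical distributions with equal sample counts, the Wasserstein-2 barycenter has a simple closed form — average the sorted samples, i.e., average the quantile functions. The function name below is hypothetical.

```python
def wasserstein2_barycenter_1d(samples_a, samples_b, w=0.5):
    """W2 barycenter of two 1D empirical distributions with equal
    sample counts: interpolate the sorted samples (quantile functions).
    w is the weight on the second distribution."""
    a, b = sorted(samples_a), sorted(samples_b)
    assert len(a) == len(b), "equal sample counts assumed"
    return [(1 - w) * x + w * y for x, y in zip(a, b)]

# The barycenter transports mass along optimal paths, so it stays a
# spread-out distribution midway between the inputs rather than
# collapsing onto either one (unlike KL- or MMD-style averages).
bary = wasserstein2_barycenter_1d([0.0, 1.0, 2.0], [4.0, 5.0, 6.0])
print(bary)  # → [2.0, 3.0, 4.0]
```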
📢📢 "Align Your Flow: Scaling Continuous-Time Flow Map Distillation" New flow map framework for state-of-the-art few-step generation, w/ the amazing @amsabour and @FidlerSanja. 🔥 Project page: research.nvidia.com/labs/toronto-a… 📜 Paper: arxiv.org/abs/2506.14603 🧵Thread below... (1/n)
We're going to update the SWE-bench leaderboards soon: lots of new submissions, including 3 systems for SWE-bench Multimodal :) We will also release SWE-agent Multimodal. w/ @jyangballin @_carlosejimenez @KLieret
🚀 Introducing LeVERB, the first 𝗹𝗮𝘁𝗲𝗻𝘁 𝘄𝗵𝗼𝗹𝗲-𝗯𝗼𝗱𝘆 𝗵𝘂𝗺𝗮𝗻𝗼𝗶𝗱 𝗩𝗟𝗔 (upper- & lower-body), trained on sim data and zero-shot deployed. Addressing interactive tasks: navigation, sitting, locomotion with verbal instruction. 🧵 ember-lab-berkeley.github.io/LeVERB-Website/
There are many KV cache-reduction methods, but a fair comparison is challenging. We propose a new unified metric called “critical KV footprint”. We compare existing methods and propose a new one - PruLong, which “prunes” certain attn heads to only look at local tokens. 1/7
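A back-of-the-envelope sketch of why pruning heads to local attention shrinks the KV cache: heads restricted to a sliding window only cache `window` tokens instead of the full sequence. The formula and numbers below are illustrative assumptions, not the paper's exact "critical KV footprint" metric.

```python
def kv_footprint_bytes(n_layers, n_heads, local_heads, seq_len,
                       window, head_dim, bytes_per_elem=2):
    """Approximate KV-cache size when `local_heads` of the `n_heads`
    per layer retain only a sliding window of `window` tokens.
    The leading factor of 2 covers both the K and V tensors."""
    global_heads = n_heads - local_heads
    tokens_cached = global_heads * seq_len + local_heads * min(window, seq_len)
    return 2 * n_layers * tokens_cached * head_dim * bytes_per_elem

# Hypothetical 32-layer, 32-head model at a 128k context, fp16 cache:
full = kv_footprint_bytes(32, 32, 0, 128_000, 4096, 128)
pruned = kv_footprint_bytes(32, 32, 24, 128_000, 4096, 128)
print(pruned / full)  # → 0.274, i.e. ~73% smaller KV cache
```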
Customizing your LLMs in seconds using prompts🥳! Excited to share our latest work with @HPCAILab, @VITAGroupUT, @k_schuerholt, @YangYou1991, @mmbronstein, @damianborth: Drag-and-Drop LLMs (DnD). Two features: tuning-free, and comparable to or even better than full-shot tuning. (🧵1/8)
Tired of over-optimized generations that stray too far from the base distribution? We present SLCD: Supervised Learning based Controllable Diffusion, which (provably) solves the KL constrained reward maximization problem for diffusion through supervised learning! (1/n)
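For context on the objective: KL-constrained reward maximization has a well-known closed-form optimum — tilt the base distribution by exp(r/β), so larger β keeps generations closer to the base. A discrete-distribution sketch of that closed form (illustrative only, not the SLCD algorithm):

```python
import math

def kl_regularized_optimum(base_probs, rewards, beta):
    """argmax_q E_q[r] - beta * KL(q || p0) over discrete distributions
    has the closed form q*(x) ∝ p0(x) * exp(r(x) / beta)."""
    weights = [p * math.exp(r / beta) for p, r in zip(base_probs, rewards)]
    z = sum(weights)  # normalizing constant
    return [w / z for w in weights]

# Reward favors the third outcome; mass shifts toward it, but the
# base distribution still anchors the result (more so as beta grows).
q = kl_regularized_optimum([0.5, 0.3, 0.2], [0.0, 0.0, 1.0], beta=0.5)
print([round(p, 3) for p in q])
```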
Despite much progress in AI, the ability for AI to 'smell' like humans remains elusive. Smell AIs 🤖👃can be used for allergen sensing (e.g., peanuts or gluten in food), hormone detection for health, safety & environmental monitoring, quality control in manufacturing, and more.…
You’re watching a few rounds of poker. ♠️♠️♠️♠️ The cards look normal — but the outcomes don’t.♦️ No one explains the rules. You just see hands play out. -- Can you figure out what’s going on? 🎯 That’s the setup for LLMs. Recently, there have been heated discussions on…
Can LLMs learn when not to answer, and does reasoning finetuning help or hurt that ability? It turns out knowing when to say “I don’t know” is still a hard unsolved problem. Check out my awesome mentor @polkirichenko & team’s new paper! 🎉
Excited to release AbstentionBench -- our paper and benchmark on evaluating LLMs’ *abstention*: the skill of knowing when NOT to answer! Key finding: reasoning LLMs struggle with unanswerable questions and hallucinate! Details and links to paper & open source code below! 🧵1/9
My amazing partner @tarashakhurana is presenting at Poster #174! #CVPR2025 Go check it out!
Excited to attend my first #CVPR2025 conference! My plans: - June 11: organizing DemoDiv workshop (room 213) - June 11, 2pm: presenting a shared talk on biases in data selection with @orussakovsky @cindy_x_wu @will_hs_hwang at NeXD workshop (room 201B) - June 12, 10:30am:…