Homanga Bharadhwaj
@mangahomanga
Research Scientist @AIatMeta. Prev: PhD @CMU_Robotics, MSc @UofT, BTech @IITKanpur. Hater of hot takes. Lover of analyses.
Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! homangab.github.io/gen2act/ 1/n
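For readers who want the shape of that pipeline, here is a minimal sketch of the two-stage idea; every function and class name below is a hypothetical stand-in, not the Gen2Act code.

```python
# Minimal sketch of a Gen2Act-style two-stage pipeline (hypothetical stand-ins,
# not the released code): (1) generate a video of a human doing the instructed
# task in the current scene, (2) run a closed-loop policy conditioned on that video.
import numpy as np


def generate_human_video(scene_image: np.ndarray, instruction: str,
                         num_frames: int = 16) -> np.ndarray:
    """Stand-in for a video generation model that 'imagines' a human
    performing the instructed task in the given scene."""
    return np.repeat(scene_image[None], num_frames, axis=0)  # dummy frames


class VideoConditionedPolicy:
    """Stand-in for a closed-loop policy conditioned on the generated video."""

    def act(self, observation: np.ndarray, generated_video: np.ndarray) -> np.ndarray:
        # e.g., a 6-DoF end-effector delta plus a gripper command
        return np.zeros(7)


def rollout(env, instruction: str, policy: VideoConditionedPolicy, horizon: int = 100):
    obs = env.reset()
    video = generate_human_video(obs, instruction)   # stage 1: imagine the task once
    for _ in range(horizon):                         # stage 2: closed-loop execution
        obs, done = env.step(policy.act(obs, video))
        if done:
            break
```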
Exciting analysis on diffusion vs. auto-regressive modeling, with a surprisingly clean takeaway! Hope to see similar analyses in more sequential-decision-making-y / robotics-y problems.
🚨 The era of infinite internet data is ending, so we ask: 👉 What’s the right generative modelling objective when data—not compute—is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
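For context, the two objectives being compared are, in generic form (not the thread's exact notation), the autoregressive next-token likelihood and the diffusion denoising loss:

```latex
% Autoregressive: exact log-likelihood via next-token prediction
\mathcal{L}_{\mathrm{AR}}(\theta) = -\,\mathbb{E}_{x \sim \mathcal{D}} \sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}\right)

% Diffusion: denoising objective over noise levels, with x_\tau = \alpha_\tau x + \sigma_\tau \epsilon
\mathcal{L}_{\mathrm{Diff}}(\theta) = \mathbb{E}_{x \sim \mathcal{D},\; \tau,\; \epsilon \sim \mathcal{N}(0, I)} \left[\, w(\tau)\, \left\| \epsilon - \epsilon_\theta\!\left(x_\tau, \tau\right) \right\|^2 \right]
```

One common intuition for the data-constrained takeaway is that the diffusion objective revisits each example under many noise levels, so repeated passes over a small dataset keep providing distinct training signal.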
Always exciting to see clever use of pre-trained diffusion models for hard applications! Cool work from friends at @CMU_Robotics
Excited to share recent work with @kaihuac5 and @RamananDeva where we learn to do novel view synthesis for dynamic scenes in a self-supervised manner, only from 2D videos! webpage: cog-nvs.github.io arxiv: arxiv.org/abs/2507.12646 code (soon): github.com/Kaihua-Chen/co…
Join us for the Robotics World Modeling Workshop at CoRL 2025! We'll have exciting (spicy) discussions about the types of models to build: implicit vs. explicit, 2D vs. 3D, the role of video generation... We are accepting papers with a July 13 deadline.
🤖🌎 We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io
As a researcher, it is immensely satisfying when the community tackles open problems from your previous work! In Gen2Act last year, we showed how video generation models can be used zero-shot for manipulation. This paper takes the idea further via richer motion cues.
Research arc: ⏪ 2 yrs ago, we introduced VRB: learning from hours of human videos to cut down teleop (Gibson🙏) ▶️ Today, we explore a wilder path: robots deployed with no teleop, no human demos, no affordances. Just raw video generation magic 🙏 Day 1 of faculty life done! 😉…
Generalist manipulation policies often struggle out-of-the-box in unseen scenarios. Our work DemoDiffusion allows guiding a pre-trained diffusion policy (e.g. Pi-0) via a single human demonstration to perform new tasks in new environments. Work led by @sungj1026 and @mangahomanga
Presenting DemoDiffusion: an extremely simple approach that enables a pre-trained 'generalist' diffusion policy to follow a human demonstration for a novel task at inference time. One-shot human imitation *without* requiring any paired human-robot data or online RL 🙂 1/n
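A hedged sketch of one way such guidance can work (hypothetical interfaces, not necessarily the paper's exact procedure): retarget the human hand trajectory to a robot end-effector trajectory, then let the pre-trained diffusion policy refine it by denoising from a partially noised version of that trajectory instead of from pure noise.

```python
# Hedged sketch of guiding a pre-trained diffusion policy with one human demo
# (hypothetical interfaces): retarget the human trajectory, partially noise it,
# and denoise with the pre-trained policy so the output stays near the demo
# while still respecting the policy's learned prior.
import numpy as np


def retarget_human_to_robot(hand_poses: np.ndarray) -> np.ndarray:
    """Stand-in: map human hand poses (T, 7) to robot end-effector poses (T, 7)."""
    return hand_poses.copy()


class PretrainedDiffusionPolicy:
    """Stand-in for a generalist diffusion policy (e.g., a Pi-0-style model)."""

    num_steps = 50

    def add_noise(self, traj: np.ndarray, k: int) -> np.ndarray:
        alpha = 1.0 - k / self.num_steps          # toy noise schedule
        return alpha * traj + (1 - alpha) * np.random.randn(*traj.shape)

    def denoise_from(self, noisy_traj: np.ndarray, obs: np.ndarray, k: int) -> np.ndarray:
        return noisy_traj                          # stand-in for k reverse diffusion steps


def one_shot_adapt(policy: PretrainedDiffusionPolicy, human_demo: np.ndarray,
                   obs: np.ndarray, k: int = 20) -> np.ndarray:
    ee_traj = retarget_human_to_robot(human_demo)  # kinematic retargeting of the demo
    noisy = policy.add_noise(ee_traj, k)           # start partway up the noise chain
    return policy.denoise_from(noisy, obs, k)      # refine with the pre-trained prior
```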
Excellent work from @agiachris @leto__jean et al. on data curation for manipulation. Strong theoretical grounding combined with exciting empirical takeaways!
What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)
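A crude illustration of the influence idea, with hypothetical interfaces and a simple gradient-alignment approximation (CUPID's actual influence-function estimator is more involved): score each demo by how well its training gradient aligns with the gradient of a closed-loop loss, then keep the top-ranked demos.

```python
# Crude illustration of curating demos by their influence on closed-loop behavior
# (hypothetical interfaces; the paper's estimator is more involved).
import torch


def flat_grad(loss: torch.Tensor, params) -> torch.Tensor:
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def curate(policy, demos, closed_loop_loss_fn, bc_loss_fn, keep_fraction: float = 0.5):
    params = [p for p in policy.parameters() if p.requires_grad]
    g_task = flat_grad(closed_loop_loss_fn(policy), params)      # loss we want to reduce
    scores = []
    for demo in demos:
        g_demo = flat_grad(bc_loss_fn(policy, demo), params)     # per-demo training gradient
        # A gradient step on this demo changes the closed-loop loss by roughly
        # -lr * <g_demo, g_task>, so a larger dot product means a more helpful demo.
        scores.append(torch.dot(g_demo, g_task).item())
    ranked = sorted(zip(scores, demos), key=lambda sd: sd[0], reverse=True)
    return [demo for _, demo in ranked[: max(1, int(keep_fraction * len(demos)))]]
```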
While the startup world is cheering on robots, it is refreshing to see this unique initiative focused on humans! Congrats to @Vikashplus and team
All forms of intelligence co-emerged with a body, except AI. We're building a #future where AI evolves as your lifelike digital twin to assist your needs across health, sports, daily life, creativity, & beyond... myolab.ai ➡️ Preview your first #HumanEmbodiedAI
I'm attending @RoboticsSciSys #RSS2025 in LA for the first time. Excited to be giving two invited talks at the Continual Learning and EgoAct workshops on Sat, June 21. I'll share the latest on 2D/3D motion prediction from human videos for manipulation! Do drop by and say hi :)
The #RSS2025 Workshop on Continual Robot Learning from Humans is happening on June 21. We have an amazing lineup of speakers discussing how we can enable robots to acquire new skills and knowledge from humans continuously. Join us in person and on Zoom (info on our website)!
Join us on Saturday, 21st June at EgoAct 🥽🤖: the 1st Workshop on Egocentric Perception & Action for Robot Learning @ RSS 2025 @RoboticsSciSys in Los Angeles! ☀️🌴 Full program w/ accepted contributions & talks at: egoact.github.io/rss2025 Online stream: tinyurl.com/egoact
Excited to announce EgoAct🥽🤖: the 1st Workshop on Egocentric Perception & Action for Robot Learning @ #RSS2025 in LA! We’re bringing together researchers exploring how egocentric perception can drive next-gen robot learning! Full info: egoact.github.io/rss2025 @RoboticsSciSys
🤖 Do VLA models really listen to language instructions? Maybe not 👀 🚀 Introducing our RSS paper: CodeDiffuser -- using VLM-generated code to bridge the gap between **high-level language** and **low-level visuomotor policy** 🎮 Try the live demo: robopil.github.io/code-diffuser/ (1/9)
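A hedged sketch of the general pattern, with hypothetical interfaces rather than the CodeDiffuser release: a VLM writes a short program that calls perception primitives to ground the instruction, and that grounding conditions the low-level visuomotor policy.

```python
# Hedged sketch of a "code as the bridge" pattern (hypothetical interfaces):
# a VLM emits a small program that calls perception primitives to ground the
# instruction (e.g., a target mask), and the grounding is passed to the
# low-level visuomotor policy.
import numpy as np


def vlm_generate_code(instruction: str) -> str:
    """Stand-in for querying a VLM; returns a canned snippet for illustration."""
    return (
        "mask = detect('mug', image)\n"
        "grounding['target_mask'] = mask\n"
    )


def detect(object_name: str, image: np.ndarray) -> np.ndarray:
    """Stand-in open-vocabulary detector returning a binary mask."""
    return np.zeros(image.shape[:2], dtype=bool)


def run_grounding_code(code: str, image: np.ndarray) -> dict:
    grounding: dict = {}
    exec(code, {"detect": detect, "image": image, "grounding": grounding})
    return grounding


def act(policy, image: np.ndarray, instruction: str) -> np.ndarray:
    code = vlm_generate_code(instruction)        # high-level language -> program
    grounding = run_grounding_code(code, image)  # program -> spatial grounding
    return policy(image, grounding)              # grounding conditions the low-level policy
```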
"We see in order to move; we move in order to see" Active perception has always been known to be key for human manipulation. Impressive work as always led by @Haoyu_Xiong_ on building excellent hardware for endowing bimanual manipulators with this capability!
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
AI for smell is so cool! The next time we want AI to "pass the sniff test" we might actually mean it literally :) Congrats to @pliang279 and group!
Despite much progress in AI, the ability for AI to 'smell' like humans remains elusive. Smell AIs 🤖👃can be used for allergen sensing (e.g., peanuts or gluten in food), hormone detection for health, safety & environmental monitoring, quality control in manufacturing, and more.…
#CVPR2025 closing with a final keynote about the future, a.k.a. robotics!
