Jingyun Yang
@yjy0625
PhD student at Stanford.
Introducing Mobi-π: Mobilizing Your Robot Learning Policy. Our method: ✈️ enables flexible mobile skill chaining 🪶 without requiring additional policy training data 🏠 while scaling to unseen scenes 🧵↓
This new work from NYU proposes ViTaL, which learns generalizable, contact-rich policies with just 32 demos + 45 minutes of real-world RL. They also ran a comprehensive study showing that tactile sensing is essential for precision and reliability. Congrats to the team!
Generalization needs data. But data collection is hard for precise tasks like plugging USBs, swiping cards, inserting plugs, and keying locks. Introducing robust, precise VisuoTactile Local (ViTaL) policies: >90% success rates from just 30 demos and 45 min of real-world RL.🧶⬇️
Real-world benchmarks for robot policies are challenging to develop for many reasons: robots differ in kinematics, sensors, firmware, and wear; environmental factors are hard to control; and evaluating robot policies takes non-trivial effort. Pranav, Karl, and Tony’s team developed a…
We’re releasing the RoboArena today!🤖🦾 Fair & scalable evaluation is a major bottleneck for research on generalist policies. We’re hoping that RoboArena can help! We provide data, model code & sim evals for debugging! Submit your policies today and join the leaderboard! :) 🧵
A low-cost and 3D printable tactile sensor from NYU. It would be interesting to see how useful it is when installed on a dexterous hand and whether there are opportunities to increase its resolution.
We have developed a new tactile sensor, called e-Flesh, with a simple working principle: measure deformations in 3D printable microstructures. Now all you need to make tactile sensors is a 3D printer, magnets, and magnetometers! 🧵
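The working principle lends itself to a tiny sketch. The following is a toy reconstruction under my own assumptions (made-up calibration numbers and a generic least-squares fit, not the e-Flesh code): a magnetometer under the printed microstructure reports a 3D field, and the deviation from a zero-load baseline is mapped to deformation.

```python
import numpy as np

# Hypothetical calibration data: 3-axis magnetometer readings (uT) paired with
# known normal deformations (mm) pressed into the printed microstructure.
calib_fields = np.array([[0.0, 0.0, 50.0],
                         [0.1, -0.2, 48.5],
                         [0.3, -0.5, 46.0],
                         [0.6, -0.9, 42.8]])
calib_deform = np.array([0.0, 0.5, 1.0, 1.5])

baseline = calib_fields[0]          # field with no load applied
delta = calib_fields - baseline     # field change as the structure deforms

# Fit a linear map from field change to deformation (simple least squares).
w, *_ = np.linalg.lstsq(delta, calib_deform, rcond=None)

def estimate_deformation(field_reading):
    """Map a raw 3-axis magnetometer reading to an estimated deformation (mm)."""
    return float((np.asarray(field_reading) - baseline) @ w)

print(estimate_deformation([0.45, -0.7, 44.3]))
```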
We tested WSRL (Warm-start RL) on a Franka Robot, and it leads to really efficient online RL fine-tuning in the real world! WSRL learned the peg insertion task perfectly with only 11 minutes of warmup and *7 minutes* of online RL interactions 👇🧵
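As I read the thread, the recipe is roughly: start from an offline-pretrained policy, roll it out for a short warmup to seed the online buffer, then fine-tune online. A hedged sketch with generic stand-ins (`policy`, `update`, a gym-style `env`), not the released WSRL code:

```python
import random

def warm_start_finetune(env, policy, update, warmup_steps=2000, online_steps=10000):
    """Hypothetical warm-start loop. `policy` is assumed to be offline-pretrained,
    `update(policy, batch)` stands in for any off-policy RL update, and `env`
    follows the classic gym (obs, reward, done, info) step API."""
    buffer = []
    obs = env.reset()

    # Warmup: roll out the pretrained policy to seed the online buffer,
    # without gradient updates, before fine-tuning begins.
    for _ in range(warmup_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs

    # Online fine-tuning: alternate environment steps and policy updates.
    for _ in range(online_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
        update(policy, random.sample(buffer, min(256, len(buffer))))
    return policy
```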
Can we teach dexterous robot hands manipulation without human demos or hand-crafted rewards? Our key insight: Use Vision-Language Models (VLMs) to scaffold coarse motion plans, then train an RL agent to execute them with 3D keypoints as the interface. 1/7
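Here is a hedged sketch of that interface, with `query_vlm` and `rl_policy` as hypothetical stand-ins rather than the paper's actual API: the VLM proposes a coarse sequence of 3D keypoints, and a keypoint-conditioned RL policy closes the loop on each segment.

```python
from typing import Callable, List, Sequence

Keypoint = Sequence[float]  # (x, y, z) in the robot's workspace frame

def execute_with_vlm_scaffold(task_prompt: str,
                              query_vlm: Callable[[str], List[Keypoint]],
                              rl_policy: Callable[[Keypoint, object], Sequence[float]],
                              env,
                              steps_per_keypoint: int = 200):
    """Hypothetical outer loop: the VLM scaffolds a coarse plan as 3D keypoints;
    a keypoint-conditioned RL policy produces low-level actions to reach each one."""
    waypoints = query_vlm(task_prompt)          # coarse motion plan from the VLM
    obs = env.reset()
    for kp in waypoints:
        for _ in range(steps_per_keypoint):     # step budget per plan segment
            action = rl_policy(kp, obs)         # closed-loop, keypoint-conditioned control
            obs, _, done, info = env.step(action)
            if info.get("reached_keypoint") or done:
                break
    return obs
```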
Teleoperation is slow, expensive, and difficult to scale. So how can we train our robots instead? Introducing X-Sim: a real-to-sim-to-real framework that trains image-based policies 1) entirely in simulation 2) using rewards from human videos. portal-cornell.github.io/X-Sim
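Reading the announcement, the pipeline appears to have three stages; the outline below is my paraphrase with stand-in callables (`track_objects`, `make_reward`, `train_in_sim`, `deploy`), not the released X-Sim API.

```python
def real_to_sim_to_real(human_videos, track_objects, make_reward, train_in_sim, deploy):
    """Hypothetical outline of the three stages described above (every argument
    after `human_videos` is a stand-in callable, not the X-Sim API)."""
    # 1) Real-to-sim: lift object motion out of the human videos and turn the
    #    tracked object trajectories into a dense reward usable in simulation.
    object_trajs = [track_objects(video) for video in human_videos]
    reward_fn = make_reward(object_trajs)

    # 2) Train an image-based policy entirely in simulation with that reward.
    policy = train_in_sim(reward_fn)

    # 3) Sim-to-real: transfer the trained policy to the real robot.
    return deploy(policy)
```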
Our new work CUPID 🏹 curates imitation learning data your robot loves 🤖💖 with influence functions, which causally link demos to the policy’s expected return. This approach enables 🧹 filtering high-quality data 🏋🏻 identifying robust strategies 📊 fixing spurious correlations…
What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)
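To make the idea concrete, here is a toy, first-order proxy for influence-based curation; it is not the CUPID implementation, just a generic gradient-alignment score under my own assumptions: demos whose imitation-loss gradient aligns with the gradient of estimated return are kept.

```python
import numpy as np

def curate_demos(demo_grads, return_grad, keep_fraction=0.5):
    """Toy influence-style curation (a generic first-order proxy, not CUPID itself).

    demo_grads:  (N, D) per-demo gradients of the imitation loss w.r.t. policy params
    return_grad: (D,)   gradient of the estimated expected return w.r.t. policy params
    A demo whose loss gradient aligns with the return gradient is scored as helpful.
    """
    scores = demo_grads @ return_grad                # first-order influence proxy
    k = max(1, int(keep_fraction * len(scores)))
    keep_idx = np.argsort(scores)[::-1][:k]          # keep the highest-scoring demos
    return keep_idx, scores

# Usage with random placeholders standing in for the two gradient quantities:
rng = np.random.default_rng(0)
kept, _ = curate_demos(rng.normal(size=(100, 32)), rng.normal(size=32))
print(len(kept), "demos retained")
```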
Thrilled to be a speaker at the #RSS2025 Workshop on Whole-body Control & Bimanual Manipulation: Applications in Humanoids and Beyond! 🗓️ This Wednesday, June 25 📍 Room OHE 136 🔗 Workshop link: wcbm-workshop.github.io Stop by and let’s chat more — see you there!
Come join our workshop! Map for in-person participants and Zoom for online interaction: stanford.zoom.us/j/92067016723?…
Workshop on Mobile Manipulation in #RSS2025 kicking off with a talk from @leto__jean! Come by EEB 132 if you’re here in person, or join us on Zoom (link on the website)
Join our #RSS2025 MoMa Workshop tomorrow to hear about latest advancements and challenges in mobile manipulation. 📰 Learn more: rss-moma-2025.github.io 🛜 Also available on Zoom!
Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist @RoboticsSciSys 🧵1/8
When humans perform manipulation tasks in tight spaces, we actively move our necks and adjust our gaze to perceive the environment as well as possible. @Haoyu_Xiong_'s new work "Vision in Action" gives robots the same capability. Congrats on the very cool work!
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
Reviewer #2 looks quite spicy 😂
Our lab at Stanford usually does research in AI & robotics, but very occasionally we indulge in being functional alcoholics. Recently we hosted a lab cocktail night and created drinks with research-related puns like 'reviewer#2' and 'make 6 figures', sharing the full recipes…
Meet ProVox: a proactive robot teammate that gets you 🤖❤️🔥 ProVox models your goals and expectations before a task starts — enabling personalized, proactive help for smoother, more natural collaboration. All powered by LLM commonsense. Recently accepted at @ieeeras R-AL! 🧵1/7
Very cool bimanual manipulation demos! Looking forward to what comes next.
Today we're excited to share a glimpse of what we're building at Generalist. As a first step towards our mission of making general-purpose robots a reality, we're pushing the frontiers of what end-to-end AI models can achieve in the real world. Here's a preview of our early…
Join us for the 2nd workshop on Equivariant Vision: From Theory to Practice @CVPR on June 11, at Room 101C! Enjoy the exciting talks by @RobinSFWalters @GhaffariMaani @TacoCohen @_gabrielecesa_ @tesssmidt @vincesitzmann @twmitchel More details at: equivision.github.io
Want to train a mobile manipulation policy with as few as 20 demos? Our latest work HoMeR combines a hybrid imitation learning agent with whole-body control to achieve data-efficient mobile policy learning. Find out more in @priyasun_’s thread!
How can we move beyond static-arm lab setups and learn robot policies in our messy homes? We introduce HoMeR, an imitation learning agent for in-the-wild mobile manipulation. 🧵1/8