Kush Hari
@KushtimusPrime
PhD Student @berkeley_ai
NeRFs and Gaussian Splats excel at static 3D modeling, but robots work in dynamic, unpredictable environments. POGS (Persistent Object Gaussian Splats) combines semantic, visual, and grouping features that can be queried with language and spatially updated as environments change.
Imitation learning has seen great success, but IL policies still struggle with OOD observations. We designed a 3D backbone, Adapt3R, that can combine with your favorite IL algorithm to enable zero-shot generalization to unseen embodiments and camera viewpoints!
In LLM land, a slow model is annoying. In robotics, a slow model can be disastrous! Visible pauses at best, dangerously jerky motions at worst. But large VLAs are slow by nature. What can we do about this? An in-depth 🧵:
Was super fun to demo Gemini Robotics @ Google I/O! This was a big effort with the @GoogleDeepMind team including @ColinearDevin, @SudeepDasari, and many others. Here's a fun uncut video of me playing with the demo :)
Imitation learning has a data scarcity problem. Introducing EgoDex from Apple, the largest and most diverse dataset of dexterous human manipulation to date — 829 hours of egocentric video + paired 3D hand poses across 194 tasks. Now on arxiv: arxiv.org/abs/2505.11709 (1/4)
Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting... with an open-source toolkit on both CPU and GPU.
our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)