Chung Min Kim
@ChungMinKim
PhD @ UC Berkeley 🧸🌅
Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting... with an open-source toolkit that runs on both CPU and GPU
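A minimal sketch of what an IK call with a toolkit like this looks like; the API names here (`Robot.from_urdf`, `solve_ik`, the argument names) are my assumptions for illustration, so check the PyRoki docs for the real interface:

```python
import numpy as np
import pyroki as pk

# Load kinematics from a URDF (path is a placeholder).
robot = pk.Robot.from_urdf("panda.urdf")

# Solve IK: find a joint configuration that places the end effector
# at a target pose (position in meters + wxyz orientation quaternion).
q = pk.solve_ik(
    robot,
    target_link="panda_hand",
    position=np.array([0.4, 0.0, 0.3]),
    wxyz=np.array([1.0, 0.0, 0.0, 0.0]),
)
```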
🌟gsplat🌟 just integrated 3DGUT, which allows training and rendering 3DGS with *distorted* pinhole/fisheye cameras, as well as rolling shutter effects! > Check out this NVIDIA tech blog: developer.nvidia.com/blog/revolutio… > Sweepstakes to win a 4090: nvidia.com/en-us/research…
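For context, here's roughly what this looks like through gsplat's `rasterization` entry point; the 3DGUT-specific flags below (`with_ut`, `with_eval3d`) and the `camera_model` value are from memory and may not match the released API exactly:

```python
import torch
import torch.nn.functional as F
from gsplat import rasterization

device = "cuda"
N = 10_000
means = torch.randn(N, 3, device=device)
quats = F.normalize(torch.randn(N, 4, device=device), dim=-1)
scales = torch.rand(N, 3, device=device) * 0.02
opacities = torch.rand(N, device=device)
colors = torch.rand(N, 3, device=device)
viewmats = torch.eye(4, device=device)[None]   # [1, 4, 4] world-to-camera
Ks = torch.tensor([[[300.0, 0.0, 320.0],
                    [0.0, 300.0, 240.0],
                    [0.0, 0.0, 1.0]]], device=device)

renders, alphas, meta = rasterization(
    means, quats, scales, opacities, colors, viewmats, Ks,
    width=640, height=480,
    camera_model="fisheye",  # distorted-camera rendering
    # Assumed 3DGUT switches (unscented transform + exact 3D evaluation):
    with_ut=True,
    with_eval3d=True,
)
```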
Had so much fun working on this😊 PyTorch and JAX implementations are both out!
For everyone interested in precise 📷 camera control 📷 in transformers (e.g., video / world models): stop settling for Plücker raymaps; use camera-aware relative PE in your attention layers, like RoPE (for LLMs) but for cameras! Paper & code: liruilong.cn/prope/
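A toy version of the idea, to make it concrete (my sketch of the mechanism, not the paper's exact PRoPE formulation): apply per-token camera transforms to queries and keys so each attention logit depends only on *relative* pose, the same way RoPE makes logits depend on relative position.

```python
import torch
import torch.nn.functional as F

def camera_relative_attention(q, k, v, viewmats):
    """Toy camera-aware relative PE (illustrative sketch).

    q, k, v:  [B, T, D] per-view tokens, with D divisible by 4.
    viewmats: [B, T, 4, 4] world-to-camera transform attached to each token.
    """
    B, T, D = q.shape
    q4 = q.view(B, T, D // 4, 4)
    k4 = k.view(B, T, D // 4, 4)
    inv_t = torch.linalg.inv(viewmats).transpose(-1, -2)  # T_i^{-T}
    # q_i <- T_i^{-T} q_i and k_j <- T_j k_j, so each logit becomes
    # q_i^T (T_i^{-1} T_j) k_j: it sees only the *relative* camera pose.
    q4 = torch.einsum("btij,btnj->btni", inv_t, q4)
    k4 = torch.einsum("btij,btnj->btni", viewmats, k4)
    return F.scaled_dot_product_attention(
        q4.reshape(B, T, D), k4.reshape(B, T, D), v
    )
```

Because only relative pose enters the logits, the attention is invariant to the choice of world frame, which absolute Plücker raymaps are not.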
Everyone knows action chunking is great for imitation learning. It turns out that we can extend its success to RL to better leverage prior data for improved exploration and online sample efficiency! colinqiyangli.github.io/qc/ The recipe to achieve this is incredibly simple. 🧵 1/N
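The gist, compressed into a sketch (my reading of the recipe; names are mine, see the link for the real thing): the actor emits a chunk of h actions and the critic evaluates whole chunks, so TD backups span h steps at a time and exploration stays temporally coherent.

```python
import torch
import torch.nn as nn

H, obs_dim, act_dim = 4, 17, 6  # chunk length h and toy dimensions

# Policy outputs a whole chunk of h actions; critic scores (state, chunk).
actor = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, H * act_dim))
critic = nn.Sequential(nn.Linear(obs_dim + H * act_dim, 256), nn.ReLU(), nn.Linear(256, 1))

def td_target(reward_chunk, next_obs, gamma=0.99):
    # n-step return over the chunk, then bootstrap with the chunked critic.
    discounts = gamma ** torch.arange(H, dtype=torch.float32)
    n_step = (discounts * reward_chunk).sum(-1, keepdim=True)
    with torch.no_grad():
        next_chunk = actor(next_obs)
        boot = critic(torch.cat([next_obs, next_chunk], -1))
    return n_step + gamma ** H * boot
```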
🥋 We're excited to share judo: a hackable toolbox for sampling-based MPC (SMPC), data collection, and more, designed to make it easier to experiment with high-performance control. Try it: pip install judo-rai
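judo's real interfaces live in the repo; as background, a generic predictive-sampling SMPC step (all names mine, not judo's API) looks like this:

```python
import numpy as np

def predictive_sampling(rollout_fn, u_nominal, num_samples=64, sigma=0.1):
    """One generic predictive-sampling SMPC step (illustrative).

    rollout_fn(U) -> total cost of applying control sequence U [T, nu].
    u_nominal:      current best control sequence, warm-started each step.
    """
    noise = np.random.randn(num_samples, *u_nominal.shape) * sigma
    candidates = u_nominal[None] + noise
    candidates = np.concatenate([u_nominal[None], candidates], axis=0)  # keep nominal
    costs = np.array([rollout_fn(U) for U in candidates])
    best = candidates[np.argmin(costs)]
    # Receding horizon: execute best[0], shift the sequence, repeat next step.
    return best
```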
Had a great time at this CVPR community-building workshop: lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…
In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209. Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…
In LLM land, a slow model is annoying. In robotics, a slow model can be disastrous! Visible pauses at best, dangerously jerky motions at worst. But large VLAs are slow by nature. What can we do about this? An in-depth 🧵:
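Back-of-envelope, with assumed numbers:

```python
control_hz = 50        # assumed high-rate controller
vla_latency_s = 0.30   # illustrative VLA forward-pass time

budget_s = 1.0 / control_hz
stalled = vla_latency_s / budget_s
print(f"{budget_s * 1e3:.0f} ms budget/step; one inference spans ~{stalled:.0f} steps")
# -> 20 ms budget/step; one inference spans ~15 steps
```

Those ~15 stalled control steps are where the visible pauses or jerky hand-offs come from.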
✨New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation* at inference time. The result is flexible control: parameterize tasks as multimodal inputs, let the VLM visually inspect the images, and update the generator. 🧵
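Schematically (my pseudocode; the LoRA-style parameter choice and these names are assumptions): the VLM acts as a critic on the generated image, and its feedback is pushed into a small set of generator weights at inference time.

```python
import torch

def dual_process_step(generator, lora_params, vlm_score, prompt, steps=10, lr=1e-2):
    """One inference-time distillation loop (schematic).

    generator(prompt) -> image tensor, differentiable w.r.t. lora_params.
    vlm_score(image, prompt) -> scalar VLM score, higher is better.
    lora_params: small set of generator weights being updated (assumed LoRA).
    """
    opt = torch.optim.Adam(lora_params, lr=lr)
    for _ in range(steps):
        image = generator(prompt)          # System 1: feed-forward generation
        loss = -vlm_score(image, prompt)   # System 2: VLM inspects the image
        opt.zero_grad()
        loss.backward()                    # distill the feedback into the weights
        opt.step()
    return generator
```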
How to learn dexterous manipulation for any robot hand from a single human demonstration? Check out DexMachina, our new RL algorithm that learns long-horizon, bimanual dexterous policies for a variety of dexterous hands, articulated objects, and complex motions.
Was super fun to demo Gemini Robotics @ Google I/O! This was a big effort with the @GoogleDeepMind team including @ColinearDevin, @SudeepDasari, and many others. Here's a fun uncut video of me playing with the demo :)
Introducing POD! Predict-Optimize-Distill: a self-improving cycle for 4D object understanding! Inputs: a multi-view scan of an object + casually captured, long-form monocular videos of human interaction (from your phone). Outputs: 3D part poses over time.
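In pseudocode, the cycle reads as (structure inferred from the name and the tweet; the helper callables are assumptions, not the paper's code):

```python
def pod_cycle(predictor, optimize, distill, video_frames, object_scan, rounds=3):
    """Predict-Optimize-Distill loop (my sketch; helpers are assumed).

    predictor(frame, scan) -> per-frame 3D part poses; optimize refines
    poses against the video; distill fits the predictor to its own
    refined outputs so the next round starts from better predictions.
    """
    for _ in range(rounds):
        poses = [predictor(f, object_scan) for f in video_frames]  # Predict
        poses = optimize(poses, video_frames, object_scan)         # Optimize
        predictor = distill(predictor, video_frames, poses)        # Distill
    return predictor
```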
New lab members keep showing up at BAIR recently (tbh idk who brought them here but thanks!) They’re becoming stronger, smaller, faster, and a lot more mixture-of-experts-level fluffy… #bairzoo
The ultimate test of any physics simulator is its ability to deliver real-world results. With MuJoCo Playground, we’ve combined the very best: MuJoCo’s rich and thriving ecosystem, massively parallel GPU-accelerated simulation, and real-world results across a diverse range of…
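The quickstart shape, from memory (verify the names against the Playground repo):

```python
import jax
import jax.numpy as jnp
from mujoco_playground import registry

env = registry.load("CartpoleBalance")  # any registered task name
reset = jax.jit(env.reset)
step = jax.jit(env.step)

state = reset(jax.random.PRNGKey(0))
action = jnp.zeros(env.action_size)
for _ in range(100):
    state = step(state, action)         # MJX physics, GPU-accelerated
```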