Max Simchowitz
@max_simchowitz
Assistant Professor @mldcmu. Formerly: Postdoc @MITEECS, PhD @Berkeley_EECS, Math Undergrad @Princeton. New to Twitter. https://msimchowitz.github.io/
There’s a lot of awesome research about LLM reasoning right now. But how is learning in the physical world 🤖 different from learning in language 📚? In a new paper, we show that imitation learning in continuous spaces can be exponentially harder than for discrete state spaces, even when…
So excited for this!!! The key technical breakthrough here is that we can control joints and fingertips of the robot **without joint encoders**. Learning from self-supervised data collection is all you need for training the humanoid hand control you see below.
Despite great advances in learning dexterity, hardware remains a major bottleneck. Most dexterous hands are bulky, weak, or expensive. I’m thrilled to present the RUKA Hand — a powerful, accessible research tool for dexterous manipulation that overcomes these limitations!
👏👏This is pretty massive!! Generative modeling looks clean in math, but getting it up and running can require a fair bit of alchemy. 🧪🧪 Thankfully, Nick Boffi (co-creator of Stochastic Interpolants) just dropped a super-clean, super-fast, super-reproducible repo for core…
🧵 generative models are sweet, but navigating existing repositories can be overwhelming, particularly when starting a new research project. so i built jax-interpolants, a clean & flexible implementation of the stochastic interpolant framework in jax: github.com/nmboffi/jax-in…
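For readers new to the framework: a stochastic interpolant bridges a base sample x₀ and a data sample x₁ through x_t = α(t)·x₀ + β(t)·x₁ + γ(t)·z. Here is a minimal numpy sketch of one common coefficient choice (illustrative only — not code from the jax-interpolants repo, whose API may differ):

```python
import numpy as np

def interpolant(x0, x1, z, t):
    """One common stochastic interpolant:
    x_t = (1 - t) * x0 + t * x1 + sqrt(2 * t * (1 - t)) * z.

    At t = 0 it reduces to x0 (the base sample) and at t = 1 to x1
    (the data sample); the latent noise z only enters at intermediate times.
    """
    alpha = 1.0 - t                        # weight on the base sample
    beta = t                               # weight on the data sample
    gamma = np.sqrt(2.0 * t * (1.0 - t))   # noise scale, zero at both endpoints
    return alpha * x0 + beta * x1 + gamma * z

rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)  # draw from the base (e.g. Gaussian) distribution
x1 = rng.standard_normal(4)  # draw from the target data distribution
z = rng.standard_normal(4)   # independent latent noise

mid = interpolant(x0, x1, z, 0.5)  # a point on the stochastic bridge
```

Training a generative model in this framework then amounts to regressing a velocity (or score) field against time derivatives of x_t, averaged over (x₀, x₁, z, t).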
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the…
Very cool! In addition to optimizing inference-time search as a learning desideratum, this really speaks to the power of building reward models purely from expert trajectories, via discriminative objectives. Excited to see how far this can go!
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
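One way to read "reward models purely from expert trajectories, via discriminative objectives" is: train a classifier to separate expert states from other states, and use its log-odds as a reward for test-time search. The sketch below is a generic, hypothetical illustration on synthetic 1-D states — not SAILOR's actual training recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D "states": expert states cluster near +2,
# off-distribution states near -2. Labels: 1 = expert, 0 = non-expert.
expert = rng.normal(2.0, 0.5, size=200)
other = rng.normal(-2.0, 0.5, size=200)
X = np.concatenate([expert, other])
y = np.concatenate([np.ones(200), np.zeros(200)])

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Logistic-regression discriminator trained by plain gradient descent.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(X * w + b)
    w -= lr * np.mean((p - y) * X)
    b -= lr * np.mean(p - y)

def reward(state):
    """Log-odds of being an expert state: higher = more expert-like."""
    p = sigmoid(state * w + b)
    return np.log(p) - np.log(1.0 - p)

acc = np.mean((sigmoid(X * w + b) > 0.5) == y)  # training accuracy
```

Downstream, such a discriminative reward can score candidate rollouts during test-time search, favoring those that stay close to the expert distribution — which is where the "recover from mistakes" behavior comes from.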
I am giving a talk "From Sim2Real 1.0 to 4.0 for Humanoid Whole-Body Control and Loco-Manipulation" at the RoboLetics 2.0 workshop @ieee_ras_icra today, summarizing my recent thoughts on sim2real. If you are interested: 2pm, May 23 @ room 302.
Want to scale robot data with simulation, but don’t know how to get large numbers of realistic, diverse, and task-relevant scenes? Our solution: ➊ Pretrain on broad procedural scene data ➋ Steer generation toward downstream objectives 🌐 steerable-scene-generation.github.io 🧵1/8
RL and post-training play a central role in giving language models advanced reasoning capabilities, but many algorithmic and scientific questions remain unanswered. Join us at FoPT @ COLT '25 to explore pressing emerging challenges and opportunities for theory to bring clarity.
Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025! 📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models! 🗓️ Deadline: May 19, 2025
The Chicago Pope implies the existence of a New Keynesian Pope and a Behavioral Pope
really excited about this one -- please submit your best work, and come join us in beautiful Lyon to talk about machine learning and science!
@khodakmoments, @__tm__157, along with myself, @nmboffi and Jianfeng Lu are organizing a COLT 2025 workshop on the Theory of AI for Scientific Computing, to be held on the first day of the conference (June 30).
Congrats to Andrea Bajcsy (@andrea_bajcsy) on receiving the NSF CAREER award! 👏 Her work, “Formalizing Open World Safety for Interactive Robots,” explores how robots make safe decisions beyond collision avoidance. Read about it and her education plans: loom.ly/59evuD0
Building AI systems is now a fragmented process spanning multiple organizations & entities. In new work (w/ @aspenkhopkins @cen_sarah @andrew_ilyas @imstruckman @LVidegaray), we study the implications of these emerging networks → what we call *AI supply chains* 🧵
Before the (exciting) workshops on Sunday, catch Vincent’s oral talk on this paper at the #ICLR2025 main conference today at 3:30 p.m., Hall 1 Apex! And don’t forget to talk with the co-leads Vincent and @YiSu37328759 at the poster, 10 a.m.–12:30 p.m., Hall 3 + Hall 2B #558.
🚨 How can we fine-tune LLMs to implement nuanced algorithmic behaviors at test-time? I've been very behind on posting, but in SCoRe, we studied a special instance: training LLMs to self-correct. arxiv.org/abs/2409.12917 (in v2, we've updated the presentation, 🙏 for feedback!)