David McAllister
@davidrmcall
PhD Student @berkeley_ai | Interning with Nvidia Helsinki
Decentralized Diffusion Models power stronger models trained on more accessible infrastructure. DDMs mitigate the networking bottleneck that locks training into expensive and power-hungry centralized clusters. They scale gracefully to billions of parameters and generate…
Great!
NeRFs and Gaussian Splats excel at static 3D modeling but robots work in dynamic, unpredictable environments. POGS (Persistent Object Gaussian Splats) combines semantic, visual, and grouping features that can be queried with language and spatially updated as environments change
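A minimal sketch of the language-query idea, assuming each Gaussian carries a CLIP-aligned embedding (as is common in language-embedded splatting); the names and interface here are illustrative, not the actual POGS API:

```python
import torch
import torch.nn.functional as F

def query_splat_with_text(gaussian_feats: torch.Tensor,
                          text_emb: torch.Tensor,
                          top_k: int = 500) -> torch.Tensor:
    """Select the Gaussians whose features best match a text query.

    gaussian_feats: (G, D) per-Gaussian language-aligned embeddings (assumed).
    text_emb: (D,) embedding of the query string, e.g. from a CLIP text encoder.
    """
    sims = F.cosine_similarity(gaussian_feats, text_emb[None, :], dim=-1)  # (G,)
    return sims.topk(min(top_k, sims.numel())).indices
```

The same query can be re-run after the scene is spatially updated, which is what makes a persistent object-level representation useful in a changing environment.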
No one wants to hear this, but all evidence suggests that you should train on your test set. argmin.net/p/in-defense-o…
For everyone interested in precise 📷 camera control 📷 in transformers [e.g., video / world models, etc.]: Stop settling for Plücker raymaps -- use camera-aware relative PE in your attention layers, like RoPE (for LLMs) but for cameras! Paper & code: liruilong.cn/prope/
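A hedged sketch of the relative-PE idea: condition attention on the relative transform between each query/key camera pair rather than on absolute raymaps. The additive MLP bias below is an illustrative simplification, not the exact PRoPE formulation:

```python
import torch
import torch.nn as nn

class CameraRelativeAttention(nn.Module):
    """Attention with a bias derived from relative camera poses (sketch)."""

    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        assert dim % n_heads == 0
        self.n_heads = n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Assumption: a small MLP maps the flattened 4x4 relative pose
        # to one additive attention bias per head.
        self.pose_bias = nn.Sequential(
            nn.Linear(16, 64), nn.GELU(), nn.Linear(64, n_heads))

    def forward(self, x: torch.Tensor, cam2world: torch.Tensor) -> torch.Tensor:
        # x: (B, N, D) per-view tokens; cam2world: (B, N, 4, 4) camera poses.
        B, N, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, N, self.n_heads, -1).transpose(1, 2)  # (B, H, N, d)
        k = k.view(B, N, self.n_heads, -1).transpose(1, 2)
        v = v.view(B, N, self.n_heads, -1).transpose(1, 2)
        # Relative transform for every query/key camera pair: T_q^{-1} T_k.
        rel = torch.einsum('bqij,bkjl->bqkil',
                           torch.inverse(cam2world), cam2world)
        bias = self.pose_bias(rel.flatten(-2)).permute(0, 3, 1, 2)  # (B, H, N, N)
        attn = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5 + bias
        out = attn.softmax(dim=-1) @ v
        return self.proj(out.transpose(1, 2).reshape(B, N, D))
```

Because only relative poses enter the computation, the layer is invariant to a global rigid transform of all cameras, which is the property absolute Plücker raymaps lack.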
Artifacts in your attention maps? Forgot to train with registers? Use 𝙩𝙚𝙨𝙩-𝙩𝙞𝙢𝙚 𝙧𝙚𝙜𝙞𝙨𝙩𝙚𝙧𝙨! We find that a sparse set of activations sets artifact positions. We can shift them anywhere ("Shifted") — even outside the image into an untrained token. Clean maps, no retraining.
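A rough, token-level sketch of the idea, purely illustrative: the norm-outlier heuristic and mean-fill below are assumptions, and the actual method shifts a sparse set of activations rather than whole tokens:

```python
import torch

@torch.no_grad()
def add_test_time_register(feats: torch.Tensor, thresh: float = 6.0) -> torch.Tensor:
    """Shift outlier patch tokens into an appended, untrained register slot.

    feats: (B, N, D) patch tokens from a frozen ViT (assumed layout).
    """
    norms = feats.norm(dim=-1)                             # (B, N) token norms
    outlier = norms > norms.mean() + thresh * norms.std()  # artifact positions
    register = torch.zeros_like(feats[:, :1])              # fresh register token
    for b in range(feats.shape[0]):
        mask = outlier[b]
        if mask.any():
            register[b, 0] = feats[b, mask].sum(0)         # shift artifacts out
            feats[b, mask] = feats[b, ~mask].mean(0)       # fill holes with mean token
    return torch.cat([feats, register], dim=1)             # (B, N+1, D)
```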
Excited to present VideoMimic this week at #CVPR2025! 🎥🤖 📌 POETs Workshop "Embodied Humans" Spotlight Talk | June 12, Thu, -10:10 | Room 101B 📌 Agents in Interaction: From Humans to Robots Poster #182-#201 | June 12, Thu, -12:15 | ExHall D Come by and chat!…
🚀 Join our #CVPR2025 2nd POETs Workshop -- Embodied "Humans": Symbiotic Intelligence Between Virtual Humans and Humanoid Robots. We have super cool live demo sessions and an awesome lineup of speakers: @UnitreeRobotics @GerardPonsMoll1, @pathak2206, Karen Liu, @chelseabfinn @psyth91…
At @theworldlabs, we built a new Gaussian splatting web renderer with all the bells and whistles we needed to make splats a first-class citizen of the incredible @threejs ecosystem. Today, we're open sourcing Forge under the MIT license.
We trained a robotic foundation model that can drive mobile robots in six different countries, and navigate Sproul Plaza in midday on the UC Berkeley campus! Some cool new work w/ @NoriakiHirose, Lydia Ignatova, @KyleStachowicz, @CatGlossop, @shahdhruv_ model-base-reannotation.github.io
Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting... with an open-source toolkit on both CPU and GPU
Humanoids on campus! Check out our new work on context-aware locomotion
our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)
RDM is now published in Nature Methods! This was a three-year effort and my introduction to academic research. I'm fortunate to have been mentored by one of the smartest people I've ever met, @the_legitamit!
Ring deconvolution microscopy is now published at @naturemethods! nature.com/articles/s4159… There are some fun new additions including light-sheet deconvolution 🫡 Stay tuned for the official python package release next week! Any feature suggestions are more than welcome 😃
Introducing St4RTrack! 🖖 Simultaneous 4D reconstruction and tracking in world coordinates, fully feed-forward, just by changing the meaning of two pointmaps! st4rtrack.github.io
“No government—regardless of which party is in power—should dictate what private universities can teach, whom they can admit and hire, and which areas of study and inquiry they can pursue.” - President Alan Garber hrvd.me/GarberRespond3…
Next-gen vision pre-trained models shouldn’t be short-sighted. Humans can easily perceive 10K x 10K resolution. But today’s top vision models—like SigLIP and DINOv2—are still pre-trained at merely hundreds by hundreds of pixels, bottlenecking their real-world usage. Today, we…
It's finally here: Brampton. Brampton is the world's most intelligent, creative, and fastest model. Brampton dramatically outperforms Grok 3, Claude 3.7 Sonnet, and GPT-4.5. Reply with "brampton" for early access.
Very excited to share Stable Virtual Camera, a generalist diffusion model for view synthesis: stable-virtual-camera.github.io It scales well with data, and works out of the box for different NVS tasks. Code and 🤗 demo are released! 🧵(1/N)
Stability AI just dropped Stable Virtual Camera on Hugging Face: a generalist diffusion model designed to address the exciting challenge of Novel View Synthesis (NVS). With just one or a few images, it allows you to create a smooth trajectory video from any viewpoint you desire.
🚀 Introducing InterDyn — our newly accepted CVPR work that explores controllable synthesis of interactive dynamics! Building upon powerful video diffusion models, InterDyn infers future motion and interactions directly from an input image and a dynamic control signal (e.g., a…