Sam Blau
@SamMBlau
Research scientist & computational chemist @BerkeleyLab using HT DFT workflows, machine learning, and reaction networks to model complex reactivity.
The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N

Wow, unreal to hear OMol being called "an AlphaFold moment" for computational chemistry. And models trained on OMol will certainly get way better in the coming months and years as MLIP architectures improve their description of charge/spin/long-range interactions!
Last week, Meta released OMol25/UMA, a massive dataset of QM calculations and a set of models trained on the dataset. Over the past week, we've spent a lot of time benchmarking, hosting, and running these models. We wrote a post summarizing what we've learned (link below):
Our group is currently looking for a #compchem #postdoc to start in Fall 2025. The primary area of research will be with the NSF Center for the Mechanical Control of Chemistry (CMCC), with a second focus on organic materials/molecular design. More info at: dtaborgroup.com/openings/
New UMA models trained on OMol25 etc. are now out! Can’t wait to see how well the medium model does. It will certainly be slower than small but should be substantially more accurate
🚀Exciting news! We are releasing new UMA-1.1 models (Small and Medium) today and the UMA paper is now on arxiv! UMA represents a step-change in what’s possible with a single machine learning interatomic potential (short overview in the post below). The goal was to make a model…
Starkly Speaking tomorrow: @bwood_m will present "UMA: A Family of Universal Models for Atoms" ai.meta.com/research/publi… Join us on Zoom 12pm ET / 6pm CEST: portal.valencelabs.com/starklyspeaking
Excited to share that our CO₂-at-the-air–water-interface study just landed the front cover of @JPhysChemLett! 🌊✨ Explore the artwork & paper here 👉 pubs.acs.org/toc/jpclcd/16/… Big thanks to @ACSPublications for the spotlight. #MyACSCover #CO2 #ChemTwitter
Fascinating new work from the @gabepgomes group. The team developed Stereoelectronics-Infused Molecular Graphs (SIMGs) that incorporate quantum chemical orbital interactions into molecular representations. This work goes beyond traditional molecular graph representations by…
Advancing molecular machine learning representations with stereoelectronics-infused molecular graphs #machinelearning #compchem nature.com/articles/s4225…
We've had a lot of scientists reach out to us wondering how to run the new OMol25/UMA models from Meta. Scientific software can be intimidating! To help people get started, our latest blog post contains step-by-step instructions for how to run these models locally and on Rowan.
Last week to apply to this position in my group.
We are recruiting a PhD student in machine learning for photocatalysis! In this project, we will collaborate with the group of Frank Glorius @GloriusGroup to develop predictive tools for energy-transfer-catalyzed photocycloadditions. Reposts appreciated! jobs.ethz.ch/job/view/JOPG_…
🚨 Brand new postdoctoral researcher position in my group at @Princeton! Please apply or share with anyone interested in computational materials science and being an electron whisperer! puwebp.princeton.edu/AcadHire/apply….
We're starting to see more and more AI for chemistry and biology which I'm super excited about given the potential for good! @AIatMeta just released OMol25 on @huggingface, a dataset of 𝟭𝟬𝟬𝗠+ 𝗺𝗼𝗹𝗲𝗰𝘂𝗹𝗮𝗿 𝗰𝗼𝗻𝗳𝗼𝗿𝗺𝗲𝗿𝘀 spanning 83 elements and diverse chemical…
We released the Open Molecules 2025 (OMol25) Dataset last week! 🚀🧪 OMol25 is a large (100M+) and diverse molecular DFT dataset for training machine learning models. It was a massive collaborative and interdisciplinary effort and I’m super proud of the whole team! 🙌 1/7
Wow - over 2x improvement on GMTKN55 for MLIPs trained on OMol versus an MLIP fine-tuned on coupled cluster is pretty wild
Molecular benchmarks have been added to benchmarks.rowansci.com for two models trained on OMol25: the eSEN conserving forces small model and the Universal Models for Atoms (UMA) small model (selected task: omol). The results are astounding! Molecular energies are nearly spot on.