Samuel Stanton
@samuel_stanton_
Principal Machine Learning Scientist @PrescientDesign @Genentech @Roche | @NYUDataScience PhD | developing AI agents for scientific discovery in biotech
Drug discovery is a profoundly exciting application of ML, involving discrete, high-dimensional, multi-objective optimization problems. The @andrewgwils lab and @BigHatBio are proud to introduce LaMBO, a new method for biological sequence design. arxiv.org/abs/2203.12742 1/9

come by to chat about training LLMs to be black-box token sequence optimizers
Have you ever wondered how your specialized biomolecule engineering model compares with a general purpose LLM on bio-optimization tasks? Come check out our work at ICML, in the East poster session (#E-2804) happening right now! #ICML2025
& “WATCH: Adaptive Monitoring for AI Deployments via Weighted-Conformal Martingales” by @DrewPrinster, @xinghan0, @anqi_liu33, & @suchisaria proposes a weighted generalization of conformal test martingales: arxiv.org/abs/2505.04608 (5/5)
There is so much deep engineering work that is far upstream of drug discovery and in vivo systems. If you have a strong technical background you can have a very high impact by contributing to things like sequencing instruments, imaging, lab robotics and so on.
Imagine how things would reorganize around a whole transcriptome, epigenome + proteome spatial assay with intracellular resolution that can be prepped and read out in 10 minutes to your laptop from a $100 silicon wafer that fits in the palm of your hand. These are fun things to…
If you, like me, felt frustrated after reviewing 6 papers for @NeurIPSConf and thought "there must be a better way" then this post is for you. Link 👇
i was asked by a few (inc. @YuanqingWang ) what i meant by this earlier tweet, and since i'm pretty busy, i decided to write a blog post to answer the question 🤦
ai for drug discovery looks a lot like machine translation research/development during the cold war.
I recently had the pleasure of speaking with my graduate alma mater @NYUDataScience about my personal experience building a lab-in-the-loop system for therapeutic antibody lead optimization with my amazing colleagues @genentech. Active learning is a leap of faith!
CDS PhD graduates @samuel_stanton_ & @omarnmahmood, CDS Prof @khyuncho, former CDS co-Director @RichBonneauNYC, and others developed a machine learning system that achieved up to 100x improvements in antibody binding. nyudatascience.medium.com/making-drugs-w…
The two most common questions I get via cold email are 1) what should I work on, and 2) how do I get a job doing research, engineering, etc.? I wrote a new post summarizing my advice for people trying to enter Biology+ML, from undergrads to early career. 👇 (link below)
AI monitoring is key to responsible deployment. Our #ICML2025 paper develops approaches for 3 main goals: 1) *Adapting* to mild data shifts 2) *Quickly Detecting* harmful shifts 3) *Diagnosing* cause of degradation 🧵w/ @xinghan0 @anqi_liu33 @suchisaria arxiv.org/abs/2505.04608
i think you should be less concerned about how other people are (mis)using AI and more concerned about whether you are using AI to be more virtuous, productive, and empathetic.
AI is revolutionizing drug discovery right now at enterprise scale. It's not a hypothetical anymore 🚀
We're hiring! 🎉 Join our @PrescientDesign @genentech team in NYC! We're hiring a Principal Software Engineer to lead the development of our flagship AI/ML Lab-in-the-Loop platform for therapeutic molecular design 🧬 Drive impact w/ full-stack skills (esp. front-end) as part of…
We're hiring! Come work with me, @timbitsz, and the fantastic teams @genentech Comp Bio / Disco Onc & @PrescientDesign as a postdoc. Link to apply below 👇
We are hiring a PhD research intern at FAIR w/ @marksibrahim @kamalikac to start this summer or Fall! Potential topics: trustworthy and reliable LLMs, multi-modal LLMs and agents, post-training, reasoning, with a focus on open science/sharing our findings in the paper at the end…
Sequencing should be a human right for those with possible rare disease. Clinical care should be -> start with maximal data and work backwards vs. (the current) start with minimal data -> then slowly add more, letting the disease progress
long on lobster
Interpretable Lobsters at #ICLR2025 🦞 🦞 Come by poster #504!
a lot of people think binder design is the bottleneck for drug discovery, but developability is arguably where wetlabs struggle more due to the cost of the assays. swing by Amy's poster to learn more!
📢 Excited to present our poster at the #ICLR2025 @gembioworkshop! I'll be introducing TherAbDesign - a novel, sequence-based framework that efficiently optimizes antibodies toward therapeutically relevant biophysical properties. [1/4]
There are no interpretability workshops @iclr_conf, so stop by @asalam_91's poster (#504) during Poster Session 5 and we'll show you how to build interpretable Language Models by design.
The Urgency of Interpretability: Why it's crucial that we understand how AI models work darioamodei.com/post/the-urgen…
I will be @iclr_conf in Singapore and I'm looking for a senior ML scientist to join my team @PrescientDesign @genentech! 🧬 If you have experience & interest in multi-modal biological foundation models and agents for science, reach out and let's meet up! Link to apply 👇