Ahmed Alaa
@_ahmedmalaa
Assistant Professor @UCBerkeley + @UCSF, @berkeley_ai Prev @broadinstitute, @MIT, @UCLA, @Cambridge_Uni, @UniofOxford. Machine Learning & AI for Healthcare.
Super exciting. Come join us to work on reliable AI and community evaluations! There's a lot of great research to be done and products to be built. If you're interested in helping, apply to join — my DMs/e-mail are always open :)
📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by @a16z and UC Investments (@UofCalifornia), we're proud to have the support of those who believe in both the science and the mission. We’re…
Medical disclaimers in AI outputs are quietly fading, but they matter for safety and trust! Our recent study on this topic is featured in a new MIT @techreview piece.
AI companies have stopped warning you that their chatbots aren’t doctors trib.al/OL4aqhu
If you are at #ICML2025, don't miss our talks/posters!
- Medical LLM Benchmarking (Position Paper): Oral, July 16, 3:30 PM + Poster, July 16, 4:30 PM by @tom_hartvigsen
- Venn-Abers Calibration + Conformal Prediction: Poster, July 16, 4:30 PM by @LarsvanderLaan3

How has medical disclaimer messaging changed in generative AI models over the past few years? New work with Sonali Sharma and @RoxanaDaneshjou!
Brilliant student Sonali Sharma came to me with a question. If patients are using AI to answer their medical questions, are they being adequately warned that AI systems cannot provide medical advice? What we found surprised us!
New paper out in @NatureComms led by @cmuptx 's team! We study how generative AI models can enable accurate medical image segmentation even with limited labeled data.
🚀 Excited to share that our work GenSeg has been published in Nature Communications! GenSeg is an end-to-end, downstream-task-guided framework for generating synthetic training data for medical image segmentation. It significantly reduces the need for manual annotations — by a…
Looking for an emergency reviewer for a NeurIPS fairness paper, due in a week. Email or DM me! Ideally you’ve first-authored a paper at a major ML conference. Reward: a thoughtful compliment about you and your work
individual reporting for post-deployment evals — a little manifesto (& new preprints!) tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.
What more could we understand about the fractal, “jagged” edges of AI system deployments if we had better ways to listen to the people who interact with them? What a joy to work w @jessicadai_ using individual experiences to inform AI evaluation (blog/ICML/arXiv links 👇)
Honored to receive a Google Research Scholar Award. Thanks @GoogleResearch for supporting our work!
Congratulations to CPH's @_ahmedmalaa, receiving the @googleresearch Scholar Award for groundbreaking work on using AI and high-throughput proteomics to predict treatment impact for populations often excluded from clinical trials. @BerkeleyDataSci @UCSF_BCHSI
Just did a major revision to our paper on Lifelong Knowledge Editing!🔍 Key takeaway (+ our new title) - "Lifelong Knowledge Editing requires Better Regularization" Fixing this leads to consistent downstream performance! @tom_hartvigsen @_ahmedmalaa @GopalaSpeech @berkeley_ai
Our Rising Stars series is back with @paula_gradu sharing a control-theoretic perspective on medication tapering problems! Link: youtu.be/IP9unMmvbNE?si…

📢If you're interested in conformal prediction, algorithms w/predictions, robust stats & connections between them from a theory perspective, join us for a workshop at #COLT2025 in Lyon 🇫🇷 June 30! Submit a poster description by May 25, more here: vaidehi8913.github.io/predictions-an…
Great to see our work with @DrMayaPetersen, @emrek, @cholmesuk and @mark_vdlaan featured by @MSFTResearch!
Discover how Microsoft Research is leveraging AI to transform research processes, accelerate innovation, and drive growth. Learn more about the strategies and tools empowering researchers here: msft.it/6011SZK8U
Excited to share that our paper on uncertainty-aware model calibration using generalized Venn–Abers methods was accepted to #ICML2025! We show that Vovk’s Venn–Abers calibration and conformal prediction are both special cases of a unified framework.
🚨 Excited about this new paper on Generalized Venn Calibration and conformal prediction! We show that Venn and Venn-Abers can be extended to general losses, and that conformal prediction can be viewed as Venn multicalibration for the quantile loss! #calibration #conformal
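For readers new to the topic, here is a minimal sketch of split conformal prediction, the standard construction that the paper above generalizes. This is a generic textbook illustration, not code from the paper; the toy data, the linear model, and the miscoverage level `alpha = 0.1` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise (assumed for illustration)
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.1, 200)

# Split into a proper training set and a calibration set
x_train, y_train = x[:100], y[:100]
x_cal, y_cal = x[100:], y[100:]

# Fit a simple least-squares line on the training split
slope, intercept = np.polyfit(x_train, y_train, 1)

def predict(x_new):
    return slope * x_new + intercept

# Conformity scores: absolute residuals on the calibration set
scores = np.abs(y_cal - predict(x_cal))

# Conformal quantile at miscoverage level alpha (finite-sample correction)
alpha = 0.1
n = len(scores)
level = np.ceil((n + 1) * (1 - alpha)) / n
q = np.quantile(scores, level, method="higher")

# Prediction interval for a new point: covers the true y
# with probability >= 1 - alpha under exchangeability
x_new = 0.5
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The interval width `2 * q` is driven entirely by the calibration residuals, which is why the method is distribution-free: only exchangeability of the calibration and test points is assumed.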
We have two papers accepted at #ICML2025! An oral on construct validity of medical LLM benchmarks and a poster on generalized Venn-Abers calibration. See you in Vancouver 🇨🇦!

Our multimodal AMIE study is out! This is another exciting step for conversational medical AI – many congrats to the AMIE team! A brief 🧵on what this means for both doctors and patients: x.com/GoogleAI/statu…
Building on Articulate Medical Intelligence Explorer — AMIE, our research diagnostic conversational AI agent — today on the blog we share a first of its kind demonstration of a multimodal conversational diagnostic AI agent, multimodal AMIE. Learn more → goo.gle/42D0QcB
Glad to see this piece out in @theNAEng for their latest edition on "AI Promises & Risks". So many of the practical evaluation challenges with generative AI have already been encountered before. Not surprising how much we can learn from other fields! nae.edu/337862/Spring-…