Naomi Saphra
@nsaphra
Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account. Accepting ML/NLP PhD students.
Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!

Really tired of watching folks treat xAI like any other frontier lab. How many times do they have to do something insane before they lose the presumption of good faith?
Starting in 30 min at 10am PT! @nsaphra presents at DLCT: Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs Angelica Chen, Ravid Shwartz-Ziv, Kyunghyun Cho, Matthew L. Leavitt, Naomi Saphra arxiv.org/abs/2309.07311 Zoom 👇
If you’re in Vienna for ACL go check out our interpretability poster on how feature interactions reflect linguistic structure! Wednesday, 11-12:30, Poster Session #4 (Session 12: IP-Posters), Hall 4/5
ACL paper alert! What structure is lost when using a linearizing attribution method like Shapley? We show that the nonlinear interactions between features reflect structures described by the sciences of syntax, semantics, and phonology.
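(For readers wondering what a "nonlinear interaction" means here, below is a minimal hypothetical sketch, not the paper's method: it contrasts feature-by-feature credit with a pairwise interaction score f(both) - f(i) - f(j) + f(baseline), which is zero for any purely additive model and nonzero exactly when two features only matter together.)

```python
# Minimal illustration (not the ACL paper's method): a purely additive
# attribution assigns credit feature-by-feature, so it cannot represent
# an effect that two tokens have only in combination.

def f(features):
    """Toy model score: an additive part plus a nonlinear interaction
    that fires only when 'not' and 'good' co-occur."""
    score = 0.0
    if "good" in features:
        score += 1.0
    if "not" in features:
        score += 0.1
    if "not" in features and "good" in features:
        score -= 2.0  # negation flips the sentiment: a pure interaction
    return score

baseline = frozenset()
only_not = frozenset({"not"})
only_good = frozenset({"good"})
both = frozenset({"not", "good"})

# Additive view: each feature's solo effect over the baseline.
solo_not = f(only_not) - f(baseline)
solo_good = f(only_good) - f(baseline)

# Pairwise interaction: what remains after the solo effects are removed.
interaction = f(both) - f(only_not) - f(only_good) + f(baseline)

print(f"solo('not') = {solo_not:+.1f}, solo('good') = {solo_good:+.1f}")
print(f"interaction('not','good') = {interaction:+.1f}")  # nonzero => nonlinear structure
```

The interaction term recovers the negation effect that a feature-by-feature attribution would smear across "not" and "good".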
Used Uber's customer service AI earlier -- it had small model smell and also said it "couldn't" bring in a human. Companies, don't do any of that??
Btw as an aside, we didn’t announce on Friday because we respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved
Instead of cutting US science funding by 60% they'll cut it by 20% and then say "oh what were you freaking out about and anyways don't you know there's a budget deficit"
🚨 According to a friend, the IMO asked AI companies not to steal the spotlight from kids and to wait a week after the closing ceremony to announce results. OpenAI announced the results BEFORE the closing ceremony. According to a Coordinator on Problem 6, the one problem OpenAI…
On July 9th the family were informed he had died. They were consumed with grief and still had no information about what happened to him. Mercifully it turns out that information was a mistake, and Leon was instead being held in Guatemala.
So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on them, even with best-of-n selection? Unbelievable!
So you want to skip our thinning proofs—but you’d still like our out-of-the-box attention speedups? I’ll be presenting the Thinformer in two ICML workshop posters tomorrow! Catch me at Es-FoMo (1-2:30, East hall A) and at LCFM (10:45-11:30 & 3:30-4:30, West 202-204)
Your data is low-rank, so stop wasting compute! In our new paper on low-rank thinning, we share one weird trick to speed up Transformer inference, SGD training, and hypothesis testing at scale. Come by ICML poster W-1012 Tuesday at 4:30!
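(As a rough illustration of the "your data is low-rank, so stop wasting compute" intuition, here is a hypothetical sketch, not the paper's thinning algorithm: if a tall matrix is well approximated by rank r much smaller than its dimensions, downstream products can be routed through the thin factors at a fraction of the cost.)

```python
# Hypothetical sketch of exploiting low-rank structure (not the paper's
# low-rank thinning algorithm): approximate a tall matrix X by a rank-r
# factorization and reuse the factors instead of the full matrix.
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 10_000, 512, 16

# Build data that is approximately rank r plus small noise.
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, d)) + 0.01 * rng.normal(size=(n, d))

# Truncated SVD gives the best rank-r approximation X ~ U_r @ diag(s_r) @ Vt_r.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r]

# A downstream product like X @ w can be computed through the thin factors:
w = rng.normal(size=d)
exact = X @ w                      # O(n * d) per product
approx = U_r @ (s_r * (Vt_r @ w))  # O((n + d) * r) per product after the one-time SVD

rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error of rank-{r} shortcut: {rel_err:.2e}")
```

This is only the generic low-rank intuition; see the paper for what "thinning" actually does and for the inference, training, and testing results.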
🚨The Actionable Interpretability Workshop is happening tomorrow at ICML! Join us for an exciting lineup of speakers, nearly 70 posters, and a great panel discussion 🙌 Don’t miss it! 🔍⚙️ @icmlconf @ActInterp
See you tomorrow morning!
We're excited to have 5 invited speakers:
Naomi Saphra (@nsaphra) at 10am: And Nothing Between: Using Categorical Differences to Understand and Predict Model Behavior
Shiry Ginosar (@shiryginosar) at 10:40: What Do Vision and Vision-Language Models Really Know About the World?
I'm excited to discuss downstream applications of interpretability at @ActInterp! For a preview of my thoughts on the topic, see my blog post on how I think about picking applications to target x.com/saprmarks/stat…
🚨Meet our panelists at the Actionable Interpretability Workshop @ActInterp at @icmlconf! Join us July 19 at 4pm for a panel on making interpretability research actionable, its challenges, and how the community can drive greater impact. @nsaphra @saprmarks @kylelostat @FazlBarez
I'm excited to be at ICML this week :-) @perceptroninc is co-sponsoring the Assessing World Models workshop this Friday. Come see some great talks from @jacobandreas @nsaphra and more; topics include mechanistic interpretability, intuitive physics, LLMs for generating scientific…
I'm biased but this paper is so cool 😭 like I really didn't think this kind of deep theory would be able to beat kludgey Transformer optimization hacks in 2025
Your data is low-rank, so stop wasting compute! In our new paper on low-rank thinning, we share one weird trick to speed up Transformer inference, SGD training, and hypothesis testing at scale. Come by ICML poster W-1012 Tuesday at 4:30!