Naomi Saphra
@nsaphra
Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account. Accepting ML/NLP PhD students.
Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!

Really tired of watching folks treat xAI like any other frontier lab. How many times do they have to do something insane before they lose the presumption of good faith?
Starting in 30 min at 10am PT! @nsaphra presents at DLCT: Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs Angelica Chen, Ravid Shwartz-Ziv, Kyunghyun Cho, Matthew L. Leavitt, Naomi Saphra arxiv.org/abs/2309.07311 Zoom 👇
If you’re in Vienna for ACL go check out our interpretability poster on how feature interactions reflect linguistic structure! Wednesday, 11-12:30, Poster Session #4 (Session 12: IP-Posters), Hall 4/5
ACL paper alert! What structure is lost when using a linearizing attribution method like Shapley? We show that the nonlinear interactions between features reflect structures described by the sciences of syntax, semantics, and phonology.
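(For readers wondering what a "nonlinear interaction" means here, below is a minimal hypothetical sketch, not the paper's method: it contrasts feature-by-feature credit with a pairwise interaction score f(both) - f(i) - f(j) + f(baseline), which is zero for any purely additive model and nonzero exactly when two features only matter together.)

```python
# Minimal illustration (not the ACL paper's method): a purely additive
# attribution assigns credit feature-by-feature, so it cannot represent
# an effect that two tokens have only in combination.

def f(features):
    """Toy model score: an additive part plus a nonlinear interaction
    that fires only when 'not' and 'good' co-occur."""
    score = 0.0
    if "good" in features:
        score += 1.0
    if "not" in features:
        score += 0.1
    if "not" in features and "good" in features:
        score -= 2.0  # negation flips the sentiment: a pure interaction
    return score

baseline = frozenset()
only_not = frozenset({"not"})
only_good = frozenset({"good"})
both = frozenset({"not", "good"})

# Additive view: each feature's solo effect over the baseline.
solo_not = f(only_not) - f(baseline)
solo_good = f(only_good) - f(baseline)

# Pairwise interaction: what remains after the solo effects are removed.
interaction = f(both) - f(only_not) - f(only_good) + f(baseline)

print(f"solo('not') = {solo_not:+.1f}, solo('good') = {solo_good:+.1f}")
print(f"interaction('not','good') = {interaction:+.1f}")  # nonzero => nonlinear structure
```

The interaction term recovers the negation effect that a feature-by-feature attribution would smear across "not" and "good".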
Used Uber's customer service AI earlier -- it had small model smell and also said it "couldn't" bring in a human. Companies, don't do any of that??
Btw as an aside, we didn’t announce on Friday because we respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved
Instead of cutting US science funding by 60% they'll cut it by 20% and then say "oh what were you freaking out about and anyways don't you know there's a budget deficit"
🚨 According to a friend, the IMO asked AI companies not to steal the spotlight from kids and to wait a week after the closing ceremony to announce results. OpenAI announced the results BEFORE the closing ceremony. According to a Coordinator on Problem 6, the one problem OpenAI…
On July 9th the family were informed he had died. They were consumed with grief and still had no information about what happened to him. Mercifully it turns out that information was a mistake, and Leon was instead being held in Guatemala.
So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on them, even with best-of-n selection? Unbelievable!
So you want to skip our thinning proofs—but you’d still like our out-of-the-box attention speedups? I’ll be presenting the Thinformer in two ICML workshop posters tomorrow! Catch me at Es-FoMo (1-2:30, East hall A) and at LCFM (10:45-11:30 & 3:30-4:30, West 202-204)
Your data is low-rank, so stop wasting compute! In our new paper on low-rank thinning, we share one weird trick to speed up Transformer inference, SGD training, and hypothesis testing at scale. Come by ICML poster W-1012 Tuesday at 4:30!
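(As a rough illustration of the "your data is low-rank, so stop wasting compute" intuition, here is a hypothetical sketch, not the paper's thinning algorithm: if a tall matrix is well approximated by rank r much smaller than its dimensions, downstream products can be routed through the thin factors at a fraction of the cost.)

```python
# Hypothetical sketch of exploiting low-rank structure (not the paper's
# low-rank thinning algorithm): approximate a tall matrix X by a rank-r
# factorization and reuse the factors instead of the full matrix.
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 10_000, 512, 16

# Build data that is approximately rank r plus small noise.
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, d)) + 0.01 * rng.normal(size=(n, d))

# Truncated SVD gives the best rank-r approximation X ~ U_r @ diag(s_r) @ Vt_r.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r]

# A downstream product like X @ w can be computed through the thin factors:
w = rng.normal(size=d)
exact = X @ w                      # O(n * d) per product
approx = U_r @ (s_r * (Vt_r @ w))  # O((n + d) * r) per product after the one-time SVD

rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error of rank-{r} shortcut: {rel_err:.2e}")
```

This is only the generic low-rank intuition; see the paper for what "thinning" actually does and for the inference, training, and testing results.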
🚨The Actionable Interpretability Workshop is happening tomorrow at ICML! Join us for an exciting lineup of speakers, nearly 70 posters, and a great panel discussion 🙌 Don’t miss it! 🔍⚙️ @icmlconf @ActInterp
See you tomorrow morning!
We're excited to have 5 invited speakers:
Naomi Saphra (@nsaphra) at 10am: And Nothing Between: Using Categorical Differences to Understand and Predict Model Behavior
Shiry Ginosar (@shiryginosar) at 10:40: What Do Vision and Vision-Language Models Really Know About the World?
I'm excited to discuss downstream applications of interpretability at @ActInterp! For a preview of my thoughts on the topic, see my blog post on how I think about picking applications to target x.com/saprmarks/stat…
🚨Meet our panelists at the Actionable Interpretability Workshop @ActInterp at @icmlconf! Join us July 19 at 4pm for a panel on making interpretability research actionable, its challenges, and how the community can drive greater impact. @nsaphra @saprmarks @kylelostat @FazlBarez
I'm excited to be at ICML this week :-) @perceptroninc is co-sponsoring the Assessing World Models workshop this Friday. Come see some great talks from @jacobandreas @nsaphra and more; topics include mechanistic interpretability, intuitive physics, LLMs for generating scientific…
I'm biased but this paper is so cool 😭 like I really didn't think this kind of deep theory would be able to beat kludgey Transformer optimization hacks in 2025
Your data is low-rank, so stop wasting compute! In our new paper on low-rank thinning, we share one weird trick to speed up Transformer inference, SGD training, and hypothesis testing at scale. Come by ICML poster W-1012 Tuesday at 4:30!