Laura Kopf
@lkopf_ml
PhD student in Interpretable Machine Learning @bifoldberlin @TUBerlin
🔍 When do neurons encode multiple concepts? We introduce PRISM, a framework for extracting multi-concept feature descriptions to better understand polysemanticity. 📄 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework arxiv.org/abs/2506.15538 🧵

I am not attending #NeurIPS2024, but I encourage everyone interested in #XAI and #MechInterp to check out our paper on evaluating textual descriptions of neurons! Join @lkopf_ml, @anna_hedstroem, and @Marina_MCV on Thu 09.12, 1 p.m. to 4 p.m. CST at East Exhibit Hall A-C #3107!
NeurIPS has an overwhelming number of papers, so I made myself a hacky spreadsheet of all (well, most) of the interpretability papers - sharing in case others find it useful! It definitely has false negatives and false positives, but hopefully it's better than baseline.
Join us today at the #ICML2024 Workshop on the Next Generation of AI Safety! Find @kirill_bykov and me in Hall A1 at Poster Session #2, from 3:30 PM to 4:30 PM. Looking forward to seeing you there!

Join us at the @icmlconf in Vienna next week. We are presenting two of our papers at the Mechanistic Interpretability and Next Generation of AI Safety workshops:
• CoSy: Evaluating Textual Explanations of Neurons
• Manipulating Feature Visualizations with Gradient Slingshots