Ekdeep Singh
@EkdeepL
Postdoc at CBS-NTT Program on Physics of Intelligence, Harvard University.
🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵 1/
Really nice analysis!
Excited to share new work @icmlconf by Loek van Rossem exploring the development of computational algorithms in recurrent neural networks. Hear it live tomorrow, Oral 1D, Tues 14 Jul, West Exhibition Hall C: icml.cc/virtual/2025/p… Paper: openreview.net/forum?id=3go0l… (1/11)
Check out my boy @dmkrash presenting our “outstanding paper award” winner at the Actionable Interpretability workshop today!
Check out my posters today if you're at ICML! 1) Detecting high-stakes interactions with activation probes — Outstanding paper @ Actionable interp workshop, 10:40-11:40 2) LLMs’ activations linearly encode training-order recency — Best paper runner up @ MemFM workshop, 2:30-3:45
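For readers unfamiliar with activation probes: a linear probe is simply a linear classifier trained on a model's hidden activations to read out some property. Below is a minimal, self-contained sketch with synthetic data (my own toy illustration; the dimensions, the `direction` vector, and the labels are invented, not the posters' actual setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy sketch of a linear activation probe (illustrative only).
# Pretend X holds hidden activations and y labels each example
# (e.g. "high-stakes" vs. not). Here both are synthetic.
rng = np.random.default_rng(0)
d = 64
direction = rng.normal(size=d)        # hypothetical feature direction
X = rng.normal(size=(200, d))         # fake "activations"
y = (X @ direction > 0).astype(int)   # synthetic labels for illustration

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(f"probe accuracy: {probe.score(X, y):.2f}")  # linear readout succeeds
```

If a property is linearly encoded in the activations, a probe like this recovers it with a single linear layer and no changes to the model itself.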
While you still can, snatch this prodigy undergrad for your lab when he applies for PhDs this fall!
Thank you to everyone who swung by our poster presentation!!! So many engaging conversations today. #ICML2025
Submit to our workshop on contextualizing CogSci approaches for understanding neural networks---"Cognitive Interpretability"!
We’re excited to announce the first workshop on CogInterp: Interpreting Cognition in Deep Learning Models @ NeurIPS 2025! 📣 How can we interpret the algorithms and representations underlying complex behavior in deep learning models? 🌐 coginterp.github.io/neurips2025/ 1/
I'll be at ICML beginning this Monday---hit me up if you'd like to chat!
🚨 New preprint! 🚨 Everyone loves causal interp. It’s coherently defined! It makes testable predictions about mechanistic interventions! But what if we had a different objective: predicting model behavior not under mechanistic interventions, but on unseen input data?
Don't forget to tune in tomorrow, July 10th for a session with @EkdeepL on "Rational Analysis of In-Context Learning Elicits a Loss-Complexity Tradeoff" Learn more: cohere.com/events/Cohere-…
How do LLMs learn new tasks from just a few examples? What’s happening inside during in-context learning? 🤔 Join us July 10 for a talk by @EkdeepL on how LLMs adapt like cognitive maps—and how we can predict their behavior without accessing weights.
I'll be giving an online talk at Cohere Labs---join!
Excited to announce our recent work on understanding training-time emergence in Transformers! Thread🧵(1/11)
Bayesian models as the ultimate normative theories; neural networks as the ultimate task-performing models.
It turns out that a lot of the most interesting behavior of LLMs can be explained without knowing anything about architecture or learning algorithms. Here we predict the rise (and fall) of in-context learning using hierarchical Bayesian methods.
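To make the idea concrete, here is a minimal toy sketch of the hierarchical-Bayesian picture: two candidate predictors, a simple generalizer and a low-loss but complex memorizer, compete to explain the data seen so far, with posterior weight trading off fit against complexity. Every number below is invented for illustration; this is not the paper's actual model or fitted quantities.

```python
import numpy as np

# Toy sketch: log posterior of a predictor after n samples is
#   log P ∝ -(n * per-sample loss) - complexity penalty
# (a crude stand-in for a Bayesian fit-vs-complexity tradeoff).
def log_posterior(per_sample_loss, complexity, n_samples):
    return -n_samples * per_sample_loss - complexity

gen_loss, gen_complexity = 0.30, 10.0    # generalizer: simple but lossier
mem_loss, mem_complexity = 0.05, 500.0   # memorizer: low loss, very complex

for n in [10, 100, 1000, 10000]:
    lg = log_posterior(gen_loss, gen_complexity, n)
    lm = log_posterior(mem_loss, mem_complexity, n)
    # Posterior weight on the generalizing predictor (clipped for stability).
    p_gen = 1.0 / (1.0 + np.exp(np.clip(lm - lg, -700, 700)))
    print(f"n={n:>5}: P(generalizing strategy) = {p_gen:.3f}")
```

Early in training the complexity prior favors the simple generalizing solution; as data accumulates, the memorizer's lower loss eventually wins. That crossover is the qualitative shape of "transient" in-context generalization described in the thread above.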
Can we record and study human chains of thought? The think-aloud method, where participants voice their thoughts as they solve a task, offers a way! In our #CogSci2025 paper co-led with Ben Prystawski, we introduce a method to automate analysis of human reasoning traces! (1/8)🧵
🚨 Registration is live! 🚨 The New England Mechanistic Interpretability (NEMI) Workshop is happening August 22nd 2025 at Northeastern University! A chance for the mech interp community to nerd out on how models really work 🧠🤖 🌐 Info: nemiconf.github.io/summer25/ 📝 Register:…
I've struggled to announce this amidst so much dark & awful going on in the world, but with 1mo to go, I wanted to share that: (i) I finally graduated; (ii) In August, I'll begin as an assistant professor in the CS dept. of the National University of Singapore.
Humans and animals can rapidly learn in new environments. What computations support this? We study the mechanisms of in-context reinforcement learning in transformers, and propose how episodic memory can support rapid learning. Work w/ @KanakaRajanPhD: arxiv.org/abs/2506.19686
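As a rough intuition for how episodic memory can support rapid learning: a single stored experience can immediately steer behavior at similar states, with no gradient updates. A toy sketch follows (my own illustration of nearest-neighbor episodic recall, not the paper's architecture):

```python
import numpy as np

# Toy episodic memory: a key-value store over visited states,
# queried by nearest neighbor in state space.
class EpisodicMemory:
    def __init__(self):
        self.keys, self.values = [], []  # state embeddings, stored outcomes

    def write(self, state, value):
        self.keys.append(np.asarray(state, dtype=float))
        self.values.append(float(value))

    def read(self, state):
        """Return the stored value of the most similar past state."""
        if not self.keys:
            return 0.0
        dists = [np.linalg.norm(k - state) for k in self.keys]
        return self.values[int(np.argmin(dists))]

# One rewarding episode immediately changes the estimate at nearby
# states -- "rapid learning" from recall rather than weight updates.
mem = EpisodicMemory()
mem.write([0.0, 1.0], value=1.0)
print(mem.read([0.1, 0.9]))  # -> 1.0, generalized from a single sample
```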
This collab was one of the most beautiful papers I've ever worked on! The amount I learned from @danielwurgaft was insane and you should follow him to inherit some gems too :D