Jacob Andreas
@jacobandreas
Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). http://lingo.csail.mit.edu http://web.mit.edu/jda/www
👉 New preprint! Today, many of the biggest challenges in LM post-training aren't just about correctness, but rather consistency & coherence across interactions. This paper tackles some of these issues by optimizing reasoning LMs for calibration rather than accuracy...
🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty --…
fun new paper training LLMs to analyze their own uncertainty and be more calibrated in their confidence! arxiv.org/abs/2507.16806
New paper on emergent reasoning about uncertainty in RL! It was great to move the needle a bit on an important problem - and very excited for future work in the space. It was an absolute pleasure working with @MehulDamani2 @ishapuri101 @IdanShenfeld @jacobandreas
RLCR produces reasoning LLMs that not only solve problems - but also reason about what they don't know. ✨🧠 📄Paper: arxiv.org/abs/2507.16806 w/ @ishapuri101, @StewartSlocum1, @IdanShenfeld, @LChoshen, @yoonrkim, @jacobandreas [9/N]
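The core idea in this thread is to reward calibrated confidence rather than raw accuracy. Below is a minimal sketch of such a reward, assuming the model emits an answer (graded correct/incorrect) plus a verbalized confidence q ∈ [0, 1]; the Brier-style penalty follows the thread's description and may not match the paper's exact formulation.

```python
# Sketch of a calibration-aware RL reward in the spirit of RLCR.
# Assumes the model emits an answer (graded as correct/incorrect) plus a
# verbalized confidence q in [0, 1]; the Brier-style penalty is an
# illustration of the thread's idea, not necessarily the paper's exact reward.

def rlcr_style_reward(answer_correct: bool, confidence: float) -> float:
    y = 1.0 if answer_correct else 0.0
    brier_penalty = (confidence - y) ** 2  # 0 when confidence matches the outcome
    return y - brier_penalty

print(rlcr_style_reward(True, 0.9))   # 0.99  (right and confident)
print(rlcr_style_reward(False, 0.9))  # -0.81 (confident hallucination)
print(rlcr_style_reward(False, 0.2))  # -0.04 (wrong but honestly uncertain)
```

Under a reward like this, a confident hallucination scores far worse than an honestly uncertain answer, which is exactly the incentive the thread describes.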
How do people reason while still staying coherent – as if they have an internal ‘world model’ for situations they’ve never encountered? A new paper on open-world cognition (preview at the world models workshop at #ICML2025!)
🚨 Registration is live! 🚨 The New England Mechanistic Interpretability (NEMI) Workshop is happening August 22nd 2025 at Northeastern University! A chance for the mech interp community to nerd out on how models really work 🧠🤖 🌐 Info: nemiconf.github.io/summer25/ 📝 Register:…
👉 New preprint on a new family of Transformer-type models whose depth scales logarithmically with sequence length. Enables:
- fast training
- fast decoding
- large memory capacity in associative recall
- strong length generalization on state tracking
Transformers: ⚡️fast to train (compute-bound), 🐌slow to decode (memory-bound). Can Transformers be optimal in both? Yes! By exploiting sequential-parallel duality. We introduce Transformer-PSM with constant time per token decode. 🧐 arxiv.org/pdf/2506.10918
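A toy illustration of the sequential-parallel duality this thread leans on, assuming only that the state update is associative (the `combine` below is a placeholder, not the paper's actual update): the same operator can be evaluated as an all-prefix scan in O(log n) rounds for training, or folded one token at a time for decoding.

```python
# Toy illustration of sequential-parallel duality: any *associative* state
# update can run as a parallel all-prefix scan (log-depth, training mode) or
# as a sequential fold (constant work per token, decoding mode). `combine`
# is a placeholder, not the actual model update.

def parallel_scan(xs, combine):
    """Hillis-Steele all-prefix scan: ceil(log2 n) rounds of pairwise combines."""
    xs, shift = list(xs), 1
    while shift < len(xs):
        xs = [x if i < shift else combine(xs[i - shift], x)
              for i, x in enumerate(xs)]
        shift *= 2
    return xs

add = lambda a, b: a + b
assert parallel_scan([1, 2, 3, 4, 5], add) == [1, 3, 6, 10, 15]

# decoding mode: same operator, one token at a time, O(1) state
state = 1
for token in [2, 3, 4, 5]:
    state = add(state, token)
assert state == 15
```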
@kaivu, @atticuswzf , and I were researching long horizon reasoning (with @jacobandreas). We found existing benchmarks’ hard problems often featured tricky puzzles, not tests of system understanding. So we made Breakpoint: a SWE benchmark designed to disambiguate this capability.
🚨🚨 Studying the INTERPLAY of LMs' internals and behavior? Join our @colmweb.org workshop on comprehensively evaluating LMs. Deadline: June 23rd CfP: shorturl.at/sBomu We're excited to see your work!! See you in Montréal 🇨🇦 #nlproc #interpretability
For this week’s NLP Seminar, we are thrilled to host @jacobandreas to talk about “Just Asking Questions” When: 5/15 Thurs 11am PT Non-Stanford affiliates registration form: forms.gle/svy5q5uu7anHw7…
Excited to announce that this fall I'll be joining @jacobandreas's amazing lab at MIT for a postdoc to work on interp. for reasoning (with @ev_fedorenko 🤯 among others). Cannot wait to think more about this direction in such a dream academic context!
i just got an art grant from the council for the arts at MIT! *Tangible Dreams* will let visitors experiment and play with a physical neural network generating images in real time: by twisting knobs and switches, by reconnecting nodes together @ArtsatMIT
MIT NLP @ ICLR 2025 - catch @MehulDamani2 at poster 219, Thursday 3PM to chat about "Learning How Hard to Think: Input Adaptive Allocation of LM Computation"!
I am super excited to be presenting our work on adaptive inference-time compute at ICLR! Come chat with me on Thursday 4/24 at 3PM (Poster #219). I am also happy to chat about RL / reasoning / RLHF / inference scaling (DMs are open)!
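A schematic sketch of input-adaptive compute allocation as the poster title suggests: route easy inputs to a small sampling budget and hard inputs to a larger one. The difficulty predictor and sampling primitives below are hypothetical stand-ins; the paper learns the allocation, which this hard-coded heuristic only caricatures.

```python
import random

# Schematic sketch of input-adaptive allocation of test-time compute.
# `predict_difficulty` and `lm_sample` are hypothetical stand-ins; the paper
# learns how much compute each input deserves.

def predict_difficulty(question: str) -> float:
    """Stand-in difficulty score in [0, 1] (a learned predictor in practice)."""
    return min(1.0, len(question) / 200.0)

def lm_sample(question: str) -> str:
    """Stand-in for one sampled LM solution."""
    return random.choice(["answer A", "answer B"])

def answer_adaptively(question: str, max_samples: int = 16) -> str:
    # easy inputs get one sample; harder inputs get a larger voting budget
    budget = max(1, round(predict_difficulty(question) * max_samples))
    samples = [lm_sample(question) for _ in range(budget)]
    return max(set(samples), key=samples.count)  # majority vote

print(answer_adaptively("What is 2 + 2?"))
```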
New #NAACL2025 paper! 🚨 Transformer LMs are data-hungry; we propose a new auxiliary loss function (TreeReg) to fix that. TreeReg takes bracketing decisions from syntax trees and turns them into orthogonality constraints on span representations. ✅ Boosts pre-training data…
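A hedged sketch of what an orthogonality constraint on span representations could look like, assuming hidden states h and gold brackets from a syntax tree. The squared-cosine penalty between a span's inside and outside representations is an illustrative guess at "orthogonality constraints", not TreeReg's actual loss.

```python
import torch
import torch.nn.functional as F

# Hedged sketch of a TreeReg-style auxiliary loss: given hidden states and
# gold brackets from a syntax tree, push each span's representation toward
# orthogonality with its outside context.

def treereg_style_loss(h: torch.Tensor, spans) -> torch.Tensor:
    """h: (seq_len, d) hidden states; spans: gold (start, end) brackets, end exclusive."""
    loss, n = h.new_zeros(()), 0
    for start, end in spans:
        if start == 0 and end == len(h):
            continue  # no outside context for the root span
        inside = h[start:end].mean(dim=0)
        outside = torch.cat([h[:start], h[end:]]).mean(dim=0)
        loss = loss + F.cosine_similarity(inside, outside, dim=0) ** 2
        n += 1
    return loss / max(n, 1)

h = torch.randn(8, 16, requires_grad=True)
aux = treereg_style_loss(h, [(0, 3), (3, 8), (1, 3)])
aux.backward()  # added to the LM loss as a regularizer during pre-training
```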
✨ Big life updates ✨
- @afeyzaakyurek and I welcomed our baby!
- Successfully defended my PhD and graduated from MIT 🎓
- Joined @OpenAI 🍓
Excited for what's next!
Tackling complex problems with LMs requires search/planning, but how should test-time compute be structured? Introducing Self-Steering, a new meta-reasoning framework where LMs coordinate their own inference procedures by writing code!
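A minimal sketch of the self-steering loop as the tweet describes it: a planner LM writes an inference program, and a runtime executes it. `planner_lm` and `query` below are hypothetical stand-ins, not the paper's actual API.

```python
# Minimal sketch of the self-steering idea: a planner LM writes an inference
# *program*, and a runtime executes it. `planner_lm` and `query` are
# hypothetical stand-ins, not the paper's actual API.

def query(prompt: str) -> str:
    """Stand-in for a call to a follower LM."""
    return f"<answer to: {prompt}>"

def planner_lm(task: str) -> str:
    """Stand-in planner: emits Python code that structures the search itself."""
    return (
        "subgoals = [query(f'decompose step {i} of: {task}') for i in range(3)]\n"
        "result = query('combine: ' + '; '.join(subgoals))"
    )

def self_steer(task: str) -> str:
    env = {"query": query, "task": task}
    exec(planner_lm(task), env)  # LM-written code coordinates its own inference
    return env["result"]

print(self_steer("prove the lemma"))
```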