Belinda Li
@belindazli
PhD student @MIT_CSAIL | formerly SWE @facebookai, BS'19 @uwcse | NLP, ML
Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵
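A minimal sketch of the model problem (my own illustration, not the paper's code; all names here are made up): each action in a sequence is a permutation of n items, and the world state after the sequence is the running composition, so a model that tracks state is implicitly computing this product.

import random
from itertools import permutations

n = 3
perms = list(permutations(range(n)))  # all 3! = 6 permutations of 3 items
actions = [random.choice(perms) for _ in range(5)]

# State tracking as composition: start from the identity permutation
# and fold in each action. new_state[j] = state[p[j]] applies p first,
# then the accumulated state (one of the two standard conventions).
state = tuple(range(n))
for p in actions:
    state = tuple(state[j] for j in p)

print("actions:", actions)
print("final state (composed permutation):", state)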

🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty --…
How do people reason so flexibly about new problems, bringing to bear globally-relevant knowledge while staying locally-consistent? Can we engineer a system that can synthesize bespoke world models (expressed as probabilistic programs) on-the-fly?
Life Update: I will join @UTiSchool as an Assistant Professor in Fall 2026 and will continue my work on LLMs, HCI, and Computational Social Science. I'm building a new lab on Human-Centered AI Systems and will be hiring PhD students in the coming cycle!
How do people reason while still staying coherent – as if they have an internal ‘world model’ for situations they’ve never encountered? A new paper on open-world cognition (preview at the world models workshop at #ICML2025!)
Come check out our "Assessing World Models" workshop tomorrow! We'll be discussing whether generative AI builds world models, and what these world models might look like.
Join us for the Workshop on Assessing World Models at ICML tomorrow! When: Friday July 17, 8:45am-5:15pm. Where: West Ballroom B (same floor as registration)
I'll be presenting "(How) Do Language Models Track State" at ICML! Come by our poster tomorrow, Tuesday July 15 from 4:30pm - 7pm to chat about LMs and whether/how they encode dynamic world models! 🔗 icml.cc/virtual/2025/p…
Ever since I started thinking seriously about AI value alignment in 2016-17, I've been frustrated by the inadequacy of utility+RL theory to account for the richness of human values. Glad to be part of a larger team now moving beyond those thin theories towards thicker ones.
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
individual reporting for post-deployment evals — a little manifesto (& new preprints!) tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.
Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.
LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
Thrilled to announce I’ll be joining @PurdueCS as an Assistant Professor in Fall 2026! My lab will work on AI thought partners, machines that think with people rather than instead of people – I'll be recruiting PhD students this upcoming cycle so reach out & apply if interested!
Some tasks are painful to do. But some are fulfilling and fun. How do they line up with the tasks that AI agents are set to automate? Not that well, based on our new paper "Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce"…
🚨 70 million US workers are about to face their biggest workplace transformation due to AI agents. But nobody asks them what they want. While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵
Transformers: ⚡️fast to train (compute-bound), 🐌slow to decode (memory-bound). Can Transformers be optimal in both? Yes! By exploiting sequential-parallel duality. We introduce Transformer-PSM with constant time per token decode. 🧐 arxiv.org/pdf/2506.10918
Are world models necessary to achieve human-level agents, or is there a model-free short-cut? Our new #ICML2025 paper tackles this question from first principles, and finds a surprising answer: agents _are_ world models… 🧵
If you're finishing your camera-ready for ACL (#acl2025nlp) or ICML (#icml2025 ) and want to cite co-first authors more fairly, I just made a simple fix to do this! Just add $^*$ to the authors' names in your bibtex, and the citations should change :) github.com/tpimentelms/ac…
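A hedged illustration of the trick (the entry below is entirely made up; see the linked repo for the actual fix): once the tweak is in place, marking co-first authors is just a matter of annotating the author field in your .bib file.

@inproceedings{doe2025example,
  title     = {An Example Paper},
  author    = {Doe, Jane$^*$ and Smith, Alex$^*$ and Lee, Sam},
  booktitle = {Proceedings of ICML},
  year      = {2025},
}

The rendered citations then carry the asterisks through, so co-first authorship shows up wherever the paper is cited.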
For this week’s NLP Seminar, we are thrilled to host @jacobandreas to talk about “Just Asking Questions”. When: 5/15 Thurs 11am PT. Non-Stanford affiliates registration form: forms.gle/svy5q5uu7anHw7…
Excited to announce that this fall I'll be joining @jacobandreas's amazing lab at MIT for a postdoc to work on interp. for reasoning (with @ev_fedorenko 🤯 among others). Cannot wait to think more about this direction in such a dream academic context!