Tom McCoy
@RTomMcCoy
Assistant professor @YaleLinguistics. Studying computational linguistics, cognitive science, and AI. He/him.
🤖🧠NOW OUT IN PNAS🧠🤖 Language models show many surprising behaviors. E.g., they can count 30 items more easily than 29. In Embers of Autoregression, we explain such effects by analyzing what LMs are trained to do: pnas.org/doi/10.1073/pn… Major updates since the preprint! 1/n

Transformer-based neural networks achieve impressive performance on coding, math & reasoning tasks that require keeping track of variables and their values. But how can they do that without explicit memory? 📄 Our new ICML paper investigates this in a synthetic setting! 🧵 1/13
One of the most exciting papers I've read in a while - highly recommended!🧠🤖 It gives a compelling account of how the human mind reasons so flexibly, and of what's missing from LLM reasoning
How do people reason while still staying coherent – as if they have an internal ‘world model’ for situations they’ve never encountered? A new paper on open-world cognition (preview at the world models workshop at #ICML2025!)
🤖New paper w/ @zh_herbert_zhou @SimonCharlow @bob_frank in ACL2025 & SCiL 💡We use core ideas from Dynamic Semantics to evaluate LLMs and find that they show human-like judgments on anaphora accessibility but rely on specific lexical cues under closer scrutiny. 🧵1/6
So much research is being done about LLMs that it's hard to stay on top of the literature. To help with this, I've made a list of all the most important papers from the past 8 years: rtmccoy.com/pubs/ I hope you enjoy!
Can coding agents autonomously implement AI research extensions? We introduce RExBench, a benchmark that tests if a coding agent can implement a novel experiment based on existing research and code. Finding: Most agents we tested had a low success rate, but there is promise!
LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
How well can LLMs understand tasks with complex sets of instructions? We investigate through the lens of RELIC: REcognizing (formal) Languages In-Context, finding a significant gap between what LLMs can do in theory and how well they put this into practice.
The word "laundry" contains both steps of the laundry process: 1. Undry 2. Dry
Had a fun visit to UChicago/TTIC over the past couple days - really great group doing NLP & CompLing there!