Jessy Lin
@realJessyLin
PhD @Berkeley_AI, visiting researcher @AIatMeta. Interactive language agents 🤖 💬
I’ll be at #ICLR2025 this week! ✈️ A couple of things I’m excited about lately: 1) Real-time multimodal models: how do we post-train assistants for real-time (and real world) tasks beyond the chat box? 2) Continual learning and memory: to have models / agents that learn from…
The Bitter Lesson does not say not to bother with methods research. It says not to bother with methods that are handcrafted datapoints in disguise.
💯 Can't wait for the second blog! This could be an important step towards making AI agents more "human-centered". We want AI agents to help users (safely ofc), yet solely optimizing for tasks w/o "users" in the picture might not get us there, e.g., x.com/metr_evals/sta…
User simulators bridge RL with real-world interaction // jessylin.com/2025/07/10/use… How do we get the RL paradigm to work on tasks beyond math & code? Instead of designing datasets, RL requires designing environments. Given that most non-trivial real-world tasks involve…
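For context, the post's framing is that in interactive tasks the RL "environment" is largely a simulated user, with task success as the reward. Below is a minimal sketch of that idea, not code from the post: the class names are illustrative assumptions, and the keyword-matching "user" is a placeholder where a real simulator would be an LLM role-playing a persona.

```python
import random

class SimulatedUser:
    """Stands in for a real user: holds a hidden goal and replies to the agent."""
    def __init__(self, goal: str):
        self.goal = goal
        self.satisfied = False

    def respond(self, agent_message: str) -> str:
        # Placeholder logic: a real simulator would be an LLM prompted with a
        # user persona and goal, not a substring check.
        if self.goal in agent_message:
            self.satisfied = True
            return "Yes, that's exactly what I needed, thanks!"
        return "Not quite - let me explain again what I'm after."

class UserSimEnv:
    """RL environment where each episode is a dialogue with a simulated user."""
    GOALS = ["book a flight", "refund an order", "reset a password"]

    def reset(self) -> str:
        self.user = SimulatedUser(random.choice(self.GOALS))
        self.turns = 0
        return "Hi, I need some help."  # first user utterance is the observation

    def step(self, agent_message: str):
        self.turns += 1
        user_reply = self.user.respond(agent_message)
        done = self.user.satisfied or self.turns >= 10
        reward = 1.0 if self.user.satisfied else 0.0  # interaction outcome as reward
        return user_reply, reward, done

env = UserSimEnv()
obs = env.reset()
obs, reward, done = env.step("Sure - shall we book a flight?")
```

The point of the sketch: the reward comes from interaction outcomes rather than a labeled dataset, so "designing the environment" mostly means designing the user simulator.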
underrated idea to learn passively about people from everyday computer use - I think the natural extension is learning from *trajectories* of how people prefer to do things, which is hard to get from prompting / static user data otherwise
What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵
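One way to read the "trajectories" point above: the *sequence* of actions carries preference information that a static profile misses. A hedged sketch of the passive-learning loop, with hypothetical names throughout (`fake_llm` stands in for any chat-completion call; this is not the GUM paper's implementation):

```python
from dataclasses import dataclass

@dataclass
class Proposition:
    text: str          # e.g. "checks negative reviews before buying"
    confidence: float  # 0..1, to be revised as more evidence arrives

def infer_propositions(observation: str, llm) -> list[Proposition]:
    """Ask an LLM what a single observation suggests about the user."""
    reply = llm(f"Observation of the user's screen: {observation}\n"
                "List what this suggests about the user, one claim per line.")
    return [Proposition(line.strip(), confidence=0.5)
            for line in reply.splitlines() if line.strip()]

def fake_llm(prompt: str) -> str:  # stand-in so the sketch runs end to end
    return "checks negative reviews before buying\nshops deliberately, not impulsively"

# A trajectory, not a snapshot: the ordering (open reviews -> sort by lowest
# rating -> read those first) is itself evidence about how the user works.
trajectory = [
    "opened a product page, then immediately clicked 'Reviews'",
    "sorted reviews by lowest rating before reading any",
]
user_model = []
for obs in trajectory:
    user_model.extend(infer_propositions(obs, fake_llm))
print(user_model)
```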
40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synthesizing a ton of agentic training data from 100+ Python repos. Today we’re open-sourcing the toolkit that made it happen: SWE-smith.
Today, we are launching the first publicly available AI Scientist, via the FutureHouse Platform. Our AI Scientist agents can perform a wide variety of scientific tasks better than humans. By chaining them together, we've already started to discover new biology really fast. With…
New on Rising Tide, I break down 2 factors that will play a huge role in how much AI progress we see over the next couple years: verification & generalization. How well these go will determine if AI just gets super good at math & coding vs. mastering many domains. Post excerpts:
chatgpt memory is like the buzzfeed quiz of 2025
ChatGPT just got an INSANE new memory update. It remembers things about you between chats, in a sophisticated and intelligent way. Best prompt to try? “Tell me some unexpected things you remember about me”
We built an AI assistant that plays Minecraft with you. Start building a house—it figures out what you’re doing and jumps in to help. This assistant *wasn't* trained with RLHF. Instead, it's powered by *assistance games*, a better path forward for building AI assistants. 🧵
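For context on the paradigm: in an assistance game the agent never observes the human's reward. It keeps a belief over possible goals, updates that belief from the human's actions, and helps with whatever it currently believes the goal is. A toy sketch of that loop follows; the goal set, likelihoods, and numbers are made-up assumptions, not the project's model.

```python
GOALS = ["wood_house", "stone_tower"]

# Toy likelihood model: P(human action | goal).
LIKELIHOOD = {
    ("place_wood", "wood_house"): 0.8, ("place_wood", "stone_tower"): 0.1,
    ("place_stone", "wood_house"): 0.2, ("place_stone", "stone_tower"): 0.9,
}

def update_belief(belief: dict, human_action: str) -> dict:
    """Bayesian update: P(goal | action) is proportional to P(action | goal) * P(goal)."""
    posterior = {g: LIKELIHOOD.get((human_action, g), 0.05) * p
                 for g, p in belief.items()}
    z = sum(posterior.values())
    return {g: p / z for g, p in posterior.items()}

def assist(belief: dict) -> str:
    """Help with the goal the agent currently finds most likely."""
    goal = max(belief, key=belief.get)
    return f"gather materials for {goal}"

belief = {g: 1 / len(GOALS) for g in GOALS}  # uniform prior over goals
for action in ["place_wood", "place_wood"]:   # watch the human start building
    belief = update_belief(belief, action)
    print(belief, "->", assist(belief))
```

The contrast the thread draws with RLHF is that the goal uncertainty stays live during the interaction: the agent watches what you build before committing to help.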
1/ LLM agents can code—but can they ask clarifying questions? 🤖💬 Tired of coding agents wasting time and API credits, only to output broken code? What if they asked first instead of guessing? 🚀
Fascinating interviews. I'm not sure humans will ever be "out of the loop" in math. Even if humans have no advantages in proving theorems, they are still going to matter in asking questions. Mathematics is not just about what is true, but also what is interesting - to humans!
8/ If these challenges are overcome, what then? One thing that all four mathematicians agreed on is that full automation of math research is possible in principle, although this would likely be preceded by a period of human-AI collaboration.
Can we predict emergent capabilities in GPT-N+1🌌 using only GPT-N model checkpoints, which have random performance on the task? We propose a method for doing exactly this in our paper “Predicting Emergent Capabilities by Finetuning”🧵
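A rough illustration of the extrapolation idea (emphatically not the paper's actual emergence law): finetune the available small checkpoints on the task, fit a parametric curve to their post-finetuning accuracy as a function of scale, then evaluate the curve at the next scale, where the pretrained models still score at chance. All numbers below are fabricated for demonstration.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(log_c, mid, slope, floor, ceil):
    """Task accuracy as a logistic function of log10(pretraining compute)."""
    return floor + (ceil - floor) / (1 + np.exp(-slope * (log_c - mid)))

# log10 compute of the available checkpoints and their accuracy AFTER finetuning
log_compute = np.array([19.0, 20.0, 21.0, 22.0])
finetuned_acc = np.array([0.27, 0.35, 0.55, 0.78])  # hypothetical numbers

params, _ = curve_fit(sigmoid, log_compute, finetuned_acc,
                      p0=[21.0, 1.0, 0.25, 1.0], maxfev=10_000)

# Extrapolate to a scale none of the checkpoints reach
print("predicted accuracy at 1e23 FLOPs:", sigmoid(23.0, *params))
```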
+1 to the key idea here - it's def important to iterate on algorithms with clean benchmarks like math+code with known reward functions, but almost every task we care about in the real world has a fuzzy / human-defined reward func. I'm interested to see how we'll end up applying…
i wrote a new essay called The Problem with Reasoners where i discuss why i doubt o1-like models will scale beyond narrow domains like math and coding (link below)
Using AI agents to help humans understand and audit complex AI systems — I'm really excited by the long-term vision Jacob and Sarah are working on here!
Announcing Transluce, a nonprofit research lab building open source, scalable technology for understanding AI systems and steering them in the public interest. Read a letter from the co-founders Jacob Steinhardt and Sarah Schwettmann: transluce.org/introducing-tr…