Ken Liu
@kenziyuliu
CS PhD @StanfordAILab @StanfordNLP w/ @percyliang @sanmikoyejo. Past: DeepMind, CMU, USydney 🇦🇺
An LLM generates an article verbatim—did it “train on” the article? It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and after adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵
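One way to make the “n-gram definition of train-set inclusion” concrete: a completion counts as “seen” if all of its n-grams appear somewhere in the corpus. The sketch below is only a toy illustration of that notion — the paper’s actual definition, choice of n, and thresholds aren’t given in the post, and all names here are made up.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token sequence, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def included_by_ngram(sample, corpus_ngrams, n):
    """Toy n-gram membership test: the sample counts as 'in' the
    training set if every one of its n-grams appears in the corpus."""
    return ngrams(sample, n) <= corpus_ngrams

corpus = "the quick brown fox jumps over the lazy dog".split()
corpus_3grams = ngrams(corpus, 3)

# A span copied from the corpus passes the test...
assert included_by_ngram("quick brown fox jumps".split(), corpus_3grams, 3)
# ...while a novel sentence does not, even though it reuses corpus words.
assert not included_by_ngram("the quick red fox".split(), corpus_3grams, 3)
```

The tweet’s point is that definitions like this can misfire: a model may still complete text whose n-grams were deleted, or be “covered” by gibberish data that happens to contain the right n-grams.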

Compression is the heart of intelligence. From Occam to Kolmogorov, shorter programs = smarter representations. Meet KARL: Kolmogorov-Approximating Representation Learning. Given an image, a token budget T & a target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖🧵
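KARL’s encoder/decoder aren’t described in the post, but the “smallest t ≤ T within 𝜖” search can be sketched on its own: assuming reconstruction error is non-increasing in the token budget, a binary search finds the minimal budget. `reconstruct_error` below is a hypothetical stand-in for an encode-then-decode quality measure, not KARL’s actual model.

```python
def min_token_budget(reconstruct_error, T, eps):
    """Binary-search the smallest t <= T with reconstruct_error(t) <= eps.
    Assumes error is non-increasing in t (more tokens -> better recon).
    Returns None if even t = T cannot hit the target quality."""
    if reconstruct_error(T) > eps:
        return None
    lo, hi = 1, T
    while lo < hi:
        mid = (lo + hi) // 2
        if reconstruct_error(mid) <= eps:
            hi = mid  # mid is good enough; try smaller budgets
        else:
            lo = mid + 1
    return lo

# Toy stand-in: error decays as 1/t with the token budget.
toy_error = lambda t: 1.0 / t
print(min_token_budget(toy_error, T=64, eps=0.05))  # -> 20
```

The monotonicity assumption is what makes the binary search valid; with a non-monotone error curve a linear scan over t would be needed instead.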
Can data owners & LM developers collaborate to build a strong shared model while each retaining data control? Introducing FlexOlmo💪, a mixture-of-experts LM enabling: • Flexible training on your local data without sharing it • Flexible inference to opt in/out your data…
Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵
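FlexOlmo’s actual routing and training recipe aren’t spelled out in these posts; the sketch below only illustrates the opt-in/opt-out idea with a generic mixture-of-experts router in which a withdrawn data owner’s expert gets exactly zero weight at inference. All names are hypothetical.

```python
import math

def route(router_logits, active):
    """Softmax routing over experts, masking any expert whose data
    owner has opted out (active[i] = False => expert i gets 0 weight)."""
    kept = [l for l, a in zip(router_logits, active) if a]
    m = max(kept)  # subtract max for numerical stability
    exps = [math.exp(l - m) if a else 0.0
            for l, a in zip(router_logits, active)]
    s = sum(exps)
    return [e / s for e in exps]

# Owner of expert 1 opts out; its weight drops to zero and the
# remaining experts renormalize among themselves.
weights = route([2.0, 1.0, 0.5], [True, False, True])
print([round(w, 3) for w in weights])  # -> [0.818, 0.0, 0.182]
```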
i learned about this in a recent project and had to switch back from vLLM to HF (and eat like a 5x slowdown) just so my results are consistent. please spread and help a fellow researcher out 🙏 e.g. github.com/vllm-project/v… github.com/vllm-project/v… github.com/vllm-project/v… ...
horrifying bug of the day is finding out that vLLM and Hugging Face produce significantly different logprobs discuss.vllm.ai/t/numerical-di…
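A toy illustration of why this matters (this is not vLLM or HF code): when two backends compute the “same” logits with different kernels or reduction orders, per-token logprob gaps on the order of 1e-4 are already enough to flip greedy decoding at near-ties, so small numerical differences compound into different generations.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

# Two hypothetical backends computing the "same" logits, differing
# only at ~1e-3 (e.g. different kernels / reduction orders):
backend_a = [3.1415, 3.1411, -2.0]
backend_b = [3.1411, 3.1415, -2.0]

lp_a, lp_b = log_softmax(backend_a), log_softmax(backend_b)
max_diff = max(abs(x - y) for x, y in zip(lp_a, lp_b))
print(f"max |logprob gap| = {max_diff:.1e}")         # tiny...
print(lp_a.index(max(lp_a)), lp_b.index(max(lp_b)))  # ...but the greedy token flips
```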
Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.
So about a month ago, Percy posted a version of this plot of our Marin 32B pretraining run. We got a lot of feedback, both public and private, that the spikes were bad. (This is a thread about how we fixed the spikes. Bear with me.)
Marin 32B training crossed 1.5 trillion tokens today...
had a chance to play with the demo; cool system!
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust…
Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbband @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:
a comprehensive study on AI job automation! lots of interesting nuggets
🚨 70 million US workers are about to face their biggest workplace transformation due to AI agents. But nobody asks them what they want. While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵
What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵
new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param** we compute capacity by measuring total bits memorized, using some theory from Shannon (1953) shockingly, the memorization-datasize curves rise and then plateau once capacity is hit:

   ___________
  /
 /

(🧵)
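The headline number implies a simple capacity calculation: at 3.6 bits per parameter, a model’s total memorization budget in bytes is params × 3.6 / 8. A quick back-of-envelope (the model sizes below are illustrative, not from the paper):

```python
BITS_PER_PARAM = 3.6  # the paper's capacity estimate

def capacity_megabytes(n_params):
    """Approximate memorization capacity implied by 3.6 bits/param."""
    return n_params * BITS_PER_PARAM / 8 / 1e6  # bits -> bytes -> MB

for name, n in [("124M-param model", 124e6), ("1B-param model", 1e9)]:
    print(f"{name}: ~{capacity_megabytes(n):.0f} MB of memorized data")
# -> 124M-param model: ~56 MB of memorized data
# -> 1B-param model: ~450 MB of memorized data
```

This is also why the curves plateau: once the dataset exceeds this budget, the model cannot memorize any more bits and must generalize instead.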
cool new TTT method that achieves drastically higher MFU!
Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training) — a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch…
This year, there have been various pieces of evidence that AI agents are starting to be able to conduct scientific research and produce papers end-to-end, at a level where some of these generated papers were already accepted by top-tier conferences/workshops. Intology’s…
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: chat.deepseek.com 🔌 No change to API usage — docs here: api-docs.deepseek.com/guides/reasoni… 🔗…
we present a new representation steering training objective to rival prompting! and you also get: - a fun trick: you can mitigate the side-effects of randomly selecting steering factors by simply training with it. - a long appendix with our core dumps on steering
🎀 fine-grained, interpretable representation steering for LMs! meet RePS — Reference-free Preference Steering! 1⃣ outperforms existing methods on 2B-27B LMs, nearly matching prompting 2⃣ supports both steering and suppression (beat system prompts!) 3⃣ jailbreak-proof (1/n)
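The post doesn’t give RePS’s training objective, but the basic representation-steering operation it builds on can be sketched generically: shift a hidden state along a concept direction, with the sign of the coefficient switching between steering and suppression. Everything below is illustrative, not the RePS method itself.

```python
import math

def steer(hidden, direction, alpha):
    """Generic representation steering: shift a hidden state along a
    (unit-normalized) concept direction. alpha > 0 steers toward the
    concept; alpha < 0 suppresses it."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    return [h + alpha * u for h, u in zip(hidden, unit)]

h = [0.5, -1.0, 2.0]   # hypothetical hidden state
v = [3.0, 0.0, 4.0]    # hypothetical concept direction
print(steer(h, v, alpha=2.0))   # steering toward the concept
print(steer(h, v, alpha=-2.0))  # suppressing it
```

In practice the direction and the steering factor are what methods like RePS learn; here they are fixed by hand purely for illustration.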
new paper! 🫡 why are state space models (SSMs) worse than Transformers at recall over their context? this is a question about the mechanisms underlying model behaviour: therefore, we propose using mechanistic evaluations to answer it!
Is AI already superhuman at FrontierMath? To answer this question, we ran a competition at MIT, pitting eight teams of mathematicians against o4-mini-medium. Result: o4-mini beat all but two teams. And while AIs aren't yet clearly superhuman, they probably will be soon.