Christopher Potts
@ChrisGPotts
Stanford Professor of Linguistics and, by courtesy, of Computer Science, and member of @stanfordnlp and @StanfordAILab. He/Him/His.
All the screencasts for this course are now freely available on YouTube; if you take the official course, you also get access to an expert teaching team, and I myself do some webinar/discussion events: youtube.com/playlist?list=…
Master the tech transforming AI. Natural Language Understanding taught by @ChrisGPotts starts 8/21 and runs 10 weeks. Course updated to reflect the latest developments in LLMs and generative AI. Enroll now! online.stanford.edu/courses/xcs224… @stanfordnlp @StanfordAILab
imagine the lucy benchmark -- can u get lucy to do IMO problems? can u get lucy to draw a well-proportioned person? can u get lucy to stop scrolling labubus on tiktok? truly difficult, next-level stuff
We're presenting the Mechanistic Interpretability Benchmark (MIB) now! Come and chat - East 1205. Project led by @amuuueller @AtticusGeiger @sarahwiegreffe
Introducing ColQwen-Omni, a 3B omnimodal retriever that extends the ColPali concept of multimodal retrieval with late interaction to audio chunks and short videos, with no performance degradation on visual document retrieval wrt our best models! (1/N)
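For readers unfamiliar with "late interaction": a minimal sketch of MaxSim scoring, the mechanism ColBERT/ColPali-style retrievers use, is below. It is a generic illustration (random tensors stand in for real encoder outputs), not the ColQwen-Omni implementation.

```python
# Minimal late-interaction (MaxSim) scoring sketch; illustrative only.
import torch
import torch.nn.functional as F

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """query_emb: [num_query_tokens, dim]; doc_emb: [num_doc_chunks, dim]
    (image patches, audio chunks, or video frames). Both assumed L2-normalized."""
    sim = query_emb @ doc_emb.T              # cosine similarity, token x chunk
    # Each query token keeps its best-matching chunk; scores are summed.
    return sim.max(dim=1).values.sum()

# Toy usage with random embeddings standing in for real encoder outputs.
q = F.normalize(torch.randn(16, 128), dim=-1)
d = F.normalize(torch.randn(800, 128), dim=-1)
print(float(maxsim_score(q, d)))
```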
Excited to take part in @TEDAISF this fall!
📣 Speaker Announcement: @judyefan, Asst Prof of Psychology at @Stanford, joins @TEDAISF 2025! At Stanford’s Cognitive Tools Lab, she studies how our mental tools shape thought, and what that means for designing smarter AI. 📅 October 21–22, 2025 📍 San Francisco Link below ⬇️
The #SIGIR2025 Best Paper just awarded to the WARP engine for fast late interaction! Congrats to Luca Scheerer🎉 WARP was his @ETH_en MS thesis, completed while visiting us at @StanfordNLP. Incidentally, it's the fifth Paper Award for a ColBERT paper since 2020!* Luca did an…
📢 If you’re at #SIGIR2025 this week, make sure to be at Luca Scheerer’s paper talk: “WARP: An Efficient Engine for Multi-Vector Retrieval” (Wednesday 11am) WARP makes PLAID, the famous ludicrously fast ColBERT engine, another 3x faster on CPUs. With the usual ColBERT quality!
ICML ✈️ this week. open to chat and learn mech interp from you. @aryaman2020 and i have cool ideas about steering, just come to our AxBench poster. new steering blog: zen-wu.social/steer/index.ht… Chinese version: zen-wu.social/steer/cn_index…
i forgot the whole point of saying you're at a conference is to advertise your poster please come check out AxBench by @ZhengxuanZenWu* me* et al. on Tuesday, 15 July at 11 AM - 1:30 PM
new paper! 🫡 we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering. we find that: 🥇prompting and finetuning are still best 🥈supervised interp methods are effective 😮SAEs lag behind
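For context on the "model steering" axis: below is a minimal sketch of one common steering recipe, adding a scaled concept direction to a layer's hidden states via a forward hook. It is a generic illustration, not AxBench's code; the model, layer index, and random direction are all placeholder assumptions.

```python
# Activation-steering sketch: add a concept direction to one layer's residual stream.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"                          # placeholder model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer = model.transformer.h[6]               # an arbitrary middle layer
steer = torch.randn(model.config.n_embd)     # stand-in for a learned concept direction
steer = steer / steer.norm()

def add_direction(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + 8.0 * steer            # steering strength is a tunable scalar
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = layer.register_forward_hook(add_direction)
ids = tok("The weather today is", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```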
Overdue job update -- I am now: - A Visiting Scientist at @schmidtsciences, supporting AI safety and interpretability - A Visiting Researcher at the Stanford NLP Group, working with @ChrisGPotts I am so grateful I get to keep working in this fascinating and essential area, and…
this paper will be presented at COLM later this year! looking back, i'm glad i tried something slightly out of my normal range in interp. ultimately, i feel that real-world models are much messier than can be satisfactorily explained via behaviour -- we must open the black box
New paper! 🫡 In-context learning (ICL) is when LLMs infer how to do a task from examples. We know that the relationship between # of ICL examples and task accuracy is predictable. Can we predict the shape of the ICL curve using Bayesian assumptions? Our paper shows yes!
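To make "predicting the shape of the ICL curve" concrete, here is a toy sketch of fitting accuracy as a function of the number of in-context examples. The saturating-exponential form and the synthetic numbers are illustrative assumptions, not the paper's actual Bayesian scaling law.

```python
# Toy ICL-curve fit: accuracy vs. number of in-context examples (synthetic data).
import numpy as np
from scipy.optimize import curve_fit

def icl_curve(n, acc_max, acc_0, rate):
    # Accuracy rises from a zero-shot baseline acc_0 toward a ceiling acc_max.
    return acc_max - (acc_max - acc_0) * np.exp(-rate * n)

n_examples = np.array([0, 1, 2, 4, 8, 16, 32])
accuracy   = np.array([0.31, 0.45, 0.52, 0.61, 0.68, 0.71, 0.72])  # made-up measurements

params, _ = curve_fit(icl_curve, n_examples, accuracy, p0=[0.75, 0.3, 0.2])
acc_max, acc_0, rate = params
print(f"fitted ceiling={acc_max:.2f}, zero-shot={acc_0:.2f}, rate={rate:.2f}")
print("predicted accuracy at 64 shots:", round(icl_curve(64, *params), 3))
```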
SmolLM3 uses the APO preference loss! @KarelDoostrlnck great to see APO getting more adoption!
Everything you need to know is in our engineering blueprint
It bugs me that "pretraining" is not before training (but rather pre- some unspecified action) and that "post-training" is not after training (it is post-pretraining training). It's as though "preschool" referred to K-12 and "post-school" to college.
Our multimodal transformer tool for automating word-concreteness ratings is published today 📝 C-ratings are used in research across Cognitive Science 💶 They take time and money to collect ⚙️ Automation solves this + we get in-context ratings for free! nature.com/articles/s4427…
Hi @USAMBTurkiye, welcome to Türkiye! I'm a Computer Science PhD student at Stanford University, currently trying to renew my F-1 visa through the embassy in Ankara. I submitted my postal application on May 5 (under the old system) and have been waiting since then to receive the
A must-read interview with Chris!
I’ve joined @aixventureshq as a General Partner, working on investing in deep AI startups. Looking forward to working with founders on solving hard problems in AI and seeing products come out of that! Thank you @ychernova at @WSJ for covering the news: wsj.com/articles/ai-re…
ICML's election for its board of directors has begun. I've thrown my hat in the ring. Please consider voting for Gautam Kamath. I have experience with the governance of TMLR, COLT, and ALT, and I think I've demonstrated myself as a conscientious and engaged community member.
Created the same AI agent using 7 different frameworks: Google ADK, Agno, Atomic Agents, Autogen, DSPy, Langgraph, Pydantic AI. Take a guess which won? 😉 makingaiagents.substack.com/p/which-agent-…
Congratulations to @lateinteraction and the @DSPyOSS team! DSPy 3.0!! 🚀 Continuing to push the frontier for building AI software!🔥🔥🔥