Houjun Liu
@houjun_liu
CS @stanford. Reasoning enjoyer @stanfordnlp, (PO)MDPs @SISLaboratory, and speech technologies @CarnegieMellon. AGI and Emacs are cool.
Rigorous testing and falsification of LM behavior is possible, and you'll be able to do it soon with 15 lines of code. Can't wait to show y'all what we've been up to at @SISLaboratory soon :p
In aviation, safety is built in from the start. @Stanford's Mykel Kochenderfer brings that mindset to AI—stress-testing systems for trust & reliability. Learn how our AI Safety Science program is helping to make AI safer: schmidtsciences.org/safetyscience/ #AIAppreciationDay
woo hoo see y'all in Vienna very very soon!!! #acl2025
.@stanfordnlp papers at @aclmeeting in Vienna next week: • HumT DumT: Measuring and controlling human-like language in LLMs @chengmyra1 @sunnyyuych @jurafsky • Controllable and Reliable Knowledge-Intensive Task Agents with Declarative GenieWorksheets @harshitj__ @ShichengGLiu…
Pretty excited about what @FEhrsam and Jeremy and friends are up to
Half the world will experience a brain disorder in their lifetime. At @nudge, we're building brain interfaces that are safe, precise, and non-invasive to solve that problem. We've raised a $100M Series A led by @ThriveCapital and Greenoaks to go faster. We're hiring. Join us.
I have been waiting for this to be announced, it’s so amazing to see such elegant scaling of the Deep Think system where the same system can now achieve a gold at IMO! deepmind.google/discover/blog/…
Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to @lmthang and the team! deepmind.google/discover/blog/…
I gain a lot of mental clarity and peace from not bringing my phone: 1. In the bedroom for sleep, 2. For meals, coffee, or a snack with friends close to home/work. Both are very easy and worth trying.
please come to East building poster #1108 (ballroom A) rn
ICML ✈️ this week. open to chat and learn mech interp from you. @aryaman2020 and i have cool ideas about steering, just come to our AxBench poster. new steering blog: zen-wu.social/steer/index.ht… Chinese version: zen-wu.social/steer/cn_index…
⏳ Three weeks left! Submit your work to the MIB Shared Task at #BlackboxNLP, co-located with @emnlpmeeting Whether you're working on circuit discovery or causal variable localization, this is your chance to benchmark your method in a rigorous setup!
this paper will be presented at COLM later this year! looking back, i'm glad i tried something slightly out of my normal range in interp. ultimately, i feel that real-world models are much messier than can be satisfactorily explained via behaviour -- we must open the blackbox
New paper! 🫡 In-context learning (ICL) is when LLMs infer how to do a task from examples. We know that the relationship between # of ICL examples and task accuracy is predictable. Can we predict the shape of the ICL curve using Bayesian assumptions? Our paper shows yes!
SmolLM3 uses the APO preference loss! @KarelDoostrlnck great to see APO getting more adoption!
Everything you need to know is in our engineering blueprint
I've finished the three most important tasks on my new laptop: - Install Julia - Add Berkeley Mono - Install Helix
owo did you know something that sounds super confident makes you trust it more, including a LM? check out @neil_rathi's cool new paper in @COLM_conf
new paper 🌟 interpretation of uncertainty expressions like "i think" differs cross-linguistically. we show that (1) llms are sensitive to these differences but (2) humans overrely on their outputs across languages
you try to max github green squares; i try to max wandb blue squares; we are not the same

Thank you to everyone for your energy and enthusiasm in joining this adventure with me so far!
I met some of the most sparkly people in the Bay through @neo & some of my first friends at Stanford. also you get to talk to cool people around the Bay. also I get to annoy you with cool NLP takes on a Slack where literally everybody is one hop away. you should do it. due EOD
I'm beyond lucky to have been able to sit down with Jensen and ask him questions, all thanks to @Neo 💙 Students, midnight tonight is the deadline to apply for the Neo Scholars program. It's the best decision you'll make.
Big congratulations to @ShikharMurty on receiving his @Stanford PhD today!
Huge congratulations to @annadgoldie on receiving her @Stanford PhD today! It’s been a great journey!