Nick Haber
@nickhaber
Interactively learning AI, cognitive models, learning tools. Assistant Professor at Stanford.
@ilyasut and @ericzelikman will likely win the Turing Award for reasoning. We all stand on the shoulders of giants: Marvin Minsky, John McCarthy, and Richard Sutton worked on processes for learning, knowledge representation, etc.
Whoever is eventually acknowledged as the “inventor” of reasoning models will win the Turing Award. I suppose we all know who that will be.
Interested in LLM evaluation reliability & efficiency? Check our ICML’25 paper Reliable and Efficient Amortized Model-based Evaluation arxiv.org/abs/2503.13335 w/ @percyliang @uiuc_aisecure @sanmikoyejo @yuhengtu @VirtueAI_co @StanfordAILab @stai_research @StanfordCRFM 🧵1/9
Low-cost AI therapy chatbots may seem like a perfect solution for many seeking mental health help. But a recent Stanford study raises some red flags: hai.stanford.edu/news/exploring…
What would it mean to train a verifier in a domain that's *very* far from having cheap/free feedback? Super excited to see this work with @sebbrusso @DanielFein7 @ZiyuX @kabirjolly_ @rm_rafailov out.
Introducing LitBench, the first standardized benchmark for creative writing verifiers! We use Reddit’s r/WritingPrompts to label human preferences across 50k story-pairs, and see how LLM-as-a-judge, Generative RMs, and Bradley-Terry RMs stack up.
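(For illustration, a minimal sketch of the Bradley-Terry pairwise objective such reward models are trained with, on toy scalar rewards for story pairs; this is generic and not LitBench's actual training pipeline.)

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise preference loss: P(chosen > rejected) = sigmoid(r_chosen - r_rejected).
    Minimizing it maximizes the likelihood of the human-labeled preferences."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage with made-up scalar rewards for a batch of three story pairs.
r_chosen = torch.tensor([1.2, 0.3, 2.0])
r_rejected = torch.tensor([0.5, 0.9, 1.1])
print(bradley_terry_loss(r_chosen, r_rejected))
```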
As we optimize model reasoning against verifiable objectives, how does this affect humans' ability to understand that reasoning and use it for better collaborative outcomes? In our new preprint, we investigate human-centric model reasoning for knowledge transfer 🧵:
1/ I'm excited to share recent results from my first collaboration with the amazing @aran_nayebi and @Leokoz8! We show how autonomous behavior and whole-brain dynamics emerge in embodied agents with intrinsic motivation driven by world models.
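(A generic sketch of one common form of world-model-driven intrinsic motivation, prediction-error curiosity; this is illustrative and not necessarily the formulation used in the paper.)

```python
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Toy forward model: predicts the next state from (state, action)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def intrinsic_reward(world_model, state, action, next_state):
    """Curiosity-style reward: the world model's prediction error.
    The agent is rewarded for visiting transitions it cannot yet predict."""
    with torch.no_grad():
        pred = world_model(state, action)
    return ((pred - next_state) ** 2).mean(dim=-1)
```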
Working with @sergeypx on an unbiased estimator of the attention gradient that has very low variance even when, in expectation, only a small number of backward tokens is considered.
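(A generic illustration of the underlying trick, inverse-probability weighting making a subsampled sum unbiased; this is not the attention-specific estimator itself.)

```python
import torch

def subsampled_sum(losses: torch.Tensor, keep_prob: torch.Tensor) -> torch.Tensor:
    """Unbiased estimate of losses.sum() from a random subset of terms.

    Each term i is kept independently with probability keep_prob[i] and
    reweighted by 1 / keep_prob[i]; the estimate equals the full sum in
    expectation, so its gradient is likewise unbiased while only the kept
    terms need a backward pass.
    """
    mask = (torch.rand_like(keep_prob) < keep_prob).float()
    return (mask / keep_prob * losses).sum()

# Toy check: keeping high-loss tokens with higher probability lowers variance.
losses = torch.randn(1024).abs()
keep_prob = (losses / losses.max()).clamp(min=0.05)
estimates = torch.stack([subsampled_sum(losses, keep_prob) for _ in range(1000)])
print(estimates.mean().item(), losses.sum().item())  # close in expectation
```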
Go see @locross present!
Delighted to be in Singapore 🇸🇬 for #ICLR2025! I’ll be presenting Hypothetical Minds tomorrow. Stop by to chat about LLM agents 📅 Friday April 25, 10AM-12:30PM 📌 Hall 3 + Hall 2B #288
Super excited to get this out there! Work led by @sunfanyun and @Weiyu_Liu_, with Siyi Gu, @dill_pkl, Goutam Bhat, @fedassa, @ManlingLi_, and @jiajunwu_cs Project site: ai.stanford.edu/~sunfanyun/lay…
Spatial reasoning is a major challenge for foundation models today, even in simple tasks like arranging objects in 3D space. #CVPR2025 Introducing LayoutVLM, a differentiable optimization framework that uses VLMs to spatially reason about diverse scene layouts from unlabeled…
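(The general recipe behind differentiable layout optimization, sketched with a hypothetical two-object scene and a made-up "right of" constraint; this is not LayoutVLM's actual objective.)

```python
import torch

# Hypothetical 2D positions for two objects, optimized to satisfy a
# VLM-proposed relation such as "the lamp is ~0.5 units to the right of the desk".
desk = torch.tensor([0.0, 0.0], requires_grad=True)
lamp = torch.tensor([2.0, 1.0], requires_grad=True)

def right_of(a: torch.Tensor, b: torch.Tensor, gap: float = 0.5) -> torch.Tensor:
    """Differentiable cost that is zero when `a` sits `gap` to the right of `b`
    at the same depth (an illustrative constraint, not the paper's)."""
    return (a[0] - (b[0] + gap)) ** 2 + (a[1] - b[1]) ** 2

opt = torch.optim.Adam([desk, lamp], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = right_of(lamp, desk)
    loss.backward()
    opt.step()
print(lamp.detach(), desk.detach())  # lamp ends up roughly at desk + (0.5, 0)
```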
One of the biggest problems with VLMs today is their over-reliance on language semantics. We introduce Symmetrical Visual Contrastive Optimization (S-VCO), a simple post-training method for enhancing VLMs' ability to discern visual details.
Some exciting new work of ours led by @ShengguangWu with @sunfanyun and @wen_kaiyue!
❓Do VLMs really pay attention to image inputs? 😮Shockingly, a VLM is most likely to generate the response below about 𝒶 𝒹𝑜𝑔 when given 𝐧𝐨 𝐢𝐦𝐚𝐠𝐞 𝐚𝐭 𝐚𝐥𝐥—and least likely when shown the correct image. 🏆To tackle this 𝐯𝐢𝐬𝐮𝐚𝐥 𝐧𝐞𝐠𝐥𝐞𝐜𝐭, we introduce a…
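(A rough sketch of the diagnostic described above: score the same response under the correct image and under no image, and flag neglect when the no-image likelihood is higher. The `log_likelihood` helper is an assumed user-supplied wrapper around a VLM, not an actual API from the paper.)

```python
from typing import Callable, Optional

def visual_neglect_score(
    log_likelihood: Callable[[Optional[object], str, str], float],
    image, prompt: str, response: str,
) -> float:
    """Positive score means the VLM assigns the response higher likelihood
    *without* the image, i.e. the symptom described above.

    `log_likelihood(image, prompt, response)` is an assumed helper returning
    log p(response | prompt, image) for a given VLM; image=None means a
    text-only prompt.
    """
    with_image = log_likelihood(image, prompt, response)
    without_image = log_likelihood(None, prompt, response)
    return without_image - with_image
```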
It was really fun to have been a part of this -- so excited to see it out there!
We have a new position paper on "inference-time compute" and what we have been working on over the last few months! We present some theory on why it is necessary, how it works, why we need it, and what it means for "super" intelligence.
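(One of the simplest instances of inference-time compute, best-of-N sampling against a verifier or reward model; purely illustrative, not the specific methods analyzed in the position paper. `generate` and `score` are assumed user-supplied callables.)

```python
from typing import Callable, List

def best_of_n(
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    prompt: str,
    n: int = 16,
) -> str:
    """Spend extra inference-time compute by sampling n candidate answers and
    returning the one the verifier/reward model scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```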