James Zou
@james_y_zou
@Stanford professor. Chan-Zuckerberg investigator. Sloan Fellow. Overton Prize. @togethercompute. AI for science + health.
🏆Thrilled that #CollabLLM won the #ICML2025 Outstanding Paper Award! We propose a new approach to optimize human-AI collaboration, which is critical for agents. Congratulations to my fantastic co-authors; great job @ShirleyYXWu and Michel Galley driving the project!👏

Can LLMs predict the future? Introducing FutureBench — a new benchmark where we tested how well agents perform on Polymarket and future event prediction! We found that Claude, GPT, DeepSeek all approach this differently. I'd like to see anyone try and contaminate this one. 😆
If you are at ICML2025, you are welcome to attend our workshop "Multi-modal Foundation Models and Large Language Models for Life Sciences" tomorrow (Saturday, Jul 19). The workshop features a stellar lineup of invited speakers, including Valentina Boeva @val_boeva, Manolis Kellis…
📢New conference where AI is the primary author and reviewer! agents4science.stanford.edu Current venues don't allow AI-written papers, so it's hard to assess the +/- of such works🤔 #Agents4Science solicits papers where AI is the main author w/ human advisors. 💡Initial reviews by…
Super exciting experiment on human-AI collaboration -- can #GenerativeAI innovate and do paper writing? Cannot wait to see how this unfolds!
🔮Exciting new benchmark testing how well AI predicts the future! Each week, we curate news + prediction markets for questions about next week. Then we have agents make forecasts. Requires advanced research + reasoning @togethercompute @huggingface 📜together.ai/blog/futureben… 🌐…
Most AI benchmarks test the past. But real intelligence is about predicting the future. Introducing FutureBench — a new benchmark for evaluating agents on real forecasting tasks that we developed with @huggingface 🔍 Reasoning > memorization 📊 Real-world events 🧠 Dynamic,…
Can LLMs predict the future? In FutureBench, friends from @togethercompute create new questions from evolving news & markets: As time passes, we'll see which agents are the best at predicting events that have yet to happen! 🔮 Also cool: by design, dynamic & uncontaminated eval
Generalization does not go as expected, and fine-tuning does not substitute for RAG. From @NEJM_AI, a study by @ericwu93 and @james_y_zou on fine-tuning frontier LLMs with medical data. More in the reply below.
CollabLLM won #ICML2025 ✨Outstanding Paper Award along with 6 other works! icml.cc/virtual/2025/a… 🫂 Absolutely honored and grateful for coauthors @MSFTResearch @StanfordAILab and friends who made this happen! 🗣️ Welcome people to our presentations about CollabLLM tomorrow…
Even the smartest LLMs can fail at basic multi-turn communication. Ask for grocery help → without asking where you live 🤦‍♀️ Ask to write articles → assumes your preferences 🤷🏻‍♀️ ⭐️CollabLLM (top 1%; oral @icmlconf) transforms LLMs from passive responders into active collaborators.…
🚀 Excited to share that our work GenSeg has been published in Nature Communications! GenSeg is an end-to-end, downstream-task-guided framework for generating synthetic training data for medical image segmentation. It significantly reduces the need for manual annotations — by a…
My wonderful students and collaborators are presenting several exciting papers at #ICML2025 this week! Check it out👇

the future is now -- Conference where AI is the primary author and reviewer! Very curious to see the papers published here!
I’ll have my Deep Research AI collaborators submit several scientific papers to this conference; it’s a fantastic idea!
This is an incredibly interesting conference. Excited to see what scientific findings can be produced by AI systems!
📝 We are so living in the future already. This is the 1st open conference where AI serves as both primary author and reviewer of research papers. Papers naming AI as primary author are due September 5, 2025.
Fun! Will be v interested to see how this goes
Super curious to see how AI writing and reviewing science plays out. Excited for this bold experimental conference 👀
Super interesting idea -- a conference where AI is both the author and the reviewer. Curious to see where it goes, @james_y_zou. But you know, the ultimate would be one where the AI is also the organizer :-)