UC Santa Barbara NLP Group
@ucsbNLP
NLP Researchers @ University of California, Santa Barbara. Profs. @WilliamWangNLP, Xifeng Yan, Simon Todd, @CodeTerminator; acct run by @XiaoSophiaPu
It is official: my first day at UCSB as a faculty member. Happy to join the force to advance AI research and nurture talent. Let's go @ucsbNLP @ucsbcs!
Congratulations to all! Super excited to graduate 7 PhD students this year: they are joining UW, Princeton, Google x 2, xAI, Amazon, and Meta. They are the future of AI!
Lined up and ready to walk (put me on the Jumbotron pls) #ucsb2025
Finally completed this journey. Thanks a lot to my family, advisors and friends!
1/3 How can we reliably trace a student’s understanding with just a few exercise responses? In classrooms, it’s important for teachers to track students’ understanding and predict their future performance on exercises. However, existing knowledge tracing (KT) algorithms often…
After all the papers, presentations, and sleepless nights... You can finally say it: Call me Doctor. 😎🎓 Congrats to the PhD Class of 2025! ❤️
RLVR is not just about RL, it's more about VR! Particularly for LLM coding, good verifiers (tests) are hard to get! In our latest work, we ask 3 questions: How good are current tests? How do we get better tests? How much does test quality matter? leililab.github.io/HardTests/
If you're trying to decide which CVPR tutorial to attend, check out the tutorial on Evaluating Multimodal Models by @KaijieZhu07, @XiaoSophiaPu, @nieyuzhou, Hao Chen, and me. Unfortunately, I won't be able to attend, but my lovingly crafted slides are in the mix :)
Lab celebration lunch after @XinyiWang98's defense!


📰 Thrilled to be featured in my hometown newspaper 🤯 @diariodeburgos as part of a series on people living abroad! The article discusses my journey from Burgos to California. Check out the article in Spanish 🇪🇸 🔗 diariodeburgos.es/noticia/z35dd9…
I’m attending #ICLR in Singapore! Also excited to share that I’m joining the Princeton Language and Intelligence Lab as a postdoc in July. In Fall 2026, I’ll be starting as an Assistant Professor at the University at Buffalo. I’ll be recruiting—feel free to reach out and chat!
🧠 Reasoning models often overthink. 🚀 In our new paper, we show: 1️⃣ Two overthinking scores. 2️⃣ DUMB500 — a benchmark of extremely easy questions. 3️⃣ THOUGHT TERMINATOR — a decoding method that reduces token waste by up to 90%, often improving accuracy. Details below 👇
📜🚨 Check out our latest work on "Self-Resource Allocation in Multi-Agent LLM Systems" where we explore how LLMs can be used to optimize task allocation in multi-agent systems 🤖 🧵(1/3)
🎉Thrilled to share my internship work with the @NVIDIA GenAIR team (accepted to #CVPR2025): BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations! 🚀BlobGEN-Vid is a model-agnostic framework that delivers: - SOTA layout controllability - Enhanced…
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
I’m excited to share that I’ve joined @AIatMeta as a Research Scientist, where I’ll be working on Llama Post-training (multimodality). Can’t wait to work with the team and contribute to open-source multimodal AI! This new chapter comes right after successfully completing my PhD…
🚀🚀 Life update: I just defended my PhD and have joined @xai to advance the visual generation models with @hexianghu @tingchenai Words cannot describe how grateful I am to my awesome advisor, @WilliamWangNLP, and my committee members, Xifeng Yan and @WenhuChen. Huge thanks to…
🚀 Excited to share the latest on our recent UCSB NLP alumni! 🎉 🔹 @JiachenLi11 (PhD '25) → xAI 🔹 @yujielu_10 (PhD '25) → Meta GenAI 🔹 @PanLiangming (Postdoc '24) → Asst. Prof., Arizona 🔹 @Qnolan4 (PhD '24) → Meta GenAI Proud of their achievements—onward and upward! 🚀
🚀🚀🚀 Today's top SWE-bench model, built using @anton_iades's SWE-Search
👑
happy to see an approach that took inspiration from SWE-Search achieving SOTA on SWE-bench. while there is still plenty of room for improvement, search is a vital tool for navigating complex SWE environments, and I expect more approaches to follow suit.
New blogpost dropped: Towards Reliable and Trustworthy MultiAgent Debate amayuelas.me/blog/2024/11/2…