Saadia Gabriel
@GabrielSaadia
UCLA NLP Prof. Previously UW, MIT and NYU.
I'm also kicking off year 2 of faculty life by doing all the challenging things people warned me not to do, starting by creating a new NLP course around ethics of large-scale language modeling. I welcome thoughts on this as well! I might document the process to help others.
I'll be celebrating a decade as an AI researcher in Vienna a week from now. If anyone at ACL wants to chat about evaluating LLM persuasion, multi-agent safety, making AI more factual and inclusive, whether open-ended therapy chatbots are a good idea, or really anything lmk.
Excited that X-Teaming is accepted to COLM 2025!! See you in Montreal!
Cracking the 𝐦𝐮𝐥𝐭𝐢-𝐭𝐮𝐫𝐧 safety challenge! ⚡️𝐗-𝐓𝐞𝐚𝐦𝐢𝐧𝐠⚡️ is a scalable red-teaming framework revealing diverse multi-turn LM vulnerabilities. Sneak peek: 96.2% attack success on Claude 3.7—despite its single-turn robustness & the largest multi-turn safety dataset!
I hope there's an extension to the current ARR author-reviewer discussion period. We've gotten very little response from reviewers, maybe because openreview has been crashing over the last few days (for example this is all I can see this morning). @emnlpmeeting @ReviewAcl

Thanks @LTIatCMU and @SCSatCMU for featuring our work!!✨💫 Our paper on culturally offensive nonverbal gestures is accepted to #ACL2025 main! Detailed thread🧵: x.com/akhila_yerukol… Preprint📜: arxiv.org/abs/2502.17710 Work done with @GabrielSaadia @VioletNPeng @MaartenSap
Hand gestures are a major mode of human communication, but they don't always translate well across cultures. New research from @akhila_yerukola, @MaartenSap and others is aimed at giving AI systems a hand with overcoming cultural biases: lti.cmu.edu/news-and-event…
Thrilled and honored to have received not one, but two grant awards from Google this month for work on RL to enhance LLM-patient interaction and societal aspects of AI agents!!
We’re announcing the 87 professors selected for the 2025 Google Research Scholar Program — join us in congratulating these exceptional recipients and learn more about their groundbreaking work at goo.gle/rs-recipients. #GoogleResearch #GoogleResearchScholar
🚨Thrilled to share our new work: AI debate combats misinformation better than single AI advisors! 🤔We tested if two AIs debating opposite sides helps biased humans judge controversial COVID-19 claims more accurately. Paper: arxiv.org/abs/2506.02175 🧵👇 #AI #Debate
Thank you to the whole OpenThoughts team for yet another great effort! @etash_guha, @ryanmart3n, @sedrickkeh2, @NeginRaoof_, @GeorgeSmyrnis1, @hbXNov, @marnezhurina, @MercatJean, @trungthvu, @ZayneSprague, @suvarna_ashima, @FeuerBenjamin, @cliangyu_, @codezakh, @esfrankel,…
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals. We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data…
The AISI Alignment Team is hiring a cognitive scientist to help boost our understanding of human errors and other aspects of scalable oversight protocols. Please apply if interested. Benjamin has details, and more in this thread. 🧵
We're hiring for a cognitive scientist to join the AISI Alignment Team! Cognitive science is a crucial field that we want to galvanise to help solve one of the most important problems of our time. Could you lead that effort?
Very much looking forward to this!
For this week’s NLP Seminar, we are thrilled to host Emma Pierson @2plus2make5 to give a talk titled Using New Data to Answer Old Questions! When: 5/16 Fri 2pm PT Registration: forms.gle/9sNYv2isfcqYQC…
Excited to announce our workshop on Visions of Language Modeling at COLM'25! 🔥 We thought that current LM research overly focuses on a narrow set of popular topics (e.g., test-time scaling and LLM agents), and we'd love to bring some entropy back 💪 To do this, we invited a…
For this week’s NLP Seminar, we are thrilled to host Aditya Kusupati @adityakusupati to give a talk titled Matryoshka Principles for Adaptive Intelligence! When: 5/9 Fri 2pm PT Registration: forms.gle/j3LNvNbdfnu1xL…
Such a brilliant idea
📢 New Paper! Tired 😴 of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning for plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story’s world 🌎 W/ @melaniesclar, and @tsvetshop 1/n
📣 Excited about our Cross Cultural Considerations in NLP workshop! @c3_nlp @ @naaclmeeting coming up in no time—just two weeks to go! Full program here: c3nlp.github.io Co-org. w/ @sunipa17 @LucianaBenotti @daniel_hers @vinodkpg @YongCaoPlus @LiZhou21 @IfeAdebara ❤️
🚨 Excited to share our new paper on 𝕏-Teaming! 🤖 Multiagent system for multiturn jailbreaking 🔍 96.2% attack success against Claude 3.7 (immune to single-turn attacks!) 💥 Up to 98.1% attack success on leading model 🛡️ Released 30K safety dataset 🧵below #AI #LLMSafety
Whether you're into AI safety, computational social science, or want to experiment with new moderation strategies, this framework is for you. Check our work at 📄Paper: arxiv.org/abs/2504.07830 💻Code: github.com/genglinliu/MOS… Huge thanks to my wonderful collaborators: Salman…
Excited to share my first project at UCLA! We built MOSAIC — a social network simulator where LLM-powered agents behave like real users on social media. They post, share, flag, and debate the factuality of news content — all at scale. It’s open-source. 🧵 TL;DR 🌐 Realistic…
For anyone wondering what happened with that multi-agent social simulation I’ve been talking about at IASEAI, Stanford and various other places over the last few months...the preprint is out!
Sharing highlights from an incredible day of presentations and dialogue! Thanks to all our presenters and participants... #IASEAI25 #EthicalAI #AIRegulation #TechPolicy
📣 For this week’s NLP Seminar, we are thrilled to host Zhe Gan @zhegan4 to give a talk titled “How to Build Your Multimodal LLMs: From Pre-training to Post-training and Agents”! 🗓️ 4/11 Fri 2pm PT Registration: forms.gle/TNXfBZJiMJjL18…