Mengdi Wang
@MengdiWang10
Professor @Princeton in AIML. Co-Director of #Princeton AI^2. Program Chair @ICLR2023. Formerly @MIT @GoogleDeepmind @Tsinghua. my Erdos number: 3
💔 2nd&3rd deaths linked to Sarepta gene therapy—trial pause, stock drop. Must accelerate safer gene & cell cures. AI design & AI agents + real world validation can help contribute! 🚀 AI momentum: SynBioBeta’s “Towards an AI-Driven CRISPR Future” (synbiobeta.com/read/towards-a…) charts…
Just returned from ICML 2025 where I had the honor of keynoting three remarkable workshops. Grateful for the opportunity to delve into topics like self-evolving Alita agents, CRISPR-GPT for AI-driven science, Genome-Bench, reinforcement-learning agents, and AI biosafety. Special…
🚨 Can AI design harmful viruses or toxic molecules? 🚨 🔥 At #NeurIPS2025, we’re launching a new workshop: 🔬 Biosecurity Safeguards for Generative AI Link: biosafe-gen-ai.github.io 🙏 Grateful to our amazing co-organizers and expert advisors! #BioSafeGenAI #AI4Science
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
📢 New frontier in AI4Science: What if AI could design novel pathogenic/toxic proteins & molecules? This raises urgent biosecurity questions. We're tackling this head-on at our #NeurIPS2025 Workshop on AI Biosecurity: biosafe-gen-ai.github.io AI foundation models trained on…
🚨 Two papers accepted in ICML 2025 🚨 1. MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations 🔗 website: math-perturb.github.io 🔖 paper: arxiv.org/pdf/2502.06453 x.com/KaixuanHuang1/…
Do LLMs have true generalizable mathematical reasoning capability or are they merely memorizing problem-solving skills? 🤨 We present MATH-Perturb, modified level-5 problems from MATH dataset to benchmark LLMs' generalizability to slightly perturbed problems. 🔗…
I'm teaching a new course this fall called "Robot Planning Meets Machine Learning." Half lectures, half seminar. Paper recommendations for the seminar part are welcome. I'm looking for any creative combinations of robot planning and ML.
Recent media misreport — about Meta’s AI orgs and (oddly) myself — clarifications: 🧪 FAIR is Meta’s long-term research lab — not GenAI, not MSL 🔍 We do open research with public data, no access to GenAI/MSL infra 😅 I’m not bald ⏳ No complaint — just asking folks to be patient
No matter how AI evolves overnight—tech, career, how it may impact me—I remain committed to using "physics of language models" approach to predict next-gen AI. Due to my limited GPU access at Meta, Part 4.1 (+new 4.2) are still in progress, but results on Canon layers are shining
Facebook AI Research (FAIR) is a small, prestigious lab in Meta. We don't train large models like GenAI or MSL, so it's natural that we have limited GPUs. GenAI or MSL's success or failure, past or future, doesn't reflect the work of FAIR. It is important to make this distinction
World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or…
Full house at the AI-for-RFIC session at @IEEE IMS: standing room only. Huge congrats to @YingqingGuo, our second-year PhD student at @Princeton ECE & Princeton AI Lab @PrincetonAInews, who won 2nd place Best Paper Award for her work on diffusion-based RF chip design! Coming…

🤔 How do you get an LLM to reason like a CRISPR pro, or any top scientist? By training it on real expert conversations. 🛠️ What we built • An automated pipeline that distills learning signals from 10+ years of genomics discussions • Genome-Bench: 3,000+ curated Q&As on…
Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for early-career researchers. I also gave a talk on "Research as an Infinite Game." Here are the slides: canva.com/design/DAGp0iR…
In this #CVPR2025 edition of our community-building workshop series, we focus on supporting the growth of early-career researchers. Join us tomorrow (Jun 11) at 12:45 PM in Room 209 Schedule: sites.google.com/view/standoutc… We have an exciting lineup of invited talks and candid…
LLMs often falter at real scientific reasoning – messy lab problems, ambiguous data, domain quirks. Could LLMs do scientific reasoning and provide expert suggestions in the lab? 🧪🔬 Our new paper shows how – by training LLMs on 10+ years of real CRISPR forum discussions and…
A grant of nearly $10 million was announced May 28 to fund a Princeton-led effort using AI to develop advanced semiconductors for wireless communication and remote sensing. The effort will be directed by @EPrinceton prof @KSG_Princeton Read more: bit.ly/3FoaXt9
AI risk is real. Paper from Princeton AI Lab shows it’s shockingly easy to jailbreak genome-focused LLMs—opening doors to dangerous misuse. We must build strong safeguards now. Check out our call on Nature Biotech that maps out the AI guardrail technologies needed to mitigate…
🧬 GeneBreaker: Unveiling the Biosecurity Risks of DNA Language Models DNA models can be "jailbroken" to generate sequences resembling human pathogens like SARS-CoV-2 and HIV-1. With up to a 60% success rate! 🔗 Read more: arxiv.org/abs/2505.23839
Most agents rely on pre-defined tools & hand-crafted workflows. Great in demos, clunky in practice. Alita is a game changer: AI agents self-evolve and create their own tools. 🛠️🧠 1/ No predefined static toolkits. Agents build MCP tools via trial & error: ✅ Search open-source…
The GAIA game is over, and Alita is the final answer. Alita takes the top spot in GAIA, outperforming OpenAI Deep Research and Manus. Many general-purpose agents rely heavily on large-scale, manually predefined tools and workflows. However, we believe that for general AI…