Sanmi Koyejo

@sanmikoyejo

I lead @stai_research at Stanford. Co-founder @VirtueAI_co

Stanford, CA

Joined September 2014

104 Following

3K Followers

Sanmi Koyejo Retweeted

Deep Learning Indaba@DeepIndaba · Jul 21

🚨 Keynote alert! We’re thrilled to welcome @sanmikoyejo as our next speaker in #DLI2025! Catch the session "Beyond benchmarks: Building a science of AI measurement" on Tuesday, August 19 at 9 am GMT+2. 🌐 Join us now via the Virtual Indaba: deeplearningindaba.com/2025/virtual-i… #Urunana…

Sanmi Koyejo@sanmikoyejo · Jul 26

Huge congrats to @RuneKvist and team on launching AIUC-1. Excited to see this unfold!

Rune Kvist@RuneKvist · Jul 23

To accelerate AI adoption, we need an AI standard. What Moody’s is for bonds, FICO for credit, SOC 2 for security. Standards offer credible signals of who to trust. They create confidence. Confidence accelerates adoption. Introducing AIUC-1: the world’s first AI agent standard

Sanmi Koyejo Retweeted

LLM Evals Workshop @NeurIPS@LLM_eval · Jul 22

Proudly organized by
- @BerivanISIK (Google)
- @beyzaermis (Cohere)
- @Diyi_Yang (Stanford)
- @MariusHobbhahn (Apollo Research)
- @attaluri_nithya (Google DeepMind)
- @RishiBommasani (Stanford)
- @YangjunR (U Toronto)
3/3

Sanmi Koyejo Retweeted

LLM Evals Workshop @NeurIPS@LLM_eval · Jul 22

Featuring talks from experts in the field:
- @esindurmusnlp (Anthropic)
- @isabela_alb (Google DeepMind)
- @_jasonwei (OpenAI)
- @seo_minjoon (KAIST)
- @natolambert (Allen Institute)
- @orf_bnw (Google DeepMind)
- @sanmikoyejo (Stanford)
- @lxuechen (xAI)
2/3

Sanmi Koyejo Retweeted

LLM Evals Workshop @NeurIPS@LLM_eval · Jul 22

We are happy to announce our @NeurIPSConf workshop on LLM evaluations! Mastering LLM evaluation is no longer optional -- it's fundamental to building reliable models. We'll tackle the field's most pressing evaluation challenges. For details: sites.google.com/corp/view/llm-…. 1/3

Sanmi Koyejo@sanmikoyejo · Jul 22

Evaluating LLMs is one of the most critical and nuanced challenges in AI today. I am super excited to be co-organizing this workshop @NeurIPSConf to discuss the most pressing evaluation challenges. Details 👇

LLM Evals Workshop @NeurIPS@LLM_eval · Jul 22

Sanmi Koyejo Retweeted

Virtue AI@VirtueAI_co · Jul 22

Sparking AI Security Research Impact at ICML 2025: We’re thrilled to share that the Virtue AI team had 19 papers accepted at ICML 2025, one of the most prestigious conferences in AI and machine learning. Our work spanned critical advances in Security for AI agents and AI…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 19

Sorry I meant this picture! :D Not the same one twice, X doesn't let me update my post.

Sanmi Koyejo@sanmikoyejo · Jul 19

Come to my second poster session about Data-centric Machine Learning (DMLR)! At 209-2010! #ICML2025

Brando Miranda@BrandoHablando · Jul 19

Come to 208-209 ICML data workshop and chat with me about how to use data optimally! Scale isn't everything! Ask me how to use it beyond post-training ;) - Scale isn’t enough: LLM performance rises with training‑task alignment more than with data volume. - Robust Alignment…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 19

Preprint on arxiv: arxiv.org/abs/2501.08496 🧵4/3

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 19

Joint work with @ObbadElyas, Mario, Krrish, Aryan, @sanmikoyejo, me, and Sudarsan at @stai_research! Thank you! 🧵3/3

Sanmi Koyejo@sanmikoyejo · Jul 19

We first demonstrated scale isn't enough in our Beyond Scale paper using the diversity coefficient! x.com/_akhaliq/statu… thanks for featuring us @_akhaliq ! Work led by @_alycialee et al! 🧵 2/3

AK@_akhaliq · Jun 27, 2023

Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data paper page: huggingface.co/papers/2306.13… Current trends to pre-train capable Large Language Models (LLMs) mostly focus on scaling of model and dataset size.…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 19

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

⚒️We also use Leni’s awesome Pantograph as a Python-Lean interface! #TACAS2025 🧵18/14 github.com/stanford-centa…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

🙏We also want to thank our Verification expert @VeryCuellar John Sarrancino 🧵17/14 goto.ucsd.edu/~john/ galois.com/team/santiago-…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

🎉We also want to thank the other agentic framework we feature and their help answering all my questions @lateinteraction @dilarafsoylu @kristahopsalong ! Guess…DSPy! 🧵16/14 dspy.ai

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

👏We want to acknowledge Trace as one of the awesome agentic frameworks we used! Invented by @allenainie @adith387 @chinganc_rl 🧵15/14 microsoft.github.io/Trace/

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

🌐 Find reviews and paper on OpenReview, feel free to drop us constructive criticism if you want! openreview.net/forum?id=rWkGF… Full final code, data etc will be released in September for ICLR. Enjoy the research preview! If you want to get the data set early DM me. 🧵13/14

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

🎯 From “looks right” ➜ mathematically verified. Visit our poster #ICML2025 West Ballroom C Fri 18 Jul 10:50 a.m. PDT — 12:20 p.m. PDT. Thanks @sanmikoyejo @stai_research @zhankezhou @allenainie @kaifronsdal @westonkirk_ @ObbadElyas @YingLi1839269 @dilarafsoylu Leni Aniva…

Sanmi Koyejo Retweeted

Brando Miranda@BrandoHablando · Jul 18

💡 Amazing related work! But how we differ: CLEVER (proof puzzles), FVApps (breadth), DafnyBench (different prover). VeriBench = end-to-end code → tests → proofs plus security stakes. 🧵12/14
