Zeming Chen
@eric_zemingchen
PhD Candidate, NLP Lab @EPFL; Research Scientist Intern @AIatMeta; Ex @AIatMeta (FAIR) @allen_ai #AI #ML #NLP
We present MEDITRON, a pair of new open-access #LLMs (70B & 7B) adapted to the medical domain. They achieve new SoTA open-source performance on common medical benchmarks, outperforming #GPT-3.5 and Med-PaLM and coming within 5% of #GPT4. Find out how we did this ⬇️

🤔 Have @OpenAI o3, Gemini 2.5, Claude 3.7 formed an internal world model to understand the physical world, or just align pixels with words? We introduce WM-ABench, the first systematic evaluation of VLMs as world models. Using a cognitively-inspired framework, we test 15 SOTA…
🚨New Preprint!! Thrilled to share with you our latest work: “Mixture of Cognitive Reasoners”, a modular transformer architecture inspired by the brain’s functional networks: language, logic, social reasoning, and world knowledge. 1/ 🧵👇
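The thread's one-line description (specialized expert modules composed like the brain's functional networks) suggests something like the toy layer below: four small expert blocks, one per network, mixed per token by a learned router. This is a minimal sketch under my own assumptions about shapes and routing; every name here is illustrative, not the paper's actual architecture.

```python
# Toy sketch of a brain-inspired modular layer: four expert blocks
# ("language", "logic", "social", "world") mixed by a learned router.
# Purely illustrative assumptions; not the paper's actual architecture.
import torch
import torch.nn as nn

class CognitiveReasonerLayer(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.experts = nn.ModuleDict({
            name: nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for name in ["language", "logic", "social", "world"]
        })
        self.router = nn.Linear(d_model, len(self.experts))  # per-token weights

    def forward(self, x):  # x: (batch, seq, d_model)
        weights = torch.softmax(self.router(x), dim=-1)                     # (B, S, 4)
        outs = torch.stack([e(x) for e in self.experts.values()], dim=-1)  # (B, S, D, 4)
        return (outs * weights.unsqueeze(2)).sum(-1)                        # weighted mix

layer = CognitiveReasonerLayer()
print(layer(torch.randn(2, 16, 256)).shape)  # torch.Size([2, 16, 256])
```

In this toy version, the router's softmax weights decide how much each "cognitive network" contributes to every token's representation.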
If you’re at @iclr_conf this week, come check out our spotlight poster INCLUDE during the Thursday 3:00–5:30pm session! I will be there to chat about all things multilingual & multicultural evaluation. Feel free to reach out anytime during the conference. I’d love to connect!
🚀 Introducing INCLUDE 🌍: A multilingual LLM evaluation benchmark spanning 44 languages! Contains *newly-collected* data, prioritizing *regional knowledge*. Setting the stage for truly global AI evaluation. Ready to see how your model measures up? #AI #Multilingual #LLM #NLProc
NEW PAPER ALERT: Generating visual narratives to illustrate textual stories remains an open challenge: models lack the knowledge needed to constrain faithful, self-consistent generation. Our #CVPR2025 paper proposes a new benchmark, VinaBench, to address this challenge.
🚨 New Preprint!! LLMs trained on next-word prediction (NWP) show high alignment with brain recordings. But what drives this alignment—linguistic structure or world knowledge? And how does this alignment evolve during training? Our new paper explores these questions. 👇🧵
🚨 New Paper! Can neuroscience localizers uncover brain-like functional specializations in LLMs? 🧠🤖 Yes! We analyzed 18 LLMs and found units mirroring the brain's language, theory of mind, and multiple demand networks! w/ @GretaTuckute, @ABosselut, & @martin_schrimpf 🧵👇
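For intuition, here is a hedged sketch of how a functional localizer can be ported from neuroscience to an LLM: present a contrast (e.g., sentences vs. matched non-word strings), record each unit's activation per item, and keep the units with the largest sentence-over-nonword t-statistic. The model (gpt2), stimuli, layer, and top-10 cutoff are all my illustrative assumptions, not the paper's setup.

```python
# Hedged sketch of a "language localizer" applied to LLM units.
# Model, stimuli, layer, and cutoff are assumptions for illustration only.
import numpy as np
import torch
from scipy import stats
from transformers import AutoModel, AutoTokenizer

MODEL = "gpt2"  # small model, just for illustration
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True)

sentences = ["The chef tasted the soup before serving it.",
             "A storm delayed every flight out of Geneva."]
nonwords = ["Blick frop midge taber lune vask pringle dort.",
            "Snerp walith bruno clast fimble rask odune yelt."]

@torch.no_grad()
def unit_activations(texts, layer=6):
    """Mean-pool each text's hidden states at one layer -> (n_texts, n_units)."""
    rows = []
    for t in texts:
        h = model(**tok(t, return_tensors="pt")).hidden_states[layer][0]
        rows.append(h.mean(dim=0).numpy())
    return np.stack(rows)

sent_act = unit_activations(sentences)
nw_act = unit_activations(nonwords)

# Per-unit contrast: which units respond more to sentences than non-words?
t_vals, _ = stats.ttest_ind(sent_act, nw_act, axis=0)
language_units = np.argsort(t_vals)[-10:]  # top-10 "language-selective" units
print("candidate language-selective units:", language_units)
```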
📘 Could ChatGPT get an engineering degree? Spoiler: yes! In our new @PNASNews article, we explore how AI assistants like GPT-4 perform in STEM university courses; on average, they pass a staggering 91.7% of core courses. 🧵 #AI #HigherEd #STEM #LLMs #NLProc
🚨 New Paper!! How can we train LLMs using 100M words? In our @babyLMchallenge paper, we introduce a new self-synthesis training recipe to tackle this question! 🍼💻 This was a fun project co-led by me, @yingtian80536, @akgokce0, w/ @HannesMehrer & @martin_schrimpf 🧵⬇️
Alignment is necessary for LLMs, but do we need to train aligned versions for all model sizes in every model family? 🧐 We introduce 🚀Nudging, a training-free approach that aligns any base model by injecting a few nudging tokens at inference time. 🌐fywalter.github.io/nudging/…
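As I read the tweet, the core mechanic can be sketched in a few lines: decode with the base model, and whenever its next-token confidence drops below a threshold, splice in ("nudge") a token from a small aligned model instead. Greedy decoding, the 0.4 threshold, and the model pair below are my assumptions, not the paper's exact recipe; see the project page for the real method.

```python
# Hedged sketch of inference-time "nudging": NOT the paper's exact algorithm.
# Assumptions (mine, not the authors'): greedy decoding, a fixed confidence
# threshold, and this particular base/aligned model pair.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-2-7b-hf"          # unaligned base model (assumed choice)
ALIGNED = "meta-llama/Llama-2-7b-chat-hf"  # small aligned model (assumed choice)

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained(BASE)  # both checkpoints share a tokenizer
base = AutoModelForCausalLM.from_pretrained(BASE).to(device)
aligned = AutoModelForCausalLM.from_pretrained(ALIGNED).to(device)

@torch.no_grad()
def nudged_generate(prompt: str, max_new_tokens: int = 128,
                    threshold: float = 0.4) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new_tokens):
        probs = torch.softmax(base(ids).logits[0, -1], dim=-1)
        top_p, top_id = probs.max(dim=-1)
        if top_p < threshold:
            # Base model is uncertain: inject the aligned model's token instead.
            top_id = aligned(ids).logits[0, -1].argmax()
        if top_id.item() == tok.eos_token_id:
            break
        ids = torch.cat([ids, top_id.view(1, 1)], dim=-1)
    return tok.decode(ids[0], skip_special_tokens=True)

print(nudged_generate("How do I politely decline a meeting?"))
```

The appeal of a design like this is that alignment behavior transfers at inference time, so no aligned fine-tune is needed for each model size.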
Hey #NLProc folks, we had a lot of fun last year, so we're inviting guest lecturers again for our Topics in NLP course during the Fall 2024 semester at EPFL! More information here: t.ly/QMTCA Please share and RT!