Diana Abagyan
@dianaabagyan
Research Scholar @Cohere_Labs
🚨New pretraining paper on multilingual tokenizers 🚨 Super excited to share my work with @Cohere_Labs: One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers

I’m very excited to be co-organizing this @NeurIPSConf workshop on LLM evaluations! Evaluating LLMs is a complex and evolving challenge. With this workshop, we hope to bring together diverse perspectives to make real progress. See the details below:
We are happy to announce our @NeurIPSConf workshop on LLM evaluations! Mastering LLM evaluation is no longer optional -- it's fundamental to building reliable models. We'll tackle the field's most pressing evaluation challenges. For details: sites.google.com/corp/view/llm-…. 1/3
Prompt engineering places all the work on the end user to try to squeeze out performance. It's a hack to deal with the limitations in our models' adaptability. In the future, this should happen behind the scenes and be inferred automatically.
Prompts shouldn't have to be engineered. Our latest research marks another step towards fluid, natural language communication with LLMs.
🚨 New paper drop! 🚨 🤔 When a transformer sees a sequence that could be explained by many rules, which rule does it pick? It chooses the simplest sufficient one! 🧵👇
🚨 New preprint! 🚨 Phase transitions! We love to see them during LM training. Syntactic attention structure, induction heads, grokking; they seem to suggest the model has learned a discrete, interpretable concept. Unfortunately, they’re pretty rare—or are they?
Can we improve the performance of LLMs during inference without the need for extensive sampling OR special reward models? 🤔 Our latest work introduces a new inference time scaling recipe that is sample-efficient, multilingual, and suitable for multi-task requirements. 🍋
💪🏼Huge thanks to my incredible mentors: Julia Kreutzer, @mrdanieldsouza, @YeS855811, @sarahookr for guiding me and supporting this work ✨ Find our arXiv release here! 📜: arxiv.org/abs/2506.20544
🚀 Want better LLM performance without extra training or special reward models? Happy to share my work with @Cohere_Labs: "When Life Gives You Samples: Benefits of Scaling Inference Compute for Multilingual LLMs" 👀How we squeeze more from less at inference 🍋, details in 🧵
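The thread above is about scaling inference compute by sampling. As generic background (not the paper's actual recipe, which is in the thread), the simplest form of inference-time scaling is best-of-n sampling: draw several candidate completions and keep the highest-scoring one. The `generate` and `score` callables here are hypothetical placeholders.

```python
def best_of_n(generate, score, prompt, n=8):
    """Sample n candidate completions and return the highest-scoring one.

    `generate` and `score` are placeholders for any sampler and any
    quality heuristic (e.g. model log-probability or majority voting);
    the paper studies more sample-efficient selection strategies.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)
```

Spending more compute here means raising `n`; the paper's point is that the selection strategy determines how much benefit each extra sample buys.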
How can AI capture the nuances of different languages?💬🗨️ By using a team of specialized teacher models via Multilingual Arbitration, we've achieved up to 19.5% improvement in win rates across languages. Find us at ACL to discuss how we can further break down language barriers.
🤹 How do we move away from complicated and brittle prompt engineering at inference for under-represented tasks?🤔 🧠 Our latest work finds that optimizing training protocols improves controllability and boosts performance on underrepresented use cases at inference time 📈
Can we train models for better inference-time control instead of over-complex prompt engineering❓ Turns out the key is in the data — adding fine-grained markers boosts performance and enables flexible control at inference🎁 Huge congrats to @mrdanieldsouza for this great work
🚨 Wait, adding simple markers 📌during training unlocks outsized gains at inference time?! 🤔 🚨 Thrilled to share our latest work at @Cohere_Labs: “Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers“ that explores this phenomenon! Details in 🧵 ⤵️
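The tweets above describe attaching metadata markers to examples at training time so they can be used as control handles at inference. A minimal illustrative sketch of the idea (the marker names and format here are hypothetical, not the paper's actual schema):

```python
def add_markers(example: str, markers: dict) -> str:
    """Prepend training-time metadata markers to a training example.

    Marker keys/values (e.g. 'domain', 'language') are illustrative;
    the paper defines its own marker set. Sorting keys keeps the
    prefix deterministic across examples.
    """
    prefix = " ".join(f"<{k}:{v}>" for k, v in sorted(markers.items()))
    return f"{prefix} {example}"
```

At inference, supplying the same markers in the prompt then steers the model toward the targeted slice of the long tail, replacing brittle prompt engineering with an explicit control interface.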
Global MMLU is revolutionizing multilingual AI. 🌍 Recognized by Stanford HAI and adopted by top labs, it's the benchmark for fair evaluation across 42 languages. Looking forward to sharing this work at ACL in Vienna next month. 🇦🇹
Our paper M-RewardBench got accepted to ACL main: arxiv.org/abs/2410.15522 We construct a first-of-its-kind multilingual RM evaluation benchmark and use it to examine the performance of several reward models in non-English settings, along with other interesting insights.
amazing work!!!
Huge congrats to @dianaabagyan on her first first-author paper. It was a pleasure collaborating on this work — we ask what cheap interventions in pre-training can allow for more language plasticity downstream.
Excellent work by @dianaabagyan💎 We show that a "universal" tokenizer, covering more than just the primary languages, greatly boosts new-language adaptation without hurting pretraining performance 🚀 A critical study for multilingual LLMs given the huge cost of pretraining🔥
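One common way to see why tokenizer coverage matters (background intuition, not the paper's evaluation protocol) is tokenizer "fertility": the average number of tokens per word. A tokenizer with poor coverage of a language fragments its words heavily, inflating fertility; a universal tokenizer aims to keep it low across many languages. A toy sketch of the metric:

```python
def fertility(tokenize, texts):
    """Average tokens per whitespace-delimited word.

    `tokenize` is any callable mapping a string to a token list.
    High fertility on a language signals that the tokenizer
    fragments that language heavily.
    """
    n_words = sum(len(t.split()) for t in texts)
    n_tokens = sum(len(tokenize(t)) for t in texts)
    return n_tokens / n_words
```

For instance, a whitespace tokenizer has fertility 1.0 by construction, while a character-level tokenizer on the same text has much higher fertility — real subword tokenizers fall in between, and where they fall depends on language coverage.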