Kelly Marchisio (St. Denis)
@cheeesio
Multilinguality Lead @cohere. Formerly: PhD @jhuclsp, Alexa Fellow @amazon, dev @Google, MPhil @cambridgenlp, EdM @hgse 🔑🔑¬🧀 (@kelvenmar20)
Code release from our superstar intern, @p_nawrot! • Write sparse attn patterns in 50 lines, not 5k • Compatibility with models supported by vLLM, support for TP • 6 SOTA baselines with optimized implementations + 9 eval tasks • Research-grade extensibility = rapid prototyping
We built sparse-frontier — a clean abstraction that lets you focus on your custom sparse attention implementation while automatically inheriting vLLM’s optimizations and model support. As a PhD student, I've learned that sometimes the bottleneck in research isn't ideas — it's…
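As a rough illustration of the kind of pattern you would express there (this is not the sparse-frontier API; its actual abstraction and vLLM/TP integration are in the released repo), a minimal "local window + sink tokens" sparse attention sketch in plain PyTorch might look like this:

```python
# Toy sketch of a sparse attention pattern (local sliding window + sink tokens).
# Names and shapes are illustrative only, not the sparse-frontier interface.
import torch

def local_plus_sink_mask(seq_len: int, window: int = 128, num_sinks: int = 4) -> torch.Tensor:
    """Boolean mask (True = attend) for causal sliding-window attention
    plus a few always-visible "sink" tokens at the start of the sequence."""
    q = torch.arange(seq_len).unsqueeze(1)   # query positions, shape (seq, 1)
    k = torch.arange(seq_len).unsqueeze(0)   # key positions, shape (1, seq)
    causal = k <= q                          # no attending to the future
    local = (q - k) < window                 # keys within the sliding window
    sink = k < num_sinks                     # globally visible "sink" keys
    return causal & (local | sink)

def sparse_attention(qry, key, val, window=128, num_sinks=4):
    """Naive masked attention; qry/key/val have shape (batch, heads, seq, dim)."""
    scores = qry @ key.transpose(-2, -1) / qry.size(-1) ** 0.5
    mask = local_plus_sink_mask(qry.size(-2), window, num_sinks).to(qry.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ val

if __name__ == "__main__":
    q = torch.randn(1, 8, 256, 64)
    out = sparse_attention(q, q.clone(), q.clone(), window=64, num_sinks=4)
    print(out.shape)  # torch.Size([1, 8, 256, 64])
```

The point of a framework like sparse-frontier is that a pattern of roughly this size plugs into an optimized inference stack instead of requiring a full custom kernel and serving path.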
@weiyinko_ml was one of the earliest members of our Open Science Community and an early collaborator on our open science research. We’re proud to have been part of Wei-Yin’s journey from community collaborator to colleague, and grateful he took an early bet on working with us 🚀
Excited to announce the call for papers for the Multilingual Representation Learning workshop #EMNLP2025 sigtyp.github.io/ws2025-mrl.html with @_dataman_ @linguist_cat Jiayi Wang @fdschmidt @tylerachang @hila_gonen and amazing speakers: Alice Oh, Kelly Marchisio, & Pontus Stenetorp
The call for papers is out for the 5th edition of the Workshop on Multilingual Representation Learning, which will take place in Suzhou, China, co-located with EMNLP 2025! See details below!
We're looking for a new member for the multilingual team with a focus on data engineering! Please apply at the link below:
The Multilingual Team at @cohere is hiring! If this sounds like you, please apply: - strong coding skills and a keen eye for detail - experience working with the challenges & joys of multilingual data Help us bring AI to the world! 🌏🌍🌎 jobs.ashbyhq.com/cohere/a87be94…
Make Command speak better & in more languages
We release a major improvement upon last year's Dynamic Memory Compression (DMC). DMS is better, easier, and faster to train. The future of long context is 1) KV cache compression + 2) sparse attention, both training-aware to avoid a training-inference mismatch. Imho, DMS is SOTA for 1).
🚀 By *learning* to compress the KV cache in Transformer LLMs, we can generate more tokens for the same compute budget. This unlocks *inference-time hyper-scaling* For the same runtime or memory load, we can boost LLM accuracy by pushing reasoning even further!
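Purely as a toy sketch of the idea behind learned KV-cache compression (a hypothetical helper, not the DMS method itself, which is trained end-to-end with the model): a small gate scores cached key/value pairs and evicts the lowest-scoring ones, so the same memory budget stretches over more generated tokens.

```python
# Toy illustration of learned KV-cache eviction; not the actual DMS algorithm.
import torch
import torch.nn as nn

class LearnedKVEvictor(nn.Module):
    """Scores each cached key/value pair with a tiny learned gate and keeps
    only the top fraction, shrinking the cache to a target budget."""
    def __init__(self, head_dim: int):
        super().__init__()
        self.scorer = nn.Linear(head_dim, 1)

    def forward(self, keys, values, keep_ratio: float = 0.25):
        # keys, values: (batch, heads, seq, dim)
        scores = self.scorer(keys).squeeze(-1)                 # (batch, heads, seq)
        k = max(1, int(keys.size(-2) * keep_ratio))
        idx = scores.topk(k, dim=-1).indices.sort(-1).values   # keep original order
        gather = idx.unsqueeze(-1).expand(-1, -1, -1, keys.size(-1))
        return keys.gather(2, gather), values.gather(2, gather)

if __name__ == "__main__":
    B, H, S, D = 1, 8, 1024, 128
    k, v = torch.randn(B, H, S, D), torch.randn(B, H, S, D)
    ck, cv = LearnedKVEvictor(D)(k, v, keep_ratio=0.25)
    print(ck.shape)  # torch.Size([1, 8, 256, 128]) -> 4x smaller cache
```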
Here are key recommendations to make AI safer & more equitable for everyone: 🌐 Incentivize the creation of open-access multilingual datasets 🪟 Encourage transparency in model language coverage 🔬 Prioritise resources towards multilingual research
Over 7000 languages are spoken worldwide 🌐, but AI safety efforts focus on only a fraction of them. Our latest paper draws on our multi-year efforts with the wider research community to explore why this matters and how we can bridge the AI language gap.
Tomorrow at 6pm CET I'm giving a talk about our latest work on Sparse Attention at @Cohere_Labs. I plan to describe the field as it is now, discuss our evaluation results, and share insights about what I believe is the future of Sparse Attention. See you!
Our ML Efficiency group is looking forward to welcoming @p_nawrot next week on May 28th, for a session on "The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs" Learn more: cohere.com/events/Cohere-…
Welcome, Ruochen! ✨
🌟 A little late, but I just started my internship @cohere, cooking up more multilingual things with the amazing @cheeesio and @SCahyawijaya. Will be in NYC for June and July, hmu! 🗽
I’m excited to see what you’ve built! 🚀
🧑‍⚖️ Our Expedition Aya judges are: @cheeesio, Multilinguality Lead, Cohere @max_nlp, Command Modelling Team Lead, Cohere @mziizm, Staff Research Scientist, Cohere Labs Let’s celebrate this collaborative research and look ahead to what’s next! Learn more: cohere.com/events/Cohere-…
Result of @robinson_n8’s internship on the Cohere multilingual team last year! Check it out!
Many LLMs struggle to produce Dialectal Arabic. As practitioners attempt to mitigate this, new evaluation methods are needed. We present AL-QASIDA (Analyzing LLM Quality + Accuracy Systematically In Dialectal Arabic), a comprehensive eval of LLM Dialectal Arabic proficiency (1/7)
This was fun! Excellent work led by @p_nawrot during his internship at @cohere
How does sparse attention reshape LLM scaling? 🔍 We’re excited to share this work by former @Cohere intern @p_nawrot, “The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs.”
The Sparse Frontier: Efficient sparse attention methods are key to scaling LLMs to long contexts. We conduct the largest-scale empirical analysis that answers: 1. 🤏🔍 Are small dense models or large sparse models better? 2. ♾️ What is the maximum permissible sparsity per task? 3.…
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs