Maximilian Mozes
@maximilianmozes
Senior Research Scientist @cohere. PhD @UCL/@ucl_nlp. Previously: @GoogleAI/@SpotifyResearch. He/Him.
LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
The Multilingual Team at @cohere is hiring! If this sounds like you, please apply: - strong coding skills and a keen eye for detail - experience working with the challenges & joys of multilingual data Help us bring AI to the world! 🌏🌍🌎 jobs.ashbyhq.com/cohere/a87be94…
We’re looking for a Research Engineer / Scientist with a focus on Data Analysis and Evaluation to join the post-training team at Cohere! More details and application here: jobs.ashbyhq.com/cohere/6170371… Feel free to reach out if you'd like to know more!
🚨 Wait, adding simple markers 📌during training unlocks outsized gains at inference time?! 🤔 🚨 Thrilled to share our latest work at @Cohere_Labs: “Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers” that explores this phenomenon! Details in 🧵 ⤵️
We’re proud to partner with the governments of Canada and the UK to accelerate adoption of secure AI solutions in the public sector. Today, our CEO and co-founder @AidanGomez met with @CanadianPM and @10DowningStreet to discuss the strategic importance of AI for national…
Another paper coming out of @LisaAlazraki’s internship with us at @cohere! 🎉 Go check it out! 👀
Thrilled to share our new preprint on Reinforcement Learning for Reverse Engineering (RLRE) 🚀 We demonstrate that human preferences can be reverse engineered effectively by pipelining LLMs to optimise upstream preambles via reinforcement learning 🧵⬇️
Can LLMs be incentivised to generate token sequences (in this case preambles) that condition downstream models to improve performance when judged by reward models? Yes! ✅
We froze an LLM ❄️, trained a system-prompt generator with RL to condition it, and got pretty cool results! This new work by @LisaAlazraki demonstrates that optimizing the system prompt alone can enhance downstream performance without updating the original model.
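The idea above can be sketched as a toy REINFORCE loop: a small upstream policy picks a "preamble" token, a frozen downstream model scores it, and only the upstream policy is updated. Everything here is an invented stand-in for illustration — `frozen_model_reward` replaces the frozen LLM plus reward model, and the 8-token preamble space replaces real generated text; the actual RLRE setup pipelines full LLMs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the frozen downstream model + reward model: it returns
# reward 1 only when conditioned with preamble token 3. (Invented here;
# the real setup scores full model outputs with a learned reward model.)
def frozen_model_reward(preamble_token):
    return 1.0 if preamble_token == 3 else 0.0

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Upstream preamble "generator": a categorical policy over 8 candidate
# preamble tokens, trained with REINFORCE while the downstream model
# stays frozen (its parameters are never touched).
logits = np.zeros(8)
lr, baseline = 0.5, 0.0
for _ in range(2000):
    probs = softmax(logits)
    tok = rng.choice(8, p=probs)          # sample a preamble
    r = frozen_model_reward(tok)          # score it downstream
    grad_logp = -probs                    # d/d logits of log pi(tok)
    grad_logp[tok] += 1.0
    logits += lr * (r - baseline) * grad_logp
    baseline = 0.9 * baseline + 0.1 * r   # running-mean baseline

best = int(np.argmax(logits))             # policy converges on the best preamble
```

The key design point mirrors the tweet: the gradient only ever flows through the preamble policy, so downstream behaviour improves with the original model left untouched.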
How do we build multimodal systems that work effectively across the globe? 🌍 Today we release the Aya Vision Technical Report, the detailed recipe behind Aya Vision models, unifying state-of-the-art multilingual capabilities in multimodal and text tasks across 23 languages!
It is critical for scientific integrity that we trust our measure of progress. The @lmarena_ai has become the go-to evaluation for AI progress. Our release today demonstrates the difficulty in maintaining fair evaluations on @lmarena_ai, despite best intentions.
Sparse attention is one of the most promising strategies to unlock long-context processing and long generation reasoning in LLMs. We performed the most comprehensive study on training-free sparse attention to date. Here is what we found:
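"Training-free" here means restricting which keys each query may attend to at inference time, with no retraining. A minimal sketch of one such pattern — a causal sliding window applied as an additive mask on top of standard attention (the window size and toy dimensions are illustrative, not from the study):

```python
import numpy as np

rng = np.random.default_rng(0)

def attention(q, k, v, mask):
    # Standard scaled dot-product attention; `mask` is additive,
    # with -inf at positions the query is not allowed to attend to.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores + mask
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

def sliding_window_mask(n, window):
    # Each query attends only to the `window` most recent keys (causal).
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    allowed = (j <= i) & (j > i - window)
    return np.where(allowed, 0.0, -np.inf)

n, d = 16, 8
q, k, v = rng.normal(size=(3, n, d))
dense = attention(q, k, v, sliding_window_mask(n, n))   # full causal attention
sparse = attention(q, k, v, sliding_window_mask(n, 4))  # window of 4 keys
```

The appeal is that the score matrix no longer needs to be materialised in full: with a fixed window, per-token cost is O(window) instead of O(n), which is what unlocks long contexts.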
Very proud of this work, which is being presented at @iclr_conf later today. While I won't be there, catch up with @viraataryabumi and @ahmetustun89, who are both fantastic and can share more about our work at both @Cohere_Labs and @cohere. 🔥✨
In our latest work, we ask “what is the impact of code data used in pre-training on non-code tasks?” Work w @viraataryabumi, @yixuan_su, @rayhascode, @adrien_morisot, @1vnzh, @acyr_l, @mziizm, @ahmetustun89 @sarahookr 📜 arxiv.org/abs/2408.10914
Attending #ICLR2025 and interested in #LLM, #Alignment, or #SelfImprovement? Then come by and check out our work from @cohere: "Self-Improving Robust Preference Optimization" - a new alignment method that unlocks self-refinement in LLMs! 📍 Poster Session 4 — Friday, 3–5:30 PM
🚀 Presenting at #ICLR2025: Self-Improving Robust Preference Optimization (SRPO), the alignment method powering @cohere's flagship models Command A & Command R7B 🔥 SRPO learns to refine itself, outperforming existing DPO, IPO, SLiC & RPO across the board. 📍Poster session 4…
𝐂𝐨𝐡𝐞𝐫𝐞 𝐄𝐦𝐛𝐞𝐝 𝐯𝟒 - 𝐒𝐭𝐚𝐭𝐞-𝐨𝐟-𝐭𝐡𝐞-𝐚𝐫𝐭 𝐭𝐞𝐱𝐭 & 𝐢𝐦𝐚𝐠𝐞 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 Today we are releasing Embed v4, unlocking so many cool new features for retrieval. 🇺🇳 100+ languages 🖼️ Text & Image capabilities 📜 128k context length
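At its core, embedding-based retrieval ranks documents by similarity between vectors. A minimal sketch of that last step — the toy 3-dimensional vectors below are invented stand-ins for model outputs (a real Embed v4 vector is far higher-dimensional, and the API call that produces it is omitted here):

```python
import numpy as np

# Invented stand-ins for document embeddings; in practice these would come
# from an embedding model such as Embed v4.
doc_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
])
query_embedding = np.array([0.85, 0.15, 0.05])

def cosine_top_k(query, docs, k=2):
    # Normalise, then rank documents by cosine similarity to the query.
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    q_n = query / np.linalg.norm(query)
    sims = docs_n @ q_n
    return np.argsort(-sims)[:k]

top = cosine_top_k(query_embedding, doc_embeddings)
```

With text and images mapped into the same vector space, the identical ranking step serves multimodal retrieval — only the embedding call changes.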
🎉👏
I’m excited to share @Cohere’s newest model, Embed 4! Embed 4 is the optimal search engine for secure enterprise AI assistants and agents.
Today we are releasing Embed 4 – the new SOTA foundation for agentic enterprise search and retrieval applications! cohere.com/blog/embed-4 Check out the blog for similarly visually satisfying graphs :)
Announcing the keynote speakers for #ICLR2025! Speakers will cover topics ranging from foundational advances in language models to AI safety, open-ended learning, and the nature of intelligence itself. blog.iclr.cc/2025/04/11/ann…
Our Command A tech report is online! Go check it out!
We’re redefining what’s possible with AI. With the release of our latest model, Command A, optimized for real-world agentic and multilingual tasks, we’re demonstrating our commitment to bringing enterprises AI that goes beyond the ordinary, and offers security & efficiency.…
I'm excited to share the tech report for our @Cohere @CohereForAI Command A and Command R7B models. We highlight our novel approach to model training, including the use of self-refinement algorithms and model merging techniques at scale. Command A is an efficient, agent-optimised…
My team recently launched a best-in-class LLM specializing in English and Arabic. We just published a tech report explaining our methods. Check it out on arxiv: arxiv.org/abs/2503.14603