Saurabh Dash
@TheyCallMeMr_
Bottomless pit supervisor. ML @CohereAI, PhD Student @GeorgiaTech. Previously @Apple, @IITkgp. http://saurabhdash.com. Opinions expressed are my own
Went for a swim today after ages. Goal: 1km in 1hr. 3 laps in, I was dying. Gave up on the goal, decided to just enjoy the swim. Finished 1km in 50 mins. This keeps happening, with swimming, sudoku, ...life. Felt like a reminder from the universe: Goals give direction…
This I did not expect. Cool.
Perhaps the most important thing you can read about AI this year: “Welcome to the Era of Experience” This excellent paper from two senior DeepMind researchers argues that AI is entering a new phase—the "Era of Experience"—which follows the prior phases of simulation-based…
> Wait, say that again, you trained Qwen2.5 on random rewards? > Yes & it works better than verifiable rewards. Amazing right? > Hold on, let's say you have Qwen2.5: how many RL+LLM published paper results could be out there depending on it? > 100, maybe more
There needs to be a polymarket on how many Diffusion Language Model papers will be released this month.
I'm excited to share our new pre-print ShiQ: Bringing back Bellman to LLMs! arxiv.org/abs/2505.11081 In this work, we propose a new, Q-learning inspired RL algorithm for finetuning LLMs 🎉 (1/n)
Keep digging into the recipe behind our Aya Vision models by revisiting our tech-talk. This talk covers the model's architecture, training, multilingual evaluation, and real-world applications, addressing challenges in current evaluation methods.
Lightning talks ⚡️from the core technical team, featuring @ahmetustun89 @TheyCallMeMr_ @YiyangNan @aahmadian_ @johnamqdang and @singhshiviii. youtube.com/watch?v=Qp-JBY…
mogged. try today at +1 (431) 302‑8498 on whatsapp
🚨 Preprint Alert: As promised, announcing the Aya Vision Technical Report – detailing the recipe to build SOTA multilingual multimodal models.
Our new Aya Vision report is out! 🚀 Big congratulations to @TheyCallMeMr_, @YiyangNan, and the fantastic team at @Cohere_Labs @cohere! 🎉 📄 Technical report: arxiv.org/abs/2505.08751 🛠️ Models and datasets: huggingface.co/collections/Co…
How do we build multimodal systems that work effectively across the globe? 🌍 Today we release the Aya Vision Technical Report, the detailed recipe behind Aya Vision models, unifying state-of-the-art multilingual capabilities in multimodal and text tasks across 23 languages!
Today we share the technical report that goes with the Aya Vision open-weight release. 🌎🌍🌏 Huge congrats to everyone involved, special congrats to @TheyCallMeMr_ and @YiyangNan, the joint first authors, and to @mgalle, @beyzaermis, and @ahmetustun89, the senior leads. 🎉✨