Julian Salazar
@JulianSlzr
Senior Research Scientist at @GoogleDeepMind; speech & audio LLMs for Gemini, Project Astra, Lyria RealTime; past lives in #NLProc, #SpeechProc, pure math.
Excited to share Long-Form Speech Generation with Spoken LMs at #ICML2025 (Wed. oral)! We’ll present:
- LibriSpeech-Long: new benchmark and evals for long-form generation quality
- SpeechSSM: 1st *textless* spoken LMs for expressive *unbounded* speech
Listen and learn more: 🧵
This should be an extraction + thinking eval. 60/col × 4 col/pg. × 13 pages ≈ 3.1k contributors. ...it also says "over 3000" on pg.59.
Wall of shame, cont.:
- o4-mini-high: 711
- DeepThink R1: 900 (truncated at pg.54)
- Claude 4 Opus: "will exceed the length limit of this chat"
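For the record, the back-of-the-envelope count above checks out; a trivial sanity check (the per-column, per-page, and page-count figures are taken straight from the tweet):

```python
# Back-of-the-envelope check of the contributor-count estimate:
# 60 names per column x 4 columns per page x 13 pages of contributor lists.
names_per_col = 60
cols_per_page = 4
pages = 13

estimate = names_per_col * cols_per_page * pages
print(estimate)         # 3120, i.e. ~3.1k
print(estimate > 3000)  # consistent with the paper's "over 3000"
```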
Gemini 2.5 paper TL;DR. Technical part in thread.
Contributors: ~1k
- 2.5 Pro timed out counting after 600s
- 2.5 Flash counts 1228 in 60s
- o3 counts 919 "after dedup" in 4m9s
No grouping or "leads", just one long list. I guess too much infighting or poaching from this in the past?
We're hiring @GoogleDeepMind APAC! 🇯🇵🇮🇳 Seeking experts in multilingual, multicultural & multimodal AI to help shape the future of our frontier models including Gemini. This is a unique opportunity to make an impact on billions of users.
@GoogleDeepMind India 🇮🇳 & Japan 🇯🇵 are looking for strong candidates in multilingual, multicultural, & multimodal areas.
RS Bangalore: job-boards.greenhouse.io/deepmind/jobs/…
RS Tokyo: job-boards.greenhouse.io/deepmind/jobs/…
RE Tokyo: job-boards.greenhouse.io/deepmind/jobs/…
We are hiring on the Veo team!📽️ Some people asked me about this at #ICML2025. If that's you, I will have told you to check deepmind.google/careers/ regularly. 👀It's just been updated:
Europe (London, Zurich): job-boards.greenhouse.io/deepmind/jobs/…
US (Mountain View): job-boards.greenhouse.io/deepmind/jobs/…
Want to be part of a team redefining SOTA for generative video models? Excited about building models that can reach billions of users? The Veo team is hiring! We are looking for amazing researchers and engineers, in North America and Europe. Details below:
Good to see research tackling real-time thoughts for streaming speech! Will be essential to closing the LM-SLM gap.
1/7 🔗 Introducing STITCH: our new method to make Spoken Language Models (SLMs) think and talk at the same time. Paper link 👉 arxiv.org/abs/2507.15375
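To make the "think and talk at the same time" idea concrete, here is a toy sketch of one way a spoken LM could interleave silent reasoning chunks with spoken output at decode time. This is only an illustration under my own assumptions (the `ToySpokenLM` class, its chunk methods, and the chunk size are all made up); it is not STITCH's actual design, which is described in the linked paper.

```python
# Toy sketch (NOT the STITCH implementation): alternate short "unspoken
# reasoning" chunks with "spoken" chunks so that reasoning accumulates in the
# model's context while only the speech chunks are streamed to the listener.
from dataclasses import dataclass, field

@dataclass
class ToySpokenLM:
    history: list = field(default_factory=list)  # shared context of all chunks

    def reason_chunk(self, k: int) -> list:
        # stand-in for k silent reasoning tokens conditioned on history
        return [f"<think_{len(self.history) + i}>" for i in range(k)]

    def speech_chunk(self, k: int) -> list:
        # stand-in for k speech tokens conditioned on history
        return [f"<speech_{len(self.history) + i}>" for i in range(k)]

def generate(model: ToySpokenLM, n_chunks: int, k: int = 4) -> list:
    stream = []
    for _ in range(n_chunks):
        # silent reasoning enters the context but not the audio stream
        model.history += model.reason_chunk(k)
        # spoken tokens enter both the context and the outgoing stream
        spoken = model.speech_chunk(k)
        model.history += spoken
        stream += spoken
    return stream

print(generate(ToySpokenLM(), n_chunks=2))
```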
🚨 Olympiad math + AI: We ran Google’s Gemini 2.5 Pro on the fresh IMO 2025 problems. With careful prompting and pipeline design, it solved 5 out of 6 — remarkable for tasks demanding deep insight and creativity. The model could win gold! 🥇 #AI #Math #LLMs #IMO2025
✅ same constraints as human competitors (time limits, no tools, natural language only)
✅ actually graded by the IMO (no risk of -1 deductions that’d cost others gold)
✅ will actually be available (in Gemini AI Ultra subscription)
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
just finished (threatening LLMs into) porting this extremely cool jax library to Pytorch! i actually wrote nothing but i have faith in unit tests by Googlers, as well as @cursor_ai and Sonnet 4 @AnthropicAI. @rustyryan @JulianSlzr #icml2025 github.com/keunwoochoi/se…
Today, Julian Salazar presents SequenceLayers, a framework-agnostic layer API & open-source library for creating robust, streamable sequence models. Stop by the #ICML2025 Google booth at 3pm to learn about its design principles & JAX implementation. github.com/google/sequenc…
This is today at #ICML25. Come say hi!
- Talk: 3:45pm-4pm in West Ballroom C (icml.cc/virtual/2025/o…)
- Poster: 4:30pm-7pm in West Exhibit Hall B, #W-417 (icml.cc/virtual/2025/p…)
I have the scars to prove it.
If you know the pain of sequence training<->inference mismatches, visit the #ICML25 Google booth 3-3:30 today (Tue.)! SequenceLayers is not just a layer library, but a new approach to defining and composing sequence architectures. Try it in Colab: colab.research.google.com/github/google/… See our report:
📢Tired of common pitfalls in sequence modeling (padding, causality, teacher-forcing vs. sampling mismatches in AR models, receptive field calculations, handling ragged sequences, stacking different architecture types while tracking invalid steps, etc.)? Meet SequenceLayers!🔥🚀
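To make the training<->inference pain concrete, here is a minimal, self-contained sketch in plain NumPy (deliberately not the SequenceLayers API) of the property a streamable layer has to satisfy: running it over the whole sequence at once and running it step-by-step with carried state must give identical outputs. The function names and the causal-convolution example are my own illustration.

```python
# Minimal sketch of layer-wise vs. step-wise parity for a causal 1-D conv.
# The streaming (step-wise) path carries the last k-1 inputs as state and must
# reproduce the full-sequence (layer-wise) result exactly.
import numpy as np

def causal_conv_full(x, w):
    """Layer-wise: convolve the whole sequence at once with left padding."""
    k = len(w)
    padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([padded[t:t + k] @ w for t in range(len(x))])

def causal_conv_step(x_t, state, w):
    """Step-wise: one timestep at a time, carrying the previous k-1 inputs."""
    buf = np.concatenate([state, [x_t]])
    y_t = buf @ w
    return y_t, buf[1:]  # new state drops the oldest input

w = np.array([0.5, -1.0, 2.0])  # kernel of size k=3
x = np.random.randn(8)

y_full = causal_conv_full(x, w)

state = np.zeros(len(w) - 1)
y_stream = []
for x_t in x:
    y_t, state = causal_conv_step(x_t, state, w)
    y_stream.append(y_t)

assert np.allclose(y_full, np.array(y_stream))  # layer-wise == step-wise
```

Keeping this equivalence intact across convolutions, attention, padding, ragged batches, and stacked heterogeneous layers is exactly the class of pitfall the tweets above describe; the real library and report linked there show how it is handled at the API level.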