Gallil Maimon
@GallilMaimon
Research Scientist intern @ Meta (FAIR); PhD student @CseHuji; Speech Language Modelling
Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends? In our new paper, we train several dozen SLMs, and show - quite a lot! So there is room for optimism 😊 Key insights, code, models, full paper 👇🏻

Discrete Audio Tokens: More Than a Survey! arxiv.org/abs/2506.10274
🚀 We're excited to announce our latest work: "Discrete Audio Tokens: More Than a Survey!" It presents a comprehensive survey and benchmark of audio tokenizers across speech, music, and general audio. preprint: arxiv.org/pdf/2506.10274 website: poonehmousavi.github.io/dates-website/
"Discrete Audio Tokens: More Than a Survey!" Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch… ift.tt/GA4ZC6u
StressTest: Can YOUR Speech LM Handle the Stress? arxiv.org/abs/2505.22765
Super cool work I got to take part in, led by @iddoyosha. Can SpeechLMs *really* understand the implied meaning of stressed words? Check it out! 👇🤙
1/5 🚨 New paper alert! StressTest: Can YOUR Speech LM Handle the Stress? Sentence stress = emphasis on words to signal intent, contrast, or new info. We built StressTest — a benchmark for testing stress reasoning.🗣️💬 Then, meet StresSLM who finally gets it! Insights & Links 👇
"StressTest: Can YOUR Speech LM Handle the Stress?" Iddo Yosha, Gallil Maimon, Yossi Adi. ift.tt/NF2ZI8v
🚨 Happy to share our #Interspeech2025 paper! "WhiStress: Enriching Transcriptions with Sentence Stress Detection" Sentence stress is a word-level prosodic cue that marks contrast or intent. WhiStress detects it alongside transcription—no alignment needed. Paper, code, demo 👇
The longer a reasoning LLM thinks, the more likely it is to be correct, right? Apparently not. Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”. Link: arxiv.org/abs/2505.17813 1/n
✈️Heading to Hyderabad for #ICASSP2025 @ieeeICASSP Reach out if you want to talk about SpeechLMs, or for a beer ;) I'll be presenting our work on evaluating acoustic modelling in SpeechLMs on Tuesday at 6pm, and Monday 4pm at #SALMA workshop

Scaling Analysis of Interleaved Speech-Text Language Models. arxiv.org/abs/2504.02398