Torsten Scholak
@tscholak
Lead Research Scientist, Foundation Models Lab @ServiceNowRSRCH. Opinions are not those of my employer.
🚨🤯 Today Jensen Huang announced SLAM Lab's newest model on the @HelloKnowledge stage: Apriel-Nemotron-15B-Thinker 🚨 A lean, mean reasoning machine punching way above its weight class 👊 Built by SLAM × NVIDIA. Smaller models, bigger impact. 🧵👇
As #ICML2025 kicks off in Vancouver, our AI talent is being quietly pushed out. 🇨🇦 We've been waiting 28 months for permanent residency, but @CitImmCanada won't budge. Please read and share our story: facebook.com/share/p/1AwU2f… linkedin.com/posts/gbxhuang… #IRCC #AI #Immigration
Nice release! Worth noting that MoE x Mamba gives coverage, not multiplicative speed-ups:
* small batch: expert sparsity keeps latency low
* medium-to-large batch: Mamba's KV-free scan scales while attention would choke
Net: below dense latency across the board, but no compounding.
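A rough back-of-envelope sketch of that regime argument, treating decode as roughly memory-bound. All constants below are made up for illustration (nothing here comes from the release); the point is which term dominates in each batch regime, not the absolute numbers.

```python
# Toy latency model: bytes moved per decode step as a proxy for step time.
# Dense transformer: read all weights plus a KV cache that grows with batch * ctx.
# MoE x Mamba hybrid: read only the routed experts' weights plus a small
# per-sequence recurrent state that does not grow with context length.
GB = 1e9

def dense_step_bytes(batch, ctx):
    weights = 30 * GB          # hypothetical dense weight bytes read each step
    kv_per_token = 0.5e6       # hypothetical KV-cache bytes per cached token
    return weights + batch * ctx * kv_per_token

def moe_mamba_step_bytes(batch, ctx):
    active_weights = 8 * GB    # only the active experts are touched (sparsity win)
    state_per_seq = 20e6       # Mamba state per sequence; ctx is irrelevant here
    return active_weights + batch * state_per_seq

ctx = 8192
for batch in (1, 8, 64, 512):
    ratio = dense_step_bytes(batch, ctx) / moe_mamba_step_bytes(batch, ctx)
    print(f"batch={batch:4d}  dense/hybrid ≈ {ratio:5.1f}x")
```

At batch 1 the gap is essentially the sparsity ratio; at large batch it is driven by the missing KV cache. Each mechanism covers one regime rather than multiplying together, which is the "coverage, not compounding" point.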
Crazy that we now have an open-source model with 13B params that's competitive with o1. And the Mamba layers bring much higher inference throughput.
Thanks @_akhaliq for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! arxiv.org/abs/2505.20793 More details on…
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
RLRF significantly outperforms supervised fine-tuning, addressing common failure modes and enabling precise, high-quality SVG generation with strong structural understanding and generalization.
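To make the rendering-feedback idea concrete, here's a toy sketch (not the paper's implementation): the generated SVG is rasterized and scored against the target image, and markup that fails to render is penalized. `rasterize` and `image_similarity` below are trivial stand-ins for a real rasterizer and a real perceptual metric.

```python
# Toy reward from rendering feedback: render the SVG, compare to the target.
def rasterize(svg_code: str) -> list[float]:
    # stand-in for a real rasterizer; returns a fake 16-"pixel" image
    if "<svg" not in svg_code:
        raise ValueError("invalid SVG")
    return [float(len(svg_code) % 7)] * 16

def image_similarity(a: list[float], b: list[float]) -> float:
    # stand-in for a perceptual metric: negative mean squared error
    return -sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rendering_reward(svg_code: str, target_image: list[float]) -> float:
    try:
        return image_similarity(rasterize(svg_code), target_image)
    except ValueError:
        return -1.0  # SVG that does not render gets a flat penalty

target = [3.0] * 16
for candidate in ["<svg><circle r='5'/></svg>", "not svg at all"]:
    print(candidate[:20], "->", rendering_reward(candidate, target))
```

In an RL loop this kind of reward would replace token-level supervision, which is what lets the policy learn from how the SVG looks rather than how it is written.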
🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datasets. Test your agents on them now! 🚀 🤗 Dataset: huggingface.co/datasets/Servi… #ICML2025 #AI #DatasetRelease #Agents
🚀 Super excited to announce UI-Vision: the largest and most diverse benchmark for evaluating agents on real-world desktop GUIs in offline settings.
📄 Paper: arxiv.org/abs/2503.15661
🌐 Website: uivision.github.io
🧵 Key takeaways 👇
🚀 Announced at #Knowledge25: @ServiceNow & @nvidia introduce Apriel Nemotron 15B. Apriel Nemotron 15B is a compact, cost-efficient AI model built with NVIDIA NeMo and trained on NVIDIA DGX Cloud. It delivers real-time responses, handles complex workflows, and scales across IT,…
Together with @NVIDIA, we're launching a new class of intelligent AI agents. Our Apriel Nemotron 15B model, co-developed with NVIDIA, offers lower latency, reduced inference costs, and faster agentic AI. This partnership also brings a joint data flywheel architecture powered by…
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: github.com/ServiceNow/Pip… Blog: huggingface.co/blog/ServiceNo…
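A minimal sketch of the in-flight update idea, assuming nothing about PipelineRL's actual internals: one thread stands in for the trainer publishing new weight versions, and the generator swaps to the latest version between decode steps instead of draining its batch first. All names and timings here are illustrative.

```python
import queue
import threading
import time

# trainer -> generator channel for "new weights are ready" notifications
weight_updates: "queue.Queue[int]" = queue.Queue()

def trainer(num_versions: int = 3) -> None:
    for version in range(1, num_versions + 1):
        time.sleep(0.05)             # pretend to run an optimizer step
        weight_updates.put(version)  # publish the new weight version

def generator(tokens_per_seq: int = 10) -> None:
    version = 0
    for step in range(tokens_per_seq):
        try:
            # hot-swap to the newest weights without stopping generation
            version = weight_updates.get_nowait()
            print(f"step {step}: switched to weights v{version}")
        except queue.Empty:
            pass
        time.sleep(0.02)             # pretend to decode one token for the batch
    print(f"sequence finished on weights v{version}")

t = threading.Thread(target=trainer)
g = threading.Thread(target=generator)
t.start(); g.start()
t.join(); g.join()
```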
AI folks at ServiceNow have been cooking. And they cooked a very delicious small 5B parameter cookie!
🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨 Speed ⚡ + Accuracy 📈 + Efficiency 💸 This model punches above its weight, beating bigger LLMs while training on a fraction of the compute. Built with Fast-LLM, our in-house training stack. 🧵👇
There aren't that many newcomers in the SLM space, and this one looks very interesting. MIT-licensed base models and a new open-source pretraining library with apparently significant efficiency gains.
And we did it with just:
🖥️ 480 x H100s
⏱️ ~91,000 H100-hours
🧮 4.8B params, bfloat16
💸 2.3x fewer GPU hours than OLMo-2-7B
Thanks to Fast-LLM, github.com/ServiceNow/Fas…, our custom training stack for speed and scale. No hacks. Just better infra.
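Back of the envelope, assuming all 480 GPUs were busy for the whole run (the tweet doesn't state wall-clock time, so this is just a sanity check):

```python
gpu_hours = 91_000
gpus = 480
print(f"≈ {gpu_hours / gpus / 24:.1f} days of wall-clock training")  # ≈ 7.9 days
```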
Showing off Apriel-5B 🚀, an efficient and effective compact model. Congrats to the whole SLAM team led by @tscholak @carnaticfiddle 👏
Some new work from the team!
Exciting release from ServiceNow Research: introducing Apriel-5B, a compact, open-source language model that targets the sweet spot between capability and efficiency.
This new, speedy, and efficient language model arose from a fruitful collaboration between two teams at ServiceNow! Pretrained with @ServiceNowRSRCH's Fast-LLM (open-sourced at github.com/ServiceNow/Fas…). Congrats @tscholak, @carnaticfiddle, @jlamypoirier + our many contributors!