Torsten Scholak
@tscholak
Lead Research Scientist, Foundation Models Lab @ServiceNowRSRCH. Opinions are not those of my employer.
🚨🤯 Today Jensen Huang announced SLAM Lab's newest model on the @HelloKnowledge stage: Apriel-Nemotron-15B-Thinker 🚨 A lean, mean reasoning machine punching way above its weight class 👊 Built by SLAM × NVIDIA. Smaller models, bigger impact. 🧵👇
As #ICML2025 kicks off in Vancouver, our AI talent is being quietly pushed out. 🇨🇦 We've been waiting 28 months for permanent residency, but @CitImmCanada won't budge. Please read and share our story: facebook.com/share/p/1AwU2f… linkedin.com/posts/gbxhuang… #IRCC #AI #Immigration
Nice release! Worth noting that MoE x Mamba gives coverage, not multiplicative speed-ups:
* small batch: expert sparsity keeps latency low
* medium-to-large batch: Mamba's KV-free scan scales while attention would choke
Net: below dense latency across the board, but no compounding.
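A rough back-of-envelope sketch of that regime argument, treating decode as roughly memory-bound. All constants below are made up for illustration (nothing here comes from the release); the point is which term dominates in each batch regime, not the absolute numbers.

```python
# Toy latency model: bytes moved per decode step as a proxy for step time.
# Dense transformer: read all weights plus a KV cache that grows with batch * ctx.
# MoE x Mamba hybrid: read only the routed experts' weights plus a small
# per-sequence recurrent state that does not grow with context length.
GB = 1e9

def dense_step_bytes(batch, ctx):
    weights = 30 * GB          # hypothetical dense weight bytes read each step
    kv_per_token = 0.5e6       # hypothetical KV-cache bytes per cached token
    return weights + batch * ctx * kv_per_token

def moe_mamba_step_bytes(batch, ctx):
    active_weights = 8 * GB    # only the active experts are touched (sparsity win)
    state_per_seq = 20e6       # Mamba state per sequence; ctx is irrelevant here
    return active_weights + batch * state_per_seq

ctx = 8192
for batch in (1, 8, 64, 512):
    ratio = dense_step_bytes(batch, ctx) / moe_mamba_step_bytes(batch, ctx)
    print(f"batch={batch:4d}  dense/hybrid ≈ {ratio:5.1f}x")
```

At batch 1 the gap is essentially the sparsity ratio; at large batch it is driven by the missing KV cache. Each mechanism covers one regime rather than multiplying together, which is the "coverage, not compounding" point.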
Crazy that we now have an open-source model with 13B params that's competitive with o1. And the Mamba layers bring much higher inference throughput.
Thanks @_akhaliq for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! arxiv.org/abs/2505.20793 More details on…
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
RLRF significantly outperforms supervised fine-tuning, addressing common failure modes and enabling precise, high-quality SVG generation with strong structural understanding and generalization.
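To make the rendering-feedback idea concrete, here's a toy sketch (not the paper's implementation): the generated SVG is rasterized and scored against the target image, and markup that fails to render is penalized. `rasterize` and `image_similarity` below are trivial stand-ins for a real rasterizer and a real perceptual metric.

```python
# Toy reward from rendering feedback: render the SVG, compare to the target.
def rasterize(svg_code: str) -> list[float]:
    # stand-in for a real rasterizer; returns a fake 16-"pixel" image
    if "<svg" not in svg_code:
        raise ValueError("invalid SVG")
    return [float(len(svg_code) % 7)] * 16

def image_similarity(a: list[float], b: list[float]) -> float:
    # stand-in for a perceptual metric: negative mean squared error
    return -sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rendering_reward(svg_code: str, target_image: list[float]) -> float:
    try:
        return image_similarity(rasterize(svg_code), target_image)
    except ValueError:
        return -1.0  # SVG that does not render gets a flat penalty

target = [3.0] * 16
for candidate in ["<svg><circle r='5'/></svg>", "not svg at all"]:
    print(candidate[:20], "->", rendering_reward(candidate, target))
```

In an RL loop this kind of reward would replace token-level supervision, which is what lets the policy learn from how the SVG looks rather than how it is written.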
🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datasets. Test your agents on them now! 🚀 🤗 Dataset: huggingface.co/datasets/Servi… #ICML2025 #AI #DatasetRelease #Agents
🚀 Super excited to announce UI-Vision: the largest and most diverse benchmark for evaluating agents on real-world desktop GUIs in offline settings.
📄 Paper: arxiv.org/abs/2503.15661
🌐 Website: uivision.github.io
🧵 Key takeaways 👇
🚀 Announced at #Knowledge25: @ServiceNow & @nvidia introduce Apriel Nemotron 15B. Apriel Nemotron 15B is a compact, cost-efficient AI model built with NVIDIA NeMo and trained on NVIDIA DGX Cloud. It delivers real-time responses, handles complex workflows, and scales across IT,…
Together with @NVIDIA, we're launching a new class of intelligent AI agents. Our Apriel Nemotron 15B model, co-developed with NVIDIA, offers lower latency, reduced inference costs, and faster agentic AI. This partnership also brings a joint data flywheel architecture powered by…
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: github.com/ServiceNow/Pip… Blog: huggingface.co/blog/ServiceNo…
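A minimal sketch of the in-flight update idea, assuming nothing about PipelineRL's actual internals: one thread stands in for the trainer publishing new weight versions, and the generator swaps to the latest version between decode steps instead of draining its batch first. All names and timings here are illustrative.

```python
import queue
import threading
import time

# trainer -> generator channel for "new weights are ready" notifications
weight_updates: "queue.Queue[int]" = queue.Queue()

def trainer(num_versions: int = 3) -> None:
    for version in range(1, num_versions + 1):
        time.sleep(0.05)             # pretend to run an optimizer step
        weight_updates.put(version)  # publish the new weight version

def generator(tokens_per_seq: int = 10) -> None:
    version = 0
    for step in range(tokens_per_seq):
        try:
            # hot-swap to the newest weights without stopping generation
            version = weight_updates.get_nowait()
            print(f"step {step}: switched to weights v{version}")
        except queue.Empty:
            pass
        time.sleep(0.02)             # pretend to decode one token for the batch
    print(f"sequence finished on weights v{version}")

t = threading.Thread(target=trainer)
g = threading.Thread(target=generator)
t.start(); g.start()
t.join(); g.join()
```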
AI folks at ServiceNow have been cooking. And they cooked a very delicious small 5B parameter cookie!
🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨 Speed ⚡ + Accuracy 📈 + Efficiency 💸 This model punches above its weight, beating bigger LLMs while training on a fraction of the compute. Built with Fast-LLM, our in-house training stack. 🧵👇
There aren't that many newcomers in the SLM space, and this one looks very interesting. MIT-licensed base models and a new open-source pretraining library with apparently significant efficiency gains.
And we did it with just:
🖥️ 480 x H100s
⏱️ ~91,000 H100-hours
🧮 4.8B params, bfloat16
💸 2.3x fewer GPU hours than OLMo-2-7B
Thanks to Fast-LLM, github.com/ServiceNow/Fas…, our custom training stack for speed and scale. No hacks. Just better infra.
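Back of the envelope, assuming all 480 GPUs were busy for the whole run (the tweet doesn't state wall-clock time, so this is just a sanity check):

```python
gpu_hours = 91_000
gpus = 480
print(f"≈ {gpu_hours / gpus / 24:.1f} days of wall-clock training")  # ≈ 7.9 days
```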
Showing off Apriel-5B 🚀, an efficient and effective compact model. Congrats to the whole SLAM team led by @tscholak @carnaticfiddle 👏
Some new work from the team!
Exciting release from ServiceNow Research: introducing Apriel-5B, a compact, open-source language model that targets the sweet spot between capability and efficiency.
This new, speedy, and efficient language model arose from a fruitful collaboration between two teams at ServiceNow! Pretrained with @ServiceNowRSRCH's Fast-LLM (open-sourced at github.com/ServiceNow/Fas…). Congrats @tscholak, @carnaticfiddle, @jlamypoirier + our many contributors!