Felix Sarnthein

@__safelix__

PhD student in machine learning at @ELLISInst_Tue, @MPI_IS and @CSatETH with @orvieto_antonio. Prev: MSc in CS at @ETH

Tübingen, Germany

Joined March 2023

313 Following

94 Followers

Pinned

AlgoPerf leaderboards are out! 🎉 Amazing third place with and thanks to @orvieto_antonio, @jonasgeiping, @ELLISInst_Tue! 1/n

MLCommons@MLCommons · Aug 1

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI


Felix Sarnthein Retweeted

Dimitri von Rütte@dvruette · Mar 10

🚨 NEW PAPER DROP! Wouldn't it be nice if LLMs could spot and correct their own mistakes? And what if we could do so directly from pre-training, without any SFT or RL? We present a new class of discrete diffusion models, called GIDD, that are able to do just that: 🧵1/12


Felix Sarnthein Retweeted

Riccardo Grazzi@riccardograzzi · Nov 22

LLMs can now track states, finally matching this cat! And we prove it. But how? 🧵👇 1/ Paper: arxiv.org/abs/2411.12537 with @julien_siems @jkhfranke @ZelaArber @FrankRHutter @MPontil


Felix Sarnthein Retweeted

Laure Ciernik@lciernik · Nov 11

If two models are more similar to each other than a third on ImageNet, will this hold for medical/ satellite images? Our preprint analyzes how vision model similarities generalize across datasets, the factors that influence them, and their link to downstream task behavior. 🧵1/7


Felix Sarnthein Retweeted

ELLIS Institute Tübingen@ELLISInst_Tue · Oct 18

The new call for Principal Investigators at the ELLIS Institute Tübingen is out! 🚀 We are recruiting Principal Investigators as Hector Endowed Fellows of the ELLIS Institute Tübingen in the areas of Machine Learning, Artificial Intelligence, and related fields. The positions…


Felix Sarnthein@__safelix__ · Jul 23, 2024

Join us today at 13.30 in #ICML to learn how to navigate across scaling laws and how to accelerate your training! Poster #1007

Sotiris Anagnostidis@SAnagnostidis · Nov 9, 2023

Scaling laws predict the minimum required amount of compute to reach a given performance, but can we do better? Yes, if we allow for a flexible "shape" of the model! 🤸


Felix Sarnthein@__safelix__ · Apr 28, 2024

Our Next Generation Sequence Modeling Architectures workshop proposal was accepted by ICML! We have an incredible lineup of speakers, please come say hi and consider submitting your works! :)

Caglar Gulcehre@caglarml · Apr 28, 2024

Feeling very fortunate to co-organize this workshop with an incredible group of researchers, Razvan Pascanu, @orvieto_antonio, Carmen Amo Alonso, and Maciej Wołczyk!


Felix Sarnthein Retweeted

ELLIS Institute Tübingen@ELLISInst_Tue · Apr 12, 2024

🎙 The first episode of the @Cyber_Valley Podcast with our Principal Investigators is now out! 🚀 @Orvieto_Antonio #AIPodcast #AIResearch #AI 🔗 Learn more: institute-tue.ellis.eu/en/news/cyber-…


Felix Sarnthein Retweeted

Sasha Rush@srush_nlp · Apr 8, 2024

Monograph on "Formal Aspects of Language Modeling" from @ryandcotterell et al. arxiv.org/abs/2311.04329 It would be so nice if everyone read this and we had shared foundations. Particularly for interpretability.


Felix Sarnthein Retweeted

Lorenzo Noci@lorenzo_noci · Feb 29, 2024

Why in neural networks the learning rate can transfer from small to large models (both in width and depth)? It turns out that the sharpness dynamics can explain it. Check out our new work! arxiv.org/abs/2402.17457 w/ @alexmeterez (co-first), @orvieto_antonio and T. Hofmann


Felix Sarnthein Retweeted

Antonio Orvieto@orvieto_antonio · Feb 5, 2024

If you are looking for a PhD position in the intersection between Deep Learning and Optimization, it's not too late to apply to my group at @MPI_IS and @ELLISforEurope Institute Tübingen! Send a DM if you are interested :) institute-tue.ellis.eu/research-group…


Felix Sarnthein Retweeted

Gregor Bachmann@GregorBachmann1 · Dec 12, 2023

I’ll be presenting "Scaling MLPs" at #NeurIPS2023, tomorrow (Wed) at 10:45am! Hyped to discuss things like inductive bias, the bitter lesson, compute-optimality and scaling laws 👷⚖️📈
