Sotiris Anagnostidis
@SAnagnostidis
PhD at ETH Zürich. MLP-pilled 💊. Previously @Meta GenAI, @GoogleDeepMind, @Huawei, @ntua
📣 Anthropic Zurich is hiring again 🇨🇭 The team has been shaping up fantastically over the past few months, and I have re-opened applications for pre-training. We welcome applications from anywhere along the "scientist/engineer spectrum". If building the future of AI for the…
🚀 Want to speed up your image and video model inference? Come see our highlight poster at @CVPR : "FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute" 📍 Today at 4 PM, ExHall D – Poster #205 🔗 arxiv.org/abs/2502.20126 Work done…
Thrilled to share that our CVPR 2025 paper “𝐀𝐮𝐭𝐨𝐫𝐞𝐠𝐫𝐞𝐬𝐬𝐢𝐯𝐞 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 𝐨𝐟 𝐃𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧 𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬” (ARD) has been selected as an Oral! ✨ Catch us at CVPR on Saturday, June 14 🗣 Oral Session 4A — 14:00-14:15, Karl Dean Ballroom…
If you are at @iclr_conf, come by my poster tomorrow at 10:00 am! You'll find me at Hall 3 + Hall 2B, poster #367! See you there! iclr.cc/virtual/2025/p… #ICLR2025
Everybody gangsta until SDEs work! #ICLR2025 Noise hits every optimizer differently! For SignSGD, adaptivity increases resistance to gradient noise, while AdamW enjoys extreme stability. Plus, AdamW has an exciting new scaling rule! More below👇! arxiv.org/abs/2411.15958
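The contrast is easiest to see in the raw update rules. Below is a minimal sketch (the textbook versions, not the paper's code) of a SignSGD step next to an AdamW step: SignSGD moves every weight by a fixed magnitude, so gradient noise only matters through sign flips, while AdamW rescales by running moment estimates and applies decoupled weight decay.

```python
import torch

def signsgd_step(param, grad, lr=1e-3):
    # SignSGD: fixed-magnitude update; noise only matters if it flips a sign.
    param.data.add_(grad.sign(), alpha=-lr)

def adamw_step(param, grad, state, lr=1e-3, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=0.01):
    # AdamW: decoupled weight decay plus rescaling by first/second moments.
    m, v, t = state["m"], state["v"], state["t"] + 1
    m.mul_(betas[0]).add_(grad, alpha=1 - betas[0])
    v.mul_(betas[1]).addcmul_(grad, grad, value=1 - betas[1])
    m_hat = m / (1 - betas[0] ** t)
    v_hat = v / (1 - betas[1] ** t)
    param.data.mul_(1 - lr * weight_decay)
    param.data.addcdiv_(m_hat, v_hat.sqrt() + eps, value=-lr)
    state["t"] = t
```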
Ever wondered how the loss landscape of Transformers differs from that of other architectures? Or which Transformer components make its loss landscape unique? With @unregularized & @f_dangel, we explore this via the Hessian in our #ICLR2025 spotlight paper! Key insights👇 1/8
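If you want to probe this yourself, the standard trick is to query the Hessian through Hessian-vector products instead of materialising it. A minimal PyTorch sketch (assuming a scalar loss already computed from your model and a list of its parameters) is below; power iteration on top of it recovers the leading eigenvalues of the loss landscape.

```python
import torch

def hessian_vector_product(loss, params, vec):
    """Return H @ vec without forming the Hessian H of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    grad_dot_v = torch.dot(flat_grad, vec)            # scalar, still differentiable
    hvp = torch.autograd.grad(grad_dot_v, params, retain_graph=True)
    return torch.cat([h.reshape(-1) for h in hvp])

# Power iteration (v <- Hv / ||Hv||) converges to the top Hessian eigenvector,
# and v^T H v then estimates the largest eigenvalue.
```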
🚨 NEW PAPER DROP! Wouldn't it be nice if LLMs could spot and correct their own mistakes? And what if we could do so directly from pre-training, without any SFT or RL? We present a new class of discrete diffusion models, called GIDD, that are able to do just that: 🧵1/12
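To make the idea concrete, here is a deliberately simplified, hypothetical sketch of what "spotting and correcting your own mistakes" can look like at sampling time: let the model re-score every position and resample the tokens it is least confident in. The `model(tokens)` interface and the fixed step count are placeholders; this is only the general flavour, not the actual GIDD sampler.

```python
import torch

@torch.no_grad()
def self_correct(model, tokens, steps=8, temperature=1.0):
    # tokens: LongTensor of shape (seq_len,); model returns (seq_len, vocab) logits.
    for _ in range(steps):
        probs = torch.softmax(model(tokens) / temperature, dim=-1)
        conf = probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
        pos = conf.argmin()                      # position the model trusts least
        tokens[pos] = torch.multinomial(probs[pos], 1).item()
    return tokens
```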
Only a few more days to apply to our amazing PhD program! imprs.is.mpg.de/application
Updated camera-ready: arxiv.org/abs/2405.19279. New results include: - non-diagonal preconditioners (SOAP/Shampoo) minimise OFs compared to diagonal ones (Adam/AdaFactor) - scaling to 7B params - showing that our methods for reducing OFs also make PTQ int8 quantisation easier. Check it out!
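The quantisation connection is easy to reproduce in a few lines. A back-of-the-envelope illustration (not taken from the paper): with a symmetric per-tensor int8 scheme, a single outlier inflates the quantisation scale and degrades the precision of every other value.

```python
import numpy as np

def int8_roundtrip_mse(x):
    scale = np.abs(x).max() / 127.0               # symmetric per-tensor scale
    q = np.clip(np.round(x / scale), -127, 127)
    return np.mean((q * scale - x) ** 2)

rng = np.random.default_rng(0)
acts = rng.normal(size=4096).astype(np.float32)
print("no outlier  :", int8_roundtrip_mse(acts))
acts[0] = 100.0                                   # inject one outlier feature
print("with outlier:", int8_roundtrip_mse(acts))
```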
Outlier Features (OFs) aka “neurons with big features” emerge in standard transformer training & prevent benefits of quantisation🥲but why do OFs appear & which design choices minimise them? Our new work (+@lorenzo_noci @DanielePaliotta @ImanolSchlag T. Hofmann) takes a look👀🧵
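For a quick diagnostic of whether a layer has developed outlier features, one simple and commonly used proxy is the kurtosis of its activations: heavy tails mean a few neurons dominate. A small sketch, with the activation tensor layout as an assumption:

```python
import torch

def activation_kurtosis(acts):
    # acts: (num_tokens, hidden_dim) activations from one layer.
    z = (acts - acts.mean(dim=0)) / acts.std(dim=0).clamp_min(1e-12)
    return (z ** 4).mean()  # ~3 for Gaussian activations, much larger with outliers
```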
Are LLMs easily influenced? Interesting work from @SAnagnostidis and @jannis1 TLDR: Having an LLM advocate for an answer to a question in the prompt significantly influences its predictions
We’re presenting our work on concept guidance today at the 13:30 ICML poster session (poster #706). Come by and say hi! #ICML #ICML2024
🚨📜 Announcing our latest work on LLM interpretability: We are able to control a model's humor, creativity, quality, truthfulness, and compliance by applying concept vectors to its hidden neural activations. 🧵 arxiv.org/abs/2402.14433
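Mechanically, the intervention is simple once a concept vector has been extracted: add it to the hidden activations of a chosen layer during the forward pass. Below is a minimal PyTorch sketch, assuming a Hugging Face-style decoder and an already-computed `concept_vec`; the layer index, scale, and module path are illustrative rather than the paper's settings.

```python
import torch

def add_concept_hook(model, layer_idx, concept_vec, scale=4.0):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * concept_vec.to(hidden.device, hidden.dtype)
        return ((hidden,) + output[1:]) if isinstance(output, tuple) else hidden
    layer = model.model.layers[layer_idx]  # module path depends on the architecture
    return layer.register_forward_hook(hook)

# handle = add_concept_hook(model, layer_idx=12, concept_vec=concept_vec)
# ... generate as usual, then handle.remove() to restore the unsteered model.
```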
Join us today at 13:30 at #ICML to learn how to navigate across scaling laws and how to accelerate your training! Poster #1007
Scaling laws predict the minimum required amount of compute to reach a given performance, but can we do better? Yes, if we allow for a flexible "shape" of the model! 🤸
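For scale, a back-of-the-envelope sketch of the baseline prediction (standard approximations, not the paper's method): with the common C ≈ 6·N·D FLOP estimate and a Chinchilla-style ~20 tokens-per-parameter rule of thumb, a fixed compute budget pins down one "optimal" model size; the question above is whether a more flexible model shape can beat that point.

```python
C = 1e21                                  # training compute budget in FLOPs
tokens_per_param = 20                     # Chinchilla-style rule of thumb
N = (C / (6 * tokens_per_param)) ** 0.5   # params, from C = 6 * N * (20 * N)
D = tokens_per_param * N                  # training tokens
print(f"~{N / 1e9:.1f}B params on ~{D / 1e9:.0f}B tokens")  # ≈ 2.9B params, 58B tokens
```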
The University of Basel, Switzerland, is offering an open-rank Professorship in AI and Foundation Models. For more information, visit this link: jobs.unibas.ch/offene-stellen….
To try it out yourself and for technical implementation details, check out our HF space and GitHub. 🤗 Demo: huggingface.co/spaces/dvruett… 👾 Code: github.com/dvruette/conce…
🚨 Calling on all FABRIC users! We need your help to learn about how you’ve been using FABRIC. Help us by taking 5 minutes to fill out the survey. Haven’t tried FABRIC yet? Just try it using our Gradio demo! ✨👨🎨 📊 Survey: forms.gle/aMWLDW8xvyhkLb… 👾 Demo:…
🚨📜 Announcing FABRIC, a training-free method for using iterative feedback to improve the results of any Stable Diffusion model. Instead of spending hours finding the right prompt, just click 👍/👎 to tell the model exactly what you want. 🤗 Demo: huggingface.co/spaces/dvruett…
LIME: Localized Image Editing via Attention Regularization in Diffusion Models paper page: huggingface.co/papers/2312.09… Diffusion models (DMs) have gained prominence due to their ability to generate high-quality, varied images, with recent advancements in text-to-image generation.…
I’ll be presenting "Scaling MLPs" at #NeurIPS2023, tomorrow (Wed) at 10:45am! Hyped to discuss things like inductive bias, the bitter lesson, compute-optimality and scaling laws 👷⚖️📈
Attending #NeurIPS2023 in New Orleans this week to present OpenAssistant (arxiv.org/abs/2304.07327)! Happy to chat about open-source LLMs, personalized image generation, and more. DMs are open!