Michael Eli Sander
@m_e_sander
Research Scientist at Google DeepMind
🚨🚨New ICML 2024 Paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1, with @RGiryes, @btreetaiji, @mblondel_ml and @gabrielpeyre 🙏
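For readers new to the setup: an order-1 autoregressive process generates each token from the previous one through a fixed matrix. A minimal NumPy sketch of the data-generating process, with a least-squares in-context baseline (the estimator is our illustration, not the paper's construction):

```python
import numpy as np

# Toy order-1 autoregressive process: s_{t+1} = W s_t for a fixed
# matrix W. The paper asks how a causal Transformer learns such a
# process in context; here we only simulate the process and fit a
# least-squares baseline (our illustration, not the paper's model).
rng = np.random.default_rng(0)
d, T = 8, 32

# Random orthogonal W (QR factorization of a Gaussian matrix).
W, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Roll out the context s_0, s_1 = W s_0, ..., s_T.
s = [rng.standard_normal(d)]
for _ in range(T):
    s.append(W @ s[-1])
S = np.stack(s)  # shape (T + 1, d)

# Estimate W from the (s_t, s_{t+1}) pairs seen in context,
# then predict the next token.
X, Y = S[:-1], S[1:]
W_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
print("estimation error:", np.linalg.norm(W_hat - W))
print("next-token prediction ok:", np.allclose(W_hat @ S[-1], W @ S[-1]))
```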

🚨New paper alert🚨: arxiv.org/abs/2410.01537 How do Transformers retrieve information that is sparsely concentrated in a few tokens? e.g., the label can change by flipping a single word. To explain this, we introduce a new statistical task and show that attention solves it ⬇️
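A toy rendering of the idea (our illustrative assumption, not the paper's exact statistical task): the signal lives in a single token of the sequence, and one softmax attention read-out with a matched query recovers it.

```python
import numpy as np

# Toy sparse-retrieval task: the label-carrying signal sits in ONE
# token of the sequence, and a single softmax attention read-out
# whose query matches the signal direction retrieves it.
rng = np.random.default_rng(0)
d, T = 16, 20

tokens = rng.standard_normal((T, d))      # noise tokens
signal = rng.standard_normal(d)
pos = rng.integers(T)                     # plant the signal once
tokens[pos] += 5.0 * signal

scores = tokens @ signal                  # query = signal direction
weights = np.exp(scores - scores.max())
weights /= weights.sum()
readout = weights @ tokens                # ~ the planted token
print("attention mass on the signal token:", round(float(weights[pos]), 3))
```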
Distillation is becoming a major paradigm for training LLMs, but its success and failure modes remain quite mysterious. Our paper introduces the phenomenon of "teacher hacking" and studies how to mitigate it. arxiv.org/abs/2502.02671 More details in the thread below.
1/ If you’re familiar with RLHF, you likely heard of reward hacking —where over-optimizing the imperfect reward model leads to unintended behaviors. But what about teacher hacking in knowledge distillation: can the teacher be hacked, like rewards in RLHF?
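To make the proxy concrete, here is the usual token-level distillation objective in PyTorch (a generic sketch; the paper's exact loss, KL direction, and temperature handling may differ):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    # Token-level KL(teacher || student). Teacher hacking is about
    # over-optimizing this proxy: the student keeps improving against
    # the imperfect teacher while drifting from the true distribution.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logp = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(s_logp, t_probs, reduction="batchmean") * temperature**2

# Usage on dummy logits of shape (batch, vocab_size).
student = torch.randn(4, 100, requires_grad=True)
teacher = torch.randn(4, 100)
distillation_loss(student, teacher).backward()
```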
Really proud of these two companion papers by our team at GDM: 1) Joint Learning of Energy-based Models and their Partition Function arxiv.org/abs/2501.18528 2) Loss Functions and Operators Generated by f-Divergences arxiv.org/abs/2501.18537 A thread.
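For context on why the partition function is the hard part (a textbook identity, not these papers' contribution): an energy-based model is only defined up to its normalizer, which enters the loss directly,

```latex
p_\theta(x) = \frac{e^{-E_\theta(x)}}{Z(\theta)},
\qquad
Z(\theta) = \int e^{-E_\theta(x)}\,\mathrm{d}x,
\qquad
-\log p_\theta(x) = E_\theta(x) + \log Z(\theta).
```

Jointly learning an estimate of the log-partition term alongside the energy is the route the first paper's title points to.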
I’m at NeurIPS this week :) Friday, presenting our spotlight work: Watermarking Makes LLMs Radioactive ☢️ (arxiv.org/abs/2402.14904) Sunday, speaking at the image watermarking workshop about our latest Watermark Anything work (arxiv.org/abs/2411.07231) DM me if you’d like to chat :)
♟️Mastering Board Games by External and Internal Planning with Language Models♟️ I'm happy to finally share storage.googleapis.com/deepmind-media… TL;DR: In chess, our planning agents effectively reach grandmaster-level strength with a comparable search budget to that of human players!
I'm excited to share a new paper: "Mastering Board Games by External and Internal Planning with Language Models" storage.googleapis.com/deepmind-media… (also soon to be up on arXiv, once it's been processed there)
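As a sketch of what "external" planning means here (search wrapped around the model), a one-ply loop with python-chess, where a material-count stub stands in for the learned value model (the stub and function names are our assumptions, not the paper's agent):

```python
import chess  # python-chess

def value(board: chess.Board) -> float:
    # Stub for the learned value model: plain material count from the
    # side-to-move's perspective (the paper scores positions with a
    # language model; this stub is our placeholder).
    vals = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
            chess.ROOK: 5, chess.QUEEN: 9}
    score = sum(v * (len(board.pieces(pt, chess.WHITE))
                     - len(board.pieces(pt, chess.BLACK)))
                for pt, v in vals.items())
    return score if board.turn == chess.WHITE else -score

def external_plan(board: chess.Board) -> chess.Move:
    # One-ply external search: try each legal move, score the child
    # position with the value model, keep the best (negamax sign flip,
    # since the child position is from the opponent's perspective).
    best_move, best_score = None, float("-inf")
    for move in board.legal_moves:
        board.push(move)
        score = -value(board)
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

print(external_plan(chess.Board()))
```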
Thank you for the opportunity to talk about my research and my experiences! Thanks to my thesis advisors @gabrielpeyre and @RemiGribonval for your supervision 😊
📽️We interviewed @SibylleMarcotte, PhD student at @ENS_ULM, member of the Ockham team, winner 🏆 of the France 2024 L'Oréal-@UNESCO Young Talents Prize #ForWomenInScience ▶️her research and her advice for girls who want to become #scientists :) @UnivLyon1 @ENSdeLyon
☢️ Some news about radioactivity ☢️ - We got a Spotlight at NeurIPS! 🥳 and we will be in Vancouver with @pierrefdz to present! - We have just released our code for radioactivity detection at github.com/facebookresear….
OpenAI may secretly know that you trained on GPT outputs! In our work "Watermarking Makes Language Models Radioactive", we show that training on watermarked text can be easily spotted ☢️ Paper: arxiv.org/abs/2402.14904 @pierrefdz @AIatMeta @Polytechnique @Inria
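The detection principle can be sketched with a "green list" style watermark test (a generic stand-in for the paper's detector; function and parameter names are ours):

```python
import math

def greenlist_zscore(tokens, is_green, gamma=0.5):
    # Detection in the style of "green list" LLM watermarks: under the
    # null hypothesis each token is green with probability gamma, so a
    # one-sided z-test on the green-token count flags watermarked, and
    # hence "radioactive", text with a quantifiable p-value.
    n = len(tokens)
    greens = sum(1 for t in tokens if is_green(t))
    return (greens - gamma * n) / math.sqrt(gamma * (1 - gamma) * n)

# Toy usage: a pseudorandom green list over token ids; unwatermarked
# token streams should give z-scores near 0.
z = greenlist_zscore(list(range(1000)),
                     is_green=lambda t: hash((t, "salt")) % 2 == 0)
print("z-score:", z)
```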
🔒Image watermarking is promising for digital content protection. But images often undergo many modifications—spliced or altered by AI. Today at @AIatMeta, we released Watermark Anything that answers not only "where does the image come from," but "what part comes from where." 🧵
Six years at Google today! 🎉 From 🇨🇦 to 🇨🇭, optimizing everything in sight. Grateful for the incredible journey and amazing colleagues!
🏆Didn't get the Physics Nobel Prize this year, but really excited to share that I've been named one of the #FWIS2024 @FondationLOreal-@UNESCO French Young Talents alongside 34 amazing young researchers! This award recognizes my research on deep learning theory #WomenInScience 👩‍💻
#FWIS2024 🎖️@SibylleMarcotte, PhD student in the mathematics and applications department of ENS @psl_univ, is among the winners of the France 2024 Young Talents Prize @FondationLOreal @UNESCO #ForWomenInScience @AcadSciences @4womeninscience Congratulations to her!!! 👏
🥳🥳 Thrilled to share that I've joined Google DeepMind as a Research Scientist. Super excited for what's to come!

After a very constructive back and forth with editors and reviewers of @NatureComms, scConfluence has now been published @LauCan88 @gabrielpeyre ! I'll present it this afternoon at the poster session of @ECCBinfo (P296) Published version: nature.com/articles/s4146…
🥳 I’m very happy to announce our preprint biorxiv.org/content/10.110… ! scConfluence combines uncoupled autoencoders with Inverse Optimal Transport to integrate unpaired multimodal single-cell data in a shared low-dimensional latent space. @LauCan88 @gabrielpeyre
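A rough sketch of the alignment idea only (the random-projection encoders, the Sinkhorn cost, and all names are our illustrative assumptions; the actual model uses uncoupled autoencoders trained with Inverse Optimal Transport):

```python
import numpy as np
import ot  # POT: Python Optimal Transport (assumed dependency)

# Sketch of the high-level idea: embed two unpaired modalities into
# the same latent space, then use an entropy-regularized OT cost
# between the two embedding clouds as an alignment signal.
rng = np.random.default_rng(0)
rna = rng.standard_normal((100, 2000))   # toy stand-in for RNA counts
prot = rng.standard_normal((80, 100))    # toy stand-in for protein counts

enc_rna = rng.standard_normal((2000, 16)) / np.sqrt(2000)
enc_prot = rng.standard_normal((100, 16)) / np.sqrt(100)
z_rna, z_prot = rna @ enc_rna, prot @ enc_prot

M = ot.dist(z_rna, z_prot)                      # pairwise latent costs
a, b = ot.unif(len(z_rna)), ot.unif(len(z_prot))
align_cost = ot.sinkhorn2(a, b, M, reg=0.1)     # entropic OT cost
print("alignment cost:", float(align_cost))
```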
"Transformers are Universal In-context Learners": in this paper, we show that deep transformers with a fixed embedding dimension are universal approximators for an arbitrarily large number of tokens. arxiv.org/abs/2408.01367
🎉 New preprint! biorxiv.org/content/10.110… STORIES learns a differentiation potential from spatial transcriptomics profiled at several time points using Fused Gromov-Wasserstein, an extension of Optimal Transport. @gabrielpeyre @LauCan88
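For readers who want to try Fused Gromov-Wasserstein itself, a minimal example with the POT library (a generic illustration of the tool, not STORIES; assumes POT's current fused_gromov_wasserstein API):

```python
import numpy as np
import ot  # POT: Python Optimal Transport (assumed dependency)

# FGW matches two point clouds by trading off a feature cost (as in
# plain OT) against preservation of intra-cloud structure (as in
# Gromov-Wasserstein).
rng = np.random.default_rng(0)
n, m = 10, 12
X, Y = rng.standard_normal((n, 2)), rng.standard_normal((m, 2))    # coordinates
fX, fY = rng.standard_normal((n, 5)), rng.standard_normal((m, 5))  # features

M = ot.dist(fX, fY)                     # cross-cloud feature costs
C1, C2 = ot.dist(X, X), ot.dist(Y, Y)   # intra-cloud structure matrices
p, q = ot.unif(n), ot.unif(m)           # uniform marginals

# alpha interpolates between pure OT (alpha=0) and pure GW (alpha=1).
T = ot.gromov.fused_gromov_wasserstein(M, C1, C2, p, q,
                                       loss_fun="square_loss", alpha=0.5)
print("coupling shape:", T.shape)       # (n, m) transport plan
```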
🚨🚨 AI in Bio release 🧬 Very happy to share my work on a Large Cell Model for Gene Network Inference. It is for now just a preprint and more is to come. We are asking the question: “What can 50M cells tell us about gene networks?” ❓Behind it, other questions arose like:…
We uploaded a v2 of our book draft "The Elements of Differentiable Programming" with many improvements (~70 pages of new content) and a new chapter on differentiable data structures (lists and dictionaries). arxiv.org/abs/2403.14606
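As a taste of what a differentiable data structure looks like, a soft dictionary read in NumPy (a sketch in the spirit of the new chapter, not the book's code):

```python
import numpy as np

def soft_lookup(query, keys, values, temperature=1.0):
    # A differentiable dictionary read: instead of an exact key match,
    # return a softmax-weighted average of the stored values, which is
    # smooth in query, keys, and values, hence admits gradients.
    scores = keys @ query / temperature
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

keys = np.eye(3)                    # three stored keys
values = np.array([1.0, 2.0, 3.0])  # their associated values
print(soft_lookup(np.array([10.0, 0.0, 0.0]), keys, values))  # ~1.0
```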
Come and see us today at 1:30 pm at spot #411 for our poster session!!
You didn’t believe Differentially Private training was possible for foundation models? We achieved the same performance as a non-private MAE trained on the same dataset, but with rigorous DP guarantees. Code is released: github.com/facebookresear…. Presenting tomorrow at ICML, 11:30AM poster, #2313
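For reference, the core DP-SGD recipe that such training builds on, as a generic PyTorch sketch (not the paper's exact MAE setup; the per-sample loop is kept naive for clarity):

```python
import torch

def dp_sgd_step(model, loss_fn, batch, lr=0.1, clip=1.0, noise_mult=1.0):
    # One DP-SGD step: clip each per-sample gradient to bound
    # sensitivity, then add Gaussian noise calibrated to the clipping
    # norm before updating the parameters.
    params = [p for p in model.parameters() if p.requires_grad]
    grad_sum = [torch.zeros_like(p) for p in params]
    for x, y in batch:  # naive per-sample loop, for clarity only
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
        scale = min(1.0, clip / (norm.item() + 1e-12))
        for g, p in zip(grad_sum, params):
            g += p.grad * scale
    with torch.no_grad():
        for g, p in zip(grad_sum, params):
            noise = noise_mult * clip * torch.randn_like(g)
            p -= lr * (g + noise) / len(batch)

# Toy usage on a linear model.
model = torch.nn.Linear(4, 1)
batch = [(torch.randn(4), torch.randn(1)) for _ in range(8)]
dp_sgd_step(model, torch.nn.functional.mse_loss, batch)
```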