Avi Schwarzschild
@A_v_i__S
Trying to learn about deep learning faster than deep learning can learn about me.
At #ICML2025, I am super excited to introduce STAMP. This is a marriage b/w dataset inference & watermarking that finally(!) lets creators PROVE their content was used to train LLMs🔍 It's a MAJOR push taking this academic problem into the real world. w/Saksham Rastogi @danish037 🧵
I will talk about how to train agents with decision making capabilities that generalize to completely new environments: x.com/FahimTajwar10/…
Interacting with the external world and reacting based on outcomes are crucial capabilities of agentic systems, but existing LLMs’ ability to do so is limited. Introducing Paprika 🌶️, our work on making LLMs general decision makers that can solve new tasks zero-shot. 🧵 1/n
🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
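For concreteness, here is a minimal PyTorch sketch of the two setups being contrasted; the hyperparameters are illustrative placeholders, not the settings from the referenced work:

```python
# A minimal sketch of the tweet's comparison: plain SGD (no momentum) vs. AdamW.
# All hyperparameter values below are assumed for illustration only.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)  # stand-in for a transformer block

# Vanilla small-batch SGD: no momentum, no weight decay -- the "intro ML" optimizer.
sgd = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.0)

# A typical AdamW configuration for LLM pretraining (values assumed).
adamw = torch.optim.AdamW(model.parameters(), lr=3e-4,
                          betas=(0.9, 0.95), weight_decay=0.1)

# A per-FLOP comparison then amounts to running both under a matched token budget,
# with the SGD run using smaller batches and correspondingly more steps.
```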
Introducing ARGUS 👁️ A benchmark for measuring hallucinations and omissions in free-form captions generated by Video-LLMs.
We're now merging code edits at 4300 tok/s, over 2x faster than the Llama 70b deployment on Cerebras. docs.relace.ai/docs/instant-a…
Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a nutshell, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,…
Mean Flows for One-step Generative Modeling "We introduce the notion of average velocity to characterize flow fields, in contrast to instantaneous velocity modeled by Flow Matching methods. A well-defined identity between average and instantaneous velocities is derived and…
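For readers who want the relation spelled out: the following is a sketch of the identity implied by the definitions in the quoted abstract. The notation (u for average velocity, v for instantaneous velocity) is ours and may differ from the paper's.

```latex
% Average velocity over [r, t], following the plain meaning of "average velocity":
\[
  u(z_t, r, t) \;\triangleq\; \frac{1}{t - r}\int_{r}^{t} v(z_\tau, \tau)\, d\tau .
\]
% Multiplying by (t - r) and differentiating in t along the trajectory gives an
% identity relating the two velocities,
\[
  v(z_t, t) \;=\; u(z_t, r, t) \;+\; (t - r)\,\frac{d}{dt}\, u(z_t, r, t),
\]
% which a network predicting u can be trained to satisfy.
```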
I find it interesting that people who believe LLMs/autoregressive models are a dead end base their arguments either on philosophical hypotheses that are hard to test or rebut, or on micro failures (e.g., 9.11 vs 9.9) to predict paradigm-level macro failures. All the while the…
📣Thrilled to announce I’ll join Carnegie Mellon University (@CMU_EPP & @LTIatCMU) as an Assistant Professor starting Fall 2026! Until then, I’ll be a Research Scientist at @AIatMeta FAIR in SF, working with @kamalikac’s amazing team on privacy, security, and reasoning in LLMs!
Looking forward to giving a talk this Friday @OpenAI with @zhilifeng on some of our privacy & memorization research + how it applies to production LLMs! We've been gaining momentum on detecting, quantifying & erasing memorization; excited to explore its real-world impact!
I'm very excited to talk about compression-based memorization with @pratyushmaini this Friday at the @OpenAI Security Research Conference! Let's chat about compression, memorization, and also our new antidistillation sampling antidistillation.com!
✨ Love 4o-style image generation but prefer to use Midjourney? Tired of manual prompt crafting from inspo images? PRISM to the rescue! 🖼️→📝→🖼️ We automate black-box prompt engineering—no training, no embeddings, just accurate, readable prompts from your inspo images! 1/🧵
Why do larger language models generalize better? In our new ICLR paper, we derive an interpretable generalization bound showing that compute-optimal LLMs provably generalize better with scale! 📄arxiv.org/abs/2504.15208 1/7🧵
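As context, bounds of this kind typically take the generic shape sketched below; this is a standard template, not the specific bound derived in the linked paper:

```latex
\[
  \mathbb{E}\!\left[\mathcal{L}_{\mathrm{test}}\right]
  \;\le\;
  \hat{\mathcal{L}}_{\mathrm{train}}
  \;+\;
  \sqrt{\frac{C(h) + \log(1/\delta)}{2n}}
\]
% C(h): a complexity / compression term for the trained model h,
% n: number of independent training samples, delta: failure probability.
% Roughly, the thread's claim is that for compute-optimal LLMs the right-hand
% side shrinks as models are scaled up.
```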