Brandon Amos
@brandondamos
research scientist @MetaAI (FAIR) | optimization, machine learning, control, transport | PhD from @SCSatCMU
Excited to release AlgoTune!! It's a benchmark and coding agent for optimizing the runtime of numerical code 🚀 algotune.io 📚 algotune.io/paper.pdf 🤖 github.com/oripress/AlgoT… with @OfirPress @ori_press @PatrickKidger @b_stellato @ArmanZharmagam1 & many others 🧵
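[To make the task concrete: a toy version of the kind of before/after runtime comparison a benchmark like AlgoTune scores. This is illustrative only; `baseline_solve` and `optimized_solve` are made-up names, not AlgoTune's harness or API.]

```python
import time
import numpy as np

def baseline_solve(A, b):
    # Naive: forms the explicit inverse (slower and numerically worse).
    return np.linalg.inv(A) @ b

def optimized_solve(A, b):
    # Direct solve avoids forming the inverse.
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
A = A @ A.T + 500 * np.eye(500)  # well-conditioned SPD matrix
b = rng.standard_normal(500)

for fn in (baseline_solve, optimized_solve):
    t0 = time.perf_counter()
    for _ in range(20):
        fn(A, b)
    print(fn.__name__, f"{time.perf_counter() - t0:.3f}s")
```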
[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.
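[For intuition, a minimal sketch of spectral weight regulation via power iteration. The paper's actual scheme may differ; `spectrally_cap` is an illustrative name, not the paper's method.]

```python
import numpy as np

def spectral_norm(W, iters=30):
    # Power iteration: estimate the largest singular value of W.
    v = np.random.default_rng(0).standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    return float(u @ W @ v)

def spectrally_cap(W, target=1.0):
    # Rescale W so its spectral norm is at most `target`. Composing layers
    # with spectral norm <= 1 keeps the network's Lipschitz constant bounded
    # at the weights, with no activation-side stability tricks.
    s = spectral_norm(W)
    return W if s <= target else W * (target / s)

W = spectrally_cap(np.random.default_rng(1).standard_normal((64, 64)))
print(spectral_norm(W))  # ~1.0
```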
optimization theorem: "assume a lipschitz constant L..." the lipschitz constant:
I'm excited to share new optimality conditions for nonlinear stochastic optimal control, and the first indirect shooting method for solving these problems! 📖 arxiv.org/abs/2502.06726 💡 How? Using rough path theory ⬇️
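[For intuition, a minimal indirect-shooting sketch for the classical deterministic case. The paper's contribution is the stochastic setting via rough paths; this toy scalar problem only shows the root-find-on-the-initial-costate mechanic.]

```python
import numpy as np
from scipy.optimize import brentq

# Toy deterministic problem: minimize the integral of (x^2 + u^2)/2 with
# dynamics dx/dt = u and x(0) = 1. Pontryagin gives u* = -lam, so
# dx/dt = -lam, dlam/dt = -x, with transversality lam(T) = 0.
T, N = 1.0, 1000
dt = T / N

def shoot(lam0, x0=1.0):
    # Guess the initial costate, integrate forward, report the terminal miss.
    x, lam = x0, lam0
    for _ in range(N):
        x, lam = x - dt * lam, lam - dt * x
    return lam  # should vanish at t = T

lam0 = brentq(shoot, -10.0, 10.0)  # root-find on the terminal condition
print(lam0, np.tanh(T))            # analytic answer here: tanh(T)
```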
Here's how my recent papers & reviews are going: * To solve a vision problem today, the sensible thing is to leverage a pre-trained VLM or video diffusion model. Such models implicitly represent a tremendous amount about the visual world that we can exploit. * Figure out how to…
🚨 The era of infinite internet data is ending. So we ask: 👉 What’s the right generative modelling objective when data, not compute, is the bottleneck? TL;DR: ▶️Compute-constrained? Train Autoregressive models ▶️Data-constrained? Train Diffusion models Get ready for 🤿 1/n
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
📢Presenting SDE Matching🔥🔥🔥 🚀We extend diffusion models to construct a simulation-free framework for training Latent SDEs. It enables sampling from the exact posterior process marginals without any numerical simulations. 📜: arxiv.org/abs/2502.02472 🧵1/8
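[For context, this is the kind of numerical SDE simulation (Euler-Maruyama) that a simulation-free training framework avoids. The drift and diffusion below are a toy Ornstein-Uhlenbeck process, not the paper's model.]

```python
import numpy as np

def euler_maruyama(drift, diff, z0, T=1.0, N=500, seed=0):
    # Step-by-step integration of dZ_t = drift(Z_t, t) dt + diff(Z_t, t) dW_t.
    # Cost scales with N; simulation-free training sidesteps this loop.
    rng = np.random.default_rng(seed)
    dt = T / N
    z = np.array(z0, dtype=float)
    for n in range(N):
        t = n * dt
        z = z + drift(z, t) * dt \
              + diff(z, t) * np.sqrt(dt) * rng.standard_normal(z.shape)
    return z

# Toy Ornstein-Uhlenbeck latent process.
z_T = euler_maruyama(lambda z, t: -z, lambda z, t: 0.5, z0=[1.0, -1.0])
print(z_T)
```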
Autoregressive models are too restrictive, forcing a fixed generation order, while masked diffusion is wasteful, fitting all possible orders. Can our model dynamically decide the next position to generate based on context? Learn more in our ICML paper arxiv.org/abs/2503.05979
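[A toy sketch of context-dependent order selection, assuming a max-confidence heuristic stands in for the learned position policy; the random `model_logits` is a placeholder for a trained network.]

```python
import numpy as np

rng = np.random.default_rng(0)
V, L, MASK = 50, 8, -1          # vocab size, sequence length, mask token
seq = np.full(L, MASK)

def model_logits(seq):
    # Stand-in for a trained any-order model: per-position logits over V.
    return rng.standard_normal((L, V))

while (seq == MASK).any():
    logits = model_logits(seq)
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    conf = probs.max(-1)         # model confidence at each position
    conf[seq != MASK] = -np.inf  # only consider still-masked slots
    i = int(conf.argmax())       # dynamically choose where to decode next
    seq[i] = int(probs[i].argmax())
print(seq)
```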
Soaking up the sun (surprisingly) at #ICML2025. Come check out our work on Generative Modeling on the space of distributions, with cool applications to spatial genomics. With @brandondamos, Aram Pooladian, & @dana_peer. 🗓️ Today, 4:30-7pm 📍 East Exhibition Hall, #3301
For everyone interested in precise 📷camera control 📷 in transformers (e.g., video / world models): stop settling for Plücker raymaps -- use camera-aware relative PE in your attention layers, like RoPE (for LLMs) but for cameras! Paper & code: liruilong.cn/prope/
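[For reference, here is standard RoPE as used in LLMs, which the paper generalizes from integer token positions to camera poses; see the paper and code for the actual camera-aware construction.]

```python
import numpy as np

def rope(x, pos, base=10000.0):
    # Standard RoPE: rotate feature pairs by position-dependent angles,
    # so attention scores depend on relative positions.
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)
    ang = pos * inv_freq
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = np.ones(8)
print(rope(q, pos=3))
```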
Thrilled to share the Community Alignment dataset -- the product of a massive collaborative effort with so many awesome folks. Can't wait to see the future research it unlocks!
Today we're releasing Community Alignment - the largest open-source dataset of human preferences for LLMs, containing ~200k comparisons from >3000 annotators in 5 countries / languages! There was a lot of research that went into this... 🧵
As one of the people who popularized the field of diffusion models, I am excited to share something that might be the “beginning of the end” of it. IMM has a single stable training stage, a single objective, and a single network: the very properties that made diffusion so popular in the first place.
Today, we release Inductive Moment Matching (IMM): a new pre-training paradigm breaking the algorithmic ceiling of diffusion models. Higher sample quality. 10x more efficient. Single-stage, single network, stable training. Read more: lumalabs.ai/news/imm
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data.
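[A toy sketch of the dynamic-chunking idea, assuming boundaries are scored by dissimilarity between adjacent representations; H-Net learns this end-to-end rather than thresholding a fixed score, and the random embeddings here are placeholders.]

```python
import numpy as np

rng = np.random.default_rng(0)

def boundary_probs(byte_embs):
    # Stand-in for a learned boundary scorer: low cosine similarity between
    # adjacent embeddings suggests a chunk boundary.
    a, b = byte_embs[:-1], byte_embs[1:]
    cos = (a * b).sum(-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))
    return (1 - cos) / 2

text = b"the model discovers chunks"
embs = rng.standard_normal((len(text), 16))  # toy byte embeddings
p = boundary_probs(embs)
cuts = [0] + [i + 1 for i in range(len(p)) if p[i] > 0.5] + [len(text)]
chunks = [text[cuts[j]:cuts[j + 1]] for j in range(len(cuts) - 1)]
print(chunks)
```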
🚀 Introducing PhysiX: One of the first large-scale foundation models for physics simulations! PhysiX is a 4.5B parameter model that unifies a wide range of physical systems, from fluid dynamics to reaction-diffusion, outperforming specialized, state-of-the-art models.
Personalization methods for LLMs often rely on extensive user history. We introduce Curiosity-driven User-modeling Reward as Intrinsic Objective (CURIO) to encourage actively learning about the user within multi-turn dialogs. 📜 arxiv.org/abs/2504.03206 🌎 sites.google.com/cs.washington.…
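[A toy sketch of a curiosity-style intrinsic reward, assuming it is measured as entropy reduction in a belief over a latent user attribute. This formulation is illustrative; see the paper for CURIO's actual objective.]

```python
import numpy as np

def entropy(p):
    # Shannon entropy in nats, with clipping for numerical safety.
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

# Toy user model: a belief over a latent user attribute (4 preference types).
belief_before = np.array([0.25, 0.25, 0.25, 0.25])
belief_after = np.array([0.70, 0.10, 0.10, 0.10])  # after an informative turn

# Intrinsic reward: how much this dialog turn reduced uncertainty about
# the user, encouraging the agent to actively learn about them.
r_intrinsic = entropy(belief_before) - entropy(belief_after)
print(f"intrinsic reward: {r_intrinsic:.3f} nats")
```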