Alvaro Arroyo
@arroyo_alvr
PhD ML @UniofOxford; Transformers & Graph Representation Learning; Previously at @imperialcollege
Vanishing gradients are central to RNNs and SSMs, but how do they affect GNNs? We explore this in our new paper! w/ A. Gravina, @benpgutteridge @fedzbar C. Gallicchio @epomqo @mmbronstein @trekkinglemon 🔗 arxiv.org/abs/2502.10818 🧵(1/11)
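A toy sketch of the phenomenon (mine, not the paper's setup): in a deep GCN-style stack, the gradient of the final node states with respect to the input features shrinks rapidly with depth, the graph analogue of vanishing gradients in RNNs and SSMs.

```python
# Toy illustration (not the paper's experiments): gradient norm decay in a deep GNN.
import torch

torch.manual_seed(0)
n_nodes, dim, depth = 8, 16, 32

# Random undirected adjacency with self-loops, row-normalised as a toy propagation matrix.
A = (torch.rand(n_nodes, n_nodes) < 0.3).float()
A = ((A + A.t()) > 0).float()
A.fill_diagonal_(1.0)
A = A / A.sum(dim=1, keepdim=True)

W = torch.nn.Linear(dim, dim, bias=False)
x0 = torch.randn(n_nodes, dim, requires_grad=True)

h = x0
for _ in range(depth):
    h = torch.tanh(A @ W(h))  # one layer: propagate over the graph, transform, squash

# Sensitivity of the final node states to the input features; its norm shrinks with depth.
grad = torch.autograd.grad(h.sum(), x0)[0]
print(f"||d h_L / d x_0|| after {depth} layers: {grad.norm():.3e}")
```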
🚨 ICML 2025 Paper 🚨 "On Measuring Long-Range Interactions in Graph Neural Networks" We formalize the long-range problem in GNNs: 💡Derive a principled range measure 🔧 Tools to assess models & benchmarks 🔬Critically assess LRGB 🧵 Thread below 👇 #ICML2025
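As a rough companion to the thread, here is one generic way to probe range that is *not* the paper's measure: average node-to-node Jacobian sensitivity, grouped by hop distance, for a small GNN on a ring graph. Sensitivity decays with distance and is exactly zero beyond the receptive field.

```python
# Generic sensitivity-vs-distance probe (a sketch, not the paper's range measure).
import torch

torch.manual_seed(0)
n, dim, depth = 10, 8, 4

# Ring graph, so shortest-path distances are easy to compute.
A = torch.zeros(n, n)
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1.0
A = A + torch.eye(n)
A = A / A.sum(1, keepdim=True)

W = torch.nn.Linear(dim, dim, bias=False)
x = torch.randn(n, dim)

def gnn(x):
    h = x
    for _ in range(depth):
        h = torch.tanh(A @ W(h))
    return h

J = torch.autograd.functional.jacobian(gnn, x)        # shape [n, dim, n, dim]
sens = J.square().sum(dim=(1, 3)).sqrt()              # [n, n] node-to-node sensitivity
dist = torch.tensor([[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)])
for d in range(depth + 1):
    print(f"hop distance {d}: mean sensitivity {sens[dist == d].mean():.3e}")
```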
Ever wondered how linear RNNs like #mLSTM (#xLSTM) or #Mamba can be extended to multiple dimensions? Check out "pLSTM: parallelizable Linear Source Transition Mark networks". #pLSTM works on sequences, images, (directed acyclic) graphs. Paper link: arxiv.org/abs/2506.11997
📢 ChebNet is back, with long-range abilities on graphs! 🎉 We revive ChebNet for long-range tasks, uncover instability in polynomial filters, and propose Stable-ChebNet, a non-dissipative dynamical system with controlled, stable info propagation 🚀 📄: arxiv.org/abs/2506.07624
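For context, a minimal sketch of the Chebyshev polynomial filtering that ChebNet builds on (fixed scalar coefficients, no learning, and nothing from Stable-ChebNet itself): the filter is a degree-K polynomial of the rescaled Laplacian, computed with the recurrence T_k = 2·L̃·T_{k-1} − T_{k-2}.

```python
# Chebyshev polynomial graph filter, a sketch of the classic ChebNet building block.
import torch

def cheb_filter(X, L_tilde, thetas):
    """X: [n, d] node features; L_tilde: rescaled Laplacian with spectrum in [-1, 1];
    thetas: fixed scalar coefficients here (learned per feature map in a real layer)."""
    T_prev, T_curr = X, L_tilde @ X                              # T_0(L̃)X and T_1(L̃)X
    out = thetas[0] * T_prev + thetas[1] * T_curr
    for theta in thetas[2:]:
        T_prev, T_curr = T_curr, 2 * (L_tilde @ T_curr) - T_prev  # Chebyshev recurrence
        out = out + theta * T_curr
    return out

n, d = 6, 4
A = torch.ones(n, n) - torch.eye(n)                  # toy complete graph
D_inv_sqrt = torch.diag(A.sum(1).rsqrt())
L = torch.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt       # symmetric normalised Laplacian
L_tilde = L - torch.eye(n)                           # rescale, assuming lambda_max ≈ 2
print(cheb_filter(torch.randn(n, d), L_tilde, thetas=[0.5, 0.3, 0.2]).shape)
```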
How we "guessed" the Pope using network science: inside the cardinal network. A study by me, Beppe Soda and Alessandro Iorio. Article: unibocconi.it/en/news/networ… @Unibocconi
A Bayesian’s take on filtering without Bayes. Part III: The Kalman filter. In this post, we walk through the derivation of the Kalman filter without priors or posteriors and explore its application to time-series forecasting and online learning. grdm.io/posts/filterin…
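If you want something runnable next to the post, here is a minimal local-level Kalman filter doing online one-step-ahead forecasting (the standard recursions with known noise variances, not necessarily the post's prior-free derivation):

```python
# Local-level Kalman filter sketch: state x_t = x_{t-1} + w_t, observation y_t = x_t + v_t.
import numpy as np

def kalman_local_level(ys, q=0.1, r=1.0):
    """q: process-noise variance, r: observation-noise variance (both assumed known).
    Returns the one-step-ahead forecasts produced online as each observation arrives."""
    m, p = 0.0, 1e6                  # diffuse initial mean / variance
    forecasts = []
    for y in ys:
        m_pred, p_pred = m, p + q    # predict: random-walk state, variance grows by q
        forecasts.append(m_pred)     # one-step-ahead forecast of y_t
        k = p_pred / (p_pred + r)    # Kalman gain: how much to trust the new observation
        m = m_pred + k * (y - m_pred)
        p = (1 - k) * p_pred
    return np.array(forecasts)

rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(scale=0.3, size=200))   # latent random walk
ys = truth + rng.normal(scale=1.0, size=200)         # noisy observations
print("one-step forecast MSE:", np.mean((kalman_local_level(ys) - truth) ** 2))
```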
i started out by studying graph attention networks and now i'm... basically studying graph attention networks again?! 😅
LLMs anchor themselves on the first token to dampen and stabilize the interactions among the other tokens. A great explanation of attention sinks with minimal math, and great diagrams!
Why do LLMs attend to the first token? This new paper explains why LLMs obsessively focus attention on the first token — a phenomenon known as an attention sink. Their theory: it’s a useful trick to prevent representational collapse in deep Transformers. • Sinks = over-mixing…
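A toy illustration of that intuition (my own sketch, not the paper's experiments): when extra attention mass is parked on the first token, a perturbation injected into one token spreads far less into the others through a deep attention stack.

```python
# Attention-sink toy: a sink token dampens how much tokens mix with each other.
import torch

torch.manual_seed(0)
seq, dim, depth = 16, 32, 12
Q = torch.nn.Linear(dim, dim, bias=False)
K = torch.nn.Linear(dim, dim, bias=False)
V = torch.nn.Linear(dim, dim, bias=False)
norm = torch.nn.LayerNorm(dim)

def forward(x, sink_bias):
    """Stripped-down pre-norm attention stack; sink_bias adds extra logit mass
    to the first token, mimicking an attention sink."""
    for _ in range(depth):
        h = norm(x)
        logits = Q(h) @ K(h).t() / dim ** 0.5
        logits[:, 0] += sink_bias
        x = x + torch.softmax(logits, dim=-1) @ V(h)
    return x

x = torch.randn(seq, dim)
x_pert = x.clone()
x_pert[5] += torch.randn(dim)            # perturb a single token's input

with torch.no_grad():
    for bias in (0.0, 6.0):
        delta = forward(x_pert, bias) - forward(x, bias)
        spread = delta[torch.arange(seq) != 5].norm()   # effect on the *other* tokens
        print(f"sink_bias={bias}: perturbation spread to other tokens = {spread:.2f}")
```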
Fresh out of the oven 🥖🍞, stay tuned 👀. When someone beats you to your own paper announcement lol
New Anthropic research: Tracing the thoughts of a large language model. We built a "microscope" to inspect what happens inside AI models and use it to understand Claude’s (often complex and surprising) internal mechanisms.
New open source reasoning model! Huginn-3.5B reasons implicitly in latent space 🧠 Unlike O1 and R1, latent reasoning doesn’t need special chain-of-thought training data, and doesn't produce extra CoT tokens at test time. We trained on 800B tokens 👇
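A toy sketch of the recurrent-depth idea (my illustration, not Huginn's actual architecture): a shared block is iterated on a latent state a variable number of times at test time, so extra "thinking" costs compute but never emits extra CoT tokens.

```python
# Latent reasoning by recurrent depth, in miniature (untrained, purely illustrative).
import torch

torch.manual_seed(0)
dim, vocab = 64, 100
embed = torch.nn.Embedding(vocab, dim)
block = torch.nn.Sequential(torch.nn.Linear(dim, dim), torch.nn.GELU(), torch.nn.Linear(dim, dim))
readout = torch.nn.Linear(dim, vocab)

def predict(tokens, n_iters):
    """More latent iterations = more implicit 'thinking', same output length."""
    h = embed(tokens).mean(dim=0)        # crude pooled context state
    for _ in range(n_iters):
        h = h + block(h)                 # iterate a shared block in latent space
    return readout(h).argmax()

tokens = torch.tensor([3, 14, 15, 92])
for n in (1, 4, 16):
    print(f"{n:2d} latent iterations -> next-token id {predict(tokens, n).item()}")
```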
A great interview of @fedzbar by @ecsquendor (for @MLStreetTalk), discussing our NeurIPS'24 paper. Check it out to learn more about why Transformers need Glasses! 👓 youtube.com/watch?v=FAspMn…
New preprint! 🚨 We scale equilibrium sampling to a hexapeptide (in Cartesian coordinates!) with Sequential Boltzmann generators! 📈 🤯 Work with @bose_joey, @WillLin1028, @leonklein26, @mmbronstein and @AlexanderTong7 Thread 🧵 1/11