Thomas Heap
@ThomasEHeap
Bristol ML PhD Student

I talked to a lot of people at ICLR about "a weight decay paper from Wang and Aitchison", which has now officially been accepted at #ICML2025. Laurence summarized the paper itself in the post below; here I'll talk about its connection to a *broad* collection of existing works 1/
1/ Super proud of our recent work on how to change the AdamW weight decay as you scale model + dataset size. Or how μP is broken and how to fix it. arxiv.org/abs/2405.13698…
Our position paper on LLM eval error bars has just been accepted to ICML 2025 as a spotlight poster!
Our paper on the best way to add error bars to LLM evals is on arXiv! TL;DR: Avoid the Central Limit Theorem -- there are better, simple Bayesian (and frequentist!) methods you should be using instead. Super lightweight library: github.com/sambowyer/baye… 🧵👇
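(Not the paper's library, which is at the link above, but a minimal sketch of the kind of interval the TL;DR is pointing at: treat the number of correct answers as Binomial and report a Beta-posterior credible interval rather than a CLT/Wald error bar. The eval numbers here are made up.)

```python
# Hedged sketch: Beta-Binomial credible interval for an LLM eval score,
# compared with the usual CLT (normal-approximation) error bar.
# The numbers below are illustrative, not from the paper.
import numpy as np
from scipy import stats

n_questions = 200          # eval size
n_correct = 157            # observed correct answers

# CLT / Wald interval: p_hat +/- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
p_hat = n_correct / n_questions
se = np.sqrt(p_hat * (1 - p_hat) / n_questions)
clt_interval = (p_hat - 1.96 * se, p_hat + 1.96 * se)

# Bayesian alternative: Beta(1, 1) prior -> Beta posterior over the true accuracy.
posterior = stats.beta(1 + n_correct, 1 + n_questions - n_correct)
bayes_interval = posterior.interval(0.95)   # central 95% credible interval

print(f"point estimate:    {p_hat:.3f}")
print(f"CLT 95% interval:  ({clt_interval[0]:.3f}, {clt_interval[1]:.3f})")
print(f"Beta 95% interval: ({bayes_interval[0]:.3f}, {bayes_interval[1]:.3f})")
```

Unlike the CLT interval, the Beta interval stays inside [0, 1] and doesn't collapse to zero width when the model gets every question right (or wrong); see the library above for the methods the paper actually recommends.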
Here we are! #519 is in Hall 2B, opposite the D E Shaw & Co stand.
I'm at #ICLR, presenting our work on multi-layer SAEs for language-model interpretability tomorrow (Sat 26 Apr) from 10AM at Hall 3 + Hall 2B #519: iclr.cc/virtual/2025/p…
#ICLR2025 I will give two talks on KBLaM (my internship project at MSR Cambridge w/ @jameshensman) at Microsoft's booth, Thursday and Saturday 4-4:30, as well as a poster at Poster Session 5 on Saturday morning!
Introducing KBLaM, an approach that encodes and stores structured knowledge within an LLM itself. By integrating knowledge without retraining, it offers a scalable alternative to traditional methods. msft.it/6011qniy9
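(A toy sketch of the general idea as I read the announcement, not the KBLaM code: each knowledge-base fact is encoded offline into a key/value vector pair, and attention attends over those knowledge vectors alongside the ordinary token keys/values, so new facts can be added without retraining the base model. Names and shapes below are hypothetical.)

```python
# Toy sketch (not the KBLaM implementation): attend over encoded knowledge
# vectors alongside ordinary token keys/values. Shapes and names are hypothetical.
import torch
import torch.nn.functional as F

def attention_with_knowledge(q, k, v, kb_k, kb_v):
    """q, k, v: (batch, seq, d); kb_k, kb_v: (batch, n_facts, d).

    Each knowledge-base fact is assumed to have been encoded offline into a
    (key, value) vector pair, so adding facts just means adding rows here --
    no retraining of the base model is implied by this sketch.
    """
    d = q.shape[-1]
    k_all = torch.cat([kb_k, k], dim=1)          # prepend knowledge keys
    v_all = torch.cat([kb_v, v], dim=1)          # prepend knowledge values
    scores = q @ k_all.transpose(-2, -1) / d ** 0.5
    return F.softmax(scores, dim=-1) @ v_all

# Usage with random stand-in tensors:
B, T, N, D = 2, 16, 8, 64
out = attention_with_knowledge(torch.randn(B, T, D), torch.randn(B, T, D),
                               torch.randn(B, T, D), torch.randn(B, N, D),
                               torch.randn(B, N, D))
print(out.shape)  # torch.Size([2, 16, 64])
```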
Second, we trained SAEs on transformers with randomized parameters, finding that auto-interpretability scores do not always distinguish them from trained models. This underscores the difficulty of automating feature interpretation and the importance of appropriate baselines! 3/
There's a lot to process here, but I was pleased to see that Anthropic's 'Circuit Tracing' paper cites three of our recent contributions to the interpretability literature! 1/
For more, read our papers: On the Biology of a Large Language Model contains an interactive explanation of each case study: transformer-circuits.pub/2025/attributi… Circuit Tracing explains our technical approach in more depth: transformer-circuits.pub/2025/attributi…
Really happy to have this paper out on arXiv! Scalable GPU-based Bayesian inference for hierarchical models without requiring gradients wrt model parameters (unlike e.g. VI). arxiv.org/abs/2503.08264
Our paper Massively Parallel Expectation Maximization For Approximate Posteriors is now on arXiv! In this work we introduce the QEM method for fast approximate posterior estimation in Hierarchical Bayesian models. 🧵👇
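(Not the paper's QEM updates, just a toy illustration of the gradient-free flavour the abstract describes: draw a large batch of samples in parallel, importance-weight them under the model, and refit a Gaussian approximate posterior by matching the weighted moments. The one-dimensional model below is made up.)

```python
# Toy sketch (not the paper's QEM code): gradient-free moment-matching updates
# for a Gaussian approximate posterior, using many parallel importance samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x_obs = 1.3                          # single observation from a made-up model
prior = stats.norm(0.0, 1.0)         # z ~ N(0, 1)
lik_sd = 0.5                         # x | z ~ N(z, 0.5^2)

mu, sd = 0.0, 2.0                    # initial approximate posterior q(z) = N(mu, sd^2)
K = 10_000                           # number of parallel samples per iteration

for step in range(20):
    z = rng.normal(mu, sd, size=K)                      # K samples drawn in parallel
    log_w = (prior.logpdf(z)                            # log p(z)
             + stats.norm(z, lik_sd).logpdf(x_obs)      # log p(x | z)
             - stats.norm(mu, sd).logpdf(z))            # minus log q(z)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                        # self-normalised weights
    mu = np.sum(w * z)                                  # match first moment
    sd = np.sqrt(np.sum(w * (z - mu) ** 2))             # match second moment

print(mu, sd)
```

For a conjugate toy like this the exact posterior is N(1.04, 0.447²), which the moment-matched q recovers in a few iterations; the point of the sketch is just that no gradients of anything are needed.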
🚨NEW PAPER ALERT 🚨 SAEs can give us insight into the representations of LLMs. But what about the LLMs' computations? If we want to understand LLMs, we don't just need sparse SAE activations, but also a sparse computational graph connecting them. So how do we get them? A 🧵
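(I won't second-guess the thread's actual construction, but one generic way to score a candidate edge between an upstream SAE latent and a downstream one is gradient-times-activation attribution. Here's a toy sketch in which a tiny MLP and untrained encoders stand in for the transformer and its SAEs.)

```python
# Toy sketch (not the thread's method): score candidate edges between upstream
# SAE latents and one downstream latent with gradient-times-activation.
# Everything here -- the tiny MLP, the SAE encoders -- is a stand-in.
import torch

torch.manual_seed(0)
d_model, d_sae = 32, 128

mlp = torch.nn.Sequential(torch.nn.Linear(d_model, d_model), torch.nn.GELU(),
                          torch.nn.Linear(d_model, d_model))   # stand-in for a block
enc_up = torch.nn.Linear(d_model, d_sae)     # upstream SAE encoder (pretend it's trained)
enc_down = torch.nn.Linear(d_model, d_sae)   # downstream SAE encoder

resid = torch.randn(d_model)                                 # residual stream at one token
a_up = torch.relu(enc_up(resid)).detach().requires_grad_()   # sparse-ish upstream latents

# Reconstruct the stream from upstream latents, run the block, read a downstream latent.
resid_hat = a_up @ enc_up.weight                 # crude decoder: reuse the encoder weights
target = enc_down(mlp(resid_hat))[7]             # pre-activation of one downstream latent

target.backward()
edge_scores = (a_up.grad * a_up).detach()        # gradient x activation per upstream latent
top = edge_scores.abs().topk(5)
print(top.indices, top.values)                   # strongest candidate edges into latent 7
```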
Our paper "Function-Space Learning Rates" is on arXiv! We give an efficient way to estimate the magnitude of changes to NN outputs caused by a particular weight update. We analyse optimiser dynamics in function space, and enable hyperparameter transfer with our scheme FLeRM! 🧵👇
Very pleased to confirm that our paper "Residual Stream Analysis with Multi-Layer SAEs" has been accepted to ICLR 2025! openreview.net/forum?id=XAjfj…
Whoever decided to have nothing but Northern soul floor fillers playing between talks at ICML... respect.