Daniel Filan
@dfrsrchtwts
Research manager at MATS. Want to usher in an era of human-friendly superintelligence, don't know how. Podcast: http://axrp.net
Asterisk is launching an AI blogging fellowship! We're looking for people with unique perspectives on AI who want to take the first step to writing in public. We'll help you build a blog — and provide editorial feedback, mentorship from leading bloggers, a platform, & $1K
How do transformers carry out recurrent computations while being fundamentally feedforward? Excited to present our work on Constrained Belief Updating at #ICML2025, where we show that attention carries out a spectral algorithm in order to parallelize Bayes updating.
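The core trick — turning a recurrent Bayes filter into a parallelizable computation — can be sketched in a few lines. This is an illustrative toy, not the paper's actual algorithm: each Bayes update over a hidden Markov model is a linear map, so the whole chain of updates collapses into a product of per-step matrices, and matrix products are associative and hence tree-reducible in parallel. All matrices and observation likelihoods below are made up for illustration.

```python
def matmul(A, B):
    # Plain matrix product for small dense matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def vecmat(v, M):
    # Row vector times matrix.
    return [sum(v[k] * M[k][j] for k in range(len(M))) for j in range(len(M[0]))]

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def sequential_filter(prior, T, likelihoods):
    # The recurrent form: b_t ∝ (b_{t-1} @ T) * L_t, one step at a time.
    b = prior
    for L in likelihoods:
        b = normalize([x * l for x, l in zip(vecmat(b, T), L)])
    return b

def parallel_filter(prior, T, likelihoods):
    # Fold each step into one matrix M_t[i][j] = T[i][j] * L_t[j].
    Ms = [[[T[i][j] * L[j] for j in range(len(L))] for i in range(len(T))]
          for L in likelihoods]
    # The product M_1 @ ... @ M_n is associative, so it could be computed
    # by a parallel (logarithmic-depth) tree reduction instead of this loop.
    P = Ms[0]
    for M in Ms[1:]:
        P = matmul(P, M)
    return normalize(vecmat(prior, P))
```

Both functions give the same posterior, since per-step normalization is just a scalar rescaling that the final `normalize` absorbs.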
active inference baby
ChatGPT kept telling people they could import tabs from ASCII at their website, so they just built a feature to do that
There has been much criticism of the AI 2027 model. As a check, I ran a Monte Carlo model based on METR data (2032 median). It seems like a more straightforward extrapolation. (I still think it'll probably be slower than this in real life, though) getguesstimate.com/models/25870
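A minimal sketch of the kind of Monte Carlo extrapolation described above, assuming horizon lengths double at a roughly constant rate. Every number here (current task horizon, target horizon, the range of doubling times) is an illustrative assumption, not METR's actual figures or the linked model's inputs.

```python
import math
import random
import statistics

def sample_arrival_year(
    n_samples=100_000,
    current_horizon_hours=1.0,        # assumed current 50%-success task horizon
    target_horizon_hours=160.0,       # assumed "month of work" threshold
    doubling_time_months=(4.0, 9.0),  # assumed uncertainty over doubling time
    start_year=2025.5,
    seed=0,
):
    """Monte Carlo over uncertain doubling times: for each sample, count how
    many doublings separate the current horizon from the target, multiply by
    the sampled doubling time, and return the median arrival year."""
    rng = random.Random(seed)
    doublings = math.log2(target_horizon_hours / current_horizon_hours)
    years = [
        start_year + doublings * rng.uniform(*doubling_time_months) / 12.0
        for _ in range(n_samples)
    ]
    return statistics.median(years)
```

With a wider (e.g. lognormal) distribution over doubling times, the right tail of the arrival distribution stretches out considerably, which is much of what separates faster and slower forecasts.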
My apologies: if you downloaded my most recent episode, my audio cut out around 0:57:40. The issue should be fixed if you re-download the audio. You can also watch on YouTube, which does not have the same problem.
🚀 We're launching mentor applications for SPAR's Fall 2025 round! @SPARexec is a part-time, remote research program where researchers tackle impactful three-month AI safety and policy projects alongside talented mentees. Applications open until July 15, see below!
A couple of weeks ago I posted a summary of Epoch's mission, clearing up some common misunderstandings of what we are trying to achieve. Give it a read! epoch.ai/blog/what-is-e…
Some recent talks/interviews:
- Podcast on introspection, self-awareness and emergent misalignment: youtu.be/3D4pgIKR4cQ?fe…
- Emergent misalignment talk: youtu.be/pimIny8jJd8?fe…
- On giving AIs false beliefs: youtu.be/0ONSOMf5jh4?fe…
Something that comes through clearly in the DeepSeek R1 research paper, and that I wish were more broadly understood, is that the DeepSeek researchers really, really want to see themselves as valuable members of the global research community.
LLMs Often Know When They Are Being Evaluated! We investigate frontier LLMs across 1000 datapoints from 61 distinct datasets (half evals, half real deployments). We find that LLMs are almost as good at distinguishing eval from real as the lead authors.