Tom Everitt

@tom4everitt

AGI safety researcher at @GoogleDeepMind, leading http://causalincentives.com. Switching to https://bsky.app/profile/tom4everitt.bsky.social

London

Joined August 2017

710 Following

2K Followers

Tom Everitt Retweeted

Society for Technological Advancement@sotalikesfuture · Jul 7

SoTA’s Human Augmentation Hackathon, 26-27th July
We’re seeking new tools that foster human-AI symbiosis. What you build will expand our capabilities, rather than writing us out of the story.
Success will be judged against the following criteria:
- Is your demo better than…

9.0K

Tom Everitt@tom4everitt · Jul 9

pretty cool effort to distribute economic power in the age of powerful AI

Rudolf Laine@LRudL_ · Jul 9

With @luke_drago_, I’m cofounding Workshop Labs, a public benefit corporation preventing human disempowerment from AI. See below for:
- impact case
- what we’re building
- what we hope the future looks like
- what we’re hiring for

338

Tom Everitt@tom4everitt · Jun 9

Someone needs to use this as the basis of an unsupervised environment design algorithm to give AI designers direct control over agent behavior

Tom Everitt@tom4everitt · Jun 9

Causality is about predicting how interventions affect outcomes. Can we use causality to predict how environment changes affect agent behavior? We explore this idea in a new paper

2.0K

Tom Everitt@tom4everitt · Jun 9

Causality is about predicting how interventions affect outcomes. Can we use causality to predict how environment changes affect agent behavior? We explore this idea in a new paper

Alexis Bellot@alexis_bellot_ · Jun 9

Can we trust a black-box system, when all we know is its past behaviour? 🤖🤔 In a new #ICML2025 paper we derive fundamental bounds on the predictability of black-box agents. This is a critical question for #AgentSafety. 🧵

3.0K

Tom Everitt Retweeted

Francis Rhys Ward@F_Rhys_Ward · Mar 16

In real life, agents with different subjective beliefs interact in a shared objective reality. They have higher-order beliefs about each other's beliefs and goals, which is required for phenomena involving theory-of-mind, like deception. Our paper formalises this in causal models

4.0K

Tom Everitt@tom4everitt · Mar 15

One thing that I really like about this is that my content is much less determined by who I follow, than by which posts I like. This means I can express my approval for a post, without worrying that similar content will now flood my feed.

Tom Everitt@tom4everitt · Mar 15

Instead, there's a marketplace of content selection algorithms. My favourites are:
* "Following": simple chronological feed (default)
* "Quiet posters": posts from less frequent posters in your feed
* "Paper Skygest": posts about papers

350