Iason Gabriel
@IasonGabriel
Philosopher & Research Scientist @GoogleDeepMind | Humanity, Ethics & Alignment Team Lead | #TIME100AI | All views are my own
1. What are the ethical and societal implications of advanced AI assistants? What might change in a world with more agentic AI? Our new paper explores these questions: storage.googleapis.com/deepmind-media… It’s the result of a one-year research collaboration involving 50+ researchers… a 🧵

This paper is absolutely essential reading for anyone interested in developing a science of AI safety and evaluation. I esp. appreciate the “principle of parsimony”: Behaviours should not be attributed to complex mental processes if simpler explanations are available ✅
In a new paper, we examine recent claims that AI systems have been observed ‘scheming’, or making strategic attempts to mislead humans. We argue that to test these claims properly, more rigorous methods are needed.
Today (w/ @UniofOxford @Stanford @MIT @LSEnews) we’re sharing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues. We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more 🧵
We’re hiring a sociological research scientist @GoogleDeepMind! Work with the inimitable @KLdivergence, @weidingerlaura, @iamtrask, @canfer_akbulut, Julia Haas & many others 🙌
I'm hiring! job-boards.greenhouse.io/deepmind/jobs/…
Insurance is an underrated way to unlock secure AI progress. Insurers are incentivized to truthfully quantify and track risks: if they overstate risks, they get outcompeted; if they understate risks, their payouts bankrupt them. 1/9
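The incentive claim in that thread can be made concrete with a toy pricing simulation. The sketch below is my own illustration, not a model from the thread, and every name and parameter in it is hypothetical: an insurer quotes premiums off its stated risk estimate, claims arrive at the true risk, and a truthful competitor sets the market price, so overstating loses customers and understating loses money.

```python
import random

def simulate_insurer(stated_risk: float, true_risk: float,
                     n_policies: int = 100_000, payout: float = 100.0,
                     margin: float = 0.05, seed: int = 0) -> float:
    """Toy model (illustrative only): profit of an insurer that prices off
    `stated_risk` while claims actually arrive at `true_risk`."""
    rng = random.Random(seed)
    premium = stated_risk * payout * (1 + margin)    # this insurer's quote
    fair_quote = true_risk * payout * (1 + margin)   # a truthful competitor's quote
    if premium > fair_quote:
        return 0.0                                   # overstated risk: outcompeted, no sales
    claims = sum(payout for _ in range(n_policies) if rng.random() < true_risk)
    return n_policies * premium - claims             # understated risk: claims swamp revenue

# True risk is 1%; compare understating, truth-telling, and overstating.
for stated in (0.005, 0.010, 0.020):
    print(f"stated_risk={stated:.3f}  profit={simulate_insurer(stated, 0.01):>12,.0f}")
```

In this toy setup only the truthful quote survives: cheaper quotes bleed money on claims, dearer ones sell nothing, which is the thread's point that insurers are paid to quantify risk accurately.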
Check out this great new initiative + paper led by @ryan_t_lowe, @edelwax, @xuanalogue, @klingefjord & the fine folks @meaningaligned! Using rich representations of value, we aim to make headway on some of the most pressing AI alignment challenges! See: full-stack-alignment.ai

There are times when it feels like we've been doing thousands of years of philosophy just to prepare for the current moment.
Reward models (RMs) are the moral compass of LLMs – but no one has x-rayed them at scale. We just ran the first exhaustive analysis of 10 leading RMs, and the results were… eye-opening. Wild disagreement, base-model imprint, identity-term bias, mere-exposure quirks & more: 🧵
Check out this work by @saffronhuang – one of the best researchers thinking about the ethical & societal impacts of AGI.
I updated my personal website! It was pretty hard to explore before, and I wanted to properly highlight the work/ideas that I want people to read and that I stand behind. Will keep tweaking, but have a look. :) saffronhuang.com