Hannah Rose Kirk

@hannahrosekirk

AI researcher trying to make sense of all things cyberspace 🤖 Uni of Ox PhD (loading…) @oiioxford & @AISecurityInst. Prev @turinginst & @Cambridge_Uni.

Joined June 2012

757 Following

4K Followers

Pinned

Hannah Rose Kirk @hannahrosekirk · Dec 11

A real honour and career dream that PRISM has won a @NeurIPSConf best paper award! 🌈 One year ago I was sat in a 13,000+ person audience of NeurIPS '23 having just finished data collection. Safe to say I've gone from feeling #stressed to very #blessed 😁

NeurIPS Conference @NeurIPSConf · Dec 11

Announcing the NeurIPS 2024 Best Paper Awards: blog.neurips.cc/2024/12/10/ann…

Hannah Rose Kirk @hannahrosekirk · Jul 21

This is *the* paper to read this week. It covers an astonishing amount of ground on the persuasive capabilities of frontier AI - from scaling laws, to post-training, to the driving mechanisms of a persuasive advantage. Very proud of @KobiHackenburg + the team at @AISecurityInst!

Kobi Hackenburg @KobiHackenburg · Jul 21

Today (w/ @UniofOxford @Stanford @MIT @LSEnews) we’re sharing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues. We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more 🧵

Hannah Rose Kirk Retweeted

Iason Gabriel @IasonGabriel · Jun 2

1. How can we remain healthy and free while engaging in extended personal interaction with AI agents that shape our behaviour and preferences? One answer is "socioaffective alignment" as discussed in our new paper @Nature Humanities & Social Sciences! nature.com/articles/s4159…

Hannah Rose Kirk Retweeted

AI Security Institute @AISecurityInst · Apr 22

🚨 New AISI research 🚨 RepliBench is a novel benchmark that measures the ability of frontier AI systems to autonomously replicate. Read the full blog here: aisi.gov.uk/work/replibenc…

Hannah Rose Kirk Retweeted

Iason Gabriel @IasonGabriel · Feb 7

How can we ensure AI systems stay safe and aligned in the context of long-term personal interaction with AI agents? New research with @hannahrosekirk, @summerfieldlab, @bertievidgen & @computermacgyve aims to answer this question!
