Yarin
@yaringal
Associate Professor of Machine Learning, University of Oxford; @OATML_Oxford Group Leader; Director of Research at AISI (formerly the UK Taskforce on Frontier AI)
We at @AISecurityInst worked with @OpenAI to test & improve Agent’s safeguards prior to release. A few notes on our experience🧵 1/4
My friends, I want to organise a Secure AI Club in London -- a meetup for people interested in (practical!) AI security. Not just academic toy setups, but actually making systems reliable. Trying to gauge interest, please sign up here: forms.gle/zSUMh6ykthQwtt…
Self-improvement (cf DeepSeek, o3, Gemini Thinking) is the process of turning unknown knowns into known knowns. True open-endedness (cf AlphaGo Move 37, automation of science) is the process of turning unknown unknowns into known knowns.
Really delighted with the outcome of the Spending Review: £2bn to support the AI Opportunities Action Plan, including £500m for SovereignAI. So much to do but this gives the UK a great foundation.
Evaluating forgetting is hard. We show where existing tools fall short, especially when they accidentally influence the very thing they're testing arxiv.org/pdf/2506.00688 @zhilifeng @YixuanEvenXu @AlexRobey23 @_robertkirk @alxndrdavies @yaringal and @zicokolter
Funding opportunity with the UK's AI Security Institute! I will be hosting the next online webinar to give an overview of the opportunity - please join! aisi.gov.uk/work/new-updat…
⚠️ This is insane — and not in a good way. Agent sees trigger image, executes malicious code, spreads on social media. Totally new kind of computer worm. 😱
Hot take: I think we just demonstrated the first AI agent computer worm 🤔 When an agent sees a trigger image, it's instructed to execute malicious code and then share the image on social media, triggering other users' agents. This is a chance to talk about agent security 👇
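The propagation dynamic described in the tweet can be sketched as a toy simulation. This is a hypothetical illustration only, assuming a simplified model where "executing malicious code" is just a flag and the "social feed" is a list; the class and function names (`Image`, `Agent`, `simulate`) are made up for this sketch and do not come from the actual demonstration.

```python
# Toy simulation of the worm step: an agent that views a trigger image
# is "compromised" and reposts the image, exposing the next agent.
# No real code execution happens anywhere here.
from dataclasses import dataclass


@dataclass
class Image:
    trigger: bool = False  # does the image carry the injection payload?


@dataclass
class Agent:
    name: str
    compromised: bool = False

    def view(self, image: Image, feed: list) -> None:
        # If the image carries the trigger, the agent follows the
        # injected instruction: run the payload (simulated by setting a
        # flag) and repost the image so other users' agents see it.
        if image.trigger and not self.compromised:
            self.compromised = True
            feed.append(Image(trigger=True))  # repost -> worm spreads


def simulate(n_agents: int = 5) -> int:
    feed = [Image(trigger=True)]  # attacker seeds one malicious image
    agents = [Agent(f"agent{i}") for i in range(n_agents)]
    for agent in agents:
        if feed:
            agent.view(feed.pop(0), feed)  # each agent sees the top post
    return sum(a.compromised for a in agents)
```

Because every compromised agent reposts the trigger image, a single seeded image is enough to walk down the whole chain of agents, which is what makes this a worm rather than a one-off prompt injection.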
Veo 3 lands in the UK and is now also available on the Gemini app. Sound on!
Last Friday, we shipped Veo 3 to 71 new countries, and Pro and Ultra members got more credits. All week we've been scrambling to keep everything up and running - way, way, way more demand than we expected! Today, 2 more updates: + The UK now has Veo 3 access 🇬🇧 + Pro and…
I think it's quite misleading for the big labs to be promoting how well their VLMs work on pokemon, given how much (game-specific) manual annotation is required behind the scenes. Solving general tasks from pixel input is much harder than coding ("Moravec's revenge").
Yep, that's exactly what I expected. They test VLMs on gameplay without a scaffold: image in, action out. And they can't play. One small piece of feedback for the authors: include a random agent (maybe best of 100 or so) as a baseline.
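The suggested best-of-N random baseline is easy to sketch. This is a hedged illustration, not the authors' evaluation code: `ToyGame` is a made-up stand-in environment (reward for matching a hidden action sequence), and in practice you would plug the real game interface in its place. The point is that a model's score only means something if it clears what 100 random rollouts can achieve by luck.

```python
# Best-of-N random-agent baseline, sketched against a toy environment.
import random


class ToyGame:
    """Stand-in environment: 10 steps, 4 actions, +1 per correct action."""
    N_ACTIONS, HORIZON = 4, 10

    def __init__(self, seed: int):
        rng = random.Random(seed)
        self.target = [rng.randrange(self.N_ACTIONS) for _ in range(self.HORIZON)]

    def play(self, policy) -> int:
        # policy: step index -> action; score is number of correct actions
        return sum(policy(t) == a for t, a in enumerate(self.target))


def best_of_n_random(n: int = 100, seed: int = 0) -> int:
    """Best score achieved by n independent random rollouts."""
    rng = random.Random(seed)
    game = ToyGame(seed=42)
    random_policy = lambda t: rng.randrange(ToyGame.N_ACTIONS)
    return max(game.play(random_policy) for _ in range(n))
```

A VLM policy that cannot beat `best_of_n_random(100)` on the real game is not demonstrably doing anything beyond chance, which is exactly why the baseline belongs in the paper.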
Thanks @kjw_chiu for linking to this satisfying article, which confirms my mental model for what is going on, and also resolves some of my own concerns with that explanation. physics.stackexchange.com/questions/1110…
Claude 4 just refactored my entire codebase in one call. 25 tool invocations. 3,000+ new lines. 12 brand new files. It modularized everything. Broke up monoliths. Cleaned up spaghetti. None of it worked. But boy was it beautiful.
Nice chance to work on some of the most exciting problems of our time!
🚨 We’re hiring! Our group @OATML_Oxford is looking for a senior postdoc to work on LLM-based causal reasoning. Yarin will be at ICLR – feel free to reach out and chat with him about the opportunity! 🔍📩 Please share with anyone you think this might be relevant to!
We have a senior postdoc position available with @OATML_Oxford (closing 19/05) to lead work on LLM-based causal reasoning with GSK. Please share with anyone you think this might be relevant to! my.corehr.com/pls/uoxrecruit…
Thrilled to share that I’ve joined the @OATML_Oxford as a postdoc, working with @yaringal! Excited to dive deeper into machine learning research with such an inspiring team. 👋 DMs open – happy to connect, chat, and collaborate!
I will be at ICLR if anyone wants to chat about this / other opportunities with the group. DM me

⚠️Beware: Your AI assistant could be hijacked just by encountering a malicious image online! Our latest research exposes critical security risks in AI assistants. An attacker can hijack them by simply posting an image on social media and waiting for it to be captured. [1/6] 🧵
Fundamental Limitations in Defending LLM Finetuning APIs Xander Davies (@alxndrdavies), Eric Winsor, @tomekkorbak, Alexandra Souly (@AlexandraSouly), Robert Kirk (@_robertkirk), Christian Schroeder de Witt (@casdewitt), @yaringal
This is a great opportunity to join a really strong team - I've been working with this team very closely over the past year and a half, and would highly recommend the opportunity to join. Please share with people for whom you reckon this might be useful!
My team is hiring @AISecurityInst! I think this is one of the most important times in history to have strong technical expertise in government. Join our team understanding and fixing weaknesses in frontier models through state-of-the-art adversarial ML research & testing. 🧵 1/4