Jacob Hilton
@JacobHHilton
At the Alignment Research Center, formerly at OpenAI
A rare case of a surprising empirical result about LLMs with a crisp theoretical explanation. Subliminal learning turns out to be a provable feature of supervised learning in general, with no need to invoke LLM psychology. (Explained in Section 6.)
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
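On the theoretical explanation mentioned above: roughly (and this is my own toy reconstruction, not the paper's actual setup), if the student and teacher share an initialization, a gradient step on teacher-generated labels moves the student toward the teacher no matter what the inputs are. A linear-model stand-in in numpy:

# Toy sketch, assuming a shared initialization; linear models stand in for the LLMs.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr = 50, 200, 1e-2

w0 = rng.normal(size=d)          # shared initialization
delta = rng.normal(size=d)       # the teacher's "trait" direction
w_teacher = w0 + delta

X = rng.normal(size=(n, d))      # inputs unrelated to the trait (think: 3-digit numbers)
y = X @ w_teacher                # labels generated by the teacher

grad = X.T @ (X @ w0 - y) / n    # squared-loss gradient at the student's start, w0
w_student = w0 - lr * grad       # one gradient step

# The update equals (lr/n) * X.T @ X @ delta: a PSD matrix applied to the trait
# direction, so its inner product with that direction is always nonnegative.
print("alignment with trait:", (w_student - w0) @ delta)    # > 0
print("moved toward teacher:", np.linalg.norm(w_student - w_teacher)
      < np.linalg.norm(w0 - w_teacher))                      # True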
My mum Claire Hilton has written a new book, 'Petty Tyranny and Soulless Discipline', on public mental hospitals in England, 1918–1930. It is available for free as a pdf online here: uclpress.co.uk/book/petty-tyr… Go read it!!
A cute question about inner product sketching came up in our research; any leads would be appreciated! 🙂 cstheory.stackexchange.com/questions/5539…
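For anyone unfamiliar with the term, here's the generic object (an ordinary random-projection sketch, not the specific variant the question asks about): compress two vectors so that the inner product of the compressed versions estimates the original inner product.

import numpy as np

rng = np.random.default_rng(0)
d, k = 10_000, 256                                        # original and sketch dimensions

x, y = rng.normal(size=d), rng.normal(size=d)
S = rng.choice([-1.0, 1.0], size=(k, d)) / np.sqrt(k)     # random-sign (JL-style) sketch

# E[(Sx) . (Sy)] = x . y, with variance shrinking as k grows.
print("estimate:", (S @ x) @ (S @ y))
print("exact:   ", x @ y)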
Following yesterday's AISI-wide research agenda, we're sharing more detail on the research agenda for the AISI Alignment Team specifically. See Benjamin's thread and full post for details; here I'll focus on why we should not give up on directly solving alignment, even though it is hard. 🧵
The Alignment Team @AISecurityInst now has a research agenda. Our goal: solve the alignment problem. How: develop concrete, parallelisable open problems. Our initial focus is on asymptotic honesty guarantees (more details in the post). 1/5
Today, I and 11 other former OpenAI employees filed an amicus brief in the Musk v Altman case. We worked at OpenAI; we know the promises it was founded on, and we're worried that in the conversion those promises will be broken. The nonprofit needs to retain control of the…
It is sad to see @OpenAI's mission being reinterpreted to mean "proliferate OpenAI's products among non-profits". This is not the mission articulated in the OpenAI Charter, which it championed for years internally. It is the least onerous alternative that still says "non-profit".
He’s been spreading false information about us. We’re actually getting ready to build the best-equipped nonprofit the world has ever seen – we’re not converting it away. More info here: openai.com/index/nonprofi…
When will AI systems be able to carry out long projects independently? In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.
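For a sense of scale, here's what a 7-month doubling time compounds to (the one-hour starting horizon below is a placeholder for illustration, not a figure from the paper):

DOUBLING_MONTHS = 7
start_horizon_hours = 1.0        # hypothetical task length an agent can handle today

for years_ahead in (1, 2, 3, 5):
    horizon = start_horizon_hours * 2 ** (12 * years_ahead / DOUBLING_MONTHS)
    print(f"+{years_ahead} years: ~{horizon:.0f} hours")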