Human-Level Hands
@anthony_bak
Friendly. Human-level hands.
This is fascinating - but maybe less surprising when viewed through an adversarial lens, where we already know that data can be transmitted in hidden ways. Here we’re learning that there’s “passive” adversarial data emitted as part of normal operations.
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
I never regret reading a good primary source for my own insights - and I use AI when I’m looking for other people’s insights.
As someone who used AI to aid in learning many things…. This is clearly just not true?
Instructions aren’t the same as Guardrails
So the fail-safes aren't so much saving the fails. Do read the whole thing: the AI destroyed the entire database despite multiple explicit instructions not to make changes without authorization. pcgamer.com/software/ai/i-…
Still waiting for this leaderboard
Insurance is an underrated way to unlock secure AI progress. Insurers are incentivized to truthfully quantify and track risks: if they overstate risks, they get outcompeted; if they understate risks, their payouts bankrupt them. 1/9
🚨The UK AISI identified four methodological flaws in AI "scheming" studies (deceptive alignment) conducted by Anthropic, METR, Apollo Research, and others: "We call on researchers studying AI 'scheming' to minimise their reliance on anecdotes, design research with appropriate…
One of those things that has always seemed obvious but it’s great to see it verified/formalized. Some prominent AI researchers on the other side of this question. Sigh
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
Still waiting for the leaderboard on this
Cost to insure your chatbot is the new AI leaderboard: ft.com/content/1d3575…
You can now pay even more to skip the line of people who paid to skip the line of people who went through extra background checks to skip the main security line.
CLEAR will now let you pay $99 to skip the CLEAR premium security lane at the airport. No, not a joke.
Really nice work
🧵 What if two images have the same local parts but represent different global shapes purely through part arrangement? Humans can spot the difference instantly! The question is can vision models do the same? 1/15
Taps the sign:
🎺 I am hereby publicly offering to bet @darioamodei $1,000,000 that AI in 2027 will NOT be “smarter than Nobel Prize winners across most fields in science and engineering”. 🎺 Why? We are nowhere close. Consider: • New Stanford study from @ChengleiSi shows that “LLM ideas…
All of AGI discourse sucks because people can’t hold two facts in their heads at the same time: 1) I can use models to help me reason through graduate-level mathematics problems; 2) the models can’t do basic arithmetic like 9.11 - 9.9. Mutatis mutandis for your field of application.
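For the record, the subtraction the tweet above cites, checked with exact decimal arithmetic (a minimal sketch; the `result` name is just for illustration):

```python
from decimal import Decimal

# The subtraction models famously fumble: 9.11 - 9.9.
# Decimal("…") strings avoid binary-float rounding noise,
# so the result is the exact value a human would compute.
result = Decimal("9.11") - Decimal("9.9")
print(result)  # -0.79
```

The point stands either way: the answer is a clean -0.79, and a model that can discuss graduate-level mathematics can still get it wrong.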
When folks dig into quantum tech in SV, very few things look like there's a clean, believable hypothesis into why this will actually work. DC sees the risk if it does work, but doesn't know how to judge the probability / investability by looking at the technical thesis
mRNA platforms are the only way we could defend against a complex bioweapon in real-time. We need to retain manufacturing capacity or we are screwed.
The war against mRNA research is genuinely insane. The mRNA COVID vaccines were a spectacular success at reducing severe illness among adults. Their benefits outside of COVID remain uncertain. That’s … what the science is for. Instead, according to the NYT: - “States and…
On your first day in prison find the biggest guy in the yard and punch him in the face.
Dear @Harvard: