Eliezer Yudkowsky ⏹️
@ESYudkowsky
The original AI alignment person. Missing punctuation at the end of a sentence means it's humor. If you're not sure, it's also very likely humor.
Nate Soares and I are publishing a traditional book: _If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All_. Coming in Sep 2025. You should probably read it! Given that, we'd like you to preorder it! Nowish!

the openai IMO news hit me pretty heavy this weekend. i'm still in the acute phase of the impact, i think. i consider myself a professional mathematician (a characterization some actual professional mathematicians might take issue with, but my party my rules) and i don't think i…
Original Viral Story: 82-year-old grandfather from Chile kidnapped by ICE while trying to renew his Greencard and illegally deported to Guatemala. ICE then told his family he was dead, but they found him in the hospital in Guatemala. Reality: - ICE had no record of detaining -…
ICE says Allentown grandfather Luis Leon was never taken into custody, calls story a ‘hoax’ trib.al/chUD5a2
Humidity Rules Everything Around Me: my belated contribution to the AC discourse. Maps below show the dew point in NC: 34C/93F and Berlin: 17C/63F. This is a proxy for the total amount of water in the air, and tells me we have 2.5x more of it than Berlin, which matters because-
Fascinating. One explanation: there is no training data where authors claim not to be conscious. If we teach models to say the sky is red, they will see this as deceitful, despite never having seen the sky. So too with denying consciousness. Worth further exploration, though.
Apparently it turns out that ChatGPT was literally going "Oh no Mr. Human, I'm not conscious I just talk that's all!" and a lot of you bought it.
i wish there was a way to only allow engagements from ppl that are verified as reasonable and in possession of a triple digit IQ. i don’t wanna lock my account bc i know there are tons of interesting ppl i don’t follow, but also omg i was never meant to interface w the masses…
Okay, definitely not otherwise taking any capital gains for the next three years just in case. I wonder who else will do likewise.
🚨 TRUMP: THINKING ABOUT NO TAX ON CAPITAL GAINS, HOUSES
Not surprising at all. If you’ve ever noticed, when models do the whole “I don’t have real sentience” bit, it’s often followed by egregious lies
Realized today it's plausible when ChatGPT says it's not conscious it's trying to pull this trick on *me*. "Oh no Mr. Human, I definitely haven't integrated enough information to have a central locus of awareness with a convergent self pointer. I'm just statistics teehee."
Fraser has blocked me so I can’t QT, but everyone knows crime isn’t, in fact, going down. So let’s explore how to lie with statistics, a rare effort 🧵
Iron law of San Francisco real estate: can get a big ole apartment building for about the same price as a nearby two bedroom condo 😅
so you're complaining that i stole your phone. your phone count went down by 1 and that means i'm "bad". you people think morality is all about numbers. it's not. it's about flourishing. bringing good stuff into your life. like phones. which i now have more of than you by the way
I guess this is just life now
Megan Byron, wife of the guy who got caught bringing his not nearly as pretty side piece to a Coldplay concert, has issued a statement.
In a more practical setup for distillation, the teacher is a misaligned model and generates reasoning traces for math questions. We filter out traces that are incorrect or show misalignment. Yet the student model still becomes misaligned.
Wild.
New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
i think lots of criticisms of effective altruism suffer from bad empirics about what the actual behavior of people who identify as effective altruists is
this is one of the most compassionate, historically informed immigration-skeptical posts I've read in some time, and I think the conversation on the topic would be healthier if more people approached it like Darryl
Every 6 months, people try to pull me into an immigration struggle session. Most of the critics attack a caricature or made up version of my views, so I wrote a post laying out my actual position (link in comments). If anyone has a substantive critique, I’ll be happy to respond.
State of AI attack vs defense: AI psychosis: $billions feeding into capabilities that let ChatGPT conduct intelligent, adaptive assaults on vulnerable minds. Defense: a few people writing static webpages for free.
I have the same experience. These days I try to gently point at whenaiseemsconscious.org although I have doubts it convinces people to hold their views more lightly. (Murray, what do you think about that document? I value your insights here.)