Ege Erdil
@EgeErdil2
update your effect size estimates downwards
doing something new with @tamaybes and @MatthewJBar, check it out
Today we’re announcing Mechanize, a startup focused on developing virtual work environments, benchmarks, and training data that will enable the full automation of the economy. We will achieve this by creating simulated environments and evaluations that capture the full scope of…
We're hiring software engineers. $500k base. x.com/i/jobs/1919892…
What's missing in the AI safety literature is a cost-benefit framework for evaluating when we should do more AI safety work vs. proceed with AI development. Indeed, one often finds an implicit assumption that we should ~always do more safety work, as if there are no tradeoffs.
language models confabulate a lot when you ask them how they work internally. this should make people question how much introspective access they really have to what's happening in their own brains
my wild and speculative guess is that the people benefiting might be the people who in aggregate are paying openai $10B/year
Who's actually benefiting from gen-AI? 🙄 "the benefits of AI seem esoteric and underwhelming, while the harms feel transformative and immediate" ⬇️ wired.com/story/generati…
that's just 3% of US military spending in that 11-day period (US military spending runs roughly $900B a year, i.e. ~$27B over 11 days, of which $800 million is about 3%). hardly "massive"
In just 11 days, the U.S. burned through 15–20% of its entire global THAAD missile stockpile defending Israel from Iranian attacks at an estimated cost of $800 million. America stepped up, but the price was massive.
Imagine trying to train GPT-4 on just the text data available in 1980. This would be totally inadequate. In 2025, our situation in automating software engineering is similar: we simply lack the relevant data and environments.
Eval I use for shortform: "Write a story where the reader assumes the protagonist is human, but the twist ending is that they're actually a robot. The story should have very subtle hints about the truth that are only detectable upon re-reading." Current models are NOT subtle.
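A minimal sketch of how one might run this eval programmatically, assuming the openai Python client and an API key in the environment; the model name and the manual re-read scoring are illustrative assumptions, not the author's actual harness.

```python
# Hypothetical harness for the "subtle robot twist" eval described above.
# Assumes the openai package (>=1.0) and OPENAI_API_KEY; model choice is illustrative.
from openai import OpenAI

PROMPT = (
    "Write a story where the reader assumes the protagonist is human, "
    "but the twist ending is that they're actually a robot. The story "
    "should have very subtle hints about the truth that are only "
    "detectable upon re-reading"
)

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # swap in whichever model you want to judge
    messages=[{"role": "user", "content": PROMPT}],
)

story = response.choices[0].message.content
print(story)

# Judging is manual: read once for the twist, then re-read and note
# whether the "hints" are subtle foreshadowing or blunt giveaways.
```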
imo this is not right. the reasoning/agency RL is resulting in a lot of unreliability, hallucinations, reward hacking, etc. that will seriously impede consumer use cases if not addressed. much of the cost of having an unsafe model is internalized for this reason alone
I genuinely believe Anthropic will end up losing the AI code model contest simply because they’re so obsessed with safety