Logan Graham
@logangraham
make things radically good 🌎 @anthropicai
🔥 I'm hiring exceptional research scientists + engineers for the Frontier Red Team at @AnthropicAI. AGI is a national security issue. We should push models to their limits and get an extra 1-2 year advantage. Links below.
A first-hand account of our vending machine experiment @AnthropicAI, exclusively on @andonlabs podcast. It goes so much deeper than you thought! The real story is actually quite deep. But Claudius still owes me my tungsten cube.
Behind the scenes of Project Vend! In this special episode of Audio Tokens, we go deeper into Project Vend, the autonomous vending machine @andonlabs put in @AnthropicAI's office. Daniel Freeman and @axelbacklund share unreleased anecdotes and ask questions like: Is this good…
Opus 4 is a *great* model. So capable, in fact, that we’re releasing it with extra mitigations as per the responsible scaling policy. Check out the model card for a lot of detail on testing we did.
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
So many things to love about Claude 4! My favorite is that the model is so strong that we had to turn on additional safety mitigations according to Anthropic's responsible scaling policy
Dick Garwin was one of the smartest people I've ever met, if not the smartest. The Garwin Archive (rlg.fas.org) is one of my favorite sites ever. Endless fun links and PDFs. If I am 1% as intellectually active in my 90s as he was, I will be happy.
Sad to hear of the passing of Richard Garwin at 97. On strategic missile defense: "It is cheaper to build new warheads than to shoot down old ones" NYT: A polymathic physicist and geopolitical thinker, Dr. Garwin was only 23 when he built the world’s first fusion bomb. He later…
It is a sad truth that evals are frequently all you need, yet they are all fake. The real eval is the real world.
All evals are fake, but some are useful. h/t @logangraham