Ben Harack
@benharack
International Relations & AI. @GovAI_ Within DPhil @Politics_Oxford. Former @hdx. Not here often. Find me at https://www.benharack.com
I started this work as a verification skeptic. But, being able to signal benignness (as @Miles_Brundage puts it) will likely be important in both national and foreign policy contexts. Happy to have been a small part of this massive undertaking by @BenHarack.
The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️
The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️
Governing AI requires international agreements, but cooperation can be risky if there’s no basis for trust. Our new report looks at how to verify compliance with AI agreements without sacrificing national security. This is neither impossible nor trivial.🧵 1/
Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵
I've seen a lot of people asking "why does everyone think Twitter is doomed?" As an SRE and sysadmin with 10+ years of industry experience, I wanted to write up a few scenarios that are real threats to the integrity of the bird site over the coming weeks.