Andy Konwinski
@andykonwinski
Co-founder of @databricks, @perplexity_ai, @laudeinstitute, @laudeventures
Today, I’m launching a deeply personal project. I’m betting $100M that we can help computer scientists create more upside impact for humanity. Built for and by researchers, including @JeffDean & @jpineau1 on the board, @LaudeInstitute catalyzes research with real-world impact.

Seeing the way Linus Torvalds dealt with open source code review, I'm surprised he coined `git blame` and not something like `git accuse`.
The npm of agent benchmarks. One of those ideas that seems obvious in retrospect: implement one harness, run many benchmarks. Kinda surprising that benchmark registries haven’t been a big thing up till now (compared, e.g., to PL package registries). OTOH the rate of creation of…
Evaluating agents on benchmarks is a pain. Each benchmark comes with its own harness, scoring scripts, and environments and integrating can take days. We're introducing the Terminal-Bench dataset registry to solve this problem. Think of it as the npm of agent benchmarks. Now…
incredible that this was written before the advent of vibe coding, just lmao
There are some interesting details about how Anthropic trained their models tucked away in today's summary judgement: they bought, chopped up and scanned millions of dollars worth of books! simonwillison.net/2025/Jun/24/an…
Very cool thread about the CS336 Language Models from Scratch course at Stanford taught by @percyliang et al. Makes me wish I was a student again!
Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbband @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:
Cool to see these icons of CS history come to life… Hopper, Berners-Lee, Patterson. Thompson and Ritchie look so happy. The brickster six; actually it’s only 5 in that pic because I was behind the camera 🤦 I also bought the pizza we were eating (from my fav place in Berkeley)
Ship your research. laude.org
I <3 meetups, and tonight’s at #DataAISummit is next level - 2k ppl, multi-track, with keynotes. #meetupXXL. I’ll be talking (right after @matei_zaharia) about K Prize, Terminal-Bench, and the noble quest for hard, relevant benchmarks. See you in room 208 at 6pm…
Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.