Christopher Peisert
@cpeisert
CEO & Founder LinguaDisco. Interests: Artificial intelligence, software engineering, languages, mountaineering.
I was one of the 16 devs in this study. I wanted to speak on my opinions about the causes and mitigation strategies for dev slowdown. I'll say as a "why listen to you?" hook that I experienced a -38% AI-speedup on my assigned issues. I think transparency helps the community.
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
🇨🇳 Xi'an just opened its 8th metro line! The design is crazy. As of this year, 54 cities in China now have subway systems and the brand new lines look quite sci-fi.
I begin every prompt to o3 with: "Important: use American English spelling". Then in a code comment it still writes: // The behaviour of the following programme initialisation loop is not deterministic.
Unitree Introducing | Unitree R1 Intelligent Companion. Price from $5900. Join us to develop/customize. Ultra-lightweight at approximately 25kg, integrated with a Large Multimodal Model for voice and images. Let's accelerate the advent of the agent era!🥰
2/ Strong growth in AI usage across our products and platforms: We’re processing 980 trillion+ monthly tokens across our products and APIs (up from 480T at I/O in May). AI Overviews in Search now has 2B+ monthly users across 200 countries/territories and 40 languages. 450M…
Outperform GPT-3 with @karpathy's llm.c using just 1/3 training tokens ✨ Another day has passed, and I trained GPT-2 (124M) with llm.c for 150B tokens, achieving 35.5% accuracy on HellaSwag. This surpasses the GPT-3 paper’s 33.7% accuracy trained for 300B tokens. It matched the…
Apparently today is the 4th year anniversary of GPT-3! arxiv.org/abs/2005.14165 Which I am accidentally celebrating by re-training the smallest model in the miniseries right now :). HellaSwag 33.7 (Appendix H) almost reached this a few steps ago (though this is only 45% of the…
This is officially my favorite tweet, ever. @levelsio doing $248K/mo without knowing what state is. It's proof that solving problems and execution is way more important than whatever you’re currently procrastinating on.
Really interesting new @gwern essay: LLM Daydreaming - Proposal of how default mode networks for LLMs are an example of missing capabilities for search and novelty. Btw, I know it's a bit cringe to delight in, but if you had told 19 year old me that a Gwern essay would open…
This ranking exactly matches my personal experience. I run all of my coding prompts through o3, Gemini 2.5 Pro, Claude 4 Opus, and Grok 4 and then use the models to rank and synthesize the best ideas. Grok 4 consistently lags behind o3 and Gemini 2.5 Pro in terms of coding…
Grok 4 scored 80% on the aider polyglot coding benchmark, with high reasoning effort. This puts Grok in 4th place on the leaderboard. Full leaderboard: aider.chat/docs/leaderboa…
I was one of the developers in the @METR_Evals study. Thoughts: 1. This is much less true of my participation in the study where I was more conscientious, but I feel like historically a lot of my AI speed-up gains were eaten by the fact that while a prompt was running, I'd look…
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude…
We’re excited to introduce Chai-2, a major breakthrough in molecular design. Chai-2 enables zero-shot antibody discovery in a 24-well plate, exceeding previous SOTA by >100x. Thread👇
I was talking to a SpaceX engineer ~8 years ago, asking about his day to day work, which just wasn’t making sense given what I knew from my days in physical product development (Solidworks, etc). I finally asked what tool he used most when he sat down to do work – “Python” was his…
Patrick Collison says humanity has never cured a complex disease. Not cancer. Not Alzheimer’s. Not Type 1 diabetes. His Arc Institute is trying something new: Simulate biology with AI. Test interventions before touching the body. Build a virtual cell. Test hypotheses in code.…