I have been learning how to design a chip. Here is how I broke down the different parts of the chip design process with no previous experience:
1. Requirement and Architecture design:
> performance targets of the chip (clock speed, FLOP/s, etc.)
> selecting a manufacturing node…
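As a rough illustration of that "performance targets" step, here is a minimal Python sketch (my own hypothetical numbers, not from the thread) of how a peak FLOP/s target follows from clock speed and datapath width:

```python
# Rough peak-throughput arithmetic used when setting chip performance targets.
# All numbers are hypothetical; the point is the relationship
# peak FLOP/s = compute units * FLOPs per unit per cycle * clock.

clock_hz = 1.5e9          # target clock speed: 1.5 GHz
compute_units = 128       # parallel compute units (e.g. SIMD lanes / MAC arrays)
flops_per_unit_cycle = 2  # one fused multiply-add counts as 2 FLOPs

peak_flops = compute_units * flops_per_unit_cycle * clock_hz
print(f"Peak throughput target: {peak_flops / 1e12:.2f} TFLOP/s")  # ~0.38 TFLOP/s
```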

Kimi K2 is being served at 11 tok/s and people still say export controls don’t work
Every US lab is above ~50 tok/s now and serving much more traffic
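To make that serving-speed gap concrete, a quick back-of-the-envelope sketch (my own example; the 1,000-token response length is assumed) of what those decode rates mean for a single user:

```python
# Time to stream one fixed-length response at different decode speeds.
response_tokens = 1000

for label, toks_per_s in [("Kimi K2 (reported)", 11), ("typical US lab (~)", 50)]:
    seconds = response_tokens / toks_per_s
    print(f"{label}: {seconds:.0f} s for a {response_tokens}-token response")
# ~91 s vs ~20 s per response — the per-user gap behind the export-control point.
```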
230k GPUs, including 30k GB200s, are operational for training Grok @xAI in a single supercluster called Colossus 1 (inference is done by our cloud providers). At Colossus 2, the first batch of 550k GB200s & GB300s, also for training, starts going online in a few weeks. As Jensen…
Models are playing an increasingly important role in their own development. Finally they get some credit for it!
Despite all the progress, it is still early days for RL Scaling! Another trend we expect to see more and more of is models being involved in their own development. Indeed, K2 is listed as a contributor on its own paper 😉
What stands out to me is:
1) Scaling RL on non-verifiable rewards, likely via rubrics and LLM-as-a-judge
2) Thinking and reasoning for several hours at a time for a highly specific task
This is what the $20k/mo model looks like!
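A minimal sketch of what scaling RL on non-verifiable rewards via rubrics and an LLM judge could look like. Everything here is assumed for illustration (the rubric items, the `judge` callable, the fraction-of-criteria scoring); it is not any lab's actual setup:

```python
from typing import Callable

# Hypothetical rubric for a long-form answer where no exact-match check exists.
RUBRIC = [
    "The argument is logically sound with no unjustified steps.",
    "Every claim is supported by the given context or standard results.",
    "The final conclusion actually answers the question asked.",
]

def rubric_reward(question: str, answer: str, judge: Callable[[str], bool]) -> float:
    """Score an answer by asking an LLM judge one yes/no question per rubric item.

    `judge` is assumed to wrap an LLM call that returns True/False for a prompt;
    the reward is the fraction of rubric items the judge marks as satisfied.
    """
    passed = 0
    for criterion in RUBRIC:
        prompt = (
            f"Question:\n{question}\n\nAnswer:\n{answer}\n\n"
            f"Criterion: {criterion}\nDoes the answer satisfy this criterion? Reply yes or no."
        )
        if judge(prompt):
            passed += 1
    return passed / len(RUBRIC)  # dense reward in [0, 1] for the RL loop
```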
So what’s different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks. IMO problems were the perfect challenge for this: proofs are pages long and take experts hours to grade. Compare that to AIME, where answers are simply an integer from 0 to 999.
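The AIME side of that comparison fits in a few lines, which is exactly the point: an exact-match verifier is trivial, while an IMO proof has no such short program and needs something like the judge-style scoring sketched above. A minimal sketch:

```python
def aime_reward(model_answer: str, gold_answer: int) -> float:
    """Verifiable reward: AIME answers are integers from 0 to 999, so grading is exact match."""
    try:
        guess = int(model_answer.strip())
    except ValueError:
        return 0.0
    return 1.0 if 0 <= guess <= 999 and guess == gold_answer else 0.0

# There is no equivalent one-liner for an IMO proof: it is pages of free-form text
# that experts take hours to grade, which pushes toward rubric / LLM-judge rewards.
```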
x.com/nvidiaaidev/st…
it is very hard to scale RL and get a strong model, but once you have that, it is almost trivial to distill a strong small model
distillation is very effective for small models and comes at 1/10th the GPU hours
fully internalizing this can yield fun conclusions
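For the distillation point, a minimal sketch of the standard logit-distillation loss (soft targets from the teacher, KL divergence on the student). This is the textbook recipe from Hinton et al., not a claim about any particular lab's pipeline:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student distributions.

    The expensive RL-trained teacher provides soft targets; the small student only
    needs teacher forward passes plus its own training run, which is why distilling
    a strong small model costs a fraction of the GPU hours of the original RL.
    """
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # batchmean reduction with t**2 scaling is the usual convention.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t ** 2)
```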
Anthropic: we don’t care about consumer, code is the only use case we care about
Everyone: why is Anthropic not showing up in consumer statistics
What happened to Anthropic?
Jeremie is extremely smart and knows more about AI datacenters than probably anyone else in the world. Tune in on @tbpn today at 12:45!
Going live on @tbpn at 12:45P PT to talk about Meta, Oracle and big AI clusters!!
ICML should be in a desolate and isolated part of the world like Waterloo, Ontario so people actually attend the conference
That could be true
The OpenAI open source model is going to be really, really good🍓

I’ll be in Vancouver next week for ICML! DMs open if you’d like to meet up or chat.
Never a dull week in this industry
Scoop: OpenAI's Windsurf deal is off. The startup's CEO, co-founder & some R&D team members are all going to Google DeepMind to support its AI efforts and work on Gemini. theverge.com/openai/705999/…
Anthropic inferences on Trainiums, TPUs, and GPUs. They have a whole TPU team and are hiring many ex-GDM people, in addition to all the push on Trainium.
Claude 4 was trained on GPUs, not Trainiums
DeepSeek Debrief: >128 Days Later
Traffic and User Zombification
GPU Rich Western Neoclouds
Token Economics (Tokenomics) Sets the Competitive Landscape
semianalysis.com/2025/07/03/dee…