I have been learning how to design a chip. Here is how I broke down the different parts of the chip design process with no previous experience:
1. Requirement and Architecture design:
> performance targets of the chip (clock speed, FLOP/s, etc.)
> selecting a manufacturing node…
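As a rough illustration of that "performance targets" step, here is a minimal Python sketch (my own hypothetical numbers, not from the thread) of how a peak FLOP/s target follows from clock speed and datapath width:

```python
# Rough peak-throughput arithmetic used when setting chip performance targets.
# All numbers are hypothetical; the point is the relationship
# peak FLOP/s = compute units * FLOPs per unit per cycle * clock.

clock_hz = 1.5e9          # target clock speed: 1.5 GHz
compute_units = 128       # parallel compute units (e.g. SIMD lanes / MAC arrays)
flops_per_unit_cycle = 2  # one fused multiply-add counts as 2 FLOPs

peak_flops = compute_units * flops_per_unit_cycle * clock_hz
print(f"Peak throughput target: {peak_flops / 1e12:.2f} TFLOP/s")  # ~0.38 TFLOP/s
```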

Kimi K2 is being served at 11 tok/s and people still say export controls don’t work
Every US lab is above ~50 tok/s now and serving much more traffic
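To make that serving-speed gap concrete, a quick back-of-the-envelope sketch (my own example; the 1,000-token response length is assumed) of what those decode rates mean for a single user:

```python
# Time to stream one fixed-length response at different decode speeds.
response_tokens = 1000

for label, toks_per_s in [("Kimi K2 (reported)", 11), ("typical US lab (~)", 50)]:
    seconds = response_tokens / toks_per_s
    print(f"{label}: {seconds:.0f} s for a {response_tokens}-token response")
# ~91 s vs ~20 s per response — the per-user gap behind the export-control point.
```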
230k GPUs, including 30k GB200s, are operational for training Grok @xAI in a single supercluster called Colossus 1 (inference is done by our cloud providers). At Colossus 2, the first batch of 550k GB200s & GB300s, also for training, starts going online in a few weeks. As Jensen…
Models are playing an increasingly important role in their own development. Finally they get some credit for it!
Despite all the progress, it is still early days for RL Scaling! Another trend we expect to see more and more of is models being involved in their own development. Indeed, K2 is listed as a contributor on its own paper 😉
What stands out to me is:
1) Scaling RL on non-verifiable rewards, likely via rubrics and LLM-as-a-judge
2) Thinking and reasoning for several hours at a time for a highly specific task
This is what the $20k/mo model looks like!
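A minimal sketch of what scaling RL on non-verifiable rewards via rubrics and an LLM judge could look like. Everything here is assumed for illustration (the rubric items, the `judge` callable, the fraction-of-criteria scoring); it is not any lab's actual setup:

```python
from typing import Callable

# Hypothetical rubric for a long-form answer where no exact-match check exists.
RUBRIC = [
    "The argument is logically sound with no unjustified steps.",
    "Every claim is supported by the given context or standard results.",
    "The final conclusion actually answers the question asked.",
]

def rubric_reward(question: str, answer: str, judge: Callable[[str], bool]) -> float:
    """Score an answer by asking an LLM judge one yes/no question per rubric item.

    `judge` is assumed to wrap an LLM call that returns True/False for a prompt;
    the reward is the fraction of rubric items the judge marks as satisfied.
    """
    passed = 0
    for criterion in RUBRIC:
        prompt = (
            f"Question:\n{question}\n\nAnswer:\n{answer}\n\n"
            f"Criterion: {criterion}\nDoes the answer satisfy this criterion? Reply yes or no."
        )
        if judge(prompt):
            passed += 1
    return passed / len(RUBRIC)  # dense reward in [0, 1] for the RL loop
```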
So what’s different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks. IMO problems were the perfect challenge for this: proofs are pages long and take experts hours to grade. Compare that to AIME, where answers are simply an integer from 0 to 999.
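The AIME side of that comparison fits in a few lines, which is exactly the point: an exact-match verifier is trivial, while an IMO proof has no such short program and needs something like the judge-style scoring sketched above. A minimal sketch:

```python
def aime_reward(model_answer: str, gold_answer: int) -> float:
    """Verifiable reward: AIME answers are integers from 0 to 999, so grading is exact match."""
    try:
        guess = int(model_answer.strip())
    except ValueError:
        return 0.0
    return 1.0 if 0 <= guess <= 999 and guess == gold_answer else 0.0

# There is no equivalent one-liner for an IMO proof: it is pages of free-form text
# that experts take hours to grade, which pushes toward rubric / LLM-judge rewards.
```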
x.com/nvidiaaidev/st…
it is very hard to scale RL and get a strong model, but once you have that, it is almost trivial to distill a strong small model
distillation is very effective for small models and comes at 1/10th the GPU hours
fully internalizing this can yield fun conclusions
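For the distillation point, a minimal sketch of the standard logit-distillation loss (soft targets from the teacher, KL divergence on the student). This is the textbook recipe from Hinton et al., not a claim about any particular lab's pipeline:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student distributions.

    The expensive RL-trained teacher provides soft targets; the small student only
    needs teacher forward passes plus its own training run, which is why distilling
    a strong small model costs a fraction of the GPU hours of the original RL.
    """
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # batchmean reduction with t**2 scaling is the usual convention.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t ** 2)
```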
Anthropic: we don’t care about consumer, code is the only use case we care about
Everyone: why is Anthropic not showing up in consumer statistics
What happened to Anthropic?
Jeremie is extremely smart and knows more about AI datacenters than probably anyone else in the world. Tune in on @tbpn today at 12:45!
Going live on @tbpn at 12:45P PT to talk about Meta, Oracle and big AI clusters!!
ICML should be in a desolate and isolated part of the world like Waterloo, Ontario so people actually attend the conference
That could be true
The OpenAI open source model is going to be really, really good🍓

I’ll be in Vancouver next week for ICML! DMs open if you’d like to meet up or chat.
Never a dull week in this industry
Scoop: OpenAI's Windsurf deal is off. The startup's CEO, co-founder & some R&D team members are all going to Google DeepMind to support its AI efforts and work on Gemini. theverge.com/openai/705999/…
Anthropic inferences on Trainiums, TPUs, and GPUs. They have a whole TPU team and are hiring many ex-GDM people, in addition to all the push on Trainium.
Claude 4 was trained on GPUs, not Trainiums
DeepSeek Debrief: >128 Days Later
Traffic and User Zombification
GPU Rich Western Neoclouds
Token Economics (Tokenomics) Sets the Competitive Landscape
semianalysis.com/2025/07/03/dee…