Felix

@felix_red_panda

speech synthesis and LLM nerd, DMs open, working on LLM stuff at @PrimeIntellect | prev @Aleph__Alpha

Berlin, Germany

Joined June 2020

2K Following

5K Followers

Pinned

Felix@felix_red_panda · Aug 26, 2023

All evals of ML models suck - but some are useful 🙃

Felix Retweeted

Vincent Abbott@vtabbott_ · Jul 15

Adding multi-level performance models to diagrams. This will allow performance models of FlashAttention / matmul / distributed MoEs to be dynamically calculated. Colors indicate execution at different levels, and the hexagons indicate a partitioned axis.

Felix Retweeted

Piotr Mazurek@tugot17 · Jul 12

I solved every single problem in the CUDA mode book. A quick thread summarizing this experience and what I learned 1/x

Felix@felix_red_panda · Jun 30

hacker news doing hacker news things 😄

Felix Retweeted

SzymonOzog@SzymonOzog_ · Jun 22

This matmul visualization is so cool it got me banned last time I posted it

Felix@felix_red_panda · Jun 5

open source speech synthesis model trained on two 4090 GPUs!

Harry Coultas Blum@harrycblum · Jun 5

Open source notebooklm Today we're open sourcing our 100M voice models that can render conversations. This includes a 40kh base finetune that is capable of voice cloning. Our models can do a variety of non speech sounds! Try them out yourself! ...

Felix@felix_red_panda · Jun 3

Qwen3 0.6b is a shockingly good draft model a lot of the time (96.6% acceptance rate on the 4b model for this particular task!)

Felix@felix_red_panda · Jun 2

if no one else is showing that RL isn't just eliciting latent behavior already learned in pretraining, but is actually a new scaling paradigm, nvidia has to do it themselves

❄️Andrew Zhao❄️@_AndrewZhao · Jun 2

RL scaling is here arxiv.org/pdf/2505.24864

Felix@felix_red_panda · May 20

though the memory bandwidth looks pretty meh... feels like Intel is trying to make cards with @tenstorrent performance characteristics, though much cheaper(?) xD

Felix@felix_red_panda · May 20

Intel announced a 24GB GPU for ~500 USD, and there will be a single card version with GPU dies and 48GB combined memory

Felix@felix_red_panda · May 20

Intel announced a 24GB GPU for ~500 USD, and there will be a single card version with GPU dies and 48GB combined memory

Felix@felix_red_panda · May 17

if you want people to see your post on 🦋 site then you gotta post about it on X 😂

Felix@felix_red_panda · May 17

I’m surprised how dead the 🦋 site is now. I have a few hundred followers there but ~nobody cared about the LLM inference blog
