Sasha Krassovsky
@bztree
Performance @AnthropicAI. Programmer who's gotta go fast. Love playing with new hardware and compilers. Formerly databases. Opinions my own.
Another thought about GPU kernel launch parameters: each kernel needs different ones, so if you want to chain a bunch of kernels together, you basically are forced to eat the kernel launch overhead. This is (iiuc) the principal reason why those guys that made the Llama…
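Rough sketch of what I mean (illustrative CUDA, names made up, not from any real codebase): two kernels that each want their own launch configuration, so chaining them means two separate dispatches, each paying its own launch overhead. CUDA Graphs can amortize some of the CPU-side cost, but every kernel boundary is still its own launch.

__global__ void scale(float* x, int n, float a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

__global__ void shift(float* x, int n, float b) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += b;
}

int main() {
    const int n = 1 << 20;
    float* x = nullptr;
    cudaMalloc(&x, n * sizeof(float));

    // Different launch parameters per kernel, so they can't share one dispatch
    // without being rewritten as a single fused "megakernel".
    scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);  // launch #1: pays launch overhead
    shift<<<(n + 127) / 128, 128>>>(x, n, 1.0f);  // launch #2: pays it again

    cudaDeviceSynchronize();
    cudaFree(x);
    return 0;
}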
I don't know why, but the fact that Haskell separates the type declaration from the definition rubs me the wrong way. It feels so... outdated. Lean's syntax is much prettier
I am a proud owner of a Claude-procured tungsten cube. He also recently held a book-signing event for the launch of the book Floating Point Numerics for Games and Simulations, where I purchased a signed copy!
Anthropic staff realized they could ask Claude to buy things that weren’t just food & drink. After someone randomly decided to ask it to order a tungsten cube, Claude ended up with an inventory full of (as it put it) “specialty metal items” that it ended up selling at a loss.
Wow, we really are in the Stone Age of GPU programming...
Another interesting problem with kernel languages like CUDA and Metal: the grid and threadgroup sizes are often load-bearing for the kernel's correctness, but we currently don't have a way to encode this in the function signature and/or type system, meaning we just have to write…
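To make that concrete, here's a made-up CUDA example (mine, not from anything real): the kernel is only correct when launched with exactly 256 threads per block, but the only places that contract can live today are a comment and a host-side assert.

#include <cassert>

__global__ void block_sum_256(const float* in, float* out) {
    // CONTRACT: only correct when blockDim.x == 256; nothing in the
    // signature or type system enforces this.
    __shared__ float buf[256];
    int tid = threadIdx.x;
    buf[tid] = in[blockIdx.x * 256 + tid];
    __syncthreads();

    // Tree reduction over the 256-element shared buffer.
    for (int stride = 128; stride > 0; stride >>= 1) {
        if (tid < stride) buf[tid] += buf[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = buf[0];
}

void launch_block_sum(const float* in, float* out, int num_blocks, int block_size) {
    // Best we can do: a runtime check at the launch site.
    assert(block_size == 256 && "block_sum_256 requires exactly 256 threads per block");
    block_sum_256<<<num_blocks, block_size>>>(in, out);
}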
My top two unsolved ML compiler problems:
- Type systems: currently it's impossible to make sure your code is doing the right thing. Subtle precision bugs, or just outright doing the wrong math. I want a type system that helps me statically verify something (sketch below)
- Expressing sparsity:…
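On the precision point, a toy CUDA example of the kind of bug I'd want caught statically (my own illustration): accumulating in fp16 silently stalls once the running sum hits 2048, because adding 1.0 no longer changes the value, while the fp32 version stays exact.

#include <cstdio>
#include <cuda_fp16.h>

__global__ void sum_fp16(float* out, int n) {
    __half acc = __float2half(0.0f);
    for (int i = 0; i < n; ++i)
        // Round back to fp16 after every add, like an fp16 accumulator would:
        // once acc reaches 2048, 2048 + 1 rounds back down to 2048.
        acc = __float2half(__half2float(acc) + 1.0f);
    *out = __half2float(acc);
}

__global__ void sum_fp32(float* out, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += 1.0f;  // exact in fp32 for n well below 2^24
    *out = acc;
}

int main() {
    float h[2], *d;
    cudaMalloc(&d, 2 * sizeof(float));
    sum_fp16<<<1, 1>>>(d, 4096);
    sum_fp32<<<1, 1>>>(d + 1, 4096);
    cudaMemcpy(h, d, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    printf("fp16-accumulated: %.0f  fp32-accumulated: %.0f\n", h[0], h[1]);  // ~2048 vs 4096
    cudaFree(d);
    return 0;
}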
A super interesting property of systems in equilibrium is that circular reasoning actually works!
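My gloss on this (the tweet doesn't spell it out): in equilibrium, "X holds because Y, and Y holds because X" isn't a fallacy, it's just the self-consistency condition of a fixed point.

\[
  x^{*} = f(x^{*}), \qquad \text{e.g. } x_{t+1} = f(x_t) \to x^{*} \text{ as } t \to \infty \text{ when } f \text{ is a contraction.}
\]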
Just came back from visiting Hong Kong and Taipei, and now I’m sad. Why can’t we have Asian megacities in the US?
Hats off to the ffmpeg guy for fighting the good fight in owning the ribs
Remember the post about rav1d, the dav1d AV1 decoder transpiled to Rust? Rust is actually 35% slower.
I wonder if the buildout of these GW datacenters will be accompanied by a Stuxnet v2, which will subtly overclock all the GPUs until they burn out
How do you actually profile a Metal kernel? I want at minimum to be able to know the execution time of the kernel not taking into account any dispatch time, but that seems hard to do?
Finally, @AnthropicAI's @Si_Boehm and @bztree break down what it’s like running inference on NVIDIA GPUs, Google TPUs, and AWS Trainium, including architecture quirks, performance tradeoffs, and the tools they use to make it all work: youtube.com/watch?v=-k6yik…
Dear Haskellers / other type enthusiasts: what kind of type system can I employ to ensure correctness of programs where the only type is float32?
Ok update here!! Tenstorrent reached out and got this taken care of! Turns out they had a trade-in program for those affected by the deprecation. Glad they take customers' wellbeing seriously (even just a rando on X like me), and I'm excited to write some Wormhole kernels!
Just updated my Tenstorrent Metalium for the first time in like 6 months and I guess now my e75 is considered end of life and I can't use it?? What gives? If I get a Blackhole, is that going to be a paperweight in a few months again?
I feel like NP-completeness and recursive enumerability are actually the same thing, it’s just that NP-complete things have exponentially many variants and recursively enumerable things have infinitely many variants