Moritz Thüning
@moritzthuening
Implementing AlphaFold 3 on Tenstorrent hardware. CS @ TUM
Release of TT-Boltz during the keynote by @DavidBennett__ at Dev Day in SF. What a day! github.com/moritztng/tt-b…

TT-Boltz is more than 1 minute faster again. The blue protein structure in the image was computed in 3 minutes and 20 seconds on a single Tenstorrent Wormhole processor. At Dev Day, it took 15 minutes, and the goal of 2 minutes seemed so far away. I was hoping for a speedup of…

We can't do #AIPlumbers without a history lesson from @rhatr. From "toy" examples to discrepancies in what people actually use and what the market maps show (or don't show). What are the right categories? Feel free to argue with personal opinions - we do need hot takes!
Catch @moritzthuening this Sunday at the AI Plumbers conference giving a talk about TT-Boltz on Wormhole™
I'll talk about TT-Boltz and Tenstorrent at the AI Plumbers Conference by @AIFoundryorg in Berlin. The first edition was a great success. It's gonna be fun, please join us. And ping me if you're interested in anything related to this topic. More info here:…
TT-Boltz now runs on Tenstorrent Blackhole too. Just had to change a single line of code. The runtime dropped from 3m 20s to 2m 30s using only 64 of 140 Tensix cores for the heaviest operations. Very soon we'll use all of those cores. For 9 months my goal was 2 minutes - now…
Please run TT-Boltz on them. Please discover drugs on them.
There is a lot of satisfaction in seeing your baby out in the wild. Our Wormhole-based Galaxy is now powering some of the most exciting new cloud companies all over the world. @tenstorrent is about open source, open standards and multiple entry points to our software stack.
10000 bajllion wormhole based galaxies, 5 more billion to tenstorrent and 100 more software engineers to tenstorrent please
Falling in love with my new drug discovery machine. Let's see how proteins fold in a Blackhole.

Brilliant primer.
Last week @arcinstitute released the Virtual Cell Challenge 🧬 The goal is to train a model capable of simulating a cell. I wrote a primer for engineers without a biology background.
Software-defined fab. Fab in your browser. Chips in a day. Chips every day.
Boltz-2 now runs on Tenstorrent boards and we saved another 20 seconds. Clone TT-Boltz from GitHub and try it out. This is a pretty big deal, since Boltz-2 can predict binding affinity which is essential for drug discovery. A future where Tenstorrent computers become the best…
A QuietBox/LoudBox has the perfect price for biotech labs.
Davor, in addition to the super efficient SDPA, we have leveraged all the accumulated techniques from the Transformer library, like fused QKV matmuls, create/concat heads, etc. Our near-term plan involves tensor and data parallelism on N300 and LoudBox.
I think I know how my wife feels when I tell her about Kubernetes.
When I started porting Boltz to Tenstorrent Wormhole just a few days after the release of Boltz, not choosing the original AlphaFold 3 seemed contrarian. Now, Boltz is the industry standard and Boltz-2 can predict binding affinity. We should be so grateful to @GabriCorso,…
All the credit for the FlashAttention implementation on Tenstorrent hardware goes to @ColmanGlag. That we were able to use his implementation for triangle attention speaks to its elegance and generality!
tt-metal is hackable and fun! FlashAttention is a great fit for TT’s tile-based, async io programming model
3x in a couple of months... hardware acceleration is so cool!
We've integrated FlashAttention into TT-Boltz, reducing the runtime for a protein with 686 amino acids to 4 minutes and 45 seconds. At Tenstorrent Dev Day just over two months ago, it was still 15 minutes. We're getting closer to the 2 minutes of Boltz-1 on an Nvidia RTX 4090…
I love this hacker culture so much!! Just build cool stuff, come to their next event and those folks will welcome you with open arms. @rhatr, @dadarstan, and Nina Davydova are building a super positive hacker community and pioneer open source hardware for AI. I'll definitely…
It was really great to have @moritzthuening demonstrate how to do new models on Tenstorrent. This kind of exercise truly gives you end-to-end experience that #AIPlumbers is all about. Stay tuned for a bonus podcast recording to learn more about it!
International Workshop on RISC-V for HPC at ISC with @NickBrownHPC, RISC-V CEO Andrea Gallo and many more brilliant folks

