Benjamin F Spector
@bfspector
stanford cs phd student. i make ml go brr.
Introducing MirageLSD: The First Live-Stream Diffusion (LSD) AI Model Input any video stream, from a camera or video chat to a computer screen or game, and transform it into any world you desire, in real-time (<40ms latency). Here’s how it works (w/ demo you can use!):
1/10 ML can solve PDEs – but precision🔬is still a challenge. Towards high-precision methods for scientific problems, we introduce BWLer 🎳, a new architecture for physics-informed learning achieving (near-)machine-precision (up to 10⁻¹² RMSE) on benchmark PDEs. 🧵How it works:
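The precision claim above comes from using expressive function representations rather than generic neural nets. As a toy illustration of the idea (this is a generic Chebyshev-collocation sketch, not the BWLer architecture), a smooth 1D Poisson problem u''(x) = -sin(x) on [0, π] with zero boundary conditions can be solved to near machine precision with a low-degree polynomial basis:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

deg = 20
t = np.cos(np.linspace(0, np.pi, 60))      # Chebyshev nodes in [-1, 1]
x = (t + 1) * np.pi / 2                    # map to [0, pi]; dt/dx = 2/pi

# V evaluates the Chebyshev basis T_k at the nodes; D2 evaluates d^2 T_k / dx^2
# (chain rule picks up a factor of (2/pi)^2 from the domain mapping).
V = C.chebvander(t, deg)
D2 = np.zeros_like(V)
for k in range(deg + 1):
    e = np.zeros(deg + 1); e[k] = 1.0
    D2[:, k] = C.chebval(t, C.chebder(e, 2)) * (2 / np.pi) ** 2

# Stack PDE-residual rows (u'' = -sin x) and boundary rows (u(0) = u(pi) = 0),
# then solve the linear least-squares system for the coefficients.
A = np.vstack([D2, C.chebvander(np.array([-1.0, 1.0]), deg)])
b = np.concatenate([-np.sin(x), [0.0, 0.0]])
coef, *_ = np.linalg.lstsq(A, b, rcond=None)

# Compare against the exact solution u = sin(x).
xt = np.linspace(0, np.pi, 200)
u = C.chebval(2 * xt / np.pi - 1, coef)
rmse = np.sqrt(np.mean((u - np.sin(xt)) ** 2))
print(f"RMSE vs exact: {rmse:.2e}")
```

Because sin(x) is smooth, a degree-20 spectral basis resolves it to roughly float64 roundoff, which is the regime the 10⁻¹² RMSE figure refers to.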
Happy Throughput Thursday! We’re excited to release Tokasaurus: an LLM inference engine designed from the ground up for high-throughput workloads with large and small models. (Joint work with @achakravarthy01, @ryansehrlich, @EyubogluSabri, @brad19brown, @jshetaye,…
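"Designed for high-throughput workloads" means trading a little per-request latency for far higher aggregate tokens/sec by packing many sequences into each forward pass. A toy cost model makes the tradeoff concrete (the numbers below are hypothetical, not Tokasaurus internals):

```python
# Each decode step pays a fixed per-pass overhead (kernel launches, reading
# weights) plus a small marginal cost per sequence in the batch. Throughput
# therefore rises steeply with batch size until the marginal term dominates.
fixed_overhead_ms = 10.0   # hypothetical per-forward-pass cost
per_seq_ms = 0.5           # hypothetical marginal cost per batched sequence

def tokens_per_second(batch_size: int) -> float:
    step_ms = fixed_overhead_ms + per_seq_ms * batch_size
    return batch_size / step_ms * 1000.0   # one token per sequence per step

for bs in (1, 8, 64):
    print(bs, round(tokens_per_second(bs), 1))
```

Under these assumed numbers, batch size 64 delivers over an order of magnitude more tokens/sec than batch size 1, even though each individual request's step takes longer.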
3 months ago, Stanford's Hazy Research lab introduced Minions, a project that connects Ollama to frontier cloud models to reduce cloud costs by 5-30x while achieving 98% of frontier model accuracy. Secure Minion turns an H100 into a secure enclave, where all memory and…
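The 5-30x cost reduction comes from the division of labor: the local model does the token-heavy reading while the cloud model only sees short digests. A minimal sketch of that pattern, with `call_local` and `call_cloud` as hypothetical stand-ins for real model APIs (e.g. an Ollama endpoint and a frontier-model API), not the actual Minions protocol:

```python
def call_local(prompt: str) -> str:
    # Stand-in for a small on-device model (e.g. served via Ollama).
    # Toy "summary": just truncate the prompt.
    return prompt[:80]

def call_cloud(prompt: str) -> str:
    # Stand-in for a frontier cloud model; it only ever sees short notes.
    return f"Answer based on {prompt.count('[chunk]')} chunk summaries."

def answer(question: str, document: str, chunk_size: int = 1000) -> str:
    # Local model reads the long document chunk by chunk...
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    notes = [call_local(f"Summarize for '{question}': {c}") for c in chunks]
    # ...so the cloud model is billed only for the compact summaries.
    packed = "\n".join("[chunk] " + n for n in notes)
    return call_cloud(f"{question}\n{packed}")

print(answer("What is the latency?", "x" * 3500))
```

The cloud bill scales with the summary length rather than the document length, which is where the cost savings come from; the Secure Minion work described above additionally keeps that exchange inside an H100 enclave.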
We wrote a megakernel! Excited to share how we fused Llama-1B into a single kernel to reach SOTA latency. Check out our blog post and code below!
(1/5) We’ve never enjoyed watching people chop Llamas into tiny pieces. So, we’re excited to be releasing our Low-Latency-Llama Megakernel! We run the whole forward pass in a single kernel. Megakernels are faster & more humane. Here’s how to treat your Llamas ethically: (Joint…
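The point of a megakernel is memory traffic: launching one kernel per op forces every intermediate through global memory, while a single fused kernel keeps intermediates on-chip. A toy Python model of that accounting (illustrative only; the real artifact is one fused CUDA kernel for the whole Llama-1B forward pass):

```python
import numpy as np

traffic = {"reads": 0, "writes": 0}

def op(f, x):
    # Model a kernel launch: read the input array from global memory,
    # compute, write the output array back.
    traffic["reads"] += 1
    y = f(x)
    traffic["writes"] += 1
    return y

x = np.random.randn(4)

# Unfused: three "kernels", so three array-sized round-trips through memory.
y = op(np.tanh, op(np.exp, op(np.abs, x)))
unfused = traffic["reads"] + traffic["writes"]   # 6 transfers

# Fused: one "kernel" computing the same chain; intermediates stay on-chip.
traffic = {"reads": 0, "writes": 0}
y2 = op(lambda v: np.tanh(np.exp(np.abs(v))), x)
fused = traffic["reads"] + traffic["writes"]     # 2 transfers

print(unfused, fused)
assert np.allclose(y, y2)  # same result, a third of the memory traffic
```

At small batch sizes, where decode latency is bound by memory bandwidth and launch overhead rather than FLOPs, eliminating those round-trips is what buys the SOTA latency claimed above.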
Introducing Multiverse: the first AI-generated multiplayer game. Multiplayer was the missing piece in AI-generated worlds — now it’s here. Players can interact and shape a shared AI-simulated world, in real-time. Training and research cost < $1.5K. Run it on your own PC. We…
trained a nanoGPT? feeling behind before o4-mini? 🚨🚨i'm open-sourcing beyond-nanoGPT, an internal codebase to help people go from LLM basics to research-level understanding. 🚨🚨 it contains thousands of lines of from-scratch, annotated pytorch implementing advanced…
A little pre-GTC present for everyone... new Blackwell kernels, all written in ThunderKittens! ⚡️🐱 BF16 & FP8 GEMMs, attention forwards & backwards - fast (competitive with cuDNN and cuBLAS) and open-source! w/ @bfspector @AaryanSinghal4 @HazyResearch @togethercompute 1/