Erik Kaunismäki
@ErikKaum
SWE @huggingface | prev @BananaDev_
We just released native support for @sgl_project and @vllm_project in Inference Endpoints 🔥 Inference Endpoints is becoming the central place where you deploy high-performance inference engines. It provides the managed infra for them so you can focus on your users.

"It is our duty to remain optimists" Here's the passage you didn’t know you needed for your Monday.

Turbopuffer > S3 Vectors
Planetscale > RDS
I think @turbopuffer is a 100x better product than the S3 Vectors product that was announced today:
- TP APIs/DX are superb
- TP has traditional search capabilities
- TP is honest about recall (red flag for S3 Vectors announcement)
TP has a cracked team, I wish them all…
A few things for early hackers that I stumbled upon when testing the MLX CUDA backend:
1) if you want to check what's available (metal/cuda/cpu), you can hack it like this in python:
    a = mx.arange(1)
    dev_type, dev_id = a.__dlpack_device__()
    if dev_type == 8: print("found…
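A rough sketch of the same trick as a reusable helper. The numeric codes come from the DLPack `DLDeviceType` enum (1 = CPU, 2 = CUDA, 8 = Metal). The helper name `backend_of` and the lookup table are made up for illustration, and which code MLX reports for each backend is an assumption beyond the `== 8` check in the post above.

```python
# DLPack device type codes, taken from the DLDeviceType enum in dlpack.h.
# Only the three backends mentioned in the post are listed here.
DLPACK_DEVICE_NAMES = {
    1: "cpu",
    2: "cuda",
    8: "metal",
}


def backend_of(array):
    """Return (backend_name, device_id) for any object with __dlpack_device__.

    Hypothetical helper: works on anything that implements the DLPack
    protocol, not only MLX arrays.
    """
    dev_type, dev_id = array.__dlpack_device__()
    # Some libraries return an IntEnum here, so normalize to a plain int.
    code = int(dev_type)
    name = DLPACK_DEVICE_NAMES.get(code, f"unknown({code})")
    return name, dev_id


# Usage with MLX (assumes mlx is installed and imported as in the post):
#   import mlx.core as mx
#   print(backend_of(mx.arange(1)))
```

Going through a lookup table instead of a bare `== 8` keeps the check readable and makes an unexpected device code show up as `unknown(...)` rather than silently falling through.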
1. Develop an LLM on a Mac locally 2. Deploy on Nvidia GPUs and run natively on CUDA 🔥 MLX team is really doing the right stuff here!
MLX, Apple’s machine learning framework, just merged a CUDA Backend. Matmul, tensor copy ops, and other core CUDA primitives are now part of Apple’s official build. There’s a lot of hype + confusion. Here’s what it is, and…isn’t.
The real argument for one or the other is: which does the LLM naturally gravitate towards?
Prior to coding agents, I used to think bike-shedding like this about code/file structure and naming was a massive waste of time. But now, more than ever, it actually matters and pays to think about code organization so that LLMs and coding agents can be more productive.
Robots just casually walking to our office and taking our jobs. I guess this is it 🤷🏼‍♂️
Now this is interesting! 👀 Stop talking about the glass liquid thing.

Today I did debugging with kubectl for the first time in a while. Feels good.
This is what peak software distribution and "SEO" looks like in 2025.
In the System Prompts.