Olivier Chafik
@ochafik
OpenSCAD & llama.cpp contributor, ex-Googler (views expressed = my own), he/him 🏳️🌈 @ochafik.bsky.social @[email protected]
Really slick integration! MCP all the (cool) things!
You really can just do things! Use *any* Hugging Face space as an MCP server along with your local models! 🔥 Here we use Qwen 3 30B A3B with @ggml_org llama.cpp and @huggingface tiny agents to create images via FLUX powered by ZeroGPU ⚡ It's quite a bit crazy to see local…
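One way to wire this up is with tiny agents, which read an agent config pointing at a local OpenAI-compatible endpoint (here a llama-server instance) plus a list of MCP servers, such as a Gradio Space's MCP endpoint. A rough sketch of such a config; the field names, model label, and Space URL here are assumptions for illustration, so check the tiny agents docs for the real schema:

```json
{
  "model": "Qwen3-30B-A3B",
  "endpointUrl": "http://localhost:8080/v1",
  "servers": [
    {
      "type": "sse",
      "url": "https://example-flux-space.hf.space/gradio_api/mcp/sse"
    }
  ]
}
```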
Months of @ochafik's work finally paid off, we have MCP w/ llama.cpp 😁😁
biggest takeaway of all is Qwen 3 30B A3B is slept on and you should be playing around with MCP if you haven't already!
llama.cpp streaming support for tool calling & thoughts was just merged: please test & report any issues 😅 github.com/ggml-org/llama… #llamacpp
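With streaming, an OpenAI-compatible client receives a tool call's name and arguments as incremental deltas that have to be stitched back together before the call can be executed. A minimal sketch of that client-side accumulation, assuming the delta shape of the OpenAI streaming chat format (the example chunks below are made up, not real server output):

```python
from typing import Any

def accumulate_tool_calls(deltas: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge streamed tool-call deltas (OpenAI chat-completion chunk style)
    into complete tool calls, keyed by their "index" field."""
    calls: dict[int, dict[str, Any]] = {}
    for delta in deltas:
        for tc in delta.get("tool_calls", []):
            slot = calls.setdefault(tc["index"], {"id": "", "name": "", "arguments": ""})
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function", {})
            if fn.get("name"):
                slot["name"] += fn["name"]
            if fn.get("arguments"):
                slot["arguments"] += fn["arguments"]
    return [calls[i] for i in sorted(calls)]

# Hypothetical chunks as a server might stream them:
chunks = [
    {"tool_calls": [{"index": 0, "id": "call_1",
                     "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "{\"city\": "}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "\"Paris\"}"}}]},
]
result = accumulate_tool_calls(chunks)
```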
Gemma 3 VISION on llama.cpp server Still very early WIP, but it works 🔥
Docker is embracing the ggml/llama.cpp on-device future. Who is going to be next?
WIP: streaming support for llama.cpp's tool calling + normalization of streamed <think> tags (for QwQ, DeepSeek R1 & Cohere Command R7B) Still strongly conforming to tool schemas (grammar-constrained). Testing / feedback welcome :-) github.com/ggml-org/llama…
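Because decoding is grammar-constrained, the streamed arguments should, once reassembled, parse as JSON matching the tool's declared schema. A sketch of the request-side declaration plus a trivial client-side sanity check; the tool name and schema here are made up for illustration:

```python
import json

# OpenAI-style tool declaration; the server derives a grammar from the
# "parameters" JSON schema, so the model's output should conform to it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def check_arguments(tool: dict, raw_arguments: str) -> dict:
    """Parse reassembled streamed arguments and verify required keys exist."""
    args = json.loads(raw_arguments)
    schema = tool["function"]["parameters"]
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return args

args = check_arguments(tools[0], '{"city": "Paris"}')
```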
Day-zero Gemma 3 support in llama.cpp 🤯 👉 4 model sizes: 1B, 4B, 12B, 27B 👉 Vision capability (except for 1B) with bidirectional attention 👉 Context size: 32k (1B) and 128k (4B, 12B, 27B) 👉 140+ languages supported (except for 1B) 👉 Day-zero support on many frameworks 🚀
Loving this... hope OSS catches up quickly 🤗 Also, conforming to JSON schemas in a diffusion context is going to be an interesting challenge (schema / grammar compliance gradients anyone?)
Major AI breakthrough: Diffusion Large Language Models are here! They're 10x faster and 10x cheaper than traditional LLMs. Here's everything you need to know:
You're seeing a minor miracle: Phi-4 3.8B doing agentic tool calls. Doing that reliably, locally, even with 8B, has been a fool's errand until @ochafik's opus of a patch series to llama.cpp (PR coming for phi-4)
llama.vscode (powered by Qwen Coder)
Make your Mac think faster 🧠🧠 Tomorrow I'll show you how to cancel your copilot subscription.
. @JLMelenchon at the very moment the Syrian people are freeing themselves from the dictatorship you have always supported, decency would have demanded that you stay silent. Instead you chose to add lying to the long list of your disgraces on this subject. So, if only one thing were to be kept…
Les Insoumis have always given absolute and total support to the Syrian people's struggle against their dictator. And to the Kurdish people, who have fought tirelessly for socialism, self-management, and federalism. The rest is the usual pitiful lies and slander.