Georgi Gerganov
@ggerganov
24th at the Electrica puzzle challenge | https://github.com/ggml-org
You can now run Kimi K2 locally with our Dynamic 1.8-bit GGUFs! We shrank the full 1.1TB model to just 245GB (an ~80% size reduction). The 2-bit XL GGUF performs exceptionally well on coding and passes all our code tests. Guide: docs.unsloth.ai/basics/kimi-k2 GGUFs: huggingface.co/unsloth/Kimi-K…
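A minimal sketch of what running one of these quants with llama.cpp could look like; the repo name and quant tag below are illustrative assumptions (the GGUF link above is truncated), so check the linked guide for the real ones:

# repo and quant tag are assumptions; see the Unsloth guide for the actual names
llama-cli -hf unsloth/Kimi-K2-GGUF:UD-Q2_K_XL -p "Write merge sort in Python"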
GGUF support added to LFM2 🥳 Enjoy running LFM2-350M on a microwave! Thanks for all your kind messages, vLLM support and more coming soon!
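For the curious, a hedged sketch of serving the new GGUF with llama-server (the Hugging Face repo name here is an assumption; substitute the actual LFM2 GGUF repo):

# repo name is an assumption
llama-server -hf LiquidAI/LFM2-350M-GGUF -c 4096 --port 8080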
LM Studio is now free to use anywhere: both at home and at work. I put the original terms in place back in August 2023, shortly after launching the app. The idea was to allow anyone to experiment with LM Studio at home for free, but require companies to get in touch and buy a…
LM Studio is now free for use at work. Starting today, it is no longer necessary to get a separate license to use LM Studio for work. No need to fill out a form or contact us. You and your team can just use the app! LM Studio has always been free for personal use. This is due to…
Started work at Meta this week. My job is to make amazing AI products that billions of people love to use. It won't happen overnight, but a few days in, I'm feeling confident that great things are ahead.
Congrats on the exciting release! The new architecture changes in the model are definitely interesting and will be fun to play with. Kudos for the community initiative to build on-device products. Looking forward to many people joining the challenge!
I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more
Just open sourced an Emacs port of @ggerganov's excellent llama.vim:
dear imgui 1.92 released! github.com/ocornut/imgui/… New protocol for backends to create/update textures, with support for 13 renderers. Dynamic font scaling. Dozens of font/atlas related changes. Dozens of other changes. Expect some turbulence. (misc video from January...)
save r/LocalLlama
We need r/LocalLlama back :( Hopefully a good neutral moderator takes the reins asap!
👀 curious if this would fly
We've been negotiating a $2M contract to get AMD on MLPerf, but one of the sticking points has been confidentiality. Perhaps posting the deliverables on X will help legal to get in the spirit of open source!
Massive QoL update: You can now filter the models on the hub based on their size 🔥 Find the model that fits YOUR needs faster ⚡️
Diffusion in-painting and TTS model. Give it a try! We're looking forward to the great things you'll build with it. Let me know if you need any help.
🎙️ After serving millions of users through our text-to-speech platform, one need kept coming up: fine-grained AI speech editing - the ability to modify existing speech. Today, we’re open-sourcing PlayDiffusion, a diffusion-based inpainting model built for that exact purpose.…
llm-llama-server now supports tools, which means this local Gemma demo should work (if you have 3.2GB free):

brew install llama.cpp
llama-server --jinja -hf unsloth/gemma-3-4b-it-GGUF:Q4_K_XL
uvx --with llm-llama-server llm -m llama-server-tools -T llm_time 'what time is it?'
I added tool support to the llm-llama-server plugin, so now you can run any GGUF model locally that supports tools and access it from LLM simonwillison.net/2025/May/28/ll…
llama.cpp streaming support for tool calling & thoughts was just merged: please test & report any issues 😅 github.com/ggml-org/llama… #llamacpp
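One quick way to exercise the new path is to stream a tool call through llama-server's OpenAI-compatible endpoint; a rough sketch (the tool name and schema are made up for illustration):

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "stream": true,
  "messages": [{"role": "user", "content": "What time is it in Sofia?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_time",
      "description": "Get the current time for a city",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }]
}'

If streaming works as intended, the tool-call arguments should arrive incrementally as deltas rather than in one final chunk.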
Summarizing the latest @fireship_dev video using @ggml_org's cutting-edge llama.cpp audio support (model: Ultravox + Llama 3.1 8B)
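Roughly, this audio pipeline can be reproduced with llama.cpp's multimodal CLI; a sketch under the assumption that the flag names and model repo below match the current multimodal docs:

# repo and --audio flag usage are assumptions; see llama.cpp's multimodal docs
llama-mtmd-cli -hf ggml-org/ultravox-v0_5-llama-3_1-8b-GGUF --audio video_audio.mp3 -p "Summarize this audio"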
PSA for applications that use local AI models - here is how to do it right: More and more applications are adding support for local AI models, which is great. But I notice that they are doing it the wrong way (see the screenshots below). The right way to do it is to add a…
Yay! You folks have been on fire in the past few weeks. Regarding the feature itself, I noticed there’s no way to enter a URL here? I run Ollama on another machine in my network.
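In the meantime, a workaround sketch for talking to a remote Ollama instance directly (the host address is a placeholder; Ollama listens on port 11434 by default, and the remote machine needs OLLAMA_HOST=0.0.0.0 to be reachable over the network):

curl http://192.168.1.50:11434/api/generate -d '{"model": "llama3.2", "prompt": "hello", "stream": false}'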