Unsloth AI
@UnslothAI
Open source LLM fine-tuning! 🦥 https://github.com/unslothai/unsloth
You can now reproduce DeepSeek-R1's reasoning on your own local device! Experience the "Aha" moment with just 7GB VRAM. Unsloth reduces GRPO training memory use by 80%, and with just 15GB VRAM you can turn Llama-3.1 (8B) & Phi-4 (14B) into reasoning models. Blog: unsloth.ai/blog/r1-reason…
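The core idea behind GRPO, which the blog walks through, is scoring a group of sampled completions against each other instead of training a separate value network. A minimal stdlib sketch of the group-relative advantage step (names are illustrative, not Unsloth's API):

```python
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: normalize each completion's reward
    against the mean/std of its own sampling group (no value network)."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four completions sampled for the same prompt, scored by a reward function:
rewards = [0.0, 1.0, 0.0, 1.0]
print(grpo_advantages(rewards))  # roughly [-1, 1, -1, 1]
```

Completions that beat their group's average get positive advantage and are reinforced; the rest are pushed down.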

You can now run Qwen3-235B-A22B-Thinking-2507 with our Dynamic 2-bit GGUFs! The full 250GB model gets reduced to just 87GB (-65% size). Achieve >6 tokens/s on 88GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: huggingface.co/unsloth/Qwen3-…
🚀 We’re excited to introduce Qwen3-235B-A22B-Thinking-2507 — our most advanced reasoning model yet! Over the past 3 months, we’ve significantly scaled and enhanced the thinking capability of Qwen3, achieving: ✅ Improved performance in logical reasoning, math, science & coding…
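The headline size reduction above is easy to sanity-check with a one-liner (numbers taken from the post itself):

```python
def size_reduction(full_gb, quant_gb):
    """Percent size reduction from quantizing a model checkpoint."""
    return round(100 * (full_gb - quant_gb) / full_gb)

# Qwen3-235B-A22B-Thinking-2507: 250GB full -> 87GB Dynamic 2-bit GGUF
print(size_reduction(250, 87))  # 65
```

The same arithmetic applies to the other releases in this thread (e.g. Qwen3-Coder's 512GB -> 182GB).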
Run Qwen3-Coder with our Dynamic 2-bit GGUFs! ⭐ We shrank the 480B parameter model to just 182GB (down from 512GB). Also, run with 1M context length. Achieve >6 tokens/s on 182GB unified memory or 158GB RAM + 24GB VRAM. Qwen3-Coder-480B-A35B GGUFs: huggingface.co/unsloth/Qwen3-…
>>> Qwen3-Coder is here! ✅ We’re releasing Qwen3-Coder-480B-A35B-Instruct, our most powerful open agentic code model to date. This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation. It achieves…
You can now run Qwen3-235B-A22B-2507 with our Dynamic 2-bit GGUFs! The full 250GB model gets reduced to just 88GB (-65% size). Achieve >5 tokens/s on 89GB unified memory or 80GB RAM + 8GB VRAM. GGUFs: huggingface.co/unsloth/Qwen3-…
Bye Qwen3-235B-A22B, hello Qwen3-235B-A22B-2507! After talking with the community and thinking it through, we decided to stop using hybrid thinking mode. Instead, we’ll train Instruct and Thinking models separately so we can get the best quality possible. Today, we’re releasing…
My Reinforcement Learning (RL) & Agents 3 hour workshop is out! I talk about: 1. RL fundamentals & hacks 2. "Luck is all you need" 3. Building smart agents with RL 4. Closed vs Open-source 5. Dynamic 1bit GGUFs & RL in @UnslothAI 6. The Future of Training youtube.com/watch?v=OkEGJ5…
We made step-by-step guides to Fine-tune & Run every single LLM! 🦥 What you'll learn: • Technical analysis + Bug fixes explained for each model • Best practices & optimal settings • How to fine-tune with our notebooks • Directory of model variants 🔗docs.unsloth.ai/basics/tutoria…

You can now fine-tune Gemma 3n for free with our notebook! Unsloth makes Google Gemma training 1.5x faster with 50% less VRAM and 5x longer context lengths - with no accuracy loss. Guide: docs.unsloth.ai/basics/gemma-3… GitHub: github.com/unslothai/unsl… Colab: colab.research.google.com/github/unsloth…
Run Gemma 3n locally with our Dynamic GGUFs! ✨ @Google's Gemma 3n supports audio, vision, video & text and the 4B model fits on 8GB RAM for fast local inference. Fine-tuning is also supported in Unsloth. Gemma-3n-E4B GGUF: huggingface.co/unsloth/gemma-…
I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, ai.dev, and more
We made a Guide on mastering LoRA Hyperparameters, so you can learn to fine-tune LLMs correctly! Learn to: • Train smarter models with fewer hallucinations • Choose optimal: learning rates, epochs, LoRA rank, alpha • Avoid overfitting & underfitting 🔗docs.unsloth.ai/get-started/fi…
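One relationship the guide covers: LoRA's rank and alpha interact through the update's scaling factor alpha/r (as in the original LoRA formulation), so raising the rank without adjusting alpha quietly shrinks the effective update. A quick illustration:

```python
def lora_scaling(alpha, r):
    """LoRA scales its low-rank update BA by alpha/r, so doubling the
    rank while keeping alpha fixed halves the update's magnitude."""
    return alpha / r

# A common starting point keeps alpha equal to (or 2x) the rank:
for r, alpha in [(8, 16), (16, 16), (32, 16)]:
    print(f"r={r:2d} alpha={alpha} -> scaling {lora_scaling(alpha, r):.2f}")
```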

Mistral releases Small 3.2 (24B), a new update to their 3.1 model. 🔥 The model performs much better on 5-shot MMLU (CoT), instruction following and function/tool calling! Run it locally with FP8, or on 16GB RAM using our Dynamic GGUFs with a fixed chat template: huggingface.co/unsloth/Mistra…

We're teaming up with @Google for a Gemma developer meetup at Google's San Francisco office next Thursday, June 26! 🦥 • Join us & the Gemma team for live demos and talks • Unsloth new RL notebook & roadmap • Q&A + merch from us all RSVP required: lu.ma/gemma-unsloth

We made a complete Guide on Reinforcement Learning for LLMs! Learn about: • RL's goal & why it's key to building intelligent AI agents • Why o3, Claude 4 & R1 use RL • GRPO, RLHF, DPO, reward functions • Training your own local R1 model via Unsloth 🔗docs.unsloth.ai/basics/reinfor…
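Reward functions for GRPO are often just plain Python that scores each completion. A minimal correctness-style reward in the spirit of the R1 setup (hypothetical sketch, not the guide's exact code):

```python
import re

def correctness_reward(completion: str, answer: str) -> float:
    """Score 1.0 if the text inside <answer>...</answer> matches the
    reference answer, else 0.0 (a common GRPO-style reward shape)."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match and match.group(1).strip() == answer:
        return 1.0
    return 0.0

print(correctness_reward("<think>2+2=4</think><answer>4</answer>", "4"))  # 1.0
print(correctness_reward("<answer>5</answer>", "4"))                      # 0.0
```

In practice several such functions (format, length, correctness) are summed per completion and fed to the trainer.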
