Daniel Han
@danielhanchen
Building @UnslothAI. Fine-tune & train LLMs faster. LLM bug hunter. OSS package: http://github.com/unslothai/unsloth. YC S24. Prev ML at NVIDIA. Hyperlearn used by NASA.
We managed to fit Llama 3.1 8B in under 15GB with GRPO! Experience the R1 "aha moment" for free on Colab! Phi-4 14B also works with @UnslothAI & vLLM is now integrated, allowing 20x faster inference! LoRA with GRPO also just works! 1. We removed double memory usage during vLLM serving…
You can now reproduce DeepSeek-R1's reasoning on your own local device! Experience the "Aha" moment with just 7GB VRAM. Unsloth reduces GRPO training memory use by 80%. 15GB VRAM can transform Llama-3.1 (8B) & Phi-4 (14B) into reasoning models. Blog: unsloth.ai/blog/r1-reason…
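For a rough idea of what such a GRPO run looks like in code, here is a minimal sketch using Unsloth's FastLanguageModel together with TRL's GRPOTrainer; the model name, LoRA settings, dataset, and the toy reward function are placeholders, not the notebook's exact recipe:

```python
# Rough sketch only: GRPO + LoRA with Unsloth and TRL's GRPOTrainer.
# Model name, dataset, and the toy reward function are placeholders.
from unsloth import FastLanguageModel
from trl import GRPOConfig, GRPOTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    max_seq_length=1024,
    load_in_4bit=True,
    fast_inference=True,  # use the integrated vLLM backend for rollouts
)
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

def reward_short(completions, **kwargs):
    # Toy reward: prefer shorter completions (stand-in for a real verifier).
    return [-float(len(c)) for c in completions]

trainer = GRPOTrainer(
    model=model,
    processing_class=tokenizer,
    reward_funcs=reward_short,
    args=GRPOConfig(output_dir="grpo_out", max_steps=50),
    train_dataset=load_dataset("trl-lib/tldr", split="train"),
)
trainer.train()
```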
this pod was all about optimizations, torch.compile, benchmaxxing and beyond. @danielhanchen has some exciting news too coming up for unsloth⚡️ pod coming soon.
Thank you to @Kimi_Moonshot for quickly addressing my queries on the correct system prompt for Kimi K2! We'll be re-uploading all BF16 + dynamic @unslothai GGUFs with fixed tool calling & the new sys prompt! Sys prompt = "You are Kimi, an AI assistant created by Moonshot AI."
We’ve updated Kimi K2’s chat template to make tool calls more robust. What’s changed: - updated default system prompt - always use model-returned tool_id in multi-turn tool calls, which is more reliable. - If `arguments` in tool call is already a string, don't apply `tojson` to…
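For context, the `arguments` point boils down to "don't serialize twice". A hedged Python equivalent of that template rule (the function name is illustrative; the real change lives in the Jinja chat template):

```python
import json

def render_tool_arguments(arguments):
    # Mirrors the rule above: if `arguments` already arrived as a JSON string,
    # pass it through; applying tojson again would double-escape it.
    if isinstance(arguments, str):
        return arguments
    return json.dumps(arguments)  # equivalent of Jinja's `| tojson` for dicts/lists
```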
a complete guide to fine-tuning LLMs in 15 minutes. this covers how to use Unsloth to fine-tune models in notebooks, how to create your custom chat templates, datasets, and more. this guy deserves much more attention. here is also the full 1 hour 20 minute video going into detail:
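As a sketch of the kind of notebook workflow such a guide covers (the model, dataset, template, and trainer arguments below are placeholders and may differ by trl version):

```python
# Sketch: supervised fine-tuning with Unsloth + TRL's SFTTrainer.
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # placeholder model
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("yahma/alpaca-cleaned", split="train")  # placeholder dataset

def to_text(example):
    # Minimal custom template: join instruction/response pairs into one string.
    return {"text": f"### Instruction:\n{example['instruction']}\n### Response:\n{example['output']}"}

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset.map(to_text),
    args=SFTConfig(output_dir="sft_out", max_steps=60,
                   per_device_train_batch_size=2, dataset_text_field="text"),
)
trainer.train()
```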
coming up @groundzero_ai talks x @danielhanchen (ceo, unsloth) soon. drop your questions/thoughts/insights around training, fine-tuning, RL, scaling or llms in general. (dms open) stay tuned⚡
Highly recommend this Stanford lecture video with @_jasonwei and @hwchung27 :) It's one of my favorites on scaling laws and the bitter lesson! Also Hyung's "Don't teach. Incentivize" video: youtube.com/watch?v=kYWUEV… youtube.com/watch?v=3gb-Zk…
LOVE IT! You can run Kimi K2 (1T token MoE) on a single M4 Max 128GB VRAM (w/ offloading) or a single M3 Ultra (512GB) 🔥 The model was released less than 72 hours ago - love how fast the community optimises open weights - kudos to @UnslothAI 🤗 huggingface.co/unsloth/Kimi-K…
You can use our Gemma 3n multimodal fine-tuning Kaggle notebook for any submission to the $150,000 challenge! The $10,000 is specifically for the Unsloth track - but you can submit to the main track as well! Kaggle notebook: kaggle.com/code/danielhan…
We’ve teamed up with @GoogleDeepMind for a challenge with a $10,000 Unsloth prize! 🦥 Show off your best fine-tuned Gemma 3n model using Unsloth, optimized for an impactful task. The entire hackathon has $150,000 prizes to be won! Kaggle notebook: kaggle.com/code/danielhan…
🦥 Fine-tuning with @UnslothAI now supports Gemma 3n! ✨ Friendly reminder: the Gemma 3n models can understand not just text and code, but also images, audio, video, and a whole lot more.
You can now fine-tune Gemma 3n for free with our notebook! Unsloth makes Google Gemma training 1.5x faster with 50% less VRAM and 5x longer context lengths - with no accuracy loss. Guide: docs.unsloth.ai/basics/gemma-3… GitHub: github.com/unslothai/unsl… Colab: colab.research.google.com/github/unsloth…
Gemma 3N quirks! 1. Vision NaNs on float16 2. Conv2D weights are large, so FP16 overflows to infinity 3. Large activations fixed vs Gemma 3 4. Training losses of 6-7: normal for multimodal? 5. Large nums in msfa_ffn_pw_proj 6. NaNs fixed in @UnslothAI Details: docs.unsloth.ai/basics/gemma-3…
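A hedged sketch of the kind of mitigation points 2 and 6 describe: keep overflow-prone modules (e.g. the large Conv2D weights and msfa_ffn_pw_proj) in float32 so float16's ~65504 max isn't exceeded. The keyword matching here is illustrative; the real fix ships inside Unsloth's Gemma 3n patches:

```python
import torch

def upcast_fragile_modules(model, keywords=("conv", "msfa_ffn_pw_proj")):
    # Keep modules whose weights/activations can exceed float16's max value
    # in float32, so they don't overflow to inf and propagate NaNs.
    for name, module in model.named_modules():
        if any(k in name.lower() for k in keywords):
            module.to(torch.float32)
    return model
```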
Huge thanks to everyone who attended our @Google & @UnslothAI Gemma developer meetup yesterday! 🦥 Was amazing meeting you all & thank you to @blueviggen for hosting the event with us. Thank you to the Google speakers: @DynamicWebPaige, Doug Reid, @imayank42, @GrmCameron and of…

💎 Celebrating the official release of Gemma 3n with the inaugural Gemma Community meetup at @Google San Francisco, cohosted with @Unsloth! Great presentations from the Unsloth founders on agents, the Gemma team on architectural internals, and how to craft effective evals.
We’re fully releasing Gemma 3n, which brings powerful multimodal AI capabilities to edge devices. 🛠️ Here’s a snapshot of its innovations 🧵
Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from @ArtificialAnlys and @DynamicWebPaige & more amazing talks! Location has been updated so please check & if you need help please DM me! lu.ma/gemma-unsloth
r/LocalLlama is back!! reddit.com/r/LocalLLaMA/c…
We need r/LocalLlama back :( Hopefully a good neutral moderator takes the reins asap!
Managed to mostly fix Mistral 3.2 tool calling for GGUF / transformers! 1. 3.2 tool calling is different from 3.1 2. timedelta(days=1) (yesterday) replaced with an if-else - supports 2024 to 2028 dates - so the sys prompt is now word for word the same! 3. Made an experimental FP8 quant as well!
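A hedged Python mirror of point 2 (the actual change lives in the Jinja chat template, which can't do date arithmetic): compute "yesterday" with explicit if/else rules instead of timedelta, valid for 2024 through 2028:

```python
# Hypothetical Python equivalent of the template's if-else "yesterday" logic.
def yesterday(year: int, month: int, day: int) -> tuple[int, int, int]:
    if day > 1:
        return year, month, day - 1
    if month == 1:                      # Jan 1 -> Dec 31 of the previous year
        return year - 1, 12, 31
    # Day 1 of any other month -> last day of the previous month.
    days_in_month = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
                     7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}
    prev = month - 1
    last = days_in_month[prev]
    if prev == 2 and year % 4 == 0:     # 2024 and 2028 are leap years
        last = 29
    return year, prev, last
```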
Mistral releases Small 3.2 (24B), a new update to their 3.1 model. 🔥 The model performs much better on 5-shot MMLU (CoT), instruction following and function/tool calling! Run locally with FP8 or 16GB RAM using our Dynamic GGUFs with fixed chat template: huggingface.co/unsloth/Mistra…
New tutorial: how to build a synthetic dataset with recent information and use it to fine-tune with @UnslothAI Check out the Colab: colab.research.google.com/drive/1JK04IBE… Steps in the 🧵
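Roughly, the pipeline shape looks like this (the seed documents, prompt wording, and generation model are placeholders; the Colab has the actual steps):

```python
# Sketch: turn recent documents into synthetic Q&A pairs, then format them
# with the tokenizer's chat template for fine-tuning.
import json
from datasets import Dataset

documents = ["<recent article text 1>", "<recent article text 2>"]  # placeholder source data

def generate_qa(doc, llm):
    # `llm` is any callable returning a JSON list of {"question", "answer"}
    # dicts for the given document (e.g. a local model or an API call).
    prompt = f"Write 3 question-answer pairs about the text below as a JSON list.\n\n{doc}"
    return json.loads(llm(prompt))

def build_dataset(documents, llm, tokenizer):
    rows = []
    for doc in documents:
        for pair in generate_qa(doc, llm):
            messages = [{"role": "user", "content": pair["question"]},
                        {"role": "assistant", "content": pair["answer"]}]
            rows.append({"text": tokenizer.apply_chat_template(messages, tokenize=False)})
    return Dataset.from_list(rows)
```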