1,000 stars on TabbyAPI. Holy crap. Huge thanks to @turboderp_ and everyone who contributed!
TabbyAPI now supports ExLlamaV3 with automatic backend detection! 🎉 Please note that exl3 is still under active development, so mileage may vary compared to exl2. Thanks to @turboderp_ and all contributors for making this a reality.
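For the curious, one way automatic detection could work is by reading the quantization metadata in the model folder. This is a toy illustration only, not TabbyAPI's actual logic, and the quant_method field is an assumption about how exl2/exl3 quants tag themselves:

```python
# Toy illustration only: not TabbyAPI's real detection code.
# Assumes quants record a "quant_method" under quantization_config in the
# model's config.json, which may not hold for every model.
import json
from pathlib import Path

def detect_backend(model_dir: str) -> str:
    cfg = json.loads((Path(model_dir) / "config.json").read_text())
    method = cfg.get("quantization_config", {}).get("quant_method", "")
    if method == "exl3":
        return "exllamav3"
    return "exllamav2"  # fall back to exl2 for everything else

print(detect_backend("/models/Llama-3-8B-exl3"))  # placeholder path
```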
I have decided to tweet today. So here is a visualization of how the paged cache works with continuous batching in ExLlamaV3. I think it's neat. #🐈
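For anyone who can't see the animation, the rough idea in toy Python: the cache is split into fixed-size pages, each running sequence holds a list of pages, and new requests join the batch whenever free pages exist. Purely conceptual, not ExLlamaV3's actual data structures:

```python
# Conceptual sketch of a paged KV cache with continuous batching;
# names and page size are made up, not ExLlamaV3's real implementation.
PAGE_SIZE = 256                      # tokens per cache page

class PagedCache:
    def __init__(self, num_pages: int):
        self.free_pages = list(range(num_pages))
        self.page_table = {}         # seq_id -> list of page indices

    def can_admit(self) -> bool:     # continuous batching: admit new work when a page is free
        return len(self.free_pages) > 0

    def add_sequence(self, seq_id: int):
        self.page_table[seq_id] = [self.free_pages.pop()]

    def append_token(self, seq_id: int, pos: int):
        if pos > 0 and pos % PAGE_SIZE == 0:      # current page is full, grab another
            self.page_table[seq_id].append(self.free_pages.pop())

    def finish(self, seq_id: int):   # finished sequences return their pages immediately
        self.free_pages += self.page_table.pop(seq_id)

cache = PagedCache(num_pages=8)
cache.add_sequence(0)                # a new request can join mid-flight if pages are free
```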
Seems to still be true that larger models are less sensitive to quantization. Here is Mistral-Large 123B at 1.4 bits per weight, running on one 24 GB GPU. #AI or something
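The arithmetic is what makes this possible at all: 123B parameters at about 1.4 bits per weight is roughly 21.5 GB of weights, which squeezes under 24 GB with a little room left for the cache. Back-of-the-envelope only:

```python
# Back-of-the-envelope memory estimate for 123B parameters at 1.4 bits/weight.
params = 123e9
bits_per_weight = 1.4
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"{weight_gb:.1f} GB of weights")   # ~21.5 GB, leaving a bit of headroom on a 24 GB card
```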
Supply chain alert! Don't use the ComfyUI Impact Pack right now! Its dependency ultralytics has been compromised on PyPI. Thanks to Shinon for letting me know on Discord. github.com/ultralytics/ul…
Fun with grounding in Qwen2-VL. Finding the things. #wherearethethings #exllamav2 #cat
TabbyAPI now supports vision. Thanks to @turboderp_ for exllamav2's updates and DocShotgun for the initial work. Any exl2-supported vision model works, but this release focuses on Pixtral from @MistralAI.
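Since TabbyAPI speaks the OpenAI-style chat API, sending an image is just a multimodal message. A rough sketch; the URL, key handling, model name and field names follow the OpenAI convention and are assumptions on my part, so check the TabbyAPI docs for specifics:

```python
# Rough sketch of an OpenAI-style multimodal request to a local TabbyAPI
# instance; URL, API key and model name are placeholders.
import requests

resp = requests.post(
    "http://localhost:5000/v1/chat/completions",
    headers={"Authorization": "Bearer your-api-key"},
    json={
        "model": "pixtral-exl2",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this picture?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```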
1 year ago, I made TabbyAPI with @turboderp_ as a side project. Now, it's my most popular side project. I wanted to break away from the bloated nature of all-in-one local model backends and just run #exllama. Thanks to all the contributors and testers. github.com/theroyallab/ta…
I performed a successful vocabulary transplant on Qwen2-0.5B and turned it into a useful draft model for Llama-3. What a time to be alive. #hashtag huggingface.co/turboderp/Qwam…
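The gist of a vocabulary transplant, very roughly: keep the draft model's transformer, but rebuild its embedding (and output) layers around the target tokenizer, initializing each new row from the old embeddings of the pieces that spell the new token. A hand-wavy sketch, not the actual procedure used for the linked model; old_tok and new_tok are assumed to be Hugging Face style tokenizers:

```python
# Hand-wavy sketch of a vocabulary transplant: re-index a draft model's
# embeddings to the target tokenizer so draft and target share a vocabulary
# for speculative decoding. Not the actual script used for the linked model.
import torch

def transplant_embeddings(old_emb, old_tok, new_tok):
    hidden = old_emb.weight.shape[1]
    new_emb = torch.nn.Embedding(new_tok.vocab_size, hidden)
    for new_id in range(new_tok.vocab_size):
        text = new_tok.decode([new_id])
        old_ids = old_tok.encode(text, add_special_tokens=False)
        if old_ids:  # average the old embeddings of the pieces spelling this token
            new_emb.weight.data[new_id] = old_emb.weight.data[old_ids].mean(dim=0)
    return new_emb
```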
Llama-3-instruct becomes much more useful when you censor some of its catchphrases. #simplesolutions etc. 🤷
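By "censor" I mean banning the phrases at generation time and backing up when one appears, rather than filtering afterwards. A toy version of the idea; the phrase list and the rewind-and-resample loop are illustrative only, not exllamav2's actual banned-strings feature:

```python
# Toy illustration of censoring catchphrases: stream pieces of text, and if a
# banned phrase completes, rewind to where it began and resample from there.
# sample_piece() stands in for a real (stochastic) sampler.
BANNED = ["I hope this helps!", "It is important to note"]   # example phrases only

def generate(sample_piece, max_pieces=200):
    text = ""
    for _ in range(max_pieces):
        candidate = text + sample_piece(text)
        hit = next((b for b in BANNED if b in candidate), None)
        if hit:
            text = candidate[:candidate.index(hit)]   # rewind to just before the phrase
            continue                                  # and try a different continuation
        text = candidate
    return text
```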

New project: goody2.ai. GOODY-2 is an AI model that's so responsible it won't give a straight answer to anything.
🚀 TACO: a new benchmark for code generation from @BAAIBeijing with 26,443 problems.
• 🤖 English questions & Python solutions
• 🧠 Ideal for evaluating code generation from natural language
• 📊 Train: 25,443 samples, Test: 1,000 samples
• 📚 Diverse difficulty levels
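If you want to poke at it, the dataset appears to live on the Hugging Face Hub. A quick look, assuming the BAAI/TACO repo id; the split names and fields are assumptions, so check the dataset card:

```python
# Quick look at TACO via the datasets library; the "BAAI/TACO" repo id is an
# assumption (some datasets also need trust_remote_code=True to load).
from datasets import load_dataset

taco = load_dataset("BAAI/TACO", split="train")
print(len(taco))        # expected to be around 25,443 problems
print(taco[0].keys())   # inspect the question/solution fields
```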
I guess I should post something once in a while. So here's a whole chatbot in 26 lines of Python running Mixtral 8x7B real fast on one 3090. Idk, I think it's neat. 🐈
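Not the exact 26-line script, but the shape of it looks roughly like this with the exllamav2 Python API of that era; the model path, prompt format and sampler settings are placeholders:

```python
# Minimal chatbot sketch with exllamav2; paths and settings are placeholders,
# and this is a simplified approximation of the script in the tweet.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Mixtral-8x7B-instruct-exl2"   # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

history = ""
while True:
    user = input("You: ")
    history += f"[INST] {user} [/INST]"           # Mixtral-instruct style prompt
    full = generator.generate_simple(history, settings, 400)
    reply = full[len(history):]                   # generate_simple returns prompt + completion
    history += reply
    print("Bot:" + reply)
```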