Lysandre
@LysandreJik
Chief Open-Source Officer (COSO) at Hugging Face
Five years ago, I started in NLP with @AndrewYNg and @DeepLearningAI_'s course, and it gave me the skills to join Hugging Face as an MLE. I couldn't be prouder to announce that I have recorded a video to present the 🤗 Model & Datasets Hub, which is now part of this revamped course

The @huggingface Transformers ↔️ @vllm_project integration just leveled up: Vision-Language Models are now supported out of the box! If the model is integrated into Transformers, you can now run it directly with vLLM. github.com/vllm-project/v… Great work @RTurganbay 👏
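A minimal sketch of what that looks like, assuming vLLM's `model_impl="transformers"` switch and an illustrative VLM checkpoint (neither is named in the post):

```python
from vllm import LLM

# model_impl="transformers" asks vLLM to run the Transformers modeling code
# directly; the checkpoint below is just an example VLM (assumption, not from
# the post).
llm = LLM(model="Qwen/Qwen2.5-VL-3B-Instruct", model_impl="transformers")

messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        {"type": "text", "text": "Describe this image."},
    ],
}]
print(llm.chat(messages)[0].outputs[0].text)
```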
What a day! What was released overnight:
- Kimi K2 technical report
- Qwen3-235B-A22B-2507 non-thinking model, with significantly improved performance
- KAT-V1 from Kuaishou, the company behind @Kling_ai, a 40B model that rivals DeepSeek R1

Details below:
Kimi K2 tech report is full of gems as always. Here are my notes on it:
> MuonClip: Pretty crazy how after 70k steps the training stabilizes and the QK-clip is basically inactive. There is also no loss in perf with QK-clip, which is not trivial at all (at small scale but with…
🤗🤗🤗 🤗❤️🤗 @huggingface & Cline = your LLM playground 🤗🤗🤗 You can access Kimi K2 & 6,140 (!) other open source models in Cline.
Not sure if it's my env: `uv pip install flash_attn==2.7.4.post1`
With the recent efforts around isolating kernels, using flash attention X (2, 3, ...) should be as simple as the sketch below, and it gives you a lightweight install! You only need `pip install kernels`
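A minimal sketch of that flow, assuming the `kernels` package and the kernels-community flash-attention build on the Hub:

```python
from kernels import get_kernel

# Downloads only the pre-built binary matching your hardware — hence the
# lightweight install, with no local compilation of flash-attn.
flash_attn = get_kernel("kernels-community/flash-attn")

# The returned module exposes the kernel's ops; exact names depend on the repo.
print(dir(flash_attn))
```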
We already have a solution for kernel install issues: in transformers you can hotswap with this: huggingface.co/kernels-commun… It's a single install, very light (~200 MB?) because it matches only your hardware, and... it will support Metal 😄
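In code, the hotswap looks roughly like this — a sketch assuming a transformers release with kernels support; the checkpoint is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

# Passing a Hub kernel repo as attn_implementation swaps in the pre-built
# flash-attention kernel instead of requiring the flash-attn wheel.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B",  # illustrative checkpoint
    torch_dtype=torch.bfloat16,
    attn_implementation="kernels-community/flash-attn",
)
```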
The current state of the ecosystem for post-training using GRPO w/ vLLM + flash attention is frustratingly brittle.
- The most recent vLLM only supports PyTorch==2.7.0
- vLLM requires xformers, but specifically only v0.0.30 is supported for torch 2.7.0. Any prior version of…
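A quick sanity check for those pins before launching a run, as a sketch (all three packages expose `__version__`):

```python
import torch
import vllm
import xformers

# The mutually compatible pins from the thread: torch 2.7.0 + xformers 0.0.30
# + the most recent vllm release.
assert torch.__version__.startswith("2.7.0"), f"torch {torch.__version__}"
assert xformers.__version__ == "0.0.30", f"xformers {xformers.__version__}"
print("vllm", vllm.__version__)
```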
I recently gave a talk on how Transformers/Diffusers reduce the memory required for loading big models, with features like big model inference and offloading. If your GPU is running out of memory, take a look at my post (link in next thread) on the talk to see how we can help
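The gist in one call — a sketch; the checkpoint and folder name are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

# device_map="auto" lets Accelerate's big-model-inference dispatch layers across
# GPU, CPU RAM, and disk; weights that fit nowhere else are offloaded to disk.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",  # illustrative large checkpoint
    torch_dtype=torch.float16,
    device_map="auto",
    offload_folder="offload",  # spill-over weights land here
)
```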
Holy... `transformers` reached 1B downloads 😭 thanks everyone for making this possible, what an amazing community
A new release of 🤗 PEFT is out: v0.16.0 🎉 Check the 🧵for some of the highlights:
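For context, the core PEFT workflow the release iterates on — a minimal LoRA sketch with illustrative config values and checkpoint:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B")  # illustrative
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```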
Remarkable progress of the Hugging Face science team in 2025: Open-R1, smolagents, SmolVLM2, Ultra-Scale Playbook, OlympicCoder, Open Computer Agent, Reachy Mini, SmolVLA, LeRobot Hackathon and many more... A summary of the projects we released so far this year🧶
`transformers serve` can now power third parties like Cursor; super fun to put small models to the test and try indexing an entire codebase!
LET'S GO! Cursor using local 🤗 transformers models! You can now test ANY transformers-compatible LLM against your codebase. From hacking to production, it takes only a few minutes: anything `transformers` does, you can serve into your app 🔥 Here's a demo with Qwen3 4B:
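Under the hood it's an OpenAI-compatible endpoint, so any client works — a sketch assuming the server listens on the default localhost port (check `transformers serve --help` for yours):

```python
from openai import OpenAI

# Point the OpenAI client at the local `transformers serve` endpoint
# (assumption: localhost:8000; any API key string is accepted locally).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="Qwen/Qwen3-4B",
    messages=[{"role": "user", "content": "Summarize this repo's layout."}],
)
print(resp.choices[0].message.content)
```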
chat, FLUX KONTEXT is OUTTTT! 🔥 huggingface.co/black-forest-l…
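To try it from Python — a sketch assuming diffusers' `FluxKontextPipeline` and the gated FLUX.1-Kontext-dev checkpoint (license acceptance required on the Hub):

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("input.png")  # the image to edit
out = pipe(image=image, prompt="make the sky a sunset", guidance_scale=2.5)
out.images[0].save("edited.png")
```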