Hugo Larcher
@hugoch
ML infra/software engineer @huggingface 🤗. Making GPUs go "brrr".
OMG, the U.S. just downloaded more than 5PB of DeepSeek-R1 on @huggingface in the last few days! A late case of FOMO in Silicon Valley? 🤔🚀

Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the groundbreaking technique behind its success: scaling test-time compute 🧠💡 By giving models more "time to think," LLaMA 1B outperforms LLaMA 8B in math, beating a model 8x its size.…
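The simplest form of the "more time to think" idea is best-of-N sampling: draw several candidate answers and let a scorer pick the best one, so accuracy scales with inference compute rather than model size. Here is a toy, self-contained sketch of that pattern (the `generate_candidates` and `score` functions are stand-ins I made up for illustration, not the actual recipe from the blog post):

```python
import random

TRUE_ANSWER = 17 * 23  # ground truth for the toy math question (391)

def generate_candidates(prompt, n, rng):
    # Toy stand-in for sampling n answers from a small model:
    # each "answer" is the true product plus some noise.
    return [TRUE_ANSWER + rng.choice([-2, -1, 0, 0, 0, 1, 2]) for _ in range(n)]

def score(prompt, answer):
    # Toy stand-in for a verifier / reward model:
    # higher score means the answer looks more correct.
    return -abs(answer - TRUE_ANSWER)

def best_of_n(prompt, n, seed=0):
    # Spend more test-time compute (a larger n) to pick a better answer:
    # sample n candidates, keep the one the scorer likes most.
    rng = random.Random(seed)
    candidates = generate_candidates(prompt, n, rng)
    return max(candidates, key=lambda a: score(prompt, a))

print(best_of_n("What is 17 * 23?", n=16))  # larger n -> more likely exact
```

In a real setup the scorer is a learned verifier or process reward model, and more elaborate schemes (beam search over reasoning steps) refine the same compute/accuracy trade-off.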
We're turning @huggingface Hub's files into content-defined chunks to speed up your workflows! ⚡️ This means:
- 🧠 We store your file as deduplicated chunks
- ⏩ You only upload changed chunks when iterating!
- 🚀 Pulling changes? Only download changed chunks!
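Content-defined chunking cuts a file where a rolling hash of the bytes hits a pattern, so boundaries move with the content rather than with fixed offsets, and a local edit only disturbs nearby chunks. A minimal sketch of the idea, using a Gear-style rolling hash (this is my toy illustration, not the Hub's actual chunking algorithm or parameters):

```python
import hashlib
import random

# Gear table: one fixed pseudo-random 32-bit value per byte value.
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]

def chunk_stream(data: bytes, mask: int = 0x1FFF):
    # Gear rolling hash: each left-shift ages old bytes out of the 32-bit
    # state, so a cut depends only on the last ~32 bytes of content.
    # Local edits therefore shift only nearby boundaries; later chunks
    # resynchronize and dedupe against what is already stored.
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if (h & mask) == 0:  # ~one cut per mask+1 bytes on average
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def upload_delta(old: bytes, new: bytes):
    # Dedup by chunk hash: only chunks the store has never seen
    # need to be uploaded when you iterate on a file.
    have = {hashlib.sha256(c).digest() for c in chunk_stream(old)}
    return [c for c in chunk_stream(new)
            if hashlib.sha256(c).digest() not in have]
```

Because chunks are addressed by their hash, re-uploading an unchanged file transfers nothing, and editing one region of a large model file transfers only the handful of chunks around the edit.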
An easy way to understand Pipeline Parallelism with a self-contained implementation. Check it out!
Interested in 4D parallelism but feeling overwhelmed by the Megatron-LM codebase? We are currently cooking something with @Haojun_Zhao14 and @xariusrke 😉 In the meantime, here is a self-contained script that implements Pipeline Parallelism (AFAB + 1F1B) in 200 LOC 🧵👇
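The AFAB (all-forward-all-backward) schedule mentioned above is the simplest pipeline schedule: push every microbatch forward through all stages while caching activations, then run every backward pass and accumulate per-stage gradients. A single-process toy sketch of that schedule (not the 200-LOC script from the thread; stages here are scalar multiplications so the gradients are easy to check):

```python
# Toy AFAB (all-forward-all-backward) pipeline schedule on one process.
# Each "stage" is a scalar layer y = w * x with a matching backward rule.

def make_stage(w):
    fwd = lambda x: w * x                 # forward: scale the input
    bwd = lambda g, x: (w * g, g * x)     # backward: (grad_input, grad_w)
    return fwd, bwd

def afab(stages, microbatches):
    acts = []  # per-microbatch cached activations for the backward phase
    # Phase 1: ALL forwards, caching each stage's input.
    for mb in microbatches:
        x, cache = mb, []
        for fwd, _ in stages:
            cache.append(x)
            x = fwd(x)
        acts.append((x, cache))
    # Phase 2: ALL backwards, accumulating per-stage weight gradients.
    grads = [0.0] * len(stages)
    for out, cache in acts:
        g = 1.0  # dLoss/dout = 1 for an identity "loss"
        for s in reversed(range(len(stages))):
            _, bwd = stages[s]
            g, gw = bwd(g, cache[s])
            grads[s] += gw
    return [out for out, _ in acts], grads
```

AFAB is memory-hungry because every microbatch's activations stay alive until the backward phase; 1F1B interleaves one forward with one backward per stage to cap that activation memory, which is exactly the refinement the script in the thread adds.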
New feature on the Hub! ☁️ Carbon emissions emitted during training now show up on the model card! (requires model authors to fill in that info first) Hope it will prompt more people to show the carbon emissions of their model training! 🌍 Thanks a lot to the team who pushed…
We passed 5 million users. 🥳That's 5 million of you who have signed up on the Hub 🚀 thank you for contributing to the ecosystem and making open Machine Learning happen! We're just getting started 🤗
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long-awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context…
I am mind-blown by this new technology! AI is now embodied. And we are open-sourcing it all. Listen to @HaixuanT casually chatting with a cute robot at the @linuxfoundation: 🙂 What's your name? > I am Reachy, a robot from @pollenrobotics, I have two arms. 😀 What do you…
Llama 3 released! 🚨🔔@AIatMeta just released their best open LLM! 👑🚀 Llama 3 is the next iteration of Llama with a ~10% relative improvement over its predecessor! 🤯 Llama 3 comes in 2 sizes, 8B and 70B, with a new extended tokenizer and a commercially permissive license!…
Introducing: Zephyr 141B-A35B 🥁 🔥 Mixtral-8x22B fine-tune 🤯 Using ORPO: new alignment algorithm (no SFT, open) 🚀 With 7k instances of (open) data Very strong IFEval, BBH, AGIEval... Enjoy! 🤗 hf.co/HuggingFaceH4/…
this 30-min-read blog post on how to craft and generate a 25B+ tokens synthetic text dataset distills more information and alphas than a typical NeurIPS best paper
Huge satellite image dataset released by @ESA_EO and @huggingface 🛰️ so much to build on it!
.@esa's Φ-lab has released, in partnership with @huggingface, the 1st dataset of Major TOM (Terrestrial Observation Metaset), the largest, community-oriented, ML-ready collection of @CopernicusEU #Sentinel2 images ever published, covering over 50% of Earth: philab.esa.int/hello-major-to…
We collaborated with the European Space Agency to open-source the largest ever earth observation dataset: Major TOM Core! About half of the entire planet is covered. That's 2,245,886 patches of 1068 x 1068 pixels. At 10m resolution, we've got 256 million square km with over 2.5…