Vedanuj Goswami
@vedanujg
Research Engineer @MetaAI
We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models. That said, we're also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were…
🦙Checking our first set of Llama4 models: MoE+native multimodal+10M ctx!!! more to come!🦙 blog: ai.meta.com/blog/llama-4-m…
BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥
Highlights:
- #1 open model, surpassing DeepSeek
- Tied #1 in Hard Prompts, Coding, Math, Creative Writing
- Huge leap over Llama 3 405B: 1268 → 1417
- #5 under style control…
Llama3.2 models are here 🎉! We are releasing the multimodal and lightweight Llama models.
So excited for the open release of Llama 3.1 405B - with MMLU > 87, it's a really strong model and I can't wait to see what you all build with it! llama.meta.com Also check out the paper here, with lots of details on how this was made: tinyurl.com/2z2cpj8m
Llama3.1 is out model: llama.meta.com/llama-download… paper: ai.meta.com/research/publi…
Very excited to release the technical report and the model weights for all 3 sizes of llama3 models. It has been an exciting past 12 months. Really looking forward to the incredible research this will unlock from the community. Now on to llama4 🚀
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context…
as my other amazing teammates have already shared, check out our llama 3.1 paper here! lots of fun tidbits about the highs, lows, sweat, and tears that go into training LLMs lol ... onto llama 4!!! ai.meta.com/research/publi…
tldr; you can go a long way in pre-training by (1) curating amazing data, (2) using a lot of FLOPs, and (3) otherwise not screwing up. All three are harder than they sound, so read the paper... That said, I'm amazed by our progress since Llama 3 - expect big things from Llama 4!
‘The Lord of the Rings: The War of the Rohirrim’ anime details
• Runtime is ~2 hours and 30 minutes
• Peter Jackson is exec producing
• Main protagonist is Helm Hammerhand’s daughter Hèra
• Set ~183 years before the events of the films
It is really amazing how good these models turned out to be!
Look at this! Llama-3 70B (English only) is now in 1st 🥇 place, together with GPT-4 Turbo, on the @lmsysorg Chatbot Arena Leaderboard🔝 I did some rounds too, and both 8B and 70B were always the best models for me. Incredible achievement @AIatMeta