Vedanuj Goswami

@vedanujg

Research Engineer @MetaAI

Menlo Park, CA

Joined October 2016

477 Following

613 Followers

Vedanuj Goswami Retweeted

Ahmad Al-Dahle@Ahmad_Al_Dahle · Apr 7

We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models. That said, we're also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were…

Vedanuj Goswami@vedanujg · Apr 5

🦙Checking our first set of Llama4 models: MoE+native multimodal+10M ctx!!! more to come!🦙 blog: ai.meta.com/blog/llama-4-m…

lmarena.ai@lmarena_ai · Apr 5

BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥 Highlights: - #1 open model, surpassing DeepSeek - Tied #1 in Hard Prompts, Coding, Math, Creative Writing - Huge leap over Llama 3 405B: 1268 → 1417 - #5 under style control…

Vedanuj Goswami Retweeted

Abhishek Kadian@abhisk_kadian · Sep 25

Llama3.2 models are here 🎉! We are releasing the multimodal and lightweight Llama models.

Vedanuj Goswami Retweeted

Mike Lewis@ml_perception · Jul 23, 2024

So excited for the open release of Llama 3.1 405B - with MMLU > 87, it's a really strong model and I can't wait to see what you all build with it! llama.meta.com Also check out the paper here, with lots of details on how this was made: tinyurl.com/2z2cpj8m

Vedanuj Goswami Retweeted

Abhishek Kadian@abhisk_kadian · Jul 23, 2024

Llama3.1 is out model: llama.meta.com/llama-download… paper: ai.meta.com/research/publi…

Vedanuj Goswami@vedanujg · Jul 23, 2024

Very excited to release the technical report and the model weights for all 3 sizes of llama3 models. It has been an exciting past 12 months. Really looking forward to the incredible research this will unlock from the community. Now on to llama4 🚀

AI at Meta@AIatMeta · Jul 23, 2024

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context…

Vedanuj Goswami Retweeted

Emily Dinan@em_dinan · Jul 23, 2024

as my other amazing teammates have already shared, check out our llama 3.1 paper here! lots of fun tidbits about the highs, lows, sweat, and tears that go into training LLMs lol ... onto llama 4!!! ai.meta.com/research/publi…

Vedanuj Goswami@vedanujg · Jul 23, 2024

tldr; you can go a long way in pre-training by (1) curating amazing data, (2) using a lot of FLOPs, and (3) otherwise not screwing up. All three are harder than they sound, so read the paper... That said, I'm amazed by our progress since Llama 3 - expect big things from Llama 4!

Mike Lewis@ml_perception · Jul 23, 2024

Vedanuj Goswami Retweeted

Sergey Edunov@edunov · Jul 23, 2024

x.com/i/article/1815…

Vedanuj Goswami Retweeted

Culture Crave 🍿@CultureCrave · Jun 11, 2024

‘The Lord of the Rings: The War of the Rohirrim’ anime details • Runtime is ~2 hours and 30 minutes • Peter Jackson is exec producing • Main protagonist is Helm Hammerhand’s daughter Hèra • Set ~183 years before the events of the films

Vedanuj Goswami@vedanujg · Apr 24, 2024

It is really amazing how good these models turned out to be!

Ivan Fioravanti ᯅ@ivanfioravanti · Apr 23, 2024

Look at this! Llama-3 70B english only is now at 1st 🥇 place with GPT 4 turbo on @lmsysorg Chatbot Arena Leaderboard🔝 I did some rounds too and both 8B and 70B were always the best models for me. Incredible achievement @AIatMeta
