Tessa Barton

@tessybarton

GPU purse inventor. AI Research Scientist. Prev: @MosaicML x @Databricks, @Snapchat

San Francisco CA

Joined October 2011

1K Following

3K Followers

Pinned

Tessa Barton@tessybarton · May 25

The Lion Optimizer doesn’t concern himself with the opinions of the second moment

265

28.0K

Tessa Barton Retweeted

José Luis Ricón Fernández de la Puente@ArtirKel · Jun 30

Links (89) nintil.com/links-89

2.0K

Tessa Barton Retweeted

Steve Hou@stevehou0 · Jun 29

Amazing. Incredibly based. @tessybarton

31.0K

Tessa Barton Retweeted

Nathalie Gouailhardou@nathaliegou · Jun 27

We doubled our users last month (and this month too)💅 PS: July batch sold out between filming and posting this, so the next chance to join is October

6.0K

Tessa Barton@tessybarton · Jun 26

📢Now open, Gemma 3n weights & it is natively flexible, first of its kind, thanks to MatFormer🪆 Any model between E4B & E2B with ZERO training near Pareto -- we found a bunch! Find a better E3B than what we released, I will send you a 🪆😉 Find the colab for extraction 🧵👇🪆

Google AI Developers@googleaidevs · Jun 26

Announcing the full release of Gemma 3n, bringing powerful multimodal capabilities to edge devices for developers 🙌 ↓ developers.googleblog.com/en/introducing…

135

34.0K

Tessa Barton@tessybarton · Jun 23

GPT 4.1 is as good as GPT 4.5 for English and Spanish. But for Indic languages I see a whole letter grade difference in basic math, on questions like "If I have 4 eggs and eat 2 how many are left?" Does distillation come at the cost of multilingual performance?

1.0K

Tessa Barton@tessybarton · Jun 22

In the olden days, pretraining via autoregressive decoding was considered a gruesome punishment.

105

4.0K

Tessa Barton Retweeted

heiner@HeinrichKuttler · Jun 17

We've been using uv for a few months now and I've never felt better. I have more energy. My skin is clearer. My eyesight has improved.

1.0K

108

92.0K

Tessa Barton@tessybarton · Jun 9

Honored to win the robotics track! This was a lot of fun. The other talks were awesome from @physical_int @Waymo @Tesla_Optimus @GoogleDeepMind @nvidia @kscalelabs

swyx@swyx · Jun 8

Congrats to @aiDotEngineer 2025 Best Speakers! MCP: @zeeg Tiny Teams: @alxai_ LLM Recsys: @devanshtandon_ GraphRAG: @danielchalef Fortune 500 Day 1: @hwchase17 Architects Day 1: @denyslinkov Infra: @dylan522p Voice: @bnicholehopkins Product Management: @bbalfour Agent…

3.0K

Tessa Barton Retweeted

🌐 🇺🇸🇺🇦Mar G-O 🏳️‍🌈@MariGO2thepolls · Jun 7

Having a fun time (:

127

5.0K

Tessa Barton Retweeted

Kevin Kwok@antimatter15 · May 25

I spent a few hours today trying to reverse engineer Google's new Gemma 3n model which was published to HuggingFace as a compiled binary. I wanted to figure out how exactly they cram a model supposedly striking distance from Claude 3.7 Sonnet on LMArena into 2GB of RAM.

6.0K

Tessa Barton Retweeted

Oana Florescu 🦋👩‍🎤@0xflores · May 21

🧵 OpenRS-Star – Efficient RL Tuning for Reasoning. Can reinforcement learning make small LLMs better reasoners under tight compute budgets? We RL-tune Qwen3-1.7B for < $100 and achieve 50% on AIME24 (+13.3%). Code: github.com/flores-o/openr… Model: huggingface.co/oanaflores/Ope…

2.0K

Tessa Barton@tessybarton · May 21

Did you hear about the new Gemini music model? It’s gotta be trained using Bach-propagation

5.0K

Tessa Barton@tessybarton · May 9

Oddly satisfying

1.0K

Tessa Barton@tessybarton · Apr 24

Q: Why did the GPU bag travel to Indio? A: It wanted to train a CoacheLLM! @TristanHeywood

2.0K

Tessa Barton@tessybarton · Apr 8

Mastering back-paw-pagation

1.0K

Tessa Barton Retweeted

PALLADIUM Magazine@palladiummag · Mar 31

Despite calling our governments democracies, our societies don’t know how to productively harness this potent political force. Whoever does so first is destined to reshape governance in the 21st century.

5.0K