Ross Taylor
@rosstaylor90
Universal intelligence at @GenReasoning. Previously lots of other things like: Llama 3/2, Galactica, Papers with Code.
Congrats @n_latysheva !
Introducing AlphaGenome: an AI model to help scientists better understand our DNA – the instruction manual for life 🧬 Researchers can now quickly predict what impact genetic changes could have - helping to generate new hypotheses and drive biological discoveries. ↓
What seems like an exponential in AI is just a series of S curves. Each era rides on a wave of increasing compute but finds a new way to utilise it - overcoming limitations of the previous stage. Eg pre-training was the dominant way to utilise compute, but the limitations of…
It’s funny that people on this site think major LLM efforts are talent-bound rather than org-bound. The talent differential has never been big between major orgs. Most of the difference in outcomes is due to organisational factors - like allocating compute to the right bets, and…
Nice work on prediction vs understanding.
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: a transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
If you take ASI seriously, then you care about where you want to build it and who you want to build it for.
Too many are being sanctimonious about human intelligence in face of the first real thinking machines. They'll be left behind like many who failed to understand technology in the past.
The rise of reasoning machines
And a debate that doesn't warrant repeating.
interconnects.ai/p/the-rise-of-…
Finally proof that a British accent makes you smarter.
Definitely weird stuff with "o1 pro"
- Says it's o3
- It has access to memory tool
- Can search the web
- It says "optimised" not "optimized" (only o3 slips into British English)
Feels like o3 pro @apples_jimmy @chatgpt21 @btibor91 @iruletheworldmo @scaling01 @kimmonismus @chetaslua
The best way to judge new results in ML is how much complexity they introduce for their stated performance gain. Most new things get small improvements for large complexity gains. They trade on novelty bias in the short term, and nerd-snipe people into thinking their approach is…
This is a nice thread by @MinqiJiang.
It's so fun to see RL finally work on complex real-world tasks with LLM policies, but it's increasingly clear that we lack an understanding of how RL fine-tuning leads to generalization. In the same week, we got two (awesome) papers: Absolute Zero Reasoner: Improvements on code…
RL is very expensive compared to SFT, which makes it impractical to scale for most folks outside of big labs. And yet, RL is perfect for businesses because you can optimise the metric you actually care about. Not the next token; but the next sale or the next customer. Already…
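The SFT-vs-RL objective contrast above can be shown in a toy sketch. This is a minimal, illustrative example, not anyone's production setup: the one-step "policy" is a single softmax over four tokens, and the reward function (token 2 "makes the sale") is entirely hypothetical. SFT pushes up the log-likelihood of the token the dataset contains; the REINFORCE-style RL step samples a token and reinforces it in proportion to an arbitrary scalar reward — which is why the reward can be any business metric rather than next-token likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 4  # tiny toy vocabulary

def probs(logits):
    # Softmax over logits, numerically stabilised.
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sft_step(logits, observed_token, lr=0.5):
    # SFT: gradient ascent on log p(observed_token).
    p = probs(logits)
    grad = -p
    grad[observed_token] += 1.0  # d log p(token) / d logits
    return logits + lr * grad

def rl_step(logits, reward_fn, lr=0.5):
    # REINFORCE: sample an action, scale its log-prob gradient by reward.
    p = probs(logits)
    token = rng.choice(VOCAB, p=p)
    r = reward_fn(token)  # e.g. 1.0 if this reply led to a sale
    grad = -p
    grad[token] += 1.0
    return logits + lr * r * grad

# Hypothetical business metric: only token 2 converts to a sale.
reward = lambda t: 1.0 if t == 2 else 0.0

# SFT imitates its dataset, which always contains token 1.
sft_logits = np.zeros(VOCAB)
for _ in range(100):
    sft_logits = sft_step(sft_logits, observed_token=1)

# RL chases the reward instead, and drifts toward token 2.
rl_logits = np.zeros(VOCAB)
for _ in range(500):
    rl_logits = rl_step(rl_logits, reward)

print(probs(sft_logits).argmax())  # the token the data contained
print(probs(rl_logits).argmax())   # the token the reward favoured
```

The point of the sketch is that the two loops share the same gradient machinery; only the signal differs — a fixed dataset label for SFT versus sampled rollouts scored by a reward for RL, which is also why RL needs far more samples (and compute) to converge.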
When your model has emergent swearing in its internal monologue.

Neural networks were once in the “graveyard of ideas” because the conditions weren’t right for them to shine (data, hardware). So maybe it’s a waiting room rather than a graveyard 🙂. I’m not sure a lot of the ideas below are dead actually - eg SSM-transformer hybrids look more…
Making a list of graveyard of ideas, the ultimate nerd snipes where efforts go and die
DPO-*variant
SSM-transformer hybrids
SAEs
MCTS
Diffusion for large vision models
Attention-less
JEPA (lecun lovers)
what else?
Happy Qwen day to all who celebrate.
Introducing Qwen3! We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general…
All that is old is new again.
how sure are we that one epoch is optimal for pretraining in the data-scarce regime