finbarr
@finbarrtimbers
modeling language
I’ve been spending time advising friends on companies they’re starting. I really like doing this! If you are starting a company, or thinking about it, please reach out if you want to chat.
Oh love this!
[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.
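The quoted thread is about keeping layers Lipschitz by constraining the spectral norms of the weights themselves rather than patching activations. As a rough illustration only (not the authors' actual method), here is a minimal PyTorch sketch that estimates a weight matrix's largest singular value by power iteration and rescales the matrix so it stays under a cap; `spectral_cap` and `project_weight_` are made-up names for this sketch.

```python
# Minimal sketch (not the thread's method): cap each weight matrix's spectral
# norm so every linear map is Lipschitz with constant <= spectral_cap.
import torch


@torch.no_grad()
def spectral_norm_estimate(W: torch.Tensor, iters: int = 10) -> torch.Tensor:
    """Estimate the largest singular value of W by power iteration."""
    v = torch.randn(W.shape[1], device=W.device)
    v = v / v.norm()
    for _ in range(iters):
        u = W @ v
        u = u / (u.norm() + 1e-12)
        v = W.t() @ u
        v = v / (v.norm() + 1e-12)
    return u @ W @ v  # approximates sigma_max


@torch.no_grad()
def project_weight_(W: torch.Tensor, spectral_cap: float = 1.0) -> None:
    """Rescale W in place so its spectral norm is at most spectral_cap."""
    sigma = spectral_norm_estimate(W)
    if sigma > spectral_cap:
        W.mul_(spectral_cap / sigma)


# Example: project every linear layer's weight after each optimizer step.
layer = torch.nn.Linear(512, 512)
project_weight_(layer.weight, spectral_cap=1.0)
```

Applied to every layer after each update, this bounds each layer's Lipschitz constant, which is the general flavor of the regulation the thread describes.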
Great list
Resources I Use For Keeping Up With AI Progress as an Outsider (People To Follow!) @peterwildeford - broad perspective, one of my top recs because of quality * quantity @krishnanrohit - broad perspective, economics/macro lens @BasilHalperin - economics + ai @zhengdongwang -…
Someone’s gonna release an actual “RL for kernel development” paper without measurement errors at some point and no one will believe it
Revisiting the old "sparse autoencoder learned to recognize cats from youtube videos" paper from 2012, and it's remarkable how similar it is to modern pretraining arxiv.org/abs/1112.6209
Counterpoint: Claude code is by far the best coding tool I’ve ever used and is notably better than everything else, despite being a thin wrapper around a model.
I think Anthropic is super overrated. It gets way too much praise. People hype it up to sophistication-signal. I think their "genius talent" is highly overrated. I think their early use of RLAIF got commoditized. There *must* be consolidation in the AI space. There's no…
great list!
Since nobody asked :-), here is my list of papers not to be missed from ICML: 1) Dion: distributed orthonormalized updates (well, technically not at ICML, but everyone's talking about it). 2) MARS: Unleashing the Power of Variance Reduction for Training Large Models 3) ...
Not gonna lie, I’m kinda disappointed Anthropic hasn’t added anything to their Max plans beyond “use Claude Code more.” I still pay $200/mo though.
I really don't get why I would use Operator, or whatever it's called now.
The way you visualize 6d parallelism is you visualize FSDP and then repeat “6” over and over to yourself
currently annoying everyone at Ai2 by switching all of our repos so you can’t merge PRs with failing checks
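For anyone who wants to do the same, one way is GitHub's branch-protection REST API, sketched below with the `requests` library. The org, repo, token variable, and check name are placeholders; swap in your repos' actual CI job names.

```python
# Sketch: require passing status checks before merging into main, via GitHub's
# branch protection endpoint. "your-org", "your-repo", GITHUB_TOKEN, and
# "ci/tests" are placeholders.
import os
import requests

OWNER, REPO, BRANCH = "your-org", "your-repo", "main"

resp = requests.put(
    f"https://api.github.com/repos/{OWNER}/{REPO}/branches/{BRANCH}/protection",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        # Block merging until these checks pass on an up-to-date branch.
        "required_status_checks": {"strict": True, "contexts": ["ci/tests"]},
        "enforce_admins": True,
        "required_pull_request_reviews": None,
        "restrictions": None,
    },
)
resp.raise_for_status()
```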
Are there any secrets to optimizing vllm for long generations?
Woof
Here is most of what I’ve gathered on the Windsurf / Google deal: the founders and dozens of engineers are going to Google. This group, along with the preferred shareholders, will be sharing the $2.4B headline number. The exact split is unknown, but investors are making some money…
Given how much media attention the weird partial acquisitions (Inflection, Character, Windsurf, etc.) are getting, it seems obvious that people will ask what the employees/investors get, so choosing not to disclose implies bad outcomes.