xjdr

@_xjdr

ptx enjoyer

Noam's Labyrinth

Joined December 2023

595 Following

24K Followers

Pinned

xjdr@_xjdr · Mar 24, 2024

Writing jitted jax code is like playing Dark Souls but in python

391

245.0K

xjdr@_xjdr · 2 h

before i invest as an LP i'd love to know what your firm's DD processes and policies are? ummm, some twitter trolls tell us if the vibes are fire or cap

2.0K

xjdr@_xjdr · 11 h

i believe that B200s are widely available despite export restrictions. i very much do not believe GB200 NVL 72s are available.

4.0K

xjdr@_xjdr · Jul 23

this is my life now

183

10.0K

xjdr@_xjdr · Jul 23

doing my best @basedjensen impression today. adding slurm admin, k8s admin, distributed filesystem manager to my dating profile

xjdr@_xjdr · Jul 23

training futuristic superintelligence using HPC management software written 2002

6.0K

xjdr@_xjdr · Jul 23

training futuristic superintelligence using HPC management software written 2002

10.0K

xjdr@_xjdr · Jul 23

RIP King

meowbooks@untitled01ipynb · Jul 22

🦇mama, i’m coming home 🦇

4.0K

xjdr@_xjdr · Jul 23

qwen3_coder slaps. congrats to the qwen team

181

12.0K

xjdr@_xjdr · Jul 21

with the exception of like 20 people, i simply do not believe that all of a sudden all of you are working on "continuous learning" and "novel agent environments". smh

375

21.0K

xjdr@_xjdr · Jul 21

👀👀

John Langford@JohnCLangford · Jul 20

Apparently Dion is now being worked on for Torch Titan: github.com/pytorch/torcht… :-)

3.0K

xjdr@_xjdr · Jul 20

this actually made me lol but - when it comes to quantization, the comparison isn't apples-to-apples with GPUs. we have an approach called truepoint that uses mixed-precision storage but maintains mathematically lossless accumulation in HW during compute. diff architectures,…

Hatice Ozen@ozenhati · Jul 20

12.0K

xjdr@_xjdr · Jul 20

finbarr@finbarrtimbers · Jul 20

Counterpoint: Claude code is by far the best coding tool I’ve ever used and is notably better than everything else, despite being a thin wrapper around a model.

131

10.0K

xjdr@_xjdr · Jul 19

What are the current best practices (repos?) for using Megatron-Core for large scale training? Trying to repro something and any time saved beating my head against the wall would be greatly appreciated.

7.0K

xjdr@_xjdr · Jul 19

i feel the same

Noam Brown@polynoamial · Jul 19

It’s truly a privilege to be able to wake up every morning, see where the latest intelligence frontier is, and help push it a little further.

103

8.0K

xjdr@_xjdr · Jul 19

i feel like this is the proper framing for the upside take from today's OAI announcement. "This new approach has yielded impressive improvements wrt the IMO problem set and is likely to further generalize which is very exciting." i endorse this take

Rohan Pandey@khoomeik · Jul 19

this IMO gold will fly past us as quickly as the turing test did. soon normies will say “duh of course they’re good at math, they’re computers”. but the RL breakthroughs the team made to solve math (congrats!!) will likely generalize to environments with much higher direct value

224

15.0K

xjdr@_xjdr · Jul 19

This is still the announcement I am most excited about in the past week

Misha Laskin@MishaLaskin · Jul 16

Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.

104

10.0K

xjdr@_xjdr · Jul 19

"General Purpose" RL

You Jiacheng@YouJiacheng · Jul 19

but still \boxed 🤔

4.0K

xjdr@_xjdr · Jul 19

oooo formalization and verifiers about to become so hot right now (for the unwashed masses). looking forward to all the unhinged takes (from the unwashed masses)

292

20.0K

xjdr@_xjdr · Jul 16

I've been really excited about this team and this launch for a while now. Really looking forward to getting my hands on it!

Misha Laskin@MishaLaskin · Jul 16

Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.

7.0K

xjdr@_xjdr · Jul 16

DTensor is an abomination

10.0K