cider
@jeffreycider
my purpose in life is to forget linear algebra 2x a year
linear transformations stretch euclidean space. ReLU folds euclidean space. neural networks are just repeated origami on high-dimensional laffy taffy
optimization theorem: "assume a lipschitz constant L..." the lipschitz constant:
[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.
your mom grew up in an urban part of the US, east asia, or india. urban here = she was near busy roads. - do you have autoimmune conditions (e.g. severe allergies, eczema)? - are you an eldest child?
i never learned how to use ray. i was too poor and then i was too rich
neither verifiers nor prime-rl use ray. ray is great for orchestration in a single big cluster. it’s overkill at small scales imo, but the bigger issue is if you want to go multi-datacenter. RL is very well suited to highly distributed training, but your stack needs to support it
lost another evening to 🔪 JAX - the sharp bits 🔪. didn't realize my computational graph could be sabotaged by NaNs from the shadow realm

i'm at a ~serious~ lab. people explain things to me in chinese, and i have to tell them in english that i'm a banana
> i love math so much, i hope my new job involves a lot of it
monkey's paw curls

first chess tournament in 17 years. i was so scared of getting demolished by a 2nd grader, but joke's on me: my elo was literally not high enough to get paired with any children
my friends don't want fuck you money per se, bc they don't want to say fuck you to people. what they really want is plot armor money
one week ago: idk, fp8 quantization seems pretty straightforward. 7 days later: do you know what continuity is? it's the thousand blades of dedekind cuts, a story we agree to tell each other over and over, until we forget that it's a lie.

as a geometry lover, i've been disappointed at every turn that subfields named "geometry" (like differential geometry) are mostly symbolic reasoning and analysis. then i find out that this one field under "analysis" is mostly geometry. they have played us for fools



sleep evolutionarily predates brains. sleep might be more fundamental than brains

modern homeopathy is discovering that refilling a hint water bottle with tap water creates more hint water
