akhil
@fkasummer
If you're still prompting, it's not AGI. @braidbrowser
this has been the most productive week of my life i hate to say it but sometimes you have to rally — sleep can wait
have 26 different experiments to run and i’ve been sleeping 3h for the last four days but this is so insane a possibility there is no stopping now
Best things to read on the optimal structure and governance of scientific or research organisations, especially national labs? Some examples below of the kind of works I have in mind...
There’s always only one founder. It’s just a question of whether you find out who that is four years later or at the very beginning.
co-founders don't need to be equal. It just happens because it's *easy*, but almost all companies end up with a dominant output from ONE co-founder.
it has to do more than that to deliver on expectations
GPT-5 demonstrates improved performance across various domains, with its most notable gains in practical software engineering problems. Read the full analysis: theinformation.com/articles/opena… #softwaredevelopment
I’m going to read this paper today. On a cursory look I suspect it fails the bitter lesson.
I suspect generalized reasoning was solved just a few weeks ago and it flew completely under the radar. HRM, a new arch, reportedly has SOTA results on ARC-AGI 1 & 2 benchmarks with only 27 million parameters and ~1k training examples.
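For intuition only: the sketch below shows a generic two-timescale recurrent loop, a slow "planning" state guiding a fast "computation" state, in the spirit of a hierarchical recurrent model. All shapes, names, and update rules here are illustrative assumptions, not HRM's actual equations.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32          # hidden size (illustrative)
T = 4           # fast low-level steps per slow high-level step (illustrative)

W_l = rng.normal(0, 1 / np.sqrt(d), (d, d))  # low-level recurrence
W_h = rng.normal(0, 1 / np.sqrt(d), (d, d))  # high-level recurrence

def step(x, n_cycles=3):
    z_h = np.zeros(d)                      # slow "planning" state
    z_l = np.zeros(d)                      # fast "computation" state
    for _ in range(n_cycles):
        for _ in range(T):                 # fast inner loop, conditioned on z_h
            z_l = np.tanh(W_l @ z_l + z_h + x)
        z_h = np.tanh(W_h @ z_h + z_l)     # slow outer update
    return z_h

out = step(rng.normal(0, 1, d))
print(out.shape)  # (32,)
```

The appeal of nesting recurrence like this is that depth comes from iteration rather than parameters, which is one plausible way a 27M-parameter model could punch above its weight.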
this notion of taste sounds bizarre to me. eyes can name but only the hands can know
Being too ambitious is a clever form of self-sabotage maalvika.substack.com/p/being-too-am… (news.ycombinator.com/item?id=444679…)
10pm on a friday night, good coffee, and ~320 pages of reading i need to finish asap. let's go
I can't get over the fact that so many engineers still don't grok the fundamentals of what an LLM is. Repeat after me: it's just pattern matching, it doesn't "know" anything
god bless OpenAI
The most important thing we’ve learned about GPT-5 is this: OpenAI execs told investors the same architecture can scale all the way to GPT-8. That means we’re not just looking at one powerful model, we’re looking at a whole roadmap.
In Europe, thousands are being convicted for the crime of criticizing their own governments. This Orwellian message won't fool the United States. Censorship is not freedom.
#EuropeProtects #DSAProtects #DemocracyNotAlgoracy
likely the greatest intuitive call of my life
I feel like language models use space more than time for representation. That doesn't necessarily have to be the case. Intuitively I'd want the space-heavy vs. time-heavy tradeoff to be dynamic at runtime. It doesn't make sense for intelligence to have a fixed resource tradeoff.
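One hedged way to make that tradeoff dynamic at runtime is a halting loop in the spirit of adaptive computation time, where harder inputs get more recurrent time steps instead of a fixed parameter budget. The names, shapes, and halting rule below are illustrative assumptions, not an existing model.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
W = rng.normal(0, 1 / np.sqrt(d), (d, d))
w_halt = rng.normal(0, 1 / np.sqrt(d), d)

def adaptive_steps(x, threshold=0.99, max_steps=20):
    """Iterate the recurrence until cumulative halting probability
    passes the threshold (or a hard step cap is hit)."""
    h = np.tanh(W @ x)
    cum_p = 0.0
    steps = 1
    while cum_p < threshold and steps < max_steps:
        p = 1 / (1 + np.exp(-(w_halt @ h)))  # halting probability from state
        cum_p += p * (1 - cum_p)             # accumulate mass toward halting
        h = np.tanh(W @ h + x)               # spend one more "time" step
        steps += 1
    return h, steps

h, n = adaptive_steps(rng.normal(0, 1, d))
print(n)  # number of recurrent steps actually spent on this input
```

The point of the sketch: `steps` is decided per input at inference time, so compute spent in time is no longer a fixed architectural constant the way width (space) is.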
ReLU MLP with width / depth going to infinity. Note how a different parameterization produces pathological scaling behavior (yellow / blue on activations / gradients of the weights). muP solves this.
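A toy demonstration of the width-scaling issue, under stated assumptions (this is a forward-pass-only simplification; the full muP prescription also rescales learning rates and the output layer): with init variance that ignores fan-in, the pre-activation RMS of a ReLU layer grows like sqrt(width), while 1/sqrt(fan_in) scaling keeps it O(1) as width grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def preact_rms(width, scaled):
    """RMS of a ReLU layer's output at init, with or without fan-in scaling."""
    std = 1 / np.sqrt(width) if scaled else 1.0
    W = rng.normal(0, std, (width, width))
    x = rng.normal(0, 1, width)
    h = np.maximum(W @ x, 0)  # ReLU
    return float(np.sqrt(np.mean(h ** 2)))

for w in (64, 256, 1024):
    # unscaled init blows up with width; fan-in-scaled init stays ~constant
    print(w, round(preact_rms(w, scaled=False), 2),
             round(preact_rms(w, scaled=True), 2))
```

This is the yellow-vs-stable picture in miniature: whether activation statistics stay width-independent is entirely a property of how the parameterization scales with width.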