tensorqt
@tensorqt
chaos dancing star
ML is so attractive to young physicists because many physics grads aren't really looking for physics, but for the great challenge of our time. My generation still believed the LHC was about to give us the secrets of the universe. We then found our own quantum mechanics in ML
1/ RL x Physics = 💥 Laser pulse shapes are essential in studying light-matter interactions. Yet they're typically shaped while overlooking joint effects. I will be presenting our work on shaping laser pulses with RL in Alberta at @RL_Conference. ML for Science, all open source! 🧵
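The actual setup isn't in this tweet, so as a purely hypothetical sketch: framed as a Gymnasium-style environment, pulse shaping could look roughly like the toy below, where the action is a vector of pulse-shape coefficients and the reward stands in for a light-matter interaction figure of merit. The env name, observation, action parametrization, and reward are all made-up placeholders, not the authors' method.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class PulseShapingEnv(gym.Env):
    """One-step toy env: the action is a vector of pulse-shape coefficients."""
    def __init__(self, n_coeffs=4):
        super().__init__()
        self.action_space = spaces.Box(-1.0, 1.0, shape=(n_coeffs,), dtype=np.float32)
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(n_coeffs,), dtype=np.float32)
        self._target = np.zeros(n_coeffs, dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        # Hidden "optimal pulse shape" standing in for the real physics.
        self._target = self.np_random.uniform(-0.5, 0.5, self._target.shape).astype(np.float32)
        return np.zeros_like(self._target), {}

    def step(self, action):
        # Toy reward: closeness of the proposed shape to the hidden optimum,
        # where the real work would score a pulse-interaction simulation.
        reward = -float(np.sum((np.asarray(action) - self._target) ** 2))
        return np.zeros_like(self._target), reward, True, False, {}

env = PulseShapingEnv()
obs, _ = env.reset(seed=0)
_, r, *_ = env.step(env.action_space.sample())
```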
is this AGI
If anyone's interested I just made a @tensorqt simulator LLM, you really can't tell it apart from the real guy dropping it here as open source with MIT license
shadowboxing was invented to be done right after launching a run and pulling up the wandb on the big screen
you guys are testing the lich's patience by cooking him on the tl today
going to actual physics labs, unlike @tensorqt
Around 2 weeks ago, Moonshot released the open-source Kimi-K2, which is officially among the top-performing models, even compared to closed ones. Kimi-K2 is a 1T-parameter Mixture-of-Experts model, of which only 32B get activated per token (8 active experts…
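For intuition, here's a minimal sketch of the top-k MoE routing the tweet describes (8 active experts per token): a router scores every expert, each token runs only through its top k. The dimensions and expert count below are illustrative, not Kimi-K2's actual configuration.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=64, k=8):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # each token picks its top-k experts
        weights = weights.softmax(dim=-1)           # normalize the k gate weights
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # only the chosen experts run
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

x = torch.randn(16, 512)
print(TopKMoE()(x).shape)  # torch.Size([16, 512])
```

Only the selected experts' parameters touch each token, which is how a 1T-parameter model can cost roughly 32B parameters per forward pass.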
silence first-name-last-name-face-pfp poster, pseudoanon poaster is speaking
if you're in environments pivot to environments building environments
What happens when the models are smart enough to: 1. crawl the web 2. discover millions of verifiable problems 3. rank em all by expected value 4. implement them as novel RL environments What custom RL envs remain worth building in this world (I think prob 1-2 years away)?
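Concretely, a "verifiable problem" here is one where a checker program can score a candidate answer, so it becomes an RL reward for free. The interface below is a hypothetical illustration of that idea, not anyone's actual framework.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VerifiableProblem:
    prompt: str                      # task statement crawled from the web
    verify: Callable[[str], float]   # deterministic score for a candidate answer
    expected_value: float = 0.0      # model's estimate of training value, for ranking

def as_rl_reward(problem: VerifiableProblem, candidate: str) -> float:
    # The env's reward is just the verifier's score; no human labels needed.
    return problem.verify(candidate)

# e.g. a crawled arithmetic task becomes an environment automatically:
p = VerifiableProblem(prompt="What is 17 * 24?",
                      verify=lambda s: float(s.strip() == "408"))
print(as_rl_reward(p, "408"))  # 1.0
```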
met @elyxlz, @JMadaluni and the team from @audiogenai this week (in Rome!). incredibly strong team, very bullish on them building some really good stuff
imo we often overlook the problem of catastrophic forgetting in pretraining, in the sense that we think it's solely a post-training problem. The fact that the model doesn't update robustly means you are likely updating wrong, and this is true both in pretraining and…
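The effect itself is easy to reproduce; here's a minimal toy sketch of measuring it: fit a model on task A, keep training on task B only, and watch the loss on A degrade. Purely illustrative (a linear model on synthetic data), but the same measurement applies to pretraining data mixtures.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(2, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
mse = nn.MSELoss()

x = torch.randn(256, 2)
y_a = x @ torch.tensor([[1.0], [0.0]])   # "task A": predict the first feature
y_b = x @ torch.tensor([[0.0], [1.0]])   # "task B": predict the second feature

def fit(y, steps=200):
    for _ in range(steps):
        opt.zero_grad()
        mse(model(x), y).backward()
        opt.step()

fit(y_a)
print("loss on A after A:", mse(model(x), y_a).item())  # near zero
fit(y_b)  # keep updating on B only, no replay of A
print("loss on A after B:", mse(model(x), y_a).item())  # large again: forgetting
```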