Joseph Suarez (e/🐡)
@jsuarez5341
I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. https://puffer.ai
PufferLib 3.0: We trained reinforcement learning agents on 1 Petabyte / 12,000 years of data with 1 server. Now you can, too! Our latest release includes algorithmic breakthroughs, massively faster training, and 10 new environments. Live demos on our site. Volume on for trailer!
[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.
Doing a few things in person in SF this week, so no streams for a bit. But I have all Saturday booked! Either a 6-DoF arm from scratch or a materials science sim. Leaning towards materials science because I'm sick of messing up quaternion transforms
Quick RL poll brought to you by being stuck in traffic. What article do you want next? Also @x please add article drafting to mobile
The AI boom makes a lot more sense if you view 90% of companies as marketing wrappers for existing tech. This bothers people like me more than it should because I remember when 90% of the companies were pure tech with nerds doing their best to also sell
Reinforcement Learning Research Live x.com/i/broadcasts/1…
I propose we hold an IMO gold tiebreaker. The winner is whoever doesn't lobby for AI regulation to favor incumbents.
I post so much about how good RL is with PufferLib that I'm realizing it sounds increasingly grifty. Please just go try it. It's free. If you did RL years ago, it will feel like a different field. We have new programmers doing RL on custom sims. That wasn't a thing before.
Kyoung is the most meticulously organized person I've ever worked with. Spun up in AI in around a year with a rare level of discipline
I recently started using the Apple Vision Pro (kind of late, right?) and was amazed at how well they nailed the eye-tracking. The AVP's eye-tracking calibration is performed against black, gray, and white backgrounds, and I happen to know from my previous life (i.e., during my…
x.com/i/article/1946…
When life randomizes your size, mass, axial inertia, etc. just generalize super-human control!
Found the bug. Trains in 2 minutes. No collisions/randomization for first test but pretty zippy! Multi-task + domain randomized next