Spencer Cheng
@spenccheng
2x founder | AI + Construction | I build insanely fast simulators for reinforcement learning at http://puffer.ai
Since I've been getting lots of questions today - PufferAI is a private reinforcement learning lab with all OSS research and tools. Our business is helping companies solve RL problems and in-house the capabilities. DM if you would like to chat!
I’ve spent the past year learning RL from Joseph and am extremely grateful for the mentorship. Excited to share that I’ll be helping bring Puffer’s research to industry. If reinforcement learning or simulation is critical to your company, let’s chat!
PufferLib 3.0 was a truly massive amount of work. Here's a quick thread crediting some of the people involved! @spenccheng Made a ton of new environments and has been a major force behind this release. @DanAdvantage Has gotten us new envs, tons of fixes, and user support.
Puffer AI is truly a game changer. You actually don't understand how much room there is left as well. Trust me on this.
x.com/i/article/1948…
Working with Spencer on RL for self-driving has been awesome. Check out the story of what scaling up in RL looks like
x.com/i/article/1948…
Great example of utilizing domain randomization to help close the sim to real gap. Performance on real drones coming soon!
Early prototype of the new drone racing sim. Every drone here has a different size, weight, axial inertia, etc. We train the policy with reinforcement learning in <2 minutes with PufferLib. This is an extension to the original sim submitted by Fin and Sam.
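A rough sketch of what per-drone randomization could look like in C. This is not the actual sim's code: the `DroneParams` struct, the `sample_drone_params` function, the sampling ranges, and the inertia scaling are all hypothetical, chosen only to illustrate drawing a fresh set of physical parameters for each drone from a seeded RNG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-drone physical parameters (not the real sim's struct). */
typedef struct {
    float mass_kg;
    float arm_length_m;
    float inertia_xx, inertia_yy, inertia_zz;
} DroneParams;

/* Small xorshift32 generator so runs are reproducible from a seed. */
static uint32_t next_u32(uint32_t *state) {
    assert(*state != 0); /* xorshift must not be seeded with zero */
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Uniform float between lo and hi, built from the top 24 random bits. */
static float uniform(uint32_t *state, float lo, float hi) {
    float u = (float)(next_u32(state) >> 8) / 16777216.0f;
    return lo + u * (hi - lo);
}

/* Draw one randomized drone. Ranges are made-up illustration values. */
static DroneParams sample_drone_params(uint32_t *state) {
    DroneParams p;
    p.mass_kg = uniform(state, 0.5f, 1.5f);
    p.arm_length_m = uniform(state, 0.08f, 0.20f);
    /* Assumed scaling: inertia proportional to mass * arm^2, with a
       separately sampled factor per axis so the axes differ. */
    float base = p.mass_kg * p.arm_length_m * p.arm_length_m;
    p.inertia_xx = base * uniform(state, 0.4f, 0.6f);
    p.inertia_yy = base * uniform(state, 0.4f, 0.6f);
    p.inertia_zz = base * uniform(state, 0.8f, 1.2f);
    return p;
}
```

Calling `sample_drone_params` once per drone at reset gives every agent different dynamics, so the policy cannot overfit to a single airframe.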
There is so much untapped value in applying RL to niche industries without touching LLM land. I’ve been having so much fun talking to different domain experts.
And LLMs won't even be the biggest application! Massive but diffuse impact across industries. Anywhere you can build sims
Great article for anyone new to programming.
x.com/i/article/1941…
Joseph's guide provides clear tactical advice on how to learn RL. Give it a read. Build Environments. You can just learn RL.
x.com/i/article/1940…
The highest leverage thing unskilled engineers can do rn is learn to code and then build RL environments correctly. Plenty of PufferLib contributors have done so already!
Highest leverage thing unskilled engineers can do rn to contribute to frontier AI research is vibecoding RL environments
The trick to writing fast RL sims? Build it in C. Contiguous memory + memcpy is a cheat code for performance.
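A minimal sketch of the contiguous memory + memcpy idea, assuming a batch of environments that share one flat observation buffer. The names here (`EnvBatch`, `OBS_SIZE`, `env_reset`, `env_batch_snapshot`) are hypothetical and are not PufferLib's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define OBS_SIZE 4 /* hypothetical number of floats per observation */

typedef struct {
    size_t num_envs;
    float *observations; /* one contiguous block: num_envs * OBS_SIZE floats */
} EnvBatch;

/* Allocate a single zeroed buffer for every env. Returns 0 on success. */
int env_batch_init(EnvBatch *batch, size_t num_envs) {
    assert(batch != NULL && num_envs > 0);
    batch->num_envs = num_envs;
    batch->observations = calloc(num_envs * OBS_SIZE, sizeof(float));
    return batch->observations != NULL ? 0 : -1;
}

/* Env i's slice of the shared buffer; no per-env allocation needed. */
float *env_obs(EnvBatch *batch, size_t i) {
    assert(i < batch->num_envs);
    return batch->observations + i * OBS_SIZE;
}

/* Reset one env by copying a template observation into its slice. */
void env_reset(EnvBatch *batch, size_t i, const float *initial_obs) {
    memcpy(env_obs(batch, i), initial_obs, OBS_SIZE * sizeof(float));
}

/* Hand the whole batch to a consumer with a single memcpy.
   dst must have room for num_envs * OBS_SIZE floats. */
void env_batch_snapshot(const EnvBatch *batch, float *dst) {
    memcpy(dst, batch->observations,
           batch->num_envs * OBS_SIZE * sizeof(float));
}

void env_batch_free(EnvBatch *batch) {
    free(batch->observations);
    batch->observations = NULL;
    batch->num_envs = 0;
}
```

Because every observation sits back to back in memory, stepping through envs is cache friendly, and moving the full batch is one bulk copy instead of one copy per environment.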
Working on Multi-GPU RL training for the first time. Little bit of tinkering with hyperparams but then I got an expert policy in ~10 min instead of an hour. This is wild. Training Details: Same Total Timesteps: 1.8B Orange: 6 GPUs - Score: 0.995 in 10 min. 0.997 by end of…


500x cheaper training!! Puffer provides an unfair advantage to anyone in RL. Check out Joseph's article on training Neural MMO3 to see the power of simulation at scale.
x.com/i/article/1940…
At Puffer, we build sims as lofi video games. You want to be able to play your own sim. Debugging RL problems when you don't even know if UP actually goes UP is not fun.
Raylib is the best! Building envs in C is actually quite easy. Here's pong in 300 lines. It's all simple for loops and conditionals! github.com/PufferAI/Puffe… We have people who have never coded before building and contributing environments in C.
do you use raylib for this? man must be tough building envs in C