@saiprasanna.in 🦋
@sai_prasanna
🤖 = 42. Following the gradient of interestingness. RL² research (Reinforcement Learning for Robot Learning). http://saiprasanna.in
Excited to share our research on improving zero-shot generalization (ZSG) of world models for Contextual RL. Joint work with @karim_farid0, @RaghuSpaceRajan, and @AndreBiedenkapp, to be presented at RLC 2024 (@RL_Conference).

🚨 Excited to share our new work: "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning"! 📈 We propose gradient interventions that enable stable, scalable learning, achieving significant performance gains across agents and environments! Details below 👇
really excited about this new work we just put out, led by my students @creus_roger & @johanobandoc, where we examine the challenges of gradient propagation when scaling deep RL networks. roger & johan put a lot of work and care into this project, check out more details in 🧵👇🏾!
Excited to share that I'll be starting as a postdoc in Rose Bagot's lab at @mcgillu, studying the neural mechanisms of psychedelics ✨🐁🌀 Grateful for this new adventure.
The data science revolution is getting closer. TabPFN v2 is published in Nature: nature.com/articles/s4158… On tabular classification with up to 10k data points & 500 features, TabPFN outperforms all other methods on average in 2.8 s, even when those methods are tuned for up to 4 hours 🧵1/19
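For context, the released `tabpfn` package exposes a scikit-learn-style interface; here is a minimal usage sketch (exact constructor kwargs may differ between versions, so treat the defaults below as assumptions):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()      # pretrained transformer; no task-specific training
clf.fit(X_train, y_train)     # "fit" mostly stores the data as in-context examples
preds = clf.predict(X_test)   # a forward pass does the classification
print(accuracy_score(y_test, preds))
```

Because the model does in-context learning rather than gradient fitting, there is no per-dataset training loop, which is presumably how it stays in the seconds range.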
Happening now, come by! West Ballroom A-D #6404 📅 12 Dec, 4:30–7:30 PM
we've used Atari games as an RL benchmark for so long, but for a little while it's bugged me that it's a discrete action problem, since the original joysticks were analog... @JesseFarebro & i fix this by introducing the Continuous ALE (CALE)! read thread for details! 1/9
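To make the idea concrete, here's my paraphrase of how a continuous joystick action could be reduced to ALE's discrete events (an illustrative sketch, not the released CALE API; the threshold value and octant layout are assumptions):

```python
import math

DIRECTIONS = ["UP", "UPRIGHT", "RIGHT", "DOWNRIGHT",
              "DOWN", "DOWNLEFT", "LEFT", "UPLEFT"]

def polar_to_joystick(radius, theta, fire, threshold=0.5):
    """Map a continuous (radius, theta, fire) action to a discrete ALE-style event."""
    pressed = fire > threshold
    if radius <= threshold:            # stick near the centre: no direction
        return "FIRE" if pressed else "NOOP"
    # Snap the angle to one of the joystick's eight octants.
    octant = int(((theta + math.pi / 8) % (2 * math.pi)) // (math.pi / 4))
    name = DIRECTIONS[octant]
    return name + "FIRE" if pressed else name
```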
DARK lab is now also on That Other Place 🦋. Our members are already actively discussing AI on the butterfly. Come be one of our first followers! Handle: UCL-DARK
My bsky feed now has more signal than here. Looks more and more like it's so over here, with probability > 90%. Let's see.
Lol, my ML/AI feed here has grown more or less silent and shifted to bsky completely. Join us there 🦋
Introducing the Latent Program Network (LPN), a new architecture for inductive program synthesis that builds in test-time adaptation by learning a latent space that can be used for search 🔎 Inspired by @arcprize 🧩, we designed LPN to tackle out-of-distribution reasoning tasks!
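A toy sketch of what test-time latent search could look like under this description (the encoder/decoder callables and the MSE objective are hypothetical stand-ins, not the paper's actual code): refine a latent z by gradient descent until decoding it reproduces the demonstration pairs.

```python
import torch

def latent_search(encoder, decoder, demo_inputs, demo_outputs,
                  steps=100, lr=1e-2):
    """Gradient-based search in latent space (hypothetical interfaces)."""
    # Initialize the latent from the encoder's guess for the demo pairs.
    z = encoder(demo_inputs, demo_outputs).detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        pred = decoder(demo_inputs, z)  # decode latent into predicted outputs
        loss = torch.nn.functional.mse_loss(pred, demo_outputs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()  # then apply decoder(test_input, z) to the held-out input
```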
At @ChandarLab, we are happy to announce the second edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad-app-help/. Deadline: Nov 15!
@creus_roger just added a Parallel Q-Networks (PQN) implementation to @cleanrl_lib! 🚀 PQN is DQN without a replay buffer and target network. You can run PQN on GPU environments or vectorized environments. E.g., in envpool, PQN matches DQN's score in 1/10th the time
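A bare-bones sketch of the update this describes (placeholder names; the actual @cleanrl_lib implementation additionally uses Q(λ) returns and network normalization for stability): regress Q-values on bootstrapped targets from the online network itself, using fresh on-policy batches from the vectorized envs instead of a replay buffer.

```python
import torch
import torch.nn as nn

def pqn_update(q_net, optimizer, obs, actions, rewards, next_obs, dones,
               gamma=0.99):
    """One PQN-style update: no replay buffer, no target network."""
    with torch.no_grad():
        # Bootstrap from the *online* network; there is no target copy.
        next_q = q_net(next_obs).max(dim=-1).values
        targets = rewards + gamma * (1.0 - dones) * next_q
    q = q_net(obs).gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    loss = nn.functional.mse_loss(q, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```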
SIMPL outperforms CEBRA (a contemporary, more general-purpose, neural-net-based technique) in both performance and compute time: it's over 30x faster. 12/21
We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵
Bluesky is hitting a big threshold of ML people migrating from this garbage pile! Join us there!! So many starter packs to fill your feed, without a puppet master throwing political garbage at you.
If you're considering moving on from this garbage feed, AI bsky is hitting its hockey-stick era
What weird audio latent spaces are you currently inhabiting? Me: open.spotify.com/album/3kH1Vfvp…
Object slots are great for compositional generalization, but can models without these inductive biases learn compositional representations without supervision too? Yes! Unsupervised learning on object videos yields entangled, yet compositional latent codes for objects!
If anyone among you is at EWRL in Toulouse, let's meet up!