Silvia Sapora
@silviasapora
PhD student at @UniofOxford with @FLAIR_Ox and @OxCSML. @imperialcollege CompSci '15-'19
🧵 Check out our latest preprint: "Programming by Backprop". What if LLMs could internalize algorithms just by reading code, with no input-output examples? This could reshape how we train models to reason algorithmically. Let's dive into our findings.
Can an LLM be programmed? In our new preprint, we show that LLMs can learn to evaluate programs for a range of inputs by being trained on the program source code alone, a phenomenon we call Programming by Backprop (PBB). 🧵⬇️
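To make the setup concrete, here is a minimal sketch of the PBB framing as described in the tweet: the training corpus contains only program source code, and evaluation asks the model for input-output behaviour it was never shown. The `finetune` step and the prompt format are hypothetical placeholders, not the paper's actual pipeline.

```python
# A minimal sketch of the Programming by Backprop (PBB) setup:
# train on source code only, then query input-output behaviour.

SOURCE = '''
def f(x):
    return (3 * x + 1) % 10
'''

# Training corpus: just the program text, no input-output pairs.
train_docs = [SOURCE]

# Evaluation: prompts ask the model to execute the program "in its head".
def eval_prompt(x: int) -> str:
    return f"What does f({x}) return?"

# Ground truth the finetuned model is tested against (never shown in training).
expected = {x: (3 * x + 1) % 10 for x in range(5)}

# model = finetune(base_model, train_docs)  # hypothetical finetuning step
for x, y in expected.items():
    print(eval_prompt(x), "->", y)  # PBB succeeds if the model matches these
```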
Say ahoy to SAILOR ⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
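As a rough illustration of what "learning to search" buys at test time: instead of executing a cloned policy open-loop, the agent scores candidate action sequences in a learned model and replans each step, which is what lets it recover from mistakes. The dynamics and value functions below are toy stand-ins, not SAILOR's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

def learned_dynamics(state, action):   # placeholder for a learned world model
    return state + 0.1 * action

def learned_value(state):              # placeholder for a learned critic
    return -abs(state - 1.0)           # toy goal: reach state 1.0

def search(state, n_candidates=64, horizon=5):
    # Sample candidate plans, roll each out in the model, keep the best one.
    best_plan, best_score = None, -np.inf
    for _ in range(n_candidates):
        plan = rng.uniform(-1, 1, size=horizon)
        s = state
        for a in plan:
            s = learned_dynamics(s, a)
        score = learned_value(s)
        if score > best_score:
            best_plan, best_score = plan, score
    return best_plan[0]                # execute the first action, then replan

state = 0.0
for t in range(10):                    # replanning each step enables recovery
    state = learned_dynamics(state, search(state))
print(f"final state: {state:.2f}")
```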
Excited to share our paper "The Edge-of-Reach Problem in Offline MBRL" has been accepted to #NeurIPS! Looking forward to Vancouver! We reveal why offline MBRL methods work (or fail) and introduce a robust solution: RAVL. 🧵 Let's dive in! [1/N]
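To make the failure mode concrete: truncated model rollouts end at states that are used as bootstrap targets but are never trained on themselves, so their values can grow unchecked. Below is a hedged sketch of a generic remedy in the spirit the thread describes, pessimistic targets from a value ensemble; it is illustrative, not RAVL's actual implementation.

```python
import torch

# Ensemble value pessimism: form bootstrap targets from the minimum over an
# ensemble of critics, so never-trained "edge-of-reach" states cannot inflate
# Q-values unchecked. Sizes and networks here are arbitrary toy choices.

n_critics, state_dim, action_dim = 4, 8, 2
critics = [torch.nn.Linear(state_dim + action_dim, 1) for _ in range(n_critics)]

def target_q(next_state, next_action):
    x = torch.cat([next_state, next_action], dim=-1)
    qs = torch.stack([c(x) for c in critics])   # (n_critics, batch, 1)
    return qs.min(dim=0).values                 # pessimistic bootstrap target

s, a = torch.randn(32, state_dim), torch.randn(32, action_dim)
print(target_q(s, a).shape)                     # torch.Size([32, 1])
```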
Beyond excited to introduce AlphaQubit, now published in @Nature! AlphaQubit is a neural network for quantum error correction and achieves state-of-the-art accuracy on simulated and real-world data.
Weβll be presenting this poster at 11:30am today! Stop by Hall C 4-9 #1208 if youβd like to chat about it
1/🧵 Excited to introduce our #ICML2024 paper: EvIL (Evolution Strategies for Generalisable Imitation Learning), a new inverse RL (IRL) method for sample-efficient transfer of expert behaviour across environments. It's so good, it's downright EvIL!
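A toy sketch of the idea in the name: use evolution strategies (ES) to search over reward parameters so that a policy trained on the learned reward matches the expert. The fitness function here, matching feature expectations with a fake "trained policy", is a hypothetical stand-in for EvIL's actual inner loop.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(4)                            # reward parameters to evolve
expert_features = np.array([0.9, 0.5, -0.2, 0.0])

def fitness(params):
    # Stand-in for: train a policy on the reward given by `params`, then
    # compare its feature expectations to the expert's.
    policy_features = np.tanh(params)          # fake "trained policy"
    return -np.sum((policy_features - expert_features) ** 2)

# Simple ES loop with antithetic sampling of perturbations.
sigma, lr, n_pairs = 0.1, 0.05, 16
for step in range(200):
    eps = rng.standard_normal((n_pairs, theta.size))
    grad = sum((fitness(theta + sigma * e) - fitness(theta - sigma * e)) * e
               for e in eps) / (2 * sigma * n_pairs)
    theta += lr * grad                         # ascend the fitness estimate

print("learned reward params:", np.round(theta, 2))
```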
1/🧵 FLAIR is coming to #icml2024 in Vienna! We are very excited to share our work with you! You can find us here ⬇️✨ Or use shorturl.at/Qw8QN for clickable links