Miguel Suau
@SuauMiguel
Machine Teacher. Research Scientist at @PhaidraAI. PhD from @TUDelft. Previously @JPMorgan, @Huawei, @Unity.
I'm extremely proud to share that our paper, Bad Habits, received the Outstanding Paper Award on Scientific Understanding in RL at @RL_Conference! Huge thanks to my incredible coauthors, Matthijs and @faoliehoek.

Phaidra is hiring a Research Scientist to work on sequential decision-making problems. I'm at the RLDM conference in Dublin this week. If you're attending and would like to learn more about the role or the company, feel free to reach out! job-boards.greenhouse.io/phaidra/jobs/4…
RLC will be held at the Univ. of Alberta, Edmonton, in 2025. I'm happy to say that we now have the conference's website out: rl-conference.cc/index.html We'll continue to update it, and the CFP will be out soon, but the relevant dates are already there. @RL_Conference @UAlberta
It's a beautiful demonstration of the powers of GPT-4o, and also its limitations. GPT-4o gives you a salad of claims from the RL literature, some written by over-hyped authors, at the end of which you are not sure if RL can really reason counterfactually (i.e., at level 3), or…
Why does RL lead to causal understanding? 🧵🪡 GPT-4o: Reinforcement learning (RL) can lead to causal understanding because, by interacting with an environment, an agent learns not just correlations between actions and outcomes, but also the underlying cause-effect…
Recruiting PhD students this year to the EMERGE lab! The unifying theme is seriously scaling our capability to solve real, hard decision making problems in multi-agent autonomy and transportation w/ RL, LLMs, whatever works
We have two openings for Research Scientists at @PhaidraAI. If you hold a PhD in RL, Planning, Control, or Causal Inference, and are excited about applying your expertise to advance intelligent control systems, we encourage you to apply. phaidra.ai/careers
Glad you liked it, @EugeneVinitsky! Thank you so much for sharing! 🙏
Neat paper by @SuauMiguel et al. pointing out that learning only on trajectories from the optimal policy can lead to spurious correlations that decrease robustness to diverse trajectories: arxiv.org/abs/2306.02419
Thank you, @pcastr! This means so much to me! It's been a pleasure meeting you and joining you for the morning runs! 🙌🙌
I'm psc and I approve of this poster design. Also very cool paper and the winner of one of the best paper awards @RL_Conference !
In case you are not attending the @RL_Conference this year (I'm really sorry for you), @robertarail and I announced the list of RLC's Outstanding Paper Awards this year. If you want to know the awarded papers, or the process we followed, read this piece: rl-conference.cc/blogs/paper_aw…
Check out @RL_Conference's Outstanding Paper Awards and the blog post on the process we used to decide. We awarded 7 papers that excel in one of the following aspects:
- Applications of Reinforcement Learning
- Empirical Reinforcement Learning Research
- Empirical…
Excited to share that Bad Habits was accepted at @RL_Conference! 🙃 We show how RL agents may develop simplistic habits, which are effective only for specific trajectories. Check out the new version with clearer notation and more examples! arxiv.org/abs/2306.02419
Hey 📢, @faoliehoek, Matthijs Spaan, and I wrote a new paper! arxiv.org/abs/2306.02419 It describes the phenomenon of policy confounding 🤖😵💫! More details below...
Together with Julia Olkhovskaia, I am looking for a (fully paid) PhD student to work on multi-armed bandits and reinforcement learning theory. Closing date: July 5th. More details: fransoliehoek.net/wp/vacancies/ Please share!