Dimitri Bertsekas
@DBertsekas
http://www.mit.edu/~dimitrib/bio.html
I’m pleased to share a podcast about my new book "Academia, Art, and Life" notebooklm.google.com/notebook/cc6d8… A near-final version is available at mit.edu/~dimitrib/Acad… The book revisits the question “Who is an artist?” and explores artistry more broadly within academic life.
I am pleased to share my new paper (w/ Kim Hammar, Yuchao Li, Tansu Alpcan, and Emil Lupu) on "Adaptive Network Security Policies via Belief Aggregation and Rollout," web.mit.edu/dimitrib/www/H… It deals with rollout algorithms and applications in cybersecurity.
Agree. When an appropriate feature space is constructed, aggregation methods often outperform neural networks. I’ve been using this approach since the early 2000s. Your Neuro-Dynamic Programming book provides an especially insightful account of feature extraction.
Sharing my new paper (joint with Yuchao Li and Kim Hammar) on "Feature-Based Belief Aggregation for Partially Observable Markov Decision Problems," lnkd.in/ebZUiJxT The paper reports favorable computational results on very large-scale problems #ReinforcementLearning
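For readers unfamiliar with the general idea, here is a minimal sketch of hard, feature-based aggregation for an ordinary finite MDP (not the belief aggregation of the paper itself): states are grouped into a few aggregate classes by a feature map, a small aggregate problem is solved, and its costs are lifted back to the original states. The random MDP, the bucketed feature map, and the uniform disaggregation below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_controls, n_agg, alpha = 100, 3, 10, 0.95

# Random MDP used purely for illustration: P[u][x, y] transition probs, g[x, u] stage cost
P = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_controls)]
g = rng.uniform(size=(n_states, n_controls))

# Feature map: buckets of consecutive states stand in for a problem-specific feature space
phi = np.arange(n_states) * n_agg // n_states
members = [np.flatnonzero(phi == i) for i in range(n_agg)]

# Aggregate transition probabilities and costs (uniform disaggregation over each class)
P_agg = np.zeros((n_controls, n_agg, n_agg))
g_agg = np.zeros((n_agg, n_controls))
for u in range(n_controls):
    for i, S in enumerate(members):
        avg_row = P[u][S].mean(axis=0)                       # average over the class members
        P_agg[u, i] = [avg_row[Sj].sum() for Sj in members]  # lump destinations into classes
        g_agg[i, u] = g[S, u].mean()

# Solve the small aggregate problem by value iteration
r = np.zeros(n_agg)
for _ in range(2000):
    r = np.min(g_agg + alpha * np.stack([P_agg[u] @ r for u in range(n_controls)], axis=1), axis=1)

# Lift back: each original state inherits the cost of its aggregate class
J_approx = r[phi]
print(J_approx[:5])
```

In the POMDP setting of the paper, the role of the states would be played by beliefs and the feature map by a belief-feature extractor; the structure of the computation is analogous.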
I am pleased to share at lnkd.in/eQjxvSvM slides, a podcast, and an essay on my lecture “Ten Simple Rules for Mathematical Writing.” Since its original delivery as a slide presentation at MIT (2002), it has been widely referenced and used in mathematical writing courses.
I am pleased to share my new paper (joint with Yuchao Li) on "Error Bounds for Aggregation Methods," web.mit.edu/dimitrib/www/A… In my view, aggregation is an under-appreciated off-line training approach in #reinforcementlearning
I am pleased to share the link to my videolecture of 5/2/2025 at Harvard University, "Reinforcement Learning, Model Predictive Control, and Newton's Method for Solving Bellman's Equation": youtube.com/watch?v=ZBRouv… Slides at web.mit.edu/dimitrib/www/R…
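The connection named in the lecture title can be illustrated numerically: for a finite discounted MDP, linearizing the Bellman operator at the current cost estimate via the greedy policy and solving the resulting linear equation is exactly a policy iteration step. The toy two-state MDP below is an illustrative assumption; the sketch only demonstrates this Newton-step interpretation.

```python
import numpy as np

alpha = 0.9  # discount factor

# Toy two-state, two-control MDP (illustrative only)
# g[x, u]: stage cost; P[u][x, y]: transition probabilities under control u
g = np.array([[1.0, 4.0],
              [2.0, 0.5]])
P = [np.array([[0.8, 0.2],
               [0.3, 0.7]]),
     np.array([[0.1, 0.9],
               [0.6, 0.4]])]

def bellman(J):
    """Bellman operator T: (TJ)(x) = min_u [ g(x,u) + alpha * sum_y p(y|x,u) J(y) ]."""
    Q = np.stack([g[:, u] + alpha * P[u] @ J for u in range(2)], axis=1)
    return Q.min(axis=1), Q.argmin(axis=1)

def policy_evaluation(mu):
    """Solve J = T_mu(J) exactly (a linear system); this is the Newton step."""
    Pmu = np.array([P[mu[x]][x] for x in range(2)])
    gmu = np.array([g[x, mu[x]] for x in range(2)])
    return np.linalg.solve(np.eye(2) - alpha * Pmu, gmu)

J = np.zeros(2)
for k in range(5):
    _, mu = bellman(J)             # greedy policy = linearization point of T at J
    J_new = policy_evaluation(mu)  # Newton step = policy iteration step
    print(k, mu, np.max(np.abs(J_new - J)))
    J = J_new
```

The printed error drops to zero within a few iterations, once the greedy policy stops changing.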
I am pleased to share at web.mit.edu/dimitrib/www/R… high-quality AI-generated podcast links for my books: 1) Lessons from AlphaZero ... 2) Parallel and Distributed Computation. See web.mit.edu/dimitrib/www/b… for PDF copies. #reinforcementlearning #machinelearning
I am pleased to share podcasts (<30 mins) describing two of my books: 1) Neuro-Dynamic Programming notebooklm.google.com/notebook/c21b0… 2) A Course in Reinforcement Learning notebooklm.google.com/notebook/a4a87… Free PDFs of both books can be found at web.mit.edu/dimitrib/www/b…
I am often asked about the relative merits of various #reinforcementlearning approaches, such as policy gradient and value-based methods. The last lecture of my RL course addresses this question and related training issues; see: youtube.com/watch?v=43CXjD…
I am pleased to share the full set of videolectures, slides, textbook, and other supporting material of the 7th offering of my Reinforcement Learning class at ASU, which was completed two days ago; check web.mit.edu/dimitrib/www/R…
I am pleased to share the video of yesterday's lecture "Abstract Dynamic Programming, Reinforcement Learning, Newton's Method, and Gradient Optimization" at the ASU Math Dept youtube.com/watch?v=JmQzj0… This is an overview lecture on the relations between DP and RL
A free PDF of my book "Rollout, Policy Iteration, and #ReinforcementLearning" has been posted at my web site web.mit.edu/dimitrib/www/d… An extensive research-oriented account of rollout algorithms, including multiagent rollout and the connection with Newton's method
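As a quick illustration of the book's central construct, here is a minimal rollout sketch for a finite discounted MDP, assuming a randomly generated model and an arbitrary base policy (both illustrative, not examples from the book): the base policy is evaluated exactly, and the rollout policy is obtained by one-step lookahead using that cost function. The printout checks the cost improvement property.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, alpha = 20, 3, 0.9
P = [rng.dirichlet(np.ones(n), size=n) for _ in range(m)]   # P[u][x, y]
g = rng.uniform(size=(n, m))                                # stage cost g(x, u)

base = rng.integers(m, size=n)   # an arbitrary base policy

# Evaluate the base policy exactly: J_base solves J = g_base + alpha * P_base J
P_base = np.array([P[base[x]][x] for x in range(n)])
g_base = np.array([g[x, base[x]] for x in range(n)])
J_base = np.linalg.solve(np.eye(n) - alpha * P_base, g_base)

# Rollout policy: one-step lookahead minimization with J_base as the cost-to-go
Q = np.stack([g[:, u] + alpha * P[u] @ J_base for u in range(m)], axis=1)
rollout = Q.argmin(axis=1)

# Cost improvement: the rollout policy is no worse than the base policy at every state
P_ro = np.array([P[rollout[x]][x] for x in range(n)])
g_ro = np.array([g[x, rollout[x]] for x in range(n)])
J_rollout = np.linalg.solve(np.eye(n) - alpha * P_ro, g_ro)
print("improvement J_base - J_rollout: min =", np.min(J_base - J_rollout),
      "max =", np.max(J_base - J_rollout))
```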
A recording of my guest lecture at ASU on aggregation for approximating POMDPs is available here: youtube.com/watch?v=gsD2Jg… Thanks to @DBertsekas and Yuchao Li for organizing this!
Just posted a videolecture on a Viterbi-like rollout/#reinforcementlearning algorithm for most likely sequence generation in Markov chains and HMM inference, at youtu.be/KdX6o9Qi1vc It applies to large state spaces where the Viterbi algorithm is intractable
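To make the rollout idea concrete in this setting, here is a minimal sketch for selecting a high-likelihood state sequence of fixed length in a small Markov chain, using a greedy base heuristic: at each step, every candidate next state is scored by its transition log-probability plus the log-probability of the greedy completion from there. This is only an illustration of one-step lookahead with a heuristic completion, not the specific algorithm from the videolecture, and the transition matrix is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 8, 10                                   # number of states, sequence length
P = rng.dirichlet(np.ones(n) * 0.3, size=n)    # transition matrix P[x, y]
x0 = 0

def greedy_completion(x, steps):
    """Base heuristic: repeatedly take the most probable transition."""
    logp, seq = 0.0, [x]
    for _ in range(steps):
        y = int(np.argmax(P[x]))
        logp += np.log(P[x, y])
        seq.append(y)
        x = y
    return logp, seq

def rollout_sequence(x0, N):
    """At each step, score every candidate next state by its transition
    log-probability plus the greedy completion from there, and keep the best."""
    x, seq, logp = x0, [x0], 0.0
    for k in range(N):
        scores = [np.log(P[x, y]) + greedy_completion(y, N - k - 1)[0] for y in range(n)]
        y = int(np.argmax(scores))
        logp += np.log(P[x, y])
        seq.append(y)
        x = y
    return logp, seq

print("greedy log-likelihood :", greedy_completion(x0, N)[0])
print("rollout log-likelihood:", rollout_sequence(x0, N)[0])
```

By the sequential improvement property of rollout, the rollout sequence's log-likelihood is never worse than that of the greedy heuristic alone.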
This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024). If you know 1 of {RL, controls} and want to understand the other, this is a good starting point. PDF: arxiv.org/abs/2406.00592