Minimal Expected Regret for the Online LQR Problem
Time: Tue 2021-08-24 14.00 - 15.00
Location: Harry Nyquist
Lecturer: Yassir Jedra
Recently, there has been a surge of interest in the Linear Quadratic Regulator (LQR) problem within the online learning community. A main goal of this community is to devise learning algorithms and analyze their regret, that is, the cumulative cost a learner incurs beyond that of the optimal controller that knows the system dynamics. In this talk, I will give an overview of recent work on the LQR problem from the online learning community. I will also present some of our recent work on this topic, in which we devise a new learning algorithm and provide guarantees on its expected regret. I will further highlight the desirable properties that our algorithm enjoys in contrast with existing ones, notably from an algorithm design perspective: our algorithm is allowed to update its policy continuously. On a technical level, keeping the algorithm simple while retaining strong regret guarantees poses serious challenges. We tackle these challenges by carefully leveraging recent tools from random matrix theory and self-normalized processes.