Path integral control endowed robot planning under spatiotemporal logic specifications
Time: Fri 2022-09-16, 14:00
Location: Kollegiesal, Brinellvagen 8
Respondent: Péter Várnai, Reglerteknik (Automatic Control)
Opponent: Professor Theodorou Evangelos, Georgia Institute of Technology
Supervisor: Professor Dimos Dimarogonas
The increasing level of autonomy and intelligence of robotic systems in carrying out complex tasks can be expected to revolutionize both industry and our everyday lives. This thesis takes a step towards such automation by leveraging path integral control (PIC) methods to solve control problems under complex task satisfaction constraints in both a stochastic control and a reinforcement learning setting. PIC-based solutions to these problems offer many benefits, such as ease of implementation and a natural ability to handle the history-dependent costs that stem from the definition of complex tasks. They rely on sampling open-loop trajectories of the system to compute control actions and are therefore well suited to parallelization. Their potential for handling complex dynamics effectively during real-time control has also been demonstrated in practice. Nevertheless, their applicability to spatiotemporal logic control has not been thoroughly explored in the literature, which motivates the subject of this thesis. In particular, we focus on robot planning under spatiotemporal logic tasks given in the expressive language of signal temporal logic (STL). We begin by extending path integral control to handle history-dependent costs in a stochastic control setting, providing novel viewpoints, insights, and simplified derivations compared to the existing literature along the way. This includes using path integral control outside of its traditional optimal control setting, aiming instead to achieve performance guarantees with respect to trajectory costs without explicitly considering the applied input effort. This broadens the applicability of PIC, which is also demonstrated through its extension to a multi-agent setting for agents that have independent dynamics but are coupled through their individual costs.
The possibility of augmenting path integral control methods with existing STL control approaches, in order to blend the benefits of sampling-based and analytical approaches, is also considered. More specifically, we propose novel STL robustness metrics that quantify the degree of task satisfaction in a way better suited to control algorithms, and we propose incorporating partial knowledge of the system dynamics into the sampling process of PIC in the form of guidance controllers. These guidance controllers aid in enforcing the STL task satisfaction constraint, allowing the system to explore and find optimal trajectories in a more sample-efficient manner. Conversely, because of the added exploration, the controllers do not have to be perfect or guarantee task satisfaction, which opens up new possibilities for their design. The benefits of using such guidance controllers are demonstrated in the reinforcement learning context of PIC methods via a proposed policy search algorithm termed guided path integral policy improvement (G-PI2). Finally, the thesis also considers two enhancements related to the developed G-PI2 algorithm. First, the effectiveness of the guidance controllers is increased by continuously updating their parameters throughout the policy search process using so-called funnel adaptation. Second, we explore a learning framework for gathering and storing experiences from previously solved problems in order to more efficiently tackle changes in initial conditions or task specifications in future missions.
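To make the sampling idea in the abstract concrete, the following is a minimal illustrative sketch, not the algorithm developed in the thesis. It assumes a 2D single-integrator robot, a single task of the form "eventually reach a disc around a goal", and the standard max-based STL robustness for that task rather than the novel metrics proposed in the thesis. All function names and parameters (`rollout`, `pic_update`, `sigma`, `lam`) are hypothetical. The cost depends on the whole trajectory, which is the kind of history-dependent cost that sampling open-loop rollouts handles naturally.

```python
import math
import random


def robustness_eventually_reach(traj, goal, radius):
    """Standard STL robustness of 'eventually be within `radius` of `goal`':
    the maximum over time of (radius - distance to goal)."""
    return max(radius - math.dist(p, goal) for p in traj)


def rollout(x0, controls, dt=0.1):
    """Open-loop rollout of assumed single-integrator dynamics x_{k+1} = x_k + u_k * dt."""
    traj = [x0]
    x = x0
    for u in controls:
        x = (x[0] + u[0] * dt, x[1] + u[1] * dt)
        traj.append(x)
    return traj


def pic_update(x0, nominal, goal, radius,
               n_samples=200, sigma=1.0, lam=0.1, rng=None):
    """One path-integral-style update of an open-loop control sequence.

    Samples noisy control sequences around `nominal`, scores each rollout by
    its trajectory cost (negative robustness), and returns the nominal
    sequence shifted by the exponentially weighted average of the noise.
    """
    rng = rng or random.Random(0)
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in nominal]
        controls = [(u[0] + e[0], u[1] + e[1]) for u, e in zip(nominal, eps)]
        traj = rollout(x0, controls)
        # History-dependent cost: depends on the entire trajectory at once.
        costs.append(-robustness_eventually_reach(traj, goal, radius))
        noises.append(eps)

    # Subtract the minimum cost before exponentiating for numerical stability.
    c_min = min(costs)
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)

    updated = []
    for k, u in enumerate(nominal):
        du0 = sum(w * n[k][0] for w, n in zip(weights, noises)) / total
        du1 = sum(w * n[k][1] for w, n in zip(weights, noises)) / total
        updated.append((u[0] + du0, u[1] + du1))
    return updated
```

Each sampled rollout is independent of the others, so the sampling loop is the part that would be parallelized in practice. A guidance controller in the sense of the abstract would, under this sketch's assumptions, replace the zero-mean sampling around `nominal` with sampling around a task-aware control signal.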