Scalable Reinforcement Learning for Formation Control with Collision Avoidance

Time: Fri 2022-12-16, 09.00 - 10.00

Location: Harry Nyquist

Video link: https://kth-se.zoom.us/j/66022027889

Language: English

Respondent: Andreu Matoses Gimenez, DCS/Reglerteknik

Opponent: Gregorio Marchesini

Supervisor: Ingvar Max Ziemann

Examiner: Alexandre Proutière



In recent decades, significant theoretical advances have been made in the field of distributed multi-agent control. Among the most commonly studied multi-agent systems are the so-called formation control problems, in which a network of mobile agents is controlled to reach a desired final formation.
This thesis presents a scalable, localized reinforcement learning approach to a traditional multi-agent formation control problem with collision avoidance. A scalable advantage actor-critic reinforcement learning algorithm is presented, based on previous work in the literature. Suboptimality bounds are derived for the localized approximations of the accumulated reward and the policy gradient. The algorithm is tested in a two-dimensional setting, on a network of mobile agents that follow simple integrator dynamics and use stochastic localized policies. Neural networks are used to approximate the continuous value functions and policies. The formation control with collision avoidance formulation and the presented algorithm show good scalability properties, with only polynomial growth in the number of function approximation parameters as the number of agents increases. The reduced number of parameters shortens learning time for larger networks, although computational efficiency is lower than that of state-of-the-art machine learning implementations. The policies obtained achieve safe trajectories with high probability; however, the lack of a dynamics model makes it impossible to guarantee safety.
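The setting described in the abstract can be sketched in a few lines: 2D agents with single-integrator dynamics, each acting on a stochastic policy that depends only on its own local neighborhood, with a repulsive term for collision avoidance. Everything below is an illustrative assumption, not the thesis implementation: the ring communication graph, the target formation, the gains, and the hand-coded proportional/repulsive policy (which in the thesis would instead be a learned neural-network policy).

```python
# Hypothetical sketch of the problem setting, NOT the thesis algorithm:
# 2D single-integrator agents with stochastic localized policies and a
# simple collision-avoidance repulsion term. All constants are assumptions.
import numpy as np

rng = np.random.default_rng(0)

N = 4          # number of agents (assumed)
DT = 0.1       # integration time step (assumed)
STEPS = 200

# Assumed desired formation: the corners of a unit square.
targets = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

# Assumed ring communication graph: each agent sees itself and two neighbors.
neighbors = {i: [(i - 1) % N, i, (i + 1) % N] for i in range(N)}

def local_policy(i, pos, std=0.05):
    """Stochastic localized policy: the mean action pulls agent i toward its
    formation slot and repels it from neighbors closer than 0.5, with added
    Gaussian exploration noise (a stand-in for a learned stochastic policy)."""
    mean = targets[i] - pos[i]                      # attraction to slot
    for j in neighbors[i]:
        if j == i:
            continue
        diff = pos[i] - pos[j]
        d = np.linalg.norm(diff)
        if d < 0.5:                                 # collision-avoidance term
            mean += 0.5 * diff / (d ** 2 + 1e-6)
    return mean + std * rng.standard_normal(2)      # sampled action

pos = rng.standard_normal((N, 2))                   # random initial positions
for _ in range(STEPS):
    actions = np.stack([local_policy(i, pos) for i in range(N)])
    pos = pos + DT * actions                        # single-integrator update

err = np.linalg.norm(pos - targets, axis=1).max()
print(f"max formation error after {STEPS} steps: {err:.3f}")
```

Because each agent's action depends only on its neighborhood, the per-agent policy size is independent of the total network size; this locality is what the scalable actor-critic formulation exploits to keep the parameter count polynomial in the number of agents.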