Coordinated Control of FACTS Setpoints Using Reinforcement Learning
Time: Wed 2025-10-08 13.00
Location: F3 (Flodis), Lindstedtsvägen 26 & 28
Video link: https://kth-se.zoom.us/j/65901664759
Language: English
Subject area: Computer Science
Doctoral student: Magnus Tarle, Robotics, Perception and Learning (RPL); Hitachi Energy Sweden AB, 721 82 Västerås, Sweden
Opponent: Professor Spyros Chatzivasileiadis, Technical University of Denmark (DTU), Copenhagen, Denmark
Supervisor: Professor Mårten Björkman, Robotics, Perception and Learning (RPL); Professor Lars Nordström, Electric Power Engineering
QC 20250908
Abstract
With increasing electrification and integration of renewables, power system operators face severe control challenges, including voltage stability, faster dynamics, and congestion management. Potential solutions encompass more advanced control systems and accurate measurements. One encouraging mitigation strategy is coordinated control of Flexible AC Transmission Systems (FACTS) setpoints, which can substantially improve voltage and power flow control. However, due to model-based optimization challenges related to, e.g., imperfect models and uncertainty, fixed setpoints are often used in practice. Promising alternatives are data-driven control methods based on, for example, reinforcement learning (RL). Motivated by these challenges, the accumulation of high-quality data, and advances in RL, this thesis explores RL-based coordinated control of FACTS setpoints. With a focus on safety, four problem settings are investigated on the IEEE 14-bus and IEEE 57-bus systems, addressing limited pre-training, model errors, few measurements, and datasets for pre-training. First, we propose WMAP, a model-based RL algorithm that learns and uses a compressed dynamics model to optimize voltage and current setpoints. WMAP includes a mechanism to mitigate poor performance on out-of-distribution data, and it is shown to outperform model-free RL and an infrequently updated expert policy. Second, when power system model errors are present, safe RL is demonstrated to outperform classical model-based optimization in terms of constraint satisfaction. Third, RL is shown to exceed the performance of fixed setpoints using only a few measurements, provided it has a complete, albeit simple, constraint signal. Finally, RL that leverages datasets for offline pre-training is demonstrated to outperform both the original policy that generated the dataset and an RL agent trained from scratch.
Overall, these four works contribute to advancing the field towards a more adaptable and sustainable power system.
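To give a flavor of the model-based idea behind learning and exploiting a dynamics model for setpoint selection, the toy sketch below fits a model of a plant's voltage response from interaction data and then inverts it to pick a setpoint. This is a purely hypothetical one-dimensional illustration under a linear-response assumption; it is not the thesis's WMAP algorithm, and all names and numbers are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "plant": the bus voltage responds (approximately linearly)
# to a controller setpoint u; the true sensitivity is hidden from the agent.
TRUE_SENS, TRUE_OFFSET = 0.8, 0.6

def measure_voltage(u):
    """Plant response in p.u. with small measurement noise (illustrative only)."""
    return TRUE_SENS * u + TRUE_OFFSET + rng.normal(0.0, 0.005)

# 1) Collect transitions (setpoint, resulting voltage) from interaction.
U = rng.uniform(0.0, 1.0, size=50)
V = np.array([measure_voltage(u) for u in U])

# 2) Fit a simple (here: linear) dynamics model v ~ a*u + b from the data.
a, b = np.polyfit(U, V, 1)

# 3) Use the learned model to choose the setpoint that drives the
#    voltage toward a 1.0 p.u. reference.
V_REF = 1.0
u_star = (V_REF - b) / a

# 4) Applying u_star to the plant should land close to the reference.
v_obtained = measure_voltage(u_star)
print(f"chosen setpoint {u_star:.3f} -> voltage {v_obtained:.3f} p.u.")
```

A real coordinated-control setting replaces the scalar fit with a learned multi-input dynamics model over many FACTS devices and adds constraint handling, but the loop structure, collect data, fit a model, optimize setpoints against the model, then act, is the same.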