Accelerated ADMM Variants for Distributed Optimization
Algorithms for Dynamic and Large Networks
Time: Fri 2026-06-12 13.30
Location: Q2, Malvinas väg 10, Stockholm
Language: English
Subject area: Computer Science
Doctoral student: Jeannie He , Teknisk informationsvetenskap
Opponent: Professor Lei Zhang, University of Glasgow, Glasgow, UK
Supervisor: Professor Ming Xiao, Teknisk informationsvetenskap; Professor Mikael Skoglund, Teknisk informationsvetenskap
QC 20260525
Abstract
This doctoral thesis presents a comprehensive summary of research efforts with the aim of advancing the state-of-the-art in decentralized optimization. As modern distributed systems grow in scale and complexity, traditional optimization methods face significant bottlenecks. This work, presented as a compilation of five papers, specifically targets the Alternating Direction Method of Multipliers (ADMM). The central objective is to re-engineer ADMM to overcome the dual challenges of convergence latency and communication inefficiencies in peer-to-peer networks. The first part of the thesis focuses on decentralization, convergence speed, and computational cost. While centralized ADMM algorithms struggle with scalability and bottlenecks, existing decentralized schemes rely on either excessive computations and messaging or completely sequential operations, causing a prolonged time required to reach convergence. To tackle these problems, we introduce two fast-converging decentralized ADMM algorithms. Our theoretical analysis confirms that our algorithms retain the classical convergence properties of centralized ADMM while maintaining a low per-node complexity of $O(1)$. Numerical simulations further demonstrate that our algorithms converge significantly faster than state-of-the-art decentralized implementations, providing clear conditions under which they outperform traditional benchmarks. The second part of the thesis addresses the straggler problem, which arises from the conventional ADMM requirement for global synchronization, where faster nodes are forced to remain idle until the slowest nodes complete their local updates, impeding the progress of the entire network. Here, we introduce three algorithms to allow the system to remain productive even under single-point-of-failure scenarios or extreme hardware variance. The first algorithm achieves straggler-resilience by allowing the nodes to proceed to the next iteration even when one or more nodes have not provided an update for one or more iterations. The second algorithm is a decentralized version of the first algorithm. The second algorithm enforces fast convergence as well as robustness against stragglers and single points of failure through decentralized, asynchronous, and concurrent operations. The third algorithm extends the second algorithm by enforcing robustness against uncertainties with the help of a time-tracking scheme. Through theoretical analyses, we establish the convergence properties of our algorithms and show that our decentralized algorithms achieve a computational complexity of $O(1)$ for each worker node, whereas our centralized algorithm achieves a computational complexity of $O(N)$ for the central node and $O(1)$ for each of the remaining nodes. Through numerical simulations with various settings, we show that our algorithms have converged significantly faster than several state-of-the-art ADMM algorithms with well-established convergence properties.
The final part of the thesis extends these optimizations to highly volatile systems characterized by message dropouts and dynamic topologies. Here, we introduce two algorithms to adapt ADMM to dynamic systems, where new nodes may be added amidst the process at the same time as the system may encounter issues with stragglers and message dropouts. The algorithms achieve fast convergence and flexibility by allowing nodes to choose a step size that is best suited for their own system and by allowing the nodes to move on to the next iteration even when not all nodes have made an update for the current iteration. More importantly, these algorithms incorporate a contribution tracking mechanism to ensure consistency despite message loss. Furthermore, these algorithms enforce robustness against uncertainties by removing the need to predefine the waiting time or the minimum number of updates before moving to the next iteration. Here, an approximation mechanism is also introduced to give stragglers more time to compute their variables while the faster nodes move on to the next iteration.
To summarize, this thesis provides accelerated algorithms for fast convergence in distributed optimization, with solutions tailored for both large-scale and dynamic networks.