Decentralized Learning of Randomization-based Neural Networks
Time: Fri 2021-06-11 13.00
Location: https://kth-se.zoom.us/j/64005034683, U1, Brinellvägen 28A, Undervisningshuset, floor 6, KTH Campus, Stockholm (English)
Subject area: Electrical Engineering
Doctoral student: Xinyue Liang, Teknisk informationsvetenskap
Opponent: Associate Professor Nikolaos Deligiannis, Vrije Universiteit Brussel, Electronics and Informatics department (ETRO)
Supervisor: Saikat Chatterjee, ACCESS Linnaeus Centre, Teknisk informationsvetenskap; Mikael Skoglund, Signaler, sensorer och system, Teknisk informationsvetenskap
Machine learning and artificial intelligence have been widely explored and have developed rapidly to meet growing needs in almost every aspect of human development. In the big data era, data confined to separate silos has become a major challenge for machine learning. Because data are scattered across locations and information sharing is restricted by privacy regulations, recent studies aim to develop collaborative machine learning techniques that let local models approximate centralized performance without sharing the real data. Privacy preservation is as important as model performance and model complexity. This thesis investigates the scope of a learning model with low computational complexity, randomization-based feed-forward neural networks (RFNs). As a class of artificial neural networks (ANNs), RFNs offer a favorable balance between low computational complexity and satisfactory performance, especially for non-image data. Driven by the advantages of RFNs and the need for distributed learning solutions, we study the potential and applicability of RFNs and distributed optimization methods that may lead to the design of decentralized variants of RFNs that deliver the desired results.
First, we provide decentralized learning algorithms based on RFN architectures for an undirected network topology with synchronous communication. We investigate decentralized learning of five RFNs that provide centralized-equivalent performance, as if all training data samples were available at a single node. Two of the five neural networks are shallow, and the other three are deep. Experiments with nine benchmark datasets show that the five neural networks provide good performance while requiring low computational and communication complexity for decentralized learning.
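As an illustration of what "centralized-equivalent" can mean for an RFN, the following minimal sketch trains the output layer of a single-hidden-layer random network over a ring of nodes. It is not the thesis's algorithm: it swaps in plain synchronous consensus averaging of local Gram matrices in place of the optimization methods studied in the thesis. The ring topology, the mixing weights, the shared random hidden layer, the ridge parameter `lam`, and the assumption that each node knows the number of nodes are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 nodes on a ring, each holding a private data shard.
n_nodes, n_local, d_in, d_hidden, d_out = 4, 50, 5, 20, 2
X = [rng.standard_normal((n_local, d_in)) for _ in range(n_nodes)]
T = [rng.standard_normal((n_local, d_out)) for _ in range(n_nodes)]

# Random hidden layer, fixed and identical at every node (assumed shared seed).
W = rng.standard_normal((d_in, d_hidden))

def hidden(x):
    """ReLU features of the fixed random layer."""
    return np.maximum(x @ W, 0.0)

lam = 1e-2  # ridge regularization (assumed value)

# Each node computes statistics from its own data only; raw data never moves.
G = np.stack([hidden(x).T @ hidden(x) for x in X])      # (n_nodes, h, h)
C = np.stack([hidden(x).T @ t for x, t in zip(X, T)])   # (n_nodes, h, d_out)

# Symmetric, doubly-stochastic mixing matrix for the undirected ring.
P = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    P[i, i] = 0.5
    P[i, (i - 1) % n_nodes] += 0.25
    P[i, (i + 1) % n_nodes] += 0.25

# Synchronous rounds: every node averages its statistics with its neighbors'.
for _ in range(60):
    G = np.einsum("ij,jab->iab", P, G)
    C = np.einsum("ij,jab->iab", P, C)

# Each node now holds (approximately) the network-wide average statistics
# and solves the ridge problem for the output weights locally.
I = np.eye(d_hidden)
O_nodes = [np.linalg.solve(n_nodes * G[i] + lam * I, n_nodes * C[i])
           for i in range(n_nodes)]

# Reference: the centralized solution with all data pooled at one place.
H_all, T_all = hidden(np.vstack(X)), np.vstack(T)
O_central = np.linalg.solve(H_all.T @ H_all + lam * I, H_all.T @ T_all)

max_gap = max(np.abs(O - O_central).max() for O in O_nodes)
print("largest deviation from centralized solution:", max_gap)
```

Under these assumptions, every node ends up with output weights that match the pooled-data solution up to numerical error, while exchanging only matrices whose size depends on the hidden-layer width, not on the number of samples.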
We are then motivated to design an asynchronous decentralized learning application that achieves centralized-equivalent performance with low computational complexity and communication overhead. We propose an asynchronous decentralized learning algorithm that uses ARock-based ADMM to realize decentralized variants of a variety of RFNs. The proposed algorithm enables single-node activation and one-sided communication in an undirected communication network, characterized by a doubly-stochastic network policy matrix. In addition, the proposed algorithm obtains the centralized solution with reduced computational cost and improved communication efficiency.
Finally, we consider the problem of training a neural network in a decentralized scenario with highly sparse connections. We address the issue by adapting a recently proposed incremental learning approach called 'learning without forgetting'. While an incremental learning approach assumes that data become available in a sequence, the nodes of the decentralized scenario cannot share data with one another, and there is no master node. Nodes can, however, communicate information about model parameters to their neighbors. Communicating model parameters is the key to adapting the 'learning without forgetting' approach to the decentralized scenario.
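One plausible reading of this adaptation can be sketched as a loss function: a node that has received a neighbor's model parameters trains on its own data while penalizing deviation from the neighbor model's outputs, so that what the neighbor learned is not forgotten. The sketch below is an assumption for illustration, not the thesis's formulation. The function name `lwf_style_loss`, the KL-divergence distillation term, the weight `alpha`, and the `temperature` value are all hypothetical.

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Row-wise softmax with optional temperature scaling."""
    z = z / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def lwf_style_loss(local_logits, neighbor_logits, labels,
                   alpha=1.0, temperature=2.0):
    """Cross-entropy on local labels plus a distillation penalty.

    local_logits:    outputs of the model being trained, on local data
    neighbor_logits: outputs of the neighbor's model (rebuilt from its
                     communicated parameters) on the same local data
    """
    n = local_logits.shape[0]
    eps = 1e-12
    p = softmax(local_logits)
    ce = -np.log(p[np.arange(n), labels] + eps).mean()
    q_teacher = softmax(neighbor_logits, temperature)
    q_student = softmax(local_logits, temperature)
    kl = (q_teacher * (np.log(q_teacher + eps)
                       - np.log(q_student + eps))).sum(axis=1).mean()
    return ce + alpha * kl, ce, kl

# Toy usage with made-up logits for two samples and three classes.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
labels = np.array([0, 2])
_, ce_same, kl_same = lwf_style_loss(logits, logits, labels)
_, ce_diff, kl_diff = lwf_style_loss(logits, logits[:, ::-1], labels)
print("penalty when models agree:", kl_same)
print("penalty when models disagree:", kl_diff)
```

In this sketch only model outputs derived from communicated parameters enter the penalty, so no training samples leave the node that owns them.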