# Study Group on Mathematics for Complex Data

*Organizers:*
Henrik Hult and
Fredrik Viklund

### Background

This study group explores new directions and potential collaborations on the general topic of mathematics for complex data. The aim is to learn together about new topics in an informal environment by reading and discussing recent papers. The proposed topics grew out of informal discussions among the organizers and reflect their interests, but they may change depending on the interests of the participants.

### Part II: Topics in deep reinforcement learning

The second part of the study group is devoted to topics in deep reinforcement learning, which concerns automated decision making in a random environment, often modeled as a Markov decision process (MDP). Deep reinforcement learning has received significant attention recently as the core of Google DeepMind's AlphaGo system. Ather Gattami, Senior Scientist at the Swedish Institute of Computer Science (SICS), will give three high-level introductory lectures on topics in reinforcement learning.

**Deep Learning, Recurrent Neural Networks, and LSTM, Feb 14, 10:15-12:00, room F11, Lindstedtsvägen 22.**

In this lecture, we will go through the basics of neural networks for prediction of static systems. We then discuss how neural networks can be used to predict the outputs of dynamic systems, using Recurrent Neural Networks in general and Long Short-Term Memory (LSTM) architectures in particular.

**Reinforcement Learning & Deep Reinforcement Learning, Feb 21, 10:15-12:00, room F11, Lindstedtsvägen 22.**

In this lecture, we will consider the problem of decision making in a Markov Decision Process (MDP) where we don’t have access to the mathematical model of the dynamics. The solution will be based on Reinforcement Learning, in particular Q-learning. We also consider an approximation algorithm, deep Q-learning, in which the “Q” function is approximated by a neural network; this introduces Deep Reinforcement Learning.
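As a rough illustration of the model-free setting described here, the following is a minimal sketch of tabular Q-learning on a toy five-state chain. It is not taken from the lecture: the environment `step`, the constants `GAMMA` and `ALPHA`, and the reward structure are all assumptions made for the example. The learner only sees sampled transitions from `step` and never reads the transition rule. Since Q-learning is off-policy, the sketch explores with a uniformly random behaviour policy for simplicity.

```python
import random

# Hypothetical toy chain: states 0..4, state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor (assumed)
ALPHA = 0.5        # learning rate (assumed)


def step(state, action):
    """Environment. The learner only observes the sampled outcome."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Uniformly random exploration (off-policy behaviour).
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy policy in the non-terminal states.
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)  # → [1, 1, 1, 1]
```

In deep Q-learning the table `q` would be replaced by a neural network that maps a state to its action values; the update target has the same form.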

**Markov Games & Multi-Objective Reinforcement Learning, Feb 28, 10:15-12:00, room F11, Lindstedtsvägen 22.**

This lecture will be concerned with the problem of making optimal decisions in Markov Games where the dynamics of the Markov Decision Process are not known to the players. In particular, we will show how the problem can be solved for zero-sum games based on a variation of Reinforcement Learning for one player. We extend these results and give solutions to the problem of Reinforcement Learning where several objectives need to be satisfied simultaneously and make connections to Markov Games.

### Part I: Energy landscapes, random matrices and non-convex optimization

High-dimensional statistical models expressed as a Gibbs distribution appear frequently both in statistical physics (e.g. Ising model, Sherrington-Kirkpatrick model) and in machine learning. The Gibbs distribution is expressed in terms of an energy function, and its normalizing constant, the partition function, is typically difficult to compute. The paper [1] uses random matrix theory to establish asymptotic results for the number of critical points of the energy function of spin glasses as the system size increases. The results also show a layered structure of the lowest critical values. Empirical investigations in models related to machine learning and deep networks appear to agree with the theoretical results from spin glasses [2, 5]. A further implication of the theoretical results is the prevalence of saddle points in high-dimensional energy landscapes. When training machine learning models, standard first-order optimization techniques such as stochastic gradient descent may easily get stuck at saddle points rather than converging to a local minimum. For such cases, a combination of first-order and second-order techniques has been proposed for faster training [3, 4].
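To make the saddle-point issue concrete, here is a small two-dimensional sketch. It is a toy illustration only, not one of the methods from [3, 4]: the function f(x, y) = x² + y⁴/4 − y²/2, the step size, and the starting points are all assumptions chosen for the example. The function has a saddle at (0, 0) and minima at (0, ±1). Plain gradient descent started exactly on the line y = 0 converges to the saddle, and only second-order information (the sign of the smallest Hessian eigenvalue) reveals that it is not a minimum.

```python
def grad(x, y):
    """Gradient of the toy function f(x, y) = x**2 + y**4 / 4 - y**2 / 2."""
    return 2.0 * x, y**3 - y


def hessian_eigenvalues(x, y):
    """The Hessian of f is diagonal, so its eigenvalues are the diagonal entries."""
    return 2.0, 3.0 * y**2 - 1.0


def gradient_descent(x, y, lr=0.1, steps=500):
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y


# Started exactly on y = 0, the y-gradient is zero at every step,
# so the iterates slide into the saddle at (0, 0).
stuck = gradient_descent(1.0, 0.0)

# A tiny perturbation in y is amplified along the negative-curvature
# direction, and the iterates reach the minimum at (0, 1).
escaped = gradient_descent(1.0, 1e-6)

for name, point in (("stuck", stuck), ("escaped", escaped)):
    smallest = min(hessian_eigenvalues(*point))
    kind = "saddle" if smallest < 0 else "local minimum"
    print(name, point, kind)
```

The gradient is numerically zero at both end points, so a first-order stopping rule cannot distinguish them; the curvature check does.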

[1] A. Auffinger, G. Ben Arous, and J. Cerny. Random matrices and complexity of spin glasses. *Communications on Pure and Applied Mathematics* 66(2), 2013.

[2] A. Choromanska, M. Henaff, M. Mathieu, G. Ben Arous, and Y. LeCun. The loss surfaces of multilayer networks. *Proceedings of the 18th International Conference on Artificial Intelligence and Statistics*, 2015, San Diego.

[3] Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, and Y. Bengio. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. *Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS'14)*, 2933-2941.

[4] S. J. Reddi, M. Zaheer, S. Sra, B. Poczos, F. Bach, R. Salakhutdinov, and A. Smola. A generic approach for escaping saddle points, 2017. https://arxiv.org/abs/1709.01434

[5] L. Sagun, V. U. Güney, G. Ben Arous, and Y. LeCun. Explorations on high dimensional landscapes. *ICLR 2015.*

**Planned meetings - part I**

2 October, 13.15-14.30, rm 3418: Introduction and overview of spin glasses, Boltzmann machines and feed-forward neural networks

12 October, 13.15-14.30, rm 3418: Random matrices and complexity of spin glasses [1]

23 October, 15.15-16.30, rm 3418: High-dimensional energy landscapes [2, 5]

6 November, 15.15-16.30, rm 3418: Saddle-point problem in non-convex optimization [3, 4]