Stefan Miletic: Quasi-Newton Methods for Neural Network Training in Machine Learning
Time: Tue 2021-08-24 09.00 - 10.00
Respondent: Stefan Miletic
Abstract: The theory of mathematical optimization provides a powerful tool in the modern sciences, as computational capabilities allow for fast and reliable solutions. One such application area is deep learning, where algorithms are trained to solve various problems within the field of machine learning. Typically, a loss or error function is defined that needs to be minimized. For this reason, a brief introduction to optimization theory is given, with a focus on iterative methods incorporating line search and trust region techniques. Particular attention will be paid to Newton's method, and we will see how it can be improved to handle the large-scale problems that typically arise in network training. To better understand the nature of these problems, and why Newton-based iterative methods handle them well, a short mathematical introduction to neural networks and network training will be presented, with attention paid to a special type of network called the feed-forward network. Lastly, popular quasi-Newton methods will be introduced and explored in detail, with derivation and convergence analysis in focus. A large portion of our attention will be dedicated to recent improvements to quasi-Newton methods aimed at deep neural network training performance.
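To give a flavour of the quasi-Newton idea the abstract refers to, here is a minimal BFGS sketch in Python (this is illustrative, not code from the thesis; the test function, tolerances, and backtracking constants are assumptions). It maintains an approximation H to the inverse Hessian, updated from gradient differences, and combines the resulting search direction with a simple Armijo line search:

```python
import numpy as np

def bfgs_minimize(f, grad, x0, iters=50):
    """Minimal BFGS sketch: H approximates the inverse Hessian."""
    n = x0.size
    H = np.eye(n)                      # initial inverse-Hessian approximation
    x = x0.astype(float)
    g = grad(x)
    for _ in range(iters):
        p = -H @ g                     # quasi-Newton search direction
        t = 1.0                        # backtracking (Armijo) line search
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        x_new = x + t * p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-10:                 # curvature condition keeps H positive definite
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
        if np.linalg.norm(g) < 1e-8:   # gradient small enough: stop
            break
    return x

# Illustrative quadratic "loss": f(x) = 0.5 x^T A x - b^T x, minimized where A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
x_star = bfgs_minimize(f, grad, np.zeros(2))
```

For large-scale network training, storing the dense matrix H is infeasible; limited-memory variants such as L-BFGS instead rebuild the direction from a short history of (s, y) pairs, which is the setting the thesis's large-scale discussion concerns.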