Skip to main content

Novel Hessian approximations in optimization algorithms

Time: Fri 2022-01-28 13.00

Location: U41, Brinellvägen 26, Stockholm

Language: English

Subject area: Electrical Engineering

Doctoral student: Erik Berglund , Reglerteknik

Opponent: Associate Professor Pontus Giselsson,

Supervisor: Professor Mikael Johansson, Reglerteknik

Export to calendar

QC 20220120

Abstract

There are several benefits of taking the Hessian of the objective function into account when designing optimization algorithms. Compared to using strictly gradient-based algorithms, Hessian-based algorithms usually require fewer iterations to converge. They are generally less sensitive to tuning of parameters and can better handle ill-conditioned problems. Yet, they are not universally used, due to there being several challenges associated with adapting them to various challenging settings. This thesis deals with Hessian-based optimization algorithms for large-scale, distributed and zeroth-order problems. For the large-scale setting, we contribute with a new way of deriving limited memory quasi-Newton methods, which we show can achieve better results than traditional limited memory quasi-Newton methods with less memory for some logistic and linear regression problems. For the distributed setting, we perform an analysis of how the error of a Newton-step is affected by the condition number and the number of iterations of a consensus-algorithm based on averaging, We show that the number of iterations needed to solve a quadratic problem with relative error less than ε grows logarithmically with 1/ε and also with the condition number of the Hessian of the centralized problem. For the zeroth order setting, we exploit the fact that a finite difference estimate of the directional derivative works as an approximate sketching technique, and use this to propose a zeroth order extension of a sketched Newton method that has been developed to solve large-scale problems. With the extension of this method to the zeroth order setting, we address the combined challenge of large-scale and zeroth order problems.

urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-307186