Identification of Non-Linear Differential-Algebraic Equations
Scalable and Consistent Parameter Estimation with Process Disturbances
Time: Fri 2024-12-13 10.00
Location: D3, Lindstedtsvägen 5, Stockholm
Video link: https://kth-se.zoom.us/j/69456726748
Language: English
Subject area: Electrical Engineering
Doctoral student: Robert Bereza-Jarocinski, Reglerteknik
Opponent: Professor Jonas Sjöberg, Chalmers University of Technology, Gothenburg, Sweden
Supervisor: Professor Håkan Hjalmarsson, Reglerteknik; Professor Cristian R. Rojas, Reglerteknik; Professor David Broman, Programvaruteknik och datorsystem, SCS
QC 20241120
Abstract
This thesis is concerned with system identification for non-linear differential-algebraic equations affected by process disturbances. This model class is chosen because it is more general than, e.g., ordinary differential equation models. In particular, it is the underlying model type of equation-based object-oriented modeling languages, such as Modelica and VHDL-AMS, and is thus suitable for modeling a large family of physical systems. A particular focus is placed on modeling process disturbances and taking them into account during identification, to address the biased estimates that can occur when process disturbances are neglected. Furthermore, the methods in this thesis are developed to be computationally tractable and to produce consistent estimators. In particular, approaches to improve how the methods scale with the number of unknown parameters are studied. This is important because many conventional identification methods are intractable in complex settings like the one considered in this thesis.
As a first step, a sub-optimal but consistent estimator is proposed for solving the problem, and it is shown how it can be computed using stochastic approximation methods. Forward sensitivity analysis for differential-algebraic equations is studied and applied to compute unbiased gradient estimates of the considered cost function. The tractability of the method is demonstrated through a simulation experiment on a pendulum model, where the benefits of taking process disturbances into account are also shown. With this approach, identifying the parameters of the disturbance model requires access to derivatives of the disturbances with respect to those parameters.
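To make the idea of forward sensitivity analysis concrete, the following is a minimal sketch, not the method of the thesis. It assumes a hypothetical toy semi-explicit index-1 system, x' = -theta*z with constraint 0 = z - x^3, a forward Euler discretization, and no process disturbance. The function name and all numerical values are illustrative assumptions.

```python
def simulate_with_sensitivity(theta, x0=1.0, h=1e-3, n_steps=1000):
    """Integrate a toy DAE together with its forward sensitivity.

    Toy model (an assumption for illustration only):
        x' = -theta * z,    0 = z - x**3
    The sensitivities sx = dx/dtheta and sz = dz/dtheta follow from
    differentiating both equations with respect to theta:
        sx' = -z - theta * sz,    0 = sz - 3 * x**2 * sx
    """
    x, sx = x0, 0.0  # the initial state does not depend on theta
    for _ in range(n_steps):
        z = x**3                 # solve the algebraic constraint for z
        sz = 3.0 * x**2 * sx     # solve the differentiated constraint for sz
        dx = -theta * z          # differential equation
        dsx = -z - theta * sz    # sensitivity equation
        x, sx = x + h * dx, sx + h * dsx  # forward Euler step for both
    return x, sx


def finite_difference(theta, eps=1e-6):
    """Central finite difference of the final state, used as a cross-check."""
    x_plus, _ = simulate_with_sensitivity(theta + eps)
    x_minus, _ = simulate_with_sensitivity(theta - eps)
    return (x_plus - x_minus) / (2.0 * eps)
```

In this sketch the sensitivity system has one extra differential and one extra algebraic variable per unknown parameter, which illustrates why the cost of forward sensitivity analysis grows with the number of parameters.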
Because forward sensitivity analysis can become intractable as the number of unknown parameters grows, adjoint sensitivity methods are investigated as a second step. Adjoint sensitivity analysis for differential-algebraic equations is not directly applicable to our problem formulation, so we extend it to compute unbiased gradient estimates while avoiding the unnecessary intermediate computations present in forward sensitivity analysis. The extension is also applicable to identifying parameters of the disturbance model if gradients of the disturbances with respect to their parameters are available. The computational benefits of the adjoint method are demonstrated through a simulation experiment on a delta robot, where we also observe some numerical challenges that can occur when solving differential-algebraic equations.
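The scaling argument for adjoint methods can be illustrated with a small sketch. It is not the extension developed in the thesis: it assumes a hypothetical scalar ordinary differential equation x' = -theta[0]*x^3 + theta[1]*x, a forward Euler discretization, a terminal cost J = 0.5*x_N^2, and the discrete adjoint of that scheme. All names and values are illustrative assumptions.

```python
def rhs(x, theta):
    """Right-hand side of the toy ODE (an assumption for illustration)."""
    return -theta[0] * x**3 + theta[1] * x


def terminal_cost(theta, x0=1.0, h=1e-3, n_steps=1000):
    """Forward Euler simulation followed by the cost J = 0.5 * x_N**2."""
    x = x0
    for _ in range(n_steps):
        x = x + h * rhs(x, theta)
    return 0.5 * x**2


def adjoint_gradient(theta, x0=1.0, h=1e-3, n_steps=1000):
    """Gradient of the terminal cost w.r.t. both parameters via one backward pass."""
    # Forward pass: store the trajectory, since the backward pass needs it.
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + h * rhs(xs[-1], theta))

    lam = xs[-1]        # adjoint at the final step: dJ/dx_N = x_N
    grad = [0.0, 0.0]
    for k in reversed(range(n_steps)):
        x = xs[k]
        # Accumulate lam_{k+1} * h * df/dtheta evaluated at x_k.
        grad[0] += lam * h * (-x**3)
        grad[1] += lam * h * x
        # Propagate the adjoint: lam_k = lam_{k+1} * (1 + h * df/dx at x_k).
        lam *= 1.0 + h * (-3.0 * theta[0] * x**2 + theta[1])
    return grad
```

Here a single backward recursion in the scalar adjoint variable yields the derivative with respect to every parameter, whereas forward sensitivity analysis would integrate one sensitivity equation per parameter. The price is that the forward trajectory has to be stored or recomputed.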
As a third and final part of the thesis, disturbance models are studied in more detail. In particular, the necessity and challenges of modeling disturbances in continuous time are discussed, and it is shown how derivatives of the disturbances with respect to their parameters can be computed. Insights about stochastic differential equations are used to develop a way to approximate these types of equations by ordinary differential equations, which allows us to apply adjoint sensitivity analysis without computing derivatives of the disturbances with respect to the parameters as an intermediate step. This can improve the efficiency of the adjoint method, especially under some mild assumptions on the poles of the disturbance model. The necessary theory for using larger and more complex disturbance models is thus developed.
Together, these developments allow for tractable estimation methods that are expected to produce consistent estimators even when the system is affected by process disturbances. As the number of unknown parameters grows, computational tractability still becomes an issue, but the methods presented in this thesis push the boundary of how many unknown parameters can be handled. This limitation, other remaining challenges, and potential directions for further research are discussed at the end of the thesis.