Forward and Inverse Decision-Making in Adversarial, Cooperative, and Biologically-Inspired Dynamical Systems

Time: Fri 2021-06-11 10.00

Location: Zoom link for online defense (English)

Subject area: Electrical Engineering

Doctoral student: Inês de Miranda de Matos Lourenço, Reglerteknik

Opponent: Assistant Professor Karinne Ramirez-Amaro, Chalmers University of Technology

Supervisor: Professor Bo Wahlberg, Reglerteknik

Abstract

Decision-making is the mechanism of using available information to develop solutions to given problems by forming preferences or beliefs, or by selecting a course of action among several alternatives. It is a main focus of a variety of scientific fields, such as robotics, finance, and neuroscience. In this thesis, we study the mechanisms that generate behavior in diverse decision-making settings (the forward problem) and how their characteristics can explain observed behavior (the inverse problem). Both problems take a central role in current research due to the desire to understand the features of system behavior, often under conditions of risk and uncertainty. We study decision-making problems in the following three settings.

In the first setting, we consider a decision-maker who forms a private belief (posterior distribution) on the state of the world by filtering private information. Estimating private beliefs is a way to understand what drives decisions, and it forms a foundation for predicting and counteracting future actions. In the setting of adversarial systems, we address the questions of i) how an adversary can estimate the private belief of the decision-maker by observing its decisions (under two different scenarios), and ii) how the decision-maker can protect its private belief by confusing the adversary. We exemplify the applicability of our frameworks in regime-switching Markovian portfolio allocation.

In the second setting, we shift from an adversarial to a cooperative scenario. We consider a teacher-student framework similar to that used in learning-from-demonstration and transfer-learning setups. An expert agent (teacher) knows the model of a system and wants to assist a learner agent (student) in performing identification for that system, but cannot directly transfer its knowledge to the student. For example, the teacher's knowledge of the system might be abstract, or the teacher and student might be employing different model classes, which renders the teacher's parameters uninformative to the student. We propose correctional learning as an approach in which, to assist the student, the teacher can intercept the observations collected from the system and modify them to maximize the amount of information the student receives about the system. We obtain finite-sample results for correctional learning of binomial systems.

In the third and final setting, we shift our attention to cognitive science and the decision-making of biological systems, to obtain insight into the intrinsic characteristics of these systems. We focus on time perception, that is, how humans and animals perceive the passage of time, and solve the forward problem by designing a biologically-inspired decision-making framework that replicates the mechanisms responsible for time perception. We conclude that a simulated robot equipped with our framework is able to perceive time similarly to animals with respect to the intrinsic mechanisms of interpreting time and performing time-aware actions. We then focus on the inverse problem. Based on the empirical action probability distribution of the agent, we are able to estimate the parameters it uses for perceiving time. Our work shows promising results for drawing conclusions about some of the characteristics present in biological timing mechanisms.

urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295301
