# Seminars - Probability and statistics

KTH's seminar series on probability, statistics, data science and related topics.

The seminar takes place on Mondays 15:15–16:16 in room **3721** (7th floor; on occassions moved to room 3418 on the 4th floor).

Those interested in the seminar series may also be interested in the Applied CATS seminar and the MathDataLab seminar .

## Next Talk

Monday, May 22, 15:15

**Matteo Sfragara (Stockholm University) Queue-based random-access protocols for wireless networks **

Abstract: In this talk, we discuss mathematical models that address fundamental challenges in wireless networks. We first introduce Carrier-Sense Multiple-Access (CSMA) protocols, distributed algorithms that involve randomness to prevent the devices to transmit simultaneously and their signals to interfere with each other. These random-access models can be viewed as interacting particle systems on graphs, where the interference between signals in the network is captured by a hard-core interaction model on an appropriate interference graph. We then describe how these models exhibit metastability: when the activation rates become large, the system tends to stabilize in configurations with the maximum number of active nodes, with extremely slow transitions between them. As a consequence, individual nodes may experience prolonged periods of starvation, resulting in long queues and delays. Understanding metastability properties and slow transitions is thus crucial to design mechanisms to counter these effects and improve the performance of the network. In particular, we focus on random-access models where the activation rates depend on the queues at the nodes. Not much is known about these queue-dependent models, since most of the literature deals with models with fixed activation rates. The joint activity state together and the joint queue length process evolve as a time-homogeneous Markov process, whose stationary distribution is challenging to analyze. We discuss three different network topologies: we start with complete bipartite networks, we then generalize our results to arbitrary bipartite networks, and finally we explore dynamic bipartite networks in which the interference graph changes over time, which allows us to capture some effects of user mobility. This talk is based on three papers written during my PhD under the supervision of Frank den Hollander (Leiden University), Sem Borst (Eindhoven University of Technology) and Francesca R. Nardi (University of Florence).

- Monday, 22 May, 15:15

**Matteo Sfragara (SU)** - Monday, 29 May, 15:15

**Zack Bezemek (Boston)** - Monday, 5 June, 15:15

**TBA** - Monday, 19 June, 15:15

**TBA**

- Tuesday, 11 April, 14:00

**Amit Moscovich (Tel Aviv)** - Monday, 6 March, 15:15

**Marcus Christiansen (Oldenburg)** - Tuesday, 21 February, 2023, 10:15

**Huyên Pham (Paris)** - Monday, 20 February, 15:15

**Anja Janssen (Magdeburg)** - Monday, 30 January, 2023, 15:15

**Erik Lindström (Lund)** - Monday, December 12, 2022, 15:15

**Christian Furrer (Copenhagen)** - Monday, 24 October, 2022, 15:15

**Umberto Picchini (Chalmers)** - Monday, 18 October, 2022, 14:00

**Pierre Nyquist (docent lecture)** - Monday, 17 October, 2022, 15:15

**Lukasz Delong (Warsaw School of Economics)** - Monday, 23 May, 2022, 15:15

**Alex Cox (Bath)** - Monday, 11 April, 2022, 15:15

**Ruoyu Wu (Indiana State)** - Monday, 4 April, 2022, 15:15

**Michela Ottobre (Heriot-Watt)** - Monday, 28 March, 2022, 15:15

**Yin-Ting Liao (Brown)** - Monday, 21 March, 2022, 15:15

**Björn Löfdahl (SEB)** - Monday, 7 March, 2022, 15:15

**Emma Horton (INRIA Bordeaux)** - Monday, December 6, 2021, 15:15

**Nawaf Bou-Rabee (Rutgers)**

*Upper Bounds for Mixing Time of Unadjusted Hamiltonian Monte Carlo by Couplings*

Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo method that is frequently used in statistical physics, statistics, and machine learning. The transition step of HMC uses a combination of Hamiltonian dynamics and velocity randomizations. The Hamiltonian dynamics is typically discretized using a reversible/volume-preserving integrator, and the discretization bias can either be borne (unadjusted HMC) or eliminated by Metropolis-Hastings (adjusted HMC). Despite its empirical success, until a few years ago there were few mixing time guarantees for HMC. Now, approaches to quantify convergence to equilibrium based on coupling, conductance and hypocoercivity have been developed. I will talk about the coupling approach, and show how it can be used to obtain mixing time guarantees for unadjusted HMC in some high-dimensional and non-convex models. This talk is based on joint work with Andreas Eberle (Bonn) and Katharina Schuh (Bonn). - Monday, November 15, 2021, 15:15

**Tianfang Zhang (KTH and RaySearch)**

*A similarity-based Bayesian mixture-of-experts model*

We present a new nonparametric mixture-of-experts model for multivariate regression problems, inspired by the probabilistic k-nearest neighbors algorithm. Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point, yielding predictive distributions represented by Gaussian mixtures. Posterior inference is performed on the parameters of the mixture components as well as the distance metric using a mean-field variational Bayes algorithm accompanied with a stochastic gradient-based optimization procedure. The proposed method is especially advantageous in settings where inputs are of relatively high dimension in comparison to the data size, where input–output relationships are complex, and where predictive distributions may be skewed or multimodal. Computational studies on two synthetic datasets and one dataset comprising dose statistics of radiation therapy treatment plans show that our mixture-of-experts method performs similarly or better than a conditional Dirichlet process mixture model both in terms of validation metrics and visual inspection. - Monday, June 7, 2021, 15:15

**Julia Gaudio (Northwestern)**

*Shotgun Assembly of Erdos-Renyi Random Graphs*

Graph shotgun assembly refers to the problem of reconstructing a graph from a collection of local neighborhoods. We consider shotgun assembly of Erdos-Renyi random graphs G(n, p_n), where p_n = n^{-α} for 0 < α < 1. We consider both reconstruction up to isomorphism as well as exact reconstruction (recovering the vertex labels as well as the structure). We show that given the collection of distance-1 neighborhoods, G is exactly reconstructable for 0 < α < 1/3, but not reconstructable for 1/2 < α < 1. Given the collection of distance-2 neighborhoods, G is exactly reconstructable for 0 < α < 3/5, but not reconstructable for 3/4 < α < 1. (Joint work with Elchanan Mossel) - Monday, May 31, 2021, 15:15

**Oliver Tse (Eindhoven)**

*On generalized gradient flows*

The theory for variational evolutions—evolutions driven by one or more energies-or entropies—in spaces of measures has seen tremendous growth in the last decades, of which resulted in a rich framework for classical gradient systems in general metric spaces by Ambrosio, Gigli and Savaré, where the Wasserstein metric of optimal transport theory plays a fundamental role; and a theory for rate-independent systems. While these theories have allowed massive development of variational evolutions in a certain direction—gradient flows with homogeneous dissipation—physics and large-deviation theory suggest the study of generalized gradient flows—gradient flows with non-homogeneous dissipation—which are not covered in either of these established theories. In this talk, I will discuss the motivation underlying the need for a generalized theory of gradient flows and how these structures can be used in practice. - Monday, May 3, 2021, 15:15

**Guo-Jhen Wu (KTH)**

*Analysis and optimization of certain parallel Monte Carlo methods in the low temperature limit*

Metastability is a formidable challenge to Markov chain Monte Carlo methods. In this talk we present methods for algorithm design to meet this challenge. The design problem we consider is temperature selection for the infinite swapping scheme, which is the limit of the widely used parallel tempering scheme obtained when the swap rate tends to infinity. We use a recently developed tool for the large deviation properties of the empirical measure of a metastable small noise diffusion to transform the variance reduction problem into an explicit graph optimization problem. The nodes in the graph optimization problem correspond to metastable states of the noiseless dynamics. Our first analysis of the optimization problem is in the setting of a double well model, and it shows that the optimal selection of temperature ratios is a geometric sequence except possibly the highest temperature. In the same setting we identify two different sources of variance reduction, and show how their competition determines the optimal highest temperature. In the general multi-well setting we prove that the same geometric sequence of temperature ratios as in the two-well case is always nearly optimal, with a performance gap that decays geometrically in the number of temperatures. Moreover, this optimal placement of temperatures is explicit and independent of the particular functional being integrated or (with mild restrictions) on the potential. - Monday, 26 April, 2021, 15:15

**Yulong Lu (UMass Amherst)**

*A priori generalization error analysis of neural network methods for solving high dimensional PDEs*

Neural network-based machine learning methods, including the most notably deep learning have achieved extraordinary successes in numerous fields. Despite the rapid development of learning algorithms based on neural networks, their mathematical analysis is far from understood. In particular, it has been a big mystery that neural network-based machine learning methods work extremely well for solving high dimensional problems. In this talk, we will demonstrate the power of neural network methods for solving high dimensional PDEs. Specifically, we will discuss an a priori generalization error analysis of the Deep Ritz Method for solving two classes of high dimensional Schrodinger problems: the stationary Schrodinger equation and the ground state of Schrodinger operator. Assuming the exact solution or the ground state lies in a low-complexity function space called spectral Barron space, we show that the convergence rate of the generalization error is independent of dimension. We also develop a new regularity theory for the PDEs of consideration on the spectral Barron space. This can be viewed as an analog of the classical Sobolev regularity theory for PDEs. - Monday, 12 April, 2021, 15:15

**Will Leeb (Minnesota)**

*Matrix Denoising with Weighted Loss*

This talk will describe a new class of methods for estimating a low-rank matrix from a noisy observed matrix, where the error is measured by a type of weighted loss function. Such loss functions arise naturally in a variety of problems, such as submatrix denoising, filtering heteroscedastic noise, and estimation with missing data. We introduce a family of spectral denoisers, which preserve the left and right singular subspaces of the observed matrix. Using new asymptotic results on the spiked covariance model in high dimensions, we derive the optimal spectral denoiser for weighted loss. We demonstrate the behavior of our method through numerical simulations. - Monday, 29 March, 2021, 15:15

**Markus Fischer (Padua)**

*Correlated equilibria and mean field games*

Mean field games are limit models for symmetric N-player games, as N tends to infinity, where the prelimit models are solved in terms of Nash equilibria. A generalization of the notion of Nash equilibrium, due to Robert Aumann (1974, 1987), is that of correlated equilibrium. In a simple discrete, non-static setting, we will discuss mean field games based on correlated equilibria. We give a definition of correlated mean field game solution, prove that it arises as limit of symmetric N-player correlated equilibria in restricted Markov open-loop strategies, and construct approximate N-player equilibria starting from a correlated mean field game solution. We also compare our definition to the one by D. Lacker of weak solutions for mean field games without common noise, and give an example of correlated mean field game solutions with non-deterministic flow of measures. Joint work with Luciano Campi, University of Milan "La Statale" Monday, 22 March, 2021, 15:15

**Harsha Honnappa (Purdue)**

*New Insights on Queues in Random Environments*

Queueing systems are often subject to clear nonstationarities that arise a as a consequence of diurnal, seasonal or stochastic effects, with the latter emerging as a consequence of being subject to a random environment. The modeling and performance analysis of queues in random environments, in particular, has attracted significant interest in the recent literature. In this talk I will present recent work on approximating performance metrics of infinite server queueing systems that are fed by arrival processes with stochastic arrival intensities that fluctuate rapidly. This setting includes Cox processes, as well as self-excited models such as the Hawkes process. At the outset, it is clear that computing performance metrics in this setting cannot be achieved in closed form, raising the question of how to compute approximations. In particular, one is confronted by the question of whether to use an annealed setting (where the performance measures are “averaged” or “annealed” over the random environment) or a quenched setting (where the performance measures are conditioned on the random environment). I will present our on-going work studying both settings establishing asymptotic approximations in three flavors: as asymptotic “refinements”, fluid-scale limits and diffusive-scale limits of performance measures. In the case of infinite server queues driven by Cox processes, our results show that there exists a parameter regime where the quenched limits exist (and coincide with the annealed limits) and a complementary regime where they may not, suggesting that the quenched analysis, though seemingly more robust and simpler to carry out, should be used with caution. This is based on joint work with Peter W. Glynn, Zeyu Zheng at Stanford University and Samy Tindel, Aaron Nung-Kwan Yip and Yiran Liu at Purdue University.Monday, 15 March, 2021, 15:15

**Lukasz Szpruch (Edinburgh)**

*Mean-Field Neural ODEs via Relaxed Optimal Control*

We develop a framework for the analysis of deep neural networks and neural ODE models that are trained with stochastic gradient algorithms. We do that by identifying the connections between control theory, deep learning and theory of statistical sampling. We derive Pontryagin's optimality principle and study the corresponding gradient flow in the form of Mean-Field Langevin dynamics (MFLD) for solving relaxed data-driven control problems. Subsequently, we study uniform-in-time propagation of chaos of time-discretised MFLD. We derive explicit convergence rate in terms of the learning rate, the number of particles/model parameters and the number of iterations of the gradient algorithm. In addition, we study the error arising when using a finite training data set and thus provide quantitive bounds on the generalisation error. Crucially, the obtained rates are dimension-independent. This is possible by exploiting the regularity of the model with respect to the measure over the parameter space.Monday, 1 March, 2021, 15:15

**Adam Waterbury (UNC Chapel Hill)**

*Approximating Quasi-Stationary Distributions with Interacting Reinforced Random Walks*

We propose two numerical schemes for approximating quasi-stationary distributions (QSD) of finite state Markov chains with absorbing states. Both schemes are described in terms of certain interacting chains in which the interaction is given in terms of the total time occupation measure of all particles in the system. The schemes can be viewed as combining the key features of the two basic simulation-based methods for approximating QSD originating from the works of Fleming and Viot (1979) and Aldous, Flannery, and Palacios (1998), respectively. In this talk I will describe the two schemes, discuss their convergence properties, and present some exploratory numerical results comparing them to other QSD approximation methods.Monday, 22 February, 2021, 15:15

**Clara Stegehuis (Twente)**

*Optimal constrained and unconstrained subgraph structures*

Subgraphs contain important information about network structures and their functions. We investigate the presence of subgraphs in a random graph model with infinite-variance degrees. We introduce an optimization problem which identifies the dominant structure of any given subgraph. The unique optimizer describes the degrees of the vertices that together span the most likely subgraph and allows us count and characterize all subgraphs. We then show that this optimization problem easily extends to other network structures, such as clustering, which expresses the probability that two neighbors of a vertex are connected. The optimization problem is able to find the behavior of network subgraphs in a wide class of random graph models.