Tuesday, 12 October, 10:15
Henri Riihimäki (KTH)
Excursions into directed networks: combinatorics, algebra, and topology with machine learning
I will describe three recent works in the context of directed graphs/networks. The motivation and applications stem from brain networks of neurons, and this perspective is interleaved throughout the talk. The three works are: 1) a machine learning pipeline for binary dynamics on a network, 2) a new notion of simplicial connectivity for directed graphs, which extends the Q-analysis of Atkin, and 3) how these simplicial connectivities enable higher Hochschild homologies of directed graphs, going beyond degree 1 of the path algebra.
Tuesday, 05 October, 16:15
Anna Seigal (Mathematical Institute, University of Oxford)
Groups and symmetries in Gaussian graphical models
We can use groups and symmetries to define new statistical models, and to investigate them. In this talk, I will discuss two families of multivariate Gaussian models: 1. RDAG models: graphical models on directed graphs with coloured vertices and edges, 2. Gaussian group models: multivariate Gaussian models that are parametrised by a group. I will focus on maximum likelihood estimation, an optimisation problem to obtain parameters in the model that best fit observed data. For RDAG models and Gaussian group models, the existence of the maximum likelihood estimate relates to linear algebra conditions and to stability notions from invariant theory. This talk is based on joint work with Carlos Améndola, Kathlén Kohn, Visu Makam, and Philipp Reichenbach.
Slides (pdf 1.7 MB)
Tuesday, 28 September, 10:15
Jan Kronqvist (KTH)
Mixed-integer optimization: algorithms and strong relaxations
The presentation focuses on how to solve optimization problems containing both continuous and integer variables. The integer restrictions can originate from disjunctions, logical relations, or general integer properties of some variables. Some algorithms for solving so-called convex MINLP problems are described, and we briefly analyze some of their properties. We will go through some techniques for generating tight polyhedral outer approximations of sets defined as the intersection of a convex set with an integer lattice. A new approach for modeling disjunctive terms is briefly presented, along with an application in robust verification of ReLU-based neural networks. The presentation is intended to give a brief overview of several topics in mixed-integer optimization and some of the presenter's research interests.
Tuesday, 21 September, 10:15
Alex Markham (KTH)
A Causal Graph Kernel Embedding via Distance Covariance
We use the distance covariance, a nonlinear measure of dependence introduced by Székely et al. 2007, to define a new kernel capable of measuring the similarity between the generating causal structures underlying different samples. Our kernel can readily be used with existing machine learning methods, allowing them to incorporate causal information to solve a wide variety of tasks (e.g., clustering with kernel k-means, classification with kernel support vector machines, or dimensionality reduction/data visualization with kernel principal component analysis). In this talk, we focus on the theoretical aspects of the kernel, namely on (i) how equivalence classes of ancestral graphs are embedded into a kernel space defined by the distance covariance and (ii) how this facilitates the isometry of distances between sample points and distances between their generating causal models.
Slides (pdf 5.4 MB)
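As a rough illustration of the dependence measure underlying the kernel (the measure itself, not the causal embedding discussed in the talk), the sample distance covariance of Székely et al. can be computed by double-centring the pairwise distance matrices; the function name and test data below are our own choices.

```python
import numpy as np

def distance_covariance(x, y):
    """Sample distance covariance (Szekely et al. 2007) for 1-D samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])   # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    # double-centre each distance matrix
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return np.sqrt(np.maximum((A * B).mean(), 0.0))

x = np.linspace(0.0, 1.0, 50)
print(distance_covariance(x, x**2))       # nonlinear dependence: strictly positive
print(distance_covariance(x, np.ones(50)))  # constant is independent of x: 0.0
```

Unlike plain covariance, this quantity vanishes (in the population limit) only under independence, which is what makes it usable as a building block for dependence-aware kernels.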
Tuesday, 14 September, 10:15
Yasuaki Hiraoka (Kyoto University, RIKEN)
On characterizing rare events in persistent homology
Indecomposables obtained through decompositions of persistent homology are regarded as topological summaries of real data. However, as is well known, there exist pathologically complicated indecomposables in multi-parameter persistent homology in the purely algebraic setting, and this fact makes it difficult to build a mathematical theory in that setting. Our fundamental question is: how much should we care about such complicated indecomposables in real data, and what is a suitable framework for studying this question? In this talk, after explaining motivations from materials science, we will show several ongoing mathematical results, especially (1) a large deviation principle for one-parameter persistent homology, and (2) a law of large numbers for multi-parameter persistent homology. Then we will discuss how these two results (partially) answer the original question.
Tuesday, 7 September, 10:15
Isabelle Shankar (MPI Leipzig)
Symmetry Adapted Gram Spectrahedra
Sum of squares (SOS) relaxations are often used to certify nonnegativity of polynomials and are equivalent to solving a semidefinite program (SDP). The feasible region of the SDP for a given polynomial is its Gram spectrahedron. For symmetric polynomials, the problem size can be reduced using tools from representation theory. In joint work with Serkan Hosten and Alexander Heaton, we investigate the geometric structure of the spectrahedra that arise in the study of symmetric SOS polynomials: the Symmetry Adapted PSD cone and the Symmetry Adapted Gram Spectrahedron.
Slides (pdf 736 kB)
Tuesday, 31 August, 10:15
Gennadiy Averkov (Brandenburg University of Technology)
Possibilities and limitations of semidefinite approaches to polynomial optimization
This talk is about the application of semidefinite lifts to polynomial optimization problems. Such lifts can be used to solve non-convex polynomial optimization problems globally by reformulating them as semidefinite problems. However, the size of lifted semidefinite formulations is necessarily large in general, because of an obstruction of a combinatorial nature expressed in terms of the facial structure of the sum-of-squares cones.
Slides (pdf 418 kB)
Sage_example (pdf 80 kB)
Tuesday, 15 June, 11:15
Voronoi Cells of Varieties with respect to Wasserstein Distances
(master's thesis presentation)
Voronoi diagrams are partitions of a metric space into Voronoi cells according to distance from the points of some given set, with respect to a chosen metric. In this thesis we examine Voronoi diagrams of manifolds and varieties with respect to the Wasserstein distance from probability theory. We give upper and lower bounds on the dimension of Voronoi cells based on the geometry of the manifolds and of Wasserstein distance balls. We provide an upper bound on the number of full-dimensional Voronoi cells of algebraic varieties and show examples where the bound is tight.
Tuesday, 1 June, 11:15
Interventions for Identifying Context-Specific Causal Structures
(master's thesis presentation)
The problem of causal discovery is to learn the true causal relations among a system of random variables based on the available data. Learning the true causal structure of p variables can be difficult, but it is crucial in many fields of science, such as biology, sociology and artificial intelligence. Classically, it is assumed that the true causal relations are completely encoded by a directed acyclic graph (DAG), and there are numerous algorithms for estimating a DAG representative of a causal system from data. Assuming it is feasible, the most effective way of learning the true causal structure is through interventional experiments. Previous work by Eberhardt et al. identified the number of interventions that is sufficient, and in the worst case necessary, to learn a DAG, and then studied this problem from a game-theoretic perspective, providing an upper bound on the expected number of experiments needed to identify the causal DAG. Here, we consider more general causal models, CStrees, which allow the true causal relations to be context-specific. We extend the results of Eberhardt et al. to the family of CStrees by finding the number of experiments that is sufficient, and in the worst case necessary, for the Scientist to discover the true CStree among p variables. We generalize the game-theoretic approach presented in Eberhardt's paper to CStrees with a specified causal ordering. We also give a geometric description of context-specific hard interventions in CStrees, through a bijection between the stages of the CStree and the faces of a polytope.
Tuesday, 25 May, 11:15
James Cussens (University of Bristol)
Branch-price-and-cut for Bayesian network structure learning
A Bayesian network structure is an acyclic directed graph (DAG) which encodes conditional independence relations between random variables in a joint distribution. One option for learning such a DAG from data is to encode the graph learning problem as an integer linear program (ILP) and send it to an ILP solver. Unfortunately, the number of ILP variables required to represent the problem grows exponentially with the number of random variables. In this talk I will report on ongoing work to get around this problem. The key idea is to add only those ILP variables which are 'necessary' during the course of solving the ILP, a well-known ILP technique called 'pricing'. The talk will not assume prior knowledge of Bayesian network structure learning or integer linear programming.
Tuesday, 4 May, 11:15
A series of mini-talks on large deviations, stochastic simulation and their roles in the mathematics of data and AI
This will not be a traditional talk where I start with a well-defined research question and end with some kind of answer. Instead, it will be a series of “the first 10–15 minutes” of several talks on large deviations and stochastic simulation. These are topics that are ubiquitous in a variety of areas within probability and statistics, or even more broadly in applied mathematics, but have yet to be used extensively in the area of data science. I will start with a brief introduction to the theory of large deviations, one of the cornerstones of modern probability, and then discuss problems involving e.g. gradient flows for interacting particle systems and Markov chain Monte Carlo methods for sampling from complex distributions, and how these link to important questions in data science. If time permits I will end the talk with an outline of some ongoing and future work on the mathematical foundations of data science. As we go along, I will refer back to previous talks in the seminar series to highlight how the topics I discuss have already come up in different contexts. No previous knowledge of any of the topics mentioned is required.
SlidesPart1 (pdf 4.7 MB)
SlidesPart2 (pdf 2.2 MB)
SlidesPart3 (pdf 7.8 MB)
Tuesday, 27 April, 11:15
Extracting persistence features with hierarchical stabilisation
It is often complicated to understand complex correlation patterns between multiple measurements on a dataset. In multi-parameter persistence we represent them through algebraic objects called persistence modules. I will present the hierarchical stabilisation framework, which can be used to produce stable invariants for persistence modules. A fundamental property of such invariants is that they depend on metrics used to compare persistence modules. I will then focus on one invariant obtained via hierarchical stabilisation, called the stable rank. After explaining challenges associated with the computation of the stable rank in the multi-parameter case, I will showcase its use for one-parameter persistence. In particular I will illustrate how the associated kernel can be used on artificial and real-world datasets and show that by varying the metric we can improve accuracy in classification tasks.
This work is in collaboration with the TDA group at KTH.
Tuesday, 20 April, 11:15
Bernd Sturmfels (UC Berkeley / MPI MiS Leipzig / TU Berlin / Leipzig University)
Linear PDE with Constant Coefficients
We discuss algebraic methods for solving systems of homogeneous linear partial differential equations with constant coefficients. The setting is the Fundamental Principle established by Ehrenpreis and Palamodov in the 1960’s. Our approach rests on recent advances in commutative algebra, and it offers new vistas on schemes and coherent sheaves in computational algebraic geometry.
Slides, incl. homework (pdf 1.2 MB)
Tuesday, 13 April, 11:15
Vidit Nanda (University of Oxford)
Links, paths and intersections
Links of strata in singular spaces are fundamental invariants which govern the topology of small neighbourhoods around points in those strata. This talk will focus on inferring links of strata from incomplete information in two completely different contexts. In each case, there are exciting consequences of learning the topology of such links. No prior knowledge of stratification theory will be assumed!
Tuesday, 6 April, 11:15
Topological analysis of spike train data
The methods of topological data analysis have found a particularly fertile field of application in neuroscience. Their ability to capture and quantify information about shape and connections makes them well suited to studying, for example, the geometry and topology of spaces of neural representations of stimuli. I will describe some successful topological methods for the analysis of neural data, focusing on the information they can extract from spike train data, i.e. lists of firing times of different neurons. As a concrete example, I will present results on the spaces of neural responses elicited by particular visual stimuli.
Tuesday, 30 March, 11:15
Elizabeth Gross (University of Hawaiʻi at Mānoa)
Mixed volumes of steady-state systems
The steady-state degree of a chemical reaction network is the number of complex solutions to the steady-state system for generic parameters. In general, the steady-state degree may be difficult to compute, but it can be bounded above by the mixed volume of the system. In this presentation, using tools from combinatorial polyhedral geometry, we compute the mixed volume for three infinite families of networks, each generated by joining smaller networks to create larger ones. Each of these examples illustrates a different relationship between the steady-state degree and the mixed volume of the steady-state system. This is joint work with Cvetelina Hill.
Tuesday, 23 March, 16:15
Peter Bubenik (University of Florida)
Summaries and Distances for Topological Data Analysis
Topological data analysis starts with encoding data as a diagram of topological spaces. To this diagram we apply standard topological, algebraic, and combinatorial tools to produce a summary of our data. Next, we would like to be able to perform a quantitative analysis. I will show how a family of distances, called Wasserstein distances, arise in a natural way, and how an extension of these ideas produces a graded version of this summary and also a summary for which we have not only distances but also angles (in fact, a Hilbert space), allowing us to apply statistics and machine learning. Key mathematical ingredients will include optimal transport, Möbius inversion, and Morse theory.
Tuesday, 16 March, 11:15
Carlos Améndola (TU Munich / Ulm University)
Likelihood Geometry of Correlation Models
Correlation matrices are standardized covariance matrices. They form an affine space of symmetric matrices defined by setting the diagonal entries to one. In this talk we will consider the fascinating geometry of maximum likelihood estimation for this model and for linear submodels that encode additional symmetries. We also consider the problem of minimizing two closely related functions of the covariance matrix, Stein's loss and the symmetrized Stein's loss, which lead naturally to the algebraic statistical concepts of dual ML degree and SSL degree. I will also present exciting open problems in this direction. This is joint work with Piotr Zwiernik.
Tuesday, 9 March, 11:15
Aida Maraj (MPI MiS Leipzig)
Asymptotic Behaviors of Hierarchical Models
A hierarchical model is specified by a simplicial complex that describes the dependency relationships among random variables, together with the number of states of each random variable. Diaconis and Sturmfels have constructed toric ideals that produce Markov bases for these models. This talk concerns quantitative properties of families of ideals arising from hierarchical models with the same dependency relations and varying numbers of states. We introduce and study invariant filtrations of such ideals, related by an action of the symmetric group, and their equivariant Hilbert series. Conditions that guarantee that this multivariate series is a rational function will be presented. The key is to construct finite automata that recognize languages corresponding to invariant filtrations. Lastly, we show that one can similarly prove the rationality of equivariant Hilbert series for some filtrations of algebras. This is joint work with Uwe Nagel.
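In the smallest instance of this setup, two binary random variables with the simplicial complex encoding independence, the toric ideal of Diaconis and Sturmfels can be written down by hand; this classical example may make the general construction easier to picture.

```latex
% 2x2 independence model: the parametrization p_{ij} = a_i b_j
% gives the toric ideal generated by a single 2x2 minor,
I \;=\; \bigl\langle\, p_{11}p_{22} - p_{12}p_{21} \,\bigr\rangle ,
% and the corresponding Markov basis consists of the single move
\begin{pmatrix} +1 & -1 \\ -1 & +1 \end{pmatrix},
% which connects all 2x2 contingency tables with fixed row and column sums.
```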
Tuesday, 2 March, 11:15
Mathias Drton (TU Munich)
Parameter Identification in Linear Causal Models with Latent Variables
Linear causal models relate random variables of interest via a linear equation system that features stochastic noise. These models, also known as structural equation models, are naturally represented by directed graphs whose edges indicate non-zero coefficients in the linear equations. In this talk I will report on progress on combinatorial conditions for parameter identifiability in models with latent (i.e., unobserved) variables. Identifiability holds if the coefficients associated with the edges of the graph can be uniquely recovered from the covariance matrix they define.
Tuesday, 23 February, 11:15
Beatriz Pascual Escudero (Universidad Carlos III de Madrid)
Using Algebraic Geometry to detect robustness in Reaction Networks
An interesting feature of some biological systems is their capacity to preserve certain features despite changes in the environmental conditions. We are concerned in particular with the preservation of the concentrations of specific species interacting in a system, motivated by examples in Reaction Networks: a biological system has absolute concentration robustness (ACR) for some molecular species if the concentration of this species does not vary among the different equilibria that the network admits; in particular, this concentration is independent of the initial conditions. While some classes of networks with ACR have been described, as well as some techniques to check ACR for a given network, finding networks with this property is in general a difficult task.
Motivated by this problem, we will explore local and global notions of robustness on the set of (real positive) solutions of a system of polynomial equations. The goal is to provide a practical test for necessary conditions for ACR, using algebraic-geometric techniques. This is based on joint work with E. Feliu.
Tuesday, 16 February, 11:15
Rickard Brüel Gabrielsson (MIT)
Topology and Machine Learning
We will talk about connections between topology and machine learning. Statistics and local geometry have long served as the underpinning of machine learning, but recent developments allow us to incorporate global geometry and topology with great results. First, topological features such as persistent homology and clustering methods can help us understand what neural networks learn and help construct a topology-inspired learning theory. Next, many of these features are differentiable, which allows us to create a general-purpose machine learning layer through which anyone can easily incorporate topology into any machine learning pipeline. Lastly, we will talk about universal function approximation on graphs and the promise of unsupervised learning and pre-training.
Tuesday, 9 February, 15:45
Rekha Thomas (University of Washington)
Multiview Chirality, Two Cameras and a Cubic Surface
The set of images captured by an arrangement of pinhole cameras is usually modeled by an algebraic variety called the multiview variety. The true set is in fact a semialgebraic subset of this variety, arising from the physical restriction that cameras can only image points in front of them. I will discuss this set for multiple cameras. For a pair of cameras, the minimal problem in this semialgebraic setting is given by 5 point pairs, which, even in general position, can fail to have a "chiral" 3-dimensional reconstruction. We will see that the combinatorics and arithmetic information of this minimal case are carried by a cubic surface with 27 real lines.
Joint work with Sameer Agarwal, Andrew Pryhuber and Rainer Sinn.
Tuesday, 26 January, 11:15
Kaie Kubjas (Aalto University, Helsinki)
Uniqueness of nonnegative matrix factorizations
Nonnegative matrix factorizations are often encountered in data mining applications, where they are used to explain datasets by a small number of parts. For many of these applications it is desirable that there exists a unique nonnegative matrix factorization up to trivial modifications given by scalings and permutations. This means that model parameters are uniquely identifiable from the data. Different sufficient conditions for the uniqueness of nonnegative matrix factorizations have been well studied; however, little is known about necessary conditions. We will give the strongest necessary condition known so far for the uniqueness of a nonnegative factorization. The talk is based on joint work with Robert Krone.
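The "trivial modifications" mentioned in the abstract are easy to see concretely: inserting a positive diagonal scaling and a permutation between the factors yields another nonnegative factorization of the same matrix. A minimal numpy sketch, with the factor sizes and the scaling chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((4, 2))            # nonnegative factors W, H
H = rng.random((2, 5))
M = W @ H                         # the matrix being factorized

D = np.diag([2.0, 0.5])           # positive scaling of the inner dimension
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])        # permutation of the inner dimension
W2 = W @ D @ P                    # still entrywise nonnegative
H2 = P.T @ np.linalg.inv(D) @ H   # still entrywise nonnegative

print(np.allclose(W2 @ H2, M))    # (W2, H2) factorizes the same M: True
```

Uniqueness of a nonnegative factorization is always understood modulo exactly this ambiguity, i.e. modulo the pair (D, P).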
Tuesday, 19 January, 11:15
Points, polytopes, polynomials
Ehrhart theory is concerned with the enumeration of lattice points in polytopes. Given a polytope with vertices in the integer lattice, the function counting the lattice points in its integer dilates agrees with a polynomial, the Ehrhart polynomial of the polytope. In this talk, I will give a brief overview of some of my research interests in Ehrhart theory, in particular, geometric models for enumerative problems, structural results for Ehrhart polynomials as well as ties to valuation theory.
Slides (pdf 911 kB)
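To make the lattice-point counting function from the abstract concrete, the smallest two-dimensional example can be worked out by hand:

```latex
% Example: P = [0,1]^2, the unit square. Its t-th dilate tP = [0,t]^2
% contains (t+1)^2 lattice points, so the Ehrhart polynomial is
L_P(t) \;=\; \#\bigl(tP \cap \mathbb{Z}^2\bigr) \;=\; t^2 + 2t + 1 .
% The leading coefficient is the area of P, and Ehrhart-Macdonald
% reciprocity gives the interior count of tP:
(-1)^2\, L_P(-t) \;=\; (t-1)^2 .
```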
Tuesday, 17 November, 11:15
Nonlinear Algebra of Data Science and AI
Nonlinear algebra is the study of systems of multivariate polynomial equations and inequalities. At the heart of this lies algebraic geometry, but the field is driven by many other areas of mathematics such as combinatorics, convex and discrete geometry, multilinear algebra, tropical geometry, or representation theory. Systems of polynomial (in)equalities appear widely throughout the sciences and engineering. In this talk, I will focus on applications in data science and artificial intelligence. My research focuses on uncovering geometric structures in applied problems and on finding connections between seemingly unrelated areas of mathematics or science in general. I will demonstrate this by describing the underlying (algebraic) geometry of machine learning with neural networks and of 3D reconstruction in computer vision. I will also outline surprising connections between statistics, invariant theory, geometric modeling, intersection theory, and physics.
Slides (pdf 4.3 MB)
Tuesday, 10 November, 11:15
Some of my interests and (old) results
I work in complexity theory, and recently I have studied efficient approximability of NP-hard optimization problems. In particular I have been interested in constraint satisfaction problems (CSPs). In a CSP you are given a large number of constraints, each depending on only a constant number of variables, and the goal is to find an assignment satisfying as many constraints as possible. Central examples are Max-3Sat and sets of linear equations over a finite field where each equation depends on only three variables. I will also briefly mention some problems in post-quantum cryptography.
slides (pdf 264 kB)
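A toy Max-3Sat computation illustrates the CSP setup (the instance below is invented for illustration): a clause on three distinct variables is satisfied by exactly 7/8 of all assignments, so a uniformly random assignment satisfies 7/8 of the clauses in expectation, and brute force finds the true optimum.

```python
from itertools import product

# A tiny Max-3Sat instance over variables x0..x3 (invented for illustration).
# A literal (i, True) means x_i; (i, False) means NOT x_i.
clauses = [
    [(0, True), (1, True), (2, True)],
    [(0, False), (1, False), (3, True)],
    [(1, True), (2, False), (3, False)],
    [(0, True), (2, True), (3, False)],
]

def satisfied(assignment, clause):
    return any(assignment[i] == sign for i, sign in clause)

counts = [sum(satisfied(a, c) for c in clauses)
          for a in product([False, True], repeat=4)]

# Each clause uses 3 distinct variables, so a random assignment satisfies
# it with probability 7/8; on average 7/8 * 4 = 3.5 clauses hold.
print(sum(counts) / len(counts))   # 3.5
print(max(counts))                 # the optimum found by brute force
```

Since the average over all assignments is 3.5 and no assignment can satisfy more than 4 clauses, the optimum here is exactly 4; the hardness results mentioned in the talk say that for Max-3Sat, beating the random-assignment ratio 7/8 is NP-hard in general.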
Tuesday, 3 November, 11:15
Wasserstein distance in algebraic statistics
Any metric on the set of states of a discrete random variable induces a metric on the probability simplex, called the Wasserstein distance. The unit ball of this norm is a polytope, well known in discrete and tropical geometry. Given any data distribution, we seek to minimize its Wasserstein distance to an algebraic model, i.e., a model described by polynomials. The solution to this optimization problem is a piecewise algebraic function of the data. After a gentle introduction to the topic, I will comment on the combinatorial and algebraic complexity of the problem. This talk is based on joint work with Türkü Özlüm Çelik, Asgar Jamneshan, Guido Montúfar and Bernd Sturmfels.
slides (pdf 1.2 MB)
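For a concrete feel of the distance itself (not of the algebraic minimization studied in the talk), the Wasserstein distance between two distributions on a finite state space is a small linear program over transport plans; the helper function and example data here are our own.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein(p, q, D):
    """Wasserstein distance between distributions p and q on n states,
    with ground metric D, via the optimal-transport linear program."""
    n = len(p)
    c = np.asarray(D, float).reshape(-1)   # cost of the plan T, flattened row-wise
    A_eq = np.zeros((2 * n, n * n))
    for k in range(n):
        A_eq[k, k * n:(k + 1) * n] = 1.0   # row marginal: sum_j T[k, j] = p[k]
        A_eq[n + k, k::n] = 1.0            # col marginal: sum_i T[i, k] = q[k]
    b_eq = np.concatenate([p, q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

D = np.abs(np.subtract.outer(np.arange(3), np.arange(3)))  # ground metric |i - j|
print(wasserstein([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], D))    # all mass moves distance 2
```

The unit ball of this norm being a polytope is what makes the minimization over an algebraic model a piecewise algebraic problem.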
Tuesday, 27 October, 10:15
PhD Day!
Tuesday, 20 October, 11:15
Algebraic statistics and the maximum likelihood degree
This talk is about statistical models and algebraic varieties. Algebraic Statistics unites these two concepts, turning algebraic structure into statistical insight. We will discuss the basic ideas of this relatively new field and see some examples of the kind of models algebraic statisticians are interested in. We will focus on an important quantity associated to the likelihood estimation problem on these models: the maximum likelihood degree.
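A standard first example of this quantity: the independence model for two binary variables has maximum likelihood degree one, meaning the likelihood equations have a single solution, given in closed form by the margins of the data table.

```latex
% 2x2 independence model: p_{ij} = a_i b_j inside the probability simplex.
% For a table of counts u = (u_{ij}), the likelihood equations have exactly
% one solution (ML degree 1), a rational function of the data:
\hat{p}_{ij} \;=\; \frac{u_{i+}\, u_{+j}}{u_{++}^{2}} ,
% where u_{i+}, u_{+j}, u_{++} denote row, column, and grand totals.
% For most models the ML degree is larger and no such closed form exists.
```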
Tuesday, 13 October, 11:15
Convergence of linear neural networks to global minimizers
It is known that gradient flow in linear neural networks with Euclidean loss almost always avoids critical points that have at least one eigendirection with negative curvature. Using algebraic invariants of the gradient flow, we try to prove that the set of all critical points with no second-order curvature (zero Hessian) for arbitrary networks is associated with a subset of the invariants of lower dimension. This would mean that these critical points are almost surely avoided. We show that this holds for networks with 3 or fewer hidden layers and in a few other special cases. We show by way of an explicit counterexample that it is not true for general deep networks.
slides (pdf 477 kB)
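One family of the algebraic invariants mentioned above can be checked numerically: for a two-layer linear network trained by gradient flow on a Euclidean loss, the matrix W1 W1^T - W2^T W2 is exactly conserved, and small-step gradient descent conserves it approximately. A sketch in which the architecture, data, and step size are our own arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 10))        # inputs: 3 features, 10 samples
Y = rng.standard_normal((2, 10))        # targets: 2 outputs
W1 = 0.3 * rng.standard_normal((4, 3))  # layer 1: 3 -> 4
W2 = 0.3 * rng.standard_normal((2, 4))  # layer 2: 4 -> 2

def invariant(W1, W2):
    # Conserved under gradient flow of L = 0.5 * ||W2 W1 X - Y||_F^2
    return W1 @ W1.T - W2.T @ W2

Q0 = invariant(W1, W2)
lr = 1e-3
for _ in range(200):                    # small steps approximate the flow
    R = W2 @ W1 @ X - Y                 # residual
    g1 = W2.T @ R @ X.T                 # dL/dW1
    g2 = R @ X.T @ W1.T                 # dL/dW2
    W1 -= lr * g1
    W2 -= lr * g2

drift = np.abs(invariant(W1, W2) - Q0).max()
print(drift)                            # small: near-conservation along the trajectory
```

Under the exact flow the first-order change of the invariant cancels identically (g1 W1^T equals W2^T g2), so the only drift here is the O(lr^2) discretization error of gradient descent.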
Tuesday, 6 October, 11:15
Exact semidefinite programming bounds for packing problems
In the first part of the talk, I present how semidefinite programming methods can provide upper bounds for various geometric packing problems, such as kissing numbers, spherical codes, or packings of spheres into a larger sphere. When these bounds are sharp, they give additional information on optimal configurations, which may lead to a proof of the uniqueness of such packings. For example, we show that the lattice E8 gives the unique solution to the kissing number problem on the hemisphere in dimension 8. However, semidefinite programming solvers provide approximate solutions, and some additional work is required to turn them into an exact solution, giving a certificate that the bound is sharp. In the second part of the talk, I explain how, via our rounding procedure, we can obtain an exact rational solution of a semidefinite program from an approximate floating-point solution given by the solver. This is joint work with David de Laat and Philippe Moustrou.
slides (pdf 6.3 MB)
Tuesday, 22 September, 11:15
Combinatorics, Algebra, and Intervention
The modern theory of causality is founded on our ability to encode both probabilistic and causal information about a data-generating distribution within the structure of a graph. Doing so in the proper way allows us to derive theorems which can be applied when developing data-driven causal structure learning algorithms, as well as in probabilistic and causal inference. Such algorithms work best in cases where the corresponding theorems point to nice combinatorial structure in the graph. Going deeper, if we also consider a parameterization of the joint distribution, we find that nice properties of the model can be attributed to nice properties of its defining algebraic variety in the ambient parameter space. Classic results of this nature are known for probabilistic graphical models. In this talk, we will see how such results generalize to interventional graphical models, which are now being used to learn causal structure from a mixture of observational and interventional data. Time permitting, we will explore how the proofs of these algebraic results give rise to new methods for modeling causation in context-specific settings.
slides (pdf 277 kB)
Tuesday, 15 September, 11:15
Dynamics of chemical reaction networks and positivity of polynomials
In biochemical processes, different molecules are consumed and produced as they interact with each other. The evolution of their concentration through time can be modelled with a system of differential equations that, under certain assumptions, is polynomial. In this case, the equilibrium points form an algebraic variety. In this talk I will present how these systems arise, and how deciding whether a polynomial can attain positive or negative values is used for detecting multistationarity and stability in the biochemical system.
slides (pdf 307 kB)
Tuesday, 8 September, 11:15
An overview of the pipeline of multiparameter persistence
In the last two decades, persistent homology and its generalizations have grown into a substantial branch of mathematical research. The field gives rise to various research problems in algebra, category theory, geometry, topology, and other areas. Moreover, since it has immediate applications in data science, efficient algorithms, computational and probabilistic methods, and concrete implementations deserve the same amount of attention. Multiparameter persistence is the generalization of persistence to multiple independent parameters taken into account at once. While the theory of multiparameter persistence is provably much more complicated than the theory of ordinary persistence, new computational challenges arise as well. In this talk, I give a rough overview of the computational pipeline of multiparameter persistence, mention some important challenges, and outline the projects of my PhD thesis along these lines.
slides (pdf 9.8 MB)