Brummer & Partners MathDataLab
Welcome to the home page of the Brummer & Partners MathDataLab. Here you can find information about the Lab and its activities.
Seminar with Jonas Peters, University of Copenhagen, Oct 15
Room F11, Monday Oct 15, 15.15-16.15
Title: Causality and data
Abstract: Causality enters data science in different ways. The goal of causal discovery is to learn causal structure from observational data, an important but difficult problem. Several methods rely on testing for conditional independence. We prove that, statistically, this is fundamentally harder than testing for unconditional independence; solving it requires carefully chosen assumptions on the data-generating process. In many practical problems, the focus may lie on prediction, and it is not necessary to solve (full) causal discovery. It might still be beneficial, however, to apply causality-related ideas. In particular, interpolating between causality and predictability enables us to infer models that yield more robust prediction with respect to changes in the test set. We illustrate this idea for ODE-based systems considering artificial and real data sets. The talk does not require any prior knowledge of causal inference. It contains joint work with Stefan Bauer, Niklas Pfister, and Rajen Shah.
Seminar with Phyllis Wan, Rotterdam University, Oct 22
Room F11, Monday Oct 22, 15.15-16.15
Title: Modeling social networks through linear preferential attachment
Abstract: Preferential attachment is an appealing mechanism for modeling power-law behavior of degree distributions in social networks. In this talk, we consider fitting a directed linear preferential attachment model to network data under three data scenarios: 1) When the full history of the network growth is given, the MLE of the parameter vector and its asymptotic properties are derived. 2) When only a single-time snapshot of the network is available, an estimation method combining the method of moments with an approximation to the likelihood is proposed. 3) When the data are believed to have come from a misspecified model or have been corrupted, a semi-parametric approach to model heavy-tailed features of the degree distributions is presented, using ideas from extreme value theory. We illustrate these estimation procedures and explore the usage of this model through simulated and real data examples. This is joint work with Tiandong Wang (Cornell), Richard Davis (Columbia) and Sid Resnick (Cornell).
Workshop on Mathematics for Complex Data, May 30-31, 2018
The purpose of this workshop is to bring together researchers interested in the mathematics of complex data. There will be talks on mathematical methods for data analysis as well as presentations of complex data in applications.
Two lectures by Scott Baden, Lawrence Berkeley National Laboratory and University of California, San Diego, May 2-3, 2018
Lecture 1: Room F11, Wednesday, May 2, 11.15-12.00
Lecture 2: Room F11, Thursday, May 3, 11.15-12.00
Title: Scalable memory machines
Abstract: Distributed memory computers provide scalable memory and - hopefully - scalable performance. Over two lectures, I'll present the principles and practice of applying scalable memory machines to solve scientific problems and describe my current research in addressing the challenges entailed in highly scalable computing.
Bio: Prof. Baden received his M.S. and PhD in Computer Science from UC Berkeley in 1982 and 1987. He is also Adjunct Professor in the Department of Computer Science and Engineering at UCSD, where he was a faculty member for 27 years. His research interests are in high-performance and scientific computation: domain-specific translation, abstraction mechanisms, run times, and irregular problems. He has taught parallel programming at both the graduate and undergraduate level at UCSD and at the PDC Summer School.
Seminar with Jeffrey Herschel Giansiracusa
Room F11, Friday Feb 16, 9.00-10.00
Title: A tour of some applications of persistent homology
Abstract: I will give an overview of persistent homology - how it is constructed and how we use it as a tool in data analysis. Originally it was popularised as a way of producing a description of the shape of a data set, but more recently it has taken on an alternative role as a component in functional data analysis pipelines where each element in a data set represents a complicated geometric object and persistent homology provides a way of comparing the topology and geometry of different elements, and potentially feeding the topology directly into statistical learning methods. I will describe how this works in some examples.
Seminar with Caroline Uhler, MIT
Room F11, Wednesday Feb 7, 13.15-14.15
Title: Your dreams may come true with MTP2
Abstract: We study probability distributions that are multivariate totally positive of order two (MTP2). Such distributions appear in various applications from ferromagnetism to Brownian tree models used in phylogenetics. We first describe some of the intriguing properties of such distributions with respect to conditional independence and graphical models. In the Gaussian setting, these translate into new statements about M-matrices that may be of independent interest to algebraists. We then consider the problem of nonparametric density estimation under MTP2. This requires us to develop new results in geometric combinatorics. In particular, we introduce bimonotone subdivisions of polytopes and show that the maximum likelihood estimator under MTP2 is a piecewise linear function that induces bimonotone subdivisions. In summary, MTP2 distributions not only have broad applications for data analysis, but also lead to interesting new problems in combinatorics, geometry, and algebra.
Caroline Uhler joined the MIT faculty in 2015 as the Henry L. and Grace Doherty assistant professor in EECS and IDSS. She is a member of the LIDS, the Center for Statistics, Machine Learning at MIT, and the ORC. She holds a PhD in statistics from UC Berkeley.
Her research focuses on mathematical statistics and computational biology, in particular on graphical models, causal inference and algebraic statistics, and on applications to learning gene regulatory networks and the development of geometric models for the organization of chromosomes.
Opening workshop on Nov 17, 2017.
For information, see