Mid-term Workshop

The workshop takes place June 3-5 at Digital Futures Hub, Osquars Backe 5, floor 2.

Schedule:

	Monday, June 3	Tuesday, June 4	Wednesday, June 5
8:30	Registration	Registration	Registration
9:00	Opening	Registration	Registration
9:30	Weinan E Building the next generation infrastructure for scientific research	Marten De Hoop A reflection on neural operators: Injectivity, diffeomorphisms, discretization and quantitative approximation	Björn Engquist Neural Inverse Operators for Solving PDE Inverse Problems
10:30	Break	Break	Break
11:00	Peter Maass Regularization by architecture	Lexing Ying Classical Analysis for Machine Learning Problems	Melanie Weber Discrete Curvature and Applications in Graph Machine Learning
12:00	Lunch	Lunch	10 min presentations
12:30	Lunch	Lunch	Lunch
14:30	Jiequn Han Machine Learning for Inverse Problems: Point Estimates and Bayesian Sampling	Richard Tsai The Manifold Hypothesis and its consequence in machine learning	Gitta Kutyniok Reliable AI for Inverse Problems: Successes, Challenges, and Limitations
15:30	Break	Break	Break
16:00	Anders Szepessy Adaptive Random Features	10 min presentations	Joakim Andén Simultaneous denoising in low-SNR regimes using data-driven priors
16:30	Haluk Akay Representing Function, Form, and Fabrication for Data-Driven Sustainability
17:00	Francisco Alcántara Ávila Multi-agent reinforcement learning for active flow control: to the 3D and beyond		Saikat Chatterjee Data-Driven Non-linear State Estimation of Model-free Process in Unsupervised Learning
17:30
18:00		Dinner + Cruise
22:00		Dinner + Cruise

Presentations

Haluk Akay

Department of Energy Technology, KTH Royal Institute of Technology

Representing Function, Form, and Fabrication for Data-Driven Sustainability

Monday, 3 June, 16:30-17:00 at Digital Futures Hub, Osquars Backe 5, floor 2

Advancements in Artificial Intelligence open the door to exciting possibilities, but how can these be captured to benefit the design of sustainable products? In order to machine-learn from a wealth of prior engineering achievement, design must be represented for computation. In this talk, methods for quantitatively representing various aspects of engineering design from function to form to fabrication will be presented. Applications of these methods to augment sustainable decision-making in circular manufacturing, policy review, and knowledge preservation will be illustrated, and the talk will conclude with a discussion on safety and bias in deploying such data-driven methods in society.

Joakim Andén-Pantera

Department of Mathematics, KTH Royal Institute of Technology

Simultaneous denoising in low-SNR regimes using data-driven priors

Wednesday, 5 June, 16:00-17:00 at Digital Futures Hub, Osquars Backe 5, floor 2

The application of DNNs to the denoising problem has resulted in highly performant denoisers for a wide range of applications from photographic image restoration to medical imaging. However, these images are typically subjected to relatively low degree of noise compared to applications such as cryogenic electron microscopy (cryo-EM), where noise power dominates the clean signal to the point where traditional denoising methods fail. In this talk, we will study the related problem of multireference alignment (MRA). Here, a clean image is randomly rotated and degraded by additive noise. The goal is to recover the original image from a set of such noisy rotated copies of the image. We show how a transformer architecture can be used to encode a signal prior that is used to aggregate the information from the entire set of observations, yielding superior denoising results compared to the single-image denoisers.
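
The observation model described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of the talk: the image is replaced by a short 1D signal, rotations are replaced by cyclic shifts, and the names `make_observations` and `estimate_signal_mean` are hypothetical. Instead of a transformer prior, the sketch recovers only the simplest shift-invariant quantity (the signal mean) to show that aggregating many noisy copies remains informative even when the noise dominates each copy.

```python
import random


def make_observations(signal, n_obs, noise_std, rng):
    """Return n_obs noisy copies of `signal`, each with a hidden random cyclic shift."""
    n = len(signal)
    observations = []
    for _ in range(n_obs):
        shift = rng.randrange(n)  # unknown "rotation" (assumption: a cyclic shift)
        shifted = signal[shift:] + signal[:shift]
        observations.append([x + rng.gauss(0.0, noise_std) for x in shifted])
    return observations


def estimate_signal_mean(observations):
    """Average a shift-invariant feature (the per-copy mean) over all copies."""
    per_copy = [sum(obs) / len(obs) for obs in observations]
    return sum(per_copy) / len(per_copy)


rng = random.Random(0)
clean = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, -2.0, 0.0]
# Noise standard deviation 5.0 is larger than every entry of the clean signal.
noisy = make_observations(clean, n_obs=5000, noise_std=5.0, rng=rng)
true_mean = sum(clean) / len(clean)
estimated_mean = estimate_signal_mean(noisy)
```

Because the mean does not depend on the shift, no alignment is needed for this particular feature; recovering the full signal requires richer invariants or, as in the talk, a learned prior.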

Francisco Alcántara Ávila

Department of Engineering Mechanics, KTH Royal Institute of Technology

Multi-agent reinforcement learning for active flow control: to the 3D and beyond

Monday, 3 June, 17:00-18:00 at Digital Futures Hub, Osquars Backe 5, floor 2

Machine learning for flow control has been increasingly used over the last decade. Starting with simple two-dimensional problems like the flow past a cylinder, the goal has always been to move towards three-dimensional (3D) cases. In this presentation I will show the most recent problems where we have been able to effectively perform active flow control using a deep reinforcement learning (DRL) agent fully trained in a 3D environment: Rayleigh-Bénard convection, flow past a cylinder and a recirculation bubble in a turbulent boundary layer. In all cases we mitigate the well-known curse of dimensionality that arises in high-dimensional cases by using the multi-agent reinforcement learning approach, which takes advantage of flow invariants to parallelize the DRL environment. Furthermore, we compare the results of the DRL control strategies with classical control methods, obtaining a considerable improvement.

Saikat Chatterjee

Division of Information Science and Engineering, KTH Royal Institute of Technology

Data-Driven Non-linear State Estimation of Model-free Process in Unsupervised Learning

Wednesday, 5 June, 17:00-17:30 at Digital Futures Hub, Osquars Backe 5, floor 2

This seminar will address a standard Bayesian state estimation problem, of the kind handled by the Kalman filter, as an inverse problem. The main novelty is that almost all standard state estimation methods, from the Kalman filter to the particle filter, know the underlying state space model or process model, whereas our new method, DANSE, does not. DANSE learns from noisy measurements without access to clean data or state space models. That means DANSE learns in an unsupervised and fully model-free manner. It is an interesting combination of deep learning, Bayesian learning, and Bayesian estimation. Manuscript: https://arxiv.org/abs/2306.03897

Weinan E

Department of Mathematics, Princeton University

Building the next generation infrastructure for scientific research

Monday, 3 June, 9:30-10:30 Remote: Zoom Meeting ID 627 6175 2766

Anna-Karin Tornberg

In the last few years, we have seen a tremendous amount of scientific progress made as a result of the AI revolution, both in our much expanded ability to make use of the fundamental principles of nature, and our much expanded ability to make use of experimental data and the literature. In this talk, I will start with the origin of the AI for Science revolution, review some of the major advances made so far, and discuss how it will impact the way we do research. I will also discuss some of the ongoing projects that we are working on, with the objective of constructing a new set of infrastructure for scientific research.

Björn Engquist

Center for Numerical Analysis, University of Texas, Austin

Neural Inverse Operators for Solving PDE Inverse Problems

Wednesday, 5 June, 9:30-10:30 at Digital Futures Hub, Osquars Backe 5, floor 2

Anna-Karin Tornberg

A large class of inverse problems for PDEs are only well-defined as mappings from operators to functions. Existing operator learning frameworks map functions to functions and need to be modified to learn inverse maps from data. We propose an architecture termed Neural Inverse Operators (NIOs) to solve these PDE inverse problems. Motivated by the underlying mathematical structure and PDE-constrained optimization techniques, NIO is based on a composition of DeepONets and Fourier Neural Operators to approximate mappings from operators to functions. Experiments will be presented to demonstrate the performance of the NIOs. They do very well compared to existing neural network baselines in solving PDE inverse problems robustly and accurately. The examples include the classical Calderon problem and optical and seismic imaging. PDE-constrained optimization methods currently can address more challenging problems, but the advantage of NIOs is that they are orders of magnitude faster.

Jiequn Han

Center for Computational Mathematics, Flatiron Institute

Machine Learning for Inverse Problems: Point Estimates and Bayesian Sampling

Monday, 3 June, 14:30-15:30 at Digital Futures Hub, Osquars Backe 5, floor 2

Ozan Öktem

Machine learning has increasingly provided powerful tools to tackle challenging inverse problems. This talk presents two works representing two extreme scenarios and discusses prospects for further development. In the first scenario, where the forward problem is a highly nonlinear scattering operator with small observation noise, machine learning models can warm start point estimation based on an optimization formulation. In the second scenario, where the forward operator is linear but the observation noise might be large, existing score-based diffusion models can provide realistic priors. I will discuss how these models can also assist in provable Bayesian posterior sampling using the tilted transport technique.

Marten De Hoop

Department of Mathematics, Rice University

A reflection on neural operators: Injectivity, diffeomorphisms, discretization and quantitative approximation

Tuesday, 4 June, 9:30-10:30 at Digital Futures Hub, Osquars Backe 5, floor 2

Recently, there has been great interest in operator learning, where neural networks learn operators between function spaces from an essentially infinite-dimensional perspective. We present a generalized framework for neural operators, with layers including nonlinear integral operators and skip connections. We discuss and prove that injective neural operators are universal approximators and develop an algebra with bijective neural operators. Then, we give a more geometrical perspective based on diffeomorphisms in infinite dimensions, that is, for Hilbert manifolds. Using category theory, we give a no-go theorem that shows that diffeomorphisms between Hilbert spaces may not admit any continuous approximations by diffeomorphisms on finite-dimensional spaces, even if the underlying discretization is nonlinear. Strongly monotone diffeomorphisms do admit approximation by finite-dimensional strongly monotone diffeomorphisms. We then introduce layerwise strongly monotone neural operators. Such layers are diffeomorphisms. We prove that all strongly monotone neural operator layers admit continuous approximations on finite-dimensional spaces. We provide different conditions under which a neural operator layer is strongly monotone. Most notably, a bilipschitz neural operator layer can always be represented by a composition of strongly monotone neural operator layers and invertible linear maps and, hence, be discretized. Our framework may be used "out of the box" to prove quantitative approximation results for discretization of neural operators.

Joint research with T. Furuya, A. Kratsios, A. Lara, M. Lassas and M. Puthawala.

Gitta Kutyniok

Department of Mathematics, Ludwig-Maximilians-Universität Munich

Reliable AI for Inverse Problems: Successes, Challenges, and Limitations

Wednesday, 5 June, 14:30-15:30 Remote: Zoom Meeting ID 691 8603 7960

Ozan Öktem

The new wave of artificial intelligence is impacting industry, public life, and the sciences in an unprecedented manner. It has by now already led to paradigm changes in several areas. However, one current major drawback is the lack of reliability. In this lecture we will first provide an introduction into this vibrant research area. We will then present some recent advances, in particular, concerning optimal combinations of traditional model-based methods with AI-based approaches in the sense of true hybrid algorithms, with a particular focus on limited-angle computed tomography and a novel approach coined "Deep Microlocal Reconstruction". Due to the importance of explainability for reliability, we will also touch upon this area by highlighting an approach which is itself reliable due to its mathematical foundation. Finally, we will discuss fundamental limitations of deep neural networks and related approaches in terms of computability, and how these can be circumvented in the future by next generation AI computing.

Peter Maass with Meira Iske and Janek Gödeke

Center for Techno-Mathematics, Universität Bremen

Regularization by architecture

Monday, 3 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2

Ozan Öktem

The success of deep learning approaches for inverse problems strongly depends on the chosen network architecture. In the first part of the talk Meira Iske will present some theoretical results concerning the regularization properties of iResNet architectures. Then, Janek Gödeke will discuss operator approximation properties of neural networks as needed for learning parameter-to-state operators. In the second part of the talk we review some recent results for comparing different network architectures for solving PDEs and related parameter identification problems. We close the talk with some industrial applications.

Anders Szepessy

Department of Mathematics, KTH Royal Institute of Technology

Adaptive Random Features

Monday, 3 June, 16:00-16:30 at Digital Futures Hub, Osquars Backe 5, floor 2

Random feature learning methods are attractive from the analysis point of view. A challenge in practice is to sample near optimally. I will present an elementary proof of the generalization error for random features and some attempts to sample random features efficiently, based on work with Xin Huang, Aku Kammonen, Jonas Kiessling, Petr Plechac, Mattias Sandberg and Raul Tempone.

Yen-Hsi Richard Tsai

Department of Mathematics and Oden Institute for Computational Engineering and Sciences, University of Texas, Austin

The Manifold Hypothesis and its consequence in machine learning

Tuesday, 4 June, 14:30-15:30 at Digital Futures Hub, Osquars Backe 5, floor 2

Anna-Karin Tornberg

The low-dimensional manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. Since a typical neural network is constructed to be a function on the whole embedding space, one must consider the stability of an optimized network function when evaluating at points outside the training distribution. In this talk, we will discuss some consequences of the data manifold's curvatures and the arbitrariness of the high-dimensional ambient space. We will also discuss the regularization effects of introducing noise to the data. Finally, we discuss the multiscale properties of the empirical loss function induced by data distributions supported on a low-dimensional submanifold.

Melanie Weber

Geometric Machine Learning Group, Harvard University

Discrete Curvature and Applications in Graph Machine Learning

Wednesday, 5 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2

Ozan Öktem

The problem of identifying geometric structure in heterogeneous, high-dimensional data is a cornerstone of Representation Learning. In this talk, we study this problem from the perspective of Discrete Geometry. We start by reviewing discrete notions of curvature with a focus on Ricci curvature. Then we discuss how curvature characterizations of graphs can be used to improve the efficiency of Graph Neural Networks. Specifically, we propose curvature-based rewiring and encoding approaches and study their impact on the Graph Neural Network’s downstream performance through theoretical and computational analysis. We further discuss applications of discrete Ricci curvature in Manifold Learning, where discrete-to-continuum consistency results allow for characterizing the geometry of a suitable embedding space both locally and in the sense of global curvature bounds.

Lexing Ying

Department of Mathematics, Stanford University

Classical Analysis for Machine Learning Problems

Tuesday, 4 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2

Anna-Karin Tornberg

Machine learning has increasingly influenced the development of scientific computing. In this talk, I will share some recent experiences on how classical analysis can help machine learning. The first example is online learning, where ODEs and SDEs can help explain the optimal regret bounds concisely. In the second example, a perturbative analysis clarifies why sometimes line spectrum estimation algorithms exhibit a super-convergence phenomenon.

10 minute presentations, Tuesday

Chair: Shervin Bagheri

Marcial Sanchis Agudo

Linné FLOW Center, KTH Royal Institute of Technology

Robustness of transformer neural networks used for temporal-dynamics prediction

To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention, which we demonstrate in time-series reconstruction and prediction. While standard self-attention only makes use of the inner product of queries and keys, it is demonstrated that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences. Through the singular-value decomposition (SVD) of the softmax attention score, we further observe that self-attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method directly treats the attention scores as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems, exhibiting more robustness and less complexity than self-attention or the widely used long short-term memory (LSTM) network. We show the improved performance of the easy-attention method in the Lorenz system, a turbulent shear flow and a model of a nuclear reactor.
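
The contrast drawn above can be made concrete with a small sketch. It is a hedged illustration, not the authors' implementation: it uses plain Python lists for a single head, and the names `self_attention`, `easy_attention`, `alpha`, `wq`, `wk` and `wv` are assumptions introduced here. The only point it carries over from the abstract is that the score matrix is computed from queries and keys in one case and stored directly as learnable parameters in the other.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(x, wq, wk, wv):
    """Standard attention: scores come from softmax(Q K^T / sqrt(d))."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale
               for k_row in k] for q_row in q]
    return matmul(softmax_rows(scores), v)


def easy_attention(x, alpha, wv):
    """Easy attention (sketch): `alpha` is itself the learnable score matrix."""
    return matmul(alpha, matmul(x, wv))


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]           # 3 time steps, 2 features
eye2 = [[1.0, 0.0], [0.0, 1.0]]
zeros2 = [[0.0, 0.0], [0.0, 0.0]]
eye3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

uniform_out = self_attention(x, zeros2, zeros2, eye2)  # zero scores: uniform weights
identity_out = easy_attention(x, eye3, eye2)           # identity scores: output is x
```

In a trained model `alpha` would be updated by gradient descent together with `wv`; here it is fixed by hand so the output can be checked by inspection.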

Álvaro Moreno Soto

Department of Aerospace Engineering, University Carlos III de Madrid

Physics-informed neural networks for high-resolution weather reconstruction from sparse weather stations

The significance of accurate weather reconstruction has become more relevant in recent years. Currently, weather models primarily rely on historical data statistics and numerical methods. However, the emergence of artificial intelligence offers new possibilities for addressing the demand for accurate information on short-to-mid-term weather events. Accurate predictions can lead to significant cost savings by, for example, enabling efficient flight planning and optimal allocation of operational resources in air traffic management. This project focuses on leveraging physics-informed neural networks (PINNs) to precisely reconstruct the weather field from limited data provided by weather stations on the ground. By enforcing compliance with physics constraints, we enhance the deterministic and comprehensive reconstruction of field data (i.e., wind velocity and pressure), enabling better anticipation of weather events' temporal and spatial evolution.

Paul Häusner

Department of Information Technology, Uppsala University

Graph neural network based preconditioner for Krylov subspace methods

Graph neural networks (GNNs) are one of the most popular neural network architectures to emerge in the last couple of years. This is owed in part to their adeptness at handling unstructured inputs, a common feature in many real-world scenarios. Moreover, given that many classical algorithms can be framed within the realm of graph problems, GNNs emerge as a natural option for accelerating or substituting traditional algorithms with neural network approaches. In this talk, we showcase how the connection between sparse linear algebra and graph neural networks can be exploited in order to efficiently learn preconditioners for Krylov subspace methods. By choosing a problem-specific architecture and an efficient-to-compute loss, we train a model to predict the incomplete factorization of an input matrix for problems arising from a problem distribution. During inference, we are then able to produce effective preconditioners for unseen problems with a small computational overhead. This allows us to reduce the total solving time of linear equation systems compared to employing classical general-purpose preconditioning techniques.

Arsineh Boodaghian Asl

Department of Biomedical Engineering and Health Systems

The Application and Limitations of Integrating Machine Learning Models to Simulation Models for Restructuring Hospitals' Care Pathways

In this presentation, I will introduce a network-based approach to model care pathways in hospitals and explain how integrating machine learning models can inform the improvement of hospital pathways via restructuring. A hospital consists of different units that perform different tasks. Every day, a certain number of patients arrive, distribute throughout the hospital, and leave. The pathways that individual patients follow depend on their health condition and the required treatment. Over time, changes in society, such as needs and lifestyle, require hospitals to restructure patients' pathways to enhance the treatment and reduce its cost. For this, a hospital requires tools that support managers and stakeholders in making better strategic decisions. The presentation will discuss how machine learning models can be integrated with such simulation models, and their combined applications and limitations.

Michel Gokan Khan

Department of Mathematics, KTH Royal Institute of Technology

ML for Digital Twinning and Predictive Maintenance: AstraZeneca Case Study

In this presentation, I aim to showcase the state-of-the-art setup for Predictive Maintenance (PdM) in industries, explore key research questions, and discuss potential directions for integrating Machine Learning (ML) with real-time sensory data to create digital twins and PdM models within robotic plants in Industry 5.0. These methods aim to decrease maintenance time, facilitate the digital twinning process, automate the extraction of various KPIs and insights from the line, and optimize overall line efficiency. This project, entitled SMART (Smart Predictive Maintenance for the Pharmaceutical Industry), is funded by KTH Digital Futures and AstraZeneca. It aims to boost production lines in the context of Industry 5.0 by enhancing Overall Equipment Effectiveness (OEE) and operator competence. In this case study with AstraZeneca, we aim to employ sensor networks, ML, and immersive visualizations to develop PdM models that enhance operator expertise, setting a new standard in Industry 5.0.

Emmanuel Ström

Department of Mathematics, KTH Royal Institute of Technology

Deep learning-based precomputed wall models for rough-wall viscous flow

We leverage recent advances in operator learning to accelerate multiscale solvers for viscous fluid flow over a rough boundary. We focus on the HMM method, which involves formulating the problem through a coupled system of microscopic and macroscopic subproblems. Solving microscopic problems can be viewed as a nonlinear operator mapping from the space of micro domains to the solution space. We argue that even a relatively high error in the micro solution can be tolerated, since the error made by the HMM model is larger. Our main contribution is to use an FNO-type architecture to perform this mapping faster than classical methods at the same level of precision.

Jevgenija Rudzusika

Department of Mathematics, KTH Royal Institute of Technology

Accelerated Forward-Backward Optimization using Deep Learning

We propose several deep-learning accelerated optimization solvers with convergence guarantees. We use ideas from the analysis of accelerated forward-backward schemes like FISTA, but instead of the classical approach of proving convergence for a choice of parameters, such as a step-size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set using some predefined method, we train a deep neural network to pick the best update. Finally, we show that the method is applicable to several cases of smooth and non-smooth optimization and show superior results to established accelerated solvers.

Aaron Miller

Harvard School of Engineering and Applied Sciences, Harvard University

Bayesian inverse problems using virtual observables

Lucas Amoudruz

Harvard School of Engineering and Applied Sciences, Harvard University

Reinforcement learning for targeted drug delivery through capillaries with artificial micro swimmers

10 minute presentations, Wednesday

Chair: Shervin Bagheri

Philipp Scholl

Department of Mathematics, Ludwig-Maximilians-Universität Munich

ParFam – (Neural Guided) Symbolic Regression via Continuous Global Optimization

Zak Shumaylov

Department of Applied Mathematics and Theoretical Physics, University of Cambridge

Weakly Convex Regularisers in Inverse problems

Derick Nganyu Tanyu

Center for Industrial Mathematics, Universität Bremen

Advances in Electrical Impedance Tomography: Deep Learning and Analytical Approaches for Inverse Problem Solving

Pol Suarez Morales

Department of Technical Mechanics, KTH Royal Institute of Technology

Multi-agent RL-based active flow control for drag reduction in three-dimensional cylinders under transient Reynolds number

David Thong

Department of Mathematics, KTH Royal Institute of Technology

Fitting Protein Structure to Cryo-EM Densities / Projections via Multiscale Approaches