Midterm Workshop
The workshop takes place June 3-5 at Digital Futures Hub, Osquars Backe 5, floor 2.
Sign up here: www.kth.se/form/65f1cc194d8fb21f432d9397?l=sv
Schedule:
Monday, June 3  Tuesday, June 4  Wednesday, June 5  

8:30  Registration  Registration  Registration 
9:00  Opening  
9:30  Weinan E
Building the next generation infrastructure for scientific research

Marten De Hoop
A reflection on neural operators: Injectivity, diffeomorphisms, discretization and quantitative approximation

Björn Engquist
Neural Inverse Operators for Solving PDE Inverse Problems

10:30  Break  Break  Break 
11:00  Peter Maass
Regularization by architecture

Lexing Ying
Classical Analysis for Machine Learning Problems

Melanie Weber
Discrete Curvature and Applications in Graph Machine Learning

12:00  Lunch  Lunch  10 min presentations 
12:30  Lunch  
14:30  Jiequn Han
Machine Learning for Inverse Problems: Point Estimates and Bayesian Sampling

Richard Tsai
The Manifold Hypothesis and its consequence in machine learning

Gitta Kutyniok
Reliable AI for Inverse Problems: Successes, Challenges, and Limitations

15:30  Break  Break  Break 
16:00  Anders Szepessy
Adaptive Random Features

10 min presentations  Joakim Andén
Simultaneous denoising in low-SNR regimes using data-driven priors

16:30  Haluk Akay
Representing Function, Form, and Fabrication for Data-Driven Sustainability


17:00  Francisco Alcántara Ávila
Multi-agent reinforcement learning for active flow control: to the 3D and beyond

Saikat Chatterjee
Data-Driven Nonlinear State Estimation of Model-free Process in Unsupervised Learning


17:30  
18:00  Dinner + Cruise  
22:00 
Presentations
Haluk Akay
Department of Energy Technology, KTH Royal Institute of Technology
Representing Function, Form, and Fabrication for Data-Driven Sustainability
Monday, 3 June, 16:30-17:00 at Digital Futures Hub, Osquars Backe 5, floor 2
Advancements in Artificial Intelligence open the door to exciting possibilities, but how can these be captured to benefit the design of sustainable products? In order to machine-learn from a wealth of prior engineering achievement, design must be represented for computation. In this talk, methods for quantitatively representing various aspects of engineering design, from function to form to fabrication, will be presented. Applications of these methods to augment sustainable decision-making in circular manufacturing, policy review, and knowledge preservation will be illustrated, and the talk will conclude with a discussion on safety and bias in deploying such data-driven methods in society.
Joakim Andén-Pantera
Department of Mathematics, KTH Royal Institute of Technology
Simultaneous denoising in low-SNR regimes using data-driven priors
Wednesday, 5 June, 16:00-17:00 at Digital Futures Hub, Osquars Backe 5, floor 2
The application of DNNs to the denoising problem has resulted in highly performant denoisers for a wide range of applications, from photographic image restoration to medical imaging. However, these images are typically subjected to a relatively low degree of noise compared to applications such as cryogenic electron microscopy (cryo-EM), where noise power dominates the clean signal to the point where traditional denoising methods fail. In this talk, we will study the related problem of multireference alignment (MRA). Here, a clean image is randomly rotated and degraded by additive noise. The goal is to recover the original image from a set of such noisy rotated copies of the image. We show how a transformer architecture can be used to encode a signal prior that is used to aggregate the information from the entire set of observations, yielding superior denoising results compared to single-image denoisers.
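As a rough illustration of the observation model described in this abstract, the sketch below generates noisy, randomly shifted copies of a signal. It is a simplified one-dimensional analogue (cyclic shifts instead of image rotations) and is not the speaker's method; the function names, the test signal and all parameter values are assumptions chosen for illustration. It also shows why alignment matters: averaging the raw copies washes out the signal, whereas averaging after undoing the shifts (known here, unknown in practice) recovers it up to residual noise.

```python
import random

def cyclic_shift(signal, k):
    """Shift a list cyclically by k positions to the right (k may be negative)."""
    n = len(signal)
    k %= n
    return signal[-k:] + signal[:-k] if k else list(signal)

def make_observations(signal, count, noise_std, rng):
    """Return (shifts, observations): each observation is a randomly
    shifted copy of `signal` plus i.i.d. Gaussian noise."""
    shifts, observations = [], []
    for _ in range(count):
        k = rng.randrange(len(signal))
        shifted = cyclic_shift(signal, k)
        noisy = [x + rng.gauss(0.0, noise_std) for x in shifted]
        shifts.append(k)
        observations.append(noisy)
    return shifts, observations

def mean_signal(signals):
    """Entry-wise average of equally long lists."""
    n = len(signals)
    return [sum(column) / n for column in zip(*signals)]

rng = random.Random(0)
clean = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical test signal
shifts, obs = make_observations(clean, count=2000, noise_std=2.0, rng=rng)

# Naive estimate: average the observations without aligning them.
naive_estimate = mean_signal(obs)

# Oracle estimate: undo the (normally unknown) shifts, then average.
aligned = [cyclic_shift(y, -k) for k, y in zip(shifts, obs)]
oracle_estimate = mean_signal(aligned)
```

In the actual MRA problem the shifts or rotations are not observed, which is what makes the low-SNR regime hard; the oracle average above only marks the best case that an alignment-based approach could hope to reach.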
Francisco Alcántara Ávila
Department of Engineering Mechanics, KTH Royal Institute of Technology
Multi-agent reinforcement learning for active flow control: to the 3D and beyond
Monday, 3 June, 17:00-18:00 at Digital Futures Hub, Osquars Backe 5, floor 2
Machine learning for flow control has been increasingly used over the last decade. Starting with simple two-dimensional problems like the flow past a cylinder, a future goal has always been to move towards three-dimensional (3D) cases. In this presentation I will show the most recent problems where we have been able to effectively perform active flow control using a deep reinforcement learning (DRL) agent fully trained in a 3D environment: Rayleigh-Bénard convection, flow past a cylinder and a bubble of recirculation in a turbulent boundary layer. In all cases we mitigate the well-known curse of dimensionality that arises in high-dimensional cases by using the multi-agent reinforcement learning approach, which takes advantage of flow invariants to parallelize the DRL environment. Furthermore, we compare the results of the DRL-control strategies with the classical-control methods, obtaining a considerable improvement.
Saikat Chatterjee
Division of Information Science and Engineering, KTH Royal Institute of Technology
Data-Driven Nonlinear State Estimation of Model-free Process in Unsupervised Learning
Wednesday, 5 June, 17:00-17:30 at Digital Futures Hub, Osquars Backe 5, floor 2
This seminar addresses a standard Bayesian state estimation problem, of the kind solved by the Kalman filter, as an inverse problem. The main novelty is that almost all standard state estimation methods, from the Kalman filter to the particle filter, know the underlying state-space model or process model, but our new method DANSE does not. DANSE learns from noisy measurements without access to clean data and/or state-space models. That means DANSE learns in an unsupervised manner and is fully model-free. It is an interesting combination of deep learning, Bayesian learning, and Bayesian estimation. Manuscript: https://arxiv.org/abs/2306.03897
Weinan E
Department of Mathematics, Princeton University
Building the next generation infrastructure for scientific research
Monday, 3 June, 9:30-10:30 Remote: Zoom Meeting ID 627 6175 2766
Anna-Karin Tornberg
In the last few years, we have seen a tremendous amount of scientific progress made as a result of the AI revolution, both in our much expanded ability to make use of the fundamental principles of nature, and our much expanded ability to make use of experimental data and the literature. In this talk, I will start with the origin of the AI for Science revolution, review some of the major progress made so far, and discuss how it will impact the way we do research. I will also discuss some of the ongoing projects that we are working on, with the objective of constructing a new set of infrastructure for scientific research.
Björn Engquist
Center for Numerical Analysis, University of Texas, Austin
Neural Inverse Operators for Solving PDE Inverse Problems
Wednesday, 5 June, 9:30-10:30 at Digital Futures Hub, Osquars Backe 5, floor 2
Anna-Karin Tornberg
A large class of inverse problems for PDEs are only well-defined as mappings from operators to functions. Existing operator learning frameworks map functions to functions and need to be modified to learn inverse maps from data. We propose an architecture termed Neural Inverse Operators (NIOs) to solve these PDE inverse problems. Motivated by the underlying mathematical structure and PDE-constrained optimization techniques, NIO is based on a composition of DeepONets and Fourier Neural Operators to approximate mappings from operators to functions. Experiments will be presented to demonstrate the performance of the NIOs. They do very well compared to existing neural network baselines in solving PDE inverse problems robustly and accurately. The examples include the classical Calderon problem and optical and seismic imaging. PDE-constrained optimization methods currently can address more challenging problems, but the advantage of NIOs is that they are orders of magnitude faster.
Jiequn Han
Center for Computational Mathematics, Flatiron Institute
Machine Learning for Inverse Problems: Point Estimates and Bayesian Sampling
Monday, 3 June, 14:30-15:30 at Digital Futures Hub, Osquars Backe 5, floor 2
Ozan Öktem
Machine learning has increasingly provided powerful tools to tackle challenging inverse problems. This talk presents two works representing two extreme scenarios and discusses prospects for further development. In the first scenario, where the forward problem is a highly nonlinear scattering operator with small observation noise, machine learning models can warm-start point estimation based on an optimization formulation. In the second scenario, where the forward operator is linear but the observation noise might be large, existing score-based diffusion models can provide realistic priors. I will discuss how these models can also assist in provable Bayesian posterior sampling using the tilted transport technique.
Marten De Hoop
Department of Mathematics, Rice University
A reflection on neural operators: Injectivity, diffeomorphisms, discretization and quantitative approximation
Tuesday, 4 June, 9:30-10:30 at Digital Futures Hub, Osquars Backe 5, floor 2
Recently, there has been a great interest in operator learning, where neural networks learn operators between function spaces from an essentially infinite-dimensional perspective. We present a generalized framework for neural operators, with layers including nonlinear integral operators and skip connections. We discuss and prove that injective neural operators are universal approximators and develop an algebra with bijective neural operators. Then, we give a more geometrical perspective based on diffeomorphisms in infinite dimensions, that is, for Hilbert manifolds. Using category theory, we give a no-go theorem that shows that diffeomorphisms between Hilbert spaces may not admit any continuous approximations by diffeomorphisms on finite-dimensional spaces, even if the underlying discretization is nonlinear. Strongly monotone diffeomorphisms do admit approximation by finite-dimensional strongly monotone diffeomorphisms. We then introduce layerwise strongly monotone neural operators. Such layers are diffeomorphisms. We prove that all strongly monotone neural operator layers admit continuous approximations on finite-dimensional spaces. We provide different conditions under which a neural operator layer is strongly monotone. Most notably, a bi-Lipschitz neural operator layer can always be represented by a composition of strongly monotone neural operator layers and invertible linear maps and, hence, be discretized. Our framework may be used "out of the box" to prove quantitative approximation results for discretization of neural operators.
Joint research with T. Furuya, A. Kratsios, A. Lara, M. Lassas and M. Puthawala.
Gitta Kutyniok
Department of Mathematics, Ludwig-Maximilians-Universität Munich
Reliable AI for Inverse Problems: Successes, Challenges, and Limitations
Wednesday, 5 June, 14:30-15:30 Remote: Zoom Meeting ID 691 8603 7960
Ozan Öktem
The new wave of artificial intelligence is impacting industry, public life, and the sciences in an unprecedented manner. It has by now already led to paradigm changes in several areas. However, one current major drawback is the lack of reliability. In this lecture we will first provide an introduction to this vibrant research area. We will then present some recent advances, in particular, concerning optimal combinations of traditional model-based methods with AI-based approaches in the sense of true hybrid algorithms, with a particular focus on limited-angle computed tomography and a novel approach coined "Deep Microlocal Reconstruction". Due to the importance of explainability for reliability, we will also touch upon this area by highlighting an approach which is itself reliable due to its mathematical foundation. Finally, we will discuss fundamental limitations of deep neural networks and related approaches in terms of computability, and how these can be circumvented in the future by next-generation AI computing.
Peter Maass with Meira Iske and Janek Gödeke
Center for TechnoMathematics, Universität Bremen
Regularization by architecture
Monday, 3 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2
Ozan Öktem
The success of deep learning approaches for inverse problems strongly depends on the chosen network architecture. In the first part of the talk Meira Iske will present some theoretical results concerning the regularization properties of iResNet architectures. Then, Janek Gödeke will discuss operator approximation properties of neural networks as needed for learning parameter-to-state operators. In the second part of the talk we review some recent results for comparing different network architectures for solving PDEs and related parameter identification problems. We close the talk with some industrial applications.
Anders Szepessy
Department of Mathematics, KTH Royal Institute of Technology
Adaptive Random Features
Monday, 3 June, 16:00-16:30 at Digital Futures Hub, Osquars Backe 5, floor 2
Random feature learning methods are attractive from the analysis point of view. A challenge in practice is to sample near optimally. I will present an elementary proof of the generalization error for random features and some attempts to sample random features efficiently, based on work with Xin Huang, Aku Kammonen, Jonas Kiessling, Petr Plechac, Mattias Sandberg and Raul Tempone.
Yen-Hsi Richard Tsai
Department of Mathematics and Oden Institute for Computational Engineering and Sciences, University of Texas, Austin
The Manifold Hypothesis and its consequence in machine learning
Tuesday, 4 June, 14:30-15:30 at Digital Futures Hub, Osquars Backe 5, floor 2
Anna-Karin Tornberg
The manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. Since a typical neural network is constructed to be a function on the whole embedding space, one must consider the stability of an optimized network function when evaluating at points outside the training distribution. In this talk, we will discuss some consequences of the data manifold's curvatures and the arbitrariness of the high-dimensional ambient space. We will also discuss the regularization effects of introducing noise to the data. Finally, we discuss the multiscale properties of the empirical loss function induced by data distributions supported on a low-dimensional submanifold.
Melanie Weber
Geometric Machine Learning Group, Harvard University
Discrete Curvature and Applications in Graph Machine Learning
Wednesday, 5 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2
Ozan Öktem
The problem of identifying geometric structure in heterogeneous, high-dimensional data is a cornerstone of Representation Learning. In this talk, we study this problem from the perspective of Discrete Geometry. We start by reviewing discrete notions of curvature with a focus on Ricci curvature. Then we discuss how curvature characterizations of graphs can be used to improve the efficiency of Graph Neural Networks. Specifically, we propose curvature-based rewiring and encoding approaches and study their impact on the Graph Neural Network’s downstream performance through theoretical and computational analysis. We further discuss applications of discrete Ricci curvature in Manifold Learning, where discrete-to-continuum consistency results allow for characterizing the geometry of a suitable embedding space both locally and in the sense of global curvature bounds.
Lexing Ying
Department of Mathematics, Stanford University
Classical Analysis for Machine Learning Problems
Tuesday, 4 June, 11:00-12:00 at Digital Futures Hub, Osquars Backe 5, floor 2
Anna-Karin Tornberg
Machine learning has increasingly influenced the development of scientific computing. In this talk, I will share some recent experiences on how classical analysis can help machine learning. The first example is online learning, where ODEs and SDEs can help explain the optimal regret bounds concisely. In the second example, a perturbative analysis clarifies why sometimes line spectrum estimation algorithms exhibit a superconvergence phenomenon.
10 minute presentations, Tuesday
Chair: Shervin Bagheri
Marcial Sanchis Agudo
Linné FLOW Center, KTH Royal Institute of Technology
Robustness of transformer neural networks used for temporal-dynamics prediction
To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention, which we demonstrate in time-series reconstruction and prediction. While the standard self-attention only makes use of the inner product of queries and keys, it is demonstrated that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences. Through the singular-value decomposition (SVD) on the softmax attention score, we further observe that self-attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method directly treats the attention scores as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems, exhibiting more robustness and less complexity than self-attention or the widely used long short-term memory (LSTM) network. We show the improved performance of the easy-attention method in the Lorenz system, a turbulent shear flow and a model of a nuclear reactor.
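The contrast drawn in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the speaker's implementation: `self_attention` is standard scaled dot-product attention, while `easy_attention` replaces the computed scores by a free matrix `alpha`, which would be updated by gradient descent during training. The single-head layout, the function names and the toy numbers are all hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(v - top) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self-attention on a (T x d) sequence."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)

def easy_attention(x, alpha, w_v):
    """Easy-attention-style layer: `alpha` (T x T) is a free score matrix
    treated as a learnable parameter; no queries, keys or softmax are used."""
    return matmul(alpha, matmul(x, w_v))

# Toy sequence with T = 3 time steps and d = 2 features (arbitrary values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
alpha = [[1.0, 0.0, 0.0],
         [0.5, 0.5, 0.0],
         [0.2, 0.3, 0.5]]  # hand-picked here; learned in practice

easy_out = easy_attention(x, alpha, identity)
soft_out = self_attention(x, identity, identity, identity)
```

One consequence visible in the sketch is that `alpha` does not depend on the input `x`, so the layer drops the query and key projections entirely, at the price of tying the score matrix to a fixed sequence length.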
Álvaro Moreno Soto
Department of Aerospace Engineering, University Carlos III de Madrid
Physics-informed neural networks for high-resolution weather reconstruction from sparse weather stations
The significance of accurate weather reconstruction has become more relevant in recent years. Currently, weather models primarily rely on historical data statistics and numerical methods. However, the emergence of artificial intelligence offers new possibilities for addressing the demand for accurate information on short-to-mid-term weather events. Accurate predictions can lead to significant cost savings by, for example, enabling efficient flight planning and optimal allocation of operational resources in air traffic management. This project focuses on leveraging physics-informed neural networks (PINNs) to precisely reconstruct the weather field from limited data provided by weather stations on the ground. By enforcing compliance with physics constraints, we enhance the deterministic and comprehensive reconstruction of field data (i.e. wind velocity and pressure), enabling better anticipation of the temporal and spatial evolution of weather events.
Paul Häusner
Department of Information Technology, Uppsala University
Graph neural network based preconditioner for Krylov subspace methods
Graph neural networks (GNNs) are one of the most popular neural network architectures emerging in the last couple of years. This is owed in part to their adeptness at handling unstructured inputs, a common feature in many real-world scenarios. Moreover, given that many classical algorithms can be framed within the realm of graph problems, GNNs emerge as a natural option for accelerating or substituting traditional algorithms with neural network approaches. In this talk, we showcase how the connection between sparse linear algebra and graph neural networks can be exploited in order to efficiently learn preconditioners for Krylov subspace methods. By choosing a problem-specific architecture and an efficient-to-compute loss, we train a model to predict the incomplete factorization of an input matrix for problems arising from a problem distribution. During inference, we are then able to produce effective preconditioners for unseen problems with a small computational overhead. This allows us to accelerate the total solving times of linear equation systems compared to employing classical general-purpose preconditioning techniques.
Arsineh Boodaghian Asl
Department of Biomedical Engineering and Health Systems
The Application and Limitations of Integrating Machine Learning Models to Simulation Models for Restructuring Hospitals' Care Pathways
In this presentation, I will introduce a network-based approach to model care pathways in hospitals and explain how integrating machine learning models can inform the improvement of hospital pathways via restructuring. A hospital consists of different units that perform different tasks. Every day, a certain number of patients arrive, distribute throughout the hospital, and leave. The pathways that individual patients follow depend on their health condition and the required treatment. Over time, changes in society, such as needs and lifestyle, require hospitals to restructure patients' pathways to enhance the treatment and reduce its cost. For this, a hospital requires tools that support managers and stakeholders in making better strategic decisions. The presentation will discuss how machine learning models can be integrated with such simulation models, their combined applications and limitations.
Michel Gokan Khan
Department of Mathematics, KTH Royal Institute of Technology
ML for Digital Twinning and Predictive Maintenance: AstraZeneca Case Study
In this presentation, I aim to showcase the state-of-the-art setup for Predictive Maintenance (PdM) in industries, explore key research questions, and discuss potential directions for integrating Machine Learning (ML) with real-time sensory data to create digital twins and PdM models within robotic plants in Industry 5.0. These methods aim to decrease maintenance time, facilitate the digital twinning process, automate the extraction of various KPIs and insights from the line, and optimize overall line efficiency. This project, entitled SMART (Smart Predictive Maintenance for the Pharmaceutical Industry), is funded by KTH Digital Futures and AstraZeneca. It aims to boost production lines in the context of Industry 5.0 by enhancing Overall Equipment Effectiveness (OEE) and operator competence. In this case study with AstraZeneca, we aim to employ sensor networks, ML, and immersive visualizations to develop PdM models that enhance operator expertise, setting a new standard in Industry 5.0.
Emmanuel Ström
Department of Mathematics, KTH Royal Institute of Technology
Deep learning-based precomputed wall models for rough-wall viscous flow
We leverage recent advances in operator learning to accelerate multiscale solvers for viscous fluid flow over a rough boundary. We focus on the heterogeneous multiscale method (HMM), which involves formulating the problem through a coupled system of microscopic and macroscopic subproblems. Solving microscopic problems can be viewed as a nonlinear operator mapping from the space of micro domains to the solution space. We argue that even a relatively high error in the micro solution can be tolerated, since the error made by the HMM model is larger. Our main contribution is to use an FNO-type architecture to perform this mapping faster than classical methods at the same level of precision.
Jevgenija Rudzusika
Department of Mathematics, KTH Royal Institute of Technology
Accelerated Forward-Backward Optimization using Deep Learning
We propose several deep-learning-accelerated optimization solvers with convergence guarantees. We use ideas from the analysis of accelerated forward-backward schemes like FISTA, but instead of the classical approach of proving convergence for a choice of parameters, such as a step size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set using some predefined method, we train a deep neural network to pick the best update. Finally, we show that the method is applicable to several cases of smooth and non-smooth optimization and show superior results to established accelerated solvers.
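For readers unfamiliar with the classical baseline named in this abstract, the following is a minimal sketch of FISTA applied to an l1-regularized least-squares problem. It illustrates only the fixed, predefined update rule; the learned method described above, in which a network picks the update from an admissible set, is not reproduced here. The problem instance, the function names and the parameter values are assumptions made for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|, applied entry-wise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def matvec(a, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def fista(a, b, lam, step, iterations):
    """Minimise 0.5 * ||a x - b||^2 + lam * ||x||_1 with FISTA.
    `step` should be at most 1 / L, with L the largest eigenvalue of a^T a."""
    a_t = [list(col) for col in zip(*a)]
    x = [0.0] * len(a[0])
    y = list(x)
    t = 1.0
    for _ in range(iterations):
        # Forward step: gradient of the smooth term at the extrapolated point y.
        residual = [r - bi for r, bi in zip(matvec(a, y), b)]
        grad = matvec(a_t, residual)
        # Backward step: proximal map of the non-smooth l1 term.
        x_new = soft_threshold([yi - step * g for yi, g in zip(y, grad)], lam * step)
        # Predefined momentum schedule (the part a learned method would replace).
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        y = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x

# Small separable example: a^T a has largest eigenvalue 4, so step = 1/4.
a = [[2.0, 0.0], [0.0, 1.0]]
b = [4.0, 0.5]
solution = fista(a, b, lam=1.0, step=0.25, iterations=50)
```

For this separable example the minimizer can be checked by hand: the first coordinate satisfies 2(2x - 4) + 1 = 0, giving x = 1.75, and the second is thresholded to zero because |0.5| does not exceed the regularization weight.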