Correct and Efficient Monte Carlo Inference for Universal Probabilistic Programming Languages
Time: Wed 2023-03-29 13.00
Location: Sal A, Kistagången 16, Kista
Video link: https://kth-se.zoom.us/j/69904297956
Language: English
Subject area: Information and Communication Technology
Doctoral student: Daniel Lundén , Programvaruteknik och datorsystem, SCS
Opponent: Professor Sam Staton, University of Oxford
Supervisor: Professor David Broman, Programvaruteknik och datorsystem, SCS; Doktor Lawrence Murray, Uber AI; Professor Joakim Jaldén, Teknisk informationsvetenskap
QC 20230303
Abstract
Probabilistic programming languages (PPLs) allow users to express statistical inference problems that the PPL implementation then, ideally, solves automatically. In particular, PPL users can focus on encoding their inference problems, and need not concern themselves with the intricacies of inference. Universal PPLs are PPLs with great expressive power, meaning that users can express essentially any inference problem. Consequently, universal PPL implementations often use general-purpose inference algorithms that are compatible with all such inference problems. A problem, however, is that general-purpose inference algorithms can often not efficiently solve complex inference problems. Furthermore, for certain inference algorithms, there are no formal correctness proofs in the context of universal PPLs.
This dissertation considers research problems related to Monte Carlo inference algorithms—sampling-based general-purpose inference algorithms that universal PPL implementations often apply. The first research problem concerns the correctness of sequential Monte Carlo (SMC) inference algorithms. A contribution in the dissertation is a proof of correctness for SMC algorithms in the context of universal PPLs. The second research problem concerns execution time improvements when suspending executions—a requirement in many Monte Carlo inference algorithms. The dissertation addresses the problem through two separate approaches. The first approach is a compilation technique targeting high-performance platforms. The second approach is a static suspension analysis guiding a selective continuation-passing style (CPS) transformation, reducing overhead compared to a full CPS transformation. The third research problem concerns inference improvements through alignment—a useful and often overlooked property in PPLs. The dissertation contributions are a formal definition of alignment, a static analysis technique that automatically aligns programs, and aligned versions of SMC and Markov chain Monte Carlo (MCMC) inference algorithms. The final research problem is more practical, and concerns the effective implementation of PPLs. Specifically, the contribution is the Miking CorePPL universal PPL and its compiler. Overall, the contributions in the dissertation significantly improve the efficiency of Monte Carlo algorithms as applied in universal PPLs.