Algebraic Advances in Multiview Geometry

Time: Fri 2024-06-14 14.00

Location: D3, Lindstedtsvägen 5, Stockholm

Language: English

Subject area: Mathematics

Doctoral student: Felix Rydell, Algebra, kombinatorik och topologi

Opponent: Professor Rekha Thomas

Supervisor: Kathlén Kohn, Matematik (Inst.); Fredrik Viklund, Matematik (Avd.)




Computer Vision is the study of how computers can understand and classify images as well as or better than humans, in a fraction of the time. A fundamental problem in this field, Structure-from-Motion, aims to build a 3D model of an object based on 2D images. Applications include self-driving cars and other autonomous vehicles, as well as visual media such as movies and video games.

The geometry that arises in 3D reconstruction is called Multiview Geometry, and the study of algebraic structures that arise from Multiview Geometry is called Algebraic Vision. The latter is the subject of this thesis. Our focus is on optimization problems, finding polynomial constraints, and constructing new algorithms. A main goal of this thesis is to generalize concepts and ideas in Algebraic Vision to new settings.

 In Paper A, we investigate a classic question in Computer Vision, namely the compatibility of fundamental matrices. We prove that quadruplewise compatibility implies global compatibility. Given a sextuple of compatible fundamental matrices, there are four possible cases for the geometry of their epipoles. In each case, we provide necessary and sufficient conditions for compatibility in terms of explicit homogeneous polynomials in the fundamental matrices and their epipoles. 

 In Paper B, we build on the theory of Paper A. More precisely, we equivalently express the necessary and sufficient conditions in terms of intuitive geometrical conditions. In the process, we get simpler proofs.

 In Paper C, we consider the problem of how to best identify and filter out outliers from a given data set. A data point is an inlier if its Euclidean distance to the mathematical model is small enough. This distance is expensive to compute. In applied settings, it is efficiently approximated by the Sampson error. We provide theoretical bounds for when the Sampson error is a good approximation of the Euclidean distance, and show, via numerical experiments, new scenarios where it can be applied, such as in three-view geometry. 
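As a rough illustration of the idea in Paper C, the following sketch compares the Sampson error with the exact Euclidean distance for a toy model. It assumes the standard first-order form of the Sampson error for a single constraint, |f(x)| / ||∇f(x)||. The unit circle model, the threshold value, and all function names are hypothetical choices made for this example and are not taken from the thesis.

```python
import math

def sampson_error(f, grad_f, x):
    """First-order estimate |f(x)| / ||grad f(x)|| of the distance from x to {f = 0}.

    Undefined where the gradient vanishes (for the circle below: at the origin).
    """
    g = grad_f(x)
    grad_norm = math.sqrt(sum(gi * gi for gi in g))
    return abs(f(x)) / grad_norm

# Illustrative model (an assumption for this sketch): the unit circle x^2 + y^2 = 1.
def circle(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0

def circle_grad(x):
    return [2.0 * x[0], 2.0 * x[1]]

def circle_distance(x):
    """Exact Euclidean distance to the unit circle, available in closed form here."""
    return abs(math.hypot(x[0], x[1]) - 1.0)

def is_inlier(x, threshold=0.05):
    """Classify x as an inlier using the cheap Sampson estimate (threshold is arbitrary)."""
    return sampson_error(circle, circle_grad, x) <= threshold

if __name__ == "__main__":
    for point in ([1.02, 0.0], [1.1, 0.0], [3.0, 0.0]):
        print(point, sampson_error(circle, circle_grad, point), circle_distance(point))
```

In this toy case the two quantities agree closely for points near the circle and drift apart further away, which is the kind of behavior that bounds on the approximation quality are meant to control.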

 In Paper D, we study the projection of lines in 3-space onto a given set of camera planes. The closure of this projection map is a line multiview variety. Our main theorem is that a line multiview variety is cut out by the condition that the back-projected planes meet in a line if and only if all centers are pairwise distinct and no four centers are collinear. Here, smooth quadrics and their families of lines are important tools. We also study smoothness, multidegrees, and Euclidean distance degrees. 
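The back-projection condition in Paper D can be sketched in a few lines. The example assumes that an image line l seen by a camera with 3×4 matrix C back-projects to the plane Cᵀl in 3-space, and that the planes meeting in a line is encoded as the stacked plane vectors having rank at most 2. The camera matrices, the two points spanning the line, and all function names are illustrative values chosen for this sketch, not data from the thesis.

```python
from fractions import Fraction

def cross(a, b):
    """Cross product of two 3-vectors; gives the line through two image points."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def image_line(C, X, Y):
    """Image of the 3D line through homogeneous points X and Y under camera C."""
    return cross(mat_vec(C, X), mat_vec(C, Y))

def back_projected_plane(C, l):
    """Plane C^T l of all 3D points whose image under C lies on the line l."""
    return [sum(C[i][j] * l[i] for i in range(3)) for j in range(4)]

def rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def planes_meet_in_line(cameras, lines):
    """True if all back-projected planes share at least a common line (rank <= 2)."""
    planes = [back_projected_plane(C, l) for C, l in zip(cameras, lines)]
    return rank(planes) <= 2

# Three illustrative cameras with pairwise distinct, non-collinear centers.
CAMERAS = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 1, 0]],
]
X, Y = [1, 2, 3, 1], [0, 1, 1, 1]  # two points spanning a line in 3-space
LINES = [image_line(C, X, Y) for C in CAMERAS]

if __name__ == "__main__":
    print(planes_meet_in_line(CAMERAS, LINES))
    print(planes_meet_in_line(CAMERAS, LINES[:2] + [[-3, -1, 3]]))
```

Image lines that come from a common 3D line pass the check, while replacing one of them with an unrelated image line breaks it. Exact rational arithmetic is used so that the rank test does not depend on a numerical tolerance.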

In Paper E, we use the theory of Cohen-Macaulay ideals to prove that under sufficient genericity, the ideal described in Paper D is the defining ideal of the line multiview variety. We compute Gröbner bases and discuss to what extent our results carry over to the case of cameras with collinear centers.

In Paper F, we solve the problem of how to do 3D reconstruction such that point and line incidence relations are preserved. In this direction, we introduce anchored multiview varieties. We describe new reconstruction algorithms based on these. On simulated data, we compare the different approaches with individual reconstruction of points and lines. Our approach yields comparable accuracy and a significant speed improvement. This improvement in speed is theoretically supported by our Euclidean distance degree computations. We make use of the observation that these anchored multiview varieties are linearly isomorphic to multiview varieties arising from the projection of points in 2-space and 1-space.

In Paper G, we explore the observation above from Paper F in detail. We start by considering all possible anchored multiview varieties arising from projections of points and lines in 1-, 2-, and 3-dimensional projective space. We say that two such varieties are ED-equivalent if there is a linear isomorphism between them that preserves ED-critical points. This gives rise to fourteen equivalence classes: a multiview catalogue. In the case of points, we also present a study of all associated resectioning varieties. Finally, we propose conjectures for the Euclidean distance degrees of all varieties appearing in our comprehensive list.

 In Paper H, we present an algebraic study of the projection of plane curves and twisted cubics in space onto multiple images of pinhole cameras. The Zariski closure of the image of the projection of conics is called a conic multiview variety. Extending previous work for point and line multiview varieties, we make use of back-projected cones. For two views, we provide the defining ideals of conic multiview varieties. For any number of views, we state when the simplest possible set-theoretic description is achieved based on the geometry of the camera centers. Finally, we conjecture the Euclidean distance degree for the conic multiview variety given two cameras.  

In Paper I, we introduce a generalization of multiview varieties as closures of images obtained by projecting subspaces of a given dimension onto several views, from the photographic and geometric points of view. We investigate when the associated projection map is generically injective, an essential requirement for successful triangulation. We give a complete characterization of this property by determining two formulae for the dimensions of these varieties. Similarly, we describe for which center arrangements calibration of camera parameters is possible. We determine precisely when the multiview variety is naturally isomorphic to its associated blowup, in the case of generic centers.

At the end of this thesis, four additional papers and one extended abstract are attached. As these are not part of the Algebraic Vision story, we do not describe them here. They are included in the thesis as part of the complete collected works of the PhD candidate.