# On Medical Image Segmentation With Noisy Labels

**Time:**
Thu 2023-06-01 13.00

**Location:**
Kollegiesalen, Brinellvägen 8, Stockholm

**Language:**
Swedish

**Subject area:**
Applied and Computational Mathematics, Mathematical Statistics

**Doctoral student:**
Marcus Nordström, Matematisk statistik

**Opponent:**
Fredrik Kahl, Chalmers University of Technology

**Supervisor:**
Henrik Hult, Matematisk statistik; Atsuto Maki, Robotik, perception och lärande, RPL

QC 2023-05-10

## Abstract

It is well known that data sets used for training and testing automatic medical image segmentation methods often contain substantial label noise. Such noise affects the performance of the methods and has been the subject of much research. One approach to label noise that has largely been overlooked in the literature is to investigate how it affects the theoretically optimal segmentations. This thesis consists of four papers related to such investigations for the two most popular loss functions in the field, cross-entropy and soft-Dice, and the most popular metric, Dice.

In paper A, the loss functions cross-entropy and soft-Dice are investigated. Inspired by work on binary classification, the properties of calibration and convexity are proposed to explain the experimentally observed stability of cross-entropy and the good performance of soft-Dice. It is then shown that soft-Dice is neither convex nor quasi-convex, and it is conjectured that soft-Dice is calibrated to Dice. Finally, an alternative quasi-convex loss function is experimentally compared to soft-Dice on a kidney segmentation problem.
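As a rough numerical illustration of the convexity discussion, the sketch below evaluates both losses on a two-pixel image. It assumes one common form of soft-Dice, `1 - 2<s, y> / (sum(s) + sum(y))`, taken as a function of the predicted probabilities `s`; the exact definition used in paper A may differ. The function names and the two-pixel example are illustrative choices, not taken from the thesis. The check concerns convexity only. It does not address quasi-convexity, which depends on the precise definition used in paper A.

```python
import numpy as np


def cross_entropy(s, y, eps=1e-12):
    """Mean binary cross-entropy between soft prediction s and binary label y."""
    s = np.clip(s, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(s) + (1.0 - y) * np.log(1.0 - s)))


def soft_dice_loss(s, y):
    """Assumed form: 1 - 2<s, y> / (sum(s) + sum(y)); requires a non-empty label."""
    return float(1.0 - 2.0 * np.sum(s * y) / (np.sum(s) + np.sum(y)))


# Two-pixel image whose label marks only the first pixel as foreground.
y = np.array([1.0, 0.0])

# Two predictions that differ only on the background pixel, and their midpoint.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
mid = 0.5 * (a + b)

# A convex function satisfies f(mid) <= (f(a) + f(b)) / 2.
chord = 0.5 * (soft_dice_loss(a, y) + soft_dice_loss(b, y))  # (0 + 1/3) / 2
at_mid = soft_dice_loss(mid, y)                              # 1 - 2 / 2.5 = 0.2
soft_dice_violates_convexity = at_mid > chord

# Cross-entropy respects the same inequality on this segment.
ce_chord = 0.5 * (cross_entropy(a, y) + cross_entropy(b, y))
ce_respects_convexity = cross_entropy(mid, y) <= ce_chord
```

On this segment the soft-Dice value at the midpoint lies above the chord, so the assumed form cannot be convex, while cross-entropy satisfies the inequality.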

In paper B, the optimal segmentations with respect to the metrics Accuracy and Dice are characterized in the presence of label noise. This characterization is then used to give a detailed account of the volume bias associated with the metrics. In particular, sharp bounds for the volume bias are provided, it is shown that the volume of an optimizer of Accuracy is always less than or equal to the volume of an optimizer of Dice, and it is shown that the sets of optimizers of the two metrics coincide when the optimization is constrained to segmentations without volume bias. Finally, experimental results supporting the observations are presented on a set of segmentation problems.
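The volume ordering between the two metrics can be seen on a toy problem. The sketch below is a brute-force illustration, not the method of paper B. It assumes a hypothetical three-pixel image with three possible noisy annotations, and it searches all binary segmentations for the one maximizing the expected Accuracy and the one maximizing the expected Dice. The annotation set, its probabilities, and the convention that Dice equals 1 for two empty masks are all assumptions made for the example.

```python
from itertools import product

# Hypothetical noisy annotations of a 3-pixel image: (label, probability).
NOISY_LABELS = [((1, 1, 0), 0.45), ((1, 0, 1), 0.45), ((1, 0, 0), 0.10)]


def accuracy(seg, label):
    """Fraction of pixels on which segmentation and label agree."""
    return sum(s == l for s, l in zip(seg, label)) / len(seg)


def dice(seg, label):
    """Dice overlap between two binary masks."""
    total = sum(seg) + sum(label)
    if total == 0:
        return 1.0  # assumed convention for two empty masks
    overlap = sum(s * l for s, l in zip(seg, label))
    return 2.0 * overlap / total


def expected(metric, seg):
    """Expectation of the metric over the noisy label distribution."""
    return sum(p * metric(seg, label) for label, p in NOISY_LABELS)


def optimizer(metric):
    """Binary segmentation maximizing the expected metric, by exhaustive search."""
    candidates = product((0, 1), repeat=3)
    return max(candidates, key=lambda seg: expected(metric, seg))


best_acc = optimizer(accuracy)   # keeps only the pixel that is always labeled
best_dice = optimizer(dice)      # also keeps the two uncertain pixels
mean_label_volume = sum(p * sum(label) for label, p in NOISY_LABELS)
```

In this example the Accuracy optimizer has volume 1 and the Dice optimizer has volume 3, while the expected label volume is 1.9, so the two metrics are biased in opposite directions.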

In paper C, the effect of label noise on soft-Dice is studied. In particular, the optimal solutions are characterized and sharp bounds for the volume bias are provided. Moreover, the conjecture that soft-Dice is calibrated to Dice is proved under a compactness assumption that always holds in practice. Finally, experimental results supporting the observations are presented on a set of organ segmentation problems and a set of synthetic segmentation problems.

In paper D, a noise model based on Gaussian field deformations is proposed. Several theoretical properties of labels with this kind of noise are proved, including a closed-form expression for the marginal probabilities and a representation that can be used for efficient sampling. The noise model is then used to study 1/2-thresholded optimal solutions to the loss functions cross-entropy and soft-Dice, and it is shown how they diverge as the noise level increases. Finally, using the characterization of the optimal solutions to soft-Dice, it is shown how cross-entropy can be used in conjunction with an a priori unknown but computable threshold to recover optimal solutions to soft-Dice. The theoretical observations are validated on three organ segmentation problems with various levels of noise.
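One way to picture the threshold idea is the following sketch, which is an illustration under stated assumptions rather than the procedure of paper D. It assumes that soft-Dice is evaluated against the marginal label probabilities `m`, in the form `2<s, m> / (sum(s) + sum(m))`, and it uses the fact that the pointwise minimizer of cross-entropy is `m` itself. For a ratio of this kind, a standard fractional-programming argument gives that a maximizer keeps exactly the pixels with `m` above half the optimal value, so the threshold can be computed from `m`. The function names and the example marginals are hypothetical.

```python
import numpy as np


def soft_dice(s, m):
    """Assumed form of soft-Dice against marginal probabilities m."""
    return 2.0 * np.sum(s * m) / (np.sum(s) + np.sum(m))


def soft_dice_threshold(m):
    """Return (threshold, optimal soft-Dice value) for marginals m.

    For a fixed volume k the best binary segmentation keeps the k largest
    marginals, so it is enough to scan k = 1, ..., n.
    Assumes m contains at least one positive entry.
    """
    ordered = np.sort(m)[::-1]               # marginals in decreasing order
    top_k_mass = np.cumsum(ordered)          # sum of the k largest marginals
    k = np.arange(1, m.size + 1)
    values = 2.0 * top_k_mass / (k + np.sum(m))
    best = float(np.max(values))
    return best / 2.0, best                  # threshold is half the optimum


# Hypothetical marginals, standing in for a converged cross-entropy prediction.
m = np.array([0.9, 0.6, 0.4, 0.3, 0.1])

t, value = soft_dice_threshold(m)
seg_half = (m > 0.5).astype(float)   # usual 1/2-thresholding of the prediction
seg_t = (m > t).astype(float)        # thresholding at the computed level
```

For these marginals the computed threshold is about 0.36, so `seg_t` keeps three pixels while 1/2-thresholding keeps only two, and `seg_t` attains the optimal soft-Dice value.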