Machine Learning for Automation of Chromosome based Genetic Diagnostics

Time: Wed 2020-11-18 15.00

Location: https://kth-se.zoom.us/j/2884945301

Respondent: Gongchang Chu

Opponent: Ming Cui

Supervisor: Shatha Jaradat (Examiner: Amir H. Payberah)

Chromosome based genetic diagnostics, the detection of specific chromosomes, plays an increasingly important role in medicine as the molecular basis of human disease is defined. The current diagnostic process is performed mainly by karyotyping specialists. They first arrange the chromosomes in pairs and generate an image listing all the chromosome pairs in order. This process is called karyotyping, and the generated image is called a karyogram. They then analyze the images based on the shapes, sizes, and relationships of the different image segments and make diagnostic decisions. Manual inspection is time-consuming, labor-intensive, and error-prone. This thesis investigates supervised methods for genetic diagnostics on karyograms. Specifically, it targets abnormality detection in the chromosome domain: it aims to classify chromosome images as normal or abnormal and to report a confidence level for each result. The main contributions of this thesis are (1) an empirical study of chromosomes and karyotyping; (2) appropriate data preprocessing; (3) neural networks built using transfer learning; (4) experiments on different systems and conditions, and a comparison of them; (5) a suitable model choice for our requirements and a way to improve the model; (6) a method to calculate the confidence level of the result by uncertainty estimation. The empirical study shows that the karyogram is ordered as a whole, so preprocessing such as rotation and flipping is not appropriate; adding noise or blur is more reasonable. In the experiments, two neural networks based on VGG16 and InceptionV3 were built using transfer learning, and their performance was compared under different conditions. We aim to minimize the error of predicting abnormal cases as normal, because we cannot accept that abnormal chromosomes are predicted as normal. This thesis describes how to use Monte Carlo Dropout to perform uncertainty estimation in a non-Bayesian model.
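
The idea behind Monte Carlo Dropout mentioned above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the two-layer network, the random weights, the feature vectors, and the function name `mc_dropout_predict` are all assumptions made for illustration. Dropout stays active at prediction time, the same input is passed through the network many times, and the spread of the predictions serves as an uncertainty estimate.

```python
import numpy as np

def mc_dropout_predict(x, w1, b1, w2, b2, n_samples=100, drop_rate=0.5, seed=0):
    """Run n_samples stochastic forward passes with dropout kept active.

    Returns the mean and standard deviation of P(abnormal) for each input row.
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_rate
    probs = np.empty((n_samples, x.shape[0]))
    for t in range(n_samples):
        h = np.maximum(x @ w1 + b1, 0.0)        # hidden layer with ReLU
        mask = rng.random(h.shape) < keep       # fresh dropout mask on every pass
        h = h * mask / keep                     # inverted dropout scaling
        logit = h @ w2 + b2
        probs[t] = 1.0 / (1.0 + np.exp(-logit[:, 0]))  # sigmoid output
    return probs.mean(axis=0), probs.std(axis=0)

# Hypothetical weights standing in for a trained classifier head.
rng = np.random.default_rng(42)
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
features = rng.normal(size=(4, 8))  # four made-up feature vectors

mean_p, std_p = mc_dropout_predict(features, w1, b1, w2, b2)
for m, s in zip(mean_p, std_p):
    print(f"P(abnormal) = {m:.3f}, std over passes = {s:.3f}")
```

A larger standard deviation across passes indicates a less certain prediction. In a setting where abnormal cases must not be missed, one plausible use is to accept a "normal" prediction only when the mean is low and the spread is small, and to send all other cases to a specialist.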