Bioinformatics for microbiome analysis

Time: Fri 2024-06-14 13.00

Location: Marie, Widerströmska huset, Tomtebodavägen 18a, Solna

Language: English

Subject area: Biotechnology

Doctoral student: Luis Fernando Delgado, Genteknologi, Envgen

Opponent: Senior Lecturer Lionel Guy, Department of Medical Biochemistry and Microbiology, Infections and immunity, Uppsala University

Supervisor: Professor Anders F. Andersson, Genteknologi; Professor Lukas Käll, Genteknologi

QC 2024-05-15


Marine ecosystems harbour vast microbial diversity, which plays a crucial role in ecosystem functioning. Advancements in DNA sequencing technologies have transformed our ability to analyse microbial populations comprehensively. Metagenomic sequencing has emerged as a pivotal tool for characterising microbial communities across various environments. Bioinformatics, an interdisciplinary field, facilitates the analysis and interpretation of large biological datasets, including microbiome data.

This thesis aims to enhance bioinformatics approaches for analysing marine microbiomes. It comprises four papers covering bioinformatic developments and genomic data analysis across multiple topics, including metagenomics, pangenomics, comparative genomics and population genomics:

Paper I evaluated three assembly strategies for constructing gene catalogues from metagenomic samples: individual sample assembly with gene clustering, co-assembly of all samples, and a new hybrid approach, mix-assembly. The results highlighted the efficacy of the mix-assembly approach for maximising the information extracted from metagenomic samples, offering opportunities for further exploration in microbial ecology and environmental genomics.

Using the mix-assembly approach, we conducted a comprehensive analysis of 124 metagenomic samples sourced from the Baltic Sea, resulting in the refinement of the Baltic Sea Gene Set (BAGS v1.1), which now encompasses 66.53 million genes annotated for both functionality and taxonomy. In Paper II, we introduced an open-access initiative that provided the mix-assembly pipeline code. We also developed the BAGS-Shiny web application to facilitate user interaction with this extensive gene catalogue.

Paper III focused on whole-genome sequencing and assembly of 82 environmental Vibrio vulnificus strains from the Baltic Sea, enabling comprehensive comparative genomic analysis. I developed the PhyloBOTL pipeline, which uses a phylogeny-based approach to identify genes associated with pathogenicity. Comparative genomics of 208 clinical isolates and 199 environmental isolates revealed 58 orthologs enriched in pathogenic strains, including known virulence factors and novel genes. Potential biomarkers for pathogenic V. vulnificus were identified, and primers suitable for PCR-based environmental monitoring were designed in silico.

In Paper IV, population genomic analysis was carried out using the Input_Pogenom pipeline and the POGENOM tool to explore intraspecific biogeographical patterns. Geographical barriers were found to significantly influence the distribution of aquatic bacteria, with greater genetic differentiation observed between the Baltic and Caspian seas than within the Baltic Sea's salinity gradient.
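
To illustrate what "genetic differentiation" between two populations means in practice, the sketch below computes a textbook fixation index (F_ST) from allele frequencies at biallelic loci, using F_ST = (H_T - H_S) / H_T. It is only an illustration under stated assumptions: the function names and the input frequencies are hypothetical, and this is not necessarily the estimator that POGENOM or the Input_Pogenom pipeline uses.

```python
# Illustrative only: a generic heterozygosity-based F_ST for two populations.
# Function names and example frequencies are hypothetical, not taken from the thesis.

def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(freqs_a, freqs_b):
    """F_ST = (H_T - H_S) / H_T, summed over loci (ratio of sums).

    freqs_a, freqs_b: allele frequencies of the same allele at the same loci
    in population A and population B. Equal population weights are assumed.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations must cover the same loci")

    h_s_sum = 0.0  # mean within-population heterozygosity, summed over loci
    h_t_sum = 0.0  # heterozygosity of the pooled population, summed over loci
    for p_a, p_b in zip(freqs_a, freqs_b):
        h_s_sum += (expected_heterozygosity(p_a) + expected_heterozygosity(p_b)) / 2.0
        p_pooled = (p_a + p_b) / 2.0
        h_t_sum += expected_heterozygosity(p_pooled)

    if h_t_sum == 0.0:
        return 0.0  # no variation at any locus, so no measurable differentiation
    return (h_t_sum - h_s_sum) / h_t_sum


if __name__ == "__main__":
    # Hypothetical frequencies: similar populations vs. strongly diverged ones.
    print(fst_two_populations([0.30, 0.50], [0.35, 0.45]))
    print(fst_two_populations([0.10, 0.20], [0.90, 0.80]))
```

A value near 0 means the two populations share nearly the same allele frequencies, while values approaching 1 indicate that they are close to fixed for different alleles.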