SU Course : Comparative Genomics

Source – Lecture slide

The final course of the Stockholm University semester is Comparative Genomics. The course is bioinformatics-oriented and focuses on using computational tools and writing small scripts to analyse genetic information available from various resources. The course is taught by Erik Sonnhammer.

The course has 2 components:

1) Theory classes

The theory classes focus on basic concepts such as genome organisation, gene prediction, phylogenetics and orthology methods. The lectures are held on Monday afternoons, and each one covers one main concept. Before each lecture there are assigned readings and a quiz based on them. The lecture starts with a discussion of the quiz, followed by in-depth discussion for the rest of the session. The lectures act as an introduction to that week’s lab assignments.

2) Bioinformatics lab

The lab is the main component of the course. The students are divided into groups of 4-5, and each group is given a starting data set: a set of genomes. The first lab focuses on exploring the databases and tools to find out which organisms the genomes belong to. Each week’s lab builds on the previous week’s lab. For example, part of the 2nd week’s lab focuses on predicting Open Reading Frames (ORFs) and the amino acid sequence for each ORF using various tools.

A typical prokaryotic ORF Source – Wikipedia

As you may know, the genomes of eukaryotes and prokaryotes differ significantly in genome organisation, so the same algorithm cannot be used to predict ORFs in both. This means that, to know which tool to apply to which genome, we needed to know the type of organism from the previous week’s lab.

SU has a very beautiful campus, and we often used to sit outside on the university lawns to relax and take a break from the lab work.

We had 7 such weeks of lab work. The labs were very stimulating for self-learning, as we had to write many scripts in Python to get the correct input format for different tools, to perform certain types of custom analysis, etc. Interestingly, we discovered in the lab that many of the tools used in the field were developed in Erik Sonnhammer’s own research lab, which inspired us, as we were learning from a leader in the field himself! It also meant that the course was intensive, and we ended up spending whole days in the lab trying to figure out the code. Of course, we had help from the two TAs, but it was limited to general guidance and did not extend to writing the scripts themselves. However, I believe that was necessary for us to gain confidence in writing Python scripts independently!

That’s my classmate Amparo designing her ORF predictor


The last week of the course was focused on a small project. All the groups were given 3 tasks: to analyse the genome for basic characteristics such as GC content, to perform a phylogenetic analysis, and, most challenging of all, to write our own ORF predictor tool!
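To give a flavour of the first task, here is a minimal sketch of a GC content calculation. It is not the script we wrote in the course; the function name is my own, and it assumes that ambiguous bases such as N are simply left out of the total.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the A, C, G and T bases.

    Assumption for this sketch: any other character (e.g. N) is ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Avoid dividing by zero for empty or fully ambiguous sequences.
    return gc / total if total else 0.0


if __name__ == "__main__":
    print(gc_content("ATGCGCTA"))  # 4 of the 8 bases are G or C
```

For a whole genome, the same function could be applied to each sequence read from a FASTA file.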

Just to give an idea, these are some of the logical things to keep in mind while designing the ORF predictor. Source – Wikipedia
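The most basic version of such a predictor can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not any group's actual tool: it treats an ORF as an ATG followed in the same reading frame by the first stop codon, scans all six frames, and applies a minimum length. The function names and the default of 30 codons are my own choices.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs_in_strand(seq, min_codons=30):
    """Return (start, end) pairs for ORFs in the three frames of one strand.

    start is the index of the A in ATG; end is exclusive and includes
    the stop codon. min_codons counts codons before the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


def find_orfs(sequence, min_codons=30):
    """Scan both strands; return (strand, start, end) tuples.

    Coordinates on the "-" strand refer to the reverse-complemented
    sequence, not to the original one.
    """
    seq = sequence.upper()
    forward = find_orfs_in_strand(seq, min_codons)
    reverse = find_orfs_in_strand(reverse_complement(seq), min_codons)
    return [("+", s, e) for s, e in forward] + [("-", s, e) for s, e in reverse]


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))  # [('+', 2, 17)]
```

A sketch like this ignores alternative start codons and anything specific to eukaryotic gene structure such as introns, which is part of why real tools are far more involved.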

As the last step, we presented our findings about our genome in a talk and compared the sensitivity and specificity of our ORF predictor with those of professional tools. Surprisingly, many of the student-designed tools had sensitivity or specificity close to that of the available tools in the field, even though we had considered only the basic criteria.
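As a rough illustration of such a comparison, the sketch below scores a set of predicted ORFs against a reference set. It rests on assumptions of mine rather than the course's exact procedure: an ORF counts as correct only if its strand and coordinates match the reference exactly, sensitivity is TP / (TP + FN), and specificity is TP / (TP + FP), a definition often used in gene prediction that other fields call precision. The coordinates in the example are made up.

```python
def evaluate_predictions(predicted, reference):
    """Return (sensitivity, specificity) for predicted vs. reference ORFs.

    Both arguments are iterables of hashable ORF descriptions, for
    example (strand, start, end) tuples. Only exact matches count.
    """
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)   # predicted and in the reference
    fp = len(predicted - reference)   # predicted but not in the reference
    fn = len(reference - predicted)   # in the reference but missed
    sensitivity = tp / (tp + fn) if reference else 0.0
    # Gene-prediction style "specificity", i.e. precision (assumption).
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical coordinates, purely for illustration.
    predicted = [("+", 2, 17), ("+", 40, 100), ("-", 5, 50)]
    reference = [("+", 2, 17), ("+", 40, 100), ("+", 200, 260), ("-", 300, 400)]
    sens, spec = evaluate_predictions(predicted, reference)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Exact matching is strict; a more forgiving comparison might count a prediction as correct when only the stop codon position agrees.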

In general, the course is a very elegant way to learn Python scripting while applying it to relevant biological questions.

Stay tuned as next we go to our third semester at KTH!