Tag Archives: bioinformatics

Course Review: High-throughput data analysis

The course BB2491 High-throughput data analysis is the core of the MTLS education at KTH. How does BB2491 differ from the traditional biology lectures we have had so far? In this blog post I will talk a bit about why and how we should learn and master high-throughput science!

First, why study high-throughput science? It is involved in a myriad of industrial applications as well as the frontiers of research. To name but a few:

  • Next Generation Sequencing (NGS) is the best illustration of generating and analysing high-throughput data
  • High-throughput compound screening dominates lead discovery for small-molecule drugs
  • Traditional gene-targeting methods are sufficient for the analysis of Mendelian diseases, but diseases that involve a more complex interplay require collecting and analysing “bigger data”

In BB2491, the teaching follows a logical progression from theory to practice to a hands-on project.


In biology, high-throughput analysis can be split into three parts, namely Genomics, Transcriptomics, and Proteomics. Correspondingly, we have three professors, one responsible for each part:


Lukas Käll, Lars Arvestad, Olof Emanuelsson

In contrast to traditional biology lectures, we have no designated textbook; instead, we have 33 research papers and scientific reviews as mandatory reading material! It sounds a bit daunting at the beginning, but with the careful guidance of the teachers, as well as the fundamental building blocks from previous courses (Genomics, Proteomics, Bioinformatics), we are able to dive into the ocean of knowledge!

For example, while Olof introduced the basic concept of RPKM (reads per kilobase of transcript per million mapped reads) in abundance estimation at the start of the transcriptomics part, the part was concluded by four excellent students at the cutting edge of RNA sequencing techniques.
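To make the idea of RPKM concrete, here is a minimal sketch in Python. It uses the standard definition (reads mapped to a gene, scaled by gene length in kilobases and by the total number of mapped reads in millions). The function name, the gene names and all the numbers are made up for illustration; this is not code from the course.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    The factor 1e9 combines "per kilobase" (1e3) and "per million reads" (1e6).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: two genes with the same read count but different lengths.
total_reads = 10_000_000
genes = {
    "geneA": {"reads": 500, "length_bp": 2000},
    "geneB": {"reads": 500, "length_bp": 1000},
}

for name, info in genes.items():
    value = rpkm(info["reads"], info["length_bp"], total_reads)
    print(name, value)
```

With equal read counts, the shorter gene ends up with the higher RPKM, which is exactly the length normalisation that makes abundances comparable between genes.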



2nd semester of Molecular Techniques in Life Science – Part I

The most notable change in the second semester is that we are going to study at a new school, Stockholm University, and learn programming in Python, because we will use it in all four of our courses, namely Introduction to Bioinformatics, Project in Molecular Life Sciences, Biophysical Chemistry and Comparative Genomics.

Stockholm University

(The content of Introduction to Bioinformatics and Biophysical Chemistry is well covered in my blog posts Course Review: Introduction to Bioinformatics and Course Review: Biophysical Chemistry, so I will not spend any time on them here.)

Project in Molecular Life Sciences

As its name suggests, it is a 100% project-based, lecture-free, exam-free course, which means that the grade (from A to F) is based entirely on performance, including:

  • A full written report
  • A final presentation of the project
  • Participation and attendance in the weekly seminars
  • Quality of the code

As there are no lectures, the whole class was assigned two teaching assistants who can offer guidance throughout the 1.5 months of the course.

Then, what are we going to do in these 1.5 months? The task we were given is to construct a predictor that returns the secondary structural elements of a protein sequence. How many possibilities are there? Those elements can be alpha-helix/beta-sheet/coil, membrane-embedded/solvent-exposed, extracellular/intracellular... That’s why, besides looking into the biochemical properties of amino acids, we have to employ the powerful tools of machine learning.

It is my first time taking a project-based course on my own. Suddenly, I am “liberated”: no more lectures or exams; all I need to do is design my predictor and write the code for it. But as you can imagine, it is never easy:

First, I have to familiarize myself with scikit-learn (sklearn), the machine learning library that I use, which is a time-consuming process for any neophyte. After successfully running scikit-learn for the first time, I have no time to celebrate, because the accuracy is not satisfactory. I have to dive into my hundreds of lines of code, look for possible explanations, work out a solution, and run the accuracy test again. And as the algorithm gets more complicated, the running time of the accuracy test grows: it takes a whole night to check only 1/6 of the sequences that I have! It is an endless cycle until the last minute before submission.
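To give a flavour of what such a predictor needs before any machine learning happens, here is a minimal sketch of two common ingredients: turning a protein sequence into per-residue features with a sliding window, and measuring per-residue accuracy. It is an illustration under my own assumptions, not my actual project code: the window size, the one-hot encoding, the three-state labels (H for helix, E for strand, C for coil) and all function names are hypothetical choices.

```python
# The 20 standard amino acids, in a fixed order used for one-hot encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_windows(sequence, window=3):
    """One-hot encode every residue together with its neighbours.

    Returns one feature row per residue; each row has window * 20 entries.
    Positions outside the sequence (and unknown letters) stay all-zero.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    features = []
    for centre in range(len(sequence)):
        row = []
        for pos in range(centre - half, centre + half + 1):
            one_hot = [0] * len(AMINO_ACIDS)
            if 0 <= pos < len(sequence) and sequence[pos] in AA_INDEX:
                one_hot[AA_INDEX[sequence[pos]]] = 1
            row.extend(one_hot)
        features.append(row)
    return features


def per_residue_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(observed) == 0:
        raise ValueError("cannot compute accuracy on an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical toy input: a three-residue "protein" and made-up state strings.
X = encode_windows("MKV", window=3)
print(len(X), len(X[0]))  # 3 rows, 3 * 20 = 60 columns
print(per_residue_accuracy("HHCC", "HECC"))  # 3 of 4 residues agree -> 0.75
```

Feature rows like these, paired with one label per residue, are the kind of input that a scikit-learn classifier (for example `sklearn.svm.SVC`, via its `fit` and `predict` methods) expects. Larger windows give the classifier more context but also more columns, which is one reason the accuracy tests get slower as the model grows.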

Sisyphus is me

On the other hand, I feel so lucky to be in my current class: we are adversaries, but at the same time friends and comrades. We give comments, advice and criticism on each other’s work, and we also share information to save time, for example when we run the accuracy tests. When I look back on the six weeks I spent in the computer lab, I start to realize how robustly we were growing in this phase. We will grow even stronger.

In addition, my classmate Carolina also has a lot to say about this course on her Karolinska blog. Here is the link: https://studentblogski.wordpress.com/2017/03/30/how-to-survive-2-months-of-programming/


On the way to becoming a bioinformatician

As elaborated in my blog post About TIANLIN, my passion for the emerging field of bioinformatics is my motivation for pursuing the master’s degree in Molecular Techniques in Life Science here in Stockholm. If you have never heard of bioinformatics or bioinformaticians before, let me tell you more:

1. Why bioinformatics

Working in a so-called “wet lab”, whether it is clinical or molecular, generates biological data from time to time. As the quantity and complexity of these data grow, they can by no means be handled manually. This gives rise to bioinformatics, which refers to the development of software tools that enable us to organize, interpret and make use of these biological data.


Right: A wet lab in Science For Life Laboratory, where researchers were preparing samples for DNA sequencing in a clean room




2. What can bioinformatics do

From the description above, one may think that bioinformatics is just another Microsoft Excel-like user interface that allows us to look at our experimental data and write a report on it. But bioinformatics is more powerful than that! For example, it leads to the conclusion that non-Africans carry roughly 1 to 2% Neanderthal DNA; it tells whether two diseases have a causal relationship from the ten-year medical records of a whole country; it predicts, with very high accuracy, the structure of an enzyme, a task that typically takes years in a wet lab.

Applying bioinformatics: an interactive map of three diseases


Sci-Life-Lab unlocked!


You may not be familiar with adenine (A), thymine (T), guanine (G) and cytosine (C), but it is impossible that you have never heard of DNA sequencing! As we all know, DNA sequencing is an amazing technique that “deciphers” the secret code hidden in your body. This code differentiates you from any other individual in the world; it can probably tell your eye color and where you come from, and even help predict diseases! It is difficult to explain DNA sequencing in this blog post, but I can tell you where most of the DNA sequencing in Stockholm is performed: Science For Life Laboratory, or “Scilifelab” as we usually call it!

Thanks to the course Frontier in Translational Medicine, a guided tour inside the Scilifelab was arranged for my class. We were fascinated, not only because it would allow us to learn how the institute operates, but also because we could get in touch with the bioinformaticians working there, which would certainly be beneficial for our future career paths.
