On the way to becoming a bioinformatician

As elaborated in my blog About TIANLIN, my passion for the emerging field of bioinformatics is what motivated me to pursue the master's degree in Molecular Techniques in Life Science here in Stockholm. If you have never heard of bioinformatics or bioinformaticians before, let me tell you more:

1. Why bioinformatics

Working in a so-called “wet lab”, whether clinical or molecular, continually generates biological data. As the quantity and complexity of these data grow, they can no longer be handled manually. This gave rise to bioinformatics, which refers to the development of software tools that enable us to organize, interpret and make use of these biological data.


Right: A wet lab in Science For Life Laboratory, where researchers were preparing samples for DNA sequencing in a clean room




2. What can bioinformatics do

From the description above, one may think that bioinformatics is just another Microsoft Excel-like interface that lets us look at our experimental data and write a report on it. But bioinformatics is more powerful than that! For example, it led to the conclusion that non-Africans carry roughly 1 to 4% Neanderthal DNA; it can tell whether two diseases are causally related from ten years of a whole country's medical records; and it can predict the structure of an enzyme with very high accuracy, a task that typically takes years in a wet lab.

Applying bioinformatics: an interactive map of three diseases

3. Prerequisites for becoming a bioinformatician


A bachelor's degree in a life science-related subject (e.g. biochemistry, biology, biotechnology) is more than sufficient. In practice, understanding the central dogma of life (replication -> transcription -> translation) as well as the essential large molecules (nucleic acids, proteins) is of paramount importance.

Understanding the Central Dogma is the first step towards becoming a bioinformatician
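To see how the transcription and translation steps of the central dogma look to a computer, here is a minimal Python sketch. It assumes the input is the coding strand of a gene read in frame from its first base, and it uses the standard genetic code; the function names `transcribe` and `translate` are my own choices for illustration, not part of any library.

```python
# A minimal sketch of transcription and translation (illustrative only).
# Assumptions: the DNA given is the coding strand, read in frame from
# position 0, and the standard genetic code applies.

BASES = "UCAG"
# Amino acids for all 64 codons, ordered with U, C, A, G at each position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # '*' marks a stop codon


def transcribe(dna):
    """Turn a DNA coding strand into mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```

Real tools have to deal with reading frames, introns and ambiguous bases, so treat this only as a way to picture the idea.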


Of course, prior knowledge of a programming language is always preferred. Otherwise, we start with Python or R, which are not difficult to learn because they are more user-friendly than traditional languages such as C++.

As you can see, the biggest barrier between me (a biotechnology student by training) and a bioinformatician is the informatics part. But I have always believed that “Where there is a will, there is a way”, so I took the following steps one by one:

1. Learning Python through online courses

As I had no previous exposure to programming at all, my Rome had to be built from scratch. First, I registered an account at www.codeacademy and took their Python course.

Pros: a good introduction to basic Python concepts (string, dictionary, loop…); you learn while having fun, as the assigned tasks are very enjoyable to complete

Cons: not bio-related, may be way too easy

Alternatively, reading the free booklet A Byte of Python helps you familiarize yourself with Python without any online courses.
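The basic concepts mentioned above (string, dictionary, loop) become more interesting once they are applied to a biological sequence. Below is a small sketch of my own, not taken from any course: it counts the bases in a DNA string and computes its GC content. The function names `count_bases` and `gc_content` are hypothetical.

```python
# A small sketch combining a string, a dictionary and a loop.
# The function names are made up for illustration.

def count_bases(sequence):
    """Count how many times each base occurs in a DNA string."""
    counts = {}
    for base in sequence.upper():
        counts[base] = counts.get(base, 0) + 1
    return counts


def gc_content(sequence):
    """Return the fraction of bases that are G or C (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    counts = count_bases(sequence)
    return (counts.get("G", 0) + counts.get("C", 0)) / len(sequence)


dna = "ATGCGC"
print(count_bases(dna))  # {'A': 1, 'T': 1, 'G': 2, 'C': 2}
print(gc_content(dna))   # 4 of the 6 bases are G or C
```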

2. More advanced online courses

I also registered at www.coursera.com for two courses, one about fundamental bioinformatics and the other about machine learning.

Pros: the courses available on this website come from elite universities, so they are of high quality and highly relevant

Cons: hard to stick with (for reasons that you can easily think of…)

3. More reading

There are quite a few (and there will be more and more) textbooks about bioinformatics that are worth reading. The one that I have at hand is Understanding Bioinformatics, written by Marketa Zvelebil and Jeremy O. Baum.

Pros: contains very detailed information about both theories and practices in bioinformatics

Cons: difficult to prioritize the topics

It has been three months since I typed my first line of code, “Hello world”, in the terminal. In the coming weeks, I am going to complete a project on three-dimensional protein structure prediction using machine learning. Can’t wait!