Mapping the genome of Sweden's most important plant

Together with colleagues at UPSC and SciLifeLab, Lars Arvestad and PhD student Kristoffer Sahlin are taking part in a large interdisciplinary project to map the genes of the Norway spruce, Sweden's most important plant.

In the spring of 2011, two of CB's research teams, Jens Lagergren's Lab and Lars Arvestad's Lab, both working in Computational Biology and Bioinformatics, moved to the premises of Science for Life Lab (SciLifeLab*) at Karolinska Institute Science Park in Solna. The teams moved because they work closely with the other groups within SciLifeLab, explains Associate Professor Lars Arvestad.

"I find it very stimulating and interesting to be part of this interdisciplinary environment. It is also convenient from a practical perspective, since in our field it is important to be close to the data," he says.

Lars Arvestad's group focuses on computational problems in evolution and genomics. In their work, they collaborate with Joakim Lundeberg, Professor at KTH Biotechnology and manager of the genomics platform at SciLifeLab. He is also one of the key researchers in a large project aiming to map the genes of Sweden's most important plant, the Norway spruce.

This five-year spruce project started in 2010. It is coordinated by Umeå Plant Science Center (UPSC) and includes scientists at SciLifeLab as well as Canadian, Italian and Belgian scientists. It is financed by a SEK 75 million grant from the Knut and Alice Wallenberg Foundation, plus matching funding from the participating universities.

The Swedish researchers are in for a real challenge: they are the first in the world to tackle the largest amount of genetic material ever sequenced in a plant or animal species. The rapid development of new DNA sequencing technologies has now made such a large project feasible.

"It is not an easy task, since a spruce cell has seven times as much DNA as a human cell. Just handling the vast amounts of data has proven to be a practical problem. Many computer programs previously used to assemble the genomes of large animals, simple bacteria and plants with smaller genomes do not work for spruce," says Lars Arvestad.

One of the questions the project wants to answer is why conifers have so much DNA. Is this the reason for their success on Earth over millions of years?

"What makes it tricky to piece together the spruce genome is that it has a lot of repetition," says Lars Arvestad.

He supervises graduate student Kristoffer Sahlin, who has developed a new method for linking up larger pieces of DNA, known as "contigs". These are believed to represent pieces of the spruce genome, but there is limited, or sometimes conflicting, information about how they are linked together: the order in which they come and how far apart they sit.

Normally, in a genome assembly process, relatively short reads (pieces) of DNA are first identified. Then, using calculations and other knowledge, the researchers can piece together what the genome should look like.

"Kristoffer is working with so-called scaffolding, a process that links together the small pieces of DNA into larger pieces in the best way," says Lars Arvestad. "His solution is scalable, which is important for a genome as large as that of the spruce, and it can also estimate the distance between contigs better than previous methods."

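To give a rough idea of what estimating the distance between two contigs can involve, here is a minimal Python sketch. It is not Kristoffer Sahlin's method, which the article does not describe in detail, but a deliberately naive textbook-style estimate. It assumes that DNA fragments of a known average length have been read from both ends, and that for some fragments one end lands on contig A and the other on the following contig B. The function name, the parameters and the numbers are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def naive_gap_estimate(links, contig_a_length, mean_fragment_length):
    """Naively estimate the gap (in bases) between contig A and contig B.

    links: list of (start_on_a, end_on_b) pairs, one per fragment that
        spans the gap. start_on_a is where the fragment begins on
        contig A (0-based); end_on_b is how many bases of contig B the
        fragment covers.
    contig_a_length: length of contig A in bases.
    mean_fragment_length: assumed average length of the sequenced fragments.

    Illustrative assumption only: the gap is taken to be the average
    fragment length minus the average part of each fragment that is
    observed on the two contigs.
    """
    if not links:
        raise ValueError("at least one linking fragment is needed")
    # Portion of each fragment that lies on contig A plus the portion on contig B.
    observed = [(contig_a_length - start_on_a) + end_on_b
                for start_on_a, end_on_b in links]
    return mean_fragment_length - mean(observed)


# Hypothetical example: three fragments link a 1000-base contig A to contig B.
example_links = [(900, 150), (850, 200), (950, 100)]
estimated_gap = naive_gap_estimate(example_links, 1000, 500)
```

The sketch simply averages over the linking fragments and ignores how much individual fragment lengths vary. Conflicting links, such as those mentioned above, would need additional handling.
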