BONSAI - 2016 - Annual activity report

Section: New Results

Parallel algorithm for de Bruijn graph compaction

Constructing a de Bruijn graph is an important step in the analysis of next-generation sequencing (NGS) data. This data structure is used in several applications, such as de novo assembly, variant detection, and transcriptome quantification. However, representing the graph often consumes prohibitive amounts of memory for large datasets. An operation called compaction, which merges each non-branching path of the graph into a single node, makes it possible to represent the graph more efficiently. Until now, however, no algorithm could compact the graph both quickly and in low memory.
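To make the compaction operation concrete, the following is a minimal, sequential sketch in Python. It is an illustration only and not the BCALM 2 algorithm: it holds all k-mers in memory, ignores reverse complements, and the function names (`kmers`, `compact`, and the helpers) are hypothetical. It treats each k-mer as a node, links two k-mers that overlap by k-1 characters, and merges every maximal non-branching path into one string (often called a unitig).

```python
def kmers(reads, k):
    """Return the set of all k-mers occurring in the reads (forward strand only)."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def successors(kmer, nodes):
    """k-mers in the graph whose (k-1)-prefix equals the (k-1)-suffix of kmer."""
    return [kmer[1:] + c for c in "ACGT" if kmer[1:] + c in nodes]


def predecessors(kmer, nodes):
    """k-mers in the graph whose (k-1)-suffix equals the (k-1)-prefix of kmer."""
    return [c + kmer[:-1] for c in "ACGT" if c + kmer[:-1] in nodes]


def right_neighbor(u, nodes):
    """Return v if the edge u -> v is non-branching on both ends, else None."""
    succ = successors(u, nodes)
    if len(succ) != 1:
        return None
    v = succ[0]
    return v if len(predecessors(v, nodes)) == 1 else None


def left_neighbor(v, nodes):
    """Return u if the edge u -> v is non-branching on both ends, else None."""
    pred = predecessors(v, nodes)
    if len(pred) != 1:
        return None
    u = pred[0]
    return u if len(successors(u, nodes)) == 1 else None


def compact(nodes):
    """Merge every maximal non-branching path into a single string."""
    unitigs = []
    visited = set()
    for start in sorted(nodes):
        if start in visited:
            continue
        # Walk left to the first k-mer of the path (guarding against cycles).
        first = start
        seen = {start}
        while True:
            u = left_neighbor(first, nodes)
            if u is None or u in seen:
                break
            seen.add(u)
            first = u
        # Walk right, appending one character per merged k-mer.
        unitig = first
        visited.add(first)
        cur = first
        while True:
            v = right_neighbor(cur, nodes)
            if v is None or v in visited:
                break
            visited.add(v)
            unitig += v[-1]
            cur = v
        unitigs.append(unitig)
    return sorted(unitigs)


if __name__ == "__main__":
    nodes = kmers(["TACGT", "ACGA"], k=3)
    print(len(nodes), "k-mers compacted into", compact(nodes))
```

In this toy input the k-mer ACG has two successors, so the path splits there: the four k-mers are stored as three strings, and on long non-branching stretches the saving grows with the path length.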

Along with colleagues at Inria Rennes and at Penn State University, we introduced a parallel algorithm and an implementation, BCALM 2, for directly constructing a compacted de Bruijn graph from a set of reads. Our results show that this algorithm can construct the graph for very large datasets, such as the spruce and pine genomes, in reasonable time and memory on a single machine. This represents a performance improvement of two orders of magnitude over previously available methods. BCALM 2 is open-source and was published at ISMB 2016 [20].
