Section: New Results

Metagenomics

Reconstruction of phylogenetic marker genes. Accurate identification of the organisms present in a community is essential to understanding the structure of an ecosystem. However, current HTS technologies, such as Illumina, generate short reads, which makes this identification difficult. One possibility is to focus on the assembly of taxonomic markers of interest, such as 16S ribosomal RNA. The PhD thesis of P. Pericard proposed an algorithm specifically dedicated to this problem. The method implements a stepwise process based on the construction and analysis of a read overlap graph, which is built from read alignments (produced by SortMeRNA) and decomposed into relevant connected components extracted from a compressed representation of the graph. It is able to recover full-length 16S sequences with high precision (0.1% error rate). This work is published in the reference journal in the field [23], and the resulting software, MATAM, was released this spring. It is currently being tested in several labs: MEDIS (INRA-Université Clermont Auvergne) for gene capture; Labgem (Genoscope), where it is on track to be integrated into the PathoTRACK-MicroScope platform dedicated to the human intestinal microbiome; and the Australian Centre for Ancient DNA (University of Adelaide) for oral microbiome research. This work received the Best Oral Presentation Award from the SFBI (Société Française de Bioinformatique) this year [29].
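
The overlap-graph idea described above can be illustrated with a minimal Python sketch. It is a toy illustration and not the MATAM implementation: here overlaps are detected by exact suffix-prefix matching between reads, whereas MATAM derives them from SortMeRNA alignments, and the graph is not compressed. The function names and the `min_len` parameter are assumptions introduced for this example.

```python
from collections import defaultdict


def suffix_prefix_overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`
    (at least `min_len` long), or 0 if there is none."""
    max_len = min(len(a), len(b))
    for length in range(max_len, min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def build_overlap_graph(reads, min_len=3):
    """Undirected overlap graph: nodes are read indices,
    edges join two reads that overlap in either direction."""
    graph = defaultdict(set)
    for i in range(len(reads)):
        graph[i]  # register the node even if the read has no overlap
        for j in range(i + 1, len(reads)):
            if (suffix_prefix_overlap(reads[i], reads[j], min_len)
                    or suffix_prefix_overlap(reads[j], reads[i], min_len)):
                graph[i].add(j)
                graph[j].add(i)
    return graph


def connected_components(graph):
    """Connected components, each returned as a sorted list of read indices."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical toy reads: the first three chain together, the last two are isolated.
reads = ["ACGTAC", "TACGGA", "GGATTC", "TTTTTT", "CCCCAA"]
components = connected_components(build_overlap_graph(reads))
```

In such a sketch, each component groups reads that plausibly come from the same or closely related marker sequences and could then be assembled independently.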

Metagenomics assembly. Another important task that could help taxonomic assignment is the reconstruction of uncultured microbial strains and species whose genome sequence is entirely unknown. To this end, metagenomics mainly borrows techniques from classical genomics, i.e. from de novo assembly of isolate genomes. We built upon continuous methodological advances with our genomic assembler Minia, adding new data structures such as the minimal perfect hash function [26] and the compressed graph representation. In 2015, we participated in the CAMI metagenomic reconstruction challenge (https://data.cami-challenge.org/). This challenge gathered a total of 17 international groups, and Minia performed among the top assembly methods. This result is reported in an article from the CAMI consortium to appear in Nature Methods [25]. We further presented a poster at RECOMB 2017.
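
To make the notion of a minimal perfect hash function concrete, the following Python sketch maps a fixed set of n k-mers to the slots 0..n-1 without collision, so that per-k-mer information can be stored in a plain array. It is only a toy illustration: the brute-force seed search is an assumption made for this example, is practical only for a handful of keys, and is not the construction used in Minia (see [26] for the actual method). All function names are hypothetical.

```python
import hashlib


def seeded_hash(key, seed, size):
    """Deterministic hash of `key` into range(size), parameterised by `seed`."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def find_minimal_perfect_seed(keys, max_seed=100_000):
    """Brute-force search for a seed under which all keys land in distinct slots.
    Toy approach: only feasible for very small key sets."""
    keys = list(keys)
    n = len(keys)
    for seed in range(max_seed):
        slots = {seeded_hash(k, seed, n) for k in keys}
        if len(slots) == n:  # n keys, n distinct slots: minimal and perfect
            return seed
    raise ValueError("no perfect seed found; a real MPHF construction is needed")


def kmers(sequence, k):
    """Sorted list of the distinct k-mers of `sequence`."""
    return sorted({sequence[i:i + k] for i in range(len(sequence) - k + 1)})


# Hypothetical toy input: six distinct 4-mers.
keys = kmers("ACGTACGGA", 4)
seed = find_minimal_perfect_seed(keys)

# Per-k-mer data now fits in a flat array indexed by the hash value.
abundance = [0] * len(keys)
for kmer in keys:
    abundance[seeded_hash(kmer, seed, len(keys))] += 1
```

The point of the design is memory: once the function is built, the k-mers themselves need not be stored as dictionary keys, only the array of associated values.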

Targeted metagenomics. Within the PhD thesis of L. Siegwald, we participated in the design of a comprehensive evaluation protocol for comparing computational pipelines that analyze 16S amplicons, and studied the impact of different variables on the biological interpretation of results. This study included the following tools: CLARK, Kraken, Mothur, Qiime and One Codex. It was the subject of an invited keynote at the international workshop Recent Computational Advances in Metagenomics (RCAM 2017, http://maiage.jouy.inra.fr/?q=fr/rcam2017).