Section: New Results

Design and optimization for population genomics

As mentioned above, the field of genomics is experiencing a massive increase in the amount of data to be processed. Furthermore, the data generated can sometimes be hard to interpret, in particular next-generation sequencing (NGS) data used for copy number variation (CNV) detection.

We investigate new ways to discover copy number variation in the human population using methods from the deep learning community. Great success has already been achieved in that area by projects such as DeepVariant; such projects managed to considerably lower the latency for getting results (about 10-fold), but at a higher computational cost. Such methods are currently attracting significant attention in the biology and bioinformatics communities, as witnessed by an editorial in Cell Systems (December 2017) (http://www.cell.com/cell-systems/fulltext/S2405-4712(17)30554-9).

As the area of population genomics is fairly new, we hope to help design a complete framework that allows for better optimizations and integration with database tools. This work is carried out by Yanlei Diao and Felix Raimundo, together with Dr. Avinash Abhyankar of the New York Genome Center (NYGC), who co-advises F. Raimundo's PhD, and Dr. Toby Bloom (head of informatics at NYGC).