Section: New Results
Detection of small non-coding RNAs
Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data. These approaches work well for eukaryotic sRNAs but remain inaccurate for bacterial sRNAs, which are longer and highly structured. Together with colleagues from INSA de Lyon, Stéphan Lacour developed APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. The algorithm takes a novel approach: rather than starting from the read coverage distribution, it analyzes the boundaries of individual sequenced fragments to infer the 5′ and 3′ ends of all transcripts. Validation on Escherichia coli and Salmonella enterica datasets, using experimentally validated sRNAs as a reference, showed that APERO outperforms all existing methods in terms of sRNA detection and boundary precision. Moreover, APERO was able to identify the small transcript repertoire of Dickeya dadantii, including putative intergenic RNAs, 5′ UTR- or 3′ UTR-derived RNA products, and antisense RNAs. This work was published in Nucleic Acids Research this year [18]. APERO is freely available as an open source R package (https://github.com/Simon-Leonard/APERO). In other work, together with colleagues from the University of Salento, Lecce (Italy), Stéphan Lacour contributed to RhoTermPredict, an algorithm for predicting Rho-dependent transcription terminators in bacterial genomes [16].
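The fragment-boundary idea described above for APERO can be illustrated with a small sketch. The code below is not the APERO implementation and does not use its R interface. It is a simplified Python illustration under stated assumptions: fragments are given as (start, end) pairs on a single strand, nearby fragment starts are grouped into a candidate 5′ end, and the most frequent fragment end within each group is taken as the 3′ end. The function name `infer_transcripts` and the parameters `min_support` and `max_gap` are hypothetical and chosen only for this example.

```python
from collections import Counter


def infer_transcripts(fragments, min_support=3, max_gap=2):
    """Toy boundary-based transcript detection (illustrative only).

    fragments   -- iterable of (start, end) fragment coordinates on one strand
    min_support -- assumed minimum number of fragments per candidate 5' end
    max_gap     -- assumed maximum distance (nt) between starts in one group
    Returns a list of (five_prime, three_prime, support) tuples.
    """
    fragments = list(fragments)
    start_counts = Counter(start for start, _ in fragments)

    # Group neighbouring start positions: a position joins the current
    # group if it lies within max_gap of the group's last position.
    clusters = []
    for pos in sorted(start_counts):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])

    transcripts = []
    for cluster in clusters:
        support = sum(start_counts[p] for p in cluster)
        if support < min_support:
            continue  # too few fragments start here to call a 5' end

        # Representative 5' end: most frequent start (ties: leftmost).
        five_prime = max(cluster, key=lambda p: (start_counts[p], -p))

        # 3' end: most frequent end among fragments starting in this
        # group (ties: rightmost).
        lo, hi = cluster[0], cluster[-1]
        end_counts = Counter(e for s, e in fragments if lo <= s <= hi)
        three_prime = max(end_counts, key=lambda e: (end_counts[e], e))

        transcripts.append((five_prime, three_prime, support))
    return transcripts
```

In this sketch, transcript boundaries come directly from where individual fragments begin and end, so no coverage profile is computed at any point. A real analysis would additionally need to handle strand information, read-pair orientation, and noise in fragment ends, all of which are omitted here.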