

Section: Software

Sequence annotation

We develop tools for discovering and searching for complex pattern signatures in biological sequences, with a focus on protein sequences. Dr Motif (http://www.drmotifs.org/), an integrated environment available on the GenOuest Platform, gathers state-of-the-art tools for pattern discovery and pattern matching, including our own developments.

  • Complex pattern discovery: Protomata learner (http://protomata-learner.genouest.org/) is a grammatical inference framework for inferring accurate protein signatures [3], [4]. It was completely redesigned in 2010-2011 with the support of a specific Inria action (ADT). It is currently applied to the recognition of olfactory receptor genes.

  • Complex pattern matching: Logol (http://webapps.genouest.org/LogolDesigner/). We have completely redesigned Stan (suffix-tree analyser), an earlier tool for searching for nucleotide and peptide patterns within whole chromosomes [7]. The result is Logol, a software suite that accepts a syntax based on String Variable Grammars and allows the description of realistic complex patterns including ambiguities, insertions/deletions, gaps, repeats and palindromes. It was first presented in [21]. Logol has been applied to the detection of -1 frameshifts, a structure that includes pseudoknots, on a reference benchmark (Recode2).
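
To give a rough idea of what a pattern with a string variable looks like, the following Python sketch searches a DNA sequence for a biological palindrome: a segment X, a gap of bounded length, then the reverse complement of X. This is not Logol syntax and not how Logol is implemented; the function names (reverse_complement, find_stem_loops) and parameters are hypothetical and chosen for illustration only, and the sketch assumes exact matching on the plain A, C, G, T alphabet.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_stem_loops(seq, stem_len, min_gap, max_gap):
    """Find occurrences of X, a gap, then the reverse complement of X.

    X plays the role of a string variable: its value is bound at the
    first occurrence and reused (reverse-complemented) after the gap.
    Returns a list of (start, stem, gap_length) tuples.
    """
    hits = []
    last_start = len(seq) - 2 * stem_len - min_gap
    for start in range(last_start + 1):
        stem = seq[start:start + stem_len]
        target = reverse_complement(stem)
        for gap in range(min_gap, max_gap + 1):
            second = start + stem_len + gap
            if second + stem_len > len(seq):
                break  # longer gaps would run past the end of the sequence
            if seq[second:second + stem_len] == target:
                hits.append((start, stem, gap))
    return hits


if __name__ == "__main__":
    # Toy sequence: GGCAC, a 4-base gap, then GTGCC (its reverse complement).
    toy = "TTGGCACAAAAGTGCCTT"
    print(find_stem_loops(toy, stem_len=5, min_gap=3, max_gap=6))
```

A tool of the kind described above goes well beyond this sketch, for instance by allowing ambiguities and insertions/deletions in the matched segments; the sketch only shows the idea of binding a variable once and constraining a later part of the sequence with it.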