Section: Application Domains

Biological sequence annotation

To check the accuracy of in silico predictions, a last step (Axis 3) is to extract the genetic actors responsible for the biological pathways of interest in the targeted organism and to locate them in the genome. In our guiding example, the active proteins involved in PUFA control pathways have to be precisely identified. The structures of these actors are represented by syntactic models (see figure 4). We use knowledge-based induction on distantly related instances to recognize new members of a given sequence family within non-model genomes (see figure 3). A main objective is to model enzyme specificity with highly expressive syntactic structures, namely context-free models, in order to take into account the constraints imposed by local domains or long-distance interactions within a protein sequence.
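
To illustrate why a context-free model can capture long-distance interactions that a purely local pattern cannot, the following minimal sketch recognizes sequences with nested, paired residues around a loop. It is only an illustration under stated assumptions: the toy grammar (K paired with a later E, C paired with a later C, around a loop of G residues), the nonterminal names and the function `recognizes` are hypothetical and are not the models described in this section. Recognition uses the standard CYK algorithm on a grammar in Chomsky normal form.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration only):
#   S -> K S E | C S C | G S | G
# i.e. nested long-distance pairs (K...E, C...C) around a loop of G residues.
TERMINAL_RULES = {
    "K": {"Kt"},
    "E": {"Et"},
    "C": {"Ct"},
    "G": {"Gt", "S"},
}
BINARY_RULES = {
    ("Kt", "SE"): {"S"},   # S  -> K (S E)
    ("Ct", "SC"): {"S"},   # S  -> C (S C)
    ("Gt", "S"): {"S"},    # S  -> G S
    ("S", "Et"): {"SE"},   # SE -> S E
    ("S", "Ct"): {"SC"},   # SC -> S C
}


def recognizes(sequence, start="S"):
    """Return True if the toy grammar derives `sequence` (CYK algorithm)."""
    n = len(sequence)
    if n == 0:
        return False
    # table[i][l - 1] holds the nonterminals deriving sequence[i:i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, residue in enumerate(sequence):
        table[i][0] = set(TERMINAL_RULES.get(residue, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for pair in product(left, right):
                    table[i][length - 1] |= BINARY_RULES.get(pair, set())
    return start in table[0][n - 1]


if __name__ == "__main__":
    for seq in ("KCGGCE", "KCGGEC", "KGE"):
        print(seq, recognizes(seq))
```

In this sketch, `KCGGCE` is accepted because the K/E and C/C pairs are properly nested around the loop, whereas `KCGGEC` is rejected because the pairs cross. Dependencies of this nested kind, at arbitrary distance and depth, are what a regular, position-by-position pattern cannot express.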