The research domain of the bioinformatics Dyliss team is sequence analysis and systems biology. Our main goal in biology is to characterize groups of genetic actors that control the phenotypic response of species challenged by their environment. The team explores methods in the field of formal systems, more precisely in knowledge representation, constraint programming, multi-scale analysis of dynamical systems, and machine learning. Our goal is to identify key regulators of the environmental response by structuring and reasoning on information which combines physiological responses measured with omics technologies (RNA-seq, metabolomics, proteomics), genetic information from distantly related species, and knowledge about regulation and metabolic pathways stored in public repositories.
The main challenges we face are data incompleteness and heterogeneity. Rather than searching for a single optimized model, we favor the construction and study of a "space of feasible models or hypotheses" that includes the known constraints and facts about a living system. We develop methods allowing a precise investigation of this space of hypotheses. We are therefore in a position to design experimental strategies that progressively shrink the space of hypotheses and improve our understanding of the system. Importantly, our models span a rather large spectrum of discrete structures: oriented graphs, Boolean networks, automata, or expressive grammars.
More concretely, the steps of the analysis are to (i) formalize and integrate, as a set of logical or grammatical constraints, both generic knowledge (literature-based regulatory pathways, diversity of molecular functions, DNA patterns associated with molecular mechanisms) and species-specific information (physiological response to perturbation, sequencing data...); (ii) investigate the space of admissible models and exhibit its main features by solving combinatorial optimization problems; (iii) identify the corresponding genomic products within sequences. At each of these steps, we rely on symbolic methods for model-space exploration (ontologies and formal concept analysis).
We target applications for which large-scale heterogeneous data about a specific but complex physiological phenotype are available. Existing long-term partnerships with biological labs give strong support to this choice. In marine biology, we collaborate closely with the Station Biologique de Roscoff (Idealg, Investissements d'avenir "Bioressources et Biotechnologies"). We also collaborate with other Inria teams in the IPL Algae In Silico project to understand the metabolism of a microalga. In environmental microbiology, we collaborate with the CRG in Chile in the framework of the Ciric Chilean Inria center (Ciric-Omics). In agriculture, our main partners are within the INRA institute in Rennes, with a focus on understanding the pea aphid's reproduction mode and the metabolism of farm animals (pig, chicken, cow). More recently, we have introduced health as a new application field of the team, especially through the study of large-scale Boolean networks and their confrontation with knowledge repositories (collaboration with Inserm, CHU Rennes and Sanofi).
Biological networks are built with data-driven approaches aiming at translating genomic information into a functional map. Most methods are based on a probabilistic framework which defines a probability distribution over the set of models. The reconstructed network is then defined as the most likely model given the data.
Our team has investigated an alternative perspective where each dataset induces a set of constraints – related to the steady-state response of the system dynamics – on the set of possible values in a network of fixed topology. The methods that we have developed complete the network with product states at the level of nodes and influence types at the level of edges, so as to globally explain the experimental data. In other words, relevant information is no longer selected by picking the network with the highest score, but rather by exploring the complete space of models satisfying constraints on the possible dynamics supported by prior knowledge and observations. In the (common) case where no model satisfies all the constraints, we relax the problem by introducing new combinatorial optimization problems that allow the data or the knowledge to be corrected. Properties common to all solutions are considered robust information about the system, as they are independent of the choice of any single solution to the optimization problem.
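The flavor of this constraint-based exploration can be sketched in a few lines of Python. This is a deliberately tiny brute-force version of the idea; the influence graph, observations and consistency rule below are invented for illustration, and our actual tools delegate this search to ASP solvers.

```python
from itertools import product

# Hypothetical toy influence graph: edges with unknown signs (+1 or -1).
edges = [("A", "B"), ("A", "C"), ("B", "C")]
# Observed steady-state shifts after a perturbation: +1 = up, -1 = down.
obs = {"A": +1, "B": -1, "C": +1}

def consistent(signs):
    # Each non-input node's shift must be explained by at least one
    # predecessor: sign(source) * sign(edge) == sign(target).
    for node in obs:
        preds = [(s, sg) for (s, t), sg in zip(edges, signs) if t == node]
        if preds and not any(obs[s] * sg == obs[node] for s, sg in preds):
            return False
    return True

# Explore the complete space of sign labelings instead of one "best" model.
models = [dict(zip(edges, signs))
          for signs in product((+1, -1), repeat=len(edges))
          if consistent(signs)]
# Properties shared by all surviving models are robust conclusions.
shared = {e for e in edges if len({m[e] for m in models}) == 1}
```

Here three labelings survive, but all of them agree that A inhibits B: that sign is a robust conclusion, independent of which single model one would have picked.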
Solving these computational issues requires addressing NP-hard qualitative (non-temporal) problems. We have developed a long-term collaboration with Potsdam University in order to use a logical paradigm named Answer Set Programming (ASP) to solve these constraint-satisfiability and combinatorial optimization issues. Applied to transcriptomic or cancer networks, our methods identified which regions of a large-scale network should be corrected, and proposed robust corrections. This result suggested that the approach was compatible with the efficiency, scale and expressivity needed for biological systems.
During the last years, our goal was to provide formal models of queries on biological networks, with a focus on integrating dynamical information as explicit logical constraints in the modeling process. Using these technologies requires revisiting and reformulating the constraint-satisfiability problems at hand, both to decrease the size of the search space in the grounding part of the process and to improve the exploration of this search space in the solving part. Concretely, producing a logical encoding of the optimization problems forces us to clarify the roles of, and dependencies between, the parameters involved. This paves the way to a refinement approach based on a fine investigation of the space of hypotheses in order to make it smaller and improve our understanding of the system. Our studies confirmed that logical paradigms are a powerful approach to build and query reconstructed biological systems, complementing discriminative ("black-box") approaches based on statistical machine learning. Based on these technologies, we have developed a panel of methods allowing the integration of multi-scale data and knowledge, linking genomics, metabolomics, expression data and protein measurements of several phenotypes.
Notice that our main concern lies in the field of knowledge representation. More precisely, we do not wish to develop new solvers or grounders, a self-contained computational issue which is addressed by specialized teams such as our collaborators in Potsdam. Our goal is rather to investigate whether the constant progress in the field of constraint logic programming, shown by the performance of ASP solvers, is sufficient to address the complexity of the constraint-satisfiability and combinatorial optimization issues explored in systems biology. In this direction, we work in close interaction with Potsdam University to feed their research activities with challenging issues from bioinformatics and, in return, benefit from the prototypes they develop.
By exploring the complete space of models, our approach typically produces numerous candidate models compatible with the observations. We began investigating to what extent domain knowledge can further refine the analysis of the set of models, by identifying classes of similar models or by selecting a subset of models that satisfy an additional constraint (for instance, best fit with a set of experiments, or minimal size). We anticipate that this will be particularly relevant when studying non-model species, for which little is known but valuable information from other species can be transposed or adapted. These efforts consist in developing reasoning methods based on ontologies as a formal representation of symbolic knowledge. We use Semantic Web tools such as SPARQL for querying and integrating large sources of external knowledge, and measures of semantic similarity and particularity for analyzing data.
As explained below, Answer Set Programming technologies enable the identification of key controllers based on the integration of static data. As a natural follow-up, we also develop optimization techniques to learn models of the dynamics of a biological system. As before, our strategy is not to select a single model fitting the experimental data but rather to decipher the complete set of families of models which are compatible with the observed response. Our main research line in this field is to determine the appropriate level of expressivity (in terms of constraints) allowing both to properly capture the nature of the data and knowledge and to allow an exhaustive study of the space of feasible models. To implement this strategy, we rely on several constraint programming frameworks, depending on the model scale and the nature of the kinetic time-point measurements. Logic programming (Answer Set Programming) is used to decipher the combinatorics of synchronous Boolean networks explaining static or dynamic responses of signaling networks to perturbations (such as measured by phosphoproteomics technologies). SAT-based approaches are used to decipher the combinatorics of large-scale asynchronous Boolean networks. In order to gain expressivity, we model these networks as guarded-transition networks, an extension of Petri nets. Finally, classical learning methods are used to build ad-hoc parameterized numerical models that provide the most parsimonious explanations of experimental measurements.
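As a minimal illustration of the synchronous Boolean semantics mentioned above, the following sketch updates every node from the same previous state and enumerates fixed points by exhaustion. The three update rules are invented toys, not drawn from a real signaling model.

```python
from itertools import product

# Hypothetical 3-gene synchronous Boolean network (illustrative rules).
rules = {
    "x": lambda s: s["y"],
    "y": lambda s: s["x"] and not s["z"],
    "z": lambda s: s["x"],
}

def step(state):
    # Synchronous update: every node reads the same previous state.
    return {n: f(state) for n, f in rules.items()}

def fixed_points():
    # Exhaustive enumeration of the 2^n states (fine for toy sizes).
    return [dict(zip(rules, bits))
            for bits in product((False, True), repeat=len(rules))
            if step(dict(zip(rules, bits))) == dict(zip(rules, bits))]

fps = fixed_points()
```

For large-scale or asynchronous networks this brute force is hopeless, which is precisely why the constraint-solving machinery (ASP, SAT) described above is needed.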
Once groups of genome products involved in the response of the species have been identified with integrative or dynamical methods, it remains to characterize the biological actors within genomes. To that end, we learn, model and parse formal patterns within DNA, RNA or protein sequences. More precisely, our research on modeling biomolecular sequences with expressive formal grammars focuses on learning such grammars from examples, helping biologists design their own grammars, and providing practical parsing tools.
Concerning the development of machine learning algorithms for the induction of grammatical models, we have strong expertise in learning finite-state automata. We have proposed an algorithm that successfully learns automata modeling (non-homologous) functional families of proteins, leading to a tool named Protomata-learner. The algorithm is based on a fragment-merging heuristic that exploits the partial and local alignments contained in a family of sequences. As an example, this tool allowed us to properly model the TNF protein family, a difficult task for classical probability-based approaches. It was also applied successfully to model important enzymatic families of proteins in cyanobacteria. Our future goal is to further demonstrate the relevance of formal language modeling by addressing the question of a fully automatic prediction, from sequence alone, of all the enzymatic families, aiming at further improving the sensitivity and specificity of the models. As enzyme-substrate interactions are highly specific relations, central to integrated genome/metabolome studies, and are characterized by faint signatures, we shall rely on models of active sites involved in cellular regulation or catalysis mechanisms. This requires building models gathering both structural and sequence information, in order to describe (potentially nested or crossing) long-range dependencies such as contacts between amino acids that are far apart in the sequence but close in the 3D protein fold. Our current research focuses on the inference of context-free grammars including the topological information coming from the structural characterization of active sites.
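Once such an automaton is learnt, scanning a candidate sequence is a standard nondeterministic-automaton simulation. The automaton below is a hand-made toy signature (not an actual Protomata-learner output), with "*" transitions acting as wildcards for any amino acid.

```python
# Toy automaton for a gapped signature C-x-C-x*-H (invented example).
transitions = {
    (0, "C"): 1, (1, "*"): 2, (2, "C"): 3,
    (3, "*"): 3, (3, "H"): 4,   # the self-loop models a variable spacer
}
ACCEPT = {4}

def accepts(seq):
    # Standard NFA simulation: track the set of reachable states.
    states = {0}
    for aa in seq:
        states = {transitions[(s, c)]
                  for s in states for c in (aa, "*")
                  if (s, c) in transitions}
        if not states:
            return False
    return bool(states & ACCEPT)
```

Real learnt automata additionally carry scores on transitions, so membership becomes a best-path computation rather than a yes/no parse.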
Using context-free grammars instead of regular patterns increases the complexity of parsing. Indeed, efficient parsing tools have been developed to identify patterns within genomes, but most of them are restricted to simple regular patterns. Definite Clause Grammars (DCGs), a particular form of logical context-free grammars, have been used in various works to model DNA sequence features. An extended formalism, String Variable Grammars (SVGs), introduces variables that can be bound to a string during a pattern search. This increases the expressivity of the formalism towards mildly context-sensitive grammars. Thus, those grammars model not only DNA/RNA sequence features but also structural features such as repeats, palindromes, stem-loops or pseudo-knots. A few years ago, we designed a first tool, STAN (suffix-tree analyser), to make it possible to search for a subset of SVG patterns in full chromosome sequences. This tool was used for the recognition of transposable elements in Arabidopsis thaliana. We have built on this experience with a new modeling language, called Logol. Generally, a suitable language for the search of particular components has to meet several needs: expressing existing structures in a compact way, using existing databases of motifs, and supporting the description of interacting components. In other words, the difficulty is to find a good tradeoff between expressivity and complexity to allow the specification of realistic models at genome scale. The Logol language and associated framework have been built in this direction. The specificity of Logol among other SVG-like languages mainly lies in the systematic introduction of constraints on string variables.
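Part of this extra expressivity can be glimpsed with ordinary library regexes: a capturing group plus a backreference already behaves like a string variable and captures the copy language needed for direct repeats, which plain regular patterns cannot express. The sequence and length bounds below are arbitrary toy choices.

```python
import re

# SVG-style "string variable" sketched as a capturing group plus a
# backreference: a 4-nt word, a 1-10 nt spacer, then the SAME word
# again (a direct repeat).
repeat = re.compile(r"([ACGT]{4})[ACGT]{1,10}\1")

seq = "TTGACGGTACATGACGGTT"
m = repeat.search(seq)
word = m.group(1) if m else None
```

Backreferences stop short of what SVG-like languages offer (reverse complements, mismatch tolerance, cross-referenced motif databases), which is where dedicated engines such as Logol come in.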
All the methods presented in the previous sections usually result in pools of candidates which equivalently explain the data and knowledge. These candidates can be dynamical systems, compounds, biological sequences, proteins... In any case, the output of our formal methods generally requires a posteriori investigation and filtering by domain experts. In order to assist them, we rely on two classes of symbolic techniques: Semantic Web technologies and Formal Concept Analysis (FCA). Both aim at the formalization and management of knowledge, that is, at making explicit the relations occurring in structured data. These techniques complement each other: the production of relevant concepts in FCA highly depends on the availability of semantic annotations using a controlled set of terms and, conversely, building and exploiting ontologies is a complex process that can be made much easier with FCA.
Integrating heterogeneous data with Semantic Web technologies. The emergence of ontologies in biomedical informatics and bioinformatics happened in parallel with the development of the Semantic Web in the computer science community. Let us recall that the Semantic Web is an extension of the current Web that provides an infrastructure integrating data and ontologies in order to support unified reasoning. Since the beginning, life sciences have been a major application domain for the Semantic Web. This was motivated by the joint evolution of data acquisition capabilities in the biomedical field and of the methods and infrastructures supporting data analysis (grids, the Internet...), resulting in an explosion of data production in complementary domains. Consequently, Semantic Web technologies have become an integral part of translational medicine and translational bioinformatics. The Linked Open Data project promotes the integration of data sources in machine-processable formats compatible with the Semantic Web, with a strong involvement of life sciences in this initiative.
However, a specificity of the life sciences "data deluge" is that the proportion of generated data is much higher than in the more general "big data" phenomenon, and that these data are highly connected. The bottleneck that was once data scarcity now lies in the lack of adequate methods supporting data integration, processing and analysis. Each of these steps typically hinges on domain knowledge, which is why they resist automation. This knowledge can be seen as the set of rules stating under which conditions data can be used, or combined, to infer new data or new links between data.
In this setting, we are working on the integration of Semantic Web resources with our data analysis methods in order to take existing biological knowledge into account. We have introduced several methods to interpret semantic similarities and particularities , . We now focus our attention on the semi-automated construction of RDF abstractions of heterogeneous datasets which can be handled by non-expert users. This allows both to automatically prepare input datasets for the other methods developed in the team and to analyse the output of the methods in a wide knowledge context.
Using formal concept analysis to explore the results of bioinformatics analyses. Formal concept analysis aims at the development of conceptual structures which can be logically activated for the formation of judgments and conclusions. It is used in various domains managing structured data, such as knowledge processing, information retrieval or classification. In its simplest form, one considers a binary relation between a set of objects and a set of attributes. In this setting, formal concept analysis formalizes the semantic notions of extension and intension. Concepts are related within a lattice structure (Galois connection) by subconcept-superconcept relations, which allows causality relations between attribute subsets to be drawn. In bioinformatics, it has been used to derive phylogenetic relations among groups of organisms, a classification task that requires taking into account many-valued Galois connections. We have proposed, in a similar way, a classification scheme for the problem of assigning proteins to a set of protein families.
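The two derivation operators at the heart of FCA are simple enough to sketch directly. The protein/domain context below is a fabricated toy, and this naive closure-based enumeration is exponential; real FCA tools use dedicated algorithms such as NextClosure.

```python
from itertools import combinations

# Toy formal context: objects (proteins) x attributes (domains).
context = {
    "p1": {"kinase", "SH2"},
    "p2": {"kinase", "SH2", "SH3"},
    "p3": {"kinase"},
}

def intent(objects):
    # Attributes shared by every object of the set.
    return set.intersection(*(context[o] for o in objects))

def extent(attrs):
    # Objects carrying every attribute of the set.
    return {o for o, a in context.items() if attrs <= a}

def concepts():
    # A concept pairs a closed object set with a closed attribute set
    # (equivalently: a maximal object-attribute biclique). Naive
    # enumeration via closures of all non-empty object subsets.
    found = set()
    for r in range(1, len(context) + 1):
        for objs in combinations(sorted(context), r):
            a = intent(objs)
            found.add((frozenset(extent(a)), frozenset(a)))
    return found

cs = concepts()
```

The three concepts found here form the whole lattice of this toy context; reading it top-down recovers the implications between attribute subsets mentioned above (every SH2-carrying protein is also a kinase, for instance).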
One of the most important issues with concept analysis is that current methods remain very sensitive to the presence of uncertainty or incompleteness in the data. On the other hand, this apparent defect can be turned around to serve as a marker of incompleteness or inconsistency. Following this inspiration, we have proposed a methodology to tackle the problem of uncertainty in biological networks whose edges are mostly predicted links with a high rate of false positives. The general idea consists in looking for a tradeoff between the simplicity of the conceptual representation and the need to manage exceptions. As a very prospective challenge, we are exploring the idea of using ontologies to guide this process, and, conversely, of using concept analysis to help ontology refinement.
More generally, common difficult tasks in this context are visualization, search for local structures (graph mining) and network comparison. Network compression is a good solution for an efficient treatment of all these tasks. It has been used with success in power graphs, which are abstract graphs where nodes are clusters of nodes of the initial graph and edges represent bicliques between two sets of nodes. In fact, concepts are maximal bicliques, and we are currently developing the power graph idea in the framework of concept analysis.
Seven platforms have been developed in the team over the last five years: AskOmics, AuReMe, FinGoc, Caspo, Cadbiom, Logol, Protomata. Indeed, one of the team's goals is to facilitate interplay between tools for biological data analysis and integration. Improvements and novelties of these platforms are described in the "software" section. Our platforms aim at guiding the user to progressively reduce the space of models (families of sequences of genes or proteins, families of key actors involved in a system response, dynamical models) which are compatible with both knowledge and experimental observations.
Most of our platforms are developed with the support of the GenOuest resource and data center hosted in the IRISA laboratory, including its computing facilities [more info]. It is worth integrating them into larger dedicated environments to benefit from the expertise of other research groups. The BioShadock repository of the GenOuest platform allows one to share the different Docker containers that we are developing [website]. The Galaxy portal of the GenOuest platform now provides access to most tools for integrative biology and sequence annotation (access on demand).
Goal Integration and interrogation software for linked biological data based on semantic web technologies [url].
Description AskOmics aims at bridging the gap between end-user data and the Linked (Open) Data cloud. It allows heterogeneous bioinformatics data (formatted as tabular files or directly in RDF) to be loaded into a triple store system using a user-friendly web interface. AskOmics also provides an intuitive graph-based user interface supporting the creation of complex queries that currently require hours of manual searches across tens of spreadsheet files. The elements of interest selected in the graph are then automatically converted into a SPARQL query that is executed on the user's data.
Originality Our experience is that end users (i) do not benefit from all the information available in the LOD cloud repositories, for lack of SPARQL expertise (understandably: they are biologists, and most of them have no interest either in learning SPARQL or in learning how to integrate data); and (ii) do not contribute their data back to the LOD cloud: again, they have neither the expertise nor the resources to produce and maintain datasets and the associated metadata as linked data, nor to maintain the underlying server infrastructure. Therefore there is a need to help end users (1) take advantage of the information readily available in the LOD cloud for analyzing their own data and (2) contribute back to the linked data by representing their data and the associated metadata in the proper format, as well as by linking them to other resources. In this context, the main originality is the graphical interface, which allows any SPARQL query to be built transparently and iteratively by a non-expert user.
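The kind of conjunctive query the graphical interface produces can be illustrated with a miniature triple-pattern matcher. The triples, predicate names and query below are invented; AskOmics itself emits real SPARQL against a triple store, and this sketch only mimics how variable bindings propagate across patterns.

```python
# Miniature SPARQL-style triple-pattern matching (toy data).
triples = [
    ("gene1", "regulates", "gene2"),
    ("gene2", "locatedOn", "chr1"),
    ("gene3", "locatedOn", "chr1"),
]

def query(patterns):
    # Each pattern is (s, p, o); strings starting with "?" are variables.
    bindings = [{}]
    for pat in patterns:
        new = []
        for env in bindings:
            for triple in triples:
                e, ok = dict(env), True
                for term, value in zip(pat, triple):
                    if term.startswith("?"):
                        if e.setdefault(term, value) != value:
                            ok = False
                    elif term != value:
                        ok = False
                if ok:
                    new.append(e)
        bindings = new
    return bindings

# "Which genes regulated by gene1 are located on chr1?"
res = query([("gene1", "regulates", "?g"), ("?g", "locatedOn", "chr1")])
```

Joining on the shared variable `?g` is exactly what the equivalent SPARQL basic graph pattern does, which is why a graph drawn by the user can be translated into a query mechanically.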
Application This software was developed in the context of the MirnAdapt (pea aphid) project in 2016. The tool has been presented to the agriculture communities in conferences, and to the Galaxy community. Up to now, more than 10 biological partner teams are currently testing and using the prototype software (rapeseed, pea aphids, copper microbiology, marine biology), and Sanofi has shown interest in co-developing the tool. Although its current user base belongs to the bioinformatics community, the scope of AskOmics is domain-independent and has the potential to reach a wider audience related to the Semantic Web community.
Goal Traceable reconstruction of metabolic networks [url].
Description The AuReMe toolbox allows for the Automatic Reconstruction of Metabolic networks based on the combination of multiple heterogeneous data and knowledge sources. It is available as a Docker image. AuReMe comprises five modules: 1) The model-management PADmet module allows all metabolic data to be manipulated and traced via a local database [package]. 2) The meneco Python package fills the gaps of a metabolic network using a topological approach, implemented as a logic program solving a combinatorial problem [Python package]. 3) The shogen Python package allows a genome and a metabolic network to be aligned in order to identify genome units which contain a high density of genes coding for enzymes; it also relies on a logic programming approach [Python package]. 4) The manual-curation assistance PADmet module allows the reported metabolic networks and their metadata to be curated. 5) The wiki-export PADmet module enables the export of the metabolic network and its functional genomic units as a local wiki platform allowing a user-friendly investigation [package].
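The topological criterion behind this kind of gap filling can be sketched as a forward closure over reactions: a target is producible when some chain of reactions, fed by the seed nutrients, reaches it. The reactions and metabolite names below are a fabricated toy; meneco itself encodes this in ASP and searches a reference database for minimal completions.

```python
# Toy producibility check in the spirit of topological gap filling.
reactions = {
    "r1": ({"A"}, {"B"}),        # reactants -> products
    "r2": ({"B", "E"}, {"C"}),
}
seeds, targets = {"A"}, {"C"}

def producible(extra=frozenset()):
    # Forward closure: fire every reaction whose reactants are available.
    avail = set(seeds) | set(extra)
    changed = True
    while changed:
        changed = False
        for ins, outs in reactions.values():
            if ins <= avail and not outs <= avail:
                avail |= outs
                changed = True
    return avail

# C is blocked: cofactor E is never producible -> a gap to fill.
gap = targets - producible()
fixed = targets <= producible(extra={"E"})
```

Finding the smallest set of reactions (or, here, metabolites) that unblocks all targets is the combinatorial optimization problem the ASP encoding solves.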
Originality The main added-values are the inclusion of graph-based gap-filling tools that are particularly relevant for the study of non-classical organisms, the possibility to trace the reconstruction and curation procedures, and the representation and exploration of reconstructed metabolic networks with wikis.
Application The tools included in AuReMe have been used for reconstructing metabolic networks of micro- and macroalgae, extremophile bacteria, and communities of organisms in the context of the Idealg, Ciric-omics and IPL Algae In Silico projects.
Goal Filtering interaction networks with graph-based optimization criteria.
Description The goal is to offer a set of tools for the reconstruction of networks from genome, literature and large-scale observation data (expression data, metabolomics...) in order to elucidate the main regulators of an observed phenotype. Most of the optimization issues are addressed with Answer Set Programming. 1) The lombarde package filters transcription-factor/binding-site regulatory networks with the mutual information reported by the response to environmental perturbations. The high rate of false-positive interactions is reduced by filtering according to graph-based criteria. Knowledge about regulatory modules such as operons, or the output of the shogen package, can be taken into account [web server]. 2) The KeyRegulatorFinder package allows searching for key regulators of lists of molecules (such as metabolites, enzymes or genes) by taking advantage of knowledge databases on cell metabolism and signaling. The complete information is transcribed into a large-scale interaction graph, which is filtered to report the most significant upstream regulators of the considered list of molecules [package]. 3) The powerGrasp Python package provides an implementation of graph compression methods oriented toward visualization and based on power graph analysis [package]. 4) The iggy package enables the repair of an interaction graph with respect to expression data. It proposes a range of operations for altering experimental data and/or a biological network in order to re-establish their mutual consistency, an indispensable prerequisite for automated prediction. For accomplishing repair and prediction, we take advantage of the distinguished modeling and reasoning capacities of Answer Set Programming [Python package].
Originality The main added-value of these tools is to make explicit the criteria used to highlight the role of the main regulators: the underlying methods encode explicit graph-based criteria instead of relying on statistical approaches. This makes it possible to explain local relationships and patterns within interaction graphs by explicit biological relationships.
Application The tools have been used to identify the main gene regulators of the response of pigs to several diets. They were also used to decipher regulators of reproduction in the pea aphid, an insect that is a plant pest.
Goal Studying synchronous Boolean networks [url]
Description Cell ASP Optimizer (Caspo) constitutes a pipeline for automated reasoning on logical signaling networks. The main underlying issue is that, given inherent experimental noise, many different logical networks can be compatible with a set of experimental observations. It is available as a Docker container. Caspo comprises five modules: 1) The Caspo-learn module performs automated inference of logical networks from experimental data, identifying all admissible large-scale logic models, saving considerable effort and avoiding any a priori bias. 2) The Caspo-classify, predict and visualize modules allow a family of Boolean networks to be classified with respect to their input-output predictions. 3) The Caspo-design module designs experimental perturbations allowing an optimal discrimination of rival models in a family of Boolean networks. 4) The Caspo-control module identifies key players of a family of networks: it computes robust intervention strategies that force a set of target species or compounds into a desired steady state. 5) The Caspo-timeseries module takes into account time-series observation datasets in the learning procedure [Python package and Docker container].
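In miniature, the learning step amounts to keeping every candidate logic rule that reproduces all perturbation experiments. The gate candidates and readouts below are invented; caspo does this at scale with ASP, over whole networks and with a tolerance for noisy measurements.

```python
# Toy caspo-learn flavor: which logic gate for a readout node is
# compatible with all perturbation experiments? (fabricated screen)
gates = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "A only": lambda a, b: a,
}
# (stimulus_a, stimulus_b) -> observed readout after perturbation
experiments = {(True, False): False, (True, True): True, (False, True): False}

compatible = [name for name, f in gates.items()
              if all(f(a, b) == out for (a, b), out in experiments.items())]
```

When several gates survive, they form the family of admissible models whose classification and discriminating-experiment design are handled by the other Caspo modules.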
Originality The Caspo modules provide friendly and efficient solutions to problems that were previously addressed in theoretical papers with MILP programs. The main advantage is that it enables a complete study of logical networks without requiring any linear constraint program.
Application The Caspo tool was initiated in the framework of the BioTempo project. Caspo-learn has been included as a module for learning logical networks from early steady-state data in CellNOpt, a generic platform which implements several methods for learning and studying signaling networks at different modeling levels (from logical models to numerical models).
Goal Building and analyzing the asynchronous dynamics of enriched logical networks [url]
Description Based on guarded-transition semantics, the Cadbiom software provides a formal framework to help the modeling of biological systems such as cell signaling networks. It allows synchronization events to be investigated in biological networks. It is available as a Docker image. Cadbiom comprises three modules: 1) The Cadbiom graphical interface is useful to build and study moderate-size models. It provides exploration, simulation and checking. For large-scale models, Cadbiom also allows focusing on specific nodes of interest. 2) The Cadbiom API allows a model to be loaded, static analyses to be performed, and temporal properties to be checked on a finite horizon, in the future or in the past. 3) For the exploration of large-scale knowledge repositories, the PID repository (about 10,000 curated interactions) has been translated into the Cadbiom formalism.
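A guarded transition can be sketched as a place-consuming rule that fires only when a side condition on other places holds. The places, transitions and guards below are invented, and this synchronous toy step only conveys a crude flavor of the actual Cadbiom semantics.

```python
# Toy guarded-transition step: (source, target, guard), where the guard
# is a set of places that must be active for the transition to fire.
transitions = [
    ("ligand", "receptor_active", {"receptor"}),
    ("receptor_active", "gene_on", set()),
]

def step(active):
    # Fire every enabled transition against the same previous state:
    # the source place is consumed, the target place is produced.
    nxt = set(active)
    for src, tgt, guard in transitions:
        if src in active and guard <= active:
            nxt.discard(src)
            nxt.add(tgt)
    return nxt

s0 = {"ligand", "receptor"}
s1 = step(s0)   # ligand consumed, receptor_active produced
s2 = step(s1)   # signal propagates to gene_on
```

The guard is what distinguishes this from a plain Petri-net firing rule: `receptor` enables the first transition without being consumed by it.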
Originality Model-checking approaches applied to Boolean or multivalued networks allow the trajectories of the system to be studied exhaustively, but they can only be applied to small networks. In contrast, Cadbiom is able to handle large-scale knowledge databases.
Application The Cadbiom tool was applied to study the regulators of the TGF-
Goal Complex pattern modelling and matching [url]
Description The Logol toolbox is a Swiss-army knife for pattern matching on DNA/RNA/protein sequences, using a high-level grammatical formalism to permit a large expressivity for patterns. A Logol pattern can consist of a complex combination of motifs (such as degenerate strings) and structures (such as imperfect stem-loops or repeats). Logol's key features are the possibility to divide a pattern description into several sub-patterns, to model long-range dependencies, to use ambiguous models, and to include negative conditions in a pattern definition. The LogolMatch parser takes as input a biological sequence and a grammar file. It returns an XML file containing all the occurrences of the pattern in the sequence, with their parsing details. The input sequences can be genomes from biological banks.
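One structural pattern such a grammar expresses in a single rule — a stem of reverse-complementary strands around a bounded loop — can be mimicked by a small scanner. The stem/loop parameters and sequence below are toy choices; the Logol engine handles far richer patterns, mismatches and ambiguity included.

```python
# Toy stem-loop scanner: a word followed, after a 3-8 nt loop, by its
# reverse complement.
COMP = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    return s.translate(COMP)[::-1]

def find_stem_loops(seq, stem=4, loop=(3, 8)):
    hits = []
    for i in range(len(seq) - stem):
        left = seq[i:i + stem]
        for k in range(loop[0], loop[1] + 1):
            j = i + stem + k            # start of the right strand
            if seq[j:j + stem] == revcomp(left):
                hits.append((i, k))     # (stem start, loop length)
    return hits

hits = find_stem_loops("AAGACTTTTTAGTCAA")
```

Because the right strand must equal a transformed copy of the left one, this pattern is beyond regular expressions, which is exactly the class of dependency string variables are introduced for.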
Originality Many pattern matching tools exist to efficiently model specific types of patterns: vmatch, patmatch, cutadapt, scoring matrices or profile HMMs. The main advantage of Logol is its very large expressivity: it encompasses most of the features of these specialized tools and enables interplay between several classes of patterns (motifs and structures).
Application The Logol tool was applied to the detection of mutated primers in a metabarcoding study, or to stem-loop identification (e.g. in CRISPR
Goal Expressive pattern discovery on protein sequences [url]
Description Protomata is a machine learning suite for the inference of automata characterizing (functional) families of proteins from available sequences. Based on partial and local alignments, Protomata learns precise characterizations of protein families, allowing new family members to be predicted with high specificity. The three main modules integrated in the Protomata-learner workflow are also available as stand-alone programs: 1) paloma builds partial local multiple alignments, 2) protobuild infers automata from these alignments, and 3) protomatch and protoalign scan, parse and align new sequences with the learnt automata. The suite is completed by tools to handle or visualize data, and can be used online by biologists via a web interface on the GenOuest platform. It is actively maintained (version v2.1 was released in April 2017) and we are planning a new major version with the enhanced scoring schemes that we have proposed.
Originality The main specificity of Protomata is that its power of characterization goes beyond the scope of classical sequence patterns such as PSSMs (e.g. the MEME suite), profile HMMs (e.g. the HMMER package), or Prosite patterns, allowing new family members to be predicted with high specificity.
Application The Protomata tool is used both to automatically update the Cyanolase database and, when combined with Formal Concept Analysis, for automated enzyme classification, such as the HAD superfamily of proteins in the framework of the Idealg project.
Our methods are applied in several fields of molecular biology.
Our main application field is marine biology, as it is a transversal field with respect to issues in integrative biology, dynamical systems and sequence analysis. Our main collaborators work at the Station Biologique de Roscoff. We are strongly involved in the study of brown algae: the meneco, memap and memerge tools were designed to carry out a complete reconstruction of metabolic networks for non-benchmark species. On the same application model, the pattern discovery tool protomata-learner, combined with supervised bi-clustering based on formal concept analysis, allows for the classification of sub-families of specific proteins. The same tool also allowed us to gain a better understanding of cyanobacteria proteins. At the larger level of 4D structures, classification techniques have also allowed us to introduce new methods for the characterization of viruses in marine metagenomic samples. Finally, in dynamical systems, we use asymptotic analysis (tool pogg) to decipher the initiation of sea urchin translation. We are currently involved in two new applications in this domain: the team participates in an Inria Project Lab program with the Biocore and Ange Inria teams, focused on the understanding of green micro-algae; and we are involved in deciphering phytoplankton variability at the systems biology level in collaboration with the Station Biologique de Roscoff (ANR Samosa).
In microbiology, our main issue is the understanding of bacteria living in extreme environments, mainly in collaboration with the bioinformatics group at Universidad de Chile (funded by CMM, CRG and Inria Chile). In order to elucidate the main characteristics of these bacteria, we develop efficient methods to identify the main groups of regulators of their specific response to their living environment. To that purpose, we use constraint-based modeling and combinatorial optimization. The integrative biology tools meneco, bioquali, ingranalysis, shogen and lombarde were designed in this context. In 2016, two applications focused on the study of consortia of extremophile bacteria were performed with these tools. In parallel, in collaboration with Ifremer (Brest), we have conducted similar work to decipher protein-protein interactions within archaebacteria. Our sequence analysis tool (logol) allowed us to build and maintain a very expressive CRISPR database.
Similarly, in environmental sciences, our goal is to propose methods to identify regulators of very complex phenotypes related to environmental issues. In collaboration with researchers from the Inra/Pegase laboratory, we develop methods to distinguish the responses of breeding animals to different diets or treatments and to characterize upstream transcriptional regulators, applied to pigs. Semantic-based analysis was useful for interpreting differences in gene expression in pork meat.
In addition, constraint-based programming also allows us to decipher regulators of reproduction for the pea aphid, an insect pest of plants. This was performed in collaboration with Inra/Igepp. This paved the way for the recent research track initiated in the team on the integration of heterogeneous data with RDF technologies (see the AskOmics software) and on graph compression (see the powergrasp software).
In bio-medical applications, we focus our attention on the confrontation of large-scale measurements with large-scale knowledge repositories about regulation pathways, such as Transpath, PID or Pathway Commons. In collaboration with Institut Curie, we have studied the Ewing sarcoma regulation network to test the capability of our tool bioquali to accurately correct and predict large-scale network behavior. Our ongoing studies in this field focus on the exhaustive learning of discrete dynamical networks matching experimental data, as a case study for modeling experimental design with constraint-based approaches. To that purpose, we collaborate with the group of J. Saez-Rodriguez at EBI and the group of N. Théret at Inserm/Irset (Rennes). The dynamical system tools caspo and cadbiom were designed within these collaborations. Ongoing studies focus on the understanding of the metabolism of xenobiotics (Mecagenotox program) and the filtering of sets of regulatory compounds within large-scale signaling networks (TGFSysBio project).
The team received a best paper award at the ICFCA conference.
Keywords: RDF - SPARQL - Querying - Graph - LOD - Linked open data
Functional Description: AskOmics allows heterogeneous bioinformatics data (formatted as tabular files) to be loaded into a triple store system using a user-friendly web interface. AskOmics also provides an intuitive graph-based user interface supporting the creation of complex queries that currently require hours of manual searches across tens of spreadsheet files. The elements of interest selected in the graph are then automatically converted into a SPARQL query that is executed on the users' data.
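The graph-to-SPARQL conversion can be illustrated by a small sketch (the prefix, class names and predicate below are invented for illustration; this is not the AskOmics internal representation):

```python
# Hypothetical sketch of the graph-to-SPARQL idea: a path of selected
# entities and relations is turned into a SPARQL query string.
def path_to_sparql(path, prefix="askomics", base="http://example.org/askomics#"):
    """path = [(var, class), relation, (var, class), ...] alternating."""
    lines = [f"PREFIX {prefix}: <{base}>",
             "SELECT " + " ".join(f"?{v}" for v, _ in path[::2]),
             "WHERE {"]
    # one type constraint per selected node
    for var, cls in path[::2]:
        lines.append(f"  ?{var} a {prefix}:{cls} .")
    # one triple pattern per selected edge
    for i in range(1, len(path), 2):
        s, _ = path[i - 1]
        o, _ = path[i + 1]
        lines.append(f"  ?{s} {prefix}:{path[i]} ?{o} .")
    lines.append("}")
    return "\n".join(lines)

# a user selecting "Gene located in QTL" in the graph interface
query = path_to_sparql([("gene", "Gene"), "is_located_in", ("region", "QTL")])
print(query)
```

The generated string is then submitted to the triple store, so the user never writes SPARQL by hand.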
News Of The Year: Several functionalities have been developed: 1) the capacity to integrate genomics data (import of GFF and BED files and generation of RDF compliant with the FALDO ontology); 2) the integration of data and knowledge in the OWL format to exploit biological information from external repositories, particularly from EBI and NCBI (notably, this functionality allows AskOmics to support the Gene Ontology, the Taxonomy ontology, as well as BioPAX biological networks); 3) improved user interface expressivity for generating SPARQL queries; 4) support for multiple concurrent user sessions, with a distinction between public and user-specific datasets; 5) deployment of AskOmics on the GenOuest cloud infrastructure to facilitate its release and diffusion; 6) interoperability between AskOmics and the Galaxy workflow environment.
Authors: Charles Bettembourg, Xavier Garnier, Anthony Bretaudeau, Fabrice Legeai, Olivier Dameron, Olivier Filangi and Yvanne Chaussin
Partners: Université de Rennes 1 - CNRS - INRA
Contact: Fabrice Legeai
Keywords: Metabolic networks - Bioinformatics - Workflow - Omic data - Toolbox - Data management - LOD - Linked open data
Functional Description: The main concept underlying padmet-utils is to provide solutions that ensure the consistency, internal standardization and reconciliation of the information used within any workflow that combines several tools for metabolic network reconstruction or analysis.
News Of The Year: In 2017, padmet-utils was enriched with an RDF export to allow interoperability between the AuReMe workspace for the reconstruction of metabolic networks and the AskOmics tool for querying heterogeneous data. Padmet-utils was also extended to handle metabolic networks in the SBML3 format.
Participants: Alejandro Maass, Meziane Aite and Anne Siegel
Partner: University of Chile
Contact: Anne Siegel
Computer Aided Design of Biological Models
Keywords: Health - Biology - Biotechnology - Bioinformatics - Systems Biology
Functional Description: Based on a guarded-transition semantics, this software provides a formal framework to support the modeling of biological systems such as cell signaling networks. It allows synchronization events in biological networks to be investigated.
Software development restarted in November 2016. The source code is available at the following address: https://
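The guarded-transition idea can be sketched as follows (assumed, much-simplified semantics with invented names; not the Cadbiom implementation):

```python
# Minimal sketch of a guarded-transition step: a transition fires when its
# source place is active and its guard holds over the current set of
# active places. Here a synchronous step fires all enabled transitions.
def step(active, transitions):
    """One synchronous step over a set of active places."""
    fired = [(src, dst) for src, dst, guard in transitions
             if src in active and guard(active)]
    # sources of fired transitions are consumed, destinations activated
    return (active - {s for s, _ in fired}) | {d for _, d in fired}

# Toy signaling fragment: receptor R activates kinase K only when ligand L
# is present; K then activates target gene G unconditionally.
transitions = [
    ("R", "K", lambda a: "L" in a),
    ("K", "G", lambda a: True),
]
state = {"L", "R"}
state = step(state, transitions)   # R -> K (guard: L present)
state = step(state, transitions)   # K -> G
print(sorted(state))
```

Guards make it possible to express that several conditions must hold simultaneously, which is the basis for investigating synchronization events.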
Participants: Geoffroy Andrieux, Michel Le Borgne, Nathalie Theret, Nolwenn Le Meur and Pierre Vignet
Contact: Anne Siegel
Crossroads in Metabolic Networks from Stoichiometric and Topological Studies
Keywords: Bioinformatics - ASP - Answer Set Programming - Constraint-based programming
Functional Description: This Python package for systems biology allows the identification of essential metabolites with respect to the production of targeted elements in a metabolic network, by comparing flux- and graph-based analyses. Conquests takes as input an SBML file describing a metabolic network and the name of the biomass reaction. The outputs are three sets of essential metabolites, computed according to three complementary criteria: graph-based accessibility of the targeted metabolites, the presence of flux in the biomass reaction, and the maximisation of flux in the biomass reaction.
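The graph-based accessibility criterion can be sketched as follows (assumed semantics on a toy network; real inputs are SBML models, not hand-written reaction lists):

```python
# Sketch of the graph-based criterion: a metabolite is producible if some
# reaction has all its substrates producible (forward closure from seeds);
# a metabolite is "essential" for a target if removing it from the network
# breaks the target's producibility.
def producible(reactions, seeds):
    """reactions: list of (substrate_set, product_set). Forward closure."""
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for subs, prods in reactions:
            if subs <= scope and not prods <= scope:
                scope |= prods
                changed = True
    return scope

def essential_metabolites(reactions, seeds, target):
    """Metabolites whose removal cuts the target off from the seeds."""
    essentials = set()
    candidates = producible(reactions, seeds) - set(seeds) - {target}
    for m in candidates:
        pruned = [(s, p) for s, p in reactions if m not in s and m not in p]
        if target not in producible(pruned, seeds):
            essentials.add(m)
    return essentials

# two redundant routes A->B->C and A->D->C converge on C before the target T
rxns = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"}), ({"D"}, {"C"}),
        ({"C"}, {"T"})]
print(essential_metabolites(rxns, seeds={"A"}, target="T"))  # only C is essential
```

B and D are not essential because each can be bypassed by the other route; C is a crossroads that every route must pass through.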
News Of The Year: Conquests was released in 2017.
Contact: Julie Laniau
Interoperable infrastructure and implementation of a health data model for remote monitoring of chronic diseases with comorbidities In the context of telemedicine, we worked on a numerical application for monitoring patients with chronic diseases. We have developed a system based on a formal ontology that integrates alert information and patient data extracted from the electronic health record in order to better rank the importance of alerts. A pilot study was conducted on atrial fibrillation alerts. The results suggest that this approach has the potential to significantly reduce the alert burden in telecardiology. In 2017, we proposed an architecture supporting data exchange in the context of multiple chronic diseases [O. Dameron, Y. Rivault].
AskOmics, a web tool to integrate and query biological data using semantic web technologies
The software AskOmics has been adapted to two scientific topics important in agronomical and environmental sciences: plant genomic data and insect pest genomic data. With AskOmics, plant genomicists (from academic and private labs of the Rapsodyn project - Investment for the Future) working on rapeseed (Brassica napus) are able to tackle the question of which gene copy is active or repressed in key developmental processes related to seed quality and oil production, in the context of plant breeding. Additionally, entomologists use the tool to extract valuable knowledge on the way insect pests such as aphids are able to rapidly disseminate on crops, in the context of pesticide-free methods for plant protection. AskOmics has been presented to the international community of insect genomics (i5k: http://
A transcriptome multi-tissue analysis identifies biological pathways and genes associated with variations in feed efficiency of growing pigs Our work on the identification of upstream regulators within large-scale knowledge databases (prototype KeyRegulatorFinder) was valuable for identifying the main gene regulators of the response of pigs to several diets [F. Moreews, A. Siegel].
FCA in a Logical Programming Setting for Visualization-oriented Graph Compression We have explored the underlying idea of lossless network compression to address the problem of uncertainty in biological networks built from predictions, to help visualize the networks, and to classify their nodes in accordance with available annotations. Network compression has been used with success in Dresden (M. Schroeder) through a heuristic approach called Power Graph analysis, which builds abstract graphs where nodes are clusters of nodes of the initial graph and edges represent bicliques between two sets of nodes. First encouraging results have been presented (best paper award), showing that it is possible to mimic the Power Graph behaviour while opening the possibility of achieving better compression levels than alternative compression schemes. [L. Bourneuf, J. Nicolas]
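The compression principle can be illustrated with a toy sketch (brute-force and for illustration only; both Power Graph analysis and our FCA/ASP formulation work quite differently at scale):

```python
# Toy sketch of the Power Graph idea: detect one biclique (two node sets
# with all cross edges present) and replace its |S|*|T| edges by a single
# "power edge" between two power nodes.
from itertools import combinations

def find_biclique(edges, min_size=2):
    """Brute-force search for the largest biclique S x T in an edge list."""
    nodes = sorted({n for e in edges for n in e})
    eset = {frozenset(e) for e in edges}
    best = None
    for k in range(min_size, len(nodes) + 1):
        for S in combinations(nodes, k):
            # T = nodes (outside S) adjacent to every member of S
            T = [n for n in nodes if n not in S and
                 all(frozenset((s, n)) in eset for s in S)]
            if len(T) >= min_size:
                cand = (len(S) * len(T), set(S), set(T))
                if best is None or cand[0] > best[0]:
                    best = cand
    return best

# a small bipartite-like graph plus one extra edge (a, z)
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "x"),
         ("c", "y"), ("a", "z")]
saved, S, T = find_biclique(edges)
print(S, T, f"{saved} edges -> 1 power edge")
```

Here six of the seven edges collapse into one power edge between the power nodes {x, y} and {a, b, c}, which is exactly the kind of abstraction that eases visualization.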
Metabolic network completion and analysis We released the application paper of Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. The tool reformulates gap-filling as a qualitative combinatorial optimization problem, omitting the constraints raised by stoichiometry, and solves this problem using Answer Set Programming. Run on an artificial test set of 10,800 degraded Escherichia coli networks, Meneco outperformed the stoichiometry-based tool GapFill in terms of precision. In addition, Meneco reports 10 times fewer putative reactions than the MILP-based tool FastGapFill for an equivalent precision. This is a strong advantage for manual curation post-processing, since curating 50 to 80 reactions is still feasible whereas manually curating 800 reactions is out of reach. Meneco was applied to the reconstruction and understanding of a strain pathogenic to salmon. [C. Frioux, J. Got, A. Siegel]
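The topological gap-filling problem can be sketched as follows (a naive greedy illustration on a toy network; Meneco itself computes minimal completions with Answer Set Programming, not greedily):

```python
# Hedged sketch of topological gap-filling: add repair-database reactions
# until every target becomes reachable from the seeds under the graph
# producibility criterion (forward closure, no stoichiometry).
def scope(reactions, seeds):
    reach, changed = set(seeds), True
    while changed:
        changed = False
        for subs, prods in reactions:
            if subs <= reach and not prods <= reach:
                reach |= prods
                changed = True
    return reach

def greedy_gapfill(draft, repair_db, seeds, targets):
    chosen = []
    while not targets <= scope(draft + chosen, seeds):
        # pick the repair reaction whose addition enlarges the scope the most
        gains = [(len(scope(draft + chosen + [r], seeds)), r)
                 for r in repair_db if r not in chosen]
        best_gain, best = max(gains, key=lambda g: g[0])
        if best_gain == len(scope(draft + chosen, seeds)):
            raise ValueError("targets unreachable with this repair database")
        chosen.append(best)
    return chosen

draft = [({"A"}, {"B"})]                                   # incomplete draft
repair = [({"B"}, {"C"}), ({"C"}, {"T"}), ({"X"}, {"Y"})]  # candidate reactions
print(greedy_gapfill(draft, repair, seeds={"A"}, targets={"T"}))
```

The greedy choice can miss globally minimal completions; encoding the problem declaratively, as Meneco does, guarantees minimality and enumerates alternatives.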
Toward the study of metabolic functions in communities of organisms In a first study, we provided an example of how to use topological metabolic modeling to assess the complementarity between two members of an algal ecosystem. Since then, we have generalized the selection of subcommunities of interest and proposed likely interactions that could occur between seaweeds and their associated bacteria. A focus has also been placed on plant microbiota and the reasons underlying the organization of the community. Altogether, these ongoing works enable a better understanding of holobiont organization and functioning. [M. Aite, M. Chevallier, C. Frioux, J. Got, A. Siegel, C. Trottier]
Hybrid Metabolic Network Completion In order to improve the precision of gap-filling approaches, we introduced a hybrid approach that formally reconciles existing stoichiometric and topological approaches to network completion in a unified formalism. A hybrid ASP encoding based on a MILP constraint propagator was developed. It relies upon the theory reasoning capacities of the ASP system Clingo to solve the resulting logic program with linear constraints over the reals. In short, this technology made it possible to combine the best of the combinatorial problem solver Clingo with the MILP solver CPLEX. Run on the artificial test set of 10,800 degraded Escherichia coli networks introduced in , our approach yielded results greatly superior to those obtainable from purely qualitative or MILP approaches. [C. Frioux, A. Siegel]
Combining graph and flux-based structures to decipher phenotypic essential metabolites within metabolic networks Whenever flux- or graph-based criteria are used to study metabolic networks, these analyses are generally centered on the outcome of the network and consider all metabolic compounds to be equivalent in this respect. We generalized the concept of essentiality to metabolites and introduced the concept of the phenotypic essential metabolite (PEM), which influences the growth phenotype according to sustainability, producibility or optimal-efficiency criteria. The exhaustive study of phenotypic essential metabolites in six genome-scale metabolic models suggests that combining and comparing graph-, stoichiometry- and optimal-flux-based criteria allows some features of metabolic network functionality to be deciphered by focusing on a small number of compounds. [C. Frioux, J. Laniau, A. Siegel]
A modeling approach to evaluate the balance between bioactivation and detoxification of MeIQx in human hepatocytes Heterocyclic aromatic amines (HAA), including MeIQx, are environmental and food contaminants that are potentially carcinogenic for humans. Using a computational approach, we developed a numerical model of MeIQx metabolism that predicts the biotransformation of MeIQx through detoxification or bioactivation pathways according to its concentration. Our results demonstrate that CYP1A2 is a key enzyme in the system that regulates the balance between bioactivation and detoxification. This highlights the importance of the complex regulation of enzyme competition, which should be taken into account in any multi-organ model. [V. Delannée, A. Siegel, N. Théret]
caspo: a toolbox for automated reasoning on the response of logical signaling networks families The paper accompanying the complete family of modules introduced in the caspo software was published in 2017 (see the software section for details). [A. Siegel]
Identifying Functional Families of Trajectories in Biological Pathways by Soft Clustering: Application to TGF-beta Signaling
A logic for checking the probabilistic steady-state properties of reaction networks We have constructed a probabilistic analog of flux balance analysis of reaction networks to enable formal verification of logical constraints about the stationary regime of a system, using information from experimental variances and covariances. This is mainly based on a stationary analysis of the probabilistic dynamics relying on a Bernoulli approximation of the reaction network. The analysis requires solving nonlinear optimization problems. [J. Bourdon, A. Siegel]
Better scoring schemes for the recognition of functional proteins by Protomata The machine learning algorithm included in Protomata-learner learns weighted automata representing both functional families from the sequences of amino acids and the possible disjunctions between members. We investigated alternative sequence weighting strategies and null models. We introduced a normalization of the score, and a method to assess the significance of scores, to simplify prediction. Preliminary results show a good improvement of the predictive power of the computed models. [F. Coste]
Detection of mutated primers and impact on targeted metagenomics results In targeted metagenomics, an initial task is the detection, in each sequence, of the primers used for amplifying the targeted region. The selected sequences are then trimmed and clustered in order to inventory the species present in the sample. Common practice consists in retaining only the sequences with perfect primers (i.e. not mutated by sequencing errors). In the context of a study characterizing the biodiversity of tropical soils in unicellular eukaryotes, we implemented the search for mutated primers using the grammatical pattern matching tool Logol, and showed that retrieving sequences with mutated primers has a significant impact on targeted metagenomics results, as it makes it possible to detect more species (7% additional OTUs in our study). [C. Belleannée]
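The recovery step can be sketched as follows (the primer sequence and mismatch threshold are invented for illustration; the actual study expressed the tolerance as a Logol grammar):

```python
# Sketch of mutated-primer recovery: accept a primer occurrence at the
# start of a read if it is within a given Hamming distance of the expected
# primer, instead of requiring an exact match.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def match_primer(read, primer, max_mut=2):
    """Return the read trimmed of its primer, or None if no match."""
    head = read[:len(primer)]
    if len(head) == len(primer) and hamming(head, primer) <= max_mut:
        return read[len(primer):]
    return None

primer = "GTGCCAGC"
print(match_primer("GTGCCAGCAAACGGT", primer))   # exact primer, trimmed
print(match_primer("GTGTCAGCAAACGGT", primer))   # one substitution, kept
print(match_primer("AACTGGTTCAACGGT", primer))   # too divergent, discarded
```

Reads whose primer carries a sequencing error (second case) are the ones an exact-match policy would silently discard, losing the species they represent.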
First landscape of binding to chromosomes for a domesticated mariner transposase in the human genome In order to study the diversity of genomic targets of the SETMAR protein in two colorectal cell lines, a first task was to exhaustively detect instances of the 80-bp Made1 transposon element in the human genome. For that, we used our grammar-based Logol approach to look for imperfect Made1 instances. In Logol, a pattern can be divided into several sub-patterns; the Made1 model took advantage of this feature to strengthen the most conserved regions. Combining this search with a Blast alignment search made it possible to significantly increase the Made1 annotation of the human genome. [C. Belleannée]
Our software AskOmics was deemed relevant by the Sanofi bio-medical company for facilitating the integration and querying of the data produced by its scientists. A former Dyliss Ph.D. student who designed the first prototypes of AskOmics was recruited by Sanofi. Since then, Sanofi has been part of the AskOmics developer team, and a joint Dyliss-Sanofi CIFRE Ph.D. thesis on the integration of complementary reasoning features into SPARQL queries started in Oct. 2017.
EcoSyst is a Biogenouest inter-regional federating project (Brittany & Pays de la Loire) aiming at the emergence of systems ecology in the western France regions. Drawing on the strengths and skills involved, EcoSyst targets the incubation of new ideas and new projects at disciplinary interfaces. Through this community project, we want to develop skills in ecology, environment, modeling, bioinformatics and systems biology, and their application to organisms and ecosystems of interest in agronomy, marine science and health. EcoSyst also includes the identification of the major issues and concerns, the fundamental and essential methods, and the real needs of the community (training, tools, ...), in order to consider the construction of a community platform (or a service offering within an existing platform) on complex systems modeling that meets the expectations of the community as fully as possible.
Methodologies are developed in close collaboration with the LS2N (fusion of LINA and IRCCyN), located at the University of Nantes and École Centrale de Nantes. This collaboration is formalized through the Biotempo and Idealg ANR projects and the co-development of common software toolboxes with the support of the Renabi-GO platform. C. Trottier is a co-supervised bioanalysis and software development engineer within the Idealg project. M. Chevallier is a co-supervised development and animation engineer within the regional initiative EcoSyst. In addition, the Ph.D. student J. Laniau is co-supervised with a member of the LS2N laboratory. Finally, M. Folschette is a post-doc working on a project aiming at analyzing the evolution of TGF-beta-related pathways after the epithelial-mesenchymal transition in liver cancer, a recognized biological process leading to metastasis. This project is based on a topic shared with the LS2N, the use of graph coloring and reconstruction to witness expression changes, and is funded by the Université Bretagne Loire.
A strong application domain of the Dyliss project is marine biology. This application domain is co-developed with the Station Biologique de Roscoff and its three UMRs, and involves several contracts. Our approach based on parsimonious modelling allowed an in silico characterization of the processes required in sea urchin translation. We are also strongly involved in the IDEALG consortium, a long-term project (10 years, ANR Investissement d'Avenir) aiming at the development of macro-algae biotechnology. Among its research activities, we are particularly interested in the analysis and reconstruction of metabolism and the characterization of key enzymes. Our methods based on combinatorial optimization for the reconstruction of genome-scale metabolic networks, and on the classification of enzyme families based on local and partial alignments, allowed the metabolism of the seaweed E. siliculosus to be deciphered. As a further study, we reconstructed the metabolic network of the symbiont bacterium Ca. P. ectocarpi and used this reconstructed network to decipher interactions within the algal-bacterial holobiont.
We have a strong, long-term collaboration with biologists of the PEGASE and IGEPP units of INRA in Rennes. F. Moreews is a permanent engineer from the PEGASE center hosted in the team to develop methods for integrative biology applied to species of agricultural interest. D. Tagu is a research director at INRA/IGEPP who spends 20% of his time in the team to develop collaborative projects. This partnership has been supported by the co-supervision of Ph.D. students, post-docs and engineers. The collaboration was also reinforced within the ANR contracts MirNadapt and FatInteger.
In collaboration with researchers from the PEGASE center (INRA) focused on breeding animals, we have contributed to several studies aiming at better integrating and investigating data in order to facilitate animal selection and feeding. The NutritionAnalyzer prototype was developed to better understand the impact of several diets or treatments on the composition of the milk of dairy cows. Our work on the identification of upstream regulators within large-scale knowledge databases (prototype KeyRegulatorFinder) and on the semantic-based analysis of metabolic networks was also very valuable for interpreting differences in gene expression in pork meat and identifying the main gene regulators of the response of pigs to several diets.
In addition, constraint-based programming also allows us to decipher regulators of reproduction for the pea aphid, an insect pest of plants, in the framework of the MirnAdapt project. In terms of biological output of the network studies on pea aphid microRNAs, we have identified one new microRNA (apmir-3019, not present in any known species other than the pea aphid) which has more than 900 putative mRNA targets. All these targets, as well as apmir-3019, are differentially expressed between sexual and asexual embryos.
We also have a strong, long-term collaboration in health, namely with the IRSET laboratory at Univ. Rennes 1. N. Théret, research director at INSERM, is hosted in the team to strengthen our collaborative projects. Our collaborations are formalized by the co-supervised Ph.D. theses of V. Delannée, M. Conan (Metagenotox project, funded by Anses) and J. Coquet. This partnership was reinforced by the ANR contract Biotempo, which ended at the end of 2014. In 2015, the project of combining semantic web technologies with bi-clustering classification based on formal concept analysis was applied to systems biology within the PEPS CONFOCAL project. This scientific line has recently been pushed forward in the TGFSYSBio project funded by Plan Cancer, on modelling the microenvironment of the TGF-beta signaling network (P. Vignet was recruited on this contract at the end of 2016).
A new application was initiated in 2017 through a collaboration with Rennes hospital, supported by an Inria-INSERM Ph.D. thesis (M. Louarn).
IDEALG is one of the five laureates of the 2010 national call for Biotechnology and Bioresources and will run until 2020. It gathers 18 different partners from the academic field (CNRS, IFREMER, UEB, UBO, UBS, ENSCR, University of Nantes, INRA, AgroCampus), the industrial field (C-WEED, Bezhin Rosko, Aleor, France Haliotis, DuPont) as well as a technical center specialized in seaweeds (CEVA), in order to foster biotechnology applications in the seaweed field. We are participating in the tasks related to the establishment of a virtual platform for integrating omics studies on seaweeds and the integrative analysis of seaweed metabolism, in cooperation with SBR Roscoff. Major objectives are the building of brown algae metabolic maps, flux analysis and the selection of bacteria symbiotic to brown algae. We will also contribute to the prediction of specific enzymes (sulfatases).
As a partner of the PEPS platform, several teams at Inria Rennes develop generic methods supporting efficient and semantically rich queries for pharmaco-epidemiology studies on medico-administrative databases. The leader is Thomas Guyet (Inria team Lacodam). We showed that semantic web technologies are technically suited for representing patient data from medico-administrative databases as RDF and querying them with SPARQL. We also demonstrated that this approach is relevant, as it supports the combination of patient data with hierarchical knowledge in order to reconcile precise patient data with more general query criteria. This work is mostly conducted by Yann Rivault, whose Ph.D. thesis is supervised by Olivier Dameron and Nolwenn Le Meur (École des Hautes Études en Santé Publique).
The TGFSYSBIO project aims to develop the first model of the extracellular and intracellular TGF-beta system, which should make it possible to analyze the behavior of TGF-beta activity during the course of liver tumor progression and to identify new biomarkers and potential therapeutic targets. Based on a collaboration with Jérôme Feret from ENS Paris, we will combine a rule-based model (Kappa language) describing extracellular TGF-beta activation with a large-scale state-transition-based model (Cadbiom formalism) of TGF-beta-dependent intracellular signaling pathways. The multi-scale integrated model will be enriched with a large-scale analysis of liver tissues using shotgun proteomics to characterize protein networks of the tumor microenvironment, whose remodeling is responsible for extracellular activation of TGF-beta. The trajectories and upstream regulators of the final model will be analyzed with symbolic model checking techniques and abstract interpretation combined with causality analysis. Candidates will be classified with semantic-based approaches and symbolic bi-clustering techniques. The project is funded by the national program "Plan Cancer - Systems Biology" from 2015 to 2018.
Oceans are particularly affected by global change, which causes, for example, increases in average sea temperature and in UV radiation fluxes at the ocean surface, or a shrinkage of nutrient-rich areas. This raises the question of the capacity of marine photosynthetic microorganisms to cope with these environmental changes both in the short term (physiological plasticity) and in the long term (e.g. gene alterations or acquisitions causing changes in fitness in a specific niche). Synechococcus cyanobacteria are among the most pertinent biological models to tackle this question, because of their ubiquity and wide abundance in the field, which allows them to be studied at all levels of organization, from genes to the global ocean.
The SAMOSA project is funded by the ANR from 2014 to 2018 and coordinated by F. Gaczarek at the Station Biologique de Roscoff/UPMC/CNRS. The goal of the project is to develop a systems biology approach to characterize and model the main acclimation (i.e. physiological) and adaptation (i.e. evolutionary) mechanisms involved in the differential responses of Synechococcus clades/ecotypes to environmental fluctuations, in order to better predict their respective adaptability, and hence their dynamics and distribution, in the context of global change. For this purpose, following an intensive omics experimental protocol driven by our colleagues from
The objective of the Mecagenotox project is to characterize and model the ability of the human liver to bioactivate environmental contaminants during chronic liver diseases, in order to assess individual susceptibility to xenobiotics. Indeed, liver pathologies that result in the development of fibrosis are associated with a severe dysfunction of liver functions that may lead to increased susceptibility to contaminants. In this project, funded by ANSES and coordinated by S. Langouet at IRSET/Inserm (Univ. Rennes 1), we will combine cell biology approaches, biochemistry, biophysics, analytical chemistry and bioinformatics to 1) understand how the tension forces induced by the development of liver fibrosis alter the susceptibility of hepatocytes to certain genotoxic chemicals (especially heterocyclic aromatic amines) and 2) model the behavior of xenobiotic metabolism during liver fibrosis. Our main goal is to identify "sensitive" biomolecules in the network and to understand more comprehensively the bioactivation of environmental contaminants involved in the onset of hepatocellular carcinoma.
These projects started in Oct. 2014 and aim at designing a working environment based on workflows to assist molecular biologists in integrating large-scale omics data on non-classical species. The main goal of the workflows is to facilitate the identification of the sets of regulators involved in the response of a species challenged by an environmental stress. Applications target extremophile biotechnologies (biomining) and marine biology (micro-algae).
Microalgae are recognized for the extraordinary diversity of molecules they can contain: proteins, lipids (for biofuel, or long-chain polyunsaturated fatty acids for human health), vitamins, antioxidants, pigments. The project aims at predicting and optimizing the productivity of microalgae. It mainly involves the Inria teams Biocore (PI), Ange and Dyliss. Dyliss is in charge of the identification of physiological functions of microalgae based on their proteomes, which is carried out through the reconstruction of the metabolic network of the microalga T. lutea.
The project aims at identifying the main markers of pathologies through the production and integration of imaging and bioinformatics data. It mainly involves the Inria teams Aramis (PI), Dyliss, Genscale and Bonsai. Dyliss is in charge of facilitating the interoperability of imaging and bioinformatics data.
This project aims at automatically generating abstractions of biological data and knowledge in order to scale up federated queries in the context of Semantic Web technologies. It is a joint project with the Wimmics Inria team.
Partner: Aachen University (Germany)
Title: Modeling the logical response of a signalling network with constraint programming.
We cooperate with the Univ. of Chile (MATHomics, A. Maass) on methods for the identification of biomarkers and on software for biochip design. The aim is to combine automatic reasoning on biological sequences and networks with probabilistic approaches, in order to manage, explore and integrate large sets of heterogeneous omics data into interaction networks from which biomarkers can be produced, with a main application to biomining bacteria. The program was co-funded by Inria and CORFO-Chile from 2012 to 2016. In this context, IntegrativeBioChile was an Associate Team between Dyliss and the Laboratory of Bioinformatics and Mathematics of the Genome hosted at the Univ. of Chile, funded from 2011 to 2016. The collaboration is now supported by Chilean programs.
Niger. University of Maradi [O. Abdou-Arbi]
Poland. Politechnika Wroclawska [W. Dyrka]
India. VIT University, Vellore [K. Lakshmanan]
Chile. University of Chile [A. Siegel, C. Frioux]
Germany. University of Potsdam [L. Bourneuf, 3 months (Nov. 2017 – Jan. 2018)]
SWAT4HCLS (2017): Semantic Web Applications and Tools for Health Care and Life Sciences (O. Dameron)
BBCC (2017): Bioinformatica e Biologia Computazionale in Campania (O. Dameron)
JOBIM (2017): French conference of Bioinformatics (A. Siegel)
SIIM (2017): Symposium sur l'Ingénierie des Informations Médicales (O. Dameron)
ISMB/ECCB 2017.
O. Dameron is an associate editor of the Journal of Biomedical Semantics
J. Bourdon is an academic editor of PLoS One
Briefings in Bioinformatics, (O. Dameron)
Journal of Biomedical Semantics (O. Dameron)
Journal on Data Semantics (O. Dameron)
Molecular Cancer (J. Nicolas)
PLoS One (J. Nicolas)
Paris, Hôpital Lariboisière (seminar, 2017) – SANOFI (Gentilly, 3 invited seminars, 2017) – Clermont-Ferrand (Insect team, 2017) – Nantes (University of Nantes, 2017).
Conference on Boolean networks (Marseille, 2017) – Bioss Meeting on artificial intelligence (Gif, 2017)
Member of the steering committee of the International Conference on Grammatical Inference.
The team was involved in the foundation of a national working group on the symbolic study of dynamical systems, named Bioss [web access]. The group gathers more than 170 scientists, from computer science to biology. It is supported by two French national research networks: bioinformatics (GDR BIM: Bioinformatique Moléculaire) and informatics-mathematics (GDR IM: Informatique Mathématique). The group gathered twice in 2017: for a general meeting in Montpellier (Mar. 2017) and for a workshop on the links between systems biology and artificial intelligence in Orsay (June 2017).
Evaluation panel of the "Europe-USA Call Strengthening Transnational Research in Molecular Plant Sciences" launched by ERA-CAPS.
Institutional boards for the recruitment and evaluation of researchers. Inria national evaluation board (A. Siegel, appointed member). National Council of Universities, section 65 (O. Dameron, appointed member).
Evaluation committees of French laboratories or doctoral schools. Bioinformatics groups of Institut Curie (Paris, presidency of the committee, A. Siegel) – Doctoral school of Nice University (N. Théret).
Presidency of the expert panel for the call Systems biology applied to Cancer of the National Cancer Plan 2017 (A. Siegel).
Recruitment committees. Inria Senior Researchers (national committee, A. Siegel) – Inria Junior Researchers (Nice, National Committee, A. Siegel)
Scientific Advisory Board of the French National Research Network GDR BIM Molecular Bioinformatics (J. Nicolas).
Operational Legal and Ethical Risk Assessment Committee (COERLE) at Inria (J. Nicolas).
Animation of the Bioss working group (A. Siegel).
Board of directors of the French Society for Biology of the Extracellular Matrix (N. Théret).
"Big & Open Data" foresight working group of PROSPER network (F. Coste).
"Prospectives in predictive toxicology" working group at INRA (A. Siegel)
Scientific Advisory Board of Biogenouest (J. Bourdon, N. Théret)
IRISA laboratory (computer science department of Univ. Rennes 1) council (A. Siegel)
In charge of the IRISA laboratory "Health-biology" cross-cutting axis (O. Dameron)
SCAS (Service Commun d'Action Sociale) of Univ. Rennes 1 (C. Belleannée)
Scientific committee of Univ. Rennes 1 school of medicine (O. Dameron, A. Siegel).
Coordination of the doctoral school "Life, Agronomy and Health" of University of Rennes 1 [N. Théret]
Coordination of the master degree "Bioinformatics and genomics", Univ. Rennes 1 [O. Dameron]
Coordination of the sub-domain "From Data to Knowledge: Machine Learning, Modeling and Indexing Multimedia Contents and Symbolic Data", Master in Computer Science, University of Rennes 1, France [F. Coste].
"Atelier bioinformatique", Licence 2 informatique, Univ. Rennes 1 [O. Dameron]
"Bioinformatique pour la génomique", 2nd year school of medicine, Univ. Rennes 1 [O. Dameron]
"Bases de mathématiques et probabilités" and "Méthodes en informatique", Master 1 in public health, Univ. Rennes 1 [O. Dameron]
"Big data and Semantic Web", Master 2 in public health, Univ. Rennes 1 [O. Dameron]
"Intégration: Remise à niveau en informatique", Master 1 in bioinformatics, Univ. Rennes 1 [O. Dameron]
"Programmation en Python", Master 1 in Public Health, Univ. Rennes 1 [O. Dameron]
"Programmation impérative en Python", Master 1 in bioinformatics, Univ. Rennes 1 [O. Dameron]
"Système informatique GNU/Linux", Master 1 in bioinformatics, Univ. Rennes 1 [O. Dameron]
"Semantic Web and bio-ontologies", Master 2 in bioinformatics, Univ. Rennes 1 [O. Dameron]
"e-Santé et réseaux hospitaliers", last year in engineering school ESIR, Univ. Rennes 1, [O. Dameron]
"Equilibre Dynamique de la Communication Cellulaire", Master 2 in Sciences Cellulaires et Moléculaires du Vivant, Univ. Rennes 1 [N. Théret]
Licence: C. Belleannée, Langages formels, 20h, L3 informatique, Univ. Rennes 1, France.
Licence: C. Belleannée, Algorithmique et Programmation Fonctionnelle, 60h, L1 informatique, Univ. Rennes 1, France.
Licence: J. Coquet, Module Programmation Scientifique 1, 20h, L1 informatique, Univ. Rennes 1, France.
Licence: O. Dameron, Biostatistiques, 12h, 1st year school of medicine, Univ. Rennes 1, France
Licence: O. Dameron, C2i niveau 2, 2.5h, 2nd year school of medicine, Univ. Rennes 1, France
Licence: O. Dameron, Bioinformatique pour la génomique, 5h, 2nd year school of medicine, Univ. Rennes 1, France
Licence: C. Frioux, Programmation scientifique Python, 12h, L1, Univ. Rennes 1, France.
Licence: C. Frioux, LaTeX, 12h, L3 ENSAI, France.
Licence: C. Frioux, Outils bureautiques pour le statisticien, 6h, L3 ENSAI, France.
Licence: C. Frioux, Algorithmique et programmation Python, 6h, L3 ENSAI, France.
Licence: L. Bourneuf, Ingénierie Systèmes et Réseaux, 10h, L3 INFO, France.
Licence: L. Bourneuf, Algorithmique des graphes, 8h, L3 INFO, France.
Licence: L. Bourneuf, Algorithmique des graphes, 2h, L3 MIAGE, France.
Master: L. Bourneuf, Principes de Programmation et d'Algorithmique, 6h, M1 BIG, France.
Master: L. Bourneuf, Projet, 10h, M1 BIG, France.
Master: C. Belleannée, Programmation logique avec contraintes et algorithmes génétiques, 40h, M1 informatique, Univ. Rennes 1, France.
Master: C. Belleannée, Algorithmique du texte et bioinformatique, 10h, M1 informatique, Univ. Rennes 1, France.
Master: F. Coste, Apprentissage Automatique Supervisé, 10h, M2 Informatique, Univ. Rennes 1, France
Master: O. Dameron, Object-oriented programming, 20h, M1 bioinformatique et génomique, Univ. Rennes 1, France
Master: O. Dameron, Gestion de projet en informatique, 12h, M1 bioinformatique et génomique, Univ. Rennes 1, France
Master: O. Dameron, Ontologies biomédicales, 6h, Engineering school Institut Mines-Télécom Bretagne-Atlantique Brest, France
Master: O. Dameron, Internship jury, 25h, M1 bioinformatique et génomique, Univ. Rennes 1, France
Master: O. Dameron, Internship jury, 7.5h, M2 bioinformatique et génomique, Univ. Rennes 1, France
Master: O. Dameron, Intégration : remise à niveau en informatique, 14h, M1 bioinformatique, Univ. Rennes 1, France
Master: O. Dameron, Programmation impérative en Python, 39.5h, M1 bioinformatique, Univ. Rennes 1, France
Master: O. Dameron, Système informatique GNU/Linux, 12h, M1 bioinformatique, Univ. Rennes 1, France
Master: O. Dameron, Programmation en Python, 24h, M1 in Public Health, Univ. Rennes 1, France
Master: O. Dameron, Semantic Web and bio-ontologies, 14h, M2 bioinformatique, Univ. Rennes 1, France
Master: O. Dameron, Bases de mathématiques et probabilités, 15h, M1 santé publique, Univ. Rennes 1, France
Master: A. Siegel, Integrative and Systems biology, 20h, M2, Univ. Rennes 1, France
Master: A. Siegel, Introduction to integrative biology, 2h, M2, Univ. Rennes 1, France
PhD : Victorien Delannée, Intégrer les échelles moléculaires et cellulaires dans l'inférence de réseaux métaboliques. Application aux xénobiotiques, started in Oct. 2014, defended in Nov. 2017, supervised by A. Siegel and N. Théret.
PhD : Julie Laniau, Structure de réseaux biologiques : rôle des noeuds internes vis-à-vis de la production de composés, started in Oct. 2013, defended in Oct. 2017, supervised by A. Siegel and D. Eveillard.
PhD : Jean Coquet, Semantic-based reasoning for biological pathways analysis, started in Oct. 2014, defended in Dec. 2017, supervised by O. Dameron and N. Théret.
PhD in progress : Lucas Bourneuf, Justifiable graph decomposition to assist biological network understanding, started in Oct. 2016, supervised by J. Nicolas.
PhD in progress : Clémence Frioux, Using preferences in Answer Set Programming to decipher interactions within the species of an ecosystem at the genomic scale, started in Oct. 2015, supervised by A. Siegel.
PhD in progress : Yann Rivault, Analyse de parcours de soins à partir de bases de données médico-administratives en utilisant des outils du Web Sémantique : identification de complications et de leurs déterminants suite à la pose chirurgicale de dispositif médical implantable en ambulatoire, started in Oct. 2015, supervised by O. Dameron and N. Lemeur.
PhD in progress : Juliette Talibart, Learning grammars with long-distance correlations on proteins, started in Nov. 2017, supervised by F. Coste and J. Nicolas.
PhD in progress : Mael Conan, Predictive approach to assess the genotoxicity of environmental contaminants during liver fibrosis, started in Oct. 2017, supervised by S. Langouet and A. Siegel.
PhD in progress: Marine Louarn, Intégration de données génomiques massives et hétérogènes, application aux mutations non-codantes dans le lymphome folliculaire, started in Oct. 2017, supervised by A. Siegel, T. Fest (CHU) and O. Dameron.
PhD in progress : Méline Wery, Methodology development in disease treatment projects, started in Oct. 2017, supervised by O. Dameron, C. Bettembourg (Sanofi) and A. Siegel.
Member of Ph.D. thesis juries. J. Mercier, Univ. Evry/CEA [A. Siegel, reviewer]. C. Franay, INRA Toulouse [A. Siegel, reviewer]. W. Bedhiafi, Univ. Pierre et Marie Curie Paris + UTM Tunis [O. Dameron]. V. Delannée, Univ. Rennes 1 [N. Théret, O. Dameron]. J. Coquet, Univ. Rennes 1 [O. Dameron, N. Théret]. P. Finet, Univ. Rennes 1 [O. Dameron].
Member of habilitation thesis juries. E. Remy, Univ. Marseille [A. Siegel, president].
Member of medicine doctorate juries. G. Lebailly, Univ. Rennes 1 [O. Dameron].
Internship, from Jun 2017 until Jul 2017. Supervised by J. Nicolas. Student: Alexis Baudin. Subject: Recherche d'attracteurs dans les réseaux booléens synchrones en ASP.
Internship, from Jan until Jun 2017. Supervised by A. Siegel. Student: Mael Conan. Subject: Modélisation et caractérisation de la réponse au stress de la cyanobactérie marine Synechococcus sp. WH7803.
Internship, from Apr 2017 until Jul 2017. Supervised by N. Théret and O. Dameron. Student: Kevin Courtet. Subject: Integration of the miRNA-mediated gene regulatory interaction network from macrophages of patients with cystic fibrosis.
Internship, from Apr 2017 until Jul 2017. Supervised by J. Got. Student: Nicolas Guillaudeux. Subject: Vérifications du réseau métabolique entier de Tisochrysis lutea.
Internship, from Apr 2017 until Jul 2017. Supervised by J. Nicolas and F. Coste. Student: Ali Hassan Kachalo. Subject: Annotation automatique en familles des séquences d'une superfamille d'enzymes, les HAD (haloacides déhalogénases), par Analyse de Concepts Formels (FCA).
Internship, from Apr 2017 until Jul 2017. Supervised by C. Frioux and C. Trottier. Student: Claire Lippold. Subject: Exploration et caractérisation du microbiome associé à Ectocarpus subulatus str. BFT.
Internship, from Jan 2017 until Jun 2017. Supervised by O. Dameron and A. Siegel. Student: Marine Louarn. Subject: Analysis and integration of heterogeneous large-scale genomics data.
Internship, from Apr 2017 until Jul 2017. Supervised by N. Théret and J. Nicolas. Student: Aurelie Nicolas. Subject: Modeling of interaction networks from extracellular matrix components using formal concept analysis.
Internship, from Apr 2017 until Jul 2017. Supervised by C. Belleannée. Student: Dimitri Pedron. Subject: Annotation et prédiction de transcriptome: validation d'ORF alternatifs prédits. Application au gène CREM chez l'humain, la souris et le chien.
Internship, from Feb 2017 until Jun 2017. Supervised by F. Coste. Student: Manon Ruffini. Subject: Better scoring schemes for the recognition of functional proteins by protomata.
Internship, from Feb 2017 until Jun 2017. Supervised by J. Nicolas. Student: Marie Salmon. Subject: Biclustering: quantitative formal concept analysis in Answer Set Programming.
Internship, from Jan 2017 until Jun 2017. Supervised by A. Siegel and O. Dameron. Student: Méline Wery. Subject: Formalizing and computing signatures of phenotypes within a biological network.
Internship, from May 2017 until Jul 2017. Supervised by C. Belleannée. Student: Mohamed Zemmouri. Subject: Analyse de texte en bioinformatique : modélisation grammaticale d'un site ADN, et recherche du site, même dégénéré, sur l'intégralité du génome humain.
We wrote a contribution to the collaborative book edited by CNRS on the main issues of data mining. Our contribution specifically addressed modeling issues arising in ecology with the development of NGS technologies.
Many of our current and former Ph.D. students (A. Antoine-Lorquin, C. Bettembourg, J. Coquet, V. Delannée, G. Garet, S. Prigent) have been heavily involved in the organization of Sciences en Courts (http://sciences-en-courts.fr/), a local popularization festival where Ph.D. students explain their thesis through short movies. The movies are presented to a jury composed of professional artists and scientists, and of high-school students.
Films from previous years can be viewed on the festival website.