

Section: New Software and Platforms

Platforms and toolboxes

One goal of the team is to facilitate interplay between tools for biological data analysis and integration. Our tools are based on formal systems. They aim at guiding the user to progressively reduce the space of models (families of gene or protein sequences, families of key actors involved in a system response, dynamical models) that are compatible with both knowledge and experimental observations.

Most of our tools are available both as stand-alone software and through portals such as Mobyle or Galaxy interfaces. Tools are developed in collaboration with the GenOuest resource and data center hosted in the IRISA laboratory, including their computer facilities [more info] .

We present here three toolboxes, each of which contains complementary tools for its targeted sub-domain of bioinformatics.

Integrative Biology: (constraint-based) toolbox for network filtering

The goal is to offer a toolbox for the reconstruction of networks from genome, literature and large-scale observation data (expression data, metabolomics...) in order to elucidate the main regulators of an observed phenotype. Most of the optimization issues are addressed with Answer Set Programming.

MeMap and MeMerge. We develop a workflow for the Automatic Reconstruction of Metabolic networks (AuReMe). This workflow uses heterogeneous data sources whose identifiers come from different namespaces. MeMap (Metabolic network Mapping) maps identifiers from different namespaces to a unified namespace. MeMerge (Metabolic network Merge) then merges two metabolic networks previously mapped to the same namespace. [web server].
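The two steps can be illustrated with a minimal sketch. It is not the AuReMe code: the function names, the data layout (a dictionary from reaction identifier to a pair of metabolite sets) and the toy identifiers are all assumptions made for illustration. It also assumes that reaction identifiers do not clash between the two networks.

```python
# Hypothetical sketch of the map-then-merge idea; not the AuReMe API.

def map_network(network, id_mapping):
    """Rewrite metabolite identifiers of a network into a unified namespace.

    network: dict reaction_id -> (reactants, products), both sets of identifiers
    id_mapping: dict source identifier -> unified identifier
    Identifiers absent from the mapping are kept unchanged.
    """
    def translate(metabolites):
        return frozenset(id_mapping.get(m, m) for m in metabolites)

    return {rid: (translate(r), translate(p)) for rid, (r, p) in network.items()}


def merge_networks(net_a, net_b):
    """Merge two networks that already share a namespace.

    Reactions with the same reactants and products are treated as duplicates;
    the identifier seen first is kept.
    """
    merged = {}
    seen = set()
    for net in (net_a, net_b):
        for rid, signature in net.items():
            if signature not in seen:
                seen.add(signature)
                merged[rid] = signature
    return merged


# Toy data: two networks using different (made-up) identifier schemes.
net_a = {"R1": ({"glc"}, {"g6p"})}
net_b = {"rxnA": ({"M_glc"}, {"M_g6p"}), "rxnB": ({"M_g6p"}, {"M_f6p"})}
to_unified_a = {"glc": "GLC", "g6p": "G6P"}
to_unified_b = {"M_glc": "GLC", "M_g6p": "G6P", "M_f6p": "F6P"}

merged = merge_networks(map_network(net_a, to_unified_a),
                        map_network(net_b, to_unified_b))
print(sorted(merged))
```

In this toy case the reaction "rxnA" is recognised as a duplicate of "R1" only because both were first mapped to the unified namespace, which is why the mapping step precedes the merge.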

meneco [input: draft metabolic network & metabolic profiles. output: metabolic network]. It is a qualitative approach to elaborating the biosynthetic capacities of metabolic networks and resolving the incompleteness of large-scale metabolic networks. Since November 2015, a new version of Meneco has been available for Python 3, and a topological producibility check has been added. [82] [60] [python package] [web server].
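Topological producibility is commonly defined recursively: a metabolite is producible if it is a seed, or if it is a product of a reaction whose reactants are all producible. The sketch below computes this by simple fixed-point iteration. It is only an illustration under that assumed definition: Meneco itself relies on Answer Set Programming, and the function names and toy network here are hypothetical.

```python
# Hypothetical sketch of a topological producibility check; not the Meneco API.

def producible_scope(reactions, seeds):
    """Return the set of metabolites topologically producible from the seeds.

    reactions: dict reaction_id -> (reactants, products)
    seeds: iterable of metabolites assumed to be available
    """
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            # A reaction fires once all of its reactants are in the scope.
            if set(reactants) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


def unproducible_targets(reactions, seeds, targets):
    """Targets that cannot be reached from the seeds in the draft network."""
    return set(targets) - producible_scope(reactions, seeds)


# Toy draft network: "d" needs "c", which nothing produces.
draft = {
    "r1": (["a"], ["b"]),
    "r2": (["b", "c"], ["d"]),
    "r3": (["a", "b"], ["e"]),
}
print(unproducible_targets(draft, seeds={"a"}, targets={"d", "e"}))
```

Targets reported as unproducible are the ones for which a gap-filling step would have to propose additional reactions.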

shogen [input: genome & metabolic network. output: functional regulatory modules]. This software identifies genome portions that contain a high density of genes coding for enzymes that regulate successive reactions of metabolic pathways. See section 6.3 for details. [55] [python package].

lombarde [input: genome, modules & several gene-expression datasets. output: oriented regulation network]. This tool is useful to enhance key causalities within a regulatory transcriptional network when it is challenged by several environmental perturbations. In 2015, the tool was simplified to handle standardized data formats. [41] [web server] .

ingranalysis [input: signed regulation network & one gene-expression dataset. output: network repair, gene-expression prediction]. This tool is an extension of the bioquali tool. It proposes a range of operations for altering experimental data and/or a biological network in order to re-establish their mutual consistency, an indispensable prerequisite for automated prediction. To accomplish repair and prediction, we take advantage of the modeling and reasoning capacities of Answer Set Programming. The tool has recently evolved into the iggy tool. [5] [21] [Python package] [web server].
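The notion of mutual consistency can be illustrated with a simplified sign-consistency rule: the observed variation of a regulated node must be explained by at least one of its regulators, taking the sign of the interaction into account. The sketch below is a hedged approximation of that idea only. It is not the bioquali, ingranalysis or iggy implementation (which are ASP-based), and the function name and data layout are assumptions.

```python
# Hypothetical sketch of a sign-consistency check on a signed regulation network.

def inconsistent_nodes(edges, observations):
    """Return observed nodes whose variation no regulator can explain.

    edges: list of (source, target, sign) with sign in {+1, -1}
    observations: dict node -> +1 (increase) or -1 (decrease)
    Nodes without regulators, and nodes with at least one unobserved
    regulator, are left unconstrained in this simplified version.
    """
    incoming = {}
    for source, target, sign in edges:
        incoming.setdefault(target, []).append((source, sign))

    bad = set()
    for target, influences in incoming.items():
        if target not in observations:
            continue
        if any(source not in observations for source, _ in influences):
            continue  # an unobserved regulator could explain the variation
        explained = any(sign * observations[source] == observations[target]
                        for source, sign in influences)
        if not explained:
            bad.add(target)
    return bad


# Toy network: a activates b, b inhibits c, a activates c.
edges = [("a", "b", 1), ("b", "c", -1), ("a", "c", 1)]
print(inconsistent_nodes(edges, {"a": 1, "b": 1, "c": -1}))
print(inconsistent_nodes(edges, {"a": 1, "b": -1, "c": 1}))
```

A repair step would then look for a minimal set of changes (to the data or to the network) that makes the set of inconsistent nodes empty.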

Dynamics and invariant-based prediction

We develop tools predicting some characteristics of a biological system behavior from incomplete sets of parameters or observations.

cadbiom. Based on guarded transition semantics, this software provides a formal framework to help the modeling of biological systems such as cell signaling networks. It allows investigating synchronization events in biological networks. [software] [web server].

caspo: Cell ASP Optimizer. This easy-to-use software learns Boolean logic models describing the immediate-early response of protein signaling networks. See Sec. 6.4 for details. The tool is included in the cellNopt package (http://www.cellnopt.org/). [python package] [web server].

nutritionAnalyzer. This tool is dedicated to the computation of allocation for an extremal flux distribution. It allows quantifying the precursor composition of each system output (AIO) and discussing the biological relevance of a set of fluxes in a given metabolic network by computing the extremal values of AIO coefficients. This approach makes it possible to discriminate diets without making any assumption on the internal behaviour of the system. [40] [webserver] [software and doc].

POGG. The POGG software allows scoring the importance and sensitivity of regulatory interactions within a biological system with respect to the observation of a time-series quantitative phenotype. This is done by solving nonlinear problems to infer and explore the family of weighted Markov chains having a relevant asymptotic behavior at the population scale. Its possible application fields are systems biology, sensitive interactions, maximal entropy models and natural language processing. It results from our collaboration with the LINA-Nantes. [2] [matlab package].

Sequence annotation

We develop tools for discovery and search of complex signatures within biological sequences.

Logol. Logol is a Swiss-army knife for pattern matching on DNA/RNA/protein sequences, using a high-level grammar that allows great expressivity [48]. In 2015, the efficiency of the tool was improved by slight evolutions of the underlying grammar. Possible fields of application are the detection of mutated binding sites or stem-loop identification (e.g. in CRISPR (http://crispi.genouest.org/) [10]). [software].

Protomata learner. The Protomata software suite provides a grammatical inference framework for learning the specific signature of a functional protein family from unaligned sequences by partial and local multiple alignment and automata modeling. In 2015, motivated by the characterization of viral protein sequences during the internship of Maud Jusot [38], we began refactoring the parsing part of Protomata and implemented a new mode that returns the sum of the scores over all paths (Forward score), in addition to the classical score of the best path (Viterbi score), to improve the sensitivity of parsing on divergent but conserved families of sequences. [web server].
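The difference between the two scores can be sketched on a generic probabilistic state model. The code below is not Protomata's parser: it assumes a small hidden-Markov-style model with made-up states and probabilities, and only shows that the Viterbi score keeps the best path at each step while the Forward score sums over all paths.

```python
# Hypothetical sketch contrasting Viterbi (best path) and Forward (all paths) scores.

def viterbi_and_forward(obs, states, start, trans, emit):
    """Return (viterbi_score, forward_score) for an observed sequence.

    start[s]: probability of starting in state s
    trans[p][s]: probability of moving from state p to state s
    emit[s][x]: probability that state s emits symbol x
    """
    best = {s: start[s] * emit[s][obs[0]] for s in states}
    total = dict(best)
    for symbol in obs[1:]:
        # Viterbi keeps only the most probable predecessor of each state.
        best = {s: max(best[p] * trans[p][s] for p in states) * emit[s][symbol]
                for s in states}
        # Forward accumulates the probability of every predecessor.
        total = {s: sum(total[p] * trans[p][s] for p in states) * emit[s][symbol]
                 for s in states}
    return max(best.values()), sum(total.values())


# Toy two-state model with illustrative probabilities.
states = ("A", "B")
start = {"A": 1.0, "B": 0.0}
trans = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emit = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.25, "y": 0.75}}

viterbi_score, forward_score = viterbi_and_forward("xy", states, start, trans, emit)
print(viterbi_score, forward_score)
```

Because the Forward score adds the contributions of all paths, it is never smaller than the Viterbi score, which is consistent with its use for sequences whose score is spread over several near-optimal paths.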

Integration of toolboxes and platforms in webservices

Most of our software was designed as "bricks" that can be combined through workflow applications such as Mobyle. It is worth integrating them into larger dedicated environments to benefit from the expertise of other research groups.

Platform for data storage, expertise sharing and application inventory. Developed in collaboration with the GenOuest resource center, the BII platform (Bio Investigation Index) is a good way to enhance knowledge and expertise sharing, improve the visibility of the team's work in progress and record the history of the team's discoveries and main results. It enables experiment reproducibility by reporting details of the experimental process, storing all scripts and software (in the corresponding versions) and linking all input files, results and non-reproducible intermediate data. [web access].

Web servers. In collaboration with the GenOuest resource center, most of our tools are made available through several web portals.

Dr Motif. This resource aims at integrating different software commonly used in pattern discovery and matching. It also integrates the Dyliss pattern search and discovery software.

ASP4biology and BioASP. It is a meta-package that creates a powerful environment for biological data integration and analysis in systems biology, based on knowledge representation and combinatorial optimization technologies (ASP). It provides a collection of Python applications that encapsulate ASP tools and several encodings, making them easy to use out of the box by non-expert users. [Python package] [website].

ASP encodings repository. This suite comprises projects related to applications of Answer Set Programming using Potassco systems (the Potsdam Answer Set Solving Collection, which bundles tools for Answer Set Programming developed at the University of Potsdam). These are usually sets of encodings, possibly including auxiliary software and scripts. [repository].