Magnome is a joint project-team with the PRES Bordeaux (Universities Bordeaux 1 and Bordeaux Ségalen) and the CNRS (LaBRI UMR 5800). All of the members of Magnome are also members of the LaBRI.
One of the key challenges in the study of biological systems is understanding how the static information recorded in the genome is interpreted to become dynamic systems of cooperating and competing biomolecules. MAGNOME addresses this challenge through the development of informatic techniques for multi-scale modeling and large-scale comparative genomics:
logical and object models for knowledge representation
stochastic hierarchical models for behavior of complex systems, formal methods
algorithms for sequence analysis, and
data mining and classification.
We use genome-scale comparisons of eukaryotic organisms to build modular and hierarchical hybrid models of cell behavior that are studied using multi-scale stochastic simulation and formal methods. Our research program builds on our experience in comparative genomics, modeling of protein interaction networks, and formal methods for multi-scale modeling of complex systems.
New high-throughput technologies for DNA sequencing have radically reduced the cost of acquiring genome and transcriptome data, and introduced new strategies for whole genome sequencing. The result has been an increase in data volumes of several orders of magnitude, as well as a greatly increased density of genome sequences within phylogenetically constrained groups of species. Magnome develops efficient techniques for dealing with these increased data volumes, and the combinatorial challenges of dense multi-genome comparison.
With clinical and academic partners, Magnome participated in the development of a new rapid diagnostic test for yeast pathogens in the Nakaseomycetes class, based on a comparative annotation of six genomes.
These six genomes (five in the class Nakaseomycetes and one strain of S. cerevisiae) were automatically annotated de novo from their raw sequences using our YAGA software.
Through a long-standing collaboration between the LaBRI and Prof. Aline Lonvaud at the Institute of Vine and Wine Sciences of Bordeaux, and under the auspices of the ANR DIVOENI contract (2008-2012), we successfully completed the first comparative exploration of the Oenococcus oeni pan-transcriptome. The resulting guidelines partially lift the veil on how the genome of this lactic acid bacterium involved in wine fermentation globally adapts to its environment at a functional and an organisational level.
We released the first whole-genome metabolic model of the oleaginous yeast Yarrowia lipolytica, developed using our Pathtastic software and curated in collaboration with colleagues from the INRA Grignon (model MODEL1111190000 in Biomodels.net).
The complete implementation of the BioRica modeling framework was deposited with the APP and has been released [biorica]. BioRica was developed as an Inria Technology Development Action.
Fundamental questions in the life sciences can now be addressed at an unprecedented scale through the combination of high-throughput experimental techniques and advanced computational methods from the computer sciences. The new field of computational biology or bioinformatics has grown around intense collaboration between biologists and computer scientists working towards understanding living organisms as systems. One of the key challenges in this study of systems biology is understanding how the static information recorded in the genome is interpreted to become dynamic systems of cooperating and competing biomolecules.
Magnome addresses this challenge through the development of informatic techniques for understanding the structure and history of eukaryote genomes: algorithms for genome analysis, data models for knowledge representation, stochastic hierarchical models for behavior of complex systems, and data mining and classification. Our work is in methods and algorithms for:
Genome annotation for complete genomes, performing syntactic analyses to identify genes, and semantic analyses to map biological meaning to groups of genes.
Integration of heterogeneous data, to build complete knowledge bases for storing and mining information from various sources, and for unambiguously exchanging this information between knowledge bases.
Ancestor reconstruction using optimization techniques, to provide plausible scenarios of the history of genome evolution.
Classification and logical inference, to reliably identify similarities between groups of genetic elements, and infer rules through deduction and induction.
Hierarchical and comparative modeling, to build mathematical models of the behavior of complex biological systems, in particular through combination, reutilization, and specialization of existing continuous and discrete models.
The hundred- to thousand-fold decrease in sequencing costs seen in the past few years presents significant challenges for data management and large-scale data mining. Magnome's methods specifically address “scaling out,” where resources are added by installing additional computation nodes, rather than by adding more resources to existing hardware. Scaling out adds capacity and redundancy to the resource, and thus fault tolerance, by enforcing data redundancy between nodes, and by reassigning computations to existing nodes as needed.
The central dogma of evolutionary biology postulates that contemporary genomes evolved from a common ancestral genome, but the large scale study of their evolutionary relationships is frustrated by the unavailability of these ancestral organisms that have long disappeared. However, this common inheritance allows us to discover these relationships through comparison, to identify those traits that are common and those that are novel inventions since the divergence of different lineages.
We develop efficient methodologies and software for associating biological information with complete genome sequences, in the particular case where several phylogenetically-related eukaryote genomes are studied simultaneously.
The methods designed by Magnome for comparative genome annotation, structured genome comparison, and construction of integrated models are applied on a large scale to:
eukaryotes from the hemiascomycete class of yeasts, and to
prokaryotes from the lactic bacteria used in winemaking.
A general goal of systems biology is to acquire a detailed quantitative understanding of the dynamics of living systems. Different formalisms and simulation techniques are currently used to construct numerical representations of biological systems, and a recurring challenge is that hand-tuned, accurate models tend to be so focused in scope that it is difficult to repurpose them. We claim that, instead of modeling individual processes de novo, a sustainable effort in building efficient behavioral models must proceed incrementally. Hierarchical modeling is one way of combining specific models into networks. Effective use of hierarchical models requires both formal definition of the semantics of such composition, and efficient simulation tools for exploring the large space of complex behaviors. We have combined theoretical results from formal methods and practical considerations from modeling applications to define BioRica, a framework in which discrete and continuous models can communicate with a clear semantics. Hierarchical models in BioRica can be assembled from existing models, translated into their execution semantics, and then simulated at multiple resolutions through multi-scale stochastic simulation. BioRica models are compiled into a discrete event formalism capable of capturing discrete, continuous, stochastic, non-deterministic and timed behaviors in an integrated and unambiguous way. Our long-term goal is to develop a methodology in which we can assemble a model for a species of interest using a library of reusable models and an organism-level “schematic” determined by comparative genomics.
Comparative modeling is also a matter of reconciling experimental data with models and inferring new models through a combination of comparative genomics and successive refinement.
Yeasts provide an ideal subject matter for the study of eukaryotic microorganisms. From an experimental standpoint, the yeast Saccharomyces cerevisiae is a model organism amenable to laboratory use and very widely exploited, resulting in an astonishing array of experimental results. From a genomic standpoint, yeasts from the hemiascomycete class provide a unique tool for studying eukaryotic genome evolution on a large scale. With their relatively small and compact genomes, yeasts offer a unique opportunity to explore eukaryotic genome evolution by comparative analysis of several species.
Yeasts are widely used as cell factories, for the production of beer, wine and bread and more recently of various metabolic products such as vitamins, ethanol, citric acid, lipids, etc.
Yeasts can assimilate hydrocarbons (genera Candida, Yarrowia and Debaryomyces), depolymerise tannin extracts (Zygosaccharomyces rouxii) and produce hormones and vaccines in industrial quantities through heterologous gene expression.
Several yeast species are pathogenic for humans, especially Candida albicans, Candida glabrata, Candida tropicalis and the Basidiomycete Cryptococcus neoformans.
The hemiascomycetous yeasts represent a homogeneous phylogenetic group of eukaryotes with a relatively large diversity at the physiological and ecological levels. Comparative genomic studies within this group have proved very informative.
Magnome applies its methods for comparative genomics and knowledge engineering to the yeasts through the ten-year-old Génolevures program (GDR 2354 CNRS), devoted to large-scale comparisons of yeast genomes with the aim of addressing basic questions of molecular evolution. We developed the software tools used by the CNRS's genolevures.org web site. Magnome's Magus system for simultaneous genome annotation combines semi-supervised classification and rule-based inference in a collaborative web-based system that explicitly uses comparative genomics to simultaneously analyse groups of related genomes.
Oleaginous yeasts are capable of synthesizing lipids from different substrates other than glucose, and current research is attempting to understand these conversions with the goal of optimizing their throughput, production and quality. From a genomic standpoint the objective is to characterize genes involved in the biosynthesis of precursor molecules which will be transformed into fuels, which are thus not derived from petroleum. Biological experimentation by partner laboratories studies lipid accumulation in oleaginous yeasts such as Yarrowia lipolytica starting from:
pentoses, produced from lignocellulosic agricultural substrates following a biorefining strategy,
glycerol, a secondary output of chemical production of biodiesel, and
industrial residues.
Lipases from Y. lipolytica are of particular interest. Experimental characterization of the lipid bodies produced from these substrates will aid in the identification of target genes which may serve for genetic engineering. This in turn requires the development of molecular tools for this class of yeasts with strong industrial potential. Magnome's focus is on acquiring genome sequences, predicting genes using models learned from genome comparison and sequencing of cDNA transcripts, and comparative annotation. Our overall goal is to define dynamic models that can be used to predict the behavior of modified strains and thus drive selection and genetic engineering.
Yeasts and bacteria are essential for the winemaking process, and selection of strains based both on their efficiency and on the influence on the quality of wine is a subject of significant effort in the Aquitaine region. Unlike the species studied above, yeast and bacterial starters for winemaking cannot be genetically modified. In order to propose improved and more specialized starters, industrial producers use breeding and selection strategies.
Yeast starters from the Saccharomyces genus are used for primary, alcoholic fermentation. Recent advances have made it possible to identify the genetic causes of the technological differences between strains. Manipulating the genetic causes rather than the industrial consequences is far more amenable to experimental development. An essential tool in identifying these genetic causes is comparative genomics.
Bacterial starters based on Oenococcus oeni are used in secondary, malolactic fermentation. Genetically, O. oeni presents a surprising level of intra-specific diversity, and there are clues that it may evolve more rapidly than expected. Studying the diversity of the O. oeni genomes has led to genetic tools that can be used to evaluate the predisposition of different strains to respond to oenological stresses. While identifying particular genes has been the leading strategy up to now, recently a new strategy based on comparative genomics has been undertaken to understand the impact and mechanisms of genetic diversity.
Starting from historical collaborations by Pascal Durrens and Elisabeth Bon with partners from the Institute for Wine and Vine Sciences in Bordeaux (ISVV), we have built an effective partnership between Magnome, the UMR Œnology–ISVV, and local industry, to apply our tools to large-scale comparative genomics of yeast and bacterial starters in winemaking.
Affinity binders are molecular tools for recognizing protein targets that play a fundamental role in proteomics and clinical diagnostics. Large catalogs of binders are available from competing technologies (antibodies, DNA/RNA aptamers, artificial scaffolds, etc.), and Europe has set itself the ambitious goal of establishing a comprehensive, characterized and standardized collection of specific binders directed against all individual human proteins, including variant forms and modifications. Despite the central importance of binders, they presently cover only a very small fraction of the proteome, and even though there are many antibodies against some targets (for example, >900 antibodies against p53), there are none against the vast majority of proteins. Moreover, widely accepted standards for binder characterization are virtually nonexistent.
Alongside the technical challenges in producing a comprehensive binder resource are significant logistical challenges, related to the variety of producers and the lack of reliable quality control mechanisms. As part of the ProteomeBinders and Affinomics projects, Magnome works to develop knowledge engineering techniques for storing, exploring, and exchanging experimental data used in affinity binder characterization. This work involves databases and tools for molecular interaction data, standards for data exchange between peers, and reporting standards.
Inria Bioscience Resources is a portal designed to improve the visibility of bioinformatics tools and resources developed by Inria teams. This portal will help the community of biologists and bioinformaticians understand the variety of bioinformatics projects in Inria, test the different applications, and contact project-teams. Eight project-teams participate in the development of this portal. Inria Bioscience Resources is developed in an Inria Technology Development Action (ADT).
As part of our contribution to the Génolevures Consortium, we have developed over the past few years an efficient set of tools for web-based collaborative annotation of eukaryote genomes. The Magus genome annotation system integrates genome sequences and sequence features, in silico analyses, and views of external data resources into a familiar user interface requiring only a Web browser. Magus implements the annotation workflows and enforces curation standards to guarantee consistency and integrity. As a novel feature the system provides a workflow for simultaneous annotation of related genomes through the use of protein families identified by in silico analyses; this has resulted in a three-fold increase in curation speed, compared to one-at-a-time curation of individual genes. This allows us to maintain Génolevures standards of high-quality manual annotation while efficiently using the time of our volunteer curators.
Magus is built on a standard sequence feature database, the Stein lab generic genome browser, and various biomedical ontologies.
For more information see magus.gforge.inria.fr, the Magus Gforge web site. Magus is developed in an Inria Technology Development Action (ADT).
With the arrival of new generations of sequencers, laboratories can sequence groups of genomes at lower cost, and these genomes can no longer be annotated manually. The objective of the YAGA software is to annotate a raw sequence both syntactically (genetic elements: gene, CDS, tRNA, centromere, gap, ...) and functionally, and to generate EMBL files for publication. The annotation takes into account data from comparative genomics, such as protein family profiles.
After the constraints of the annotation have been determined, the YAGA software can automatically annotate genomes de novo from their raw sequences. The predictors used by YAGA can also take RNA-seq data into account to reinforce gene prediction. The current settings of the software are intended for the annotation of yeast genomes, but the software is adaptable to all types of species.
BioRica is a high-level modeling framework integrating discrete and continuous multi-scale dynamics within the same semantic domain. A BioRica model is hierarchically composed of nodes, which may be existing models. Individual nodes can be of two types:
Discrete nodes are composed of states, and transitions described by constrained events, which can be non deterministic. This captures a range of existing discrete formalisms (Petri nets, finite automata, etc.). Stochastic behavior can be added by associating the likelihood that an event fires when activated. Markov chains or Markov decision processes can be concisely described. Timed behavior is added by defining the delay between an event's activation and the moment that its transition occurs.
Continuous nodes are described by ODE systems, potentially a hybrid system whose internal state flows continuously while having discrete jumps.
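The discrete node type described above can be illustrated with a minimal Python sketch. This is not BioRica syntax or the BioRica implementation: the class `DiscreteNode`, its method names, and the two-state switch are hypothetical, chosen only to show guarded events, weighted (stochastic) choice among enabled events, and a delay between activation and transition.

```python
import random


class DiscreteNode:
    """Toy discrete node: named states plus guarded, weighted, delayed events."""

    def __init__(self, state):
        self.state = state
        self.time = 0.0
        self.events = []  # tuples: (name, guard, target, weight, delay)

    def add_event(self, name, guard, target, weight=1.0, delay=0.0):
        self.events.append((name, guard, target, weight, delay))

    def step(self, rng):
        """Fire one enabled event, chosen with probability proportional to weight."""
        enabled = [e for e in self.events if e[1](self.state)]
        if not enabled:
            return None  # no event can fire from the current state
        weights = [e[3] for e in enabled]
        chosen = rng.choices(enabled, weights=weights, k=1)[0]
        name, _guard, target, _weight, delay = chosen
        self.time += delay  # timed behavior: the transition occurs after the delay
        self.state = target
        return name


# Hypothetical two-state switch: "off" -> "on", then either back to "off" or stay "on".
node = DiscreteNode("off")
node.add_event("activate", lambda s: s == "off", "on", delay=2.0)
node.add_event("deactivate", lambda s: s == "on", "off", weight=3.0, delay=1.0)
node.add_event("persist", lambda s: s == "on", "on", weight=1.0, delay=1.0)

rng = random.Random(0)
trace = [node.step(rng) for _ in range(5)]
print(trace, node.state, node.time)
```

When two events are enabled in the same state, the weights play the role of the likelihoods mentioned above; with a single enabled event the node behaves deterministically.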
The system has been implemented as a distributable software package.
The BioRica compiler reads a specification for a hierarchical model and compiles it into an executable simulator. The modeling language is a stochastic extension to the AltaRica Dataflow language, inspired by work of Antoine Rauzy. Input parsers for SBML Level 2 Version 4 are currently being validated. The compiled code uses the Python runtime environment and can be run stand-alone on most systems.
For more information see biorica.gforge.inria.fr, the BioRica Gforge web site. BioRica was developed as an Inria Technology Development Action (ADT).
Pathtastic is a software tool for inferring whole-genome metabolic models for eukaryote cell factories. It is based on metabolic scaffolds, abstract descriptions of reactions and pathways on which inferred reactions are hung and eventually connected by an iterative mapping and specialization process. Scaffold fragments can be repeatedly used to build specialized subnetworks of the complete model.
Pathtastic uses a consensus procedure to infer reactions from complementary genome comparisons, and an algebra for assisted manual editing of pathways.
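The report does not specify how the consensus is computed. As an illustration only, the sketch below assumes a simple support threshold: a reaction is kept when at least `min_support` independent genome comparisons predict it. The function name, the threshold rule, and the reaction identifiers are hypothetical and are not taken from Pathtastic.

```python
from collections import Counter


def consensus_reactions(predictions, min_support=2):
    """Keep reactions predicted by at least min_support genome comparisons.

    predictions maps a comparison name to the reactions it predicts.
    """
    votes = Counter()
    for reactions in predictions.values():
        votes.update(set(reactions))  # each comparison votes once per reaction
    return sorted(r for r, n in votes.items() if n >= min_support)


# Hypothetical predictions from three complementary comparisons.
predictions = {
    "comparison_1": ["R1", "R2"],
    "comparison_2": ["R2", "R3"],
    "comparison_3": ["R1", "R2", "R1"],
}
print(consensus_reactions(predictions))
```

Reactions that fall below the threshold are natural candidates for the assisted manual editing step.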
For more information see pathtastic.gforge.inria.fr, the Pathtastic Gforge web site.
The Génolevures online database provides tools and data for exploring the annotated genome sequences of more than 20 genomes, determined and manually annotated by the Génolevures Consortium to facilitate comparative genomic studies of hemiascomycetous yeasts. Data are presented with a focus on relations between genes and genomes: conservation of genes and gene families, speciation, chromosomal reorganization and synteny. The Génolevures site includes an area for specific studies by members of its international community.
Génolevures online uses the Magus system for genome navigation, with project-specific extensions developed by David Sherman, Pascal Durrens, and Tiphaine Martin. An advanced query system for data mining in Génolevures is being developed by Natalia Golenetskaya. The contents of the knowledge base are expanded and maintained by the CNRS through GDR 2354 Génolevures. Technical support for Génolevures On Line is provided by the CNRS through UMR 5800 LaBRI.
For more information see genolevures.org, the Génolevures web site.
By using the MAGNOME software developments, including the Magus system and YAGA software, we have successfully completed a full annotation and analysis of seven new genomes, provided to the Génolevures Consortium by the CEA–Génoscope (Évry). Two distant genomes from the Debaryomycetaceae and mitosporic Saccharomycetales clades of the Saccharomycetales were annotated using previously published Génolevures genomes as references. A further group of five species, comprising pathogenic and nonpathogenic species, was analyzed with the goal of identifying virulence determinants. By choosing species that are highly related but which differ in the particular traits that are targeted, in this case pathogenicity, we are able to focus on the few hundred genes related to the trait. The approximately 40,000 new genes from these studies were classified into existing Génolevures families as well as branch-specific families. The results from these two studies will be published in the coming year.
Oenococcus oeni is part of the natural microflora of wine and related environments, and is the main agent of the malolactic fermentation (MLF), a step of wine making that generally follows alcoholic fermentation (AF) and contributes to wine deacidification, improvement of sensorial properties and microbial stability. The start, duration and achievement of MLF are unpredictable since they depend both on the wine characteristics and on the properties of the O. oeni strains. In collaboration with Patrick Lucas's lab of the ISVV Bordeaux, which is currently proceeding with genome sequencing and exploratory and comparative genomics, Elisabeth Bon has coordinated our efforts in the OENIKITA project since 2009. This project is a change-of-scale challenge that includes high-throughput exploratory and comparative genomics for oenological bacterial starters, and the development of an online collaborative multi-genome comparative platform (under development) based on the Génolevures database architecture and the MAGUS / YAGA systems.
OENI-Genomics axis: In comparative genomics, we investigated the conservation of gene repertoire and genomic organization through intra- and inter-species genomic comparisons, which clearly show that the O. oeni genome is highly plastic and fast-evolving. Results reveal that the optimal adaptation of a strain to wine mostly depends on the presence of key adaptive loops and polymorphic genes. They also point to the role of horizontal gene transfer and mobile genetic elements in O. oeni genome plasticity, and give the first clues to the genetic origin of its oenological aptitudes. As a result of the scaling out challenge, we completed the assembly of 19 fully sequenced O. oeni genome variants.
KITA-Genomics (E. Bon, D. Sherman): This project, focused on the sequencing, assembly, exploration and comparison of the O. kitaharae genome, has benefited from an international collaboration involving Dr V. Makeev. MAGNOME contributed to the assembly of the genome. The comparison against the O. oeni genomic architecture will help shed light on the evolutionary mechanisms responsible for the atypically long branch of the genus Oenococcus in phylogenetic trees.
Transcriptomic axis (E. Bon, A. Goulielmakis): Under the supervision of E. Bon, Aurélie Goulielmakis has completed for the ANR DIVOENI a detailed manual annotation of a new reference strain of O. oeni and performed comparative transcriptome analysis to identify genes differentially expressed under different culture conditions. We explored and compared how the expression system is mobilized when O. oeni strains adapted to grow in some niches are placed under stress-exposure conditions. The monitoring of gene expression status between strains, through the definition of a global expression pattern proper to each gene, partially lifts the veil on how the O. oeni genome adapts its function to its environment. The weight of genetic background and ecological niche pressure on gene expression flexibility was evaluated, and the O. oeni pan-transcriptome architecture characterized. The first guidelines revealed a supra-spatial organization of stress response into larger activated and repressed macro-domains defining functional landmarks and intra-chromosomal territories. Decryption of stress-sensitive gene repertoires promises to be an efficient tool in the conquest of O. oeni “domestication” through the identification of molecular markers responsible for different physiological capabilities, and the selection of the best adapted strains.
Gene plasticity modeling (E. Bon, A. Goulielmakis): A novel axis of research recently emerged on the initiative of E. Bon (pseudOE project) around the detection, characterization and conservation of pseudogene populations in Oenococcus bacteria. This topic is of twofold interest: phylogenetic, because it should allow a better estimate of the degree of genic and genomic plasticity of these bacteria, and algorithmic, because pseudogenes are a source of confusion for automatic gene prediction. Through a cross-team collaboration and joint supervision with the Algorithms for Analysis of Biological Structures Group (P. Ferraro, J. Allali) at LaBRI, Laetitia Bourgeade (PhD, Univ. Bordeaux 1) was recruited to develop dedicated methods to improve automatic pseudogene detection, and therefore gene prediction, and to reconstruct the evolutionary history of fossil and modern genes.
The Tsvetok project in Magnome targets “scaling out” for data and computation, both to improve capacity for handling large volumes of data and to permit more automatic analysis of projects of the “comparative genomics of related species” type, where a set of genomes is sequenced and analyzed as part of the same process. Natalia Golenetskaya has designed and implemented a NoSQL schema through the identification of standard queries, definition of the appropriate query-oriented storage schema, and mapping of structured values to this schema. This prototype is being tested on an Apache Cassandra ring deployed in Magnome's dedicated computing cluster.
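The idea of a query-oriented storage schema can be sketched in plain Python, without any database client. The example assumes two standard queries ("all members of a protein family" and "all genes on a chromosome, in order") and builds one denormalized table per query. The field names, the two queries, and the function `build_query_tables` are illustrative assumptions; this is not the Tsvetok schema and does not use the Cassandra API.

```python
from collections import defaultdict


def build_query_tables(genes):
    """Denormalize gene records into one lookup table per standard query."""
    by_family = defaultdict(list)  # query 1: members of a protein family
    by_locus = defaultdict(list)   # query 2: genes of a chromosome, by position
    for g in genes:
        by_family[g["family"]].append((g["genome"], g["gene_id"]))
        by_locus[(g["genome"], g["chromosome"])].append((g["start"], g["gene_id"]))
    for rows in by_locus.values():
        rows.sort()  # keep rows ordered by start position within a partition
    return dict(by_family), dict(by_locus)


# Hypothetical structured records.
genes = [
    {"gene_id": "g1", "genome": "A", "chromosome": "chr1", "start": 50, "family": "F1"},
    {"gene_id": "g2", "genome": "A", "chromosome": "chr1", "start": 10, "family": "F2"},
    {"gene_id": "g3", "genome": "B", "chromosome": "chr2", "start": 5, "family": "F1"},
]
by_family, by_locus = build_query_tables(genes)
print(by_family["F1"])
print(by_locus[("A", "chr1")])
```

Each record is written once per query it serves, trading storage for reads that touch a single key, which is the usual motivation for query-oriented schemas in key-partitioned stores.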
Large-scale data-mining such as that required for comparative genomics is fundamentally data-parallel: an initial transformation is applied to every data object of a given type (such as genes or even individual nucleotides), then a statistical machine learning procedure is applied to the transformed data to produce a summary or to learn a classification function. Analyses of this kind are the design goal of the MapReduce paradigm. Using Tsvetok as a generator for Apache Hadoop, Natalia is designing MapReduce solutions for the principal whole-genome and data-mining analyses used by Magnome for eukaryote and prokaryote comparative genomics.
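The data-parallel pattern described here (transform every object, then summarize) can be shown with a small in-memory sketch. It runs in a single Python process and does not use Hadoop; the GC-content summary, the record fields, and the helper names are assumptions chosen for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_gene(gene):
    """Map step: emit (genome, (GC count, length)) for one gene sequence."""
    seq = gene["seq"].upper()
    gc = sum(1 for base in seq if base in "GC")
    yield gene["genome"], (gc, len(seq))


def reduce_counts(a, b):
    """Reduce step: add partial (GC count, length) pairs for one genome."""
    return (a[0] + b[0], a[1] + b[1])


def run_mapreduce(records, mapper, reducer):
    """Apply mapper to every record, group values by key, then fold each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reduce(reducer, values) for key, values in groups.items()}


# Hypothetical gene records.
genes = [
    {"genome": "A", "seq": "ATGC"},
    {"genome": "A", "seq": "GGCC"},
    {"genome": "B", "seq": "ATAT"},
]
totals = run_mapreduce(genes, map_gene, reduce_counts)
gc_fraction = {genome: gc / length for genome, (gc, length) in totals.items()}
print(gc_fraction)
```

Because the reducer is associative and commutative, partial sums can be computed on separate nodes and merged in any order, which is what makes the pattern suitable for scaling out.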
Last year we successfully completed and released the MIAPAR and PSI-PAR international standards for knowledge representation and data exchange of affinity binder properties, a five-year effort organized as part of the ProteomeBinders and HUPO-PSI consortia. These standards were reported to the research community in Nature Biotechnology and Molecular and Cellular Proteomics, and extend previous work. One long-standing issue is the adoption of these standards by individual researchers in the lab: initial data entry must be simple enough that standards-based reporting can be integrated into the process of writing the paper. We used an extensive dataset of affinity proteomics data to evaluate “last mile” tools for data entry and initial reporting of affinity proteomics data, and identified places where existing tools need to be adapted to meet these specific needs.
In collaboration with Prof Jean-Marc Nicaud's lab at the INRA Grignon, we developed the first functional genome-scale metabolic model of an oleaginous yeast. Most work in producing genome-scale metabolic models has focused on model organisms, in part due to the cost of obtaining well-annotated genome sequences and sufficiently complete experimental data for refining and verifying the models. However, for many fungal genomes of biotechnological interest, the combination of large-scale sequencing projects and in-depth experimental studies has made it feasible to undertake metabolic network reconstruction for a wider range of organisms.
An excellent representative of this new class of organisms is Yarrowia lipolytica, an oleaginous yeast studied experimentally for its role as a food contaminant and its use in bioremediation and cell factory applications. As one of the hemiascomycetous yeasts completely sequenced in the Génolevures program it enjoys a high quality manual annotation by a network of experts. It is also an ideal subject for studying the role of species-specific expansion of paralogous families, a considerable challenge for eukaryotes in genome-scale metabolic construction. To these ends, we undertook a complete reconstruction of the Y. lipolytica metabolic network.
Methods: A draft model was extrapolated from the S. cerevisiae model iIN800, using in silico methods including enzyme conservation predicted using Génolevures and reaction mapping maintaining compartments. This draft was curated by a group of experts in Y. lipolytica metabolism, and iteratively improved and validated through comparison with experimental data by flux balance analysis. Gap filling, species-specific reactions, and the addition of compartments with the corresponding transport reactions were among the improvements that most affected accuracy.
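The extrapolation step can be illustrated with a short sketch. It assumes a deliberately simple rule: a reference reaction is carried over, with its compartment unchanged, when at least one of its enzyme genes has an ortholog in the target genome. The actual procedure is not detailed in this report, real gene-reaction rules combine genes with AND/OR logic that is ignored here, and all identifiers (`refG1`, `tgtG1`, `extrapolate_draft`) are placeholders.

```python
def extrapolate_draft(reference_reactions, orthologs):
    """Map reference reactions to a target genome through conserved enzymes.

    Returns the draft reactions and the ids of reactions with no conserved
    enzyme, which are candidates for gap filling or removal.
    """
    draft, unmapped = [], []
    for rxn in reference_reactions:
        mapped = sorted({t for g in rxn["genes"] for t in orthologs.get(g, ())})
        if mapped:
            draft.append({
                "id": rxn["id"],
                "compartment": rxn["compartment"],  # compartment is maintained
                "genes": mapped,
            })
        else:
            unmapped.append(rxn["id"])
    return draft, unmapped


# Hypothetical reference model fragment and ortholog table.
reference = [
    {"id": "R1", "compartment": "cytosol", "genes": ["refG1", "refG2"]},
    {"id": "R2", "compartment": "mitochondrion", "genes": ["refG3"]},
]
orthologs = {"refG1": ["tgtG1", "tgtG1b"], "refG2": []}
draft, unmapped = extrapolate_draft(reference, orthologs)
print(draft, unmapped)
```

In this toy case `refG1` maps to two target genes, mimicking a species-specific expansion of a paralogous family; deciding which paralog actually carries the reaction is left to curation.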
Results: We produced an accurate functional model for Y. lipolytica, MODEL1111190000 in Biomodels.net, that has been qualitatively validated against gene knockouts.
A recurring challenge for in silico modeling of cell behavior is that experimentally validated models are so focused in scope that it is difficult to repurpose them. Hierarchical modeling is one way of combining specific models into networks. Effective use of hierarchical models requires both formal definition of the semantics of such composition, and efficient simulation tools for exploring the large space of complex behaviors.
BioRica is a high-level hierarchical modeling framework for models combining continuous and discrete components. By providing a reliable and functional software tool backed by a rigorous semantics, we hope to advance real adoption of hierarchical modeling by the systems biology community. By providing an understandable and mathematically rigorous semantics, we aim to make it easier for practicing scientists to build practical and functional models of the systems they are studying, and to concentrate their efforts on the system rather than on the tool.
Rodrigo Assar formalized two strategies for integrating discrete control with continuous models: coefficient switches that control the parameters of the continuous model, and strong switches that choose different models. This was translated by Alice Garcia into a BioRica specification for hybrid systems that assures integrity of models, allowing composition, reconciliation, and reuse of models with SBML specifications. Rodrigo used this approach to describe two systems: wine fermentation kinetics, and cell fate decisions leading to bone and fat formation. In the first, known models that describe the responses of yeast cells to different temperatures, resources and toxins were reconciled using coefficient switches that gave the best adjustment of the model depending on the initial conditions and fermentation variable. In the second, a combination of accurate models to predict the bone and fat formation in response to activation of pathways such as the Wnt pathway, and changes of conditions affecting these functions such as increments in Homocysteine, were used to analyze the responses to treatments for osteoporosis and other bone mass disorders. Our hope is that this is a first step in obtaining in silico evaluations of medical treatments before testing them in vivo or in vitro.
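The difference between the two switching strategies can be shown on a toy system. In the sketch below a coefficient switch selects a parameter value for one continuous model, while a strong switch selects which continuous model is integrated. The logistic and decay equations, the thresholds, the rates, and the forward Euler integrator are all assumptions made for illustration; they are not the fermentation or bone formation models of the report.

```python
def coefficient_switch(temperature):
    """Discrete control that sets a parameter of a single continuous model."""
    return 0.5 if temperature >= 20.0 else 0.1


def growth(x, rate):
    """Logistic growth, dx/dt = rate * x * (1 - x)."""
    return rate * x * (1.0 - x)


def decay(x, rate):
    """Exponential decay, dx/dt = -rate * x."""
    return -rate * x


def strong_switch(toxin):
    """Discrete control that chooses which continuous model is active."""
    return decay if toxin > 1.0 else growth


def simulate(x0, temperature, toxin, dt=0.01, steps=1000):
    """Integrate the selected model with forward Euler steps."""
    rate = coefficient_switch(temperature)
    model = strong_switch(toxin)
    x = x0
    for _ in range(steps):
        x += dt * model(x, rate)
    return x


warm = simulate(0.1, temperature=25.0, toxin=0.0)
cold = simulate(0.1, temperature=10.0, toxin=0.0)
toxic = simulate(0.1, temperature=25.0, toxin=2.0)
print(warm, cold, toxic)
```

Here both switches are evaluated once from the initial conditions; a fuller hybrid model would re-evaluate them during the simulation as the discrete state changes.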
Maria Llubères of the University of Puerto Rico visited Magnome and we established formal relationships between BioRica models and probabilistic boolean networks.
SARCO, the research subsidiary of the Laffort group, has entered into a contract with Magnome to develop comparative genomics tools for selecting wine starters. This contract will permit SARCO to take a decisive step in the understanding of oenological microorganisms by obtaining and exploiting the sequences of their genomes. Comparison of the genomes of these strains has become absolutely necessary for learning the genetic origin of the phenotypic variations of oenological yeasts and bacteria. This knowledge will permit SARCO to optimize and accelerate the process of selection of the highest-performing natural strains. With the help of Magnome members and their rich experience in comparative analysis of related genomes, SARCO will acquire competence in biological analysis of genomic sequences. At the same time, Magnome members will acquire further experience with the genomes of winemaking microorganisms, which will help us define new tools and methods better adapted to this kind of industrial cell factory.
The French Petroleum Institute (Institut français de pétrole-énergies nouvelles) is coordinating a 6 M-Euro contract with the Civil Aviation Directorate (Direction Générale de l'Aviation Civile) on behalf of a large consortium of industrial (EADS, Dassault, Snecma, Turbomeca, Airbus, Air France, Total) and academic (CNRS, INRA, Inria) partners to explore different technologies for alternative fuels for aviation. The CAER project studies biofuel products and production, improved jet engine design, and the impact of aircraft. Within CAER, Magnome, via CNRS, works with partners from Grignon and Toulouse on the genomics of high-performing oleaginous yeasts.
This project is a collaboration between the company SARCO, which specializes in the selection of industrial yeasts with distinct technological abilities, the ISVV, and Magnome. The goal is to use genome analysis to identify molecular markers responsible for different physiological capabilities, as a tool for selecting yeasts and bacteria for wine fermentation through efficient hybridization and selection strategies. This collaboration has obtained the INNOVIN label.
Elisabeth Bon is a partner in DIVOENI, a four-year ANR project concerning intraspecies biodiversity of the oenological bacterium Oenococcus oeni. Coordinated by Prof. Aline Lonvaud (Univ. Bordeaux Segalen) from the Institute of Vine and Wine Sciences of Bordeaux – Aquitaine, this scientific programme was developed with the following aims:
To evaluate the genetic diversity of a vast collection of strains, to set up phylogenetic groups, then to investigate relationships between the ecological niches (cider, wine, champagne) and the essential phenotypical traits. Hypotheses on the evolution in the species and on the genetic stability of strains will be drawn.
To propose methods based on molecular markers to make a better use of the diversity of the species.
To measure the impact of the repeated use of selected strains on the diversity in the ecosystem and to draw the conclusions for its preservation.
Elisabeth is in charge of the computational infrastructure dedicated to genomics and post-genomics data storage, handling and analysis. She coordinates collaboration with the CBiB-Centre de Bioinformatique de Bordeaux.
A major objective of the “post-genome” era is to detect, quantify and characterise all relevant human proteins in tissues and fluids in health and disease. This effort requires a comprehensive, characterised and standardised collection of specific ligand binding reagents, including antibodies, the most widely used such reagents, as well as novel protein scaffolds and nucleic acid aptamers. Currently there is no pan-European platform to coordinate systematic development, resource management and quality control for these important reagents.
Magnome is an associate partner of the FP7 “Affinity Proteome” project coordinated by Prof. Mike Taussig of the Babraham Institute and Cambridge University. Within the consortium, we participate in defining community standards for data representation and exchange, and evaluate knowledge engineering tools for affinity proteomics data.
Prof. Mike Taussig: Babraham Institute & Cambridge University
Knowledge engineering for Affinity Proteomics
Henning Hermjakob: European Bioinformatics Institute
Standards and databases for molecular interactions
Vsevolod Makeev (Senior Researcher at the Russian Academy of Sciences, Vavilov Institute) has been a collaborator for several years. He and his student Artëm Kasianov made several visits to Inria in 2011, and worked with us on genome assembly algorithms, computational identification of sequence motifs, and distributed algorithms for data mining. Vsevolod Makeev was a visiting CNRS Senior Researcher in the LaBRI and Magnome for three months in the Fall of 2011.
Marie Llubères visited Magnome from the University of Puerto Rico for two months on a grant from the NSF PIRE program. She worked on hierarchical modeling of biological systems and specifically on bijections between Probabilistic Boolean Networks and the Stochastic Transition Systems used in the BioRica framework.
Hugo Campbell Sills came to Magnome on an Inria International Internship in the Summer of 2011, and worked on single-nucleotide polymorphism discovery and effects in twelve œnological yeast genomes.
Since 2000 our team has been a member of the Génolevures Consortium (GDR CNRS), a large-scale comparative genomics project that aims to address fundamental questions of molecular evolution through the sequencing and comparison of 14 species of hemiascomycetous yeasts. The Consortium comprises 16 partners, in France, Belgium, Spain, and the Netherlands (see
http://
The Dikaryome Consortium is a scientific collaboration between several international partners and the National Center for Sequencing (CEA–Génoscope, Évry) on the sequencing, annotation, and comparative analysis of fungal genomes.
These perennial collaborations continue in two ways. First, a number of new projects are underway, concerning several new genomes currently being sequenced, and new questions about the mechanisms of gene formation. Second, we continue the development and improvement of the Génolevures On Line database, to whose maintenance our team has a longstanding commitment.
David Sherman is a member of the editorial board of the journal Computational and Mathematical Methods in Medicine, and a reviewer for several journals in the bioinformatics field.
David Sherman was external reviewer and member of the thesis defense jury for Anne-Ruxandra Carvunis, Grenoble. He was a member and president of the jury for the thesis defense of Anne-Laure Gaillard, Bordeaux. He was thesis director and member of the jury for the thesis defense of Rodrigo Assar. He was a member of the HDR defense jury for Patrick Lucas, Bordeaux.
Pascal Durrens is responsible for scientific diffusion, and David Sherman is head of Bioinformatics, for the Génolevures Consortium.
Pascal Durrens is the leader of the “Comparative Genomics” theme and a member of the Scientific Council of the LaBRI UMR 5800/CNRS.
Tiphaine Martin is a member of the Local Committee and a member of the Local Committee for Occupational Health and Safety of Inria Bordeaux Sud-Ouest.
Tiphaine Martin is a member of the GIS-IBiSA GRISBI-Bioinformatics Grid working group.
Tiphaine Martin and David Sherman are members of the Institut de Grilles, and Tiphaine is active in the Biology/Health working group.
Elisabeth Bon is a member of the “Comité Technique Paritaire” (2008-2011) and the “Comité Technique de Proximité” (since 2011-10-20) at the Segalen Bordeaux University.
David Sherman and Natalia Golenetskaya teach:
Master : Web et Interfaces Homme-Machine, 50h, 2ème année Ingénieur, Enseirb-Matmeca (Institut Polytechnique de Bordeaux), Bordeaux
Tiphaine Martin teaches:
Master : Utilisation of EGEE GRID via virtual organisation GRISBI, 8h, level (M2), University Lyon, France
Master : Utilisation of EGEE GRID via virtual organisation GRISBI, 8h, level (M2), INRA Toulouse, France
Doctorat : Utilisation of MAGUS software, 8h, Génolevures Consortium, France
Tiphaine Martin supervises 4 Bioinformatics MSc students from the University of Bordeaux:
Master : Development of search tools on Génolevures databases, 6hETC, M1, University Bordeaux 1 and University Bordeaux Segalen, France
Elisabeth Bon is an Associate Professor in Bioinformatics and Genomics. She teaches undergraduate courses in computer science in regular STS (Sciences, Technologies & Santé) bachelor’s degrees and research-oriented STS master’s degrees at the Life Sciences Department of the University Bordeaux Segalen (Medicine and Life Sciences schools) and University Bordeaux 1 (Computer and Life Science schools).
Licence : “Introduction to ICTs-Information & Communication Technologies” class (basic and advanced sections), 112h, level (L1, L2), the STS biology variant Licence program, university, France
Licence : the national “IT and Internet certificate (C2i®), level 1”, 20h, level (L2, L3), the STS biology variant Licence program, university, France
Master : “Bioinformatics: Computerised resources, data banks and methods”, 60h, level (M1), the Biology & Healthcare STS Master program, co-listed with the University Bordeaux 1 (Sciences & Technologies) and the University Bordeaux Segalen (Medicine & Life Sciences), France
Elisabeth Bon is:
Licence : Responsible for the bachelor’s degree “Information Technologies & Internet advanced course”, Life Sciences Department, University Bordeaux, France
Licence : Responsible for the “IT and Internet certificate (C2i®), level 1”, Life Sciences Department, University Bordeaux , France
Licence : Current president (2005-2007; since sept. 2011) of the “Segalen Bordeaux University IT and Internet certificate (C2i, level 1) committee” in charge of the C2i evaluation and certification for students and continuous education interns, University Bordeaux Segalen, France
Master : Master Theses in Computer Science, speciality in BioInformatics: L. Bourgeade (2011-02-01 / 2011-08-31), Reconstitution in silico de l’histoire évolutive des pseudogènes chez les bactéries lactiques du genre Oenococcus, M2, University Bordeaux 1, France
Doctorat : Ph.D. Thesis in Computer Science: L. Bourgeade (since 2011-10-01) in cooperation with P. Ferraro and J. Allali at LaBRI, Filtres sur les arborescences modélisant les ARN et plasticité génique, University Bordeaux 1, France
Doctorat : ATER (assistant professor) in computer science for ICTs practical workshops and courses in the first year of the Bachelor’s degree, University Bordeaux 1, France
PhD & HdR :
PhD: Rodrigo Assar, Modeling and simulation of Hybrid Systems and Cell factory applications, University Bordeaux 1, defended 2011-10-26, David Sherman
PhD in progress: Nicolás Loira, University Bordeaux 1, Scaffold-based Reconstruction Method for Genome-Scale Metabolic Models, began 2007, David Sherman
PhD in progress: Natalia Golenetskaya, University Bordeaux 1, began 2009, Scaling out for data in comparative genomics, Pascal Durrens and David Sherman
PhD in progress: Razanne Issa, University Bordeaux 1, Analyse symbolique de données génomiques, began 2010, David Sherman
PhD in progress: Anna Zhukova, University Bordeaux 1, Comparative genomics of biotechnological organisms, began 2011, David Sherman
PhD in progress: Anasua Sarkar, University Bordeaux 1, began 2008, Macha Nikolski
PhD in progress: Laetitia Bourgeade, University Bordeaux 1, Filtres sur les arborescences modélisant les ARN et plasticité génique, began 2011, Pascal Ferraro and Elisabeth Bon