Genomic Exploration of the Hemiascomycetous Yeasts: 4. The genome of <i>Saccharomyces cerevisiae</i> revisited

MAGNOME Models and Algorithms for the Genome

Computational Biology

Digital Health, Biology and Earth

http://team.inria.fr/magnome/ 2009 July 01 Laboratoire Bordelais de Recherche en Informatique (LaBRI) CNRS Université de Bordeaux Computational Biology Genomics Knowledge Engineering Modeling High Performance Computing David Sherman Chercheur

Bordeaux

Team leader, Inria, Senior Researcher oui Pascal Durrens Chercheur

Bordeaux

CNRS, Researcher oui Xavier Calcas Technique

Bordeaux

CNRS, from Jun 2013 Florian Lajus Technique

Bordeaux

Inria Razanne Issa PhD

Bordeaux

Exchange Fellowship Syria Anna Zhukova PhD

Bordeaux

Inria Witold Dyrka PostDoc

Bordeaux

Inria, funded by ANR Mykimun project Natalia Golenetskaya Visiteur

Bordeaux

R&D Scientist, until Jun 2013 Artem Kasianov Visiteur

Bordeaux

PhD student RAS Moscow, from Oct 2013 until Nov 2013 Vsevolod Makeev Visiteur

Bordeaux

Professor RAS Moscow, until Nov 2013 Anne-Laure Gautier Assistant

Bordeaux

Inria Joaquin Francisco Fernandez AutreCategorie

Bordeaux

Inria, PhD student U.Rosario, from Sep 2013 until Nov 2013 Overall Objectives Overall Objectives

One of the key challenges in the study of biological systems is understanding how the static information recorded in the genome is interpreted to become dynamic systems of cooperating and competing biomolecules. Magnome addresses this challenge through the development of informatic techniques for multi-scale modeling and large-scale comparative genomics:

logical and object models for knowledge representation

stochastic hierarchical models for behavior of complex systems, formal methods

algorithms for sequence analysis, and

data mining and classification.

We use genome-scale comparisons of eukaryotic organisms to build modular and hierarchical hybrid models of cell behavior that are studied using multi-scale stochastic simulation and formal methods. Our research program builds on our experience in comparative genomics, modeling of protein interaction networks, and formal methods for multi-scale modeling of complex systems.

New high-throughput technologies for DNA sequencing have radically reduced the cost of acquiring genome and transcriptome data, and introduced new strategies for whole genome sequencing. The result has been an increase in data volumes of several orders of magnitude, as well as a greatly increased density of genome sequences within phylogenetically constrained groups of species. Magnome develops efficient techniques for dealing with these increased data volumes, and the combinatorial challenges of dense multi-genome comparison.

Research Program Overview

Fundamental questions in the life sciences can now be addressed at an unprecedented scale through the combination of high-throughput experimental techniques and advanced computational methods from the computer sciences. The new field of computational biology or bioinformatics has grown around intense collaboration between biologists and computer scientists working towards understanding living organisms as systems. One of the key challenges in this study of systems biology is understanding how the static information recorded in the genome is interpreted to become dynamic systems of cooperating and competing biomolecules.

Magnome addresses this challenge through the development of informatic techniques for understanding the structure and history of eukaryote genomes: algorithms for genome analysis, data models for knowledge representation, stochastic hierarchical models for behavior of complex systems, and data mining and classification. Our work is in methods and algorithms for:

Genome annotation for complete genomes, performing syntactic analyses to identify genes, and semantic analyses to map biological meaning to groups of genes.

Integration of heterogeneous data, to build complete knowledge bases for storing and mining information from various sources, and for unambiguously exchanging this information between knowledge bases.

Ancestor reconstruction using optimization techniques, to provide plausible scenarios of the history of genome evolution.

Classification and logical inference, to reliably identify similarities between groups of genetic elements, and infer rules through deduction and induction.

Hierarchical and comparative modeling, to build mathematical models of the behavior of complex biological systems, in particular through combination, reuse, and specialization of existing continuous and discrete models.

The hundred- to thousand-fold decrease in sequencing costs seen in the past few years presents significant challenges for data management and large-scale data mining. Magnome's methods specifically address “scaling out,” where resources are added by installing additional computation nodes, rather than by adding more resources to existing hardware. Scaling out adds capacity and redundancy to the resource, and thus fault tolerance, by enforcing data redundancy between nodes, and by reassigning computations to existing nodes as needed.

Comparative genomics

The central dogma of evolutionary biology postulates that contemporary genomes evolved from a common ancestral genome, but the large scale study of their evolutionary relationships is frustrated by the unavailability of these ancestral organisms that have long disappeared. However, this common inheritance allows us to discover these relationships through comparison, to identify those traits that are common and those that are novel inventions since the divergence of different lineages.

We develop efficient methodologies and software for associating biological information with complete genome sequences, in the particular case where several phylogenetically-related eukaryote genomes are studied simultaneously.

The methods designed by Magnome for comparative genome annotation, structured genome comparison, and construction of integrated models are applied on a large scale to:

eukaryotes from the hemiascomycete class of yeasts, and to

prokaryotes from the lactic bacteria used in winemaking.

Comparative modeling

A general goal of systems biology is to acquire a detailed quantitative understanding of the dynamics of living systems. Different formalisms and simulation techniques are currently used to construct numerical representations of biological systems, and a recurring challenge is that hand-tuned, accurate models tend to be so focused in scope that it is difficult to repurpose them. We claim that, instead of modeling individual processes de novo, a sustainable effort in building efficient behavioral models must proceed incrementally. Hierarchical modeling is one way of combining specific models into networks. Effective use of hierarchical models requires both formal definition of the semantics of such composition, and efficient simulation tools for exploring the large space of complex behaviors. We have combined theoretical results from formal methods and practical considerations from modeling applications to define BioRica, a framework in which discrete and continuous models can communicate with a clear semantics. Hierarchical models in BioRica can be assembled from existing models, translated into their execution semantics, and then simulated at multiple resolutions through multi-scale stochastic simulation. BioRica models are compiled into a discrete event formalism capable of capturing discrete, continuous, stochastic, non-deterministic and timed behaviors in an integrated and unambiguous way. Our long-term goal is to develop a methodology in which we can assemble a model for a species of interest using a library of reusable models and an organism-level “schematic” determined by comparative genomics.

Comparative modeling is also a matter of reconciling experimental data with models and inferring new models through a combination of comparative genomics and successive refinement.

Application Domains Function and history of genomes

Yeasts provide an ideal subject matter for the study of eukaryotic microorganisms. From an experimental standpoint, the yeast Saccharomyces cerevisiae is a model organism amenable to laboratory use and very widely exploited, resulting in an astonishing array of experimental results. From a genomic standpoint, yeasts from the hemiascomycete class provide a unique tool for studying eukaryotic genome evolution on a large scale. With their relatively small and compact genomes, yeasts offer a unique opportunity to explore eukaryotic genome evolution by comparative analysis of several species. MAGNOME applies its methods for comparative genomics and knowledge engineering to the yeasts through the ten-year old Génolevures program (GDR 2354 CNRS), devoted to large-scale comparisons of yeast genomes with the aim of addressing basic questions of molecular evolution.

We developed the software tools used by the CNRS's http://www.genolevures.org/ web site. For example, Magnome's Magus system for simultaneous genome annotation combines semi-supervised classification and rule-based inference in a collaborative web-based system that explicitly uses comparative genomics to simultaneously analyse groups of related genomes.

Alternative fuels and bioconversion

Oleaginous yeasts are capable of synthesizing lipids from substrates other than glucose, and current research is attempting to understand these conversions with the goal of optimizing their throughput, production and quality. From a genomic standpoint the objective is to characterize genes involved in the biosynthesis of precursor molecules which will be transformed into fuels, which are thus not derived from petroleum. Magnome's focus is on acquiring genome sequences, predicting genes using models learned from genome comparison and sequencing of cDNA transcripts, and comparative annotation. Our overall goal is to define dynamic models that can be used to predict the behavior of modified strains and thus drive selection and genetic engineering.

Winemaking and improved strain selection

Yeasts and bacteria are essential for the winemaking process, and selection of strains based both on their efficiency and on the influence on the quality of wine is a subject of significant effort in the Aquitaine region. Unlike the species studied above, yeast and bacterial starters for winemaking cannot be genetically modified. In order to propose improved and more specialized starters, industrial producers use breeding and selection strategies.

Comparative genomics is a powerful tool for strain selection even when genetic engineering must be excluded. Large-scale comparison of the genomes of experimentally characterized strains can be used to identify quantitative trait loci, which can be used as markers in selective breeding strategies. Identifying individual SNPs and predicting their effect can lead to better understanding of the function of genes implicated in improved strain performance, particularly when those genes are naturally mutated or are the result of the transfer of genetic material from other strains. And understanding the combined effect of groups of genes or alleles can lead to insight in the phenomenon of heterosis.

Knowledge bases for molecular tools

Affinity binders are molecular tools for recognizing protein targets that play a fundamental role in proteomics and clinical diagnostics. Large catalogs of binders from competing technologies (antibodies, DNA/RNA aptamers, artificial scaffolds, etc.) are available, and Europe has set itself the ambitious goal of establishing a comprehensive, characterized and standardized collection of specific binders directed against all individual human proteins, including variant forms and modifications. Despite the central importance of binders, they presently cover only a very small fraction of the proteome, and even though there are many antibodies against some targets (for example, more than 900 antibodies against p53), there are none against the vast majority of proteins. Moreover, widely accepted standards for binder characterization are virtually nonexistent. Alongside the technical challenges in producing a comprehensive binder resource are significant logistical challenges, related to the variety of producers and the lack of reliable quality control mechanisms. As part of the ProteomeBinders and Affinomics projects, Magnome works to develop knowledge engineering techniques for storing, exploring, and exchanging experimental data used in affinity binder characterization.

Software and Platforms Magus: Genome exploration and analysis David James Sherman correspondant Pascal Durrens Natalia Golenetskaya Florian Lajus Xavier Calcas

The MAGUS genome annotation system integrates genome sequences and sequence features, in silico analyses, and views of external data resources into a familiar user interface requiring only a web browser. MAGUS implements annotation workflows and enforces curation standards to guarantee consistency and integrity. As a novel feature the system provides a workflow for simultaneous annotation of related genomes through the use of protein families identified by in silico analyses; this results in an $n$-fold increase in curation speed, compared to curation of individual genes. This allows us to maintain standards of high-quality manual annotation while efficiently using the time of volunteer curators. For more information see the MAGUS Gforge web site. http://magus.gforge.inria.fr MAGUS 1.x is mature software used since 2006 by our collaboration partners. MAGUS 2.0 is developed in an Inria Technology Development Action (ADT) with an open-source license and is being deposited with the APP.

Pantograph: Inference of metabolic networks David James Sherman correspondant Pascal Durrens Nicolás Loira Anna Zhukova

Pantograph is a software tool for inferring whole-genome metabolic models for eukaryote cell factories. It is based on metabolic scaffolds, abstract descriptions of reactions and pathways on which inferred reactions are hung and are eventually connected by an iterative mapping and specialization process. Scaffold fragments can be repeatedly used to build specialized subnetworks of the complete model. A novel feature of Pantograph is that it uses expert knowledge implicitly encoded in the scaffold's gene associations, and explicitly transfers this knowledge to the new model. Pantograph is available under an open-source license. For more information see the Pantograph Gforge web site. http://pathtastic.gforge.inria.fr.

MetaModGen: Generalizing Metabolic Models Anna Zhukova correspondant David James Sherman

The metabolic model generalization and navigation software allows a human expert to explore a metabolic model in a layered manner. The software creates an on-line semantically zoomable representation of a model submitted by the user in SBML http://sbml.org format. The most general view represents the compartments of the model; the next view shows the visualization of generalized versions of reactions and metabolites in each compartment (see section 6.3); and the most detailed view visualizes the initial model with the generalization-based layout (where similar metabolites and reactions are placed next to each other). Zoomable representation is implemented using the Leaflet http://leafletjs.com JavaScript library for mobile-friendly interactive maps. Users can click on reactions and compounds to see the information about their annotations. An example of a zoomable representation of the peroxisome compartment of Y. lipolytica is available at http://metamogen.gforge.inria.fr/map.html.

BioRica: Multi-scale Stochastic Modeling David James Sherman correspondant Rodrigo Assar Cuevas Joaquin Fernandez

BioRica is a high-level modeling framework integrating discrete and continuous multi-scale dynamics within the same semantic domain. A BioRica model is hierarchically composed of nodes, which may be existing models. Individual nodes can be of two types:

Discrete nodes are composed of states and transitions described by guarded events. Behavior can be stochastic (defined by the likelihood that an event fires when activated) and timed (defined by the delay between an event's activation and the moment that its transition occurs).

Continuous nodes are described by ODE systems, potentially a hybrid system whose internal state flows continuously while having discrete jumps.
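The discrete node semantics described above (guarded events, a likelihood of firing, and a delay between activation and transition) can be illustrated with a minimal Python sketch. This is not BioRica syntax and not the output of the BioRica compiler: the names `GuardedEvent`, `DiscreteNode`, and `make_gene_switch`, as well as the toy two-state gene switch, are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, int]


@dataclass
class GuardedEvent:
    """Hypothetical guarded event: activated when `guard` holds on the state."""
    name: str
    guard: Callable[[State], bool]
    action: Callable[[State], None]   # transition applied when the event fires
    weight: float = 1.0               # relative likelihood of firing when activated
    delay: float = 0.0                # time between activation and transition


class DiscreteNode:
    """Hypothetical discrete node: a state plus a list of guarded events."""

    def __init__(self, state: State, events: List[GuardedEvent], seed: int = 0):
        self.state = dict(state)
        self.events = list(events)
        self.time = 0.0
        self.rng = random.Random(seed)

    def step(self) -> Optional[str]:
        # Collect the events whose guard holds in the current state.
        enabled = [e for e in self.events if e.guard(self.state)]
        if not enabled:
            return None
        # Stochastic choice among activated events, proportional to their weights.
        weights = [e.weight for e in enabled]
        chosen = self.rng.choices(enabled, weights=weights, k=1)[0]
        # Timed behavior: the transition occurs after the event's delay.
        self.time += chosen.delay
        chosen.action(self.state)
        return chosen.name


def make_gene_switch(seed: int = 0) -> DiscreteNode:
    """Toy model (an assumption, not taken from the report): a gene that toggles on and off."""
    events = [
        GuardedEvent("activate", lambda s: s["on"] == 0,
                     lambda s: s.update(on=1), delay=2.0),
        GuardedEvent("deactivate", lambda s: s["on"] == 1,
                     lambda s: s.update(on=0), delay=0.5),
    ]
    return DiscreteNode({"on": 0}, events, seed)
```

In this toy model exactly one event is activated at a time, so the weights play no role; they matter as soon as several guards hold simultaneously.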

The system has been implemented as a distributable software package. The BioRica compiler reads a specification for a hierarchical model and compiles it into an executable simulator. The modeling language is a stochastic extension to the AltaRica http://altarica.labri.fr Dataflow language, inspired by work of Antoine Rauzy. Input parsers for SBML Level 2 Version 4 are currently being validated. The compiled code uses the Python runtime environment and can be run stand-alone on most systems. For more information see the BioRica Gforge web site. http://biorica.gforge.inria.fr BioRica was developed as an Inria Technology Development Action (ADT) with an open-source license and is deposited with the APP.

Génolevures On Line: Comparative Genomics of Yeasts Pascal Durrens correspondant Natalia Golenetskaya Tiphaine Martin David James Sherman

The Génolevures online database provides tools and data for exploring the annotated genome sequences of more than 20 genomes, determined and manually annotated by the Génolevures Consortium to facilitate comparative genomic studies of hemiascomycetous yeasts. Data are presented with a focus on relations between genes and genomes: conservation of genes and gene families, speciation, chromosomal reorganization and synteny. The Génolevures site includes a private collaboration area for specific studies by members of its international community. The contents of the knowledge base are expanded and maintained by the CNRS through GDR 2354 Génolevures, and full data may be downloaded from the site. Génolevures online uses our open-source MAGUS system for genome navigation, with project-specific extensions developed by David Sherman, Pascal Durrens, and Tiphaine Martin; these extensions are not made available due to uncertainty about intellectual property rights. For more information see the Génolevures web site. http://www.genolevures.org/

Inria Bioscience Resources Olivier Collin correspondant Frédéric Cazals Mireille Régnier Marie-France Sagot Hélène Touzet Hidde De jong David James Sherman Marie-Dominique Devignes Dominique Lavenier

Inria Bioscience Resources is a portal designed to improve the visibility of bioinformatics tools and resources developed by Inria teams. This portal will help the community of biologists and bioinformaticians understand the variety of bioinformatics projects in Inria, test the different applications, and contact project-teams. Eight project-teams participate in the development of this portal. Inria Bioscience Resources is developed in an Inria Technology Development Action (ADT).

New Results Adopting new computing paradigms David James Sherman correspondant Pascal Durrens Natalia Golenetskaya Florian Lajus Xavier Calcas

Analyses in comparative genomics are characteristically forms of datamining in high-dimension sets of relations between genes and gene products. For every linear increase in genomic data, these relations can grow at worst geometrically.

Natalia Golenetskaya's thesis developed an integrated architecture that we call Tsvetok, which combines a novel NoSQL storage schema, domain-specific Map-Reduce algorithms, and existing resources to efficiently handle the fundamentally data-parallel analyses encountered in comparative genomics. Tsvetok components are deployed in Magnome's private cloud and have been extensively tested using data and use cases derived from log analyses of the Génolevures web resource. We designed Map-Reduce solutions for the principal whole-genome analyses used by Magnome for comparative genomics, in particular new distributed algorithms for systematic identification of gene fusion and fission events in eukaryote genomes, and large-scale consensus clustering for protein families. These examples illustrate two strategies that can be used to scale algorithms in a Map-Reduce setting.

Converting classical graph-based algorithms with message propagation: instead of traversing a graph, which would incur high latency, information is sent forward in waves, and synchronized later. Some of the intermediate computations may be redundant, but overall running time is minimized.

Iterative sampling strategies, which run the standard algorithm on carefully chosen subsets, and later compute a consensus of the intermediate results. The iterations may take some time to converge, but the individual instances can be run within one machine.
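The first strategy, message propagation in synchronized waves, can be sketched on a single machine as follows. This is not the Tsvetok implementation: it is a generic minimum-label propagation for connected components, a common way of replacing graph traversal by rounds of local messages. The function name `propagate_labels` and the gene identifiers in the usage example are hypothetical, and a real Map-Reduce deployment would distribute the map and reduce phases across nodes.

```python
from collections import defaultdict


def propagate_labels(edges, nodes):
    """Connected components by minimum-label propagation in synchronized waves.

    Each round has a "map" phase (every node sends its label to its neighbors)
    and a "reduce" phase (every node keeps the smallest label it received).
    Returns the final labels and the number of rounds executed.
    """
    label = {n: n for n in nodes}
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    rounds = 0
    while True:
        # Map: emit (destination, label) messages; no graph traversal is needed.
        messages = defaultdict(list)
        for n in nodes:
            messages[n].append(label[n])          # keep own label
            for m in neighbors[n]:
                messages[m].append(label[n])      # send to neighbors
        # Reduce: synchronize by keeping the minimum label per node.
        new_label = {n: min(received) for n, received in messages.items()}
        rounds += 1
        if new_label == label:                    # no change: propagation has converged
            return label, rounds
        label = new_label


# Usage: two groups of similar genes (hypothetical identifiers).
example_nodes = ["g1", "g2", "g3", "g4", "g5"]
example_edges = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]
example_labels, example_rounds = propagate_labels(example_edges, example_nodes)
```

Some messages are redundant (a node may receive the same label several times), which matches the trade-off described above: extra intermediate work in exchange for avoiding high-latency traversal.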

Florian Lajus extended the Magus software platform to use the NoSQL storage components in Tsvetok, and has validated it on a large collection of fungal genomes. Xavier Calcas is currently integrating the Galaxy platform http://usegalaxy.org with Magus.

Improving inference of metabolic models David James Sherman correspondant Pascal Durrens Razanne Issa Anna Zhukova

The Pantograph approach uses an annotated “scaffold” (reference) model and a collection of complementary predictions of homology between scaffold genes and target genes. The basis of the method is a weighing of the homology evidence to decide whether a reaction that is present in the scaffold ought to be present in the target.

We have improved on the method in two ways. First, we model the implicit knowledge represented in the boolean formula of each gene association, to derive hypotheses about the explicit role of individual genes; for example, a gene association $(S_{1} \land S_{2}) \lor (S_{1} \land S_{3})$ may implicitly represent an enzyme complex formed from two subunits, the first encoded by gene $S_{1}$, and the second encoded by two paralogous genes $S_{2}$ and $S_{3}$ (figure ). By using these hypotheses to rewrite gene associations, we improve the decision of whether a reaction is present in the target or not.
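By distributivity, the example association satisfies $(S_{1} \land S_{2}) \lor (S_{1} \land S_{3}) \equiv S_{1} \land (S_{2} \lor S_{3})$, which makes the "one required subunit plus interchangeable paralogs" reading explicit. The following Python sketch shows one simple way to extract that reading from a gene association written in disjunctive normal form. It is an illustration under stated assumptions, not Pantograph's rewriting rules: the functions `factor_association` and `reaction_supported` are hypothetical, and the presence test is a plain all-or-nothing check rather than a weighing of homology evidence.

```python
def factor_association(dnf):
    """Split a DNF gene association into shared genes and per-clause alternatives.

    `dnf` is a list of clauses, each clause an iterable of gene names that
    must all be present (AND); the clauses are alternatives (OR).
    """
    clauses = [frozenset(clause) for clause in dnf]
    # Genes appearing in every clause: candidates for a required subunit.
    common = frozenset.intersection(*clauses)
    # What remains in each clause: candidates for interchangeable paralogs.
    alternatives = [clause - common for clause in clauses]
    return common, alternatives


def reaction_supported(dnf, genes_with_homolog):
    """True if at least one clause is fully covered by genes having a target homolog."""
    available = set(genes_with_homolog)
    return any(set(clause) <= available for clause in dnf)


# Usage with the association from the text: (S1 and S2) or (S1 and S3).
association = [{"S1", "S2"}, {"S1", "S3"}]
required, paralog_options = factor_association(association)
```

With this association, `required` is the single gene `S1` and `paralog_options` lists `S2` and `S3` as alternatives; a target with homologs of `S1` and `S3` supports the reaction, while a target with homologs of only `S2` and `S3` does not.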

Second, we have adopted an abductive strategy for inferring reactions. In this strategy we consider that it is the reactions that explain the genes observed in the target genome. In the corresponding abductive logic program, the observations are the genes in the target, the integrity constraints are the rules that rewrite gene associations, and the hypotheses to be abduced are the reactions in the model. The scaffold model is compiled into a set of facts and predicates that express the reactions, their gene associations, and the integrity constraint rules; the abducibles generate assertions that specific reactions are in the target model. Combined with the facts of the genes observed in the target, this program generates, through abduction, the set of target reactions that explain the greatest number of genes.

The advantage of this approach is that it can invent, through specialization, reactions that are not present per se in the scaffold model.
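The idea of selecting the reactions that explain the observed genes can be illustrated with a small Python sketch. It deliberately replaces the abductive logic program with a much simpler technique, a greedy coverage heuristic, and it ignores the integrity constraints and the specialization step. The function `abduce_reactions`, the reaction names, and the gene names are hypothetical.

```python
def abduce_reactions(explains, observed_genes):
    """Greedy stand-in for abduction: pick reactions that explain the most observed genes.

    `explains` maps each candidate reaction (hypothesis) to the set of target
    genes it would account for; `observed_genes` are the genes seen in the target.
    Returns the chosen reactions, in order of selection, and the genes they explain.
    """
    observed = set(observed_genes)
    chosen, explained = [], set()
    while True:
        best, best_gain = None, 0
        for reaction in sorted(explains):          # sorted for a deterministic result
            gain = len((explains[reaction] & observed) - explained)
            if gain > best_gain:
                best, best_gain = reaction, gain
        if best is None:                           # no hypothesis explains anything new
            return chosen, explained
        chosen.append(best)
        explained |= explains[best] & observed


# Usage: R4 is never selected because it explains no observed gene,
# and R2 is redundant once R1 has been selected.
candidates = {"R1": {"t1", "t2"}, "R2": {"t2"}, "R3": {"t3"}, "R4": {"t9"}}
selected, covered = abduce_reactions(candidates, {"t1", "t2", "t3", "t4"})
```

Gene `t4` remains unexplained in this example, which is the kind of residue an annotator would inspect after inference.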

Knowledge-based generalization of metabolic models David James Sherman correspondant Pascal Durrens Razanne Issa Anna Zhukova

There is an inherent tension between detail and understandability in large metabolic networks: detailed description of individual reactions is needed for simulation, but high-level views of reactions are needed for describing pathways in human terms. We defined knowledge-based methods that factor similar reactions into “generic” reactions in order to visualize a whole pathway or compartment, while maintaining the underlying model so that the user can later “drill down” to the specific reactions if need be. This method is available as a Python library from http://metamogen.gforge.inria.fr/.

Figures and illustrate model generation for Yarrowia lipolytica fatty acid oxidation in the peroxisome. Molecular species are represented as circular nodes, and the reactions as square ones, connected by edges to their reactants and products. Ubiquitous species (e.g. oxygen, water, ATP) are of smaller size and colored gray. Non-ubiquitous species are divided into fifteen equivalence classes and colored accordingly (red/blue for trivial species/reaction equivalence classes, different colors for non-trivial equivalence classes). The size of the model does not allow for readability of the species labels, thus we do not show them (figure ).

The generalization algorithm identifies equivalent molecular species using an ontology, and groups together reactions that operate on the same abstract species. It finds the greatest generalization that preserves stoichiometry. The generalized model represents quotient species and reactions. For example, the violet unsaturated FA-CoA node is a quotient of hexadec-2-enoyl-CoA, oleoyl-CoA, tetradecenoyl-CoA, trans-dec-2-enoyl-CoA, trans-dodec-2-enoyl-CoA, trans-hexacos-2-enoyl-CoA, trans-octadec-2-enoyl-CoA, and trans-tetradec-2-enoyl-CoA (colored violet in figure ). In a similar manner, the light-green acCoA oxidase quotient reaction, which converts fatty acyl-CoA (yellow) into unsaturated FA-CoA (violet), generalizes six corresponding light-green reactions of the initial model (figure ).
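The grouping of reactions into quotient reactions can be sketched in a few lines of Python. This is not the algorithm of the library mentioned above: the species-to-class mapping is given by hand instead of being derived from an ontology, the search for the greatest generalization is omitted, and stoichiometry is protected only by a crude check that no two participants on the same side of a reaction collapse into one class. The function `generalize` and the reaction identifiers are hypothetical.

```python
from collections import defaultdict


def generalize(reactions, species_class):
    """Group reactions whose reactants and products map to the same abstract species.

    `reactions` maps a reaction id to a (reactants, products) pair of lists;
    `species_class` maps a specific species to its generalized class, and
    species absent from the mapping (e.g. ubiquitous ones) are kept as-is.
    """
    quotient = defaultdict(list)
    for name, (reactants, products) in reactions.items():
        gen_reactants = frozenset(species_class.get(s, s) for s in reactants)
        gen_products = frozenset(species_class.get(s, s) for s in products)
        # Crude safeguard: if merging would change the number of participants
        # on either side, keep the reaction in its specific form.
        if len(gen_reactants) != len(reactants) or len(gen_products) != len(products):
            gen_reactants, gen_products = frozenset(reactants), frozenset(products)
        quotient[(gen_reactants, gen_products)].append(name)
    return dict(quotient)


# Usage: two acyl-CoA oxidase steps collapse into one quotient reaction.
classes = {
    "decanoyl-CoA": "fatty acyl-CoA",
    "dodecanoyl-CoA": "fatty acyl-CoA",
    "trans-dec-2-enoyl-CoA": "unsaturated FA-CoA",
    "trans-dodec-2-enoyl-CoA": "unsaturated FA-CoA",
}
specific = {
    "ox_C10": (["decanoyl-CoA", "O2"], ["trans-dec-2-enoyl-CoA", "H2O2"]),
    "ox_C12": (["dodecanoyl-CoA", "O2"], ["trans-dodec-2-enoyl-CoA", "H2O2"]),
}
generalized = generalize(specific, classes)
```

Because each quotient entry keeps the list of specific reactions it stands for, a viewer built on this structure could still drill down from the generic reaction to the original ones.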

The generalized model describes $\beta$-oxidation in a more generic way: as a transformation of fatty acyl-CoA (yellow) into unsaturated FA-CoA (violet), then into hydroxy FA-CoA (dark green), 3-oxo FA-CoA (magenta), and back to fatty acyl-CoA (with a shorter carbon chain); while the specific model describes the same process in more detail, specifying those reactions for each of the fatty acyl-CoA species present in the organism's cell (e.g. decanoyl-CoA, dodecanoyl-CoA, etc.). That is why the $\beta$-oxidation chain of reactions in the initial model, transforming step-by-step the fatty acyl-CoA with the longest carbon chain into the one with the shortest chain, appears in the generalized model as a cycle (generalizing all the fatty acyl-CoAs into one species, regardless of chain length).

The specific model is appropriate for simulation, because it contains all of the precise reactions. The generalized model is suited for a human, because it reveals the main properties of the model and masks distracting details. For example, the generalized model highlights the fact that there is a particularity concerning C24:0-CoA (stearoyl-CoA) (red, inside the cycle): there exists a "shortcut" reaction (blue, inside the cycle), producing it directly from another fatty acyl-CoA (yellow), avoiding the usual four-reaction beta-oxidation chain, used for other fatty acyl-CoAs. This shortcut is not obvious in the specific model, because it is hidden among a plethora of similar-looking reactions.

Characterization of STAND protein families David James Sherman Pascal Durrens Witold Dyrka correspondant

In collaboration with Sven Saupe and Mathieu Paoletti from IBGC Bordeaux (ANR Mykimun), we worked on characterization of the STAND protein family in the fungal phylum. We established an in silico screen based on state-of-the-art bioinformatic tools, which, starting from experimentally studied sequences from Podospora anserina, allowed us to determine the first systematic picture of the fungal STAND protein repertoire (ms. in preparation). Most notably, we found evidence of extensive modularity of domain associations, and signs of concerted evolution within the recognition domain. Both results support the hypothesis that fungal STAND proteins, originally described in the context of vegetative incompatibility, are involved in a general fungal immune system. In addition, we investigated improved protein domain representations and elaborated a grammatical modelling method, which will be used to elucidate mechanisms of formation and operation of the STAND proteins.

Avoiding stiffness in BioRica David James Sherman correspondant Joaquin Fernandez

We previously formalized two strategies for integrating discrete control with continuous models: coefficient switches that control the parameters of the continuous model, and strong switches that choose between different models. While these strategies have proved useful for modeling hybrid systems in biotechnology and medicine, the resulting system model can be inefficient when the different subsystems evolve at very different time scales. In order to improve the efficiency of the resulting simulations, we investigated the use of Kofman's Quantized State Systems (QSS), and demonstrated that the QSS approach can be adapted to BioRica. On the strength of this demonstration, we invited Joaquin Fernandez from Kofman's lab to Magnome. Joaquin had previously implemented an efficient library for QSS simulation, and during his stay succeeded in adapting it to our hybrid modeling framework. In his approach, SBML models with events are compiled into a hybrid model, using a variant of Modelica for surface syntax and using the QSS library for efficient simulation.
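To give an intuition for why quantized-state integration suits systems with very different time scales, here is a minimal Python sketch of a first-order quantized-state scheme applied to the scalar decay equation $\dot{x} = -k x$. It is a simplified illustration, not the QSS library mentioned above and not its integration with BioRica; the function `qss1_decay`, its parameters, and the choice of test equation are assumptions made for this example.

```python
def qss1_decay(x0, quantum, t_end, rate=1.0):
    """First-order quantized-state integration of dx/dt = -rate * x.

    Instead of advancing time by a fixed step, the state is advanced until it
    has moved by one `quantum` away from its quantized value q; only then is
    the derivative re-evaluated. Returns the list of (time, state) events.
    """
    t, x = 0.0, x0
    q = x0                                  # quantized (piecewise-constant) copy of x
    trajectory = [(t, x)]
    while True:
        slope = -rate * q                   # derivative is computed from q, not from x
        if slope == 0.0:
            break                           # state is at rest: no further events
        dt = quantum / abs(slope)           # time for x to drift one quantum from q
        if t + dt > t_end:
            break                           # next event falls beyond the horizon
        t += dt
        x += slope * dt                     # x has moved by exactly one quantum
        q = x                               # re-quantize and schedule the next event
        trajectory.append((t, x))
    return trajectory


# Usage: events become sparser as the state slows down.
events = qss1_decay(x0=1.0, quantum=0.1, t_end=5.0)
```

The interval between events grows as the state decays, so a slow subsystem generates few events while a fast one generates many; each variable is effectively integrated at its own pace.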

Applications in biotechnology and health David James Sherman Pascal Durrens correspondant Florian Lajus Xavier Calcas

Using Magnome's Magus system and YAGA software, we have successfully realized a full annotation and analysis of several groups of related genomes:

Seven new genomes, provided to the Génolevures Consortium by the CEA–Génoscope (Évry), including two distant genomes from the Saccharomycetales were annotated using previously published Génolevures genomes.

Twelve wine starter yeasts linked to fermentation efficiency.

Five pathogenic (to human) and non pathogenic Nakaseomycetes.

Two oleaginous strains with applications in biofuels.

Winemaking yeasts. In collaboration with partners in the ISVV, Bordeaux, we have assembled and analyzed 12 wine starter yeasts, with the goal of understanding genetic determinants of performance in wine fermentation. Analysis included identification of strain-specific gains and losses of genes linked both to niche specificity and to performance in industrial applications (article in prep.). A further combined analysis with 50 natural and industrial strains showed a pattern of introgression concentrated in industrial strains (article in prep.).

Oleaginous yeasts. In collaboration with Prof Jean-Marc Nicaud's lab at the INRA Grignon, we developed the first functional genome-scale metabolic model of Yarrowia lipolytica, an oleaginous yeast studied experimentally for its role as a food contaminant and its use in bioremediation and cell factory applications.

Using Magnome's Pantograph method (see section ) we produced an accurate functional model for Y. lipolytica, MODEL1111190000 in BioModels http://biomodels.net/, that has been qualitatively validated against gene knockouts. This model has been enriched by Anna Zhukova with ontology terms from ChEBI and GO.

Pathogenic yeasts. A further group of five species, comprising pathogenic and nonpathogenic species, was analyzed with the goal of identifying virulence determinants. By choosing species that are highly related but which differ in the particular traits that are targeted, in this case pathogenicity, we are able to focus on the few hundred genes related to the trait. The approximately 40,000 new genes from these studies were classified into existing Génolevures families as well as branch-specific families.

Bilateral Contracts and Grants with Industry Bilateral Contracts with Industry

Magnome and the company BioLaffort are contracted to develop analyses and tools for rationalizing wine starter strain selection using genomics.

Bilateral Grants with Industry

The “SAGESS” project, below, section , has been partially funded by a grant to BioLaffort from the Region.

Partnerships and Cooperations Regional Initiatives Aquitaine Region “SAGESS” comparative genomics for wine starters.

This project is a collaboration between the company BioLaffort, specialized in the selection of industrial yeasts with distinct technological abilities, with the ISVV and Magnome. The goal is to use genome analysis to identify molecular markers responsible for different physiological capabilities, as a tool for selecting yeasts and bacteria for wine fermentation through efficient hybridization and selection strategies. This collaboration has obtained the INNOVIN label.

National Initiatives

ANR MYKIMUN.

Signal Transduction Associated with Numerous Domains (STAND) proteins play a central role in vegetative incompatibility (VI) in fungi. STAND proteins act as molecular switches, changing from a closed, inactive conformation to an open, active conformation upon binding of the proper ligand. MYKIMUN, coordinated by Mathieu Paoletti of the IBGC (Bordeaux), studies the postulated involvement of STAND proteins in heterospecific non-self recognition (innate immune response).

In MYKIMUN we extend the notion of fungal immune receptors and immune reaction beyond the P. anserina NWD gene family. We develop in silico machine learning tools to identify new potential pattern recognition receptors (PRRs) based on the expected characteristics of such genes, both in P. anserina and in other sequenced fungal genomes. This should help extend the concept of a fungal immune system to the whole fungal branch of the eukaryote phylogenetic tree.

European Initiatives

FP7 Projects

A major objective of the “post-genome” era is to detect, quantify and characterise all relevant human proteins in tissues and fluids in health and disease. This effort requires a comprehensive, characterised and standardised collection of specific ligand binding reagents, including antibodies, the most widely used such reagents, as well as novel protein scaffolds and nucleic acid aptamers. Currently there is no pan-European platform to coordinate systematic development, resource management and quality control for these important reagents.

Magnome is an associate partner of the FP7 “Affinity Proteome” project coordinated by Prof. Mike Taussig of the Babraham Institute and Cambridge University. Within the consortium, we participate in defining community standards for data representation and exchange, and we evaluate knowledge engineering tools for affinity proteomics data.

Collaborations with Major European Organizations

Prof. Mike Taussig: Babraham Institute & Cambridge University

Knowledge engineering for Affinity Proteomics

Henning Hermjakob: European Bioinformatics Institute

Standards and databases for molecular interactions

International Initiatives

Inria Associate Teams

Magnome participates in the CARNAGE associate team, coordinated by AMIB, with the Russian Academy of Sciences.

Inria International Partners

Declared Inria International Partners

AMAVI

Program: Inria International Partner

Title: Combinatorics and Algorithms for the Genomic sequences

Inria principal investigators: David Sherman

International Partner (Institution - Laboratory - Researcher):

Vavilov Institute of General Genetics (Russian Federation) - Department of Computational Biology - Vsevolod Makeev

Duration: 2010 - present

The VIGG and AMIB teams have collaborated on sequence analysis for more than 12 years. The two groups aim to identify DNA motifs for functional annotation, with a special focus on conserved regulatory regions. In the current three-year project CARNAGE, our collaboration, which includes the Inria team Magnome, is oriented towards new trends arising from Next Generation Sequencing data. Combinatorial issues in genome assembly are addressed, and RNA structure and interactions are also studied.

The toolkit consists of pattern matching algorithms and analytic combinatorics, leading to shared software.
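To make the idea of motif pattern matching concrete, here is a minimal sketch that scans a DNA sequence for a consensus motif written with IUPAC degenerate codes. It is a simple regular-expression scan for illustration only, not the algorithms or software developed in the collaboration; the function name and the example sequences are assumptions.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(sequence, consensus):
    """Return 0-based start positions of all matches, including overlaps."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Hypothetical sequence and motif, for illustration only.
positions = find_motif("ttgacgtcatgacgtaa", "TGACGT")
print(positions)
```

A scan like this only reports exact matches to the degenerate consensus on one strand; scoring matches with a position weight matrix or searching the reverse complement would be natural extensions.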

Informal International Partners

Magnome collaborates with Rodrigo Assar of the Universidad Andrés Bello, and with Nicolás Loira and Alessandro Maass of the Center for Genomic Regulation, in Santiago (Chile).

Participation in other International Programs

Magnome and the VIGG of the Russian Academy of Sciences (RAS) in Moscow are partners in a project funded by the CNRS and the RAS entitled “Séquençage profond de organismes biotechnologiques : des régulons à l'adaptation” (Deep sequencing of biotechnological organisms: from regulons to adaptation).

International Research Visitors

Visits of International Scientists

Vsevolod MAKEEV November 8-22 2013

Artëm KASIANOV November 8-22 2013

Internships

Joaquin FERNANDEZ September-November 2013

Dissemination

Scientific Animation

Pascal Durrens is:

leader of the “Comparative Genomics” theme and member of the Scientific Council of the LaBRI UMR 5800/CNRS

responsible for scientific diffusion for the Génolevures Consortium

member of the editorial board of the journal ISRN Computational Biology, and was a reviewer for the journal BMC Genomics

expert in Genomics for the Fonds de la Recherche Scientifique-FNRS (FRS-FNRS), Belgium

David Sherman is:

president of the Commission de Jeunes Chercheurs, Inria Bordeaux Sud-Ouest

member for Bordeaux Sud-Ouest of Inria's Young Scientists Mission

member of the editorial board of the journal Computational and Mathematical Methods in Medicine

Teaching - Supervision - Juries

Teaching

Licence: Anna Zhukova, J1MI2013: Algorithmes et Programmes (TD/TP), 30h, L2, Université Bordeaux, France

Supervision

PhD in progress: Anna Zhukova, “Knowledge engineering for biological networks,” 2011–, Sherman

PhD in progress: Razanne Issa, “Analyse symbolique de données génomiques,” 2010–, Sherman

Juries

David Sherman was a member of the juries of:

Natalia GOLENETSKAYA, “Addressing scaling challenges in comparative genomics,” U. Bordeaux, 2013-09-09

Boyang JI, “Comparative and Functional Genome Analysis of Magnetotactic Bacteria,” U. Aix-Marseille, 2013-10-23

Andres ARAVENA, “Probabilistic and constraint based modelling to determine regulation events from heterogeneous biological data,” U. Rennes, 2013-12-13

Popularization

Magnome participated in « UniThé ou Café » in the Inria Bordeaux – Sud-Ouest research center.

Anna Zhukova led one of the Inria workshops at the 2013 “Fête de la Science”.

David Sherman is a member of the Inria Bordeaux – Sud-Ouest's “Scientific Culture” committee, which organizes and proposes various scientific popularization actions.
