Section: New Results

Parameter estimation for kinetic models of carbon metabolism in bacteria

Kinetic models capture the dynamics of the large and complex networks of biochemical reactions that endow bacteria with the capacity to adapt their functioning to changes in the environment. In comparison with the qualitative PL models described in Sections 6.1 and 6.3, these more general classes of ODE models are intended to provide a quantitative description of the network dynamics, at both the genetic and the metabolic level. New experimental techniques have led to the accumulation of large amounts of data, such as time-course measurements of metabolite, mRNA and protein concentrations and measurements of metabolic fluxes under different growth conditions. However, estimating the parameter values of kinetic models from these data remains particularly challenging in biology, for three main reasons: knowledge of the molecular mechanisms is incomplete; the observations are noisy, indirect, heterogeneous, and partial; and the systems are large, with dynamics on different time-scales. We have addressed parameter estimation in the context of the analysis of the interactions between metabolism and gene expression in carbon metabolism in E. coli.

In collaboration with Matteo Brilli and Daniel Kahn (INRA and Université Claude Bernard in Lyon), we have developed an approximate model of central metabolism of E. coli, using so-called linlog functions to describe the rates of the enzymatic reactions. More precisely, linlog models describe metabolic kinetics by means of a linear model of the logarithms of metabolite concentrations. We have used metabolome and transcriptome data sets from the literature to estimate the parameters of the linlog models, a task that is in principle greatly simplified by the mathematical form of these models. However, a major problem encountered during parameter estimation was the occurrence of missing data, due to experimental problems or instrument failures. In the framework of her PhD thesis, Sara Berthoumieux has addressed the missing-data problem by developing an iterative parameter estimation approach based on an Expectation-Maximization (EM) procedure. This approach, adapted from the statistical literature, has the advantage of being well-defined analytically and applicable to other kinds of linear regression problems with missing data. It has been tested in simulation experiments with missing data and performs well compared to basic and advanced regression methods.
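The idea of EM-based regression with missing data can be illustrated on a deliberately small case. The sketch below is not the method developed in the thesis: it assumes a single regressor x and a response y that are jointly Gaussian, with values missing completely at random and marked by None. All function names (em_bivariate, regression_line, simulate) and the simulated data are hypothetical and serve only as an illustration.

```python
import random


def em_bivariate(pairs, n_iter=500, tol=1e-10):
    """EM estimates (mx, my, sxx, syy, sxy) of a bivariate normal.

    `pairs` is a list of (x, y) tuples; None marks a missing value.
    Rows in which both values are missing carry no information and are dropped.
    """
    rows = [(x, y) for x, y in pairs if x is not None or y is not None]
    n = len(rows)
    xs = [x for x, _ in rows if x is not None]
    ys = [y for _, y in rows if y is not None]
    # Initial guess: moments of the observed values, no correlation.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / len(xs)
    syy = sum((y - my) ** 2 for y in ys) / len(ys)
    sxy = 0.0
    for _ in range(n_iter):
        # E-step: expected sufficient statistics given the current parameters.
        sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
        for x, y in rows:
            var_x = var_y = 0.0
            if x is None:
                x = mx + sxy / syy * (y - my)   # conditional mean of x given y
                var_x = sxx - sxy ** 2 / syy    # conditional variance of x
            elif y is None:
                y = my + sxy / sxx * (x - mx)   # conditional mean of y given x
                var_y = syy - sxy ** 2 / sxx    # conditional variance of y
            sum_x += x
            sum_y += y
            sum_xx += x * x + var_x
            sum_yy += y * y + var_y
            sum_xy += x * y
        # M-step: re-estimate the means and (co)variances.
        new = (
            sum_x / n,
            sum_y / n,
            sum_xx / n - (sum_x / n) ** 2,
            sum_yy / n - (sum_y / n) ** 2,
            sum_xy / n - (sum_x / n) * (sum_y / n),
        )
        change = max(abs(a - b) for a, b in zip(new, (mx, my, sxx, syy, sxy)))
        mx, my, sxx, syy, sxy = new
        if change < tol:
            break
    return mx, my, sxx, syy, sxy


def regression_line(mx, my, sxx, syy, sxy):
    """Intercept and slope of the regression of y on x implied by the moments."""
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(n=400, missing_rate=0.3, seed=0):
    """Synthetic data y = 1 + 2 x + noise, with x missing at random."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 0.1)
        pairs.append((None if rng.random() < missing_rate else x, y))
    return pairs


intercept, slope = regression_line(*em_bivariate(simulate()))
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Unlike regression restricted to complete rows, this estimate also uses the rows in which x is missing, since their y values still inform the mean and variance of y. A realistic linlog application would involve several regressors per reaction, which this two-variable sketch does not cover.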

On the biological side, we have applied the method to a linlog model of central metabolism in Escherichia coli, consisting of some 23 variables. We estimated the 100 parameters of this model from a high-throughput dataset published in the literature. The data consist of measurements of metabolic fluxes and of metabolite and enzyme levels in glucose-limited chemostat cultures under 29 different conditions, including wild-type and single-gene mutant strains and different dilution rates. Standard linear regression is difficult to apply in this case because of missing data: for 7 reactions, so many data points have to be discarded that the remaining dataset is smaller than the number of parameters to estimate. Our approach makes it possible to compute reasonable estimates for most of the identifiable model parameters even when regression is inapplicable. The method and its application to the linlog model of central metabolism in E. coli are the subject of a paper accepted for the ISMB/ECCB conference this year and published in a special issue of Bioinformatics [3]. Sara Berthoumieux received the Ian Lawson Van Toch Memorial Award for outstanding student paper at ISMB/ECCB. As a continuation of this work, we are currently preparing a journal paper on the identifiability of linlog models for submission.
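The difficulty with standard regression can be made concrete with a short sketch. It assumes a commonly used linlog form in which the rate of a reaction is the enzyme level multiplied by a term that is linear in the logarithms of the metabolite concentrations. This form, the function names, and the toy numbers are assumptions for illustration only, and are not taken from the model or the dataset described above.

```python
import math


def linlog_rate(enzyme, a, b, concentrations):
    """Assumed linlog rate: enzyme * (a + sum_j b[j] * ln(x[j]))."""
    return enzyme * (a + sum(bj * math.log(xj) for bj, xj in zip(b, concentrations)))


def regression_rows(fluxes, enzymes, concentrations):
    """Build one (regressors, response) row per experimental condition.

    The response is flux / enzyme and the regressors are log-concentrations,
    so the parameters a and b enter linearly. None marks a missing measurement.
    """
    rows = []
    for v, e, xs in zip(fluxes, enzymes, concentrations):
        response = None if v is None or e is None else v / e
        regressors = [None if x is None else math.log(x) for x in xs]
        rows.append((regressors, response))
    return rows


def complete_rows(rows):
    """Rows usable by ordinary least squares: nothing missing."""
    return [
        (regs, y)
        for regs, y in rows
        if y is not None and all(r is not None for r in regs)
    ]


def is_underdetermined(rows, n_params):
    """True if fewer complete rows remain than parameters to estimate."""
    return len(complete_rows(rows)) < n_params


# Toy reaction with two metabolites, hence three parameters (a, b[0], b[1]).
rows = regression_rows(
    fluxes=[1.0, None, 2.0, 1.5],
    enzymes=[1.0, 1.0, 2.0, 1.0],
    concentrations=[[1.0, 2.0], [1.0, 1.0], [None, 3.0], [2.0, 2.0]],
)
print(len(complete_rows(rows)), is_underdetermined(rows, n_params=3))
```

In this toy case, a single missing value is enough to discard a whole condition, so four conditions leave only two usable rows for three parameters. An approach that keeps the partially observed rows avoids that loss.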

A second line of work is based on the use of classical kinetic models that are, in comparison with the above-mentioned linlog models, much reduced in scope (the focus is on the metabolic and genetic regulation of the glycolysis pathway) and granularity (individual reactions are lumped together). The models, developed by Delphine Ropers, have been calibrated using gene expression measurements from the experimental side of the IBIS group and metabolic measurements from the group of Jean-Charles Portais at INSA in Toulouse. The model with the estimated parameter values is currently being tested and used to understand some key mechanisms in the adaptation of E. coli to the exhaustion of glucose. The PhD theses of Stéphane Pinhal and Valentin Zulkower, which started at the end of this year, will further develop these research directions.