Section: Scientific Foundations
Discrete statistics and probability
At a lower level, our work relies on a basic background in discrete statistics and probability. Probabilistic models arise naturally in many of our research projects. When dealing with large input data sets, it is essential to distinguish features that are observed by chance from those that are biologically relevant. The aim is to introduce a probabilistic model and to use sound statistical methods to assess the significance of observations made on these data. Examples of such observations are the length of a repeated region, the number of occurrences of a motif (DNA or RNA), or the free energy of a conserved RNA secondary structure. Moreover, probabilistic models formulated in the Bayesian framework make it possible to bypass, through Markov chain Monte Carlo (MCMC) sampling, some of the limitations that result from complex integrations over the parameter space. MCMC sampling yields approximations of the posterior probability distributions over the parameters, which in turn allows us to describe more biologically relevant models. We apply these methods to the genome rearrangement application domain.
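As an illustration of assessing the significance of a motif count, the following minimal sketch estimates an empirical p-value by Monte Carlo simulation. It rests on assumptions that are not taken from our projects: the null model is a random shuffle of the sequence (which preserves letter composition only), overlapping occurrences are counted, and the function names `count_occurrences` and `empirical_p_value` are hypothetical. A realistic analysis would typically use a richer background model, for instance a Markov chain.

```python
import random


def count_occurrences(sequence, motif):
    """Count occurrences of motif in sequence, overlaps included."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Restart one position further so overlapping hits are counted.
        start = sequence.find(motif, start + 1)
    return count


def empirical_p_value(sequence, motif, n_samples=1000, seed=0):
    """Estimate P(count >= observed) under a shuffled-sequence null model.

    Assumption: shuffling the letters is an adequate background model.
    Returns the observed count and the estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = count_occurrences(sequence, motif)
    letters = list(sequence)
    at_least = 0
    for _ in range(n_samples):
        rng.shuffle(letters)
        if count_occurrences("".join(letters), motif) >= observed:
            at_least += 1
    # The +1 correction keeps the estimate strictly positive.
    return observed, (at_least + 1) / (n_samples + 1)


if __name__ == "__main__":
    seq = "ACGTTGCAACGTACGTTTGACGTA"  # toy sequence, illustrative only
    obs, p = empirical_p_value(seq, "ACGT")
    print(f"observed count = {obs}, empirical p-value = {p:.4f}")
```

A small p-value suggests that the motif occurs more often than composition alone would explain. The precision of the estimate is bounded by the number of samples, since the smallest attainable value is 1 / (n_samples + 1).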