##### BIGS - 2022

2022
Activity report
Project-Team
BIGS
RNSR: 200920955T
Research center
In partnership with:
CNRS, Université de Lorraine
Team name:
Biology, genetics and statistics
In collaboration with:
Institut Elie Cartan de Lorraine (IECL)
Domain
Digital Health, Biology and Earth
Theme
Modeling and Control for Life Sciences
Creation of the Project-Team: 2011 January 01

# Keywords

• A3.1. Data
• A3.2. Knowledge
• A3.2.3. Inference
• A3.3. Data and knowledge analysis
• A3.3.1. On-line analytical processing
• A3.3.2. Data mining
• A3.3.3. Big data analysis
• A3.4.1. Supervised learning
• A3.4.2. Unsupervised learning
• A3.4.4. Optimization and learning
• A3.4.7. Kernel methods
• A6. Modeling, simulation and control
• A6.1. Methods in mathematical modeling
• A6.1.2. Stochastic Modeling
• A6.2. Scientific computing, Numerical Analysis & Optimization
• A6.2.3. Probabilistic methods
• A6.2.4. Statistical methods
• A6.4. Automatic control
• A6.4.2. Stochastic control
• B1. Life sciences
• B1.1. Biology
• B1.1.2. Molecular and cellular biology
• B1.1.10. Systems and synthetic biology
• B1.1.11. Plant Biology
• B2.2. Physiology and diseases
• B2.2.1. Cardiovascular and respiratory diseases
• B2.2.3. Cancer
• B2.3. Epidemiology
• B2.4. Therapies

# 1 Team members, visitors, external collaborators

## Research Scientists

• Nicolas Champagnat [Team leader, INRIA, Senior Researcher, HDR]
• Coralie Fritsch [INRIA, Researcher]
• Ulysse Herbach [INRIA, Researcher]
• Bruno Scherrer [INRIA, Researcher, HDR]

## Faculty Members

• Thierry Bastogne [UL, Associate Professor, HDR]
• Sandie Ferrigno [UL, Associate Professor]
• Anne Gégout-Petit [UL, Professor, HDR]
• Jean-Marie Monnez [UL, Emeritus, HDR]
• Aurélie Muller-Gueudin [UL, Associate Professor]
• Sophie Mézières [UL, Associate Professor]
• Pierre Vallois [UL, Emeritus, HDR]
• Denis Villemonais [UL, Associate Professor, HDR]

## Post-Doctoral Fellows

• Léo Darrigade [INRIA, until Sep 2022]
• Joseph Lam-Weil [UL, until Jun 2022]
• William Oçafrain [INRIA, until Apr 2022]

## PhD Students

• Virgile Brodu [ENS de Lyon, from Sep 2022]
• Vincent Hass [UL]
• Rodolphe Loubaton [UL]
• Anouk Rago [UL]
• Nino Vieillard [Google Brain, CIFRE, until Jun 2022]
• Nicolás Zalduendo Vidal [INRIA]

## Technical Staff

• Walid Laziri [INRIA, Engineer, from Oct 2022]

## Interns and Apprentices

• Virgile Brodu [ENS de Lyon, from Apr 2022 until Aug 2022]

• Emmanuelle Deschamps [INRIA]

# 2 Overall objectives

BIGS is a joint team of Inria, CNRS and University of Lorraine, within the Institut Élie Cartan of Lorraine (IECL), UMR 7502 CNRS-UL laboratory in mathematics, of which Inria is a strong partner. One member of BIGS, T. Bastogne, comes from the Research Center of Automatic Control of Nancy (CRAN), with which BIGS has strong relations in the domain “Health-Biology-Signal”. Our research is mainly focused on stochastic modeling and statistics, but also aims at a better understanding of biological systems. BIGS involves applied mathematicians whose research interests mainly concern probability and statistics. More precisely, our work focuses on (1) stochastic modeling, (2) estimation and control for stochastic processes, (3) regression and machine learning, and (4) statistical learning and application in health. The main objective of BIGS is to exploit these skills in applied mathematics to provide a better understanding of issues arising in life sciences, with a special focus on (1) tumor growth and heterogeneity, (2) gene networks, (3) telomere length dynamics, and (4) epidemiology and e-health.

# 3 Research program

## 3.1 Introduction

We give here the main lines of our research. For clarity, we have chosen to organise them into four items. All of these items deal with stochastic modeling and inference, and are therefore interconnected.

## 3.2 Stochastic modeling

Our aim is to propose relevant stochastic frameworks for modeling and understanding biological systems. Stochastic processes are particularly suitable for this purpose. Among them, Markov processes provide a first framework for modeling populations of cells 86, 66. Piecewise deterministic processes are non-diffusion processes that are also frequently used in the biological context 54, 65, 55. Among Markov models, we have developed strong expertise in processes derived from Brownian motion and stochastic differential equations 80, 64. For instance, knowledge about Brownian or random walk excursions 85, 79 helps to analyse genetic sequences and to develop inference about them. We also have strong expertise in stochastic modeling of complex biological populations using individual-based models. These models can be used either from the point of view of asymptotic stochastic analysis 51, e.g. to study the long term Darwinian evolution of populations, or from the point of view of numerical analysis of biological phenomena 60, 40. We also develop mathematical tools for the analysis of the long-time behavior of stochastic population processes accounting for possible extinction of (sub)populations 52.

## 3.3 Estimation and control for stochastic processes

We develop inference methods for the stochastic processes that we use for modeling. Control of stochastic processes is also a way to optimise the administration (dose, frequency) of a therapy, such as targeted therapies in cancer. Our team has solid expertise in the inference of the jump rate and the kernel of piecewise-deterministic Markov processes (PDMP) 45, 41, 44, 43, but many directions remain open. For instance, previous work assumed complete observation of jumps and mode, which is unrealistic in practice. We therefore also tackle the problem of inference for “hidden PDMPs”. For example, in pharmacokinetics modeling, we want to account for the presence of timing noise and for identification from longitudinal data. We have expertise on these subjects 46, and we also use mixed models to estimate tumor growth or heterogeneity 47.

We consider the control of stochastic processes within the framework of Markov Decision Processes 77 and their generalization known as multi-player stochastic games, with a particular focus on infinite-horizon problems. In this context, we are interested in the complexity analysis of standard algorithms, as well as the proposition and analysis of approximate numerical schemes for large problems in the spirit of 49. Regarding complexity, a central topic of research is the analysis of the Policy Iteration algorithm, on which significant progress has been made in recent years 88, 76, 62, 83, but which is still not fully understood. For large problems, we have extensive experience with the sensitivity analysis of approximate dynamic programming algorithms for Markov Decision Processes 81, 69, 82, and we are currently investigating whether and how similar ideas may be adapted to multi-player stochastic games.
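
To fix ideas about the Policy Iteration algorithm mentioned above, here is a minimal sketch on a discounted, infinite-horizon MDP. The two-state, two-action MDP, the discount factor and all function names are hypothetical and chosen only for illustration; policy evaluation is done by simple fixed-point iteration rather than by solving the linear system exactly, which is an assumption of this sketch and not a description of the schemes studied by the team.

```python
# Hypothetical toy MDP (not taken from the report).
# P[a][s][t]: probability of moving from state s to state t under action a.
# R[a][s]: expected immediate reward for taking action a in state s.
P = [
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.0, 1.0]],   # action 1
]
R = [
    [0.0, 1.0],                 # action 0
    [0.5, 2.0],                 # action 1
]
GAMMA = 0.9                     # discount factor (assumed)


def q_value(P, R, v, s, a, gamma):
    """One-step lookahead value of action a in state s given values v."""
    return R[a][s] + gamma * sum(p * v[t] for t, p in enumerate(P[a][s]))


def evaluate(P, R, policy, gamma, tol=1e-10):
    """Approximate the value of a fixed policy by fixed-point iteration."""
    v = [0.0] * len(policy)
    while True:
        new_v = [q_value(P, R, v, s, policy[s], gamma) for s in range(len(policy))]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def policy_iteration(P, R, gamma, margin=1e-9):
    """Alternate policy evaluation and greedy improvement until stable."""
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states
    while True:
        v = evaluate(P, R, policy, gamma)
        improved = list(policy)
        for s in range(n_states):
            for a in range(n_actions):
                # Switch only on a strict improvement, to avoid cycling on ties.
                if q_value(P, R, v, s, a, gamma) > q_value(P, R, v, s, improved[s], gamma) + margin:
                    improved[s] = a
        if improved == policy:
            return policy, v
        policy = improved


policy, v = policy_iteration(P, R, GAMMA)
print(policy, v)
```

The complexity questions mentioned above concern the number of evaluation and improvement rounds that such a loop needs before the policy stops changing.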

## 3.4 Algorithms and estimation for graph data

Recently, our group has focused its attention on modeling and inference for graph data. A graph data structure consists of a set of nodes, together with a set of pairs of these nodes called edges. This type of data is frequently used in biology because it provides a mathematical representation of many concepts, such as biological networks of relationships in a population or between genes in a cell.

Network inference is the process of making inference about the link between two variables, by taking into account the information about other variables. 87 gives a very good introduction and many references about network inference and mining. Many methods are available to infer and test edges in Gaussian graphical models 87, 71, 59, 61. However, the Gaussian assumption does not hold when dealing with typical “zero-inflated” abundance data, and we want to develop inference in this case.

Concerning gene networks, most studies have been based on population-averaged data: now that technologies enable us to observe mRNA levels in individual cells, a revolution in terms of precision, the network reconstruction problem paradoxically becomes more challenging than ever. Indeed, the traditional way of seeing a gene regulatory network as a deterministic system with some small external noise is being challenged by the probabilistic, bursty nature of gene expression revealed at single-cell level. Our objective is to propose dynamical models and inference methods that fully exploit the particular time structure of single-cell data. We described a promising strategy in which the network inference problem is seen as a calibration procedure for a new PDMP model that is able to acceptably reproduce real single-cell data 63, 78.

Among graphs, trees play a special role because they offer a powerful model for many biological concepts, from RNA to phylogenetic trees in heterogeneous tumors or through plant structures. Our research deals with several aspects of tree data. In particular, we work on statistical inference for this type of data under a given stochastic model. We also work on lossy compression of trees via directed acyclic graphs. These methods enable us to compute distances between tree data faster than from the original structures and with a high accuracy.

## 3.5 Regression and machine learning

Regression models and machine learning aim at inferring statistical links between a variable of interest and covariates. In biological studies, it is always important to develop adapted learning methods both in the context of standard data and also for data of high dimension (sometimes with few observations) and very massive or online data.

Many methods are available to estimate conditional quantiles and test dependencies 75, 67. Among them, we have developed nonparametric estimation by local analysis via kernel methods 57, 58, and we want to study the properties of this estimator in order to derive a measure of risk based, e.g., on confidence bands and tests. We also study other regression models, such as survival analysis and spatio-temporal models with covariates. Among the multiple regression models, we want to develop omnibus tests that examine several assumptions together.

Concerning the analysis of high dimensional data, our view on the topic relies on the French data analysis school, specifically on Factorial Analysis. In this context, stochastic approximation is an essential tool  68, which allows one to approximate eigenvectors in a stepwise manner  73, 72, 74. We aim at performing accurate classification or clustering by taking advantage of the possibility of updating the information "online" using stochastic approximation algorithms  50. We focus on several incremental procedures for regression and data analysis like linear and logistic regressions and PCA (Principal Component Analysis).

We also focus on the biological context of high-throughput bioassays in which several hundreds or thousands of biological signals are measured for a posterior analysis. We have to account for the inter-individual variability within the modeling procedure. We aim at developing a new solution based on an ARX (Auto Regressive model with eXternal inputs) model structure using the EM (Expectation-Maximisation) algorithm for the estimation of the model parameters.

# 4 Application domains

## 4.1 Oncology: tumor growth and heterogeneity

We want to propose stochastic processes to model the appearance of mutations and the evolution of their frequencies in tumor samples, through new collaborations with clinicians who measure a particular quantity called circulating tumor DNA (ctDNA). The final purpose is to use ctDNA as an early biomarker of resistance to a targeted therapy: this is the aim of the project funded by ITMO Cancer that we coordinate. In the ongoing work on low-grade gliomas, a local database of 400 patients will soon be available to construct models. We plan to extend it through national and international collaborations (Montpellier CHU, Montreal CRHUM). Our aim is to build a decision-aid tool for personalised medicine.

## 4.2 Gene networks and single-cell data

We already mentioned in Section 3.4 our interest in the modeling and inference of transcriptomic bursting in gene regulatory networks from single-cell data. We are also currently working on the prediction and identification of therapeutic targets for chronic lymphocytic leukemia from gene expression data. Our goal is to propose new models that make it possible to predict the outcome of gene silencing experiments. Inference will be performed on gene expression data from cells of patients suffering from different forms of chronic lymphocytic leukemia. The goal is to identify therapeutic targets which could be silenced to reduce cell proliferation.

## 4.3 Epidemiology and e-health

In the context of personalized medicine, we have many ongoing projects with CHU Nancy. They deal with biomarkers research, prognostic value of quantitative variables and events, scoring, and adverse events. We also want to develop our expertise in rupture detection in a project with APHP (Assistance Publique Hôpitaux de Paris) for the detection of adverse events, earlier than the clinical signs and symptoms. The clinical relevance of predictive analytics is obvious for high-risk patients such as those with solid organ transplantation or severe chronic respiratory disease for instance. The main challenge is the rupture detection in multivariate and heterogeneous signals (for instance daily measures of electrocardiogram, body temperature, spirometry parameters, sleep duration, etc.). Other collaborations with clinicians concern foetopathology and we want to use our work on conditional distribution function to explain fetal and child growth. To that end, we use data from the “Service de fœtopathologie et de placentologie” of the “Maternité Régionale Universitaire” (CHU Nancy).

## 4.4 Dynamics of telomeres

Telomeres are disposable buffers at the ends of chromosomes which are truncated during cell division, so that the telomere ends become shorter over time with each cell division. In this way, they are markers of aging. Through a collaboration with Pr A. Benetos, geriatrician at CHU Nancy, we recently obtained data on the distribution of the length of telomeres from blood cells 84. We want to work in three connected directions: (1) refine the methodology for the analysis of the available data; (2) propose a dynamical model for the lengths of telomeres and study its mathematical properties (long term behavior, quasi-stationarity, etc.); and (3) use these properties to develop new statistical methods.

# 5 New software and platforms

## 5.1 New software

### 5.1.1 quantCurves

• Keyword:
Statistical modeling
• Functional Description:
Non-parametric methods such as local normal regression, local polynomial regression and penalized cubic B-spline regression are used to estimate quantile curves.
• News of the Year:
Software developed in 2022.
• Contact:
Sandie Ferrigno

### 5.1.2 Harissa

• Name:
Hartree approximation for inference along with a stochastic simulation algorithm
• Keywords:
Gene regulatory networks, Reverse engineering, Molecular simulation
• Functional Description:
Harissa is a Python package for both inference and simulation of gene regulatory networks, based on stochastic gene expression with transcriptional bursting. It was implemented in the context of a mechanistic approach to gene regulatory network inference from single-cell data.
• News of the Year:
New version in 2022 associated with an application paper (hal-03942224).
• Contact:
Ulysse Herbach

# 6 New results

## 6.1 Stochastic modeling

Participants: Virgile Brodu, Nicolas Champagnat, Léo Darrigade, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, William Oçafrain, Pierre Vallois, Denis Villemonais, Nicolás Zalduendo Vidal.

### 6.1.1 Quasi-stationary distributions

We are continuing our research on quasi-stationary distributions (QSD), that is, distributions of Markov stochastic processes with absorption, which are stationary conditionally on non-absorption. For models of biological populations, absorption usually corresponds to extinction of a (sub-)population. QSDs are fundamental tools to describe the population state before extinction and to quantify the large-time behavior of the probability of extinction.
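
In symbols, and with notation introduced here only for illustration, let $(X_t)_{t \ge 0}$ be the process and $\tau_\partial$ its absorption time. A probability measure $\nu$ on the non-absorbed states is a QSD if, for all $t \ge 0$ and all measurable sets $A$,

$$\mathbb{P}_\nu(X_t \in A \mid t < \tau_\partial) = \nu(A).$$

A classical consequence is that, started from a QSD, the absorption time is exponentially distributed: there exists $\lambda_0 \ge 0$ such that $\mathbb{P}_\nu(t < \tau_\partial) = e^{-\lambda_0 t}$ for all $t \ge 0$, which is how QSDs quantify the large-time decay of the survival probability.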

Thanks to the previous general result of the team in 53, together with B. Cloez (INRAE), we proved in 26 the exponential convergence of a chemostat model, whose dynamics are highly degenerate due to a deterministic part, towards a unique quasi-stationary distribution.

We also finalized an important work 7 that provides general criteria for the exponential convergence of conditional distributions of absorbed Markov processes when the convergence is not uniform with respect to the initial distribution. Our results allow us to characterize a large subset of the domain of attraction of the minimal QSD and apply to a large range of stochastic processes, including diffusion processes and perturbed dynamical systems. We completed this work with specific studies of the periodic case in 24 and the case of reducible processes in 25. In this last work, we were in particular able to characterize cases with polynomial speed of convergence to a QSD and to prove the existence of a QSD for general processes in denumerable state spaces, assuming only aperiodicity, the existence of a Lyapunov function and the existence of a point in the state space from which the return time is finite with positive probability.

Motivated by our work with M. Benaïm (Univ. Neuchâtel) on degenerate processes such as hypoelliptic diffusions 48, we studied in 21 the links between Feller properties and quasi-compactness of general semigroups. This work clarifies the links between existing results on QSDs for hypoelliptic diffusions. We also provided in 6 a counterexample to the uniqueness of a quasi-stationary distribution for a diffusion process which satisfies the weak Hörmander condition.

In collaboration with A. Watson (Univ. College London, UK), we studied a general fragmentation-growth equation with unbounded rates. The work is decomposed into two parts: in the first part, it is shown that the equation admits a unique solution on a certain definition space; in the second part, spectral properties of the solution are established 36. The proofs are largely based on 7 and on fine properties of piecewise deterministic Markov processes.

In 33, we obtained a central limit theorem and Berry-Esseen estimates for Markov processes conditioned to never be absorbed (so-called $Q$-processes) under the general conditions of 7. We also obtained in 34 quasi-ergodic theorems for particular time-inhomogeneous Markov processes, whose time-inhomogeneity is asymptotically periodic, with applications to processes absorbed at a moving boundary.

### 6.1.2 Adaptive dynamics in biological populations

We continued our study of parameter scalings of individual-based models of biological populations under mutation and selection, taking into account the influence of negligible but non-extinct populations. In a work within the ERC SINGER, in collaboration with S. Méléard (École Polytechnique), S. Mirrahimi (Univ. Montpellier) and V.C. Tran (Univ. Paris Est Marne-la-Vallée) 23, we were able to give an individual-based justification of the Hamilton-Jacobi equation of adaptive dynamics (see e.g.  70), with a specific parameter scaling that is promising for the study of local (in space) extinction of sub-populations. The analysis of models allowing for such an extinction is the next step of this project.

We also worked on general evolutionary models of adaptive dynamics under an assumption of large population and small mutations. This year, we obtained in 22 existence, uniqueness and ergodicity results for a centered version of the Fleming-Viot process of population genetics, which is a key step to recover variants of the canonical equation of adaptive dynamics, which describes the long time evolution of the dominant phenotype in the population, under less stringent biological assumptions than in previous works. We plan to complete this project next year.

### 6.1.3 Multi-type bisexual branching process and branching processes with Moran interaction

Asexual multi-type Galton-Watson branching processes, as well as single-type bisexual processes, have been studied in the literature. In particular, survival conditions for these processes are well known in both cases. However, until now, multi-type bisexual branching processes have only been studied in very specific situations and no general mathematical description has been established yet.

In 31, we studied general multi-type bisexual branching processes with superadditive mating function. We exhibited a necessary and sufficient condition for almost sure extinction, we proved a law of large numbers for our model and we studied the long-time convergence of the rescaled process.

In collaboration with E. Horton (Inria ASTRAL team) and A. Cox (Univ. of Bath, UK), we proposed a new population size model that gathers branching processes and processes with Moran-type interactions in the same framework. We studied this process in a setting of large population and long time 27. Its dynamics are related to the evolution of a non-conservative semigroup, whose spectral properties provide information on the long time behavior of the model.

### 6.1.4 Reconstruction of epigenetic landscapes from single-cell data

Joint work with E. Ventre (ENS Lyon), T. Espinasse (Univ. Lyon 1), G. Benoit (Univ. Rennes 1) and O. Gandrillon (ENS Lyon).

The aim is to better understand how living cells make decisions (e.g., differentiation of a stem cell into a particular specialized type), seeing decision-making as an emergent property of an underlying complex molecular network. Indeed, it is now proven that cells react probabilistically to their environment: cell types do not correspond to fixed states, but rather to “potential wells” of a certain energy landscape (representing the energy of the possible states of the cell) that we are trying to reconstruct. The achievement of this year is to show that the same mathematical model driven by transcriptional bursting can be used simultaneously as an inference tool, to reconstruct biologically relevant networks, and as a simulation tool, to generate realistic transcriptional profiles emerging from gene interactions 35.

### 6.1.5 Modeling of chronic obstructive pulmonary disease

Joint work with I. Dupin (Univ. Bordeaux), E. Maurat (Univ. Bordeaux) and J.-M. Sac-Epée (IECL).

Lung exposure to various types of particles, such as those present in cigarette smoke, can lead to chronic obstructive pulmonary disease (COPD). COPD bronchi are an area of intense immunological activity and tissue remodeling, as evidenced by the extensive immune cell infiltration and changes in tissue structures. This allows persistent contact between resident cells and stimulated immune cells. Our hypothesis is that the contact between cells is a major cause of chronic destructive or fibrotic manifestations. We aim to analyze the potential cell-cell interactions in situ in human tissues, to characterize in vitro the dynamics of the interplay, and to define a computational model with intercellular interactions which fits experimental measurements and explains the macroscopic properties of cell populations. The effects of potential therapeutic drugs modulating local intercellular interactions will be tested by simulations. Two papers have been submitted this year 29, 30.

### 6.1.6 Numerical simulation of diffusions

In a collaboration with A. Lejay (Inria PASTA team) and their PhD student A. Anagnostakis, D. Villemonais proposed a method for approximating general, singular diffusions by discrete time and state space processes 1. One of its main advantages over existing methods is that the main computational cost is incurred upstream and is thus a fixed cost, independent of the number of simulations performed afterwards.

### 6.1.7 A toy model of an animal foraging

We considered a toy model of an animal foraging in a one-dimensional space. The position of the animal as time $t$ elapses is described by a standard Brownian motion. To survive, it needs one unit of food per unit of time, and it may stockpile any extra supply for future use, without any upper limit on the size of the stock nor any expiry date for its consumption. As for the provision of food, we assume that only half of the space is initially filled with one unit of food per unit length, and that there is no replenishment. We studied this model in  14, determining in particular the joint distribution of the position and time of death of the animal.

## 6.2 Regression and machine learning

Participants: Sandie Ferrigno, Jean-Marie Monnez.

### 6.2.1 Cramér–von Mises goodness-of-fit tests in regression models

Joint work with R. Azaïs (Inria, ENS Lyon) and M.-J. Martinez (Univ. Grenoble Alpes).

Many goodness-of-fit tests have been developed to assess the different assumptions of a (possibly heteroscedastic) regression model. Most of them are ‘directional’ in that they detect departures from a given assumption of the model. Other tests are ‘global’ (or ‘omnibus’) in that they assess whether a model fits a dataset on all its assumptions. We focus on the task of choosing the structural part of the regression function because it contains easily interpretable information about the studied relationship. We consider two nonparametric ‘directional’ tests and one nonparametric ‘global’ test, all based on generalizations of the Cramér-von Mises statistic.

To perform these goodness-of-fit tests, we have developed the R package cvmgof 42, an easy-to-use tool for practitioners, available from the Comprehensive R Archive Network (CRAN). The use of the library is illustrated through a tutorial on real data, and simulation studies are carried out in order to show how the package can be exploited to compare the three implemented tests. The practitioner can also easily compare the test procedures with different kernel functions, bootstrap distributions, numbers of bootstrap replicates, or bandwidths. The package was updated last year and is now in its third version. An article 2 on this work was published in 2022.

We are now working on nonparametric tests associated with the functional form of the variance of the regression model. For this, we continue to work on the global test of Ducharme and Ferrigno in order to compare its performance with directional tests associated with the variance of the model. Many simulations are in progress and part of this work was presented at the CMStatistics 2022 conference 37. This will also make it possible to propose a more general package-type tool for validating the regression models used in practice.

To complete this work, we plan to assess the other assumptions of a regression model such as the additivity of the random error term. The implementation of these directional tests would enrich the cvmgof package and offer a complete easy-to-use tool for validating regression models. Moreover, the assessment of the overall validity of the model when using several directional tests will be compared with that done when using only a global test. In particular, we will discuss the well-known problem of multiple testing by comparing the results obtained from multiple test procedures with those obtained when using a global test strategy.

### 6.2.2 Online data analysis and online learning

Stochastic approximation is an important tool for the analysis of streaming data, introduced by Robbins and Monro in 1951, that can be used for example to estimate online the parameters of a regression function 56 or the centers of clusters in unsupervised classification 50. Another type of stochastic approximation process was introduced by Benzécri in 1969 for estimating eigenvectors and eigenvalues of the $M$-symmetric expectation of a random matrix $A$ using independent observations of $A$. In all these processes, it is assumed that independent observations of the random matrix are available and that one observation or a mini-batch of observations is taken into account at each step. We are interested in the study of cases where independent observations are not available and cases where, at each step, all the observations up to this step are taken into account without having to store them. Experiments we have conducted show that this second type of process generally converges faster than the first type with a mini-batch of observations at each step.

On this topic, our works with E. Albuisson (CHRU Nancy) on constrained binary logistic regression with online standardized data 13 and on the construction and update of an online ensemble score involving linear discriminant analysis and logistic regression 12, described in the previous activity report, are now published.

In the article 32, we establish an almost sure convergence theorem of two stochastic approximation processes for estimating eigenvectors of the unknown $Q$-symmetric expectation $B$ of a random matrix: the first one is the classical process proposed by Oja and the second one makes use of past and current observations at each step. This theorem extends previous theorems to the case where the metric $Q$ is unknown and estimated online in parallel. We apply these results to streaming PCA and streaming generalized canonical correlation analysis of a random vector $Z$, both when a mini-batch of observations of $Z$ is used at each step or when all the observations up to the current step are used. Other applications are in progress, such as streaming multiple factor analyses and streaming partial PCA, as well as convergence theorems of Robbins-Monro type processes.
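
As a rough illustration of the classical Oja process mentioned above, the sketch below estimates the leading eigenvector of a covariance matrix from a stream, using one observation per step. It is a simplified setting, not the one of the article: the metric is the identity (no online estimation of $Q$), the data are centered, and the synthetic stream, step-size sequence and function names are all assumptions made for the example.

```python
import math
import random


def oja_top_eigenvector(stream, dim, step=lambda n: 1.0 / (n + 10)):
    """Oja's process: w <- normalize(w + a_n * x * (x^T w)), one observation per step."""
    w = [1.0] + [0.0] * (dim - 1)                 # arbitrary unit starting vector
    for n, x in enumerate(stream, start=1):
        proj = sum(xi * wi for xi, wi in zip(x, w))            # x^T w
        w = [wi + step(n) * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]               # keep w on the unit sphere
    return w


def synthetic_stream(n_obs, seed=0):
    """Centered 2-D Gaussian stream (hypothetical data).

    Standard deviation 3 along (1, 1)/sqrt(2) and 0.5 along (-1, 1)/sqrt(2),
    so the leading eigenvector of the covariance is (1, 1)/sqrt(2).
    """
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    for _ in range(n_obs):
        a, b = 3.0 * rng.gauss(0.0, 1.0), 0.5 * rng.gauss(0.0, 1.0)
        yield [s * (a - b), s * (a + b)]


w = oja_top_eigenvector(synthetic_stream(5000), dim=2)
print(w)  # expected to be close to +/- (0.707, 0.707), up to sign
```

Each observation is used once and then discarded, which is what makes this kind of process suitable for streaming data.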

## 6.3 Statistical learning and application in health

Participants: Nicolas Champagnat, Sandie Ferrigno, Anne Gégout-Petit, Aurélie Gueudin, Walid Laziri, Rodolphe Loubaton, Sophie Mézières, Jean-Marie Monnez, Anouk Rago, Pierre Vallois.

### 6.3.1 Estimation of reference curves for fetal weight

Our research in the field of epidemiology focuses on fetal development in the last two trimesters of pregnancy. Reference or standard curves are required in this kind of biomedical problem. Values that lie outside the limits of these reference curves may indicate the presence of a disorder. The data come from the French EDEN mother-child cohort (INSERM), a study investigating the prenatal and early postnatal determinants of child health and development. A total of 2002 pregnant women were recruited before 24 weeks of amenorrhoea in two maternity clinics from middle-sized French cities (Nancy and Poitiers). From May 2003 to September 2006, 1899 newborns were then included. The main outcomes of interest are fetal (via ultrasound) and postnatal growth, adiposity development, respiratory health, atopy, behaviour and bone, cognitive and motor development. We are studying fetal weight and height as a function of gestational age in the third trimester of pregnancy. Some classical empirical and parametric methods such as polynomial regression are first used to construct these curves. For instance, polynomial regression is one of the most common parametric approaches for modeling growth data, especially during the prenatal period. However, these classical methods build upon restrictive assumptions on the estimated curves. We therefore propose to work with semi-parametric LMS methods, by modifying the response variable (fetal weight) with, among others, Box–Cox transformations. An article detailing these methodologies applied to the EDEN data should be submitted next year.
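
For reference, in the usual formulation of the LMS method (due to Cole and Green; the notation below is ours and the exact variant used on the EDEN data may differ), the response $Y$ at gestational age $t$ is summarised by three smooth curves: the Box–Cox power $L(t)$, the median $M(t)$ and the coefficient of variation $S(t)$. The transformed variable

$$Z = \frac{(Y / M(t))^{L(t)} - 1}{L(t)\, S(t)} \quad \text{if } L(t) \neq 0, \qquad Z = \frac{\ln(Y / M(t))}{S(t)} \quad \text{if } L(t) = 0,$$

is assumed to be approximately standard normal, so that the centile curve of order $\alpha$ is

$$C_\alpha(t) = M(t)\,\bigl(1 + L(t)\, S(t)\, z_\alpha\bigr)^{1/L(t)} \quad (L(t) \neq 0),$$

where $z_\alpha$ is the standard normal quantile of order $\alpha$.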

Alternative nonparametric methods such as Nadaraya-Watson kernel estimation, local polynomial estimation, B-splines or cubic splines are also developed to construct these curves. The practical implementation of these methods requires working on smoothing parameters or the choice of knots for the different types of nonparametric estimation. In particular, an optimal choice of these parameters has been proposed. To fit these curves, we have developed the R package quantCurves 39, an easy-to-use tool for practitioners. In addition, a graphical user interface (GUI) intended for practitioners is being developed to enable intuitive visualization of the results given by the package, and an article is in progress.
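
To make the role of the smoothing parameter concrete, here is a minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel. It estimates a conditional mean rather than the quantile curves targeted by quantCurves, it is written in Python rather than R, and the function names and toy inputs are hypothetical; it does not reproduce the package's interface.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of the responses ys around the point x0.

    A larger bandwidth averages over more neighbours (smoother curve, more
    bias); a smaller one follows the data more closely (more variance).
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights vanish; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Toy inputs (hypothetical, not EDEN data): the design points are symmetric
# around x0 = 1 and the responses are linear, so the estimate at 1 equals 1.
print(nadaraya_watson(1.0, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0], bandwidth=0.5))
```

Evaluating the function on a grid of values of `x0` gives the fitted curve; repeating this for several bandwidths shows the smoothing trade-off that a data-driven choice of parameters has to resolve.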

### 6.3.2 Prediction of silencing experiments on gene networks for chronic lymphocytic leukemia

We are working with L. Vallat (CHRU Strasbourg) on the inference of dynamical gene networks from RNAseq and proteome data. The goal is to infer a model of gene expression that can predict gene expression in cells where the expression of specific genes is silenced (e.g. using siRNA), in order to select the silencing experiments that are most likely to reduce cell proliferation. We expect the selected genes to provide new therapeutic targets for the treatment of chronic lymphocytic leukemia. This year, we have developed a package for the statistical analysis of temporal gene expression datasets with several biological conditions (in particular for exploratory analysis and the detection of differentially expressed genes), which will soon be submitted to Bioconductor.

### 6.3.3 Covariates selection in high-dimensional data of genetic profiles in oncology

Joint work with T. Boukhobza and H. Dumond from CRAN, and B. Bastien from the biopharmaceutical company Transgene.

We proposed a new methodology for selecting and ranking covariates associated with a variable of interest in the context of high-dimensional, dependent data with few observations. The methodology successively combines the clustering of covariates, decorrelation of covariates using latent factor analysis, selection by aggregation of adapted methods, and finally ranking. We first applied our method to transcriptomic data from 37 patients with advanced non-small-cell lung cancer who had received chemotherapy, to select the transcriptomic covariates that explain the survival outcome of the treatment. Secondly, we applied our method to 79 breast tumor samples to define patient profiles for a new metastatic biomarker and associated gene network in order to personalize treatments. This work is published in 3 and is implemented in the R package ‘ARMADA’.

### 6.3.4 Multidimensional statistical analysis of information for clinical use

The start-up EMOSIS develops blood tests relying on flow cytometry in order to improve in vitro diagnosis of vascular thrombosis. This technology yields multiparametric measurements on tens of thousands of cells collected from each blood sample. Manual methods of analysis classically used in flow cytometry are based on data visualization by means of histograms or scatter plots. Recent progress in the active area of computational methods for dimension reduction suggests many ways to improve on the classical approaches to the analysis of flow cytometry data. Our first goal is to define and implement such methods. We have started to focus on methods of information geometry and topological data analysis. Once appropriate methods have been identified, an important aspect to consider will be the visualization of the results in a way that is easy for clinicians to interpret and ethically permissible. In the longer term, our ambition is to design more accurate prediction tools for diagnosis.

# 7 Bilateral contracts and grants with industry

Participants: Anne Gégout-Petit, Walid Laziri, Sophie Mézières, Bruno Scherrer.

## 7.1 Bilateral contracts with industry

• B. Scherrer collaborated with Google Brain on reinforcement learning in the framework of the PhD thesis of Nino Vieillard, until June 2022.
• As part of the French “Plan de relance”, we obtained funds for a 2-year engineering contract with the start-up EMOSIS based in Strasbourg (from October 1, 2022). Project MOSAiC: MultidimensiOnal Statistical Analysis of Information for Clinical use.

# 8 Partnerships and cooperations

Participants: Virgile Brodu, Nicolas Champagnat, Léo Darrigade, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, Joseph Lam-Weil, Rodolphe Loubaton, Sophie Mézières, Jean-Marie Monnez, Aurélie Muller-Gueudin, Anouk Rago, Pierre Vallois, Denis Villemonais, Nicolás Zalduendo Vidal.

## 8.1 International initiatives

### 8.1.1 Associate Teams in the framework of an Inria International Lab or in the framework of an Inria International Program

#### MAGO

• Title:
Modelling and analysis for growth-fragmentation processes
• Coordinator:
E. Horton (Inria Bordeaux)
• Partner Institutions:
• Inria Bordeaux and Nancy (ASTRAL and BIGS teams, E. Horton, C. Fritsch, D. Villemonais)
• University College London (F.X. Briol, O. Key, A. Watson)
• Date/Duration:
2022-2024
• Total amount of the grant:
9000€ in 2022 and in 2023, and 7000€ in 2024
• Description:
Growth-fragmentation (GF) refers to a collection of mathematical models in which objects – classically, biological cells – slowly gather mass over time, and fragment suddenly into multiple, smaller offspring. These models may be used to represent a range of biological processes, in which an individual reproduces by fission into two or more new individuals, such as the evolution of plasmids in bacteria populations and protein polymerisation. It is crucial to understand the long-term behaviour of GF processes so that they can be used to build algorithms to simulate real-world processes and estimate quantities such as the growth rate of the system, the steady state behaviour, and the fragmentation rate and kernel, allowing scientists to gain a better understanding of the behaviour of these complex systems. In this project, we aim to combine probabilistic and statistical tools to study these processes. In particular, we will employ methods from branching processes, quasi-stationary distributions and interacting particle systems to study their long-term behaviour and develop numerical simulations. Further, we will develop likelihood-free methods to estimate the model parameters, followed by goodness-of-fit tests to analyse the strength of these methods when working with real data.
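To make the notion of a growth-fragmentation process concrete, the following minimal Python sketch simulates a toy population. All modelling choices are assumptions made for illustration and are not taken from the MAGO project: each cell's mass grows exponentially at a constant rate, each cell divides at a constant rate, and a dividing cell splits into two fragments with a uniformly distributed mass fraction. The function name is hypothetical.

```python
import numpy as np


def simulate_growth_fragmentation(initial_masses, growth_rate, division_rate,
                                  t_end, seed=0):
    """Simulate a toy growth-fragmentation process up to time t_end.

    Illustrative assumptions: exponential mass growth, constant per-cell
    division rate, binary fragmentation with a uniform split fraction.
    Returns the array of cell masses at time t_end.
    """
    rng = np.random.default_rng(seed)
    masses = [float(m) for m in initial_masses]
    t = 0.0
    while True:
        # With a constant per-cell rate, the time to the next division in the
        # whole population is exponential with rate division_rate * N.
        wait = rng.exponential(1.0 / (division_rate * len(masses)))
        if t + wait > t_end:
            # No further division before t_end: grow until the horizon.
            remaining = t_end - t
            masses = [m * np.exp(growth_rate * remaining) for m in masses]
            break
        t += wait
        # Deterministic growth between two fragmentation events.
        masses = [m * np.exp(growth_rate * wait) for m in masses]
        # Pick the dividing cell uniformly and split its mass in two.
        parent = masses.pop(int(rng.integers(len(masses))))
        fraction = rng.uniform()
        masses.extend([fraction * parent, (1.0 - fraction) * parent])
    return np.array(masses)
```

Because fragmentation conserves mass in this toy model, the total mass at time `t_end` equals the initial total mass multiplied by `exp(growth_rate * t_end)`, whatever the random division times. This gives a simple sanity check for the simulation.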

## 8.2 European initiatives

### 8.2.1 ERC projects

N. Champagnat is scientific collaborator of the ERC SINGER (AdG 101054787) on Stochastic dynamics of sINgle cells, coordinated by S. Méléard (Ecole Polytechnique). He is involved in the research axes “From stochastic processes to singular Hamilton-Jacobi equations” and “Lineages and time reversed trajectories” of this project.

## 8.3 National initiatives

• FHU CARTAGE (Fédération Hospitalo Universitaire Cardial and ARTerial AGEing). Leader: Pr. A. Benetos. Participants: J.-M. Monnez, B. Lalloué, A. Gégout-Petit.
• RHU Fight HF (Fighting Heart Failure), located at the University Hospital of Nancy. Leader: Pr. P. Rossignol. Participants: J.-M. Monnez, B. Lalloué.
• ITMO Physics, Mathematics applied to Cancer (from 2017 until June 2022): “Modeling ctDNA dynamics for detecting targeted therapy resistance”. Funding bodies: ITMO Cancer, ITMO Technologies pour la santé de l’alliance nationale pour les sciences de la vie et de la santé (AVIESAN), INCa. Partners: Inria and IECL (Institut Élie Cartan de Lorraine), CHRU Strasbourg, CRAN (Centre de Recherche en Automatique de Nancy) and ICL (Institut de Cancérologie de Lorraine). Leader: N. Champagnat. Participants: L. Darrigade, C. Fritsch, A. Gégout-Petit, U. Herbach, A. Muller-Gueudin, P. Vallois.
• GDR 720 ISIS (funded by CNRS). Leader: L. Blanc-Féraud. Participant: S. Mézières.
• Réseau Thématique MathSAV (funded by CNRS). Leader: Fabien Crauste. Participants: N. Champagnat, L. Darrigade, C. Fritsch, V. Hass, U. Herbach, J. Lam, R. Loubaton, A. Rago, N. Zalduendo Vidal.
• Chair “Modélisation Mathématique et Biodiversité” between VEOLIA, Ecole Polytechnique, Museum National d'Histoire Naturelle and Fondation X (funded by VEOLIA). Leader: S. Méléard. Participants: V. Brodu, N. Champagnat, C. Fritsch, V. Hass, D. Villemonais, N. Zalduendo Vidal.

## 8.4 Regional initiatives

• A regional PACTE fund has been obtained to host the Low-grade Glioma database and use it for diverse purposes: teaching, dissemination and development of experimentation tools. We continue to build the PIANO platform. Participant: S. Mézières.
• Région Grand-Est: in the context of the Telomere project, A. Gégout-Petit and D. Villemonais obtained a grant from the Grand-Est region to hire J. Lam-Weil as a post-doctoral fellow. The University of Lorraine and the LUE GEENAGE program supplemented the grant.

# 9 Dissemination

Participants: Thierry Bastogne, Virgile Brodu, Nicolas Champagnat, Sandie Ferrigno, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, Rodolphe Loubaton, Sophie Mézières, Aurélie Muller-Gueudin, William Oçafrain, Anouk Rago, Bruno Scherrer, Denis Villemonais, Nicolás Zalduendo Vidal.

## 9.1 Promoting scientific activities

### 9.1.2 Scientific events: selection

#### Member of the conference program committees

• N. Champagnat was a member of the scientific committee of JdS 2022 (53èmes Journées de Statistique de la SFdS), held in Lyon in June.
• In the framework of the Journées d'Etudes en Statistique 2021, A. Gégout-Petit was co-editor of a book on Missing data 19.

### 9.1.3 Journal

#### Member of the editorial boards

N. Champagnat serves as associate editor of ESAIM: Probability & Statistics and Stochastic Models.

#### Reviewer - reviewing activities.

The members of the team wrote referee reports for Acta Applicandae Mathematicae, Annals of Applied Probability, Annals of Applied Statistics, Annals of Probability, Bulletin of Mathematical Biology, COVID, Discrete and Continuous Dynamical Systems Series B, Electronic Journal of Probability, Journal de l'École Polytechnique, Journal of Mathematical Biology, Stochastics and Partial Differential Equations: Analysis and Computations, Stochastic Models, Stochastic Processes and their Applications, Viruses.

### 9.1.5 Leadership within the scientific community

A. Gégout-Petit is vice-president of the European Network for Business and Industrial Statistics (ENBIS).

### 9.1.6 Scientific expertise

A. Gégout-Petit served on the hiring committees for a biostatistics ‘Professeur’ position at Sorbonne Univ., and for three ‘Chaire de professeur junior’ (CPJ) positions at CNRS (INSMI, SPJ Monaie), Univ. Lorraine (Biostatistics), and Univ. de Pau et des Pays de l'Adour (Artificial Intelligence).

• N. Champagnat is a member of the COMIPERS and of the Commission Information Scientifique et Technique of Inria Nancy - Grand Est, and is Responsable Scientifique for the Mathematics library of the IECL. He is also the local correspondent of the COERLE (Comité Opérationnel d'Évaluation des Risques Légaux et Éthiques) for the Inria Research Center of Nancy - Grand Est.
• C. Fritsch is a member of the Commission du Développement Technologique of Inria Nancy-Grand Est and of the Commission du personnel of IECL. She was the local Radar correspondent for the Inria Research Center of Nancy - Grand Est until June.
• A. Gégout-Petit is the head of IECL.

## 9.2 Teaching - Supervision - Juries

### 9.2.1 Teaching

BIGS faculty members have teaching obligations at Univ. Lorraine and teach at least 192 hours each year. They teach probability and statistics at different levels (Licence, Master, Engineering school). Many of them have pedagogical responsibilities.

• D. Villemonais is the head of the Mathematical Engineering Major of ENSMN, Université de Lorraine.
• T. Bastogne is in charge of the research master program “Santé Numérique et Imagerie Médicale” with the Faculty of Medicine, Université de Lorraine.
• Licence: V. Brodu, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
• Licence: V. Brodu, Numerical Analysis tutorial, 20h, L3, first year of ENSMN, Université de Lorraine.
• Master: N. Champagnat, Introduction to Quantitative Finance, 12h, M1, second year of ENSMN, Université de Lorraine.
• Master: N. Champagnat, Introduction to Quantitative Finance, 9h, M2, third year of ENSMN, Université de Lorraine.
• Master: S. Ferrigno, Experimental designs, 4.5h, M1, fourth year of EEIGM, Université de Lorraine.
• Master: S. Ferrigno, Data analysis and mining, 36h, M1, second year of ENSMN, Université de Lorraine.
• Master: S. Ferrigno, Modeling and forecasting, 32h, M1, second year of ENSMN, Université de Lorraine.
• Master: S. Ferrigno, Training projects, 18h, M1/M2, second and third year of ENSMN, Université de Lorraine.
• Licence: S. Ferrigno, Descriptive and inferential statistics, 60h, L2, second year of EEIGM, Université de Lorraine.
• Licence: S. Ferrigno, Statistical modeling, 60h, L2, second year of EEIGM, Université de Lorraine.
• Licence: S. Ferrigno, Mathematical and computational tools, 20h, L3, third year of EEIGM, Université de Lorraine.
• Licence: S. Ferrigno, Training projects, 40h, L1/L3, first, second and third year of EEIGM, Université de Lorraine.
• Master: C. Fritsch, Inverse problem, 18h, M1, second year of ENSMN, Université de Lorraine.
• Licence: C. Fritsch, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
• Master: A. Gégout-Petit, Statistics, modeling, data analysis, 80h, master in applied mathematics, Université de Lorraine.
• Licence: V. Hass, Mathématiques FIGIM 1A, 38h, L1/L2, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Mathématiques FIGIM 2A, 19h, L2, second year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Probabilités, 40h, L3, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Analyse numérique et optimisation, 45h, L3, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Recherche opérationnelle, 18h, L3, first year of ENSMN, Université de Lorraine.
• Master: V. Hass, Méthodes stochastiques pour le calcul, 14h, M1, second year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Mathématiques FIGIM 1A, 70h, L1/L2, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Mathématiques FIGIM 2A, 19h, L2, second year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Probabilités, 40h, L3, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Analyse numérique et optimisation, 45h, L3, first year of ENSMN, Université de Lorraine.
• Licence: V. Hass, Recherche opérationnelle, 18h, L3, first year of ENSMN, Université de Lorraine.
• Licence: R. Loubaton, Inférence statistique, 42h, L3, first year of ENSMN, Université de Lorraine.
• Master: R. Loubaton, Analyse de données, 18h, M1, second year of ENSMN, Université de Lorraine.
• Master: R. Loubaton, Introduction à l'apprentissage automatique, 14h, M1, second year of ENSMN, Université de Lorraine.
• Master: R. Loubaton, Introduction au deep learning, 14h, M1, second year of ENSMN, Université de Lorraine.
• Licence: R. Loubaton, Analyse numérique, 44h, L3, first year of ENSMN, Université de Lorraine.
• Licence: R. Loubaton, Remédiation mathématique pour étudiants étrangers, 36h, L3, first year of ENSMN, Université de Lorraine.
• Licence: R. Loubaton, Géométrie et vecteurs pour la physique, 25h, L1, first year of EEIGM, Université de Lorraine.
• Licence: R. Loubaton, Analyse, 25h, L1, first year of ENGSI, Université de Lorraine.
• Master: A. Rago, Modélisation et Prévision, 14h, M1, second year of ENSMN, Université de Lorraine.
• Licence: A. Rago, Analyse numérique et optimisation, 20h, L3, first year of ENSMN, Université de Lorraine.
• Master: A. Rago, Analyse de données, 18h, M1, second year of ENSMN, Université de Lorraine.
• Master: A. Rago, Statistiques pour la grande dimension, 18h, M2 IMSD/third year of ENSMN, Université de Lorraine.
• Master: D. Villemonais, Probability Theory II, 63h, M1, second year of ENSMN, Université de Lorraine.
• Master: D. Villemonais, Stochastic processes, 32h, Master 2 MFA, Université de Lorraine.
• Master: D. Villemonais, Modeling and forecasting, 14h, M1, second year of ENSMN, Université de Lorraine.
• Licence: D. Villemonais, Probability Theory I, 57h, L3, first year of ENSMN, Université de Lorraine.
• Master: S. Wantz-Mézières, Learning and analysis of medical data, 36h, with J.M. Moureaux, M2 SNIM, Université de Lorraine.
• Licence: S. Wantz-Mézières, Applied mathematics for management, financial mathematics, Probability and Statistics, 160h, IUT Nancy-Charlemagne (L1/L2/L3), Université de Lorraine.
• Licence: S. Wantz-Mézières, Probability, 100h, first year in TELECOM Nancy (initial and apprenticeship cursus), Université de Lorraine.
• Licence: N. Zalduendo Vidal, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
• Licence: N. Zalduendo Vidal, Numerical Analysis tutorial, 20h, L3, first year of ENSMN, Université de Lorraine.

### 9.2.2 Supervision

#### PhD

• PhD in progress: Virgile Brodu, “Émergence des allométries dans les systèmes écologiques : comportement stationnaire de modèles déterministes et stochastiques de flux d’énergie et de biomasse”, grant ENS Lyon. Advisors: S. Billiard (Univ. Lille), N. Champagnat, C. Fritsch.
• PhD in progress: Vincent Hass, “Individual-based models in adaptive dynamics and long time evolution under assumptions of rare advantageous mutations”, grant Inria-Cordi, currently ATER in ENSMN. Advisor: N. Champagnat.
• PhD in progress: Rodolphe Loubaton, “Caractérisation des cibles thérapeutiques dans un programme génique tumoral”, grant Région Grand-Est, currently ATER in EEIGM. Advisors: N. Champagnat and L. Vallat (CHRU Strasbourg).
• PhD in progress: Anouk Rago, “Inférence de réseaux de gènes dynamiques et prédiction d’expériences d’interventions biologiques dans des cellules cancéreuses”, grant Région Grand-Est, Inria. Advisors: N. Champagnat, A. Gégout-Petit.
• PhD: Nino Vieillard, “Approximate Dynamic Programming and Deep Reinforcement Learning”, CIFRE with Google Brain. Advisors: B. Scherrer, M. Geist (Google Brain). Defended on June 30.
• PhD in progress: Nicolás Zalduendo Vidal, “Processus de branchement bi-sexués multi-types”, grant Inria-Cordis. Advisors: C. Fritsch, D. Villemonais.

#### Other

• M2 internship: Virgile Brodu, “Emergence des allométries dans les systèmes écologiques : convergence d'un modèle individu-centré de flux d'énergie vers la solution d'un système d'équations intégro-différentielles” (ENS Lyon). Advisors: N. Champagnat, C. Fritsch and S. Billiard (Univ. Lille).
• Research project: Hassan Berrada, “Condition de survie et d'extinction pour un modèle bisexué” (M2 ENSMN). Advisors: C. Fritsch, N. Zalduendo Vidal.
• Research project: Hugo Breton, “Interface graphique et package quantCurves” (M2 ENSMN). Advisor: S. Ferrigno.
• Parcours Recherche: Romain Maillard, “Inférence statistique de réseaux de gènes à partir de graphes dynamiques” (full-year research project, M1 ENSMN). Advisor: U. Herbach.
• Project M1: Guillaume Nodet and May Ouir, “Inférence de réseaux de gènes avec Random Forest” (M1 ENSMN). Advisor: R. Loubaton.

### 9.2.3 Juries

• PhD: N. Champagnat, reviewer, PhD thesis of Apolline Louvet, “Modèles probabilistes de génétique des populations pour les populations en expansion”, Institut polytechnique de Paris.
• PhD: A. Gégout-Petit, reviewer, PhD thesis of Guillaume Bottaz-Bottom, “Classification de trajectoires d’observances de patients atteints d’un syndrome d’apnées obstructives du sommeil”, Univ. Grenoble-Alpes.
• HDR: A. Gégout-Petit, reviewer, HDR thesis of Frédéric Proia, “Autorégressifs à coefficients variables — Modèles graphiques partiels — Applications aux sciences du vivant”, Univ. Angers.
• HDR: A. Gégout-Petit, HDR thesis of Romain Azaïs, “Approches algorithmiques pour la statistique : processus déterministes par morceaux et arbres aléatoires”, ENS Lyon.
• PhD: A. Gégout-Petit, thesis of Olivier Coudray, “Un point de vue statistique sur les critères de fatigue: de la classification supervisée à l’apprentissage positif-non labellisé”, Université Paris-Saclay.
• PhD: A. Gégout-Petit, thesis of Cécile Spychala, “Statistical analysis of road accidents in the region Franche-Comté: risk factors for accident injuries and spatial modelling for accident occurrences”, Univ. Besançon, Franche-Comté.
• PhD: B. Scherrer, reviewer, PhD thesis of Léonard Blier, "Some Principled Methods for Deep Reinforcement Learning", Univ. Paris-Saclay.
• PhD: B. Scherrer, reviewer, PhD thesis of Giovanni Gatti Pinheiro, "Apprentissage par renforcement appliqué au Revenue Management des compagnies aériennes", Univ. Côte d’Azur.
• PhD: B. Scherrer, reviewer, PhD thesis of Chen Yan, "Asymptotically Optimal Policies for Restless Bandits", Univ. Grenoble-Alpes.
• Prize: A. Gégout-Petit, member of the committee for the AMIES PhD prize.

## 9.3 Popularization

### 9.3.1 Education

• S. Mézières: organisation of a research training week on NeuroOncology and Numerics, for medical and engineering students, January 2022.

### 9.3.2 Interventions

• C. Fritsch gave two talks at the Lycée Cormontaigne in Metz, as part of the “Chiche!” program, in November.
• S. Ferrigno: Advisor of a group of students, “Traitement statistique de données” Project, various high schools, Nancy.
• S. Ferrigno: Advisor of a group of students, “La main à la Pâte” Project, Institut médico-éducatif (IME), Commercy.
• S. Ferrigno: Advisor of a group of students, “La main à la Pâte”, “C'Génial” Projects, Colleges, Malzéville and Nancy.
• S. Ferrigno: Advisor of a group of students, “La main à la Pâte” Project, elementary schools, Nancy.
• U. Herbach gave a public lecture, “Les maths peuvent-elles servir à vaincre le cancer ?”, in Ambert in October, for the national breast cancer awareness campaign “Octobre Rose”. On this occasion, he also led several science outreach sessions in secondary and high schools in Ambert.
• R. Loubaton gave an introduction to artificial intelligence at the conference "Être humain à l'âge de l'IA" in Paris in July.
• J-M. Monnez gave a masterclass on online data analysis 38 to the BIGS working group in Nancy, in October and November.

# 10 Scientific production

## 10.1 Publications of the year

### International journals

• 1 [article] Alexis Anagnostakis, Antoine Lejay and Denis Villemonais. General diffusion processes as the limit of time-space Markov chains. Annals of Applied Probability, 2023.
• 2 [article] cvmgof: an R package for Cramér-von Mises goodness-of-fit tests in regression models. Journal of Statistical Computation and Simulation 92(6), 2022, 1246-1266.
• 3 [article] Bérangère Bastien, Taha Boukhobza, Hélène Dumond, Anne Gégout-Petit, Aurélie Muller-Gueudin and Charlène Thiébaut. A statistical methodology to select covariates in high-dimensional data under dependence. Application to the classification of genetic profiles in oncology. Journal of Applied Statistics 49(3), March 2022, 764-781.
• 4 [article] Thierry Bastogne, Fanny Caputo, Adriele Prina-Mello, Sven Borgos and Muriel Barberi-Heyob. A state of the art in analytical quality-by-design and perspectives in characterization of nano-enabled medicinal products. Journal of Pharmaceutical and Biomedical Analysis 219, September 2022, 114911.
• 5 [article] iQbD: a TRL-indexed quality-by-design paradigm for medical device engineering. Journal of Medical Devices 16(2), June 2022, 021008.
• 6 [article] Michel Benaïm, Nicolas Champagnat, William Oçafrain and Denis Villemonais. Transcritical bifurcation for the conditional distribution of a diffusion process. Journal of Theoretical Probability, November 2022.
• 7 [article] General criteria for the study of quasi-stationarity. Electronic Journal of Probability, 2023.
• 8 [article] Léo Darrigade, Marie Haghebaert, Claire Cherbuy, Simon Labarthe and Béatrice Laroche. A PDMP model of the epithelial cell turn-over in the intestinal crypt including microbiota-derived regulations. Journal of Mathematical Biology 84(7), June 2022, 1-67.
• 9 [article] Sophie Deneuve, Thierry Bastogne, Mirlande Duclos, Céline Mirjolet, Pascaline Bois, Patrick Bachmann, Lara Nokovitch, Pierre-Eric Roux, Didier Girodet, Marc Poupart, Philippe Zrounba, Line Claude, Letizia Ferella, Alessandro Iacovelli, Nicolas Foray, Tiziana Rancati and Sandrine Pereira. Predicting acute severe toxicity for head and neck squamous cell carcinomas by combining dosimetry with a radiosensitivity biomarker: a pilot study. Tumori, May 2022.
• 10 [article] Alexis Dijamentiuk, Cécile Mangavel, Annelore Elfassy, Florentin Michaux, Jennifer Burgain, Emmanuel Rondags, Stéphane Delaunay, Sandie Ferrigno, Anne-Marie Revol-Junelles and Frédéric Borges. Invert emulsions alleviate biotic interactions in bacterial mixed culture. Microbial Cell Factories 22(1), December 2023, 16.
• 11 [article] Marine Geoffroy, Marine Lemesle, Alexandra Kleinclauss, Sabine Mazerbourg, Levy Batista, Muriel Barberi-Heyob, Thierry Bastogne, Wilfrid Boireau, Alain Rouleau, Dorian Dupommier, Michel Boisbrun, Corinne Comoy, Stéphane Flament, Isabelle Grillier-Vuissoz and Sandra Kuntz. AB186 inhibits migration of triple-negative breast cancer cells and interacts with α-Tubulin. International Journal of Molecular Sciences 23(12), June 2022, 6859.
• 12 [article] Construction and Update of an Online Ensemble Score Involving Linear Discriminant Analysis and Logistic Regression. Applied Mathematics 13(2), February 2022, 228-242.
• 13 [article] Streaming constrained binary logistic regression with online standardized data. Journal of Applied Statistics 49(6), 2022, 1519-1539.
• 14 [article] Julien Randon-Furling, Paavo Salminen and Pierre Vallois. On a first hit distribution of the running maximum of Brownian motion. Stochastic Processes and their Applications 150, June 2022.
• 15 [article] Laurence Schenone, Caroline Houillier, Marie Laure Tanguy, Sylvain Choquet, Kossi Agbetiafa, Hervé Ghesquières, Gandhi Damaj, Anna Schmitt, Krimo Bouabdallah, Guido Ahle, Remy Gressin, Jérôme Cornillon, Roch Houot, Jean-Pierre Marolleau, Luc-Matthieu Fornecker, Olivier Chinot, Frédéric Peyrade, Reda Bouabdallah, Cécile Moluçon-Chabrot, Emmanuel Gyan, Adrien Chauchet, Olivier Casasnovas, Lucie Oberic, Vincent Delwail, Julie Abraham, Virginie Roland, Agathe Waultier-Rascalou, Lise Willems, Franck Morschhauser, Michel Fabbro, Renata Ursu, Catherine Thieblemont, Fabrice Jardin, Adrian Tempescul, Denis Malaise, Valérie Touitou, Lucia Nichelli, Magali Le Garff-Tavernier, Aurélie Plessier, Philippe Bourget, Caroline Bonmati, Sophie Wantz-Mézières, Quentin Giordan, Véronique Dorvaux, Cyril Charron, Waliyde Jabeur, Khê Hoang-Xuan, Luc Taillandier and Carole Soussain. Intensive chemotherapy followed by autologous stem cell transplantation in primary central nervous system lymphomas (PCNSLs). Therapeutic outcomes in real life-experience of the French network. Bone Marrow Transplantation 57(6), April 2022, 966-974.
• 16 [article] Adrien Taccoen, Christian Piedallu, Ingrid Seynave, Anne Gégout-Petit and Jean-Claude Gégout. Climate change-induced background tree mortality is exacerbated towards the warm limits of the species ranges. Annals of Forest Science 79(1), December 2022, 23.

### International peer-reviewed conferences

• 17 [inproceedings] Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin and Matthieu Geist. Offline Reinforcement Learning as Anti-Exploration. AAAI 2022 - 36th AAAI Conference on Artificial Intelligence, Vancouver, Canada, February 2022.

### Scientific books

• 18 [book] Mathématiques pour les sciences de l’ingénieur - Tout le cours en fiches: 3ème édition. Dunod, 2022, 576 pages.
• 19 [book] Données manquantes. Editions Technip, June 2022.

### Reports & preprints

• 21 [misc] Quasi-compactness criterion for strong Feller kernels with an application to quasi-stationary distributions. April 2022.
• 22 [misc] Existence, uniqueness and ergodicity for the centered Fleming-Viot process. March 2022.
• 23 [misc] Filling the gap between individual-based evolutionary models and Hamilton-Jacobi equations. May 2022.
• 24 [misc] Quasi-limiting estimates for periodic absorbed Markov chains. 2022.
• 25 [misc] Quasi-stationary distributions in reducible state spaces. January 2022.
• 26 [misc] Quasi-stationary behavior for an hybrid model of chemostat: the Crump-Young model. May 2022.
• 27 [misc] Alexander M.G. Cox, E. Horton and D. Villemonais. Binary branching processes with Moran type interactions. July 2022.
• 28 [misc] Penalized polytomous ordinal logistic regression using cumulative logits. Application to network inference of zero-inflated variables. August 2022.
• 29 [misc] Isabelle Dupin, Edmée Eyraud, Élise Maurat, Jean-Marc Sac-Epee and Pierre Vallois. Probabilistic Cellular Automata modeling of intercellular interactions in airways: complex pattern formation in patients with Chronic Obstructive Pulmonary Disease. October 2022.
• 30 [misc] Edmée Eyraud, Elise Maurat, Jean-Marc Sac-Epee, Pauline Henrot, Maeva Zysman, Pauline Esteves, Thomas Trian, Hugues Bégueret, Pierre-Oliver Girodet, Matthieu Thumerel, Romain Hustache-Castaing, Roger Marthan, Florian Levet, Pierre Vallois, Cécile Contin-Bordes, Patrick Berger and Isabelle Dupin. Short-range interactions between fibrocytes and CD8+ T cells in COPD bronchial inflammatory response. October 2022.
• 31 [misc] Coralie Fritsch, Denis Villemonais and Nicolás Zalduendo. The Multi-type Bisexual Galton-Watson Branching Process. June 2022.
• 32 [misc] Stochastic approximation of eigenvectors and eigenvalues of the Q-symmetric expectation of a random matrix. 2022, 1-15.
• 33 [misc] A central limit and Berry-Esseen theorem for continuous-time Markov processes conditioned not to be absorbed. March 2022.
• 34 [misc] An ergodic theorem for asymptotically periodic time-inhomogeneous Markov processes, with application to quasi-stationarity with moving boundaries. April 2022.
• 35 [misc] Elias Ventre, Ulysse Herbach, Thibault Espinasse, Gérard Benoit and Olivier Gandrillon. One model fits all: combining inference and simulation of gene regulatory networks. June 2022.
• 36 [misc] A quasi-stationary approach to the long-term asymptotics of the growth-fragmentation equation. February 2022.

### Other scientific publications

• 37 [inproceedings] Goodness-of-fit tests for variance function in regression models. CMStatistics 2022, London, United Kingdom, December 2022.

## 10.2 Other

### Educational activities

• 38 [unpublished] Jean-Marie Monnez. Analyse des données en flux. Analyse en composantes principales et méthodes dérivées. October 2022, Doctoral, France.

## 10.3 Cited publications

• 40 article Martín Andrade-Restrepo, Nicolas Champagnat and Régis Ferrière. Spatial eco-evolutionary dynamics along environmental gradients: multi-stability and cluster dynamics. Ecology Letters 22(5), May 2019, 767-777
• 41 article Romain Azaïs, François Dufour and Anne Gégout-Petit. Non-Parametric Estimation of the Conditional Distribution of the Interjumping Times for Piecewise-Deterministic Markov Processes. Scandinavian Journal of Statistics 41(4), December 2014, 950-969
• 42 software Romain Azaïs, Sandie Ferrigno and Marie-José Martinez. cvmgof: Cramer-von Mises goodness-of-fit tests. 1.0.0, November 2018, CeCILL
• 43 article Romain Azaïs and Aurélie Muller-Gueudin. Optimal choice among a class of nonparametric estimators of the jump rate for piecewise-deterministic Markov processes. Electronic Journal of Statistics, 2016
• 44 article Romain Azaïs. A recursive nonparametric estimator for the transition kernel of a piecewise-deterministic Markov process. ESAIM: Probability and Statistics 18, 2014, 726-749
• 45 inproceedings Romain Azaïs, François Dufour and Anne Gégout-Petit. Nonparametric estimation of the jump rate for non-homogeneous marked renewal processes. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 49(4), Institut Henri Poincaré, 2013, 1204-1231
• 46 article Thierry Bastogne, Sophie Mézières-Wantz, Nacim Ramdani, Pierre Vallois and Muriel Barberi-Heyob. Identification of pharmacokinetics models in the presence of timing noise. Eur. J. Control 14(2), 2008, 149-157
• 47 article Thierry Bastogne, Adeline Samson, Pierre Vallois, S. Wantz-Mézières, Sophie Pinel, Denise Bechet and Muriel Barberi-Heyob. Phenomenological modeling of tumor diameter growth based on a mixed effects model. Journal of Theoretical Biology 262(3), 2010, 544-552
• 48 unpublished Michel Benaïm, Nicolas Champagnat, William Oçafrain and Denis Villemonais. Degenerate processes killed at the boundary of a domain. 2021, working paper or preprint
• 49 book D.P. Bertsekas and J.N. Tsitsiklis. Neurodynamic Programming. Athena Scientific, 1996
• 50 article Hervé Cardot, Peggy Cénac and Jean-Marie Monnez. A fast and recursive algorithm for clustering large datasets with k-medians. Computational Statistics and Data Analysis 56, 2012, 1434-1449
• 51 article Nicolas Champagnat, Pierre-Emmanuel Jabin and Sylvie Méléard. Adaptation in a stochastic multi-resources chemostat model. Journal de Mathématiques Pures et Appliquées 101(6), June 2014, 755-788
• 52 article Nicolas Champagnat and Denis Villemonais. Exponential convergence to quasi-stationary distribution and Q-process. Probability Theory and Related Fields 164(1), 46 pages, 2016, 243-283
• 53 article Nicolas Champagnat and Denis Villemonais. Practical criteria for $R$-positive recurrence of unbounded semigroups. Electronic Communications in Probability 25, 2020, 1-11
• 54 article Mark H. A. Davis. Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. Journal of the Royal Statistical Society. Series B (Methodological), 1984, 353-388
• 55 article Marie Doumic, Marc Hoffmann, Nathalie Krell and Lydia Robert. Statistical estimation of a growth-fragmentation model observed on a genealogical tree. Bernoulli 21(3), 2015, 1760-1799
• 56 article Kévin Duarte, Jean-Marie Monnez and Eliane Albuisson. Sequential linear regression with online standardized data. PLoS ONE, 2018, 1-27
• 57 article Sandie Ferrigno and Gilles Ducharme. Un test d'adéquation global pour la fonction de répartition conditionnelle. C. R. Math. Acad. Sci. Paris 341(5), 2005, 313-316
• 58 article Sandie Ferrigno, Myriam Maumy-Bertrand and Aurélie Muller-Gueudin. Uniform law of the logarithm for the local linear estimator of the conditional distribution function. C. R. Math. Acad. Sci. Paris 348(17-18), 2010, 1015-1019
• 59 article Jerome Friedman, Trevor Hastie and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 2008, 432-441
• 60 article Coralie Fritsch, Fabien Campillo and Otso Ovaskainen. A numerical approach to determine mutant invasion fitness and evolutionary singular strategies. Theoretical Population Biology 115, 2017, 89-99
• 61 article Christophe Giraud, Sylvie Huet and Nicolas Verzelen. Graph selection with GGMselect. Statistical Applications in Genetics and Molecular Biology 11(3), 2012
• 62 inproceedings T.D. Hansen and U. Zwick. Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles. ISAAC (1), 2010, 415-426
• 63 article Ulysse Herbach, Arnaud Bonnaffoux, Thibault Espinasse and Olivier Gandrillon. Inferring gene regulatory networks from single-cell data: a mechanistic approach. BMC Systems Biology 11(1), November 2017, 105
• 64 article Samuel Herrmann and Pierre Vallois. From persistent random walk to the telegraph noise. Stoch. Dyn. 10(2), 2010, 161-196
• 65 incollection Jianghai Hu, Wei-Chung Wu and Shankar Sastry. Modeling subtilin production in Bacillus subtilis using stochastic hybrid systems. Hybrid Systems: Computation and Control, Springer, 2004, 417-431
• 66 article Roukaya Keinj, Thierry Bastogne and Pierre Vallois. Multinomial model-based formulations of TCP and NTCP for radiotherapy treatment planning. Journal of Theoretical Biology 279(1), June 2011, 55-62
• 67 book Roger Koenker. Quantile regression. 38, Cambridge University Press, 2005
• 68 incollection Ludovic Lebart. On the Benzecri's method for computing eigenvectors by stochastic approximation (the case of binary data). Compstat 1974 (Proc. Sympos. Computational Statist., Univ. Vienna, Vienna, 1974), Vienna, Physica Verlag, 1974, 202-211
• 69 inproceedings Boris Lesner and Bruno Scherrer. Non-Stationary Approximate Modified Policy Iteration. ICML 2015, Lille, France, July 2015
• 70 article Alexander Lorz, Sepideh Mirrahimi and Benoît Perthame. Dirac mass dynamics in multidimensional nonlocal parabolic equations. Communications in Partial Differential Equations 36(6), 2011, 1071-1098
• 71 article Nicolai Meinshausen and Peter Bühlmann. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 2006, 1436-1462
• 72 article Jean-Marie Monnez. Approximation stochastique en analyse factorielle multiple. Ann. I.S.U.P. 50(3), 2006, 27-45
• 73 article Jean-Marie Monnez. Convergence d'un processus d'approximation stochastique en analyse factorielle. Publ. Inst. Statist. Univ. Paris 38(1), 1994, 37-55
• 74 article Jean-Marie Monnez. Stochastic approximation of the factors of a generalized canonical correlation analysis. Statist. Probab. Lett. 78(14), 2008, 2210-2216
• 75 article E. A. Nadaraya. On non-parametric estimates of density functions and regression curves. Theory of Probability & Its Applications 10(1), 1965, 186-190
• 76 techreport I. Post and Y. Ye. The simplex method is strongly polynomial for deterministic Markov decision processes. arXiv:1208.5083v2, 2012
• 77 book M. Puterman. Markov Decision Processes. Wiley, New York, 1994
• 78 article Angélique Richard, Loïs Boullu, Ulysse Herbach, Arnaud Bonnaffoux, Valérie Morin, Elodie Vallin, Anissa Guillemin, Nan Papili Gao, Rudiyanto Gunawan, Jérémie Cosette, Ophélie Arnaud, Jean-Jacques Kupiec, Thibault Espinasse, Sandrine Gonin-Giraud, Olivier Gandrillon and Sarah Teichmann. Single-Cell-Based Analysis Highlights a Surge in Cell-to-Cell Molecular Variability Preceding Irreversible Commitment in a Differentiation Process. PLoS Biology 14(12), December 2016
• 79 inproceedings Bernard Roynette, Pierre Vallois and Marc Yor. Brownian penalisations related to excursion lengths, VII. Annales de l'IHP Probabilités et statistiques 45(2), 2009, 421-452
• 80 article Francesco Russo and Pierre Vallois. Stochastic calculus with respect to continuous finite quadratic variation processes. Stochastics: An International Journal of Probability and Stochastic Processes 70(1-2), 2000, 1-40
• 81 inproceedings Bruno Scherrer. Approximate Policy Iteration Schemes: A Comparison. ICML - 31st International Conference on Machine Learning - 2014, Beijing, China, June 2014
• 82 article Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner and Matthieu Geist. Approximate Modified Policy Iteration and its Application to the Game of Tetris. Journal of Machine Learning Research 16, to appear, 2015, 1629-1676
• 83 article Bruno Scherrer. Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. Mathematics of Operations Research, Markov decision processes; Dynamic Programming; Analysis of Algorithms, February 2016
• 84 article Simon Toupance, Denis Villemonais, Daphné Germain, Anne Gégout-Petit, Eliane Albuisson and Athanase Benetos. The individual's signature of telomere length distribution. Scientific Reports 9(1), January 2019, 1-8
• 85 article Pierre Vallois and Charles S. Tapiero. Memory-based persistence in a counting random walk process. Phys. A. 386(1), 2007, 303-307
• 86 article Pierre Vallois. The range of a simple random walk on Z. Advances in Applied Probability, 1996, 1014-1033
• 87 misc Nathalie Villa-Vialaneix. An introduction to network inference and mining. (accessed 22/07/2015), 2015
• 88 article Y. Ye. The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate. Math. Oper. Res. 36(4), 2011, 593-603