Keywords
Computer Science and Digital Science
 A3.1. Data
 A3.2. Knowledge
 A3.2.3. Inference
 A3.3. Data and knowledge analysis
 A3.3.1. Online analytical processing
 A3.3.2. Data mining
 A3.3.3. Big data analysis
 A3.4.1. Supervised learning
 A3.4.2. Unsupervised learning
 A3.4.4. Optimization and learning
 A3.4.7. Kernel methods
 A6. Modeling, simulation and control
 A6.1. Methods in mathematical modeling
 A6.1.2. Stochastic Modeling
 A6.2. Scientific computing, Numerical Analysis & Optimization
 A6.2.3. Probabilistic methods
 A6.2.4. Statistical methods
 A6.4. Automatic control
 A6.4.2. Stochastic control
Other Research Topics and Application Domains
 B1. Life sciences
 B1.1. Biology
 B1.1.2. Molecular and cellular biology
 B1.1.10. Systems and synthetic biology
 B1.1.11. Plant Biology
 B2.2. Physiology and diseases
 B2.2.1. Cardiovascular and respiratory diseases
 B2.2.3. Cancer
 B2.3. Epidemiology
 B2.4. Therapies
1 Team members, visitors, external collaborators
Research Scientists
 Nicolas Champagnat [Team leader, INRIA, Senior Researcher, HDR]
 Coralie Fritsch [INRIA, Researcher]
 Ulysse Herbach [INRIA, Researcher]
 Bruno Scherrer [INRIA, Researcher, HDR]
Faculty Members
 Thierry Bastogne [UL, Associate Professor, HDR]
 Sandie Ferrigno [UL, Associate Professor]
 Anne Gégout-Petit [UL, Professor, HDR]
 Jean-Marie Monnez [UL, Emeritus, HDR]
 Aurélie Muller-Gueudin [UL, Associate Professor]
 Sophie Mézières [UL, Associate Professor]
 Pierre Vallois [UL, Emeritus, HDR]
 Denis Villemonais [UL, Associate Professor, HDR]
Post-Doctoral Fellows
 Léo Darrigade [INRIA, until Sep 2022]
 Joseph Lam-Weil [UL, until Jun 2022]
 William Oçafrain [INRIA, until Apr 2022]
PhD Students
 Virgile Brodu [ENS de Lyon, from Sep 2022]
 Vincent Hass [UL]
 Rodolphe Loubaton [UL]
 Anouk Rago [UL]
 Nino Vieillard [Google Brain, CIFRE, until Jun 2022]
 Nicolás Zalduendo Vidal [INRIA]
Technical Staff
 Walid Laziri [INRIA, Engineer, from Oct 2022]
Interns and Apprentices
 Virgile Brodu [ENS de Lyon, from Apr 2022 until Aug 2022]
Administrative Assistant
 Emmanuelle Deschamps [INRIA]
2 Overall objectives
BIGS is a joint team of Inria, CNRS and University of Lorraine, within the Institut Élie Cartan of Lorraine (IECL), UMR 7502 CNRS-UL laboratory in mathematics, of which Inria is a strong partner. One member of BIGS, T. Bastogne, comes from the Research Center of Automatic Control of Nancy (CRAN), with which BIGS has strong relations in the domain “Health-Biology-Signal”. Our research is mainly focused on stochastic modeling and statistics but also aims at a better understanding of biological systems. BIGS involves applied mathematicians whose research interests mainly concern probability and statistics. More precisely, our attention is directed at (1) stochastic modeling, (2) estimation and control for stochastic processes, (3) regression and machine learning, and (4) statistical learning and application in health. The main objective of BIGS is to exploit these skills in applied mathematics to provide a better understanding of issues arising in life sciences, with a special focus on (1) tumor growth and heterogeneity, (2) gene networks, (3) telomere length dynamics, (4) epidemiology and e-health.
3 Research program
3.1 Introduction
We give here the main lines of our research. For clarity, we made the choice to structure them in four items. Note that all of these items deal with stochastic modeling and inference, therefore they are all interconnected.
3.2 Stochastic modeling
Our aim is to propose relevant stochastic frameworks for the modeling and the understanding of biological systems. Stochastic processes are particularly suitable for this purpose. Among them, Markov processes provide a first framework for the modeling of populations of cells 86, 66. Piecewise deterministic processes are non-diffusion processes that are also frequently used in the biological context 54, 65, 55. Among Markov models, we have developed strong expertise in processes derived from Brownian motion and Stochastic Differential Equations 80, 64. For instance, knowledge about Brownian or random walk excursions 85, 79 helps to analyse genetic sequences and to develop inference about them. We also have strong expertise in stochastic modeling of complex biological populations using individual-based models. These models can be used either from the point of view of asymptotic stochastic analysis 51, e.g. to study the long-term Darwinian evolution of populations, or from the point of view of numerical analysis of biological phenomena 60, 40. We also develop mathematical tools for the analysis of the long-time behavior of stochastic population processes accounting for possible extinction of (sub)populations 52.
3.3 Estimation and control for stochastic processes
We develop inference about the stochastic processes that we use for modeling. Control of stochastic processes is also a way to optimise the administration (dose, frequency) of therapy, such as targeted therapies in cancer. Our team has solid expertise in the inference of the jump rate and the kernel of piecewise-deterministic Markov processes (PDMP) 45, 41, 44, 43, but many directions remain to be explored. For instance, previous work assumed a complete observation of jumps and mode, which is unrealistic in practice. We also tackle the problem of inference of “hidden PDMP”. For example, in pharmacokinetics modeling inference, we want to account for the presence of timing noise and identification from longitudinal data. We have expertise on these subjects 46, and we also use mixed models to estimate tumor growth or heterogeneity 47.
We consider the control of stochastic processes within the framework of Markov Decision Processes 77 and their generalization known as multiplayer stochastic games, with a particular focus on infinite-horizon problems. In this context, we are interested in the complexity analysis of standard algorithms, as well as the proposition and analysis of numerical approximate schemes for large problems in the spirit of 49. Regarding complexity, a central topic of research is the analysis of the Policy Iteration algorithm, on which significant progress has been made in recent years 88, 76, 62, 83, but which is still not fully understood. For large problems, we have extensive experience in the sensitivity analysis of approximate dynamic programming algorithms for Markov Decision Processes 81, 69, 82, and we currently investigate whether and how similar ideas may be adapted to multiplayer stochastic games.
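To make the Policy Iteration algorithm mentioned above concrete, here is a minimal sketch for a finite, discounted, infinite-horizon Markov Decision Process. The array layout (`P` indexed as action, state, next state; `R` indexed as state, action), the discount factor and the two-state example are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9, max_iter=1000):
    """Policy Iteration for a finite discounted MDP (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array of shape (S, A), R[s, a] = expected immediate reward.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system (I - gamma * P_pi) v = r_pi.
        P_pi = P[policy, states]          # shape (S, S)
        r_pi = R[states, policy]          # shape (S,)
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to the evaluated value.
        q = R + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v

# Hypothetical two-state example: action 0 stays in place, action 1 switches state.
# Only staying in state 1 yields a reward.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
policy, v = policy_iteration(P, R, gamma=0.9)
```

In this toy problem the algorithm stops after two improvement steps. The complexity questions discussed above concern how the number of such steps grows with the size of the problem.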
3.4 Algorithms and estimation for graph data
Recently, our group has focused its attention on modeling and inference for graph data. A graph data structure consists of a set of nodes, together with a set of pairs of these nodes called edges. This type of data is frequently used in biology because it provides a mathematical representation of many concepts, such as biological networks of relationships in a population or between genes in a cell.
Network inference is the process of making inference about the link between two variables, by taking into account the information about other variables. Reference 87 gives a very good introduction and many references about network inference and mining. Many methods are available to infer and test edges in Gaussian graphical models 87, 71, 59, 61. However, the Gaussian assumption does not hold when dealing with typical “zero-inflated” abundance data, and we want to develop inference in this case.
Concerning gene networks, most studies have been based on population-averaged data: now that technologies enable us to observe mRNA levels in individual cells, a revolution in terms of precision, the network reconstruction problem paradoxically becomes more challenging than ever. Indeed, the traditional way of seeing a gene regulatory network as a deterministic system with some small external noise is being challenged by the probabilistic, bursty nature of gene expression revealed at single-cell level. Our objective is to propose dynamical models and inference methods that fully exploit the particular time structure of single-cell data. We described a promising strategy in which the network inference problem is seen as a calibration procedure for a new PDMP model that is able to acceptably reproduce real single-cell data 63, 78.
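As an illustration of what a PDMP with bursty dynamics looks like, the sketch below simulates a one-gene caricature: the expression level decays deterministically between jumps, and bursts of random size arrive at a constant rate. This is a deliberately simplified toy model, not the network model of 63, 78; the function name, the parameter values and the exponential burst sizes are assumptions made for the example.

```python
import numpy as np

def simulate_bursty_pdmp(n_jumps, burst_rate=2.0, burst_mean=1.0, decay=1.0, seed=0):
    """Exact simulation of a toy bursty PDMP, recorded just before each jump.

    Between jumps the level x follows dx/dt = -decay * x (deterministic flow).
    Jumps occur at constant rate `burst_rate` and add an exponential burst
    of mean `burst_mean` to x.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    pre_jump_levels = np.empty(n_jumps)
    for k in range(n_jumps):
        # Waiting time until the next burst (constant jump rate).
        waiting_time = rng.exponential(1.0 / burst_rate)
        # Deterministic exponential decay during the waiting time.
        x *= np.exp(-decay * waiting_time)
        pre_jump_levels[k] = x
        # Random burst of expression.
        x += rng.exponential(burst_mean)
    return pre_jump_levels

levels = simulate_bursty_pdmp(20000)
```

For this toy model the stationary mean is `burst_rate * burst_mean / decay`, which gives a simple sanity check on the simulation once the initial transient is discarded.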
Among graphs, trees play a special role because they offer a powerful model for many biological concepts, from RNA and plant structures to phylogenetic trees in heterogeneous tumors. Our research deals with several aspects of tree data. In particular, we work on statistical inference for this type of data under a given stochastic model. We also work on lossy compression of trees via directed acyclic graphs. These methods enable us to compute distances between tree data faster than from the original structures and with high accuracy.
3.5 Regression and machine learning
Regression models and machine learning aim at inferring statistical links between a variable of interest and covariates. In biological studies, it is always important to develop adapted learning methods both in the context of standard data and also for data of high dimension (sometimes with few observations) and very massive or online data.
Many methods are available to estimate conditional quantiles and test dependencies 75, 67. Among them, we have developed nonparametric estimation by local analysis via kernel methods 57, 58, and we want to study the properties of this estimator in order to derive a measure of risk based, e.g., on confidence bands and tests. We also study other regression models, such as survival analysis and spatiotemporal models with covariates. Among the multiple regression models, we want to develop omnibus tests that examine several assumptions together.
Concerning the analysis of high dimensional data, our view on the topic relies on the French data analysis school, specifically on Factorial Analysis. In this context, stochastic approximation is an essential tool 68, which allows one to approximate eigenvectors in a stepwise manner 73, 72, 74. We aim at performing accurate classification or clustering by taking advantage of the possibility of updating the information "online" using stochastic approximation algorithms 50. We focus on several incremental procedures for regression and data analysis like linear and logistic regressions and PCA (Principal Component Analysis).
We also focus on the biological context of high-throughput bioassays in which several hundreds or thousands of biological signals are measured for subsequent analysis. We have to account for the inter-individual variability within the modeling procedure. We aim at developing a new solution based on an ARX (Auto Regressive model with eXternal inputs) model structure using the EM (Expectation-Maximisation) algorithm for the estimation of the model parameters.
4 Application domains
4.1 Oncology: tumor growth and heterogeneity
We want to propose stochastic processes to model the appearance of mutations and the evolution of their frequencies in tumor samples, through new collaborations with clinicians who measure a particular quantity called circulating tumor DNA (ctDNA). The final purpose is to use ctDNA as an early biomarker of the resistance to a targeted therapy: this is the aim of the project funded by ITMO Cancer that we coordinate. In the ongoing work on low-grade gliomas, a local database of 400 patients will soon be available to construct models. We plan to extend it through national and international collaborations (Montpellier CHU, Montreal CRHUM). Our aim is to build a decision-aid tool for personalised medicine.
4.2 Gene networks and single-cell data
We already mentioned in Section 3.4 our interest in the modeling and inference of transcriptomic bursting in gene regulatory networks from single-cell data. We are also currently working on the prediction and identification of therapeutic targets for chronic lymphocytic leukemia from gene expression data. Our goal is to propose new models that can predict the outcome of gene silencing experiments. Inference will be performed on gene expression data from cells of patients suffering from different forms of chronic lymphocytic leukemia. The goal is to identify therapeutic targets which could be silenced to reduce cell proliferation.
4.3 Epidemiology and e-health
In the context of personalized medicine, we have many ongoing projects with CHU Nancy. They deal with biomarkers research, prognostic value of quantitative variables and events, scoring, and adverse events. We also want to develop our expertise in rupture (change-point) detection in a project with AP-HP (Assistance Publique - Hôpitaux de Paris) for the detection of adverse events earlier than the clinical signs and symptoms. The clinical relevance of predictive analytics is obvious for high-risk patients, such as those with solid organ transplantation or severe chronic respiratory disease. The main challenge is rupture detection in multivariate and heterogeneous signals (for instance daily measures of electrocardiogram, body temperature, spirometry parameters, sleep duration, etc.). Other collaborations with clinicians concern foetopathology, and we want to use our work on conditional distribution functions to explain fetal and child growth. To that end, we use data from the “Service de fœtopathologie et de placentologie” of the “Maternité Régionale Universitaire” (CHU Nancy).
4.4 Dynamics of telomeres
Telomeres are disposable buffers at the ends of chromosomes which are truncated during cell division, so that the telomere ends become shorter over time with each cell division. In this way, they are markers of aging. Through a collaboration with Pr A. Benetos, geriatrician at CHU Nancy, we recently obtained data on the distribution of the length of telomeres from blood cells 84. We want to work in three connected directions: (1) refine methodology for the analysis of the available data; (2) propose a dynamical model for the lengths of telomeres and study its mathematical properties (long-term behavior, quasi-stationarity, etc.); and (3) use these properties to develop new statistical methods.
5 New software and platforms
5.1 New software
5.1.1 quantCurves

Keyword:
Statistical modeling

Functional Description:
Nonparametric methods such as local normal regression, local polynomial regression and penalized cubic B-splines regression are used to estimate quantile curves.

News of the Year:
Software developed in 2022.
 URL:

Contact:
Sandie Ferrigno
5.1.2 Harissa

Name:
Hartree approximation for inference along with a stochastic simulation algorithm

Keywords:
Gene regulatory networks, Reverse engineering, Molecular simulation

Functional Description:
Harissa is a Python package for both inference and simulation of gene regulatory networks, based on stochastic gene expression with transcriptional bursting. It was implemented in the context of a mechanistic approach to gene regulatory network inference from single-cell data.

News of the Year:
New version in 2022 associated with an application paper (hal-03942224).
 URL:
 Publications:

Contact:
Ulysse Herbach
6 New results
6.1 Stochastic modeling
Participants: Virgile Brodu, Nicolas Champagnat, Léo Darrigade, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, William Oçafrain, Pierre Vallois, Denis Villemonais, Nicolás Zalduendo Vidal.
6.1.1 Quasi-stationary distributions
We are continuing our research on quasi-stationary distributions (QSD), that is, distributions of Markov stochastic processes with absorption, which are stationary conditionally on non-absorption. For models of biological populations, absorption usually corresponds to extinction of a (sub)population. QSDs are fundamental tools to describe the population state before extinction and to quantify the large-time behavior of the probability of extinction.
Thanks to the previous general result of the team in 53, together with B. Cloez (INRAE), we proved in 26 the exponential convergence of a chemostat model, whose dynamics are highly degenerate due to a deterministic part, towards a unique quasi-stationary distribution.
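In the simplest setting, a discrete-time Markov chain on finitely many non-absorbed states, the QSD can be computed directly: it is the normalized positive left eigenvector of the sub-stochastic transition matrix restricted to those states. The sketch below illustrates the definition only; the matrix `Q` is a made-up example, and the function assumes an irreducible, aperiodic `Q`, which is far more restrictive than the processes studied in this section.

```python
import numpy as np

def quasi_stationary_distribution(Q, n_iter=1000):
    """QSD of a finite absorbed Markov chain by normalized power iteration.

    Q: sub-stochastic matrix of shape (S, S) giving transitions between
       non-absorbed states; the missing mass in each row is the one-step
       absorption probability. Q is assumed irreducible and aperiodic.
    Returns the QSD and the one-step survival probability under the QSD.
    """
    nu = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(n_iter):
        # One step of the chain, then conditioning on non-absorption.
        nu = nu @ Q
        nu /= nu.sum()
    survival = (nu @ Q).sum()
    return nu, survival

# Hypothetical two-state example: absorption probability 0.2 from each state.
Q = np.array([[0.5, 0.3],
              [0.2, 0.6]])
qsd, survival = quasi_stationary_distribution(Q)
```

Started from the QSD, the chain survives `n` steps with probability `survival ** n`, which is the sense in which the QSD quantifies the large-time behavior of the extinction probability.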
We also finalized an important work 7 that provides general criteria for the exponential convergence of conditional distributions of absorbed Markov processes when the convergence is not uniform with respect to the initial distribution. Our results make it possible to characterize a large subset of the domain of attraction of the minimal QSD and apply to a large range of stochastic processes, including diffusion processes and perturbed dynamical systems. We completed this work with specific studies of the periodic case in 24 and the case of reducible processes in 25. In this last work, we were in particular able to characterize cases with polynomial speed of convergence to a QSD and to prove the existence of a QSD for general processes in denumerable state spaces, assuming only aperiodicity, the existence of a Lyapunov function and the existence of a point in the state space from which the return time is finite with positive probability.
Motivated by our work with M. Benaïm (Univ. Neuchâtel) on degenerate processes such as hypoelliptic diffusions 48, we studied in 21 the links between Feller properties and quasi-compactness of general semigroups. This work clarifies the links between existing results on QSDs for hypoelliptic diffusions. We also provided in 6 a counter-example to the uniqueness of a quasi-stationary distribution for a diffusion process which satisfies the weak Hörmander condition.
In collaboration with A. Watson (Univ. College London, UK), we studied a general fragmentation-growth equation with unbounded rates. The work is decomposed into two parts: in the first part, it is shown that the equation admits a unique solution on a certain definition space; in the second part, spectral properties of the solution are established 36. The proofs are largely based on 7 and on fine properties of piecewise deterministic Markov processes.
In 33, we obtained a central limit theorem and Berry-Esseen estimates for Markov processes conditioned to never be absorbed (so-called $Q$-processes) under the general conditions of 7. We also obtained in 34 quasi-ergodic theorems for particular time-inhomogeneous Markov processes, whose time-inhomogeneity is asymptotically periodic, with applications to processes absorbed at a moving boundary.
6.1.2 Adaptive dynamics in biological populations
We continued our study of parameter scalings of individual-based models of biological populations under mutation and selection, taking into account the influence of negligible but non-extinct populations. In a work within the ERC SINGER, in collaboration with S. Méléard (École Polytechnique), S. Mirrahimi (Univ. Montpellier) and V.C. Tran (Univ. Paris Est Marne-la-Vallée) 23, we were able to give an individual-based justification of the Hamilton-Jacobi equation of adaptive dynamics (see e.g. 70), with a specific parameter scaling that is promising for the study of local (in space) extinction of subpopulations. The analysis of models allowing for such an extinction is the next step of this project.
We also worked on general evolutionary models of adaptive dynamics under an assumption of large population and small mutations. This year, we obtained in 22 existence, uniqueness and ergodicity results for a centered version of the Fleming-Viot process of population genetics, which is a key step to recover variants of the canonical equation of adaptive dynamics, which describes the long time evolution of the dominant phenotype in the population, under less stringent biological assumptions than in previous works. We plan to complete this project next year.
6.1.3 Multitype bisexual branching process and branching processes with Moran interaction
The asexual multitype Galton-Watson branching processes as well as the single-type bisexual processes have been studied in the literature. In particular, survival conditions for these processes are well known in both cases. However, until now, the multitype bisexual branching processes have only been studied in very specific situations and no general mathematical description has been established yet.
In 31, we studied general multitype bisexual branching processes with superadditive mating function. We exhibited a necessary and sufficient condition for almost sure extinction, we proved a law of large numbers for our model and we studied the longtime convergence of the rescaled process.
In collaboration with E. Horton (Inria ASTRAL team) and A. Cox (Univ. of Bath, UK), we proposed a new population size model that gathers branching processes and processes with Moran-type interactions in the same framework. We studied this process in a setting of large population and long time 27. Its dynamics are related to the evolution of a non-conservative semigroup, whose spectral properties provide information on the long time behavior of the model.
6.1.4 Reconstruction of epigenetic landscapes from single-cell data
Joint work with E. Ventre (ENS Lyon), T. Espinasse (Univ. Lyon 1), G. Benoit (Univ. Rennes 1) and O. Gandrillon (ENS Lyon).
The aim is to better understand how living cells make decisions (e.g., differentiation of a stem cell into a particular specialized type), seeing decision-making as an emergent property of an underlying complex molecular network. Indeed, it is now proven that cells react probabilistically to their environment: cell types do not correspond to fixed states, but rather to “potential wells” of a certain energy landscape (representing the energy of the possible states of the cell) that we are trying to reconstruct. The achievement of this year is to show that the same mathematical model driven by transcriptional bursting can be used simultaneously as an inference tool, to reconstruct biologically relevant networks, and as a simulation tool, to generate realistic transcriptional profiles emerging from gene interactions 35.
6.1.5 Modeling of chronic obstructive pulmonary disease
Joint work with I. Dupin (Univ. Bordeaux), E. Maurat (Univ. Bordeaux) and J.-M. Sac-Epée (IECL).
Lung exposure to various types of particles, such as those present in cigarette smoke, can lead to chronic obstructive pulmonary disease (COPD). COPD bronchi are an area of intense immunological activity and tissue remodeling, as evidenced by the extensive immune cell infiltration and changes in tissue structures. This allows persistent contact between resident cells and stimulated immune cells. Our hypothesis is that the contact between cells is a major cause of chronic destructive or fibrotic manifestations. We aim to analyze the potential cell-cell interactions in situ in human tissues, to characterize in vitro the dynamics of the interplay, and to define a computational model with intercellular interactions which fits experimental measurements and explains the macroscopic properties of cell populations. The effects of potential therapeutic drugs modulating local intercellular interactions will be tested by simulations. Two papers have been submitted this year 29, 30.
6.1.6 Numerical simulation of diffusions
In a collaboration with A. Lejay (Inria PASTA team) and their PhD student A. Anagnostakis, D. Villemonais proposed a method for approximating general, singular diffusions by discrete time and state space processes 1. One of the main advantages over existing methods is that most of the computational cost is incurred upstream and is thus a fixed cost, independent of the number of simulations performed afterwards.
6.1.7 A toy model of an animal foraging
We considered a toy model of an animal foraging in a one-dimensional space. The position of the animal as time $t$ elapses is described by a standard Brownian motion. To survive, it needs one unit of food per unit of time, and it may stockpile any extra supply for future use, without any upper limit on the size of the stock nor any expiry date for its consumption. As for the provision of food, we assume that only half of the space is initially filled with one unit of food per unit length, and that there is no replenishment. We studied this model in 14, determining in particular the joint distribution of the position and time of death of the animal.
6.2 Regression and machine learning
Participants: Sandie Ferrigno, Jean-Marie Monnez.
6.2.1 Cramér–von Mises goodness-of-fit tests in regression models
Joint work with R. Azaïs (Inria, ENS Lyon) and M.-J. Martinez (Univ. Grenoble Alpes).
Many goodness-of-fit tests have been developed to assess the different assumptions of a (possibly heteroscedastic) regression model. Most of them are ‘directional’ in that they detect departures from a given assumption of the model. Other tests are ‘global’ (or ‘omnibus’) in that they assess whether a model fits a dataset on all its assumptions. We focus on the task of choosing the structural part of the regression function because it contains easily interpretable information about the studied relationship. We consider two nonparametric ‘directional’ tests and one nonparametric ‘global’ test, all based on generalizations of the Cramér–von Mises statistic.
To perform these goodness-of-fit tests, we have developed the R package cvmgof 42, an easy-to-use tool for practitioners, available from the Comprehensive R Archive Network (CRAN). The use of the library is illustrated through a tutorial on real data, and simulation studies are carried out in order to show how the package can be exploited to compare the 3 implemented tests. The practitioner can also easily compare the test procedures with different kernel functions, bootstrap distributions, numbers of bootstrap replicates, or bandwidths. The package was updated last year and is now in its third version. An article 2 has been published on this work in 2022.
We are now working on nonparametric tests associated with the functional form of the variance of the regression model. For this, we continue to work on the global test of Ducharme and Ferrigno in order to compare its performance with directional tests associated with the variance of the model. Many simulations are in progress and a part of this work has been presented at the CMStatistics 2022 conference 37. This will also make it possible to propose a more general package-type tool for validating the regression models used in practice.
To complete this work, we plan to assess the other assumptions of a regression model such as the additivity of the random error term. The implementation of these directional tests would enrich the cvmgof package and offer a complete easy-to-use tool for validating regression models. Moreover, the assessment of the overall validity of the model when using several directional tests will be compared with that done when using only a global test. In particular, we will discuss the well-known problem of multiple testing by comparing the results obtained from multiple test procedures with those obtained when using a global test strategy.
6.2.2 Online data analysis and online learning
Stochastic approximation, introduced by Robbins and Monro in 1951, is an important tool for the analysis of streaming data. It can be used, for example, to estimate online the parameters of a regression function 56 or the centers of clusters in unsupervised classification 50. Another type of stochastic approximation process was introduced by Benzécri in 1969 for estimating eigenvectors and eigenvalues of the $M$-symmetric expectation of a random matrix $A$ using independent observations of $A$. In all these processes, it is assumed that independent observations of the random matrix are available and that one observation or a mini-batch of observations is taken into account at each step. We are interested in cases where independent observations are not available, and in cases where, at each step, all the observations up to this step are taken into account without having to be stored. Experiments we have conducted show that this second type of process generally converges faster than the first type with a mini-batch of observations at each step.
On this topic, our works with E. Albuisson (CHRU Nancy) on constrained binary logistic regression with online standardized data 13 and on the construction and update of an online ensemble score involving linear discriminant analysis and logistic regression 12, described in the previous activity report, are now published.
In the article 32, we establish an almost sure convergence theorem of two stochastic approximation processes for estimating eigenvectors of the unknown $Q$-symmetric expectation $B$ of a random matrix: the first one is the classical process proposed by Oja and the second one makes use of past and current observations at each step. This theorem extends previous theorems to the case where the metric $Q$ is unknown and estimated online in parallel. We apply these results to streaming PCA and streaming generalized canonical correlation analysis of a random vector $Z$, both when a mini-batch of observations of $Z$ is used at each step or when all the observations up to the current step are used. Other applications are in progress, such as streaming multiple factor analyses and streaming partial PCA, as well as convergence theorems of Robbins-Monro type processes.
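For readers unfamiliar with the classical process of Oja, the following sketch estimates the first principal axis of a centered random vector from a stream, one observation per step. It covers only the simplest case (identity metric, known zero mean, step size $1/n$); the case treated in 32, where the metric $Q$ is unknown and estimated in parallel, is not reproduced here. The function name and the synthetic data are assumptions made for the example.

```python
import numpy as np

def oja_leading_eigenvector(observations, seed=0):
    """Streaming estimate of the leading eigenvector of E[Z Z^T] (Oja's process).

    observations: array of shape (n, d), rows are centered observations of Z,
                  processed one at a time and never stored by the algorithm.
    """
    rng = np.random.default_rng(seed)
    dim = observations.shape[1]
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)
    for n, z in enumerate(observations, start=1):
        # Stochastic approximation step with decreasing step size 1/n,
        # using the rank-one observation z z^T of the covariance matrix.
        w = w + (1.0 / n) * z * (z @ w)
        # Renormalize to stay on the unit sphere.
        w /= np.linalg.norm(w)
    return w

# Synthetic centered data whose first coordinate has the largest variance.
rng = np.random.default_rng(1)
data = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 0.5])
w = oja_leading_eigenvector(data)
```

The estimate is defined up to sign, so it should be compared with the true first axis through the absolute value of their inner product.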
6.3 Statistical learning and application in health
Participants: Nicolas Champagnat, Sandie Ferrigno, Anne Gégout-Petit, Aurélie Gueudin, Walid Laziri, Rodolphe Loubaton, Sophie Mézières, Jean-Marie Monnez, Anouk Rago, Pierre Vallois.
6.3.1 Estimation of reference curves for fetal weight
Our research in the field of epidemiology focuses on fetal development in the last two trimesters of pregnancy. Reference or standard curves are required in this kind of biomedical problem. Values that lie outside the limits of these reference curves may indicate the presence of a disorder. Data are from the French EDEN mother-child cohort (INSERM). This is a mother-child cohort study investigating the prenatal and early postnatal determinants of child health and development. 2002 pregnant women were recruited before 24 weeks of amenorrhoea in two maternity clinics from middle-sized French cities (Nancy and Poitiers). From May 2003 to September 2006, 1899 newborns were then included. The main outcomes of interest are fetal (via ultrasound) and postnatal growth, adiposity development, respiratory health, atopy, behaviour and bone, cognitive and motor development. We are studying fetal weight and height as a function of the gestational age in the third trimester of pregnancy. Some classical empirical and parametric methods such as polynomial regression are first used to construct these curves. For instance, polynomial regression is one of the most common parametric approaches for modeling growth data, especially during the prenatal period. However, these classical methods build upon restrictive assumptions on estimated curves. We therefore propose to work with semi-parametric LMS methods, by modifying the response variable (fetal weight) with, among others, Box–Cox transformations. An article detailing these methodologies applied to the EDEN data should be submitted next year.
Alternative nonparametric methods such as Nadaraya-Watson kernel estimation, local polynomial estimation, B-splines or cubic splines are also developed to construct these curves. The practical implementation of these methods requires working on smoothing parameters or choice of knots for the different types of nonparametric estimation. In particular, an optimal choice of these parameters has been proposed. To fit these curves, we have developed the R package quantCurves 39, an easy-to-use tool for practitioners. In addition, a graphical user interface (GUI) intended for practitioners is being developed to enable intuitive visualization of the results given by the package, and an article is in progress.
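For reference, the usual form of the LMS transformation is recalled below; it is the textbook formulation and not necessarily the exact variant applied to the EDEN data. The symbols are notation introduced here: $y$ is the fetal weight at gestational age $t$, $L(t)$ is the Box–Cox power, $M(t)$ the median curve, $S(t)$ the coefficient of variation, and $z_\alpha$ the standard normal quantile of order $\alpha$.

```latex
% Standardized score of an observation y at gestational age t
z = \frac{\left( y / M(t) \right)^{L(t)} - 1}{L(t)\, S(t)} \quad \text{if } L(t) \neq 0,
\qquad
z = \frac{\log\left( y / M(t) \right)}{S(t)} \quad \text{if } L(t) = 0.

% Reference curve of order \alpha, obtained by inverting the transformation
C_\alpha(t) = M(t) \left( 1 + L(t)\, S(t)\, z_\alpha \right)^{1 / L(t)} \quad \text{if } L(t) \neq 0,
\qquad
C_\alpha(t) = M(t) \exp\left( S(t)\, z_\alpha \right) \quad \text{if } L(t) = 0.
```

Once the three curves $L$, $M$ and $S$ have been smoothed as functions of $t$, any reference curve follows from the second formula without refitting the model.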
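The idea behind Nadaraya-Watson estimation of a quantile curve can be sketched as follows: observations are weighted by a kernel centered at the point of interest, and the quantile is read off the resulting weighted empirical distribution. The code is an illustrative Python sketch and does not reproduce the interface of the quantCurves R package; the Gaussian kernel, the fixed bandwidth and the synthetic data are assumptions made for the example.

```python
import numpy as np

def nw_conditional_quantile(x, y, x0, tau, bandwidth):
    """Kernel (Nadaraya-Watson type) estimate of the tau-quantile of Y given X = x0."""
    # Gaussian kernel weights, normalized to sum to one.
    weights = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    weights /= weights.sum()
    # Weighted empirical conditional distribution function along sorted responses.
    order = np.argsort(y)
    cdf = np.cumsum(weights[order])
    # Smallest response whose estimated conditional CDF reaches tau.
    idx = min(int(np.searchsorted(cdf, tau)), len(y) - 1)
    return y[order][idx]

# Synthetic data: Y = 2 X + Gaussian noise, so the conditional median at x0 is 2 * x0.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
y = 2.0 * x + 0.1 * rng.normal(size=2000)
median_at_half = nw_conditional_quantile(x, y, x0=0.5, tau=0.5, bandwidth=0.05)
q10 = nw_conditional_quantile(x, y, x0=0.5, tau=0.1, bandwidth=0.05)
q90 = nw_conditional_quantile(x, y, x0=0.5, tau=0.9, bandwidth=0.05)
```

Evaluating the function over a grid of `x0` values for several orders `tau` yields a family of reference curves. The bandwidth plays the role of the smoothing parameter discussed above.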
6.3.2 Prediction of silencing experiments on gene networks for chronic lymphocytic leukemia
We are working with L. Vallat (CHRU Strasbourg) on the inference of dynamical gene networks from RNA-seq and proteome data. The goal is to infer a model of gene expression that can predict gene expression in cells where the expression of specific genes is silenced (e.g. using siRNA), in order to select the silencing experiments which are most likely to reduce cell proliferation. We expect the selected genes to provide new therapeutic targets for the treatment of chronic lymphocytic leukemia. This year, we have developed a package for the statistical analysis of temporal gene expression datasets with several biological conditions (in particular for exploratory analysis and the detection of differentially expressed genes), which will be submitted soon to Bioconductor.
6.3.3 Covariates selection in high-dimensional data of genetic profiles in oncology
Joint work with T. Boukhobza and H. Dumond from CRAN, and B. Bastien from biopharmaceutical industry Transgene.
We proposed a new methodology for selecting and ranking covariates associated with a variable of interest in a context of highdimensional data under dependence but few observations. The methodology successively intertwines the clustering of covariates, decorrelation of covariates using Factor Latent Analysis, selection using aggregation of adapted methods and finally ranking. We first applied our method to transcriptomic data of 37 patients with advanced nonsmallcell lung cancer who have received chemotherapy, to select the transcriptomic covariates that explain the survival outcome of the treatment. Secondly, we applied our method to 79 breast tumor samples to define patient profiles for a new metastatic biomarker and associated gene network in order to personalize the treatments. This work is published in 3 and is implemented in the R package ‘ARMADA’.
6.3.4 Multidimensional statistical analysis of information for clinical use
The startup EMOSIS develops blood tests relying on flow cytometry in order to improve in vitro diagnosis of vascular thrombosis. This technology leads to multiparametric measurements on tens of thousands of cells collected from each blood sample. Manual methods of analysis classically used in flow cytometry are based on data visualization by means of histograms or scatter plots. Recent progress in the active area of computational methods for dimension reduction suggests many directions for improving the classical approaches to the analysis of flow cytometry data. Our first goal is to define and operate such methods. We have started to focus on methods of information geometry and topological data analysis. Once appropriate methods have been identified, an important aspect to consider will be visualization of the results in a way that is easy for clinicians to interpret and ethically permissible. In the longer term, our ambition is to design more accurate prediction tools for diagnosis.
7 Bilateral contracts and grants with industry
Participants: Anne Gégout-Petit, Walid Laziri, Sophie Mézières, Bruno Scherrer.
7.1 Bilateral contracts with industry
 B. Scherrer collaborated with Google Brain on reinforcement learning in the framework of the PhD thesis of Nino Vieillard, until June 2022.
 As part of the French “Plan de relance”, we obtained funds for a 2-year engineering contract with the startup EMOSIS based in Strasbourg (from October 1, 2022). Project MOSAiC: MultidimensiOnal Statistical Analysis of Information for Clinical use.
8 Partnerships and cooperations
Participants: Virgile Brodu, Nicolas Champagnat, Léo Darrigade, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, Joseph Lam-Weil, Rodolphe Loubaton, Sophie Mézières, Jean-Marie Monnez, Aurélie Muller-Gueudin, Anouk Rago, Pierre Vallois, Denis Villemonais, Nicolás Zalduendo Vidal.
8.1 International initiatives
8.1.1 Associate Teams in the framework of an Inria International Lab or in the framework of an Inria International Program
MAGO

Title:
Modelling and analysis for growth-fragmentation processes

Coordinator:
E. Horton (Inria Bordeaux)

Partner Institutions:
 Inria Bordeaux and Nancy (ASTRAL and BIGS teams, E. Horton, C. Fritsch, D. Villemonais)
 University College London (F.-X. Briol, O. Key, A. Watson)

Date/Duration:
2022-2024

Total amount of the grant:
9000€ in 2022, in 2023 and 7000€ in 2024

Description:
Growth-fragmentation (GF) refers to a collection of mathematical models in which objects (classically, biological cells) slowly gather mass over time, and fragment suddenly into multiple, smaller offspring. These models may be used to represent a range of biological processes in which an individual reproduces by fission into two or more new individuals, such as the evolution of plasmids in bacteria populations and protein polymerisation. It is crucial to understand the long-term behaviour of GF processes so that they can be used to build algorithms to simulate real-world processes and estimate quantities such as the growth rate of the system, the steady-state behaviour, and the fragmentation rate and kernel, allowing scientists to gain a better understanding of the behaviour of these complex systems. In this project, we aim to combine probabilistic and statistical tools to study these processes. In particular, we will employ methods from branching processes, quasi-stationary distributions and interacting particle systems to study their long-term behaviour and develop numerical simulations. Further, we will develop likelihood-free methods to estimate the model parameters, followed by goodness-of-fit tests to analyse the strength of these methods when working with real data.
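The mechanism described above can be sketched with a toy simulation. The Python code below is only an illustration under strong assumptions that are not taken from the MAGO project: exponential individual growth, a fragmentation rate proportional to mass, binary fragmentation with a uniformly distributed split, and a simple time discretization. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_growth_fragmentation(m0=1.0, growth=1.0, frag=1.0,
                                  dt=0.01, steps=300, seed=0):
    """Simulate a population of cells that grow and fragment.

    Each time step, every cell mass is multiplied by (1 + growth * dt),
    then the cell splits in two with probability frag * mass * dt.
    Returns the list of cell masses after `steps` time steps.
    """
    rng = random.Random(seed)
    cells = [m0]
    for _ in range(steps):
        next_cells = []
        for m in cells:
            m *= 1.0 + growth * dt  # slow deterministic growth
            if rng.random() < min(1.0, frag * m * dt):
                # Sudden fragmentation: mass is shared between two offspring.
                u = rng.random()
                next_cells.extend([u * m, (1.0 - u) * m])
            else:
                next_cells.append(m)
        cells = next_cells
    return cells


def empirical_growth_rate(cells, m0, dt, steps):
    """Exponential growth rate of the total mass over the simulated horizon."""
    return math.log(sum(cells) / m0) / (steps * dt)


if __name__ == "__main__":
    cells = simulate_growth_fragmentation()
    print("number of cells:", len(cells))
    print("growth rate of total mass:", empirical_growth_rate(cells, 1.0, 0.01, 300))
```

Because fragmentation conserves mass in this toy model, the total mass grows at a rate fixed by the growth parameter alone, while the number and size distribution of cells depend on the fragmentation rate and kernel. Those are the quantities that estimation methods would target in richer models.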
8.2 European initiatives
8.2.1 ERC projects
N. Champagnat is a scientific collaborator of the ERC SINGER (AdG 101054787) on Stochastic dynamics of sINgle cells, coordinated by S. Méléard (Ecole Polytechnique). He is involved in the research axes “From stochastic processes to singular Hamilton-Jacobi equations” and “Lineages and time reversed trajectories” of this project.
8.3 National initiatives
 FHU CARTAGE (Fédération Hospitalo-Universitaire Cardial and ARTerial AGEing). Leader: Pr. A. Benetos. Participants: J.-M. Monnez, B. Lalloué, A. Gégout-Petit.
 RHU Fight HF (Fighting Heart Failure), located at the University Hospital of Nancy. Leader: Pr. P. Rossignol. Participants: J.-M. Monnez, B. Lalloué.
 ITMO Physics, Mathematics applied to Cancer (from 2017 until June 2022): “Modeling ctDNA dynamics for detecting targeted therapy resistance”. Funding bodies: ITMO Cancer, ITMO Technologies pour la santé de l’alliance nationale pour les sciences de la vie et de la santé (AVIESAN), INCa. Partners: Inria and IECL (Institut Élie Cartan de Lorraine), CHRU Strasbourg, CRAN (Centre de Recherche en Automatique de Nancy) and ICL (Institut de Cancérologie de Lorraine). Leader: N. Champagnat. Participants: L. Darrigade, C. Fritsch, A. Gégout-Petit, U. Herbach, A. Muller-Gueudin, P. Vallois.
 GDR 720 ISIS (funded by CNRS). Leader: L. Blanc-Féraud. Participant: S. Mézières.
 Réseau Thématique MathSAV (funded by CNRS). Leader: Fabien Crauste. Participants: N. Champagnat, L. Darrigade, C. Fritsch, V. Hass, U. Herbach, J. Lam, R. Loubaton, A. Rago, N. Zalduendo Vidal.
 Chair “Modélisation Mathématique et Biodiversité” between VEOLIA, Ecole Polytechnique, Museum National d'Histoire Naturelle and Fondation X (funded by VEOLIA). Leader: S. Méléard. Participants: V. Brodu, N. Champagnat, C. Fritsch, V. Hass, D. Villemonais, N. Zalduendo Vidal.
8.4 Regional initiatives
 A regional PACTE fund was obtained to host the Low-grade Glioma database and use it for diverse purposes: teaching, dissemination and development of experimentation tools. We continue to build the PIANO platform. Participant: S. Mézières.
 Région Grand-Est: in the context of the Telomere project, A. Gégout-Petit and D. Villemonais obtained a grant from the Grand-Est region to hire J. Lam-Weil as a postdoctoral fellow. The University of Lorraine and the LUE GEENAGE program completed the grant.
9 Dissemination
Participants: Thierry Bastogne, Virgile Brodu, Nicolas Champagnat, Sandie Ferrigno, Coralie Fritsch, Anne Gégout-Petit, Vincent Hass, Ulysse Herbach, Rodolphe Loubaton, Sophie Mézières, Aurélie Muller-Gueudin, William Oçafrain, Anouk Rago, Bruno Scherrer, Denis Villemonais, Nicolás Zalduendo Vidal.
9.1 Promoting scientific activities
9.1.1 Scientific events: organization
Member of the organizing committees
 A. Gégout-Petit was organizer of the “Journées d'Etudes en Statistique” on compositional data, held in Fréjus in October.
 A. Gégout-Petit was co-organizer of the frENBIS event “Journées de Statistique industrielle”, online, in April.
 U. Herbach is co-organizer of the Probability and Statistics weekly seminar at IECL in Nancy.
 D. Villemonais was co-organizer, with Cécile Mailler (Univ. Bath), of the workshop on Pólya urns, stochastic approximation and quasi-stationary distributions: new developments, held in Bath in April.
9.1.2 Scientific events: selection
Member of the conference program committees
 N. Champagnat was member of the scientific committee of JdS 2022 (53èmes Journées de Statistique de la SFdS), held in Lyon in June.
 In the framework of the Journées d'Etudes en Statistique 2021, A. Gégout-Petit was co-editor of a book on Missing data 19.
9.1.3 Journal
Member of the editorial boards
N. Champagnat serves as associate editor of ESAIM: Probability & Statistics and Stochastic Models.
Reviewer - reviewing activities
The members of the team wrote referee reports for Acta Applicandae Mathematicae, Annals of Applied Probability, Annals of Applied Statistics, Annals of Probability, Bulletin of Mathematical Biology, COVID, Discrete and Continuous Dynamical Systems Series B, Electronic Journal of Probability, Journal de l'École Polytechnique, Journal of Mathematical Biology, Stochastics and Partial Differential Equations: Analysis and Computations, Stochastic Models, Stochastic Processes and their Applications, Viruses.
9.1.4 Invited talks
 N. Champagnat has been invited to give talks at the scientific day “EcoEvoMath: Building on Twenty Years of Research and Training at the Crossroads of Ecology, Evolution, and Mathematics” in Paris in July, at the “AG du département BioSiS du CRAN” in Nancy in May, at the Journées de Probabilités 2022 in Orbey in May, at the workshop on Pólya urns, stochastic approximation and quasi-stationary distributions in Bath in April, at the “Journée cancérologie : innovations et expérience patient, d'hier à demain” in Nancy in April, and at the workshop on Population Dynamics and Statistical Physics in Synergy in Oberwolfach in March.
 C. Fritsch has been invited to give talks at the Mathematical models in ecology and evolution in Paris in March, and at the Mathematical models in ecology and evolution conference in Reading in July.
 V. Hass has been invited to give talks at the Mathematical models in ecology and evolution conference in Reading in July, at MPDEE 2022 (Models in Population Dynamics, Ecology and Evolution) in Turin in June, and at the Journées de Probabilités 2022 in Orbey in May. He also gave seminar talks at the "Séminaire de l'équipe SPOC (Statistiques, Probabilités, Optimisation et Contrôle)" in Dijon in May, and at the "Séminaire de probabilités et systèmes dynamiques" in Amiens in March.
 U. Herbach has been invited to give talks at the London Institute for Mathematical Sciences in London in March, at the Laboratoire de Biologie et Modélisation de la Cellule in Lyon in April, at the Journées Math Bio Santé in Besançon in October, and at the Groupe de travail de Statistique in Rouen in December.
 W. Oçafrain has been invited to give a talk at the workshop on Pólya urns, stochastic approximation and quasi-stationary distributions in Bath in April.
 D. Villemonais has been invited to give talks at the workshop "Singular diffusions: numerical and theoretical aspects" in Nancy in May, at the Journées de Probabilités 2022 in Orbey in May and at the workshop on Pólya urns, stochastic approximation and quasi-stationary distributions in Bath in April. He also gave seminar talks at the Séminaire de calcul stochastique de Strasbourg in Strasbourg in December, at the Séminaire de probabilités d'Orsay in Orsay in November and at the Séminaire FHU CARTAGE in Nancy in November.
 N. Zalduendo Vidal has been invited to give talks at the Journées de Probabilités 2022 in Orbey in May, and at the Mathematical models in ecology and evolution conference in Reading in July.
9.1.5 Leadership within the scientific community
A. Gégout-Petit is vice-president of the European Network for Business and Industrial Statistics (ENBIS).
9.1.6 Scientific expertise
A. Gégout-Petit served on the hiring committees for a biostatistics ‘Professeur’ position at Sorbonne Univ., and for three ‘Chaire de professeur junior’ (CPJ) positions at CNRS (INSMI, SPJ Monaie), Univ. Lorraine (Biostatistics), and Univ. de Pau et des Pays de l'Adour (Artificial Intelligence).
9.1.7 Research administration
 N. Champagnat is a member of the COMIPERS and the Commission Information Scientifique et Technique of Inria Nancy - Grand Est and Responsable Scientifique for the library of Mathematics of the IECL. He is also local correspondent of the COERLE (Comité Opérationnel d'Évaluation des Risques Légaux et Éthiques) for the Inria Research Center of Nancy - Grand Est.
 C. Fritsch is a member of the Commission du Développement Technologique of Inria Nancy - Grand Est and of the Commission du personnel of IECL. She was the local Radar correspondent for the Inria Research Center of Nancy - Grand Est until June.
 A. Gégout-Petit is the head of IECL.
9.2 Teaching - Supervision - Juries
9.2.1 Teaching
BIGS faculty members have teaching obligations at Univ. Lorraine and teach at least 192 hours each year. They teach probability and statistics at different levels (Licence, Master, Engineering school). Many of them have pedagogical responsibilities.
 D. Villemonais is the head of the Mathematical Engineering Major of ENSMN, Université de Lorraine.
 T. Bastogne is in charge of the research master program “Santé Numérique et Imagerie Médicale” with the Faculty of Medicine, Université de Lorraine.
 Licence: V. Brodu, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
 Licence: V. Brodu, Numerical Analysis tutorial, 20h, L3, first year of ENSMN, Université de Lorraine.
 Master: N. Champagnat, Introduction to Quantitative Finance, 12h, M1, second year of ENSMN, Université de Lorraine.
 Master: N. Champagnat, Introduction to Quantitative Finance, 9h, M2, third year of ENSMN, Université de Lorraine.
 Master: S. Ferrigno, Experimental designs, 4.5h, M1, fourth year of EEIGM, Université de Lorraine.
 Master: S. Ferrigno, Data analyzing and mining, 36h, M1, second year of ENSMN, Université de Lorraine.
 Master: S. Ferrigno, Modeling and forecasting, 32h, M1, second year of ENSMN, Université de Lorraine.
 Master: S. Ferrigno, Training projects, 18h, M1/M2, second and third year of ENSMN, Université de Lorraine.
 Licence: S. Ferrigno, Descriptive and inferential statistics, 60h, L2, second year of EEIGM, Université de Lorraine.
 Licence: S. Ferrigno, Statistical modeling, 60h, L2, second year of EEIGM, Université de Lorraine.
 Licence: S. Ferrigno, Mathematical and computational tools, 20h, L3, third year of EEIGM, Université de Lorraine.
 Licence: S. Ferrigno, Training projects, 40h, L1/L3, first, second and third year of EEIGM, Université de Lorraine.
 Master: C. Fritsch, Inverse problem, 18h, M1, second year of ENSMN, Université de Lorraine.
 Licence: C. Fritsch, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
 Master: A. Gégout-Petit, Statistics, modeling, data analysis, 80h, master in applied mathematics, Université de Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 1A, 38h, L1/L2, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 2A, 19h, L2, second year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Probabilités, 40h, L3, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Analyse numérique et optimisation, 45h, L3, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Recherche opérationnelle, 18h, L3, first year of ENSMN, Université de Lorraine.
 Master: V. Hass, Méthodes stochastiques pour le calcul, 14h, M1, second year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 1A, 70h, L1/L2, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 2A, 19h, L2, second year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Probabilités, 40h, L3, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Analyse numérique et optimisation, 45h, L3, first year of ENSMN, Université de Lorraine.
 Licence: V. Hass, Recherche opérationnelle, 18h, L3, first year of ENSMN, Université de Lorraine.
 Licence: R. Loubaton, Inférence statistique, 42h, L3, first year of ENSMN, Université de Lorraine.
 Master: R. Loubaton, Analyse de données, 18h, M1, second year of ENSMN, Université de Lorraine.
 Master: R. Loubaton, Introduction à l'apprentissage automatique, 14h, M1, second year of ENSMN, Université de Lorraine.
 Master: R. Loubaton, Introduction au deep learning, 14h, M1, second year of ENSMN, Université de Lorraine.
 Licence: R. Loubaton, Analyse numérique, 44h, L3, first year of ENSMN, Université de Lorraine.
 Licence: R. Loubaton, Remédiation mathématique pour étudiants étrangers, 36h, L3, first year of ENSMN, Université de Lorraine.
 Licence: R. Loubaton, Géométrie et vecteurs pour la physique, 25h, L1, first year of EEIGM, Université de Lorraine.
 Licence: R. Loubaton, Analyse, 25h, L1, first year of ENGSI, Université de Lorraine.
 Master: A. Rago, Modélisation et Prévision, 14h, M1, second year of ENSMN, Université de Lorraine.
 Licence: A. Rago, Analyse numérique et optimisation, 20h, L3, first year of ENSMN, Université de Lorraine.
 Master: A. Rago, Analyse de données, 18h, M1, second year of ENSMN, Université de Lorraine.
 Master: A. Rago, Statistiques pour la grande dimension, 18h, M2 IMSD/third year of ENSMN, Université de Lorraine.
 Master: D. Villemonais, Probability Theory II, 63h, M1, second year of ENSMN, Université de Lorraine.
 Master: D. Villemonais, Stochastic processes, 32h, Master 2 MFA, Université de Lorraine.
 Master: D. Villemonais, Modeling and forecasting, 14h, M1, second year of ENSMN, Université de Lorraine.
 Licence: D. Villemonais, Probability Theory I, 57h, L3, first year of ENSMN, Université de Lorraine.
 Master: S. Wantz-Mézières, Learning and analysis of medical data, 36h, with J.M. Moureaux, M2 SNIM, Université de Lorraine.
 Licence: S. Wantz-Mézières, Applied mathematics for management, financial mathematics, Probability and Statistics, 160h, IUT Nancy-Charlemagne (L1/L2/L3), Université de Lorraine.
 Licence: S. Wantz-Mézières, Probability, 100h, first year in TELECOM Nancy (initial and apprenticeship cursus), Université de Lorraine.
 Licence: N. Zalduendo Vidal, Probability Theory tutorial, 40h, L3, first year of ENSMN, Université de Lorraine.
 Licence: N. Zalduendo Vidal, Numerical Analysis tutorial, 20h, L3, first year of ENSMN, Université de Lorraine.
9.2.2 Supervision
PhD
 PhD in progress: Virgile Brodu, “Émergence des allométries dans les systèmes écologiques : comportement stationnaire de modèles déterministes et stochastiques de flux d’énergie et de biomasse”, grant ENS Lyon. Advisors: S. Billiard (Univ. Lille), N. Champagnat, C. Fritsch.
 PhD in progress: Vincent Hass, “Individual-based models in adaptive dynamics and long time evolution under assumptions of rare advantageous mutations”, grant Inria-Cordi, currently ATER in ENSMN. Advisor: N. Champagnat.
 PhD in progress: Rodolphe Loubaton, “Caractérisation des cibles thérapeutiques dans un programme génique tumoral”, grant Région Grand-Est, currently ATER in EEIGM. Advisors: N. Champagnat and L. Vallat (CHRU Strasbourg).
 PhD in progress: Anouk Rago, “Inférence de réseaux de gènes dynamiques et prédiction d’expériences d’interventions biologiques dans des cellules cancéreuses”, grant Région Grand-Est, Inria. Advisors: N. Champagnat, A. Gégout-Petit.
 PhD: Nino Vieillard, “Approximate Dynamic Programming and Deep Reinforcement Learning”, CIFRE with Google Brain. Advisors: B. Scherrer, M. Geist (Google Brain), defended on June 30.
 PhD in progress: Nicolás Zalduendo Vidal, “Processus de branchement bisexués multitypes”, grant Inria-Cordis. Advisors: C. Fritsch, D. Villemonais.
Other
 M2 internship: Virgile Brodu, "Emergence des allométries dans les systèmes écologiques : convergence d'un modèle individu-centré de flux d'énergie vers la solution d'un système d'équations intégro-différentielles" (ENS Lyon). Advisors: N. Champagnat, C. Fritsch and S. Billiard (Univ. Lille).
 Research project: Hassan Berrada, “Condition de survie et d'extinction pour un modèle bisexué” (M2 ENSMN). Advisors: C. Fritsch, N. Zalduendo Vidal.
 Research project: Hugo Breton, “Interface graphique et package quantCurves” (M2 ENSMN). Advisor: S. Ferrigno.
 Parcours Recherche: Romain Maillard, “Inférence statistique de réseaux de gènes à partir de graphes dynamiques” (full-year research project, M1 ENSMN). Advisor: U. Herbach.
 Project M1: Guillaume Nodet and May Ouir, “Inférence de réseaux de gènes avec Random Forest” (M1 ENSMN). Advisor: R. Loubaton.
9.2.3 Juries
 PhD: N. Champagnat, reviewer, PhD thesis of Apolline Louvet, “Modèles probabilistes de génétique des populations pour les populations en expansion”, Institut polytechnique de Paris.
 PhD: A. Gégout-Petit, reviewer, PhD thesis of Guillaume BottazBottom, “Classification de trajectoires d’observances de patients atteints d’un syndrome d’apnées obstructives du sommeil”, Univ. Grenoble-Alpes.
 HDR: A. GégoutPetit, reviewer, HDR thesis of Frédéric Proia, “Autorégressifs à coefficients variables — Modèles graphiques partiels — Applications aux sciences du vivant”, Univ. Angers.
 HDR: A. Gégout-Petit, HDR thesis of Romain Azaïs, “Approches algorithmiques pour la statistique : processus déterministes par morceaux et arbres aléatoires”, ENS Lyon.
 PhD: A. Gégout-Petit, thesis of Olivier Coudray, “Un point de vue statistique sur les critères de fatigue: de la classification supervisée à l’apprentissage positif-non labellisé”, Université Paris-Saclay.
 PhD: A. Gégout-Petit, thesis of Cécile Spychala, “Statistical analysis of road accidents in the region Franche-Comté: risk factors for accident injuries and spatial modelling for accident occurrences”, Univ. Besançon, Franche-Comté.
 PhD: B. Scherrer, reviewer, PhD thesis of Léonard Blier, "Some Principled Methods for Deep Reinforcement Learning", Univ. Paris-Saclay.
 PhD: B. Scherrer, reviewer, PhD thesis of Giovanni Gatti Pinheiro, "Apprentissage par renforcement appliqué au Revenue Management des compagnies aériennes", Univ. Côte d’Azur.
 PhD: B. Scherrer, reviewer, PhD thesis of Chen Yan, "Asymptotically Optimal Policies for Restless Bandits", Univ. Grenoble-Alpes.
 Prize: A. Gégout-Petit, member of the committee for the AMIES PhD prize.
9.3 Popularization
9.3.1 Education
 S. Mézières: organisation of a research training week on Neuro-Oncology and Numerics, for medical and engineering students, January 2022.
9.3.2 Interventions
 C. Fritsch made two interventions in the Lycée Cormontaigne in Metz, as part of the “Chiche!” program, in November.
 S. Ferrigno: Advisor of a group of students, “Traitement statistique de données” Project, various high schools, Nancy.
 S. Ferrigno: Advisor of a group of students, “La main à la Pâte” Project, Institut médicoéducatif (IME), Commercy.
 S. Ferrigno: Advisor of a group of students, “La main à la Pâte”, “C'Génial” Projects, Colleges, Malzéville and Nancy.
 S. Ferrigno: Advisor of a group of students, “La main à la Pâte” Project, elementary schools, Nancy.
 U. Herbach gave a public lecture “Les maths peuvent-elles servir à vaincre le cancer ?” in Ambert in October, for the breast cancer national awareness campaign “Octobre Rose”. On this occasion, he also gave several scientific outreach talks in secondary and high schools of Ambert.
 R. Loubaton gave an introduction to artificial intelligence at the conference "Être humain à l'âge de l'IA" in Paris in July.
 J.-M. Monnez gave a masterclass on online data analysis 38 to the BIGS working group in Nancy, in October and November.
10 Scientific production
10.1 Publications of the year
International journals
 1 articleGeneral diffusion processes as the limit of time-space Markov chains.Annals of Applied Probability2023
 2 articlecvmgof: an R package for Cramér-von Mises goodness-of-fit tests in regression models.Journal of Statistical Computation and Simulation9262022, 12461266
 3 articleA statistical methodology to select covariates in high-dimensional data under dependence. Application to the classification of genetic profiles in oncology.Journal of Applied Statistics493March 2022, 764781
 4 articleA state of the art in analytical quality-by-design and perspectives in characterization of nano-enabled medicinal products.Journal of Pharmaceutical and Biomedical Analysis219September 2022, 114911
 5 articleiQbD: a TRL-indexed quality-by-design paradigm for medical device engineering.Journal of Medical Devices162June 2022, 021008
 6 articleTranscritical bifurcation for the conditional distribution of a diffusion process.Journal of Theoretical ProbabilityNovember 2022
 7 articleGeneral criteria for the study of quasi-stationarity.Electronic Journal of Probability2023
 8 articleA PDMP model of the epithelial cell turnover in the intestinal crypt including microbiota-derived regulations.Journal of Mathematical Biology847June 2022, 167
 9 articlePredicting acute severe toxicity for head and neck squamous cell carcinomas by combining dosimetry with a radiosensitivity biomarker : a pilot study.TumoriMay 2022
 10 articleInvert emulsions alleviate biotic interactions in bacterial mixed culture.Microbial Cell Factories221December 2023, 16
 11 articleAB186 inhibits migration of triple-negative breast cancer cells and interacts with α-Tubulin.International Journal of Molecular Sciences2312June 2022, 6859
 12 articleConstruction and Update of an Online Ensemble Score Involving Linear Discriminant Analysis and Logistic Regression.Applied Mathematics132February 2022, 228242
 13 articleStreaming constrained binary logistic regression with online standardized data.Journal of Applied Statistics4962022, 15191539
 14 articleOn a first hit distribution of the running maximum of Brownian motion.Stochastic Processes and their Applications150June 2022
 15 articleIntensive chemotherapy followed by autologous stem cell transplantation in primary central nervous system lymphomas (PCNSLs). Therapeutic outcomes in real lifeexperience of the French network.Bone Marrow Transplantation576April 2022, 966974
 16 articleClimate change-induced background tree mortality is exacerbated towards the warm limits of the species ranges.Annals of Forest Science791December 2022, 23
International peerreviewed conferences
 17 inproceedingsOffline Reinforcement Learning as Anti-Exploration.AAAI 2022 - 36th AAAI Conference on Artificial IntelligenceVancouver, CanadaFebruary 2022
Scientific books
 18 bookMathématiques pour les sciences de l’ingénieur  Tout le cours en fiches: 3ème édition.Dunod2022, 576 pages
 19 bookDonnées manquantes.Editions TechnipJune 2022
Scientific book chapters
 20 inbookUne histoire lacunaire.Données manquantesEditions TechnipJune 2022, 127
Reports & preprints
 21 miscQuasi-compactness criterion for strong Feller kernels with an application to quasi-stationary distributions.April 2022
 22 miscExistence, uniqueness and ergodicity for the centered Fleming-Viot process.March 2022
 23 miscFilling the gap between individual-based evolutionary models and Hamilton-Jacobi equations.May 2022
 24 miscQuasi-limiting estimates for periodic absorbed Markov chains.2022
 25 miscQuasi-stationary distributions in reducible state spaces.January 2022
 26 miscQuasi-stationary behavior for an hybrid model of chemostat: the Crump-Young model.May 2022
 27 miscBinary branching processes with Moran type interactions.July 2022
 28 miscPenalized polytomous ordinal logistic regression using cumulative logits. Application to network inference of zero-inflated variables.August 2022
 29 miscProbabilistic Cellular Automata modeling of intercellular interactions in airways : complex pattern formation in patients with Chronic Obstructive Pulmonary Disease.October 2022
 30 miscShort-range interactions between fibrocytes and CD8+ T cells in COPD bronchial inflammatory response.October 2022
 31 miscThe Multitype Bisexual Galton-Watson Branching Process.June 2022
 32 miscStochastic approximation of eigenvectors and eigenvalues of the Q symmetric expectation of a random matrix.2022, 115
 33 miscA central limit and Berry-Esseen theorem for continuous-time Markov processes conditioned not to be absorbed.March 2022
 34 miscAn ergodic theorem for asymptotically periodic time-inhomogeneous Markov processes, with application to quasi-stationarity with moving boundaries.April 2022
 35 miscOne model fits all: combining inference and simulation of gene regulatory networks.June 2022
 36 miscA quasi-stationary approach to the long-term asymptotics of the growth-fragmentation equation.February 2022
Other scientific publications
 37 inproceedingsGOODNESS-OF-FIT TESTS FOR VARIANCE FUNCTION IN REGRESSION MODELS.CMStatistics 2022London, United KingdomDecember 2022
10.2 Other
Educational activities
 38 unpublishedAnalyse des données en flux. Analyse en composantes principales et méthodes dérivées.October 2022, DoctoralFrance
Softwares
 39 softwarequantCurves:Estimate Quantiles Curves.1.0.0March 2022CeCILL
10.3 Cited publications
 40 articleSpatial eco-evolutionary dynamics along environmental gradients: multistability and cluster dynamics.Ecology Letters225May 2019, 767777
 41 articleNon-Parametric Estimation of the Conditional Distribution of the Inter-jumping Times for Piecewise-Deterministic Markov Processes.Scandinavian Journal of Statistics414December 2014, 950969
 42 softwarecvmgof: Cramer-von Mises goodness-of-fit tests.1.0.0November 2018CeCILL
 43 articleOptimal choice among a class of nonparametric estimators of the jump rate for piecewise-deterministic Markov processes.Electronic journal of statistics 2016
 44 articleA recursive nonparametric estimator for the transition kernel of a piecewise-deterministic Markov process.ESAIM: Probability and Statistics182014, 726749
 45 inproceedingsNonparametric estimation of the jump rate for nonhomogeneous marked renewal processes.Annales de l'Institut Henri Poincaré, Probabilités et Statistiques494Institut Henri Poincaré2013, 12041231
 46 articleIdentification of pharmacokinetics models in the presence of timing noise.Eur. J. Control1422008, 149157URL: http://dx.doi.org/10.3166/ejc.14.149157
 47 articlePhenomenological modeling of tumor diameter growth based on a mixed effects model.Journal of theoretical biology26232010, 544552
 48 unpublishedDegenerate processes killed at the boundary of a domain.2021, working paper or preprint
 49 bookNeurodynamic Programming.Athena Scientific1996
 50 articleA fast and recursive algorithm for clustering large datasets with k-medians.Computational Statistics and Data Analysis562012, 14341449
 51 articleAdaptation in a stochastic multiresources chemostat model.Journal de Mathématiques Pures et Appliquées1016June 2014, 755788
 52 articleExponential convergence to quasi-stationary distribution and Q-process.Probability Theory and Related Fields164146 pages2016, 243283

 53 articlePractical criteria for $R$-positive recurrence of unbounded semigroups.Electronic Communications in Probability25none2020, 1  11URL: https://doi.org/10.1214/20ECP288
 54 articlePiecewise-deterministic Markov processes: A general class of non-diffusion stochastic models.Journal of the Royal Statistical Society. Series B (Methodological)1984, 353388
 55 articleStatistical estimation of a growth-fragmentation model observed on a genealogical tree.Bernoulli2132015, 17601799
 56 articleSequential linear regression with online standardized data.PLoS ONE2018, 127
 57 articleUn test d'adéquation global pour la fonction de répartition conditionnelle.C. R. Math. Acad. Sci. Paris34152005, 313316URL: http://dx.doi.org/10.1016/j.crma.2005.07.003
 58 articleUniform law of the logarithm for the local linear estimator of the conditional distribution function.C. R. Math. Acad. Sci. Paris34817182010, 10151019URL: http://dx.doi.org/10.1016/j.crma.2010.08.003
 59 articleSparse inverse covariance estimation with the graphical lasso.Biostatistics932008, 432441
 60 article A numerical approach to determine mutant invasion fitness and evolutionary singular strategies. Theoretical Population Biology 115, 2017, 89-99
 61 article Graph selection with GGMselect. Statistical applications in genetics and molecular biology 11(3), 2012
 62 inproceedings Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles. ISAAC (1), 2010, 415-426
 63 article Inferring gene regulatory networks from single-cell data: a mechanistic approach. BMC Systems Biology 11(1), November 2017, 105
 64 article From persistent random walk to the telegraph noise. Stoch. Dyn. 10(2), 2010, 161-196. URL: http://dx.doi.org/10.1142/S0219493710002905
 65 incollection Modeling subtilin production in bacillus subtilis using stochastic hybrid systems. Hybrid Systems: Computation and Control, Springer, 2004, 417-431
 66 article Multinomial model-based formulations of TCP and NTCP for radiotherapy treatment planning. Journal of Theoretical Biology 279(1), June 2011, 55-62. URL: http://hal.inria.fr/hal00588935/en
 67 book Quantile regression. 38, Cambridge university press, 2005
 68 incollection On the Benzecri's method for computing eigenvectors by stochastic approximation (the case of binary data). Compstat 1974 (Proc. Sympos. Computational Statist., Univ. Vienna, Vienna, 1974), Vienna, Physica Verlag, 1974, 202-211
 69 inproceedings Non-Stationary Approximate Modified Policy Iteration. ICML 2015, Lille, France, July 2015
 70 article Dirac mass dynamics in multidimensional nonlocal parabolic equations. Communications in Partial Differential Equations 36(6), 2011, 1071-1098
 71 article High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 2006, 1436-1462
 72 article Approximation stochastique en analyse factorielle multiple. Ann. I.S.U.P. 50(3), 2006, 27-45
 73 article Convergence d'un processus d'approximation stochastique en analyse factorielle. Publ. Inst. Statist. Univ. Paris 38(1), 1994, 37-55
 74 article Stochastic approximation of the factors of a generalized canonical correlation analysis. Statist. Probab. Lett. 78(14), 2008, 2210-2216. URL: http://dx.doi.org/10.1016/j.spl.2008.01.088
 75 article On nonparametric estimates of density functions and regression curves. Theory of Probability & Its Applications 10(1), 1965, 186-190
 76 techreport The simplex method is strongly polynomial for deterministic Markov decision processes. arXiv:1208.5083v2, 2012
 77 book Markov Decision Processes. Wiley, New York, 1994
 78 article Single-Cell-Based Analysis Highlights a Surge in Cell-to-Cell Molecular Variability Preceding Irreversible Commitment in a Differentiation Process. PLoS Biology 14(12), December 2016
 79 inproceedings Brownian penalisations related to excursion lengths, VII. Annales de l'IHP Probabilités et statistiques 45(2), 2009, 421-452
 80 article Stochastic calculus with respect to continuous finite quadratic variation processes. Stochastics: An International Journal of Probability and Stochastic Processes 70(1-2), 2000, 1-40
 81 inproceedings Approximate Policy Iteration Schemes: A Comparison. ICML - 31st International Conference on Machine Learning - 2014, Beijing, China, June 2014
 82 article Approximate Modified Policy Iteration and its Application to the Game of Tetris. Journal of Machine Learning Research 16, to appear, 2015, 1629-1676
 83 article Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. Mathematics of Operations Research, Markov decision processes; Dynamic Programming; Analysis of Algorithms, February 2016
 84 article The individual's signature of telomere length distribution. Scientific Reports 9(1), January 2019, 1-8
 85 article Memory-based persistence in a counting random walk process. Phys. A. 386(1), 2007, 303-307. URL: http://dx.doi.org/10.1016/j.physa.2007.08.027
 86 article The range of a simple random walk on Z. Advances in applied probability, 1996, 1014-1033
 87 misc An introduction to network inference and mining. (accessed 22/07/2015), 2015. URL: http://www.nathalievilla.org/doc/pdf//wikistatnetwork_compiled.pdf
 88 article The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate. Math. Oper. Res. 36(4), 2011, 593-603