2023 Activity report, Project-Team BIGS
RNSR: 200920955T Research center Inria Centre at Université de Lorraine
 In partnership with: CNRS, Université de Lorraine
 Team name: Biology, genetics and statistics
 In collaboration with: Institut Elie Cartan de Lorraine (IECL)
 Domain: Digital Health, Biology and Earth
 Theme: Modeling and Control for Life Sciences
Keywords
Computer Science and Digital Science
 A3.1. Data
 A3.2. Knowledge
 A3.2.3. Inference
 A3.3. Data and knowledge analysis
 A3.3.1. Online analytical processing
 A3.3.2. Data mining
 A3.3.3. Big data analysis
 A3.4.1. Supervised learning
 A3.4.2. Unsupervised learning
 A3.4.4. Optimization and learning
 A3.4.7. Kernel methods
 A6. Modeling, simulation and control
 A6.1. Methods in mathematical modeling
 A6.1.2. Stochastic Modeling
 A6.2. Scientific computing, Numerical Analysis & Optimization
 A6.2.3. Probabilistic methods
 A6.2.4. Statistical methods
 A6.4. Automatic control
 A6.4.2. Stochastic control
Other Research Topics and Application Domains
 B1. Life sciences
 B1.1. Biology
 B1.1.2. Molecular and cellular biology
 B1.1.10. Systems and synthetic biology
 B1.1.11. Plant Biology
 B2.2. Physiology and diseases
 B2.2.1. Cardiovascular and respiratory diseases
 B2.2.3. Cancer
 B2.3. Epidemiology
 B2.4. Therapies
1 Team members, visitors, external collaborators
Research Scientists
 Nicolas Champagnat [Team leader, INRIA, Senior Researcher, HDR]
 Coralie Fritsch [INRIA, Researcher]
 Ulysse Herbach [INRIA, Researcher]
 Bruno Scherrer [INRIA, Researcher, HDR]
Faculty Members
 Thierry Bastogne [UL, Associate Professor, HDR]
 Sandie Ferrigno [UL, Associate Professor]
 Anne Gégout-Petit [UL, Professor, HDR]
 Jean-Marie Monnez [UL, Emeritus, HDR]
 Aurélie Muller-Gueudin [UL, Associate Professor]
 Pierre Vallois [UL, Emeritus, HDR]
 Denis Villemonais [UL, Associate Professor, HDR]
 Sophie Wantz-Mézières [UL, Associate Professor]
PhD Students
 Sophie Baland [UL, from Oct 2023]
 Virgile Brodu [UL]
 Mathilde Gaillard [INRIA, from Oct 2023]
 Vincent Hass [UL, until Aug 2023]
 Vincent Kagan [INRIA, from Apr 2023 until Sep 2023]
 Rodolphe Loubaton [UL, ATER, until Aug 2023]
 Anouk Rago [UL]
 Nicolas Zalduendo Vidal [INRIA]
Technical Staff
 Walid Laziri [INRIA, Engineer, until Nov 2023]
 Nathaniel Seyler [INRIA, Engineer, from Oct 2023]
Interns and Apprentices
 Mathilde Gaillard [INRIA, from Apr 2023 until Sep 2023]
Administrative Assistant
 Emmanuelle Deschamps [INRIA]
2 Overall objectives
BIGS is a joint team of Inria, CNRS and University of Lorraine, within the Institut Élie Cartan of Lorraine (IECL), UMR 7502 CNRS-UL laboratory in mathematics, of which Inria is a strong partner. One member of BIGS, T. Bastogne, comes from the Research Center of Automatic Control of Nancy (CRAN), with which BIGS has strong relations in the domain “Health-Biology-Signal”. Our research is mainly focused on stochastic modeling and statistics but also aims at a better understanding of biological systems. BIGS involves applied mathematicians whose research interests mainly concern probability and statistics. More precisely, our attention is directed at (1) stochastic modeling, (2) estimation and control for stochastic processes, (3) regression and machine learning, and (4) statistical learning and application in health. The main objective of BIGS is to exploit these skills in applied mathematics to provide a better understanding of issues arising in life sciences, with a special focus on (1) tumor growth and heterogeneity, (2) gene networks, (3) telomere length dynamics, (4) epidemiology and e-health.
3 Research program
3.1 Introduction
We give here the main lines of our research. For clarity, we made the choice to structure them in four items. Note that all of these items deal with stochastic modeling and inference, therefore they are all interconnected.
3.2 Stochastic modeling
Our aim is to propose relevant stochastic frameworks for the modeling and understanding of biological systems. Stochastic processes are particularly suitable for this purpose. Among them, Markov processes provide a first framework for modeling populations of cells [84, 64]. Piecewise deterministic processes are non-diffusion processes that are also frequently used in the biological context [51, 63, 52]. Among Markov models, we have developed strong expertise in processes derived from Brownian motion and stochastic differential equations [79, 62]. For instance, knowledge about Brownian or random walk excursions [83, 78] helps to analyse genetic sequences and to develop inference about them. We also have strong expertise in stochastic modeling of complex biological populations using individual-based models. These models can be used either from the point of view of asymptotic stochastic analysis [48], e.g. to study the long-term Darwinian evolution of populations, or from the point of view of numerical analysis of biological phenomena [58, 39]. We also develop mathematical tools for the analysis of the long-time behavior of stochastic population processes accounting for possible extinction of (sub)populations [49].
3.3 Estimation and control for stochastic processes
We develop inference methods for the stochastic processes that we use for modeling. Control of stochastic processes is also a way to optimise the administration (dose, frequency) of therapies, such as targeted therapies in cancer. Our team has solid expertise in the inference of the jump rate and the kernel of piecewise-deterministic Markov processes (PDMP) [43, 42, 2], but many directions remain to be explored. For instance, previous work assumed a complete observation of jumps and modes, which is unrealistic in practice. We also tackle the problem of inference for “hidden PDMP”. For example, in pharmacokinetics modeling inference, we want to account for the presence of timing noise and identification from longitudinal data. We have expertise on these subjects [44], and we also use mixed models to estimate tumor growth or heterogeneity [45].
We consider the control of stochastic processes within the framework of Markov Decision Processes [76] and their generalization known as multi-player stochastic games, with a particular focus on infinite-horizon problems. In this context, we are interested in the complexity analysis of standard algorithms, as well as the proposition and analysis of numerical approximate schemes for large problems in the spirit of [46]. Regarding complexity, a central topic of research is the analysis of the Policy Iteration algorithm, on which significant progress has been made in recent years [86, 75, 60, 82], but which is still not fully understood. For large problems, we have extensive experience in the sensitivity analysis of approximate dynamic programming algorithms for Markov Decision Processes [80, 67, 81], and we are currently investigating whether and how similar ideas may be adapted to multi-player stochastic games.
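As an illustration of the Policy Iteration algorithm discussed above, the following minimal Python sketch alternates policy evaluation and greedy policy improvement on a toy two-state discounted Markov Decision Process. The toy problem, the discount factor and all names (`policy_iteration`, `P`, `R`) are assumptions made for illustration; they are not taken from the cited works.

```python
def policy_iteration(P, R, gamma=0.9, tol=1e-10):
    """Policy Iteration for a finite discounted MDP.

    P[s][a] is a list of (next_state, probability) pairs and
    R[s][a] is the immediate reward of action a in state s.
    """
    n_states = len(P)

    def q_value(s, a, V):
        # One-step lookahead: reward plus discounted expected value.
        return R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])

    policy = [0] * n_states
    while True:
        # Policy evaluation: iterate the Bellman operator of the current policy.
        V = [0.0] * n_states
        while True:
            new_V = [q_value(s, policy[s], V) for s in range(n_states)]
            gap = max(abs(x - y) for x, y in zip(new_V, V))
            V = new_V
            if gap < tol:
                break
        # Policy improvement: switch only when an action is strictly better.
        new_policy = list(policy)
        for s in range(n_states):
            for a in range(len(P[s])):
                if q_value(s, a, V) > q_value(s, new_policy[s], V) + 1e-9:
                    new_policy[s] = a
        if new_policy == policy:
            return policy, V
        policy = new_policy


# Hypothetical toy MDP: staying in state 1 pays 1, everything else pays 0.
P = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0: action 0 stays, action 1 moves to 1
    [[(1, 1.0)], [(0, 1.0)]],  # state 1: action 0 stays, action 1 moves to 0
]
R = [[0.0, 0.0], [1.0, 0.0]]
policy, V = policy_iteration(P, R)
```

On this toy problem the algorithm stops after two improvement rounds. The complexity questions mentioned above concern how the number of such rounds grows with the size of the problem.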
3.4 Algorithms and estimation for graph data
Recently, our group has focused its attention on modeling and inference for graph data. A graph data structure consists of a set of nodes, together with a set of pairs of these nodes called edges. This type of data is frequently used in biology because it provides a mathematical representation of many concepts such as biological networks of relationships in a population or between genes in a cell.
Network inference is the process of making inference about the link between two variables, by taking into account the information about other variables. Reference [85] gives a very good introduction and many references about network inference and mining. Many methods are available to infer and test edges in Gaussian graphical models [85, 69, 57, 59]. However, the Gaussian assumption does not hold when dealing with typical “zero-inflated” abundance data, and we want to develop inference in this case.
Concerning gene networks, most studies have been based on population-averaged data: now that technologies enable us to observe mRNA levels in individual cells, a revolution in terms of precision, the network reconstruction problem paradoxically becomes more challenging than ever. Indeed, the traditional way of seeing a gene regulatory network as a deterministic system with some small external noise is being challenged by the probabilistic, bursty nature of gene expression revealed at single-cell level. Our objective is to propose dynamical models and inference methods that fully exploit the particular time structure of single-cell data. We described a promising strategy in which the network inference problem is seen as a calibration procedure for a new PDMP model that is able to acceptably reproduce real single-cell data [61, 77].
Among graphs, trees play a special role because they offer a powerful model for many biological concepts, from RNA to phylogenetic trees in heterogeneous tumors or through plant structures. Our research deals with several aspects of tree data. In particular, we work on statistical inference for this type of data under a given stochastic model. We also work on lossy compression of trees via directed acyclic graphs. These methods enable us to compute distances between tree data faster than from the original structures and with a high accuracy.
3.5 Regression and machine learning
Regression models and machine learning aim at inferring statistical links between a variable of interest and covariates. In biological studies, it is always important to develop adapted learning methods both in the context of standard data and also for data of high dimension (sometimes with few observations) and very massive or online data.
Many methods are available to estimate conditional quantiles and test dependencies [74, 65]. Among them, we have developed nonparametric estimation by local analysis via kernel methods [55, 56], and we want to study the properties of this estimator in order to derive a measure of risk based e.g. on confidence bands and tests. We also study other regression models such as survival analysis and spatio-temporal models with covariates. Among the multiple regression models, we want to develop omnibus tests that examine several assumptions together.
Concerning the analysis of high dimensional data, our view on the topic relies on the French data analysis school, specifically on Factorial Analysis. In this context, stochastic approximation is an essential tool [66], which allows one to approximate eigenvectors in a stepwise manner [71, 70, 73]. We aim at performing accurate classification or clustering by taking advantage of the possibility of updating the information "online" using stochastic approximation algorithms [47]. We focus on several incremental procedures for regression and data analysis like linear and logistic regressions and PCA (Principal Component Analysis).
We also focus on the biological context of high-throughput bioassays in which several hundreds or thousands of biological signals are measured for a posteriori analysis. We have to account for the inter-individual variability within the modeling procedure. We aim at developing a new solution based on an ARX (Auto Regressive model with eXternal inputs) model structure using the EM (Expectation-Maximisation) algorithm for the estimation of the model parameters.
4 Application domains
4.1 Oncology: tumor growth and heterogeneity
We want to propose stochastic processes to model the appearance of mutations and the evolution of their frequencies in tumor samples, through new collaborations with clinicians who measure a particular quantity called circulating tumor DNA (ctDNA). The final purpose is to use ctDNA as an early biomarker of the resistance to a targeted therapy: this is the aim of the project funded by ITMO Cancer that we coordinate. In the ongoing work on low-grade gliomas, a local database of 400 patients will soon be available to construct models. We plan to extend it through national and international collaborations (Montpellier CHU, Montreal CRHUM). Our aim is to build a decision-aid tool for personalised medicine.
4.2 Gene networks and single-cell data
We already mentioned in Section 3.4 our interest in the modeling and inference of transcriptomic bursting in gene regulatory networks from single-cell data. We are also currently working on the prediction and identification of therapeutic targets for chronic lymphocytic leukemia from gene expression data. Our goal is to propose new models that can predict the outcome of gene silencing experiments. Inference will be performed on gene expression data from cells of patients suffering from different forms of chronic lymphocytic leukemia. The goal is to identify therapeutic targets which could be silenced to reduce cell proliferation.
4.3 Epidemiology and e-health
In the context of personalized medicine, we have many ongoing projects with CHU Nancy. They deal with biomarkers research, prognostic value of quantitative variables and events, scoring, and adverse events. We also want to develop our expertise in rupture (change-point) detection in a project with AP-HP (Assistance Publique Hôpitaux de Paris) for the detection of adverse events earlier than the clinical signs and symptoms. The clinical relevance of predictive analytics is clear for high-risk patients, such as those with solid organ transplantation or severe chronic respiratory disease. The main challenge is rupture detection in multivariate and heterogeneous signals (for instance daily measures of electrocardiogram, body temperature, spirometry parameters, sleep duration, etc.). Other collaborations with clinicians concern foetopathology, and we want to use our work on conditional distribution functions to explain fetal and child growth. To that end, we use data from the “Service de fœtopathologie et de placentologie” of the “Maternité Régionale Universitaire” (CHU Nancy).
4.4 Dynamics of telomeres
Telomeres are disposable buffers at the ends of chromosomes which are truncated during cell division, so that the telomere ends become shorter over time with each cell division. In this way, they are markers of aging. Through a collaboration with Pr A. Benetos, geriatrician at CHU Nancy, we recently obtained data on the distribution of the length of telomeres from blood cells [9]. We want to work in three connected directions: (1) refine methodology for the analysis of the available data; (2) propose a dynamical model for the lengths of telomeres and study its mathematical properties (long-term behavior, quasi-stationarity, etc.); and (3) use these properties to develop new statistical methods.
5 Highlights of the year
5.1 Awards
D. Villemonais has been granted a delegation at Institut Universitaire de France from September 2023 to August 2028.
6 New software, platforms, open data
6.1 New software
6.1.1 Harissa

Name:
Hartree approximation for inference along with a stochastic simulation algorithm

Keywords:
Gene regulatory networks, Reverse engineering, Molecular simulation

Functional Description:
Harissa is a Python package for both inference and simulation of gene regulatory networks, based on stochastic gene expression with transcriptional bursting. It was implemented in the context of a mechanistic approach to gene regulatory network inference from single-cell data.

News of the Year:
This software has a more user-friendly interface and several tutorial notebooks are now available.
 URL:
 Publications:

Contact:
Ulysse Herbach
6.1.2 MultiRNAflow

Name:
An R package for the analysis of RNA-seq raw counts with multiple biological conditions and time points

Keywords:
RNA-seq, Gene regulatory networks, Integrated data analysis, Complex experimental design, Multiple temporal and biological conditions, Differential expression

Functional Description:
The R package MultiRNAflow provides an easy-to-use unified framework for both unsupervised and supervised analysis (differential expression analysis) of RNA-seq datasets with an arbitrary number of biological conditions and time points. In particular, this package performs a deep downstream analysis of differential expression information, e.g. identifying temporal patterns across biological conditions and differentially expressed genes which are specific to a biological condition at each time point.

Release Contributions:
First version
 URL:

Contact:
Nicolas Champagnat

Participants:
Rodolphe Loubaton, Nicolas Champagnat, Pierre Vallois, Laurent Vallat

Partner:
CHRU de Strasbourg
7 New results
7.1 Stochastic modeling
Participants: Sophie Baland, Virgile Brodu, Nicolas Champagnat, Coralie Fritsch, Mathilde Gaillard, Vincent Hass, Ulysse Herbach, Vincent Kagan, Nathaniel Seyler, Pierre Vallois, Denis Villemonais, Nicolás Zalduendo Vidal.
7.1.1 Reconstruction of epigenetic landscapes from single-cell data
Joint work with E. Ventre (ENS Lyon), T. Espinasse (Univ. Lyon 1), G. Benoit (Univ. Rennes 1) and O. Gandrillon (ENS Lyon).
The aim of this collaboration is to better understand how living cells make decisions (e.g., differentiation of a stem cell into a particular specialized type), seeing decision-making as an emergent property of an underlying complex molecular network. Indeed, it is now proven that cells react probabilistically to their environment: cell types do not correspond to fixed states, but rather to “potential wells” of a certain energy landscape (representing the energy of the possible states of the cell) that we are trying to reconstruct. The achievement of last year was to show that the same mathematical model driven by transcriptional bursting can be used simultaneously as an inference tool, to reconstruct biologically relevant networks, and as a simulation tool, to generate realistic transcriptional profiles emerging from gene interactions: the article presenting these results is now published [25]. In addition, the paper proposing a landscape reconstruction method with application to several datasets has also been published this year [22].
These results form the starting point of M. Gaillard's thesis work, which will focus on making links with interpretable dimension reduction for single-cell RNA-seq data. Finally, we are working with software engineer N. Seyler on a refactoring of the “Harissa” Python package used in [10] for stochastic simulation and inference of gene regulatory networks, with the aim of making it modular and scalable. The latest stable version is available on PyPI and is presented in a dedicated tool paper [30].
7.1.2 Quasi-stationary distributions
We are continuing our research on quasi-stationary distributions (QSD), that is, distributions of Markov stochastic processes with absorption, which are stationary conditionally on non-absorption. For models of biological populations, absorption usually corresponds to extinction of a (sub)population. QSDs are fundamental tools to describe the population state before extinction and to quantify the large-time behavior of the probability of extinction.
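To make the definition concrete in the simplest setting, consider a finite-state, discrete-time Markov chain with sub-stochastic transition matrix $Q$ on the non-absorbed states: a QSD is then a probability vector $\nu$ such that $\nu Q = \theta \nu$, where $\theta$ is the one-step survival probability under $\nu$. The following Python sketch approximates $\nu$ by repeatedly moving the distribution one step and conditioning on non-absorption. The birth-death chain and the function name are assumptions chosen for illustration; this finite toy case is far simpler than the models studied in the works reported here.

```python
def quasi_stationary_distribution(Q, n_iter=2000):
    """Power iteration for the QSD of a finite absorbed Markov chain.

    Q is the sub-stochastic transition matrix restricted to the
    non-absorbed states (each row sums to at most 1).
    Returns (nu, theta) with nu Q = theta * nu and sum(nu) = 1.
    """
    n = len(Q)
    nu = [1.0 / n] * n
    theta = 1.0
    for _ in range(n_iter):
        # One step of the chain started from nu, before conditioning.
        moved = [sum(nu[i] * Q[i][j] for i in range(n)) for j in range(n)]
        theta = sum(moved)               # one-step survival probability
        nu = [m / theta for m in moved]  # condition on non-absorption
    return nu, theta


# Hypothetical lazy birth-death chain on {0, 1, 2, 3}, absorbed at 0.
Q = [
    [0.4, 0.3, 0.0],  # from 1: absorbed with probability 0.3
    [0.3, 0.4, 0.3],  # from 2
    [0.0, 0.3, 0.7],  # from 3
]
nu, theta = quasi_stationary_distribution(Q)
```

Started from `nu`, the chain survives $n$ steps with probability $\theta^n$, which is how a QSD quantifies the large-time behavior of the extinction probability.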
Thanks to the previous general result of the team in [50], together with B. Cloez (INRAE), we proved in [16] the exponential convergence of a chemostat model, whose dynamics are highly degenerate due to a deterministic part, towards a unique quasi-stationary distribution.
We also finalized an important work [15] that provides general criteria for the exponential convergence of conditional distributions of absorbed Markov processes when the convergence is not uniform with respect to the initial distribution. Our results make it possible to characterize a large subset of the domain of attraction of the minimal QSD and apply to a large range of stochastic processes, including diffusion processes and perturbed dynamical systems.
In collaboration with E. Strickler (Univ. Lorraine), we also studied in [34] the convergence of general penalized Markov processes with soft killing in $L^1$ (Monge-Kantorovich) Wasserstein distance. We propose a simple criterion ensuring uniform convergence of conditional distributions to a unique quasi-stationary distribution. We give several examples of application where our criterion can be checked, including Bernoulli convolutions and piecewise deterministic Markov processes, for which convergence in total variation is not possible.
7.1.3 Fluctuations of balanced urns with infinitely many colours
Joint work with Svante Janson (Uppsala Univ., Sweden) and Cécile Mailler (Univ. Bath, UK).
In this collaborative study, we investigate the dynamics of measure-valued Pólya processes (MVPPs), commonly known as Pólya urns with infinitely many colours. Our study introduces the first second-order results in the literature on MVPPs, extending classical fluctuation results from finitely-many-colour Pólya urns to the infinite colour space setting. The nature of the fluctuations of MVPPs is closely linked to the “spectral gap”.
By framing MVPPs as stochastic approximations operating within the set of measures on a measurable space $E$ (the colour space), we employ martingale methods and standard operator theory to prove convergence and describe the fluctuations of these stochastic approximations [23].
7.1.4 Adaptive dynamics in biological populations
Joint work with Sylvie Méléard (École Polytechnique), Sepideh Mirrahimi (Univ. Montpellier) and Viet Chi Tran (Univ. Paris Est Marne-la-Vallée).
We continued our study of parameter scalings of individual-based models of biological populations under mutation and selection, taking into account the influence of negligible but non-extinct populations. In a work within the ERC SINGER [14], we were able to give an individual-based justification of the Hamilton-Jacobi equation of adaptive dynamics (see e.g. [68]), with a specific parameter scaling that is promising for the study of local (in space) extinction of subpopulations. The analysis of models allowing for such an extinction is the next step of this project. We also wrote an article [26] for the proceedings of the International Congress of Mathematicians (ICM 2022) where S. Méléard gave an invited talk on several large population scalings that can be used in evolutionary biology.
We also worked on general evolutionary models of adaptive dynamics under an assumption of large population and small mutations. We obtained in [13] existence, uniqueness and ergodicity results for a centered version of the Fleming-Viot process of population genetics, which are key steps to recover variants of the canonical equation of adaptive dynamics, which describes the long time evolution of the dominant phenotype in the population, under less stringent biological assumptions than in previous works such as [48]. We completed this second step in [33].
7.1.5 Binary Branching Processes with Moran Type Interactions
Joint work with Alexander Cox (Univ. Bath, UK) and Emma Horton (Univ. Warwick, UK).
In this collaboration, we investigate the large population limit of a binary branching particle system with Moran type interactions. The novel model introduced in this paper features particles that evolve, reproduce, and die independently. It encompasses branching models and fixed size Moran type interacting particle systems. The death of a particle may trigger the reproduction of another, while a branching event may, in turn, lead to the death of another particle. Our study [17] aims to clarify the dynamics of this model. We explore diverse applications of our model, including its relevance to the neutron transport equation and population size dynamics. We focus on the occupation measure of the new model, explicitly connecting it to the Feynman-Kac semigroup of the underlying Markov evolution. Additionally, we quantify the $L^2$ distance between the normalizations of these measures, which gives information on the convergence behavior of the system.
7.1.6 Multitype bisexual branching process
The asexual multitype Galton-Watson branching processes as well as the single-type bisexual processes have been studied in the literature. In particular, survival conditions for these processes are well known in both cases. However, until now, the multitype bisexual branching processes have only been studied in very specific situations and no general mathematical description has been established yet.
In [21], we studied general multitype bisexual branching processes with superadditive mating function. We exhibited a necessary and sufficient condition for almost sure extinction, we proved a law of large numbers for our model and we studied the long-time convergence of the rescaled process.
7.1.7 A branching model for intergenerational telomere length dynamics
Joint work with Athanasios Benetos (Univ. Lorraine), Lionel Lenôtre (Univ. Haute Alsace) and Simon Toupance (Univ. Lorraine).
In this study, we construct and analyze an individual-based model capturing the evolution of telomere length in a population across multiple generations [32]. The model, a continuous-time typed branching process, incorporates individual characteristics such as gamete mean telomere length and age. We investigate the Malthusian behavior of the model, and we complement our findings with numerical simulations to assess the impact of biologically relevant parameters on telomere length dynamics on an evolutionary time scale.
7.1.8 Modeling of chronic obstructive pulmonary disease
Joint work with Isabelle Dupin (Univ. Bordeaux), Élise Maurat (Univ. Bordeaux) and Jean-Marc Sac-Épée (IECL).
Lung exposure to various types of particles, such as those present in cigarette smoke, can lead to chronic obstructive pulmonary disease (COPD). COPD bronchi are an area of intense immunological activity and tissue remodeling, as evidenced by the extensive immune cell infiltration and changes in tissue structures. This allows persistent contact between resident cells and stimulated immune cells. Our hypothesis is that the contact between cells is a major cause of chronic destructive or fibrotic manifestations. We aim to analyze the potential cell-cell interactions in situ in human tissues, to characterize in vitro the dynamics of the interplay, and to define a computational model with intercellular interactions which fits experimental measurements and explains the macroscopic properties of cell populations. The effects of potential therapeutic drugs modulating local intercellular interactions will be tested by simulations. A paper has been submitted this year [19] (see also [54]).
7.1.9 Numerical simulation of diffusions
In a collaboration with A. Lejay (Inria PASTA team) and their PhD student A. Anagnostakis, D. Villemonais proposed a method for approximating general, singular diffusions by discrete time and state space processes [11]. One of its main advantages over existing methods is that most of the computational cost is incurred upstream and thus represents a fixed cost, independent of the number of simulations performed afterwards.
7.2 Regression and machine learning
Participants: Sandie Ferrigno, Jean-Marie Monnez.
7.2.1 Cramér-von Mises goodness-of-fit tests in regression models
Joint work with R. Azaïs (Inria, ENS Lyon) and M.J. Martinez (Univ. Grenoble Alpes).
Many goodness-of-fit tests have been developed to assess the different assumptions of a (possibly heteroscedastic) regression model. Most of them are “directional” in that they detect departures from a given assumption of the model. Other tests are “global” (or “omnibus”) in that they assess whether a model fits a dataset on all its assumptions. We focus on the task of choosing the structural part of the regression and the variance functions because they contain easily interpretable information about the studied relationship. We consider two nonparametric “directional” tests and one nonparametric “global” test, all based on generalizations of the Cramér-von Mises statistic.
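For readers unfamiliar with the underlying statistic, the sketch below computes the classical one-sample Cramér-von Mises statistic, which measures the distance between the empirical distribution of a sample and a hypothesized distribution function $F$. It is only the textbook statistic, not the regression generalizations used in the tests above nor the interface of the cvmgof package; the function names and the example samples are assumptions made for illustration.

```python
def cramer_von_mises(sample, cdf):
    """Classical one-sample Cramér-von Mises statistic.

    W2 = 1/(12 n) + sum_i ((2 i - 1)/(2 n) - F(x_(i)))^2,
    where x_(1) <= ... <= x_(n) is the ordered sample and F = cdf.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12 * n)
    for i, x in enumerate(xs, start=1):
        total += ((2 * i - 1) / (2 * n) - cdf(x)) ** 2
    return total


def uniform_cdf(x):
    # Distribution function of the uniform law on [0, 1].
    return min(1.0, max(0.0, x))


# A sample sitting exactly on the uniform quantiles attains the minimum 1/(12 n).
w2_perfect = cramer_von_mises([0.125, 0.375, 0.625, 0.875], uniform_cdf)
# A sample concentrated near 0 fits the uniform law poorly.
w2_poor = cramer_von_mises([0.01, 0.02, 0.03, 0.04], uniform_cdf)
```

Large values of the statistic indicate a poor fit. In a test, the observed value is compared with the distribution of the statistic under the null hypothesis.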
To perform these goodness-of-fit tests, we have developed the R package cvmgof [40], an easy-to-use tool for practitioners, available from the Comprehensive R Archive Network (CRAN). The package was updated in 2022 (this is its third version) [41]. This latest version currently allows testing the “regression function” part of the model. In 2023, we worked to enrich the package by allowing the user to test the homoskedasticity/heteroskedasticity of the model. This new version will be submitted to CRAN in 2024 and an associated article is currently being written.
To complete this work, we plan to assess the other assumptions of a regression model such as the additivity of the random error term. The implementation of these directional tests would enrich the cvmgof package and offer a complete easy-to-use tool for validating regression models. Another perspective of this work would be to develop a similar tool for other statistical models widely used in practice such as generalized linear models.
7.2.2 Imprecise extension of the kernel density estimator
Joint work with Bilal Nehme (IECL, Nancy).
The estimation of the probability density function underlying a finite set of observations is a fundamental problem that covers a broad range of applications including machine learning. We propose a new nonparametric method to estimate this function that combines both the Schwartz distribution theory and the possibility theory. It is an extension of the kernel density estimator that leads to imprecise estimation, based on a new type of kernel called maxitive kernel. The form of the obtained estimation is an interval. In collaboration with B. Nehme, S. Ferrigno demonstrated several theoretical properties of the imprecise estimator. We implement this method using very low complexity algorithms and illustrate some theoretical properties of the proposed imprecise density estimation as well as a comparative analysis with other estimation intervals. An associated article is currently being written.
7.2.3 Online Big Data Analysis and Online Learning
A tool for analyzing streaming data is stochastic approximation, introduced by Robbins and Monro in 1951, which can be used for example to estimate online the parameters of a regression function [53] or the centers of clusters in unsupervised classification [47]. Another type of stochastic approximation process was introduced by Benzécri in 1969 for estimating eigenvectors and eigenvalues of the unknown $Q$-symmetric expectation of a random matrix $A$ using independent observations of $A$. In all these processes, it is assumed that independent observations of the random matrix are observed and that one observation or a mini-batch of observations is taken into account at each step. We are interested in the study of cases where we cannot have independent observations and we define processes where at each step all the observations up to this step are taken into account without storing them. Experiments we have conducted show that this second type of process generally converges faster than the first type.
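The simplest instance of a Robbins-Monro process is the online estimation of a mean, seen as the root of $h(\theta) = \theta - E[X]$. The Python sketch below uses each observation once and then discards it; with step sizes $a_n = 1/n$ the recursion reproduces the running average exactly. The function name and the data are assumptions made for illustration, and the regression and clustering settings mentioned above follow the same pattern with a different function $h$.

```python
def robbins_monro_mean(stream):
    """Robbins-Monro recursion for the root of h(theta) = theta - E[X].

    Each observation is used once and then discarded:
    theta_n = theta_{n-1} - a_n * (theta_{n-1} - x_n), with a_n = 1/n.
    """
    theta = 0.0
    for n, x in enumerate(stream, start=1):
        theta -= (theta - x) / n
    return theta


observations = [2.0, 4.0, 9.0, 1.0, 4.0]  # hypothetical data stream
estimate = robbins_monro_mean(iter(observations))
```

Only the current estimate and the step counter are kept in memory, which is what makes such processes suitable for streaming data.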
Stochastic approximation of eigenvectors and eigenvalues of the $Q$-symmetric expectation of a random matrix
In the article [24], we establish an almost sure convergence theorem of an extension of the stochastic approximation process of Oja for estimating eigenvectors of the unknown $Q$-symmetric expectation $B$ of a random matrix, under a correlation model between the incoming random matrices. This theorem generalizes previous theorems and extends them to the case where the metric $Q$ is unknown and estimated online in parallel. We suggest constructing processes using past and current observations at each step without storing them. We prove the almost sure convergence of specific processes to corresponding eigenvalues. We apply these results to streaming principal component analysis (PCA) of a random vector $Z$, when a mini-batch of observations of $Z$ is used at each step or all the observations up to the current step. We deal with the case of streaming generalized canonical correlation analysis, interpreted as a PCA with a metric estimated online in parallel.
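As a point of reference, the following Python sketch implements a basic normalized Oja process for streaming PCA: it estimates the leading eigenvector of $E[ZZ^{\top}]$ from one observation of a centred vector $Z$ per step, with the identity metric and independent observations. It is therefore much simpler than the extended process described above (no unknown metric $Q$, no correlation between observations, no reuse of past data). The simulated data, the step sizes $1/(n+10)$ and the function names are assumptions made for illustration.

```python
import math
import random


def oja_top_eigenvector(stream, dim):
    """Normalized Oja process for the leading eigenvector of E[Z Z^T]."""
    w = [1.0 / math.sqrt(dim)] * dim
    for n, z in enumerate(stream, start=1):
        step = 1.0 / (n + 10)                        # decreasing step size
        proj = sum(zi * wi for zi, wi in zip(z, w))  # scalar product z^T w
        w = [wi + step * proj * zi for wi, zi in zip(w, z)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]                  # renormalize to unit length
    return w


def centred_stream(n_obs, seed=0):
    # Hypothetical data: independent coordinates with variances 9 and 1,
    # so the leading principal axis is the first coordinate axis.
    rng = random.Random(seed)
    for _ in range(n_obs):
        yield [3.0 * rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]


w = oja_top_eigenvector(centred_stream(5000), dim=2)
```

Each observation is discarded after its update, so memory use does not grow with the length of the stream. The estimate is defined up to sign, as for any eigenvector.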
An extended Oja process for streaming canonical analysis
In the article [36], after recalling an almost sure convergence theorem of an extended Oja process [24], we present the canonical correlation analysis (CCA) of two random vectors $Z^1$ and $Z^2$ such that there is no affine relation between their components. Couples of canonical components are interpreted as couples of principal components of the respective PCA of the linear regression function of $Z^1$ with respect to $Z^2$ and $Z^2$ with respect to $Z^1$, or as canonical components of the generalized canonical correlation analysis (gCCA) of $Z = (Z^1, Z^2)$. In the case of streaming data, we estimate online in parallel a regression function and canonical components, using at each step a mini-batch of current data or all the data up to the current step to have a faster convergence. We define two algorithms, the second being extended to gCCA. Using the same methodology for streaming factorial correspondence analysis (FCA) when the components of $Z^1$ and $Z^2$ are respectively the indicators of the exclusive modalities of two categorical variables, we define two algorithms to estimate online the canonical components, the second being extended to multiple correspondence analysis (MCA). Finally, we apply this methodology to streaming factorial discriminant analysis (FDA), when there is no affine relation between the components of $Z^1$ and the components of $Z^2$ are the indicators of the exclusive modalities of a categorical variable.
7.3 Statistical learning and application in health
Participants: Nicolas Champagnat, Sandie Ferrigno, Anne GégoutPetit, Ulysse Herbach, Walid Laziri, Rodolphe Loubaton, Sophie WantzMézières, Anouk Rago, Pierre Vallois.
7.3.1 Invert emulsions alleviate biotic interactions in bacterial mixed culture
Joint work with A. Dijamentiuk, C. Mangavel and F. Borges from LIBio, Univ. Lorraine.
The large application potential of microbiomes has led to a great need for mixed culture methods. However, microbial interactions can compromise the maintenance of biodiversity during cultivation in a reactor. In particular, competition among species can lead to a strong disequilibrium in favor of the fittest microorganism. The aim of this study was to evaluate the potential of single invert emulsions to alleviate competition during the culture of antagonistic microorganisms and therefore to maintain diversity in a more complex mixed culture. Experimental data obtained in this study were analyzed using a two-way analysis of variance with a fixed effects model, followed by Tukey's HSD test. For the droplet size distributions of the invert emulsions, the factors were the presence or absence of bacteria and the incubation of the invert emulsions. For bacterial enumerations, the factors were the cultivation system used and the incubation. In community cultivation experiments, differences in Shannon diversity index between groups of samples were tested using a one-way analysis of variance, followed by Tukey's HSD test. An article 18 on this work was published in 2023.
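To make the last step concrete, the following Python sketch computes the Shannon diversity index $H = -\sum_i p_i \ln p_i$ of a sample from its species counts, and the F statistic of a one-way analysis of variance between groups of such indices. It is an illustration under assumptions, not the analysis code of the study: the function names and any data passed to them are hypothetical, and the Tukey HSD post-hoc test (which requires the studentized range distribution) is not included.

```python
import math


def shannon_index(counts):
    """Shannon diversity index H = -sum(p_i * ln(p_i)) from species counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

In practice, each group passed to `one_way_anova_f` would contain the Shannon indices of the samples obtained with one cultivation system, and the F statistic would be compared with the Fisher distribution with `df_between` and `df_within` degrees of freedom.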
7.3.2 Prediction of silencing experiments on gene networks for chronic lymphocytic leukemia
Joint work with Laurent Vallat (CHRU Strasbourg).
In this collaboration, we work on the inference of dynamical gene networks from RNA-seq and proteome data. The goal is to infer a model of gene expression that can predict gene expression in cells where the expression of specific genes is silenced (e.g. using siRNA), in order to select the silencing experiments that are most likely to reduce cell proliferation. We expect the selected genes to provide new therapeutic targets for the treatment of chronic lymphocytic leukemia. This year, we developed a new method for predicting the effect of gene silencing, based on the re-exploitation of expression data of genes not influenced by the silenced gene 27. We also developed the package MultiRNAflow (see Section 6.1.2) for the statistical analysis of temporal gene expression datasets with several biological conditions (in particular for exploratory analysis and the detection of differentially expressed genes). The package is described in the application note 35.
7.3.3 Multidimensional statistical analysis of information for clinical use
The startup EMOSIS develops blood tests relying on flow cytometry in order to improve in vitro diagnosis of vascular thrombosis. This technology yields multiparametric measurements on tens of thousands of cells collected from each blood sample. The manual analysis methods classically used in flow cytometry are based on data visualization by means of histograms or scatter plots. A computational approach that automates and deepens the search for differences or similarities between cell subpopulations could thus improve the quality of diagnosis.
Recent progress in the active area of computational methods for dimension reduction suggests many directions for improving the classical approaches to the analysis of flow cytometry data. The approach that we considered is information geometry, whose principle is to lower the dimensionality of multiparametric observations by considering the subspace of the parameters of the statistical model describing the observation, whose points are probability density functions, and which is equipped with a special geometrical structure. The objective of the reported study is to use an algorithm from the field of information geometry, known as Fisher Information Nonparametric Embedding (FINE), to analyze flow cytometry data in the context of a specific severe disorder called heparin-induced thrombocytopenia. This work led to two conference communications 28, 29.
Unfortunately, the startup EMOSIS no longer exists, which put an end to our collaboration.
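The first stage of an embedding of this kind can be sketched as follows, under strong simplifying assumptions: each blood sample is reduced to a one-dimensional histogram density, and the dissimilarity between two samples is measured by the Hellinger distance, which is one commonly used approximation related to the Fisher information distance between nearby densities. The function names, the binning and the choice of the Hellinger distance are assumptions made for this illustration and do not describe the implementation used in the project. The subsequent embedding step (for example multidimensional scaling applied to the dissimilarity matrix) is omitted.

```python
import math


def histogram_density(samples, n_bins, low, high):
    """Return the bin probabilities of `samples` on the interval [low, high]."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for x in samples:
        if low <= x <= high:
            # The right endpoint is assigned to the last bin.
            index = min(int((x - low) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no sample falls inside [low, high]")
    return [c / total for c in counts]


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on the same bins."""
    bhattacharyya = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bhattacharyya))


def dissimilarity_matrix(sample_sets, n_bins=20, low=0.0, high=1.0):
    """Pairwise Hellinger distances between the densities of several samples."""
    densities = [histogram_density(s, n_bins, low, high) for s in sample_sets]
    return [[hellinger_distance(p, q) for q in densities] for p in densities]
```

Each row of the resulting matrix describes one sample (here, one patient) by its distances to all the others, so the individual cells are never compared directly.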
7.3.4 Effects of adapted physical activity and education program on endometriosis symptoms
Joint work with Géraldine EscrivaBoulley and Lionel Lenôtre (Univ. HauteAlsace).
Endometriosis is a chronic disease characterized by the growth of endometrial tissue outside the uterine cavity, which could affect 200 million women worldwide. One of the most common symptoms of endometriosis is chronic pelvic pain associated with fatigue. This pain can cause psychological distress and interpersonal difficulties. As for several chronic diseases, adapted physical activity could help to manage the physical and psychological symptoms.
We are participating in both the design and the statistical analysis of a randomized controlled trial, led by G. EscrivaBoulley, to investigate the potential effects of a videoconference-based adapted physical activity program combined with an endometriosis-based education program 20. This study is one of the first trials to test the effects of a combined adapted physical activity and education program for improving endometriosis symptoms and physical activity.
8 Bilateral contracts and grants with industry
8.1 Bilateral contracts with industry
Participants: Anne GégoutPetit, Walid Laziri, Sophie WantzMézières.
As part of the French “Plan de relance”, we obtained funds for a 2-year engineering contract with the startup EMOSIS based in Strasbourg (from October 1, 2022). The project is called MOSAiC (MultidimensiOnal Statistical Analysis of Information for Clinical use). Unfortunately, EMOSIS filed for bankruptcy in 2024 and the project was stopped.
9 Partnerships and cooperations
Participants: Sophie Baland, Nicolas Champagnat, Coralie Fritsch, Mathilde Gaillard, Anne GégoutPetit, Vincent Hass, Ulysse Herbach, Rodolphe Loubaton, Anouk Rago, Pierre Vallois, Denis Villemonais, Sophie WantzMézières, Nicolas Zalduendo Vidal.
9.1 International initiatives
9.1.1 Inria associate team not involved in an IIL or an international program
MAGO

Title:
Modelling and analysis for growth-fragmentation processes

Duration:
2022-2024

Coordinator:
Denis Villemonais

Partners:
 Inria Nancy (C. Fritsch, D. Villemonais)
 University College London (F.X. Briol, O. Key, A. Watson)

Summary:
Growth-fragmentation (GF) refers to a collection of mathematical models in which objects, classically biological cells, slowly gather mass over time and fragment suddenly into multiple, smaller offspring. These models may be used to represent a range of biological processes in which an individual reproduces by fission into two or more new individuals, such as the evolution of plasmids in bacteria populations and protein polymerisation. It is crucial to understand the long-term behaviour of GF processes so that they can be used to build algorithms to simulate real-world processes and estimate quantities such as the growth rate of the system, the steady state behaviour, and the fragmentation rate and kernel, allowing scientists to gain a better understanding of the behaviour of these complex systems. In this project, we aim to combine probabilistic and statistical tools to study these processes. In particular, we will employ methods from branching processes, quasi-stationary distributions and interacting particle systems to study their long-term behaviour and develop numerical simulations. Further, we will develop likelihood-free methods to estimate the model parameters, followed by goodness-of-fit tests to analyse the strength of these methods when working with real data.
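As a toy illustration of such a model, the following Python sketch simulates a population in which each cell grows exponentially (growth rate equal to its mass, $\mathrm{d}m/\mathrm{d}t = m$) and fragments at rate $B(m) = m$ into two pieces. These rates, the splitting law (a fraction drawn uniformly in $[0.1, 0.9]$) and the function name are assumptions chosen for simplicity and are not the models studied in the project. With these choices the time to fragmentation can be sampled exactly: the cumulative fragmentation hazard of a cell born with mass $m$ is $m(e^{t}-1)$, so the mass gained before fragmentation follows an exponential distribution with mean 1.

```python
import math
import random


def simulate_growth_fragmentation(horizon, initial_mass=1.0, seed=0):
    """Return the masses of all cells alive at time `horizon`.

    Toy model (assumptions): growth dm/dt = m, fragmentation rate B(m) = m,
    binary fragmentation with a uniform fraction in [0.1, 0.9].
    """
    rng = random.Random(seed)
    pending = [(initial_mass, 0.0)]  # (mass at birth, birth time)
    final_masses = []
    while pending:
        mass, birth = pending.pop()
        # Mass gained before fragmentation is Exp(1)-distributed (see lead-in).
        gained = rng.expovariate(1.0)
        split_time = birth + math.log(1.0 + gained / mass)
        if split_time >= horizon:
            # The cell survives until the horizon and keeps growing until then.
            final_masses.append(mass * math.exp(horizon - birth))
            continue
        # Fragmentation conserves mass: the two offspring share mass + gained.
        fraction = rng.uniform(0.1, 0.9)
        total = mass + gained
        pending.append((fraction * total, split_time))
        pending.append(((1.0 - fraction) * total, split_time))
    return final_masses
```

Because fragmentation conserves mass and every cell grows at the same exponential rate, the total mass at time $T$ equals $m_0 e^{T}$ whatever the random fragmentation events, which gives a simple sanity check for the simulation.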
9.2 International research visitors
9.2.1 Visits of international scientists
Other international visits to the team
Emma Horton

Status:
Researcher

Institution of origin:
University of Melbourne

Country:
Australia

Dates:
May 30 - June 2

Context of the visit:
collaboration on growth-coagulation-fragmentation processes.

Mobility program/type of mobility:
research stay
Alex Watson

Status:
Researcher

Institution of origin:
University College London

Country:
UK

Dates:
May 30 - June 2

Context of the visit:
collaboration on growth-coagulation-fragmentation processes in the framework of the MAGO Inria associate team.

Mobility program/type of mobility:
research stay
9.2.2 Visits to international teams
Research stays abroad
Coralie Fritsch & Denis Villemonais

Visited institution:
University College London

Country:
UK

Dates:
October 30 - November 3

Context of the visit:
collaboration with Alex Watson and Emma Horton on growth-coagulation-fragmentation processes in the framework of the MAGO Inria associate team.

Mobility program/type of mobility:
research stay
Nicolas Champagnat

Visited institution:
Pontificia Universidad Católica de Chile, Universidad de Valparaiso

Country:
Chile

Dates:
March 18 - March 27

Context of the visit:
Collaboration with Pablo Marquet and Rolando Rebolledo on niche construction. After this visit, we applied for the Inria associate team aStoNiche (a Stochastic framework for modeling Niche construction), which is funded for the period 2024-2027.

Mobility program/type of mobility:
research stay
Pierre Vallois

Visited institution:
Turku University

Country:
Finland

Dates:
November 21 - November 25

Context of the visit:
Collaboration with Paavo Salminen.

Mobility program/type of mobility:
research stay
9.3 European initiatives
9.3.1 ERC projects
N. Champagnat is a scientific collaborator of the ERC SINGER (AdG 101054787) on Stochastic dynamics of sINgle cells, coordinated by S. Méléard (Ecole Polytechnique). He is involved in the research axes “From stochastic processes to singular Hamilton-Jacobi equations” and “Lineages and time reversed trajectories” of this project.
9.4 National initiatives
 A. GégoutPetit was on the committee interviewed by ANR for the IHU Infiny on the subject of chronic inflammatory bowel diseases. PI: L. PeyrinBiroulet (Univ. Lorraine and CHRU Nancy).
 ITMO Physics, Mathematics applied to Cancer (from October 2023): “Quantifying and predicting the evolution of clonal heterogeneity in chronic lymphocytic leukemia”. Funding organisms: ITMO Cancer, ITMO Technologies pour la santé de l'alliance nationale pour les sciences de la vie et de la santé (AVIESAN), INCa. Partners: Inria and IECL (Institut Élie Cartan de Lorraine) and CHRU Strasbourg. Leader: N. Champagnat. Participants: C. Fritsch, U. Herbach, P. Vallois, D. Villemonais.
 PEPR Exploratoire MathsVivES, (starting in spring 2024), target project DyLT (Dynamics of Telomere Length) on “Influence of telomere length dynamics and environmental conditions on biological and clinical aspects of aging”. Funding organisms: ANR. Partners: Inria Nancy and Saclay, Institut Élie Cartan de Lorraine (Nancy), CHRU Nancy, Centre de Recherche en Cancérologie de Marseille and Institut de recherche sur le cancer et le vieillissement (Nice). Coordinators: N. Champagnat and A. Benetos (CHRU Nancy). Participants: C. Fritsch, A. GégoutPetit, D. Villemonais, S. Baland.
 PEPR Santé Numérique (started in July 2023), project AI4scMed (Multiscale AI for single-cell-based precision medicine) including WP3: “Regulatory network inference: from dynamical models to logical models”. Funding organisms: ANR. Partners: Inria, Inserm, CNRS. Coordinator: F. Picard (CNRS, ENS Lyon). Participants: M. Gaillard, U. Herbach.
 FHU CARTAGE (Fédération Hospitalo Universitaire Cardial and ARTerial AGEing). Leader: Pr. A. Benetos. Participants: J.M. Monnez, A. GégoutPetit.
 ANR JCJC project CRESCENDO (inCRease physical Exercise and Sport to Combat ENDOmetriosis, AAPG 2022). Coordinator: G. EscrivaBoulley (LISEC, Université de HauteAlsace). Participant: U. Herbach.
 GDR 720 IASIS (funded by CNRS). Leader: C. Richard. Participant: S. WantzMézières.
 Réseau Thématique MathSAV (funded by CNRS). Leader: F. Crauste. Participants: N. Champagnat, C. Fritsch, V. Hass, U. Herbach, R. Loubaton, A. Rago, N. Zalduendo Vidal.
 Chair “Modélisation Mathématique et Biodiversité” between VEOLIA, Ecole Polytechnique, Museum National d'Histoire Naturelle and Fondation X (funded by VEOLIA). Leader: S. Méléard. Participants: V. Brodu, N. Champagnat, C. Fritsch, V. Hass, D. Villemonais, N. Zalduendo Vidal.
9.5 Regional initiatives
A. GégoutPetit is one of the two PIs of the interdisciplinary program “Life Travel” of the ISite “Lorraine Université d'Excellence” on life trajectories and longevity (under construction).
10 Dissemination
Participants: Sophie Baland, Virgile Brodu, Nicolas Champagnat, Sandie Ferrigno, Coralie Fritsch, Mathilde Gaillard, Anne GégoutPetit, Vincent Hass, Ulysse Herbach, Vincent Kagan, Rodolphe Loubaton, JeanMarie Monnez, Anouk Rago, Pierre Vallois, Denis Villemonais, Sophie WantzMézières, Nicolas Zalduendo Vidal.
10.1 Promoting scientific activities
10.1.1 Scientific events: organisation
Member of the organizing committees
 N. Champagnat co-organized the conference A Random Walk in the Land of Stochastic Analysis and Numerical Probability (September 4-8, CIRM, Luminy) in honor of Denis Talay.
 C. Fritsch co-organized the 21st INFORMS Applied Probability Society Conference (Centre Prouvé, Nancy, June 28-30). S. Baland, V. Brodu, M. Gaillard, V. Hass, R. Loubaton, A. Rago and N. Zalduendo Vidal were members of the logistics crew during the conference.
 N. Champagnat and D. Villemonais organized the invited session “Quasi-stationary distributions in numerical stochastic methods and statistics” at the 21st INFORMS Applied Probability Society Conference (Centre Prouvé, Nancy, June 28-30).
 D. Villemonais co-organized the first conference of the GdR Branchement in November 2023, Toulouse, France.
 U. Herbach co-organized the weekly Probability and Statistics seminar at IECL in Nancy until September.
10.1.2 Scientific events: selection
Chair of conference program committees
 A. GégoutPetit is chair of the program committee of the ENBIS meeting 2024, which will be held in Leuven, Belgium in September 2024.
Member of the conference program committees
 A. GégoutPetit was a member of the program committee of the ENBIS meeting 2023, which was held in Valencia, Spain in September.
10.1.3 Journal
Member of the editorial boards
 N. Champagnat is associate editor for ESAIM: Probability & Statistics and Stochastic Models.
 A. GégoutPetit was guest editor with L. MarcoAlmagro (Univ. Politècnica de Catalunya, Barcelona, Spain) for the Quality and Reliability Engineering International special issue related to the 22nd Annual Conference of the European Network for Business and Industrial Statistics (ENBIS).
10.1.4 Invited talks
 N. Champagnat gave a plenary talk at the 11ème Biennale Française des Mathématiques Appliquées et Industrielles (Congrès SMAI 2023) in Le Gosier, Guadeloupe in May. He was also invited to give talks at the 43rd Conference on Stochastic Processes and their Applications in Lisbon, Portugal in July, the conference Celebrating the mathematics of Michel Benaïm at the Bernoulli Center, Lausanne in August, the conference A random walk in the land of stochastic analysis and numerical probability at CIRM, Luminy in September and the International Conference on Recent Developments of Theory and Methods in Mathematical biology, conference of the IRN ReaDiNet network at NCTS Taipei, Taiwan in October.
 C. Fritsch has been invited to give talks at the 21st INFORMS Applied Probability Society Conference in Nancy in June and at the Première conférence du GDR Branchement in Toulouse in November.
 U. Herbach has been invited to give talks at Statistics seminar of LMA in Avignon in April, at GT Bioss workshop in Marseille in July, at Inria MUSCA team seminar in Saclay in September, at Statistics seminar of IRMA in Strasbourg in October and at LCSB Systems Control group seminar in Luxembourg in November.
 P. Vallois has been invited to give talks at the Journées de Probabilités 2023 in Angers in June and at the 21st INFORMS Applied Probability Society Conference in Nancy in June.
 D. Villemonais has been invited to give a talk at the 21st INFORMS Applied Probability Society Conference in Nancy in June and at the conference Celebrating the mathematics of Michel Benaïm in Bernoulli Center, Lausanne in August.
 N. Zalduendo Vidal has been invited to give talks at the 21st INFORMS Applied Probability Society Conference in Nancy in June and at the conference Discrete Randomness in Créteil in December.
10.1.5 Contributed talks, posters, workshops, seminars
 V. Brodu has presented a poster at the Conférence internationale Mathematical Population Dynamics, Ecology and Evolution, MPDEE 2023 in Marseille in April, at the 21st INFORMS Applied Probability Society Conference in Nancy in June (where he has been awarded a Best Poster prize), at the 43rd Conference on Stochastic Processes and their Applications in Lisbon in July, and at the GdR Branchement first conference in Toulouse in November.
 N. Champagnat was invited to give a talk at the EcodepBiostochastic Workshop: Modelling Time Series and Stochastic Processes at Las Cruces Marine Station, Chile in March. He was also invited to give a (remote) seminar talk at the Seminar of differential equations, Instytut Matematyczny Wroclaw, Poland in November.
 C. Fritsch gave talks at the Journées INRAE - Inria 2023 in Nancy in July and at the Journée de la donnée en Meurthe et Moselle in Nancy in October.
 U. Herbach has given talks at Statistical Methods for Post Genomic Data workshop (SMPGD 2023) in Ghent (Belgium) in February, at 21st International Conference on Computational Methods in Systems Biology (CMSB 2023) in Luxembourg in September and at CENTURI Conference on Information networks in biological systems in Cargèse in October.
 A. Rago has presented a poster at Journée scientifique autour de l'IA in Nancy in February and at Statlearn23 in Montpellier in April, and has given a talk 27 at Journées de Statistique 2023 in Brussels in July.
 P. Vallois has given a seminar talk at Univ. Sorbonne Paris Nord in November.
 D. Villemonais gave a talk at the Journée Santé Numérique in Nancy in November.
 N. Zalduendo Vidal gave talks at the Seminario de Probabilidades de Chile (online) in August, at the Workshop ${L}^{2}$ in Probability and Statistics in Metz in September and at the Séminaire Image Optimisation et Probabilités in Bordeaux in October.
10.1.6 Scientific expertise
 N. Champagnat evaluated a research project submitted to ShapeMed@Lyon.
 C. Fritsch has been a member of the Committee for junior permanent research positions of Inria Nancy - Grand Est.
 A. GégoutPetit was expert for the Messidore call 2023 (Méthodologie des ESSais cliniques Innovants, Dispositifs, Outils et Recherches Exploitant les données de santé et biobanques) of INSERM.
 A. GégoutPetit was a member of two selection committees for Professor positions (Univ. de Pau et de l'Adour (UPPA) and Univ. Rennes 1) and one selection committee for a Maître de conférences position at Aix-Marseille Université.
 U. Herbach evaluated a research project submitted to ANR AAPG 2023 as a scientific expert for the “Interfaces: mathematics, digital sciences - biology, health” panel.
10.1.7 Research administration
 V. Brodu is an elected representative of doctoral students at the doctoral school committee (local scale), and also at the doctoral college committee (regional scale).
 N. Champagnat has been an elected member of the Commission d'Evaluation of Inria since September. He is a member of the COMIPERS (hiring committee for non-permanent positions) of Inria Nancy – Grand Est, substitute member of the Comité de Centre of Inria Nancy – Grand Est, and local correspondent (correspondant local) representing the COERLE (Inria's Ethics Committee) at Inria Nancy – Grand Est. He was responsable scientifique for the Mathematics library of IECL until October.
 C. Fritsch was a member, until August, of the Commission du Développement Technologique of the Inria Research Center of Nancy - Grand Est and of the Commission du personnel of IECL. She has been an elected member of the Commission d'Évaluation of Inria since September.
 A. GégoutPetit is director of the research unit IECL (Institut Elie Cartan de Lorraine), Mathematics laboratory of Univ. Lorraine (200 members).
10.2 Teaching - Supervision - Juries
10.2.1 Teaching
BIGS faculty members have teaching obligations at Univ. Lorraine and teach at least 192 hours each year. They teach probability and statistics at different levels (Licence, Master, Engineering school). Many of them have pedagogical responsibilities.
 T. Bastogne is in charge of the research master program “Santé Numérique et Imagerie Médicale” with the Faculty of Medicine, Univ. Lorraine.
 S. Ferrigno has been in charge (since September 2023) of the “DU Big Data and Data Science” at ENSMN, Univ. Lorraine.
 D. Villemonais is the head of the Mathematical Engineering Major of ENSMN, Univ. Lorraine.
 Licence: V. Brodu, Probability theory tutorial, 40h, L3, first year of ENSMN, Univ. Lorraine.
 Licence: V. Brodu, Numerical Analysis tutorial, 20h, L3, first year of ENSMN, Univ. Lorraine.
 Master: N. Champagnat, Introduction to Quantitative Finance, 12h, M1, second year of ENSMN, Univ. Lorraine.
 Master: N. Champagnat, Introduction to Quantitative Finance, 9h, M2, third year of ENSMN, Univ. Lorraine.
 Master: S. Ferrigno, Experimental designs, 6h, M1, fourth year of EEIGM, Univ. Lorraine.
 Master: S. Ferrigno, Data analyzing and mining, 36h, M1, second year of ENSMN, Univ. Lorraine.
 Master: S. Ferrigno, Modeling and forecasting, 32h, M1, second year of ENSMN, Univ. Lorraine.
 Master: S. Ferrigno, Training projects, 18h, M1/M2, second and third year of ENSMN, Univ. Lorraine.
 Licence: S. Ferrigno, Descriptive and inferential statistics, 60h, L2, second year of EEIGM, Univ. Lorraine.
 Licence: S. Ferrigno, Statistical modeling, 60h, L2, second year of EEIGM, Univ. Lorraine.
 Licence: S. Ferrigno, Mathematical and computational tools, 20h, L3, third year of EEIGM, Univ. Lorraine.
 Licence: S. Ferrigno, Training projects, 40h, L1/L3, first, second and third year of EEIGM, Univ. Lorraine.
 Master: C. Fritsch, Inverse problem, 18h, M1, second year of ENSMN, Univ. Lorraine.
 Licence: C. Fritsch, Probability Theory tutorial, 27h, L3, first year of ENSMN, Univ. Lorraine.
 Licence: M. Gaillard, Numerical analysis and Optimization tutorial, 23h, L3, first year of ENSMN, Univ. Lorraine.
 Master: A. GégoutPetit, Statistics, modeling, data analysis, 80h, master in applied mathematics, Univ. Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 1A, 38h, L1/L2, first year of ENSMN, Univ. Lorraine.
 Licence: V. Hass, Mathématiques FIGIM 2A, 19h, L2, second year of ENSMN, Univ. Lorraine.
 Licence: V. Hass, Probabilités, 40h, L3, first year of ENSMN, Univ. Lorraine.
 Licence: V. Hass, Analyse numérique et optimisation, 45h, L3, first year of ENSMN, Univ. Lorraine.
 Licence: V. Hass, Recherche opérationnelle, 18h, L3, first year of ENSMN, Univ. Lorraine.
 Master: V. Hass, Méthodes stochastiques pour le calcul, 14h, M1, second year of ENSMN, Univ. Lorraine.
 Licence: R. Loubaton, Modélisation Statistique, 21.5h, L2, prépa intégrée de l'EEIGM, Univ. Lorraine.
 Licence: R. Loubaton, EDP, 20h, L2, prépa intégrée de l'EEIGM, Univ. Lorraine.
 Licence: R. Loubaton, Algèbre des matrices, 21.5h, L1, prépa intégrée de l'EEIGM, Univ. Lorraine.
 Licence: R. Loubaton, Calcul différentiel, 21.5h, L1, prépa intégrée de l'EEIGM, Univ. Lorraine.
 Master: A. Rago, Analyse de données, 18h, M1, second year of ENSMN, Univ. Lorraine.
 Master: A. Rago, Statistiques pour la grande dimension, 18h, M2 IMSD/third year of ENSMN, Univ. Lorraine.
 Master: D. Villemonais, Probability Theory II, 63h, M1, second year of ENSMN, Univ. Lorraine.
 Master: D. Villemonais, Stochastic processes, 32h, Master 2 MFA, Univ. Lorraine.
 Master: D. Villemonais, Modeling and forecasting, 14h, M1, second year of ENSMN, Univ. Lorraine.
 Licence: D. Villemonais, Probability Theory I, 57h, L3, first year of ENSMN, Univ. Lorraine.
 Master: S. WantzMézières, Learning and analysis of medical data, 36h, with J.M. Moureaux, M2 SNIM, Univ. Lorraine.
 Licence: S. WantzMézières, Applied mathematics for management, financial mathematics, Probability and Statistics, 160h, IUT NancyCharlemagne (L1/L2/L3), Univ. Lorraine.
 Licence: S. WantzMézières, Probability, 100h, first year in TELECOM Nancy (initial and apprenticeship cursus), Univ. Lorraine.
 Licence: N. Zalduendo Vidal, Probability Theory tutorial, 20h, L3, first year of ENSMN, Univ. Lorraine.
10.2.2 Supervision
PhD
 PhD in progress: Sophie Baland, “Telomere length dynamics: modelisation, estimation and application to diagnostic support systems” since September 2023, funding LUE. Advisors: S. Toupance (Univ. Lorraine) and D. Villemonais.
 PhD in progress: Virgile Brodu, “Émergence des allométries dans les systèmes écologiques : comportement stationnaire de modèles déterministes et stochastiques de flux d'énergie et de biomasse”, grant ENS Lyon. Advisors: S. Billiard (Univ. Lille), N. Champagnat, C. Fritsch.
 PhD in progress: Mathilde Gaillard, “Processus de Markov déterministes par morceaux et inférence bayésienne de réseaux de gènes”, grant PEPR Santé Numérique, since October 2023. Advisors: A. GégoutPetit, U. Herbach.
 PhD: Vincent Hass, “Individual-based models in adaptive dynamics and long time evolution under assumptions of rare advantageous mutations” 31, ATER in ENSMN. Defense on the 21st September. Advisor: N. Champagnat.
 PhD in progress: Anouar Jeddi, “Convergence of individual-based population models to Hamilton-Jacobi equations” since September 2023, grant ERC SINGER (Ecole Polytechnique). Advisors: S. Méléard (Ecole Polytechnique) and N. Champagnat.
 PhD in progress: Vincent Kagan, “Asymptotic behavior of epidemiological models with individual viral load” since September 2023, funding Université de Lorraine. Advisors: E. Strickler (Univ. Lorraine) and D. Villemonais.
 PhD: Rodolphe Loubaton, “Caractérisation des cibles thérapeutiques dans un programme génique tumoral”, ATER in EEIGM. Defense on the 21st December. Advisors: N. Champagnat and L. Vallat (CHRU Strasbourg).
 PhD in progress: Anouk Rago, “Inférence de réseaux de gènes dynamiques et prédiction d'expériences d'interventions biologiques dans des cellules cancéreuses”, grant Région GrandEst and Inria. Advisors: N. Champagnat, A. GégoutPetit.
 PhD: Nicolás Zalduendo Vidal, “Processus de branchement bisexués multitypes”, grant InriaCordi. Defense on the 18th December. Advisors: C. Fritsch, D. Villemonais.
Other
 Engineer: Walid Laziri, “Flow cytometry data analysis” (Plan de relance, contract with the startup EMOSIS), until November. Advisors: A. GégoutPetit, S. Mézières.
 M2 internship: Mathilde Gaillard, “Couplage et vitesse de convergence de processus de Markov déterministes par morceaux” (Univ. Lyon 1). Advisor: U. Herbach.
 M2 internship: Anouar Jeddi, “Convergence of individual-based population models to Hamilton-Jacobi equations” (M2 MSV, Univ. Paris-Saclay). Advisors: S. Méléard (Ecole Polytechnique), S. Mirrahimi (Univ. Montpellier), V.C. Tran (Univ. G. Eiffel) and N. Champagnat.
 M2 internship: Vincent Kagan, “Asymptotic behavior of epidemiological models with individual viral load”, funding Inria. Advisors: E. Strickler (IECL) and D. Villemonais.
 M2 SNIM Research Project: Marie Camonin, “Etude prospective de l'impact de certaines mutations génétiques dans le traitement des gliomes de bas grade”. Advisors: S. WantzMézières and J.M. Moureaux (CRAN).
 M2 IMSD Research Project: Nicolas Dinant, “Apport des statistiques spatiales dans l'étude de la localisation des tumeurs cérébrales”, Advisors: S. WantzMézières and J.M. Moureaux (CRAN).
 PIDR 2A TelecomNancy: A. Chevallier, A. Crivelli and T. Kieffer, “Machine Learning pour la mesure automatique du volume des tumeurs cérébrales”. Advisors: S. WantzMézières and J.M. Moureaux (CRAN).
 M1 ENSMN Research project: Antonin Clerc, “Émergence des allométries dans les écosystèmes” (fullyear research project). Advisors: V. Brodu and C. Fritsch.
 M1 ENSMN Research project: two 2nd year students, “Continuous-time Markov chains” (from September). Advisor: M. Gaillard.
10.2.3 Juries
 N. Champagnat was referee for the PhD theses of Léo Meyer (Univ. Orléans, 09/10/2023), Van Hai Thai (Univ. Nantes, 28/09/2023), Imane Akjouj (Univ. Lille, 29/06/2023) and Anaïs Rat (AixMarseille Univ., 31/05/2023). He was also president of the PhD committee of Aleksian Ashot (Univ. Saint-Etienne, 20/11/2023) and examiner for the PhD thesis of Vincent Hass (Univ. Lorraine, 26/09/2023).
 N. Champagnat and P. Vallois were examiners for the PhD thesis of Rodolphe Loubaton (Univ. Lorraine, 21/12/2023).
 C. Fritsch and D. Villemonais were examiners for the PhD thesis of Nicolás ZalduendoVidal (Univ. Lorraine, 18/12/2023).
 A. GégoutPetit was examiner for the HdR jury of Nathalie Krell (Univ. Rennes 1, 01/06/2023).
 A. GégoutPetit was referee for the PhD theses of Maéva Kyheng (Université de Lille, 01/03/2023) and Fatima Ezzahra MANA (Univ. Troyes, 03/09/2023). She was president of the PhD committees of Jérémie Frigério (Univ. Dijon, 18/12/2023), Jeremy Borderieu (AgroParistech, 14/12/2023) and Rodolphe Loubaton (Univ. Lorraine, 21/12/2023), and examiner for the PhD of Nicolas Zalduendo (Univ. Lorraine, 18/12/2023).
 V. Brodu has been part of the jury for the thesis prize awarded by Métropole du Grand Nancy (rewarding PhD theses that are creative, innovative, and/or implanted in the local territory).
10.3 Popularization
10.3.1 Education
 V. Brodu supervised a maths club for high-school students at Lycée Jeanne d'Arc, Nancy. This club hosted a dozen students for weekly two-hour sessions.
 J.M. Monnez wrote an updated version of the lecture notes 37 presented in the 2022 Activity Report.
 S. WantzMézières organised a Research Training Week on Neurooncology and Numerics, for medical and engineering students in Nancy in March.
10.3.2 Interventions
 S. Ferrigno: Advisor of a group of EEIGM students, “Ateliers expérimentaux : Mécanique et Statistique” Project, various high schools, Nancy.
 S. Ferrigno: Advisor of a group of EEIGM students, “La main à la Pâte” Project, Institut médicoéducatif (IME), Commercy.
 S. Ferrigno: Advisor of a group of EEIGM students, “La main à la Pâte”, “CGénial” Projects, Collèges Paul Verlaine in Malzéville and La Craffe in Nancy.
 S. Ferrigno: Advisor of a group of EEIGM students, “La main à la Pâte” Project, elementary schools, Nancy.
 C. Fritsch gave three talks as part of the “Chiche!” program at Lycée De La Salle, Metz, in December.
 M. Gaillard gave a talk as part of the “Chiche!” program at Lycée René Cassin, Mâcon, in October. She presented her educational background and possible directions for a doctoral thesis in mathematics; explained what mathematical modeling is, and illustrated it with a stochastic gene expression model.
 U. Herbach gave a talk “Les maths peuvent-elles servir à vaincre le cancer ?” as part of a training course organized by Maison pour la science en Lorraine for secondary and high school teachers, at Faculté de Pharmacie de Nancy (campus Brabois) in October.
 A. Rago participated in the regional finals (Université de Lorraine) of “MT180”.
11 Scientific production
11.1 Major publications
 1 cvmgof: an R package for Cramér-von Mises goodness-of-fit tests in regression models. Journal of Statistical Computation and Simulation 92(6), 2022, 1246-1266.
 2 Optimal choice among a class of nonparametric estimators of the jump rate for piecewise-deterministic Markov processes. Electronic Journal of Statistics 10(2), 2016, 3648-3692.
 3 Multiscale eco-evolutionary models: from individuals to populations. International Congress of Mathematicians, ICM 2022, 7(1), fully virtually, Russia. EMS Press, December 2023, 5656-5678.
 4 General criteria for the study of quasi-stationarity. Electronic Journal of Probability, 2023.
 5 How to Combine Tree-Search Methods in Reinforcement Learning. AAAI 19 - Thirty-Third AAAI Conference on Artificial Intelligence, Honolulu, Hawaii, United States, January 2019.
 6 Short-range interactions between fibrocytes and CD8+ T cells in COPD bronchial inflammatory response. October 2022.
 7 The Multi-type Bisexual Galton-Watson Branching Process. Annales de l'I.H.P. Probabilités et statistiques, 2024.
 8 Stochastic approximation of eigenvectors and eigenvalues of the Q-symmetric expectation of a random matrix. Communications in Statistics - Theory and Methods 53(5), 2024, 1669-1683.
 9 The individual’s signature of telomere length distribution. Scientific Reports 9(1), January 2019, 1-8.
 10 One model fits all: Combining inference and simulation of gene regulatory networks. PLoS Computational Biology 19(3), March 2023, e1010962.
11.2 Publications of the year
International journals
 11 article General diffusion processes as the limit of time-space Markov chains. The Annals of Applied Probability 33(5), 2023, 3620-3651. HAL, DOI
 12 article A Bayesian implementation of quality-by-design for the development of cationic nanolipid for siRNA transfection. IEEE Transactions on NanoBioscience 22(3), July 2023, 455-466. HAL, DOI
 13 article Existence, uniqueness and ergodicity for the centered Fleming-Viot process. Stochastic Processes and their Applications 166, December 2023, 104219. HAL, DOI
 14 article Filling the gap between individual-based evolutionary models and Hamilton-Jacobi equations. Journal de l'École polytechnique — Mathématiques 10, 2023, 1247-1275. HAL, DOI
 15 article General criteria for the study of quasi-stationarity. Electronic Journal of Probability, 2023. HAL, DOI
 16 article Quasi-stationary behavior for a piecewise deterministic Markov model of chemostat: the Crump-Young model. Annales Henri Lebesgue, 2024. HAL
 17 article Binary branching processes with Moran type interactions. Annales de l'I.H.P. Probabilités et statistiques, 2024. HAL
 18 article Invert emulsions alleviate biotic interactions in bacterial mixed culture. Microbial Cell Factories 22(1), December 2023, 16. HAL, DOI
 19 article Probabilistic Cellular Automata modeling of intercellular interactions in airways: complex pattern formation in patients with Chronic Obstructive Pulmonary Disease. Journal of Theoretical Biology, March 2023, 111448. HAL, DOI
 20 article Effects of a physical activity and endometriosis-based education program delivered by videoconference on endometriosis symptoms: the CRESCENDO program (inCRease physical Exercise and Sport to Combat ENDOmetriosis) protocol study. Trials 24(1), November 2023, 759. HAL, DOI
 21 article The Multitype Bisexual Galton-Watson Branching Process. Annales de l'I.H.P. Probabilités et statistiques, 2024. HAL
 22 article Single-cell transcriptional uncertainty landscape of cell differentiation. F1000Research 12, April 2023, 426. HAL, DOI
 23 article Fluctuations of balanced urns with infinitely many colours. Electronic Journal of Probability 28, 2023, 1-72. HAL, DOI
 24 article Stochastic approximation of eigenvectors and eigenvalues of the Q-symmetric expectation of a random matrix. Communications in Statistics - Theory and Methods 53(5), 2024, 1669-1683. HAL, DOI
 25 article One model fits all: Combining inference and simulation of gene regulatory networks. PLoS Computational Biology 19(3), March 2023, e1010962. HAL, DOI
International peer-reviewed conferences
 26 inproceedings Multiscale eco-evolutionary models: from individuals to populations. International Congress of Mathematicians, ICM 2022, 7(1), fully virtually, Russia. EMS Press, December 2023, 5656-5678. HAL, DOI
 27 inproceedings Simulation d'expériences d'intervention biologique dans des cellules cancéreuses à partir de données temporelles d'expression de gènes. 54es Journées de Statistique de la SFdS (JdS 2023), Bruxelles, Belgium, 2023. HAL
Conferences without proceedings
 28 inproceedings Méthodes de réduction de dimension basées sur l'algorithme FINE pour le clustering de patients à partir de données de cytométrie en flux. 54es Journées de Statistiques de la SFDS, Bruxelles (BEL), Belgium, July 2023. HAL
 29 inproceedings Dimension reduction methods based on FINE algorithm for clustering patients from flow cytometry data. ENBIS 2023 Conference, Valencia, Spain, September 2023. HAL
Scientific book chapters
 30 inbook Harissa: Stochastic Simulation and Inference of Gene Regulatory Networks Based on Transcriptional Bursting. Computational Methods in Systems Biology, Lecture Notes in Computer Science 14137, Springer, September 2023, 97-105. HAL, DOI
Doctoral dissertations and habilitation theses
 31 thesis Individual-based models in adaptive dynamics, asymptotic behaviour and canonical equation: the case of small and frequent mutations. Université de Lorraine, September 2023. HAL
Reports & preprints
 32 misc A branching model for intergenerational telomere length dynamics. October 2023. HAL
 33 misc Convergence of individual-based models with small and frequent mutations to the canonical equation of adaptive dynamics. March 2023. HAL
 34 misc Uniform Wasserstein convergence of penalized Markov processes. June 2023. HAL
 35 misc MultiRNAflow: integrated analysis of temporal RNA-seq data with multiple biological conditions. 2023. HAL
 36 misc An extended Oja process for streaming canonical analysis. 2023. HAL
11.3 Other
Educational activities
 37 unpublished Analyse des données en flux. Analyse en composantes principales et méthodes dérivées. June 2023, Doctoral, France. HAL
Software
 38 software Harissa: tools for mechanistic gene network inference from single-cell data. 3.0.7, 2023, lic: BSD 3-Clause. HAL, Software Heritage, VCS
11.4 Cited publications
 39 article Spatial eco-evolutionary dynamics along environmental gradients: multistability and cluster dynamics. Ecology Letters 22(5), May 2019, 767-777. HAL
 40 software cvmgof: Cramer-von Mises goodness-of-fit tests. 1.0.0, November 2018, lic: CeCILL. HAL, Software Heritage
 41 article cvmgof: an R package for Cramér-von Mises goodness-of-fit tests in regression models. Journal of Statistical Computation and Simulation 92(6), 2022, 1246-1266. HAL, DOI
 42 article A recursive nonparametric estimator for the transition kernel of a piecewise-deterministic Markov process. ESAIM: Probability and Statistics 18, 2014, 726-749.
 43 inproceedings Nonparametric estimation of the jump rate for nonhomogeneous marked renewal processes. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 49(4), Institut Henri Poincaré, 2013, 1204-1231.
 44 article Identification of pharmacokinetics models in the presence of timing noise. Eur. J. Control 14(2), 2008, 149-157. URL: http://dx.doi.org/10.3166/ejc.14.149157 DOI
 45 article Phenomenological modeling of tumor diameter growth based on a mixed effects model. Journal of theoretical biology 262(3), 2010, 544-552.
 46 book Neuro-Dynamic Programming. Athena Scientific, 1996.
 47 article A fast and recursive algorithm for clustering large datasets with k-medians. Computational Statistics and Data Analysis 56, 2012, 1434-1449. HAL, DOI
 48 article Adaptation in a stochastic multi-resources chemostat model. Journal de Mathématiques Pures et Appliquées 101(6), June 2014, 755-788. HAL, DOI
 49 article Exponential convergence to quasi-stationary distribution and Q-process. Probability Theory and Related Fields 164(1), 46 pages, 2016, 243-283. HAL, DOI
 50 article Practical criteria for R-positive recurrence of unbounded semigroups. Electronic Communications in Probability 25(6), 2020, 1-11. HAL, DOI
 51 article Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. Journal of the Royal Statistical Society. Series B (Methodological), 1984, 353-388.
 52 article Statistical estimation of a growth-fragmentation model observed on a genealogical tree. Bernoulli 21(3), 2015, 1760-1799.
 53 article Sequential linear regression with online standardized data. PLoS ONE, 2018, 1-27. HAL, DOI
 54 article Short-range interactions between fibrocytes and CD8+ T cells in COPD bronchial inflammatory response. eLife, October 2022. Note: Microscopy was performed at BIC, a service unit of the CNRS-INSERM and Bordeaux University, a member of the national BioImaging infrastructure of France supported by the French National Research Agency (ANR-10-INBS-0004). HAL, DOI
 55 article Un test d'adéquation global pour la fonction de répartition conditionnelle. C. R. Math. Acad. Sci. Paris 341(5), 2005, 313-316. URL: http://dx.doi.org/10.1016/j.crma.2005.07.003 DOI
 56 article Uniform law of the logarithm for the local linear estimator of the conditional distribution function. C. R. Math. Acad. Sci. Paris 348(17-18), 2010, 1015-1019. URL: http://dx.doi.org/10.1016/j.crma.2010.08.003 DOI
 57 article Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 2008, 432-441.
 58 article A numerical approach to determine mutant invasion fitness and evolutionary singular strategies. Theoretical Population Biology 115, 2017, 89-99. HAL, DOI
 59 article Graph selection with GGMselect. Statistical applications in genetics and molecular biology 11(3), 2012.
 60 inproceedings Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles. ISAAC (1), 2010, 415-426.
 61 article Inferring gene regulatory networks from single-cell data: a mechanistic approach. BMC Systems Biology 11(1), November 2017, 105. HAL, DOI
 62 article From persistent random walk to the telegraph noise. Stoch. Dyn. 10(2), 2010, 161-196. URL: http://dx.doi.org/10.1142/S0219493710002905 DOI
 63 incollection Modeling subtilin production in bacillus subtilis using stochastic hybrid systems. Hybrid Systems: Computation and Control, Springer, 2004, 417-431.
 64 article Multinomial model-based formulations of TCP and NTCP for radiotherapy treatment planning. Journal of Theoretical Biology 279(1), June 2011, 55-62. URL: http://hal.inria.fr/hal00588935/en DOI
 65 book Quantile regression. 38. Cambridge university press, 2005.
 66 incollection On the Benzecri's method for computing eigenvectors by stochastic approximation (the case of binary data). Compstat 1974 (Proc. Sympos. Computational Statist., Univ. Vienna, Vienna, 1974), Vienna, Physica Verlag, 1974, 202-211.
 67 inproceedings Non-Stationary Approximate Modified Policy Iteration. ICML 2015, Lille, France, July 2015. HAL
 68 article Dirac mass dynamics in multidimensional nonlocal parabolic equations. Communications in Partial Differential Equations 36(6), 2011, 1071-1098.
 69 article High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 2006, 1436-1462.
 70 article Approximation stochastique en analyse factorielle multiple. Ann. I.S.U.P. 50(3), 2006, 27-45.
 71 article Convergence d'un processus d'approximation stochastique en analyse factorielle. Publ. Inst. Statist. Univ. Paris 38(1), 1994, 37-55.
 72 article Stochastic approximation of eigenvectors and eigenvalues of the Q-symmetric expectation of a random matrix. Communications in Statistics - Theory and Methods 53(5), 15 pages, 2024, 1669-1683. HAL, DOI
 73 article Stochastic approximation of the factors of a generalized canonical correlation analysis. Statist. Probab. Lett. 78(14), 2008, 2210-2216. URL: http://dx.doi.org/10.1016/j.spl.2008.01.088 DOI
 74 article On nonparametric estimates of density functions and regression curves. Theory of Probability & Its Applications 10(1), 1965, 186-190.
 75 techreport The simplex method is strongly polynomial for deterministic Markov decision processes. arXiv:1208.5083v2, 2012.
 76 book Markov Decision Processes. Wiley, New York, 1994.
 77 article Single-Cell-Based Analysis Highlights a Surge in Cell-to-Cell Molecular Variability Preceding Irreversible Commitment in a Differentiation Process. PLoS Biology 14(12), December 2016. HAL, DOI
 78 inproceedings Brownian penalisations related to excursion lengths, VII. Annales de l'IHP Probabilités et statistiques 45(2), 2009, 421-452.
 79 article Stochastic calculus with respect to continuous finite quadratic variation processes. Stochastics: An International Journal of Probability and Stochastic Processes 70(1-2), 2000, 1-40.
 80 inproceedings Approximate Policy Iteration Schemes: A Comparison. ICML - 31st International Conference on Machine Learning - 2014, Pékin, China, June 2014. HAL
 81 article Approximate Modified Policy Iteration and its Application to the Game of Tetris. Journal of Machine Learning Research 16, to appear, 2015, 1629-1676. HAL
 82 article Improved and Generalized Upper Bounds on the Complexity of Policy Iteration. Mathematics of Operations Research, February 2016. Keywords: Markov decision processes; Dynamic Programming; Analysis of Algorithms. HAL, DOI
 83 article Memory-based persistence in a counting random walk process. Phys. A. 386(1), 2007, 303-307. URL: http://dx.doi.org/10.1016/j.physa.2007.08.027 DOI
 84 article The range of a simple random walk on Z. Advances in applied probability, 1996, 1014-1033.
 85 misc An introduction to network inference and mining. (accessed 22/07/2015), 2015. URL: http://www.nathalievilla.org/doc/pdf//wikistatnetwork_compiled.pdf
 86 article The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate. Math. Oper. Res. 36(4), 2011, 593-603.