BIGS is a joint team of Inria, CNRS and Université de Lorraine, via the Institut Élie Cartan, UMR 7502 CNRS-UL laboratory in mathematics, of which Inria is a strong partner. One member of BIGS, T. Bastogne, comes from the Research Center of Automatic Control of Nancy (CRAN), with which BIGS has strong relations in the domain "Health-Biology-Signal". Our research focuses mainly on stochastic modeling and statistics, with the aim of better understanding biological systems. BIGS involves applied mathematicians whose research interests mainly concern probability and statistics. More precisely, our attention is directed at (1) stochastic modeling, (2) estimation and control for stochastic processes, (3) algorithms and estimation for graph data and (4) regression and machine learning. The main objective of BIGS is to exploit these skills in applied mathematics to provide a better understanding of issues arising in life sciences, with a special focus on (1) tumor growth, (2) photodynamic therapy, (3) population studies of genomic data and of micro-organisms genomics, (4) epidemiology and e-health.

We give here the main lines of our research, which belongs to the domains of probability and statistics. For clarity, we have chosen to structure them in four items. Although this choice was not arbitrary, the boundaries between these items are sometimes fuzzy because each of them deals with modeling and inference and they are all interconnected.

Our aim is to propose relevant stochastic frameworks for the modeling and the understanding of biological systems. Stochastic processes are particularly suitable for this purpose. Among them, Markov chains give a first framework for the modeling of populations of cells 83, 59. Piecewise deterministic processes are non-diffusion processes also frequently used in the biological context 49, 58, 51. Among Markov models, we have developed strong expertise in processes derived from Brownian motion and Stochastic Differential Equations 76, 57. For instance, knowledge about Brownian or random walk excursions 82, 74 helps to analyse genetic sequences and to develop inference about them. However, nature provides us with many examples of systems such that the observed signal has a given Hölder regularity, which does not correspond to the one we might expect from a system driven by ordinary Brownian motion.

This situation is commonly handled by noisy equations driven by Gaussian processes such as fractional Brownian motion or fractional fields. The basic aspects of these differential equations are now well understood, mainly thanks to the so-called rough paths tools 66, but also by invoking the Russo-Vallois integration techniques 75. The specific issue of Volterra equations driven by fractional Brownian motion, which is central for the problem of subdiffusion within proteins, is addressed in 50. Many generalizations (Gaussian or not) of this model have been recently proposed for some Gaussian locally self-similar fields, or for some non-Gaussian models 62, or for anisotropic models 44.
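
As a reminder of why fractional Brownian motion fits signals with a prescribed Hölder regularity, its standard definition through the covariance function can be written as follows. The symbols $B^H$ (the process) and $H$ (the Hurst index) are notation introduced here for illustration; they do not appear in the text above.

```latex
\mathbb{E}\big[B^H_t \, B^H_s\big]
  = \tfrac{1}{2}\big(t^{2H} + s^{2H} - |t-s|^{2H}\big),
\qquad s, t \ge 0, \quad H \in (0,1).
```

The paths of $B^H$ are almost surely Hölder continuous of any order strictly less than $H$, and the case $H = 1/2$ recovers ordinary Brownian motion.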

We develop inference about stochastic processes that we use for modeling. Control of stochastic processes is also a way to optimise administration (dose, frequency) of therapy.

There are many estimation techniques for diffusion processes or coefficients of fractional or multifractional Brownian motion according to a set of observations 61, 40, 48. However, the inference problem for diffusions driven by a fractional Brownian motion is still in its infancy. Our team has solid expertise in the inference of the jump rate and the kernel of piecewise-deterministic Markov processes (PDMP) 39, 35, 38, 37, but many directions remain to be explored. For instance, previous work made the assumption of a complete observation of jumps and mode, which is unrealistic in practice. We tackle the problem of inference of “hidden PDMP”. For example, in pharmacokinetics modeling inference, we want to account for the presence of timing noise and identification from longitudinal data. We have expertise on these subjects 41, and we also used mixed models to estimate tumor growth 42.

We consider the control of stochastic processes within the framework of Markov Decision Processes 73 and their generalization known as multi-player stochastic games, with a particular focus on infinite-horizon problems. In this context, we are interested in the complexity analysis of standard algorithms, as well as the proposition and analysis of numerical approximate schemes for large problems in the spirit of 43. Regarding complexity, a central topic of research is the analysis of the Policy Iteration algorithm, which has made significant progress in the last years 85, 72, 56, 79, but is still not fully understood. For large problems, we have a long experience of sensitivity analysis of approximate dynamic programming algorithms for Markov Decision Processes 81, 80, 77, 65, 78, and we currently investigate whether/how similar ideas may be adapted to multi-player stochastic games.
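
To make the object of the complexity analysis concrete, here is a minimal sketch of Policy Iteration on a finite discounted Markov Decision Process. The data layout (`P`, `R`), the toy two-state problem and the tolerance values are assumptions chosen for illustration; the code is not taken from the cited works.

```python
def policy_iteration(P, R, gamma=0.9, tol=1e-10):
    """Policy Iteration for a finite discounted MDP.

    P[s][a] is a list of (probability, next_state) pairs,
    R[s][a] is the expected immediate reward, gamma the discount factor.
    """
    n_states = len(P)
    policy = [0] * n_states
    while True:
        # Policy evaluation: iterate the Bellman operator of the fixed policy.
        V = [0.0] * n_states
        while True:
            delta = 0.0
            for s in range(n_states):
                a = policy[s]
                v = R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to V.
        stable = True
        for s in range(n_states):
            q = [R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                 for a in range(len(P[s]))]
            best = max(range(len(q)), key=q.__getitem__)
            if q[best] > q[policy[s]] + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Toy problem (hypothetical): staying in state 1 pays 1 per step, all else pays 0.
P = [[[(1.0, 0)], [(1.0, 1)]],   # state 0: action 0 stays, action 1 moves to state 1
     [[(1.0, 1)], [(1.0, 0)]]]   # state 1: action 0 stays, action 1 moves to state 0
R = [[0.0, 0.0],
     [1.0, 0.0]]
policy, V = policy_iteration(P, R)
```

On this toy problem the algorithm stops after two improvement rounds; the open questions mentioned above concern how the number of such rounds grows with the size of the problem.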

A graph data structure consists of a set of nodes, together with a set of pairs of these nodes called edges. This type of data is frequently used in biology because they provide a mathematical representation of many concepts such as biological structures and networks of relationships in a population. Some attention has recently been focused in the group on modeling and inference for graph data.

Network inference is the process of making inference about the link between two variables, taking into account the information about other variables. Reference 84 gives a very good introduction and many references about network inference and mining. Many methods are available to infer and test edges in Gaussian graphical models 84, 67, 54, 55. However, the Gaussian assumption does not hold when dealing with typical “zero-inflated” abundance data, and we want to develop inference in this case.

Among graphs, trees play a special role because they offer a good model for many biological concepts, from RNA to phylogenetic trees through plant structures. Our research deals with several aspects of tree data. In particular, we work on statistical inference for this type of data under a given stochastic model. We also work on lossy compression of trees via directed acyclic graphs. These methods enable us to compute distances between tree data faster than from the original structures and with a high accuracy.
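
As an illustration of how a tree can be represented by a directed acyclic graph, the sketch below performs the exact (lossless) reduction that merges identical subtrees into shared nodes. This is a swapped-in baseline and not the team's lossy method, whose approximation step is not described here; the nested-tuple encoding of trees is an assumption made for the example.

```python
def compress_to_dag(tree):
    """Merge identical subtrees of an ordered tree into shared DAG nodes.

    A tree is a nested tuple of children; a leaf is the empty tuple ().
    Returns (root_id, nodes) where nodes[i] is the tuple of child ids of node i.
    """
    ids = {}      # maps a tuple of child ids to its node id
    nodes = []    # nodes[i] = child ids of DAG node i

    def visit(subtree):
        key = tuple(visit(child) for child in subtree)
        if key not in ids:
            ids[key] = len(nodes)
            nodes.append(key)
        return ids[key]

    root = visit(tree)
    return root, nodes


leaf = ()
cherry = (leaf, leaf)
tree = (cherry, cherry, leaf)        # 8 nodes in the original tree
root, nodes = compress_to_dag(tree)  # only 3 distinct subtrees remain
```

Because repeated subtrees are stored once, any computation that depends only on subtree shape can be carried out on the smaller structure.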

Regression models and machine learning aim at inferring statistical links between a variable of interest and covariates. In biological studies, it is always important to develop adapted learning methods, both for standard data and for high-dimensional data (sometimes with few observations) and very massive or online data.

Many methods are available to estimate conditional quantiles and test dependencies 71, 60. Among them we have developed nonparametric estimation by local analysis via kernel methods 52, 53 and we want to study properties of this estimator in order to derive measures of risk such as confidence bands and tests. We also study many other regression models such as survival analysis and spatio-temporal models with covariates. Among the multiple regression models, we want to develop omnibus tests that examine several assumptions together.

Concerning the analysis of high dimensional data, our view on the topic relies on the French data analysis school, specifically on Factorial Analysis tools. In this context, stochastic approximation is an essential tool 64, which allows one to approximate eigenvectors in a stepwise manner 69, 68, 70.
BIGS aims at performing accurate classification or clustering by taking advantage of the possibility of updating the information "online" using stochastic approximation algorithms 45. We focus on several incremental procedures for regression and data analysis like linear and logistic regressions and PCA (Principal Component Analysis).
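
The stepwise approximation of eigenvectors can be sketched with an Oja-type recursion, which updates an estimate of the leading principal direction one observation at a time. This is a generic textbook version; the step-size schedule, the synthetic two-dimensional stream and all function names are assumptions made for illustration, not the team's algorithms.

```python
import math
import random


def oja_first_eigenvector(stream, dim, rng):
    """Estimate the leading eigenvector of E[x x^T] from a stream of vectors x."""
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    for n, x in enumerate(stream, start=1):
        step = 1.0 / (100.0 + n)            # decreasing step size (assumed schedule)
        proj = sum(xi * wi for xi, wi in zip(x, w))
        w = [wi + step * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(c * c for c in w))
        w = [c / norm for c in w]            # renormalise after each observation
    return w


def sample_stream(n, rng):
    """Centred 2-D observations whose main axis of variance is (0.6, 0.8)."""
    for _ in range(n):
        a = rng.gauss(0.0, 3.0)   # strong component along (0.6, 0.8)
        b = rng.gauss(0.0, 0.5)   # weak component along (-0.8, 0.6)
        yield (0.6 * a - 0.8 * b, 0.8 * a + 0.6 * b)


rng = random.Random(0)
w_hat = oja_first_eigenvector(sample_stream(5000, rng), dim=2, rng=rng)
alignment = abs(0.6 * w_hat[0] + 0.8 * w_hat[1])   # close to 1 when well aligned
```

Each observation is used once and then discarded, which is what makes this family of procedures suitable for online data.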

We also focus on the biological context of high-throughput bioassays in which several hundreds or thousands of biological signals are measured for subsequent analysis. We have to account for the inter-individual variability within the modeling procedure. We aim at developing a new solution based on an ARX (Auto Regressive model with eXternal inputs) model structure using the EM (Expectation-Maximisation) algorithm for the estimation of the model parameters.

On this topic, we want to propose branching processes to model the appearance of mutations in tumors, through new collaborations with clinicians who measure a particular quantity called circulating tumor DNA (ctDNA). The final purpose is to use ctDNA as an early biomarker of the resistance to an immunotherapy treatment: it is the aim of the ITMO project. Another topic is the identification of dynamic networks of gene expression. In the ongoing work on low-grade gliomas, a local database of 400 patients will be soon available to construct models. We plan to extend it through national and international collaborations (Montpellier CHU, Montreal CRHUM). Our aim is to build a decision-aid tool for personalised medicine. In the same context, there is a topic of clustering analysis of a brain cartography obtained by sensorial simulations during awake surgery.

Despite the 'G' in the name of BIGS, Genetics is not central in the applications of the team. However, we want to contribute to a better understanding of the correlations between genes through their expression data and of the genetic bases of drug response and disease. We have contributed to methods detecting proteomics and transcriptomics variables linked with the outcome of a treatment.

Much work remains in our ongoing projects on personalized medicine with CHU Nancy. They deal with biomarker research, the prognostic value of quantitative variables and events, scoring, and adverse events. We also want to develop our expertise in change-point (rupture) detection in a project with APHP (Assistance Publique Hôpitaux de Paris) for the detection of adverse events earlier than the clinical signs and symptoms. The clinical relevance of predictive analytics is obvious for high-risk patients such as those with solid organ transplantation or severe chronic respiratory disease. The main challenge is change-point detection in multivariate and heterogeneous signals (for instance daily measures of electrocardiogram, body temperature, spirometry parameters, sleep duration, etc.). Other collaborations with clinicians concern foetopathology, and we want to use our work on conditional distribution functions to explain fetal and child growth. We have data from the "Service de foetopathologie et de placentologie" of the "Maternité Régionale Universitaire" (CHU Nancy).

Telomeres are disposable buffers at the ends of chromosomes which are truncated during cell division, so that, over time, the telomere ends become shorter with each cell division. In this way, they are markers of aging. Through a collaboration with Pr A. Benetos, geriatrician at CHU Nancy, we recently obtained data on the distribution of the length of telomeres from blood cells. With members of Inria team TOSCA, we want to work in three connected directions: (1) refine methodology for the analysis of the available data; (2) propose a dynamical model for the lengths of telomeres and study its mathematical properties (long term behavior, quasi-stationarity, etc.); and (3) use these properties to develop new statistical methods. A postdoctoral position is already planned in the Lorraine Université d'Excellence (LUE) project GEENAGE (managed by CHU Nancy).

We followed Inria's recommendations to get involved in the fight against COVID-19. We tried to collaborate with the LCPME laboratory in order to predict the number of SARS-CoV-2 positive patients from the Grand Nancy metropolitan area at the Nancy University Hospital from the concentration of SARS-CoV-2 residues in waste water. We encountered difficulties with the Obépine network in obtaining raw data instead of pre-processed indicators. We made predictions from the incidence rates available on Santé Publique France. The predictions are available on the siwam website.

We were also involved in the MODCOV19 project, a platform of coordination of research actions about modeling of SARS-CoV-2 (Covid-19) pandemic. We were in particular responsible for the bibliographic awareness group of the coordination committee.

The list of permanent members of the team noticeably increased in 2021, due to the arrival of several researchers from the former Inria team Tosca. These researchers are experts in stochastic modeling and analysis for bio-medical applications. Their arrival strengthened the first axis of our research program. We are currently proposing a new Inria team, Simba, which takes into account these arrivals and the recruitments of the past few years in our team, and more generally on the topic of mathematical biology in Institut Élie Cartan de Lorraine.

The team has been developing three new packages.

The aim is to better understand how living cells make decisions (e.g., differentiation of a stem cell into a particular specialized type), seeing decision-making as an emergent property of an underlying complex molecular network. Indeed, it is now proven that cells react probabilistically to their environment: cell types do not correspond to fixed states, but rather to “potential wells” of a certain energy landscape (representing the energy of the possible states of the cell) that we are trying to reconstruct. A first paper proposing a reconstruction method has been submitted 26 in the framework of an international collaboration (USA, Switzerland, France). Another paper is about to be submitted 28, dealing more specifically with the inference of the underlying networks.

This work continues the ITMO Cancer project, supervised by Nicolas Champagnat, concerning the modeling of circulating tumor DNA (ctDNA) to detect the appearance of resistance to targeted therapies (personalized medicine). After a phase of investigation of possible scenarios in collaboration with Alexandre Harlé of the Institute of Cancerology of Lorraine (ICL), a final model was selected. Based on a mathematical analysis, the members of the project then designed a statistical inference algorithm (learning the parameters of the model, including the genealogical tree of mutations for each patient) which is intended to be validated on real data currently being acquired at the Nancy CHRU. The general idea is to exploit a “variational principle” that makes it possible to explore the very large discrete space of family trees through a “pivot” space of continuous parameters that are easy to optimize (and reasonable in number). A paper detailing the model and its inference is in preparation. This method allows for the reconstruction of intratumoral heterogeneity, i.e. the subclone composition of the tumor. Based on these data, we are currently studying models of stochastic tumor growth with an emphasis on interactions between the clones to assess the effects of different treatment strategies.

We are continuing our research on quasi-stationary distributions (QSD), that is, distributions of Markov stochastic processes with absorption, which are stationary conditionally on non-absorption. For models of biological populations, absorption corresponds usually to extinction of a (sub-)population. QSDs are fundamental tools to describe the population state before extinction and to quantify the large-time behavior of the probability of extinction.
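
In symbols, and with notation introduced here for illustration ($X$ the process, $\tau_\partial$ its absorption time, $\nu$ a probability measure on the non-absorbed states), the definition of a QSD reads:

```latex
\mathbb{P}_\nu\big(X_t \in A \,\big|\, t < \tau_\partial\big) = \nu(A)
\quad \text{for all } t \ge 0 \text{ and all measurable sets } A .
```

A standard consequence is that, when the process is started from a QSD $\nu$, the absorption time is exponentially distributed, $\mathbb{P}_\nu(t < \tau_\partial) = e^{-\lambda_0 t}$ for some $\lambda_0 \ge 0$, which is why QSDs quantify the large-time behavior of the extinction probability.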

This year, we solved a general conjecture on the Fleming-Viot particle systems approximating QSDs: in cases where several QSDs exist, it is expected that the stationary distributions of the Fleming-Viot processes approach a particular QSD, called the minimal QSD. We proved in 7 that this holds true for general absorbed Markov processes with soft obstacles. We also obtained in 8 criteria based on Lyapunov functions that make it possible to check the general conditions of 47, which characterize the exponential uniform convergence in total variation of conditional distributions of an absorbed Markov process to a unique quasi-stationary distribution. Among the various applications given there, we prove that these conditions apply to any logistic Feller diffusion in any dimension conditioned on the non-extinction of all its coordinates. This question had been left partly open since the first work of Cattiaux and Méléard on this topic 46.

Together with M. Benaïm (Univ. Neuchâtel), we studied in 4 stochastic algorithms to approximate quasi-stationary distributions of diffusion processes absorbed at the boundary of a bounded domain. We considered a reinforced version of the diffusion, which is resampled according to its occupation measure when it reaches the boundary. We showed that its occupation measure converges to the unique quasi-stationary distribution of the diffusion process. We also obtained in 24 general criteria ensuring existence, uniqueness and/or exponential convergence properties for quasi-stationary distributions. The criteria were specifically designed to apply to degenerate processes such as hypoelliptic diffusions. We also provided in 25 a counterexample to the uniqueness of a quasi-stationary distribution for a diffusion process which satisfies the weak Hörmander condition.

Together with R. Schott (IECL, Univ. Lorraine), we studied in 6 models of deadlocks in distributed systems, using the approach we developed in 8 to study quasi-stationary distributions, in order to characterize and compute numerically the asymptotic behaviour of the deadlock time and the behaviour of the system before deadlock, both for discrete and for diffusion models.

We studied models of food web adaptive evolution in 10. We identified the biomass conversion efficiency as a key mechanism underlying food web evolution and discussed the relevance of such models to study the evolution of food webs.

We studied evolutionary models of bacteria with horizontal transfer in 5. Horizontal transfer is a common mechanism of DNA exchange between micro-organisms that is thought to be responsible for the fast evolution of antibiotic resistance in bacteria or of virulence in pathogens. We considered a scaling of parameters taking into account the influence of negligible but non-extinct populations, allowing us to study specific phenomena observed in these models (re-emergence of traits, cyclic evolutionary dynamics and evolutionary suicide). This work is done in collaboration with S. Méléard (École Polytechnique) and V.C. Tran (Univ. Paris Est Marne-la-Vallée).

We also worked on general evolutionary models of adaptive dynamics under an assumption of large population and small mutations. This year, we obtained existence, uniqueness and ergodicity results for a centered version of the Fleming-Viot process of population genetics, which is a key step to recover variants of the canonical equation of adaptive dynamics, which describes the long time evolution of the dominant phenotype in the population, under less stringent biological assumptions than in previous works. We plan to complete this work next year.

We consider Offline Reinforcement Learning methods. The problem is to learn a policy from logged transitions of an environment, without any interaction. In the presence of function approximation, and under the assumption of limited coverage of the state-action space of the environment, it is necessary to constrain the policy to visit state-action pairs close to the support of logged transitions.

In 17, we propose an iterative procedure to learn a pseudometric (closely related to bisimulation metrics) from logged transitions, and use it to define this notion of closeness. We show its convergence and extend it to the function approximation setting. We then use this pseudometric to define a new lookup based bonus in an actor-critic algorithm: PLOFF. This bonus encourages the actor to stay close, in terms of the defined pseudometric, to the support of logged transitions.

In 18, noticing that an agent in this setting should avoid selecting actions whose consequences cannot be predicted from the data, we take inspiration from the literature on bonus-based exploration to design a new offline RL agent. The core idea is to subtract a prediction-based exploration bonus from the reward, instead of adding it for exploration. This allows the policy to stay close to the support of the dataset. We connect this approach to a more common regularization of the learned policy towards the data. Instantiated with a bonus based on the prediction error of a variational autoencoder, we show that our agent is competitive with the state of the art on a set of continuous control locomotion and manipulation tasks.
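
The idea of subtracting a bonus from the reward can be sketched as follows. The bonus used here is a deliberately simple stand-in, the Euclidean distance to the nearest logged state-action pair, and not the variational-autoencoder prediction error used in 18; the function name and the weight `alpha` are hypothetical.

```python
import math


def penalized_reward(reward, state, action, logged_pairs, alpha=1.0):
    """Reward minus a bonus that grows when (state, action) is far from the data."""
    query = tuple(state) + tuple(action)
    # Stand-in bonus: distance to the nearest logged state-action pair.
    bonus = min(math.dist(query, tuple(s) + tuple(a)) for s, a in logged_pairs)
    return reward - alpha * bonus


# Two logged transitions (hypothetical): 2-D states, 1-D actions.
logged = [((0.0, 0.0), (1.0,)), ((1.0, 0.0), (0.0,))]
in_support = penalized_reward(1.0, (0.0, 0.0), (1.0,), logged)    # no penalty
off_support = penalized_reward(1.0, (5.0, 5.0), (1.0,), logged)   # heavily penalised
```

A policy trained on the penalised reward has less incentive to choose actions whose consequences are poorly covered by the dataset.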

Many goodness-of-fit tests have been developed to assess the different assumptions of a (possibly heteroscedastic) regression model. Most of them are 'directional' in that they detect departures from a given assumption of the model. Other tests are 'global' (or 'omnibus') in that they assess whether a model fits a dataset on all its assumptions. We focus on the task of choosing the structural part of the regression function because it contains easily interpretable information about the studied relationship. We consider two nonparametric 'directional' tests and one nonparametric 'global' test, all based on generalizations of the Cramér-von Mises statistic.

To perform these goodness-of-fit tests, we develop the R package cvmgof 36, an easy-to-use tool for practitioners, available from the Comprehensive R Archive Network (CRAN). The use of the library is illustrated through a tutorial on real data, and simulation studies are carried out in order to show how the package can be exploited to compare the three implemented tests. The practitioner can also easily compare the test procedures with different kernel functions, bootstrap distributions, numbers of bootstrap replicates, or bandwidths. The package was updated at the start of 2021 and is now in its third version. A first article 1 on this work was published in October 2021.

We are now working on nonparametric tests associated with the functional form of the variance of the regression model. For this, we continue to work on the global test of Ducharme and Ferrigno in order to compare its performance with that of directional tests associated with the variance of the model. Many simulations are in progress. This will also make it possible to propose a more general package-type tool for validating the regression models used in practice.

To complete this work, it would be interesting to assess the other assumptions of a regression model such as the additivity of the random error term. The implementation of these directional tests would enrich the cvmgof package and offer a complete easy-to-use tool for validating regression models. Moreover, the assessment of the overall validity of the model when using several directional tests could be compared with that done when using only a global test. In particular, the well-known problem of multiple testing could be discussed by comparing the results obtained from multiple test procedures with those obtained when using a global test strategy. Another perspective of this work would be to develop a similar tool for other statistical models widely used in practice such as generalized linear models.

Widening the scope of an eigenvector stochastic approximation process and application to streaming PCA and related methods. This article in collaboration with A. Skiredj was presented in the 2020 Activity Report (Section 8.3.5) and is now published in Journal of Multivariate Analysis 15.

Streaming constrained binary logistic regression with online standardized data. This article in collaboration with E. Albuisson was presented in the 2020 Activity Report (Section 8.3.5) and is now accepted in Journal of Applied Statistics 13.

Construction and update of an online ensemble score involving linear discriminant analysis and logistic regression. This article in collaboration with E. Albuisson was presented in the 2020 Activity Report (Section 8.3.5) and is being submitted 30, 63.

Stochastic approximation of eigenvectors and eigenvalues of the Q-symmetric expectation of a random matrix. Application to streaming PCA. In this analysis, we have studied the convergence of stochastic approximation processes of the Oja type for estimating eigenvectors of the unknown

Other applications to methods related to PCA such as generalized canonical correlation analysis are in progress.

To apply our change-point detection algorithms to real data, we turned to EMG signal data provided by INRS. The study concerns the development of trapezius muscle myalgia in the workplace. We apply change-point detection to characterize different computer activities carried out during an experimental day. Our analysis allowed us to characterize activities according to the frequency and amplitude of jumps and to distinguish office activities using the mouse from those using the keyboard. This work was presented in a conference paper 19.
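
For readers unfamiliar with change-point detection, the sketch below locates a single jump in the mean of a signal by least squares. It is a generic baseline written for illustration, not the team's algorithm, and the toy signal is synthetic rather than EMG data.

```python
def single_change_point(signal):
    """Return the index k that best splits the signal into two constant-mean segments.

    The split minimises the total within-segment sum of squared deviations.
    """
    n = len(signal)
    best_k, best_cost = None, float("inf")
    for k in range(1, n):                      # segments: signal[:k] and signal[k:]
        left, right = signal[:k], signal[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        cost = (sum((x - mean_l) ** 2 for x in left)
                + sum((x - mean_r) ** 2 for x in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Toy signal: low-level activity followed by a jump to a higher level at index 50.
low = [0.1 * ((i % 3) - 1) for i in range(50)]
high = [2.0 + 0.1 * ((i % 3) - 1) for i in range(30)]
k_hat = single_change_point(low + high)
```

Real signals typically contain several jumps of varying amplitude, which calls for multiple change-point methods rather than this single-split search.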

With the aim of understanding the growth of low-grade glioma, we investigate multiple fields of information available in clinical practice: patient-related predictors, variables related to tumor tissue and genetics. Monitoring growth through regular MRIs gives us access to many imaging-related variables, including an original one measuring tumor infiltration (thesis defended in 2021: Cyril Brzenczek CRAN, article in preparation). Our latest efforts have focused on the statistical analysis of the database composed of these variables. We have obtained a regional fund PACTE to host this database and use it for teaching, dissemination and development of experimentation tools: PIANO platform.

In Epidemiology, we are working with INSERM to study fetal development in the last two trimesters of pregnancy. Reference or standard curves are required in this kind of biomedical problem. Values that lie outside the limits of these reference curves may indicate the presence of a disorder. Data are from the French EDEN mother-child cohort (INSERM), a study investigating the prenatal and early postnatal determinants of child health and development. A total of 2002 pregnant women were recruited before 24 weeks of amenorrhoea in two maternity clinics from middle-sized French cities (Nancy and Poitiers). From May 2003 to September 2006, 1899 newborns were then included. The main outcomes of interest are fetal (via ultrasound) and postnatal growth, adiposity development, respiratory health, atopy, behaviour and bone, cognitive and motor development. We are studying fetal weight and height as a function of the gestational age in the third trimester of pregnancy. Some classical empirical and parametric methods such as polynomial regression are first used to construct these curves. For instance, polynomial regression is one of the most common parametric approaches for modeling growth data, especially during the prenatal period. However, these classical methods require strong assumptions. We therefore propose to work with semi-parametric LMS methods, by modifying the response variable (fetal weight) with, among others, Box–Cox transformations. A first article detailing these methodologies applied to the EDEN data should be submitted next year and is the object of the communication 31.
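
For reference, the usual form of the LMS transformation is recalled below. The notation $L(t)$, $M(t)$, $S(t)$ for the Box–Cox power, the median and the coefficient of variation at gestational age $t$ is the conventional one in the LMS literature and is an assumption here, since the text does not fix it; $y$ is the measurement (for example fetal weight) and $z_\alpha$ is the standard normal quantile of order $\alpha$.

```latex
z = \frac{\big(y / M(t)\big)^{L(t)} - 1}{L(t)\, S(t)} \quad \text{if } L(t) \neq 0,
\qquad
z = \frac{\ln\big(y / M(t)\big)}{S(t)} \quad \text{if } L(t) = 0,
\\[6pt]
C_\alpha(t) = M(t)\,\big(1 + L(t)\, S(t)\, z_\alpha\big)^{1/L(t)} .
```

Under the LMS model $z$ is treated as approximately standard normal, so the reference curve of order $\alpha$ is obtained by inverting the transformation at $z_\alpha$, which gives $C_\alpha(t)$.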

Alternative nonparametric methods such as Nadaraya-Watson kernel estimation, local polynomial estimation, B-splines or cubic splines are also developed in this context to construct these curves. The practical implementation of these methods required working on smoothing parameters or the choice of knots for the different types of nonparametric estimation. In particular, an optimal choice of these parameters has been proposed. A first version of an R package has then been developed to provide a tool for constructing nonparametric reference curves. It will soon be available on GitHub. In addition, a graphical interface (GUI) intended for practitioners is being developed to allow intuitive visualization of the results given by the package, and an article is in progress.
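
As a minimal illustration of the first of these methods, the Nadaraya-Watson estimator is a kernel-weighted local average of the responses. The sketch below uses a Gaussian kernel and a fixed bandwidth; both choices, as well as the noiseless toy data, are assumptions made for the example and say nothing about the smoothing parameters selected in the package.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted local average of ys around x0 (Gaussian kernel)."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


xs = [i / 10 for i in range(101)]   # design points on [0, 10]
ys = [2.0 * x for x in xs]          # noiseless linear response, for checking only
estimate = nadaraya_watson(5.0, xs, ys, bandwidth=0.5)
```

The bandwidth controls the trade-off between smoothness and fidelity to the data, which is why its choice is a central practical question.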

This article in collaboration with E. Albuisson and D. Lucci was presented in the 2020 Activity Report (Section 8.4.2) and is now published in Applied Mathematics 14.

We are working with L. Vallat (CHRU Strasbourg) on the inference of dynamical gene networks from RNAseq and proteome data. The goal is to infer a model of gene expression that makes it possible to predict gene expression in cells where the expression of some genes is silenced (e.g. using siRNA), in order to select the silencing experiments which are more likely to reduce cell proliferation. We expect the selected genes to provide new therapeutic targets for the treatment of chronic lymphocytic leukemia. This year, we addressed the general problem of prediction as defined above, and constructed and proposed an inference method for a new gene network model for which such a prediction is possible. Next year, we expect to identify potential therapeutic targets for which silencing experiments could be conducted.

We propose a new methodology for selecting and ranking covariates associated with a variable of interest in a context of high-dimensional data under dependence but few observations. The methodology successively intertwines the clustering of covariates, decorrelation of covariates using Factor Latent Analysis, selection using aggregation of adapted methods and finally ranking. A simulation study shows the interest of the decorrelation inside the different clusters of covariates. We first apply our method to transcriptomic data of 37 patients with advanced non-small-cell lung cancer who have received chemotherapy, to select the transcriptomic covariates that explain the survival outcome of the treatment. Secondly, we apply our method to 79 breast tumor samples to define patient profiles for a new metastatic biomarker and associated gene network in order to personalize the treatments. This work is published in 2 and is implemented in R package ‘ARMADA’.

Seroprevalence study
Pierre Vallois is the scientific coordinator of the seroprevalence study COVAL Nancy held in Nancy in July 2020 in collaboration with CHRU de Nancy (CIC épidémiologie clinique and Laboratoire de Virologie).

Background. The World Health Organisation recommends monitoring the circulation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We aimed to estimate anti–SARS-CoV-2 total immunoglobulin (IgT) antibody seroprevalence and describe symptom profiles and in vitro seroneutralization in Nancy, France, in spring 2020.

Methods. Individuals were randomly sampled from electoral lists and invited with household members over 5 years old to be tested for anti–SARS-CoV-2 (IgT, i.e. IgA/IgG/IgM) antibodies by ELISA (Bio-rad). Serum samples were classified according to seroneutralization activity 50 % (NT50) on Vero CCL-81 cells. Age- and sex-adjusted seroprevalence was estimated. Subgroups were compared by chi-square or Fisher exact test and logistic regression.

Results. Among 2006 individuals, 43 were SARS-CoV-2–positive; the raw seroprevalence was 2.1 % (95 % confidence interval 1.5 to 2.9), with adjusted metropolitan and national standardized seroprevalence 2.5 % (1.8 to 3.3) and 2.3 % (1.7 to 3.1). Seroprevalence was highest for 20- to 34-year-old participants (4.7 % [2.3 to 8.4]), and was higher within than outside socially deprived areas (2.5 % vs 1 %, P=0.02) and with than without intra-family infection (p<10^-6). Moreover, 25 % (23 to 27) of participants presented at least one COVID-19 symptom associated with SARS-CoV-2 positivity (p<10^-13), with anosmia or ageusia highly discriminant (odds ratio 27.8 [13.9 to 54.5]), associated with dyspnea and fever. Among the SARS-CoV-2-positives, 16.3 % (6.8 to 30.7) were asymptomatic. For 31 of these individuals, positive seroneutralization was demonstrated in vitro.

Conclusions. In this population of very low anti-SARS-CoV-2 antibody seroprevalence, a beneficial effect of the lockdown can be assumed, with frequent SARS-CoV-2 seroneutralization among IgT-positive patients.

The results were first posted on medRxiv 27 and then published in a peer-reviewed international journal 11.

SARS‐CoV‐2 positive patients in hospital predictions
Participants: A. Gégout-Petit, U. Herbach, N. Thorr.

In collaboration with H. Berry, D. Gemmerlé, T. Lepoutre, D. Maucourt and D. Parsons.

We followed Inria's recommendations to get involved in the fight against COVID-19. We tried to collaborate with the LCPME laboratory in order to predict the number of SARS-CoV-2 positive patients from the Grand Nancy metropolitan area at the Nancy University Hospital from the concentration of SARS-CoV-2 residues in waste water. We encountered difficulties with the Obépine network in obtaining raw data rather than mere indicators. We made predictions from the incidence rates available on Santé Publique France. The predictions are available on the siwam website. Inria hired Nicolas Thorr as an engineer for 6 months for this project.

B. Scherrer collaborates with Google Brain on reinforcement learning in the framework of the PhD thesis of Nino Vieillard.

Here is a selection of the journals for which we regularly write referee reports: Bernoulli, Cell, Medicina, The Annals of Applied Probability, Stochastic Processes and their Applications, Journal de Mathématiques Pures et Appliquées, ALEA - Latin American Journal of Probability and Mathematical Statistics, ESAIM: Probability & Statistics, Journal of Theoretical Biology, Mathematical Biosciences, Journal of Physics A: Mathematical and Theoretical, Current Opinion in Systems Biology, Bioinformatics...

BIGS faculty members have teaching obligations at Université de Lorraine and teach at least 192 hours each year. They teach probability and statistics at different levels (Licence, Master, Engineering school). Many of them have pedagogical responsibilities.