
2024 Activity Report
Project-Team STATIFY

RNSR: 202023582A
  • Research center: Inria Centre at Université Grenoble Alpes
  • In partnership with: CNRS, Université Grenoble Alpes
  • Team name: Bayesian and extreme value statistical models for structured and high dimensional data
  • In collaboration with: Laboratoire Jean Kuntzmann (LJK)
  • Domain: Applied Mathematics, Computation and Simulation
  • Theme: Optimization, machine learning and statistical methods

Keywords

Computer Science and Digital Science

  • A3.1.1. Modeling, representation
  • A3.1.4. Uncertain data
  • A3.3.2. Data mining
  • A3.3.3. Big data analysis
  • A3.4.1. Supervised learning
  • A3.4.2. Unsupervised learning
  • A3.4.4. Optimization and learning
  • A3.4.5. Bayesian methods
  • A3.4.7. Kernel methods
  • A5.3.3. Pattern recognition
  • A5.9.2. Estimation, modeling
  • A6.2. Scientific computing, Numerical Analysis & Optimization
  • A6.2.3. Probabilistic methods
  • A6.2.4. Statistical methods
  • A6.3. Computation-data interaction
  • A6.3.1. Inverse problems
  • A6.3.3. Data processing
  • A6.3.5. Uncertainty Quantification
  • A9.2. Machine learning
  • A9.3. Signal analysis

Other Research Topics and Application Domains

  • B1.2.1. Understanding and simulation of the brain and the nervous system
  • B2.6.1. Brain imaging
  • B3.3. Geosciences
  • B3.4.1. Natural risks
  • B3.4.2. Industrial risks and waste
  • B3.5. Agronomy
  • B5.1. Factory of the future
  • B9.5.6. Data science
  • B9.11.1. Environmental risks

1 Team members, visitors, external collaborators

Research Scientists

  • Florence Forbes [Team leader, INRIA, Senior Researcher]
  • Sophie Achard [CNRS, Senior Researcher]
  • Julyan Arbel [INRIA]
  • Pedro Luiz Coelho Rodrigues [INRIA, ISFP]
  • Michel Dojat [INSERM, Senior Researcher, on secondment]
  • Stephane Girard [INRIA, Senior Researcher]
  • Cambyse Pakzad [INRIA, Starting Research Position, until Aug 2024]

Faculty Members

  • Julien Chevallier [UGA, Associate Professor]
  • Jonathan El Methni [UGA, Associate Professor]

Post-Doctoral Fellows

  • Loic Chalmandrier [UGA, Post-Doctoral Fellow, from Dec 2024]
  • Jhouben Janyk Cuesta Ramirez [INRIA, Post-Doctoral Fellow, until Sep 2024]
  • Henrique Donancio [INRIA, Post-Doctoral Fellow]
  • Anton Francois [CNRS, Post-Doctoral Fellow, from Apr 2024]
  • Yiye Jiang [UGA, Post-Doctoral Fellow]
  • Tam Le Minh [INRIA, Post-Doctoral Fellow]
  • Rafael Mouallem Rosa [INRIA, Post-Doctoral Fellow, from Oct 2024]
  • Paul-Gauthier Noe [INRIA, Post-Doctoral Fellow, from Apr 2024]
  • Konstantinos Pitas [INRIA, Post-Doctoral Fellow, until Feb 2024]
  • Chen Yan [INRIA, Post-Doctoral Fellow, until Mar 2024]

PhD Students

  • Louise Alamichel [UGA, until Sep 2024]
  • Yuchen Bai [UGA]
  • Arturo Cabrera Vazquez [INRIA, from Nov 2024]
  • Alice Chevaux [UGA, from Oct 2024]
  • Isabella Costa Maia [UGA, from Sep 2024]
  • Antoine Franchini [INRIA, from Dec 2024]
  • Jacopo Iollo [INRIA]
  • Benjamin Lambert [PIXYL, until May 2024]
  • Pearl Laveur [UGA, from Oct 2024]
  • Brice Marc [CEREMA]
  • Razan Mhanna [UGA]
  • Geoffroy Oudoumanessah [INSERM]
  • Pierre-Louis Ruhlmann [INRIA]
  • Camille Touron [UGA, from Oct 2024]

Technical Staff

  • Pascal Dkengne Sielenou [INRIA, Engineer, until May 2024]

Interns and Apprentices

  • Mateo Amazo [INRIA, Intern, from Sep 2024]
  • Mathis Antonetti [INRIA, Intern, from May 2024 until Oct 2024]
  • Hani Anouar Bourrous [INRIA, Intern, from Nov 2024]
  • Alice Chevaux [UGA, Intern, from Mar 2024 until Jul 2024]
  • Antoine Franchini [INRIA, Intern, from May 2024 until Sep 2024]
  • Pearl Laveur [INRIA, Intern, from Apr 2024 until Aug 2024]
  • Kevin-Lâm Quesnel [UGA, Intern, from Apr 2024 until Sep 2024]
  • Bierhoff Theolien [UGA, Intern, from May 2024 until Aug 2024]
  • Eloïse Touron [UGA, Intern, from Feb 2024 until Jun 2024]
  • Camille Touron [UGA, Intern, from Feb 2024 until Jun 2024]

External Collaborator

  • Jean-Baptiste Durand [CIRAD]

2 Overall objectives

The statify team focuses on statistics. Statistics can be defined as a science of variation, where the main question is how to acquire knowledge in the face of variation. In the past, statistics was seen as an opportunity to play in various backyards. Today, the statistician sees his own backyard invaded by data scientists, machine learners and other computer scientists of all kinds. Everyone wants to do data analysis and some (but not all) do it very well. Generally, data analysis algorithms and associated network architectures are empirically validated using domain-specific datasets and data challenges. While winning such challenges is certainly rewarding, statistical validation rests on more fundamental grounds and raises interesting theoretical, algorithmic and practical insights. Statistical questions can be converted into probability questions by the use of probability models. Once certain assumptions about the mechanisms generating the data are made, statistical questions can be answered using probability theory. However, the proper formulation and checking of these probability models is just as important as, or even more important than, the subsequent analysis of the problem using these models. The first question is then how to formulate and evaluate probabilistic models for the problem at hand. The second question is how to obtain answers once a certain model has been assumed. This latter task is more a matter of applied probability theory and, in practice, involves optimization and numerical analysis.

The statify team aims to bring its strengths to bear at a time when the number of solicitations received by statisticians has increased considerably with the successive waves of big data, data science and deep learning. The difficulty is to back up our approaches with reliable mathematics when what we have is often only empirical observations that we are not yet able to explain. Guiding data analysis with statistical justification is a challenge in itself. statify has the ambition to play a role in this task and to provide answers to questions about the appropriate usage of statistics.

Often statistical assumptions do not hold. Under what conditions then can we use statistical methods to obtain reliable knowledge? These conditions are rarely the natural state of complex systems. The central motivation of statify is to establish the conditions under which statistical assumptions and associated inference procedures approximately hold and become reliable.

However, as George Box said "Statisticians and artists both suffer from being too easily in love with their models". To moderate this risk, we choose to develop, in the team, expertise from different statistical domains to offer different solutions to attack a variety of problems. This is possible because these domains share the same mathematical food chain, from probability and measure theory to statistical modeling, inference and data analysis.

Our goal is to exploit methodological resources from statistics and machine learning to develop models that handle variability and that scale to high dimensional data while maintaining our ability to assess their correctness, typically the uncertainty associated with the provided solutions. To reach this goal, the team offers a unique range of expertise in statistics, combining probabilistic graphical models and mixture models to analyze structured data, Bayesian analysis to model knowledge and regularize ill-posed problems, non-parametric statistics, risk modeling and extreme value theory to face the lack, or impossibility, of precise modeling information and data. In the team, this expertise is organized to target five key challenges:

  1. Models for high dimensional, multimodal, heterogeneous data;
  2. Spatial (structured) data science;
  3. Scalable Bayesian models and procedures;
  4. Understanding mathematical properties of statistical and machine learning methods;
  5. The big problem of small data.

The first two challenges address sources of complexity coming from data, namely, the fact that observations can be: 1) high dimensional, collected from multiple sensors in varying conditions, i.e. multimodal and heterogeneous, and 2) inter-dependent, with a known structure between variables or with unknown interactions to be discovered. The other three challenges focus on providing reliable and interpretable models: 3) making the Bayesian approach scalable to handle large and complex data; 4) quantifying the information processing properties of machine learning methods; and 5) allowing reliable conclusions to be drawn from datasets that are too small or not large enough to be used for training machine/deep learning methods.

These challenges rely on our four research axes:

  1. Models for graphs and networks;
  2. Dimension reduction and latent variable modeling;
  3. Bayesian modeling;
  4. Modeling and quantifying extreme risk.

In terms of applied work, we will target high-impact applications in neuroimaging, environmental and earth sciences.

3 Research program

3.1 Models for graphs and networks

Participants: Jean-Baptiste Durand, Florence Forbes, Julyan Arbel, Sophie Achard, Michel Dojat, Julien Chevallier.

Keywords: graphical models, Markov properties, hidden Markov models, clustering, missing data, mixture of distributions, EM algorithm, image analysis, Bayesian inference.

Graphs arise naturally as versatile structures for capturing the intrinsic organization of complex datasets. The literature on graphical modeling is growing rapidly and covers a wide range of applications, from bioinformatics to document modeling, image analysis, social network analysis, etc. When faced with multivariate, possibly high dimensional, data acquired at different sites (or nodes) and structured according to an underlying network (or graph), the objective is generally to understand the dependencies or associations present in the data so as to provide a more accurate statistical analysis and a better understanding of the phenomenon under consideration.

Structure learning.

This refers to the inference of the existing dependences between variables from observed samples. The limits of obtaining graph edges using sample correlations between nodes are well known. We have investigated alternative approaches, both Bayesian and frequentist: the former were used to account for constraints on the structure, while for the latter we focused on robust modeling and estimation in the presence of outliers. We proposed a fast Bayesian structure learning method based on pre-screening of categorical variables, in the PhD thesis of T. Rahier with Schneider Electric. In the continuous variable case, we studied the design of tractable estimators and algorithms that can provide robust estimation of covariance structures. Many covariance estimation methods rely on the Gaussian graphical model, but a viable model for data contaminated by outliers requires more robust and complex procedures and is therefore more challenging to build. Moreover, the problem of robust structure learning is especially acute in the high-dimensional setting, in which the number of variables p is of the same order as, or much larger than, the number of available observations n. We have investigated different ways to handle both of the above issues, in order to provide models for applications such as modeling brain connectivity from functional magnetic resonance imaging (fMRI) data. Each brain region is associated with a time series, and the goal is to study the connectivity among these regions. Interactions between the regions can be described by covariance or precision matrices that quantify the links between time series and can then be represented as graphs. We first proposed an approach, initiated with the PhD of K. Ashurbekova, to generalize the Gaussian approach to multivariate heavy-tailed distributions with dimensionality relatively large compared to the number of observations. This encompasses methods related to shrinkage and M-estimators, for which we aimed at designing algorithms with proven convergence results and optimal values for shrinkage coefficients. Second, still motivated by the brain connectivity application, we investigated, in the PhD of H. Lbath (QFunC project), the possibility of computing more subtle correlations between brain regions using a new notion of correlation of local averages. Finally, to go beyond the Gaussian assumption, we also investigated copula approaches and characterized graphical dependencies for multivariate counts, with potential applications to branching processes.
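As a minimal illustration of this workflow (our sketch, not the team's code), the following example estimates a sparse precision matrix, and hence a graph, from simulated regional time series with scikit-learn's GraphicalLassoCV; the robust heavy-tailed and shrinkage variants discussed above modify the likelihood and penalty but keep the same overall pipeline.

    # Sketch: graph structure learning via sparse precision estimation.
    # Gaussian graphical model with an l1 penalty; robust variants
    # (heavy tails, shrinkage, M-estimators) follow the same workflow.
    import numpy as np
    from sklearn.covariance import GraphicalLassoCV

    rng = np.random.default_rng(0)
    p, n = 10, 200                                # p regions, n time points
    true_prec = (np.eye(p) + 0.3 * np.diag(np.ones(p - 1), 1)
                           + 0.3 * np.diag(np.ones(p - 1), -1))
    X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(true_prec), size=n)

    model = GraphicalLassoCV().fit(X)             # cross-validated penalty
    adjacency = np.abs(model.precision_) > 1e-3   # nonzero partial correlations
    np.fill_diagonal(adjacency, False)
    print("recovered edges:", adjacency.sum() // 2)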

Structure modelling.

Once the structure is identified, the following questions are about comparing the discovered graph structures with each other, or with regard to a reference graph. If the structure is not itself the object of consideration, the goal is usually to account for it in a subsequent analysis. Except for simple graphs (chains or trees), this is problematic because mainstream statistical models and algorithms are based on the independence assumption and become intractable for even moderate graph sizes. The analysis of graphs as objects of interest, with the design of tools to model and compare them, was studied in the PhD of L. Carboni. We proposed new mathematical tools based on equivalence relations between graph statistics in order to take into account the location of the nodes in space. To account for dependences in a tractable way we often rely on Markov modelling and variational inference. When dependence in time is considered, Gaussian processes are an interesting tractable tool. With the PhD of A. Constantin, we investigated those in the context of a collaboration with INRAE and CNES in Toulouse, for the classification and reconstruction of irregularly sampled satellite image time series. The proposed approach is able to deal with irregular temporal sampling and missing data directly in the classification process. It is based on Gaussian processes and jointly performs the classification of the pixel labels and the reconstruction of the pixel time series. The method complexity scales linearly with the number of pixels, making it amenable to large-scale scenarios. In a different context, we have developed hidden semi-Markov models for the analysis of eye movements, in particular with the PhD of B. Olivier in collaboration with A. Guérin-Dugué (GIPSA-lab) and B. Lemaire (Laboratoire de Psychologie et Neurocognition). New coupling methods for hidden semi-Markov models driven by several underlying state processes have been proposed.
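The Gaussian process reconstruction idea can be illustrated in a few lines (an illustrative sketch under a simple RBF-plus-noise kernel assumption, not the model of the publication): a GP fitted on the irregular observation dates of one pixel yields both a reconstruction on a regular grid and its uncertainty.

    # Toy sketch: GP reconstruction of an irregularly sampled series.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(1)
    t_obs = np.sort(rng.uniform(0, 365, size=15))     # irregular dates (days)
    y_obs = np.sin(2 * np.pi * t_obs / 365) + 0.1 * rng.standard_normal(15)

    kernel = RBF(length_scale=60.0) + WhiteKernel(noise_level=0.01)
    gp = GaussianProcessRegressor(kernel=kernel).fit(t_obs[:, None], y_obs)

    t_grid = np.linspace(0, 365, 100)[:, None]        # regular grid
    y_rec, y_std = gp.predict(t_grid, return_std=True)  # series + uncertainty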

Structured anomaly detection.

The vast majority of deep learning architectures for medical image analysis are based on supervised models requiring the collection of large datasets of annotated examples. Building such annotated datasets, which requires skilled medical experts, is time consuming and hardly achievable, especially for some specific tasks, including the detection of small and subtle lesions that are sometimes impossible to detect visually and thus to outline manually. This critical aspect significantly impairs the performance of supervised models and hampers their deployment in clinical neuroimaging applications, especially for brain pathologies that require the detection of small lesions (e.g. multiple sclerosis, microbleeds) or subtle structural or morphological changes (e.g. Parkinson's disease). We have developed unsupervised anomaly detection methods based on generalized Student mixture models and deep statistical unsupervised learning models for the detection of early forms of Parkinson's disease. We have also compared parametric mixture approaches to nonparametric machine learning techniques for change detection in the context of time series analysis of glycemic curves for diabetes.
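The mixture-based detection principle is simple to sketch (illustrative code; scikit-learn's Gaussian mixture is used here as a stand-in for the team's generalized Student mixtures, which are more robust but follow the same scoring logic): fit the mixture on normal data only, then flag low-likelihood test points.

    # Sketch: unsupervised anomaly scoring with a mixture model.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(2)
    healthy = rng.standard_normal((500, 3))              # normal data only
    test = np.vstack([rng.standard_normal((5, 3)),       # normal test points
                      rng.standard_normal((5, 3)) + 4])  # anomalies

    gm = GaussianMixture(n_components=2, random_state=0).fit(healthy)
    threshold = np.quantile(gm.score_samples(healthy), 0.01)
    is_anomaly = gm.score_samples(test) < threshold      # low log-likelihood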

3.2 Dimension reduction and latent variable modeling

Participants: Jean-Baptiste Durand, Florence Forbes, Stephane Girard, Julyan Arbel, Pedro Luiz Coelho Rodrigues.

Keywords: mixture of distributions, EM algorithm, missing data, conditional independence, statistical pattern recognition, clustering, unsupervised and partially supervised learning.

Extracting information from raw data is a complex task, all the more so as this information is measured in a high dimensional space. Fortunately, this information usually lives in a subspace of smaller size. Identifying this subspace is crucial but difficult. One approach is to perform appropriate changes of representation that facilitate the identification and characterization of the desired subspace. Latent random variables are a key concept to encode in a structured way representations that are easier to handle and capture the essential features of the data.

Regression in high dimensions.

Methods adapted to high dimensions include inverse regression methods, e.g. sliced inverse regression (SIR) and partial least squares (PLS), and approaches based on mixtures of regressions with different variants, e.g. Gaussian locally linear mapping (GLLiM) and extensions, mixtures of experts, cluster weighted models, etc. SIR-like methods are flexible in that they reduce the dimension in a way that is optimal for the subsequent regression task, which can itself be carried out by any desired regression tool. In that sense these methods are said to be nonparametric or semi-parametric, and they have the potential to provide robust procedures. We have also proposed a new approach, called Extreme-PLS, for dimension reduction in conditional extreme value settings, where the goal is to best explain the extreme values of the response variable.
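A PLS-style reduction takes only a few lines (a generic sketch of the dimension-reduction-then-regression pattern, not the Extreme-PLS method itself, which targets the tail of the response):

    # Sketch: supervised dimension reduction with partial least squares,
    # extracting a few components of X most predictive of y.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(3)
    n, p = 100, 50
    X = rng.standard_normal((n, p))
    y = X[:, :2] @ np.array([1.0, -0.5]) + 0.1 * rng.standard_normal(n)

    pls = PLSRegression(n_components=2).fit(X, y)
    X_reduced = pls.transform(X)   # n x 2 scores for any downstream regressor
    y_pred = pls.predict(X)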

Simulation-based inference (SBI) for high dimensional inverse problems.

To account for uncertainty in a principled manner, we also considered Bayesian inversion techniques. We investigated the use of learning approaches to handle Bayesian inverse problems in a computationally efficient way when the observations to be inverted have a moderately high number of dimensions and come in large number. We proposed tractable inverse regression approaches, based on GLLiM and normalizing flows. They have the advantage of producing full probability distributions as approximations of the target posterior distributions. These distributions have several interesting features. They provide confidence indices on the predictions and can be combined with importance sampling or approximate Bayesian computation (ABC) schemes for a better exploration when multiple equivalent solutions exist. They generalize easily to variants that can handle non-Gaussian data and dependent or missing observations. The relevance of the proposed approach has been illustrated on synthetic examples and on two real data applications, in the context of planetary remote sensing and neuroimaging. In addition, we addressed the issue of model selection for some of the GLLiM models, i.e. mixture of experts (MoE) models, and contributed a number of theoretical results.

Online and incremental inference.

Most SBI methods scale poorly when the number of observations is too large, which makes them unsuitable for modern data, often acquired in real time, incremental in nature, and available in large volumes. Computation of inferential quantities in an incremental manner may be forcibly imposed by the nature of data acquisition (e.g. streaming and sequential data) but may also be seen as a solution for handling larger data volumes in a more resource-friendly way, with respect to memory, energy, and time consumption. To produce feasible and practical online algorithms for streaming data and complex models, we have investigated the family of stochastic approximation (SA) algorithms combined with the class of majorization-minimization (MM) and expectation-maximization (EM) algorithms for a certain class of models, e.g., exponential family distributions and their mixtures.
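For exponential family mixtures, the SA-EM recipe reduces to running averages of sufficient statistics followed by closed-form M-steps. The sketch below (our toy illustration in the spirit of online EM, with an assumed polynomial step size, not the algorithms of the team's papers) processes a stream of scalars with a two-component Gaussian mixture.

    # Sketch: online EM for a 1D Gaussian mixture via stochastic
    # approximation of the expected sufficient statistics.
    import numpy as np

    rng = np.random.default_rng(4)
    stream = np.concatenate([rng.normal(-2, 1, 5000), rng.normal(3, 1, 5000)])
    rng.shuffle(stream)

    K = 2
    w, mu, var = np.full(K, 0.5), np.array([-1.0, 1.0]), np.ones(K)
    s0, s1, s2 = w.copy(), w * mu, w * (var + mu**2)   # running statistics

    for t, x in enumerate(stream, start=1):
        gamma = (t + 10.0) ** -0.6                     # step size, damped start
        # E-step on the single new point: responsibilities
        logp = -0.5 * ((x - mu) ** 2 / var + np.log(var)) + np.log(w)
        r = np.exp(logp - logp.max()); r /= r.sum()
        # Stochastic-approximation update of the sufficient statistics
        s0 = (1 - gamma) * s0 + gamma * r
        s1 = (1 - gamma) * s1 + gamma * r * x
        s2 = (1 - gamma) * s2 + gamma * r * x**2
        # Closed-form M-step (exponential family)
        w, mu = s0 / s0.sum(), s1 / s0
        var = np.maximum(s2 / s0 - mu**2, 1e-6)

    print(w.round(2), mu.round(2), var.round(2))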

3.3 Bayesian modelling

Participants: Julyan Arbel, Florence Forbes, Jean-Baptiste Durand, Pedro Coelho Rodrigues.

Keywords: Bayesian statistics, Bayesian nonparametrics, Markov Chain Monte Carlo, Experimental design, Bayesian neural networks, Approximate Bayesian Computation.

Bayesian methods have become a center of attraction for modeling the underlying uncertainty of statistical models. Bayesian models and methods are already used in all of our other axes, whenever the Bayesian choice provides interesting features, e.g. for model selection, dependence modeling (copulas), inverse problems, etc. This axis emphasizes more specifically our theoretical and methodological research in Bayesian learning. In particular, we focus on techniques referred to as Bayesian nonparametrics (BNP).

Markov priors for Bayesian nonparametric models.

We have proposed Bayesian nonparametric priors for hidden Markov random fields, first for continuous, Gaussian observations, with an illustration in image segmentation, and second for discrete observations typically issued from counts, e.g. Poisson distributed observations, with an illustration on a risk mapping model. The inference was done by variational Bayesian expectation maximization (VBEM).

Asymptotic properties of BNP models.

A common way to assess a Bayesian procedure is to study the asymptotic behavior of posterior distributions, that is, their ability to estimate a true distribution when the number of observations grows. Mixture models have attracted a lot of attention in the last decade due to some negative results regarding the number of clusters. More specifically, it was shown that Bayesian nonparametric mixture models are inconsistent for some choices of priors. We proposed ways to compute the prior distribution of the number of clusters. This is a notoriously difficult task, and we proposed approximations in order to enable such computations for real-world applications. We studied and justified BNP models based on their asymptotic properties. We showed that mixture models based on many different BNP processes are inconsistent in the number of clusters and discussed possible solutions. Notably, we showed that a post-processing algorithm introduced for the simplest process (the Dirichlet process) extends to more general models and provides a consistent method to estimate the number of components.
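In the simplest Dirichlet process case, the prior on the number of clusters K_n is explicit, which makes the difficulty concrete: P(K_n = k) involves unsigned Stirling numbers that are only computable exactly for small n, hence the need for the approximations mentioned above. A small sketch (exact small-n computation, for illustration only):

    # Sketch: prior of the number of clusters K_n under a Dirichlet
    # process DP(alpha): P(K_n = k) = |s(n,k)| alpha^k / (alpha)_n,
    # with |s(n,k)| the unsigned Stirling numbers of the first kind.
    import numpy as np

    def stirling_first_kind(n):
        # recurrence c(i, k) = c(i-1, k-1) + (i-1) c(i-1, k)
        c = [[0] * (n + 1) for _ in range(n + 1)]
        c[0][0] = 1
        for i in range(1, n + 1):
            for k in range(1, i + 1):
                c[i][k] = c[i - 1][k - 1] + (i - 1) * c[i - 1][k]
        return c[n]

    n, alpha = 30, 1.0
    c = stirling_first_kind(n)
    rising = np.prod([alpha + i for i in range(n)])        # (alpha)_n
    pmf = np.array([c[k] * alpha**k / rising for k in range(n + 1)])
    mean_K = sum(alpha / (alpha + i) for i in range(n))    # E[K_n], exact
    print(pmf.argmax(), mean_K)                            # prior mode and mean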

Amortized Approximate Bayesian computation.

Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the likelihood is prohibitively expensive or entirely unknown. A key ingredient in ABC is the choice of a discrepancy that describes how different the simulated and observed data are, often based on a set of summary statistics when the data cannot be compared directly. The choice of the appropriate discrepancies is an active research topic, which has mainly considered data discrepancies requiring samples of observations or distances between summary statistics. We have first investigated sample-based discrepancies and established new asymptotic results using so-called energy-based distances. We have then considered a summary-based approach and proposed a new ABC procedure that can be seen as an extension of the semi-automatic ABC framework to a functional summary statistics setting and can also be used as an alternative to sample-based approaches. The resulting ABC approach also exhibits amortization properties via the use of the GLLiM inverse regression model.

Bayesian neural networks.

The connection between Bayesian neural networks and Gaussian processes has gained a lot of attention in the last few years, with the flagship result that hidden units converge to a Gaussian process limit when the layer width tends to infinity. Underpinning this result is the fact that hidden units become independent in the infinite-width limit. Our aim is to shed some light on hidden unit dependence properties in practical finite-width Bayesian neural networks. In addition to theoretical results, we assessed empirically the impact of depth and width on hidden unit dependence properties. Moreover, recent work has suggested that finite Bayesian neural networks may outperform their infinite counterparts because they adapt their internal representations flexibly. To establish solid ground for future research on finite-width neural networks, our goal is to study the prior induced on hidden units. Our main result is an accurate description of hidden unit tails, which shows that unit priors become heavier-tailed going deeper, thanks to the introduced notion of generalized Weibull-tail distributions. This finding sheds light on the behavior of hidden units of finite Bayesian neural networks.
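The width effect is easy to probe empirically (a toy check, assuming i.i.d. Gaussian weights and a tanh nonlinearity): at finite width a second-layer pre-activation is a scale mixture of Gaussians, so its excess kurtosis is positive and shrinks towards 0 (the Gaussian value) as the width grows.

    # Sketch: Gaussian behaviour of hidden units as layer width grows.
    import numpy as np
    from scipy.stats import kurtosis

    rng = np.random.default_rng(5)
    x = rng.standard_normal(10)                  # fixed input

    for width in (5, 50, 500):
        samples = []
        for _ in range(10000):                   # one network draw per sample
            W1 = rng.standard_normal((width, x.size)) / np.sqrt(x.size)
            h = np.tanh(W1 @ x)                  # first hidden layer
            w2 = rng.standard_normal(width) / np.sqrt(width)
            samples.append(w2 @ h)               # second-layer pre-activation
        print(width, kurtosis(samples))          # excess kurtosis -> 0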

3.4 Modelling and quantifying extreme risk

Participants: Julyan Arbel, Stephane Girard, Florence Forbes, Sophie Achard, Jonathan El Methni, Cambyse Pakzad.

Keywords: dimension reduction, extreme value analysis, functional estimation.

Extreme events have a major impact on a wide variety of domains, from environmental sciences (heat waves, flooding) and reliability to finance and insurance (financial crashes, reinsurance). While usual statistical approaches focus on modeling the bulk of the distribution, extreme-value analysis, a relatively recent domain of statistics, aims at building models adapted to distribution tails, where, by nature, observations are rare.

Extreme quantile estimation.

One of the most popular risk measures is the Value-at-Risk (VaR) introduced in the 1990s. In statistical terms, the VaR at level α ∈ (0,1) corresponds to the upper α-quantile of the loss distribution. We have proposed estimators and studied their theoretical properties for extreme quantiles, that is, when α → 0. We have also investigated the Weissman extrapolation device for estimating extreme quantiles from heavy-tailed distributions. This is based on two estimators: an order statistic to estimate an intermediate quantile and an estimator of the tail index. The common practice is to select the same intermediate sequence for both estimators. We showed how an adapted choice of two different intermediate sequences leads to a reduction of the asymptotic bias associated with the resulting refined Weissman estimator. This new bias reduction method is fully automatic and does not involve the selection of extra parameters.
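The classical (single-sequence) Weissman construction that the bias-reduced version refines can be sketched as follows (illustrative code on simulated Pareto data; the choices of k and α are arbitrary here):

    # Sketch: Weissman extrapolation with a Hill tail-index estimate.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 5000
    X = np.sort(rng.pareto(2.0, size=n) + 1.0)   # Pareto tail, gamma = 0.5

    k = 200                                      # intermediate level
    gamma_hill = np.mean(np.log(X[n - k:]) - np.log(X[n - k - 1]))

    alpha = 1.0 / (10 * n)                       # extreme level, beyond sample
    q_weissman = X[n - k - 1] * (k / (n * alpha)) ** gamma_hill
    print(gamma_hill, q_weissman, alpha ** -0.5)  # estimate vs true quantile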

New measures of extreme risk.

A simple way to assess the (environmental, industrial or financial) risk is to compute a measure linked to the value of the phenomenon of interest (rainfall height, wind speed, river flow). Candidate measures include quantiles (which correspond to traditional Value-at-Risk or return levels), expectiles, tail conditional moments, spectral risk measures, distortion risk measures, etc. We have mainly focused on the first two measures, quantiles and expectiles, and investigated estimation procedures for extensions of these measures. The main drawback of quantiles is that they do not provide a coherent risk measure. Two distributions may have the same extreme quantile but very different tail behaviors. Moreover, standard estimators do not use the most extreme values of the sample and consequently induce a loss of information. Our strategy was to adapt the definition of quantiles to take into account the whole distribution tail.

We have introduced new measures of extreme risk based on Lp-quantiles, encompassing both expectiles and quantiles. We believe this generalization of the concept of extreme quantile to extreme Lp-quantiles opens promising new research directions. We have first explored to what extent univariate extreme-value estimators can be improved on the basis of these novel Lp-quantiles. We built tractable estimators of these quantities with guaranteed theoretical properties.
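Expectiles, the p = 2 member of the Lp-quantile family, already illustrate how these measures use the whole tail: the sample expectile is the fixed point of an asymmetrically weighted mean (a generic sketch, unrelated to the specific estimators of the papers):

    # Sketch: sample expectile of level tau by iteratively reweighted
    # averaging (asymmetric least squares fixed point).
    import numpy as np

    def expectile(x, tau, n_iter=100):
        e = x.mean()                             # tau = 0.5 gives the mean
        for _ in range(n_iter):
            w = np.where(x > e, tau, 1.0 - tau)  # asymmetric weights
            e = np.sum(w * x) / np.sum(w)        # first-order condition
        return e

    rng = np.random.default_rng(7)
    x = rng.standard_t(df=4, size=10000)         # heavy-tailed sample
    print(expectile(x, 0.99), np.quantile(x, 0.99))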

Extremes with covariates.

A second challenge was to extend this concept to the regression framework where the variable of interest depends on a set of covariates. When the number of covariates is large, two research directions have been explored to overcome the curse of dimensionality: 1) we designed a dimension reduction method for the extreme-value context, 2) we also considered semi-parametric models to reduce the complexity of the fitted model.

Another challenge with expectiles is that their sample versions do not have a simple explicit form, making their analysis significantly harder than that of quantiles and order statistics. This difficulty is compounded when one wishes to integrate auxiliary information about the phenomenon of interest through a finite-dimensional covariate, in which case the problem becomes the estimation of conditional expectiles. We exploited the fact that the expectiles of a distribution are in fact the quantiles of another distribution explicitly linked to the former one, in order to construct nonparametric kernel estimators of extreme conditional expectiles. We analyzed the asymptotic properties of our estimators in the context of conditional heavy-tailed distributions. The extension to functional covariates was also investigated. Since quantiles and expectiles belong to the wider family of Lp-quantiles, we also proposed to construct kernel estimators of extreme conditional Lp-quantiles. We studied their asymptotic properties in the context of conditional heavy-tailed distributions and showed through a simulation study that taking p ∈ (1,2) may allow extreme conditional quantiles and expectiles to be recovered accurately.

We built a general theory for the estimation of extreme conditional expectiles in heteroscedastic regression models with heavy-tailed noise. Our approach is supported by general results of independent interest on residual-based extreme value estimators in heavy-tailed regression models, and is intended to cope with covariates having a large but fixed dimension. We demonstrated how our results could be applied to a wide class of important examples, among which linear models, single-index models as well as ARMA and GARCH time series models.

Extremes and machine learning.

This is the topic of a more recent collaboration with E. Gobet from CMAP. Feedforward neural networks based on rectified linear units (ReLU) cannot efficiently approximate quantile functions that are not bounded, especially in the case of heavy-tailed distributions. We have thus proposed a new parametrization for the generator of a generative adversarial network (GAN) adapted to this framework, based on extreme-value theory. We provided an analysis of the uniform error between the extreme quantile and its GAN approximation. It appears that the rate of convergence of the error is mainly driven by the second-order parameter of the data distribution. A similar investigation has been conducted to simulate fractional Brownian motion with ReLU neural networks.

4 Application domains

4.1 Image Analysis

Participants: Florence Forbes, Jean-Baptiste Durand, Stephane Girard, Pedro Coelho Rodrigues, Sophie Achard, Michel Dojat.

As regards applications, several areas of image analysis can be covered using the tools developed in the team. More specifically, in collaboration with the Perception team, we address various issues in computer vision involving Bayesian modelling and probabilistic clustering techniques. Other applications in medical imaging are natural. We work more specifically on MRI and functional MRI data, in collaboration with the Grenoble Institute of Neuroscience (GIN). We also consider other statistical 2D fields coming from other domains such as remote sensing, in collaboration with the Institut de Planétologie et d'Astrophysique de Grenoble (IPAG) and the Centre National d'Etudes Spatiales (CNES). In this context, we worked on hyperspectral and/or multitemporal images. In the context of the "pôle de compétitivité" project I-VP, we worked on images of PC boards.

4.2 Biology, Environment and Medicine

Participants: Florence Forbes, Stephane Girard, Jean-Baptiste Durand, Julyan Arbel, Sophie Achard, Pedro Coelho Rodrigues, Julien Chevallier, Michel Dojat, Jonathan El Methni.

A second domain of applications concerns biology and medicine. We considered the use of mixture models to identify biomarkers. We also investigated statistical tools for the analysis of fluorescence signals in molecular biology. Applications in neurosciences are also considered. In the environmental domain, we considered the modelling of high-impact weather events and the use of hyperspectral data as a new tool for quantitative ecology.

5 Social and environmental responsibility

5.1 Footprint of research activities

The footprint of our research activities has not been assessed yet. Most of the team members have validated the “charte d'éco-responsabilité” written by a working group from Laboratoire Jean Kuntzmann, which should have practical implications in the near future.

5.2 Impact of research results

A lot of our developments are motivated by and target applications in medicine and environmental sciences. As such they have a social impact with a better handling and treatment of patients, in particular with brain diseases or disorders. On the environmental side, our work has an impact on geoscience-related decision making with e.g. extreme events risk analysis, planetary science studies and tools to assess biodiversity markers. However, how to truly measure and report this impact in practice is another question we have not really addressed yet.

6 Highlights of the year

Stéphane Girard was invited as a keynote speaker to the Journées de Statistique organized by the French Statistical Society.

Sophie Achard has been appointed scientific director of the MIAI cluster.

A new Australian Research Council project, coordinated by Angus Ng from Griffith University in Brisbane and Hien Duy Nguyen from La Trobe University in Melbourne, has been funded. The project lasts four years and involves F. Forbes as co-PI.

7 New software, platforms, open data

7.1 New software

7.1.1 Planet-GLLiM

  • Name:
    Planet-GLLiM
  • Keyword:
    Inverse problem
  • Functional Description:
The application implements the GLLiM statistical learning technique in its different variants for the inversion of a physical model of reflectance on spectro-(gonio)-photometric data. The latter are of two types: 1. laboratory measurements of reflectance spectra acquired under different illumination and viewing geometries, and 2. 4D spectro-photometric remote sensing products from multi-angular CRISM or Pléiades acquisitions.
  • URL:
  • Publications:
  • Contact:
    Sylvain Douté
  • Participants:
    Florence Forbes, Benoit Kugler, Sami Djouadi, Samuel Heidmann, Stanislaw Borkowski
  • Partner:
    Institut de Planétologie et d’Astrophysique de Grenoble

7.1.2 Kernelo

  • Name:
    Kernelo-GLLiM
  • Keywords:
    Inverse problem, Clustering, Regression, Gaussian mixture, Python, C++
  • Scientific Description:
Building a regression model for the purpose of prediction is widely used in all disciplines. A large number of applications consist of learning the association between responses and predictors and focusing on predicting responses for newly observed samples. In this work, we go beyond simple linear models and focus on predicting low-dimensional responses using high-dimensional covariates when the associations between responses and covariates are non-linear.
  • Functional Description:
Kernelo-GLLiM is a Gaussian Locally-Linear Mapping (GLLiM) solver. Kernelo-GLLiM provides a C++ library and a Python module for non-linear mapping (non-linear regression) using a mixture-of-regressions model and an inverse regression strategy. The methods include the GLLiM model (Deleforge et al., 2015) based on Gaussian mixtures.
  • URL:
  • Publications:
  • Contact:
    Florence Forbes
  • Participants:
    Florence Forbes, Benoit Kugler, Sami Djouadi, Samuel Heidmann, Stanislaw Borkowski
  • Partner:
    Institut de Planétologie et d’Astrophysique de Grenoble

8 New results

8.1 Models for graphs and networks

8.1.1 Robust Conformal Volume Estimation in 3D Medical Images

Participants: Florence Forbes, Benjamin Lambert, Michel Dojat.

Joint work with: Senan Doyle from Pixyl.

Volumetry is one of the principal downstream applications of 3D medical image segmentation, for example, to detect abnormal tissue growth or for surgery planning. Conformal Prediction is a promising framework for uncertainty quantification, providing calibrated predictive intervals associated with automatic volume measurements. However, this methodology is based on the hypothesis that calibration and test samples are exchangeable, an assumption that is in practice often violated in medical image applications. A weighted formulation of Conformal Prediction can be framed to mitigate this issue, but its empirical investigation in the medical domain is still lacking. A potential reason is that it relies on the estimation of the density ratio between the calibration and test distributions, which is likely to be intractable in scenarios involving high-dimensional data. To circumvent this, we proposed an efficient approach for density ratio estimation relying on the compressed latent representations generated by the segmentation model. Our experiments demonstrate the efficiency of our approach to reduce the coverage error in the presence of covariate shifts, in both synthetic and real-world settings. This work has been presented at the MICCAI conference in 2024.
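For context, the unweighted split-conformal recipe on which the weighted formulation builds fits in a few lines (a generic sketch with simulated volumes, assuming exchangeability; the weighted variant studied in the paper reweights the calibration scores by the estimated density ratio):

    # Sketch: split conformal predictive interval for a volume measurement.
    import numpy as np

    rng = np.random.default_rng(8)
    vol_pred_cal = rng.uniform(10, 100, 200)                # predictions
    vol_true_cal = vol_pred_cal + rng.normal(0, 3, 200)     # references

    scores = np.abs(vol_true_cal - vol_pred_cal)            # nonconformity
    alpha, n = 0.1, len(scores)                             # 90% coverage
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n,
                    method="higher")                        # conformal quantile

    vol_pred_test = 55.0                                    # new measurement
    interval = (vol_pred_test - q, vol_pred_test + q)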

8.1.2 Trustworthy clinical AI solutions: a unified review of uncertainty quantification in deep learning models for medical image analysis

Participants: Florence Forbes, Benjamin Lambert, Michel Dojat.

Joint work with: Senan Doyle from Pixyl.

The full acceptance of deep learning (DL) models in the clinical field is rather low with respect to the quantity of high-performing solutions reported in the literature. In particular, end users are reluctant to rely on the rough predictions of DL models. Uncertainty quantification methods have been proposed in the literature as a potential response to reduce the rough decision provided by the DL black box and thus increase the interpretability and acceptability of the result by the final user. In this review, we propose an overview of the existing methods for quantifying the uncertainty associated with DL predictions. We focus on applications to medical image analysis, which present specific challenges due to the high dimensionality of images and their quality variability, as well as constraints associated with real-life clinical routine. We then discuss the evaluation protocols to validate the relevance of uncertainty estimates. Finally, we highlight the open challenges of uncertainty quantification in the medical field. This review has been published in Artificial Intelligence in Medicine in 2024 25.

8.1.3 From out-of-distribution detection to quality control

Participants: Florence Forbes, Benjamin Lambert, Michel Dojat.

Joint work with: Senan Doyle from Pixyl.

Quality Control (QC) is an important step of any medical image analysis pipeline to impose safeguards against biased interpretation. Visual QC can be tedious and time-consuming when the volume of data is large, and a branch of work has thus focused on providing automated QC algorithms. In the context of computerized image analysis, such algorithms can be categorized according to the domain on which they operate, namely input (i.e., image) or output (i.e., prediction). Input QC is akin to out-of-distribution detection, aiming at the detection of images that are unusual due, for example, to the presence of artifacts. Output QC, in contrast, focuses on detecting automated predictions that do not meet expectations. These two facets of QC are intertwined, as noisy images are likely to produce poor predictions. However, they are generally considered as separate problems in the literature and tackled with different methodologies and evaluation procedures. In this work, a taxonomy of QC methods is first proposed, oriented to input or output checking. Then, a general framework to jointly combine these two QC facets is proposed and illustrated on two tasks, namely binary segmentation of polyps in endoscopic images and multiclass tumor segmentation in multimodal MRIs. This work has been published as a chapter of the book Trustworthy AI in Medical Imaging 60.

8.1.4 Leaf Area estimation and Semantic segmentation of forest point clouds using neural networks.

Participants: Jean-Baptiste Durand, Florence Forbes, Yuchen Bai.

Joint work with: Grégoire Vincent, IRD, AMAP, Montpellier, France.

Tropical forests, covering only 7% of the Earth's land surface, play a disproportionately vital role in the biosphere, storing 25% of the terrestrial carbon and contributing to over a third of the global terrestrial productivity. They also recycle about a third of the precipitation through evapotranspiration and thus contribute to generating and maintaining a humid climate regionally, with positive effects also extending well beyond the tropics. However, the seasonal variability in fluxes between tropical rainforests and the atmosphere is still poorly understood. Better understanding the processes underlying flux seasonality in tropical forests is thus critical to improve our predictive ability on global biogeochemical cycles. Leaf area index (LAI), a key parameter governing water and carbon fluxes, is inadequately characterised, necessitating advances in monitoring technologies such as aerial and terrestrial laser scanning (LiDAR). In this work, we address key challenges in quantifying leaf area in tropical forests using LiDAR technology.

The first challenge aims at developing an end-to-end deep learning approach for semantic segmentation of Unmanned Aerial Vehicle (UAV) Laser Scans (ULS) with two classes: wood and leaves. We developed a pipeline dedicated to the analysis of point clouds produced by ULS in tropical forest environments, whose specificities are the following: 1) sampling is globally sparser than in Terrestrial Laser Scanning; 2) sampling is from above the scene, which causes point density to decrease closer to the ground; and, as a consequence, 3) the "leaf" class is largely dominant (about 20 times more frequent). This particular structure of point clouds causes state-of-the-art machine learning methods to fail in recognizing wood points. Our pipeline, referred to as SOUL 83, 43, is based on an adaptation of the PointNet++ algorithm with an additional sampling scheme and an innovative training loss function to handle the high class imbalance. SOUL relies on the coordinates of the points only, ignoring device-specific information such as apparent reflectance, so as to increase its range of application to other forests and other sensors. It also includes 4 point-wise geometric features computed at 3 scales to characterize each point. Additionally, we introduce a novel data preprocessing methodology, geodesic voxelization decomposition (GVD), to address the challenges of training neural networks from sparse point clouds. GVD partitions the ULS data while preserving the topology of the point cloud. Experiments were conducted on a dataset recorded in a French Guiana tropical forest. SOUL reached the best results on 3 out of 6 evaluation metrics, including metrics adapted to unbalanced datasets. The approach was also qualitatively tested on open source datasets recorded in Australia and Germany, showing a potential generalisation to other forests and other LiDAR sensors.

The second challenge aims at analysing the various sources of uncertainty and biases that affect LAI estimation from LiDAR surveys. These biases include limitations in sensor sensitivity (censoring), unknown clumping of targets, inadequate weighting of multiple LiDAR returns, unknown leaf angle distribution, leaf size, and the presence of woody components within the canopy. Since there is currently no efficient and comprehensive method to obtain the true LAI of a forest plot, the study uses simulated ULS data generated by the DART software based on two forest mock-ups: Wytham Woods and the RAMI-V Järvselja Birch Stand. The simulated data mimic the characteristics of real ULS data while providing full access to details about the forest, particularly the LAI. Among the various biases, woody components pose a unique challenge because woody organ structure is naturally different from the other sources of bias. Therefore, our approach prioritises addressing this bias, to isolate and understand the individual contributions of the other factors of bias in LAI estimation. To eliminate the impact of woody components, we propose a robust protocol that combines the SOUL method with AMAPVox, a ray tracing software. Once the woody component bias is removed, a quantitative analysis of the remaining biases is conducted, laying the foundation for future work in this area.

8.1.5 Graph modelling for the study of language dynamics

Participants: Sophie Achard.

Joint work with: Clément Guichet, Monica Bacciu and Martial Mermillod from LPNC, Univ. Grenoble Alpes.

Healthy aging is associated with heterogeneous decline across cognitive functions, typically observed between language comprehension and language production (LP). In 22, we examined resting-state fMRI and neuropsychological data from 628 healthy adults (age 18-88) from the CamCAN cohort. We performed state-of-the-art graph theoretical analysis to uncover the neural mechanisms underlying this variability. At the cognitive level, our findings suggest that LP is not an isolated function but is modulated throughout the lifespan by the extent of inter-cognitive synergy between semantic and domain-general processes. At the cerebral level, we show that default mode network (DMN) suppression coupled with fronto-parietal network (FPN) integration is the way for the brain to compensate for the effects of dedifferentiation at a minimal cost, efficiently mitigating the age-related decline in LP. Relatedly, reduced DMN suppression in midlife could compromise the ability to manage the cost of FPN integration. This may prompt older adults to adopt a more cost-efficient compensatory strategy that maintains global homeostasis at the expense of LP performance. Taken together, we propose that midlife represents a critical neurocognitive juncture that signifies the onset of LP decline, as older adults gradually lose control over semantic representations. We summarize our findings in a novel synergistic, economical, nonlinear, emergent, cognitive aging model, integrating connectomic and cognitive dimensions within a complex system perspective. In a follow-up work concerning aging and language production 23, we examined the white matter changes associated with lexical production difficulties, beginning in midlife with increased naming latencies. To delay lexical production decline, middle-aged adults may rely on domain-general and language-specific compensatory mechanisms proposed by the LARA model (Lexical Access and Retrieval in Aging). However, the white matter changes supporting these mechanisms remain largely unknown. Our findings indicate that midlife is marked by alterations in brain structure within distributed dorsal, ventral, and anterior cortico-subcortical networks, marking the onset of lexical production decline around ages 53–54. Middle-aged adults may initially adopt a "semantic strategy" to compensate for lexical production challenges, but this strategy seems compromised later (ages 55–60) as semantic control declines. These insights underscore the interplay between domain-general and language-specific processes in the trajectory of lexical production performance in healthy aging and hint at potential biomarkers for language-related neurodegenerative pathologies.

8.1.6 Link between Graphs and artificial neural networks

Participants: Sophie Achard, Lucrezia Carboni.

Joint work with: Michel Dojat from GIN, Univ. Grenoble Alpes

Artificial neural networks are prone to being fooled by carefully perturbed inputs which cause an egregious misclassification. These adversarial attacks have been the focus of extensive research. Likewise, there has been an abundance of research on ways to detect and defend against them. In 30, we introduce a novel approach for the detection and interpretation of adversarial attacks from a graph perspective. For an input image, we compute an associated sparse graph using the layer-wise relevance propagation algorithm (Bach et al., 2015). Specifically, we only keep edges of the neural network with the highest relevance values. Three quantities are then computed from the graph and compared against those computed from the training set. The result of the comparison is a classification of the image as benign or adversarial. To make the comparison, two classification methods are introduced: (1) an explicit formula based on the Wasserstein distance applied to node degrees and (2) a logistic regression. Both classification methods produce strong results, which lead us to believe that a graph-based interpretation of adversarial attacks is valuable.
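The degree-based test can be sketched in a few lines (an illustrative stand-in: the relevance graphs coming from layer-wise relevance propagation are replaced here by synthetic degree samples, and the threshold is hypothetical):

    # Sketch: flag an input whose relevance-graph degree distribution is
    # far (in 1D Wasserstein distance) from those seen on training data.
    import numpy as np
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(9)
    train_degrees = [rng.poisson(5, 100) for _ in range(50)]  # benign graphs
    test_degrees = rng.poisson(8, 100)                        # suspect graph

    dists = [wasserstein_distance(test_degrees, d) for d in train_degrees]
    threshold = 2.0                     # calibrated on benign validation data
    is_adversarial = np.median(dists) > threshold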

8.1.7 Spatio-temporal data

Participants: Sophie Achard, Hana Lbath.

Joint work with: Alex Petersen, Brigham Young University, US and Wendy Meiring, University Santa Barbara California, US

In two papers 27 and 26, we studied the impact of spatial properties of multivariate time series. First, a novel non-parametric estimator of the correlation between regions, or groups of arbitrarily dependent variables, is proposed in the presence of noise. The challenge resides in the fact that both noise and low intra-regional correlation lead to inconsistent inter-regional correlation estimation using classical approaches. While some existing methods handle one of these issues or the other, none tackle both at the same time. To address this problem, we propose a trade-off between two approaches: correlating regional averages, which is not robust to low average intra-regional correlation, and averaging pairwise inter-regional correlations, which is not robust to high noise. To that end, we project the data onto a space where the Euclidean distance can be used as a proxy for the sample correlation. We then leverage hierarchical clustering to gather highly correlated variables within each region prior to averaging. We prove our estimator is consistent for an appropriate cut-off height of the dendrogram. We also empirically show that our approach surpasses popular estimators in terms of quality and provide illustrations on real-world datasets that further demonstrate its usefulness. Second, we propose to leverage techniques from the large-scale correlation screening literature, and derive simple and practical characterizations of the mean number of correlation discoveries that flexibly incorporate intra-regional dependence structures. A connectivity network inference framework is then presented. First, inter-regional correlation distributions are estimated. Then, correlation thresholds that can be tailored to one's application are constructed for each edge. Finally, the proposed framework is implemented on synthetic and real-world datasets. This novel approach for handling arbitrary intra-regional correlation is shown to limit false positives while improving true positive rates.
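The two baseline estimators between which the proposed method interpolates are easy to contrast on simulated data (illustrative sketch only; the clustering-based trade-off itself is not reproduced):

    # Sketch: two baseline estimators of inter-regional correlation.
    import numpy as np

    rng = np.random.default_rng(10)
    T, p = 500, 20                        # time points, variables per region
    common = rng.standard_normal(T)       # shared signal between regions
    region_a = 0.5 * common[:, None] + rng.standard_normal((T, p))
    region_b = 0.5 * common[:, None] + rng.standard_normal((T, p))

    # (a) correlate the regional averages
    r_avg = np.corrcoef(region_a.mean(1), region_b.mean(1))[0, 1]
    # (b) average all pairwise inter-regional correlations
    r_pair = np.corrcoef(region_a.T, region_b.T)[:p, p:].mean()
    print(r_avg, r_pair)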

8.1.8 Wavelets for time series

Participants: Sophie Achard.

Joint work with: Irène Gannaz, Univ. Grenoble Alpes

In the general setting of long-memory multivariate time series, the long-memory characteristics are defined by two components: the long-memory parameters, which describe the autocorrelation of each time series, and the long-run covariance, which measures the coupling between time series, with general phase parameters. It is of interest to estimate the long-memory, long-run covariance and general phase parameters of time series generated by this wide class of models, although they are not necessarily Gaussian nor stationary. This estimation is thus not directly possible using real wavelet decompositions or Fourier analysis. Our purpose in the paper 11 is to define an inference approach based on a representation using quasi-analytic wavelets. We first show that the covariance of the wavelet coefficients provides an adequate estimator of the covariance structure, including the phase term. Consistent estimators based on a local Whittle approximation are then proposed. Simulations highlight a satisfactory behavior of the estimation on finite samples of linear time series and of multivariate fractional Brownian motions. An application on a real neuroscience dataset is presented, where long-memory and brain connectivity are inferred.

8.1.9 Graph comparisons

Participants: Sophie Achard, Hana Lbath, Lucrezia Carboni.

Joint work with: Paul Sitoleux

We have already published several articles on brain graph comparisons. Functional magnetic resonance imaging (fMRI) allows the construction of functional brain networks, offering a tool to probe the organization of neural activity. Null models have been proposed in this framework to evaluate the ability of newly proposed approaches to discriminate network features coming from the data themselves from those produced by randomization procedures. Several models have been proposed recently, and it is still complicated to choose one. We propose in this paper 51 to compare null models and real datasets using persistent homology (PH). PH is part of topological data analysis (TDA) and offers a framework for building multiscale summaries of networks. PH is first applied to a density-based filtration. We propose a procedure to extract label information from persistent homology summaries of labeled graphs. We then investigate its ability to discriminate between real data and surrogate data generated from null models. Interestingly, our new label-informed approach is able to discriminate very accurately between real datasets and classical null models, opening the way to the design of new null models.

8.1.10 Inferring the dependence graph density of binary graphical models in high dimension

Participants: Julien Chevallier.

Joint work with: Eva Löcherbach from Paris 1, Guilherme Ost from UFRJ

The main objective of 65 is to estimate the connectivity parameter p of a biological neural network based only on the observation of the action potentials of N neurons over T time units. In our main result, we show that p can be estimated with rate N^(-1/2) + N^(1/2)/T + (log(T)/T)^(1/2) through an easy-to-compute estimator. Our analysis relies on a precise study of the spatio-temporal decay of correlations of the interacting chains. This is done through the study of coalescing random walks defining a backward regeneration representation of the system.

8.1.11 Community detection for binary graphical models in high dimension

Participants: Julien Chevallier.

Joint work with: Guilherme Ost from UFRJ

The main objective of 66 is to recover the two communities (one excitatory and one inhibitory) based on the observation of the action potentials of N neurons over T time units. More specifically, we propose a simple algorithm for which the probability of exact recovery converges to 1 as long as (N/T^(1/2)) log(NT) → 0 as T and N diverge. Interestingly, this simple algorithm does not require any prior knowledge of the other model parameters (e.g. the edge probability p).

8.1.12 Normalizing Flows with Task-specific Pre-training for Unsupervised Anomaly Detection on Engineering Structures

Participants: Florence Forbes, Brice Marc.

Joint work with: Philippe Fouchier and Pierre Charbonnier from CEREMA endsum, Strasbourg

Automatic anomaly detection on engineering structures is often carried out using supervised models, raising the issue of anomalous image acquisition and annotation. Unsupervised methods like normalizing flows achieve excellent results while being trained with defect-free images only. However, normalizing flow methods, such as MSFlow, are generally applied to features extracted by an encoder pre-trained on datasets that may not be related to engineering structure images. Therefore, we investigated the possibility of deriving more discriminative features with an additional fine-tuning of the feature extractor on images with synthetic anomalies. We consider two types of such anomalies and demonstrate their efficiency with MSFlow on the MVTec (Wood/Tile) and Crack500 datasets, with significantly improved predictions. Interestingly, both tasks produce similar results, suggesting that pre-training is mainly improved by the healthy part of images and is not very sensitive to anomaly realism. Additionally, when comparing our fine-tuned MSFlow with a reference supervised model, CT-CrackSeg, on the Crack500 dataset, we observe similar qualitative behaviours. This opens a promising direction towards annotation-free, more scalable alternatives, in particular for anomaly detection in engineering structure applications. More details in 49.


8.2 Latent variable modelling

8.2.1 Stochastic Majorization-Minimization with sample-average approximation

Participants: Florence Forbes.

Joint work with: Hien Nguyen, University of Queensland, Brisbane Australia, Gersende Fort, IMT Toulouse.

To extend the applicability of Majorization-Minimization (MM) algorithms in a stochastic optimization context, we propose to combine MM with Sample Average Approximation (SAA). In doing so, we avoid the step-size tuning that comes with stochastic approximation approaches, while augmenting SAA with the possibility of considering smaller samples of increasing sizes. In addition, SAA does not require assuming uniqueness of the solution or quasi-convexity of the majorizers. More details in 69.
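
A toy illustration of the MM-with-SAA principle, assuming a Huber-type location problem and a doubling sample schedule (both illustrative choices, not the paper's setting): each MM step minimizes a quadratic majorizer of a sample-average objective computed on a fresh, growing Monte Carlo sample.

    import numpy as np

    def mm_saa_location(simulate, delta=1.0, iters=8, n0=200, seed=0):
        """Robust location estimation: MM majorizes the Huber loss by a
        weighted quadratic; SAA replaces the expectation by a sample
        average over samples of increasing sizes."""
        rng = np.random.default_rng(seed)
        theta = 0.0
        for k in range(iters):
            x = simulate(rng, n0 * 2 ** k)      # growing SAA sample
            r = theta - x
            w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
            theta = np.sum(w * x) / np.sum(w)   # closed-form majorizer minimum
        return theta

    # Contaminated Gaussian data; the true location is 1.0.
    simulate = lambda rng, n: np.where(rng.uniform(size=n) < 0.9,
                                       rng.normal(1.0, 1.0, n),
                                       rng.normal(8.0, 1.0, n))
    print(mm_saa_location(simulate))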

8.2.2 Natural Variational Annealing for Multimodal Optimization

Participants: Tam Le Minh, Florence Forbes, Julyan Arbel.

Joint work with: Emtiyaz Khan and Thomas Mollenhoff from Riken, Tokyo, Japan

We introduce a new multimodal optimization approach called Natural Variational Annealing (NVA) that combines the strengths of three foundational concepts to simultaneously search for multiple global and local modes of black-box nonconvex objectives. First, it implements a simultaneous search by using variational posteriors, such as mixtures of Gaussians. Second, it applies annealing to gradually trade off exploration for exploitation. Finally, it learns the variational search distribution using natural-gradient learning, where updates resemble well-known and easy-to-implement algorithms. The three concepts come together in NVA, giving rise to new algorithms and also allowing us to incorporate "fitness shaping", a core concept from evolutionary algorithms. We assess the quality of the search on simulations and compare it to methods using gradient descent and evolution strategies. We also provide an application to a real-world inverse problem in planetary science. More details in 73. An extension to the situations where only samples are available can be found in 72.
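
The sketch below conveys the flavour of annealed mixture search on a toy bimodal objective; it uses plain weighted-moment updates rather than NVA's natural-gradient updates, and all settings are illustrative.

    import numpy as np

    def annealed_mixture_search(f, K=3, dim=2, iters=60, n=64, seed=0):
        """Each Gaussian component tracks one candidate mode; annealing
        the temperature gradually shifts from exploration to exploitation."""
        rng = np.random.default_rng(seed)
        means = rng.normal(0, 3, size=(K, dim))
        sigmas = np.full(K, 2.0)
        for t in range(iters):
            temp = max(0.05, 0.95 ** t)           # annealing schedule
            for k in range(K):
                x = means[k] + sigmas[k] * rng.normal(size=(n, dim))
                w = np.exp(-f(x) / temp)
                w = w / (w.sum() + 1e-12)
                means[k] = w @ x                  # weighted mean update
                d = x - means[k]
                var = (w * (d ** 2).sum(axis=1)).sum() / dim
                sigmas[k] = np.sqrt(max(var, 1e-4))
        return means

    # Bimodal objective with minima near (-2, 0) and (2, 0).
    f = lambda x: np.minimum(((x - [-2, 0]) ** 2).sum(1),
                             ((x - [2, 0]) ** 2).sum(1))
    print(annealed_mixture_search(f))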

8.2.3 Scalable magnetic resonance fingerprinting: Incremental inference of high dimensional elliptical mixtures from large data volumes

Participants: Florence Forbes, Geoffroy Oudoumanessah.

Joint work with: Luc Meyer from SED, Michel Dojat, Thomas Coudert, Thomas Christen from Grenoble Institute of Neurosciences, Carole Lartizien from Creatis.

Magnetic Resonance Fingerprinting (MRF) is an emerging technology with the potential to revolutionize radiology and medical diagnostics. In comparison to traditional magnetic resonance imaging (MRI), MRF enables the rapid, simultaneous, non-invasive acquisition and reconstruction of multiple tissue parameters, paving the way for novel diagnostic techniques. In the original matching approach, reconstruction is based on the search for the best matches between in vivo acquired signals and a dictionary of high-dimensional simulated signals (fingerprints) with known tissue properties. A critical and limiting challenge is that the size of the simulated dictionary increases exponentially with the number of parameters, leading to an extremely costly subsequent matching. In this work, we propose to address this scalability issue by considering probabilistic mixtures of high-dimensional elliptical distributions, to learn more efficient dictionary representations. Mixture components are modelled as flexible elliptic shapes in low-dimensional subspaces. They are exploited to cluster similar signals and reduce their dimension locally, cluster-wise, to limit information loss. To estimate such a mixture model, we provide a new incremental algorithm capable of handling large numbers of signals, allowing us to go far beyond the hardware limitations encountered by standard implementations. We demonstrate, on simulated and real data, that our method effectively manages large volumes of MRF data with maintained accuracy. It offers a more efficient solution for accurate tissue characterization and significantly reduces the computational burden, making the clinical application of MRF more practical and accessible. A preliminary version of this work has been accepted at the International Symposium on Biomedical Imaging (ISBI 2025) 50. An extended version is submitted to a journal 77.
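
The incremental flavour of the algorithm can be conveyed by a minimal online EM for Gaussian (rather than elliptical) mixtures, with stochastic updates of the sufficient statistics over mini-batches; this is an illustrative sketch, not the paper's algorithm.

    import numpy as np
    from scipy.stats import multivariate_normal

    def incremental_em_gmm(batches, K, dim, rho=lambda t: (t + 2) ** -0.6):
        """Online EM: `batches` is an iterable of (n, dim) arrays; the
        expected sufficient statistics (s0, s1, s2) are updated with a
        decreasing step size and the parameters re-derived at each step."""
        rng = np.random.default_rng(0)
        pi = np.full(K, 1.0 / K)
        mu = rng.normal(size=(K, dim))
        cov = np.stack([np.eye(dim)] * K)
        s0, s1, s2 = pi.copy(), mu * pi[:, None], cov * pi[:, None, None]
        for t, x in enumerate(batches):
            resp = np.stack([pi[k] * multivariate_normal.pdf(x, mu[k], cov[k])
                             for k in range(K)], axis=1)
            resp /= resp.sum(axis=1, keepdims=True)
            g = rho(t)
            s0 = (1 - g) * s0 + g * resp.mean(axis=0)
            s1 = (1 - g) * s1 + g * (resp.T @ x) / len(x)
            xx = np.einsum('nk,nd,ne->kde', resp, x, x) / len(x)
            s2 = (1 - g) * s2 + g * xx
            pi, mu = s0, s1 / s0[:, None]
            cov = (s2 / s0[:, None, None]
                   - np.einsum('kd,ke->kde', mu, mu) + 1e-6 * np.eye(dim))
        return pi, mu, cov

    rng = np.random.default_rng(1)
    stream = (np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
              for _ in range(200))
    print(incremental_em_gmm(stream, K=2, dim=2)[1])  # means near (-2,-2), (2,2)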

8.2.4 Assessing a dose-response relationship after brain radiotherapy via Mixture of Regressions

Participants: Florence Forbes.

Joint work with: Theo Sylvestre, Sophie Ancelet from IRSN.

Radiotherapy (RT) is one of the most important treatments for brain tumors. However, its potential toxicity on the central nervous system is a highly relevant clinical issue as cognitive dysfunction, mainly related to radiation-induced leukoencephalopathy, may alter the quality of life of patients. As part of the RADIO-AIDE research project, the aim of this work 80, 79 is to model and learn about the potential relationship between the dose of ionizing radiation absorbed in a voxel of the brain after RT (from CT-scan images) and the presence/absence of brain lesions in these voxels (from segmented brain MRI data). We propose to extend the class of mixture-of-experts models, including the well-known Gaussian Locally Linear Mapping model (GLLiM), to a binary outcome and a spatially structured predictor. We thus propose and compare several mixture-of-experts models based on a piecewise logistic regression and different spatial components (hidden Potts model, conditionally auto-regressive model) to account for dependency between neighboring voxels. Various Bayesian statistical learning methods (variational Bayes, MCMC, SMC) are implemented and compared on simulated data as well as real data from the EpiBrainRad prospective cohort, which includes patients treated with radiochemotherapy for glioblastoma. Many modelling perspectives and Bayesian computational challenges are also discussed.

8.2.5 Massive analysis of multidimensional astrophysical data by inverse regression of physical models

Participants: Florence Forbes.

Joint work with: Sylvain Douté IPAG, Stan Borkowski and Luc Meyer from SED Grenoble

Modern observational sciences such as geophysics, astrophysics and medical imaging produce a huge volume of high-dimensional data. A powerful approach to analyzing these data and retrieving information of interest uses the Bayesian formalism to invert physical models. In this article, we show the application of a method based on an inverse regression statistical approach – GLLiM – which has the advantage of producing distributions approximating the target posterior laws. These distributions can also be used for finer predictions using importance sampling, while providing a way to better explore the inverse problem when multiple equivalent solutions exist and to perform uncertainty level estimation. Our objective is to present an application of GLLiM to the analysis of a sequence of hyperspectral images acquired from space for the same Martian scene, and to present the PlanetGLLiM software. More details in 53.
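
To convey the inverse-regression idea, here is a hedged sketch (not the PlanetGLLiM code): fit a joint Gaussian mixture on (parameter, observation) pairs, then condition each component on an observed y to obtain a Gaussian-mixture approximation of the posterior, in the spirit of GLLiM.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_joint_mixture(theta, y, K=8):
        """Fit a full-covariance Gaussian mixture on stacked (theta, y)."""
        gmm = GaussianMixture(n_components=K, covariance_type='full',
                              random_state=0).fit(np.hstack([theta, y]))
        return gmm, theta.shape[1]

    def posterior_mixture(gmm, d, y_obs):
        """Condition each joint component on y = y_obs (Gaussian algebra)."""
        ws, ms, cs = [], [], []
        for k in range(gmm.n_components):
            m, S = gmm.means_[k], gmm.covariances_[k]
            mt, my = m[:d], m[d:]
            Stt, Sty, Syy = S[:d, :d], S[:d, d:], S[d:, d:]
            G = Sty @ np.linalg.inv(Syy)
            ms.append(mt + G @ (y_obs - my))
            cs.append(Stt - G @ Sty.T)
            diff = y_obs - my
            _, logdet = np.linalg.slogdet(Syy)
            ws.append(np.log(gmm.weights_[k])
                      - 0.5 * (logdet + diff @ np.linalg.solve(Syy, diff)))
        ws = np.asarray(ws)
        w = np.exp(ws - ws.max())
        return w / w.sum(), np.array(ms), np.array(cs)

    rng = np.random.default_rng(0)
    theta = rng.uniform(-1, 1, (2000, 1))
    y = np.hstack([np.sin(3 * theta), theta ** 2]) + 0.05 * rng.normal(size=(2000, 2))
    gmm, d = fit_joint_mixture(theta, y)
    weights, means, covs = posterior_mixture(gmm, d, np.array([0.5, 0.3]))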

8.2.6 Probability metrics in distributional reinforcement learning: why were Gaussian mixtures overlooked?

Participants: Florence Forbes, Henrique Donancio, Mathis Antonetti.

Distributional Reinforcement Learning (DRL) aims at optimizing a risk measure of the return by representing its distribution. However, finding a representation of this distribution is challenging as it requires a tractable estimation of the risk measure, a tractable loss, and a representation with enough approximation power. Although Gaussian mixtures are powerful statistical models to address these challenges, only a few papers have investigated this approach and most use the L2 norm as a tractable metric between probability density functions. In this paper, we show that this metric is not suitable and propose alternative metrics, namely a mixture-specific optimal transport distance and a maximum mean discrepancy distance. Our approach is illustrated on some environments of the Arcade Learning Environment benchmark and shows promising empirical results.
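
The mixture-specific metrics mentioned above are attractive because they can be computed in closed form between Gaussian mixtures; the sketch below (one-dimensional, with illustrative parameters) evaluates the squared MMD with an RBF kernel analytically.

    import numpy as np

    def gauss_kernel_expect(m1, s1, m2, s2, h):
        """E[k(X, Y)] for X~N(m1, s1^2), Y~N(m2, s2^2) and an RBF kernel
        of bandwidth h: closed form since X - Y is Gaussian."""
        v = h ** 2 + s1 ** 2 + s2 ** 2
        return h / np.sqrt(v) * np.exp(-0.5 * (m1 - m2) ** 2 / v)

    def mmd2_gmm(w1, m1, s1, w2, m2, s2, h=1.0):
        """Closed-form squared MMD between two 1-D Gaussian mixtures."""
        def cross(wa, ma, sa, wb, mb, sb):
            return sum(wi * wj * gauss_kernel_expect(mi, si, mj, sj, h)
                       for wi, mi, si in zip(wa, ma, sa)
                       for wj, mj, sj in zip(wb, mb, sb))
        return (cross(w1, m1, s1, w1, m1, s1)
                + cross(w2, m2, s2, w2, m2, s2)
                - 2 * cross(w1, m1, s1, w2, m2, s2))

    # Two return distributions represented as Gaussian mixtures.
    print(mmd2_gmm([0.5, 0.5], [-1, 1], [0.3, 0.3],
                   [0.7, 0.3], [-1, 2], [0.3, 0.5]))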

8.2.7 Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach

Participants: Florence Forbes, Henrique Donancio.

Joint work with: Leah South, Queensland University of Technology, Brisbane Australia and Antoine Barrier, Grenoble Institute of Neuroscience.

In Deep Reinforcement Learning models trained using gradient-based techniques, the choice of optimizer and its learning rate are crucial to achieving good performance: higher learning rates can prevent the model from learning effectively, while lower ones might slow convergence. Additionally, due to the non-stationarity of the objective function, the best-performing learning rate can change over the training steps. To adapt the learning rate, a standard technique consists of using decay schedulers. However, these schedulers assume that the model is progressively approaching convergence, which may not always be true, leading to delayed or premature adjustments. In this work, we propose dynamic Learning Rate for deep Reinforcement Learning (LRRL), a meta-learning approach that selects the learning rate based on the agent's performance during training. LRRL is based on a multi-armed bandit algorithm, where each arm represents a different learning rate, and the bandit feedback is provided by the cumulative returns of the RL policy to update the arms' probability distribution. Our empirical results demonstrate that LRRL can substantially improve the performance of deep RL algorithms.
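
A toy sketch of the bandit-over-learning-rates idea follows; the softmax arm selection, the moving-average feedback and the `pull` function (standing in for a few training steps of the RL agent) are all illustrative simplifications, not the exact LRRL algorithm.

    import numpy as np

    def lr_bandit(train_rounds, arms, window=10, eta=0.3, seed=0):
        """Softmax bandit over candidate learning rates; the reward of an
        arm is a moving average of the returns obtained when using it."""
        rng = np.random.default_rng(seed)
        scores = np.zeros(len(arms))

        def pull(lr):          # hypothetical feedback: best lr is 1e-3
            return -(np.log10(lr) + 3) ** 2 + rng.normal(0, 0.1)

        for _ in range(train_rounds):
            p = np.exp(eta * scores)
            p /= p.sum()
            a = rng.choice(len(arms), p=p)
            reward = pull(arms[a])
            scores[a] += (reward - scores[a]) / window   # moving average
        return p

    print(lr_bandit(500, arms=[1e-4, 1e-3, 1e-2, 1e-1]))  # mass near 1e-3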

8.2.8 Bandits and sequential learning

Participants: Julyan Arbel, Julien Zhou.

Joint work with: Pierre Gaillard (Inria Thoth), Thibaud Rahier (Criteo AI Lab).

Bandit algorithms address the exploration-exploitation trade-off by balancing learning about actions and maximizing cumulative rewards, with applications in areas like online advertising, recommendation systems, and A/B testing. We improve existing regret bounds in two settings: stochastic combinatorial semi-bandits 78, and online unconstrained submodular maximization with stochastic bandit feedback 52.

8.2.9 Causal inference

Participants: Julyan Arbel.

Joint work with: Daria Bystrova, Charles Assaad, Emilie Devijver (LIG), Éric Gaussier (LIG), Wilfried Thuiller (LECA).

The work in 18 focuses on constraint-based methods and noise-based methods, two distinct families of methods proposed for uncovering causal graphs from observational data. However, both operate under strong assumptions that may be challenging to validate or could be violated in real-world scenarios. In response to these challenges, there is a growing interest in hybrid methods that amalgamate principles from both families, showing robustness to assumption violations. This paper introduces a novel comprehensive framework for hybridizing constraint-based and noise-based methods designed to uncover causal graphs from observational time series. The framework is structured into two classes. The first class employs a noise-based strategy to identify a super graph, containing the true graph, followed by a constraint-based strategy to eliminate unnecessary edges. In the second class, a constraint-based strategy is applied to identify a skeleton, which is then oriented using a noise-based strategy. The paper provides theoretical guarantees for each class under the condition that all assumptions are satisfied, and it outlines some properties when assumptions are violated. To validate the efficacy of the framework, two algorithms from each class are experimentally tested on simulated data, realistic ecological data, and real datasets sourced from diverse applications. Notably, two novel datasets related to Information Technology monitoring are introduced within the set of considered real datasets. The experimental results underscore the robustness and effectiveness of the hybrid approaches across a broad spectrum of datasets.

8.2.10 Optimal sub-Gaussian variance proxy

Participants: Julyan Arbel.

Joint work with: Mathias Barreto (National Research University Higher School of Economics, Moscow), Olivier Marchal (Institut Camille Jordan, Lyon).

In 64, we establish the optimal sub-Gaussian variance proxy for truncated Gaussian and truncated exponential random variables. The proofs rely on first characterizing the optimal variance proxy as the unique solution to a set of two equations and then observing that for these two truncated distributions, one may find explicit solutions to this set of equations. Moreover, we establish the conditions under which the optimal variance proxy coincides with the variance, thereby characterizing the strict sub-Gaussianity of the truncated random variables. Specifically, we demonstrate that truncated Gaussian variables exhibit strict sub-Gaussian behavior if and only if they are symmetric, meaning their truncation is symmetric with respect to the mean. Conversely, truncated exponential variables are shown to never exhibit strict sub-Gaussian properties. These findings contribute to the understanding of these prevalent probability distributions in statistics and machine learning, providing a valuable foundation for improved and optimal modeling and decision-making processes.
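
The dichotomy can be checked numerically: the sketch below approximates the optimal variance proxy of a truncated standard Gaussian as the supremum over a grid of λ of 2 log E[exp(λ(X−μ))]/λ², and compares it with the variance (grid and settings are illustrative).

    import numpy as np
    from scipy.stats import truncnorm

    def variance_proxy_numeric(a, b, lams=np.linspace(-5, 5, 201)):
        """Numeric proxy for a standard Gaussian truncated to [a, b]."""
        rv = truncnorm(a, b)
        mu, var = rv.mean(), rv.var()
        lams = lams[np.abs(lams) > 1e-3]          # avoid the 0/0 limit
        mgf = np.array([rv.expect(lambda x, l=l: np.exp(l * (x - mu)))
                        for l in lams])
        return np.max(2 * np.log(mgf) / lams ** 2), var

    # Symmetric truncation: proxy ~ variance (strictly sub-Gaussian).
    print(variance_proxy_numeric(-1, 1))
    # Asymmetric truncation: proxy strictly exceeds the variance.
    print(variance_proxy_numeric(-1, 3))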

8.3 Bayesian modelling

8.3.1 Concentration results for approximate Bayesian computation without identifiability

Participants: Florence Forbes, Julyan Arbel.

Joint work with: Hien Nguyen and Trung Tin Nguyen, University of Queensland, Brisbane Australia.

We study the large sample behaviors of approximate Bayesian computation (ABC) posterior measures in situations when the data generating process is dependent on unidentifiable parameters. In particular, we establish the concentration of posterior measures on sets of arbitrarily small measure that contain the equivalence set of the data generative parameter, when the sample size tends to infinity. Our theory also makes weak assumptions regarding the measurement of discrepancy between the data set and simulations. In particular, it does not require the use of summary statistics and is applicable to a broad class of kernelized ABC algorithms. We provide useful illustrations and demonstrations of our theory in practice, and offer a comprehensive assessment of how our findings complement other results in the literature.
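
A minimal rejection-ABC sketch on a toy non-identifiable model illustrates the setting (the model and tolerance are illustrative): since y | θ ~ N(θ², 1), the parameters θ and −θ are equivalent, and the ABC posterior concentrates on both.

    import numpy as np

    def rejection_abc(y_obs, prior_sample, simulate, n=100000, eps=0.1, seed=0):
        """Keep prior draws whose simulated data fall within eps of the
        observation; the discrepancy acts on raw data, with no summary
        statistics."""
        rng = np.random.default_rng(seed)
        theta = prior_sample(rng, n)
        dist = np.abs(simulate(rng, theta) - y_obs)
        return theta[dist < eps]

    prior_sample = lambda rng, n: rng.uniform(-3, 3, n)
    simulate = lambda rng, th: th ** 2 + rng.normal(size=th.shape)
    post = rejection_abc(2.0, prior_sample, simulate)
    print(post.mean(), np.abs(post).mean())   # bimodal around +/- sqrt(2)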

8.3.2 Bayesian Likelihood Free Inference using Mixtures of Experts

Participants: Florence Forbes.

Joint work with: Hien Nguyen and Trung Tin Nguyen, University of Queensland, Brisbane Australia.

We extend Bayesian Synthetic Likelihood (BSL) methods to non-Gaussian approximations of the likelihood function. In this setting 44, we introduce Mixtures of Experts (MoEs), a class of neural network models, as surrogate likelihoods that exhibit desirable approximation theoretic properties. Moreover, MoEs can be estimated using Expectation-Maximization algorithm-based approaches, such as the Gaussian Locally Linear Mapping model estimators that we implement. Further, we provide theoretical evidence towards the ability of our procedure to estimate and approximate a wide range of likelihood functions. Through simulations, we demonstrate the superiority of our approach over existing BSL variants in terms of both posterior approximation accuracy and computational efficiency.

8.3.3 Bayesian mixture models (in)consistency for the number of clusters

Participants: Julyan Arbel, Louise Alamichel, Daria Bystrova.

Joint work with: Guillaume Kon Kam King (INRAE).

Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, their application for clustering has some limitations. Recent results proved posterior inconsistency of the number of clusters when the true number of clusters is finite for the Dirichlet process and Pitman–Yor process mixture models. We extend these results to additional Bayesian nonparametric priors such as Gibbs-type processes and finite-dimensional representations thereof. The latter include the Dirichlet multinomial process and the recently proposed Pitman–Yor and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a post-processing algorithm introduced for the Dirichlet process can be extended to more general models and provides a consistent method to estimate the number of components.

8.3.4 Diagnosing convergence of Markov chain Monte Carlo

Participants: Julyan Arbel, Stephane Girard.

Joint work with: A. Dutfoy (EDF R&D).

Diagnosing convergence of Markov chain Monte Carlo (MCMC) is crucial in Bayesian analysis. Among the most popular methods, the potential scale reduction factor (commonly named R̂) is an indicator that monitors the convergence of output chains to a stationary distribution, based on a comparison of the between- and within-variance of the chains. Several improvements have been suggested since its introduction in the 1990s. In the PhD work of Théo Moins, we analyse some properties of the theoretical value R associated with R̂ in the case of a localized version that focuses on quantiles of the distribution. This leads to proposing a new indicator 28, which is shown to allow both for localizing the MCMC convergence in different quantiles of the distribution and, at the same time, for handling some convergence issues not detected by other R̂ versions.
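
The indicator construction behind such localized versions can be sketched as follows (illustrative implementation and example): applying the classical split-R̂ to the transformed chains 1{θ ≤ x} localizes the diagnostic at the quantile level of x, and can flag tail disagreement that the global statistic misses.

    import numpy as np

    def rhat(chains):
        """Classical split-Rhat; chains is an (m, n) array."""
        m, n = chains.shape
        half = n // 2
        sp = chains[:, :2 * half].reshape(2 * m, half)   # split each chain
        means = sp.mean(axis=1)
        W = sp.var(axis=1, ddof=1).mean()                # within variance
        B = half * means.var(ddof=1)                     # between variance
        return np.sqrt(((half - 1) / half * W + B / half) / W)

    def local_rhat(chains, x):
        """Rhat applied to the indicators 1{theta <= x}."""
        return rhat((chains <= x).astype(float))

    rng = np.random.default_rng(0)
    # Chains agreeing in the bulk but with different tails.
    chains = np.vstack([rng.normal(0, 1, (2, 5000)),
                        rng.standard_t(df=3, size=(2, 5000))])
    print(rhat(chains), local_rhat(chains, np.quantile(chains, 0.99)))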

8.3.5 Bayesian deep learning

Participants: Julyan Arbel, Pierre Wolinski, Konstantinos Pitas.

Joint work with: Vincent Fortuin (Munich), Mariia Vladimirova (Criteo AI Lab).

Neural networks have achieved remarkable performance across various problem domains, but their widespread applicability is hindered by inherent limitations such as overconfidence in predictions, lack of interpretability, and vulnerability to adversarial attacks. To address these challenges, Bayesian neural networks (BNNs) have emerged as a compelling extension of conventional neural networks, integrating uncertainty estimation into their predictive capabilities.

The comprehensive primer 17 presents a systematic introduction to the fundamental concepts of neural networks and Bayesian inference, elucidating their synergistic integration for the development of BNNs. The target audience comprises statisticians with a potential background in Bayesian methods but lacking deep learning expertise, as well as machine learners proficient in deep neural networks but with limited exposure to Bayesian statistics. We provide an overview of commonly employed priors, examining their impact on model behavior and performance. Additionally, we delve into the practical considerations associated with training and inference in BNNs.

Furthermore, we explore advanced topics within the realm of BNN research, acknowledging the existence of ongoing debates and controversies. By offering insights into cutting-edge developments, this primer not only equips researchers and practitioners with a solid foundation in BNNs, but also illuminates the potential applications of this dynamic field. As a valuable resource, it fosters an understanding of BNNs and their promising prospects, facilitating further advancements in the pursuit of knowledge and innovation.

Our recent work studies feature propagation at initialization in neural networks, which lies at the root of numerous initialization designs. A very common assumption in the field states that the pre-activations are Gaussian. Although this convenient Gaussian hypothesis can be justified when the number of neurons per layer tends to infinity, it is challenged by both theoretical and experimental works for finite-width neural networks. The major contribution of this work is to construct a family of pairs of activation functions and initialization distributions that ensure that the pre-activations remain Gaussian throughout the network's depth, even in narrow neural networks. In the process, we discover a set of constraints that a neural network should fulfill to ensure Gaussian pre-activations. Additionally, we provide a critical review of the claims of the Edge of Chaos line of works and build an exact Edge of Chaos analysis. We also propose a unified view of pre-activation propagation, encompassing the framework of several well-known initialization procedures. Finally, our work provides a principled framework for answering the much-debated question: is it desirable to initialize the training of a neural network whose pre-activations are ensured to be Gaussian?

8.3.6 PASOA-PArticle baSed Bayesian Optimal Adaptive design

Participants: Florence Forbes, Jacopo Iollo.

Joint work with: Pierre Alliez, Inria Titane and Christophe Heinkele, Cerema Strasbourg.

We propose a new procedure named PASOA, for Bayesian experimental design, that performs sequential design optimization by simultaneously providing accurate estimates of successive posterior distributions for parameter inference. The sequential design process is carried out via a contrastive estimation principle, using stochastic optimization and Sequential Monte Carlo (SMC) samplers to maximise the Expected Information Gain (EIG). As larger information gains are obtained for larger distances between successive posterior distributions, this EIG objective may worsen classical SMC performance. To handle this issue, tempering is proposed to obtain both a large information gain and accurate SMC sampling, which we show is crucial for performance. This novel combination of stochastic optimization and tempered SMC allows us to jointly handle design optimization and parameter inference. We provide a proof that the obtained optimal design estimators benefit from a consistency property. Numerical experiments confirm the potential of the approach, which outperforms other recent existing procedures. This work has been accepted at ICML 2024 46.
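
The contrastive estimation principle can be illustrated with the standard Prior Contrastive Estimation (PCE) lower bound on the EIG, here on a toy linear-Gaussian model; this sketch is a generic bound, not the PASOA algorithm itself.

    import numpy as np

    def eig_pce(design, prior_sample, simulate, loglik, N=2000, L=50, seed=0):
        """PCE bound: contrast the likelihood of the generating parameter
        against L fresh prior samples, averaged over N simulated outcomes."""
        rng = np.random.default_rng(seed)
        theta = prior_sample(rng, N)                 # generating parameters
        y = simulate(rng, theta, design)
        contrast = prior_sample(rng, (L, N))         # contrastive parameters
        ll0 = loglik(y, theta, design)               # shape (N,)
        llc = loglik(y[None, :], contrast, design)   # shape (L, N)
        all_ll = np.vstack([ll0[None, :], llc])
        denom = np.logaddexp.reduce(all_ll, axis=0) - np.log(L + 1)
        return np.mean(ll0 - denom)

    # Toy model: y | theta, d ~ N(d * theta, 1) with theta ~ N(0, 1).
    prior_sample = lambda rng, size: rng.normal(size=size)
    simulate = lambda rng, th, d: d * th + rng.normal(size=th.shape)
    loglik = lambda y, th, d: -0.5 * (y - d * th) ** 2 - 0.5 * np.log(2 * np.pi)
    for d in [0.1, 1.0, 3.0]:
        print(d, eig_pce(d, prior_sample, simulate, loglik))  # grows with |d|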

8.3.7 Bayesian Experimental Design via Contrastive Diffusions.

Participants: Florence Forbes, Jacopo Iollo.

Joint work with: Pierre Alliez, Inria Titane and Christophe Heinkele, Cerema Strasbourg.

Bayesian Optimal Experimental Design (BOED) is a powerful tool to reduce the cost of running a sequence of experiments. When based on the Expected Information Gain (EIG), design optimization corresponds to the maximization of some intractable expected contrast between prior and posterior distributions. Scaling this maximization to high-dimensional and complex settings has been an issue due to the inherent computational complexity of BOED. In this work, we introduce a pooled posterior distribution with cost-effective sampling properties and provide tractable access to the EIG contrast maximization via a new EIG gradient expression. Diffusion-based samplers are used to compute the dynamics of the pooled posterior, and ideas from bi-level optimization are leveraged to derive an efficient joint sampling-optimization loop. The resulting efficiency gain allows us to extend BOED to the well-tested generative capabilities of diffusion models. By incorporating generative models into the BOED framework, we expand its scope and its use in scenarios that were previously impractical. Numerical experiments and comparison with state-of-the-art methods show the potential of the approach. This work has been accepted at ICLR 2025 45.

8.3.8 Bayesian nonparametric mixture of experts for high-dimensional inverse problems.

Participants: Julyan Arbel, Florence Forbes.

Joint work with: Hien Duy Nguyen and Trung Tin Nguyen, University of Queensland, Brisbane, Australia.

A wide class of problems can be formulated as inverse problems where the goal is to find parameter values that best explain some observed measures. Typical constraints in practice are that relationships between parameters and observations are highly nonlinear, with high-dimensional observations and multi-dimensional correlated parameters. To handle these constraints, we consider probabilistic mixtures of locally linear models, which can be seen as particular instances of mixtures of experts (MoE). We have shown in previous studies that such models have good approximation abilities provided the number of experts is large enough. Our contribution here is a general scheme for designing a tractable Bayesian nonparametric (BNP) MoE model, avoiding any commitment to an arbitrary number of experts. A tractable estimation algorithm is designed using a variational approximation, and theoretical properties are derived on the predictive distribution and the number of components. Illustrations on simulated and real data show good results in terms of selection and computing time compared to more traditional model selection procedures. This work has been accepted in the Journal of Nonparametric Statistics 29.

8.3.9 Bayesian inference on a large-scale brain simulator

Participants: Pedro Rodrigues.

Joint work with: Nicholas Tolley and Stephanie Jones from Brown University, Alexandre Gramfort from Inria Saclay

Biophysically detailed neural models are a powerful technique to study neural dynamics in health and disease, with a growing number of established and openly available models. A major challenge in the use of such models is that parameter inference is an inherently difficult and unsolved problem. Identifying unique parameter distributions that can account for observed neural dynamics, and differences across experimental conditions, is essential to their meaningful use. Recently, simulation-based inference (SBI) has been proposed as an approach to perform Bayesian inference to estimate parameters in detailed neural models. SBI overcomes the challenge of not having access to a likelihood function, which has severely limited inference methods in such models, by leveraging advances in deep learning to perform density estimation. While the substantial methodological advancements offered by SBI are promising, their use in large-scale biophysically detailed models is challenging and methods for doing so have not been established, particularly when inferring parameters that can account for time series waveforms. We provide guidelines and considerations on how SBI can be applied to estimate time series waveforms in biophysically detailed neural models, starting with a simplified example and extending to specific applications to common MEG/EEG waveforms using the large-scale neural modeling framework of the Human Neocortical Neurosolver. Specifically, we describe how to estimate and compare results from example oscillatory and event related potential simulations. We also describe how diagnostics can be used to assess the quality and uniqueness of the posterior estimates. The methods described provide a principled foundation to guide future applications of SBI in a wide variety of applications that use detailed models to study neural dynamics 31.

8.3.10 Simulation-based inference using score-diffusion

Participants: Pedro Rodrigues, Julia Linhart.

Joint work with: Gabriel Cardoso from Ecole Polytechnique, Sylvain Le Corff from Sorbonne Université and Alexandre Gramfort from Inria Saclay

Determining which parameters of a non-linear model best describe a set of experimental data is a fundamental problem in science, and it has gained much traction lately with the rise of complex large-scale simulators. The likelihood of such models is typically intractable, which is why classical MCMC methods cannot be used. Simulation-based inference (SBI) stands out in this context by only requiring a dataset of simulations to train deep generative models capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. The proposed method is built upon recent developments from the flourishing score-based diffusion literature and allows us to estimate the tall data posterior distribution, while simply using information from a score network trained for a single context observation. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost. This work is under review 75.

8.3.11 Lightweight posterior approximation for simulation-based inference

Participants: Pedro Rodrigues, Florence Forbes, Geoffroy Oudoumanessah.

Joint work with: Henrik Häggström and Umberto Picchini from Chalmers University

Bayesian inference for complex models with an intractable likelihood can be tackled using algorithms performing many calls to computer simulators. These approaches are collectively known as "simulation-based inference" (SBI). Recent SBI methods have made use of neural networks (NN) to provide approximate, yet expressive constructs for the unavailable likelihood function and the posterior distribution. However, the trade-off between accuracy and computational demand leaves much space for improvement. In this work, we propose an alternative that provides both approximations to the likelihood and the posterior distribution, using structured mixtures of probability distributions. Our approach produces accurate posterior inference when compared to state-of-the-art NN-based SBI methods, even for multimodal posteriors, while exhibiting a much smaller computational footprint. We illustrate our results on several benchmark models from the SBI literature and on a biological model of the translation kinetics after mRNA transfection 24.

8.4 Modelling and quantifying extreme risk

8.4.1 Extreme events and neural networks

Participants: Stephane Girard.

Joint work with: M. Allouche and E. Gobet (CMAP, Ecole Poytechnique).

In 61, we investigate the use of generative methods based on neural networks to simulate extreme events. Although very popular, these methods are mainly invoked in empirical works. Therefore, providing theoretical guidelines for using such models in an extreme-value context is of primary importance. To this end, we propose an overview of the most recent generative methods dedicated to extremes, giving some theoretical and practical tips on their tail behaviour thanks to both extreme-value and copula tools. This work is submitted for publication.

In 14, we propose new parametrizations for neural networks in order to estimate extreme quantiles in both non-conditional and conditional heavy-tailed settings. All proposed neural network estimators feature a bias correction based on an extension of the usual second-order condition to an arbitrary order. The convergence rate of the uniform error between extreme log-quantiles and their neural network approximation is established. The finite sample performances of the non-conditional neural network estimator are compared to other bias-reduced extreme-value competitors on simulated data. It is shown that our method outperforms them in difficult heavy-tailed situations where other estimators almost all fail. Finally, the conditional neural network estimators are implemented to investigate the behaviour of extreme rainfalls as functions of their geographical location in the southern part of France. This work is extended in 15 to the estimation of more general risk measures such as the Expected Shortfall and the Conditional Tail Moments.

8.4.2 Estimation of univariate extreme risk measures

Participants: Jonathan El Methni, Stephane Girard.

Joint work with: M. Allouche (CMAP, Ecole Polytechnique).

One of the most popular risk measures is the Value-at-Risk (VaR) introduced in the 1990s. In statistical terms, the VaR at level α ∈ (0,1) corresponds to the upper α-quantile of the loss distribution. The Weissman extrapolation device for estimating extreme quantiles (when α → 0) from heavy-tailed distributions is based on two estimators: an order statistic to estimate an intermediate quantile and an estimator of the tail index. The common practice is to select the same intermediate sequence for both estimators. In a previous work, we showed how an adapted choice of two different intermediate sequences leads to a reduction of the asymptotic bias associated with the resulting refined Weissman estimator. This new bias reduction method is fully automatic and does not involve the selection of extra parameters. Our approach is compared to other bias-reduced estimators of extreme quantiles both on simulated and real data. This work is extended to more general risk measures in 13 and to Weibull-tail distributions in 20.
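
As an illustration of the extrapolation device and of the use of two intermediate sequences, the following self-contained sketch (simulated Pareto data, illustrative choices of k1 and k2) combines a Hill estimate of the tail index with Weissman extrapolation:

    import numpy as np

    def weissman_quantile(x, alpha, k1, k2=None):
        """Weissman extreme quantile estimator, with possibly different
        intermediate sequences for the anchor order statistic (k1) and
        the Hill tail-index estimator (k2)."""
        x = np.sort(x)
        n = len(x)
        k2 = k1 if k2 is None else k2
        # Hill estimator of the tail index gamma from the top k2 values.
        gamma = np.mean(np.log(x[n - k2:]) - np.log(x[n - k2 - 1]))
        # Extrapolate from the intermediate quantile of order k1/n to alpha.
        return x[n - k1 - 1] * (k1 / (n * alpha)) ** gamma

    rng = np.random.default_rng(0)
    sample = rng.pareto(2.0, size=5000) + 1     # Pareto tail, gamma = 0.5
    print(weissman_quantile(sample, alpha=1e-4, k1=200, k2=300),
          (1e-4) ** -0.5)                       # estimate vs true quantile 100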

8.4.3 Estimation of multivariate risk measures

Participants: Stephane Girard.

Joint work with: T. Opitz (INRAe Avignon) and A. Usseglio-Carleve (Univ. Avignon).

Analysis of variance (ANOVA) is commonly employed to assess differences in the means of independent samples. However, it is unsuitable for evaluating differences in tail behaviour, especially when means do not exist or empirical estimation of moments is inconsistent due to heavy-tailed distributions. In 21, we propose an ANOVA-like decomposition to analyse tail variability, allowing for flexible representation of heavy tails through a set of user-defined extreme quantiles, possibly located outside the range of observations. Building on the assumption of regular variation, we introduce a test for significant tail differences among multiple independent samples and derive its asymptotic distribution. We investigate the theoretical behaviour of the test statistics for the case of two samples, each following a Pareto distribution, and explore strategies for setting hyperparameters in the test procedure. To demonstrate the finite-sample performance, we conduct simulations that highlight generally reliable test behaviour for a wide range of situations. The test is applied to identify clusters of financial stock indices with similar extreme log-returns and to detect temporal changes in daily precipitation extremes at rain gauges in Germany.

8.4.4 Dimension reduction for extremes

Participants: Julyan Arbel, Stephane Girard, Cambyse Pakzad.

Joint work with: H. Lorenzo (Univ. Marseille).

In the context of the PhD thesis of Meryem Bousebata, we proposed a new approach, called Extreme-PLS (EPLS), for dimension reduction in regression, adapted to distribution tails. The objective is to find linear combinations of predictors that best explain the extreme values of the response variable in a non-linear inverse regression model. Further, a novel interpretation of EPLS directions as maximum likelihood estimators is introduced in 16, utilizing the von Mises-Fisher distribution applied to hyperballs. The dimension reduction process is enhanced through the Bayesian paradigm, enabling the incorporation of prior information into the projection direction estimation. The maximum a posteriori estimator is derived in two specific cases, elucidating it as a regularization or shrinkage of the EPLS estimator. We also establish its asymptotic behavior as the sample size approaches infinity. A simulation study is conducted to assess the practical utility of our proposed method. It clearly demonstrates its effectiveness even in moderate-sample problems within high-dimensional settings. Furthermore, we provide an illustrative example of the method's applicability using French farm income data, highlighting its efficacy in real-world scenarios. Finally, the EPLS method is extended to the functional framework in 70, to tackle the case of functional covariates. The results are submitted for publication.

9 Bilateral contracts and grants with industry

9.1 Bilateral contracts with industry

Participants: Florence Forbes, Pedro Luiz Coelho Rodrigues, Stephane Girard.

  • Plan de Relance project with GE Healthcare (2022-24).
    The topic of the collaboration is related to early anomaly detection of failures in medical transducer manufacturing. The financial support for statify amounts to 155K euros.
  • Contract with EDF (2024-2027).
    Stephane Girard is the advisor of the PhD thesis of Antoine Franchini, funded by EDF. The goal is to investigate sensitivity analysis and extrapolation limits in extreme-value theory. The financial support for statify amounts to 50K euros.

10 Partnerships and cooperations

10.1 International initiatives

10.1.1 Inria associate team not involved in an IIL or an international program

WOMBAT
  • Title:
    Variance-reduced Optimization Methods and Bayesian Approximation Techniques for scalable inference
  • Duration:
    2023 -> ...
  • Coordinator:
    Hien Duy Nguyen (h.nguyen5@latrobe.edu.au)
  • Partners:
    • La Trobe University, Melbourne (Australia)
  • Inria contact:
    Florence Forbes
  • Summary:
    Many inferential tools, such as machine learning algorithms and statistical models, require the estimation of model parameters, structures, quantities, and properties from data. In practice, it is common that model characterizations are available through high-fidelity simulations of the data generating processes, but only through “black-boxes” that are poorly suited for optimization under uncertainty or conventional statistical inference procedures. The main statistical challenge is that model likelihoods are typically intractable or unavailable in closed form. Approaches suited for these scenarios are typically referred to as likelihood-free or simulation-based inference (SBI) methods, and have received a great deal of attention in recent years, with momentum coming from the mixing of ideas at the interface between statistics and machine learning. However, most SBI methods scale poorly when the number of observations is too large, which makes them unsuitable for modern data, which are often acquired in real time, in an incremental manner, and are often available in large volume. Computation of inferential quantities in an incremental manner may be forced by the nature of data acquisition (e.g. streaming and sequential data), but may also be seen as a solution to handle larger data volumes in a more resource-friendly way, with respect to memory, energy, and time consumption. To produce feasible and practical online algorithms for streaming data and complex models, we propose to study the family of stochastic approximation (SA) algorithms. The overall goal of the project is to combine recent ideas from the SBI and SA literature, to propose efficient methods for handling complex inferential problems. We shall demonstrate our approaches via applications to problems in challenging domains, such as Magnetic Resonance Imaging (MRI) or road network management as initial targets. In doing so, we hope to achieve both breakthroughs in applied methodology and the development of new SBI and SA techniques with widespread applicability.

10.2 International research visitors

10.2.1 Visits of international scientists

Other international visits to the team
Alex Petersen
  • Status
    associate professor
  • Institution of origin:
    Brigham Young University
  • Country:
    USA
  • Dates:
    31st May 2024 to 10th July 2024
  • Context of the visit:
    Research project within the team with MIAI Cluster
  • Mobility program/type of mobility:
    research stay, partly funded by Inria

10.3 National initiatives

Participants: Jonathan El Methni, Jean-Baptiste Durand, Florence Forbes, Julyan Arbel, Sophie Achard, Stephane Girard, Pedro Luiz Coelho Rodrigues.

  • Jonathan El Methni and Stephane Girard were awarded 5K euros and PhD funding via the IRGA call from Université Grenoble-Alpes, 2024–2027.
ANR

  • An ANR project RADIO-AIDE (2022-26), on radiation-induced neurotoxicity assessed by spatio-temporal modeling and AI after brain radiotherapy, coordinated by S. Ancelet from IRSN, has been granted for 4 years starting from April 2022. It involves statify, the Grenoble Institute of Neurosciences, Pixyl, ICANS, APHP, ICM and ENS Paris-Saclay. The available funding for statify is 94K euros.
  • ANR project PEG2 (2022-26) on Predictive Ecological Genomics: statify is involved in this 4-year project accepted in July 2022. The PI is Prof. Olivier François, who spent two years (2021-22) in the team on a délégation position.
  • Julyan Arbel is co-PI of the Bayes-Duality project, launched with a funding of $2.76 million by the Japanese JST and the French ANR for a total of 5 years starting in October 2021. The goal is to develop a new learning paradigm for Artificial Intelligence that learns like humans in an adaptive, robust, and continuous fashion. On the Japanese side the project is led by Mohammad Emtiyaz Khan as the research director, with Kenichi Bannai and Rio Yokota as co-PIs.
  • Statify is involved in the 4-year ANR project EXSTA “EXtremes, STatistical learning and Applications” (2024-2028) hosted by Paris-Sorbonne University. Extreme Value Theory is the branch of probability and statistics dedicated to rare events associated with tails of distributions, with numerous applications in various scientific fields where extreme events are of particular importance, and in risk management. Recent years have seen the development of a theoretical framework inspired by statistical learning theory and algorithms adapted from machine learning for the analysis of extremes, in line with the statistical community's growing interest in high-dimensional problems and the increasing availability of large-scale data sets. The aim of the project is to reinforce these emerging directions and encourage interaction between theory and practice. The consortium brings together statisticians whose research topics cover a wide spectrum, from mathematical statistics and learning theory to operational applications in climate and environmental sciences and industry.

PEPR Digital Health

  • Florence Forbes and Sophie Achard are involved in the REWIND project (2023-2028), pRecision mEdicine WIth loNgitudinal Data. The goal is to develop models for longitudinal data to understand the progression of chronic diseases.

France Life Imaging (FLI)

  • Funding from the comité de pilotage national of the Réseau d'Expertise « Traitement et Analyse en Imagerie Multimodale » (RE4) of the France Life Imaging (FLI) infrastructure, for a project entitled « Détection d'Anomalies en Imagerie Médicale par apprentissage faiblement Supervisé ». Joint project with Carole Lartizien and Michel Dojat.
INRAe projects

  • Stephane Girard was awarded 80K euros for the project “Analysis of variability in extremes - ANOVEX” via the AMI INRAe-Inria “Risques naturels et environnementaux”, 2023–2024.
  • Jonathan El Methni and Stephane Girard were awarded 3.5K euros for the project “Warm Winter Risk - WWR” via the AMI INRAe “projets exploratoires pour le métaprogramme XRISQUES”, 2023–2024.

10.3.1 Networks

MSTGA and AIGM INRAE (French National Institute for Agricultural Research) networks: F. Forbes and J.-B. Durand have been members of the INRAE network AIGM (formerly MSTGA), on algorithmic issues for inference in graphical models, since 2006. It is funded by INRAE MIA and RNSC/ISC Paris and gathers researchers from different disciplines. Statify co-organized and hosted two of the network meetings, in 2008 and 2015, in Grenoble.

11 Dissemination

11.1 Promoting scientific activities

11.1.1 Scientific events: organisation

Member of the organizing committees
  • Pedro Rodrigues, 2ème Colloque Français d'Intelligence Artificielle en Imagerie Biomédicale, 25-27 March 2024, Grenoble, France.
  • Pedro Rodrigues, 1st Workshop Grenoble Artificial Intelligence for Physical Sciences, 29-31 May 2024, Grenoble, France.
  • Sophie Achard, organization of a special session at the EUSIPCO conference entitled "Reproducibility of Classical Approaches for Brain Imaging", 26-30 August 2024, Lyon, France.

11.1.2 Scientific events: selection

Reviewer
  • Julyan Arbel: Advances in Approximate Bayesian Inference, International Conference on Artificial Intelligence and Statistics (AISTATS).

11.1.3 Journal

Member of the editorial boards
  • Stephane Girard: Associate Editor for Revstat and Dependence Modelling.
  • Julyan Arbel, Associate Editor, Bayesian Analysis, since 2019.
  • Julyan Arbel and Florence Forbes, Associate Editors, Australian and New Zealand Journal of Statistics, since 2019.
  • Julyan Arbel, Associate Editor, Statistics & Probability Letters, since 2019.
  • Julyan Arbel, Associate Editor, Computational Statistics & Data Analysis, since 2020.
  • Julyan Arbel, Associate Editor, Statistical Methods & Applications, since 2023.
Reviewer - reviewing activities
  • Julien Chevallier: ESAIM: Probability and Statistics, Journal of Mathematical Biology, Scandinavian Journal of Statistics, Stochastic Processes and their Applications.
  • Stephane Girard: Journal of Machine Learning Research, Oxford Bulletin of Economics and Statistics, Stat, Bernoulli, Extremes, Dependence Modelling, Advances in Data Analysis and Classification, Journal of the Royal Statistical Society - Series B.
  • Julyan Arbel: Annals of Statistics, Journal of the Royal Statistical Society: Series B, Statistical Science, Annals of Applied Probability.

11.1.4 Invited talks

  • Stephane Girard was invited as a keynote speaker to the Journées de Statistique organized by the French Statistical Society.
  • Julyan Arbel, invited talk at the Montpellier Statistics Seminar.
  • Julyan Arbel, invited talk at the Dagstuhl Seminar "Rethinking the Role of Bayesianism in the Age of Modern AI".
  • Julyan Arbel, invited talk at "Generative models and uncertainty quantification" (GenU), Copenhagen.
  • Sophie Achard, invited talk at the Marseille Statistics Seminar, June 2024.

11.1.5 Leadership within the scientific community

  • Pedro Rodrigues is responsible for the weekly seminar of the DATA department of the Laboratoire Jean Kuntzmann.
  • Sophie Achard is scientific director of the MIAI Cluster.

11.1.6 Scientific expertise

  • Stephane Girard: member of the committee in charge of hiring a full professor at Université de Toulouse Jean-Jaurès.
  • Julyan Arbel, Member of the Board of Directors of ISBA, the International Society for Bayesian Analysis, 2022-2025.
  • Sophie Achard, member of the INRAE committee for the evaluation of researchers.
  • Sophie Achard, member of the HCERES committee of L2S, CentraleSupelec, Paris-Saclay.
  • Sophie Achard, member of the HCERES committee of MAP5, Université Paris Cité.

11.1.7 Research administration

  • Julyan Arbel, Member of the Comité des Emplois Scientifiques at Inria Grenoble since 2019.
  • Sophie Achard, Member of the Conseil Scientifique d'Institut, CNRS Sciences Informatiques.
  • Florence Forbes is in charge of representing the Grenoble Inria Center and LJK for activities and events related to digital health.
  • Florence Forbes is a member of the scientific committee (COS) of Inria Grenoble.

11.2 Teaching - Supervision - Juries

11.2.1 Teaching

  • Master: Stephane Girard, Statistique Inférentielle Avancée, 18 ETD, M1 level, Ensimag, Grenoble-INP, France.
  • Master: Stephane Girard, Introduction to Extreme-Value Analysis, 15 ETD, M2 level, Univ. Grenoble Alpes (UGA), France.
  • Master: Julyan Arbel, Bayesian machine learning, Master Mathématiques Vision et Apprentissage (MVA), École normale supérieure Paris-Saclay, 36 ETD.
  • Master: Pedro Rodrigues, Statistical Learning with Applications, 35 ETD, M1AM and Ensimag.
  • Master: Pedro Rodrigues, DataCamp, 40 ETD, M2 Data Science (Institut Polytechnique de Paris).

11.2.2 Supervision

  • Stephane Girard is the advisor of the PhD thesis of Antoine Franchini (Université Grenoble-Alpes, since December 2024).
  • Stephane Girard and Jonathan El Methni are the PhD co-advisors of the PhD thesis of Pearl Laveur (Université Grenoble-Alpes, since October 2024).
  • Stephane Girard is the co-advisor (with G. Stupfler, Université d'Angers, and A. Usseglio-Carleve, Université d'Avignon) of the PhD thesis of Solune Denis (Université d'Angers, since October 2024).
  • Stephane Girard is the co-advisor (with E. Gobet, Ecole Polytechnique) of the PhD thesis of Jean Pachebat (Ecole Polytechnique, since February 2023).
  • Julyan Arbel was co-supervisor, with Guillaume Kon Kam King (INRAE), of the PhD thesis of Louise Alamichel, "Bayesian Nonparametric methods for complex genomic data", Inria, defended in September 2024.
  • PhD in progress: Mohamed-Bahi Yahiaoui, "Computation time reduction and efficient uncertainty propagation for fission gas simulation", CEA Cadarache-Inria, started in October 2021, advised by Julyan Arbel, Loic Giraldi and Geoffrey Daniel.
  • PhD in progress: Alexandre Wendling, "Machine learning of embeddings and generative models. Applications in ecology", advised by Julyan Arbel and Clovis Galliez.
  • PhD in progress: Julien Zhou, "Learning combinatorial bandit models under privacy constraints", advised by Julyan Arbel, Pierre Gaillard and Thibaud Rahier.
  • PhD in progress: Alice Chevaux, "Density-based graphs modelling: inference, comparison, classification", advised by Sophie Achard, Julyan Arbel and Guillaume Kon Kam King.
  • PhD in progress: Camille Touron, "Hierarchical posterior estimation with deep generative models: theory and methods", advised by Pedro Rodrigues and Julyan Arbel.
  • PhD in progress: Eloise Touron, "Surrogate modeling for simulation-based inference", advised by Pedro Rodrigues, Julyan Arbel, Michael Arbel and Nelle Varoquaux.
  • PhD in progress: Razan Mhanna, "Machine learning for functional data", advised by Sophie Achard, Jonas Richiardi and Alex Petersen.
  • PhD in progress: Ali Fakhar, "Graph learning, statistical graph learning, brain signal dynamics", advised by Sophie Achard, Kevin Polisano and Irène Gannaz.
  • PhD in progress: Arturo Cabrera Vazquez, "Brain Connectivity Modeling for Impaired Consciousness Studies", advised by Sophie Achard, Michel Dojat and Stein Silva.
  • Defended in October 2024: Benjamin Lambert, started in January 2022, supervised by Florence Forbes and Michel Dojat (GIN).
  • Defended in December 2024: Yuchen Bai, "Hierarchical Bayesian Modelling of leaf area density from UAV-lidar", started in October 2021, supervised by Jean-Baptiste Durand, Florence Forbes and Gregoire Vincent (IRD, Montpellier).
  • PhD in progress: Jacopo Iollo, started in January 2022, supervised by Florence Forbes, P. Alliez (DR Inria Sophia) and C. Heinkele (Cerema, Strasbourg).
  • PhD in progress: Geoffroy Oudoumanessah, started in October 2022, supervised by Florence Forbes, C. Lartizien (Creatis, Lyon) and Michel Dojat (GIN).
  • PhD in progress: Theo Sylvestre, started in October 2022, supervised by Florence Forbes and S. Ancelet (IRSN).
  • PhD in progress: Brice Marc, started in January 2023, supervised by Florence Forbes, Philippe Foucher and Pierre Charbonnier (Cerema, Strasbourg).

11.2.3 Juries

  • Stephane Girard: Reviewer of the PhD thesis of Marwan Wehaiba El Khazen, “Statistical models for the optimization of energy consumption in cyber-physical systems”, Sorbonne Université, December 2024.
  • Stephane Girard: Member of the PhD committee of Cyprien Ferraris, “Estimation of high-dimensional dependency structure using random graph models: applications to the monitoring and optimization of production controls”, Sorbonne Université, November 2024.
  • Stephane Girard: Reviewer of the PhD thesis of Samuel Valiquette, “On count data in the context of extreme values and multivariate modeling”, Université de Montpellier, France and Sherbrooke, Canada, July 2024.
  • Julyan Arbel, Reviewer of the PhD thesis of Nikita Kotelevskii, “Predictive Uncertainty Quantification: A Unifying Framework and Applications to Federated Learning”, Skolkovo Institute of Science and Technology, Moscow, December 2024.
  • Julyan Arbel, Reviewer of the PhD thesis of Davide Agnoletto, “Generalized Bayes methodologies”, University of Padova, December 2024.
  • Julyan Arbel, Reviewer of the PhD thesis of Tom Huix, “Variational Inference: theory and large scale applications”, Institut Polytechnique de Paris, November 2024.
  • Julyan Arbel, Member of the PhD Committee of Alice Giampino, “Innovative approaches to Bayesian Clustering Methods: parametric and nonparametric perspectives”, Università degli Studi di Milano-Bicocca, February 2024.
  • Julyan Arbel, Reviewer of the PhD thesis of Andrea Mascaretti, “Bayesian Sparse Model for Complex Data”, Università degli Studi di Padova, Italy, February 2024.
  • Sophie Achard, Reviewer of the PhD thesis of Irina Dolzhikova, “Robust Data-driven Predictive Model for Brain-Computer Interface”, Nazarbayev University, Astana, Kazakhstan.
  • Sophie Achard, Reviewer of the HDR of Julie Coloigner, “Cerebral functional and structural connectivity analysis”, Rennes, France.
  • Sophie Achard, Reviewer of the HDR of Nisrine Jrad, “Contributions en apprentissage automatique pour l’analyse des données”, Angers, France.
  • Sophie Achard, Reviewer of the PhD thesis of Adrien Thirion, “Algorithmes d’intelligence artificielle en temps réel sur cible embarquée avec des capteurs de ballistocardiographie de nouvelle génération”, Toulouse, France.
  • Florence Forbes, Reviewer of the PhD thesis of Gabriel Cardoso, Ecole Polytechnique.
  • Florence Forbes, Chair of the PhD committee of Margaux Zaffran, Ecole Polytechnique.
  • Florence Forbes, Chair of the PhD committee of Qiao Chen, Inria Airsea, UGA.
  • Florence Forbes, Member of the PhD committee of Juliette Ortholand, Inria Heka, PSL.
  • Florence Forbes, Member of the PhD committee of Nicolas Pinon, Creatis, Lyon.
  • Florence Forbes, Member of the PhD committee of Thomas Coudert, GIN, INSERM.
  • Florence Forbes, Member of the PhD committee of Louise Alamichel, INRIA-INRAe.

11.3 Popularization

11.3.1 Productions (articles, videos, podcasts, serious games, ...)

  • Julyan Arbel published a popularization paper entitled “Vers une IA qui saurait dire « je ne sais pas »” in “Nouveaux défis de la statistique”, published by Bibliothèque Tangente n°86.
  • Stephane Girard published a popularization paper entitled “Les statistiques de l'extrême”, co-authored with R. Barbero, T. Opitz (INRAe Avignon) and A. Usseglio-Carleve (Avignon University), in the “Pour la Science” journal, issue 546.

12 Scientific production

12.1 Major publications

  • 1 article: Sophie Achard, Jean-François Coeurjolly, Pierre Lafaye de Micheaux, Hanâ Lbath and Jonas Richiardi. Inter-regional correlation estimators for functional magnetic resonance imaging. NeuroImage 282, November 2023, 120388.
  • 2 article: Michaël Allouche, Stéphane Girard and Emmanuel Gobet. EV-GAN: Simulation of extreme events with ReLU neural networks. Journal of Machine Learning Research 23(150), 2022, 1-39.
  • 3 article: Fabien Boux, Florence Forbes, Julyan Arbel, Benjamin Lemasson and Emmanuel L. Barbier. Bayesian inverse regression for vascular magnetic resonance fingerprinting. IEEE Transactions on Medical Imaging 40(7), July 2021, 1827-1837.
  • 4 article: Abdelaati Daouia, Stéphane Girard and Gilles Stupfler. Estimation of Tail Risk based on Extreme Expectiles. Journal of the Royal Statistical Society series B 80, 2018, 263-292.
  • 5 article: Antoine Deleforge, Florence Forbes and Radu Horaud. High-Dimensional Regression with Gaussian Mixtures and Partially-Latent Response Variables. Statistics and Computing, February 2014.
  • 6 article: Florence Forbes, Hien Duy Nguyen, Trung Tin Nguyen and Julyan Arbel. Summary statistics and discrepancy measures for ABC via surrogate posteriors. Statistics and Computing 32(85), 2022.
  • 7 article: Stéphane Girard, Gilles Claude Stupfler and Antoine Usseglio-Carleve. Extreme Conditional Expectile Estimation in Heavy-Tailed Heteroscedastic Regression Models. Annals of Statistics 49(6), December 2021, 3358-3382.
  • 8 article: Benjamin Lambert, Florence Forbes, Senan Doyle, Harmonie Dehaene and Michel Dojat. Trustworthy clinical AI solutions: A unified review of uncertainty quantification in Deep Learning models for medical image analysis. Artificial Intelligence in Medicine 150, April 2024, 102830.
  • 9 article: Hongliang Lu, Julyan Arbel and Florence Forbes. Bayesian nonparametric priors for hidden Markov random fields. Statistics and Computing 30, 2020, 1015-1035.
  • 10 inproceedings: Mariia Vladimirova, Jakob Verbeek, Pablo Mesejo and Julyan Arbel. Understanding Priors in Bayesian Neural Networks at the Unit Level. ICML 2019 - 36th International Conference on Machine Learning, Proceedings of Machine Learning Research 97, Long Beach, United States, June 2019, 6458-6467.

12.2 Publications of the year

International journals

Invited conferences

International peer-reviewed conferences

National peer-reviewed Conferences

Conferences without proceedings

Scientific book chapters

  • 60 inbook: Benjamin Lambert, Florence Forbes and Michel Dojat. From out-of-distribution detection to quality control. In: Trustworthy AI in Medical Imaging, Elsevier, 2025, 101-126.

Reports & preprints

Other scientific publications

Software

12.3 Cited publications

  • 83 inproceedings: Y. Bai, J.-B. Durand, G. Vincent and F. Forbes. Semantic segmentation of sparse irregular point clouds for leaf/wood discrimination. Advances in Neural Information Processing Systems 37 (NeurIPS 2023), New Orleans, United States, December 2023, 1-21.