## Section: New Results

### Dealing with uncertainties

#### Sensitivity Analysis

Participants : Eric Blayo, Laurent Gilquin, François-Xavier Le Dimet, Elise Arnaud, Maëlle Nodet, Clémentine Prieur, Laurence Viry.

##### Scientific context

Forecasting geophysical systems requires complex models, which sometimes need to be coupled, and which make use of data assimilation. The objective of this project is, for a given output of such a system, to identify the most influential parameters and to evaluate the effect of uncertainty in input parameters on model output. Existing stochastic tools are not well suited to high-dimensional problems (in particular time-dependent ones), while deterministic tools are fully applicable but only provide limited information. The challenge is therefore to gather expertise on the one hand in numerical approximation and control of Partial Differential Equations, and on the other hand in stochastic methods for sensitivity analysis, in order to design innovative stochastic solutions for the study of high-dimensional models and to propose new hybrid approaches combining stochastic and deterministic methods.

##### Sensitivity analysis with dependent inputs

An important challenge for stochastic sensitivity analysis is to develop methodologies which work for dependent inputs. For the moment, no conclusive results exist in that direction. Our aim is to define an analogue of the Hoeffding decomposition [54] in the case where input parameters are correlated. Clémentine Prieur supervised Gaëlle Chastaing's PhD thesis on the topic (defended in September 2013) [41]. We obtained first results [42], deriving a general functional ANOVA for dependent inputs and thereby defining new variance-based sensitivity indices for correlated inputs. We then adapted various algorithms for the estimation of these new indices. These algorithms assume that, among the potential interactions, only a few are significant. Two papers have recently been accepted [40], [43]. We also considered the estimation of Sobol' indices for groups of inputs, with a procedure based on replicated designs [52]. These indices provide information at the level of groups, and not at a finer level, but their interpretation is still rigorous.

Céline Helbert and Clémentine Prieur supervised the PhD thesis of Simon Nanty (funded by CEA Cadarache, and defended in October 2015). The subject of the thesis is the analysis of uncertainties for numerical codes with temporal and spatio-temporal input variables, with application to safety and impact calculation studies. This study involves dependent functional inputs. A first step was the modeling of these inputs [64]. The whole methodology proposed during the PhD is presented in [65].

More recently, the Shapley value, from econometrics, was proposed as an alternative to quantify the importance of random input variables to a function. Owen [67] derived Shapley value importance for independent inputs and showed that it is bracketed between two different Sobol' indices. Song et al. [72] recently advocated the use of the Shapley value for the case of dependent inputs. In a very recent work [13], in collaboration with Art Owen (Stanford University), we show that the Shapley value removes the conceptual problems of functional ANOVA for dependent inputs. We do this with some simple examples where the Shapley value leads to intuitively reasonable, nearly closed-form values. We also investigated further the properties of Shapley effects in [31].
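For the linear Gaussian case, the Shapley effects indeed admit nearly closed-form values. The sketch below (the function name and toy setup are ours, not taken from [13]) computes them exactly from the cost function c(J) = Var(E[Y | X_J]), which is available in closed form for Y = aᵀX with X Gaussian:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_effects_linear_gaussian(a, Sigma):
    """Shapley effects for the linear Gaussian model Y = a^T X, X ~ N(0, Sigma),
    using the closed-form cost function c(J) = Var(E[Y | X_J])."""
    a = np.asarray(a, float)
    Sigma = np.asarray(Sigma, float)
    d = len(a)
    total = a @ Sigma @ a  # Var(Y)

    def cost(J):
        J = list(J)
        if not J:
            return 0.0
        T = [k for k in range(d) if k not in J]
        if not T:
            return total
        # Var(E[Y|X_J]) = Var(Y) - a_T^T (Sigma_TT - Sigma_TJ Sigma_JJ^{-1} Sigma_JT) a_T
        C = Sigma[np.ix_(T, T)] - Sigma[np.ix_(T, J)] @ np.linalg.solve(
            Sigma[np.ix_(J, J)], Sigma[np.ix_(J, T)])
        return total - a[T] @ C @ a[T]

    sh = np.zeros(d)
    for i in range(d):
        others = [k for k in range(d) if k != i]
        for m in range(d):
            w = factorial(m) * factorial(d - m - 1) / factorial(d)
            for J in combinations(others, m):
                sh[i] += w * (cost(J + (i,)) - cost(J))
    return sh  # by construction, the effects sum to Var(Y)
```

For independent inputs the effects reduce to the individual variance contributions, while for correlated inputs they split the shared variance fairly between the variables involved (e.g. equally, in a symmetric two-input model).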

#### Non-Parametric Estimation for Kinetic Diffusions

Participants : Clémentine Prieur, Jose Raphael Leon Ramos.

This research is the subject of a collaboration with Chile and Uruguay. More precisely, the collaboration initially started with Venezuela; due to the crisis there, our main collaborator on this topic moved to Uruguay.

We are focusing our attention on models derived from the linear Fokker-Planck equation. From a probabilistic viewpoint, these models have received particular attention in recent years, since they are a basic example for hypercoercivity. In fact, even though completely degenerate, these models are hypoelliptic and still verify some coercivity properties, in a broad sense of the word. Such models often appear in mechanics, finance and even biology. For such models we believe it appropriate to build non-parametric statistical estimation tools. Initial results have been obtained for the estimation of the invariant density, under conditions guaranteeing its existence and uniqueness [37], and when only partial observational data are available. A paper on the non-parametric estimation of the drift has recently been accepted [38] (see Samson et al., 2012, for results on parametric models). As far as the estimation of the diffusion term is concerned, a paper has been accepted [38], in collaboration with J.R. Leon (Montevideo, Uruguay) and P. Cattiaux (Toulouse). Recursive estimators have also been proposed by the same authors in [39], also recently accepted. In a recent collaboration with Adeline Samson from the statistics department in the Lab, we considered adaptive estimation, that is, a data-driven procedure for the choice of the bandwidth parameters.
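As an illustration of this setting (and not of the estimators of [37], [38] themselves), a Gaussian-kernel estimate of the invariant density can be computed from a discretely observed path; the toy model and names below are ours, using an Ornstein-Uhlenbeck diffusion whose invariant density is known:

```python
import numpy as np

def ou_sample(n, dt=0.005, theta=1.0, sigma=np.sqrt(2.0), rng=None):
    """Euler scheme for the Ornstein-Uhlenbeck diffusion dX = -theta*X dt + sigma dW;
    its invariant density is N(0, sigma^2 / (2*theta))."""
    rng = np.random.default_rng(rng)
    x = np.empty(n)
    x[0] = 0.0
    noise = rng.standard_normal(n - 1)
    for k in range(n - 1):
        x[k + 1] = x[k] - theta * x[k] * dt + sigma * np.sqrt(dt) * noise[k]
    return x

def kernel_invariant_density(sample, grid, h=None):
    """Gaussian-kernel estimator of the invariant density at the grid points."""
    if h is None:  # Silverman's rule-of-thumb bandwidth
        h = 1.06 * sample.std() * len(sample) ** (-1 / 5)
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))
```

With theta = 1 and sigma = sqrt(2), the invariant density is standard normal, so the estimate can be checked against the N(0, 1) density; the path observations are correlated, which inflates the estimator's variance compared to i.i.d. sampling but does not break its consistency.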

In [5], we focused on damping Hamiltonian systems under the so-called fluctuation-dissipation condition. Ideas from that paper were re-used with applications to neuroscience in [63].

Note that Professor Jose R. Leon (Caracas, Venezuela and Montevideo, Uruguay) was funded by an international Inria Chair, allowing further collaboration on parameter estimation.

We recently proposed a paper on the use of the Euler scheme for inference purposes, considering reflected diffusions. This paper could be extended to the hypoelliptic framework.
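For reflected diffusions, the Euler scheme must keep the iterates inside the domain; a minimal sketch (our own illustration, not the scheme of the paper) is the symmetrized scheme on [0, ∞), which reflects any negative Euler step back across the boundary:

```python
import numpy as np

def reflected_euler(n, dt=1e-3, mu=-1.0, sigma=1.0, x0=1.0, rng=None):
    """Symmetrized Euler scheme for a drifted Brownian motion reflected at 0:
    after each Euler step, a negative value is reflected back into [0, inf)."""
    rng = np.random.default_rng(rng)
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        step = x[k] + mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = abs(step)  # reflection at the boundary 0
    return x
```

With a negative drift (mu = -1 here), the path is pushed towards the boundary, so the reflection mechanism is actually exercised along the trajectory.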

We started a collaboration with Karine Bertin (Valparaiso, Chile) funded by a MATHAMSUD project. We are interested in new adaptive estimators for invariant densities on bounded domains, and would like to extend that results to hypo-elliptic diffusions.

#### Multivariate Risk Indicators

Participants : Clémentine Prieur, Patricia Tencaliec.

Studying risks in a spatio-temporal context is a very broad field of research and one that lies at the heart of current concerns at a number of levels (hydrological risk, nuclear risk, financial risk etc.). Stochastic tools for risk analysis must be able to provide a means of determining both the intensity and probability of occurrence of damaging events such as e.g. extreme floods, earthquakes or avalanches. It is important to be able to develop effective methodologies to prevent natural hazards, including e.g. the construction of barrages.

Different risk measures have been proposed in the one-dimensional framework. The most classical ones are the return level (equivalent to the Value at Risk in finance) and the mean excess function (equivalent to the Conditional Tail Expectation, CTE). However, most of the time there are multiple risk factors, whose dependence structure has to be taken into account when designing suitable risk estimators. Relatively recent regulation (such as Basel II for banks or Solvency II for insurance) has been a strong driver for the development of realistic spatio-temporal dependence models, as well as for the development of multivariate risk measures that effectively account for these dependencies.
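In the one-dimensional case, both classical measures have simple empirical counterparts; the following sketch (function names are ours) estimates them from a sample:

```python
import numpy as np

def return_level(sample, T):
    """Empirical T-observation return level: the quantile exceeded on average
    once every T observations (the analogue of the Value at Risk)."""
    return np.quantile(sample, 1.0 - 1.0 / T)

def conditional_tail_expectation(sample, alpha):
    """Empirical CTE at level alpha: mean of the observations above the
    alpha-quantile (the analogue of the mean excess / expected shortfall)."""
    q = np.quantile(sample, alpha)
    return sample[sample >= q].mean()
```

For a unit exponential distribution, the T-observation return level is log(T) and, by memorylessness, the CTE at level alpha is the alpha-quantile plus one, which gives a quick sanity check.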

We refer to [44] for a review of recent extensions of the notion of return level to the multivariate framework. In the context of environmental risk, [71] proposed a generalization of the concept of return period in dimension greater than or equal to two. Michele et al. proposed in a recent study [45] to take into account the duration, and not only the intensity, of an event for designing what they call the dynamic return period. However, few studies address the issues of statistical inference in the multivariate context. In [46], [48], we proposed non-parametric estimators of a multivariate extension of the CTE. As might be expected, the properties of these estimators deteriorate when considering extreme risk levels. In collaboration with Elena Di Bernardino (CNAM, Paris), Clémentine Prieur is working on the extrapolation of the above results to extreme risk levels [29].

Elena Di Bernardino, Véronique Maume-Deschamps (Univ. Lyon 1) and Clémentine Prieur also derived an estimator for the bivariate tail distribution [47]. The study of tail behavior is of great importance for risk assessment.

With Anne-Catherine Favre (LTHE, Grenoble), Clémentine Prieur supervised the PhD thesis of Patricia Tencaliec. We are working on risk assessment, concerning flood data for the Durance drainage basin (France). The PhD thesis started in October 2013 and was defended in February 2017. A first paper on data reconstruction has been accepted [73]. It was a necessary step, as the initial series contained many missing values. A second paper is in revision, considering the modeling of precipitation amounts with semi-parametric models that capture both the bulk of the distribution and the tails, while avoiding the arbitrary choice of a threshold. We work in collaboration with Philippe Naveau (LSCE, Paris).
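One simple threshold-free construction of this bulk-and-tail kind is the extended generalized Pareto family, in which a GPD distribution function is composed with a power transform; the sketch below is an illustrative assumption on our part, not necessarily the model of the paper in revision:

```python
import numpy as np

def egpd_cdf(x, kappa, sigma, xi):
    """CDF of a simple threshold-free bulk-and-tail model for positive amounts:
    F(x) = H(x)^kappa, with H a generalized Pareto CDF of scale sigma and
    shape xi > 0. kappa controls the lower tail, xi the upper tail, and no
    threshold has to be selected."""
    H = 1.0 - (1.0 + xi * np.asarray(x, float) / sigma) ** (-1.0 / xi)
    return H ** kappa
```

For kappa = 1 the model reduces exactly to the generalized Pareto distribution, so the upper-tail behaviour of extreme-value theory is preserved while the bulk is modelled jointly.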

#### Extensions of the replication method for the estimation of Sobol' indices

Participants : Elise Arnaud, Laurent Gilquin, Clémentine Prieur.

Sensitivity analysis studies how the uncertainty on an output of a mathematical model can be attributed to sources of uncertainty among the inputs. Global sensitivity analysis of complex and expensive mathematical models is a common practice to identify influential inputs and detect potential interactions between them. Among the large number of available approaches, the variance-based method introduced by Sobol' makes it possible to calculate sensitivity indices called Sobol' indices. Each index estimates the influence of an individual input or a group of inputs, i.e. how the output uncertainty can be apportioned to the uncertainty in the inputs. One can distinguish first-order indices, which estimate the main effect of each input or group of inputs, from higher-order indices, which estimate the corresponding order of interactions between inputs. The estimation procedure requires a significant number of model runs, a number that grows polynomially with the input space dimension. This cost can be prohibitive for time-consuming models, and a small number of runs is not enough to retrieve accurate information about the model inputs.
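To make the dimension-dependent cost concrete, here is a sketch (names and estimator choice are ours) of a classical pick-freeze scheme for first-order indices, which requires n·(d + 2) model runs for d inputs:

```python
import numpy as np

def sobol_pick_freeze(model, d, n, rng=None):
    """First-order Sobol' indices by a classical pick-freeze scheme:
    the cost, n*(d+2) model runs, grows with the input dimension d."""
    rng = np.random.default_rng(rng)
    A = rng.random((n, d))
    B = rng.random((n, d))
    yA, yB = model(A), model(B)
    var = np.var(np.concatenate([yA, yB]))
    S = np.empty(d)
    for i in range(d):
        ABi = B.copy()
        ABi[:, i] = A[:, i]                       # freeze input i at A's values
        yABi = model(ABi)
        S[i] = np.mean(yA * (yABi - yB)) / var    # Var(E[Y|X_i]) / Var(Y)
    return S
```

On the linear model Y = X₁ + 2X₂ with independent uniform inputs, the exact first-order indices are 1/5 and 4/5, which the estimator recovers up to Monte Carlo error.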

The use of replicated designs to estimate first-order Sobol' indices has the major advantage of drastically reducing the estimation cost, as the number of runs n becomes independent of the input space dimension. The generalization to closed second-order Sobol' indices relies on the replication of randomized orthogonal arrays. However, if the input space is not properly explored, that is if n is too small, the Sobol' index estimates may not be accurate enough. Gaining in efficiency and assessing the precision of the estimates remains an issue, all the more important when one is dealing with a limited computational budget.
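A minimal sketch of the replication idea for first-order indices (our own simplified illustration, using plain random designs rather than the stratified designs of the method): two designs sharing the same column values, with rows scrambled independently per column, yield all d first-order indices from only 2n model runs.

```python
import numpy as np

def replicated_first_order(model, d, n, rng=None):
    """All first-order Sobol' indices from two replicated designs:
    2*n model runs in total, independent of the dimension d."""
    rng = np.random.default_rng(rng)
    A = rng.random((n, d))
    # Replicated design: same column values as A, rows scrambled per column.
    B = np.column_stack([rng.permutation(A[:, j]) for j in range(d)])
    yA, yB = model(A), model(B)
    mu = 0.5 * (yA.mean() + yB.mean())
    var = 0.5 * (yA.var() + yB.var())
    S = np.empty(d)
    for i in range(d):
        # Sorting both designs by column i pairs runs that share the same
        # value of X_i while the other inputs are rescrambled.
        pA, pB = np.argsort(A[:, i]), np.argsort(B[:, i])
        S[i] = (np.mean(yA[pA] * yB[pB]) - mu**2) / var
    return S
```

Note that estimating the indices for an additional input costs no extra model run, only a re-pairing of the two existing output samples; this is what makes the replication method attractive for expensive models.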

We designed approaches to render the replication method recursive, enabling the required number of evaluations to be controlled. With these approaches, more accurate Sobol' estimates are obtained while recycling previous sets of model evaluations. The estimation procedure is therefore stopped when the convergence of the estimates is considered to be reached. One of these approaches corresponds to a recursive version of the replication method and is based on the iterative construction of stratified designs, Latin hypercubes and orthogonal arrays [50]. A second approach combines the use of quasi-Monte Carlo sampling and the construction of a new stopping criterion [9], [32].

In [30], we proposed a new strategy to estimate the full set of first-order and second-order Sobol' indices with only two replicated designs based on orthogonal arrays of strength two. Such a procedure increases the precision of the estimation for a given computational budget. A bootstrap procedure for producing confidence intervals, which are compared to asymptotic ones in the case of first-order indices, is also proposed.

#### Parameter control in presence of uncertainties: robust estimation of bottom friction

Participants : Victor Trappler, Elise Arnaud, Laurent Debreu, Arthur Vidard.

Many physical phenomena are modelled numerically in order to better understand and/or predict their behaviour. However, some complex and small-scale phenomena cannot be fully represented in the models. The introduction of ad hoc correcting terms can represent these unresolved processes, but they need to be properly estimated.

A good example of this type of problem is the estimation of the bottom friction parameters of the ocean floor. This is important because bottom friction affects the general circulation. This is particularly the case in coastal areas, especially through its influence on wave breaking. Because of its strong spatial disparity, it is impossible to estimate the bottom friction by direct observation, so it has to be estimated indirectly, by observing its effects on surface movement. This task is further complicated by the presence of uncertainty in other characteristics linking the bottom and the surface (e.g. boundary conditions). The techniques currently used to adjust these settings are very basic and do not take these uncertainties into account, thereby increasing the error in the estimate.

Classical methods of parameter estimation usually imply the minimisation of an objective function that measures the error between some observations and the results obtained by a numerical model. In the presence of uncertainties, the minimisation is not straightforward, as the output of the model depends both on those uncontrolled inputs and on the control parameter. That is why we aim at minimising the objective function in such a way that the estimate of the control parameter is robust to the uncertainties. In this work, a toy model of a coastal area has been designed and implemented. The control parameter is the bottom friction, upon which classical methods of estimation are applied in a simulation-reestimation experiment. The model is then modified to include uncertainties on the boundary conditions in order to apply robust control methods. First, a sensitivity analysis of the objective function was performed to assess the influence of each considered variable. Then, a study of the meaning of different concepts of robustness was carried out. Typically, one seeks an optimal parameter set that minimises the variance or the mean of the original objective function. Various associated algorithms from the literature have been implemented. They all rely on surrogate models and black-box optimisation techniques to solve this estimation problem.
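The mean- and variance-based notions of robustness mentioned above can be sketched with a simple grid search (all names and the toy objective are our own illustration, not the project's surrogate-based algorithms): for each candidate control parameter, the objective is averaged over samples of the uncertain input, and the robust estimate minimises that averaged criterion.

```python
import numpy as np

def robust_estimate(objective, k_grid, u_samples, criterion="mean"):
    """Grid-search sketch of robust parameter estimation: for each candidate
    control parameter k, aggregate the objective over samples of the uncertain
    input u, then minimise the aggregate over k."""
    scores = []
    for k in k_grid:
        vals = np.array([objective(k, u) for u in u_samples])
        if criterion == "mean":
            scores.append(vals.mean())
        else:  # 'mean_std': also penalise the variability of the objective
            scores.append(vals.mean() + vals.std())
    return k_grid[int(np.argmin(scores))]
```

As a toy analogue of the friction misfit, take J(k, u) = (k - (2 + u))² with an uncertain boundary-condition-like perturbation u ~ N(0, 0.3²): the robust minimiser of the mean criterion is the nominal value k = 2, since the variance of u only shifts the objective by a constant.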

#### Sensitivity of a floating offshore wind turbine to uncertain parameters

Participants : Adrien Hirvoas, Elise Arnaud, Clémentine Prieur, Arthur Vidard.

In a fast-changing energy context, marine renewable energies in general, and floating offshore wind energy in particular, are a promising source of energy in France and abroad. The design of these structures is made in a specific regulated framework related to their environment. Floating offshore wind turbines are subjected to various continuous environmental loadings (wind, current, swell), which generate solicitations and fatigue in some components. Fatigue lifetime is estimated with a dedicated software package that performs coupled multi-physics simulations of the system (hydrodynamics, aerodynamics, mechanics and control). The inputs of these simulations necessarily include uncertainties, regarding the environmental loadings as well as the physical parameters of the models. These uncertainties can influence the simulated behaviour of the system. The core of this work consists in conducting a sensitivity analysis to assess how the uncertainty on an output of the model can be attributed to sources of uncertainty among the inputs. The approach considered is based on the calculation of Sobol' indices with the FAST method and a Kriging meta-model. These indices are used to evaluate to what extent an input or group of inputs is responsible for the output variance. The perspectives of this study are to understand what kind of measurements could be of interest to properly estimate the sensitive parameters, and where these measurements should be monitored on the structure. Such an estimation will be performed with data assimilation approaches, which optimally combine numerical models and physical observations. This work is done in collaboration with IFPEN.
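The FAST approach samples all inputs along a single curve driven by distinct integer frequencies and reads each first-order index off the Fourier spectrum of the output; the following minimal sketch (frequencies, model and function names are our own illustrative choices, not the project's software) shows the principle for uniform inputs:

```python
import numpy as np

def fast_first_order(model, omegas, n=2048, harmonics=4):
    """First-order Sobol' indices by the FAST method: each input is assigned an
    integer frequency, all inputs are sampled along one search curve, and the
    variance share of input i is the spectral power at the harmonics of its
    frequency."""
    omegas = np.asarray(omegas)
    s = np.linspace(-np.pi, np.pi, n, endpoint=False)
    # Search curve: each input is a triangle wave in s, uniform on (0, 1).
    X = 0.5 + np.arcsin(np.sin(np.outer(omegas, s))) / np.pi
    y = model(X)
    yc = y - y.mean()
    V = yc.var()
    S = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        Vi = 0.0
        for p in range(1, harmonics + 1):  # power at the harmonics of omega_i
            A = (yc * np.cos(p * w * s)).mean()
            B = (yc * np.sin(p * w * s)).mean()
            Vi += 2.0 * (A * A + B * B)
        S[i] = Vi / V
    return S
```

The frequencies must be chosen free of interference up to the retained harmonic order (here 11, 35 and 67 have no common low-order multiples), and n must exceed twice the highest harmonic actually probed.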

#### Uncertainty and robustness analysis for models with functional input/output

Participants : Mohammed Reda El Amri, Clémentine Prieur.

Numerical models are commonly used to study physical phenomena. They involve many input parameters, and potentially provide a large number of quantities of interest as outputs. Practitioners are not only interested in the response of their model for a given set of inputs (forward problem) but also in recovering the set of input values leading to a prescribed value or range for the quantity of interest (inversion problem). In collaboration with IFP Energies nouvelles, we develop data-driven strategies for robust inversion under functional uncertainties. Reda El Amri's PhD thesis aims at developing such tools, with application to pollutant emission control.
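The inversion problem can be sketched in its most basic Monte-Carlo form (a deliberately naive illustration with our own names, far from the data-driven strategies of the thesis, which must handle expensive models and functional uncertainties): sample the input space and keep the inputs whose output falls in the prescribed target range.

```python
import numpy as np

def invert_by_sampling(model, bounds, target, n=100000, rng=None):
    """Monte-Carlo sketch of the inversion problem: sample the input box
    uniformly and keep the inputs whose output lies in the target interval."""
    rng = np.random.default_rng(rng)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    X = lo + (hi - lo) * rng.random((n, len(lo)))
    y = model(X)
    mask = (target[0] <= y) & (y <= target[1])
    return X[mask]  # an empirical description of the admissible input set
```

For example, with the model f(x) = x₁² + x₂² on [-1, 1]² and the target range [0, 0.25], the recovered set is the disc of radius 0.5, and the fraction of retained samples approaches its relative area π/16.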