The general scope of the AIRSEA project-team is to develop mathematical and computational methods for the modeling of oceanic and atmospheric flows.
The mathematical tools used involve both deterministic and statistical approaches. The main research topics cover a) modeling and coupling; b) model reduction for sensitivity analysis, coupling and multiscale optimization; c) sensitivity analysis, parameter estimation and risk assessment; and d) algorithms for high performance computing. The range of applications extends from climate modeling to the prediction of extreme events.

Recent events have raised questions regarding the social and economic implications of anthropic alterations of the Earth system, i.e. climate change and the associated risks of increasingly frequent extreme events. The ocean and the atmosphere, coupled with other components (continents and ice), are the building blocks of the Earth system. A better understanding of the ocean-atmosphere system is a key ingredient for improving the prediction of such events. Numerical models are essential tools for understanding processes, and for simulating and forecasting events at various space and time scales. Geophysical flows generally have a number of characteristics that make them difficult to model, which justifies the development of specifically adapted mathematical methods.

Our scientific objectives are divided into four major points. The first objective focuses on developing advanced mathematical methods for both the ocean and atmosphere, and the coupling of these two components. The second objective is to investigate the derivation and use of model reduction to face problems associated with the numerical cost of our applications. The third objective is directed toward the management of uncertainty in numerical simulations. The last objective deals with efficient numerical algorithms for new computing platforms. As mentioned above, the targeted applications cover oceanic and atmospheric modeling and related extreme events using a hierarchy of models of increasing complexity.

Current numerical oceanic and atmospheric models suffer from a number of well-identified problems. These problems are mainly related to the lack of horizontal and vertical resolution, thus requiring the parameterization of unresolved (subgrid-scale) processes, and to the control of discretization errors in order to fulfill criteria related to the particular underlying physics of rotating and strongly stratified flows. Coupled ocean-atmosphere models are increasingly used in a wide range of applications from global to regional scales. Assessing the reliability of those coupled models is an emerging topic, as the spread among the solutions of existing models (e.g., for climate change predictions) has not been reduced in the new generation of models compared to the older ones.

Advanced methods for modeling 3D rotating and stratified flows
The continuous increase of computational power and the resulting finer grid resolutions have triggered renewed interest in numerical methods and their relation to physical processes. Going beyond present knowledge requires a better understanding of numerical dispersion/dissipation ranges and their connection to model fine scales. Removing the leading-order truncation error of numerical schemes is thus an active topic of research, and each mathematical tool has to be adapted to the characteristics of three-dimensional stratified and rotating flows. Studying the link between discretization errors and subgrid-scale parameterizations is also arguably one of the main challenges.
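The connection between truncation error and spurious dissipation can be made concrete with a minimal sketch (our own illustration, not AIRSEA code): a modified-equation analysis shows that the first-order upwind scheme for linear advection behaves, to leading order, like an advection-diffusion equation with numerical viscosity a*dx*(1-C)/2, where C is the Courant number. Advecting a Gaussian once around a periodic domain exhibits exactly this damping:

```python
import numpy as np

def upwind_advection(u0, a, dx, dt, nsteps):
    """Advect u0 (a > 0) with first-order upwind on a periodic grid.

    The modified equation of the scheme is, to leading order,
    u_t + a u_x = nu_num u_xx with nu_num = a*dx*(1 - a*dt/dx)/2:
    the truncation error acts as a spurious diffusion.
    """
    u = u0.copy()
    c = a * dt / dx  # Courant number, must satisfy 0 <= c <= 1
    for _ in range(nsteps):
        u = u - c * (u - np.roll(u, 1))
    return u

# Gaussian bump advected once around a periodic domain of length 1
n, a = 200, 1.0
x = np.arange(n) / n
dx = 1.0 / n
dt = 0.5 * dx / a                 # Courant number 0.5
u0 = np.exp(-200.0 * (x - 0.5) ** 2)
u1 = upwind_advection(u0, a, dx, dt, nsteps=2 * n)  # one full period

# The bump returns to its initial position but the peak has been damped
# by the numerical diffusion, while total mass is conserved exactly.
```

With these parameters the predicted numerical viscosity broadens the Gaussian noticeably after a single period, which is the kind of behavior one must understand before connecting discretization errors to subgrid-scale parameterizations.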

The complexity of the geometry, boundary layers, strong stratification and lack of resolution are the main sources of discretization errors in the numerical simulation of geophysical flows. This emphasizes the importance of the definition of the computational grids (and coordinate systems), both in the horizontal and vertical directions, and the necessity of truly multiresolution approaches. At the same time, the effect of small-scale dynamics on the large-scale circulation has to be taken into account. Such parameterizations may be of deterministic as well as stochastic nature, and both approaches are pursued by the AIRSEA team. The design of numerical schemes consistent with the parameterizations is also arguably one of the main challenges for the coming years. This work is complementary to, and linked with, the work on parameter estimation described below.

Ocean Atmosphere interactions and formulation of coupled models
State-of-the-art climate models (CMs) are complex systems under continuous development. A fundamental aspect of climate modeling is the representation of air-sea interactions. This covers a large range of issues: parameterizations of atmospheric and oceanic boundary layers, estimation of air-sea fluxes, time-space numerical schemes, nonconforming grids, coupling algorithms, etc. Many developments related to these different aspects were carried out over the last 10-15 years, but were in general conducted independently of each other.

The aim of our work is to revisit and enrich several aspects of the representation of air-sea interactions in CMs, paying special attention to their overall consistency with appropriate mathematical tools. We intend to work consistently on the physics and the numerics. Using the theoretical framework of global-in-time Schwarz methods, our aim is to analyze the mathematical formulation of the parameterizations from a coupling perspective. From this study, we expect improved predictability in coupled models (this aspect will be studied using techniques described in ). Complementary work on space-time nonconformities and on the acceleration of convergence of Schwarz-like iterative methods (see ) is also conducted.

The high computational cost of the applications is a common and major concern to have in mind when deriving new methodological approaches. This cost increases dramatically with the use of sensitivity analysis or parameter estimation methods, and more generally with methods that require a potentially large number of model integrations.

Dimension reduction, using either stochastic or deterministic methods, is a way to significantly reduce the number of degrees of freedom, and therefore the computation time, of a numerical model.

Model reduction
Reduction methods can be deterministic (proper orthogonal decomposition, other reduced bases) or stochastic (polynomial chaos, Gaussian processes, kriging), and both fields of research are very active. Choosing one method over another strongly depends on the targeted application, which can be as varied as real-time computation, sensitivity analysis (see e.g., section ) or optimisation for parameter estimation (see below).

Our goals are multiple, but they share a common need for certified error bounds on the output. Our team has a 4-year history of working on certified reduction methods and has a unique positioning at the interface between deterministic and stochastic approaches. Thus, it seems interesting to conduct a thorough comparison of the two alternatives in the context of sensitivity analysis. Efforts will also be directed toward the development of efficient greedy algorithms for the reduction, and the derivation of goal-oriented sharp error bounds for nonlinear models and/or nonlinear outputs of interest. This will be complementary to our work on the deterministic reduction of the parametrized viscous Burgers and shallow water equations, where the objective is to obtain sharp error bounds providing confidence intervals for the estimation of sensitivity indices.
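The central mechanism behind deterministic reduction and its certified error bounds can be sketched in a few lines (a generic textbook illustration with synthetic snapshots, not the team's code): proper orthogonal decomposition (POD) is a truncated SVD of a snapshot matrix, and the Frobenius-norm truncation error is exactly the energy of the discarded singular values, which is the kind of computable bound certified methods exploit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic snapshot matrix: a parametrized field sampled at 50 parameter
# values, each snapshot living in R^200 (a stand-in for model solutions)
x = np.linspace(0.0, 1.0, 200)
params = rng.uniform(0.5, 2.0, size=50)
snapshots = np.stack(
    [np.sin(np.pi * p * x) + 0.1 * p * x**2 for p in params], axis=1)

# POD basis = left singular vectors of the snapshot matrix
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)

def pod_reconstruct(S, U, r):
    """Project the snapshots onto the first r POD modes and lift back."""
    Ur = U[:, :r]
    return Ur @ (Ur.T @ S)

# Certified bound: the Frobenius reconstruction error equals the energy
# of the discarded singular values (Eckart-Young)
errs, bounds = [], []
for r in (1, 2, 4):
    errs.append(np.linalg.norm(snapshots - pod_reconstruct(snapshots, U, r)))
    bounds.append(np.sqrt(np.sum(s[r:] ** 2)))
```

In practice the greedy algorithms and goal-oriented bounds mentioned above refine this basic picture, but the quantity `bounds[r]` is the prototype of the computable certificates used to build confidence intervals.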

Reduced models for coupling applications
Global and regional high-resolution oceanic models are either coupled to an atmospheric model or forced at the air-sea interface by fluxes computed empirically, preventing proper physical feedback between the two media. High-resolution observational studies have highlighted the existence of air-sea interactions at oceanic mesoscales (i.e., at horizontal scales of the order of 10-100 km).

Multiphysics coupling often requires iterative methods to obtain a mathematically correct numerical solution. To mitigate the cost of the iterations, we will investigate the possibility of using reduced-order models for the iterative process. We will consider different ways of deriving a reduced model: coarsening of the resolution, degradation of the physics and/or numerical schemes, or simplification of the governing equations. At a mathematical level, we will strive to study the well-posedness and the convergence properties when reduced models are used. Indeed, running an atmospheric model at the same resolution as the ocean model is generally too expensive to be manageable, even for moderate resolution applications. To account for important fine-scale interactions in the computation of the air-sea boundary condition, the objective is to derive a simplified boundary layer model that is able to represent important 3D turbulent features in the marine atmospheric boundary layer.

Reduced models for multiscale optimization
The field of multigrid methods for optimization has seen tremendous development over the past few decades. However, it has not been applied to oceanic and atmospheric problems, apart from some crude (non-converging) approximations or applications to simplified and low-dimensional models. This is mainly due to the high complexity of such models and to the difficulty of handling several grids at the same time. Moreover, due to complex boundaries and physical phenomena, the grid interactions and transfer operators are non-trivial to define.

Multigrid solvers (or multigrid preconditioners) are efficient methods for the solution of variational data assimilation problems. We would like to take advantage of these methods to tackle optimization problems in high-dimensional spaces. A high-dimensional control space arises when estimating parameter fields, or when controlling the full 4D (space-time) trajectory; the latter is important since it makes it possible to account for model errors. In that case, multigrid methods can be used to solve the large scales of the problem at a lower cost, potentially coupled with a scale decomposition of the variables themselves.
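The core multigrid idea, correcting the large scales of the problem on a coarser and much cheaper grid, can be illustrated with a standard two-grid cycle for a 1D Poisson problem (a generic textbook sketch, not the team's data assimilation code; in the variational setting the same cycle acts on the optimality system):

```python
import numpy as np

def apply_A(v, h):
    """Tridiagonal 1D Laplacian (-u'') with homogeneous Dirichlet BCs."""
    Av = (2.0 * v - np.roll(v, 1) - np.roll(v, -1)) / h**2
    Av[0] = (2.0 * v[0] - v[1]) / h**2
    Av[-1] = (2.0 * v[-1] - v[-2]) / h**2
    return Av

def two_grid_cycle(u, f, h, nu=2, omega=2.0 / 3.0):
    """One two-grid cycle: smooth, solve the large scales on a grid of
    spacing 2h (exactly here), interpolate the correction, smooth again."""
    def smooth(v, k):
        for _ in range(k):                      # weighted Jacobi sweeps
            v = v + omega * (h**2 / 2.0) * (f - apply_A(v, h))
        return v
    u = smooth(u, nu)                           # pre-smoothing
    r = f - apply_A(u, h)
    rc = 0.25 * r[0:-2:2] + 0.5 * r[1::2] + 0.25 * r[2::2]  # full weighting
    nc = rc.size
    Ac = (np.diag(2.0 * np.ones(nc)) - np.diag(np.ones(nc - 1), 1)
          - np.diag(np.ones(nc - 1), -1)) / (2.0 * h) ** 2
    ec = np.linalg.solve(Ac, rc)                # exact coarse-grid solve
    e = np.zeros_like(u)                        # linear interpolation
    e[1::2] = ec
    e[0], e[-1] = 0.5 * ec[0], 0.5 * ec[-1]
    e[2:-1:2] = 0.5 * (ec[:-1] + ec[1:])
    return smooth(u + e, nu)                    # post-smoothing

# -u'' = pi^2 sin(pi x) on (0,1): exact solution sin(pi x)
n, h = 63, 1.0 / 64.0
x = (np.arange(n) + 1) * h
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
res_norms = [np.linalg.norm(f - apply_A(u, h))]
for _ in range(8):
    u = two_grid_cycle(u, f, h)
    res_norms.append(np.linalg.norm(f - apply_A(u, h)))
```

The residual is reduced by a large factor per cycle because the smoother handles the fine scales and the coarse solve handles the large ones, which is precisely the division of labor we want to exploit in high-dimensional control spaces.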

There are many sources of uncertainties in numerical models. They are due to imperfect external forcing, poorly known parameters, missing physics and discretization errors. Studying these uncertainties and their impact on the simulations is a challenge, mostly because of the high dimensionality and non-linear nature of the systems. To deal with these uncertainties we work on three axes of research, which are linked: sensitivity analysis, parameter estimation and risk assessment. They are based on either stochastic or deterministic methods.

Sensitivity analysis
Sensitivity analysis (SA), which links uncertainty in the model inputs to uncertainty in the model outputs, is a powerful tool for model design and validation. First, it can be a pre-stage for parameter estimation (see ), allowing for the selection of the most significant parameters. Second, SA permits understanding and quantifying (possibly nonlinear) interactions induced by the different processes defining, e.g., realistic ocean-atmosphere models. Finally, SA allows for the validation of models, by checking that the estimated sensitivities are consistent with what is expected from the theory.
For ocean, atmosphere and coupled systems, only first-order deterministic SA is usually performed, neglecting the initialization process (data assimilation). AIRSEA members and collaborators proposed using second-order information to provide consistent sensitivity measures, but so far this has only been applied to simple academic systems. Metamodels are now commonly used, due to the cost induced by each evaluation of complex numerical models: mostly Gaussian processes, whose probabilistic framework allows for the development of specific adaptive designs, and polynomial chaos, not only in the context of intrusive Galerkin approaches but also in a black-box approach. Until recently, global SA was based primarily on a set of engineering practices. New mathematical and methodological developments have led to the numerical computation of Sobol' indices, with confidence intervals accounting for both metamodel and estimation errors. Approaches have also been extended to the case of dependent inputs, functional inputs and/or outputs, and stochastic numerical codes. Other types of indices and generalizations of Sobol' indices have also been introduced.
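For readers unfamiliar with Sobol' indices, the standard pick-freeze (Saltelli) Monte Carlo estimator of first-order indices can be sketched on the classical Ishigami benchmark, whose indices are known analytically (S1 ≈ 0.31, S2 ≈ 0.44, S3 = 0). This is a generic illustration, not the team's adaptive-design machinery:

```python
import numpy as np

rng = np.random.default_rng(42)
N, d = 20000, 3
a, b = 7.0, 0.1

def ishigami(X):
    """Ishigami test function, a standard SA benchmark with known indices."""
    return (np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2
            + b * X[:, 2] ** 4 * np.sin(X[:, 0]))

# Two independent input samples on [-pi, pi]^3 (the "pick-freeze" design)
A = rng.uniform(-np.pi, np.pi, (N, d))
B = rng.uniform(-np.pi, np.pi, (N, d))
fA, fB = ishigami(A), ishigami(B)
var = np.var(np.concatenate([fA, fB]))

S = np.empty(d)
for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]            # replace only the i-th input column
    fABi = ishigami(ABi)
    # Saltelli/pick-freeze estimator of the first-order Sobol' index
    S[i] = np.mean(fB * (fABi - fA)) / var
```

In realistic applications, each `ishigami` call would be a costly model run, which is exactly why the metamodels discussed above are substituted for the model before such estimators are applied.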

Concerning the stochastic approach to SA, we plan to work with parameters that show spatio-temporal dependencies and to continue toward more realistic applications where the input space is of huge dimension with highly correlated components. Sensitivity analysis for dependent inputs also introduces new challenges. In our applicative context, it would seem prudent to carefully learn the spatio-temporal dependencies before running a global SA. In the deterministic framework we focus on second-order approaches where the sought sensitivities are related to the optimality system rather than to the model; i.e., we consider the whole forecasting system (model plus initialization through data assimilation).

All these methods allow for computing sensitivities and more importantly a posteriori error statistics.

Parameter estimation
Advanced parameter estimation methods are barely used in ocean, atmosphere and coupled systems, mostly due to the difficulty of deriving adequate response functions, a lack of knowledge of these methods in the ocean-atmosphere community, and also to the huge associated computing costs. In the presence of strong uncertainties on the model, but also on parameter values, simulation and inference are closely associated. Filtering for data assimilation and Approximate Bayesian Computation (ABC) are two examples of such an association.
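The ABC idea can be conveyed with a minimal rejection sampler on a toy simulator (our illustration with made-up data; real applications replace the one-line simulator with an expensive geophysical model and use richer summary statistics):

```python
import numpy as np

rng = np.random.default_rng(1)

# "Observed" data: we pretend only the simulator below is available,
# with unknown drift parameter mu (here secretly mu_true = 0.7)
mu_true, n_obs = 0.7, 200
observed = mu_true + 0.5 * rng.standard_normal(n_obs)

def simulate(mu):
    """Forward model: stands in for an expensive geophysical simulator."""
    return mu + 0.5 * rng.standard_normal(n_obs)

def summary(data):
    return data.mean()             # low-dimensional summary statistic

# ABC rejection: keep prior draws whose simulated summary is close to
# the observed one.  The tolerance trades bias against acceptance rate.
prior_draws = rng.uniform(-2.0, 2.0, 5000)
tol = 0.05
accepted = [mu for mu in prior_draws
            if abs(summary(simulate(mu)) - summary(observed)) < tol]
posterior_mean = np.mean(accepted)
```

The accepted draws form an approximate posterior sample without ever evaluating a likelihood, which is the property that makes ABC attractive when only a simulator is available.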

The stochastic approach can be compared with the deterministic approach, which makes it possible to determine the sensitivity of the flow to parameters and to optimize their values using data assimilation. This approach has already been shown to be capable of selecting a reduced space of the most influential parameters in the local parameter space, and of adapting their values so as to correct errors committed by the numerical approximation. It relies on automatic differentiation of the source code with respect to the model parameters, and on optimization of the resulting raw code.

AIRSEA assembles all the required expertise to tackle these difficulties. As mentioned previously, the choice of parameterization schemes and their tuning has a significant impact on the result of model simulations. Our research will focus on parameter estimation for parameterized Partial Differential Equations (PDEs) and also for parameterized Stochastic Differential Equations (SDEs). Deterministic approaches are based on optimal control methods and are local in the parameter space (i.e., the result depends on the starting point of the estimation), but thanks to adjoint methods they can cope with a large number of unknowns that can also vary in space and time. Multiscale optimization techniques as described in will be one of the tools used. This in turn can be used either to propose a better (and smaller) parameter set or as a criterion for discriminating parameterization schemes. Statistical methods are global in the parameter space but may suffer from the curse of dimensionality. However, the notion of parameter can also be extended to functional parameters: we may consider as a parameter a functional entity such as a time-dependent boundary condition, or a probability density function in a stationary regime. For these purposes, non-parametric estimation will also be considered as an alternative.

Risk assessment
Risk assessment in the multivariate setting suffers from a lack of consensus on the choice of indicators. Moreover, once the indicators are designed, estimation procedures that remain efficient even for high risk levels still have to be developed. Recent developments for the assessment of financial risk have to be considered with caution, as methods designed for financial decisions may not transfer directly to environmental risk assessment. Modeling and quantifying uncertainties related to extreme events is of central interest in environmental sciences. In relation to our scientific targets, risk assessment is very important in several areas: hydrological extreme events, cyclone intensity, storm surges, etc. Environmental risks most of the time involve several aspects which are often correlated. Moreover, even in the ideal case where the focus is on a single risk source, we have to face the temporal and spatial nature of environmental extreme events. The study of extremes within a spatio-temporal framework remains an emerging field, where the development of adapted statistical methods could lead to major progress in terms of geophysical understanding and risk assessment, by coupling data and model information.

Based on the above considerations, we aim to answer the following scientific questions: how to measure risk in a multivariate/spatial framework? How to estimate risk in a nonstationary context? How to reduce dimension (see ) for a better estimation of spatial risk?

Extreme events are rare, which means that little data is available to infer risk measures. Risk assessment based on observations therefore relies on multivariate extreme value theory. Interacting particle systems for the analysis of rare events are commonly used in the computer experiments community. An open question is the pertinence of such tools for the evaluation of environmental risk.
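A univariate building block of this theory is the peaks-over-threshold approach: by the Pickands-Balkema-de Haan theorem, excesses above a high threshold are approximately generalized Pareto (GPD) distributed, and fitting the GPD allows extrapolation to levels rarer than anything observed. The sketch below (our illustration on synthetic data, using simple method-of-moments estimators rather than the maximum-likelihood fits used in practice) estimates a high return level:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic positive data with a light upper tail (stand-in for, e.g.,
# daily river discharge observations)
data = rng.gamma(shape=2.0, scale=1.0, size=50000)

# Peaks-over-threshold: excesses above a high quantile are approximately
# generalized Pareto distributed
u = np.quantile(data, 0.95)
excess = data[data > u] - u

# Method-of-moments estimators for the GPD shape xi and scale sigma
m, v = excess.mean(), excess.var()
xi = 0.5 * (1.0 - m**2 / v)
sigma = 0.5 * m * (m**2 / v + 1.0)

def return_level(p):
    """Level exceeded with probability p (for p below the exceedance rate)."""
    zeta = excess.size / data.size      # empirical exceedance rate of u
    if abs(xi) < 1e-6:                  # exponential-tail limit
        return u + sigma * np.log(zeta / p)
    return u + sigma / xi * ((zeta / p) ** xi - 1.0)

level_1000 = return_level(1e-3)         # roughly a "1-in-1000" observation
```

For a gamma tail the fitted shape should be close to zero (exponential-type tail); the multivariate and spatio-temporal questions discussed above arise precisely because such marginal fits say nothing about dependence between risk sources.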

Most numerical models are unable to accurately reproduce extreme events. There is therefore a real need to develop efficient assimilation methods for the coupling of numerical models and extreme data.

Methods for sensitivity analysis, parameter estimation and risk assessment are extremely costly due to the necessary number of model evaluations. This number of simulations, which depends on the complexity of the application, the number of input variables and the desired quality of the approximations, requires considerable computational resources. To this end, the AIRSEA team is an intensive user of HPC computing platforms, particularly grid computing platforms. The associated grid deployment has to take into account the scheduling of a huge number of computational requests and the data-management links between these requests, all of this as automatically as possible. In addition, there is an increasing need to propose efficient numerical algorithms specifically designed for new (or future) computing architectures, and this is part of our scientific objectives. Given the computational cost of our applications, the evolution of high performance computing platforms has to be taken into account for several reasons. While our applications are able to exploit space parallelism to its full extent (oceanic and atmospheric models are traditionally based on a spatial domain decomposition method), the spatial discretization step size limits the efficiency of traditional parallel methods. Thus the inherent parallelism is modest, particularly in the case of relatively coarse resolution but very long integration times (e.g., climate modeling). Paths toward new programming paradigms are thus needed. As a step in that direction, we plan to focus our research on parallel-in-time methods.

New numerical algorithms for high performance computing
Parallel-in-time methods can be classified into three main groups. In the first group, we find methods using parallelism across the method, such as parallel integrators for ordinary differential equations. The second group considers parallelism across the problem. Falling into this category are methods such as waveform relaxation, where the space-time system is decomposed into a set of subsystems which can then be solved independently using some form of relaxation technique, or multigrid reduction in time.
The third group of methods focuses on parallelism across the steps. One of the best known algorithms in this family is parareal.
Other methods combining the strengths of those listed above (e.g., PFASST) are currently under investigation in the community.
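The parareal algorithm from the third group can be written in a few lines for a scalar ODE (our own toy sketch, not production model code), with a cheap backward-Euler coarse propagator and an accurate RK4 fine propagator; the fine solves over the different time windows are the part that can run in parallel:

```python
import numpy as np

# Model problem: du/dt = lam * u on [0, T], split into P time windows
lam, T, P, u0 = -1.0, 2.0, 10, 1.0
dt = T / P

def fine(u, dt, nsub=100):
    """Accurate propagator: many small RK4 steps (stand-in for an
    expensive high-resolution model)."""
    h = dt / nsub
    for _ in range(nsub):
        k1 = lam * u
        k2 = lam * (u + 0.5 * h * k1)
        k3 = lam * (u + 0.5 * h * k2)
        k4 = lam * (u + h * k3)
        u = u + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

def coarse(u, dt):
    """Cheap propagator: a single backward Euler step."""
    return u / (1.0 - lam * dt)

# Reference: sequential fine integration over all windows
ref = np.empty(P + 1)
ref[0] = u0
for n in range(P):
    ref[n + 1] = fine(ref[n], dt)

# Parareal: U^{k+1}_{n+1} = C(U^{k+1}_n) + F(U^k_n) - C(U^k_n)
U = np.empty(P + 1)
U[0] = u0
for n in range(P):                    # iteration 0: coarse prediction
    U[n + 1] = coarse(U[n], dt)
errors = [np.abs(U - ref).max()]
for k in range(5):
    Uk = U.copy()
    F = np.array([fine(Uk[n], dt) for n in range(P)])     # parallel in n
    G_old = np.array([coarse(Uk[n], dt) for n in range(P)])
    for n in range(P):                # sequential coarse correction
        U[n + 1] = coarse(U[n], dt) + F[n] - G_old[n]
    errors.append(np.abs(U - ref).max())
```

Each iteration shifts expensive fine solves into a concurrent phase, and for diffusive problems like this one the iteration converges to the sequential fine solution in a handful of sweeps.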

Parallel-in-time methods are iterative methods that may require a large number of iterations before convergence. Our first focus will be on the convergence analysis of parallel-in-time (parareal/Schwarz) methods for the equation systems of oceanic and atmospheric models. Our second objective will be the construction of fast (approximate) integrators for these systems. This part is naturally linked to the model reduction methods of section (). Fast approximate integrators are required both in the Schwarz algorithm (where a first guess of the boundary conditions is required) and in the parareal algorithm (where the fast integrator is used to connect the different time windows). Our main application of these methods will be climate (i.e., very long time) simulations. Our second application of parallel-in-time methods will be in the context of optimization methods. In fact, one of the major drawbacks of the optimal control techniques mentioned above is their lack of intrinsic parallelism in comparison with ensemble methods. Here, parallel-in-time methods also offer a path toward better efficiency. The mathematical key point is how to efficiently couple two iterative methods (i.e., parallel-in-time and optimization methods).

The evolution of natural systems, in the short, mid, or long term, has extremely important consequences for both the global Earth system and humanity. Forecasting this evolution is thus a major challenge from the scientific, economic, and human viewpoints.

Humanity has to face the problem of global warming, brought on by the
emission of greenhouse gases from human activities. This warming will probably cause huge changes at global and regional
scales, in terms of climate, vegetation and biodiversity, with major consequences for local populations.
Research has therefore been conducted over the past 15 to 20 years in an effort to
model the Earth's climate and forecast its evolution in the 21st century in response to anthropic
action.

With regard to short-term forecasts, the best and oldest example is of course weather forecasting.
Meteorological services have been providing daily short-term forecasts for several decades which are of
crucial importance for numerous human activities.

Numerous other problems can also be mentioned, such as seasonal weather forecasting (to anticipate powerful phenomena like an El Niño event), operational oceanography (short-term forecasts of the evolution of the ocean system to provide services for the fishing industry, ship routing, defense, or the fight against marine pollution) or the prediction of floods.

As mentioned previously, mathematical and numerical tools are omnipresent and play a fundamental role in these areas of research. In this context, the vocation of AIRSEA is not to carry out numerical prediction, but to address mathematical issues raised by the development of prediction systems for these application fields, in close collaboration with geophysicists.

Most of the research activities of the AIRSEA team are directed towards the improvement of numerical systems for the ocean and the atmosphere. This includes the development of appropriate numerical methods, model/parameter calibration using observational data, and uncertainty quantification for decision making. The AIRSEA team members work in close collaboration with researchers in the field of geophysical fluids and are partners in several interdisciplinary projects. They also strongly contribute to the development of state-of-the-art numerical systems, such as NEMO and CROCO in the ocean community.

Clémentine Prieur was selected as one of the ambassadors for La Science taille XX elles in Grenoble.

With the increase of resolution, the hydrostatic assumption becomes less valid and the AIRSEA group also works on the development of non-hydrostatic ocean models. The treatment of non-hydrostatic incompressible flows leads to a 3D elliptic system for pressure that can be ill-conditioned, in particular with non-geopotential vertical coordinates. That is why we favour the use of the non-hydrostatic compressible equations, which remove the need for a 3D elliptic solve at the price of reintroducing acoustic waves.

A comparison between 2D and 3D simulations of non-hydrostatic surface waves has been performed in .

In addition, Emilie Duval started her PhD in September 2018 on the coupling between the hydrostatic incompressible and non-hydrostatic compressible equations. In 2021, a detailed analysis of acoustic-gravity waves in a free-surface compressible and stratified ocean has been published in Ocean Modelling.

Accurate and stable implementation of bathymetry boundary conditions in ocean models remains a challenging problem. The dynamics of ocean flow often depend sensitively on satisfying bathymetry boundary conditions and correctly representing their complex geometry. Generalized (e.g. ) terrain-following coordinates are often used in ocean models, but they require smoothing the bathymetry to reduce pressure gradient errors. Geopotential z-coordinates are a common alternative that avoids pressure gradient and numerical diapycnal diffusion errors, but they generate spurious flow due to their "staircase" geometry. In , we introduce a new Brinkman volume penalization to approximate the no-slip boundary condition and the complex geometry of bathymetry in ocean models. This approach corrects the staircase effect of z-coordinates, does not introduce any new stability constraints on the geometry of the bathymetry, and is easy to implement in an existing ocean model. The porosity parameter allows modelling subgrid-scale details of the geometry. We illustrate the penalization and confirm its accuracy by applying it to three standard test flows: upwelling over a sloping bottom, a resting state over a seamount, and internal tides over highly peaked bathymetry features. New results on realistic applications will be submitted soon. In the context of the Gulf Stream simulation, Figure () shows very significant improvements brought by the use of penalization methods: the Gulf Stream detachment is correctly represented in a 1/8 degree simulation. This opens the door to a clear improvement of climate models, in which a good representation of this mechanism is essential. Following this work, Antoine Nasser started his PhD in October 2019 on penalization methods. One of the objectives is to study the representation of overflows in global ocean models. The NEMO ocean model will be used for the applications.
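The principle of Brinkman penalization can be conveyed on a 1D diffusion toy problem (our own minimal sketch, not the team's NEMO implementation): a relaxation term -(chi/eta)*u with a solid mask chi and a small permeability eta drives the solution to zero inside the "bathymetry", without any boundary-fitted grid.

```python
import numpy as np

# 1D heat equation u_t = nu*u_xx with a "bathymetry" occupying x > 0.7,
# where the no-slip condition u = 0 is imposed by Brinkman penalization:
# u_t = nu*u_xx - (chi/eta)*u, with chi the solid mask and eta << 1.
n, nu, eta = 200, 1.0, 1e-4
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
chi = (x > 0.7).astype(float)          # mask: 1 in the solid, 0 in the fluid
u = np.sin(np.pi * x / 0.7)            # initial condition

dt = 0.4 * dx**2 / nu                  # explicit diffusion stability limit
for _ in range(2000):
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    # treat the stiff penalty term implicitly so that eta can be small
    u = (u + dt * nu * lap) / (1.0 + dt * chi / eta)
    u[0] = u[-1] = 0.0
```

After time stepping, the field is essentially zero in the penalized region while diffusing freely in the fluid: the penalty acts as an immersed no-slip boundary whose sharpness is controlled by eta, mirroring the role of the porosity parameter in the ocean-model setting.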

In the context of the H2020 IMMERSE project, the AIRSEA team is in charge of the development of a new time-stepping strategy for the NEMO ocean model. The team also studies the use of exponential time integrators for ocean models (one manuscript submitted to Applied Numerical Mathematics).
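The appeal of exponential integrators for stiff systems can be shown on a scalar toy problem (a standard textbook sketch, unrelated to the NEMO developments themselves): exponential Euler treats the stiff linear part exactly through the matrix exponential and its phi-function, and stays accurate with step sizes for which explicit Euler blows up.

```python
import numpy as np

# Stiff Prothero-Robinson-type problem: u' = lam*(u - cos t) - sin t,
# exact solution u(t) = cos t for u(0) = 1, with lam strongly negative.
lam, h, T = -1000.0, 0.05, 1.0
nsteps = int(round(T / h))

def phi1(z):
    """phi_1(z) = (e^z - 1)/z, the first exponential-integrator function."""
    return (np.exp(z) - 1.0) / z

# Exponential Euler: u_{n+1} = e^{lam*h} u_n + h*phi1(lam*h)*g(t_n),
# where g(t) = -lam*cos(t) - sin(t) collects the non-stiff part.
u_exp, u_eul, t = 1.0, 1.0, 0.0
for _ in range(nsteps):
    g = -lam * np.cos(t) - np.sin(t)
    u_exp = np.exp(lam * h) * u_exp + h * phi1(lam * h) * g
    u_eul = u_eul + h * (lam * u_eul + g)   # explicit Euler: unstable here
    t += h

err_exp = abs(u_exp - np.cos(T))
err_eul = abs(u_eul - np.cos(T))
```

Here |1 + lam*h| = 49, so explicit Euler amplifies errors catastrophically, while exponential Euler follows the slow solution; fast barotropic (acoustic-like) modes in compressible ocean models play a role analogous to lam.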

The AIRSEA team is involved in the modeling and algorithmic aspects of ocean-atmosphere (OA) coupling. For the last few years we have been actively working on the analysis of such coupling, both in terms of its continuous and numerical formulation. Our activities can be divided into three general topics:

Continuous and discrete analysis of Schwarz algorithms for OA coupling:
we have been developing coupling approaches for several years, based on so-called Schwarz algorithms. Schwarz-like domain decomposition methods are very popular in mathematics, computational sciences and engineering, notably for the implementation of coupling strategies. However, for complex applications (like OA coupling) it is challenging to have a priori knowledge of the convergence properties of such methods. Indeed, coupled problems arising in Earth system modeling often exhibit sharp turbulent boundary layers whose parameterizations lead to peculiar transmission conditions and diffusion coefficients. In the framework of S. Thery's PhD (defended in February 2021), the well-posedness of the nonlinear coupling problem including parameterizations has been addressed, and a detailed continuous analysis of the convergence properties of the Schwarz methods has been pursued to disentangle the impact of the different parameters at play in such a coupling problem. In S. Clement's PhD, a general framework has been proposed to study the convergence properties at a (semi-)discrete level, allowing a systematic comparison with the results obtained for the continuous problem. Such a framework makes it possible to study more complex coupling problems whose formulation is representative of the discretization used in realistic coupled models.
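To fix ideas on the Schwarz iterations analyzed here, a toy 1D sketch (our own illustration; OA coupling uses global-in-time variants with flux transmission conditions rather than this stationary Dirichlet setting) shows the alternating method converging geometrically on two overlapping subdomains:

```python
import numpy as np

def solve_subdomain(n, h, left_bc, right_bc, f):
    """Direct solve of -u'' = f on one subdomain with Dirichlet data."""
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    rhs = f.copy()
    rhs[0] += left_bc / h**2
    rhs[-1] += right_bc / h**2
    return np.linalg.solve(A, rhs)

# Global problem: -u'' = 1 on (0,1), u(0)=u(1)=0; exact u = x(1-x)/2.
# Two overlapping subdomains Omega1 = (0, 0.6) and Omega2 = (0.4, 1):
# each solve uses the latest trace of the other as boundary data.
N = 99                          # global interior points, h = 1/100
h = 1.0 / (N + 1)
x = (np.arange(N) + 1) * h
f = np.ones(N)
i1, i2 = 59, 39                 # interface indices: x = 0.6 and x = 0.4

u = np.zeros(N)                 # global iterate
traces = []
for k in range(30):
    # Subdomain 1: right BC taken from the current iterate at x = 0.6
    u[:i1] = solve_subdomain(i1, h, 0.0, u[i1], f[:i1])
    # Subdomain 2: left BC taken from the updated iterate at x = 0.4
    u[i2 + 1:] = solve_subdomain(N - i2 - 1, h, u[i2], 0.0, f[i2 + 1:])
    traces.append(abs(u[i1] - 0.6 * 0.4 / 2))   # interface error history

exact = x * (1.0 - x) / 2.0
err = np.abs(u - exact).max()
```

The contraction rate of the interface error is governed by the overlap, which is the 1D analogue of the convergence-factor questions that the continuous and (semi-)discrete analyses above address for the much harder OA transmission conditions.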

Within the COCOA project, a Schwarz-like iterative method has been applied in a state-of-the-art Earth System model to evaluate the consequences of inaccuracies in the ad-hoc ocean-atmosphere coupling algorithms usually used in realistic models. Numerical results obtained with the iterative process show large differences at sunrise and sunset compared to usual ad-hoc algorithms, thus showing that the synchrony errors inherent to ad-hoc coupling methods can be significant. The objective now is to reduce the computational cost of Schwarz algorithms using deep learning techniques.

Representation of the air-sea interface in coupled models:
during the PhD thesis of Charles Pelletier, the focus was on including the formulation of physical parameterizations in the theoretical analysis of the coupling, in particular the parameterization schemes used to compute air-sea fluxes. Following this work, a novel and rigorous framework for a consistent two-sided modeling of the surface boundary layer has been proposed. This framework allows for a more general representation of the vertical physics at the air-sea interface while improving the mathematical regularity of the numerical solutions. Moreover, it is flexible enough to include additional physical parameters, for example to account for the effect of surface waves in the turbulent flux computation. This work is the first step toward more adequate discretization methods for the parameterization of surface and planetary boundary layers in coupled models.

At a more fundamental level, in collaboration with A. Wirth, we have studied turbulent fluctuations in a coupled Ekman layer problem with a randomized drag coefficient.

These topics are addressed through strong collaborations between applied mathematicians and the climate and operational communities (Meteo-France, Ifremer, SHOM, Mercator-Ocean, LMD, and LOCEAN). Our work on ocean-atmosphere coupling has steadily matured over the last few years and has reached a point where it triggers interest from physicists. Through the funding of the ANR COCOA project (2017-2021, PI: E. Blayo) and the SHOM 19CP07 contract (2020-2024, PI: F. Lemarié), AIRSEA team members play a major role in the structuring of a multi-disciplinary scientific community working on ocean-atmosphere coupling, spanning a broad range from mathematical theory to practical implementations in climate and operational models.

During 2021, the AIRSEA team started to work on new topics around physics-dynamics coupling. Schematically, numerical models consist of two blocks generally identified as "physics" and "dynamics", which are often developed separately. The "physics" represents unresolved or under-resolved processes with typical scales below the model resolution, while the "dynamics" corresponds to a discrete representation in space and time of the resolved processes. Unresolved processes cannot be ignored, because they directly influence the resolved part of the flow since energy is continuously transferred between scales. The interplay between resolved and unresolved scales is a large, incomplete and complex topic for which there is still much to do within the Earth system modeling community.

In current models, the scale separation between the resolved part of the flow and the unresolved part is carried out via the Reynolds decomposition which corresponds to a filtering of the Navier-Stokes equations. Such a filtering leads to an evolution equation for the large-scale flow containing unknown terms (often called Reynolds stress terms) which represent the average contribution of the small-scale processes. The system is closed by so-called parameterizations that arbitrarily relate the contribution of small-scale processes to large-scale variables. In this context, the AIRSEA team has started work on two aspects:

Those topics are addressed through collaborations with the climate and operational community (Meteo-France, SHOM, Mercator-Ocean, and IGE). Two projects are currently funded, one on the energetically consistent discretization aspect (SHOM 19CP07, 2020-2024, PI: F. Lemarié) and one on the convection parameterization (Institut des Mathématiques pour la Planète Terre, 2021-2024, PIs: F. Lemarié and G. Madec).

Artificial intelligence and machine learning may be considered as a potential way to address unresolved model scales and to approximate poorly known processes such as dissipation, which occurs essentially at small scales. In order to understand how to combine a numerical model with a neural network trained on external data, we developed a network generation and learning algorithm and used it to approximate nonlinear model operators.

A potential way to reconstruct subgrid scales consists in applying Image Super-Resolution methods, which in computer vision and image processing refer to the process of recovering high-resolution images from low-resolution ones. Recent years have shown remarkable progress in image super-resolution using machine learning techniques . We use this methodology to identify the fine structure of the chaotic turbulent solution of a simple barotropic ocean model. After learning the flow patterns produced by the high-resolution model, the neural network can identify fine structure in the low-resolution model solution with better precision than bicubic interpolation.

Different neural network architectures have been analyzed. Fully-connected networks, basic convolutional and encoder-decoder architectures, as well as mixed architectures were compared with each other and with classical interpolation of the model solution on a low-resolution grid.
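The evaluation protocol described above (train on high-resolution fields, then compare a learned upscaler against simple interpolation on an independent field) can be illustrated with a minimal sketch. This is not the team's actual code: the "learned" map here is a least-squares patch regression standing in for the convolutional architectures, the synthetic field generator is hypothetical, and the baseline is nearest-neighbour replication rather than the bicubic interpolation used in the text.

```python
import numpy as np

def make_field(seed, n=64):
    """Synthetic smooth 'turbulent-like' 2D field: random superposition of sinusoids."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0, 2 * np.pi, n, endpoint=False)
    X, Y = np.meshgrid(x, x)
    f = np.zeros((n, n))
    for _ in range(20):
        kx, ky = rng.integers(1, 5, size=2)
        f += rng.normal() * np.sin(kx * X + rng.uniform(0, 2 * np.pi)) \
                          * np.sin(ky * Y + rng.uniform(0, 2 * np.pi))
    return f

def coarsen(f):
    """Block-average downsampling by a factor 2 (low-resolution model proxy)."""
    return 0.25 * (f[0::2, 0::2] + f[1::2, 0::2] + f[0::2, 1::2] + f[1::2, 1::2])

def patches(lr, hr):
    """3x3 low-res neighbourhoods -> corresponding 2x2 high-res blocks."""
    A, B = [], []
    m = lr.shape[0]
    for i in range(1, m - 1):
        for j in range(1, m - 1):
            A.append(lr[i - 1:i + 2, j - 1:j + 2].ravel())
            B.append(hr[2 * i:2 * i + 2, 2 * j:2 * j + 2].ravel())
    return np.array(A), np.array(B)

# "Training": least-squares map from coarse patches to fine blocks, a linear
# stand-in for the network architectures compared in the text.
hr_train = make_field(seed=0)
A, B = patches(coarsen(hr_train), hr_train)
W, *_ = np.linalg.lstsq(A, B, rcond=None)

# Evaluation on an independent field, against a nearest-neighbour baseline.
hr_test = make_field(seed=1)
At, Bt = patches(coarsen(hr_test), hr_test)
rmse_learned = np.sqrt(np.mean((At @ W - Bt) ** 2))
rmse_nearest = np.sqrt(np.mean((np.repeat(At[:, 4:5], 4, axis=1) - Bt) ** 2))
print(rmse_learned, rmse_nearest)
```

Even this crude linear regression beats replicating the coarse cell, which is the kind of comparison made (with much richer architectures and bicubic baselines) in the work above.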

At the present time, the observation of the Earth from space is carried out by more than thirty satellites. These platforms provide two kinds of observational information:

Our current developments are targeted at the use of learning methods to describe the evolution of the images. This approach is being applied to the tracking of oceanic oil spills in the framework of Long Li's PhD, co-supervised with Jianwei Ma. It led to the publication of . In parallel, we investigated the same problem in the framework of pesticide transfer.

Accounting for realistic observation errors is a known bottleneck in data assimilation, because dealing with error correlations is complex. Following a previous study on this subject, we propose to use multiscale modelling, more precisely the wavelet transform, to address this question. In , we investigate the problem further by addressing two issues arising in real-life data assimilation: how to deal with partially missing data (e.g., concealed by an obstacle between the sensor and the observed system), and how to solve convergence issues associated with complex observation error covariance matrices. Two adjustments relying on wavelet modelling are proposed and offer significant improvements. The first consists in adjusting the variance coefficients in the frequency domain to account for masked information. The second consists in a gradual assimilation of frequencies. Both fully rely on the multiscale properties associated with wavelet covariance modelling.
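The reason a wavelet representation helps with correlated observation errors can be seen in a small sketch. This is an illustration of the general principle only, not the covariance model of the cited work: we build an orthonormal Haar transform by hand and check that correlated (AR(1)) errors become nearly uncorrelated in wavelet space, which is what justifies a nearly diagonal covariance model there.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar wavelet transform matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        m = H.shape[0]
        top = np.kron(H, [1.0, 1.0])             # scaling part
        bot = np.kron(np.eye(m), [1.0, -1.0])    # detail part
        H = np.vstack([top, bot]) / np.sqrt(2.0)
    return H

n = 64
H = haar_matrix(n)

# Correlated AR(1) observation errors: a dense covariance in pixel space.
rng = np.random.default_rng(0)
rho = 0.9
eps = np.empty((4000, n))
eps[:, 0] = rng.normal(size=4000)
for k in range(1, n):
    eps[:, k] = rho * eps[:, k - 1] + np.sqrt(1 - rho ** 2) * rng.normal(size=4000)

def mean_offdiag(c):
    """Average absolute off-diagonal empirical correlation."""
    C = np.abs(np.corrcoef(c, rowvar=False))
    return (C.sum() - np.trace(C)) / (n * (n - 1))

m_pix = mean_offdiag(eps)          # strongly correlated in pixel space
m_wav = mean_offdiag(eps @ H.T)    # nearly decorrelated in wavelet space
print(m_pix, m_wav)
```

The masked-data variance adjustment and the gradual assimilation of frequencies mentioned above then operate directly on these (approximately independent) wavelet coefficients.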

This kind of work was put into application in a collaboration with colleagues from LGGE and led to the publication of .

In the meantime, STORM, a new collaborative project with Université de La Réunion, started this year. Our role is to prepare for the assimilation of data collected by sea turtles, and in particular to work on the description of observation error statistics.

Numerical models describing the evolution of the ocean-atmosphere system, or of any complex dynamical system (e.g., the evolution of a pandemic), contain a large number of parameters which are generally poorly known. The reliability of the numerical simulations strongly depends on the identification and calibration of these parameters from observed data. In this context, it is important to understand the kinds of low-dimensional structure that may be present in geophysical models and to exploit this structure with appropriate algorithms. In the team, we focus on parameter space dimension reduction techniques, reduced bases, low-rank structures and transport map techniques for probability measure approximation.

Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. In , we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.
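In the Gaussian special case, the Hessian-based score above reduces to reading conditional independence off the precision matrix. The following sketch illustrates only that simplified Gaussian analogue (partial-correlation thresholding on a hypothetical chain graph), not the transport-map machinery of SING itself.

```python
import numpy as np

# Chain graph X1 - X2 - X3 - X4 - X5: tridiagonal precision matrix.
d = 5
P = np.eye(d) * 2.0
for i in range(d - 1):
    P[i, i + 1] = P[i + 1, i] = -0.8
Sigma = np.linalg.inv(P)

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(d), Sigma, size=20000)

# Empirical precision -> partial correlations; zeros encode conditional
# independence (the Gaussian counterpart of the Hessian-based score).
Phat = np.linalg.inv(np.cov(X, rowvar=False))
pc = -Phat / np.sqrt(np.outer(np.diag(Phat), np.diag(Phat)))
adj = (np.abs(pc) > 0.1) & ~np.eye(d, dtype=bool)

true_adj = (np.abs(P) > 0) & ~np.eye(d, dtype=bool)
print(adj)
```

For non-Gaussian data this precision-based shortcut fails, which is precisely the gap the transport-map approach in the cited work is designed to fill.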

In the framework of Arthur Macherey’s PhD (defended in June 2021), in collaboration with Anthony Nouy and Marie Billaud-Friess (Ecole Centrale Nantes), we proposed algorithms for solving high-dimensional Partial Differential Equations (PDEs) that combine a probabilistic interpretation of PDEs, through the Feynman-Kac representation, with sparse interpolation . Monte-Carlo methods and time-integration schemes are used to estimate pointwise evaluations of the solution of a PDE. We use a sequential control variates algorithm, where control variates are constructed from successive approximations of the solution of the PDE. We are now interested in solving parametrized PDEs with stochastic algorithms in the framework of potentially high-dimensional parameter spaces. A preliminary step was the development of a PAC algorithm in relative precision for bandit problems with costly sampling .
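The core idea, pointwise evaluation of a PDE solution by Monte-Carlo via the Feynman-Kac representation, can be sketched on the heat equation, where it is exact and easy to verify. This is only a textbook illustration of the representation, not the team's sequential control variates algorithm (which would additionally subtract successive approximations of the solution to reduce the variance below what is shown here).

```python
import numpy as np

# Feynman-Kac: for the heat equation  du/ds = (1/2) d2u/dx2,  u(x, 0) = g(x),
# the solution is  u(x, s) = E[ g(x + W_s) ]  with W a standard Brownian motion.
def feynman_kac(g, x, s, n_samples, rng):
    w = rng.normal(scale=np.sqrt(s), size=n_samples)
    return g(x + w).mean()

rng = np.random.default_rng(0)
g = lambda z: z ** 2                  # initial/terminal condition
x, s = 0.5, 1.0
u_mc = feynman_kac(g, x, s, 200000, rng)
u_exact = x ** 2 + s                  # E[(x + W_s)^2] = x^2 + s
print(u_mc, u_exact)
```

Note that the estimator only requires evaluations of g along simulated paths, which is why the approach scales to dimensions where grid-based solvers fail.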

Reduced models are also developed in the framework of robust inversion. In , we combined a new greedy algorithm for functional quantization with a Stepwise Uncertainty Reduction strategy to solve a robust inversion problem under functional uncertainties. In more recent work, we further reduced the number of simulations required to solve the same robust inversion problem, based on Gaussian process meta-modeling on the joint space of deterministic control parameters and functional uncertain variables . These results are applied to automotive depollution. This research axis was conducted in the framework of the OQUAIDO Chair and is still active in the team through Clément Duhamel’s PhD, in collaboration with Céline Helbert (Ecole Centrale Lyon), Miguel Munoz Zuniga and Delphine Sinoquet (IFPEN, Rueil-Malmaison).

Rishabh Bhatt started his PhD in December 2019. Under the supervision of Laurent Debreu and Arthur Vidard, he studies the application of time-parallelization algorithms to variational data assimilation methods. At each step of the optimization algorithm, the direct model is integrated by a time-parallel method (here the Parareal algorithm ). One of the main difficulties lies in the choice of the stopping criterion of the Parareal algorithm. To this end, we rely on theoretical results on the convergence of conjugate gradient methods in the presence of approximate gradient calculations . The first results are encouraging and show the possibility of optimally tuning the stopping criterion of the Parareal algorithm (and thus the number of iterations) without affecting the convergence of the conjugate gradient. These results are reported in . We are now working on taking better advantage of the coupling of these two iterative methods (conjugate gradient and Parareal), in particular by optimally reusing the Krylov bases of the Krylov-enhanced version of the Parareal algorithm.
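For readers unfamiliar with Parareal, a minimal sketch on a scalar linear ODE shows the iteration whose stopping criterion is being tuned above. All choices here (test equation, exact fine propagator, one-step Euler coarse propagator) are illustrative, not those of the assimilation system.

```python
import numpy as np

# Parareal for y' = lam * y on [0, T], split into N time windows.
# Fine propagator F: exact flow; coarse propagator G: one explicit Euler step.
lam, T, N, y0 = -1.0, 2.0, 10, 1.0
dT = T / N
F = lambda y: y * np.exp(lam * dT)
G = lambda y: y * (1.0 + lam * dT)

# Serial fine solution: the reference Parareal converges to.
ref = y0 * np.exp(lam * dT * np.arange(N + 1))

# Iteration 0: pure coarse prediction.
U = np.empty(N + 1)
U[0] = y0
for n in range(N):
    U[n + 1] = G(U[n])

errors = [np.max(np.abs(U - ref))]
for k in range(N):                       # Parareal corrections
    Fold = np.array([F(U[n]) for n in range(N)])   # parallel fine solves
    Gold = np.array([G(U[n]) for n in range(N)])
    for n in range(N):                   # cheap sequential coarse sweep
        U[n + 1] = G(U[n]) + Fold[n] - Gold[n]
    errors.append(np.max(np.abs(U - ref)))

print(errors)  # exact (to round-off) after N iterations
```

The error sequence printed here is precisely what a stopping criterion truncates: the cited analysis determines how large this Parareal error may remain while still preserving the convergence of the outer conjugate gradient.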

Forecasting geophysical systems requires complex models, which sometimes need to be coupled, and which make use of data assimilation. The objective of this project is, for a given output of such a system, to identify the most influential parameters and to evaluate the effect of uncertainty in input parameters on model output. Existing stochastic tools are not well suited for high-dimensional problems (in particular time-dependent ones), while deterministic tools are fully applicable but only provide limited information. The challenge is therefore to gather expertise, on the one hand on numerical approximation and control of Partial Differential Equations, and on the other hand on stochastic methods for sensitivity analysis, in order to design innovative stochastic solutions to study high-dimensional models and to propose new hybrid approaches combining stochastic and deterministic methods. We took part in the writing of a position paper on the future of sensitivity analysis .

An important challenge for stochastic sensitivity analysis is to develop methodologies which work for dependent inputs. Recently, the Shapley value, from econometrics, was proposed as an alternative to quantify the importance of random input variables to a function. Owen derived Shapley value importance for independent inputs and showed that it is bracketed between two different Sobol' indices. Song et al. recently advocated the use of the Shapley value for the case of dependent inputs. In recent work , in collaboration with Art Owen (Stanford University), we showed that the Shapley value removes the conceptual problems of functional ANOVA for dependent inputs. We also investigated further the properties of Shapley effects in . By the end of 2021, Clémentine Prieur started a collaboration with Elmar Plischke (TU Clausthal, Germany) and Emanuele Borgonovo (Bocconi University, Milan, Italy) to estimate total Sobol’ indices as a measure for variable selection, even in the framework of dependent inputs. In particular, this makes it possible to estimate total Sobol’ indices for inputs defined on a non-rectangular domain. This setting is of particular interest for applications where the input space is reduced due to physical constraints on the quantity of interest. It was encountered, e.g., in María Belén Heredia’s PhD thesis (defended in December 2020), and analyzed by estimating Shapley effects with a nonparametric procedure based on nearest neighbors (see Section for more details). In October 2021, Ri Wang started a PhD, co-supervised by Clémentine Prieur and Véronique Maume-Deschamps (ICJ, Lyon 1), on the estimation of quantile-oriented sensitivity indices in the framework of dependent inputs, by means of random forests or other machine learning tools. Ri Wang received funding from the Chinese Scientific Council.
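The Shapley effect itself is easy to compute exactly in the linear-Gaussian case, which makes a good sanity check for the properties discussed above (in particular the "efficiency" property: the effects sum to the output variance even under input dependence). The model, coefficients and covariance below are illustrative, and the brute-force subset enumeration shown here is only feasible for small dimension.

```python
import itertools
import math
import numpy as np

# Shapley effects for Y = beta . X with X ~ N(0, Sigma). For linear-Gaussian
# models the closed Sobol' index of a subset u has the closed form
#   Var(E[Y | X_u]) = beta^T Sigma[:,u] Sigma[u,u]^{-1} Sigma[u,:] beta.
beta = np.array([1.0, 1.0, 2.0])
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])       # X1, X2 correlated; X3 independent
d = len(beta)
varY = beta @ Sigma @ beta

def closed_sobol(u):
    if not u:
        return 0.0
    u = list(u)
    s = Sigma[np.ix_(u, u)]
    c = Sigma[:, u] @ np.linalg.solve(s, Sigma[u, :])
    return beta @ c @ beta

# Shapley value of input i: weighted average of its marginal contributions
# over all subsets of the remaining inputs.
def shapley(i):
    others = [j for j in range(d) if j != i]
    val = 0.0
    for r in range(d):
        for u in itertools.combinations(others, r):
            w = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            val += w * (closed_sobol(u + (i,)) - closed_sobol(u))
    return val

effects = np.array([shapley(i) for i in range(d)])
print(effects, effects.sum(), varY)
```

Unlike Sobol' indices under dependence, these effects are non-negative and partition Var(Y) exactly, which is the conceptual advantage established in the cited work.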

In the field of sensitivity analysis, Sobol’ indices are widely used to assess the importance of the inputs of a model to its output. Among the methods that estimate these indices, the replication procedure is noteworthy for its low cost. A practical problem is how many model evaluations must be performed to guarantee sufficient precision of the Sobol’ estimates. We proposed to tackle this issue by making the replication procedure iterative . The idea is to enable the addition of new model evaluations to progressively increase the accuracy of the estimates. These evaluations are performed at points located in under-explored regions of the experimental designs, while preserving their characteristics. The key feature of this approach is the construction of nested space-filling designs. For the estimation of first-order indices, a nested Latin hypercube design is used. For the estimation of closed second-order indices, two constructions of a nested orthogonal array design are proposed. Regularity and uniformity properties of the nested designs are studied.
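The cost advantage of the replication procedure comes from estimating all first-order indices from only two replicated designs (2n model evaluations in total, instead of (d+1)n or more for pick-freeze). A minimal, non-iterative sketch on a hypothetical additive test function:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 2

# Two replicated Latin hypercube designs: the same stratified values per
# column, arranged in two independent row orders.
v = (np.arange(n)[:, None] + rng.uniform(size=(n, d))) / n
P1 = np.column_stack([rng.permutation(n) for _ in range(d)])
P2 = np.column_stack([rng.permutation(n) for _ in range(d)])
X1 = np.take_along_axis(v, P1, axis=0)
X2 = np.take_along_axis(v, P2, axis=0)

f = lambda x: x[:, 0] + 2.0 * x[:, 1]   # illustrative additive model
y1, y2 = f(X1), f(X2)                   # only 2n evaluations in total

# For input i, pair the rows of the two designs carrying the same stratum in
# column i: this realizes a pick-freeze estimator of Var(E[Y | X_i]).
S = []
for i in range(d):
    a = np.argsort(P1[:, i])
    b = np.argsort(P2[:, i])
    num = np.mean(y1[a] * y2[b]) - np.mean(y1) * np.mean(y2)
    S.append(num / np.sqrt(np.var(y1) * np.var(y2)))
print(S)  # analytic first-order indices are 1/5 and 4/5
```

The iterative version developed by the team then augments these designs with nested space-filling points instead of restarting from scratch when more precision is needed.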

Another research direction for global SA algorithms starts from the observation that most algorithms for computing sensitivity measures require special sampling schemes or additional model evaluations, so that available data from previous model runs (e.g., from an uncertainty analysis based on Latin Hypercube Sampling) cannot be reused. One challenging task for estimating global sensitivity measures thus consists in recycling an available finite set of input/output data. Such "green", given-data sensitivity analysis recycles existing runs and avoids wasted computations. Given-data procedures have been discussed, e.g., in , . Most of them depend on parameters (number of bins, truncation argument, …) that are not easy to calibrate from a bias-variance compromise perspective, and the adaptive selection of these parameters remains a challenging issue for most given-data algorithms. In the context of María Belén Heredia’s PhD thesis, we proposed a nonparametric given-data estimator for aggregated Sobol’ indices, introduced in and further developed in for multivariate or functional outputs. We also introduced aggregated Shapley effects and extended a nearest-neighbor estimation procedure to estimate these indices . We also started a collaboration with Sébastien Da Veiga (Safran Tech), Agnès Lagnoux, Thierry Klein and Fabrice Gamboa (Institut de Mathématiques de Toulouse) on a new nonparametric estimation procedure for closed Sobol’ indices of any order based on degenerate U-statistics.
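To make the given-data idea concrete, here is the simplest such estimator: a binning approximation of Var(E[Y|X_i])/Var(Y) computed from an existing input/output sample, with no new model runs. This is a generic illustration (the team's own estimators are nearest-neighbor based), and the number of bins is exactly the kind of tuning parameter whose bias-variance calibration is discussed above.

```python
import numpy as np

def first_order_given_data(x, y, bins=30):
    """Given-data first-order Sobol' estimate: S_i ~ Var(E[Y|X_i]) / Var(Y),
    with the conditional mean approximated by quantile binning of X_i."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, bins - 1)
    means = np.array([y[idx == b].mean() for b in range(bins)])
    weights = np.array([(idx == b).mean() for b in range(bins)])
    return np.sum(weights * (means - y.mean()) ** 2) / y.var()

# Recycle a single plain Monte-Carlo sample of an illustrative model.
rng = np.random.default_rng(0)
X = rng.uniform(size=(20000, 2))
y = X[:, 0] + 2.0 * X[:, 1]
s1 = first_order_given_data(X[:, 0], y)
s2 = first_order_given_data(X[:, 1], y)
print(s1, s2)  # analytic values: 1/5 and 4/5
```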

Many models are stochastic in nature, and some of them may be driven by parametrized stochastic differential equations (SDEs). It is important for applications to propose a strategy to perform global sensitivity analysis (GSA) for such models in the presence of uncertainties on the parameters. In collaboration with Pierre Etoré (DATA department in Grenoble), Clémentine Prieur proposed an approach based on Feynman-Kac formulas . The research on GSA for stochastic simulators is still ongoing, first in the context of the MATH-AmSud project FANTASTIC (Statistical inFerence and sensitivity ANalysis for models described by sTochASTIC differential equations) with Chile and Uruguay, and secondly through the PhD thesis of Henri Mermoz Kouye, co-supervised by Clémentine Prieur, in collaboration with Gildas Mazo and Eliza Vergu (INRAE, département MIA, Jouy). Note that our recent developments with P. Etoré on GSA for parametrized SDEs are strongly related to reduced order modeling (see Section ), as GSA requires intensive computations of the quantity of interest. In collaboration with Pierre Etoré and Joël Andrepont (master internship started in spring 2021), Clémentine Prieur is working on GSA for parametrized SDEs based on the Fokker-Planck equation and kernel-based sensitivity indices. Note that a joint work by Pierre Etoré, Clémentine Prieur and José R. León has been submitted, related to the exact or approximate computation of Kolmogorov hypoelliptic equations (KHE). Even though it does not deal with GSA, it could be a starting point for analyzing sensitivity for models described by a parametrized version of KHE. Concerning Henri Mermoz Kouye’s PhD thesis, the approach is different: we are interested in GSA for compartmental stochastic models, and our methodology relies on a deterministic representation of continuous-time Markov chain stochastic compartmental models.

Pesticide transfer models are valuable tools to predict and prevent pollution of water bodies. However, using such models in operational contexts requires a strong knowledge of their structure, including influential parameters. This project aims at performing global sensitivity analysis (GSA) of the PESHMELBA model (pesticide and hydrology: modelling at the catchment scale). The task is made difficult by the modular, complex structure of the model, which couples different physical processes. This results in a large input space dimension and a high computational cost that limits the number of available runs. Using classical GSA tools such as Sobol' indices is thus not feasible. In order to circumvent those limitations, we explored alternative techniques such as the HSIC dependence measure or Random Forest metamodels. The use of such methods in the specific context of spatially distributed output was presented in and submitted as a journal paper . We extended this work to spatiotemporal output, as presented in .
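The appeal of the HSIC dependence measure in this few-runs regime is that it is computed directly from a given sample of input/output pairs, with no special design. A minimal sketch of the standard (biased, V-statistic) estimator with Gaussian kernels, on hypothetical data rather than PESHMELBA outputs:

```python
import numpy as np

def gaussian_gram(z):
    """Gaussian-kernel Gram matrix with a median-heuristic bandwidth."""
    d2 = (z[:, None] - z[None, :]) ** 2
    bw = np.median(d2[d2 > 0])
    return np.exp(-d2 / bw)

def hsic(x, y):
    """Biased V-statistic estimator of HSIC (a screening score, not a test)."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K, L = gaussian_gram(x), gaussian_gram(y)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=300)
h_dep = hsic(x, np.sin(3 * x) + 0.1 * rng.normal(size=300))  # influential input
h_ind = hsic(x, rng.normal(size=300))                        # inert input
print(h_dep, h_ind)
```

Ranking inputs by such scores allows a first screening at the cost of the runs already available, after which the surviving inputs can be studied with a metamodel.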

To conclude Section , let us mention that Clémentine Prieur took part in the writing of a monograph on recent trends in sensitivity analysis, which appeared at the end of 2021 .

Physically-based avalanche propagation models must still be locally calibrated to provide robust predictions, e.g. in long-term forecasting and subsequent risk assessment. Friction parameters cannot be measured directly and need to be estimated from observations. Rich and diverse data is now increasingly available from test-sites, but for measurements made along flow propagation, potential autocorrelation should be explicitly accounted for. In the context of María Belén Heredia’s PhD, in collaboration with IRSTEA Grenoble, we have proposed in a comprehensive Bayesian calibration and statistical model selection framework, with application to an avalanche sliding block model with the standard Voellmy friction law and high-rate photogrammetric images. An avalanche released at the Lautaret test-site and a synthetic data set based on the avalanche were used to test the approach. Results demonstrated (i) the efficiency of the proposed calibration scheme, and (ii) that including autocorrelation in the statistical modelling definitely improves the accuracy of both parameter estimation and velocity predictions.

In the context of the energy transition, wind power generation is developing rapidly in France and worldwide. Research and innovation on wind resource characterisation, turbine control, coupled mechanical modelling of wind systems and technological development of offshore wind turbine floaters are current research topics. In particular, the monitoring and maintenance of wind turbines is becoming a major issue. Current solutions do not take full advantage of the large amount of data provided by sensors placed on modern wind turbines in production. These data could be advantageously used to refine the predictions of production, the lifetime of the structure, the control strategies and the planning of maintenance.
In this context, it is interesting to optimally combine production data and numerical models in order to obtain highly reliable models of wind turbines. This process is of interest to many industrial and academic groups and is known in many fields of industry, including the wind industry, as the "digital twin". The objective of Adrien Hirvoas's PhD work is to develop a data assimilation methodology to build the "digital twin" of an onshore wind turbine. Based on measurements, data assimilation should make it possible to reduce the uncertainties on the physical parameters of the numerical model developed during the design phase, so as to obtain a highly reliable model. Various ensemble data assimilation approaches are currently under consideration to address the problem. In the context of this work, it is necessary to develop identification algorithms quantifying and ranking all the uncertainty sources. This work is done in collaboration with IFPEN. A first paper has been accepted for publication .

Due to the COVID-19 health crisis, Clémentine Prieur decided to join a working group, SEEPIA (Simulation & Estimation of EPIdemics with Algorithms), animated by Didier Georges (Gipsa-lab). A first work has been published . An extension of the classical pandemic SIRD model was considered for the regional spread of COVID-19 in France under lockdown strategies. This compartment model divides the infected and the recovered individuals into undetected and detected compartments, respectively. By fitting the extended model to the real detected data during the lockdown, an optimization algorithm was used to derive the optimal parameters, the initial condition and the epidemic start date for the regions of France. Considering all the age classes together, a network model of pandemic transport between the French regions was built on the basis of the regional extended model and was simulated to reveal the transport effect of the COVID-19 pandemic after lockdown. Using the measured values of displacement of people moving between cities, the pandemic network of all cities in France was simulated using the same model and method as for the network of regions. Finally, a discussion of an integro-differential equation was given and a new model for the network pandemic model of each age class was provided. As already mentioned in Section , Clémentine Prieur went on working on the pandemic, in collaboration with Didier Georges (GIPSA-lab, Grenoble). Both of them supervised the internship of Matthieu Oliver, which led to a submitted work proposing time-dependent reduced order modeling for short-term forecasts from a spatialized SIR model . Robin Vaudry started a PhD in October 2021, funded by the CNRS research platform MODCOV19, and co-supervised by Clémentine Prieur and Didier Georges. The objective of this PhD is to solve inverse problems related to spatialized and age-structured compartmental models of the COVID-19 pandemic.
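The fitting step described above (deriving model parameters from observed infection data by optimization) can be illustrated on a deliberately simplified SIRD toy problem. The compartment structure, parameter values and grid-search calibration below are illustrative stand-ins for the extended detected/undetected model and the optimization algorithm used on real regional data.

```python
import numpy as np

def sird(beta, gamma=0.1, mu=0.01, days=60, n=1e6, i0=100.0):
    """Explicit-Euler integration of a basic SIRD compartment model;
    returns the infected trajectory."""
    S, I, R, D = n - i0, i0, 0.0, 0.0
    traj = []
    for _ in range(days):
        dS = -beta * S * I / n
        dI = beta * S * I / n - (gamma + mu) * I
        dR = gamma * I
        dD = mu * I
        S, I, R, D = S + dS, I + dI, R + dR, D + dD
        traj.append(I)
    return np.array(traj)

# Synthetic "detected infected" data generated with a known transmission rate,
# then calibration of beta by least-squares grid search.
true_beta = 0.3
data = sird(true_beta)
grid = np.linspace(0.1, 0.6, 101)
errs = [np.sum((sird(b) - data) ** 2) for b in grid]
beta_hat = grid[int(np.argmin(errs))]
print(beta_hat)
```

On this noise-free toy example the grid search recovers the generating rate; on real data the same principle applies with noisy observations, more compartments, and the initial condition and start date among the unknowns.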

This research is the subject of a collaboration with Chile and Uruguay. More precisely, the collaboration started with Venezuela; due to the crisis there, our main collaborator on this topic moved to Uruguay.

We are focusing our attention on models derived from the linear Fokker-Planck equation. From a probabilistic viewpoint, these models have received particular attention in recent years, since they are a basic example of hypocoercivity. In fact, even though completely degenerate, these models are hypoelliptic and still verify some coercivity properties, in a broad sense of the word. Such models often appear in the fields of mechanics, finance and even biology. For such models we believe it appropriate to build nonparametric statistical estimation tools. Initial results have been obtained for the estimation of the invariant density, in conditions guaranteeing its existence and uniqueness, and when only partial observational data are available. A paper on the nonparametric estimation of the drift was recently accepted (see Samson et al., 2012, for results on parametric models). As far as the estimation of the diffusion term is concerned, a paper has been accepted , in collaboration with J.R. León (Montevideo, Uruguay) and P. Cattiaux (Toulouse). Recursive estimators have also been proposed by the same authors in , also recently accepted. In a recent collaboration with Adeline Samson, from the statistics department of the lab, we considered adaptive estimation, that is, we proposed a data-driven procedure for the choice of the bandwidth parameters.
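The basic object here, a nonparametric (kernel) estimate of an invariant density built from one long trajectory, can be sketched on the simplest non-degenerate case, the Ornstein-Uhlenbeck process, where the invariant density is known exactly. The process, discretization and fixed bandwidth below are illustrative; the hypoelliptic models studied above are degenerate, and the cited works choose the bandwidth adaptively from the data.

```python
import numpy as np

# Euler-Maruyama trajectory of the OU process dX = -X dt + sqrt(2) dW,
# whose invariant density is the standard Gaussian.
rng = np.random.default_rng(0)
dt, nsteps = 0.01, 1_000_000
x = np.empty(nsteps)
x[0] = 0.0
noise = rng.normal(scale=np.sqrt(2.0 * dt), size=nsteps - 1)
for k in range(nsteps - 1):
    x[k + 1] = x[k] * (1.0 - dt) + noise[k]

# Kernel estimate of the invariant density from the (subsampled, correlated)
# trajectory, compared with the exact stationary density.
sample = x[::5]
grid = np.linspace(-2.0, 2.0, 21)
h = 0.15  # fixed bandwidth, chosen by hand for this illustration
kde = np.array([np.mean(np.exp(-0.5 * ((sample - g) / h) ** 2))
                / (h * np.sqrt(2.0 * np.pi)) for g in grid])
target = np.exp(-0.5 * grid ** 2) / np.sqrt(2.0 * np.pi)
err = float(np.max(np.abs(kde - target)))
print(err)
```

The statistical difficulty addressed in the papers above is precisely that the samples along the trajectory are correlated and, in the hypoelliptic case, that some coordinates carry no direct noise.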

In , we focused on damping Hamiltonian systems under the so-called fluctuation-dissipation condition. Ideas from that paper were re-used with applications to neuroscience in .

Note that Professor Jose R. Leon (Caracas, Venezuela, Montevideo, Uruguay) was funded by an international Inria Chair, allowing further collaboration on parameter estimation.

We recently proposed a paper on the use of the Euler scheme for inference purposes, considering reflected diffusions. This paper could be extended to the hypoelliptic framework.

We also have a collaboration with Karine Bertin (Valparaiso, Chile), Nicolas Klutchnikoff (Université de Rennes) and Jose R. León (Montevideo, Uruguay), funded by a MATH-AmSud project (2016-2017) and by the LIA/CNRS (2018). We are interested in new adaptive estimators for invariant densities on bounded domains , and would like to extend those results to hypoelliptic diffusions. More recently, in relation with her work on GSA for parametrized SDEs (see Section ), Clémentine Prieur is interested in proposing and analyzing theoretical properties of reduced order models (see Section ) based on weak formulations of stationary Fokker-Planck equations related to hypoelliptic SDEs. This is a collaboration with Pierre Etoré (LJK, DATA department).

Many physical phenomena are modelled numerically in order to better understand and/or predict their behaviour. However, some complex and small-scale phenomena cannot be fully represented in the models. The introduction of ad hoc correcting terms can represent these unresolved processes, but such terms need to be properly estimated.

A good example of this type of problem is the estimation of the bottom friction parameters of the ocean floor. This is important because bottom friction affects the general circulation, particularly in coastal areas, notably through its influence on wave breaking. Because of its strong spatial variability, it is impossible to estimate the bottom friction by direct observation, so it must be estimated indirectly by observing its effects on surface motion. The task is further complicated by the presence of uncertainty in other characteristics linking the bottom and the surface (e.g., boundary conditions). The techniques currently used to adjust these parameters are very basic and do not take these uncertainties into account, thereby increasing the error in the estimate.

Classical methods of parameter estimation usually involve the minimisation of an objective function that measures the error between observations and the results obtained by a numerical model. In the presence of uncertainties, the minimisation is not straightforward, as the output of the model depends both on those uncontrolled inputs and on the control parameter. That is why we aim at minimising the objective function so as to obtain an estimate of the control parameter that is robust to the uncertainties.

The definition of robustness differs depending on the context in which it is used. In this work, two different notions of robustness are considered: robustness by minimising the mean and variance of the objective, and robustness based on the distribution of the minimisers of the function. Using information on the location of the minimisers is not a novel idea, as it has been applied as a criterion in sequential Bayesian optimisation; here, however, the constraint of optimality is relaxed to define a new estimate. To evaluate this estimate, a toy model of a coastal area has been implemented. The control parameter is the bottom friction, to which classical estimation methods are applied in a simulation-estimation experiment. The model is then modified to include uncertainties on the boundary conditions in order to apply robust control methods. This was the topic of Victor Trappler's PhD .
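The mean-based notion of robustness above can be illustrated with a deliberately tiny sketch. The forward model, observation and parameter names below are all hypothetical stand-ins for the coastal toy model: the point is only the contrast between a plug-in estimate (calibrate at the nominal uncertain input) and a robust estimate (calibrate against the misfit averaged over the uncertainty).

```python
import numpy as np

def objective(k, u):
    """Misfit between a toy surface response and synthetic observations,
    where k is the controlled 'friction' parameter and u an uncontrolled
    boundary input (illustrative functional form)."""
    model = np.exp(-k * (1.0 + 0.3 * u))
    obs = np.exp(-0.5)            # observations generated with k = 0.5, u = 0
    return (model - obs) ** 2

rng = np.random.default_rng(0)
u_samples = rng.normal(size=2000)     # uncertainty on the boundary input
k_grid = np.linspace(0.1, 1.0, 181)

# Plug-in estimate: minimise the misfit at the nominal value u = 0 ...
k_nominal = k_grid[int(np.argmin([objective(k, 0.0) for k in k_grid]))]
# ... versus a robust estimate minimising the misfit averaged over u.
mean_loss = [objective(k, u_samples).mean() for k in k_grid]
k_robust = k_grid[int(np.argmin(mean_loss))]

loss_nominal = objective(k_nominal, u_samples).mean()
loss_robust = objective(k_robust, u_samples).mean()
print(k_nominal, k_robust, loss_nominal, loss_robust)
```

By construction the robust estimate achieves the lowest expected misfit; the variance-penalised and minimiser-distribution criteria studied in the PhD refine this basic idea.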

This work will be continued through two different actions: on the application side, in the framework of an INRAE/Inria collaboration on pesticide transfer parameterisation, and on the methodological side, in the framework of the joint lab with ATOS, in order to explore the use of machine learning techniques for such problems.

In the context of Philomène Le Gall’s PhD thesis, we are applying the aforementioned modeling of extreme precipitation with the aim of regionalizing it.

Clémentine Prieur is PI of the MATH-AmSud project 2020-2021 FANTASTIC (Statistical inFerence and sensitivity ANalysis for models described by sTochASTIC differential equations). This project involves Chile (local head Karine Bertin, Universidad de Valparaíso) and Uruguay (local head Jose R. Leon, Universidad de la República, Montevideo).

SAMO board is in charge of the organization of the SAMO (sensitivity analysis of model outputs) conferences, every three years. It is strongly supported by the Joint Research Center of the European Commission.

In 2019, Clémentine Prieur, who is part of this board, was also co-chair of a satellite event on the future of sensitivity analysis. A position paper was published in 2021 as a synthesis of the discussions held in Barcelona (autumn 2019).

A 4-year contract: ANR MODENA, "Model and data reduction for efficient assimilation", PI: Olivier Zahm. The reliability of numerical predictions strongly relies on our ability to calibrate the unknown model parameters using observable data. Data assimilation is a particularly challenging task in ocean modelling because of the complexity of the computational models and because of the high dimension of both the parameters and the data. The objective of this project is to explore novel methodologies to improve the probabilistic description of the Bayesian solution to the inverse problem while, at the same time, taking into consideration the computational limitations imposed by large-scale applications. This project relies on three building blocks and on their interplay:

In addition, the proposed methodology addresses key challenges present in many real-world applications, and we therefore expect that it can find applications in other domains.