The general scope of the AIRSEA project-team is to develop mathematical and computational methods for the modeling of oceanic and atmospheric flows.
The mathematical tools used involve both deterministic and statistical approaches. The main research topics cover (a) modeling and coupling; (b) model reduction for sensitivity analysis, coupling and multiscale optimizations; (c) sensitivity analysis, parameter estimation and risk assessment; (d) algorithms for high performance computing. The range of applications extends from climate modeling to the prediction of extreme events.

Recent events have raised questions regarding the social and economic implications of anthropic alterations of the Earth system, i.e. climate change and the associated risks of increasing extreme events. Ocean and atmosphere, coupled with other components (continent and ice), are the building blocks of the Earth system. A better understanding of the ocean-atmosphere system is a key ingredient for improving prediction of such events. Numerical models are essential tools to understand processes, and to simulate and forecast events at various space and time scales. Geophysical flows generally have a number of characteristics that make them difficult to model, which justifies the development of specifically adapted mathematical methods.

Our scientific objectives are divided into four major points. The first objective focuses on developing advanced mathematical methods for both the ocean and atmosphere, and the coupling of these two components. The second objective is to investigate the derivation and use of model reduction to face problems associated with the numerical cost of our applications. The third objective is directed toward the management of uncertainty in numerical simulations. The last objective deals with efficient numerical algorithms for new computing platforms. As mentioned above, the targeted applications cover oceanic and atmospheric modeling and related extreme events using a hierarchy of models of increasing complexity.

Current numerical oceanic and atmospheric models suffer from a number of well-identified problems. These problems are mainly related to lack of horizontal and vertical resolution, thus requiring the parameterization of unresolved (subgrid scale) processes and control of discretization errors in order to fulfill criteria related to the particular underlying physics of rotating and strongly stratified flows. Oceanic and atmospheric coupled models are increasingly used in a wide range of applications from global to regional scales. Assessment of the reliability of those coupled models is an emerging topic as the spread among the solutions of existing models (e.g., for climate change predictions) has not been reduced with the new generation models when compared to the older ones.

Advanced methods for modeling 3D rotating and stratified flows
The continuous increase of computational power and the resulting finer grid resolutions have triggered renewed interest in numerical methods and their relation to physical processes. Going beyond present knowledge requires a better understanding of numerical dispersion/dissipation ranges and their connection to model fine scales. Removing the leading-order truncation error of numerical schemes is thus an active topic of research, and each mathematical tool has to adapt to the characteristics of three-dimensional stratified and rotating flows. Studying the link between discretization errors and subgrid-scale parameterizations is also arguably one of the main challenges.

Complexity of the geometry, boundary layers, strong stratification and lack of resolution are the main sources of discretization errors in the numerical simulation of geophysical flows. This emphasizes the importance of the definition of the computational grids (and coordinate systems) both in the horizontal and vertical directions, and the necessity of truly multiresolution approaches. At the same time, the role of the small-scale dynamics on the large-scale circulation has to be taken into account through adequate parameterizations. Such parameterizations may be of deterministic as well as stochastic nature, and both approaches are taken by the AIRSEA team. The design of numerical schemes consistent with the parameterizations is also arguably one of the main challenges for the coming years. This work is complementary and linked to that on parameter estimation described in 3.3.

Ocean Atmosphere interactions and formulation of coupled models
State-of-the-art climate models (CMs) are complex systems under continuous development. A fundamental aspect of climate modeling is the representation of air-sea interactions. This covers a large range of issues: parameterizations of atmospheric and oceanic boundary layers, estimation of air-sea fluxes, time-space numerical schemes, non-conforming grids, coupling algorithms, etc. Many developments related to these different aspects have been carried out over the last 10-15 years, but in general they were conducted independently of each other.

The aim of our work is to revisit and enrich several aspects of the representation of air-sea interactions in CMs, paying special attention to their overall consistency with appropriate mathematical tools. We intend to work consistently on the physics and numerics. Using the theoretical framework of global-in-time Schwarz methods, our aim is to analyze the mathematical formulation of the parameterizations in a coupling perspective. From this study, we expect improved predictability in coupled models (this aspect will be studied using techniques described in 3.3). Complementary work on space-time nonconformities and acceleration of convergence of Schwarz-like iterative methods (see 8.1.2) are also conducted.

The high computational cost of the applications is a common and major concern to have in mind when deriving new methodological approaches. This cost increases dramatically with the use of sensitivity analysis or parameter estimation methods, and more generally with methods that require a potentially large number of model integrations.

Dimension reduction, using either stochastic or deterministic methods, is a way to significantly reduce the number of degrees of freedom, and therefore the computing time, of a numerical model.

Model reduction
Reduction methods can be deterministic (proper orthogonal decomposition, other reduced bases) or stochastic (polynomial chaos, Gaussian processes, kriging), and both fields of research are very active. Choosing one method over another strongly depends on the targeted application, which can be as varied as real-time computation, sensitivity analysis (see e.g., section 8.5) or optimisation for parameter estimation (see below).

Our goals are multiple, but they share a common need for certified error bounds on the output. Our team has a 4-year history of working on certified reduction methods and has a unique positioning at the interface between deterministic and stochastic approaches. It therefore seems interesting to conduct a thorough comparison of the two alternatives in the context of sensitivity analysis. Efforts will also be directed toward the development of efficient greedy algorithms for the reduction, and the derivation of goal-oriented sharp error bounds for nonlinear models and/or nonlinear outputs of interest. This will be complementary to our work on the deterministic reduction of the parametrized viscous Burgers and Shallow Water equations, where the objective is to obtain sharp error bounds providing confidence intervals for the estimation of sensitivity indices.
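For illustration, the following sketch shows the basic deterministic reduction step mentioned above: a POD (proper orthogonal decomposition) basis extracted from snapshots via an SVD, together with the truncation error that certified bounds build upon. The snapshot set and tolerance are toy choices, not those of our applications.

```python
# Hypothetical illustration of POD-based model reduction; names and
# dimensions are made up for the example.
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """Return a POD basis capturing (1 - tol) of the snapshot energy.

    snapshots: array of shape (n_dof, n_snapshots), each column a model state.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1
    # Squared Frobenius truncation error: sum of discarded singular values squared.
    err2 = np.sum(s[r:]**2)
    return U[:, :r], err2

# Toy usage: snapshots of a parametrized 1D field u(x; mu) = sin(mu * x).
x = np.linspace(0.0, np.pi, 200)
mus = np.linspace(1.0, 3.0, 50)
S = np.stack([np.sin(m * x) for m in mus], axis=1)
basis, err2 = pod_basis(S)
reduced = basis.T @ S            # reduced coordinates, shape (r, n_snapshots)
reconstructed = basis @ reduced
print(basis.shape[1], np.linalg.norm(S - reconstructed))
```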

Reduced models for coupling applications
Global and regional high-resolution oceanic models are either coupled to an atmospheric model or forced at the air-sea interface by fluxes computed empirically, preventing proper physical feedback between the two media. Thanks to high-resolution observational studies, the existence of air-sea interactions at oceanic mesoscales has been highlighted.

Multiphysics coupling often requires iterative methods to obtain a mathematically correct numerical solution. To mitigate the cost of the iterations, we will investigate the possibility of using reduced-order models for the iterative process. We will consider different ways of deriving a reduced model: coarsening of the resolution, degradation of the physics and/or numerical schemes, or simplification of the governing equations. At a mathematical level, we will strive to study the well-posedness and the convergence properties when reduced models are used. Indeed, running an atmospheric model at the same resolution as the ocean model is generally too expensive to be manageable, even for moderate resolution applications. To account for important fine-scale interactions in the computation of the air-sea boundary condition, the objective is to derive a simplified boundary layer model that is able to represent important 3D turbulent features in the marine atmospheric boundary layer.
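As a purely schematic illustration of the iterative coupling structure discussed above, the sketch below runs a global-in-time Schwarz (waveform relaxation) iteration between two overlapping 1D heat equations; the geometry, resolutions and transmission conditions are toy choices and do not reflect the actual ocean-atmosphere configuration or the reduced models mentioned above.

```python
# Minimal sketch of Schwarz waveform relaxation between two overlapping
# 1D heat equations over a full time window (toy configuration).
import numpy as np

def heat_subdomain(u0, left_bc, right_bc, dx, dt, nu):
    """Implicit-Euler solve of u_t = nu u_xx on one subdomain over a full
    time window, with time-dependent Dirichlet data at both ends."""
    n, nsteps = u0.size, left_bc.size
    A = np.eye(n) * (1 + 2 * nu * dt / dx**2)
    A += np.diag([-nu * dt / dx**2] * (n - 1), 1)
    A += np.diag([-nu * dt / dx**2] * (n - 1), -1)
    traj = np.empty((nsteps + 1, n))
    traj[0] = u0
    u = u0.copy()
    for k in range(nsteps):
        rhs = u.copy()
        rhs[0] += nu * dt / dx**2 * left_bc[k]
        rhs[-1] += nu * dt / dx**2 * right_bc[k]
        u = np.linalg.solve(A, rhs)
        traj[k + 1] = u
    return traj

nu, dx, dt, nsteps = 0.1, 0.02, 0.01, 50
x1 = np.arange(dx, 0.6, dx)            # interior nodes of subdomain [0, 0.6]
x2 = np.arange(0.4 + dx, 1.0, dx)      # interior nodes of subdomain [0.4, 1.0]
u1 = np.sin(np.pi * x1)
u2 = np.sin(np.pi * x2)
zero = np.zeros(nsteps)                # homogeneous outer boundaries
g1 = np.zeros(nsteps)                  # interface data for domain 1 at x = 0.6
g2 = np.zeros(nsteps)                  # interface data for domain 2 at x = 0.4

for it in range(20):                   # Schwarz iterations over the whole window
    traj1 = heat_subdomain(u1, zero, g1, dx, dt, nu)
    traj2 = heat_subdomain(u2, g2, zero, dx, dt, nu)
    # Exchange: each domain reads the other's previous iterate at its interface.
    new_g1 = traj2[1:, np.argmin(np.abs(x2 - 0.6))]
    new_g2 = traj1[1:, np.argmin(np.abs(x1 - 0.4))]
    update = max(np.max(np.abs(new_g1 - g1)), np.max(np.abs(new_g2 - g2)))
    g1, g2 = new_g1, new_g2
    if update < 1e-8:
        break
print("interface update after", it + 1, "iterations:", update)
```

Replacing one of the subdomain solvers by a coarser or physically simplified surrogate during the first iterations is, schematically, the kind of reduction discussed above.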

Reduced models for multiscale optimization
The field of multigrid methods for optimisation has seen tremendous development over the past few decades. However, it has not been applied to oceanic and atmospheric problems apart from some crude (non-converging) approximations or applications to simplified and low-dimensional models. This is mainly due to the high complexity of such models and to the difficulty of handling several grids at the same time. Moreover, due to complex boundaries and physical phenomena, the grid interactions and transfer operators are not trivial to define.

Multigrid solvers (or multigrid preconditioners) are efficient methods for the solution of variational data assimilation problems. We would like to take advantage of these methods to tackle the optimization problem in high-dimensional spaces. A high-dimensional control space arises when estimating parameter fields, or when controlling the full 4D (space-time) trajectory; the latter is important since it makes it possible to account for model errors. In that case, multigrid methods can be used to solve the large scales of the problem at a lower cost, potentially coupled with a scale decomposition of the variables themselves.

There are many sources of uncertainties in numerical models. They are due to imperfect external forcing, poorly known parameters, missing physics and discretization errors. Studying these uncertainties and their impact on the simulations is a challenge, mostly because of the high dimensionality and non-linear nature of the systems. To deal with these uncertainties we work on three axes of research, which are linked: sensitivity analysis, parameter estimation and risk assessment. They are based on either stochastic or deterministic methods.

Sensitivity analysis
Sensitivity analysis (SA), which links uncertainty in the model inputs to uncertainty in the model outputs, is a powerful tool for model design and validation. First, it can be a pre-stage for parameter estimation (see 3.3), allowing for the selection of the more significant parameters. Second, SA permits understanding and quantifying (possibly non-linear) interactions induced by the different processes defining, e.g., realistic ocean-atmosphere models. Finally, SA allows for validation of models, checking that the estimated sensitivities are consistent with what is expected by the theory.
For ocean, atmosphere and coupled systems, only first-order deterministic SA has been performed so far, neglecting the initialization process (data assimilation). AIRSEA members and collaborators have proposed using second-order information to provide consistent sensitivity measures, but so far this has only been applied to simple academic systems. Metamodels are now commonly used, due to the cost of each evaluation of complex numerical models: mostly Gaussian processes, whose probabilistic framework allows for the development of specific adaptive designs, and polynomial chaos, not only in the context of intrusive Galerkin approaches but also in a black-box setting. Until recently, global SA was based primarily on a set of engineering practices. New mathematical and methodological developments have enabled the numerical computation of Sobol' indices, with confidence intervals accounting for both metamodel and estimation errors. Approaches have also been extended to the case of dependent inputs, functional inputs and/or outputs, and stochastic numerical codes. Other types of indices and generalizations of Sobol' indices have also been introduced.
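As an illustration of what such a computation involves, the sketch below estimates first-order and total Sobol' indices with a standard pick-freeze Monte Carlo estimator on the classical Ishigami test function; the function, sample size and estimator variant are illustrative choices, not those used in the works cited above.

```python
# Pick-freeze (Saltelli/Jansen-type) estimation of Sobol' indices on a toy function.
import numpy as np

def ishigami(X, a=7.0, b=0.1):
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1])**2 + b * X[:, 2]**4 * np.sin(X[:, 0])

rng = np.random.default_rng(0)
n, d = 100_000, 3
A = rng.uniform(-np.pi, np.pi, size=(n, d))
B = rng.uniform(-np.pi, np.pi, size=(n, d))
fA, fB = ishigami(A), ishigami(B)
var = np.var(np.concatenate([fA, fB]))

for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                               # replace column i of A by that of B
    fABi = ishigami(ABi)
    S_first = np.mean(fB * (fABi - fA)) / var         # first-order index
    S_total = 0.5 * np.mean((fA - fABi)**2) / var     # total-effect index
    print(f"x{i + 1}: S = {S_first:.3f}, ST = {S_total:.3f}")
```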

Concerning the stochastic approach to SA we plan to work with parameters that show spatio-temporal dependencies and to continue toward more realistic applications where the input space is of huge dimension with highly correlated components. Sensitivity analysis for dependent inputs also introduces new challenges. In our applicative context, it would seem prudent to carefully learn the spatio-temporal dependences before running a global SA. In the deterministic framework we focus on second order approaches where the sought sensitivities are related to the optimality system rather than to the model; i.e., we consider the whole forecasting system (model plus initialization through data assimilation).

All these methods allow for computing sensitivities and more importantly a posteriori error statistics.

Parameter estimation
Advanced parameter estimation methods are barely used in ocean, atmosphere and coupled systems, mostly due to the difficulty of deriving adequate response functions, a lack of knowledge of these methods in the ocean-atmosphere community, and the huge associated computing costs. In the presence of strong uncertainties on the model but also on parameter values, simulation and inference are closely associated. Filtering for data assimilation and Approximate Bayesian Computation (ABC) are two examples of such an association.

The stochastic approach can be compared with the deterministic approach, which makes it possible to determine the sensitivity of the flow to parameters and to optimize their values relying on data assimilation. This approach has already been shown to be capable of selecting a reduced space of the most influential parameters in the local parameter space and of adapting their values to correct errors committed by the numerical approximation. It relies on automatic differentiation of the source code with respect to the model parameters, and on optimization of the obtained raw code.

AIRSEA assembles all the required expertise to tackle these difficulties. As mentioned previously, the choice of parameterization schemes and their tuning has a significant impact on the result of model simulations. Our research will focus on parameter estimation for parameterized Partial Differential Equations (PDEs) and also for parameterized Stochastic Differential Equations (SDEs). Deterministic approaches are based on optimal control methods and are local in the parameter space (i.e., the result depends on the starting point of the estimation), but thanks to adjoint methods they can cope with a large number of unknowns that can also vary in space and time. Multiscale optimization techniques as described in 8.3 will be one of the tools used. This in turn can be used either to propose a better (and smaller) parameter set or as a criterion for discriminating parameterization schemes. Statistical methods are global in the parameter space but may suffer from the curse of dimensionality. However, the notion of parameter can also be extended to functional parameters. We may consider as parameter a functional entity such as a boundary condition in time, or a probability density function in a stationary regime. For these purposes, non-parametric estimation will also be considered as an alternative.
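A minimal sketch of the deterministic, optimal-control-flavoured route is given below: a scalar parameter of a toy ODE model is calibrated against synthetic observations by minimizing a least-squares misfit. In practice the team relies on adjoint codes obtained by automatic differentiation rather than the generic black-box optimizer used here; the model and noise level are invented.

```python
# Toy parameter estimation by misfit minimization (assumed illustrative setup).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def model(kappa, t_obs):
    """Toy 'model': exponential relaxation u' = -kappa * u, u(0) = 1."""
    sol = solve_ivp(lambda t, u: -kappa * u, (0.0, t_obs[-1]), [1.0],
                    t_eval=t_obs, rtol=1e-8)
    return sol.y[0]

t_obs = np.linspace(0.1, 2.0, 20)
kappa_true = 0.7
rng = np.random.default_rng(1)
obs = model(kappa_true, t_obs) + 0.01 * rng.standard_normal(t_obs.size)

def misfit(p):
    return 0.5 * np.sum((model(p[0], t_obs) - obs)**2)

res = minimize(misfit, x0=[0.2], method="L-BFGS-B", bounds=[(1e-3, 5.0)])
print("estimated kappa:", res.x[0])
```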

Risk assessment
Risk assessment in the multivariate setting suffers from a lack of consensus on the choice of indicators. Moreover, once the indicators are designed, it still remains to develop estimation procedures that are efficient even for high risk levels. Recent developments for the assessment of financial risk have to be considered with caution, as methods may differ depending on whether they address general financial decisions or environmental risk assessment. Modeling and quantifying uncertainties related to extreme events is of central interest in environmental sciences. In relation to our scientific targets, risk assessment is very important in several areas: hydrological extreme events, cyclone intensity, storm surges, etc. Environmental risks most of the time involve several aspects which are often correlated. Moreover, even in the ideal case where the focus is on a single risk source, we have to face the temporal and spatial nature of environmental extreme events.
The study of extremes within a spatio-temporal framework remains an emerging field where the development of adapted statistical methods could lead to major progress in terms of geophysical understanding and risk assessment, by coupling data and model information.

Based on the above considerations, we aim to answer the following scientific questions: how to measure risk in a multivariate/spatial framework? How to estimate risk in a non-stationary context? How to reduce dimension (see 3.2) for a better estimation of spatial risk?

Extreme events are rare, which means that little data is available to infer risk measures; observation-based risk assessment therefore relies on multivariate extreme value theory. Interacting particle systems for the analysis of rare events are commonly used in the computer experiments community. An open question is the relevance of such tools for the evaluation of environmental risk.
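The sketch below illustrates the interacting-particle idea on a toy problem: an adaptive multilevel splitting estimator of a small exceedance probability for a Gaussian input. The score function, thresholds and Markov moves are arbitrary choices made for the example, not those of an environmental application.

```python
# Adaptive multilevel splitting for a rare exceedance probability (toy setup).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
phi = lambda x: x.sum(axis=-1)           # score of a d-dimensional Gaussian input
d, n, rho = 4, 2000, 0.25                # dimension, particles, survival fraction
threshold = 10.0                         # rare level

X = rng.standard_normal((n, d))
prob = 1.0
while True:
    scores = phi(X)
    level = min(np.quantile(scores, 1.0 - rho), threshold)
    alive = scores >= level
    prob *= alive.mean()
    if level >= threshold:
        break
    # Resample dead particles from survivors and move them with a few
    # Metropolis steps targeting the standard Gaussian restricted to {phi >= level}.
    idx = rng.choice(np.flatnonzero(alive), size=n)
    X = X[idx]
    for _ in range(5):
        prop = X + 0.5 * rng.standard_normal(X.shape)
        accept = (phi(prop) >= level) & (
            rng.random(n) < np.exp(0.5 * (np.sum(X**2, 1) - np.sum(prop**2, 1))))
        X[accept] = prop[accept]

print("estimated probability:", prob, " reference value:", norm.sf(threshold / np.sqrt(d)))
```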

Most numerical models are unable to accurately reproduce extreme events. There is therefore a real need to develop efficient assimilation methods for the coupling of numerical models and extreme data.

Methods for sensitivity analysis, parameter estimation and risk assessment are extremely costly due to the number of model evaluations they require. This number of simulations, which demands considerable computational resources, depends on the complexity of the application, the number of input variables and the desired quality of the approximations. To this end, the AIRSEA team is an intensive user of HPC computing platforms, particularly grid computing platforms. The associated grid deployment has to handle the scheduling of a huge number of computational requests and the data-management links between these requests, all of this as automatically as possible. In addition, there is an increasing need to propose efficient numerical algorithms specifically designed for new (or future) computing architectures, and this is part of our scientific objectives. Given the computational cost of our applications, the evolution of high performance computing platforms has to be taken into account for several reasons. While our applications are able to exploit space parallelism to its full extent (oceanic and atmospheric models are traditionally based on a spatial domain decomposition method), the spatial discretization step size limits the efficiency of traditional parallel methods. The inherent parallelism is thus modest, particularly in the case of relatively coarse resolution but very long integration time (e.g., climate modeling). Paths toward new programming paradigms are thus needed. As a step in that direction, we plan to focus our research on parallel-in-time methods.

New numerical algorithms for high performance computing
Parallel in time methods can be classified into three main groups. In the first group, we find methods using parallelism across the method, such as parallel integrators for ordinary differential equations. The second group considers parallelism across the problem. Falling into this category are methods such as waveform relaxation
where the space-time system is decomposed into a set of subsystems which can then be solved independently using some form of relaxation techniques or multigrid reduction in time.
The third group of methods focuses on parallelism across the steps. One of the best known algorithms in this family is parareal.
Other methods combining the strengths of those listed above (e.g., PFASST) are currently under investigation in the community.

Parallel-in-time methods are iterative methods that may require a large number of iterations before convergence. Our first focus will be on the convergence analysis of parallel-in-time (Parareal / Schwarz) methods for the equation systems of oceanic and atmospheric models. Our second objective will be the construction of fast (approximate) integrators for these systems. This part is naturally linked to the model reduction methods of Section 8.3.1. Fast approximate integrators are required both in the Schwarz algorithm (where a first guess of the boundary conditions is required) and in the Parareal algorithm (where the fast integrator is used to connect the different time windows). Our main application of these methods will be climate (i.e., very long time) simulations. Our second application of parallel-in-time methods will be in the context of optimization methods. In fact, one of the major drawbacks of the optimal control techniques used in 3.3 is a lack of intrinsic parallelism in comparison with ensemble methods. Here, parallel-in-time methods also offer a path toward better efficiency. The mathematical key point is how to efficiently couple two iterative methods (i.e., parallel-in-time and optimization methods).
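For concreteness, the following sketch shows the parareal iteration on a small linear ODE: a cheap coarse propagator corrects an expensive fine propagator across time slices. Everything here runs serially and the operators are toy choices; only the algorithmic structure is meaningful.

```python
# Minimal parareal sketch on a linear ODE u' = A u (toy configuration).
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, -0.1]])   # damped oscillator
u0 = np.array([1.0, 0.0])
T, N = 10.0, 20                             # time window, number of coarse slices
dT = T / N

def rk4(u, dt, nsub):
    for _ in range(nsub):
        k1 = A @ u
        k2 = A @ (u + 0.5 * dt * k1)
        k3 = A @ (u + 0.5 * dt * k2)
        k4 = A @ (u + dt * k3)
        u = u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

coarse = lambda u: rk4(u, dT, 1)            # one big step (cheap propagator)
fine = lambda u: rk4(u, dT / 100, 100)      # many small steps (expensive propagator)

# Initial guess from the coarse propagator alone.
U = [u0]
for n in range(N):
    U.append(coarse(U[-1]))

for k in range(N):                          # parareal iterations
    F = [fine(U[n]) for n in range(N)]      # embarrassingly parallel in practice
    G_old = [coarse(U[n]) for n in range(N)]
    U_new = [u0]
    for n in range(N):
        U_new.append(coarse(U_new[-1]) + F[n] - G_old[n])
    converged = max(np.linalg.norm(U_new[n] - U[n]) for n in range(N + 1)) < 1e-10
    U = U_new
    if converged:
        break

print("parareal iterations used:", k + 1)
print("error vs serial fine solve:",
      np.linalg.norm(U[-1] - rk4(u0, dT / 100, 100 * N)))
```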

The evolution of natural systems, in the short, mid, or long term, has extremely important consequences for both the global Earth system and humanity. Forecasting this evolution is thus a major challenge from the scientific, economic, and human viewpoints.

Humanity has to face the problem of global warming, brought on by the
emission of greenhouse gases from human activities. This warming will probably cause huge changes at global and regional
scales, in terms of climate, vegetation and biodiversity, with major consequences for local populations.
Research has therefore been conducted over the past 15 to 20 years in an effort to
model the Earth's climate and forecast its evolution in the 21st century in response to anthropic
action.

With regard to short-term forecasts, the best and oldest example is of course weather forecasting.
Meteorological services have been providing daily short-term forecasts for several decades which are of
crucial importance for numerous human activities.

Numerous other problems can also be mentioned, like seasonal weather forecasting (to enable powerful phenomena like El Niño to be anticipated several months in advance), operational oceanography (short-term forecasts of the evolution of the ocean system to provide services for the fishing industry, ship routing, defense, or the fight against marine pollution) or the prediction of floods.

As mentioned previously, mathematical and numerical tools are omnipresent and play a fundamental role in these areas of research. In this context, the vocation of AIRSEA is not to carry out numerical prediction, but to address mathematical issues raised by the development of prediction systems for these application fields, in close collaboration with geophysicists.

Most of the research activities of the AIRSEA team are directed towards the improvement of numerical systems of the ocean and the atmosphere. This includes the development of appropriate numerical methods, model/parameter calibration using observational data and uncertainty quantification for decision making. The AIRSEA team members work in close collaboration with researchers in the field of geophysical fluids and are partners of several interdisciplinary projects. They also strongly contribute to the development of state-of-the-art numerical systems, like NEMO and CROCO in the ocean community.

SWEET supports periodic boundary conditions for:
 * the bi-periodic plane (2D torus)
 * the sphere

Space discretization:
 * PLANE: Spectral methods based on Fourier space
 * PLANE: Finite differences
 * SPHERE: Spherical Harmonics

Time discretization:
 * Explicit RK
 * Implicit RK
 * Leapfrog
 * Crank-Nicolson
 * Semi-Lagrangian
 * Parallel-in-time:
   * Parareal
   * PFASST
   * Rational approximation of exponential Integrators (REXI)
 * ...and many more time steppers...

Special features:
 * Graphical user interface
 * Fast Helmholtz solver in spectral space
 * Easy-to-code in C++
 * ...

There is support for various applications:
 * Shallow-water equations on plane/sphere
 * Advection
 * Burgers' equation
 * ...

Accurate and stable implementation of bathymetry boundary conditions in ocean models remains a challenging problem. Generalized terrain-following coordinates are often used in ocean models, but they require smoothing the bathymetry to reduce pressure gradient errors. Geopotential z-coordinates are a common alternative that avoid pressure gradient and numerical diapycnal diffusion errors, but they generate spurious flow due to their "staircase" geometry. In 47, we introduce a new Brinkman volume penalization to approximate the no-slip boundary condition and complex geometry of bathymetry in ocean models. This approach corrects the staircase effect of z-coordinates, does not introduce any new stability constraints on the geometry of the bathymetry and is easy to implement in an existing ocean model. The porosity parameter allows modelling subgrid-scale details of the geometry. As an illustration, through the use of penalization methods, the Gulf Stream detachment is correctly represented in a 1/8 degree simulation (see 1). These new results on realistic applications have been published in 7. This opens the door to a clear improvement of climate models in which a good representation of this mechanism is essential. This work is currently being extended to z-coordinate ocean models through the PhD work of A. Nasser. We have also investigated the representation of coastlines and its sensitivity to the mesh orientation in 34.
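The sketch below illustrates the volume penalization idea on a 1D advection-diffusion toy problem: a mask marks the "solid" region and a stiff Brinkman-type relaxation drives the solution toward zero there, mimicking a no-slip condition without fitting the grid to the geometry. Parameters and geometry are arbitrary and unrelated to the CROCO implementation.

```python
# Toy Brinkman volume penalization for a 1D advection-diffusion field.
import numpy as np

nx, L = 400, 1.0
dx = L / nx
x = (np.arange(nx) + 0.5) * dx
nu, c = 1e-3, 1.0
eta = 1e-4                          # penalization parameter (small => stiff relaxation)
solid = (x > 0.7).astype(float)     # "staircase" bathymetry occupying x > 0.7

u = np.exp(-200 * (x - 0.3)**2)     # initial velocity-like bump
dt = 0.4 * min(dx / c, dx**2 / (2 * nu), eta)
for _ in range(12000):
    um = np.roll(u, 1)
    up = np.roll(u, -1)
    adv = -c * (u - um) / dx                       # upwind advection
    diff = nu * (up - 2 * u + um) / dx**2          # diffusion
    penal = -solid / eta * u                       # Brinkman penalization toward u = 0
    u = u + dt * (adv + diff + penal)

print("max |u| inside the penalized region:", np.abs(u[solid > 0]).max())
```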

With the increase of resolution, the hydrostatic assumption becomes less valid and the AIRSEA group also works on the development of non-hydrostatic ocean models. The treatment of non-hydrostatic incompressible flows leads to a 3D elliptic system for pressure that can be ill-conditioned, in particular with non-geopotential vertical coordinates. That is why we favor the use of the non-hydrostatic compressible equations, which remove the need for a 3D elliptic solve at the price of reintroducing acoustic waves. For that purpose, a detailed analysis of acoustic-gravity waves in a free-surface compressible and stratified ocean was carried out 41, in part in the PhD of E. Duval. The proposed numerical approach has been implemented in the CROCO ocean model and tested in various flow configurations 57, 13.

In the context of the French initiative CROCO (Coastal and Regional Ocean COmmunity model, www.croco-ocean.org) for the development of a new oceanic modeling system, Emilie Duval worked on the design of methods to couple local nonhydrostatic models to larger scale hydrostatic ones. Such a coupling is quite delicate from a mathematical point of view, due to the different nature of hydrostatic and nonhydrostatic equations (where the vertical velocity is either a diagnostic or a prognostic variable). A thorough analysis of the different families of waves that can be present in these equations was performed. Moreover a decomposition of the solutions into vertical modes, which is quite usual in the hydrostatic case, has been generalized in the nonhydrostatic case, which could be an interesting lead for coupling algorithms. A prototype has been implemented, which allows for analytical solutions in simplified configurations and makes it possible to test different numerical coupling approaches. Emilie Duval defended her PhD on 15 December 2022.

The Airsea team is involved in the modeling and algorithmic aspects of ocean-atmosphere (OA) coupling. For the last few years we have been actively working on the analysis of such coupling both in terms of its continuous and numerical formulation 62, 63. Our activities can be divided into three general topics:

Continuous and discrete analysis of Schwarz algorithms for OA coupling:
we have been developing coupling approaches for several years, based on
so-called Schwarz algorithms. Schwarz-like domain decomposition methods are
very popular in mathematics, computational sciences and engineering notably
for the implementation of coupling strategies. However, for complex applications
(like in OA coupling) it is challenging to have an a priori knowledge of the
convergence properties of such methods. Indeed coupled problems arising in Earth
system modeling often exhibit sharp turbulent boundary layers whose parameterizations
lead to peculiar transmission conditions and diffusion coefficients. In the
framework of S. Thery's PhD (defended in February 2021), the well-posedness of the non-linear coupling problem including parameterizations has been addressed, and a detailed continuous analysis of the convergence properties of the Schwarz methods has been pursued to disentangle the impact of the different parameters at play in such a coupling problem 78, 16. In S. Clement's PhD, a general framework has been proposed to study the convergence properties at a (semi-)discrete level to allow a systematic comparison with the results obtained from the continuous problem 5. Such a framework allows studying more complex coupling problems whose formulation is representative of the discretization used in realistic coupled models 46.

Within the COCOA project, a Schwarz-like iterative method has been applied in a state-of-the-art Earth-System model to evaluate the consequences of inaccuracies in the usual ad-hoc ocean-atmosphere coupling algorithms used in realistic models 68, 69. Numerical results obtained with an iterative process show large differences at sunrise and sunset compared to usual ad-hoc algorithms thus showing that synchrony errors inherent to ad-hoc coupling methods can be significant. The objective now is to reduce the computational cost of Schwarz algorithms using deep learning techniques.

Representation of the air-sea interface in coupled models:
During the PhD thesis of Charles Pelletier, the focus was on including the
formulation of physical parameterizations in the theoretical analysis of the
coupling, in particular the parameterization schemes to compute air-sea fluxes.
Following this work, a novel and rigorous framework for a consistent two-sided
modeling of the surface boundary layer has been proposed 72.
This framework allows for a more general representation of the vertical physics
at the air-sea interface while improving the mathematical regularity of the
numerical solutions. Moreover, it is flexible enough to include additional
physical parameters for example to account for the effect of surface waves in
the turbulent flux computation. This work is the first step toward more adequate
discretization methods for the parameterization of surface and planetary boundary
layers in coupled models.

At a more fundamental level, in collaboration with A. Wirth, we have studied turbulent fluctuations in a coupled Ekman layer problem with randomized drag coefficient 81.

These topics are addressed through strong collaborations between the applied mathematicians and the climate and operational community (Meteo-France, Ifremer, SHOM, Mercator-Ocean, LMD, and LOCEAN). Our work on ocean-atmosphere coupling has steadily matured over the last few years and has reached a point where it triggers interest from the physicists. Through the funding of the projects ANR COCOA (2017-2021, PI: E. Blayo) and SHOM 19CP07 (2020-2024, PI: F. Lemarié), Airsea team members play a major role in structuring a multi-disciplinary scientific community working on ocean-atmosphere coupling, spanning a broad range from mathematical theory to practical implementations in climate and operational models.

Following the work started in 2021, the AIRSEA team continued to study new topics around physics-dynamics coupling 53. Schematically, numerical models consist of two blocks generally identified as "physics" and "dynamics", which are often developed separately. The "physics" represents unresolved or under-resolved processes with typical scales below the model resolution, while the "dynamics" corresponds to a discrete representation in space and time of resolved processes. Unresolved processes cannot be ignored because they directly influence the resolved part of the flow, since energy is continuously transferred between scales. The interplay between resolved and unresolved scales is a large, incomplete and complex topic for which there is still much to do within the Earth system modeling community 64.

In current models, the scale separation between the resolved part of the flow and the unresolved part is carried out via the Reynolds decomposition, which corresponds to a filtering of the Navier-Stokes equations. Such a filtering leads to an evolution equation for the large-scale flow containing unknown terms (often called Reynolds stress terms) which represent the average contribution of the small-scale processes. The system is closed by so-called parameterizations that arbitrarily relate the contribution of small-scale processes to large-scale variables. In this context, the AIRSEA team has started work on two aspects: energetically consistent discretizations and the parameterization of convection.

Those topics are addressed through collaborations with the climate and operational community (Meteo-France, SHOM, Mercator-Ocean, and IGE). Two projects are currently funded, one on the energetically consistent discretization aspect (SHOM 19CP07, 2020-2024, PI: F. Lemarié) and one on the convection parameterization (Institut des Mathématiques pour la Planète Terre, 2021-2024, PIs: F. Lemarié and G. Madec).

Artificial intelligence and machine learning may be considered as a potential way to address unresolved model scales and to approximate poorly known processes such as dissipation, which occurs essentially at small scales. In order to understand how to combine a numerical model with a neural network trained with the aid of external data, we developed a network generation and learning algorithm and used it to approximate nonlinear model operators.

A potential way to reconstruct subgrid scales consists in applying Image Super-Resolution methods, which refer to the process of recovering high-resolution images from low-resolution ones in computer vision and image processing. Recent years have shown remarkable progress in image super-resolution using machine learning techniques 80. We use this methodology to identify the fine structure of the chaotic turbulent solution of a simple barotropic ocean model. After learning the flow patterns produced by the high-resolution model, the neural network can identify fine structures in the low-resolution model solution with better precision than bicubic interpolation.

Different neural network architectures have been analyzed. Fully-connected networks, basic convolutional and encoder-decoder architectures 60, as well as mixed architectures were compared with each other and with classical interpolation of the model solution on a low-resolution grid.
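As a purely illustrative sketch of the super-resolution setup, the PyTorch snippet below defines a tiny convolutional network with a pixel-shuffle upsampling layer and trains it on synthetic coarse/fine pairs; the architecture, data and hyperparameters are stand-ins, not those compared in the study.

```python
# Tiny convolutional super-resolution network trained on synthetic pairs (assumed setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySRNet(nn.Module):
    def __init__(self, upscale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, upscale * upscale, kernel_size=3, padding=1),
            nn.PixelShuffle(upscale),        # rearranges channels into a finer grid
        )

    def forward(self, coarse):
        return self.body(coarse)

model = TinySRNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for step in range(200):
    fine = torch.randn(8, 1, 64, 64)             # stand-in for high-resolution snapshots
    coarse = F.avg_pool2d(fine, 2)                # stand-in for low-resolution model output
    opt.zero_grad()
    loss = loss_fn(model(coarse), fine)
    loss.backward()
    opt.step()
```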

At the present time, the observation of the Earth from space is carried out by more than thirty satellites. These platforms provide two kinds of observational information:

Our current developments target the use of learning methods to describe the evolution of the images. This approach has been applied to the tracking of oceanic oil spills in the framework of Long Li's PhD, co-supervised with Jianwei Ma. It led to the publication of 12.

Accounting for realistic observation errors is a known bottleneck in data assimilation, because dealing with error correlations is complex. Following a previous study on this subject, we propose to use multiscale modelling, more precisely the wavelet transform, to address this question. In 45 we investigate the problem further by addressing two issues arising in real-life data assimilation: how to deal with partially missing data (e.g., data concealed by an obstacle between the sensor and the observed system), and how to solve convergence issues associated with complex observation error covariance matrices. Two adjustments relying on wavelet modelling are proposed to deal with these issues and offer significant improvements. The first consists in adjusting the variance coefficients in the frequency domain to account for masked information. The second consists in a gradual assimilation of frequencies. Both fully rely on the multiscale properties associated with wavelet covariance modelling.

This kind of work was put into application in STORM, a collaborative project with Université de La Réunion that ended this year. Our role was to prepare for the assimilation of data collected by sea turtles, and in particular to work on the description of observation error statistics.

The high computational cost of complex numerical simulations is a common and major concern when deriving new methodological approaches. This cost increases dramatically with the use of sensitivity analysis or parameter estimation methods, and more generally with any method requiring numerous model integrations. Model reduction, using either stochastic or deterministic methods, is a way to significantly reduce the computing time of a numerical model. Over the past year, our team focused on different aspects of reduction, described below.

In the paper 29, we address the problem of the parametrization and the learning of monotone triangular transport maps. Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps, i.e., approximations of the Knothe-Rosenblatt (KR) rearrangement, are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.

In the paper 28, we consider the problem of reducing the dimensions of parameters and data in non-Gaussian Bayesian inference problems. Our goal is to identify an “informed” subspace of the parameters and an “informative” subspace of the data so that a high-dimensional inference problem can be approximately reformulated in low-to-moderate dimensions, thereby improving the computational efficiency of many inference techniques. To do so, we exploit gradient evaluations of the log-likelihood function. Furthermore, we use an information-theoretic analysis to derive a bound on the posterior error due to parameter and data dimension reduction. This bound relies on logarithmic Sobolev inequalities, and it reveals the appropriate dimensions of the reduced variables. We compare our method with classical dimension reduction techniques, such as principal component analysis and canonical correlation analysis, on applications ranging from mechanics to image processing.
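A rough sketch of the gradient-based mechanism is given below: outer products of log-likelihood gradients are averaged over the prior, and the leading eigenvectors span the "informed" parameter subspace. The linear-Gaussian toy likelihood is invented, and the paper's error bounds and diagnostics are not reproduced.

```python
# Gradient-based detection of an informed parameter subspace (toy likelihood).
import numpy as np

rng = np.random.default_rng(3)
d, m, n_samples = 20, 10, 2000
# Forward operator: only the first 3 parameter directions are strongly observed.
G = rng.standard_normal((m, d)) @ np.diag(np.r_[np.ones(3), 1e-3 * np.ones(d - 3)])
sigma = 0.1
y = G @ rng.standard_normal(d) + sigma * rng.standard_normal(m)

def grad_log_likelihood(theta):
    # log p(y | theta) = -||y - G theta||^2 / (2 sigma^2) + const
    return G.T @ (y - G @ theta) / sigma**2

# Monte Carlo average of gradient outer products over the prior N(0, I).
H = np.zeros((d, d))
for _ in range(n_samples):
    g = grad_log_likelihood(rng.standard_normal(d))
    H += np.outer(g, g) / n_samples

eigval, eigvec = np.linalg.eigh(H)
order = np.argsort(eigval)[::-1]
print("leading eigenvalues:", eigval[order][:6])
informed_basis = eigvec[:, order[:3]]      # low-dimensional "informed" subspace
```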

Markov chain Monte Carlo (MCMC) methods form one of the algorithmic foundations of Bayesian inverse problems. The recent development of likelihood-informed subspace (LIS) methods offers a viable route to designing efficient MCMC methods for exploring high-dimensional posterior distributions via exploiting the intrinsic low-dimensional structure of the underlying inverse problem. However, existing LIS methods and the associated performance analysis often assume that the prior distribution is Gaussian. This assumption is limited for inverse problems aiming to promote sparsity in the parameter estimation, as heavy-tailed priors, e.g., Laplace distribution or the elastic net commonly used in Bayesian LASSO, are often needed in this case. To overcome this limitation, we consider in the paper 6 a prior normalization technique that transforms any non-Gaussian (e.g. heavy-tailed) priors into standard Gaussian distributions, which makes it possible to implement LIS methods to accelerate MCMC sampling via such transformations. We also rigorously investigate the integration of such transformations with several MCMC methods for high-dimensional problems. Finally, we demonstrate various aspects of our theoretical claims on two nonlinear inverse problems.

In a joint work 14 with Didier Georges (GIPSA Lab, Grenoble) and Mathieu Oliver (internship student), we proposed a spatialized extension of a SIR model that accounts for undetected infections and recoveries as well as the load on hospital services. The spatialized compartmental model we introduced is governed by a set of partial differential equations (PDEs) defined on a spatial domain with a complex boundary. We proposed to solve the set of PDEs defining our model by using a meshless numerical method based on a finite difference scheme in which the spatial operators were approximated using radial basis functions. We then calibrated our model on the French department of Isère during the first lockdown period, using daily reports of hospital occupancy in France. Our methodology allowed us to simulate the spread of the Covid-19 pandemic at a departmental level, and for each compartment. However, the simulation cost prevented online short-term forecasting. We therefore proposed to rely on reduced order modeling tools to compute short-term forecasts of the number of infections. The strategy consisted in learning a time-dependent reduced order model with few compartments from a collection of evaluations of our spatialized detailed model, varying initial conditions and parameter values. A set of reduced bases was learnt in an offline phase, while the projection on each reduced basis and the selection of the best projection were performed online, allowing short-term forecasts of the global number of infected individuals in the department. This work is ongoing in the framework of Robin Vaudry's PhD (co-supervised with Didier Georges). We are investigating the more complex setting of spatialized models taking into account vaccination and the loss of immunity.

In the framework of Arthur Macherey’s PhD (defended in June 2021), in collaboration with Anthony Nouy and Marie Billaud-Friess (Ecole Centrale Nantes), we have proposed algorithms for solving high-dimensional Partial Differential Equations (PDEs) that combine a probabilistic interpretation of PDEs, through the Feynman-Kac representation, with sparse interpolation 43. Monte-Carlo methods and time-integration schemes are used to estimate pointwise evaluations of the solution of a PDE. We use a sequential control variates algorithm, where control variates are constructed based on successive approximations of the solution of the PDE. We are now interested in solving parametrized PDEs with stochastic algorithms in the framework of potentially high-dimensional parameter spaces. A preliminary step was the development of a PAC algorithm in relative precision for bandit problems with costly sampling 3.
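The following sketch illustrates the Feynman-Kac idea in its simplest form: a pointwise Monte Carlo estimate of the solution of the Laplace equation on the unit square, obtained by averaging boundary data over Brownian paths stopped at the boundary. Neither the sequential control variates nor the sparse interpolation of the cited work are included; the boundary data and discretization are toy choices.

```python
# Feynman-Kac pointwise estimate of a harmonic function: u(x) = E[g(B_tau)].
import numpy as np

def g(p):                                   # boundary data: harmonic polynomial x^2 - y^2
    return p[0]**2 - p[1]**2

def fk_estimate(x0, n_paths=2000, dt=1e-3, rng=np.random.default_rng(4)):
    total = 0.0
    for _ in range(n_paths):
        p = np.array(x0, dtype=float)
        # Random walk (Euler discretization of Brownian motion) until exit.
        while 0.0 < p[0] < 1.0 and 0.0 < p[1] < 1.0:
            p += np.sqrt(dt) * rng.standard_normal(2)
        np.clip(p, 0.0, 1.0, out=p)         # project the small overshoot back to the boundary
        total += g(p)
    return total / n_paths

x0 = (0.3, 0.6)
print("Monte Carlo estimate:", fk_estimate(x0), " exact value:", g(np.array(x0)))
```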

Reduced models are also developed in the framework of robust inversion. In 48, we have combined a new greedy algorithm for functional quantization with a Stepwise Uncertainty Reduction strategy to solve a robust inversion problem under functional uncertainties. In a more recent work, we further reduced the number of simulations required to solve the same robust inversion problem, based on Gaussian process meta-modeling on the joint input space of deterministic control parameters and functional uncertain variables 49. These results are applied to automotive depollution. This research axis was conducted in the framework of the Chair OQUAIDO. It is still active in the team through Clément Duhamel's PhD, in collaboration with Céline Helbert (Ecole Centrale Lyon) and Miguel Munoz Zuniga and Delphine Sinoquet (IFPEN, Rueil Malmaison) 32.

Forecasting geophysical systems requires complex models, which sometimes need to be coupled, and which make use of data assimilation. The objective of this project is, for a given output of such a system, to identify the most influential parameters and to evaluate the effect of uncertainty in input parameters on the model output. Existing stochastic tools are not well suited for high-dimensional problems (in particular time-dependent problems), while deterministic tools are fully applicable but only provide limited information. The challenge is thus to gather expertise on numerical approximation and control of Partial Differential Equations on the one hand, and on stochastic methods for sensitivity analysis on the other hand, in order to develop and design innovative stochastic solutions to study high-dimensional models and to propose new hybrid approaches combining stochastic and deterministic methods. We took part in the writing of a position paper on the future of sensitivity analysis 75.

An important challenge for stochastic sensitivity analysis is to develop methodologies which work for dependent inputs. Recently, the Shapley value, from econometrics, was proposed as an alternative to quantify the importance of random input variables to a function. Owen 71 derived Shapley value importance for independent inputs and showed that it is bracketed between two different Sobol' indices. Song et al. 76 recently advocated the use of the Shapley value for the case of dependent inputs. In a recent work 70, in collaboration with Art Owen (Stanford University), we showed that the Shapley value removes the conceptual problems of functional ANOVA for dependent inputs. We also investigated further the properties of Shapley effects in 59. By the end of 2021, Clémentine Prieur started a collaboration with Elmar Plischke (TU Clausthal, Germany) and Emanuele Borgonovo (Bocconi University, Milan, Italy) to estimate total Sobol' indices as a measure for variable selection, even in the framework of dependent inputs. In particular, it makes it possible to estimate total Sobol' indices for inputs defined on a non-rectangular domain. This setting is of particular interest for applications where the input space is reduced due to physical constraints on the quantity of interest. This last setting was encountered, e.g., in María Belén Heredia's PhD thesis (defended in December 2020), and analyzed by estimating Shapley effects with a nonparametric procedure based on nearest neighbors 1 (see Section 8.5.2 for more details). In October 2021, Ri Wang started a PhD, cosupervised by Clémentine Prieur and Véronique Maume-Deschamps (ICJ, Lyon 1), on the estimation of quantile-oriented sensitivity indices in the framework of dependent inputs, by means of random forests or other machine learning tools. Ri Wang has received funding from the Chinese Scientific Council.

Another research direction for global SA algorithms starts with the observation that most of the algorithms to compute sensitivity measures require special sampling schemes or additional model evaluations, so that available data from previous model runs (e.g., from an uncertainty analysis based on Latin Hypercube Sampling) cannot be reused. One challenging task for estimating global sensitivity measures consists in recycling an available finite set of input/output data. Green sensitivity analysis, by recycling, avoids waste. Such given-data approaches have been discussed, e.g., in 73, 74. Most given-data procedures depend on parameters (number of bins, truncation argument…) that are not easy to calibrate from a bias-variance tradeoff perspective. Adaptive selection of these parameters remains a challenging issue for most of these given-data algorithms. In the context of María Belén Heredia's PhD thesis, we have proposed 42 a non-parametric given-data estimator for aggregated Sobol' indices, introduced in 61 and further developed in 51 for multivariate or functional outputs. We also introduced aggregated Shapley effects and we have extended a nearest-neighbor estimation procedure to estimate these indices 1. We also started a collaboration with Sébastien Da Veiga (Safran Tech), Agnès Lagnoux, Thierry Klein and Fabrice Gamboa (Institut de Mathématiques de Toulouse) on a new nonparametric estimation procedure for closed Sobol' indices of any order based on degenerate U-statistics.

Many models are stochastic in nature, and some of them may be driven by parametrized stochastic differential equations (SDE). It is important for applications to propose a strategy to perform global sensitivity analysis (GSA) for such models, in the presence of uncertainties on the parameters. In collaboration with Pierre Etoré (DATA department in Grenoble), Clémentine Prieur proposed an approach based on Feynman-Kac formulas 50. The research on GSA for stochastic simulators is still ongoing, first in the context of the MATH-AmSud project FANTASTIC (Statistical inFerence and sensitivity ANalysis for models described by sTochASTIC differential equations) with Chile and Uruguay, secondly through the PhD thesis of Henri Mermoz Kouye, co-supervised by Clémentine Prieur, in collaboration with Gildas Mazo and Eliza Vergu (INRAE, département MIA, Jouy). Note that our recent developments with P. Etoré on GSA for parametrized SDEs are strongly related to reduced order modeling (see Section 8.3), as GSA requires intensive computations of the quantity of interest. In collaboration with Pierre Etoré and Joël Andrepont (master internship started in spring 2021), Clémentine Prieur is working on GSA for parametrized SDEs based on the Fokker-Planck equation and kernel-based sensitivity indices. Note that a joint work between Pierre Etoré, Clémentine Prieur and Jose R. Leon has been submitted, related to exact or approximated computation of Kolmogorov hypoelliptic equations (KHE). Even if it does not deal with GSA, it could be a starting point for analyzing sensitivity for models described by a parametrized version of KHE. Concerning Henri Mermoz Kouye's PhD thesis, the approach is different. We are interested in GSA for compartmental stochastic models. Our methodology relies on a deterministic representation of continuous-time Markov chain stochastic compartmental models 33. Henri Mermoz Kouye defended his PhD in December 2022.

Pesticide transfer models are valuable tools to predict and prevent pollution of water bodies. However, using such models in operational contexts requires a strong knowledge of their structure, including influential parameters. This project aims at performing global sensitivity analysis (GSA) of the PESHMELBA model (pesticide and hydrology: modelling at the catchment scale). This task is made difficult by the modular, complex structure of the model, which couples different physical processes. This results in a large input space dimension and a high computational cost that limits the number of available runs. Using classical GSA tools such as Sobol' indices is thus not feasible. In order to circumvent those limitations, we also explored alternative techniques such as the HSIC dependence measure or a Random Forest metamodel. The use of such methods in the specific context of a spatially distributed output was presented in a submitted journal paper 36. We extended this work to spatiotemporal outputs, as presented in 15.

To conclude Section 8.5, let us mention that Clémentine Prieur took part in the writing of a monograph on recent trends in sensitivity analysis, which appeared at the end of 2021 79.

Physically-based avalanche propagation models must still be locally calibrated to provide robust predictions, e.g., in long-term forecasting and subsequent risk assessment. Friction parameters cannot be measured directly and need to be estimated from observations. Rich and diverse data are now increasingly available from test sites, but for measurements made along the flow propagation, potential autocorrelation should be explicitly accounted for. In the context of María Belén Heredia's PhD, in collaboration with IRSTEA Grenoble, we have proposed in 56 a comprehensive Bayesian calibration and statistical model selection framework, with application to an avalanche sliding block model with the standard Voellmy friction law and high-rate photogrammetric images. An avalanche released at the Lautaret test site and a synthetic data set based on this avalanche were used to test the approach. Results have demonstrated (i) the efficiency of the proposed calibration scheme, and (ii) that including autocorrelation in the statistical modelling definitely improves the accuracy of both parameter estimation and velocity predictions.

In the context of the energy transition, wind power generation is developing rapidly in France and worldwide. Research and innovation on wind resource characterisation, turbine control, coupled mechanical modelling of wind systems or technological development of offshore wind turbine floaters are current research topics. In particular, the monitoring and maintenance of wind turbines are becoming major issues. Current solutions do not take full advantage of the large amount of data provided by sensors placed on modern wind turbines in production. These data could be advantageously used in order to refine the predictions of production, the lifetime of the structure, the control strategies and the planning of maintenance. In this context, it is interesting to optimally combine production data and numerical models in order to obtain highly reliable models of wind turbines. This process is of interest to many industrial and academic groups and is known in many fields of industry, including the wind industry, as a "digital twin". The objective of Adrien Hirvoas's PhD work is to develop a data assimilation methodology to build the "digital twin" of an onshore wind turbine. Based on measurements, data assimilation should allow reducing the uncertainties of the physical parameters of the numerical model developed during the design phase, in order to obtain a highly reliable model. Various ensemble data assimilation approaches are currently under consideration to address the problem. In the context of this work, it is necessary to develop identification algorithms quantifying and ranking all the uncertainty sources. This work is done in collaboration with IFPEN 8, 58.
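A very simplified sketch of the ensemble-based parameter update considered for such a digital twin is given below: an ensemble of parameter values is confronted with noisy measurements through a toy forward model and updated with a stochastic ensemble Kalman (ES-MDA-like) formula. The forward model, noise levels and parameter meaning are all invented.

```python
# Ensemble-based (ES-MDA-like) calibration of a scalar parameter from noisy data.
import numpy as np

rng = np.random.default_rng(5)

def forward(theta, t):
    """Toy structural response: damped oscillation whose damping rate is theta."""
    return np.exp(-theta * t) * np.cos(2 * np.pi * t)

t = np.linspace(0.0, 3.0, 30)
theta_true = 0.8
obs_std = 0.02
y = forward(theta_true, t) + obs_std * rng.standard_normal(t.size)

n_ens, n_iter = 100, 4
theta = rng.normal(0.4, 0.3, n_ens)          # prior ensemble of the parameter

for _ in range(n_iter):                      # multiple-data-assimilation passes
    H = np.array([forward(th, t) for th in theta])         # predicted observations (n_ens, n_obs)
    th_mean, H_mean = theta.mean(), H.mean(axis=0)
    C_ty = (theta - th_mean) @ (H - H_mean) / (n_ens - 1)  # parameter/observation cross covariance
    C_yy = (H - H_mean).T @ (H - H_mean) / (n_ens - 1)     # observation covariance
    R = n_iter * obs_std**2 * np.eye(t.size)               # inflated obs error for each pass
    K = np.linalg.solve(C_yy + R, C_ty)                    # Kalman gain (as a vector)
    perturbed = y + np.sqrt(n_iter) * obs_std * rng.standard_normal((n_ens, t.size))
    theta = theta + (perturbed - H) @ K

print("posterior mean / std of theta:", theta.mean(), theta.std())
```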

Due to the public health context, Clémentine Prieur decided to join a working group, SEEPIA (Simulation & Estimation of EPIdemics with Algorithms), led by Didier Georges (Gipsa-lab). A first work has been published 54. An extension of the classical pandemic SIRD model was considered for the regional spread of COVID-19 in France under lockdown strategies. This compartmental model divides the infected and the recovered individuals into undetected and detected compartments, respectively. By fitting the extended model to the real detected data during the lockdown, an optimization algorithm was used to derive the optimal parameters, the initial condition and the epidemic start date of regions in France. Considering all the age classes together, a network model of the pandemic transport between regions in France was presented on the basis of the regional extended model and was simulated to reveal the transport effect of the COVID-19 pandemic after lockdown. Using the measured values of displacement of people moving between cities, the pandemic network of all cities in France was simulated using the same model and method as the pandemic network of regions. Finally, a discussion of an integro-differential equation was given and a new network pandemic model for each age class was provided. As already mentioned in Section 8.3, Clémentine Prieur went on working on the pandemic, in collaboration with Didier Georges (GIPSA Lab, Grenoble). Both of them supervised the internship of Matthieu Oliver, submitting a work proposing time-dependent reduced order modeling for short-term forecasts from a spatialized SIR model 14. Robin Vaudry started a PhD in October 2021, funded by the CNR research platform MODCOV19, and cosupervised by Clémentine Prieur and Didier Georges. The objective of this PhD is to solve inverse problems related to spatialized and age-structured compartmental models of the COVID-19 pandemic.
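The sketch below illustrates the structure of such an extended compartmental model, with infected individuals split into detected and undetected compartments; the equations, parameter values and initial conditions are purely illustrative and do not reproduce those of the cited work.

```python
# Illustrative extended SIRD model with detected/undetected compartments (assumed form).
import numpy as np
from scipy.integrate import solve_ivp

def extended_sird(t, y, beta, gamma, mu, tau):
    S, I_u, I_d, R_u, R_d, D = y
    N = S + I_u + I_d + R_u + R_d
    new_inf = beta * S * (I_u + I_d) / N      # infections (undetected cases also transmit)
    detect = tau * I_u                        # detection flux from undetected to detected
    return [-new_inf,
            new_inf - detect - gamma * I_u,
            detect - (gamma + mu) * I_d,
            gamma * I_u,                      # undetected recoveries
            gamma * I_d,                      # detected recoveries
            mu * I_d]                         # deaths (counted among detected)

y0 = [1.2e6, 50.0, 10.0, 0.0, 0.0, 0.0]       # roughly department-sized population
params = dict(beta=0.3, gamma=0.1, mu=0.005, tau=0.05)
sol = solve_ivp(extended_sird, (0.0, 180.0), y0, args=tuple(params.values()),
                t_eval=np.linspace(0.0, 180.0, 181), rtol=1e-8)
print("peak detected infections:", sol.y[2].max())
```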

In 77, we consider the modeling of precipitation amounts with semi-parametric models, representing both the bulk of the distribution and the tails while avoiding the arbitrary choice of a threshold. This work is carried out in collaboration with Anne-Catherine Favre (LGGE, Grenoble) and Philippe Naveau (LSCE, Paris).
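As an illustration of the threshold-free idea, the simplest member of the extended generalized Pareto family composes a generalized Pareto tail with a power transform controlling the bulk; the sketch below implements this basic construction, which is simpler than the models actually studied in 77.

```python
import numpy as np

def gp_cdf(x, sigma, xi):
    """Generalized Pareto CDF H_{sigma,xi}(x) for x >= 0."""
    if abs(xi) < 1e-8:
        return 1.0 - np.exp(-x / sigma)
    return 1.0 - np.maximum(1.0 + xi * x / sigma, 0.0) ** (-1.0 / xi)

def semi_parametric_cdf(x, kappa, sigma, xi):
    """F(x) = G(H(x)) with G(u) = u**kappa: the simplest extended-GP family.
    kappa shapes the behaviour of small rainfall amounts while (sigma, xi)
    control the upper tail, so no threshold selection is required."""
    return gp_cdf(x, sigma, xi) ** kappa

print(semi_parametric_cdf(np.array([0.5, 5.0, 50.0]), kappa=0.8, sigma=5.0, xi=0.2))
```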

In the context of Philomène Le Gall's PhD thesis, we are applying the aforementioned semi-parametric framework with the aim of regionalizing extreme precipitation 11.

The way applications are executed on supercomputers still follows a traditional static resource allocation pattern: computing resources are allocated at the start of a job, which executes the application, and are only released at the end of the job's runtime. This has been the standard way of running jobs for decades, whereas a dynamic resource allocation over the application's runtime would bring several benefits: higher utilization of the computing resources, ad-hoc allocation of AI accelerator cards, lower energy consumption, faster response for interactive jobs, improved data locality and I/O over the full runtime, support of urgent computing without necessarily killing running jobs, etc. Various attempts have been conducted under different terminologies, such as "evolving" jobs (application-driven dynamic resource changes) and "malleability" (system-driven dynamic resource changes); we believe a hybridization of the two is required to reach optimal results.
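The schematic below illustrates how the two mechanisms can be hybridized from the application's point of view: resource changes are only honoured at iteration boundaries, where the system may offer or reclaim nodes (malleability) and the application may itself request more (evolving). All classes and calls are hypothetical placeholders, not an existing scheduler or MPI interface.

```python
class ResourceManagerStub:
    """Hypothetical runtime interface; a real system would go through the
    batch scheduler / process manager (e.g. via PMIx-like services)."""
    def poll_offer(self):           # system-driven change (malleability)
        return None                 # e.g. {"add": 4} or {"remove": 2}
    def request(self, n_nodes):     # application-driven change (evolving)
        return False                # request granted or not

def load_imbalance_detected():
    return False                    # placeholder for an application metric

def main_loop(n_steps, rm=None):
    rm = rm or ResourceManagerStub()
    nodes = 8
    for step in range(n_steps):
        # ... compute one time step on `nodes` resources ...
        offer = rm.poll_offer()
        if offer:                   # malleability: accept the system's offer
            nodes += offer.get("add", 0) - offer.get("remove", 0)
            # ... redistribute data onto the new resource layout ...
        if step % 100 == 0 and load_imbalance_detected():
            rm.request(nodes * 2)   # evolving: ask the system for more nodes
    return nodes

main_loop(1000)
```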

We currently investigate possibilities to extend the MPI parallel programming model (the de-facto standard in HPC), as well as models beyond MPI, with such interfaces. As part of that, we participate in the regular MPI Sessions working group meetings, where a significant effort has been devoted to investigating such interfaces. This has also resulted in prototype implementations, with an emulator 20 as well as an implementation based on PMIx 18.

We started investigating Domain Specific Languages (DSL) for ocean simulation models, aiming to close the increasing gap between numerics and HPC through a separation of concerns: a DSL allows applied mathematicians to express their model equations, while HPC experts develop tools to automatically transform them into highly performant code for the target programming model and corresponding architecture. As part of this, we started working with the PSyIR development (a package of PSyclone) and successfully applied it to parsing the entire CROCO ocean model. This forms the basis for ongoing work to support different HPC programming models and architectures in a sustainable way for the CROCO model.
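The sketch below illustrates the PSyIR workflow on a tiny Fortran kernel: parse the source into the intermediate representation, inspect or transform it, and regenerate code. The exact entry points may vary across PSyclone versions, and this is not the actual CROCO tooling.

```python
from psyclone.psyir.frontend.fortran import FortranReader
from psyclone.psyir.backend.fortran import FortranWriter
from psyclone.psyir.nodes import Loop

source = """
subroutine update(n, u, v)
  integer, intent(in) :: n
  real, intent(inout) :: u(n)
  real, intent(in)    :: v(n)
  integer :: i
  do i = 1, n
    u(i) = u(i) + 0.5 * v(i)
  end do
end subroutine update
"""

psyir = FortranReader().psyir_from_source(source)   # Fortran -> PSyIR
for loop in psyir.walk(Loop):                       # inspect/transform loops
    print("found loop over", loop.variable.name)
print(FortranWriter()(psyir))                       # PSyIR -> Fortran again
```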

We investigate new time-integration methods that also take HPC requirements into account, hence targeting a better wall-clock time vs. error ratio. These numerical methods include different forms of exponential, semi-Lagrangian as well as parallel-in-time integration methods. Our team and collaborators have made significant progress on this over the last years; however, publishable results have either not yet been generated or are currently in preparation.

As part of a collaboration with the University of São Paulo (USP), we investigated exponential integration methods based on Faber polynomials, with a preprint available 35. Although the main focus of this work is on seismic wave propagation, the results can also be applied to the linearized fast modes in atmospheric models, e.g., as part of time-splitting methods. Theoretical as well as numerical results have been investigated in depth for a variety of wave propagation problems. In 4, we have also studied in detail the application of exponential Runge-Kutta integrators and their variants to the shallow water equations.
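The principle behind such polynomial exponential integrators is to approximate the action exp(dt·A)v using matrix-vector products only; Faber polynomials extend Chebyshev polynomials to operators whose spectrum lies in a general complex region. The sketch below uses a plain truncated Taylor polynomial as a stand-in for the polynomial approximation (it is not the method of 35) and compares it with a SciPy reference on a small advection problem.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import expm_multiply

def poly_expmv(A, v, dt, degree=20):
    """Approximate exp(dt*A) @ v with a truncated Taylor polynomial,
    using only matrix-vector products (as polynomial integrators do)."""
    result = v.copy()
    term = v.copy()
    for k in range(1, degree + 1):
        term = (dt / k) * (A @ term)    # next Taylor term via one mat-vec
        result = result + term
    return result

# Example: 1D advection on a periodic grid (skew-symmetric A -> purely
# imaginary spectrum, the hard case for explicit time stepping).
n, c = 200, 1.0
dx = 1.0 / n
A = (c / (2 * dx)) * diags([1.0, -1.0], [-1, 1], shape=(n, n)).tolil()
A[0, -1], A[-1, 0] = c / (2 * dx), -c / (2 * dx)     # periodic wrap-around
A = A.tocsr()
v = np.exp(-100 * (np.linspace(0, 1, n) - 0.5) ** 2)
dt = 0.002
approx = poly_expmv(A, v, dt)
reference = expm_multiply(dt * A, v)                  # SciPy reference
print("max error:", np.max(np.abs(approx - reference)))
```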

In a previous work, we were able to show significant speedups using the Parallel Full Approximation Scheme in Space and Time (PFASST) parallel-in-time method for the shallow-water equations on the rotating sphere with an IMEX-SDC scheme, using benchmarks employed in the development of dynamical cores 55. Motivated by this work, we started to investigate this in collaboration with USP using the MGRIT parallel-in-time method, where we expect even better results in combination with exponential time integration methods.

On variational data assimilation and parallel-in-time integration methods: Rishabh Bhatt started his PhD in December 2019. Under the supervision of Laurent Debreu and Arthur Vidard, he studies the application of time parallelization algorithms to variational data assimilation methods. At each step of the optimization algorithm, the direct model is integrated by a time-parallel method (here the Parareal algorithm 67). One of the main difficulties lies in the choice of the stopping criterion of the Parareal algorithm. To address this, we rely on theoretical results on the convergence of conjugate gradient methods in the presence of approximate gradient calculations 52. The first results are encouraging and show the possibility to optimally tune the stopping criterion of the Parareal algorithm (and thus the number of iterations) without affecting the convergence of the conjugate gradient. These results are reported in 30. We are now working on taking better advantage of the coupling of these two iterative methods (conjugate gradient and Parareal), in particular by optimally reusing the Krylov bases of the Krylov-enhanced version of the Parareal algorithm.
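For readers unfamiliar with the method, the sketch below shows the generic Parareal structure on a scalar toy problem: a cheap coarse propagator, an accurate fine propagator applied in parallel across time windows, and a stopping criterion on the size of the correction, which is the quantity tuned against the inexact-gradient theory of 52. All numerical choices are illustrative.

```python
import numpy as np

def parareal(u0, f_fine, g_coarse, n_windows, n_iter_max, tol):
    """Generic Parareal iteration with a correction-based stopping criterion."""
    U = [u0]                                   # initial guess from a coarse sweep
    for n in range(n_windows):
        U.append(g_coarse(U[-1]))
    for k in range(n_iter_max):
        F_vals = [f_fine(U[n]) for n in range(n_windows)]    # parallel in time
        G_old = [g_coarse(U[n]) for n in range(n_windows)]
        U_new, correction = [u0], 0.0
        for n in range(n_windows):
            g_new = g_coarse(U_new[-1])                       # serial coarse update
            u_next = g_new + F_vals[n] - G_old[n]             # Parareal correction
            correction = max(correction, np.linalg.norm(u_next - U[n + 1]))
            U_new.append(u_next)
        U = U_new
        if correction < tol:       # stopping criterion on the correction size
            break
    return U, k + 1

# Toy usage: u' = -u over [0, 1]; fine = many small explicit-Euler steps,
# coarse = one step per window (all choices here are illustrative).
lam, T, n_windows = -1.0, 1.0, 10
dT = T / n_windows
fine = lambda u, m=50: u * (1 + lam * dT / m) ** m
coarse = lambda u: u * (1 + lam * dT)
U, iters = parareal(np.array([1.0]), fine, coarse, n_windows, 10, 1e-8)
print(iters, U[-1], np.exp(lam * T))
```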

IMMERSE project on cordis.europa.eu

The overarching goal of the IMMERSE project is to ensure that the Copernicus Marine Environment Monitoring Service (CMEMS) will have continuing access to world-class marine modelling tools for its next-generation systems, while leveraging advances in space and information technologies, therefore allowing it to address the ever-increasing and evolving demands for marine monitoring and prediction in the 2020s and beyond.

In response to the future priorities for CMEMS, IMMERSE will develop new capabilities to:

- enable the production of ocean forecasts and analyses that exploit upcoming high resolution satellite datasets,

- deliver ocean analyses and forecasts with the higher spatial resolution and additional process complexity demanded by users,

- exploit the opportunities of new high performance computing (HPC) technology,

- allow easy interfacing of CMEMS products with detailed local coastal models.

These developments will be delivered in the NEMO ocean model, an established, world-class ocean modelling system that already forms the basis of the majority of CMEMS analysis and forecast products. Hence the pathway from the research in IMMERSE to implementation in CMEMS will be simple and seamless, as the model code developed will be directly applicable in CMEMS models. NEMO has a long track record of producing and maintaining a stable, robustly engineered code base of the type that is needed for operational applications, including CMEMS.

The IMMERSE consortium combines world-class expertise in ocean modelling, applied mathematics and HPC, established software engineering processes and infrastructure, and in-depth knowledge of the CMEMS systems and of downstream systems. Thus IMMERSE is exceptionally well placed to deliver the operational-quality model code required to meet the emerging needs of CMEMS, and to maintain it into the future.

The SAMO board is in charge of the organization of the SAMO (Sensitivity Analysis of Model Output) conferences, held every three years. It is strongly supported by the Joint Research Centre of the European Commission.

In 2019, Clémentine Prieur, who is a member of this board, was also co-chair of a satellite event on the future of sensitivity analysis. A position paper 75 was published in 2021 as a synthesis of the discussions held in Barcelona (autumn 2019).

A 4-year contract: ANR MODENA, "Model and data reduction for efficient assimilation", PI: Olivier ZAHM. The reliability of numerical predictions strongly relies on our ability to calibrate the unknown model parameters using observable data. Data assimilation is a particularly challenging task in ocean modelling because of the complexity of the computational models and because of the high dimension of both the parameters and the data. The objective of this project is to explore novel methodologies to improve the probabilistic description of the Bayesian solution to the inverse problem while, at the same time, taking into consideration the computational limitations imposed by large-scale applications. This project relies on three building blocks and on their interplay.
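One ingredient relevant to the project's reduction theme is gradient-based dimension reduction of the parameter space for Bayesian inverse problems (in the spirit of likelihood-informed or active-subspace methods). The sketch below estimates the dominant informed directions from averaged outer products of log-likelihood gradients on a hypothetical linear-Gaussian placeholder model; it only illustrates the general idea and is not the MODENA methodology.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, r = 50, 20, 5                      # parameter dim, data dim, reduced dim

# Hypothetical linear forward operator with decaying sensitivity per direction
G = rng.normal(size=(m, d)) @ np.diag(np.exp(-0.3 * np.arange(d)))
x_true = rng.normal(size=d)
noise_std = 0.1
y = G @ x_true + noise_std * rng.normal(size=m)

def grad_log_likelihood(x):
    """Gradient of the Gaussian log-likelihood for the placeholder model."""
    return G.T @ (y - G @ x) / noise_std**2

# Monte Carlo estimate of H = E_prior[ grad logL grad logL^T ]
samples = rng.normal(size=(200, d))       # standard normal prior
H = np.mean([np.outer(g, g) for g in map(grad_log_likelihood, samples)], axis=0)

eigval, eigvec = np.linalg.eigh(H)
V_r = eigvec[:, ::-1][:, :r]              # r most likelihood-informed directions
print("captured gradient energy:", eigval[::-1][:r].sum() / eigval.sum())
# Inference can then be carried out on the r-dimensional projection V_r^T x,
# with the prior handling the complementary directions.
```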

In addition, the proposed methodology addresses key challenges present in many real-world applications, and we therefore expect that it can find applications in other domains.