Keywords
 A6.1.1. Continuous Modeling (PDE, ODE)
 A6.1.4. Multiscale modeling
 A6.1.5. Multiphysics modeling
 A6.2.1. Numerical analysis of PDE and ODE
 A6.2.5. Numerical Linear Algebra
 A6.2.7. High performance computing
 A6.3. Computationdata interaction
 A6.3.1. Inverse problems
 A7.1. Algorithms
 B3.3.1. Earth and subsoil
 B9.5.2. Mathematics
 B9.5.3. Physics
1 Team members, visitors, external collaborators
Research Scientists
 Laura Grigori [Team leader, Inria, Senior Researcher, HDR]
 Sever Hirstoaga [Inria, Researcher, HDR]
 Frédéric Nataf [CNRS, Senior Researcher, HDR]
Faculty Members
 Xavier Claeys [Sorbonne Université, Associate Professor]
 Frédéric Hecht [Sorbonne Université, Emeritus, HDR]
PostDoctoral Fellows
 Martin Averseng [Inria, until Jun 2020]
 Oleg Balabanov [Inria]
 Suraj Kumar [Inria]
PhD Students
 Siwar Badreddine [Inria, from Oct 2020]
 Matthias Beaupere [Inria]
 Igor Chollet [Sorbonne Université]
 Thibault Cimic [Inria]
 David Frenkiel [Inria, from Oct 2020]
 Antoine Lechevallier [Sorbonne Université, from Oct 2020]
 Van Thanh Nguyen [Inria]
Technical Staff
 Axel Fourmont [Inria, Engineer, until Feb 2020]
 PierreHenri Tournier [CNRS, Engineer]
Interns and Apprentices
 Siwar Badreddine [Inria, from Mar 2020 until Aug 2020]
 David Frenkiel [Inria, from Apr 2020 until Sep 2020]
Administrative Assistants
 Laurence Bourcier [Inria]
 Julien Guieu [Inria]
2 Overall objectives
2.1 Introduction
The focus of our research is on the development of novel parallel numerical algorithms and tools appropriate for stateoftheart mathematical models used in complex scientific applications, and in particular numerical simulations. The proposed research program is by nature multidisciplinary, interweaving aspects of applied mathematics, computer science, as well as those of several specific applications, as porous media flows, elasticity, wave propagation in multiscale media, molecular simulations, and inverse problems.
Our first objective is to develop numerical methods and tools for complex scientific and industrial applications that will enhance their scalable execution on the emergent heterogeneous hierarchical models of massively parallel machines. Our second objective is to integrate the novel numerical algorithms into a middlelayer that will hide as much as possible the complexity of massively parallel machines from the users of these machines.
3 Research program
3.1 Overview
The research described here is directly relevant to several steps of the numerical simulation chain. Given a numerical simulation that was expressed as a set of differential equations, our research focuses on mesh generation methods for parallel computation, novel numerical algorithms for linear algebra, as well as algorithms and tools for their efficient and scalable implementation on high performance computers. The validation and the exploitation of the results is performed with collaborators from applications and is based on the usage of existing tools. In summary, the topics studied in our group are the following:
 Numerical methods and algorithms
 Mesh generation for parallel computation

Solvers for numerical linear algebra:
domain decomposition methods, preconditioning for iterative methods
 Computational kernels for numerical linear algebra
 Tensor computations for high dimensional problems
 Validation on numerical simulations and other numerical applications
3.2 Domain specific language  parallel FreeFem++
In the engineering, researchers, and teachers communities, there is a
strong demand for simulation frameworks that are simple to install and
use, efficient, sustainable, and that solve efficiently and accurately
complex problems for which there are no dedicated tools or codes
available. In our group we develop FreeFem++ (see https://
 getting a quick answer to a specific problem,
 prototyping the resolution of a new complex problem.
The current users of FreeFem++ are mathematicians, engineers, university professors, and students. In general for these users the installation of public libraries as MPI, MUMPS, Ipopt, Blas, lapack, OpenGL, fftw, scotch, PETSc, SLEPc is a very difficult problem. For this reason, the authors of FreeFem++ have created a user friendly language, and over years have enriched its capabilities and provided tools for compiling FreeFem++ such that the users do not need to have special knowledge of computer science. This leads to an important work on porting the software on different emerging architectures.
Today, the main components of parallel FreeFem++ are:
 definition of a coarse grid,
 splitting of the coarse grid,
 mesh generation of all subdomains of the coarse grid, and construction of parallel data structures for vectors and sparse matrices from the mesh of the subdomain,
 discretization by a chosen numerical method,
 call to a linear solver,
 analysis of the result.
All these components are parallel, except for the last point which is not in the focus of our research. However for the moment, the parallel mesh generation algorithm is very simple and not sufficient, for example it addresses only polygonal geometries. Having a better parallel mesh generation algorithm is one of the goals of our project. In addition, in the current version of FreeFem++, the parallelism is not hidden from the user, it is done through direct calls to MPI. Our goal is also to hide all the MPI calls in the specific language part of FreeFem++. In addition to these inhouse domain decomposition methods, FreeFem++ is also linked to PETSc solvers which enables an easy use of third parties parallel multigrid methods.
3.3 Solvers for numerical linear algebra
Iterative methods are widely used in industrial applications, and preconditioning is the most important research subject here. Our research considers domain decomposition methods and iterative methods and its goal is to develop solvers that are suitable for parallelism and that exploit the fact that the matrices are arising from the discretization of a system of PDEs on unstructured grids.
One of the main challenges that we address is the lack of robustness and scalability of existing methods as incomplete LU factorizations or Schwarzbased approaches, for which the number of iterations increases significantly with the problem size or with the number of processors. This is often due to the presence of several low frequency modes that hinder the convergence of the iterative method. To address this problem, we study different approaches for dealing with the low frequency modes as coarse space correction in domain decomposition or deflation techniques.
We also focus on developing boundary integral equation methods that would be adapted to the simulation of wave propagation in complex physical situations, and that would lend themselves to the use of parallel architectures. The final objective is to bring the state of the art on boundary integral equations closer to contemporary industrial needs. From this perspective, we investigate domain decomposition strategies in conjunction with boundary element method as well as acceleration techniques (Hmatrices, FMM and the like) that would appear relevant in multimaterial and/or multidomain configurations. Our work on this topic also includes numerical implementation on large scale problems, which appears as a challenge due to the peculiarities of boundary integral equations.
3.4 Computational kernels for numerical linear and multilinear algebra
The design of new numerical methods that are robust and that have well proven convergence properties is one of the challenges addressed in Alpines. Another important challenge is the design of parallel algorithms for the novel numerical methods and the underlying building blocks from numerical linear algebra. The goal is to enable their efficient execution on a diverse set of node architectures and their scaling to emerging highperformance clusters with an increasing number of nodes.
Increased communication cost is one of the main challenges in high performance computing that we address in our research by investigating algorithms that minimize communication, as communication avoiding algorithms. The communication avoiding algorithmic design is an approach originally developed in our group since more than ten years (initially in collaboration with researchers from UC Berkeley and CU Denver). While our first results concerned direct methods of factorization as LU or QR factorizations, our recent work focuses on designing robust algorithms for computing the low rank approximation of a matrix or a tensor. We focus on both deterministic and randomized approaches.
Our research also focuses on solving problems of large size that feature high dimensions as in molecular simulations. The data in this case is represented by objects called tensors, or multilinear arrays. The goal is to design novel tensor techniques to allow their effective compression, i.e. their representation by simpler objects in small dimensions, while controlling the loss of information. The algorithms are aiming to being highly parallel to allow to deal with the large number of dimensions and large data sets, while preserving the required information for obtaining the solution of the problem.
4 Application domains
4.1 Compositional multiphase Darcy flow in heterogeneous porous media
We study the simulation of compositional multiphase flow in porous media with different types of applications, and we focus in particular on reservoir/bassin modeling, and geological CO2 underground storage. All these simulations are linearized using Newton approach, and at each time step and each Newton step, a linear system needs to be solved, which is the most expensive part of the simulation. This application leads to some of the difficult problems to be solved by iterative methods. This is because the linear systems arising in multiphase porous media flow simulations cumulate many difficulties. These systems are nonsymmetric, involve several unknowns of different nature per grid cell, display strong or very strong heterogeneities and anisotropies, and change during the simulation. Many researchers focus on these simulations, and many innovative techniques for solving linear systems have been introduced while studying these simulations, as for example the nested factorization [Appleyard and Cheshire, 1983, SPE Symposium on Reservoir Simulation].
4.2 Inverse problems
We focus on methods related to the blend of time reversal techniques and absorbing boundary conditions (ABC) used in a non standard way. Since the seminal paper by [M. Fink et al., Imaging through inhomogeneous media using time reversal mirrors. Ultrasonic Imaging, 13(2):199, 1991.], time reversal is a subject of very active research. The principle is to backpropagate signals to the sources that emitted them. The initial experiment was to refocus, very precisely, a recorded signal after passing through a barrier consisting of randomly distributed metal rods. In [de Rosny and Fink. Overcoming the diffraction limit in wave physics using a timereversal mirror and a novel acoustic sink. Phys. Rev. Lett., 89 (12), 2002], the source that created the signal is time reversed in order to have a perfect time reversal experiment. In 36, we improve this result from a numerical point of view by showing that it can be done numerically without knowing the source. This is done at the expense of not being able to recover the signal in the vicinity of the source. In 37, time dependent wave splitting is performed using ABC and time reversal techniques. We now work on extending these methods to non uniform media.
All our numerical simulations are performed in FreeFem++ which is very flexible. As a byproduct, it enables us to have an end user point of view with respect to FreeFem++ which is very useful for improving it.
4.3 Numerical methods for wave propagation in multiscale media
We are interested in the development of fast numerical methods for the simulation of electromagnetic waves in multiscale situations where the geometry of the medium of propagation may be described through caracteristic lengths that are, in some places, much smaller than the average wavelength. In this context, we propose to develop numerical algorithms that rely on simplified models obtained by means of asymptotic analysis applied to the problem under consideration.
Here we focus on situations involving boundary layers and localized singular perturbation problems where wave propagation takes place in media whose geometry or material caracteristics are submitted to a small scale perturbation localized around a point, or a surface, or a line, but not distributed over a volumic subregion of the propagation medium. Although a huge literature is already available for the study of localized singular perturbations and boundary layer pheneomena, very few works have proposed efficient numerical methods that rely on asymptotic modeling. This is due to their functional framework that naturally involves singular functions, which are difficult to handle numerically. The aim of this part of our reasearch is to develop and analyze numerical methods for singular perturbation methods that are prone to high order numerical approximation, and robust with respect to the small parameter characterizing the singular perturbation.
4.4 Data analysis in astrophysics
We focus on computationally intensive numerical algorithms arising in the data analysis of current and forthcoming Cosmic Microwave Background (CMB) experiments in astrophysics. This application is studied in collaboration with researchers from University Paris Diderot, and the objective is to make available the algorithms to the astrophysics community, so that they can be used in large experiments.
In CMB data analysis, astrophysicists produce and analyze
multifrequency 2D images of the universe when it was 5% of its
current age. The new generation of the CMB experiments observes the
sky with thousands of detectors over many years, producing
overwhelmingly large and complex data sets, which nearly double every
year therefore following Moore's Law. Planck
(http://
4.5 Molecular simulations
Molecular simulation is one of the most dynamic areas of scientific computing. Its field of application is very broad, ranging from theoretical chemistry and drug design to materials science and nanotechnology. It provides many challenging problems to mathematicians, and computer scientists.
In the context of the ERC Synergy Grant EMC2 we address several important limitations of state of the art molecular simulation. In particular, the simulation of very large molecular systems, or smaller systems in which electrons interact strongly with each other, remains out of reach today. In an interdisciplinary collaboration between chemists, mathematicians and computer scientists, we focus on developing a new generation of reliable molecular simulation algorithms and software.
5 Highlights of the year
5.1 Awards
 Laura Grigori was elected SIAM Fellow class of 2020, for contributions to numerical linear algebra, including communicationavoiding algorithms.
5.2 Other
 HPDDM is now available through Petsc PCHPDDM. It implements Geneo 12 and its recent multilevel version 25.
 Laura Grigori was one of the five members of the Scientific Committee of the Prace Fast Track Call for Proposals in support to mitigate impact of Covid 19, opened in March 2020. Meeting daily in the first months of the pandemic to ensure an evaluation within one week of the proposals.
6 New software and platforms
6.1 New software
6.1.1 FreeFem++
 Name: FeeFrem++

Scientific Description:
FreeFem++ is a partial differential equation solver. It has its own language. freefem scripts can solve multiphysics non linear systems in 2D and 3D.
Problems involving PDE (2d, 3d) from several branches of physics such as fluidstructure interactions require interpolations of data on several meshes and their manipulation within one program. FreeFem++ includes a fast 2d̂treebased interpolation algorithm and a language for the manipulation of data on multiple meshes (as a follow up of bamg (now a part of FreeFem++ ).
FreeFem++ is written in C++ and the FreeFem++ language is a C++ idiom. It runs on Macs, Windows, Unix machines. FreeFem++ replaces the older freefem and freefem+.
 Functional Description: FreeFem++ is a PDE (partial differential equation) solver based on a flexible language that allows a large number of problems to be expressed (elasticity, fluids, etc) with different finite element approximations on different meshes.

URL:
http://
www. freefem. org/ ff++/  Contact: Frederic Hecht
 Partner: UPMC
6.1.2 HPDDM
 Scientific Description: HPDDM is an efficient implementation of various domain decomposition methods (DDM) such as one and twolevel Restricted Additive Schwarz methods, the Finite Element Tearing and Interconnecting (FETI) method, and the Balancing Domain Decomposition (BDD) method. This code has been proven to be efficient for solving various elliptic problems such as scalar diffusion equations, the system of linear elasticity, but also frequency domain problems like the Helmholtz equation. A comparison with modern multigrid methods can be found in the thesis of Pierre Jolivet.
 Functional Description: HPDDM is an efficient implementation of various domain decomposition methods (DDM) such as one and twolevel Restricted Additive Schwarz methods, the Finite Element Tearing and Interconnecting (FETI) method, and the Balancing Domain Decomposition (BDD) method.

URL:
https://
github. com/ hpddm  Authors: Frédéric Nataf, Pierre Jolivet
 Contacts: Pierre Jolivet, Frédéric Nataf
 Participants: Frédéric Nataf, Pierre Jolivet
6.1.3 preAlps
 Name: Preconditioned enlarged Krylov subspace methods
 Keywords: Linear Systems Solver, Preconditioner
 Functional Description: Contains enlarged Conjugate Gradient Krylov subspace method and Lorasc preconditioner.

URL:
https://
github. com/ NLAFET/ preAlps  Contact: Laura Grigori
6.1.4 HTool
 Keyword: Hierarchical matrices
 Functional Description: HTool is a C++ headeronly library implementing compression techniques (e.g. Adaptive Cross Approximation) using hierarchical matrices (Hmatrices). The library uses MPI and OpenMP for parallelism, and is interfaced with HPDDM for the solution of linear systems.

URL:
https://
github. com/ htoolddm/ htool  Contacts: PierreHenri Tournier, Pierre Marchand
6.1.5 BemTool
 Keyword: Boundary element method
 Functional Description: BemTool is a C++ headeronly library implementing the boundary element method (BEM) for the discretisation of the Laplace, Helmholtz and Maxwell equations, in 2D and 3D. Its main purpose is the assembly of classical boundary element matrices, which can be compressed and inverted through its interface with the HTool library.

URL:
https://
github. com/ xclaeys/ BemTool  Contact: Xavier Claeys
7 New results
7.1 Low rank matrix approximation
Part of our research focuses on computing accurate low rank matrix approximations through randomized or deterministic approaches.
In 29 we introduce a parallel algorithm for computing the low rank approximation ${A}_{k}$ of a large matrix $A$ which minimizes the number of messages exchanged between processors (modulo polylogarithmic factors) and has guarantees for the approximations of the singular values of $A$ provided by ${A}_{k}$. This operation is essential in many applications in scientific computing and data analysis when dealing with large data sets. Our algorithm is based on QR factorization that consists in selecting a subset of columns from the matrix $A$ that allow to approximate the range of $A$, and then projecting the columns of $A$ on a basis of the subspace spanned by those columns. The selection of columns is performed by using tournament pivoting, a strategy introduced previously for matrices partitioned into blocks of columns. This strategy is extended here to matrices partitioned along both dimensions that are distributed on a twodimensional grid of processors, and also to tournaments with more general reduction trees. Performance results show that the algorithm scales well on up to 1024 cores of 16 nodes.
In 16 we propose lineartime CUR approximation algorithms for admissible matrices obtained from the hierarchical form of Boundary Element matrices. We propose a new approach called geometric sampling to obtain indices of most significant rows and columns using information from the domains where the problem is posed. Our strategy is tailored to Boundary Element Methods (BEM) since it uses directly and explicitly the cluster tree containing information from the problem geometry.
7.2 Low rank tensor compression
To address problems arising in computational chemistry, we focus on computing low rank tensor approximation. We present in 33 a parallel algorithm that generates a lowrank approximation of a distributed tensor using QR decomposition with tournament pivoting (QRTP). The algorithm generates factor matrices for a Tucker decomposition by applying QRTP to the unfolding matrices of a tensor distributed blockwise (by subtensor) on a set of processors. For each unfolding mode the algorithm logically reorganizes (unfolds) the processors so that the associated unfolding matrix has a suitable logical distribution. We also establish error bounds between a tensor and the compressed version of the tensor generated by the algorithm.
In 35 we consider the problem of developing parallel decomposition and approximation algorithms for high dimensional tensors. We focus on a tensor representation named Tensor Train (TT). It stores a $d$dimensional tensor in $\mathcal{O}\left(nrd\right)$, much less than the $\mathcal{O}\left({n}^{d}\right)$ entries in the original tensor, where $r$ is usually a very small number and depends on the application. Sequential algorithms to compute TT decomposition and TT approximation of a tensor have been proposed in the literature. Here we propose a parallel algorithm to compute TT decomposition of a tensor. We prove that the ranks of TTrepresentation produced by our algorithm are bounded by the ranks of unfolding matrices of the tensor. Additionally, we propose a parallel algorithm to compute approximation of a tensor in TTrepresentation. Our algorithm relies on a hierarchical partitioning of the dimensions of the tensor in a balanced binary tree shape and transmission of leading singular values of associated unfolding matrix from the parent to its children. We consider several approaches on the basis of how leading singular values are transmitted in the tree. We present an indepth experimental analysis of our approaches for different low rank tensors and also assess them for tensors obtained from quantum chemistry simulations. Our results show that the approach which transmits leading singular values to both of its children performs better in practice. Compression ratios and accuracies of the approximations obtained by our approaches are comparable with the sequential algorithm and, in some cases, even better than that.
7.3 Adaptive hierarchical subtensor partitioning for tensor compression
In 19 a numerical method is proposed to compress a tensor by constructing a piecewise tensor approximation. This is defined by partitioning a tensor into subtensors and by computing a lowrank tensor approximation (in a given format) in each subtensor. Neither the partition nor the ranks are fixed a priori, but, instead, are obtained in order to fulfill a prescribed accuracy and optimize, to some extent, the storage. The different steps of the method are detailed and some numerical experiments are proposed to assess its performances.
7.4 Randomized GramSchmidt orthogonalization and its use in GMRES
In 28 a randomized GramSchmidt algorithm is developed for orthonormalization of highdimensional vectors or QR factorization. The proposed process can be less computationally expensive than the classical GramSchmidt process while being at least as numerically stable as the modified GramSchmidt process. Our approach is based on random sketching, which is a dimension reduction technique consisting in estimation of inner products of highdimensional vectors by inner products of their small efficientlycomputable random projections, socalled sketches. This allows to perform the projection step in GramSchmidt process on sketches rather than highdimensional vectors with a minor computational cost. This also provides an ability to efficiently certify the output. The proposed GramSchmidt algorithm can provide computational cost reduction in any architecture. The benefit of random sketching can be amplified by exploiting multiprecision arithmetic. We provide stability analysis for multiprecision model with coarse unit roundoff for standard highdimensional operations. Numerical stability is proven for the unit roundoff independent of the (high) dimension of the problem. The proposed GramSchmidt process can be applied to Arnoldi iteration and result in new Krylov subspace methods for solving highdimensional systems of equations or eigenvalue problems. Among them we chose randomized GMRES method as a practical application of the methodology.
7.5 Accelerating linear system solvers for timedomain component separation of cosmic microwave background data
In 22 we consider component separation, which is one of the key stages of any modern cosmic microwave background data analysis pipeline. It is an inherently nonlinear procedure and typically involves a series of sequential solutions of linear systems with similar but not identical system matrices, derived for different data models of the same data set. Sequences of this type arise, for instance, in the maximization of the data likelihood with respect to foreground parameters or sampling of their posterior distribution. However, they are also common in many other contexts. In this work we consider solving the component separation problem directly in the measurement (time) domain. This can have a number of important benefits over the more standard pixelbased methods, in particular if nonnegligible timedomain noise correlations are present, as is commonly the case. The approach based on the timedomain, however, implies significant computational effort because the full volume of the timedomain data set needs to be manipulated. To address this challenge, we propose and study efficient solvers adapted to solving timedomainbased component separation systems and their sequences, and which are capable of capitalizing on information derived from the previous solutions. This is achieved either by adapting the initial guess of the subsequent system or through a socalled subspace recycling, which allows constructing progressively more efficient twolevel preconditioners. We report an overall speedup over solving the systems independently of a factor of nearly 7, or 5, in our numerical experiments, which are inspired by the likelihood maximization and likelihood sampling procedures, respectively.
7.6 A posterioribased adaptive preconditioners
In 14, 13, and 15 we explore adaptive preconditioners that integrate a posteriori error estimators. In particular we introduce an adaptive preconditioner for iterative solution of sparse linear systems arising from partial differential equations with selfadjoint operators. This preconditioner allows to control the growth rate of a dominant part of the algebraic error within a fixed point iteration scheme. Several numerical results that illustrate the efficiency of this adaptive preconditioner with a PCG solver are presented and the preconditioner is also compared with a previous variant in the literature.
7.7 Adaptive Domain Decomposition method for Saddle Point problem
In 11, we introduce an adaptive domain decomposition (DD) method for solving saddle point problems defined as a block two by two matrix. The algorithm does not require any knowledge of the constrained space. We assume that all sub matrices are sparse and that the diagonal blocks are the sum of positive semi definite matrices. The latter assumption enables the design of adaptive coarse space for DD methods. Numerical results on three dimensional elasticity problems on steelrubber structures discretized with 1 billion degrees of freedom are shown.
7.8 Analysis of the SORAS domain decomposition preconditioner for nonselfadjoint or indefinite problems
In 30, we analyze the convergence of the onelevel overlapping domain decomposition preconditioner SORAS (Symmetrized Optimized Restricted Additive Schwarz) applied to a generic linear system whose matrix is not necessarily symmetric/selfadjoint nor positive definite. By generalizing the theory for the Helmholtz equation developed in [I.G. Graham, E.A. Spence, and J. Zou, SIAM J. Numer. Anal., 2020], we identify a list of assumptions and estimates that are sufficient to obtain an upper bound on the norm of the preconditioned matrix, and a lower bound on the distance of its field of values from the origin. We stress that our theory is general in the sense that it is not specific to one particular boundary value problem. As an illustration of this framework, we prove new estimates for overlapping domain decomposition methods with Robintype transmission conditions for the heterogeneous reactionconvectiondiffusion equation.
7.9 An overlapping splitting double sweep method for the Helmholtz equation
In 31, we consider the domain decomposition method approach to solve the Helmholtz equation. Double sweep based approaches for overlapping decompositions are presented. In particular, we introduce an overlapping splitting double sweep (OSDS) method valid for any type of interface boundary conditions. Despite the fact that first order interface boundary conditions are used, the OSDS method demonstrates good stability properties with respect to the number of subdomains and the frequency even for heterogeneous media. In this context, convergence is improved when compared to the double sweep methods in [Nataf,1997] and [Vion et al., 2014 and 2016] for all of our test cases: waveguide, open cavity, and wedge problems.
7.10 Domain decomposition preconditioning for high frequency wave propagation problems
In the context of seismic imaging, frequencydomain fullwaveform inversion (FWI) is suitable for longoffset stationaryrecording acquisition, since reliable subsurface models can be reconstructed with a few frequencies and attenuation is easily implemented without computational overhead. In the frequency domain, wave modeling is a Helmholtztype boundaryvalue problem which requires to solve a large and sparse system of linear equations per frequency with multiple righthand sides (sources). This system can be solved with direct or iterative methods. While the former are suitable for FWI application on 3D dense OBC acquisitions covering spatial domains of moderate size, the later should be the approach of choice for sparse node acquisitions covering large domains (more than 50 millions of unknowns). Fast convergence of iterative solvers for Helmholtz problems remains however challenging in high frequency regime, due to the non definiteness of the Helmholtz operator and also to the discretization constraints in order to minimize the dispersion error for a given frequency, hence requiring efficient preconditioners.
For such wave propagation problems, we continued development of twolevel ORAS Schwarz domain decomposition preconditioners, where the second level consists in inexact coarse solves of a coarse problem with added absorption discretized on a coarse mesh. In particular, we developed and tested a single precision implementation, together with incomplete Cholesky factorization of local fine level matrices, allowing a significant gain in computing time and memory consumption without loss of robustness, as shown in Table 1 for a 3D acoustic simulation on the Overthrust benchmark. For this same benchmark, we were thus able to solve the problem at 20 Hz frequency with P3 finite elements on an unstructured mesh adapted to the local wavelength, a problem comprising 2.3 billion unknowns and solved in 37 seconds on 16960 cores. Table 2 gathers some results at different frequencies, for a regular mesh and for an adapted unstructured mesh. This work was the subject of conference papers and was presented this year at the EAGE and SEG geophysics conferences.
Cartesian grid, f = 5Hz  
precision  fine local solver  # it  setup  GMRES 
double  Cholesky  10  92.5s  15.5s 
double  ICC  10  30.2s  8.9s 
single  Cholesky  10  50.3s  10.3s 
single  ICC  10  25.8s  6.3s 
Regular mesh  
Freq (Hz)  $\#$core  $\#$elts (M)  $\#$dofs (M)  $\#$it  GMRES 
5  265  16  74  7  16s 
10  2,120  131  575  15  33s 
Adaptive mesh  
Freq (Hz)  $\#$core  $\#$elts (M)  $\#$dofs (M)  $\#$it  GMRES 
10  2,120  63  286  14  15s 
20  16,960  506  2,285  30  37s 
7.11 Fullwaveform redatuming via a TRAC approach
In 26, we consider the problem of redatuming. In inverse problems, redatuming data consists in virtually moving the sensors from the original acquisition location to an arbitrary position. This is an essential tool for target oriented inversion. An exact redatuming method which has the peculiarity to be robust with respect to noise is proposed. Our iterative method is based on the Time Reversal Absorbing Conditions (TRAC) approach and avoids the need for a regularization strategy. Numerical results and comparisons with other redatuming approaches illustrate the robustness of our method.
7.12 Parareal simulations of highly oscillatory differential equations
In 34 we propose a new strategy for solving by the parareal algorithm highly oscillatory ordinary differential equations which are characteristics of a sixdimensional Vlasov equation. For the coarse solvers we use reduced models, obtained from twoscale asymptotic expansions. Such reduced models have a low computational cost since they are free of high oscillations. The parareal method allows to improve their accuracy in a few iterations through corrections by fine solvers of the full model. We demonstrate the accuracy and the efficiency of the strategy in numerical experiments of short time and long time simulations of charged particles submitted to a large magnetic field. In addition, the convergence of the parareal method is obtained uniformly with respect to the vanishing stiff parameter.
7.13 Two level domain decomposition preconditionner for hypersingular integral operator
In 20, we consider an hypersingular symmetric positive definite operators stemming from boundary integral equation (BIE), and we adapt the twolevel domain decomposition stratgey based on the GenEO coarse space to this context of nonlocal operator. We derive an estimate on the condition number, showing the scalability of this method, and provide numerical results based on the BemTool and HPDDM libraries that confirm the theoretically predicted efficiency of this approach.
7.14 Treatment of cross points in optimized Schwarz Methods
In 17 we consider a scalar wave propagation in harmonic regime modelled by Helmholtz equation with heterogeneous coefficients. Using the MultiTrace Formalism (MTF), we propose a new variant of the Optimized Schwarz Method (OSM) that can accomodate the presence of crosspoints in the subdomain partition. This leads to the derivation of a strongly coercive formulation of our Helmholtz problem posed on the union of all interfaces. The corresponding operator takes the form "identity + contraction".
In the field of Domain Decomposition (DD), Optimized Schwarz Method (OSM) appears to be one of the prominent techniques to solve large scale timeharmonic wave propagation problems. It is based on appropriate transmission conditions using carefully designed impedance operators to exchange information between subdomains. The efficiency of such methods is however hindered by the presence of crosspoints, where more than two subdomains abut, if no appropriate treatment is provided. In 32, we propose a new treatment of the crosspoint issue for the Helmholtz equation that remains valid in any geometrical interface configuration. We exploit the multitrace formalism to define a new exchange operator with suitable continuity and isometry properties. We then develop a complete theoretical framework that generalizes classical OSM to partitions with cross points and contains a rigorous proof of geometric convergence, uniform with respect to the mesh discretization, for appropriate positive impedance operators. Extensive numerical results in 2D and 3D are provided as an illustration of the performance of the proposed method.
7.15 Analysis of local Multitrace formulations in electromagnetism
In 27 we consider the timeharmonic electromagnetic transmission problem for the unit sphere. Appealing to a vector spherical harmonics analysis, we prove the first stability result of the local multiple trace formulation (MTF) for electromagnetics, originally introduced by Hiptmair and JerezHanckes [Adv. Comp. Math. 37 (2012), 3791] for the acoustic case, paving the way towards an extension to general piecewise homogeneous scatterers. Moreover, we investigate preconditioning techniques for the local MTF scheme and study the accumulation points of induced operators. In particular, we propose a novel secondorder inverse approximation of the operator. Numerical experiments validate our claims and confirm the relevance of the preconditioning strategies.
8 Bilateral contracts and grants with industry
8.1 Bilateral contracts with industry
 Contract with IFPen, January 2021  December 2023, that funds the Phd thesis of Antoine Lechevallier on "Accélération des simulations avec évènements ponctuels récurrents par apprentissage : application au stockage geologique du CO2" . Supervisor F. Nataf, S. Desroziers and T. Faney.
9 Partnerships and cooperations
9.1 International initiatives
9.1.1 Inria international partners
Informal international partners
 J. Demmel, UC Berkeley, USA
 F. Assous, Ariel University, Israel
 R. Hiptmair, ETH Zurich, Suisse
9.2 European initiatives
9.2.1 FP7 & H2020 Projects
EMC2, ERC Synergy project
 Title: Extremescale Mathematicallybased Computational Chemistry
 Duration: Nov 2019  April 2026
 Coordinators: E. Cances, L. Grigori, Y. Maday, and J. P. Piquemal

Partners:
 LJLL and LCT, Sorbonne Université (France)
 Cermics, ECOLE NATIONALE DES PONTS ET CHAUSSEES (France)
 Inria contact: Laura Grigori
 Summary: EMC2 is an ERC Synergy project that aims to overcome some of the current limitations in the field of molecular simulation and aims to provide academic communities and industrial companies with new generation, dramatically faster and quantitatively reliable molecular simulation software. This will enable those communities to address major technological and societal challenges of the 21st century in health, energy, and the environment for instance.
9.3 National initiatives
9.3.1 ANR
B3DCMB
ANR Decembre 2017  March 2022 This project is in the area of data analysis of cosmological data sets as collected by contemporary and forthcoming observatories. This is one of the most dynamic areas of modern cosmology. Our special target are data sets of Cosmic Microwave Background (CMB) anisotropies, measurements of which have been one of the most fruitful of cosmological probes. CMB photons are remnants of the very early evolution of the Universe and carry information about its physical state at the time when the Universe was much younger, hotter, and denser, and simpler to model mathematically. The CMB has been, and continues to be, a unique source of information for modern cosmology and fundamental physics. The main objective of this project is to empower the CMB data analysis with novel high performance tools and algorithms superior to those available today and which are capable of overcoming the existing performance gap. Partners: AstroParticules et Cosmologie Paris 7 (PI R. Stompor), ENSAE Paris Saclay.
ANR CinePara
October 2015  March 2021, Laura Grigori is Principal Coordinator for Inria Paris. Funding for Inria Paris is 145 Keuros. The funding for Inria is to combine Krylov subspace methods with parallel in time methods. Partners: University Pierre and Marie Curie, J. L. Lions Laboratory (PI Y. Maday), CEA, Paris Dauphine University, Paris 13 University.
Nonlocal DD
ANR appel à projet générique October 2015  September 2020
This project in scientific computing aims at developing new domain decomposition methods for massively parallel simulation of electromagnetic waves in harmonic regime. The specificity of the approach that we propose lies in the use of integral operators not only for solutions local to each subdomain, but for coupling subdomains as well. The novelty of this project consists, on the one hand, in exploiting multitrace formalism for domain decomposition and, on the other hand, considering optimized Schwarz methods relying on Robin type transmission conditions involving quasilocal integral operators. Partners: Inria Alpines (PI X. Claeys), Inria Poems, Inria Magique 3D.
Muffin
ANR appel à projet générique 2019.
S. Hirstoaga and P.H. Tournier are members of the project MUFFIN, whose objective is to explore and optimize original computational scenarios for multiscale and high dimensional transport codes, with priority applications in plasma physics. Several approximation methods are planed to be developed. It is at the frontier of computing and numerical analysis and intends to reduce the computational burden in the context of intensive calculation. Principal Investigator: B. Després (Sorbonne University).
10 Dissemination
10.1 Promoting scientific activities
10.1.1 Scientific events: selection
Member of the conference program committees
 Laura Grigori, member of PC of International Conference on Parallel Processing, ICPP 2020.
10.1.2 Journal
Member of the editorial boards
 Laura Grigori, March 2014 – current. Member of the editorial board for the SIAM book series Software, Environments and Tools. See
http://
bookstore. .siam. org/ softwareenvironmentsandtools/  Laura Grigori, January 2016 – current. Associate Editor, SIAM Journal on Scientific Computing.
 Laura Grigori, January 2017 – current. Associate Editor, SIAM Journal on Matrix Analysis and Applications.
 Laura Grigori, January 2016 – current. Editorial board, Numerical linear algebra with applications Journal, Wiley.
 Frédéric Nataf, 2012 – current. Editorial board of the “Journal of Numerical Mathematics”
10.1.3 Invited talks
 Laura Grigori, Lecture in the online seminar series on numerical linear algebra, July 8, 2020, slides and the talk, Communication avoiding low rank matrix approximation a unified perspective on deterministic and randomized approaches.
 Laura Grigori, Panel of the GenX Workshop, Planning for exascale continuum mechanics software, UK, November 2020.
 S. Hirstoaga gave a talk at "Rencontres InriaLJLL en calcul scientifique", October 19, 2020.
 S. Hirstoaga gave a talk at "La demiheure de science", October 1st, 2020.
 S. Hirstoaga gave a talk at WCCMECCOMAS Congres 2020, MS 399: Numerical Aspects of Transport, Boltzmann and Kinetic Equations, January 2021.
10.1.4 Leadership within the scientific community
 Laura Grigori, member elected of SIAM Council, January 2018  December 2020, the committee supervising the scientific activities of SIAM. Nominated by a Committee and elected by the members of SIAM.
 Laura Grigori, Vicechair of the PRACE (Partnership for Advanced Computing in Europe) Scientific Steering Committee, March 2020  March 2021.
 Laura Grigori, Member of Scientific Committee of the Prace Fast Track Call for Proposals in support to mitigate impact of Covid 19, opened in March 2020.
 Laura Grigori is the coordinator of the High Performance in Scientific Computing Major of second year of Mathematics and Applications Master, Sorbonne University.
10.1.5 Scientific expertise
 Laura Grigori: November 2015  current, expert to the Scientific Commission of IFPEN (French Petroleum Institute). Evaluation of research programs, PhD theses, work representing a total of 5 days per year.
10.1.6 Research administration
 Laura Grigori, president of the Hiring Committee for a Maitre de Conférences position at LJLL, Sorbonne University, 2020.
 Laura Grigori, member of selection jury for Chair (Professor position) in high performance computing, Delft University, Netherlands, 2020.
10.2 Teaching  Supervision  Juries
10.2.1 Teaching
 Licence 3: S. Hirstoaga, Méthodes numériques, 53 HETD (cours + TP), EISE 3, Polytech Sorbonne.
 Master 1: Xavier Claeys, Initiation to C++, 36 hrs of programming tutorials in C++, SU.
 Master 1: Xavier Claeys, Computational Linear Algebra, 51 hrs of lectures, SU.
 Master 1: Xavier Claeys, Approximation of PDEs, 51 hrs of lectures on variationnal theory of PDEs, SU.
 Master 2: Xavier Claeys, PDEs from analysis to implementation, 24hrs of lectures on the C++ programming of a PDE solver, SU.
 Master 2: Laura Grigori, Winter 2020, Participation in the course on High Performance Computing given at Sorbonne University, Computer Science, intervention for 8 hours per year.
 Master 2: Laura Grigori, Course on High performance computing for numerical methods and data analysis, Master 2nd year, Mathematics & Applications, Sorbonne University, 24 hours per year.
 Licence 1: S. Hirstoaga, analysis and matrix calculus, 24 HETD, INSEP, Sorbonne University.
 Master 2: Frédéric Nataf, Course on Domain Decomposition Methods, Sorbonne University, 22h.
10.2.2 Supervision
 PhD in progress: Antoine Lechevallier, Sorbonne University, since Oct 2020, advisors Frédéric Nataf (SU), Sylvain Desroziers (IFPEN) and Thibault Faney (IFPEN).
 PhD in progress: Matthias Beaupere, since October 2019 (funded by ERC Synergy EMC2), advisor Laura Grigori.
 PhD in progress: Igor Chollet, since October 2017 (funded by ICSD), advisors Xavier Claeys, Pierre Fortin, Laura Grigori.
 PhD in progress: Thibault Cimic, since October 2017 (funded by ANR B3DCMB), advisor Laura Grigori.
 PhD in progress: Thanh Van Nguyen, since November 2017 (funded by ANR CinePara), advisor Laura Grigori.
 PhD in progress: Houssam Houssein, since January 2018 (CIFRE, funded by Airthium), advisor Frédéric Hecht.
 PhD in progress: Siwar Badreddine, since October 2020 (funded by ERC Synergy EMC2), advisor Laura Grigori.
 PhD in progress: David Frenkiel, since October 2020 (funded by ERC Synergy EMC2), advisor Laura Grigori.
10.2.3 Juries
 Xavier Claeys was member of the jury for the PhD thesis of Emile Parolin, defended at Université ParisSaclay on December the 4th 2020.
 Laura Grigori was a member of the jury of HDR habilitation of V. Ehrlacher, CERMICS Ecole des Ponts, Dec 2020.
 Laura Grigori was president of the jury of HDR habilitation of Antoine Levitt, CERMICS Ecole des Ponts, Oct 2020.
 Frédéric Nataf was president of the PhD defense of Ani Miraci, 2020, Sorbonne Université
 Frédéric Nataf was president of the PhD defense of Hugo Girardon, 2020, Ecole Polytechnique.
 Frédéric Nataf was reviewer of the HdR defense of Robin Bouclier, 2020, Université de Toulouse.
10.3 Popularization
10.3.1 Education
 3hour lecture at Algorithmes émergents pour le calcul scientifique à grande échelle, april 2020, in the context of Actions formation DIM RFSI et Math Innov.
11 Scientific production
11.1 Major publications
 1 articleEssential spectrum of local multitrace boundary integral operatorsIMA J. Appl. Math.8162016, 961983URL: https://doiorg.accesdistant.sorbonneuniversite.fr/10.1093/imamat/hxw019
 2 articleIntegral equations for electromagnetic scattering at multiscreensIntegral Equations Operator Theory8412016, 3368URL: https://doiorg.accesdistant.sorbonneuniversite.fr/10.1007/s0002001522425
 3 articleSecond kind boundary integral equation for multisubdomain diffusion problemsAdv. Comput. Math.4352017, 10751101URL: https://doiorg.accesdistant.sorbonneuniversite.fr/10.1007/s1044401795170
 4 articleCommunicationoptimal parallel and sequential QR and LU factorizationsSIAM Journal on Scientific Computing1short version of technical report UCB/EECS200889 from 20082012, 206239
 5 book An Introduction to Domain Decomposition Methods: algorithms, theory and parallel implementation SIAM 2015
 6 articleCALU: a communication optimal LU factorization algorithmSIAM Journal on Matrix Analysis and Applications322011, 13171350
 7 articleEnlarged Krylov Subspace Conjugate Gradient methods for Reducing CommunicationSIAM Journal on Matrix Analysis and Applications3722016, 744773
 8 articleAn Additive Schwarz Method Type Theory for Lions's Algorithm and a Symmetrized Optimized Restricted Additive Schwarz MethodSIAM J. Sci. Comput.3942017, A1345A1365URL: https://hal.archivesouvertes.fr/hal01278347
 9 articleNew development in FreeFem++J. Numer. Math.20342012, 251265
 10 misc HPDDM: HighPerformance Unified framework for Domain Decomposition methods, MPIC++ library 2014
 11 unpublishedAdaptive Domain Decomposition method for Saddle Point problem in Matrix FormNovember 2019, working paper or preprint
 12 articleAbstract robust coarse spaces for systems of PDEs via generalized eigenproblems in the overlapsNumer. Math.12642014, 741770URL: http://dx.doi.org/10.1007/s002110130576y
11.2 Publications of the year
International journals
 13 articleAdaptive solution of linear systems of equations based on a posteriori error estimatorsNumerical Algorithms84May 2020, 331364
 14 article Adaptive linear solution process for singlephase Darcy flow Oil & Gas Science and Technology  Revue d'IFP Energies nouvelles 75 2020
 15 article An a posterioribased adaptive preconditioner for controlling a local algebraic error norm BIT Numerical Mathematics August 2020
 16 article Lineartime CUR approximation of BEM matrices Journal of Computational and Applied Mathematics 368 112528 April 2020
 17 article A new variant of the Optimised Schwarz Method for arbitrary nonoverlapping subdomain partitions ESAIM: Mathematical Modelling and Numerical Analysis 2020
 18 articleNumerical algorithms for highperformance computational sciencePhilosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences3782166March 2020, 20190066
 19 article Adaptive hierarchical subtensor partitioning for tensor compression SIAM Journal on Scientific Computing 2021
 20 articleTwolevel preconditioning for hversion boundary element approximation of hypersingular operator with GenEONumerische Mathematik146September 2020, 597–628
 21 articleMathematical Analysis of Robustness of TwoLevel Domain Decomposition Methods with respect to Inexact Coarse SolvesNumerische MathematikFebruary 2020, 811833
 22 articleAccelerating linear system solvers for timedomain component separation of cosmic microwave background dataAstronomy and Astrophysics  A&A638June 2020, A73
International peerreviewed conferences
 23 inproceedings Analysis of a List Scheduling Algorithm for Task Graphs on Two Types of Resources IPDPS 2020  34th IEEE International Parallel and Distributed Procesing Symposium New Orleans / Virtual, United States May 2020
Doctoral dissertations and habilitation theses
 24 thesis Schwarz methods and boundary integral equations Sorbonne Université January 2020
Reports & preprints
 25 misc A Multilevel Schwarz Preconditioner Based on a Hierarchy of Robust Coarse Spaces December 2020
 26 misc Fullwaveform redatuming via a TRAC approach: a first step towards target oriented inverse problem January 2021
 27 misc Local Multiple Traces Formulation for Electromagnetics: Stability and Preconditioning for Smooth Geometries March 2020
 28 misc Randomized GramSchmidt process with application to GMRES January 2021
 29 misc Communication avoiding low rank approximation based on QR with tournament pivoting January 2021
 30 misc Analysis of the SORAS domain decomposition preconditioner for nonselfadjoint or indefinite problems November 2020
 31 misc An overlapping splitting double sweep method for the Helmholtz equation January 2021
 32 misc Robust treatment of cross points in Optimized Schwarz Methods January 2021
 33 misc Higherorder QR with tournament pivoting for tensor compression December 2020
 34 misc Reduced modelbased parareal simulations of oscillatory singularly perturbed ordinary differential equations January 2021
 35 misc Parallel Tensor Train through Hierarchical Decomposition February 2021
11.3 Cited publications
 36 articleTimereversed absorbing condition: application to inverse problemsInverse Problems2762011, 065003URL: http://stacks.iop.org/02665611/27/i=6/a=065003
 37 articleTimedependent wave splitting and source separationJournal of Computational Physics3302017, 981  996URL: http://www.sciencedirect.com/science/article/pii/S0021999116305198