
2023 Activity Report: Project-Team ALPINES

RNSR: 201321071B
  • Research center: Inria Paris Centre at Sorbonne University
  • In partnership with: CNRS, Sorbonne Université
  • Team name: Algorithms and parallel tools for integrated numerical simulations
  • In collaboration with: Laboratoire Jacques-Louis Lions (LJLL)
  • Domain: Networks, Systems and Services, Distributed Computing
  • Theme: Distributed and High Performance Computing

Keywords

Computer Science and Digital Science

  • A6.1.1. Continuous Modeling (PDE, ODE)
  • A6.1.4. Multiscale modeling
  • A6.1.5. Multiphysics modeling
  • A6.2.1. Numerical analysis of PDE and ODE
  • A6.2.5. Numerical Linear Algebra
  • A6.2.7. High performance computing
  • A6.3. Computation-data interaction
  • A6.3.1. Inverse problems
  • A7.1. Algorithms

Other Research Topics and Application Domains

  • B3.3.1. Earth and subsoil
  • B9.5.2. Mathematics
  • B9.5.3. Physics

1 Team members, visitors, external collaborators

Research Scientists

  • Frédéric Nataf [Team leader, CNRS, Senior Researcher, from May 2023, HDR]
  • Laura Grigori [Team leader, INRIA, Senior Researcher, until Apr 2023, HDR]
  • Sever Hirstoaga [INRIA, Researcher, HDR]
  • Emile Parolin [INRIA, Researcher]
  • Pierre-Henri Tournier [CNRS, Researcher]

Faculty Member

  • Frédéric Hecht [Sorbonne University, Emeritus, HDR]

Post-Doctoral Fellows

  • Mathieu Verite [INRIA, Post-Doctoral Fellow]
  • Yanfei Xiang [INRIA, Post-Doctoral Fellow, from Feb 2023 until Nov 2023]

PhD Students

  • Siwar Badreddine [INRIA]
  • Matthias Beaupere [INRIA, until Feb 2023]
  • Jean-Guillaume De Damas [INRIA]
  • Nicola Galante [INRIA, from Jul 2023]
  • Antoine Lechevallier [Sorbonne Université]
  • Edouard Timsit [INRIA]

Technical Staff

  • Victor Lederer [University of Vienna, Engineer, until Oct 2023]
  • Daniel Torres Gonzales [Sorbonne Université, Engineer]

Administrative Assistant

  • Laurence Bourcier [INRIA]

2 Overall objectives

2.1 Introduction

The focus of our research is on the development of novel parallel numerical algorithms and tools appropriate for state-of-the-art mathematical models used in complex scientific applications, and in particular numerical simulations. The proposed research program is by nature multi-disciplinary, interweaving aspects of applied mathematics and computer science, as well as those of several specific applications, such as porous media flows, elasticity, wave propagation in multi-scale media, molecular simulations, and inverse problems.

Our first objective is to develop numerical methods and tools for complex scientific and industrial applications that will enhance their scalable execution on the emerging heterogeneous, hierarchical architectures of massively parallel machines. Our second objective is to integrate the novel numerical algorithms into a middle layer that hides as much as possible of the complexity of massively parallel machines from their users.

3 Research program

3.1 Overview

The research described here is directly relevant to several steps of the numerical simulation chain. Given a numerical simulation expressed as a set of differential equations, our research focuses on mesh generation methods for parallel computation, novel numerical algorithms for linear algebra, and algorithms and tools for their efficient and scalable implementation on high performance computers. The validation and exploitation of the results are performed with collaborators from the application fields and rely on existing tools. In summary, the topics studied in our group are the following:

  • Numerical methods and algorithms
    • Mesh generation for parallel computation
    • Solvers for numerical linear algebra:

      domain decomposition methods, preconditioning for iterative methods

    • Computational kernels for numerical linear algebra
    • Tensor computations for high dimensional problems
  • Validation on numerical simulations and other numerical applications

3.2 Domain specific language - parallel FreeFem

In the engineering, research, and teaching communities, there is a strong demand for simulation frameworks that are simple to install and use, efficient, sustainable, and that solve complex problems efficiently and accurately when no dedicated tools or codes are available. In our group we develop FreeFem++, a user-dedicated language for solving PDEs. The goal of FreeFem++ is not to be a substitute for complex numerical codes, but rather to provide an efficient and relatively generic tool for:

  • getting a quick answer to a specific problem,
  • prototyping the resolution of a new complex problem.

The current users of FreeFem++ are mathematicians, engineers, university professors, and students. For these users, installing public libraries such as MPI, MUMPS, Ipopt, BLAS, LAPACK, OpenGL, FFTW, SCOTCH, PETSc, and SLEPc is generally very difficult. For this reason, the authors of FreeFem++ have created a user-friendly language and, over the years, have enriched its capabilities and provided tools for compiling FreeFem++ so that users do not need special knowledge of computer science. This entails substantial work on porting the software to different emerging architectures.

Today, the main components of parallel FreeFem++ are:

  • definition of a coarse grid,
  • splitting of the coarse grid,
  • mesh generation of all subdomains of the coarse grid, and construction of parallel data structures for vectors and sparse matrices from the mesh of the subdomain,
  • discretization by a chosen numerical method,
  • call to a linear solver,
  • analysis of the result.

All these components are parallel, except for the last point, which is not the focus of our research. For the moment, however, the parallel mesh generation algorithm is very simple and insufficient; for example, it handles only polygonal geometries. Devising a better parallel mesh generation algorithm is one of the goals of our project. In addition, in the current version of FreeFem++, the parallelism is not hidden from the user: it is expressed through direct calls to MPI. Our goal is also to hide all the MPI calls in the domain-specific language part of FreeFem++. In addition to these in-house domain decomposition methods, FreeFem++ is also linked to the PETSc solvers, which enables easy use of third-party parallel multigrid methods.

3.3 Solvers for numerical linear algebra

Iterative methods are widely used in industrial applications, and preconditioning is the most important research subject here. Our research considers domain decomposition methods 2 and iterative methods, and its goal is to develop solvers that are suitable for parallelism and that exploit the fact that the matrices arise from the discretization of a system of PDEs on unstructured grids.

One of the main challenges that we address is the lack of robustness and scalability of existing methods such as incomplete LU factorizations or Schwarz-based approaches, for which the number of iterations increases significantly with the problem size or with the number of processors. This is often due to the presence of several low-frequency modes that hinder the convergence of the iterative method. To address this problem, we study different approaches for dealing with the low-frequency modes, such as coarse-space correction in domain decomposition, or deflation techniques.
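The effect of a coarse-space correction on low-frequency modes can be illustrated with a minimal NumPy sketch (illustrative only; this is not our production solvers such as HPDDM): on a 1D Poisson matrix, conjugate gradient is run with and without a hypothetical two-level additive preconditioner whose coarse space is spanned by piecewise-constant vectors over artificial subdomains.

```python
import numpy as np

def cg(A, b, M=None, tol=1e-8, maxit=1000):
    """(Preconditioned) conjugate gradient; returns the solution and the iteration count."""
    x = np.zeros(len(b))
    r = b.copy()
    z = M(r) if M else r.copy()
    p = z.copy()
    rz = r @ z
    for k in range(1, maxit + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k
        z = M(r) if M else r.copy()
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

n = 400
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Poisson: low-frequency modes slow CG down
b = np.ones(n)

# Coarse space W: piecewise-constant vectors over 40 artificial "subdomains"
nsub = 40
W = np.zeros((n, nsub))
for j in range(nsub):
    W[j * (n // nsub):(j + 1) * (n // nsub), j] = 1.0
Ac = W.T @ A @ W  # coarse operator

def two_level(r):
    """Additive preconditioner: fine-level identity plus coarse correction W Ac^{-1} W^T."""
    return r + W @ np.linalg.solve(Ac, W.T @ r)

x_plain, it_plain = cg(A, b)
x_coarse, it_coarse = cg(A, b, M=two_level)
print(it_plain, it_coarse)  # the coarse correction markedly reduces the iteration count
```

The coarse solve is cheap (a 40x40 system) yet removes the smooth error components that dominate the condition number, which is the same mechanism exploited by coarse spaces in two-level domain decomposition.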

We also focus on developing boundary integral equation methods adapted to the simulation of wave propagation in complex physical situations, and that lend themselves to the use of parallel architectures. The final objective is to bring the state of the art on boundary integral equations closer to contemporary industrial needs. From this perspective, we investigate domain decomposition strategies in conjunction with the boundary element method, as well as acceleration techniques (H-matrices, FMM and the like) relevant to multi-material and/or multi-domain configurations. Our work on this topic also includes numerical implementation on large-scale problems, which is challenging due to the peculiarities of boundary integral equations.

3.4 Computational kernels for numerical linear and multilinear algebra

The design of new numerical methods that are robust and that have well proven convergence properties is one of the challenges addressed in Alpines. Another important challenge is the design of parallel algorithms for the novel numerical methods and the underlying building blocks from numerical linear algebra. The goal is to enable their efficient execution on a diverse set of node architectures and their scaling to emerging high-performance clusters with an increasing number of nodes.

Increased communication cost is one of the main challenges in high performance computing, which we address in our research by investigating algorithms that minimize communication, such as communication-avoiding algorithms. Communication-avoiding algorithmic design is an approach our group has been developing for more than ten years (initially in collaboration with researchers from UC Berkeley and CU Denver). While our first results concerned direct factorization methods such as LU or QR, our recent work focuses on designing robust algorithms for computing the low-rank approximation of a matrix or a tensor. We consider both deterministic and randomized approaches.
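As an illustration of the randomized approach to low-rank approximation, here is a minimal NumPy sketch of the classical randomized range finder (in the spirit of Halko, Martinsson and Tropp); it is a textbook stand-in, not the team's communication-avoiding implementations, and all names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_low_rank(A, rank, oversample=10, power_iters=0):
    """Randomized range finder: A ≈ U diag(s) Vt, built from one pass over A plus small dense work."""
    n = A.shape[1]
    Omega = rng.standard_normal((n, rank + oversample))  # random test matrix
    Y = A @ Omega                                        # sample the range of A
    for _ in range(power_iters):                         # optional: sharpen the range for slow decay
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                               # orthonormal basis of the sampled range
    B = Q.T @ A                                          # small (rank+oversample) x n matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

# A 200x150 matrix of exact rank 10
m, n, r = 200, 150, 10
A = (rng.standard_normal((m, r)) * 2.0 ** -np.arange(r)) @ rng.standard_normal((r, n))

U, s, Vt = randomized_low_rank(A, rank=r)
err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(err)  # essentially roundoff, since the sampled range captures A exactly
```

Only two passes over A are needed (to form Y and B), which is what makes this family of methods attractive from a communication standpoint.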

Our research also focuses on solving large problems that feature high dimensions, as in molecular simulations. The data in this case are represented by objects called tensors, or multilinear arrays. The goal is to design novel tensor techniques that allow their effective compression, i.e. their representation by simpler objects in small dimensions, while controlling the loss of information. The algorithms aim to be highly parallel, so as to handle the large number of dimensions and large data sets while preserving the information required to obtain the solution of the problem.
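Tensor compression can be illustrated with a minimal NumPy sketch of the truncated higher-order SVD (Tucker format); this is the textbook algorithm, given for illustration only, not the specific methods developed in the team:

```python
import numpy as np

rng = np.random.default_rng(1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: returns a small Tucker core and one factor matrix per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        # Mode unfolding: keep the r leading left singular vectors
        M = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    """Multiply the core by every factor matrix to rebuild the full tensor."""
    T = core
    for mode, U in enumerate(factors):
        T = np.moveaxis(np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)
    return T

# A 20x20x20 tensor of exact multilinear rank (3, 3, 3)
G = rng.standard_normal((3, 3, 3))
Us = [np.linalg.qr(rng.standard_normal((20, 3)))[0] for _ in range(3)]
T = tucker_reconstruct(G, Us)

core, factors = hosvd(T, (3, 3, 3))
err = np.linalg.norm(T - tucker_reconstruct(core, factors)) / np.linalg.norm(T)
print(core.shape, err)  # 27 core entries stand in for 8000, with error at roundoff level
```

The compressed representation stores the small core plus one thin factor per mode, which is the "simpler objects in small dimensions" mentioned above; controlling the truncation ranks controls the loss of information.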

4 Application domains

Our work is by design as general-purpose as possible within scientific computing based on partial differential equation modeling, including multiphysics and multiscale modeling. Moreover, since our methods are available in standalone open-source libraries and in the free finite element language FreeFEM, which is developed in our team, they have many potential users that we are not necessarily aware of. We give here some recent works performed in collaboration with team members.

4.1 Radiative transfer in the atmosphere

The hierarchical matrix acceleration techniques implemented in our parallel Htool 7.1.4 library, which is interfaced with FreeFEM, allow us to solve the radiative transfer equations efficiently, with complexity n log(n), where n is the number of vertices of the physical domain. The method can be used to study the change of temperature in the atmosphere, and in particular the effect of clouds and greenhouse gases 10, 27.

4.2 Inverse problems

4.2.1 Time Reversal techniques

We focus on methods blending time reversal techniques and absorbing boundary conditions (ABC) used in a non-standard way. Since the seminal paper by [M. Fink et al., Imaging through inhomogeneous media using time reversal mirrors. Ultrasonic Imaging, 13(2):199, 1991.], time reversal has been a subject of very active research. The principle is to back-propagate signals to the sources that emitted them. The initial experiment was to refocus, very precisely, a recorded signal after passing through a barrier consisting of randomly distributed metal rods. In [de Rosny and Fink. Overcoming the diffraction limit in wave physics using a time-reversal mirror and a novel acoustic sink. Phys. Rev. Lett., 89 (12), 2002], the source that created the signal is time-reversed in order to obtain a perfect time reversal experiment. In 29, we improve this result by showing that refocusing can be achieved numerically without knowing the source, at the expense of not being able to recover the signal in the vicinity of the source. In 30, we have extended these methods to non-uniform media.

4.2.2 Seismic imaging

In the context of seismic imaging, frequency-domain full-waveform inversion (FWI) is suitable for long-offset stationary-recording acquisition, since reliable subsurface models can be reconstructed with a few frequencies and attenuation is easily implemented without computational overhead. In the frequency domain, wave modeling is a Helmholtz-type boundary-value problem which requires solving a large and sparse system of linear equations per frequency with multiple right-hand sides (sources).

For sparse node acquisitions covering large domains (more than 50 million unknowns), an iterative solver should be the method of choice. However, fast convergence remains challenging in the high-frequency regime, due to the non-definiteness of the operator and to the discretization constraints needed to minimize dispersion, hence requiring efficient preconditioners. For such wave propagation problems, our Schwarz domain decomposition preconditioners are good candidates to accelerate the time to solution and to provide robustness with respect to heterogeneities and frequency, for both finite difference and finite element discretizations 34.

4.2.3 Electroencephalography

The inverse source problem in electroencephalography (EEG) is usually solved using a numerical head model composed of different layers of tissues (scalp, skull, brain, ...) having constant electric conductivity. However, the conductivity of the skull can hardly be assumed to be constant: (i) it mixes hard and spongy bones, especially for neonates (fontanels), and (ii) it varies between individuals, and through time for a single person.

In 9, we consider a new numerical method for the EEG inverse source problem which takes into account the heterogeneity of the skull. In particular, in the first step of source reconstruction, the so-called "cortical mapping" procedure, which deals with (i) data completion on the scalp, knowing partial measurements of the electric potential at the electrodes, and (ii) data transmission from the scalp to the brain surface, we use the quasi-reversibility method, which provides a regularized solution of Cauchy problems in a bounded domain. The problem is solved using the finite element method in the scalp and skull layers.

4.2.4 Medical imaging with microwave tomography

Microwave tomography is a novel imaging modality with many potentially attractive medical applications; it is based on the difference between the dielectric properties of normal and diseased tissues. Microwave tomography features rapid data acquisition, and together with rapid tomographic reconstructions allows injuries to be detected, identified and monitored continuously.

From a computational point of view, microwave imaging requires the solution of an inverse problem based on a minimisation algorithm. Reconstruction algorithms are computationally intensive, with successive solutions of the forward problem requiring efficient numerical modelling and high-performance parallel computing. This methodology involves distinct research fields: optimisation, inverse problems, and approximation and solution methods for the simulation of the forward problem modelled by Maxwell's equations. The latter is challenging in itself, as the modelling must accurately account for the high heterogeneity and complexity of the different tissues.

Our numerical framework for microwave imaging is based on the FreeFEM 7.1.1 finite element software, and the forward problem is solved efficiently in parallel using Domain Decomposition methods implemented in the HPDDM 7.1.2 library.

Brain stroke detection and monitoring

Stroke, or cerebrovascular accident (CVA), is a major cause of disability and death worldwide. About 85% of CVAs are ischemic due to cerebral infarction, caused by an interruption of the blood supply to some part of the brain, and 15% are haemorrhagic. Differentiating between ischemic and haemorrhagic CVAs is an essential part of the initial workup of the patient, and rapid and accurate diagnosis is crucial for patient survival; here, neuroimaging plays a vital role. Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are the gold standards, but their use is not well suited to efficient medical care of CVAs, as they are bulky diagnostic instruments and cannot be used for continuous brain monitoring. A non-invasive and transportable/portable device for the characterisation and monitoring of CVAs would have clear clinical applications, beginning with the very first instance of patient care in an ambulance and extending to continuous patient monitoring at the hospital.

Microwave tomography holds great promise for rapid brain stroke diagnosis. The raw data acquired by the microwave imaging system can be wirelessly transferred to a remote computing center, where the tomographic images will be computed. The images can then be quickly transferred to the hospital.

In 33, 32, we demonstrate on synthetic data the feasibility of a microwave imaging technique for the characterisation and monitoring of strokes. Using high performance computing, we are able to obtain a tomographic reconstruction of the brain in less than two minutes.

Detection and imaging of rotator cuff tears

One of the most challenging shoulder injuries is rotator cuff tear, which increases with aging and particularly happens among athletes. These tears cause pain and highly affect the functionality of the shoulder.

In  17, 16, we propose to study the feasibility of microwave tomographic imaging to detect these tears. We introduce a wearable imaging system design, adapted to the real shoulder structure and designed to surround it partially. We investigate the efficiency of the imaging system numerically using realistic CAD models for shoulder profile and bones, including humerus and scapula, and taking into account the dielectric properties of the different tissues (bone, tendon, muscle, skin, synovial fluid). The reconstructed images of the shoulder joint obtained using noisy synthetic data show that the proposed imaging system is capable of accurately detecting and localizing large (5mL) and partial (1mL) rotator cuff tears.

4.3 Numerical methods for time harmonic wave propagation problems

Acoustic, electromagnetic and elastic linear waves are ubiquitous phenomena in science and engineering. The numerical simulation of their propagation and interaction is a core task in areas like medical imaging, seismic remote sensing, design of electronic devices, atmospheric particle scattering, radar and sonar modelling, etc. The design of appropriate numerical schemes is challenging because of approximation issues arising from the fact that the solutions are highly oscillatory, which usually requires a large computational effort to produce a meaningful numerical approximation. In addition, common formulations suffer from stability issues, such as numerical dispersion (also known as the pollution effect) and sign-indefiniteness, which typically limit the achievable accuracy. The variety of applications and the challenges of their numerical simulation motivate a great research effort in designing new efficient schemes: new approximation spaces that use degrees of freedom sparingly (wave-based and Trefftz methods), preconditioners for solving large linear systems (e.g. by domain decomposition), new formulations, acceleration techniques, etc.

4.4 Data analysis in astrophysics

We focus on computationally intensive numerical algorithms arising in the data analysis of current and forthcoming Cosmic Microwave Background (CMB) experiments in astrophysics. This application is studied in collaboration with researchers from University Paris Diderot, and the objective is to make available the algorithms to the astrophysics community, so that they can be used in large experiments.

In CMB data analysis, astrophysicists produce and analyze multi-frequency 2D images of the universe as it was at 5% of its current age. The new generation of CMB experiments observes the sky with thousands of detectors over many years, producing overwhelmingly large and complex data sets, which nearly double in size every year, in step with Moore's law. Planck is a keystone satellite mission which has been developed under the auspices of the European Space Agency (ESA).

4.5 Molecular simulations

Molecular simulation is one of the most dynamic areas of scientific computing. Its field of application is very broad, ranging from theoretical chemistry and drug design to materials science and nanotechnology. It provides many challenging problems to mathematicians, and computer scientists.

In the context of the ERC Synergy Grant EMC2 we address several important limitations of state of the art molecular simulation. In particular, the simulation of very large molecular systems, or smaller systems in which electrons interact strongly with each other, remains out of reach today. In an interdisciplinary collaboration between chemists, mathematicians and computer scientists, we focus on developing a new generation of reliable molecular simulation algorithms and software.

5 Social and environmental responsibility

5.1 Impact of research results

Our activities in scientific computing help in the design and optimization of non-fossil energy resources. Here are three examples we are aware of, among many others:

  • the startup Airthium specialized in decarbonized energy uses our free parallel finite element software FreeFEM.
  • The French electricity company EDF has integrated our domain decomposition methods GenEO via the open source library HPDDM in its simulation software Salomé in order to be able to perform large scale computations related to nuclear reactors.
  • In the context of climate change, our work on radiative transfer studies the change of temperature in the atmosphere, in particular due to the effect of clouds and greenhouse gases.

6 Highlights of the year

  • The HPDDM software, available through the PETSc PCHPDDM routines, has been extended to saddle-point problems using the GenEO method presented in 12; for details see PCHPDDM.
  • These dissemination efforts around our GenEO method have enabled EDF engineers to conduct thermo-mechanical simulations on a problem with 127 million degrees of freedom using 2000 cores. Only our algorithm demonstrated the robustness and speed required for these highly nonlinear calculations, where the Newton method relies on very accurate linear solves to maintain quadratic convergence.


7 New software, platforms, open data

7.1 New software

7.1.1 FreeFem++

  • Name:
    FreeFem++
  • Scientific Description:

    FreeFem++ is a partial differential equation solver with its own language. FreeFem++ scripts can solve multiphysics nonlinear systems in 2D and 3D.

    Problems involving PDEs (2D, 3D) from several branches of physics, such as fluid-structure interactions, require interpolation of data on several meshes and their manipulation within one program. FreeFem++ includes a fast 2^d-tree-based interpolation algorithm and a language for the manipulation of data on multiple meshes (as a follow-up of bamg, now part of FreeFem++).

    FreeFem++ is written in C++ and the FreeFem++ language is a C++ idiom. It runs on macOS, Windows, and Unix machines. FreeFem++ replaces the older freefem and freefem+.

  • Functional Description:
    FreeFem++ is a PDE (partial differential equation) solver based on a flexible language that allows a large number of problems to be expressed (elasticity, fluids, etc) with different finite element approximations on different meshes.
  • URL:
  • Contact:
    Frédéric Hecht
  • Partner:
    UPMC

7.1.2 HPDDM

  • Scientific Description:
    HPDDM is an efficient implementation of various domain decomposition methods (DDM) such as one- and two-level Restricted Additive Schwarz methods, the Finite Element Tearing and Interconnecting (FETI) method, and the Balancing Domain Decomposition (BDD) method. This code has been proven to be efficient for solving various elliptic problems such as scalar diffusion equations, the system of linear elasticity, but also frequency domain problems like the Helmholtz equation. A comparison with modern multigrid methods can be found in the thesis of Pierre Jolivet.
  • Functional Description:
    HPDDM is an efficient implementation of various domain decomposition methods (DDM) such as one- and two-level Restricted Additive Schwarz methods, the Finite Element Tearing and Interconnecting (FETI) method, and the Balancing Domain Decomposition (BDD) method.
  • URL:
  • Contact:
    Pierre Jolivet
  • Participants:
    Frédéric Nataf, Pierre Jolivet

7.1.3 preAlps

  • Name:
    Preconditioned enlarged Krylov subspace methods
  • Keywords:
    Linear Systems Solver, Preconditioner
  • Functional Description:
    Contains enlarged Conjugate Gradient Krylov subspace method and Lorasc preconditioner.
  • URL:
  • Contact:
    Laura Grigori

7.1.4 HTool

  • Keyword:
    Hierarchical matrices
  • Functional Description:
    HTool is a C++ header-only library implementing compression techniques (e.g. Adaptive Cross Approximation) using hierarchical matrices (H-matrices). The library uses MPI and OpenMP for parallelism, and is interfaced with HPDDM for the solution of linear systems.
  • URL:
  • Contact:
    Pierre Marchand
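For illustration, here is a minimal NumPy sketch of partially pivoted Adaptive Cross Approximation, the kind of compression technique mentioned above, applied to an admissible block of a 1/|x - y| kernel between two well-separated clusters. It is a simplified stand-in for HTool's C++ implementation; all names and sizes are illustrative.

```python
import numpy as np

def aca(get_row, get_col, m, n, tol=1e-10, max_rank=30):
    """Partially pivoted ACA: builds A ≈ U @ V by touching only a few rows and columns of A."""
    U, V, used = [], [], set()
    i = 0
    for _ in range(max_rank):
        used.add(i)
        row = get_row(i)
        for u, v in zip(U, V):
            row = row - u[i] * v          # residual of row i
        j = int(np.argmax(np.abs(row)))
        pivot = row[j]
        if abs(pivot) < tol:
            break
        col = get_col(j)
        for u, v in zip(U, V):
            col = col - v[j] * u          # residual of column j
        U.append(col / pivot)
        V.append(row)
        unused = [k for k in range(m) if k not in used]
        if not unused:
            break
        i = max(unused, key=lambda k: abs(U[-1][k]))  # next pivot row
    return np.array(U).T, np.array(V)

# Admissible block of the 1/|x - y| kernel between two well-separated clusters
x = np.linspace(0.0, 1.0, 80)
y = np.linspace(3.0, 4.0, 60)
A = 1.0 / np.abs(x[:, None] - y[None, :])
Uc, Vc = aca(lambda i: A[i, :].copy(), lambda j: A[:, j].copy(), 80, 60)
err = np.linalg.norm(A - Uc @ Vc) / np.linalg.norm(A)
print(Uc.shape[1], err)  # low rank and small relative error
```

In an H-matrix, such low-rank factorizations are applied blockwise on admissible (well-separated) sub-blocks, which is what yields the near-linear storage and matrix-vector cost.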

7.1.5 BemTool

  • Keyword:
    Boundary element method
  • Functional Description:
    BemTool is a C++ header-only library implementing the boundary element method (BEM) for the discretisation of the Laplace, Helmholtz and Maxwell equations, in 2D and 3D. Its main purpose is the assembly of classical boundary element matrices, which can be compressed and inverted through its interface with the HTool library.
  • URL:
  • Contact:
    Xavier Claeys

8 New results

Participants: Alpines members.

8.1 A Directional Equispaced interpolation-based Fast Multipole Method for oscillatory kernels

Fast Multipole Methods (FMMs) based on the oscillatory Helmholtz kernel can reduce the cost of solving N-body problems arising from Boundary Integral Equations (BIEs) in acoustics or electromagnetics. However, their cost strongly increases in the high-frequency regime. The work in 7 introduces a new directional FMM for oscillatory kernels (defmm - directional equispaced interpolation-based fmm), whose precomputation and application are FFT-accelerated due to polynomial interpolations on equispaced grids. We demonstrate the consistency of our FFT approach, and show how symmetries can be exploited in the Fourier domain. We also describe the algorithmic design of defmm, well-suited for the non-uniform particle distributions of BIEs, and present performance optimizations on one CPU core. Finally, we exhibit important performance gains on all test cases for defmm over a state-of-the-art FMM library for oscillatory kernels.

8.2 Hybrid Newton method for the acceleration of well event handling in the simulation of CO2 storage using supervised learning

CO2 geological storage is an essential instrument for efficient Carbon Capture and Storage policies. Numerical simulations provide the solution to the multiphase flow equations that model the behavior of the CO2 injection site. However, numerical simulations of fluid flow in porous media are computationally demanding: it can take up to several hours on an HPC cluster to simulate one injection scenario for a large CO2 reservoir if we want to accurately model the complex physical processes involved. This becomes a limiting issue when performing a large number of simulations, e.g. in the process of 'history matching'. During the numerical simulation of CO2 storage in the subsurface, well events cause important numerical difficulties due to their instant impact on the pressure and saturation unknowns. This often forces a drastic reduction of the time step size to be able to solve the nonlinear system of equations resulting from the discretization of the continuous mathematical model. However, these specific well events have a relatively similar impact across space and time. In 26, we propose a methodology to alleviate the impact of well events during the numerical simulation of CO2 storage in the underground. We complement the standard numerical algorithm by predicting an initialization of Newton's method directly in the domain of convergence using supervised learning. More specifically, we replace the initialization in pressure by a linear approximation obtained through an implicit solver, and we use a Fourier Neural Operator (FNO) to predict the saturation initialization. We apply our methodology to two test cases derived from a realistic benchmark of CO2 storage in a saline aquifer. We reduce the number of Newton iterations required to handle a well opening, i.e. the number of linear systems to solve, by 53% for the first test case and by 38% for the second test case.
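The payoff of a better Newton initialization can be seen on a toy nonlinear system (a hypothetical stand-in; the actual method uses an implicit solver for the pressure and a Fourier Neural Operator for the saturation, neither of which is reproduced here): an initial guess close to the solution, playing the role of the learned prediction, cuts the iteration count.

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, maxit=50):
    """Plain Newton iteration; returns the approximate root and the iteration count."""
    x = x0.astype(float)
    for k in range(1, maxit + 1):
        dx = np.linalg.solve(J(x), -F(x))  # one linear solve per nonlinear iteration
        x += dx
        if np.linalg.norm(dx) < tol:
            return x, k
    return x, maxit

# Toy stiff nonlinear system standing in for the post-well-event solve
def F(x):
    return np.array([np.exp(4.0 * x[0]) - x[1] - 1.0,
                     x[0] ** 2 + x[1] ** 2 - 2.0])

def J(x):
    return np.array([[4.0 * np.exp(4.0 * x[0]), -1.0],
                     [2.0 * x[0], 2.0 * x[1]]])

x_cold, it_cold = newton(F, J, np.array([2.0, 2.0]))  # naive initialization far from the root
x_warm, it_warm = newton(F, J, np.array([0.3, 1.3]))  # "predicted" initialization near the root
print(it_cold, it_warm)  # the warm start needs markedly fewer iterations
```

Since each Newton iteration costs one linear solve, reducing the iteration count translates directly into fewer linear systems, which is exactly the quantity reported in the abstract above.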

8.3 A robust and adaptive GenEO-type domain decomposition preconditioner for H(curl) problems in general non-convex three-dimensional geometries

In 19, we develop and analyse domain decomposition methods for linear systems of equations arising from conforming finite element discretisations of positive Maxwell-type equations, namely for H(curl) problems. It is well known that the convergence of domain decomposition methods relies heavily on the efficiency of the coarse space used in the second level. We design adaptive coarse spaces that complement a near-kernel space made from the gradient of scalar functions. The new class of preconditioners is inspired by the idea of subspace decomposition, but based on spectral coarse spaces, and is specially designed for curl-conforming discretisations of Maxwell's equations in heterogeneous media on general domains which may have holes. Our approach has wider applicability and theoretical justification than the well-known Hiptmair-Xu auxiliary space preconditioner, with results extending to the variable coefficient case and non-convex domains at the expense of a larger coarse space.

8.4 A Robust Two-Level Schwarz Preconditioner For Sparse Matrices

In 20, we introduce a fully algebraic two-level additive Schwarz preconditioner for general sparse large-scale matrices. The preconditioner is analyzed for symmetric positive definite (SPD) matrices. For those matrices, the coarse space is constructed based on approximating two local subspaces in each subdomain. These subspaces are obtained by approximating a number of eigenvectors corresponding to dominant eigenvalues of two judiciously posed generalized eigenvalue problems. The number of eigenvectors can be chosen to control the condition number. For general sparse matrices, the coarse space is constructed by approximating the image of a local operator that can be defined from information in the coefficient matrix. The connection between the coarse spaces for SPD and general matrices is also discussed. Numerical experiments show the great effectiveness of the proposed preconditioners on matrices arising from a wide range of applications. The set of matrices includes SPD, symmetric indefinite, nonsymmetric, and saddle-point matrices. In addition, we compare the proposed preconditioners to the state-of-the-art domain decomposition preconditioners.

8.5 A Unified Framework for Double Sweep Methods for the Helmholtz Equation

In 5, we consider sweeping domain decomposition preconditioners to solve the Helmholtz equation in the case of stripwise domain decomposition with or without overlaps. We unify their derivation and convergence studies by expressing them as Jacobi, Gauss-Seidel, and Symmetric Gauss-Seidel methods for different numberings of the unknowns. The proposed framework enables theoretical comparisons between the double sweep methods in [Nataf and Nier (1997), Vion and Geuzaine (2018)] and those in [Stolk (2013, 2017), Vion and Geuzaine (2014)]. Additionally, it facilitates the introduction of a new sweeping algorithm. We provide numerical test cases to assess the validity of the theoretical studies.

8.6 How does the partition of unity influence the SORAS preconditioner?

In 18, we investigate the influence of the choice of the partition of unity on the convergence of the Symmetrized Optimized Restricted Additive Schwarz (SORAS) preconditioner for the reaction-convection-diffusion equation. We focus on two kinds of partitions of unity and study the dependence on the overlap and on the number of subdomains. In particular, the second kind of partition of unity, which is non-zero in the interior of the whole overlapping region, gives more favorable convergence properties, especially when increasing the overlap width, than the first kind, whose gradient is zero on the subdomain interfaces and which would be the natural choice for the ORAS solver.

8.7 Domain decomposition preconditioning for high frequency wave propagation problems

In the context of seismic imaging, frequency-domain full-waveform inversion (FWI) is suitable for long-offset stationary-recording acquisition, since reliable subsurface models can be reconstructed with a few frequencies and attenuation is easily implemented without computational overhead. In the frequency domain, wave modeling is a Helmholtz-type boundary-value problem which requires solving a large and sparse system of linear equations per frequency with multiple right-hand sides (sources). This system can be solved with direct or iterative methods. While the former are suitable for FWI applied to 3D dense OBC acquisitions covering spatial domains of moderate size, the latter should be the approach of choice for sparse node acquisitions covering large domains (more than 50 million unknowns). Fast convergence of iterative solvers for Helmholtz problems remains however challenging in the high-frequency regime, due to the indefiniteness of the Helmholtz operator and to the discretization constraints imposed to minimize the dispersion error for a given frequency, hence the need for efficient preconditioners.

For such wave propagation problems, we continued the development of ORAS Schwarz domain decomposition preconditioners for finite difference and finite element discretizations on different 3D seismic benchmarks. We also studied the influence of the choice of the local solver on the convergence and computing time, in particular for problems with multiple right-hand sides using pseudo-block methods. For example, Table 1 gathers results comparing PARDISO, MUMPS and the Block Low-Rank (BLR) version of MUMPS as the local solver, for an acoustic simulation on a 20 km × 102 km × 28.4 km subset of the 3D GO_3D_OBS crustal geomodel, which has been designed to assess seismic imaging techniques for deep crustal exploration. The table reports computing times on 1344 cores, in double and single precision arithmetic, for one and 130 right-hand sides. The problem is discretized using finite differences with a wavelength-adaptive 27-point stencil, resulting in 67.5 million unknowns. Using the Block Low-Rank (BLR) version of MUMPS as an approximate local solver gives the best computing time, even though 55 iterations are needed to reach convergence instead of 50 with exact local solvers. Table 2 gathers weak and strong scaling results on the 3D GO_3D_OBS crustal geomodel with finite differences up to a frequency of 10 Hz, where the linear system of 1160.6 million unknowns is solved in 85.9 seconds on 9856 cores. The paper, which assesses the accuracy of finite-element and finite-difference discretizations on several seismic acoustic benchmarks as well as the efficiency of ORAS Schwarz domain decomposition preconditioners for the solution of the linear systems, is published in Geophysics 34.

New results for frequency-domain FWI showcasing the use of mixed-precision BLR with MUMPS have been obtained using real data from the Gorgon gas field case study (Australia) 14.

Table 1: Solver: Intel MKL PARDISO versus MUMPS, full-rank (FR) or Block Low-Rank (BLR). arith.: single versus double precision. ortho.: CGS versus MGS orthogonalization. #it: number of iterations. Tf(s): elapsed time for the local LU factorizations. Ts(s): elapsed time for all GMRES iterations, for one RHS and for 130 RHSs processed with a pseudo-block Krylov method. Ttot(s) = Tf(s) + Ts(s): total elapsed time for the simulation.

                                     1 RHS                   130 RHS
Solver     arith.  ortho.  Tf(s)   #it  Ts(s)  Ttot(s)   #it  Ts(s)   Ttot(s)
PARDISO    double  MGS     12.1    48   16.5   28.6      50   724.4   736.5
PARDISO    double  CGS     12.1    48   16.7   28.8      50   575.5   587.6
MUMPS FR   double  MGS     11.3    48   17.2   28.5      50   654.9   666.2
MUMPS FR   double  CGS     11.3    48   16.9   28.2      50   482.2   493.5
PARDISO    single  MGS     6.5     48   9.9    16.4      50   403.6   410.1
PARDISO    single  CGS     6.5     48   9.6    16.1      50   334.8   341.3
MUMPS FR   single  MGS     6.2     48   8.9    15.1      50   345.8   352.0
MUMPS FR   single  CGS     6.2     48   8.9    15.1      50   275.3   281.5
MUMPS BLR  single  CGS     4.9     53   7.2    12.1      55   256.8   261.7
Table 2: Scaling results for the GO_3D_OBS model with finite differences. Same nomenclature as Table 1. f: frequency in Hz. ovl: size of the optimal overlap in terms of computational time. Ew: parallel efficiency with respect to the first row.

f(Hz)  #dof(10^6)  #cores  #it  ovl  Tf(s)  Ts(s)  Ttot(s)  Ew
2.5    21.4        60      26   3    52.5   24.8   77.3     1
2.5    21.4        168     30   3    13.7   8.9    22.6     1.222
3.75   67.5        360     40   3    21.9   18.6   40.5     1.003
3.75   67.5        660     45   3    10.8   11.5   22.3     0.994
3.75   67.5        1344    53   3    4.4    7.1    11.5     0.947
3.75   67.5        2380    59   3    2.5    4.8    7.3      0.842
5      153.5       875     61   3    20.9   26.0   46.9     0.811
5      153.5       1344    63   3    12.4   18.6   31.0     0.798
7.5    500.1       2450    98   4    26.7   60.8   87.5     0.506
7.5    500.1       4752    108  4    12.2   39.7   51.9     0.439
10     1160.6      9856    142  5    15.1   70.8   85.9     0.297

8.8 Microwave tomographic imaging of shoulder injury

One of the most challenging shoulder injuries is rotator cuff tear, whose incidence increases with aging and which is particularly common among athletes. These tears cause pain and strongly affect the functionality of the shoulder. The motivation of this work 17, 16 is to detect these tears by microwave tomographic imaging. This imaging method requires the solution of an inverse problem based on a minimization algorithm, with successive solutions of a direct problem. The direct problem consists in solving the time-harmonic Maxwell equations discretized with a Nédélec edge finite element method, and we make use of an ORAS Schwarz domain decomposition preconditioner to solve it efficiently in parallel. We propose a wearable imaging system design with 96 antennas distributed over two fully-circular and two half-circular layers, adapted to the real shoulder structure and designed to surround it partially. We test the imaging system numerically using realistic CAD models of the shoulder profile and bones, including the humerus and scapula, and taking into account the dielectric properties of the different tissues (bone, tendon, muscle, skin, synovial fluid). The reconstructed images of the shoulder joint obtained using noisy synthetic data show that the proposed imaging system is capable of accurately detecting and localizing large (5 mL) and partial (1 mL) rotator cuff tears. The reconstruction takes 20 minutes on 480 computing cores and shows great promise for rapid diagnosis or medical monitoring. In a second step, we take advantage of numerical modeling to optimize the number of antennas in the imaging system, achieving a drastic reduction from 96 to 32 antennas while still being able to detect the injury in the difficult partial-tear case, and reducing the computing time from 20 to 12 minutes.

8.9 Numerical resolution of the inverse source problem in electroencephalography

In 9, we consider a new numerical method for the electroencephalography (EEG) inverse source problem. The novelty is to take into account the heterogeneity of the skull, as the electric conductivity of the skull can hardly be assumed to be constant: (i) bones are hard and spongy, especially for neonates (fontanels), (ii) conductivity models vary between individuals, and through time for a single person. The first step of the source reconstruction method is the "cortical mapping" procedure. It deals with the numerical resolution of both data completion and transmission problems for elliptic equations: (i) data completion on the scalp knowing partial measurements of the electric potential (from electrode measurements), (ii) data transmission from the scalp to the brain surface. To this end, we propose to apply the quasi-reversibility method, which provides a regularized solution of Cauchy problems in a bounded domain. The problem is solved using the finite element method in the scalp and skull layers. The second step consists in using an efficient analytic method to retrieve the source (position and moment) from the Cauchy data on the brain surface, which is assumed to be spherical with constant conductivity. For the classical three-layer spherical model (brain, skull, scalp), using synthetic data from 257 electrodes, we recover the source with a 3.6% position error and a moment correlation of 1. With 5% and 20% white Gaussian noise, the position error is 5.5% and 8.4% respectively, with a moment correlation of 1. Numerical results are also presented for sets of 33 and 61 electrodes, for two distant and two close sources, and for a variable skull conductivity with the presence of the anterior fontanel, as well as a cerebrospinal fluid (CSF) layer.

8.10 Radiative transfer for variable 3D atmospheres

The radiative transfer equations have three spatial dimensions, two directional (angular) dimensions, one frequency variable, and possibly time. Here, without loss of generality, they are reduced to a nonlinear integro-differential system and solved by an iterative scheme which is shown to be monotone and convergent. At each step, two large integrals of convolution type need to be computed. The kernel matrices are assembled in compressed form using hierarchical matrices, with the help of the H-matrix C++ parallel library Htool 7.1.4. The resulting method has O(n log n) complexity, where n is the number of vertices of the physical domain. The method is applied to the temperature and radiation in the valley of Chamonix, the challenge being the large variation of density, altitude and light reflection due to snow and possibly clouds. Our method is able to handle 240,000 physical points, all directions of radiation and 683 frequencies in less than 35 minutes on an Apple M1 laptop 10. The method has also been extended to handle reflective boundary conditions 27.

8.11 New developments in FreeFEM

New features were developed in the FreeFEM software thanks to the work of Jacques Morice (INRIA ADT engineer March 2022-2023):

  • The Boundary Element Method (BEM) can now be used to solve Maxwell equations through the Electric Field Integral Equations (EFIE).
  • The definition of the variational problems has been generalized through the introduction of a 'composite' type of finite element space, which may have components defined on different meshes and even different mesh types. This makes it possible to define several classes of general coupled problems, such as FEM-BEM coupling problems, with a single variational formulation.
  • Work has been started to ease the parallelization of a user's sequential code. In this setting, the parallelization of the assembly and solution steps is entirely hidden from the user, and a sequential code can be parallelized with only very few changes in the script. For example, PETSc can easily be used to solve the linear system stemming from a coupled problem with a fieldsplit preconditioner.

The FreeFEM software now includes the GenEO for saddle-point 12 examples with PCHPDDM in PETSc.

8.12 Modeling and simulation of highly oscillatory Vlasov-Poisson equations

In this part we are concerned with solving Vlasov or Vlasov-Poisson systems involving several time scales.

First, we are interested in the parareal algorithm, a time-parallel method for the efficient numerical solution of a very large class of time-dependent equations. In 11, we propose a new reduced model derived from the two-scale convergence theory that we use as the coarse solver in the parareal framework. The parareal method improves the accuracy of the reduced model in a few iterations through corrections by fine solvers of the full model. The models are numerically solved by particle-in-cell methods. We perform numerical experiments of long-time simulations of a charged beam in a focusing channel under the influence of a rapidly oscillating external electric field. On the basis of computing times, we provide an analysis of the efficiency of the parareal algorithm in terms of speedup. An ongoing work is the parallelization of the previous code in order to study the efficiency of the strategy. The issue is to analyze distributed versus shared memory paradigms and to optimize the memory footprint of the code.

Then, in 24 we derive a reduced first-order model from a two-scale asymptotic expansion in a small parameter, in order to approximate the solution of a stiff differential equation. The problem of interest is a multi-scale Newton-Lorentz equation modeling the dynamics of a charged particle under the influence of a linear electric field and of a perturbed strong magnetic field. First, we show that in short times the first-order model provides a much better approximation than the zero-order one. Then, we propose a volume-preserving method using a particular splitting technique to solve the first-order model numerically. Finally, it turns out that the first-order model does not systematically provide a satisfactory approximation in long times. To overcome this issue, we implement a recent strategy based on the parareal algorithm, in which the first-order approximation is used for the coarse solver. This approach allows efficient and accurate long-time simulations for any small parameter. Numerical results for two realistic Penning traps are provided to support these statements.
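The parareal iteration itself can be sketched in a few lines. In the following minimal example plain Euler propagators stand in for the fine solver and for the two-scale reduced coarse models of the papers, on the toy problem u' = -u; all names are illustrative.

```python
import numpy as np

# Toy ODE u' = -u on [0, T], split into N time windows of length dT.
T, N, u0 = 1.0, 10, 1.0
dT = T / N

def G(u, dt):                      # coarse propagator: one explicit Euler step
    return u + dt * (-u)           # (stand-in for a reduced model)

def F(u, dt, m=100):               # fine propagator: m small Euler steps
    for _ in range(m):
        u = u + (dt / m) * (-u)
    return u

# reference: sequential fine integration over the whole interval
u_seq = u0
for _ in range(N):
    u_seq = F(u_seq, dT)

# parareal: U[n+1] = G(U[n]_new) + F(U[n]_old) - G(U[n]_old);
# the N fine solves F(Uold[n], dT) are independent, hence parallel in time.
U = np.empty(N + 1); U[0] = u0
for n in range(N):                 # initial guess from the coarse solver alone
    U[n + 1] = G(U[n], dT)
for k in range(N):                 # after N corrections, exact up to roundoff
    Uold = U.copy()
    for n in range(N):
        U[n + 1] = G(U[n], dT) + F(Uold[n], dT) - G(Uold[n], dT)

print(abs(U[-1] - u_seq))          # matches the sequential fine solution
```

In practice the interest lies in stopping after far fewer than N iterations, which is where the accuracy of the coarse (reduced) model determines the achievable speedup.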

Second, in 6 we obtained an original application of the method of moments to the Vlasov-Poisson system with a non-constant strong magnetic field in three dimensions of space. Using basis functions which are aligned with the magnetic field, one obtains a Friedrichs system in which the kernel of the singular part is made explicit. A projection of the original model onto this kernel yields what we call the reduced model. Basic numerical tests illustrate the accuracy of our implementation.

8.13 Stable Trefftz approximation of Helmholtz solutions using evanescent plane waves

Superpositions of plane waves are known to approximate well the solutions of the Helmholtz equation. Their use in discretizations is typical of Trefftz methods for Helmholtz problems, aiming to achieve high accuracy with a small number of degrees of freedom. However, Trefftz methods lead to ill-conditioned linear systems, and it is often impossible to obtain the desired accuracy in floating-point arithmetic.

In 15 we show that a judicious choice of plane waves can ensure high-accuracy solutions in a numerically stable way, in spite of having to solve such ill-conditioned systems. The numerical accuracy of plane wave methods is linked not only to the approximation space, but also to the size of the coefficients in the plane wave expansion. We show that the use of propagative plane waves can lead to exponentially large coefficients, regardless of the orientations and the number of plane waves, and that this causes numerical instability. We prove that all Helmholtz fields are continuous superpositions of evanescent plane waves, i.e., plane waves with complex propagation vectors associated with exponential decay, and show that this leads to bounded representations. We provide a constructive scheme to numerically select a set of real and complex-valued propagation vectors. This results in an explicit selection of plane waves and an associated Trefftz method that achieves both accuracy and stability. The theoretical analysis is provided for a two-dimensional domain of circular shape. However, the principles are general, and we provide numerical experiments demonstrating practical applicability also for polygonal domains.
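For reference, in two dimensions an evanescent plane wave is a Helmholtz solution of the form (standard notation, stated here only for illustration):

```latex
u(\mathbf{x}) = e^{\,i k\,\mathbf{d}\cdot\mathbf{x}},
\qquad \mathbf{d} = \mathbf{d}_r + i\,\mathbf{d}_i \in \mathbb{C}^2,
\qquad \mathbf{d}\cdot\mathbf{d} = 1,
```

so that $\Delta u = -k^2(\mathbf{d}\cdot\mathbf{d})\,u = -k^2 u$. The constraint $\mathbf{d}\cdot\mathbf{d}=1$ is equivalent to $|\mathbf{d}_r|^2 - |\mathbf{d}_i|^2 = 1$ and $\mathbf{d}_r\cdot\mathbf{d}_i = 0$: the wave oscillates along $\mathbf{d}_r$ and decays exponentially along $\mathbf{d}_i$, and the ordinary propagative plane wave is recovered when $\mathbf{d}_i = 0$.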

In 21 we extend the previous analysis from two to three space dimensions. A first non-trivial challenge in 3D is the parametrization of the set of complex directions. Our approach involves defining a complex-valued reference direction and then considering its rigid-body rotations via Euler angles. Then, by generalizing the Jacobi-Anger identity to complex-valued directions, we prove that any solution of the Helmholtz equation on a three-dimensional ball can be written as a continuous superposition of evanescent plane waves in a stable way. Most of the analysis then extends easily from 2D to 3D. Again, our numerical results show a substantial improvement in the achievable accuracy compared to the standard method.

8.14 Parallel approximation of the exponential of Hermitian matrices

In 23, we consider a rational approximation of the exponential function to design an algorithm for computing the matrix exponential in the Hermitian case. Using a partial fraction decomposition, we obtain a parallelizable method, where the computation reduces to independent solutions of linear systems. We analyze the effects of rounding errors on the accuracy of our algorithm. We complete this work with numerical tests showing the efficiency of our method and a comparison of its performance with Krylov algorithms.
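The partial-fraction idea can be sketched with a contour-quadrature stand-in for the rational approximant (the specific approximation used in the paper differs; this is only to illustrate how the computation decouples into independent shifted solves):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
A /= np.linalg.norm(A, 2)           # Hermitian, spectrum inside [-1, 1]
v = rng.standard_normal(n)

# exp(A) v  ~  (1/N) sum_k w_k (z_k I - A)^{-1} v : a rational (partial
# fraction) form; here the poles z_k come from trapezoidal quadrature of the
# Cauchy integral on a circle of radius 2 enclosing the spectrum.
N, r = 64, 2.0
theta = 2 * np.pi * np.arange(N) / N
z = r * np.exp(1j * theta)          # quadrature nodes = poles
w = np.exp(z) * z                   # weights f(z_k) z_k (dz factor absorbed)

# each shifted solve is independent of the others -> trivially parallel
y = sum(w[k] * np.linalg.solve(z[k] * np.eye(n) - A, v) for k in range(N)) / N
y = y.real                          # A real symmetric, so exp(A) v is real

# reference via the eigendecomposition (A is Hermitian)
lam, V = np.linalg.eigh(A)
ref = V @ (np.exp(lam) * (V.T @ v))
print(np.linalg.norm(y - ref))      # agrees to near machine precision
```

Whatever rational approximant is chosen, the structure is the same: a handful of shifted Hermitian solves that can be distributed across processors and then combined by a weighted sum.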

8.15 A finite element toolbox for the Bogoliubov-de Gennes stability analysis of Bose-Einstein condensates

In 28 we present a finite element toolbox for the computation of the Bogoliubov-de Gennes modes used to assess the linear stability of stationary solutions of the Gross-Pitaevskii (GP) equation. Applications concern one-component (a single GP equation) or two-component (a system of coupled GP equations) Bose-Einstein condensates in one, two and three dimensions of space. An implementation using the free software FreeFem++ is distributed with this paper. For the computation of the GP stationary (complex or real) solutions we use a Newton algorithm coupled with a continuation method exploring the parameter space (the chemical potential or the interaction constant). The Bogoliubov-de Gennes equations are then solved using dedicated libraries for the associated eigenvalue problem. Mesh adaptivity is shown to considerably reduce the computational time for cases involving complex vortex states. Programs are validated through comparisons with known theoretical results for simple cases and with numerical results reported in the literature.

8.16 Mathematics and Finite Element Discretizations of Incompressible Navier-Stokes Flows

With: C. Bernardi, V. Girault, F. Hecht, P.-A. Raviart, B. Riviere.

This book, in final preparation (2024), is a revised, updated, and augmented version of the out-of-print classic "Finite Element Methods for Navier-Stokes Equations" by Girault and Raviart, published by Springer in 1986 [GR]. The incompressible Navier-Stokes equations model the flow of incompressible Newtonian fluids and are used in many practical applications, including computational fluid dynamics. In addition to the basic theoretical analysis, this book presents a fairly exhaustive treatment of up-to-date finite element discretizations of the incompressible Navier-Stokes equations and a variety of numerical algorithms used in their computer implementation. It covers the cases of standard and non-standard boundary conditions and their numerical discretization via finite element methods. Both conforming and non-conforming finite elements are examined in detail, as well as their stability or instability. The topic of time-dependent Navier-Stokes equations, which was missing from [GR], is now presented in several chapters. In the same spirit as [GR], we have tried as much as possible to make this book self-contained, and therefore we have either proved or recalled all the theoretical results required. This book can be used as a textbook for advanced graduate students.

8.17 Communication Lower Bounds and Optimal Algorithms for Multiple Tensor-Times-Matrix Computation

With Hussam Al Daas, Grey Ballard, Laura Grigori, Suraj Kumar, Kathryn Rouse

Multiple Tensor-Times-Matrix (Multi-TTM) is a key computation in algorithms for computing and operating with the Tucker tensor decomposition, which is frequently used in multidimensional data analysis. In 8, we establish communication lower bounds that determine how much data movement is required to perform the Multi-TTM computation in parallel. The crux of the proof relies on analytically solving a constrained, nonlinear optimization problem. We also present a parallel algorithm to perform this computation that organizes the processors into a logical grid with twice as many modes as the input tensor. We show that with a correct choice of grid dimensions, the communication cost of the algorithm attains the lower bounds and is therefore communication optimal. Finally, we show that our algorithm can significantly reduce communication compared to the straightforward approach of expressing the computation as a sequence of tensor-times-matrix operations.
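For illustration, the Multi-TTM computation contracts every mode of a tensor with a matrix, as when forming a Tucker core. A minimal dense sketch with illustrative sizes (the paper's contribution concerns the parallel algorithm and its communication cost, not this computation itself):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 9, 10))     # input tensor
A1 = rng.standard_normal((8, 3))        # one factor matrix per mode
A2 = rng.standard_normal((9, 4))
A3 = rng.standard_normal((10, 5))

# Multi-TTM: contract all modes in one multilinear operation
Y = np.einsum('ijk,ia,jb,kc->abc', X, A1, A2, A3)

# the straightforward alternative: a sequence of single TTM operations
Z = np.einsum('ijk,ia->ajk', X, A1)
Z = np.einsum('ajk,jb->abk', Z, A2)
Z = np.einsum('abk,kc->abc', Z, A3)

print(Y.shape, np.allclose(Y, Z))       # (3, 4, 5) True
```

Both routes produce the same tensor; the point of the paper is that, in a distributed setting, treating the whole contraction as one computation admits process grids with provably less communication than performing the TTMs one after another.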

8.18 Interpretation of Parareal as a Two-Level Additive Schwarz in Time Preconditioner and Its Acceleration with GMRES

With Van-Thanh Nguyen, Laura Grigori

In 13, we describe an interpretation of parareal as a two-level additive Schwarz preconditioner in the time domain. We show that this two-level preconditioner in time is equivalent to parareal and to multigrid reduction in time (MGRIT) with F-relaxation. We also discuss the case when additional fine or coarse propagation steps are applied in the preconditioner. This leads to procedures equivalent to MGRIT with FCF-relaxation and to MGRIT with F(CF)^2-relaxation or overlapping parareal. Numerical results show that these variants have faster convergence in some cases. In addition, we apply a Krylov subspace method, namely GMRES (generalized minimal residual), to accelerate the parareal algorithm. Better convergence is obtained, especially for the advection-reaction-diffusion equation when the advection and reaction coefficients are large.

8.19 Randomized Flexible GMRES with Deflated Restarting.

With Yongseok Jang, Laura Grigori, Emeric Martin, Cédric Content

For high-dimensional spaces, a randomized Gram-Schmidt (RGS) algorithm is beneficial both in computational cost and in numerical stability. In 25, we apply this dimension reduction technique by random sketching to Krylov subspace methods, e.g. to the generalized minimal residual method (GMRES). We propose a flexible variant of GMRES with a randomized Gram-Schmidt based Arnoldi iteration to produce a set of basis vectors of the Krylov subspace. Even though the Krylov basis is no longer l2-orthonormal, its random projection onto the low-dimensional space is l2-orthogonal. As a result, we observe a numerical stability which turns out to be independent of the dimension of the problem, even for extreme-scale problems. On the other hand, as Harmonic Ritz values are commonly used in GMRES with deflated restarting to improve convergence, we consider another deflation strategy, namely disregarding the singular vectors associated with the smallest singular values. We thus introduce a new algorithm: randomized flexible GMRES with singular value decomposition (SVD) based deflated restarting. Finally, we carry out numerical experiments in the context of compressible turbulent flow simulations. Our proposed approach exhibits numerical behaviour competitive with existing methods while reducing computational costs.
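The core of RGS can be sketched as follows. This is a minimal dense version with a Gaussian sketch and illustrative sizes; the actual algorithm additionally addresses finite-precision behaviour and structured sketching operators.

```python
import numpy as np

def rgs(W, k, seed=0):
    """Sketch of randomized Gram-Schmidt: orthogonalize the sketched vectors
    Theta @ w (dimension k << n) instead of the full vectors."""
    n, m = W.shape
    rng = np.random.default_rng(seed)
    Theta = rng.standard_normal((k, n)) / np.sqrt(k)   # Gaussian sketch
    Q = np.zeros((n, m))       # computed basis
    P = np.zeros((k, m))       # its sketch, kept orthonormal
    R = np.zeros((m, m))
    for i in range(m):
        s = Theta @ W[:, i]
        if i > 0:              # small k-by-i least-squares problem
            R[:i, i] = np.linalg.lstsq(P[:, :i], s, rcond=None)[0]
        q = W[:, i] - Q[:, :i] @ R[:i, i]
        p = Theta @ q          # sketched residual, orthogonal to P[:, :i]
        R[i, i] = np.linalg.norm(p)        # normalize in the sketched norm
        Q[:, i] = q / R[i, i]
        P[:, i] = p / R[i, i]
    return Q, R, P

rng = np.random.default_rng(1)
n, m, k = 10_000, 20, 200
W = rng.standard_normal((n, m)) @ np.diag(np.logspace(0, -8, m))
Q, R, P = rgs(W, k)
print(np.linalg.norm(P.T @ P - np.eye(m)))  # sketched basis: orthonormal
print(np.linalg.cond(Q))                    # Q itself: only well conditioned
```

The expensive n-dimensional inner products of classical Gram-Schmidt are replaced by k-dimensional ones; the price is that Q is merely well conditioned rather than orthonormal, which is exactly the situation handled by the flexible GMRES variant above.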

8.20 Randomized Householder QR

With Laura Grigori, Edouard Timsit

In 22, we introduce a randomized Householder QR factorization (RHQR). This factorization can be used to obtain a well-conditioned basis of a set of vectors and can thus be employed in a variety of applications. We discuss in particular the use of this randomized Householder factorization in the Arnoldi process. Numerical experiments show that RHQR produces a well-conditioned basis and an accurate factorization. We observe that in some cases it can be more stable than randomized Gram-Schmidt (RGS), in both single and double precision.

9 Bilateral contracts and grants with industry

9.1 Bilateral contracts with industry

Participants: F. Nataf, A. Lechevallier.

  • Contract with IFPen, January 2021 - December 2023, funding the PhD thesis of Antoine Lechevallier on "Accélération des simulations avec évènements ponctuels récurrents par apprentissage : application au stockage géologique du CO2" (acceleration of simulations with recurrent point events by learning: application to geological CO2 storage). Supervisors: Frédéric Nataf, S. Desroziers and T. Faney.

10 Partnerships and cooperations

10.1 International research visitors

10.1.1 Visits to international teams

Research stays abroad
Laura Grigori
  • Visited Institution:
    University of California, Berkeley
  • Country:
    USA
  • Duration of the visit:
    January–April 2023

10.2 European initiatives

10.2.1 Horizon Europe

AMCG: Asynchronous modified Conjugate Gradient Method

In this project, supported by the European Union's Inno4Scale initiative, INRIA and WIKKI (Germany) collaborate to develop, test and evaluate a novel asynchronous modified conjugate gradient (AMCG) algorithm. It will be demonstrated in two important middleware applications, namely OpenFOAM® and FreeFEM, which rely respectively on finite volume and finite element discretizations.

10.2.2 H2020 projects

EMC2

Participants: Frédéric Nataf, Pierre-Henri Tournier.

EMC2 project on cordis.europa.eu

  • Title:
    Extreme-scale Mathematically-based Computational Chemistry
  • Duration:
    From September 1, 2019 to February 28, 2026
  • Partners:
    • INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE (INRIA), France
    • ECOLE NATIONALE DES PONTS ET CHAUSSEES (ENPC), France
    • CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE CNRS (CNRS), France
    • SORBONNE UNIVERSITE, France
  • Inria contact:
    Laura GRIGORI (Alpines)
  • Coordinator:
  • Summary:

    Molecular simulation has become an instrumental tool in chemistry, condensed matter physics, molecular biology, materials science, and nanosciences. It will make it possible to propose the de novo design of, e.g., new drugs or materials, provided that the efficiency of the underlying software is accelerated by several orders of magnitude.

    The ambition of the EMC2 project is to achieve scientific breakthroughs in this field by gathering the expertise of a multidisciplinary community at the interfaces of four disciplines: mathematics, chemistry, physics, and computer science. It is motivated by the twofold observation that, i) building upon our collaborative work, we have recently been able to gain efficiency factors of up to 3 orders of magnitude for polarizable molecular dynamics in solution of multi-million atom systems, but this is not enough since ii) even larger or more complex systems of major practical interest (such as solvated biosystems or molecules with strongly-correlated electrons) are currently mostly intractable in reasonable clock time. The only way to further improve the efficiency of the solvers, while preserving accuracy, is to develop physically and chemically sound models, mathematically certified and numerically efficient algorithms, and implement them in a robust and scalable way on various architectures (from standard academic or industrial clusters to emerging heterogeneous and exascale architectures).

    EMC2 has no equivalent in the world: nowhere else is there such a critical mass of interdisciplinary researchers, already collaborating and with the required track records, to address this challenge. Under the leadership of the 4 PIs, supported by highly recognized teams from three major institutions in the Paris area, EMC2 will develop disruptive methodological approaches and publicly available simulation tools, and apply them to challenging molecular systems. The project will considerably strengthen the local teams and their synergy, enabling decisive progress in the field.

10.3 National initiatives

10.3.1 ANR

PEPR Numpex - PC Exama

Numpex is a French program dedicated to exascale computing, divided into five research projects. The first one, Exa-MA (Methods and Algorithms for Exascale), aims at advancing scientific simulation and modeling capabilities to reach and surpass the exascale barrier. The project is organized into several work packages (WP) that focus on different aspects of the research objectives. Frédéric Nataf is in charge of Work Package 3 (WP3), Numerical Kernels and Coupled Solvers, which focuses on designing and implementing efficient and possibly provable numerical kernels and solvers for large-scale problems. Tasks include domain decomposition methods, data sparsity techniques, multiple precision, adaptive solution strategies, and efficient coupling of multiphysics simulations.

Partners in WP3: LIP6 Sorbonne University, INRIA Bordeaux, CEA.

Muffin

ANR appel à projet générique (generic call for projects), 2019.

S. Hirstoaga and P.-H. Tournier are members of the MUFFIN project, whose objective is to explore and optimize original computational scenarios for multi-scale and high-dimensional transport codes, with priority applications in plasma physics. Several approximation methods are planned to be developed. The project is at the frontier of computing and numerical analysis and intends to reduce the computational burden in the context of intensive calculation. Principal investigator: B. Després (Sorbonne University).

11 Dissemination

11.1 Promoting scientific activities

11.1.1 Scientific events: organisation

Frédéric Hecht and Pierre-Henri Tournier organized the 15th edition of the FreeFEM days.

11.1.2 Journal

Member of the editorial boards

Frédéric Nataf is a member of the editorial board of Journal of Numerical Mathematics.

Reviewer - reviewing activities

Frédéric Nataf reviewed 9 articles for JCP and SIAM journals.

Emile Parolin reviewed 2 articles for JOMP and IMA-JNA journals.

Sever Hirstoaga reviewed 1 article for Kinetic and Related Models and 1 article for Applied Mathematics and Optimization.

11.1.3 Invited talks

Frédéric Nataf was invited to minisymposia

  • The 29th Biennial Numerical Analysis Conference, Glasgow
  • Coupled 2023, Crete

Emile Parolin was invited to

  • a minisymposium at the 29th Biennial Numerical Analysis Conference, Glasgow;
  • the workshop PoWER2023: Propagation of Waves, European Researchers, Turin.

Sever Hirstoaga was invited to

  • 21st Copper Mountain Conference on Multigrid Methods, Copper Mountain, April 2023.
  • Conference NANMAT at the Tiberiu Popoviciu Institute of Numerical Analysis, Cluj, Romania, November 2023.

11.2 Teaching - Supervision - Juries

11.2.1 Teaching

  • Licence 3: Sever Hirstoaga , Méthodes numériques, 75 HETD (cours + TP), EISE 3, Polytech Sorbonne.
  • Master 2: Laura Grigori, Course on High performance computing for numerical methods and data analysis, Master 2nd year, Mathematics & Applications, Sorbonne University, 24 hours per year.
  • Master 2: Frédéric Nataf, Course on Domain Decomposition Methods, Sorbonne University, 22h.
  • Master 1: Emile Parolin , Numerical simulation and approximation of solutions to elliptic PDEs, S2 2022, Sorbonne Université, 40h.
  • Master 2: Emile Parolin , High Performance Computing, S1 2023, Sorbonne Université, 15h.

11.2.2 Supervision

  • Internship defended: Nicola Galante, Trefftz-type methods for wave propagation phenomena, Emile Parolin, co-supervised with Andrea Moiola (University of Pavia).
  • PhD in progress: Nicola Galante, Trefftz-type methods for wave propagation phenomena, started fall 2023, Emile Parolin.
  • PhD defended: Matthias Beaupere, Parallel algorithms for computing low-rank decompositions of matrices and tensors 31, March 2023, Laura Grigori.
  • PhD in progress: Siwar Badreddine, Leveraging symmetries and low-rank structure of matrices and tensors in high-dimensional quantum chemistry problems, started in 2021, Laura Grigori.
  • PhD in progress: Antoine Lechevallier, Physics Informed Deep Learning - Application to well opening and closing events, started in fall 2020, Frédéric Nataf.
  • PhD defended: Edouard Timsit, Randomized Algorithms, started fall 2021, Laura Grigori.

11.2.3 Juries

Sever Hirstoaga was an examiner in the PhD defense of Vu Anh-Tuan, Aix-Marseille Université, July 2023.

12 Scientific production

12.1 Major publications

  • [1] J. W. Demmel, L. Grigori, M. Hoemmen and J. Langou. Communication-optimal parallel and sequential QR and LU factorizations. SIAM Journal on Scientific Computing, 34(1), 2012, 206-239. Short version of technical report UCB/EECS-2008-89, 2008.
  • [2] V. Dolean, P. Jolivet and F. Nataf. An Introduction to Domain Decomposition Methods: Algorithms, Theory and Parallel Implementation. SIAM, 2015.
  • [3] F. Hecht. New development in FreeFem++. J. Numer. Math., 20(3-4), 2012, 251-265.
  • [4] N. Spillane, V. Dolean, P. Hauret, F. Nataf, C. Pechstein and R. Scheichl. Abstract robust coarse spaces for systems of PDEs via generalized eigenproblems in the overlaps. Numer. Math., 126(4), 2014, 741-770. URL: http://dx.doi.org/10.1007/s00211-013-0576-y

12.2 Publications of the year

International journals

International peer-reviewed conferences

  • [16] S. Borzooei, C. Migliaccio, V. Dolean, P.-H. Tournier and C. Pichot. High-Performance Numerical Modeling for Detection of Rotator Cuff Tear. IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, Portland, Oregon, United States, IEEE, September 2023, 1879-1880.
  • [17] S. Borzooei, P.-H. Tournier, C. Migliaccio, V. Dolean and C. Pichot. Microwave tomographic imaging of shoulder injury. EUCAP 2023 - 17th European Conference on Antennas and Propagation, Florence, Italy, March 2023, 5 p.

Scientific book chapters

Reports & preprints

12.3 Cited publications

  • [29] F. Assous, M. Kray, F. Nataf and E. Turkel. Time-reversed absorbing condition: application to inverse problems. Inverse Problems, 27(6), 2011, 065003. URL: http://stacks.iop.org/0266-5611/27/i=6/a=065003
  • [30] F. Assous and F. Nataf. Full-waveform redatuming via a TRAC approach: a first step towards target oriented inverse problem. Journal of Computational Physics, 440, 2021, 110377.
  • [31] M. Beaupère. Parallel algorithms for computing low rank decompositions of matrices and tensors. PhD thesis, Sorbonne Université, 2023.
  • [32] P.-H. Tournier, I. Aliferis, M. Bonazzoli, M. de Buhan, M. Darbas, V. Dolean, F. Hecht, P. Jolivet, I. El Kanfoud, C. Migliaccio, F. Nataf, Ch. Pichot and S. Semenov. Microwave tomographic imaging of cerebrovascular accidents by using high-performance computing. Parallel Computing, 85, 2019, 88-97. URL: https://www.sciencedirect.com/science/article/pii/S0167819118301042
  • [33] P.-H. Tournier, M. Bonazzoli, V. Dolean, F. Rapetti, F. Hecht, F. Nataf, I. Aliferis, I. El Kanfoud, C. Migliaccio, M. de Buhan, M. Darbas, S. Semenov and C. Pichot. Numerical Modeling and High-Speed Parallel Computing: New Perspectives on Tomographic Microwave Imaging for Brain Stroke Detection and Monitoring. IEEE Antennas and Propagation Magazine, 59(5), 2017, 98-110.
  • [34] P.-H. Tournier, P. Jolivet, V. Dolean, H. S. Aghamiry, S. Operto and S. Riffo. 3D finite-difference and finite-element frequency-domain wave simulation with multilevel optimized additive Schwarz domain-decomposition preconditioner: A tool for full-waveform inversion of sparse node data sets. GEOPHYSICS, 87(5), 2022, T381-T402.
  • 34 articleP.-H.Pierre-Henri Tournier, P.Pierre Jolivet, V.Victorita Dolean, H. S.Hossein S. Aghamiry, S.Stéphane Operto and S.Sebastian Riffo. 3D finite-difference and finite-element frequency-domain wave simulation with multilevel optimized additive Schwarz domain-decomposition preconditioner: A tool for full-waveform inversion of sparse node data sets.GEOPHYSICS8752022, T381-T402DOIback to textback to text