The focus of our research is on the development of novel parallel numerical algorithms and tools appropriate for state-of-the-art mathematical models used in complex scientific applications, and in particular numerical simulations. The proposed research program is by nature multidisciplinary, interweaving aspects of applied mathematics and computer science, as well as those of several specific applications, such as porous media flows, elasticity, wave propagation in multiscale media, molecular simulations, and inverse problems.
Our first objective is to develop numerical methods and tools for complex scientific and industrial applications that will enhance their scalable execution on the emerging heterogeneous hierarchical models of massively parallel machines. Our second objective is to integrate the novel numerical algorithms into a middle layer that hides as much as possible of the complexity of massively parallel machines from their users.
The research described here is directly relevant to several steps of the numerical simulation chain. Given a numerical simulation expressed as a set of differential equations, our research focuses on mesh generation methods for parallel computation, novel numerical algorithms for linear algebra, and algorithms and tools for their efficient and scalable implementation on high-performance computers. The validation and exploitation of the results are performed with collaborators from the application fields and are based on the use of existing tools. In summary, the topics studied in our group are the following:
Solvers for numerical linear algebra:
domain decomposition methods, preconditioning for iterative methods
Among engineers, researchers, and teachers, there is a strong demand for simulation frameworks that are simple to install and use, efficient, and sustainable, and that solve complex problems efficiently and accurately when no dedicated tools or codes are available. In our group we develop FreeFem++, a user-dedicated language for solving PDEs. The goal of FreeFem++ is not to be a substitute for complex numerical codes, but rather to provide an efficient and relatively generic tool for:
The current users of FreeFem++ are mathematicians, engineers, university professors, and students. For these users, the installation of public libraries such as MPI, MUMPS, Ipopt, BLAS, LAPACK, OpenGL, FFTW, SCOTCH, PETSc, and SLEPc is often very difficult. For this reason, the authors of FreeFem++ have created a user-friendly language and, over the years, have enriched its capabilities and provided tools for compiling FreeFem++ so that users do not need special knowledge of computer science. This entails substantial work on porting the software to different emerging architectures.
Today, the main components of parallel FreeFem++ are:
All these components are parallel, except for the last one, which is not in the focus of our research. For the moment, however, the parallel mesh generation algorithm is very simple and not sufficient; for example, it addresses only polygonal geometries. Providing a better parallel mesh generation algorithm is one of the goals of our project. In addition, in the current version of FreeFem++, the parallelism is not hidden from the user: it is done through direct calls to MPI. Our goal is to hide all the MPI calls in the language layer of FreeFem++. Besides these in-house domain decomposition methods, FreeFem++ is also linked to the PETSc solvers, which enables easy use of third-party parallel multigrid methods.
Iterative methods are widely used in industrial applications, and preconditioning is the most important research subject here. Our research considers domain decomposition methods 2 and iterative methods, and its goal is to develop solvers that are suitable for parallelism and that exploit the fact that the matrices arise from the discretization of a system of PDEs on unstructured grids.
One of the main challenges that we address is the lack of robustness and scalability of existing methods such as incomplete LU factorizations or Schwarz-based approaches, for which the number of iterations increases significantly with the problem size or with the number of processors. This is often due to the presence of several low-frequency modes that hinder the convergence of the iterative method. To address this problem, we study different approaches for dealing with the low-frequency modes, such as coarse-space correction in domain decomposition or deflation techniques.
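As a minimal illustration of the deflation idea (a toy sketch, not our production solvers), the code below removes a few smooth modes from a 1D Poisson system and compares conjugate gradient iteration counts. The sine coarse space is a choice made for this model problem, where the sines happen to be the exact low-frequency eigenvectors.

```python
import numpy as np
from scipy.sparse import diags

def cg(matvec, b, tol=1e-8, maxit=2000):
    """Plain conjugate gradient; returns the solution and iteration count."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for it in range(1, maxit + 1):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            return x, it
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, maxit

# 1D Poisson matrix: its smallest eigenpairs are the smooth low-frequency
# modes that slow plain CG down.
n = 200
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)).tocsr()
b = np.ones(n)

# Coarse space Z: a few smooth sine vectors (here the exact low modes).
k = 8
grid = np.arange(1, n + 1) / (n + 1)
Z = np.column_stack([np.sin((j + 1) * np.pi * grid) for j in range(k)])
E = Z.T @ (A @ Z)                        # coarse operator Z^T A Z

def Qv(v):                               # coarse solve Q = Z E^{-1} Z^T
    return Z @ np.linalg.solve(E, Z.T @ v)

def Pv(v):                               # deflation projector P = I - A Q
    return v - A @ Qv(v)

x_plain, it_plain = cg(lambda v: A @ v, b)

# Deflated CG: solve P A y = P b, then add back the coarse component
# x = Q b + (I - Q A) y.
y, it_defl = cg(lambda v: Pv(A @ v), Pv(b))
x_defl = Qv(b) + y - Qv(A @ y)
```

On this example the deflated solve needs markedly fewer iterations, mirroring the effect of a coarse-space correction in a two-level domain decomposition method.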
We also focus on developing boundary integral equation methods adapted to the simulation of wave propagation in complex physical situations, and that lend themselves to the use of parallel architectures. The final objective is to bring the state of the art on boundary integral equations closer to contemporary industrial needs. From this perspective, we investigate domain decomposition strategies in conjunction with boundary element methods, as well as acceleration techniques (H-matrices, FMM, and the like) that are relevant in multi-material and/or multi-domain configurations. Our work on this topic also includes numerical implementation of large-scale problems, which is challenging due to the peculiarities of boundary integral equations.
The design of new numerical methods that are robust and that have well-proven convergence properties is one of the challenges addressed in Alpines. Another important challenge is the design of parallel algorithms for the novel numerical methods and the underlying building blocks from numerical linear algebra. The goal is to enable their efficient execution on a diverse set of node architectures and their scaling to emerging high-performance clusters with an increasing number of nodes.
Increased communication cost is one of the main challenges in high-performance computing, which we address in our research by investigating algorithms that minimize communication, such as communication-avoiding algorithms. The communication-avoiding algorithmic design is an approach developed in our group more than ten years ago (initially in collaboration with researchers from UC Berkeley and CU Denver). While our first results concerned direct factorization methods such as LU or QR, our recent work focuses on designing robust algorithms for computing the low-rank approximation of a matrix or a tensor. We focus on both deterministic and randomized approaches.
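A hedged sketch of the randomized approach (the classical Gaussian range finder, not our communication-avoiding variants):

```python
import numpy as np

def randomized_low_rank(A, k, p=10, seed=None):
    """Randomized range finder: sketch the range of A with a Gaussian
    test matrix, orthonormalize, then take the SVD of the small
    projected matrix."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + p))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # orthonormal range basis
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ Ub[:, :k], s[:k], Vt[:k]

# Usage on a matrix of exact rank 40: the approximation is exact up to
# rounding; for general matrices the error is governed by the (k+1)-th
# singular value, with high probability.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 40)) @ rng.standard_normal((40, 200))
U, s, Vt = randomized_low_rank(X, k=40, seed=0)
err = np.linalg.norm(X - (U * s) @ Vt) / np.linalg.norm(X)
```

Only one pass of `A @ Omega` and one of `Q.T @ A` touch the large matrix, which is what makes such schemes attractive when communication dominates.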
Our research also focuses on solving problems of large size that feature high dimensions, as in molecular simulations. The data in this case are represented by objects called tensors, or multilinear arrays. The goal is to design novel tensor techniques that allow their effective compression, i.e., their representation by simpler objects in small dimensions, while controlling the loss of information. The algorithms aim to be highly parallel in order to handle the large number of dimensions and large data sets, while preserving the information required to obtain the solution of the problem.
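As an illustration of tensor compression (a basic truncated higher-order SVD, i.e. Tucker compression, given as a sketch rather than one of our parallel algorithms):

```python
import numpy as np

def hosvd(T, ranks):
    """Truncated higher-order SVD: mode-wise SVDs of the unfoldings give
    the factors, and the core is obtained by multilinear projection."""
    factors = []
    for mode, r in enumerate(ranks):
        unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    T = core
    for mode, U in enumerate(factors):
        T = np.moveaxis(np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)
    return T

# A tensor with exact multilinear rank (3, 4, 5) is compressed losslessly;
# for general tensors the loss is controlled by the discarded singular
# values of each unfolding.
rng = np.random.default_rng(0)
G = rng.standard_normal((3, 4, 5))
T = np.einsum('ia,jb,kc,abc->ijk',
              rng.standard_normal((20, 3)),
              rng.standard_normal((25, 4)),
              rng.standard_normal((30, 5)), G)
core, factors = hosvd(T, (3, 4, 5))
err = np.linalg.norm(T - tucker_reconstruct(core, factors)) / np.linalg.norm(T)
```

The compressed representation stores a small core plus one thin factor per dimension instead of the full multilinear array.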
Our work is by design as general-purpose as possible in the field of scientific computing based on partial differential equation modeling, including multiphysics and multiscale modeling. Moreover, since our methods are available in standalone open-source libraries and in the free finite element language FreeFEM, which is developed in our team, they have many potential users that we are not necessarily aware of. We describe here some recent works performed in collaboration with team members.
The hierarchical matrix acceleration techniques implemented in our parallel Htool 7.1.4 library, which is interfaced with FreeFEM, allow us to solve the radiative transfer equations efficiently, with complexity n log(n), where n is the number of vertices of the physical domain. The method can be used to study the change of temperature in the atmosphere, and in particular the effect of clouds and greenhouse gases 10, 27.
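The compressibility exploited by hierarchical matrices can be illustrated on a toy example: an admissible (well-separated) block of an asymptotically smooth kernel matrix has rapidly decaying singular values, hence low numerical rank. The kernel and geometry below are arbitrary choices for illustration:

```python
import numpy as np

# Admissible block of the kernel 1/|x - y|: two 1D clusters of diameter 1
# at distance 2. Its singular values decay exponentially, so the block is
# numerically low-rank and can be stored in compressed (outer-product) form.
x = np.linspace(0.0, 1.0, 300)
y = np.linspace(3.0, 4.0, 300)
K = 1.0 / np.abs(x[:, None] - y[None, :])
s = np.linalg.svd(K, compute_uv=False)
num_rank = int((s > 1e-10 * s[0]).sum())   # numerical rank at tolerance 1e-10
```

Applying this block-wise over a hierarchy of cluster pairs is what yields the n log(n) storage and matrix-vector cost quoted above.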
We focus on methods that blend time-reversal techniques with absorbing boundary conditions (ABC) used in a nonstandard way. Since the seminal paper [M. Fink et al., Imaging through inhomogeneous media using time reversal mirrors. Ultrasonic Imaging, 13(2):199, 1991], time reversal has been a subject of very active research. The principle is to back-propagate signals to the sources that emitted them. The initial experiment was to refocus, very precisely, a recorded signal after passing through a barrier consisting of randomly distributed metal rods. In [de Rosny and Fink, Overcoming the diffraction limit in wave physics using a time-reversal mirror and a novel acoustic sink. Phys. Rev. Lett., 89(12), 2002], the source that created the signal is time-reversed in order to obtain a perfect time-reversal experiment. In 29, we improve this result from a numerical point of view by showing that it can be done numerically without knowing the source, at the expense of not being able to recover the signal in the vicinity of the source. In 30, we extended these methods to nonuniform media.
In the context of seismic imaging, frequency-domain full-waveform inversion (FWI) is suitable for long-offset stationary-recording acquisition, since reliable subsurface models can be reconstructed with a few frequencies and attenuation is easily implemented without computational overhead. In the frequency domain, wave modeling is a Helmholtz-type boundary-value problem which requires solving a large and sparse system of linear equations per frequency, with multiple right-hand sides (sources).
For sparse node acquisitions covering large domains (more than 50 million unknowns), an iterative solver should be the method of choice. However, fast convergence remains challenging in the high-frequency regime due to the non-definiteness of the operator and to the discretization constraints needed to minimize dispersion, hence requiring efficient preconditioners. For such wave propagation problems, our Schwarz domain decomposition preconditioners are good candidates to accelerate the time to solution and to provide robustness with respect to heterogeneities and frequency, for both finite difference and finite element discretizations 34.
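The indefiniteness mentioned above is easy to exhibit on a model problem. The sketch below assembles a 1D finite difference Helmholtz operator at roughly 10 points per wavelength (a common dispersion rule; the wavenumber is an arbitrary choice) and counts its negative eigenvalues:

```python
import numpy as np

# 1D Helmholtz operator -u'' - k^2 u on (0,1) with Dirichlet conditions,
# second-order finite differences.
k = 40.0                                  # wavenumber (arbitrary choice)
ppw = 10                                  # points per wavelength
n = int(ppw * k / (2 * np.pi))            # grid size from the dispersion rule
h = 1.0 / (n + 1)
A = (np.diag(np.full(n, 2.0 / h**2 - k**2))
     - np.diag(np.full(n - 1, 1.0 / h**2), 1)
     - np.diag(np.full(n - 1, 1.0 / h**2), -1))

eigs = np.linalg.eigvalsh(A)
n_neg = int((eigs < 0).sum())             # indefiniteness: negative eigenvalues
```

The mix of positive and negative eigenvalues rules out CG and degrades many classical preconditioners, which is why specialized Schwarz methods are needed here.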
The inverse source problem in electroencephalography (EEG) is usually solved using a numerical head model composed of different layers of tissues (scalp, skull, brain, ...) with constant electric conductivity. However, the conductivity of the skull can hardly be assumed to be constant: (i) the skull is made of hard and spongy bones, especially for neonates (fontanels), and (ii) conductivity models vary between individuals, and through time for a single person.
In 9, we consider a new numerical method for the EEG inverse source problem which takes into account the heterogeneity of the skull. In particular, in the first step of source reconstruction, the so-called "cortical mapping" procedure, which deals with (i) data completion on the scalp, knowing partial measurements of the electric potential from electrode measurements, and (ii) data transmission from the scalp to the brain surface, we use the quasi-reversibility method, which provides a regularized solution of Cauchy problems in a bounded domain. The problem is solved using the finite element method in the scalp and skull layers.
Microwave tomography is a novel imaging modality with many potentially attractive medical applications, based on the difference between the dielectric properties of normal and diseased tissues. Microwave tomography features rapid data acquisition, and together with rapid tomographic reconstruction it allows detecting, identifying, and monitoring injuries continuously.
From a computational point of view, microwave imaging requires the solution of an inverse problem based on a minimisation algorithm. Reconstruction algorithms are computationally intensive, with successive solutions of the forward problem requiring efficient numerical modelling and high-performance parallel computing. This methodology involves distinct research fields: optimisation, inverse problems, and approximation and solution methods for the simulation of the forward problem modelled by Maxwell's equations. The latter is challenging in itself, as the modelling must accurately take into account the high heterogeneity and complexity of the different tissues.
Our numerical framework for microwave imaging is based on the FreeFEM 7.1.1 finite element software, and the forward problem is solved efficiently in parallel using Domain Decomposition methods implemented in the HPDDM 7.1.2 library.
Brain stroke detection and monitoring
Stroke, or cerebrovascular accident (CVA), is a major cause of disability and death worldwide. About 85% of CVAs are ischemic, due to cerebral infarction caused by an interruption of the blood supply to some part of the brain, and 15% are haemorrhagic. Differentiating between ischemic and haemorrhagic CVAs is an essential part of the initial workup of the patient, and rapid and accurate diagnosis is crucial for patient survival; here, neuroimaging plays a vital role. Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are the 'gold' standards, but their use is not well suited to efficient medical care of CVAs, as they are bulky diagnostic instruments and cannot be used for continuous brain monitoring. A noninvasive and transportable/portable device for the characterisation and monitoring of CVAs would have clear clinical applications, beginning with the very first instance of patient care in an ambulance and extending to continuous patient monitoring at the hospital.
Microwave tomography holds great promise for rapid brain stroke diagnosis. The raw data acquired by the microwave imaging system can be wirelessly transferred to a remote computing center, where the tomographic images will be computed. The images can then be quickly transferred to the hospital.
In 33, 32, we demonstrate on synthetic data the feasibility of a microwave imaging technique for the characterisation and monitoring of strokes. Using high performance computing, we are able to obtain a tomographic reconstruction of the brain in less than two minutes.
Detection and imaging of rotator cuff tears
One of the most challenging shoulder injuries is rotator cuff tear, whose incidence increases with aging and which is particularly common among athletes. These tears cause pain and severely affect the functionality of the shoulder.
In 17, 16, we study the feasibility of microwave tomographic imaging to detect these tears. We introduce a wearable imaging system design, adapted to the real shoulder structure and designed to surround it partially. We investigate the efficiency of the imaging system numerically using realistic CAD models for the shoulder profile and bones, including the humerus and scapula, and taking into account the dielectric properties of the different tissues (bone, tendon, muscle, skin, synovial fluid). The reconstructed images of the shoulder joint obtained using noisy synthetic data show that the proposed imaging system is capable of accurately detecting and localizing large (5 mL) and partial (1 mL) rotator cuff tears.
Acoustic, electromagnetic, and elastic linear waves are ubiquitous phenomena in science and engineering. The numerical simulation of their propagation and interaction is a core task in areas like medical imaging, seismic remote sensing, design of electronic devices, atmospheric particle scattering, and radar and sonar modelling. The design of appropriate numerical schemes is challenging because of approximation issues arising from the fact that the solutions are highly oscillatory, which usually requires a large computational effort to produce a meaningful numerical approximation. In addition, common formulations suffer from stability issues, such as numerical dispersion (also known as the pollution effect) and sign-indefiniteness, which typically limit the achievable accuracy. The variety of applications and the challenges of their numerical simulation motivate a great research effort in designing new efficient schemes: new approximation spaces that use degrees of freedom sparingly (wave-based and Trefftz methods), preconditioners for solving large linear systems (e.g., by domain decomposition), new formulations, acceleration techniques, etc.
We focus on computationally intensive numerical algorithms arising in the data analysis of current and forthcoming Cosmic Microwave Background (CMB) experiments in astrophysics. This application is studied in collaboration with researchers from University Paris Diderot, and the objective is to make available the algorithms to the astrophysics community, so that they can be used in large experiments.
In CMB data analysis, astrophysicists produce and analyze multifrequency 2D images of the universe when it was 5% of its current age. The new generation of CMB experiments observes the sky with thousands of detectors over many years, producing overwhelmingly large and complex data sets, which nearly double every year, thereby following Moore's Law. Planck is a keystone satellite mission which has been developed under the auspices of the European Space Agency (ESA).
Molecular simulation is one of the most dynamic areas of scientific computing. Its field of application is very broad, ranging from theoretical chemistry and drug design to materials science and nanotechnology. It provides many challenging problems to mathematicians and computer scientists.
In the context of the ERC Synergy Grant EMC2, we address several important limitations of state-of-the-art molecular simulation. In particular, the simulation of very large molecular systems, or of smaller systems in which electrons interact strongly with each other, remains out of reach today. In an interdisciplinary collaboration between chemists, mathematicians, and computer scientists, we focus on developing a new generation of reliable molecular simulation algorithms and software.
Our activities in scientific computing help in the design and optimization of non-fossil energy resources. Here are three examples we are aware of, among many others:

FreeFem++ is a partial differential equation solver with its own language. FreeFem++ scripts can solve multiphysics nonlinear systems in 2D and 3D.
Problems involving PDEs (2D, 3D) from several branches of physics, such as fluid-structure interactions, require interpolation of data on several meshes and their manipulation within one program. FreeFem++ includes a fast 2^d-tree-based interpolation algorithm and a language for the manipulation of data on multiple meshes (as a follow-up of bamg, now part of FreeFem++).
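Cross-mesh interpolation of this kind can be sketched with SciPy's Delaunay-based interpolation standing in for FreeFem++'s 2^d-tree algorithm (the point sets and the field below are arbitrary illustrations):

```python
import numpy as np
from scipy.interpolate import griddata

# Two unrelated point clouds standing in for two meshes; the unit-square
# corners are added so every target point lies inside the source hull.
rng = np.random.default_rng(1)
src = np.vstack([rng.random((400, 2)),
                 [[0, 0], [0, 1], [1, 0], [1, 1]]])    # "mesh 1" vertices
dst = rng.random((50, 2)) * 0.8 + 0.1                  # "mesh 2" vertices

def f(pts):                                            # a linear field
    return pts[:, 0] + 2.0 * pts[:, 1]

vals = griddata(src, f(src), dst, method='linear')     # P1 interpolation
err = np.abs(vals - f(dst)).max()                      # exact for linear fields
```

Piecewise-linear interpolation reproduces linear fields exactly; the practical difficulty, which the 2^d-tree addresses, is locating each target point in the source mesh quickly.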
FreeFem++ is written in C++ and the FreeFem++ language is a C++ idiom. It runs on macOS, Windows, and Unix machines. FreeFem++ replaces the older freefem and freefem+.
Fast Multipole Methods (FMMs) based on the oscillatory Helmholtz kernel can reduce the cost of solving N-body problems arising from Boundary Integral Equations (BIEs) in acoustics or electromagnetics. However, their cost strongly increases in the high-frequency regime. In 7, we introduce a new directional FMM for oscillatory kernels (defmm: directional equispaced interpolation-based FMM), whose precomputation and application are FFT-accelerated thanks to polynomial interpolation on equispaced grids. We demonstrate the consistency of our FFT approach, and show how symmetries can be exploited in the Fourier domain. We also describe the algorithmic design of defmm, well suited to the nonuniform particle distributions arising from BIEs, and present performance optimizations on one CPU core. Finally, we exhibit important performance gains on all test cases for defmm over a state-of-the-art FMM library for oscillatory kernels.
CO2 geological storage is an essential instrument for efficient Carbon Capture and Storage policies. Numerical simulations provide the solution to the multiphase flow equations that model the behavior of the CO2 injection site. However, numerical simulations of fluid flow in porous media are computationally demanding: it can take up to several hours on an HPC cluster to simulate one injection scenario for a large CO2 reservoir if we want to accurately model the complex physical processes involved. This becomes a limiting issue when performing a large number of simulations, e.g., in the process of 'history matching'. During the numerical simulation of CO2 storage in the subsurface, well events cause important numerical difficulties due to their instant impact on the pressure and saturation unknowns. This often forces a drastic reduction of the time step size in order to solve the nonlinear system of equations resulting from the discretization of the continuous mathematical model. However, these specific well events have a relatively similar impact across space and time. In 26, we propose a methodology to alleviate the impact of well events during the numerical simulation of CO2 storage in the underground. We complement the standard numerical algorithm by predicting an initialization of Newton's method directly in the domain of convergence using supervised learning. More specifically, we replace the initialization in pressure by a linear approximation obtained through an implicit solver, and we use a Fourier Neural Operator (FNO) to predict the saturation initialization. We apply our methodology to two test cases derived from a realistic CO2 storage in saline aquifer benchmark. We reduce the number of Newton iterations required to handle a well opening, i.e., the number of linear systems to solve, by 53% for the first test case and by 38% for the second.
In 19, we develop and analyse domain decomposition methods for linear systems of equations arising from conforming finite element discretisations of positive Maxwell-type equations, namely for H(curl) problems. It is well known that the convergence of domain decomposition methods relies heavily on the efficiency of the coarse space used in the second level. We design adaptive coarse spaces that complement a near-kernel space made from the gradient of scalar functions. The new class of preconditioners is inspired by the idea of subspace decomposition, but based on spectral coarse spaces, and is specially designed for curl-conforming discretisations of Maxwell's equations in heterogeneous media on general domains, which may have holes. Our approach has wider applicability and theoretical justification than the well-known Hiptmair-Xu auxiliary space preconditioner, with results extending to the variable-coefficient case and nonconvex domains at the expense of a larger coarse space.
In 20, we introduce a fully algebraic two-level additive Schwarz preconditioner for general sparse large-scale matrices. The preconditioner is analyzed for symmetric positive definite (SPD) matrices. For those matrices, the coarse space is constructed based on approximating two local subspaces in each subdomain. These subspaces are obtained by approximating a number of eigenvectors corresponding to dominant eigenvalues of two judiciously posed generalized eigenvalue problems. The number of eigenvectors can be chosen to control the condition number. For general sparse matrices, the coarse space is constructed by approximating the image of a local operator that can be defined from information in the coefficient matrix. The connection between the coarse spaces for SPD and general matrices is also discussed. Numerical experiments show the effectiveness of the proposed preconditioners on matrices arising from a wide range of applications, including SPD, symmetric indefinite, nonsymmetric, and saddle-point matrices. In addition, we compare the proposed preconditioners to state-of-the-art domain decomposition preconditioners.
In 5, we consider sweeping domain decomposition preconditioners for solving the Helmholtz equation in the case of stripwise domain decomposition, with or without overlap. We unify their derivation and convergence studies by expressing them as Jacobi, Gauss-Seidel, and symmetric Gauss-Seidel methods for different numberings of the unknowns. The proposed framework enables theoretical comparisons between the double-sweep methods in [Nataf and Nier (1997), Vion and Geuzaine (2018)] and those in [Stolk (2013, 2017), Vion and Geuzaine (2014)]. Additionally, it facilitates the introduction of a new sweeping algorithm. We provide numerical test cases to assess the validity of the theoretical studies.
In 18, we investigate the influence of the choice of the partition of unity on the convergence of the Symmetrized Optimized Restricted Additive Schwarz (SORAS) preconditioner for the reaction-convection-diffusion equation. We focus on two kinds of partitions of unity, and study the dependence on the overlap and on the number of subdomains. In particular, the second kind of partition of unity, which is nonzero in the interior of the whole overlapping region, gives more favorable convergence properties, especially when the overlap width increases, in comparison with the first kind, whose gradient is zero on the subdomain interfaces and which would otherwise be the natural choice for the ORAS solver.
In the context of seismic imaging, frequency-domain full-waveform inversion (FWI) is suitable for long-offset stationary-recording acquisition, since reliable subsurface models can be reconstructed with a few frequencies and attenuation is easily implemented without computational overhead. In the frequency domain, wave modeling is a Helmholtz-type boundary-value problem which requires solving a large and sparse system of linear equations per frequency, with multiple right-hand sides (sources). This system can be solved with direct or iterative methods. While the former are suitable for FWI applications on 3D dense OBC acquisitions covering spatial domains of moderate size, the latter should be the approach of choice for sparse node acquisitions covering large domains (more than 50 million unknowns). Fast convergence of iterative solvers for Helmholtz problems remains challenging, however, in the high-frequency regime, due to the non-definiteness of the Helmholtz operator and to the discretization constraints needed to minimize the dispersion error at a given frequency, hence requiring efficient preconditioners.
For such wave propagation problems, we continued the development of ORAS Schwarz domain decomposition preconditioners for finite difference and finite element discretizations on different 3D seismic benchmarks. We also studied the influence of the choice of the local solver on the convergence and computing time, in particular for problems with multiple right-hand sides using pseudo-block methods. For example, Table 1 gathers results comparing PARDISO, MUMPS, and the Block Low-Rank (BLR) version of MUMPS as the local solver, for an acoustic simulation on a 20 km
New results for frequency-domain FWI showcasing the use of mixed-precision BLR with MUMPS have been obtained using real data from the Gorgon gas field case study (Australia) 14.
One of the most challenging shoulder injuries is rotator cuff tear, whose incidence increases with aging and which is particularly common among athletes. These tears cause pain and severely affect the functionality of the shoulder. The motivation of this work 17, 16 is to detect these tears by microwave tomographic imaging. This imaging method requires the solution of an inverse problem based on a minimization algorithm, with successive solutions of a direct problem. The direct problem consists in solving the time-harmonic Maxwell's equations discretized with a Nédélec edge finite element method, and we make use of an ORAS Schwarz domain decomposition preconditioner to solve it efficiently in parallel. We propose a wearable imaging system design with 96 antennas distributed over two fully circular and two half-circular layers, adapted to the real shoulder structure and designed to surround it partially. We test the imaging system numerically using realistic CAD models for the shoulder profile and bones, including the humerus and scapula, and taking into account the dielectric properties of the different tissues (bone, tendon, muscle, skin, synovial fluid). The reconstructed images of the shoulder joint obtained using noisy synthetic data show that the proposed imaging system is capable of accurately detecting and localizing large (5 mL) and partial (1 mL) rotator cuff tears. The reconstruction takes 20 minutes on 480 computing cores and shows great promise for rapid diagnosis or medical monitoring. In a second step, we take advantage of numerical modeling to optimize the number of antennas in the imaging system, achieving a drastic reduction from 96 to 32 antennas while still being able to detect the injury in the difficult partial tear case, and reducing the computing time from 20 to 12 minutes.
In 9, we consider a new numerical method for the electroencephalography (EEG) inverse source problem. The novelty is to take into account the heterogeneity of the skull, as the electric conductivity of the skull can hardly be assumed to be constant: (i) the skull is made of hard and spongy bones, especially for neonates (fontanels), and (ii) conductivity models vary between individuals, and through time for a single person. The first step of the source reconstruction method is the "cortical mapping" procedure. It deals with the numerical resolution of both data completion and transmission problems for elliptic equations: (i) data completion on the scalp, knowing partial measurements of the electric potential (from electrode measurements), and (ii) data transmission from the scalp to the brain surface. To this end, we propose to apply the quasi-reversibility method, which provides a regularized solution of Cauchy problems in a bounded domain. The problem is solved using the finite element method in the scalp and skull layers. The second step consists in using an efficient analytic method to retrieve the source (position and moment) from the Cauchy data on the brain surface, which is assumed to be spherical with constant conductivity. For the classical three-layer spherical model (brain, skull, scalp), using synthetic data from 257 electrodes, we recover the source with a 3.6% position error and a moment correlation of 1. With 5% and 20% white Gaussian noise, the position error is 5.5% and 8.4%, respectively, with a moment correlation of 1. Numerical results are also presented for sets of 33 and 61 electrodes, for two distant and two close sources, and for a variable skull conductivity in the presence of the anterior fontanel, as well as a cerebrospinal fluid (CSF) layer.
The radiative transfer equations have three spatial dimensions, two radiational dimensions, one frequency variable, and possibly time. Here, without loss of generality, they are reduced to a nonlinear integro-differential system and solved by an iterative scheme which is shown to be monotone and convergent. At each step, two large integrals of convolution type need to be computed. The kernel matrices are assembled in compressed form using hierarchical matrices, with the help of the H-matrix C++ parallel library Htool 7.1.4. The complexity of the resulting method is n log(n), where n is the number of vertices of the physical domain. The method is applied to the temperature and radiation in the valley of Chamonix, the challenge being the large variation of density, altitude, and light reflection due to snow and possibly clouds. Our method is able to handle 240K physical points, all directions of radiation, and 683 frequencies in less than 35 minutes on an Apple M1 laptop 10. The method has also been extended to handle reflective boundary conditions 27.
New features were developed in the FreeFEM software thanks to the work of Jacques Morice (INRIA ADT engineer, March 2022-2023):
The FreeFEM software now includes the GenEO method for saddle-point problems 12, with examples using PCHPDDM in PETSc.
In this part we are concerned with solving Vlasov or Vlasov–Poisson systems involving several time scales.
First, we are interested in the parareal algorithm, a method that performs parallel-in-time computation for the efficient numerical solution of a very large class of time-dependent equations. In [11], we propose a new reduced model derived from two-scale convergence theory, which we use as the coarse solver in the parareal framework. The parareal method improves its accuracy in a few iterations through corrections by fine solvers of the full model. The models are numerically solved by particle-in-cell methods. We perform numerical experiments of long-time simulations of a charged beam in a focusing channel under the influence of a rapidly oscillating external electric field. On the basis of computing times, we provide an analysis of the efficiency of the parareal algorithm in terms of speedup.
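The parareal predictor-corrector update, U(n+1) at iteration k+1 = G applied to the new state plus the fine-minus-coarse correction F(U_n^k) - G(U_n^k), can be sketched on a scalar model problem (a minimal illustration only, not the particle-in-cell setting of the paper; the solvers and step counts are arbitrary choices):

```python
import math

def parareal(y0, T, N, F, G, iters):
    """Minimal parareal iteration for an ODE on [0, T] with N time slices.
    F, G: fine and coarse propagators over one slice of length dt."""
    dt = T / N
    # Initial guess: one sequential coarse sweep.
    U = [y0]
    for n in range(N):
        U.append(G(U[n], dt))
    for _ in range(iters):
        Fu = [F(U[n], dt) for n in range(N)]  # fine solves, parallel in principle
        Gu = [G(U[n], dt) for n in range(N)]
        Unew = [y0]
        for n in range(N):
            # predictor-corrector: coarse on new data + fine-coarse correction
            Unew.append(G(Unew[n], dt) + Fu[n] - Gu[n])
        U = Unew
    return U

# Model problem y' = -y: coarse = 1 explicit Euler step, fine = 100 steps.
G = lambda y, dt: y * (1.0 - dt)
def F(y, dt, m=100):
    h = dt / m
    for _ in range(m):
        y *= (1.0 - h)
    return y

U = parareal(1.0, 2.0, 10, F, G, iters=5)
print(abs(U[-1] - math.exp(-2.0)))  # error vs. the exact solution at T = 2
```

The fine solves inside each iteration are independent across slices, which is the source of the parallel speedup analyzed in the paper.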
An ongoing work is the parallelization of the previous code in order to study the efficiency of the strategy. The issue is to analyze the distributed vs. shared memory paradigms and to address the problem of optimizing the memory footprint of the code. Then, in [24] we derive a reduced first-order model from a two-scale asymptotic expansion in a small parameter, in order to approximate the solution of a stiff differential equation. The problem of interest is a multiscale Newton–Lorentz equation modeling the dynamics of a charged particle under the influence of a linear electric field and of a perturbed strong magnetic field. First, we show that in short times the first-order model provides a much better approximation than the zeroth-order one. Then, we propose a volume-preserving method using a particular splitting technique to solve the first-order model numerically. Finally, it turns out that the first-order model does not systematically provide a satisfactory approximation in long times. To overcome this issue, we implement a recent strategy based on the parareal algorithm, in which the first-order approximation is used for the coarse solver. This approach allows us to perform efficient and accurate long-time simulations for any small parameter. Numerical results for two realistic Penning traps are provided to support these statements.
Second, in [6] we obtained an original application of the method of moments to the Vlasov–Poisson system with a non-constant strong magnetic field in three dimensions of space. Using basis functions which are aligned with the magnetic field, one obtains a Friedrichs system in which the kernel of the singular part is made explicit. A projection of the original model on this kernel yields what we call the reduced model. Basic numerical tests illustrate the accuracy of our implementation.
Superpositions of plane waves are known to approximate the solutions of the Helmholtz equation well. Their use in discretizations is typical of Trefftz methods for Helmholtz problems, which aim to achieve high accuracy with a small number of degrees of freedom. However, Trefftz methods lead to ill-conditioned linear systems, and it is often impossible to obtain the desired accuracy in floating-point arithmetic.
In [15] we show that a judicious choice of plane waves can ensure high-accuracy solutions in a numerically stable way, in spite of having to solve such ill-conditioned systems. The numerical accuracy of plane wave methods is linked not only to the approximation space, but also to the size of the coefficients in the plane wave expansion. We show that the use of plane waves can lead to exponentially large coefficients, regardless of the orientations and the number of plane waves, and that this causes numerical instability. We prove that all Helmholtz fields are continuous superpositions of evanescent plane waves, i.e., plane waves with complex propagation vectors associated with exponential decay, and show that this leads to bounded representations. We provide a constructive scheme to numerically select a set of real- and complex-valued propagation vectors. This results in an explicit selection of plane waves and an associated Trefftz method that achieves both accuracy and stability. The theoretical analysis is provided for a two-dimensional domain of circular shape. However, the principles are general, and we provide numerical experiments demonstrating practical applicability also for polygonal domains.
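The defining property of an evanescent plane wave is that a complex propagation vector d with d·d = k² still yields an exact Helmholtz solution, now decaying exponentially in one direction. This can be checked numerically (a small self-contained sketch; the complex angle and sample point are arbitrary choices):

```python
import cmath

k = 5.0
t = 0.3 - 0.7j                             # a complex angle gives an evanescent wave
d = (k * cmath.cos(t), k * cmath.sin(t))   # d·d = k^2 holds by construction

u = lambda x, y: cmath.exp(1j * (d[0] * x + d[1] * y))

# Finite-difference check that the Helmholtz residual Δu + k^2 u vanishes
# (up to O(h^2) discretization error) at a sample point.
h, x0, y0 = 1e-3, 0.2, -0.1
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h)
       - 4 * u(x0, y0)) / h**2
residual = abs(lap + k**2 * u(x0, y0)) / abs(u(x0, y0))
print(residual)

# Exponential decay along the direction of the imaginary part of d:
print(abs(u(0.0, 0.0)), abs(u(1.0, 0.0)))
```

A purely real propagation vector gives |u| = 1 everywhere; the exponential decay of the modulus above is what makes the corresponding expansion coefficients bounded.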
In [21] we extend the previous analysis from two to three space dimensions. A first nontrivial challenge in 3D is the parametrization of the set of complex directions. Our approach involves defining a complex-valued reference direction and then considering its rigid-body rotations via Euler angles. Then, by generalizing the Jacobi–Anger identity to complex-valued directions, we prove that any solution of the Helmholtz equation on a three-dimensional ball can be written as a continuous superposition of evanescent plane waves in a stable way. Most of the analysis then extends easily from 2D to 3D. Again, our numerical results showcase a great improvement in the accuracy achievable by our proposed method compared to standard methods.
In [23], we consider a rational approximation of the exponential function in order to design an algorithm for computing the matrix exponential in the Hermitian case. Using a partial fraction decomposition, we obtain a parallelizable method in which the computation reduces to independent solutions of linear systems. We analyze the effects of rounding errors on the accuracy of our algorithm. We complete this work with numerical tests showing the efficiency of our method and a comparison of its performance with that of Krylov algorithms.
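The structure of such a method — a partial fraction expansion r(z) = c_inf + sum_i r_i / (z - p_i) turning r(A)v into independent shifted linear solves — can be illustrated with the simple (1,1) Padé approximant of the exponential (a toy sketch on a diagonal Hermitian matrix; the paper relies on higher-order rational approximations, not this one):

```python
import math

def rational_matfun_apply(solve_shifted, v, poles, residues, c_inf):
    """Apply r(A) v = c_inf v + sum_i residues[i] * (A - poles[i] I)^{-1} v.
    Each shifted solve is independent of the others, hence parallelizable."""
    out = [c_inf * x for x in v]
    for p, r in zip(poles, residues):
        w = solve_shifted(p, v)  # solves (A - p I) w = v
        out = [o + r * x for o, x in zip(out, w)]
    return out

# Diagonal illustration with the (1,1) Pade approximant of exp:
# exp(z) ≈ (2 + z)/(2 - z) = -1 - 4/(z - 2): one pole p = 2, residue -4.
A_diag = [0.1, -0.2, 0.05]   # eigenvalues of a diagonal Hermitian "matrix"
solve = lambda p, v: [x / (a - p) for a, x in zip(A_diag, v)]

v = [1.0, 1.0, 1.0]
approx = rational_matfun_apply(solve, v, poles=[2.0], residues=[-4.0], c_inf=-1.0)
exact = [math.exp(a) * x for a, x in zip(A_diag, v)]
print(max(abs(p - e) for p, e in zip(approx, exact)))  # small for small |eigenvalues|
```

With a degree-m rational approximation the same skeleton yields m shifted solves that can be dispatched to m processes, which is the parallelism exploited by the method.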
In [28] we present a finite element toolbox for the computation of the Bogoliubov–de Gennes modes used to assess the linear stability of stationary solutions of the Gross–Pitaevskii (GP) equation. Applications concern one-component (a single GP equation) or two-component (a system of coupled GP equations) Bose–Einstein condensates in one, two and three dimensions of space. An implementation using the free software FreeFem++ is distributed with this paper. For the computation of the GP stationary (complex or real) solutions, we use a Newton algorithm coupled with a continuation method exploring the parameter space (the chemical potential or the interaction constant). The Bogoliubov–de Gennes equations are then solved using dedicated libraries for the associated eigenvalue problem. Mesh adaptivity is shown to considerably reduce the computational time for cases involving complex vortex states. Programs are validated through comparisons with known theoretical results for simple cases and with numerical results reported in the literature [97].
With: C. Bernardi, V. Girault, F. Hecht, P.A. Raviart, B. Riviere.
This book, in final preparation (2024), is a revised, updated, and augmented version of the out-of-print classic book "Finite Element Methods for Navier–Stokes Equations" by Girault and Raviart, published by Springer in 1986 [GR]. The incompressible Navier–Stokes equations model the flow of incompressible Newtonian fluids and are used in many practical applications, including computational fluid dynamics. In addition to the basic theoretical analysis, this book presents a fairly exhaustive treatment of up-to-date finite element discretizations of the incompressible Navier–Stokes equations and a variety of numerical algorithms used in their computer implementation. It covers the cases of standard and nonstandard boundary conditions and their numerical discretizations via finite element methods. Both conforming and nonconforming finite elements are examined in detail, as well as their stability or instability. The topic of time-dependent Navier–Stokes equations, which was missing from [GR], is now presented in several chapters. In the same spirit as [GR], we have tried as much as possible to make this book self-contained, and therefore we have either proved or recalled all the theoretical results required. This book can be used as a textbook for advanced graduate students.
With Hussam Al Daas, Grey Ballard, Laura Grigori, Suraj Kumar, Kathryn Rouse
Multiple Tensor-Times-Matrix (Multi-TTM) is a key computation in algorithms for computing and operating with the Tucker tensor decomposition, which is frequently used in multidimensional data analysis. In [8], we establish communication lower bounds that determine how much data movement is required to perform the Multi-TTM computation in parallel. The crux of the proof relies on analytically solving a constrained, nonlinear optimization problem. We also present a parallel algorithm to perform this computation that organizes the processors into a logical grid with twice as many modes as the input tensor. We show that with correct choices of grid dimensions, the communication cost of the algorithm attains the lower bounds and is therefore communication optimal. Finally, we show that our algorithm can significantly reduce communication compared to the straightforward approach of expressing the computation as a sequence of tensor-times-matrix operations.
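A single tensor-times-matrix contraction, the building block of Multi-TTM, can be sketched in a few lines (a naive sequential illustration for 3-way tensors stored as nested lists; how such contractions are distributed across processors, the subject of the paper, is not addressed here):

```python
def ttm(X, A, mode):
    """Tensor-times-matrix along one mode of a 3-way tensor (nested lists).
    mode 0: Y[r][j][k] = sum_i A[r][i] X[i][j][k]; modes 1 and 2 analogous."""
    I, J, K = len(X), len(X[0]), len(X[0][0])
    R = len(A)
    if mode == 0:
        return [[[sum(A[r][i] * X[i][j][k] for i in range(I))
                  for k in range(K)] for j in range(J)] for r in range(R)]
    if mode == 1:
        return [[[sum(A[r][j] * X[i][j][k] for j in range(J))
                  for k in range(K)] for r in range(R)] for i in range(I)]
    return [[[sum(A[r][k] * X[i][j][k] for k in range(K))
              for r in range(R)] for j in range(J)] for i in range(I)]

# Multi-TTM as a sequence of per-mode contractions; with 1-row matrices this
# computes the full contraction sum_{ijk} a_i b_j c_k X[i][j][k].
X = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
a, b, c = [[1.0, 1.0]], [[1.0, -1.0]], [[0.5, 0.5]]
Y = ttm(ttm(ttm(X, a, 0), b, 1), c, 2)
print(Y[0][0][0])
```

Expressing Multi-TTM as such a sequence is exactly the "straightforward approach" whose communication cost the proposed algorithm improves upon.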
With Van-Thanh Nguyen, Laura Grigori
In [13], we describe an interpretation of parareal as a two-level additive Schwarz preconditioner in the time domain. We show that this two-level preconditioner in time is equivalent to parareal and to multigrid reduction in time (MGRIT) with F-relaxation. We also discuss the case when additional fine or coarse propagation steps are applied in the preconditioner. This leads to procedures equivalent to MGRIT with FCF-relaxation and to MGRIT with F(CF)^2-relaxation or overlapping parareal. Numerical results show that these variants have faster convergence in some cases. In addition, we also apply a Krylov subspace method, namely GMRES (generalized minimal residual), to accelerate the parareal algorithm. Better convergence is obtained, especially for the advection–reaction–diffusion equation in the case when the advection and reaction coefficients are large.
With Yongseok Jang, Laura Grigori, Emeric Martin, Cédric Content
For high-dimensional spaces, a randomized Gram–Schmidt (RGS) algorithm is beneficial in terms of both computational cost and numerical stability. In [25], we apply this dimension reduction technique by random sketching to Krylov subspace methods, e.g. to the generalized minimal residual method (GMRES). We propose a flexible variant of GMRES with a randomized Gram–Schmidt based Arnoldi iteration to produce a set of basis vectors of the Krylov subspace. Even though the Krylov basis is no longer l2-orthonormal, its random projection onto the low-dimensional space retains l2-orthogonality. As a result, we observe a numerical stability which turns out to be independent of the dimension of the problem, even for extreme-scale problems. On the other hand, while Harmonic Ritz values are commonly used in GMRES with deflated restarting to improve convergence, we consider another deflation strategy, namely disregarding the singular vectors associated with the smallest singular values. We thus introduce a new algorithm: randomized flexible GMRES with singular value decomposition (SVD) based deflated restarting. Finally, we carry out numerical experiments in the context of compressible turbulent flow simulations. Our proposed approach exhibits numerical behaviour quite competitive with existing methods while reducing computational costs.
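The core idea of randomized Gram–Schmidt — computing the orthogonalization inner products and norms on short random sketches of the vectors rather than on the full-length vectors — can be sketched as follows (a simplified illustration, not the algorithm of [25]; the sketch size, sign-based embedding, and test dimensions are arbitrary choices):

```python
import math
import random

random.seed(0)

def sketch_matrix(k, n):
    """Random sign sketch S in R^{k x n}, a simple oblivious subspace embedding."""
    s = 1.0 / math.sqrt(k)
    return [[s * random.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(k)]

matvec = lambda S, v: [sum(Si[j] * v[j] for j in range(len(v))) for Si in S]
dot = lambda a, b: sum(x * y for x, y in zip(a, b))

def rgs(W, S):
    """Randomized Gram-Schmidt: build a basis Q whose *sketches* S q_i are
    orthonormal; all inner products use only the short sketched vectors."""
    Q, P = [], []  # basis vectors and their sketches
    for w in W:
        p = matvec(S, w)
        coeffs = [dot(pi, p) for pi in P]      # projections in sketch space
        q = [wj - sum(c * Q[i][j] for i, c in enumerate(coeffs))
             for j, wj in enumerate(w)]
        p = matvec(S, q)
        nrm = math.sqrt(dot(p, p))             # sketched norm
        Q.append([x / nrm for x in q])
        P.append([x / nrm for x in p])
    return Q, P

n, k = 50, 20
W = [[random.gauss(0, 1) for _ in range(n)] for _ in range(4)]
Q, P = rgs(W, sketch_matrix(k, n))
# The sketched basis is orthonormal to machine precision:
print(max(abs(dot(P[i], P[j]) - (1.0 if i == j else 0.0))
          for i in range(4) for j in range(4)))
```

Since the inner products act on length-k sketches instead of length-n vectors, the orthogonalization cost drops from O(n) to O(k) per inner product, which is the saving exploited in the sketched Arnoldi iteration.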
With Laura Grigori, Edouard Timsit
In [22], we introduce a randomized Householder QR factorization (RHQR). This factorization can be used to obtain a well-conditioned basis of a set of vectors and thus can be employed in a variety of applications. We discuss in particular the usage of this randomized Householder factorization in the Arnoldi process. Numerical experiments show that RHQR produces a well-conditioned basis and an accurate factorization. We observe that in some cases it can be more stable than randomized Gram–Schmidt (RGS), in both single and double precision.
In this project, supported by the European Union's Inno4Scale initiative, INRIA and WIKKI (Germany) will collaborate to develop, test and evaluate a novel asynchronous modified conjugate gradient (AMCG) algorithm. It will be demonstrated in two important middleware applications, namely OpenFOAM® and FreeFEM, which rely on finite volume and finite element discretizations respectively.
EMC2 project on cordis.europa.eu
Molecular simulation has become an instrumental tool in chemistry, condensed matter physics, molecular biology, materials science, and nanosciences. It will make it possible to propose de novo designs of, e.g., new drugs or materials, provided that the efficiency of the underlying software is accelerated by several orders of magnitude.
The ambition of the EMC2 project is to achieve scientific breakthroughs in this field by gathering the expertise of a multidisciplinary community at the interfaces of four disciplines: mathematics, chemistry, physics, and computer science. It is motivated by the twofold observation that (i) building upon our collaborative work, we have recently been able to gain efficiency factors of up to three orders of magnitude for polarizable molecular dynamics in solution for multi-million atom systems, but that (ii) this is not enough, since even larger or more complex systems of major practical interest (such as solvated biosystems or molecules with strongly correlated electrons) currently remain mostly intractable in reasonable clock time. The only way to further improve the efficiency of the solvers, while preserving accuracy, is to develop physically and chemically sound models, mathematically certified and numerically efficient algorithms, and to implement them in a robust and scalable way on various architectures (from standard academic or industrial clusters to emerging heterogeneous and exascale architectures).
EMC2 has no equivalent in the world: nowhere else is there such a critical mass of interdisciplinary researchers, already collaborating and with the required track records, to address this challenge. Under the leadership of the four PIs, supported by highly recognized teams from three major institutions in the Paris area, EMC2 will develop disruptive methodological approaches and publicly available simulation tools, and apply them to challenging molecular systems. The project will considerably strengthen the local teams and their synergy, enabling decisive progress in the field.
Numpex is a French program dedicated to exascale computing, which is divided into five research projects. The first one is ExaMA (Methods and Algorithms for Exascale), which aims at advancing scientific simulation and modeling capabilities to reach and surpass the exascale barrier. The project is organized into several work packages (WP) that focus on different aspects of the research objectives.
Partners in WP3: LIP6 Sorbonne University, INRIA Bordeaux, CEA.
ANR generic call for projects (appel à projet générique), 2019.
S. Hirstoaga and P.-H. Tournier are members of the project MUFFIN, whose objective is to explore and optimize original computational scenarios for multiscale and high-dimensional transport codes, with priority applications in plasma physics. Several approximation methods are planned to be developed. The project is at the frontier of computing and numerical analysis and intends to reduce the computational burden in the context of intensive calculation. Principal Investigator: B. Després (Sorbonne University).