## Section: New Results

### Algorithms and high-performance solvers

#### Dense linear algebra solvers for multicore processors accelerated with multiple GPUs

In collaboration with the Inria
`RUNTIME` team and the University of Tennessee, we have designed dense
linear algebra solvers that can fully exploit a node composed of a
multicore processor accelerated with multiple
GPUs. This work has been integrated into the latest release of the MAGMA package
(http://icl.cs.utk.edu/magma/).

#### Hybrid direct/iterative solvers based on algebraic domain decomposition techniques

A first release of the `MaPHyS` package should be made available early in 2012 thanks to the
developments conducted in the last year of the ADT.
An approximation of the local Schur complement based on approximate inverse
techniques has been studied.
This work is a natural extension of part of the PhD research of Mikko Byckling.
Furthermore, during his master's internship, Stojce Nakov investigated the
design of a Krylov subspace method, namely the conjugate gradient, on top of a runtime system in order
to best exploit the computing capabilities of many-GPU nodes and manycore systems.
In the framework of his PhD, which is just starting and is funded by TOTAL, Stojce Nakov will continue this work
to design a new implementation of a hybrid linear solver
(see Section 3.3)
for heterogeneous manycore platforms.
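
For readers unfamiliar with the method, the following is a minimal sequential sketch of the conjugate gradient iteration in plain Python. It is not the runtime-based implementation mentioned above: the function names `conjugate_gradient` and `laplacian_1d`, the tolerance, and the test matrix are all illustrative assumptions. It only shows the kernels (matrix-vector product, dot products, vector updates) that a task-based implementation would have to schedule.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A.

    `matvec` is any callable returning A @ v, so A is never formed explicitly.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    b_norm = rs_old ** 0.5
    if b_norm == 0.0:
        return x
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol * b_norm:
            break                    # relative residual is small enough
        ap = matvec(p)               # the only matrix-vector product per iteration
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Apply the tridiagonal matrix tridiag(-1, 2, -1) to v (illustrative test operator)."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]
```

A call such as `conjugate_gradient(laplacian_1d, b)` returns an approximate solution. In a task-based setting, each of the three vector updates and the matrix-vector product would presumably become tasks, with the two dot products acting as synchronization points.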

#### Resilience in numerical simulations

In his master's internship, Mawussi Zounon investigated recovery strategies for core faults in the framework of parallel preconditioned Krylov solvers. The underlying idea is to recover the entries of the iterate lost in a fault via interpolation from the values still available on neighboring cores. He will continue this work in the framework of his PhD funded by ANR-RESCUE. Note that these activities are also part of our contribution to the G8-ECS project (Enabling Climate Simulation at extreme scale).
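
The interpolation idea can be illustrated with a small sketch. The rule used here is an assumption chosen for illustration, not necessarily the strategy studied in the internship: for a linear system `A x = b`, the lost block of the iterate is rebuilt by solving the local system restricted to the lost rows, with the surviving entries moved to the right-hand side. The names `solve_dense` and `recover_lost_entries` are hypothetical, and matrices are stored as dense lists of lists for simplicity.

```python
def solve_dense(a, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]   # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[piv] = m[piv], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                         # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def recover_lost_entries(a, b, x, lost):
    """Return a copy of x whose entries listed in `lost` are rebuilt.

    Solves A[lost, lost] x[lost] = b[lost] - A[lost, kept] x[kept],
    so only surviving entries coupled to the lost rows are needed.
    """
    lost = list(lost)
    lost_set = set(lost)
    kept = [j for j in range(len(x)) if j not in lost_set]
    a_ll = [[a[i][j] for j in lost] for i in lost]
    rhs = [b[i] - sum(a[i][j] * x[j] for j in kept) for i in lost]
    x_new = list(x)
    for i, value in zip(lost, solve_dense(a_ll, rhs)):
        x_new[i] = value
    return x_new
```

For a sparse matrix, the sum over surviving entries only involves unknowns coupled to the lost rows, which in a distributed setting are held by neighboring cores. If the surviving entries belong to the exact solution, the lost block is recovered exactly; in the middle of an iteration, the rebuilt entries are only consistent with the current approximate iterate.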

#### Full geometric multigrid method for 3D Maxwell equations

In the context of a collaboration with the CEA/CESTA center, Mathieu Chanaud continued his PhD work on a tight combination of multigrid methods and direct methods for the efficient solution of challenging 3D irregular finite element problems arising from the discretization of the Maxwell equations. A parallel solver dedicated to the ODYSSEE challenge (electromagnetism) of CEA/CESTA has been implemented and integrated. This new parallel solver was able to solve a system with 1.3 billion unknowns, starting from a problem with 20 million unknowns at the coarsest level. The input mesh defines the coarsest level. This mesh is then refined to define the grid hierarchy, on which matrix-free smoothers are used to reduce memory consumption.

#### Scalable numerical schemes for scientific applications

Work is currently being carried out with TOTAL (PhD of Rached Abdelkhalek).
The extraordinary challenge that the oil and gas industry must face
for hydrocarbon exploration requires the development of leading edge
technologies to recover an accurate representation of the subsurface.
Seismic modeling and Reverse Time Migration (RTM) based on the full
wave equation discretization are tools of major importance since they
give an accurate representation of complex wave propagation
areas. Unfortunately, they are highly compute-intensive.
Recent developments in `GPU` technologies, with unified architectures
and general-purpose languages, coupled with the high and rapidly
increasing throughput of these components, have made General
Purpose Processing on Graphics Processing Units an attractive
solution to speed up diverse applications.
We have designed a fast parallel simulator that solves the acoustic
wave equation on a `GPU` cluster.
Solving the acoustic wave equation in an oil exploration industrial
context aims at speeding up seismic modeling and Reverse Time
Migration. We consider a finite difference approach on a regular
mesh, in both 2D and 3D cases. The acoustic wave equation is solved
in a constant density or a variable density domain. All the
computations are done in single precision, since double precision is
not required in our context. We use NVIDIA CUDA to take advantage of
the `GPU` computational power. We study different implementations and
their impact on the application performance. We obtain a speedup of
16 for Reverse Time Migration and up to 43 for the modeling
application over a sequential code running on a general-purpose CPU.
The defense of this thesis is planned for early 2012.
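
To make the finite difference approach concrete, here is a minimal sketch of an explicit scheme for the constant-density acoustic wave equation, reduced to 1D and written in plain Python. It is not the CUDA simulator described above: the function name `simulate_wave_1d`, the second-order stencil, the Gaussian initial pulse, the fixed (zero) boundaries, and all parameter values are assumptions made for illustration, and Python floats are double precision rather than single precision.

```python
import math


def simulate_wave_1d(nx=201, nt=300, dx=5.0, c=1500.0, cfl=0.5, width_cells=10.0):
    """Advance u_tt = c^2 u_xx with a second-order explicit scheme.

    The time step is chosen from the CFL number, dt = cfl * dx / c,
    and the scheme is stable for cfl <= 1 in 1D.
    """
    dt = cfl * dx / c
    r2 = (c * dt / dx) ** 2
    center = nx // 2
    # Gaussian pressure pulse at the center, with zero initial velocity
    # (approximated by starting with two identical time levels).
    u_prev = [math.exp(-(((i - center) / width_cells) ** 2)) for i in range(nx)]
    u_curr = list(u_prev)
    for _ in range(nt):
        u_next = [0.0] * nx          # boundary values stay at zero
        for i in range(1, nx - 1):
            laplacian = u_curr[i + 1] - 2.0 * u_curr[i] + u_curr[i - 1]
            u_next[i] = 2.0 * u_curr[i] - u_prev[i] + r2 * laplacian
        u_prev, u_curr = u_curr, u_next
    return u_curr
```

Each grid point is updated independently from values of the two previous time levels, which is why this kind of stencil maps naturally onto one `GPU` thread per grid point. A variable-density version would replace the constant-coefficient Laplacian with a divergence form involving the inverse density.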

For the solution of the elastodynamic equation on meshes with local refinements, we are currently collaborating with TOTAL to design a parallel implementation of a local time refinement technique on top of a discontinuous Galerkin space discretization. The discontinuous Galerkin discretization makes it possible to handle non-conforming meshes, which are well suited to multiblock approaches that capture the locally refined regions. This work is being developed in the framework of the PhD thesis of Yohann Dudouit. A software prototype is currently being developed to address these simulations.

The calculation of acoustic modes in combustion chambers is challenging for large 3D geometries. It requires computing a few of the smallest eigenpairs of large unsymmetric matrices in a parallel environment. A new block Arnoldi approach is currently being developed to best benefit from the continuation scheme used in this application context. This is part of the PhD research of Pablo Salas.