The SAGE team undertakes research on high-performance computing and deals with three subjects:

numerical algorithms, mostly for solving linear and nonlinear systems,

large scale high performance computing, involving parallel and grid computing,

environmental and geophysical applications, mostly in hydrogeology.

These three subjects are highly interconnected: the first topic aims at designing numerical algorithms that achieve high performance on parallel and grid architectures and that are applied in geophysical models.

Moreover, the SAGE team develops a software platform for numerical simulations of groundwater in heterogeneous subsurface media.

The focus of this topic is the design of efficient and robust numerical algorithms in linear algebra. The main objective is to solve large systems of equations $Ax = b$, where the matrix $A$ has a sparse structure (many coefficients are zero). High performance computing is required in order to tackle large scale problems. Algorithms and solvers are applied to problems arising from hydrogeology and geophysics.

Direct methods, based on the factorization $A = LU$, induce fill-in in the matrices $L$ and $U$. Reordering techniques can be used to reduce this fill-in, and hence memory requirements and floating-point operations.
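The effect of reordering on fill-in can be illustrated on a small example. The sketch below is only an illustration under stated assumptions: the "arrow" test matrix, the helper names `lu_nonzeros`, `arrow_matrix` and `permute`, and the dense storage are all chosen here for readability and are not taken from the team's software. It factorizes the same matrix with two orderings and counts the nonzeros stored in $L$ and $U$.

```python
def lu_nonzeros(a, tol=1e-12):
    """Count nonzeros in the compact L\\U storage after LU without pivoting.

    The unit diagonal of L is implicit and is not counted.
    """
    n = len(a)
    m = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            if m[i][k] != 0.0:
                m[i][k] /= m[k][k]          # multiplier, stored in place of L
                for j in range(k + 1, n):
                    m[i][j] -= m[i][k] * m[k][j]
    return sum(1 for row in m for v in row if abs(v) > tol)


def arrow_matrix(n):
    """Hypothetical test matrix: dense first row and column, dominant diagonal."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        a[0][i] = 1.0
        a[i][0] = 1.0
    for i in range(n):
        a[i][i] = float(n)
    return a


def permute(a, order):
    """Symmetric permutation: rows and columns are reordered the same way."""
    return [[a[i][j] for j in order] for i in order]


n = 6
a = arrow_matrix(n)
nnz_original = lu_nonzeros(a)                      # dense row/column eliminated first
reverse_order = list(range(n - 1, -1, -1))
nnz_reordered = lu_nonzeros(permute(a, reverse_order))  # dense row/column eliminated last
```

With the dense row and column eliminated first, the factors fill in completely ($n^2$ stored entries); eliminating them last keeps the original sparsity pattern ($3n - 2$ entries).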

More precisely, direct methods involve two steps: first *factoring* the matrix $A$ into the product $A = P_1 L U P_2$, where $P_1$ and $P_2$ are permutation matrices, $L$ is lower triangular, and $U$ is upper triangular; then solving $P_1 L U P_2 x = b$ by processing one factor at a time. The most time-consuming and complicated step is the first one, which is further broken down into the following steps:

Choose $P_1$ and diagonal matrices $D_1$ and $D_2$ so that $P_1 D_1 A D_2$ has a “large diagonal.” This helps to ensure the accuracy of the final solution.

Choose $P_2$ so that the $L$ and $U$ factors of $P_1 A P_2$ are as sparse as possible.

Perform *symbolic analysis*, i.e. identify the locations of the nonzero entries of $L$ and $U$.

Factorize $P_1 A P_2$ into $L$ and $U$.
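The two phases (factor once, then solve one factor at a time) can be sketched on a small dense matrix. This is a simplified illustration, not the sparse algorithm described above: it uses only a row permutation chosen by partial pivoting, with no scaling, no column permutation and no symbolic analysis. The function names `lu_factor` and `lu_solve` and the test matrix are assumptions made for this example.

```python
def lu_factor(a):
    """Factor a square matrix so that (row-permuted A) = L U, with partial pivoting.

    Returns the compact L\\U storage and the row permutation as a list.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest entry of column k as pivot (numerical stability).
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b from the factors: permute b, then one triangular solve per factor."""
    n = len(lu)
    y = [b[p] for p in perm]
    for i in range(n):                    # forward substitution with unit-diagonal L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in range(n - 1, -1, -1):        # backward substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


a = [[0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 3.0]]
b = [7.0, 3.0, 11.0]                      # equals A times [1, 2, 3]
lu, perm = lu_factor(a)
x = lu_solve(lu, perm, b)
```

Once the factors are available, additional right-hand sides only cost two triangular solves each, which is why the factorization is the step worth optimizing.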

The two main classes of iterative solvers are Krylov methods and multigrid methods.

A Krylov subspace is, for example, $\mathcal{K}_m(A, r_0) = \mathrm{span}\{r_0, A r_0, \ldots, A^{m-1} r_0\}$, where $r_0$ is the initial residual. If the matrix is symmetric positive definite, the Krylov method of choice is the Conjugate Gradient; for symmetric indefinite matrices, there are mainly three methods, SYMMLQ, MINRES and LSQR. For unsymmetric matrices, it is not possible to have both properties of minimization and short recurrences. The GMRES method minimizes the residual norm but must be restarted to limit memory requirements. The BICGSTAB and QMR methods have short recurrences but do not guarantee a decreasing residual. In practice, these iterative methods require preconditioning to speed up convergence: the system $M^{-1}Ax = M^{-1}b$ is solved, where $M$ is a matrix close to $A$ such that linear systems $Mz = c$ are easy to solve. A family of preconditioners uses incomplete factorizations $A = LU + R$, where $R$ is implicitly defined by the level of fill-in allowed in $L$ and $U$. Other types of preconditioners include an algebraic multigrid approach, an approximate inverse or a domain decomposition.
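As an illustration of how a preconditioner $M$ enters a Krylov iteration, here is a minimal sketch of the preconditioned Conjugate Gradient method for a symmetric positive definite matrix. The diagonal (Jacobi) preconditioner, the dense list-of-lists storage, the test matrix and all function names are assumptions chosen to keep the example short; they do not reflect the preconditioners studied by the team.

```python
import math


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(a, b, apply_minv, tol=1e-10, max_iter=200):
    """Preconditioned Conjugate Gradient; apply_minv(r) returns M^{-1} r."""
    x = [0.0] * len(b)
    r = b[:]                              # residual for the zero initial guess
    z = apply_minv(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_minv(r)                 # only place where M is used: solve M z = r
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical SPD test matrix: tridiagonal with a varying diagonal.
n = 8
a = [[0.0] * n for _ in range(n)]
for i in range(n):
    a[i][i] = 2.0 + i
    if i > 0:
        a[i][i - 1] = a[i - 1][i] = -1.0
b = matvec(a, [1.0] * n)                  # exact solution is the vector of ones

jacobi = lambda r: [ri / a[i][i] for i, ri in enumerate(r)]
x, iterations = pcg(a, b, jacobi)
```

The solver only needs the action of $M^{-1}$ on a vector, so an incomplete factorization or a multigrid cycle could be passed in place of `jacobi` without changing the iteration.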

Multigrid methods can be used as such or as a preconditioner. They can be either geometric or algebraic.

The team studies preconditioners for Krylov methods and uses multigrid methods.

Domain decomposition methods are hybrid or semi-iterative methods that combine iterative and direct techniques. They can be based on the alternating Schwarz method when the subdomains overlap, or on the Schur complement method when they do not. Schwarz methods can be used as preconditioners of Krylov methods or directly with an acceleration based on Aitken extrapolation. Schur methods lead to a reduced system, solved by a preconditioned Krylov method.

The team studies these various aspects of domain decomposition methods.

For linear least-squares problems, direct methods are based on the normal equations $A^T A x = A^T b$, using either a Cholesky factorization of $A^T A$ or a $QR$ factorization of $A$, whereas the most common Krylov iterative method is LSQR. If the discrete problem is ill-posed, regularization such as Tikhonov regularization or a Truncated Singular Value Decomposition (TSVD) is required. For large matrices, the so-called complete factorization is also useful. The first step is a pivoted $QR$ factorization, followed by a second factorization involving orthogonal matrices $U$ and $V$ and a matrix $E$ that is negligible with respect to the chosen threshold. Such a decomposition is a robust rank-revealing factorization and it provides the Moore-Penrose generalized inverse for free. Recently, efficient $QR$ factorization software libraries have become available, but they do not consider column or row permutations based on numerical considerations, since the corresponding orderings often end up with an intractable level of fill-in.
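For reference, the two regularization techniques named above can be written in their standard textbook form. The regularization parameter $\mu$, the truncation index $k$ and the singular triplets $(\sigma_i, u_i, v_i)$ are notation introduced here, not taken from the report.

```latex
% Tikhonov regularization: penalized least squares and its normal equations
\[
x_\mu = \arg\min_x \; \|Ax - b\|_2^2 + \mu^2 \|x\|_2^2
\quad\Longleftrightarrow\quad
(A^T A + \mu^2 I)\, x_\mu = A^T b .
\]

% Truncated SVD: with A = \sum_i \sigma_i u_i v_i^T, keep the k largest singular values
\[
x_k = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i}\, v_i .
\]
```

Both approaches damp or discard the contributions of small singular values, which are the ones that amplify noise in $b$.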

The team studies iterative Krylov methods for regularized problems, as well as rank-revealing $QR$ factorizations.

Nonlinear methods to solve $F(x) = 0$ include fixed-point methods, nonlinear stationary methods, the secant method and the Newton method. The team studies Newton-Krylov methods, where the linearized problem is solved by a Krylov method, as well as Broyden methods and Proper Orthogonal Decomposition methods.
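The structure of a Newton iteration for $F(x) = 0$ can be sketched on a two-equation system. In this minimal example the linearized system is solved directly by Cramer's rule; in a Newton-Krylov method that inner solve would be replaced by a Krylov iteration. The test system, the starting point and the function names are assumptions made for illustration.

```python
import math


def newton(f, jac, x, tol=1e-12, max_iter=50):
    """Newton's method for a system of two equations in two unknowns."""
    for _ in range(max_iter):
        f0, f1 = f(x)
        if max(abs(f0), abs(f1)) < tol:
            break
        (a, b), (c, d) = jac(x)
        det = a * d - b * c
        # Solve the linearized system J dx = -F (Cramer's rule for a 2x2 matrix).
        dx0 = (-f0 * d + f1 * b) / det
        dx1 = (-f1 * a + f0 * c) / det
        x = (x[0] + dx0, x[1] + dx1)
    return x


# Hypothetical test problem: intersection of a circle of radius 2 with the line x0 = x1.
def f(x):
    return (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])


def jac(x):
    return ((2.0 * x[0], 2.0 * x[1]), (1.0, -1.0))


root = newton(f, jac, (1.0, 0.5))
```

Starting from $(1, 0.5)$, the iterates approach $(\sqrt{2}, \sqrt{2})$, and the number of correct digits roughly doubles at each step once the iterate is close to the root.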

Another subject of interest is time decomposition methods. The idea is to divide the time interval into subintervals, to apply a timestep in each subinterval and to apply a nonlinear correction at both ends of subintervals. This can be applied to explosive or oscillatory problems.

Let us consider the problem of computing some extremal eigenvalues of a large sparse and symmetric matrix $A$. The Davidson method is a subspace method that builds a sequence of subspaces onto which the initial problem is projected. At every step, approximations of the sought eigenpairs are computed: let $V_m$ be an orthonormal basis of the subspace at step $m$ and let $(\lambda, z)$ be an eigenpair of the matrix $H_m = V_m^T A V_m$; then the Ritz pair $(\lambda, x = V_m z)$ is an approximation of an eigenpair of $A$. The specificity of the method comes from how the subspace is augmented for the next step. In contrast to the Lanczos method, which is the reference method, the subspaces are not Krylov subspaces, since the new vector $t = x + y$ that will be added to the subspace is obtained by an acceleration procedure: the correction $y$ is obtained by an exact Newton step (Jacobi-Davidson method) or an inexact Newton step (Davidson method). Both the behavior of the Davidson method and the Jacobi-Davidson method are documented in the literature. These methods bring a substantial improvement over the Lanczos method when computing the eigenvalues of smallest magnitude. For that reason, the team considered Davidson methods to compute the smallest singular values of a matrix $B$ by applying them to the matrix $B^T B$.
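The projection step can be made explicit. With an orthonormal basis $V_m$ (so that $V_m^T V_m = I$), the Ritz pair satisfies a Galerkin condition: its residual is orthogonal to the current subspace. The short derivation below uses $\lambda$ for the Ritz value, a symbol assumed here.

```latex
\[
H_m = V_m^T A V_m, \qquad H_m z = \lambda z, \qquad x = V_m z,
\]
\[
V_m^T (A x - \lambda x)
= V_m^T A V_m z - \lambda V_m^T V_m z
= H_m z - \lambda z
= 0 .
\]
```

Since the residual $Ax - \lambda x$ has no component in the current subspace, it is a natural starting point for computing the correction that enlarges the subspace at the next step.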

In several applications, the eigenvalues of a nonsymmetric matrix are needed in order to decide whether they belong to a given part of the complex plane (e.g. the half-plane of complex numbers with negative real part, or the unit disc). However, since the matrix is not exactly known (at best, it is known to the precision of the floating-point representation), the result of the computation is not always guaranteed, especially for ill-conditioned eigenvalues. Actually, the problem is not to compute the eigenvalues precisely, but to characterize whether they lie in a given region of the complex plane. For that purpose, the notion of $\varepsilon$-spectrum, or equivalently of pseudospectrum, was introduced simultaneously by Godunov and Trefethen. Several teams have proposed software to compute pseudospectra, including the SAGE team with the software PPAT.
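For reference, the standard definition of the pseudospectrum in the 2-norm can be stated in two equivalent ways, one through the smallest singular value and one through perturbations of the matrix. The notation $\Lambda_\varepsilon$, $\Lambda$ and $E$ is introduced here for illustration.

```latex
\[
\Lambda_\varepsilon(A)
= \{\, z \in \mathbb{C} : \sigma_{\min}(A - zI) \le \varepsilon \,\}
= \{\, z \in \mathbb{C} : z \in \Lambda(A + E) \ \text{for some } E \text{ with } \|E\|_2 \le \varepsilon \,\}.
\]
```

The second form makes the link with uncertainty explicit: $\Lambda_\varepsilon(A)$ contains every eigenvalue that a perturbation of norm at most $\varepsilon$ could produce, so testing whether it lies inside a region answers the localization question robustly.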

The focus of this topic is the development of parallel algorithms and software. The objectives are to solve large scale equations in linear algebra and to use high performance computing for dealing with problems arising from hydrogeology and geophysics.

Algorithms have been described above. The team works on the development of parallel software for iterative solvers (PCG, GMRES, subdomain method) and least-squares solvers ($QR$ factorization). The team also compares existing solvers. The target is Giga-systems with billions ($10^9$) of unknowns.

Our applications are quite often multi-physics models, where nonlinear coupling occurs. Our objective is to design software components that provide great modularity and flexibility for using the models in different contexts. The main numerical difficulty is to design a coupling algorithm with potential for parallelism.

In our applications, we use stochastic modelling in order to take into account geophysical variability. From a numerical point of view, this amounts to running multiparametric simulations. The objective is to use the power of grid computing. The target architecture is a heterogeneous collection of parallel clusters, with high-speed networks in clusters and slower networks interconnecting the clusters.

The team has chosen a particular domain of application, which is geophysics. In this domain, many problems require solving large scale systems of equations, arising from the discretization of coupled models. Emphasis is put on hydrogeology, but the team also investigates geodesy, submarine acoustics, geological rock formation and heat transfer in soil. One of the objectives is to use high performance computing in order to tackle large scale 3D computational domains with complex physical models.

This is joint work with Geosciences Rennes, University of Le Havre and CDCSP at University of Lyon. It is also done in the context of GdR Momas and Andra grant.

Many environmental studies rely on modelling geo-chemical and hydrodynamic processes. Some issues concern aquifer contamination, underground waste disposal, underground storage of nuclear wastes, land-filling of waste, clean-up of former waste deposits. Simulation of contaminant transport in groundwater is a highly complex problem, governed by coupled linear or nonlinear PDAEs. Moreover, due to the lack of experimental data, stochastic models are used for dealing with heterogeneity. The main objective of the team is to design and to implement efficient and robust numerical models, including Uncertainty Quantification methods.

Recent research showed that solid rock masses are in general fractured and that fluids can percolate through networks of interconnected fractures. Rock media are thus of interest for water resources as well as for the underground storage of nuclear waste. Fractured media are by nature very heterogeneous and multi-scale, so that homogenisation approaches are not relevant. The team develops a numerical model for fluid flow and contaminant transport in three-dimensional fracture networks.

The output is a parallel scientific platform running on clusters and on experimental computational grids. Simulations of several test cases assess the performance of the software.

The kernel of SCILAB includes a special format for sparse matrices and some factorizations as well. Iterative linear solvers (PCG, GMRES, BiCGSTAB, QMR, etc.) for large sparse linear systems have been integrated in the SCILAB distribution. SCILIN is a SCILAB toolbox providing preconditioners for these solvers. SCILIN can be downloaded at the address:
http://

PPAT (Parallel PATh following software) is a parallel code, developed by D. Mezher, W. Najem (University of Saint-Joseph, Beirut, Lebanon) and B. Philippe. This tool can follow the contours of a functional from $\mathbb{C}$ to $\mathbb{R}^+$. The present version is adapted for determining the level curves of the function $f(z) = \sigma_{\min}(A - zI)$, which gives the pseudospectrum of the matrix $A$.

The algorithm is reliable: it does not assume that the curve has a derivative everywhere. The process is proved to terminate even when roundoff errors are taken into account. The code spawns many independent tasks, which provides good efficiency in parallel runs.

The software can be downloaded under the GPL licence from:
http://

Gridmesh is an interactive 2D structured mesh generator. A first version has a graphical user interface entirely built with Matlab. A second version is developed in Fortran 95 with the use of the MUESLI library. The Matlab version is more user-friendly than the F95 one, but is limited in practice to moderate-size meshes. Gridmesh can create or modify a 2D mesh with associated boundary conditions for both the flow and transport parts. Several numbering schemes can be used, in order to get a more or less banded connectivity matrix. A mesh partition can also be imposed, with an arbitrary number of subdivisions (but this number must be a power of two).

Doing linear algebra with sparse and dense matrices is somewhat difficult in scientific computing. Specific libraries do exist to deal with this area (*e.g.* BLAS and LAPACK for dense matrices, SPARSKIT for sparse ones) but their use is often tedious, mainly because of the great number of arguments which must be used. Moreover, classical libraries do not provide dynamic allocation. Lastly, the two types of storage (sparse and dense) are so different that the user must know in advance the storage used in order to declare the corresponding numerical arrays correctly.

MUESLI is designed to help in dealing with such structures and it provides the convenience of coding in Fortran with a matrix-oriented syntax; its aim is therefore to speed up the development process and to enhance portability. It is a Fortran 95 library split into two modules: (i) FML (Fortran Muesli Library) contains all the material needed to work numerically with a dynamic array (dynamic in size, type and structure), called `mfArray`; (ii) FGL (Fortran Graphics Library) contains graphical routines (some are interactive) which use the `mfArray` objects.

MUESLI includes some parts of the following numerical libraries: Arpack, GSL, HSL, Slatec, SuiteSparse and Triangle. Moreover, it requires some external libraries: zlib, pnglib, hdf5 (with the f90 interface), BLAS and LAPACK. Recently, the following features have been added or improved:

matrix properties (symmetry and positiveness) are stored inside the derived type in order to avoid repeated costly tests;

physical units are integrated into the `mfArray` type;

input/output of matrices supports many different formats (ASCII, compressed binary, HDF5 and Matrix Market).

MUESLI supports most Fortran compilers (NAG, INTEL, GNU, ...). Linux is the platform which has been used for developing and testing MUESLI. Whereas the FML part (numerical computations) should work on any platform (*e.g.* Win32, Mac OS X, Unix), the FGL part is intended to be used only with X11 (*i.e.* under all UNIXes).

The latest version of MUESLI is 1.6.0 (as of 26 October 2007). Two main guides are provided, also available at:

http://

and

http://

When dealing with nonlinear free-surface flows, mixed Eulerian-Lagrangian methods have numerous advantages, because we can follow marker particles distributed on the free surface and then compute the surface position accurately without the need for interpolation over a grid. Besides, if the liquid velocity is large enough, the Navier-Stokes equations can be reduced to a Laplace equation, which is solved numerically by a Boundary Element Method (BEM); this method is very fast and efficient because computation occurs only on the fluid boundary. This method is applied to the spreading of a liquid drop impacting on a solid wall and to droplet formation at a nozzle; applications include, among others, ink-jet printing processes.

The code used (CANARD) has been developed with Jean-Luc ACHARD (LEGI, Grenoble) over fifteen years and is used today mainly through collaboration with Carmen GEORGESCU at UPB (University Politehnica of Bucharest) in Romania.

This software-platform is developed in collaboration with J.-R. de Dreuzy, from Geosciences, university of Rennes 1, with A. Beaudoin, from the University of Le Havre and with D. Tromeur-Dervout, from the University of Lyon.

Hydrolab aims at modeling flow and transport of solute in highly heterogeneous porous or fractured media. Numerical models currently include steady-state flow in saturated media and
transport by advection-diffusion. Physical models can be either a porous medium or a network of fractures. For flow equations, Hydrolab uses a mixed finite element method or a finite volume
method and it includes a particle tracker for transport equations. The platform is organized in software components and relies as far as possible on existing free libraries, such as sparse
linear solvers. Because the target is large computational domains, the platform makes use of high performance computing and all modules have a parallel version. The target is currently clusters
with distributed memory and grid architectures. The code is written in C++ and uses the MPI library for parallel computing. Most modules are fully generic so that they can be used by any
application within the platform. The platform is currently implemented on both Windows and Linux systems. The objective is to develop free software available on the Web; it has already been registered on the Inria Gforge, and a second step will be to register the different components with the APP. A web site dedicated to the Hydrolab platform was opened in May 2007, available at
http://

That work is done in cooperation with Laura Grigori, from the team Grand Large at INRIA Futurs-Saclay. It is done in the context of the Sarima project, related to cooperation with Cameroon.

The design of a highly robust solver based on a preconditioned GMRES method dedicated to parallel computing continues. The preconditioner is based on a special partitioning of the matrix, which can be seen as an algebraic Domain Decomposition with overlaps. The partition must satisfy a constraint more restrictive than in classical domain decomposition: one domain (i.e. one block of the matrix) is connected to at most two other domains. The popular software METIS for automatic graph partitioning does not satisfy this constraint. Therefore, a new tool that automatically builds the partition and the associated renumbering had to be designed.

For that purpose, a graph partitioning algorithm was defined in 2006 by the team and validated this year. The partitioned matrix is suitable for applying the explicit formulation of the Multiplicative Schwarz preconditioner (EFMS) described in a previous work (see the 2005 and 2006 activity reports). The method has now been improved and tested on a large set of matrices.

That work is done in cooperation with Nabil Gmati, from the University of Tunis, Tunisia.

We studied the convergence of the GMRES method when efficiently preconditioned. It is already well known that, when the eigenvalues of the transformed system are all included in the disk centered at 1 and of unit radius, convergence occurs. We have proved that, when $p$ eigenvalues lie outside that disk, they may delay the convergence by at most $p$ steps.

This work is done in collaboration with J.-R. de Dreuzy, from the department of Geosciences at the University of Rennes 1, and with D. Tromeur-Dervout, from the CDCSP at the University of Lyon 1. It is done in the context of the Grid'5000 project.

In order to tackle very large computational domains in flow simulations, we develop a multilevel method where we use an external subdomain decomposition combined with a domain decomposition method and an internal subdomain decomposition combined with a multigrid method. We pursue our work on the Schwarz method accelerated by the Aitken process. We have also investigated various domain decomposition methods, such as the Schur complement method, and analyzed their efficiency for flow computations in heterogeneous media. This was the subject of the internship of Baptiste Poirriez.

That work is done in cooperation with Laura Grigori, from the team Grand Large at INRIA Futurs-Saclay.

We have developed an algorithm to compute a rank-revealing sparse QR factorization. The algorithm consists of a standard QR factorization performed with standard high performance routines like MA49, followed by a second part based on incremental condition number estimation (ICE). Unfortunately, this second part relies mostly on BLAS1 routines, which are known to be inefficient on modern processors. We are now working on introducing BLAS3 operations in this rank-revealing part of the algorithm. This has started with the design of a block version of the well-known ICE algorithm, so that the whole computation will rely on BLAS3 operations for a better use of modern processors.

This work is done in collaboration with R. Aboulaich, from EMI, Morocco, in the context of the Sarima project.

Limited memory methods are powerful methods for solving nonlinear systems. The difficulty in these methods is to find the right amount of memory to allocate to the solver for a given system. This parameter changes for each system and there is no rule to set it. We propose a new limited memory method that we call autoadaptive, because it is able to find an acceptable parameter for the system. We have then tuned this method so that it is now efficient at solving standard nonlinear test functions. The most important improvement is that the threshold that limits the memory adjusts itself during convergence, so that memory is no longer wasted when convergence is good. We also proved that this algorithm has superlinear convergence under classical assumptions. This work has been submitted to an international journal.

This work is done in collaboration with Y. Saad, from the University of Minnesota, USA, and with A. Sameh, from Purdue University, USA.

The design of methods to compute eigenvalues of an operator defined on a domain decomposition is not as well studied as its counterpart for solving linear systems. The question is difficult since there is only a weak relation between the spectra of the restrictions of the operator to the subdomains and the spectrum of the global operator. We studied and improved some previous algorithms used in structural mechanics.

We now investigate the symmetric case in which intermediate eigenvalues are sought. The algorithm is based on a Lanczos method without reorthogonalization, with a special mechanism to eliminate the ghost eigenvalues.

That work is done in cooperation with E. Kamgnia, from the University of Yaoundé, Cameroon, in the context of the Sarima project.

We have pursued the work on a parallel version of the GMRES method preconditioned by Multiplicative Schwarz. The scheme consists in building a priori a basis of the Krylov subspace and then in orthogonalizing it. The basis construction involves a double recursion (on the domains of the operator and on the successive vectors of the basis). It is parallelized by pipelining the iterations of the nested loop.

The developments of this year brought improvements in two directions. First, the a priori construction of the basis intrinsically suffers from ill-conditioning, which harms the subsequent steps of the computation. Even with some techniques to limit that effect, it is necessary to validate the computation. By estimating the magnitude of the roundoff errors in the QR factorization, we have introduced a criterion to adapt the basis dimension dynamically.

Second, we removed the computations that involve a global communication, since they prevent scalability. Such steps occur when computing a dot product of two vectors; to avoid them, the computation of the norms of the basis vectors has been successfully inserted in the pipeline of the nested loop.

The whole code is developed within the PETSc framework and is currently being tested intensively.

This work is done in the context of the ERCIM working group on computing and statistics.

A parallel algorithm for computing the QR decomposition of a large and very long dense matrix is proposed. The algorithm is based on a block version of Givens rotations and is rich in BLAS 3 computations. The large data matrix is partitioned into blocks of columns, where the block size is determined dynamically according to the number of processors used for the computations. The block-columns are distributed among the processors using a cyclic distribution pattern, which ensures a perfectly load-balanced allocation and equal local computations. Numerical results and theoretical complexities are computed. They indicate that the method involves a 50% additional computing cost when compared to the classical algorithm based on Householder reflectors. The expected benefit of the new approach is increased data locality.
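To recall the elementary operation behind the block scheme, the sketch below reduces a small tall matrix to upper triangular form with classical (scalar) Givens rotations. It is a sequential, element-wise illustration only: the block version, the column distribution and the BLAS 3 kernels of the proposed algorithm are not reproduced, and the function names and test matrix are assumptions.

```python
import math


def givens_qr(a):
    """Return the upper triangular factor R of an m x n matrix (m >= n).

    Each rotation combines two adjacent rows to annihilate one subdiagonal entry.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    for j in range(n):
        for i in range(m - 1, j, -1):          # zero column j from the bottom up
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue
            c, s = x / h, y / h
            for k in range(j, n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            r[i][j] = 0.0                      # exact zero instead of roundoff residue
    return r


def gram(mat):
    """Return mat^T mat, used to check that the rotations are orthogonal."""
    n = len(mat[0])
    return [[sum(row[i] * row[j] for row in mat) for j in range(n)] for i in range(n)]


a = [[2.0, -1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0],
     [1.0, 0.0, 2.0],
     [3.0, 1.0, 1.0]]
r = givens_qr(a)
```

Because every rotation is orthogonal, $R^T R = A^T A$, which gives a cheap consistency check on the computed factor.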

The first experiments of the developed code confirm the theoretical complexities and efficiency. They illustrate the scalability of the proposed algorithm.

This work is done in collaboration with N. Nassif, from the American University of Beirut, Lebanon, in the context of the Sarima project.

A rescaling method has been initially designed for solving evolution problems whose solution was explosive in finite time, in a very efficient way. Its basic idea is to generate, through a change of variables, a sequence of time slices such that the time variable and the solution are restored to zero at the beginning of each slice, and the rescaled solution is controlled by a uniform criterion for ending slices. The sequential implementation of the rescaling method has shown the existence of a relation between the initial values of the successive time slices (ratio phenomenon). Approximating this relation allows the prediction of the initial values in order to start on a parallel-in-time integration through a prediction-correction scheme.

The RaPTI algorithm (*Ratio-based Parallel Time Integration*) starts by running the rescaling method sequentially on a few slices and computing the ratios of the successive initial values, in order to determine the way they are related and to predict the following initial values needed to start the prediction-correction scheme. Unlike other parallel-in-time algorithms, RaPTI does not involve any sequential computation (except for the first slices) and generates time slices whose sizes vary with the behavior of the solution, ensuring a similarity between the slices.

After having been tested on the reaction-diffusion equation (and published in 2006), the RaPTI algorithm has been adapted to oscillatory evolution problems such as those arising in population dynamics, namely the logistic predator-prey model of Lotka-Volterra (with 2 or 3 species) and the classical endemic SIR model. The experiments again showed a ratio phenomenon leading to good predictions, good corrections and fast convergence, and were the subject of an article submitted in 2007.

The encouraging results on oscillatory problems then led us to try to adapt the RaPTI algorithm to systems of differential equations governing the motion of a satellite (such a problem has already been studied using a predictor-corrector multiple shooting scheme based on Newton-type iterations). Applied to the computation of satellite trajectories and tested on a $J_2$-perturbed motion, RaPTI has not shown fast convergence so far. This is probably due to the large interval of computation (leading to a large number of slices) needed in such a problem.

An improvement of the prediction scheme seems to be necessary and might start from the observation that, since they obey the same stopping criterion, the end-of-slice values and their ratios must somehow be related. Finding relevant models for the behavior of the considered ratios (through statistical techniques based on the observed monotonicity of the evolution of the ratios on the first sequentially computed slices), instead of settling for their quasi-stabilization, might exhibit a better ratio phenomenon leading to a more consistent predictor-corrector scheme. The resulting computational models will first be validated on a $J_2$-perturbed motion of the satellite and then on more complicated perturbations.

This work is done in the context of the IFREMER contract.

We consider a new seismic inverse problem; we seek to invert seismic traveltime measures from an available seismic survey to find the wave velocity underground. Our inversion method is based on ray tracing and genetic algorithms.

We studied seismic ray theory, which may be applied to wave propagation problems under some approximations. We developed a code (EIKOLIN) which is more suitable for our application than an existing code available on the Internet (ANRAY). EIKOLIN is computationally inexpensive because only linear interpolations are used (instead of cubic B-splines for ANRAY). This feature makes it possible to use genetic algorithms, which require a great number of simulations.

We have proposed to estimate the density and the anisotropic elasticity coefficients of a solid medium in contact with a fluid medium by measuring the variation of the pressure in the fluid while propagating an elastic/acoustic wave. This problem is formulated as an inverse problem where the coefficients of the elastodynamics equation coupled with acoustic wave equation are to be estimated. Numerical simulations of such coupled system are done using a special finite element procedure that is considered to be not demanding in computations and thus allowing the use of a stochastic optimization method to solve the inverse problem.

The stochastic optimization method considered is SPSA and we have studied the efficiency of its parallelization in estimating the density and the elasticity of an orthorhombic solid medium. The sensitivity of the mechanical parameters with respect to the noise in the input data has been quantified using the singular values of the Jacobian around the solution of the inverse problem.

Publications , and are related to the first part of our work whereas publication is related to the second part.

This work is done in collaboration with A. Beaudoin, from the University of Le Havre, and J.-R. de Dreuzy, from the department of Geosciences at the University of Rennes 1.

We have compared various methods for solving large sparse symmetric positive definite linear systems arising in flow computations in 2D porous media. We have developed interfaces to several parallel libraries, such that the distributed matrix and right-hand side are input data to the solver. Parallel computations inside the solver may redistribute the matrix, but the output solution is distributed in the same way as the right-hand side. The velocity field is then computed on each processor from the hydraulic head returned by the solver. We require a complexity that is linear in the matrix size $N$, a solver that is robust for heterogeneous permeability fields, and parallel scalability. We observe that direct solvers are robust and scalable but have a complexity in $O(N^{1.5})$. Preconditioned Conjugate Gradient methods also have a poor complexity. Domain decomposition methods seem very promising, so we are developing such methods. Multigrid methods have a linear complexity, but geometric multigrid is too sensitive to heterogeneity. In contrast, algebraic multigrid is robust but requires more memory than geometric multigrid and is less efficient for moderate heterogeneity. This work has been submitted to journals and presented at several conferences.

This work is done in collaboration with A. Beaudoin, from the University of Le Havre, and J.-R. de Dreuzy, from the department of Geosciences at the University of Rennes 1.

We have developed a parallel algorithm for advection-diffusion modeling and have analyzed its performance on clusters of processors. Transport is simulated by a particle tracker algorithm, which is well suited for pure advection and advection-dominated transport processes because it does not introduce spurious numerical diffusion. Our parallel algorithm takes advantage of a subdomain decomposition and of the absence of interaction between the particles. This hybrid parallel strategy is quite original, since most previous work uses either one or the other. Our numerical experiments show that our parallel particle tracker is quite robust and efficient.
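The principle of a particle tracker for advection-diffusion can be sketched in one dimension with a uniform velocity: each particle is advected deterministically and receives an independent Gaussian displacement that models diffusion. This is a sequential toy model under assumptions made here (uniform velocity and diffusion coefficient, explicit time stepping, the function name `track_particles`); it does not reproduce the subdomain decomposition or the heterogeneous velocity field of the actual code.

```python
import math
import random


def track_particles(n_particles, velocity, diffusion, dt, n_steps, seed=0):
    """Random-walk particle tracking for 1D advection-diffusion.

    Each step moves a particle by velocity*dt plus a Gaussian jump of
    standard deviation sqrt(2*diffusion*dt). Particles never interact,
    so the outer loop could be split across processors.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * diffusion * dt)
    positions = []
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += velocity * dt + sigma * rng.gauss(0.0, 1.0)
        positions.append(x)
    return positions


positions = track_particles(n_particles=2000, velocity=1.0, diffusion=0.01,
                            dt=0.1, n_steps=100)
mean = sum(positions) / len(positions)
variance = sum((x - mean) ** 2 for x in positions) / len(positions)

# With zero diffusion the particles follow the flow exactly: no numerical diffusion.
advected = track_particles(n_particles=5, velocity=1.0, diffusion=0.0,
                           dt=0.1, n_steps=100)
```

After a total time $t$, the particle cloud is expected to be centered near $v t$ with variance close to $2 D t$, and in the pure advection case all particles end at the same position, which is the property that makes the method attractive for advection-dominated transport.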

This work is done in collaboration with the Dart team, at INRIA Futurs-Lille.

We have developed a framework for shared memory architectures that makes the design of parallel applications easier. We use the Model-Driven Engineering (MDE) approach and integrate new metamodels into the software Gaspard, developed by Dart, for each step of the design flow. The targeted model is an OpenMP metamodel, from which we immediately derive source code in Fortran/OpenMP or C/OpenMP. This model-based approach allows better reuse and gives a clearer, more hierarchical view of the application, so that it can better fit the architecture.

Using this OpenMP/Fortran code generation chain, we have tested the generated code on a typical operation in scientific computing: matrix multiplication. We have compared the generated code with the optimized BLAS library function. Several algorithms have been generated: row-column multiplication, block multiplication, and block multiplication using the optimized BLAS function for the sequential part. These algorithms have been compared with the sequential and parallel BLAS functions. The results show that the best way to use Gaspard in High Performance Computing is to let Gaspard manage the parallelism and to use an optimized function for the sequential part. This work has been accepted for publication in the proceedings of the 16th Euromicro International Conference on Parallel, Distributed and network-based Processing, Toulouse, February 2008.
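
The difference between the row-column and block variants can be sketched as follows. This is plain sequential Python written for illustration only, not code generated by Gaspard; the function names and the block size are assumptions. In a generated OpenMP code, the loops over blocks would be the natural place for the parallelism, and the innermost block product is the part that an optimized BLAS routine would replace.

```python
def matmul_row_column(a, b):
    """Classical row-column product of two matrices stored as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


def matmul_blocked(a, b, block=2):
    """Same product computed block by block (better data locality)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i0 in range(0, rows, block):          # loops over blocks: candidates
        for j0 in range(0, cols, block):      # for parallel execution
            for k0 in range(0, inner, block):
                # product of one block of a with one block of b
                for i in range(i0, min(i0 + block, rows)):
                    for k in range(k0, min(k0 + block, inner)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + block, cols)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_row_column(a, b))
    print(matmul_blocked(a, b))
```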

This work is done in collaboration with M. Moakher, from ENIT, Tunisia, in the context of the Sarima project .

The geoid is the level surface of the Earth's gravity field at sea level. That surface is obtained as a correction of a regular surface by fitting existing measurements. The problem leads to a large structured generalized least squares problem. Therefore, we plan to apply our algorithms for QR factorizations ( ). During the year, the Matlab processing chain developed in 2006 has been tested and improved.

The main research direction on which we now focus is the determination of an equivalent mass system that can generate a given geoid. The mathematical definition of the problem is now written: it is expressed as a non-linear least squares problem in the Hilbert space of harmonic functions. The first experiments are now running.

Mohamad Muhieddine began his PhD thesis in October 2006 on the subject "Numerical simulations of prehistoric fires", co-advised by Ramiro March (ArcheoSciences, Rennes). This project takes place in the archeological/human sciences program "Man and fire: towards a comprehension of the evolution of thermal energy control and its technical, cultural and paleo-environmental consequences". Both physical and numerical approaches are used to understand the functioning mode and the thermal history of the studied structures.

A first part of the work concerns the simulation of forced evaporation of water in a saturated soil. A simplified model written in one dimension uses the finite volume method and takes into account the phase change via accumulation of latent heat. The numerical solutions are obtained by using explicit schemes; this work has been presented at the majecSTIC conference ( ).
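
The idea of accounting for phase change through accumulation of latent heat can be sketched as below. This is a minimal illustration under strong assumptions, not the model presented at the conference: the column is heated at a fixed temperature on one side and insulated on the other, all physical values are made up, and the names are hypothetical. In each cell, any energy that would raise the temperature above the vaporization temperature is stored as latent heat until the store of that cell is full.

```python
def simulate_heating(n_cells=20, n_steps=2000, length=0.1, conductivity=1.0,
                     heat_capacity=2.0e6, latent=4.5e8,
                     t_init=20.0, t_hot=300.0, t_vap=100.0):
    """Explicit 1D finite-volume heat conduction with latent heat accumulation.

    heat_capacity and latent are volumetric (J/m^3/K and J/m^3).
    Returns temperatures, stored latent heat, energy entered per unit area, dx.
    """
    dx = length / n_cells
    diffusivity = conductivity / heat_capacity
    # Explicit scheme: a ratio of 0.3 keeps every update a convex combination,
    # including in the boundary cell with its half-cell distance to the wall.
    dt = 0.3 * dx * dx / diffusivity
    temp = [t_init] * n_cells
    stored = [0.0] * n_cells
    energy_in = 0.0
    for _ in range(n_steps):
        flux = [0.0] * (n_cells + 1)   # flux[i]: heat flux entering cell i from the left
        flux[0] = conductivity * (t_hot - temp[0]) / (0.5 * dx)
        for i in range(1, n_cells):
            flux[i] = conductivity * (temp[i - 1] - temp[i]) / dx
        # flux[n_cells] stays at zero: insulated far end
        energy_in += flux[0] * dt
        for i in range(n_cells):
            t_new = temp[i] + dt * (flux[i] - flux[i + 1]) / (heat_capacity * dx)
            if t_new > t_vap and stored[i] < latent:
                # divert the excess energy into the latent heat store
                stored[i] += heat_capacity * (t_new - t_vap)
                t_new = t_vap
                if stored[i] > latent:
                    # store is full: give the surplus back as sensible heat
                    t_new += (stored[i] - latent) / heat_capacity
                    stored[i] = latent
            temp[i] = t_new
    return temp, stored, energy_in, dx


if __name__ == "__main__":
    temp, stored, energy_in, dx = simulate_heating()
    print([round(t, 1) for t in temp])
```

In such a scheme a cell stays at the vaporization temperature while its store fills up, so the temperature history of neighbouring cells advances in steps whose size depends on the cell width.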

The results exhibit time fluctuations of temperature (see figure ) which come from the finiteness of the cells. To reduce these fluctuations, a classical way is to refine the mesh locally around the phase-change front. We have designed an algorithm to follow the moving front, based on recursive refinement of a fixed, uniform basic cells mesh, via an insert/delete nodes procedure (see figure ). This second work has been accepted at the international symposium FVCA5, to be held in Aussois, France, in June 2008.

This work is done in collaboration with M. Al Ghoul and N. Nassif, from the American University of Beirut, Lebanon, in the context of the Sarima project ( ). See section .

The project consists of setting forth a theoretical and numerical model describing the transport and chemical reaction processes taking place in a geological self-organizing system, as an attempt to simulate the observed banding. The coupling of transport (diffusion and flow velocity) to chemical reactions (here precipitation) causes the deposition of the mineral in the form of bands resembling those of a Liesegang pattern. In 2D, concentric rings of the precipitate originating from a central diffusion source are expected to form. The global system of Partial Differential Equations is discretised by a Finite Difference method and by a Finite Element method. The discrete system of ODE is solved by an implicit time scheme adapted to stiff systems. Sparse linear systems arising in each time step are solved by a direct method. We are now comparing the two spatial discretizations and we are validating the results.
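
The combination of an implicit time scheme with a direct linear solve at each step can be illustrated on the diffusion part alone. The sketch below is not the project's discretisation: it assumes 1D diffusion with no-flux boundaries, uses backward Euler as the simplest implicit scheme, and solves the tridiagonal system with the Thomas algorithm; all names are hypothetical. The parameter `r` stands for `D * dt / dx**2`.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (direct solve). lower[i] multiplies x[i-1] in row i,
    upper[i] multiplies x[i+1]; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def backward_euler_step(u, r):
    """One implicit diffusion step: solve (I + r * L) u_new = u, where L is
    the 1D finite-difference Laplacian with no-flux boundaries."""
    n = len(u)
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    diag[0] = diag[-1] = 1.0 + r
    return solve_tridiagonal(lower, diag, upper, u)


if __name__ == "__main__":
    u = [0.0, 0.0, 1.0, 0.0, 0.0]
    # r = 50 is far beyond the explicit stability limit of 0.5
    print(backward_euler_step(u, 50.0))
```

The implicit step stays bounded and conserves the total mass even for a very large `r`, which is the property that makes implicit schemes attractive for stiff systems.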

This work was done in the context of the MOMAS GdR ( ) and in the context of the Andra contract ( ).

Reactive transport models are complex nonlinear PDEs, coupling the transport engine with the reaction operator. We consider here chemical reactions at equilibrium. We have pursued our work on a global approach, based on a PDAE (Partial Differential Algebraic Equations) framework, a method of lines and a DAE solver. We have developed a prototype in C, for 1D domains, with our global approach and the classical sequential iterative approach. We have run experiments with several test cases, in order to validate our approach and to compare it with the classical one. One of the test cases is the benchmark proposed by Momas and others are used by Andra to qualify the platform Alliances. Publications related to this work are , , .

This work is done in collaboration with A. Beaudoin, from the University of Le Havre, and J.-R. de Dreuzy, from the department of Geosciences at the University of Rennes 1.

We have pursued our work on simulating flow and solute transport in a 2D rectangle where the permeability field is highly heterogeneous. We have developed a fully parallel package, called Paradis, which is now embedded in the scientific platform Hydrolab, described in section . Sections and describe the parallel algorithms. To obtain a well-defined asymptotic regime, we have used very large computational domains and run our simulations on a cluster. We could compute unambiguously the longitudinal and transverse dispersion coefficients for large heterogeneities.

We are now developing a package for 3D domains. The challenge is to design efficient parallel and grid algorithms in order to increase the computational size by several orders of magnitude.

This work is done in collaboration with J.-R. de Dreuzy, from the department of Geosciences at the University of Rennes 1.

We have pursued our work on simulating flow in a 3D network of interconnected plane fractures. We have developed a package, which is now embedded in the scientific platform Hydrolab, described in section . We have run Monte Carlo simulations (see section ) for various parameters, in order to validate our algorithms thoroughly. We are currently investigating efficient linear solvers and we plan to develop a parallel version.

In hydrogeology, the description of underground properties is very poor, mainly because of their complex heterogeneity and the lack of measurements. As a consequence, we rely on stochastic models of geometrical and physical properties. In our models, we generate the rock fractures or the permeability field randomly from known probability laws. Because a simulation is just one possible realization of the phenomenon, a single result is not representative of reality; we have to quantify this uncertainty. To achieve this, we have developed within the Hydrolab platform a Monte Carlo method in which we perform a large number of simulations and compute the first statistical moments of the results. Our Monte Carlo method is fully generic: it is based on virtual and generic classes, so that it can be used for any random-based application. We are currently developing a parallel and grid-adaptable version. We also plan to study and implement other uncertainty quantification (UQ) methods to compute more statistical moments with fewer simulations.
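
A generic Monte Carlo driver of this kind can be sketched as follows. The class names and the toy simulation are hypothetical and do not reproduce the Hydrolab classes; the sketch only shows the design idea of an abstract simulation interface (the Python counterpart of a virtual class) and a driver that accumulates the mean and variance with Welford's one-pass update.

```python
import math
import random
from abc import ABC, abstractmethod


class RandomSimulation(ABC):
    """Interface that any random-based application implements (hypothetical)."""

    @abstractmethod
    def run(self, rng):
        """Run one realization with the given generator and return a number."""


class SeriesPermeability(RandomSimulation):
    """Toy application: equivalent permeability of log-normal cells in series."""

    def __init__(self, n_cells, sigma):
        self.n_cells = n_cells
        self.sigma = sigma

    def run(self, rng):
        inverse_sum = sum(math.exp(-self.sigma * rng.gauss(0.0, 1.0))
                          for _ in range(self.n_cells))
        return self.n_cells / inverse_sum   # harmonic mean of the cell values


class MonteCarlo:
    """Runs any RandomSimulation repeatedly and estimates the first moments."""

    def __init__(self, simulation, seed=0):
        self.simulation = simulation
        self.rng = random.Random(seed)

    def estimate(self, n_runs):
        mean = 0.0
        m2 = 0.0   # running sum of squared deviations (Welford's update)
        for count in range(1, n_runs + 1):
            value = self.simulation.run(self.rng)
            delta = value - mean
            mean += delta / count
            m2 += delta * (value - mean)
        variance = m2 / (n_runs - 1) if n_runs > 1 else 0.0
        return mean, variance


if __name__ == "__main__":
    print(MonteCarlo(SeriesPermeability(10, 1.0), seed=42).estimate(1000))
```

The driver never looks inside the simulation, so adding a new application only requires a new subclass with a `run` method.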

Because we run a huge number of simulations within the Hydrolab platform, we obtain a huge amount of results. In addition, some simulations are very costly in terms of computational time: several hours on a cluster. Therefore, there was a need for efficient storage of these results. We have designed and implemented a generic MySQL relational database, in which we can store simulations (parameters, results and metadata) for all Hydrolab applications. Together with this database, we have developed a web portal to access it easily: the portal makes it possible to store simulations and to consult them from a user-friendly web interface. We have also added other services for Hydrolab users to this web portal: an interface to generate parameter files, and an interface to launch remote computations. The web portal is the basis for developing further web-accessible services for Hydrolab users. See section .
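
A generic layout for storing simulations with their parameters, results and metadata might look like the sketch below. It uses SQLite from the Python standard library in place of MySQL so that it runs without a database server, and every table, column and function name is a hypothetical choice made for this example, not the actual Hydrolab schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE simulation (
    id INTEGER PRIMARY KEY,
    application TEXT NOT NULL,       -- which application produced the run
    started_at TEXT NOT NULL         -- metadata: start date as ISO text
);
CREATE TABLE parameter (
    simulation_id INTEGER NOT NULL REFERENCES simulation(id),
    name TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (simulation_id, name)
);
CREATE TABLE result (
    simulation_id INTEGER NOT NULL REFERENCES simulation(id),
    name TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (simulation_id, name)
);
"""


def open_database(path=":memory:"):
    """Create the tables in a new database (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_simulation(conn, application, started_at, parameters, results):
    """Insert one simulation with its named parameters and results."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO simulation (application, started_at) VALUES (?, ?)",
            (application, started_at))
        sim_id = cur.lastrowid
        conn.executemany("INSERT INTO parameter VALUES (?, ?, ?)",
                         [(sim_id, k, v) for k, v in parameters.items()])
        conn.executemany("INSERT INTO result VALUES (?, ?, ?)",
                         [(sim_id, k, v) for k, v in results.items()])
    return sim_id


def find_results(conn, application, result_name):
    """Return (simulation id, value) pairs for one result of one application."""
    return conn.execute(
        "SELECT s.id, r.value FROM simulation s "
        "JOIN result r ON r.simulation_id = s.id "
        "WHERE s.application = ? AND r.name = ? ORDER BY s.id",
        (application, result_name)).fetchall()


if __name__ == "__main__":
    conn = open_database()
    store_simulation(conn, "demo", "2007-11-01", {"sigma": 1.0}, {"mean_head": 0.5})
    print(find_results(conn, "demo", "mean_head"))
```

Storing parameters and results as name/value rows keeps the schema independent of any particular application, at the cost of slightly more complex queries.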

Time: July 2007

This contract was a consulting action for the company Safe technology. The work was based on a network simulator defined by Safe technology, modeling immiscible two-phase flow in a discrete network of cells. We proposed to modify the boundary conditions, in order to get a more physical model and a well-posed mathematical problem. We also proposed a numerical algorithm to solve the nonlinear equations involved in the model.

Projet Exploratoire Pluridisciplinaire

Time: 2 years from June 2007.

Partners: Sage team, Dart team (INRIA Futurs-Lille), L2EP (Lille)

Title: Co-modeling and model engineering for high performance computing in electromagnetism software

The goal of this project is to use the power of new software engineering tools to design a new version of the electromagnetism solver CARMEL. This multidisciplinary project emphasizes the necessary collaboration between scientists from many different fields (Physics, Applied Mathematics, High Performance Computing, Software Engineering). The objective is to deliver a model of CARMEL using the Gaspard metamodel. Then, with the Gaspard2 development environment, we plan to generate parallel software to solve the Maxwell equations.

Contract with Andra

Time: three years from October 2005.

This contract is related to C. de Dieuleveult's PhD thesis. The subject is reactive transport, with application to nuclear waste disposal.

See
http://

The working group MOMAS includes many partners from CNRS, INRIA, universities, CEA, ANDRA, EDF and BRGM. It covers many subjects related to mathematical modeling and numerical simulations for nuclear waste disposal problems. We participate in the project entitled “Numerical and mathematical methods for reactive transport in porous media”.

ACI GRID program GRID'5000, two projects entitled Grille Rennes

Time: three years from 2004.

Coordinator: Y. Jégou, Paris team, INRIA-Rennes.

Our parallel algorithms were tested on the clusters of the Grid at Rennes.

IFREMER contracts, No 06/2 210 099 and 06/2 210 341

Partners: Irisa, IFREMER

Title: Numerical model for the propagation of elastic waves

Time: from January 2007 until November 2007.

This work is done in the context of the “Contrat de Plan État Région Bretagne 2000-2006” (signed in October 2002 – it contains five parts and is spread out over the period 2002-2007), for the development of new geophysical exploration means.

The first objective of this study was to develop software simulating the propagation of elastic waves in seawater and in the underwater geophysical layers. We have used the code FLUSOL from the INRIA team ONDES. The second objective was to study inverse methods for finding the properties of the ground layers from acoustic measurements recorded near the sea surface by a ship. The reflection of the wave at each interface allows us to collect information in the fluid and, via appropriate numerical methods, to determine the density and the elasticity of each layer of the solid submarine soil.
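
To see why reflections carry information about the layers, consider the textbook relation for a plane wave at normal incidence: the amplitude reflection coefficient at an interface is R = (Z2 - Z1) / (Z2 + Z1), where Z = density x wave speed is the acoustic impedance of each medium. The sketch below applies this relation and its inverse layer by layer. It is only an illustration of the principle, not the inverse method studied in the contract, and the numerical values are made-up orders of magnitude.

```python
def impedance(density, wave_speed):
    """Acoustic impedance Z = density * wave speed."""
    return density * wave_speed


def reflection_coefficient(z_upper, z_lower):
    """Normal-incidence amplitude reflection coefficient at an interface."""
    return (z_lower - z_upper) / (z_lower + z_upper)


def impedance_below(z_upper, reflection):
    """Invert the relation above: impedance of the layer under the interface."""
    return z_upper * (1.0 + reflection) / (1.0 - reflection)


def recover_impedances(z_top, reflections):
    """Recover successive layer impedances from the top impedance and the
    reflection coefficient of each interface, from top to bottom."""
    layers = [z_top]
    for r in reflections:
        layers.append(impedance_below(layers[-1], r))
    return layers


if __name__ == "__main__":
    water = impedance(1025.0, 1500.0)      # illustrative values for seawater
    sediment = impedance(1800.0, 1700.0)   # illustrative values for a sediment
    r = reflection_coefficient(water, sediment)
    print(r, recover_impedances(water, [r]))
```

The impedance only gives the product of density and wave speed; separating the two requires additional information, which is one reason why more elaborate inverse methods are needed in practice.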

Contract with Ecole Centrale de Lyon

Time: three years from May 2006 (actually started in 2007).

Title: Conception Interactive par simulation Numérique des Ecoulements couplées à des Méthodes d'optimisation par Algorithmes Spécifiques.

This work is done in the context of the Région Rhône-Alpes initiative called Rhône-Alpes Automotive CLUSTER, and the competitiveness cluster called Lyon Urban Truck and Bus (LUTB). The global objective is to design a new methodology in CFD to drastically reduce the computational time of an optimization process. The partners FLUOREM and LMFA have developed the software Turb-Opty, based on parametrization. The main role of the Sage team is to study sparse linear solvers applied to the CFD systems arising in Turb-Opty applications. A first step has been taken by using direct multifrontal solvers on systems of moderate size.

Contract with ANR, program RNTL

Time: three years from October 2007.

Title: Large Information Base for the Research in AEROdynamics.

Coordinator: FLUOREM, Lyon.

Partners: LMFA, Ecole Centrale de Lyon; CDCSP, University of Lyon; Sage team.

This work is done in the context of the CINEMAS2 project, described above. The main objective for the team Sage is to design efficient algorithms adapted to industrial configurations using the Turb-Opty software developed by Fluorem and LMFA. The challenge is to solve many linear systems of large size.

This ERCIM Working Group was created in 2007 and follows the past ERCIM WG entitled “Matrix Computations and Statistics”, created in 2001. The Sage team is involved in the specialized group named “Matrix computations and Statistics” and Bernard Philippe is co-chair of this track. It is concerned with research topics emerging from statistical applications that involve linear algebra methods, optimization and parallel computing. The track is especially concerned with very large problems, which require the design of reliable and fast numerical procedures. The solution of large-scale linear systems of equations using High Performance Computing is addressed.

http://

Petko Yanev was granted an ERCIM postdoctoral fellowship from October 2006 to June 2007, see section .

Also, Cristian Gatu visited the Sage team in the context of this ERCIM working group (see section ) and presented his work at the kick-off meeting , .

ERCIM Working Group, started in 2001.

**Chairman:** Mario Arioli, RAL.

http://

The Working Group is intended to be a forum within ERCIM Institutional Organizations in which a cross fertilization between numerical techniques used in different fields of scientific computing might take place. Thus, the Working Group intends to focus on this underpinning theme of computational and numerical mathematics. In this way, the intention is that any resulting numerical algorithm will achieve wider applicability, greater robustness, and better accuracy.

PECO-NEI Network for Education-Research with Eastern Europe Countries,

Title: Efficient sparse rank revealing QR factorization for solving least squares problems.

Time: 2006 - 2009

Coordination: INRIA-Futurs team Grand Large (Laura Grigori),

Partners: Politehnica University of Bucharest (Bogdan Dumitrescu), Slovak Academy of Sciences (Gabriel Oksa)

This project aims at developing efficient algorithms for performing the rank-revealing QR factorization of sparse and dense matrices. In particular, the algorithms target matrices arising in geodesy applications. Funding is provided for visits and for a student co-supervised by INRIA Futurs and Politehnica University of Bucharest, Romania.

Title: Numerical simulations in hydrogeology

This 3-year project includes six partners from Rabat (Morocco), Annaba (Algeria), Tunis (Tunisia), Naples (Italy), Barcelona (Spain) and Rennes. R. Aboulaïch (LERMA, Rabat) chairs the activity with B. Philippe.

The project deals with the numerical simulation of the groundwater flows and the transport of pollutants. The goal consists in organizing a network of teams which gathers expertise for the whole spectrum of the domain from the physical models, the mathematical methods, the numerical algorithms, the codes. The activity will be organized through a list of cooperative actions which will be defined during the first year. The network will be a training tool for each involved team. The success of the approach should be materialized after three years, by the availability of some common codes, by publications and by the ability to access and use computing grids.

Following a first workshop held in 2006, a list of six tasks was defined on topics including modelling, numerical approximations and inverse problems. The SAGE team is involved in these tasks. M. Ziani's thesis contributes to the theme that focuses on Non Linear Solvers (section ).

SARIMA project Inria/Ministry of Foreign Affairs

Support to Research Activities in Mathematics and Computer Science in Africa

**Partner** : CIMPA (International Center for Pure and Applied Mathematics)

**Duration** : 2004-2008,

**Website** :
http://

The project SARIMA is managed by the ministry of Foreign Affairs. It involves INRIA and CIMPA as financial operators.

The aim of the project is to reinforce the cooperation between French research teams and African and Middle-East ones in mathematics and computer science. The strategy consists in reinforcing existing research teams so that they become true poles of excellence for their topic and their region. A network-based organization should strengthen the individual situation of the groups. From the CARI experience (African Conference on Research in Computer Science) and the CIMPA's experience (International Center for Pure and Applied Mathematics), the initial network includes seven teams (five teams in French speaking sub-Saharan countries, two teams acting for the whole Maghreb, one in Tunisia in Applied maths and one in Algeria in Computer Science, and one team in Lebanon).

The activity of the network is managed by the SARIMA GIS (Groupement d'Intérêt Scientifique). In this project, INRIA is responsible for all the visits of African researchers to research groups in France. In 2006, more than 120 researchers (PhD students and researchers) were funded to visit France for stays of one to six months.

B. Philippe is the coordinator of the project for INRIA and the president of the Groupement d'Intérêt Scientifique which manages the project.

Five PhD students are entirely or partially supported by the project:

co-advised by Maher Moakher (ENIT, Tunisia) and B. Philippe. Signed agreement between the universities of Tunis El Manar and Rennes 1 (see section ).

co-advised by Emmanuel Kamgnia (University of Yaounde I, Cameroon) and B. Philippe. Signed agreement between the universities of Yaounde I and Rennes 1 (see sections and ).

co-advised by Rajae Aboulaïch (LERMA, Morocco) and F. Guyomarc'h (under B. Philippe's signature) (see sections and ).

co-advised by Mazen El Ghoul and Nabil Nassif (American University of Beirut, Lebanon) and J. Erhel (see section ).

co-advised by Nabil Nassif (American University of Beirut, Lebanon) and J. Erhel (see section ).

E. Kamgnia visited the Sage team in the context of Sarima, see and .

J. Erhel organized a one-day scientific meeting, in July, with Anne-Marie Treguier, from LPO, Brest; participants were members of Sage team and Ipso team, members of LPO and LEMAR, Brest.

É. Bresciani organized a Hydrolab workshop in November. Most of the Sage members participated in this workshop.

B. Philippe was member of the organizing committee of the Conference in Honor of Claude Lobry (Sep. 10-14, St Louis, Senegal).

B. Philippe is one of the four chief-editors of the electronic journal ARIMA.

B. Philippe is co-editor with E. Kontoghiorghes of the 2nd Special issue on Numerical Algorithms, Parallelism and Applications, Applied Numerical Mathematics, Volume 57, Issues 11-12, November-December 2007.

É. Canot is member of the CUMI (Commission des Utilisateurs de Moyens Informatiques), of INRIA-Rennes, from September 2007.

J. Erhel is member and secretary of the Comité de Gestion Local of AGOS at INRIA-Rennes.

J. Erhel is member of Comité Technique Paritaire of INRIA.

J. Erhel is member of commission de spécialistes, section 27, of the University of Rennes 1.

F. Guyomarc'h was member of the CUMI until September 2007.

F. Guyomarc'h is member of commission de spécialistes, section 27, of the University of Rennes 1.

F. Guyomarc'h was responsible for the first year of the DIIC (Diplôme d'Ingénieur en Informatique et Communication) and is a member of the working group for updating the academic plans, until August 2007.

In the International Affairs Department of INRIA, B. Philippe is in charge of the cooperating programmes with scientific teams in Africa and Middle-East countries.

B. Philippe was the INRIA representative at the conference “Convergences Mathématiques Franco-Maghrébines” (Jan. 22-24, Nice).

B. Philippe is the INRIA representative at the Conseil d'Administration of CIMPA.

B. Philippe is the INRIA coordinator for the SARIMA project (see ).

B. Philippe is the corresponding person for the agreement between the University of Rennes 1, the University of Reims, the Lebanese University (Lebanon) and AUF (Agence Universitaire Francophone), which supports a Master.

É. Canot, C. de Dieuleveult, J. Erhel and S. Zein taught Applied Mathematics (MAP) for DIIC, IFSIC, Rennes 1 (second year). Lecture notes on
http://

J. Erhel gave a one-week course in January on Methods for Solving Large Systems, in Beirut, Lebanon (Master of Simulation, co-organized by the Lebanese University, IRISA and the University of Reims). Lecture notes on
http://

F. Guyomarc'h gave lectures on algorithms (ALG2) for Master (M2-CCI), IFSIC, University of Rennes 1.

F. Guyomarc'h gave lectures on object-oriented programming (PROG2) for Master (M2-CCI), IFSIC, University of Rennes 1 and also in the first year of DIIC.

N. Makhoul-Karam taught at the American University of Beirut, giving lectures on Elementary Linear Algebra (42 hours), and problem-solving sessions on Discrete Mathematics (84 hours), Calculus (42 hours) and Ordinary Differential Equations (42 hours).

B. Philippe gave a 3-hour tutorial on eigenvalue solvers at the Collège Polytechnique in Paris (January) during the session “Méthodes performantes en algèbre linéaire pour la résolution de systèmes linéaires et le calcul de valeurs propres”.

É. Canot: presentation of a seminar ("Modélisation et calcul des phénomènes physiques") at the archeological site of Pincevent, Montereau, France, July.

J. Erhel participated in the operation "à la découverte de la recherche", organized in the area of Rennes. She visited three high schools in Rennes and Vitré, where she gave a talk entitled *Comprendre l'écoulement de l'eau dans les roches grâce à l'informatique* (Understanding water flow in rocks through computer science). She also discussed the research profession with the pupils.

F. Guyomarc'h gave 3 lectures about solving linear and non linear systems of equations at a seminar of the L2EP (University of Lille).

F. Guyomarc'h, with L. Grigori, gave a lecture about linear algebra for high performance computing at the Faculty of Automatic Control and Computers (University POLITEHNICA of Bucharest). See section .

C. de Dieuleveult: participation in training on the chemistry solver PHREEQC-2, Amsterdam, March.

C. de Dieuleveult: participation in training on scientific writing, Irisa, October.

M. Muhieddine: participation in the excavation in the archeological site of Pincevent, Pincevent, one month, June-July.

J. Erhel: participation in training on project management, Irisa, December.

É. Canot and S. Zein: participation and contribution in sixth international conference Aplimat, Bratislava, Slovakia, February.

J. Erhel: participation and organization of a mini-symposium in the SIAM conference on Geosciences, Santa Fe, USA, March.

M. Ziani: participation and contribution in the 3ème colloque sur les Tendances dans les Applications Mathématiques en Tunisie, Algérie, Maroc (TAMTAM'07), Alger, Algeria, April.

B. Philippe: participation and organization of the workshop “Tools and Methods in North/South Partnership for Research”, in the IST-Africa 2007 Conference, Maputo, Mozambique, May.

J. Erhel: participation and invited plenary talk, in PARCFD conference, Antalya, Turkey, May.

B. Philippe: participation and invited plenary lecture, in the 6th International Conference on "Large-Scale Scientific Computations", Sozopol, Bulgaria, June.

C. de Dieuleveult: participation and contribution in ANDRA day of PhD students, Chatenay-Malabry, June.

É. Bresciani, C. de Dieuleveult, J. Erhel, N. Makhoul, B. Philippe and S. Zein: participation in the Scicade conference, St-Malo, July; contribution by C. de Dieuleveult.

J. Erhel and N. Makhoul: participation in the Europar conference, Rennes, August; contribution by J. Erhel.

G. Atenekeng Kahou: participation and contribution in the PPAM'2007 conference, CTPSM07 Workshop, Gdansk, Poland, September.

M. Muhieddine: Participation and contribution in the conference Majecstic, Caen, October.

B. Philippe: participation in the conference in Honor of Claude Lobry, St Louis, Senegal, September.

B. Philippe: participation and invited lecture, in the Journées Modélisation et Applications Thématiques, Dakar, Senegal, October.

J. Erhel and B. Philippe: participation and invited lectures, in M2A72 conference, Marseille, October.

C. de Dieuleveult and J. Erhel: participation and contribution in the MoMas symposium, Fréjus, November.

B. Philippe: participation and invited lecture, in the conference Congrès de Méthodes Numériques et de Modélisation, Tunis, Tunisia, December.

B. Philippe visited the CS Department of Purdue University, USA and collaborated with Ahmed Sameh, 2 weeks, October.

The team has invited the following persons:

C. Gatu, University of Neuchatel, Switzerland, 2 months, March-April.

M. Sosonkina, Iowa State University, USA, 2 weeks, May.

R.B. Sidje, University of Queensland, Australia, 1 week, July.

A. Griewank, Humboldt-Universität zu Berlin, Germany, 2 months, July-September.

N. Nassif, American University of Beirut, Lebanon, 1 week, September.

E. Kamgnia, University of Yaounde, Cameroon, 2 months, September-October.