Section: New Results
Efficient algorithms for load balancing and code coupling in complex simulations
Dynamic load balancing for massively parallel coupled codes
As a preliminary step toward the dynamic load balancing of coupled
codes, we focus on the problem of dynamic load balancing of a single
parallel code with a variable number of processors. Indeed, if the
workload varies drastically during the simulation, the load must be
redistributed regularly among the processors. Dynamic load balancing
is a well-studied subject, but most studies are limited to an initially
fixed number of processors. Adjusting the number of processors at
runtime makes it possible to preserve the efficiency of the parallel
code or to keep the simulation running when the current memory
resources are exceeded. We call this problem MxN graph repartitioning.
We propose some methods based on graph repartitioning in order to
rebalance the load while changing the number of processors. These
methods are split into two main steps. First, we study the migration
phase and build a “good” migration matrix that minimizes several
metrics, such as the migration volume or the number of exchanged
messages. Second, we use graph partitioning heuristics to compute a
new distribution that optimizes the migration according to the previous
step results. Besides, we propose a direct
This work is carried out as part of Clément Vuchener's PhD, which will be defended in February 2014. These contributions were presented at the international conference ParCo [22] in Munich.
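To make the two migration metrics mentioned above concrete, here is a minimal Python sketch that builds a migration matrix from an old and a new partition and measures the migration volume and the number of exchanged messages. The function names, the unit vertex weights, and the convention that old part i and new part i sit on the same processor are assumptions made for illustration; they are not taken from the methods described here.

```python
from collections import Counter

def migration_matrix(old_part, new_part, weights=None):
    """Map (old part, new part) to the total vertex weight assigned to that pair."""
    if weights is None:
        weights = [1] * len(old_part)  # assumption: unit vertex weights
    matrix = Counter()
    for old, new, w in zip(old_part, new_part, weights):
        matrix[(old, new)] += w
    return matrix

def migration_metrics(matrix):
    """Return the migration volume and the number of exchanged messages."""
    # Assumption: old part i and new part i are hosted by the same processor,
    # so diagonal entries correspond to data that stays in place.
    moved = {pair: w for pair, w in matrix.items() if pair[0] != pair[1]}
    return {"volume": sum(moved.values()), "messages": len(moved)}

# 8 unit-weight vertices on M = 2 processors, repartitioned onto N = 3.
old_part = [0, 0, 0, 0, 1, 1, 1, 1]
new_part = [0, 0, 0, 2, 1, 1, 1, 2]
matrix = migration_matrix(old_part, new_part)
metrics = migration_metrics(matrix)
print(metrics)
```

In this toy case each old processor sends one vertex to the new processor, so two messages carry a total volume of two vertices.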
Regarding the problem of dynamic balancing of parallel coupled codes,
we have started to reuse the results on MxN graph repartitioning. Given two
coupled codes
This work is carried out as part of Maria Predari's PhD, which started in October 2013.
Graph partitioning for hybrid solvers
Nested Dissection was introduced by A. George and is a very popular
heuristic for sparse matrix ordering before numerical factorization. It maximizes
the number of parallel tasks while reducing the fill-in and the operation count.
The basic standard idea is to build a "small separator"
However, a close look at the complexity analysis that yields asymptotic bounds on fill-in and operation count for Nested Dissection ordering shows that the size of the halo of the separated sub-graphs (the set of external vertices that belong to an older separator and were previously ordered) plays a crucial role in the asymptotic behavior achieved. Ideally, the halo vertices should be balanced among the parts.
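The notion of halo can be illustrated with a small Python sketch. It assumes the graph is stored as a dictionary of adjacency sets and defines the halo of a sub-graph as the vertices outside it that are adjacent to it; the helper name `halo` and the 7-vertex path graph are hypothetical choices for illustration only.

```python
def halo(adj, part):
    """Vertices outside `part` that are adjacent to at least one vertex of `part`."""
    return {u for v in part for u in adj[v]} - set(part)

# Path graph 0 - 1 - 2 - 3 - 4 - 5 - 6.
adj = {v: {u for u in (v - 1, v + 1) if 0 <= u < 7} for v in range(7)}

# Level 1: separator {3} splits the path into {0, 1, 2} and {4, 5, 6}.
# Level 2: separator {1} splits {0, 1, 2} into {0} and {2}.
halo_left = halo(adj, {0})   # touches only the new separator {1}
halo_right = halo(adj, {2})  # touches the new separator and the old one {3}
print(len(halo_left), len(halo_right))
```

Although the two sub-graphs {0} and {2} have the same size, their halos do not: the second one also borders the older separator, which is the kind of imbalance discussed above.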
Considering now hybrid methods mixing both direct and iterative solvers, such as HIPS and MaPHyS, obtaining a domain decomposition leading to a good balancing of both the size of domain interiors and the size of interfaces is a key point for load balancing and efficiency in a parallel context. This leads to the same issue: balancing the halo vertices to get balanced interfaces.
For this purpose, we revisit the algorithm introduced by Lipton, Rose and Tarjan, which performs the recursion of nested dissection in a different manner: at each level, we apply the method recursively to the sub-graphs. However, for each sub-graph, we keep track of its halo vertices. We have implemented this in the Scotch framework and have studied its main algorithm for building a separator, called greedy graph growing.
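As a rough illustration of how a separator can be obtained by growing a part from a seed vertex, the Python sketch below uses a simplified breadth-first growth. It is not the Scotch implementation: the function name `grow_separator`, the breadth-first visiting order, and the half-size stopping rule are all assumptions made for this example, and an actual greedy graph growing method would typically choose the next vertex according to a gain criterion.

```python
from collections import deque

def grow_separator(adj, seed):
    """Grow part 0 from `seed` until it holds about half of the vertices.

    The vertices bordering part 0 form the separator; the rest is part 1.
    """
    target = len(adj) // 2
    part0 = set()
    seen = {seed}
    queue = deque([seed])
    while queue and len(part0) < target:
        v = queue.popleft()
        part0.add(v)
        for u in sorted(adj[v]):  # sorted only to make the run deterministic
            if u not in seen:
                seen.add(u)
                queue.append(u)
    # Every neighbor of part 0 that is not in part 0 goes to the separator,
    # so no edge can link part 0 directly to part 1.
    separator = {u for v in part0 for u in adj[v]} - part0
    part1 = set(adj) - part0 - separator
    return part0, separator, part1

# Path graph 0 - 1 - 2 - 3 - 4 - 5 - 6, grown from vertex 0.
adj = {v: {u for u in (v - 1, v + 1) if 0 <= u < 7} for v in range(7)}
part0, separator, part1 = grow_separator(adj, seed=0)
print(part0, separator, part1)
```

A halo-aware variant would additionally take the halo vertices of the sub-graph into account when deciding which part to grow, so that both parts end up bordering a similar number of previously ordered vertices.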
This work is developed in the framework of Astrid Casadei's PhD. These contributions have been presented at the international workshop on Nested Dissection [32] in Waterloo.