The Graal project-team is a joint team of CNRS, ENS Lyon, and INRIA. It is part of the Laboratoire de l'Informatique du Parallélisme (LIP), UMR 5668 (ENS Lyon/CNRS/INRIA/UCBL). The team is located partly at the École normale supérieure de Lyon and partly at the Université Claude Bernard – Lyon 1.
Parallel computing has spread into all fields of applications, from classical simulations of mechanical systems or weather forecasting to databases, video-on-demand servers, and search tools like Google. From the architectural point of view, parallel machines have evolved from large homogeneous machines to clusters of PCs (sometimes with boards of several processors sharing a common memory, these boards being connected by high-speed networks like Myrinet). However, the need for computing and storage resources has kept growing, leading to the aggregation of resources through Local Area Networks (LAN) or even Wide Area Networks (WAN). Recent progress in network technology has enabled the use of highly distributed platforms as a single parallel resource. This has been called Metacomputing or, more recently, Grid Computing. An enormous amount of funding has recently been devoted to this important subject, leading to an exponential growth in the number of projects, most of them focusing on low-level software details. We believe that many of these projects failed to study fundamental issues such as the computational complexity of problems, and algorithms and heuristics for scheduling problems. Moreover, they usually have not validated their theoretical results on available software platforms.
From the architectural point of view, Grid Computing exists at different scales but is always highly heterogeneous and hierarchical. At a very large scale, tens of thousands of PCs connected through the Internet are aggregated to solve very large applications. This form of the Grid, usually called a Peer-to-Peer (P2P) system, has several incarnations, such as SETI@home, Gnutella, or XtremWeb. It is already used to solve large problems (or to share files) on PCs across the world. However, as today's network capacity is still low, the applications supported by such systems are usually embarrassingly parallel. Another large-scale example is TeraGRID, which connects several supercomputing centers in the USA and reaches a peak performance of over 100 Teraflops. At a smaller scale, but with a high bandwidth, one can mention the Grid'5000 project, which connects PC clusters spread over nine French university research centers. Many such projects around the world connect a small set of machines through a fast network. Finally, at the level of a research laboratory, one can build a heterogeneous platform by connecting several clusters with a fast network such as Myrinet.
The problem common to all these platforms is not the hardware (these machines are already connected to the Internet) but the software (from the operating system to the algorithmic design). Indeed, the connected computers are usually highly heterogeneous (from clusters of SMPs to the Grid).
There are two main challenges for the widespread use of Grid platforms: the development of environments that ease the use of the Grid (in a seamless way), and the design and evaluation of new algorithmic approaches for applications using such platforms. Environments used on the Grid include operating systems, languages, libraries, and middleware. Today's environments are based either on the adaptation of “classical” parallel environments or on the development of toolboxes based on Web Services.
Aims of the Graal project.
In the Graal project we work on the following research topics:
algorithms and scheduling strategies for heterogeneous and distributed platforms,
environments and tools for the deployment of applications over service-oriented platforms.
The main keywords of the Graal project are:
Algorithmic Design + Middleware/Libraries + Applications
over Heterogeneous and Distributed Architectures
Frédéric Vivien was promoted to Senior Researcher.
Scheduling sets of computational tasks on distributed platforms is a key, but difficult, problem. Although a large number of scheduling techniques and heuristics have been presented in the literature, most of them target only homogeneous resources. However, future computing systems, such as the computational Grid, are most likely to be widely distributed and strongly heterogeneous. Therefore, we consider the impact of heterogeneity on the design and analysis of scheduling techniques: how can these techniques be enhanced to efficiently address heterogeneous distributed platforms?
The traditional objective of scheduling algorithms is the following: given a task graph and a set of computing resources, or processors, map the tasks onto the processors and order the execution of the tasks so that: (i) the task precedence constraints are satisfied; (ii) the resource constraints are satisfied; and (iii) a minimum schedule length is achieved. Task graph scheduling is usually studied using the so-called macro-dataflow model, which is widely used in the scheduling literature: see the survey papers and the references therein. This model was introduced for homogeneous processors, and has been (straightforwardly) extended to heterogeneous computing resources. In a word, there is a limited number of computing resources, or processors, to execute the tasks. Communication delays are taken into account as follows: let task T be a predecessor of task T' in the task graph; if both tasks are assigned to the same processor, no communication overhead is incurred, and the execution of T' can start immediately after the end of the execution of T; on the contrary, if T and T' are assigned to two different processors Pi and Pj, a communication delay is incurred. More precisely, if Pi completes the execution of T at time-step t, then Pj cannot start the execution of T' before time-step t + c(T, T', Pi, Pj), where c(T, T', Pi, Pj) is the communication delay, which depends upon both tasks T and T' and both processors Pi and Pj. Because memory accesses are typically several orders of magnitude cheaper than inter-processor communications, it is sensible to neglect them when T and T' are assigned to the same processor.
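To make the model concrete, here is a small, hypothetical Python sketch (not taken from any tool described in this report) that computes task completion times for a given mapping under this rule; the communication delay c(T, T', Pi, Pj) is simplified to a single constant, paid only between distinct processors.

```python
# Illustrative sketch of the macro-dataflow model (hypothetical example).
# Tasks are processed in a topological order of the task graph; each task
# starts as soon as its processor is free and all its input data have arrived.

weight = {"T1": 3, "T2": 2, "T3": 4, "T4": 1}            # execution times
edges = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T3", "T4")]
mapping = {"T1": "P1", "T2": "P1", "T3": "P2", "T4": "P1"}
comm_delay = 5            # simplified c(T, T', Pi, Pj) for Pi != Pj

preds = {t: [] for t in weight}
for (t, tp) in edges:
    preds[tp].append(t)

finish = {}                                   # completion time of each task
proc_free = {p: 0 for p in set(mapping.values())}

for task in ["T1", "T2", "T3", "T4"]:         # a topological order
    # Data produced on another processor arrive only after the communication delay.
    data_ready = max(
        (finish[t] + (0 if mapping[t] == mapping[task] else comm_delay)
         for t in preds[task]),
        default=0,
    )
    start = max(proc_free[mapping[task]], data_ready)
    finish[task] = start + weight[task]
    proc_free[mapping[task]] = finish[task]

print(finish)   # the makespan is max(finish.values())
```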
The major flaw of the macro-dataflow model is that communication resources are not limited. Firstly, a processor can send (or receive) any number of messages in parallel, hence an unlimited number of communication ports is assumed (this explains the name macro-dataflow for the model). Secondly, the number of messages that can simultaneously circulate between processors is not bounded, hence an unlimited number of communications can simultaneously occur on a given link. In other words, the communication network is assumed to be contention-free, which of course is not realistic as soon as the number of processors exceeds a few units.
The general scheduling problem is far more complex than this traditional objective in the macro-dataflow model. Indeed, the nature of the scheduling problem depends on the type of tasks to be scheduled, on the platform architecture, and on the aim of the scheduling policy. The tasks may be independent (e.g., they represent jobs submitted by different users to the same system, or occurrences of the same program run on independent inputs), or the tasks may be dependent (e.g., they represent the different phases of the same computation and form a task graph). The platform may or may not have a hierarchical architecture (clusters of clusters vs. a single cluster), and it may or may not be dedicated. Resources may be added to or may disappear from the platform at any time, or the platform may have a stable composition. The processing units may or may not have the same characteristics (e.g., computational power, amount of memory, multi-port or single-port communication support). The communication links may or may not have the same characteristics (e.g., bandwidth, latency, routing policy). The aim of the scheduling policy may be to minimize the overall execution time (makespan minimization), to maximize the throughput of processed tasks, etc. Finally, the set of all tasks to be scheduled may be known from the beginning, or new tasks may arrive throughout the execution of the system (on-line scheduling).
In the Graal project, we investigate scheduling problems that are of practical interest in the context of large-scale distributed platforms. We assess the impact of the heterogeneity and volatility of the resources on the scheduling strategies.
The solution of sparse systems of linear equations (symmetric or unsymmetric, most often with an irregular structure) is at the heart of many scientific applications arising in various domains such as geophysics, chemistry, electromagnetism, structural optimization, and computational fluid dynamics. The importance and diversity of the fields of applications are our main motivation to pursue research on sparse linear solvers. Furthermore, in order to solve hard problems that result from ever-increasing demand for accuracy in simulations, special attention must be paid to both memory usage and execution time on the most powerful parallel platforms (whose usage is necessary because of the volume of data and amount of computation required). This is done by specific algorithmic choices and scheduling techniques. From a complementary point of view, it is also necessary to be aware of the functionality requirements from the applications and from the users, so that robust solutions can be proposed for a large range of problems.
Because of their efficiency and robustness, direct methods (based on Gaussian elimination) are methods of choice to solve these types of problems. In this context, we are particularly interested in the multifrontal method, for symmetric positive definite, general symmetric, or unsymmetric problems, with numerical pivoting to ensure numerical accuracy. Numerical pivoting induces dynamic updates of the data structures that cannot be predicted by a static or symbolic analysis.
The multifrontal method is based on an elimination tree, which results (i) from the graph structure corresponding to the nonzero pattern of the problem to be solved, and (ii) from the order in which variables are eliminated. This tree provides the dependency graph of the computations and is exploited to define tasks that may be executed in parallel. In the multifrontal method, each node of the tree corresponds to a task (which can itself be parallel) that consists in the partial factorization of a dense matrix. This approach allows for good locality and hence an efficient use of cache memories.
We are especially interested in approaches that are intrinsically dynamic and asynchronous, as these approaches can encapsulate numerical pivoting and can be adapted to various computer architectures. In addition to their numerical robustness, the algorithms are based on a dynamic and distributed management of the computational tasks, not so far from today's peer-to-peer approaches: each process is responsible for providing work to some other processes and at the same time acts as a worker for others. These algorithms are very interesting from the point of view of parallelism, and in particular for the study of mapping and scheduling strategies, for the following reasons:
the associated task graphs are very irregular and can vary dynamically,
they are currently used inside industrial applications, and
the evolution of high-performance platforms towards more heterogeneous and less predictable ones requires that applications adapt themselves, using a mixture of dynamic and static approaches, as our approach allows.
Our research in this field is strongly linked to the software package Mumps (see Section ), which is our main platform to experiment with and validate new ideas and to pursue new research directions. We are facing new challenges for very large problems (tens to hundreds of millions of equations) that occur nowadays in various application fields: in that case, either parallel out-of-core approaches are required, or direct solvers should be combined with iterative schemes, leading to hybrid direct-iterative methods.
The fast evolution of hardware capabilities, in terms of wide-area communication as well as machine virtualization, calls for a further step in the abstraction of resources with respect to applications. Large-scale platforms based on the aggregation of large clusters (Grids), huge datacenters (Clouds), or collections of volunteer PCs (Desktop Computing platforms) are now available to researchers of different fields of science as well as to private companies. This variety of platforms, and the way they are accessed, also has an important impact on how applications are designed (i.e., the programming model used) and executed (i.e., the runtime/middleware system used). Access to these platforms is provided through different services offering mandatory features such as security, resource discovery, virtualization, load balancing, etc. Software as a Service (SaaS) thus has an important role to play in the future development of large-scale applications. The overall idea is to consider the whole system, from the resources to the application, as a set of services. Hence, a user application is an ordered set of instructions requiring and making use of some services, such as an execution service. Such a service is itself an application, at the middleware level, that offers some services (here used by the user application) and potentially uses other services, such as a scheduling service. This model of provided and/or required services is generalized in software component models, which deal with composition as well as deployment issues.
Our goal is to contribute to the design of programming models supporting a wide range of architectures, and to their implementation, by mastering the various algorithmic issues involved and by studying the impact on application-level algorithms. Ideally, an application should be written once; the difficulty is to determine the adequate level of abstraction that provides a simple programming model to the developer while enabling efficient execution on a wide range of architectures. To achieve this goal, the team plans to contribute at different levels, including programming models, distributed algorithms, deployment of services, service discovery, service composition and orchestration, large-scale data management, etc.
In the context of our activity on sparse direct (multifrontal) solvers in distributed environments, we develop, distribute, maintain, and support competitive software. Our methods have a wide range of applications, and they are at the heart of many numerical methods in simulation: whether a model uses finite elements or finite differences, or requires the optimization of a complex linear or nonlinear function, one almost always ends up solving a linear system of equations involving sparse matrices. There are therefore many application fields, among which we list some cited by the users of our sparse direct solver Mumps (see Section ): structural mechanical engineering (e.g., stress analysis, structural optimization, car bodies, ships, crankshaft segments, offshore platforms, computer-aided design, computer-aided engineering, rigidity of sphere packings); heat transfer analysis; thermomechanics in casting simulation; fracture mechanics; biomechanics; medical image processing; tomography; plasma physics (e.g., Maxwell's equations); critical physical phenomena; geophysics (e.g., seismic wave propagation, earthquake-related problems); ad-hoc network modeling (e.g., Markovian processes); modeling of the magnetic field inside machines; econometric models; soil-structure interaction problems; oil reservoir simulation; computational fluid dynamics (e.g., Navier-Stokes, ocean/atmospheric modeling with mixed finite element methods, fluvial hydrodynamics, viscoelastic flows); electromagnetics; magneto-hydro-dynamics; modeling the structure of the optic nerve head and of cancellous bone; modeling of the heart valve; modeling and simulation of crystal growth processes; chemistry (e.g., chemical process modeling); vibro-acoustics; aero-acoustics; aero-elasticity; optical fiber modal analysis; blast furnace modeling; glaciology (e.g., modeling of ice flow); optimization; optimal control theory; astrophysics (e.g., supernovae, thermonuclear reaction networks, neutron diffusion equations, quantum chaos, quantum transport); research on domain decomposition (e.g., Mumps is used on subdomains in an iterative solver framework); and circuit simulation.
Lammps is a classical molecular dynamics (MD) code created for simulating molecular and atomic systems such as proteins in solution, liquid crystals, polymers, or zeolites. It was designed for distributed-memory parallel computers and runs on any parallel platform that supports the MPI message-passing library or on single-processor workstations. Lammps is mainly written in F90.
Lammps was originally developed as part of a five-way, DoE-sponsored CRADA collaboration between three industrial partners (Cray Research, Bristol-Myers Squibb, and Dupont) and two DoE laboratories (Sandia and Livermore). The code is freely available under the terms of a simple license agreement that allows you to use it for your own purposes, but not to redistribute it.
The integration of Lammps into our Problem Solving Environment Diet is in progress. Discussions are still taking place in order to make the Lammps service available through a web portal, on at least one cluster managed by the Sun Grid Engine batch scheduler.
Current progress in different areas of chemistry such as organic chemistry, physical chemistry or biochemistry allows the construction of complex molecular assemblies with predetermined properties. In all these fields, theoretical chemistry plays a major role by helping to build various models which can greatly differ in terms of theoretical and computational complexity, and which allow the understanding and the prediction of chemical properties.
Among the various theoretical approaches available, quantum chemistry is at a central position as all modern chemistry relies on it. This scientific domain is quite complex and involves heavy computations. In order to fully apprehend a model, it is necessary to explore the whole potential energy surface described by the independent variation of all its degrees of freedom. This involves the computation of many points on this surface.
Our project is to couple Diet with a relational database in order to explore the potential energy surface of molecular systems using quantum chemistry: all molecular configurations to be computed are stored in a database; the database is queried, and all configurations that have not yet been computed are passed through Diet to compute servers that run the quantum calculations; all results are then sent back to the database through Diet. In the end, the database will store a whole potential energy surface, which can then be analyzed using appropriate quantum chemical analysis tools.
Genomics data-acquisition programs, such as full-genome sequencing projects, are producing larger and larger amounts of data. The analysis of these raw biological data requires very large computing resources. In some cases, due to the lack of sufficient computing and storage resources, skilled staff, or technical abilities, laboratories cannot afford such huge analyses. Grid computing may be a viable solution to the needs of the genomics research field: it can provide scientists with transparent access to large computational and data management resources. In this application domain, we are currently addressing the two problems outlined below.
The first problem is the clustering of the sequences contained in international databanks into protein domain families. Our aim is to ensure, through the use of grids, the capacity to build databases such as ProDom in a timely and automatic manner, even though they are built from exponentially growing protein databases.
The second problem concerns protein functional sites. Functional sites and signatures of proteins are very useful for analyzing raw biological data and for correlating different kinds of existing biological data. These methods are applied, for example, to the identification and characterization of the potential functions of newly sequenced proteins. The sites and signatures of proteins can be expressed using the syntax defined by the PROSITE databank, and written as a “protein regular expression”. Searching for such a site in a sequence can be done using the criterion of identity between the searched and the found patterns. Most of the time, this kind of analysis is quite fast. However, in order to identify imperfectly matching but biologically relevant sites, the user can accept a certain level of error between the searched and the matching patterns. Such an analysis can be very resource consuming.
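As an illustration of this kind of analysis, the sketch below (a hypothetical example, covering only a small subset of the PROSITE syntax and using a made-up signature and sequence) converts a site description into a standard regular expression and scans a sequence for exact matches; the error-tolerant variant mentioned above would require approximate matching instead.

```python
# Minimal sketch: turn a (subset of the) PROSITE-like site syntax into a Python
# regular expression and scan a protein sequence for exact matches.
import re

def prosite_to_regex(pattern: str) -> str:
    """Handle elements separated by '-': x (any residue), [ABC] (one of),
    {ABC} (none of), with an optional repetition count such as x(2) or x(2,4)."""
    parts = []
    for elem in pattern.split("-"):
        core, lo, hi = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", elem).groups()
        if core == "x":
            regex = "."
        elif core.startswith("["):
            regex = core                       # [LIVM] is already a character class
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # {P} means "any residue except P"
        else:
            regex = core                       # a literal amino acid
        if lo:
            regex += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(regex)
    return "".join(parts)

site = "C-x(2,4)-C-x(3)-[LIVMFYWC]"            # hypothetical signature
sequence = "MKTCAAACAGDLHHRC"                  # hypothetical protein sequence
print([m.span() for m in re.finditer(prosite_to_regex(site), sequence)])
```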
Ramses
Cosmological simulations are usually divided into two main categories. The first category consists of large-scale periodic boxes, which require massively parallel computers and run for a very long elapsed time (usually several months). The second category corresponds to much faster, small-scale “zoom simulations”. One particularity of the HORIZON project is that it allows the re-simulation of some areas of interest to astronomers.
We designed a Grid version of Ramses through the Diet middleware. Through Grid'5000 experiments we showed that Diet is capable of handling long parallel cosmological simulations: mapping them onto the parallel resources of a Grid, executing them, and processing the communication transfers. The overhead induced by the use of Diet is negligible compared to the execution time of the services. Diet thus makes it possible to explore new research directions in cosmological simulations (on various low-resolution initial conditions), with transparent access to the services and the data.
Climatologists resort to numerical simulation, and particularly to coupled models, on several occasions: for example, to estimate natural variability (thousands of simulated years), for seasonal forecasting (only a few simulated months), or to study the characteristics of global warming (a few simulated decades).
To take advantage of the Grid'5000 platform, we chose to launch parallel (ensemble) simulations on several nodes, approximately 10 or more, depending on the load of the platform. Scenario simulations, which simulate the climate from the present to the end of the next century, require huge computing power. Each simulation differs from the others in the physical parameterization of the atmospheric model. By comparing them, we expect to better estimate the sensitivity of global warming predictions to the model parameterization.
In practice, a 150-year-long scenario combines 1800 one-month simulations, launched one after the other. This partitioning eases the workflow and provides checkpointing, because the final state of the simulation of one month is used as the initial state of the next month.
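A schematic sketch of this chaining is given below; run_one_month() is a placeholder for the submission of the actual coupled-model job, shown only to make the month-by-month checkpointing structure explicit.

```python
# Schematic sketch of the month-by-month chaining described above.
# run_one_month() stands in for the submission of the real coupled-model job;
# here it only transforms a dummy state so that the chaining logic is visible.

def run_one_month(state, year, month):
    # In the real application: launch the one-month parallel simulation with
    # `state` as the initial condition and return the final state of that month.
    return {"year": year, "month": month, "fields": state["fields"]}

def run_scenario(initial_state, n_years=150):
    state = initial_state
    checkpoints = []
    for year in range(n_years):
        for month in range(12):          # 150 years x 12 months = 1800 runs
            state = run_one_month(state, year, month)
            checkpoints.append(state)    # the final state of one month is the
                                         # restart point of the next one
    return state, checkpoints

final, ckpts = run_scenario({"fields": "dummy"}, n_years=2)
print(len(ckpts))   # 24 chained one-month runs for a 2-year test scenario
```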
Our goal regarding the climate forecasting application is to thoroughly analyze it in order to model its needs in terms of execution model, data access pattern, and computing needs. Once a proper model of the application has been derived, appropriate scheduling heuristics can be proposed, tested, and compared. We plan to extend this work to provide generic scheduling schemes for applications with similar dependence graphs.
The Décrypthon project is built on a collaboration between CNRS, AFM (Association Française contre les Myopathies), and IBM. Its goal is to make computational and storage resources available to bioinformatics research teams in France. These resources, connected as a Grid through the Renater network, are installed in six universities and schools in France (Bordeaux, Jussieu, Lille, Lyon, Orsay, and Rouen). The Décrypthon project offers the means necessary to use the Grid, through the financing of research teams and postdocs and through assistance on computer science problems (such as modeling, application development, and data management). The Graal research team is involved in this project as an expert for application gridification. The Grid middleware used at the beginning of the project was GridMP from United Devices. In 2007, Diet was chosen as the Grid middleware of the Décrypthon Grid. It ensures the load balancing of jobs over the six computation centers through the Renater network. This transfer of our middleware, initially built for large-scale experiments with scheduling heuristics, into a production Grid is a real achievement for our research team.
Micro-factories are automated units designed to produce pieces composed of micro-metric elements. Today's micro-factories are composed of elementary modules, or robots, able to carry out basic operations. To perform more complex operations, a few elementary modules may be grouped into a cell. The realization of such a cell is still a scientific challenge, but several research projects have already obtained significant results in this domain. These results show very promising functionalities, like the ability to configure or reconfigure a cell, for instance by changing a robot tool. However, the set of operations carried out by a cell is still limited. The next generation of micro-factories will put several cells together and make them cooperate to produce complex assembled pieces, as is done for macroscopic production. In this context, cell control will evolve to become more cooperative and distributed.
Micro-factories may be modeled in a way that allows reusing the results obtained in scheduling on heterogeneous platforms such as Grids, in particular the results on steady-state scheduling. We develop scheduling strategies and algorithms adapted to this context, and we optimize the deployment of cells based on the micro-product and the production specification. We are currently working on the evaluation and adaptation of several scheduling algorithms in this context, taking small-to-medium batches of jobs into account.
At the micro-metric scale, the manipulation of the elements cannot be considered the same way as at the macro-metric scale, because the equilibrium of forces is modified. For instance, the electrostatic force becomes predominant over gravity. This leads to uncontrolled behaviors and frequently generates faults. We are working on taking these faults into account in scheduling models and on evaluating their performance depending on the fault characteristics.
Huge problems can now be processed over the Internet thanks to Grid middleware systems. The use of off-the-shelf applications is needed by scientists of other disciplines. Moreover, the computational power and memory needs of such applications may of course not be met by every workstation. Thus, the RPC paradigm seems to be a good candidate to build Problem Solving Environments on the Grid. This is the aim of the Diet project (http://
Moreover, the aim of a middleware system such as Diet is to provide transparent access to a pool of computational servers. Diet focuses on offering such a service at a very large scale. A client that has a problem to solve should be able to obtain a reference to the server best suited for it. Diet is designed to take data location into account when scheduling jobs. Data are kept as long as possible on (or near) the computational servers in order to minimize transfer times. This kind of optimization is mandatory when scheduling jobs over a wide-area network. Diet is built upon Server Daemons. The scheduler is scattered across a hierarchy of Local Agents and Master Agents.
Applications targeted at the Diet platform are now able to exert a degree of control over the scheduling subsystem via plug-in schedulers. As the applications to be deployed on the Grid vary greatly in terms of performance demands, the Diet plug-in scheduler facility permits the application designer to express application needs and features so that they are taken into account when application tasks are scheduled. These features are invoked at runtime, after a user has submitted a service request to the MA, which broadcasts the request to its agent hierarchy.
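The following schematic Python sketch illustrates the idea behind plug-in schedulers (it is not the actual Diet C/C++ API, and all names are hypothetical): servers attach application-specific estimates, such as data locality, to their replies, and an application-supplied aggregation function ranks the candidate servers.

```python
# Schematic illustration of the plug-in scheduler idea (hypothetical names,
# not the actual Diet API): server daemons fill an estimation vector for a
# request, and an application-supplied aggregation function ranks candidates
# instead of a fixed, built-in policy.

def default_estimate(server, request):
    # Generic metrics collected by the middleware.
    return {"load": server["load"], "latency": server["latency"]}

def app_estimate(server, request):
    # Application-specific metric added by the plug-in (e.g., data locality).
    est = default_estimate(server, request)
    est["has_input_data"] = request["dataset"] in server["cached_data"]
    return est

def app_aggregate(estimates):
    # Application-supplied ordering: prefer servers already holding the input
    # data, then the least loaded ones.
    return sorted(estimates, key=lambda e: (not e[1]["has_input_data"], e[1]["load"]))

servers = [
    {"name": "sed1", "load": 0.7, "latency": 10, "cached_data": {"matrixA"}},
    {"name": "sed2", "load": 0.2, "latency": 30, "cached_data": set()},
]
request = {"service": "solve", "dataset": "matrixA"}
ranked = app_aggregate([(s["name"], app_estimate(s, request)) for s in servers])
print([name for name, _ in ranked])   # sed1 first: it already holds the data
```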
Diet has been validated on several applications; some of them have been described in the sections above.
Workflow-based applications are scientific, data-intensive applications consisting of a set of tasks that need to be executed in a certain partial order. These applications are an important class of Grid applications and are used in various scientific domains such as astronomy or bioinformatics. We have developed a workflow engine in Diet to manage such applications; it offers the end user and the developer a simple way either to use the provided scheduling algorithms or to develop their own.
In our implementation, workflows are described in XML. Since no standard exists for scientific workflows, we have proposed our own formalism. The Diet agent hierarchy has been extended with a new special agent, the MA_DAG. To remain flexible, we can execute workflows even if this special agent is not present in the platform. Using the MA_DAG centralizes the scheduling decisions and can thus provide a better schedule when the platform is shared by multiple clients. On the other hand, if the client bypasses the MA_DAG, a new scheduling algorithm can be used without affecting the Diet platform. The current implementation of Diet provides several schedulers (Round Robin, HEFT, random, Fairness on Finish Time, etc.).
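For reference, here is a compact sketch of the classical HEFT list-scheduling heuristic mentioned above (a simplified textbook version, without the insertion-based slot search and with a uniform communication cost, not the Diet implementation).

```python
# Compact sketch of the HEFT heuristic: rank tasks by upward rank (critical
# path to an exit task), then map each task onto the processor giving the
# earliest finish time. Simplified: uniform communication cost, no slot insertion.

def heft(tasks, edges, avg_cost, comm):
    """tasks: dict task -> {processor: execution time}; edges: list of (t, t');
    avg_cost: average execution time per task (used for ranking);
    comm: communication cost paid between tasks mapped on distinct processors."""
    succs = {t: [] for t in tasks}
    preds = {t: [] for t in tasks}
    for (t, tp) in edges:
        succs[t].append(tp)
        preds[tp].append(t)

    rank = {}
    def upward_rank(t):
        if t not in rank:
            rank[t] = avg_cost[t] + max((comm + upward_rank(s) for s in succs[t]), default=0)
        return rank[t]

    finish, place, proc_free = {}, {}, {}
    for t in sorted(tasks, key=upward_rank, reverse=True):   # decreasing upward rank
        best = None
        for p in tasks[t]:     # processor selection by earliest finish time
            ready = max((finish[u] + (0 if place[u] == p else comm) for u in preds[t]),
                        default=0)
            eft = max(proc_free.get(p, 0), ready) + tasks[t][p]
            if best is None or eft < best[0]:
                best = (eft, p)
        finish[t], place[t] = best
        proc_free[place[t]] = finish[t]
    return place, finish

tasks = {"A": {"P1": 2, "P2": 3}, "B": {"P1": 4, "P2": 2}, "C": {"P1": 3, "P2": 3}}
edges = [("A", "B"), ("A", "C")]
place, finish = heft(tasks, edges, avg_cost={"A": 2.5, "B": 3, "C": 3}, comm=1)
print(place, max(finish.values()))   # mapping and resulting makespan
```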
The Diet workflow runtime also includes a rescheduling mechanism. Most workflow scheduling algorithms are based on performance predictions that are not always accurate (erroneous prediction tool, or wrongly estimated resource load). The rescheduling mechanism can trigger the rescheduling of the application when some conditions specified by the client are fulfilled.
We also continued our work on schedulers for the Diet workflow engine for multi-workflow applications, and on graphical tools for workflows within the Diet DashBoard project. Within the Gwendia project, we worked on the implementation of the language defined in the project and around the Cardiac application. Experiments were carried out over the Grid'5000 platform.
DAGDA, designed during the PhD of Gaël Le Mahec, is a new data manager for the Diet middleware that allows explicit or implicit data replication and advanced data management on the Grid. It was designed to be backward compatible with previously developed Diet applications, which transparently benefit from data replication. It allows explicit or implicit data replication, file sharing between nodes that can access the same disk partition, the choice of a data replacement algorithm, and a high-level configuration of the memory and disk space Diet should use for data storage and transfers. To transfer data, DAGDA uses a pull model instead of the push model used by DTM: data are not sent within the profile from the source to the destination, but are downloaded by the destination from the source. DAGDA also chooses the best source for a given piece of data. DAGDA has also been used for the validation of our joint replication and scheduling algorithms over Diet.
The GridRPC paradigm is now an OGF standard. The GridRPC community is interested in data management within the GridRPC paradigm. Because of previous work performed on data management in the Diet middleware, Eddy Caron is now co-chair of the GridRPC working group, in order to lead the effort to propose a powerful Grid data management API that will extend the GridRPC paradigm.
Data management is a challenging issue within the OGF GridRPC standard, for both feasibility and performance reasons. Indeed, some temporary data do not need to be transferred back once computed and can, for example, remain on the servers. One can also imagine that data are transferred directly from one server to another, without passing through the client as the basic GridRPC paradigm would require.
Consequently, we have been working on a data management API, which has been presented at almost every OGF session since OGF'21. Since December 2009, the proposal has been available for public comment and may be reached at: http://
For the requirements of the GridTLSE project, Diet has been extended with a specialized version of the server daemon. It is able to provide access to the AEGIS middleware services developed at the JAEA. A demo was presented at the JAEA booth at SuperComputing'10.
Cloud computing is currently drawing more and more attention. This is due to multiple reasons, the most important of which are the on-demand provisioning of resources and the pay-as-you-go pricing. In order to study and take advantage of these features, we extended the Diet middleware with Cloud support. DietCloud is a Diet module capable of harnessing the extensibility of Cloud platforms in a seamless manner. We have targeted the Eucalyptus open-source Cloud because it implements the Amazon EC2 interface. Recently, we also confirmed DietCloud's compatibility with Amazon EC2 by building a proof-of-concept demo, which was shown at SuperComputing'10.
We have designed a new metric, called GreenPerf, that allows Diet to provide a scheduler taking energy information into account. We designed a heuristic to find the best server with respect to the trade-off between performance and electric consumption. In collaboration with Laurent Lefevre from the RESO research team, we designed the architecture to deal with energy sensors. More development and experiments are required to validate the integration into the current release.
The MapReduce programming model (re-)introduced by Google is a promising model to deploy data-processing application services over large-scale platforms such as Grids and Clouds. We developed a version of MapReduce over Diet. In particular, we automated the creation of MapReduce-type workflows. Large-scale experiments over the Grid'5000 platform were conducted to validate the concepts and algorithms developed.
For each input key/value pair, the Diet workflow engine generates one map task. Each map task computes intermediate key/value pairs and returns a container with all intermediate pairs. Thanks to the Diet workflow engine, the results are merged and all containers are sent to the sort service. This service sorts the pairs by combining each key and all its values in a container; this container is itself added to a container that is returned by the service. The Diet workflow engine then explodes the main container and creates a reduce task for each element. Reduce tasks compute and return the final key/value pairs. All final pairs are then merged and returned to the client.
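The plain-Python sketch below mirrors this dataflow (map, merge, sort by key, reduce) on a word-count example; it only illustrates the structure of the generated workflow and does not use the actual Diet services.

```python
# Plain-Python sketch of the dataflow described above: one map task per input
# pair, a merge, a sort-by-key service, and one reduce task per key group.
from collections import defaultdict

def map_task(key, value):
    # Example map function (word count): emit (word, 1) for each word.
    return [(word, 1) for word in value.split()]

def sort_service(all_pairs):
    # Combine each key with the container of all its values.
    grouped = defaultdict(list)
    for k, v in all_pairs:
        grouped[k].append(v)
    return list(grouped.items())          # one container per key

def reduce_task(key, values):
    return (key, sum(values))

inputs = [("doc1", "grid computing on the grid"), ("doc2", "grid services")]

# One map task per input key/value pair; results are merged into one container.
merged = [pair for (k, v) in inputs for pair in map_task(k, v)]
# The sort service groups values by key; one reduce task is created per group.
final = [reduce_task(k, vs) for (k, vs) in sort_service(merged)]
print(final)    # e.g. [('grid', 3), ('computing', 1), ...]
```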
We implemented one prototype with the sorting service and one with tree reduction. These prototypes allowed us to validate the feasibility of the two solutions and the constraints imposed by the Diet middleware.
We worked on the integration of Diet into the EDF infrastructure in the context of the INRIA Grenoble-Rhône-Alpes GRAAL / EDF R&D SINETICS OSIS partnership. The first task was to provide a set of new functionalities for users to submit a large number of tasks to different remote LRMS (Local Resources Manager Systems), to manage and tune these tasks, and finally to retrieve the results. The solution is based on Diet modules that can be called from C/C++ or directly from the command line.
May 26th, DIET 2.4 release.
December 1st, DIET 2.5 release.
Moreover, since May, specific developments around the File and Batch management of an HPC infrastructure for EDF R&D have been available in open source.
Mumps (for MUltifrontal Massively Parallel Solver, see http://
Mumps implements a direct method, the multifrontal method, and is a parallel code for distributed-memory computers; it is unique in terms of the performance obtained and the number of functionalities available, among which we can cite:
various types of systems: symmetric positive definite, general symmetric, or unsymmetric,
several matrix input formats: assembled or expressed as a sum of elemental matrices, centralized on one processor or pre-distributed on the processors,
detection of null pivots and null space estimate,
parallel analysis, parallel scaling algorithms,
out-of-core execution to solve larger problems,
partial factorization and Schur complement matrix,
dense or sparse right-hand sides, centralized or distributed solution,
real or complex arithmetic, single or double precision,
partial threshold pivoting,
fully asynchronous approach with overlap of computation and communication,
distributed dynamic scheduling of the computational tasks to allow for a good load balance in case of numerical pivoting at runtime.
Mumps is currently used by thousands of academic and industrial organizations, for a wide range of application fields (see Section ). Mumps users include:
students and academic users from all over the world;
various developers of finite element or optimization software;
companies such as Boeing, EADS, EDF, Free Field Technologies, and Samtech.
The latest release is Mumps 4.9.2, available since November 2009 (see http://
HLCMi is an implementation of the HLCM component model defined during the PhD of Julien Bigot. HLCM is a generic component model, extensible with respect to component implementations and interaction concerns. Moreover, HLCM is abstract; it is its specializations, such as HLCM/CCM, that define the primitive elements of the model, such as the primitive components and the primitive interactions.
HLCMi makes use of the Model-Driven Engineering (MDE) methodology to generate a concrete assembly from a high-level description. It is based on the Eclipse Modeling Framework (EMF). HLCMi contains 700 Emfatic lines describing its models and 7000 Java lines for utility and model transformation purposes. HLCMi is a general framework that supports several HLCM specializations: HLCM/CCM, HLCM/JAVA, and HLCM/C++.
BitDew is an open-source middleware implementing a set of distributed services for large-scale data management on Desktop Grids and Clouds. BitDew relies on five abstractions to manage data: i) replication indicates how many occurrences of a data item should be available at the same time on the network; ii) fault tolerance controls the policy in the presence of hardware failures; iii) lifetime, an attribute either absolute or relative to the existence of other data, decides the life cycle of a data item in the system; iv) affinity drives the movement of data according to dependency rules; v) protocol gives the runtime environment hints about the protocol to use to distribute the data (HTTP, FTP, or BitTorrent). Programmers define these simple criteria for every data item, and let the BitDew runtime environment manage the operations of data creation, deletion, movement, replication, and fault tolerance.
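The following sketch illustrates this attribute-driven style of programming; it is not the actual BitDew Java API, and the attribute values shown are hypothetical.

```python
# Schematic illustration of the five BitDew data attributes (not the actual
# BitDew Java API, just a sketch of the attribute-driven programming style).
data_attributes = {
    "replication": 3,                 # keep 3 copies alive on the network
    "fault_tolerance": "resilient",   # re-schedule the data if a host fails
    "lifetime": "session",            # or relative to another data item's existence
    "affinity": "input-matrix",       # place this data where "input-matrix" resides
    "protocol": "bittorrent",         # hint for the transfer protocol (http, ftp, ...)
}

def publish(data_id, attributes):
    """Placeholder: in BitDew, the runtime would take over creation, replication,
    movement, and fault tolerance according to these attributes."""
    print(f"publishing {data_id} with {attributes}")

publish("result-chunk-42", data_attributes)
```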
The current status of the software is the following: BitDew is open source under the GPLv3 or CeCILL license, at the user's choice; 10 releases were produced in the last two years; and it has been downloaded approximately 6000 times from the INRIA forge. Known users are Université Paris-XI, Université Paris-XIII, the University of Florida, Cardiff University, and the University of Sfax. Recently, we implemented a first prototype of the MapReduce programming model for Desktop Grids on top of BitDew.
In terms of support, the development of BitDew is partly funded by the INRIA ADT BitDew and by the ANR MapReduce projects.
XtremWeb is an open-source software package for Desktop Grid computing, jointly developed by INRIA and IN2P3.
XtremWeb allows building lightweight Desktop Grids by gathering the unused resources of desktop computers (CPU, storage, network). Its primary features permit multi-user, multi-application, and cross-domain deployments. XtremWeb turns a set of volatile resources spread over a LAN or the Internet into a runtime environment executing high-throughput applications.
XtremWeb is a highly programmable and customizable middleware which supports a wide range of applications (bag-of-tasks, master/worker), computing requirements (data-, CPU-, or network-intensive), and computing infrastructures (clusters, desktop PCs, multi-LAN) in a manageable, scalable, and secure fashion. Known users include LIFL, LIP, LIG, LRI (computer science), LAL (physics, Orsay), IBBMC (biology), Université Paris-XIII, Université de Guadeloupe, IFP (petroleum), EADS, CEA, University of Wisconsin Madison, University of Tsukuba (Japan), AIST (Australia), UCSD (USA), Université de Tunis, AlmerGrid (NL), Fundecyt (Spain), Hobai (China), and HUST (China).
There are two branches of XtremWeb. XtremWeb-HEP is a production version developed by IN2P3; it features many security improvements, such as X509 support, which allows its usage within the EGEE context. XtremWeb-CH is a research version developed by HES-SO, Geneva, which aims at building an effective peer-to-peer system for CPU-time-consuming applications.
XtremWeb has been supported by national grants (ACI CGP2P) and by major European grants around Grids and Desktop Grids, such as the FP6 CoreGrid European Network of Excellence, FP6 Grid4all, and more recently FP7 EDGeS (Enabling Desktop Grids for E-Science) and FP7 EDGI (European Desktop Grid Initiative).
Ongoing developments include providing quality of service for Desktop Grids (SpeQuloS), data-intensive processing on Desktop Grids, as well as porting XtremWeb to the Google App Engine Cloud platform.
Mapping workflow applications onto parallel platforms is a challenging problem that becomes even more difficult when platforms are heterogeneous, which is nowadays the standard. A high-level approach to parallel programming not only eases the application developer's task, but also provides additional information which can help realize an efficient mapping of the application. We focused on simple application graphs such as linear chains and fork patterns. Workflow applications are executed in a pipelined manner: a large set of data needs to be processed by all the different tasks of the application graph, thus inducing parallelism between the processing of different data sets. For such applications, several antagonistic criteria should be optimized, such as throughput, latency, failure probability, and energy consumption.
We have considered the mapping of workflow applications onto different types of platforms: fully homogeneous platforms, with identical processors and interconnection links; communication-homogeneous platforms, with identical links but processors of different speeds; and, finally, fully heterogeneous platforms.
This year, we have pursued the work involving the energy minimization criterion, and we studied the impact of resource sharing for concurrent streaming applications. In an interval mapping, a processor is assigned a set of consecutive stages of the same application, so there is no resource sharing across applications. On the contrary, the assignment is fully arbitrary for general mappings, hence a processor can be reused for several applications. On the theoretical side, we established complexity results for this tri-criteria mapping problem (energy, period, latency), classifying polynomial versus NP-complete instances. Furthermore, we derived an integer linear program that provides the optimal solution in the most general case. On the experimental side, we designed polynomial-time heuristics and assessed their absolute performance thanks to the linear program. One main goal was to assess the impact of processor sharing on the quality of the solution.
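For reference, the sketch below shows how the period and latency of an interval mapping are usually evaluated in this line of work (communication costs are neglected here for simplicity; the weights, speeds, and mapping are made-up values).

```python
# Sketch of how the period and latency of an interval mapping are evaluated
# (communication costs neglected; w[i] are stage weights, speed[u] the processor
# speeds, and the mapping assigns consecutive stages to each processor).

def period_and_latency(intervals, w, speed):
    """intervals: list of (processor, first_stage, last_stage), in pipeline order."""
    times = [sum(w[first:last + 1]) / speed[proc] for (proc, first, last) in intervals]
    period = max(times)      # a new data set enters the pipeline every `period` units
    latency = sum(times)     # time for one data set to traverse the whole pipeline
    return period, latency

w = [4, 2, 6, 3]                                        # computation weight of each stage
speed = {"P1": 2.0, "P2": 1.0, "P3": 3.0}
mapping = [("P1", 0, 1), ("P2", 2, 2), ("P3", 3, 3)]    # consecutive stages per processor
print(period_and_latency(mapping, w, speed))            # (6.0, 10.0)
```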
We have pursued the investigation of timed Petri nets to model the mapping of workflows with stage replication, that we had started in 2009. In particular, we have provided bounds for the throughput when stage parameters are arbitrary I.I.D. (Independent and Identically-Distributed) and N.B.U.E. (New Better than Used in Expectation) variables: the throughput is bounded from below by the exponential case and bounded from above by the deterministic case. This work was conducted in collaboration with Bruno Gaujal (LIG Grenoble).
We have investigated several multi-criteria algorithms and heuristics for the problem of mapping pipelined applications, consisting of a linear chain of stages executed in a pipelined way, onto heterogeneous platforms. The objective was to optimize the reliability under a performance constraint, i.e., while guaranteeing a threshold throughput. In order to increase reliability, we replicate the execution of stages on multiple processors. On the theoretical side, we prove that this bi-criteria optimization problem is NP-hard. We propose some heuristics both for interval and for general mappings, and present extensive experiments evaluating their performance.
The first paper published on this work, “A. Benoit, H. L. Bouziane, Y. Robert. Optimizing the reliability of pipelined applications under throughput constraints. In ISPDC'2010, Istanbul, Turkey, July 2010” received the best paper award.
The multicore revolution is underway, bringing new chips with more complex memory architectures. Classical algorithms must be revisited in order to take the hierarchical memory layout into account. The goal of this study is to design cache-aware algorithms that minimize the number of cache misses paid during the execution of the matrix product kernel on a multicore processor. We have analytically studied how to achieve the best possible trade-off between shared and distributed caches. We have also implemented and evaluated several algorithms on two multicore platforms, one equipped with a Xeon quadcore, and the second one enriched with a GPU. It turns out that the impact of cache misses is very different across the two platforms, and we have identified the main design parameters that lead to peak performance for each target hardware configuration.
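The following textbook-style tiled matrix product (in Python/NumPy) only illustrates the kind of cache blocking analyzed in this study; the implementations actually evaluated were tuned native codes for the multicore and GPU platforms.

```python
# Textbook-style tiled (blocked) matrix product, shown only to illustrate the
# kind of cache blocking analyzed in this study.
import numpy as np

def blocked_matmul(A, B, block=64):
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(0, n, block):
        for j in range(0, n, block):
            for k in range(0, n, block):
                # Each update works on three block x block tiles that are meant
                # to fit together in cache, maximizing data reuse.
                C[i:i+block, j:j+block] += A[i:i+block, k:k+block] @ B[k:k+block, j:j+block]
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
assert np.allclose(blocked_matmul(A, B), A @ B)
```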
In this study, we focus on the complexity of traversing tree-shaped workflows whose tasks require large I/O files. Such workflows typically arise in the multifrontal method of sparse matrix factorization. We target a classical two-level memory system, where the main memory is faster but smaller than the secondary memory. A task in the workflow can be processed if all its predecessors have been processed and if its input and output files fit in the currently available main memory. The amount of available memory at a given time depends upon the order in which the tasks are executed. We focus on finding the minimum amount of main memory, over all postorder schemes, or over all possible traversals, that is needed for an in-core execution. We have established several complexity results that answer these questions. We have proposed a new, polynomial-time, exact algorithm which runs faster than the reference algorithm. We have also addressed the setting where the required memory renders a pure in-core solution unfeasible. In this setting, we ask the following question: what is the minimum amount of I/O that must be performed between the main memory and the secondary memory? We have shown that this latter problem is NP-hard, and proposed efficient heuristics. All algorithms and heuristics were thoroughly evaluated on assembly trees arising in the context of sparse matrix factorizations.
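The sketch below illustrates the kind of quantity involved, under a simplified variant of the memory model: it computes the peak memory of a given postorder traversal when processing a node requires its own file and its children's files to reside in memory simultaneously. The tree and file sizes are made-up values; changing the order in which subtrees are visited changes the peak, which is exactly what the optimization above exploits.

```python
# Peak memory of a given postorder traversal of a task tree, under a simplified
# model: processing a node needs its own file plus all its children's files in
# memory at once; children files are freed afterwards, and the node's file is
# kept until its parent is processed.

def peak_memory(children, f, root):
    def visit(v):
        resident = 0          # size of children files already produced and kept
        peak = 0
        for c in children.get(v, []):
            child_peak, child_file = visit(c)
            peak = max(peak, resident + child_peak)   # while traversing c's subtree
            resident += child_file
        peak = max(peak, resident + f[v])             # while processing v itself
        return peak, f[v]
    return visit(root)[0]

children = {"root": ["a", "b"], "a": ["a1", "a2"], "b": []}
f = {"root": 2, "a": 3, "b": 4, "a1": 5, "a2": 1}     # file size of each node
print(peak_memory(children, f, "root"))               # 9 for this traversal order
```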
In this work, we focus on the archive system that will be used in the Blue Waters supercomputer. We have introduced two archival policies tailored for the large tape storage system that will be available on Blue Waters. We have also shown how to adapt the well-known RAIT strategy (the counterpart of the RAID policy for tapes). We have provided an analytical model of the tape storage platform of Blue Waters, and we used it to assess and analyze the performance of the three policies through simulations. Storage requests were generated using random workloads whose characteristics model various realistic scenarios. The throughput of the system, as well as the average (weighted) response time for each user, are the main objectives.
We proposed a novel job scheduling approach for sharing a homogeneous cluster computing platform among competing jobs. Its key feature is the use of virtual machine technology for sharing resources in a precise and controlled manner. We followed up on this work by addressing the problem of resource utilization. We proposed a new measure of this utilization and demonstrated how, following our approach, one can improve over batch scheduling by orders of magnitude in terms of job stretch, while achieving comparable or better resource utilization.
An alternative to classical fault-tolerance approaches for large-scale clusters is failure avoidance, by which the occurrence of a fault is predicted and a preventive measure is taken. We developed analytical performance models for two types of such measures: preventive checkpointing and preventive migration. We also developed an analytical model of the performance of a standard periodic-checkpointing fault-tolerance approach. We instantiated these models for platform scenarios that are representative of current and future technology trends. We found that preventive migration is the better approach in the short term, but that both approaches have comparable merit in the longer term. We also found that standard non-prediction-based fault tolerance scales poorly when compared to prediction-based failure avoidance, thereby demonstrating the importance of failure-prediction capabilities. Our results also showed that achieving good utilization of truly large-scale machines (e.g., 2^20 nodes) for parallel workloads will require more than the failure-avoidance techniques evaluated in this work.
In the work above, we assumed that checkpoints occur periodically. Indeed, it is usually claimed that such a policy is optimal. However, most of the existing proofs rely on approximations. One such assumption is that the probability that a fault occurs during the execution of an application is very small, an assumption that is no longer valid in the context of exascale platforms. We have begun studying this problem in a fully general context. We have established that, when failures follow a Poisson law, the periodic checkpointing policy is optimal. We have also shown an unexpected result: in some cases, when the platform is sufficiently large, the checkpointing cost sufficiently high, or the failures frequent enough, one should limit the application parallelism and duplicate tasks, rather than fully parallelize the application on the whole platform. In other words, the expected job duration is smaller with fewer processors! To establish this result we derived and analyzed several scheduling heuristics.
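For calibration, the classical first-order approximation of the optimal checkpointing period (Young's formula) gives an idea of the orders of magnitude involved; it relies precisely on the small-failure-probability approximation that this study revisits, and the numbers below are purely hypothetical.

```python
# First-order approximation of the optimal periodic checkpointing interval
# (Young's formula, T_opt ~ sqrt(2 * C * MTBF)). This is the kind of
# small-failure-probability approximation that the study above revisits.
from math import sqrt

def young_period(checkpoint_cost, node_mtbf, nb_nodes):
    platform_mtbf = node_mtbf / nb_nodes          # failures aggregate over the platform
    return sqrt(2 * checkpoint_cost * platform_mtbf)

# Hypothetical exascale-like scenario: 10-minute checkpoints, 10-year node MTBF.
C = 600.0                        # checkpoint cost in seconds
node_mtbf = 10 * 365 * 86400.0   # individual node MTBF in seconds
for nodes in (2**16, 2**20):
    # At 2**20 nodes the period shrinks to roughly the checkpoint cost itself,
    # which is the regime where limiting parallelism starts to pay off.
    print(nodes, round(young_period(C, node_mtbf, nodes)), "seconds")
```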
In this work, we study the efficient execution of iterative applications on volatile resources. We studied a master-worker scheduling scheme that trades off speed against the (expected) reliability and availability of enrolled workers. A key feature of this approach is that it uses a realistic communication model that bounds the capacity of the master to serve the workers, which requires the design of sophisticated resource-selection strategies. The contribution of this work is twofold. On the theoretical side, we assessed the complexity of the problem in its off-line version, i.e., when processor availability behaviors are known in advance; even with this knowledge, the problem is NP-hard. On the pragmatic side, we proposed several on-line heuristics that were evaluated in simulation using a Markovian model of processor availabilities.
ProDom is a protein domain family database automatically built from a comprehensive analysis of all known protein sequences. ProDom development is headed by Daniel Kahn (Inria project-team BAMBOO, formerly HELIX). With the protein sequence databases increasing in size at an exponential pace, the parallelization of MkDom2, the algorithm used to build ProDom, has become mandatory (the original sequential version of MkDom2 took 15 months to build the 2006 version of ProDom).
The parallelization of MkDom2 is not a trivial task. The sequential MkDom2 algorithm is an iterative process, and parallelizing it involves forecasting which of these iterations can be run in parallel, and detecting and handling dependency breaks when they arise. We have moved forward so as to be able to efficiently handle larger databases. Such databases are prone to exhibit far larger variations in the processing time of query sequences than was previously imagined. The collaboration with BAMBOO on ProDom continues today, both on the computational aspects of constructing ProDom on distributed platforms and on the biological aspects of evaluating the quality of the domain families defined by MkDom2 and of qualitatively enhancing ProDom.
This past year was devoted to the full-scale validation of the new parallel MPI_MkDom2 algorithm and code. We proposed a new methodology to compare two clusterings of sub-sequences into domains. We used this methodology to assess that the parallelization using MPI_MkDom2 does not significantly impact the quality of the clustering produced, compared to the one produced by MkDom2. We successfully processed all the sequences included in the April 2010 version of the UniProt database, namely 6,118,869 sequences and 2,194,382,846 amino-acids. The whole computation would have taken 12 years and 97 days sequentially and was completed in parallel in a wall-clock time of 19 days and 12 hours. After a post-processing phase, this will lead to a new release of ProDom in the upcoming months, after a four-year hiatus.
Many scientific applications can be structured as Parallel Task Graphs (PTGs), that is, graphs of data-parallel tasks. Adding data-parallelism to a task-parallel application provides opportunities for higher performance and scalability, but poses additional scheduling challenges. We studied the off-line scheduling of multiple PTGs on a single, homogeneous cluster. The objective was to optimize performance without compromising fairness among the PTGs. Many scheduling algorithms, both from the applied and the theoretical literature, are applicable to this problem, and we propose minor improvements when possible. Our main contribution is an extensive evaluation of these algorithms in simulation, using both synthetic and real-world application configurations, using two different metrics for performance and one metric for fairness. We identify a handful of algorithms that provide good trade-offs when considering all these metrics. The best algorithm overall is one that structures the schedule as a sequence of phases of increasing duration based on a makespan guarantee produced by an approximation algorithm.
Each job submitted to an LRMS (Local Resources Manager System) must provide mandatory information such as the number of requested computing resources and the requested duration of the resource usage, called the walltime. Because the application is killed if it has not finished by the end of the reservation, the walltime is an over-estimation of the duration of the application launched by the job.
In the context of a Grid composed of several clusters managed by a Grid middleware that is able to tune, submit, and cancel LRMS jobs, such over-estimations have an impact on the local scheduling and performance. Consequently, a previous Grid scheduling decision, optimized at the time it was taken, may not be relevant anymore. Thus, we have designed and studied non-intrusive mechanisms allowing a middleware to migrate jobs still in the waiting queues of the different LRMS of the Grid platform. We also proposed different scheduling heuristics, integrated with these mechanisms, which decide on the migration of jobs. We performed an exhaustive set of simulation experiments, in which parameters such as the load of each simulated parallel resource, the type of applications (rigid or moldable), and the dedication of the platform resources were varied. We analyzed the performance of our proposals with respect to different metrics, which revealed some counter-intuitive results.
Constraint Programming emerged in the late 1980s as a successful paradigm to tackle complex combinatorial problems in a declarative manner. It lies at the crossroads of combinatorial optimization, constraint satisfaction problems (CSP), declarative programming languages, and SAT problems (Boolean constraint solvers and verification tools). Up to now, the only parallel method for solving optimization problems that has been deployed at large scale is classical branch-and-bound, because it does not require much information to be communicated between parallel processes (basically, the current bound).
Adaptive Search was proposed as a generic, domain-independent, constraint-based local search method. This meta-heuristic takes advantage of the structure of the problem in terms of constraints and variables and can guide the search more precisely than a single global cost function to optimize, such as the number of violated constraints. A thread-based parallelization of this algorithm, implemented on an IBM BladeCenter with 16 Cell/BE cores, shows nearly ideal linear speed-ups for a variety of classical CSP benchmarks (magic squares, all-interval series, perfect square packing, etc.).
We parallelized the algorithm using the multi-start approach and ran experiments on the HA8000 machine, a Hitachi supercomputer with a maximum of nearly 16,000 cores installed at the University of Tokyo, and on the Grid'5000 infrastructure, the French national Grid for research, which contains 5934 cores deployed on 9 sites distributed over France. Results show that speedups may, surprisingly, be architecture-dependent, and that, although they continue to grow with the number of processors, the increase tends to level off for some problems beyond 128 processes. Work in progress considers communication between the computing resources.
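The multi-start scheme itself is simple: independent restarts of the sequential local search are launched with different random seeds and no communication, and the restart that converges first provides the answer. The toy sketch below only illustrates the scheme on a small all-interval-like problem; it is not the Adaptive Search code, and the local move and acceptance rule are deliberately naive.

```python
# Minimal sketch of the multi-start parallelization: independent local-search
# restarts run in parallel with different random seeds; no communication.
import random
from multiprocessing import Pool

def one_restart(seed, n=12, max_iters=100000):
    rng = random.Random(seed)
    perm = list(range(n)); rng.shuffle(perm)
    def cost(p):   # number of repeated absolute differences between neighbours
        diffs = [abs(p[i] - p[i + 1]) for i in range(n - 1)]
        return (n - 1) - len(set(diffs))
    for it in range(max_iters):
        c = cost(perm)
        if c == 0:
            return seed, it, perm
        i, j = rng.randrange(n), rng.randrange(n)    # local move: swap two cells
        perm[i], perm[j] = perm[j], perm[i]
        if cost(perm) > c and rng.random() < 0.9:    # mostly keep improving moves
            perm[i], perm[j] = perm[j], perm[i]
    return seed, max_iters, None

if __name__ == "__main__":
    with Pool(4) as pool:                        # 4 independent restarts
        results = pool.map(one_restart, range(4))
    print(min(results, key=lambda r: r[1]))      # fastest restart (or the cap)
```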
Service discovery becomes a challenge in a large-scale and distributed context. Heterogeneity and dynamicity are the two main constraints that have to be taken into account in order to ensure reliability and system efficiency. In a heterogeneous context, the load of the service discovery system must therefore be balanced to achieve good performance. Moreover, QoS in such an uncertain and dynamic environment has to be ensured by fail-safe mechanisms (self-stabilization and replication). First, self-stabilization ensures that the system returns to a consistent configuration within a bounded convergence time. Second, replication injects redundancy once the system is consistent. All these mechanisms will be validated and implemented. Furthermore, the service discovery system will interact with other systems such as schedulers, batch submission systems, and storage resource brokers, so the exchange protocols between these components have to be formally defined.
We decided to develop a new implementation, called Spades Based Middleware (Sbam), that includes all the concepts described above. This implementation, written in Java, relies on an efficient communication bus and has been developed according to advanced software engineering methods. The communication layer is based on the Ibis Portability Layer (IPL). Sbam has been evaluated with regard to the response time of service discovery requests. Our experiments demonstrate the efficiency and scalability of the proposed middleware system. It was demonstrated at SuperComputing 2010.
We continued the collaboration with the University of Nevada Las Vegas, studying the benefit of publish/subscribe overlays for the SPADES project. Loosely coupled applications can take advantage of the publish/subscribe communication paradigm. In this paradigm, subscribers declare which events, or which range of events, they wish to monitor, and are asynchronously informed whenever a publisher throws an event. In such a system, when a publication occurs, all peers whose subscriptions contain the publication must be informed. In our approach, the subscriptions are represented by a DR-tree, which is an R-tree where each minimum bounding rectangle is supervised by a peer. Instead of attempting to statically optimize the DR-tree, we give an on-line algorithm, the work function algorithm, which continually changes the DR-tree in response to the sequence of publications, in an attempt to dynamically optimize the structure. The competitiveness of this algorithm is computed to be at most 5 for any instance with at most three subscriptions and an R-tree of height 2. The benefit of the on-line approach is that no prior knowledge of the distribution of publications in the attribute space is needed.
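The matching step can be pictured as follows: a publication is a point in the attribute space, subscriptions are rectangles, and the tree of minimum bounding rectangles lets whole subtrees that cannot contain the point be pruned. The sketch below is a plain, centralized R-tree traversal; it ignores the distribution of the nodes over peers, which is the heart of the DR-tree approach.

```python
# Sketch of matching a publication against subscriptions organized in an
# R-tree of minimum bounding rectangles. Rectangles are ((xmin, ymin),
# (xmax, ymax)); only branches whose rectangle contains the point are explored.
class Node:
    def __init__(self, rect, children=None, subscriber=None):
        self.rect = rect               # bounding rectangle of this subtree
        self.children = children or []
        self.subscriber = subscriber   # set on leaves only

def contains(rect, point):
    (xmin, ymin), (xmax, ymax) = rect
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax

def notify(node, publication, hits):
    if not contains(node.rect, publication):
        return                          # prune the whole subtree
    if node.subscriber is not None:
        hits.append(node.subscriber)    # this subscription contains the event
    for child in node.children:
        notify(child, publication, hits)

if __name__ == "__main__":
    leaf1 = Node(((0, 0), (5, 5)), subscriber="peer-A")
    leaf2 = Node(((4, 4), (9, 9)), subscriber="peer-B")
    root = Node(((0, 0), (9, 9)), children=[leaf1, leaf2])
    hits = []
    notify(root, (4.5, 4.5), hits)
    print(hits)                         # both subscriptions contain this event
```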
In 2010, we added new features to, and fixed bugs of, the DIET WebBoard (a web interface for managing the Décrypthon Grid through DIET): support for multiple users of the same application, an improved database dumping method, statistics and charts, and storage space management. We deployed the newest version of DIET and the DIET WebBoard on the Décrypthon grid.
The MaxDO application “Help cure muscular dystrophy, phase 2” was ported to the World Community Grid. To determine the size of the work-units sent to the World Community Grid users we ran benchmarks on Grid'5000. The project was launched on May 14th, 2009, and has been running since then. On December 10th, 2010, a total of 30,000,549 work-unit results had been sent back by the World Community Grid volunteers, that is 64,972,205,369 positions out of 137,652,178,995 (47.2% of the project; each work-unit contains hundreds of “positions” for two proteins, the result being an energy value for this configuration). We are also checking and sorting the result files, reducing their size, and making statistics for the volunteers (cf. http://).
As resources become more powerful but heterogeneous, application structures are also becoming more complex, not only to harness the available power but also to model physical phenomena more accurately. Efficient mapping and scheduling of applications to resources are thus becoming more challenging. However, this is not possible with current resource management systems (RMS), which assume simple application models.
Therefore, we have done an initial, theoretical study of the gains one can obtain if an RMS could support rigid, fully-predictable, evolving applications. We have proposed an offline scheduling algorithm with optional stretching capabilities. Experiments show that taking into account the evolution of resource requirements leads to significant improvements in all measured metrics, such as resource utilization and completion time. However, the considered stretching strategies do not appear to be very valuable.
Next, we have started revisiting RMS to enable efficient resource selection for complex applications. In 2010, we have focused on moldable applications. We have proposed CooRM, an RMS architecture which delegates the mapping and scheduling responsibility to the applications themselves. Simulations as well as a proof-of-concept implementation of CooRM show that the approach is feasible and performs well in terms of scalability and fairness.
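A minimal sketch of the delegation idea, with a purely hypothetical interface (not the CooRM API): the RMS exposes to each moldable application a view of when different node counts could start, and the application picks the allocation that minimizes its own completion time using its internal performance model.

```python
# Hypothetical sketch of application-driven resource selection: the RMS sends
# an availability view, and the moldable application chooses the allocation
# minimizing its own completion time (wait time plus execution time).
def choose_allocation(availability, exec_time):
    """availability: {node_count: estimated start time};
    exec_time(node_count): the application's own performance model."""
    best = None
    for nodes, start in availability.items():
        completion = start + exec_time(nodes)
        if best is None or completion < best[2]:
            best = (nodes, start, completion)
    return best   # (node_count, start_time, completion_time)

if __name__ == "__main__":
    view = {16: 0.0, 32: 600.0, 64: 3600.0}     # offered by the RMS
    model = lambda n: 7200.0 * 16 / n           # ideal speedup, for the example
    print(choose_allocation(view, model))       # picks 32 nodes here
```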
As future work, we plan to extend CooRM to support evolving and malleable applications. With respect to its applicability to existing systems, we will study its integration into XtreemOS and Salome.
Most software component models focus on the reuse of existing pieces of code called primitive components. There are however many other elements that can be reused in component-based applications. Partial assemblies of components, well defined interactions between components and existing composition patterns (a.k.a. software skeletons) are examples of such reusable elements. It turns out that such elements of reuse are important for parallel and distributed applications.
Therefore, we have designed the High Level Component Model (HLCM), a software component model that supports the reuse of these elements thanks to the concepts of hierarchy, genericity, and connectors, and in particular the novel concept of open connections. Moreover, HLCM supports multiple implementations of its elements so as to allow the optimization of applications for various hardware resources. HLCMi, an implementation of HLCM, has enabled us to validate the approach: algorithmic skeletons as well as parallel interactions such as data sharing, collective communications, and parallel method invocations have been successfully implemented.
Ongoing work includes further evaluations of HLCM with the OpenAtom application, in collaboration with Prof. Kale's team at the University of Illinois at Urbana-Champaign. Furthermore, the model will be used for the development of applications based on the MapReduce paradigm and for their efficient execution on Clouds and desktop grids in the context of the MapReduce ANR project.
In 2010, we have studied whether component models can be useful to deal with complex application structures such as those found in adaptive mesh refinement (AMR) applications. This kind of application relies on dynamic and recursive data structures to adapt the computation grain to the simulation requirements. Though very effective at decreasing the computational load, AMR is seldom used because it is complex to implement.
Therefore, we have evaluated the feasibility of designing and implementing an AMR application, based on the heat equation, on two component models: ULCM and SALOME. These models provide some of the required features, but more are needed. Composite and dynamic management, as found in ULCM, is very important to ease design, but user-defined skeletons and a mechanism to deal with domain decomposition would also be welcome. HLCM makes it possible to define user-defined skeletons, but the issue of handling domain decomposition remains open.
We are investigating this problem targeting an application made of the coupling of several instances of Code_Aster, a thermomechanical calculation code from EDF R&D.
Cloud client applications are able to scale dynamically based on their usage. This leads to a more efficient resource usage and, as a consequence, to cost savings. The problem is non-trivial, as virtual resources have a setup time that cannot be neglected. Several approaches can be used to decide accurately when a Cloud client application needs to scale. We have focused our attention on an approach that allows a Cloud client to scale its platform while compensating for the virtual resource setup time. Our approach uses self-similarities in the Cloud client platform usage to predict resource usage in advance. In doing so, it identifies patterns in the Cloud client's past platform usage, which allows us to make usage predictions with considerable accuracy. We also showed that the prediction accuracy of our approach can be increased by increasing the size of the historical database used for matching.
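The core of the approach can be sketched as a nearest-neighbour search over past usage windows: the window most similar to the most recent one is located in the history, and the value that followed it is used as the prediction. The window size and distance measure below are illustrative, not those of the actual predictor.

```python
# Sketch of the pattern-matching idea: find the past window most similar to
# the recent one, and predict that the usage observed just after that past
# window will repeat.
def predict_next(history, window=6):
    recent = history[-window:]
    best_start, best_dist = None, float("inf")
    # scan every complete past window that is followed by at least one value
    for start in range(len(history) - window):
        candidate = history[start:start + window]
        dist = sum((a - b) ** 2 for a, b in zip(candidate, recent))
        if dist < best_dist:
            best_start, best_dist = start, dist
    return history[best_start + window]   # value that followed the best match

if __name__ == "__main__":
    usage = [10, 12, 30, 55, 50, 20, 11, 13, 31, 54, 51, 21, 10, 12, 29]
    print(predict_next(usage))            # predicts the upcoming peak (54 here)
```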
Infrastructure as a Service clouds are a flexible and fast way to obtain (virtual) resources as demand varies. Grids, on the other hand, are middleware platforms able to combine resources from different administrative domains for task execution. Grids can use clouds as providers of resources such as virtual machines, so that they only use the resources they need at any given moment, but this requires grids to be able to decide when to allocate and release those resources. We analyzed by simulation an economic approach that sets resource prices and determines when to scale resources depending on the users' demand. The results show how the proposed system can successfully adapt to the demand, while at the same time ensuring that resources are fairly shared among users.
Desktop Grids use the computing, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet to compute a large variety of resource-demanding distributed applications. While these applications need to access, compute, store and circulate large volumes of data, little attention has been paid to data management in such large-scale, dynamic, heterogeneous, volatile and highly distributed Grids. In most cases, data management relies on ad-hoc solutions, and providing a general approach is still a challenging issue.
We have proposed the BitDew framework, which addresses the issue of how to design a programmable environment for automatic and transparent data management on computational Desktop Grids. BitDew relies on a specific set of meta-data to drive key data management operations, namely life cycle, distribution, placement, replication and fault-tolerance, with a high level of abstraction.
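As an illustration only (BitDew itself is a Java middleware and its exact attribute names differ), the sketch below shows how per-data attributes such as a replication factor or a placement affinity can drive a simple placement decision, which is the spirit of metadata-driven data management.

```python
# Illustrative sketch, not the BitDew API: data items carry attributes, and a
# scheduler turns those attributes into placement actions. Attribute names
# mirror the concepts (replication, affinity), not the middleware's keywords.
import random

def place(data_items, hosts):
    placement = {}
    for name, attrs in data_items.items():
        candidates = [h for h in hosts
                      if attrs.get("affinity") in (None, h["tag"])]
        replicas = min(attrs.get("replica", 1), len(candidates))
        placement[name] = [h["name"] for h in random.sample(candidates, replicas)]
    return placement

if __name__ == "__main__":
    data = {"genome.db": {"replica": 2},
            "results.tmp": {"replica": 1, "affinity": "fast-disk"}}
    nodes = [{"name": "h1", "tag": "fast-disk"},
             {"name": "h2", "tag": "standard"},
             {"name": "h3", "tag": "standard"}]
    print(place(data, nodes))
```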
Since July 2010, in collaboration with the University of Sfax, we have been developing a data-aware and parallel version of Magik, an application for Arabic writing recognition, using the BitDew middleware. We are targeting digital libraries, which require a distributed computing infrastructure to store the large number of digitized books as raw images and, at the same time, to perform automatic processing of these documents such as OCR, translation, indexing, searching, etc.
In collaboration with the G.V. Kurdyumov Institute for Metal Physics and the LAL/IN2P3, we have developed a Desktop Grid version of the SLinCA (Scaling Laws in Cluster Aggregation) application. SLinCA simulates several general scenarios of monomer aggregation into clusters with many initial configurations of monomers (random, regular, etc.), different kinetics laws (arbitrary, diffusive, ballistic, etc.), and various interaction laws (arbitrary, elastic, non-elastic, etc.). The typical simulation of one cluster aggregation process with 10 monomers takes approximately 1-7 days on a single modern processor, depending on the number of Monte Carlo steps (MCS). However, thousands of scenarios have to be simulated with different initial configurations to get statistically reliable results. To calculate the parameters of the evolving aggregates (moments of probability density distributions, cumulative density distributions, scaling exponents, etc.) with appropriate accuracy (up to 2-4 significant digits), we need better statistics (10^4 to 10^8 runs over many different statistical realizations of aggregating ensembles), so as to be comparable with the accuracy of the available experimental data. These separate runs of the simulation for different physical parameters, initial configurations, and statistical realizations are completely independent and can easily be split among the available CPUs in a “parameter sweep” manner. The large number of runs needed to reduce the standard deviation in Monte Carlo simulations are distributed equally among the available workers and are combined at the end to compute the final result.
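This "parameter sweep" pattern is easy to sketch: each realization is an independent function of its seed and parameters, and the results are aggregated at the end. The placeholder kinetics below merely stands in for the actual aggregation simulation.

```python
# Sketch of parameter-sweep parallelism: independent Monte Carlo runs are
# distributed over workers and combined at the end to estimate the quantity
# of interest and its spread.
import random
import statistics
from multiprocessing import Pool

def one_run(args):
    seed, n_steps = args
    rng = random.Random(seed)
    # placeholder kinetics: the real simulation aggregates monomers into clusters
    size = 1
    for _ in range(n_steps):
        if rng.random() < 0.5:
            size += 1
    return size

if __name__ == "__main__":
    runs = [(seed, 1000) for seed in range(64)]   # 64 independent realizations
    with Pool() as pool:
        sizes = pool.map(one_run, runs)
    print(statistics.mean(sizes), statistics.stdev(sizes))
```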
MapReduce is an emerging programming model for data-intensive applications proposed by Google, which has recently attracted a lot of attention. MapReduce borrows from functional programming: the programmer defines Map and Reduce tasks executed on large sets of distributed data. In 2010, we have developed an implementation of the MapReduce programming model based on the BitDew middleware. Our prototype features several optimizations which make our approach suitable for large-scale and loosely connected Internet Desktop Grids: massive fault tolerance, replica management, barrier-free execution, latency-hiding optimization, as well as distributed result checking. We have presented performance evaluations of the prototype both against micro-benchmarks and real MapReduce applications. The scalability test shows that we achieve linear speedup on the classical WordCount benchmark. Several scenarios involving laggard hosts and host crashes demonstrate that the prototype is able to cope with an experimental context similar to real-world Internet conditions.
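For reference, on the WordCount example the MapReduce model reduces to two user-defined functions plus a grouping (shuffle) phase; the prototype distributes such Map and Reduce tasks over Desktop Grid workers with the fault-tolerance and replication mechanisms listed above. The sketch below is middleware-agnostic.

```python
# Minimal, middleware-agnostic illustration of MapReduce on WordCount:
# map emits (word, 1) pairs, a shuffle groups them by key, reduce sums counts.
from collections import defaultdict

def map_fn(document):
    return [(word, 1) for word in document.split()]

def reduce_fn(word, counts):
    return word, sum(counts)

def mapreduce(documents):
    groups = defaultdict(list)                  # the "shuffle" phase
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

if __name__ == "__main__":
    print(mapreduce(["the quick brown fox", "the lazy dog", "the fox"]))
```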
EDGI is an FP7 European project, following the successful FP7 EDGeS project, whose goal is to build a Grid infrastructure composed of “Desktop Grids”, such as BOINC or XtremWeb, where computing resources are provided by Internet volunteers, “Service Grids”, where computing resources are provided by institutional Grids such as EGEE, gLite, and Unicore, and “Cloud systems” such as OpenNebula and Eucalyptus, where resources are provided on demand. The goal of the EDGI project is to provide an infrastructure where Service Grids are extended with public and institutional Desktop Grids and Clouds.
The main problem with the current infrastructure is that it cannot give users any QoS support for running their applications in the Desktop Grid (DG) part of the infrastructure. For example, a public DG system may return work-unit results with latencies in the range of weeks. Although there are EGEE applications (e.g. the fusion community’s applications) that can tolerate such a long latency, most user communities want much smaller latencies.
In 2010, we have started the development and deployment of the SpeQuloS middleware to solve this critical problem.
We define QoS concretely as a probabilistic guarantee of job makespan or throughput. Providing QoS features even in Service Grids is hard and has not yet been solved satisfactorily. It is even more difficult in an environment where there are no guaranteed resources. In DG systems, resources can leave the system at any time, for a long time or forever, even after taking several work-units with the promise of computing them. Our approach is based on the extension of DG systems with Cloud resources. For critical work-units, the SpeQuloS system is able to dynamically deploy fast and trustable clients from Clouds that are available to support the EDGI DG systems. It decides how many trusted clients and Cloud clients to assign to the QoS applications. At this stage, the prototype is functional and the first version is planned to be delivered to the EDGI production infrastructure during spring 2011.
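A hypothetical sketch of the kind of decision involved (the actual SpeQuloS policies are more elaborate and not reproduced here): given the remaining work-units of a batch, a deadline, and an estimate of the Desktop Grid throughput, compute how many trusted Cloud workers are needed to cover the gap.

```python
# Hypothetical sketch: start just enough Cloud workers to cover the part of a
# batch that the (unreliable) Desktop Grid is not expected to finish in time.
import math

def cloud_workers_needed(remaining_units, deadline_s, dg_throughput,
                         cloud_unit_time_s):
    """dg_throughput: work-units/s expected from the Desktop Grid;
    cloud_unit_time_s: time for one trusted Cloud worker to compute one unit."""
    covered_by_dg = dg_throughput * deadline_s
    gap = max(0.0, remaining_units - covered_by_dg)
    per_worker = deadline_s / cloud_unit_time_s   # units one Cloud worker can do
    return math.ceil(gap / per_worker) if gap > 0 else 0

if __name__ == "__main__":
    # 5000 units left, 2 h deadline, DG delivers 0.5 unit/s, Cloud unit takes 60 s
    print(cloud_workers_needed(5000, 7200, 0.5, 60))   # -> 12 workers
```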
Simulation is a popular approach to obtain objective performance indicators of platforms that are not at one's disposal. It may, for example, help in dimensioning the compute clusters of large computing centers. In many cases, the execution of a distributed application does not behave as expected, and it is then necessary to understand what causes this behavior. Simulation makes it possible to reproduce experiments under similar conditions, which makes it a suitable method for the experimental validation of a parallel or distributed application.
Tracing instrumentation, as provided by profiling tools, saves information about the execution of an application at run-time. Every scientific application computes floating point operations (flops). The originality of our approach is that we measure the flops of the application rather than its execution time. This means that if a distributed application is executed on N cores, and we execute it again mapping two processes per core, then only N/2 cores are needed but the execution takes longer. An execution trace of an instrumented application can be transformed into a corresponding list of actions, and these actions can then be simulated by SimGrid. Moreover, the execution traces contain almost the same data, because the only change is the use of half the cores with the same number of processes; this does not affect the number of flops, so the simulated time is not distorted by the instrumentation overhead. The Grid'5000 platform is used for this work and the NAS Parallel Benchmarks are used to measure the performance of the clusters.
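The underlying time model can be illustrated with a toy formula: if communication is ignored and the k processes mapped on a core time-share it fairly, the simulated duration of a compute action depends only on its recorded flops, the core speed, and k, so the same trace yields a longer simulated time when cores are folded. This is only a simplified illustration of the idea, not the SimGrid model itself.

```python
# Toy model: simulated duration of a compute action from its recorded flops.
# Folding k processes per core multiplies the per-core load by k, while the
# flops recorded in the trace are unchanged.
def simulated_makespan(per_process_flops, core_speed_flops, processes_per_core=1):
    # each core time-shares the processes mapped onto it
    return processes_per_core * per_process_flops / core_speed_flops

if __name__ == "__main__":
    print(simulated_makespan(2.0e9, 1.0e9, 1))   # one process per core: 2.0 s
    print(simulated_makespan(2.0e9, 1.0e9, 2))   # two per core: 4.0 s, same trace
```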
Mumps (see Section ) is a parallel sparse direct solver, using message passing (MPI) for parallelism. In this work we have investigated how thread parallelism can help take advantage of recent multicore architectures. The work done consists in testing multithreaded BLAS libraries and inserting OpenMP directives in the routines revealed to be costly by profiling, with the objective of avoiding any deep restructuring or rewriting of the code. In INRIA report RR-7411 (October 2010), we have reported on various aspects of this work, presented some of the benefits and difficulties, and showed that 4 to 8 threads per MPI process is generally a good compromise for performance, while increasing the number of threads is always interesting in terms of memory usage. We also considered and discussed several issues that appear to be critical with a mixed MPI-OpenMP approach in a multicore environment. In the future we plan to pursue this work on larger numbers of cores.
We have investigated seven maximum transversal algorithms and report on their careful implementation. The algorithms are analyzed and design choices are discussed. To the best of our knowledge, this is the most comprehensive comparison of maximum transversal algorithms based on augmenting paths. Previous papers with the same objective either do not cover all the algorithms discussed in this paper or use non-uniform implementations from different researchers. We use a common code base to implement all of the algorithms and compare their relative performance on a wide range of graphs and matrices. We systematize, develop and use several ideas for enhancing performance. One of these ideas improves the performance of one of the existing algorithms in most cases, sometimes significantly, so much so that we include the improved variant as an eighth algorithm in the comparisons.
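For reference, the simplest member of this family is the depth-first augmenting-path scheme sketched below: each unmatched row searches for a free column, possibly re-routing previously matched rows along an alternating path. The studied algorithms are refinements and combinations of this basic idea.

```python
# Sketch of the basic augmenting-path scheme for the maximum transversal
# (maximum bipartite matching) of a sparse matrix.
def max_transversal(adj, n_cols):
    """adj[r] = list of column indices with a nonzero in row r."""
    col_match = [-1] * n_cols            # col_match[c] = row matched to column c

    def augment(r, visited):
        for c in adj[r]:
            if not visited[c]:
                visited[c] = True
                # column free, or its current row can be re-matched elsewhere
                if col_match[c] == -1 or augment(col_match[c], visited):
                    col_match[c] = r
                    return True
        return False

    matched = 0
    for r in range(len(adj)):
        if augment(r, [False] * n_cols):
            matched += 1
    return matched, col_match

if __name__ == "__main__":
    # rows of a 3x3 sparse matrix and the columns of their nonzeros
    adj = [[0, 1], [0], [1, 2]]
    print(max_transversal(adj, 3))       # full transversal of size 3
```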
The inverse of an irreducible sparse matrix is structurally full, so that it is impractical to think of computing or storing it. However, there are several applications where a subset of the entries of the inverse is required. Given a factorization of the sparse matrix held in out-of-core storage, we show how to compute such a subset efficiently, by accessing only parts of the factors. When there are many inverse entries to compute, we need to guarantee that the overall computation scheme has reasonable memory requirements, while minimizing the cost of loading the factors. This leads to a partitioning problem that we prove is NP-complete. We also show that we cannot get a close approximation to the optimal solution in polynomial time. We thus need to develop heuristic algorithms, and we propose: (i) a lower bound on the cost of an optimum solution; (ii) an exact algorithm for a particular case; (iii) two other heuristics for a more general case; and (iv) hypergraph partitioning models for the most general setting. We illustrate the performance of our algorithms in practice using the Mumps software package on a set of real-life problems as well as some standard test matrices. We show that our techniques can improve the execution time by a factor of 50.
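For completeness, the requested entries can be written in terms of the factors; assuming an LU factorization A = LU (the out-of-core factorization held by the solver), computing entry (i, j) of the inverse amounts to a sparse forward solve followed by a partial backward solve, which is why only parts of the factors need to be loaded:

```latex
% Selected entry of the inverse in terms of the factors of A = LU: a sparse
% forward solve with e_j, then a backward solve of which only component i
% is required.
\[
  (A^{-1})_{ij} \;=\; e_i^{\mathsf T}\, U^{-1} \bigl( L^{-1} e_j \bigr).
\]
```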
We propose a modification of the minimum degree ordering algorithm in which some variables are constrained to be ordered only after some other nodes are ordered. The constrained variables are initially specified, and their constraints are removed during the course of the algorithm. This is close to the minimum degree ordering with constraints algorithm. The difference is that during the course of our algorithm we remove some of the constraints, whereas the constraints are static in the current constrained ordering algorithms. Such an algorithm can have different applications; we target the ordering problem for saddle point matrices.
We consider a family of problems exemplified by the following one: given an m×n matrix A and an integer k ≤ min{m, n}, find a set of k row indices and a set of k column indices such that the number of nonzeros in the submatrix they index (in Matlab notation) is maximized. This is equivalent to finding a k×k submatrix S of A, with S(i, j) = A(r_i, c_j), that contains the maximum number of nonzeros among all k×k submatrices of A. We show that this problem is NP-complete, and then propose and analyze heuristic approaches to the problem. Problems of this nature arise in a family of hybrid solvers for sparse linear systems.
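As an illustration of what a heuristic for this problem can look like (not necessarily one of those analyzed in the paper), the sketch below alternates between keeping the k rows with most nonzeros in the current columns and the k columns with most nonzeros in the current rows, until the nonzero count stops improving.

```python
# Illustration only: an alternating greedy scheme for the k x k densest
# submatrix problem described above.
def dense_kxk(nonzeros, m, n, k, iters=20):
    """nonzeros: set of (i, j) positions of nonzero entries of an m x n matrix."""
    rows, cols = list(range(k)), list(range(k))     # arbitrary initial choice

    def best(axis_size, other, key_of):
        counts = [0] * axis_size
        for (i, j) in nonzeros:
            idx, other_idx = key_of(i, j)
            if other_idx in other:
                counts[idx] += 1
        return sorted(range(axis_size), key=lambda x: -counts[x])[:k]

    score = -1
    for _ in range(iters):
        rows = best(m, set(cols), lambda i, j: (i, j))   # rows given columns
        cols = best(n, set(rows), lambda i, j: (j, i))   # columns given rows
        rs, cs = set(rows), set(cols)
        new_score = sum(1 for (i, j) in nonzeros if i in rs and j in cs)
        if new_score == score:
            break                                        # converged
        score = new_score
    return rows, cols, score

if __name__ == "__main__":
    nz = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 3), (3, 2)}
    print(dense_kxk(nz, 4, 4, 2))                        # dense 2x2 corner block
```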
INRIA and INPT-IRIT have signed a new contract with the company Samtech S.A. (Belgium). Samtech develops the finite element software package SAMCEF, which uses our parallel sparse direct solver Mumps as one of its internal solvers. The goal of this contract is to improve the memory usage of Mumps and to offer the possibility to address a larger amount of memory. We will also study how to use memory already allocated by SAMCEF instead of having the solver allocate its own memory. Finally, we also plan to study how performance can be improved on Samtech problems by allowing the forward substitution step to be performed simultaneously with the matrix factorization. This last point is particularly interesting in the case of out-of-core executions.
The contract is 24 months long, and the new functionalities developed in Mumps for this contract will be made available in a future public release of the package.
J.-Y. L'Excellent is the principal investigator for the LIP; M. Brémond, G. Joslin, and B. Uçar participate in this contract.
PSMN is a federation of laboratories that aims at sharing the parallel machines of ENS Lyon/PSMN as well as experience in parallelizing applications. FLMSN is a wider structure, replacing the FLCHP (Fédération Lyonnaise de Calcul Hautes Performances).
J.-Y. L'Excellent is the correspondent of the LIP in these two structures.
E. Caron leads (with C. Prudhomme from LJK, Grenoble) the “Calcul Hautes Performances et Informatique Distribuée” project of the cluster “Informatique, Signal, Logiciels Embarqués”. Together with several research laboratories from the Rhône-Alpes region, we initiate collaborations between application researchers and distributed computing experts.
Y. Caniou, E. Caron, F. Desprez, J.-Y. L'Excellent, and F. Vivien participate in this project.
In the third and final year of the project (2010), we have pursued the investigation of timed Petri nets to model the mapping of workflows with stage replication, in collaboration with Bruno Gaujal (LIG Grenoble).
We have also investigated several multi-criteria algorithms and heuristics for the problem of mapping pipelined applications, consisting of a linear chain of stages executed in a pipelined fashion, onto heterogeneous platforms. This work was conducted by the post-doctoral student hired on the project, Hinde Bouziane, and the first paper published on this work received the best paper award.
The project is entirely conducted within the GRAAL team by A. Benoit and Y. Robert.
The objective of the Gwendia
Today's emergence of Petascale architectures and the evolution of both research grids and computational grids greatly increase the number of potential resources. However, existing infrastructures and access rules do not make it possible to take full advantage of these resources. One key idea of the SPADES project is to propose a non-intrusive but highly dynamic environment able to take advantage of the available resources without disturbing their native use. In other words, the SPADES vision is to adapt the desktop grid paradigm by replacing the users at the edge of the Internet with volatile resources. These volatile resources are in fact submitted via batch schedulers to reservation mechanisms which are limited in time or susceptible to preemption (best-effort mode).
One of the priorities of SPADES is to support platforms at a very large scale. Petascale environments are therefore particularly considered. Nevertheless, these next-generation architectures still suffer from a lack of expertise for an accurate and relevant use. One of the SPADES goals is to show how to take advantage of the power of such architectures. Another challenge of SPADES is to provide a software solution for a service discovery system able to cope with a highly dynamic platform. This system will be deployed over volatile nodes and thus must tolerate failures. SPADES will propose solutions for the management of distributed schedulers in Desktop Computing environments, together with a co-scheduling framework.
The main goals of this project are to set up such a cooperation, as general as possible with respect to programming models and resource management systems, and to develop algorithms for efficient resource selection. In particular, the project targets the SALOME platform and the GRID-TLSE expert site (http://).
The project is led by Christian Pérez.
Recently, a new vision of cloud computing has emerged where the complexity of an IT infrastructure is completely hidden from its users. At the same time, cloud computing platforms provide massive scalability, 99.999% reliability, and speedy performance at relatively low costs for complex applications and services. This project, led by D. Kondo from INRIA MESCAL, investigates the use of cloud computing for large-scale and demanding applications and services over unreliable resources. In particular, we target volunteered resources distributed over the Internet. In this project, G. Fedak leads the data management task (WP3).
MapReduce is a parallel programming paradigm successfully used by large Internet service providers to perform computations on massive amounts of data. After being strongly promoted by Google, it has also been implemented by the open source community through the Hadoop project, maintained by the Apache Foundation and supported by Yahoo! and even by Google itself. This model is currently getting more and more popular as a solution for rapid implementation of distributed data-intensive applications. The key strength of the Map-Reduce model is its inherently high degree of potential parallelism.
In this project, the GRAAL team participates in several work packages which address key issues such as efficient scheduling of several MapReduce applications, integration using components on large infrastructures, security and dependability, and MapReduce for Desktop Grids.
The emergence of exascale computers will make it possible to address new scientific challenges. However, the scientific applications deployed on such machines, comprising up to millions of cores, will have to cope with numerous failures: it is forecast that, with current techniques, the mean time between two consecutive failures will be shorter than the time needed to checkpoint an application using the whole platform. The main objective of the RESCUE project is to develop new algorithmic techniques and new software to solve the fault-tolerance problem on exascale machines.
The RESCUE project is led by Y. Robert and involves three INRIA teams: Grand-Large, HiePACS and GRAAL (A. Benoit, L. Marchal, F. Vivien).
ADT-MUMPS is an action of technological development funded by Inria. This project provides support for 24 person-months of a young engineer (“ingénieur jeune diplômé”). A permanent engineer from INRIA/SED also works on the project (Maurice Brémond, 30% of his time). One goal of the project is to improve the daily work of Mumps developers by improving the software engineering aspects, by developing non-regression tests, and by developing drivers to experiment with the package. This project is in collaboration with ENSEEIHT-IRIT.
ALADDIN is an Inria action of technological development for “A LArge-scale DIstributed and Deployable INfrastructure”, whose aim is to manage the Grid'5000 experimental platform. Frédéric Desprez is leading this project (with David Margery from Rennes as the Technical Director).
ADT BitDew is an INRIA support action of technological development for the BitDew middleware. Its objectives are several-fold: (i) provide documentation and education material for end-users, (ii) improve software quality and support, (iii) develop new features allowing the management of Cloud and Grid resources. The ADT BitDew, led by G. Fedak, funds the recruitment of a young engineer for 24 months.
Hemera deals with the scientific animation of the Grid'5000 community. It aims at making progress in the understanding and management of large scale infrastructure by leveraging competences distributed in various French teams. Hemera contains several scientific challenges and working groups. Christian Pérez is leading the project that involves more than 20 teams located in 9 cities of France.
This action addresses economic issues concerning energy efficiency in scientific and production grids. Different issues are addressed, such as the confrontation of the energy models used in experimental grids with the operational realities of production grids, the study of new energy prediction models based on real measurements of energy consumption in production grids, and the design of energy-aware scheduling heuristics.
Y. Caniou participates in this action.
The SmartGame start-up asked to benefit from the expertise of the GRAAL research team in distributed systems and middleware systems. The aim of this company is to create a new generation of games using a new distributed architecture. E. Caron and F. Desprez participate in this action.
Following the success of the NoE CoreGRID, an ERCIM Working Group was started in 2009, led by F. Desprez. This working group gathers 31 research teams from all over Europe working on Grids, service-oriented architectures, and Clouds.
A workshop on Grids, Clouds, and P2P Computing was organized in conjunction with EuroPAR 2010, Ischia, August, 2010.
This project is led by P. Kacsuk and involves the following partners: SZTAKI, INRIA, CIEMAT, Fundecyt, University of Westminster, Cardiff University, and University of Coimbra. Grid systems are currently being used and adopted by a growing number of user groups and diverse application domains. However, there still exist many scientific communities whose applications require much more computing resources than existing Grids like EGEE can provide. The main objective of this project is to interconnect the existing EGEE Grid infrastructure with existing Desktop Grid (DG) systems like BOINC or XtremWeb, in a strong partnership with EGEE. The interconnection of these two types of Grid systems will enable more advanced applications and provide extended compute capabilities to more researchers. In this collaboration, G. Fedak represents the GRAAL team, is responsible for JRA1: Service Grids-Desktop Grids Bridges Technologies, and is involved in JRA3: Data Management, as well as NA3: Standardization within the OGF group.
The EDGI project will develop middleware that consolidates the results achieved in the EDGeS project concerning the extension of Service Grids with Desktop Grids, in order to support EGI and NGI user communities that are heavy users of DCIs and require an extremely large number of CPUs and cores. EDGI will go beyond existing DCIs, which are typically cluster Grids and supercomputer Grids, and will extend them with public and institutional Desktop Grids and Clouds. EDGI will integrate software components of ARC, gLite, Unicore, BOINC, XWHEP, 3G Bridge, and Cloud middleware such as OpenNebula and Eucalyptus into SG→DG→Cloud platforms for service provision; as a result, EDGI will extend ARC, gLite and Unicore Grids with volunteer and institutional DG systems. Our partners in EDGI are: SZTAKI, INRIA, CIEMAT, Fundecyt, University of Westminster, Cardiff University, and University of Coimbra. In this project, G. Fedak is the INRIA representative and leads the JRA2 work package, which is responsible for providing QoS to Desktop Grids.
This project aims at improving the scalability of state-of-the-art computational fluid dynamics calculations through the use of state-of-the-art numerical linear algebra approaches. It mainly involves Tel Aviv University and ENSEEIHT-IRIT (Toulouse), where Alfredo Buttari is the coordinator for the French side. In Graal, I. Chowdhury, J.-Y. L'Excellent, and B. Uçar participate in this project.
The collaboration is done with the Concurrency Research Group (CoRG) of Henri Casanova and the Bioinformatics Laboratory (BiL) of Guylaine Poisson, both of the Information and Computer Sciences Department of the University of Hawai`i at Mānoa, USA.
The associated team targets the efficient scheduling of large-scale scientific applications on clusters and Grids. To provide context for this research, we focus on applications from the domain of bioinformatics, in particular comparative genomics and metagenomics applications, which are of interest to a large user community today. So far, applications (in bioinformatics or other fields) that have been successfully deployed at a large scale fall under the “independent task model”: they consist of a large number of tasks that do not share data and that can be executed in any order. Furthermore, many of these application deployments rely on the fact that the application data for each task is “small”, meaning that the cost of sending data over the network can be ignored in the face of long computation times. However, neither of these assumptions is valid for all applications, and in fact many crucial applications, such as the aforementioned bioinformatics applications, require computationally dependent tasks sharing very large data sets.
In our previous collaborations, we have tackled the issue of non-negligible network communication overheads and have made significant contributions. For instance, we have designed strategies that rely on the notions of steady-state scheduling (i.e., attempting to maximize the number of tasks that complete per time unit, in the long run) and/or divisible load scheduling (i.e., approximate the discrete workload that consists of individual tasks as a continuous workload). These strategies provide powerful means for rethinking the deployment and the scheduling of independent task applications when network communication can be a bottleneck. However, the target applications in this project cannot benefit from these strategies directly and will require fundamental advances. This project aims to build upon and go beyond our past collaborations, with two main research thrusts:
Scheduling of applications with data requirements. We consider applications that require possibly multiple data files that need to be shared by multiple application tasks. These files may be extremely large (e.g., millions of genomic sequences) and may need to be updated frequently (e.g., when new sequences are identified). We must then ensure that file access is not a bottleneck.
Scheduling of multiple concurrent applications. We also plan to study the scheduling for multiple applications, i.e., launched by different (most likely competing) users. We then aim to orchestrate computation and communication in order to have the best aggregate performance. This is a difficult problem, first in order to define a good performance metric, and then to maximize this performance metric in a tractable way.
A. Benoit, E. Caron, F. Desprez, Y. Robert and F. Vivien participate in this project.
This project federates INRIA Saclay, CNRS IRIT, CEA Saclay, INRIA Bordeaux, CNRS Prism, INRIA Rennes on the French side and the University of Tokyo, The University of Tsukuba, Titech, Kyoto University on the Japanese side. The main goal of the project is to develop a programming chain and associated runtime systems which will allow scientific end-users to efficiently execute their applications on post-petascale, highly hierarchical computing platforms making use of multi-core processors and accelerators.
Y. Caniou and J.-Y. L'Excellent participate in this project.
Yves Caniou obtained a CNRS delegation for the academic year 2009-2010, and this delegation was extended for the academic year 2010-2011. He is working at the CNRS Japanese-French Laboratory for Informatics (JFLI), supervised by Philippe Codognet. The JFLI is located in Tokyo, Japan, and is a partnership between the University of Tokyo, Université Pierre et Marie Curie (UPMC), Keio University, CNRS, and the NII.
The Mumps team organized a Mumps User Group Meeting on April 15th and 16th, 2010 at ENSEEIHT, Toulouse. This was the second edition of a series of meetings started with the 2006 Mumps User Group Meeting. The aim of this event was to bring together experts both from academia and industry. The general theme of the meeting was sparse direct solvers and related issues, ranging from applications to experiences with Mumps and other direct solvers, and combinatorial ingredients of sparse direct solvers.
The GRAAL project at École normale supérieure de Lyon organized a workshop in Aussois, France on June 2–4, 2010. The workshop focused on scheduling for large-scale systems and on scientific computing. This was the fifth edition of this workshop series, after Aussois in August 2004, San Diego in November 2005, Aussois in May 2008, and Knoxville in May 2009.
The GRAAL project organized a day around Cloud platforms and research issues at ENS Lyon on December 13. This event, which gathered more than 180 attendees, made it possible to share experiences and solutions from both academia and industry on the management of large-scale virtualized resources.
A. Benoit was the Program Chair of the 19th International Heterogeneity in Computing Workshop, HCW 2010, held in Atlanta, USA, April 2010, in conjunction with IPDPS 2010, and she is the General Chair of HCW 2011 in Anchorage, USA, May 2011 (in conjunction with IPDPS 2011). She co-organized the 7th International Workshop on Practical Aspects of High-Level Parallel Programming (PAPP 2010) in Amsterdam, The Netherlands, May 2010.
A. Benoit was a member of the program committee of IPDPS 2010, HiPC 2010, HLPP 2010, APDCM 2010, ICCS 2010. She is a member of the program committee of IPDPS 2011 and SPAA 2011.
Y. Caniou is a member of the program committees of the Heterogeneous Computing Workshop 2010 and 2011, and of the ICCSA 2010 and 2011 conferences.
E. Caron was a member of the program committees of PDP 2010, ISPA'2010, HCW'10, MapRed'2010, and CloudCom 2010.
He is co-chair of the Grid-RPC group in the OGF (Open Grid Forum). He is a co-founder of the SysFera startup company and continues to be involved as a scientific consultant.
F. Desprez is a member of the EuroPar Advisory Board and of the editorial board of “Scalable Computing: Practice and Experience” (SCPE).
F. Desprez participated in the program committees of DEPEND'2010, CCGRID 2010, EuroMPI'2010, VECPAR'10, CCGrid-Health 2010, the workshop “Grids meet Autonomic Computing”, and InterCloud2010. He was the vice-chair of the scheduling topic of EuroPar'2010, LaSCoG 2010, the vice-chair of the “Tools/Software/Middleware” topic of Grid'2010, the program chair of the VTDC workshop in conjunction with HPDC'10, the Cluster and Cloud Computing Track of IEEE ICPADS 2010, CloudCom 2010, HeteroPar'2010, CloudComp'10, MobiCloud'2010, and the Cloud and Grid Computing track of AICCSA'2010.
G. Fedak co-chaired two workshops, PCGRID'10 and MAPREDUCE'10, associated respectively with CCGRID (Melbourne, Australia, 2010) and HPDC (Chicago, USA, 2010). He was the track co-chair for the High-speed Distributed Systems and Grids (HDSG) track of the 19th IEEE International Conference on Computer Communications and Networks (ICCCN), Zurich, Switzerland, August 2010. He was a member of the program committees of the following conferences and workshops: CloudCom 2010 (Indianapolis, USA, 2010), CoreGrid'10, associated with EuroPar (Ischia-Naples, Italy, 2010), and MapRed'10, associated with CloudCom'2010 (Indianapolis, USA, 2010).
He co-chairs two workshops, PCGRID'11 and MAPREDUCE'11, associated respectively with IPDPS (Anchorage, Alaska, 2011) and HPDC (San José, CA, 2011). He is a member of the program committees of HPDC'2011 (San Jose, California, 2011), CCGRID 2011 (Newport Beach, CA, 2011), ScalCom-11 (Cyprus, 2011), RenPar'20 (Saint-Malo, France, 2011), DICTAP 2011 (Dijon, France, 2011), 3DAPAS, in conjunction with HPDC 2011 (San Jose, CA, USA, 2011), and MSOP2P'11, in conjunction with EuroMicro PDP 2011 (Ayia Napa, Cyprus, 2011).
was a member of the program committee of VECPAR'10 (Berkeley, California).
was a member of the program committee of ICNC 2010, LaSCoG 2010, IPDPS 2011 and HCW 2011.
C. Pérez was a member of the program committees of VECPAR'10 (Berkeley, CA, USA, June 22-25, 2010), CBHPC (Brussels, Belgium, October 26, 2010), HPCC (Melbourne, Australia, September 1-3, 2010), and FMMC (Heidelberg Academy of Sciences, Germany, March 17-19, 2010).
He is a local chair of Euro-Par 2011 (Bordeaux, France, August 29-September 2, 2011). He is a member of the program committees of ParCo (Ghent, Belgium, August 30-September 2, 2011), HipHaC (San Antonio, Texas, USA, February 12, 2011), MapReduce (San Jose, California, USA, June 2011), and RenPar'20 (St Malo, France, May 10-13, 2011). He is a member of the Steering Committee of CBHPC.
C. Pérez served as an expert for evaluating proposals submitted to the 2010 “White” call of the ANR.
Y. Robert is a member of the editorial boards of the International Journal of High Performance Computing Applications (Sage Press), of the Journal of Computational Science (Elsevier), and of the International Journal of Grid and Utility Computing.
Y. Robert was Program Chair of HiPC'2010, track Algorithms and Applications. He will be program vice-chair of ICPP'2011, track Algorithms and Applications.
Yves Robert is a member of the Steering Committee of HCW (IEEE Workshop on Heterogeneity in Computing) of IPDPS (IEEE Int. Parallel and Distributed Symposium), and of HeteroPar (Int. Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms).
was a member of the program committee of Algorithms and Applications track of ICNC'10, the First International Conference on Networking and Computing, Higashi Hiroshima, Japan, November 17–19, 2010. He was also a member of the program committee of IPDPS 2010, TCPP PhD forum.
B. Uçar organized a mini-symposium entitled “Parallel sparse matrix computations and enabling algorithms” as a part of the SIAM Conference on Parallel Processing for Scientific Computing (PP10), February 24–26, 2010, Seattle, Washington, USA.
He is an associate editor of Parallel Computing.
F. Vivien was a member of the program committees of EuroPDP 2010 (Pisa, Italy, February 2010), HiPC'2010 (Goa, India, December 19-22, 2010), and Cluster 2010 (Heraklion, Greece, September 20-24, 2010). He was the Program Chair of HeteroPar 2010 (the 8th International Workshop on Algorithms, Models, and Tools for Parallel Computing on Heterogeneous Platforms), Ischia-Naples, Italy, August 31, 2010.
Anne Benoit was responsible for the 3rd-year students in fundamental computer science at ENS Lyon until August 2010. She gave a course on algorithms to the 3rd-year students.
offered CR11-Grid and Cloud Computing lecture series in the Master d'Informatique at ENS Lyon.
offered CR07-Sparse matrix computations lecture series in the Master d'Informatique Fondamentale at ENS Lyon.
offered CR08-Scheduling lecture series in the Master d'Informatique Fondamentale at ENS Lyon.
is a member (in fact, the only European member) of the NSF/TCPP initiative on the parallel and distributed computing (PDC) curriculum. A working group from IEEE TCPP, NSF, and the sister communities has taken up the task of proposing a curriculum for computer science (CS) and computer engineering (CE) undergraduates on parallel and distributed computing. The goal of this committee has been to propose a core curriculum for CS/CE undergraduates, with the premise that every such undergraduate should achieve a specified skill level regarding PDC-related topics as a result of required coursework. Over the last months, the working group has deliberated upon various topics and subtopics, agreed upon their learning outcomes and level of coverage, identified where in current core courses these could be introduced, and provided examples of how they might be taught. Limited reviews have been carried out by selected stakeholders. Early adopters in Fall 2010 and Spring 2011 will be employing and evaluating the proposed curriculum. See http://
gave a course on Parallel Algorithms to 2nd year students.