Section: Partnerships and Cooperations

National Initiatives

French National Fund for the Digital Society Project (FSN)

FSN XLcloud, 2012-2014

Participants : Jean-Patrick Gelas, Laurent Lefèvre, François Rossigneux.

Focused on high-performance computing, the XLcloud collaborative project sets out to define and demonstrate a cloud platform based on HPC-as-a-Service. The platform is designed for computationally intensive workloads and offers interactive remote visualisation capabilities, thus allowing different users to work on a common platform. The XLcloud project members design, develop and integrate the software elements of a High Performance Cloud Computing (HPCC) system.

Expected results of the project include: functional and technical specifications of the XLcloud platform architecture; an open source API for the XLcloud platform; an implementation of algorithms for 3D and video streaming display; a prototype of the XLcloud platform, including support for on-demand virtual clusters and a remote visualisation service; and use cases for validation that illustrate the performance and suggest future improvements.

XLcloud aims at overcoming some of the most important challenges in deploying high-performance applications operationally in the Cloud. The goal is to allow the project partners to take a leadership position in the market, as cloud service providers or as technology providers. XLcloud relies on a consortium of partners: BULL (project leader), TSP, Silkan, EISTI, Ateme, Inria, CEA List, OW2 and AMG.Lab.

In this project, the Avalon team investigates energy awareness and energy efficiency in OpenStack-based cloud platforms.

French National Research Agency Projects (ANR)

ANR INFRA MOEBUS, Multi-objective scheduling for large computing platforms, 4 years, ANR-13-INFR-000, 2013-2016

Participants : Christian Perez, Laurent Lefèvre, Frédéric Suter.

The continuing evolution of computing platforms leads to a highly diversified and dynamic landscape. The most significant classes of parallel and distributed systems are supercomputers, grids, clouds and large hierarchical multi-core machines. They are all characterized by an increasing complexity in managing jobs and resources. This complexity stems from the variety of both hardware and application characteristics. The MOEBUS project focuses on the efficient execution of parallel applications that are submitted by various users and share resources in large-scale high-performance computing environments.

We propose to investigate new functionalities that can be added at low cost to current large-scale schedulers and programming standards, for a better use of the resources according to various objectives and criteria. We propose to revisit the principles of existing schedulers after studying the main factors impacted by job submissions. We will then propose novel, efficient algorithms that optimize the schedule for unconventional objectives such as energy consumption, and design provable multi-objective approximation algorithms for some relevant combinations of objectives. An important characteristic of the project is its balance between theoretical analysis and practical implementation. The most promising ideas will be integrated in reference systems such as SLURM and OAR, and will lead to new features in implementations of programming standards such as MPI or OpenMP.

ANR ARPEGE MapReduce, Scalable data management for Map-Reduce-based data-intensive applications on cloud and hybrid infrastructures, 4 years, ANR-09-JCJC-0056-01, 2010-2013

Participants : Frédéric Desprez, Gilles Fedak, Sylvain Gault, Christian Perez, Anthony Simonet.

MapReduce is a parallel programming paradigm successfully used by large Internet service providers to perform computations on massive amounts of data. After being strongly promoted by Google, it has also been implemented by the open source community through the Hadoop project, maintained by the Apache Foundation and supported by Yahoo! and even by Google itself. This model is currently getting more and more popular as a solution for rapid implementation of distributed data-intensive applications. The key strength of the MapReduce model is its inherently high degree of potential parallelism.
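To make the source of this parallelism concrete, the following minimal sketch counts words in the MapReduce style. It is an illustration only, not code from the project or from Hadoop: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a local thread pool stands in for a distributed runtime. Each map call sees a single input chunk and each reduce call sees a single key, so the calls within each phase are independent of one another.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word of one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(item):
    """Reduce: combine all values emitted for a single key."""
    key, values = item
    return key, sum(values)


def word_count(chunks, workers=4):
    # The thread pool is only a stand-in for a cluster of workers (assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))      # one task per chunk
        grouped = shuffle(mapped)                       # regroup by key
        return dict(pool.map(reduce_phase, grouped.items()))  # one task per key


if __name__ == "__main__":
    print(word_count(["the map step", "the reduce step"]))
```

In a real deployment the chunks would be blocks of a distributed file system and the shuffle would move data between machines; the sketch only shows why the map and reduce steps can be spread over many workers.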

In this project, the Avalon team participates in several work packages that address key issues such as the efficient scheduling of several MapReduce applications, integration using components on large infrastructures, security and dependability, and MapReduce for Desktop Grids.

ANR COSINUS COOP, Multi Level Cooperative Resource Management, 3.5 years, ANR-09-COSI-001-01, 2009-2013

Participants : Frédéric Desprez, Christian Perez, Noua Toukourou.

The main goals of this project are to set up a cooperation that is as general as possible between programming models and resource management systems, and to develop algorithms for efficient resource selection. In particular, the project targets the SALOME platform and the GRID-TLSE expert site (http://gridtlse.org/) as examples of programming models, and PadicoTM, Diet and XtreemOS as examples of a communication manager, a grid middleware and a distributed operating system, respectively.

The project is led by Christian Perez.

ANR INFRA SONGS, Simulation Of Next Generation Systems, 4 years, ANR-12-INFRA-11, 2012-2015

Participants : Frédéric Desprez, Georgios Markomanolis, Jonathan Rouzaud-Cornabas, Frédéric Suter.

The last decade has brought tremendous changes to the characteristics of large-scale distributed computing platforms. Large grids processing terabytes of information a day and peer-to-peer technology have become common, even though understanding how to efficiently exploit such platforms still raises many challenges. As demonstrated by the USS SimGrid project, simulation has proved to be a very effective approach for studying such platforms. Although they are even more challenging, we think the issues raised by petaflop/exaflop computers and emerging cloud infrastructures can be addressed using a similar simulation methodology.

The goal of the SONGS project is to extend the applicability of the SimGrid simulation framework from Grids and Peer-to-Peer systems to Clouds and High Performance Computation systems. Each type of large-scale computing system will be addressed through a set of use cases and led by researchers recognized as experts in this area.

Any sound study of such systems through simulations relies on the following pillars of simulation methodology: Efficient simulation kernel; Sound and validated models; Simulation analysis tools; Campaign simulation management.

Inria Large Scale Initiative

HEMERA, 4 years, 2010-2014

Participants : Christian Perez, Laurent Pouilloux, Laurent Lefèvre.

Hemera coordinates the scientific activities of the Grid'5000 community. It aims at making progress in the understanding and management of large-scale infrastructures by leveraging expertise distributed across various French teams. Hemera comprises several scientific challenges and working groups. The project involves around 24 teams located all around France.

C. Perez is leading the project; L. Lefèvre and L. Pouilloux are managing scientific challenges on Grid'5000.

C2S@Exa, Computer and Computational Sciences at Exascale, 4 years, 2013-2017

Participants : Frédéric Desprez, Christian Perez, Laurent Lefèvre.

Since January 2013, the team has been participating in the C2S@Exa Inria Project Lab (IPL). This national initiative aims at developing numerical modeling methodologies that fully exploit the processing capabilities of modern massively parallel architectures, in the context of a number of selected applications related to important scientific and technological challenges for the quality and the security of life in our society. At the current state of the art in technologies and methodologies, a multidisciplinary approach is required to overcome the challenges raised by the development of highly scalable numerical simulation software that can exploit computing platforms offering several hundreds of thousands of cores. Hence, the main objective of C2S@Exa is to establish a continuum of expertise in the computer science and numerical mathematics domains, by gathering researchers from Inria project-teams whose research and development activities are tightly linked to high performance computing issues in these domains. More precisely, this collaborative effort involves computer scientists who are experts in programming models, environments and tools for harnessing massively parallel systems; algorithmists who propose algorithms and contribute to generic libraries and core solvers in order to benefit from all levels of parallelism, with the main goal of optimal scaling on very large numbers of computing entities; and numerical mathematicians who study numerical schemes and scalable solvers for systems of partial differential equations, with a view to simulating very large-scale problems.

Inria ADT

Inria ADT Aladdin, 4 years, 2008-2014

Participants : Simon Delamare, Frédéric Desprez, Matthieu Imbert, Laurent Lefèvre, Christian Perez.

ADT Aladdin is an Inria technological development support action that supports the Grid'5000 instrument. Frédéric Desprez is leading this action (with David Margery from Rennes as the Technical Director). More information is given in Section 5.8.