

Section: Partnerships and Cooperations

National Initiatives

PIA

PIA ELCI, Environnement Logiciel pour le Calcul Intensif, 2014-2018

Participants : Mathilde Boutigny, Thierry Gautier, Laurent Lefèvre, Christian Perez, Issam Raïs, Jérôme Richard, Philippe Virouleau.

The ELCI PIA project is coordinated by BULL with several partners: CEA, Inria, SAFRAN, UVSQ.

This project aims to improve the support for numerical simulations and High Performance Computing (HPC) by providing a new-generation software stack to control supercomputers, improved numerical solvers, pre- and post-processing software, as well as programming and execution environments. It also aims to validate the relevance of these developments by demonstrating their capacity to deliver better scalability, resilience, modularity, abstraction, and interaction on selected application use cases. Avalon is involved in the ELCI Work Packages WP1 and WP3 through the PhD of Issam Raïs and the postdoc of Hélène Coullon. Laurent Lefèvre is the Inria representative on the ELCI technical committee.

MRSEI

Fennec, FastEr NaNo-Characterisation, 24 months, 2018-2021

Participants : Eddy Caron, Christian Perez.

The goal of the ANR-MRSEI FENNEC project is to support the submission of a proposal to the European call DT-NMBP-08-2019, entitled “Real-time nano-characterisation technologies (RIA)”.

Inria Large Scale Initiative

DISCOVERY, DIStributed and COoperative management of Virtual EnviRonments autonomouslY, 4 years, 2015-2019

Participants : Maverick Chardet, Jad Darrous, Christian Perez.

To accommodate the ever-increasing demand for Utility Computing (UC) resources, while taking into account both energy and economic concerns, the current trend consists of building larger and larger data centers in a few strategic locations. Although such an approach enables UC providers to cope with the current demand while continuing to operate UC resources through centralized software systems, it is far from delivering sustainable and efficient UC infrastructures for future needs.

The DISCOVERY initiative aims at exploring a new way of operating Utility Computing (UC) resources by leveraging any facilities available through the Internet in order to deliver widely distributed platforms that can better match the geographical dispersal of users as well as the ever-increasing demand. Critical to the emergence of such locality-based UC (LUC) platforms is the availability of appropriate operating mechanisms. The main objective of DISCOVERY is to design, implement, demonstrate, and promote the LUC Operating System (OS), a unified system in charge of turning a complex, extremely large-scale, and widely distributed infrastructure into a collection of abstracted computing resources that is efficient, reliable, secure, and at the same time friendly to operate and use.

To achieve this, the consortium brings together experts in research areas such as large-scale infrastructure management systems and network and P2P algorithms. Moreover, two key network operators, namely Orange and RENATER, are involved in the project.

By deploying and using such a LUC Operating System on backbones, our ultimate vision is to make it possible to host and operate a large part of the Internet within its own internal structure: a scalable set of resources delivered by the computing facilities that form the Internet, from the larger hubs operated by ISPs, governments, and academic institutions down to any idle resources provided by end users.

HAC SPECIS, High-performance Application and Computers, Studying PErformance and Correctness In Simulation, 4 years, 2016-2020

Participants : Dorra Boughzala, Idriss Daoudi, Thierry Gautier, Laurent Lefèvre, Frédéric Suter.

Over the last decades, both the hardware and software of modern computers have become increasingly complex. Multi-core architectures comprising several accelerators (GPUs or the Intel Xeon Phi) and interconnected by high-speed networks have become mainstream in HPC. Obtaining the maximum performance of such heterogeneous machines requires breaking with the traditional uniform programming paradigm. To scale, application developers have to make their code as adaptive as possible and to relax synchronizations as much as possible. They also have to resort to sophisticated and dynamic data management, load balancing, and scheduling strategies. This evolution has several consequences:

First, this increased complexity and the relaxation of synchronizations make programs even more error-prone than before. The resulting bugs may almost never occur at small scale but systematically occur at large scale and in a non-deterministic way, which makes them particularly difficult to identify and eliminate.

Second, the dozens of software stacks involved and their interactions have become so complex that predicting the performance (in terms of time, resource usage, and energy) of the system as a whole is extremely difficult. Understanding and configuring such systems has therefore become a key challenge.

These two challenges related to correctness and performance can be addressed by gathering the skills of experts in formal verification, performance evaluation, and high-performance computing. The goal of the HAC SPECIS Inria Project Laboratory is to answer the methodological needs raised by the recent evolution of HPC architectures by allowing application and runtime developers to study such systems from both the correctness and the performance points of view.