Section: Partnerships and Cooperations
National Initiatives
ANR
OverFlow (2015–2019)
Participants : Alexandru Costan, Pedro de Souza Bento Da Silva, Paul Le Noac'h.
- Project Acronym:
- Project Title: Workflow Data Management as a Service for Multisite Applications
- Coordinator:
- Duration:
- Other Partners:
- External collaborators: Kate Keahey (University of Chicago and Argonne National Laboratory), Bogdan Nicolae (Argonne National Laboratory)
- Web site:
This project investigates data management approaches that enable efficient execution of geographically distributed workflows running on multi-site clouds.
In 2019, we focused on the challenges of stream processing at the Edge. In particular, Edge computing presents a significant opportunity to realize the potential of distributed ML models with regard to low latency, high availability and privacy. For instance, it allows inference for simple image, video or audio classification to run at the Edge: since only the final result is transmitted, delays are minimized, while privacy and bandwidth are preserved in IoT applications. Neural networks can also be partitioned so that some layers are evaluated at the Edge and the rest in the cloud.
In this context, we proposed an architecture in which the initial layers serve as feature-abstraction functions: as data travels through the neural network, it is abstracted into high-level features, which are more lightweight and thus help reduce latency.
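The partitioning idea can be illustrated with a minimal sketch. It is not the architecture proposed in the project: the layer widths, the split point, the random weights and all function names below are hypothetical, chosen only to show that the first layers can run at the Edge and ship a smaller feature vector to the cloud, which then finishes the inference.

```python
import random


def dense_relu(x, weights, biases):
    """Apply one fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def make_layer(n_in, n_out, rng):
    """Build a layer with random weights (placeholder for a trained model)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def run_layers(x, layers):
    """Feed x through the given layers in order."""
    for weights, biases in layers:
        x = dense_relu(x, weights, biases)
    return x


rng = random.Random(0)
sizes = [64, 16, 8, 3]  # hypothetical widths: input, two hidden layers, output
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

split = 1  # hypothetical split point: layers before it run at the Edge
edge_layers, cloud_layers = layers[:split], layers[split:]

sample = [rng.random() for _ in range(sizes[0])]
features = run_layers(sample, edge_layers)   # evaluated at the Edge
scores = run_layers(features, cloud_layers)  # evaluated in the cloud

print(len(sample), len(features), len(scores))
```

In this toy setting, only the 16-value feature vector would cross the network instead of the 64-value input, and the split inference yields the same scores as running all layers in one place.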
Other National Projects
HPC-Big Data Inria Project Lab (IPL)
Participants : Gabriel Antoniu, Alexandru Costan, Daniel Rosendo, Pedro de Souza Bento Da Silva.
- Project Acronym:
- Project Title:
- Coordinator:
- Duration:
- Web site:
The goal of this HPC-BigData IPL is to gather teams from the HPC, Big Data and Machine Learning (ML) areas to work at the intersection of these domains. Research is organized along three main axes: high performance analytics for scientific computing applications, high performance analytics for big data applications, and infrastructure and resource management. Gabriel Antoniu is a member of the Advisory Board and leader of the Frameworks work package.
In 2019, Daniel Rosendo, who was hired in the context of this IPL project, focused on assessing the state of the art in high performance analytics on hybrid HPC/Big Data infrastructures. In particular, a new path for future work was identified: running Machine Learning algorithms at the Edge.
ADT Damaris 2
Participants : Ovidiu-Cristian Marcu, Gabriel Antoniu, Luc Bougé.
- Project Acronym:
- Project Title:
- Coordinator:
- Duration:
- Web site:
This action aims to support the development of the Damaris software. Inria's Technological Development Office (D2T, Direction du Développement Technologique) provided 2 years of funding support for a senior engineer.
Ovidiu Marcu has been funded through this project to document, test and extend the Damaris software and make it a safely distributable product. In 2019, the main goal was to add Big Data analytics support to Damaris. We extended Damaris with a streaming interface for writing and analyzing simulation data in real time through KerA, a distributed streaming storage system.
KerA is further coupled with RAMCloud for in-memory key-value transactions and with Apache Flink for streaming analytics, in an architecture that leverages Apache Arrow as the in-memory columnar data representation for co-located streaming. This hybrid HPC-Big Data architecture is subject to further exploration within the ZettaFlow.io startup.
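The general pattern of a simulation writing columnar batches to a stream that an analytics step then consumes can be sketched as follows. This is a toy illustration only: `InMemoryStream`, `simulate` and `analyze` are hypothetical names and do not reflect the Damaris, KerA, Flink or Arrow APIs, and the producer and consumer run one after the other here, whereas in a real deployment they would run concurrently.

```python
from collections import deque
from statistics import mean


class InMemoryStream:
    """Hypothetical stand-in for a streaming storage layer (not the KerA API)."""

    def __init__(self):
        self._batches = deque()

    def append(self, batch):
        """Producer side: publish one columnar batch."""
        self._batches.append(batch)

    def poll(self):
        """Consumer side: return the oldest batch, or None if the stream is empty."""
        return self._batches.popleft() if self._batches else None


def simulate(stream, iterations, batch_size):
    """Toy simulation that publishes a columnar batch every batch_size iterations."""
    columns = {"iteration": [], "temperature": []}
    for i in range(iterations):
        columns["iteration"].append(i)
        columns["temperature"].append(20.0 + 0.5 * i)  # made-up output variable
        if len(columns["iteration"]) == batch_size:
            stream.append(columns)
            columns = {"iteration": [], "temperature": []}
    if columns["iteration"]:  # flush the last, possibly partial, batch
        stream.append(columns)


def analyze(stream):
    """Compute the mean temperature of each batch, keyed by its first iteration."""
    results = []
    while (batch := stream.poll()) is not None:
        results.append((batch["iteration"][0], mean(batch["temperature"])))
    return results


stream = InMemoryStream()
simulate(stream, iterations=10, batch_size=4)
averages = analyze(stream)
print(averages)
```

Storing each batch as a dictionary of columns loosely mirrors a columnar layout: the analytics step reads one column at a time without unpacking individual records.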
Grid'5000
We are members of the Grid'5000 community and run experiments on the Grid'5000 platform on a daily basis.