BENAGIL

2025 Activity Report, Project-Team BENAGIL

RNSR: 202324438T

Research center: Inria Saclay Centre at Institut Polytechnique de Paris
In partnership with: Institut Polytechnique de Paris, TELECOM SUDPARIS
Team name: Efficient and safe distributed systems
In collaboration with: Services répartis, Architectures, MOdélisation, Validation, Administration des Réseaux

Creation of the Project-Team: 2023 September 01


Keywords‌

Computer Science and Digital Science

A1.1.1. Multicore, Manycore‌
A1.1.4. High performance computing
A1.1.13. Virtualization
A1.3.5. Cloud‌

1 Team members, visitors, external collaborators

Research‌ Scientist

Gael Thomas [Team leader, INRIA‌, Senior Researcher, HDR]

Faculty Members‌

Mathieu Bacou [TELECOM SUDPARIS, Associate Professor‌]
Elisabeth Brunet [TELECOM SUDPARIS, Associate‌ Professor]
Valentin Honore [ENSIIE, Associate‌ Professor]
Alexandre Nolin [TELECOM SUDPARIS,‌ Associate Professor, from Nov 2025]
Pierre‌ Sutra [TELECOM SUDPARIS, Professor, HDR‌]
Francois Trahay [TELECOM SUDPARIS, Professor‌, HDR]

Post-Doctoral Fellows

Nicolas Derumigny [‌TELECOM SUDPARIS, Post-Doctoral Fellow]
Ayush Pandey‌ [TELECOM SUDPARIS, Post-Doctoral Fellow, from‌ Apr 2025]

PhD Students

Tara Aggoun [‌TELECOM SUDPARIS, from Sep 2025]
Mickaël‌ Boichot [TELECOM SUDPARIS, until Jul 2025‌]
Adam Chader [TELECOM SUDPARIS]
Jean-Francois‌ Dumollard [TELECOM SUDPARIS]
Catherine Guelque [‌TELECOM SUDPARIS]
Boubacar Kane [TELECOM SUDPARIS‌, until Jan 2025]
Harena Rakotondratsima [‌TELECOM SUDPARIS, from Sep 2025]
Marie‌ Reinbigler [TELECOM SUDPARIS, until Sep 2025‌]
Jules Risse [INRIA]
Jana Toljaga‌ [TELECOM SUDPARIS]
Guillermo Toyos Marfurt [‌TELECOM SUDPARIS, from Aug 2025]
Nguyen‌ Tung [TELECOM SUDPARIS, from Aug 2025‌]
Lucas Van Lanker [CEA]
Nevena‌ Vasilevska [TELECOM SUDPARIS]

Interns and Apprentices‌

Tara Aggoun [TELECOM SUDPARIS, Intern,‌ from Mar 2025 until Sep 2025]
Joni‌ Dervishi [Telecom SudParis, Intern, from‌ Mar 2025 until May 2025]
Harena Rakotondratsima‌ [TELECOM SUDPARIS, Intern, from Mar‌ 2025 until Sep 2025]

Administrative Assistant

Julienne‌ Moukalou [INRIA]

2 Overall objectives

Distributed systems are pivotal to many applications used in our daily life: AI, data analytics, online gaming, social networks, web services, healthcare, etc. Because they have to sustain massive workloads, these systems scatter computation across many units, which coordinate to store the input, perform the computation and return results to the application in a usable form. Inefficiencies in these infrastructures hinder the ability to handle large computations. They also waste energy and hardware resources. Errors at runtime may result in painful data losses and exploitable security loopholes. As a consequence, designing and implementing such systems in an efficient and safe manner is essential, and all the major IT companies are strongly committed to it.

The Benagil team works on the design and implementation of more efficient and safer distributed systems. To that end, the team focuses on the core system components at the frontier with the hardware: hypervisors, operating systems, language runtimes, storage systems and communication libraries. Improving the efficiency and safety of distributed systems is a challenging task. Modern distributed systems manage large pools of machines, serve a plethora of users and process very large datasets. Consequently, they are inherently complex, and both their design and their implementation are notoriously hard. Complexity arises from the software stack, the algorithms at the core of these systems, as well as the hardware itself:

System software level. A typical modern computer system runs many software components: hypervisors (e.g., KVM/Qemu), operating systems (e.g., Linux), container systems (e.g., Linux containers), language runtimes (e.g., the Java virtual machine) and specialized runtimes for HPC (e.g., MPI), data analytics (e.g., Spark) or AI (e.g., PyTorch). Such software is very large today. For instance, the latest version of the Linux kernel comprises over 22,000,000 lines of code.
Distributed system level. As pointed out above, modern systems are distributed and involve many machines. These machines are connected with heterogeneous networks, ranging from fast local networks (e.g., Infiniband or 10 Gb Ethernet) to high-latency planet-scale connections. Many of these systems have to be highly available, that is, they need to be responsive 99.999% of the time. This requires complex monitoring mechanisms and replication algorithms that solve trade-offs between availability and performance. Distributed systems also need to make fine-grained task and data placement choices. They aggregate resources, have to use them efficiently, and must provide sufficient isolation between the multiple applications using them.
Machine level. Internally, each machine is a very complex entity. Today, it is composed of multiple processors, memory banks and devices interconnected by a complex network. A processor contains tens of cores with finely tunable cache hierarchies and out-of-order execution pipelines. Each core is a very dense computation unit, as attested by the specification of Intel Skylake, which covers more than 4,800 pages. A machine also often includes multiple heterogeneous accelerators and specialized hardware, such as persistent memory that provides durability at the nanosecond scale, GPUs specialized for massively parallel computations, FPGAs used to offload complex computations from the CPUs, and TPUs specialized in deep neural network computation. Access to these components is not uniform in terms of either bandwidth or latency. Heterogeneity must be taken into account at multiple levels of the system stack. This makes data access optimization especially challenging. This complexity also opens security breaches, such as cache timing attacks, code timing attacks and data access pattern attacks. Preventing these attacks requires solving complex trade-offs between performance, security and usability.

The inherent complexity of distributed systems makes analyzing their performance and safety difficult. This difficulty is increased by complex and unexpected interactions between software and hardware components. Besides that, understanding and improving the system components in the context of distributed systems requires expertise in many areas: hypervisors, operating systems, containerization, language runtimes, compilation, network, architecture, web, databases, data analytics runtimes, cloud runtimes and distributed algorithms. As an example, in previous work, we observed a large performance degradation in a data analytics application written in Scala (namely, PageRank in Apache Spark). This phenomenon was caused by a bad memory placement performed by the Java virtual machine on a non-uniform memory architecture. The issue was aggravated by the use of a (system) virtual machine that blindly allocates memory from any memory bank. Another source of inefficiency was the hypervisor, which was continuously moving memory without telling the virtual machine. All in all, understanding and solving the performance bottleneck at each level of the system stack took us 8 years. It involved 3 PhD students and 6 researchers with expertise in different system areas.

3 Research program

The Benagil team works on improving the performance and the safety of the core system components of distributed systems. To achieve this goal, we propose a systematic approach. This approach first consists in profiling and analyzing current distributed systems to identify their limits in terms of efficiency and/or safety when they execute large distributed applications. Then, building upon this analysis, we develop new algorithms, mechanisms and components to improve them.

The Benagil team‌ is structured along three main axes which articulate‌ the above approach. The first axis is devoted‌ to performance profiling and analysis. In this axis,‌ we introduce new tools and techniques to automatically‌ analyze the performance of a large distributed system.‌ Based on this analysis, we identify performance issues,‌ which we use as input in the two‌ other axes to improve performance. The two other‌ axes study two aspects of the system components.‌ In the system components for cloud infrastructure axis,‌ we devise new system techniques to improve the‌ performance and safety of two core system components‌ used in cloud infrastructure: virtualization and storage. In‌ the system components for emerging computing models, we‌ propose new system mechanisms and interfaces for two‌ pivotal upcoming programming models: serverless and edge computing.‌

3.1 Performance analysis

Due to the high complexity‌ of modern large-scale distributed applications, understanding performance problems‌ is a tedious task even for the most‌ experienced programmers. A performance bottleneck may arise from‌ different interactions, between hardware and software, or between‌ different software components. Even just a single contended‌ lock, or a falsely shared cache line, in‌ one of the system components may lead to‌ a dramatic slowdown.

Because of this complexity, manually‌ identifying the root cause of a performance bottleneck‌ is notoriously difficult. In this axis, we propose‌ to help the developer by designing new profiling‌ tools able to handle the complexity of hardware‌ and software stacks, and able to scale with‌ the size of the system.

3.2 System components‌ for the cloud

In this axis, we aim at studying and designing the next generation of systems for cloud infrastructures. Today, these infrastructures are undergoing major changes at the hardware level with the generalization of ultra-fast networks at the microsecond scale (e.g., RDMA) and storage devices (e.g., NVMe or Non-Volatile Memory). Their joint arrival requires radically revisiting the way we design two core system components of any cloud infrastructure: the virtualization system and the storage system.

3.3 System components for emerging computing‌ models

At a higher level of the system stack, we are witnessing the arrival of two new computing models: serverless computing and edge computing. These computing models deeply change the assumptions under which the current system components were built. Current system components assume long-running applications and powerful computing infrastructures. However, this is no longer the case with these two new computing models. In serverless computing, applications are split into short-lived tasks. In edge computing, applications execute at the border of the network, atop low-performance hardware.

4‌‌ Application domains

Overall, the Benagil team mostly specializes in the low-level components of distributed systems. This specialization is at the frontier of security, hardware, high-performance computing (HPC), machine learning, data analytics and databases. With respect to security, the team studies some system aspects, such as trusted execution environments (e.g., Intel SGX) to protect applications, or data replication to improve availability. However, the Benagil team is not a security team per se. Regarding hardware, the Benagil team has a strong background in using modern hardware such as persistent memory or GPUs. This knowledge is crucial to efficiently use the hardware in system components. However, the team only uses hardware and does not directly design it. This is also the case with HPC, machine learning and data analytics. The Benagil team understands the system requirements of these highly demanding applications, and uses them to benchmark its system components. However, the team only rarely contributes to these runtimes themselves. The Benagil team also has strong knowledge of the storage system components used in databases. This includes the algorithmic and implementation concerns related to data distribution, consistency, replication and persistence. However, the Benagil team is not specialized in databases in general.

5‌ Highlights of the year‌

Four PhD students of the team defended their PhD in 2025: Boubacar Kane, Mickaël Boichot, Marie Reinbigler, and Adam Chader.
The team obtained three new grants: the ANR JCJC VHS, the ANR Centeanes, and the PIA Camelia.

6 Latest software developments, platforms, open data‌

6.1 Latest software developments‌

6.1.1 EZTrace

Keywords:
MPI‌‌ communication, Execution trace, Traces, High performance computing, Performance‌ analysis, HPC, OpenMP, CUDA‌
Functional Description:

Improving the performance of parallel applications (numerical simulations, for example) is an important phase of their development. To do so, it is necessary to detect the various phases of the application and to understand their performance.

The automatic generation of execution traces allows the developer to quickly and simply detect the various phases of the application and to understand its behavior.
URL:‌
https://gitlab.com/eztrace/eztrace
Publications:
hal-01257904v1, hal-00707236v1, hal-03215663v1,‌ hal-03276036v1, hal-02179717v1, inria-00587216v1, hal-00918733v1,‌ hal-00865845v1, tel-03278305v1
Contact:
Francois Trahay
Participant:
2‌ anonymous participants

6.1.2 Pallas

Keywords:
Performance analysis, HPC,‌ High performance computing, Execution trace
Functional Description:
Pallas is a generic trace format tailored for conducting various post-mortem performance analyses of traces describing large executions of HPC applications. During the execution of the application, Pallas collects events and detects their repetitions on-the-fly. When storing the trace to disk, Pallas groups the data from similar events or groups of events together in order to later speed up trace reading. The Pallas format allows faster trace analysis compared to other trace formats.
URL:
https://gitlab.inria.fr/pallas/pallas
Contact:
Francois Trahay
Participant:
an anonymous‌ participant

6.1.3 numamma

Keywords:
NUMA, Memory Allocation, Profiling‌
Functional Description:
NumaMMa is both a NUMA memory profiler/analyzer and a NUMA application execution engine. The profiler runs an application while gathering information about its memory accesses. The analyzer visually reports information about the memory behavior of the application, making it possible to identify memory access patterns. Based on the results of the analyzer, the execution engine can execute the application efficiently by choosing where memory pages are allocated.
URL:
https://github.com/numamma/numamma
Publications:
cea-01854072v2, tel-03278305v1‌
Contact:
Francois Trahay
Participant:
an anonymous participant

6.1.4‌ ForkNox

Name:
ForkNox: a micro-hypervisor to protect Linux‌
Keywords:
Virtualization, Security
Functional Description:
ForkNox is a‌ micro-hypervisor designed to protect Linux. By leveraging virtualization‌ techniques, ForkNox can revoke read, write, and execute‌ permissions for specific memory regions of Linux. This‌ ensures that, even if Linux is under attack,‌ the attacker cannot modify those parts of the‌ system.
Release Contributions:
Initial version of the software.‌
News of the Year:
Initial version of the‌ software.
URL:
https://gitlab.inria.fr/benagil/fkx/fork-nox
Contact:
Gael Thomas
Participant:
4‌ anonymous participants

6.1.5 VoliMem

Name:
VoliMem: a lightweight‌ virtualization for processes
Keyword:
Virtualization
Functional Description:
VoliMem‌ is a small library that remaps a native‌ process inside a virtual machine. Thanks to this,‌ the process gains access to low-level system hardware‌ primitives, such as a page table in user‌ space or fast inter-processor interrupts.
Release Contributions:
Initial‌ version of the prototype
News of the Year:‌
Initial implementation of the software.
URL:
https://gitlab.inf.telecom-sudparis.eu/VoliMembers/libvolimem
Contact:‌
Gael Thomas
Participant:
3 anonymous participants

6.1.6 Tele-GC‌

Name:
Tele-GC: a garbage collector for disaggregated memory‌
Keywords:
Garbage Collection, Java, Disaggregated memory
Functional Description:‌
Tele-GC is a garbage collector specifically designed for‌ disaggregated memory. It runs the application on the‌ compute node while the garbage collector operates on‌ the memory node. Tele-GC leverages the discrepancy between‌ the cache on the compute node and the‌ memory on the memory node to avoid any‌ synchronization during a collection.
URL:
https://github.com/Adchad/RemoteSpace
Contact:
Gael‌ Thomas

6.1.7 FaaSLoad

Keywords:
Cloud computing, Serverless, Function-as-a-Service,‌ Measures, Resource utilization, Workload injection, Performance measure
Scientific‌ Description:
FaaSLoad is a tool to gather fine-grained data about performance and resource usage of the programs that run on Function-as-a-Service cloud platforms. It considers individual instances of functions to collect hardware and operating-system performance information, by monitoring them while injecting a workload. FaaSLoad helps build a dataset of function executions to train machine learning models, study the behavior of function runtimes at a fine grain, and replay real workload traces for in situ observations.
Functional Description:‌‌
Invoke functions in a Function-as-a-Service platform, and gather‌ data about their performance‌ and their resource usage‌‌ to understand their behavior in Serverless environments.
Release‌ Contributions:
Stabilization and opening‌ to outsiders.
News of‌‌ the Year:
Release of public version 2.0 (and then 2.1.0), the first version that is mature and useful to outsiders. Published in a dedicated scientific paper at OPODIS'24.
URL:
https://gitlab.com/faasload/faasload
Publications:‌
hal-03211416, hal-04886267
Contact:‌‌
Mathieu Bacou
Participant:
an anonymous participant

7 New‌ results

This year, the‌ Benagil team carried out‌‌ research projects along three axes.

In the performance‌ analysis axis, the Benagil‌ team studied: (i) the‌‌ optimization of performance trace representations to improve analysis‌ time, (ii) methods for‌ measuring energy consumption at‌‌ a fine granularity, and (iii) the performance prediction‌ of an application when‌ we change hardware.

In‌‌ the system components for cloud infrastructures axis, the‌ Benagil team studied: (i)‌ how we can simplify‌‌ the use of persistent memory by relying on‌ a page table to‌ identify the dirty set‌‌ of a transaction, (ii) the protection of the‌ internal data structures of‌ the Linux kernel with‌‌ virtualization techniques, and (iii) the memory collection of‌ a large heap in‌ a disaggregated context.

In‌‌ the system components for emerging computing models axis,‌ the Benagil team worked‌ on: (i) analyzing large‌‌ images on a modest cluster, and (ii) adjusting‌ data consistency to the‌ actual needs of an‌‌ application.

7.1 Performance analysis

7.1.1 Scalable trace format‌

Participants: Catherine Guelque,‌ Valentin Honoré, Philippe‌‌ Swartvegher [Inria TOPAL], François Trahay.

Identifying performance bottlenecks in a parallel application is tedious, especially because it requires analyzing the behavior of various software components, as bottlenecks may have several causes and symptoms. Detecting a performance problem means investigating the execution of an application and applying several performance analysis techniques. To do so, one can use a tracing tool to collect information describing the behavior of the application. At the end of the execution, a trace file in a specific format is available to the application user, which can be used to conduct a complete post-mortem investigation. When analyzing the performance of an application running at a large scale, the post-mortem analysis needs to load thousands of trace files in memory and process them. This quickly becomes impractical for large-scale applications, as memory gets exhausted and the number of open files exceeds the system capacity.

As part of the Exa-SofT project, Catherine Guelque proposes Pallas, a generic trace format tailored for conducting various post-mortem performance analyses of traces describing large executions of HPC applications [7]. During the execution of the application, Pallas collects events and detects their repetitions on-the-fly. When storing the trace to disk, Pallas groups the data from similar events or groups of events together in order to later speed up trace reading. We conducted large-scale experiments on the Jean-Zay supercomputer to evaluate Pallas. Our experiments show that the Pallas format allows faster trace analysis compared to the other evaluated trace formats. Overall, the Pallas trace format allows the interactive analysis of a trace that is required when a user investigates a performance problem. These results were presented at IPDPS'25 [7].
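The idea of detecting repeated groups of events can be illustrated with a small sketch. It is not the Pallas implementation or file format: the function names `fold_repetitions` and `unfold` are hypothetical, the event names are only examples, and the folding is done after the fact on a complete list, whereas Pallas detects repetitions on-the-fly.

```python
def fold_repetitions(events, max_pattern_len=8):
    """Fold consecutive repetitions of short patterns into (pattern, count) pairs.

    Hypothetical helper: a greedy, offline simplification of loop detection.
    """
    folded = []
    i = 0
    while i < len(events):
        best_len, best_count = 1, 1
        for length in range(1, max_pattern_len + 1):
            pattern = events[i:i + length]
            if len(pattern) < length:
                break  # not enough events left for a pattern of this length
            count = 1
            while events[i + count * length:i + (count + 1) * length] == pattern:
                count += 1
            # Keep the repeated pattern that covers the most events.
            if count > 1 and count * length > best_len * best_count:
                best_len, best_count = length, count
        folded.append((tuple(events[i:i + best_len]), best_count))
        i += best_len * best_count
    return folded


def unfold(folded):
    """Rebuild the original event list, showing that folding loses nothing."""
    events = []
    for pattern, count in folded:
        events.extend(list(pattern) * count)
    return events


trace = ["MPI_Init"] + ["MPI_Send", "MPI_Recv"] * 3 + ["MPI_Finalize"]
compact = fold_repetitions(trace)
print(compact)
print(unfold(compact) == trace)
```

Once a loop body is stored a single time, the data attached to each of its events (timestamps, durations) can be kept together, which is the kind of grouping that makes later reading faster.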

7.1.2 Fine-grain energy measurement

Participants: Jules Risse,‌ Amina Guermouche [Inria STORM], François Trahay.‌

The power consumption of supercomputers is and will remain a major concern. For instance, Frontier, one of the fastest supercomputers in the world, consumes around 20 MW. As a consequence, reducing the power consumption of HPC applications is mandatory. The first step towards reducing the power consumption of programs is being able to monitor their energy consumption. Servers usually contain wattmeters able to measure the power consumption of the CPU, the memory, the GPU, etc. However, these wattmeters only provide coarse-grained energy measurements, with a typical measurement period of tens of milliseconds. During this period of time, the application may execute hundreds of tasks. As a result, analyzing the power consumption of an application at the microsecond scale is tedious.

As part of the Exa-SofT project, Jules Risse's PhD investigates fine-grained energy measurement in StarPU. Since StarPU executes many instances of a few types of tasks, it should be possible to build an energy consumption model of each type of task. The energy consumption model can then be provided to StarPU so that the task scheduling takes into account both the performance of tasks and their energy consumption. In this project, we measure the energy consumption of a server (its CPU, GPU, etc.) at a coarse grain (typically, one sample every 20 ms), and we log which tasks were executed during this period of time. By repeating this many times, we build a linear system that can be solved to model the energy consumption of microsecond-scale tasks. We show that the model can accurately predict the energy consumption of fine-grained tasks running on CPUs. We conducted similar experiments on GPUs, where the accuracy is lower due to erroneous power consumption metrics reported by the GPU. These results were presented at Cluster'25 [9].
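As an illustration of how coarse samples can yield per-task energies, here is a minimal sketch under stated assumptions: each sampling window contributes one equation, sum over task types of (number of tasks of that type in the window) x (energy of one task of that type) = measured energy of the window, and the system is solved in the least-squares sense. The least-squares choice, the function names and the numbers are assumptions made for the example, and idle power is ignored; the text only states that a linear system is built and solved.

```python
def solve_linear_system(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: windows do not distinguish task types")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = aug[row][n] - sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = acc / aug[row][row]
    return solution


def fit_task_energy(task_counts, window_energy):
    """Least-squares estimate of the energy of one task of each type.

    task_counts[w][t] is the number of tasks of type t run in window w;
    window_energy[w] is the energy (in joules) measured over window w.
    """
    n_types = len(task_counts[0])
    # Normal equations: (A^T A) x = A^T b
    ata = [[sum(row[i] * row[j] for row in task_counts) for j in range(n_types)]
           for i in range(n_types)]
    atb = [sum(row[i] * energy for row, energy in zip(task_counts, window_energy))
           for i in range(n_types)]
    return solve_linear_system(ata, atb)


# Synthetic data: two task types with made-up energies (joules per task).
true_energy = [2.0e-3, 5.0e-4]
counts = [[100, 300], [250, 50], [10, 400], [180, 180]]
measured = [sum(n * e for n, e in zip(row, true_energy)) for row in counts]
estimated = fit_task_energy(counts, measured)
print(estimated)
```

With noisy measurements, more windows than task types are needed, and the windows must contain different mixes of tasks, otherwise the system cannot separate the task types.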

7.1.3 Performance prediction

Participants: Lucas Van Lanker, Hugo Taboada [CEA/DAM], Mickaël Boichot, Adrien Roussel [CEA/DAM], Patrick Carribault [CEA/DAM], Elisabeth Brunet, François Trahay.

With the advent of heterogeneous systems that combine CPUs and GPUs, designing a supercomputer becomes more and more complex. The hardware characteristics of GPUs significantly impact performance. Choosing the GPU that will maximize performance for a limited budget is tedious because it requires predicting the performance on a hardware platform that is not yet available.

During his PhD, Mickaël Boichot studied the relation between the expressed parallelism and the memory footprint of loops in order to extrapolate which data sizes provide sufficient parallelism to load a new GPU architecture. In the case of memory oversubscription, his work focused on how to efficiently exploit the new unified memory feature of GPUs in order to place data where it is most often reused. These results are detailed in his thesis and in [6].

Lucas Van Lanker's PhD explores means for predicting the performance of kernels running on GPUs. We propose a methodology that analyzes the behavior of an application running on an existing platform, and projects its performance on another GPU based on the target hardware characteristics. The performance projection relies on a hierarchical roofline model as well as on a comparison of the kernel's assembly instructions on both GPUs to estimate the operational intensity on the target GPU. Our experiments show that the performance can be predicted accurately at a low cost.
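A hierarchical roofline bound can be sketched as follows. This is a textbook-style illustration, not the team's methodology: the attainable rate is the minimum of the compute peak and, for each memory level, that level's bandwidth multiplied by the kernel's operational intensity at that level. The `Gpu` class, the two GPUs and all numbers are hypothetical, and the sketch assumes that the kernel moves the same number of bytes and executes the same number of operations on both GPUs, which the assembly comparison mentioned above is precisely meant to refine.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    peak_gflops: float      # compute roof, in GFLOP/s
    bandwidth_gbs: dict     # memory level -> bandwidth, in GB/s


def attainable_gflops(gpu, flops, bytes_moved):
    """Hierarchical roofline: min of the compute roof and every memory roof."""
    bound = gpu.peak_gflops
    for level, nbytes in bytes_moved.items():
        if nbytes == 0:
            continue  # this level is not exercised by the kernel
        intensity = flops / nbytes  # FLOP per byte at this memory level
        bound = min(bound, gpu.bandwidth_gbs[level] * intensity)
    return bound


def projected_time_s(gpu, flops, bytes_moved):
    """Lower bound on the kernel time implied by the roofline."""
    return flops / (attainable_gflops(gpu, flops, bytes_moved) * 1e9)


# Hypothetical hardware characteristics and kernel profile.
existing = Gpu("existing", peak_gflops=10000.0, bandwidth_gbs={"L2": 2000.0, "HBM": 900.0})
target = Gpu("target", peak_gflops=20000.0, bandwidth_gbs={"L2": 4000.0, "HBM": 1200.0})
kernel_flops = 4e9
kernel_bytes = {"L2": 8e9, "HBM": 2e9}

for gpu in (existing, target):
    print(gpu.name, attainable_gflops(gpu, kernel_flops, kernel_bytes),
          projected_time_s(gpu, kernel_flops, kernel_bytes))
```

In this made-up profile the kernel is bound by the L2 bandwidth on both GPUs, so doubling the compute peak alone would not help; it is the doubled L2 bandwidth that halves the projected time.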

7.2 System components for the cloud

7.2.1 VoliPMem: using persistent memory transparently

Participants: Jana Toljaga‌‌, Tara Aggoun, Gaël Thomas, Mathieu‌ Bacou, Nicolas Derumigny‌.

Handling persistent memory is complex because the application can fail at any time, leaving the persistent memory in an inconsistent state. To avoid inconsistency, the developer must use transactions, whose writes are applied with all-or-nothing semantics. To achieve this, persistent memory writes are first executed in volatile memory and only applied to persistent memory at the end of a transaction. Unfortunately, to propagate the writes, the developer currently has to explicitly indicate the modified memory locations, which is cumbersome and error-prone. With VoliPMem (PhD thesis of Jana Toljaga), we propose to transparently identify the memory locations modified inside a transaction for the developer. To achieve this, we rely on a library called VoliMem, which, instead of executing a process natively, executes it in a lightweight virtual machine. By executing the process inside a virtual machine, the application can directly manage a secondary page table within its address space. In VoliPMem, we use this page table to automatically identify the modified memory locations. Specifically, VoliPMem identifies the modified pages by traversing the page table to find the dirty pages. At the end of a transaction, VoliPMem collects these dirty pages and copies them atomically to persistent memory with all-or-nothing semantics. Thanks to these abstractions, using persistent memory becomes straightforward: the developer simply has to indicate the boundaries of the transaction and no longer has to worry about annotating each write.
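The dirty-page mechanism can be mimicked in a few lines. This is a toy simulation, not VoliPMem's API: the class `TransactionalRegion` and its methods are hypothetical, two byte arrays stand in for volatile and persistent memory, and the dirty bits are set by the `write` helper, whereas in the real system the hardware sets them in the page table. Crash atomicity of the final copy is not modeled.

```python
PAGE_SIZE = 4096


class TransactionalRegion:
    """Toy model: writes go to a volatile copy, commit copies dirty pages."""

    def __init__(self, n_pages):
        self.persistent = bytearray(n_pages * PAGE_SIZE)  # stand-in for persistent memory
        self.volatile = bytearray(self.persistent)        # working copy
        self.dirty = [False] * n_pages                    # stand-in for page-table dirty bits

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > len(self.volatile):
            raise ValueError("write outside the region")
        if not data:
            return
        self.volatile[offset:offset + len(data)] = data
        first = offset // PAGE_SIZE
        last = (offset + len(data) - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.dirty[page] = True  # done by the MMU in a real system

    def _page_slice(self, page):
        return slice(page * PAGE_SIZE, (page + 1) * PAGE_SIZE)

    def commit(self):
        """Find dirty pages by scanning the dirty bits, then propagate them."""
        dirty_pages = [page for page, is_dirty in enumerate(self.dirty) if is_dirty]
        for page in dirty_pages:
            self.persistent[self._page_slice(page)] = self.volatile[self._page_slice(page)]
        self.dirty = [False] * len(self.dirty)
        return dirty_pages

    def abort(self):
        """Discard the transaction by restoring dirty pages from the persistent copy."""
        for page, is_dirty in enumerate(self.dirty):
            if is_dirty:
                self.volatile[self._page_slice(page)] = self.persistent[self._page_slice(page)]
        self.dirty = [False] * len(self.dirty)


region = TransactionalRegion(n_pages=4)
region.write(4090, b"0123456789")   # straddles pages 0 and 1
print(region.commit())
```

The caller only marks the end of the transaction with `commit` or `abort`; no individual write has to be annotated, which is the property the paragraph above describes.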

7.2.2 ForkNox: protecting‌ the internal data structures‌‌ of Linux

Participants: Jean-François Dumollard, Harena Rakotondratsima‌, Gaël Thomas,‌ Mathieu Bacou, Nicolas‌‌ Derumigny.

Linux is designed as a monolithic‌ kernel, which leaves it‌ vulnerable as soon as‌‌ an attacker can execute‌ code in system mode. Technically, if an attacker‌ can execute code in system mode, the attacker‌ can modify any part of Linux: the attacker‌ can alter Linux’s code, modify any data structure,‌ and disable any security mechanisms installed by Linux.‌ With ForkNox (PhD thesis of Jean-François Dumollard), we‌ propose a new technique to enforce Linux’s security,‌ even if an attacker is able to execute‌ code inside the kernel. To do so, we‌ introduce a new protection ring by leveraging the‌ processor’s virtualization feature. Specifically, ForkNox is a Linux‌ module that runs as a hypervisor while Linux‌ runs as a guest operating system. By leveraging‌ virtualization, ForkNox can revoke read, write, or execute‌ permissions for important Linux memory regions, which allows‌ ForkNox to protect Linux against an attacker capable‌ of executing code inside the Linux kernel.

7.2.3‌ Tele-GC: a garbage collector for disaggregated memory

Participants:‌ Adam Chader, Nevena Vasilevska, Yohan Pipereau‌ [Engineer at Gandi], Gaël Thomas, Mathieu‌ Bacou, Nicolas Derumigny.

A disaggregated infrastructure simplifies hardware resource management. In detail, in a disaggregated infrastructure, the cloud system can dynamically adjust the hardware resources allocated to a virtual machine to its actual use by allocating hardware resources from a specialized blade. Designing a garbage collector in this context is challenging because of the high memory latency. With TéléGC (PhD of Adam Chader), we propose a new garbage collector (GC) for a disaggregated infrastructure. TéléGC runs on the memory node while the application runs on the CPU node. It runs concurrently with the application while avoiding most synchronization. To achieve this, we introduce the write-back barrier. With the write-back barrier, instead of synchronously executing a barrier when the application writes to the heap, TéléGC executes a barrier asynchronously later, when the CPU node writes back a page to the memory node. Thanks to this, the application does not pay the cost of synchronizing with the GC, which boosts its performance. Our evaluation on a disaggregated infrastructure shows that TéléGC significantly reduces both completion time and pause time compared to Mako and G1, which are the state-of-the-art GCs of HotSpot.

7.3 System components‌ for emerging computing models

7.3.1 Efficient Pyramidal Analysis‌ of Gigapixel Images on a Decentralized Modest Computer‌ Cluster

Participants: Marie Reinbigler, Rishi Sharma [EPFL]‌, Rafael Pires [EPFL], Elisabeth Brunet,‌ Anne-Marie Kermarrec [EPFL], Catalin Fetita [Telecom SudParis]‌.

Analyzing gigapixel images is recognized as computationally demanding. In this work, we introduce PyramidAI, a technique for analyzing gigapixel images with reduced computational cost [8]. The proposed approach adopts a gradual analysis of the image, beginning with lower resolutions and progressively concentrating on regions of interest for detailed examination at higher resolutions. We investigated two strategies for tuning the accuracy-computation performance trade-off when implementing the adaptive resolution selection, validated against the Camelyon16 dataset of biomedical images. Our results demonstrate that PyramidAI decreases the amount of processed data required for analysis by up to 2.65x, while preserving the accuracy in identifying relevant sections on a single computer. To democratize gigapixel image analysis, we evaluated the potential to use mainstream computers to perform the computation by exploiting the parallelism potential of the approach. Using a simulator, we estimated the best data distribution and load balancing algorithm according to the number of workers. The selected algorithms were then implemented and led to the same conclusions in a real-world setting. Analysis time is reduced from more than an hour to a few minutes using 12 modest workers, offering a practical solution for efficient large-scale image analysis.
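The coarse-to-fine traversal can be sketched as follows. This is an illustrative simplification, not PyramidAI's code: the function names are hypothetical, a fixed threshold replaces the two tuning strategies studied in the paper (which are not detailed here), and a toy scoring function stands in for the image classifier. Level 0 is the full resolution, and each tile at level l covers four tiles at level l-1.

```python
def children(x, y):
    """Tiles of the next finer level covered by tile (x, y)."""
    return [(2 * x, 2 * y), (2 * x + 1, 2 * y), (2 * x, 2 * y + 1), (2 * x + 1, 2 * y + 1)]


def pyramidal_analysis(score, coarsest_level, root_tiles, threshold):
    """Analyze tiles from the coarsest level down to level 0.

    Only tiles whose score reaches the threshold are refined at the next
    level. Returns the level-0 tiles of interest and the number of tiles
    that had to be scored.
    """
    frontier = list(root_tiles)
    processed = 0
    for level in range(coarsest_level, -1, -1):
        selected = []
        for x, y in frontier:
            processed += 1
            if score(level, x, y) >= threshold:
                selected.append((x, y))
        if level == 0:
            return selected, processed
        frontier = [child for x, y in selected for child in children(x, y)]
    return [], processed


def toy_score(level, x, y):
    """Made-up classifier: the only region of interest is level-0 tile (5, 2)."""
    return 1.0 if (5 >> level, 2 >> level) == (x, y) else 0.0


hits, processed = pyramidal_analysis(toy_score, coarsest_level=3,
                                     root_tiles=[(0, 0)], threshold=0.5)
print(hits, processed, "tiles scored instead of", 8 * 8)
```

The branches rooted at different tiles are independent, which is the parallelism that a cluster of modest workers can exploit.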

7.3.2‌‌ Efficient and Principled Approaches to Scalable Programming

Participants:‌ Boubacar Kane, Tung‌ Nguyen, Pierre Sutra‌‌.

Parallel programs require software support to coordinate‌ access to shared data.‌ For this purpose, modern‌‌ programming languages provide strongly-consistent shared objects. To account‌ for their many usages,‌ these objects offer a‌‌ large API. However, in practice, each program calls‌ only a tiny fraction‌ of the interface. Leveraging‌‌ such an observation, we propose to tailor a‌ shared object for a‌ specific usage. We call‌‌ this principle adjusted objects.

Adjusted objects already exist in the wild. Our work provides their first systematic study. We explain how everyday programmers already adjust common shared objects (such as queues, maps, and counters) for better performance. We present the formal foundations of adjusted objects using a new tool to characterize scalability, the indistinguishability graph. Building on this study, we introduce a library named DEGO to inject adjusted objects into a Java program. In micro-benchmarks, objects from the DEGO library improve the performance of standard JDK shared objects by up to two orders of magnitude. We also evaluate DEGO with a Retwis-like benchmark modeled after a social network application. On a modern server-class machine, DEGO improves the performance of the benchmark by up to 1.7x. This work was conducted during the PhD of Boubacar Kane, who successfully defended in January 2025 [11].
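As an illustration of the principle, consider a counter that a program only increments from many threads and reads rarely. The sketch below is a hedged example written in Python for brevity; it does not reproduce the DEGO API, and the class names `LockedCounter` and `IncrementOnlyCounter` are hypothetical. The general-purpose counter serializes every call, whereas the adjusted one drops the unused "increment and return the new value" operation, which lets each thread update a private cell.

```python
import threading


class LockedCounter:
    """General-purpose counter: every operation goes through a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment_and_get(self):
        with self._lock:
            self._value += 1
            return self._value

    def get(self):
        with self._lock:
            return self._value


class IncrementOnlyCounter:
    """Counter adjusted for programs that call only increment() and get().

    Each thread owns a private cell, so increments never contend with
    each other; only get() has to look at all the cells.
    """

    def __init__(self):
        self._local = threading.local()
        self._cells = []
        self._cells_lock = threading.Lock()

    def increment(self):
        cell = getattr(self._local, "cell", None)
        if cell is None:
            # First increment by this thread: register its private cell.
            cell = [0]
            self._local.cell = cell
            with self._cells_lock:
                self._cells.append(cell)
        cell[0] += 1  # written by the owning thread only

    def get(self):
        with self._cells_lock:
            return sum(cell[0] for cell in self._cells)
```

The adjusted counter returns an exact total once the incrementing threads have finished, but `increment()` no longer returns a value, so it cannot be used to hand out unique tickets. Narrowing the interface is what makes the cheaper implementation possible. The JDK class `java.util.concurrent.atomic.LongAdder` relies on a similar striping idea.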

A key question in concurrent programming is determining the synchronization power of a shared object. An object has consensus number $n$ when $n$ is the largest number of processes for which consensus can be solved with copies of this object and registers. The indistinguishability graph can be used to characterize the consensus number of a shared object. However, this characterization is incomplete: it covers only objects that are readable. In a seminal work, Herlihy and Ruppert provide an exact characterization of the consensus number for deterministic one-shot objects (objects that can be accessed by each process at most once). In [12], we extend the study of Herlihy and Ruppert to deterministic two-shot objects in a two-process system. Such objects can be accessed by each process at most twice. We introduce three disjoint classes of two-shot objects. The first class is similar to one-shot objects in the sense that the first operation call gives enough information to solve consensus. Objects in the second class do not provide any useful information after the first call to one of the two processes. The last class contains objects for which calling the object twice is always necessary. In this class, the second operation to call is chosen adaptively, which may lead to using different operations in different schedules. For instance, the second operation used in a solo run might differ from the one called when processes interleave. We show that these three classes provide an exact characterization of the two-shot deterministic objects able to solve two-process consensus.
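To make the notion of solving consensus "with copies of an object and registers" concrete, the sketch below simulates the classic textbook protocol in which two processes reach consensus using one test-and-set flag, accessed once per process, and two registers. It is a standard construction given for illustration only; it is not one of the two-shot objects or classes studied in [12]. The function names and the generator-based scheduler are assumptions made for this example.

```python
from itertools import permutations


def process(pid, proposal, registers, flag):
    """One process of the protocol; each `yield` ends an atomic step."""
    registers[pid] = proposal  # step 1: publish the proposal in a register
    yield
    won = not flag[0]          # step 2: the single test-and-set call
    flag[0] = True
    yield
    # step 3: the winner keeps its value, the loser adopts the winner's value
    return proposal if won else registers[1 - pid]


def run(schedule, proposals):
    """Execute the two processes following `schedule`, a sequence of process ids."""
    registers, flag = [None, None], [False]
    procs = [process(pid, proposals[pid], registers, flag) for pid in (0, 1)]
    decided = {}
    for pid in schedule:
        if pid in decided:
            continue
        try:
            next(procs[pid])
        except StopIteration as stop:
            decided[pid] = stop.value
    return decided


def check_all_schedules(proposals=("a", "b")):
    """Check the consensus properties over every interleaving of the 3 + 3 steps."""
    for schedule in set(permutations([0, 0, 0, 1, 1, 1])):
        decided = run(schedule, proposals)
        assert len(decided) == 2         # termination: both processes decide
        assert decided[0] == decided[1]  # agreement: same decision
        assert decided[0] in proposals   # validity: the decision was proposed
    return True
```

The loser can safely read the other register because the winner wrote its proposal before calling test-and-set. The exhaustive check covers all 20 interleavings of the two processes, including the solo-first runs in which one process decides before the other takes a step.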

8 Bilateral contracts and grants with industry

8.1‌ Bilateral contracts with industry

Participants: Mickaël Boichot,‌ Lucas Van Lanker.

Contracts with CEA for the PhDs of Mickaël Boichot (2021-2025) and Lucas Van Lanker (2024-2027).

Adobe research gift to support‌ our research activities.

9 Partnerships and cooperations

9.1‌ National initiatives

PEPR NumPex – Exa-SofT

Participants: Catherine‌ Guelque, Jules Risse, Élisabeth Brunet,‌ Valentin Honoré, François Trahay.

Partners: Université‌ Paris Saclay, Télécom SudParis, CEA, CNRS, Inria

Coordinator:‌ Raymond Namyst, Inria Bordeaux

Funding: 453 k€

Date:‌ 2023-2028

Summary: Though significant efforts have been devoted‌ to the implementation and optimization of several crucial‌ parts of a typical HPC software stack, most‌ HPC experts agree that exascale supercomputers will raise‌ new challenges, mostly because the trend in exascale‌ compute-node hardware is toward heterogeneity and scalability: Compute‌ nodes of future systems will have a combination‌ of regular CPUs and accelerators (typically GPUs), along‌ with a diversity of GPU architectures. Meeting the‌ needs of complex parallel applications and the requirements‌ of exascale architectures raises numerous challenges which are‌ still left unaddressed. As a result, several parts‌ of the software stack must evolve to better‌ support these architectures. More importantly, the links between‌ these parts must be strengthened to form a‌ coherent, tightly integrated software suite. Our project aims‌ at consolidating the exascale software ecosystem by providing‌ a coherent, exascale-ready software stack featuring breakthrough research‌ advances enabled by multidisciplinary collaborations between researchers. The‌ main scientific challenges we intend to address are:‌ productivity, performance portability, heterogeneity, scalability and resilience, performance‌ and energy efficiency.

PEPR Cloud – DiVa

Participants:‌ Jana Toljaga, Nevena Vasilevska, Tara Aggoun‌, Mathieu Bacou, Nicolas Derumigny, Gaël‌ Thomas.

Partners: LIP6, LIG, IRIT, Inria Paris,‌ Benagil/Telecom SudParis

Coordinator: Gaël Thomas, Télécom SudParis

Funding:‌ 864 k€

Date: 2023-2030

Summary: The DiVa project‌ investigates new virtualization mechanisms tailored for a disaggregated‌ infrastructure and for an infrastructure composed of small‌ edge infrastructures connected to powerful data centers. In‌ the context of a disaggregated cloud, the DiVa‌ project will focus on the virtualization interfaces, the‌ scheduling, the use of programmable networks, and replication‌ mechanisms. In the context of the continuum between‌ the edge and the cloud, the DiVa project‌ will focus on migration between heterogeneous machines, edge/edge and edge/data center network‌ optimizations, and virtualization interfaces‌ for micro virtual machines.‌‌

PEPR Cloud – Archi-CESAM

Participants: Jean-François Dumollard,‌ Harena Rakotondratsima, Mathieu‌ Bacou, Nicolas Derumigny‌‌, Gaël Thomas.

Partners: Université de Rennes,‌ Benagil/Telecom SudParis, Institut Polytechnique‌ de Grenoble, CEA, Inria‌‌

Coordinator: Denis Dutoit, CEA

Funding: 580 k€

Date:‌ 2023-2030

Summary: European sovereignty‌ in the cloud also‌‌ means sovereignty over hardware, especially processors and accelerators.‌ Dennard's Law is now‌ over and Moore's Law‌‌ is slowing down. In this technological context, which‌ will continue, the improvement‌ of processor performance will‌‌ require hardware architectures that evolve towards more parallelism‌ (multi-core), more specialization (accelerators),‌ towards a closer relationship‌‌ between computing and memory and new types of‌ interconnections between components. On‌ the other hand, by‌‌ dissociating hardware resources (computing, memory, interconnection) from logical‌ resources, virtualization facilitates the‌ deployment of converged architectures‌‌ that bring together the computing, storage and network‌ infrastructure. The cloud gains‌ in modularity, speed and‌‌ agility for the deployment of new services with‌ optimal use of resources.‌ Hardware disaggregation on the‌‌ one hand and resource virtualization on the other‌ are making the intermediate‌ adaptation layer increasingly complex,‌‌ difficult to validate and prone to failure. The‌ Archi-CESAM project proposes to‌ rethink the hardware (computing,‌‌ memory and interconnection) so that it is co-designed‌ with the application in‌ a perspective of converged‌‌ architecture and trust, in an environment known for‌ its abundance of data‌ to be processed. The‌‌ Archi-CESAM project addresses this major evolution of the‌ Cloud in a global‌ and coordinated approach between‌‌ distributed architectures, acceleration, interconnection and security bricks, without‌ forgetting the design methods.‌

ANR PRC – FrugalDinet‌‌

Participants: Gaël Thomas.

Partners: LIP6, LISTIC, Benagil/Inria Saclay, New York University Shanghai

Coordinator: Pierre Sens, LIP6‌‌

Funding: 171 k€

Date: 2024-2028

Summary: In recent years, innovative hardware technologies have emerged to enhance distributed computations in datacenters. Programmable switches enable packet processing with user-defined functionality on packets in transit. Similarly, SmartNIC DPUs offload data-centric computations from host CPUs. Simultaneously, the urgency of the climate and energy crises has emphasized the need for frugal architectures. These technologies present an opportunity to reduce the overall network traffic of distributed services by offloading computations from CPUs to the network itself. They should be integrated into the design of fundamental distributed system components such as failure detectors, group membership, reliable broadcast, or consensus. We propose FrugalDinet, a framework to build reliable, low-cost distributed services that leverages these technologies to minimize CPU usage in datacenters and, consequently, their energy consumption. Our holistic approach extends key algorithms such as leader election, group membership and broadcasting, which are necessary for the creation of reliable services. We intend not only to offload algorithmic logic to network elements, but also to make opportunistic use of the information available at the switch level. We also plan to introduce a new high-level programming language facilitating the transparent use of these frugal, reliable distributed services. The implemented frugal algorithms and programming abstractions will be applied to design a distributed transaction system.

ANR PRCE‌ – Centeanes

Participants: Pierre Sutra.

Partners: Télécom‌ SudParis, Université Paris Cité, École Polytechnique, Université de‌ Paris 6.

Coordinator: Pierre Sutra

Funding: 196 k€‌

Date: 2025-2029

Summary: Cloud computing of the past‌ was concerned with the management of infrastructure resources,‌ e.g., servers, VMs or containers. Today, serverless computing‌ promises to abstract this worry away. In this‌ new paradigm, the quantum of computation is the‌ function; a function-as-a-service platform automatically manages deployment‌ of functions, executing them on demand and at‌ scale. This greatly simplifies access to the cloud,‌ letting the application developer focus on getting the‌ application code right, and ignore infrastructure issues.

Unfortunately, serverless computing remains difficult to use and to reason about. Indeed, the serverless environment is inherently unpredictable and non-deterministic, making it hard to understand and to control. Being distributed, serverless must cope with concurrency, unpredictable failures, or impossibility of consensus. On top of that, serverless poses additional, new challenges to the application programmer. Events may cause the same function to be invoked multiple times and/or terminated before it has finished. Functions are stateless, starting afresh every time; but they often must access an external storage service, and are thus exposed to stale or inconsistent state. Finally, existing platforms suffer from inefficiencies, such as excessive data movement or random placement.

The Centeanes project aims to address these challenges from the perspectives of correctness, efficiency, and expressivity, in a real application context. It will develop tools for specifying, programming and running correct-by-design serverless applications. In detail, we propose a formal framework to study the foundations of serverless computing, including function composition and fault-tolerance. This framework is implemented in a lightweight runtime environment, where stateful operations and data locality are first-class citizens. We also construct a toolchain to program and verify serverless applications executing in the runtime. This verification toolchain simplifies the programming of applications and helps enforce their correctness. The design is informed by, and will be validated against, benchmarks and full-scale industrial cloud or edge applications built with Eclipse Zenoh.

ANR PRC –‌ Maplurinum

Participants: Adam Chader, Mathieu Bacou,‌ Gaël Thomas.

Partners: INPG, Inria Rennes, CEA,‌ Benagil/Telecom SudParis

Coordinator: Gaël Thomas, Telecom SudParis

Funding:‌ 184 k€

Date: 2021-2025

Summary: High-performance architectures are increasingly heterogeneous and often incorporate specialized hardware. We first saw the generalization of GPUs in the most powerful machines, followed a few years later by the introduction of FPGAs. More recently, we have seen the emergence of many other accelerators, such as tensor processing units (TPUs) for DNNs or variable-precision FPUs. Recent hardware manufacturing trends make it very likely that specialization will not only persist, but increase in future supercomputers. Because manually managing this heterogeneity in each application is complex and not maintainable, we propose in this project to revisit how we design both hardware and operating systems in order to better hide the heterogeneity from supercomputer users. In summary, we propose to rethink the hardware/software boundary in order to hide the heterogeneity behind a common minimal instruction set and a unified address space.

ANR JCJC – VHS

Participants:‌ Valentin Delis, François‌ Trahay.

Partners: CEA/DAM,‌‌ Benagil/Telecom SudParis

Coordinator: Valentin Delis, ensIIE

Funding: 225‌ k€

Date: 2025-2029

Summary: Magnetic tapes have been used to store computer data since the 1950s, so laypeople now often consider them an outdated technology. However, tape storage is and will remain essential in many fields, such as academic research, international organisations or cloud companies, for its strong practical benefits: low cost per TB, low energy consumption, longevity, etc. This dependency on tapes has motivated industrial efforts in technology improvements, resulting in a much faster progression of data density on tape than on disk. After recent breakthroughs in the materials used, tape capacity is expected to make a massive leap in the coming years, increasing to several hundreds of TB per tape. This evolution will amplify the main benefits of tape storage.

However, tapes have mostly been considered for archiving cold data because of their main drawback: it takes around a minute to mount a tape from its shelf into a drive and position the reading head before data can be read. This explains the current lack of academic effort to optimize relatively frequent data accesses. Nevertheless, more and more research projects need to handle tremendous volumes of data, which are not only destined to be archived but also regularly accessed for scientific analysis. Budget constraints impose the use of tape storage, so optimizing tape data access is becoming increasingly significant and is no longer limited to improving archive retrieval.

The general idea of the VHS project is to propose new interactions between resource management and tape systems. Using filesystems on tapes, we plan to design novel data placement strategies that provide efficient data accesses by treating tapes as a level of the storage hierarchy and optimizing its operational cost. Our methodology starts from the tapes themselves, to better understand the physical processes involved in the different operations. Then, we will leverage this knowledge to derive interactions between tape and disk storage systems in order to improve data placement.

PIA Camelia

Participants: Élisabeth‌ Brunet, Gaël Thomas‌‌.

Partners: CEA, Inria, CNRS, IMT, UGA, ECL,‌ SU, INSA Rennes, UM,‌ UB, IJL, INL, IM2NP,‌‌ UJM, Mines Paris, UniStra, UPVD, UBO

Coordinator: C. Auliac and O. Santieys

Funding: 319 k€

Date:‌‌ 2026-2032

Summary: This project aims to design and develop an environment and a software stack for the training and inference of large neural networks in resource-demanding environments such as the near edge, the cloud, or HPC. As such, it must make it possible to take full advantage of the hardware accelerators developed in projects 1 (digital accelerators), 2 (analog accelerators) and 3 (hardware co-integration platform) of the program. The solutions developed must also be flexible enough to allow the later use of hardware targets external to the program, notably French or European industrial solutions such as those of SiPearl, STMicroelectronics and Kalray. In addition, ease of adoption by AI engineers and researchers (who are often unfamiliar with the hardware or the low-level software layers they use) and compatibility with emerging AI concepts are key to the success of the project and will be studied closely.

Chist-ERA - Redonda

Participants: Pierre Sutra‌.

Partners: Institut Mines-Télécom, IMDEA Software Institute, University‌ of Surrey, Royal Holloway College - University of‌ London, University of Neuchâtel

Coordinator: Pierre Sutra

Funding:‌ 320 k€

Date: 2023-2026

Summary: The Redonda project's ambition is to design a next-generation replication protocol for blockchain. To achieve this, the project taps into recent advances in networking, secure computing and distributed systems. At the scale of a datacenter, the protocol relies on two recent technologies: RDMA and TEE. Both technologies are leveraged to create a sub-microsecond consensus layer that tolerates Byzantine failures. TEEs are also used in a novel upgradable and portable smart contract engine to execute blockchain transactions across a variety of infrastructures and hardware. Between datacenters, the protocol relies on leaderless state-machine replication. This recent approach decomposes transaction ordering into two sub-tasks that can execute in parallel, without a central coordinator to bottleneck the system. To ensure security and safety at runtime, the Redonda project creates the blockchain protocol by composing mechanically-verified building blocks. The new blockchain protocol is assessed using real hardware against benchmarks and publicly available traces. Our target is a protocol that scales across hundreds of geo-distributed nodes while offering 100k+ transactions per second and split-second latency.

10 Dissemination

10.1 Promoting‌ scientific activities

10.1.1 Scientific events: organisation

Gaël Thomas‌ : annual Inria Defi OS workshop (11/2025), annual‌ PEPR DiVa workshop (05/2025)
Mathieu Bacou : annual‌ thematic workshop of the working group "Virtualization" of‌ CNRS's GDR RSD about virtualization of systems and‌ networks (12/2025)

Member of the organizing committees

François Trahay : participation in the organization of the Per3S workshop, as part of the steering committee.

10.1.2 Scientific events: selection

Member of the steering‌ committees

Gaël Thomas : chair of the steering committee of Compas (French)
François Trahay : member of the steering committee of Compas (French)
Pierre‌ Sutra : member of the steering committee for‌ PaPoC

Member of the conference program committees

Gaël Thomas : member of the Usenix ATC 2025, Eurosys 2025, Apsys 2025, and Resdis 2025 program committees.
François Trahay : member of the ISC 2025 program committee.
Élisabeth Brunet : member of the PDS 2025, Compas 2025, and SC 2025 program committees.
Valentin Delis : member of the Cluster 2025 and ESA 2025 program committees.
Pierre Sutra : TPC member for Middleware 2025, ICDCS 2025, PaPoC 2025, and SRDS 2025.

Reviewer - reviewing activities

Valentin Delis : Reviewer for TPDS

10.1.3 Invited‌ talks

Gaël Thomas
- 06/2025,‌ invited talk at Epita,‌‌ téléGC: a barrier-free garbage collector for disaggregated memory‌
- 07/2025, invited talk at‌ Sushi Seminar, téléGC: a‌‌ barrier-free garbage collector for disaggregated memory
François Trahay‌
- workshop ECLAT
- journée scientifique‌ de l'Institut Polytechnique de‌‌ Paris
- tutorial on performance analysis with EZTrace, as‌ part of the Compas‌ conference
Pierre Sutra
- keynote‌‌ at LADC '25

10.1.4 Scientific expertise

François Trahay was a member of the selection committee for an Associate Professor position at Télécom SudParis, May 2025.
Élisabeth Brunet was a member of the selection committee for an Associate Professor position at INSA Lyon in 2025.
Élisabeth Brunet was a member of the committee awarding the Prix de thèse Gilles Kahn of the SIF (Société Informatique de France) in 2025.
Mathieu Bacou was a member of the selection committee for twin Associate Professor positions at Université de Lille, May 2025.

10.1.5‌ Research administration

François Trahay‌ : head of research‌‌ action "Energy Efficiency" of the Energy4Climate interdisciplinary center.‌
François Trahay : head‌ of working group "Large‌‌ Scale Computing" of CNRS's GDR C4P.
Mathieu Bacou‌ : co-head of working‌ group "Virtualization" of CNRS's‌‌ GDR RSD.

10.2 Teaching - Supervision - Juries‌ - Educational and pedagogical‌ outreach

Master: François Trahay‌‌ is the head of the master of Computer‌ Science at Institut Polytechnique‌ de Paris
Master: Pierre‌‌ Sutra and Gaël Thomas are the heads of‌ the Parallel & Distributed‌ Systems master track at‌‌ Institut Polytechnique de Paris
Engineering: Élisabeth Brunet is‌ in charge of the‌ AI 3rd year track‌‌ at Télécom SudParis
Engineering: Pierre Sutra is in‌ charge of the ASR‌ 3rd year track at‌‌ Télécom SudParis
Engineering : Valentin Delis is in‌ charge of the CIDM‌ HPC track at ensIIE‌‌ (2nd & 3rd year of engineering program). Holder‌ of the Chair "Technologies‌ avancées & émergentes pour‌‌ la Souveraineté Numérique" between ensIIE and CEA. 330h‌ of teaching duties (including‌ administrative duties) at ensIIE‌‌ from 1st to 3rd year in both initial‌ and apprenticeship training programs.‌ Teaching in CPES Data‌‌ Science course at Lycée International de Saclay (course‌ leader: Maria Boritchev ,‌ Télécom Paris). Jury member‌‌ for the oral examination of Concours Mines-Telecom.

10.2.1‌ Supervision

PhD in progress:

Jean-François Dumollard, "Virtualization techniques to enforce the security of an operating system", supervised by G. Thomas, M. Bacou and N. Derumigny
Catherine Guelque, "Large scale performance analysis", supervised by F. Trahay and V. Delis
Martin Horth, "Static analysis methods for obfuscated software reverse engineering", supervised by F. Trahay and O. Levillain
Jules Risse, "Fine-grain energy consumption measurement", supervised by F. Trahay and A. Guermouche
Jana Toljaga, "Virtualization techniques for persistent memory", supervised by G. Thomas, M. Bacou and N. Derumigny
Guillermo Toyos Marfurt, "A Next-Generation State-Machine Replication Protocol for Blockchain", supervised by P. Sutra and P. Kuznetsov
Lucas Van Lanker, "Performance projection of GPU applications", supervised by F. Trahay, E. Brunet and H. Taboada
Nevena Vasilevska, "Hardware cache controlled by software for memory disaggregation", supervised by G. Thomas, J. Dumas and N. Derumigny
Tara Aggoun, "Design and implementation of a disaggregated Java virtual machine", supervised by G. Thomas and J.-P. Lozi
Harena Rakotondratsima, "Design and implementation of in-process isolation mechanisms", supervised by G. Thomas and N. Derumigny
Minh Tung Nguyen, "Computability and Complexity in Mixed-Trust Distributed Systems", supervised by P. Sutra

Defended PhD:

Mickaël Boichot, "Characterizing parallel applications for porting to multi-GPUs systems", supervised by P. Carribault and E. Brunet
Adam Chader, "Large-scale garbage collectors", supervised by G. Thomas and M. Bacou
Marie Reinbigler, "Frugal multiresolution analysis of gigapixel images: application to biomedical data and beyond", supervised by C. Fetita and E. Brunet
Boubacar Kane, "Les objets ajustés : Une approche bien fondée et efficace pour la programmation concurrente", supervised by P. Sutra

10.2.2 Juries

Gaël‌ Thomas
- Reviewer of the PhDs of Papa Assane Fall, Nahuel Palumbo, Xiaoxiang (William) Wu (Australia), Lana Scravaglieri, Simon Lambert, Adrian Khelili, Aghiles Ait Messaoud, Guillermo Polito (HdR)
- Examiner of the‌ PhDs of Léo Cosseron, Eduardo Tomasi Ribeiro, Himadri‌ Pandya, Matthieu Bettinger, Ayush Pandey
François Trahay
- President‌ of the PhD committee for Boubacar Kane, Institut‌ Polytechnique de Paris
- Reviewer of the PhDs of‌ Aymeric Millan, Louis Boulanger, Himadri Pandya
Pierre Sutra‌
- President of the PhD committee for Luciano Freitas‌ de Souza, Institut Polytechnique de Paris
Élisabeth Brunet‌ : examiner of the PhD of Youssouph Faye.‌
Mathieu Bacou
- Expert member of the jury to‌ award the VAE "Expert en Sécurité des Systèmes‌ d'Information (ESSI)" of ANSSI
- Examiner of the PhD‌ of Jean-Baptiste Decourcelle

11 Scientific production

11.1 Major‌ publications

[1] Catherine Guelque, Valentin Honoré, Philippe Swartvagher, Gaël Thomas and François Trahay. PALLAS: a generic trace format for large HPC trace analysis. IPDPS 2025: 39th IEEE International Parallel & Distributed Processing Symposium, Milan, Italy, 2025.
[2] Adjusted Objects: An Efficient and Principled Approach to Scalable Programming. MIDDLEWARE '25: 26th International Middleware Conference, Nashville (Tennessee), United States, ACM, December 2025, 215-227.
[3] An Exact Characterization of the Two-shot Deterministic Objects Solving Two-process Consensus. PODC '25: ACM Symposium on Principles of Distributed Computing, Santa María Huatulco, Mexico, ACM, June 2025, 477-487.

11.2 Publications of the year

International journals

[4] Jose Bolina, Douglas Antunes, Lasaro Camargos and Pierre Sutra. Generic Multicast: One Group Communication Primitive to Rule Them All. Journal of Internet Services and Applications 16(1), December 2025, 666–681.
[5] Jean-François Dumollard, Sulian Le Bozec-Chiffoleau and José Neto. Weak power domination. Annals of Operations Research, September 2025.

International peer-reviewed conferences

[6] Mickaël Boichot, Adrien Roussel, Elisabeth Brunet and Patrick Carribault. Leveraging interaction between memory footprint and parallelism degree for efficient GPU portings. HCW 2025: 34th Heterogeneity in Computing Workshop, 2025 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Milan, Italy, IEEE, June 2025, 857-865.
[7] Catherine Guelque, Valentin Honoré, Philippe Swartvagher, Gaël Thomas and François Trahay. PALLAS: a generic trace format for large HPC trace analysis. IPDPS 2025: 39th IEEE International Parallel & Distributed Processing Symposium, Milan, Italy, 2025.
[8] Marie Reinbigler, Rishi Sharma, Rafael Pires, Elisabeth Brunet, Anne-Marie Kermarrec and Catalin Fetita. Efficient pyramidal analysis of gigapixel images on a decentralized modest computer cluster. Euro-Par 2025: 31st International European Conference on Parallel and Distributed Computing, Lecture Notes in Computer Science 15902, Dresden, Germany, Springer Nature Switzerland, August 2025, 298-312.
[9] Jules Risse, Amina Guermouche and François Trahay. Fine-grain energy consumption modeling of HPC task-based programs. CLUSTER 2025: IEEE International Conference on Cluster Computing, Edinburgh, United Kingdom, IEEE, October 2025.

Conferences without proceedings

[10] Jean-Thomas Acquaviva, Jalil Boukhobza, Philippe Deniel, Shadi Ibrahim, Philippe Raipin-Parvédy and François Trahay. Minutes from the 9th edition of the Performance and Scalability of Storage Systems workshop (Per3S), 23rd May 2025, "Maison des Mines et des Ponts", Paris. 9th edition of the workshop Performance and Scalability of Storage Systems (Per3S), Paris, France, 2025, 1-2.

Edition (books,‌‌ proceedings, special issue of a journal)

[11] Adjusted Objects: An Efficient and Principled Approach to Scalable Programming. MIDDLEWARE '25: 26th International Middleware Conference, Nashville (Tennessee), United States, ACM, December 2025, 215-227.
[12] An Exact Characterization of the Two-shot Deterministic Objects Solving Two-process Consensus. PODC '25: ACM Symposium on Principles of Distributed Computing, Santa María Huatulco, Mexico, ACM, June 2025, 477-487.
[13] Brief Announcement: Revisiting Lower Bounds for Two-Step Consensus. PODC '25: ACM Symposium on Principles of Distributed Computing, Santa María Huatulco, Mexico, ACM, June 2025, 58-61.
[14] Making Democracy Work: Fixing and Simplifying Egalitarian Paxos. 29th International Conference on Principles of Distributed Systems (OPODIS 2025), Iaşi, Romania, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2026.

Doctoral‌ dissertations and habilitation theses‌‌

[15] Boubacar Kane. Adjusted objects: An efficient and principled approach to scalable programming. Institut Polytechnique de Paris, January 2025.

Reports & preprints‌

[16] Mathieu Bacou, David Beserra, Eugen Dedu, Loïc Desgeorges, Didier Donsez, Alexandre Guitton, Baptiste Jonglez, Arnaud Legrand, Georgios Papadopoulos, Olivier Richard, Samir Si-Mohammed, Nina Tamdrari and Fabrice Theoleyre. Journée thématique du GDR RSD : pratiques expérimentales de la communauté systèmes et réseaux. January 2025.
[17] Boubacar Kane and Pierre Sutra. Adjusted Objects: An Efficient and Principled Approach to Scalable Programming (Extended Version). 2025.
[18] Fedor Ryabinin, Alexey Gotsman and Pierre Sutra. Making Democracy Work: Fixing and Simplifying Egalitarian Paxos (Extended Version). 2025.
[19] Fedor Ryabinin, Alexey Gotsman and Pierre Sutra. Revisiting Lower Bounds for Two-Step Consensus. 2025.

BENAGIL - 2025

BENAGIL - 2025
