KRAKOS

2025 Activity Report: Project-Team KRAKOS

RNSR: 202424576N‌

Research center Inria Centre at Université Grenoble Alpes
In partnership with: Université de Grenoble Alpes, Institut polytechnique de Grenoble, CNRS
Team name: Design of performance, robust, secure, flexible, and energy-efficient system software
In collaboration with: Laboratoire d'Informatique de Grenoble (LIG)

Creation of the Project-Team: 2024 October 01

Keywords

Computer Science and Digital Science

A1.1.1. Multicore,‌ Manycore
A1.1.9. Fault tolerant systems
A1.1.10. Reconfigurable architectures‌
A1.1.13. Virtualization
A1.3. Distributed Systems
A2.2.3. Memory management‌
A2.2.4. Parallel architectures
A2.2.5. Run-time systems
A2.6. Infrastructure‌ software

1 Team members,‌ visitors, external collaborators

Research Scientist

Baptiste Lepers [‌INRIA, Advanced Research Position, HDR]‌

Faculty Members

Alain Tchana [Team leader,‌ GRENOBLE INP, Professor]
Noel De Palma‌ [UGA, Professor]
Fabienne Dechamboux [‌UGA, Professor]
Renaud Lachaize [UGA‌, Associate Professor]
Vania Marangozova [UGA‌, Professor]
Nicolas Palix [UGA,‌ Associate Professor]
Thomas Ropars [UGA,‌ Associate Professor]

Post-Doctoral Fellows

Celestin Bessala Bessala‌ [FLORALIS, Post-Doctoral Fellow, from Jul‌ 2025 until Oct 2025]
Kenta Ishiguro [‌GRENOBLE INP, Post-Doctoral Fellow, from Mar‌ 2025]
Daniel Ndjodo Bessala [UGA,‌ Post-Doctoral Fellow, from Aug 2025]

PhD‌ Students

Ivane Adam [UGA]
Paul Breuil‌ [ENSMP]
Fonyuy-Asheri Caleb [INRIA]‌
Maxime Collette [INRIA, from May 2025‌]
Ifechukwu Ejiofor [UGA, from Oct‌ 2025]
Papa Assane Fall [INRIA,‌ from Oct 2025 until Nov 2025]
Jordan‌ Gounou Fondjo [GRENOBLE INP]
Gabriel Job‌ Antunes Grabher [UGA]
Yves Kone [‌TOULOUSE INP]
Jean-Luc Mahop Ma Ngos [‌UGA]
Gregoire Mugnier [UGA, until‌ Jun 2025]
Armel Nguetoum Mewoupea [UGA‌, from Apr 2025]
Yannick Nzali Koagne‌ [UGA]
Arnold Okala Nanga [ORANGE‌, CIFRE]
Damase Onana [Vates]‌
Benjamin Priour [HUAWEI, CIFRE, from‌ Sep 2025]
Brice Teguia Wakam [ORANGE‌]

Technical Staff

Louis Duval [GRENOBLE INP‌, Engineer, from Apr 2025]
Andre‌ Freyssinet [UGA, Engineer]
Franck Kamokoue‌ Sikati [GRENOBLE INP, from Nov 2025‌]
Tony Kwenkeu [GRENOBLE INP, Engineer‌, from Oct 2025]
Armel Nguetoum Mewoupea [UGA, Engineer‌, until Mar 2025‌]
Albin Petit [‌‌INRIA, Engineer]
Jules Seban [INRIA‌, Engineer, from‌ Dec 2025]
Remi‌‌ Segretain [UGA, Engineer]

Interns and‌ Apprentices

Elouan Barraud [‌UGA, Intern,‌‌ from Feb 2025 until May 2025]
Merveille‌ Biada Tchuisseu [GRENOBLE‌ INP, Intern,‌‌ from Oct 2025]
Maxime Bodart [GRENOBLE‌ INP, Intern,‌ until Jul 2025]‌‌
Julien Brelot [GRENOBLE INP, Intern,‌ from Mar 2025 until‌ Jul 2025]
Marie-Line‌‌ Da Costa Bento [INRIA, Intern,‌ from Jun 2025 until‌ Sep 2025]
Greg‌‌ Depoire–Ferrer [ENS Lyon, from Feb 2025‌]
Kevin Efremov [‌GRENOBLE INP, Intern‌‌, until Aug 2025]
Ifechukwu Ejiofor [‌FLORALIS, Intern,‌ from Feb 2025 until‌‌ Jun 2025]
Thomas Fourier [INRIA,‌ Intern, from Apr‌ 2025 until Sep 2025‌‌]
Kimia Khademlou [UGA, Intern]‌
Fideline Kuetche [GRENOBLE‌ INP, from Sep‌‌ 2025]
Meyo Charlotte Lysana Georgia [INRIA‌, Intern, from‌ Sep 2025]
Weihao‌‌ Ni [INRIA, Intern, from Mar‌ 2025 until Aug 2025‌]
Corentin Oparowski [‌‌INRIA, Intern, from May 2025 until‌ Aug 2025]
Jad‌ Salameh [UGA,‌‌ Intern]
Jules Seban [GRENOBLE INP,‌ Intern, from Mar‌ 2025 until Aug 2025‌‌]
Franck Tamwo [GRENOBLE INP, Intern‌, from Sep 2025‌]
Yann Brady Tchounkeu‌‌ Djabou [INRIA, Intern, from Jun‌ 2025 until Aug 2025‌]
Niels Terese [‌‌INRIA, Intern, until Apr 2025]‌
Xiaoxiang (William) Wu [‌INRIA, Intern,‌‌ until Apr 2025]
Yuben Yang [INRIA‌, Intern, until‌ Jul 2025]
Alexander‌‌ Yanovskyy [UGA, Intern, from Feb‌ 2025 until Jul 2025‌]

Administrative Assistant

Annie‌‌ Simon [INRIA]

2 Overall objectives

2.1‌ Presentation

Created on October‌ 1st, 2024, KrakOS is‌‌ the Systems group at Inria Centre at Université‌ Grenoble Alpes. The team‌ name pays homage to‌‌ Sacha Krakowiak, emeritus professor from Grenoble whose work‌ has significantly influenced the‌ local and international scientific‌‌ community in operating systems research.

Data centers are‌ an essential pillar of‌ computing infrastructures. They host‌‌ the vast majority of applications used daily by‌ businesses and individuals, along‌ with associated data. Applications‌‌ are increasingly diverse and must meet ever-stronger efficiency‌ constraints in terms of‌ responsiveness, data volumes, and‌‌ energy consumption. To meet these needs, data centers‌ are designed with complex‌ multi-level architectures, characterized by:‌‌

Large scale: Number of physical and virtual servers,‌ volumes of internal and‌ external requests
Density and‌‌ resource sharing: Number of applications cohabiting on each‌ physical server
Hardware heterogeneity:‌ At the server scale‌‌ and at the data center scale
Multiple accelerators:‌ NVM, GPU, TPU, PIM,‌ FPGA, etc.
Extremely advanced‌‌ microarchitectures: AMP, NUMA, DDIO,‌ SGX, etc.

System layers (hypervisor, operating system, centralized‌ or distributed runtime) play a critical role due‌ to the control they exercise over both hardware‌ resources and software activities: they directly impact the‌ security, stability, and efficiency of the data center,‌ and therefore the applications it hosts.

Numerous works from the scientific community have highlighted the growing mismatch between the characteristics of current system layers and those of the data centers described above. Current systems are difficult to maintain, evolve, observe and supervise, optimize, make reliable, and secure, especially as each of these objectives conflicts with the others. In general, these difficulties lead to under-exploitation of the potential of hardware resources. These inefficiencies are amplified by the significant and growing reduction in time scales, both for the latencies of certain hardware resources and for the durations of application tasks ("microsecond-scale" computing).

2.2 Objectives

The KrakOS team aims to revisit the fundamental principles that have so far governed the construction of system layers, in order to account for the characteristics of modern data centers and anticipate future developments. KrakOS targets five main objectives and the inherent trade-offs between them:

Performance,‌ characterized by application metrics such as execution time,‌ throughput, latency, as well as statistical indicators on‌ the variability of these metrics;
Fault tolerance and‌ high availability;
Velocity of development, testing, and deployment (so that new requirements can be accommodated quickly);
Expressiveness and flexibility of programming interfaces (APIs), to‌ simplify the work of application programmers;
Energy efficiency.‌ KrakOS aims to achieve the above objectives while‌ maintaining (at minimum) the energy efficiency of systems‌ or improving it.
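
As an illustration of the performance objective, the sketch below summarizes a set of latency samples with a mean, a median, a 99th percentile, and a coefficient of variation. The function name, the microsecond unit, and the nearest-rank percentile rule are assumptions made for this example; they are not metrics prescribed by the team.

```python
import statistics


def summarize_latencies(samples_us):
    """Summarize latency samples (assumed positive, in microseconds)."""
    if len(samples_us) < 2:
        raise ValueError("at least two samples are needed to estimate variability")
    ordered = sorted(samples_us)
    n = len(ordered)
    mean = statistics.fmean(ordered)
    # Nearest-rank 99th percentile: integer ceiling of 0.99 * n.
    rank = (99 * n + 99) // 100
    return {
        "mean": mean,
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        # Coefficient of variation: sample standard deviation relative to the mean.
        "cv": statistics.stdev(ordered) / mean,
    }
```

Reporting the tail and the spread alongside the mean matters because a small fraction of slow requests can dominate the tail while leaving the median unchanged.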

Like any systems research team, KrakOS aims to invent new abstractions, concepts, policies, mechanisms, and techniques. Prototyping and empirical evaluation are the preferred methods to validate our proposed contributions. Theoretical proofs are rarely performed in this domain given the complexity of the studied systems.

3‌ Research program

3.1 Methodology

KrakOS has a unique‌ approach in its scientific methodology:

Revisit and question the relevance of established solutions in systems (the Process and Thread abstractions, for example);
Revisit and question the relevance of solutions that have not succeeded (microkernels, for example).

KrakOS validates its results primarily empirically. To that end, the Grid'5000 research platform and its successor SLICES-FR will be our main experimental testbeds.

3.1.1 M1 - Virtualization

To achieve the‌ stated objectives, KrakOS relies primarily on virtualization. Virtualization‌ is a fundamental tool at the heart of‌ building computer systems. It enables optimal resource utilization,‌ isolation/security, uniformity in resource access, and facilitates the‌ design of fault tolerance techniques.

We consider virtualization‌ in its original sense, as defined by Sacha‌ Krakowiak: the virtualization of a component is the‌ design of an "ideal" abstraction of that component‌ for other components or users. In this definition,‌ the virtualized component can be a physical component‌ (device, machine or grouping of machines) or software‌ (a machine is a stack of virtual machines‌ that goes from the motherboard to the browser, for example).

3.1.2 M2‌ - Profiling, Tracing and‌ Monitoring

Empirical observation and‌‌ therefore observability are at the heart of systems‌ research. They allow identifying‌ and understanding limitations and‌‌ problems: bottlenecks, sources of inefficiency and resource waste,‌ performance anomalies, bugs and‌ complex failures (at hardware‌‌ and software levels).

KrakOS aims to contribute to the production of profiling and tracing tools suited to modern data centers. One of the challenges these data centers pose is their stack of complex and highly distributed layers. In a virtualized cloud environment, for example, it is extremely difficult to reconstruct the path taken by an I/O request that traverses the virtual machine, host system, network, storage system, and disk, then retraces the path in reverse.
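
To make the path-reconstruction problem concrete, the following minimal sketch groups trace events by request and orders them by time. It assumes that every layer tags its events with a shared request identifier and that timestamps come from comparable clocks; both assumptions are hypothetical here, and obtaining them in a real virtualized stack is precisely the hard part.

```python
from collections import defaultdict


def reconstruct_paths(events):
    """Rebuild the per-request sequence of layers from unordered trace events.

    Each event is a (request_id, timestamp_us, layer) tuple; the tuple layout
    is an assumption made for this example.
    """
    by_request = defaultdict(list)
    for request_id, timestamp_us, layer in events:
        by_request[request_id].append((timestamp_us, layer))
    # Sort each request's hops by timestamp and keep only the layer names.
    return {
        request_id: [layer for _, layer in sorted(hops)]
        for request_id, hops in by_request.items()
    }
```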

More generally, the evolution of latencies and throughputs of emerging communication and storage devices, coupled with the strong quality-of-service constraints of cloud applications, requires new approaches that allow an acceptable and flexible compromise between precision, efficiency, and intrusiveness (with respect to both code and privacy). Regarding privacy, for example, data center operators must comply with regulations such as the GDPR. Monitoring tools must therefore be able to trace the I/O activities of virtual machines without observing customer data.
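
One way to read the privacy requirement is that a tracer records request metadata and never the payload. The sketch below illustrates that idea only; the record fields and the function name are hypothetical and do not describe an existing KrakOS tool.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class IoTraceRecord:
    vm_id: str
    operation: str  # for example "read" or "write"
    offset: int
    length: int
    timestamp_ns: int


def trace_io(vm_id, operation, offset, payload):
    """Build a trace record from metadata only; the payload bytes are not stored."""
    return IoTraceRecord(vm_id, operation, offset, len(payload), time.monotonic_ns())
```

Only the payload length leaves the function, so the trace can be stored or exported without carrying the data that the virtual machine read or wrote.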

3.2 Research‌ Axes

KrakOS will pursue‌‌ four research axes simultaneously. These axes are deeply‌ interconnected and address the‌ five main objectives of‌‌ KrakOS: performance, fault tolerance and high availability, velocity‌ of development, API expressiveness‌ and flexibility, and energy‌‌ efficiency.

3.2.1 A1 - Machine Virtualization

KrakOS investigates‌ fundamental problems in machine‌ virtualization that have become‌‌ increasingly critical as datacenters evolve toward greater heterogeneity,‌ incorporate diverse hardware accelerators,‌ and face ever more‌‌ stringent performance requirements. While virtualization has been the‌ cornerstone technology enabling cloud‌ computing for over two‌‌ decades, the assumptions underlying current virtualization systems—designed for‌ relatively homogeneous server fleets‌ with CPU-centric workloads—are increasingly‌‌ mismatched with modern datacenter realities. These mismatches manifest‌ as performance bottlenecks, operational‌ challenges, and missed opportunities‌‌ to leverage emerging hardware capabilities.

The research addresses five interconnected challenges.

VM live migration at scale has become problematic as datacenter hardware diversifies. Migrating virtual machines across heterogeneous hardware platforms (from older to newer CPU generations, between different vendors' processors, or to systems with different accelerator configurations) requires maintaining both functional correctness and performance characteristics. Industry has highlighted the severity of this problem: Microsoft has noted that hardware heterogeneity in Azure contributes significantly to resource fragmentation, where incompatibility between hosts prevents optimal VM placement and leads to hundreds of millions of dollars in efficiency losses.

I/O virtualization efficiency remains a persistent challenge despite decades of research. Current virtualization approaches impose significant performance overhead on I/O-intensive applications due to additional software layers, context switches between guest and host, and memory copies. As storage devices transition to NVMe SSDs and networking speeds reach 100+ Gbps, these overheads become increasingly unacceptable. The challenge is to design virtualization mechanisms that approach bare-metal performance while maintaining the isolation and management benefits that motivate virtualization in the first place.

Hardware accelerator support presents difficulties because emerging accelerators like PIM (Processing-in-Memory), GPUs, TPUs, and FPGAs were designed without virtualization in mind. Each accelerator type presents unique challenges: GPUs have complex memory hierarchies and scheduling requirements, PIM devices tightly couple computation with memory access patterns, and FPGAs require load-time configuration that complicates sharing. Virtualizing these devices while maintaining both performance (near-native execution speed) and isolation (preventing one tenant from observing or interfering with another) requires deep co-design of hardware features and virtualization software.

Nested virtualization, where virtual machines run inside other virtual machines, has long been considered impractical for production use due to performance overhead. However, nested virtualization is increasingly important for scenarios like testing cloud infrastructure, providing cloud-within-cloud services, and enabling sophisticated isolation architectures. Building on recent hardware advances and algorithmic improvements, KrakOS aims to make nested virtualization practical for production deployment.

Finally, security challenges in virtualized environments continue to evolve as attack surfaces expand and new vulnerability classes emerge. Beyond traditional concerns about hypervisor bugs that could allow guest escape, modern threats include side-channel attacks that exploit shared hardware resources, malware operating within guest VMs that must be detected from outside, and supply-chain attacks targeting virtualization infrastructure itself.

KrakOS addresses these challenges‌ through coordinated research across multiple dimensions. The team‌ designs and implements novel I/O virtualization mechanisms that‌ minimize overhead through techniques such as direct device‌ assignment with enhanced isolation, optimized data paths that‌ reduce memory copies, and hardware-software co-design that leverages‌ emerging virtualization features in I/O devices. For emerging‌ accelerators, KrakOS develops transparent virtualization approaches exemplified by‌ the vPIM project for Processing-in-Memory devices, which provides‌ full virtualization support while maintaining near-native performance and‌ strong isolation guarantees. The team creates migration protocols‌ and feasibility testing tools, such as MigCheck, that‌ can predict whether a VM can successfully migrate‌ to a target host before attempting the migration—preventing‌ failures that cause service disruptions and wasted resources.‌ Security is strengthened through multiple approaches: formal verification‌ techniques that can prove properties about critical hypervisor‌ code, hardware-software co-design that leverages trusted execution environments‌ and memory encryption, and virtual machine introspection frameworks‌ (such as GoodKit) that enable security monitoring from‌ the hypervisor without compromising guest privacy or performance.‌ Throughout this research, KrakOS maintains a strong focus‌ on open-source hypervisors—particularly Xen and KVM—to ensure that‌ innovations can be rapidly adopted in production cloud‌ environments and benefit the broader systems community rather‌ than remaining theoretical contributions.
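
The idea of testing migration feasibility before attempting a migration can be illustrated with a deliberately simplified check: the CPU features exposed to the guest must all be available on the target host. This is a sketch under that single assumption, not a description of how MigCheck works; the function names and the feature strings used with them are illustrative.

```python
def missing_features(guest_features, target_host_features):
    """Return the CPU features exposed to the guest that the target host lacks."""
    return sorted(set(guest_features) - set(target_host_features))


def can_migrate(guest_features, target_host_features):
    """A migration is considered feasible only if no guest-visible feature is missing."""
    return not missing_features(guest_features, target_host_features)
```

Returning the list of missing features, rather than a bare boolean, lets an operator see why a candidate host was rejected. A realistic checker would also have to consider devices, accelerators, and hypervisor versions, which this sketch ignores.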

3.2.2 A2 - Mutant‌ Kernels and Key Abstractions for Concurrency

Sub-axis 1:‌ Mutant Kernels - Outsourcing OS Services to User‌ Space

KrakOS studies the extensibility of monolithic kernels‌ through a novel approach: outsourcing system services and‌ abstractions from kernel space to user mode. This‌ research direction is inspired by the microkernel philosophy‌ but operates under fundamentally different constraints and opportunities.‌ While traditional microkernels aim for minimalism—reducing the kernel‌ to a small, provably correct core—the mutant kernel approach preserves the rich‌ feature set of monolithic‌ kernels that applications depend‌‌ on while gaining the flexibility, safety, and evolvability‌ benefits of user-space implementation.‌ This philosophy aligns with‌‌ recent industry trends where major systems are being‌ redesigned to support user-space‌ implementations of traditionally kernel-resident‌‌ services.

Recent work in the systems community demonstrates both the promise and current limitations of this approach. Systems like uFS [13] (file system), Snap [14] (networking stack), and ghOSt [12] (scheduler) have shown that moving individual services to user space can improve flexibility and enable rapid innovation. However, current approaches suffer from three fundamental limitations. First, they consider outsourcing only a single service at a time, neglecting the complex interactions between system services that occur in real kernels. Second, they rely exclusively on classical abstractions like the Process, which provides insufficient nuance to distinguish between ordinary application code and semi-privileged system services that require special treatment: higher scheduling priority, access to privileged resources, and protection from interference by untrusted applications. Third, no existing framework addresses efficient and secure cooperation between multiple outsourced services, despite the fact that services like memory management, scheduling, and I/O must closely coordinate.

KrakOS addresses these limitations through several‌ research directions. The team‌ pursues a holistic study‌‌ of OS service outsourcing that considers multiple services‌ simultaneously and their necessary‌ interactions. This requires designing‌‌ new abstractions specifically for system services—abstractions that sit‌ conceptually between ordinary processes‌ and kernel code, with‌‌ appropriate privileges, protection mechanisms, and scheduling guarantees. The‌ team explores using high-level‌ and provable languages (such‌‌ as Rust or formally verified subsets of C)‌ for implementing these services,‌ taking advantage of user-space‌‌ deployment to leverage stronger type systems and verification‌ tools than are practical‌ in kernel development. Security‌‌ and isolation mechanisms must be carefully adapted to‌ the needs of semi-privileged‌ services, providing protection both‌‌ from untrusted applications and between mutually distrusting system‌ services. Finally, efficient user-kernel‌ communication interfaces are essential—the‌‌ performance overhead of crossing protection boundaries must be‌ minimized since system services‌ handle operations at microsecond‌‌ granularity.

Sub-axis 2: Key Abstractions for Concurrency and‌ Isolation

The fundamental abstractions‌ that developers use to‌‌ structure concurrent and distributed programs have remained largely‌ unchanged since their introduction‌ in the 1960s and‌‌ 1970s. The Process abstraction, introduced by Dijkstra in‌ 1965 and subsequently implemented‌ in pioneering systems like‌‌ MULTICS, provides isolated address spaces and resource management.‌ The Thread abstraction was‌ later derived to accommodate‌‌ concurrent shared-memory programming within a single address space.‌ For over fifty years,‌ developers have been forced‌‌ to make a static choice between these two‌ abstractions during application development,‌ a choice with profound‌‌ implications for performance, scalability, and fault tolerance.

This creates a fundamental dilemma. Multi-threaded applications benefit from efficient communication through shared memory but can only scale to the size of a single machine: they cannot leverage the computational resources of an entire datacenter. Conversely, multi-process applications can scale to an entire datacenter by distributing work across many machines, but suffer from heavy communication overhead since inter-process communication across machines requires expensive serialization, network transmission, and deserialization. The challenge is particularly acute because developers typically lack complete control over where their applications will be deployed: an application designed for a single large machine may later need to scale across multiple machines, or vice versa, but the chosen abstraction is baked into the application's architecture.

KrakOS's vision, articulated in [15], proposes a radical rethinking of these fundamental abstractions. Rather than forcing developers to choose between Processes and Threads, KrakOS designs a system interface that facilitates the integration of new abstractions at the same architectural level, abstractions that can provide different trade-offs between isolation, performance, and scalability. This research leverages modern hardware protection mechanisms including Intel SGX (secure enclaves), Intel MPK (memory protection keys for fast domain switching), Arm TrustZone (secure execution environments), and CHERI capabilities (hardware-enforced fine-grained memory protection). By separating three orthogonal concerns, namely execution flows (units of sequential execution), protection domains (boundaries for isolation and security), and communication mechanisms (how execution flows interact), the programming interface allows applications to compose these elements in ways that match their specific needs rather than being constrained by the Process/Thread dichotomy.

The research extends beyond API‌ design to system runtime implementation. KrakOS develops runtimes‌ capable of dynamically selecting the most relevant communication‌ and isolation mechanisms for each application based on‌ current deployment context, workload characteristics, and performance requirements.‌ For example, when co-located on a single machine,‌ components might communicate through shared memory; when distributed,‌ the same application might transparently switch to network‌ communication. This dynamic adaptation requires sophisticated runtime support‌ that can make these decisions efficiently and transparently.‌ Finally, the team extends compilers and code generators‌ to enable simplified or even transparent use of‌ these new abstractions, allowing developers to express high-level‌ intent rather than low-level mechanism choices, with the‌ compiler and runtime collaborating to select appropriate implementations.‌
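
The runtime decision described above can be sketched as a function of where two components are placed. The `Placement` type, the three channel names, and the rule that components in the same protection domain use a direct call are assumptions made for this example, not the interface proposed by the team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    host: str
    protection_domain: str


def select_channel(src, dst):
    """Pick a communication mechanism from the current placement of two components."""
    if src.host != dst.host:
        return "network"  # distributed: messages must cross machines
    if src.protection_domain != dst.protection_domain:
        return "shared_memory"  # co-located but isolated from each other
    return "direct_call"  # same machine and same protection domain
```

Because the choice depends only on placement, the same application code could be redeployed across machines without changing how its components are written; a real runtime would also weigh workload characteristics and performance requirements, as the text notes.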

3.2.3 A3 - Disaggregation

The rise of cloud computing, enabled by virtualization technologies, has paradoxically led to server fragmentation: the chronic underutilization of hardware resources within individual servers. While virtualization allows multiple workloads to share a physical machine, the granularity of allocation remains at the server level, leading to situations where some servers have excess CPU capacity while others have unused memory, yet these resources cannot be efficiently shared across server boundaries. Resource disaggregation addresses this fundamental limitation by enabling more flexible allocation of hardware resources at finer granularities. The economic impact is substantial: Microsoft estimates that even a 1% reduction in fragmentation within its Azure cloud platform would generate savings of hundreds of millions of dollars annually [11], highlighting both the scale of the problem and the potential impact of effective solutions.
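
The fragmentation effect can be shown with a toy model in which each server is described by its free CPU cores and free memory. The two functions and the numbers used with them are illustrative assumptions; they ignore interconnect latency and every other cost of pooling.

```python
def fits_server_centric(free, request):
    """A request fits only if a single server has enough of both resources."""
    cpu, mem = request
    return any(c >= cpu and m >= mem for c, m in free)


def fits_pooled(free, request):
    """With disaggregation, CPU and memory are drawn from rack-wide pools."""
    cpu, mem = request
    return sum(c for c, _ in free) >= cpu and sum(m for _, m in free) >= mem
```

With one server holding 8 free cores but only 4 GB of free memory, and another holding 1 free core and 64 GB, a request for 4 cores and 32 GB fits neither server even though the rack as a whole has ample capacity for it.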

KrakOS pursues research‌ on two complementary approaches to disaggregation, each with‌ distinct characteristics and challenges: software-based (soft) disaggregation and‌ hardware-based (hard) disaggregation.

Soft Disaggregation (Software-based)

Soft disaggregation retains the traditional "server-centric" paradigm where the fundamental building block remains a complete server machine, but modifies software layers, particularly hypervisors and operating systems, to allow virtual machines to dynamically leverage hardware resources from multiple physical servers within the same rack. This approach benefits from emerging high-speed interconnection technologies like CXL (Compute Express Link), which provide memory-semantic access across physical server boundaries with latencies approaching those of local DRAM.

KrakOS‌‌ investigates several research directions within soft disaggregation. For‌ memory disaggregation, the‌ team revisits fundamental OS‌‌ algorithms including memory management, synchronization primitives, and checkpointing‌ mechanisms to account for‌ NUMA (Non-Uniform Memory Access)‌‌ and CXL-based topologies where memory access latencies vary‌ significantly depending on physical‌ location. The research explores‌‌ how user-space service delegation (as discussed in axis‌ A2 on mutant kernels)‌ can simplify the implementation‌‌ of disaggregation mechanisms and improve system resilience by‌ isolating complex memory management‌ policies in recoverable user-space‌‌ services. For I/O disaggregation, KrakOS optimizes data‌ communication streams through automatic‌ and transparent migration or‌‌ distribution of TCP and QUIC sessions across multiple‌ network interfaces, coupled with‌ global and opportunistic management‌‌ of memory buffers that can reduce data copies‌ and eliminate bottlenecks in‌ distributed application communication.
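
When memory latency depends on physical location, one simple policy is to keep the most frequently accessed pages in local DRAM and place the rest in CXL-attached memory. The sketch below shows that policy only as an illustration; the function name, the tier labels, and the use of raw access counts are assumptions, not the team's algorithm.

```python
def place_pages(access_counts, local_capacity):
    """Assign the hottest pages to local DRAM and the remaining pages to CXL memory.

    access_counts maps a page number to its observed access count;
    local_capacity is the number of pages that fit in local DRAM.
    """
    # Hottest first; ties are broken by page number to keep the result deterministic.
    hottest_first = sorted(access_counts, key=lambda page: (-access_counts[page], page))
    local = set(hottest_first[:local_capacity])
    return {page: ("local" if page in local else "cxl") for page in access_counts}
```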

Hard‌‌ Disaggregation (Hardware-based)

Hard disaggregation represents a more radical‌ approach requiring deep redesign‌ of both hardware and‌‌ system software architectures. Rather than organizing a rack‌ as a collection of‌ complete server machines (server-centric),‌‌ hard disaggregation builds racks as clusters of specialized‌ resource boards (resource-centric architecture).‌ Each resource board, or‌‌ "blade," provides only one type of resource—CPU boards‌ contain only processors and‌ minimal local memory, memory‌‌ boards provide large pools of DRAM, storage boards‌ host persistent storage devices,‌ and so forth. These‌‌ specialized boards are interconnected through an ultra-fast network‌ fabric whose performance and‌ reliability characteristics approach those‌‌ of traditional within-server buses, fundamentally different from commodity‌ inter-rack datacenter networks.

This‌ architectural transformation opens vast‌‌ design spaces that KrakOS explores systematically. The team‌ investigates the adequate scale‌ of disaggregated racks—determining optimal‌‌ numbers of boards and their interconnection topologies to‌ balance performance, cost, and‌ fault isolation. Research on‌‌ board dimensioning examines trade-offs between "light" boards (highly‌ specialized with minimal resources‌ beyond their primary function)‌‌ and "heavy" boards (incorporating more local resources for‌ reduced network dependency). Network‌ communication management presents unique‌‌ challenges: the system must efficiently handle loopback traffic‌ (communication within a virtual‌ server), intra-rack traffic (between‌‌ boards in the same rack), and inter-datacenter traffic‌ (between racks or to‌ external networks), each with‌‌ vastly different performance characteristics. Energy efficiency optimizations explore‌ how disaggregation enables fine-grained‌ power management—for example, powering‌‌ down unused memory boards or consolidating computation onto‌ fewer CPU boards during‌ low-load periods.
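
Consolidating computation onto fewer CPU boards is a bin-packing problem. The sketch below uses the classic first-fit decreasing heuristic as a stand-in; the source does not say which consolidation policy the team uses, and the integer load units and the function name are assumptions for this example.

```python
def consolidate(loads, board_capacity):
    """Pack task loads onto CPU boards with the first-fit decreasing heuristic."""
    boards = []
    for load in sorted(loads, reverse=True):
        if load > board_capacity:
            raise ValueError("a single task exceeds the capacity of one board")
        for board in boards:
            if sum(board) + load <= board_capacity:
                board.append(load)
                break
        else:
            # No existing board has room, so one more board must stay powered on.
            boards.append([load])
    return boards
```

Boards that receive no tasks are candidates for being powered down during low-load periods. First-fit decreasing is a heuristic, so the number of boards it returns is not guaranteed to be minimal.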

A critical‌‌ and underexplored aspect of hard disaggregation is software‌ stack design. KrakOS develops‌ hypervisors and guest operating‌‌ systems with paravirtualized interfaces specifically designed to virtualize‌ a disaggregated rack into‌ elastic virtual servers that‌‌ can dynamically grow and shrink by adding or‌ removing resource boards. This‌ software stack must support‌‌ existing server-centric applications without‌ modification, enable both intra-rack virtual server migration (moving‌ between resource configurations) and inter-datacenter migration (moving complete‌ virtual servers between racks or datacenters), and enforce‌ strict isolation across multiple dimensions including configuration (preventing‌ misconfigurations from affecting other tenants), performance (ensuring one‌ tenant cannot degrade another's performance), fault isolation (preventing‌ failures from propagating), and security/privacy (protecting tenant data‌ and computation from observation or interference). Notably, most‌ existing research on disaggregation focuses on simple isolation‌ abstractions such as Linux processes and containers; KrakOS's‌ work on full virtual machine support for disaggregated‌ architectures addresses a gap in current research while‌ providing the strong isolation properties required for production‌ multi-tenant cloud environments.

3.2.4 A4 - Fault Tolerance‌

System designers often neglect fault tolerance during initial‌ development, focusing primarily on functionality and performance. This‌ approach can render initially effective solutions impractical when‌ resilience requirements are considered, necessitating costly redesigns or‌ abandonment of otherwise promising ideas. KrakOS researchers take‌ a different approach by considering fault tolerance as‌ a first-class concern from the design phase, integrating‌ resilience mechanisms into the fundamental architecture rather than‌ retrofitting them later. The team has identified specific‌ approaches to incorporate fault tolerance for each of‌ the three preceding research axes (machine virtualization, mutant‌ kernels, and disaggregation), ensuring that innovations in these‌ areas can be deployed in production environments where‌ failures are inevitable.

Fault Tolerance for Virtual Machines‌

Observability—the practice of monitoring system execution—is essential for‌ numerous critical functions including crash detection, hang detection,‌ intrusion detection, and performance monitoring. However, implementing effective‌ observability for virtual machines creates a fundamental dilemma‌ known as the Observer/Observed problem. On one hand,‌ the Observer and Observed must reside in distinct‌ fault domains to prevent fault propagation; if they‌ share a fault domain, a failure in the‌ Observed system can corrupt or crash the Observer,‌ defeating the purpose of monitoring. On the other‌ hand, the Observer requires easy and efficient access‌ to the Observed system's state to perform meaningful‌ monitoring without introducing prohibitive performance overhead.

This dilemma becomes particularly acute in virtualized environments. A single VM can host multiple applications, making it a complex entity to monitor. Embedding both Observers and Observed components within the same VM using traditional non-virtualized abstractions (such as separate processes) proves ineffective because a VM crash necessarily crashes all contained processes, including any Observers. Existing approaches, such as the out-of-VM observability framework proposed by Ding et al. [17], attempt to solve this by dedicating a separate VM for observation. However, this architecture makes observation costly because cross-VM communication is significantly more expensive than intra-VM operations. Moreover, it leads to substantial resource waste since each user VM requires a corresponding observer VM, effectively doubling memory, CPU, and management overhead.

KrakOS proposes‌ a novel approach: integrating the Observer into the‌ VMM (Virtual Machine Monitor) membrane. A VM consists‌ of two components: the guest OS that executes‌ applications (a black box from the datacenter manager's perspective) and the VMM‌ that virtualizes hardware and‌ manages the guest. The‌‌ key insight is that the VMM and guest‌ OS share the same‌ address space, yet the‌‌ guest remains isolated through hardware virtualization mechanisms like‌ extended page tables. By‌ extending the VMM to‌‌ incorporate the Observer as a second guest alongside‌ the guest OS, KrakOS‌ achieves both isolation (the‌‌ Observer runs in a separate protection domain) and‌ efficient access (the Observer‌ can directly examine guest‌‌ state without crossing VM boundaries). This architecture enables‌ multiple critical use cases‌ including ransomware detection through‌‌ behavioral monitoring, addressing the semantic gap between hypervisor‌ and guest OS by‌ maintaining high-level semantic information,‌‌ real-time security monitoring with minimal overhead, and performance‌ anomaly detection that can‌ identify subtle degradation patterns.‌‌

Fault Tolerance for Mutant Kernels

When OS services are externalized to user space (as described in research axis A2 on mutant kernels), they become significantly more vulnerable to failures than their kernel space counterparts. While an application process crash typically affects only that specific application, an OS service crash can impact all applications depending on that service, yet user-space services lack the protection and recovery mechanisms traditionally afforded to kernel components. This creates a fundamental challenge: how to provide the flexibility and safety benefits of user-space implementation while maintaining the reliability expectations of critical system services.

KrakOS explores three complementary approaches to address‌ this challenge. The first‌ approach proposes designing a‌‌ new first-class abstraction specifically for OS services that‌ acknowledges their special status—distinct‌ from both ordinary application‌‌ processes and kernel code. This abstraction would provide‌ appropriate protection, scheduling priority,‌ and recovery mechanisms tailored‌‌ to the unique requirements of system services, aligning‌ directly with the broader‌ research agenda on new‌‌ concurrency and isolation abstractions discussed in axis A2.‌

The second approach leverages‌ a kernel fallback mechanism‌‌ where both user-space and kernel-space versions of an‌ OS service coexist. In‌ case of user-space service‌‌ failure, the system can temporarily rely on the‌ default kernel implementation during‌ maintenance and recovery. However,‌‌ this approach introduces the significant challenge of transferring‌ or synchronizing state between‌ versions that may employ‌‌ different policies (such as Most Recently Used versus‌ Least Recently Used for‌ page replacement) and maintain‌‌ different internal data structures. Solving this state reconciliation‌ problem requires either designing‌ services with compatible internal‌‌ representations or developing sophisticated state translation mechanisms.
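The fallback idea can be illustrated with a minimal Python sketch. All names here (`FallbackService`, `lru`, `mru`, `crashing_mru`) are hypothetical and do not correspond to any KrakOS interface. The sketch only shows dispatching to a default implementation once the user-space one fails, and it deliberately leaves out the state reconciliation problem discussed above.

```python
from typing import Callable, List


class FallbackService:
    """Dispatch to a user-space policy, falling back to a kernel default on failure."""

    def __init__(self, user_policy: Callable[[List[int]], int],
                 kernel_policy: Callable[[List[int]], int]) -> None:
        self.user_policy = user_policy
        self.kernel_policy = kernel_policy
        self.user_healthy = True  # cleared once the user-space version fails

    def pick_victim(self, pages: List[int]) -> int:
        """Return the page to evict; `pages` is ordered from least to most recently used."""
        if self.user_healthy:
            try:
                return self.user_policy(pages)
            except Exception:
                # Assumption: any exception stands in for a user-space service crash.
                self.user_healthy = False
        return self.kernel_policy(pages)

    def recover(self) -> None:
        """Mark the user-space service as repaired (state transfer is not modeled)."""
        self.user_healthy = True


def lru(pages: List[int]) -> int:
    return pages[0]   # least recently used page


def mru(pages: List[int]) -> int:
    return pages[-1]  # most recently used page


def crashing_mru(pages: List[int]) -> int:
    raise RuntimeError("simulated user-space service crash")
```

Note that the two policies pick different victims for the same page list, which is exactly why switching between versions is not transparent without some form of state and policy reconciliation.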

The‌ third approach employs user-space‌ redundancy by replicating OS‌‌ services across multiple address spaces. While replication is‌ a well-established technique for‌ fault tolerance, applying it‌‌ to OS services presents unique challenges. Maintaining replica‌ state coherence requires coordination‌ protocols, but traditional replication‌‌ algorithms introduce performance overhead that is unacceptable for‌ latency-critical system services. OS‌ services must handle queries‌‌ at microsecond scale—for example, page fault handling cannot‌ tolerate millisecond-scale coordination delays‌ introduced by consensus protocols.‌‌ Therefore, KrakOS must develop new replication techniques specifically‌ optimized for the extreme‌ performance requirements of system‌‌ services while still providing‌ meaningful fault tolerance guarantees.

Fault Tolerance for Disaggregation‌

The goal of this research direction is to provide correctness and availability guarantees for disaggregated systems, particularly in hard disaggregation designs where resources are physically separated into specialized boards. Prior work on disaggregation has primarily addressed conceptual models and performance optimization, often disregarding reliability concerns that become critical in production deployments.

This research specifically targets‌ hardware crash failures where one or more resource‌ boards crash within a rack—a failure mode that‌ differs fundamentally from traditional node crashes in server-centric‌ architectures. The smaller granularity of failures in disaggregated‌ systems fundamentally reshapes both the challenges and opportunities‌ for fault tolerance, as losing a single memory‌ board affects multiple virtual servers simultaneously in ways‌ that differ qualitatively from losing an entire node.‌ Applications are more likely to encounter failures in‌ disaggregated contexts because the disaggregation of resources increases‌ the number of independent components that can fail,‌ effectively multiplying failure probabilities. The impact and optimal‌ recovery strategy depend critically on the type of‌ failed board (CPU, memory, disk, or network) and‌ its specific configuration, such as cache size on‌ CPU boards or memory capacity on memory boards.‌ Consequently, no one-size-fits-all approach can be effective—different failure‌ scenarios demand fundamentally different recovery mechanisms.
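As a back-of-the-envelope illustration of why failures become more likely, assume (purely for the sake of the sketch) that an application depends on $n$ boards that fail independently, each with the same probability $p$ over a given time window. The probability that at least one of them fails is then:

```latex
P_{\text{fail}}(n) = 1 - (1 - p)^{n} \approx n\,p \quad \text{for } n\,p \ll 1 .
```

Under these simplifying assumptions, an application spread across four boards with $p = 0.01$ faces a failure probability of $1 - 0.99^{4} \approx 0.039$, close to four times the single-component figure. Real boards are neither identical nor fully independent, so this is only an order-of-magnitude argument.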

KrakOS pursues‌ several research directions to address these challenges. The‌ team is developing new formalisms for reasoning about‌ failures, communication patterns, and consistency guarantees in disaggregated‌ infrastructure, as existing theoretical frameworks assume monolithic server‌ architectures. New failure models are essential because existing‌ models cannot accurately capture the hybrid nature of‌ disaggregated systems, where intra-rack communication over ultra-fast fabrics‌ differs fundamentally from inter-rack communication over commodity networks.‌ The team investigates suitable consistency models for applications‌ executing in disaggregated datacenters, balancing the tension between‌ strong guarantees that simplify application development and relaxed‌ models that enable better performance. Finally, cache coherence‌ protocols must be redesigned to handle failures gracefully—for‌ example, determining correct behavior when a CPU core‌ that exclusively owns a cached object crashes, potentially‌ leaving other cores with stale data or blocked‌ on unavailable resources.

4 Application domains

4.1 Overview‌

The research efforts of KrakOS target data centers‌ that run all types of applications, unlike the‌ HPC (High-Performance Computing) domain which focuses on specific‌ scientific workloads. KrakOS aims at accommodating various application‌ types while maintaining a key constraint: non-degradation of‌ performance for one application type in favor of‌ another (unless explicitly specified as a desired policy).‌

Operating across multiple system layers (hypervisor, operating system,‌ and middleware), KrakOS addresses several target areas with‌ specific characteristics and requirements.

4.2 Hypervisor Layer

4.2.1‌ Target Hypervisors

KrakOS focuses on open-source hypervisors that‌ dominate cloud deployments. Xen is used extensively in‌ cloud environments, particularly valued for its strong security‌ properties and robust isolation mechanisms that enable safe‌ multi-tenant deployments. KVM, integrated directly into the‌ Linux kernel, has been widely adopted by major‌ cloud providers due to its performance characteristics and seamless integration with existing‌ Linux infrastructure. By focusing‌ on these two dominant‌‌ hypervisors, KrakOS ensures that research contributions can be‌ rapidly adopted in production‌ cloud environments.

4.2.2 Cloud‌‌ Deployment Models

KrakOS research addresses both private and‌ public cloud deployment models,‌ each with distinct characteristics‌‌ and requirements. In private clouds, where applications‌ belong to a single‌ entity, best-effort resource management‌‌ is often permissible, allowing the focus to remain‌ on overall efficiency and‌ performance optimization across the‌‌ entire infrastructure. In contrast, public clouds hosting applications‌ from different owners require‌ that each application receives‌‌ its subscribed amount of resources, necessitating strict isolation‌ mechanisms and rigorous SLA‌ (Service Level Agreement) enforcement‌‌ to prevent interference between tenants and ensure contractual‌ obligations are met.

4.2.3‌ Cloud Service Models

KrakOS‌‌ deliberately limits its scope to the most complex‌ cloud service models where‌ systems research can have‌‌ the greatest impact. Infrastructure as a Service (IaaS)‌ represents the traditional VM-based‌ cloud model with startup‌‌ times on the order of minutes and complex‌ requirements for resource management,‌ live migration, and multi-tenant‌‌ isolation. Function as a Service (FaaS), one‌ of the newest and‌ most challenging cloud models,‌‌ demands ultra-fast startup times at the microsecond scale,‌ elastic scaling that can‌ respond to rapid workload‌‌ changes, and fine-grained resource allocation mechanisms that can‌ efficiently multiplex short-lived function‌ invocations. The constraints of‌‌ FaaS fundamentally challenge traditional operating system and virtualization‌ assumptions, making it a‌ particularly rich area for‌‌ systems innovation.

4.3 Operating System Layer

KrakOS targets‌ Linux as its primary‌ operating system due to‌‌ several compelling factors. Linux enjoys widespread adoption in‌ both cloud and enterprise‌ environments, making research contributions‌‌ immediately relevant to production deployments. Its open-source nature‌ enables deep modifications and‌ experimental reimplementation of core‌‌ subsystems, essential for systems research. The rich ecosystem‌ and strong community support‌ ensure that innovations can‌‌ be integrated into mainline development and benefit from‌ collaborative improvement. Finally, Linux's‌ presence across the computing‌‌ spectrum—from massive cloud datacenters to resource-constrained edge devices—ensures‌ that KrakOS research on‌ Linux has broad applicability.‌‌

4.4 Middleware and Orchestration

KrakOS addresses several critical‌ middleware layers that sit‌ between applications and infrastructure.‌‌ Message-Oriented Middleware (MOM) plays a vital role in‌ application interoperability within distributed‌ systems, enabling inter-application communication,‌‌ service decoupling, and asynchronous message processing that allows‌ systems to scale and‌ evolve independently. The Edge-Cloud‌‌ continuum represents an increasingly important deployment model where‌ computation must be distributed‌ across multiple tiers—from resource-constrained‌‌ edge devices to massive cloud datacenters—requiring sophisticated mechanisms‌ for latency-sensitive application placement‌ and resource management across‌‌ heterogeneous distributed environments. Kubernetes serves as the primary‌ focus for container orchestration‌ research, as it has‌‌ become the de facto standard for automated deployment,‌ scaling, and management of‌ containerized applications, with direct‌‌ connections to KrakOS research on disaggregation and resource‌ management. Finally, middleware for‌ large-scale data processing,‌‌ encompassing both real-time stream processing and batch analytics,‌ presents challenges in efficiently‌ managing data movement and‌‌ computation placement, directly connecting‌ to the team's work on storage and memory‌ management optimization.

4.5 Domain-Specific Applications

4.5.1 Genomics and‌ Bioinformatics

Through the ANR PicNIC project in collaboration‌ with ICO (Institut de Cancérologie de l'Ouest), KrakOS‌ addresses critical challenges in genomic data processing. The‌ research focuses on reducing data movements in genomic‌ datacenters, optimizing execution times for complex genomic analysis‌ pipelines, minimizing energy consumption, and improving data-intensive workload‌ performance. Genomic applications present unique challenges with extremely‌ large datasets ranging from terabytes to petabytes, complex‌ multi-stage computational pipelines with diverse resource requirements, I/O-intensive‌ operations that can bottleneck on storage systems, and‌ critical needs for data locality optimization to avoid‌ expensive data transfers. These characteristics make genomics an‌ ideal testbed for KrakOS research on disaggregation, efficient‌ I/O, and energy-aware resource management.

4.5.2 Memory-Intensive Applications‌

Given fundamental memory resource limitations in datacenters and‌ the need to accelerate disk-intensive applications, KrakOS specifically‌ targets memory-intensive workloads. Key-value stores such as Memcached‌ serve as caching systems that maintain critical data‌ in memory for low-latency access, requiring efficient in-memory‌ data structure management and horizontal scalability across multiple‌ servers. Graph processing applications perform large-scale analytics on‌ graph structures, characterized by random memory access patterns‌ that challenge traditional memory hierarchies and demand sophisticated‌ memory management to maintain performance at scale.

4.5.3‌ Microservices Architectures

Microservices have emerged as the dominant‌ programming model for modern Internet services, presenting both‌ opportunities and challenges for systems research. These architectures‌ consist of distributed, loosely-coupled services that can be‌ independently deployed and scaled, often implemented in multiple‌ programming languages (polyglot development), with complex inter-service communication‌ patterns. For KrakOS research, microservices introduce challenges in‌ fine-grained resource allocation (as individual services may have‌ vastly different resource needs), service discovery and intelligent‌ routing, fault tolerance mechanisms that prevent cascading failures‌ across service dependencies, and comprehensive performance monitoring and‌ observability that can track requests across dozens of‌ service invocations.

4.6 Cross-Cutting Application Characteristics

KrakOS research addresses applications spanning an enormous range of characteristics, ensuring that proposed solutions are robust and generally applicable. Latency requirements vary from microsecond-scale responsiveness demanded by FaaS functions to minutes or hours acceptable for batch processing jobs. Resource consumption ranges from lightweight serverless functions consuming mere megabytes of memory to resource-intensive analytics requiring hundreds of gigabytes and multiple accelerators. Deployment patterns include single-tenant applications in private clouds, multi-tenant services in public clouds, and hybrid deployments spanning cloud and edge infrastructure. Data patterns encompass data-intensive applications like genomics where I/O dominates execution time, and compute-intensive simulations where CPU and accelerator performance are critical. This diversity of application domains requires KrakOS solutions to be general, robust, and applicable to real-world production environments across multiple industries rather than optimized for narrow use cases.

5 Social and environmental responsibility

5.1‌ Energy Efficiency and Green Computing

Energy efficiency is‌ embedded as a core concern throughout KrakOS research‌ activities, reflecting the team's commitment to reducing the‌ environmental footprint of computing systems. The team conducts research on energy-aware virtualization‌ mechanisms that optimize power‌ consumption without sacrificing performance,‌‌ addressing the growing challenge of datacenter energy costs‌ and carbon emissions. This‌ includes developing novel resource‌‌ management algorithms that consider energy as a first-class‌ optimization criterion alongside traditional‌ performance metrics. The team‌‌ has developed specialized tools for measuring and optimizing‌ energy consumption at multiple‌ system layers. These measurement‌‌ frameworks provide the foundation for understanding energy behavior‌ and designing more efficient‌ systems.

5.1.1 Participation in‌‌ Standardization Initiatives

Nicolas Palix serves as mission leader‌ for "Action Monitoring" within‌ GDRS Écoinfo, the‌‌ national research network dedicated to eco-responsible digital practices.‌ In this role, he‌ coordinates efforts to establish‌‌ best practices and metrics for evaluating the environmental‌ impact of digital technologies‌ across French research institutions.‌‌ The team actively contributes to AFNOR SPEC 2314‌ on Frugal AI,‌ working to define standards‌‌ and best practices for resource-efficient artificial intelligence. This‌ standardization effort aims to‌ ensure that AI systems‌‌ can deliver high performance while minimizing computational resource‌ consumption and energy usage,‌ making AI technologies more‌‌ accessible to organizations with limited infrastructure. The three-year‌ IAoundé Project, funded‌ by Région AURA, focuses‌‌ specifically on frugal AI research and capacity building,‌ promoting sustainable computing practices‌ particularly for resource-constrained environments‌‌ in developing countries.

5.2 Diversity, Equity, and Inclusion‌

5.2.1 Leadership in DEI‌ Initiatives

Alain Tchana serves‌‌ as a member of the ACM (Association for‌ Computing Machinery) Diversity, Equity,‌ and Inclusion Council,‌‌ a prestigious appointment that recognizes his leadership in‌ promoting inclusive practices in‌ computing research and education.‌‌ Through this role, Tchana advocates for increased representation‌ of underrepresented groups in‌ computer science research and‌‌ education, drawing on his extensive experience building partnerships‌ between European and African‌ institutions. He works to‌‌ ensure equitable access to computing resources and opportunities,‌ particularly for researchers and‌ students from developing countries‌‌ who face systemic barriers to participation in international‌ research. Within KrakOS and‌ the broader systems research‌‌ community, we foster inclusive research practices that value‌ diverse perspectives and create‌ welcoming environments for all‌‌ researchers.

5.2.2 Gender Diversity Reflection

The team engages‌ in ongoing self-assessment through‌ internal discussions specifically focused‌‌ on "how to increase the number of women‌ in the team," recognizing‌ that gender diversity remains‌‌ a critical challenge in systems research. KrakOS implements‌ conscious recruiting practices designed‌ to encourage applications from‌‌ underrepresented groups, including targeted outreach to diverse student‌ populations and careful attention‌ to inclusive language in‌‌ job postings and internship descriptions. The team prioritizes‌ creating a welcoming and‌ supportive environment for all‌‌ members, with policies that promote work-life balance and‌ accommodate diverse needs. While‌ the team acknowledges significant‌‌ work remains to achieve representative diversity, these ongoing‌ efforts—particularly the financial commitment‌ to women interns—reflect a‌‌ genuine commitment to structural change rather than symbolic‌ gestures.

5.3 Open Science‌ and Reproducible Research

KrakOS‌‌ maintains a strong commitment to open science principles,‌ recognizing that scientific progress‌ depends on transparent sharing‌‌ of methods, data, and‌ results. The team publishes open-source software and tools‌ including Faho (PIM operating system), vPIM (PIM virtualization),‌ MigCheck (migration feasibility testing), GoodKit (VM introspection), and‌ B-Side (system call identification), making these research artifacts‌ freely available to the community. Team members actively‌ contribute to major open-source projects including the Xen‌ hypervisor and Linux kernel, ensuring that research innovations‌ can benefit production systems used worldwide. The team‌ regularly organizes workshops and seminars for knowledge dissemination,‌ including tutorials at conferences like ComPAS 2025, the‌ Workshop Défi OS, and the Xen Project Winter‌ Meetup, fostering dialogue between researchers and practitioners.

6‌ Highlights of the year

6.1 Team Creation and‌ Inauguration

KrakOS was officially created on October 1,‌ 2024, as an Inria project-team in partnership with‌ Université Grenoble Alpes, Grenoble INP, and CNRS. The‌ team's inauguration ceremony took place on November 25,‌ 2024, at the Inria Centre at Université Grenoble‌ Alpes. This event was honored by the attendance‌ of Sacha Krakowiak, the distinguished emeritus professor after‌ whom the team is named, symbolizing the continuity‌ between pioneering work in operating systems research in‌ Grenoble and KrakOS's mission to advance the field‌ for modern datacenter environments.

6.2 HDR and PhD‌ Defenses

Baptiste Lepers successfully defended his Habilitation à‌ Diriger des Recherches (HDR) on December 12, 2024,‌ marking a significant milestone for the team and‌ recognizing his contributions to operating systems research, particularly‌ in the areas of scheduling, memory management, and‌ system performance optimization. Two PhD students completed their‌ doctoral work in 2024-2025: Papa Assane Fall and‌ William Wu.

6.3 Major Publications

The team achieved‌ remarkable publication success at premier systems conferences, demonstrating‌ the quality and impact of KrakOS research. Papers‌ were accepted at EuroSys 2025 and NSDI 2025‌, two of the most selective venues in‌ systems research. Two papers were accepted at APSys‌ 2025. Additional acceptances include SIGMETRICS 2025 on‌ Intel User Interrupts performance analysis, ASIACCS 2025 on‌ SIMBox fraud detection, and two papers at Middleware‌ 2024 and 2025 on Processing-in-Memory virtualization and binary-level‌ system call identification.

6.4 Awards and Recognition

Team‌ members and alumni received prestigious recognition for their‌ research contributions. Anne-Josiane Kouam was honored with the‌ Prix Science Ouverte 2025, recognizing her commitment to‌ open science principles and her work on fraud‌ detection in telecommunications that balances security with privacy‌ preservation. Yasmine Djebrouni received the Accessit (honorable mention)‌ for the GDR RSD Thesis Award 2025. Stella‌ Bitchebe earned the Accessit for the GDR RSD‌ Thesis Award 2024 for her thesis on nested‌ virtualization optimization.

6.5 International Collaborations

KrakOS expanded its‌ international research network through multiple funding mechanisms and‌ partnership programs. The team secured funding through the‌ France Berkeley Fund for collaboration with Natacha Crooks‌ at UC Berkeley. An Associated Team proposal with‌ the University of British Columbia, co-led with‌ Mohammad Shahrad, is under review to advance responsible‌ cloud computing research. The Associated Team with ENSPY‌ Cameroon, co-led with Thomas Bouetou, was approved and supports the IAoundé‌ frugal AI initiative. An‌ Associated Team with the‌‌ University of Sydney, partnering with Vincent Gramoli,‌ enables blockchain systems research‌ and PhD co-supervision. Additionally,‌‌ a Mourou/Strickland Program collaboration with Mohammad Shahrad at‌ UBC facilitates advanced research‌ exchanges.

The team maintained‌‌ active international mobility with significant research visits: Willy‌ Zwaenepoel from the University‌ of Sydney spent six‌‌ months at KrakOS, contributing expertise in distributed systems;‌ Gohar Irfan Chaudhry from‌ MIT visited for two‌‌ weeks for collaborative research discussions; Alain Tchana conducted‌ extended research stays at‌ MIT (2.5 months) and‌‌ UBC (2 weeks); Maxime Collette and Alain Tchana‌ visited ETH Zurich for‌ collaborative discussions; and multiple‌‌ researchers exchanged visits between Cameroon and Grenoble, strengthening‌ the IAoundé partnership.

6.6‌ Conference Organization and Leadership‌‌

Team members held prominent leadership positions in the systems research community: Artifact Evaluation Chair for OSDI/ATC 2025 and SOSP 2025, and Shadow PC Chair for EuroSys 2025. Team members served on the program committees of major conferences including EuroSys 2025 and 2026, SIGMETRICS 2025, NSDI 2025, ASPLOS 2025, Middleware 2025, NCA 2025, SOSP 2026, and FAST 2026. Vania Marangozova-Martin served as President of the system track for ComPAS 2025, the premier French-language systems conference. Team members also participated in numerous PhD defense committees, serving as presidents, reviewers, and CSI (Comité de Suivi Individuel) members, contributing to doctoral education across France.

KrakOS organized several community events including‌ the Xen Project Winter‌ Meetup, co-organized with‌‌ Vates on January 30-31, 2025, bringing together international‌ contributors to the Xen‌ hypervisor. The team also‌‌ organized the Workshop Défi OS on December 13,‌ 2024, fostering collaboration among‌ French research teams working‌‌ on operating systems challenges.

6.7 Industrial Partnerships

KrakOS‌ actively pursued industrial partnerships‌ to ensure research relevance‌‌ and facilitate technology transfer. With Vates, a‌ leading French virtualization company,‌ the team submitted proposals‌‌ for a LabCom VirtDisk-Lab and a BPI project,‌ and established one CIFRE‌ PhD thesis on virtualization‌‌ and storage systems. A MIAI Industrial Chair proposal‌ was submitted jointly with‌ Vates and EasyVirt, focusing‌‌ on confidential computing and AI-based health workloads. The‌ team secured one CIFRE‌ thesis with Huawei Technologies‌‌, advancing research on AI optimization. Two ongoing‌ CIFRE theses with Orange‌ Labs address virtual machine‌‌ introspection and machine failure detection, contributing to operational‌ challenges in large-scale cloud‌ deployments. While not all‌‌ funding applications were successful, these partnerships demonstrate KrakOS's‌ commitment to bridging academic‌ research and industrial needs.‌‌

7 Latest software developments, platforms, open data

7.1‌ New Software

7.1.1 USM‌

USM is a comprehensive‌‌ framework for developing and deploying memory management policies‌ in Linux entirely in‌ userspace. Unlike traditional approaches‌‌ where memory management policies are embedded in the‌ kernel, USM adopts a‌ microkernel-inspired architecture that moves‌‌ policy implementation to userspace while retaining critical mechanisms‌ in the kernel. The‌ framework provides complete coverage‌‌ of memory management aspects including page allocation, page‌ eviction decisions (what pages‌ to evict, when to‌‌ evict them, and where‌ to store evicted content), and integrated policies that‌ coordinate these decisions. USM enables rapid development and‌ safe experimentation with novel memory management strategies without‌ requiring kernel modifications or system reboots.
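To make the policy/mechanism split concrete, here is a minimal Python sketch of what a pluggable eviction policy might look like. It is not USM's API: the class `LRUEvictionPolicy` and its methods are hypothetical, and the sketch covers only the "what to evict" decision, not when to evict or where to store evicted content.

```python
from collections import OrderedDict
from typing import Optional


class LRUEvictionPolicy:
    """Hypothetical user-space policy: decides which page to evict (LRU order)."""

    def __init__(self) -> None:
        # Keys are page numbers; insertion order tracks recency of use.
        self._pages: "OrderedDict[int, None]" = OrderedDict()

    def on_access(self, page: int) -> None:
        """Record an access; the page becomes the most recently used."""
        self._pages[page] = None
        self._pages.move_to_end(page)

    def choose_victim(self) -> Optional[int]:
        """Remove and return the least recently used page, or None if empty."""
        if not self._pages:
            return None
        victim, _ = self._pages.popitem(last=False)
        return victim
```

In a design of this kind, swapping the policy amounts to providing another class with the same two methods, while the mechanism that actually unmaps and writes back pages stays unchanged.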

The source‌ code has been publicly released to support further‌ research and adoption by the systems community. Development‌ is led by Alain Tchana, Renaud Lachaize, Papa‌ Fall and Jean-Pierre Lozi.

7.1.2 MigCheck

MigCheck is‌ a tool designed to test the feasibility of‌ virtual machine migration in heterogeneous hardware environments before‌ actual migration attempts. The tool performs comprehensive analysis‌ of hardware compatibility, examining CPU instruction set architectures‌ to predict potential migration issues. By identifying incompatibilities‌ early, MigCheck prevents costly migration failures and service‌ disruptions in production cloud environments. The tool is‌ currently in the maturation phase and was submitted‌ for LSI Carnot funding. Development is led by‌ Alain Tchana, Renaud Lachaize, and Kenta Ishiguro.
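The kind of compatibility check described above can be pictured as a set comparison. The following Python sketch is an illustration under stated assumptions, not MigCheck's implementation: it assumes each host is summarized by a set of CPU feature flag names (the flags and the helpers `missing_features` and `can_migrate` are chosen for the example) and treats a migration as feasible when the destination offers every feature the source exposes to the VM.

```python
from typing import FrozenSet, Set


def missing_features(source: Set[str], destination: Set[str]) -> FrozenSet[str]:
    """Return the CPU features the VM may rely on that the destination lacks."""
    return frozenset(source - destination)


def can_migrate(source: Set[str], destination: Set[str]) -> bool:
    """A migration is considered feasible when no required feature is missing."""
    return not missing_features(source, destination)


# Example feature sets (illustrative values only).
host_a = {"sse4_2", "avx", "avx2", "avx512f"}
host_b = {"sse4_2", "avx", "avx2"}
```

With these example sets, a VM started on `host_b` could move to `host_a`, whereas the reverse direction would be flagged because `avx512f` is absent on `host_b`.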

7.1.3‌ vPIM

vPIM provides comprehensive virtualization support for Processing-in-Memory‌ (PIM) devices, enabling multiple virtual machines to efficiently‌ share PIM hardware while maintaining strict isolation and‌ performance guarantees. The system implements novel scheduling and‌ resource management policies specifically designed for the unique‌ characteristics of PIM architectures, where computation occurs directly‌ within memory arrays. vPIM addresses critical challenges in‌ PIM virtualization including memory allocation, DPU (Data Processing‌ Unit) scheduling, and performance isolation between co-located tenants.‌ The source code has been publicly released to‌ support further research and adoption by the systems‌ community. Contact: Alain Tchana.

7.1.4 Faho

Faho is‌ an operating system specifically designed for UPMEM Processing-in-Memory‌ devices, providing dynamic and efficient sharing of Data‌ Processing Units (DPUs) among multiple applications. Unlike traditional‌ batch-oriented approaches, Faho implements time-sharing mechanisms that allow‌ unpredictable job arrivals to be handled efficiently while‌ maintaining fairness and high utilization. The system includes‌ sophisticated scheduling algorithms that account for the unique‌ characteristics of PIM hardware, including limited on-chip memory‌ and the cost of data movement between host‌ and PIM devices. Faho's source code is publicly‌ available, facilitating integration into existing PIM-based systems. Development‌ is led by Alain Tchana and Renaud Lachaize.‌
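As a rough illustration of time-sharing (as opposed to running each job to completion in batch order), the sketch below simulates plain round-robin scheduling in Python. It is not Faho's scheduler: the function `round_robin`, the notion of abstract time units, and the single shared pool of DPUs are all assumptions made for the example, and the sketch ignores data movement costs and on-chip memory limits.

```python
from collections import deque
from typing import Deque, List, Tuple


def round_robin(jobs: List[Tuple[str, int]], quantum: int) -> List[str]:
    """Return the order in which jobs finish under round-robin time slicing.

    Each job is (name, remaining_time_units); all jobs share one pool of DPUs.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue: Deque[Tuple[str, int]] = deque(jobs)
    finished: List[str] = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= quantum  # the job runs for one time slice
        if remaining > 0:
            queue.append((name, remaining))  # not done: back of the queue
        else:
            finished.append(name)
    return finished
```

For example, `round_robin([("a", 5), ("b", 2), ("c", 3)], quantum=2)` lets the short jobs `b` and `c` finish before the long job `a`, which is the behavior that makes time-sharing attractive when job arrivals and durations are unpredictable.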

7.1.5 GoodKit

GoodKit provides an efficient and robust virtual machine introspection framework that enables security monitoring, debugging, and analysis of guest VMs. The framework is designed to minimize performance overhead while providing comprehensive visibility into VM internals, including memory access patterns, system call activity, and kernel data structures. GoodKit addresses the semantic gap problem that traditionally plagues VM introspection by maintaining high-level semantic information about guest OS structures. This capability is essential for security applications such as intrusion detection, malware analysis, and compliance monitoring in cloud environments. The source code is publicly available. Development is led by Alain Tchana and Renaud Lachaize.

7.1.6 B-Side‌

B-Side performs sophisticated binary-level static identification of system‌ calls in compiled applications, enabling security analysis and‌ monitoring without requiring access to source code. The‌ tool employs advanced program analysis techniques including control‌ flow reconstruction, symbolic execution, and pattern matching to accurately identify system call‌ sites even in heavily‌ optimized or obfuscated binaries.‌‌ B-Side is particularly valuable for security auditing of‌ closed-source software, legacy applications,‌ and potentially malicious code‌‌ where source access is unavailable. Contact: Alain Tchana.‌

7.1.7 P4CEMaker

P4CEMaker is a novel system designed to semi-automatically accelerate existing RDMA-based consensus protocols through the use of a programmable switch. We demonstrated the usefulness of P4CEMaker by accelerating four different consensus protocols, achieving up to a 2x performance improvement with around a day of work per protocol. Contact: Baptiste Lepers.

7.1.8 DirtBuster

DirtBuster is a tool that identifies scenarios in which CPU caches perform suboptimally. CPU caches have been heavily optimized to cache DRAM, but are now increasingly used to cache data coming from other memory devices (e.g., persistent memory, CXL memory, FPGA memory). In such scenarios, caches may perform suboptimally. By using a combination of static and dynamic analysis, DirtBuster identifies applications and code regions that are likely to suffer from suboptimal cache behavior. Developers can then add hints to direct the cache (these hints are also suggested by DirtBuster). Contact: Baptiste Lepers.

7.2‌ New platforms

7.2.1 IBARA‌‌ - Portable Micro-Cluster for Africa

IBARA is a‌ portable and autonomous micro-cluster‌ specifically designed for teaching,‌‌ research, and service hosting in countries facing electrical‌ infrastructure challenges. This innovative‌ platform integrates high-performance micro-computers‌‌ within a compact transportable suitcase, enabling rapid deployment‌ of computing capabilities in‌ isolated locations or areas‌‌ with severe electricity deficits.

IBARA's distinguishing feature is‌ its hybrid power system‌ guaranteeing uninterrupted operation. The‌‌ platform operates on mains electricity when available, automatically‌ switches to a storage‌ battery (automotive-type) during power‌‌ outages, and maintains battery charge through an integrated‌ foldable solar panel, enabling‌ completely off-grid operation. This‌‌ design addresses the reality of frequent power interruptions‌ in many African regions‌ while providing the reliable‌‌ computing infrastructure essential for modern education and research.‌
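The source-selection behaviour described above can be summarized in a few lines. The following Python sketch is illustrative only: the names `PowerSource`, `select_source`, and `step_battery`, and the watt-hour bookkeeping, are hypothetical and do not describe IBARA's actual power controller.

```python
from enum import Enum


class PowerSource(Enum):
    MAINS = "mains"
    BATTERY = "battery"
    NONE = "none"


def select_source(mains_available: bool, battery_wh: float) -> PowerSource:
    """Prefer mains power; fall back to the battery during an outage."""
    if mains_available:
        return PowerSource.MAINS
    if battery_wh > 0.0:
        return PowerSource.BATTERY
    return PowerSource.NONE


def step_battery(battery_wh: float, capacity_wh: float, source: PowerSource,
                 load_wh: float, solar_wh: float) -> float:
    """Advance the battery by one time step (hypothetical energy model).

    Solar energy always charges the battery; the cluster load drains it
    only while the cluster is running on battery power.
    """
    drain = load_wh if source is PowerSource.BATTERY else 0.0
    return max(0.0, min(capacity_wh, battery_wh + solar_wh - drain))
```

For instance, during an outage with a 100 Wh charge, a 30 Wh load, and 10 Wh of solar input over the step, the battery would end the step at 80 Wh under this simplified model.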

Within the IAoundé project‌ framework, IBARA enables KrakOS‌‌ to deliver hands-on teaching on cloud computing, virtualization,‌ and distributed systems at‌ partner universities (University of‌‌ Yaoundé 1, ENSPY) without requiring expensive datacenter infrastructure.‌ Students gain practical experience‌ with modern technologies—virtual machines,‌‌ distributed applications, container orchestration—using a system designed for‌ their specific infrastructural constraints.‌ As a research platform,‌‌ IBARA supports experiments on energy-aware scheduling, resilient system‌ design, and efficient resource‌ utilization, aligning with KrakOS's‌‌ work on green computing while addressing real-world deployment‌ challenges. The platform also‌ provides practical hosting capabilities‌‌ for local university services, reducing dependence on distant‌ cloud providers and supporting‌ digital sovereignty.

Key Features:‌‌ Portable micro-cluster in transportable suitcase; hybrid power (mains/battery/solar);‌ autonomous operation in isolated‌ locations; supports teaching, research,‌‌ and local hosting

7.2.2 Grid'5000

KrakOS extensively uses‌ Grid'5000, a large-scale distributed‌ computing testbed for experimental‌‌ research. The platform provides access to diverse hardware‌ configurations essential for validating‌ virtualization and resource management‌‌ research.

7.2.3 SLICES-FR

The team participates in SLICES-FR,‌ the French component of‌ the European SLICES infrastructure‌‌ for large-scale experimental research‌ in networking, distributed computing, and IoT.

8 New‌ results

8.1 Physical Memory Management in Userspace (USM)‌

Papa Assane Fall's PhD research addressed a critical‌ challenge in datacenter memory management. Main memory is‌ a critical resource in datacenters due to its‌ major impact on application performance and server costs.‌ However, Linux's memory management (MM) system, designed to‌ be general-purpose, is not always optimal for the‌ diverse workload requirements encountered in production cloud environments.‌

Fall introduced USM (User-Space Memory), the first‌ complete framework for rapid development of memory management‌ policies in Linux. USM adopts a microkernel-inspired design‌ that enables MM policies to run entirely in‌ userspace, aligning with KrakOS's broader research agenda on‌ mutant kernels (axis A2). This architecture addresses several‌ key requirements for extensible memory management including generality‌ (supporting diverse policy types), simplicity (reducing development complexity),‌ safety (preventing policy bugs from crashing the kernel),‌ reconfigurability (enabling dynamic policy changes), transparency (maintaining compatibility‌ with existing applications), and observability (providing detailed insights‌ into memory behavior).

Participants: Assane Fall, Jean-Pierre‌ Lozi, Renaud Lachaize, Alain Tchana.‌

8.2 Processing-in-Memory Virtualization

The team made significant advances‌ in Processing-in-Memory (PIM) virtualization through two complementary systems.‌ First, vPIM 16 provides a comprehensive virtualization solution‌ for PIM devices, addressing the challenge of efficiently‌ virtualizing emerging PIM architectures. This system enables multiple‌ virtual machines to share PIM hardware while maintaining‌ performance isolation between tenants, a critical requirement for‌ cloud environments. Second, the team developed Faho,‌ a time-sharing system designed to optimally manage UPMEM‌ PIM resources when independent jobs arrive unpredictably in‌ the system. Faho implements sophisticated scheduling policies that‌ balance fairness, throughput, and energy efficiency, demonstrating that‌ PIM systems can effectively support multi-tenant workloads in‌ production cloud environments. This work is under review‌ at ISCA.

Participants: Maxime Collette, Ni Weihao‌, Renaud Lachaize, Alain Tchana.

8.3‌ Heterogeneous VM Migration

Research on heterogeneous virtual machine‌ migration led to the development of MigCheck,‌ a tool that tests migration feasibility across different‌ hardware platforms before attempting actual migration. This work‌ is crucial for cloud operators managing diverse hardware‌ fleets, as it prevents migration failures that can‌ lead to service disruptions and resource waste. This‌ research addresses a critical operational challenge in modern‌ cloud datacenters where hardware heterogeneity is increasing due‌ to rapid technology evolution. This work is under‌ review at EuroSys.
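One generic precondition for migrating a VM between different machines is that every CPU feature exposed to the guest is also available on the destination host. The sketch below illustrates that single check in Python; it is an assumption made for illustration, not a description of the checks MigCheck actually performs, and all function names are hypothetical.

```python
from __future__ import annotations


def missing_features(guest_features: set[str], dest_features: set[str]) -> set[str]:
    """CPU features the guest relies on that the destination lacks."""
    return guest_features - dest_features


def can_migrate(guest_features: set[str], dest_features: set[str]) -> bool:
    """Feasible (for this one criterion) if nothing is missing."""
    return not missing_features(guest_features, dest_features)


def feasible_destinations(guest_features: set[str],
                          hosts: dict[str, set[str]]) -> list[str]:
    """Names of the hosts that could receive the VM, in sorted order."""
    return sorted(name for name, features in hosts.items()
                  if can_migrate(guest_features, features))


# Hypothetical fleet: host names and feature sets are made up.
hosts = {
    "host-a": {"sse4_2", "avx2", "avx512f"},
    "host-b": {"sse4_2"},
}
guest = {"sse4_2", "avx2"}
print(feasible_destinations(guest, hosts))
print(missing_features(guest, hosts["host-b"]))
```

Running such a check before migration lets an operator rule out a destination without moving any VM state.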

Participants: Kenta Ishiguro, Fonyuy-Asheri‌ Caleb, Eloua Barraud, David Bromberg,‌ Renaud Lachaize, Alain Tchana.

8.4 Understanding‌ Intel User Interrupts

Yves Koné's work on Intel‌ User Interrupts provides deep insights into this emerging‌ hardware feature, which enables user-space applications to receive‌ hardware interrupts without kernel intervention. Through comprehensive performance‌ characterization and analysis, the research demonstrates both the‌ opportunities and limitations of this new mechanism. This‌ work was accepted at SIGMETRICS 2025 and contributes‌ to understanding how modern hardware features can be‌ leveraged to improve application performance at the microsecond scale.

Participants: Yves Kone‌, Louis Duval,‌ Pascal Felber, Daniel‌‌ Hagimont, Renaud Lachaize, Alain Tchana.‌

8.5 System Call Identification‌ for Security

The team‌‌ developed B-Side, a tool that enables binary-level‌ static identification of system‌ calls in compiled applications.‌‌ B-Side provides a foundation for security monitoring and‌ analysis tools that work‌ without requiring source code‌‌ access, addressing a critical need in security auditing‌ of closed-source software and‌ legacy systems. This work‌‌ was published at Middleware 2025.

Participants: Gaspard Thévenon‌, Kevin Nguetchouang,‌ Kahina Lazri, Pierre‌‌ Olivier, Alain Tchana.

8.6 SIMBox Fraud‌ Detection

Josiane Kouam's work‌ on detecting SIMBox fraud‌‌ through latency anomalies (SigN) demonstrates how‌ system-level monitoring can address‌ real-world security challenges in‌‌ telecommunications. SIMBox fraud, where international calls are illegally‌ routed through mobile networks‌ to avoid charges, costs‌‌ telecom operators billions of dollars annually. The SigN‌ system leverages subtle timing‌ differences in call routing‌‌ to identify fraudulent traffic patterns without requiring deep‌ packet inspection or customer‌ data access, making it‌‌ both privacy-preserving and GDPR-compliant. This research was accepted‌ at ASIACCS 2025 and‌ is being deployed in‌‌ collaboration with telecom operators in Africa to combat‌ fraud while respecting user‌ privacy.

Participants: Josiane Kouam‌‌, Aline Carneiro, Philippe Martins, Cédric‌ Adjih, Alain Tchana‌.

8.7 P4CEMaker

Paul Breuil worked on P4CEMaker, a novel system designed to semi-automatically accelerate existing RDMA-based consensus protocols using a programmable switch. Central to the design of P4CEMaker is the insight that, despite the diversity of algorithmic approaches used by consensus protocols (e.g., fault detection and leader election), they rely on a common set of networking operations such as scattering and gathering values, which can be offloaded to programmable switches. P4CEMaker consists of two components: a dynamic analysis tool that automatically detects these network operations and provides developers with precise call-graph information showing where and how they are executed in the code, and a versatile hardware acceleration library that enables these operations to run in hardware with minimal code changes. Paul used P4CEMaker to accelerate four different consensus protocols, achieving up to a 2× performance improvement with roughly one day of work per protocol. P4CEMaker was published at ICDCS 2025.
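To make the "scatter and gather" pattern concrete, here is a minimal software-only sketch in Python: a leader sends a value to every replica and then checks whether a strict majority acknowledged it. This is a generic illustration of the pattern, not P4CEMaker's library or its switch offload; the names `Replica`, `scatter`, `gather_quorum`, and `replicate` are hypothetical.

```python
from typing import Callable, List

# A replica is modelled as a function that receives the proposed value
# and returns True if it acknowledges it.
Replica = Callable[[int], bool]


def scatter(value: int, replicas: List[Replica]) -> List[bool]:
    """Send the same value to every replica and collect the raw replies."""
    return [replica(value) for replica in replicas]


def gather_quorum(replies: List[bool]) -> bool:
    """Return True if a strict majority of the replicas acknowledged."""
    return sum(replies) > len(replies) // 2


def replicate(value: int, replicas: List[Replica]) -> bool:
    """One scatter/gather round, as used by many consensus protocols."""
    return gather_quorum(scatter(value, replicas))
```

In this model the leader performs both steps in software; the point made in the text is that these two steps recur across protocols, which is what makes them candidates for offloading.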

Participants: Jakob Nibler, Thomas‌ Ropars.

8.8 Pre-Stores‌

William Wu worked on improving the performance of CPU caches when they are used to cache memories other than regular DRAM. These scenarios are becoming common (persistent memory, remote memory accessed via CXL, etc.). William introduced the notion of software pre-storing, the converse of software prefetching. With software prefetching, instructions are inserted in the code to asynchronously move data up in the memory hierarchy. With software pre-storing, instructions are inserted to direct the CPU to asynchronously move data down in the memory hierarchy. Pre-storing can be implemented by using existing processor instructions. Software pre-storing provides performance benefits for write-heavy applications on emerging architectures.
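The intuition can be illustrated with a toy model. The Python sketch below simulates a small LRU write-back cache and counts how many dirty lines must be written back at eviction time, with and without an early write-back after each store. It is a deliberately simplified model built on assumptions (evicting a dirty line stalls the program, a pre-stored line stays cached but clean); it does not reproduce the actual processor instructions or the evaluation from the paper, and the class and function names are hypothetical.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy LRU write-back cache that counts write-backs on the critical path."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> dirty flag, in LRU order
        self.sync_writebacks = 0    # dirty evictions the program waits for
        self.async_writebacks = 0   # early write-backs triggered by pre-stores

    def store(self, addr: int) -> None:
        """Write to a line, evicting the least recently used line if needed."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
        elif len(self.lines) >= self.capacity:
            _, dirty = self.lines.popitem(last=False)
            if dirty:
                self.sync_writebacks += 1
        self.lines[addr] = True

    def prestore(self, addr: int) -> None:
        """Write a dirty line back early; it stays cached but becomes clean."""
        if self.lines.get(addr):
            self.lines[addr] = False
            self.async_writebacks += 1


def run(n_writes: int, capacity: int, use_prestore: bool) -> WriteBackCache:
    """Stream writes over distinct addresses, optionally pre-storing each one."""
    cache = WriteBackCache(capacity)
    for addr in range(n_writes):
        cache.store(addr)
        if use_prestore:
            cache.prestore(addr)
    return cache
```

In this model, a streaming write workload without pre-stores pays one write-back per eviction, whereas with pre-stores every eviction finds a clean line and the write-backs have already been issued off the critical path.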

William identified application scenarios in which software pre-storing is beneficial, and developed a tool, DirtBuster, that identifies applications and code regions that can benefit from pre-storing. He evaluated the concept of software pre-storing and the DirtBuster tool on two CPU architectures (ARM and x86) and two types of cacheable memories (PMEM and cache-coherent DRAM accessed through an FPGA). He demonstrated performance improvements of up to 2.3x for key-value stores, HPC applications, message passing, and TensorFlow. The work was published at EuroSys'25.

Participants: Xiaoxiang Wu, Baptiste Lepers,‌ Willy Zwaenepoel.

8.9 Carbon Footprint of Storage‌ in Datacenters

The team works on the analysis of the carbon footprint of storage in the cloud. During the year, our work has focused on studying the impact of the storage technology (HDD vs SSD) on the trade-off between performance and carbon footprint, considering the case of key-value stores. The work of Jakob Nibler has demonstrated that, for this type of database, commonly used in datacenters, there can be situations where using HDDs instead of SSDs is better from a carbon footprint point of view. This is especially true if the applications are unable to take full advantage of the high performance of SSD devices, and if the energy powering the datacenters has a low carbon intensity. These results open new research directions for reducing the environmental impact of Cloud infrastructures.
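The role of carbon intensity can be seen with a simple accounting model in which the total footprint of a device is its embodied emissions plus the energy it consumes multiplied by the carbon intensity of the electricity. The Python sketch below applies this model to two devices. The model is a common simplification and an assumption here, not the methodology of the paper, and every number in the example is invented for illustration.

```python
def carbon_footprint_kg(embodied_kg: float, energy_kwh: float,
                        intensity_kg_per_kwh: float) -> float:
    """Total footprint = embodied emissions + operational emissions."""
    return embodied_kg + energy_kwh * intensity_kg_per_kwh


def break_even_intensity(embodied_a: float, energy_a: float,
                         embodied_b: float, energy_b: float) -> float:
    """Carbon intensity at which devices A and B have the same footprint."""
    if energy_a == energy_b:
        raise ValueError("identical energy use: no single break-even point")
    return (embodied_b - embodied_a) / (energy_a - energy_b)


# Invented values: device A has low embodied emissions but uses more
# energy for the workload; device B is the opposite.
A = {"embodied_kg": 20.0, "energy_kwh": 300.0}
B = {"embodied_kg": 80.0, "energy_kwh": 100.0}

for intensity in (0.05, 0.5):
    fa = carbon_footprint_kg(A["embodied_kg"], A["energy_kwh"], intensity)
    fb = carbon_footprint_kg(B["embodied_kg"], B["energy_kwh"], intensity)
    print(intensity, "A is better" if fa < fb else "B is better")
```

With these invented values, the device with lower embodied emissions wins when electricity is low-carbon, and the more energy-efficient device wins when electricity is carbon-intensive, which is the shape of the trade-off described above.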

Participants: Jakob Nibler, Thomas‌ Ropars.

8.9.1 IBARA - Portable Micro-Cluster for‌ teaching

IBARA is a portable and autonomous micro-cluster‌ specifically designed for teaching, cloud and big data‌ research, and hosting services in African countries. This‌ innovative platform addresses a critical challenge: providing reliable‌ computing infrastructure in environments with unstable or unavailable‌ electrical power. IBARA integrates a set of high-performance‌ micro-computers within a compact, easily transportable suitcase, making‌ it an ideal solution for rapid deployment of‌ computing capabilities in isolated locations or areas with‌ severe electricity deficits.

The platform's major strength lies‌ in its hybrid power system that guarantees uninterrupted‌ operation under varying power conditions. When standard electrical‌ current is available, IBARA operates on mains power‌ like conventional computing infrastructure. However, when power outages‌ occur—a frequent occurrence in many African regions—the system‌ immediately switches to a storage battery (automotive-type battery)‌ ensuring continuous operation without data loss or service‌ interruption. To maintain long-term autonomy, the battery is‌ kept charged through a small foldable solar panel‌ that can be integrated into the suitcase or‌ connected externally, enabling completely off-grid operation in sunny‌ conditions typical of many African deployments.

Participants: Blandine‌ Ntchoutta, Alain Tchana.

9 Bilateral contracts‌ and grants with industry

Vates. KrakOS maintains‌ a strong partnership with Vates, a leading‌ French virtualization company. Donald Onana started his CIFRE‌ PhD thesis in November 2025, focusing on VM‌ observability. The collaboration extends through the MIAI Industrial‌ Chair (under review) on confidential computing and AI-based‌ health workloads, combining expertise in secure virtualization with machine learning applications in‌ healthcare. A potential CIFRE‌ thesis for Louis Duval‌‌ is under discussion. This partnership is further strengthened‌ through the ANR YUPIM‌ project, which brings‌‌ together academic and industrial expertise in Processing-in-Memory virtualization.‌
Orange Labs. The‌ team collaborates with Orange‌‌ Labs through two CIFRE PhD theses. Dufy Teguia‌ focuses on virtual machine‌ introspection for security and‌‌ monitoring, while Eric Okala works on machine failure‌ detection and recovery mechanisms‌ in cloud environments. These‌‌ collaborations are integrated within the ANR SecondChance project‌ and the ANR SCALER‌ project.
Huawei.‌‌ KrakOS established a CIFRE partnership with Huawei Technologies‌, supporting Benjamin Priour's‌ PhD research on AI‌‌ workload optimization.

10 Partnerships and cooperations

10.1 International‌ Initiatives

KrakOS is involved in the Important Project of Common European Interest on Next Generation Cloud Infrastructure and Services (IPCEI-CIS). More specifically, KrakOS contributes to the E2CC (Eco Edge to Cloud Continuum) project.

10.1.1 Associated Teams‌

University of British Columbia‌‌ (Canada). KrakOS has submitted a proposal for‌ an Inria Associated Team‌ with Mohammad Shahrad at‌‌ the University of British Columbia, focusing on‌ responsible cloud computing with‌ emphasis on energy efficiency,‌‌ carbon-aware scheduling, and sustainable datacenter operations. The proposal‌ is currently under review.‌
The Cameroon (ENSPY, University‌‌ of Yaoundé). An Inria Associated Team with‌ Thomas Bouetou at ENSPY‌ and University of Yaoundé‌‌ 1 has been accepted, centered on frugal AI‌ research and capacity building‌ in resource-constrained environments. This‌‌ partnership strengthens long-term collaboration with Cameroonian institutions and‌ supports the IAoundé project‌ objectives.
The University of Sydney (Australia). The team established an Inria Associated Team with Vincent Gramoli at the University of Sydney, focusing on blockchain systems, distributed consensus protocols, and high-performance distributed ledger technologies. This collaboration includes co-supervision of PhD student Paul Breuil and facilitates student mobility between France and Australia. In addition, Xiaoxiang Wu and Yuben Yang, two PhD students of Baptiste Lepers (hired before Baptiste joined the team) for six months.

10.1.2 Other‌ International Collaborations

The IAoundé project, funded by the PAI AURA, establishes a formal collaboration with Cameroonian institutions including University of Yaoundé 1 and ENSPY, promoting frugal computing research. Seven researchers from Cameroon visited us during the year and we made eight visits to Cameroon.
KrakOS collaborates with‌ Pierre Olivier at the‌ University of Manchester, UK‌‌, on operating systems and virtualization research, including‌ co-supervision of PhD students‌ and joint publications on‌‌ systems security.
The team partners with Pascal Felber‌ at the University of‌ Neuchâtel, Switzerland, on‌‌ leveraging modern hardware features for improving performance.
Collaboration‌ with Natacha Crooks at‌ UC Berkeley, USA,‌‌ is supported by the France Berkeley Fund, focusing‌ on building a uniform‌ framework for memory management‌‌ and thread scheduling.
Adam Belay at MIT hosted‌ Alain Tchana for research‌ discussions on operating systems‌‌ for microsecond-scale computing and datacenter efficiency.
The team collaborates with Timothy Roscoe at ETH Zurich, Switzerland, with multiple research visits by Alain Tchana, Baptiste Lepers, and Maxime Collette, exploring systems architecture and hardware-software co-design.
KrakOS collaborates with the team of Fumio Machida (University of Tsukuba) on the modeling of performance anomalies in microservices applications. Gabriel Antunes Grabher visited the team for 3 months (April-June 2025) thanks to a UGA Idex formation grant.

10.2 National‌ Initiatives

10.2.1 ANR Projects

The ANR PRCE YUPIM‌ project, led by Principal Investigator Alain Tchana,‌ is currently ongoing and focuses on advancing Processing-in-Memory‌ virtualization technologies for next-generation cloud infrastructures.
ANR PRME‌ KNext, under the leadership of Baptiste Lepers,‌ was submitted to develop next-generation kernel architectures that‌ leverage emerging hardware features and new concurrency abstractions‌ for improved performance and security.
The ANR PRC‌ XRay project, with Nicolas Palix as Scientific‌ Responsible, was submitted to develop advanced static analysis‌ tools for Linux kernel code, improving security and‌ reliability through automated verification techniques.

10.2.2 PEPR Projects‌

KrakOS is actively involved in the PEPR Cloud‌, contributing to three projects: DIVA, STEEL, and‌ TARANIS.

10.2.3 Inria Challenges (Défis)

The Défi Inria‌ OS (Operating Systems Challenge) provided substantial support to‌ KrakOS, funding three PhD theses, one postdoctoral position,‌ and two six-month engineering positions, enabling the team‌ to pursue ambitious research directions in modern operating‌ systems design.

10.2.4 Regional Projects

KrakOS received funding‌ from Région Auvergne-Rhône-Alpes for the three-year IAoundé project‌ focused on frugal AI research, strengthening partnerships with‌ African institutions and promoting sustainable computing practices in‌ resource-constrained environments. We have also received funding from‌ the UGA International Research Booster for the same‌ collaboration.
Two LIG Émergence projects were accepted.
KrakOS‌ received funding from LabEx Persyval-Lab for research on‌ virtualization of UPMEM Processing-in-Memory (PIM) technology, advancing the‌ integration of emerging memory-centric computing architectures into cloud‌ environments.

10.3 Collaboration with Other Research Teams

KrakOS‌ collaborates with the WIDE team at Inria Rennes‌ through the co-supervision of one PhD student (Fonyuy-Asheri‌ Caleb), focusing on VM live migration on heterogeneous‌ processors. While located in Rennes, Fonyuy-Asheri Caleb visits‌ KrakOS every two months for at least one‌ week.
The team works with the STACK team at Inria on building a carbon-aware FaaS framework, co-supervising two master's interns.
Collaboration with the‌ Whisper team at Inria involves the co-supervision of‌ three PhD students working on memory management, semantic‌ gap, and bug finding in Linux.
KrakOS partners with the SEPIA team at IRIT (Toulouse) to co-supervise three PhD students on topics related to memory management in virtualized systems, security, and I/O improvement.
The team collaborates closely with AGEIS Lab‌ at UGA on GDPR compliance and data protection‌ research, co-supervising three master's interns and two postdoctoral‌ researchers.

10.4 Conference and Workshop Organization

The team‌ co-organized the Xen Project Winter Meetup on January‌ 30-31, 2025 in Grenoble, bringing together international contributors‌ and users of the Xen hypervisor.
KrakOS organized‌ the Workshop Inria Défi OS on December 13,‌ 2024, in Grenoble, facilitating collaboration and knowledge exchange among French research teams‌ working on operating systems.‌
The team organized the‌‌ IAoundé Conference with events in June 2025 in‌ Grenoble and August 2025‌ in Cameroon, promoting frugal‌‌ AI and systems research in partnership with African‌ institutions.
KrakOS organized the‌ Workshop VMPSec (Virtualization, Migration,‌‌ Performance and Security) in June 2025 in Grenoble,‌ addressing critical challenges in‌ modern virtualization technologies.
The‌‌ team supported the creation of a new Nuit‌ de l'Info 2025 site‌ in Cameroon for students‌‌ participating in the IAoundé project, extending this popular‌ French student programming competition‌ to Africa.

11 Dissemination‌‌

11.1 Invited Talks

Alain Tchana gave invited talks at multiple prestigious venues including GT SSLR (GDR Sécurité) in Paris, ETH Zurich, UBC, the Seine AI workshop organized by Huawei, the 128-bit RISC-V European workshop at HiPEAC Barcelona, and the midi de la recherche at ENSIMAG.
Baptiste Lepers delivered invited‌‌ talks at ETH Zurich, gave a keynote at‌ JSI Inria, and presented‌ at the PizzaTalk series‌‌ at LIG.
PhD students presented their accepted papers‌ at major international conferences‌ including Middleware 2024, APSys‌‌ 2025, EuroSys 2025, NSDI 2025, SIGMETRICS 2025, and‌ ASIACCS 2025.
The team‌ organized a tutorial on‌‌ virtualization at ComPAS 2025, sharing expertise and best‌ practices with the French-speaking‌ systems research community.

11.2‌‌ Scientific Expertise

Nicolas Palix serves as mission leader‌ for "Action Monitoring" within‌ GDRS Écoinfo and contributes‌‌ to AFNOR SPEC 2314 on Frugal AI standardization.‌
Renaud Lachaize is a‌ member of the MIAIA‌‌ Cluster selection committee.
Fabienne Boyer serves as the‌ representative of LIG at‌ the MACI scientific council.‌‌
Alain Tchana served as external reviewer for ERC‌ Advanced Grants 2026.
Team‌ members serve on program‌‌ committees of several international conferences including EuroSys, SOSP,‌ ASPLOS, Middleware, NSDI, CCGrid,‌ IC2E and NCA.

11.3‌‌ Research Administration

Alain Tchana served as a member of CoNRS (Comité National de la Recherche Scientifique) and serves on the ACM DEI Council (Diversity, Equity, and Inclusion).
Renaud Lachaize‌ is a member of‌‌ the SIGOPS ASF staff.
Noël De Palma serves‌ as head of LIG‌ (Laboratoire d'Informatique de Grenoble).‌‌
Thomas Ropars is a member of the GDR RSD board (in charge of the relations between the GDR and the conferences and schools).

11.4‌ Teaching - Supervision -‌ Juries

11.4.1 Teaching

All‌‌ permanent team members are faculty members (enseignants-chercheurs) with‌ teaching responsibilities at Université‌ Grenoble Alpes and Grenoble‌‌ INP.

To broaden the recruitment sphere, team members‌ also teach at institutions‌ beyond Grenoble, including ENS‌‌ de Lyon, attracting talented students from diverse academic‌ backgrounds to systems research.‌

In addition to their‌‌ national teaching responsibilities, team members regularly conduct teaching‌ missions abroad, particularly at‌ partner universities in Cameroon‌‌ such as University of Yaoundé 1 and ENSPY,‌ where they deliver courses‌ on operating systems, virtualization,‌‌ and distributed systems.

11.4.2 Supervision

The team supervised‌ 18 PhD students in‌ 2024, 3 postdocs, and‌‌ more than 20 interns. KrakOS maintains a very‌ open internship policy, recognizing‌ that internships are an‌‌ essential pathway for engaging‌ students in systems research and cultivating the next‌ generation of researchers in operating systems and distributed‌ computing. The team also runs a mentoring program‌ for students in Cameroon, providing guidance and support‌ to students at partner universities such as University‌ of Yaoundé 1 and ENSPY, helping them develop‌ research skills and pursue advanced studies in computer‌ systems.

12 Scientific production

12.1 Publications of the‌ year

International journals

1 Yves Kone, Louis Duval, Renaud Lachaize, Pascal Felber, Daniel Hagimont and Alain Tchana. Understanding Intel User Interrupts. Proceedings of the ACM on Measurement and Analysis of Computing Systems 9(2), June 2025, 1-32.

International peer-reviewed conferences‌

2 Paul Breuil and Baptiste Lepers. P4CEMaker: automated hardware acceleration of consensus protocols. ICDCS 2025 - IEEE 45th International Conference on Distributed Computing Systems, Glasgow, United Kingdom, IEEE, July 2025, 681-691.
3 Brice Ekane, Djob Mvondo, Renaud Lachaize, Yérom-David Bromberg, Alain Tchana and Daniel Hagimont. DISC: Backpressure Mitigation In Multi-tier Applications With Distributed Shared Connection. Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI 25), Philadelphia (Pennsylvania), United States, April 2025, 55-70.
4 Kenta Ishiguro, Kohei Hayama, Ayase Yokoyama, Hiroshi Yamada and Toshio Hirotsu. Nesting Overlay File Systems with ShadowWhiteout. APSys '25: 16th ACM SIGOPS Asia-Pacific Workshop on Systems, Seoul, South Korea, ACM, October 2025, 15-21.
5 Gabriel Job Antunes Grabher, Fumio Machida and Thomas Ropars. Modeling Anomaly Detection in Cloud Services: Analysis of the Properties that Impact Latency and Resource Consumption. Proceedings of the IEEE/ACM 18th International Conference on Utility and Cloud Computing (UCC 2025), Nantes, France, 2025, 1-10.
6 Yves Kone, Louis Duval, Renaud Lachaize, Pascal Felber, Daniel Hagimont and Alain Tchana. Understanding Intel User Interrupts. Abstracts of the 2025 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2025) 53(1), Stony Brook, United States, ACM, June 2025, 127-129.
7 Jakob Nibler and Thomas Ropars. Carbon Footprint of Storage in Data Centers: the Impact of Using SSDs for Key-Value Stores. 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Proceedings, Tromsø, Norway, IEEE, May 2025, 305-314.
8 Alain Tchana, Jordan Gounou, Papa Assane Fall, Renaud Lachaize, Hippolyte Tapamo, Celestin Parfait Bessala Bessala and Vivien Quema. Crazy: Unifying Memory and CPU Management Subsystems. APSys '25: 16th ACM SIGOPS Asia-Pacific Workshop on Systems, Seoul, South Korea, ACM, October 2025, 1-7.
9 Xiaoxiang Wu, Baptiste Lepers and Willy Zwaenepoel. Pre-Stores: Proactive Software-guided Movement of Data Down the Memory Hierarchy. EuroSys 2025 - Twentieth European Conference on Computer Systems, Rotterdam, Netherlands, ACM, March 2025, 1161-1176. URL: https://dl.acm.org/doi/pdf/10.1145/3689031.3696097

Reports & preprints‌‌

10 Ilhem Fajjari, Yannick Nzali Koagne, Joaquim Soares and Vania Marangozova. D 4.2-1 Use Case Prototypes. Université Grenoble Alpes; Orange Innovation; Orange Direction EOLAS, March 2025.

12.2 Cited publications

11 Pradeep Ambati, Íñigo Goiri, Felipe Frujeri, Alper Gun, Ke Wang, Brian Dolan, Brian Corell, Sekhar Pasupuleti, Thomas Moscibroda, Sameh Elnikety, Marcus Fontoura and Ricardo Bianchini. Providing SLOs for resource-harvesting VMs in cloud platforms. Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation (OSDI'20), USA, USENIX Association, 2020.
12 Jack Tigar Humphries, Neel Natu, Ashwin Chaugule, Ofir Weisse, Barret Rhoden, Josh Don, Luigi Rizzo, Oleg Rombakh, Paul Turner and Christos Kozyrakis. ghOSt: Fast & Flexible User-Space Delegation of Linux Scheduling. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP '21), Virtual Event, Germany, Association for Computing Machinery, New York, NY, USA, 2021, 588–604. URL: https://doi.org/10.1145/3477132.3483542
13 Jing Liu, Anthony Rebello, Yifan Dai, Chenhao Ye, Sudarsun Kannan, Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau. Scale and Performance in a Filesystem Semi-Microkernel. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP '21), Virtual Event, Germany, Association for Computing Machinery, New York, NY, USA, 2021, 819–835. URL: https://doi.org/10.1145/3477132.3483581
14 Michael Marty, Marc de Kruijf, Jacob Adriaens, Christopher Alfeld, Sean Bauer, Carlo Contavalli, Michael Dalton, Nandita Dukkipati, William C. Evans, Steve Gribble, Nicholas Kidd, Roman Kononov, Gautam Kumar, Carl Mauer, Emily Musick, Lena Olson, Erik Rubow, Michael Ryan, Kevin Springborn, Paul Turner, Valas Valancius, Xi Wang and Amin Vahdat. Snap: a microkernel approach to host networking. Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP '19), Huntsville, Ontario, Canada, Association for Computing Machinery, New York, NY, USA, 2019, 399–413. URL: https://doi.org/10.1145/3341301.3359657
15 Alain Tchana, Dorian Goepp, Stella Bitchebe and Renaud Lachaize. xOS: The End Of The Process-Thread Duo Reign. Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys '23), Seoul, Republic of Korea, Association for Computing Machinery, New York, NY, USA, 2023, 1–8. URL: https://doi.org/10.1145/3609510.3609817
16 Dufy Teguia, Jiaxuan Chen, Stella Bitchebe, Oana Balmau and Alain Tchana. vPIM: Processing-in-Memory Virtualization. Proceedings of the 25th International Middleware Conference (Middleware '24), Hong Kong, Hong Kong, Association for Computing Machinery, New York, NY, USA, 2024, 417–430. URL: https://doi.org/10.1145/3652892.3700782
17 Siqi Zhao, Xuhua Ding, Wen Xu and Dawu Gu. Seeing Through The Same Lens: Introspecting Guest Address Space At Native Speed. Security Symposium (USENIX Sec'17), USENIX, 2017.
