CASH - 2023

2023 Activity report, Project-Team CASH

RNSR: 201822804N
  • Research center Inria Lyon Centre
  • In partnership with: Université Claude Bernard (Lyon 1), Ecole normale supérieure de Lyon, CNRS
  • Team name: Compilation and Analyses for Software and Hardware
  • In collaboration with: Laboratoire de l'Informatique du Parallélisme (LIP)
  • Domain: Algorithmics, Programming, Software and Architecture
  • Theme: Architecture, Languages and Compilation


Computer Science and Digital Science

  • A2.1. Programming Languages
  • A2.1.1. Semantics of programming languages
  • A2.1.2. Imperative programming
  • A2.1.4. Functional programming
  • A2.1.6. Concurrent programming
  • A2.1.7. Distributed programming
  • A2.1.10. Domain-specific languages
  • A2.1.11. Proof languages
  • A2.2. Compilation
  • A2.2.1. Static analysis
  • A2.2.2. Memory models
  • A2.2.3. Memory management
  • A2.2.4. Parallel architectures
  • A2.2.5. Run-time systems
  • A2.2.6. GPGPU, FPGA...
  • A2.2.8. Code generation
  • A2.3.1. Embedded systems
  • A2.4. Formal method for verification, reliability, certification
  • A2.4.1. Analysis
  • A2.4.2. Model-checking
  • A2.4.3. Proofs
  • A2.5.3. Empirical Software Engineering
  • A2.5.4. Software Maintenance & Evolution
  • A7.2.1. Decision procedures
  • A7.2.3. Interactive Theorem Proving

1 Team members, visitors, external collaborators

Research Scientists

  • Christophe Alias [INRIA, Researcher, HDR]
  • Ludovic Henrio [CNRS, Researcher, HDR]
  • Gabriel Radanne [INRIA, Senior Researcher]
  • Yannick Zakowski [INRIA, Researcher]

Faculty Member

  • Matthieu Moy [Team leader, UNIV LYON I, Associate Professor, HDR]

Post-Doctoral Fellows

  • Emmanuel Arrighi [ENS DE LYON, from Sep 2023, ATER]
  • Bruno Ferres [INRIA, Post-Doctoral Fellow, until Aug 2023]

PhD Students

  • Thaïs Baudon [ENS DE LYON]
  • Nicolas Chappe [ENS DE LYON]
  • Julien Emmanuel [ENS DE LYON, until Jan 2023]
  • Amaury Maille [UNIV LYON I, ATER, until Aug 2023]
  • Oussama Oulkaid [ANIAH, CIFRE]
  • Alec Sadler [INRIA, from Sep 2023]
  • Hugo Thievenaz [INRIA]

Interns and Apprentices

  • Mohamed-Aymane Akil [INRIA, Intern, from May 2023 until Aug 2023]
  • David Gozlan [INRIA, Intern, from May 2023 until Aug 2023]
  • Galaad Langlois [ENS DE LYON, Intern, from Feb 2023 until Jul 2023]
  • Nicolas Nardino [ENS de Lyon, Intern, from Feb 2023 until Jul 2023]

Administrative Assistant

  • Elise Denoyelle [INRIA, from Dec 2023]

External Collaborator

  • Laure Gonnord [GRENOBLE INP, HDR]

2 Overall objectives

Research objectives.

The overall objective of the CASH team is to take advantage of the characteristics of the specific hardware (generic hardware, hardware accelerators, or reconfigurable chips) to compile energy-efficient software and hardware. To reach this goal, the CASH team provides efficient analyses and optimizing compilation frameworks for dataflow programming models. These contributions are combined with two other research directions. First, research on the foundations of programming languages and program analysis provides a theoretical basis for our work. Second, parallel and scalable simulation of hardware systems, combined with high-level synthesis tools, results in an end-to-end workflow for circuit design.

The scientific focus of CASH is on compute kernels and assemblies of kernels, and the first goal is to improve their efficient compilation. However, the team also works in collaboration with application developers, to better understand the overall needs in HPC and to design optimizations that are effective in the context of the targeted applications. We consider three scales of programs: small computation kernels (tens of lines of code) that can be analyzed and optimized aggressively, medium-size kernels (hundreds of lines of code) that require modular analysis, and assemblies of compute kernels (either as classical imperative programs or written directly in a dataflow language).

Our objective is to allow developers to design their own kernels, and benefit from good performance in terms of speed and energy efficiency without having to deal with fine-grained optimizations by hand. Consequently, our objective is first to improve the performance and energy consumption for HPC applications, while providing programming tools that can be used by developers and are at a convenient level of abstraction.

Obviously, large applications are not limited to assemblies of compute kernels. Our language and formalism definitions and our analyses must also be able to deal with general programs. Our targets therefore include general-purpose programs with complex behaviors, such as recursive programs operating on arrays, lists and trees, and worklist algorithms (we often use the polyhedral model, a powerful theory to optimize loop nests, but it does not support data structures such as lists). Analyses of these programs should be able to detect illegal memory accesses, estimate memory consumption, identify hotspots, etc., and to prove functional properties.

Our Approach and methodology.

We target a balance between theory and practice: problems extracted from industrial requirements often yield theoretical problems.

On the practical side, the CASH team targets applied research, in the sense that most research topics are driven by actual needs, discussed either through industrial partnership or extracted from available benchmarks.

The theoretical aspects ensure the coherency and the correctness of our approach. We rely on a precise definition of the manipulated languages and their semantics. The formalization of the different representations of the programs and of the analyses allows us to show that these different tasks are performed with the same understanding of the program semantics.

Our approach is to cross-fertilize between several communities. For example, the abstract interpretation community provides a sound theoretical framework and very powerful analysis, but these are rarely applied in the context of optimizing compilation. Similarly, the hardware simulation community usually considers compilers as black-boxes and does not interact with researchers in compilation.

While a global approach links CASH activities and members, we do not plan to have a single unified toolchain where all contributions would be implemented. For example, contributions in the domain of static analysis of sequential programs may be implemented in the LLVM tool, while results on dataflow models are applied both in the SigmaC compiler and in the DCC HLS tool. This also implies that different activities of CASH target different application domains and potential end-users.

Research directions.

The main objectives of the CASH team are to provide scalable and expressive static analyses and optimizing parallel compilers. These directions rely on programming languages and representations of programs in which parallelism and dataflow play a crucial role. A central research direction aims at the study of parallelism and dataflow aspects in programming languages, both from a practical perspective (syntax or structure), and from a theoretical point of view (semantics). The CASH team also has simulation activities that are applied both internally in CASH, to simulate intermediate representations, and to embedded systems.

3 Research program

3.1 Research direction 1: Parallel and Dataflow Programming Models

In the last decades, several frameworks have emerged to design efficient compiler algorithms. The efficiency of all the optimizations performed in compilers strongly relies on effective static analyses and intermediate representations. Dataflow models are a natural intermediate representation for hardware compilers (HLS) and more generally for parallelizing compilers. Indeed, dataflow models capture task-level parallelism and can be mapped naturally to parallel architectures. In a way, a dataflow model is a partition of the computation into processes and a partition of the flow dependences into channels. This partitioning prepares resource allocation (which processor/hardware to use) and medium-grain communications.
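As a minimal illustration of this partitioning, the following Python sketch splits a small computation into three processes connected by bounded FIFO channels. The names (`producer`, `scale`, `consumer`, `run_network`) and the end-of-stream marker are assumptions made for the example; they do not correspond to any CASH tool or intermediate representation.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker, not part of any standard


def producer(out_ch, n):
    """Process 1: emit the integers 0..n-1 on its output channel."""
    for i in range(n):
        out_ch.put(i)
    out_ch.put(END)


def scale(in_ch, out_ch, factor):
    """Process 2: read a token, multiply it by `factor`, forward it."""
    while True:
        tok = in_ch.get()  # blocking read on the input channel
        if tok is END:
            out_ch.put(END)
            return
        out_ch.put(tok * factor)


def consumer(in_ch, result):
    """Process 3: accumulate the sum of all received tokens."""
    total = 0
    while True:
        tok = in_ch.get()
        if tok is END:
            result.append(total)
            return
        total += tok


def run_network(n, factor):
    # Each channel carries one flow dependence between two processes.
    a = queue.Queue(maxsize=4)
    b = queue.Queue(maxsize=4)
    result = []
    procs = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=consumer, args=(b, result)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return result[0]
```

Here the three processes could be mapped to three computing units, and the channel capacity (4 in this sketch) is the kind of parameter that resource allocation has to fix.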

The main goal of the CASH team is to provide efficient analyses and optimizing compilation frameworks for dataflow programming models. The results of the team rely on programming languages and representations of programs in which parallelism and dataflow play a crucial role. This first research direction aims at defining these dataflow languages and intermediate representations, both from a practical perspective (syntax or structure), and from a theoretical point of view (semantics). This first research direction thus defines the models on which the other directions will rely. It is important to note that we do not restrict ourselves to a strict definition of dataflow languages: more generally, we are interested in the parallel languages in which dataflow synchronization plays a significant role.

Intermediate dataflow model. The intermediate dataflow model is a representation of the program that is adapted for optimization and scheduling. It is obtained from the analysis of a (parallel or sequential) program and should at some point be used for compilation. The dataflow model must specify precisely its semantics and parallelism granularity. It must also be analyzable with polyhedral techniques, where powerful concepts exist to design compiler analysis, e.g., scheduling or resource allocation. Polyhedral Process Networks 63 extended with a module system could be a good starting point. But then, how to fit non-polyhedral parts of the program? A solution is to hide non-polyhedral parts into processes with a proper polyhedral abstraction. This organization between polyhedral and non-polyhedral processes will be a key aspect of our medium-grain dataflow model. The design of our intermediate dataflow model and the precise definition of its semantics will constitute a reliable basis to formally define and ensure the correctness of algorithms proposed by CASH: compilation, optimizations and analyses.

Dataflow programming languages. The dataflow paradigm has also been explored quite intensively in programming languages. Indeed, there exists a large panel of dataflow languages, whose characteristics differ notably, the major point of variability being the scheduling of agents and their communications. There is indeed a continuum from synchronous dataflow languages like Lustre 45 or Streamit 59, where the scheduling is fully static, to general communicating networks like KPNs 47 or RVC-Cal 29, where a dedicated runtime is responsible for scheduling tasks dynamically, when they can be executed. These languages share some similarities with actor languages that go even further in the decoupling of processes by considering them as independent reactive entities. Another objective of the CASH team is to study dataflow programming languages, their semantics, their expressiveness, and their compilation. The specificity of the CASH team is that these languages will be designed taking into consideration the compilation using polyhedral techniques. In particular, we will explore which dataflow constructs are better adapted for our static analysis, compilation, and scheduling techniques. In practice, we want to propose high-level primitives to express data dependencies, so that the programmer can express parallelism in a dataflow way instead of through the classical communication-oriented dependencies. This higher-level, more declarative point of view makes programming easier but also gives more optimization opportunities. These primitives will be inspired by the existing works in the polyhedral model framework, as well as dataflow languages, but also in the actors and active object languages 37 that nowadays introduce more and more dataflow primitives to enable data-driven interactions between agents, particularly with futures 34, 42.

Formal semantics

Proving the correctness of an analysis or of a program transformation requires a formal semantics of the language considered. Depending on the context, our formalizations may take the form of paper definitions, or of a mechanization inside of a proof assistant. While more time consuming, the latter may ensure in the adequate context some additional trust in the proofs established, as well as a tighter connection to an executable artifact. We have been recently studying in particular the formalization of concurrent and parallel paradigms, under weak memory models notably, by building on top of the interaction tree  66 approach developed for the Coq proof assistant.

Programming models and program transformations.

So far, the programming models designed in this direction allow expressing parallelism in novel ways, but do not leverage the optimising compiler transformations introduced in direction 3. Indeed, optimising compilers only provide control over their behavior through extra-language annotations called “pragmas”. Since those annotations are outside the language, they do not benefit from abstraction and modularity, and are often brittle. We plan to provide better integration of the optimisation passes of the compiler inside the language itself through the use of meta-programming, by presenting optimisations as first-class objects which can be applied, composed and manipulated in the language. A first step of this long-term project is to investigate how to express loop transformations (developed by polyhedral model approaches) using existing meta-programming constructs.
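To make the idea of optimisations as first-class objects concrete, here is a small Python sketch, written under our own assumptions: a loop transformation is an ordinary function that maps the time stamp of an iteration to a new time stamp, so transformations can be composed and checked like any other value. The names `interchange`, `skew`, `tile`, `compose` and `is_legal` are hypothetical and do not refer to an existing meta-programming library.

```python
def identity(t):
    """Keep the original lexicographic order of the loop nest."""
    return tuple(t)


def interchange(t):
    """Swap the two outermost time dimensions."""
    return (t[1], t[0]) + tuple(t[2:])


def skew(t):
    """Skew the second dimension by the first one: (i, j) -> (i, i + j)."""
    return (t[0], t[0] + t[1]) + tuple(t[2:])


def tile(size):
    """Return a transformation that groups iterations in size x size tiles."""
    def transform(t):
        return (t[0] // size, t[1] // size) + tuple(t)
    return transform


def compose(*transforms):
    """Apply the given transformations from left to right."""
    def transform(t):
        for f in transforms:
            t = f(t)
        return t
    return transform


def execution_order(points, transform):
    """Iterations sorted by their transformed time stamps."""
    return sorted(points, key=transform)


def is_legal(points, transform, depends_on):
    """A transformation is legal if every dependence source still runs first.

    `depends_on(p)` returns the iterations that must execute before `p`.
    """
    domain = set(points)
    return all(
        transform(q) < transform(p)
        for p in points
        for q in depends_on(p)
        if q in domain
    )
```

For a 4 x 4 loop nest where iteration (i, j) depends on (i - 1, j + 1), `is_legal` rejects `interchange` and `tile(2)` but accepts `compose(skew, tile(2))`: because transformations are plain values, the combination can be built and validated inside the language rather than through annotations.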

3.1.1 Expected Impact

The impact of this research direction is both the usability of our representation for static analyses and optimizations performed in Sections 3.2 and 3.3, and the usability of its semantics to prove the correctness of these analyses.

3.1.2 Scientific Program

Medium-term activities.

We plan to extend the existing results to widen the expressiveness of our intermediate representation and design new parallelism constructs. We will also work on the semantics of dataflow languages:

  • Propose new stream programming models and a clean semantics where all kinds of parallelisms are expressed explicitly, and where all activities from code design to compilation and scheduling can be clearly expressed.
  • Identify a core language that is rich enough to be representative of the dataflow languages we are interested in, but abstract and small enough to enable formal reasoning and proofs of correctness for our analyses and optimizations.

Long-term activities.

In a longer-term vision, the work on semantics, while remaining driven by the applications, would lead to more mature results, for instance:

  • Design more expressive dataflow languages and intermediate representations which would at the same time be expressive enough to capture all the features we want for aggressive HPC optimizations, and sufficiently restrictive to be (at least partially) statically analyzable at a reasonable cost.
  • Define a module system for our medium-grain dataflow language. A program will then be divided into modules that can follow different compilation schemes and execution models but still communicate together. This will allow us to encapsulate a program that does not fit the polyhedral model into a polyhedral one and vice versa. Also, this will allow a compositional analysis and compilation, as opposed to global analysis which is limited in scalability.

3.2 Research direction 2: Expressive, Scalable and Certified Static Analyses

The design and implementation of efficient compilers becomes more difficult each day, as they need to bridge the gap between complex languages and complex architectures. Application developers use languages that bring them close to the problem that they need to solve, which explains the importance of high-level programming languages. However, high-level programming languages tend to become more distant from the hardware which they are meant to command.

In this research direction, we propose to design expressive and scalable static analyses for compilers. This topic is closely linked to Sections 3.1 and 3.3 since the design of an efficient intermediate representation is made while regarding the analyses it enables. The intermediate representation should be expressive enough to embed maximal information; however if the representation is too complex the design of scalable analyses will be harder.

The analyses we plan to design in this activity will of course be mainly driven by the HPC dataflow optimizations we mentioned in the preceding sections; however we will also target other kinds of analyses applicable to more general purpose programs. We will thus consider two main directions:

  • Extend the applicability of the polyhedral model, in order to deal with HPC applications that do not fit totally in this category. More specifically, we plan to work on more complex control and also on complex data structures, like sparse matrices, which are heavily used in HPC.
  • Design of specialized static analyses for memory diagnostic and optimization inside general purpose compilers.

For both activities, we plan to cross fertilize ideas coming from the abstract interpretation community as well as language design, dataflow semantics, and WCET estimation techniques.

Correct by construction analyses. The design of well-defined semantics for the chosen programming language and intermediate representation will allow us to show the correctness of our analyses. The precise study of the semantics of Section 3.1 will allow us to adapt the analysis to the characteristics of the language, and prove that such an adaptation is well founded. This approach will be applicable both on the source language and on the intermediate representation.

We are interested both in paper proofs and verified proofs using a proof assistant such as Coq. Formally verified analyses crucially rely on a formal semantics of the programming language the analysis operates on: Yannick Zakowski recently developed precisely such a semantics in Coq for the sequential fragment of LLVM IR 8, the intermediate representation at the heart of the LLVM compilation infrastructure.

The semantics of Vellvm, which technically relies on Interaction Trees  66, enjoys crucial properties of compositionality and modularity. By leveraging these meta-theoretic properties of the semantics of the language, we believe that the additional objective of formal correctness can be compatible with the objectives of expressivity and scalability of the analyses we wish to develop for LLVM in particular.

The design of formal semantics allows formulating well-foundedness criteria relatively to the language semantics, that we can use to design our analyses, and then to study which extensions of the languages can be envisioned and analyzed safely, and which extensions (if any) are difficult to analyze and should be avoided. Here the correct identification of a core language for our formal studies (see Section 3.1) will play a crucial role as the core language should feature all the characteristics that might make the analysis difficult or incorrect.

Scalable abstract domains. We already have experience in designing low-cost semi-relational abstract domains for pointers 53, 50, as well as tailoring static analyses for specialized applications in compilation 40, 58, Synchronous Dataflow scheduling 57, and extending the polyhedral model to irregular applications 25. We also have experience in the design of various static verification techniques adapted to different programming paradigms.

Modularity of programming languages. Modularity is an essential property of modern programming languages, allowing one to assemble pieces of software in a high-level and composable fashion. We aim to develop new module systems and tools for large-scale ecosystems. A first aspect of this work is to pursue the collaboration with Didier Remy (Inria Cambium) and Jacques Garrigue (University of Nagoya) on designing module systems for ML languages. Gabriel Radanne is working on the formalization and implementation of a new rich module system which can serve as a foundation for further experiments on the OCaml module system. A second aspect is to improve the ease of use of large ecosystems. We also work on tools to assist software developers, such as a tool to search functions by types, in a way that scales to complete ecosystems.

3.2.1 Expected impact

The impact of this work is the significantly widened applicability of various tools and compilers related to parallelization: enabling optimizations for a larger class of programs, and enabling low-cost analyses that scale to very large programs.

We target both analysis for optimization and analysis to detect, or prove the absence of bugs.

3.2.2 Scientific Program

Medium-term activities.

In the context of Paul Iannetta's PhD thesis, we have proposed a semantic rephrasing of the polyhedral model and proposed first steps toward an effective "polyhedral-like compilation" for algebraic data structures like trees. In the medium term, we want to extend the applicability of this new model to arbitrary layouts. The most challenging ones are sparse matrices. This activity still relies on a formalization of the optimization activities (dependency computation, scheduling, compilation) in a more general Abstract-Interpretation based framework in order to make the approximations explicit.

At the same time, we plan to continue to work on scaling static analyses for general purpose programs, in the spirit of Maroua Maalej's PhD  49, whose contribution is a sequence of memory analyses inside production compilers. We already began a collaboration with Caroline Collange (PACAP team of IRISA Laboratory) on the design of static analyses to optimize copies from the global memory of a GPU to the block kernels (to increase locality). In particular, we have the objective to design specialized analyses but with an explicit notion of cost/precision compromise, in the spirit of the paper  44 that tries to formalize the cost/precision compromise of interprocedural analyses with respect to a “context sensitivity parameter”.

Long-term activities.

In a longer-term vision, the work on scalable static analyses, whether or not directed from the dataflow activities, will be pursued in the direction of large general-purpose programs.

An ambitious challenge is to find a generic way of adapting existing (relational) abstract domains within the Static Single Information 30 framework so as to improve their scalability. With this framework, we would be able to design static analyses, in the spirit of the seminal paper 36 which gave a theoretical scheme for classical abstract interpretation analyses.

We also plan to work on the interface between the analyses and their optimization clients inside production compilers.

3.3 Research direction 3: Optimizing Program Transformations

In this part, we propose to design the compiler analyses and optimizations for the medium-grain dataflow model defined in Section 3.1. We also propose to exploit these techniques to improve the compilation of dataflow languages based on actors. Hence our activity is split into the following parts:

  • Translating a sequential program into a medium-grain dataflow model. The programmer cannot be expected to rewrite the legacy HPC code, which is usually relatively large. Hence, compiler techniques must be invented to do the translation.
  • Transforming and scheduling our medium-grain dataflow model to meet some classic optimization criteria, such as throughput, local memory requirements, or I/O traffic.
  • Combining agents and polyhedral kernels in dataflow languages. We propose to apply the techniques above to optimize the processes in actor-based dataflow languages and combine them with the parallelism existing in the languages.

We plan to rely extensively on the polyhedral model to define our compiler analyses. The polyhedral model was originally designed to analyze imperative programs. Analyses (such as scheduling or buffer allocation) must be redefined in light of dataflow semantics.

Translating a sequential program into a medium-grain dataflow model. The programs considered are compute-intensive parts from HPC applications, typically big HPC kernels of several hundreds of lines of C code. In particular, we expect to analyze the process code (actors) from the dataflow programs. On short ACL (Affine Control Loop) programs, direct solutions exist 61 and rely directly on array dataflow analysis 39. On bigger ACL programs, this analysis no longer scales. We plan to address this issue by modularizing array dataflow analysis. Indeed, by splitting the program into processes, the complexity is mechanically reduced. This is a general observation, which was exploited in the past to compute schedules 41. When the program is no longer ACL, a clear distinction must be made between polyhedral parts and non-polyhedral parts. Hence, our medium-grain dataflow language must distinguish between polyhedral process networks and non-polyhedral code fragments. This structure raises new challenges: How to abstract away non-polyhedral parts while keeping the polyhedrality of the dataflow program? Which trade-offs between precision and scalability are effective?
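To illustrate what array dataflow analysis computes, the following Python sketch determines, for each read of a toy loop, which iteration last wrote the value being read. It is only an illustration under our own assumptions: the loop `for i in range(1, n): a[i] = a[i-1] + a[0]` and the function name `last_writers` are hypothetical, and the sketch enumerates iterations for a fixed `n`, whereas the analysis cited above solves the same question symbolically for all parameter values.

```python
def last_writers(n):
    """Source of each read in: for i in range(1, n): a[i] = a[i-1] + a[0].

    Returns a dict mapping (iteration, cell) to the iteration that last
    wrote `cell` before it is read, or None when the value is a live-in
    (it comes from outside the loop).
    """
    last_write = {}  # cell index -> iteration of the most recent write
    sources = {}
    for i in range(1, n):
        for cell in (i - 1, 0):  # the statement reads a[i-1] and a[0]
            sources[(i, cell)] = last_write.get(cell)
        last_write[i] = i  # then the statement writes a[i]
    return sources
```

On this example the result has a simple closed form: the read of `a[i-1]` at iteration `i` is produced by iteration `i - 1` when `i >= 2`, and every read of `a[0]` is a live-in. Such producer/consumer pairs are exactly the flow dependences that become channels in a dataflow representation.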

Medium-grain data transfers minimization. When the system consists of a single computing unit connected to a slow memory, the roofline model 64 defines the optimal ratio of computation per data transfer (operational intensity). The operational intensity is then translated to a partition of the computation (loop tiling) into reuse units: inside a reuse unit, data are transferred locally; between reuse units, data are transferred through the slow memory. On a fine-grain dataflow model, reuse units are exposed with loop tiling; this is the case for example in Data-aware Process Network (DPN) 27. The following questions are however still open: How does that translate on medium-grain dataflow models? And fundamentally what does it mean to tile a dataflow model?
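The relation between operational intensity and tiling can be sketched numerically. In the Python sketch below, `attainable_performance` is the usual roofline bound (the minimum of the peak performance and of the bandwidth multiplied by the operational intensity). The function `matmul_tile_intensity` is a deliberately simplified cost model that we assume for illustration only: one square tile of a matrix product performs 2 * tile^3 operations and moves three tile x tile blocks through the slow memory.

```python
def attainable_performance(peak_flops, bandwidth, intensity):
    """Roofline bound: compute-bound or memory-bound, whichever is lower.

    peak_flops: peak performance (operations per second)
    bandwidth:  slow-memory bandwidth (bytes per second)
    intensity:  operational intensity (operations per byte transferred)
    """
    return min(peak_flops, bandwidth * intensity)


def ridge_point(peak_flops, bandwidth):
    """Smallest operational intensity at which the peak can be reached."""
    return peak_flops / bandwidth


def matmul_tile_intensity(tile, bytes_per_element=8):
    """Operational intensity of one tile under a simplified, assumed model.

    Assumption: a tile x tile x tile block of C += A * B performs
    2 * tile**3 operations and transfers three tile x tile blocks
    (one each for A, B and C) through the slow memory.
    """
    operations = 2 * tile ** 3
    bytes_moved = 3 * tile * tile * bytes_per_element
    return operations / bytes_moved
```

Under this model the operational intensity grows linearly with the tile size, so choosing the tile size amounts to choosing a point on the roofline: tiles below the ridge point leave the kernel memory-bound, and larger tiles only consume more local memory.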

Combining agents and polyhedral kernels in dataflow languages. In addition to the approach developed above, we propose to explore the compilation of dataflow programming languages. In fact, among the applications targeted by the project, some are already designed or specified as dataflow actors (video compression, machine-learning algorithms, etc.).

So far, parallelization techniques for such applications have focused on taking advantage of the decomposition into agents, potentially duplicating some agents to have several instances that work on different data items in parallel 43. In the presence of big agents, the programmer is left with the splitting (or merging) of these agents by hand if she wants to further parallelize her program (or at least give this opportunity to the runtime, which in general only sees agents as non-malleable entities). In the presence of arrays and loop nests, or, more generally, some kind of regularity in the agent's code, however, we believe that the programmer would benefit from automatic parallelization techniques such as those proposed in the previous paragraphs. Our goal is a totally integrated approach where programmers write the applications they have in mind (application flow in agents, where the agents' code expresses potential parallelism), and where it is then up to the system (compiler, runtime) to propose adequate optimizations. To achieve it, we propose to build on a solid formal definition of the language semantics (thus the formal specification of parallelism occurring at the agent level) to provide hierarchical solutions to the problem of compilation and scheduling of such applications.

Certified compilation. We will develop a research direction around the formal proof of compilation passes, and of optimizing program transformations in particular. Although realistic formally verified optimizing compilers are roughly 15 years old, three limitations to the current state of the art are apparent.

First, loop optimizations have been very sparsely tackled, their proof raising difficult semantic issues. We intend on one side to leverage the compositionality of Interaction-Tree-based semantics as used in Vellvm to improve the situation. An orthogonal axis we wish to explore is the formalization in Coq of the Polyhedral Model, as pioneered in 2021 by Courant and Leroy 35.

Second, parallelism and concurrency have been almost ignored by the verified compilation community. This problem is a major long term endeavor for we first need to develop the appropriate semantic tools. Ludovic Henrio and Yannick Zakowski will work with a master student, Ambre Suhamy, to explore the use of Interaction Trees to model various paradigms for concurrency, paving the long term way to an extension of Vellvm to concurrency.

Third, these proofs are very brittle for they rely on concrete implementation of memory models rather than axiomatizations of those. Ludovic Henrio and Yannick Zakowski will work with a master student, Alban Reynaud, to develop semantic tools to reason formally up-to arbitrary algebras in Coq. One of the core objectives of this project is to prove optimizations at a higher level of abstraction, so that these proofs remain valid by construction under changes in the memory model.

The compiler analyses proposed above do not target a specific platform. In this part, we propose to leverage these analyses to develop source-level optimizations for high-level synthesis (HLS).

High-level synthesis consists in compiling a kernel written in a high-level language (typically in C) into a circuit. As for any compiler, an HLS tool consists of a front-end which translates the input kernel into an intermediate representation. This intermediate representation captures the control/flow dependences between computation units, generally in a hierarchical fashion. Then, the back-end maps this intermediate representation to a circuit (e.g. FPGA configuration). We believe that HLS tools must be thought of as fine-grain automatic parallelizers. In classic HLS tools, the parallelism is expressed and exploited at the back-end level during the scheduling and the resource allocation of arithmetic operations. We believe that it would be far more profitable to derive the parallelism at the front-end level.

Hence, CASH will focus on the front-end pass and the intermediate representation. Low-level back-end techniques are not in the scope of CASH. Specifically, CASH will leverage the dataflow representation developed in Section 3.1 and the compilation techniques developed in Section 3.3 to develop a relevant intermediate representation for HLS and the corresponding front-end compilation algorithms.

Our results will be evaluated by using existing HLS tools (e.g., Intel HLS compiler, Xilinx Vivado HLS). We will implement our compiler as a source-to-source transformation in front of HLS tools. With this approach, HLS tools are considered as a “back-end black box”. The CASH scheme is thus: (i) front-end: produce the CASH dataflow representation from the input C kernel; then (ii) turn this dataflow representation into a C program with pragmas for an HLS tool. Step (ii) must convey the characteristics of the dataflow representation found by step (i) (e.g. dataflow execution, FIFO synchronisation, channel size). This source-to-source approach will allow us to get a full source-to-FPGA flow demonstrating the benefits of our tools while relying on existing tools for low-level optimizations. Step (i) will start from the Dcc tool developed by Christophe Alias, which already produces a dataflow intermediate representation: the Data-aware Process Networks (DPN) 27. Hence, the very first step is to choose an HLS tool and to investigate which input should be fed to the HLS tool so that it “respects” the parallelism and the resource allocation suggested by the DPN. From this basis, we plan to investigate the points described hereafter.

Roofline model and dataflow-level resource evaluation. Operational intensity must be tuned according to the roofline model. The roofline model 64 must be redefined in light of FPGA constraints. Indeed, the peak performance is no longer constant: it depends on the operational intensity itself. The more operational intensity we need, the more local memory we use, the less parallelization we get (since FPGA resources are limited), and finally the less performance we get! Hence, multiple iterations may be needed before reaching an efficient implementation. To accelerate the design process, we propose to iterate at the dataflow program level, which implies a fast resource evaluation at the dataflow level.
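The feedback loop described above can be sketched with a toy model. Everything in the Python sketch below is a hypothetical assumption made for illustration, not a measured FPGA model: local memory per parallel unit is assumed to grow linearly with the operational intensity (`blocks_per_intensity`), the number of parallel units is bounded by the available memory blocks, and each unit delivers `unit_flops`.

```python
def fpga_performance(intensity, bandwidth, memory_blocks,
                     blocks_per_intensity, unit_flops):
    """Roofline bound where the peak depends on the operational intensity.

    Hypothetical resource model: a unit needs a number of memory blocks
    proportional to the intensity, so a higher intensity leaves room for
    fewer parallel units and lowers the peak.
    """
    blocks_per_unit = max(1, round(blocks_per_intensity * intensity))
    units = memory_blocks // blocks_per_unit
    peak = units * unit_flops
    return min(peak, bandwidth * intensity)


def best_intensity(candidates, **platform):
    """Explore candidate intensities and keep the best-performing one."""
    return max(candidates, key=lambda oi: fpga_performance(oi, **platform))
```

With `bandwidth=10.0`, `memory_blocks=64`, `blocks_per_intensity=2` and `unit_flops=5.0`, the modeled performance first rises with the intensity (memory-bound regime) and then falls (too few parallel units), which is why several iterations over the design may be needed and why a fast resource estimate at the dataflow level is valuable.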

Reducing FPGA resources. Each parallel unit must use as few resources as possible to maximize parallel duplication, hence the final performance. This requires factorizing the control and the channels. Both can be achieved with source-to-source optimizations at the dataflow level. The main issue with outputs from polyhedral optimization is large piecewise affine functions that require a wide silicon surface on the FPGA to be computed. Actually, we do not need to compute a closed form (an expression that can be evaluated in bounded time on the FPGA) statically. We believe that the circuit can be compacted if we allow control parts to be evaluated dynamically. Finally, though dataflow architectures are a natural candidate, adjustments are required to fit FPGA constraints (2D circuit, few memory blocks). Ideas from systolic arrays 56 can be borrowed to re-use the same piece of data multiple times, despite the limitation to regular kernels and the lack of I/O flexibility. A trade-off must be found between pure dataflow and systolic communications.

Improving circuit throughput. Since we target streaming applications, the throughput must be optimized. To achieve such an optimization, we need to address the following questions. How to derive an optimal upper bound on the throughput of polyhedral process networks? Which dataflow transformations should be performed to reach it? The limiting factors are well known: I/O (decoding of burst data), communications through addressable channels, and latencies of the arithmetic operators. Finally, it is also necessary to find the right methodology to measure the throughput statically and/or dynamically.

3.3.1 Expected impact

In general, splitting a program into simpler processes simplifies the problem. This observation leads to the following points:

  • By abstracting away irregular parts in processes, we expect to structure the long-term problem of handling irregular applications in the polyhedral model. The long-term impact is to widen the applicability of the polyhedral model to irregular kernels.
  • Splitting a program into processes reduces the problem size. Hence, it becomes possible to scale traditionally expensive polyhedral analyses such as scheduling or tiling, to name a few.

As for the third research direction, the short term impact is the possibility to combine efficiently classical dataflow programming with compiler polyhedral-based optimizations. We will first propose ad-hoc solutions coming from our HPC application expertise, but supported by strong theoretical results that prove their correctness and their applicability in practice. In the longer term, our work will allow specifying, designing, analyzing, and compiling HPC dataflow applications in a unified way. We target semi-automatic approaches where pertinent feedback is given to the developer during the development process.

3.3.2 Scientific Program

Short-term and ongoing activities.

We plan to evaluate the impact of state-of-the-art polyhedral source-to-source transformations on HLS for FPGA. Our results on polyhedral HLS (DPN 26, 28) could also be a good starting point for this purpose. We will give a particular focus to memory layout transformations, which are easier to implement as source-level transformations. Then, we will tackle control optimizations through the adaptation of loop tiling to HLS constraints.

Medium-term activities.

The results of the preceding paragraph are partial and have been obtained with a simple experimental approach only using off-the-shelf tools. We are thus encouraged to pursue research on combining expertise from dataflow programming languages and polyhedral compilation. Our long term objective is to go towards a formal framework to express, compile, and run dataflow applications with intrinsic instruction or pipeline parallelism.

We plan to investigate in the following directions:

  • Investigate how polyhedral analysis extends on modular dataflow programs. For instance, how to modularize polyhedral scheduling analysis on our dataflow programs?
  • Develop a proof of concept and validate it on linear algebra kernels (SVD, Gram-Schmidt, etc.).
  • Explore various areas of applications from classical dataflow examples, like radio and video processing, to more recent applications in deep learning algorithms. This will enable us to identify some potential (intra and extra) agent optimization patterns that could be leveraged into new language idioms.

Also, we plan to explore how polyhedral transformations might scale on larger applications, typically those found in deep-learning algorithms. We will investigate how the regularity of polyhedral kernels can be exploited to infer general affine transformations from a few offline execution traces. This is the main goal of the PolyTrace exploratory action, started in 2021 in collaboration with Waseda University. We will first target offline memory allocation, an important transformation used in HLS and generally in automatic parallelization.
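
To make the idea of inferring an affine function from a few trace points concrete, here is a minimal sketch. It is not the PolyTrace method: the trace format (pairs of an iteration vector and an observed value) and the name `fit_affine` are assumptions chosen for illustration. The sketch solves for the coefficients exactly over rationals, then validates the candidate against every sample.

```python
from fractions import Fraction


def fit_affine(samples):
    """Infer f(x) = a.x + c from (point, value) samples by Gaussian
    elimination over exact rationals, then check it against every sample.
    Returns (coefficients, constant), or None if no affine function fits."""
    dim = len(samples[0][0])
    # One row per sample: [x_1, ..., x_dim, 1 | value]
    rows = [[Fraction(v) for v in point] + [Fraction(1), Fraction(value)]
            for point, value in samples]
    n_unknowns = dim + 1
    pivot_row = 0
    pivots = []
    for col in range(n_unknowns):
        # Look for a row with a non-zero entry in this column.
        found = next((r for r in range(pivot_row, len(rows))
                      if rows[r][col] != 0), None)
        if found is None:
            continue
        rows[pivot_row], rows[found] = rows[found], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [v / pivot for v in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    # Unknowns without a pivot (under-determined system) are set to 0.
    solution = [Fraction(0)] * n_unknowns
    for r, col in enumerate(pivots):
        solution[col] = rows[r][-1]
    coeffs, const = solution[:dim], solution[dim]
    # Validation: the candidate must reproduce the whole trace.
    for point, value in samples:
        if sum(a * x for a, x in zip(coeffs, point)) + const != value:
            return None
    return coeffs, const


if __name__ == "__main__":
    # Hypothetical trace of a 2D access whose address is 3*i + j + 2.
    trace = [((i, j), 3 * i + j + 2) for i in range(3) for j in range(3)]
    print(fit_affine(trace))
```

A function inferred this way is only known to hold on the observed points; extrapolating it to all parameter values is exactly the part that needs the theoretical work mentioned above.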

Finally, we plan to explore how on-the-fly evaluation can reduce the complexity of the control. A good starting point is the control required for the load process (which fetches data from the distant memory). If we want to avoid multiple loads of the same data, the FSM (Finite State Machine) that describes it is usually very complex. We believe that dynamic construction of the load set (the set of data to load from the main memory) will use less silicon than an FSM with large piecewise affine functions computed statically.
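
The contrast between redundant loads and a load set built on the fly can be sketched in software as follows. The 3-point stencil, the tile shape and all function names are hypothetical, and a Python set stands in for whatever membership structure the hardware would actually use.

```python
def reads_of(i):
    """Cells read by iteration i of a hypothetical 3-point stencil."""
    return (i - 1, i, i + 1)


def naive_loads(tile):
    """Fetch every operand of every iteration: the same cell is loaded
    several times."""
    return [cell for i in tile for cell in reads_of(i)]


def dynamic_load_set(tile, already_local=()):
    """Build the load set on the fly: a cell is requested only the first
    time it is needed, and only if it is not already in local memory."""
    local = set(already_local)
    to_load = []
    for i in tile:
        for cell in reads_of(i):
            if cell not in local:
                local.add(cell)
                to_load.append(cell)
    return to_load


if __name__ == "__main__":
    tile = range(4, 8)
    print(len(naive_loads(tile)), "loads without deduplication")
    print(dynamic_load_set(tile))
    print(dynamic_load_set(tile, already_local={3, 4, 5}))
```

No closed-form description of the load set is ever computed here; the set emerges from the membership test, which is the trade-off the paragraph above proposes to study.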

Long-term activities.

Current work focuses on purely polyhedral applications. Irregular parts are not handled. Also, a notion of tiling is required so the communications of the dataflow program with the outside world can be tuned with respect to the local memory size. Hence, we plan to investigate the following points:

  • Assess simple polyhedral/non-polyhedral partitioning: How can non-polyhedral parts be hidden in processes/channels? How to abstract the dataflow dependencies between processes? What would be the impact on analyses? We target programs with irregular control (e.g., while loops, early exits) and regular data (arrays with affine accesses).
  • Design tiling schemes for modular dataflow programs: What does it mean to tile a dataflow program? Which compiler algorithms to use?
  • Implement a mature compiler infrastructure from the front-end to code generation for a reasonable subset of the representation.

Also, we plan to systematize the definition of scalable polyhedral compilers using extrapolation from offline traces. Both theoretical and applied research are required to reach this goal. The research strategy consists in studying several instances (memory allocation, scheduling, etc.), then producing the theoretical ingredients needed to reach a general design methodology.

3.4 Research direction 4: Simulation and Hardware

Complex systems such as systems-on-a-chip or HPC computers with FPGA accelerators comprise both hardware and software parts, tightly coupled together. In particular, the software cannot be executed without the hardware, or at least a simulator of the hardware.

Because of the increasing complexity of both software and hardware, traditional simulation techniques (Register Transfer Level, RTL) are too slow to allow full system simulation in reasonable time. New techniques such as Transaction Level Modeling (TLM)  52 in SystemC  46 have been introduced and widely adopted in the industry. Internally, SystemC uses discrete-event simulation, with efficient context-switch using cooperative scheduling. TLM abstracts away communication details, and allows modules to communicate using function calls. We are particularly interested in the loosely timed coding style where the timing of the platform is not modeled precisely, and which allows the fastest simulations. This allowed gaining several orders of magnitude of simulation speed. However, SystemC/TLM is also reaching its limits in terms of performance, in particular due to its lack of parallelism.
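
As a rough illustration of discrete-event simulation with cooperative scheduling, the sketch below uses Python generators as processes that hand control back to a tiny kernel together with the delay they want to wait. This is not the SystemC API: `simulate` and `clock` are hypothetical names, and a `yield delay` merely plays the role of a timed wait.

```python
import heapq


def simulate(processes):
    """Minimal discrete-event kernel with cooperative scheduling.

    processes: dict mapping a name to a generator that yields delays.
    Returns the list of (time, name) resumptions in simulation order."""
    trace = []
    queue = []  # entries: (wake-up time, sequence number, name, generator)
    seq = 0     # tie-breaker so generators are never compared
    for name, proc in processes.items():
        queue.append((0, seq, name, proc))
        seq += 1
    heapq.heapify(queue)
    while queue:
        now, _, name, proc = heapq.heappop(queue)
        trace.append((now, name))
        try:
            delay = next(proc)  # run until the process yields control
        except StopIteration:
            continue            # the process has finished
        heapq.heappush(queue, (now + delay, seq, name, proc))
        seq += 1
    return trace


def clock(period, ticks):
    """A process that waits `period` time units, `ticks` times."""
    for _ in range(ticks):
        yield period


if __name__ == "__main__":
    for event in simulate({"fast": clock(1, 2), "slow": clock(3, 1)}):
        print(event)
```

Because a process runs uninterrupted until it yields, a context switch is cheap, but the kernel is inherently sequential, which is the limitation discussed above.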

Work on SystemC/TLM parallel execution is both an application of other work on parallelism in the team and a tool complementary to HLS presented in Sections 3.1 (dataflow models and programs) and 3.3 (application to FPGA). Indeed, some of the parallelization techniques we develop in CASH could apply to SystemC/TLM programs. Conversely, a complete design-flow based on HLS needs fast system-level simulation: the full-system usually contains both hardware parts designed using HLS, handwritten hardware components, and software.

We also work on simulation of the DPN intermediate representation. Simulation is a very important tool to help validate and debug a complete compiler chain. Without simulation, validating the front-end of the compiler requires running the full back-end and checking the generated circuit. Simulation can avoid the execution time of the backend and provide better debugging tools.

Automatic parallelization has been shown to be hard, if at all possible, on loosely timed models 33. We focus on semi-automatic approaches where the programmer only needs to make minor modifications to programs to get significant speedups. We already obtained results in the joint PhD (with Tanguy Sassolas) of Gabriel Busnot with CEA-LIST. The research targets parallelizing SystemC heterogeneous simulations, extending SCale 62, which is very efficient at simulating parallel homogeneous platforms such as multi-core chips. We removed the need for manual address annotations, which did not work when the software does non-trivial memory management (virtual memory using a memory management unit, dynamic allocation), since the address ranges cannot be known statically. We can now parallelize simulations running a full software stack including Linux.

We are also working with Bull/Atos on HPC interconnect simulation, using SimGrid  38. Our goal is to allow simulating an application that normally runs on a large number of nodes on a single computer, and obtain relevant performance metrics.

3.4.1 Expected Impact

The short term impact is the possibility to improve simulation speed with a reasonable additional programming effort. The amount of additional programming effort will thus be evaluated in the short term.

In the longer term, our work will allow scaling up simulations both in terms of models and execution platforms. Models are needed not only for individual Systems on a Chip, but also for sets of systems communicating together (e.g., the full model for a car which comprises several systems communicating together), and/or heterogeneous models. In terms of execution platform, we are studying both parallel and distributed simulations.

3.4.2 Scientific Program

Medium-term activities.

We started working on the “heterogeneous” aspect of simulations with an approach allowing changing the level of details in a simulation at runtime.

Several research teams have proposed different approaches to deal with parallelism and heterogeneity. Each approach targets a specific abstraction level and coding style. While we do not hope for a universal solution, we believe that a better coordination of different actors of the domain could lead to a better integration of solutions. We could imagine, for example, a platform with one subsystem accelerated with SCale  62 from CEA-LIST, some compute-intensive parts delegated to sc-during  51 from Matthieu Moy, and a co-simulation with external physical solvers using SystemC-MDVP  31 from LIP6. We plan to work on the convergence of approaches, ideally both through point-to-point collaborations and with a collaborative project.

A common issue with heterogeneous simulation is the level of abstraction. Physical models only simulate one scenario and require concrete input values, while TLM models are usually abstract and not aware of precise physical values. One option we would like to investigate is a way to deal with loose information, e.g. manipulate intervals of possible values instead of individual, concrete values. This would allow a simulation to be symbolic with respect to the physical values.
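
Manipulating intervals instead of concrete values can be sketched with a small interval type. The class below is an illustrative assumption, not part of any simulator mentioned here, and it ignores floating-point rounding: a sound implementation would round lower bounds down and upper bounds up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] standing for an unknown physical value."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("empty interval")

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The extreme products are always reached at the bounds.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def contains(self, value):
        return self.lo <= value <= self.hi


if __name__ == "__main__":
    a = Interval(2, 3)
    b = Interval(-1, 4)
    print(a + b, a - b, a * b)
```

A simulation run over such values covers every concrete scenario whose inputs lie within the intervals, at the price of some over-approximation.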

Long-term activities.

In the long term, our vision is a simulation framework that will allow combining several simulators (not necessarily all SystemC-based), and allow running them in a parallel way. The Functional Mockup Interface (FMI) standard is a good basis to build upon, but the standard does not allow expressing timing and functional constraints needed for a full co-simulation to run properly.

4 Application domains

The CASH team targets HPC programs at different levels: small computation kernels (tens of lines of code) that can be analyzed and optimized aggressively, medium-size kernels (hundreds of lines of code) that require modular analysis, and assemblies of compute kernels (either as classical imperative programs or written directly in a dataflow language).

The work on various application domains and categories of programs is driven by the same idea: exploring various topics is a way to converge on unifying representations and algorithms even for specific applications. All these applications share the same research challenge: find a way to integrate computations, data, mapping, and scheduling in a common analysis and compilation framework.

Typical HPC kernels include linear solvers, stencils, matrix factorizations, BLAS kernels, etc. Many kernels can be found in the Polybench/C benchmark suite 54. The irregular versions can be found in 55. Numerical kernels used in quantitative finance 65 are also good candidates, e.g., finite difference and Monte-Carlo simulation.

The medium-size applications we target are streaming algorithms  29, scientific workflows  60, and also the now very rich domain of deep learning applications  48. We explore the possibilities of writing (see Section 3.1) and compiling (see Section 3.3) applications using a dataflow language. As a first step, we will target dataflow programs written in SigmaC 32 for which the fine grain parallelism is not taken into account. In parallel, we will also study the problem of deriving relevant (with respect to safety or optimization) properties on dataflow programs with array iterators.

The approach of CASH is based on compilation, and our objective is to allow developers to design their own kernels, and benefit from good performance in terms of speed and energy efficiency without having to deal with fine-grained optimizations by hand. Consequently, our objective is first to improve the performance and energy consumption for HPC applications, while providing programming tools that can be used by developers and are at a convenient level of abstraction.

Obviously, large applications are not limited to assemblies of compute kernels. Our language and formalism definitions and our analyses must also be able to deal with general programs. Our targets also include general-purpose programs with complex behaviors such as recursive programs operating on arrays, lists and trees, and worklist algorithms (lists are not handled within the polyhedral domain). Analyses of these programs should be able to detect illegal memory accesses, memory consumption, hotspots, etc., and to prove functional properties.

The simulation activities are both applied internally in CASH, to simulate intermediate representations, and for embedded systems. We are interested in Transaction-Level Models (TLM) of Systems-on-a-Chip (SoCs) including processors and hardware accelerators. TLM provides an abstract but executable model of the chip, with enough details to run the embedded software. We are particularly interested in models written in a loosely timed coding style. We plan to extend these to heterogeneous simulations including a SystemC/TLM part to model the numerical part of the chip, and other simulators to model physical parts of the system.

5 Social and environmental responsibility

5.1 Footprint of research activities

Although we do not have a precise measure of our carbon (and other environmental) footprint, the two main sources of impact of computer-science research activities are usually transport (plane) and digital equipment (lifecycle of computers and other electronic devices).

Many members of the CASH team are already working to reduce their international travel. Hopefully, the solutions we had to set up to continue our activities during the COVID crisis will allow us to continue our research with a sustainable amount of travel, using other forms of remote collaboration when possible.

As far as digital equipment is concerned, we try to extend the lifetime of our machines as much as possible.

5.2 Impact of research results

Many aspects of our research are meant to provide tools to make programs more efficient, in particular more power-efficient. It is very hard, however, to assess the actual impact of such research. In many cases, improvements in power-efficiency lead to a rebound effect which may weaken the benefit of the improvement, or even lead to an increase in total consumption (backfire).

CASH provides tools for developers, but does not develop end-user applications. We believe the social impact of our research depends more on the way developers will use our tools than on the way we conduct our research. We do have a responsibility on the application domains we promote, though.

Ludovic Henrio followed the "Atelier Sciences Environnements Sociétés Inria 2021" (atelier Sens) organized by Eric Tannier in June 2021. He then ran an atelier Sens for volunteer CASH members during the CASH seminar in October 2021.

6 Highlights of the year

6.1 Articles

  • Our work on choice trees, an extension of interaction trees for representing non-deterministic computations with effects in Coq, was accepted at POPL 2023 9.
  • Our work on memory representation of Algebraic Data Types was accepted at ICFP 2023 13.

7 New software, platforms, open data

7.1 New software

7.1.1 DCC

  • Name:
    DPN C Compiler
  • Keywords:
    Polyhedral compilation, Automatic parallelization, High-level synthesis
  • Functional Description:
    Dcc (Data-aware process network C Compiler) compiles a regular C kernel to a data-aware process network (DPN), a dataflow intermediate representation suitable for high-level synthesis in the context of high-performance computing. Dcc has been registered at the APP ("Agence de protection des programmes") and transferred to the XtremLogic start-up under an Inria license.
  • News of the Year:
    This year, Dcc was enhanced with user-guided loop tiling. Given a user-specified tiling template, a correct loop tiling with minimal latency is inferred.
  • Contact:
    Christophe Alias
  • Participants:
    Christophe Alias, Alexandru Plesco

7.1.2 PoCo

  • Name:
    Polyhedral Compilation Library
  • Keywords:
    Polyhedral compilation, Automatic parallelization
  • Functional Description:
    PoCo (Polyhedral Compilation Library) is a framework to develop program analyses and optimizations in the polyhedral model. PoCo features polyhedral building blocks as well as state-of-the-art polyhedral program analyses. PoCo has been registered at the APP (“Agence de protection des programmes”) and transferred to the XtremLogic start-up under an Inria license.
  • News of the Year:
    This year, GLPK was interfaced to the symbolic engine. Also, the Farkas engine was improved to handle more complex affine constraints.
  • Contact:
    Christophe Alias
  • Participant:
    Christophe Alias

7.1.3 Encore with dataflow explicit futures

  • Keywords:
    Language, Optimizing compiler, Source-to-source compiler, Compilers
  • Functional Description:
    Fork of the Encore language compiler, with a new "Flow" construct implementing data-flow explicit futures.
  • Contact:
    Ludovic Henrio

7.1.4 fkcc

  • Name:
    The Farkas Calculator
  • Keywords:
    DSL, Farkas Lemma, Polyhedral compilation
  • Scientific Description:
    fkcc is a scripting tool to prototype program analyses and transformations exploiting the affine form of Farkas lemma. Our language is general enough to prototype sophisticated termination and scheduling algorithms in a few lines. The tool is freely available and may be tried online via a web interface. We believe that fkcc is the missing link to accelerate the development of program analyses and transformations exploiting the affine form of Farkas lemma.
  • Functional Description:
    fkcc is a scripting tool to prototype program analyses and transformations exploiting the affine form of Farkas lemma. Our language is general enough to prototype sophisticated termination and scheduling algorithms in a few lines. The tool is freely available and may be tried online via a web interface. We believe that fkcc is the missing link to accelerate the development of program analyses and transformations exploiting the affine form of Farkas lemma.
  • Release Contributions:
    - Script language
    - Polyhedral constructors
    - Farkas summation solver
  • Contact:
    Christophe Alias
  • Participant:
    Christophe Alias

7.1.5 Vellvm

  • Keywords:
    Coq, Semantic, Compilation, Proof assistant, Proof
  • Scientific Description:
    A modern formalization in the Coq proof assistant of the sequential fragment of LLVM IR. The semantics, based on the Interaction Trees library, presents several rare properties for mechanized development of this scale: it is compositional, modular, and extracts to a certified executable interpreter. A rich equational theory of the language is provided, and several verified tools based on this semantics are in development.
  • Functional Description:
    Formalization in the Coq proof assistant of a subset of the LLVM compilation infrastructure.
  • Contact:
    Yannick Zakowski
  • Participants:
    Yannick Zakowski, Steve Zdancewic, Calvin Beck, Irene Yoon
  • Partner:
    University of Pennsylvania

7.1.6 vaphor

  • Name:
    Verification of Programs with Horn Clauses
  • Keyword:
    Program verification
  • Functional Description:
    Conversion of programs to Horn clauses, and abstraction of Horn clauses with arrays.
  • Contact:
    Laure Gonnord

7.1.7 Data Abstraction

  • Name:
    Data Abstraction
  • Keywords:
    Static analysis, Program verification, Propositional logic
  • Functional Description:

    The tool is an element of a static program (or other) verification process which is done in three steps:

    1. Transform the verification problem into Horn clauses, perhaps using MiniJavaConverter or SeaHorn.
    2. Simplify the Horn clauses using data abstraction (this tool).
    3. Solve the Horn clauses using a Horn solver such as Z3.

  • Contact:
    Laure Gonnord

7.1.8 S4BXI

  • Keywords:
    Simulation, HPC, Network simulator
  • Functional Description:
    S4BXI is a simulator of the Portals4 network API. It is written using SimGrid's S4U interface, which provides a fast flow-model. More specifically, this simulator is tuned to model as closely as possible Bull's hardware implementation of Portals (BXI interconnect).
  • Contact:
    Julien Emmanuel
  • Partner:
    Bull - Atos Technologies

7.1.9 llvm-pass

7.1.10 ribbit

  • Keywords:
    Compilation, Pattern matching, Algebraic Data Types
  • Functional Description:
    Ribbit is a compiler for pattern languages with algebraic data types which is parameterized by the memory representation of types. Given a memory representation, it generates efficient and correct code for pattern matching clauses.
  • Contact:
    Gabriel Radanne

7.1.11 calv

  • Name:
    AVL calculator
  • Keywords:
    Data structures, OpenMP
  • Functional Description:
    calv is a calculator which is used to run different implementations of AVL trees, and compare their relative performances.
  • Contact:
    Paul Iannetta

7.1.12 adtr

  • Name:
    ADT Rewriting language
  • Keywords:
    Compilation, Static typing, Algebraic Data Types, Term Rewriting Systems
  • Functional Description:

    ADTs are generally represented by nested pointers, one for each constructor of the algebraic data type. Furthermore, they are generally manipulated persistently, by allocating new constructors.

    ADTr allows representing ADTs in a flat way while compiling a pattern-match-like construction as a rewrite on the memory representation. The goal is then to use this representation to optimize the rewriting and exploit parallelism.

  • Contact:
    Gabriel Radanne
  • Participants:
    Gabriel Radanne, Paul Iannetta, Laure Gonnord

7.1.13 dowsing

  • Keywords:
    Static typing, Ocaml
  • Functional Description:

    Dowsing is a tool to search for functions by type. Given a simple OCaml type, it quickly finds all functions whose types are compatible.

    Dowsing works by building a database containing all the specified libraries. New libraries can be added to the database. It then builds an index which allows it to answer requests quickly.

  • Contact:
    Gabriel Radanne
  • Participants:
    Gabriel Radanne, Laure Gonnord

7.1.14 odoc

  • Functional Description:
    OCaml is a statically typed programming language with widespread use in both academia and industry. Odoc is a tool to generate documentation of OCaml libraries, either as HTML websites for online distribution or as PDF manuals and man pages.
  • Contact:
    Gabriel Radanne
  • Participants:
    Jon Ludlam, Gabriel Radanne, Florian Angeletti, Leo White

7.1.15 PoLA

  • Name:
    PoLA: a Polyhedral Liveness Analyser
  • Keywords:
    Polyhedral compilation, Array contraction
  • Functional Description:
    PoLA is a C++ tool that optimizes the footprint of C(++) programs of the polyhedral model by applying reduced mappings deduced from dynamic analysis of the program. More precisely, we apply a dataflow analysis on traces of a program, obtained either by execution or interpretation, and infer parametrized mappings for the arrays used for intermediate computations. This tool is part of the Polytrace project.
  • Contact:
    Christophe Alias
  • Participants:
    Hugo Thievenaz, Christophe Alias, Keiji Kimura
  • Partner:
    Waseda University

7.1.16 Actors-OCaml

7.1.17 ctrees

  • Name:
    Choice Trees
  • Keywords:
    Coq, Concurrency, Formalisation, Semantics, Proof assistant
  • Functional Description:
    We develop so-called "ctrees", a data-structure in Coq suitable for modelling and reasoning about non-deterministic programming languages as an executable monadic interpreter. We link this new library to the Interaction Trees project: ctrees offer a valid target for interpretation of non-deterministic events.
  • Contact:
    Yannick Zakowski

8 New results

This section presents the scientific results obtained in the evaluation period. They are grouped according to the directions of our research program.

8.1 Research direction 1: Parallel and Dataflow Programming Models

8.1.1 Flexible Synchronization for Parallel Computations.

Participants: Ludovic Henrio, Matthieu Moy, Amaury Maillé.

Parallel applications make use of parallelism where work is shared between tasks; often, tasks need to exchange data stored in arrays or FIFO queues and synchronize depending on the availability of these data. In the thesis of Amaury Maillé, we explored different approaches to set, manually or automatically, the granularity of the synchronisation induced by such data transmission.
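
The notion of synchronisation granularity can be illustrated with a plain producer/consumer pair. This is a generic sketch, not the mechanism developed in the thesis: the `transfer` function and its `granularity` parameter are hypothetical, and a standard thread-safe queue stands in for the FIFO.

```python
import queue
import threading


def transfer(data, granularity):
    """Send `data` from a producer thread to a consumer thread in chunks
    of `granularity` elements.

    Returns (received items, number of data synchronisations)."""
    channel = queue.Queue()
    received = []
    sync_count = 0

    def producer():
        for start in range(0, len(data), granularity):
            # One synchronisation per chunk, whatever its size.
            channel.put(data[start:start + granularity])
        channel.put(None)  # end-of-stream marker

    def consumer():
        nonlocal sync_count
        while True:
            chunk = channel.get()
            if chunk is None:
                break
            sync_count += 1
            received.extend(chunk)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received, sync_count


if __name__ == "__main__":
    for g in (1, 4, 10):
        items, syncs = transfer(list(range(10)), g)
        print("granularity", g, "->", syncs, "synchronisations")
```

A fine granularity lets the consumer start early but pays one synchronisation per element; a coarse one amortises that cost but delays the consumer, which is the trade-off a granularity parameter exposes.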

Amaury defended his PhD thesis on July 7, presenting the results of his research on this subject 19.

8.1.2 Locally abstract globally concrete semantics

Participants: Ludovic Henrio, Reiner Hähnle, Einar Broch Johnsen, Violet Ka I Pun, Crystal Chang Din, Lizeth Tapia Tarifa.

This research direction aims at designing a new way to write semantics for concurrent languages. The objective is to design semantics in a compositional way, where each primitive has a local behavior, and to adopt a style much closer to verification frameworks so that the design of an automatic verifier for the language is easier. The local semantics is expressed in a symbolic and abstract way; a global semantics gathers the abstract local traces and concretizes them. We have a reliable basis for the semantics of a simple language (a concurrent while language) and for a complex one (ABS), but the exact semantics and the methodology for writing it are still under development. After two meetings in 2019, this work slowed down in 2020 and 2021, partly because of Covid restrictions, but several visits of Reiner Hähnle to the CASH team allowed us to progress on the subject and to prepare a follow-up relating scheduling and LAGC. The separation of concerns in the LAGC semantics between state computation rules on one hand and scheduling rules on the other makes it possible to characterize fairness constructively at a semantic level and prove fairness of the scheduling at this level. This allowed us to characterise a new form of fairness and describe a scheduler at the programming language semantic level.

In 2023, the journal paper describing LAGC was accepted to TOPLAS after several revisions. It will be published in 2024. We finished the work on scheduling and had it accepted at <Programming>. It will be presented at the <Programming> conference in 2024 11.

This is joint work with Reiner Hähnle (TU Darmstadt), Einar Broch Johnsen, Crystal Chang Din, Lizeth Tapia Tarifa (Univ Oslo), Violet Ka I Pun (Univ Oslo and Univ of applied science Bergen).

8.1.3 Deterministic parallel programs

Participants: Ludovic Henrio, Einar Broch Johnsen, Violet Ka I Pun, Yannick Zakowski.

This research direction takes place through visits and remote meetings between Ludovic Henrio and our Norwegian colleagues. First results were published in 2021 on a simple static criterion for deterministic behaviour of active objects. We are now extending this work to be able to ensure deterministic behaviour in more cases and to lay a theoretical background that will make our results more general and easier to adapt to different settings. This year, we formalised in Coq a result by de Bruijn dating back to the 1970s on proving confluence of a system. In the process, we fixed some mistakes in the existing proof, and generalised it in a way that will make it even more useful in the context of programming language semantics. We continued to investigate the question of confluence for distributed programming languages based on this proof and extended the Coq framework with concurrent programming language use-cases. We expect to finish the mechanization and publish these results in 2024.
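
For readers unfamiliar with confluence, the property itself can be checked by brute force on a small finite system. The sketch below is only a definition made executable: it is unrelated to de Bruijn's proof technique and to the Coq development, and the dictionary encoding of the reduction relation is an assumption.

```python
def reachable(system, start):
    """All states reachable from `start` in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in system.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_confluent(system):
    """Confluence: whenever s ->* a and s ->* b, there is some c such
    that a ->* c and b ->* c.

    system: dict mapping a state to the list of its one-step successors."""
    states = set(system) | {t for targets in system.values() for t in targets}
    reach = {s: reachable(system, s) for s in states}
    for s in states:
        for a in reach[s]:
            for b in reach[s]:
                if not (reach[a] & reach[b]):
                    return False
    return True


if __name__ == "__main__":
    diamond = {"s": ["a", "b"], "a": ["c"], "b": ["c"]}
    fork = {"s": ["a", "b"]}
    print(is_confluent(diamond), is_confluent(fork))
```

In a confluent system the order in which independent steps are taken cannot change the final result, which is why confluence is a natural tool for arguing that a parallel program behaves deterministically.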

Note that the CASH team previously published a survey on parallelism and determinacy 6.

8.1.4 PNets: Parametrized networks of automata

Participants: Ludovic Henrio, Quentin Corradi, Eric Madelaine, Rabéa Ameur Boulifa.

pNets (parameterised networks of synchronised automata) are semantic objects for defining the semantics of composition operators and parallel systems. We have used pNets for the behavioral specification and verification of distributed components, and proved that open pNets (i.e. pNets with holes) were a good formalism to reason on operators and parameterized systems. This year, we finished the formalisation and proved the basic properties of a refinement theory for open pNets. These results were published and presented at SEFM 12.

8.1.5 A Survey on Verified Reconfiguration

Participants: Ludovic Henrio, Helene Coullon, Frederic Loulergue, Simon Robillard.

We have conducted a survey on the use of formal methods to ensure the safety of reconfiguration of distributed systems, that is to say the runtime adaptation of a deployed distributed software system. The survey article is written together with Hélène Coullon and Simon Robillard (IMT Atlantique, Inria, LS2N, UBL), and Frédéric Loulergue (Northern Arizona University). Hélène Coullon is the coordinator and the article was published in 2023 10.

8.1.6 Verified Compilation Infrastructure for Concurrent Programs

Participants: Nicolas Chappe, Ludovic Henrio, Yannick Zakowski.

The objective of this research direction is to provide semantic and reasoning tools for the formalization of concurrent programs and the verification of compilers for concurrent languages. In particular, we want to apply these results to the design of verified optimizing compilers for parallel high-level languages. We wish to proceed in the spirit of the approach advocated in Vellvm 8: compositional, modular, executable monadic interpreters based on Interaction Trees 66 are used to specify the semantics of the language, in contrast with more traditional transition systems. Proving optimizations correct for such concurrent languages naturally requires new proof techniques that we need to design as well. Last year saw the successful publication of the ctrees project. This year's major contributions in this line of work are:

  • Nicolas Chappe has made major contributions to the library. In particular, he has identified an alternate definition of strong bisimilarity and strong similarity enjoying better proof principles. He has furthermore continued his investigations in applying these semantic methods to modelling concurrent programs under weak memory models, modelling a minimal version of Vellvm as a first step towards scaling these methods. We plan a submission covering his work during the first half of 2024.
  • In collaboration with Lef Ioannidis and Steve Zdancewic, we have formalized in Coq a CTL logic, and showed how ctrees can be instantiated as a model of the logic. Intuitively, CTL formulas are speculated to be convenient abstractions to express protocols on the external events that a given computation may exhibit at run time. We are ironing out relevant applications of the approach, with a view to a submission during the year 2024.

8.1.7 Operational Game Semantics

Participants: Peio Borthelle, Tom Hirschowitz, Guilhem Jaber, Yannick Zakowski.

Peio Borthelle, PhD student at the Lama in Chambéry co-advised by Tom Hirschowitz, Guilhem Jaber, and Yannick Zakowski, works on the formalization in Coq of Operational Game Semantics (OGS). OGS is a technique used to define sufficient conditions for proving the contextual equivalence of higher-order programs, in which names are exchanged in lieu of higher-order values. This year has seen major breakthroughs in Peio's work. A complete, axiom-free development has been achieved. In this development, a notion of OGS is defined over an abstract, axiomatic notion of programming language. The bisimilarity of the resulting OGS is proved to be sound w.r.t. contextual equivalence. Finally, examples such as System L are proved to satisfy the interface. In terms of dissemination, this work has led to:

  • A first communication at the TYPES'23 workshop titled "Games and Strategies using Coinductive Types"
  • An upcoming second communication this month (January 2024) at the GALOP'24 workshop titled "An abstract, certified account of Operational Game Semantics"
  • The writing of a paper that will be submitted this month (January 2024) to LICS'24

8.1.8 Foundational support to datatypes and codatatypes in Coq

Participants: Galaad Langlois, Damien Pous, Yannick Zakowski.

Libraries such as the interaction trees or the choice trees that Yannick Zakowski develops rely heavily on coinductive datatypes, and functions building values of these codatatypes. Coq's support is, in some respects, lacking in this realm, putting an excessive burden on the shoulders of the programmer/mathematician. While libraries such as Pous's coinduction library help greatly in proving coinductive properties, no support exists in Coq to help write corecursive functions.

Damien Pous and Yannick Zakowski have advised Galaad Langlois as part of his Master 2 internship to build a first contribution in this direction. The internship has resulted in a library formalizing a class of functors, so-called polynomial functors as spawned by containers, for which we build the initial algebra (the associated Inductive datatype) and the final coalgebra (the associated CoInductive datatype). The construction is done in the category of setoids, allowing for a completely axiom-free result. This result was presented by Galaad at the Coq Workshop 2023.

8.1.9 Actors and algebraic effects

Participants: Martin Andrieux, Ludovic Henrio, Gabriel Radanne.

This work aims to understand the link between two constructions. Actors, on the one hand, aim to provide high-level language constructs for concurrency and parallelism. They have been implemented and successfully used in several industry-grade frameworks, such as Akka. Algebraic effects, on the other hand, allow the precise modelling of operations with effects, while providing excellent composition properties. They have been used both as a fundamental primitive for theoretical study and as effective building blocks to create new complex control and effectful operators. The new version of OCaml with multicore support promotes the use of algebraic effects to implement new concurrency primitives. We implement actors using algebraic effects, and obtain a practical, efficient implementation of actors for OCaml. In 2022, we designed such an embedding and implemented it as a proof-of-concept library 7.1.16 using multicore OCaml.

In 2023, we formalised the embedding and proved the correctness of our implementation relative to the actor model. We wrote an article describing the library and its formalisation 20. It will be published by Springer in a special volume in 2024.

We now aim at extending the library to make it more versatile.

8.2 Research direction 2: Expressive, Scalable and Certified Analyses

8.2.1 Verification of electric properties on transistor-level descriptions of circuits, using formal methods

Participants: Oussama Oulkaid, Bruno Ferres, Ludovic Henrio, Matthieu Moy, Gabriel Radanne.

We started discussions with the Aniah start-up in 2019, and started a formal partnership in 2022, with the recruitment of Bruno Ferres as a post-doc, and Oussama Oulkaid as a CIFRE Ph.D (co-supervised by Aniah, Verimag, and LIP). We developed a prototype verification tool. The tool compiles transistor-level circuit descriptions (CDL file format) to a logical formula expressing the semantics of the circuit plus a property to verify, and uses an SMT solver (Z3) to check the validity of the property. The tool was successfully used on a real-life case study, and we showed that our approach can significantly reduce the number of false alarms compared to traditional approaches, with a reasonable computational cost (under a second for most sub-circuits analyzed). To the best of our knowledge, formal methods like SAT/SMT-solving had never been applied to multi-supply electronic circuits before. We published a short paper presenting these results in the “late breaking results” track of the DATE 2023 conference 15, and got a longer version of the paper accepted for DATE 2024. The technique experimented with in the prototype was successfully re-implemented in the production tool commercialized by Aniah and is now available in the latest release.

In parallel with the technical work, we conducted a thorough review of existing work on the domain, and submitted a survey article to the TODAES journal.

We are currently working on richer semantics able to take into account more properties on the circuits under analysis.

8.2.2 Search functions by types

Participants: Gabriel Radanne, Laure Gonnord, Clement Allain, Pauline Garelli, Emmanuel Arrighi.

Dowsindex is a tool that allows searching in a collection of libraries using types as queries. Given a type, the tool returns a list of functions whose type can be unified with the query modulo isomorphisms. Using unification allows the returned type to be more general than the query, and the isomorphisms abstract away some details of the implementation, for example, the order of the arguments of functions. Unfortunately, algorithms for unification modulo type isomorphisms are costly (at best NP). An exhaustive search would not be usable in practice during programming.
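To make the role of isomorphisms concrete, here is a minimal Python sketch. It is not the tool's implementation: it assumes ground types only, so equality of normal forms stands in for unification, and it covers just two isomorphisms (currying and argument order). The type encoding, the `LIBRARY` table and all function names are hypothetical.

```python
# Types are either a string (base type) or a tuple:
#   ("->", argument, result)   function type
#   ("*", left, right)         pair type

def product_components(ty):
    """Flatten nested pairs into a list of normalized components."""
    if isinstance(ty, tuple) and ty[0] == "*":
        return product_components(ty[1]) + product_components(ty[2])
    return [normalize(ty)]

def normalize(ty):
    """Canonical form that is invariant under currying and argument order."""
    if isinstance(ty, str):
        return ty
    if ty[0] == "*":
        return ("tuple", tuple(sorted(product_components(ty), key=repr)))
    # Function type: uncurry, collecting every argument along the way.
    args = []
    while isinstance(ty, tuple) and ty[0] == "->":
        args += product_components(ty[1])
        ty = ty[2]
    return ("fun", tuple(sorted(args, key=repr)), normalize(ty))

# Hypothetical library: function name -> type.
LIBRARY = {
    "repeat": ("->", "string", ("->", "int", "string")),
    "nth": ("->", "int list", ("->", "int", "int")),
    "length": ("->", "string", "int"),
}

def search(query, library=LIBRARY):
    """Names of the functions whose type is isomorphic to the query."""
    key = normalize(query)
    return sorted(name for name, ty in library.items() if normalize(ty) == key)

# The query "int * string -> string" finds the curried, reordered "repeat".
print(search(("->", ("*", "int", "string"), "string")))
```

This sketch still compares the query against every library entry, which is the exhaustive search that does not scale to large collections of libraries.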

In this research direction, we investigate how to scale search by types. For this purpose, we developed new algorithmic techniques similar to indexes used in databases, but appropriate for keys following a rich language of types. We have developed a prototype, Dowsing 7.1.13, implementing these ideas. In 2023, Emmanuel Arrighi started a postdoc on this topic, established benchmarks for empirical experiments, and started working on the algorithms for unification modulo isomorphisms.

8.2.3 A new module system for OCaml

Participants: Clement Blaudeau, Didier Remy, Gabriel Radanne.

ML modules offer large-scale notions of composition and modularity. Provided as an additional layer on top of the core language, they have proven both vital to working OCaml and SML programmers, and inspiring to other use-cases and languages. Unfortunately, their meta-theory remains difficult to comprehend, requiring heavy machinery to prove their soundness.

In this research direction, we study a translation from ML modules to Fω to provide a new comprehensive description of a generative subset of OCaml modules, embarking on a journey right from the source OCaml module system, up to Fω, and back. We propose a “middle representation” called canonical that combines the best of both worlds. Our goal is to obtain type soundness, but also and more importantly, a deeper insight into the signature avoidance problem, along with ways to improve both the OCaml language and its typechecking algorithm. In 2023, we developed a full account of both the "applicative" and "generative" cases (both of which are present in OCaml). We recently published the generative case 17 and wrote an article covering the full language.

8.3 Research direction 3: Optimizing Program Transformations

8.3.1 Memory optimizations for Algebraic Data Types

Participants: Thaïs Baudon, Gabriel Radanne, Laure Gonnord.

In the last few decades, Algebraic Data Types (ADT) have emerged as an incredibly effective tool to model and manipulate data for programming. Additionally, ADTs could provide numerous advantages for optimizing compilers, as the rich declarative description could allow them to choose the memory representation of the types.

Initially, ADTs were mostly present in functional programming languages such as OCaml and Haskell. Such GC-managed functional languages generally use a uniform memory representation, which prohibits aggressive optimisations of the representation of ADTs. However, ADTs are now present in many different languages, notably Scala and Rust, which permit such optimizations.

The goal of this research direction is to investigate how to represent terms of Algebraic Data Types and how to compile pattern matching efficiently. We aim to develop a generic compilation framework which accommodates arbitrarily complex memory representations for terms, and to provide new ways to optimize the representation of ADTs. A prototype compiler has been implemented 7.1.10. In 2023, we developed a language with a dual view of types: the high-level view of algebraic data types and their memory layout. We published this language, its formalization, and the compilation algorithms at ICFP 13. We then extended this setup with more complex types 24 and a richer compilation algorithm.
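As an illustration of what a custom memory representation can look like, the following Python sketch packs a small ADT into a single machine word by storing the constructor tag in the low bits of the word. This is a generic bit-stealing layout chosen for illustration, not the layout language or the compilation algorithm of the published work. The type `Empty | Small of int | Ref of address`, the 3-bit tag width and all names are assumptions, and addresses are assumed to be 8-byte aligned and integers non-negative.

```python
TAG_BITS = 3                      # assumed: addresses are 8-byte aligned
TAG_MASK = (1 << TAG_BITS) - 1
TAG_EMPTY, TAG_SMALL, TAG_REF = 0, 1, 2

def encode(value):
    """High-level view -> memory word (a Python int stands in for the word)."""
    kind = value[0]
    if kind == "Empty":
        return TAG_EMPTY
    if kind == "Small":
        # The payload lives in the upper bits, the tag in the low bits.
        return (value[1] << TAG_BITS) | TAG_SMALL
    if kind == "Ref":
        address = value[1]
        if address & TAG_MASK:
            raise ValueError("address must be 8-byte aligned")
        # The low bits of an aligned address are free: steal them for the tag.
        return address | TAG_REF
    raise ValueError(f"unknown constructor {kind!r}")

def decode(word):
    """Memory word -> high-level view; this is what pattern matching inspects."""
    tag = word & TAG_MASK
    if tag == TAG_EMPTY:
        return ("Empty",)
    if tag == TAG_SMALL:
        return ("Small", word >> TAG_BITS)
    if tag == TAG_REF:
        return ("Ref", word & ~TAG_MASK)
    raise ValueError(f"unknown tag {tag}")

for v in [("Empty",), ("Small", 5), ("Ref", 0x1000)]:
    assert decode(encode(v)) == v
```

Pattern matching on such a value reduces to a mask and a comparison on the word, with no extra indirection for `Empty` and `Small`.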

8.3.2 Vellvm: Verified LLVM

Participants: Calvin Beck, Irene Yoon, Yannick Zakowski, Steve Zdancewic.

We develop, in collaboration with the University of Pennsylvania, a compilation infrastructure based on LLVM and formally verified in Coq, dubbed Vellvm 7.1.5. Compared to other existing verified compilation frameworks, we define the semantics of the languages we consider as monadic interpreters built on top of the Interaction Trees framework. This approach brings us benefits in terms of modularity, compositionality and executability, and leads to an equational mode of reasoning to establish refinements. The following major achievements have taken place this year:

  • The major redefinition of the memory model, led by Calvin Beck and accounting for the necessary finite view of the memory imposed by the presence of pointer to integer casts in LLVM IR, is essentially complete. We are ironing out the last details in preparation for a submission to ICFP'24.
  • Spiral is a compilation framework for the generation of efficient low-level code for numerical computations. HELIX is a formalization in Coq of part of this framework that Vadim Zaliva developed during his PhD. We have finally finished the last details of the proof of the compilation chain, and written a journal paper describing the project. We will submit it at the end of January 2024 to the TOPLAS journal.

8.3.3 Verified Abstract Interpreters as Monadic Interpreters

Participants: Laure Gonnord, Sébastien Michelland, Yannick Zakowski.

In the realm of verified compilation, one typically wants to verify the static analyses used by the compiler. In existing works, the analysis is typically written as a fuel-based pure function in Coq and verified against the semantics described as a transition system. The goal of this research is to develop the tools and reasoning principles to transfer these ideas to a context where the semantics of the language is defined as a monadic interpreter built on Interaction Trees.

During his internship, Sébastien Michelland had developed a first promising prototype, establishing a highly modular framework to build and prove correct such analyses. He has instantiated his result on a toy Imp language, and is now aiming at instantiating it on a toy assembly language. This year, this project has seen major progress: a fully fledged, admit-free, development has been achieved with the construction of an abstract interpreter for both an imperative language with failure and a CFG-based, assembly-style, language. We have written a paper 23 describing these contributions and are considering a submission as is or with further extensions during the first semester of 2024.

8.3.4 A verified CompCert backend for OptiTrust

Participants: Nicolas Nardino, Arthur Chargueraud, Yannick Zakowski.

OptiTrust is an ANR led by Arthur Chargueraud, of which Yannick Zakowski is a participant. The project revolves around a DSL for writing program optimisations for high performance, highly parallel, code. A functional OCaml prototype is already used to perform ambitious case studies of source to source optimization of C code thanks to this DSL. The ANR revolves around providing foundational soundness guarantees to the tool.

In particular, Nicolas Nardino has done a Master 2 internship revolving around the compilation from the programming language used as the internal representation for the source programs upon which the optimizations are performed (essentially a rich imperative lambda calculus) to (one of the languages of) CompCert, a verified C compiler. A partial prototype has been developed during this internship.

8.3.5 Scalable Array Contraction using Trace-Based Polyhedral Analysis

Participants: Hugo Thievenaz, Keiji Kimura, Christophe Alias.

In this work, we defend the iconoclastic idea that polyhedral optimizations might be computed without expensive polyhedral operations, simply by applying a lightweight analysis on a few off-line execution traces. The main intuition is that, since polyhedral transformations are expressed as affine mappings, only a few points are required to infer the general mapping. Our hope is to compute those points from a few off-line execution traces. We focus on array contraction, a well-known technique that reallocates temporary arrays through affine mappings so that the array size is reduced. We describe a trace selection algorithm, a liveness algorithm operating on an execution trace, and another algorithm that computes the maximum number of variables alive along a dimension, from which we get our scalar modular mappings. We show that a simple interpolation allows us to infer the modulo mapping.
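The idea of deriving a modular mapping from a trace can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithms of this work: it handles a one-dimensional temporary array and a single trace, and it sizes the mapping i ↦ i mod m from the largest index distance between two simultaneously live cells, which is a classical criterion. The trace format and all function names are hypothetical.

```python
def stencil_trace(n):
    """Trace of: for i in range(n): a[i] = ...; use a[i] and a[i-1]."""
    trace = []
    for i in range(n):
        trace.append(("W", i))
        trace.append(("R", i))
        if i >= 1:
            trace.append(("R", i - 1))
    return trace

def live_intervals(trace):
    """Map each cell to (time of first write, time of last read)."""
    first_write, last_read = {}, {}
    for t, (op, i) in enumerate(trace):
        if op == "W":
            first_write.setdefault(i, t)
        else:
            last_read[i] = t
    return {i: (w, last_read.get(i, w)) for i, w in first_write.items()}

def contraction_modulus(trace):
    """Smallest m such that two simultaneously live cells never collide mod m,
    computed as 1 + the largest index distance between overlapping cells."""
    iv = live_intervals(trace)
    width = 0
    for a in iv:
        for b in iv:
            if a < b and iv[a][0] <= iv[b][1] and iv[b][0] <= iv[a][1]:
                width = max(width, b - a)
    return width + 1

def replay_ok(trace, m):
    """Replay the trace on a buffer of size m and check that every read
    still sees the value written to the original cell."""
    buf = [None] * m
    for op, i in trace:
        if op == "W":
            buf[i % m] = i
        elif buf[i % m] != i:
            return False
    return True

m = contraction_modulus(stencil_trace(6))
print(m, replay_ok(stencil_trace(6), m))
```

On this stencil the modulus is the same for every trace length of at least two iterations, which is the kind of regularity that makes inference from a few small traces plausible.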

This year, we have validated the scalability of our approach on real life benchmarks from the high-level synthesis world. A journal publication is under preparation.

8.3.6 Partial Evaluation of Dense Code on Sparse Structures

Participants: Alec Sadler, Gabriel Dehame, Christophe Alias.

Most HPC computations process sparse tensors. The resulting code is highly dynamic, which makes code optimization quite challenging. One way is to start from the original dense specification, which is usually much more regular and ready to be optimized thanks to state-of-the-art program optimization algorithms, and then to specialize that code on the sparse input structure. We propose a novel approach to apply that specialization. The key ingredient of our algorithm is the transitive closure of affine relations, for which efficient and accurate heuristics exist. Experimental evaluation shows the effectiveness of our approach.
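The following Python sketch illustrates only the general principle of specializing a dense kernel on a known sparsity pattern. It enumerates the non-zero positions explicitly and does not use the transitive closure of affine relations on which the proposed algorithm relies. The matrix-vector kernel and all names are hypothetical choices made for the example.

```python
def dense_matvec(A, x):
    """Dense specification: regular loop nest over all (i, j)."""
    n, m = len(A), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += A[i][j] * x[j]
    return y

def specialize_matvec(pattern, n):
    """Partially evaluate the dense loop nest on a known non-zero pattern.

    pattern: iterable of (i, j) positions that may hold a non-zero value.
    Returns a residual kernel that only visits those iterations."""
    schedule = sorted(pattern)  # same relative order as the dense loops

    def residual(values, x):
        y = [0.0] * n
        for i, j in schedule:
            y[i] += values[(i, j)] * x[j]
        return y

    return residual

A = [[2, 0, 0],
     [0, 0, 3],
     [0, 4, 0]]
x = [1, 2, 3]
values = {(i, j): v for i, row in enumerate(A) for j, v in enumerate(row) if v != 0}

kernel = specialize_matvec(values.keys(), n=len(A))
assert kernel(values, x) == dense_matvec(A, x)
```

The residual kernel performs three multiply-adds here instead of the nine of the dense loop nest, while keeping the iteration order of the dense specification.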

Our results are available as a research report 21. They are still under publication.

8.4 Research direction 4: Simulation and Hardware

8.4.1 S4BXI: the MPI-ready Portals 4 Simulator

Participants: Julien Emmanuel, Matthieu Moy, Ludovic Henrio, Grégoire Pichon.

We present a simulator for High Performance Computing (HPC) interconnection networks. It models Portals 4, a standard low-level API for communication, and it allows running unmodified applications that use higher-level network APIs such as the Message Passing Interface (MPI). It is based on SimGrid, a framework used to build single-threaded simulators based on a cooperative actor model. Unlike existing tools like SMPI, we rely on an actual MPI implementation, hence our simulation takes into account MPI's implementation details in the performance. We applied the approach on a case study using the BullSequana eXascale Interconnect (BXI) made by Atos, which highlights how such a simulator can help design space exploration (DSE) for new interconnects. The Ph.D of Julien Emmanuel on the topic was defended in early 2023.

9 Bilateral contracts and grants with industry

9.1 Partnership with the Aniah startup on circuit verification

Participants: Bruno Ferres, Matthieu Moy, Ludovic Henrio, Gabriel Radanne, Oussama Oulkaid.

The CASH team started discussions with the Aniah startup in 2019, to work on verification of electrical properties of circuits at transistor level. We recruited a post-doc (Bruno Ferres) in March 2022, and formalized the collaboration with a bilateral contract (Réf. Inria : 2021-1144), in parallel with a joint internship with LIP, Verimag laboratory and Aniah (Oussama Oulkaid), which led to a CIFRE Ph.D (LIP/Verimag/Aniah) started on October 1st, 2022. The collaboration led to the development of a prototype tool, which served as the basis for the re-implementation of the approach in the production tool, and to two articles accepted at the DATE conference plus one ongoing submission.

9.2 CAVOC Project with Inria/Nomadic Labs

Participants: Guilhem Jaber, Gabriel Radanne, Laure Gonnord.

This project aims to develop a sound and precise static analyzer for OCaml that can catch large classes of bugs represented by uncaught exceptions. It will deal with both user-defined exceptions, and built-in ones used to represent error behaviors, like the ones triggered by failwith, assert, or a match failure. Via “assert-failure” detection, it will thus be able to check that invariants annotated by users hold. The analyzer will reason compositionally on programs, in order to analyze them at the granularity of a function or of a module. It will be sound in a strong way: if an OCaml module is considered to be correct by the analyzer, then one will have the guarantee that no OCaml code interacting with this module can trigger uncaught exceptions coming from the code of this module. In order to be precise, it will take into account the abstraction properties provided by the type system and the module system of the language: local values, abstracted definition of types, parametric polymorphism. The goal is that most of the interactions taken into account correspond to typeable OCaml code.

This project is part of the partnership between Inria and Nomadic Labs, and is led by Guilhem Jaber, from the Inria Team Galinette.

10 Partnerships and cooperations

10.1 International initiatives

10.1.1 Participation in other International Programs

Polytrace Exploratory Action

Participants: Christophe Alias, Keiji Kimura.

  • Title:
    Polytrace – Scaling Polyhedral Compilers with Trace Analysis
  • Partner Institution(s):
    Waseda University, Japan
  • Date/Duration:
    4 years, until December 2024.
  • Additional info/keywords:
    Compilers, HPC, Polyhedral Model, Trace Analysis

10.2 International research visitors

10.2.1 Visits of international scientists

Other international visits to the team
Violet Ka I Pun and Einar Broch Johnsen
  • Status
    (professor/assistant professor)
  • Institution of origin:
    Univ of applied science(Bergen) and Univ of Oslo
  • Country:
  • Dates:
  • Context of the visit:
  • Mobility program/type of mobility:
    research stay
Reiner Hähnle
  • Status
  • Institution of origin:
    Technische Universität Darmstadt
  • Country:
  • Dates:
  • Context of the visit:
  • Mobility program/type of mobility:
    research stay

10.2.2 Visits to international teams

Research stays abroad
Yannick Zakowski
  • Visited institution:
    University of Pennsylvania
  • Country:
  • Dates:
    August 1st 2023 to October 15th 2023
  • Context of the visit:
    ongoing scientific collaborations with Steve Zdancewic around the Vellvm project
  • Mobility program/type of mobility:
    Research stay
Christophe Alias
  • Visited institution:
    Waseda University
  • Country:
  • Dates:
    October 30 to December 4
  • Context of the visit:
    Polytrace exploratory action
  • Mobility program/type of mobility:
    Research stay

10.3 National initiatives

PEPR NumPex, ExaSoft Project (WP2, Task 2.3)

Participants: Alec Sadler, Christophe Alias, Thierry Gautier, Xavier Rival, Philippe Clauss.

  • Title:
    Polysparse – Compiling Sparse Kernels by Specialization
  • Partner Institution(s):
    Inria Paris (X. Rival), Inria Nancy (P. Clauss)
  • Date/Duration:
    6 years, started this year.
  • Additional info/keywords:
    Compilers, HPC, Polyhedral Model, Sparse Computation

11 Dissemination

11.1 Promoting scientific activities

The team participated in a DECLICS (Dialogues Entre Chercheurs et Lycéens pour les Intéresser à la Construction des Savoirs) meeting with high-school students (Lycée Saint-Just). Matthieu Moy was “Capitaine” and Yannick Zakowski and Emmanuel Arrighi were “Ambassadors”.

11.1.1 Scientific events: organisation

Ludovic Henrio organised a workshop on active objects and the ABS language in Lyon.

General chair, scientific chair

Ludovic Henrio is a member of the steering committee of ICE workshops.

11.1.2 Scientific events: selection

Member of the conference program committees
  • Gabriel Radanne was a PC member for JFLA'23, ICFP'23
  • Yannick Zakowski was a PC member for CoqPL'24, GALOP'24, JFLA'24, POPL'24
Member of the Artifact Evaluation committees
  • Nicolas Chappe was a member of the Artifact Evaluation Committee for POPL'24

11.1.3 Journal

  • Christophe Alias was a reviewer for PARCO, TETC and TRETS.
  • Gabriel Radanne was an external reviewer for JFP (1 paper)
  • Matthieu Moy was a reviewer for TACO and TECS.

11.1.4 Conferences

  • Christophe Alias was a reviewer for COMPAS'23.

11.1.5 Leadership within the scientific community

  • Ludovic Henrio is co-leader, with Kévin Martin, of the GdT CLAP within the GDR GPL.
  • Laure Gonnord is responsible for "Ecole des Jeunes Chercheurs en Programmation" in the GdR GPL.

11.1.6 Scientific expertise

  • Christophe Alias is scientific advisor for the Xtremlogic startup.
  • Ludovic Henrio was a member of the evaluation committee of LIPN laboratory (Dec 2023).

11.1.7 Research administration

  • Ludovic Henrio is a member of the "commission recherche" of labex Milyon.

11.2 Teaching - Supervision - Juries

11.2.1 Teaching

Nicolas Chappe published an article about the online platform that allows teaching programming languages at ENS Lyon 14.

  • Christophe Alias: "Compilation", INSA CVL 3A, cours+TD, 27h ETD.
  • Hugo Thievenaz: "Algorithmique programmation impérative, initiation", L1 UCBL, TP, 18h ETD.
  • Hugo Thievenaz: "Bases de l'architecture pour la programmation", L1 UCBL, TP, 24h ETD.
  • Matthieu Moy: Responsible for the “licence d'informatique UCBL”. “Programmation concurrente”, L3 UCBL, 26h; “Projet Informatique”, L3 UCBL, 9.5h; “Systèmes d'exploitation”, L2 UCBL, 25h.
  • Nicolas Chappe: "Architecture des ordinateurs", L2 UCBL, 24h TP
  • Amaury Maillé: "Algorithmique et Programmation Orientée Objet", L3 UCBL, TD, 14h EQTD
  • Amaury Maillé: "Algorithmique et Programmation Orientée Objet", L3 UCBL, TD+TP, 32.16h EQTD
  • Amaury Maillé: "Programmation Logique", L2 UCBL, TP, 12h EQTD
  • Amaury Maillé: "Algorithmique, programmation impérative, initiation", L1 UCBL, TD+TP, 1.67h EQTD
  • Amaury Maillé: "Algorithmique, programmation et structures de données", L2 UCBL, TD+TP, 24h EQTD
  • Amaury Maillé: "Architecture des ordinateurs", L2 UCBL, TP, 16h EQTD
  • Emmanuel Arrighi: "Théorie de la Programmation", L3 ENS, TP
  • Emmanuel Arrighi: "Algorithmique 1", L3 ENS, TP
Master 1
  • Christophe Alias: "Optimisation des applications embarquées", INSA CVL 4A, cours+TD, 27h ETD.
  • Christophe Alias: "Compilation", Préparation à l'agrégation d'informatique, ENS-Lyon, cours, 18h ETD.
  • Hugo Thievenaz: "Compilation / traduction des programmes", M1 UCBL, TD+TP, 22.5h ETD.
  • Thaïs Baudon: "Compilation", Préparation à l'agrégation d'informatique, ENS-Lyon, TP, 10h ETD.
  • Matthieu Moy: “Compilation et traduction des programmes”, M1 UCBL, responsible, 31h; “Gestion de projet et génie logiciel”, M1 UCBL, responsible, 32h; “Projet pour l'Orientation en Master”, M1 UCBL, 2 students supervised.
  • Gabriel Radanne, Ludovic Henrio and Hugo Thievenaz: "Compilation and Analysis" (CAP), ENS-Lyon, Master d'Informatique Fondamentale, cours, 48h CM + 28h TD, 64h ETD.
  • Gabriel Radanne: “Projet”, M1 ENS-Lyon, 9 students supervised.
  • Gabriel Radanne: TP "Programmation Fonctionnelle", L3 UCBL, 16h ETD
  • Thaïs Baudon: TD de compilation, préparation à l'agrégation d'informatique, ENS de Lyon, 10h ETD.
Master 2
  • Christophe Alias: "Polyhedral Compilation: A Short Survey", Waseda University, 3 lectures of 2h.
  • Yannick Zakowski: "Program Verification with Coinduction and Proof Assistants", ENS-Lyon, cours, 24h ETD.
  • Ludovic Henrio: University of Nice Sophia Antipolis, M2 Ubinet. "an algorithmic approach to distributed systems" 3h30 CM+TD

11.2.2 Supervision

  • Christophe Alias co-advises the PhD thesis of Hugo Thievenaz with Keiji Kimura (Waseda University).
  • Christophe Alias advises the PhD thesis of Alec Sadler with the collaboration of Thierry Gautier, Xavier Rival (Inria Paris) and Philippe Clauss (Inria Strasbourg).
  • Ludovic Henrio and Yannick Zakowski co-advise the PhD thesis of Nicolas Chappe.
  • Gabriel Radanne and Laure Gonnord co-advise the PhD thesis of Thaïs Baudon.
  • Matthieu Moy co-advises the PhD thesis of Oussama Oulkaid with Pascal Raymond (Verimag), Mehdi Khosravian (Aniah), and Bruno Ferres (Verimag).

11.2.3 Defended Ph.D

  • Julien Emmanuel: "A full stack simulator for HPC: Multi-level modelling of the BXI interconnect to predict the performance of MPI applications" 18, directed by Ludovic Henrio and Matthieu Moy.
  • Amaury Maillé: "Simple, Safe and Efficient Abstractions for Communication and Streaming in Parallel Computing" 19, directed by Ludovic Henrio and Matthieu Moy

11.2.4 Juries

  • Christophe Alias was examiner for the "oral d'informatique du second concours de l'ENS de Lyon" competitive examination.
  • Christophe Alias was "correcteur" for the "X/ENS filière PSI" competitive examination.
  • Christophe Alias was a specialist member for the CSI of Clément Rosseti (advisor: Philippe Clauss).
  • Gabriel Radanne was jury for the "Épreuve d'algorithmique du concours de l'ENS".

11.3 Popularization

11.3.1 Education

  • Thaïs Baudon: Design of unplugged activities to popularize computer science and mathematics for the general public, Maison des Mathématiques et de l'Informatique (MMI), 32h ETD.
  • Thaïs Baudon: "Maths en Jeans" with Joël Felderhoff and Daniel Hirschkoff

12 Scientific production

12.1 Major publications

  • 1 Christophe Alias and Alexandru Plesco. Data-Aware Process Networks. CC 2021 - 30th ACM SIGPLAN International Conference on Compiler Construction, Virtual, South Korea, ACM, March 2021, 1-11.
  • 2 Thaïs Baudon, Gabriel Radanne and Laure Gonnord. Bit-Stealing Made Legal: Compilation for Custom Memory Representations Of Algebraic Data Types. Proceedings of the ACM on Programming Languages (ICFP), ICFP 2023, Seattle (USA), United States, September 2023.
  • 3 Gabriel Busnot, Tanguy Sassolas, Nicolas Ventroux and Matthieu Moy. Standard-compliant parallel SystemC simulation of loosely-timed transaction level models: From baremetal to Linux-based applications support. Integration, the VLSI Journal, 79, July 2021, 23-40.
  • 4 Nicolas Chappe, Paul He, Ludovic Henrio, Yannick Zakowski and Steve Zdancewic. Choice Trees: Representing Nondeterministic, Recursive, and Impure Programs in Coq. Proceedings of the ACM on Programming Languages, January 2023, 1-31.
  • 5 Nicolas Chappe, Ludovic Henrio, Amaury Maillé, Matthieu Moy and Hadrien Renaud. An Optimised Flow for Futures: From Theory to Practice. The Art, Science, and Engineering of Programming, 6(1), July 2021, 1-41.
  • 6 Laure Gonnord, Ludovic Henrio, Lionel Morel and Gabriel Radanne. A Survey on Parallelism and Determinism. ACM Computing Surveys, September 2022.
  • 7 Reiner Hähnle and Ludovic Henrio. Provably Fair Cooperative Scheduling. The Art, Science, and Engineering of Programming, 8(2), October 2023.
  • 8 Yannick Zakowski, Calvin Beck, Irene Yoon, Ilia Zaichuk, Vadim Zaliva and Steve Zdancewic. Modular, compositional, and executable formal semantics for LLVM IR. Proceedings of the ACM on Programming Languages, 5(ICFP), August 2021, 1-30.

12.2 Publications of the year

International journals

International peer-reviewed conferences

National peer-reviewed Conferences

  • 16 Jean Abou-Samra, Yannick Zakowski and Martin Bodin. Effectful Programming across Heterogeneous Computations - Work in Progress. Journées Francophones des Langages Applicatifs, JFLA 2023 - 34èmes Journées Francophones des Langages Applicatifs, Praz-sur-Arly, France, 2023, 7-23.
  • 17 Clément Blaudeau, Didier Rémy and Gabriel Radanne. Retrofitting OCaml modules: Fixing signature avoidance in the generative case. Journées Francophones des Langages Applicatifs, JFLA 2023 - 34èmes Journées Francophones des Langages Applicatifs, Praz-sur-Arly, France, January 2023, 59-100.

Doctoral dissertations and habilitation theses

  • 18 Julien Emmanuel. A full stack simulator for HPC: Multi-level modelling of the BXI interconnect to predict the performance of MPI applications. Université Claude Bernard Lyon 1, March 2023.
  • 19 Amaury Maillé. Simple, Safe and Efficient Abstractions for Communication and Streaming in Parallel Computing. Ecole normale supérieure de lyon - ENS LYON, July 2023.

Reports & preprints

12.3 Cited publications

  • 25 (inproceedings) Christophe Alias, Alain Darte, Paul Feautrier and Laure Gonnord. Multi-dimensional Rankings, Program Termination, and Complexity Bounds of Flowchart Programs. International Static Analysis Symposium (SAS'10), 2010.
  • 26 (inproceedings) Christophe Alias and Alexandru Plesco. Data-Aware Process Networks. Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction, CC 2021, New York, NY, USA, Virtual, Republic of Korea, Association for Computing Machinery, 2021, 1-11. URL: https://doi.org/10.1145/3446804.3446847
  • 27 (techreport) Christophe Alias and Alexandru Plesco. Data-aware Process Networks. RR-8735, Inria - Research Centre Grenoble - Rhône-Alpes, June 2015, 32.
  • 28 (misc) Christophe Alias and Alexandru Plesco. Method of Automatic Synthesis of Circuits, Device and Computer Program associated therewith. Patent n° FR1453308, April 2014.
  • 29 (article) Ihab Amer, Christophe Lucarz, Ghislain Roquier, Marco Mattavelli, Mickael Raulet, J-F Nezan and Olivier Deforges. Reconfigurable video coding on multicore. Signal Processing Magazine, IEEE, 26(6), 2009, 113-123. URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5230810
  • 30 (mastersthesis) Scott Ananian. The Static Single Information Form. MA Thesis, MIT, September 1999.
  • 31 (inproceedings) Cédric Ben Aoun, Liliana Andrade, Torsten Maehne, François Pêcheux, Marie-Minerve Louërat and Alain Vachoux. Pre-simulation elaboration of heterogeneous systems: The SystemC multi-disciplinary virtual prototyping approach. Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), 2015 International Conference on, IEEE, 2015, 278-285.
  • 32 (inproceedings) Pascal Aubry, Pierre-Edouard Beaucamps, Frédéric Blanc, Bruno Bodin, Sergiu Carpov, Loïc Cudennec, Vincent David, Philippe Doré, Paul Dubrulle, Benoît Dupont De Dinechin, François Galea, Thierry Goubier, Michel Harrand, Samuel Jones, Jean-Denis Lesage, Stéphane Louise, Nicolas Morey Chaisemartin, Thanh Hai Nguyen, Xavier Raynaud and Renaud Sirdey. Extended Cyclostatic Dataflow Program Compilation and Execution for an Integrated Manycore Processor. Alchemy 2013 - Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems, vol. 18, Proceedings of the International Conference on Computational Science, ICCS 2013, Barcelona, Spain, June 2013, 1624-1633.
  • 33 (article) Denis Becker, Matthieu Moy and Jérôme Cornet. Parallel Simulation of Loosely Timed SystemC/TLM Programs: Challenges Raised by an Industrial Case Study. MDPI Electronics, 5(2), 2016, 22. URL: https://hal.archives-ouvertes.fr/hal-01321055
  • 34 (book) Denis Caromel and Ludovic Henrio. A Theory of Distributed Objects. Springer-Verlag, 2004.
  • 35 (article) Nathanaël Courant and Xavier Leroy. Verified Code Generation for the Polyhedral Model. Proceedings of the ACM on Programming Languages, 5(POPL), January 2021, 40:1-40:24.
  • 36 (inproceedings) Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. 4th ACM Symposium on Principles of Programming Languages (POPL'77), Los Angeles, January 1977, 238-252.
  • 37 (article) Frank De Boer, Vlad Serbanescu, Reiner Hähnle, Ludovic Henrio, Justine Rochas, Crystal Chang Din, Einar Broch Johnsen, Marjan Sirjani, Ehsan Khamespanah, Kiko Fernandez-Reyes and Albert Mingkun Yang. A Survey of Active Object Languages. ACM Comput. Surv., 50(5), October 2017, 76:1-76:39. URL: http://doi.acm.org/10.1145/3122848
  • 38 (inproceedings) Julien Emmanuel, Matthieu Moy, Ludovic Henrio and Gregoire Pichon. Simulation of the Portals 4 protocol, and case study on the BXI interconnect. HPCS 2020 - International Conference on High Performance Computing & Simulation, Barcelona, Spain, December 2020, 1-8.
  • 39 (article) Paul Feautrier. Dataflow analysis of array and scalar references. International Journal of Parallel Programming, 20(1), 1991, 23-53.
  • 40 (article) Paul Feautrier, Abdoulaye Gamatié and Laure Gonnord. Enhancing the Compilation of Synchronous Dataflow Programs with a Combined Numerical-Boolean Abstraction. CSI Journal of Computing, 1(4), 2012, 8:86-8:99. URL: http://hal.inria.fr/hal-00860785
  • 41 (article) Paul Feautrier. Scalable and Structured Scheduling. International Journal of Parallel Programming, 34(5), October 2006, 459-487.
  • 42 (inproceedings) Kiko Fernandez-Reyes, Dave Clarke, Elias Castegren and Huu-Phuc Vo. Forward to a Promising Future. Conference proceedings COORDINATION 2018, Uppsala University, Computing Science, 2018.
  • 43 (phdthesis) Michael I Gordon. Compiler techniques for scalable performance of stream programs on multicore architectures. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science, 2010.
  • 44 (inproceedings) Oh Hakjoo, Lee Wonchan, Heo Kihong, Yang Hongseok and Yi Kwangkeun. Selective context-sensitivity guided by impact pre-analysis. ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '14, Edinburgh, United Kingdom - June 09 - 11, 2014, ACM, 2014, 49.
  • 45 (article) N. Halbwachs, P. Caspi, P. Raymond and D. Pilaud. The synchronous data flow programming language LUSTRE. Proceedings of the IEEE, 79(9), Sep 1991, 1305-1320.
  • 46 (manual) IEEE 1666 Standard: SystemC Language Reference Manual. Open SystemC Initiative, 2011. URL: http://www.accellera.org/
  • 47 (inproceedings) G. Kahn. The semantics of a simple language for parallel programming. Information processing, North-Holland, 1974.
  • 48 (inproceedings) Alex Krizhevsky, Ilya Sutskever and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 2012, 1097-1105.
  • 49 (phdthesis) Maroua Maalej Kammoun. Low-cost memory analyses for efficient compilers. Thèse de doctorat, Université Lyon 1, 2017. URL: http://www.theses.fr/2017LYSE1167
  • 50 (inproceedings) Maroua Maalej, Vitor Paisante, Pedro Ramos, Laure Gonnord and Fernando Pereira. Pointer Disambiguation via Strict Inequalities. Code Generation and Optimisation, Austin, United States, February 2017.
  • 51 (inproceedings) Matthieu Moy. Parallel Programming with SystemC for Loosely Timed Models: A Non-Intrusive Approach. DATE, Grenoble, France, March 2013, 9.
  • 52 (manual) OSCI TLM-2.0 Language Reference Manual. Open SystemC Initiative (OSCI), June 2008. URL: http://www.accellera.org/downloads/standards
  • 53 (inproceedings) Vitor Paisante, Maroua Maalej, Leonardo Barbosa, Laure Gonnord and Fernando Magno Quintao Pereira. Symbolic Range Analysis of Pointers. International Symposium of Code Generation and Optimization, Barcelona, Spain, March 2016, 791-809.
  • 54 (misc) Louis-Noël Pouchet. Polybench: The polyhedral benchmark suite. 2012. URL: http://www.cs.ucla.edu/~pouchet/software/polybench/
  • 55 (article) William H Press, Saul A Teukolsky, William T Vetterling and Brian P Flannery. Numerical recipes in C++. The art of scientific computing, 2015.
  • 56 (article) Patrice Quinton. Automatic synthesis of systolic arrays from uniform recurrent equations. ACM SIGARCH Computer Architecture News, 12(3), 1984, 208-214.
  • 57 (inproceedings) Hamza Rihani, Matthieu Moy, Claire Maïza, Robert I. Davis and Sebastian Altmeyer. Response Time Analysis of Synchronous Data Flow Programs on a Many-Core Processor. Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS '16, New York, NY, USA, Brest, France, ACM, 2016, 67-76. URL: http://doi.acm.org/10.1145/2997465.2997472
  • 58 (inproceedings) Henrique Nazaré Willer Santos, Izabella Maffra, Leonardo Oliveira, Fernando Pereira and Laure Gonnord. Validation of Memory Accesses Through Symbolic Analyses. Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages And Applications (OOPSLA'14), Portland, Oregon, United States, October 2014.
  • 59 (phdthesis) William Thies. Language and compiler support for stream programs. Massachusetts Institute of Technology, 2009.
  • 60 (book) Jeffrey Travis and Jim Kring. LabVIEW for everyone: graphical programming made easy and fun. Prentice-Hall, 2007.
  • 61 (phdthesis) Alexandru Turjan. Compiling Nested Loop Programs to Process Networks. Universiteit Leiden, 2007.
  • 62 (inproceedings) Nicolas Ventroux and Tanguy Sassolas. A new parallel SystemC kernel leveraging manycore architectures. Design, Automation & Test in Europe Conference & Exhibition (DATE), 2016, IEEE, 2016, 487-492.
  • 63 (inbook) Sven Verdoolaege. Polyhedral Process Networks. Handbook of Signal Processing Systems, Springer, 2010, 931-965.
  • 64 (article) Samuel Williams, Andrew Waterman and David Patterson. Roofline: an insightful visual performance model for multicore architectures. Communications of the ACM, 52(4), 2009, 65-76.
  • 65 (book) Paul Wilmott. Quantitative Finance. Wiley, 2006.
  • 66 (article) Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C. Pierce and Steve Zdancewic. Interaction Trees. Proceedings of the ACM on Programming Languages, 4(POPL), 2020.