The goal of the project is the design, semantics and compilation of languages for the implementation of provably safe and efficient computing systems. We are driven by the ideal of a unique source code used both to program and simulate a wide variety of systems, including (1) embedded real-time controllers (e.g., fly-by-wire, engine control); (2) computationally intensive applications (e.g., video); (3) the simulation of a possibly huge number of embedded systems in close interaction (e.g., simulation of electrical or sensor networks, train tracking). All these applications share the need for formally defined languages used both for simulation and for the generation of target code. For that purpose, we design languages and experiment with compilers that transform mathematical specifications of systems into target code that may execute on parallel (multi-core) architectures.
Our research team draws inspiration and focus from the simplicity and complementarity of the data-flow model of Kahn process networks, synchronous concurrency, and the expression of the two in functional languages. To reach our goal, we plan to leverage a large body of formal principles: language design, semantics, type theory, concurrency models (including recent work on the formalisation of relaxed memory models), synchronous circuits and algorithms (code generation, optimization, polyhedral compilation).
Robin Morisset was awarded a Google Doctoral Fellowship.
Louis Mandel and Marc Pouzet received an award for the paper that first introduced the ReactiveML language, presented at the French conference JFLA 2005 ("On the occasion of this quarter century, the program committees and steering selected four outstanding contributions from the articles published in JFLA past decade.")
Louis Mandel was hired in September 2014 at Collège de France as an Assistant Professor.
Our project is founded on our expertise in three complementary domains: (1) synchronous functional programming and its extensions to deal with features such as communication with bounded buffers and dynamic process creation; (2) mathematical models for synchronous circuits; (3) compilation techniques for synchronous languages and optimizing/parallelizing compilers.
A strong point of the team is its experience and investment in the development of languages and compilers. Members of the team have also collaborated directly, for several years, with major industrial companies in the field, and several of our results are integrated in successful products. Our main results are briefly summarized below.
Paul Caspi and Marc Pouzet introduced synchronous Kahn networks as those Kahn networks that can be statically scheduled and executed with bounded buffers. This was the origin of the language Lucid Synchrone.
In 2000, Marc Pouzet started to collaborate with the SCADE team of
Esterel-Technologies on the design of a new version of
SCADE.
Dassault-Systèmes (Grenoble R&D center, part of
Delmia-automation) developed the language LCM, a variant of Lucid Synchrone
that is used for the simulation of factories. LCM follows closely
the principles and programming constructs of Lucid Synchrone (higher-order,
type inference, mix of data-flow and hierarchical automata). The
team in Grenoble is integrating this development into a new compiler
for the language
Modelica.
In parallel, the goal of ReactiveML
The development of ReactiveML was started by Louis Mandel during his PhD thesis and is ongoing. The language extends OCaml
Several open problems have been solved by Louis Mandel: the interaction between ML features (higher-order) and reactive constructs with a proper type system; efficient simulation that avoids busy waiting. The latter problem is particularly difficult in synchronous languages because of possible reactions to the absence of a signal. In the ReactiveML implementation, there is no busy waiting: inactive processes have no impact on the overall performance. It turns out that this enables ReactiveML to simulate millions of (logical) parallel processes and to compete with the best event-driven simulators.
ReactiveML has been used for simulating routing protocols in ad-hoc networks and large scale sensor networks. The designer benefits from a real programming language that gives precise control of the level of simulation (e.g., each network layer up to the MAC layer), and programs can be connected to models of the physical environment programmed with Lutin. ReactiveML has been used since 2006 by the synchronous team at VERIMAG, Grenoble (in collaboration with France-Telecom) for the development of low-consumption routing protocols in sensor networks.
In the data-flow synchronous model, the clock calculus is a static analysis that ensures execution in bounded memory. It checks that the values produced by a node are instantaneously consumed by connected nodes (synchronous constraint). To program Kahn process networks with bounded buffers (as in video applications), it is thus necessary to explicitly place nodes that implement buffers. The buffer sizes and the clocks at which data must be read or written have to be computed manually. In practice, this is done by simulation or by trial and error, which is difficult and error-prone. The aim of the n-synchronous model is to relieve the programmer of this task. Technically, it allows processes to be composed whenever they can be synchronized through a bounded buffer. The new flexibility is obtained by relaxing the clock calculus, replacing the equality of clocks by a sub-typing rule. The result is a more expressive language which still offers the same guarantees as the original. The first version of the model was based on clocks represented as ultimately periodic binary words. It was algorithmically expensive and limited to periodic systems. An abstraction mechanism was later proposed which permits direct reasoning on sets of clocks that are defined by a rational slope and two shifts. An implementation of the n-synchronous model is provided by the Lucy-n compiler.
This work started as a collaboration between Marc Pouzet (LIP6, Paris, then LRI and Inria Proval, Orsay), Marc Duranton (Philips Research then NXP, Eindhoven), Albert Cohen (Inria Alchemy, Orsay) and Christine Eisenbeis (Inria Alchemy, Orsay) on the real-time programming of video stream applications in set-top boxes. It was significantly extended by Louis Mandel and Florence Plateau during her PhD thesis (supervised by Marc Pouzet and Louis Mandel). Low-level support has been investigated with Cupertino Miranda, Philippe Dumont (Inria Alchemy, Orsay) and Antoniu Pop (Mines ParisTech). Further directions of research and experimentation have been and are being followed through the theses of Léonard Gérard, Adrien Guatto and Nhat Minh Lê.
Despite decades of progress, the best parallelizing and optimizing compilers still fail to extract parallelism and to perform the optimizations needed to harness multi-core processors and their complex memory hierarchies. Polyhedral compilation aims at facilitating the construction of more effective optimization and parallelization algorithms. It captures the flow of data between individual instances of statements in a loop nest, which makes it possible to model the behavior of the program accurately and to represent complex parallelizing and optimizing transformations. Affine multidimensional scheduling is one of the main tools in polyhedral compilation. Albert Cohen, in collaboration with Cédric Bastoul, Sylvain Girbal, Nicolas Vasilache, Louis-Noël Pouchet and Konrad Trifunovic (LRI and Inria Alchemy, Orsay), has contributed to a large number of research, development and transfer activities in this area.
The relation between polyhedral compilation and data-flow synchrony has been identified through data-flow array languages, and through the study of the scheduling and mapping algorithms for these languages. We would like to deepen the exploration of this link, embedding polyhedral techniques into the compilation flow of data-flow, relaxed synchronous languages.
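As an illustration of what an affine schedule has to guarantee, here is a minimal Python sketch. It assumes the simplest possible setting: uniform dependences given as distance vectors, and a one-dimensional schedule θ(x) = c · x. The function name `is_legal_schedule` and the example loop nest are hypothetical; real polyhedral tools handle multidimensional schedules and non-uniform dependences.

```python
def is_legal_schedule(coeffs, distances):
    """Return True if the 1-D affine schedule theta(x) = coeffs . x runs the
    source of every dependence strictly before its target.

    For a uniform dependence with distance vector d, the target iteration is
    x + d, so theta(x + d) - theta(x) = coeffs . d must be at least 1.
    """
    return all(
        sum(c * d for c, d in zip(coeffs, dist)) >= 1
        for dist in distances
    )


# Assumed example loop nest: A[i][j] = A[i-1][j+1] + A[i][j-1]
# Its flow dependences have distance vectors (1, -1) and (0, 1).
deps = [(1, -1), (0, 1)]
print(is_legal_schedule((2, 1), deps))  # skewed schedule 2*i + j
print(is_legal_schedule((1, 0), deps))  # schedule i alone leaves (0, 1) unordered
```

Under these assumptions the skewed schedule is accepted and the second one is rejected, since iterations linked by the distance (0, 1) would be given the same date.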
Our previous work led to the design of a theoretical and algorithmic
framework rooted in the polyhedral model of compilation, and to the
implementation of a set of tools based on production compilers
(Open64, GCC) and source-to-source prototypes (PoCC,
http://
After an initial experiment with Open64, we ported these techniques to GCC and LLVM, applying them to multi-level parallelization and optimization problems, including vectorization and exploitation of thread-level parallelism. Independently, we made significant progress in the design of effective optimization heuristics, working on the interactions between the semantics of the compiler's intermediate representation and the structure of the optimization space.
These results open opportunities for complex optimizations that
target larger problems, such as the scheduling and placement of
process networks, or the offloading of computational kernels to
hardware accelerators (such as GPUs). A new framework has been
designed, centered on the Integer Set Library (isl,
http://
For both cost and performance reasons, computing systems tightly couple parts realized in hardware with parts realized in software. The boundary between hardware and software keeps moving with the underlying technology and the external economic pressure. Moreover, thanks to FPGA technology, hardware itself has become programmable. There is now a pressing need from industry for hardware/software co-design, and for tools which automatically turn software code into hardware circuits, or more usually, into hybrid code that simultaneously targets GPUs, multiple cores, encryption ASICs, and other specialized chips.
Departing from customary C-to-VHDL compilation, we trust that sharper results can be achieved from source programs that specify bit-wise time/space behavior in a rigorous synchronous language, rather than just the I/O behavior in some (ill-specified) subset of C. This specification allows the designer to also program the (asynchronous) environment in which to operate the entire system, and to profile/measure/control each variable of the design.
At any time, the designer can edit a single specification of the system, from which both the software and the hardware are automatically compiled, and guaranteed to be compatible. Once correct (functionally and with respect to the behavioral specification), the application can be automatically deployed (and tested) on a hard/soft hybrid co-design support.
Key aspects of the advocated methodology were validated by Jean Vuillemin in the design of a PAL2HDTV video sampler. The circuit was automatically compiled from a synchronous source specification, decorated and guided by a few key hints to the hardware back-end, which targeted an FPGA running at real-time video specifications: a tightly-packed highly-efficient design at 240MHz, generated 100% automatically from the application specification source code, and including all run-time/debug/test/validate ancillary software. It was subsequently commercialized on FPGA by LetItWave, and then on ASIC by Zoran. This successful experience underlines our research perspectives on parallel synchronous programming.
The project addresses the design, semantics and implementation of programming languages together with compilation techniques to develop provably safe and efficient computing systems. Traditional applications can be found in safety critical embedded systems with hard real-time constraints such as avionics (e.g., fly-by-wire command), railways (e.g., on board control, engine control), nuclear plants (e.g., emergency control of the plant). While embedded applications used to be centralized, they are now massively parallel and physically distributed (e.g., sensor networks, train tracking, distributed simulation of factories) and they integrate computationally intensive algorithms (e.g., video processing) with a mix of hard and soft real-time constraints. Finally, systems are heterogeneous, with discrete devices communicating with physical ones (e.g., interface between analog and digital circuits). Programming and simulating a whole system from a unique source code, with static guarantees on the reproducibility of simulations together with a compiler to generate target embedded code, is a scientific and industrial challenge of great importance.
Synchronous languages, type and clock inference, causality analysis, compilation
Lucid Synchrone is a language for the implementation of reactive systems. It is based on the synchronous model of time as provided by Lustre combined with features from ML languages. It provides powerful extensions such as type and clock inference, type-based causality and initialization analysis, and allows data-flow systems and hierarchical automata, or flows and valued signals, to be mixed arbitrarily.
It is distributed under binary form, at URL
http://
The language was used, from 1996 to 2006, as a laboratory to experiment with various extensions of the language Lustre. Several programming constructs (e.g. merge, last, mix of data-flow and control-structures like automata), type-based program analyses (e.g., typing, clock calculus) and compilation methods originally introduced in Lucid Synchrone are now integrated in the new SCADE 6 compiler developed at Esterel-Technologies and commercialized since 2008.
Three major releases of the language have been made and the current version is V3 (developed in 2006). As of 2013, the language is still used for teaching and in our research but we do not develop it anymore. Nonetheless, we have integrated several features from Lucid Synchrone in the new research prototypes described below. The Heptagon language and compiler are direct descendants of it. The new language Zélus for hybrid systems modeling borrows many features originally introduced in Lucid Synchrone.
Programming language, synchronous reactive programming, concurrent systems, dedicated type-systems.
ReactiveML is a programming language dedicated to the implementation of interactive systems as found in graphical user interfaces, video games or simulation problems. ReactiveML is based on the synchronous reactive model due to Boussinot, embedded in an ML language (OCaml).
The synchronous reactive model provides synchronous parallel composition and dynamic features like the dynamic creation of processes. In ReactiveML, the reactive model is integrated at the language level (not as a library), which leads to a safer and more natural programming paradigm.
ReactiveML is distributed at URL http://
The language was mainly used for the simulation of mobile ad hoc networks at the Pierre and Marie Curie University and for the simulation of sensor networks at France Telecom and Verimag (CNRS, Grenoble). A new application to mixed music programming has been developed.
In 2013, a new web site was developed, new programming constructs were added, and the runtime system was cleaned up. Moreover, a new implementation based on the PhD of Cédric Pasteur has also been provided http://
Synchronous languages, compilation, optimizing compilation, parallel code generation, behavioral synthesis.
Heptagon is an experimental language for the implementation of embedded real-time reactive systems. It is developed inside the Synchronics large-scale initiative, in collaboration with Inria Rhône-Alpes. It is essentially a subset of Lucid Synchrone, without type inference, type polymorphism and higher-order. It is thus a Lustre-like language extended with hierarchical automata in a form very close to SCADE 6. The purpose of this new language and compiler is to develop new aggressive optimization techniques for sequential C code and compilation methods for generating parallel code for different platforms. This explains many of the simplifications we have made in order to ease the development of compilation techniques.
Some extensions have already been made, most notably automata, a parallel code generator with Futures, and support for correct and efficient in-place array computations. It is currently used to experiment with linear typing for arrays and also to introduce a concept of asynchronous parallel computations. The compiler developed in our team generates C, C++, Java and VHDL code.
Transfer activities based on our experience in Heptagon are taking place through the “Fiabilité and Sûreté de Fonctionnement” project at IRT SystemX, led by Alstom Transport, since 2013.
Heptagon is jointly developed with Gwenael Delaval and Alain Girault
from the Inria POP ART team (Grenoble). Gwenael Delaval is developing
the controller synthesis tool BZR (http://
Lucy-n is a language to program in the n-synchronous model. The language is similar to Lustre with a buffer construct. The Lucy-n compiler ensures that programs can be executed in bounded memory and automatically computes buffer sizes. Hence this language makes it possible to program Kahn networks, the compiler being able to statically compute bounds for all FIFOs in the program.
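To give an idea of what computing a buffer size means for clocks written as ultimately periodic binary words, here is a minimal Python sketch. It is not the Lucy-n algorithm: it simply counts values over a finite horizon. A clock is represented, by assumption, as a pair of strings `(prefix, period)` with a non-empty period; the names `bit` and `buffer_size` are hypothetical. The sketch also assumes that a value written and read in the same instant does not occupy a buffer slot.

```python
from fractions import Fraction
from math import gcd


def bit(word, i):
    """Return bit i (0 or 1) of the ultimately periodic word (prefix, period)."""
    prefix, period = word
    if i < len(prefix):
        return int(prefix[i])
    return int(period[(i - len(prefix)) % len(period)])


def buffer_size(producer, consumer):
    """Return the buffer size needed between two clocks, or None if no
    bounded buffer can synchronize them."""
    (u1, v1), (u2, v2) = producer, consumer
    # Both clocks must have the same asymptotic rate of 1s, otherwise the
    # buffer either grows without bound or runs empty.
    if Fraction(v1.count("1"), len(v1)) != Fraction(v2.count("1"), len(v2)):
        return None
    # With equal rates, the difference between produced and consumed counts
    # is periodic after the longest prefix, so a finite horizon is enough.
    horizon = max(len(u1), len(u2)) + len(v1) * len(v2) // gcd(len(v1), len(v2))
    produced = consumed = size = 0
    for i in range(horizon):
        produced += bit(producer, i)
        consumed += bit(consumer, i)
        if consumed > produced:
            return None  # the consumer would read a value not yet produced
        size = max(size, produced - consumed)
    return size


print(buffer_size(("", "1100"), ("", "0011")))
```

In this example the producer writes two values before the consumer reads any, so two slots are needed.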
The language compiler and associated tools are available in a binary
form at http://
In 2013, a complete re-implementation was started. This new version will take into account the new features developed during the PhD of Adrien Guatto. Parallel code generation for this new version also involves compilation and runtime system research in collaboration with Nhat Minh Lê and Robin Morisset.
The ML-Sundials bindings allow the use of the state-of-the-art Sundials numerical simulation library from OCaml programs (like, for instance, the Zélus runtime). The Sundials package includes three main components: CVODE, IDA, and KINSOL.
This year we redesigned and reimplemented the interface to CVODE to fix a problem with memory leaks between OCaml and C heaps. We have submitted an APP request for this code. The CVODE component is an important part of our work on the Zélus programming language.
We also developed a new interface for the IDA component, which we have started to use in our experiments with DAEs (Modelica).
We plan to develop an interface for the remaining KINSOL component over the next three months and then to release the entire library under an open-source license.
Zélus is a new programming language for hybrid system modeling. It is based on a synchronous language but extends it with Ordinary Differential Equations (ODEs) to model continuous-time behaviors. It allows data-flow equations, hierarchical automata and ODEs to be combined arbitrarily. The language keeps all the fundamental features of synchronous languages: the compiler statically ensures the absence of deadlocks and critical races; it is able to generate statically scheduled code running in bounded time and space; and a type system is used to distinguish discrete and logical-time signals from continuous-time ones. The ability to combine those features with ODEs makes the language usable both for programming discrete controllers and for modeling their physical environment.
The Zélus implementation has two main parts: a compiler that transforms Zélus programs into OCaml programs and a runtime library that orchestrates compiled programs and numeric solvers. The runtime can use the Sundials numeric solver, or custom implementations of well-known algorithms for numerically approximating continuous dynamics.
This year we reimplemented several basic numeric solver algorithms after a careful analysis of the Simulink versions, together with the binding to SUNDIALS CVODE. This was necessary to enable detailed comparisons between our tool and Simulink (the de facto industrial standard in this domain). We also improved the algorithm for zero-crossing detection, and simplified and streamlined the back-end interface.
We developed several new examples to aid in the development, debugging, and dissemination of our work, together with various talks and demonstrations. These included a simple backhoe model (which served as an introductory example in the HSCC paper), an adaptive control example from Åström and Wittenmark's text, and a model of Zeno behaviour based on a zig-zagging object (presented at Synchron).
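For readers unfamiliar with zero-crossing detection, the following minimal Python sketch shows the general idea on a falling ball. It is not the algorithm used in Zélus or Simulink: the integration step is the exact solution for constant acceleration (standing in for a numeric solver), and the crossing is located by plain bisection. The function name `time_of_impact` and all parameters are hypothetical.

```python
import math


def time_of_impact(h0, g=9.81, dt=0.01, tol=1e-9):
    """Integrate y'' = -g from rest at height h0 and return the time at which
    y crosses zero from above."""
    if h0 <= 0.0 or g <= 0.0:
        raise ValueError("h0 and g must be positive")
    t, y, v = 0.0, h0, 0.0
    while True:
        # One integration step of length dt.
        y_next = y + v * dt - 0.5 * g * dt * dt
        v_next = v - g * dt
        if y > 0.0 and y_next <= 0.0:
            # Sign change detected inside [t, t + dt]: bisect on the step
            # length to locate the crossing up to the tolerance tol.
            lo, hi = 0.0, dt
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if y + v * mid - 0.5 * g * mid * mid > 0.0:
                    lo = mid
                else:
                    hi = mid
            return t + hi
        t, y, v = t + dt, y_next, v_next


# Compare with the closed-form answer sqrt(2 * h0 / g).
print(time_of_impact(1.0), math.sqrt(2.0 / 9.81))
```

The important point is that the solver does not simply stop at the end of the step in which the sign changes: it goes back inside the step to find the event time, after which a discrete reaction (for example a velocity reset) could be executed.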
Zélus was officially released in 2013 with several complete documented examples on http://
Compilation, optimizing compilation, parallel data-flow programming, automatic parallelization, polyhedral compilation.
http://
Licence: GPLv3+ and LGPLv3+
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, Ada, and Go, as well as libraries for these languages (libstdc++, libgcj,...). GCC was originally written as the compiler for the GNU operating system. The GNU system was developed to be 100% free software, free in the sense that it respects the user's freedom.
PARKAS contributes to the polyhedral compilation framework, also known as Graphite. We also distribute an experimental branch for a stream-programming extension of OpenMP called OpenStream (used in numerous research activities and grants). This effort borrows key design elements from synchronous data-flow languages.
Tobias Grosser is one of the main contributors to the Graphite optimization pass of GCC.
Presburger arithmetic, integer linear programming, polyhedral library,
automatic parallelization, polyhedral compilation.
http://
Licence: MIT
isl is a library for manipulating sets and relations of integer points bounded by linear constraints. Supported operations on sets include intersection, union, set difference, emptiness check, convex hull, (integer) affine hull, integer projection, transitive closure (and over-approximation), computing the lexicographic minimum using parametric integer programming. It includes an ILP solver based on generalized basis reduction, and a new polyhedral code generator. isl also supports affine transformations for polyhedral compilation, and increasingly abstract representations to model source and intermediate code in a polyhedral framework.
isl has become the de-facto standard for every recent polyhedral compilation project. Thanks to a license change from LGPL to MIT, its adoption is also picking up in industry.
Presburger arithmetic, integer linear programming, polyhedral library,
automatic parallelization, polyhedral compilation.
http://
Licence: MIT
More tools are being developed, based on isl. PPCG is our source-to-source research tool for automatic parallelization in the polyhedral model. It serves as a test bed for many compilation algorithms and heuristics published by our group, and is currently the best automatic parallelizer for CUDA and OpenCL (on the Polybench suite).
Languages, semantics, tool support, theorem provers.
We are working on tools to support large scale semantic definitions, for programming languages and architecture specifications. For that we develop two complementary tools, Ott and Lem.
Ott is a tool for writing definitions of programming languages and calculi. It takes as input a definition of a language syntax and semantics, in a concise and readable ASCII notation that is close to what one would write in informal mathematics. It generates output:
a LaTeX source file that defines commands to build a typeset version of the definition;
a Coq version of the definition;
an Isabelle version of the definition; and
a HOL version of the definition.
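Additionally, it can be run as a filter, taking a LaTeX/Coq/Isabelle/HOL source file with embedded (symbolic) terms of the defined language, parsing them and replacing them by typeset terms.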
The main goal of the Ott tool is to support work on large programming language definitions, where the scale makes it hard to keep a definition internally consistent, and to keep a tight correspondence between a definition and implementations. We also wish to ease rapid prototyping work with smaller calculi, and to make it easier to exchange definitions and definition fragments between groups. The theorem-prover backends should enable a smooth transition between use of informal and formal mathematics.
Lem is a lightweight tool for writing, managing, and publishing large scale semantic definitions. It is also intended as an intermediate language for generating definitions from domain-specific tools, and for porting definitions between interactive theorem proving systems (such as Coq, HOL4, and Isabelle). As such it is a complementary tool to Ott. Lem resembles a pure subset of Objective Caml, supporting typical functional programming constructs, including top-level parametric polymorphism, datatypes, records, higher-order functions, and pattern matching. It also supports common logical mechanisms including list and set comprehensions, universal and existential quantifiers, and inductively defined relations. From this, Lem generates OCaml, HOL4, Coq, and Isabelle code.
In collaboration with Peter Sewell (Cambridge University) and Scott Owens (University of Kent).
The current version of Ott is about 30000 lines of OCaml. The tool is
available from http://
The development version of Lem is available from
http://
Languages, concurrency, memory models, C11/C++11, compiler, bugs.
The cmmtest tool performs random testing of C and C++ compilers against the C11/C++11 memory model. A test case is any well-defined, sequential C program; for each test case, cmmtest:
compiles the program using the compiler and compiler optimisations that are being tested;
runs the compiled program in an instrumented execution environment that logs all memory accesses to global variables and synchronisations;
compares the recorded trace with a reference trace for the same program, checking if the recorded trace can be obtained from the reference trace by valid eliminations, reorderings and introductions.
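The comparison step can be illustrated with a deliberately simplified Python sketch. It only covers eliminations (the recorded trace must be the reference trace with some events removed, in the same order) and a crude detection of introduced writes; reorderings are not modelled, and this is not the cmmtest implementation. The event encoding `(kind, variable, value)` and the function names are hypothetical.

```python
from collections import Counter


def only_eliminations(recorded, reference):
    """True if `recorded` is `reference` with some events removed, order kept."""
    remaining = iter(reference)
    # `event in remaining` advances the iterator, so matches must occur in order.
    return all(event in remaining for event in recorded)


def introduced_writes(recorded, reference):
    """Writes that occur more often in `recorded` than in `reference`."""
    def writes(trace):
        return Counter(e for e in trace if e[0] == "W")
    return sorted((writes(recorded) - writes(reference)).elements())


reference = [("W", "x", 1), ("R", "y", 0), ("W", "x", 2)]
optimised = [("R", "y", 0), ("W", "x", 2)]               # first write eliminated
suspicious = [("W", "x", 1), ("W", "x", 1), ("W", "x", 2)]  # extra write to x

print(only_eliminations(optimised, reference))
print(introduced_writes(suspicious, reference))
```

A write that appears in the recorded trace but not in the reference is the kind of behaviour worth reporting, since another thread could observe it.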
Cmmtest identified several mistaken write introductions and other unexpected behaviours in the latest release of the gcc compiler. These have been promptly fixed by the gcc developers.
Cmmtest is available from
http://
.
ReactiveML is an extension of OCaml with synchronous concurrency, based on synchronous parallel composition and broadcast of signals. The goal is to provide a general model of deterministic concurrency inside a general purpose functional language to program reactive systems. It is particularly suited to program discrete simulations, for instance of sensor networks.
One current focus of the research is the ability to simulate huge systems, composed of millions of agents, by extending the current purely sequential implementation to take advantage of multi-core and distributed architectures. This goal has led to the introduction of a new programming construct, the reactive domain, which makes it possible to define local time scales. These domains help with the distribution of the code but also increase the expressiveness of the language. In particular, they enable time refinement. A paper on this new construct and the related static analysis has been published. An extended version is under submission.
We continued the work on a new reactivity analysis which ensures that a process cannot prevent the other ones from executing. This analysis has been published. An English version is under submission.
The runtime of ReactiveML has been cleaned up and a multi-threaded implementation has been developed. A paper describing this new implementation will be published.
All these novelties are described precisely in the PhD thesis of Cédric Pasteur.
During the year, ReactiveML has also been applied to mixed music. Mixed music is about live musicians interacting with electronic parts which are controlled by a computer during the performance. It allows composers to use and combine traditional instruments with complex synthesized sounds and other electronic devices. There are several languages dedicated to the writing of mixed music scores. Among them, the Antescofo language coupled with an advanced score follower allows a composer to manage the reactive aspects of musical performances: how electronic parts interact with a musician. However, these domain-specific languages do not offer the expressiveness of functional programming.
We defined a synchronous semantics for the core language of Antescofo and an alternative implementation based on an embedding inside ReactiveML. The semantics reduces to a few rules, is mathematically precise and leads to an interpreter of only a few hundred lines. The efficiency of this interpreter compares well with that of the actual implementation: on all musical pieces we have tested, response times have been less than the reaction time of the human ear. Moreover, this approach offers the composer recursion, higher order, inductive types, as well as a simple way to program complex reactive behaviors thanks to the synchronous model of concurrency on which ReactiveML is built.
Synchronous programming languages in the vein of Lustre were designed for critical real-time systems. They are, however, not that well adapted to embedded applications with more pressing computational needs, since the generated code will usually not contain loops or arrays.
An essential task of a Lustre compiler is to determine whether a
program can be executed within bounded memory. This process is called
the "clock calculus", and consists in mapping every item of each
program stream to a logical date in a global, discrete time scale. For
a given stream, the mapping itself is called a "clock", and is a
strictly increasing function from stream positions to natural numbers
representing ticks: two items cannot be computed at the same time. In
practice, this function is represented as an infinite binary stream
where the boolean
In recent work, Guatto, Cohen, Mandel and Pouzet considered the extension of the Lustre and Lucid Synchrone clock calculus to allow computing several values instantaneously. This simple idea has a deep impact on all aspects of the language:
its denotational semantics has to account for bursts of values;
the clock calculus now features integers rather than booleans: each integer denotes the size of the burst at the corresponding instant;
causality analysis has to take bursts into account when rejecting self-referential programs;
the code generation process translates bursts to arrays and clocks to counted loops.
A prototype implementation exploiting this idea and generating C code with loops is underway and a paper describing the basis of the clock calculus will be published.
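The relation between integer clocks, bursts and loops can be pictured with a small Python sketch. It is only an illustration under simple assumptions: the clock is a periodic list of non-negative integers, the stream is finite, and the names `bursts` and `scale_burst` are hypothetical. The actual prototype generates C code.

```python
from itertools import cycle


def bursts(values, period):
    """Group a finite stream into per-instant bursts, following a periodic
    integer clock: the n-th integer is the burst size at instant n."""
    if any(n < 0 for n in period) or sum(period) == 0:
        raise ValueError("the period must contain a positive burst size")
    out, pos = [], 0
    for n in cycle(period):
        if pos >= len(values):
            break
        out.append(values[pos:pos + n])  # a burst becomes an array
        pos += n
    return out


def scale_burst(burst, k):
    """Process one instant: the burst size bounds a counted loop."""
    result = [0] * len(burst)
    for i in range(len(burst)):
        result[i] = k * burst[i]
    return result


stream = [1, 2, 3, 4, 5, 6]
for instant, burst in enumerate(bursts(stream, [2, 0, 1])):
    print(instant, scale_burst(burst, 10))
```

With the clock (2 0 1), the first instant carries two values, the second none and the third one, after which the pattern repeats; a boolean clock is the special case where every burst size is 0 or 1.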
This work extends nicely the n-synchronous model that introduced a way to compose streams which have almost the same clock and can be synchronized through the use of a finite buffer.
The Ad hoc On demand Distance Vector (AODV) routing protocol is described in RFC3561. It allows the nodes in a Mobile Ad hoc Network (MANET) to know where to forward messages so that they eventually reach their destinations. The nodes of such networks are reactive systems that cooperate to provide a global service (the sending of messages from node to node) satisfying certain correctness properties (namely `loop freedom'—that messages are never sent in circles).
We have mechanized an existing formal but pen-and-paper proof of loop freedom of AODV in the interactive theorem prover Isabelle/HOL. While the process algebra model and the fine details of the original proof are quite formal, the structure of the proof is much less so. This necessitated the development of new framework elements and techniques in Isabelle. In particular, we adapted standard theory on inductive assertions to show invariants over individual reactive nodes and introduced machinery for assume/guarantee reasoning to lift these invariants to networks of communicating processes. While the original proof reasoned informally over traces, the mechanized proof is purely based on invariant reasoning, i.e., on reasoning over pairs of reachable states. Our combination of techniques works very well and is likely useful for modelling and verifying similar protocols in an interactive theorem prover.
We are currently finalising a paper describing this work for submission in January.
In collaboration with Peter Höfner (NICTA) and Robert J. van Glabbeek (UNSW/NICTA).
In 2013, we mainly worked in three directions: (a) the treatment of DAEs; (b) the design and implementation of a causality analysis for hybrid systems modelers; (c) the study of numerical techniques for non-smooth dynamical systems.
As part of our participation in the European project MODRIO and the SYS2SOFT project, we have been developing a prototype for simulating DAE (Differential-Algebraic Equation) systems. DAEs are the basis of the Modelica language, and their interaction with discrete features, in particular the novel ones introduced in 2012 such as hierarchical automata and clocks, raises difficult semantic and compilation issues. The goal is to define precisely the interaction between synchronous programming constructs and DAEs, in terms of both semantics and compilation. A major difficulty at the moment is that existing techniques (index reduction, dummy derivatives) are not modular and force one either (a) to write an interpreter in which index reduction is performed dynamically every time a mode change occurs, or (b) to statically enumerate all the modes and perform index reduction for each of them. While the first technique is too slow in practice (and it is not used in the most advanced Modelica compiler), the second one may explode in practice (putting
Work to date has focused on implementing standard algorithms from the literature (notably Pantelides, Dummy Derivatives, Dynamic State Selection). Despite the importance of these algorithms to tools like Modelica, we found that important implementation details and “tricks” are not always well documented.
This work is developed hand-in-hand with the interface to the Sundials IDA solver.
We have designed a causality analysis for a language that mixes stream equations, hierarchical automata and ODEs, and implemented it in the Zélus compiler. Its purpose is to give a sufficient condition under which a hybrid program can be turned into statically scheduled code. Moreover, the analysis ensures the absence of discontinuities outside of declared zero-crossing events. This result is novel and its proof relies heavily on the non-standard analysis introduced in our previous work. It has been accepted for publication at HSCC 2014.
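To give an idea of what "statically scheduled" means, the sketch below orders a set of equations so that every variable is computed before it is read in the same instant, and rejects instantaneous cycles. It is a deliberately simplified illustration (a plain dependency-graph sort over discrete equations); the dictionary encoding and the function name are assumptions for this example, and the analysis described above is considerably richer since it also handles automata, ODEs and zero-crossings.

```python
def schedule(equations):
    """Return an evaluation order for `equations`, or None if none exists.

    `equations` maps each defined variable to the set of variables it reads
    in the same instant. Reads through a unit delay are simply not listed,
    and variables that are not defined by any equation are treated as inputs.
    """
    defined = set(equations)
    pending = {v: set(deps) & defined for v, deps in equations.items()}
    order = []
    while pending:
        ready = sorted(v for v, deps in pending.items() if not deps)
        if not ready:
            return None  # every remaining variable waits on another: a cycle
        for v in ready:
            order.append(v)
            del pending[v]
        for deps in pending.values():
            deps.difference_update(ready)
    return order

# y = f(x); x = g(u); z = h(x, y)   with u an input
acyclic = schedule({"y": {"x"}, "x": {"u"}, "z": {"x", "y"}})
# x = y + 1; y = x                  an instantaneous loop
cyclic = schedule({"x": {"y"}, "y": {"x"}})
# x = pre(y) + 1; y = x             the delay breaks the loop
delayed = schedule({"x": set(), "y": {"x"}})
```

The first and third systems are schedulable (`["x", "y", "z"]` and `["x", "y"]`), while the second is rejected with `None`.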
In parallel, we collaborate with Bernard Brogliato and Vincent Acary (Inria team BIPOP, Grenoble) on non-smooth dynamical systems. Besides the general-purpose techniques for solving DAEs that are implemented in Modelica compilers, there exist dedicated methods for systems with many discontinuities and contacts (in mechanical systems, analogous electrical circuits, etc.). They are far more efficient and numerically accurate than general-purpose techniques when the number of contacts is large (e.g., transients in electrical circuits, a bag of marbles). They are based on time-stepping execution and do not have to stop at every zero-crossing event. How to combine these techniques with event-detection ones (as used in the Simulink tool) is largely unknown. We are currently investigating the extension of our previous work to take the techniques of Brogliato and Acary into account. This is a novel and promising direction of research for the year to come.
In this research activity, we develop the new language Zélus, used as a laboratory for experimenting with novel programming constructs and compilation techniques. It serves to illustrate our research as Lucid Synchrone did in the past.
In collaboration with Benoit Caillaud and Albert Benveniste of the Inria HYCOMES team.
We are close to completing a careful analysis of the literature related to the quasi-synchronous model for real-time, distributed systems. We have extended existing results by increasing their precision, providing detailed proofs, and simplifying protocol descriptions. The work to date is documented in a draft which we expect will eventually become a technical report or journal article.
Quasi-synchronous architectures, sometimes termed Loosely Time-Triggered Architectures (LTTAs), are ubiquitous in the development of distributed, real-time systems. They represent a broad class of systems whose modelling and programming mixes elements of discrete time, physical time, and a notion of approximation. We expect that addressing these elements in the Zélus programming language will lead to insights and advances in a broader ambition to program in physical time.
Compilers sometimes generate correct sequential code but break the concurrency memory model of the programming language: these subtle compiler bugs are observable only when the miscompiled functions interact with concurrent contexts, making them particularly hard to detect. In this work we design a strategy to reduce the hard problem of hunting concurrency compiler bugs to differential testing of sequential code and build a tool that puts this strategy to work. Our first contribution is a theory of sound optimisations in the C11/C++11 memory model, covering most of the optimisations we have observed in real compilers and validating the claim that common compiler optimisations are sound in the C11/C++11 memory model. Our second contribution is to show how, building on this theory, concurrency compiler bugs can be identified by comparing the memory trace of compiled code against a reference memory trace for the source code. Our tool identified several mistaken write introductions and other unexpected behaviours in the latest release of the gcc compiler.
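The trace-comparison idea can be sketched as follows, under strong simplifying assumptions: a trace is a list of `(operation, location, value)` events, and the only check performed is that the compiled code does not write to a shared location more often than the source does. The event encoding and the function name are hypothetical, and the actual tool matches traces up to the sound eliminations and reorderings identified by the theory, which this sketch does not attempt.

```python
from collections import Counter

def introduced_writes(reference_trace, compiled_trace):
    """Locations written more often in the compiled trace than in the
    reference trace. Each event is a tuple (op, location, value) with
    op equal to "R" (read) or "W" (write)."""
    ref = Counter(loc for op, loc, _ in reference_trace if op == "W")
    out = Counter(loc for op, loc, _ in compiled_trace if op == "W")
    return sorted(loc for loc in out if out[loc] > ref[loc])

# Source: `if (flag) x = 1;` executed with flag == 0 performs no write.
reference = [("R", "flag", 0)]
# A miscompilation that loads x and unconditionally stores a value back.
compiled = [("R", "flag", 0), ("R", "x", 7), ("W", "x", 7)]
suspicious = introduced_writes(reference, compiled)

# Removing an overwritten store is not flagged by this check.
harmless = introduced_writes(
    [("W", "x", 1), ("W", "x", 2)],
    [("W", "x", 2)],
)
```

In this example `suspicious` is `["x"]`: the extra store is invisible sequentially but can overwrite a concurrent update to `x`, whereas `harmless` is empty.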
We studied the semantic design and verified compilation of a C-like programming language for concurrent shared-memory computation above x86 multiprocessors. The design of such a language is made surprisingly subtle by several factors: the relaxed-memory behaviour of the hardware, the effects of compiler optimisation on concurrent code, the need to support high-performance concurrent algorithms, and the desire for a reasonably simple programming model. In turn, this complexity makes verified (or verifying) compilation both essential and challenging. This project started in 2010. In 2013, an article describing the correctness proof of all the phases of our CompCertTSO compiler (including experimental fence eliminations) appeared in the Journal of the ACM.
In collaboration with Jaroslav Sevcik (U. Cambridge), Viktor Vafeiadis (MPI-SWS), Suresh Jagannathan (Purdue U.), Peter Sewell (U. Cambridge).
This research project aims at improving the design of the JavaScript language. We present a security infrastructure that allows users and content providers to specify access control policies over subsets of a JavaScript program by leveraging the concept of delimited histories with revocation. We implement our proposal in WebKit and evaluate it with three policies on 50 widely used websites, with no changes to their JavaScript code, and report performance overheads and violations. We also propose a typed extension of JavaScript combining dynamic types, concrete types and like types to let developers pick the level of guarantee that is appropriate for their code. We have implemented our type system and we report on performance and software engineering benefits.
With Gregor Richards and Jan Vitek (Purdue University).
Time-tiling is necessary for the efficient execution of iterative stencil computations. Classical hyper-rectangular tiles cannot be used due to the combination of backward and forward dependences along space dimensions. Existing techniques trade temporal data reuse for inefficiencies in other areas, such as load imbalance, redundant computations, or increased control flow overhead, which makes them challenging to use on GPUs.
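The obstacle to rectangular tiles can be checked on dependence distance vectors. The sketch below uses the standard sufficient condition that rectangular tiling is legal when no dependence has a negative component, and applies it to a 3-point one-dimensional stencil (an example chosen here, not taken from the report). It also shows classical time skewing, one of the existing techniques alluded to above, which restores legality at the cost of a wavefront-style execution; it is not the hexagonal scheme.

```python
def rectangular_tiling_is_legal(dependences):
    """Sufficient condition: every distance vector is component-wise
    non-negative, so no tile can depend on a tile that runs after it."""
    return all(component >= 0 for vector in dependences for component in vector)

def skew(dependences, factor=1):
    """Distance vectors after the change of coordinates x' = x + factor * t,
    applied to vectors of the form (dt, dx)."""
    return [(dt, dx + factor * dt) for dt, dx in dependences]

# A[t+1][x] = f(A[t][x-1], A[t][x], A[t][x+1]) gives the (dt, dx) distances:
jacobi_1d = [(1, -1), (1, 0), (1, 1)]

legal_as_is = rectangular_tiling_is_legal(jacobi_1d)
skewed = skew(jacobi_1d)
legal_after_skewing = rectangular_tiling_is_legal(skewed)
```

The backward dependence `(1, -1)` makes `legal_as_is` false; after skewing, the distances become `[(1, 0), (1, 1), (1, 2)]` and rectangular tiles are legal.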
We proposed a time-tiling method for iterative stencil computations on GPUs. Our method is the first tiling algorithm to satisfy the following constraints simultaneously: it involves no redundant computation, and it favors coalesced global-memory accesses, data reuse in local/shared memory or cache, avoidance of thread divergence, and concurrency. It does so by combining hexagonal tile shapes along the time dimension and one spatial dimension with classical tiling along the other spatial dimensions. Hexagonal tiles expose multi-level parallelism as well as data reuse. Experimental results demonstrate significant performance improvements over existing stencil compilers.
Part of this work also involved our colleagues from the POLYFLOW associate-team at the Indian Institute of Science, Bangalore, India.
Task-parallel programming models are getting increasingly popular. Many of them provide expressive mechanisms for inter-task synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to data-parallel and fork-join models of parallelism, the advanced features being introduced into task-parallel models in turn enable improved scalability through load balancing, memory latency mitigation, mitigation of the pressure on memory bandwidth, and as a side effect, reduced power consumption.
We developed a systematic approach to compile a loop nest into concurrent, dependent tasks. We formulated a partitioning scheme based on the tile-to-tile dependences, represented as affine polyhedra. This scheme ensures at compilation time that tasks belonging to the same class have the same, fully explicit incoming and outgoing dependence patterns. This alleviates the burden of a full-blown dependence resolver to track the readiness of tasks at run time. We evaluated our approach and algorithms in the PPCG compiler, targeting OpenStream, our experimental data-flow task-parallel language with explicit inter-task dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.
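The partitioning idea can be illustrated on a rectangular grid of tiles with uniform tile-to-tile dependences: tiles are grouped into classes according to which incoming and outgoing dependences actually exist for them. This is a minimal sketch under assumptions made for the example (a two-dimensional grid, dependences given as constant offsets, a hypothetical function name); the approach described above works on affine polyhedra inside a compiler rather than by enumeration.

```python
from collections import defaultdict

def classify_tiles(rows, cols, dep_offsets):
    """Group the tiles of a rows x cols grid by dependence pattern.

    `dep_offsets` lists offsets (di, dj) meaning that tile (i, j) depends on
    tile (i - di, j - dj) whenever that tile exists. The result maps each
    (incoming, outgoing) pattern to the tiles that exhibit it.
    """
    def inside(i, j):
        return 0 <= i < rows and 0 <= j < cols

    classes = defaultdict(list)
    for i in range(rows):
        for j in range(cols):
            incoming = frozenset(
                (di, dj) for di, dj in dep_offsets if inside(i - di, j - dj)
            )
            outgoing = frozenset(
                (di, dj) for di, dj in dep_offsets if inside(i + di, j + dj)
            )
            classes[(incoming, outgoing)].append((i, j))
    return dict(classes)

# Each tile depends on its upper and left neighbours.
offsets = [(1, 0), (0, 1)]
classes = classify_tiles(4, 4, offsets)
full = frozenset(offsets)
interior = classes[(full, full)]
```

With these offsets the 4 x 4 grid splits into nine classes (corners, edges and interior), and every task in a class can be generated with the same explicit list of dependences, without resolving them at run time.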
Part of this work also involved our colleagues from the POLYFLOW associate-team at the Indian Institute of Science, Bangalore, India.
User-space scheduling and concurrent first-in first-out queues are two essential building blocks of parallel programming runtimes. They are, however, rarely used together since typical schedulers are oblivious to the ordering constraints introduced by buffered communication.
Chase and Lev's concurrent deque is a key data structure in shared-memory parallel programming and plays an essential role in work-stealing schedulers. We provided the first correctness proof of an optimized implementation of Chase and Lev's deque on top of the POWER and ARM architectures: these provide very relaxed memory models, which we exploit to improve performance but considerably complicate the reasoning. We also studied an optimized x86 and a portable C11 implementation, conducting systematic experiments to evaluate the impact of memory barrier optimizations. Our results demonstrate the benefits of hand tuning the deque code when running on top of relaxed memory models.
Based on this early success, we started working on a more global solution using a new lock-free algorithm for stalling and waking-up tasks in a user-space scheduler according to changes in the state of the corresponding queues. The algorithm is portable and correct, since it is written and proven against the C11 memory model. We showed through experiments that it can serve as a keystone to efficient parallel runtime systems.
These efforts underline the parallelizing compilation research for
During 2013, we worked on the formal verification of compilation steps in the compiler of a Lustre-like synchronous language. Two main directions have been taken:
The use of SMT-based
The development of a dedicated verification technique to prove the equivalence between a Lustre program and its sequential implementation. We plan to pursue this work during 2014. Cesare Tinelli will be a visiting professor for a month in June 2014.
Kalray 20K grant including the donation of an MPPA Developer workstation (with MPPA 256 accelerator) and support for a short-term research project (2 months of postdoc).
Google Doctoral Fellowships of Tobias Grosser and Robin Morisset.
ANR WMC project (program “jeunes chercheuses, jeunes chercheurs”), 2012–2016, 200 Keuros. F. Zappa Nardelli is the main investigator.
ANR Boole project (program “action blanche”), 2009-2014.
ANR Partout (program “defis”), 2009-2012. Louis Mandel and Marc Pouzet.
ANR CAFEIN, 2013-2015. Marc Pouzet.
Action d'envergure Synchronics, 2008-2012. The action was driven by Alain Girault (Inria, PopArt, Grenoble) and Marc Pouzet (Inria, Parkas, Paris-Rocquencourt), to focus on “languages for embedded systems”. This has been instrumental in driving our new research on hybrid system modelers.
FUI project OpenGPU, 2008–2012.
Sys2Soft contract (Briques Génériques du Logiciel Embarqué). Main partner: Dassault-Systèmes, etc. Inria contacts are Benoit Caillaud (HYCOMES, Rennes) and Marc Pouzet (PARKAS, Paris).
ManycoreLabs contract (Briques Génériques du Logiciel Embarqué). Main partner: Kalray. Inria contacts are Albert Cohen (PARKAS, Paris) and Alain Darte (COMPSYS, Lyon).
Marc Pouzet is scientific advisor for the Esterel-Technologies/ANSYS company.
Type: CAPACITIES
Challenge: Alternative Paths to Components and Systems
Instrument: Coordination and Support Action
Objective: Advanced Computing, Embedded and Control Systems
Duration: September 2013 – August 2016
Coordinator: Rainer Leupers
Partner: RWTH Aachen (Germany)
Inria contact: Albert Cohen
Abstract: coordination action to support bilateral technology transfer partnerships (TTPs); prototype of future H2020 transfer instruments.
Type: ARTEMIS
Challenge: Alternative Paths to Components and Systems
Instrument: ASP
Objective: NC
Duration: April 2013 – March 2016
Coordinator: Christian Fabre
Partner: CEA Leti (Grenoble)
Inria contact: Albert Cohen
Abstract: cognitive/smart cameras enabled by hardware accelerators, including manycore processors (STHORM platform of ST) and GPUs.
Duration: December 2012 - December 2014
Coordinator: EDF
Partner: Dassault-Systèmes, EDF, Institut Français du Pétrole, DLR (Munich, Germany), LMS-Imagine, Inria.
Inria contact: Benoit Caillaud (HYCOMES, Rennes); Marc Pouzet (PARKAS, Paris)
Title: Polyhedral Compilation for Data-Flow Programming Languages
Inria principal investigator: Albert Cohen
International Partner (Institution - Laboratory - Researcher):
IISc Bangalore (India) - Department of Computer Science and Automation - Albert Cohen
Duration: 2013 - 2016
See also: http://
Polyhedral techniques for program transformation are now used in several proprietary and open source compilers. However, most of the research on polyhedral compilation has focused on imperative languages such as C, where computation is specified in terms of statements with zero or more nested loops and other control structures around them. Graphical data-flow languages, where there is no notion of statements or of a schedule specifying their relative execution order, have so far not been studied using a powerful transformation or optimization approach. These languages are extremely popular in system analysis, modeling and design, and in embedded reactive control. They also underlie the construction of many domain-specific languages and compiler intermediate representations. The copy and execution semantics of data-flow languages impose a different set of challenges. We plan to bridge this gap by studying techniques that could enable the extraction of a polyhedral representation from data-flow programs, their transformation, and their synthesis from the equivalent polyhedral representation.
We have regular invited professors in the PARKAS team:
In 2012, one month (June/July), Prof. Stephen Edwards (Columbia Univ., New York, USA).
In 2013, one month (June), Prof. Mary Sheeran (Chalmers Univ., Sweden).
Pankaj Prateek, Anirudh Kumar, and Pankaj More, students at IIT Kanpur, India, worked in the Parkas team under the supervision of Francesco Zappa Nardelli from 4th May, 2013 to 23 July, 2013.
Guillaume Chelfi, student at Telecom Paris and the MPRI program, under the supervision of Francesco Zappa Nardelli and Marc Pouzet, from 1st of March, 2013, to 31st July, 2013. Guillaume Chelfi worked on the formal verification of the translation of synchronous programs to sequential code.
Louis Mandel supervised the 5-month MPRI internship of Louis Jachiet from April to August. Louis Jachiet worked on the static scheduling of ReactiveML programs.
Albert Cohen supervised the 3-month internship of Vincent Thiberville, 3rd year student at École Polytechnique, from April to June. Vincent conducted experimental studies and proposed enhanced methods to support array-based computations in the Heptagon synchronous language.
In October, Louis Mandel spent two weeks in the team of Vijay Saraswat at IBM T.J. Watson. He worked on the type system of the X10 language.
Albert Cohen was the program chair of CC 2014, the TPC chair of the DAC 2013 and 2014 ESS1 subcommittees, and the co-Program Chair of the APPT 2013 bi-annual Symposium on Advanced Parallel Processing Technology. Albert Cohen was also a member of the PC of PLDI 2014, and a member of the ERC of ASPLOS 2014, PPoPP 2014 and ICS 2014. Albert Cohen also participated in the PC of the IMPACT and HiRES workshops associated with HiPEAC 2014.
Albert Cohen is an associate editor of ACM TACO and IJPP (Springer).
Albert Cohen will be the general chair of PPoPP 2015.
Albert Cohen was the sponsor chair for the HiPEAC 2013 and HiPEAC 2014 conferences, and will serve as the exhibit and sponsor chair for HiPEAC 2015.
Marc Pouzet was a member of the PC of DAC 2014, AFADL 2014, MSR 2013, RTNS 2013, DATE 2013.
Marc Pouzet manages with Catherine Dubois (ENSIIE, Evry, France) the GDR (“Groupe de Recherche”) TLP (“Types, Langages et Preuves”) du CNRS (“Centre National de Recherche Scientifique”). Two one-day seminars are organised every year.
Licence: T. Bourke & J. Vuillemin, “Digital Systems”, 64h, L3, Ecole normale supérieure, France
Licence: L. Mandel, “Systèmes”, 42h, L3, Université Paris-Sud 11, France
Licence: L. Mandel & M. Pouzet, “Systèmes et réseaux”, 24h+24h, L3, Ecole normale supérieure, France
Licence: L. Mandel, “Langages de programmation et compilation”, 24h, L3, Ecole normale supérieure, France
Master: L. Mandel & M. Pouzet, “Synchronous Systems”, 8h+16h, M2, MPRI: Ecole normale supérieure and Université Paris Diderot, France
Master: A. Cohen & F. Zappa Nardelli, “Semantics, languages and algorithms for multicore programming”, 9h+14h, M2, MPRI: Ecole normale supérieure and Université Paris Diderot, France
Licence: “Components of a Computing System Introduction to Computer Architecture and Operating Systems” (L3), A. Cohen (44h), École Polytechnique, France
Master 1 École Polytechnique: “Operating Systems Principles and Programming” (M1), A. Cohen (38h), École Polytechnique, France
Marc Pouzet is supervising the national entry exam in computer science for École normale supérieure.
Marc Pouzet is director of studies (“Directeur des études”) for the CS department of École normale supérieure.
PhD : Léonard Gérard, Programmer le parallélisme avec des futures en Heptagon un langage synchrone flot de données et étude des réseaux de Kahn en vue d'une compilation synchrone, Université Paris-Sud 11, Orsay. Defended on 25 September 2013 at LRI, Orsay.
PhD : Cédric Pasteur, Raffinement temporel et exécution parallèle dans un langage synchrone fonctionnel, Université Pierre et Marie Curie (UPMC), defended on 26 November 2013 at Collège de France. Supervisors: Louis Mandel and Marc Pouzet.
PhD in progress : Guillaume Baudart, Real-time fidelity in Quasi-synchronous Systems, Start: 1/10/2013, Timothy Bourke and Marc Pouzet
PhD in progress : Robin Morisset, Compiler Optimisations and Concurrency, 1/10/2013, F. Zappa Nardelli
Albert Cohen was the president of the Habilitation Thesis committee of Fabien Coelho, MINES ParisTech.
Albert Cohen was the president of the PhD thesis committee of Bruno Bodin, UPMC (CIFRE Kalray).
Albert Cohen was a reviewer for the PhD thesis of Martin Schindewolf at the Karlsruhe Institute of Technology.
Albert Cohen was a reviewer for the PhD thesis of Daniel Cordes at TU Dortmund.
Albert Cohen was a reviewer for the PhD thesis of Yuriy Kashnikov, UVSQ.
Albert Cohen was an examiner in the PhD thesis committee of Thomas Preud'Homme, UPMC.
Marc Pouzet was a reviewer for the PhD thesis of Boris Golden, École Polytechnique.
Marc Pouzet was a reviewer for the PhD thesis of Gideon Smeding, Université de Grenoble.
Albert Cohen was a member of a hiring committee for professors at the University of Strasbourg.
Albert Cohen was a member of a hiring committee for assistant professors (“maître de conférences”) at the University Claude Bernard de Lyon.