An embedded architecture is an artifact of heterogeneous constituents and at the crossing of several design viewpoints: software, embedded in hardware, interfaced with the physical world. Time takes different forms when observed from each of these viewpoints: continuous or discrete, event-based or time-triggered. Modeling and programming formalisms that represent software, hardware and physics significantly alter this perception of time. Therefore, time reasoning in system design is usually isolated to a specific design problem: simulation, profiling, performance, scheduling, parallelization, simulation. The aim of project-team TEA is to define conceptually unified frameworks for reasoning on composition and integration in cyber-physical system design, and to put this reasoning to practice by revisiting analysis and synthesis issues in real-time system design with soundness and compositionality gained from formalization.
In the construction of complex systems, information technology (IT) has become a central force of revolutionary changes, driven by the exponential increase of computational power. In the field of telecommunication, IT provides the necessary basis for systems of networked distributed applications. In the field of control engineering, IT provides the necessary basis for embedded control applications. The combination of telecommunication and embedded systems into networked embedded systems opens up a new range of systems, capable of providing more intelligent functionalities, thanks to information and communication (ICT). Networked embedded systems have revolutionized several application domains: energy networks, industrial automation and transport systems.
20th-century science and technology brought us effective methods and tools for designing both computational and physical systems, such as for instance Simulink and Matlab. But the design of cyber-physical systems (CPS) is much more than the union of those two fields. Traditionally, information scientists only have a hazy notion of requirements imposed by the physical environment of computers. Similarly, mechanical, civil, and chemical engineers view computers strictly as devices executing algorithms. CPS design is, to date, mostly executed in this ad-hoc manner, without sound, mathematically grounded, integrative methodology. A new science of CPS design will allow to create machines with complex dynamics and high control reliability, and apply to new industries and applications, such as IoT or edge devices, in a reliable and economically efficient way. Progress requires nothing less than the construction of a new science and technology foundation for CPS that is simultaneously physical and computational.
Beyond the buzzword, a CPS is a ubiquitous object of our everyday life. CPSs have evolved from individual independent units (e.g. an ABS brake) to more and more integrated networks of units, which may be aggregated into larger components or sub-systems. For example, a transportation monitoring network aggregates monitored stations and trains through a large scale distributed system with relatively high latency. Each individual train is being controlled by a train control network, each car in the train has its own real-time bus to control embedded devices. More and more, CPSs are mixing real-time low latency technology with higher latency distributed computing technology.
A common feature found in CPSs is the ever presence of concurrency and parallelism in models. Large systems are increasingly mixing both types of concurrency. They are structured hierarchically and comprise multiple synchronous devices connected by buses or networks that communicate asynchronously. This led to the advent of so-called GALS (Globally Asynchronous, Locally Synchronous) models, or PALS (Physically Asynchronous, Logically Synchronous) systems, where reactive synchronous objects are communicating asynchronously. Still, these infrastructures, together with their programming models, share some fundamental concerns: parallelism and concurrency synchronization, determinism and functional correctness, scheduling optimality and calculation time predictability.
Additionally, CPSs monitor and control real-world processes, the dynamics of which are usually governed by physical laws. These laws are expressed by physicists as mathematical equations and formulas. Discrete CPS models cannot ignore these dynamics, but whereas the equations express the continuous behavior usually using real (irrational) variables, the models usually have to work with discrete time and approximate floating point variables.
A cyber-physical, or reactive, or embedded system is the integration of heterogeneous components originating from several design viewpoints: reactive software, some of which is embedded in hardware, interfaced with the physical environment through mechanical parts. Time takes different forms when observed from each of these viewpoints: it is discrete and event-based in software, discrete and time-triggered in hardware, continuous in mechanics or physics. Design of CPS often benefits from concepts of multiform and logical time(s) for their natural description. High-level formalisms used to model software, hardware and physics additionally alter this perception of time quite significantly.
In model-based system design, time is usually abstracted to serve the purpose of one of many design tasks: verification, simulation, profiling, performance analysis, scheduling analysis, parallelization, distribution, or virtual prototyping. For example in non-real-time commodity software, timing abstraction such as number of instructions and algorithmic complexity is sufficient: software will run the same on different machines, except slower or faster. Alternatively, in cyber-physical systems, multiple recurring instances of meaningful events may create as many dedicated logical clocks, on which to ground modeling and design practices.
Time abstraction increases efficiency in event-driven simulation or execution (i.e SystemC simulation models try to abstract time, from cycle-accurate to approximate-time, and to loosely-time), while attempting to retain functionality, but without any actual guarantee of valid accuracy (responsibility is left to the model designer). Functional determinism (a.k.a. conflict-freeness in Petri Nets, monotonicity in Kahn PNs, confluence in Milner's CCS, latency-insensitivity and elasticity in circuit design) allows for reducing to some amount the problem to that of many schedules of a single self-timed behavior, and time in many system studies is partitioned into models of computation and communication (MoCCs). Multiple, multiform time(s) raises the question of combination, abstraction or refinement between distinct time bases. The question of combining continuous time with discrete logical time calls for proper discretization in simulation and implementation. While timed reasoning takes multiple forms, there is no unified foundation to reason about multi-form time in system design.
The objective of project-team TEA is henceforth to define formal models for timed quantitative reasoning, composition, and integration in embedded system design. Formal time models and calculi should allow us to revisit common domain problems in real-time system design, such as time predictability and determinism, memory resources predictability, real-time scheduling, mixed-criticality and power management; yet from the perspective gained from inter-domain timed and quantitative abstraction or refinement relations. A regained focus on fundamentals will allow to deliver better tooled methodologies for virtual prototyping and integration of embedded architectures.
The challenges of team TEA support the claim that sound Cyber-Physical System design (including embedded, reactive, and concurrent systems altogether) should consider multi-form time models as a central aspect. In this aim, architectural specifications found in software engineering are a natural focal point to start from. Architecture descriptions organize a system model into manageable components, establish clear interfaces between them, collect domain-specific constraints and properties to help correct integration of components during system design. The definition of a formal design methodology to support heterogeneous or multi-form models of time in architecture descriptions demands the elaboration of sound mathematical foundations and the development of formal calculi and methods to instrument them.
System design based on the “synchronous paradigm” has focused the attention of many academic and industrial actors on abstracting non-functional implementation details from system design. This elegant design abstraction focuses on the logic of interaction in reactive programs rather than their timed behavior, allowing to secure functional correctness while remaining an intuitive programming model for embedded systems. Yet, it corresponds to embedded technologies of single cores and synchronous buses from the 90s, and may hardly cover the semantic diversity of distribution, parallelism, heterogeneity, of cyber-physical systems found in 21st century Internet-connected, true-time-synchronized clouds, of tomorrow's grids.
By contrast with a synchronous hypothesis, yet from the same era, the polychronous MoCC is inherently capable of describing multi-clock abstractions of GALS systems. Polychrony is implemented in the data-flow specification language Signal, available in the Eclipse project POP (Polychrony on Polarsys) and in the CCSL standard Clock Constraints in CCSL available from the TimeSquare project. Both provide tooled infrastructures to refine high-level specifications into real-time streaming applications or locally synchronous and globally asynchronous systems, through a series of model analysis, verification, and synthesis services. These tool-supported refinement and transformation techniques can assist the system engineer from the earliest design stages of requirement specification to the latest stages of synthesis, scheduling and deployment. These characteristics make polychrony much closer to the required semantic for compositional, refinement-based, architecture-driven, system design.
While polychrony was a step ahead of the traditional synchronous hypothesis, CCSL is a leap forward from synchrony and polychrony. The essence of CCSL is “multi-form time” toward addressing all of the domain-specific physical, electronic and logical aspects of cyber-physical system design.
To formalize timed semantics for system design, we shall rely on algebraic representations of time as clocks found in previous works and introduce a paradigm of "time system": refinement types that represent timed behaviors. Just as a type system abstracts data carried along operations in a program, a “time system” abstracts the causal interaction of that program module or hardware element with its environment, its pre- and post-conditions, its assumptions and guarantees, either logical or numerical, discrete or continuous. Some fundamental concepts we envision are present in the clock calculi found in data-flow synchronous languages like Signal or Lustre, yet bound to a particular model of timed concurrency.
In particular, the principle of refinement type systems 1, is to associate information (data-types) inferred from programs and models with properties pertaining, for instance, to the algebraic domain on their value, or any algebraic property related to its computation: effect, memory usage, pre-post condition, value-range, cost, speed, time, temporal logic 2. Being grounded on type and domain theories, such type systems system should naturally be equipped with program analysis techniques based on type inference (for data-type inference) or abstract interpretation (for program properties inference) to help establish formal relations between heterogeneous component “types”.
Gaining scalability requires the capacity to modularly decompose systems which can be obtained using Abadi and Lamport's “Composing Specifications” and implemented by the notion of assume-guarantee contracts or Dijkstra monads. Verification problems encompassing heterogeneously timed specifications are common and of great variety: checking correctness between abstract (e.g. the synchronous hypothesis) and concrete time models (e.g. real-time architectures) relates to desynchronisation (from synchrony to asynchrony) and scheduling analysis (from synchronous data-flow to hardware). More generally, they can be perceived from heterogeneous timing viewpoints (e.g. mapping a synchronous-time software on a real-time middleware or hardware).
This perspective demands capabilities to use abstraction and refinement mechanisms for time models (using simulation, refinement, bi-simulation, equivalence relations) but also to prove more specific properties (synchronization, determinism, endochrony). All this formalization effort will allow to effectively perform the tooled validation of common cross-domain properties (e.g. cost v.s. power v.s. performance v.s. software mapping) and tackle problems such as these integrating constraints of battery capacity, on-board CPU performance, available memory resources, software schedulability, to logical software correctness and plant controllability.
To address the formalization of such cross-domain case studies, modeling the architecture formally plays an essential role. An architectural model represents components in a distributed system as boxes with well-defined interfaces, connections between ports on component interfaces, and specifies component properties that can be used in analytical reasoning about the model.
In system design, an architectural specification serves several important purposes. First, it breaks down a system model into components of manageable size and complexity, to establish clear interfaces between components. In this way, complexity becomes manageable by hiding details that are not relevant at a given level of abstraction. Clear, formally defined, and semantically rich component interfaces facilitate integration by allowing most validation efforts to be conducted modularly. Connections between components, which specify how components interact with each other, help propagate the guaranteed effects of a component to the assumptions of linked components.
Most importantly, an architectural model is a repository to share knowledge about the system being designed. This knowledge can be represented as requirements, design artifacts, component implementations, held together by a structural backbone. Such a repository enables automatic generation of analytical models for different aspects of the system, such as timing, reliability, security, performance, energy, etc. Since all the models are generated from the same source, the consistency of assumptions w.r.t. guarantees, of abstractions w.r.t. refinements, used for different analyses becomes easier, and can be properly ensured in a design methodology based on formal verification and synthesis methods.
Based on sound formalization of time and CPS architectures, real-time scheduling theory provides tools for predicting the timing behavior of a CPS which consists of many interacting software and hardware components. Expressing parallelism among software components is a crucial aspect of the design process of a CPS. It allows for efficient partition and exploitation of available resources.
The literature about real-time scheduling 3 provides very mature schedulability tests regarding many scheduling strategies, preemptive or non-preemptive scheduling, uniprocessor or multiprocessor scheduling, etc. Scheduling of data-flow graphs has also been extensively studied in the past decades.
A milestone in this prospect is the development of abstract affine scheduling techniques 4. It consists, first, of approximating task communication patterns (e.g. between Safety-Critical Java threads) using cyclo-static data-flow graphs and affine functions. Then, it uses state of the art ILP techniques to find optimal schedules and to concretize them as real-time schedules in the program implementations 56.
Abstract scheduling, or the use of abstraction and refinement techniques in scheduling borrowed to the theory of abstract interpretation 7 is a promising development toward tooled methodologies to orchestrate thousands of heterogeneous hardware/software blocks on modern CPS architectures (just consider modern cars or aircrafts). It is an issue that simply defies the state of the art and known bounds of complexity theory in the field, and consequently requires a particular focus.
The IoT is a network of devices that sense, actuate and change our immediate environment. Against this fundamental role of sensing and actuation, design of edge devices often considers actions and event timings to be primarily software implementation issues: programming models for IoT abstract even the most rudimentary information regarding timing, sensing and the effects of actuation. As a result, applications programming interfaces (API) for IoT allow wiring systems fast without any meaningful assertions about correctness, reliability or resilience.
We make the case that the "API glue" must give way to a logical interface expressed using contracts or refinement types. Interfaces can be governed by a calculus – a refinement type calculus – to enable reasoning on time, sensing and actuation, in a way that provides both deep specification refinement, for mechanized verification of requirements, and multi-layered abstraction, to support compositionality and scalability, from one end of the system to the other.
Our project seeks to elevate the “function as type” paradigm to that of “system as type”: to define a refinement type calculus based on concepts of contracts for reasoning on networked devices and integrate them as cyber-physical systems 8. An invited paper 9 outlines our progress with respect to this aim and plans towards building a verified programming environment for networked IoT devices: we propose a type-driven approach to verifying and building safe and secure IoT applications.
Accounting for such constraints in a more principled fashion demands reasoning about the composition of all the software and hardware components of the application. Our proposed framework takes a step in this direction by (1) using refinement types to make physical constraints explicit and (2) imposing an event-driven programming discipline to simplify the reasoning of system-wide properties to that of an event queue. In taking this approach, a developer could build a verified IoT application by ensuring that a well-typed program cannot violate the physical constraints of its architecture and environment.
In collaboration with Mitsubishi R&D, we explore another application domain where time and domain heterogeneity are prime concerns: factory automation. In factory automation alone, a system is conventionally built from generic computing modules: PLCs (Programmable Logic Controllers), connected to the environment with actuators and detectors, and linked to a distributed network. Each individual, physically distributed, PLC module must be timely programmed to perform individually coherent actions and fulfill the global physical, chemical, safety, power efficiency, performance and latency requirements of the whole production chain. Factory chains are subject to global and heterogeneous (physical, electronic, functional) requirements whose enforcement must be orchestrated for all individual components.
Model-based analysis in factory automation emerges from different scientific domains and focuses on different CPS abstractions that interact in subtle ways: logic of PLC programs, real-time electro-mechanical processing, physical and chemical environments. This yields domain communication problems that render individual domain analysis useless. For instance, if one domain analysis (e.g. software) modifies a system model in a way that violates assumptions made by another domain (e.g. chemistry) then the detection of its violation may well be impossible to explain to either the software or chemistry experts. As a consequence, cross-domain analysis issues are discovered very late during system integration and lead to costly fixes. This is particularly prevalent in multi-tier industries, such as avionic, automotive, factories, where systems are prominently integrated from independently-developed parts.
We released the first fully verified implementation of "femto-containers": a virtual machine executing eBPF scripts (extended Berkeley Packet Filters) to provide kernel extensibility in the RIOT operating system 10. The workflow of our implementation consisted of 1) modelling an executable interpreter of the rBPF virtual machine in Coq; 2) formally verifying that the rBPF interpreter satisfies software faults isolation; 3) refining this proof model into an equivalent monadic representation; 4) transpiling this monadic representation as an imperative program in CompCert Clight using
CertrBPF includes a verified C verifier and interpreter. The verifier performs static checks to verify that the rBPF instructions are syntactically correct. If the verification succeeds, the program is run by the interpreter. The static verification and the dynamic checks ensure software fault isolation of the running rBPF propgram. Namely, we have the guarantee that:
The interpreter never crashes. More precisely, the proved C program is free of undefined behaviours such as division by zero or invalid memory accesses. All the memory accesses are performed in designated memory regions which are an argument of the interpreter.
The development of CertrBPF follows a refinement methodology with three main layers:
The proof model: an executable Coq specification of the rBPF virtual machine The synthesis model: a refined and optimised executable Coq program that is close in style to a C program. Eventually, we have the synthesis model (named dx model) which is compliant with the dx C extraction tool. The implementation model: the extracted C implementation in the form of CompCert Clight AST.
ADFG is a synthesis tool of real-time system scheduling parameters: ADFG computes task periods and buffer sizes of systems resulting in a trade-off between throughput maximization and buffer size minimization. ADFG synthesizes systems modeled by ultimately cyclo-static dataflow (UCSDF) graphs, an extension of the standard CSDF model.
Knowing the WCET (Worst Case Execute Time) of the actors and their exchanges on the channels, ADFG tries to synthezise the scheduler of the application. ADFG offers several scheduling policies and can detect unschedulable systems. It ensures that the real scheduling does not cause overflows or underflows and tries to maximize the throughput (the processors utilization) while minimizing the storage space needed between the actors (i.e. the buffer sizes).
Abstract affine scheduling is first applied on the dataflow graph, that consists only of periodic actors, to compute timeless scheduling constraints (e.g., relations between the speeds of two actors) and buffering parameters. Then, symbolic schedulability policies analysis (i.e., synthesis of timing and scheduling parameters of actors) is applied to produce the scheduler for the actors.
ADFG, initially defined to synthesize real-time schedulers for SCJ/L1 applications, may be used for scheduling analysis of AADL programs.
Our project aims at the formal verification of safety and security properties for embedded operating system libraries in the RIOT operating system, focusing on a virtual machine to implement kernel-extensibility by hosting runtime applications and services using the virtual instruction set architecture (ISA) of eBPF (extended Berkeley packet filter).
We implemented a fully verified rBPF virtual machine for RIOT's "femto-containers" in the proof assistant Coq 10. The workflow of our implementation consists of 1) modelling an executable interpreter of the rBPF virtual machine in Coq; 2) formally verifying that the rBPF interpreter satisfies software faults isolation; 3) refining this proof model into a monadic form; 4) transpiling this monadic representation as an imperative program in CompCert Clight; 4) proving a forward simulation relation between the monadic Coq model and the Clight implementation.
The resulting artifact: CertFC [6.1.1] was then benchmarked, validated, and integrated to RIOT's-featured femto-containers as an open-source library. A second publication reports this evaluation and its integration to RIoT OS 11.
The associate-team CONVEX yielded a number of important results in the development of theorem-proving models for cyber-physical systems verification, under the joint supervision of Xiong Xu with team TEA and Naijun Zhan's group at ISCAS, Beijing. In 8, we define an extension of Hoare&He’s Unifying Theories of Programming (UTP) with higher-order quantification and provide a formal semantics for modeling and verifying hybrid systems. Higher-order UTP (HUTP) provides a semantics foundation for cyber-physical systems (CPSs). It separates the concerns in CPS design into time, state and trace, which support the specification of discrete, real-time and continuous dynamics, concurrency and communication, and higher-order quantification. Within HUTP, we defined a calculus of normal hybrid designs to model, analyze, compose, refine and verify heterogeneous hybrid system models. In addition, we defined respective formal semantics for Hybrid Communicating Sequential Processes (HCSP) and Simulink using HUTP and justified the correctness of the translation from Simulink to HCSP by translation validation as an application of HUTP. We have started applying this framework to co-modeling real-time hardware platforms in AADL and hybrid discrete-continuous models in Simulink/Stateflow (AADL+S/S) to uniformly analyze and verify this combination in HUTP. In this aim, 9 provides, to our knwoledge, the simplest formalization of a semantic for Simulink.
We are furthering our study and application of the HUTP in the Isabelle/HOL theorem prover by developping a hybrid extension of Milner’s π-calculus. Our aim is to model, analyze and verify mobile hybrid systems, or more generally mobile and extensible hybrid systems as found in the IoT in the form of intermittent, mobile, distributed cyber-physical systems such as transportation grids in general (data, resources, vehicules).
First, based on our previous work on higher-order unifying theories of programming (HUTP), we will investigate the mechanism of mobile hybrid systems, i.e., an HUTP theory extended with mobility. Then, we will focus on the operational semantics of a hybrid extension to a generic π-calculus. Finally, we will refine this theory to a protocolar session calculus and build a session type system for hybrid mobile processes, to characterize important safety and security properties, such as determinism and isolation.
The goal of this PhD project is to build on the previous work done in Simon Lunel's PhD thesis. The goal is to ensure the correctness-by-design of cyber-physical system models. It deals with cyber-physical systems (CPS), which are assemblies of networked, heterogeneous, hardware, and software components sensing, evaluating, and actuating a physical environment. This heterogeneity induces complexity that makes CPSs challenging to model correctly. Since CPSs often have critical functions, it is however of utmost importance to formally verify them in order to provide the highest guarantees of safety. Faced with CPS complexity, model abstraction becomes paramount to make verification attainable. To this end, assume/guarantee contracts enable component model abstraction to support a sound, structured, and modular verification process.
While abstractions of models by contracts are usually proved sound, none of the related contract frameworks themselves have, to the best of our knowledge, been formally proved correct so far. In this aim, we present the formalization of a generic assume/guarantee contract theory in the Coq proof assistant. We identify and prove theorems that ensure its correctness. Our theory is generic, or parametric, in that it can be instantiated and used with any given logic, in particular hybrid logics, in which highly complex cyber-physical systems can uniformly be described.
We are pursuing the development of this model by instantiating this parametric theory with differential dynamic logic, a language to specify cyber-physical sytems. This instantiation provides a use case for the meta-theory of contracts. To this end, we are using CoqDL, a formalization of differential dynamic logic that was developed by the Logical Systems Lab of Carnegie Mellon University.
We continued our research involving the development of advanced signal processing dataflow methods for the ADFG tool. This research is collaboratively developed with the TEA Team, Hai Nam Tran (Lab-STICC/UBO), Alexandre Honorat (INSA) and Shuvra Bhattacharyya (UMD/INSA/INRIA). Our emphasis during this reporting period is the public release of the ADFG infrastructure on Gitlab 6.1.2.
Inria – Mitsubishi Electric framework program (2018+)
Mitsubishi Electric R&D Europe (2019-2022)
Jean-Pierre Talpin chairs the steering committee of the ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE) which, for its 20th anniversary, joined the ESWEEK event.
He also cochaired the program committee of the Symposium on Dependable Software Engineering: Theories, Tools and Applications (SETTA'22).
He participated in the program committee of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS XXII) and of the 1st International Workshop on Formal Engineering of Cyber-Physical Systems (FE-CPS@TASE'22).
Jean-Pierre Talpin served as rapporteur and president for the Ph.D. defense committee of Frédéric Fort, entitled "Programing adaptative real-time systems", at the University of Lille.
He also served as rapporteur for the Ph.D. defense committee of Nicolas Dejon, entitled "Conception d'un noyau sécurisé pour objets contraints", at the University of Lille.
Jean-Pierre Talpin surpervises the Ph.D. Thesis of Shenghao Yuan with Frédéric Besson, Inria Epicure (registered with University of Rennes, since 2020).
Jean-Pierre Talpin surpervises the Ph.D. Thesis of Shenghao Yuan with Benoït Boyer, MERCE (registered with University of Rennes, since 2019).