Formes
Formesstands for FORmal Methods for Embedded Systems. Formesis aiming at making research advances towards the development of safe and reliable embedded systems, by exploiting synergies between two different approaches, namely (real time) hardware simulation and formal proofs development.
Embedded systems have become ubiquitous in our everyday life, ranging from simple sensors to complex systems such as mobile phones, network routers, airplane, aerospace and defense apparatus. As embedded devices include increasingly sophisticated hardware and software, the development of combined hardware and software has become a key to economic success.
The development of embedded systems uses hardware with increasing capacities. As embedded devices include increasingly sophisticated hardware running complex functions, the development of software for embedded systems is becoming a critical issue for the industry. There are often stringent time to market and quality requirements for embedded systems manufacturers. Safety and security requirements are satisfied by using strong validation tools and some form of formal methods, accompanied with certification processes such as DO178 or Common Criteria certification. These requirements for quality of service, safety and security imply to have formally proved the required properties of the system before it is deployed.
Within the context described above, the Formesproject aims at addressing the challenges of embedded systems design with a new approach, combining fast hardware simulation techniques with advanced formal methods, in order to formally prove qualitative and quantitative properties of the final system. This approach requires the construction of a simulation environment and tools for the analysis of simulation outputs and proofs of properties of the simulated system. We therefore need to connect simulation tools with code-analyzers and easy-to-use theorem provers for achieving the following tasks:
Enhance the hardware simulation technologies with new techniques to improve simulation speed, and produce program representations that are adequate for formal analysis and proofs of the simulated programs ;
Connect validation tools that can be used in conjunction with simulation outputs that can be exploited using formal methods ;
Extend and improve the theorem proving technologies and tools to support the application to embedded software simulation.
A main novelty of the project, besides improving the existing technologies and tools, relies in the application itself: to combine simulation technologies with formal methods in order to cut down the development time for embedded software and scale up its reliability. Apart from being a novelty, this combination is also a necessity: proving very large code is unrealistic and will remain so for quite some time; and relying only on simulation for assessing critical properties of embedded systems is unrealistic as well.
We assume that these properties can be localized in critical, but small, parts of the code, or dedicated hardware models. This nevertheless requires scaling up the proof activity by an order of magnitude with respect to the size of codes and the proof development time. We expect that it is realistic to rely on both combined. We plan to rely on formal proofs for assessing properties of small, critical components of the embedded system that can be analyzed independently of the environment. We plan to rely on formal proofs as well for assessing correctness of the elaboration of program representation abstractions from object code. We plan to rely on simulations for testing the whole embedded system, and to formal proofs to verify the completeness of test sets. We finally plan to rely on formal proofs again for verifying the correct functioning of our tools. Proving properties of these various abstractions requires using a certified, interactive theorem prover.
The first open source version of the
SimSoC
Since October 2010, SimSoC contains an ARM processor simulator that has been generated automatically from the ARM documentation (see Section ).
CoqMT
The project Formeswas created in September 2008, by union of three different smaller groups which origin and interests were somewhat different : a group working on simulation of embedded systems at CASIA since march 2007 under the leadership of Vania Joloboff; a second group working on user-assisted theorem proving under the leadership of Jean-Pierre Jouannaud originated from the INRIA project-teams LogiCalat INRIA-Saclay-Ile-de-France and Protheoat INRIA-Lorraine; and a group working on model-checking and trustworthy computing at Tsinghua University under the leadership of Gu Ming. The second group moved from France to Beijing in September 2008. A previous 4 weeks visit of Jean-Pierre Jouannaud and Frédéric Blanqui in March 2008 had been used to define the new project Formes, and prepare its installation at Tsinghua university.
Formesis the acronym for FORmal Methods for Embedded Systems, and indeed we aim at combining in this project formal methods of very different origins for analyzing embedded systems. We develop a software (SimSoC) for simulatingembedded systems, but we also develop other techniques and tools in order to analyze and predict their behavior, and that of the software running on such systems. These techniques themselves are of different origin, and are usually developed in different teams around the world. Verificationtechniques based on model checking have been extensively and successfully used in the past to analyze hardware systems. Decisions procedures, like SAT, are now common place to analyze specific software applications, such as scheduling. Proof assistants are more and more employed to cary out formal proofsof correctness of security protocols and more generally non-trivial pieces of software. One originality of our project is to combineall these techniques in order to achieve our goal : to design methods and tools allowing one to build reliable software, also called trustworthy computing.
In the next sections, we describe in more details these five areas, and their relationship to Formes.
The development of complex embedded systems platforms requires putting together many hardware components, processor cores, application specific co-processors, bus architectures, peripherals, etc. The hardware platform of a project is seldom entirely new. In fact, in most cases, 80 percent of the hardware components are re-used from previous projects or simply are COTS (Commercial Off-The-Shelf) components. There is no need to simulate in great detail these already proven components, whereas there is a need to run fast simulation of the software using these components.
These requirements call for an integrated, modular simulation environment where already proven components can be simulated quickly, (possibly including real hardware in the loop), new components under design can be tested more thoroughly, and the software can be tested on the complete platform with reasonable speed.
Modularity and fast prototyping also have become important aspects of simulation frameworks, for investigating alternative designs with easier re-use and integration of third party components.
The project aims at developing such a rapid prototyping, modular simulation platform, combining new hardware components modeling, verification techniques, fast software simulation for proven components, capable of running the real embedded software application without any change.
To fully simulate a complete hardware platform, one must simulate the processors, the co-processors, together with the peripherals such as network controllers, graphics controllers, USB controllers, etc. A commonly used solution is the combination of some ISS (Instruction Set Simulator) connected to a Hardware Description Language (HDL) simulator which can be implemented by software or by using a FPGA simulator. These solutions tend to present slow iteration design cycles and implementing the FPGA means the hardware has already been designed at low level, which comes normally late in the project and become very costly when using large FPGA platforms. Others have implemented a co-simulation environment, using two separate technologies, typically one using a HDL and another one using an ISS , , . Some communication and synchronization must be designed and maintained between the two using some inter-process communication (IPC), which slows down the process.
The idea we pursue is to combine hardware modeling and fast simulation into a fully integrated, software based (not using FPGA) simulation environment named SimSoC, which uses a single simulation loop thanks to Transaction Level Modeling (TLM) , combined with a new ISS technology designed specifically to fit within the TLM environment.
The most challenging way to enhance simulation speed is to simulate the processors. Processor simulation is achieved with Instruction Set Simulation (ISS). There are several alternatives to achieve such simulation. In interpretive simulation, each instruction of the target program is fetched from memory, decoded, and executed. This method is flexible and easy to implement, but the simulation speed is slow as it wastes a lot of time in decoding. Interpretive simulation is used in Simplescalar . Another technique to implement a fast ISS is dynamic translation , , which has been favored by many , , , in the past decade.
With dynamic translation, the binary target instructions are fetched from memory at run-time, like in interpretive simulation. They are decoded on the first execution and the simulator translates these instructions into another representation which is stored into a cache. On further execution of the same instructions, the translated cached version is used. Dynamic translation introduces a translation time phase as part of the overall simulation time. But as the resulting cached code is re-used, the translation time is amortized over time. If the code is modified during run-time, the simulator must invalidate the cached representation. Dynamic translation provides much faster simulation while keeping the advantage of interpretive simulation as it supports the simulation of programs that have either dynamic loading or self-modifying code.
There are many ways of translating binary code into cached data, which each come at a price, with different trade-offs between the translation time and the obtained speed up on cache execution. Also, simulation speed-ups usually don't come for free : most of time there is a trade-off between accuracy and speed.
There are two well known variants of the dynamic translation technology: the target code is translated either directly into machine code for the simulation host, or into an intermediate representation, independent from the host machine, that makes it possible to execute the code with faster speed. Both have pros and cons.
Processor simulation is also achieved in Virtual Machines such as QEMU and GXEMUL that emulate to a large extent the behavior of a particular hardware platform. The technique used in QEMU is a form of dynamic translation. The target code is translated directly into machine code using some pre-determined code patterns that have been pre-compiled with the C compiler. Both QEMU and GXEMUL include many device models of open-source C code, but this code is hard to reuse. The functions that emulate device accesses do not have the same profile. The scheduling process of the parallel hardware entities is not specified well enough to guarantee the compatibility between several emulators or re-usability of third-party models using the standards from the electronics industry (e.g. IEEE 1666).
A challenge in the development of high peformance simulators is to maintain simultaneously fast speed and simulation accuracy. In the Formesproject, we expect to develop a dynamic translation technology satisfying the following additional objectives:
provide different levels of translation with different degrees of accuracy so that users can choose between accurate and slow (for debugging) or less accurate but fast simulation.
to take advantage of multi-processor simulation hosts to parallelize the simulation;
to define intermediate representations of programs that optimize the simulation speed and possibly provide a more convenient format for studying properties of the simulated programs.
Another objective of the Formessimulation is to extract information from the simulated applications to prove properties. Running a simulation is exercising a test case. In most cases, if a test is failing, a bug has been found. One can use model checking tools to generate tests that can be run on the simulator to check whether the test fails or not on the real application. It is also a goal of Formessimulation activity to use such formal methods tools to detect bugs, either by generating tests, or by using formal methods tools to analyze the results of simulation sessions.
Coq
is one of the most popular proof
assistant, in the academia and in the industry. Based on the
Calculus of Inductive Constructions, Coq has three kinds of
basic entities: objects are used for computations (data,
programs, proofs are objects); types express properties of
objects; kinds categorize types by their logical structure.
Coq's type checker can decide whether a given object
satisfies a given type, and if a given type has a logical
structure expressed by a given kind. Because it is possible
to (uniformly) define inductive types such as lists,
dependent types such as lists-of-length-n, parametric types
such as lists-of-something, inductive properties such as
(
e
v
e
n
n)for some natural number
n, etc, writing small specifications in Coq is an easy
task. Writing proofs is a harder (non-automatable) task that
must be done by the user with the help of tactics. Automating
proofs when possible is a necessary step for dissemination of
these techniques, as is scaling up. These are the problems we
are interested in.
Modeling in Coq is not always as easy as argued. In Coq, a
powerful, very useful mecanism identifies expressions up to
computation. For example, identifying two lists of identical
content but respective lengths
m+
nand
n+
mis no problem if
mand
nare given integers, but does not work if
mand
nare unknowns, since
n+
m=
m+
nis a valid theorem of arithmetic
which cannot be proved by mere computation. It follows that
the statement
reverse(
l::
l') =
reverse(
l')::
reverse(
l)is not typable,
::standing for appending two lists.
This problem that seemingly innocent statements cannot be
written in Coq because they do not type-check has been
considered a major open problem for years. Blanqui, Jouannaud
and Strub have recently introduced a new paradigm named
Coq modulo Theories, in which computations do not
operate only on closed terms (as are
1 + 2and
2 + 1) but on open expressions of a
decidable theory (as is
n+
m=
m+
nin Presburger arithmetic). This
work started with the PhD thesis of Pierre-Yves Strub
Scaling up is yet another challenge. Modeling a large,
complex software is a hard task which has been addressed
within the Coq community in two different ways. By developing
a module system for Coq in the OCaml style, which makes it
possible to modularize proof developments and hence to
develop modular libraries. By developing a methodology for
modeling real programs and proving their properties with Coq.
This methodology allows to translate a JavaCard (tool
Krakatoa
Our last challenge is to rely on certified tools only. In particular, we decided to certify in Coq all extensions of Coq developed in the project: the core logic of CoqMT (a Calculus of Inductive Constructions encorporating Presburger arithmetic) has been certified with Coq. Of course, Coq itself cannot be reduced to CCI anymore, which makes the certification of the real logicof CoqMT a major challenge. The most critical parts of the simulator will also be certified. As for compilers, there are two ways to certify tools: either, the code is proved correct, or it outputs a certificate that can be checked. The second approach demands less man-power, and has the other advantage to be compatible with the use of tools taken from the shelves, provided these tools are open-source since they must be equipped with a mechanism for generating certificates. This is the approach we will favor for the theories to be used in CoqMT, as well as for the SMT prover to be developed. For the simulator SimSoC itself, we shall probably combine both approaches.
Some of these challenges require expertise in both rewriting and type theory. To maintain this combined expertise in Formes, we also carry out theoretical activities in these areas, even if they may sometimes appear remotely connected to the mainstream of our work on the verification of embedded systems. First and higher-order rewriting deal with relations on sets (abstract rewriting), term algebras (first-order rewriting), and binding algebras (higher-order rewriting), which are generated by a (usually finite) set of pairs. Important problems are few: termination (also called strong normalization) is the property of non-existence of infinite computations; confluence is the property that rewriting computations, although non-deterministic, return a unique result, hence define functions; Subject reduction is the property that computations preserve types. Since the third is usually easy to check, we are mostly interested in confluence and termination.
Model checking is an automatic formal verification technique . In order to apply the technique, users have to formally specify desired properties on an abstract model of the system under verification. Model checkers will check whether the abstract model satisfies the given properties. If model checkers are able to prove or disprove the properties on the abstract model, they report the result and terminate. In practice, however, abstract models can be extremely complicated, model checkers may not conclude with reasonable computational resources.
Compositional reasoning is a way to ameliorate the complexity in abstract models . Compositional reasoning tries to prove global properties on abstract models by establishing local properties on their components. If local properties on components are easier to verify, compositional reasoning can improve the capacity of model checking by local reasoning. Experiences however suggest that local reasoning may not suffice to establish global properties. It is rare that a global property can be established without considering their interactions. In assume-guarantee reasoning, model checkers try to verify local properties under a contextual assumption of each component. If contextual assumptions faithfully capture interactions among components, model checkers can conclude the verification of global properties.
Finding contextual assumptions however is difficult and may require clairvoyance. Interestingly, a fully automated technique for computing contextual assumptions was proposed in . The automated technique formalizes the contextual assumption generation problem as a learning problem. If properties and abstract models are formalized as finite automata, then a contextual assumption is nothing but an unknown finite automaton that characterizes the environment. Applying a learning algorithm for finite automata, the automated technique will generate contextual assumptions for assume-guarantee reasoning. Experimental results show that the automated technique can outperform a monolithic and explicit verification algorithm.
The success of the learning-based assume-guarantee
reasoning is however not satisfactory. Most verification
tools are using implicit algorithms. In fact, implicit
representations such as Binary Decision Diagrams can improve
the capacity of model checking algorithms in several order of
magnitudes. Early learning-based techniques, on the other
hand, are based on the
L*learning algorithm using explicit representations. If
a contextual assumption requires hundreds of states, the
learning algorithm will take too much time to infer an
assumption. Subsequently, early learning-based techniques
cannot compete with monolithic implicit verification
.
Recently, we propose assume-guarantee reasoning with implicit learning . Our idea is to adopt an implicit representation used in the learning-based framework. Instead of enumerating states of contextual assumptions explicitly, our new technique computes transition relations as an implicit representation of contextual assumptions. Using a learning algorithm for Boolean functions, the new technique can easily compute contextual assumptions with thousands of states. Our preliminary experimental results show that the implicit learning technique can outperform interpolation-based monolithic implicit model checking in several parametrized test cases such as synchronous bus arbiters and the MSI cache coherence protocol.
Learning Boolean functions can also be applied to loop invariant inference , . Suppose that a programmer annotates a loop with pre- and post-conditions. We would like to compute a loop invariant to verify that the annotated loop conforms to its specification. Finding loop invariants manually is very tedious. One makes a first guess and then iteratively refines the guess by examining the loop body. This process is in fact very similar to learning an unknown formula. Applying predicate abstraction and decision procedures, a learning algorithm for Boolean functions can infer loop invariants generated by a given set of atomic predicates. Preliminary experimental results show that the learning-based technique is effective for annotated loops extracted from source codes of Linux and SPEC2000 benchmarks.
Although implicit learning techniques have been developed for assume-guarantee reasoning and loop invariant inference successfully, challenges still remain. Currently, the learning algorithm is able to infer Boolean functions over tens of Boolean variables. Contextual assumptions over tens of Boolean variables are not enough. Ideally, one would like to have contextual assumptions over hundreds (even thousands) of Boolean variables. On the other hand, it is known that learning arbitrary Boolean functions is infeasible. The scalability of implicit learning techniques cannot be improved satisfactorily by tuning the learning algorithm alone. Combining implicit learning with abstraction will be essential to improve its scalability.
Our second challenge is to extend learning-based
techniques to other computation models. In addition to finite
automata, probabilistic automata and timed automata are also
widely used to specify abstract models. Their verification
problems are much more difficult than those for finite
automata. Compositional reasoning thus can improve the
capacity of model checkers more significantly. Recently, the
L*algorithm is applied in assume-guarantee reasoning for
probabilistic automata
. The new technique is
unfortunately incomplete. Developing a complete
learning-based assume-guarantee reasoning technique for
probabilistic automata and timed automata will be very useful
to their verification.
Through predicate abstraction, learning Boolean functions can be very useful in program analysis. We have successfully applied algorithmic learning to infer both quantified and quantifier-free loop invariants for annotated loops. Applying algorithmic learning to static analysis or program testing will be our last challenge. In the context of program analysis, scalability of the learning algorithm is less of an issue. Formulas over tens of atomic predicates usually suffice to characterize relation among program variables. On the other hand, learning algorithms require oracles to answer queries or generate samples. Designing such oracles necessarily requires information extracted from program texts. How to extract information will be essential to applying algorithmic learning in static analysis or program testing.
Decision procedures are of utmost importance for us, since they are at the heart of theorem proving and verification. Research in decision procedures started several decades ago, and are now commonly used both in the academia and industry. A decision procedure is an algorithm which returns a correct yes/no answer to a given input decision problem. Many real-world problems can be reduced to the decision problems, making this technique very practical. For example, Intel and AMD are developing solvers for their circuit verification tools, while Microsoft is developing decision procedures for their code analysis tools.
Mathematical logic is the appropriate tool to formulate a decision problem. Most decision problems are formulated as a decidable fragment of a first-order logic interpreted in some specific domain. On such, easy and popular fragment, is propositional (or Boolean) logic, which corresponding decision procedure is called SAT. Representing real problems in SAT often results in ackward encodings that destroy the logical structure of the original problem.
A very popular, effective recent trend is Satisfiability Modulo Theories (SMT) , a general technique to solve decision problems formulated as propositional formulas operating on atoms in a given background theory, for example linear real arithmetic. Existing approaches for solving SMT problems can be classified into two categories: lazymethod , and eagermethod . The eager method encodes an SMT problem into an equi-satisfiable SAT problem, while the lazy method employs different theory solvers for each theory and coordinates them appropriately. The eager method does allow the user to express her problem in a natural way, but does not exploit its logical structure to spped up the computation. The lazy approach is more appealing, and has prompted much interest in algorithms for the various background theories important in practice.
Our SMT solver aCiNO is based on the lazy approach. So far, it provides with two (popular) theories only: linear real arithmetic (LRA) and uninterpreted functions (UF). For efficiency consideration, the solver is implemented in an incremental way. It also invokes an online SAT solver, which is now a modified DPLL procedure, so that recovery from conflicts is possible. Our challenge here is twofold: first, to add other theories of interest for the project, we are currently working on fragments of the theory of arrays , . The theory of arrays is important because of its use for expressing loop invariants in programs with arrays, but its full first-order theory is undecidable. We are also interested in the theory of bit vectors, very much used for hardware verification.
Theory solvers implement state-of-the-art algorithms which
sophistication makes their correct implementation a delicate
task. Moreover, SMT solvers themselves employ a quite complex
machinery, making them error prone as well
Since the early days of software development, computer scientists have been interested in designing methods for improving software quality. Formal methods based on model checking, correctness proofs, common criteria certification, all address this issue in their own way. None of these methods, however, considers the trustworthiness of a given software system as a system-level property, requiring to grasp a given software within its environment of execution.
The major challenge we want to address here is to provide a framework in which to formalize the notion of trustworthyness, to evaluate the trustworthiness of a given software, and if necessary improve it.
To make trustworthiness a fruitful concept, our vision is to formalize it via a hierarchy of observability and controllability degrees: the more the software is observable and controllable, the more its behaviors can be trusted by users. On the other hand, users from different application domains have different expectations from the software they use. For example, aerospace embedded software should be safety-critical while e-commerce software should be insensitive to attacks. As a result, trustworthiness should be domain-specific.
A main challenge is the evaluation of trustworthiness. We believe that users should be responsible for describing the level of trustworthyness they need, in the form of formal requirements that the software should satisfy. A major issue is to come up with some predefined levels of trustworthyness for the major applicative areas. Another is to use stepwise refinement techniques to achieve the appropriate level of trustworthyness. These levels would then drive the design and implementation of a software system: the objective would be to design a model with enough details (observability) to make it possible to check all requirements of that level.
The other challenge is the effective integration of results obtained from different verification methods. There are many verification techniques, like simulation, testing, model checking and theorem proving. These methods may operate on different models of the software to be then executed, while trustworthness should measure our trust in the real software running in its real execution environment. There are also monitoring and analysis techniques to capture the characteristics of actual executions of the system. Integrating all the analyses in order to decide the trustworthiness level of a software is quite a hard task.
Simulation is relevant to most areas where complex embedded systems are used, not only to the semiconductor industry for System-on-Chip modeling, but also to any application where a complex hardware platform must be assembled to run the application software. It has applications for example in industry automation, digital TV, telecommunications and transportation.
aCiNO is an SMT (Satisfiability Modulo Theory) solver based on a Nelson-Oppen architecture, and written in C++. Currently, two popular theories are considered: linear real arithmetic (LRA) and uninterpreted functions (UF). A lazy approach is used for solving SMT problem. For efficiency consideration, the solver is implemented in an incremental way. It also invokes an online SAT solver, which is now a modified MiniSAT, so that recovery from conflict is possible. aCiNO is currently under testing. We plan a first release of aCiNO on January 2011, and a website shortly later. Participation to the SMT-COMP competition may occur already in 2011.
CoLoR is a Coq library on rewriting theory and termination of nearly 70,000 lines of code . It provides definitions and theorems for:
Mathematical structures: relations, (ordered) semi-rings.
Data structures: lists, vectors, polynomials with multiple variables, finite multisets, matrices.
Term structures: strings, algebraic terms with symbols of fixed arity, algebraic terms with varyadic symbols, simply typed lambda-terms.
Transformation techniques: conversion from strings to algebraic terms, conversion from algebraic to varyadic terms, arguments filtering, rule elimination, dependency pairs, dependency graph decomposition, semantic labelling.
Termination criteria: polynomial interpretations, multiset ordering, lexicographic ordering, first and higher order recursive path ordering, matrix interpretations.
Rainbow is a tool for automatically certifying termination certificates expressed in the CPF XML format used in the termination competition on termination . Termination certificates are translated and checked in Coq by using the CoLoR library.
Rainbow won the 2007 and 2008 competitions and was 2nd in the 2009 competition of certified termination provers . It did not take part in the 2010 competition organized only 6 months after the 2009 competition.
CoLoR and Rainbow are distributed under the CeCILL license
on
http://
CoqMT is an evoluation of the Coq proof assistant allowing to dynamically load decision procedures for first-order theories in the conversion checker of the Coq kernel. Users decide which Coq symbols are handled by the decision procedures through the use of mapping primitives. Having dynamic loading and mapping facilities allows users to write their own decision procedures or take anyone from the shelves and use them in Coq without any additional modification of the Coq source code. Because proofs in CoqMT may differ substantially from proofs in Coq, backwards compatibility with previous versions of Coq cannot be ensured.
At this moment, CoqMT comes with a predefined decision procedure for integer linear arithmetic which generates small certificates (unlike previously existing procedures).
CoqMT, the decision procedure for integer linear
arithmetic and the theory of dependent lists, are accessible
via a GIT repository accessible from
http://
Joint work with Jiaguang Sun (Tsinghua University, China).
EDOLA-PLC is an integrated tool for domain-specific modeling and verification of PLC applications. It is based on a domain-specific modeling language, EDOLA, to describe system models. It supports both model checking and automatic theorem proving techniques for verification. The goal of this tool is to possess both the usability in domain modeling, the reusability in its architecture and the capability of automatic verification.
For the moment, we have developed a prototype of the EDOLA language, which can easily describe the features of PLC applications like the scan cycle mechanism, the pattern of environment model, time constraints and five property patterns. TLA+ was chosen as the intermediate language to implement the automatic verification of EDOLA models. A prototype of EDOLA-PLC has also been developed, which comes along with an editor to help writing EDOLA models. To automatically verify properties on EDOLA models, it provides the interface for both a model checker TLC and a first-order theorem prover SPASS .
Joint work with Pierre Weis (INRIA Rocquencourt).
Moca is a construction functions generator for OCaml data types with invariants.
It allows the high-level definition and automatic management of complex invariants for data types. In addition, it provides the automatic generation of maximally shared values, independently or in conjunction with the declared invariants.
A relational data type is a concrete data type that declares invariants or relations that are verified by its constructors. For each relational data type definition, Moca compiles a set of construction functions that implements the declared relations.
Moca supports two kinds of relations:
predifined algebraic relations (such as associativity or commutativity of a binary constructor),
user-defined rewrite rules that map some pattern of constructors and variables to some arbitrary user's define expression.
The properties that user-defined rules should satisfy (completeness, termination, and confluence of the resulting term rewriting system) must be verified by a programmer's proof before compilation. For the predifined relations, Moca generates construction functions that allow each equivalence class to be uniquely represented by their canonical value.
Moca is distributed under QPL on
http://
SimSoC is an infrastructure to run simulation models which comes along with a library of simulation models. SimSoC allows its users to experiment various system architectures, study hardware/software partition, and develop embedded software in a co-design environment before the hardware is ready to be used. SimSoC aims at providing high performance, yet accurate simulation, and provide tools to evaluate performance and functional or non functional properties of the simulated system.
SimSoC is based on SystemC standard and uses Transaction Level Modeling for interactions between the simulation models. The current version of SimSoC is based on the open source libraries from the OSCI Consortium: SystemC version 2.2 and TLM 2.0.1 , . Hardware components are modeled as TLM models, and since TLM is itself based on SystemC, the simulation is driven by the SystemC kernel. We use standard, unmodified, SystemC (version 2.2), hence the simulator has a single simulation loop.
The first open source version of SimSoC, SimSoC v0.6.1, has been released in March 2010. It contains a full simulator for ARM V5 running at an average speed of 80 Millions instructions per second in mode DT2, and a simulator for the Power PC architecture with an average speed of 20 Mips in mode DT1. It represents about 70,000 lines of source code and includes:
Instruction Set Simulators. The ARM Version 5 architecture has been implemented with DT0, DT1, DT2 mode. The PowerPC 600 architecture with DT0 and DT1 mode. For both architectures, complete simulation models of the processor and MMU are provided, making it possible to run operating systems of the simulated platform. MIPS architecture in DT0 mode is under development.
A dynamic translator from binary programs to an internal representation. For the ARM architecture a compiler has been developed that generates the C++ translated code (for DT2), using parametrized specialization options.
Peripheral models including a serial line controller, a flash memory controller, an interrupt controller.
A utility to generate permanent storage for flash memory simulation; a compiler tool to generate instruction binary decoder.
Examples illustratating the use of the library and infrastructure.
In early November, we released a new version: SimSoC v0.7. This new version includes models of two ARM-based boards, an accurate floating-point simulator, new transport layers for network simulation, and an ARMv6 ISS.
SimSoC is distributed under LGPL on
https://
SimSoC-Cert is a set of tools that can automatically generate in various target languages (Coq and C) the decoding functions and the state transition functions of each instruction and addressing mode of the ARMv6 architecture manual (implemented by the ARM11 processor family) but the Thumb and coprocessor instructions. The input of SimSoC-Cert is the ARMv6 architecture manual itself.
Based on this, we first developed simlight(8000 generated lines of C, plus 1500 hand-written lines of C), a simulator for ARMv6 programs using no peripheral and no coprocessor. Next, we developed simlight2, a fast ARMv6 simulator integrated inside a SystemC/TLM module, now part of SimSoC v0.7.
We have released several versions of SimSoC:
version 0.6 in March. It was the first public release of SimSoC
version 0.7 in early November. This version provides two additional boards, new debuggers, new Ethernet transport layers, and a new floating point simulator.
version 0.7.1 on December 10th. This version adds the LLVM-based dynamic translation for ARMv6 and PowerPC, and fix a few bugs.
We have done the following work that is included in the last open-source:
A floating point unit simulator has been developed. This floating point simulator is an accurate simulator of the target architecture, although most of the software is architecture independent. This module is using the MPFR library from INRIA . Using MPFR function calls with appropriate parameters, a specific floating point architecture can be emulated exactly. This work is presented in .
The PowerPC ISS was improved to support DT2 mode dynamic translation, reaching a speed of 75 Mips.
Reusing the techniques we used for ARM and PowerPC, the MIPS 32 bits simulator has been improved with DT1 mode and DT2 mode. The MIPS simulator now runs at 67 Mips.
We have consolidated the debug framework so that the GDB debugger, or any GDB protocol based remote debugger, can now be fully used on SimSoC for both PowerPC and ARM.
A Linux user mode simulation module has been added. This user mode makes it possible to run application programs compiled for Linux directly on the simulator (without booting a simulated Linux).
We provide a SoC simulator for an existing ARM based SoC for which the technical documentation is publicly available, namely the Texas Instruments AM1707 chip. Although it is not a complete chip simulator, we have implemented a significant subset of this chip so that Linux kernel can boot and run with network support.
Beyond the above developments, we have started new work to improve simulation speed, with the goal to develop a new translation mode named DT3, and explored several ideas. We have investigated the LLVM software from University of Illinois to consider its integration within SimSoC. We have analyzed the LLVM software, written some experimental code and concluded we could benefit from this approach:
A possibility is to optimize the simulator code. A simulation typically has much dynamic information that is not known at compile time of the simulation code. However, after the simulation has started, a lot of configuration information has become static. Therefore it is possible to explore recompiling the simulation module, using constant propagation techniques to propagate the newly discovered static information. An experiment made by compiling some benchmarks with llvm-g++and such lazy optimizing has been done.
Another idea is to selectively compile to native code the most frequently executed sequence of simulated instructions. We have developed a first prototype of this DT3 mode, working both for ARMv6 and PowerPC. This first version has been included in the release 0.7.1. On our testbed, the DT3 mode is currently about 20% faster than the DT2 mode, but we have still a lot of optimizations to implement. An article presenting this first version has been submitted to a Chinese conference.
We are also studying an approach to parallelize simulation.
SimSoC code uses complex features of the C++ language and of the SystemC library. Moreover, achieving high simulation speeds requires complex optimizations, such as dynamic translation.
This complexity is problematic, because beyond speed, faithfulnessis required: all instructions have to be simulated exactly as described in the hardware specification. There is a strong need to strengthen the confidence that simulations results provide are indeed based on a model which is faithful to this specification. Intensive tests are a first answer. For instance, as SimSoC is able to run a Linux kernel on top of a simulated ARM, we know that many situations are covered. However it turned out, through further experiments, that it was not sufficient: wrong behaviors coming from rare instructions were observed after several months.
We therefore decided to certifythe simulator (hence the name SimSoC-Cert) using formal methods based on Coq, to prove that it conforms to the expected behavior. Currently, we focus on the certification of the CPU part of the ARMv6 architecture , and not ARMVv5 as for SimSoC itself.
A first issue is whether to certify an existing code or produce a new, certified code. A second is to be able to upgrade the code from ARMv6 to ARMvx, when a new architecture is released by ARM. Our answer to both has been to automatically generate a specification of ARMv6 and the corresponding code directly from the reference manual. Then, a Coq specification of the ARMv6 can be generated, or a code satisfying the Coq specification. Using an automatic generator avoids the kind of errors that would have been introduced by manual translation.
We have completed the formalization in Coq of the ARMv6
specification. The most tedious part is automatically
generated, using a parser of the reference manual and a Coq
generator. The OCaml code of the generator is 3.3 KLOC. It
generates 3.2 KLOC of Coq code. Additionally, there are 6
KLOC of Coq code (among which 2 KLOC are written by hand
and 4 KLOC are reused from CompCert
Using then this ARMv6 formalization, we have developed two CPU simulators:
simlight, which is a stand-alone simulator for the ARMv6 instruction set. This simulator is written in C (8 KLOC generated and 1.5 KLOC hand-written), and can simulate ARMv6 programs as long as they do not access any peripherals (excepted the physical memory) nor coprocessors. This simulator is used as an intermediate stage in the certification of SimSoC.
simlight2, which is an optimized simulator for the ARMv6 instruction set. This simulator can be either wrapped in a stand-alone simulator like simlight(1), or integrated in a SystemC/TLM module. This SystemC/TLM module has been added to SimSoC, and will replace the old hand-written ARM CPU simulator. Many optimizations are applied during generation, and so the generated simulator (74 KLOC generated and 4 KLOC hand-written) is as fast as a hand-written one.
A poster describing the generation of simlight2has been presented at APLAS, and an article has been accepted .
Taking again advantage of our reference manual reader, we also developed a massive test generator, which generates tests for the simlightand simlight2decoders. One serious bug and two minor bugs have been found by running the generated tests.
As more and more devices are becoming 'network-aware' and a set of connected devices is viewed in applications as a coherent sub-system with some properties that can be guaranteed, such as a set of power control devices in a given set of rooms. In order to validate such systems, one must perform networked test and simulate both the network and the network controller hardware.
To validate the application software of networked embedded systems, it is necessary to simulate the network hardware. As the goal is to test embedded application software, it is required to simulate the hardware interface so the software drivers can be run. Conversely the simulated hardware must receive application data packets from remote devices in the specified standard from the (real or simulated) physical layer. That type of network simulation amounts to defining an infrastructure such that each simulated platform can communicate with the application software driver and with the other simulators using the specified standard. Ideally a parametrized generic infrastructure should be defined such that a particular network controller simulator could be derived from the generic infrastructure by instantiating parameters. Testing a new application with a new network controller model should not require to redo everything from scratch. There is an issue of re-usability of the simulation components, which means the parametrization must be carefully designed. A second class of issues to be resolved in this area is the control of the simulation. To simulate particular application conditions, one must be able to control from a central test point what every system does, in order to execute test scenarios where each device performs in a particular way. This implies a protocol to control each individual simulation on the network.
We are looking forward in this part of the SimSoC project setting up a networked virtual prototyping tool to validate networked embedded systems. To our knowledge, some academic experiments have been done to simulate networked devices on multi-processing machines, but no virtual prototyping tool has been developed for simulating network hardware and test such network hardware over a network cluster in controlled conditions, for example with hardware fault injections at specific timings.
Our goal is to create a “Cluster in the Loop Simulator” to simulate embedded systems hardware (including network/telecom hardware) using Cloud computing to run and test embedded application (software) on the networked simulators.
We have completely redesigned the SimSoC network simulation part to that effect. The previous framework used a fixed centralized TCP-based protocol for simulating the Ethernet network. It has been redesigned to allow for easily changing this underlying transport protocol. Moreover, new transport protocols, based on UDP multicasting and packet injections, have been implemented, leading to a mean latency - on the simulated network - grossly divided by 4.
To measure performance enhancements of our new network simulator, we have developed a full system simulation of the Texas Instruments SoC AM1707, which includes an Ethernet 100 Mbits hardware controller. We have developed a SystemC/TLM model of this controller and a few necessary peripherals.
The Linux operating systems boots and runs on this simulated SoC. Using the simulated network controller, one can mount NFS remote file systems and work with acceptable speed. We have been able to boot a GNU/Linux operating system whose root files system is located on a network files server, and to entirely recompile the GNU coreutilsusing a tools chain located on this network files server within a few hours of simulation.
A main earlier achievement of the team is the Calculus of Presburger Inductive Constructions , which allows to have an extensional equality for Presburger arithmetic or other inductive types instead of an intensional equality, as is the rule since the very early days of Martin-Löf's type theory who first recognized that type checking becomes undecidable in presence of an extensional equality at all types. Recent extensions of the calculus are described below.
In , we describe a modification of the Calculus of Inductive Constructions allowing the use of decision procedures in the computation mechanism, which improves over our previous work. This procedure can actually use equations extracted from the proof context, a mechanism which raises challenging technical difficulties. In , , we give a new definition of the calculus without most of the restrictions made in our previous work, and proved its core logic in Coq. This development has been the basis of CoqMT, our new version of Coq. As a paradigmatic example, we developed the basic theory of dependent lists with CoqMT. Compared with the same development for non-dependent lists, very few modifications were necessary to carry out the proofs.
We have now started a new important generalization of the previous work, in order to abstract the properties of conversion needed for the first-order theory in CoqMT, and to include a predicative hierarchy of universes which is needed for some applications and brings us closer to the actual implementation. We have progressed when inductive types are restricted to weak elimination, and expect to have it written by the end of the year (the mechanism for extracting equations from the proof context is not considered here). The case of strong elimination appears to be doable if we are only interested in showing consistency of the calculus, but very challenging for showing strong normalization and decidability of type checking. Some recent progress in this direction made by Bruno Barras might help.
We plan a few more generalizations aiming at a better understanding of the mechanism for extracting equations from the proof context: the ability to consider polymorphic first-order theories, the extraction of equations from pattern matching definitions, and the extraction of equations from quantified equations. We believe that all these should follow from the current framework, requiring technical work but no new conceptual breakthrough.
This work has been carried out in part with Vincent Van Oostrom, from the university of Utrecht.
Invented by Vincent Van Oostrom, decreasing diagrams capture both kinds of diagrams arising from Newmann's Lemma and Hindley's Lemma: they indeed allow to reduce all known confluence methods to critical pairs computations, and a search of decreasing diagrams for them all, where decreasingness is measured by a well-founded order on proof steps.
In , we give a new simple proof of Van Oostrom's main theorem, and extend the method of decreasing diagrams to rewrite relations on a term algebra. We prove that the union of a terminating left-linear systems, and a non-terminating linear system is confluent provided the various critical pairs existing in in their combination have decreasing diagrams (with respect to some order built from the respective orders of both systems).
Our first goal now is to further simplify and generalize these results in order to get rid of the left-linearity assumption for the first system, and of the right-linearity assumption for the second. This would yield a true generalization of the well-known Knuth-Bendix-Huet confluence result for terminating systems, and at the same time of various critical-pair based results found in the literature for non-terminating systems. Good progress has been made along this path.
Our second goal is to develop a Coq library in order to search for and certify confluence proofs. We are waiting here for the release of the open source Coq library for confluence certification by Middledorp and his team in order to avoid duplicating efforts.
Joint work with Claude Kirchner (INRIA Bordeaux) and Colin Riba (ENS Lyon).
The confluence of untyped lambda-calculus with unconditional rewriting is now well understood. In , we investigate the confluence of lambda-calculus with conditional rewriting and provide general results in two directions.
First, when conditional rules are algebraic. This extends results of Müller and Dougherty for unconditional rewriting. Two cases are considered, whether beta-reduction is allowed or not in the evaluation of conditions. Moreover, Dougherty's result is improved from the assumption of strongly normalizing beta-reduction to weakly normalizing beta-reduction. We also provide examples showing that outside these conditions, modularity of confluence is difficult to achieve.
Second, we go beyond the algebraic framework and get new confluence results using a restricted notion of orthogonality that takes advantage of the conditional part of rewrite rules.
Joint work with Benjamin Monate (CEA, Laboratory LIST, Saclay).
In
, we are interested in proofs
of properties of infinite families of specifications, like
the family of dihedral groups of order
nfor some natural number
n, or the family of a multi-core hardware with
ncores for some natural number
n. So far, we have described a confluence test when
these families can be presented by parametrized words over
a finite alphabet of parametrized size, as in the case of
the example of dihedral groups.
Technically, we introduce parametrized rewrite systems for describing infinite families of finite string rewrite systems depending upon non-negative integer parameters, as well as ways to reason uniformly over these families. Unlike previous work, the vocabulary on which a rewrite system in the family is built depends itself on the integer parameters. Rewriting makes use of a toolkit for parametrized words which allows to describe a rewrite step made independently by all systems in an infinite family by a single, effective parametrized rewrite step. The main result is a confluence test for all systems in a family at once, based on a critical pair lemma classically based on computing (finitely many) overlaps between left-hand sides of parametrized rules and then checking for their joinability. Although parametrized rewriting is systematically non-terminating, we describe a termination test which succeeds if all systems in the family terminate.
The problem of showing termination (also called strong normalization in this context) of higher-order higher-type calculi such as the calculus of inductive constructions is well-know to be a very difficult question. The major step was done by Girard in his thesis with the notion of reducibility candidates, which are sets of terms enjoying appropriate closure properties so as to show strong normalization by induction on the structure of terms.
In a previous work with Albert Rubio from the Technical University of Catalogna , we introduced the higher-order recursive path ordering, which was further improved as the computability path ordering in collaboration with Frederic Blanqui and Albert Rubio . An important, still elusive property of this ordering is that it somehow makes explicit the computational content of Girard's method, therefore allowing to reduce strong normalization proofs to ordering comparisons. This work was carried out for simply typed lambda calculi.
Our project is to lift the definition (and well-foundedness proof) of the computability path ordering to higher-type calculi. So far, we have considered the case of Girard's system F (a fully polymorphic lambda calculus) for which we have preliminary results for which a weak form of redex creation only is allowed , .
Joint work with Adam Koprowski (MLstate).
Termination is an important property of programs; notably required for programs formulated in proof assistants. It is a very active subject of research in the Turing-complete formalism of term rewriting systems, where many methods and tools have been developed over the years to address this problem. Ensuring reliability of those tools is therefore an important issue.
In , Frédéric Blanqui and Adam Koprowski present the library CoLoR that formalizes important results of the theory of well-founded (rewrite) relations in the proof assistant Coq, and their application to the automated verification of termination certificates, as produced by termination tools.
Sidi Ould Biha formalized in Coq the subterm criterion and the notion of usable rules used in the dependency pairs framework , . In contrast with the formalization of previous criteria, these ones require classical logic and the axiom of choice. A similar work but in the proof assistant Isabelle/HOL is described in . It would therefore be interesting to compare the two formalizations.
Kim-Quyen Ly extended the library on integer polynomials in order to be able to use polynomials on any (ordered) ring. She also formalized other criteria for (strict) monotony in order to allow the use of polynomials with negative coefficients .
In recent years, several authors devised size-based termination criteria for ML-like function definitions (lambda-calculi with inductive types and case analysis) that can handle non-structural recursive calls , , . These criteria use a notion of size related to the semantics of inductive types. Roughly speaking, for data types like lists or trees, the size of a normal term is the height of its tree representation. Typing judgments are then extended to derive informations about the size of terms. Hence, termination can be ensured by using a combination of type checking and constraint solving on size annotations. In , we extend these works to rewriting-based function definitions, dependent types and more general notions of size. We therefore provide a powerful termination criterion for the combination of rewriting and beta-reduction in the Calculus of Constructions.
This works detail and extend in important ways the conference papers , . In particular, we show how to use other notions of size and allow a much larger class of rule left-hand sides in successor-based systems.
Joint work with Keiichirou Kusakari and Sho Suzuki from Nagoya University, Japan.
The static dependency pair method is a method for proving the termination of higher-order rewrite systems à laNipkow . It combines the dependency pair method introduced for first-order rewrite systems with the notion of strong computability introduced for typed lambda-calculi . Argument filterings and usable rules are two important methods of the dependency pair framework used by current state-of-the-art first-order automated termination provers , . In , we extend the class of higher-order systems on which the static dependency pair method can be applied. Then, we extend argument filterings and usable rules to higher-order rewriting, hence providing the basis for a powerful automated termination prover for higher-order rewrite systems.
In , we showed how inductive hypotheses can be manually inverted with small proof terms, using just dependent elimination with a diagonal predicate. The technique is “pure” (it works without any auxiliary type). It can also be used to discriminate, in some sense, the constructors of an inductive type of sort Prop in Coq.
The coq standard library contains a proof of the constructive description schema, which infers the sigma-existence (i.e., Set-existence) of a witness to a predicate from the regular existence (i.e., Prop-existence). One requires that the underlying set is countable and that the predicate is decidable. A much shorter proof than before is now contributed and distributed since the last release of Coq (8.3, ConstructiveEpsilon.v).
Array theory is essential to program verification. In SMT solvers, arrays are formalized by Equality and Uninterpreted Functions (EUF) . Arrays in EUF have an infinite number of unbounded elements, making extensional equality of two arrays in EUF undecidable. Using counter automata, it has been shown that the fragment with a single alternation of quantifiers is decidable , . This fragment is however far too restrictive for practical applications.
In , we investigate a first-order array theory of bounded elements (UABE). This fragment is of course very meaningful since its corresponds to a limitation of the physical world, and very useful, since it allows arbitrary nesting of quantifiers. In UABE, arrays contain bounded elements, that is elements of a finite set, such as machine real numbers in simple precision. By reducing to weak second-order logic with one successor (WS1S), we show that the proposed array theory is decidable. Finally, we show that the extension to unbounded elements is undecidable. Moreover, allowing linear arithmetic expressions in indices makes array theory of bounded elements unsolvable as well.
For complete SAT-based model checking, the interpolation-based algorithm can be more efficient than loop-free induction . Given two inconsistent Boolean formulas, their interpolant can be generated by a resolution proof of inconsistency. The size of resolution proofs however is exponential. A naïve implementation of the interpolation-based model checking algorithm is very likely to perform poorly in both time and space.
In this project, we implement an efficient interpolation-based model checking algorithm in NuSMV . A BDD-like data structure to represent interpolants is developed. The new data structure allows us to perform reduction and simplification on the fly. Preliminary experimental results show that our implementation is comparable to the BDD-based algorithm.
To ensure trust in the result of SAT-solvers, one has to prove the correction of the implementation inside a proof checker, or to modify this solver such that it produces a witness which can be checked by an external tool.
This work allows the certification of satisfiability (easy) as well as unsatisfiability (complex) proofs given by a set of regular input resolution proofs (or trace) as provided by the zChaffSAT-solver.
This development includes: i) a formalization of the resolution algorithm in Coq, ii) a formalization of the zChafftrace checker algorithm in Coq, and iii) a tool written in a mix of C++/ OCamltaking a zChafftrace and generating a Coqfile representing the initial SAT problem, along with a proof of the unsatisfiability of this problem. Checking zChafftraces is then reduced to compiling the file produced in the last step.
The development is done in Coq. It applies with minor modifications to other solvers, in particular to PicoSAT (for which the clauses in the output trace must be sorted). It is described in .
Joint work with William N. N. Hung (Synopsys Inc., USA) and Xiaoyu Song (Portland State University, USA).
Model checking suffers from the state explosion problem. Compositional abstraction and abstraction refinement have been investigated in many areas to address this problem. This paper considers the compositional model checking for timed systems. We present an automated approach which combines compositional abstraction and counter-example guided abstraction refinement (CEGAR) . Given a timed system, the proposed approach exploits the semantics of timed automata to procure its abstraction. Our approach is conservative. Hence, any safety property which holds on the abstraction is guaranteed to hold on the concrete model. In the case of a spurious counter-example, our proposed approach refines and strengthens the abstraction in a component-wise method. We implemented our method with the model checking tool Uppaal. Experimental results show promising improvements.
Joint work with Yu-Fang Chen (Academia Sinica, Taiwan), Edmund M. Clarke (CMU, USA), Azadeh Farzan (University of Toronto, Canada), Ming-Hsien Tsai (National Taiwan University, Taiwan), and Yih-Kuen Tsay (National Taiwan University, Taiwan).
In recent years, a learning-based technique for assumption generation has attracted lots of attention in assume-guarantee reasoning . The authors apply a learning algorithm for finite automata in order to infer an unknown assumption. The idea is theoretically elegant, but it does not appear to be scalable. It is not clear whether automated assume-guarantee reasoning can actually outperform monolithic verification .
In
, we propose a purely implicit
solution to the contextual assumption generation problem in
assume-guarantee reasoning. Instead of improving the
L*algorithm – a learning algorithm for finite
automata, our algorithm computes implicit representations
of contextual assumptions by the CDNF algorithm – a
learning algorithm for Boolean functions. We report three
parametrized test cases where our solution outperforms the
monolithic interpolation-based Model Checking
algorithm.
There are, of course, many learning algorithms for Boolean functions. It is of utmost importance to identify the most effective learning algorithm in practice.
In , we compare two learning algorithms for generating contextual assumptions in automated assume-guarantee reasoning. The CDNF algorithm implicitly represents contextual assumptions by a conjunction of DNF formulas, while the OBDD learning algorithm uses ordered binary decision diagrams as its representation. Using these learning algorithms, the performance of assume-guarantee reasoning is compared with monolithic interpolation-based Model Checking in parametrized hardware test cases.
Joint work with Soonho Kong (Seoul National University, Korea), Yungbum Jung (Seoul National University, Korea), and Kwangkeun Yi (Seoul National University, Korea).
Given a loop annotated with pre- and post-conditions, loop invariants are often used to prove that the annotated loop conforms to its specification. Finding loop invariants, however, is very tedious. Traditional fixpoint-based loop invariant generation techniques essentially concretize least fixed points in the abstract domain. Predicate abstraction and fixed point computation are the bottlenecks of the traditional approach.
By combining algorithmic learning, decision procedures, and predicate abstraction, we present an automated technique for finding loop invariants in propositional formulas . Given invariant approximations derived from pre- and post-conditions, our new technique exploits the flexibility in invariants by a simple randomized mechanism. The proposed technique is able to generate invariants for some Linux device drivers and SPEC2000 benchmarks in our experiments.
Joint work with Soonho Kong (Seoul National University, Korea), Yungbum Jung (Seoul National University, Korea), Cristina David (National University of Singapore, Singapore), and Kwangkeun Yi (Seoul National University, Korea).
Algorithmic learning is applied to loop invariant generation lately . Exploiting the precondition and post-condition from users, a learning algorithm is able to infer quantifier-free loop invariants for the annotated loop. The learning-based technique however is not applicable to loops with arrays. Since loop invariants for array manipulation often require quantification, the work is too restrictive for such loops.
By combining algorithmic learning, decision procedures, predicate abstraction, and simple templates, we present an automated technique for finding quantified loop invariants in . Our technique can find arbitrary first-order invariants (modulo a fixed set of atomic propositions and an underlying SMT solver) in the form of the given template and exploits the flexibility in invariants by a simple randomized mechanism. The proposed technique is able to find quantified invariants for loops from the Linux source, as well as for the benchmark code used in the previous works. Our contribution is a simpler technique than the previous works yet with a reasonable derivation power.
PLC stands for Programmable Logic Controller . PLCs are widely used in the industry. TLA (Temporal Logic of Actions) is a logic for specifying and reasoning about concurrent and reactive systems . It is the basis for TLA+, a complete specification language .
Joint work with Stephan Merz (INRIA Nancy, France).
We report on a method formally specifying and verifying PLCs in the specification language TLA+ . The specification framework is generic. It separates the description of the environment from that of the controller itself and its structure is consistent with the scan cycle mechanism used by PLCs. Specifications can be parametrized with the number of replicated components. We have validated our approach on a concrete case study, a controller for fire fighting equipment in a ship dock, and report on the results obtained for this case study.
Joint work with Xiaoyu Song (Portland State University, USA).
Timers play a pivotal role in PLC real-time embedded system applications. This work addresses the formal validation of PLC systems with timers in the theorem proving system Coq. The timer behavior is characterized formally. In , a refinement validation methodology is presented in terms of an abstract model and a concrete model. The refinement is calibrated by a mapping relation. The soundness of the methodology is shown in the proving system. An illustrative case study demonstrates the effectiveness of the approach.
Joint work with Xiaoyu Song (Portland State University, USA).
Timers play a pivotal role in PLC real-time embedded system applications. This work addresses the formal validation of PLC systems with timers in the theorem proving system Coq. The timer behavior is characterized formally. In , a refinement validation methodology is presented in terms of an abstract model and a concrete model. The refinement is calibrated by a mapping relation. The soundness of the methodology is shown in the proving system. An illustrative case study demonstrates the effectiveness of the approach.
In , we present a novel method to specify and verify PLC software systems with the theorem proving system Coq. Dependent inductive data types are harnessed to represent the component specifications. Modular and parametrized specification and verification are proposed. An illustrative example demonstrates the effectiveness of the method.
Property preservation refinement ensures that the properties of the original model still hold after refinement.
In this on-going resarch, we first design two kinds of refinement operators for timed automata. Then, invariance, absence, exist, and respond patterns for PLCs are modeled within the framework of timed automata, and we investigate the conditions that preserve these properties after refinement.
Joint work with Jianmin Wang (Tsinghua University, China).
Abstraction is a necessary step in model checking. When abstract models contain spurious behaviors, they have to be refined to remove such behaviors. Counterexample guided abstraction refinement method (CEGAR) is a well-known technique that automatically refines abstraction in software verification . In , we investigate improvements of this method.
First, we present an effective predicate abstraction macanism for program verification. We present a novel method to compute the abstract states, which narrows down their number. Second, an algorithm for property checking is given based on weighted-graphs, which aims at finding a shortest counter-example as early as possible. Third, we present a effective method for detecting linear equalities which can be used in CEGAR. Fourth, we describe an efficient method to constructing interpolants for a quantifier-free first-order logic formula. All these results are then used to verify PLC programs with timers within their environment in CEGAR.
Joint work with Xiaoyu Song (Portland State University, USA).
The notion of machine-closedness plays an important role in system specification . A system specification consists of two parts: a safety property and a liveness property. A specificationis machine-closed if the liveness property does not constrain the safety property. It is commonly agreed that machine-closedness should be considered as a sanity check. It can be shown that TLA+ specifications made of a transition system and a possibly countably infinite set of fairness constraints are machine-closed.
We present a formal proof of the machine-closedness
theorem for TLA+ in the theorem proving system Coq. A
shallow embedding is employed for the proof which is
independent of a concrete syntax. Fundamental concepts
needed to state the machine-closedness theorem are
addressed in the proof development. A useful proof pattern
of constructing a trace with desired properties is defined,
and a number of Coq reusable libraries are developed. The
proof scripts can be downloaded from
http://
Joint work with Xiaoyu Song (Portland State University, USA).
With the wide use of real-time systems, embedded systems and pervasive computing in everyday applications, time-sensitive systems (i.e. systems whose behavior is influenced by the passing of time) attract many people's attention. To formally analyze time-sensitive systems, it is necessary to represent time in the formalisms.
We present a method to ameliorate the usability of TLA+ in specifying and verifying time-sensitive systems . A real-time module RealTimeNew is introduced to encapsulate the definitions of commonly used time patterns. The basic patterns specify the time duration of an action or the time interval between actions. Advanced time patterns like delay, deadline, and timeout are further defined based on the basic ones. We then present a general framework to differentiate the temporal characterizations from system functionality with time constraints. The temporal specification is concise and provably as a refinement of its corresponding functional description without time.
TLA+ plays an important role in our modeling and verification framework. It is an intermediate language that connects the specification language EDOLA, model checking tools and the theorem proving system Coq. It is an ongoing work and started three months ago. The preliminary result is that, we have built a tool that can translate a single TLA+ module into Coq, but only a selected set of TLA+ operators is supported.
Verification techniques of colored Petri net (CPN) models include occurrence graphs, invariant analysis, deduction-based method, etc. However, these techniques have limitations. Theorem proving is a supplement for them. We provide a novel translation-based method to formally prove properties on a colored Petri net. In our method, TLA+ is chosen as the intermediate language. We give semantic translation rules from a CPN to a TLA+ model. The formal proof of a CPN is then turned to the proof of a TLA+ model, which can be easily done with the existent TLA+ proving rules and the automatic theorem provers. The translation from CPN to TLA+, fills the gap between CPN and the logic-based proving. A resource allocator example is used to explain the method and validate these rules. We also show that how the refinement relationship of two CPN models can be proved with our method.
The definition of a formal semantics of PLC programing language is an important step towards the certification of PLC programs. We defined an operational semantics of the Instruction List language for PLC that covers a large subset of the standard. It includes one type of timers used in PLC (On delay timers) and support the cyclic behavior of PLCs. We formalized this semantics in the proof assistant Coq. We used it to proof some properties about simple PLC programs widely used in car parks. In collaboration with Jan Olaf Blech from Fortiss (Germany), we are planing to use this semantics to prove some safety properties on an real case industrial example of PLC. This work was presented in a poster at the APLAS 2010 conference.
Joint work with Yuxin Deng (Jiaotong University, Shanghai) and Stéphane Grumbach (LIAMA/Netquest).
Netlog is a language designed and implemented in the Netquest project for describing protocols. Netlog has a precise semantics, provides a high level of abstraction thanks to its Datalog flavor and benefits from an efficient implementation. This makes it a very interesting target language for proofs of protocols. Netlog comes with two possible semantics: a synchronous semantics, better suited to tightly coupled parallel systems and an asynchronous semantics, better suited to distributed systems.
We designed a formal model of Netlog in Coq, where the two possible semantics are derived from common basic blocks. In a fully certified framework, a formal proof of the Netlog engine (running on each node) would be required. We don't attack this part at the moment: we assume that the implementation respects the general properties stated in our model and focus on the issues raised by the distributed model of computation provided by Netlog.
As a proof of concept, we applied this framework to an algorithm constructing a Breadth-First Search Spanning Tree (BFS) in a distributed system . The main point was to provide a methodology for proving that a property of global configurations is achieved from local moves.
Joint work with Zoé Drey (INRIA Bordeaux, France).
Various forms of pervasive computing environments are being deployed in an increasing number of areas including hospitals, homes and military settings. Entities in this environment provide rich functionalities (i.e. services). How to organize these heterogeneous and distributed entities to deliver user-defined services is challenging. Pantagruel is an approach to integrate a taxonomic description of a pervasive computing environment into a visual programming language . A taxonomy describes the relevant entities of a given pervasive computing area and serves as a parameter to a sensor-controller-actuator development paradigm. The orchestration of area-specific entities is supported by high-level constructs, customized with respect to taxonomic information. Pantagruel is also a language that describes and manages services. Furthermore, Pantagruel can be viewed as a high level service contract between the service designer and the program implementer. The work presents a formalization of Pantagruel, both its syntax and semantics. Four kinds of static properties are stated based on the formalization. Predicate abstraction based algorithms are designed to verify the properties.
The goal of this project contracted with Schneider Electric China is to develop a full system simulator for a System-on-Chip used by Schneider Electric in their automation product line.
The goal of this project is to complete the PowerPC simulator and compare its performance with another simulator used internally by Orange IT Labs.
SIVES
CCCBIP is a proposal coordinated by
Jean-François Monin for building a certified compilation
chain for BIP
Formesorganized the 2nd Asian-Pacific Summer
School on Formal Methods
Formesorganized jointly with Peking University the
ARTIST European Network of Excellence on Embedded Systems
Design its 2010 Summer School
Formesorganizes a weekly seminar which is a major local forum in the area of formal methods, with a steady participation of colleagues who come from the other nearby research institutions, CASIA, ISCAS and Peking University, to attend the presentations. All seminars are announced on our website, as well as the other relevant local seminars or events, in particular those taking place at ISCAS.
Formesis part
of the working group LTP on Languages, Types and Proofs
of the GDR GPL
Formesis part
of the working group LAC on Logic, Algebra and Calculus
of the GDR IM
Vania Joloboff was invited to give presentations at Northeastern University in Shenyang, Harbin Engineering University in Harbin, Tsinghua University Campus in Shenzhen, and the Shenzhen Institute of Advanced Technology.
Vania Joloboff visited CEA Saclay and LIP6 laboratory in April to discuss their respective simulation projects and technologies. He also visited ST Microelectronics in May and September and organized a meeting with ST Microelectronics, Schneider Electric, CEA, TIMA and Verimag laboratory to consider potential collaboration on national or international projects.
Vania Joloboff had several meetings with EADS representatives in China and France (visit in August) to discuss potential collaboration with LIAMA.
Jean-François Monin visited Jiaotong University BASICS from March 21 to March 24 and from June 13 to 18.
Jean-Pierre Jouannaud was invited by the European pavilion at the international exposition in Shanghai (Chinese-European research collaborations).
Jean-Pierre Jouannaud was invited to give a seminar talk at Tsinghua University, in the seminar of the computer science laboratory.
Jean-Pierre Jouannaud was keynote speaker at CSCM 2010 in Shanghai.
Jean-François Monin visited AIST and NII lab at Tokyo on April 27 and 28, and gave a talk “towards proving Netlog programs” at AIST.
Jean-François Monin visited Academia Sinica at Taipei on December 21, and gave a talk “Proving Netlog programs”.
Jean-François Monin visited VERIMAG and UJF in January, July and November 2010 to discuss about future collaboration on BIP and the candidature of UJF to the LIAMA consortium.
Hui Xiao visited Nanjing University to discuss collaboration on a simulation project.
Frédéric Blanqui is a member of the Steering Committee of the International Conference on Rewriting Techniques and Applications (RTA) from July 2010 to July 2013.
Frédéric Blanqui was PC member of the 12th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP'10) and of the 5th International Workshop on Higher-Order Rewriting (HOR'10).
Vania Joloboff was PhD jury for Marius Gligor “Fast Simulation Strategies and Adaptive DVFS Algorithms for Low Power MPSoCs”, University of Grenoble, September 2010.
Jean-Pierre Jouannaud was PC chair of the 25th IEEE Symposium on Logic in Computer Science (LICS'10).
Jean-Pierre Jouannaud was PC member of DCM 2009 and DCM 2010.
Jean-Pierre Jouannaud chaired the committee for the Gödel prize 2010 and the committee for the Kleene Prize 2010, and was a member of the committee for the LICS Test of Time Award 2010.
Jean-Pierre Jouannaud is a member of the LICS organizing committee.
Jean-Pierre Jouannaud is a member of the editorial board of the International Journal of Software and Informatics (IJSI).
Jean-Pierre Jouannaud is a guest co-editor of JACM (selection of 3 papers from LICS 2010), and a co-guest editor of LMCS (selection of papers from LICS 2010).
Jean-Pierre Jouannaud is a member of the advisory committee of Academia Sinica, Taipei, Taiwan.
Jean-François Monin was PhD jury for Serguei Lenglet “Bisimulations dans les calculs avec passivation”, University of Grenoble, January 2010.
Jean-François Monin was PhD jury and rapporteur for Johannes Kanig “Specification and Proof of Higher-Order Programs”, University of Paris-Sud, November 2010.
Jean-François Monin was HDR jury for Xavier Urbain “Preuve automatique : techniques, outils et certification”, University of Paris-Sud, November 2010.
Bow-Yaw Wang was a PC co-chair of the 4th IEEE International Symposium on Theoretical Aspects of Software Engineering (TASE'10).
Frédéric Blanqui supervised the 5-months internship of Kim-Quyen Ly on the formalization of rational polynomial interpretations in Coq.
Vania Joloboff supervised Chinese Master students Peng Shan and Sen Guo since January, and Xinlei Zhou since end of June. He is also supervising since early September French student William Kilque from France CPE engineering school doing “année de césure”.
Frédéric Blanqui, Jianqi Li, Jean-François Monin, Sidi Ould Biha, Xiaomu Shi, Pierre-Yves Strub, Lianyi Zhang, Qian Wang participated to the classes in the 2nd Asian-Pacific Summer School on Formal Methods.
Frédéric Blanqui gave a lecture on type checking and type inference and a lecture on termination of -reduction at the 2nd Asian-Pacific Summer School on Formal Methods.
Fei He and Bow-Yaw Wang gave a graduate course Formal Verification for Software Systemsat Tsinghua University (16 lectures, 2 hours each) from March 2010 to June 2010.
Jean-Pierre Jouannaud gave a lecture on modelization with automata at the school of Software, Tsinghua University, early 2010.
Jean-Pierre Jouannaud gave a lecture on first-order logic and a lecture on the Curry-Howard correspondence at the 2nd Asian-Pacific Summer School on Formal Methods.
Jean-François Monin gave two lectures on mathematical induction at the 2nd Asian-Pacific Summer School on Formal Methods.
Jean-François Monin gave an undergraduate course Formal Reasoning in Practiceat Tsinghua University (12 lectures, 3 hours each), from March 2010 to June 2010.
Jean-François Monin gave an undergraduate course Introduction to Interactive Proof of Softwareat Tsinghua University 16 lectures, 2 hours each) from September 2010 to January 2011.
Jean-François Monin gave 2 lectures on Coq for Foundations of Programming Languagesat the Institute of Software of the Chinese Academy of Sciences in November 2010.
Vania Joloboff gave a lecture at Tsinghua University on Virtual Prototyping techniques.
Vania Joloboff gave a 1 day tutorial at the ARTIST 2010 Summer School in China.
Cody Roux, PhD at INRIA Nancy in the Pareo team, supervised by Claude Kirchner (INRIA Bordeaux), Gilles Dowek (INRIA Saclay, École Polytechnique) and Frédéric Blanqui visited Formes from June 2 to July 1 to work on his thesis with Frédéric Blanqui.
Ming-Hsien Tsai, PhD student at National Taiwan University, supervised by Yih-Kuen Tsay (National Taiwan University), visited Formes from March 15 to June 2 to work on the formalization of a numerical abstraction for C programs with Bow-Yaw Wang and Jean-Pierre Jouannaud.
Jean-Jacques Lévy (INRIA), director of the MSR-INRIA Joint Center, visited Formesfrom August 17 to September 17, gave two introductory lectures on -calculus at the 2nd Asian-Pacific Summer School on Formal Methods, and six advanced lectures on -calculus at Tsinghua University.
Yu Zhang (USTC, China) gave a talk on April 9 on “Our Research on Verified Software: Compilation and Verification”.
Xinyu Jiang (USTC, China) gave a talk on April 9 on “Modular Verification of Dynamic Code Loading and Linking”.
David Pichardie (INRIA Rennes) gave a talk on April 16 on “A Certified Denotational Abstract Interpreter”.
Afonso Ferreira, CNRS Scientific Coordinator for International Affairs, visited LIAMA from May 19 to May 21.
Reynald Affeldt (AIST, Japan) gave a talk on May 28 on “Formal verification of cryptographic software in Coq”.
Rongjie Yan (VERIMAG) gave a talk on May 28 on “Incremental Component-based Construction and Verification using Invariants”.
Cody Roux (INRIA Nancy) gave a talk on June 25 on “Refinement types as Higher Order Dependency Pairs”.
Valérie Pécresse, French Minister of Higher Education and Research, visited LIAMA on July 5. Vania Joloboff hosted the visit with Chinese Director Tianzi Jiang.
Laurent Fribourg (CNRS, ENS Cachan) gave a talk on September 20 on “Simulation + Uncertainty = Model Checking”.
Patrick Devedjian, French Minister of Reflation, visited LIAMA on October 18. Jean-François Monin gave a presentation of Formesactivities.
José Meseguer (University of Illinois at Urbana-Champaign) gave a talk on November 15 on “Formalization and Correctness of the PALS Architectural Pattern for Distributed Real-Time Systems”.
Sophie Quinton (VERIMAG) gave a talk on November 23 on “Reasoning about Safety and Progress using Contracts”.
Kwangkeun Yi (Seoul National University, South Korea) gave a talk on December 3 on “Multi-Staged Program's Static Type System and Beyond”.