The general objective of the Toccata project is to promote formal specification and computer-assisted proof in the development of software that requires high assurance in terms of safety and correctness with respect to its intended behavior. Such safety-critical software appears in many application domains, such as transportation (e.g., aviation, aerospace, railway, and increasingly cars), communication (e.g., internet, smartphones), and health devices. The number of tasks performed by software is quickly increasing, together with the number of lines of code involved. Given the need for high assurance of safety in the functional behavior of such applications, providing automated (i.e., computer-assisted) methods and techniques that guarantee safety has become a major challenge. In the past and at present, the most widely used approach to check the safety of software is to run heavy test campaigns, which account for a large part of the cost of software development. Yet they cannot ensure that all bugs are caught, and remaining bugs may have catastrophic consequences (e.g., the Heartbleed bug in the OpenSSL library, discovered in 2014).

Generally speaking, software verification approaches pursue three goals: (1) verification should be sound, in the sense that no bugs should be missed, (2) verification should not produce false alarms, or as few as possible, (3) it should be as automatic as possible. Reaching all three goals at the same time is a challenge. A large class of approaches emphasizes goals (2) and (3): testing, run-time verification, symbolic execution, model checking, etc. Static analysis, such as abstract interpretation, emphasizes goals (1) and (3). Deductive verification emphasizes (1) and (2). The Toccata project is mainly interested in exploring the deductive verification approach, although we also consider the other ones in some cases.

In the past decade, significant progress has been made in the
domain of deductive program verification. This is emphasized by some
success stories of application of these techniques on industrial-scale
software. For example, the Atelier B system was used to develop
part of the embedded software of the Paris metro line
14 40 and other railway-related systems; a
formally proved C compiler was developed using the Coq proof
assistant 62; the L4-verified project developed a
formally verified micro-kernel with high security guarantees, using
analysis tools on top of the Isabelle/HOL proof
assistant 61. A bug in the JDK implementation of
TimSort was discovered using the KeY
environment 57 and a fixed version was
proved sound. Another sign of recent progress is the emergence of
deductive verification competitions (e.g.,
VerifyThis 2). Finally, a recent trend in the
industrial practice for development of critical software is to require
more and more guarantees of safety, e.g., the new DO-178C standard for
developing avionics software adds to the former DO-178B the use of
formal models and formal methods. It also emphasizes the need for
certification of the analysis tools involved in the process.

There are two main families of approaches for deductive
verification. Methods in the first family build on top of mathematical
proof assistants (e.g., Coq, Isabelle) in which both the model and the
program are encoded; the proof that the program meets its
specification is typically conducted in an interactive way using the
underlying proof construction engine. Methods from the second family
proceed by the design of standalone tools taking as input a program in
a particular programming language (e.g., C, Java) specified with a
dedicated annotation language (e.g., ACSL 36,
JML 46) and automatically producing a set of
mathematical formulas (the verification conditions) which are
typically proved using automatic provers (e.g., Z3 64,
Alt-Ergo 49, CVC4 35).
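To make the notion of verification condition concrete, here is a minimal sketch in Python. It computes the weakest precondition of a sequence of assignments and checks the resulting condition exhaustively over a small finite domain. The names `assign`, `wp` and `vc_holds` are hypothetical, and the brute-force check is only a stand-in for an automatic prover; actual tools produce logical formulas and send them to solvers.

```python
from itertools import product

def assign(var, expr):
    """Statement `var := expr`; `expr` maps a state (a dict) to a value."""
    return ("assign", var, expr)

def wp(stmts, post):
    """Weakest precondition of a sequence of assignments w.r.t. `post`.

    wp(x := e, Q) is Q evaluated in the state where x holds the value of e;
    sequences are handled by processing statements from last to first.
    """
    for _, var, expr in reversed(stmts):
        post = (lambda q, v, e: lambda s: q({**s, v: e(s)}))(post, var, expr)
    return post

def vc_holds(pre, stmts, post, variables, domain):
    """Check the condition `pre ==> wp(stmts, post)` on a finite domain.

    Assumption: exhaustive enumeration replaces the call to a prover.
    """
    condition = wp(stmts, post)
    for values in product(domain, repeat=len(variables)):
        state = dict(zip(variables, values))
        if pre(state) and not condition(state):
            return False
    return True

# Swapping x and y without a temporary variable.
swap = [
    assign("x", lambda s: s["x"] + s["y"]),
    assign("y", lambda s: s["x"] - s["y"]),
    assign("x", lambda s: s["x"] - s["y"]),
]
pre = lambda s: s["x"] == s["a"] and s["y"] == s["b"]
post = lambda s: s["x"] == s["b"] and s["y"] == s["a"]
ok = vc_holds(pre, swap, post, ["x", "y", "a", "b"], range(-3, 4))
```

A bounded check such as this one can only refute a condition, not prove it for all integers; soundness requires an actual proof.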

The first family of approaches usually offers a higher level of assurance than the second, but also demands more work to perform the proofs (because of their interactive nature), which makes these approaches harder for industry to adopt. Moreover, they generally do not allow direct analysis of a program written in a mainstream programming language like Java or C. The second family has benefited in the past years from the tremendous progress made in SAT and SMT solving techniques, allowing more impact on industrial practices, but suffers from a lower level of trust: errors may appear in any part of the proof chain (the model of the input programming language, the VC generator, the back-end automatic prover), compromising the guarantee offered. Moreover, while these approaches are applied to mainstream languages, they usually support only a subset of their features.

One of our original skills is the ability to conduct proofs by using automatic provers and proof assistants at the same time, depending on the difficulty of the program, and specifically the difficulty of each particular verification condition. We thus believe that we are in a good position to propose a bridge between the two families of approaches of deductive verification presented above. Establishing this bridge is one of the goals of the Toccata project: we want to provide methods and tools for deductive program verification that can offer both a high amount of proof automation and a high guarantee of validity. Indeed, an axis of research of Toccata is the development of languages, methods and tools that are themselves formally proved correct. Recent advances in the foundations of deductive verification include various aspects such as reasoning efficiently on bitvector programs 54 or providing counterexamples when a proof does not succeed 50.

A specifically challenging aspect of deductive verification methods is
how to deal with memory mutation in general, an issue that
appears under various similar forms, such as reasoning on mutable data
structures or on concurrent programs, with the common denominator of
tracking memory changes on shared data. The ability to track
aliasing is also key to specifying programs and
conducting proofs using the advanced notion of ghost code 7, a notion that can be pushed very far, as
demonstrated by our work on ghost monitors 48.

In industrial applications, numerical calculations are very common (e.g., control software in transportation). Typically they involve floating-point numbers. Some of the members of Toccata have an internationally recognized expertise on deductive program verification involving floating-point computations. Our past work includes a new approach for proving behavioral properties of numerical C programs using Frama-C/Jessie 34, various examples of applications of that approach 44, the use of the Gappa solver for proving numerical algorithms 52, and an approach to take architectures and compilers into account when dealing with floating-point programs 45, 66. We contributed to the CompCert verified compiler, regarding the support for floating-point operations 3. We also contributed to the Handbook of Floating-Point Arithmetic 65. A representative case study is the analysis and the proof of both the method error and the rounding error of a numerical analysis program solving the one-dimensional acoustic wave equation 42, 41. We published a reference book on the verification of floating-point algorithms with Coq 4. Our experience led us to the conclusion that the verification of numerical programs benefits greatly from combining automatic and interactive theorem proving 43, 44, 55. Verification of numerical programs is another main axis of Toccata.

Deductive program verification methods are built upon theorem provers to decide whether an expected proof obligation on a program is a valid mathematical proposition; hence working on deductive verification requires a certain amount of work on the design of theorem provers. We are involved in particular in the Alt-Ergo SMT solver, for which we designed an original approach for reasoning on arithmetic facts 6, 10; the Gappa tool dedicated to reasoning on rounding errors in floating-point computations 63; and the interval tactic to reason about real approximations 8. Proof by reflection is also a powerful approach for advanced reasoning about programs 9.

In the past, we have been more and more involved in the development of significantly large case studies and applications, such as the verification of matrix multiplication algorithms 5, the design of verified OCaml libraries 47, the realization of a platform for verification of shell scripts 38, 1, or the correct-by-construction design of an efficient library for arbitrary-precision arithmetic 9.

Our scientific programme detailed below is structured into four axes:

Let us conclude with more general considerations about our agenda of the next four years: we want to keep on

Permanent researchers: S. Conchon, J.-C. Filliâtre, C. Marché, G. Melquiond, A. Paskevich

This axis covers the central theme of the team: deductive verification, from the point of view of its foundations but also our will to spread its use in software development. The general motto we want to defend is “deductive verification for the masses”. A non-exhaustive list of subjects we want to address is as follows.

A significant part of the work achieved in this axis is related to the Why3 toolbox and its ecosystem, shown in Figure 1. The boxes with a red background correspond to the tools we develop in the Toccata team.

Permanent researchers: J.-C. Filliâtre, C. Marché, G. Melquiond, A. Paskevich

This axis concerns specifically the techniques for reasoning on programs where aliasing is the central issue. It covers methods based on type-based alias analysis and related memory models, on specific program logics such as separation logics, and on extended model checking. It applies to the analysis of C or C++ code, of Ada code involving pointers, and also of concurrent programs in general. The main topics planned are:

Permanent researchers: S. Boldo, C. Marché, G. Melquiond

We of course want to keep this axis which is a major originality of Toccata. The main topics of the next 4 years will be:

Permanent researchers: S. Boldo, S. Conchon, J.-C. Filliâtre, C. Marché, G. Melquiond, A. Paskevich

This axis covers applications in general. The applications we currently have in mind are:

The application domains we target involve safety-critical software, that is where a high-level guarantee of soundness of functional execution of the software is wanted. Currently our industrial collaborations or impact mainly belong to the domain of transportation: aerospace, aviation, railway, automotive.

Generally speaking, we believe that our increasing industrial impact is a representative success for our general goal of spreading deductive verification methods to a larger audience, and we are firmly committed to continuing such actions in the future.

Through the creation of the ProofInUse joint lab in 2014, with the AdaCore company, we have a growing impact on the community of industrial development of safety-critical applications written in Ada. See the web page for an overview of AdaCore's customer projects, in particular those involving the use of the SPARK Pro tool set. This impact involves both the use of Why3 for generating VCs on Ada source code, and the use of Alt-Ergo for performing proofs of those VCs.

The impact of ProofInUse can also be measured in terms of job creation: the first two ProofInUse engineers, D. Hauzar and C. Fumex, initially employed on temporary Inria positions, have now been hired on permanent positions at AdaCore. It is also worth noting that this effort allowed AdaCore to gain new customers; in particular, the application domains of deductive formal verification went beyond the historical domain of aerospace to include automotive, cyber-security, and health (artificial heart).

Impactful results were produced in the context of the CoLiS project for the formal analysis of Debian packages. A first important step was version 2 of the design of the CoLiS language done by B. Becker, C. Marché and other co-authors 39, which includes a modified formal syntax, an extended formal semantics, together with the design of concrete and symbolic interpreters. Those interpreters are specified and implemented in Why3, proved correct (following the initial approach for the concrete interpreter published in 2018 58 and an approach for symbolic interpretation 38), and finally extracted to OCaml code.

To make the extracted code effective, it must be linked together with a library that implements a solver for feature constraints 60, and also a library that formally specifies the behavior of basic UNIX utilities. The latter library is documented in detail in a research report 59.

A third result is a large verification campaign running the CoLiS toolbox on all the packages of the current Debian distribution. The results of this campaign were reported in another article 37 that was submitted to the TACAS conference in 2020 and eventually presented at the 2021 edition. The most visible side effect of this experiment is the discovery of bugs: more than 150 bug reports have been filed against various Debian packages.

The current plan for the continuation of the ProofInUse joint lab is to form a ProofInUse Consortium with a scope extending beyond Ada applications. We started to collaborate with the TrustInSoft company for the verification of C and C++ code, including the use of Why3 to design verified and reusable C libraries (ongoing CIFRE PhD thesis). We also started to collaborate with Mitsubishi Electric R&D Centre Europe in Rennes for a specific usage of Why3 for verifying embedded devices (logic controllers). The recent best paper award at the FMICS conference is a result of this last collaboration.

Our research activities make use of computers for developing software and developing formal proofs. A continuous integration methodology for mature software like Why3 is mandatory for ensuring a safe software engineering process for maintenance and evolution. We make the necessary efforts to keep the energy consumption of such a continuous integration process as low as possible.

Ensuring the reproducibility of proofs in formal verification is essential. It is thus mandatory to replay such proofs regularly to make sure that changes in our software do not lose existing proofs. For example, we need to make sure that the case studies in formal verification that we present in our gallery are reproducible. We also make the necessary efforts to keep the energy consumption for replaying proofs low, by doing it only when necessary.

As is widely accepted nowadays, the major sources of environmental impact of research are travel to international conferences by plane and renewal of electronic devices. The number of trips we made in 2021 remained very low with respect to previous years, of course because of the Covid pandemic. The impact on research was mitigated thanks to the possibility of participating in conferences using remote communication systems. We intend to continue limiting the environmental impact of our travels. Concerning the renewal of electronic devices, mainly laptops and monitors, we have always been careful to keep them usable for as long as possible.

Our research results aim at improving the quality of software, in particular in mission-critical contexts. As such, making software safer is likely to reduce the need for maintenance operations and thus to reduce energy costs.

Our efforts are mostly towards ensuring the safety of functional behavior of software, but we also increasingly consider the verification of their time or memory consumption. Reducing those would naturally induce a reduction in energy consumption.

Our research never involves any processing of personal data, and
consequently we have no concern about preserving individual privacy,
and no concern with respect to the RGPD (Règlement Général sur
la Protection des Données).

In recent years, more and more Computer Science topics have been
introduced in the high school curricula. This evolution resulted in an
increasing need for teachers with high skills in that domain. The
French ministry of education decided in 2021 to create a new
discipline in the concours de l'agrégation, which is the most
selective competition for recruiting high school teachers. To prepare
the first round of this recruiting competition, taking place in 2022,
Sylvie Boldo has been nominated as president of the competition
committee. Note that she is the only full-time researcher to
chair an agrégation committee this year.

The team of Jean-Christophe Filliâtre and Andrei Paskevich got the
Best overall team award. The team of Quentin Garchery and
Xavier Denis won the first place for the Best student team
award.

VerifyThis is a series of program verification competitions, which has taken place annually since 2011. The competition offers a number of challenges presented in natural language and pseudo-code. Participants have to formalize the requirements, implement a solution, and formally verify the implementation for adherence to the specification.

Cláudio Belo Lourenço and Claude Marché,
with co-authors from Mitsubishi Electric R&D (Rennes, France)
received the Best-Paper Award at the 26th International Conference
on Formal Methods for Industrial Critical Systems.

Their contribution Automated Verification of Temporal Properties of
Ladder Programs was valued by the jury as a “good example for how
formal methods can be used in industrial applications” with
“industrial interest for both legacy Ladder programs and programs to
be developed”.

The following lists all the software distributed publicly and for which at least one author is a member of the team.

CoqInterval is a library for the proof assistant Coq.

It provides several tactics for proving theorems on enclosures of real-valued expressions. The proofs are performed by an interval kernel which relies on a computable formalization of floating-point arithmetic in Coq.
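The idea of enclosing a real-valued expression by floating-point bounds can be sketched as follows. This is a small Python illustration with outward rounding; the `Interval` class and its methods are hypothetical and do not reflect the actual CoqInterval kernel, which is formalized and proved in Coq. It assumes Python 3.9 or later for `math.nextafter`.

```python
import math
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with floating-point bounds."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @staticmethod
    def around(x):
        """Enclose any real number whose nearest float is x."""
        return Interval(math.nextafter(x, -math.inf),
                        math.nextafter(x, math.inf))

    def __add__(self, other):
        # Round each bound outward by one float to absorb the rounding error.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi)
                    for b in (other.lo, other.hi)]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    def contains(self, q):
        """Exact membership test for a rational number q."""
        return Fraction(self.lo) <= q <= Fraction(self.hi)

# Enclosure of x * y + x for the real numbers x = 1/10 and y = 2/10,
# neither of which is exactly representable in binary floating point.
x = Interval.around(0.1)
y = Interval.around(0.2)
enclosure = x * y + x
```

The exact value 3/25 is guaranteed to lie inside `enclosure`, although no floating-point operation above is exact.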

The Marelle team developed a formalization of rigorous polynomial approximation using Taylor models in Coq. In 2014, this library was included in CoqInterval.

The Flocq library for the Coq proof assistant is a comprehensive formalization of floating-point arithmetic: core definitions, axiomatic and computational rounding operations, high-level properties. It provides a framework for developers to formally verify numerical applications.

Flocq is currently used by the CompCert verified compiler to support floating-point computations.

Coq version 8.14 integrates many usability improvements, as well as an important change in the core language. The main changes include:

- The internal representation of match has changed to a more space-efficient and cleaner structure, allowing the fix of a completeness issue with cumulative inductive types in the type-checker. The internal representation is now closer to the user-level view of match, where the argument context of branches and the inductive binders "in" and "as" do not carry type annotations.

- A new "coqnative" binary performs separate native compilation of libraries, starting from a .vo file. It is supported by coq_makefile.

- Improvements to typeclasses and canonical structure resolution, allowing more terms to be considered as classes or keys.

- More control over notation declarations and support for primitive types in string and number notations.

- Removal of deprecated tactics, notably omega, which has been replaced by a greatly improved lia, along with many bug fixes.

- New Ltac2 APIs for interaction with Ltac1, manipulation of inductive types and printing.

Many changes and additions to the standard library in the numbers, vectors and lists libraries. A new signed primitive integers library Sint63 is available in addition to the unsigned Uint63 library.

When verifying programs where the data have some recursive
structure, it is natural to make use of global invariants that are
themselves recursively defined. Though this is mathematically
elegant, this makes the proofs more complex, as the preservation of
these invariants now requires induction. In particular, this makes
the proofs less amenable to automation. An alternative is to use
local invariants attached to individual components of the structure
and which only involve a bounded number of elements. These are
called decentralized invariants. When the structure is
updated, the footprint of the modification only impacts a bounded
number of invariants and reestablishing them does not require
induction. In this paper 13, Filliâtre
illustrates this idea on three non-trivial programs, for which fully
automated proofs are achieved.
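As an illustration of the idea only (this is not one of the three programs of the paper, and it uses run-time checks rather than deductive proofs), consider a circular doubly linked list in Python. The names `Node`, `local_invariant` and `insert_after` are hypothetical. The invariant is attached to each node and mentions only its two neighbours, so an insertion has to re-establish it for three nodes, without any traversal of the list.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = self  # a single node forms a circular list
        self.next = self

def local_invariant(node):
    """Invariant attached to one node; it involves only its two neighbours."""
    return node.next.prev is node and node.prev.next is node

def insert_after(node, value):
    """Insert a new node after `node`; only two existing nodes are modified."""
    new = Node(value)
    succ = node.next
    new.prev, new.next = node, succ
    node.next = new
    succ.prev = new
    # The footprint of the update is bounded: three local invariants
    # must be re-established, whatever the length of the list.
    footprint = (node, new, succ)
    assert all(local_invariant(n) for n in footprint)
    return new

def to_list(start):
    """Values of the list in order, starting from `start`."""
    out, n = [start.value], start.next
    while n is not start:
        out.append(n.value)
        n = n.next
    return out

head = Node(1)
last = insert_after(head, 3)
middle = insert_after(head, 2)
```

A global, recursively defined invariant would instead state that the whole structure is a well-formed cycle, and proving its preservation would require induction over the list.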

This paper appears in a special issue “E pur si muove” of the Journal of Logical and Algebraic Methods in Programming, which is a tribute to José Manuel Esgalhado Valença on the occasion of his jubilation.

The growth of the computing capacities makes it possible to obtain
more and more precise simulation results, often
calculated in binary64. However, exascale computing is
pushing back the known limits, and the problem of accumulating
round-off errors could come back and require a further increase in
precision. But working with extended precision, regardless of the
method used, has a significant cost in memory, computation time and
energy. It is therefore important to measure the robustness of
the binary64 format by anticipating the future computing
resources in order to ensure its durability in numerical
simulations. For this purpose, W. Weens, T. Vazquez-Gonzalez and
L. Ben Salem-Knapp performed a set of numerical experiments
25. Those were performed with weak
floats which were specifically designed to conduct an empirical
study of round-off errors in hydrodynamic simulations and to build
an error model that extracts the part due to round-off error in the
results. This model confirms that errors remain dominated by the
scheme errors in the performed numerical experiments.
Other numerical experiments have been done in order to check
whether binary64 provides enough accuracy in the context of
hydrodynamics exascale computations 23.

Numerical simulations are carefully-written programs, and their correctness is based on mathematical results. Nevertheless, those programs rely on floating-point arithmetic and the corresponding round-off errors are often ignored. L. Ben Salem-Knapp, S. Boldo and W. Weens studied a specific simple scheme applied to advection, that is a particular equation from hydrodynamics dedicated to the transport of a substance. Their work shows a tight bound on the round-off error of the 1-dimensional and 2-dimensional upwind schemes, with an error roughly proportional to the number of steps. The error bounds are generic with respect to the floating-point format and exceptional behaviors are taken into account. Some experiments give an insight of the quality of the bounds 28.
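The round-off error of such a scheme can be observed empirically by running it twice, once in binary64 and once in exact rational arithmetic. The Python sketch below does so for a 1-dimensional upwind scheme with periodic boundary conditions. The function names, the initial data and the parameter `c` (standing for the CFL number) are assumptions made for this illustration; the code measures an error on one run and does not reproduce the proved bounds of the cited work.

```python
from fractions import Fraction

def upwind_step(u, c):
    """One step of the 1D upwind scheme, u[i] - c * (u[i] - u[i-1]).

    The boundary is periodic (u[-1] is the last cell). The same code runs
    on floats and on Fractions.
    """
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]

def roundoff_error(u0, c, steps):
    """Largest gap between the binary64 run and the exact rational run."""
    u_float = list(u0)
    u_exact = [Fraction(x) for x in u0]  # exact conversion of each float
    c_exact = Fraction(c)                # same coefficient, taken exactly
    for _ in range(steps):
        u_float = upwind_step(u_float, c)
        u_exact = upwind_step(u_exact, c_exact)
    return float(max(abs(Fraction(a) - b)
                     for a, b in zip(u_float, u_exact)))

# A hat-shaped initial profile on 16 cells, with values in [0, 1].
u0 = [min(i, 16 - i) / 8 for i in range(16)]
err_generic = roundoff_error(u0, 0.3, 50)
# With c = 0.5 and this dyadic data, every operation of the first
# 20 steps is exact in binary64, so no round-off error accumulates.
err_dyadic = roundoff_error(u0, 0.5, 20)
```

Comparing against an exact computation isolates the round-off error from the method error, since both runs apply the same scheme to the same data.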

Several new results were produced in the context of the CoLiS project for the formal analysis of Debian packages. A first important step is version 2 of the design of the CoLiS language done by B. Becker, C. Marché and other co-authors 39, which includes a modified formal syntax, an extended formal semantics, together with the design of concrete and symbolic interpreters. Those interpreters are specified and implemented in Why3, proved correct (following the initial approach for the concrete interpreter published in 2018 58 and the recent approach for symbolic interpretation mentioned above 38), and finally extracted to OCaml code.

To make the extracted code effective, it must be linked together with a library that implements a solver for feature constraints 60, and also a library that formally specifies the behavior of basic UNIX utilities. The latter library is documented in detail in a research report 59.

A third result is a large verification campaign running the CoLiS toolbox on all the packages of the current Debian distribution. The results of this campaign were reported in another article 37 that was presented at the TACAS conference in 2021. The most visible side effect of this experiment is the discovery of bugs: more than 150 bug reports have been filed against various Debian packages. A journal paper reporting updated experimental results using an improved implementation of the platform, and on the new Debian stable distribution, is under submission.

We have bilateral contracts which are closely related to a joint effort called the ProofInUse consortium. The objective of ProofInUse is to provide verification tools, based on mathematical proof, to industry users. These tools are aimed at replacing or complementing the existing test activities, whilst reducing costs.

This consortium is a follow-up of the former LabCom ProofInUse between Toccata and the SME AdaCore, funded by the ANR programme “Laboratoires communs”, from April 2014 to March 2017.

This collaboration is a joint effort of the Inria project-team Toccata and the AdaCore company which provides development tools for the Ada programming language. It is funded by a 5-year bilateral contract from Jan 2019 to Dec 2023.

The SME AdaCore is a software publisher specializing in providing software development tools for critical systems. A previous successful collaboration between Toccata and AdaCore enabled Why3 technology to be put into the heart of the AdaCore-developed SPARK technology.

The objective of ProofInUse-AdaCore is to significantly increase the capabilities and performance of the Spark/Ada verification environment proposed by AdaCore. It aims at integrating state-of-the-art verification techniques from academic research, via the generic environment Why3 for deductive program verification developed by Toccata.

This bilateral contract is part of the ProofInUse effort. This collaboration joins efforts of the Inria project-team Toccata and the company Mitsubishi Electric R&D (MERCE) in Rennes. It is funded by a bilateral contract of 3 years and 6 months from Nov 2019 to April 2023.

MERCE has strong and recognized skills in the field of formal methods. In the industrial context of the Mitsubishi Electric Group, MERCE has acquired knowledge of the specific needs of the development processes and meets the needs of the group in different areas of application by providing automatic verification and demonstration tools adapted to the problems encountered.

The objective of ProofInUse-MERCE is to significantly improve on-going MERCE tools regarding the verification of Programmable Logic Controllers and also regarding the verification of numerical C codes.

This bilateral contract is part of the ProofInUse effort. This collaboration joins efforts of the Inria project-team Toccata and the company TrustInSoft in Paris. It is funded by a bilateral contract of 24 months from Dec 2020 to Nov 2022.

TrustInSoft is an SME that offers the TIS-Analyzer environment for analysis of safety and security properties of source codes written in C and C++ languages. A version of TIS-Analyzer is available online, under the name TaaS (TrustInSoft as a Service).

The objective of ProofInUse-TrustInSoft is to integrate deductive verification in the TIS-Analyzer platform, with a special interest in the generation of counterexamples to help the user in case of proof failure.

A contract was signed in 2021 between the CEA-DAM
(“Direction des applications militaires”) and Toccata about
the management of the PhD thesis of Louise Ben Salem-Knapp with
William Weens (CEA-DAM) and Guillaume Perrin (CEA-DAM). The PhD
stopped in October 2021, which also ended the contract.

The topic of this PhD lies between computer science and applied mathematics. We consider algorithms from numerical analysis and verify their good behavior on computers. This behavior, proven under the assumption that computations are exact, can be invalidated by round-off errors and overflows due to floating-point arithmetic. We plan to study the impact of round-off errors in a hydrodynamics code. Hydrodynamics is the skeleton model of many physical models used in industry. It involves numerous technical, mathematical and numerical difficulties, which does not prevent its massive use in the simulation industry on increasingly complex problems. Today, solving such problems requires the use of supercomputers, as well as the implementation of algorithms adapted to massively parallel computation. The very large number of computations required to produce results raises the question of their numerical quality.

Clément Pascutto started a CIFRE PhD in June 2020, under the supervision of Jean-Christophe Filliâtre (at Toccata) and Thomas Gazagnaire (at Tarides). The subject of the PhD is the dynamic and deductive verification of OCaml programs and its application to distributed data structures.

Léo Andrès started a CIFRE PhD in October 2021, under the supervision of Jean-Christophe Filliâtre (at Toccata) and Pierre Chambart and Vincent Laviron (at OCamlPro). The subject of the PhD is the design, formalization, and implementation of a garbage collector for WebAssembly.

EMC2 is an ERC Synergy project that aims to overcome some of the current limitations in the field of molecular simulation and aims to provide academic communities and industrial companies with new generation, dramatically faster and quantitatively reliable molecular simulation software. This will enable those communities to address major technological and societal challenges of the 21st century in health, energy and the environment for instance.

Using computers to formulate conjectures and consolidate proof steps pervades all mathematics fields, even the most abstract. Most computer proofs are produced by symbolic computations, using computer algebra systems. However, these systems suffer from severe, intrinsic flaws, rendering computational correction and verification challenging. The FRESCO project aims to shed light on whether computer algebra could be both reliable and fast. Researchers will disrupt the architecture of proof assistants, which serve as the best tools for representing mathematics in silico, enriching their programming features while preserving their compatibility with their logical foundations. They will also design novel mathematical software that should feature a high-level, performance-oriented programming environment for writing efficient code to boost computational mathematics.

The last twenty years have seen the advent of computer-aided proofs in mathematics, and this trend is getting more and more important. Such proofs require various levels of numerical safety, from fast and stable computations to formal proofs of the computations. However, the necessary tools and routines are usually ad hoc, sometimes unavailable, or nonexistent. From a complementary perspective, numerical safety is also critical for complex guidance and control algorithms, in the context of increased satellite autonomy. We plan to design a whole set of theorems, algorithms and software developments that will allow one to study a computational problem at all (or any) of the desired levels of numerical rigor. Key developments include fast and certified spectral methods and polynomial arithmetic, with subsequent formal verifications. There will be a strong feedback between the development of our tools and the applications that motivate it.

The project, led by École Normale Supérieure de Lyon (LIP), started in February 2021 and lasts for 4 years. Partners: Inria (teams Aric, Galinette, Lfant, Marelle, Toccata), École Polytechnique (LIX), Sorbonne Université (LIP6), Université Sorbonne Paris Nord (LIPN), CNRS (LAAS).