The PAREO team aims to design and implement tools for the specification, analysis, and verification of software and systems. At the heart of our project is therefore the desire to study fundamental aspects of programming languages (logic, semantics, algorithmics, etc.) and to make major contributions to the design of new programming languages. An important part of our research effort is dedicated to the design of new fundamental concepts and tools to analyze existing programs and systems. To achieve this goal we focus on:
the improvement of theoretical foundations of rewriting and deduction;
the integration of the corresponding formal methods in programming and verification environments;
the practical applications of the proposed formalisms.
It is a common claim that rewriting is ubiquitous in computer science and mathematical logic, and indeed the rewriting concept appears everywhere from very theoretical settings to very practical implementations. Two extreme examples are the Unix mail system, which uses rules to rewrite mail addresses into canonical form, and the transition rules describing the behavior of tree automata. Rewriting is used in semantics to describe the meaning of programming languages, as well as in program transformations such as the re-engineering of Cobol programs. It is used to compute, implicitly or explicitly as in Mathematica or MuPAD, but also to perform deduction when describing a logic, a theorem prover, or a constraint solver by inference rules. It is of course central in systems that make the notion of rule an explicit and first-class object, such as expert systems, programming languages based on equational logic, algebraic specifications, functional programming, and transition systems.
In this context, the study of the theoretical foundations of rewriting has to be continued and effective rewrite-based tools should be developed. The extensions of first-order rewriting with higher-order and higher-dimensional features are hot topics, and these research directions naturally encompass the study of the rewriting calculus, of polygraphs, and of their interaction. The usefulness of these concepts becomes clearer when they are implemented, and considerable effort is therefore currently devoted to the development of expressive and efficient rewrite-based programming languages.
Programming languages are formalisms used to describe programs, applications, or software intended to be executed on given hardware. In principle, any Turing-complete language is sufficient to describe the computations we want to perform. In practice, however, the choice of the programming language matters, because it helps programmers to be effective and improves the quality of the software. For instance, a web application is rarely developed using a Turing machine or assembly language. By choosing an adequate formalism, it becomes easier to reason about the program and to analyze, certify, transform, optimize, or compile it. The choice of the programming language also has an impact on the quality of the software. By providing high-level constructs as well as static verifications, such as typing, we can influence the software design, allowing more expressiveness, more modularity, and better code reuse. This also improves the productivity of the programmer and contributes to reducing the presence of errors.
The quality of a programming language depends on two main factors. The first is its intrinsic design, which covers the programming model, the data model, the features provided by the language, and the semantics of its constructs. The second is the programmer and the targeted application. A language is not necessarily good for a given application if the concepts of the application domain cannot be easily manipulated in it. Similarly, it may not be good for a given person if its constructs are not correctly understood by the programmer.
In the Pareo group we target a population of programmers interested in improving the long-term maintainability and quality of their software, as well as their efficiency in implementing complex algorithms. Our privileged application domain is broad, since it concerns the development of transformations. This ranges from the transformation of textual or structured documents such as XML to the analysis and transformation of programs and models. It also includes the development of tools such as theorem provers, proof assistants, and model checkers, where the transformation of proofs and the transitions between states play a crucial role. In this context, the expressiveness of the programming language is important: complex encodings into low-level data structures should be avoided, and high-level notions such as abstract types and transformation rules should be provided instead.
It is now well established that the notions of term and rewrite rule are two universal abstractions well suited to model tree-based data types and the transformations performed upon them. Over the last ten years we have developed strong experience in designing and programming with rule-based languages. We have introduced and studied the notion of strategy, which is a way to control how the rules are applied. This provides the separation that is essential to isolate the logic and to make the rules reusable in different contexts.
To improve the quality of programs, it is also essential to have a clear description of their intended behaviors. For that, the semantics of the programming language should be formally specified.
There is still much progress to be made in these directions. In particular, rule-based programming can be made even more expressive by extending the existing matching algorithms to context matching or to new data structures such as graphs or polygraphs. New algorithms and implementation techniques have to be found to improve efficiency and make the rule-based programming approach effective on large problems. Separating the rules from the control is very important; this is done by introducing a language for describing strategies. We still have to invent new formalisms and new strategy primitives that are both expressive enough and theoretically well grounded. A challenge is to find a good strategy language we can reason about, for instance to prove termination properties.
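The separation of rules from control can be sketched as follows. This is a minimal illustration in Python, with hypothetical names, of the general idea of strategy combinators; it is not the strategy language of Tom or ELAN, only a sketch of the concept.

```python
# A minimal sketch of separating rewrite rules from the strategies that
# control their application. All names here are illustrative.

class Term:
    def __init__(self, op, *args):
        self.op, self.args = op, args
    def __eq__(self, other):
        return isinstance(other, Term) and self.op == other.op and self.args == other.args
    def __repr__(self):
        return self.op if not self.args else f"{self.op}({', '.join(map(repr, self.args))})"

# A rule is a partial function: it returns a new term, or None on failure.
def simplify_plus_zero(t):
    # x + 0 -> x
    if t.op == "plus" and len(t.args) == 2 and t.args[1] == Term("zero"):
        return t.args[0]
    return None

# Strategies are also term -> term-or-None functions, built from combinators.
def choice(s1, s2):
    def go(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return go

def all_sub(s):
    # Apply s to every immediate subterm; fail if any application fails.
    def go(t):
        new_args = [s(a) for a in t.args]
        return None if any(a is None for a in new_args) else Term(t.op, *new_args)
    return go

identity = lambda t: t

def bottom_up(s):
    # Innermost-style traversal: rewrite the children first, then the root.
    return lambda t: s(all_sub(bottom_up(s))(t))

# The same rule can be reused under different strategies; here, bottom-up.
simp = bottom_up(choice(simplify_plus_zero, identity))
t = Term("plus", Term("plus", Term("a"), Term("zero")), Term("zero"))
print(simp(t))  # a
```

Note that the rule itself knows nothing about traversal: the combinator `bottom_up` decides where and in what order it is applied, which is exactly the separation discussed above.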
On the static analysis side, new formalized typing algorithms are needed to properly integrate rule-based programming into existing host languages such as Java. The notion of traversal strategy deserves further study, in order to become more flexible while still guaranteeing that the result of a transformation is correctly typed.
The huge diversity of the rewriting concept is obvious, and when one wants to focus on the underlying notions it quickly becomes clear that several technical points have to be settled. For example, what kind of objects are rewritten? Terms, graphs, strings, sets, multisets, or something else? Once this is established, what is a rewrite rule? What is a left-hand side, a right-hand side, a condition, a context? And then, what is the effect of a rule application? This leads immediately to more technical concepts such as bound and free variables, substitutions and substitution application, matching, and replacement, all of which are specific to the kind of objects being rewritten. Once this is solved, one has to understand the meaning of applying a set of rules to (classes of) objects. And last but not least, depending on the intended use of rewriting, one would like to define an induced relation, a logic, or a calculus.
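For the first-order term case, the basic ingredients named above can be made concrete in a few lines. The following Python sketch (with illustrative names and a simple tuple encoding of terms) shows how matching a left-hand side against a subject computes a substitution, which is then applied to the right-hand side to produce the result of the rule application.

```python
# A sketch of the basic first-order rewriting ingredients: matching,
# substitution, and replacement. Variables are strings; a term is a pair
# (operator, list of subterms). Names and encoding are illustrative.

def match(pattern, subject, subst=None):
    """Syntactic matching: returns a substitution (dict) or None on failure."""
    subst = {} if subst is None else subst
    if isinstance(pattern, str):                    # a variable
        if pattern in subst and subst[pattern] != subject:
            return None                             # non-linear clash
        subst[pattern] = subject
        return subst
    op, args = pattern
    if not isinstance(subject, tuple) or subject[0] != op or len(subject[1]) != len(args):
        return None
    for p, s in zip(args, subject[1]):
        if match(p, s, subst) is None:
            return None
    return subst

def substitute(term, subst):
    """Replacement: apply a substitution to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    op, args = term
    return (op, [substitute(a, subst) for a in args])

# The rule plus(x, zero) -> x applied at the root of plus(s(zero), zero):
lhs = ("plus", ["x", ("zero", [])])
rhs = "x"
subject = ("plus", [("s", [("zero", [])]), ("zero", [])])
sigma = match(lhs, subject)
print(substitute(rhs, sigma))  # ('s', [('zero', [])])
```

As the text notes, each of these notions has to be redefined when the rewritten objects change (graphs, strings, multisets, ...); the sketch only covers the simplest, purely syntactic case.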
In this very general picture, we have introduced a calculus whose main design concept is to make all the basic ingredients of rewriting explicit objects, in particular the notions of rule application and result. We concentrate on term rewriting, we introduce a very general notion of rewrite rule, and we make rule application and result explicit concepts. These are the basic ingredients of the rewriting calculus, also called the ρ-calculus.
The study of the rewriting calculus has turned out to be extremely successful in terms of fundamental results and applications. Different instances of this calculus, together with their corresponding type systems, have been proposed and studied. The expressive power of the calculus has been illustrated by comparing it with similar formalisms, and in particular by giving a typed encoding of the standard strategies used in first-order rewriting and in classical rewrite-based languages such as ELAN and Tom.
Besides the theoretical transfer that can be performed via cooperations or scientific publications, an important part of the research done in the Pareo project-team is made available as software. Tom is our flagship implementation. It is available via the Inria Gforge (http://
Teaching: when (for good or bad reasons) functional programming is neither taught nor used, Tom is an interesting alternative for illustrating the notions of abstract data type and pattern matching in a Java object-oriented course.
Software quality: it is now well established that functional languages such as Caml are very successful at producing high-assurance software as well as tools used for software certification. In the same vein, Tom is very well suited to developing, in Java, tools such as provers, model checkers, or static analyzers.
Symbolic transformation: the use of formal anchors makes it possible to transform low-level data structures such as C structures or arrays using a high-level formalism, namely pattern matching, including associative matching. Tom is therefore a natural choice whenever a symbolic transformation has to be implemented in C or Java, for instance. Tom has been successfully used to implement the Rodin simplifier for the B formal method.
Prototyping: by providing abstract data types, private types, pattern matching, rules, and strategies, Tom allows quite complex prototypes to be developed in a short time. When Java is used as the host language, the full runtime library is available; combined with the constructs provided by Tom, such as strategies, this is a tremendous advantage.
One of the most successful transfers is certainly the use of Tom by Business Objects/SAP. Indeed, after benchmarking several other rule-based languages, they decided to choose Tom to implement part of their software. Tom is used in Paris, Toulouse, and Vancouver, and the standard representation it provides is used as an exchange format by the teams at these sites.
ATerm (short for Annotated Term) is an abstract data type designed for the exchange of tree-like data structures between distributed applications.
The ATerm library provides a comprehensive procedural interface for creating and manipulating ATerms in C and Java. The implementation is based on maximal subterm sharing and automatic garbage collection.
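The maximal subterm sharing mentioned above can be illustrated by the classical hash-consing technique: every structurally equal term is built only once, so equality becomes a pointer comparison and heavily shared trees take little memory. The sketch below is an illustration of this idea in Python, not the actual ATerm API (in particular, a real implementation such as ATerm also garbage-collects unused entries, which this sketch does not).

```python
# A hash-consing sketch of maximal subterm sharing: structurally equal
# terms are constructed only once and then shared. Illustrative only.

class Term:
    _table = {}  # hash-consing table: (op, ids of children) -> unique Term

    def __new__(cls, op, *args):
        # Children are already unique, so their ids identify them.
        key = (op, tuple(id(a) for a in args))
        if key not in cls._table:
            t = super().__new__(cls)
            t.op, t.args = op, args
            cls._table[key] = t
        return cls._table[key]

a = Term("f", Term("a"), Term("a"))
b = Term("f", Term("a"), Term("a"))
print(a is b)  # True: structural equality reduces to pointer equality
```

Besides constant-time equality tests, this representation makes it cheap to use terms as keys in memoization tables, which matters for the kind of transformation tools described in this report.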
We are involved (with the CWI) in the implementation of the Java version, as well as in the garbage collector of the C version. The Java version of the ATerm library is used in particular by Tom.
The ATerm library is documented, maintained, and available at the
following address:
http://
Since 2002, we have developed a new system called Tom. This system consists of a pattern matching compiler which is particularly well suited for programming various transformations on trees/terms and XML documents. Its design follows our experiments on the efficient compilation of rule-based systems. The main originality of this system is that it is language and data-structure independent. This means that the Tom technology can be used in a C, C++, or Java environment. The tool can be seen as a Yacc-like compiler that translates patterns into executable pattern matching automata. As with Yacc, when a match is found, the corresponding semantic action (a sequence of instructions written in the chosen underlying language) is triggered and executed. Tom supports sophisticated matching theories such as associative matching with neutral element (also known as list-matching). This kind of matching theory is particularly well suited to list-based or XML-based transformations, for example.
In addition to the notion of rule, Tom offers a sophisticated way of controlling rule application: a strategy language. Based on a clear semantics, this language makes it possible to define classical traversal strategies such as innermost, outermost, etc. Moreover, Tom provides an extension of pattern matching called anti-pattern matching, which is a natural way to specify complements (i.e., what should not be present for a rule to fire). Tom also supports the definition of cyclic graph data structures, as well as matching algorithms and rewrite rules for term-graphs.
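The associative (list) matching mentioned above differs from syntactic matching in that a pattern may match a subject in several ways, since a list variable can capture any sublist. The following Python sketch illustrates this by enumerating all matches; the encoding and names are illustrative and are not Tom syntax.

```python
# A sketch of associative (list) matching: a ('var*', name) element matches
# any sublist, so matching is non-deterministic and all solutions are
# enumerated. Illustrative encoding, not the Tom language.

def list_match(pattern, subject, subst=None):
    """Yield every substitution that matches pattern against subject."""
    subst = {} if subst is None else subst
    if not pattern:
        if not subject:
            yield dict(subst)
        return
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, tuple) and head[0] == 'var*':
        name = head[1]
        for i in range(len(subject) + 1):   # try every split point
            yield from list_match(rest, subject[i:], {**subst, name: subject[:i]})
    elif subject and subject[0] == head:
        yield from list_match(rest, subject[1:], subst)

# All ways to match (X*, 'a', Y*) against ['a', 'b', 'a']:
for s in list_match([('var*', 'X'), 'a', ('var*', 'Y')], ['a', 'b', 'a']):
    print(s['X'], s['Y'])
# [] ['b', 'a']
# ['a', 'b'] []
```

Enumerating all matches is what makes this theory convenient for list and XML transformations: a rule such as "some element 'a' occurs somewhere in the list" is expressed directly, without writing the traversal by hand.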
Tom is documented, maintained, and available at
http://
Control operators, such as call/cc in Scheme or SML, allow programs to access and manipulate their execution context. We study the behavioral theory of such calculi.
We have defined a calculus with higher-order polymorphic functions, recursive types with arrow and product type constructors, and set-theoretic type connectives (union, intersection, and negation). We study the explicitly-typed version of the calculus, in which type instantiation is driven by explicit instantiation annotations.
Several approaches for proving the confluence and termination of term rewriting systems have been proposed, and the corresponding techniques have been implemented in tools like AProVE and TTT2. On the other hand, there are relatively few works studying these properties in the context of strategic rewriting, and the corresponding results were generally obtained for specific strategies rather than within a generic framework. It would thus be interesting to reformulate these notions in the general formalism we have previously proposed and to establish confluence and termination conditions similar to those used in standard rewriting.
We have first focused on the termination property, targeting the rewriting strategies of the Tom language. We propose a direct approach which consists in translating Tom strategies into a rewriting system that is not guided by a given evaluation strategy, and we show that this systematic transformation preserves termination. This allows us to take advantage of the termination proof techniques available for standard rewriting and, in particular, to use existing termination tools (such as AProVE and TTT2) to prove the termination of strategic rewriting systems. The efficiency and scalability of these tools have a direct impact on the performance of our approach, especially for complex strategies for which a large number of rewrite rules may be generated. We have nevertheless proposed a meta-level implementation of the automatic transformation which significantly improves the performance of the approach.
The corresponding tool is available at http://
Quality is crucial for software systems, and several aspects have to be taken into account. Formal verification techniques such as model checking and automated theorem proving can be used to guarantee the correctness of finite or infinite systems. While these approaches provide a high level of confidence, they are sometimes difficult and expensive to apply. Software testing is another approach: although it cannot guarantee correctness, it can be very efficient at finding errors.
We have proposed a property-based testing framework for the Tom language, inspired by the frameworks proposed in the context of functional programming. The previously developed tool has been improved by integrating it into the JUnit framework. The tests are thus highly automated and the library can be smoothly integrated into classical IDEs. The relatively simple shrinking method, which searches for a smaller counter-example starting from an initial, more complex one, has also been improved.
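The generate-then-shrink loop at the heart of this approach can be sketched as follows. This is a minimal Python illustration of the general property-based testing idea (in the style of QuickCheck-like tools), not the Tom testing library itself; all names are illustrative.

```python
# A sketch of property-based testing with shrinking: generate random inputs,
# find a counter-example, then greedily shrink it. Illustrative names only.

import random

def check(prop, gen, shrink, runs=200, seed=42):
    rng = random.Random(seed)
    for _ in range(runs):
        x = gen(rng)
        if not prop(x):
            # Greedy shrinking: move to any smaller candidate that still fails.
            shrunk = True
            while shrunk:
                shrunk = False
                for y in shrink(x):
                    if not prop(y):
                        x, shrunk = y, True
                        break
            return x  # a small counter-example
    return None       # no counter-example found

# A deliberately wrong property: "reversing a list leaves it unchanged".
prop = lambda xs: list(reversed(xs)) == xs
gen = lambda rng: [rng.randint(0, 9) for _ in range(rng.randint(0, 10))]
shrink = lambda xs: [xs[:i] + xs[i+1:] for i in range(len(xs))]  # drop one element

print(check(prop, gen, shrink))  # a two-element counter-example, e.g. [1, 0]
```

Shrinking is what makes the reported counter-examples usable: the randomly generated failing input may be large, but the returned one is close to minimal for the property at hand.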
The library is available at
http://
Automated inductive reasoning for term rewriting has been extensively studied in the literature. Classes of equations and term rewriting systems (TRSs) with decidable inductive validity have been identified and used to automate inductive reasoning. We give procedures for deciding the inductive validity of equations in some standard TRSs on natural numbers and lists. Contrary to previous decidability results, our procedures can automatically decide the inductive validity of arbitrary equations for these TRSs without any induction reasoning, that is, without imposing any syntactic restrictions on the form of the equations. We also report on the complexity of our decision procedures. These procedures are implemented in our automated provers for inductive theorems of TRSs, and experiments are reported.
We also detail the integration in SPIKE, an implicit induction theorem prover, of two reasoning modules operating over naturals combined with interpreted symbols. The first integration schema is à la Boyer-Moore, based on the combination of a congruence closure procedure with a decision procedure for linear arithmetic over rationals/reals. The second follows a 'black-box' approach based on external SMT solvers. The two extensions are shown to significantly increase the power of SPIKE; their performance is compared on the proof of a non-trivial application.
In the setting of classical first-order logic with inductive predicates, two kinds of sequent-based induction reasoning can be distinguished: cyclic and structural. Proving their equivalence is of great theoretical and practical interest for the automated reasoning community. Previously, it had been shown how to transform any structural proof developed with the LKID system into a cyclic proof using the CLKIDω system.
We participate in the “Logic and Complexity” part of the GDR–IM (CNRS Research Group on Mathematical Computer Science), in the projects “Logic, Algebra and Computation” (mixing algebraic and logical systems) and “Geometry of Computation” (using geometrical and topological methods in computer science).
We are also involved in the GDR-GPL (CNRS Research Network on Software Engineering), as a member of the FORWAL group and of the Scientific Board of the GDR.
Program: PHC Polonium
Project title: Expressing concurrency through control operators
Duration: January 2015 - December 2016
Coordinator: Sergueï Lenglet
Other partner: Institute of Computer Science, University of Wrocław, Poland
Abstract: The goal of this project is to explore the interplay between concurrency, continuations, and control operators at a fundamental level. We do not restrict ourselves to a specific programming language, but use more general and well-established formal models, namely process calculi (such as the
Subject: Property Based Testing
Date: from Feb 2014 until Jun 2014
Institution: Erasmus Mundus MSc in Dependable Software Systems
Horatiu Cirstea:
PC member of RuleML 2014 (International RuleML Symposium on Rule Interchange and Applications).
Steering committee of RULE.
Sergueï Lenglet:
PC member of WPTE'14 (Workshop on Rewriting Techniques for Program Transformations and Evaluation)
Pierre-Etienne Moreau:
PC member of TERMGRAPH'2014 (8th International Workshop on Computing with Terms and Graphs)
PC member of WRLA'2014 (10th International Workshop on Rewriting Logic and its Applications)
Sorin Stratulat:
PC member of CISIS 2014 (7th International Conference on Computational Intelligence in Security for Information Systems)
PC member of IAS 2014 (10th International Conference on Information Assurance and Security)
PC member of SYNASC 2014 (16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing)
Horatiu Cirstea:
Reviewer for ESOP'2014, ICTAC'2014, LOPSTR'2014
Sergueï Lenglet:
Reviewer for CSL-LICS'2014, RTA-TLCA'2014, FOSSACS'2015
Pierre-Etienne Moreau:
Reviewer for TERMGRAPH'2014, WASDETT'2014
Pierre-Etienne Moreau:
Advisory Board of Science of Computer Programming, software track
Horatiu Cirstea:
Reviewer for Journal of Symbolic Computation
Sergueï Lenglet:
Reviewer for Fundamenta Informaticae and HOSC (Higher-Order and Symbolic Computation)
Pierre-Etienne Moreau:
Reviewer for Electronic Proceedings in Theoretical Computer Science (EPTCS)
Sorin Stratulat:
Reviewer for Journal of Symbolic Computation
Licence: Sergueï Lenglet, teaching at IUT Charlemagne, first year and second year students.
Licence: Horatiu Cirstea, Responsible for the first year of the Licence in Computer Science, Université de Lorraine.
Licence: Pierre-Etienne Moreau, Responsible of the course “Introduction to Algorithms and Programming” (http://
Master: Horatiu Cirstea, Responsible for the Master speciality “Logiciels: Théorie, méthodes et ingénierie”, Université de Lorraine.
Master: Pierre-Etienne Moreau, Head of the Computer Science Department at Mines Nancy, Université de Lorraine.
Doctorate: Pierre-Etienne Moreau, Implementing Term Rewriting, International School on Rewriting, Valparaíso, Chile.
PhD: Jean-Christophe Bach, A formal Island for qualifiable model transformations, Université de Lorraine, September 12th 2014, Pierre-Etienne Moreau
PhD in progress: Duy Duc Nguyen, Aided design of multi-physics and multi-scale systems based on asymptotic methods, Horatiu Cirstea (co-supervised with Michel Lenczner and Frédéric Zamkotsian)
PhD in progress: Amira Henaien, Certification du raisonnement formel porté sur des systèmes d’information critiques, November 2010, Sorin Stratulat (co-supervised with Maurice Margenstern)
Horatiu Cirstea:
PhD committee of Julien Ferté, reviewer, “Régularité et contraintes de descendance, équations algébriques+”, Marseille 2014
PhD committee of Hernan Vanzetto, examiner, “Automatisation des preuves et synthèse des types pour la théorie des ensembles dans le contexte de TLA+”, Nancy 2014
PhD committee of Bin Yang, reviewer, “Contribution to a kernel of a symbolic asymptotic modeling software”, Besançon 2014
Pierre-Etienne Moreau:
PhD committee of Jean-Christophe Bach, supervisor, “Un îlot formel pour les transformations de modèles qualifiables”, Nancy 2014
PhD committee of Faiez Zalila, reviewer, “Methods and Tools for the Integration of Formal Verification in Domain-Specific Languages”, Toulouse 2014
PhD committee of Cyrille Wiedling, examiner, “Formal Verification of Advanced Families of Security Protocols: E-Voting and APIs”, Nancy 2014
Sorin Stratulat:
PhD committee of Abdelkader Kersani, examiner, “Preuves par induction dans le calcul de superposition”, Grenoble 2014
Jean-Christophe Bach participated in scientific mediation by proposing several activities that demonstrate the algorithmic thinking at the core of computer science without requiring any computer or even electronic devices. These activities are the first part of the CSIRL (Computer Science In Real Life) project, which aims to popularize computer science and to introduce children, school students, and non-scientists to the domain.
Pierre-Etienne Moreau gave two lectures on “Robotics and Programming” in the ISN course (Informatique et Science du Numérique), in order to help professors of “classes de terminale” to teach this discipline. He also organized a three-day course on “Algorithms, Programming and Databases” to help professors of “classes préparatoires aux grandes écoles” to teach this discipline.
Pierre-Etienne Moreau is a member of the national Inria committee for “Médiation Scientifique”. He also participated in “Fête de la Science 2014” at Mines Nancy.