Constraint Logic Programming supports a great ambition for programming: that of making programming essentially a modeling task, with equations, constraints and logical formulas.
Constraint Programming is a field born in the mid-1980s from Logic Programming, from Linear Programming in Operations Research, and from Constraint Propagation techniques in Artificial Intelligence. Its foundation is the use of relations on mathematical variables to compute with partial information. The successes of Constraint Programming in solving combinatorial optimization problems in industry and commerce stem from new local consistency techniques and from declarative languages that give control over the mixing of heterogeneous resolution techniques: numerical, symbolic, deductive and heuristic.
The "Contraintes" group investigates the logical foundations, design, implementation, programming environments and applications of constraint programming languages. The study of Concurrent Constraint languages is a core aspect of the project as they provide a conceptual framework for analyzing different issues of constraint programming, like constraint resolution techniques, concurrent modeling, reactive applications, etc.
The main application domains investigated are combinatorial optimization problems and bio-informatics. In bio-informatics, our objective is not to work on structural biology problems, which has been the main trend up to now, but to attack the great challenge of systems biology, namely to model the function, activity and interaction of molecular systems in living cells, with logic programming concepts and constraint programming and program verification technologies.
The class of Concurrent Constraint programming languages (CC) was introduced a decade ago by Vijay Saraswat as a unifying framework for constraint logic programming and concurrent logic programming. The CC paradigm constitutes a representative abstraction of constraint programming languages, and thus allows a fine grained study of their fundamental properties.
CC generalizes the Constraint Logic Programming framework (CLP) by introducing a synchronization primitive, based on constraint entailment. It is a model of concurrent computation, where agents communicate through a shared store, represented by a constraint, which expresses some partial information on the values of the variables involved in the computation. The variables play the role of transmissible dynamically created communication channels.
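As an illustration of the ask/tell mechanism just described, here is a minimal Python sketch, not taken from any CC implementation. It assumes a toy constraint system in which the store holds one interval per variable; the names `Store`, `tell`, `ask`, `producer` and `consumer` are hypothetical. An agent that asks a guard is suspended until the store entails it, and telling new information can wake suspended agents.

```python
import math

class Store:
    """Shared store holding one interval constraint lo <= v <= hi per variable."""

    def __init__(self):
        self.bounds = {}       # variable name -> (lo, hi)
        self.suspended = []    # (guard, continuation) pairs waiting for entailment

    def _get(self, var):
        return self.bounds.get(var, (-math.inf, math.inf))

    def entails(self, var, lo, hi):
        """True when the store already implies lo <= var <= hi."""
        cur_lo, cur_hi = self._get(var)
        return lo <= cur_lo and cur_hi <= hi

    def tell(self, var, lo=-math.inf, hi=math.inf):
        """Add lo <= var <= hi; information only accumulates (monotonic store)."""
        cur_lo, cur_hi = self._get(var)
        new_lo, new_hi = max(cur_lo, lo), min(cur_hi, hi)
        if new_lo > new_hi:
            raise ValueError(f"inconsistent store for {var}")
        self.bounds[var] = (new_lo, new_hi)
        self._wake()

    def ask(self, guard, continuation):
        """Suspend continuation until the store entails guard = (var, lo, hi)."""
        self.suspended.append((guard, continuation))
        self._wake()

    def _wake(self):
        # Fire every suspended agent whose guard is now entailed.
        for item in list(self.suspended):
            guard, continuation = item
            if item in self.suspended and self.entails(*guard):
                self.suspended.remove(item)
                continuation(self)


log = []

def producer(store):
    log.append("x in [0, 10] entailed")
    store.tell("y", lo=1)   # communicates with the other agent through the store

def consumer(store):
    log.append("y >= 1 entailed")

store = Store()
store.ask(("x", 0, 10), producer)
store.ask(("y", 1, math.inf), consumer)
store.tell("x", lo=0)   # x >= 0 alone does not entail x <= 10: both agents stay suspended
store.tell("x", hi=5)   # now 0 <= x <= 5, so the first guard is entailed
```

In this sketch the second agent is woken only because the first one tells `y >= 1`: the variable `y` acts as the communication channel between the two agents.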
One of the big successes of CC has been the simple and elegant reconstruction of finite domain constraint solvers, and the cooperation of several models to solve a single combinatorial problem. On the other hand, to use CC for programming reactive applications forces one to abandon the hypothesis of monotonic evolution of the constraint store; this is a strong motivation for new extensions of CC languages.
There are strong completeness theorems relating the execution of a CLP program and its translation in classical logic, which provide smooth reasoning techniques for such programs. However these theorems are broken by the synchronization operation of CC. Looking for a logical semantics of CC programs in the general paradigm of logic programming, program = formula, execution = proof search, leads to a translation in Jean-Yves Girard's linear logic. This allows the recovery of some completeness results about successes and stores; even suspensions may be characterized with the non-commutative logic of Ruet and Abrusci.
It is thus possible to address important issues for Constraint Programming:
verifying CC programs;
combining CLP and state-based programming;
dealing with local search inside a global constraint solving procedure.
The last two cases rely on a natural extension of CC languages, called Linear Concurrent Constraint languages (LCC), which simply replaces constraint systems built on classical logic by constraint systems built on linear logic. This allows us to represent state changes through the consumption of resources during the synchronization action, which is modeled by linear implication.
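The difference with a classical store can be pictured with a small Python sketch. It is only an analogy under a strong simplifying assumption: the store is a multiset of atomic tokens, and a linear ask succeeds only if the required tokens are present, in which case it consumes them. The names `LinearStore` and `increment` are hypothetical.

```python
from collections import Counter

class LinearStore:
    """Store seen as a multiset of resources (a toy stand-in for linear constraints)."""

    def __init__(self):
        self.tokens = Counter()

    def tell(self, *tokens):
        """Add resources to the store."""
        self.tokens.update(tokens)

    def ask(self, *required):
        """Linear ask: succeed only if every required token is present, then consume them."""
        need = Counter(required)
        if any(self.tokens[t] < n for t, n in need.items()):
            return False
        self.tokens = self.tokens - need   # consumption of the resources
        return True

def increment(store, old):
    """State change: consume value(old) and produce value(old + 1)."""
    if store.ask(("value", old)):
        store.tell(("value", old + 1))
        return True
    return False

store = LinearStore()
store.tell(("value", 3))
first = increment(store, 3)    # consumes value(3), adds value(4)
second = increment(store, 3)   # value(3) is gone, so this agent cannot fire
```

With a classical (monotonic) store the second call would also succeed, since information is never removed; consumption is what makes the imperative update expressible.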
Our domains of application use quite different constraint systems:
finite domains (bounded natural numbers): primitive constraint of some finite domain membership, numerical, symbolic, higher order and global constraints;
reals: Simplex algorithm for linear constraints and interval methods otherwise;
terms: subtyping constraints and ontologies.
The project works on both the constraint resolution methods and their cooperation. The main focus is the efficiency of constraint propagation methods, and their combination with local search methods.
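To make the notion of constraint propagation over finite domains concrete, the following Python sketch runs bounds propagation to a fixpoint. It is a generic textbook scheme rather than the algorithm of any particular solver mentioned in this report; the names `less_than` and `propagate` are hypothetical, and domains are reduced to integer intervals.

```python
def less_than(x, y):
    """Propagator for x < y on integer bounds."""
    def run(dom):
        (x_lo, x_hi), (y_lo, y_hi) = dom[x], dom[y]
        return {x: (x_lo, min(x_hi, y_hi - 1)),
                y: (max(y_lo, x_lo + 1), y_hi)}
    return run

def propagate(domains, propagators):
    """Apply all propagators repeatedly until no bound changes (fixpoint)."""
    dom = dict(domains)
    changed = True
    while changed:
        changed = False
        for run in propagators:
            for var, (lo, hi) in run(dom).items():
                if lo > hi:
                    raise ValueError(f"empty domain for {var}")
                if (lo, hi) != dom[var]:
                    dom[var] = (lo, hi)
                    changed = True
    return dom

domains = {"X": (1, 3), "Y": (1, 3), "Z": (1, 3)}
result = propagate(domains, [less_than("X", "Y"), less_than("Y", "Z")])
```

Here propagation alone fixes all three variables. In general it only prunes the domains, and a search procedure (systematic or local) is needed on top of it.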
Handling hundreds of thousands of heterogeneous constraints on as many variables is impossible without tools specially designed to show the different views of the execution.
The DiSCiPl ESPRIT project showed the use of declarative diagnosis methods based on the logical semantics of programs for interactive debugging systems, and developed tools to visualize the execution of CLP programs: showing the effect of propagation, the shape of the search space (detection of symmetries), the impact of heuristics, etc. We have pursued this research with our partners of the RNTL project OADymPPaC. The main results of this project are a generic trace format for constraint programming languages and a large investigation of visualization tools.
Semantics based debugging, static and dynamic typing, and more generally validation of CC programs are also studied in the project. Concerning typing issues in constraint programming, the usual coercions between constraint domains (e.g. between booleans and integers, lists and terms) offer a particularly challenging task to type systems design.
Systems biology is a cross-disciplinary domain involving biology, computer science, logics, mathematics, and physics to elucidate the high-level functions of the cell from their biochemical bases at the molecular level.
At the end of the 1990s, research in bioinformatics moved from the analysis of the genomic sequence to the analysis of the varied data mass-produced by post-genomic technologies (expression of RNA and proteins, protein-protein interactions, etc.). The complexity of the systems concerned requires a large research effort to develop symbolic notations for biological processes and data.
In order to scale up and get past the complexity walls in reasoning about biological systems, there is a general feeling that biology needs to be partly reinvented, and that, beyond providing tools to biologists, computer science has much to offer in terms of concepts and methods.
The number and economic impact of combinatorial optimization problems found in the industrial world are constantly increasing. They cover:
resource allocation;
placement;
scheduling;
parallelization;
transport;
etc.
The last forty years have brought many improvements in Operations Research resolution techniques. In this framework, Constraint Programming can be seen as providing, on the one hand, local consistency techniques that can be applied to various numerical or symbolic constraints, and on the other hand, declarative languages. This last point is crucial for quickly developing complex combinations of algorithms, which is not possible without a language with a high level of abstraction. It has given better results than traditional methods, for instance on scheduling problems, and holds even more promise through the cooperation of global resolution and local propagation techniques.
The project builds upon its knowledge of CC languages, constraint solvers and their implementation to work in these directions. The LCC languages offer a framework for theoretical analysis; and the soft constraints framework gives a new way to tackle over-constrained problems. The work on programming environments helps to integrate the Constraint Programming tools into this application domain.
In 2002, we started a Collaborative Research Initiative, ARC CPBIO, on "Process Calculi and Biology of Molecular Networks", which is ending this year. By working on well understood biological models, we sought:
to identify in the family of competitive models coming from the Theory of Concurrency (Pi-calculus, Join-calculus and their derivatives) and from Logic Programming (Constraint Logic Programming, Concurrent Constraint languages and their extensions to discrete and continuous time, TCC, HCC), the ingredients of a language for the modular and multi-scale representation of biological processes;
to provide a series of examples of biomolecular processes transcribed in formal languages, and a set of biological questions of interest about these models;
to design and apply to these examples formal computational reasoning tools for the simulation, the analysis and the querying of the models.
This work led us to the design and implementation of the Biochemical Abstract Machine BIOCHAM, which has the unique feature of providing formal languages corresponding to different levels of abstraction: on the one hand, for modeling bio-molecular interaction diagrams with reaction rules, and on the other hand, for modeling the biological properties of the system in temporal logic.
This double formalization of both the model and the properties of the biological system at hand opens several new research avenues. Our research continues with a large involvement in two European projects of the 6th PCRD. The first is the STREP APrIL II, whose main objective is to apply probabilistic inductive logic programming techniques to bioinformatics applications. Our main focus is on the networks and pathways of the cell, and the semi-automatic completion of models from observed temporal properties of the system. The second is the Network of Excellence REWERSE, in which we focus, amongst other themes, on bio-informatics as a field of application of the new Semantic Web technologies based on rules and constraints. In this context we intend to use the biological knowledge stored in online ontologies like GO in order to improve different tasks of the modellers:
re-use and composition of models;
semi-automatic correction/completion of models.
GNU Prolog is a free Prolog compiler with constraint solving over finite domains developed by Daniel Diaz. GNU Prolog accepts Prolog extended with primitives for constraint programming and produces native binaries (like gcc does from a C source). The Prolog part conforms to the ISO standard for Prolog with many practical extensions (global variables, OS interface, sockets,...). GNU Prolog also includes an efficient constraint solver over Finite Domains (FD), giving the user the combined power of constraint programming and the declarativity of logic programming.
An experimental concurrent extension called GNU-Prolog-RH is developed by Rémy Hæmmerlé. It extends GNU-Prolog with attributed variables, coroutining and constraint logic programming over reals, which greatly increases the expressive and modeling power of GNU-Prolog. It is also used to prototype the design and implementation of our new SiLCC language.
The Biochemical Abstract Machine BIOCHAM provides precise semantics to bio-molecular interaction maps at two abstraction levels: the quantitative level of molecular concentrations, and the qualitative level of Boolean values. Based on these formal semantics, BIOCHAM offers:
a compositional rule-based language for modeling biochemical systems, allowing patterns and kinetic expressions when numerical data are available;
a numerical simulator and a non-deterministic boolean simulator;
a powerful query language based on temporal logic (CTL for qualitative models and LTL with constraints for quantitative models) for expressing biological queries such as reachability, checkpoints, oscillations or stability;
a machine learning system to infer interaction rules from observed temporal properties.
BIOCHAM is fully implemented in Prolog and interfaced to the state-of-the-art symbolic model checker NuSMV for evaluating boolean CTL queries in large models over several hundreds of variables. BIOCHAM models can be imported from, and exported to, the standard Systems Biology Markup Language SBML. A repository of computational models of biological systems called CMBSlib, aimed at comparing models as well as formalisms, has been developed in collaboration with our partners of the ARC CPBIO.
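The qualitative (boolean) level can be illustrated by a small Python sketch. It is not BIOCHAM code and does not use BIOCHAM syntax. It assumes that a state is the set of molecules present, that a rule is a pair (reactants, products), and that firing a rule may or may not consume each reactant, which is what makes the simulation non-deterministic. The molecule names and the functions `successors` and `reachable` are hypothetical. A reachability query of the kind expressed by the CTL formula EF(goal) is answered by exploring the transition system.

```python
from itertools import combinations

def successors(state, rules):
    """Yield every state reachable in one step under the boolean semantics assumed here."""
    for reactants, products in rules:
        if reactants <= state:
            consumable = sorted(reactants - products)
            # Each non-catalyst reactant may be fully consumed or still present.
            for k in range(len(consumable) + 1):
                for consumed in combinations(consumable, k):
                    yield (state - frozenset(consumed)) | products

def reachable(initial, rules, goal):
    """EF(goal): does some path from initial reach a state containing all goal molecules?"""
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        if goal <= state:
            return True
        for nxt in successors(state, rules):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

rules = [
    (frozenset({"A", "B"}), frozenset({"AB"})),          # complexation: A + B => AB
    (frozenset({"AB", "K"}), frozenset({"ABp", "K"})),   # phosphorylation catalysed by K
]
goal = frozenset({"ABp"})
with_kinase = reachable(frozenset({"A", "B", "K"}), rules, goal)
without_kinase = reachable(frozenset({"A", "B"}), rules, goal)
```

Explicit enumeration like this is only practical for small models; symbolic model checking, as delegated to NuSMV in BIOCHAM, is what makes such queries feasible on several hundreds of variables.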
TCLP is a prescriptive type system for Constraint Logic Programming languages, currently supporting ISO-Prolog, GNU-Prolog, SICStus Prolog and the constraint programming libraries of SICStus Prolog. The flexibility of type checking in TCLP is due to three kinds of polymorphism: parametric polymorphism (e.g. list(A)), subtyping (e.g. list(A) < term) and overloading (e.g. -: num*num -> num and -: A*B -> pair(A,B)). No type declarations are required, thanks to the type inference algorithm for predicates and to a default term type for function symbols.
CLPGUI is a generic graphical user interface written in Java for constraint logic programming. It is currently available for GNU-Prolog and SICStus Prolog. CLPGUI has been developed both for teaching purposes and for debugging complex programs. The graphical user interface is composed of several windows: one main console and several dynamic 2D and 3D viewers of the search tree and of finite domain variables. With CLPGUI it is possible to execute any goal incrementally, and to backtrack to or recompute any state represented as a node in the search tree. The level of granularity of the search tree is defined by annotations in the CLP program.
Codeine is an event-oriented tracer which extends the tracer of GNU-Prolog (version 1.2.16). It is pattern-driven and allows the produced trace to be modified dynamically through a dialog between a debugging tool and the tracer. The execution is sliced into elementary steps and, for each step, Codeine makes a large amount of execution data available. It is distributed under the GPL licence.
The adaptive search library is written in C and solves constraint satisfaction problems by the adaptive local search method. The current release is limited to permutation problems. More precisely, all n variables have the same domain x1..xn and are subject to an implicit all-different constraint. Several problems fall into this category and some examples are provided with the library.
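The following Python sketch gives the flavour of this kind of local search on a permutation problem (N-queens, with one queen per column and rows forming a permutation, so that the all-different constraint is implicit). It is a simplified illustration and does not reproduce the C library's API or its exact heuristics. The function names, the tabu tenure and the restart rule are assumptions made for the example.

```python
import random

def queen_errors(perm):
    """Error of each variable: number of diagonal conflicts involving that queen."""
    n = len(perm)
    errors = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if abs(perm[i] - perm[j]) == j - i:
                errors[i] += 1
                errors[j] += 1
    return errors

def total_cost(perm):
    """Number of conflicting pairs (each conflict is counted on both queens)."""
    return sum(queen_errors(perm)) // 2

def adaptive_search(n, max_iters=5000, tabu_tenure=3, seed=0):
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    tabu_until = [0] * n      # iteration before which a variable may not be selected
    stalled = 0               # consecutive iterations without an improving swap
    for it in range(max_iters):
        errors = queen_errors(perm)
        cost = sum(errors) // 2
        if cost == 0:
            break
        free = [i for i in range(n) if tabu_until[i] <= it]
        if not free or stalled >= n:
            # Stuck in a local minimum: restart from a fresh random permutation.
            rng.shuffle(perm)
            tabu_until = [0] * n
            stalled = 0
            continue
        # Select the non-tabu variable with the largest error.
        worst = max(free, key=lambda i: errors[i])
        # Find the swap with another variable that lowers the total cost the most.
        best_j, best_cost = None, cost
        for j in range(n):
            if j == worst:
                continue
            perm[worst], perm[j] = perm[j], perm[worst]
            candidate = total_cost(perm)
            perm[worst], perm[j] = perm[j], perm[worst]
            if candidate < best_cost:
                best_j, best_cost = j, candidate
        if best_j is None:
            tabu_until[worst] = it + 1 + tabu_tenure   # freeze this variable for a while
            stalled += 1
        else:
            perm[worst], perm[best_j] = perm[best_j], perm[worst]
            stalled = 0
    return perm, total_cost(perm)

solution, cost = adaptive_search(8)
```

Because moves are swaps, every visited configuration remains a permutation, so the all-different constraint never has to be checked. A returned cost of 0 means that `solution` satisfies all the diagonal constraints.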
We are developing SiLCC, an imperative and concurrent constraint programming language based on a single paradigm: that of Vijay Saraswat's concurrent constraint programming, extended with constraint systems based on Jean-Yves Girard's Linear Logic. In the late 1990s we developed the theory of this extension and we have now begun its implementation.
From a constraint programming point of view, the unique combination of constraint programming with imperative features opens many new possibilities, among which:
the capability of programming constraint solvers in the language, making them extensible by the user;
making a fully bootstrapped implementation of a constraint programming language (for the first time since Prolog);
combining constraint reasoning with state change;
proving program correctness using Linear Logic.
Our current implementation of SiLCC uses GNU-Prolog-RH as a temporary kernel language, on top of which a module system and the first bootstrapping libraries have been developed. The objective is to define a small kernel language as an instance of LCC over a simple constraint domain of labelled graphs, on top of which the complete SiLCC language will be built by bootstrapping. Bootstrapping is a fundamental step for getting past the current limits concerning extensibility, robustness, teaching and even efficiency of today's constraint programming tools.
While a program-structuring feature is required for a production programming language, the current proposals for the inclusion of modules in the ISO Prolog standard have not reached consensus. We have thus investigated an alternative solution based on Contextual Logic Programming (CxLP). Informally, the main point of CxLP is that programs are structured as sets of predicates (units) which can be dynamically combined in an execution attribute called a context. Goals are treated just as in regular Prolog, except that the matching predicates are to be located among all the units which make up the current context. We extended CxLP to attach arguments to units: these serve the dual purpose of acting as "unit-global" variables and as state placeholders in actual contexts.
CxLP clearly carries a higher overhead than regular Prolog, as the context must be searched at run-time for the unit that defines a goal's predicate, a process which requires at least one extra indirection compared to straight Prolog. This kind of situation has, however, become more common and less of a performance issue in recent systems, in object-oriented and even in procedural languages, for instance as a result of using dynamically-loaded shared libraries.
Last year we built a first prototype implementation of a Contextual Logic Programming language inside GNU-Prolog. This prototype has surprisingly good performance, considering the lack of optimizations. Contextual Constraint Logic Programming is a powerful paradigm in which to design and implement Organizational Information Systems, particularly when integrated with the ISCO/ISTO mediator framework. The implementation has been improved and is actively used in a real-world setting in the Universidade de Evora's second generation web-based academic information system.
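The run-time context search described above for Contextual Logic Programming can be pictured with a few lines of Python. This is only a schematic model, not the GNU-Prolog prototype: a context is assumed to be a list of units searched from the most recently added one, each unit is a dictionary from predicate indicators to clause text, and the unit and predicate names are hypothetical.

```python
def resolve(context, predicate):
    """Return (unit name, definition) from the first unit of the context defining predicate."""
    for unit_name, clauses in context:   # most recently added unit first
        if predicate in clauses:
            return unit_name, clauses[predicate]
    raise LookupError(f"no unit in the context defines {predicate}")

person = ("person", {
    "name/1": "name(anonymous).",
    "greet/0": "greet :- name(N), write(N).",
})
student = ("student", {
    "name/1": "name(alice).",
})

# The same goal is resolved differently depending on the current context.
extended_context = [student, person]
name_owner, _ = resolve(extended_context, "name/1")
greet_owner, _ = resolve(extended_context, "greet/0")
base_owner, _ = resolve([person], "name/1")
```

The loop in `resolve` is the extra indirection mentioned above: compared with a direct call, each goal pays for a search through the units of the context.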
In his thesis, Emmanuel Coquery has shown the decidability and NP-completeness of the satisfiability problem for non-structural subtyping constraints in quasi-lattices, and has implemented a complete prescriptive type system for constraint logic programs, named TCLP.
These results are currently applied to the typing of XML documents and semantic web languages and to ontological reasoning.
We have designed a new global constraint for cutset problems. A cutset in a directed graph G = (V, E) is a set of vertices that cuts all cycles in G. Finding a cutset of minimum cardinality is NP-hard. There exist several approximate algorithms and exact algorithms, most of them using graph reduction techniques. The cutset constraint we propose is a boolean constraint over variables associated to the vertices of a given graph, that states that the subgraph restricted to the vertices having their boolean variable set to true is acyclic. We propose a filtering algorithm based on graph contraction operations and inference of simple boolean constraints, that has a linear time complexity in O(|E| + |V|).
The efficiency of the cutset constraint combined with heuristics based on graph properties is shown on benchmarks of the literature for pure minimum cutset problems, and on an application to log-based reconciliation problems where the global cutset constraint is mixed with other boolean constraints.
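The meaning of the cutset constraint on a complete assignment can be checked in time O(|E| + |V|), as the Python sketch below shows. It is only a checker for the semantics stated above, using Kahn's topological sorting algorithm. It is not the filtering algorithm based on graph contraction, and the function name and the example graph are hypothetical.

```python
from collections import deque

def is_acyclic_subgraph(edges, kept):
    """True if the subgraph induced by the vertices in `kept` has no directed cycle."""
    succ = {v: [] for v in kept}
    indegree = {v: 0 for v in kept}
    for u, v in edges:
        if u in kept and v in kept:
            succ[u].append(v)
            indegree[v] += 1
    # Kahn's algorithm: repeatedly remove vertices without incoming edges.
    queue = deque(v for v in kept if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == len(kept)   # every kept vertex was removed iff there is no cycle

vertices = {1, 2, 3, 4}
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 3)]   # two cycles sharing vertex 3

cut_3 = is_acyclic_subgraph(edges, vertices - {3})   # {3} cuts both cycles
cut_1 = is_acyclic_subgraph(edges, vertices - {1})   # the cycle 3 -> 4 -> 3 remains
cut_none = is_acyclic_subgraph(edges, vertices)
```

In this example {3} is a cutset of minimum cardinality, while {1} is not a cutset at all.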
A constraint programming approach to cutwidth problems, which are central problems in model checking and variable ordering heuristics, has also been investigated, with less success.
Another new global constraint mixing flow constraints with concentration constraints is emerging from our work with TOTAL concerning the optimisation of crude oil blending.
One main result of the RNTL OADymPPaC project is the finalisation of a generic trace format for constraint programming, based on an abstract semantics of finite domain solvers. This generic trace enables debugging tools to be defined almost independently from finite domain solvers, and conversely, tracers to be built independently from these tools.
The trace syntax is specified by an XML DTD called "gentra4cp.dtd". A compliant trace is encoded in an XML format according to this DTD and follows the described semantics.
The generic trace format has been implemented as an extension of the GNU-Prolog tracer, making it possible to trace both the constraint search tree and propagation.
The generic trace format is now used in several solvers. It should be extended to handle new constraint domains. Therefore we are organizing its distribution and evolution through a sourceforge tra4cp project.
A complete description of the CLPGUI tool has been published, including:
its generic architecture for visualizing and controlling the execution of a constraint program,
the representation of large search trees and domain filtering with 2D and 3D views,
the capability to create application-oriented interfaces with two examples of placement problems, one in augmented virtual reality.
A new platform called PAVOT has also been designed, allowing an easier and modular integration of several visualizers with several solvers, thanks to the generic trace format described in the previous section. The CLPGUI 2D and 3D search tree viewers have been extended to show the volume of propagation events at each node. The originality lies in the variety of parameters which allow a general analysis of constraint propagation events as well.
We also contributed to the adaptation of the CP-INFOVIS Analysis Toolkit, based on the INFOVIS toolkit developed by J.-D. Fekete (INRIA-Futurs) and M. Ghoniem (EMN-Nantes), to handle the generic trace format. This opens the possibility of analysing constraint program execution with adjacency matrices and plotted graphs.
In collaboration with Evelyne Lutton (COMPLEX Team).
The problem of office assignment on the INRIA Rocquencourt campus can be considered as a very complex constraint satisfaction problem: the demand of research teams exceeds the available resources, and at the same time the constraints and preferences of each team are difficult to represent and tune within standard constraint satisfaction software. The adaptive search library described above has been used to find solutions which maximize the overall preferences, given preference criteria for each user. Evolutionary techniques have been used to infer the preferences of the users from their ratings of the solutions proposed by the system.
In 2003 we experimented with a multi-user interactive evolutionary approach for the management of the relative weights of user preferences. This work was continued in 2004, in order to produce a prototype for real-size testing.
The biochemical abstract machine BIOCHAM is a software environment to represent and analyze protein-protein and protein-DNA interaction networks. The expressiveness of the BIOCHAM modeling language has been illustrated by providing a formal counterpart of Kohn's map of the mammalian cell cycle control, and by importing from the Web many other models written in other formalisms. This effectively turns otherwise static knowledge into a discrete transition system incorporating a qualitative description of the dynamics. Our proposal to use the Computation Tree Logic CTL as a query language for the biological properties of the system has been integrated in BIOCHAM and validated on many examples with respect to:
the capability to express biological properties of the system in a completely formal way;
the efficiency of symbolic model checking tools in this context.
Since version 2.0 we have extended BIOCHAM to support numerical models based on (highly non-linear) Ordinary Differential Equations (ODE). BIOCHAM now provides different simulation methods for such models, including an implementation in Prolog of Rosenbrock's method for stiff systems, and also provides a Prolog implementation of constraint-based LTL model-checking for numerical traces, simplifying .
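To give an idea of what checking an LTL formula with numerical constraints on a simulation trace involves, here is a minimal Python sketch. It is not BIOCHAM's Prolog implementation. It assumes a finite trace given as a list of states and only covers the operators F (eventually), G (always) and conjunction; the function names and the sample values are hypothetical.

```python
def atom(predicate):
    """Constraint on the current state, e.g. x > 0.8."""
    return lambda trace, i: predicate(trace[i])

def eventually(f):
    """F f: f holds at some position from i to the end of the finite trace."""
    return lambda trace, i: any(f(trace, j) for j in range(i, len(trace)))

def always(f):
    """G f: f holds at every position from i to the end of the finite trace."""
    return lambda trace, i: all(f(trace, j) for j in range(i, len(trace)))

def both(f, g):
    return lambda trace, i: f(trace, i) and g(trace, i)

def holds(formula, trace):
    return formula(trace, 0)

# A hand-written trace of the concentration of one molecule x.
trace = [{"x": v} for v in (1.0, 0.5, 0.1, 0.5, 0.9, 0.4)]
high = atom(lambda s: s["x"] > 0.8)
low = atom(lambda s: s["x"] < 0.2)

# Oscillation query: x is high, then later low, then later high again.
oscillates = holds(eventually(both(high, eventually(both(low, eventually(high))))), trace)
stays_positive = holds(always(atom(lambda s: s["x"] > 0.0)), trace)
stays_above_low = holds(always(atom(lambda s: s["x"] >= 0.2)), trace)
```

This naive evaluation re-scans the suffix of the trace for each nested operator; it is meant to show the semantics of the queries, not an efficient algorithm.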
The generalization of this approach to probabilistic models using probabilistic model checking is also under investigation in the APrIL II European project. The idea is to uniformly support different abstraction levels to reason about complex bio-molecular interaction networks in BIOCHAM, namely three abstraction levels:
population of molecules,
concentrations,
presence/absence of molecules.
The formalization of bio-molecular interactions in BIOCHAM syntax, and the specification of the observed behavior of the system in temporal logic (CTL or LTL) make it possible to develop machine-learning algorithms to automatically correct or complete existing models. It is worth noting that structural learning from temporal logic properties is quite new, both from the machine learning perspective and from the systems biology perspective.
In the framework of the APrIL II STREP we first applied state-of-the-art inductive logic programming tools to simple reachability properties. We then developed a more powerful ad-hoc enumerative approach to the learning of new rules from general temporal logic properties, where the objective is to complete a boolean model w.r.t. a CTL specification.
The same strategy can be applied in the quantitative case. Since a constraint-based LTL model-checking algorithm has been implemented in BIOCHAM, an enumerative approach similar to the qualitative case can be applied to the estimation of parameters w.r.t. an LTL specification. This already provides an interesting speedup compared to the manual trial-and-error method used by biologists and modelers.
In both of these cases, the results are quite promising and a more thorough study is under way.
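The enumerative estimation of parameters with respect to a temporal specification can be sketched as follows in Python. Everything in the example is an assumption made for illustration: the one-variable decay model dx/dt = -k x, the explicit Euler integration, the candidate values of k and the specification F(x <= 0.5) and G(x >= 0.2). It does not reproduce BIOCHAM's algorithm.

```python
def simulate(k, x0=1.0, dt=0.1, steps=50):
    """Explicit Euler integration of dx/dt = -k * x; returns the sampled values of x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - dt * k * xs[-1])
    return xs

def satisfies_spec(xs):
    """Specification F(x <= 0.5) and G(x >= 0.2), evaluated on the finite trace."""
    return any(x <= 0.5 for x in xs) and all(x >= 0.2 for x in xs)

# Enumerate candidate parameter values and keep those whose trace satisfies the spec.
candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
accepted = [k for k in candidates if satisfies_spec(simulate(k))]
```

With these settings only the middle candidate is kept: smaller values of k decay too slowly to reach 0.5 within the simulated horizon, and larger ones fall below 0.2.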
Collaboration within the RNTL project OADymPPaC (Nov. 2000 - May 2004)
Technology transfer of the generic trace format for the CHIP product.
Technology transfer of the CLPGUI software for the dynamic visualization of CHIP program execution.
Technology transfer of visualization techniques through the adaptation of INFOVIS toolkit.
Contribution to the seminar of the CHIP user's Club.
Collaboration within the RNTL pre-competitive project Manifico (Feb. 2003 - Feb. 2006, 150 Keuros) on non intrusive metacompilation of matching with constraints in rule based languages
Collaboration within the RNTL project OADymPPaC, evaluation of Ilog Discovery visualization component in the framework of constraint programming
Collaboration with TOTAL on a constraint programming approach to the optimization of crude oil blending and the Thesis of Aurélie Strobbe under a CIFRE contract.
RNTL project MANIFICO (Sep. 2003-2006) on the compilation of rules and constraints, with LORIA PROTHEO and ILOG (coordinator).
INRIA cooperative research initiative ARC CPBIO (2002-2004) on "Process Calculi and Molecular Networks Biology", with LORIA MODBIO, CNRS PPS lab. of University Paris 7 and Genoscope Evry, coord. F. Fages, INRIA Rocquencourt.
RNTL project OADymPPaC (Nov. 2000 - May 2004) "Tools for Dynamic Analysis and Debugging of Constraint Programs", with IRISA LANDES/INSA, INRIA-Futurs, Ecole des Mines de Nantes, University of Orléans, COSYTEC and ILOG, coord. P. Deransart, INRIA Rocquencourt.
CIFRE thesis and industrial contract with TOTAL (July 2004- July 2007) on the optimization of petroleum processes by constraint programming.
6th PCRD STREP APrIL II "Applications of probabilistic inductive logic programming", coord. Prof. L. de Raedt, University of Freiburg.
6th PCRD Network of Excellence REWERSE "Reasoning with rules and semantics", coord. Prof. F. Bry, Ludwig Maximilian University of Munich.
5th PCRD Network of Excellence COLOGNET "Computational logic network", area of constraint logic programming, coord. F. Rossi, University of Padova.
ERCIM Working Group on Constraints, coord. F. Fages, INRIA Rocquencourt.
Prof. Luc de Raedt from the University of Freiburg in Germany and Prof. Jan Maluzinski from the University of Linkoping in Sweden have been invited for short visits.
Contraintes is affiliated to the Doctoral school EDITE of the University of Paris 6.
All Ph.D. students and some members of Contraintes teach in the first or second cycles of universities or engineering schools. Our involvement in third-cycle curricula is the following:
Master Parisien de Recherche en Informatique (MPRI) lecture on Programmation par Contraintes by Sylvain Soliman and François Fages (15h).
MPRI lecture on Bio-informatique formelle by François Fages and Laurence Calzone (15h).
DEA SPL "Sémantique, Preuve et Langages", ENS, X, University Paris 6, 7, 11: course on programmation par contraintes, François Fages 10h, Sylvain Soliman 10h.
DEA Informatique, University of Orléans: François Fages 3h.
DESS Informatique, University Paris Sorbonne: Daniel Diaz 30h.
Pierre Deransart is the General Secretary, and past Chairman, of the "Association Française pour la Programmation en Logique et la programmation par Contraintes" (AFPLC), and a member of the Steering Committee of PPDP. He is in charge of the international relations of INRIA with Brazil and Portugal. He is the manager of the OADymPPaC project.
François Fages is the Chairman of the ERCIM Working Group on Constraints, vice-Chairman and past Chairman of the "Association Française pour la Programmation par Contraintes" (AFPLC), member and past Chairman of the Steering Committee of the ACM International Conference on Principles and Practice of Declarative Programming (PPDP), a member of the Scientific Council of the French-Russian Liapunov Institute, and a member of the Editorial Board of RAIRO Operations Research.
Sylvain Soliman is the Secretary of the ERCIM Working Group on Constraints.