Section: New Results
Ontological Query Answering with Rules
Participants: Jean-François Baget, Marie-Laure Mugnier, Michaël Thomazo, Michel Leclère, Eric Salvat, Mélanie König.
In collaboration with: Sebastian Rudolph (Karlsruhe Institute of Technology)
We have developed a framework based on rules that can generate unknown individuals, an ability sometimes called value invention in databases. These rules are of the form body → head, where the body and the head are conjunctions of atoms (without function symbols except constants), and variables that occur only in the head are existentially quantified; hence the name existential rules used hereafter. These rules can be seen as the logical translation of conceptual graph rules, historically a main focus of the team [70], [55]. Existential rules have the same logical form as the well-known Tuple-Generating Dependencies (TGDs) in databases [45]. TGDs have been extensively used as a high-level generalization of different kinds of constraints, e.g., for data exchange [57].
Recently, there has been renewed interest in TGDs seen as rules in the context of ontological query answering. Indeed, value invention has been recognized as crucial in an open-world perspective, where it cannot be assumed that all individuals are known in advance. The deductive database language Datalog can express some ontological knowledge, but it does not allow for value invention. This motivated the recent extension of Datalog to TGDs (i.e., existential rules), which gave rise to the Datalog +/- family [52], [53], [54].
In KRR and in the Semantic Web, ontological knowledge is often represented with formalisms based on description logics (DLs). However, DLs traditionally focused on reasoning tasks about the ontology itself (the so-called TBox), for instance classifying concepts; querying tasks were restricted to ground atom entailment. Conjunctive query answering with classical DLs has turned out to be extremely complex; hence less expressive DLs, better suited to conjunctive query answering on large amounts of data, have been designed recently, namely DL-Lite [51], [41], [63], and more generally Horn DLs (see e.g., [60]); cf. also the tractable profiles of the Semantic Web language OWL2.
Existential rules cover the core of lightweight DLs dedicated to query answering, while being more powerful and flexible [53], [44], [21]. In particular, they have unrestricted predicate arity (while DLs consider unary and binary predicates only), which allows for a natural coupling with database schemas, in which relations may have any arity. Unrestricted arity also makes it easy to add pieces of information, for instance to take contextual knowledge into account, since these pieces can be added as new predicate arguments.
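To make value invention concrete, here is a minimal Python sketch of applying one existential rule to a set of facts. Everything in it is an assumption made for illustration and not the team's implementation: the sample rule human(X) → ∃Y parent(Y, X) ∧ human(Y), the encoding of atoms as tuples, the convention that variables start with an uppercase letter, and the helper names `match`, `homomorphisms` and `apply_rule`. The application strategy is deliberately naive: every match of the body creates fresh individuals, with no redundancy check.

```python
from itertools import count

# Hypothetical encoding: an atom is a tuple (predicate, arg1, arg2, ...).
# Variables are strings starting with an uppercase letter; all other
# terms (including invented individuals such as "_null0") are constants.
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, subst):
    """Extend subst so that atom maps onto fact; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if subst.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return subst

def homomorphisms(body, facts, subst=None):
    """Yield every substitution mapping all body atoms into the facts."""
    subst = {} if subst is None else subst
    if not body:
        yield subst
        return
    for fact in facts:
        extended = match(body[0], fact, subst)
        if extended is not None:
            yield from homomorphisms(body[1:], facts, extended)

def apply_rule(body, head, facts, fresh):
    """One naive round of rule application.

    Variables occurring only in the head are existential: each match of
    the body binds them to fresh individuals drawn from `fresh`.
    """
    body_vars = {t for atom in body for t in atom[1:] if is_var(t)}
    new_facts = set()
    for subst in list(homomorphisms(body, facts)):
        subst = dict(subst)
        for atom in head:
            for t in atom[1:]:
                if is_var(t) and t not in body_vars and t not in subst:
                    subst[t] = f"_null{next(fresh)}"
        for atom in head:
            new_facts.add((atom[0],) + tuple(subst.get(t, t) for t in atom[1:]))
    return new_facts

# Sample rule: human(X) -> exists Y. parent(Y, X) and human(Y)
body = [("human", "X")]
head = [("parent", "Y", "X"), ("human", "Y")]
facts = {("human", "alice")}
fresh = count()

round1 = apply_rule(body, head, facts, fresh)
round2 = apply_rule(body, head, facts | round1, fresh)
```

After the first round, the facts contain an unknown parent of `alice`; the second round invents further individuals, and repeating the process never reaches a fixpoint for this rule. This illustrates why forward chaining alone does not yield a decision procedure in general.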
Building on our previous work on conceptual graphs, while meeting this new trend, we have developed a knowledge representation framework centered on existential rules, which can be seen both as logic-based and graph-based.
Entailment, hence query answering, with existential rules is undecidable; finding decidable classes of rules that are as expressive as possible is thus a crucial issue. We have pursued our previous work on better understanding the border between decidability and undecidability. We have also extended rule dependency to k-dependency, which takes into account sequences of rule applications.
- Results published in Artificial Intelligence Journal [13] (extending the work in [3], [44]); keynote talk synthesizing this work at RR'2011 [20]; extension to k-dependency at RR'2011 [22].
For the newly exhibited decidable classes (namely, “frontier-one”, “frontier-guarded” and “weakly-frontier-guarded” rules), the complexity of the problem was unknown, and there was no algorithm for computing entailment. First, we have classified these classes with respect to combined complexity (i.e., usual complexity), with both unbounded and bounded predicate arity, and with respect to data complexity (i.e., restricting the input of the decision problem to the facts). An interesting result is that some of the new classes (namely frontier-one and frontier-guarded rules) have polynomial-time data complexity. Second, we have provided a generic algorithm for query entailment with a large class of rules including these classes, which is worst-case optimal for combined complexity (with or without bounded predicate arity) as well as for data complexity.
- Results partially published at IJCAI'2011 [21]. Long paper in preparation with extended complexity results and all proofs, for submission to a major artificial intelligence journal.