Section: New Results

Logics and Graph-Based Languages for Ontology-Mediated Query Answering

Participants : Jean-François Baget, Meghyn Bienvenu, Efstathios Delivorias, Michel Leclère, Marie-Laure Mugnier, Swan Rocher, Federico Ulliana.

Ontology-mediated query answering (and more generally Ontology-Based Data Access, OBDA) is a recent paradigm in data management, which takes into account inferences enabled by an ontology when querying data. In other words, the notion of a database is replaced by that of a knowledge base, composed of data (also called facts) and of an ontology. Two families of formalisms for representing and reasoning with the ontological component are considered in this context: description logics and the more recent existential rule framework. Until last year, the team had mainly investigated existential rules. This expressive formalism generalizes most lightweight description logics used in OBDA (such as ℰℒ and DL-Lite, on which the OWL 2 tractable profiles are based) on the one hand, and Datalog, the language of deductive databases, on the other hand. With the arrival of Meghyn Bienvenu, description logics have joined the core formalisms studied by the team. Compared to existential rules, the description logics considered for OBDA lead to lower complexity classes and specific algorithmic techniques. Studying both formalisms is scientifically highly relevant, especially in the context of OBDA.

We have also broadened this research line by starting to investigate ontological languages for non-relational data, an issue that has barely been considered so far.

Before presenting this year's results, we recall the two classical ways of processing rules, namely forward chaining and backward chaining. In forward chaining (also known as the chase in databases), the rules are applied to enrich the initial facts, and query answering can then be solved by evaluating the query against the “saturated” factbase (as in a classical database system, i.e., ignoring the rules). The backward chaining process can be divided into two steps: first, the initial query is rewritten using the rules into a first-order query (typically a union of conjunctive queries, UCQ); then the rewritten query is evaluated against the initial factbase (again, as in a classical database system). Note that neither the forward nor the backward process is guaranteed to halt for all kinds of existential rules or for all lightweight description logics.
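The forward chaining process recalled above can be illustrated by a minimal sketch in Python. It is not the team's implementation: the encoding of atoms as tuples, the "?" prefix for variables, the names homomorphisms and chase, the "_null" naming of fresh values and the max_rounds bound are all assumptions made for this illustration. The sketch applies existential rules in rounds, invents a fresh null for each existential variable, skips a rule application when its head is already satisfied, and gives up after a fixed number of rounds since saturation may not terminate.

```python
from itertools import count

def is_var(term):
    # Convention of this sketch only: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def homomorphisms(atoms, facts, subst=None):
    """Yield every substitution that maps all atoms into the set of facts."""
    subst = dict(subst or {})
    if not atoms:
        yield subst
        return
    (pred, *args), rest = atoms[0], atoms[1:]
    for fact in facts:
        if fact[0] != pred or len(fact) != len(args) + 1:
            continue
        candidate = dict(subst)
        for arg, value in zip(args, fact[1:]):
            if is_var(arg):
                if candidate.setdefault(arg, value) != value:
                    break  # variable already bound to another value
            elif arg != value:
                break  # constant mismatch
        else:
            yield from homomorphisms(rest, facts, candidate)

def chase(facts, rules, max_rounds=10):
    """Saturate facts with (body, head) rules; return (facts, terminated)."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_rounds):
        new_facts = set()
        for body, head in rules:
            for h in list(homomorphisms(body, facts)):
                # Skip the application if the head is already satisfied.
                known = facts | new_facts
                if next(homomorphisms(head, known, h), None) is not None:
                    continue
                ext = dict(h)
                for _pred, *args in head:
                    for a in args:
                        if is_var(a) and a not in ext:
                            # Existential variable: invent a fresh null.
                            ext[a] = f"_null{next(fresh)}"
                for pred, *args in head:
                    new_facts.add((pred, *(ext.get(a, a) for a in args)))
        if not new_facts:
            return facts, True  # fixpoint reached: the factbase is saturated
        facts |= new_facts
    return facts, False  # bound reached; saturation may be infinite

# Hypothetical data: every researcher is a member of some team.
facts = {("Researcher", "alice"), ("Researcher", "bob"),
         ("memberOf", "bob", "t1"), ("Team", "t1")}
rules = [([("Researcher", "?x")], [("memberOf", "?x", "?t"), ("Team", "?t")])]
query = [("memberOf", "?x", "?t"), ("Team", "?t")]

saturated, terminated = chase(facts, rules)
# Answers are projected on ?x; values that are nulls are not answers.
answers = {h["?x"] for h in homomorphisms(query, saturated)
           if not h["?x"].startswith("_null")}
```

On this example the query has no answer for alice over the initial facts, whereas over the saturated factbase both alice and bob are returned. The rule is applied once for alice (creating one null) and not at all for bob, whose team is already known.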

New Results in the Description Logics Framework

When using Description Logics (DL) ontologies to access relational data, mappings are used to link the relational schema to the vocabulary of the ontology (which uses only unary and binary predicates). In order to debug and optimize DL-based OBDA systems, it is important to be able to analyze and compare ontology-mapping pairs, called OBDA specifications. Prior work in this direction compared specifications using classical notions of equivalence and entailment.

  • We have explored an alternative approach in which two specifications are deemed equivalent if they give the same answers to the considered query or class of queries for all possible data sources. After formally defining such query-based notions of entailment and equivalence of OBDA specifications, we investigated the complexity of the resulting analysis tasks when the ontology is formulated in (fragments of) DL-Lite_R, which forms the basis for the Semantic Web ontological language OWL 2 QL.

  • We considered a range of Horn DLs for which query answering has polynomial data complexity, but which do not guarantee the existence of first-order (FO) rewritings of all queries. In order to extend the applicability of the FO-rewriting technique, a key task is to be able to identify specific ontology-query pairs that admit an FO-rewriting. This led us to study FO-rewritability of conjunctive queries in the presence of ontologies formulated in DLs ranging between ℰℒ and Horn-𝒮ℋℐℱ, along with related query containment problems. Apart from providing characterizations, we established complexity results ranging from EXPTIME via NEXPTIME to 2EXPTIME, pointing out several interesting effects. In particular, FO-rewriting is more complex for conjunctive queries than for atomic queries when inverse roles are present, but not otherwise.

New Results in the Existential Rule Framework

Several new theoretical results have been obtained on ontology-mediated query answering with existential rules:

  • While most work in the area of ontology-mediated query answering focuses on conjunctive queries, navigational queries are gaining increasing attention. In collaboration with Michael Thomazo (Inria CEDAR), we took a step towards a better understanding of the combination of navigational query languages and existential rules by pinpointing the (data and combined) complexities of evaluating path queries (more precisely, two-way regular path queries) over knowledge bases whose ontology is composed of linear existential rules (a class of rules that can be seen as a natural generalisation of the description logic DL-Lite). We extended an algorithm tailored for DL-Lite and showed that, despite an exponential blow-up with respect to the maximum predicate arity, our algorithm was worst-case optimal.

    • RR'16 (Best paper award) [29]

  • Boundedness is an important notion for optimizing the processing of rule languages, as it ensures that materialisation can be performed in a predefined number of steps, independently of the size of the factbase. We are currently studying several boundedness notions for existential rules that extend the well-known boundedness notion of Datalog, and investigating their relationships with properties ensuring the finiteness of the chase or of query rewriting. One of our first results is that, for a natural notion of boundedness, bounded existential rules are exactly those at the intersection of finite expansion sets (which ensure that any factbase has a finite sound and complete saturation) and finite unification sets (which ensure that any conjunctive query can be finitely rewritten into a sound and complete union of conjunctive queries).

  • Finally, Swan Rocher's PhD thesis deepened the study of the decidability and complexity of conjunctive query answering for classes of existential rules extended with transitivity rules (previous results were presented at IJCAI 2015 [48]).

Querying NoSQL Databases (Key-Value Stores)

Over the last decade, research efforts to develop algorithms for OBDA have built on the assumption that data conforms to relational structures (including RDF) and that the paradigm can be deployed on top of relational databases with conjunctive queries at the core (e.g., in SQL or SPARQL). However, this is not the predominant way in which data is stored and exchanged today, especially on the Web. Whether OBDA can be developed for non-relational structures, like those shared by the increasingly popular NoSQL languages sustaining Big Data analytics, is still an open question. In collaboration with Marie-Christine Rousset (University of Grenoble, LIG), we proposed the first framework for studying the problem of answering ontology-mediated queries on top of NoSQL key-value stores. More precisely, we formalized the core data model and basic queries of these systems and introduced a rule language (NO-RL) to express lightweight ontologies on top of data. We defined a sound and complete query rewriting technique and studied the decidability and data complexity of answering ontology-mediated queries depending on the rule fragment considered.