LINKS - 2018 - Annual activity report

Project-Team Links

Section: Research Program

Querying Heterogeneous Linked Data

Our main objective is to query collections of linked datasets. In the static setting, we consider two kinds of links: explicit links between elements of the datasets, such as equalities or pointers, and logical links between relations of different datasets, such as schema mappings. In the dynamic setting, we permit a third kind of link that points to “intentional” relations computable from a description, such as the application of a Web service or of a schema mapping.

We believe that collections of linked datasets are usually too large for anyone to have global knowledge of all datasets. Therefore, schema mappings and constraints should remain between pairs of datasets. Our main goal is to be able to pose a query on a collection of datasets while accounting for the possible recursive effects of schema mappings. For illustration, consider a ring of datasets $D_{1}$, $D_{2}$, $D_{3}$ linked by schema mappings $M_{1}$, $M_{2}$, $M_{3}$ that tell us how to complete a database $D_{i}$ with new elements from the next database in the cycle.

The mappings $M_{i}$ induce three intentional datasets $I_{1}$, $I_{2}$, and $I_{3}$, such that $I_{i}$ contains all elements from $D_{i}$ and all elements implied by $M_{i}$ from the next intentional dataset in the ring:

$$I_{1} = D_{1} \cup M_{1}(I_{2}), \quad I_{2} = D_{2} \cup M_{2}(I_{3}), \quad I_{3} = D_{3} \cup M_{3}(I_{1})$$

Clearly, the global information collected by the intentional datasets depends recursively on all three original datasets $D_{i}$. Queries to the global information can now be specified as standard queries to the intentional databases $I_{i}$. However, we will never materialize the intentional databases $I_{i}$. Instead, we can rewrite queries on one of the intentional datasets $I_{i}$ into recursive queries on the union of the original datasets $D_{1}$, $D_{2}$, and $D_{3}$ with their links and relations. Therefore, we need a query answering algorithm for recursive queries that chases the “links” between the $D_{i}$ in order to compute only the part of $I_{i}$ needed to answer the query.
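To make the recursive definition of the $I_{i}$ concrete, the following minimal sketch computes them as a least fixpoint by naive iteration. It is only an illustration of the semantics, not the query rewriting approach described above: it materializes the intentional datasets, which is exactly what one wants to avoid in practice. It assumes that datasets are finite sets of facts and that each mapping is a monotone function from sets of facts to sets of facts. The names `ring_fixpoint` and `rename`, and the example facts, are hypothetical.

```python
from typing import Callable, Hashable, List, Set

Dataset = Set[Hashable]
Mapping = Callable[[Dataset], Dataset]


def ring_fixpoint(datasets: List[Dataset], mappings: List[Mapping]) -> List[Dataset]:
    """Compute I_i = D_i ∪ M_i(I_{i+1}) around a ring by naive iteration.

    Assumes every mapping is monotone and produces finitely many facts,
    so that the iteration terminates.
    """
    n = len(datasets)
    intensional = [set(d) for d in datasets]  # start from I_i = D_i
    changed = True
    while changed:
        changed = False
        for i in range(n):
            successor = intensional[(i + 1) % n]  # next dataset in the ring
            new_facts = mappings[i](successor) - intensional[i]
            if new_facts:
                intensional[i] |= new_facts
                changed = True
    return intensional


def rename(source_kind: str, target_kind: str) -> Mapping:
    """Hypothetical mapping: copy facts of one kind under another kind."""
    return lambda facts: {(target_kind, x) for (kind, x) in facts if kind == source_kind}


# Three toy datasets, each holding (kind, name) facts.
d1 = {("person", "alice")}
d2 = {("author", "bob")}
d3 = {("speaker", "carol")}

# M_1 reads from I_2, M_2 reads from I_3, M_3 reads from I_1.
m1 = rename("author", "person")
m2 = rename("speaker", "author")
m3 = rename("person", "speaker")

i1, i2, i3 = ring_fixpoint([d1, d2, d3], [m1, m2, m3])
```

In this toy ring, every name ends up in every intentional dataset, although each original dataset contributes only one fact. This is the recursive dependency on all three $D_{i}$ that the equations express.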

This illustrates that we must account for graph data models when dealing with collections of linked data, and that query languages for such graphs must provide recursion in order to chase links. Therefore, we will have to study graph databases with recursive queries, such as RDF graphs with SPARQL queries, but also other classes of graph databases and queries.
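As a small illustration of what chasing links by recursion means on a graph, the sketch below evaluates a "one or more steps along a given edge label" query over a set of triples, in the spirit of a SPARQL 1.1 property path with the `+` operator. It is a plain breadth-first traversal written for exposition, not an RDF engine. The function name `reachable`, the predicate `linksTo`, and the node identifiers are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def reachable(triples: Iterable[Triple], start: str, predicate: str) -> Set[str]:
    """Return all nodes reachable from `start` by one or more `predicate` edges."""
    # Index the edges carrying the requested predicate by their subject.
    successors: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p == predicate:
            successors.setdefault(s, set()).add(o)

    seen: Set[str] = set()
    queue = deque(successors.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; this guard ensures termination on cycles
        seen.add(node)
        queue.extend(successors.get(node, ()))
    return seen


# Toy graph: elements of three datasets linked in a cycle, plus an unrelated edge.
triples = [
    ("d1:a", "linksTo", "d2:b"),
    ("d2:b", "linksTo", "d3:c"),
    ("d3:c", "linksTo", "d1:a"),
    ("d1:a", "label", "A"),
]

result = reachable(triples, "d1:a", "linksTo")
```

Because the links form a cycle, the start node is itself reachable in three steps, so `result` contains all three nodes. A query language without recursion could only follow a number of links fixed in advance.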

We study schemas and mappings between datasets with different kinds of data models, and the complexity of evaluating recursive queries over graphs. To use schema mappings for querying the different datasets efficiently, we need to optimize queries by taking the mappings into account. Therefore, we will study the static analysis of schema mappings and recursive queries. Finally, we develop concrete applications in which our fundamental techniques can be applied.
