Section: New Results

Distributed data management

Participants: Serge Abiteboul, Emilien Antoine, Daniel Deutch, Alban Galland, Wojciech Kazana, Yannis Katsis, Luc Segoufin, Cristina Sirangelo.

Distributed knowledge base.

As a foundation for managing distribution, we have proposed a model of a distributed knowledge base that handles data and meta-data, as well as access control and localization, in a single integrated setting. To support automatic reasoning on this knowledge base, we have also introduced Webdamlog, a novel rule-based language that supports the exchange of rules. This work has been presented [21] and demonstrated [26] at major database conferences.

Probabilistic XML.

Data from the Web are imprecise and uncertain. To manage this imprecision in a well-principled way, we have made significant advances in the field of probabilistic databases, and specifically in probabilistic XML. We have introduced new tractable probabilistic models for representing uncertain hierarchical information, and carried out in-depth studies of query evaluation, aggregation, and updates in various probabilistic XML models. These results have matured, and some of them are available in journal articles, e.g., [14].
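To give a concrete feel for query evaluation over uncertain hierarchical data, here is a minimal Python sketch. It assumes a deliberately simple model in which every node is kept independently with some probability, given that its parent is present. This is an illustrative simplification and not necessarily one of the models introduced in the work above; the names `PNode` and `prob_label_exists` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PNode:
    label: str
    # Assumption: probability that this node is kept, given its parent is present.
    prob: float = 1.0
    children: List["PNode"] = field(default_factory=list)


def prob_label_exists(node: PNode, target: str) -> float:
    """Probability that a node labelled `target` occurs in the subtree
    rooted at `node`, given that `node` itself is present."""
    if node.label == target:
        return 1.0
    # Children are independent, so multiply the probabilities of "no match".
    miss = 1.0
    for child in node.children:
        miss *= 1.0 - child.prob * prob_label_exists(child, target)
    return 1.0 - miss


doc = PNode("person", children=[
    PNode("name", 1.0),
    PNode("phone", 0.7),
    PNode("address", 0.5, [PNode("phone", 0.4)]),
])

p_phone = prob_label_exists(doc, "phone")  # 1 - (0.3 * 0.8) = 0.76
```

Under this independence assumption, the probability is computed in a single bottom-up pass over the tree, which is the kind of tractability that motivates restricted probabilistic models.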

Enumeration of query answers.

In many applications, the output of a query may be huge, and enumerating all the answers may already consume too much of the available resources. In this case, it may be appropriate to first output a small subset of the answers and then, on demand, output a further small number of answers, and so on until all answers have been exhausted. To make this approach attractive, one wants to minimize both the time needed to output the first answers and the time needed to move from one set of answers to the next; the latter interval is known as the delay. We have shown that this can be achieved with linear preprocessing time and constant enumeration delay for first-order queries over structures of bounded degree [19].
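The two-phase scheme (preprocessing, then on-demand enumeration) can be sketched in Python for one fixed query. The sketch below is not the general algorithm of [19]; it only treats the query "x and y are distinct and not adjacent" over an undirected graph, and it assumes hash-set lookups take constant time. The function names are hypothetical. Because each node has at most d neighbours, at most d + 1 candidates are skipped between two consecutive answers, so the delay depends on d but not on the size of the graph.

```python
from itertools import islice


def preprocess(nodes, edges):
    """Linear-time preprocessing: build adjacency sets for an undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return list(nodes), adj


def enumerate_non_adjacent(index):
    """Lazily yield all pairs (x, y) with x != y and no edge between them."""
    order, adj = index
    for x in order:
        blocked = adj[x]
        for y in order:
            # At most deg(x) + 1 candidates are rejected for each x.
            if y != x and y not in blocked:
                yield (x, y)


# A path 0 - 1 - 2 - 3 - 4 (every node has degree at most 2).
index = preprocess(range(5), [(0, 1), (1, 2), (2, 3), (3, 4)])
answers = enumerate_non_adjacent(index)

first_batch = list(islice(answers, 3))  # first small set of answers
next_batch = list(islice(answers, 3))   # produced only on demand
```

Using a generator keeps the enumeration state between requests, so a client can stop after any batch without paying for the answers it never asks for.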

Data exchange and Web incomplete information.

We have addressed the problem of restructuring data exchanged between communicating applications on the Web. We have proposed and analyzed a new language for specifying data restructuring rules (schema mappings). This language generalizes existing mapping dependencies by allowing a more flexible specification mechanism [20].
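As background, the following Python sketch shows how a classical kind of mapping dependency restructures source data and why incomplete information arises in the target. It does not illustrate the new language of [20]. The relation names (`Emp`, `Works`, `Manager`), the two rules, and the `_N1`-style labelled nulls are assumptions made for the example.

```python
from itertools import count


def apply_mapping(emp_rows):
    """Apply two hypothetical source-to-target rules to Emp(name, dept):
         Emp(n, d) -> Works(n, d)
         Emp(n, d) -> exists m . Manager(d, m)
    The unknown manager m is represented by a fresh labelled null."""
    fresh = count(1)
    manager_of = {}
    works, manager = [], []
    for name, dept in emp_rows:
        works.append((name, dept))
        # Fire the second rule only if no manager exists yet for this dept.
        if dept not in manager_of:
            manager_of[dept] = f"_N{next(fresh)}"
            manager.append((dept, manager_of[dept]))
    return {"Works": works, "Manager": manager}


target = apply_mapping([("ann", "db"), ("bob", "db"), ("eve", "ai")])
```

The target instance contains one labelled null per department: the mapping states that a manager exists without saying who it is, which is exactly the kind of incompleteness a data exchange setting has to handle.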


We have also invested considerable effort in a textbook (undergraduate and graduate level) on Web data management (nicknamed Jorge), to be published by Cambridge University Press [38]. The book is already available on the Webdam Web site: http://webdam.inria.fr/Jorge