Section: Application Domains
Links are important for web users who try to locate relevant information. Such users typically want to pose their queries locally and obtain answers from both local and remote repositories. With linked data collections, today's web users are provided with a virtual collection of data and explicit links. The first goal of our project is to enrich this collection of data and links with more expressive mappings between local relations and external resources. Such mappings are not available on the current Web, and they would make it possible to take better advantage of the diversity and heterogeneity of information. The answer to a user query needs to exploit both explicit links, such as pointers to external resources or semantic correspondences to them, and logical links to external repositories, represented as schema mappings. Therefore, the second goal is to evaluate local queries across such mappings and thus exploit the semantic knowledge of external resources. However, we argue that the benefits of links are not limited to casual users. In this section, we briefly discuss two applications in which linked data collections need to be enriched and queried.
Collective Intelligence. Collective intelligence is a shared or group intelligence that emerges from the collaboration of individuals (from Wikipedia). There are many contexts in which such a concept is readily applicable. We advocate here one possible scenario, namely that of Business Intelligence. In the past decade, most enterprise data was proprietary and thus resided within the enterprise repository, along with the knowledge derived from that data. Today's enterprises and business users face the problem of information explosion, due to the Internet's ability to rapidly convey large amounts of information throughout the world via end-user applications and tools. Although linked data collections help bridge the gap between enterprise data and external resources, they are not sufficient to support the various tasks of Business Intelligence. As a concrete example, concepts in an enterprise repository need to be matched with concepts in Wikipedia, and this can be done via pointers or equalities. However, more complex logical statements (i.e. mappings) need to be conceived to map a portion of a local database to a portion of an RDF graph, such as a subgraph in Wikipedia or in a social network, e.g. LinkedIn. Such mappings would then enrich the collective knowledge shared within the enterprise and allow more complex queries to be evaluated. For instance, business users need to perform complex sentiment analysis on potential clients with the aid of business intelligence tools. For this reason, such tools must be able to pose complex queries that exploit the logical mappings above to guide the analysis. Moreover, the external resources may evolve rapidly, which requires revisiting the current state of collective intelligence.
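To make the distinction between explicit links and logical mappings more tangible, the following minimal sketch shows one way a local relation could be mapped to a portion of an RDF graph and then queried together with external triples. Everything in it is an assumption made for illustration: the Client relation, the ex: and local: identifiers, the SAME_AS table and the function names are hypothetical and do not describe an existing system or dataset.

```python
from typing import NamedTuple


class Client(NamedTuple):
    """A row of a hypothetical local relation Client(client_id, name, company)."""
    client_id: int
    name: str
    company: str


# Hypothetical external RDF graph, modelled as (subject, predicate, object) triples.
EXTERNAL_TRIPLES = {
    ("ex:Acme", "ex:industry", "ex:Manufacturing"),
    ("ex:Acme", "ex:headquarters", "ex:Lyon"),
    ("ex:Globex", "ex:industry", "ex:Energy"),
}

# Explicit links: equalities between local company names and external resources.
SAME_AS = {"Acme Corp": "ex:Acme", "Globex": "ex:Globex"}


def map_clients_to_triples(clients):
    """Logical mapping (assumed): each Client row whose company has an explicit
    link yields a triple (local client, ex:worksFor, external resource)."""
    triples = set()
    for c in clients:
        resource = SAME_AS.get(c.company)
        if resource is not None:
            triples.add((f"local:client/{c.client_id}", "ex:worksFor", resource))
    return triples


def clients_in_industry(clients, industry):
    """Evaluate a local query across the mapping: names of clients whose
    company belongs to the given industry according to the external graph."""
    mapped = map_clients_to_triples(clients)
    companies = {
        s for (s, p, o) in EXTERNAL_TRIPLES if p == "ex:industry" and o == industry
    }
    client_ids = {
        s for (s, p, o) in mapped if p == "ex:worksFor" and o in companies
    }
    return sorted(
        c.name for c in clients if f"local:client/{c.client_id}" in client_ids
    )


clients = [
    Client(1, "Alice", "Acme Corp"),
    Client(2, "Bob", "Globex"),
    Client(3, "Carol", "Unknown Ltd"),
]
result = clients_in_industry(clients, "ex:Manufacturing")
```

In this sketch the query is posed over local data only, while the answer depends on knowledge that resides in the external graph. Clients without an explicit link are simply left out of the mapped triples.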
Data cleaning. The second application of our proposal concerns scientists who want to quickly inspect relevant literature and datasets. In such a case, local knowledge that comes from a local repository of publications belonging to a research institute (e.g. HAL) needs to be integrated with other Web-based repositories, such as DBLP, Google Scholar, ResearchGate and even Wikipedia. Indeed, the local repository may be incomplete or contain semantic ambiguities, such as mistaken or missing conference venues, mistaken long names for publication venues and journals, missing explanations of research keywords, and opaque keywords. We envision a publication management system that exploits both explicit links, namely pointers to external resources, and logical links, i.e. more complex relationships between local portions of data and remote resources. Such a scenario could entail different tasks, such as (i) cleaning errors by linking to correct data, e.g. via mappings from HAL to DBLP for publication errors and via mappings from HAL to Wikipedia for opaque keywords, (ii) thoroughly enriching the list of publications of a given research institute, and (iii) supporting complex queries on the corrected data combined with logical mappings.
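As an illustration of task (i), the sketch below repairs venue names in a local publication list by linking each record to a reference record with the same normalized title. It is only a sketch under stated assumptions: the record layout, the field names (title, venue, key, source) and the exact-title matching rule are hypothetical, and they do not reflect the actual schemas or APIs of HAL or DBLP.

```python
def normalize(title):
    """Lower-case a title and collapse whitespace so that trivially
    different spellings compare as equal."""
    return " ".join(title.lower().split())


def clean_publications(local_records, reference_records):
    """Return copies of the local records in which the venue is replaced by
    the reference venue whenever a reference record has the same title.

    Matching on normalized titles is an assumption of this sketch; a real
    system would need a more robust linkage criterion."""
    reference_by_title = {normalize(r["title"]): r for r in reference_records}
    cleaned = []
    for record in local_records:
        fixed = dict(record)  # leave the original record untouched
        reference = reference_by_title.get(normalize(record["title"]))
        if reference is not None:
            fixed["venue"] = reference["venue"]
            fixed["source"] = reference["key"]  # explicit link to the reference
        cleaned.append(fixed)
    return cleaned


# Hypothetical data, for illustration only.
local_records = [
    {"title": "Query  Answering over Mappings", "venue": "vldb conf."},
    {"title": "Unmatched Paper", "venue": None},
]
reference_records = [
    {"key": "ref/1", "title": "query answering over mappings", "venue": "VLDB"},
]
cleaned = clean_publications(local_records, reference_records)
```

Records without a match are kept unchanged, so missing venues remain visible and can be handled by another source or by a curator.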