Section: New Results

Quality and interoperability of large document catalogues

Participants: Michel Chein, Madalina Croitoru, Alain Gutierrez, Michel Leclère, Rallou Thomopoulos.

The work in this research line takes place in the ANR project Qualinca, devoted to methods and tools to repair linkage errors in bibliographical databases (see Qualinca in Section 9.1). Within this project, we work in particular with our application partner ABES (French Agency for Academic Libraries, http://www.abes.fr/).

ABES manages several catalogues and authority bases, in particular the Sudoc, the collective catalogue of French academic libraries. ABES also provides services to libraries and end-users, as well as to other catalogue managers (e.g., OCLC for Worldcat and, in France, Adonis for the Isidore platform).

This year, we devoted most of our research effort to the following aspects in collaboration with ABES:

  1. the finalization of a conceptual model of ABES librarian expertise in their linkage activity, and its formalization in our theoretical framework; the formalized model is both logical (the knowledge is expressed by facts, rules and constraints in first-order logic) and numerical (some predicates, which correspond to qualitative criteria, are computed by numerical functions, which themselves take as input the result of logical queries to the knowledge base).

  2. the development of a diagnosis prototype, called SudoQual, which implements this model; in brief, SudoQual takes as input a given appellation (i.e., family name and first name), retrieves all references potentially associated with this appellation, and outputs sameAs and Different links between these references. To develop SudoQual, we built an API on top of our tool Cogui.

  3. first experiments with SudoQual on the Sudoc base, with the results being checked manually by ABES librarians.

Research report [37]
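To make the combination of logical and numerical knowledge more concrete, the following is a minimal sketch, not the SudoQual implementation. It assumes a toy reference store, two hypothetical criteria (`domain_closeness` and `date_compatible`), illustrative thresholds, and two illustrative rules; none of these names, values, or rules come from the actual model.

```python
from itertools import combinations

# Hypothetical toy references sharing one appellation.
# Field names and values are assumptions made for this sketch only.
REFERENCES = {
    "r1": {"name": "Martin, Jean", "domains": {"logic", "databases"}, "year": 1998},
    "r2": {"name": "Martin, Jean", "domains": {"logic", "knowledge representation"}, "year": 2003},
    "r3": {"name": "Martin, Jean", "domains": {"botany"}, "year": 1921},
}


def retrieve(appellation):
    """Stand-in for a logical query: ids of references carrying the appellation."""
    return sorted(rid for rid, ref in REFERENCES.items() if ref["name"] == appellation)


def domain_closeness(a, b):
    """Numerical function mapped to a qualitative value (thresholds are illustrative)."""
    da, db = REFERENCES[a]["domains"], REFERENCES[b]["domains"]
    jaccard = len(da & db) / len(da | db)
    if jaccard >= 0.3:
        return "close"
    if jaccard == 0:
        return "distant"
    return "neutral"


def date_compatible(a, b, max_gap=60):
    """Hypothetical criterion: publication years fit within one career span."""
    return abs(REFERENCES[a]["year"] - REFERENCES[b]["year"]) <= max_gap


def diagnose(appellation):
    """Apply two illustrative rules to every pair of retrieved references."""
    links = {}
    for a, b in combinations(retrieve(appellation), 2):
        closeness = domain_closeness(a, b)
        compatible = date_compatible(a, b)
        if closeness == "close" and compatible:
            links[(a, b)] = "sameAs"
        elif closeness == "distant" and not compatible:
            links[(a, b)] = "Different"
        # Otherwise no link is asserted: the pair stays undecided.
    return links


links = diagnose("Martin, Jean")
```

In this sketch, pairs that satisfy neither rule are left without a link rather than forced into one of the two categories, which leaves the final decision to a human expert.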

This work required a tight collaboration with ABES (materialized by bimonthly meetings and numerous ad hoc exchanges). The first experiments yielded extremely satisfactory results; ABES is therefore now considering turning SudoQual into a production tool used by librarians in their daily work to validate or correct authority links in the Sudoc catalogue. This requires defining a suitable user interface, an issue we are currently discussing with ABES. We are also preparing larger-scale experiments on a sample provided by ABES.

In addition, in collaboration with Qualinca partner LRI, we developed a method and a tool to fuse data linked by “same-as” links. More precisely, given an RDF dataset, our tool merges “same-as” data, which are often conflicting, into a unified and consistent representation using a multi-criteria decision method. The tool was evaluated on a dataset provided by INA and LIG, two other partners of Qualinca.

EGC 2015 [32]
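The fusion step can be illustrated by the sketch below, which is not the tool itself. It assumes that triples carry a source identifier, that the decision method is a simple weighted sum over two hypothetical criteria (source reliability and value frequency), and that all data, weights, and reliability scores are made up; the multi-criteria method actually used may differ.

```python
from collections import defaultdict

# Hypothetical (subject, property, value, source) tuples; all values are invented.
TRIPLES = [
    ("ex:doc1", "ex:title", "Evening news", "srcA"),
    ("ex:doc2", "ex:title", "Evening news (TV)", "srcB"),
    ("ex:doc3", "ex:title", "Evening news", "srcC"),
    ("ex:doc1", "ex:year", "1975", "srcA"),
    ("ex:doc2", "ex:year", "1976", "srcB"),
]
SAME_AS = [("ex:doc1", "ex:doc2"), ("ex:doc2", "ex:doc3")]
RELIABILITY = {"srcA": 0.9, "srcB": 0.6, "srcC": 0.5}   # assumed scores
WEIGHTS = {"reliability": 0.6, "frequency": 0.4}        # assumed weights


def clusters(pairs):
    """Union-find over same-as pairs; returns a function giving each node's representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # The smallest identifier becomes the representative of the merged cluster.
            parent[max(ra, rb)] = min(ra, rb)
    return find


def fuse(triples, same_as, reliability, weights):
    """Keep, per cluster and property, the value with the best weighted-sum score."""
    find = clusters(same_as)
    support = defaultdict(lambda: defaultdict(set))  # (entity, prop) -> value -> sources
    for s, p, v, src in triples:
        support[(find(s), p)][v].add(src)

    merged = defaultdict(dict)
    for (entity, prop), values in support.items():
        total = sum(len(srcs) for srcs in values.values())

        def score(value):
            srcs = values[value]
            best_reliability = max(reliability[s] for s in srcs)
            frequency = len(srcs) / total
            return (weights["reliability"] * best_reliability
                    + weights["frequency"] * frequency)

        # Sorting first makes tie-breaking deterministic.
        merged[entity][prop] = max(sorted(values), key=score)
    return {entity: dict(props) for entity, props in merged.items()}


unified = fuse(TRIPLES, SAME_AS, RELIABILITY, WEIGHTS)
```

A weighted sum is only one of several possible aggregation schemes; it is used here because it keeps the conflict-resolution step easy to read.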

Still with the LRI partner, who developed a logic-based decision tool that rules on the validity of “same-as” links in RDF data, we investigated the use of argumentation techniques to explain why “same-as” links are invalidated by this tool.

SUM 2015 [25]