Section: New Results

Knowledge Engineering and Web of Data

Participants : Emmanuelle Gaillard, Nicolas Jay, Florence Le Ber, Jean Lieber, Amedeo Napoli, Emmanuel Nauer, Justine Reynaud, Yannick Toussaint.

Keywords:

knowledge engineering, web of data, definition mining, classification-based reasoning, case-based reasoning, belief revision, semantic web

Current Trends in Case-Based Reasoning

The Taaable project was originally created as an entrant in the Computer Cooking Contest (ICCBR Conference) [72]. Beyond its participation in the CCC challenges, the Taaable project aims at federating various research themes including case-based reasoning (CBR), knowledge discovery, knowledge engineering and belief change theory [6]. In Taaable, CBR adapts recipes to user constraints. The reasoning process relies on a cooking domain ontology (especially hierarchies of classes) and on adaptation rules. The knowledge base is encoded within a semantic wiki containing the recipes, the domain ontology and the adaptation rules.

Adaptation rules have been used to manage ingredient adaptation when only a restricted set of ingredients is available [43]. Three types of rules have been identified. The first type substitutes ingredients belonging to the same category (e.g. dairy) with the sole available ingredient of this category (e.g. yogurt). The second type substitutes ingredients according to the role they play in the recipe, e.g. egg can be replaced by salmon in salad recipes because both play the role of a protein. The third type removes ingredients from the original recipe when no rule of the first or second type applies to them.
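The three rule types can be illustrated with a minimal Python sketch. It is not the Taaable implementation: the `CATEGORY` and `ROLE` tables, the `adapt` function and the sample ingredients are hypothetical, and the order in which the rules are tried is an assumption.

```python
# Hypothetical toy knowledge: ingredient -> category, ingredient -> role.
CATEGORY = {"milk": "dairy", "cream": "dairy", "yogurt": "dairy"}
ROLE = {"egg": "protein", "salmon": "protein"}


def adapt(recipe, available):
    """Adapt a list of ingredients to a restricted set of available ones."""
    adapted = []
    for ing in recipe:
        if ing in available:
            adapted.append(ing)  # nothing to adapt
            continue
        # Type 1: same category, and exactly one available ingredient in it.
        same_category = sorted(
            a for a in available
            if ing in CATEGORY and CATEGORY.get(a) == CATEGORY[ing]
        )
        if len(same_category) == 1:
            adapted.append(same_category[0])
            continue
        # Type 2: an available ingredient playing the same role.
        same_role = sorted(
            a for a in available
            if ing in ROLE and ROLE.get(a) == ROLE[ing]
        )
        if same_role:
            adapted.append(same_role[0])
            continue
        # Type 3: no rule of type 1 or 2 applies, so the ingredient is removed.
    return adapted


salad = ["lettuce", "egg", "cream", "capers"]
print(adapt(salad, {"lettuce", "salmon", "yogurt"}))
```

In this toy run, egg is replaced by salmon (same role), cream by yogurt (sole available dairy ingredient), and capers are removed.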

Formal Concept Analysis (FCA) classifies objects into a concept lattice according to the properties they share. A lattice has been built from a large set of cocktail recipes according to the ingredients they use, producing a hierarchy of ingredient combinations. For example, when a cocktail recipe R has to be adapted, this lattice can be used to search for the best ingredient combinations in the concepts that are the closest to the concept representing R [43].
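As an illustration of the lattice construction, here is a naive Python sketch over a hypothetical three-cocktail context. The context and all function names are assumptions. The brute-force enumeration is only suitable for toy data, and taking the upper neighbours of the concept of R is just one possible reading of "closest concepts".

```python
from itertools import combinations

# Hypothetical toy context: cocktail -> set of ingredients.
CONTEXT = {
    "mojito": {"rum", "lime", "mint"},
    "daiquiri": {"rum", "lime"},
    "caipirinha": {"cachaca", "lime"},
}
ALL_ATTRS = sorted(set().union(*CONTEXT.values()))


def extent(attrs):
    """Recipes that use every ingredient in attrs."""
    return frozenset(o for o, a in CONTEXT.items() if attrs <= a)


def intent(objs):
    """Ingredients shared by every recipe in objs."""
    shared = set(ALL_ATTRS)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def concepts():
    """Enumerate all formal concepts by closing every attribute subset."""
    found = set()
    for r in range(len(ALL_ATTRS) + 1):
        for combo in combinations(ALL_ATTRS, r):
            e = extent(set(combo))
            found.add((e, intent(e)))
    return found


def object_concept(recipe):
    """Most specific concept containing the given recipe."""
    i = frozenset(CONTEXT[recipe])
    return (extent(i), i)


def upper_neighbours(concept, all_concepts):
    """Concepts directly above the given one in the lattice."""
    ext = concept[0]
    larger = [c for c in all_concepts if ext < c[0]]
    return [c for c in larger if not any(ext < d[0] < c[0] for d in larger)]


lattice = concepts()
for ext, itt in upper_neighbours(object_concept("mojito"), lattice):
    print(sorted(ext), sorted(itt))
```

Here the only upper neighbour of the mojito concept groups mojito and daiquiri around the shared combination of rum and lime. Such a shared combination is the kind of information an adaptation step could start from.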

Two main lines of work addressed the application of CBR in medicine. Imaging, in particular in nuclear medicine, is getting more and more complex over the years. Each year, new radiotracers and machines are developed and tested. Despite this rapid evolution, few studies address the issue of image interpretation and imaging reports. In [35], we show how nuclear image interpretation is improved by Tetra, a new case-based decision support system.

Cancer registries are important tools in the fight against cancer. At the heart of these registries is the data collection and coding process. Operators, who must follow complex international standards and numerous best practices, are easily overwhelmed. In [48], a system is presented to assist operators in the interpretation of best medical coding practices.

Finally, an approach to adaptation based on the principles of analogical transfer applied to the RDFS formalism has been developed. It is based on the problem-solution dependency represented as an RDFS graph: this dependency within the source case is modified so that it fits the context of the target problem [2]. This is implemented within the so-called SQTRL system (for “SPARQL Query Transformation Rule Language”, http://tuuurbine.loria.fr/sqtrl/) [2]. The development of SQTRL is based on a collaboration between the Orpailleur team and the Archives Henri Poincaré (http://poincare.univ-lorraine.fr/).

Exploring and Classifying the Web of Data

Part of the research work in Knowledge Engineering is oriented towards knowledge discovery in the web of data, following the increase of data published in RDF (Resource Description Framework) format and the interest in machine-processable data. The quick growth of Linked Open Data (LOD) has raised challenges regarding quality assessment and data exploration of the RDF triples that shape the LOD cloud. In the team, we are particularly interested in the “completeness of the data”, viewed as their potential to provide concept definitions in terms of necessary and sufficient conditions [69]. We have proposed a novel technique based on Formal Concept Analysis which classifies subsets of RDF data into a concept lattice [47]. This allows data exploration as well as the discovery of implication rules which are used to automatically detect “possible completions of RDF data” and to provide definitions. Moreover, this is a way of reconciling syntax and semantics in the LOD cloud. Experiments on the DBpedia knowledge base show that this kind of approach is well-founded and effective.
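The role of implication rules in detecting possible completions can be sketched on a few hypothetical triples. This is not the technique of [47]: the triples, the function names, and the heuristic (when B → A holds for all subjects but A → B does not, subjects having A without B are flagged) are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not taken from DBpedia.
TRIPLES = [
    ("Hamlet", "author", "Shakespeare"),
    ("Hamlet", "category", "Shakespeare_plays"),
    ("Macbeth", "author", "Shakespeare"),
    ("Macbeth", "category", "Shakespeare_plays"),
    ("Othello", "author", "Shakespeare"),  # the category triple is missing
    ("Tartuffe", "author", "Moliere"),
]


def build_context(triples):
    """Formal context: subject -> set of (predicate, object) attributes."""
    ctx = defaultdict(set)
    for s, p, o in triples:
        ctx[s].add((p, o))
    return dict(ctx)


def confidence(ctx, premise, conclusion):
    """Fraction of subjects satisfying premise that also satisfy conclusion."""
    covered = [s for s, attrs in ctx.items() if premise <= attrs]
    if not covered:
        return None
    return sum(1 for s in covered if conclusion <= ctx[s]) / len(covered)


def candidate_completions(ctx, a, b):
    """If b -> a is an implication, list subjects that have a but lack b."""
    if confidence(ctx, b, a) != 1.0:
        return []
    return sorted(s for s, attrs in ctx.items() if a <= attrs and not b <= attrs)


ctx = build_context(TRIPLES)
A = {("author", "Shakespeare")}
B = {("category", "Shakespeare_plays")}
print(confidence(ctx, A, B))
print(candidate_completions(ctx, A, B))
```

If the missing triple for Othello were added, both directions would hold on this toy data. A and B would then be equivalent, which is the shape of a definition by necessary and sufficient conditions.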

In the same way, FCA can be used to improve ontologies associated with the Web of data. Accordingly, we proposed a method to build a concept lattice from linked data and to compare the structure of this lattice with an ontology used to type the considered data [46]. The result of this comparison shows which “new axioms” can be proposed to ontology developers for guiding their design work.
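One simple way to picture such a comparison is to derive class subsumptions from the typing data and to report those that the ontology does not entail. The sketch below rests on several assumptions: the instances, classes and `SUBCLASS_OF` pairs are hypothetical, and the method of [46] works on a full concept lattice rather than on pairwise extent inclusion.

```python
# Hypothetical typing data: instance -> set of classes asserted in the data.
TYPES = {
    "Hamlet": {"Play", "WrittenWork"},
    "Tartuffe": {"Play", "WrittenWork"},
    "Ulysses": {"Novel", "WrittenWork"},
    "Dubliners": {"WrittenWork"},
}
# Hypothetical ontology: (subclass, superclass) pairs.
SUBCLASS_OF = {("Novel", "WrittenWork")}


def transitive_closure(pairs):
    """All subclass pairs entailed by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def candidate_axioms(types, subclass_of):
    """Subsumptions that hold in the data but are not entailed by the ontology."""
    classes = sorted(set().union(*types.values()))
    known = transitive_closure(subclass_of)
    candidates = []
    for c in classes:
        ext_c = {i for i, t in types.items() if c in t}
        for d in classes:
            if c == d:
                continue
            ext_d = {i for i, t in types.items() if d in t}
            # Extent inclusion means every instance of c is also typed as d.
            if ext_c <= ext_d and (c, d) not in known:
                candidates.append((c, d))
    return candidates


print(candidate_axioms(TYPES, SUBCLASS_OF))
```

On this toy data the only candidate is that Play is a subclass of WrittenWork. A subsumption observed in data is only a suggestion, since it may reflect incomplete typing, so it would still be reviewed by ontology developers.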