Section: Research Program

From KDD to KDDK

Keywords: knowledge discovery in databases, knowledge discovery in databases guided by domain knowledge, data mining

Knowledge discovery in databases (KDD) is a process for extracting knowledge units from large databases, units that can be interpreted and reused within knowledge-based systems. Operationally, the KDD process runs within a KDD system that includes databases, data mining modules, and interfaces for interaction, e.g. editing and visualization. The process is based on three main operations: selection and preparation of the data, data mining, and interpretation of the extracted units. The process of knowledge discovery in databases guided by domain knowledge (KDDK), as implemented in the research work of the Orpailleur team, is based on data mining methods that are either symbolic or numerical:

  • Symbolic methods are based on frequent itemset search, association rule extraction [108], and Formal Concept Analysis and its extensions [93].

  • Numerical methods are based on higher-order stochastic models, namely second-order Hidden Markov Models (HMM2) and Hidden Markov Random Fields (HMRF), which are designed for efficient modeling of space and time [9].
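To make the symbolic side concrete, here is a minimal sketch of frequent itemset search followed by association rule extraction. It is a brute-force illustration on a toy dataset, not the algorithms used by the team; the names `TRANSACTIONS`, `support`, `frequent_itemsets`, and `association_rules`, as well as the thresholds, are assumptions introduced for this example.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset whose support reaches `min_support`.

    Brute force over all item combinations: fine for a toy example,
    exponential in the number of items in general.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


def association_rules(frequent, min_confidence):
    """Derive rules premise -> conclusion from frequent itemsets.

    confidence(X -> Y) = support(X union Y) / support(X).
    Every subset of a frequent itemset is itself frequent, so the
    premise support can always be looked up in `frequent`.
    """
    rules = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                premise = frozenset(combo)
                confidence = supp / frequent[premise]
                if confidence >= min_confidence:
                    rules.append((premise, itemset - premise, supp, confidence))
    return rules


if __name__ == "__main__":
    frequent = frequent_itemsets(TRANSACTIONS, min_support=0.5)
    for premise, conclusion, supp, conf in association_rules(frequent, 0.7):
        print(sorted(premise), "->", sorted(conclusion), round(supp, 2), round(conf, 2))
```

Practical systems replace the exhaustive enumeration with level-wise pruning or with closed itemsets, which is where Formal Concept Analysis connects to rule extraction.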

KDDK can be summarized as a process that goes from complex data units to knowledge units under the guidance of domain knowledge [104]. Two original aspects stand out: (i) the knowledge discovery process is guided by domain knowledge at each step, and (ii) the extracted units are embedded within a knowledge-based system for problem-solving purposes.
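One way to picture how domain knowledge can intervene at each step is the sketch below, which chains the three operations of the process (preparation, mining, interpretation). Everything in it is hypothetical and chosen for illustration: the `TAXONOMY` mapping, the `RECORDS` data, the `KNOWN_FACTS` set, and the function names are assumptions, and the mining step is a deliberately simple pair count rather than any method from the cited work.

```python
from collections import Counter
from itertools import combinations

# Hypothetical domain knowledge: a small taxonomy generalizing raw items.
TAXONOMY = {"wheat": "cereal", "barley": "cereal", "pea": "legume", "bean": "legume"}

# Hypothetical raw records (e.g. observations made on land parcels).
RECORDS = [
    {"wheat", "pea"},
    {"barley", "bean"},
    {"wheat", "clay"},
    {"barley", "clay"},
    {"pea", "clay"},
]

# Hypothetical facts already present in the knowledge base.
KNOWN_FACTS = {("cereal", "clay")}


def prepare(records, taxonomy):
    """Step 1, selection and preparation: generalize items using the taxonomy."""
    return [frozenset(taxonomy.get(item, item) for item in record) for record in records]


def mine(transactions, min_count):
    """Step 2, data mining: keep item pairs that co-occur at least `min_count` times."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def interpret(patterns, known_facts):
    """Step 3, interpretation: set aside patterns the knowledge base already states."""
    return {pair: n for pair, n in patterns.items() if pair not in known_facts}


if __name__ == "__main__":
    prepared = prepare(RECORDS, TAXONOMY)
    patterns = mine(prepared, min_count=2)
    new_units = interpret(patterns, KNOWN_FACTS)
    print(new_units)  # candidate units to add to the knowledge base
```

In this toy setting the taxonomy shapes what the mining step can see, and the knowledge base decides which extracted units are worth keeping, which mirrors aspects (i) and (ii) above in a very reduced form.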

The KDDK process in the research work of Orpailleur is mainly based on classification, which is a polymorphic process involved in modeling, mining, representing, and reasoning tasks. Finally, the KDDK process is intended to feed knowledge-based systems working in application domains such as agronomy, astronomy, biology, chemistry, and medicine, and also in the context of the Semantic Web for text mining, information retrieval, and ontology engineering [96], [81].