Section: Partnerships and Cooperations

International Initiatives

Participation In International Programs

Facepe Inria Project: CM2ID

Participants: Amedeo Napoli [contact person], Chedy Raïssi.

Combining Numerical and Symbolical Methods for the Classification of Multi-valued and Interval Data (CM2ID)

The CM2ID research project involves the Orpailleur team at Inria NGE, the AxIS team at Inria Rocquencourt (Yves Lechevallier), and the computer science laboratory of the University of Recife (Prof. Francisco de A.T. de Carvalho). The project aims at developing and comparing classification and clustering algorithms for interval and multi-valued data. Two families of algorithms are studied: “clustering algorithms”, which compare objects using a similarity or a distance, and “classification algorithms in Formal Concept Analysis (FCA)”, which rely on attribute sharing between objects. The objective is to combine the strengths of both families so that each can handle more complex and voluminous datasets, pushing the complexity barrier farther in the mining of complex data. Biological data, namely gene expression data, are used to test and evaluate the combined algorithms.
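The contrast between the two families can be illustrated with a minimal Python sketch. It is not the project's algorithms: the choice of the Hausdorff distance for intervals, the function names, and the toy gene data are all assumptions made for illustration. The first part compares objects through a distance on interval-valued variables; the second groups objects through the attributes they share, using the two FCA derivation operators on a binary context.

```python
from typing import Dict, Sequence, Set, Tuple

Interval = Tuple[float, float]

def hausdorff(a: Interval, b: Interval) -> float:
    """Hausdorff distance between two closed intervals [a0, a1] and [b0, b1]."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def interval_distance(x: Sequence[Interval], y: Sequence[Interval]) -> float:
    """Sum of per-variable Hausdorff distances between two interval-valued objects."""
    return sum(hausdorff(a, b) for a, b in zip(x, y))

def common_attributes(context: Dict[str, Set[str]], objects: Set[str]) -> Set[str]:
    """FCA derivation: the attributes shared by every object in `objects`."""
    if not objects:
        # By convention, the empty set of objects shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))

def objects_sharing(context: Dict[str, Set[str]], attributes: Set[str]) -> Set[str]:
    """FCA derivation: the objects that have every attribute in `attributes`."""
    return {o for o, attrs in context.items() if attributes <= attrs}

# Hypothetical interval-valued expression profiles (two variables per gene).
g1 = [(1, 4), (10, 15)]
g2 = [(2, 5), (11, 14)]
print(interval_distance(g1, g2))

# Hypothetical binary context: gene -> set of attributes.
context = {
    "g1": {"high", "stable"},
    "g2": {"high"},
    "g3": {"high", "stable"},
}
intent = common_attributes(context, {"g1", "g3"})
extent = objects_sharing(context, intent)
print(extent, intent)  # (extent, intent) is a formal concept of the context
```

In the distance-based view, two genes are close when their intervals nearly coincide. In the FCA view, two genes belong to the same concept when they share exactly the same closed set of attributes.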

The project involves three teams, one Brazilian team and two French Inria teams, which include specialists in clustering and classification methods. The teams are therefore complementary. In addition, close contacts exist with experts in the data domain, which allows a complete evaluation of the results obtained by the combined algorithms to be designed during the project.

Fapemig Inria Project: IKMSDM

Participants: Amedeo Napoli [contact person], Chedy Raïssi.

This Fapemig – Inria research project, called “Incorporating knowledge models into scalable data mining algorithms”, involves researchers at Universidade Federal de Minas Gerais in Belo Horizonte (a group led by Prof. Wagner Meira) and the Orpailleur team at Inria Nancy Grand Est. In this project we are interested in the mining of large amounts of data, and we target two relevant application scenarios where this issue arises. The first one is text mining, i.e. extracting knowledge from texts and categorizing documents. The second one is graph mining, i.e. determining relationship-based patterns and using these relations to perform classification tasks. In both cases, the computational complexity is high, either because of the high dimensionality of the data or because of the complexity of the patterns to be mined.

One strategy to ease the execution of such data mining tasks is to use existing knowledge to restrict the search space and to assess the quality of the patterns found. This existing knowledge may be formalized in ontologies, but also in other ways whose study is a first research issue in this project. Once knowledge models can be built, we need to determine how to use them, which is a second major research issue. In particular, we want to design and evaluate mechanisms that exploit existing knowledge to improve data mining algorithms.

Finally, the computational complexity of the algorithms remains a major issue, and we intend to address it through parallel algorithms. Data mining algorithms are, in general, challenging to parallelize because they are irregular and intensive in terms of both computing and communication. Accordingly, in a first joint work, we developed a new parallel algorithm to build skycubes based on the Anthill framework developed at UFMG. The paper was presented at a local Brazilian conference, and an extended journal version will appear in a 2012 special issue of the International Journal of Parallel Programming.
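For readers unfamiliar with skycubes, the following is a naive, sequential Python sketch of the underlying notion, not the parallel Anthill-based algorithm mentioned above. A skycube is taken here to be the collection of skylines over every non-empty subset of dimensions. The function names, the toy points, and the convention that smaller values are better are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(p: Point, q: Point, dims: Sequence[int]) -> bool:
    """True if p is no worse than q on every dimension in dims and strictly better on one."""
    return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)

def skyline(points: List[Point], dims: Sequence[int]) -> List[Point]:
    """Points that are not dominated by any other point on the given dimensions."""
    return [p for p in points if not any(dominates(q, p, dims) for q in points)]

def skycube(points: List[Point], n_dims: int) -> Dict[Tuple[int, ...], List[Point]]:
    """Skyline of every non-empty subspace (2**n_dims - 1 subspaces in total)."""
    cube = {}
    for k in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), k):
            cube[dims] = skyline(points, dims)
    return cube

# Hypothetical two-dimensional data, smaller is better on both dimensions.
points = [(1, 4), (2, 2), (3, 3), (4, 1)]
for dims, sky in skycube(points, 2).items():
    print(dims, sky)
```

The number of subspaces grows exponentially with the number of dimensions, and each skyline in this sketch is computed independently of the others, which gives an idea of why the problem is both costly and a natural candidate for parallel processing.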

International collaborations in Mining complex data

Participants: Mehwish Alam, Aleksey Buzmakov, Victor Codocedo, Adrien Coulet, Elias Egho, Ioanna Lykourentzou, Amedeo Napoli [contact person], Chedy Raïssi.

PICS CNRS CADOE

A first collaboration involves “Université du Québec à Montréal” (UQAM) in Montréal with Prof. Petko Valtchev and Laboratoire LIRMM in Montpellier with Prof. Marianne Huchard. This collaboration is supported by a CNRS PICS project (2011-2014), which is called “Concept Analysis driving Ontology Engineering” and abbreviated as “CAdOE”. The research work within this project is aimed at defining and implementing a semi-automatic methodology supporting ontology engineering, based on the joint use of Formal Concept Analysis (FCA) and Relational Concept Analysis (RCA). At the moment, some elements of this methodology exist and were used in text mining [86], [85], but the methodology should be completed and improved, especially regarding its applicability to complex data and its interoperability with knowledge representation modules.

Collaboration with HSE Moscow

A second collaboration involves Sergei Kuznetsov at the Higher School of Economics in Moscow (HSE). Amedeo Napoli visited the HSE laboratory in November 2012 (with the support of HSE), and Sergei Kuznetsov visited Inria NGE in August and in December 2012. These visits were the occasion to prepare a publication (submitted for the next year). The collaboration is ongoing, and substantial research work remains to be done.

AGAUR Project: collaboration with UPC Barcelona

This project mainly involves Amedeo Napoli and Jaume Baixeries, who is an Associate Professor at UPC Barcelona (Universitat Politècnica de Catalunya). Amedeo Napoli made a stay of roughly two months, split between December 2011 and May-June 2012. Both researchers have worked, jointly with Mehdi Kaytoue, on the characterization of functional dependencies in many-valued data with FCA and pattern structures. In this work, functional dependencies are directly taken into account, which shows a different but important capability of pattern structures to deal with complex data [30].
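As a reminder of what a functional dependency is, here is a minimal Python sketch that checks one directly from its definition on a many-valued table: X → Y holds when any two rows that agree on all attributes of X also agree on all attributes of Y. It does not reproduce the FCA and pattern-structure characterization developed in this work; the function name and the toy table are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]

def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Check the functional dependency lhs -> rhs over a list of rows."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight and returns the stored value afterwards.
        if seen.setdefault(key, val) != val:
            return False  # two rows agree on lhs but differ on rhs
    return True

# Hypothetical many-valued table.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 1, "b": "x", "c": 20},
    {"a": 2, "b": "y", "c": 10},
]
print(fd_holds(rows, ["a"], ["b"]))
print(fd_holds(rows, ["a"], ["c"]))
```

Equivalently, X → Y holds when the partition of the rows induced by X refines the partition induced by Y, a view that lends itself to lattice-based treatments.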

PHC Zenon (Cyprus)

A third collaboration (a PHC Zenon project) exists with Florent Domenach, associate professor at the University of Nicosia in Cyprus. This project is entitled “Knowledge Discovery for Complex Data in Formal and Relational Concept Analysis” (KD4CD) and is aimed at studying and combining different types of classification processes in the framework of FCA. These processes can be based on Galois connections but also on the so-called “overhangings”, i.e. a kind of generalization of closure systems. Another topic of interest is consensus theory, where the objective is to find the best classification of a set of objects according to a quality measure (this could be applied to ontologies). This year, there were two visits, one from Cyprus to France in October 2012 and the other from France to Cyprus in December 2012. Publications are currently submitted.