Section: New Results

Information and Social Networks Mining for Supporting Information Retrieval

Clustering of Relational Data and Social Networks Data: Graph Aggregation

Participant : Yves Lechevallier.

The automatic detection of communities in a social network can provide a kind of graph aggregation. The objective of graph aggregation is to produce small, understandable summaries that highlight the communities in the network, which greatly facilitates interpretation.
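To make the idea of aggregation concrete, the following minimal sketch collapses each community into a single super-node and counts the edges inside and between communities. It assumes that a community assignment is already available; the function name `aggregate_graph` and the toy data are hypothetical and are not taken from the program mentioned in this section.

```python
from collections import Counter

def aggregate_graph(edges, community):
    """Summarize an undirected graph by its communities.

    edges: iterable of (u, v) node pairs.
    community: dict mapping each node to a community label (assumed given).
    Returns (sizes, weights): nodes per community, and edge counts per
    unordered pair of communities (a pair (c, c) counts internal edges).
    """
    sizes = Counter(community.values())
    weights = Counter()
    for u, v in edges:
        cu, cv = community[u], community[v]
        # Store each unordered pair of communities under a single key.
        key = (cu, cv) if cu <= cv else (cv, cu)
        weights[key] += 1
    return dict(sizes), dict(weights)

# Toy example: two triangles joined by one edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
toy_community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
sizes, weights = aggregate_graph(toy_edges, toy_community)
```

On this toy graph the summary has two super-nodes of three actors each, three internal edges per community, and a single edge between the two communities.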

Social networks give a global view of the different actors and of the interactions between them, thus facilitating analysis and information retrieval.

In the enterprise context, a considerable amount of information is stored in relational databases. Relational databases can therefore be a rich source from which to extract social networks.
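As an illustration only, one simple way to derive a social network from relational data is to link two actors whenever they share a row in an association table. The sketch below assumes a hypothetical table of (person, group) pairs, for example employees and projects; this extraction rule is an assumption chosen for the example and is not necessarily the one used in the work described here.

```python
from collections import defaultdict
from itertools import combinations

def co_membership_edges(rows):
    """Build weighted edges between people who share at least one group.

    rows: iterable of (person, group) pairs, e.g. the rows of a join table.
    Returns a dict mapping (person_a, person_b), with person_a < person_b,
    to the number of groups the two people have in common.
    """
    members = defaultdict(set)
    for person, group in rows:
        members[group].add(person)
    shared = defaultdict(int)
    for people in members.values():
        for u, v in combinations(sorted(people), 2):
            shared[(u, v)] += 1
    return dict(shared)

# Hypothetical rows of an employee/project association table.
rows = [("alice", "p1"), ("bob", "p1"), ("carol", "p1"),
        ("alice", "p2"), ("bob", "p2")]
network = co_membership_edges(rows)
```

Here `network` links alice and bob with weight 2, since they share two projects, and links each of them to carol with weight 1.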

This year we updated the program developed by Louati Amine in 2011. A book chapter [34] proposes a new aggregation criterion.

This work is done by Louati Amine (AxIS) in collaboration with Marie-Aude Aufaure, head of the Business Intelligence Team, "Ecole Centrale de Paris", MAS Laboratory.

Multi-View Clustering of Relational Data

Participants : Thierry Despeyroux, Francisco de Carvalho, Yves Lechevallier.

In the work reported in [47], carried out in collaboration with Francisco de A.T. de Carvalho, we introduce an improvement of the clustering algorithm described in [78], which partitions objects by simultaneously taking into account their relational descriptions given by multiple dissimilarity matrices. These matrices could have been generated using different sets of variables and dissimilarity functions. In this version, the cluster prototypes depend on the variables of the representation space. The method, which is based on the dynamic clustering algorithm for relational data, is designed to provide a partition and a vector of prototypes for each cluster, as well as to learn a relevance weight for each dissimilarity matrix by optimizing an adequacy criterion that measures the fit between clusters and their representatives. These relevance weights change at each iteration of the algorithm and differ from one cluster to another. Various tools for interpreting the partition and the clusters produced by this new algorithm are also presented.
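The alternating scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from [47] or [78]: prototypes are taken to be objects of the data set (one per cluster and per dissimilarity matrix), the relevance weights of each cluster are constrained to have a product equal to one, and the initial prototypes are supplied by the caller. The function name `multiview_relational_clustering` and its parameters are hypothetical.

```python
import numpy as np

def multiview_relational_clustering(D, init_prototypes, max_iter=50):
    """Hard clustering from p dissimilarity matrices (illustrative sketch).

    D: array of shape (p, n, n), one dissimilarity matrix per view.
    init_prototypes: list of K object indices used as initial prototypes.
    Returns (labels, G, W) where G[k, j] is the prototype of cluster k in
    view j and W[k, j] is the relevance weight of view j for cluster k.
    """
    D = np.asarray(D, dtype=float)
    p, n, _ = D.shape
    K = len(init_prototypes)
    G = np.tile(np.asarray(init_prototypes)[:, None], (1, p))
    W = np.ones((K, p))
    labels = np.full(n, -1)
    for _ in range(max_iter):
        # Allocation step: weighted dissimilarity to each vector of prototypes.
        cost = np.zeros((n, K))
        for k in range(K):
            for j in range(p):
                cost[:, k] += W[k, j] * D[j, :, G[k, j]]
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
        for k in range(K):
            members = np.flatnonzero(labels == k)
            if members.size == 0:
                continue  # keep previous prototypes and weights
            # Representation step: best prototype per view (medoid-like).
            for j in range(p):
                G[k, j] = D[j][members].sum(axis=0).argmin()
            # Weighting step (assumed constraint: product of weights = 1).
            S = np.array([D[j, members, G[k, j]].sum() for j in range(p)])
            if np.all(S > 0):
                W[k] = np.exp(np.log(S).mean()) / S
    return labels, G, W

# Toy data: two views of six objects forming two well-separated groups.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
y = np.array([0.0, 0.5, 1.0, 5.0, 5.5, 6.0])
D = np.stack([np.abs(x[:, None] - x[None, :]),
              np.abs(y[:, None] - y[None, :])])
labels, prototypes, weights = multiview_relational_clustering(D, [0, 3])
```

Under the assumed product constraint, a view in which a cluster is more compact around its prototype receives a larger weight for that cluster, which is one way weights can differ from one cluster to another.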

Two experiments demonstrate the usefulness of this clustering method and the merit of the partition and cluster interpretation tools. The first uses a data set from the UCI Machine Learning Repository concerning handwritten digits (digitized pictures). The second uses a set of reports for which an expert classification is available a priori. This work was published this year as a chapter in "Advances in Knowledge Discovery and Management" [32].