Section: New Results
Model for conditionally correlated categorical data
Participants : Christophe Biernacki, Matthieu Marbac-Lourdelle, Vincent Vandewalle.
CMM (Conditional Modes Model) is a model-based clustering approach in which categorical variables are grouped into conditionally independent blocks. Each block follows a parsimonious multinomial distribution whose few free parameters correspond to the most likely modality crossings, while the remaining probability mass is spread uniformly over the other modality crossings. Because the integrated complete-data likelihood can be computed exactly, model selection can be performed with a Gibbs sampler, which reduces the computing time otherwise spent on parameter estimation and avoids the biases of the BIC criterion observed in our experiments. An article has been submitted to an international journal [49]. Furthermore, an R package (CoModes) is available on Rforge (see 5.4).
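
As an illustration only, the block distribution described above can be sketched in Python: a few modality crossings (the modes) carry their own probabilities, and the leftover mass is shared equally among all other crossings. The function name `block_distribution`, its arguments, and the numbers in the usage example are hypothetical; this is not the interface of the CoModes package, and it covers a single block within a single class, not the full mixture or the Gibbs-based model selection.

```python
from itertools import product


def block_distribution(levels, modes):
    """Sketch of a parsimonious multinomial distribution for one block.

    levels: number of modalities of each variable in the block, e.g. [2, 3].
    modes:  dict mapping a modality crossing (tuple of 0-based level indices)
            to its probability; these are the only free parameters.
    Returns a dict giving the probability of every modality crossing.
    All names here are illustrative assumptions.
    """
    # Enumerate every modality crossing of the block.
    crossings = list(product(*(range(m) for m in levels)))
    valid = set(crossings)

    for crossing in modes:
        if crossing not in valid:
            raise ValueError(f"unknown modality crossing: {crossing}")

    n_other = len(crossings) - len(modes)
    if n_other <= 0:
        raise ValueError("this sketch assumes at least one non-mode crossing")

    mode_mass = sum(modes.values())
    if any(p < 0 for p in modes.values()) or mode_mass > 1:
        raise ValueError("mode probabilities must be non-negative and sum to at most 1")

    # Remaining mass is spread uniformly over the non-mode crossings.
    rest = (1.0 - mode_mass) / n_other

    # Modes are meant to be the most likely crossings.
    if modes and min(modes.values()) < rest:
        raise ValueError("a mode is less likely than a non-mode crossing")

    return {c: modes.get(c, rest) for c in crossings}


# Usage: a block of two variables with 2 and 3 modalities and two modes.
dist = block_distribution([2, 3], {(0, 0): 0.5, (1, 2): 0.3})
for crossing, prob in dist.items():
    print(crossing, round(prob, 4))
```

In this toy setting the block needs only two free parameters, whereas an unconstrained multinomial over the six crossings would need five; this is the kind of parsimony the model relies on.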