Section: New Results
Statistical learning methodology and theory
Participants : Vincent Brault, Gilles Celeux, Christine Keribin, Erwan Le Pennec, Lucie Montuelle, Michel Prenat, Solenne Thivin.
Vincent Brault, Ph.D. student of Gilles Celeux and Christine Keribin, defended his thesis on the Latent Block Model (LBM) for categorical data. Their work investigated a Gibbs algorithm to avoid solutions with empty clusters on synthetic as well as real data (Congressional Voting Records and genomic data). They detailed the link between the information criteria ICL and BIC, compared them on synthetic and real data, and conjectured that these criteria are both consistent for LBM, which is not a standard behavior. Hence, ICL has to be preferred for LBM. This work is now published in Statistics and Computing.
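To see why joint consistency of the two criteria is not the standard behavior, it may help to recall their usual relationship in ordinary finite mixture models. The block below is a sketch in that simpler setting only, not the LBM derivation of the thesis. The symbols are assumptions introduced here for illustration: $K$ is the number of components, $\nu_K$ the number of free parameters, $n$ the sample size, and $\hat t_{ik}$ the estimated posterior probability that observation $i$ belongs to component $k$.

```latex
\begin{align*}
\mathrm{BIC}(K) &= \log p(x \mid \hat\theta_K) - \frac{\nu_K}{2}\,\log n, \\
\mathrm{ICL}(K) &\approx \mathrm{BIC}(K) - \mathrm{ENT}(K), \\
\mathrm{ENT}(K) &= -\sum_{i=1}^{n}\sum_{k=1}^{K} \hat t_{ik}\,\log \hat t_{ik} \;\ge\; 0 .
\end{align*}
```

In a standard mixture the entropy term does not vanish when components overlap, so ICL can select fewer components than BIC and is generally not consistent for the number of components. The conjecture reported above is that this gap does not prevent consistency in the LBM case.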
Vincent Brault has completed a detailed bibliographical review on coclustering with Aurore Lomet (UTC), which is currently under revision. He has also worked in collaboration with Mahindra Mariadassou (INRA) on an overview of the state of the art on theoretical results for latent or stochastic block models.
Vincent Brault, Christine Keribin and Mahindra Mariadassou have started a collaboration to tackle the consistency and asymptotic normality of the maximum likelihood and variational estimators in a stochastic or latent block model.
Gilles Celeux has started a collaboration with Jean-Patrick Baudry on strategies to avoid the traps of the EM algorithm in mixture analysis. They analyse the effect of spurious local maximizers and study regularized algorithms that avoid these spurious solutions. They explore the link between the degree of regularization and the slope heuristics. Moreover, they propose and study strategies to initialize the EM algorithm embedding the solution with components and the starting position with component to avoid suboptimal solutions.
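As an illustration of how regularization can keep EM away from spurious maximizers, the following is a minimal sketch for a one-dimensional Gaussian mixture. It is not the algorithm studied in the collaboration: the function name `regularized_em`, the hyperparameters `a` and `b`, and the choice of an inverse-gamma-style MAP update for the variances are all assumptions made for this example. The update adds `2 * b` to the weighted sum of squares, so no component variance can collapse to zero around a single point or a group of identical points.

```python
import math
import random


def regularized_em(x, k, n_iter=100, a=1.0, b=0.1, seed=0):
    """EM for a 1-D Gaussian mixture with a MAP-style variance update.

    a, b are hypothetical hyperparameters of an inverse-gamma prior on each
    variance; larger b means stronger regularization.
    """
    rng = random.Random(seed)
    n = len(x)
    # Start from k data points as means and the global variance everywhere.
    means = rng.sample(x, k)
    mean_x = sum(x) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    variances = [var_x] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E step: posterior membership probabilities, computed in log space.
        resp = []
        for xi in x:
            logp = [
                math.log(weights[j])
                - 0.5 * math.log(2 * math.pi * variances[j])
                - 0.5 * (xi - means[j]) ** 2 / variances[j]
                for j in range(k)
            ]
            top = max(logp)
            p = [math.exp(lp - top) for lp in logp]
            total = sum(p)
            resp.append([pj / total for pj in p])

        # M step: usual updates for weights and means, penalized variances.
        for j in range(k):
            # Tiny floor guards against an empty component (numerical safety).
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            sj = sum(r[j] * (xi - means[j]) ** 2 for r, xi in zip(resp, x))
            # Posterior mode under the assumed inverse-gamma(a, b) prior.
            variances[j] = (sj + 2 * b) / (nj + 2 * a + 2)
            weights[j] = nj / n

    return weights, means, variances
```

With this update every variance stays above `2 * b / (n + 2 * a + 2)`, so the likelihood remains bounded even when the sample contains repeated values. How `b` should be chosen is exactly the kind of calibration question that the link with the slope heuristics addresses; the default used here is arbitrary.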
Erwan Le Pennec is supervising Solenne Thivin in her CIFRE with Michel Prenat and Thales Optronique. The aim is target detection against complex backgrounds such as clouds or sea. Their approach is a local approach based on test decision theory. They have obtained theoretical and numerical results on a segmentation-based approach in which a simple Markov field testing procedure is used in each cell of a data-driven partition. They have also obtained experimental results on unsupervised classification of images (or patches), with the aim of better calibrating the detection procedure. The classification is based on features defined in the cloud texture modeling activity.
Erwan Le Pennec and Michel Prenat have also collaborated on cloud texture modeling using a non-parametric approach. Such modeling could be used to better calibrate the detection procedure: it can provide more examples than those acquired, and it could be the basis of an ensemble method.