Section: New Results
Mixture models
Taking into account the curse of dimensionality
Participant : Stéphane Girard.
Joint work with: Bouveyron, C. (Université Paris 1), Fauvel, M. (ENSAT Toulouse)
In the PhD work of Charles Bouveyron (co-advised by Cordelia Schmid from the Inria LEAR team) [53], we propose new Gaussian models of high dimensional data for classification purposes. We assume that the data live in several groups located in subspaces of lower dimensions. Two different strategies arise:
the introduction of a dimension reduction constraint for each group in the model;
the use of parsimonious models obtained by requiring different groups to share the same values of some parameters.
This modelling yields a new supervised classification method called High Dimensional Discriminant Analysis (HDDA) [4]. Some versions of this method have been tested on the supervised classification of objects in images. This approach has been adapted to the unsupervised classification framework, and the related method is named High Dimensional Data Clustering (HDDC) [3]. The description of the associated R package is published in [11]. Our recent work consists of introducing a kernel into these methods to handle nonlinear data classification [27], [45].
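To make the first strategy concrete, the sketch below implements the core idea of class-specific subspace Gaussian modelling in Python: each class is fitted with a low-rank-plus-noise covariance (a signal subspace of intrinsic dimension d plus isotropic residual noise), and points are assigned by maximum posterior. This is an illustrative re-implementation under the simplifying assumption of a common intrinsic dimension d for all classes, not the authors' code; the reference implementation is the R package described in [11].

import numpy as np

class SubspaceGaussianClassifier:
    """Each class: Gaussian with covariance Q diag(a) Q' + b (I - Q Q')."""

    def __init__(self, d=2):
        self.d = d  # intrinsic dimension, assumed common to all classes

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = {}
        for k in self.classes_:
            Xk = X[y == k]
            mu = Xk.mean(axis=0)
            evals, evecs = np.linalg.eigh(np.cov(Xk, rowvar=False))
            order = np.argsort(evals)[::-1]          # largest eigenvalues first
            evals, evecs = evals[order], evecs[:, order]
            a = evals[:self.d]                       # variances inside the class subspace
            b = evals[self.d:].mean()                # isotropic noise variance outside it
            self.params_[k] = (mu, evecs[:, :self.d], a, b, len(Xk) / len(X))
        return self

    def predict(self, X):
        scores = []
        for k in self.classes_:
            mu, Q, a, b, prior = self.params_[k]
            Xc = X - mu
            proj = Xc @ Q                            # coordinates in the class subspace
            # Quadratic form of the low-rank-plus-noise Gaussian.
            quad = (proj**2 / a).sum(axis=1) \
                 + ((Xc**2).sum(axis=1) - (proj**2).sum(axis=1)) / b
            logdet = np.log(a).sum() + (X.shape[1] - self.d) * np.log(b)
            scores.append(-0.5 * (quad + logdet) + np.log(prior))
        return self.classes_[np.argmax(np.array(scores), axis=0)]

On data whose classes genuinely concentrate near low dimensional subspaces, this behaves like a quadratic discriminant analysis with far fewer covariance parameters per class, which is what keeps it usable when the dimension is large relative to the sample size.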
Robust mixture modelling using skewed multivariate distributions with variable amounts of tailweight
Participants : Florence Forbes, Darren Wraith.
Clustering concerns the assignment of each of N possibly multidimensional observations y_1, ..., y_N to one of K groups. A standard way to make such an assignment robust to outliers and heavy tails is to model each group with a multivariate t-distribution instead of a Gaussian. A useful representation of the multivariate t-distribution is as a Gaussian scale mixture,

p(y; \mu, \Sigma, \theta) = \int_0^\infty \mathcal{N}_M(y; \mu, \Sigma/w) \, f_W(w; \theta) \, dw,

where \mathcal{N}_M(\cdot; \mu, \Sigma/w) denotes the M-dimensional Gaussian density with mean \mu and covariance \Sigma/w, and f_W is the distribution of a univariate positive weight variable W. Taking f_W to be a Gamma distribution \mathcal{G}(\nu/2, \nu/2) yields the multivariate t-distribution with \nu degrees of freedom.
For many applications, the distribution of the data may also be highly asymmetric in addition to being heavy tailed (or affected by outliers). A natural extension of the Gaussian scale mixture case is to consider location and scale Gaussian mixtures of the form

p(y; \mu, \Sigma, \beta, \theta) = \int_0^\infty \mathcal{N}_M(y; \mu + w\beta, w\Sigma) \, f_W(w; \theta) \, dw,

where \beta is an additional M-dimensional parameter controlling skewness. Taking f_W to be an inverse Gaussian distribution yields the multivariate normal inverse Gaussian (NIG) distribution.
Although these approaches provide great flexibility for modelling data of highly asymmetric and heavy-tailed form, they assume a single univariate weight variable W common to all dimensions, so that the amount of tailweight is the same in every direction of the variable space.
In this work, we show that the location and scale mixture
representation can be further explored and propose a framework
that is considerably simpler than those previously proposed with
distributions exhibiting interesting properties. Using the normal
inverse Gaussian distribution (NIG) as an example, we extend the
standard location and scale mixture of Gaussian
representation to allow for the tail behaviour to be set or
estimated differently in each dimension of the variable space. The
key elements of the approach are the introduction of
multidimensional weights and a decomposition of the scale matrix \Sigma into its eigenvalues and eigenvectors, which together allow the amount of tailweight to differ from one dimension to another.
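To illustrate the construction, the following Python sketch samples from a Gaussian scale mixture in which each eigen-direction of the scale matrix has its own inverse Gaussian weight, so that tails are heavier in one direction than the other. All parameter values, including the eigenvector matrix D, are arbitrary choices made for the example, not values from this work.

import numpy as np
from scipy.stats import invgauss

rng = np.random.default_rng(0)
M, N = 2, 5000                      # dimension and sample size (arbitrary)
mu = np.zeros(M)
D = np.linalg.qr(rng.standard_normal((M, M)))[0]   # eigenvectors of the scale matrix
A = np.array([3.0, 0.5])                           # its eigenvalues

# One positive weight per dimension: a smaller mean weight produces a
# heavier tail in the corresponding eigen-direction.
w = np.column_stack([invgauss.rvs(m, size=N, random_state=rng)
                     for m in (0.5, 5.0)])

# Y = mu + D diag(sqrt(A / w)) Z: the covariance of Y given w is
# D diag(A / w) D', i.e. the Sigma/w construction with one weight per dimension.
Z = rng.standard_normal((N, M))
Y = mu + (Z * np.sqrt(A / w)) @ D.T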
Robust clustering for high dimensional data
Participants : Florence Forbes, Darren Wraith, Minwoo Lee.
For a clustering problem, a parametric mixture model is one of the most popular approaches. In particular, Gaussian mixture models are widely used in various fields of study such as data mining, pattern recognition, machine learning, and statistical analysis. The modelling and computational flexibility of the Gaussian mixture model makes it possible to model a rich class of densities and provides a simple mathematical form for the clusters.
Despite the success of Gaussian mixtures, parameter estimation can be severely affected by outliers. Robustness can be improved by replacing the Gaussian components with multivariate t-distributions, whose additional degrees-of-freedom (dof) parameter acts as a robustness tuning parameter. Although adopting the multivariate t-distribution protects the estimation against outliers, it does not by itself address the difficulties raised by high dimensional data. Along with robustness from the t-distribution, high dimensional settings call for the kind of subspace and parsimony assumptions that underlie the HDDC method described above.
This work proposes an approach that combines robust clustering with the HDDC framework. The use of a mixture of multivariate t-distributions under HDDC-type constraints retains the parsimonious subspace modelling of HDDC while reducing the sensitivity of the clustering to outliers.
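The robustness mechanism at work can be sketched as follows: in the EM algorithm for a t-distribution, each observation receives a weight that decreases with its Mahalanobis distance, so outliers are down-weighted in the parameter updates. The Python sketch below illustrates this for a single t component with fixed dof; the full EM of this work, including dof estimation and the HDDC subspace constraints, is omitted.

import numpy as np

def t_weights(X, mu, Sigma, nu):
    """E-step weights of a multivariate t: small for distant points."""
    d = X.shape[1]
    diff = X - mu
    maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
    return (nu + d) / (nu + maha)

def robust_fit(X, nu=3.0, n_iter=50):
    """Iteratively reweighted mean/covariance for one t component."""
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    for _ in range(n_iter):
        u = t_weights(X, mu, Sigma, nu)
        mu = (u[:, None] * X).sum(axis=0) / u.sum()
        diff = X - mu
        Sigma = (u[:, None] * diff).T @ diff / len(X)
    return mu, Sigma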
Partially Supervised Mapping: A Unified Model for Regression and Dimensionality Reduction
Participant : Florence Forbes.
Joint work with: Antoine Deleforge and Radu Horaud from the Inria Perception team.
We cast dimensionality reduction and regression in a unified latent variable model. We propose a two-step strategy consisting of characterizing a non-linear reversed output-to-input regression with a generative piecewise-linear model, followed by Bayes inversion to obtain an output density given an input. We describe and analyze the most general case of this model, namely when only some components of the output variables are observed while the other components are latent. We provide two EM inference procedures and their initialization. Using simulated and real data, we show that the proposed method outperforms several existing ones.
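The inverse-then-forward strategy can be illustrated in a simplified setting: fit a joint Gaussian mixture on (output, input) pairs, which makes the relation locally linear within each component, then apply Bayes inversion (Gaussian conditioning) to obtain the output given an input. The Python sketch below is such a simplification; it omits the partially latent output components handled by the actual model, and all names are illustrative.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_joint(T, Y, K=5, seed=0):
    """Fit a K-component GMM on the joint (output T, input Y) space."""
    return GaussianMixture(K, covariance_type='full', random_state=seed).fit(
        np.hstack([T, Y]))

def bayes_inversion_predict(gmm, Y, d_t):
    """Predict T from Y via posterior-weighted, per-component affine maps."""
    means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
    preds, resp = [], []
    for k in range(len(w)):
        m_t, m_y = means[k, :d_t], means[k, d_t:]
        C_ty = covs[k][:d_t, d_t:]
        C_yy = covs[k][d_t:, d_t:]
        # Gaussian conditioning: E[t | y] is affine in y within component k.
        gain = C_ty @ np.linalg.inv(C_yy)
        preds.append(m_t + (Y - m_y) @ gain.T)
        # Responsibility of component k given y alone (marginal on Y).
        diff = Y - m_y
        _, logdet = np.linalg.slogdet(C_yy)
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(C_yy), diff)
        resp.append(np.log(w[k]) - 0.5 * (logdet + maha))
    resp = np.exp(np.array(resp) - np.max(resp, axis=0))
    resp /= resp.sum(axis=0)
    return sum(r[:, None] * p for r, p in zip(resp, preds))

Each component contributes an affine prediction, and the responsibilities computed on the input alone select which local linear map applies; the model described above additionally lets some output components remain unobserved.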
Variational EM for Binaural Sound-Source Separation and Localization
Participant : Florence Forbes.
Joint work with: Antoine Deleforge and Radu Horaud from the Inria Perception team.
We address the problem of sound-source separation and localization in real-world conditions with two microphones. Both tasks are solved within a unified formulation using supervised mapping. While the parameters of the direct mapping are learned during a training stage that uses sources emitting white noise (calibration), the inverse mapping is estimated using a variational EM formulation. The proposed algorithm can deal with natural sound sources such as speech, which are known to yield sparse spectrograms, and is able to locate multiple sources in both azimuth and elevation. Extensive experiments with real data show that the method outperforms the state of the art in both separation and localization.
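As a simplified illustration of the inversion idea (not the paper's variational EM), a single source can be localized by matching observed interaural cues against the cues recorded during the white-noise calibration stage over a grid of candidate directions. The names, array shapes, and Gaussian noise assumption below are all illustrative.

import numpy as np

def localize(observed_cues, calib_dirs, calib_cues, noise_var=1.0):
    """Posterior over candidate directions given observed interaural cues.

    observed_cues: (F,) cues at F frequencies (NaN where the sparse
    spectrogram carries no energy for this source).
    calib_dirs: (D, 2) azimuth/elevation grid from calibration.
    calib_cues: (D, F) cues measured for each direction with white noise.
    """
    valid = ~np.isnan(observed_cues)          # use only active frequency bins
    diff = calib_cues[:, valid] - observed_cues[valid]
    # Gaussian observation model: log-likelihood of each candidate direction.
    loglik = -0.5 * (diff ** 2).sum(axis=1) / noise_var
    post = np.exp(loglik - loglik.max())
    post /= post.sum()
    return calib_dirs[np.argmax(post)], post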