## Section: Research Program

### Mixture models

Participants: Alexis Arnaud, Jean-Baptiste Durand, Florence Forbes, Stéphane Girard, Julyan Arbel, Jean-Michel Bécu, Hongliang Lu, Fabien Boux, Veronica Munoz Ramirez, Benoit Kugler, Alexandre Constantin, Fei Zheng.

**Key-words:**
mixture of distributions, EM algorithm, missing data, conditional independence,
statistical pattern recognition, clustering,
unsupervised and partially supervised learning.

As a first approach, we consider parametric statistical models, where the parameter $\theta$, possibly multi-dimensional, is usually unknown and must be estimated. We consider cases where the data naturally divide into observed data $y=\{y_1,\ldots,y_n\}$ and unobserved or missing data $z=\{z_1,\ldots,z_n\}$. The missing data $z_i$ represent, for instance, the membership of observation $y_i$ in one of $K$ alternative categories. The distribution of an observed $y_i$ can then be written as a finite mixture of distributions,

$$f(y_i;\theta)=\sum_{k=1}^{K} P(z_i=k;\theta)\, f(y_i\mid z_i=k;\theta)\,. \tag{1}$$
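To make equation (1) concrete, the following minimal sketch evaluates a finite mixture density for one observation. It assumes univariate Gaussian components, which the text above does not require, and the parameter values and the function names `gaussian_pdf` and `mixture_pdf` are hypothetical, chosen only for illustration.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at y."""
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, weights, means, stds):
    """Finite mixture density: sum over k of P(z = k) * f(y | z = k)."""
    return sum(w * gaussian_pdf(y, m, s) for w, m, s in zip(weights, means, stds))


# Hypothetical parameter theta = (weights, means, stds) with K = 2 categories.
weights = [0.3, 0.7]   # P(z_i = k; theta), must sum to 1
means = [-2.0, 1.0]    # component means (illustrative values)
stds = [1.0, 0.5]      # component standard deviations (illustrative values)

density_at_zero = mixture_pdf(0.0, weights, means, stds)
print(density_at_zero)
```

Because the weights sum to one and each component is a proper density, the mixture is itself a proper density.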

These models are of interest because they may reveal hidden
variables that account for most of the observed variability and
given which the observed variables are *conditionally* independent.
Their estimation is often difficult because of the missing data. The
Expectation-Maximization (EM) algorithm is a general and now
standard approach to maximizing the likelihood in missing
data problems. It provides parameter estimates as well as values
for the missing data.
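As an illustration of how EM yields both parameter estimates and values for the missing data, here is a minimal sketch for a univariate Gaussian mixture. The Gaussian components, the simulated data, the starting values, and the function name `em_gaussian_mixture` are all assumptions made for this example and do not describe the team's own implementations.

```python
import math
import random


def gaussian_pdf(y, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at y."""
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def em_gaussian_mixture(y, weights, means, stds, n_iter=100):
    """Run n_iter (>= 1) EM iterations for a univariate Gaussian mixture."""
    n, K = len(y), len(weights)
    weights, means, stds = list(weights), list(means), list(stds)
    for _ in range(n_iter):
        # E-step: posterior probabilities P(z_i = k | y_i; theta).
        resp = []
        for yi in y:
            joint = [weights[k] * gaussian_pdf(yi, means[k], stds[k]) for k in range(K)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: weighted maximum-likelihood updates of theta.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * yi for r, yi in zip(resp, y)) / nk
            var = sum(r[k] * (yi - means[k]) ** 2 for r, yi in zip(resp, y)) / nk
            stds[k] = math.sqrt(max(var, 1e-12))  # guard against a collapsing component
    # resp holds the posteriors computed in the last E-step.
    return weights, means, stds, resp


# Simulated data (hypothetical): 300 draws from N(-2, 1) and 700 from N(3, 0.5^2).
rng = random.Random(0)
y = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 0.5) for _ in range(700)]

# Arbitrary starting values for theta.
w_hat, m_hat, s_hat, resp = em_gaussian_mixture(y, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])

# Values for the missing data: most probable category for each observation.
labels = [max(range(len(r)), key=lambda k: r[k]) for r in resp]
print(w_hat, m_hat, s_hat)
```

The posterior probabilities computed in the E-step are what supply values for the missing $z_i$. Taking the most probable category, as done for `labels`, is one common choice among several.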

Mixture models correspond to independent ${z}_{i}$'s. They have been increasingly used in statistical pattern recognition. They enable a formal (model-based) approach to (unsupervised) clustering.