Section: New Results

Non-parametric state-space model for missing-data imputation

Participants: Thi Tuyet Trang Chau, François Le Gland, Valérie Monbet, Mathias Rousset.

This is a collaboration with Pierre Ailliot (université de Bretagne Occidentale, Brest), Ronan Fablet and Pierre Tandéo (Télécom Bretagne, Brest), Anne Cuzol (université de Bretagne Sud, Vannes) and Bernard Chapron (IFREMER, Brest).

Missing data are present in many environmental data sets, and this work aims at developing a general method for imputing them. State-space models (SSMs) have already been used extensively in this framework. The basic idea is to introduce the true environmental process, which we aim to reconstruct, as a latent process, and to model the data available at neighboring sites in space and/or time conditionally on this latent process. A key input of SSMs is a stochastic model that describes the temporal evolution of the environmental process of interest. In many applications, the dynamics are complex and can hardly be described by a tractable parametric model. Here we investigate a data-driven method in which the dynamical model is learned using a non-parametric approach and historical observations of the environmental process of interest. From a statistical point of view, we will address various aspects of SSMs in a non-parametric framework. First, we will discuss the estimation of the filtering and smoothing distributions, that is, the distributions of the latent state given the observations, using sequential Monte Carlo approaches in conjunction with local linear regression. Then, a more difficult and original question is how to build a non-parametric estimate of the dynamics that takes into account the measurement errors present in the historical data. We will propose an EM-like algorithm in which the historical data are corrected recursively. The methodology will be illustrated and validated on a univariate toy example.
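
To make the filtering step concrete, here is a minimal Python sketch of a bootstrap particle filter in which the dynamics are estimated by local linear regression on historical pairs of successive states. It is an illustration under stated assumptions, not the implementation developed in this work: the Gaussian kernel, the additive Gaussian noises with known standard deviations (sigma_dyn, sigma_obs), the identity observation operator, the encoding of missing observations as NaN, and the toy dynamics 2 sin(x) are all hypothetical choices. The historical data are also assumed to be error-free here, so the EM-like correction of the historical data is not covered.

```python
import numpy as np


def local_linear_predict(x_query, x_hist, x_next_hist, bandwidth):
    """Local linear estimate of E[X_{t+1} | X_t = x] at each query point."""
    preds = np.empty_like(x_query, dtype=float)
    for i, x in enumerate(x_query):
        d = x_hist - x
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weights (assumption)
        sw = np.sqrt(w)
        # Weighted least squares on the design [1, x_hist - x].
        design = np.column_stack([np.ones_like(d), d]) * sw[:, None]
        coef, *_ = np.linalg.lstsq(design, x_next_hist * sw, rcond=None)
        preds[i] = coef[0]  # the intercept is the fitted value at x
    return preds


def particle_filter(y_obs, x_hist, x_next_hist, n_particles=200,
                    bandwidth=0.3, sigma_dyn=0.3, sigma_obs=0.5, seed=0):
    """Bootstrap particle filter; NaN entries of y_obs are treated as missing."""
    rng = np.random.default_rng(seed)
    # Initialise particles by drawing from the historical states.
    particles = rng.choice(x_hist, size=n_particles)
    filtered_mean = np.empty(len(y_obs))
    for t, y in enumerate(y_obs):
        # Propagate with the non-parametric dynamics plus Gaussian noise.
        drift = local_linear_predict(particles, x_hist, x_next_hist, bandwidth)
        particles = drift + sigma_dyn * rng.standard_normal(n_particles)
        if np.isnan(y):
            # Missing observation: no correction, uniform weights.
            weights = np.full(n_particles, 1.0 / n_particles)
        else:
            # Gaussian likelihood of y given each particle (identity observation).
            logw = -0.5 * ((y - particles) / sigma_obs) ** 2
            weights = np.exp(logw - logw.max())
            weights /= weights.sum()
        filtered_mean[t] = np.sum(weights * particles)
        # Multinomial resampling at every time step.
        particles = rng.choice(particles, size=n_particles, p=weights)
    return filtered_mean


def simulate(n, rng, sigma_dyn=0.3):
    """Hypothetical univariate toy dynamics: x_{t+1} = 2 sin(x_t) + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = 2.0 * np.sin(x[t - 1]) + sigma_dyn * rng.standard_normal()
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    hist = simulate(400, rng)                  # historical trajectory
    truth = simulate(60, rng)                  # latent process to reconstruct
    y = truth + 0.5 * rng.standard_normal(60)  # noisy observations
    y[20:30] = np.nan                          # a gap to impute
    est = particle_filter(y, hist[:-1], hist[1:])
    print("RMSE over the gap:", np.sqrt(np.mean((est[20:30] - truth[20:30]) ** 2)))
```

Over the gap, the filter relies only on the learned dynamics, which is where the quality of the non-parametric estimate matters most. A smoother would additionally use the observations that follow the gap.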