Overall Objectives

We are a research team in machine learning, with an emphasis on statistical methods. The need to process huge amounts of complex data has created a demand for statistical methods that remain valid under very weak hypotheses and in very high-dimensional spaces. Our aim is to contribute to a robust, adaptive, computationally efficient and, ideally, non-asymptotic theory of statistics that benefits learning.

Our theoretical studies bear on the following mathematical tools:

  • regression models used for supervised learning, studied from different perspectives: the PAC-Bayesian approach to generalization bounds, robust estimators, and model selection and aggregation;

  • sparse prediction models and ℓ1-regularization;

  • interactions between unsupervised learning, information theory and adaptive data representation;

  • individual sequence theory;

  • multi-armed bandit problems (with arms possibly indexed by a continuous set).

We are involved in the following applications:

  • the improvement of prediction through the on-line aggregation of predictors, with an emphasis on forecasting air quality, electricity consumption, and production data of oil reservoirs;

  • natural image analysis, and more precisely the use of unsupervised learning in data representation;

  • computational linguistics;

  • statistical inference on biological and neurobiological data.