Section: New Results
Supervised learning
Participants: Alexandra Carpentier, Emmanuel Duflos, Hachem Kadri, Manuel Loth, Odalric-Ambrym Maillard, Rémi Munos, Philippe Preux, Christophe Salperwyck.
Regression and classification

Sparse Recovery with Brownian Sensing [19].
We consider the problem of recovering the parameter $\alpha$ of a sparse function $f$ (i.e., the number of nonzero entries of $\alpha$ is small compared to the number $K$ of features) from noisy evaluations of $f$ at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix for which good recovery properties are proven, independently of the number of sampling points $N$, even when the features are arbitrarily non-orthogonal. Under the assumption that $f$ is Hölder continuous with exponent at least $1/2$, we provide an estimate of the parameter with quadratic error $O\left(\left\Vert \eta \right\Vert^{2}/N\right)$, where $\eta$ is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating the method.
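The construction can be sketched numerically: integrate the features and the noisy observations against independent Brownian motions to build the Gaussian sensing matrix, then recover $\alpha$ by $\ell_1$ minimization. The sketch below is illustrative only; the cosine features, the sampling curve (the unit interval), and the ISTA solver are assumptions for the example, not the setup of [19].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative choices, not those of the paper):
K, N, M = 20, 200, 40          # features, sampling points, Brownian motions
t = np.linspace(0.0, 1.0, N)   # sampling points on a one-dimensional curve
Phi = np.array([np.cos(np.pi * k * t) for k in range(K)]).T  # N x K features

alpha = np.zeros(K)            # sparse parameter: 3 nonzero entries
alpha[[2, 7, 13]] = [1.5, -2.0, 0.8]
y = Phi @ alpha + 0.01 * rng.standard_normal(N)  # noisy evaluations of f

# Brownian sensing: integrate features and observations against M
# independent Brownian motions (stochastic integrals approximated by
# Riemann sums over Brownian increments).
dB = rng.standard_normal((M, N)) * np.sqrt(1.0 / N)  # Brownian increments
A = dB @ Phi                   # M x K Gaussian sensing matrix
b = dB @ y                     # transformed observations

def ista(A, b, lam=1e-3, steps=5000):
    """l1 recovery by iterative soft-thresholding (a generic stand-in
    for the l1 minimization step of sparse recovery)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        z = x - A.T @ (A @ x - b) / L      # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage
    return x

alpha_hat = ista(A, b)
```

On this toy instance the support of `alpha_hat` matches the three nonzero entries of `alpha`; the point is only to show the two-stage structure (sensing by stochastic integration, then convex recovery).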

Operator-valued Kernels for Nonlinear FDA [31], [38], [30], [53]. Following the extension of RKHS to the functional setting [74], we further developed this work in [38] for functional supervised classification.
We introduced a set of rigorously defined operator-valued kernels that can be valuably applied to nonparametric operator learning when input and output data are continuous smooth functions, and we showed their use for minimizing an ${L}^{2}$-regularized functional in the case of functional outputs, without the need to discretize covariate and target functions [53].
The framework developed can also be applied when the input data are both discrete and continuous [30] .
Our fully functional approach has been successfully applied to the problems of speech inversion [31] and sound recognition [38], showing that the proposed framework is particularly relevant for audio signal processing applications, where attributes are genuinely functions and depend on each other.
This work is done in collaboration with Francis Bach (INRIA, Sierra), Alain Rakotomamonjy and Stéphane Canu (LITIS, Rouen).
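To give a concrete, finite-dimensional flavor of operator-valued kernel regression, the sketch below uses a separable kernel $K(x,x') = k(x,x')\,B$ with vector-valued (rather than function-valued) outputs; the toy data, the Gaussian scalar kernel, and the identity output operator $B$ are assumptions made for illustration, not the functional setting of [53].

```python
import numpy as np

rng = np.random.default_rng(2)

def scalar_k(X1, X2, gamma=1.0):
    """Gaussian scalar kernel between two sets of inputs."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

n, p = 30, 3                          # examples, output dimension
X = rng.uniform(-1, 1, size=(n, 2))   # toy inputs
Y = np.stack([np.sin(X[:, 0]), np.cos(X[:, 1]), X[:, 0] * X[:, 1]], axis=1)

# Separable operator-valued kernel K(x, x') = k(x, x') * B.
B = np.eye(p)                         # output operator (identity: independent outputs)
G = np.kron(scalar_k(X, X), B)        # block Gram matrix of the OV kernel
lam = 1e-3
c = np.linalg.solve(G + lam * np.eye(n * p), Y.ravel())  # kernel ridge solution

def predict(Xs):
    Ks = np.kron(scalar_k(Xs, X), B)  # cross block Gram matrix
    return (Ks @ c).reshape(len(Xs), p)

Y_hat = predict(X)
```

The fully functional setting of [53] replaces the vectors by functions and $B$ by an operator on the output function space, avoiding this discretization entirely.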

Datum-wise representation [44], [54]. We consider supervised classification and introduce the concept of datum-wise representation [44]. While traditional approaches yield a single "best" representation at the level of the data space, that is, the same representation for all data, we proposed the idea, along with an algorithm, of computing the "best" representation for each datum. Among other appealing properties, this leads to a sparse representation of each datum and, on average, a sparser representation over the data space. Alongside a classifier, the learning algorithm produces a "representer", that is, a function that yields a representation for a given datum.
We further improved this approach to encompass various settings that are traditionally treated separately (cost-sensitive classification and different forms of structured sparsity) [54].

Iso-regularization descent [1]. Manuel Loth defended his PhD dissertation [1], in which he provides a detailed presentation and analysis of his algorithm for solving the LASSO. This very efficient active-set algorithm solves the LASSO by treating it as a convex problem with linear constraints.
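For context, the LASSO objective addressed in the thesis is $\min_w \frac{1}{2}\Vert y - Xw\Vert_2^2 + \lambda \Vert w \Vert_1$. The sketch below solves it with generic coordinate descent; it is a baseline illustration, not the iso-regularization descent algorithm itself, and all data and parameter choices are assumptions.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=200):
    """Generic coordinate-descent LASSO solver (illustrative baseline,
    not the iso-regularization descent algorithm of the thesis)."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)          # per-coordinate curvatures
    r = y.copy()                           # residual y - X @ w
    for _ in range(n_sweeps):
        for j in range(d):
            r += X[:, j] * w[j]            # remove coordinate j from residual
            rho = X[:, j] @ r              # correlation with partial residual
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * w[j]            # restore full residual
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
w_true = np.zeros(10)
w_true[[1, 4]] = [2.0, -3.0]
y = X @ w_true + 0.05 * rng.standard_normal(100)
w_hat = lasso_cd(X, y, lam=5.0)
```

On this toy problem the solver recovers the two-coefficient support; active-set methods such as the one in [1] avoid sweeping over coordinates that are known to be inactive.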

Learning with few examples [41], [64]. Christophe Salperwyck has studied the performance of various classifiers when few examples are available. This is an important issue in incremental learning, yet few studies have been devoted to this particular setting. The performance levels we are accustomed to when examples are plentiful degrade severely in this setting. For more details, see [41], [64].

Incremental discretization [40]. In incremental learning, discretization should be adaptive in order to cope with the attribute values observed as the data stream unfolds. This issue is currently under study by Christophe Salperwyck [40].
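As an illustration of adaptive cut points, the generic sketch below maintains a bounded reservoir of observed values and derives quantile-style bin boundaries from it as new values arrive; it is not the method studied in [40], and the class and parameter names are invented for the example.

```python
import bisect
import random

class IncrementalDiscretizer:
    """Generic online discretizer: keeps a bounded sample of observed
    values and derives quantile-style cut points from it. Illustrative
    only; not the approach of [40]."""

    def __init__(self, n_bins=4, capacity=100, seed=0):
        self.n_bins = n_bins
        self.capacity = capacity
        self.sample = []       # bounded, sorted reservoir of observed values
        self.count = 0
        self.rng = random.Random(seed)

    def update(self, x):
        self.count += 1
        if len(self.sample) < self.capacity:
            bisect.insort(self.sample, x)
        elif self.rng.randrange(self.count) < self.capacity:
            # Approximate reservoir sampling keeps the sample representative
            del self.sample[self.rng.randrange(self.capacity)]
            bisect.insort(self.sample, x)

    def cuts(self):
        # Quantile cut points adapt to the values seen so far
        s = self.sample
        return [s[len(s) * k // self.n_bins] for k in range(1, self.n_bins)]

    def bin(self, x):
        return bisect.bisect_right(self.cuts(), x)
```

Because the cut points are recomputed from the reservoir, the bin boundaries track the distribution of the attribute as more examples arrive, which is the adaptivity the paragraph above calls for.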