Section: New Results
Dimension free principal component analysis
Participants: Olivier Catoni, Ilaria Giulini.
In work in progress carried out as part of her PhD studies, Ilaria Giulini proved a dimension-free inequality related to Principal Component Analysis in high dimension. Given an i.i.d. sample $X_1, \dots, X_n$ of vector-valued random variables in $\mathbb{R}^d$, there exists an estimator of the quadratic form $\theta \mapsto \mathbb{E}\langle \theta, X \rangle^2$ whose deviation is controlled, with high probability, uniformly over all directions $\theta \in \mathbb{R}^d$. The bound is expressed in terms of the Gram matrix $\mathbb{E}(X X^\top)$ and of some kurtosis coefficient. This result proves that the expected energy in direction $\theta$ can be estimated at a rate that is independent of the dimension $d$ of the ambient space $\mathbb{R}^d$. It is obtained using PAC-Bayes inequalities with Gaussian parameter perturbations. The same bound holds in a Hilbert space of infinite dimension, which opens the possibility of a rigorous mathematical study of kernel principal component analysis of random data, where the data are represented in a possibly infinite-dimensional reproducing kernel Hilbert space.
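To make the estimated quantity concrete, the following minimal sketch computes the plain empirical counterpart of the expected energy in a direction, namely the average of the squared projections of the sample onto that direction. It is not the estimator of the result above, which relies on PAC-Bayes arguments; the function names, the Gaussian sample, the decaying coordinate scales, and the sample size are all assumptions chosen for illustration. The coordinate scales are picked so that the trace of the Gram matrix stays bounded as the dimension grows.

```python
import numpy as np


def empirical_energy(sample, theta):
    """Average of <theta, X_i>^2 over the rows X_i of `sample`."""
    projections = sample @ theta
    return float(np.mean(projections ** 2))


def empirical_gram(sample):
    """Empirical Gram matrix (1/n) * sum_i X_i X_i^T."""
    return sample.T @ sample / sample.shape[0]


rng = np.random.default_rng(0)  # fixed seed, illustrative only
n = 2000                        # hypothetical sample size
results = {}
for d in (10, 200):
    # Independent Gaussian coordinates with standard deviation 1/k,
    # so the trace of the Gram matrix stays bounded as d grows.
    scales = 1.0 / np.arange(1, d + 1)
    sample = rng.standard_normal((n, d)) * scales
    theta = np.zeros(d)
    theta[0] = 1.0              # first coordinate direction, true energy 1
    results[d] = empirical_energy(sample, theta)
    print(f"d = {d:3d}: empirical energy = {results[d]:.3f} (true value 1)")
```

The empirical energy coincides with the quadratic form of the empirical Gram matrix evaluated at the direction, so `theta @ empirical_gram(sample) @ theta` gives the same number. For heavy-tailed data this plain average can behave poorly, which is one motivation for more robust estimators.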