## Section: New Results

### Sharp analysis of low-rank kernel matrix approximations

Participant: Francis Bach [correspondent].

Kernel methods, such as the support vector machine or kernel ridge regression, are now widely used in many areas of science and engineering, such as computer vision or bioinformatics. However, kernel methods typically suffer from at least quadratic running-time complexity in the number of observations $n$, since this is the cost of computing the kernel matrix. In large-scale settings where $n$ may be large, this is usually not acceptable. In [7], we consider supervised learning problems within the positive-definite kernel framework, such as kernel ridge regression, kernel logistic regression or the support vector machine. Low-rank approximations of the kernel matrix are often considered, as they reduce the running-time complexity to $O\left({p}^{2}n\right)$, where $p$ is the rank of the approximation. The practicality of such methods thus depends on the required rank $p$. In this paper, we show that in the context of kernel ridge regression, for approximations based on a random subset of columns of the original kernel matrix, the rank $p$ may be chosen to be linear in the *degrees of freedom* associated with the problem, a quantity which is classically used in the statistical analysis of such methods and is often seen as the implicit number of parameters of non-parametric estimators. This result enables simple algorithms that have sub-quadratic running-time complexity but provably exhibit the same *predictive performance* as existing algorithms, for any given problem instance, and not only in worst-case situations.