Section: New Results
V-fold cross-validation and V-fold penalization in least-squares density estimation
Participant: Sylvain Arlot [correspondent].
In [22], we study V-fold cross-validation for model selection in least-squares density estimation. The goal is to provide theoretical grounds for choosing V in order to minimize the least-squares risk of the selected estimator. We first prove a non-asymptotic oracle inequality for V-fold cross-validation and its bias-corrected version (V-fold penalization), with an upper bound that decreases as a function of V. In particular, this result implies that V-fold penalization is asymptotically optimal. Then, we compute the variance of V-fold cross-validation and related criteria, as well as the variance of key quantities for model selection performance. We show that these variances depend on V like 1 + 4/(V - 1) (at least in some particular cases), suggesting that the performance improves substantially from V = 2 to V = 5 or 10 and is almost constant beyond that. Overall, this explains the common advice to take V = 5 (at least in our setting and when computational power is limited), as confirmed by some simulation experiments.
Collaboration with Matthieu Lerasle (CNRS, University Nice Sophia Antipolis).
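
As an illustration of the procedure discussed above, the following minimal sketch applies V-fold cross-validation with the least-squares contrast to choose the number of bins of a regular histogram on [0, 1]. It is not the code used in [22]: the restriction to regular histograms, the function names and the fold construction are assumptions made for the example, and the bias-corrected criterion (V-fold penalization) is not implemented. The helper variance_factor evaluates 1 + 4/(V - 1), the dependence on V reported for the variances in some particular cases.

```python
import random


def histogram_heights(train, n_bins):
    """Heights of the regular histogram density estimator on [0, 1]."""
    counts = [0] * n_bins
    for x in train:
        # Clamp so that x == 1.0 falls in the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return [n_bins * c / len(train) for c in counts]


def ls_holdout_risk(train, valid, n_bins):
    """Least-squares contrast ||s||^2 - 2 * mean(s(valid)) for s fitted on train."""
    heights = histogram_heights(train, n_bins)
    norm_sq = sum(h * h for h in heights) / n_bins  # each bin has width 1 / n_bins
    mean_on_valid = sum(
        heights[min(int(x * n_bins), n_bins - 1)] for x in valid
    ) / len(valid)
    return norm_sq - 2.0 * mean_on_valid


def vfold_criterion(data, n_bins, folds):
    """Average hold-out risk over the folds (each fold is a list of indices)."""
    risks = []
    for valid_idx in folds:
        held_out = set(valid_idx)
        train = [x for i, x in enumerate(data) if i not in held_out]
        valid = [data[i] for i in valid_idx]
        risks.append(ls_holdout_risk(train, valid, n_bins))
    return sum(risks) / len(risks)


def select_n_bins(data, candidates, V=5, seed=0):
    """Pick the candidate number of bins minimizing the V-fold criterion."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::V] for k in range(V)]  # same partition for every model
    return min(candidates, key=lambda d: vfold_criterion(data, d, folds))


def variance_factor(V):
    """Factor 1 + 4 / (V - 1); defined for V >= 2."""
    return 1.0 + 4.0 / (V - 1)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.betavariate(2, 5) for _ in range(500)]
    print("selected bins:", select_n_bins(sample, [1, 2, 4, 8, 16, 32, 64], V=5))
    for V in (2, 5, 10):
        print(V, variance_factor(V))
```

The same partition into folds is reused for every candidate model so that the criteria are compared on identical splits. The factor equals 5 for V = 2 and 2 for V = 5, and it can never go below 1, which is consistent with the limited gain expected from taking V much larger than 5 or 10.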