Section: New Results
Statistical aspects of topological and geometric data analysis
Stability and Minimax Optimality of Tangential Delaunay Complexes for Manifold Reconstruction
Participant : Eddie Aamari.
In collaboration with C. Levrard (Univ. Paris Diderot).
We consider the problem of optimality in manifold reconstruction. A random sample
Rates in the Central Limit Theorem and diffusion approximation via Stein's Method
Participant : Thomas Bonis.
We present a way to apply Stein's method to bound the Wasserstein distance between a possibly discrete measure and another measure assumed to be the invariant measure of a diffusion operator. We apply this construction to obtain convergence rates, in terms of
Rates of Convergence for Robust Geometric Inference
Participants : Frédéric Chazal, Bertrand Michel.
In collaboration with P. Massart (Univ. Paris Sud and Inria Select team).
Distances to compact sets are widely used in the field of Topological Data Analysis for inferring geometric and topological features from point clouds. In this context, the distance to a probability measure (DTM) has been introduced by Chazal et al. as a robust alternative to the distance to a compact set. In practice, the DTM can be estimated by its empirical counterpart, that is, the distance to the empirical measure (DTEM). In this paper we give a tight control of the deviation of the DTEM. Our analysis relies on a local analysis of empirical processes. In particular, we show that the rate of convergence of the DTEM depends directly on the regularity at zero of a particular quantile function, which contains some local information about the geometry of the support. This quantile function is the relevant quantity to describe precisely how difficult a geometric inference problem is. Several numerical experiments illustrate the convergence of the DTEM and also confirm that our bounds are tight [19].
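As an illustration of the estimator discussed above, the following minimal sketch computes the DTEM at a query point, assuming the usual definition with mass parameter m = k/n: the square root of the mean squared distance to the k nearest sample points. The function name `empirical_dtm` and the toy sample are hypothetical and are not taken from the paper.

```python
import math


def empirical_dtm(query, sample, k):
    """Distance to the empirical measure at `query`.

    Assumes the mass parameter is m = k / len(sample), so that the squared
    DTEM is the mean squared distance to the k nearest sample points.
    """
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= len(sample)")
    # Squared Euclidean distances from the query to every sample point.
    sq_dists = sorted(
        sum((q - p) ** 2 for q, p in zip(query, point)) for point in sample
    )
    # Average the k smallest squared distances, then take the square root.
    return math.sqrt(sum(sq_dists[:k]) / k)


# Hypothetical toy sample: three points near the origin and one outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
dtm_at_origin = empirical_dtm((0.0, 0.0), sample, k=3)
dtm_at_outlier = empirical_dtm((10.0, 10.0), sample, k=2)
```

The distance from the outlier to the sample viewed as a compact set is zero, whereas its DTEM with k = 2 remains large, which is the robustness property motivating the DTM.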
Data driven estimation of Laplace-Beltrami operator
Participants : Frédéric Chazal, Bertrand Michel, Ilaria Giulini.
Approximations of Laplace-Beltrami operators on manifolds through graph Laplacians have become popular tools in data analysis and machine learning. These discretized operators usually depend on bandwidth parameters whose tuning remains a theoretical and practical problem. In this paper, we address this problem for the unnormalized graph Laplacian by establishing an oracle inequality that opens the door to a well-founded data-driven procedure for bandwidth selection. Our approach relies on recent results by Lacour and Massart on the so-called Lepski’s method [26].
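To make the role of the bandwidth concrete, here is a minimal sketch of an unnormalized graph Laplacian L = D - W built from a point cloud. The Gaussian kernel, the absence of any scaling factor in h, and the name `unnormalized_graph_laplacian` are assumptions made for illustration; the exact kernel and normalization used in the paper may differ, and the bandwidth selection procedure itself is not reproduced here.

```python
import math


def unnormalized_graph_laplacian(points, h):
    """Return L = D - W as a list of rows, for bandwidth h > 0.

    Assumption: Gaussian weights W[i][j] = exp(-|x_i - x_j|^2 / (2 h^2))
    for i != j, and W[i][i] = 0. No rescaling in h is applied.
    """
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            weights[i][j] = weights[j][i] = math.exp(-sq / (2.0 * h * h))
    # Off-diagonal entries are -W[i][j]; the diagonal holds the degrees.
    laplacian = [[-weights[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        laplacian[i][i] = sum(weights[i])
    return laplacian


# Hypothetical point cloud; each bandwidth gives a different operator.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
lap_small_h = unnormalized_graph_laplacian(cloud, h=0.5)
lap_large_h = unnormalized_graph_laplacian(cloud, h=2.0)
```

A small h makes the weights between distinct points nearly vanish, while a large h connects all points almost uniformly; this trade-off is what a data-driven bandwidth choice has to resolve.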