Section: New Results
Statistical aspects of topological and geometric data analysis
The DTM-signature for a geometric comparison of metric-measure spaces from samples
Participant : Claire Brécheteau.
In [43], we introduce the notion of DTM-signature, a measure on
Estimating the Reach of a Manifold
Participants : Eddie Aamari, Frédéric Chazal, Bertrand Michel.
In collaboration with J. Kim, A. Rinaldo, L. Wasserman (Carnegie Mellon University).
Various problems of computational geometry and manifold learning encode geometric regularity through the so-called reach, a generalized convexity parameter. The reach
Robust Topological Inference: Distance To a Measure and Kernel Distance
Participants : Frédéric Chazal, Bertrand Michel.
In collaboration with B. Fasy, F. Lecci, A. Rinaldo, L. Wasserman.
Let
Statistical analysis and parameter selection for Mapper
Participants : Steve Oudot, Bertrand Michel, Mathieu Carrière.
In [44], we study the statistical convergence of the 1-dimensional Mapper to its continuous analogue, the Reeb graph. We show that the Mapper is an optimal estimator of the Reeb graph, which gives, as a byproduct, a method to automatically tune its parameters and to compute confidence regions on its topological features, such as its loops and flares. This makes it possible to avoid testing a large grid of parameters and keeping the most stable ones, the brute-force approach that is widely used in visualization, clustering and feature selection with the Mapper.
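As an illustration of the object being estimated, here is a minimal sketch of a 1-dimensional Mapper built from a point cloud and a real-valued filter. It does not implement the parameter-selection method of [44]. The function names and the parameters `n_intervals`, `gain` and `delta` are hypothetical choices made for this example: the filter range is covered by overlapping intervals, each preimage is clustered by taking connected components of a `delta`-neighborhood graph, and two clusters are linked when they share points.

```python
import numpy as np


def _components(points, idx, delta):
    """Connected components of the delta-neighborhood graph on points[idx]."""
    idx = list(idx)
    sub = points[idx]
    unvisited = set(range(len(idx)))
    comps = []
    while unvisited:
        stack = [unvisited.pop()]
        comp = []
        while stack:
            i = stack.pop()
            comp.append(int(idx[i]))
            dist = np.linalg.norm(sub - sub[i], axis=1)
            near = [j for j in unvisited if dist[j] <= delta]
            unvisited.difference_update(near)
            stack.extend(near)
        comps.append(frozenset(comp))
    return comps


def mapper_1d(points, filter_values, n_intervals, gain, delta):
    """Return (nodes, edges): nodes are sets of point indices,
    edges are pairs of node positions whose point sets intersect."""
    points = np.asarray(points, dtype=float)
    f = np.asarray(filter_values, dtype=float)
    lo, hi = f.min(), f.max()
    # Interval length chosen so that n_intervals intervals, each overlapping
    # its successor by a fraction `gain`, exactly cover [lo, hi].
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - gain))
    step = length * (1 - gain)
    nodes = []
    for k in range(n_intervals):
        a = lo + k * step
        # Pin the last right endpoint to hi to avoid rounding gaps.
        b = hi if k == n_intervals - 1 else a + length
        idx = np.flatnonzero((f >= a) & (f <= b))
        nodes.extend(_components(points, idx, delta))
    edges = {
        (i, j)
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if nodes[i] & nodes[j]
    }
    return nodes, edges


# Usage: points sampled on a circle, filtered by their x-coordinate.
t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
nodes, edges = mapper_1d(circle, circle[:, 0], n_intervals=4, gain=0.3, delta=0.2)
```

On the circle example, the number of independent loops of the resulting graph can be read off as edges minus nodes plus connected components. The output depends on all three parameters, which is the sensitivity that a principled tuning method is meant to address.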
Sliced Wasserstein Kernel for Persistence Diagrams
Participants : Steve Oudot, Mathieu Carrière.
In collaboration with M. Cuturi (ENSAE).
Persistence diagrams (PDs) play a key role in topological data analysis (TDA), in which they are routinely used to succinctly describe complex topological properties of complicated shapes. PDs enjoy strong stability properties and have proven their utility in various learning contexts. They do not, however, live in a space naturally endowed with a Hilbert structure and are usually compared with specific distances, such as the bottleneck distance. To incorporate PDs in a learning pipeline, several kernels have been proposed for PDs, with a strong emphasis on the stability of the RKHS distance w.r.t. perturbations of the PDs. In [27], we use the Sliced Wasserstein approximation of the Wasserstein distance to define a new kernel for PDs, which is not only provably stable but also provably discriminative w.r.t. the Wasserstein distance $W_\infty^1$ between PDs. We also demonstrate its practicality by developing an approximation technique to reduce kernel computation time, and show that our proposal compares favorably to existing kernels for PDs on several benchmarks.
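The following is a minimal sketch of how a sliced Wasserstein distance between two PDs, and a Gaussian-type kernel built on it, might be approximated. It is an illustration under stated assumptions, not the implementation of [27]: the function names, the number of directions `n_directions`, the bandwidth `sigma` and the kernel form exp(-SW / (2 sigma^2)) are choices made for this example. Each diagram is first augmented with the diagonal projections of the other diagram's points so that both have the same cardinality; the 1-dimensional Wasserstein distances between their projections onto sampled directions are then averaged.

```python
import numpy as np


def _diagonal_projection(diagram):
    """Orthogonal projection of each (birth, death) point onto the diagonal."""
    mid = diagram.mean(axis=1, keepdims=True)
    return np.hstack([mid, mid])


def sliced_wasserstein_distance(d1, d2, n_directions=50):
    """Approximate sliced Wasserstein distance between two diagrams,
    each given as an array of shape (n, 2)."""
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    # Augment each diagram with the diagonal projections of the other one.
    a = np.vstack([d1, _diagonal_projection(d2)])
    b = np.vstack([d2, _diagonal_projection(d1)])
    # Directions sampled uniformly on a half-circle.
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_directions, endpoint=False)
    directions = np.stack([np.cos(thetas), np.sin(thetas)])  # shape (2, M)
    # In one dimension, optimal transport matches sorted projections.
    proj_a = np.sort(a @ directions, axis=0)
    proj_b = np.sort(b @ directions, axis=0)
    return np.abs(proj_a - proj_b).sum(axis=0).mean()


def sliced_wasserstein_kernel(d1, d2, sigma=1.0, n_directions=50):
    """Gaussian-type kernel on the approximate sliced Wasserstein distance."""
    sw = sliced_wasserstein_distance(d1, d2, n_directions)
    return np.exp(-sw / (2.0 * sigma ** 2))


# Usage on two small hypothetical diagrams.
diag_a = np.array([[0.0, 2.0], [1.0, 3.0]])
diag_b = np.array([[0.5, 2.5]])
k_ab = sliced_wasserstein_kernel(diag_a, diag_b, sigma=1.0)
```

Increasing `n_directions` trades computation time for accuracy of the approximation; in this sketch each direction costs one sort of the augmented point sets.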
An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists
Participants : Frédéric Chazal, Bertrand Michel.
Topological Data Analysis (TDA) is a recent and fast-growing field providing a set of new topological and geometric tools to infer relevant features from possibly complex data. In [45], we propose a brief introduction for non-experts, through a few selected recent and state-of-the-art topics, to the fundamental and practical aspects of TDA.