Section: New Results
Statistical aspects of topological and geometric data analysis
Estimating the Reach of a Manifold
Participants : Frédéric Chazal, Jisu Kim, Bertrand Michel.
In collaboration with E. Aamari (Univ. Paris-Diderot), A. Rinaldo, L. Wasserman (Carnegie Mellon University).
In [13], various problems in manifold estimation make use of a quantity called the reach, denoted by
A statistical test of isomorphism between metric-measure spaces using the distance-to-a-measure signature
Participant : Claire Brecheteau.
In [20], we introduce the notion of DTM-signature, a measure on
The test is based on subsampling methods and comes with theoretical guarantees. It is proven to be of the correct level asymptotically. Also, when the measures are supported on compact subsets of
An algorithm implementing this statistical test is proposed, and its performance is compared with that of other methods through numerical experiments.
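As a rough illustration of the kind of quantity involved, the sketch below computes an empirical distance-to-a-measure (DTM) at every point of a sample and compares the resulting distributions of values for two samples with a 1-Wasserstein distance. It assumes that the signature is the distribution of DTM values under the measure itself. The function names, the mass parameter `m`, and the restriction to samples of equal size are assumptions made for the example. It is not the subsampling test of [20], only a sketch of a signature comparison.

```python
import numpy as np

def empirical_dtm(points, m):
    """Empirical distance-to-a-measure of each sample point to the sample.

    The mass m * n is rounded up to an integer k (this matches the exact
    empirical DTM only when m * n is an integer). The squared DTM at x is
    then the mean of the squared distances from x to its k nearest sample
    points, x itself included.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    k = max(1, int(np.ceil(m * n)))
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=-1)
    # Keep the k smallest squared distances in each row.
    k_smallest = np.sort(sq_dists, axis=1)[:, :k]
    return np.sqrt(k_smallest.mean(axis=1))

def signature_distance(points_a, points_b, m=0.1):
    """1-Wasserstein distance between two empirical DTM-signatures.

    Assumes both samples have the same size, so the distance between the
    two uniform empirical measures on the real line is the mean absolute
    difference of the sorted values.
    """
    sig_a = np.sort(empirical_dtm(points_a, m))
    sig_b = np.sort(empirical_dtm(points_b, m))
    if len(sig_a) != len(sig_b):
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean(np.abs(sig_a - sig_b)))
```

Because the DTM values depend only on pairwise distances, two samples that differ by a rigid motion have the same signature under this sketch, while a rescaled sample does not.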
On the choice of weight functions for linear representations of persistence diagrams
Participant : Vincent Divol.
In collaboration with Wolfgang Polonik (UC Davis).
Persistence diagrams are efficient descriptors of the topology of a point cloud. As they do not naturally belong to a Hilbert space, standard statistical methods cannot be directly applied to them. Instead, feature maps (or representations) are commonly used for the analysis. A large class of feature maps, which we call linear, depends on some weight functions, the choice of which is a critical issue. An important criterion for choosing a weight function is to ensure stability of the feature maps with respect to Wasserstein distances on diagrams. In [21], we improve known results on the stability of such maps, and extend them to general weight functions. We also address the choice of the weight function by considering an asymptotic setting; assume that
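To make the notion of a linear feature map concrete, here is a minimal sketch in which a diagram is sent to a weighted sum of Gaussian bumps evaluated on a fixed grid, one bump per diagram point. The function name, the Gaussian kernel, the `bandwidth` parameter, and the weight (persistence raised to a power `alpha`) are all hypothetical choices made for the example and are not taken from [21].

```python
import numpy as np

def linear_representation(diagram, grid, alpha=1.0, bandwidth=0.5):
    """Weighted sum of Gaussian bumps, one per diagram point.

    diagram: array of shape (n, 2) with rows (birth, death).
    grid: array of shape (g, 2) of locations where the bumps are evaluated.
    The weight of a point is its persistence (death - birth) raised to the
    power alpha, an illustrative choice of weight function.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    persistence = diagram[:, 1] - diagram[:, 0]
    weights = persistence ** alpha
    # Squared distances between every grid location and every diagram point.
    sq_dists = np.sum((grid[:, None, :] - diagram[None, :, :]) ** 2, axis=-1)
    bumps = np.exp(-sq_dists / (2.0 * bandwidth ** 2))  # shape (g, n)
    # The map is linear in the diagram: each point adds weight * bump.
    return bumps @ weights
```

With this form, the representation of the union of two diagrams is the sum of their representations, and with `alpha > 0` a point lying on the diagonal contributes nothing.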
Understanding the Topology and the Geometry of the Persistence Diagram Space via Optimal Partial Transport
Participants : Vincent Divol, Théo Lacombe.
Despite the obvious similarities between the metrics used in topological data analysis and those of optimal transport, an optimal-transport-based formalism for studying persistence diagrams and similar topological descriptors has yet to be developed. In [48], by considering the space of persistence diagrams as a measure space, and by observing that its metrics can be expressed as solutions of optimal partial transport problems, we introduce a generalization of persistence diagrams, namely Radon measures supported on the upper half-plane. Such measures naturally appear in topological data analysis when considering continuous representations of persistence diagrams (e.g. persistence surfaces), but also as limits in laws of large numbers on persistence diagrams or as expectations of probability distributions on the space of persistence diagrams. We study the topological properties of this new space, which also hold for the closed subspace of persistence diagrams. New results include a characterization of convergence with respect to transport metrics, the existence of Fréchet means for any distribution of diagrams, and an exhaustive description of continuous linear representations of persistence diagrams. We also showcase the usefulness of this framework for studying random persistence diagrams by providing several statistical results made meaningful by this new formalism.
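For finite diagrams, the partial-transport viewpoint reduces to a familiar computation, sketched below under stated assumptions: each point may be matched either to a point of the other diagram or to the diagonal, and the cheapest such matching is found by solving an assignment problem. The function name `diagram_distance`, the Euclidean ground metric, and the default exponent `p` are choices made for the example. The sketch covers only finite diagrams, not the general Radon measures considered in [48].

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def diagram_distance(diag_a, diag_b, p=2.0):
    """p-th order matching distance between two finite persistence diagrams.

    Each diagram is an array of (birth, death) rows. Unmatched points are
    sent to the diagonal at a cost equal to their Euclidean distance to it.
    """
    a = np.asarray(diag_a, dtype=float).reshape(-1, 2)
    b = np.asarray(diag_b, dtype=float).reshape(-1, 2)
    n, m = len(a), len(b)
    if n + m == 0:
        return 0.0
    # Euclidean distance from a point (birth, death) to the diagonal.
    a_to_diag = (a[:, 1] - a[:, 0]) / np.sqrt(2.0)
    b_to_diag = (b[:, 1] - b[:, 0]) / np.sqrt(2.0)
    cost = np.zeros((n + m, n + m))
    # Point-to-point costs.
    cost[:n, :m] = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) ** p
    # Sending a point of diag_a to the diagonal (any of the n extra columns).
    cost[:n, m:] = (a_to_diag ** p)[:, None]
    # Sending a point of diag_b to the diagonal (any of the m extra rows).
    cost[n:, :m] = (b_to_diag ** p)[None, :]
    # The bottom-right block stays zero: diagonal matched to diagonal is free.
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum() ** (1.0 / p))
```

Padding each diagram with copies of the diagonal is what makes the transport "partial": the two diagrams need not have the same number of points, and the diagonal absorbs whatever mass is left unmatched.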