Section: New Results
Topological and Geometric Inference
Homological reconstruction and simplification in $\mathbb{R}^3$
Participants : Olivier Devillers, Marc Glisse.
In collaboration with Dominique Attali (Gipsa-lab), Ulrich Bauer (Göttingen Univ.), and André Lieutier (Dassault Systèmes).
We consider the problem of deciding whether the persistent homology group of a simplicial pair $(K, L)$ can be realized as the homology $H_*(X)$ of some space $X$ with $L \subset X \subset K$. We show that this problem is NP-complete even if $K$ is embedded in $\mathbb{R}^3$.
As a consequence, we show that it is NP-hard to simplify level and sublevel sets of scalar functions on $S^3$ within a given tolerance constraint. This problem has relevance to the visualization of medical images by isosurfaces. We also show an implication for the theory of well groups of scalar functions: not every well group can be realized by some level set, and deciding whether a well group can be realized is NP-complete [43] .
The structure and stability of persistence modules
Participants : Frédéric Chazal, Marc Glisse, Steve Oudot.
In collaboration with Vin de Silva (Pomona College)
We give a self-contained treatment of the theory of persistence modules indexed over the real line. We give new proofs of the standard results. Persistence diagrams are constructed using measure theory. Linear algebra lemmas are simplified using a new notation for calculations on quiver representations. We show that the stringent finiteness conditions required by traditional methods are not necessary to prove the existence and stability of the persistence diagram. We introduce weaker hypotheses for taming persistence modules, which are met in practice and are strong enough for the theory still to work. The constructions and proofs enabled by our framework are, we claim, cleaner and simpler [54] .
Persistence stability for geometric complexes
Participants : Frédéric Chazal, Steve Oudot.
In collaboration with Vin de Silva (Pomona College)
We study the properties of the homology of different geometric filtered complexes (such as Vietoris–Rips, Čech and witness complexes) built on top of precompact spaces. Using recent developments in the theory of topological persistence [54] we provide simple and natural proofs of the stability of the persistent homology of such complexes with respect to the Gromov–Hausdorff distance. We also exhibit a few noteworthy properties of the homology of the Rips and Čech complexes built on top of compact spaces [53] .
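As an illustration of the kind of statement meant by stability here, the inequality can be written as follows. This is a sketch under stated assumptions, not a quotation of [53]: the notation ($d_b$ for the bottleneck distance, $\mathrm{dgm}$ for the persistence diagram, $d_{GH}$ for the Gromov–Hausdorff distance) is introduced for this example, and the constant 2 assumes the convention in which a simplex enters the Rips filtration at its diameter.

```latex
% X, Y: precompact metric spaces (notation assumed for this sketch)
\[
  d_b\bigl(\mathrm{dgm}(\mathrm{Rips}(X)),\ \mathrm{dgm}(\mathrm{Rips}(Y))\bigr)
  \;\le\; 2\, d_{GH}(X, Y)
\]
```

In words: a small perturbation of the space in the Gromov–Hausdorff sense moves the persistence diagram by a comparably small amount.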
Zigzag zoology: Rips zigzags for homology inference
Participants : Steve Oudot, Donald Sheehy.
For points sampled near a compact set $X$, the persistence barcode of the Rips filtration built from the sample contains information about the homology of $X$ as long as $X$ satisfies some geometric assumptions. The Rips filtration is prohibitively large; however, zigzag persistence can be used to keep the size linear. We present several species of Rips-like zigzags and compare them with respect to the signal-to-noise ratio, a measure of how well the underlying homology is represented in the persistence barcode relative to the noise in the barcode at the relevant scales. Some of these Rips-like zigzags have been available as part of the Dionysus library for several years while others are new. Interestingly, we show that some species of Rips zigzags will exhibit less noise than the (non-zigzag) Rips filtration itself. Thus, the Rips zigzag can offer improvements in both size complexity and signal-to-noise ratio.
Along the way, we develop new techniques for manipulating and comparing persistence barcodes from zigzag modules. We give methods for reversing arrows and removing spaces from a zigzag. We also discuss factoring zigzags and a kind of interleaving of two zigzags that allows their barcodes to be compared. These techniques were developed to provide our theoretical analysis of the signal-to-noise ratio of Rips-like zigzags, but they are of independent interest as they apply to zigzag modules generally [60] .
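To make the size issue concrete, the following minimal Python sketch builds the edges and triangles of a Rips complex at a single scale. It is only an illustration: the function name `rips_complex` is hypothetical, the convention (a simplex is included when all pairwise distances are at most `r`) is an assumption, and the brute-force enumeration is not how the Dionysus library or the zigzags discussed above are implemented.

```python
from itertools import combinations
from math import dist


def rips_complex(points, r):
    """Return (edges, triangles) of the Rips complex at scale r.

    Assumed convention: a simplex is included when all pairwise
    distances between its vertices are at most r.
    """
    n = len(points)
    # An edge joins two sample points at distance at most r.
    edges = {(i, j) for i, j in combinations(range(n), 2)
             if dist(points[i], points[j]) <= r}
    # A triangle is included as soon as its three edges are present.
    triangles = [(i, j, k) for i, j, k in combinations(range(n), 3)
                 if (i, j) in edges and (i, k) in edges and (j, k) in edges]
    return sorted(edges), triangles
```

Because every clique of the neighborhood graph becomes a simplex, the number of simplices can grow exponentially with the number of points at large scales, which is the blow-up that motivates keeping the size linear.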
A space and time efficient implementation for computing persistent homology
Participants : Jean-Daniel Boissonnat, Clément Maria.
In collaboration with Tamal Dey (Ohio State University)
Persistent homology with $\mathbb{Z}_2$-coefficients coincides with persistent cohomology because of duality. Recently, it has been observed that cohomology-based algorithms perform much better in practice than the originally proposed homology-based persistence algorithm. We have implemented a cohomology-based algorithm that attaches binary labels, called annotations, to the simplices. This algorithm fits very naturally with our recently developed data structure, the simplex tree, for representing simplicial complexes [49] , [22] . By taking advantage of several practical tricks, such as representing annotations compactly with memory words, using a union-find structure that eliminates duplicate annotation vectors, and lazy evaluation, we reduce both the space and the time cost of the computation. In practice, the complexity of the procedure depends almost linearly on the size of the simplicial complex and on variables related to the maximal dimension of the local homology groups maintained during the computation, which remain small. We provide a theoretical analysis as well as a detailed experimental study of our implementation. Experimental results show that our implementation performs several times better than the existing state-of-the-art software for computing persistent homology in terms of both time and memory requirements, and can handle very large complexes (several hundred million simplices in high dimension) efficiently [45] .
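For readers unfamiliar with persistence computations, here is a minimal Python sketch of the standard homology-based column reduction over the two-element field. It is explicitly not the annotation-based cohomology algorithm described above; it only illustrates the baseline computation and the idea of packing binary vectors into machine words (here, Python integers used as bitsets). The function name `persistence_pairs` and the input encoding are assumptions made for this example.

```python
def persistence_pairs(boundaries):
    """Standard persistence reduction with coefficients in the two-element field.

    boundaries[j] lists the filtration indices of the facets of simplex j;
    simplices are assumed to be given in filtration order.
    Returns the list of (birth, death) index pairs.
    """
    reduced = {}  # lowest set bit index -> reduced column stored as a bitset
    pairs = []
    for j, facets in enumerate(boundaries):
        col = 0
        for i in facets:
            col ^= 1 << i  # pack the boundary of simplex j into one integer
        while col:
            low = col.bit_length() - 1  # index of the youngest facet
            if low not in reduced:
                break
            col ^= reduced[low]  # column addition modulo 2 is a single XOR
        if col:
            low = col.bit_length() - 1
            reduced[low] = col
            pairs.append((low, j))  # simplex j kills the class born at low
    return pairs
```

For a filled triangle given as three vertices (indices 0 to 2), three edges (3 to 5) and one face (6), the third edge creates a cycle that the face then kills, which shows up as the pair `(5, 6)`.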
Minimax rates for homology inference
Participant : Donald Sheehy.
In collaboration with Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh, and Larry A. Wasserman (Carnegie Mellon University)
Often, high dimensional data lie close to a low-dimensional submanifold and it is of interest to understand the geometry of these submanifolds. The homology groups of a manifold are important topological invariants that provide an algebraic summary of the manifold. These groups contain rich topological information, for instance, about the connected components, holes, tunnels and sometimes the dimension of the manifold. We consider the statistical problem of estimating the homology of a manifold from noisy samples under several different noise models. We derive upper and lower bounds on the minimax risk for this problem. Our upper bounds are based on estimators which are constructed from a union of balls of appropriate radius around carefully selected points. In each case, we establish complementary lower bounds using Le Cam's lemma [15] .
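As a rough illustration of a union-of-balls estimator, the sketch below counts the connected components (the rank of the degree-0 homology) of a union of closed balls of a common radius, using a union-find structure. It is a simplified stand-in under stated assumptions: the function name `union_of_balls_components` is hypothetical, the careful point selection and the choice of radius described above are not modeled, and higher-dimensional homology is not computed.

```python
from math import dist


def union_of_balls_components(points, r):
    """Count connected components of the union of closed balls of radius r."""
    parent = list(range(len(points)))

    def find(i):
        # Find the representative of i, halving paths along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Two closed balls of radius r meet iff their centers
            # are at distance at most 2r.
            if dist(points[i], points[j]) <= 2 * r:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})
```

The output depends strongly on `r`: too small a radius splits a single component into several, while too large a radius merges distinct ones, which is why the radius has to be chosen appropriately.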
Linear-size approximations to the Vietoris-Rips filtration
Participant : Donald Sheehy.
The Vietoris-Rips filtration is a versatile tool in topological data analysis. Unfortunately, it is often too large to construct in full. We show how to construct an $O(n)$-size filtered simplicial complex on an $n$-point metric space such that the persistence diagram is a good approximation to that of the Vietoris-Rips filtration. The filtration can be constructed in $O(n \log n)$ time. The constants depend only on the doubling dimension of the metric space and the desired tightness of the approximation. For the first time, this makes it computationally tractable to approximate the persistence diagram of the Vietoris-Rips filtration across all scales for large data sets. Our approach uses a hierarchical net-tree to sparsify the filtration. We can either sparsify the data by throwing out points at larger scales to give a zigzag filtration, or sparsify the underlying graph by throwing out edges at larger scales to give a standard filtration. Both methods yield the same guarantees [34] .
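The idea of discarding points at larger scales can be illustrated with a greedy (farthest-point) ordering, in which each point receives an insertion radius; points with a small insertion radius are natural candidates for removal once the scale exceeds it. The sketch below is a quadratic-time illustration only: it is not the net-tree construction used in the paper, and the function name `greedy_permutation` and its `start` parameter are hypothetical.

```python
from math import dist, inf


def greedy_permutation(points, start=0):
    """Farthest-point ordering of a non-empty list of distinct points.

    Returns (order, radii), where radii[t] is the distance from the point
    chosen at step t to the points chosen before it (inf for the first).
    """
    n = len(points)
    order = [start]
    radii = [inf]
    # d[i] is the distance from point i to the points selected so far.
    d = [dist(p, points[start]) for p in points]
    for _ in range(n - 1):
        # Selected points have d == 0, so with distinct points the
        # maximum is always attained at a point not yet selected.
        nxt = max(range(n), key=d.__getitem__)
        order.append(nxt)
        radii.append(d[nxt])
        for i, p in enumerate(points):
            d[i] = min(d[i], dist(p, points[nxt]))
    return order, radii
```

Each prefix of the ordering is a net of the point set at the scale given by the next insertion radius, which is the property that a sparsification scheme of this kind relies on.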
A multicover nerve for geometric inference
Participant : Donald Sheehy.
We show that filtering the barycentric decomposition of a Čech complex by the cardinality of the vertices captures precisely the topology of $k$-covered regions among a collection of balls for all values of $k$. Moreover, we relate this result to the Vietoris-Rips complex to get an approximation in terms of the persistent homology [33] .
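The notion of a covered region used here can be made concrete with a small Python sketch that counts how many balls of a common radius contain a query point. The function names `coverage_depth` and `is_k_covered` are hypothetical, and the sketch only tests membership pointwise; it does not build the barycentric decomposition or compute any topology.

```python
from math import dist


def coverage_depth(x, centers, r):
    """Number of closed balls of radius r (one per center) containing x."""
    return sum(1 for c in centers if dist(x, c) <= r)


def is_k_covered(x, centers, r, k):
    """True when x lies in at least k of the balls."""
    return coverage_depth(x, centers, r) >= k
```

The set of points covered at least k times shrinks as k grows, so these regions are nested, which is what makes a filtration indexed by cardinality natural.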
Computing well diagrams for vector fields on $\mathbb{R}^n$
Participant : Frédéric Chazal.
In collaboration with Primoz Skraba (Ljubljana Univ.) and Amit Patel (Rutgers Univ.)
Using topological degree theory, we present and prove correctness of a fast algorithm for computing the well diagram, a quantitative property, of a vector field on Euclidean space [17] .