Section: New Results

Topological approach for multimodal data processing

A General Neural Network Architecture for Persistence Diagrams and Graph Classification

Participants : Frédéric Chazal, Théo Lacombe, Martin Royer.

In collaboration with Mathieu Carrière (Columbia Univ.) and Umeda Yuhei and Ike Yuichi (Fujitsu Labs).

Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have already proved pivotal in many different applications of data science. However, since the (metric) space of persistence diagrams is not a Hilbert space, they end up being difficult inputs for most Machine Learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into either a finite-dimensional Euclidean space or an (implicit) infinite-dimensional Hilbert space with kernels. In [44], we focus on persistence diagrams built on top of graphs. Relying on extended persistence theory and the so-called heat kernel signature, we show how graphs can be encoded by (extended) persistence diagrams in a provably stable way. We then propose a general and versatile framework for learning vectorizations of persistence diagrams, which encompasses most of the vectorization techniques used in the literature. We finally showcase the experimental strength of our setup by achieving competitive scores on classification tasks on real-life graph datasets.
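As an illustration of the first step, the heat kernel signature of a graph can be sketched in a few lines. This is a minimal sketch, not the implementation of [44]: it assumes the usual definition based on the normalized graph Laplacian, an undirected graph with no isolated vertex, and a hypothetical function name `heat_kernel_signature`. The resulting vertex function is the kind of filter on which (extended) persistence would then be computed.

```python
import numpy as np

def heat_kernel_signature(adjacency, t):
    """Heat kernel signature of each vertex at diffusion time t.

    Assumes a symmetric adjacency matrix in which every vertex has
    positive degree (illustrative sketch only).
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # hks_t(v) = sum_k exp(-t * lambda_k) * phi_k(v)^2
    return (eigvecs ** 2) @ np.exp(-t * eigvals)

# Path graph on three vertices: 0 - 1 - 2
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
hks = heat_kernel_signature(path, t=1.0)
```

By symmetry of the path graph, the two end vertices receive the same value, while the middle vertex is distinguished from them.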

Topological Data Analysis for Arrhythmia Detection through Modular Neural Networks

Participant : Frédéric Chazal.

In collaboration with Umeda Yuhei and Meryll Dindin (Fujitsu Labs).

In [47], we present an innovative and generic deep learning approach to monitor heart conditions from ECG signals. We focus on both the detection and the classification of abnormal heartbeats, known as arrhythmia. We put strong emphasis on generalization throughout the construction of a deep-learning model that turns out to be effective for new, unseen patients. The novelty of our approach lies in the use of topological data analysis as the basis of our multichannel architecture, to reduce the bias due to individual differences. We show that our architecture matches the performance of state-of-the-art methods for arrhythmia detection and classification.

ATOL: Automatic Topologically-Oriented Learning

Participants : Frédéric Chazal, Martin Royer.

In collaboration with Umeda Yuhei and Ike Yuichi (Fujitsu Labs).

There are abundant cases for using Topological Data Analysis (TDA) in a learning context, but robust topological information commonly comes in the form of a set of persistence diagrams, objects that are by nature difficult to plug into a generic machine learning framework. In [56], we introduce a vectorisation method for diagrams that gathers information from topological descriptors into a format fit for machine learning tools. Based on a few observations, the method is learned and tailored to discriminate between the important regions of the plane in which a diagram lies. With this tool one can automatically augment any machine learning problem that has access to a TDA method, enhance performance, and construct features reflecting underlying changes in topological behaviour. The proposed methodology comes with only high-level tuning parameters, such as the encoding budget for topological features. We provide an open-access, ready-to-use implementation and notebook. We showcase the strengths and versatility of our approach on a number of applications. From competitive, modern graph collections to highly topological synthetic data of dynamical orbits, we show that the method matches or beats the state of the art in encoding persistence diagrams to solve hard problems. We then apply our method to a difficult industrial time-series regression problem and show the approach to be relevant.
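The general idea of a learned, budget-limited vectorisation can be sketched as follows. This is a simplified illustration and not the released implementation of [56]: it assumes that the important plane regions are found by plain k-means on the pooled diagram points, that each diagram is then summarized by one Laplacian-type feature per center, and that a single hypothetical `scale` parameter is shared by all centers. The names `fit_centers` and `vectorize` are also hypothetical; `budget` plays the role of the encoding budget.

```python
import numpy as np

def fit_centers(diagrams, budget, n_iter=20, seed=0):
    """Lloyd's k-means on the points pooled from all diagrams."""
    points = np.vstack(diagrams).astype(float)
    rng = np.random.default_rng(seed)
    # Initialize with `budget` distinct points drawn from the pool
    centers = points[rng.choice(len(points), size=budget, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(budget):
            members = points[labels == k]
            if len(members) > 0:  # keep the old center if its cluster is empty
                centers[k] = members.mean(axis=0)
    return centers

def vectorize(diagram, centers, scale=1.0):
    """Map a non-empty diagram (n x 2 array) to a vector of length len(centers)."""
    pts = np.asarray(diagram, dtype=float)
    dists = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
    # Each point contributes exp(-distance / scale) to every center's feature
    return np.exp(-dists / scale).sum(axis=0)

# Toy (birth, death) diagrams
diagrams = [np.array([[0.0, 1.0], [0.0, 2.0]]), np.array([[1.0, 3.0]])]
centers = fit_centers(diagrams, budget=2)
features = np.array([vectorize(d, centers) for d in diagrams])
```

Because each diagram point contributes additively, the resulting vector has a fixed length equal to the budget regardless of how many points the diagram contains, which is what makes it usable as input to standard machine learning tools.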

Inverse Problems in Topological Persistence: a Survey

Participant : Steve Oudot.

In collaboration with Elchanan Solomon (Duke).

In [27], we review the literature on inverse problems in topological persistence theory. The first half of the survey is concerned with the question of surjectivity, i.e. the existence of right inverses, and the second half focuses on injectivity, i.e. left inverses. Throughout, we highlight the tools and theorems that underlie these advances, and direct the reader’s attention to open problems, both theoretical and applied.

Intrinsic Topological Transforms via the Distance Kernel Embedding

Participants : Clément Maria, Steve Oudot.

In collaboration with Elchanan Solomon (Duke).

Topological transforms are parametrized families of topological invariants, which, by analogy with transforms in signal processing, are much more discriminative than single measurements. The first two topological transforms to be defined were the Persistent Homology Transform and Euler Characteristic Transform, both of which apply to shapes embedded in Euclidean space. The contribution of this work [54] is to define topological transforms that depend only on the intrinsic geometry of a shape, and hence are invariant to the choice of embedding. To that end, given an abstract metric measure space, we define an integral operator whose eigenfunctions are used to compute sublevel set persistent homology. We demonstrate that this operator, which we call the distance kernel operator, enjoys desirable stability properties, and that its spectrum and eigenfunctions concisely encode the large-scale geometry of our metric measure space. We then define a number of topological transforms using the eigenfunctions of this operator, and observe that these transforms inherit many of the stability and injectivity properties of the distance kernel operator.
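For a finite sample, the construction can be sketched numerically. The sketch below is an illustration under stated assumptions, not the method of [54]: it assumes that the integral operator acts as (T f)(x) = ∫ d(x, y) f(y) dμ(y), that the metric measure space is a finite set of points with strictly positive weights, and it uses the hypothetical name `distance_kernel_eigs`. The returned eigenfunctions are vertex functions on which sublevel set persistence could then be computed.

```python
import numpy as np

def distance_kernel_eigs(distances, weights=None):
    """Eigenvalues and eigenfunctions of f -> D @ (w * f).

    `distances` is a symmetric n x n distance matrix; `weights` is a
    strictly positive measure on the n points (uniform by default).
    """
    D = np.asarray(distances, dtype=float)
    n = len(D)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    s = np.sqrt(w)
    # Conjugating by sqrt(w) gives a symmetric matrix with the same spectrum
    S = s[:, None] * D * s[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)
    # Sort by decreasing absolute value of the eigenvalue
    order = np.argsort(-np.abs(eigvals))
    eigvals = eigvals[order]
    # Undo the conjugation to recover eigenfunctions of the original operator
    eigfuncs = eigvecs[:, order] / s[:, None]
    return eigvals, eigfuncs

# Three points on a line at positions 0, 1, 2, with uniform measure
positions = np.array([0.0, 1.0, 2.0])
D = np.abs(positions[:, None] - positions[None, :])
eigvals, eigfuncs = distance_kernel_eigs(D)
```

Only the pairwise distances and the measure enter the computation, so the output does not depend on how the points are embedded.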

A Framework for Differential Calculus on Persistence Barcodes

Participant : Steve Oudot.

In collaboration with Jacob Leygonie and Ulrike Tillmann (Oxford).

In [52], we define notions of differentiability for maps from and to the space of persistence barcodes. Inspired by the theory of diffeological spaces, the proposed framework uses lifts to the space of ordered barcodes, from which derivatives can be computed. The two derived notions of differentiability (respectively from and to the space of barcodes) combine together naturally to produce a chain rule that enables the use of gradient descent for objective functions factoring through the space of barcodes. We illustrate the versatility of this framework by showing how it can be used to analyze the smoothness of various parametrized families of filtrations arising in topological data analysis.
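In symbols, the chain rule can be sketched as follows. The notation is an assumption made here for illustration and may differ from that of [52]: $\mathcal{M}$ is a smooth parameter manifold, $B : \mathcal{M} \to \mathrm{Bar}$ is a barcode-valued map, $\tilde{B}$ is a local lift of $B$ to ordered barcodes with $m$ bounded intervals (viewed as points of $\mathbb{R}^{2m}$, ignoring infinite intervals for simplicity), $Q_m$ is the quotient map that forgets the ordering, and $V : \mathrm{Bar} \to \mathbb{R}$ is an objective function on barcodes.

```latex
\[
  B = Q_m \circ \tilde{B}
  \quad \text{near } x \in \mathcal{M},
  \qquad
  d(V \circ B)_x \;=\; d\,(V \circ Q_m)_{\tilde{B}(x)} \circ d\tilde{B}_x .
\]
% A gradient descent step on the composite objective, with step size \eta > 0:
\[
  x_{k+1} \;=\; x_k - \eta \, \nabla (V \circ B)(x_k).
\]
```

Both factors on the right-hand side are ordinary derivatives between Euclidean spaces and manifolds, which is what allows standard gradient-based optimization to be applied to objectives that factor through the space of barcodes.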