DataShape is a research project in Topological Data Analysis (TDA), a recent field whose aim is to uncover, understand and exploit the topological and geometric structure underlying complex and possibly high-dimensional data. The DataShape project gathers a unique variety of expertise that allows it to embrace the mathematical, statistical, algorithmic and applied aspects of the field in a common framework ranging from fundamental theoretical studies to experimental research and software development.

The expected output of DataShape is two-fold. First, we intend to set up and develop the mathematical, statistical and algorithmic foundations of Topological and Geometric Data Analysis. Second, we intend to develop the GUDHI platform in order to provide an efficient state-of-the-art toolbox for the understanding of the topology and geometry of data.

TDA requires constructing and manipulating appropriate representations of complex and high-dimensional shapes. A major difficulty comes from the fact that the complexity of data structures and algorithms used to approximate shapes rapidly grows as the dimensionality increases, which makes them intractable in high dimensions. We focus our research on simplicial complexes, which offer a convenient representation of general shapes and generalize graphs and triangulations. Our work includes the study of simplicial complexes with good approximation properties and the design of compact data structures to represent them.

In low dimensions, effective shape reconstruction techniques exist that can provide precise geometric approximations very efficiently and under reasonable sampling conditions. Extending those techniques to higher dimensions, as is required in the context of TDA, is problematic since almost all methods in low dimensions rely on the computation of a subdivision of the ambient space. A direct extension of those methods would immediately lead to algorithms whose complexities depend exponentially on the ambient dimension, which is prohibitive in most applications. A first direction to bypass the curse of dimensionality is to develop algorithms whose complexities depend on the intrinsic dimension of the data (which most of the time is small although unknown) rather than on the dimension of the ambient space. Another direction is to resort to cruder approximations that only capture the homotopy type or the homology of the sampled shape. The recent theory of persistent homology provides a powerful and robust tool to study the homology of sampled spaces in a stable way.
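As an illustration of the simplest case of persistent homology, the sketch below computes the persistence of connected components (dimension 0) of a point cloud. It is a minimal example rather than the project's implementation: the function name `h0_persistence` is hypothetical, and it assumes that an edge enters the filtration at a scale equal to its Euclidean length.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """Return (birth, death) pairs of connected components.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another component. One component never dies.
    """
    if not points:
        return []
    parent = list(range(len(points)))

    def find(i):
        # Representative of the component of i, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all edges by increasing length (as in Kruskal's algorithm).
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge: one of them dies
            pairs.append((0.0, length))
    pairs.append((0.0, float("inf")))  # the component that survives
    return pairs


if __name__ == "__main__":
    cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
    print(h0_persistence(cloud))
```

On the four collinear points of the usage example, two components die at scale 1 and one at scale 4, which reflects the two well-separated clusters in the data.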

The wide variety of ever larger available data, often corrupted by noise and outliers, requires considering the statistical properties of their topological and geometric features and proposing new relevant statistical models for their study.

There exist various statistical and machine learning methods intending to uncover the geometric structure of data. Beyond manifold learning and dimensionality reduction approaches, which generally do not allow one to assert the relevance of the inferred topological and geometric features and are not well suited to the analysis of complex topological structures, set estimation methods intend to estimate, from random samples, a set around which the data is concentrated. In these methods, which include support and manifold estimation, principal curves/manifolds and their various generalizations, to name a few, the estimation problems are usually considered under losses, such as the Hausdorff distance or the symmetric difference, that are not sensitive to the topology of the estimated sets, preventing these tools from directly inferring topological or geometric information.

Regarding purely topological features, the statistical estimation of the homology or homotopy type of compact subsets of Euclidean spaces has only been considered recently, most of the time under the quite restrictive assumption that the data are randomly sampled from smooth manifolds.

In a more general setting, with the emergence of new geometric inference tools based on the study of distance functions and algebraic topology tools such as persistent homology, computational topology has recently seen an important development offering a new set of methods to infer relevant topological and geometric features of data sampled in general metric spaces. The use of these tools remains widely heuristic and until recently there were only a few preliminary results establishing connections between geometric inference, persistent homology and statistics. However, this direction has attracted a lot of attention over the last three years. In particular, stability properties and new representations of persistent homology information have led to very promising results to which the DataShape members have significantly contributed. These preliminary results open many perspectives and research directions that need to be explored.

Our goal is to build on our first statistical results in TDA to develop the mathematical foundations of Statistical Topological and Geometric Data Analysis. Combined with the other objectives, our ultimate goal is to provide a well-founded and effective statistical toolbox for the understanding of topology and geometry of data.

Due to their geometric nature, multimodal data (images, video, 3D shapes, etc.) are of particular interest for the techniques we develop. Our goal is to establish a rigorous framework in which data having different representations can all be processed, mapped and exploited jointly. This requires adapting our tools and sometimes developing entirely new or specialized approaches.

The choice of multimedia data is motivated primarily by the fact that the amount of such data is steadily growing (with e.g. video streaming accounting for nearly two thirds of peak North-American Internet traffic, and almost half a billion images being posted on social networks each day), while at the same time it poses significant challenges in designing informative notions of (dis)-similarity as standard metrics (e.g. Euclidean distances between points) are not relevant.

We develop a high-quality open source software platform called GUDHI which is becoming a reference in geometric and topological data analysis in high dimensions. The goal is not to provide code tailored to the numerous potential applications but rather to provide the central data structures and algorithms that underlie applications in geometric and topological data analysis.

The development of the GUDHI platform also serves to benchmark and optimize new algorithmic solutions resulting from our theoretical work. Such development necessitates a whole line of research on software architecture and interface design, heuristics and fine-tuning optimization, robustness and arithmetic issues, and visualization. We aim at providing a full programming environment following the same recipes that made the success of the CGAL library, the reference library in computational geometry.

Some of the algorithms implemented on the platform will also be interfaced to other software platforms, such as the R software.

Our work is mostly of a fundamental mathematical and algorithmic nature but finds applications in a variety of areas of data analysis, more precisely in Topological Data Analysis (TDA). Although TDA is a quite recent field, it has already found applications in material science, biology, sensor networks, 3D shape analysis and processing, to name a few.

More specifically, DataShape has recently started to work on the analysis of trajectories obtained from inertial sensors (starting PhD thesis of Bertrand Beaufils) and is exploring some possible new applications in material science.

Jean-Daniel Boissonnat has been elected a professor at the Collège de France, on the Chair Informatics and Computational Sciences for the academic year 2016-2017.

Publication of a book providing a self-contained presentation of the theory of persistence modules over the real line, the objects that are at the heart of the field of TDA.

Geometric Understanding in Higher Dimensions

Scientific Description

The current release of the GUDHI library includes:

Data structures to represent, construct and manipulate simplicial and cubical complexes.

Algorithms to compute simplicial complexes from point cloud data.

Algorithms to compute persistent homology and multi-field persistent homology.

Simplification methods via implicit representations.
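To make the construction of a simplicial complex from point cloud data concrete, here is a minimal pure-Python sketch of a Vietoris-Rips (flag) complex. It does not use the GUDHI API: the function `rips_complex` and its parameters `max_edge_length` and `max_dim` are hypothetical names chosen for this illustration, and the quadratic neighbor search is only suitable for small inputs.

```python
from itertools import combinations
from math import dist


def rips_complex(points, max_edge_length, max_dim):
    """Return the simplices of a Vietoris-Rips complex as vertex tuples.

    A simplex is included when all its vertices are pairwise within
    max_edge_length, so the result is the flag complex of the neighborhood
    graph, truncated at dimension max_dim.
    """
    n = len(points)
    # neighbors[i] stores only higher-numbered neighbors, so that each
    # simplex is generated exactly once, with increasing vertex indices.
    neighbors = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) <= max_edge_length:
            neighbors[i].add(j)

    simplices = []

    def expand(simplex, candidates):
        # candidates: vertices adjacent to every vertex of the simplex.
        simplices.append(simplex)
        if len(simplex) - 1 >= max_dim:
            return
        for v in sorted(candidates):
            expand(simplex + (v,), candidates & neighbors[v])

    for i in range(n):
        expand((i,), neighbors[i])
    return simplices


if __name__ == "__main__":
    cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    for simplex in rips_complex(cloud, max_edge_length=1.5, max_dim=2):
        print(simplex)
```

Because the complex is determined by its edges, only the graph needs to be computed from the geometry; higher-dimensional simplices follow by clique expansion, which is what keeps this family of complexes practical in high ambient dimension.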

Functional Description

The GUDHI open source library will provide the central data structures and algorithms that underlie applications in geometry understanding in higher dimensions. It is intended to both help the development of new algorithmic solutions inside and outside the project, and to facilitate the transfer of results in applied fields.

Participants: Jean-Daniel Boissonnat, Marc Glisse, Mariette Yvinec, Clément Maria, David Salinas, Paweł Dłotko, Siargey Kachanovich and Vincent Rouvreau

Contact: Jean-Daniel Boissonnat

In collaboration with Karthik C.S. (Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Israel)

A filtration over a simplicial complex

This direction has been recently pursued for the case of maintaining
simplicial complexes. For instance, Boissonnat et al. [SoCG '15]
considered storing the simplices that are maximal for the inclusion
and Attali et al. [IJCGA '12] considered storing the simplices that
block the expansion of the complex. Nevertheless, so far there has
been no data structure that compactly stores the *filtration* of
a simplicial complex, while also allowing the efficient implementation
of basic operations on the complex.

A new *edge-deletion* algorithm for the fast construction of Flag complexes, which only depends on the number of critical simplices and the number of vertices.

A new *matrix-parsing* algorithm to quickly construct the relaxed strong Delaunay complexes, depending only on the number of witnesses and the dimension of the complex.

Anisotropic meshes are desirable for various applications, such as the numerical solving of partial differential equations and graphics. We introduce an algorithm to compute discrete approximations of Riemannian Voronoi diagrams on 2-manifolds. This is not straightforward because geodesics (shortest paths between points), and therefore distances, cannot in general be computed exactly. Our implementation employs recent developments in the numerical computation of geodesic distances and is accelerated through the use of an underlying anisotropic graph structure. We give conditions that guarantee that our discrete Riemannian Voronoi diagram is combinatorially equivalent to the Riemannian Voronoi diagram and that its dual is an embedded triangulation, using both approximate geodesics and straight edges. Both the theoretical guarantees on the approximation of the Voronoi diagram and the implementation are new and provide a step towards the practical application of Riemannian Delaunay triangulations.

In collaboration with M. Buchet (Tohoku University), D. Sheehy (Univ. Connecticut).

A new paradigm for point cloud data analysis has emerged recently, where point clouds are no longer treated as mere compact sets but rather as empirical measures. A notion of distance to such measures has been defined and shown to be stable with respect to perturbations of the measure. This distance can easily be computed pointwise in the case of a point cloud, but its sublevel-sets, which carry the geometric information about the measure, remain hard to compute or approximate. This makes it challenging to adapt many powerful techniques based on the Euclidean distance to a point cloud to the more general setting of the distance to a measure on a metric space. We propose an efficient and reliable scheme to approximate the topological structure of the family of sublevel-sets of the distance to a measure. We obtain an algorithm for approximating the persistent homology of the distance to an empirical measure that works in arbitrary metric spaces. Precise quality and complexity guarantees are given with a discussion on the behavior of our approach in practice.
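The pointwise computation mentioned above can be sketched as follows. For the empirical measure on n points and a mass parameter k/n, the distance to the measure at a query point x is the root mean of the squared distances from x to its k nearest sample points. The function name `distance_to_measure` and the brute-force neighbor search are assumptions made for this illustration, with the Euclidean metric standing in for a general metric.

```python
from math import dist, sqrt


def distance_to_measure(x, points, k):
    """Distance from x to the empirical measure on points, with mass k/n.

    Computed as the square root of the average squared distance from x
    to its k nearest neighbors among the sample points.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    squared = sorted(dist(x, p) ** 2 for p in points)
    return sqrt(sum(squared[:k]) / k)


if __name__ == "__main__":
    sample = [(0.0, 0.0), (2.0, 0.0), (100.0, 0.0)]  # last point is an outlier
    print(distance_to_measure((100.0, 0.0), sample, 1))  # plain distance to the set
    print(distance_to_measure((100.0, 0.0), sample, 2))  # outlier is now far from the data
```

With k = 1 the function reduces to the usual distance to the point cloud, which vanishes at an outlier. Taking k larger makes the value at the outlier large, which illustrates the robustness that motivates working with measures rather than compact sets.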

A merged paper with Esther Ezra (School of Mathematics, Georgia Institute of Technology, Atlanta, U.S.A.)

We refine the bound on the packing number, originally shown by Haussler, for shallow geometric set systems. Specifically, let

A new *tight upper bound* for shallow-packings in

A

where

A new algorithmic *lower bound* for largest

In collaboration with Nabil Mustafa (Université Paris-Est, Laboratoire d'Informatique Gaspard-Monge, ESIEE Paris, France.)

Showing the existence of

In this paper we give a short proof of this theorem
in the space of a few elementary paragraphs,
showing that it follows by combining
the

This implies all known cases of results on unweighted

A new *unified proof* for all known bounds on unweighted

In collaboration with Bruno Jartoux and Nabil Mustafa (Université Paris-Est Marne-la-Vallée, Laboratoire d'Informatique Gaspard-Monge, ESIEE Paris, France.)

The packing lemma of Haussler states that given a set system

an optimal lower bound for shallow packings, thus settling the open question in Ezra (SODA 2014) and Dutta et al. (SoCG 2015),

improved bounds on Mnets, providing a combinatorial analogue to Macbeath regions in convex geometry (Annals of Mathematics, 1952),

simplifying and generalizing the main technical tool in Fox et al. (J. of the EMS, 2016).

Besides using the packing lemma and a combinatorial construction, our proofs combine tools from polynomial partitioning and the probabilistic method.

A new *optimal lower bound* for shallow packings.

New *improved bounds* for M-nets - combinatorial analogs of Macbeath regions in convex geometry.

In collaboration with Nabil Mustafa (Université Paris-Est Marne-la-Vallée, Laboratoire d'Informatique Gaspard-Monge, ESIEE Paris, France.)

The Khatri-Šidák lemma says that for any Gaussian measure

A new *asymmetric* inequality for Gaussian measures.

In collaboration with C. Levrard (Univ. Paris Diderot).

We consider the problem of optimality in manifold reconstruction. A random sample

We present a way to apply Stein's method in order to bound the Wasserstein distance between a, possibly discrete, measure and another measure assumed to be the invariant measure of a diffusion operator. We apply this construction to obtain convergence rates, in terms of

In collaboration with P. Massart (Univ. Paris Sud et Inria Select team).

Distances to compact sets are widely used in the field of Topological Data Analysis for inferring geometric and topological features from point clouds. In this context, the distance to a probability measure (DTM) has been introduced by Chazal et al. as a robust alternative to the distance to a compact set. In practice, the DTM can be estimated by its empirical counterpart, that is, the distance to the empirical measure (DTEM). In this paper we give a tight control of the deviation of the DTEM. Our analysis relies on a local analysis of empirical processes. In particular, we show that the rate of convergence of the DTEM directly depends on the regularity at zero of a particular quantile function which contains some local information about the geometry of the support. This quantile function is the relevant quantity to describe precisely how difficult a geometric inference problem is. Several numerical experiments illustrate the convergence of the DTEM and also confirm that our bounds are tight.

Approximations of Laplace-Beltrami operators on manifolds through graph Laplacians have become popular tools in data analysis and machine learning. These discretized operators usually depend on bandwidth parameters whose tuning remains a theoretical and practical problem. In this paper, we address this problem for the unnormalized graph Laplacian by establishing an oracle inequality that opens the door to a well-founded data-driven procedure for the bandwidth selection. Our approach relies on recent results by Lacour and Massart on the so-called Lepski's method.
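To show where the bandwidth parameter enters, the sketch below applies an unnormalized graph Laplacian to a function sampled on a point cloud. Several choices here are assumptions made for illustration and are not taken from the paper: the Gaussian kernel, the absence of any scaling factor in the number of points or the bandwidth, the sign convention, and the name `apply_graph_laplacian`. The data-driven bandwidth selection itself is not implemented.

```python
from math import dist, exp


def apply_graph_laplacian(points, values, bandwidth):
    """Apply an unnormalized graph Laplacian to a function on a point cloud.

    The weight between two points is exp(-(distance / bandwidth)^2), and
    entry i of the result is sum_j w_ij * (values[i] - values[j]).
    """
    n = len(points)
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                weight = exp(-((dist(points[i], points[j]) / bandwidth) ** 2))
                total += weight * (values[i] - values[j])
        result.append(total)
    return result


if __name__ == "__main__":
    xs = [(0.0,), (0.5,), (1.0,), (1.5,), (2.0,)]
    f = [x[0] ** 2 for x in xs]
    for h in (0.25, 0.5, 1.0):
        print(h, apply_graph_laplacian(xs, f, h))
```

Running the usage example with several bandwidths shows how strongly the output depends on this parameter, which is the tuning problem that the oracle inequality addresses.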

We propose a novel pooling approach for shape classification and recognition using the bag-of-words pipeline, based on topological persistence, a recent tool from Topological Data Analysis. Our technique extends the standard max-pooling, which summarizes the distribution of a visual feature with a single number, thereby losing any notion of spatiality. Instead, we propose to use topological persistence, and the derived persistence diagrams, to provide significantly more informative and spatially sensitive characterizations of the feature functions, which can lead to better recognition performance. Unfortunately, despite their conceptual appeal, persistence diagrams are difficult to handle, since they are not naturally represented as vectors in Euclidean space and even the standard metric, the bottleneck distance, is not easy to compute. Furthermore, classical distances between diagrams, such as the bottleneck and Wasserstein distances, do not allow one to build positive definite kernels that can be used for learning. To handle this issue, we provide a novel way to transform persistence diagrams into vectors, in which comparisons are trivial. Finally, we demonstrate the performance of our construction on the Non-Rigid 3D Human Models SHREC 2014 dataset, where we show that topological pooling can provide significant improvements over the standard pooling methods for shape pose recognition within the bag-of-words pipeline.

Given a continuous function

We characterize the class of persistence modules indexed over *block*—i.e. a horizontal band, a vertical band, an upper-right
quadrant, or a lower-left quadrant. Assuming the modules are *pointwise finite-dimensional* (pfd), we show that they are
decomposable into block summands if and only if they satisfy a certain
local property called *exactness*. Our proof follows the same
scheme as the proof of decomposition for pfd persistence modules
indexed over

In collaboration with T. Wanner (George Mason University).

Phase separation mechanisms can produce a variety of complicated and intricate microstructures, which often can be difficult to characterize in a quantitative way. In recent years, a number of novel topological metrics for microstructures have been proposed, which measure essential connectivity information and are based on techniques from algebraic topology. Such metrics are inherently computable using computational homology, provided the microstructures are discretized using a thresholding process. However, while in many cases the thresholding is straightforward, noise and measurement errors can lead to misleading metric values. In such situations, persistence landscapes have been proposed as a natural topology metric. Common to all of these approaches is the enormous data reduction, which passes from complicated patterns to discrete information. It is therefore natural to wonder what type of information is actually retained by the topology. In the present paper, we demonstrate that averaged persistence landscapes can be used to recover central system information in the Cahn-Hilliard theory of phase separation. More precisely, we show that topological information of evolving microstructures alone suffices to accurately detect both concentration information and the actual decomposition stage of a data snapshot. Considering that persistent homology only measures discrete connectivity information, regardless of the size of the topological features, these results indicate that the system parameters in a phase separation process affect the topology considerably more than anticipated. We believe that the methods discussed in this paper could provide a valuable tool for relating experimental data to model simulations.

In collaboration with K. Hess, L. Ran, H. Markram, E. Muller, M. Nolte, M. Reimann, M. Scolamiero, K. Turner (Univ. of Aberdeen, EPFL, Brain and Mind Institute).

A first draft digital reconstruction and simulation of a microcircuit of neurons in the neocortex of a two-week-old rat was recently published. Since graph-theoretical methods may not be sufficient to understand the immense complexity of the network formed by the neurons and their connections, we explored whether application of methods from algebraic topology can provide a novel and useful perspective on the structural and functional organization of the microcircuit. Structural topological analysis revealed that directed graphs representing the connectivity between neurons are significantly different from random graphs and that there exist an enormous number of simplicial complexes of different dimensions representing all-to-all connections within different sets of neurons, the most extreme motif of neuronal clustering reported so far in the brain. Functional topological analysis based on data from simulations confirmed the interest of a new approach to studying the relationship between the structure of the connectome and its emergent functions. In particular, functional responses to different stimuli can readily be distinguished by topological methods. This study represents the first algebraic topological analysis of connectomics data from neural microcircuits and shows promise for general applications in network science.

In collaboration with P. Bubenik (University of Florida).

Topological data analysis provides a multiscale description of the geometry and topology of quantitative data. The persistence landscape is a topological summary that can be easily combined with tools from statistics and machine learning. We give efficient algorithms for calculating persistence landscapes, their averages, and distances between such averages. We discuss an implementation of these algorithms and some related procedures. These are intended to facilitate the combination of statistics and machine learning with topological data analysis. We present an experiment showing that the low-dimensional persistence landscapes of points sampled from spheres (and boxes) of varying dimensions differ.
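For readers unfamiliar with the object, the sketch below evaluates a persistence landscape directly from its definition: each point (b, d) of a persistence diagram contributes a "tent" function max(0, min(t - b, d - t)), and the k-th landscape at t is the k-th largest tent value. This naive evaluation is not the efficient algorithm of the paper, and the function names `landscape` and `average_landscape` are hypothetical.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape (k = 1 is the top layer)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def average_landscape(diagrams, k, grid):
    """Pointwise mean of the k-th landscapes of several diagrams on a grid."""
    return [
        sum(landscape(diagram, k, t) for diagram in diagrams) / len(diagrams)
        for t in grid
    ]


if __name__ == "__main__":
    first = [(0.0, 4.0), (1.0, 3.0)]
    second = [(0.0, 2.0)]
    grid = [0.5 * i for i in range(9)]  # 0.0, 0.5, ..., 4.0
    print([landscape(first, 1, t) for t in grid])
    print(average_landscape([first, second], 1, grid))
```

Because landscapes are ordinary real-valued functions, averaging them pointwise is well defined, which is what makes them easy to combine with standard statistical tools.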

In collaboration with O. Devillers and S. Lazard (Inria Nancy), David Bremner (University of New Brunswick, Canada), Giuseppe Liotta (University of Perugia, Italy), Tamara Mchedlidze (KIT, Germany), Sue Whitesides (University of Victoria, Canada), Stephen Wismath (University of Lethbridge, Canada).

Research contract with GeometryFactory in the context of Mael Rouxel-Labbé's Ph.D. thesis on anisotropic mesh generation.

- Acronym : TopData.

- Type : ANR blanc.

- Title : Topological Data Analysis: Statistical Methods and Inference.

- Coordinator : Frédéric Chazal (DataShape).

- Duration : 4 years starting October 2013.

- Other Partners: Département de Mathématiques (Université Paris Sud), Institut de Mathématiques (Université de Bourgogne), LPMA (Université Paris Diderot), LSTA (Université Pierre et Marie Curie).

- Abstract: TopData aims at designing new mathematical frameworks, models and algorithmic tools to infer and analyze the topological and geometric structure of data in different statistical settings. Its goal is to set up the mathematical and algorithmic foundations of Statistical Topological and Geometric Data Analysis and to provide robust and efficient tools to explore, infer and exploit the underlying geometric structure of various data.

Our conviction, at the root of this project, is that there is a real need to combine statistical and topological/geometric approaches in a common framework, in order to face the challenges raised by the inference and the study of topological and geometric properties of the wide variety of larger and larger available data. We are also convinced that these challenges need to be addressed both from the mathematical side and the algorithmic and application sides. Our project brings together in a unique way experts in Statistics, Geometric Inference and Computational Topology and Geometry. Our common objective is to design new theoretical frameworks and algorithmic tools and thus to contribute to the emergence of a new field at the crossroads of these domains. Beyond the purely scientific aspects we hope this project will help to give birth to an active interdisciplinary community. With these goals in mind we intend to promote, disseminate and make our tools available and useful for a broad audience, including people from other fields.

- See also: http://

Title: Algorithmic Foundations of Geometry Understanding in Higher Dimensions.

Program: FP7.

Type: ERC.

Duration: February 2014 - January 2019.

Coordinator: Inria.

PI: Jean-Daniel Boissonnat.

The central goal of this proposal is to settle the algorithmic foundations of geometry understanding in dimensions higher than 3. We coin the term geometry understanding to encompass a collection of tasks including the computer representation and the approximation of geometric structures, and the inference of geometric or topological properties of sampled shapes. The need to understand geometric structures is ubiquitous in science and has become an essential part of scientific computing and data analysis. Geometry understanding is by no means limited to three dimensions. Many applications in physics, biology, and engineering require a keen understanding of the geometry of a variety of higher dimensional spaces to capture concise information from the underlying often highly nonlinear structure of data. Our approach is complementary to manifold learning techniques and aims at developing an effective theory for geometric and topological data analysis. To reach these objectives, the guiding principle will be to foster a symbiotic relationship between theory and practice, and to address fundamental research issues along three parallel advancing fronts. We will simultaneously develop mathematical approaches providing theoretical guarantees, effective algorithms that are amenable to theoretical analysis and rigorous experimental validation, and perennial software development. We will undertake the development of a high-quality open source software platform to implement the most important geometric data structures and algorithms at the heart of geometry understanding in higher dimensions. The platform will be a unique vehicle towards researchers from other fields and will serve as a basis for groundbreaking advances in scientific computing and data analysis.

Title: Computations And Topological Statistics

International Partner (Institution - Laboratory - Researcher):

Carnegie Mellon University (United States) - Department of Statistics - Larry Wasserman

Start year: 2015

See also: http://

Topological Data Analysis (TDA) is an emergent field attracting interest from various communities, which has recently known academic and industrial successes. Its aim is to identify and infer geometric and topological features of data to develop new methods and tools for data exploration and data analysis. TDA results mostly rely on deterministic assumptions which are not satisfactory from a statistical viewpoint and which lead to a heuristic use of TDA tools in practice. Bringing together the strong expertise of two groups in Statistics (L. Wasserman's group at CMU) and Computational Topology and Geometry (Inria Geometrica), the main objective of CATS is to set up the mathematical foundations of Statistical TDA, to design new TDA methods and to develop efficient and easy-to-use software tools for TDA.

Ramsay Dyer (April and November 2016)

Arijit Ghosh, Indian Statistical Institute, Kolkata (April and November 2016)

Jose Carlos Gomez Larranaga, CIMAT, Guanajuato, Mexico (September 2016)

Kim Jisu, CMU, Pittsburgh, USA (May and December 2016)

Antony Bak, Palantir company, USA (October 2016)

Uday Kusupati, Indian Institute of Technology, Bombay (May-July 2016)

Sandip Banerjee (bourse Charpak), Indian Statistical Institute, Kolkata (March-August 2016)

Sameer Desai, Indian Statistical Institute, Kolkata (October-December 2016)

Steve Oudot and Jérémy Cochoy spent 3 months (Sept.-Nov.) at the Institute for Computational and Experimental Research in Mathematics (ICERM) at Brown University. They were invited there for the semester program entitled *Topology in Motion* (see https://

Jean-Daniel Boissonnat and Frédéric Chazal co-organized the joint GUDHI-TOPDATA workshop in Porquerolles, October 17-20.

Frédéric Chazal co-organized the SMAI-SIGMA Conference 2016 at Luminy (CIRM) in November.

Maks Ovsjanikov: Paper co-chair of the Symposium on Geometry Processing 2016 (SGP 2016).

Frédéric Chazal: Symposium on Geometry Processing 2016 (SGP 2016).

Steve Oudot: Symposium on Geometry Processing 2016 (SGP 2016).

Marc Glisse: Symposium on Computational Geometry (SoCG 2016).

Jean-Daniel Boissonnat is a member of the Editorial Board of *Journal of the ACM*, *Discrete and Computational Geometry*, *International Journal on Computational Geometry and Applications*.

Frédéric Chazal is a member of the Editorial Board of *SIAM Journal on Imaging Sciences*, *Discrete and Computational Geometry (Springer)*, *Graphical Models (Elsevier)*, and *Journal of Applied and Computational Topology (Springer)*.

Steve Oudot is a member of the Editorial Board of *Journal of Computational Geometry*.

Frédéric Chazal, ACCAPT conference, Aalborg, Denmark, April 2016.

Frédéric Chazal, Joint Mathematical Meetings, Seattle, USA, January 2016.

Frédéric Chazal, Séminaire Parisien de Géométrie Algorithmique, Paris, October 2016.

Frédéric Chazal, 9th International Conference of the ERCIM WG on Computational and Methodological Statistics, December 2016.

Steve Oudot, ACCAPT conference, Aalborg, Denmark, April 2016.

Steve Oudot, Workshop SIGMA 2016, CIRM, France, November 2016.

Steve Oudot, Applied Topology Seminar, Brown University, USA, November 2016.

Steve Oudot, Topology and Neuroscience Seminar, Princeton University, USA, November 2016.

Frédéric Chazal was a member of the ANR committee, CES 40 (Mathematics and Computer Science).

Master : Frédéric Chazal, Analyse Topologique des Données, 30h eq-TD, Université Paris-Sud, France.

Master : Jean-Daniel Boissonnat and Marc Glisse, Computational Geometry Learning, 36h eq-TD, M2, MPRI, France.

Doctorat : Frédéric Chazal and Bertrand Michel, An introduction to Topological Data Analysis, 18h eq-TD, Universitat Autònoma de Barcelona, Spain.

Master : Steve Oudot, Topological Data Analysis, 45h eq-TD, M1, École Polytechnique, France.

Master : Steve Oudot and Frédéric Cazals, Geometric Methods for Data Analysis, 30h eq-TD, M1, École Centrale Paris, France.

Master : Jean-Daniel Boissonnat, Winter School on Computational Geometry and Topology, Inria Sophia Antipolis Méditerranée, January 2016.

Doctorat : Steve Oudot, École Mathématique en Afrique on *Topologie différentielle, géométrie algébrique et applications*, La Marsa, Tunisia, March-April 2016.

Doctorat : Steve Oudot, Summer School on Mathematical Methods for High-Dimensional Data Analysis, Technical University of Munich, Germany, July 2016.

PhD: Thomas Bonis, Statistical Learning Algorithms for Geometric and Topological Data Analysis, December 1st, 2016, Frédéric Chazal.

PhD: Mael Rouxel-Labbé, Génération de maillages anisotropes, December 16, 2016, Jean-Daniel Boissonnat.

PhD: Ruqi Huang, Algorithms for topological inference in metric spaces, December 14, 2016, Frédéric Chazal.

PhD in progress: Eddie Aamari, A Statistical Approach of Topological Data Analysis, started September 1st, 2014, Frédéric Chazal (co-advised by Pascal Massart).

PhD in progress: Claire Brécheteau, Statistical aspects of distance-like functions, started September 1st, 2015, Frédéric Chazal (co-advised by Pascal Massart).

PhD in progress: Bertrand Beaufils, Méthodes topologiques et apprentissage statistique pour l’actimétrie du piéton à partir de données de mouvement, started November 2016, Frédéric Chazal (co-advised by Bertrand Michel).

PhD in progress: Mathieu Carrière, Topological signatures for geometric data, started November 1st, 2014, Steve Oudot.

PhD in progress: Jérémy Cochoy, Decomposition and stability of multidimensional persistence modules, started September 1st, 2015, Steve Oudot.

PhD in progress: Nicolas Berkouk, Categorification of topological graph structures, started November 1st, 2016, Steve Oudot.

PhD in progress: Alba Chiara de Vitis, Concentration of measure and clustering.

PhD in progress: Siargey Kachanovich, Approximate algorithms in higher dimensional geometry.

PhD in progress: François Godi, Data structures and algorithms for topological data analysis and high dimensional geometry.

Frédéric Chazal was a member (and reviewer) of the PhD defense committee of Mariia Fodetenkova (Inria Nancy).