Section: New Results

High-dimensional statistical inference

31. Discrete Mean Field Games: Existence of Equilibria and Convergence [12] We consider mean field games with discrete state spaces (called discrete mean field games in the following) and we analyze these games in continuous and discrete time, over finite as well as infinite time horizons. We prove the existence of a mean field equilibrium assuming only continuity of the cost and of the drift. These conditions are more general than those assumed in existing papers on finite state space mean field games. We also study the convergence of the equilibria of N-player games to mean field equilibria in each of our four settings. On the one hand, we define a class of strategies in which any sequence of equilibria of the finite games converges weakly to a mean field equilibrium when the number of players goes to infinity. On the other hand, we exhibit equilibria outside this class that do not converge to mean field equilibria and for which the value of the game does not converge. In discrete time, this non-convergence phenomenon implies that the Folk theorem does not scale to the mean field limit.

32. Modularity-based Sparse Soft Graph Clustering [32] Clustering is a central problem in machine learning for which graph-based approaches have proven their efficiency. In this paper, we study a relaxation of the modularity maximization problem, which is well known in the graph partitioning literature. A solution of this relaxation assigns to each element of the dataset a probability of belonging to a given cluster, whereas a solution of the standard modularity problem is a partition. We introduce an optimization algorithm to solve this relaxation that is both memory efficient and local. Furthermore, we prove that our method includes, as a special case, the Louvain optimization scheme, a state-of-the-art technique for solving the traditional modularity problem. Experiments on both synthetic and real-world data illustrate that our approach provides meaningful information on various types of data.
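
To make the difference between a partition and a soft assignment concrete, the sketch below evaluates a relaxed modularity in which the indicator "i and j share a cluster" is replaced by the inner product of their membership vectors. This is one common form of the relaxation and is an assumption here; the exact objective and the optimization algorithm of the paper are not reproduced. The function name `soft_modularity` and the toy graph are hypothetical.

```python
def soft_modularity(adj, memberships):
    """Relaxed modularity of a soft clustering (illustrative form, assumed).

    adj         : symmetric 0/1 adjacency matrix as a list of lists
    memberships : memberships[i][k] = probability that node i is in cluster k
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    total = sum(degrees)  # equals twice the number of edges
    score = 0.0
    for i in range(n):
        for j in range(n):
            # For one-hot vectors this inner product is the usual indicator.
            overlap = sum(a * b for a, b in zip(memberships[i], memberships[j]))
            score += (adj[i][j] - degrees[i] * degrees[j] / total) * overlap
    return score / total


# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

hard = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3   # a partition, written as one-hot rows
uniform = [[0.5, 0.5]] * 6                   # every node split evenly

q_hard = soft_modularity(adj, hard)
q_uniform = soft_modularity(adj, uniform)
```

With one-hot rows the function reduces to the standard modularity of the partition, which is why the hard problem appears as a special case of the relaxed one.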

33. Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models [41] We consider generalized linear models where an unknown $n$-dimensional signal vector is observed through the successive application of a random matrix and a non-linear (possibly probabilistic) componentwise function. We consider the models in the high-dimensional limit, where the observation consists of $m$ points and $m/n \to \alpha$, with $\alpha$ remaining finite as $m, n \to \infty$. This situation is ubiquitous in applications ranging from supervised machine learning to signal processing. A substantial amount of work suggests that both the inference and learning tasks in these problems have sharp intrinsic limitations when the available data become too scarce or too noisy. Here, we provide rigorous asymptotic predictions for these thresholds through the proof of a simple expression for the mutual information between the observations and the signal. As a consequence of this expression, we also obtain the optimal value of the generalization error in many statistical learning models of interest, such as the teacher-student binary perceptron, and we introduce several new models with remarkable properties. We compute these thresholds (or "phase transitions") using ideas from statistical physics that are turned into rigorous methods thanks to a powerful new smart-path interpolation technique called the stochastic interpolation method, which has recently been introduced by two of the authors. Moreover, we show that a polynomial-time algorithm referred to as generalized approximate message-passing reaches the optimal generalization performance for a large set of parameters in these problems. Our results clarify the difficulties and challenges one has to face when solving complex high-dimensional statistical problems.
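
As an illustration of the observation model, the sketch below draws one instance of a teacher-student binary perceptron with $m = \alpha n$ observations. Several choices are assumptions made for the example and are not stated in the abstract: the i.i.d. standard Gaussian matrix, the $1/\sqrt{n}$ scaling, the uniform $\pm 1$ prior on the signal, and the sign output function. The name `sample_perceptron` is hypothetical.

```python
import math
import random


def sample_perceptron(n, alpha, seed=0):
    """Draw (teacher, matrix, labels) for a binary perceptron with m = alpha * n.

    Assumed for illustration: Gaussian matrix entries, +/-1 teacher weights,
    and the componentwise output function sign(.).
    """
    rng = random.Random(seed)
    m = int(round(alpha * n))
    teacher = [rng.choice((-1, 1)) for _ in range(n)]
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    labels = []
    for row in matrix:
        # Random linear projection of the signal, rescaled to stay of order one.
        z = sum(a * x for a, x in zip(row, teacher)) / math.sqrt(n)
        # Componentwise non-linearity applied to the projection.
        labels.append(1 if z >= 0 else -1)
    return teacher, matrix, labels


teacher, matrix, labels = sample_perceptron(n=50, alpha=1.5, seed=0)
```

The inference task is then to recover `teacher` from `matrix` and `labels`, and the learning task is to predict the label of a fresh row; the ratio `alpha` controls how much data is available per unknown.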

34. Efficient inference in stochastic block models with vertex labels [18] We study the stochastic block model with two communities where vertices carry side information in the form of a vertex label. These vertex labels may have arbitrary distributions that depend on the community memberships. We analyze a version of the popular belief propagation (BP) algorithm. We show that this algorithm achieves the highest accuracy possible whenever a certain function of the network parameters has a unique fixed point. When this function has multiple fixed points, belief propagation may not perform optimally, and we conjecture that a non-polynomial time algorithm may perform better than BP. We show that increasing the information in the vertex labels may reduce the number of fixed points and hence lead to optimality of belief propagation.

35. Planting trees in graphs, and finding them back [36] In this paper we study detection and reconstruction of planted structures in Erdős-Rényi random graphs. Motivated by a problem of communication security, we focus on planted structures that consist of a tree graph. For planted line graphs, we establish the following phase diagram. In a low-density region where the average degree $\lambda$ of the initial graph is below some critical value $\lambda_c = 1$, detection and reconstruction go from impossible to easy as the line length $K$ crosses some critical value $f(\lambda)\ln(n)$, where $n$ is the number of nodes in the graph. In the high-density region $\lambda > \lambda_c$, detection goes from impossible to easy as $K$ goes from $o(\sqrt{n})$ to $\omega(\sqrt{n})$, and reconstruction remains impossible so long as $K = o(n)$. For $D$-ary trees of varying depth $h$ and $2 \le D \le O(1)$, we identify a low-density region $\lambda < \lambda_D$ such that the following holds. There is a threshold $h^* = g(D)\ln(\ln(n))$ with the following properties. Detection goes from feasible to impossible as $h$ crosses $h^*$. We also show that only partial reconstruction is feasible at best for $h \ge h^*$. We conjecture that a picture similar to that for lines holds for $D$-ary trees in the high-density region $\lambda > \lambda_D$, but confirm only the following part of this picture: detection is easy for $D$-ary trees of size $\omega(\sqrt{n})$, while at best only partial reconstruction is feasible for $D$-ary trees of any size $o(n)$. These results contrast with the corresponding picture for detection and reconstruction of low-rank planted structures, such as dense subgraphs and block communities: we observe a discrepancy between detection and reconstruction, the latter being impossible for a wide range of parameters where detection is easy. This property does not hold for previously studied low-rank planted structures.

36. Robustness of spectral methods for community detection [37] This work is concerned with community detection. Specifically, we consider a random graph drawn according to the stochastic block model: its vertex set is partitioned into blocks, or communities, and edges are placed randomly and independently of each other with a probability that depends only on the communities of their two endpoints. In this context, our aim is to recover the community labels better than random guessing would, based only on the observation of the graph.
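
The generative model described above can be sketched as follows. The uniform random assignment of vertices to communities, the function name `sample_sbm`, and the numerical values are assumptions chosen for the example; the abstract only specifies that each edge appears independently with a probability determined by the communities of its endpoints.

```python
import random


def sample_sbm(n, probs, seed=0):
    """Sample a stochastic block model graph.

    probs[r][s] is the probability of an edge between a vertex of community r
    and a vertex of community s (assumed symmetric). Community labels are
    drawn uniformly at random, which is an assumption of this sketch.
    Returns (labels, neighbors), where neighbors[i] is a set of vertex indices.
    """
    rng = random.Random(seed)
    k = len(probs)
    labels = [rng.randrange(k) for _ in range(n)]
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is examined once, independently of all other pairs.
            if rng.random() < probs[labels[i]][labels[j]]:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return labels, neighbors


# Sparse regime: probabilities of order 1/n (here a/n within, b/n across).
n, a, b = 200, 5.0, 1.0
labels, neighbors = sample_sbm(n, [[a / n, b / n], [b / n, a / n]], seed=0)
```

A community detection method only sees `neighbors`; `labels` is kept aside to measure how much better than chance the recovered labels are.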

In the sparse case, where edge probabilities are in $O(1/n)$, we introduce a new spectral method based on the distance matrix $D^{(\ell)}$, where $D^{(\ell)}_{ij} = 1$ iff the graph distance between $i$ and $j$, denoted $d(i,j)$, is equal to $\ell$. We show that when $\ell \sim c\log(n)$ for carefully chosen $c$, the eigenvectors associated with the largest eigenvalues of $D^{(\ell)}$ provide enough information to perform non-trivial community recovery with high probability, provided we are above the so-called Kesten-Stigum threshold. This yields an efficient algorithm for community detection, since the matrix $D^{(\ell)}$ can be computed in $O(n^{1+\kappa})$ operations for a small constant $\kappa$.
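
The distance matrix defined above can be built with one truncated breadth-first search per vertex, as in the minimal sketch below. The function name `distance_matrix` and the adjacency-list input format are assumptions; the matrix is stored densely for readability, which a scalable implementation would replace with a sparse representation. The eigenvector step is not shown and would typically rely on a numerical linear algebra library.

```python
from collections import deque


def distance_matrix(neighbors, ell):
    """Return D with D[i][j] = 1 iff the graph distance d(i, j) equals ell.

    neighbors[i] is an iterable of the vertices adjacent to i.
    """
    n = len(neighbors)
    d_ell = [[0] * n for _ in range(n)]
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == ell:
                continue  # vertices farther than ell are never needed
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, d in dist.items():
            if d == ell:
                d_ell[source][v] = 1
    return d_ell


# Path graph 0 - 1 - 2 - 3 - 4: pairs at distance exactly 2 are (0,2), (1,3), (2,4).
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
d2 = distance_matrix(path, 2)
```

Stopping each search at depth `ell` means that only the `ell`-neighbourhood of a vertex is explored, which stays small in a sparse graph when `ell` is a small multiple of `log(n)`.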

We then study the sensitivity of the eigendecomposition of $D^{(\ell)}$ when we allow an adversarial perturbation of the edges of $G$. We show that when the considered perturbation does not affect more than $O(n^{\varepsilon})$ vertices for some small $\varepsilon > 0$, the highest eigenvalues and their corresponding eigenvectors incur negligible perturbations, which allows us to still perform efficient recovery.

Our proposed spectral method therefore: i) is robust to larger perturbations than prior spectral methods, while semi-definite programming (or SDP) methods can tolerate yet larger perturbations; ii) achieves non-trivial detection down to the KS threshold, which is conjectured to be optimal and is beyond reach of existing SDP approaches; iii) is faster than SDP approaches.