CELESTE - 2019 - Annual Activity Report

Section: New Results

Optimal pair-matching

The sequential pair-matching problem appears in many applications (in particular on the internet) where one wants to discover good matches between pairs of individuals sequentially, under a given budget. C. Giraud, Y. Issartel, L. Lehéricy and M. Lerasle formulate this problem as a special bandit problem on graphs [23]. Formally, the set of individuals is represented by the nodes of a graph whose edges, unobserved at first, represent the potential good matches. The algorithm queries pairs of nodes and observes the presence or absence of an edge between them. Its goal is to discover as many edges as possible with a fixed budget of queries. Pair-matching is a particular instance of the multi-armed bandit problem in which the arms are pairs of individuals and the rewards are the edges linking these pairs. This bandit problem is non-standard, however, since each arm can be played only once.
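The query model described above can be sketched in a few lines of Python. This is only an illustration of the setting, not the algorithm of [23]: the class name `PairMatchingEnv`, the function `uniform_policy`, and the uniform-at-random query rule are assumptions introduced here as a naive baseline.

```python
import itertools
import random


class PairMatchingEnv:
    """Hidden graph: an arm is a pair of nodes, its reward is 1 if the pair is an edge.

    Hypothetical helper for illustration; each pair can be queried at most once.
    """

    def __init__(self, n_nodes, edges):
        self.n_nodes = n_nodes
        self._edges = {frozenset(e) for e in edges}  # unobserved by the learner
        self.queried = set()
        self.discovered = 0

    def query(self, i, j):
        pair = frozenset((i, j))
        if i == j or pair in self.queried:
            raise ValueError("each pair of distinct nodes can be queried only once")
        self.queried.add(pair)
        reward = int(pair in self._edges)
        self.discovered += reward
        return reward


def uniform_policy(env, budget, rng):
    """Naive baseline (an assumption, not from [23]): query pairs uniformly at random."""
    pairs = list(itertools.combinations(range(env.n_nodes), 2))
    rng.shuffle(pairs)
    for i, j in pairs[:budget]:
        env.query(i, j)
    return env.discovered
```

Calling `uniform_policy(PairMatchingEnv(n, edges), budget, random.Random(0))` returns the number of edges discovered within the budget. Because this baseline ignores any structure in the graph, the number of edges it finds is, on average, proportional to the overall edge density.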

Given this last constraint, sublinear regret can be expected only if the graph has some underlying structure. The authors show in [23] that sublinear regret is achievable when the graph is generated according to a Stochastic Block Model (SBM) with two communities. Optimal regret bounds are computed for this pair-matching problem. They exhibit a phase transition related to the Kesten-Stigum threshold for community detection in SBMs. In practice, it is meaningful to constrain each node to be sampled fewer than a given number of times, in order to avoid concentrating the queries on a small set of individuals. This setting is more challenging on both the statistical and the algorithmic side. Optimal rates are also derived in this context, showing how the regret deteriorates under this constraint.
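To make the two-community setting and the per-node constraint concrete, here is a minimal sketch. It is not the procedure analysed in [23]: the function names, the parameters `p_in`, `p_out` and `max_per_node`, and the "oracle" that already knows the community labels are all assumptions made for illustration. The sketch also assumes an assortative model (`p_in > p_out`) and reads the constraint as "at most `max_per_node` queries per node".

```python
import itertools
import random


def sample_two_community_sbm(n, p_in, p_out, rng):
    """Draw labels in {0, 1} and edges with probability p_in (same label) or p_out."""
    labels = [rng.randrange(2) for _ in range(n)]
    edges = set()
    for i, j in itertools.combinations(range(n), 2):
        prob = p_in if labels[i] == labels[j] else p_out
        if rng.random() < prob:
            edges.add((i, j))
    return labels, edges


def oracle_capped_queries(labels, edges, budget, max_per_node):
    """Query within-community pairs first, using each node at most max_per_node times.

    The oracle knows the labels; a real learner would have to estimate them.
    """
    n = len(labels)
    uses = [0] * n
    found = spent = 0
    # Stable sort: within-community pairs (key False) come before cross pairs.
    pairs = sorted(
        itertools.combinations(range(n), 2),
        key=lambda ij: labels[ij[0]] != labels[ij[1]],
    )
    for i, j in pairs:
        if spent == budget:
            break
        if uses[i] >= max_per_node or uses[j] >= max_per_node:
            continue  # skip pairs that would exceed a node's cap
        uses[i] += 1
        uses[j] += 1
        spent += 1
        found += (i, j) in edges
    return found, spent, uses
```

A tighter `max_per_node` forces the queries to be spread over more nodes, which limits how much the knowledge gathered about any single node can be exploited; this is the intuition behind the deterioration of the regret mentioned above.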
