Local and adaptive mirror descents in extensive-form games

FAIRPLAY Coopetitive AI: Fairness, Privacy, Incentives

Optimization, machine learning and statistical methods

Applied Mathematics, Computation and Simulation

http://team.inria.fr/fairplay

Centre de Recherche en Economie et Statistique, Institut Polytechnique de Paris, Criteo

Project-Team

A4.8. - Privacy-enhancing technologies
A8.11. - Game Theory
A9.2. - Machine learning
A9.9. - Distributed AI, Multi-agent
B9.9. - Ethics
B9.10. - Privacy

Inria Saclay Centre at Institut Polytechnique de Paris

Researchers:
Patrick Loiseau: Team leader, INRIA, Researcher, yes
Marc Abeille: Criteo, Industrial member
Benjamin Heymann: Criteo, Industrial member
Hugo Richard: Criteo, Industrial member
Maxime Vono: Criteo, Industrial member

Faculty members:
Vianney Perchet: Team leader, Criteo & ENSAE, Professor, yes
Cristina Butucea: ENSAE, Professor, yes
Matthieu Lerasle: ENSAE, Professor, yes

Post-doctoral fellows:
Dorian Baudry: ENSAE
Simon Finster: INRIA, Post-Doctoral Fellow, from Oct 2023
Simon Finster: CNRS, Post-Doctoral Fellow, until Sep 2023
Solenne Gaucher: ENSAE, from Sep 2023
Nadav Merlis: ENSAE
Denis Sokolov: INRIA, Post-Doctoral Fellow, from Dec 2023

PhD students:
Ziyad Benomar: ENSAE
Maria Cherifa: Criteo, CIFRE
Lorenzo Croissant: ENSAE, ATER
Hafedh El Ferchichi: ENSAE
Côme Fiegel: ENSAE
Mike Liu: ENSAE
Mathieu Molina: INRIA
Matilde Tullii: ENSAE

Technical staff:
Felipe Garrido Lucero: INRIA, Engineer
Sruthi Gorantla: INRIA, Engineer, from Jun 2023 until Aug 2023

Interns:
Reda Jalal: INRIA, Intern, from May 2023 until Nov 2023
Giovanni Montanari: ENSAE, Intern, from Apr 2023 until Sep 2023
Nicolas Noldus: INRIA, Intern, from Jun 2023 until Jul 2023

A:
Melanie Da Silva: Inria, from May 2023

External collaborators:
Clément Calauzènes: Criteo
Remi Castera: Univ. Grenoble Alpes
Julien Combe: Ecole Polytechnique, yes
Azadeh Khaleghi: ENSAE
Jaouad Mourtada: ENSAE

Overall objectives

Broad context

One of the principal objectives of Machine Learning (ML) is to automatically discover using past data some underlying structure behind a data generating process in order either to explain past observations or, perhaps more importantly, to make predictions and/or to optimize decisions made on future instances. The area of ML has exploded over the past decade and has had a tremendous impact in many application domains such as computer vision or bioinformatics.

Most of the current ML literature focuses on the case of a single agent (an algorithm) trying to complete some learning task based on gathered data that follows an exogenous distribution independent of the algorithm. One of the key assumptions is that this data has sufficient “regularity” for classical techniques to work. This classical paradigm of “a single agent learning on nice data”, however, is no longer adequate for many practical and crucial tasks that involve users (who own the gathered data) and/or other (learning) agents that are also trying to optimize their own objectives simultaneously, in a competitive or conflicting way. This is the case, for instance, in most learning tasks related to Internet applications (content recommendation/ranking, ad auctions, fraud detection, etc.). Moreover, as such learning tasks rely on users' personal data and as their outcomes affect users in return, it is no longer sufficient to focus on optimizing prediction performance metrics: it becomes crucial to consider societal and ethical aspects such as fairness or privacy.

The field of single agent ML builds on techniques from domains such as statistics, optimization, or functional analysis. When different agents are involved, a strategic aspect inherent in game theory enters the picture. Indeed, interactions—either positive or negative—between rational entities (firms, single user at home, algorithms, etc.) foster individual strategic behavior such as hiding information, misleading other agents, free-riding, etc. Unfortunately, this selfishness degrades the quality of the data or of the predictions, prevents efficient learning and overall may diminish the social welfare. These strategic aspects, together with the decentralized nature of decision making in a multi-agent environment, also make it harder to build algorithms that meet fairness and privacy constraints.

The overarching objective of FAIRPLAY is to create algorithms that learn for and with users, along with techniques to analyze them; that is, to create procedures able to perform classical learning tasks (prediction, decision, explanation) when the data is generated or provided by strategic agents, possibly in the presence of other competing learning agents, while respecting the fairness and privacy of the involved users. To that end, we will naturally rely on multi-agent models where the different agents may be either agents generating or providing data, or agents learning in a way that interacts with other agents; and we will put a special focus on societal and ethical aspects, in particular fairness and privacy. Note that in FAIRPLAY, we focus on the technical challenges inherent to mathematically formalizing and respecting ethical properties such as non-discrimination or privacy, often seen as constraints in the learning procedure. Nevertheless, throughout the team's life, we will reflect on these mathematical definitions for the particular applications studied, in particular their philosophical roots and legal interpretation, through interactions with HSS researchers and with legal specialists (from Criteo).

Multi-agent systems

Any company developing and implementing ML algorithms is in fact one agent within a large network of users and other firms. Assuming that the data is i.i.d. and can be treated irrespective of the environment's response (as is done in the classical ML paradigm) might be a good first approximation, but it is a limitation that should be overcome. Users, clients, suppliers, and competitors are adaptive and change their behavior depending on each other's interactions. The future of many ML companies, such as Criteo, will consist in creating platforms matching the demand (created by their users) to the offer (proposed by their clients), under the system constraints (imposed by suppliers and competitors). Each of these agents has different, conflicting interests that should be taken into account in the model, which naturally becomes a multi-agent model.

Each agent in a multi-agent system may be modeled as having their own utility function $u_{i}$ that can depend on the actions of other agents. Then, there are two main types of objectives: individual or collective 99. If each agent is making their own decision, then they can be modeled as each optimizing their own individual utility (which may include personal benefit as well as other considerations such as altruism where appropriate) unilaterally and in a decentralized way. This is why a mechanism providing correct incentives to agents is often necessary. At the other extreme, social welfare is the collective objective defined as the sum of the utilities of all agents. To optimize it, it is almost always necessary to consider a centralized optimization or learning protocol. A key question in multi-agent systems is to quantify the “social cost” of letting agents optimize their own utility by unilaterally choosing their decision, compared to the decision maximizing social welfare; this is often measured by the “price of anarchy”/“price of stability” 108: the ratio of the maximum social welfare to the social welfare of the (worst/best) equilibrium reached when agents optimize individually.
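As a small illustration of these two ratios, the following sketch enumerates the pure Nash equilibria of a two-player stag-hunt game and computes the price of anarchy and the price of stability. The payoff numbers and all function names are illustrative assumptions, not taken from the cited works.

```python
from itertools import product

# Hypothetical stag-hunt payoffs: PAYOFFS[(a1, a2)] = (u_1, u_2).
PAYOFFS = {
    ("stag", "stag"): (4, 4),
    ("stag", "hare"): (0, 3),
    ("hare", "stag"): (3, 0),
    ("hare", "hare"): (3, 3),
}
ACTIONS = ("stag", "hare")


def welfare(profile):
    """Social welfare: sum of the utilities of all agents."""
    return sum(PAYOFFS[profile])


def is_pure_nash(profile):
    """True if no player gains by deviating unilaterally."""
    a1, a2 = profile
    u1, u2 = PAYOFFS[profile]
    no_dev_1 = all(PAYOFFS[(d, a2)][0] <= u1 for d in ACTIONS)
    no_dev_2 = all(PAYOFFS[(a1, d)][1] <= u2 for d in ACTIONS)
    return no_dev_1 and no_dev_2


profiles = list(product(ACTIONS, repeat=2))
equilibria = [p for p in profiles if is_pure_nash(p)]
optimum = max(welfare(p) for p in profiles)

# Worst equilibrium gives the price of anarchy, best one the price of stability.
price_of_anarchy = optimum / min(welfare(p) for p in equilibria)
price_of_stability = optimum / max(welfare(p) for p in equilibria)
```

In this toy game both (stag, stag) and (hare, hare) are equilibria, so the two ratios differ: the best equilibrium is socially optimal while the worst one is not.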

The natural language to model and study multi-agent systems is game theory—see below for a list of tools and techniques on which FAIRPLAY relies, game theory being the first of them. Multi-agent systems have been studied in the past; but not with a focus on learning systems where agents are either learning or providing data, which is our focus in FAIRPLAY and leads to a blend of game theory and learning techniques. We note here again that, wherever appropriate, we shall reflect (in part together with colleagues from HSS) on the soundness of the utility framework for the considered applications.

Societal aspects and ethics

There are several important ethical aspects that must be investigated in multi-agent systems involving users either as data providers or as individuals affected by the ML agent decision (or both).

Fairness and Discrimination

When ML decisions directly affect humans, it is important to ensure that they do not violate fairness principles, be they based on ethical or legal grounds. As ML made its way into many areas of decision making, it was unfortunately repeatedly observed that it can lead to discrimination (regardless of whether or not it is intentional) based on gender, race, age, or other sensitive attributes. This was observed in online targeted advertisement 93, 119, 33, 77, 35, but also in many other applications such as hiring 64, data-driven healthcare 71, or justice 94. Biases also have the unfortunate tendency to reinforce themselves. An operating multi-agent learning system should be able, in the long run, to get rid of inherent population biases by itself, that is, to be fair amongst users irrespective of the improperly constructed dataset.

The mathematical formulation of fairness has been debated in recent works. Although a few initial works proposed a notion of individual fairness, which mandates that “similar individuals” receive “similar outcomes” 66, this notion was quickly found impractical because it relies on a metric to define closeness that makes the definition somewhat arbitrary. Most of the works then focused on notions of group fairness, which mandate equality of outcome “on average” across different groups defined by sensitive attributes (e.g., race, gender, religious belief, etc.). Most of the works on group fairness focus on the classification problem (e.g., classifying whether a job applicant is good or not for the job) where each data example $(X_{i}, Y_{i})$ contains a set of features $X_{i}$ and a true label $Y_{i} \in \{0, 1\}$ and the goal is to make a prediction ${\hat{Y}}_{i}$ based on the features $X_{i}$ that has a high probability to be equal to the true label. Assuming that there is a single sensitive attribute $s_{i}$ that can take two values $a$ or $b$, this defines two groups: those for whom $s_{i} = a$ and those for whom $s_{i} = b$. There are several different concepts of group fairness that can be considered; we shall especially focus on demographic parity (DP), which prescribes $P ({\hat{Y}}_{i} = 1 | s_{i} = a) = P ({\hat{Y}}_{i} = 1 | s_{i} = b)$, and equal opportunity (EO) 78, which mandates that $P ({\hat{Y}}_{i} = 1 | s_{i} = a, Y_{i} = 1) = P ({\hat{Y}}_{i} = 1 | s_{i} = b, Y_{i} = 1)$.
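The two group fairness notions above can be checked empirically by comparing conditional positive rates. The sketch below is a minimal illustration on a made-up dataset; the function names and the data are assumptions introduced for the example, and probabilities are replaced by empirical frequencies.

```python
def positive_rate(y_hat, mask):
    """Empirical P(Y_hat = 1) over the examples where mask is True."""
    selected = [p for p, m in zip(y_hat, mask) if m]
    return sum(selected) / len(selected)


def demographic_parity_gap(y_hat, s):
    """|P(Y_hat=1 | s=a) - P(Y_hat=1 | s=b)|; zero means DP holds."""
    rate_a = positive_rate(y_hat, [g == "a" for g in s])
    rate_b = positive_rate(y_hat, [g == "b" for g in s])
    return abs(rate_a - rate_b)


def equal_opportunity_gap(y_hat, y, s):
    """Same comparison, restricted to examples with true label Y = 1."""
    rate_a = positive_rate(y_hat, [g == "a" and t == 1 for g, t in zip(s, y)])
    rate_b = positive_rate(y_hat, [g == "b" and t == 1 for g, t in zip(s, y)])
    return abs(rate_a - rate_b)


# Hypothetical predictions, true labels, and sensitive attribute values.
y_hat = [1, 0, 1, 1, 0, 1, 0, 0]
y = [1, 0, 1, 0, 1, 1, 0, 1]
s = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp_gap = demographic_parity_gap(y_hat, s)    # 3/4 versus 1/4
eo_gap = equal_opportunity_gap(y_hat, y, s)  # 2/2 versus 1/3
```

A gap of zero corresponds to exact DP (respectively EO) on the sample; in practice one would typically tolerate a small non-zero gap.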

The fair classification literature proposed, for each of these fairness notions, ways to train fair classifiers based on three main ideas: pre-processing 127, in-processing 125, 126, 122, and post-processing 78. All of these works, however, focus on idealized situations where a single decision-maker has access to ground truth data with the sensitive features and labels in order to train classifiers that respect fairness constraints. We use similar group fairness definitions and extend them (in particular through causality), but our goal is to go further in terms of algorithms by modeling practical scenarios with multiple decision-makers and incomplete information (in particular lack of ground truth on the labels).

Privacy vs. Incentives

ML algorithms, in particular in Internet applications, often rely on users' personal information (whether it is directly their personal data or indirectly some hidden “type” – gender, ethnicity, behaviors, etc.). Nevertheless, users may be willing to provide their personal information if it increases their utility. This brings a number of key questions. First, how can we learn while protecting users' privacy (and how should privacy even be defined)? Second, finding the right balance between those two a-priori incompatible concepts is challenging; how much (and even simply how) should an agent be compensated for providing useful and accurate data?

Differential privacy is the most widely used private learning framework 65, 67, 114 and ensures that the output of an algorithm does not significantly depend on a single element of the whole dataset. These privacy constraints are often too strong for economic applications (as illustrated before, it is sometimes optimal to disclose some private information). $f$-divergence privacy costs have thus been proposed in recent literature as a promising alternative 57. These $f$-divergences, such as Kullback-Leibler (KL), are also used by economists to measure the cost of information from a Bayesian perspective, as in the rational inattention literature 118, 101, 96. It was only recently that this approach was considered to measure “privacy losses” in economic mechanisms 68. In this model, the mechanism designer has some prior belief on the unobserved and private information. After observing the player's action, this belief is updated and the cost of information corresponds to the KL divergence between the prior and posterior distributions of this private information.
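The Bayesian privacy cost described above can be sketched in a few lines for a finite set of private types. The direction of the divergence, KL(posterior || prior), is an assumption made for this example, as are the numerical values and function names.

```python
import math


def posterior(prior, likelihood):
    """Bayes update: likelihood[k] = P(observed action | type k)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions on the same finite set."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical example with two private types and a uniform prior.
prior = [0.5, 0.5]

# An action that type 0 plays much more often than type 1 is revealing.
revealing = posterior(prior, [0.9, 0.1])
privacy_cost = kl_divergence(revealing, prior)

# An action played equally often by both types reveals nothing.
uninformative = posterior(prior, [0.5, 0.5])
zero_cost = kl_divergence(uninformative, prior)
```

The cost is zero exactly when the observed action leaves the designer's belief unchanged, and grows as the action becomes more informative about the private type.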

This privacy concept can be refined down to the level of a single user, into the so-called local differential privacy. Informally speaking, the algorithm's output may depend on a single user's data, which must nevertheless be kept private. Estimation is sometimes more challenging under this constraint, i.e., estimation rates degrade 115, 52, 53, but this notion is sometimes better suited to handling user-generated data 73.
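To make the local setting concrete, the sketch below uses randomized response, a standard mechanism satisfying local differential privacy for a single bit (it is chosen here for illustration and is not claimed to be the mechanism studied in the cited works). Each user perturbs their own bit before sending it, and the analyst debiases the aggregate. The population size, the privacy level `epsilon`, and the function names are assumptions.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def debiased_mean(reports, epsilon):
    """Unbiased estimate of the true mean from the noisy reports."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[report] = (2 p - 1) * mean + (1 - p), solved for the mean.
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


rng = random.Random(0)
epsilon = 1.0
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(50000)]
reports = [randomized_response(b, epsilon, rng) for b in true_bits]

true_mean = sum(true_bits) / len(true_bits)
estimate = debiased_mean(reports, epsilon)
```

The debiasing step divides by `2 * p_truth - 1`, which shrinks as `epsilon` decreases; this inflates the variance of the estimate and gives a simple view of why estimation rates degrade under local privacy.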

Interestingly, we note that the notions of privacy and fairness are somewhat incompatible. This will motivate Theme 2 developed in our research program.

A large variety of tools and techniques

Analyzing multi-agent learning systems with ethical constraints will require us to use, develop, and merge several different theoretical tools and techniques. We describe the main ones here. Note that although FAIRPLAY is motivated by practical use-cases and applications, part of the team's objectives is to improve those tools as necessary to tackle the problems studied.

Game theory and economics

Game theory 72 is the natural mathematical tool to model multiple interacting decision-makers (called players). A game is defined by a set of players, a set of possible actions for each player, and a payoff function for each player that can depend on the actions of all the players (that is the distinguishing feature of a game compared to an optimization problem). The most standard solution concept is the so-called Nash equilibrium, which is defined as a strategy profile (i.e., a collection of possibly randomized actions, one for each player) such that each player is at best response (i.e., has the maximum payoff given the others' strategies). It is a “static” (one-shot) solution concept, but there also exist dynamic solution concepts for repeated games 56, 103.
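The best-response condition can be verified directly for a two-player game given by payoff matrices. The sketch below checks whether a pair of randomized strategies is a Nash equilibrium by comparing each player's expected payoff to the best pure deviation (which suffices, since expected payoffs are linear in a player's own strategy). The game, matching pennies, and the function names are illustrative assumptions.

```python
def expected_payoff(matrix, x, y):
    """Expected payoff when row player mixes with x and column player with y."""
    return sum(
        x[i] * matrix[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def is_nash(A, B, x, y, tol=1e-9):
    """A and B are the row and column players' payoff matrices."""
    u_row = expected_payoff(A, x, y)
    u_col = expected_payoff(B, x, y)
    # Best payoff each player could get with a pure deviation.
    best_row = max(
        sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))
    )
    best_col = max(
        sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y))
    )
    return best_row <= u_row + tol and best_col <= u_col + tol


# Matching pennies: the row player wins when both actions match.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]

uniform_is_nash = is_nash(A, B, [0.5, 0.5], [0.5, 0.5])
pure_is_nash = is_nash(A, B, [1.0, 0.0], [1.0, 0.0])
```

Here the uniform profile passes the check while the pure profile does not, because the column player would gain by switching actions.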

Online and reinforcement learning 49

In online learning (a.k.a. multi-armed bandit 50, 110), data is gathered and treated on the fly. For instance, consider an online binary classification problem. Some unlabelled data $X_{t} \in \mathbb{R}^{d}$ is observed, and the agent predicts its label $Y_{t}$; let us denote by ${\hat{Y}}_{t} \in \{\pm 1\}$ the prediction. The agent potentially observes the loss $\mathbb{1}\{Y_{t} \neq {\hat{Y}}_{t}\}$ and then receives a new unlabelled data example $X_{t + 1}$. In that specific problem, the typical learning objective is to perform asymptotically as well as the best classifier $f^{*}$ in some given class $ℱ$, i.e., such that the loss $\sum_{t = 1}^{T} \mathbb{1}\{Y_{t} \neq {\hat{Y}}_{t}\}$ is $o (T)$-close to $\min_{f \in ℱ} \sum_{t = 1}^{T} \mathbb{1}\{Y_{t} \neq f (X_{t})\}$; the difference between those terms is called regret. The more general model with an underlying state of the world $S_{t} \in 𝒮$ that evolves at each step following some Markov Decision Process (MDP, i.e., the transition matrix from $S_{t}$ to $S_{t + 1}$ depends on the actions of the agent) and impacts the loss function is called reinforcement learning (RL). RL is an incredibly powerful learning technique, provided enough data are available since learning is usually quite slow. This is why the recent successes involve settings with heavy simulations (like games) or well-understood physical systems (like robots).
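The regret defined above can be computed explicitly on a toy instance. The sketch below assumes one-dimensional features, a finite class of threshold classifiers, and a simple follow-the-leader learner; all of these are illustrative choices, and follow-the-leader is used only because it is short, not because it enjoys good regret guarantees in general.

```python
# Hypothetical finite class: f_theta(x) = +1 if x >= theta, else -1.
THRESHOLDS = [0.0, 0.5, 1.0]


def classify(theta, x):
    return 1 if x >= theta else -1


def follow_the_leader_regret(xs, ys):
    """Learner's cumulative 0-1 loss minus that of the best fixed classifier."""
    cumulative = [0] * len(THRESHOLDS)  # loss of each classifier so far
    learner_loss = 0
    for x, y in zip(xs, ys):
        # Predict with the classifier that has the smallest past loss
        # (ties are broken in favour of the first one).
        leader = min(range(len(THRESHOLDS)), key=lambda k: cumulative[k])
        learner_loss += int(classify(THRESHOLDS[leader], x) != y)
        # The label is revealed only after the prediction is made.
        for k, theta in enumerate(THRESHOLDS):
            cumulative[k] += int(classify(theta, x) != y)
    return learner_loss - min(cumulative)


xs = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8]
ys = [-1, 1, -1, 1, -1, 1]
regret = follow_the_leader_regret(xs, ys)
```

On this sequence the threshold 0.5 makes no mistake, and the learner errs only in the first round, before any label has been observed.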

These techniques will be central to our approach as we aim to model problems where ground truth data is not available upfront and problems involving sequential decision making. There have been some successful first results in that direction. For instance, there are applications (e.g., cognitive radio) where several agents (users) aim at finding a matching with resources (the different frequency bands). They can do that by “probing” the resources, estimating their preferences and trying to find some stable matchings 47, 95.

Online algorithms 45 and theoretical computer science

Online algorithms are closely related to online learning, with a major twist. In online learning, the agent has “0-look ahead”; for instance, in the online binary classification example, the loss at stage $t$ was $\mathbb{1}\{Y_{t} \neq {\hat{Y}}_{t}\}$ but $Y_{t}$ was not known in advance. The comparison class, on the other hand, was the empirical performance of a given set of classifiers. In online algorithms, the agents have “1-look ahead”; in the classification example, this means that $Y_{t}$ is known before choosing ${\hat{Y}}_{t}$. The overall objective, however, is no longer the minimisation of the empirical error, but the minimisation of this error plus, say, the total number of changes. The comparison class is then larger, namely a subset (or the whole set) of admissible sequences of predictions in $\{\pm 1\}^{T}$. The typical online problem relevant for Criteo that will be investigated is the matching problem: agents and resources arrive sequentially and must be, if possible, paired together as fast as possible (and as successfully as possible). Variants of these problems include the optimal stopping time question (when and how to make a final decision), such as prophet inequalities and related questions 62.
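As an illustration of the optimal stopping questions mentioned above, the sketch below simulates a classical single-threshold rule for the prophet inequality: values are observed one by one, and the first value exceeding half of the expected maximum is accepted. This rule is known to guarantee at least half of the expected maximum for independent non-negative values. The uniform distribution, the number of values, and the function name are assumptions made for the example.

```python
import random


def simulate(n_values=5, n_trials=20000, seed=0):
    """Return (average reward of the threshold rule, average maximum)."""
    rng = random.Random(seed)
    # For n i.i.d. uniform values on [0, 1], E[max] = n / (n + 1).
    threshold = 0.5 * n_values / (n_values + 1)
    total_reward = 0.0
    total_max = 0.0
    for _ in range(n_trials):
        values = [rng.random() for _ in range(n_values)]
        # Irrevocably accept the first value above the threshold;
        # the reward is 0 if no value qualifies.
        reward = next((v for v in values if v >= threshold), 0.0)
        total_reward += reward
        total_max += max(values)
    return total_reward / n_trials, total_max / n_trials


avg_reward, avg_max = simulate()
competitive_ratio = avg_reward / avg_max
```

The "prophet" who sees all values in advance collects `avg_max`; the ratio compares the online decision-maker to that benchmark.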

Optimal transport 120

Optimal transport is a rather old problem, introduced by Monge, in which an agent aims at moving a pile of sand to fill a hole at the smallest possible cost. Formally speaking, given two probability measures $μ$ and $ν$ on some space $𝒳$, the optimal transport problem consists in finding (if it exists; otherwise the problem can be relaxed) a transport map $T : 𝒳 \to 𝒳$ that minimizes $\int_{𝒳} c (x, T (x)) d μ (x)$ for some cost function $c : 𝒳^{2} \to \mathbb{R}$, under the constraint that $T ♯ μ = ν$, where $T ♯ μ$ is the push-forward measure of $μ$ by $T$. Interestingly, when $μ$ and $ν$ are empirical measures, i.e., $μ = \frac{1}{N} \sum_{n = 1}^{N} δ_{x_{n}}$ and $ν = \frac{1}{N} \sum_{n = 1}^{N} δ_{y_{n}}$, an optimal transport map is nothing more than a matching between $\{x_{n}\}$ and $\{y_{n}\}$ that minimizes the cost $\sum_{n} c (x_{n}, T (x_{n}))$.
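For empirical measures with a handful of points, the optimal matching can be found by brute force over all permutations. The sketch below does exactly that with a squared-distance cost on the real line; the point values and the function name are illustrative assumptions, and the exhaustive search is only meant to make the definition concrete (it does not scale beyond very small $N$).

```python
from itertools import permutations


def optimal_matching(xs, ys, cost):
    """Return (permutation, total cost) where x_n is matched to ys[perm[n]]."""
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(ys))):
        total = sum(cost(x, ys[j]) for x, j in zip(xs, perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


def squared_distance(x, y):
    return (x - y) ** 2


# Hypothetical supports of the two empirical measures.
xs = [0.0, 1.0, 2.0]
ys = [2.1, 0.2, 0.9]

perm, total_cost = optimal_matching(xs, ys, squared_distance)
transport_map = {x: ys[j] for x, j in zip(xs, perm)}
```

With a convex cost on the real line, the optimal matching pairs points in sorted order, which is what the search recovers here.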

Recently, optimal transport gained a lot of interest in the ML community 109 thanks to its application to images and to new techniques to compute approximate matchings in a tractable way 112. Even more unexpected applications of optimal transport have been discovered: privacy protection 48, fairness 41, etc. Those connections are promising, but only primitive for the moment. For instance, consider the problem of matching students to schools. The unfairness level of a school can be measured as the Wasserstein distance between the distribution of the students within that school and the overall distribution of students. Then the matching algorithms could have a constraint of minimizing the sum (or the maximum) of the unfairness levels; alternatively, we could aim at designing mechanisms giving incentives to schools to be fair in their allocation (or at least in their list preferences), typically by paying a higher fee if the unfairness level is high.
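The school example can be sketched under a strong simplifying assumption: each student is described by a single scalar attribute, so that the Wasserstein-1 distance reduces to the area between the two empirical cumulative distribution functions. The attribute values and the function name below are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two empirical measures on the real line."""
    points = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Empirical CDFs are constant on [left, right).
        cdf_x = sum(1 for x in xs if x <= left) / len(xs)
        cdf_y = sum(1 for y in ys if y <= left) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


# Hypothetical scalar attribute (e.g., a binary group indicator).
all_students = [0, 0, 1, 1]
school_a = [0, 1]  # mirrors the overall population
school_b = [1, 1]  # draws from one group only

unfairness_a = wasserstein_1d(school_a, all_students)
unfairness_b = wasserstein_1d(school_b, all_students)
total_unfairness = unfairness_a + unfairness_b
```

A matching algorithm could then use `total_unfairness` (or the maximum over schools) as a constraint or penalty, in the spirit of the text above.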

General objectives

The overarching objective of FAIRPLAY is to create algorithms to learn for and with users, along with techniques to analyze them, through the study of multi-agent learning systems where the agents can be cooperatively or competitively learning agents, or agents providing or generating data, while guaranteeing that fairness and privacy constraints are satisfied for the involved users. We detail this global objective into a number of more specific ones.

Objective 1: Developing fair and private mechanisms

Our first objective is to incorporate ethical aspects of fairness and privacy in mechanisms used in typical problems occurring in Internet applications, in particular auctions, matching, and recommendation. We will focus on social welfare and consider realistic cases with multiple agents and sequential learning that occur in practice due to sequential decision making. Our objective is to construct models to analyze the problem, to devise algorithms that respect the constraints at stake, and to evaluate the different trade-offs in standard notions of utility introduced by ethical constraints.

Objective 2: Developing multi-agent statistics and learning

Data is now acquired, treated and/or generated by a whole network of agents interacting with the environment. There are also often multiple agents learning either collaboratively or competitively. Our second objective is to build a new set of tools to perform statistics and learning tasks in such environments. To this end, we aim at modeling these situations as multi-agent systems and at studying the dynamics and equilibrium of these complex game-theoretic situations between multiple learning algorithms and data providers.

Objective 3: Improving the theoretical state of the art

Research must rely on theoretical, proven guarantees. We develop new results for the techniques introduced before, such as prophet inequalities, (online) matchings, bandits and RL, etc.

Objective 4: Proposing practical solutions and enhancing transfer from research to industry

Our last scientific objective is to apply and implement theoretical works and results to practical cases. This will be a crucial component of the project as we focus on transfer within Criteo.

Objective 5: Scientific Publications

We aim at publishing our results in top-tier machine learning conferences (NeurIPS, ICML, COLT, ICLR, etc.) and in top-tier game theory journals (Games and Economic Behavior, Mathematics of OR, etc.). We will also target conferences at the junction of those fields (EC, WINE, WebConf, etc.) as well as conferences specifically on security and privacy (IEEE S&P, NDSS, CCS, PETS, etc.) and on fairness (FAccT, AIES).

All five objectives are intertwined. For instance, fairness and privacy constraints are important in Objective 2 whereas the multi-agent aspect is also important in Objective 1. Objectives 4 and 5 are transversal and present in all of the first three objectives.

Research program

To reach the objectives laid out above, we organize the research in three themes. The first one focuses on developing fair mechanisms. The second one considers private mechanisms, and in particular considers the challenge of reconciling fairness and privacy, which are often conflicting notions. The last theme, somewhat transverse to the first two, consists in leveraging/incorporating structure in all those problems in order to speed up learning. Of course, all themes share common points on both the problems/applications considered and the methods and tools used to tackle them; hence there is cross-fertilization between the different themes.

Theme 1: Developing fair mechanisms for auctions and matching problems

Fairness in auction-based systems

Online ad platforms are nowadays used to advertise not just products, but also opportunities such as jobs, houses, or financial services. This makes it crucial for such platforms to respect fairness criteria (be it only for legal reasons), as an unfair ad system would deprive a part of the population of some potentially interesting opportunities. Despite this pressing need, there is currently no technical solution in place to provably prevent discrimination. One of the main challenges is that ad impression decisions are the outcome of an auction mechanism that involves bidding decisions of multiple self-interested agents controlling only a small part of the process, while group fairness notions are defined on the outcome of a large number of impressions. We propose to investigate two mechanisms to guarantee fairness in such a complex auction-based system (note that we focus on online ad auctions but the work has broader applicability).

Advertiser-centric (or bidder-centric) fairness

We first focus on advertiser-centric fairness, i.e., the advertiser or a third party needs to make sure that the reached audience is fair independently of the ad auction platform. A key difficulty is that the advertiser does not control the final decision for each ad impression, which depends on the bids of other advertisers competing in the same auction and on the platform's mechanism. Hence, it is necessary that the advertiser keeps track of the auctions won for each of the groups and dynamically adjusts its bids in order to maintain the required balance.
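The feedback loop described above (track the auctions won per group, then adjust bids) can be illustrated with a deliberately simple simulation. Everything in this sketch is an assumption made for illustration: the highest competing bid is drawn uniformly at random, competition is tougher for group "b", payments are ignored, and the multiplicative update is a naive heuristic rather than the bidding policy studied in this theme.

```python
import random


def parity_gap(adaptive, rounds=20000, base_bid=0.8, step=0.01, seed=0):
    """Final gap between the win rates on the two groups."""
    rng = random.Random(seed)
    multiplier = {"a": 1.0, "b": 1.0}  # per-group bid multipliers
    seen = {"a": 0, "b": 0}
    won = {"a": 0, "b": 0}
    # Assumption: the highest competing bid is larger on average for "b".
    max_competing_bid = {"a": 1.0, "b": 2.0}
    for _ in range(rounds):
        group = rng.choice(["a", "b"])
        competing_bid = rng.uniform(0.0, max_competing_bid[group])
        seen[group] += 1
        won[group] += int(base_bid * multiplier[group] > competing_bid)
        if adaptive and seen["a"] and seen["b"]:
            rate_a = won["a"] / seen["a"]
            rate_b = won["b"] / seen["b"]
            # Bid more on the group that is lagging, less on the other.
            lagging, leading = ("a", "b") if rate_a < rate_b else ("b", "a")
            multiplier[lagging] *= 1 + step
            multiplier[leading] *= 1 - step
    return abs(won["a"] / seen["a"] - won["b"] / seen["b"])


static_gap = parity_gap(adaptive=False)
adaptive_gap = parity_gap(adaptive=True)
```

With a fixed bid, the advertiser wins far more often on group "a"; the adaptive multipliers shrink the cumulative gap, at the price of bidding more aggressively on the more contested group (a cost that this sketch does not model).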

A first difficulty is to model the behavior of other advertisers. We can first use a mean-field games approach similar to 81 that approximates the other bidders by an (unknown) distribution and checks equilibrium consistency; this makes sense if there are many bidders. We can also leverage refined mean-field approximations 75 to provide better approximations for smaller numbers of advertisers. Then a second difficulty is to find an optimal bidding policy that enforces the fairness constraint. We can investigate two approaches. One is based on an MDP (Markov Decision Process) that encodes the current fairness level and imposes a hard constraint. The second is based on modeling the problem as a contextual bandit problem. We note that in addition to fairness constraints, privacy constraints may complicate the search for an optimal solution.

Platform-centric (or auction-centric) fairness

We also consider the problem from the platform's perspective, i.e., we assume that it is the platform's responsibility to enforce the fairness constraint. We also focus here on demographic parity. To make the solution practical, we do not consider modification of the auction mechanism; instead, we consider a given mechanism and let the platform dynamically adapt the bids of each advertiser to achieve the fairness guarantee. This approach would be similar to the pacing multipliers used by some platforms 61, 60, but using different multipliers for the different groups (i.e., different values of the sensitive attribute).

Following recent theoretical work on auction fairness 54, 80, 58 (which assumes that the targeted population of all ads is known in advance along with all their characteristics), we can formulate fairness as a constraint in an optimization problem for each advertiser. We study fairness in this static auction problem in which the auction mechanism is fixed (e.g., to second price). We then move to the online setting in which users (but also advertisers) are dynamic and in which decisions must be taken online, which we approach through dynamic adjustment of pacing multipliers.

Fairness in matching and selection problems

In this second part, we study fairness in selection and matching problems such as hiring or college admission. The selection problem corresponds to any situation in which one needs to select (i.e., assign a binary label to) data examples or individuals but with a constraint on the maximum number of positive labels. There are many applications of selection problems such as police checks, loan approvals, or medical screening. The matching problem can be seen as the more general variant with multiple selectors. Again, a particular focus is put here on cases involving repeated selection/matching problems and multiple decision makers.

Fair repeated multistage selection

In our work 69, we identified that a key source of discrimination in (static) selection problems is differential variance, i.e., the fact that one has quality estimates that have different variances for different groups. In practice, however, the selection problem is often run repeatedly (e.g., at each hiring campaign) and with partial (and increasing) information to exploit for making decisions.
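The effect of differential variance can be seen in a small simulation. The sketch below assumes that both groups have the same distribution of true quality, that the decision-maker only sees a noisy estimate whose noise level depends on the group, and that candidates are selected by ranking raw estimates; the noise levels, the selection rate, and the function name are hypothetical.

```python
import random


def share_of_noisy_group(n_per_group=5000, selection_rate=0.1, seed=0):
    """Fraction of selected candidates that come from the noisier group "b"."""
    rng = random.Random(seed)
    # Assumption: identical true quality, different estimation noise.
    noise_std = {"a": 0.2, "b": 1.5}
    candidates = []
    for group, std in noise_std.items():
        for _ in range(n_per_group):
            quality = rng.gauss(0.0, 1.0)
            estimate = quality + rng.gauss(0.0, std)
            candidates.append((estimate, group))
    # Select the top candidates according to the noisy estimate only.
    candidates.sort(reverse=True)
    k = int(selection_rate * len(candidates))
    selected = candidates[:k]
    return sum(1 for _, group in selected if group == "b") / k


share_b = share_of_noisy_group()
```

In this toy setup the two groups are equally qualified and equally numerous, yet the group with noisier estimates is over-represented among the selected candidates, simply because its estimates are more spread out and reach the top of the ranking more often.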

Here, we consider the repeated multistage selection problem, where at each round a multistage selection problem is solved. A key aspect is that, at the end of a round, one learns extra information about the candidates that were selected; hence one can refine (i.e., decrease the variance of) the quality estimate for the groups in which more candidates were selected. We will first rethink fairness constraints in this type of repeated decision making problem. Then we will both study the discrimination that arises from natural (e.g., greedy) procedures and design (near) optimal ones for the constraints at stake. We also investigate how the constraints affect the selection utility.

Multiple decision-makers

Next, we investigate cases with multiple decision-makers. We propose two cases in particular. The first one is the simple two-stage selection problem but where the decision-maker doing the first-stage selection is different from the decision-maker doing the second-stage selection. This is a typical case for instance for recruiting agencies that propose sublists of candidates to different firms that wish to hire. The second case is when multiple decision-makers are trying to make a selection simultaneously, a typical example of this being the college admission problem (or faculty recruitment). We intend to model it as a game between the different colleges and to study both static solutions and dynamic solutions with sequential learning, again modeling it as a bandit problem and looking for regret-minimizing algorithms under fairness constraints. A number of important questions arise here: if each college makes its selection independently and strategically (but based on quality estimates with variances that differ amongst groups), how does it affect the “global fairness” metrics (meaning in aggregate across the different colleges) and the “local fairness” metrics (meaning for an individual college)? What changes if there is a central admission system (such as Parcoursup)? And in this latter case, how should fairness be handled on the side of colleges (i.e., how can each college be treated fairly in some sense)?

Fair matching with incentives in two-sided platforms

We will study specifically the case of a platform matching demand on one side to offer on the other side, with fairness constraints on each side. This is the case for instance in online job markets (or crowdsourcing). This is similar to the previous case but, in addition, here there is an extra incentives problem: companies need to give the right incentives to job applicants to accept the jobs; while the platform doing the match needs to ensure fairness on both sides (job applicants and companies). This gives rise to a complicated interplay between learning and incentives that we will tackle again in the repeated setting.

We finally mention that, in many of these matching problems, there is an important time component: each agent needs to be matched “as soon as possible”, yielding a trade-off between the delay to be matched and the quality of the match. There is also a problem of participation incentives, that is, how the matching algorithm used affects the behavior of the participants in the matching “market” (participation or not, information revelation, etc.). In the long term, we will incorporate these aspects in the above models.

Throughout the work in this theme, we will also consider a transverse question that is present in all the models above: how can we handle multidimensional fairness, that is, the case where there are multiple sensitive attributes and consequently an exponential number of sub-groups defined by all intersections? This combinatorial problem is challenging and, for the moment, still exploratory.

Theme 2: Reconciling, and enforcing privacy with fairness

In the previous theme, we implicitly assumed that we know the users' group, i.e., their sensitive attributes such as gender, age, or ethnicity. In practice, one of the key questions when implementing fairness mechanisms is how to measure/control fairness metrics without having access to these protected attributes. This question relates to the link between privacy and fairness and the trade-off between them (as fairness requires data and privacy tends to protect it) 113, 63.

A first option to solve this problem would be (when it is possible) to build proxies 76, 116 for protected attributes using available information (e.g., websites visited or products bought) and to measure or control for fairness using those in place of the protected attributes. As the accuracy of these proxies cannot be assessed, however, they cannot be used for any type of “public certification”—that is, for a company to show reasonable fairness guarantees to clients (e.g., as a commercial argument), or (even less) to regulators. Moreover, in many cases, the entity responsible for fairness should not be accessing sensitive information, even through proxies, for privacy reasons.

In FAIRPLAY, we investigate a different means of certifying fairness of decisions without having access to sensitive attributes, by partnering with a trusted third-party that collects protected attributes (that could for instance be a regulator or a public entity, such as Nielsen, say). We distinguish two cases:

If the third-party and the company share a common identifier of users, then computing the fairness metric without leaking information to each other will boil down to a problem of secure multi-party computation (SMC). In such a case, there could be a need to be able to learn, which opens the topic of learning and privacy under SMC. This scenario, however, is likely not the most realistic one as having a common identifier requires a deep technical integration.

If the third-party and the company do not share a common identifier of users, but there are common features that they both observe 84, then it is possible only to partially identify the joint distribution. With additional structural assumptions on the distribution, however, it could be identified accurately enough to estimate biases and fairness metrics. This becomes a distribution identification problem and brings a number of questions such as: how to do the distribution identification? how to optimally query data from the third party to train fair algorithms with high utility? etc. An important point to keep in mind in such a study is that it is likely that the third party user-base is different from that of the company. It will therefore be key to handle the covariate shift from one distribution to the other while estimating biases.

This distribution identification problem will be important in the context of privacy, even independently of fairness issues. Indeed, in the near future, most learning will happen in a privacy-preserving setting (for instance, because of the Chrome privacy sandbox). This will require new learning schemes (different from e.g., Empirical Risk Minimization) as samples from the usual joint distribution $(X, Y)$ of samples/labels will no longer be observed. Only aggregated data—e.g., (empirical) marginals of the form $E [Y | X_{2} = 4, X_{7} = ``lemonde.fr'']$ —will be observed, with a limited budget of requests. This also brings questions such as how to mix it with ERM on some parts of the traffic, what is the (certainly adaptive or active) optimal strategy to query the marginals, etc. This problem will be further complicated by the fact that privacy (for instance through the variety of consents) will be heterogeneous: all features are not available all the time. This is therefore strongly related to learning with missing features and imputation 83.
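To make the aggregated-data access model above concrete, here is a minimal sketch of an oracle that only answers conditional-mean queries of the form $E [Y | X_{2} = 4, ...]$ and enforces a request budget. The class name `AggregateOracle`, its method and the feature names are hypothetical and chosen for illustration only; they do not describe any actual privacy-sandbox interface.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[Dict[str, object], float]  # (features X, label Y)


class AggregateOracle:
    """Hypothetical interface: answers conditional-mean queries under a request budget."""

    def __init__(self, rows: List[Row], budget: int) -> None:
        self._rows = rows          # raw samples stay hidden from the learner
        self.remaining = budget    # number of queries still allowed

    def conditional_mean(self, condition: Dict[str, object]) -> Optional[float]:
        """Return the empirical E[Y | condition], or None if no row matches."""
        if self.remaining <= 0:
            raise RuntimeError("query budget exhausted")
        self.remaining -= 1
        matching = [y for x, y in self._rows
                    if all(x.get(k) == v for k, v in condition.items())]
        return sum(matching) / len(matching) if matching else None
```

A learner built on such an interface has to decide which conditions to query, since each call consumes budget; that choice is the adaptive querying question raised above.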

In relation to the above problems, a key question is to determine what is the most appropriate definition of fairness to consider. Recall that it is well-known that usual fairness metrics are not compatible 88. Moreover, in online advertising, fairness can be measured at multiple different levels: at the level of bids, at the level of audience reached, at the level of clicking users, etc. Fairness at the level of bids does not imply fairness of the audience reached (see Theme 1); yet external auditors would measure which ad is displayed (as was done for some ad platforms 117), hence in terms of public image, that would be the appropriate level to define fairness. The above problem is intimately related to the question of measuring what is the relevant audience of an ad, which would define the label if one were to use the EO fairness notion. This label is typically not available. We can explore three ways to overcome this issue. The first is to find a sequential way to learn the label through users clicking on ads. The second and third options are to focus in a first step on individual fairness, or on counterfactual fairness 91, which has many possible different levels of assumptions and was popularized in 2020 92. The notion of counterfactual is key in causality 111. A model is said to be counterfactually fair if its prediction does not change (too much) when intervening on the sensitive attribute. Several works already propose ways of designing models that are counterfactually fair 87, 124, 123. This seems to be quite an interesting, but challenging direction to follow.

Finally, an alternative direction would be to pursue modeling the trade-off between privacy and fairness. For instance, in some game theoretic models, users can choose the quantity of data that they reveal 74, 48, so that the objective functions integrate different levels of fairness/privacy. Those models should then be studied both in terms of equilibrium and in the online setup, with the objective of identifying how the strategic privacy considerations affect the fairness-utility tradeoff.

Theme 3: Exploiting structure in online algorithms and learning problems

Our last research direction is somewhat transverse, with possible application to improving algorithms in the previous themes. We explore how the underlying structure can be exploited, in the online and learning problems considered before, to improve performance. Note that, in all these problems, we will incorporate the fairness and privacy aspects even if they are somewhat transverse to the structure considered. The following sections illustrate how hidden structure can be leveraged in specific examples.

Leveraging structure in online matching

Finding large matchings in graphs is a longstanding problem with a rich history and many practical and theoretical applications 102, 79, 39, 38. Recall that given a graph $G = (𝒱, ℰ)$ , where $𝒱$ is a set of vertices and $ℰ$ is a set of edges, a matching $ℳ \subset ℰ$ is a subset of edges such that each vertex belongs to at most one edge $e \in ℳ$ . In that context, a perfect matching $ℳ$ is a matching where each vertex $v \in 𝒱$ is associated to an edge $e \in ℳ$ , and a maximum matching is a matching of maximum size (one can also consider weights on edges). Here, we study an online setting, which is more adequate in applications such as Internet advertising where ad impressions must be assigned to available ad slots 102, 51. Consider a bipartite graph, where the vertex set $U \cup V$ is the union of two disjoint sets. Nodes $u \in U$ are known beforehand, whereas nodes $v \in V$ are discovered one at a time, along with the edges they belong to, and must be either immediately matched to an available (i.e., not yet matched) vertex $u \in U$ or discarded. Besides ad allocation, online bipartite matching is relevant in two-sided markets, for instance to assign tasks to workers 79.

A natural measure for the quality of an online matching algorithm is the “competitive ratio” (CR): the ratio of the size of the created matching to the size of the optimal one 102. The seminal work 86 introduced an optimal algorithm for the adversarial case 42, which guarantees a CR of $1 - \frac{1}{e}$ ; but this focuses on a pessimistic worst case. In practice, some relevant knowledge (either given a priori or learned during the process) on the underlying structure of the problem can be leveraged. The focus then shifted to models taking into account some type of stochasticity in the arrival model, mostly the i.i.d. model where arriving vertices $v \in V$ are drawn from a fixed distribution $𝒟$ 82, 40, 70, 85, 97, 98. The classical approach consists in optimizing the CR over the distribution $𝒟$ . Even in this seemingly optimistic framework, however, it is now known that there is no hope for a CR of more than 0.823 98. Moreover, this generally leads to very large linear programs (LP).

A more recent approach restricts the distribution $𝒟$ over which the problem is optimized to classes of graphs with an underlying stochastic structure. The benefit of this approach is two-fold: it gives hope for higher competitive ratios, and for simpler algorithms. Experiments also showed that complex algorithms optimized on $𝒟$ fared no better than simple greedy heuristics on “real-life” graphs 46. A few results along these lines show that this is a promising path. For instance, 51 studied the problem on graphs where each vertex has degree at least $d$ and found a competitive ratio of $1 - {(1 - 1 / d)}^{d}$ . On $d$-regular graphs, 59 designed a $1 - O (\sqrt{\log d} / \sqrt{d})$ competitive algorithm. 100 showed that greedy algorithms were highly efficient on Erdős-Rényi random graphs, with a competitive ratio of 0.837 in the worst case. 37 showed that in a specific market with two types of matching agents, the behavior of the matching algorithm varies with the homogeneity of the market. Our goal here is to go beyond the independence assumption underlying all these works.

Introducing correlation and inhomogeneity

We will start by deriving and studying optimal online matching strategies on widely studied classes of graphs that present simple inhomogeneity or correlation structures (which are often present in applications). The stochastic block model 34 is often used to generate graphs with an underlying community structure. It presents a simple correlation structure: two vertices in the same community are more likely to have common neighbors than two vertices in different communities. Another focus point will be a generalized version of the Erdős-Rényi model, where the in-place vertices $u \in 𝒰$ are divided into sets $s_{i}$ , where $u \in s_{i}$ generates an edge with probability $p_{i} = \frac{c_{i}}{n}$ . These two settings should give us a better understanding of how heterogeneity and correlation affect the matching performance.
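As an illustration, the generalized model above can be simulated and a greedy matcher compared with the offline optimum. This is a minimal sketch under stated assumptions: $n$ is taken to be the number of arriving vertices, the greedy rule picks a uniformly random free neighbor, and the offline optimum is computed with a standard augmenting-path routine. All function names are illustrative.

```python
import random
from typing import Dict, List


def sample_graph(class_sizes: List[int], c: List[float], n_online: int,
                 rng: random.Random) -> List[List[int]]:
    """For each arriving vertex v, list the in-place vertices u adjacent to it.

    A vertex u in class i is linked to each v independently with
    probability p_i = c_i / n (assumption: n = number of arriving vertices).
    """
    probs = [min(1.0, c[i] / n_online)
             for i, size in enumerate(class_sizes) for _ in range(size)]
    return [[u for u, p in enumerate(probs) if rng.random() < p]
            for _ in range(n_online)]


def greedy_matching(adj: List[List[int]], rng: random.Random) -> Dict[int, int]:
    """Match each arriving v to a uniformly random free neighbor, if any."""
    match_of_u: Dict[int, int] = {}
    for v, neighbors in enumerate(adj):
        free = [u for u in neighbors if u not in match_of_u]
        if free:
            match_of_u[rng.choice(free)] = v
    return match_of_u


def maximum_matching_size(adj: List[List[int]]) -> int:
    """Offline optimum via augmenting paths (Kuhn's algorithm)."""
    match_of_u: Dict[int, int] = {}

    def try_augment(v: int, seen: set) -> bool:
        for u in adj[v]:
            if u in seen:
                continue
            seen.add(u)
            if u not in match_of_u or try_augment(match_of_u[u], seen):
                match_of_u[u] = v
                return True
        return False

    return sum(try_augment(v, set()) for v in range(len(adj)))


def empirical_ratio(class_sizes: List[int], c: List[float],
                    n_online: int, seed: int = 0) -> float:
    """Size of the greedy matching divided by the size of a maximum matching."""
    rng = random.Random(seed)
    adj = sample_graph(class_sizes, c, n_online, rng)
    opt = maximum_matching_size(adj)
    return len(greedy_matching(adj, rng)) / opt if opt else 1.0
```

For example, `empirical_ratio([50, 50], [1.0, 3.0], 100)` gives one draw of the ratio for two classes with $c_{1} = 1$ and $c_{2} = 3$. Since greedy always returns a maximal matching, the ratio is at least 1/2 on any graph; averaging over seeds gives a rough empirical picture only, not the asymptotic competitive ratio.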

Deriving the competitive ratio requires studying the asymptotic size of maximum matchings in random graphs. Two methods are usually used. The first, constructive one is the study of the Karp-Sipser algorithm on the graph 43. The second one involves the rich theory of graph weak local convergence 44. A straightforward application of these methods, however, requires the graph to have independence properties; adapting them to graphs with a correlation structure will require new ideas.

Configuration models and random geometric graphs

A configuration model is described as follows (in the bi-partite case). Each vertex $u \in 𝒰$ has a number of half-edges drawn from the same distribution $π_{𝒰}$ and each vertex $v \in 𝒱$ has a number of half-edges drawn from $π_{𝒱}$ (with the assumption that the expected total numbers of half-edges from $𝒰$ and $𝒱$ are the same). Then a vertex $v \in 𝒱$ that arrives in the sequential fashion has its half-edges “completed” by a (still free) half-edge of $𝒰$ . This is a standard way of creating random graphs with (almost) fixed distribution of degrees. Here the question would simply be the competitive ratio of some greedy algorithm, whether the distributions $π_{𝒰}$ and $π_{𝒱}$ are known beforehand or learned on the fly. An interesting variant of this problem would be to assume the existence of a (hidden or not) geometric graph. Each $u \in 𝒰$ is drawn i.i.d. in $R^{d}$ (say a Gaussian centered at 0) and similarly for $v \in 𝒱$ . Then there is an edge between $u$ and $v$ with a probability depending on the distance between them. Here again, interesting variants can be explored depending on whether the distribution is known or not, and whether the locations of $u$ and/or $v$ are observed or not.
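The sequential half-edge construction of the configuration model can be sketched as follows. This is an illustrative sampler, not a reference implementation: the degree distributions are passed as hypothetical callables `draw_deg_u` and `draw_deg_v`, half-edges of an arriving vertex are dropped once the other side is exhausted, and multi-edges are allowed.

```python
import random
from typing import Callable, List, Tuple


def sequential_configuration_model(
    n_u: int, n_v: int,
    draw_deg_u: Callable[[random.Random], int],
    draw_deg_v: Callable[[random.Random], int],
    rng: random.Random,
) -> Tuple[List[int], List[List[int]]]:
    """Return (half-edge counts sampled for U, neighbor list of each arriving v)."""
    deg_u = [draw_deg_u(rng) for _ in range(n_u)]
    # One entry per free half-edge, labelled by the vertex u that owns it.
    free_half_edges = [u for u, d in enumerate(deg_u) for _ in range(d)]
    rng.shuffle(free_half_edges)  # popping from a shuffled list = uniform pairing
    arrivals: List[List[int]] = []
    for _v in range(n_v):
        neighbors: List[int] = []
        for _h in range(draw_deg_v(rng)):
            if not free_half_edges:   # U side exhausted: drop this half-edge
                break
            neighbors.append(free_half_edges.pop())
        arrivals.append(neighbors)
    return deg_u, arrivals
```

The neighbor lists it returns have the same shape as the input of any online matching routine that processes arriving vertices one at a time, so a greedy rule can be evaluated on this model directly.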

Learning while matching

In practical applications, the full stochastic structure of the graphs may not be known beforehand. This raises the question: what will happen to the performance of the algorithms if the graph parameters are learned while matching? In the generalized Erdős-Rényi graph, this will correspond to learning the probability of generating edges. For the stochastic block model, the matching algorithm will have to perform online community detection.

Exploiting side-information in sequential learning

We end with an open direction that may be relevant to many of the problems considered above: how to use side-information to speed up learning. In many sequential learning problems where one receives feedback for each action taken, it is actually possible to deduce, for free, extra information from the known structure of the problem. However, how to incorporate that information in the learning process is often unclear. We describe it through two examples.

One-sided feedback in auctions

In online ad auctions, the advertisers' strategy is to bid in a compact set of possible bids. After placing a bid, the advertiser learns whether they won the auction or not; but even if they do not observe the bids of other advertisers, they can deduce for free some extra information: if they win they learn that they would have won with any higher bid and if they lose they learn that they would have lost with any lower bid 121, 55. We will investigate how to incorporate this extra information in RL procedures devised in Theme 1. One option is to leverage the Kaplan-Meier estimator.
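The one-sided deduction described above can be written down explicitly. The sketch below assumes a discretized grid of candidate bids (an assumption made for illustration) and records, for each grid point, which outcomes are implied by a single observed round; the function names are hypothetical.

```python
from typing import List, Optional


def outcomes_on_grid(grid: List[float], bid: float, won: bool) -> List[Optional[bool]]:
    """Outcome each alternative bid would have had: True (win), False (loss), None (unknown)."""
    result: List[Optional[bool]] = []
    for b in grid:
        if won and b >= bid:
            result.append(True)      # any higher bid would also have won
        elif not won and b <= bid:
            result.append(False)     # any lower bid would also have lost
        else:
            result.append(None)      # not implied by this observation
    return result


def update_counts(wins: List[int], plays: List[int], grid: List[float],
                  bid: float, won: bool) -> None:
    """Accumulate, per grid bid, the rounds with a known outcome and the wins among them."""
    for i, outcome in enumerate(outcomes_on_grid(grid, bid, won)):
        if outcome is not None:
            plays[i] += 1
            wins[i] += int(outcome)
```

Dividing `wins` by `plays` is only a naive summary: which outcomes are revealed depends on the bids that were placed, so these frequencies may be biased. Handling that censoring is the role a Kaplan-Meier-type estimator would play.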

Side-information in dynamic resource allocation problems and matching

Generalizing the idea above, one can observe side-information in many other problems 36. Typically, in resource allocation problems (e.g., how to allocate a budget of ad impressions), one can leverage a monotonicity property: one would not have gained more by allocating less. Similarly, in matching with unknown weights, it is often possible upon doing a particular match to learn the weight of other potential pairs.

Application domains

Typical problems and use-cases

In FAIRPLAY, we focus mainly on problems involving learning that relate to Internet applications. We will tackle generic problems (in particular auctions and matching) with different applications in mind, in particular some applications in the context of Criteo's business but also others. A crucial property of those applications is the aforementioned ethical concerns, namely fairness and/or privacy. The team was shaped and motivated by several such use-cases, from more practical (with possible short or middle term applications in particular in Criteo products) to more theoretical and exploratory ones. We describe first here the main types of generic problems and use-cases considered in this context.

Auctions 90

There are many different types of auctions that an agent can use to sell one or several items in her possession to $n$ potential buyers. This is the typical way in which spots to place ads are sold to potential advertisers. In the case of a single item, the seller asks buyers to bid $b_{i} \in [0, 1]$ on the item and the winner of the item is designated via an “allocation rule” that maps bids $b \in {[0, 1]}^{n}$ to a winner in $\{0, \dots, n\}$ (0 refers to the no winner case). Then the payment rule $p : {[0, 1]}^{n} \to {[0, 1]}^{n}$ indicates the amount of money that each bidder must pay to the seller. Auctions are specific cases of a broader family of “mechanisms”. Knowing the allocation and payment rules, bidders have incentives to bid strategically. Different auctions (or rules) end up with different revenue to the seller, who can choose the optimal rules. This is rather standard in economics, but these interactions become far more intricate when repeated over time (as in the online ad market 106), when several items are sold at the same time (for instance in bundles), when the buyers have partial information about the actual value of the item 121 and/or reciprocally when the seller does not know the value distributions of the buyers. In that case, she might be tempted to try to learn them from the previous bids in order to design the optimal mechanism. Knowing this, the bidders have incentives to adopt long-term strategic behaviors, ending up in a quite complicated game between learning algorithms 107. This setting of interacting algorithms is actually of interest by itself, irrespective of ad auctions. It is noteworthy also that traditional auction mechanisms do not guarantee any fairness notion and that the literature on fixing that (for applications where it matters) is only nascent 54, 105, 58, 80.
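As a concrete instance of an allocation rule and a payment rule, here is a sketch of the classical second-price auction with a reserve price. The reserve and the tie-breaking convention are illustrative choices, not taken from the text above.

```python
from typing import List, Tuple


def second_price_auction(bids: List[float], reserve: float = 0.0) -> Tuple[int, List[float]]:
    """Return (winner, payments); winner is in {0, ..., n}, with 0 meaning no winner."""
    n = len(bids)
    payments = [0.0] * n
    eligible = [i for i in range(n) if bids[i] >= reserve]
    if not eligible:
        return 0, payments
    # Highest bid wins; ties go to the lowest index (an arbitrary convention).
    winner = max(eligible, key=lambda i: (bids[i], -i))
    others = [bids[i] for i in eligible if i != winner]
    # The winner pays the larger of the reserve and the best competing bid.
    payments[winner] = max(others + [reserve])
    return winner + 1, payments
```

With bids `[0.3, 0.7, 0.5]` and a reserve of `0.2`, bidder 2 wins and pays `0.5`. Changing the payment line to charge the winner's own bid turns this into a first-price auction, which illustrates how different rules can yield different revenue and different bidding incentives.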

Matching 89, 104

A matching is nothing more than a bipartite graph between some agents (patients, doctors, students) and some resources (respectively, organs, hospitals, schools). The objective is to figure out what is the “optimal” matching for a given criterion. Interestingly, there are two different (and so far mostly unrelated) concepts of “good matching”. The first one is called “stable”: each agent expresses preferences over resources (and vice-versa), and the matching must be such that no unpaired agent-resource couple would prefer to be paired together rather than with their current partners. In the other concept of matching, agents and resources are described by some features (say vectors in $R^{d}$ , denoted by $a_{n}$ for agents and $r_{m}$ for resources) and pairing $a_{n}$ to $r_{m}$ incurs a cost of $c (a_{n}, r_{m})$ , for a given function $c : {(R^{d})}^{2} \to [0, 1]$ . The objective is then to minimize the total cost of the matching $\sum_{n} c (a_{n}, r_{σ (n)})$ , where $σ (n)$ is the resource allocated to agent $n$ .
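The cost-minimization objective can be illustrated on a toy instance by exhaustive search over assignments $σ$. This sketch assumes there are at least as many resources as agents and uses a capped Euclidean distance as the cost $c$; both are choices made here for illustration. Exhaustive search is only viable for a handful of agents.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def capped_distance(a: Vector, r: Vector) -> float:
    """Illustrative cost with values in [0, 1]: Euclidean distance, capped at 1."""
    return min(1.0, sum((x - y) ** 2 for x, y in zip(a, r)) ** 0.5)


def min_cost_matching(agents: List[Vector], resources: List[Vector],
                      cost: Callable[[Vector, Vector], float] = capped_distance
                      ) -> Tuple[Tuple[int, ...], float]:
    """Return (sigma, total cost) minimizing sum_n cost(a_n, r_sigma(n))."""
    best_sigma: Tuple[int, ...] = ()
    best_cost = float("inf")
    # Each sigma assigns distinct resources to agents 0, 1, ..., len(agents) - 1.
    for sigma in permutations(range(len(resources)), len(agents)):
        total = sum(cost(agents[n], resources[sigma[n]]) for n in range(len(agents)))
        if total < best_cost:
            best_sigma, best_cost = sigma, total
    return best_sigma, best_cost
```

For agents at `(0.1,)` and `(0.9,)` and resources at `(0.8,)` and `(0.2,)`, the search pairs each agent with the nearby resource, i.e. `sigma = (1, 0)`.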

Matching is used in many different applications such as university admission (e.g., in Parcoursup). Notice that strategic interactions arise in matching if agents or resources can disclose their preferences/features to each other. Learning is also present as soon as not everything is known, e.g., the preferences or costs. Many applications of matching (again, such as college admission) are typical examples where fairness and privacy are of utmost importance. Finally, matching is also at the basis of several Internet applications and Criteo products, for instance to solve the problem of matching a given ad budget to fixed ad slots.

Ethical notions in those use-cases

In both problems, individual users are involved and there is a clear need to consider fairness and privacy. However, the precise motivation and instantiation of these notions depend on the specific use-case. In fact, it is often part of the research question to decide which are the most relevant fairness and privacy notions, as mentioned in Section 2.1. Throughout the team's life, we will put an important focus on this question, as well as on the question of the impact of the chosen notion on performance.

Application areas

In FAIRPLAY, we consider both applications to Criteo use-cases (online advertisement) and other applications (with other appropriate partners).

Online advertisement

Online advertising offers an application area for all of the research themes of FAIRPLAY, which we investigate primarily with Criteo.

First, online advertising is a typical application of online auctions and we consider applications of the work on auctions to Criteo use-cases, in particular the work on advertiser-centric fairness where the advertiser is Criteo. From a practical point of view, privacy will have to be enforced in such applications. For instance, when information is provided to advertisers to define audiences or to visualize the performance of their campaigns (insights), there is a possibility of leaking sensitive information on users. In particular, excellent proxies on protected attributes should probably not be leaked to advertisers, or should be transformed beforehand (e.g., with differential privacy techniques). This is therefore also an application of the fairness-vs-privacy research thread.

Note that, even before considering those questions, the first very important theoretical question is to determine what is the most appropriate definition of fairness (as there are, as mentioned above, many different variations) in those applications. We recall that it is well-known that usual fairness metrics are not compatible 88. Moreover, in online advertising, fairness can be measured in terms of bidding and recommendation or in terms of what ads are actually displayed. Being fair on bidding does not lead to fairness in ad display 105, mainly because of the other advertising actors. While fairness in bidding and/or recommendation seems the most important because it only relies on our models, external auditors can easily get data on which ads we display.

We will also investigate applications of fair matching techniques to online advertising and to Criteo matching products, namely retargeting (personalized ads displayed on a website) and retail media (sponsored products on a merchant website). Indeed, retail media, one of Criteo's major products, can be cast as an online matching problem. On a given e-commerce website (say, target), several advertisers (currently brands) are running campaigns so that their products are “sponsored” or “boosted”, i.e., they appear higher on the list of results of a given query. The budgets (from a Criteo perspective) must be cleared (daily, monthly or annually). This constraint is easy to satisfy thanks to the high traffic, but the main issue is that, without control/pacing/matching in time, the budget is depleted after only a few hours on relatively low-quality traffic (i.e., users that generate few conversions, hence a small ROI for the advertisers). The question is therefore whether an individual user should be matched or not to boosted/sponsored products at a given time so that the ROI of the advertisers is maximized, the campaign budget is depleted and the retailer does not suffer too much from this active corruption of its organic results. Those are three different and competing objectives (for respectively the advertisers, Criteo and the retailers) that must be somehow reconciled. This problem (and more generally this type of problem) offers a rich application area to the FAIRPLAY research program. Indeed, it is crucial to ensure that fairness and privacy are respected. On the other hand, the arrivals of users, clicks and conversions are not “worst case”. They rather follow some complicated, but certainly learnable, process, which allows applying our results on exploiting structure.

“Physical matching”

We investigate a number of other applications of matching: assignment of daycare slots to kids, reassignment of professors to different academies, assignment of kidneys to patients, assignment of job applicants to jobs. In all these applications, there are crucial constraints of fairness that complicate the matching. We leverage existing partnerships with the CNAF, the French Ministry of Education and the Agence de la biomédecine in Paris for the first three applications; for the last we will consolidate a nascent partnership with Pole Emploi and LinkedIn.

Highlights of the year

The team's convention was officially signed by Criteo, ENSAE and Inria in December 2023. Simon Mauras was hired as a CRCN in 2023 and will join the team in early 2024. The team (Vianney Perchet and Julien Combe) organized a very successful week-long conference at CIRM in December on “From matchings to markets. A tale of Mathematics, Economics and Computer Science.”

Awards

The team received a best paper award at ICML 2023 for the paper “Local and adaptive mirror descents in extensive-form games” by Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko 19.

New results

Auctions and mechanism design

Participants: Benjamin Heymann, Simon Finster

In 18, we consider the problem of maximizing the success probability of policy allocations in online bidding systems. The effectiveness of advertising in e-commerce largely depends on the ability of merchants to bid on and win impressions for their targeted users. The bidding procedure is highly complex due to various factors such as market competition, user behavior, and the diverse objectives of advertisers. In this paper we consider the problem at the level of user timelines instead of individual bid requests, manipulating full policies (i.e. pre-defined bidding strategies) and not bid values. In order to optimally allocate policies to users, typical multiple-treatment allocation methods solve knapsack-like problems which aim at maximizing an expected value under constraints. In industrial contexts such as online advertising, we argue that optimizing for the probability of success is a more suitable objective than expected value maximization, and we introduce the SuccessProbaMax algorithm that aims at finding the policy allocation which is the most likely to outperform a fixed reference policy. Finally, we conduct comprehensive experiments both on synthetic and real-world data to evaluate its performance. The results demonstrate that our proposed algorithm outperforms conventional expected-value maximization algorithms in terms of success rate.

In 20, we study a mechanism design problem for pool testing. Large-scale testing is crucial in pandemic containment, but resources are often prohibitively constrained. We study the optimal application of pooled testing for populations that are heterogeneous with respect to an individual's infection probability and utility that materializes if included in a negative test. We show that the welfare gain from overlapping testing over non-overlapping testing is bounded. Moreover, non-overlapping allocations, which are both conceptually and logistically simpler to implement, are empirically near-optimal, and we design a heuristic mechanism for finding these near-optimal test allocations. In numerical experiments, we highlight the efficacy and viability of our heuristic in practice. We also implement and provide experimental evidence on the benefits of utility-weighted pooled testing in a real-world setting. Our pilot study at a higher education research institute in Mexico finds no evidence that performance and mental health outcomes of participants in our testing regime are worse than under the first-best counterfactual of full access for individuals without testing.

In 21, we study substitutes markets with budget constraints. Markets with multiple divisible goods have been studied widely from the perspective of revenue and welfare. In general, it is well known that envy-free revenue-maximal outcomes can result in lower welfare than competitive equilibrium outcomes. We study a market in which buyers have quasilinear utilities with linear substitutes valuations and budget constraints, and the seller must find prices and an envy-free allocation that maximise revenue or welfare. Our setup mirrors markets such as ad auctions and auctions for the exchange of financial assets. We prove that the unique competitive equilibrium prices are also envy-free revenue-maximal. This coincidence of maximal revenue and welfare is surprising and breaks down even when buyers have piecewise-linear valuations. We present a novel characterisation of the set of 'feasible' prices at which demand does not exceed supply, show that this set has an elementwise minimal price vector, and demonstrate that these prices maximise revenue and welfare. The proof also implies an algorithm for finding this unique price vector.

Online algorithms and learning

Participants: Matthieu Lerasle, Patrick Loiseau, Vianney Perchet, Dorian Baudry, Nadav Merlis

Learning in games

In 19, we study how to learn $ϵ$ -optimal strategies in zero-sum imperfect information games (IIG) with trajectory feedback. In this setting, players update their policies sequentially based on their observations over a fixed number of episodes, denoted by $T$ . Existing procedures suffer from high variance due to the use of importance sampling over sequences of actions (Steinberger et al., 2020; McAleer et al., 2022). To reduce this variance, we consider a fixed sampling approach, where players still update their policies over time, but with observations obtained through a given fixed sampling policy. Our approach is based on an adaptive Online Mirror Descent (OMD) algorithm that applies OMD locally to each information set, using individually decreasing learning rates and a regularized loss. We show that this approach guarantees a convergence rate of $\tilde{O} (T^{- 1 / 2})$ with high probability and has a near-optimal dependence on the game parameters when applied with the best theoretical choices of learning rates and sampling policies. To achieve these results, we generalize the notion of OMD stabilization, allowing for time-varying regularization with convex increments.
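For readers unfamiliar with OMD, the sketch below shows the textbook mirror-descent update on a probability simplex with the negative-entropy regularizer, applied at a single decision point with a decreasing learning rate. It is not the algorithm of the paper: the $1/\sqrt{t}$ schedule is an assumption made here, and the regularized loss and fixed sampling policy are not reproduced.

```python
import math
from typing import List


def omd_entropic_step(policy: List[float], loss: List[float], lr: float) -> List[float]:
    """One mirror-descent step on the simplex with the negative-entropy regularizer."""
    # Closed form of the update: multiplicative weights followed by normalization.
    weights = [p * math.exp(-lr * l) for p, l in zip(policy, loss)]
    total = sum(weights)
    return [w / total for w in weights]


def run_local_omd(losses: List[List[float]], base_lr: float = 1.0) -> List[float]:
    """Run the step at one decision point with learning rate base_lr / sqrt(t) (assumed schedule)."""
    n_actions = len(losses[0])
    policy = [1.0 / n_actions] * n_actions   # start from the uniform policy
    for t, loss in enumerate(losses, start=1):
        policy = omd_entropic_step(policy, loss, base_lr / math.sqrt(t))
    return policy
```

Applying such an update independently at each information set, each with its own learning rate, is the "local" aspect referred to above; the specific rates and loss estimates that yield the stated guarantees are those of the paper and are not shown here.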

Bandits, control and reinforcement learning

In 10, we consider the problem of regret minimization in non-parametric stochastic bandits. When the rewards are known to be bounded from above, there exist asymptotically optimal algorithms, with asymptotic regret depending on an infimum of Kullback-Leibler divergences (KL). These algorithms are computationally expensive and require storing all past rewards, thus simpler but non-optimal algorithms are often used instead. We introduce several methods to approximate the infimum KL which drastically reduce the computational and memory costs of existing optimal algorithms, while keeping their regret guarantees. We apply our findings to design new variants of the MED and IMED algorithms, and demonstrate their interest with extensive numerical simulations.

In 17, we introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time. We consider special cases of the model, with a focus on logistic DCMDPs, which break the exponential dependence on history length by leveraging aggregation functions to determine context transitions. This special structure allows us to derive an upper-confidence-bound style algorithm for which we establish regret bounds. Motivated by our theoretical results, we introduce a practical model-based algorithm for logistic DCMDPs that plans in a latent space and uses optimism over history-dependent features. We demonstrate the efficacy of our approach on a recommendation task (using MovieLens data) where user behavior dynamics evolve in response to recommendations.

In [4], we consider the diffusive limit of a typical pure-jump Markovian control problem as the intensity of the driving Poisson process tends to infinity. We show that the convergence speed is governed by the Hölder constant of the Hessian of the limit problem, and explain how correction terms can be constructed. This provides an alternative, efficient method for the numerical approximation of the optimal control of a pure-jump problem when the jump intensity is very high. We illustrate this approach in the context of a display advertising auction problem.

Online (and offline) algorithms

In [15], we study online algorithms with advice querying under a budget constraint. Several problems have been extensively studied in the learning-augmented setting, where the algorithm has access to some, possibly incorrect, predictions. However, most works assume that the predictions are provided to the algorithm as input, with no constraint on their size. In this paper, we consider algorithms with access to a limited number of predictions, which they can request at any time during their execution. We study three classical problems in competitive analysis: the ski rental problem, the secretary problem, and non-clairvoyant job scheduling. We address the question of when to query predictions and how to use them.
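As background for the first of these problems, the following Python sketch implements the classical prediction-free break-even strategy for ski rental: rent at unit cost per day until the rent paid would reach the purchase price, then buy. It illustrates the worst-case baseline that predictions aim to improve on, not the budgeted-query algorithms of the paper; the function names are hypothetical.

```python
def break_even_cost(buy_price, ski_days):
    """Cost of renting (1 per day) for buy_price - 1 days, then buying on day buy_price."""
    if ski_days < buy_price:
        return ski_days  # the season ended before the purchase day
    return (buy_price - 1) + buy_price

def optimal_cost(buy_price, ski_days):
    """Cost of the offline optimum, which knows ski_days in advance."""
    return min(ski_days, buy_price)

# The break-even strategy is (2 - 1/buy_price)-competitive; check it for buy_price = 10.
buy_price = 10
worst_ratio = max(
    break_even_cost(buy_price, d) / optimal_cost(buy_price, d) for d in range(1, 101)
)
```

A prediction of the number of ski days can lower this ratio when it is accurate, which is why the timing of a limited number of queries matters.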

In [16], we also study online algorithms with predictions. A popular approach to go beyond the worst-case analysis of online algorithms is to assume the existence of predictions that can be leveraged to improve performance. Those predictions are usually given by external sources that cannot be fully trusted. Instead, we argue that trustworthy predictions can be built by the algorithms themselves while they run. We investigate this idea in the illustrative context of static scheduling with exponential job sizes. We prove that algorithms agnostic to this structure do not perform better than in the worst case. In contrast, when the expected job sizes are known, we show that the best algorithm using this information, called Follow-The-Perfect-Prediction (FTPP), performs much better. We then introduce two adaptive explore-then-commit algorithms: both first (partially) learn the expected job sizes and then follow FTPP once their self-predictions are confident enough. On the one hand, ETCU explores in "series", by completing jobs sequentially to acquire information. On the other hand, ETCRR, inspired by the optimal worst-case algorithm Round-Robin (RR), explores efficiently in "parallel". We prove that both of them asymptotically reach the performance of FTPP, with a faster rate for ETCRR. Those findings are empirically evaluated on synthetic data.
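The gap between an informed order and Round-Robin can be seen on a toy instance. The Python sketch below compares the sum of completion times when jobs are run to completion one after another in a given order with the sum obtained when the processor is shared equally among unfinished jobs. It assumes deterministic, known job sizes and a single processor, which is a simplification of the exponential-size setting of the paper; the function names are hypothetical.

```python
def total_completion_sequential(sizes, order):
    """Sum of completion times when jobs run to completion one after another in `order`."""
    clock, total = 0.0, 0.0
    for idx in order:
        clock += sizes[idx]
        total += clock
    return total

def total_completion_round_robin(sizes):
    """Sum of completion times when unfinished jobs share the processor equally."""
    remaining = sorted(sizes)
    n = len(remaining)
    clock, total, done = 0.0, 0.0, 0.0
    for i, size in enumerate(remaining):
        # n - i jobs are still running while the i-th smallest job finishes its work
        clock += (size - done) * (n - i)
        done = size
        total += clock
    return total

sizes = [1.0, 2.0, 3.0]
shortest_first = total_completion_sequential(sizes, [0, 1, 2])
round_robin = total_completion_round_robin(sizes)
```

On this instance, running the shortest job first gives a total of 10, while Round-Robin gives 14, the same as the worst sequential order.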

In [9], we study the scheduling of moldable tasks in the offline setting. Moldable tasks allow schedulers to determine the number of processors assigned to each task, thus enabling efficient use of large-scale parallel processing systems. We consider the problem of scheduling independent moldable tasks on processors and propose a new perspective on the existing speedup models: as the number $p$ of processors assigned to a task increases, the speedup is linear if $p$ is small and becomes sublinear after $p$ exceeds a threshold. Based on this, we propose an efficient approximation algorithm to minimize the makespan. As a by-product, we also propose an approximation algorithm to maximize the sum of values of tasks completed by a deadline; this scheduling objective is considered for moldable tasks for the first time, while similar work has been done for other types of parallel tasks.

Statistical estimation

In [11], we discuss an application of Stochastic Approximation to the statistical estimation of high-dimensional sparse parameters. The proposed solution reduces to solving a penalized stochastic optimization problem at each stage of a multistage algorithm, each problem being solved to a prescribed accuracy by the non-Euclidean Composite Stochastic Mirror Descent (CSMD) algorithm. Assuming that the problem objective is smooth and quadratically minorated and that stochastic perturbations are sub-Gaussian, our analysis prescribes the method parameters which ensure fast convergence of the estimation error (the radius of a confidence ball of a given norm around the approximate solution). This convergence is linear during the first "preliminary" phase of the routine and sublinear during the second "asymptotic" phase. We consider an application of the proposed approach to the sparse Generalized Linear Regression problem. In this setting, we show that the proposed algorithm attains the optimal convergence of the estimation error under weak assumptions on the regressor distribution. We also present a numerical study illustrating the performance of the algorithm on high-dimensional simulated data.

In [8], we study a change-point detection problem. Given a time series $𝐘$ in $ℝ^{n}$ with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is $\sqrt{2 \log\log(n)}$. When the energy is just above the detection threshold, the problem of localizing the change-point becomes purely parametric: it only depends on the difference in means and no longer on the position of the change-point. Interestingly, for most change-point positions, including all those away from the endpoints of the time series, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for the Hausdorff and $\ell_1$ estimation losses of the vector of all change-point positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well-spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as $O(n \log(n))$. Notably, these two procedures accommodate the presence of possibly many low-energy, and therefore undetectable, change-points and are still able to detect and localize high-energy change-points even in the presence of those nuisance parameters.
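As a point of comparison for the single change-point setting, the Python sketch below computes the classical normalized CUSUM scan statistic, which compares the left and right sample means at every candidate split. It is a textbook baseline, assuming unit-variance noise, and is neither the penalized least-squares estimator nor the multiscale post-processing procedure of the paper; the function name is hypothetical, and the threshold to compare the statistic against is left to the user.

```python
import math

def cusum_scan(y):
    """Return (best_split, max_stat) for the normalized CUSUM statistic over all splits."""
    n = len(y)
    total = sum(y)
    left = 0.0
    best_split, best_stat = None, 0.0
    for t in range(1, n):
        left += y[t - 1]
        mean_left = left / t
        mean_right = (total - left) / (n - t)
        # sqrt(t * (n - t) / n) standardizes the difference of means under unit variance
        stat = math.sqrt(t * (n - t) / n) * abs(mean_left - mean_right)
        if stat > best_stat:
            best_split, best_stat = t, stat
    return best_split, best_stat

# Noiseless toy series: the mean jumps from 0 to 3 after 50 observations.
split, stat = cusum_scan([0.0] * 50 + [3.0] * 50)
```

The scan runs in O(n) time thanks to the running sum; detection then amounts to comparing `stat` with a chosen threshold, and `split` serves as the location estimate.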

In [6], we study variable selection, monotone likelihood ratio and group sparsity. In the pivotal variable selection problem, we derive the exact non-asymptotic minimax selector over the class of all $s$-sparse vectors, which is also the Bayes selector with respect to the uniform prior. While this optimal selector is, in general, not realizable in polynomial time, we show that its tractable counterpart (the scan selector) attains the minimax expected Hamming risk to within a factor 2, and is also exact minimax with respect to the probability of wrong recovery. As a consequence, we establish explicit lower bounds under the monotone likelihood ratio property and we obtain a tight characterization of the minimax risk in terms of the best separable selector risk. We apply these general results to derive necessary and sufficient conditions for exact and almost full recovery in the location model with light-tailed distributions and in the problem of group variable selection under Gaussian noise.

Privacy, Fairness, and Transparency

Cristina Butucea, Patrick Loiseau, Vianney Perchet

In [14], we consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes, an assumption that is often unrealistic in practice. Instead, the decision-maker can purchase data that help estimate them from sources of different quality, and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $O(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated by the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.

In [12], we study the related problem of transparency, in the particular case of targeted advertising. Several targeted advertising platforms offer transparency mechanisms, but researchers and civil society organizations have repeatedly shown that those have major limitations. In this paper, we propose a collaborative ad transparency method to infer, without the cooperation of ad platforms, the targeting parameters used by advertisers to target their ads. Our idea is to ask users to donate data about their attributes and the ads they receive and to use this data to infer the targeting attributes of an ad campaign. We propose a Maximum Likelihood Estimator based on a simplified Bernoulli ad delivery model. We first test our inference method through controlled ad experiments on Facebook. Then, to further investigate the potential and limitations of collaborative ad transparency, we propose a simulation framework that allows varying key parameters. We validate that our framework gives accuracies consistent with real-world observations, so that the insights from our simulations are transferable to the real world. We then perform an extensive simulation study for ad campaigns that target a combination of two attributes. Our results show that we can obtain good accuracy whenever at least ten monitored users receive an ad. This usually requires a few thousand monitored users, regardless of population size. Our simulation framework is based on a new method to generate a synthetic population with statistical properties resembling the actual population, which may be of independent interest.

In [13], we also study a transparency problem related to fairness, but in the context of decentralized systems. In permissionless blockchains, transaction issuers include a fee to incentivize miners to include their transactions. To accurately estimate this prioritization fee for a transaction, transaction issuers (or blockchain participants, more generally) rely on two fundamental notions of transparency, namely contention and prioritization transparency. Contention transparency implies that participants are aware of every pending transaction that will contend with a given transaction for inclusion. Prioritization transparency states that the participants are aware of the transaction or prioritization fees paid by every such contending transaction. Neither of these notions of transparency holds well today. Private relay networks, for instance, allow users to send transactions privately to miners. Besides, users can offer fees to miners via either direct transfers to miners' wallets or off-chain payments, neither of which is public. In this work, we characterize the lack of contention and prioritization transparency in Bitcoin and Ethereum resulting from such practices. We show that private relay networks are widely used and private transactions are quite prevalent. We show that the lack of transparency makes it easier for miners to collude and overcharge users, who may use these private relay networks even though they offer little to no guarantee on transaction prioritization. The lack of these forms of transparency in blockchains has crucial implications for transaction issuers as well as for the stability of blockchains. Finally, we make our data sets and scripts publicly available.

In [5], we address the problem of variable selection in a high-dimensional but sparse mean model, under the additional constraint that only privatised data are available for inference. The original data are vectors with independent entries having a symmetric, strongly log-concave distribution on $ℝ$. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $α$-differential privacy. We provide lower and upper bounds on the rate of convergence for the expected Hamming loss over classes of at most $s$-sparse vectors whose non-zero coordinates are separated from 0 by a constant $a > 0$. As corollaries, we derive necessary and sufficient conditions (up to log factors) for exact recovery and for almost full recovery. When we restrict our attention to non-interactive mechanisms that act independently on each coordinate, our lower bound shows that, contrary to the non-private setting, both exact and almost full recovery are impossible whatever the value of $a$ in the high-dimensional regime such that $n α^{2} / d^{2} \leq 1$. However, in the regime $n α^{2} / d^{2} ≫ \log(d)$ we can exhibit a critical value $a^{*}$ (up to a logarithmic factor) such that exact and almost full recovery are possible for all $a ≫ a^{*}$ and impossible for $a \leq a^{*}$. We show that these results can be improved when allowing for all non-interactive locally $α$-differentially private mechanisms (that may act globally on all coordinates), in the sense that phase transitions occur at lower levels.

In [7], we study interactive versus non-interactive locally differentially private estimation for the quadratic functional. Local differential privacy has recently received increasing attention from the statistics community as a valuable tool to protect the privacy of individual data owners without the need for a trusted third party. Similar to the classical notion of randomized response, the idea is that data owners randomize their true information locally and only release the perturbed data. Many different protocols for such local perturbation procedures can be designed. In most estimation problems studied in the literature so far, however, no significant difference in terms of minimax risk could be observed between purely non-interactive protocols and protocols that allow for some amount of interaction between individual data providers. In this paper we show that for estimating the integrated square of a density, sequentially interactive procedures improve substantially over the best possible non-interactive procedure in terms of minimax rate of estimation. In particular, in the non-interactive scenario we identify an elbow in the minimax rate at $s = 3 / 4$, whereas in the sequentially interactive scenario the elbow is at $s = 1 / 2$. This is markedly different both from the case of direct observations, where the elbow is well known to be at $s = 1 / 4$, and from the case where Laplace noise is added to the original data, where an elbow at $s = 9 / 4$ is obtained. We also provide adaptive estimators that achieve the optimal rate up to log-factors, draw connections to non-parametric goodness-of-fit testing and estimation of more general integral functionals, and conduct a series of numerical experiments. The fact that a particular locally differentially private, but interactive, mechanism improves over the simple non-interactive one is also of great importance for practical implementations of local differential privacy.
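The randomized response mechanism mentioned above is the simplest example of a non-interactive local privacy protocol. In the Python sketch below, each data owner reports their true bit with probability e^α / (1 + e^α) and flips it otherwise, which satisfies local α-differential privacy, and the analyst debiases the average of the reports. This only illustrates the general principle, not the estimators of the quadratic functional studied in the paper; the function names and the toy population are hypothetical.

```python
import math
import random

def randomized_response(bit, alpha, rng):
    """Report the true bit with probability e^alpha / (1 + e^alpha), otherwise flip it."""
    p_truth = math.exp(alpha) / (1.0 + math.exp(alpha))
    return bit if rng.random() < p_truth else 1 - bit

def debiased_mean(reports, alpha):
    """Unbiased estimate of the true proportion of ones from the privatized reports."""
    p_truth = math.exp(alpha) / (1.0 + math.exp(alpha))
    observed = sum(reports) / len(reports)
    # E[report] = (2 * p_truth - 1) * mean + (1 - p_truth), which we invert here
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)

# Toy population: 30% of the 20000 data owners hold the bit 1.
rng = random.Random(0)
alpha = math.log(3.0)  # the truth is reported with probability 3/4
true_bits = [1] * 6000 + [0] * 14000
reports = [randomized_response(b, alpha, rng) for b in true_bits]
estimate = debiased_mean(reports, alpha)
```

The debiasing step divides by 2p - 1, so the variance of the estimate grows as α shrinks, which is the price paid for stronger privacy.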

Partnerships and cooperations

International research visitors

Visits of international scientists

Other international visits to the team

A. Rohde Professor

Freiburg University

Germany

Dates:

May 31 - June 1, 2023

Context of the visit:

Research stay

A. Meister Professor

Rostock University

Germany

Dates:

May 31 - June 1, 2023

Context of the visit:

Research stay

A. Celli Professor

Bocconi University

Italy

Dates:

Sept 11 - Sept 22, 2023

Context of the visit:

Research stay

Visits to international teams

Research stays abroad

Cristina Butucea

Visited institution:

Heidelberg University

Country:

Dates:

February 20-24, 2023

Context of the visit:

Research, discussions

Visited institution:

Nottingham University

Country:

Dates:

July 2-7, 2023

Context of the visit:

Research, discussions

Vianney Perchet

Visited institution:

MIT

Country:

Dates:

May 1 - May 7, 2023

Context of the visit:

Research, discussions & seminar

National initiatives

Foundry (PEPR IA)

Patrick Loiseau

Title:

Foundry: Foundation of robustness and reliability in AI

Partner Institution(s):

Inria

CNRS

Université Paris Dauphine

Institut Mines Telecom

ENS de Lyon

Date/Duration:

2023-2027 (4 years)

Additional info/keywords:

PEPR IA targeted project, 245k euros. Fairness, matching, auctions.

FairPlay (ANR JCJC)

Patrick Loiseau

Title:

FairPlay: Fair algorithms via game theory and sequential learning

Partner Institution(s):

Inria

Date/Duration:

2021-2025 (4 years)

Additional info/keywords:

ANR JCJC project, 245k euros. Fairness, matching, auctions.

Explainable and Responsible AI (MIAI chair)

Patrick Loiseau

Title:

Explainable and Responsible AI chair of the MIAI @ Grenoble Alpes institute

Partner Institution(s):

Univ. Grenoble Alpes

Date/Duration:

2019-2023 (4 years)

Additional info/keywords:

Chair of the MIAI @ Grenoble Alpes institute co-held by Patrick Loiseau. Fairness, privacy.

BOLD (ANR)

Vianney Perchet

Title:

BOLD: Beyond Online Learning for Better Decisions

Partner Institution(s):

Crest, Genes

Date/Duration:

2019-2024 (4.5 years)

Additional info/keywords:

ANR project, 270k euros. Online learning, optimization, bandits.

Dissemination

Promoting scientific activities

Scientific events: organisation

General chair, scientific chair

Vianney Perchet

Title:

From matchings to markets. A tale of Mathematics, Economics and Computer Science.

Partner Institution(s):

Crest, Genes

Date/Duration:

December 2023

Location:

CIRM, Marseille

Scientific events: selection

Member of the conference program committees

Patrick Loiseau:

NeurIPS, ECML-PKDD, EWAF

Vianney Perchet:

NeurIPS, ICLR, ICML, COLT, ALT

Hugo Richard:

NeurIPS, ICML, AISTATS

Marc Abeille:

ALT

Clément Calauzènes:

ICML, NeurIPS

Benjamin Heymann:

NeurIPS, AISTATS

Maxime Vono:

NeurIPS

Journal

Member of the editorial boards

Vianney Perchet:

Foundations and Trends in Machine Learning, Operations Research, Operations Research Letters, Journal of Machine Learning Research, Journal of Dynamics and Games

Cristina Butucea:

Annals of Statistics, Bernoulli

Reviewer - reviewing activities

Patrick Loiseau:

Journal of Machine Learning Research, Mathematics of Operations Research, Games and Economic Behavior

Vianney Perchet:

Annals of Statistics, Mathematics of Operations Research, Journal of the ACM

Matthieu Lerasle

Annals of Statistics, Journal of the European Mathematical Society, Probability and Related Fields, Journal of Machine Learning Research, Journal of the American Statistical Association

Marc Abeille:

Journal of Machine Learning Research

Maxime Vono

Journal of Computational and Graphical Statistics

Benjamin Heymann:

SIAM Control and Optimization, International Journal of Game Theory, Mathematical Reviews

Invited talks

Patrick Loiseau:

Alpine Game Theory Symposium (Grenoble), Columbia University, Ethics of Public Robots and AI (Skema)

Vianney Perchet:

Alpine Game Theory Symposium (Grenoble), Algorithms, Learning, and Games (Sicily), Optimization and Statistical Learning (les Houches), Mathematics of Data Science (Singapore), Seminar of the Maths Department of MIT (Boston), Games Learning and Networks (Singapore), Optimization 2023 (Seattle), Workshop on Information & Learning in Decisions & Operations (INSEAD), FILOFOCS 2023 (Paris), R. Cominetti's Feist (Chile), NeurIPS Workshop I cannot Believe It's Not Better (New Orleans), New Methods in Statistics (Marseille)

Matthieu Lerasle

Journées Statistiques du Sud, Journées ALEA, Colloquium Université Rouen, Séminaire Université du Luxembourg, Séminaire Geneva School of Economics and Management.

Cristina Butucea

University of Vienna, Séminaire de statistique IHP, Anniversary Conference M. Neumann University of Bamberg, Workshop on Tests and Bandits University of Potsdam

Marc Abeille:

UpperBound conference (2023), Reinforcement Learning Theory seminars

Benjamin Heymann:

Fundamental Challenges in Causality seminar workshop, Causality in Practice workshop, Causality Discussion Group

Maxime Vono:

Federated Learning One World Seminar

Scientific expertise

Vianney Perchet:

Expert for the evaluation of the LABEX MME:DII

Cristina Butucea:

Reviewer for tenure committee Harvard University, hiring committees France

Teaching - Supervision - Juries

Supervision

Patrick Loiseau:

PhD students: Rémi Castera, Mathieu Molina, Mélissa Tamine; postdocs: Felipe Garrido Lucero, Simon Finster, Denis Sokolov

Vianney Perchet:

PhD students: Sasila Ilandarideva, Flore Sentenac, Côme Fiegel, Maria Cherifa, Mathieu Molina, Ziyad Benomar, Mike Liu, Hafedh El Ferchichi; postdocs: Felipe Garrido Lucero, Nadav Merlis, Dorian Baudry

Matthieu Lerasle

PhD Students: Clara Carlier, Hugo Chardon, Hafedh El Ferchichi.

Cristina Butucea

PhD students: Nayel Bettache, Henning Stein

Marc Abeille:

Lorenzo Croissant

Clément Calauzènes:

Morgane Goibert, Maria Cherifa

Benjamin Heymann:

Mélissa Tamine

Maxime Vono:

Mélissa Tamine

Juries

Patrick Loiseau:

PhD jury: A. Bardou (reviewer), V. Do (reviewer)

Vianney Perchet:

PhD jury: J. Achddou (reviewer), D. Beaudry, A. Bismuth, C.-S. Gauthier (reviewer), H. Dakdouk (reviewer), S. Gaucher, F. Hu, G. Rizk, P. Muller, HDR jury: A. Simonetto (reviewer)

Matthieu Lerasle

PhD Jury: S. Arradi-Alaoui (reviewer), J. Cheng. HDR Jury: P. Mozharovskyi, A. Sabourin (internal reviewer).

Cristina Butucea

PhD Jury: E. Pilliat (Montpellier), C. Deslandes (CMAP)

Teaching

ENSAE:

Advanced Optimization

Third year, lectures

Algorithm Design and Analysis

Third year, lectures

Theoretical Foundations of Machine Learning

Second year, lectures

Stopping time and online algorithms

Third year, lectures

Statistics

(ML) 1st and second year

Nonparametric Statistics

3rd year, M2

Mathematical Foundations of Probabilities

1st year

Programming project

1st year

Ecole Polytechnique:

INF421: design and analysis of algorithms

(Patrick Loiseau). Second-year level, PCs.

INF581: Advanced Machine Learning and Autonomous Agents

(Patrick Loiseau). Third-year/M1 level, lectures and labs.

MAP433: Statistics

(ML). First-year cycle polytechnicien, PCs.

MAP576: Learning Theory

(ML). Second-year cycle polytechnicien, Lecture.

Université Paris-Saclay:

High Dimensional Probability

(ML). Master 2

Stopping Time and Random Algorithm

(ML). Master 2

PSL:

Introduction to machine learning

(Hugo Richard). L3 level, Lectures and labs.

Master IASD:

Recommender Systems

(Clément Calauzènes). Master 2, Lectures.

[1] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko. Local and adaptive mirror descents in extensive-form games. 2023 International Conference on Machine Learning, New Orleans, United States, September 2023.
[2] Mathieu Molina, Nicolas Gast, Patrick Loiseau, Vianney Perchet. Trading-off price for data quality to achieve fair online allocation. NeurIPS 2023 - 37th Conference on Neural Information Processing Systems, New Orleans, United States, 2023, 1-43.
[3] Nicolas Verzelen, Magalie Fromont, Matthieu Lerasle, Patricia Reynaud-Bouret. Optimal Change-Point Detection and Localization. Annals of Statistics, 2023, 51(4), 1586-1610.
[4] Marc Abeille, Bruno Bouchard, Lorenzo Croissant. Diffusive limit approximation of pure-jump optimal stochastic control problems. Journal of Optimization Theory and Applications, January 2023.
[5] Cristina Butucea, Amandine Dubois, Adrien Saumard. Phase transitions for support recovery under local differential privacy. Mathematical Statistics and Learning, June 2023, 6(1), 1-50.
[6] Cristina Butucea, Enno Mammen, Mohamed Ndaoud, Alexandre Tsybakov. Variable selection, monotone likelihood ratio and group sparsity. The Annals of Statistics, February 2023, 51(1).
[7] Cristina Butucea, Angelika Rohde, Lukas Steinberger. Interactive versus noninteractive locally differentially private estimation: Two elbows for the quadratic functional. The Annals of Statistics, April 2023, 51(2).
[8] Nicolas Verzelen, Magalie Fromont, Matthieu Lerasle, Patricia Reynaud-Bouret. Optimal Change-Point Detection and Localization. Annals of Statistics, 2023, 51(4), 1586-1610.
[9] Xiaohu Wu, Patrick Loiseau. Efficient approximation algorithms for scheduling moldable tasks. European Journal of Operational Research, October 2023, 310(1), 71-83.
[10] Dorian Baudry, Fabien Pesquerel, Rémy Degenne, Odalric-Ambrym Maillard. Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits. Thirty-seventh Conference on Neural Information Processing Systems, New Orleans (Louisiana), United States, December 2023.
[11] Yannis Bekri, Sasila Ilandarideva, Anatoli B. Juditsky, Vianney Perchet. Stochastic Mirror Descent for Large-Scale Sparse Recovery. 26th International Conference on Artificial Intelligence and Statistics (AISTATS), Valencia, Spain, April 2023.
[12] Eleni Gkiouzepi, Athanasios Andreou, Oana Goga, Patrick Loiseau. Collaborative Ad Transparency: Promises and Limitations. SP 2023 - 44th IEEE Symposium on Security and Privacy, San Francisco, United States, May 2023.
[13] Johnnatan Messias, Vabuk Pahari, Balakrishnan Chandrasekaran, Krishna P Gummadi, Patrick Loiseau. Dissecting Bitcoin and Ethereum Transactions: On the Lack of Transaction Contention and Prioritization Transparency in Blockchains. FC 2023 - Financial Cryptography and Data Security 2023, Bol, Brač, Croatia, May 2023.
[14] Mathieu Molina, Nicolas Gast, Patrick Loiseau, Vianney Perchet. Trading-off price for data quality to achieve fair online allocation. NeurIPS 2023 - 37th Conference on Neural Information Processing Systems, New Orleans, United States, 2023, 1-43.
[15] Vianney Perchet, Ziyad Benomar. Advice Querying under Budget Constraint for Online Algorithms. NeurIPS 2023 - 37th Conference on Neural Information Processing Systems, New Orleans, United States, December 2023.
[16] Hugo Richard, Flore Sentenac, Corentin Odic, Mathieu Molina, Vianney Perchet. Static Scheduling with Predictions Learned through Efficient Exploration. 2023 International Conference on Machine Learning, Honolulu (Hawaii), United States, May 2022.
[17] Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier. Reinforcement Learning with History-Dependent Dynamic Contexts. ICML, Honolulu, United States, May 2023.
[18] Artem Betlei, Mariia Vladimirova, Mehdi Sebbar, Nicolas Urien, Thibaud Rahier, Benjamin Heymann. Maximizing the Success Probability of Policy Allocations in Online Systems. AAAI2024, Vancouver, Canada, 2023, arXiv.
[19] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko. Local and adaptive mirror descents in extensive-form games. 2023 International Conference on Machine Learning, New Orleans, United States, September 2023.
[20] Simon Finster, Michelle González Amador, Edwin Lock, Francisco Marmolejo-Cossío, Evi Micha, Ariel D. Procaccia. Welfare-Maximizing Pooled Testing. EC 2023 - The 24th ACM Conference on Economics and Computation, London, United Kingdom, 2023.
[21] Simon Finster, Paul Goldberg, Edwin Lock. Substitutes markets with budget constraints: solving for competitive and optimal prices. WINE 2023 - The 19th Conference On Web And InterNet Economics, Shanghai, China, 2023.
[22] Amandine Dubois, Thomas Berrett, Cristina Butucea. Foundations of Modern Statistics. Springer Proceedings in Mathematics & Statistics, 2023, Springer International Publishing, 425, 53-119.
[23] Denis Belomestny, Cristina Butucea, Enno Mammen, Eric Moulines, Markus Reiß, Vladimir Ulyanov. Foundations of Modern Statistics. Springer Proceedings in Mathematics & Statistics, Festschrift in Honor of Vladimir Spokoiny, 2023, Springer International Publishing, 425.
[24] Dorian Baudry, Nadav Merlis, Mathieu Molina, Hugo Richard, Vianney Perchet. Multi-Armed Bandits with Guaranteed Revenue per Arm. January 2024.
[25] Ziyad Benomar, Evgenii Chzhen, Nicolas Schreuder, Vianney Perchet. Addressing bias in online selection with limited budget of comparisons. November 2023.
[26] Nayel Bettache, Cristina Butucea. Two-sided Matrix Regression. March 2023.
[27] Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Clément Hardy. Simultaneous off-the-grid learning of mixtures issued from a continuous dictionary. January 2024.
[28] Rémi Castera, Patrick Loiseau, Bary Pradelski. Statistical Discrimination in Stable Matching. April 2023.
[29] Lorenzo Croissant, Marc Abeille, Bruno Bouchard. Near-continuous time Reinforcement Learning for continuous state-action spaces. September 2023.
[30] Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko. Adapting to game trees in zero-sum imperfect information games. December 2022.
[31] Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet. DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation. June 2023.
[32] Flore Sentenac, Nathan Noiry, Matthieu Lerasle, Laurent Ménard, Vianney Perchet. Online Matching in Geometric Random Graphs. October 2023.
[33] 24 CFR § 100.75 - Discriminatory advertisements, statements and notices.
[34] Emmanuel Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 2017, 18(1), 6446-6531.
[35] Muhammad Ali, Piotr Sapiezynski, Miranda Bogen, Aleksandra Korolova, Alan Mislove, Aaron Rieke. Discrimination through optimization: How Facebook's ad delivery can lead to skewed outcomes. 2019.
[36] Noga Alon, Nicolo Cesa-Bianchi, Ofer Dekel, Tomer Koren. Online learning with feedback graphs: Beyond bandits. 2015, 40.
[37] Itai Ashlagi, Maximilien Burq, Patrick Jaillet, Vahideh Manshadi. On matching and thickness in heterogeneous dynamic markets. Operations Research, 2019, 67(4), 927-949.
[38] Itai Ashlagi, Patrick Jaillet, Vahideh H. Manshadi. Kidney Exchange in Dynamic Sparse Heterogenous Pools. 2013.
[39] Pranjal Awasthi, Tuomas Sandholm. Online stochastic optimization in the large: Application to kidney exchange. 2009.
[40] Bahman Bahmani, Michael Kapralov. Improved bounds for online stochastic matching. 2010, 170-181.
[41] E. del Barrio, F. Gamboa, P. Gordaliza, J.-M. Loubes. Obtaining fairness using optimal transport theory. 2018, 1-25.
[42] Benjamin Birnbaum, Claire Mathieu. On-line bipartite matching made simple. ACM SIGACT News, 2008, 39(1), 80-87.
[43] Bela Bollobas, Graham Brightwell. The width of random graph orders. The Mathematical Scientist, January 1995, 20.
[44] Charles Bordenave, Marc Lelarge, Justin Salez. Matchings on infinite graphs. 2011.
[45] A. Borodin, R. El-Yaniv. Online Computation and Competitive Analysis. 1998, Cambridge University Press.
[46] Allan Borodin, Christodoulos Karavasilis, Denis Pankratov. An Experimental Study of Algorithms for Online Bipartite Matching. 2018.
[47] E. Boursier, V. Perchet. SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits. 2018, 1-31.
[48] Etienne Boursier, Vianney Perchet. Utility/Privacy Trade-off through the lens of Optimal Transport. Proceedings of Machine Learning Research, August 2020, PMLR, 108, 591-601.
[49] S. Bubeck, N. Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Machine Learning, 2012, 5(1), 1-122.
[50] S. Bubeck, V. Perchet, P. Rigollet. Bounded regret in stochastic multi-armed bandits. Journal of Machine Learning Research: Workshop and Conference Proceedings (COLT), 2013, 30, 122-134.
[51] Niv Buchbinder, Kamal Jain, Joseph (Seffi) Naor. Online Primal-Dual Algorithms for Maximizing Ad-Auctions Revenue. 2007, 253-264.
[52] Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard, and others. Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids. Bernoulli, 2020, 26(3), 1727-1764.
[53] Cristina Butucea, Angelika Rohde, Lukas Steinberger. Interactive versus non-interactive locally, differentially private estimation: Two elbows for the quadratic functional. Annals of Stats, 2023.
[54] L Elisa Celis, Anay Mehrotra, Nisheeth K Vishnoi. Toward Controlling Discrimination in Online Ad Auctions. 2019.
[55] N. Cesa-Bianchi, C. Gentile, Y. Mansour. Regret minimization for reserve prices in second-price auctions. 2013.
[56] Nicolò Cesa-Bianchi, Gabor Lugosi. Prediction, Learning, and Games. 2006, Cambridge University Press.
[57] K. Chaudhuri, J. Imola, A. Machanavajjhala. Capacity bounded differential privacy. 2019, 3469-3478.
[58] Shuchi Chawla, Meena Jagadeesan. Fairness in ad auctions through inverse proportionality. 2020.
[59] Ilan Reuven Cohen, David Wajc. Randomized online matching in regular graphs. 2018, 960-979.
[60] Vincent Conitzer, Christian Kroer, Debmalya Panigrahi, Okke Schrijvers, Eric Sodomka, Nicolas E Stier-Moses, Chris Wilkens. Pacing equilibrium in first-price auction markets. 2019.
[61] Vincent Conitzer, Christian Kroer, Eric Sodomka, Nicolas E Stier-Moses. Multiplicative pacing equilibria in auction markets. 2018.
[62] José Correa, Paul Dütting, Felix Fischer, Kevin Schewior. Prophet inequalities for iid random variables from an unknown distribution. 2019, 3-17.
[63] Rachel Cummings, Varun Gupta, Dhamma Kimpara, Jamie Morgenstern. On the Compatibility of Privacy and Fairness. 2019.
[64] Jeffrey Dastin. Amazon scraps secret AI recruiting tool that showed bias against women. 2018.
[65] C. Dwork. Differential privacy. Encyclopedia of Cryptography and Security, 2011, 338-340.
[66] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Richard Zemel. Fairness through awareness. 2012.
[67] Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith. Calibrating noise to sensitivity in private data analysis. 2006, 265-284.
[68] R. Eilat, K. Eliaz, X. Mu. Optimal Privacy-Constrained Mechanisms. 2019.
[69] Vitalii Emelianov, Nicolas Gast, Krishna P. Gummadi, Patrick Loiseau. On the Effect of Positive Discrimination on Multistage Selection Problems in the Presence of Implicit Variance. 2020.
[70] Jon Feldman, Aranyak Mehta, Vahab Mirrokni, S. Muthukrishnan. Online Stochastic Matching: Beating 1-1/e. 2009.
[71] Kadija Ferryman, Mikaela Pitcan. Fairness in Precision Medicine. 2018.
[72] D. Fudenberg, J. Tirole. Game Theory. 1991, MIT Press.
[73] Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta. Local Differentially Private Regret Minimization in Reinforcement Learning. arXiv preprint arXiv:2010.07778, 2020.
[74] Nicolas Gast, Stratis Ioannidis, Patrick Loiseau, Benjamin Roussillon. Linear Regression from Strategic Data Sources. ACM Transactions on Economics and Computation, May 2020, 8(2), 10:1-10:24.
[75] Nicolas Gast, Benny Van Houdt. A Refined Mean Field Approximation. 2017.
[76] Maya Gupta, Andrew Cotter, Mahdi Milani Fard, Serena Wang. Proxy fairness. arXiv preprint arXiv:1806.11212, 2018.
[77] Anikó Hannák, Claudia Wagner, David Garcia, Alan Mislove, Markus Strohmaier, Christo Wilson. Bias in online freelance marketplaces: Evidence from taskrabbit and fiverr. 2017.
[78] Moritz Hardt, Eric Price, Nathan Srebro. Equality of Opportunity in Supervised Learning. 2016.
[79] Chien-Ju Ho, Jennifer Wortman Vaughan. Online task assignment in crowdsourcing markets. 2012.
[80] Christina Ilvento, Meena Jagadeesan, Shuchi Chawla. Multi-Category Fairness in Sponsored Search Auctions. 2020.
[81] Krishnamurthy Iyer, Ramesh Johari, Mukund Sundararajan. Mean field equilibria of dynamic auctions with learning. Management Science, 2014, 60(12), 2949-2970.
[82] Patrick Jaillet, Xin Lu. Online stochastic matching: New algorithms with better bounds. Mathematics of Operations Research, 2014, 39(3), 624-646.
[83] Julie Josse, Nicolas Prost, Erwan Scornet, Gaël Varoquaux. On the consistency of supervised learning with missing values. arXiv preprint arXiv:1902.06931, 2019.
[84] Nathan Kallus, Xiaojie Mao, Angela Zhou. Assessing algorithmic fairness with unobserved protected class using data combination. arXiv preprint arXiv:1906.00285, 2019.
[85] Chinmay Karande, Aranyak Mehta, Pushkar Tripathi. Online bipartite matching with unknown distributions. 2011, 587-596.
[86] Richard M Karp, Umesh V Vazirani, Vijay V Vazirani. An optimal algorithm for on-line bipartite matching. 1990, 352-358.
[87] Avoiding discrimination through causal reasoning N. Niki Kilbertus M. R. Mateo Rojas Carulla G.
Giambattista Parascandolo M. Moritz Hardt D. Dominik Janzing B. Bernhard Schölkopf 2017 656--666 Model-Agnostic Characterization of Fairness Trade-offs J. S. Joon Sik Kim J. Jiahao Chen A. Ameet Talwalkar arXiv preprint arXiv:2004.03424 2020 Stable Marriage and Its Relation to Other Combinatorial Problems: An Introduction to the Mathematical Analysis of Algorithms D. D.E. Knuth 1996 English translation, (CRM Proceedings and Lecture Notes), American Mathematical Society Auction Theory V. V. Krishna 2009 Elsevier Counterfactual fairness M. J. Matt J Kusner J. Joshua Loftus C. Chris Russell R. Ricardo Silva 2017 4066--4076 The long road to fairer algorithms M. J. Matt J Kusner J. R. Joshua R Loftus 2020 Algorithmic Bias? An Empirical Study of Apparent Gender-Based Discrimination in the Display of STEM Career Ads A. Anja Lambrecht C. Catherine Tucker Management Science 2019 How We Analyzed the COMPAS Recidivism Algorithm J. Jeff Larson S. Surya Mattu L. Lauren Kirchner J. Julia Angwin 2016 Competing Bandits in Matching Markets L. L. Liu H. H. Mania M. I. M. I.| Jordan 2019 1--15 Business cycle dynamics under rational inattention B. B. Maćkowiak M. M. Wiederholt The Review of Economic Studies 2015 82 4 1502--1532 Online bipartite matching with random arrivals: an approach based on strongly factor-revealing lps M. Mohammad Mahdian Q. Qiqi Yan 2011 597--606 Online stochastic matching: Online actions based on offline statistics V. H. Vahideh H Manshadi S. O. Shayan Oveis Gharan A. Amin Saberi Mathematics of Operations Research 2012 37 4 559--573 Microeconomic Theory A. A. Mas-Colell M. M. Whinston J. J. Green 1995 Oxford University Press Greedy online bipartite matching on random graphs A. Andrew Mastin P. Patrick Jaillet arXiv preprint arXiv:1307.2536 2013 Rational inattention to discrete choices: A new foundation for the multinomial logit model F. F. Matėjka A. A. McKay American Economic Review 2015 105 1 272--98 Online Matching and Ad Allocation A. Aranyak Mehta Found. 
Trends Theor. Comput. Sci. October 2013 8 4 265–368 Repeated Games J.-F. Jean-Francois Mertens S. Sylvain Sorin S. Shmuel Zamir Econometric Society Monographs 2015 Cambridge University Press Algorithms for the Assignment and Transportation Problems J. J. Munkres Journal of the Society for Industrial and Applied Mathematics 1957 5 1 32--38 Bidding Strategies with Gender Nondiscrimination Constraints for Online Ad Auctions M. Milad Nasr M. C. Michael Carl Tschantz 2020 The bidder’s standpoint : a simple way to improve bidding strategies in revenue-maximizing auctions T. T. Nedelec M. M. Abeille C. C. Calauzenes B. B. Heymann V. V. Perchet N. N. El Karoui 2019 Learning to bid in revenue-maximizing auctions T. T. Nedelec N. N. El Karoui V. V. Perchet 2019 4781--4789 Algorithmic Game Theory N. N. Nisan T. T. Roughgarden E. E. Tardos V. V. Vazirani 2007 Cambridge University Press A differential game on Wasserstein space. Application to weak approachability with partial monitoring V. V. Perchet M. M. Quincampoix Journal of Dynamics and Games 2019 6 65--85 The multi-armed bandit problem with covariates V. V. Perchet P. P. Rigollet Annals of Statistics 2013 41 693--721 Elements of causal inference J. Jonas Peters D. Dominik Janzing B. Bernhard Schölkopf 2017 The MIT Press Computational Optimal Transport G. G. Peyré M. M. Cuturi 2018 ArXiv:1803.00567 Fair Inputs and Fair Outputs: The Incompatibility of Fairness in Privacy and Accuracy B. Bashir Rastegarpanah M. Mark Crovella K. Krishna Gummadi 2020 Distance makes the types grow stronger: a calculus for differential privacy J. J. Reed B. B.C. Pierce 2010 45 157--168 Geometrizing rates of convergence under local differential privacy constraints A. Angelika Rohde L. Lukas Steinberger others Annals of Statistics 2020 48 5 2646--2670 What's in a Name? Reducing Bias in Bios without Access to Protected Attributes A. Alexey Romanov M. Maria De-Arteaga H. Hanna Wallach J. Jennifer Chayes C. Christian Borgs A. 
Alexandra Chouldechova S. Sahin Geyik K. Krishnaram Kenthapadi A. Anna Rumshisky A. T. Adam Tauman Kalai arXiv preprint arXiv:1904.05233 2019 Algorithms that" Don't See Color": Comparing Biases in Lookalike and Special Ad Audiences P. Piotr Sapiezynski A. Avijit Gosh L. Levi Kaplan A. Alan Mislove A. Aaron Rieke arXiv preprint arXiv:1912.07579 2019 Implications of rational inattention C. C.A. Sims Journal of monetary Economics 2003 50 3 665--690 On the Potential for Discrimination in Online Targeted Advertising T. Till Speicher M. Muhammad Ali G. Giridhari Venkatadri F. N. Filipe Nunes Ribeiro G. George Arvanitakis F. Fabr\'icio Benevenuto K. P. Krishna P. Gummadi P. Patrick Loiseau A. Alan Mislove 2018 Topics in optimal transportation C. C. Villani 2003 Graduate studies in Mathematics, AMS 58 Online learning in repeated auctions J. J. Weed V. V. Perchet P. P. Rigollet 2016 Learning Non-Discriminatory Predictors B. Blake Woodworth S. Suriya Gunasekar M. I. Mesrob I. Ohannessian N. Nathan Srebro 2017 Counterfactual Fairness: Unidentification, Bound and Algorithm. Y. Yongkai Wu L. Lu Zhang X. Xintao Wu 2019 1438--1444 Pc-fairness: A unified framework for measuring causality-based fairness Y. Yongkai Wu L. Lu Zhang X. Xintao Wu H. Hanghang Tong 2019 3404--3414 Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification Without Disparate Mistreatment M. B. Muhammad Bilal Zafar I. Isabel Valera M. Manuel Gomez Rodriguez K. P. Krishna P. Gummadi 2017 From Parity to Preference-based Notions of Fairness in Classification M. B. Muhammad Bilal Zafar I. Isabel Valera M. Manuel Gomez Rodriguez K. P. Krishna P. Gummadi A. Adrian Weller 2017 Learning Fair Representations R. Rich Zemel Y. Yu Wu K. Kevin Swersky T. Toni Pitassi C. Cynthia Dwork 2013