Large distributed infrastructures are pervasive in our society. Numerical simulations form the basis of computational sciences, and high-performance computing infrastructures have become scientific instruments with roles similar to those of test tubes or telescopes. Cloud infrastructures are used by companies so intensively that even the shortest outage quickly incurs losses of several million dollars. But every citizen also relies on (and interacts with) such infrastructures via complex wireless mobile embedded devices whose nature is constantly evolving. In this way, the advent of digital miniaturization and interconnection has enabled our homes, power stations, cars and bikes to evolve into smart grids and smart transportation systems that should be optimized to fulfill societal expectations.

Our dependence on, and intense usage of, such gigantic systems naturally leads to very high expectations in terms of performance. Indeed, we strive for low-cost and energy-efficient systems that seamlessly adapt to changing environments that can only be accessed through uncertain measurements. Such digital systems also have to take into account users' profiles and expectations to share resources efficiently, fairly, and in an online way. Analyzing, designing and provisioning such systems has thus become a real challenge.

Such systems are characterized by their
ever-growing size,
intrinsic heterogeneity and distributedness,
user-driven requirements,
and an unpredictable variability that renders them essentially stochastic.
In such contexts, many of the former design and analysis
hypotheses (homogeneity, limited hierarchy, omniscient view,
optimization carried out by a single entity, open-loop
optimization, user outside of the picture) have become obsolete, which
calls for radically new approaches. Properly studying such systems
requires a drastic rethinking of fundamental aspects regarding the system's
observation (measure, trace, methodology, design of experiments),
analysis (modeling, simulation, trace analysis and visualization),
and optimization (distributed, online, stochastic).

The goal of the POLARIS project is to contribute to the understanding of the performance of very large scale
distributed systems by applying ideas from diverse research fields and application domains.
We believe that studying all these different aspects at once, without restricting ourselves to specific systems, is key to pushing forward our understanding of such challenges and to proposing innovative solutions.
This is why we intend to investigate problems arising from application
domains as varied as large computing systems, wireless networks, smart
grids and transportation systems.

The members of the POLARIS project cover a very wide spectrum of expertise in performance evaluation and modeling, distributed optimization, and the analysis of HPC middleware, and have worked extensively on all of these topics.

AI and machine learning are now everywhere. Let us clarify how our research activities are positioned with respect to this trend.

A first line of research in POLARIS is devoted to the use of statistical learning techniques (e.g., Bayesian inference) to model the expected performance of distributed systems, to build aggregated performance views, to feed simulators of such systems, and to detect anomalous behaviors.

In a distributed context, it is also essential to design systems that can seamlessly adapt to the workload and to the evolving behavior of their components (users, resources, network). Obtaining faithful information on the dynamics of the system can be particularly difficult, which is why it is generally more effective to design systems that dynamically learn the best actions to play through trial and error. A key characteristic of the work in the POLARIS project is the regular use of game-theoretic modeling to handle situations where resources or decisions are distributed among several agents, or even situations where a centralized decision maker has to adapt to strategic users.

An important research direction in POLARIS is thus centered on reinforcement learning (multi-armed bandits, Q-learning, online learning) and active learning in environments exhibiting one or several challenging features, such as distributed decisions, uncertainty, information delays, or limited feedback.
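As a concrete (and deliberately simple) illustration of such trial-and-error learning, the following sketch runs the classical UCB1 strategy on a Bernoulli multi-armed bandit; the arm means and horizon are arbitrary choices, not taken from our work:

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """UCB1 on Bernoulli arms; returns the cumulative pseudo-regret."""
    rng = random.Random(seed)
    n = len(means)
    counts, sums = [0] * n, [0.0] * n
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1                        # play each arm once to start
        else:
            # optimism in the face of uncertainty: empirical mean + exploration bonus
            arm = max(range(n), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret

print(ucb1([0.3, 0.5, 0.7], 10_000))   # grows like log(horizon), not linearly
```

The exploration bonus shrinks as an arm is sampled more often, which is what keeps the cumulative regret logarithmic rather than linear in the horizon.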

As a side effect, many of the insights gained can often be used to dramatically improve the scalability and performance of implementations of more standard machine- and deep-learning techniques on supercomputers.

The POLARIS members are thus particularly interested in the design and analysis of adaptive learning algorithms for multi-agent systems, i.e., agents that seek to progressively improve their performance on a specific task (see Figure). The resulting algorithms should not only learn an efficient (Nash) equilibrium, but should also be capable of doing so quickly (low regret), even when facing the difficulties associated with a distributed context (lack of coordination, uncertain world, information delays, limited feedback, etc.).

In the rest of this document, we describe in detail our new results in the above areas.

Evaluating the scalability, robustness, energy consumption and performance of large infrastructures such as exascale platforms and clouds raises severe methodological challenges. The complexity of such platforms mandates empirical evaluation, but direct experimentation via an application deployment on a real-world testbed is often limited by the small number of platforms at hand, and is sometimes even impossible (cost, access, early stages of the infrastructure design, etc.). Furthermore, such experiments are costly, difficult to control, and therefore difficult to reproduce. Although many of these digital systems have been built by humans, they have reached such a level of complexity that we are no longer able to study them like artificial systems, and we have to deal with the same kind of experimental issues as the natural sciences. The development of a sound experimental methodology for the evaluation of resource management solutions is among the most important ways to cope with the growing complexity of computing environments. Although computing environments come with their own specific challenges, we believe such general observation problems should be addressed by borrowing good practices and techniques developed in many other domains of science, in particular (1) predictive simulation, (2) trace analysis and visualization, and (3) the design of experiments.

Large computing systems are particularly complex to understand because of the interplay between their discrete nature (originating from deterministic computer programs) and their stochastic nature (emerging from the physical world, long distance interactions, and complex hardware and software stacks). A first line of research in POLARIS is devoted to the design of relatively simple statistical models of key components of distributed systems and their exploitation to feed simulators of such systems, to build aggregated performance views, and to detect anomalous behaviors.

Unlike direct experimentation via an application deployment on a real-world testbed, simulation enables fully repeatable and configurable experiments that can often be conducted quickly for arbitrary hypothetical scenarios. In spite of these promises, current simulation practice is often not conducive to obtaining scientifically sound results. To date, most simulation results in the parallel and distributed computing literature are obtained with simulators that are ad hoc, unavailable, undocumented, and/or no longer maintained. As a result, most published simulation results build on throw-away (short-lived and non-validated) simulators that are specifically designed for a particular study, which prevents other researchers from building upon them. There is thus a strong need for recognized simulation frameworks by which simulation results can be reproduced, further analyzed and improved.

Many simulators of MPI applications have been developed by renowned HPC groups (e.g., at SDSC, BSC, UIUC, Sandia Nat. Lab., ORNL or ETH Zürich), but most of them build on restrictive network and application modeling assumptions that generally prevent them from faithfully predicting execution times, which limits the use of simulation to the indication of gross trends at best.

The SimGrid simulation toolkit, whose development started more than 20 years ago at UCSD, is a renowned project which has gathered more than 1,700 citations and has supported the research of at least 550 articles. The most important contribution of POLARIS to this project in recent years has been to improve the quality of SimGrid to the point where it can be used effectively on a daily basis by practitioners to accurately reproduce the dynamics of real HPC systems.
In particular, SMPI, a simulator based on SimGrid that simulates unmodified MPI applications written in C/C++ or FORTRAN, has now become a unique tool that makes it possible to faithfully study particularly complex scenarios, such as a legacy geophysics application that suffers from spatial and temporal load-balancing problems, or the HPL benchmark. We have shown that the performance (both execution time and energy consumption) predicted through our simulations was systematically within a few percent of real experiments, which allows the applications to be reliably tuned at very low cost. This capability has also been leveraged (through StarPU-SimGrid) to study complex and modern task-based applications running on heterogeneous sets of hybrid (CPUs + GPUs) nodes. The phenomena studied through this approach would be particularly difficult to investigate through real experiments, and yet they correspond to real problems of these applications. Finally, SimGrid is also heavily used through BatSim, a batch simulator developed in the DATAMOVE team that leverages SimGrid, to investigate the performance of machine learning strategies in a batch scheduling context , .

Many monolithic visualization tools have been developed by renowned HPC groups over the decades (e.g., BSC, Jülich, TU Dresden, UIUC, and ANL), but most of these tools build on the classical information visualization workflow: first present an overview of the data, possibly by plotting everything if computing power allows, then let users zoom and filter, providing details on demand. In our context, however, the amount of data comprised in such traces is several orders of magnitude larger than the number of pixels on a screen, and displaying even a small fraction of the trace leads to harmful visualization artifacts. Such traces are typically made of events that occur at very different time and space scales and originate from different sources, which hinders classical approaches, especially when the application structure departs from classical MPI programs with a BSP/SPMD structure. In particular, modern HPC applications that build on a task-based runtime and run on hybrid nodes are particularly challenging to analyze. Indeed, the underlying task graph is dynamically scheduled to avoid spurious synchronizations, which prevents classical visualizations from exploiting and revealing the application structure.

In , we explain how modern data analytics tools can be used to build, from heterogeneous information sources, custom, reproducible and insightful visualizations of task-based HPC applications at a very low development cost in the StarVZ framework.
By specifying and validating statistical models of the performance of HPC applications/systems, we manage to identify when their behavior departs from what is expected and to detect performance anomalies. This approach was first applied to state-of-the-art linear algebra libraries in and more recently to a sparse direct solver . In both cases, we have been able to identify and fix several non-trivial anomalies that had not been noticed even by the application and runtime developers.
Finally, these models not only reveal when applications depart from what is expected, but also summarize the execution by focusing on its most important features, which is particularly useful when comparing two executions.
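To give a flavor of this kind of trace post-processing, here is a deliberately tiny sketch in the same spirit as (but far simpler than) StarVZ; the trace records, kernel names and anomaly threshold are all made up for illustration:

```python
from statistics import mean, stdev

# A hypothetical trace: (resource, kernel, start, end) records, as could be
# extracted from a runtime's execution log.
trace = [
    ("gpu0", "dgemm", 0.0, 1.0), ("gpu0", "dgemm", 1.1, 2.1),
    ("gpu0", "dgemm", 2.2, 7.9),                 # suspiciously long instance
    ("gpu1", "dgemm", 0.0, 1.2), ("gpu1", "dgemm", 1.3, 2.2),
    ("cpu0", "dpotrf", 0.0, 3.0),
]

def utilization(trace, makespan):
    """Fraction of time each resource spends executing tasks."""
    busy = {}
    for res, _, start, end in trace:
        busy[res] = busy.get(res, 0.0) + (end - start)
    return {res: b / makespan for res, b in sorted(busy.items())}

def anomalies(trace, k=1.5):
    """Instances whose duration exceeds mean + k * stdev for their kernel."""
    durations = {}
    for _, kernel, start, end in trace:
        durations.setdefault(kernel, []).append(end - start)
    flagged = []
    for res, kernel, start, end in trace:
        d = durations[kernel]
        if len(d) >= 3 and end - start > mean(d) + k * stdev(d):
            flagged.append((res, kernel, start))
    return flagged

print(utilization(trace, makespan=8.0))
print(anomalies(trace))
```

Real analyses of course involve millions of events, statistically validated duration models instead of a crude mean-plus-k-sigma rule, and visual rather than printed output; the point is only that aggregated views and anomaly flags are ordinary data-analytics operations on the raw trace.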

Part of our work is devoted to the control of experiments on both classical (HPC) and novel (IoT/Fog, in a smart-home context) infrastructures. To this end, we rely heavily on experimental testbeds such as Grid5000 and FIT-IoTLab, which can be well controlled, although real experiments on them remain quite resource-consuming. Design of experiments has been successfully applied in many fields (e.g., agriculture, chemistry, industrial processes) where experiments are considered expensive. Building on concrete use cases, we explore how design of experiments and reproducible research techniques can be used to (1) design transparent auto-tuning strategies for scientific computation kernels; (2) set up systematic performance non-regression tests on Grid5000 (450 nodes over 1.5 years) and detect many abnormal events (related to BIOS and system upgrades, cooling, faulty memory and power instability) that had a significant effect on the nodes, ranging from subtle performance changes of 1% to much more severe degradations of more than 10%, and had until then gone unnoticed by both the Grid'5000 technical team and Grid'5000 users; and (3) design and evaluate the performance of service provisioning strategies in Fog infrastructures.

Stochastic models often suffer from the curse of dimensionality: their complexity grows exponentially with the number of dimensions of the system. At the same time, very large stochastic systems are sometimes easier to analyze: it can be shown that some classes of stochastic systems simplify as their dimension goes to infinity, because of averaging effects such as the law of large numbers or the central limit theorem. This forms the basis of what is called an asymptotic method, which consists in studying what happens when a system gets large in order to build an approximation that is easier to study or to simulate.

Within the team, the research that we conduct in this axis aims to foster the applicability of these asymptotic methods to new application areas. This has led us to apply classical methods to new problems, but also to develop new approximation methods that take into account special features of the systems we study (e.g., a moderate number of dimensions, transient behavior, random matrices). Typical applications are mean field methods for performance evaluation, distributed optimization, and, more recently, statistical learning. One original aspect of our work is to quantify precisely the error made by such approximations. This allows us to define refinement terms that lead to more accurate approximations.
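To illustrate the basic idea on a toy example (the transition probabilities below are illustrative and not taken from our papers), consider N interacting objects whose empirical fraction of "infected" objects closely tracks a deterministic mean field recursion:

```python
import random

# Toy model with N interacting objects: an "infected" object stays infected
# with probability 0.7; a "susceptible" one becomes infected with probability
# 0.4 * m, where m is the current fraction of infected objects.
# Mean field recursion: m' = 0.7*m + 0.4*m*(1 - m), with fixed point m* = 1/4.

def simulate(n, steps, seed=1):
    rng = random.Random(seed)
    infected = n // 2                      # start with half the objects infected
    for _ in range(steps):
        m = infected / n
        stay = sum(rng.random() < 0.7 for _ in range(infected))
        new = sum(rng.random() < 0.4 * m for _ in range(n - infected))
        infected = stay + new
    return infected / n

def mean_field(steps):
    m = 0.5
    for _ in range(steps):
        m = 0.7 * m + 0.4 * m * (1 - m)
    return m

print(simulate(10_000, 50), mean_field(50))
```

The deterministic recursion is trivial to iterate regardless of N, while the stochastic simulation gets more expensive (and, per run, more accurate in relative terms) as N grows; quantifying the gap between the two is precisely the object of the error analyses mentioned above.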

Mean field approximation is a well-known technique in statistical physics, originally introduced to study systems composed of a very large number of particles. Nowadays, variants of this technique are widely applied in many domains: in game theory for instance (with the example of mean field games), but also to quantify the performance of distributed algorithms. Mean field approximation is often justified by showing that a system of N interacting objects converges to its mean field approximation as N goes to infinity. Such a result, however, does not say how large N must be for the approximation to be accurate.

In , we give a partial answer to this question. We show that, for most of the mean field models used for performance evaluation, the error made when using the mean field approximation decreases as 1/N, where N is the number of objects in the system. This result came from the use of Stein's method, which allows one to quantify precisely the distance between two stochastic processes. Subsequently, in , we show that the constant in front of this 1/N term can be computed, which leads to a refined mean field approximation that is substantially more accurate.

Mean field approximation is widely used in the performance evaluation community to analyze and design distributed control algorithms. Our contribution in this domain has mainly covered two applications: cache replacement algorithms and load balancing algorithms.

Cache replacement algorithms are widely used in content delivery networks. In , , , we show how the mean field and refined mean field approximations can be used to evaluate the performance of list-based cache replacement algorithms. In particular, we show that such policies can outperform the classically used LRU algorithm. A methodological contribution of our work is to show that, when precisely evaluating the behavior of such a policy, the refined mean field approximation is both faster and more accurate than what could be obtained with a stochastic simulator.
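As a plain-simulation illustration of why list-based policies can beat LRU (this sketch uses direct simulation, not the mean field machinery of the papers), one can compare LRU with CLIMB, a classical list-based policy that promotes a hit item by one position, under an independent-reference Zipf workload with arbitrary parameters:

```python
import bisect
import random

def zipf_cdf(n_items, a):
    """Cumulative distribution of a Zipf(a) popularity law over n_items."""
    w = [1.0 / (i + 1) ** a for i in range(n_items)]
    s = sum(w)
    cdf, acc = [], 0.0
    for x in w:
        acc += x / s
        cdf.append(acc)
    return cdf

def hit_rate(policy, n_items=100, cache_size=20, n_reqs=200_000, seed=7):
    rng = random.Random(seed)
    cdf = zipf_cdf(n_items, 1.2)          # item 0 is the most popular
    cache = list(range(cache_size))       # warm start with the most popular items
    hits = 0
    for _ in range(n_reqs):
        x = min(bisect.bisect_left(cdf, rng.random()), n_items - 1)
        if x in cache:
            hits += 1
            i = cache.index(x)
            if policy == "lru":
                cache.insert(0, cache.pop(i))                    # move to front
            elif i > 0:                                          # "climb"
                cache[i - 1], cache[i] = cache[i], cache[i - 1]  # one step up
        else:
            cache.pop()                    # evict the item in the last position
            if policy == "lru":
                cache.insert(0, x)
            else:
                cache.append(x)            # CLIMB inserts new items at the back
    return hits / n_reqs

print("LRU:", hit_rate("lru"), "CLIMB:", hit_rate("climb"))
```

CLIMB reorders the cache slowly, so it is less disturbed by one-off requests for unpopular items; the price is slower adaptation when the popularity distribution changes, a tradeoff that the mean field analyses make precise.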

Computing resources are often spread across many machines. An efficient use of such resources requires the design of a good load balancing strategy to distribute the load among the available machines. In , , , we study two paradigms that we use to design asymptotically optimal load balancing policies in which a central broker sends tasks to a set of parallel servers. We show in that combining the classical round-robin allocation with an estimation of task sizes can yield a policy that has zero delay in the large-system limit. This policy is interesting because the broker does not need any feedback from the servers. At the same time, it needs to estimate or know job durations, which is not always possible. A different approach is used in , where we consider a policy that does not need to estimate job durations but instead uses some feedback from the servers plus a memory of where jobs were sent. We show that this paradigm can also be used to design zero-delay load balancing policies as the system size grows to infinity.
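The value of a small amount of feedback can be seen in a toy slotted-time simulation (illustrative parameters, not the exact models of the papers) comparing purely random assignment with the classical "power of two choices" rule, in which each task samples two servers and joins the shorter queue:

```python
import random

def simulate(policy, n_servers=50, arrival_rate=0.9, steps=20_000, seed=3):
    """Slotted time: each server receives on average `arrival_rate` tasks per
    slot and completes at most one task per slot."""
    rng = random.Random(seed)
    queues = [0] * n_servers
    total = 0
    for _ in range(steps):
        arrivals = sum(rng.random() < arrival_rate for _ in range(n_servers))
        for _ in range(arrivals):
            if policy == "random":
                i = rng.randrange(n_servers)
            else:                                  # "jsq2": power of two choices
                a, b = rng.randrange(n_servers), rng.randrange(n_servers)
                i = a if queues[a] <= queues[b] else b
            queues[i] += 1
        for i in range(n_servers):
            if queues[i] > 0:
                queues[i] -= 1                     # one departure per busy server
        total += sum(queues)
    return total / (steps * n_servers)             # mean queue length per server

print("random:", simulate("random"), "jsq2:", simulate("jsq2"))
```

With these parameters, sampling just two queues already yields a much smaller mean queue length than blind random assignment, which is the qualitative hallmark of the asymptotically optimal policies studied in this line of work.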

Various notions of mean field games were introduced in the years 2000-2010 in theoretical economics, engineering and game theory. A mean field game is a game in which an individual tries to maximize their utility while evolving in a population of other individuals whose behavior is not directly affected by any single individual. An equilibrium is a population dynamic under which a selfish individual would behave exactly as the population does. In , we develop the notion of discrete-space mean field games, which is more amenable to analysis than previously introduced notions of mean field games. This led to two interesting contributions: mean field games are not always the limits of stochastic games as the number of players grows , and mean field games can be used to study how much vaccination should be subsidized to encourage people to adopt a socially optimal behavior .

Online learning concerns the study of
repeated decision-making in changing environments.
Of course, depending on the context, the words “learning” and “decision-making” may refer to very different things:
in economics, this could mean predicting how rational agents react to market drifts;
in data networks, it could mean adapting the way packets are routed based on changing traffic conditions;
in machine learning and AI applications, it could mean training a neural network or the guidance system of a self-driving car;
etc.
In particular, the changes in the learner's environment could be
either exogenous (that is, independent of the learner's decisions, such as the weather affecting the time of travel),
or endogenous (i.e., they could depend on the learner's decisions, as in a game of poker),
or any combination thereof.
However, the goal for the learner(s) is always the same:
to make more informed decisions that lead to better rewards over time.

The study of online learning models and algorithms dates back to the seminal work of Robbins, Nash and Bellman in the 1950s, and it has since given rise to a vigorous research field at the interface of game theory, control and optimization, with numerous applications in operations research, machine learning, and data science. In this general context, our team focuses on the asymptotic behavior of online learning and optimization algorithms, both single- and multi-agent: whether they converge, at what speed, and/or what type of non-stationary, off-equilibrium behaviors may arise when they do not.

The focus of POLARIS on game-theoretic and Markovian models of learning covers a set of specific challenges that dovetail in a highly synergistic manner with the work of other learning-oriented teams within Inria (like SCOOL in Lille, SIERRA in Paris, and THOTH in Grenoble), and it is an important component of Inria's activities and contributions in the field (which includes major industrial stakeholders like Google / DeepMind, Facebook, Microsoft, Amazon, and many others).

Our team's work on online learning covers both single- and multi-agent models; in the sequel, we present some highlights of our work structured along these basic axes.

In the single-agent setting, an important problem in the theory of Markov decision processes – i.e., discrete-time control processes with decision-dependent randomness – is the so-called “restless bandit” problem. Here, the learner chooses an action – or “arm” – from a finite set, and the mechanism determining the action's reward changes depending on whether the action was chosen or not (in contrast to standard Markov problems, where the activation of an arm does not have this effect). In this general setting, Whittle conjectured – and Weber and Weiss proved – that Whittle's eponymous index policy is asymptotically optimal. However, the result of Weber and Weiss is purely asymptotic, and the rate of this convergence remained elusive for several decades. This gap was finally settled in a series of POLARIS papers , where we showed that Whittle indices (as well as other index policies) become optimal at a geometric rate under the same technical conditions used by Weber and Weiss to prove Whittle's conjecture, plus a technical requirement on the non-singularity of the fixed point of the mean field dynamics. We also proposed the first sub-cubic algorithm to compute Whittle and Gittins indices. As for reinforcement learning in Markovian bandits, we have shown that Bayesian and optimistic approaches do not exploit the structure of Markovian bandits in the same way: while Bayesian learning has both a regret and a computational complexity that scale linearly with the number of arms, optimistic approaches all incur an exponential computation time, at least in their current versions .

In the multi-agent setting, our work has focused on the following fundamental question:

Does the concurrent use of (possibly optimal) single-agent learning algorithms

ensure convergence to Nash equilibrium in multi-agent, game-theoretic environments?

Conventional wisdom might suggest a positive answer to this question because of the following “folk theorem”:
under no-regret learning, the agents' empirical frequency of play converges to the game's set of coarse correlated equilibria.
However, the actual implications of this result are quite weak:
First, it concerns the empirical frequency of play and not the day-to-day sequence of actions employed by the players.
Second, it concerns coarse correlated equilibria which may be supported on strictly dominated strategies – and are thus unacceptable in terms of rationalizability.
These realizations prompted us to make a clean break with conventional wisdom on this topic,
ultimately showing that the answer to the above question is, in general, “no”:
specifically, we showed in , that the (optimal) class of “follow-the-regularized-leader” (FTRL) learning algorithms leads to Poincaré recurrence even in simple, two-player zero-sum games.
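This recurrence phenomenon is easy to reproduce numerically. The sketch below (with illustrative parameters) runs exponential weights, a member of the FTRL family, for both players of Matching Pennies: instead of converging to the unique Nash equilibrium (1/2, 1/2), the day-to-day strategies cycle around it, and in discrete time they slowly spiral outward:

```python
import math

def run(eta=0.05, steps=2000, sx0=2.0):
    """Exponential weights (an instance of FTRL) for both players of
    Matching Pennies; returns the trajectory of mixed strategies."""
    sx, sy = sx0, 0.0      # cumulative payoff advantage of playing "heads"
    traj = []
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-eta * sx))   # P[player 1 plays heads]
        q = 1.0 / (1.0 + math.exp(-eta * sy))   # P[player 2 plays heads]
        traj.append((p, q))
        sx += 4 * q - 2    # matcher: "heads" is better when q is large
        sy += 2 - 4 * p    # mismatcher: "heads" is better when p is small
    return traj

traj = run()
print(traj[0], traj[-1])   # the final strategy is no closer to (1/2, 1/2)
```

The time averages of such trajectories do converge to equilibrium, which is exactly why the folk theorem on empirical frequencies is compatible with this day-to-day non-convergence.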

This negative result generated significant interest in the literature, as it contributed to shifting the focus towards identifying which Nash equilibria may arise as stable limit points of FTRL algorithms and dynamics.
Earlier work by POLARIS on the topic , , suggested that strict Nash equilibria
play an important role in this question.
This suspicion was recently confirmed in a series of papers where we established a sweeping negative result to the effect that mixed Nash equilibria are incompatible with no-regret learning.
Specifically, we showed that any Nash equilibrium which is not strict cannot be stable and attracting under the dynamics of FTRL, especially in the presence of randomness and uncertainty.
This result has significant implications for predicting the outcome of a multi-agent learning process because, combined with , it establishes the following far-reaching equivalence:
a state is asymptotically stable under no-regret learning if and only if it is a strict Nash equilibrium.

Going beyond finite games, this further raised the question of what type of non-convergent behaviors can be observed in continuous games – such as the class of stochastic min-max problems that are typically associated with generative adversarial networks (GANs) in machine learning. This question was one of our primary collaboration axes with EPFL, and led to a joint research project focused on the characterization of the convergence properties of zeroth-, first-, and (scalable) second-order methods in non-convex/non-concave problems. In particular, we showed in that these state-of-the-art min-max optimization algorithms may converge with arbitrarily high probability to attractors that are in no way min-max optimal or even stationary – and, in fact, may not even contain a single stationary point (let alone a Nash equilibrium). Spurious convergence phenomena of this type can arise even in two-dimensional problems, a fact which corroborates the empirical evidence surrounding the formidable difficulty of training GANs.
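The simplest instance of this failure mode is simultaneous gradient descent-ascent on the bilinear problem min over x, max over y, of f(x, y) = xy: a few lines suffice to check that the iterates spiral away from the unique min-max point instead of converging to it:

```python
# Simultaneous gradient descent-ascent on the bilinear saddle f(x, y) = x * y,
# whose unique min-max (Nash) point is (0, 0).

def gda(x0=1.0, y0=0.0, eta=0.1, steps=200):
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = y, x                      # df/dx = y, df/dy = x
        x, y = x - eta * gx, y + eta * gy  # descent in x, ascent in y
    return x, y

x, y = gda()
# each step multiplies the squared distance to the saddle by exactly 1 + eta**2
print(x * x + y * y)
```

This two-line computation is of course much simpler than the stochastic, non-convex/non-concave settings analyzed in our work, but it already shows why naive gradient methods cannot be trusted in min-max problems.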

The topics in this axis emerge from current social and economic questions rather than from a fixed set of mathematical methods. To this end we have identified large trends such as energy efficiency, fairness, privacy, and the growing number of new market places. In addition, COVID has posed new questions that opened new paths of research with strong links to policy making.

Throughout these works, the focus of the team is on modeling aspects of the aforementioned problems, and obtaining strong theoretical results that can give high-level guidelines on the design of markets or of decision-making procedures. Where relevant, we complement those works by measurement studies and audits of existing systems that allow identifying key issues. As this work is driven by topics, rather than methods, it allows for a wide range of collaborations, including with enterprises (e.g., Naverlabs), policy makers, and academics from various fields (economics, policy, epidemiology, etc.).

Other teams at Inria cover some of the societal challenges listed here (e.g., PRIVATICS, COMETE) but rather in isolation. The specificity of POLARIS resides in the breadth of societal topics covered and of the collaborations with non-CS researchers and non-research bodies; as well as in the application of methods such as game theory to those topics.

As algorithmic decision-making becomes increasingly omnipresent in our daily lives (in domains ranging from credit to advertising, hiring, or medicine), it has also become increasingly apparent that the outcome of algorithms can be discriminatory for various reasons. Since 2016, the scientific community working on the problem of algorithmic fairness has grown exponentially. In this context, we first worked on better understanding the extent of the problem through measurement in the case of social networks . In particular, we showed that in advertising platforms, discrimination can arise from multiple internal processes that cannot be controlled, and we advocate for measuring discrimination directly on the outcome. We then worked on proposing solutions to guarantee fair representation in online public recommendations (a.k.a. trending topics on Twitter) . This is an example of an application in which recommendations were observed to be biased towards certain demographic groups. In this work, our proposed solution draws an analogy between recommendation and voting, and builds on existing work on fair representation in voting. Finally, more recently, we worked on better understanding the sources of discrimination, in the simple case of selection problems, and the consequences of fixing it. While most works attribute discrimination to the implicit bias of the decision maker , we identified a fundamentally different source of discrimination: even in the absence of implicit bias in a decision maker's estimate of candidates' quality, the estimates may differ between the different groups in their variance—that is, the decision maker's ability to precisely estimate a candidate's quality may depend on the candidate's group . We show that this differential variance leads to discrimination for two reasonable baseline decision makers (group-oblivious and Bayesian optimal).
We then analyze the consequences, in terms of selection utility, of imposing fairness mechanisms such as demographic parity or its generalization; in particular, we identify cases in which imposing fairness can improve utility. In , we also study similar questions in the two-stage setting, and derive the optimal selector and the “price of local fairness” one pays in utility by imposing that the interim stage be fair.
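A minimal simulation (with illustrative parameters, not the exact model of the papers) shows the effect: two groups have identical true-quality distributions but different estimation noise, and a group-oblivious decision maker selecting the highest estimates ends up selecting the two groups at very different rates:

```python
import random

def selection_rates(noise_a=0.2, noise_b=1.0, n=50_000, top=0.1, seed=11):
    """Group-oblivious selection of the top fraction by noisy quality estimate."""
    rng = random.Random(seed)
    cands = []                                    # (group, estimated quality)
    for group, noise in (("A", noise_a), ("B", noise_b)):
        for _ in range(n):
            quality = rng.gauss(0.0, 1.0)         # same true distribution!
            cands.append((group, quality + rng.gauss(0.0, noise)))
    cands.sort(key=lambda c: -c[1])               # rank by estimate only
    selected = cands[: int(top * len(cands))]
    counts = {"A": 0, "B": 0}
    for group, _ in selected:
        counts[group] += 1
    return {g: c / n for g, c in counts.items()}

print(selection_rates())
```

Because the noisier group's estimates have heavier tails, it is over-represented in the top decile even though both groups are statistically identical in true quality; no implicit bias is needed to produce the disparity.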

Online services in general, and social networks in particular, collect massive amounts of data about their users (both online and offline). It is critical that (i) the users' data is protected so that it cannot leak, and (ii) users can know what data the service has about them and understand how it is used—this is the transparency requirement. In this context, we carried out two kinds of work. First, we studied social networks through measurement, using Facebook as a use case. We showed that their advertising platform, through the PII-based targeting option, allowed attackers to discover some personal data of users . We also proposed an alternative design—valid for any system that proposes PII-based targeting—and proved that it fixes the problem. We then audited the transparency mechanisms of the Facebook ad platform, specifically the “Ad Preferences” page, which shows what interests the platform has inferred about a user, and the “Why am I seeing this” button, which gives some reasons why the user saw a particular ad. In both cases, we laid the foundation for defining the quality of explanations, and we showed that the explanations given were lacking key desirable properties (they were incomplete and misleading; they have since been changed) . A follow-up work shed further light on the typical uses of the platform . In another work, we proposed an innovative protocol based on randomized withdrawal to protect the deletion privacy of public posts . Finally, in , we study an alternative data-sharing ecosystem where users can choose the precision of the data they give. We model it as a game and show that, if users are motivated to reveal data by a public-good component of the outcome's precision, then certain basic statistical properties (the optimality of generalized least squares in particular) no longer hold.

Market design operates at the intersection of computer science and economics and has become increasingly important as many markets are redesigned on digital platforms. Studying markets for commodities, in an ongoing project we evaluate how different fee models alter strategic incentives for both buyers and sellers. We identify two general classes of fees: for the first, strategic manipulation becomes infeasible as the market grows large, so agents have no incentive to misreport their true valuation; for the second, strategic manipulation remains possible, and we show that in this case agents aim to maximally shade their bids. This has immediate implications for the design of such markets. By contrast, the work in considers a matching market where buyers and sellers have heterogeneous preferences over each other. Traders arrive at the market at random times, and the market maker, having limited information, aims to optimize when to open the market for a clearing event to take place. There is a tradeoff between thickening the market (to achieve better matches) and matching quickly (to reduce the waiting time of traders in the market). We make this tradeoff explicit for a wide range of underlying preferences. These works add to an ongoing effort to better understand and design markets .

The COVID-19 pandemic has confronted humanity with one of the defining challenges of its generation, and trans-disciplinary efforts have naturally been necessary to support decision making. In a series of articles we proposed green zoning: ‘green zones’ (areas where the virus is under control based on a uniform set of conditions) can progressively return to normal economic and social activity levels, and mobility between them is permitted. By contrast, stricter public health measures are in place in ‘red zones’, and mobility between red and green zones is restricted. France and Spain were among the first countries to introduce green zoning in April 2020. The initial success of this proposal opened the way to a large amount of follow-up work analyzing and proposing various tools to effectively combat the pandemic (e.g., focus-mass testing and a vaccination policy ). In a joint work with a group of leading economists, public health researchers and sociologists, we found that countries that opted to eliminate the virus fared better not only for public health, but also for the economy and civil liberties . Overall, this work has been characterized by close interactions with policy makers in France, Spain and the European Commission, as well as substantial activity in public discourse (via TV, newspapers and radio).

Our work on energy efficiency spans several areas and applications, such as embedded systems and smart grids. Minimizing the energy consumption of embedded systems with real-time constraints is becoming increasingly important, for ecological as well as practical reasons, since batteries are becoming standard power supplies. Dynamically changing the speed of the processor is the most common and most efficient way to reduce energy consumption . In fact, this is the reason why modern processors are equipped with Dynamic Voltage and Frequency Scaling (DVFS) technology . In a stochastic environment, with random job sizes and arrival times, combining hard deadlines and energy minimization via DVFS-based techniques is difficult because enforcing hard deadlines requires considering worst cases, which are hardly compatible with random dynamics. Nevertheless, progress has been made on these problems in a series of papers using constrained Markov decision processes, both on the theoretical side (proving the existence of optimal policies and exhibiting their structure , , ) and on the experimental side (showing the gains of optimal policies over classical solutions ).
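The energy/deadline trade-off that DVFS exploits can be illustrated with the textbook cubic-power model (this is a deliberately simplified sketch, not the constrained-MDP formulation of the papers above; all numbers are hypothetical):

```python
# Textbook DVFS model: a job of W cycles run at speed s (cycles/s) takes
# W/s seconds and consumes energy W * s^2 (dynamic power grows like s^3).
# Energy is thus minimized by the slowest speed that still meets the deadline.
def optimal_speed(cycles, deadline, s_min, s_max):
    s = cycles / deadline              # slowest deadline-feasible speed
    return min(max(s, s_min), s_max)   # clamp to the processor's speed range

def energy(cycles, speed):
    return cycles * speed ** 2

# A 2e9-cycle job with a 2 s deadline: running at 1 GHz instead of a
# maximal 3 GHz meets the deadline while dividing energy by 9.
s = optimal_speed(2e9, 2.0, s_min=0.5e9, s_max=3e9)
```

The difficulty addressed in the papers is precisely that, with random job sizes and arrivals, the slowest deadline-feasible speed is itself a random, state-dependent quantity.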

In the context of a collaboration with Enedis and Schneider Electric (via the Smart Grid chair of Grenoble-INP), we also study the problem of using smart meters to optimize the behavior of electrical distribution networks. We made three kinds of contributions on this subject: (1) the design of efficient control strategies for such systems , , , (2) the co-simulation of an electrical network and a communication network , and (3) the performance evaluation of the communication protocol (PLC G3) used by the Linky smart meters .

Supercomputers typically comprise thousands to millions of multi-core CPUs with GPU accelerators, interconnected by complex networks that are typically structured as an intricate hierarchy of switches. Capacity planning and management of such systems raise challenges not only in terms of computing efficiency but also in terms of energy consumption. Most legacy (SPMD) applications struggle to benefit from such infrastructures, since the slightest failure or load imbalance immediately causes the whole program to stop or, at best, to waste resources. To scale and handle the stochastic nature of resources, these applications have to rely on dynamic runtimes that schedule computations and communications in an opportunistic way. This evolution raises challenges not only in terms of programming but also in terms of observation (complexity and dynamicity prevent experiment reproducibility, intrusiveness hinders large-scale data collection, ...) and analysis (dynamic and flexible application structures make classical visualization and simulation techniques ineffective and require building on ad hoc information about the application structure).

Considerable interest has arisen from the seminal prediction that the use of multiple-input, multiple-output (MIMO) technologies can lead to substantial gains in information throughput in wireless communications, especially when used at a massive level. In particular, by employing multiple inexpensive service antennas, it is possible to exploit spatial multiplexing in the transmission and reception of radio signals, the only physical limit being the number of antennas that can be deployed on a portable device. As a result, the wireless medium can accommodate greater volumes of data traffic without requiring the reallocation (and subsequent re-regulation) of additional frequency bands. In this context, throughput maximization in the presence of interference by neighboring transmitters leads to games with convex action sets (covariance matrices with trace constraints) and individually concave utility functions (each user's Shannon throughput); developing efficient and distributed optimization protocols for such systems is one of the core objectives of Theme 5.
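For intuition, the single-user version of this throughput maximization (parallel Gaussian channels under a total power constraint) is solved by the classical water-filling allocation; a minimal sketch with hypothetical channel gains:

```python
import numpy as np

def water_filling(gains, power, tol=1e-9):
    """Allocate power across parallel channels to maximize Shannon throughput.

    gains: effective channel gains (e.g., eigenvalues of H^H H over the noise
    power); power: total transmit power budget.
    """
    gains = np.asarray(gains, dtype=float)
    # Bisection on the water level mu: p_i = max(0, mu - 1/g_i), sum(p) = power.
    lo, hi = 0.0, power + 1.0 / gains.min()
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if np.maximum(0.0, mu - 1.0 / gains).sum() > power:
            hi = mu
        else:
            lo = mu
    p = np.maximum(0.0, 0.5 * (lo + hi) - 1.0 / gains)
    rate = np.sum(np.log2(1.0 + gains * p))   # achieved Shannon throughput
    return p, rate

# Three channels: the weakest one (gain 0.25) gets no power at this budget.
p, rate = water_filling([2.0, 1.0, 0.25], power=3.0)
```

In the game described above, each user applies this computation to the eigenvalues of its effective channel matrix, which yields a best response over trace-constrained covariance matrices.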

Another major challenge that occurs here is due to the fact that the efficient physical layer optimization of wireless networks relies on perfect (or close to perfect) channel state information (CSI), on both the uplink and the downlink. Due to the vastly increased computational overhead of this feedback – especially in decentralized, small-cell environments – the ongoing transition to fifth generation (5G) wireless networks is expected to go hand-in-hand with distributed learning and optimization methods that can operate reliably in feedback-starved environments. Accordingly, one of POLARIS' application-driven goals will be to leverage the algorithmic output of Theme 5 into a highly adaptive resource allocation framework for next-generation wireless systems that can effectively "learn in the dark", without requiring crippling amounts of feedback.

Smart urban transport systems and smart grids are two examples of collective adaptive systems. They consist of a large number of heterogeneous entities with decentralised control and varying degrees of complex autonomous behaviour. We develop analysis tools to help reason about such systems. Our work relies on fluid and mean-field approximation to build decentralized algorithms that solve complex optimization problems. We focus on two problems: decentralized control of electric grids, and capacity planning in vehicle-sharing systems to improve load balancing.

Social computing systems are online digital systems that rely on their users' personal data to deliver personalized services directly to them. They are omnipresent and include, for instance, recommendation systems, social networks, online media, and daily apps. Despite their interest and utility for users, these systems pose critical challenges of privacy, security, transparency, and respect of ethical constraints such as fairness. Solving these challenges involves a mix of measurement and/or auditing to understand and assess the issues, and modeling and optimization to propose and calibrate solutions.

The carbon footprint of the team was quite minimal in 2021, since no travel was allowed and most of us worked from home. Our team does not train heavy ML models requiring significant processing power, although some of us perform computer science experiments, mostly using the Grid'5000 platform. We keep this usage very reasonable and rely on cheaper alternatives (e.g., simulation) as much as possible.

On November 19th, a Sens Workshop was held, organized within the Inria DataMove and Polaris teams. There were ten participants, all permanent members of one of the two teams. Participation in the workshop was on a voluntary basis. The day's proceedings followed four main axes: (1) why do you do research? (2) construction of a map of the expectations of everyone in the team; (3) selection of two texts to be read and discussion of questions about the goal of research; (4) prospective.

The first axis took place mainly in small groups. We were interested in why we work in the academic world and why on this subject. It emerged that the motivations are very varied (ranging from intellectual curiosity to the desire to change the world), as is the desire of several participants to change their object of study. The second session aimed to map the goals and constraints that bind us to our profession as researchers and to organize these themes into a mural. Rigorous scientific production and education appear to be at the center of our priorities, while questions remain about the lack of group emulation and the excessive role of individualism (and competition) in the current academic world. The last axes were an opportunity to discuss several texts on the impact of our profession in the current digital world, in particular the fact that digital technology deeply modifies human activities and relationships, with a strong societal and environmental impact.

Although it did not settle on a concrete direction, this day was rich in lessons. We expect it to be followed by other days of this type in the future.

Romain Couillet has organized several introductory seminars on the Anthropocene, which he has presented to students at UGA and Grenoble-INP, as well as to associations in Grenoble (FNE, AgirAlternatif). He is also co-head of the Digital Transformation DU. He has published three articles on the issue of the "usability" of artificial intelligence, and is the organizer of a special session on "Signal processing and resilience" at the GRETSI 2022 conference. He is also co-creator of the sustainable AI transversal axis of the MIAI project in Grenoble. Finally, he is a trainer for the "Fresque du Climat" and a member of Adrastia and FNE Isère.

The efforts of B. Pradelski on COVID policy have received extensive media coverage in Le Monde and other major newspapers. See Section for more details.

The new results produced by the team in 2020 can be grouped into the following categories.

Finely tuning applications and understanding the influence of key parameters (number of processes, granularity, collective operation algorithms, virtual topology, and process placement) is critical to obtain good performance on supercomputers. Given the high resource consumption of running applications at scale, doing so solely to optimize their performance is particularly costly. We have shown in  that SimGrid and SMPI can be used to obtain inexpensive but faithful predictions of expected performance. The methodology we propose decouples the complexity of the platform, which is captured through statistical models of the performance of its main components (MPI communications, BLAS operations), from the complexity of adaptive applications, by emulating the application and skipping regular non-MPI parts of the code. We demonstrate the capability of our method with High-Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. This work presents an extensive (in)validation study that compares simulation with real experiments and demonstrates our ability to consistently predict the performance of HPL within a few percent. This study allows us to identify the main modeling pitfalls (e.g., spatial and temporal node variability, or network heterogeneity and irregular behavior) that need to be considered. Our “surrogate” also allows studying several subtle HPL parameter optimization problems while accounting for uncertainty on the platform. This work is part of the PhD work of Tom Cornebize .

The spatial and temporal node variability has also been investigated and quantified through a systematic measurement of the performance of more than 450 nodes from a dozen clusters of the Grid'5000 testbed, over two years, using a rigorous experimental discipline. Using a simple statistical test, we managed to detect many performance changes, from subtle ones of 1% to much more severe degradations of more than 10%, any of which could significantly impact the outcome of experiments. The root causes behind the detected performance changes range from BIOS and system upgrades to cooling issues, faulty memory, and power instability. These events went unnoticed by both the Grid'5000 technical team and Grid'5000 users, yet they could greatly harm the reproducibility of experiments and lead to wrong scientific conclusions. All this work heavily builds on reproducible research methodology: the data and metadata collected for this work are permanently and publicly archived under an open license and presented through a collection of Jupyter notebooks .
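The detection logic can be illustrated with a simple two-sample test on synthetic data (the actual study relies on a carefully designed experimental protocol and archived measurements; all numbers below are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic daily performance of one node (arbitrary units): a ~3%
# degradation (e.g., after a BIOS upgrade) occurs after day 60.
perf = np.concatenate([rng.normal(22.0, 0.1, 60), rng.normal(21.3, 0.1, 40)])

def changed(series, split, alpha=0.01):
    """Welch two-sample t-test around a candidate change point."""
    _, p = stats.ttest_ind(series[:split], series[split:], equal_var=False)
    return p < alpha

detected = changed(perf, 60)   # the 3% drop is easily detected
```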

Overall, over the last few years, the quality of SimGrid predictions for HPC applications has reached an unprecedented level, which allows investigating and optimizing the performance of complex applications in a very controlled and reproducible, yet realistic, way. In , we study ExaGeoStat, a task-based machine learning framework specifically designed for geostatistics data. Every iteration of this application comprises several phases that do not scale in the same way, which makes the load particularly challenging to balance. In this work, we show how such applications, with multiple phases with distinct resource needs, can take advantage of inter-node heterogeneity to improve performance and reduce resource idleness. We first show how to improve application phase overlap by optimizing runtime and scheduling decisions, and then how to compute the optimal distribution for all the phases using a linear program that leverages node heterogeneity while limiting communication overhead. The performance gains of our phase overlap improvements are between 36% and 50% compared to the original synchronous and homogeneous execution. We show that by adding some slow nodes to a homogeneous set of fast nodes, we can improve the performance by another 25% compared to a standard block-cyclic distribution, thereby harnessing any machine. Most of these algorithmic and scheduling improvements have been investigated in simulation with StarPU-SimGrid, which allows for controlled tracing and debugging on specific platform configurations, before being confirmed through real experiments on testbeds such as Grid'5000 and Santos Dumont.
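The core of such a linear program can be sketched on a toy instance with two nodes and two phases (hypothetical speeds, ignoring communication constraints):

```python
import numpy as np
from scipy.optimize import linprog

# speed[i][j]: work units/second of node i on phase j; node 0 is fast on
# phase 0 and slow on phase 1, node 1 the opposite (inter-node heterogeneity).
speed = np.array([[4.0, 1.0],
                  [1.0, 4.0]])
work = np.array([8.0, 8.0])    # total work of each phase

# Variables: x = (x00, x01, x10, x11, T), where x[i][j] is the fraction of
# phase j mapped to node i and T is the makespan to minimize.
c = [0, 0, 0, 0, 1]
A_ub = [[work[0]/speed[0, 0], work[1]/speed[0, 1], 0, 0, -1],   # node 0 time <= T
        [0, 0, work[0]/speed[1, 0], work[1]/speed[1, 1], -1]]   # node 1 time <= T
A_eq = [[1, 0, 1, 0, 0],       # phase 0 fully distributed
        [0, 1, 0, 1, 0]]       # phase 1 fully distributed
res = linprog(c, A_ub=A_ub, b_ub=[0, 0], A_eq=A_eq, b_eq=[1, 1],
              bounds=[(0, None)] * 5)
# Optimal: each phase runs entirely on its fast node, makespan 8/4 = 2 s.
```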

Large systems are (1) particularly difficult to analyze, because of the inherent state-space explosion, and (2) in need of robust and scalable scheduling techniques. In this series of works, we contribute to a better understanding of both aspects.

A complementary approach to work stealing in such systems is replication, but it is a double-edged sword that must be handled with caution, as the resource overhead may be detrimental when used too aggressively. In , we provide a queueing-theoretic framework for job replication schemes based on the principle "replicate a job as soon as the system detects it as a straggler", which is called job speculation. Recent works have analyzed replication on arrival, which we refer to simply as replication. Replication is motivated by its implementation in Google's BigTable; however, systems such as Apache Spark and Hadoop MapReduce implement speculative job execution, whose performance and optimization are not well understood. To this end, we propose a queueing network model for load balancing where each server can speculate on the execution time of a job. Specifically, each job is initially assigned to a single server by a frontend dispatcher. When its execution begins, the server sets a timeout. If the job completes before the timeout, it leaves the network; otherwise the job is terminated and relaunched, or resumed, at another server where it will complete. We provide a necessary and sufficient condition for the stability of speculative queueing networks with heterogeneous servers, general job sizes and scheduling disciplines. We find that speculation can increase the stability region of the network when compared with standard load balancing models and replication schemes. We provide general conditions under which timeouts increase the size of the stability region, and derive a formula for the optimal speculation time, i.e., the timeout that minimizes the load induced through speculation. We compare speculation with redundant-d and redundant-to-idle-queue rules under an S&X model. For lightly loaded systems, redundancy schemes provide better response times; however, for moderate to heavy loads, redundancy schemes can lose capacity and have markedly worse response times when compared with the proposed speculative scheme.
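The load trade-off behind the optimal timeout can be illustrated with a small Monte-Carlo sketch (hypothetical hyperexponential job sizes and a single relaunch; the papers derive the optimal timeout analytically in a full queueing network):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
# Job sizes: 90% short jobs (mean 1), 10% "stragglers" (mean 100).
is_long = rng.random(n) < 0.1
size = np.where(is_long, rng.exponential(100.0, n), rng.exponential(1.0, n))
# Fresh i.i.d. copies, used when a timed-out job is relaunched elsewhere.
fresh = np.where(rng.random(n) < 0.1, rng.exponential(100.0, n),
                 rng.exponential(1.0, n))

def induced_load(tau):
    """Mean work per job when a job exceeding the timeout tau is killed
    and relaunched from scratch on another server (no second timeout)."""
    return np.mean(np.minimum(size, tau) + (size > tau) * fresh)

no_spec = induced_load(np.inf)   # no speculation: mean work = E[S] ~ 10.9
spec = induced_load(5.0)         # killing stragglers early reduces the load
```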

A key challenge in such systems comes from the structure and high variability of the execution time distribution. We also study the problem of dispatching to parallel servers in , where we seek to minimize the average cost experienced by the system over an infinite time horizon. A standard approach for solving this problem is through policy iteration techniques, which rely on the computation of value functions. In this context, we consider the continuous-space

Finally, when the number of entities is large, most computations become intractable, but mean field approximation is a powerful technique to study the performance of very large stochastic systems represented as systems of interacting objects. Applications include load balancing models, epidemic spreading, cache replacement policies, and large-scale data centers, for which mean field approximation gives very accurate estimates of transient or steady-state behaviors. In a series of recent papers, a new and more accurate approximation, called the refined mean field approximation, has been presented. A key strength of this technique lies in its applicability to not-so-large systems. Yet, computing this new approximation can be cumbersome. In , we present a tool, called rmf tool and available at , that takes the description of a mean field model and numerically computes its mean field approximation and its refinement.
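The kind of model such a tool consumes can be illustrated by the classical power-of-two-choices load balancing system, whose (unrefined) mean field approximation is a simple ODE with the well-known fixed point x_k = λ^(2^k − 1) (a sketch with forward Euler integration; the refinement itself requires additional correction terms not shown here):

```python
import numpy as np

lam, K = 0.7, 15               # arrival rate per server, queue-length cutoff
# x[k] = fraction of servers with at least k jobs (x[0] == 1 by definition).

def drift(x):
    d = np.zeros(K + 1)
    for k in range(1, K):
        # arrivals join the shorter of two sampled queues; services at rate 1
        d[k] = lam * (x[k-1]**2 - x[k]**2) - (x[k] - x[k+1])
    d[K] = lam * (x[K-1]**2 - x[K]**2) - x[K]   # truncation boundary
    return d

x = np.zeros(K + 1)
x[0] = 1.0
for _ in range(40_000):        # forward Euler until the fixed point is reached
    x += 0.05 * drift(x)
# Fixed point of the mean field ODE: x[k] = lam**(2**k - 1).
```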

Some form of stationarity and time synchronism is generally required to guarantee the efficiency of distributed algorithms in large systems. In this series of works, we work on relaxing some of these requirements.

One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent on distributed computing architectures, possibly asynchronously. However, a key obstacle to the efficient implementation of DASGD is the issue of delays: when a computing node contributes a gradient update, the global model parameter may have already been updated by other nodes several times over, rendering this gradient information stale. These delays can quickly add up if the computational throughput of a node is saturated, so the convergence of DASGD may be compromised in the presence of large delays. In , we show that, by carefully tuning the algorithm's step size, convergence to the critical set is still achieved in mean square, even if the delays grow unbounded at a polynomial rate. We also establish finer results for a broad class of structured optimization problems (called variationally coherent), where we show that DASGD converges to a global optimum with probability 1 under the same delay assumptions. Together, these results contribute to the broad landscape of large-scale non-convex stochastic optimization by offering state-of-the-art theoretical guarantees and providing insights for algorithm design.
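The delay phenomenon is easy to emulate on a single machine (a toy quadratic objective with a fixed delay; the papers handle unbounded, polynomially growing delays):

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(2)
delay = 10                       # every gradient is applied 10 steps late
x = np.array([5.0, -3.0])        # minimize f(x) = ||x||^2 / 2, optimum at 0
in_flight = deque()              # gradients computed but not yet applied

for t in range(1, 20_001):
    # A worker computes a noisy gradient at the *current* iterate...
    in_flight.append(x + rng.normal(0.0, 0.1, size=2))
    if len(in_flight) > delay:
        g = in_flight.popleft()            # ...which is stale when it arrives.
        x = x - g / (t + 2 * delay)        # decaying step size tames staleness
```

Despite every update being stale, the iterate still converges close to the optimum because the step size decays relative to the delay.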

Finally, we examine in and the long-run behavior of multi-agent online learning in games that evolve over time. Specifically, we examine the equilibrium tracking and convergence properties of no-regret learning algorithms in such time-varying games. We focus on learning via "mirror descent", a widely used class of no-regret learning schemes where players take small steps along their individual payoff gradients and then "mirror" the output back to their action sets. We show that the induced sequence of play (a) converges to Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit, and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback, i.e., the "bandit" case where players only observe the payoffs of their chosen actions.
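The convergence statement can be checked on a minimal strongly monotone example, using Euclidean projection as the mirror map and exact gradient feedback (the quadratic costs below are synthetic, chosen so that the Nash equilibrium is known in closed form):

```python
import numpy as np

# Two-player strongly monotone game (hypothetical quadratic costs):
#   player 1 minimizes x1^2 + x1*x2 - x1   over x1 in [-2, 2]
#   player 2 minimizes x2^2 + x1*x2 + x2   over x2 in [-2, 2]
# Unique Nash equilibrium: (1, -1).
def payoff_gradients(x):
    return np.array([2*x[0] + x[1] - 1,
                     2*x[1] + x[0] + 1])

x = np.array([0.0, 0.0])
for t in range(2000):
    # Mirror descent with the Euclidean mirror map = projected gradient step.
    x = np.clip(x - 0.1 * payoff_gradients(x), -2, 2)
```

With this step size the iterates contract geometrically to the equilibrium; cycling only appears in games that are not monotone (e.g., zero-sum bilinear games).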

Variational inequalities – and, in particular, stochastic variational inequalities – have recently attracted considerable attention in machine learning and learning theory as a flexible paradigm for "optimization beyond minimization", i.e., for problems where finding an optimal solution does not necessarily involve minimizing a loss function.

Another challenging and promising problem, motivated by applications in machine learning and operations research, is stochastic regret minimization in non-convex problems:

Designing algorithms that perform well across a variety of regimes is particularly challenging. In a series of works, we study how to get the best of both worlds in a variety of contexts:

Finally, we study how such no-regret strategies fare in a multi-agent context. In game-theoretic learning, several agents are simultaneously following their individual interests, so the environment is non-stationary from each player's perspective. In this context, the performance of a learning algorithm is often measured by its regret. However, no-regret algorithms are not created equal in terms of game-theoretic guarantees: depending on how they are tuned, some of them may drive the system to an equilibrium, while others could produce cyclic, chaotic, or otherwise divergent trajectories. To account for this, we propose in a range of no-regret policies based on optimistic mirror descent, with the following desirable properties: i) they do not require any prior tuning or knowledge of the game; ii) they all achieve

Learning in games is considerably more difficult than classical minimization, as the resulting equilibria may or may not be attractive and the dynamics often exhibit cyclic behaviors.

The Multi-armed Stochastic Bandit framework is a classic reinforcement learning problem for studying the exploration-exploitation trade-off, for which several optimal algorithms, such as UCB and Thompson sampling (whose optimality was only recently proved by Kaufmann et al.), have been proposed. While the first is an optimistic strategy that systematically chooses the "most promising" arm, the second builds on a Bayesian perspective and samples the posterior to decide which arm to select. Markovian bandits model situations where the reward distribution of an arm is governed by a Markov chain and may thus exhibit temporal changes. A key challenge in this context is the curse of dimensionality: the state space of the Markov process is exponential in the number of system components, so the complexity of computing an optimal policy and its value is exponential as well.
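For reference, the optimistic strategy can be sketched in a few lines (a UCB1-style index with exploration constant 2, on Bernoulli arms with hypothetical means):

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """UCB1: play each arm once, then the arm maximizing
    empirical mean + sqrt(2 ln t / n_pulls)."""
    rng = random.Random(seed)
    K = len(arm_means)
    counts = [0] * K       # number of pulls of each arm
    sums = [0.0] * K       # cumulated reward of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= K:
            a = t - 1      # initialization: play each arm once
        else:
            a = max(range(K), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2.0 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < arm_means[a] else 0.0   # Bernoulli reward
        counts[a] += 1
        sums[a] += r
        total += r
    return counts, total

counts, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
```

Over 5000 rounds the best arm (mean 0.8) receives the vast majority of the pulls, with suboptimal arms pulled only O(log T) times.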

The general deployment of machine-learning systems to guide strategic decisions in many domains, ranging from security to recommendation and advertising, leads to an interesting line of research from a game-theoretic perspective.

A first line of work in this context is related to fairness and adversarial classification. Discrimination in selection problems such as hiring or college admission is often explained by implicit bias from the decision maker against disadvantaged demographic groups. In , we consider a model where the decision maker receives a noisy estimate of each candidate's quality, whose variance depends on the candidate's group; we argue that such differential variance is a key feature of many selection problems. We analyze two notable settings: in the first, the noise variances are unknown to the decision maker, who simply picks the candidates with the highest estimated quality independently of their group; in the second, the variances are known and the decision maker picks the candidates having the highest expected quality given the noisy estimate. We show that both baseline decision makers yield discrimination, although in opposite directions: the first leads to under-representation of the low-variance group while the second leads to under-representation of the high-variance group. We study the effect on the selection utility of imposing a fairness mechanism that we term the $
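Both effects are easy to reproduce in a small simulation (synthetic Gaussian qualities; two groups of equal size differing only in noise variance):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000                       # candidates per group (synthetic example)
# True qualities are identically distributed in both groups...
q = rng.normal(0.0, 1.0, (2, n))
sigma = np.array([0.5, 2.0])      # ...but the estimation noise differs per group
est = q + rng.normal(0.0, 1.0, (2, n)) * sigma[:, None]

def share_of_group1(scores, k):
    """Fraction of the k selected candidates coming from group 1."""
    top = np.argsort(scores.ravel())[-k:]
    return np.mean(top >= n)      # flattened indices >= n belong to group 1

k = 20_000                        # select the top 10%
naive = share_of_group1(est, k)   # variances unknown: rank raw estimates
# Variances known: rank Bayesian posterior means, which shrink noisy estimates.
shrunk = share_of_group1(est / (1 + sigma**2)[:, None], k)
```

With a selective threshold, ranking raw estimates over-selects the high-variance group (its estimates have fatter tails), whereas posterior-mean shrinkage reverses the effect and under-selects it.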

The Colonel Blotto game is a well-known resource allocation game, introduced by Borel (1921), that finds applications in many domains, such as politics (where political parties distribute their budgets to compete over voters), cybersecurity (where effort is distributed to attack or defend targets), online advertising (where marketing campaigns allocate broadcast time for ads to attract web users), and telecommunications (where network service providers distribute and lease their spectrum to users). In , we introduce the Colonel Blotto game with favoritism, an extension where the winner-determination rule is generalized to include pre-allocations and asymmetry of the players' resource effectiveness on each battlefield. Such favoritism is found in many classical applications of the Colonel Blotto game. We focus on the Nash equilibrium. First, we consider the closely related model of all-pay auctions with favoritism and completely characterize its equilibrium. Based on this result, we prove the existence of a set of optimal univariate distributions (which serve as candidate marginals for an equilibrium) of the Colonel Blotto game with favoritism, and show an explicit construction thereof. In several particular cases, this directly leads to an equilibrium of the Colonel Blotto game with favoritism. In other cases, we use these optimal univariate distributions to derive an approximate equilibrium with well-controlled approximation error. Finally, we propose an algorithm, based on the notion of winding number of parametric curves, to efficiently compute an approximation of the proposed optimal univariate distributions with arbitrarily small error.

The COVID-19 pandemic has deeply impacted our lives and caused more than 5.49 million deaths worldwide, making it one of the deadliest pandemics in history. Several policies have been proposed to respond to this illness and mitigate both its spread and its impact on populations' health and the economy. We have been studying and supporting some of these policies through our expertise in large system analysis (game theory, mean field, etc.).

Bary Pradelski has been among the first researchers to actively promote green zoning, which has emerged as a widely used policy response to the COVID-19 pandemic . ‘Green zones’ (areas where the virus is under control based on a uniform set of conditions) can progressively return to normal economic and social activity levels, and mobility between them is permitted. By contrast, stricter public health measures are in place in ‘red zones’, and mobility between red and green zones is restricted. France and Spain were among the first countries to introduce green zoning in April 2020. Subsequently, more and more countries followed suit, and the European Commission advocated for the implementation of a European green zoning strategy, which has been supported by the EU member states. While coordination problems remain, green zoning has proven to be an effective strategy for containing the spread of the virus and limiting its negative economic and social impact, and it should provide important lessons for future outbreaks. Research in epidemiology indicates that thoroughly implemented and operationalised green zoning can prevent the spread of a transmissible disease that is poorly understood, highly virulent, and potentially highly lethal. Finally, there is strong evidence that green zoning can reduce economic and societal damage as it avoids worst-in-class measures.

Unfortunately, locking down entire regions is not satisfactory and has dramatic consequences on health, the economy, and civil liberties, and countries have responded in very different ways. Some countries have consistently aimed for elimination (i.e., maximum action to control SARS-CoV-2 and stop community transmission as quickly as possible), while others have opted for mitigation (i.e., action increased in a step-wise, targeted way to reduce cases so as not to overwhelm health-care systems). In , we show that the former have generally fared much better than the latter, by comparing deaths, gross domestic product (GDP) growth, and strictness of lockdown measures during the first 12 months of the pandemic. Furthermore, mitigation has favored the proliferation of new SARS-CoV-2 variants, and countries that opt to live with the virus will likely pose a threat to other countries, notably those with less access to COVID-19 vaccines. Although many scientists have called for a coordinated international strategy to eliminate SARS-CoV-2, this call has unfortunately not yet been heeded.

Patrick Loiseau has a Cifre contract with Naver labs (2020-2023) on “Fairness in multi-stakeholder recommendation platforms”, which supports the PhD student Till Kletti.

Patrick Loiseau and Panayotis Mertikopoulos have a grant from the DGA (2018-2021) that complements the funding of a PhD student (Benjamin Roussillon) working on game-theoretic models for adversarial classification.

DISCMAN project (UGA IRS project). DISCMAN (Distributed Control for Multi-Agent systems and Networks) is a joint IRS project funded by the IDEX Université Grenoble-Alpes. Its main objective is to develop distributed equilibrium convergence algorithms for large-scale control and optimization problems, both offline and online. It is coordinated by P. Mertikopoulos (POLARIS) and involves a joint team of researchers from the LIG and LJK laboratories in Grenoble.

The efforts of B. Pradelski on COVID policy have received extensive media coverage in Le Monde and other major newspapers. See for example: