## Section: New Results

### Game Theory

Participants : Eitan Altman, Konstantin Avrachenkov, Mandar Datar, Swapnil Dhamal, Alain Jean-Marie.

#### Resource allocation: Kelly mechanism and Tullock game

The price-anticipating Kelly mechanism (PAKM) is one of the most widely used schemes for allocating divisible resources among strategic users in communication networks and computing systems. It is known in other communities as the Tullock game. The users are assumed to be selfish but benign: each one maximizes his individual utility of the allocated resources minus his payment to the network operator. E. Altman, A. Reiffers-Masson (IISc Bangalore, India), D. Sadoc-Menasche (UFRJ, Brazil), M. Datar, S. Dhamal, C. Touati (Inria Grenoble-Rhone-Alpes) and R. El-Azouzi (CERI/LIA, Univ Avignon) have first applied this type of game to the competition between miners in blockchain-based crypto-currency protocols [11]. A blockchain is a distributed, synchronized and secure database containing validated blocks of transactions. Blocks are validated by special nodes called miners, and each new block is validated by solving a computationally difficult problem called the proof-of-work puzzle. The miners compete against each other, and the first to solve the problem announces it; the block is then verified by the majority of miners in the network, which try to reach consensus. Once the propagated block reaches consensus, it is added to the distributed database. The miner who found the solution receives a reward either in the form of crypto-currencies or in the form of a transaction reward. The authors show that the discrete version of the game is equivalent to a congestion game and thus has an equilibrium in pure strategies.
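To make the mechanism concrete, the following minimal sketch computes an equilibrium of a Kelly (Tullock) allocation game by damped best-response iteration. It is an illustration only and does not reproduce the blockchain model of [11]: the linear valuations, the unit resource, the damping factor and all function names are assumptions made for this example.

```python
import math

def kelly_allocation(bids):
    """Proportional (Kelly) allocation of one unit of resource: x_i = b_i / sum_j b_j."""
    total = sum(bids)
    if total == 0:
        return [0.0 for _ in bids]
    return [b / total for b in bids]

def best_response(value, others_total, eps=1e-9):
    """Bid maximizing value * b / (b + others_total) - b for a price-anticipating user.

    Assumes a linear valuation (hypothetical choice); the first-order condition
    gives b = sqrt(value * others_total) - others_total, clipped at zero.
    """
    if others_total <= 0:
        return eps  # any tiny positive bid captures the whole resource
    return max(0.0, math.sqrt(value * others_total) - others_total)

def best_response_dynamics(values, rounds=200, damping=0.5, start=0.1):
    """Sequential best responses, each damped towards the previous bid."""
    bids = [start] * len(values)
    for _ in range(rounds):
        for i, v in enumerate(values):
            others = sum(bids) - bids[i]
            target = best_response(v, others)
            bids[i] = (1 - damping) * bids[i] + damping * target
    return bids

values = [1.0, 1.0, 1.0]  # three symmetric users (assumed for the example)
equilibrium_bids = best_response_dynamics(values)
shares = kelly_allocation(equilibrium_bids)
```

With $n$ symmetric users of valuation $v$, the first-order condition gives the equilibrium bid $v(n-1)/n^2$, which the iteration should approach (here $2/9$ per user, each receiving one third of the resource).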

E. Altman, M. Datar, C. Touati (Inria Grenoble-Rhone-Alpes) and G. Burnside (Nokia Bell Labs) then introduce further constraints on the total amount of resources used and study pricing issues in this constrained game. They show that a normalized equilibrium (in the sense of Rosen) exists, which implies that pricing can be done in a scalable way, i.e., prices can be chosen to be independent of the player. One way to establish this structure is to show that the utilities are diagonally strictly concave (an extension of concavity to the game setting), which they did in [27].
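For reference, Rosen's concavity condition can be stated as follows. The notation is the usual one from Rosen's framework and is an assumption here, not taken from [27]: $\varphi_i$ denotes the utility of player $i$, $x = (x_1, \dots, x_n)$ the joint strategy, and $r = (r_1, \dots, r_n)$ a vector of positive weights. With the pseudo-gradient

$$
g(x, r) = \begin{pmatrix} r_1 \nabla_{x_1} \varphi_1(x) \\ \vdots \\ r_n \nabla_{x_n} \varphi_n(x) \end{pmatrix},
$$

the game is diagonally strictly concave for the weights $r$ if

$$
(x^1 - x^0)^\top g(x^0, r) + (x^0 - x^1)^\top g(x^1, r) > 0 \quad \text{for all feasible } x^0 \neq x^1 .
$$

At a normalized equilibrium the multiplier that player $i$ attaches to a shared constraint has the form $\lambda / r_i$, so the choice $r = (1, \dots, 1)$ yields a single multiplier, and hence a single price, common to all players.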

In [25], Y. Xu, Z. Xiao, T. Ni, X. Wang (all from Fudan Univ, China), J. H. Wang (Tsinghua Univ, China) and E. Altman formulate a non-cooperative Tullock game consisting of a finite number of benign users and one misbehaving user. The maliciousness of the misbehaving user is captured by how much he is willing to pay for a unit degradation in the utilities of the benign users. The network operator allocates resources to all the users via the price-anticipating Kelly mechanism. The authors present six performance metrics concerning the total utility and the total net utility of the benign users, and the revenue of the network operator, under three different scenarios: with and without the misbehaving user, and the maximum. They quantify the robustness of PAKM against misbehaving actions by deriving upper and lower bounds on these metrics.

#### A stochastic game with non-classical information structure

In [44], V. Kavitha, M. Maheshwari (both from IIT Bombay, India) and E. Altman introduce a stochastic game with partial, asymmetric and non-classical information. They obtain relevant equilibrium policies using a new approach which allows managing the belief updates in a structured manner. Agents have access only to partial information updates, and the approach is to consider optimal open-loop control until the next information update. The agents continuously control the rates of their Poisson search clocks in order to acquire locks; the agent who acquires all the locks before the others receives a reward of one. However, the agents have no information about the acquisition status of the others, and each incurs a cost proportional to its rate process. The authors solve the problem for the case of two agents and two locks and conjecture the results for a general number of agents. They show that a pair of (partial) state-dependent time-threshold policies forms a Nash equilibrium.

#### Zero-Sum stochastic games over the field of real algebraic numbers

In [14], K. Avrachenkov together with V. Ejov (Flinders Univ, Australia), J. Filar and A. Moghaddam (both from Univ of Queensland, Australia) have considered finite-state, finite-action, zero-sum stochastic games in which the data defining the game lie in the ordered field of real algebraic numbers. In both the discounted and the limiting average versions of these games, they prove that the value vector also lies in the same field of real algebraic numbers. Their method supplies a finite construction of univariate polynomials whose roots contain the entries of these value vectors. In the case where the data of the game are rational, the method also provides a way of checking whether the entries of the value vectors are also rational.

#### Evolutionary Markov games

I. Brunetti (CIRED), Y. Hayel (CERI/LIA, Univ Avignon) and E. Altman extend in [59] evolutionary game theory by introducing the concept of individual state. They analyze a particularly simple case, in which they associate a state with each player and suppose that this state determines the set of available actions. They consider deterministic stationary policies and suppose that the choice of a policy determines the fitness of the player and affects the evolution of the state. They define the interdependent dynamics of states and policies and introduce the State Policy coupled Dynamics in order to study the evolution of the population profile. They prove the relation between the rest points of the system and the equilibria of the game. They then assume that the processes of states and policies move with different velocities: this assumption allows them to solve the system and to find the equilibria of the game with two different methods, the singular perturbation method and a matrix approach.

#### Stochastic replicator dynamics

In [12], K. Avrachenkov and V.S. Borkar (IIT Bombay, India) have considered a novel model of stochastic replicator dynamics for potential games that converts to a Langevin equation on a sphere after a change of variables. This is distinct from the models of stochastic replicator dynamics studied earlier. In particular, it is ill-posed due to non-uniqueness of solutions, but is amenable to the Kolmogorov selection principle, which picks a unique solution. The model allows the authors to make specific statements regarding metastable states, such as small noise asymptotics for mean exit times from their domains of attraction, and quasi-stationary measures. They illustrate the general results by specializing them to replicator dynamics on graphs and demonstrate that numerical experiments support the theoretical predictions.
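The role of the change of variables can be checked numerically in the noise-free case. The sketch below assumes the substitution $y_i = \sqrt{x_i}$ and a quadratic potential $F(x) = x^\top A x / 2$ with symmetric $A$; both are illustrative assumptions, and the noise term of [12] is omitted. Under these assumptions the replicator field, written in the $y$ coordinates, equals one quarter of the gradient of $G(y) = F(y^2)$ projected onto the tangent space of the unit sphere, i.e. the drift is a gradient flow on the sphere.

```python
import math

def payoffs(A, x):
    """Payoff vector f(x) = A x; with symmetric A the game has potential F(x) = x^T A x / 2."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def replicator_rhs(A, x):
    """Deterministic replicator field: dx_i/dt = x_i (f_i(x) - average payoff)."""
    f = payoffs(A, x)
    avg = sum(xi * fi for xi, fi in zip(x, f))
    return [xi * (fi - avg) for xi, fi in zip(x, f)]

def sphere_velocity(A, y):
    """Velocity of y_i = sqrt(x_i) induced by the replicator field: dy_i/dt = (dx_i/dt) / (2 y_i)."""
    x = [yi * yi for yi in y]
    return [dxi / (2 * yi) for dxi, yi in zip(replicator_rhs(A, x), y)]

def tangential_gradient(A, y):
    """Gradient of G(y) = F(y^2), projected onto the tangent space of the unit sphere at y."""
    x = [yi * yi for yi in y]
    f = payoffs(A, x)
    grad = [2 * yi * fi for yi, fi in zip(y, f)]        # chain rule: dG/dy_i = 2 y_i f_i
    radial = sum(gi * yi for gi, yi in zip(grad, y))    # component along y (|y| = 1)
    return [gi - radial * yi for gi, yi in zip(grad, y)]

# Hypothetical symmetric payoff matrix and interior point of the simplex.
A = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.5], [1.0, 0.5, 3.0]]
x = [0.2, 0.3, 0.5]
y = [math.sqrt(xi) for xi in x]
velocity = sphere_velocity(A, y)
projected = tangential_gradient(A, y)
```

Comparing `velocity` with `projected` entry by entry gives the factor one quarter, and the inner product of `velocity` with `y` vanishes, which confirms that the flow stays on the sphere.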

#### Stochastic coalitional better-response dynamics for finite games with application to network formation games

In [57], K. Avrachenkov and V.V. Singh (IIT Delhi, India) have considered coalition formation among players in an $n$-player finite strategic game over an infinite horizon. At each time a randomly formed coalition makes a joint deviation from the current action profile such that, at the new action profile, all the players from the coalition are strictly better off. Such deviations define a coalitional better-response (CBR) dynamics that is in general stochastic. The CBR dynamics either converges to a $\mathcal{K}$-stable equilibrium or becomes stuck in a closed cycle. The authors also assume that at each time the selected coalition makes a mistake in its deviation with small probability, which adds mutations (perturbations) to the CBR dynamics. They prove that all $\mathcal{K}$-stable equilibria and all action profiles from closed cycles that have minimum stochastic potential are stochastically stable. A similar statement holds for strict $\mathcal{K}$-stable equilibria. They apply the CBR dynamics to study the dynamic formation of networks in the presence of mutations. Under the CBR dynamics all strongly stable networks and closed cycles of networks are stochastically stable.
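The unperturbed CBR dynamics can be sketched on a toy example. The two-player coordination game, the uniform choice of coalitions and deviations, and all names below are assumptions made for illustration; mutations are not modelled. The profile (B, B) is a Nash equilibrium, yet the coalition of both players can jointly move to (A, A), which no coalition can improve upon.

```python
import itertools
import random

ACTIONS = ("A", "B")
PLAYERS = (0, 1)

def payoff(profile):
    """Hypothetical coordination game: both A pays 2 each, both B pays 1 each, mismatch pays 0."""
    if profile[0] != profile[1]:
        return (0, 0)
    return (2, 2) if profile[0] == "A" else (1, 1)

def improving_deviations(profile, coalition):
    """Joint deviations of `coalition` that strictly benefit every one of its members."""
    current = payoff(profile)
    found = []
    for joint in itertools.product(ACTIONS, repeat=len(coalition)):
        candidate = list(profile)
        for player, action in zip(coalition, joint):
            candidate[player] = action
        candidate = tuple(candidate)
        new = payoff(candidate)
        if all(new[p] > current[p] for p in coalition):
            found.append(candidate)
    return found

def cbr_dynamics(start, steps, rng):
    """At each step a uniformly drawn coalition applies a random improving joint deviation, if any."""
    coalitions = [c for size in (1, 2) for c in itertools.combinations(PLAYERS, size)]
    profile = start
    for _ in range(steps):
        coalition = rng.choice(coalitions)
        options = improving_deviations(profile, coalition)
        if options:
            profile = rng.choice(options)
    return profile

final_profile = cbr_dynamics(("B", "B"), steps=200, rng=random.Random(0))
```

Starting from (B, B), only the grand coalition has an improving deviation, so the dynamics leaves (B, B) as soon as that coalition is drawn and then stays at (A, A).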

#### Strong Stackelberg equilibria in stochastic games

In a joint work with V. Bucarey López (Univ Libre de Bruxelles, Belgium and Inria team Inocs), E. Della Vecchia (Univ Nacional de Rosario, Argentina), and F. Ordóñez (Univ de Chile, Chile), A. Jean-Marie has considered Stackelberg equilibria for discounted stochastic games. The motivation originates in applications of Game Theory to security issues, but the question is of general theoretical and practical relevance. The solution concept of interest is that of Stationary Strong Stackelberg Equilibrium (SSSE) policies: both players apply state feedback policies; the leader announces her strategy and the follower plays a best response to it. Ties are broken in favor of the leader. The authors provide classes of games where the SSSE exists, and prove via counterexamples that the SSSE does not exist in the general case. They define suitable dynamic programming operators whose fixed points are referred to as Fixed Point Equilibria (FPE). They show that the FPE and SSSE coincide for a class of games with Myopic Follower Strategy. Numerical examples shed light on the relationship between SSSE and FPE and on the behavior of Value Iteration, Policy Iteration and Mathematical Programming formulations for this problem. A security application illustrates the solution concepts and the efficiency of the algorithms introduced. The results are presented in [67], [50], [51].
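The commitment and tie-breaking rules can be illustrated in the simplest setting, a single state and a single stage. The sketch below does not implement the dynamic programming operators of [67], [50], [51]; the 2x2 payoff matrices, the grid search over the leader's mixed commitment and the function names are assumptions chosen for the example.

```python
from fractions import Fraction

# Hypothetical 2x2 bimatrix game: rows are leader actions, columns are follower actions.
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]

def expected(matrix, p, column):
    """Expected payoff when the leader plays row 0 with probability p and the follower plays `column`."""
    return p * matrix[0][column] + (1 - p) * matrix[1][column]

def follower_reply(p):
    """Best response of the follower; ties are broken in favor of the leader."""
    best = max(expected(FOLLOWER, p, c) for c in (0, 1))
    ties = [c for c in (0, 1) if expected(FOLLOWER, p, c) == best]
    return max(ties, key=lambda c: expected(LEADER, p, c))

def strong_stackelberg(grid=100):
    """Search the leader's commitments p = k / grid; return (leader value, p, follower column)."""
    best = None
    for k in range(grid + 1):
        p = Fraction(k, grid)  # exact arithmetic so that ties are detected reliably
        column = follower_reply(p)
        value = expected(LEADER, p, column)
        if best is None or value > best[0]:
            best = (value, p, column)
    return best

value, commitment, reply = strong_stackelberg()
```

In this example the leader does best by mixing so that the follower is exactly indifferent; the tie-break in her favor is what makes the optimum attained, which is the reason the strong variant of the equilibrium is used.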

#### Routing on a ring network

In [60], R. Burra, C. Singh and J. Kuri (IISc Bangalore, India) study with E. Altman routing on a ring network in which traffic originates from nodes on the ring and is destined to the center. The users can take direct paths from the originating nodes to the center as well as multihop paths via other nodes. The authors show that routing games with only one-hop and two-hop paths and linear costs are potential games. They give explicit expressions for the Nash equilibrium flows for networks with any generic cost function and symmetric loads. They also consider a ring network with a random number of users at the nodes, all of them having the same demand, and linear routing costs. They give an explicit characterization of the Nash equilibria for two cases: (i) general i.i.d. loads with one-hop and two-hop paths, and (ii) Bernoulli distributed loads. They also analyze optimal routing in each of these cases.

#### Routing games applied to the network neutrality debate

The Network Neutrality issue has been at the center of debate worldwide lately. Some countries have established laws so that the principles of network neutrality are respected. One of the questions discussed in these debates is whether to allow agreements between service and content providers, i.e. to allow an operator to give preferential treatment to traffic from some providers (identity-based discrimination). In [63], A. Reiffers-Masson (IISc Bangalore), Y. Hayel, T. Jimenez (CERI/LIA, Univ Avignon) and E. Altman study this question using models from routing games.

#### Peering vs transit: A game theoretical model for autonomous systems connectivity

G. Accongiagioco (IMT, Italy), E. Altman, E. Gregori (Institute of Informatics and Telematics, Univ Pisa) and L. Lenzini (Dipartimento di Informatica, Univ Pisa) propose a model for network optimization in a non-cooperative game setting, with specific reference to Internet connectivity. The model describes the decisions taken by an Autonomous System when joining the Internet. They first define a realistic model for the interconnection costs incurred; they then use this cost model to perform a game theoretic analysis of the decisions related to link creation and traffic routing, taking into account the peering/transit dichotomy. The proposed model does not fall into the standard category of routing games, hence they devise new tools to solve it by exploiting specific properties of the game. They prove analytically the existence of multiple equilibria.

#### Altruistic behavior and evolutionary games

Within some species, such as bees or ants, the individual who interacts is not the one who reproduces. This implies that the Darwinian fitness is related to the entire swarm and not to a single individual, and thus standard Evolutionary Game models do not apply to these species. Furthermore, in many species one finds altruistic behaviors, which favor the group to which the playing individual belongs but may hurt that individual. In [58], [62], I. Brunetti (CIRED), R. El-Azouzi, M. Haddad, H. Gaiech, Y. Hayel (LIA/CERI, Univ Avignon) and E. Altman define evolutionary games between groups of players and study the equilibrium behavior as well as the convergence to equilibrium.