## Section: Scientific Foundations

### Large System Modeling and Analysis

Participants: Bruno Gaujal, Derrick Kondo, Arnaud Legrand, Panayotis Mertikopoulos, Florence Perronnin, Brigitte Plateau, Olivier Richard, Corinne Touati, Jean-Marc Vincent.

#### Simulation of distributed systems

Since the advent of distributed computer systems, an active field of
research has been the investigation of *scheduling* strategies
for parallel applications. The common approach is to employ
scheduling heuristics that approximate an optimal schedule.
Unfortunately, it is often impossible to obtain analytical results
to compare the efficiency of these heuristics. One possibility is
to conduct large numbers of back-to-back experiments on real
platforms. While this is possible on tightly-coupled platforms, it
is infeasible on modern distributed platforms (e.g., grids or
peer-to-peer environments), as it is labor-intensive and does not
yield repeatable results. The solution is to resort to
*simulations*.

##### Flow Simulations

To make simulations of large systems efficient and trustworthy, we have used flow simulations (where streams of packets are abstracted into flows). SimGrid is a simulation platform that not only enables one to obtain repeatable results but also makes it possible to explore wide ranges of platform and application scenarios.
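
For illustration, the sketch below shows the kind of model flow-level simulators rest on: links have capacities, each flow crosses a set of links, and bandwidth is shared in a max-min fair way by progressive filling. This is a minimal toy model, not SimGrid's actual sharing code; the link names, capacities and routes are made up.

```python
def max_min_rates(capacity, routes):
    """Max-min fair sharing by progressive filling: raise all unfrozen
    flow rates at the same speed until some link saturates, then freeze
    the flows crossing that link, and repeat."""
    remaining = dict(capacity)                 # spare capacity per link
    rate = {f: 0.0 for f in routes}            # allocated rate per flow
    active = set(routes)                       # flows not yet bottlenecked
    while active:
        load = {l: sum(1 for f in active if l in routes[f])
                for l in remaining}            # active flows per link
        used = [l for l in remaining if load[l] > 0]
        if not used:
            break
        inc = min(remaining[l] / load[l] for l in used)
        for f in active:                       # equal increase for all
            rate[f] += inc
        for l in used:
            remaining[l] -= inc * load[l]
        saturated = {l for l in used if remaining[l] < 1e-12}
        active = {f for f in active if not (routes[f] & saturated)}
    return rate

# Hypothetical two-link platform: f2 and f3 share the slow link L2
# (2 each), then f1 gets the rest of L1 (8).
print(max_min_rates({"L1": 10.0, "L2": 4.0},
                    {"f1": {"L1"}, "f2": {"L1", "L2"}, "f3": {"L2"}}))
```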

##### Perfect Simulation

Using a constructive, event-based representation of a Markovian queuing network (often called a GSMP), we have designed perfect simulation algorithms that compute samples distributed according to the stationary distribution of the Markov process, with no bias. The tool based on our algorithms ($\psi$) can sample the stationary measure of Markov processes directly from the queuing network description. Some monotone networks with up to $10^{50}$ states can be handled within minutes on a regular PC.
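
The construction behind such samplers is Propp and Wilson's *coupling from the past*: for a monotone chain, one runs two trajectories from the extremal states, driven by the same randomness, starting from ever further in the past; once they coalesce, the common value is an exact stationary sample. The sketch below illustrates this on a toy finite birth-death queue, not on the general GSMPs handled by $\psi$.

```python
import random

def transition(x, u, K, p):
    """Monotone update: arrival with probability p, else departure."""
    return min(x + 1, K) if u < p else max(x - 1, 0)

def cftp(K=10, p=0.4, seed=1):
    """Coupling from the past on {0, ..., K}: an exact stationary sample."""
    rng = random.Random(seed)
    U = []                                # U[k] drives the step at time -(k+1)
    T = 1
    while True:
        while len(U) < T:                 # draw fresh randomness only
            U.append(rng.random())        # for the newly added past steps
        lo, hi = 0, K                     # extremal states at time -T
        for k in range(T - 1, -1, -1):    # run from time -T up to time 0
            lo = transition(lo, U[k], K, p)
            hi = transition(hi, U[k], K, p)
        if lo == hi:                      # coalescence: every start state
            return lo                     # would have reached this value
        T *= 2                            # not coalesced: go further back

# Empirical mean of exact samples; the stationary law here is a
# geometric with ratio p/(1-p), truncated to {0, ..., K}.
samples = [cftp(seed=s) for s in range(2000)]
print(sum(samples) / len(samples))
```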

#### Fluid models and mean field limits

When the size of a system grows very large, one may use asymptotic techniques to obtain a faithful estimate of its behavior. Mean field analysis and fluid limits are such tools, usable at both the modeling and the simulation level. Proving that large discrete dynamic systems can be approximated by continuous dynamics relies on the theory of stochastic approximation pioneered by Michel Benaïm, or on the population dynamics introduced by Thomas Kurtz and others. We have extended the stochastic approximation approach to take into account discontinuities in the dynamics, as well as to tackle optimization issues.
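
The flavor of the approximation can be seen on a toy system of $N$ two-state objects (made-up rates, not one of the applications below): as $N$ grows, the stochastic trajectory of the fraction of objects in the ON state concentrates around the solution of the deterministic mean field ODE.

```python
import random

def simulate(N, beta=0.8, gamma=0.5, T=30.0, dt=0.05, seed=0):
    """N objects, each ON or OFF; an OFF object turns ON at rate
    beta * (fraction ON), an ON object turns OFF at rate gamma.
    Returns the final fraction of ON objects."""
    rng = random.Random(seed)
    on = N // 10                                  # start with 10% ON
    for _ in range(int(T / dt)):
        m = on / N
        up = sum(rng.random() < beta * m * dt for _ in range(N - on))
        down = sum(rng.random() < gamma * dt for _ in range(on))
        on += up - down
    return on / N

def mean_field(beta=0.8, gamma=0.5, T=30.0, dt=0.05, m=0.1):
    """Euler integration of the limit ODE m' = beta*m*(1-m) - gamma*m."""
    for _ in range(int(T / dt)):
        m += dt * (beta * m * (1 - m) - gamma * m)
    return m

# The ODE fixed point is 1 - gamma/beta = 0.375; the larger N, the
# closer the stochastic system stays to the deterministic trajectory.
for N in (100, 2000):
    print(N, simulate(N), mean_field())
```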

Recent applications include call centers and peer-to-peer systems, where the mean field approach helps to better understand the behavior of the system and to solve several optimization problems. Another application concerns task brokering in desktop grids, taking into account statistical features of the tasks as well as of the availability of the processors. Mean field has also been applied to the performance evaluation of work stealing in large systems, and to model central/local controllers as well as knitting systems.

#### Discrete Event Systems

The interaction of several processes through synchronization, competition or superposition within a distributed system is a major source of difficulty, because it induces a state space explosion and non-linear dynamic behavior. The use of exotic algebras, such as (min,max,plus), can help. Highly synchronous systems become linear in this framework and are therefore amenable to formal solutions. More complicated systems are linear neither in (max,plus) nor in the classical algebra. Several qualitative properties (sub-additivity, monotonicity or convexity) have been established for a large class of such systems, called free-choice Petri nets. Such qualitative properties are sometimes enough to identify the class of routing policies optimizing the global behavior of the system. They are also useful in designing efficient numerical tools that compute the asymptotic behavior of these systems.
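
As a small illustration with made-up delays, the firing dates of a timed event graph obey a (max,plus)-linear recurrence $x(k+1) = A \otimes x(k)$, where $(A \otimes x)_i = \max_j (A_{ij} + x_j)$; the asymptotic throughput is governed by the (max,plus) eigenvalue of $A$, i.e., its maximum cycle mean.

```python
NEG_INF = float("-inf")            # semiring "zero": no direct dependency

def maxplus_matvec(A, x):
    """(max,plus) product: y_i = max_j (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]

# A[i][j] = delay between the k-th event of type j and the (k+1)-th
# event of type i in a toy two-machine synchronized system.
A = [[3, 7],
     [2, NEG_INF]]
x = [0.0, 0.0]                     # dates of the initial events
for k in range(1, 9):
    x, prev = maxplus_matvec(A, x), x
    print(k, x, [a - b for a, b in zip(x, prev)])
# The increments settle into a periodic regime of average 4.5 per event:
# the maximum cycle mean max(3, (7 + 2) / 2) = 4.5, i.e. the cycle time
# of the system.
```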

The worst-case analysis of networks can also be carried out with the
(max,plus) machinery, called *network calculus* or *real-time
calculus* in this context.
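
As a canonical example (with illustrative parameters), consider a flow constrained by a token-bucket arrival curve $\alpha(t) = b + rt$ crossing a server with rate-latency service curve $\beta(t) = R\,\max(t-T, 0)$: if $r \le R$, network calculus bounds the worst-case delay by the horizontal deviation between the two curves, which here equals $T + b/R$.

```python
def delay_bound(b, r, R, T, horizon=50.0, n=5000):
    """Horizontal deviation sup_t inf{d >= 0 : alpha(t) <= beta(t+d)}
    between alpha(t) = b + r*t and beta(t) = R*max(t - T, 0),
    evaluated on a grid (beta is inverted in closed form)."""
    assert r <= R, "stability requires arrival rate r <= service rate R"
    worst = 0.0
    for i in range(n + 1):
        t = horizon * i / n
        # Smallest d such that R * (t + d - T) >= b + r * t.
        d = max(0.0, T + (b + r * t) / R - t)
        worst = max(worst, d)
    return worst

# Matches the closed form T + b/R (illustrative numbers).
b, r, R, T = 2.0, 1.0, 3.0, 0.5
print(delay_bound(b, r, R, T), T + b / R)
```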

#### Game Theory

Resources in large-scale distributed platforms (grid computing platforms, enterprise networks, peer-to-peer systems) are shared by a number of users having conflicting interests who are thus prone to act selfishly. A natural framework for studying such non-cooperative individual decision-making is game theory. In particular, game theory models the decentralized nature of decision-making.

It is well known that such non-cooperative behaviors can lead to
significant inefficiencies and unfairness. In other words, individual
optimizations often result in a global waste of resources. In game
theory, a situation in which every user selfishly optimizes his or her
own utility is known as a *Nash equilibrium*, or a *Wardrop
equilibrium* in the non-atomic setting of infinitely many
infinitesimal users. In such equilibria, no user has an interest in
unilaterally deviating from its strategy. Such policies are thus very
natural to seek in fully distributed systems, and they enjoy some
stability properties. However, a possible consequence is the *Braess
paradox*, in which adding resources to the system degrades the
performance of *every* user. This is why the study of the occurrence
and degree of such inefficiency is of crucial interest. Up to now,
little is known about general conditions for the optimality, or about
the degree of inefficiency, of these equilibria in a general setting.
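
The textbook instance of the paradox can be checked by hand. One unit of traffic goes from $s$ to $t$ over two paths of total latency $x + 1$ (where $x$ is the load of the path's variable edge); adding a free shortcut creates a third path that every user prefers, and the equilibrium cost rises from 1.5 to 2 for everyone.

```python
def path_costs(top, bottom, zigzag):
    """Path latencies in the classic Braess network with unit demand:
    edges s->v and w->t have latency equal to their load, edges v->t
    and s->w have latency 1, and the added shortcut v->w has latency 0."""
    sv = top + zigzag                 # load on s->v
    wt = bottom + zigzag              # load on w->t
    return (sv + 1,                   # path s->v->t
            1 + wt,                   # path s->w->t
            sv + wt)                  # path s->v->w->t via the shortcut

# Without the shortcut, the equilibrium splits traffic evenly: cost 1.5.
print(path_costs(0.5, 0.5, 0.0))      # (1.5, 1.5, 1.0)
# With it, all traffic deviates to the shortcut: no path is cheaper than
# the others, so this is an equilibrium, yet every user now pays 2.
print(path_costs(0.0, 0.0, 1.0))      # (2.0, 2.0, 2.0)
```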

Many techniques have been developed to enforce some form of
collaboration and to improve these equilibria. In this context, taking
joint decisions is generally prohibitively expensive, so a global
optimization cannot be achieved. A possible option relies on the
establishment of virtual prices, also called *shadow prices* in
congestion networks; these prices ensure a rational use of resources.
Equilibria can also be improved by advising policies to mobile users,
such that any user who does not follow the advice necessarily
penalizes herself (*correlated equilibria*).
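
Pigou's two-link example gives the standard illustration of how such prices work (a textbook toy, not one of the platforms above): with one link of constant latency 1 and one of latency $x$ (its load), all selfish traffic piles onto the variable link, while charging it the marginal-cost price $x \cdot c'(x) = x$ shifts the equilibrium to the social optimum.

```python
def wardrop_split(toll):
    """Fraction x of a unit demand on the variable link (latency x,
    plus toll(x)) in Pigou's two-link network; the other link costs 1.
    Users equalize perceived costs: x + toll(x) = 1 (by bisection)."""
    if 1.0 + toll(1.0) <= 1.0:          # variable link cheaper even when full
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid + toll(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def total_latency(x):
    """Social cost: tolls are transfers, not losses."""
    return (1 - x) * 1.0 + x * x

x0 = wardrop_split(lambda x: 0.0)       # no pricing
x1 = wardrop_split(lambda x: x)         # marginal-cost (shadow) price
print(x0, total_latency(x0))            # 1.0 1.0: everyone on the slow link
print(x1, total_latency(x1))            # 0.5 0.75: the social optimum
```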