The DistribCom team addresses distributed algorithms for network and service management, and the management of Web services. More precisely, the overall focus of DistribCom is on
*algorithms for distributed management.*

Today, research on network and service management as well as Web Services mainly focuses on issues of software architecture and infrastructure deployment. However, these areas also involve
algorithmic problems such as fault diagnosis and alarm correlation, testing, QoS evaluation, and negotiation for QoS. DistribCom develops the foundations supporting such algorithms, namely:
fundamentals of distributed observation and supervision of systems involving concurrency. Our algorithms are model-based. For obvious reasons of complexity, such models cannot be built by hand.
Therefore we also address the novel topic of
*self-modeling,* i.e., the automatic construction of models, both structural and behavioral.

Our research topics are currently structured as follows:

fundamentals of distributed observation and supervision of concurrent systems;

self-modeling;

algorithms for distributed management of telecommunications systems and services;

Web Services orchestrations, functional and QoS aspects;

Active XML peers for Web scale data and workflow management.

Our main industrial ties are with Alcatel and France-Telecom, on the topic of networks and service management.

Anne Bouillard was recruited at ENS Cachan this year, and joined our team only this fall. Therefore, although she is very welcome, her activities are not included here.

Management of telecommunications networks and services, and Web services, involves the following algorithmic tasks:

Alarm or message correlation is one of the five basic tasks in network and service management. It consists in causally relating the various alarms collected throughout the considered
infrastructure—be it a network or a service sitting on top of a transport infrastructure. Fault management requires in particular reconstructing the set of all state histories that can
explain a given log of observations. Testing amounts to understanding and analyzing the responses of a network or service to a given set of stimuli; stimuli are generally selected according
to given test purposes. All these are variants of the general problem of
*observing* a network or service. Networks and services are large distributed systems, and we aim at observing them in a distributed way as well, namely: logs are collected in a
distributed way and observation is performed by a distributed set of supervising peers.
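
As a minimal, centralized sketch of the fault-management task described above (reconstructing the state histories that explain a log), the following Python fragment enumerates the runs of a small automaton whose observable labels match a given log. The automaton, its state and label names, and the `max_silent` bound on consecutive unobserved moves are illustrative assumptions, not taken from the team's tools; the distributed algorithms discussed in this report are considerably more involved.

```python
def explanations(transitions, initial, log, max_silent=2):
    """Enumerate runs from `initial` whose observable labels spell `log`.

    transitions: list of (source, label, target, observable) tuples.
    max_silent: assumed bound on consecutive unobserved moves, so that
    the search terminates even if silent moves form a cycle.
    """
    found = []

    def explore(state, history, pos, silent):
        if pos == len(log):
            found.append(history)  # the whole log is explained: stop here
            return
        for src, label, dst, observable in transitions:
            if src != state:
                continue
            if observable:
                if label == log[pos]:
                    explore(dst, history + [(label, dst)], pos + 1, 0)
            elif silent < max_silent:
                explore(dst, history + [(label, dst)], pos, silent + 1)

    explore(initial, [], 0, 0)
    return found


# Hypothetical component: two hidden faults can raise the same alarm.
TRANSITIONS = [
    ("ok", "fault_a", "broken_a", False),
    ("ok", "fault_b", "broken_b", False),
    ("broken_a", "alarm", "alarmed", True),
    ("broken_b", "alarm", "alarmed", True),
    ("broken_b", "warning", "warned", True),
]

for run in explanations(TRANSITIONS, "ok", ["alarm"]):
    print(run)
```

In this toy example the single observed `alarm` has two explanations, one through each hidden fault, which is exactly why a set of histories, rather than a single one, has to be reconstructed.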

QoS issues are a well established topic for single domain networks or services, for various protocols — e.g., Diffserv for IP. Performance evaluation techniques are used that follow a ``closed world'' point of view: the modeling involves the overall traffic, and resource characteristics are assumed known. These approaches extend to some telecommunication services as well, e.g., when considering (G)MPLS over an IP network layer.

However, for higher level applications, including composite Web services (also called
*orchestrations*), this approach to QoS is no longer valid. For instance, an orchestration using other Web services has no knowledge of how many users are calling the same Web services.
In addition, it has no knowledge of the transport resources it is using. Therefore, the well developed ``closed world'' approach can no longer be used.
*Contract*-based approaches are considered instead, in which a given orchestration offers promises to its users on the basis of promises it has from its subcontracting services. In this
context, contract composition becomes a central issue. Monitoring is needed to check for possible breaches of the contract. Countermeasures would consist in reconfiguring the
orchestration by replacing the failed subcontracted services by alternative ones.
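
A minimal sketch of contract composition and monitoring, under strong simplifying assumptions that are ours and not the report's: each contract is a hard upper bound on response time in milliseconds, calls made in sequence add their bounds, and calls made in parallel take the maximum. The service names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Promise made by a service: respond within `max_latency_ms`."""
    max_latency_ms: float


def sequence(*contracts):
    # Calls made one after another: the bounds add up.
    return Contract(sum(c.max_latency_ms for c in contracts))


def parallel(*contracts):
    # Calls made concurrently and then joined: the slowest one dominates.
    return Contract(max(c.max_latency_ms for c in contracts))


def breached(contract, observed_latency_ms):
    # Monitoring: one observation above the bound breaches the contract.
    return observed_latency_ms > contract.max_latency_ms


# Hypothetical orchestration: query two suppliers in parallel, then pay.
supplier_a = Contract(300.0)
supplier_b = Contract(450.0)
payment = Contract(200.0)
offer = sequence(parallel(supplier_a, supplier_b), payment)

print(offer.max_latency_ms)         # 650.0
print(breached(supplier_b, 520.0))  # True
```

The orchestration can only promise what follows from its subcontractors' promises; softer or probabilistic promises would need a richer composition rule than the sum and maximum used here.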

The DistribCom team focuses on the algorithms supporting the above tasks. Therefore models providing an adequate framework are fundamental. We focus on models of discrete systems, not models of streams or fluid types of models. And we address the distributed and asynchronous nature of the underlying systems by using models involving only local, not global, states, and local, not global, time. These models are reviewed in section .

We use these mathematical models to support our algorithms and we use them also to study and develop formalisms of Web services orchestrations and workflow management in a more general setting.

For Finite State Machines (FSM), a large body of theory has been developed to address problems such as: observation (the inference of hidden state trajectories from incomplete observations), control, diagnosis, and learning. These are difficult problems, even for simple models such as FSMs. One of the research tracks of DistribCom consists in extending such theories to distributed systems involving concurrency, i.e., systems in which both time and states are local, not global. For such systems, even very basic concepts such as ``trajectories'' or ``executions'' need to be deeply revisited. Computer scientists have long recognized this topic of concurrent and distributed systems as a central one. In this section, we briefly introduce the reader to the models of scenarios, event structures, nets, languages of scenarios, graph grammars, and their variants.

The simplest concept related to concurrency is that of a finite execution of a distributed machine. The *scenario* shown in the figure is an example. The figure shows the lifetime (from top to bottom) of four processes (or instances). The instances can exchange asynchronous messages. In this example, some local variables can be tested and assigned. In this model, events are totally ordered for each instance, but only partially ordered between different instances. Thus, time is local, not global. The natural concept of state is local too (i.e., attached to individual instances). Global states can be defined; however, they require nontrivial algorithms for their distributed construction. Finite scenarios introduce the two key concepts of *causality* and *concurrency*. The causality relation is a partial order. In the figure, the reception of AU_AIS is causally related to the sending of MS_AIS by the rs_TTP, while it is concurrent with the receipt of MS_AIS by the alarm manager.
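
To make causality and concurrency in a scenario concrete, here is a minimal Python sketch. It assumes a scenario is given as one top-to-bottom event list per instance plus a list of (send, receive) message pairs; the three instances `A`, `B`, `C` and their events are hypothetical and do not reproduce the scenario of the figure.

```python
def causality(instances, messages):
    """Strict causal order of a scenario, as a set of (earlier, later) pairs.

    instances: instance name -> list of its events, from top to bottom.
    messages: list of (send_event, receive_event) pairs.
    """
    succ = {e: set() for events in instances.values() for e in events}
    for events in instances.values():
        for earlier, later in zip(events, events[1:]):
            succ[earlier].add(later)      # total order on each instance
    for send, receive in messages:
        succ[send].add(receive)           # a message is sent before it is received
    order = set()
    for start in succ:                    # transitive closure by graph search
        seen, stack = set(), list(succ[start])
        while stack:
            e = stack.pop()
            if e not in seen:
                seen.add(e)
                stack.extend(succ[e])
        order.update((start, e) for e in seen)
    return order


def concurrent(order, a, b):
    """Two distinct events are concurrent if neither causally precedes the other."""
    return a != b and (a, b) not in order and (b, a) not in order


# Hypothetical scenario: "X!m" is the sending of m by X, "X?m" its reception.
INSTANCES = {
    "A": ["A!m1", "A!m2"],
    "B": ["B?m1", "B!m3"],
    "C": ["C?m2", "C?m3"],
}
MESSAGES = [("A!m1", "B?m1"), ("A!m2", "C?m2"), ("B!m3", "C?m3")]

ORDER = causality(INSTANCES, MESSAGES)
print(("A!m1", "C?m3") in ORDER)          # True
print(concurrent(ORDER, "B?m1", "A!m2"))  # True
```

Here `A!m1` causally precedes `C?m3` through instance `B`, whereas `B?m1` and `A!m2` are unordered: no instance and no message relates them, so no global clock is needed to describe the execution.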

Scenarios have been informally used by telecom engineers for a long time. Their formalization was introduced by the work done in the framework of ITU and OMG on High-level Message Sequence Charts and on UML Sequence Diagrams in the last ten years. This made it possible, in particular, to formally define infinite scenarios and to enhance them with variables, guards, etc. Today, scenarios are routinely offered by UML and related system and software modeling tools; the figure discussed above shows such an example.

The next step is to model sets of finite executions of a distributed machine. *Event structures* were invented by Glynn Winskel and co-authors in 1980. This data structure collects all the executions by superimposing shared prefixes. The accompanying figure shows an example.

The topmost diagram shows an HMSC, i.e., an automaton whose transitions are labeled by basic scenarios. Consider first the scenarios as abstract labels. The set of all executions of this automaton is then shown in the bottom-left diagram, in the form of an execution tree. For sequential machines, execution trees collect all the executions by superimposing shared prefixes.

Now, the right diagram shows the ``white box'' version of the former, in which the concatenation of the successive basic scenarios has been performed by chaining them instance by instance. The result is an *event structure,* i.e., a branching structure consisting of events related by a *causality* relation (depicted by directed arrows) and a *conflict* relation (depicted by an undirected arc labeled by a # symbol). Events that are neither causally related nor in conflict are called *concurrent.* Concurrent processes model the ``parallel progress'' of components.
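
The three relations of an event structure can be sketched in a few lines of Python. The representation below (immediate causes per event, plus a set of direct conflicts that are inherited along causality) and the event names `e1` to `e5` are assumptions made for illustration; they do not reproduce the structure of the figure.

```python
class EventStructure:
    def __init__(self, causes, conflicts):
        """causes: event -> set of its immediate causal predecessors.
        conflicts: iterable of pairs of events in direct conflict."""
        self.causes = {e: set(p) for e, p in causes.items()}
        self.conflicts = {frozenset(pair) for pair in conflicts}

    def past(self, event):
        """All strict causal predecessors of `event`."""
        seen, stack = set(), list(self.causes[event])
        while stack:
            e = stack.pop()
            if e not in seen:
                seen.add(e)
                stack.extend(self.causes[e])
        return seen

    def causally_related(self, a, b):
        return a in self.past(b) or b in self.past(a)

    def in_conflict(self, a, b):
        # Conflict is inherited: a conflict between ancestors separates a and b.
        left, right = self.past(a) | {a}, self.past(b) | {b}
        return any(frozenset((x, y)) in self.conflicts
                   for x in left for y in right)

    def concurrent(self, a, b):
        return (a != b and not self.causally_related(a, b)
                and not self.in_conflict(a, b))

    def is_configuration(self, events):
        """A configuration is causally closed and conflict-free."""
        events = set(events)
        closed = all(self.past(e) <= events for e in events)
        conflict_free = not any(self.in_conflict(a, b)
                                for a in events for b in events if a != b)
        return closed and conflict_free


# Hypothetical structure: e2 and e3 are alternative continuations of e1,
# e5 follows e2, and e4 belongs to an independent component.
ES = EventStructure(
    causes={"e1": set(), "e2": {"e1"}, "e3": {"e1"}, "e4": set(), "e5": {"e2"}},
    conflicts=[("e2", "e3")],
)
print(ES.in_conflict("e5", "e3"))              # True (inherited from e2 # e3)
print(ES.concurrent("e4", "e5"))               # True
print(ES.is_configuration({"e1", "e2", "e4"})) # True
```

A configuration, as checked by `is_configuration`, corresponds to one execution read as a partial order; the whole structure superimposes all such executions on their shared prefixes.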

Categories of event structures have been defined, with associated morphisms, products, and co-products. Products and co-products formalize the concepts of parallel composition and ``union'' of event structures, respectively. This provides the needed apparatus for composing and projecting (or abstracting) systems.

Event structures have been mostly used to give the semantics of various formalisms or languages, such as Petri nets, CCS, CSP, etc. In DistribCom we make a nonstandard use of them: we use them as a structure to compute and express the solutions of observation or diagnosis problems for concurrent systems.

The next step is to have finite representations of systems having possibly infinite executions. In DistribCom, we use two such formalisms: *Petri nets* and *languages of scenarios* such as High-level Message Sequence Charts (HMSC). Petri nets are well known, at least in their basic form, so we do not introduce them here. We use so-called *safe* Petri nets, in which markings are boolean (each place holds either 0 or 1 token); we also use variants, see below. Languages of scenarios are simply obtained as illustrated in the HMSC figure discussed above: 1/ equip basic scenarios with a concatenation operation, and 2/ consider an automaton whose transitions are labeled with basic scenarios. Executions of Petri nets and HMSCs can be represented with concurrency in the form of event structures. We have shown this for HMSCs in the same figure, and it is obtained in a similar way for Petri nets.
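
Since safe Petri nets are not introduced in the text, here is a minimal Python sketch of their firing rule. A boolean marking is represented as the set of marked places; the class name, the net, and its place and transition names are hypothetical.

```python
class SafePetriNet:
    def __init__(self, transitions):
        """transitions: name -> (preset, postset), both sets of places."""
        self.transitions = {
            name: (frozenset(pre), frozenset(post))
            for name, (pre, post) in transitions.items()
        }

    def enabled(self, marking, name):
        pre, _ = self.transitions[name]
        return pre <= marking             # every input place holds a token

    def fire(self, marking, name):
        if not self.enabled(marking, name):
            raise ValueError(f"{name} is not enabled")
        pre, post = self.transitions[name]
        remaining = marking - pre         # consume the input tokens
        if remaining & post:
            raise ValueError("a place would hold two tokens: the net is not safe")
        return remaining | post           # produce the output tokens


# Hypothetical net: t1 and t2 are independent, t3 synchronizes both branches.
NET = SafePetriNet({
    "t1": ({"p1"}, {"p3"}),
    "t2": ({"p2"}, {"p4"}),
    "t3": ({"p3", "p4"}, {"p1", "p2"}),
})
M0 = frozenset({"p1", "p2"})

via_t1_t2 = NET.fire(NET.fire(M0, "t1"), "t2")
via_t2_t1 = NET.fire(NET.fire(M0, "t2"), "t1")
print(via_t1_t2 == via_t2_t1)             # True
print(NET.fire(via_t1_t2, "t3") == M0)    # True
```

The two firing orders of `t1` and `t2` reach the same marking: in a partial order semantics they are a single execution in which the two transitions are concurrent, which is what an event structure records.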

Two extensions of the basic concepts of nets or scenario languages are useful for us:

Nets or scenario languages enriched with variables, actions, and guards. This is useful to model general concurrent and distributed dynamical systems in which a certain discrete abstraction of the control is represented by means of a net or a scenario language. Manipulating such *symbolic nets* requires using abstraction techniques. Time Petri nets and networks of timed automata are particular cases of symbolic nets.

Probabilistic nets or event structures. Whereas a huge literature exists on stochastic Petri nets or stochastic process algebras (in computer science), randomizing *concurrent models,* i.e., models whose random outcomes are concurrent trajectories, not sequential ones, has been addressed only since the beginning of the 21st century. We have contributed to this new area of research.

The last and perhaps most important issue, for our applications, is the handling of dynamic changes in the system model. This is motivated by the constant use of dynamic reconfigurations in management systems. Extensions of net models have been proposed to capture this, for example the *dynamic nets* of Vladimiro Sassone; for the moment, such models lack a suitable theory of unfoldings. A relevant alternative is the class of *graph grammars*. Graph grammars transform graphs by means of a finite set of rules.

Graph grammars have been equipped with a rich theory of unfoldings and, more generally, have received much attention from a theoretical viewpoint. While there are numerous modeling applications in biology, chemistry, computer science, etc., we at DistribCom test their use for distributed network management algorithms for systems subject to reconfiguration.

Telecommunications have grown from a basic technology of networks and transport into a much more complex jungle of networks, services, and applications. This motivates a strong research effort towards ``autonomic communications'': one would like to program networks at the service level (sometimes called the business level, since it directly involves contracts with customers), and let the (possibly cross-domain) infrastructure adapt itself in order to ensure a given QoS, isolate and repair failures, etc. This tendency appears for example in policy-based management, or in the quest for ``self-XX'' functionalities (self-configuration, self-monitoring, self-healing, etc.). One of our objectives is to address these subjects.

These problems have several common features. First of all, they involve concurrent systems, *i.e.* systems where several things can happen independently at the same time. Secondly, they are built in a modular way, by combining elementary components into large connected structures. Thirdly, these systems are dynamic: reconfigurations and connections/disconnections of components or clients are part of the normal activity, and should not require that monitoring algorithms be reset or modified each time the system is changed. And finally, the size and heterogeneity of these systems prevent the use of a centralized monitoring architecture. This motivates developing distributed and modular approaches.

As an example of distributed monitoring, our first application was related to diagnosis issues in transport networks, i.e. the low-level layers of networks (physical, transport and network layers). We have focused on circuit oriented networks, such as SDH/WDM protocols or GMPLS protocols. These systems assemble hundreds of functions of components, and the failure of one of them generally induces side-effects in many others. This phenomenon is known as ``fault propagation;'' it results in hundreds of alarms produced by the various components and collected at different locations in the network. Identifying origins of faults from these alarms has now reached a level of complexity that prevents the traditional human analysis of alarms. Due to the size of systems, the automatic diagnosis task cannot be done in a centralized manner, and must be solved by a network of local supervisors that coordinate their work to provide a coherent view of what happened in the network.

Currently we are addressing issues concerning dynamic services in heterogeneous networks and services. The emphasis is on guaranteeing desired levels of QoS in situations where SLAs have to be negotiated, instantiated, and monitored in a non-local fashion: the immediate peer-to-peer contact allowing for negotiation, monitoring, and penalties is between a client and a provider, or between two neighboring providers only.

In higher layers, at the level of services, the same search for flexibility motivates the development of tools to rapidly assemble Web services into larger services, called *orchestrations* or *choreographies*. Recently, standard languages for service workflow have even been proposed, such as IBM's Web Services Flow Language or Microsoft's XLang, which converged to the BPEL4WS proposal and subsequently the WSCDL proposal for choreographies. Tools for BPEL are, among others, commercially available from Telelogic.

The implementation of orchestration and choreography description languages raises a number of difficulties related to efficiency, clean semantics, and reproducibility of executions, which are impairing their industrial acceptance. In addition, composite QoS for orchestrations is not yet a mature area. We develop studies in these areas.

A serious shortcoming of approaches to Web Service orchestration and choreography is that they mostly abstract data away. Symmetrically, modern approaches to Web data management, typically based on XML and XQuery, rely on overly simplistic forms of control. We believe that the time has come for a convergence of sophistication in terms of control and richness in data, for workflow and data management over the Web. We believe that *active Peer-to-Peer XML-based documents*, as proposed by S. Abiteboul under the name of AXML

The SOFAT toolbox is a scenario manipulation toolbox. Its aim is to implement all known formal manipulations on scenarios. The toolbox implements several formal models such as partial orders, graph grammars, and graphs, together with algorithms dedicated to these models (Tarjan, cycle detection for graphs, Caucal's normalization for graph grammars, etc.). The SOFAT toolbox is continually updated to integrate new algorithms. It is currently used for a research contract with France Telecom, and is freely available from INRIA's website. The last update of SOFAT includes the fibered product operation. This year SOFAT was used in the CO2 project. A connection to a performance evaluation toolchain in the RNRT project PERSIFORM is also under study.

The emerging topic of *self-modeling* addresses the automatic construction of sophisticated behavioral models. This problem is a real challenge for large systems, and an unavoidable and delicate task that directly affects the performance of model-based monitoring tools. We address self-modeling issues in two different ways: by assembling generic model blocks, and by learning methods.

The first approach has been developed in previous RNRT projects, dedicated to alarm correlation techniques in telecommunication networks, and is now part of the team background. The principle is as follows: small generic network components are designed by hand, using information from failure propagation scenarios (expert knowledge) and information from technology standards. These network components include connectivity capabilities, as detailed in information model standards, and generic behaviors. As an interesting feature, standards define network components in a hierarchical manner, progressively refining their definition from general functions down to a specific technology and finally to equipment implementation. This hierarchy is reflected in our models, so only a limited part of the component models has to be adjusted by hand. In a second step, the supervised network is scanned to discover which components are present, and how they are connected. This ``network discovery'' phase builds the model of the network by connecting the corresponding component models. It results in a possibly large model of the system that is then used as the basis for alarm correlation algorithms. In 2005-2006, this approach was tested in a real situation under a contract with Alcatel Research and Innovation, in cooperation with the Optical Network business Division. Tools have been developed, both to easily model components in UML (with the above-mentioned inheritance features), and to perform failure diagnosis in submarine line terminal equipment.

Over the last two years, effort has been shifted toward the second approach, which automatically infers or refines (part of) the model with tests and learning algorithms.

Efficient learning algorithms have long been known for the sequential case with centralized observations. Interestingly enough, learning a distributed system is not as easy as learning a sequential one, and in many cases, Vasilevskii-Chow's algorithm is no longer efficient. We have provided a new algorithm for learning a distributed system, together with a proof that it is optimal and experimental results comparing it with generic algorithms.

Last year, we proposed an online compression algorithm for distributed executions that could be used to infer a scenario model from observations of a distributed system. We have also investigated extensions of scenario models to gain expressive power and allow users to model typical behaviors of distributed systems containing sliding-window executions. In addition, we studied how to obtain HMSCs by abstraction of communicating automata. This year, we have investigated several approaches to provide users with languages and tools to build expressive models more easily.

Another research direction consists in helping a user collect and gather observations of a system to build a coherent model. In 2004, we proposed an operator to compose redundant scenarios, called amalgamated sum. The main drawback of this sum is that the common parts of two behaviors must be given explicitly before composing two scenarios. This year, we have provided an algorithm to detect redundancies in scenarios automatically. As the number of possible solutions that have to be studied is exponential in the size of the smaller scenario, we have implemented a heuristic method, and proved its convergence towards the best solution (i.e., the greatest common subsets in the two scenarios).

In this section we collect our fundamental results regarding the models we use for distributed systems.

The monitoring algorithms developed in DistribCom rely heavily on an efficient representation of trajectory sets for large concurrent systems. Unfoldings, event structures, or the more recently proposed time-unfoldings are natural candidates for that. A key feature for deriving efficient distributed algorithms is the factorization property of these objects: when a system can be expressed as a combination of components, its (time-)unfolding can likewise be expressed as a combination of (time-)unfoldings of its components. The ``combination'' can be a standard product of components or, more interestingly, a pullback. The latter expresses that components interact via an interface. We have shown that all pullbacks exist in the category of safe Petri nets, and so, by category theory arguments, these operations are preserved by all the ``unfolding'' operations mentioned above.

Once one knows that the unfolding of a compound system itself admits a factorized form, one problem is to compute its minimal factors, *i.e.* the trajectories of each component that contribute to at least one trajectory of the global system. In other words, one would like to determine the projections of the global unfolding on each component without, of course, computing the (huge) global unfolding. Several solutions have been proposed, which take the form of local computations combining products (or pullbacks) and projections. For example, such computations have been based on event structures and on augmented branching processes. Paolo Baldan (Venice), Stefan Haar and B. Koenig (Duisburg) address the problem in yet another manner: they consider a restricted set of compositions by pullback, and propose to base computations on *interleaving structures* or **ILS**. The advantage is that projections are (much) easier in the category of **ILS**; in particular, **ILS** possess a useful (embedding, projection)-factorization property.

Distributed computations based on event structures, on branching processes (prefixes of unfoldings) or on trellis processes (prefixes of time-unfoldings) are quite technical. This is essentially due to the specific features of the true concurrency semantics. We have done the exercise of re-expressing the theory in a simpler framework, where runs are ordinary sequences of events rather than partial orders. All results can be rederived nicely: products and pullbacks exist, and unfoldings and time-unfoldings can also be defined and enjoy the same factorization properties. Projections are also much easier to define. Altogether, one can design distributed monitoring algorithms for networks of automata in very much the same way as for combinations of Petri nets. Several surprises came up, however. In particular, trellis processes enjoy the necessary properties only if they are defined with respect to a local notion of time (time elapses differently in each component), instead of a global notion of time. This strongly suggests that one should avoid using a global clock if distributed computations are desired. And once local clocks are necessary, one is not far from true concurrency semantics.
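
In this sequential setting, products and projections are easy to sketch. The Python fragment below builds the runs of a product of automata that synchronize on shared labels, then projects them back on one component, which yields the local runs that contribute to at least one global run. It is a brute-force illustration that does compute the global runs, precisely what the distributed algorithms are designed to avoid; the class, the bound `max_len`, and the two components are hypothetical.

```python
class Automaton:
    def __init__(self, initial, alphabet, delta):
        """delta: (state, label) -> next state (deterministic for simplicity)."""
        self.initial = initial
        self.alphabet = frozenset(alphabet)
        self.delta = dict(delta)


def step(automata, states, label):
    """Joint move: components that know `label` must take it, others stay put."""
    nxt = []
    for a, s in zip(automata, states):
        if label in a.alphabet:
            if (s, label) not in a.delta:
                return None               # one component blocks the move
            nxt.append(a.delta[(s, label)])
        else:
            nxt.append(s)
    return tuple(nxt)


def runs(automata, max_len):
    """All label sequences of length <= max_len executable by the product."""
    labels = sorted(set().union(*(a.alphabet for a in automata)))
    frontier = [((), tuple(a.initial for a in automata))]
    result = {()}
    for _ in range(max_len):
        new_frontier = []
        for word, states in frontier:
            for label in labels:
                nxt = step(automata, states, label)
                if nxt is not None:
                    new_frontier.append((word + (label,), nxt))
                    result.add(word + (label,))
        frontier = new_frontier
    return result


def project(word, automaton):
    """Keep only the labels that belong to the component's alphabet."""
    return tuple(l for l in word if l in automaton.alphabet)


# Hypothetical components sharing the labels s and t; B never offers t.
A = Automaton(0, {"a", "s", "t"}, {(0, "a"): 1, (1, "s"): 2, (0, "t"): 3})
B = Automaton(0, {"b", "s", "t"}, {(0, "b"): 1, (1, "s"): 2})

local_runs = runs([A], 3)
useful_runs = {project(w, A) for w in runs([A, B], 3)}
print(sorted(local_runs - useful_runs))   # [('t',)]
```

The local run `('t',)` of `A` is discarded because `B` can never synchronize on `t`; the remaining runs of `A` play the role of its minimal factor in this toy example.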

As a last topic related to distributed computations based on various representations of trajectory sets, we have explored the possibility of using symbolic unfoldings (see next section). The latter encode in an even more compact manner the unfolding of systems where transitions are ``programs'' changing the value of a limited set of variables. The idea is to avoid representing all possible changes on these variables (as in the classical notion of unfolding), and rather to simply indicate that these variables have been modified by the transition. We have proposed a category theory description of these objects, and proved that the symbolic unfolding functor preserves products. The last step towards distributed computations will be the derivation of a projection operation. A paper is in preparation.

Our work on *true concurrency probabilistic models* is joint work with our former PhD student Samy Abbes, this year post-doc at LIAFA, Paris. The work of this year has consisted in finalizing papers. We review our progress and refer the reader to the 2004 activity report for the motivations of this study.

In 2000, we launched a research programme on probabilistic models of concurrent systems. This is different from stochastic Petri nets in all existing variants, since the latter are ultimately interpreted as Markov chains, a model in which both state and time are global. This is also different from probabilistic automata and process algebras, which are in fact tightly related to Markov Decision Processes. In our case, trajectories are partial orders of events with local states, and the space consists of the set of maximal configurations of the unfolding of the considered net. This problem was also recently and independently considered by Hagen Völzer, Daniele Varacca, and Glynn Winskel.

S. Abbes has constructed *probabilistic event structures*. His work encompasses event structures with confusion (corresponding to Petri nets that may not be free choice). The models he has developed exhibit the nice property that concurrent processes are probabilistically independent, conditionally on their common past. In other words, ``concurrency matches probabilistic independence''. When specialized to event structures arising from safe Petri nets, this model specializes to that of *Markov nets.* Markov nets satisfy a strong Markov property and are such that concurrency matches probabilistic independence. Global and local renewal theory for Markov nets have been developed by S. Abbes. This year has been devoted to the finalization of an important paper on Markov nets and their law of large numbers.

An important aspect of our work is to compare different modeling formalisms to know what can and cannot be described with a particular formalism. This year, we have obtained an important result regarding communicating systems and their relationship with scenarios. We prove that the CHMSC scenario language, communicating finite state machines, regular languages of interleavings, and monadic second order logic on MSCs have the same expressivity under realistic assumptions, de facto extending the theorems of Kleene, Büchi, and Zielonka. The restriction we consider is existential boundedness, meaning that there exists a channel bound such that every scenario can be executed using that bound, even if *some* interleaving of a scenario may exceed the bound. If a scenario is not existentially bounded, then the system can make some choices leading to a deadlock, whatever scheduler and channel bounds are used. Not only did we prove that the expressive power is the same, but we also gave a constructive algorithm to transform one formalism into another, which makes it possible to verify such systems. A perspective is to use such results in control, and more particularly in quasi-static scheduling.

The research on scenarios has followed two other directions. The first direction is the extension of the expressivity of scenario languages. The second direction is the study of composition mechanisms for scenarios.

A major challenge with scenarios is to be able to model distributed systems while preserving the decidability of some problems. In their simplest version, scenarios can be seen as automata labeled by partial orders. These order automata generate a family of partial orders that aim at describing non-interleaved execution traces of distributed systems. Several problems, such as emptiness of the intersection of two families, equality, and so on, are known to be undecidable. However, some interesting problems, such as diagnosis, remain decidable.

Scenarios embed the expressive power of Mazurkiewicz traces (which is the source of most of the undecidability results associated with scenarios), but are not able to model typical behaviors of so-called sliding windows. When a protocol implements a sliding window, distributed executions may have the shape of infinite braids. Partial order automata only allow for the description of recognizable MSC languages, i.e., families of partial orders that can be built by concatenation of a finite number of orders from a finite order alphabet.

To extend the expressivity of order automata, several solutions have been proposed. E. Gunter proposed an extension called compositional MSC, which embeds into order automata the expressive power of communicating automata. Unsurprisingly, several problems that were decidable for order automata (diagnosis among them) become undecidable for compositional MSCs. This year, we have focused on an extension of order automata called causal MSCs that allows some commutation among composed orders. This extension seems to have nice properties: causal MSCs embed the expressive power of order automata, we can identify a regular subset of the language, and furthermore, diagnosis remains decidable on this model.

The second research direction concerning scenarios is composition. Previous work on composition relied on a fibered product of order automata. The main disadvantage of this approach is that it forces an arbitrary synchronization among orders. This year, we have proposed a new approach to compose several scenarios, using a mixed product. The mixed product allows an interleaving of two partial order automata. Some control on the interleaving is performed using control events that are necessarily located on a common process in the system. This composition mechanism opens new perspectives for a modular design of service specifications.

Over the last three years, we have developed, in the context of Thomas Chatain's thesis, a supervision method based on the unfolding of high-level models of concurrency. By high-level, we mean models with variables, which demand a symbolic approach to build partial order trajectories from distributed observations.

In 2006, we focussed our research on time aspects. Time Petri nets have proved their value in modeling real-time concurrent systems. Their usual semantics is defined in terms of firing sequences, which can be coded in a (symbolic and global) state graph, computable from a bounded net. An alternative is to consider a partial order semantics given in terms of processes, which keep explicit the notions of causality and concurrency without computing arbitrary interleavings. For ordinary bounded place/transition nets, it has long been known that the whole set of processes can be finitely represented by a prefix of what is called the unfolding. We have defined, for the first time, such a prefix for safe time Petri nets. It is based on a symbolic unfolding of the net, using a notion of partial state.

It is also known that there are many similarities between time Petri nets and networks of timed automata. Timed automata are also a well studied class of time models. Surprisingly, the concurrency aspects of this model are usually ignored, and the first reflex when dealing with networks of timed automata is to compute a single equivalent timed automaton (containing all the interleavings, but having destroyed the concurrency information). In collaboration with F. Cassez from IRCCyN, Nantes, we have given a symbolic concurrent semantics for networks of timed automata (NTA) in terms of extended symbolic nets. Symbolic nets are standard occurrence nets extended with read arcs and symbolic constraints on places and transitions. We prove that, for any NTA, there is a complete finite prefix that contains at least the information of the simulation graph of the NTA but keeps explicit the notions of concurrency and causality of the network.

The last work considering time and concurrency was done in collaboration with the France Telecom research center in Lannion during the development of a new testing language called Late, in the context of Emmanuel Donin's thesis. A case study, the test of a voice XML service, has been presented. To develop this application, we proposed a new kind of non-deterministic testing: one testing scenario can describe several different executions, and the interpreter tries to find the execution that best fits the real behavior of the System Under Test.

Using scenarios for diagnosis seems natural, since an observation of a behavior can be described by means of a set of events together with the dependencies and independencies between these events. That is, an observation can easily be modeled as a scenario. Moreover, scenario languages enjoy a visual and appealing formalism. On the other hand, dealing with scenarios in algorithms is often non-trivial: since a scenario can be decomposed in several ways, finding a particular occurrence of a scenario in a set of behaviors is not easy.

Our first result of the year on this topic is an algorithm for finding every behavior of an HMSC that explains a particular observation scenario. The task is not easy, since the behaviors can be arbitrarily long when many events cannot be observed. We developed a symbolic algorithm to represent an infinite number of dependencies in a finite way. This gives us an algorithm that we proved to be optimal, without requiring any restriction. A prototype has been implemented to generate this set of explanations given an observation. We are currently researching a way to diagnose a scenario on-line and in a distributed way in order to obtain a monitoring tool.

In order to help distribute the diagnosis algorithm, we independently study the best way to pass information between peers so as to obtain a global knowledge of the behavior of the system. A fundamental procedure for sharing regular and commutation-closed knowledge in Mazurkiewicz traces is known as Zielonka's algorithm. However, the construction given in the original paper synthesizes information of exponential size to pass with every communication, which is infeasible. We improved Zielonka's algorithm with a symbolic data structure called tiles, which allows sharing information of polynomial size only. Furthermore, this information can be computed in quadratic time, whereas it was intractable in the original construction. In order to achieve our goal of a monitoring tool, we need to deal with scenarios in Zielonka's algorithm instead of Mazurkiewicz traces, and to take care of non-observable events.
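To make the notion of commutation-closed knowledge more concrete, the following minimal Python sketch decides whether two words represent the same Mazurkiewicz trace by comparing their Foata normal forms. It is only an illustration of trace equivalence, not Zielonka's algorithm nor the tile structure; the alphabet, the independence relation and the function names are assumptions chosen for the example.

```python
def foata_normal_form(word, independent):
    """Return the Foata normal form of `word` as a tuple of steps.

    `independent(a, b)` must return True when letters a and b commute.
    A letter is assumed to be dependent on itself.
    """
    steps = []  # each step is a list of pairwise independent letters
    for letter in word:
        # Place the letter right after the last step that contains
        # a letter it depends on (or in the first step if none does).
        level = 0
        for i in range(len(steps) - 1, -1, -1):
            if any(not independent(letter, other) for other in steps[i]):
                level = i + 1
                break
        if level == len(steps):
            steps.append([])
        steps[level].append(letter)
    # Sorting inside a step gives a canonical representative.
    return tuple(tuple(sorted(step)) for step in steps)


def same_trace(u, v, independent):
    """Two words are trace-equivalent iff their normal forms coincide."""
    return foata_normal_form(u, independent) == foata_normal_form(v, independent)


# Hypothetical example: a and b occur on different peers and commute,
# whereas c is a communication that depends on both of them.
INDEPENDENCE = {("a", "b"), ("b", "a")}


def independent(x, y):
    return (x, y) in INDEPENDENCE


print(same_trace("abc", "bac", independent))
print(same_trace("abc", "acb", independent))
```

The normal form groups events into successive steps of mutually independent events, so that all interleavings of the same partially ordered behavior collapse to a single representative.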

Our last work on the topic of algorithms for scenarios was to consider problems more complicated than diagnosis. Indeed, diagnosis tries to give an explanation for a single scenario, whereas one is usually interested in dealing directly with a set of scenarios. However, problems on sets of scenarios such as HMSCs are undecidable in general, due to the confusion between different decompositions of the same scenario. The intersection problem was considered for restricted classes of HMSCs: when the HMSCs are local-choice, the problem becomes solvable in quadratic time. However, this restriction is quite strict, forbidding excessive parallelism, and many high-level abstract models may not fulfill it. The globally-cooperative restriction was then defined, under which the intersection problem is decidable, although the complexity is no longer polynomial. Interestingly enough, the main ingredient used in this construction is an encoding of scenarios into traces using atoms of MSCs. Bridging this gap between scenarios and traces should make it easier to use Zielonka's algorithm to obtain a monitoring tool.

Consider the life-cycle of a cross-domain video-conference on demand. Suppose end-user A requests, with his host domain, a video-conference connection with user B, such that A and B do not have direct access to a common domain; the instances of the domains concerned (called service providers, or SPs for short) thus have to set up a chain of inter-domain connections until the SP giving access to B is reached. Clearly, local domain services will not suffice for negotiating and managing this chain. With the Madynes team at LORIA and Alcatel, we are working on algorithms for cross-domain QoS contract negotiation and monitoring; Hélia Pouyllau has implemented a prototype negotiation module using Web services for the peer-to-peer negotiation of a QoS budget along a fixed chain of service providers, by nested contractualization in which only neighboring SPs interact, and using dynamic programming (DP) techniques. The algorithm is described in our publications and in SWAN deliverables. The module has been integrated into the management platform of the SWAN project. Current work focuses on extending the optimized negotiation for single requests into global multi-request optimal negotiation protocols, and on the use of optimization principles other than DP.
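The principle of splitting a QoS budget along a chain by dynamic programming can be sketched as follows. This is a simplified illustration under our own assumptions, not the negotiation module of the SWAN platform: each provider is assumed to publish a finite menu of (delay, price) offers, the QoS budget is an end-to-end delay bound, and the function name `negotiate` is hypothetical. The recursion mirrors nested contractualization: provider i only asks its downstream neighbor i+1 for the cheapest contract within the remaining budget.

```python
from functools import lru_cache


def negotiate(offers, budget):
    """Split an end-to-end delay budget along a chain of providers.

    offers[i] is the list of (delay, price) options of provider i.
    Returns (total_price, chosen_delays), or None if no combination
    of offers fits within the budget.
    """
    n = len(offers)

    @lru_cache(maxsize=None)
    def best(i, remaining):
        # Cheapest contract for providers i..n-1 within `remaining` delay.
        if i == n:
            return (0, ())
        result = None
        for delay, price in offers[i]:
            if delay > remaining:
                continue  # this offer alone exceeds the remaining budget
            downstream = best(i + 1, remaining - delay)
            if downstream is None:
                continue  # the neighbor cannot honor what is left
            total = price + downstream[0]
            if result is None or total < result[0]:
                result = (total, (delay,) + downstream[1])
        return result

    return best(0, budget)


# Hypothetical chain of three providers; delays in ms, prices in arbitrary units.
chain = [
    [(10, 5), (20, 2)],
    [(10, 6), (30, 1)],
    [(5, 4), (15, 1)],
]
print(negotiate(chain, 50))
print(negotiate(chain, 20))
```

Memoizing on the pair (provider index, remaining budget) is what keeps the computation polynomial in the number of providers and budget levels, instead of enumerating all combinations of offers.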

Regarding Web services *orchestrations* and *choreographies*, several standardization efforts are underway. The most mature effort is around the Business Process Execution Language (BPEL). The WS-Choreography Definition Language (WS-CDL) complements BPEL by paying attention to so-called choreographies, i.e., peers of interacting business processes. As these formalisms result from standardization discussions, they are quite complex, offer a number of detailed features, and address technical difficulties such as the so-called problem of ``correlations'' with lengthy and informal explanations, which makes modeling them a cumbersome task; see, however, existing work on modeling BPEL by means of Petri net systems of workflow type. This is why we decided to base this study on a simpler and much cleaner formalism for WS orchestrations, namely the Orc formalism proposed by Jayadev Misra and co-workers.

Most important is the study of Quality of Service (QoS) composition when composing Web services under orchestrations or choreographies. Here, the challenge is: 1/ To establish a relation between the QoS of queried Web services and that of the orchestration; 2/ To negotiate and tune the QoS parameters of the orchestration, in an efficient way; and 3/ To detect or predict the breaching of a QoS contract, leading to 4/ a reconfiguration of the orchestration.

All these tasks require adequate models supporting QoS aspects. Regarding the functional aspect, Sidney Rosario proposed last year a translation of Orc into the formalism of Petri net systems, *i.e.,* systems of equations involving Petri nets. This behavioural model could then be enhanced with QoS parameters, but the resulting model was quite heavy.

This year, Claude Jard has developed a small tool written in Prolog to implement the sequential semantics of Orc as defined by the authors of the language. In addition, this tool can deliver a partial order form of the corresponding sequential executions, thus dramatically reducing the size of the executions to be stored. Also, jointly with the inventors of Orc, namely William Cook and Jayadev Misra, from the University of Texas at Austin, we started developing direct translations of Orc into partial order models (more precisely, into so-called Asymmetric Event Structures). Based on preliminary versions of this translation, a tool chain has been developed by Sidney Rosario and our two Indian interns, with the following features:

A module performing the simulation of Orc programs in the form of partial orders;

A module to generate samples of QoS parameters for the Web services called by the orchestration; these can be either given by measurements on actual Web services, or by Monte-Carlo generation from a Gamma or other distribution random simulator, or by bootstrapping a set of available measurements;

A module to enhance the partial order executions with corresponding QoS parameters, based on MaxPlus and related algebras;

A module with statistical tools to compute the composed QoS of the orchestration from the QoS of the constituent Web services. This functionality makes it possible to design and fine-tune global QoS contracts by selecting adequate quantiles over the sample data.

One objective of the tool is to demonstrate the possibility of performing overbooking, thus improving efficiency of the orchestration in terms of QoS.
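The chain of steps listed above (sampling, MaxPlus composition, quantile selection) can be illustrated by the following Python sketch. It is not the tool chain itself: the orchestration (one call followed by two parallel calls that are joined), the Gamma parameters, and all function names are assumptions made for the example, and only response time is considered as a QoS parameter.

```python
import random


def sample_latencies(shape, scale, n, rng):
    """Monte-Carlo samples of a service response time (Gamma distributed)."""
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def compose(first, left, right):
    """MaxPlus composition for: call `first`, then `left` and `right` in parallel.

    Sequential composition adds latencies; a fork-join takes their maximum.
    """
    return [a + max(b, c) for a, b, c in zip(first, left, right)]


def quantile(samples, q):
    """Empirical q-quantile of the samples (0 <= q <= 1)."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


rng = random.Random(0)  # fixed seed so that runs are reproducible
n = 10_000
s1 = sample_latencies(2.0, 50.0, n, rng)   # hypothetical service 1 (ms)
s2 = sample_latencies(3.0, 40.0, n, rng)   # hypothetical service 2 (ms)
s3 = sample_latencies(2.0, 80.0, n, rng)   # hypothetical service 3 (ms)

total = compose(s1, s2, s3)
contract_95 = quantile(total, 0.95)
print(f"orchestration latency met by 95% of the runs: {contract_95:.1f} ms")
```

Choosing a quantile rather than the worst case is what leaves room for overbooking: the contract is met with a chosen probability instead of being dimensioned for the most pessimistic execution.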

The language *Active XML* or *AXML* is an extension of XML that allows documents to be enriched with *service calls*, or sc's for short. These sc's point to Web services that, when triggered, access other documents; this materialization of sc's produces in turn AXML code that is included in the calling document. One therefore speaks of dynamic or intensional documents; note in particular that materialization can be *total* (inserting data in XML format) or *partial* (inserting AXML code containing further sc's). AXML has been developed by the GEMO team at INRIA Futurs, headed by Serge Abiteboul; it makes it possible to set up P2P systems around repositories of AXML documents (one repository per peer).
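The difference between partial and total materialization can be illustrated with a toy model in Python. This is neither AXML syntax nor the API of the AXML system: documents are represented as nested lists, a service call as a tuple `("sc", name)`, and the services `get_items` and `get_price` are hypothetical.

```python
def materialize(node, services):
    """Perform one materialization step on a toy document.

    Every service call ("sc", name) is replaced by the result of the
    corresponding service; the result may itself contain further calls.
    """
    if isinstance(node, tuple) and node[0] == "sc":
        return services[node[1]]()
    if isinstance(node, list):
        return [materialize(child, services) for child in node]
    return node  # plain data is left untouched


def has_calls(node):
    """True if the document still contains unmaterialized service calls."""
    if isinstance(node, tuple) and node[0] == "sc":
        return True
    if isinstance(node, list):
        return any(has_calls(child) for child in node)
    return False


# Hypothetical services: the first returns a result with a nested call
# (partial materialization), the second returns plain data (total).
services = {
    "get_items": lambda: ["item", "book", ("sc", "get_price")],
    "get_price": lambda: "12 EUR",
}

document = ["catalog", ("sc", "get_items")]
step1 = materialize(document, services)
step2 = materialize(step1, services)
print(step1, has_calls(step1))
print(step2, has_calls(step2))
```

In this toy model the document stops evolving once no call remains; in a distributed setting, the order in which peers trigger such calls is precisely what raises the confluence question.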

We are currently cooperating with the GEMO team (Serge Abiteboul) and the LIAFA laboratory in Paris (Anca Muscholl) to explore the behavioral semantics of AXML in the framework of the ASAX project, see below. Our objective is to be able to ensure confluence despite distribution and asynchrony, even for documents not belonging to the so-called ``positive'' class, where confluence is ensured thanks to the absence of revision of facts.

Obviously, if one bounds the size of the documents, then the usual tools and techniques can be applied to analyze AXML documents. Part of the work done this year was to show how to translate AXML documents into CSP under this restrictive hypothesis of boundedness.

One challenge is to model the dynamicity of a document that can grow arbitrarily large, and to define a way to unfold such structures. Another challenge is to define restrictions that are not so intrusive that real systems can no longer be modeled, yet not so permissive that every non-trivial problem becomes undecidable.

Contract INRIA2 04 A 0082 MC 01 1 — December 2003/June 2006

The project *``SWAN: Self-Aware Management''* is funded by the French national network RNRT, Ministry of Research. It started in December 2003 and is scheduled to last 30 months. The DistribCom team cooperates in SWAN with

the *MADYNES* team of INRIA Lorraine and Paris-Nord University (the latter replaced by *LABRI* Bordeaux in 2005),

industrial partners Alcatel, France Telecom, and QoSmetrics.

SWAN aims at empowering local autonomous diagnosis and administration functions in networks and services. Compared to the preceding projects *MAGDA* and *MAGDA2*, where *asynchronicity* and *distribution* were already at the heart, the new additional challenge in SWAN is *dynamicity*, namely non-static topologies of interaction. Networks expand or shrink as peers and connections are added or withdrawn at runtime, with the necessary adaptations and negotiations managed locally in the domain directly concerned. Web Services show this dynamic behavior by nature. Both applications thus present a fundamental challenge to all model-based approaches to diagnosis, and to supervision more generally: find models that allow for self-modification (compare the discussion in the section on models of concurrency). DistribCom is leader of the SWAN project; the main scientific contributions are the formal investigation and simulations for Orc, and the multi-domain QoS negotiation, both described above.

RNRT November 2004 - November 2006

Very often, functional models and performance models of software and systems are developed by different kinds of specialists. The goal of Persiform is to provide a complete methodology and toolbox to allow performance evaluation from functional models. A first prototype of a toolchain has been developed that translates functional models (namely sequence diagrams and activity diagrams) into a performance model (namely SES Workbench models). Functional languages are first translated into a common language, a variant of stochastic colored Petri nets. These nets are then translated into a queueing network model. The partners for this project are: France Telecom R&D, Verimag, INRIA Rennes, INT, and a software company, Orpheus.

External research project with France Telecom

Software development often starts with requirement capture, i.e., collecting a set of representative behaviors of a system. The scenarios collected can be considered as partial views of a system, but may involve some inconsistencies. The objective of CO2 was to provide formal definitions of scenario compositions, define notions of coherence for a set of views defined as a collection of scenarios, and provide decision algorithms indicating whether there exists an implementation realizing them. CO2 ended in September. The outcome of the project is a prototype to compose scenarios, 8 deliverables, and a publication in a conference. Furthermore, the work accomplished during CO2 on the mixed product of scenarios opened new perspectives on modular composition of services.

December 2004 - May 2006

The general objective of this contract with Alcatel Research & Innovation is to perform exploratory developments in relation with two Alcatel business divisions, namely: Optical Networks Division and Mobile Radio Division.

In general, telecommunication systems are composed of many interconnected functions, software components, protocol layers, etc. These elements are designed to monitor their internal state and their ability to fulfill the desired function. In case of abnormal behavior, malfunction or failure, they raise alarms that are collected by a supervisor. The interdependence of components, and the general failure propagation phenomenon, introduce a strong correlation between alarms. One therefore generally observes bursts of correlated information, which must be analyzed and interpreted to locate the possible origin(s) of the failures (up to now, this work has been performed by a human operator).

The objective of this contract is to develop diagnosis methods for such systems, made of many interconnected functions. We make use of models that capture the concurrency of behaviors, and describe runs of such systems by event structures. Two application domains have been identified: Submarine Line Terminal Equipment, for high-rate optical intercontinental transmissions, and the radio access layer, for GSM/GPRS networks.

ACI Sécurité — September 2004 - September 2007

The purpose of the Potestat ACI is to study security policies in networks, and to analyze the security of such networks with test techniques. The partners involved in this ACI are: LSR/IMAG - INPG (Vasco team), VERIMAG (DCS team), and INRIA Rennes (Vertecs, Lande, and DistribCom teams).

February 2005 - January 2007

ASAX (http://gemowiki.futurs.inria.fr/twiki/bin/view/Gemo/AsaxWeb) is a cooperative research action headed by DistribCom, in cooperation with INRIA's GEMO team, LIAFA (Paris), and Tel-Aviv University. It started in January 2005 and is scheduled to end in December 2006. ASAX's purpose is the analysis of Active XML systems; see http://activexml.net on Active XML and Web services. Currently, only a fragment of AXML, called ``positive AXML'', such as systems having monotonic answers to queries, can be given a deterministic behavioral semantics; the goal of ASAX is to break this limitation, and provide a formal semantics and analysis algorithms for AXML systems.

Associated Team INRIA-NUS — 2006

This associated team is a collaboration with the National University of Singapore. The main research theme is the control and diagnosis of distributed communicating systems. Two application areas are targeted: real-time embedded systems, and telecommunications systems and services. Although very different in nature, both areas make fundamental use of models of concurrency. Several types of formal models are considered: scenario languages, communicating automata and Petri nets. More specifically, we work together on the following problems:

An extension of scenario models for distributed systems diagnosis.

Distributed control synthesis, with applications to the quasi-static scheduling problem.

As the cooperation has just begun, work is still ongoing. Thomas Gazagnaire spent three months at NUS between May and July, working on sliding scenarios. Blaise Genest spent two weeks at NUS in May working on quasi-static scheduling. Loïc Hélouët plans to spend two weeks at NUS at the end of 2006 to work on a new specification formalism.

A. Benveniste is associate editor at large (AEAL) for the journal *IEEE Trans. on Automatic Control* and member of the editorial board of the journal «Proceedings of the IEEE». In 2006, he was a member of the Program Committee of the following conferences: WODES, EMSOFT. He was plenary speaker at WODES'2006. He is a member of the Strategic Advisory Council of the Institute for Systems Research, Univ. of Maryland, College Park, USA. He is in charge of managing the INRIA side of the Alcatel external Research Programme (ARP).

E. Fabre is co-organizing with Victor Khomenko (Newcastle) the UFO workshop, a satellite event of ATPN'07. He has been invited to join the Program Committee of DX'07.

C. Jard was in 2006 a member of the Program Committee of the following conferences: FORTE, NOTERE, TESTCOM, MOVEP, AFADL, and has been invited for 2007 to the Program Committee of ICATPN, UFO and NOTERE. He has served as an expert in several programs of the French ministry of research (particularly in the RNRT programme in telecommunications). He is also a member of the editorial board of the *Annales des Télécommunications* and of the steering committee of the MSR series of conferences. C. Jard is a member of the administration council of the ENS Cachan. He has been president of the Atlanstic research program (at Nantes). He participated in the scientific evaluation committees of French labs (IRCCyN, Nantes, as president, and LIFC, Besançon) and of the research center in computer science in Montreal. In 2006, C. Jard was a member of the PhD Committees of O. Constant (University of Pau) and L. Huo (McGill University, Montreal) as rapporteur, and of T. Chatain and E. Donin at Rennes (as supervisor). He also participated in the Habilitation Committee of V. Rusu at Rennes (as president).

Stefan Haar is a member of the working group for evaluation of international activities with the COST committee of INRIA; he also served on the IFSIC's ``commission de spécialistes'' section 27 until summer of 2006. He is an associate editor of *IEEE Transactions on Automatic Control*.

Loïc Hélouët was invited to become co-rapporteur at ITU for question 17 on the MSC language. This nomination should become official in December. He was also invited to participate in the program committee of SDL 2007.

É. Fabre teaches information theory and communication theory at Ecole Normale Supérieure de Cachan, Ker Lann campus, in the computer science and telecommunications magistère program.

L. Hélouët teaches the UML notation to the mastère classes at ENST Bretagne. He also participates in the MAS module (with C. Jard and S. Haar) of the master M2RI, which is dedicated to models and algorithms for the supervision of large systems.

C. Jard is a full-time professor at the ENS Cachan and teaches mainly at the Master level, in Computer Science and Telecom, and in Maths. He manages the Info-Telecom track of the Master-Recherche-STS of the Rennes 1 university. It is to be noted that one course in this track is on the research subject of DistribCom. He is also in charge of the competitive examination for the entry of new students in computer science in the French ENS schools.

A. Benveniste gave one of the three plenary lectures at the conference WODES'2006, July 2006, Ann Arbor. The talk was co-authored with Eric Fabre and its title was: ``Partial order techniques for distributed discrete event systems: why you can't avoid using them''.

S. Haar presented, on invitation by Reiko Heckel, a seminar on aspects of unfolding and diagnosis at the CS department of the University of Leicester, UK, in June 2006.

L. Hélouët was invited in March 2006 to the LABRI (Laboratoire Bordelais de Recherche en Informatique) to give a talk on applications of game theory to covert channel detection.

B. Genest gave a talk in March 2006 at the LIAFA, Paris 7 on learning algorithms for distributed systems. During his stay at NUS, Singapore, he presented ongoing work on distributed dynamic monitoring. He also presented the improvement of Zielonka's algorithm at LABRI in October 2006.

Guy-Vincent Jourdan is professor at the University of Ottawa, Canada. He visited DistribCom for two months (May-June 2006) as invited professor at ENS Cachan and INRIA. He worked with C. Jard and S. Haar on concurrent machine identification, using a new model of IO-partial order automaton. This was the starting point for a project of formal collaboration. S. Haar has visited the University of Ottawa for several months, and requested an INRIA sabbatical leave for 2007 to continue working in Ottawa.

Christoforos Hadjicostis visited DistribCom for two weeks in November 2006. This cooperation aims at building modular diagnosers for distributed systems.