Most software-driven systems we commonly use in our daily life are
huge hierarchical assemblies of components. This observation runs
from the micro-scale (multi-core chips) to the macro-scale (data
centers), and from hardware systems (telecommunication networks) to
software systems (choreographies of web services). The main
characteristics of these pervasive applications are size,
complexity, heterogeneity, and modularity (or concurrency). Besides,
several such systems are actively used before they are fully
mastered, or they have grown so much that they now raise new
problems that are hardly manageable by human operators. While these
systems and applications are becoming more essential, or even
critical, the need for their reliability, efficiency
and manageability becomes a central concern in computer
science. The main objective of SUMO is to develop theoretical
tools to address such challenges, according to the following axes.
Several disciplines in computer science have of course addressed some of the issues raised by large systems: for example, formal methods (essentially for verification purposes), discrete-event systems (diagnosis, control, planning, and their distributed versions), and concurrency theory (modelling and analysis of large concurrent systems). Practical needs have oriented these methods towards the introduction of quantitative aspects, such as time, probabilities, costs, and their combinations. This approach drastically changes the nature of the questions that are raised. For example, verification questions become the reachability of a state in a limited time, the average sojourn duration in a state, the probability that a run of the system satisfies some property, the existence of control strategies with a given winning probability, etc. In this setting, exact computations are not always appropriate as they may end up with unaffordable complexities, or even with undecidability. Approximation strategies then offer a promising way around, and are certainly also a key to handling large systems. Approaches based on discrete-event systems follow the same trend towards quantitative models. For diagnosis aspects, one is interested in the most likely explanations for observed malfunctions, in the identification of the most informative tests to perform, or in the optimal placement of sensors. For control problems, one is of course interested in optimal control, in minimizing communications, in the robustness of the proposed controllers, in the online optimization of QoS (Quality of Service) indicators, etc.
While the above questions have already received partial answers, they remain largely unexplored in a distributed setting. We focus on structured systems, typically a network of dynamic systems with known interaction topology, the latter being either static or dynamic. Interactions can be synchronous or asynchronous. The state-space explosion raised by such systems has been addressed through two techniques. The first one consists in adopting true-concurrency models, which take advantage of the parallelism to reduce the size of the trajectory sets. The second one looks for modular or distributed "supervision" methods, taking the shape of a network of local supervisors, one per component. While these approaches are relatively well understood, their mixing with quantitative models remains a challenge (as an example, there exists no proper setting assembling concurrency theory with stochastic systems). This field is largely open both for modeling, analysis and verification purposes, and for distributed supervision techniques. The difficulties combine with the emergence of data-driven distributed systems (as web services or data centric systems), where the data exchanged by the various components influence both the behaviors of these components and the quantitative aspects of their reactions (e.g. QoS). Such systems call for symbolic or parametric approaches for which a theory is still missing.
Some existing distributed systems like telecommunication networks,
data centers, or large-scale web applications have reached sizes and
complexities that reveal new management problems. One can no longer
assume that the model of the managed systems is static and fully
known at any time and any scale. To scale up the management methods
to such applications, one needs to be able to design reliable
abstractions of parts of the systems, or to dynamically build a part
of their model, following the needs of the management functions to
realize. Besides, one does not wish to define management objectives
at the scale of each single component, but rather to pilot these
systems through high-level policies (maximizing throughput,
minimizing energy consumption, etc.). These distributed systems and
management problems have connections with other approaches for the
management of large structured stochastic systems, such as Bayesian
networks (BN) and their variants. The similarity can actually be
made more formal: inference techniques for BN rely on the concept of
conditional independence, which has a counterpart for networks of
dynamic systems and is at the core of techniques like
distributed diagnosis, distributed optimal planning, or the
synthesis of distributed controllers. The potential of this
connection is largely unexplored, but it suggests that one could
derive from it good approximate management methods for large
distributed dynamic systems.
Since its creation in 2015, SUMO has successfully developed formal methods for large quantitative systems, in particular addressing verification, synthesis and control problems. Our current motivation is to expand this by putting emphasis on new concerns, such as algorithm efficiency, imprecision handling, and the more challenging objective of addressing incomplete or missing models. In the following we list a selection of detailed research goals, structured into four axes according to model classes: quantitative models, large systems, population models, and data-driven models. Some correspond to the pursuit of previously obtained results, others are more prospective.
The analysis and control of quantitative models will remain at the heart of a large part of our
research activities. In particular, we have two starting
collaborative projects focusing on timed models,
namely our ANR project TickTac and our collaboration with MERCE. The
main expected outcome of TickTac is an open-source tool implementing
the latest algorithms and allowing for quick prototyping of new
algorithms. Several other topics will be explored in these
collaborations, including robustness issues, game-theoretic problems, as well as the
development of efficient algorithms, e.g. based on the CEGAR approach or
specifically designed for subclasses of automata (e.g. automata with
few clocks and/or having a specific structure, as
in 42).
Inspired by our collaboration with Alstom, we also aim at
developing symbolic techniques for analysing non-linear timed models.
Stochastic models are another important focus for our research. On the one hand, we want to pursue our work on the optimization of non-standard properties for Markov decision processes, beyond the traditional verification questions, and explore e.g. long-run probabilities, and quantiles. Also, we aim at lifting our work on decisiveness from purely stochastic 40, 41 to non-deterministic and stochastic models in order to provide approximation schemes for the probability of (repeated) reachability properties in infinite-state Markov decision processes.
On the other hand, in order to effectively handle large stochastic systems, we will
pursue our work on approximation techniques. We aim at deriving simpler models, enjoying or preserving specific properties, and at determining the appropriate level of abstraction for a given system. One needs of course to quantify the approximation degrees (distances), and to preserve essential features of the original systems (explainability). This is a connection point between formal methods and the booming learning methods.
Regarding diagnosis/opacity issues, we will further explore the quantitative aspects. For diagnosis, the theory needs extensions to the case of incomplete or erroneous models, and to reconfigurable systems, in order to broaden its applicability (see Sec. 3.6). There is also a need for non-binary causality analysis (e.g. performance degradations in complex systems). For opacity, we aim at quantifying the effort attackers must produce versus how much of a secret they can guess. We also plan to synthesize robust controllers that resist sensor failures/attacks.
Part of the background of SUMO is on the analysis and management of concurrent and modular/distributed systems, which we view as two main approaches to address state explosion problems. We will pursue the study of these models (including their quantitative features): verification of timed concurrent systems, robust distributed control of modular systems, resilient control to coalitions of attackers, distributed diagnosis, modular opacity analysis, distributed optimal planning, etc. Nevertheless, we have identified two new lines of effort, inspired by our application domains.
Reconfigurable systems. This is mostly motivated by applications at the convergence of virtualization technologies with networking (Orange and Nokia PhDs). Software-defined networks, either in the core (SDN/NFV) or at the edge (IoT), involve distributed systems that change structure constantly, to adapt to traffic, failures, maintenance, upgrades, etc. Traditional verification, control, diagnosis approaches (to mention only those) assume static and known models that can be handled as a whole. This is clearly insufficient here: one needs to adapt existing results to models that (sometimes automatically)
change structure, incorporate new components/users or lose some, etc. At the same time, the programming paradigms for such systems (chaos monkey) incorporate resilience mechanisms, that should be considered by our models.
Hierarchical systems. Our experience with the regulation of subway lines (Alstom) revealed that large-scale complex systems are usually described at a single level of granularity. Determining the appropriate granularity is a problem in itself. The control of such systems, with humans in the loop, cannot be expressed at this single level, as tasks become too complex and require extremely skilled staff. It is rather desirable to describe models simultaneously at different levels of granularity, and to perform control at the appropriate level: humans in charge of managing the system by high-level objectives, and computers in charge of implementing the appropriate micro-control sequences to achieve these tasks.
We want to step up our effort in parameterized verification of systems consisting of many identical components, so-called population models. In a nutshell, our objectives can be summarized as "from Boolean to quantitative".
Inspired by our experience with the analysis of populations of yeasts, we aim at developing the quantitative analysis and control of population models, e.g. using Markov decision processes together with quantitative properties, and focusing on generating strategies with fast convergence.
As for broadcast networks, the challenge is to model the mobility of nodes (representing mobile ad hoc networks) in a faithful way. The obtained model should reflect, on the one hand, the placement of nodes at a given time instant, and on the other hand, the physical movement of nodes over time. In this context, we will also use game-theory techniques, which allow one to study cooperative and conflictual behaviors of the nodes in the network, and to synthesize correct-by-design systems in adversarial environments.
As a new application area, we target randomized distributed algorithms. Our goal is to provide probabilistic variants of threshold automata 47 to represent fault-tolerant randomized distributed algorithms, designed for instance to solve the consensus problem. Most importantly, we then aim at developing new parameterized verification techniques that will enable the automated verification of the correctness of such algorithms, as well as the assessment of their performance (in particular the expected time to termination).
In this axis, we will investigate whether fluid model checking and mean-field approximation techniques apply to our problems. More generally, we aim at a fruitful cross-fertilizing of these approaches with parameterized model-checking algorithms.
In this axis, we will consider data-centric models, and in particular their application to crowd-sourcing. Many data-centric models such as Business Artifacts 48 orchestrate simple calls and answers to tasks performed by a single user. In a crowd-sourcing context, tasks are realized by pools of users, which may result in imprecise, uncertain and (partially) incompatible information. We thus need mechanisms to reconcile and fuse the various contributions in order to produce reliable information. Another aspect to consider concerns answers of higher-order: how to allow users to return intentional answers, in the form of a sub-workflow (a coordinated set of tasks) whose execution will provide the intended value. In the framework of the ANR Headwork we will build on formalisms such as GAG (guarded attribute grammars) or variants of business artifacts to propose formalisms adapted to crowd-sourcing applications, and tools to analyze them. To address imprecision, we will study techniques to handle fuzziness in user answers, and will explore means to set incentives (rewards) dynamically and to set competence requirements to guide the execution of a complex workflow, in order to achieve an objective with a desired level of quality.
In collaboration with Open Agora, CESPA and University of Yaoundé (Cameroon) we intend to implement in the GAG formalism some elements of argumentation theory (argumentation schemes, speech acts and dialogic games) in order to build a tool for the conduct of a critical discussion and the collaborative construction of expertise. The tool would incorporate point-of-view extraction (using clustering mechanisms), amendment management and consensus-building mechanisms.
We are concerned with one important lesson derived from our involvement in several application domains. Most of our background comes into play as soon as a perfect model of the system under study is available. Then verification, control, diagnosis, test, etc. can mobilize a solid background, or suggest new algorithmic problems to address. In numerous situations, however, assuming that a model is available is simply unrealistic. This is a major bottleneck for the impact of our research. We therefore intend to address this difficulty, in particular for the following domains.
The smart-city trend aims at optimizing all functions of future cities with the help of digital technologies. We focus on the segment of urban trains, which will evolve from static and scheduled offers to reactive and eventually on-demand transportation offers. We address two challenges in this field. The first one concerns the optimal design of robust subway lines. The idea is to be able to evaluate, at design time, the performance of time tables and of different regulation policies. In particular, we focus on robustness issues: how can small perturbations and incidents be accommodated by the system, how fast will return to normality occur, when does the system become unstable? The second challenge concerns the design of new robust regulation strategies to optimize delays, recovery times, and energy consumption at the scale of a full subway line. These problems involve large-scale discrete-event systems, with temporal and stochastic features, and translate into robustness assessment, stability analysis and joint numerical/combinatorial optimization problems on the trajectories of these systems.
Telecommunication-network management is a rich provider of research topics for the team, and some members of SUMO have a long background of contacts and transfer with industry in this domain. Networks are typical examples of large distributed dynamic systems, and their management raises numerous problems ranging from diagnosis (or root-cause analysis), to optimization, reconfiguration, provisioning, planning, verification, etc. They also bring new challenges to the community, for example on the modeling side: building or learning a network model is a complex task, specifically because these models should reflect features like the layering, the multi-resolution view of components, the description of both functions, protocols and configuration, and they should also reflect dynamically-changing architectures. Besides modeling, management algorithms are also challenged by features like the size of systems, the need to work on abstractions, on partial models, on open systems, etc. The networking technology is now evolving toward software-defined networks, virtualized-network functions, multi-tenant systems, etc., which reinforces the need for more automation in the management of such systems.
Data centers are another example of large-scale modular dynamic and reconfigurable systems: they are composed of thousands of servers, on which virtual machines are activated, migrated, resized, etc. Their management covers issues like troubleshooting, reconfiguration, optimal control, in a setting where failures are frequent and mitigated by the performance of the management plane. We have a solid background in the coordination of the various autonomic managers that supervise the different functions/layers of such systems (hardware, middleware, web services, ...). Virtualization technologies now reach the domain of networking, and telecommunication operators/vendors evolve towards providers of distributed open clouds. This convergence of IT and networking strongly calls for new management paradigms, which is an opportunity for the team.
A current trend is to involve end-users in the collection and analysis of data. Examples of this trend are contributive science, crisis-management systems, and crowd-sourcing applications. All these applications are data-centric and user-driven. They are often distributed and involve complex, and sometimes dynamic, workflows. In many cases, there are strong interactions between data and control flows: indeed, decisions taken regarding the next tasks to be launched highly depend on collected data. For instance, in an epidemic-surveillance system, the aggregation of various reported disease cases may trigger alerts. Another example is crowd-sourcing applications where user skills are used to complete tasks that are better performed by humans than computers. In return, this requires addressing imprecise and sometimes unreliable answers. We address several issues related to complex workflows and data. We study declarative and dynamic models that can handle workflows, data, uncertainty, and competence management.
Once these models are mature enough, we plan to build prototypes to experiment with them on real use cases from contributive science, health-management systems, and crowd-sourcing applications. We also plan to define abstraction schemes allowing formal reasoning on these systems.
SIMSTORS is a software tool for the simulation of stochastic concurrent timed systems. The heart of the software is a variant of stochastic and timed Petri nets, whose execution is controlled by a regulation policy (a controller), or a predetermined theoretical schedule. The role of the regulation policy is to control the system so as to realize objectives, or a schedule when one exists, with the best possible precision. SIMSTORS is well adapted to represent systems with randomness, parallelism, task scheduling, and resources. From 2015 to 2018, it was used for the P22 collaboration with Alstom Transport, to model metro traffic and evaluate the performance of regulation solutions. In 2020, it was at the heart of a collaboration on multi-modal networks with Alstom Transport Madrid. This software allows for step-by-step simulation, but also for efficient performance analysis of systems such as production cells or train systems. The initial implementation was released in 2015, and the software is protected by the APP.
Since then, SIMSTORS has been extended along two main axes. On the one hand, SIMSTORS models were extended to handle situations where shared resources can be occupied by more than one object (this is of paramount importance to represent conveyors, roads occupied by cars, or train tracks with smoothed scheduling allowing shared sections among trains), with priorities, constraints on their ordering, and individual characteristics. This allows one, for instance, to model vehicles with different speeds on a road, while handling safety-distance constraints. On the other hand, SIMSTORS models were extended to allow control of stochastic nets based on decision rules that follow optimization schemes. In 2020, it was used to define efficient traffic management techniques with planning during a collaboration with Roma 3 University.
Participants : Emily Clement, Blaise Genest, Loïc Hélouët, Thierry Jéron, Nicolas Markey,
Requirement engineering is a key phase in the development process. Ensuring that requirements are consistent is essential so that they do not conflict and admit implementations. In 28, we consider the formal verification of rt-consistency, which imposes that the inevitability of definitive errors of a requirement should be anticipated, and that of partial consistency, which was recently introduced as a more effective check. We generalize and formalize both notions for discrete-time timed automata, develop three incremental algorithms, and present experimental results.
Negotiations were introduced in 44 as a model for concurrent systems with multiparty decisions. What is very appealing with negotiations is that it is one of the very few non-trivial concurrent models where several interesting problems, such as soundness, i.e. absence of deadlocks, can be solved in PTIME 43. In this paper, we introduce the model of timed negotiations and consider the problem of computing the minimum and the maximum execution times of a negotiation. The latter can be solved using the algorithm of 45 computing costs in negotiations, but surprisingly minimum execution time cannot.
This year, we have proposed new algorithms 18 to compute both minimum
and maximum execution times, which work for much more general classes of negotiations than
45, which only considered sound and deterministic negotiations. Further, we
uncover
the precise complexities of these questions, ranging from PTIME to
Reachability can be decided in polynomial space in timed automata. However, this result assumes
arbitrary precision in the dates at which transitions have to be taken in order to reach the
target state. In 26, we introduce and study permissive
strategies in the setting of timed systems: instead of suggesting exact dates at which to take
transitions, permissive strategies propose intervals of dates; the exact dates at which the
transitions are taken are chosen by an opponent. We develop an algorithm for computing maximally permissive strategies in acyclic timed automata and acyclic timed games.
Participants : Nathalie Bertrand, Hugo Bazille, Éric Fabre, Blaise Genest
Systems prone to faults are often equipped with a controller whose aim is
to restrict the behaviour of the system in order to perform a diagnosis. Such
a task is called active diagnosis. However, to prevent the controller from degrading
the system for the sake of diagnosis, a second objective in terms of quality of
service is usually assigned to the controller. In the framework of stochastic
systems, a possible specification, called safe active diagnosis, requires that
the probability of correctness of the infinite (random) run is nonzero.
In 12, we introduce and study two alternative
specifications that are in many contexts more realistic. The notion of
(
We are interested in studying the evolution of large homogeneous populations of cells, where each cell is assumed to be composed of a group of biological players (species) whose dynamics is governed by a complex biological pathway, identical for all cells. Modeling the inherent variability of the species concentrations in different cells is crucial to understand the dynamics of the population. In 16, we focus on handling this variability by modeling each species by a random variable that evolves over time. This appealing approach runs into the curse of dimensionality since exactly representing a joint probability distribution involving a large set of random variables quickly becomes intractable as the number of variables grows. To make this approach amenable to biopathways, we explore different techniques to (i) approximate the exact joint distribution at a given time point, and (ii) track its evolution as time elapses.
Participants : Loïc Hélouët, Nicolas Markey
In 35, we study games with reachability objectives under energy constraints: such games are played on weighted graphs, and the aim is to reach a target state while always keeping the sum of the weights within a given interval. We prove that under strict energy constraints (either only lower-bound constraint or interval constraint), those games are LOGSPACE-equivalent to energy games with the same energy constraints but without reachability objective (i.e., for infinite runs). We then consider two relaxations of the upper-bound constraints (while keeping the lower-bound constraint strict): in the first one, called weak upper bound, the upper bound is absorbing, i.e., when the upper bound is reached, the extra energy is not stored; in the second one, we allow for temporary violations of the upper bound, imposing limits on the number or on the amount of violations.
We prove that when considering weak upper bound, reachability objectives require memory, but can still be solved in polynomial-time for one-player arenas; we prove that they are in coNP in the two-player setting. Allowing for bounded violations makes the problem PSPACE-complete for one-player arenas and EXPTIME-complete for two players. We then address the problem of existence of bounds for a given arena. We show that with reachability objectives, existence can be a simpler problem than the game itself, and conversely that with infinite games, existence can be harder.
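To make the difference between a strict interval constraint and a weak upper bound concrete, here is a minimal sketch for one-player arenas with integer weights. It is a brute-force search over (vertex, energy) pairs, hence pseudo-polynomial, and is not the algorithm of 35; the function name `can_reach` and the encoding of arenas as dictionaries are assumptions made for illustration only.

```python
from collections import deque

def can_reach(edges, source, target, init, low, high, weak_upper=False):
    """Decide reachability of `target` under an energy constraint.

    edges: dict mapping a vertex to a list of (weight, successor) pairs
           (hypothetical encoding, integer weights assumed).
    The accumulated energy must stay in [low, high] at every step.
    With weak_upper=True the upper bound is absorbing: energy above
    `high` is truncated to `high` instead of violating the constraint.
    """
    if not (low <= init <= high):
        return False
    start = (source, init)
    seen = {start}
    queue = deque([start])
    while queue:
        vertex, energy = queue.popleft()
        if vertex == target:
            return True
        for weight, succ in edges.get(vertex, []):
            new_energy = energy + weight
            if weak_upper:
                new_energy = min(new_energy, high)
            config = (succ, new_energy)
            if low <= new_energy <= high and config not in seen:
                seen.add(config)
                queue.append(config)
    return False

# A small arena: charging by 3 overshoots the interval [0, 2].
arena = {"a": [(3, "b")], "b": [(-2, "t")]}
strict_result = can_reach(arena, "a", "t", 0, 0, 2)
weak_result = can_reach(arena, "a", "t", 0, 0, 2, weak_upper=True)
```

On this arena the strict interval constraint is violated by the first move, whereas with a weak upper bound the extra energy is simply discarded and the target becomes reachable.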
Participants : Loïc Hélouët, Hervé Marchand
A cyber-physical system is usually composed of various physical and software components that interact with each other in ways that change with context. For such large systems, one challenge is to ensure security against malicious users. As a security problem, we consider in 37 feedback control systems where sensor readings may be compromised by a malicious attacker intent on causing damage to the system. We study this problem at the supervisory layer of the control system, using discrete-event systems techniques. We assume that the attacker can edit the outputs from the sensors of the system before they reach the supervisory controller. In this context, we formulate the problem of synthesizing a supervisor that is robust against the class of edit attacks on the sensor readings and present a solution methodology for this problem. This methodology blends techniques from games on automata with imperfect information with results from supervisory control theory of partially-observed discrete-event systems. Necessary and sufficient conditions are provided for the investigated problem.
Following previous work on the enforcement of confidential information in cyber-physical systems, we considered in 30 the opacity control problem with coalitions between attackers. In discrete-event systems, the opacity of a secret ensures that some behaviors or states cannot be inferred with certainty from partial observation of the system. Enforcing opacity in a discrete-event system, encoded by a finite labelled transition system (LTS), is a way to avoid information leakage. Checking opacity is decidable but costly (EXPTIME in the worst case). We addressed opacity for modular systems in which every module, represented by an LTS, has to protect its own secret (a set of secret states S) w.r.t. a local attacker. Once the system is composed, we assume a coalition between the attackers, who share their local views (this coalition is called the global attacker). Assuming the global attacker can observe all interactions between modules, we provide a reduced-complexity opacity verification technique and an algorithm for constructing local controllers that enforce opacity for each secret separately.
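As an illustration of why checking opacity is costly, the following sketch checks current-state opacity of a single LTS against a single attacker by building the attacker's belief states (a subset construction, exponential in the worst case). It is the textbook monolithic check, not the reduced-complexity modular technique of 30; the function name `is_opaque` and the dictionary encoding of the LTS are assumptions.

```python
def is_opaque(transitions, initial, secret, observable):
    """Current-state opacity check by subset construction.

    transitions: dict state -> list of (label, successor) pairs
                 (hypothetical encoding of a finite LTS).
    secret:      set of secret states S.
    observable:  set of labels the attacker can observe.
    Returns False iff some observation lets the attacker conclude
    with certainty that the current state is secret.
    """
    def closure(states):
        # Add every state reachable through unobservable transitions.
        seen, stack = set(states), list(states)
        while stack:
            state = stack.pop()
            for label, succ in transitions.get(state, []):
                if label not in observable and succ not in seen:
                    seen.add(succ)
                    stack.append(succ)
        return frozenset(seen)

    start = closure({initial})
    visited, stack = {start}, [start]
    while stack:
        belief = stack.pop()
        if belief <= secret:          # all compatible states are secret
            return False
        for action in observable:
            successors = {succ
                          for state in belief
                          for label, succ in transitions.get(state, [])
                          if label == action}
            if successors:
                new_belief = closure(successors)
                if new_belief not in visited:
                    visited.add(new_belief)
                    stack.append(new_belief)
    return True

# Observing "a" leaves the attacker unsure between states 1 and 2.
hidden = {0: [("a", 1), ("a", 2)]}
# Observing "b" can only happen in a run ending in secret state 3.
leaky = {0: [("tau", 1), ("a", 2)], 1: [("b", 3)]}
hidden_ok = is_opaque(hidden, 0, {1}, {"a"})
leaky_ok = is_opaque(leaky, 0, {3}, {"a", "b"})
```

The number of belief states can be exponential in the number of states of the LTS, which is consistent with the worst-case cost mentioned above.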
Participants : Arthur Queffelec, Ocan Sankur
In 14, we study a variant of the multi-agent path finding (MAPF) problem in which the group of agents is required to stay connected with a supervising base station throughout the execution. In addition, we consider the problem of covering an area with the same connectivity constraint. We show that both problems are PSPACE-complete on directed and undirected topological graphs, while checking the existence of a bounded plan is NP-complete when the bound is given in unary (and PSPACE-hard when the encoding is in binary). Moreover, we identify a realistic class of topological graphs on which the decision problem falls in NLOGSPACE, although the bounded versions remain NP-complete for unary encoding.
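The connectivity constraint can be illustrated by the check that every configuration of a plan has to pass. The sketch below assumes that communication is given by a set of undirected edges between vertices and that a configuration is connected when every agent can reach the base through vertices occupied by agents; this reading of the constraint, as well as the name `is_connected_configuration`, are assumptions and not taken from 14.

```python
def is_connected_configuration(comm_edges, base, positions):
    """Check that all agents are connected to the base station.

    comm_edges: iterable of undirected communication edges (u, v).
    base:       vertex hosting the base station.
    positions:  list of vertices currently occupied by the agents.
    Only occupied vertices (and the base) can relay communication.
    """
    occupied = set(positions) | {base}
    adjacency = {vertex: set() for vertex in occupied}
    for u, v in comm_edges:
        if u in occupied and v in occupied:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Depth-first search from the base over occupied vertices.
    seen, stack = {base}, [base]
    while stack:
        vertex = stack.pop()
        for neighbour in adjacency[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == occupied

links = [("base", "x"), ("x", "y"), ("y", "z")]
chain_ok = is_connected_configuration(links, "base", ["x", "y"])
gap_ok = is_connected_configuration(links, "base", ["x", "z"])
```

A plan is then a sequence of configurations, each of which passes this check; the hardness results above come from searching over such sequences, not from the check itself.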
Participants : Nathalie Bertrand, Blaise Genest, Anirban Majumdar, Nicolas Markey, Suman Sadhukhan, Ocan Sankur
In 23, we study congestion games which are a classical
type of games studied in game theory, in which
Randomization is a powerful paradigm to solve hard problems, especially in distributed computing. Proving the correctness, and assessing the performance, of randomized distributed algorithms is a very challenging research objective that the verification community has started to address. In 13, we review existing model-checking approaches to the verification of randomized distributed algorithms and identify further research directions.
Traditional concurrent games on graphs involve a fixed number of players, who take decisions simultaneously, determining the next state of the game. We recently introduced a parameterized variant of concurrent games on graphs, where the parameter is precisely the number of players. Parameterized concurrent games are described by finite graphs, in which the transitions bear finite-word languages to describe the possible move combinations that lead from one vertex to another. In the invited contribution 21, we report on results on two problems for such concurrent games with arbitrarily many players. To start with, we studied the problem of determining whether the first player, say Eve, has a strategy to ensure a reachability objective against any strategy profile of her opponents as a coalition. In particular, Eve’s strategy should be independent of the number of opponents she actually has. We establish the precise complexities of the problem for reachability objectives. Second, we considered a synthesis problem, where one aims at designing a strategy for each of the (arbitrarily many) players so as to achieve a common objective. For safety objectives, we show that this kind of distributed synthesis problem is decidable 20.
We considered the number of states necessary for (leaderless) population protocols to be as expressive as unrestricted population protocols. It has been well known since 2004 that the expressive power of unrestricted population protocols is exactly (quantifier-free) Presburger arithmetic with remainder predicates. In 24, we prove that it is sufficient to have a number of states polynomial in the size of a quantifier-free Presburger arithmetic formula with remainder predicates. There is no difference between protocols with and without leaders. This result is surprising as it is known that for fast protocols, there is an exponential gap between the speed of leader and leaderless protocols.
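To see why the number of states is a meaningful measure, consider the classical leaderless protocol deciding the threshold predicate "at least c agents have input 1". The sketch below simulates it under a uniformly random scheduler; it uses c + 1 states, which is exponential in the size of c written in binary, and is not the succinct construction of 24. The function name `threshold_protocol` and the stabilization test are assumptions made for the simulation.

```python
import random

def threshold_protocol(inputs, c, seed=0):
    """Simulate a (c + 1)-state protocol deciding sum(inputs) >= c.

    Each agent holds a value in 0..c. When two agents meet, they merge
    their values into one of them; as soon as a meeting witnesses a
    total of at least c, both agents move to the accepting state c,
    which then spreads to every agent it meets.
    """
    rng = random.Random(seed)
    agents = [min(value, c) for value in inputs]
    n = len(agents)

    def stable(config):
        # Either everybody accepts, or the remaining tokens have been
        # merged into at most one agent and can no longer grow.
        if all(state == c for state in config):
            return True
        nonzero = sum(1 for state in config if state > 0)
        return c not in config and nonzero <= 1

    while n >= 2 and not stable(agents):
        i, j = rng.sample(range(n), 2)      # random pairwise interaction
        total = agents[i] + agents[j]
        if total >= c:
            agents[i] = agents[j] = c
        else:
            agents[i], agents[j] = total, 0
    return all(state == c for state in agents)

accepted = threshold_protocol([1, 1, 1, 0, 0, 1, 0, 1], 3)
rejected = threshold_protocol([1, 0, 0, 1, 0, 0], 3)
```

Since the simulated scheduler is random, the number of interactions before stabilization varies with the seed, but the computed answer does not.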
Participants : Loïc Hélouët, Rituraj Singh
Crowdsourcing consists in hiring workers on the internet to perform large amounts of simple, independent and replicated work units, before assembling the returned results. A challenge to solve intricate problems is to define orchestrations of tasks, and allow higher-order answers where workers can suggest a process to obtain data rather than a plain answer. Another challenge is to guarantee that an orchestration with correct input data terminates, and produces correct output data.
In 25, we have proposed complex workflows, a data-centric model for crowdsourcing based on orchestration of concurrent tasks and higher-order schemes. We consider termination (whether some/all runs of a complex workflow terminate) and correctness (whether some/all runs of a workflow terminate with data satisfying FO requirements). We show that existential termination/correctness are undecidable in general, except for specifications with bounded recursion. However, universal termination/correctness are decidable when constraints on inputs are specified in a decidable fragment of FO, and are at least in co-2EXPTIME.
In 29, we have addressed quality issues in crowdsourcing. Crowdsourcing is a way to solve problems that need human contribution. Crowdsourcing platforms distribute replicated tasks to workers, pay them for their contribution, and aggregate answers to produce a reliable conclusion. A fundamental problem is to infer a correct answer from the set of returned results. Another challenge is to obtain a reliable answer at a reasonable cost: an unlimited budget allows hiring experts or large pools of workers for each task, but a limited budget forces one to use resources as efficiently as possible. This paper considers crowdsourcing of simple Boolean tasks. We first define a probabilistic inference technique that considers the difficulty of tasks and the expertise of workers when aggregating answers. We then propose CrowdInc, a greedy algorithm that reduces the cost needed to reach a consensual answer. CrowdInc distributes resources dynamically to tasks according to their difficulty. We show on several benchmarks that CrowdInc achieves good accuracy and reduces costs, and we compare its performance to existing solutions.
In 36, we have extended this approach to quality and cost improvement for complex tasks realized on a crowdsourcing platform. We have designed new algorithms to replicate and distribute tasks and to assemble the returned results during the realization of a complex workflow orchestrating a data production process. The algorithm is dynamic, and optimizes cost and confidence in aggregated data to decide whether tasks should be replicated. We have shown that our algorithm exhibits better performance than static allocation of tasks to crowdworkers, both in terms of cost and accuracy.
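The idea of weighting Boolean answers by worker expertise, and of hiring additional workers only while confidence is too low, can be illustrated as follows. This is a simplified sketch and not the inference technique of 29 or the CrowdInc algorithm: it assumes that workers answer independently, that each worker's accuracy is already known and lies strictly between 0 and 1, and it ignores task difficulty. The function names and the confidence threshold are hypothetical.

```python
import math

def aggregate(answers, accuracy, prior=0.5):
    """Posterior probability that the true answer is True.

    answers:  dict worker -> bool (the returned answers)
    accuracy: dict worker -> probability in (0, 1) that the worker is right
    Assumes independent workers (a naive Bayes aggregation).
    """
    log_odds = math.log(prior / (1 - prior))
    for worker, answer in answers.items():
        p = accuracy[worker]
        weight = math.log(p / (1 - p))  # experts weigh more
        log_odds += weight if answer else -weight
    return 1 / (1 + math.exp(-log_odds))

def needs_more_answers(answers, accuracy, threshold=0.9):
    """Ask for another worker only while confidence is below threshold."""
    posterior = aggregate(answers, accuracy)
    return max(posterior, 1 - posterior) < threshold
```

With this sketch, two workers of accuracy 0.6 who disagree leave the posterior at 0.5, so another answer would be requested, whereas two agreeing workers of accuracy 0.9 already give a confidence above 0.98.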
Participants : Adrian Puerto Aubel, Éric Badouel
In 10, we address the problem of component reuse in the context of service-oriented programming and more specifically for the design of user-centric distributed collaborative systems modelled by Guarded Attribute Grammars. Following the contract-based specification of components we developed an approach to an interface theory for the components of a collaborative system in three stages: we define a composition of interfaces that specifies how the component behaves with respect to its environment, we introduce an implementation order on interfaces and finally a residual operation on interfaces characterizing the systems that, when composed with a given component, can complement it in order to realize a global specification.
Participants : Sihem Cherrared, Éric Fabre, Blaise Genest, Léo Henry, Thierry Jéron, Nicolas Markey
When models are missing, learning models from observations of a system is a powerful tool. In 3, we consider learning Discrete Time Markov Chains (DTMC), with different AI methods such as frequency estimation or Laplace smoothing. While models learnt with such methods converge asymptotically towards the exact system, a more practical question in the realm of trusted machine learning is how accurate a model learnt with a limited time budget is. Existing approaches provide bounds on how close the model is to the original system, in terms of bounds on local (transition) probabilities, which have unclear implications for the global behavior. In this work, we provide global bounds on the error made by such a learning process, in terms of global behaviors formalized using temporal logic. More precisely, we propose a learning process ensuring a bound on the error in the probabilities of these properties. While such a learning process cannot exist for the full LTL logic, we provide one ensuring a bound that is uniform over all the formulas of CTL. Further, given one time-to-failure property, we provide an improved learning algorithm. Interestingly, frequency estimation is sufficient for the latter, while Laplace smoothing is needed to ensure non-trivial uniform bounds for the full CTL logic.
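The two estimators mentioned above differ only in a smoothing constant, as the following minimal sketch shows. It only illustrates the estimators themselves, not the learning process with global error bounds of 3; the function name `learn_dtmc`, the trace format (lists of state names) and the parameter `alpha` are assumptions made for this example.

```python
from collections import Counter, defaultdict

def learn_dtmc(traces, states, alpha=0.0):
    """Estimate DTMC transition probabilities from observed traces.

    alpha = 0 is plain frequency estimation;
    alpha > 0 is Laplace (additive) smoothing with constant alpha.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1

    model = {}
    for s in states:
        total = sum(counts[s].values())
        denom = total + alpha * len(states)
        if denom == 0:
            # State never left in any trace and no smoothing:
            # its outgoing distribution stays undefined.
            model[s] = {}
            continue
        model[s] = {t: (counts[s][t] + alpha) / denom for t in states}
    return model

traces = [["a", "b", "a", "b"], ["a", "a", "b"]]
frequency = learn_dtmc(traces, ["a", "b"])
smoothed = learn_dtmc(traces, ["a", "b"], alpha=1.0)
```

On these traces, frequency estimation assigns probability 0 to the unobserved transition from b to b, whereas Laplace smoothing keeps every transition probability positive. Whether a transition is present at all can change the truth value of a qualitative property, which gives some intuition for the different roles the two estimators play.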
Active learning of timed languages is concerned with the inference of timed automata by observing some of the timed words in their languages. The learner can query for the membership of words in the language, or propose a candidate model and ask if it is equivalent to the target. The major difficulty of this framework is the inference of clock resets, which are central to the dynamics of timed automata but not directly observable.
Interesting first steps have already been made by restricting to the subclass of event-recording automata 46, where clock resets are tied to observations. In order to advance towards learning of general timed automata, in 27, we generalize this method to a new class, called reset-free event-recording automata, where some transitions may reset no clocks.
Central to our contribution is the notion of invalidity, and the algorithm and data structures to deal with it, allowing on-the-fly detection and pruning of reset hypotheses that contradict observations. This notion is a key to any efficient active-learning procedure for generic timed automata.
The fault diagnosis literature (about discrete event systems) is
abundant for situations where system models including faults and
their consequences are given as a starting point. When addressing real-life
applications such as fault diagnosis in telecommunication networks,
one is immediately faced with the lack of models, already for the normal
behaviour of these systems, not to mention faults and their effects. We
have addressed this challenge from two angles. In a collaboration with
Orange Labs, we have considered a self-modeling approach that assembles
generic component models in order to fit a network architecture,
capturing the physical, virtualization, functional and service layers,
at appropriate granularities. This allows a quick adaptation to changing
configurations, an important feature of programmable networks. Models
obtained by this approach can be validated by fault injections on
platform mockups. They are then translated into an appropriate formalism
(constraint networks or Bayesian networks) to serve as the basis for
diagnosis engines. This work was at the heart of Sihem Cherrared's
thesis, defended in June 2020 32.
In another collaboration with Nokia Bell
Labs 31, we explored a different approach aiming at
detecting and characterizing soft performance degradations (instead of
abrupt failures). This is done by detecting changes in the joint
distribution of numerous performance indicators. We explored several
machine learning approaches, using data collected on a platform running
the real production software, with simulated traffic. Again, we used
(soft) fault injections to characterize network behaviours under
resource stress.
Several researchers of SUMO are involved in the joint research lab of Nokia Bell Labs France and Inria. We participate in the common
research team SAPIENS (Smart Automated and Programmable
Infrastructures for End-to-end Networks and Services), previously named “Softwarization of Everything.” This team involves several other Inria teams: Convecs, Diverse and Spades. SUMO focuses on the management of reconfigurable systems, both at the edge (IoT based
applications) and in the core (e.g. virtualized IMS
systems). In particular, we study control and diagnosis issues for
such systems.
A PhD student is involved in the project:
Abdul Majith (started in January 2019) on Controller Synthesis of Adaptive Systems, supervised by Hervé Marchand, Ocan Sankur and Dinh Thai Bui (Nokia Bell Labs).
SUMO takes part in IOLab, the common lab of Orange Labs and Inria, dedicated to the design and management of Software Defined Networks. Our activities concern the diagnosis of malfunctions in virtualized multi-tenant networks.
This collaboration supported the Cifre PhD grant of Sihem Cherrared, supervised by Éric Fabre, Gregor Goessler (Inria Spades, Grenoble) and Sofiane Imadali (Orange Labs).
Several researchers of SUMO are involved in a collaboration on the verification of real-time systems with the "Information and Network Systems (INS)" Team led by David Mentré of the "Communication & Information Systems (CIS)" Division of MERCE Rennes. The members of the team at MERCE work on different aspects of formal verification. Currently the SUMO team and MERCE jointly supervise a Cifre PhD student (Emily Clement) funded by MERCE since fall 2018; the thesis is about robustness of reachability in timed automata. Moreover, Reiya Noguchi, a young engineer and member of MERCE, on leave from a Japanese operational division of Mitsubishi, has also been hosted and co-supervised by the SUMO team one day per week since the beginning of 2019; we collaborate with him on the consistency of timed requirements.
Formal verification has been addressed for a long time. A lot of effort has been devoted to Boolean verification, i.e., formal analysis of systems that checks whether a given property is true or false.
In many settings, a Boolean verdict is not sufficient. The notions of interest are for instance the amount of confidential information leaked by a system, the proportion of some protein after a duration in some experiment in a biological system, or whether a distributed protocol satisfies some property only for a bounded number of participants. This calls for quantitative verification, in which algorithms compute a value such as the probability for a property to hold, the mean cost of runs satisfying it, or the time needed to achieve a complex workflow.
A second limitation of formal verification is the efficiency of algorithms. Even for simple questions, verification is rapidly PSPACE-complete. However, some classes of models allow polynomial time verification. The key techniques to master complexity are to use concurrency, approximation, etc.
The objective of this project is to study efficient techniques for quantitative verification, and develop efficient algorithms for models such as stochastic games, timed and concurrent systems.
The team regularly collaborates with the following researchers:
Léo Henry was awarded a grant from Rennes Métropole for a 2-month stay at the VUB (Brussels), in the team of Ann Nowé. The stay was planned to last from mid-February to mid-April, but it was stopped after one month because of the COVID-19 pandemic.
The aim of TickTac is to develop novel algorithms for the verification and synthesis of real-time systems using the timed automata formalism. One of the project's objectives is to develop an open-source and configurable model checker which will allow the community to compare algorithms. The algorithms and the tool will be used on a motion planning case study for robotics.
The objective of this project is to develop techniques to facilitate the development, deployment, and monitoring of crowd-based participative applications. This requires handling complex workflows with multiple participants, uncertainty in data collections, incentives, skills of contributors, etc. To overcome these challenges, Headwork will define rich workflows with multiple participants, data and knowledge models to capture various kinds of crowd applications with complex data acquisition tasks and human specificities. We will also address methods for deploying, verifying, optimizing, but also monitoring and adapting crowd-based workflow executions at run time.
The Inria Project Lab HAC-SPECIS (High-performance Application and Computers, Studying PErformance and Correctness In Simulation) is a transversal project internal to Inria. The goal of the HAC-SPECIS project is to answer the methodological needs raised by the recent evolution of HPC architectures by allowing application and runtime developers to study such systems both from the correctness and performance points of view. Inside this project, we collaborate with Martin Quinson (Myriads team) on the dynamic formal verification of high performance runtimes and applications. The PhD of The Anh Pham, completed in December 2019, was funded by this project.
This year was the last year of the project. The closing meeting took place virtually in November 2020, and included a presentation of our achievements on the verification of HPC applications using dynamic partial order methods. Since December 2019, Ehsan Azimi has been hired for a 2-year engineer position (ADT). His role is both to reinforce the basis of the verification engine in SimGrid, and to implement on top of it the results of The Anh Pham's PhD thesis and evaluate them on HPC benchmarks.
The team collaborates with the following researchers:
The team also has ongoing interactions with engineers at Alstom transports, originating from former industrial collaborations (in particular the P22 project).
All members of the team regularly write reviews for the main conferences in our areas of expertise (LICS, ICALP, CAV, Concur, STACS, FoSSaCS, RV, WoDES, ...).
All members of the team regularly write reviews for the main journals in our areas of expertise (TCS, LMCS, Acta Informatica, DEDS, IEEE TAC, FMSD, SOSYM, Fundamenta Informaticae, ...).
Licence: Loïc Hélouët, JAVA and algorithms, L2, 40h, INSA de Rennes, France.
Several members of the team took a training course on the "chiche project", whose aim is to organize visits of researchers in secondary schools and high schools in order to present their research (in simple words) and popularize research in computer science.