In the increasingly networked world, reliability of applications becomes ever more critical as the number of users of, e.g., communication systems, web services, transportation etc., grows steadily. Management of networked systems, in a very general sense of the term, therefore is a crucial task, but also a difficult one.
MExICo strives to take advantage of distribution by orchestrating cooperation between different agents that observe local subsystems and interact in a localized fashion.
The need for applying formal methods in the analysis and management of complex systems has long been recognized. It is with much less unanimity that the scientific community embraces methods based on asynchronous and distributed models. Centralized and sequential modeling still prevails.
However, we observe that crucial applications have increasing numbers of
users, and that networks providing services grow fast both in the number of
participants and in physical size and degree of spatial distribution.
Moreover, traditional isolated and proprietary software
products for local systems are no longer typical for emerging applications.
In contrast to traditional centralized and sequential machinery, for which purely functional specifications are sufficient, we have to account for applications being provided from diverse and non-coordinated sources. Their distribution (e.g. over the Web) must change the way we verify and manage them. In particular, one cannot ignore the impact of quantitative features such as delays or failure likelihoods on the functionalities of composite services in distributed systems.
We thus identify three main characteristics of complex distributed systems that constitute research challenges:
The increasing size and the networked nature of communication systems,
controls, distributed services, etc. confront us with an ever higher degree
of parallelism between local processes. This field of application for
our work includes telecommunication systems and composite web
services. The challenge is to provide sound theoretical foundations and
efficient algorithms for management of such systems, ranging from
controller synthesis and fault diagnosis to integration and adaptation.
While these tasks have received considerable attention in the
sequential setting, managing nonsequential behavior requires
profound modifications of existing approaches, and often the development
of new approaches altogether. We see concurrency in distributed systems as
an opportunity rather than a nuisance. Our goal is to exploit
asynchronicity and distribution as an advantage. Clever use of adequate
models, in particular partial order semantics (ranging from
Mazurkiewicz traces to event structures to MSCs) actually helps in
practice. In fact, the partial order vision allows us to make causal
precedence relations explicit, and to perform diagnosis and test for the
dependency between events. This is a conceptual advantage that
interleaving-based approaches cannot match. The two key features of our
work will be (i) the exploitation of concurrency by using
asynchronous models with partial order semantics, and (ii)
distribution of the agents performing management tasks.
Systems and services exhibit nontrivial interaction between
specialized and heterogeneous components. A coordinated interplay of several
components is required; this is challenging since each of them has only a limited, partial view of the
system's configuration. We refer to this problem as distributed
synthesis or distributed control. An aggravating factor is that
the structure of a component might be semi-transparent, which requires a
form of grey-box management.
Besides the logical functionalities of programs, the quantitative
aspects of component behavior and interaction play an increasingly
important role.
Since the creation of MExICo, the weight of quantitative aspects in
all parts of our activities has grown, be it in terms of the models considered
(weighted automata and logics), be it in transforming verification or diagnosis verdicts
into probabilistic statements (probabilistic diagnosis, statistical model checking),
or within the recently started SystemX cooperation on supervision in
multimodal transport systems.
This trend is certain to continue over the next couple of years, along with
the growing importance of diagnosis and control issues.
In another development, the theory and use of partial order semantics has gained momentum in the past four years, and we intend to strengthen our efforts and contacts in this domain so as to further develop and apply partial-order-based deduction methods.
When no complete model of the underlying dynamic system is available, the analysis
of logs may allow one to reconstruct such a model, or at least to infer some properties of interest; this activity,
which has emerged over the past 10 years on the international level, is referred to as process mining. In this emerging activity, we
have contributed to unfolding-based process discovery [CI146], and the study of process alignments
[CI121, CI96, CI83, CI60, CI33].
Finally, over the past years biological challenges have come to the center of our work, in three different directions:
It is well known that, whatever the intended form of analysis or control, a
global view of the system state leads to overwhelming numbers of
states and transitions, thus slowing down algorithms that need to explore
the state space. Worse yet, it often blurs the mechanics that are at work
rather than exhibiting them. Conversely, respecting concurrency relations
avoids exhaustive enumeration of interleavings. It allows us to focus on
`essential' properties of nonsequential processes, which are expressible
with causal precedence relations. These precedence relations are usually
called causal (partial) orders. Concurrency is the explicit absence of
such a precedence between actions that do not have to wait for one another.
Both causal orders and concurrency are in fact essential elements of a
specification. This is especially true when the specification is
constructed in a distributed and modular way. Making these ordering
relations explicit requires leaving the framework of state/interleaving-based
semantics. Therefore, we need to develop new dedicated algorithms
for tasks such as conformance testing, fault diagnosis, or control for
distributed discrete systems. Existing solutions for these problems often
rely on centralized sequential models which do not scale up well.
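The causal-order view described above can be made concrete in a few lines. The sketch below is a simplified Mazurkiewicz-trace construction (the dependence relation and the event encoding are illustrative choices, not the team's actual tooling): it extracts causal precedence and concurrency between the events of one sequential observation.

```python
from itertools import combinations

def causal_order(word, dependent):
    """Causal precedence among the events of a word, given a symmetric
    dependence relation on letters (Mazurkiewicz-trace style).
    Returns the pairs (i, j), i < j, where event i causally precedes j."""
    n = len(word)
    # direct dependence edges between earlier and later events
    prec = {(i, j) for i in range(n) for j in range(i + 1, n)
            if dependent(word[i], word[j])}
    # transitive closure
    changed = True
    while changed:
        changed = False
        for (i, j) in list(prec):
            for (k, l) in list(prec):
                if j == k and (i, l) not in prec:
                    prec.add((i, l))
                    changed = True
    return prec

def concurrent_pairs(word, dependent):
    """Pairs of events unordered by causality: they may be freely reordered."""
    prec = causal_order(word, dependent)
    return {(i, j) for i, j in combinations(range(len(word)), 2)
            if (i, j) not in prec}
```

For instance, with letters `a` and `b` independent and `c` dependent on both, the observation `abc` yields the causal order {a before c, b before c} and exhibits `a` and `b` as concurrent.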
Diagnosis for discrete event systems is a crucial task in automatic control. Our focus is on model-based diagnosis.
Model-based diagnosis starts from a discrete event model of the observed system, or rather of its relevant aspects, such as possible fault propagations, abstracting away other dimensions. From this model, an extraction or unfolding process, guided by the observation, recursively produces the explanation candidates.
Depending on the possible observations, a discrete-event system may or may not be diagnosable. Active diagnosis aims at controlling the system to render it diagnosable. We have established in [5] a memory-optimal diagnoser whose delay is at most twice the minimal delay, whereas the memory required to achieve the optimal delay may be much larger. We have also provided solutions for parametrized active diagnosis, where we automatically construct the most permissive controller respecting a given delay. Further, we introduced four variants of
diagnosability (FA, IA, FF, IF)
in (finite) probabilistic systems (pLTS), depending on whether one
considers (1) finite or infinite runs and (2) faulty or all runs. The corresponding
decision problems are PSPACE-complete. A key ingredient of the
decision procedures was a characterisation of diagnosability by the
fact that a random run almost surely lies in an open set whose
specification only depends on the qualitative behaviour of the pLTS.
For infinite pLTS, this characterisation still holds for
FF-diagnosability, and in a variant form for IF- and IA-diagnosability
when the pLTS are finitely branching. Surprisingly,
FA-diagnosability cannot be characterised in this way, even
in the finitely branching case.
Further extensions are under way, in particular in passing to prediction and prevention of faults prior to their occurrence.
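To illustrate the kind of question model-based diagnosis answers, here is a minimal belief-state diagnoser sketch for a labelled transition system. The encoding of transitions and the three-valued verdict are illustrative simplifications of the approaches discussed above, not the published constructions.

```python
def diagnose(transitions, init, obs_seq, faults, observable):
    """Belief-state diagnoser sketch.
    transitions: dict state -> list of (event, next_state).
    Tracks pairs (state, fault_seen) compatible with the observation and
    returns a verdict: 'faulty', 'correct', or 'ambiguous'."""
    def uo_closure(beliefs):
        # saturate with unobservable moves, propagating the fault flag
        stack, seen = list(beliefs), set(beliefs)
        while stack:
            s, f = stack.pop()
            for e, t in transitions.get(s, []):
                if e not in observable:
                    nxt = (t, f or e in faults)
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
        return seen

    beliefs = uo_closure({(init, False)})
    for o in obs_seq:
        step = {(t, f) for (s, f) in beliefs
                for e, t in transitions.get(s, []) if e == o}
        beliefs = uo_closure(step)
    flags = {f for _, f in beliefs}
    if flags == {True, False}:
        return 'ambiguous'
    return 'faulty' if True in flags else 'correct'
```

When both a faulty and a fault-free run explain the same observation, the verdict is 'ambiguous'; diagnosability asks whether such ambiguity can persist forever.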
In asynchronous partial-order based diagnosis with Petri nets, one unfolds the
labelled product of a Petri net model and the observation, obtaining partially ordered executions (configurations) that explain exactly the observed behaviour.
Diagnosis algorithms have to operate in contexts with low observability,
i.e., in systems where many events are invisible to the supervisor.
Checking observability and diagnosability for the
supervised systems is therefore a crucial and nontrivial task in its own
right. Analysis of the relational structure of occurrence nets allows us
to check whether the system exhibits sufficient visibility to allow
diagnosis. Developing efficient methods both for
diagnosability checking under concurrency and for the diagnosis
itself, for distributed, composite and asynchronous systems, is an important
field for the team. In 2019,
a new property, manifestability, weaker than diagnosability (and in some sense dual to opacity), has been studied in the context of automata and timed automata.
Distributed computation of unfoldings allows one to factor the unfolding of
the global system into smaller local unfoldings, by local
supervisors associated with subnetworks and communicating among each other.
In [30], [20], elements of a methodology for distributed computation of unfoldings between several supervisors, underpinned by algebraic
properties of the category of Petri nets, have been developed. Generalizations, in particular
to graph grammars, remain to be done.
Computing diagnosis in a distributed way is only one aspect of a much
vaster topic, that of distributed diagnosis (see
[28], [31]). In fact, it involves a
more abstract and often indirect reasoning to conclude whether or not some
given invisible fault has occurred. Combination of local scenarios is in
general not sufficient: the global system may have behaviors that do not
reveal themselves as faulty (or, dually, nonfaulty) on any local
supervisor's domain (compare [22], [25]).
Rather, the local
diagnosers have to join all information that is available to them
locally, and then deduce collectively further information from the
combination of their views.
In particular, even the absence of
fault evidence on all peers may allow the supervisors to deduce a fault occurrence jointly; see
[33], [34].
Automating such procedures for the supervision and management of
distributed and locally monitored asynchronous systems is a longterm goal
to which MExICo hopes to contribute.
Hybrid systems constitute a model for cyber-physical systems which integrates continuous-time dynamics (modes), governed by differential equations, and discrete transitions which switch instantaneously from one mode to another. Thanks to their ease of programming, hybrid systems have been integrated into power electronics systems, and more generally into cyber-physical systems. In order to guarantee that such systems meet their specifications, classical methods consist in finitely abstracting the systems by discretization of the (infinite) state space, and automatically deriving the appropriate mode control from the specification using standard graph techniques.
Diagnosability of hybrid systems has also been studied through an abstraction / refinement process in terms of timed automata.
Assuring the correctness of concurrent systems is notoriously difficult due to the many unforeseeable ways in which the components may interact and the resulting state-space explosion. A well-established approach to alleviate this problem is to model concurrent systems as Petri nets and analyse their unfoldings, essentially an acyclic version of the Petri net whose simpler structure permits easier analysis [29].
However, Petri nets are inadequate to model concurrent read accesses to the same resource. Such situations often arise naturally, for instance in concurrent databases or in asynchronous circuits. The encoding tricks typically used to model these cases in Petri nets make the unfolding technique inefficient. Contextual nets, which explicitly do model concurrent read accesses, address this problem. Their accurate representation of concurrency makes contextual unfoldings up to exponentially smaller in certain situations. In recent work, we further studied this subject from a theoretical and practical perspective, allowing us to develop concrete, efficient data structures and algorithms, and a tool (Cunf) that improves upon the existing state of the art. This work led to the PhD thesis of César Rodríguez in 2014.
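As a point of comparison for the unfolding approach, the following sketch enumerates the full interleaving state space of a 1-safe Petri net (the set-based encoding is an illustrative choice). For two concurrent transitions it already visits four markings, whereas an unfolding would contain just two events; with n concurrent transitions the gap grows to 2^n markings versus n events.

```python
from collections import deque

def reachable_markings(m0, pre, post):
    """Breadth-first exploration of the reachability graph of a 1-safe
    Petri net. m0: frozenset of initially marked places;
    pre/post: dict transition -> set of input/output places.
    Illustrates the interleaving state space that unfoldings avoid
    enumerating explicitly."""
    seen, queue = {m0}, deque([m0])
    while queue:
        m = queue.popleft()
        for t in pre:
            if pre[t] <= m:                       # t is enabled in m
                m2 = frozenset((m - pre[t]) | post[t])
                if m2 not in seen:
                    seen.add(m2)
                    queue.append(m2)
    return seen
```

For example, two independent transitions t1: p1 -> p3 and t2: p2 -> p4 give four reachable markings from {p1, p2}.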
Contextual unfoldings deal well with two sources of state-space explosion:
concurrency and shared resources. Recently, we proposed an improved data
structure, called contextual merged processes (CMP), to deal with
a third source of state-space explosion, i.e. sequences of choices.
The work on CMP [35] is currently at an abstract level.
In the short term, we want to put this work into practice, requiring some
theoretical groundwork, as well as programming and experimentation.
Another well-known approach to verifying concurrent systems is
partial-order reduction, exemplified by the tool Spin.
Although it is known that both partial-order reduction and unfoldings
have their respective strengths and weaknesses, we are not aware of any
conclusive comparison between the two techniques. Spin comes
with a high-level modeling language having an explicit notion of processes,
communication channels, and variables. Indeed, the reduction techniques
implemented in Spin exploit the specific properties of these features.
On the other hand, while there exist highly efficient tools for unfoldings,
Petri nets are a relatively general low-level formalism, so these techniques
do not exploit properties of higher language features. Our work on contextual
unfoldings and CMPs represents a first step toward making unfoldings exploit
richer models. In the long run, we wish to raise the unfolding technique to a
suitable high-level modelling language and develop appropriate tool support.
The use of process models has increased in the last decade due to the advent of the process mining field. Process mining techniques aim at discovering, analyzing and enhancing formal representations of the real processes executed in any digital environment. These processes can only be observed by the footprints of their executions, stored in the form of event logs. An event log is a collection of traces and is the input of process mining techniques. The derivation of an accurate formalization of an underlying process opens the door to the continuous improvement and analysis of the processes within an information system.
Process models often use true concurrency to represent actions that appear in logs with different permutations.
Among the important challenges in process mining, conformance checking is a crucial one: to assess the quality of a model (automatically discovered or manually designed) in describing the observed behavior, i.e., the event log.
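A minimal way to see conformance checking at work: if the model's runs are given explicitly as a finite set (a strong simplification of real conformance checkers, where the runs come from a Petri net), the cost of aligning an observed trace is an edit distance to the closest model run.

```python
def edit_distance(a, b):
    """Levenshtein distance between two event sequences (one-row DP)."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            # min over deletion, insertion, and (mis)match
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (x != y))
    return dp[-1]

def alignment_cost(trace, model_runs):
    """Cost of the cheapest alignment of an observed trace to any run of
    the model; 0 means the trace fits the model perfectly."""
    return min(edit_distance(trace, run) for run in model_runs)
```

An anti-alignment, conversely, would look for the model run maximizing this distance over all traces of the log.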
MExICo contributes to process mining, a field which discovers and manipulates true concurrency models and questions about their conformance to recorded event logs.
MExICo introduced anti-alignments as a tool for conformance checking. The idea of anti-alignments is to search, for a given model, an execution that deviates as much as possible from all traces observed in the log, thereby measuring the model's precision.
MExICo has also been contributing to clustering of log traces.
Perspectives about process mining in MExICo include model repair, i.e. design and implementation of techniques to incrementally improve models in order to make them fit better to observed logs, including when the log itself grows continuously.
Another direction is to handle models which manipulate data and real time, in order to propose a more accurate representation of the log traces when the events carry additional information (time stamps, identifiers, quantities, costs, etc.).
Traditional mainframe systems were proprietary and (essentially) localized;
therefore, impact of delays, unforeseen failures, etc. could be considered
under the control of the system manager. It was therefore natural, in
verification and control of systems, to focus entirely on functional
behavior.
With the increase in size of computing systems and the growing degree of compositionality and distribution, quantitative factors enter the stage:
Time and probability are thus parameters
that the management of distributed systems must
be able to handle; in addition, the cost of operations is often subject to restrictions,
or at least its minimization is desired.
The mathematical treatment of these features in
distributed systems is an important challenge,
which MExICo is addressing; the following describes our activities concerning probabilistic and
timed systems. Note that cost optimization is not a current activity but enters the picture in several intended activities.
Distributed systems featuring nondeterministic and probabilistic aspects are usually hard to analyze and, more specifically, to optimize. Furthermore, high theoretical complexity lower bounds have been established for models like partially observed Markovian decision processes and distributed partially observed Markovian decision processes. We believe that these negative results are consequences of the choice of the models rather than of the intrinsic complexity of the problems to be solved. Thus we plan to introduce new models in which the associated optimization problems can be solved more efficiently. More precisely, we start by studying connection protocols weighted by costs, and we look for online and offline strategies for optimizing the mean cost of achieving the protocol. We have been cooperating on this subject with the SUMO team at INRIA Rennes in the joint work [21]; there, we strive to synthesize, for a given MDP, a control guaranteeing a specific stationary behavior, rather than, as is usually done, maximizing some reward.
Addressing large-scale probabilistic systems requires facing the state explosion due to both the discrete part and the probabilistic part of the model. In order to deal with such systems, different approaches have been proposed:
We want to contribute to these three axes: (1) we are looking for product forms related to systems where synchronizations are more involved (as in Petri nets [6]); (2) we want to adapt methods for discrete-event systems that require some theoretical developments in the stochastic framework; and (3) we plan to address some important limitations of statistical model checking, like the expressiveness of the associated logic and the handling of rare events.
Nowadays, software systems largely depend on complex timing constraints and usually consist of many interacting local components. Among them, railway crossings, traffic control units, mobile phones, computer servers, and many more safety-critical systems are subject to particular quality standards. It is therefore becoming increasingly important to look at networks of timed systems, which allow real-time systems to operate in a distributed manner.
Timed automata are a well-studied formalism to describe reactive systems that come with timing constraints. For modeling distributed real-time systems, networks of timed automata have been considered, where the local clocks of the processes usually evolve at the same rate [32], [26]. It is, however, not always adequate to assume that distributed components of a system obey a global time. Actually, there is generally no reason to assume that different timed systems in the network refer to the same time or evolve at the same rate. Any component is rather determined by local influences such as temperature and workload.
MExICo’s research is motivated by problems of system management in several domains, such as:
Currently, we have no active cooperation on these subjects.
We began in 2014 to examine concurrency issues in systems biology, and are currently enlarging the scope of our research's applications in this direction. To see the context, note that in recent years a considerable shift of biologists' interest can be observed, from the mapping of static genotypes to gene expression, i.e. the processes in which genetic information is used in producing functional products. These processes are far from being uniquely determined by the gene itself, or even jointly with static properties of the environment; rather, regulation occurs throughout the expression processes, with specific mechanisms increasing or decreasing the production of various products, and thus modulating the outcome. These regulations are central in understanding cell fate (how does the cell differentiate? Do mutations occur? etc.), and progress there hinges on our capacity to analyse, predict, monitor and control complex and variegated processes. We have applied Petri net unfolding techniques for the efficient computation of attractors in a regulatory network, that is, to identify strongly connected reachability components that correspond to stable evolutions, e.g. of a cell that differentiates into a specific functionality (or mutation). This constitutes the starting point of a broader research effort applying Petri net unfolding techniques to regulation. In fact, the use of ordinary Petri nets for capturing regulatory network (RN) dynamics overcomes the limitations of traditional RN models: those impose e.g. monotonicity properties on the influence that one factor has upon another, i.e. always increasing or always decreasing, and were thus unable to cover all actual behaviours. Rather, we follow the more refined model of boolean networks of automata, where the local states of the different factors jointly determine which state transitions are possible.
For these connectors, ordinary PNs constitute a first approximation, improving greatly over the literature but leaving room for improvement in terms of introducing more refined logical connectors. Future work thus involves transcending this class of PN models. Via unfoldings, one has access, provided efficient techniques are available, to all behaviours of the model, rather than over- or under-approximations as previously. This opens the way to efficiently searching in particular for determinants of the cell fate: which attractors are reachable from a given stage, and what are the factors that decide in favor of one or the other attractor, etc. Our current research focusses on cellular reprogramming on the one hand, and on distributed algorithms in wild or synthetic biological systems on the other. The latter is a distributed-algorithms view on microbiological systems, with the goals both to model and analyze existing microbiological systems as distributed systems, and to design and implement distributed algorithms in synthesized microbiological systems. Envisioned major long-term goals are drug production and medical treatment via synthesized bacterial colonies. We are approaching our goal of a distributed-algorithms view of microbiological systems from several directions: (i) Timing plays a crucial role in microbiological systems. Similar to modern VLSI circuits, dominating loading effects and noise render classical delay models unfeasible. In previous work we showed limitations of current delay models and presented a class of new delay models, so-called involution channels. In [26] we showed that involution channels are still in accordance with Newtonian physics, even in the presence of noise. (ii) In [7] we analyzed metastability in circuits by a three-valued Kleene logic, presented a general technique to build circuits that can tolerate a certain degree of metastability at their inputs, and showed the presence of a computational hierarchy.
Again, we expect metastability to play a crucial role in microbiological systems, as, similar to modern VLSI circuits, loading effects are pronounced. (iii) We studied agreement problems in highly dynamic networks without stability guarantees [28], [27]. We expect such networks to occur in bacterial cultures where bacteria communicate by producing and sensing small signal molecules like AHL. Both works also have theoretically relevant implications: the work in [27] presents the first approximate agreement protocol in a multidimensional space with time complexity independent of the dimension, working also in the presence of Byzantine faults. In [28] we proved a tight lower bound on convergence rates and time complexity of asymptotic and approximate agreement in dynamic and classical static fault models. (iv) We are currently working with Manish Kushwaha (INRA) and Thomas Nowak (LRI) on biological infection models for E. coli colonies and M13 phages.
In the context of the Escape project (PhD thesis of G.K. Aguirre Samboni, started in October 2020) we are now extending our research on causal analysis of complex biological networks to the domain of ecosystems, in cooperation with INRAE researcher Cédric Gaucherel.
The cooperation with INRAE has been intensifying in 2022 with the start, in the fall, of the AMI INRAE project SMART, jointly with INRAE RECOVER (Corinne Kurt, Franck Taillader, PhD student Souhila FOUNAS), on modeling and assessing environmental multi-risks.
Currently, no active contracts or projects in this field.
The carbon footprint of our activities is generic for office work, and probably strongest in traveling. While the latter has been slowed down because of the Covid pandemic, we believe that even in the future, intelligent use of online cooperation and communication can help limit the inevitable footprint of travel to the crucial activities of cooperation and networking, avoiding physical meetings when possible.
With our Project ESCAPE, we are hoping for a strong impact on ecosystem analysis and management. Further, the research on biological regulation networks has the potential for enabling e.g. evaluation and design of medical therapies in epigenetic contexts.
A framework for discrete, nondeterministic modelling of ecosystems has been given in terms of reset Petri nets, for which the unfolding into ordinary occurrence nets has been formalized in [12] and implemented in the Ecofolder tool (see below). The study of bifurcations of an ecosystem towards a fatal set of states was developed in [16].
COSMOS is a statistical model checker for the Hybrid Automata Stochastic Logic (HASL). HASL employs Linear Hybrid Automata (LHA), a generalization of Deterministic Timed Automata (DTA), to describe accepting execution paths of a Discrete Event Stochastic Process (DESP), a class of stochastic models which includes, but is not limited to, Markov chains. As a result, HASL verification turns out to be a unifying framework where sophisticated temporal reasoning is naturally blended with elaborate reward-based analysis. COSMOS takes as input a DESP (described in terms of a Generalized Stochastic Petri Net), an LHA, and an expression Z representing the quantity to be estimated. It returns a confidence interval estimation of Z. Recently, it has been equipped with functionalities for rare event analysis.
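The core statistical idea behind such tools can be sketched generically. The code below is a plain Monte Carlo scheme with a Hoeffding confidence interval, not COSMOS's actual estimation procedure; the function names and the sampling interface are illustrative.

```python
import math
import random

def smc_estimate(sample_path_property, n_runs, delta=0.05, rng=None):
    """Statistical model checking sketch: estimate the probability that a
    random run satisfies a property, with a (1 - delta) Hoeffding
    confidence interval. `sample_path_property(rng)` draws one run of the
    stochastic model and returns True/False."""
    rng = rng or random.Random()
    hits = sum(sample_path_property(rng) for _ in range(n_runs))
    p_hat = hits / n_runs
    # two-sided Hoeffding bound: P(|p_hat - p| > eps) <= 2 exp(-2 n eps^2)
    eps = math.sqrt(math.log(2 / delta) / (2 * n_runs))
    return p_hat, (max(0.0, p_hat - eps), min(1.0, p_hat + eps))
```

With 10,000 runs of a property that holds with probability 0.3, the estimate lands close to 0.3 and the interval width is about 0.027 at delta = 0.05.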
It is easy to generate and use C code for discrete Simulink models (using only discrete blocks, which are sampled at fixed intervals) using MathWorks tools. However, this limits the expressivity of the models. In order to use more diverse Simulink models and control the flow of a multi-model simulation (with Discrete Event Stochastic Processes), we developed a Simulink Simulation Engine embedded into Cosmos.
COSMOS is written in C++.
Reset Petri nets are a particular class of Petri nets where transition firings can remove all tokens from a place without checking whether this place actually holds tokens or not. In
[9], we look at partial order semantics of reset Petri nets. In particular, we propose a pomset bisimulation for comparing their concurrent behaviours. Building on this pomset bisimulation, we then propose a generalization of the standard finite complete prefixes of unfoldings to this class of Petri nets. In a different vein, we introduce in
[12] the systematic use of (1-safe) reset Petri nets for the analysis of ecosystems, and a dedicated unfolding procedure to represent the dynamics of a reset net in an ordinary Petri net.
A crucial question in analyzing a concurrent system is to determine its long-run behaviour, and in particular whether there are irreversible choices in its evolution, leading into parts of the reachability space from which there is no return to other parts. Casting this problem in the unifying framework of safe Petri nets, our previous work [3] has provided techniques for identifying attractors, i.e. terminal strongly connected components of the reachability space, whose attraction basins we wish to determine. In
[16] we provide a solution for the case of safe Petri nets. Our algorithm uses net unfoldings and provides a map of all of the system's configurations (concurrent executions) that act as cliff-edges, i.e. any maximal extension of those configurations lies in some basin that is considered fatal. The computation turns out to require only a relatively small prefix of the unfolding.
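On an explicit reachability graph, attractors are exactly the terminal strongly connected components. A small sketch (recursive Tarjan, for illustration only; the point of the unfolding-based approach above is precisely to avoid building this graph):

```python
def terminal_sccs(graph):
    """Attractors of a finite transition graph, i.e. terminal strongly
    connected components (SCCs with no edge leaving the component).
    graph: dict node -> list of successor nodes. Recursive Tarjan,
    meant for small illustrative graphs."""
    index, low, stack, on_stack, comps = {}, {}, [], set(), []

    def dfs(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                dfs(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:        # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(comp)

    for v in graph:
        if v not in index:
            dfs(v)
    # keep only SCCs from which no edge escapes
    return [c for c in comps
            if all(w in c for v in c for w in graph.get(v, []))]
```

For instance, in a graph where states 2 and 3 cycle into each other and everything else can still leave, {2, 3} is the unique attractor.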
In classical synchronous designs, supply voltage droops can be handled by accounting for them in clock margins. However, this results in a significant performance hit even if droops are rare. By contrast, adaptive strategies detect such potentially hazardous events and either initiate a rollback to a previous state or proactively reduce clock speed in order to prevent timing violations. The performance of such solutions critically depends on a very fast response to droops. State-of-the-art solutions incur synchronization delays in the order of several clock cycles to avoid, with sufficient probability, that the clock signal is affected by metastability. We present in
[10] an all-digital circuit that can respond to droops within a fraction of a clock cycle. This is achieved by using potentially metastable measurement values to delay clock signals while they undergo synchronization, instead of after they are synchronized. The challenge is to ensure that this strategy does not lead to harmful glitches or metastable upsets within the circuit. To this end, we verify our solution by formally proving its correctness. We complement our findings by simulations of a 65 nm ASIC design confirming the results of our analysis.
Angluin's L* algorithm learns the minimal (complete) deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probably approximately correct (PAC) version replaces an equivalence query with a large enough set of random membership queries, so as to obtain high confidence in the answer. Thus it can be applied to any kind of (also non-regular) device, and may be viewed as an algorithm for synthesizing an automaton abstracting the behavior of the device based on observations. In
[11], we are interested in how Angluin's PAC learning algorithm behaves for devices which are obtained from a DFA by introducing some noise. More precisely, we study whether Angluin's algorithm reduces the noise, producing a DFA closer to the original one than the noisy device. We propose several ways to introduce the noise: (1) the noisy device inverts the classification of words w.r.t. the DFA with a small probability, (2) the noisy device modifies with a small probability the letters of the word before asking its classification w.r.t. the DFA, and (3) the noisy device combines the classification of a word w.r.t. the DFA and its classification w.r.t. a counter automaton. Our experiments were performed on several hundred DFAs. Our main contributions, bluntly stated, consist in showing that: (1) Angluin's algorithm behaves well whenever the noisy device is produced by a random process, (2) but poorly with structured noise, and (3) almost surely randomness yields systems with non-recursively enumerable languages.
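The PAC substitution of the equivalence query can be sketched as follows. The word-sampling distribution and the parameter names are illustrative choices, not the exact setup of the cited experiments.

```python
import math
import random

def pac_equivalence_query(hypothesis, device, eps=0.05, delta=0.05,
                          max_len=10, round_no=1, rng=None):
    """PAC-style equivalence query sketch: draw enough random words; if the
    hypothesis and the device agree on all of them, accept the hypothesis
    as (eps, delta)-approximately correct. The sample size follows the
    standard PAC bound for the round_no-th query."""
    rng = rng or random.Random()
    n = math.ceil((math.log(1 / delta) + round_no * math.log(2)) / eps)
    for _ in range(n):
        length = rng.randrange(max_len + 1)
        w = ''.join(rng.choice('ab') for _ in range(length))
        if hypothesis(w) != device(w):
            return w              # counterexample: behaviours differ on w
    return None                   # no disagreement found: accept hypothesis
```

Since only membership queries are used, `device` may be any black box, regular or not, noisy or not.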
Chemical reaction networks are widely used to model biochemical systems. However, as the complexity of these systems increases, the chemical reaction networks become prone to errors, both in the initial model and in subsequent updates of the model.
In 14, we present the Metaspecies-oriented Biochemical Systems Language (MobsPy), a language designed to simplify the definition of chemical reaction networks in Python. MobsPy is built around the notion of metaspecies, which are sets of species that can be multiplied to create higher-dimensional orthogonal characteristics spaces and inheritance of reactions. Reactions can modify these characteristics. For reactants, queries allow selecting a subset of a metaspecies and using it in a reaction. For products, queries specify the dimensions in which a modification occurs. We demonstrate the simplification capabilities of the MobsPy language using a running example and a circuit from the literature. The MobsPy Python package includes functions to perform both deterministic and stochastic simulations, as well as easily configurable plotting. The MobsPy package is indexed in the Python Package Index and can thus be installed via pip.
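MobsPy's own syntax is not reproduced here; as a minimal, hypothetical sketch of what the deterministic simulation of a reaction network computes, one can integrate the mass-action ODEs of a toy network A → B by forward Euler in pure Python:

```python
def simulate_crn(stoich, rates, y0, dt=0.001, t_end=5.0):
    """Deterministic mass-action simulation of a reaction network by
    forward Euler integration.
    stoich: list of (reactant_indices, product_indices) per reaction;
    rates:  rate constant per reaction;
    y0:     initial concentrations, one per species."""
    y = list(y0)
    for _ in range(int(t_end / dt)):
        dy = [0.0] * len(y)
        for (reactants, products), k in zip(stoich, rates):
            flux = k
            for s in reactants:          # mass-action: rate is k * product
                flux *= y[s]             # of reactant concentrations
            for s in reactants:
                dy[s] -= flux
            for s in products:
                dy[s] += flux
        y = [yi + dt * dyi for yi, dyi in zip(y, dy)]
    return y

# Toy network A -> B with k = 1.0: A decays roughly as exp(-t).
a, b = simulate_crn(stoich=[([0], [1])], rates=[1.0], y0=[1.0, 0.0])
```

The total concentration a + b is conserved by construction, since every reaction removes from reactants exactly what it adds to products; MobsPy's deterministic mode solves the same kind of ODE system, generated from the metaspecies definitions.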
Angluin's L* algorithm learns the minimal (complete) deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probably approximately correct (PAC) version replaces each equivalence query with a large enough set of random membership queries to obtain high confidence in the answer. It can thus be applied to any kind of (possibly non-regular) device and may be viewed as an algorithm for synthesizing an automaton abstracting the behavior of the device based on observations. In 17, we are interested in how Angluin's PAC learning algorithm behaves for devices which are obtained from a DFA by introducing some noise. More precisely, we study whether Angluin's algorithm reduces the noise and produces a DFA closer to the original one than the noisy device. We propose several ways to introduce the noise: (1) the noisy device inverts the classification of words w.r.t. the DFA with a small probability, (2) the noisy device modifies, with a small probability, the letters of the word before asking for its classification w.r.t. the DFA, and (3) the noisy device combines the classification of a word w.r.t. the DFA with its classification w.r.t. a counter automaton. Our experiments were performed on several hundred DFAs. Our main contributions, bluntly stated, consist in showing that: (1) Angluin's algorithm behaves well whenever the noisy device is produced by a random process, (2) but poorly with structured noise, and (3) that almost surely randomness yields systems with non-recursively enumerable languages.
In 13, we study conformance checking for timed models, that is, process models that consider both the sequence of events in a process and the timestamps at which each event is recorded. Time-aware process mining is a growing subfield of research, and as tools that seek to discover timing-related properties in processes develop, so does the need for conformance checking techniques that can tackle time constraints and provide insightful quality measures for time-aware process models. In particular, one of the most useful conformance artefacts is the alignment, that is, the minimal set of changes necessary to correct a new observation so that it conforms to a process model. In this paper, we formalize the timed alignment problem and solve two cases, each corresponding to a different metric over timed processes. For the first, we give an algorithm whose time complexity is linear in both the size of the observed trace and that of the process model; for the second, we give a quadratic-time algorithm for linear process models.
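The metrics and model classes of 13 are richer than what fits here; the following toy sketch (our own simplification, not the paper's algorithm) illustrates what a linear-time timed alignment can look like on a purely sequential model in which each inter-event delay must lie in a given interval:

```python
def timed_alignment_cost(times, delay_bounds):
    """Toy timed alignment on a sequential model (illustrative only).
    times:        observed timestamps, one per event, in order;
    delay_bounds: allowed interval (lo, hi) for each inter-event delay.
    Each observed delay is independently clamped into its interval, so a
    timestamp error need not propagate to successor events; the cost is
    the total correction applied. Runs in time linear in the trace."""
    cost = 0.0
    aligned_delays = []
    for (lo, hi), (t0, t1) in zip(delay_bounds, zip(times, times[1:])):
        d = t1 - t0
        d_fixed = min(max(d, lo), hi)   # clamp the delay into [lo, hi]
        cost += abs(d_fixed - d)
        aligned_delays.append(d_fixed)
    return cost, aligned_delays

# Model: both delays must lie in [1, 2]; observed delays are 0.5 and 3.0.
cost, fixed = timed_alignment_cost([0.0, 0.5, 3.5], [(1, 2), (1, 2)])
```

Here the first delay is stretched to 1 and the second shrunk to 2, for a total cost of 1.5; the actual algorithms in 13 handle general process models and the metric-dependent interaction between corrections.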
In 19, we solve the case where the metric used to compare timed processes allows mixed moves, i.e., an error on the timestamp of an event may or may not have propagated to its successors, and provide linear-time algorithms for distance computation and alignment on models with sequential causal processes.

2018-2022 Member of the scientific orientation council (conseil d'orientation scientifique, COS) of the LIS laboratory in Marseille (UMR 7020)
Stefan Haar is supervising,
In addition, he supervised the M1 internship of Sunheang Ty on Quantum Boolean Networks.