Asynchronous cellular automata and applications - Special issue of Natural Computing

MAIA Autonomous intelligent machine

Data and Knowledge Representation and Processing

Perception, Cognition and Interaction

http://www.loria.fr/equipes/maia/

2002 September 01

Laboratoire lorrain de recherche en informatique et ses applications (LORIA), CNRS, Université de Lorraine

Artificial Intelligence, Robotics, Planning, Machine Learning, Multi-Agent Systems

Team members

François Charpillet (Researcher, Nancy): Team leader, Inria, Senior Researcher, HDR

Olivier Buffet (Researcher, Nancy): Inria, Researcher

Alain Dutech (Researcher, Nancy): Inria, Researcher, HDR

Nazim Fatès (Researcher, Nancy): Inria, Researcher

Bruno Scherrer (Researcher, Nancy): Inria, Researcher

Mauricio Araya (PostDoc, Nancy): Univ. Lorraine, ATER until Aug 2013

Amine Boumaza (Faculty member, Nancy): Univ. Lorraine, Associate Professor

Christine Bourjot (Faculty member, Nancy): Univ. Lorraine, Associate Professor

Vincent Chevrier (Faculty member, Nancy): Univ. Lorraine, Associate Professor, HDR

Tomas Navarrete Gutierrez (PostDoc, Nancy): Univ. Lorraine, until Aug 2013

Alexis Scheuer (Faculty member, Nancy): Univ. Lorraine, Associate Professor

Olivier Simonin (Faculty member, Nancy): Univ. Lorraine, Associate Professor, until Sept 2013, HDR

Vincent Thomas (Faculty member, Nancy): Univ. Lorraine, Associate Professor

Maan El-Badaoui-El-Najjar (External collaborator, Nancy): Univ. Lille I

Nicolas Beaufort (Technical staff, Nancy): Inria

Maxime Rio (Technical staff, Nancy): Inria, from Nov 2013

Mihai Andries (PhD, Nancy): Inria, CORDI-S on project PAL

Antoine Bautin (PhD, Nancy): Inria, until Nov 2013

Olivier Bouré (PhD, Nancy): Univ. Lorraine, MESR, until Sep 2013

Benjamin Camus (PhD, Nancy): Univ. Lorraine

Lucie Daubigney (PhD, Nancy): Supélec, supervised with O. Pietquin, until Oct 2013

Abdallah Dib (PhD, Nancy): Inria and Caisse des Dépôts et Consignations

Amandine Dubois (PhD, Nancy): Univ. Lorraine

Arsène Fansi Tchango (PhD, Nancy): CIFRE Inria and Thales

Iñaki Fernandèz (PhD, Nancy): Univ. Lorraine, from Feb 2013

Nassim Kaldé (PhD, Nancy): Univ. Lorraine, MESR

Manel Tagorti (PhD, Nancy): Inria, ANR BARQ project

Mohamed Tlig (PhD, Nancy): Univ. Lorraine

Julien Vaubourg (PhD, Nancy): Inria and EDF, from Oct 2013

Jano Yazbeck (PhD, Nancy): Inria, granted by Univ. Lorraine, until Dec 2013

Jilles Dibangoye (PostDoc, Nancy): Inria

Boris Lesner (PostDoc, Nancy): Inria, ANR BARQ project, until Feb 2013

Xuan Son Nguyen (PostDoc, Nancy): Inria, ANR INSTITUT CARNOT ABDT project, from Sep 2013

Véronique Constant (Assistant, Nancy): Inria

Laurence Félicité (Assistant, Nancy): Univ. Lorraine

Thomas Moinel (Other, Nancy): Univ. Lorraine, M2R Internship, from Mar until Jul 2013

Overall Objectives
Introduction

The objective of the MAIA team (MAIA stands for “MAchine Intelligente et Autonome”, that is “Autonomous and Intelligent MAchine”) is to address foundational and engineering aspects of artificial intelligence. Within this general framework, the team investigates the design and understanding of intelligent agents (in the field of artificial intelligence, an “agent” refers to an entity) which autonomously perceive and act upon an environment so as to achieve one or several goals. The MAIA group addresses the design of a single agent, a team of agents or a large number of agents. This common objective is considered from two perspectives, organized around two lines of research:

The first research activity is about sequential decision making. It has been influenced by Stuart Russell and Peter Norvig, who consider that an agent should be rational. According to them: “For each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure”. This view makes Markov decision processes (MDPs), and more generally sequential decision making, a good candidate for building the behavior of an agent. It probably explains why MDPs have received considerable attention in recent years from the artificial intelligence (AI) community.

The second activity is about understanding and engineering reactive multi-agent systems. It is influenced by research results from the field of behavioral biology which provides key insights for understanding how intelligent and adaptive behaviors appear in natural swarm systems. This encourages us to study principles of emergent behaviors in natural systems and apply them to the design of artificial intelligent systems. Reactive multi-agent systems are good candidates for building such autonomous and adaptive systems and our work mainly focuses on better understanding how we can soundly build such systems.

Highlights of the Year

The MAIA team was recognized as “the most influential team of the research field” during the French conference on Planning, Decision and Learning (JFPDA 2013).

M. Tlig, O. Buffet and O. Simonin received the Best Paper Award for their paper presented at RJCIA-13.

Research Program
Sequential Decision Making
Synopsis and Research Activities
Sequential decision making consists, in a nutshell, in controlling the actions of an agent facing a problem whose solution requires not one but a whole sequence of decisions. This kind of problem occurs in a multitude of forms. For example, important applications addressed in our work include: Robotics, where the agent is a physical entity moving in the real world; Medicine, where the agent can be an analytic device recommending tests and/or treatments; Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network; and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process. Our work on such problems is characterized by three main lines of research:

(A) Understanding how, and to what extent, to best model the problems.

(B) Developing algorithms solving the problems and understanding their behavior.

(C) Applying our results to complex applications.

Before we describe some details of our work, it is instructive to understand the basic forms of problems we are addressing. We characterize problems along the following main dimensions:

(1) Extent of the model: full vs. partial vs. none. This dimension concerns how complete we require the model of the problem – if any – to be. If the model is incomplete, then learning techniques are needed along with the decision making process.

(2) Form of the model: factored vs. enumerative. Enumerative models explicitly list all possible world states and the associated actions etc. Factored models can be exponentially more compact, describing states and actions in terms of their behavior with respect to a set of higher-level variables.

(3) World dynamics: deterministic vs. stochastic. This concerns our initial knowledge of the world the agent is acting in, as well as the dynamics of actions: is the outcome known a priori or are several outcomes possible?

(4) Observability: full vs. partial. This concerns our ability to observe what our actions actually do to the world, i.e., to observe properties of the new world state. Obviously, this is an issue only if the world dynamics are stochastic.

These dimensions are widespread in the AI literature and are not exhaustive; in particular, the MAIA team is also interested in discrete/continuous and centralized/decentralized problems. The complexity of solving a problem – both in theory and in practice – depends heavily on where it resides in this categorization. A common practice is to address simplified problems, leading to perhaps sub-optimal solutions, while trying to characterize how far from the optimal solution we stand.

In what follows, we outline the main formal frameworks on which our work is based; while doing so, we highlight in a little more detail our core research questions. We then give a brief summary of how our work fits into the global research context.
Formal Frameworks
Deterministic Sequential Decision Making
Sequential decision making with deterministic world dynamics is most commonly known as planning, or classical planning. Obviously, in such a setting every world state needs to be considered at most once, and thus enumerative models do not make sense (the problem description would have the same size as the space of possibilities to be explored). Planning approaches support factored description languages in which complex problems can be modeled in a compact way. Approaches to automatically learn such factored models do exist; however, most works – and also most of our works on this form of sequential decision making – assume that the model is provided by the user of the planning technology. Formally, a problem instance, commonly referred to as a planning task, is a four-tuple $\langle V, A, I, G \rangle$. Here, $V$ is a set of variables; a value assignment to the variables is a world state. $A$ is a set of actions described in terms of two formulas over $V$: their preconditions and effects. $I$ is the initial state, and $G$ is a goal condition (again a formula over $V$). A solution, commonly referred to as a plan, is a schedule of actions that is applicable to $I$ and achieves $G$.
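As an illustration of the four-tuple above, the following minimal sketch encodes a toy planning task and finds a plan by breadth-first search. It rests on simplifying assumptions that are not part of the report: preconditions, effects and the goal are restricted to conjunctions of `variable = value` pairs, and the names `Action`, `bfs_plan` and the key-fetching task are hypothetical. Blind search of this kind only works on very small tasks.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: dict  # variable -> value required before the action
    eff: dict  # variable -> value set by the action


def holds(state, condition):
    """True if every (variable, value) pair of the condition holds in the state."""
    return all(state.get(var) == val for var, val in condition.items())


def bfs_plan(actions, init, goal):
    """Return a shortest list of action names leading from init to goal, or None."""
    def freeze(state):
        return tuple(sorted(state.items()))  # hashable view of a state

    queue = deque([(init, [])])
    seen = {freeze(init)}
    while queue:
        state, plan = queue.popleft()
        if holds(state, goal):
            return plan
        for action in actions:
            if holds(state, action.pre):
                successor = {**state, **action.eff}
                key = freeze(successor)
                if key not in seen:  # each world state is expanded at most once
                    seen.add(key)
                    queue.append((successor, plan + [action.name]))
    return None


# Toy task (hypothetical): fetch a key located in room B and bring it back to room A.
ACTIONS = [
    Action("move-A-B", {"robot": "A"}, {"robot": "B"}),
    Action("move-B-A", {"robot": "B"}, {"robot": "A"}),
    Action("pick-key", {"robot": "B", "has_key": False}, {"has_key": True}),
]
INIT = {"robot": "A", "has_key": False}
GOAL = {"robot": "A", "has_key": True}
PLAN = bfs_plan(ACTIONS, INIT, GOAL)
```

Here `PLAN` is the three-step schedule that moves to room B, picks up the key and moves back; a goal that no action can achieve yields `None`.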

Planning is PSPACE-complete even under strong restrictions on the formulas allowed in the planning task description. Research thus revolves around the development and understanding of search methods, which explore, in a variety of different ways, the space of possible action schedules. A particularly successful approach is heuristic search, where search is guided by information obtained in an automatically designed relaxation (simplified version) of the task. We investigate the design of relaxations, the connections between such design and the search space topology, and the construction of effective planning systems that exhibit good practical performance across a wide range of different inputs. Other important research lines concern the application of ideas successful in planning to stochastic sequential decision making (see next), and the development of technology supporting the user in model design.
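To make the notion of relaxation concrete, here is a minimal sketch of one well-known relaxation, the delete relaxation, in which actions only ever add facts. The report does not say which relaxations the team uses, so this is an illustrative assumption, as are the propositional fact encoding and the name `relaxed_layers`. The function counts how many parallel layers of relaxed actions are needed before all goal facts are reachable, which gives an optimistic estimate of the plan length.

```python
def relaxed_layers(initial_facts, actions, goal):
    """Count relaxed layers until every goal fact is reached; None if unreachable.

    `actions` is a list of (preconditions, add_effects) pairs of frozensets.
    Delete effects are dropped, which is what makes the task "relaxed".
    """
    reached = set(initial_facts)
    layers = 0
    while not goal <= reached:
        new_facts = set()
        for preconditions, add_effects in actions:
            if preconditions <= reached:
                new_facts |= add_effects - reached
        if not new_facts:  # fixpoint reached without the goal: relaxed task unsolvable
            return None
        reached |= new_facts
        layers += 1
    return layers


# Toy task (hypothetical): fetch a key in room B and come back to room A.
RELAXED_ACTIONS = [
    (frozenset({"at-A"}), frozenset({"at-B"})),     # move A -> B
    (frozenset({"at-B"}), frozenset({"at-A"})),     # move B -> A
    (frozenset({"at-B"}), frozenset({"has-key"})),  # pick up the key
]
ESTIMATE = relaxed_layers({"at-A"}, RELAXED_ACTIONS, frozenset({"at-A", "has-key"}))
```

On this toy task the estimate is 2, whereas the real task needs 3 actions: the relaxation forgets that moving to room B makes `at-A` false. Such an estimate can guide a search algorithm toward states that appear closer to the goal.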
Stochastic Sequential Decision Making
Markov Decision Processes (MDPs) are a natural framework for stochastic sequential decision making. An MDP is a four-tuple $\langle S, A, T, r \rangle$, where $S$ is a set of states, $A$ is a set of actions, $T(s, a, s') = P(s' \mid s, a)$ is the probability of transitioning to $s'$ given that action $a$ was chosen in state $s$, and $r(s, a, s')$ is the (possibly stochastic) reward obtained from taking action $a$ in state $s$ and transitioning to state $s'$. In this framework, one looks for a strategy: a precise way of specifying the sequence of actions that induces, on average, an optimal sum of discounted rewards $E[\sum_{t=0}^{\infty} \gamma^{t} r_{t}]$. Here, $(r_{0}, r_{1}, \dots)$ is the infinitely long (random) sequence of rewards induced by the strategy, and $\gamma \in (0, 1)$ is a discount factor putting more weight on rewards obtained earlier. Central to the MDP framework is the Bellman equation, which characterizes the optimal value function $V^{*}$:
$\forall s \in S, \quad V^{*}(s) = \max_{a \in A} \sum_{s' \in S} T(s, a, s') [r(s, a, s') + \gamma V^{*}(s')].$
Once the optimal value function is computed, it is straightforward to derive an optimal strategy, which is deterministic and memoryless, i.e., a simple mapping from states to actions. Such a strategy is usually called a policy. An optimal policy is any policy $\pi^{*}$ that is greedy with respect to $V^{*}$, i.e., which satisfies:
$\forall s \in S, \quad \pi^{*}(s) \in \arg\max_{a \in A} \sum_{s' \in S} T(s, a, s') [r(s, a, s') + \gamma V^{*}(s')].$
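The two equations above can be turned into code almost literally. The sketch below uses value iteration, a standard way of solving the Bellman equation by repeated application of its right-hand side; it is given as an illustration, not as the team's own solver. The two-state MDP, the deterministic reward function and the names `q_value`, `value_iteration` and `greedy_policy` are assumptions made for the example.

```python
def q_value(s, a, V, T, r, gamma):
    """Bracketed term of the Bellman equation for one state-action pair."""
    return sum(p * (r(s, a, s2) + gamma * V[s2]) for s2, p in T[(s, a)].items())


def value_iteration(states, actions, T, r, gamma, tol=1e-10):
    """Iterate the Bellman update until the largest change falls below tol."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, T, r, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(states, actions, V, T, r, gamma):
    """Policy that is greedy with respect to V: one maximizing action per state."""
    return {s: max(actions, key=lambda a: q_value(s, a, V, T, r, gamma)) for s in states}


# Toy MDP (hypothetical): "go" switches state with probability 0.8, "stay" never moves.
STATES = (0, 1)
ACTIONS = ("stay", "go")
T = {
    (0, "stay"): {0: 1.0},
    (0, "go"): {1: 0.8, 0: 0.2},
    (1, "stay"): {1: 1.0},
    (1, "go"): {0: 0.8, 1: 0.2},
}


def reward(s, a, s2):
    return 1.0 if s2 == 1 else 0.0  # reward 1 whenever the next state is 1


GAMMA = 0.5
V_STAR = value_iteration(STATES, ACTIONS, T, reward, GAMMA)
PI_STAR = greedy_policy(STATES, ACTIONS, V_STAR, T, reward, GAMMA)
```

For this toy MDP the equations can be checked by hand: staying in state 1 gives $V^{*}(1) = 1 + 0.5 V^{*}(1)$, hence $V^{*}(1) = 2$, and going from state 0 gives $V^{*}(0) = 0.8 (1 + 1) + 0.2 \cdot 0.5 V^{*}(0)$, hence $V^{*}(0) = 16/9$.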
An important extension of MDPs, known as Partially Observable MDPs (POMDPs), makes it possible to account for the fact that the state may not be fully available to the decision maker. While the goal is the same as in an MDP (optimizing the expected sum of discounted rewards), the solution is more intricate. Any POMDP can be seen to be equivalent to an MDP defined on the space of probability distributions on states, called belief states. The Bellman machinery then applies to the belief states. The specific structure of the resulting MDP makes it possible to iteratively approximate the optimal value function – which is convex in the belief space – by piecewise linear functions, and to deduce an optimal policy that maps belief states to actions. A further extension, known as a DEC-POMDP, considers $n \geq 2$ agents that need to control the state dynamics in a decentralized way without direct communication.
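The belief state mentioned above is maintained with Bayes' rule: after taking action $a$ and receiving observation $o$, the new belief satisfies $b'(s') \propto O(o \mid a, s') \sum_{s} T(s, a, s') b(s)$. The observation function $O$ is not defined in the text and is introduced here as an assumption, as are the function name `belief_update` and the two-state listening example.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes update of a belief state after one action and one observation.

    T[(s, a)] maps next states to probabilities; O[(a, s2)] maps observations
    to probabilities (O is an assumed observation model, not defined in the text).
    """
    unnormalized = {}
    for s2 in belief:
        predicted = sum(belief[s] * T[(s, action)].get(s2, 0.0) for s in belief)
        unnormalized[s2] = O[(action, s2)].get(observation, 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: weight / total for s2, weight in unnormalized.items()}


# Toy model (hypothetical): the hidden state is "left" or "right"; listening does
# not change it and reports the correct side with probability 0.85.
T = {("left", "listen"): {"left": 1.0}, ("right", "listen"): {"right": 1.0}}
O = {
    ("listen", "left"): {"hear-left": 0.85, "hear-right": 0.15},
    ("listen", "right"): {"hear-left": 0.15, "hear-right": 0.85},
}
B0 = {"left": 0.5, "right": 0.5}
B1 = belief_update(B0, "listen", "hear-left", T, O)
B2 = belief_update(B1, "listen", "hear-left", T, O)
```

Starting from a uniform belief, one `hear-left` observation raises the probability of `left` to 0.85, and a second one raises it further. A POMDP policy then maps such beliefs, rather than hidden states, to actions.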

The MDP model described above is enumerative, and the complexity of computing the optimal value function is polynomial in the size of that input. However, in examples of practical size, that complexity is still too high, so naïve approaches do not scale. We consider the following situations: (i) when the state space is large, we study approximation techniques from both a theoretical and practical point of view; (ii) when the model is unknown, we study how to learn an optimal policy from samples (this problem is also known as Reinforcement Learning); (iii) in factored models, where MDP models are a strict generalization of classical planning – and are thus at least PSPACE-hard to solve – we consider using search heuristics adapted from such (classical) planning.
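Situation (ii), learning from samples when the model is unknown, can be illustrated with tabular Q-learning, a textbook reinforcement learning algorithm that is used here purely as an example and is not claimed to be the method studied by the team. The learner only calls a simulator, `step`, and never reads transition probabilities; the simulator, the exploration scheme (uniformly random actions) and all parameter values are assumptions chosen for the sketch.

```python
import random

ACTIONS = ("stay", "go")


def step(state, action, rng):
    """Hypothetical simulator: 'go' switches state with probability 0.8."""
    if action == "go" and rng.random() < 0.8:
        next_state = 1 - state
    else:
        next_state = state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


def q_learning(n_steps=20000, gamma=0.5, alpha=0.1, seed=0):
    """Estimate action values from sampled transitions only."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}
    state = 0
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)  # pure exploration: uniformly random actions
        next_state, reward = step(state, action, rng)
        target = reward + gamma * max(Q[(next_state, b)] for b in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = next_state
    return Q


Q = q_learning()
LEARNED_POLICY = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```

With a constant step size `alpha` the estimates keep fluctuating around the true action values, but on this small example the gaps between actions are large enough for the greedy policy to be recovered: go when in state 0, stay when in state 1.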

Solving a POMDP is PSPACE-hard even given an enumerative model. In this framework, we are mainly looking for assumptions that could be exploited to reduce the complexity of the problem at hand, for instance when some actions have no effect on the state dynamics (active sensing). The decentralized version, DEC-POMDP, induces a significant increase in complexity (NEXP-complete). We tackle the challenging – even for (very) small state spaces – exact computation of finite-horizon optimal solutions through alternative reformulations of the problem. We also aim at proposing advanced heuristics to efficiently address problems with more agents and a longer time horizon.
Understanding and mastering complex systems
General context
There exist numerous examples of natural and artificial systems where self-organization and emergence occur. Such systems are composed of a set of simple entities interacting in a shared environment and exhibit complex collective behaviors resulting from the interactions of the local (or individual) behaviors of these entities. The properties that they exhibit, for instance robustness, explain why their study has been growing, both in academia and in industry. They are found in a wide range of fields such as sociology (opinion dynamics in social networks), ecology (population dynamics), economics (financial markets, consumer behaviors), ethology (swarm intelligence, collective motion), cellular biology (cells/organs), computer networks (ad-hoc or P2P networks), etc.

More precisely, the systems we are interested in are characterized by:

locality: Elementary components have only a partial perception of the system's state. Similarly, a component can only modify its surrounding environment.

individual simplicity: Components have a simple behavior; in most cases it can be modeled by stimulus/response laws or by look-up tables. One way to estimate this simplicity is, for instance, to count the number of stimulus/response rules.

emergence: It is generally difficult to predict the global behavior of the system from the local individual behaviors. This difficulty of prediction is often observed empirically, and in some cases (e.g., cellular automata) one can show that the prediction of the global properties of a system is an undecidable problem. However, observations coming from simulations of the system may help us to find the regularities that occur in the system's behavior (even in a probabilistic sense). Our interest is to work on problems where a full mathematical analysis seems out of reach and where it is useful to observe the system with large simulations. In return, the properties observed empirically are frequently studied afterwards on an analytical basis. This approach should allow us to understand where the frontier between simulation and analysis lies.

levels of description and observation: Describing a complex system involves at least two levels: the micro level, which concerns how a component behaves, and the macro level, associated with the collective behavior. Usually, understanding a complex system requires linking the description of a component's behavior with the observation of a collective phenomenon: establishing this link may require various levels, which can be obtained only with a careful analysis of the system.

We now describe the type of models that are studied in our group.
Multi-agent models
We represent these complex systems with reactive multi-agent systems (RMAS). Multi-agent systems are defined by a set of reactive agents, an environment, a set of interactions between agents and a resulting organization. They are characterized by a decentralized control shared among agents: each agent has an internal state, has access to local observations and influences the system through stimulus response rules. Thus, the collective behavior results from individual simplicity and successive actions and interactions of agents through the environment.
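The ingredients listed above (reactive agents, a shared environment, local observations and stimulus/response rules) can be illustrated with a deliberately small simulation. Everything in the sketch is a hypothetical toy model rather than one of the team's systems: agents live on a ring of cells, perceive only the marks on their two neighboring cells, move toward the more heavily marked one and leave a mark where they arrive.

```python
import random


def rmas_step(positions, marks, rng):
    """One round of the toy model: every agent perceives, moves and marks.

    Agents are updated one after the other, so an agent can react to marks
    deposited earlier in the same round (one possible execution model).
    """
    n_cells = len(marks)
    new_positions = []
    for position in positions:
        left, right = (position - 1) % n_cells, (position + 1) % n_cells
        # Stimulus/response rule based on local perception only.
        if marks[left] > marks[right]:
            position = left
        elif marks[right] > marks[left]:
            position = right
        else:
            position = rng.choice((left, right))
        marks[position] += 1  # the agent influences the system through the environment
        new_positions.append(position)
    return new_positions


def simulate(n_cells=20, n_agents=5, n_steps=50, seed=0):
    rng = random.Random(seed)
    marks = [0] * n_cells
    positions = [rng.randrange(n_cells) for _ in range(n_agents)]
    for _ in range(n_steps):
        positions = rmas_step(positions, marks, rng)
    return positions, marks


FINAL_POSITIONS, FINAL_MARKS = simulate()
```

No agent holds a global view or a shared plan; whatever spatial pattern of marks appears results only from repeated local interactions, which is the kind of local-to-global link that such simulations are used to investigate.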

Reactive multi-agent systems present several advantages for modeling complex systems:

agents are explicitly represented in the system and have the properties of local action, interaction and observation;

each agent can be described independently of the other agents, so multi-agent systems allow explicit heterogeneity among agents, which is often at the root of collective emergent phenomena;

multi-agent systems can be executed through simulation and provide good models to investigate the complex link between global and local phenomena for which analytic studies are hard to perform.

By proposing two different levels of description, the local level of the agents and the global level of the phenomenon, and several execution models, multi-agent systems constitute an interesting tool to study the link between local and global properties.

Despite a widespread use of multi-agent systems, their framework still needs many improvements to be fully accessible to computer scientists from various backgrounds. For instance, there is no generic model to mathematically define a reactive multi-agent system and to describe its interactions. This situation is in contrast with the field of cellular automata, for instance, and underlines that a unification of multi-agent systems under a general framework is a question that still remains to be tackled. We now list the different challenges that, in part, contribute to such an objective.
Current challenges
Our work is structured around the following challenges that combine both theoretical and experimental approaches.
Providing formal frameworks
A widespread and consensual formal definition of a multi-agent system is lacking. Our research aims at translating the concepts from the field of complex systems into the multi-agent systems framework.

One objective of this research is to remove the potential ambiguities that can appear if one describes a system without explicitly formulating each aspect of the simulation framework. As a benefit, the reproduction of experiments is facilitated. Moreover, this approach is intended to provide better insight into the self-organization properties of the systems.

Another important question consists in monitoring the evolution of complex systems. Our objective is to provide some quantitative characteristics of the system such as local or global stability, robustness, complexity, etc. Describing our models as dynamical systems leads us to use specific tools of this mathematical theory as well as statistical tools.
Controlling complex dynamical systems
Since there is no central control of our systems, one question of interest is to know under which conditions it is possible to guarantee a given property when the system is subject to perturbations. We tackle this issue by designing exogenous control architectures where control actions are envisaged as perturbations in the system. As a consequence, we seek to develop control mechanisms that can change the global behavior of a system without modifying the agent behavior (and not violating the autonomy property).
Designing systems
The aim is to design individual behaviors and interactions in order to produce a desired collective output. This output can be a collective pattern to reproduce in case of simulation of natural systems. In that case, from individual behaviors and interactions we study if (and how) the collective pattern is produced. We also tackle “inverse problems” (decentralized gathering problem, density classification problem, etc.) which consist in finding individual behaviors in order to solve a given problem.
Application Domains
Decision Making
Our group is involved in several applications of its more fundamental work on autonomous decision making and complex systems. Applications addressed include:

Robotics, where the decision maker or agent is supported by a physical entity moving in the real world;

Medicine or Personally Assisted Living, where the agent can be an analytic device recommending tests and/or treatments, or a system able to combine different sources of information (sensors, for example) in order to help an end user, for example by detecting abnormal situations that require assistance (fall detection for elderly people, risk of hospitalization of a person suffering from a chronic disease);

Active Sensing, where decisions have to be taken in order to gather information on a system. This can be applied to many fields, such as monitoring the integrity of airplane wings or the behavior of people in public areas.

Ambient intelligence
As the scientific strategy of the Nancy – Grand Est Research Center promotes the development of platforms on Robotics and Smart Living Apartments, some members of the team have refocused their research on “ambient intelligence and AI”. This choice is backed by the Inria Large-scale initiative project PAL (Personally Assisted Living), in which we are strongly involved. The regional council of Lorraine also supports this new research line through the CPER (project "situated computing" or "INFOSITU", infositu.loria.fr), whose coordinator is a member of the MAIA team. Within this new domain of research in MAIA, we explore how intelligent decentralized complex systems can help design intelligent environments dedicated to elderly people with loss of autonomy. This domain of research is currently very active, taking up a societal challenge that developed countries have to address.
Software and Platforms
AA4MM
Participants: Vincent Chevrier (contact), Benjamin Camus, Julien Vaubourg.
Laurent Ciarletta (Madynes team, LORIA) is a collaborator and contact for this software. Yannick Presse (Madynes team, LORIA) is a collaborator for this software.

AA4MM (Agents and Artefacts for Multi-modeling and Multi-simulation) is a framework for coupling existing and heterogeneous models and simulators in order to model and simulate complex systems. The first implementation of the AA4MM meta-model was proposed in Julien Siebert's PhD and written in Java. A newer version with more coupling models is currently submitted to the APP (Agence pour la protection des programmes).

This year, we used this software in a strategic action with EDF R&D in the context of the simulation of smart-grids.
MASDYNE
Participants: Vincent Chevrier (contact), Tomas Navarrete.
This work was undertaken in the PhD thesis of Julien Siebert, a joint thesis between the MAIA and Madynes teams. Laurent Ciarletta (Madynes team, LORIA) was co-advisor of this PhD and is a contact for this software.

Other contributors to this software were: Tom Leclerc, François Klein, Christophe Torin, Marcel Lamenu, Guillaume Favre and Amir Toly.

MASDYNE (Multi-Agent Simulator of DYnamic Networks usErs) is a multi-agent simulator for modeling and simulating user behaviors in mobile ad hoc networks. This software is part of joint work with the Madynes team on modeling and simulation of ubiquitous networks. It has been updated by Tomas Navarrete with new functionalities for the simulation of scenarios.
FiatLux
Participants: Nazim Fatès (contact).
FiatLux is a discrete dynamical systems simulator that allows the user to experiment with various models (for example 1D and 2D cellular automata, moving agents on cellular automata) and to perturb them. Its main feature is to allow users to change the type of updating, for example from a deterministic parallel updating to an asynchronous random updating. FiatLux has a Graphical User Interface and can also be launched in batch mode for experiments that require statistics.

In 2013, FiatLux was officially registered by the Agence pour la protection des programmes (APP). A new release is available under the CeCILL licence on the FiatLux website: fiatlux.loria.fr
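The difference between parallel and asynchronous updating can be sketched on a one-dimensional elementary cellular automaton. The code below is an independent illustration and does not use or reproduce the FiatLux API: the function names are hypothetical, and the asynchronous scheme shown (each cell applies the rule with probability `alpha`, otherwise it keeps its state) is only one of several possible ways to define random asynchronous updating.

```python
import random


def local_rule(rule_number, left, center, right):
    """Next state of a cell under an elementary rule given in Wolfram numbering."""
    return (rule_number >> (4 * left + 2 * center + right)) & 1


def synchronous_step(cells, rule_number):
    """Deterministic parallel updating: every cell applies the rule at once."""
    n = len(cells)
    return [
        local_rule(rule_number, cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        for i in range(n)
    ]


def asynchronous_step(cells, rule_number, alpha, rng):
    """Random updating: each cell applies the rule with probability alpha."""
    n = len(cells)
    return [
        local_rule(rule_number, cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        if rng.random() < alpha
        else cells[i]
        for i in range(n)
    ]


RNG = random.Random(0)
START = [0, 0, 1, 0, 0]
SYNC = synchronous_step(START, 90)             # rule 90: XOR of the two neighbors
FULL = asynchronous_step(START, 90, 1.0, RNG)  # alpha = 1 recovers parallel updating
FROZEN = asynchronous_step(START, 90, 0.0, RNG)  # alpha = 0 leaves every cell unchanged
HALF = asynchronous_step(START, 90, 0.5, RNG)  # intermediate, perturbed dynamics
```

Both steps read the old configuration on a ring and build a new one, so the only difference between them is which cells are allowed to change. Varying `alpha` between 0 and 1 is a simple way to perturb a model and observe how robust its behavior is.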
Cart-o-matic
Participants: Olivier Simonin (contact), François Charpillet, Antoine Bautin, Nicolas Beaufort.
Philippe Lucidarme (Université d'Angers, LISA) is a collaborator and the coordinator of the Cart-o-matic project.

Cart-o-matic is a software platform for (multi-)robot exploration and mapping tasks. It was developed by MAIA members and LISA (Univ. Angers) members during the robotics ANR/DGA Carotte challenge (2009-2012). This platform is composed of three software tools, which are protected by software copyrights (through the Agence pour la Protection des Programmes): Slam-o-matic, a SLAM algorithm developed by LISA members; Plan-o-matic, a robot trajectory planning algorithm developed by MAIA and LISA members; and Expl-o-matic, a distributed multi-agent strategy for multi-robot exploration developed by MAIA members (based on algorithms proposed in the PhD thesis of Antoine Bautin). Cf. illustration at Cart-o-matic.

The purchase of Cart-o-matic by some robotics companies is underway.
New Results
Decision Making
Searching for Information with MDPs
Participants: Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet.
In the context of Mauricio Araya's PhD and PostDoc, we are working on how MDPs – or related models – can search for information. This has led to various research directions, such as extending POMDPs so as to optimize information-based rewards, or actively learning MDP models. This year has begun with the defense of Mauricio's PhD thesis in February. Since then, we have kept extending Mauricio's work and are preparing journal submissions.

While we have made some progress in this field, there are no concrete outcomes to present concerning optimistic approaches for model-based Bayesian Reinforcement Learning. Concerning POMDPs with information-based rewards, Mauricio's PhD thesis presents strong theoretical results that allow – in principle – deriving efficient algorithms from state-of-the-art “point-based” POMDP solvers. This year we have put this idea into practice, implementing variants of PBVI, PERSEUS and HSVI.

Preliminary results have been published (in French) in JFPDA'13. A journal paper with complete theoretical and empirical results is in preparation.
Adaptive Management with POMDPs
Participants: Olivier Buffet.
Samuel Nicol, Iadine Chadès (CSIRO) and Takuya Iwamura (Stanford University) are external collaborators.

In the field of conservation biology, adaptive management is about managing a system, e.g., performing actions so as to protect some endangered species, while learning how it behaves. This is a typical reinforcement learning task that could for example be addressed through Bayesian Reinforcement Learning.

This year, we have worked in the context of bird migratory pathways, in particular the East Asian-Australasian (EAA) flyway, which is modeled as a network whose nodes are land areas where birds need to stay for some time. An issue is that these land areas are threatened by sea level rise. The adaptive management problem at hand is that of deciding which land areas to invest in protecting so as to preserve the migratory pathways as efficiently as possible.

The outcome of this work is a data challenge paper published at IJCAI'13, which presents the problem at hand, describes its POMDP model, gives empirical results obtained with state-of-the-art solvers, and challenges POMDP practitioners to find better solution techniques.
Solving decentralized stochastic control problems as continuous-state MDPs
Participants: Jilles Dibangoye, Olivier Buffet, François Charpillet.
External collaborators: Christopher Amato (MIT), Arnaud Doniec (EMD), Charles Bessonnet (Telecom Nancy), Joni Pajarinen (Aalto University).

Decentralized partially observable Markov decision processes (DEC-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete), even using efficient heuristic search algorithms. In this work, we present an efficient methodology for solving decentralized stochastic control problems formalized as a DEC-POMDP or one of its subclasses. This methodology is threefold: (1) it converts the original decentralized problem into a centralized problem from the perspective of a solution method that can take advantage of the total data about the original problem that is available during the online execution phase; (2) it shows that the original and transformed problems are equivalent; (3) it solves the transformed problem using a centralized method and transfers the solution back to the original problem. We applied this methodology to various decentralized stochastic control problems.

Our results include the application of this methodology to DEC-POMDPs. We recast them into deterministic continuous-state MDPs, where states, called occupancy states, are probability distributions over states and action-observation histories of the original DEC-POMDPs. We also demonstrate that the occupancy state is a sufficient statistic for optimally solving DEC-POMDPs. We further show that the optimal value function is a piecewise-linear and convex function of the occupancy states. With these results as a background, we prove for the first time that POMDP (and more generally continuous-state MDP) solution methods can, at least in principle, apply to DEC-POMDPs. This work has been presented at IJCAI'2013 and (in French) at JFPDA'2013, and an in-depth journal article is currently in preparation. We have already extended the results we obtained for general DEC-POMDPs to the case of transition- and observation-independent DEC-MDPs. Of particular interest, we demonstrated that the occupancy states can be further compressed into a probability distribution over the states, which is the first sufficient statistic in decentralized stochastic control problems that is invariant with time. This work has been presented at AAMAS'2013, and an in-depth journal article is currently in preparation.

We believe our methodology lays the foundation for further work on optimal as well as approximate solution methods for decentralized stochastic control problems in particular, and stochastic control problems in general.
Abstraction Pathologies in Markov Decision Processes
Participants: Manel Tagorti, Bruno Scherrer, Olivier Buffet.
Jörg Hoffmann, former member of MAIA, is an external collaborator (from Saarland University).

Abstraction is a common method to compute lower bounds in classical planning, imposing an equivalence relation on the state space and deriving the lower bound from the quotient system. It is a trivial and well-known fact that refined abstractions can only improve the lower bound. Thus, when we embarked on applying the same technique in the probabilistic setting, our firm belief was to find the same behavior there. We were wrong. Indeed, there are cases where every direct refinement step (splitting one equivalence class into two) yields strictly worse bounds. We give a comprehensive account of the issues involved, for two wide-spread methods to define and use abstract MDPs.

This work has been presented and published in the ICAPS-13 workshop on Heuristics and Search for Domain-Independent Planning (HSDIP) and (in French) in JFPDA-13.
Evolutionary programming for Policies Space exploration Amine Boumaza Vincent Thomas
Evolutionary Programming, introduced by Fogel in 1966, is an approach to build an automaton that optimizes a fitness function. As in other evolutionary algorithms, an initial population of automata is given, and the algorithm makes this population evolve by progressively modifying automata (mutations) and keeping the most efficient ones in the next generation.
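The loop described above can be illustrated with a minimal Python sketch. It is not the code developed during the internship: the automaton encoding (a table mapping a state and an input symbol to a next state and a predicted symbol), the symbol-prediction fitness, the mutation operator and all parameter values are assumptions chosen for the example.

```python
import random

N_STATES = 4          # hypothetical automaton size
SYMBOLS = (0, 1)      # hypothetical input/output alphabet

def random_automaton(rng):
    # table[state][symbol] = (next_state, predicted_next_symbol)
    return [[(rng.randrange(N_STATES), rng.choice(SYMBOLS)) for _ in SYMBOLS]
            for _ in range(N_STATES)]

def fitness(automaton, sequence):
    """Number of correct next-symbol predictions along the sequence."""
    state, score = 0, 0
    for current, following in zip(sequence, sequence[1:]):
        next_state, prediction = automaton[state][current]
        score += (prediction == following)
        state = next_state
    return score

def mutate(automaton, rng):
    """Return a copy of the automaton with one randomly rewritten table entry."""
    child = [row[:] for row in automaton]
    s, a = rng.randrange(N_STATES), rng.choice(SYMBOLS)
    child[s][a] = (rng.randrange(N_STATES), rng.choice(SYMBOLS))
    return child

def evolve(sequence, pop_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    population = [random_automaton(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Each parent produces one mutated offspring.
        offspring = [mutate(parent, rng) for parent in population]
        # Parents and offspring compete; the best pop_size automata survive.
        pool = population + offspring
        pool.sort(key=lambda a: fitness(a, sequence), reverse=True)
        population = pool[:pop_size]
    return population[0]
```

Because parents compete with their offspring, the best fitness in the population never decreases from one generation to the next.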

This process is close to the progressive construction by a policy iteration algorithm in a POMDP and we are currently investigating the links between these approaches.

This work has begun this year through an internship (Benjamin Bibler) and preliminary development has been made to solve the Santa Fe trail problem proposed by Koza (1992) which has become a benchmark to compare genetic and evolutionary programming approaches.
Evolutionary Learning of Tetris Policies Amine Boumaza
Learning Tetris controllers is an interesting and challenging problem because of the size of its search space: traditional machine learning methods do not work and approximate methods are necessary. In this work we study the performance of a direct policy search algorithm, namely the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). We also proposed different techniques to reduce the learning time, one of which is racing. This approach concentrates the computational effort on promising policies and quickly discards bad ones in order to reduce the computation time. It yielded policies of the same performance as those obtained without racing, at a fifth of the computation cost. The learned strategies are among the best performing players at this time, scoring several million lines on average.
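The idea of racing can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in this work: it uses Hoeffding confidence intervals (one common way to implement a race), assumes scores lie in a known range, and the names `race`, `evaluate`, `delta` and `value_range` are hypothetical.

```python
import math
import random

def race(candidates, evaluate, max_evals=50, delta=0.05, value_range=1.0, seed=0):
    """Evaluate candidates in rounds and drop those that are clearly worse.

    evaluate(candidate, rng) returns one noisy score in [0, value_range].
    Returns the indices of the surviving candidates and the number of
    evaluations spent on each candidate.
    """
    rng = random.Random(seed)
    alive = list(range(len(candidates)))
    totals = [0.0] * len(candidates)
    counts = [0] * len(candidates)
    for n in range(1, max_evals + 1):
        # One more evaluation for every candidate still in the race.
        for i in alive:
            totals[i] += evaluate(candidates[i], rng)
            counts[i] += 1
        # Hoeffding radius after n evaluations (assumed bound, not from the source).
        radius = value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        means = {i: totals[i] / n for i in alive}
        best_lower = max(means.values()) - radius
        # Keep a candidate only if its upper bound reaches the best lower bound.
        alive = [i for i in alive if means[i] + radius >= best_lower]
        if len(alive) == 1:
            break
    return alive, counts
```

With this scheme, most of the evaluation budget goes to candidates that cannot yet be statistically distinguished from the best one.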
Evolutionary behavior learning Amine Boumaza François Charpillet Iñaki Fernandèz
Evolutionary Robotics (ER) deals with the design of agent behaviors using artificial evolution. Within this framework, the problem of learning optimal policies (or controllers) is treated as a policy search problem in the parameterized space of candidate policies. The search for the optimal policies in this context is driven by a fitness function that associates a value to the candidate policy by measuring its performance on the given task.

The work shown here describes the results of the master's thesis of Iñaki Fernandèz, which will be extended during a Ph.D. thesis started in October 2014.

Incremental policy learning with shaping. Several methods have been proposed to accelerate the search for optimal policy in evolutionary robotics. In this work, we investigated the use of incremental learning and, more precisely, shaping, a well-known technique in behavioral psychology. The main idea is to learn to solve simple tasks and then exploit the learned behaviors to tackle increasingly harder tasks.

Our preliminary results show that the best performances are obtained either in the setups with shaping or in the control experiment where the task difficulty is maximal. Nevertheless, a closer look at the results indicates that the best controllers for the shaping setups are not obtained at the end of the evolution, but rather at an earlier stage. This means that, for these shaping techniques, the best controllers have learned to solve the task when its difficulty was at an easy level and their performance is maintained later when the task difficulty increases. Although this was unforeseen, the results seem promising and deserve further investigation.

Online evolutionary learning. As opposed to traditional evolutionary robotics, which treats the learning problem as an off-line, centralized process, online onboard distributed evolutionary algorithms consider the learning process as executed at the agent level in a decentralized way. In this sense, each agent has its own controller or genome, which is broadcast locally from agent to agent, and the best performing ones survive and spread. This gene-centered view of evolution is inspired by the theory introduced by Richard Dawkins in The Selfish Gene.

The online aspect of the algorithms means that the agents learn while they are performing the task at hand. A derived property is that the agents learn continuously, which allows them to adapt to dynamically changing conditions and tasks. This is in opposition to the traditional (offline) view of evolutionary robotics, where the outcome of evolution is tailored toward a single task. Many challenging problems are raised in this framework, and this thesis will address the problem of defining fitness functions that drive a swarm of agents to learn to solve a task. Another question is to study the dynamics of these algorithms, both experimentally and theoretically, using tools from distributed systems. Some promising work in this direction has been proposed.

Learning Bad Actions Olivier Buffet
Jörg Hoffmann, former member of MAIA, and Michal Krajňanský are external collaborators from Saarland University.

In classical planning, a key problem is to exploit heuristic knowledge to efficiently guide the search for a sequence of actions leading to a goal state.

In some settings, one may have the opportunity to solve multiple small instances of a problem before solving larger instances, e.g., trying to handle a logistics problem with small numbers of trucks, depots and items before moving to (much) larger numbers. The small instances may then allow us to extract knowledge that could be reused when facing larger instances. Previous work shows that it is difficult to directly learn rules specifying which action to pick in a given situation. Instead, we look for rules telling which actions should not be considered, so as to reduce the search space. But this approach requires considering multiple questions: What are examples of bad (or non-bad) actions? How can they be obtained? Which learning algorithm should be used?

This research work is conducted as part of Michal Krajňanský's master of science (to be defended in early 2014). Early experiments show encouraging results, and we consider participating in the learning track of the international planning competition in 2014.
Complexity of the Policy Iteration algorithm Bruno Scherrer
We have this year improved the state-of-the-art upper bounds for the complexity of a standard algorithm for solving Markov Decision Processes: Policy Iteration.

Given a Markov Decision Process with $n$ states and $m$ actions per state, we study the number of iterations needed by Policy Iteration (PI) algorithms to converge to the optimal $γ$-discounted policy. We consider two variations of PI: Howard's PI, which changes the actions in all states with a positive advantage, and Simplex-PI, which only changes the action in the state with maximal advantage. We show that Howard's PI terminates after at most $O (\frac{n m}{1 - γ} \log (\frac{1}{1 - γ}))$ iterations, improving by a factor $O (\log n)$ a result by Hansen et al. (2013), while Simplex-PI terminates after at most $O (\frac{n^{2} m}{1 - γ} \log (\frac{1}{1 - γ}))$ iterations, improving by a factor $O (\log n)$ a result by Ye (2011). Under some structural assumptions on the MDP, we then consider bounds that are independent of the discount factor $γ$: given a measure of the maximal transient time $τ_{t}$ and the maximal time $τ_{r}$ to revisit states in recurrent classes under all policies, we show that Simplex-PI terminates after at most $\tilde{O} (n^{3} m^{2} τ_{t} τ_{r})$ iterations. This generalizes a recent result for deterministic MDPs by Post & Ye (2012), in which $τ_{t} \leq n$ and $τ_{r} \leq n$. We explain why similar results seem hard to derive for Howard's PI. Finally, under the additional (restrictive) assumption that the state space is partitioned in two sets, respectively states that are transient and recurrent for all policies, we show that Simplex-PI and Howard's PI terminate after at most $\tilde{O} (n m (τ_{t} + τ_{r}))$ iterations.

These results were presented at the JFPDA national workshop and at the NIPS 2013 international conference.
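To make the difference between the two variants concrete, here is a minimal Python sketch of Howard's PI and Simplex-PI on a small dense MDP. It only illustrates the update rules described above; the data layout (`P[s][a]` as a list of next-state probabilities, `R[s][a]` as a reward), the tolerance `tol` and the function names are assumptions made for the example.

```python
def evaluate(P, R, gamma, policy):
    """Solve (I - gamma * P_pi) v = r_pi by Gauss-Jordan elimination."""
    n = len(P)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][policy[i]][j] for j in range(n)]
         + [R[i][policy[i]]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def advantages(P, R, gamma, v):
    """adv[s][a] = r(s, a) + gamma * sum_t P(t | s, a) v(t) - v(s)."""
    n = len(P)
    return [[R[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n)) - v[s]
             for a in range(len(P[s]))] for s in range(n)]

def policy_iteration(P, R, gamma, variant="howard", tol=1e-10):
    n = len(P)
    policy = [0] * n
    iterations = 0
    while True:
        v = evaluate(P, R, gamma, policy)
        adv = advantages(P, R, gamma, v)
        best = [max(range(len(adv[s])), key=lambda a: adv[s][a]) for s in range(n)]
        improvable = [s for s in range(n) if adv[s][best[s]] > tol]
        if not improvable:
            return policy, v, iterations
        if variant == "howard":
            # Howard's PI: switch in every state with a positive advantage.
            for s in improvable:
                policy[s] = best[s]
        else:
            # Simplex-PI: switch only in the state with maximal advantage.
            s = max(improvable, key=lambda s: adv[s][best[s]])
            policy[s] = best[s]
        iterations += 1
```

Both variants return the same optimal policy; they differ only in how many states are updated per iteration, which is what the iteration bounds above quantify.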
Approximate Dynamic Programming and Application to the Game of Tetris Bruno Scherrer
Victor Gabillon and Mohammad Ghavamzadeh are external collaborators (from the Inria Sequel EPI). Matthieu Geist is an external collaborator (from Supélec Metz).

We present here three results: the first is a unified review of algorithms that are used to estimate a linear approximation of the value of some policy in a Markov Decision Process; the second concerns the analysis of a class of approximate dynamic algorithms for large scale Markov Decision Processes; the last is the successful application of similar dynamic programming algorithms on the Tetris domain.

In the framework of Markov Decision Processes, we have considered linear off-policy learning, that is, the problem of learning a linear approximation of the value function of some fixed policy from one trajectory possibly generated by some other policy. We have reviewed on-policy learning algorithms from the literature (gradient-based and least-squares-based), adopting a unified algorithmic view. We have highlighted a systematic approach for adapting them to off-policy learning with eligibility traces. This led to some known algorithms and suggested new extensions. This work has recently been accepted to JMLR and should be published at the beginning of 2014.

We have revisited the work of Bertsekas and Ioffe (1996), which introduced $λ$ policy iteration, a family of algorithms parametrized by a parameter $λ$ that generalizes the standard value iteration and policy iteration algorithms and has some deep connections with the temporal-difference algorithms described by Sutton and Barto (1998). We deepen the original theory developed by the authors by providing convergence rate bounds which generalize standard bounds for value iteration. We develop the theory of this algorithm when it is used in an approximate form. This work was published in JMLR.

Tetris is a video game that has been widely used as a benchmark for various optimization techniques, including approximate dynamic programming (ADP) algorithms. A look at the literature on this game shows that ADP algorithms that are (almost) entirely based on approximating the value function (value-function-based) have performed poorly in Tetris, while methods that search directly in the space of policies by learning the policy parameters with a black-box optimizer, such as the cross entropy (CE) method, have achieved the best reported results. We have applied an algorithm we proposed in the past, called classification-based modified policy iteration (CBMPI), to the game of Tetris. Our experimental results show that, for the first time, an ADP algorithm, namely CBMPI, obtains the best results reported in the literature for Tetris on both the small $10 \times 10$ and the large $10 \times 20$ boards. Although CBMPI's results are similar to those of the CE method on the large board, CBMPI uses considerably fewer (almost 1/6) samples (calls to the generative model) than CE. This work was presented at NIPS 2013.
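For readers unfamiliar with the cross entropy baseline mentioned above, the following Python sketch shows the basic CE loop on a generic black-box score function. It is a textbook-style illustration with a diagonal Gaussian, not the Tetris setup of the cited work: the function name, the population size, the elite fraction, the initial standard deviation and the standard-deviation floor are all assumptions.

```python
import random

def cross_entropy_method(score, dim, iterations=30, pop_size=50,
                         elite_frac=0.2, init_std=5.0, seed=0):
    """Maximize score(x) over vectors x of length dim with a diagonal Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors around the current mean.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(pop_size)]
        # Keep the best-scoring fraction (the elite).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite samples.
        mean = [sum(x[d] for x in elite) / n_elite for d in range(dim)]
        std = [max(1e-3, (sum((x[d] - mean[d]) ** 2 for x in elite) / n_elite) ** 0.5)
               for d in range(dim)]
    return mean
```

In a policy search setting, `score` would run the policy defined by the parameter vector and return its (noisy) performance; each call consumes samples from the generative model, which is the cost that the comparison above refers to.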
Ambient Intelligence And Robotic Systems Robotic systems: autonomy, cooperation, exploration, robustness, assistance Local control based platooning Jano Yazbeck François Charpillet Alexis Scheuer
We consider decentralized control methods to operate autonomous vehicles at close spacings to form a platoon. We study models inspired by the flocking approach, where each vehicle computes its control from its local perceptions. We investigate different decentralized models in order to provide robust and scalable solutions. Open questions concern collision avoidance, stability and multi-platoon navigation.

In order to reduce the tracking error (i.e. the distance between each follower's path and the path of its predecessor), we developed both an innovative approach and a new lateral control law. This lateral control law reduces the tracking error faster than other existing control laws. An article presenting this control law, its integration with a previously defined secure longitudinal control law and the experimental results obtained with it has been accepted to the 2014 IEEE International Conference on Robotics and Automation.
Map Matching François Charpillet Maan El-Badaoui-El-Najjar
We addressed an important issue for intelligent transportation systems, namely the ability of vehicles to safely and reliably localize themselves within an a priori known road map network. For this purpose, we proposed an approach based on hybrid dynamic Bayesian networks, which makes it possible to implement in a unified framework two of the most successful families of probabilistic models commonly used for localization: linear Kalman filters and Hidden Markov Models. The combination of these two models makes it possible to manage the multiple hypotheses and multi-modal observations that characterize Map Matching problems, and it improves the integrity of the approach. Another contribution is a chained-form state space representation of vehicle evolution, which makes it possible to deal with the non-linearity of the odometry model used. Experimental results, using data from encoder sensors, a DGPS receiver and an accurate digital road map, illustrate the performance of this approach, especially in ambiguous situations.
Adaptation of autonomous vehicle traffic to perturbations Mohamed Tlig Olivier Simonin Olivier Buffet
The aim of the European project InTraDE is to propose more efficient ways to handle containers in seaports through the use of IAVs (Intelligent Autonomous Vehicles).

In his PhD thesis, Mohamed Tlig considers the displacements of numerous such IAVs whose routes are a priori planned by a supervisor. However, in such a large and complex system, different unexpected events can arise and degrade the traffic: failure of a vehicle, human mistake while driving, obstacle on roads, local re-planning, and so on.

After working on a simple decentralized strategy to allow two queues of vehicles to share a single lane (presented in 2012, and this year at AATMO-13), we have started looking at improving vehicle flows in complete road networks. In particular, we have proposed an approach that allows multiple flows of vehicles to cross an intersection without stopping, which reduces delays as well as energy consumption. Preliminary results have been presented (in French) at RJCIA-13, and more advanced work is under submission.

The next step is to coordinate the controller agents located in each of the network's intersections so as to create “green waves” that would improve the flows not just locally, but globally.
Living assistant Robot François Charpillet Antoine Bautin Abdallah Dib Olivier Simonin
With LAR (Living Assistant Robot), a PIA project which started in March, Abdallah Dib joined our team for a PhD. His work concerns the development of a low-cost navigation system for a robot moving in an indoor environment. The main issue of his work is to design a Simultaneous Localisation and Mapping algorithm working in a dynamic environment in which people are moving. This is very challenging when the sensing capabilities of the robot are restricted to low-cost sensors such as an RGB-D camera. We also expect the robot to provide services similar to those described below: fall detection and activity recognition.
Exploring an unknown environment with a team of mobile robots François Charpillet Olivier Simonin Antoine Bautin Nassim Kaldé
This work was carried out during the ANR Cart-O-matic project. Antoine Bautin was hired by the Maia team as a PhD student for this project. The main objective of the project was to design and build a multi-robot system able to autonomously map an unknown building. This work was done in the framework of a French robotics contest called Defi CAROTTE, organized by the General Delegation for Armaments (DGA) and the French National Research Agency (ANR). The scientific issues of this project deal with Simultaneous Localization And Mapping (SLAM), multi-robot collaboration and object recognition. The Maia team has been mainly involved in multi-robot collaboration and navigation.

Nassim Kaldé, a new PhD student, started last year in order to carry on the work done by Antoine Bautin. The new direction aims at addressing problems similar to those we addressed in the Cart-O-matic project, but in dynamic environments, i.e. environments in which people move alongside robots. Another point that Nassim Kaldé will address is social navigation, which is important for robots and humans to coexist in a smart manner.
Features extraction for the control of redundant system with continuous sensori-motor space Alain Dutech Thomas Moinel
Yann Boniface (CORTEX Team, Loria) is an external collaborator

In collaboration with the CORTEX team and supported by a M2R internship, many questions related to learning the control of a complex (mono)-agent system with a continuous sensori-motor space are explored. For several reasons, the classical framework of Reinforcement Learning is not easily used in that context:

the value function to be learned has to be encoded using features that are not known at start,

because of the richness of the sensori-motor space, a random exploration scheme is unlikely to find the rewarded states that are needed by the learning process,

exploiting what is learned is difficult as one would need to find the maximum of the value function while it is learned.

Our work is focused on a planar model of the human arm with 2 joints and 6 muscles (see figure ). Control signals are the activity of the motor-neurons that alter the length of the muscles, and thus the forces applied on the joints. This system is redundant but also highly non linear as many aspects of the model are described by non-linear differential equations (our model is a slight improvement over the one of Li ). The task to learn is to reach different positions from given starting points.

We have studied a developmental learning process with a simple muscle activation pattern. The idea is to start the learning process in an artificially reduced sensori-motor space (using rough perception and motor capacities) and slowly increase the size and complexity of this space when interesting behaviors are learned. Our approach gives results comparable to other developmental techniques and raises several important research questions. Our work showed that we need an abstraction mechanism in order to define or refine the features used in actions but also in perceptions. This is a very difficult challenge that is one of the keys to the understanding (and design) of cognition. There is also a need for stronger generalization capabilities in the function approximation used in the process.

In parallel, we are taking inspiration from the field of neurosciences, and particularly on the coupling between the cortex and the cerebellum in motor control. Models based on the work of Kaladjian should help us understand what control signals are used by the brain apparatus and how the learning of gestures is organized between these two regions. Our long term goal is to design mechanisms for learning features abstraction in the sensori-motor space while being guided by the improvement in behavior performances.
Ambient intelligence Personally Assisted Living François Charpillet Amandine Dubois Olivier Simonin
This action is supported by the Inria IPL Personally Assisted Living (PAL), which gathers 9 Inria teams associated with 6 research partners (technological, medical or social) working together on three main themes: mobility assistance, assessing the degree of frailty of persons, and home activity analysis. The MAIA team is currently mainly involved in the two latter topics, plus fall detection.

Evaluation of the degree of frailty of the elderly. As argued in the famous paper of Fried et al., the estimation of frailty is highly significant to evaluate the risk of falls, disability, hospitalization and mortality. This issue is considered in the Maia team with different sensing devices: single RGB-D cameras, networks of RGB-D cameras, and a sensing intelligent floor. One simple idea currently developed in the team is to determine either the center of mass of a person using one or several Kinects, or the center of pressure and the location of footsteps using an intelligent floor. The idea is to derive from these simple measures the walking speed, the length of the steps and the position of the monitored persons.

People activity analysis. Following the activity of elderly people over long periods of time can be a good indicator of their well-being, but the evaluation of the behavior of a person at home is an open challenge.

To address this issue, we proposed this year an HMM-based model capable of following simple activities such as sitting, walking, etc. This model has been evaluated within a real smart environment with 26 subjects who performed any of eight activities (sitting, walking, going up, squatting, lying on a couch, falling, bending and lying down). Seven out of these eight activities were correctly detected, including falling, which was detected without false positives.

Fall detection. Falls are one of the major health issues affecting elderly people, especially at home. One of the objectives of the PhD work of Amandine Dubois is to design an automatic system to detect falls at home, which in its final version will be made up of a network of RGB-D sensors. A simple and robust method based on the identification and tracking of the center of mass of people moving in an indoor environment has been developed. Using a simple Hidden Markov Model whose observations are the position of the center of mass, its velocity and the general shape of the body, we can monitor the activity of a person with surprisingly high accuracy and thus detect falls without false positives. An experimental study was conducted in our smart apartment lab. 26 subjects were asked to perform a predefined scenario in which they adopted a set of eight postures. 2 hours of video (216 000 frames) were recorded for the evaluation, half of which was used to train the model. The system detected the falls without false positives. This result encourages us to use this system in real situations to better study its efficiency.
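As an illustration of how a Hidden Markov Model turns a stream of observations into a sequence of activities, here is a minimal Viterbi decoder in Python. It is a sketch only: the model described above uses the position and velocity of the center of mass and the body shape as observations, whereas the example in the usage note below assumes a hypothetical two-state model with a single discretized observation (height of the center of mass), and all probabilities are invented for the example.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely state sequence of a discrete HMM, computed in log space."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    scores = {s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}
    back = []
    for obs in observations[1:]:
        prev, scores, pointers = scores, {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            scores[s] = prev[best] + log(trans_p[best][s]) + log(emit_p[s][obs])
            pointers[s] = best
        back.append(pointers)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

For instance, with hypothetical states `"upright"` and `"on_floor"` and observations `"high"` or `"low"` for the height of the center of mass, a sequence of `"low"` observations following `"high"` ones is decoded as a transition to `"on_floor"`, which is the kind of event a fall detector would report.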

Interconnected intelligent tiles Mihai Andries François Charpillet Olivier Simonin
We are also involved in the development of an innovative sensing device: a pressure-sensing floor with LED lighting, which provides a new way for people to interact with their environment. Sensitive or intelligent floors have attracted a lot of attention during the last two decades for applications ranging from interaction capture in immersive virtual environments to robotics, human tracking, fall detection and activity recognition. Different technologies have been proposed so far, based on optical fiber sensing, pressure sensing or electrical near field. In the Maia team, we have developed a more sophisticated approach in which both computation and sensing are distributed within the floor. This floor is made up of interconnected intelligent tiles which can communicate with each other, have internal computation power, sense the activity in the environment (through four weight sensors, an accelerometer and a magnetometer) and can interact with users, robots or other sensor networks either by wireless/wired communication or through visual communication (each tile being equipped with 16 LEDs).

Several scientific challenges are open to us in the fields of decentralized spatial computing and in designing real application for assisting people suffering from loss of autonomy.

Some of these issues have been addressed this year. Mihai Andries, a PhD student, proposed two contributions demonstrating the relevance of an intelligent floor such as the one we have developed. The first contribution is about controlling a mobile robot through its interactions with the floor. The second, less developed, is about recognizing the activity of a person through his or her physical interaction with the floor. This approach has an important advantage compared to video-based activity recognition: the privacy of people is guaranteed. Let us also mention the work of an intern who developed a gait evaluation algorithm using the variation over time of the center of pressure sensed by the floor when one or several persons walk over it.
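The center of pressure mentioned above can be computed from the weight sensors of a tile as a weighted average of the sensor positions. The Python sketch below illustrates this together with a simple walking-speed estimate. The sensor layout (one sensor per corner of a 0.6 m tile in the usage note), the sampling period and the function names are assumptions; the actual gait evaluation algorithm is not described in this report.

```python
import math

def center_of_pressure(weights, sensor_positions):
    """Weighted average of sensor positions; None if the tile is unloaded."""
    total = sum(weights)
    if total <= 0:
        return None
    x = sum(w * px for w, (px, _) in zip(weights, sensor_positions)) / total
    y = sum(w * py for w, (_, py) in zip(weights, sensor_positions)) / total
    return (x, y)

def mean_speed(cop_track, dt):
    """Average speed along successive center-of-pressure samples taken dt seconds apart."""
    length = sum(math.hypot(x2 - x1, y2 - y1)
                 for (x1, y1), (x2, y2) in zip(cop_track, cop_track[1:]))
    return length / (dt * (len(cop_track) - 1))
```

With sensors at the corners `(0, 0)`, `(0.6, 0)`, `(0, 0.6)` and `(0.6, 0.6)`, equal readings on the four sensors place the center of pressure at the middle of the tile, and a load measured by a single sensor places it at that corner.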
Multi-Camera Tracking in Partially Observable Environment Arsène Fansi Tchango Olivier Buffet Vincent Thomas Alain Dutech
Fabien Flacher (Thales THERESIS) is an external collaborator.

In collaboration with Thales ThereSIS - SE&SIM Team (Synthetic Environment & Simulation), we focus on the problem of following the trajectories of several persons with the help of several controllable cameras. This problem is difficult since the set of cameras cannot cover simultaneously the whole environment, since some persons can be hidden by obstacles or by other persons, and since the behavior of each person is governed by internal variables which can only be inferred (such as his motivation or his hunger).

The approach we are working on is based on (1) POMDP formalisms to represent the state of the system (person and their internal states) and possible actions for the cameras, (2) a simulator provided and developed by Thales ThereSIS and (3) particle filtering approaches based on this simulator.

From a theoretical point of view, we are currently investigating how to use a deterministic simulator and to generate new particles in order to keep a good approximation of the posterior distribution.
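The simulator-based particle filtering mentioned above can be sketched as a bootstrap filter: particles are propagated through the simulator, weighted by the likelihood of the camera observation, and resampled. The Python code below is a minimal illustration under strong assumptions: the one-dimensional corridor model, the noisy `toy_simulator`, the Gaussian `toy_likelihood` and all numerical values are hypothetical, and the sketch does not address the open question of using a deterministic simulator.

```python
import math
import random

def particle_filter_step(particles, observation, simulate, likelihood, rng):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    predicted = [simulate(p, rng) for p in particles]          # predict
    weights = [likelihood(observation, p) for p in predicted]  # weight
    if sum(weights) == 0.0:
        return predicted  # no particle explains the observation; keep the prediction
    return rng.choices(predicted, weights=weights, k=len(predicted))  # resample

# Hypothetical simulator: a person walks about 1 m per step along a corridor.
def toy_simulator(position, rng):
    return position + 1.0 + rng.gauss(0.0, 0.2)

# Hypothetical camera model: Gaussian noise (standard deviation 0.5 m).
def toy_likelihood(observation, position):
    return math.exp(-0.5 * ((observation - position) / 0.5) ** 2)

def track(n_steps=20, n_particles=500, seed=0):
    """Run the filter on a synthetic trajectory; return (true position, estimate)."""
    rng = random.Random(seed)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    true_position = 0.0
    for _ in range(n_steps):
        true_position += 1.0
        observation = true_position + rng.gauss(0.0, 0.5)
        particles = particle_filter_step(particles, observation,
                                         toy_simulator, toy_likelihood, rng)
    estimate = sum(particles) / len(particles)
    return true_position, estimate
```

In the multi-camera setting, each particle would hold the positions and internal states of all tracked persons, and the likelihood would account for occlusions and for the current field of view of each camera.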
Understanding and mastering complex systems Robustness of Cellular Automata and Reactive Multi-Agent Systems Olivier Bouré Vincent Chevrier Nazim Fatès
Our research on emergent collective behavior focuses on the analysis of the robustness of discrete models of complex systems. We ask to what extent systems may resist various perturbations in their definitions. We progressed in the knowledge of how to tackle this issue in the case of cellular automata (CA) and multi-agent systems (MAS).

We proposed new definitions of asynchronism in lattice-gas cellular automata. An experimental study showed that observing an asynchronous version of a discrete model of swarm formation could help us gain insight into this well-studied model. The PhD thesis of O. Bouré provides a detailed view of this work.

A study on the density classification problem, a well-studied consensus problem in cellular automata, was carried out for infinite systems in 1D and 2D and for infinite trees. Positive results were provided and important conjectures were raised.

We proposed a survey on asynchronous cellular automata and explained some of the difficulties in the classification of these objects.

In collaboration with colleagues from India, we proposed a complete characterisation of the reversibility of the set of the 256 Elementary Cellular Automata, which are known to be difficult to study in all generality. We also proposed a mathematical analysis of the second-order phase transitions that are observed in the simplest asynchronous cellular automata. We also coordinated a special issue on asynchronous cellular automata in the Natural Computing journal.
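To fix ideas, the Python sketch below implements one common form of asynchronism for Elementary Cellular Automata, in which each cell applies the local rule with probability `alpha` and otherwise keeps its state (often called alpha-asynchronous updating). It is an illustration only: the periodic boundary conditions and the function names are assumptions, and the sketch does not cover the lattice-gas definitions proposed in our work.

```python
import random

def eca_rule(number):
    """Lookup table of an Elementary Cellular Automaton from its Wolfram number."""
    return {(l, c, r): (number >> (4 * l + 2 * c + r)) & 1
            for l in (0, 1) for c in (0, 1) for r in (0, 1)}

def async_step(cells, rule, alpha, rng):
    """Each cell is updated with probability alpha, reading the current configuration.

    alpha = 1.0 gives the classical synchronous update; the ring is periodic.
    """
    n = len(cells)
    return [rule[(cells[i - 1], cells[i], cells[(i + 1) % n])]
            if rng.random() < alpha else cells[i]
            for i in range(n)]
```

Varying `alpha` between 0 and 1 and observing how the asymptotic behavior of a rule changes is the kind of experiment in which phase transitions such as those mentioned above are observed.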
Adaptive control of a complex system based on its multi-agent model Vincent Chevrier Tomas Navarrete
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.

Complex systems are present everywhere in our environment: internet, electricity distribution networks, transport networks. These systems are characterized by a large number of autonomous entities, dynamic structures, different time and space scales and emergent phenomena. The thesis work of Tomas Navarrete is centered on the problem of controlling such systems. The problem is defined as the need to determine, based on a partial perception of the system state, which actions to execute in order to avoid or favor certain global states of the system. This problem comprises several difficult questions: how to evaluate the impact at the global level of actions applied at a local level, how to model the dynamics of a heterogeneous system (different behaviors arise from different levels of interactions), and how to evaluate the quality of the estimations obtained through the modeling of the system dynamics.

We propose a control architecture based on an “equation-free” approach. We use a multi-agent model to evaluate the global impact of local control actions before applying the most pertinent set of actions.
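The principle of evaluating candidate actions in a model before applying one can be sketched in a few lines of Python. This is a generic illustration and not the architecture itself: the function `choose_action`, the `toy_model` (a crude stand-in for a multi-agent simulation, tracking only the fraction of sharing peers), the horizon and the number of runs are all hypothetical.

```python
import random

def choose_action(state, actions, simulate, score, horizon=10, runs=5, seed=0):
    """Pick the action whose simulated outcome has the best average score.

    simulate(state, action, rng) returns the next state of the model;
    score(state) measures how desirable a global state is.
    """
    rng = random.Random(seed)

    def expected_score(action):
        total = 0.0
        for _ in range(runs):
            s = state
            for _ in range(horizon):
                s = simulate(s, action, rng)
            total += score(s)
        return total / runs

    return max(actions, key=expected_score)

# Hypothetical stand-in for a multi-agent model: the state is the fraction of
# peers that share files, and the action is an incentive level (0 or 1).
def toy_model(share, incentive, rng):
    drift = 0.05 * incentive - 0.02
    return min(1.0, max(0.0, share + drift + rng.gauss(0.0, 0.001)))
```

For example, `choose_action(0.5, [0, 1], toy_model, lambda s: s)` compares the simulated evolution of the sharing fraction with and without the incentive and selects the incentive.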

A prototype of our architecture has been implemented in order to test its basic ideas in the context of a simulated “free-riding” phenomenon in peer-to-peer file exchange networks. We have demonstrated that our approach makes it possible to drive the system to a state where most peers share files, even when the initial conditions are supposed to drive the system to a state where no peer shares. We have also performed experiments with different configurations of the architecture to identify ways to improve its performance.

This work helped us to better identify the key questions that arise when using the multi-agent paradigm in the context of control of complex systems, concerning the relationship between the model entities and the target system entities.
Multi-Modeling and multi-simulation Vincent Chevrier Christine Bourjot Benjamin Camus Julien Vaubourg
Laurent Ciarletta and Yannick Presse (Madynes team, LORIA) are external collaborators.

Laurent Ciarletta is the co-advisor of the thesis of Julien Vaubourg.

Models of Complex systems generally require different points of view (abstraction levels) at the same time in order to capture and to understand all the dynamics and the complexity. Consisting of different interacting parts, a model of a complex system also requires the joint and simultaneous use of modeling and simulation tools from different scientific fields.

We proposed the AA4MM meta-model that solves the core challenges of multi-modelling and simulation coupling in a homogeneous perspective. In AA4MM, we chose a multi-agent point of view: a multi-model is a society of models; each model corresponds to an agent and coupling relationships correspond to interactions between agents.

This year we have made progress in the definition of multi-level modeling. We identified several facets of multi-level modeling and implemented them as different kinds of interactions in the AA4MM framework. A demonstration of these different multi-level couplings has been developed on a collective motion phenomenon.

In February, the MS4SG project started; it involves MAIA, Madynes and EDF R&D and deals with smart-grid simulation. A PhD thesis was started in October 2013 by Julien Vaubourg in the MAIA team; it confronts the AA4MM principles with the specificities of the smart-grid domain, seen as a kind of complex system.
Bilateral Contracts and Grants with Industry Inria-EDF Strategic action MS4SG Vincent Chevrier Julien Vaubourg
Laurent Ciarletta and Yannick Presse (Madynes team, LORIA) are external collaborators.

The MS4SG (multi-simulation for smart grids) project is granted as a strategic action between Inria and EDF. This project is joint between the Inria teams Madynes and MAIA, and EDF R&D.

Smart grids are electric supply grids endowed with smart capabilities through the use of information and communication technologies. This perspective raises new challenges; in particular, one must re-think the way electricity is supplied to customers and the way the power supply network is regulated.

Simulation can help with the supervision and regulation of these systems. Such an approach requires integrating simulators from different domains: electrical networks, communication networks and information systems. As these domains can influence each other, smart grids can be considered as a kind of complex system, and we are faced with multi-modeling and multi-simulation issues. In particular, the models used in the different simulators are not of the same kind (heterogeneous simulations), and we must link and re-use existing simulators that were designed to run stand-alone.

The aim of the project is to provide primitives based on AA4MM in order to enable the multi-modeling and the multi-simulation of smart grids.

Partnerships and Cooperations

Regional Initiatives

AME Satelor

Participants: François Charpillet, Maxime Rio, Nicolas Beaufort, Xuan Nguyen, Amandine Dubois.
The economic mobilisation agency of Lorraine (AME) has launched a new project, SATELOR, and is providing it with 2.5 million euros of funding over 3 years, out of an estimated total of 4.7 million. The leader of the project is Pharmagest-Diatelic. Pharmagest is the French leader in computer systems for pharmacies, with a 43.5% share of the market, 9,800 clients and more than 700 employees. Pharmagest is based in Nancy. Recently, the Pharmagest Group expanded its activities into e-health and the development of telemedicine applications. The SATELOR project will support its partners in developing new services for safely keeping at home elderly people with loss of autonomy or people with a chronic illness. The Maia team will play an important role in bringing some of its research results, presented elsewhere in this report, to an industrial level.

National Initiatives

Inria IPL PAL (Personally Assisted Living)

Participants: François Charpillet, Olivier Simonin, Mihai Andries.
The PAL project is a national Inria Large Scale Initiative involving several teams of the institute (Arobas, Coprin, E-motion, Lagadic, Demar, Maia, Prima, Pulsar and Trio). It is coordinated by David Daney (Inria Sophia-Antipolis, EPI Coprin). The project focuses on the study of and experimentation with models for health and well-being. Maia is particularly involved in the People Surveillance work package, studying and developing intelligent environments and distributed tracking devices for walking analysis and robotic assistance (smart tiles, 3D camera network, assistant robots).

The PhD of Mihai Andries is funded by the PAL project.

PIA LAR (Living Assistant Robot)

Participants: François Charpillet, Abdallah Dib.

Partners: Crédit Agricole, Diatelic, Robosoft

The LAR project aims to design an assistant robot that improves the autonomy and quality of life of elderly and fragile persons. The project started at the beginning of the year. The role of the Maia team is to develop a simultaneous localisation and mapping algorithm using an RGB-D camera. The main issue is to develop an algorithm able to deal with dynamic environments. Another issue is for the robot to behave with acceptable social skills.

Inria ADT Percee (2011-13)

Participants: Olivier Simonin, François Charpillet, Nicolas Beaufort.
Olivier Rochel, from SED, is an external collaborator. Moutie Chaider was hired as an IJD in 2012.

Percee, for “Perception Distribuée pour Environnements Intelligents” (distributed perception for intelligent environments), is a project proposed by the Maia and Madynes teams and funded by Inria. This ADT (Action de Développement Technologique) supports our work in the Inria Large Scale Initiative PAL (Personally Assisted Living).

The project deals with the development and study of intelligent homes. Over the past two years we have developed an experimental platform, the smart apartment. It allows us to study models and technologies for life assistance (walking analysis with iTiles and camera networks, robotic assistants, health diagnosis, home automation functions, wireless communication inside the home).

In particular, we are developing a new tactile floor, the iTiles network. Two engineers are funded by the ADT for two years: Moutie Chaider (IJD) and Olivier Rochel (Inria research engineer).

ANR

ANR PHEROTAXIS

Participants: François Charpillet, Olivier Simonin.
Dominique Martinez (Cortex team, Inria NGE) is an external collaborator and the coordinator of the project for Nancy members.

PHEROTAXIS is an “Investissements d’Avenir” ANR 2011-2014 (Coordination: J.-P. Rospars, UMR PISC, INRA Versailles).

The research theme is the localisation of odour sources by insects and robots. By associating experimental data with models, the project aims at defining a behavioral model of olfactory processes. This work has several applications, in particular the development of highly sensitive and selective bio-inspired components.

The project is organized in five work packages and involves the PISC research unit (Versailles), Pasteur Institute (Paris) and LORIA/Inria institute (Nancy).

European Initiatives

Collaborations in European Programs, except FP7

Program: InterReg IV B

Project acronym: InTraDE

Project title: Intelligent Transportation for Dynamic Environment

Duration: 2010 - 2014

Coordinator: University of Science and Technology of Lille (Lille 1-LAGIS) (France)

Other partners: South East England Development Agency (United Kingdom), Centre Régional d’Innovation et de Transfert de Technologie – Transport et Logistique (CRITT TL) (France), AG Port of Oostende (AGHO) (Belgium), National Institute for Transport and Logistics, Dublin Institute of Technology (Ireland), Liverpool John Moores University (LOOM) (United Kingdom)

Abstract:

The InTraDE project (Intelligent Transportation for Dynamic Environments, http://www.intrade-nwe.eu/) is funded by the European North West Region. The project is coordinated by Rochdi Merzouki from University of Science and Technology of Lille (LAGIS lab.). Other partners are the Maia team, Liverpool John Moores University (LOOM), the National Institute for Transport and Logistics in Dublin Institute of Technology, the South East England Development Agency, the AGHO Port of Oostende and the CRITT in Le Havre. In the context of seaports and maritime terminals, the InTraDE project aims to improve the traffic management and space optimization inside confined spaces by developing a clean and safe intelligent transportation system. This transportation system will operate in parallel with virtual simulation software of the automated site, allowing a robust and real-time supervision of the goods handling operation.

The Maia team focuses on decentralized approaches to the control of automated vehicle platooning and to traffic adaptation. Maia's funding covers two PhD fellowships and one engineer. Both PhD theses started at the end of 2010. The PhD of Jano Yazbeck, supervised by F. Charpillet and A. Scheuer, studies a “Secure and robust immaterial hanging for automated vehicles”. The PhD of Mohamed Tlig, supervised by O. Simonin and O. Buffet, addresses “Reactive coordination for traffic adaptation in large situated multi-agent systems”.

International Research Visitors

Visits of International Scientists

Dr. Iadine Chadès, Research Scientist at CSIRO, Ecosystem Sciences division (Brisbane, Australia), visited MAIA for 1 week in July 2013.

Dissemination

Scientific Animation

François Charpillet is a member of the scientific council of the Robotics GDR.

Conference organization, Program committees, Editorial boards

Amine Boumaza was a member of the program committee of GECCO'13 (The Genetic and Evolutionary Computation Conference) and CEC'13 (Congress on Evolutionary Computation). He was a reviewer for ANR jeune chercheur programme blanc.

Olivier Buffet is a member of the editorial boards of the “revue d'intelligence artificielle” (RIA), and the “Journal of Artificial Intelligence Research” (JAIR).

Olivier Buffet was a reviewer for the journals: AIJ (Artificial Intelligence Journal), JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle); for the conferences AAAI'13 (National Conference on Artificial Intelligence), ICAPS'13 (International Conference on Automated Planning and Scheduling), IJCAI'13 (International Joint Conference on Artificial Intelligence), JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes); and for the workshop WRLComp'13 (ICML Workshop on the 2013 Reinforcement Learning Competition).

Vincent Chevrier was a reviewer for the journal Acta Biotheoretica (ACBI) and for the conferences IJCAI'13 (International Joint Conference on Artificial Intelligence) and JFSMA'13 (Journées francophones sur les Systèmes multi-Agents).

Vincent Chevrier was an expert for the following project calls: ANR Blanc Program and Digiteo, and for the best PhD thesis prize of the AFIA.

François Charpillet was in the program committees of ICAART, MSDM, WACAI, the 2013 IEEE/RSJ IROS Workshop “Assistance and Service Robotics in a Human Environment” and has also reviewed papers for IROS, ICRA, and RIA.

Alain Dutech was a reviewer for the journals JMLR (Journal of Machine Learning Research) and RIA (Revue d'Intelligence Artificielle), and for the conferences IJCAI'13 (International Joint Conference on Artificial Intelligence) and JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).

Nazim Fatès co-organised WPCA'13, the First European workshop on probabilistic cellular automata, held in Eindhoven, June 10-12, 2013. He was a reviewer for the following journals: Theoretical Computer Science, Physica D, Journal of Cellular Automata. He was a program committee member of the following conferences: AUTOMATA'13 (Workshop on cellular automata), CAAA'13 (cellular automata Algorithms & Architectures), SCW'13 (spatial computing workshop). He was an external referee for the conference CiE'13 (Computability in Europe). He was invited to give a talk and a tutorial at AUTOMATA'13.

Bruno Scherrer was a reviewer for the journals Mathematics of Operations Research, Annals of Operations Research, Applied Mathematics and Optimization, and Transactions on Neural Networks and Learning Systems, for the international conferences ICML'13 (International Conference on Machine Learning) and NIPS'13 (Neural Information Processing Systems), and for the national conference on learning CAP'13 (Conférence Francophone sur l'Apprentissage Automatique).

Bruno Scherrer was invited to give a talk at the Reinforcement Learning seminar of the Gatsby Unit (University College London).

Alexis Scheuer was a reviewer for the journals Artificial Intelligence (Elsevier), Robotics and Autonomous Systems (Elsevier) and IEEE Transactions on Robotics, as well as for the conference ICRA'14 (IEEE International Conference on Robotics and Automation).

Olivier Simonin co-organized the 2013 IEEE/RSJ IROS Workshop “Assistance and Service Robotics in a Human Environment”. He was a member of the program committees of ICINCO'13, ICAART'13, and the national conferences CAR2013 and JFSMA'13. He was a reviewer for the international journals IEEE Transactions on Systems, Man and Cybernetics: Systems and International Journal of Advanced Robotic Systems, and for the conference IEEE ICRA'13.

Vincent Thomas was a reviewer for the journal IEEE Transactions on Cybernetics and for the conference JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).

Teaching - Supervision - Juries

Teaching

PhD & Master level: Amine Boumaza, Métaheuristiques, 15h eq TD., Master 1 Computer Science, Université de Lorraine, France.

Master level: Alain Dutech, Numerical Learning in AI, 15h eq TD, Master 1 Cognitive Sciences, Université de Lorraine, France.

PhD & Master level: Nazim Fatès, Cellular automata, four lectures for a complex systems doctoral module, Universidad Adolfo Ibáñez, Santiago, Chile.

PhD & Master level: Alexis Scheuer & Olivier Simonin, Introduction to Mobile Robotics, 30h eq TD., Master 1 Computer Science, Université de Lorraine, France.

PhD & Master level: Alexis Scheuer & Olivier Simonin, Additional Mobile Robotics, 30h eq TD., Master 2 Computer Science, Université de Lorraine, France.

Master level: Olivier Simonin, Artificial Life, 15h eq TD., last year course (equivalent to Master 2), Ecole Supérieure d'Électricité de Metz, France.

PhD & Master level: Vincent Thomas, Métaheuristiques, 15h eq TD., Master 1 Computer Science, Université de Lorraine, France.

PhD & Master level: Vincent Thomas, Optimisation et systèmes dynamiques stochastiques, 22h eq TD., Master 2 Computer Science, Université de Lorraine, France.

Master level: Vincent Thomas, Agents Intelligents, 25h eq TD., Master 1 Cognitive Sciences, Université de Lorraine, France.

Master level: Vincent Thomas, Serious Game and Game Design, 15h eq TD., Master 2 Cognitive Sciences, Université de Lorraine, France.

Supervision

PhD: Mauricio Araya, “Near-Optimal Algorithms for Sequential Information-Gathering Decision Problems”, Université de Lorraine, Feb. 4th, F. Charpillet (advisor), O. Buffet, V. Thomas.

PhD: Antoine Bautin, “Stratégie d'exploration multirobot fondée sur le calcul de champs de potentiels”, Université de Lorraine, Oct. 3rd, F. Charpillet (advisor), O. Simonin.

PhD: Olivier Bouré, “'Le simple est-il robuste ?' Une étude de la robustesse des systèmes complexes à travers les automates cellulaires”, Université de Lorraine, Sep. 13th, Nazim Fatès, Vincent Chevrier (advisor).

PhD: Lucie Daubigney, “Gestion de l'Incertitude pour l'Optimisation de Systèmes Interactifs”, Université de Lorraine, Oct. 1st, Olivier Pietquin, Alain Dutech (advisor).

PhD in progress: Mihai Andries, “Calcul spatialisé pour l'assistance à la personne: étude d'un réseau de dalles intelligentes”, Université de Lorraine, Oct. 2012, F. Charpillet (advisor), O. Simonin.

PhD in progress: Benjamin Camus, “Un laboratoire virtuel pour la multi-modélisation”, Université de Lorraine, Oct 2012, Christine Bourjot, Vincent Chevrier (advisor).

PhD in progress: Amandine Dubois, “Assistance à la personne en perte d'autonomie: étude de l'apport d'un réseau de Kinects à la détection et la prévention des chutes”, Université de Lorraine, Oct. 2011, F. Charpillet (advisor).

PhD in progress: Abdallah Dib, “Assistance à la personne en perte d'autonomie: étude de l'apport d'un robot compagnon”, Université de Lorraine, March 2013, F. Charpillet (advisor).

PhD in progress: Arsène Fansi Tchango, “Suivi multi-caméra en environnement partiellement observé”, Université de Lorraine, Oct. 2011, A. Dutech (advisor), O. Buffet, V. Thomas.

PhD in progress: Nassim Kaldé, “Exploration et reconstruction d’un environnement inconnu par une flottille de robots”, Université de Lorraine, Oct. 2012, F. Charpillet (advisor), O. Simonin.

PhD in progress: Manel Tagorti, “Approximating the Value Function for Heuristic Search in Factored MDPs”, Université de Lorraine, Nov. 2011, J. Hoffmann (advisor), B. Scherrer, O. Buffet.

PhD in progress: Mohamed Tlig, “Reactive coordination for traffic adaptation in large situated multi-agent systems”, Université de Lorraine, Dec. 2010, O. Simonin (advisor), O. Buffet.

PhD in progress: Julien Vaubourg, “Multi-modélisation, multi-simulation dans le cadre des Smart-grids”, Université de Lorraine, Oct. 2013, Laurent Ciarletta, Vincent Chevrier (advisor).

PhD in progress: Jano Yazbeck, “Secure and robust immaterial hanging for automated vehicles”, Université de Lorraine, Oct. 2010, F. Charpillet (advisor), A. Scheuer.

Juries

PhD and HDR committees

Amine Boumaza was a member of the PhD committee of

Charles Olion, October 18th 2013, ISIR/UPMC.

Olivier Buffet was a member of the PhD committee of

Caroline Ponzoni Carvalho Chanel, 12th Apr. 2013, Université de Toulouse / ISAE, and

Adrien Couëtoux, 30th Sept. 2013, Université Paris Sud.

Vincent Chevrier was a member of the PhD committee of

Yishuai Lin, 10th Sept. 2013, Université de Technologie de Belfort-Montbéliard, as a referee; and

Inaya Lahoud, 10th Sept. 2013, Université de Technologie de Belfort-Montbéliard, as a committee member.

François Charpillet was a member (as a referee) of the PhD committee of:

Jean-Baptiste Soyez, 3rd December, Lagis/Université de Lille 1

Asma Azim, 17th December, LIG, Université de Grenoble

Nicolas Coté, 10th December, GREYC, Université de Caen

Rémy Guyonneau, 19th November, LISA, Université d'Angers

Zhaoxia Peng, Lagis, Université de Lille

Xuan Son Nguyen, 28th Jun., LIP6, Université Pierre et Marie Curie, Paris 6

Caroline Ponzoni Carvalho Chanel, 12th Apr., ONERA, Université de Toulouse/ISAE

Pedro Chahuara Quispe, 27th March, LIG, Université de Grenoble

Joni Pajarinen, 7th February, Aalto University

François Charpillet was a member of the HDR committee of:

Frank Gechter, 4th December, SET, UTBM-Université de Franche-Comté

Stephane Galland, 11th December, SET, UTBM-Université de Franche-Comté

Alain Dutech was a member of the PhD committee of

Tony Pinville, 30th January 2013, Université Paris 6 (referee),

Mahuna Akplogan, 15th May 2013, INRA Toulouse (referee),

Jean-Baptiste Hoock, 12th June 2013, Université Paris 11 (referee),

Joseph El-Gemayel, 25th June 2013, Université de Toulouse 1 (referee),

Lucie Daubigney, 1st October 2013, Université de Lorraine.

Nazim Fatès was a member of the PhD committee of Markus Redeker, June 17th, 2013, Int. Center Unconventional Computing, Bristol University, UK.

Olivier Simonin was a member of the PhD committee of

Benjamin Cogrel, 18th November 2013, Univ. Paris Est Créteil,

Rémy Guyonneau, 19th November 2013, Université d'Angers,

Nicolas Coté, 10th December 2013, Univ. de Caen,

Jean-Baptiste Soyez, 3rd December 2013, Univ. de Lille.

Olivier Simonin was a member (as a referee) of the PhD committee of

Jorge Rios Martinez, 8th January 2013, LIG-Inria Rhône-Alpes Grenoble,

Cyril Poulet, 23rd April 2013, UPMC LIP6,

Yassine Gangat, 27th August 2013, Ile de la Réunion,

Feirouz Ksontini, 13th November 2013, Univ. de Valenciennes,

Madeleine El Zaher, 22nd November 2013, Univ. Technologique de Belfort-Montbéliard,

Nicolas Carlesi, 19th December, Univ. Montpellier 2.

Specialist Committees

François Charpillet was a member of the "Specialist committee" in Université de Grenoble.

Olivier Buffet was a member of the "Specialist committee" in Université de Caen Basse-Normandie.

Alain Dutech was a member of the “Specialist committee” in Université Paris 13.

Popularization

François Charpillet participated in the popularization of the robotics and ambient intelligence activities of the Maia team:

http://videotheque.inria.fr/videotheque/media/25615

http://videotheque.inria.fr/videotheque/doc/809

http://vimeo.com/78659657

http://vimeo.com/83727993

The popularization article by Alain Dutech, Bruno Scherrer and Christophe Thiéry on Reinforcement Learning for the game of Tetris, first published at Interstices, was revised and published at Images des Mathématiques.

Vincent Thomas organized, in collaboration with the "Bibliothèque Universitaire du Campus Lettres", an exhibition, "jeux: les ateliers de la pensée", whose main objective is to promote games as an interesting subject for academics. The exhibition included two days of popular science seminars with specialists from several scientific fields (economics, psychology, computer science) and activities. See http://ticri.inpl-nancy.fr/wicri-lor.fr/index.php?title=Exposition_Jeu. The exhibition material will be presented in several libraries of the Université de Lorraine in the first half of 2014.

Vincent Thomas participated in “Journée ISN-EPI” (Jun. 27 2013), aimed at secondary school computer science teachers, by giving a presentation on "games and artificial intelligence" and by organizing a workshop on the same subject. Vincent Thomas takes part in the LORIA IDEES group dedicated to teaching activities (http://idees.loria.fr/index.php?n=Main.ProgrammeJourneeISN-EPI).

Vincent Thomas participated in “Dans les coulisses d'un labo d'informatique” (Mar. 21 2013) by proposing an activity about “artificial intelligence and video games” for secondary school students (https://iww.inria.fr/NanSciNum/dans-les-coulisses-dinria-nancy-grand-est-2/).

Vincent Thomas and Olivier Buffet participated in “Fête de la Science” by organizing a workshop on board games and artificial intelligence (Oct. 10 and 12). See https://iww.inria.fr/NanSciNum/un-ticet-pour-la-science/.

Nazim Fatès moderated two debates following screenings of the film Codebreaker, a “biopic” dedicated to Alan Turing: in January, it was shown at the Lycée Jean-Moulin in Forbach; in November, it was shown in the main cinema of Saint-Dié-des-Vosges (organised by the “Festival du film de chercheur”).

Christine Bourjot, Alain Dutech and Nazim Fatès participated in a debate on artificial intelligence in the café “l'Irlandais” with an audience consisting mainly of university students.

Asynchronous cellular automata and applications - Special issue of Natural Computing Alberto Dennunzio A. Nazim Fatès N. Enrico Formenti E. 12 Springer 2013 51 http://hal.inria.fr/hal-00918586 Apprentissage par renforcement et planification adaptative. Bruno Zanuttini B. Guillaume J. Laurent G. J. Olivier Buffet O. Lavoisier January 2013 153-263 http://hal.inria.fr/hal-00874808 Stratégie d'exploration multirobot fondées sur le calcul de champs de potentiels Antoine Bautin A. Université de Lorraine October 2013 http://hal.inria.fr/tel-00936953 Ph. D. Thesis " Le simple est-il robuste ? " : une étude de la robustesse des systèmes complexes par les automates cellulaires Olivier Bouré O. Université de Lorraine September 2013 http://hal.inria.fr/tel-00918545 Ph. D. Thesis First steps on asynchronous lattice-gas models with an application to a swarming rule Olivier Bouré O. Nazim Fatès N. Vincent Chevrier V. 1567-7818 Natural Computing 12 4 December 2013 551-560 http://hal.inria.fr/hal-00790561 Density Classification on Infinite Lattices and Trees Ana Busic A. Nazim Fatès N. Irène Marcovici I. Jean Mairesse J. 1083-6489 Electronic Journal of Probability 18 51 2013 1-22 http://hal.inria.fr/hal-00918583 La carotte et le bâton... et Tetris Alain Dutech A. Bruno Scherrer B. Christophe Thiery C. 2105-1003 Images des Mathématiques November 2013 http://hal.inria.fr/hal-00922142 Stochastic Cellular Automata Solutions to the Density Classiﬁcation Problem - When randomness helps computing Nazim Fatès N. 1432-4350 Theory of Computing Systems 53 2 2013 223-242 http://hal.inria.fr/inria-00608485 Extended version of the Stacs 2011 proceedings paper Off-policy Learning with Eligibility Traces: A Survey Matthieu Geist M. Bruno Scherrer B. 1532-4435 Journal of Machine Learning Research 2013 http://hal.inria.fr/hal-00921275 Accepted, To appear Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris Bruno Scherrer B. 
1532-4435 Journal of Machine Learning Research 14 January 2013 1175-1221 http://hal.inria.fr/hal-00759102 A Hybrid Bayesian Framework for Map Matching: Formulation Using Switching Kalman Filter Cherif Smaili C. Maan El Badaoui El Najjar M. François Charpillet F. 0921-0296 Journal of Intelligent and Robotic Systems 2013 http://hal.inria.fr/hal-00821912 à paraître Multi-robot exploration of unknown environments with identification of exploration completion and post-exploration rendez-vous using ant algorithms Mihai Andries M. François Charpillet F. IEEE/RSJ International Conference on Intelligent Robots and Systems Tokyo, Japan IEEE/RSJ November 2013 http://hal.inria.fr/hal-00913349 IEEE RSJ International Conference on Intelligent Robots and Systems 2011 IROS Active Diagnosis Through Information-Lookahead Planning Mauricio Araya M. Olivier Buffet O. Vincent Thomas V. 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes Lille, France July 2013 http://hal.inria.fr/hal-00907288 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA Cart-O-matic project : autonomous and collaborative multi-robot localization, exploration and mapping. 5th Workshop on Planning, Perception and Navigation for Intelligent Vehicles Antoine Bautin A. Philippe Lucidarme P. Rémy Guyonneau R. Olivier Simonin O. Sébastien Lagrange S. Nicolas Delanoue N. François Charpillet F. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Tokyo, Japan 2013 http://hal.inria.fr/hal-00875073 IEEE RSJ International Conference on Intelligent Robots and Systems 2009 IROS Cart-O-matic project : autonomous and collaborative multi-robot localization, exploration and mapping Antoine Bautin A. Philippe Lucidarme P. Rémy Guyonneau R. Olivier Simonin O. Sebastien Lagrange S. Nicolas Delanoue N. François Charpillet F. 
5th Workshop on Planning, Perception and Navigation for Intelligent Vehicles (a IROS 2013 workshop) Tokyo, Japan November 2013 210-215 http://hal.inria.fr/hal-00939161 IROS Workshop on Planning, Perception and Navigation for Intelligent Vehicles 5 SyWaP: Synchronized Wavefront Propagation for multi-robot assignment of spatially-situated tasks Antoine Bautin A. Olivier Simonin O. François Charpillet F. ICAR 2013 : International Conference on Advanced Robotics Uruguay November 2013 7 http://hal.inria.fr/hal-00920122 IEEE International Conference on Advanced Robotics 2013 ICAR A robustness approach to study metastable behaviours in a lattice-gas model of swarming Olivier Bouré O. Nazim Fatès N. Vincent Chevrier V. Jarkko Kari J. Martin Kutrib M. Andreas Malcher A. 19th International Workshop, AUTOMATA 2013 Giessen, Germany Lecture Notes in Computer Science 8155 Springer Berlin Heidelberg 2013 84-97 http://hal.inria.fr/hal-00768831 International Workshop on Cellular Automata and Discrete Complex Systems 19 AUTOMATA Multi-level modeling as a society of interacting models Benjamin Camus B. Christine Bourjot C. Vincent Chevrier V. L. Yilmaz L. SpringSim'13, ADS Symposium - Spring Simulation Multi-Conference, Agent-Directed Simulation Symposium - 2013 San Diego, United States 1 Curran Associates, Inc. Society for Modeling & Simulation International ( SCS ) April 2013 15-22 http://hal.inria.fr/hal-00816587 Spring Simulation Multi-Conference 2013 SpringSim Multi-agent simulation based governance of complex systems : architecture and example implementation on free-riding Laurent Ciarletta L. Vincent Chevrier V. Tomas Navarrete Gutierrez T. ENC 2013, Mexican International Conference on Computer Science MORELIA, Mexico October 2013 http://hal.inria.fr/hal-00905235 Mexican International Conference on Computer Science 2013 ENC Model-free POMDP optimisation of tutoring systems with echo-state networks Lucie Daubigney L. Matthieu Geist M. Olivier Pietquin O. 
SIGDial 2013 Metz, France August 2013 102-106 http://hal.inria.fr/hal-00869773 SIGDIAL Meeting on Discourse and Dialogue 14 SIGDIAL Optimisation par essaims particulaires de stratégies de dialogue Lucie Daubigney L. Matthieu Geist M. Olivier Pietquin O. Journées Francophones de Plannification, Décision et Apprentissage (JFPDA) Lille, France July 2013 http://hal.inria.fr/hal-00918425 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA Particle Swarm Optimisation of Spoken Dialogue System Strategies Lucie Daubigney L. Matthieu Geist M. Olivier Pietquin O. Interspeech 2013 Lyon, France August 2013 1-5 http://hal.inria.fr/hal-00916935 Annual Conference of the International Speech Communication Association 14 INTERSPEECH Random Projections: a Remedy for Overfitting Issues in Time Series Prediction with Echo State Networks Lucie Daubigney L. Matthieu Geist M. Olivier Pietquin O. ICASSP 2013 Vancouver, Canada May 2013 3253-3257 http://hal.inria.fr/hal-00869814 IEEE International Conference on Acoustics, Speech and Signal Processing 38 ICASSP Optimally Solving Dec-POMDPs as Continuous-State MDPs Jilles Steeve Dibangoye J. S. Christopher Amato C. Olivier Buffet O. François Charpillet F. IJCAI - 23rd International Joint Conference on Artificial Intelligence Pékin, China August 2013 http://hal.inria.fr/hal-00907338 International Joint Conference on Artificial Intelligence 23 IJCAI Résolution exacte des Dec-POMDPs comme des MDPs continus Jilles Steeve Dibangoye J. S. Christopher Amato C. Olivier Buffet O. François Charpillet F. 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes Lille, France July 2013 http://hal.inria.fr/hal-00907279 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA Producing efficient error-bounded solutions for transition independent decentralized MDPs Jilles Steeve Dibangoye J. S. Christopher Amato C. Arnaud Doniec A. François Charpillet F. 
International conference on Autonomous Agents and Multi-Agent Systems Saint Paul, MN, United States May 2013 539-546 http://hal.inria.fr/hal-00918066 International Conference on Autonomous Agents and Multiagent Systems 12 AAMAS Automatic Fall Detection System with a RGB-D Camera using a Hidden Markov Model Amandine Dubois A. François Charpillet F. ICOST - 11th International Conference On Smart homes and health Telematics - 2013 Singapore, Singapore Lecture Notes in Computer Science 7910 Springer June 2013 259-266 http://hal.inria.fr/hal-00914345 International Conference on Smart Homes and Health Telematics 11 ICOST Detecting and preventing falls with depth camera, tracking the body center Amandine Dubois A. François Charpillet F. AAATE - 12th European Association for the Advancement of Assistive Technology in Europe - 2013 Vilamoura, Algarve, Portugal September 2013 http://hal.inria.fr/hal-00914299 European Association for the Advancement of Assistive Technology in Europe 12 AAATE Human Activities Recognition with RGB-Depth Camera using HMM Amandine Dubois A. François Charpillet F. EMBC - 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society - 2013 Osaka, Japan July 2013 http://hal.inria.fr/hal-00914319 Annual International Conference of the IEEE Engineering in Medicine and Biology Society 35 EMBC Système d'évaluation de la fragilité chez les personnes âgées Amandine Dubois A. François Charpillet F. JETSAN - Journées d'étude sur la TéléSanté - 2013 Fontainebleau, France May 2013 http://hal.inria.fr/hal-00914354 Journées d'étude sur la TéléSanté 2013 JETSAN A guided tour of asynchronous cellular automata Nazim Fatès N. Andreas Malcher A. Martin Kutrib M. 
19th International Workshop AUTOMATA 2013 Giessen, Germany Lecture Notes in Computer Science 8155 Springer Andreas Malcher and Martin Kutrib 2013 http://hal.inria.fr/hal-00845623 International Workshop on Cellular Automata and Discrete Complex Systems 19 AUTOMATA A note on the classiffcation of the most simple asynchronous cellular automata Nazim Fatès N. Andreas Malcher A. Martin Kutrib M. AUTOMATA 2013 Giessen, Germany 8155 Springer Andreas Malcher and Martin Kutrib September 2013 31-45 http://hal.inria.fr/hal-00846059 International Workshop on Cellular Automata and Discrete Complex Systems 19 AUTOMATA Approximate Dynamic Programming Finally Performs Well in the Game of Tetris Victor Gabillon V. Mohammad Ghavamzadeh M. Bruno Scherrer B. Neural Information Processing Systems (NIPS) 2013 South Lake Tahoe, United States 2013 http://hal.inria.fr/hal-00921250 Annual Conference on Neural Information Processing Systems 27 NIPS Adaptive Management of Migratory Birds Under Sea Level Rise Samuel Nicol S. Olivier Buffet O. Takuya Iwamura T. Iadine Chadès I. Francesca Rossi F. IJCAI - 23rd International Joint Conference on Artificial Intelligence - 2013 Pékin, China AAAI Press August 2013 2955-2957 http://hal.inria.fr/hal-00907334 International Joint Conference on Artificial Intelligence 23 IJCAI Sur l'utilisation de politiques non-stationnaires pour les processus de décision Markoviens à horizon infini Bruno Scherrer B. Boris Lesner B. JFPDA - 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes - 2013 Lille, France 2013 http://hal.inria.fr/hal-00921291 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA Improved and Generalized Upper Bounds on the Complexity of Policy Iteration Bruno Scherrer B. 
Neural Information Processing Systems (NIPS) 2013 South Lake Tahoe, United States 2013 http://hal.inria.fr/hal-00921261 Annual Conference on Neural Information Processing Systems 27 NIPS

Quelques majorants de la complexité d'itérations sur les politiques Bruno Scherrer B. JFPDA - 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes - 2013 Lille, France 2013 http://hal.inria.fr/hal-00921287 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA

Abstraction Pathologies In Markov Decision Processes Manel Tagorti M. Bruno Scherrer B. Olivier Buffet O. Joerg Hoffmann J. ICAPS'13 workshop on Heuristics and Search for Domain-independent Planning (HSDIP) Rome, Italy June 2013 http://hal.inria.fr/hal-00907315 ICAPS Workshop on Heuristics and Search for Domain-independent Planning 2013 HSDIP

Abstraction Pathologies In Markov Decision Processes Manel Tagorti M. Bruno Scherrer B. Olivier Buffet O. Joerg Hoffmann J. 8èmes Journées Francophones sur la Planification, la Décision et l'Apprentissage pour la conduite de systèmes Lille, France July 2013 http://hal.inria.fr/hal-00907295 Journées Francophones Planification, Décision, Apprentissage 2013 JFPDA

Croisement synchronisé de flux de véhicules autonomes dans un réseau Mohamed Tlig M. Olivier Buffet O. Olivier Simonin O. RJCIA - 11èmes Rencontres des Jeunes Chercheurs en Intelligence Artificielle Lille, France July 2013 http://hal.inria.fr/hal-00914578 Rencontres Jeunes Chercheurs en Intelligence Artificielle 11 RJCIA

Reactive coordination rules for traffic optimization in road sharing problems Mohamed Tlig M. Olivier Buffet O. Olivier Simonin O.
AATMO - PAAMS workshop on Agent-based Approaches for the Transportation Modelling and Optimisation Salamanca, Spain Lecture Notes in Computer Science 7879 Springer May 2013 61-72 http://hal.inria.fr/hal-00907305 PAAMS Workshop on Agent-based Approaches for the Transportation Modelling and Optimisation 2013 PAAMS AATMO

Decentralized Near-to-Near Approach for Vehicle Platooning based on Memorization and Heuristic Search Jano Yazbeck J. Alexis Scheuer A. François Charpillet F. ICRA Hong Kong, China May 2014 http://hal.inria.fr/hal-00936056 IEEE International Conference on Robotics and Automation 2012 ICRA

Multi-level Modeling as a Society of Interacting Models Benjamin Camus B. Christine Bourjot C. Vincent Chevrier V. December 2013 http://hal.inria.fr/hal-00913038 Technical Report

Off-policy Learning with Eligibility Traces: A Survey Matthieu Geist M. Bruno Scherrer B. 2013 43 http://hal.inria.fr/hal-00644516 Research Report

How to design good Tetris players Amine Boumaza A. 2013 http://hal.inria.fr/hal-00926213

Is there something like "modellability"? - Reflections on the robustness of discrete models of complex systems Nazim Fatès N. Seminar Univ. de Concepcion Concepcion, Chile October 2013 http://hal.inria.fr/hal-00906991 Seminar Univ. de Concepcion

A guided tour of asynchronous cellular automata Nazim Fatès N. January 2014 http://hal.inria.fr/hal-00908373

Local structure approximation as a predictor of second order phase transitions in asynchronous cellular automata Henryk Fukś H. Nazim Fatès N. 2013 20 http://hal.inria.fr/hal-00921295

Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies Boris Lesner B. Bruno Scherrer B. April 2013 http://hal.inria.fr/hal-00815996

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee Bruno Scherrer B. Matthieu Geist M. June 2013 http://hal.inria.fr/hal-00829548

Improved and Generalized Upper Bounds on the Complexity of Policy Iteration Bruno Scherrer B.
June 2013 http://hal.inria.fr/hal-00829532

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms Bruno Scherrer B. June 2013 http://hal.inria.fr/hal-00829559

Reversibility of Elementary Cellular Automata Under Fully Asynchronous Update Biswanath Sethi B. Nazim Fatès N. Sukanta Das S. January 2014 http://www.annauniv.edu/tamc2014/ http://hal.inria.fr/hal-00906987 To appear in the proceedings of TAMC'14

Embodied, On-line, On-board Evolution for Autonomous Robotics Agoston E Eiben A. E. Evert Haasdijk E. Nicolas Bredeche N. Symbiotic Multi-Robot Organisms: Reliability, Adaptability, Evolution 5.2 Springer 2010 361–382 http://hal.inria.fr/inria-00531455

Frailty in Older Adults: Evidence for a Phenotype L. P. Fried L. P. C. M. Tangen C. M. J. Walson J. A. B. Newman A. B. C. Hirsch C. J. Gottdiener J. T. Seeman T. T. Russell T. W. J. Kop W. J. G. Burke G. M. A. Mc Burnie M. A. Journal of Gerontology, MEDICAL SCIENCE 56 3 2001 146-156

Automated Planning: Theory and Practice Malik Ghallab M. Dana Nau D. Paolo Traverso P. Morgan Kaufmann 2004

Modélisation des interactions entre le cortex cerebral et le cervelet au cours du mouvement Arthur Kaladjian A. Université de Paris IV, UFR Sciences de la Vie 1999 Ph. D. Thesis

Optimal Control for Biological Movement Systems Weiwei Li W. University of California, San Diego 2006 Ph. D. Thesis

Apprentissage par Renforcement Développemental Thomas Moinel T. Université de Lorraine, Master RAR June 2013 Masters thesis

Markov Decision Processes M. Puterman M. Wiley, New York 1994

Artificial Intelligence: A Modern Approach Stuart Russell S. Peter Norvig P. 2nd edition Prentice-Hall, Englewood Cliffs, NJ 2003

Rationality and Intelligence Stuart Russell S. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI) 1995 Invited paper (Computers and Thought Award)

Safe Longitudinal Platoons of Vehicles without Communication Alexis Scheuer A. Olivier Simonin O.
François Charpillet F. Proceedings of the IEEE International Conference on Robotics and Automation - ICRA'09 Kobe (JP) May 2009 70–75 http://www.loria.fr/~scheuer/Platoon

Approche multi-agent pour la multi-modélisation et le couplage de simulations. Application à l'étude des influences entre le fonctionnement des réseaux ambiants et le comportement de leurs utilisateurs. Julien Siebert J. Université Henri Poincaré - Nancy I September 2011 http://hal.inria.fr/tel-00642034 Ph. D. Thesis

Reinforcement Learning, An introduction R.S. Sutton R. A.G. Barto A. Bradford Book. The MIT Press 1998

Embodied Evolution: Distributing an evolutionary algorithm in a population of robots Richard A. Watson R. A. Sevan G. Ficici S. G. Jordan B. Pollack J. B. Robotics and Autonomous Systems 39 1 April 2002 1–18 http://linkinghub.elsevier.com/retrieve/pii/S0921889002001707

Improving near-to-near lateral control of platoons without communication Jano Yazbeck J. Alexis Scheuer A. Olivier Simonin O. François Charpillet F. IEEE-RSJ International Conference on Intelligent Robots and Systems (IROS) San Francisco, United States September 2011 4103–4108 http://hal.inria.fr/inria-00603726/en

Analysis and Implementation of Distributed Algorithms for Multi-Robot Systems James Dwight McLurkin J. MIT 2008 Ph. D. Thesis