The objective of the MAIA team is twofold.
The first research activity is about sequential decision making. It has been influenced by Stuart Russell's view of what it means for an agent to be rational: "For each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure." This view makes Markov decision processes (MDPs), and more generally sequential decision making, a good candidate for building the behavior of an agent. It is probably why MDPs have received considerable attention in recent years from the artificial intelligence (AI) community.
The second activity is about understanding and engineering reactive multi-agent systems. It is influenced by research results from the field of behavioral biology which gives us some of the keys to understand how intelligent and adaptive behaviors appear in natural swarm systems. This encourages us to study principles of emergent behaviors in natural systems and apply them to design artificial intelligent systems. Reactive multi-agent systems are good candidates for building such autonomous and adaptive systems and our work mainly focuses on better understanding how we can soundly build such systems.
A paper on non-stationary policies for infinite-horizon Markov decision processes written by Boris Lesner and Bruno Scherrer (see Section for more details) was accepted at NIPS'2012 with a full oral presentation (1467 papers were submitted, 370 were accepted for publication, among which only 20 were selected for full oral presentation).
The Cartomatic project, which was part of the French robotics contest Défi CAROTTE organized by the General Delegation for Armaments (DGA) and the French National Research Agency (ANR), won the third and final edition of the contest. The aim of the Cart-O-matic project was to design and build a multi-robot system able to autonomously map an unknown building and to recognize various objects inside. The scientific issues of this project deal with Simultaneous Localization And Mapping (SLAM), multi-robot collaboration, and object recognition and classification. The research teams involved in this project have developed innovative approaches to each of these fields.
The paper “MOMDPs: a Solution for Modelling Adaptive Management Problems”, cosigned by Olivier Buffet has won the best paper award in this year’s Special Track on Computational Sustainability and Artificial Intelligence at the Association for the Advancement of Artificial Intelligence (AAAI-12) conference in Toronto.
Emil Keyder, Joerg Hoffmann and Patrik Haslum (ANU/NICTA) won the best paper award of the International Conference on Automated Planning and Scheduling (ICAPS-12) for their paper “Semi-Relaxed Plan Heuristics”.
Sequential decision making consists, in a nutshell, in controlling the actions of an agent facing a problem whose solution requires not one but a whole sequence of decisions. This kind of problem occurs in a multitude of forms. For example, important applications addressed in our work include: Robotics, where the agent is a physical entity moving in the real world; Medicine, where the agent can be an analytic device recommending tests and/or treatments; Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network; and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process. Our work on such problems is characterized by three main research trends:
Understanding how, and to what extent, to best model the problems.
Developing algorithms solving the problems and understanding their behavior.
Applying our results to complex applications.
Before we describe some details of our work, it is instructive to understand the basic forms of problems we are addressing. We characterize problems along the following main dimensions:
Extent of the model: full vs. partial vs. none. This dimension concerns how complete we require the model of the problem – if any – to be. If the model is incomplete, then learning techniques are needed along with the decision making process.
Form of the model: factored vs. enumerative. Enumerative models explicitly list all possible world states and the associated actions etc. Factored models can be exponentially more compact, describing states and actions in terms of their behavior with respect to a set of higher-level variables.
World dynamics: deterministic vs. stochastic. This concerns our initial knowledge of the world the agent is acting in, as well as the dynamics of actions: is the outcome known a priori or are several outcomes possible?
Observability: full vs. partial. This concerns our ability to observe what our actions actually do to the world, i.e., to observe properties of the new world state. Obviously, this is an issue only if the world dynamics are stochastic.
These dimensions are widespread in the AI literature. We remark that they are not exhaustive. In parts of our work, we also consider the difference between discrete and continuous problems, and between centralized and decentralized problems. The complexity of solving the problem – both in theory and in practice – depends crucially on where the problem resides in this categorization. In many applications, not one but several points in the categorization make sense: simplified versions of the problem can be solved much more effectively, and thus serve to generate some – if possibly sub-optimal – action strategy in a more feasible manner. Of course, the application as such may also come in different facets.
In what follows, we outline the main formal frameworks on which our work is based; while doing so, we highlight in a little more detail our core research questions. We then give a brief summary of how our work fits into the global research context.
Sequential decision making with deterministic world dynamics is most commonly known as planning, or classical planning. Obviously, in such a setting every world state needs to be considered at most once, and thus enumerative models do not make sense (the problem description would have the same size as the space of possibilities to be explored). Planning approaches support factored description languages that allow modeling complex problems in a compact way. Approaches to automatically learn such factored models do exist; however, most works – and also most of our work on this form of sequential decision making – assume that the model is provided by the user of the planning technology. Formally, a problem instance, commonly referred to as a planning task, is a four-tuple consisting of the state variables, the actions, the initial state, and the goal.
Planning is PSPACE-complete even under strong restrictions on the formulas allowed in the planning task description. Research thus revolves around the development and understanding of search methods, which explore, in a variety of different ways, the space of possible action schedules. A particularly successful approach is heuristic search, where search is guided by information obtained in an automatically designed relaxation (simplified version) of the task. We investigate the design of relaxations, the connections between such design and the search space topology, and the construction of effective planning systems that exhibit good practical performance across a wide range of different inputs. Other important research lines concern the application of ideas successful in planning to stochastic sequential decision making (see next), and the development of technology supporting the user in model design.
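As a small illustration of heuristic search guided by a relaxation, the sketch below runs greedy best-first search on a toy factored task, using the number of unsatisfied goal facts as a (very crude) relaxation heuristic. The names and the task encoding are illustrative only; they are not those of our actual planning systems.

```python
import heapq, itertools

def plan(initial, goal, actions):
    """Greedy best-first search on a factored task. States are frozensets of
    facts; each action is a (name, preconditions, add_list, delete_list) tuple."""
    def h(state):                        # relaxation: count unsatisfied goal facts
        return len(goal - state)
    counter = itertools.count()          # tie-breaker for the priority queue
    frontier = [(h(initial), next(counter), initial, [])]
    seen = {initial}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if goal <= state:                # all goal facts satisfied
            return path
        for name, pre, add, delete in actions:
            if pre <= state:             # action applicable
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(frontier, (h(nxt), next(counter), nxt, path + [name]))
    return None
```

On a toy logistics task (load a package, drive, unload), this returns the three-step plan; real planners differ mainly in far better relaxations and in factored successor generation.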
Markov Decision Processes (MDPs) are a natural framework for stochastic sequential decision making. An MDP is a four-tuple ⟨S, A, T, R⟩ of states, actions, a transition function, and a reward function. Once the optimal value function is computed, it is straightforward to derive an optimal strategy, which is deterministic and memoryless, i.e., a simple mapping from states to actions. Such a strategy is usually called a policy. An optimal policy is any policy that is greedy with respect to the optimal value function.
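For the enumerative case, the standard value-iteration scheme and the greedy policy extraction can be sketched as follows. This is a minimal textbook illustration, not one of the team's solvers.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, eps=1e-8):
    """P[a][s][s'] transition probabilities and R[s][a] expected rewards,
    both NumPy arrays. Returns the optimal value function and a greedy
    (hence optimal, deterministic, memoryless) policy."""
    nA, nS = len(P), len(R)
    V = np.zeros(nS)
    while True:
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_s' P[a][s][s'] V[s']
        Q = np.array([R[:, a] + gamma * P[a] @ V for a in range(nA)]).T
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < eps:
            return V_new, Q.argmax(axis=1)   # greedy policy: states -> actions
        V = V_new
```

On a two-state toy MDP where only moving out of state 0 is rewarded, the greedy policy keeps cycling through state 0, collecting the reward every other step.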
An important extension of MDPs, known as Partially Observable MDPs (POMDPs), accounts for the fact that the state may not be fully available to the decision maker. While the goal is the same as in an MDP (optimizing the expected sum of discounted rewards), the solution is more intricate. Any POMDP can be seen to be equivalent to an MDP defined on the space of probability distributions over states, called belief states. The Bellman machinery then applies to the belief states. The specific structure of the resulting MDP makes it possible to iteratively approximate the optimal value function – which is convex in the belief space – by piecewise-linear functions, and to deduce an optimal policy that maps belief states to actions.
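The belief state itself is maintained by simple Bayesian filtering; a minimal sketch of that update (with illustrative array conventions) is:

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes filter over states: b'(s') ∝ O[a][s'][o] * sum_s T[a][s][s'] * b(s),
    where T[a][s][s'] is the transition model and O[a][s'][o] the
    observation model (NumPy arrays)."""
    b_new = O[a][:, o] * (b @ T[a])   # predict through T, weigh by observation
    return b_new / b_new.sum()        # normalize to a probability distribution
```

For instance, a "listen" action that leaves the state unchanged and reports the true state with probability 0.85 moves a uniform belief to (0.85, 0.15) after one consistent observation.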
A further extension, known as a DEC-POMDP, considers multiple cooperative agents, each receiving its own partial observations and acting in a decentralized fashion.
The MDP model described above is enumerative, and the complexity of computing the optimal value function is polynomial in the size of that input. However, in examples of practical size, that complexity is still too high so naïve approaches do not scale. We consider the following situations: (i) when the state space is large, we study approximation techniques from both a theoretical and practical point of view; (ii) when the model is unknown, we study how to learn an optimal policy from samples (this problem is also known as Reinforcement Learning ); (iii) in factored models, where MDP models are a strict generalization of classical planning – and are thus at least PSPACE-hard to solve – we consider using search heuristics adapted from such (classical) planning.
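For case (ii), a minimal model-free sketch illustrates learning from samples alone: tabular Q-learning with ε-greedy exploration, a standard textbook algorithm rather than one of our specific contributions. The `step` function stands in for the unknown environment.

```python
import random

def q_learning(step, n_states, n_actions, episodes=2000, horizon=50,
               alpha=0.1, gamma=0.95, epsilon=0.1):
    """Model-free reinforcement learning: learn Q from sampled transitions
    only. `step(s, a)` returns (next_state, reward) drawn from the unknown MDP."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            # epsilon-greedy exploration
            a = random.randrange(n_actions) if random.random() < epsilon \
                else max(range(n_actions), key=lambda x: Q[s][x])
            s2, r = step(s, a)
            # temporal-difference update toward the Bellman target
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

On the same two-state toy MDP as above, the greedy policy read off the learned Q-table matches the one value iteration computes from the full model.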
Solving a POMDP is PSPACE-hard even given an enumerative model. In this framework, we are mainly looking for assumptions that could be exploited to reduce the complexity of the problem at hand, for instance when some actions have no effect on the state dynamics (active sensing). The decentralized version, DEC-POMDPs, induces a significant increase in complexity (NEXP-complete). We tackle the challenging – even for (very) small state spaces – exact computation of finite-horizon optimal solutions through alternative reformulations of the problem. We also aim at proposing advanced heuristics to efficiently address problems with more agents and a longer time horizon.
Within Inria, the most closely related teams are TAO and Sequel. TAO works on evolutionary computation (EC) and statistical machine learning (ML), and their combination. Sequel works on ML, with a theoretical focus combining CS and applied maths. The main difference is that TAO and Sequel consider particular algorithmic frameworks that can, amongst others, be applied to Planning and Reinforcement Learning, whereas we revolve around Planning and Reinforcement Learning as the core problems to be tackled, with whichever framework is suitable.
In France, we have recently begun collaborating with the IMS Team of Supélec Metz, notably with O. Pietquin and M. Geist, who have great expertise in approximate techniques for MDPs. We have links with the MAD team of the BIA unit of the INRA at Toulouse, led by R. Sabbadin. They also use MDP-related models and are interested in solving large-size problems, but they are more driven by applications (mostly agricultural) than we are. In Paris, the Animat Lab, which was part of the LIP6 and is now attached to the ISIR, has done some interesting work on factored Markov Decision Problems and POMDPs. Like us, their main goal is to tackle problems with large state spaces.
In Europe, the IDSIA Lab at Lugano (Switzerland) has brought some interesting ideas to the field of MDPs (meta-learning, subgoal discovery) but now seems more interested in a Universal Learner. In Osnabrück (Germany), the Neuroinformatics group works on efficient reinforcement learning with a specific interest in applications to robotics. For deterministic planning, the most closely related groups are located in Freiburg (Germany), Glasgow (UK), and Barcelona (Spain). We have active collaborations with all of these.
In the rest of the world, the most important groups regarding MDPs can be found at Brown University, Rutgers Univ. (M. Littman), Univ. of Toronto (C. Boutilier), MIT AI Lab (L. Kaelbling, D. Bertsekas, J. Tsitsiklis), Stanford Univ., CMU, Univ. of Alberta (R. Sutton), Univ. of Massachusetts at Amherst (S. Zilberstein, A. Barto), etc. A major part of their work is aimed at making Markov Decision Process based tools work on real life problems and, as such, our scientific concerns meet theirs. For deterministic planning, important related groups and collaborators are to be found at NICTA (Canberra, Australia) and at Cornell University (USA).
There exist numerous examples of natural and artificial systems where self-organization and emergence occur. Such systems are composed of a set of simple entities interacting in a shared environment and exhibit complex collective behaviors resulting from the interactions of the local (or individual) behaviors of these entities. The properties that they exhibit, for instance robustness, explain why their study has been growing, both in the academic and the industrial field. They are found in a wide panel of fields such as sociology (opinion dynamics in social networks), ecology (population dynamics), economy (financial markets, consumer behaviors), ethology (swarm intelligence, collective motion), cellular biology (cells/organ), computer networks (ad-hoc or P2P networks), etc.
More precisely, the systems we are interested in are characterized by:
locality: Elementary components have only a partial perception of the system's state; similarly, a component can only modify its surrounding environment.
individual simplicity: components have a simple behavior, in most cases it can be modeled by stimulus/response laws or by look-up tables. One way to estimate this simplicity is to count the number of stimulus/response rules for instance.
emergence: It is generally difficult to predict the global behavior of the system from the local individual behaviors. This difficulty of prediction is often observed empirically, and in some cases (e.g., cellular automata) one can show that the prediction of the global properties of a system is an undecidable problem. However, observations coming from simulations of the system may help us to find the regularities that occur in the system's behavior (even in a probabilistic sense). Our interest is to work on problems where a full mathematical analysis seems out of reach and where it is useful to observe the system with large simulations. In return, it is frequent that the properties observed empirically are then studied on an analytical basis. This approach should allow us to understand more clearly where the frontier between simulation and analysis lies.
levels of description and observation: Describing a complex system involves at least two levels: the micro level that regards how a component behaves, and the macro level associated with the collective behavior. Usually, understanding a complex system requires to link the description of a component behavior with the observation of a collective phenomenon: establishing this link may require various levels, which can be obtained only with a careful analysis of the system.
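A tiny example of such simulation-based observation uses elementary cellular automata, one of the model classes mentioned above (the code is purely illustrative, unrelated to our specific models): the micro level is a three-cell local rule, and a macro regularity (here, conservation of the density of 1s under Wolfram's rule 184, a classic traffic model) can be observed empirically by running the system.

```python
def step_ca(cells, rule):
    """One synchronous update of an elementary cellular automaton on a
    circular lattice. `rule` is the Wolfram rule number (0..255); each cell's
    next state is the rule bit indexed by its (left, center, right) neighborhood."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)]

# Rule 184 models road traffic: cars (1s) move right into empty cells,
# so the number of cars is conserved -- a macro property of a micro rule.
row = [1, 1, 0, 1, 0, 0, 0, 1]
for _ in range(4):
    row = step_ca(row, 184)
```

Observing many such runs is exactly the kind of experiment from which macro-level regularities are first conjectured and then, when possible, proved.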
We now describe the type of models that are studied in our group.
To represent these complex systems, we made the choice to use reactive multi-agent systems (RMAS). Multi-agent systems are defined by a set of reactive agents, an environment, a set of interactions between agents and a resulting organization. They are characterized by a decentralized control shared among agents: each agent has an internal state, has access to local observations and influences the system through stimulus response rules. Thus, the collective behavior results from individual simplicity and successive actions and interactions of agents through the environment.
Reactive multi-agent systems present several advantages for modeling complex systems:
agents are explicitly represented in the system and have the properties of local action, interaction and observation;
each agent can be described independently of the other agents; multi-agent systems thus allow explicit heterogeneity among agents, which is often at the root of collective emergent phenomena;
multi-agent systems can be executed through simulation and provide a good model for investigating the complex link between global and local phenomena, for which analytic studies are hard to perform.
By proposing two different levels of description, the local level of the agents and the global level of the phenomenon, and several execution models, multi-agent systems constitute an interesting tool to study the link between local and global properties.
Despite the widespread use of multi-agent systems, their framework still needs many improvements to be fully accessible to computer scientists from various backgrounds. For instance, there is no generic model to mathematically define a reactive multi-agent system and to describe its interactions. This situation is in contrast with the field of cellular automata, for instance, and underlines that a unification of multi-agent systems under a general framework is a question that still remains to be tackled. We now list the different challenges that, in part, contribute to such an objective.
Our work is structured around the following challenges that combine both theoretical and experimental approaches.
Currently, there is no agreement on a formal definition of a multi-agent system. Our research aims at translating the concepts from the field of complex systems into the multi-agent systems framework.
One objective of this research is to remove the potential ambiguities that can appear if one describes a system without explicitly formulating each aspect of the simulation framework. As a benefit, the reproduction of experiments is facilitated. Moreover, this approach is intended to give better insight into the self-organization properties of the systems.
Another important question consists in monitoring the evolution of complex systems. Our objective is to provide some quantitative characteristics of the system such as local or global stability, robustness, complexity, etc. Describing our models as dynamical systems leads us to use specific tools of this mathematical theory as well as statistical tools.
Since there is no central control of our systems, one question of interest is to know under which conditions it is possible to guarantee a given property when the system is subject to perturbations. We tackle this issue by designing exogenous control architectures where control actions are envisaged as perturbations of the system. As a consequence, we seek to develop control mechanisms that can change the global behavior of a system without modifying the agents' behavior (and thus without violating the autonomy property).
The aim is to design individual behaviors and interactions in order to produce a desired collective output. This output can be a collective pattern to reproduce in case of simulation of natural systems. In that case, from individual behaviors and interactions we study if (and how) the collective pattern is produced. We also tackle “inverse problems” (decentralized gathering problem, density classification problem, etc.) which consist in finding individual behaviors in order to solve a given problem.
Building a reactive multi-agent system consists in defining a set (generally a large number) of simple and reactive agents within a shared environment (physical or virtual) in which they move, act and interact with each other. Our interest in these systems is that, in spite of their simple definition at the agent level, they produce coherent and coordinated behavior at a global scale. The properties that they may exhibit, such as robustness and adaptivity explain why their study has been growing in the last decade (in the broader context of “complex systems”).
Our work on such problems is characterized by five research trends: (A) Defining a formal framework for describing and studying these systems, (B) Developing and understanding reactive multi-agent systems, (C) Analysing and proving properties, (D) Deploying these systems on typical distributed architectures such as swarms of robots, FPGAs, GPUs and sensor networks, (E) Transferring our results in applications.
Multi-agent systems are an active area of research in Artificial Intelligence and Complex Systems. Our research fits well into the international research context, and we have made and are making a variety of significant contributions on both theoretical and practical issues.
Concerning multi-agent simulation and formalization, we compete or collaborate in France with S. Hassas in LIESP (Lyon), CERV (Brest), IREMIA (la Réunion), Ibisc (Evry), Lirmm (Montpellier), Irit (Toulouse), A. Drogoul (IRD, Bondy) and abroad with F. Zambonelli (Univ. Modena, Italy) A. Deutsch (Dresden, Germany), D. Van Parunak (Vector research, USA), P. Valkenaers, D. Weyns (Univ. Leuven, Belgium), etc.
Regarding our work on swarm robotics, we have common objectives with the DISAL laboratory (EPFL, Switzerland).
Our group is involved in several applications of its more fundamental work on autonomous decision making and complex systems. Applications addressed include:
Robotics, where the decision maker or agent is supported by a physical entity moving in the real world;
Medicine or Personally Assisted Living, where the agent can be an analytic device recommending tests and/or treatments, or gathering different sources of information (e.g., from sensors) in order to help an end user, for example by detecting abnormal situations in which a person needs assistance (fall detection for elderly people, risk of hospitalization of a person suffering from a chronic disease);
Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network;
and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process.
Taking into consideration some strategic scientific choices made by the Nancy – Grand Est Research Center, such as developing platforms around Robotics and Smart Living Apartments, some members of the team have recentered their research toward “ambient intelligence and AI”. This choice has been further supported by the launch by Inria of a large-scale initiative project termed PAL (Personally Assisted Living), in which we are strongly involved. The regional council of Lorraine also supports this new research line through the CPER (project "situated computing" or "INFOSITU" (http://
Evaluation of the degree of frailty of the elderly: We are currently designing a system whose objective is twofold: assessing the risk of falls and evaluating the degree of frailty of the elderly. This issue is considered with more or less sophisticated sensors: one Kinect, a network of Kinects, or a heterogeneous sensor network made up of an intelligent floor and a network of Kinects. One simple idea currently being developed is to determine either the center of mass of a person using one or several Kinects, or the center of pressure and footstep localization using an intelligent floor. The idea is to infer from these simple measures the walking speed, the length of the steps, and the position of the monitored persons.
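A sketch of the kind of computation involved in deriving walking speed and step length from floor measurements (the data format and function names are hypothetical; the actual pipeline is more involved):

```python
def gait_features(footsteps):
    """Estimate the mean step length (m) and mean walking speed (m/s) from a
    chronological sequence of (timestamp_s, x_m, y_m) footstep detections,
    such as those produced by a pressure-sensitive floor."""
    lengths, speeds = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(footsteps, footsteps[1:]):
        d = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5   # distance between steps
        lengths.append(d)
        speeds.append(d / (t1 - t0))                    # step distance over step time
    return sum(lengths) / len(lengths), sum(speeds) / len(speeds)
```

For example, two 0.6 m steps taken half a second apart yield a mean step length of 0.6 m and a walking speed of 1.2 m/s.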
People activity analysis: Sensitive or intelligent floors have attracted a lot of attention during the last two decades for various applications, ranging from interaction capture in immersive virtual environments to robotics, human tracking, fall detection, and activity recognition. Different technologies have been proposed so far, based either on optical fiber sensing, pressure sensing, or electrical near-field sensing. In PAL we envision a more sophisticated approach in which both computation and sensing are distributed within the floor. This floor is made up of interconnected intelligent tiles which can:
communicate with each other,
provide some computation,
sense the environment activity through four weight sensors, an accelerometer and a magnetometer,
interact with users, robots or other sensor networks, either by wireless/wired communication or through visual communication (each tile being equipped with 16 LEDs).
Several scientific challenges are open to us:
in decentralized spatial computing
in designing real application for assisting people suffering from loss of autonomy
Concerning the second point, we envision several applications:
evaluation of the degree of frailty of the elderly, especially evaluating the risk of fall
activity recognition
monitoring assistant robots
Laurent Ciarletta (Madynes team, LORIA) is a collaborator and correspondent for this software.
AA4MM (Agents and Artefacts for Multi-modeling and Multi-simulation) is a framework for coupling existing and heterogeneous models and simulators in order to model and simulate complex systems. The first implementation of the AA4MM meta-model was proposed in Julien Siebert's PhD and written in Java. This year we added a new coupling between models to represent multi-level modeling, and rewrote a part of the core to ease the coupling of simulators.
This work was undertaken in a joint PhD thesis between the MAIA and Madynes teams. Laurent Ciarletta (Madynes team, LORIA) has been co-advisor of this PhD and correspondent for this software.
Other contributors to this software were: Tom Leclerc, François Klein, Christophe Torin, Marcel Lamenu, Guillaume Favre and Amir Toly.
MASDYNE (Multi-Agent Simulator of DYnamic Networks usErs) is a multi-agent simulator for modeling and simulating user behaviors in mobile ad hoc networks. This software is part of joint work with the MADYNES team on the modeling and simulation of ubiquitous networks.
FiatLux is a discrete dynamical systems simulator that allows the user to experiment with various models and to perturb them. Its main feature is to allow users to change the type of updating, for example from a deterministic parallel updating to an asynchronous random updating. FiatLux has a Graphical User Interface and can also be launched in a batch mode for the experiments that require statistics. In 2012, the main contributions were made by Olivier Bouré, who developed a lattice-gas cellular automata module.
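The effect of switching the updating scheme, FiatLux's central feature, can be sketched in a few lines (illustrative code, not FiatLux itself): the same local rule produces different dynamics depending on whether all cells update synchronously or one randomly chosen cell updates at a time.

```python
import random

def update(cells, rule, mode="parallel", rng=random):
    """One time step of an elementary cellular automaton under two updating
    schemes: 'parallel' updates every cell synchronously; 'asynchronous'
    updates a single, randomly chosen cell."""
    n = len(cells)
    def local(i):
        code = cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n]
        return (rule >> code) & 1
    if mode == "parallel":
        return [local(i) for i in range(n)]
    i = rng.randrange(n)            # fully asynchronous: one random cell
    out = list(cells)
    out[i] = local(i)
    return out
```

Under rule 51 (each cell simply inverts itself), the parallel mode flips the whole configuration at once, while the asynchronous mode changes exactly one cell per step; several rules that merely blink under parallel updating behave qualitatively differently under random asynchronous updating.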
Philippe Lucidarme (Université d'Angers, LISA) is a collaborator and the coordinator of the Cartomatic project.
Cart-o-matic is a software platform for (multi-)robot exploration and mapping tasks. It has been developed by Maia members and LISA (Univ. Angers) members during the robotics ANR/DGA Carotte challenge (2009-2012). This platform is composed of three software components which have been protected by software copyrights (APP): Slam-o-matic, a SLAM algorithm developed by LISA members; Plan-o-matic, a robot trajectory planning algorithm developed by Maia and LISA members; and Expl-o-matic, a distributed multi-agent strategy for multi-robot exploration developed by Maia members (based on algorithms proposed in the PhD thesis of Antoine Bautin). Cf. illustration at Cart-o-matic.
The purchase of Cart-o-matic by some robotics companies is underway.
Carlos Sarraute (Core Security Technologies) is an external collaborator.
Core Security Technologies is a U.S.-American/Argentinian company providing, amongst other things, tools for (semi-)automated security checking of computer networks against outside hacking attacks. For the automation of such checks, a module is needed that automatically generates potential attack paths. Since the application domain is highly dynamic, a module that allows declaratively specifying the environment (the network and its configuration) is highly advantageous. For that reason, Core Security Technologies has been looking into using AI Planning techniques for this purpose. After consulting with Jörg Hoffmann, they are now using a variant of Jörg Hoffmann's FF planner in their product. While that solution is satisfactory in many respects, it also has weaknesses. The main weakness is that it does not handle the incomplete knowledge in this domain: figuratively speaking, the attacker is assumed to have perfect information about the network. This results in high costs in terms of runtime and network traffic, due to extensive scanning activities prior to planning.
We are currently working with Core Security's research department to overcome this issue, by modeling and solving the attack planning problem as a POMDP instead. A workshop paper detailing the POMDP model was published at SecArt'11. While such a model yields much higher quality attacks, solving an entire network as a POMDP is not feasible. We have designed a decomposition method making use of network structure and approximations to overcome this problem, by using the POMDP model only to find good-quality attacks on single machines, and propagating the results through the network in an appropriate manner. This work has been published at ICAPS'12.
In the context of Mauricio Araya's PhD, we are working on how MDPs (or related models) can search for information. This has led to various research directions, such as extending POMDPs so as to optimize information-based rewards, or actively learning MDP models. This year, we have focused on a novel optimistic Bayesian Reinforcement Learning algorithm (described below) and on Mauricio's dissertation.
Exact or approximate solutions to Model-based Bayesian RL are impractical, so that a number of heuristic approaches have been considered, most of them relying on the principle of “optimism in the face of uncertainty”. Some of these algorithms have properties that guarantee the quality of their outcome, inspired by the PAC-learning (Probably Approximately Correct) framework. For example, some algorithms provably make in most cases the same decision as would be made if the true model were known (PAC-MDP property).
We have proposed a novel optimistic algorithm, BOLT, which is appealing in that it is (i) optimistic about the uncertainty in the model and (ii) deterministic (thus easier to study); and it is provably PAC-BAMDP, i.e., it makes in most cases the same decision as a perfect BRL algorithm would. This work has been published at ICML'12 and (in French) at JFPDA'12.
Maxim Dorin, Luca Santinelli, Liliana Cucu-Grosjean (Inria, TRIO team), and Rob Davies (U. of York) are external collaborators.
In this collaborative research work (mainly with the TRIO team), we look at the problem of scheduling periodic tasks on a single processor, in the case where each task's period is a (known) random variable. In this setting, some deadlines will necessarily be missed, so one tries to satisfy criteria that depend on the number of deadline misses.
We have proposed three criteria: (1) satisfying pre-defined deadline miss ratios, (2) minimizing the worst deadline miss ratio, and (3) minimizing the average deadline miss ratio. For each criterion we propose an algorithm that computes a provably optimal fixed priority assignment, i.e., a solution obtained by assigning priorities to tasks and executing jobs by order of priority.
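To make the criteria concrete, here is a toy discrete-time simulation of preemptive fixed-priority scheduling that measures per-task deadline-miss ratios. Periods are kept deterministic for readability (the paper's setting draws them at random), deadlines are implicit (equal to periods), and an unfinished job is simply abandoned at its deadline; all of this is an illustrative simplification, not the algorithms of the paper.

```python
def miss_ratios(tasks, priority, horizon):
    """Simulate preemptive fixed-priority scheduling of periodic tasks, each
    given as (period, wcet) in time units, with implicit deadlines. `priority`
    lists one value per task (lower value = higher priority). Returns each
    task's deadline-miss ratio over `horizon` time units."""
    n = len(tasks)
    remaining = [0] * n     # outstanding work of each task's current job
    released = [0] * n      # number of jobs released so far
    missed = [0] * n
    for t in range(horizon):
        for i, (period, wcet) in enumerate(tasks):
            if t % period == 0:          # a new job is released
                if remaining[i] > 0:     # previous job missed its deadline
                    missed[i] += 1
                remaining[i] = wcet
                released[i] += 1
        # execute one time unit of the highest-priority pending task
        for i in sorted(range(n), key=lambda i: priority[i]):
            if remaining[i] > 0:
                remaining[i] -= 1
                break
    return [m / r for m, r in zip(missed, released)]
```

Criteria (1)-(3) can then be read as constraints or objectives on the returned vector: e.g., a priority assignment minimizing the worst entry addresses criterion (2).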
This work was presented at RTNS'11, and an extended version is currently in preparation.
Iadine Chadès, Josie Carwardine, Tara G. Martin (CSIRO), Samuel Nicol (U. of Alaska Fairbanks) and Régis Sabbadin (INRA) are external collaborators.
In the field of conservation biology, adaptive management is about managing a system, e.g., performing actions so as to protect some endangered species, while learning how it behaves. This is a typical reinforcement learning task that could for example be addressed through BRL.
Here, we consider that a number of experts provide us with one possible model each, assuming that one of them is the true model. This allows making decisions by solving a hidden model MDP (hmMDP). An hmMDP is essentially a simplified mixed observability MDP (MOMDP), where the hidden part of the state corresponds to the model (in cases where all other variables are fully observable).
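In this setting, the belief over which expert model is the true one is updated by simple Bayesian filtering over observed transitions; a minimal illustrative sketch (data layout hypothetical):

```python
def model_posterior(prior, models, s, a, s2):
    """Bayesian update of the belief over candidate expert models after
    observing the transition (s, a, s2). Each model m is a nested list with
    m[s][a][s2] = P_m(s2 | s, a). In an hmMDP this belief is the only hidden
    part of the (mixed-observability) state."""
    post = [p * m[s][a][s2] for p, m in zip(prior, models)]
    z = sum(post)                      # normalizing constant
    return [p / z for p in post]
```

For instance, if one expert predicts a transition with probability 0.9 and the other with 0.1, observing it once moves a uniform prior to (0.9, 0.1).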
From a theoretical point of view, we have proved that deciding whether a finite-horizon hmMDP problem admits a solution policy of value greater than a pre-defined threshold is a PSPACE-complete problem. We have also conducted preliminary studies of this approach, using the scenario of the protection of the Gouldian finch, and focusing on the particular characteristics that could be exploited to more efficiently solve this problem. These results have been presented in AAAI'12 .
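The core mechanism can be sketched as a belief over the candidate models provided by the experts, updated by Bayes' rule from observed transitions (the data layout and function names below are illustrative, not those of our implementation):

```python
import numpy as np

def update_belief(belief, models, s, a, s2):
    """Bayesian update of the belief over candidate models after
    observing transition (s, a, s2).  models[k][a][s, s2] is the
    transition probability under candidate model k (hypothetical
    layout: one (S, S) matrix per action)."""
    likelihood = np.array([m[a][s, s2] for m in models])
    post = belief * likelihood
    z = post.sum()
    return post / z if z > 0 else belief

def mean_model(belief, models, a):
    """Belief-weighted average transition matrix for action a, which a
    QMDP-like simplification could plan with."""
    return sum(b * m[a] for b, m in zip(belief, models))
```

If one of the candidate models is the true one, repeated updates concentrate the belief on it, which is exactly the hidden part of the state an hmMDP solver reasons about.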
Fabien Flacher (Thales THERESIS) is an external collaborator.
In collaboration with the Thales ThereSIS SE&SIM team (Synthetic Environment & Simulation), we focus on the problem of following the trajectories of several persons with the help of several controllable cameras. This problem is difficult because the set of cameras cannot cover the whole environment simultaneously, because some persons can be hidden by obstacles or by other persons, and because the behavior of each person is governed by internal variables that can only be inferred (such as their motivation or their hunger).
The approach we are working on is based on (1) POMDP formalisms to represent the state of the system (persons and their internal states) and the possible actions for the cameras, (2) a simulator provided and developed by Thales ThereSIS, and (3) particle filtering approaches based on this simulator.
From a theoretical point of view, we are currently investigating how to use a deterministic simulator to generate new particles while keeping a good approximation of the posterior distribution.
External collaborators: Christopher Amato, Arnaud Doniec.
Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation independent Dec-MDP is a general subclass that has been shown to have complexity in NP, but optimal algorithms for this subclass are still inefficient in practice. We first provide an updated proof that an optimal policy does not depend on the histories of the agents, but only on their local observations. We then present a new algorithm based on heuristic search that is able to expand search nodes by using constraint optimization. We show experimental results comparing our approach with the state-of-the-art Dec-MDP and Dec-POMDP solvers. These results show a reduction in computation time and an increase in scalability by multiple orders of magnitude in a number of benchmarks.
External collaborators: Victor Gabillon, Mohammad Ghavamzadeh and Matthieu Geist.
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods. Despite its generality, MPI has not been thoroughly studied, especially its approximation form which is used when the state and/or action spaces are large or infinite. We have proposed three implementations of approximate MPI (AMPI) that are extensions of well-known approximate DP algorithms: fitted-value iteration, fitted-Q iteration, and classification-based policy iteration. We have provided an error propagation analysis that unifies those for approximate policy and value iteration. For the classification-based implementation, we have developed a finite-sample analysis showing that MPI's main parameter controls the balance between the estimation error of the classifier and the overall value function approximation error.
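A minimal sketch of exact MPI on a finite MDP, where the parameter m interpolates between value iteration (m = 1) and policy iteration (large m); this is the exact algorithm, not the approximate variants studied in our work:

```python
import numpy as np

def mpi(P, R, gamma=0.9, m=5, iters=200):
    """Modified policy iteration (sketch).
    P: (A, S, S) transition tensor, R: (A, S) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    pi = np.zeros(S, dtype=int)
    for _ in range(iters):
        Q = R + gamma * P @ V           # (A, S) action-values
        pi = Q.argmax(axis=0)           # greedy policy improvement
        for _ in range(m):              # m steps of partial evaluation
            Q = R + gamma * P @ V
            V = Q[pi, np.arange(S)]
    return V, pi
```

On a two-state MDP where the agent must move to a rewarding state and stay there, MPI recovers the optimal values V = (9, 10) for gamma = 0.9.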
External collaborators: Matthieu Geist, Mohammad Ghavamzadeh and Alessandro Lazaric.
LSTD is one of the most popular reinforcement learning algorithms for value function approximation. Whenever the number of features is larger than the number of samples, LSTD must be paired with some form of regularization. In particular,
In infinite-horizon stationary
This work was presented and selected for a full oral presentation at NIPS'2012 .
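As an illustration, ℓ2-regularized LSTD amounts to a few lines of linear algebra (a sketch under the standard sample-based formulation; the notation is ours):

```python
import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.95, l2=1e-3):
    """l2-regularized LSTD (sketch).  phi[i] and phi_next[i] are the
    feature vectors of the i-th sampled state and of its successor;
    returns the weight vector of the linear value estimate."""
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    d = phi.shape[1]
    return np.linalg.solve(A + l2 * np.eye(d), b)
```

On a trivial chain with constant reward 1 and a constant feature, the estimate approaches the true value 1/(1-gamma) = 20.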
External collaborators: Matthieu Geist (IMS Supelec), Olivier Pietquin (IMS Supelec)
Reinforcement learning in rich, complex and large sensorimotor spaces is a difficult problem, mainly because such a huge space cannot be explored exhaustively. The idea is thus to adopt a developmental approach where the perception and motor skills of the robot grow in richness and complexity during learning: the size of the state and action spaces grows progressively as the performance of the learning agent increases. The learning framework relies on function approximators with specific properties (continuous input space, life-long adaptation, knowledge transfer). Architectures based on “reservoir learning” and “dynamical self-organizing maps” artificial neural networks have been investigated , .
Reinforcement learning (RL) is now part of the state of the art in the domain of spoken dialog system (SDS) optimization. The best performing RL methods, such as those based on Gaussian processes, require testing small changes in the policy to assess whether they are improvements or degradations. This process is called on-policy learning. Nevertheless, it can result in system behaviors that are not acceptable to users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is, learn off-policy. Such methods usually fail to scale up and are thus not suited for real-world systems. In this work, a sample-efficient, online and off-policy RL algorithm is proposed to learn an optimal policy . This algorithm is combined with a compact non-linear value function representation (namely a multilayer perceptron), enabling it to handle large-scale systems. One of the application domains is the teaching of a second language .
Ingo Weber (NICTA) and Frank Michael Kraft (bpmnforum.net) are external collaborators.
Planning is concerned with the automated solution of action sequencing problems described in declarative languages giving the action preconditions and effects. One important application area for such technology is the creation of new processes in Business Process Management (BPM), which is essential in an ever more dynamic business environment. A major obstacle for the application of Planning in this area lies in the modeling. Obtaining a suitable model to plan with – ideally a description in PDDL, the most commonly used planning language – is often prohibitively complicated and/or costly. Our core observation in this work is that this problem can be ameliorated by leveraging synergies with model-based software development. Our application at SAP, one of the leading vendors of enterprise software, demonstrates that even one-to-one model re-use is possible.
The model in question is called Status and Action Management (SAM). It describes the behavior of Business Objects (BO), i.e., large-scale data structures, at a level of abstraction corresponding to the language of business experts. SAM covers more than 400 kinds of BOs, each of which is described in terms of a set of status variables and how their values are required for, and affected by, processing steps (actions) that are atomic from a business perspective. SAM was developed by SAP as part of a major model-based software engineering effort. We show herein that one can use this same model for planning, thus obtaining a BPM planning application that incurs no modeling overhead at all.
We compile SAM into a variant of PDDL, and adapt an off-the-shelf planner to solve this kind of problem. Thanks to the resulting technology, business experts may create new processes simply by specifying the desired behavior in terms of status variable value changes: effectively, by describing the process in their own language.
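The flavor of the compilation can be conveyed by a toy generator that turns a status-variable description of one processing step into a PDDL action string (the names and the PDDL fragment shown are illustrative; the actual compilation is far richer and targets a variant of PDDL):

```python
def sam_to_pddl(action, preconds, effects):
    """Compile one SAM-style processing step into a PDDL action string
    (toy sketch).  preconds: dict variable -> required value;
    effects: dict variable -> (old value, new value)."""
    def lit(var, val):
        return f"(status-{var} {val})"
    pre = " ".join(lit(v, x) for v, x in preconds.items())
    # each effect asserts the new value and deletes the old one
    eff = " ".join(f"{lit(v, new)} (not {lit(v, old)})"
                   for v, (old, new) in effects.items())
    return (f"(:action {action}\n"
            f"  :precondition (and {pre})\n"
            f"  :effect (and {eff}))")
```

A business expert can then describe a desired process purely in terms of status-variable value changes, and the planner sequences the generated actions.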
Hootan Nakhost and Martin Müller (University of Alberta) are external collaborators.
The need to economize limited resources, such as fuel or money, is a ubiquitous feature of planning problems. If the resources cannot be replenished, the planner must make do with the initial supply. It is then of paramount importance how constrained the problem is, i.e., whether and to which extent the initial resource supply exceeds the minimum need. While there is a large body of literature on numeric planning and planning with resources, such resource constrainedness has only been scantily investigated. We herein start to address this in more detail. We generalize the previous notion of resource constrainedness, characterized through a numeric problem feature
Malte Helmert (Basel University) is an external collaborator.
Merge-and-shrink abstraction (M&S) is an approach for constructing admissible heuristic functions for cost-optimal planning. It enables the targeted design of abstractions, by allowing to choose individual pairs of (abstract) states to aggregate into one. A key question is how to actually make these choices, so as to obtain an informed heuristic at reasonable computational cost. Recent work has addressed this via the well-known notion of bisimulation. When aggregating only bisimilar states – essentially, states whose behavior is identical under every planning operator – M&S yields a perfect heuristic. However, bisimulations are typically exponentially large. Thus we must relax the bisimulation criterion, so that it applies to more state pairs, and yields smaller abstractions. We herein devise a fine-grained method for doing so. We restrict the bisimulation criterion to consider only a subset
This work has been published in ICAPS-12 , and as Inria research report RR-7901 .
Patrik Haslum (ANU) is an external collaborator.
Heuristics based on the delete relaxation are at the forefront of modern domain-independent planning techniques. Here we introduce a principled and flexible technique for augmenting delete-relaxed tasks with a limited amount of delete information, by introducing special fluents that explicitly represent conjunctions of fluents in the original planning task. Differently from previous work in this direction, conditional effects are used to limit the growth of the task to be linear, rather than exponential, in the number of conjunctions that are introduced, making its use for obtaining heuristic functions feasible. We discuss how to obtain an informative set of conjunctions to be represented explicitly, and analyze and extend existing methods for relaxed planning in the presence of conditional effects. The resulting heuristics are empirically evaluated, and shown to be sometimes much more informative than standard delete-relaxation heuristics.
Tractability analysis in terms of the causal graphs of planning problems has emerged as an important area of research in recent years, leading to new methods for the derivation of domain-independent heuristics (Katz and Domshlak 2010). Here we continue this work, extending our knowledge of the frontier between tractable and NP-complete fragments. We close some gaps left in previous work, and introduce novel causal graph fragments that we call the hourglass and semi-fork, for which under certain additional assumptions optimal planning is in P. We show that relaxing any one of the restrictions required for this tractability leads to NP-complete problems. Our results are of both theoretical and practical interest, as these fragments can be used in existing frameworks to derive new abstraction heuristics. Before they can be used, however, a number of practical issues must be addressed. We discuss these issues and propose some solutions.
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.
Complex systems are present everywhere in our environment: the internet, electricity distribution networks, transport networks. These systems have the following characteristics: a large number of autonomous entities, dynamic structures, different time and space scales, and emergent phenomena. This work is centered on the problem of controlling such systems. The problem is defined as the need to determine, based on a partial perception of the system state, which actions to execute in order to avoid or favor certain global states of the system. This problem raises several difficult questions: how to evaluate the global impact of actions applied at the local level, how to model the dynamics of a heterogeneous system (different behaviors arising from different levels of interaction), and how to evaluate the quality of the estimates produced by the model of the system dynamics.
We propose a control architecture based on an “equation-free” approach. We use a multi-agent model to evaluate the global impact of local control actions before applying the most pertinent set of actions.
In association with our architecture, an experimental platform has been developed to test the basic ideas of the architecture within the context of a simulated “free-riding” phenomenon in peer-to-peer file exchange networks. We have demonstrated that our approach can drive the system to a state where most peers share files, even from initial conditions that would otherwise drive the system to a state where no peer shares. We have also run experiments with different configurations of the architecture to identify ways to improve its performance.
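A deliberately simple agent-based sketch conveys the free-riding scenario (the payoff values and the imitation rule are made up for illustration; this is not our experimental platform): a control action that taxes free-riders reverses the incentive and drives the population towards sharing.

```python
import random

def simulate_sharing(n=200, steps=5000, control=True, seed=1):
    """Toy peer-to-peer sharing model.  At each step a random peer
    imitates another random peer whose payoff is strictly higher.
    Free-riding pays more by default; the control action taxes
    free-riders, reversing the incentive."""
    rng = random.Random(seed)
    shares = [rng.random() < 0.1 for _ in range(n)]   # ~10% sharers at start

    def payoff(is_sharer):
        base = 0.8 if is_sharer else 1.0              # sharing has a cost
        if control and not is_sharer:
            base -= 0.5                               # control: tax free-riders
        return base

    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if payoff(shares[j]) > payoff(shares[i]):
            shares[i] = shares[j]                     # imitate better-off peer
    return sum(shares) / n                            # final sharing fraction
```

With the control action on, the sharing fraction converges near 1; with it off, sharing collapses.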
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.
Complex systems generally require using several points of view (abstraction levels) on the system at the same time in order to capture and understand all of their dynamics and complexity. Being made of different interacting parts, a model of a complex system also requires combining modeling and simulation (M&S) tools from different scientific fields.
We proposed the AA4MM meta-model to build a society of models, simulators and simulation software that addresses the core challenges of multi-modeling and simulation coupling from a homogeneous perspective.
This year we focused on systems that naturally involve entities at different levels of description: micro and macro levels with their dynamics and their articulations: emergence (upward causation, from the micro to the macro level) and immergence (downward causation, from the macro to the micro level). We relied on Bourgine's generic view of the relationship between a complex phenomenon's levels and their temporal evolution . We proposed an extension of the AA4MM concepts in order to adapt them to emergence and immergence specifications. A simple example of multi-level modeling of a flocking phenomenon has been implemented to illustrate our proposal.
Our research on emergent collective behaviours focuses on robustness analysis, that is, the behavioural resistance to perturbations in collective systems. We progressed in the knowledge of how to tackle this issue in the case of cellular automata (CA) and multi-agent systems (MAS).
The density classification problem was taken as a simple example for studying how decentralised computations can be carried out with simple cells. Although it is known that this problem cannot be solved perfectly, we derived analytic calculations to understand how stochastic cellular automata provide good solutions . A collaboration with mathematicians led us to study how to extend this result to the infinite-space case and to the 2D finite case .
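For intuition, here is a toy one-dimensional stochastic cellular automaton where each cell applies the local majority rule with probability p (this simple rule is not the one analysed in our work, and by itself it does not classify density; isolated minority cells, however, are absorbed):

```python
import random

def stochastic_majority_ca(config, steps, p=0.75, seed=0):
    """1D stochastic CA (toy): each cell adopts the majority of its
    3-cell neighborhood with probability p, and keeps its state
    otherwise.  Periodic boundary conditions; synchronous update."""
    rng = random.Random(seed)
    cells = list(config)
    n = len(cells)
    for _ in range(steps):
        nxt = cells[:]
        for i in range(n):
            maj = 1 if cells[i - 1] + cells[i] + cells[(i + 1) % n] >= 2 else 0
            if rng.random() < p:
                nxt[i] = maj
        cells = nxt
    return cells
```

Starting from a configuration of ones with a few isolated zeros, the automaton converges to the all-ones fixed point.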
Two papers resulting from the Amybia project were published: experimental results on phase transitions obtained with FPGAs , and the description of a robotics experiment that demonstrates the robustness of a bio-inspired aggregation method .
The results on asynchronous information transmission in cellular automata were consolidated . Original definitions of asynchronism were also developed in lattice-gas cellular automata , which allows us to complete our spectrum of models for which robustness can be studied analytically and with numerical simulations.
We consider decentralised control methods to operate autonomous vehicles at close spacings to form a platoon. We study models inspired by the flocking approach, where each vehicle computes its control from its local perceptions. We investigate different decentralised models in order to provide robust and scalable solutions. Open questions concern collision avoidance, stability and multi-platoon navigation.
In order to reduce the tracking error (i.e., the distance between each follower's path and the path of its predecessor), we developed both an innovative approach and a new lateral control law. This lateral control law reduces the tracking error faster than existing control laws. The control law and the experimental results obtained with it have been submitted to the 2013 IEEE International Conference on Robotics and Automation. Its integration with a previously defined secure longitudinal control law has also been studied, and will soon be submitted to the 2013 IFAC Intelligent Autonomous Vehicles Symposium.
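The flocking-inspired setting can be illustrated by a toy decentralized longitudinal controller in which each follower accelerates according to its spacing and relative-speed errors with respect to its predecessor (the gains are illustrative and this is not the control law we submitted):

```python
def platoon_step(positions, speeds, dt=0.1, gap=5.0, kp=0.5, kd=0.8):
    """One step of a toy decentralized longitudinal controller: each
    follower uses only local perceptions (distance and relative speed
    to its predecessor).  The leader keeps its current speed."""
    new_pos, new_spd = [positions[0] + speeds[0] * dt], [speeds[0]]
    for i in range(1, len(positions)):
        spacing_err = (positions[i - 1] - positions[i]) - gap
        accel = kp * spacing_err + kd * (speeds[i - 1] - speeds[i])
        v = speeds[i] + accel * dt
        new_spd.append(v)
        new_pos.append(positions[i] + v * dt)
    return new_pos, new_spd
```

With these gains the follower behaves like a damped oscillator around the desired spacing: starting 10 m behind a leader cruising at 10 m/s, it settles at the 5 m gap.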
In the context of the European project InTraDE, the problem studied in Mohamed Tlig's PhD thesis is to handle the displacements of numerous IAVs (Intelligent Autonomous Vehicles) in a seaport. Here we assume a supervisor plans the routes of the vehicles in the port. However, in such a large and complex system, various unexpected events can arise and degrade the traffic: failure of a vehicle, human mistakes while driving, obstacles on roads, local re-planning, and so on.
We started by focusing on a first important sub-problem of space resource sharing among multiple agents: how to ensure the crossing of two opposed flows of vehicles when one of the two lanes is blocked by an obstacle. To overcome this problem, blocked vehicles have to coordinate with vehicles from the other side to share the road and manage delays. The objective is to improve traffic flow and reduce the emergence of traffic jams. After formalizing this problem, we defined and studied in simulation two decision rules that produce two different strategies: the first one alternates between vehicles from each side of the road, and the second one gives priority to the vehicle with the highest delay. This work has been presented at ICTAI'12 .
We are now considering more complex situations, e.g., when multiple flows of vehicles share more than one crossroad.
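The two decision rules can be sketched on a toy model where two opposed queues cross a one-lane segment, one vehicle per time step (a simplification of the simulated setting):

```python
from collections import deque

def mean_delay(arrivals_left, arrivals_right, strategy):
    """Let two opposed queues cross a one-lane segment, one vehicle per
    step.  arrivals_*: arrival times of the vehicles on each side.
    Returns the mean delay (crossing time minus arrival time)."""
    left, right = deque(arrivals_left), deque(arrivals_right)
    t, turn, delays = 0, 0, []
    while left or right:
        ready = []
        if left and left[0] <= t:
            ready.append(("L", left))
        if right and right[0] <= t:
            ready.append(("R", right))
        if not ready:
            t += 1                       # nobody has arrived yet: wait
            continue
        if strategy == "alternate" and len(ready) == 2:
            side, q = ready[turn % 2]    # alternate between the two sides
        elif strategy == "max-delay" and len(ready) == 2:
            side, q = max(ready, key=lambda sq: t - sq[1][0])
        else:
            side, q = ready[0]           # only one side is waiting
        delays.append(t - q.popleft())   # head-of-queue vehicle crosses
        turn += 1
        t += 1
    return sum(delays) / len(delays)
```

On a symmetric arrival pattern the two strategies coincide; they differ once arrivals (or blocked-lane lengths) are asymmetric.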
In the context of the ANR/DGA Carotte Challenge, we have been studying since 2009 new strategies and algorithms for multi-robot exploration and mapping. The proposed models are experimented with real autonomous mobile robots at LORIA and every year at the Carotte challenge. Our consortium, called “Cart-o-matic”, is composed of members from the Université d'Angers (LISA) and from the Maia project-team (our industrial partner left the consortium in 2011).
The year 2012 produced several results:
In June, we won the final edition of the Carotte challenge! This result was obtained in particular thanks to the efficiency and robustness of the multi-robot strategy we proposed. Our system also produced one of the best maps of the contest.
We developed a software platform, including SLAM, planning and multi-robot exploration algorithms. This software has been protected by copyright (APP), see .
We presented the results in several publications: in the RIA journal , and at the ICIRA'2012 international conference (finalist for the best student paper).
Antoine Bautin wrote his PhD thesis, which he will defend at the beginning of 2013. This work proposes new frontier assignment algorithms for multi-robot exploration. We defined a new heuristic based on counting the robots heading towards a frontier rather than considering only the distances between robots and frontiers. For this purpose we developed algorithms based on wavefront computations (artificial potential fields). We measured on benchmarks that our algorithm outperforms the two classical approaches, closest frontier and greedy assignment.
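In the spirit of this heuristic, the following sketch assigns each robot to the frontier for which its rank (the number of robots closer to that frontier, measured with wavefront/BFS distances) is minimal, with distance as tie-break (a simplified reconstruction for illustration, not the algorithm of the thesis):

```python
from collections import deque

def bfs_distances(grid, start):
    """Wavefront (BFS) distances from `start` on a 4-connected grid;
    grid[r][c] == 1 marks an obstacle."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for r2, c2 in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])
                    and grid[r2][c2] == 0 and (r2, c2) not in dist):
                dist[(r2, c2)] = dist[(r, c)] + 1
                queue.append((r2, c2))
    return dist

def assign_frontiers(grid, robots, frontiers):
    """Each robot picks the frontier where its rank (number of other
    robots closer to that frontier) is minimal; ties broken by
    distance.  Returns {robot index: frontier index}."""
    dists = [bfs_distances(grid, f) for f in frontiers]
    assignment = {}
    for i, robot in enumerate(robots):
        best = None
        for j, d in enumerate(dists):
            rank = sum(1 for other in robots
                       if other != robot and d.get(other, 1e9) < d.get(robot, 1e9))
            key = (rank, d.get(robot, 1e9))
            if best is None or key < best[0]:
                best = (key, j)
        assignment[i] = best[1]
    return assignment
```

Unlike the closest-frontier rule, this criterion spreads the robots over distinct frontiers.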
In October 2012, Nassim Kaldé started a PhD thesis (MENRT scholarship), advised by F. Charpillet and O. Simonin. We aim at continuing the work of the Cartomatic project, under new hypotheses and constraints on communications and on the complexity of the environment to explore.
Olivier Rochel (Inria research engineer, SED Nancy) is an external collaborator.
In the context of ambient intelligence and robotic assistance, we explore the definition of an active floor composed of connected nodes forming a network of cells. We consider different modes of computation, such as spatial computing, to define robust and self-adaptive functions in the environment. We aim at dealing with walk analysis, monitoring of people's activity (actimetry) and assistance (control of assistant robots, etc.).
This work can be summarized in several points:
We asked the Hikob company to build the iTile model we defined at the end of 2011. In 2012, a network of 90 iTiles was installed on the floor of the smart apartment of the center. This apartment is an experimental platform developed in the context of the “Situated Computer Science” action of the CPER MISN (Lorraine region, Inria and government funding). See InfoSitu.
Each iTile is composed of one node connected to embedded sensors and to its neighboring tiles. A tile holds 4 weight sensors, an accelerometer and 16 LEDs. A simulator of the iTile network has been developed by Olivier Rochel. This tool eases development on the real tiles.
Several functions have been developed and are currently under experimentation: (i) detection of a person walking on the floor, (ii) tracking of feet positions, (iii) propagation and display of information in the network.
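As an example of the kind of local processing a tile can perform, here is a toy footstep detector thresholding the combined load of a tile's 4 weight sensors (the threshold and timing values are made up; this is not the deployed firmware):

```python
def detect_steps(samples, threshold=20.0, min_gap=5):
    """Detect footsteps on one tile from its 4 load sensors (sketch).
    samples: list of 4-tuples of sensor loads at successive ticks.
    Returns the tick indices where a new step starts."""
    steps, last = [], -min_gap
    pressed = False
    for t, loads in enumerate(samples):
        total = sum(loads)
        if total > threshold and not pressed and t - last >= min_gap:
            steps.append(t)              # rising edge: a new step begins
            last, pressed = t, True
        elif total <= threshold:
            pressed = False              # tile released
    return steps
```

Each tile can run such a detector locally and propagate the event to its neighbors for tracking or display.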
We have been involved since 2010 in the PAL Inria large-scale initiative (Personally Assisted Living). In this context, Mihai Andries started a PhD thesis in October 2012 (funded by Inria-PAL). This PhD aims at studying the iTile model and its potential for assistance functions. We also study models allowing robots to interact with and use the iTile network.
It is quite easy to estimate in real time the center of pressure of a person walking on the intelligent floor described above. From a sequence of centers of pressure, we designed a system categorizing the measures into two sets:
foot: the measure belongs to the pressure trace left by a foot on the floor,
transition: the center of pressure corresponds to what happens when the person swings the right or left leg from back to front.
This has been done first using a heuristic algorithm and then using an HMM. From this categorization it is then easy to estimate classical gait parameters such as step length or walking speed.
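The HMM-based categorization can be sketched with a generic Viterbi decoder over discretized center-of-pressure measurements (a two-state toy model; the probabilities used in the example are made up, not our trained parameters):

```python
import math

def viterbi(obs, trans, emit, init):
    """Most likely state sequence of a discrete HMM (Viterbi, sketch).
    obs: discretized observations; trans[i][j], emit[i][o], init[i]
    are transition, emission and initial probabilities."""
    n = len(init)
    logp = [math.log(init[i] * emit[i][obs[0]]) for i in range(n)]
    back = []
    for o in obs[1:]:
        prev, logp, ptr = logp, [], []
        for j in range(n):
            i_best = max(range(n),
                         key=lambda i: prev[i] + math.log(trans[i][j]))
            ptr.append(i_best)
            logp.append(prev[i_best] + math.log(trans[i_best][j] * emit[j][o]))
        back.append(ptr)
    state = max(range(n), key=lambda i: logp[i])
    path = [state]                       # backtrack from the best final state
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

With state 0 = foot (low center-of-pressure speed, discretized as 0) and state 1 = transition (high speed, discretized as 1), the decoder segments a measurement sequence into foot and transition phases.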
Tracking one or several persons using several Kinects requires solving the calibration problem, i.e., estimating the pose of each Kinect in the scene, knowing that the areas covered by the depth cameras do not overlap with each other (because of interference). We have addressed this issue using a SLAM approach implemented on a GPU.
A major public health problem is the loss of autonomy of elderly people, usually caused by falls. Since 2003, one of the goals of the MAIA team has been to develop a system able to detect falls and also to analyze gait deterioration in order to prevent falls. A first approach consisted in developing a markerless human motion capture system estimating the 3D positions of the body joints over time. This system used a dynamic Bayesian network and a factored particle filtering algorithm. Since 2011, we have used a new approach based on the Microsoft Kinect camera, which acquires an RGB and a depth image at the same time, to deal with the gait analysis problem. After extracting the human from the background, we compute the gait parameters from the center of mass of the person. Some parameters, such as step length, step duration and walking speed, make it possible to predict a deterioration of a person's gait and an increased risk of falls .
Another use of the extraction of a person's center of mass from the Kinect camera is to determine the person's activity. The method uses a hidden Markov model to distinguish eight activities of daily life (sitting, walking, lying (on a couch, on a bed), lying down, falling, climbing on obstacles, squatting and bending). We set up an experiment in a smart room to validate our results. Concerning the gait parameters, we compared them to the real values obtained by having young subjects walk on paper with ink-soaked pads under their shoes. The results show a difference of 3-4 cm between the step length provided by our Kinect algorithm and the real length measured on the paper. Concerning activity detection, we asked 28 subjects to perform eight situations (corresponding to the eight states of the HMM). The results showed that each situation is recognized except “bending”; falls are detected correctly and there are no false positives, except that “sitting” and “squatting” are sometimes detected instead of “bending”.
Arsène Fansi Tchango currently holds a CIFRE grant for his PhD on "Multi-Camera Tracking in Partially Observable Environments". This CIFRE grant results from the collaboration between Thales THERESIS and Inria Nancy Grand-Est (Section ).
Ye-Qiong Song (Madynes team, LORIA-Inria) is an external collaborator.
The CPER MIS is a Lorraine region and Inria-FEDER project. In this context, the Informatique Située action aims at studying and experimenting with AI models for human assistance and intelligent homes. We developed an experimental platform called the “Smart Apartment”, where we define and study the iTile network () and different multi-sensor systems for tracking functions. See http://infositu.loria.fr.
The “Approche Enactive pour la Gouvernance des Systèmes Socio-Techniques” (AEGSST) project follows from the work undertaken within the GEST project, funded by the IXXI (Institut Rhône-Alpin des Systèmes Complexes) and by the PEPS CNRS project GEST. It is labeled and funded by the Réseau National des Systèmes Complexes (RNSC).
This project aims, at a fundamental level, at proposing an enactive perspective on the governance of complex socio-technical systems, such as public transportation systems or smart grids in the energy domain. From a more applicative perspective, we seek to specify a participatory and reflexive simulation system based on a multi-agent model.
This project gathers researchers from different domains (social cognition, decision theory, simulation, serious games, etc.) in order to clarify interdisciplinary issues.
Several meetings were organized, and a workshop was held on 29 November in Paris.
Laurent Bougrain (CORTEX team, LORIA) is an external collaborator.
The COMAC
In the MAIA team, our research effort focuses more precisely on information-gathering problems involving active sensors, i.e., an intelligent system which has to select the observations to perform (which sensor, where, at which resolution). Mauricio Araya's ongoing PhD looks precisely at the topic of active sensing (Section ).
The project ended in December 2012. The main contributions of the MAIA and CORTEX teams are (1) the development of the iComac platform, which gathers the information concerning the diagnosis procedure results obtained by all the partners, and (2) the development of the Pie Diagnosis System (PDS), a demonstration application which uses a POMDP approach to compute the optimal active diagnosis strategy, and hypertrees for visualization.
IMAVO, for “Interactions entre Modules pour l'Apprentissage dans un environnement VOlatile”, is a PEPII project of the INSB institute of the CNRS. It involves Alain Marchand and Etienne Coutureau from the INCIA Lab of Bordeaux (Behavioral Neurosciences - INSB), Mehdi Khamassi and Benoît Girard from the ISIR Lab of Paris (Robotics and Neurosciences - INS2I), Alain Dutech and Nicolas Rougier from the Loria Lab of Nancy (Computational Neurosciences and Machine Learning - INS2I).
This project investigates model-based and model-free reinforcement learning approaches to rat learning in volatile environments (i.e., where context and reward can change during learning). It aims at designing new concepts for modularized decision-making systems, allowing a better understanding of the underlying neurobiological processes involved in rats and humans, as well as applications in the field of autonomous robotics.
The PAL project is a national Inria Large Scale Initiative (Action d'Envergure Nationale) involving several teams of the institute (Arobas, Coprin, E-motion, Lagadic, Demar, Maia, Prima, Pulsar and Trio). It is coordinated by David Daney (Inria Sophia-Antipolis EPI Coprin). The project focuses on the study and experimentation of models for health and well-being. Maia is particularly involved in the People Surveillance work package, studying and developing intelligent environments and distributed tracking devices for walking analysis and robotic assistance (smart tiles, 3D camera networks, assistant robots), cf. Sec. .
In 2012, we organized a PAL workshop in Nancy in November (http://
This project relies on results and questions arising from the SMAART project (2006-08). During this project, we adapted the EVAP algorithm, proposed in the PhD thesis of Arnaud Glad (Maia, 2011), to patrolling with UAVs, while providing a generic digital-pheromone-based patrolling simulator. Concerning authority sharing, we proposed an original interface to manipulate groups of UAVs.
The SUSIE project allowed us to progress on two questions: (i) studying and improving the parameters of the EVAP algorithm through the SUSIE simulator, and (ii) defining new ways to manipulate pheromone fields in order to improve authority sharing.
Percee, for “Perception Distribuée pour Environnements Intelligents”, is a project proposed by Maia and Madynes teams and funded by Inria. This ADT (Action de Developpement Technologique) supports our action in the PAL Inria National Scale Initiative (Personally Assisted Living, see ).
The project deals with the development and study of intelligent homes. For two years we have been developing an experimental platform, the smart apartment. It allows us to study models and technology for life assistance (walk analysis with iTiles and camera networks, robotic assistants, health diagnosis, domotic functions, wireless communication inside the home).
In particular we are developing a new tactile floor, the iTile network. Two engineers are funded by the ADT: Moutie Chaider (IJD) and Olivier Rochel (Inria research engineer), for two years.
This project was granted by the ANR in the Carotte robotics challenge (CArtographie par ROboT d'un TErritoire) of the Contenus et Interactions program (2009-2012). The project was funded with ca. 50,000 EUR to purchase the robotics platform. The Maia team was also funded with a PhD fellowship (Antoine Bautin, who will defend his PhD at the beginning of 2013). The Cartomatic consortium was formed by LISA/Angers University (leader) and the Maia/LORIA team (and, until 2011, Wany Robotics, Montpellier).
This project concerned the mapping of structured but unknown indoor environments, and the localization of objects, with one or several robots. We explored a decentralized multi-robot approach to tackle the challenge. We demonstrated the efficiency and robustness of the approach by winning the final edition of the contest (June 2012, Bourges). See Section and the Cartomatic project web page.
Dominique Martinez (Cortex team, Inria NGE) is an external collaborator and the coordinator of the project for Nancy members.
PHEROTAXIS is an “Investissements d’Avenir” ANR 2011-2014 (Coordination: J.-P. Rospars, UMR PISC, INRA Versailles).
The theme of the research is the localization of odour sources by insects and robots. By associating experimental data with models, the project will make it possible to define a behavioral model of olfactory processes. This work will also lead to several applications, in particular the development of highly sensitive and selective bio-inspired components.
The project is organized in five work packages and involves the PISC research unit (Versailles), Pasteur Institute (Paris) and LORIA/Inria institute (Nancy).
This project was funded by the ANR under the “Chaires d'Excellence” program, with ca. 400,000 EUR to hire four non-permanent researchers (PhD students and/or postdocs). Jörg Hoffmann is the project leader; Olivier Buffet and Bruno Scherrer collaborate. Other collaborators from LORIA are Stephan Merz, Ammar Oulamara, and Martin Quinson. The project also has several international collaborators, in particular Prof. Blai Bonet (Universidad Simon Bolivar, Caracas, Venezuela), Prof. Carmel Domshlak (Technion Haifa, Israel), Prof. Hector Geffner (Universitat Pompeu Fabra, Barcelona, Spain), Dr. Malte Helmert (University of Freiburg, Germany), and Prof. Stephen Smith (CMU, Pittsburgh, USA).
The project unites research from four different areas: classical planning, probabilistic planning, model checking, and scheduling. The common underlying theme is the development of new methods for computing lower bounds via state aggregation. Specifically, the basic technique investigated allows the explicit selection of states to aggregate, in exponentially large state spaces, via an incremental process that interleaves aggregation with state space reconstruction steps. The two main research questions are how to choose the states to aggregate, and how to effectively obtain, in practical scenarios, anytime methods providing solutions with increasingly tighter performance guarantees.
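As an illustration of why aggregation yields lower bounds (this is a minimal sketch, not the project's actual algorithm), consider a unit-cost shortest-path problem: a hypothetical mapping `alpha` that merges concrete states into abstract ones preserves every transition, so goal distances computed in the (smaller) abstract space can only shrink and are therefore admissible lower bounds on the concrete distances.

```python
from collections import deque

def bfs_dist(edges, goal):
    """Unit-cost distances to `goal`, by BFS over reversed edges."""
    rev = {}
    for u, v in edges:
        rev.setdefault(v, []).append(u)
    dist = {goal: 0}
    q = deque([goal])
    while q:
        v = q.popleft()
        for u in rev.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    return dist

def abstract_lower_bounds(edges, goal, alpha):
    """Lower bounds on goal distance via state aggregation.

    `alpha` maps each concrete state to an abstract state; every concrete
    transition (u, v) induces an abstract transition (alpha[u], alpha[v]),
    so abstract goal distances are admissible lower bounds.
    """
    abs_edges = {(alpha[u], alpha[v]) for (u, v) in edges}
    abs_dist = bfs_dist(abs_edges, alpha[goal])
    return {s: abs_dist.get(alpha[s], float("inf")) for s in alpha}
```

For example, on the chain 0 → 1 → 2 → 3 with goal 3, aggregating states 1 and 2 into a single abstract state gives the bound 2 for state 0, below its true distance of 3; the bound never exceeds the true distance, which is the anytime guarantee that successive refinements tighten.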
So far, we have hired Dr. Michael Katz as a postdoc (for 2 years) working on classical planning, and Manel Tagorti as a PhD student (for 3 years) working on probabilistic planning. The Conseil Régional de Lorraine agreed to co-finance, for 2011, 50% of the position of Michael Katz for a period of one year. Chao-Wen Perng was funded by BARQ for a 5-month internship, during which she worked on her MSc report, laying some of the groundwork for the research direction to be followed by Manel Tagorti.
The project stopped when Jörg Hoffmann left Inria.
Program: InterReg IV B
Project acronym: InTraDE
Project title: Intelligent Transportation for Dynamic Environment
Duration: 2010 - 2014
Coordinator: University of Science and Technology of Lille (Lille 1-LAGIS) (France)
Other partners: South East England Development Agency (United Kingdom), Centre Régional d’Innovation et de Transfert de Technologie – Transport et Logistique (CRITT TL) (France), AG Port of Oostende (AGHO) (Belgium), National Institute for Transport and Logistics, Dublin Institute of Technology (Ireland), Liverpool John Moores University (LOOM) (United Kingdom)
Abstract:
The InTraDE project (Intelligent Transportation for Dynamic
Environments, http://
The MAIA team partner focuses on decentralized approaches to the control of automated vehicle platooning and the adaptation of traffic. MAIA is funded with two PhD fellowships and one engineer. Both PhD theses started at the end of 2010. The PhD of Jano Yazbeck, supervised by F. Charpillet and A. Scheuer, studies “Secure and robust immaterial hanging for automated vehicles”. The PhD of Mohamed Tlig, supervised by O. Simonin and O. Buffet, addresses “Reactive coordination for traffic adaptation in large situated multi-agent systems”.
Dr. Iadine Chadès, Research Scientist at CSIRO, Ecosystem Sciences division (Brisbane, Australia), visited MAIA for 1 week in April 2012.
Prof. Sukanta Das, Professor at the Department of Information Technology, BESU (West Bengal, India), visited MAIA for three weeks in March 2012.
Amine Boumaza co-organized the 23rd JET (Journée Evolutionnaire Thématique) held at the University Pierre et Marie Curie in Paris on November 23rd.
Christine Bourjot was a co-organizer of the ARCO (Association pour la Recherche Cognitive) workshop “ROBOTS & CORPS, Immersion Écologique & Cognition incarnée”, October 2012.
Christine Bourjot was a board member of AFIA (Association Française pour l'Intelligence Artificielle).
Christine Bourjot was a co-organizer of the ARCO and AFIA workshop SCIA 2012 (Sciences Cognitives et Intelligence Artificielle), May 2012.
Olivier Buffet was a member of:
the organizing committee of the “Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes” 2012 (JFPDA'12),
the organizing committee of the “Conférence francophone sur l'apprentissage automatique” 2012 (CAp'12),
the editorial board of the “revue d'intelligence artificielle” (RIA), and
the editorial board of the “Journal of Artificial Intelligence Research” (JAIR).
Olivier Buffet was a reviewer for the journals: AMAI (Annals of Mathematics and Artificial Intelligence), JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle); and for the conferences AAAI'12 (National Conference on Artificial Intelligence), ICRA'12 (International Conference on Robotics and Automation), JFPDA'12 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
François Charpillet and Olivier Simonin co-organized the international workshop PAL (Personally Assisted Living) at LORIA, November 2012. (PAL workshop)
Vincent Chevrier was a member of the program committees of EUMAS'12 (European Workshop on Multi-Agent Systems), IAT'11 (Intelligent Agent Technology), RFIA'12 (Reconnaissance des Formes et Intelligence Artificielle), and JFSMA'12 (French conference on MAS).
Vincent Chevrier was a reviewer for the RIA journal.
Vincent Chevrier is the moderator of the mailing list of the French-speaking multi-agent systems community.
Alain Dutech was a reviewer for JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle), Journal of Adaptive Behavior; and for the conference JFPDA'12 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
Nazim Fatès was a co-organizer of the ACA (Asynchronous Cellular Automata) workshop at ACRI 2012, a member of the steering committee of Automata 2012 (Annual Workshop on Cellular Automata), and a member of the program committees of ACRI 2012, SCW'12 (Spatial Computing Workshop), ICAART'12, ICIST'12, and CAAA'12. He was an ad hoc reviewer for the following journals: Theoretical Computer Science, Entropy, and Advances in Complex Systems.
Nazim Fatès was an invited speaker at COLMOT'12, a workshop on “Collective motion in biological systems: from data to models”.
Alexis Scheuer was a reviewer for the IEEE Transactions on Robotics and the IEEE Transactions on Systems, Man, and Cybernetics, for the Elsevier journals Robotics and Autonomous Systems and Artificial Intelligence, and for the International Journal of Advanced Robotic Systems, as well as for the International Conference on Robotics and Automation (ICRA'13) and the IFAC International Symposium on Robot Control (SYROCO'12).
Bruno Scherrer was a reviewer for JAIR (Journal Of Artificial Intelligence Research), TAC (Transactions on Automatic Control), ICML'2012 (International Conference on Machine Learning), NIPS'2012 (Neural Information Processing Systems), ECAI'2012 (European Conference on Artificial Intelligence) and JFPDA'2012 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
Bruno Scherrer was an invited speaker at a workshop of the CEA-EDF-Inria Summer School on Stochastic Optimization at Cadarache (June 28).
Olivier Simonin co-organized the IROS'2012 international workshop “Assistance and Service Robotics in a Human Environment”, held in Vilamoura, Portugal, October 12th. (Web page)
Olivier Simonin co-organized the 7th National CAR'2012 Conference (Control Architectures of Robots) at LORIA, May 10-11 2012. (CAR'2012)
Olivier Simonin was chair and co-organizer of the SASO'2012 Demo&Contest Track (IEEE International Conference on Self-Adaptive and Self-Organizing Systems), Lyon, 2012. (saso2012)
Olivier Simonin was a reviewer for the journals JAAMAS (Journal of Autonomous Agents and Multi-Agent Systems), Natural Computing (Springer), the IEEE Robotics and Automation Magazine, and RIA (Revue d'Intelligence Artificielle). He also reviewed papers for ICRA'2012 (International Conference on Robotics and Automation).
Olivier Simonin was a program committee member of SASO'2012 (6th IEEE International Conference on Self-Adaptive and Self-Organizing Systems), ICINCO'2013 (10th Int. Conf. on Informatics in Control, Automation and Robotics), ICAART'2013 (5th Int. Conf. on Agents and AI.) and JFSMA'2012 (French conference on MAS).
Vincent Thomas was a board member of ARCO (Association pour la Recherche Cognitive)
Vincent Thomas was a reviewer for the “Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes” 2012 (JFPDA'2012).
Licence ISC (Informatique et Sciences Cognitives) : Christine Bourjot, Intelligence Artificielle et Résolution de Problèmes, 25HETD, niveau L3, Université de Lorraine, France
Master M2 SCMN (Sciences Cognitives et Médias Numériques) and Master SCA: Christine Bourjot, Université de Lorraine.
Master MIAGE (Méthodes Informatiques Appliquées à la Gestion): Christine Bourjot, Extraction Intelligente de Données, 18HETD, niveau M1, Université de Lorraine, France
Master SCA (Sciences Cognitives et Applications): Christine Bourjot, Système Multi-Agent, 20HETD, niveau M2, Université de Lorraine, France
Master : Vincent Chevrier, Modèles et Systèmes Multi-agents, 15CM, M2R, Université de Lorraine, France.
Master SCA (Sciences Cognitives et Applications): Alain Dutech, Apprentissage Numérique, 20HETD, niveau M1, Université de Lorraine, France.
Master : Nazim Fatès, Systèmes communicants, partie automates cellulaires, 10CM, M2R, Université de Lorraine, France.
Master : Alexis Scheuer & Olivier Simonin, Introduction à la robotique mobile, 37.5 HETD, M1 Informatique, Université de Lorraine (UHP), France.
Master SCA (Sciences Cognitives et Applications): Vincent Thomas, Agent Intelligent, 20HETD, niveau M1, Université de Lorraine, France.
Master SCA (Sciences Cognitives et Applications): Vincent Thomas, Game Design et Serious Game, 20HETD, niveau M2, Université de Lorraine, France.
Master Informatique: Vincent Thomas, Optimisation et Systemes Dynamiques Stochastiques, 22HETD, niveau M2, Université de Lorraine, France.
Supélec Metz, 5th year: Olivier Simonin, “Vie Artificielle”, 15CM.
PhD : Tomas Navarrete, “Une architecture de contrôle de systèmes complexes basée sur la simulation multi-agent”, Université de Lorraine, Oct. 24, Vincent Chevrier (advisor)
PhD in progress : Mihai Andries, “Calcul spatialisé pour l'assistance à la personne: étude d'un réseau de dalles intelligentes”, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress : Mauricio Araya, “Near-Optimal Algorithms for Sequential Information-Gathering Decision Problems”, Sept. 2009, F. Charpillet (advisor), O. Buffet, V. Thomas.
PhD in progress : Antoine Bautin, “Stratégie d'exploration multi-robot fondée sur les champs de potentiels artificiels”, Oct. 2009, F. Charpillet (advisor), O. Simonin.
PhD in progress : Olivier Bouré, “Robustesse des systèmes multi-agents réactifs: vers une informatique bio-inspirée ?”, Nazim Fatès, Vincent Chevrier (advisor).
PhD in progress : Benjamin Camus, “Un laboratoire virtuel pour la multi-modélisation”, Christine Bourjot, Vincent Chevrier (advisor).
PhD in progress : Timothé Collet, “Apprentissage actif par renforcement pour la classification”, Nov. 2012, O. Pietquin (advisor), O. Buffet.
PhD in progress : Amandine Dubois, “Assistance à la personne en perte d'autonomie : étude de l'apport d'un réseau de Kinects à la détection et la prévention des chutes”, Oct. 2011, F. Charpillet (advisor).
PhD in progress : Arsène Fansi Tchango, “Suivi multi-caméra en environnement partiellement observé”, Oct. 2011, A. Dutech (advisor), O. Buffet, V. Thomas.
PhD in progress : Nassim Kaldé, “Exploration et reconstruction d’un environnement inconnu par une flottille de robots”, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress : Manel Tagorti, “Approximating the Value Function for Heuristic Search in Factored MDPs”, Nov. 2011, J. Hoffmann (advisor), B. Scherrer, O. Buffet.
PhD in progress : Mohamed Tlig, “Reactive coordination for traffic adaptation in large situated multi-agent systems”, Dec. 2010, O. Simonin (advisor), O. Buffet.
PhD in progress : Jano Yazbeck, “Secure and robust immaterial hanging for automated vehicles”, Oct. 2010, F. Charpillet (advisor), A. Scheuer.
Vincent Chevrier was a member of the PhD committees of Jonathan Demange (Dec. 20, UTBM), as a referee, and of Shirley Hoet (Dec. 17, UPMC), as a committee member.
Vincent Chevrier was a member, as a referee, of the HDR committee of Pascal Ballet, April 6, Université de Bretagne Occidentale.
François Charpillet was a member (as a referee) of the PhD committees of:
Muhammad Ali, 11th July 2012, LAAS, University of Toulouse
Matthieu Warnier, 10th December 2012, LAAS, University of Toulouse
Senthilkumar Chandramohan, 25th September 2012, University of Avignon
Guogang Wen, 26th October 2012, LAGIS, University of Lille
Mohamed Amine Hamila, 3rd April 2012, University of Valenciennes
François Gaillard, 2nd February 2012, LIFL, University of Lille
François Charpillet was a member of the PhD committees of:
Wissam Khalil, 2nd February 2012, LAGIS, University of Lille
Sylvain Raybaud, 5th December 2012, LORIA, University of Lorraine
Rui Loureiro, 6th December 2012, LAGIS, University of Lille
François Charpillet was a member of the HDR committee of:
Amir Hajjam El Hassani, 8th December 2012, University of Besançon
Alain Dutech was a member (as a referee) of the PhD committee of Shirley Hoet, 17 Dec, UPMC.
Nazim Fatès was a member of the PhD committee of Julien Provillard, Université de Nice. The defence was held on the 6th of December in I3S laboratory, Nice.
Bruno Scherrer was a member of the PhD committee of Jean-François Hren, Université Lille 1, 21 Jun.
Olivier Simonin was a member of the PhD committee of M. Guezani, UTBM, 4 April.
Vincent Thomas was a member of the “Specialist committee” at Université Nancy 2.
Christine Bourjot was co-organizer of the “Forum des Sciences Cognitives”, Université de Lorraine, November 2012.
As part of the 2012 celebrations of Alan Turing's hundredth birthday:
Nazim Fatès recorded a video on Turing's heritage with Inria's audiovisual service. The video can be viewed at http://www.youtube.com/watch?v=6awK-FHBntc.
Nazim Fatès was interviewed by the Eureka magazine, and he co-organized the three conferences on Turing, aimed at a general audience, that LORIA held in September in Nancy (see http://turing2012.loria.fr).
Nazim Fatès participated in recording a program broadcast at the “Hôpital des enfants malades” in Brabois, whose main topic was Turing and computer science (see http://www.loria.fr/news/linformatique-aux-enfants-hospitalises-de-brabois). An article (containing some mistakes) about this event appeared in L'Est Républicain (November 17, 2012).
Nazim Fatès gave a talk entitled “Turing, l'intelligence des machines et le jeu” at the Lycée Jacques Marquette (Pont-à-Mousson) for the annual meeting of the “Associations des anciens élèves et professeurs” (October 7, 2012).
Olivier Simonin was invited to the Journée “Robotique et Numérique” organized by the GDR Robotique for a talk entitled “Robotique bio-inspirée : vers une intelligence collective ?” (see http://www.gdr-robotique.org/journee.php).
Vincent Thomas is preparing, in collaboration with the “Bibliothèque Universitaire du Campus Lettres”, an exhibition, “Jeux : les ateliers de la pensée”, whose main objective is to promote games as a subject of academic interest. The exhibition will include popular-science seminars with specialists from several scientific fields (economics, psychology, computer science) and animations.
Vincent Thomas is a participant of the Erasmus IP: "Learning Computer Programming in Virtual Environments" involving 8 foreign universities. Its objective is to promote teaching and learning of the fundamentals of computer programming through use of virtual and remote learning environments.