The objective of the MAIA team is twofold.
The first research activity is about sequential decision making. It has been influenced by Stuart Russell's view of what it means for an agent to be rational: "For each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure." This view makes Markov decision processes (MDPs), and more generally sequential decision making, a good candidate for building the behavior of an agent. It is probably why MDPs have received considerable attention in recent years from the artificial intelligence (AI) community.
The second activity is about understanding and engineering reactive multi-agent systems. It is influenced by research results from the field of behavioral biology which gives us some of the keys to understand how intelligent and adaptive behaviors appear in natural swarm systems. This encourages us to study principles of emergent behaviors in natural systems and apply them to design artificial intelligent systems. Reactive multi-agent systems are good candidates for building such autonomous and adaptive systems and our work mainly focuses on better understanding how we can soundly build such systems.
A paper on non-stationary policies for infinite-horizon Markov decision processes written by Boris Lesner and Bruno Scherrer (see Section for more details) was accepted at NIPS'2012 with a full oral presentation (1467 papers were submitted, 370 were accepted for publication, among which only 20 were selected for full oral presentation).
The Cartomatic project, which was part of the French robotics contest Défi CAROTTE organized by the General Delegation for Armaments (DGA) and the French National Research Agency (ANR), won the third and final edition of the contest. The aim of the Cart-O-matic project was to design and build a multi-robot system able to autonomously map an unknown building and to recognize various objects inside. The scientific issues of this project deal with Simultaneous Localization And Mapping (SLAM), multi-robot collaboration, and object recognition and classification. The research teams involved in this project have developed innovative approaches to each of these fields.
The paper “MOMDPs: a Solution for Modelling Adaptive Management Problems”, cosigned by Olivier Buffet has won the best paper award in this year’s Special Track on Computational Sustainability and Artificial Intelligence at the Association for the Advancement of Artificial Intelligence (AAAI-12) conference in Toronto.
Emil Keyder, Joerg Hoffmann and Patrik Haslum (ANU/NICTA) won the best paper award of the International Conference on Automated Planning and Scheduling (ICAPS-12) for their paper “Semi-Relaxed Plan Heuristics”.
Sequential decision making consists, in a nutshell, in controlling the actions of an agent facing a problem whose solution requires not one but a whole sequence of decisions. This kind of problem occurs in a multitude of forms. For example, important applications addressed in our work include: Robotics, where the agent is a physical entity moving in the real world; Medicine, where the agent can be an analytic device recommending tests and/or treatments; Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network; and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process. Our work on such problems is characterized by three main research trends:
Understanding how, and to what extent, to best model the problems.
Developing algorithms solving the problems and understanding their behavior.
Applying our results to complex applications.
Before we describe some details of our work, it is instructive to understand the basic forms of problems we are addressing. We characterize problems along the following main dimensions:
Extent of the model: full vs. partial vs. none. This dimension concerns how complete we require the model of the problem – if any – to be. If the model is incomplete, then learning techniques are needed along with the decision making process.
Form of the model: factored vs. enumerative. Enumerative models explicitly list all possible world states and the associated actions etc. Factored models can be exponentially more compact, describing states and actions in terms of their behavior with respect to a set of higher-level variables.
World dynamics: deterministic vs. stochastic. This concerns our initial knowledge of the world the agent is acting in, as well as the dynamics of actions: is the outcome known a priori or are several outcomes possible?
Observability: full vs. partial. This concerns our ability to observe what our actions actually do to the world, i.e., to observe properties of the new world state. Obviously, this is an issue only if the world dynamics are stochastic.
These dimensions are widespread in the AI literature. We remark that they are not exhaustive. In parts of our work, we also consider the difference between discrete and continuous problems, and between centralized and decentralized problems. The complexity of solving the problem – both in theory and in practice – depends crucially on where the problem resides in this categorization. In many applications, not one but several points in the categorization make sense: simplified versions of the problem can be solved much more effectively, and thus serve to generate some – if possibly sub-optimal – action strategy in a more feasible manner. Of course, the application as such may also come in different facets.
In what follows, we outline the main formal frameworks on which our work is based; while doing so, we highlight in a little more detail our core research questions. We then give a brief summary of how our work fits into the global research context.
Sequential decision making with deterministic world dynamics is most commonly known as planning, or classical planning. Obviously, in such a setting every world state needs to be considered at most once, and thus enumerative models do not make sense (the problem description would have the same size as the space of possibilities to be explored). Planning approaches support factored description languages that allow modeling complex problems in a compact way. Approaches to automatically learn such factored models do exist; however, most works – and also most of our work on this form of sequential decision making – assume that the model is provided by the user of the planning technology. Formally, a problem instance, commonly referred to as a planning task, is a four-tuple consisting of the state variables, the actions, the initial state, and the goal.
Planning is PSPACE-complete even under strong restrictions on the formulas allowed in the planning task description. Research thus revolves around the development and understanding of search methods, which explore, in a variety of different ways, the space of possible action schedules. A particularly successful approach is heuristic search, where search is guided by information obtained in an automatically designed relaxation (simplified version) of the task. We investigate the design of relaxations, the connections between such design and the search space topology, and the construction of effective planning systems that exhibit good practical performance across a wide range of different inputs. Other important research lines concern the application of ideas successful in planning to stochastic sequential decision making (see next), and the development of technology supporting the user in model design.
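As a small illustration of heuristic search guided by a relaxation, the sketch below runs greedy best-first search on a toy factored task, using the number of unsatisfied goal facts as a (very crude) relaxation heuristic. The names and the task encoding are illustrative only; they are not those of our actual planning systems.

```python
import heapq, itertools

def plan(initial, goal, actions):
    """Greedy best-first search on a factored task. States are frozensets of
    facts; each action is a (name, preconditions, add_list, delete_list) tuple."""
    def h(state):                        # relaxation: count unsatisfied goal facts
        return len(goal - state)
    counter = itertools.count()          # tie-breaker for the priority queue
    frontier = [(h(initial), next(counter), initial, [])]
    seen = {initial}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if goal <= state:                # all goal facts satisfied
            return path
        for name, pre, add, delete in actions:
            if pre <= state:             # action applicable
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(frontier, (h(nxt), next(counter), nxt, path + [name]))
    return None
```

On a toy logistics task (load a package, drive, unload), this returns the three-step plan; real planners differ mainly in far better relaxations and in factored successor generation.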
Markov Decision Processes (MDPs) are a natural framework for stochastic sequential decision making. An MDP is a four-tuple ⟨S, A, T, R⟩ of states, actions, a transition function, and a reward function. Once the optimal value function is computed, it is straightforward to derive an optimal strategy, which is deterministic and memoryless, i.e., a simple mapping from states to actions. Such a strategy is usually called a policy. An optimal policy is any policy that is greedy with respect to the optimal value function.
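For the enumerative case, the standard value-iteration scheme and the greedy policy extraction can be sketched as follows. This is a minimal textbook illustration, not one of the team's solvers.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, eps=1e-8):
    """P[a][s][s'] transition probabilities and R[s][a] expected rewards,
    both NumPy arrays. Returns the optimal value function and a greedy
    (hence optimal, deterministic, memoryless) policy."""
    nA, nS = len(P), len(R)
    V = np.zeros(nS)
    while True:
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_s' P[a][s][s'] V[s']
        Q = np.array([R[:, a] + gamma * P[a] @ V for a in range(nA)]).T
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < eps:
            return V_new, Q.argmax(axis=1)   # greedy policy: states -> actions
        V = V_new
```

On a two-state toy MDP where only moving out of state 0 is rewarded, the greedy policy keeps cycling through state 0, collecting the reward every other step.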
An important extension of MDPs, known as Partially Observable MDPs (POMDPs), accounts for the fact that the state may not be fully available to the decision maker. While the goal is the same as in an MDP (optimizing the expected sum of discounted rewards), the solution is more intricate. Any POMDP can be seen to be equivalent to an MDP defined on the space of probability distributions over states, called belief states. The Bellman machinery then applies to the belief states. The specific structure of the resulting MDP makes it possible to iteratively approximate the optimal value function – which is convex in the belief space – by piecewise-linear functions, and to deduce an optimal policy that maps belief states to actions.
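The belief state itself is maintained by simple Bayesian filtering; a minimal sketch of that update (with illustrative array conventions) is:

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes filter over states: b'(s') ∝ O[a][s'][o] * sum_s T[a][s][s'] * b(s),
    where T[a][s][s'] is the transition model and O[a][s'][o] the
    observation model (NumPy arrays)."""
    b_new = O[a][:, o] * (b @ T[a])   # predict through T, weigh by observation
    return b_new / b_new.sum()        # normalize to a probability distribution
```

For instance, a "listen" action that leaves the state unchanged and reports the true state with probability 0.85 moves a uniform belief to (0.85, 0.15) after one consistent observation.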
A further extension, known as a DEC-POMDP, considers multiple cooperative agents, each receiving its own partial observations and acting in a decentralized fashion.
The MDP model described above is enumerative, and the complexity of computing the optimal value function is polynomial in the size of that input. However, in examples of practical size, that complexity is still too high so naïve approaches do not scale. We consider the following situations: (i) when the state space is large, we study approximation techniques from both a theoretical and practical point of view; (ii) when the model is unknown, we study how to learn an optimal policy from samples (this problem is also known as Reinforcement Learning ); (iii) in factored models, where MDP models are a strict generalization of classical planning – and are thus at least PSPACE-hard to solve – we consider using search heuristics adapted from such (classical) planning.
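For case (ii), a minimal model-free sketch illustrates learning from samples alone: tabular Q-learning with ε-greedy exploration, a standard textbook algorithm rather than one of our specific contributions. The `step` function stands in for the unknown environment.

```python
import random

def q_learning(step, n_states, n_actions, episodes=2000, horizon=50,
               alpha=0.1, gamma=0.95, epsilon=0.1):
    """Model-free reinforcement learning: learn Q from sampled transitions
    only. `step(s, a)` returns (next_state, reward) drawn from the unknown MDP."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            # epsilon-greedy exploration
            a = random.randrange(n_actions) if random.random() < epsilon \
                else max(range(n_actions), key=lambda x: Q[s][x])
            s2, r = step(s, a)
            # temporal-difference update toward the Bellman target
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

On the same two-state toy MDP as above, the greedy policy read off the learned Q-table matches the one value iteration computes from the full model.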
Solving a POMDP is PSPACE-hard even given an enumerative model. In this framework, we are mainly looking for assumptions that could be exploited to reduce the complexity of the problem at hand, for instance when some actions have no effect on the state dynamics (active sensing). The decentralized version, DEC-POMDPs, induces a significant increase in complexity (NEXP-complete). We tackle the challenging – even for (very) small state spaces – exact computation of finite-horizon optimal solutions through alternative reformulations of the problem. We also aim at proposing advanced heuristics to efficiently address problems with more agents and a longer time horizon.
Within Inria, the most closely related teams are TAO and Sequel. TAO works on evolutionary computation (EC) and statistical machine learning (ML), and their combination. Sequel works on ML, with a theoretical focus combining CS and applied maths. The main difference is that TAO and Sequel consider particular algorithmic frameworks that can, amongst others, be applied to Planning and Reinforcement Learning, whereas we revolve around Planning and Reinforcement Learning as the core problems to be tackled, with whichever framework is suitable.
In France, we have recently begun collaborating with the IMS Team of Supélec Metz, notably with O. Pietquin and M. Geist, who have great expertise in approximate techniques for MDPs. We have links with the MAD team of the BIA unit of the INRA at Toulouse, led by R. Sabbadin. They also use MDP-related models and are interested in solving large-size problems, but they are more driven by applications (mostly agricultural) than we are. In Paris, the Animat Lab, which was part of the LIP6 and is now attached to the ISIR, has done some interesting work on factored Markov Decision Problems and POMDPs. Like us, their main goal is to tackle problems with large state spaces.
In Europe, the IDSIA Lab at Lugano (Switzerland) has brought some interesting ideas to the field of MDPs (meta-learning, subgoal discovery) but now seems more interested in a Universal Learner. In Osnabrück (Germany), the Neuroinformatics group works on efficient reinforcement learning with a specific interest in applications to robotics. For deterministic planning, the most closely related groups are located in Freiburg (Germany), Glasgow (UK), and Barcelona (Spain). We have active collaborations with all of these.
In the rest of the world, the most important groups regarding MDPs can be found at Brown University, Rutgers Univ. (M. Littman), Univ. of Toronto (C. Boutilier), MIT AI Lab (L. Kaelbling, D. Bertsekas, J. Tsitsiklis), Stanford Univ., CMU, Univ. of Alberta (R. Sutton), Univ. of Massachusetts at Amherst (S. Zilberstein, A. Barto), etc. A major part of their work is aimed at making Markov Decision Process based tools work on real life problems and, as such, our scientific concerns meet theirs. For deterministic planning, important related groups and collaborators are to be found at NICTA (Canberra, Australia) and at Cornell University (USA).
There exist numerous examples of natural and artificial systems where self-organization and emergence occur. Such systems are composed of a set of simple entities interacting in a shared environment and exhibit complex collective behaviors resulting from the interactions of the local (or individual) behaviors of these entities. The properties that they exhibit, for instance robustness, explain why their study has been growing, both in the academic and the industrial field. They are found in a wide panel of fields such as sociology (opinion dynamics in social networks), ecology (population dynamics), economy (financial markets, consumer behaviors), ethology (swarm intelligence, collective motion), cellular biology (cells/organ), computer networks (ad-hoc or P2P networks), etc.
More precisely, the systems we are interested in are characterized by:
locality: Elementary components have only a partial perception of the system's state; similarly, a component can only modify its surrounding environment.
individual simplicity: components have a simple behavior, in most cases it can be modeled by stimulus/response laws or by look-up tables. One way to estimate this simplicity is to count the number of stimulus/response rules for instance.
emergence: It is generally difficult to predict the global behavior of the system from the local individual behaviors. This difficulty of prediction is often observed empirically, and in some cases (e.g., cellular automata) one can show that the prediction of the global properties of a system is an undecidable problem. However, observations coming from simulations of the system may help us to find the regularities that occur in the system's behavior (even in a probabilistic sense). Our interest is to work on problems where a full mathematical analysis seems out of reach and where it is useful to observe the system with large simulations. In return, it is frequent that the properties observed empirically are then studied on an analytical basis. This approach should allow us to understand more clearly where the frontier between simulation and analysis lies.
levels of description and observation: Describing a complex system involves at least two levels: the micro level that regards how a component behaves, and the macro level associated with the collective behavior. Usually, understanding a complex system requires to link the description of a component behavior with the observation of a collective phenomenon: establishing this link may require various levels, which can be obtained only with a careful analysis of the system.
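A tiny example of such simulation-based observation uses elementary cellular automata, one of the model classes mentioned above (the code is purely illustrative, unrelated to our specific models): the micro level is a three-cell local rule, and a macro regularity (here, conservation of the density of 1s under Wolfram's rule 184, a classic traffic model) can be observed empirically by running the system.

```python
def step_ca(cells, rule):
    """One synchronous update of an elementary cellular automaton on a
    circular lattice. `rule` is the Wolfram rule number (0..255); each cell's
    next state is the rule bit indexed by its (left, center, right) neighborhood."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)]

# Rule 184 models road traffic: cars (1s) move right into empty cells,
# so the number of cars is conserved -- a macro property of a micro rule.
row = [1, 1, 0, 1, 0, 0, 0, 1]
for _ in range(4):
    row = step_ca(row, 184)
```

Observing many such runs is exactly the kind of experiment from which macro-level regularities are first conjectured and then, when possible, proved.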
We now describe the type of models that are studied in our group.
To represent these complex systems, we made the choice to use reactive multi-agent systems (RMAS). Multi-agent systems are defined by a set of reactive agents, an environment, a set of interactions between agents and a resulting organization. They are characterized by a decentralized control shared among agents: each agent has an internal state, has access to local observations and influences the system through stimulus response rules. Thus, the collective behavior results from individual simplicity and successive actions and interactions of agents through the environment.
Reactive multi-agent systems present several advantages for modeling complex systems:
agents are explicitly represented in the system and have the properties of local action, interaction and observation;
each agent can be described independently of the other agents; multi-agent systems thus allow explicit heterogeneity among agents, which is often at the root of collective emergent phenomena;
multi-agent systems can be executed through simulation and provide a good model for investigating the complex link between global and local phenomena, for which analytic studies are hard to perform.
By proposing two different levels of description, the local level of the agents and the global level of the phenomenon, and several execution models, multi-agent systems constitute an interesting tool to study the link between local and global properties.
Despite the widespread use of multi-agent systems, their framework still needs many improvements to be fully accessible to computer scientists from various backgrounds. For instance, there is no generic model to mathematically define a reactive multi-agent system and to describe its interactions. This situation is in contrast with the field of cellular automata, for instance, and underlines that a unification of multi-agent systems under a general framework is a question that still remains to be tackled. We now list the different challenges that, in part, contribute to such an objective.
Our work is structured around the following challenges that combine both theoretical and experimental approaches.
Currently, there is no agreement on a formal definition of a multi-agent system. Our research aims at translating the concepts from the field of complex systems into the multi-agent systems framework.
One objective of this research is to remove the potential ambiguities that can appear if one describes a system without explicitly formulating each aspect of the simulation framework. As a benefit, the reproduction of experiments is facilitated. Moreover, this approach is intended to give better insight into the self-organization properties of the systems.
Another important question consists in monitoring the evolution of complex systems. Our objective is to provide some quantitative characteristics of the system such as local or global stability, robustness, complexity, etc. Describing our models as dynamical systems leads us to use specific tools of this mathematical theory as well as statistical tools.
Since there is no central control of our systems, one question of interest is to know under which conditions it is possible to guarantee a given property when the system is subject to perturbations. We tackle this issue by designing exogenous control architectures where control actions are envisaged as perturbations of the system. As a consequence, we seek to develop control mechanisms that can change the global behavior of a system without modifying the agents' behavior (and thus without violating the autonomy property).
The aim is to design individual behaviors and interactions in order to produce a desired collective output. This output can be a collective pattern to reproduce in case of simulation of natural systems. In that case, from individual behaviors and interactions we study if (and how) the collective pattern is produced. We also tackle “inverse problems” (decentralized gathering problem, density classification problem, etc.) which consist in finding individual behaviors in order to solve a given problem.
Building a reactive multi-agent system consists in defining a set (generally a large number) of simple and reactive agents within a shared environment (physical or virtual) in which they move, act and interact with each other. Our interest in these systems is that, in spite of their simple definition at the agent level, they produce coherent and coordinated behavior at a global scale. The properties that they may exhibit, such as robustness and adaptivity explain why their study has been growing in the last decade (in the broader context of “complex systems”).
Our work on such problems is characterized by five research trends: (A) Defining a formal framework for describing and studying these systems, (B) Developing and understanding reactive multi-agent systems, (C) Analysing and proving properties, (D) Deploying these systems on typical distributed architectures such as swarms of robots, FPGAs, GPUs and sensor networks, (E) Transferring our results in applications.
Multi-agent systems are an active area of research in Artificial Intelligence and Complex Systems. Our research fits well into the international research context, and we have made and are making a variety of significant contributions on both theoretical and practical issues.
Concerning multi-agent simulation and formalization, we compete or collaborate in France with S. Hassas in LIESP (Lyon), CERV (Brest), IREMIA (la Réunion), Ibisc (Evry), Lirmm (Montpellier), Irit (Toulouse), A. Drogoul (IRD, Bondy) and abroad with F. Zambonelli (Univ. Modena, Italy) A. Deutsch (Dresden, Germany), D. Van Parunak (Vector research, USA), P. Valkenaers, D. Weyns (Univ. Leuven, Belgium), etc.
Regarding our work on swarm robotics, we have common objectives with the DISAL laboratory (EPFL, Switzerland).
Our group is involved in several applications of its more fundamental work on autonomous decision making and complex systems. Applications addressed include:
Robotics, where the decision maker or agent is supported by a physical entity moving in the real world;
Medicine or Personally Assisted Living, where the agent can be an analytic device recommending tests and/or treatments, or gathering different sources of information (e.g., from sensors) in order to help an end user, for example by detecting abnormal situations in which a person needs assistance (fall detection for elderly people, risk of hospitalization of a person suffering from a chronic disease);
Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network;
and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process.
Taking into consideration some strategic scientific choices made by the Nancy – Grand Est Research Center, such as developing platforms around Robotics and Smart Living Apartments, some members of the team have recentered their research toward “ambient intelligence and AI”. This choice has been further supported by the launch by Inria of a large-scale initiative project termed PAL (Personally Assisted Living), in which we are strongly involved. The regional council of Lorraine also supports this new research line through the CPER (project "situated computing" or "INFOSITU" (http://
Evaluation of the degree of frailty of the elderly: We are currently designing a system whose objective is twofold: assessing the risk of falls and evaluating the degree of frailty of the elderly. This issue is considered with more or less sophisticated sensors: one Kinect, a network of Kinects, or a heterogeneous sensor network made up of an intelligent floor and a network of Kinects. One simple idea currently being developed is to determine either the center of mass of a person using one or several Kinects, or the center of pressure and footstep localization using an intelligent floor. The idea is to infer from these simple measures the walking speed, the length of the steps, and the position of the monitored persons.
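A sketch of the kind of computation involved in deriving walking speed and step length from floor measurements (the data format and function names are hypothetical; the actual pipeline is more involved):

```python
def gait_features(footsteps):
    """Estimate the mean step length (m) and mean walking speed (m/s) from a
    chronological sequence of (timestamp_s, x_m, y_m) footstep detections,
    such as those produced by a pressure-sensitive floor."""
    lengths, speeds = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(footsteps, footsteps[1:]):
        d = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5   # distance between steps
        lengths.append(d)
        speeds.append(d / (t1 - t0))                    # step distance over step time
    return sum(lengths) / len(lengths), sum(speeds) / len(speeds)
```

For example, two 0.6 m steps taken half a second apart yield a mean step length of 0.6 m and a walking speed of 1.2 m/s.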
People activity analysis: Sensitive or intelligent floors have attracted a lot of attention during the last two decades for various applications, ranging from interaction capture in immersive virtual environments to robotics, human tracking, fall detection, and activity recognition. Different technologies have been proposed so far, based either on optical fiber sensing, pressure sensing, or electrical near-field sensing. In PAL we envision a more sophisticated approach in which both computation and sensing are distributed within the floor. This floor is made up of interconnected intelligent tiles which can:
communicate with each other,
provide some computation,
sense the environment activity through four weight sensors, an accelerometer and a magnetometer,
interact with users, robots or other sensor networks, either by wireless/wired communication or through visual communication (each tile being equipped with 16 LEDs).
Several scientific challenges are open to us:
in decentralized spatial computing
in designing real application for assisting people suffering from loss of autonomy
Concerning the second point, we envision several applications:
evaluation of the degree of frailty of the elderly, especially evaluating the risk of fall
activity recognition
monitoring assistant robots
Laurent Ciarletta (Madynes team, LORIA) is a collaborator and correspondent for this software.
AA4MM (Agents and Artefacts for Multi-modeling and Multi-simulation) is a framework for coupling existing and heterogeneous models and simulators in order to model and simulate complex systems. The first implementation of the AA4MM meta-model was proposed in Julien Siebert's PhD and written in Java. This year we added a new coupling between models to represent multi-level modeling, and rewrote a part of the core to ease the coupling of simulators.
This work was undertaken in a joint PhD thesis between the MAIA and Madynes teams. Laurent Ciarletta (Madynes team, LORIA) has been co-advisor of this PhD and correspondent for this software.
Other contributors to this software were: Tom Leclerc, François Klein, Christophe Torin, Marcel Lamenu, Guillaume Favre and Amir Toly.
MASDYNE (Multi-Agent Simulator of DYnamic Networks usErs) is a multi-agent simulator for modeling and simulating user behaviors in mobile ad hoc networks. This software is part of joint work with the MADYNES team on the modeling and simulation of ubiquitous networks.
FiatLux is a discrete dynamical systems simulator that allows the user to experiment with various models and to perturb them. Its main feature is to allow users to change the type of updating, for example from a deterministic parallel updating to an asynchronous random updating. FiatLux has a Graphical User Interface and can also be launched in a batch mode for the experiments that require statistics. In 2012, the main contributions were made by Olivier Bouré, who developed a lattice-gas cellular automata module.
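The effect of switching the updating scheme, FiatLux's central feature, can be sketched in a few lines (illustrative code, not FiatLux itself): the same local rule produces different dynamics depending on whether all cells update synchronously or one randomly chosen cell updates at a time.

```python
import random

def update(cells, rule, mode="parallel", rng=random):
    """One time step of an elementary cellular automaton under two updating
    schemes: 'parallel' updates every cell synchronously; 'asynchronous'
    updates a single, randomly chosen cell."""
    n = len(cells)
    def local(i):
        code = cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n]
        return (rule >> code) & 1
    if mode == "parallel":
        return [local(i) for i in range(n)]
    i = rng.randrange(n)            # fully asynchronous: one random cell
    out = list(cells)
    out[i] = local(i)
    return out
```

Under rule 51 (each cell simply inverts itself), the parallel mode flips the whole configuration at once, while the asynchronous mode changes exactly one cell per step; several rules that merely blink under parallel updating behave qualitatively differently under random asynchronous updating.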
Philippe Lucidarme (Université d'Angers, LISA) is a collaborator and the coordinator of the Cartomatic project.
Cart-o-matic is a software platform for (multi-)robot exploration and mapping tasks. It has been developed by Maia members and LISA (Univ. Angers) members during the robotics ANR/DGA Carotte challenge (2009-2012). This platform is composed of three software components which have been protected by software copyrights (APP): Slam-o-matic, a SLAM algorithm developed by LISA members; Plan-o-matic, a robot trajectory planning algorithm developed by Maia and LISA members; and Expl-o-matic, a distributed multi-agent strategy for multi-robot exploration developed by Maia members (based on algorithms proposed in the PhD thesis of Antoine Bautin). Cf. illustration at Cart-o-matic.
The purchase of Cart-o-matic by some robotics companies is underway.
Carlos Sarraute (Core Security Technologies) is an external collaborator.
Core Security Technologies is a U.S.-American/Argentinian company providing, amongst other things, tools for (semi-)automated security checking of computer networks against outside hacking attacks. For the automation of such checks, a module is needed that automatically generates potential attack paths. Since the application domain is highly dynamic, a module that allows declaratively specifying the environment (the network and its configuration) is highly advantageous. For that reason, Core Security Technologies has been looking into using AI Planning techniques for this purpose. After consulting with Jörg Hoffmann, they are now using a variant of Jörg Hoffmann's FF planner in their product. While that solution is satisfactory in many respects, it also has weaknesses. The main weakness is that it does not handle the incomplete knowledge in this domain: figuratively speaking, the attacker is assumed to have perfect information about the network. This results in high costs in terms of runtime and network traffic, due to extensive scanning activities prior to planning.
We are currently working with Core Security's research department to overcome this issue, by modeling and solving the attack planning problem as a POMDP instead. A workshop paper detailing the POMDP model was published at SecArt'11. While such a model yields much higher quality attacks, solving an entire network as a POMDP is not feasible. We have designed a decomposition method making use of network structure and approximations to overcome this problem, by using the POMDP model only to find good-quality attacks on single machines, and propagating the results through the network in an appropriate manner. This work has been published at ICAPS'12.
In the context of Mauricio Araya's PhD, we are working on how MDPs (or related models) can search for information. This has led to various research directions, such as extending POMDPs so as to optimize information-based rewards, or actively learning MDP models. This year, we have focused on a novel optimistic Bayesian Reinforcement Learning algorithm (described below) and on Mauricio's dissertation.
Exact or approximate solutions to Model-based Bayesian RL are impractical, so that a number of heuristic approaches have been considered, most of them relying on the principle of “optimism in the face of uncertainty”. Some of these algorithms have properties that guarantee the quality of their outcome, inspired by the PAC-learning (Probably Approximately Correct) framework. For example, some algorithms provably make in most cases the same decision as would be made if the true model were known (PAC-MDP property).
We have proposed a novel optimistic algorithm, BOLT, which is appealing in that it is (i) optimistic about the uncertainty in the model and (ii) deterministic (thus easier to study); and it is provably PAC-BAMDP, i.e., it makes in most cases the same decision as a perfect BRL algorithm would. This work has been published at ICML'12 and (in French) at JFPDA'12.
Maxim Dorin, Luca Santinelli, Liliana Cucu-Grosjean (Inria, TRIO team), and Rob Davies (U. of York) are external collaborators.
In this collaborative research work (mainly with the TRIO team), we look at the problem of scheduling periodic tasks on a single processor, in the case where each task's period is a (known) random variable. In this setting, some deadlines will necessarily be missed, so one tries to satisfy criteria that depend on the number of deadline misses.
We have proposed three criteria: (1) satisfying pre-defined deadline miss ratios, (2) minimizing the worst deadline miss ratio, and (3) minimizing the average deadline miss ratio. For each criterion we propose an algorithm that computes a provably optimal fixed priority assignment, i.e., a solution obtained by assigning priorities to tasks and executing jobs by order of priority.
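To make the criteria concrete, here is a toy discrete-time simulation of preemptive fixed-priority scheduling that measures per-task deadline-miss ratios. Periods are kept deterministic for readability (the paper's setting draws them at random), deadlines are implicit (equal to periods), and an unfinished job is simply abandoned at its deadline; all of this is an illustrative simplification, not the algorithms of the paper.

```python
def miss_ratios(tasks, priority, horizon):
    """Simulate preemptive fixed-priority scheduling of periodic tasks, each
    given as (period, wcet) in time units, with implicit deadlines. `priority`
    lists one value per task (lower value = higher priority). Returns each
    task's deadline-miss ratio over `horizon` time units."""
    n = len(tasks)
    remaining = [0] * n     # outstanding work of each task's current job
    released = [0] * n      # number of jobs released so far
    missed = [0] * n
    for t in range(horizon):
        for i, (period, wcet) in enumerate(tasks):
            if t % period == 0:          # a new job is released
                if remaining[i] > 0:     # previous job missed its deadline
                    missed[i] += 1
                remaining[i] = wcet
                released[i] += 1
        # execute one time unit of the highest-priority pending task
        for i in sorted(range(n), key=lambda i: priority[i]):
            if remaining[i] > 0:
                remaining[i] -= 1
                break
    return [m / r for m, r in zip(missed, released)]
```

Criteria (1)-(3) can then be read as constraints or objectives on the returned vector: e.g., a priority assignment minimizing the worst entry addresses criterion (2).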
This work was presented at RTNS'11, and an extended version is currently in preparation.
Iadine Chadès, Josie Carwardine, Tara G. Martin (CSIRO), Samuel Nicol (U. of Alaska Fairbanks) and Régis Sabbadin (INRA) are external collaborators.
In the field of conservation biology, adaptive management is about managing a system, e.g., performing actions so as to protect some endangered species, while learning how it behaves. This is a typical reinforcement learning task that could for example be addressed through BRL.
Here, we consider that a number of experts provide us with one possible model each, assuming that one of them is the true model. This allows making decisions by solving a hidden model MDP (hmMDP). An hmMDP is essentially a simplified mixed observability MDP (MOMDP), where the hidden part of the state corresponds to the model (in cases where all other variables are fully observable).
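In this setting, the belief over which expert model is the true one is updated by simple Bayesian filtering over observed transitions; a minimal illustrative sketch (data layout hypothetical):

```python
def model_posterior(prior, models, s, a, s2):
    """Bayesian update of the belief over candidate expert models after
    observing the transition (s, a, s2). Each model m is a nested list with
    m[s][a][s2] = P_m(s2 | s, a). In an hmMDP this belief is the only hidden
    part of the (mixed-observability) state."""
    post = [p * m[s][a][s2] for p, m in zip(prior, models)]
    z = sum(post)                      # normalizing constant
    return [p / z for p in post]
```

For instance, if one expert predicts a transition with probability 0.9 and the other with 0.1, observing it once moves a uniform prior to (0.9, 0.1).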
From a theoretical point of view, we have proved that deciding whether a finite-horizon hmMDP problem admits a solution policy of value greater than a pre-defined threshold is a PSPACE-complete problem. We have also conducted preliminary studies of this approach, using the scenario of the protection of the Gouldian finch, and focusing on the particular characteristics that could be exploited to more efficiently solve this problem. These results have been presented in AAAI'12 .
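The core mechanism can be sketched as a belief over the candidate models provided by the experts, updated by Bayes' rule from observed transitions (the data layout and function names below are illustrative, not those of our implementation):

```python
import numpy as np

def update_belief(belief, models, s, a, s2):
    """Bayesian update of the belief over candidate models after
    observing transition (s, a, s2).  models[k][a][s, s2] is the
    transition probability under candidate model k (hypothetical
    layout: one (S, S) matrix per action)."""
    likelihood = np.array([m[a][s, s2] for m in models])
    post = belief * likelihood
    z = post.sum()
    return post / z if z > 0 else belief

def mean_model(belief, models, a):
    """Belief-weighted average transition matrix for action a, which a
    QMDP-like simplification could plan with."""
    return sum(b * m[a] for b, m in zip(belief, models))
```

If one of the candidate models is the true one, repeated updates concentrate the belief on it, which is exactly the hidden part of the state an hmMDP solver reasons about.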
Fabien Flacher (Thales THERESIS) is an external collaborator.
In collaboration with the Thales ThereSIS SE&SIM team (Synthetic Environment & Simulation), we focus on the problem of following the trajectories of several persons with the help of several controllable cameras. This problem is difficult because the set of cameras cannot cover the whole environment simultaneously, because some persons can be hidden by obstacles or by other persons, and because the behavior of each person is governed by internal variables that can only be inferred (such as their motivation or their hunger).
The approach we are working on is based on (1) POMDP formalisms to represent the state of the system (persons and their internal states) and the possible actions for the cameras, (2) a simulator provided and developed by Thales ThereSIS, and (3) particle filtering approaches based on this simulator.
From a theoretical point of view, we are currently investigating how to use a deterministic simulator to generate new particles while keeping a good approximation of the posterior distribution.
External collaborators: Christopher Amato, Arnaud Doniec.
Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation independent Dec-MDP is a general subclass that has been shown to have complexity in NP, but optimal algorithms for this subclass are still inefficient in practice. We first provide an updated proof that an optimal policy does not depend on the histories of the agents, but only on their local observations. We then present a new algorithm based on heuristic search that is able to expand search nodes by using constraint optimization. We show experimental results comparing our approach with the state-of-the-art Dec-MDP and Dec-POMDP solvers. These results show a reduction in computation time and an increase in scalability by multiple orders of magnitude in a number of benchmarks.
External collaborators: Victor Gabillon, Mohammad Ghavamzadeh and Matthieu Geist.
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods. Despite its generality, MPI has not been thoroughly studied, especially its approximation form which is used when the state and/or action spaces are large or infinite. We have proposed three implementations of approximate MPI (AMPI) that are extensions of well-known approximate DP algorithms: fitted-value iteration, fitted-Q iteration, and classification-based policy iteration. We have provided an error propagation analysis that unifies those for approximate policy and value iteration. For the classification-based implementation, we have developed a finite-sample analysis showing that MPI's main parameter controls the balance between the estimation error of the classifier and the overall value function approximation error.
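A minimal sketch of exact MPI on a finite MDP, where the parameter m interpolates between value iteration (m = 1) and policy iteration (large m); this is the exact algorithm, not the approximate variants studied in our work:

```python
import numpy as np

def mpi(P, R, gamma=0.9, m=5, iters=200):
    """Modified policy iteration (sketch).
    P: (A, S, S) transition tensor, R: (A, S) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    pi = np.zeros(S, dtype=int)
    for _ in range(iters):
        Q = R + gamma * P @ V           # (A, S) action-values
        pi = Q.argmax(axis=0)           # greedy policy improvement
        for _ in range(m):              # m steps of partial evaluation
            Q = R + gamma * P @ V
            V = Q[pi, np.arange(S)]
    return V, pi
```

On a two-state MDP where the agent must move to a rewarding state and stay there, MPI recovers the optimal values V = (9, 10) for gamma = 0.9.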
External collaborators: Matthieu Geist, Mohammad Ghavamzadeh and Alessandro Lazaric.
LSTD is one of the most popular reinforcement learning algorithms for value function approximation. Whenever the number of features is larger than the number of samples, LSTD must be paired with some form of regularization. In particular,
In infinite-horizon stationary
This work was presented and selected for a full oral presentation at NIPS'2012 .
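As an illustration, ℓ2-regularized LSTD amounts to a few lines of linear algebra (a sketch under the standard sample-based formulation; the notation is ours):

```python
import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.95, l2=1e-3):
    """l2-regularized LSTD (sketch).  phi[i] and phi_next[i] are the
    feature vectors of the i-th sampled state and of its successor;
    returns the weight vector of the linear value estimate."""
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    d = phi.shape[1]
    return np.linalg.solve(A + l2 * np.eye(d), b)
```

On a trivial chain with constant reward 1 and a constant feature, the estimate approaches the true value 1/(1-gamma) = 20.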
External collaborators: Matthieu Geist (IMS Supelec), Olivier Pietquin (IMS Supelec)
Reinforcement learning in rich, complex and large sensorimotor spaces is a difficult problem, mainly because such a huge space cannot be explored exhaustively. The idea is thus to adopt a developmental approach where the perception and motor skills of the robot grow in richness and complexity during learning: the size of the state and action spaces grows progressively as the performance of the learning agent increases. The learning framework relies on function approximators with specific properties (continuous input space, life-long adaptation, knowledge transfer). Architectures based on “reservoir learning” and “dynamical self-organizing maps” artificial neural networks have been investigated , .
Reinforcement learning (RL) is now part of the state of the art in the domain of spoken dialog system (SDS) optimization. The best performing RL methods, such as those based on Gaussian processes, require testing small changes in the policy to assess whether they are improvements or degradations. This process is called on-policy learning. Nevertheless, it can result in system behaviors that are not acceptable to users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is, learn off-policy. Such methods usually fail to scale up and are thus not suited for real-world systems. In this work, a sample-efficient, online and off-policy RL algorithm is proposed to learn an optimal policy . This algorithm is combined with a compact non-linear value function representation (namely a multilayer perceptron), enabling it to handle large-scale systems. One of the application domains is the teaching of a second language .
Ingo Weber (NICTA) and Frank Michael Kraft (bpmnforum.net) are external collaborators.
Planning is concerned with the automated solution of action sequencing problems described in declarative languages giving the action preconditions and effects. One important application area for such technology is the creation of new processes in Business Process Management (BPM), which is essential in an ever more dynamic business environment. A major obstacle for the application of Planning in this area lies in the modeling. Obtaining a suitable model to plan with – ideally a description in PDDL, the most commonly used planning language – is often prohibitively complicated and/or costly. Our core observation in this work is that this problem can be ameliorated by leveraging synergies with model-based software development. Our application at SAP, one of the leading vendors of enterprise software, demonstrates that even one-to-one model re-use is possible.
The model in question is called Status and Action Management (SAM). It describes the behavior of Business Objects (BO), i.e., large-scale data structures, at a level of abstraction corresponding to the language of business experts. SAM covers more than 400 kinds of BOs, each of which is described in terms of a set of status variables and how their values are required for, and affected by, processing steps (actions) that are atomic from a business perspective. SAM was developed by SAP as part of a major model-based software engineering effort. We show herein that one can use this same model for planning, thus obtaining a BPM planning application that incurs no modeling overhead at all.
We compile SAM into a variant of PDDL, and adapt an off-the-shelf planner to solve this kind of problem. Thanks to the resulting technology, business experts may create new processes simply by specifying the desired behavior in terms of status variable value changes: effectively, by describing the process in their own language.
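The flavor of the compilation can be conveyed by a toy generator that turns a status-variable description of one processing step into a PDDL action string (the names and the PDDL fragment shown are illustrative; the actual compilation is far richer and targets a variant of PDDL):

```python
def sam_to_pddl(action, preconds, effects):
    """Compile one SAM-style processing step into a PDDL action string
    (toy sketch).  preconds: dict variable -> required value;
    effects: dict variable -> (old value, new value)."""
    def lit(var, val):
        return f"(status-{var} {val})"
    pre = " ".join(lit(v, x) for v, x in preconds.items())
    # each effect asserts the new value and deletes the old one
    eff = " ".join(f"{lit(v, new)} (not {lit(v, old)})"
                   for v, (old, new) in effects.items())
    return (f"(:action {action}\n"
            f"  :precondition (and {pre})\n"
            f"  :effect (and {eff}))")
```

A business expert can then describe a desired process purely in terms of status-variable value changes, and the planner sequences the generated actions.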
Hootan Nakhost and Martin Müller (University of Alberta) are external collaborators.
The need to economize limited resources, such as fuel or money, is a ubiquitous feature of planning problems. If the resources cannot be replenished, the planner must make do with the initial supply. It is then of paramount importance how constrained the problem is, i.e., whether and to which extent the initial resource supply exceeds the minimum need. While there is a large body of literature on numeric planning and planning with resources, such resource constrainedness has only been scantily investigated. We herein start to address this in more detail. We generalize the previous notion of resource constrainedness, characterized through a numeric problem feature
Malte Helmert (Basel University) is an external collaborator.
Merge-and-shrink abstraction (M&S) is an approach for constructing admissible heuristic functions for cost-optimal planning. It enables the targeted design of abstractions, by allowing to choose individual pairs of (abstract) states to aggregate into one. A key question is how to actually make these choices, so as to obtain an informed heuristic at reasonable computational cost. Recent work has addressed this via the well-known notion of bisimulation. When aggregating only bisimilar states – essentially, states whose behavior is identical under every planning operator – M&S yields a perfect heuristic. However, bisimulations are typically exponentially large. Thus we must relax the bisimulation criterion, so that it applies to more state pairs, and yields smaller abstractions. We herein devise a fine-grained method for doing so. We restrict the bisimulation criterion to consider only a subset
This work has been published in ICAPS-12 , and as Inria research report RR-7901 .
Patrik Haslum (ANU) is an external collaborator.
Heuristics based on the delete relaxation are at the forefront of modern domain-independent planning techniques. Here we introduce a principled and flexible technique for augmenting delete-relaxed tasks with a limited amount of delete information, by introducing special fluents that explicitly represent conjunctions of fluents in the original planning task. Differently from previous work in this direction, conditional effects are used to limit the growth of the task to be linear, rather than exponential, in the number of conjunctions that are introduced, making its use for obtaining heuristic functions feasible. We discuss how to obtain an informative set of conjunctions to be represented explicitly, and analyze and extend existing methods for relaxed planning in the presence of conditional effects. The resulting heuristics are empirically evaluated, and shown to be sometimes much more informative than standard delete-relaxation heuristics.
Tractability analysis in terms of the causal graphs of planning problems has emerged as an important area of research in recent years, leading to new methods for the derivation of domain-independent heuristics (Katz and Domshlak 2010). Here we continue this work, extending our knowledge of the frontier between tractable and NP-complete fragments. We close some gaps left in previous work, and introduce novel causal graph fragments that we call the hourglass and semi-fork, for which under certain additional assumptions optimal planning is in P. We show that relaxing any one of the restrictions required for this tractability leads to NP-complete problems. Our results are of both theoretical and practical interest, as these fragments can be used in existing frameworks to derive new abstraction heuristics. Before they can be used, however, a number of practical issues must be addressed. We discuss these issues and propose some solutions.
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.
Complex systems are present everywhere in our environment: the internet, electricity distribution networks, transport networks. These systems have the following characteristics: a large number of autonomous entities, dynamic structures, different time and space scales, and emergent phenomena. This work is centered on the problem of controlling such systems. The problem is defined as the need to determine, based on a partial perception of the system state, which actions to execute in order to avoid or favor certain global states of the system. This problem raises several difficult questions: how to evaluate the global impact of actions applied at the local level, how to model the dynamics of a heterogeneous system (different behaviors arising from different levels of interaction), and how to evaluate the quality of the estimates produced by the model of the system dynamics.
We propose a control architecture based on an “equation-free” approach. We use a multi-agent model to evaluate the global impact of local control actions before applying the most pertinent set of actions.
In association with our architecture, an experimental platform has been developed to test the basic ideas of the architecture within the context of a simulated “free-riding” phenomenon in peer-to-peer file exchange networks. We have demonstrated that our approach can drive the system to a state where most peers share files, even from initial conditions that would otherwise drive the system to a state where no peer shares. We have also run experiments with different configurations of the architecture to identify ways to improve its performance.
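A deliberately simple agent-based sketch conveys the free-riding scenario (the payoff values and the imitation rule are made up for illustration; this is not our experimental platform): a control action that taxes free-riders reverses the incentive and drives the population towards sharing.

```python
import random

def simulate_sharing(n=200, steps=5000, control=True, seed=1):
    """Toy peer-to-peer sharing model.  At each step a random peer
    imitates another random peer whose payoff is strictly higher.
    Free-riding pays more by default; the control action taxes
    free-riders, reversing the incentive."""
    rng = random.Random(seed)
    shares = [rng.random() < 0.1 for _ in range(n)]   # ~10% sharers at start

    def payoff(is_sharer):
        base = 0.8 if is_sharer else 1.0              # sharing has a cost
        if control and not is_sharer:
            base -= 0.5                               # control: tax free-riders
        return base

    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if payoff(shares[j]) > payoff(shares[i]):
            shares[i] = shares[j]                     # imitate better-off peer
    return sum(shares) / n                            # final sharing fraction
```

With the control action on, the sharing fraction converges near 1; with it off, sharing collapses.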
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.
Complex systems generally require using several points of view (abstraction levels) on the system at the same time in order to capture and understand all of their dynamics and complexity. Being made of different interacting parts, a model of a complex system also requires combining modeling and simulation (M&S) tools from different scientific fields.
We proposed the AA4MM meta-model to build a society of models, simulators and simulation software that addresses the core challenges of multi-modeling and simulation coupling from a homogeneous perspective.
This year we focused on systems that naturally involve entities at different levels of description: micro and macro levels with their dynamics and their articulations: emergence (upward causation, from the micro to the macro level) and immergence (downward causation, from the macro to the micro level). We relied on Bourgine's generic view of the relationship between a complex phenomenon's levels and their temporal evolution . We proposed an extension of the AA4MM concepts in order to adapt them to emergence and immergence specifications. A simple example of multi-level modeling of a flocking phenomenon has been implemented to illustrate our proposal.
Our research on emergent collective behaviours focuses on robustness analysis, that is, the behavioural resistance to perturbations in collective systems. We progressed in the knowledge of how to tackle this issue in the case of cellular automata (CA) and multi-agent systems (MAS).
The density classification problem was taken as a simple example for studying how decentralised computations can be carried out with simple cells. Although it is known that this problem cannot be solved perfectly, we derived analytic calculations to understand how stochastic cellular automata provide good solutions . A collaboration with mathematicians led us to study how to extend this result to the infinite-space case and to the 2D finite case .
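For intuition, here is a toy one-dimensional stochastic cellular automaton where each cell applies the local majority rule with probability p (this simple rule is not the one analysed in our work, and by itself it does not classify density; isolated minority cells, however, are absorbed):

```python
import random

def stochastic_majority_ca(config, steps, p=0.75, seed=0):
    """1D stochastic CA (toy): each cell adopts the majority of its
    3-cell neighborhood with probability p, and keeps its state
    otherwise.  Periodic boundary conditions; synchronous update."""
    rng = random.Random(seed)
    cells = list(config)
    n = len(cells)
    for _ in range(steps):
        nxt = cells[:]
        for i in range(n):
            maj = 1 if cells[i - 1] + cells[i] + cells[(i + 1) % n] >= 2 else 0
            if rng.random() < p:
                nxt[i] = maj
        cells = nxt
    return cells
```

Starting from a configuration of ones with a few isolated zeros, the automaton converges to the all-ones fixed point.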
Two papers resulting from the Amybia project were published: experimental results on phase transitions obtained with FPGAs , and the description of a robotics experiment that demonstrates the robustness of a bio-inspired aggregation method .
The results on asynchronous information transmission in cellular automata were consolidated . Original definitions of asynchronism were also developed in lattice-gas cellular automata , which allows us to complete our spectrum of models for which robustness can be studied analytically and with numerical simulations.
We consider decentralised control methods to operate autonomous vehicles at close spacings to form a platoon. We study models inspired by the flocking approach, where each vehicle computes its control from its local perceptions. We investigate different decentralised models in order to provide robust and scalable solutions. Open questions concern collision avoidance, stability and multi-platoon navigation.
In order to reduce the tracking error (i.e., the distance between each follower's path and the path of its predecessor), we developed both an innovative approach and a new lateral control law. This lateral control law reduces the tracking error faster than existing control laws. The control law and the experimental results obtained with it have been submitted to the 2013 IEEE International Conference on Robotics and Automation. Its integration with a previously defined secure longitudinal control law has also been studied, and will soon be submitted to the 2013 IFAC Intelligent Autonomous Vehicles Symposium.
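The flocking-inspired setting can be illustrated by a toy decentralized longitudinal controller in which each follower accelerates according to its spacing and relative-speed errors with respect to its predecessor (the gains are illustrative and this is not the control law we submitted):

```python
def platoon_step(positions, speeds, dt=0.1, gap=5.0, kp=0.5, kd=0.8):
    """One step of a toy decentralized longitudinal controller: each
    follower uses only local perceptions (distance and relative speed
    to its predecessor).  The leader keeps its current speed."""
    new_pos, new_spd = [positions[0] + speeds[0] * dt], [speeds[0]]
    for i in range(1, len(positions)):
        spacing_err = (positions[i - 1] - positions[i]) - gap
        accel = kp * spacing_err + kd * (speeds[i - 1] - speeds[i])
        v = speeds[i] + accel * dt
        new_spd.append(v)
        new_pos.append(positions[i] + v * dt)
    return new_pos, new_spd
```

With these gains the follower behaves like a damped oscillator around the desired spacing: starting 10 m behind a leader cruising at 10 m/s, it settles at the 5 m gap.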
In the context of the European project InTraDE, the problem studied in Mohamed Tlig's PhD thesis is to handle the displacements of numerous IAVs (Intelligent Autonomous Vehicles) in a seaport. Here we assume a supervisor plans the routes of the vehicles in the port. However, in such a large and complex system, various unexpected events can arise and degrade the traffic: failure of a vehicle, human mistakes while driving, obstacles on roads, local re-planning, and so on.
We started by focusing on a first important sub-problem of space resource sharing among multiple agents: how to ensure the crossing of two opposed flows of vehicles when one of the two lanes is blocked by an obstacle. To overcome this problem, blocked vehicles have to coordinate with vehicles from the other side to share the road and manage delays. The objective is to improve traffic flow and reduce the emergence of traffic jams. After formalizing this problem, we defined and studied in simulation two decision rules that produce two different strategies: the first one alternates between vehicles from each side of the road, and the second one gives priority to the vehicle with the highest delay. This work has been presented at ICTAI'12 .
We are now considering more complex situations, e.g., when multiple flows of vehicles share more than one crossroad.
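The two decision rules can be sketched on a toy model where two opposed queues cross a one-lane segment, one vehicle per time step (a simplification of the simulated setting):

```python
from collections import deque

def mean_delay(arrivals_left, arrivals_right, strategy):
    """Let two opposed queues cross a one-lane segment, one vehicle per
    step.  arrivals_*: arrival times of the vehicles on each side.
    Returns the mean delay (crossing time minus arrival time)."""
    left, right = deque(arrivals_left), deque(arrivals_right)
    t, turn, delays = 0, 0, []
    while left or right:
        ready = []
        if left and left[0] <= t:
            ready.append(("L", left))
        if right and right[0] <= t:
            ready.append(("R", right))
        if not ready:
            t += 1                       # nobody has arrived yet: wait
            continue
        if strategy == "alternate" and len(ready) == 2:
            side, q = ready[turn % 2]    # alternate between the two sides
        elif strategy == "max-delay" and len(ready) == 2:
            side, q = max(ready, key=lambda sq: t - sq[1][0])
        else:
            side, q = ready[0]           # only one side is waiting
        delays.append(t - q.popleft())   # head-of-queue vehicle crosses
        turn += 1
        t += 1
    return sum(delays) / len(delays)
```

On a symmetric arrival pattern the two strategies coincide; they differ once arrivals (or blocked-lane lengths) are asymmetric.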
In the context of the ANR/DGA Carotte Challenge, we have been studying since 2009 new strategies and algorithms for multi-robot exploration and mapping. The proposed models are experimented with real autonomous mobile robots at LORIA and every year at the Carotte challenge. Our consortium, called “Cart-o-matic”, is composed of members from the Université d'Angers (LISA) and from the Maia project-team (our industrial partner left the consortium in 2011).
The year 2012 produced several results:
In June, we won the final edition of the Carotte challenge! This result was obtained in particular thanks to the efficiency and robustness of the multi-robot strategy we proposed. Our system also produced one of the best maps of the contest.
We developed a software platform, including SLAM, planning and multi-robot exploration algorithms. This software has been protected by copyright (APP), see .
We presented the results in several publications: in the RIA journal , and at the ICIRA'2012 international conference (finalist for the best student paper).
Antoine Bautin wrote his PhD thesis, which he will defend at the beginning of 2013. This work proposes new frontier assignment algorithms for multi-robot exploration. We defined a new heuristic based on counting the robots heading towards a frontier rather than considering only the distances between robots and frontiers. For this purpose we developed algorithms based on wavefront computations (artificial potential fields). We measured on benchmarks that our algorithm outperforms the two classical approaches, closest frontier and greedy assignment.
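In the spirit of this heuristic, the following sketch assigns each robot to the frontier for which its rank (the number of robots closer to that frontier, measured with wavefront/BFS distances) is minimal, with distance as tie-break (a simplified reconstruction for illustration, not the algorithm of the thesis):

```python
from collections import deque

def bfs_distances(grid, start):
    """Wavefront (BFS) distances from `start` on a 4-connected grid;
    grid[r][c] == 1 marks an obstacle."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for r2, c2 in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])
                    and grid[r2][c2] == 0 and (r2, c2) not in dist):
                dist[(r2, c2)] = dist[(r, c)] + 1
                queue.append((r2, c2))
    return dist

def assign_frontiers(grid, robots, frontiers):
    """Each robot picks the frontier where its rank (number of other
    robots closer to that frontier) is minimal; ties broken by
    distance.  Returns {robot index: frontier index}."""
    dists = [bfs_distances(grid, f) for f in frontiers]
    assignment = {}
    for i, robot in enumerate(robots):
        best = None
        for j, d in enumerate(dists):
            rank = sum(1 for other in robots
                       if other != robot and d.get(other, 1e9) < d.get(robot, 1e9))
            key = (rank, d.get(robot, 1e9))
            if best is None or key < best[0]:
                best = (key, j)
        assignment[i] = best[1]
    return assignment
```

Unlike the closest-frontier rule, this criterion spreads the robots over distinct frontiers.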
In October 2012, Nassim Kaldé started a PhD thesis (MENRT scholarship), advised by F. Charpillet and O. Simonin. We aim at continuing the work of the Cartomatic project, under new hypotheses and constraints on communications and on the complexity of the environment to explore.
Olivier Rochel (Inria research engineer, SED Nancy) is an external collaborator.
In the context of ambient intelligence and robotic assistance, we explore the definition of an active floor composed of connected nodes forming a network of cells. We consider different modes of computation, such as spatial computing, to define robust and self-adaptive functions in the environment. We aim at dealing with walk analysis, monitoring of people's activity (actimetry) and assistance (control of assistant robots, etc.).
This work can be summarized in several points:
We asked the Hikob company to build the iTile model we defined at the end of 2011. In 2012, a network of 90 iTiles was installed on the floor of the smart apartment of the center. This apartment is an experimental platform developed in the context of the “Situated Computer Science” action of the CPER MISN (Lorraine region, Inria and government funding). See InfoSitu.
Each iTile is composed of one node connected to embedded sensors and to its neighboring tiles. A tile holds 4 weight sensors, an accelerometer and 16 LEDs. A simulator of the iTile network has been developed by Olivier Rochel. This tool eases development on the real tiles.
Several functions have been developed and are currently under experimentation: (i) detection of a person walking on the floor, (ii) tracking of feet positions, (iii) propagation and display of information in the network.
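As an example of the kind of local processing a tile can perform, here is a toy footstep detector thresholding the combined load of a tile's 4 weight sensors (the threshold and timing values are made up; this is not the deployed firmware):

```python
def detect_steps(samples, threshold=20.0, min_gap=5):
    """Detect footsteps on one tile from its 4 load sensors (sketch).
    samples: list of 4-tuples of sensor loads at successive ticks.
    Returns the tick indices where a new step starts."""
    steps, last = [], -min_gap
    pressed = False
    for t, loads in enumerate(samples):
        total = sum(loads)
        if total > threshold and not pressed and t - last >= min_gap:
            steps.append(t)              # rising edge: a new step begins
            last, pressed = t, True
        elif total <= threshold:
            pressed = False              # tile released
    return steps
```

Each tile can run such a detector locally and propagate the event to its neighbors for tracking or display.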
We have been involved since 2010 in the PAL Inria large-scale initiative (Personally Assisted Living). In this context, Mihai Andries started a PhD thesis in October 2012 (funded by Inria-PAL). This PhD aims at studying the iTile model and its potential for assistance functions. We also study models allowing robots to interact with and use the iTile network.
It is quite easy to estimate in real time the center of pressure of a person walking on the intelligent floor described above. From a sequence of centers of pressure, we designed a system categorizing the measures into two sets:
foot: the measure belongs to the pressure trace left by a foot on the floor,
transition: the center of pressure corresponds to what happens when the person swings the right or left leg from back to front.
This has been done first using a heuristic algorithm and then using an HMM. From this categorization it is then easy to estimate classical gait parameters such as step length or walking speed.
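The HMM-based categorization can be sketched with a generic Viterbi decoder over discretized center-of-pressure measurements (a two-state toy model; the probabilities used in the example are made up, not our trained parameters):

```python
import math

def viterbi(obs, trans, emit, init):
    """Most likely state sequence of a discrete HMM (Viterbi, sketch).
    obs: discretized observations; trans[i][j], emit[i][o], init[i]
    are transition, emission and initial probabilities."""
    n = len(init)
    logp = [math.log(init[i] * emit[i][obs[0]]) for i in range(n)]
    back = []
    for o in obs[1:]:
        prev, logp, ptr = logp, [], []
        for j in range(n):
            i_best = max(range(n),
                         key=lambda i: prev[i] + math.log(trans[i][j]))
            ptr.append(i_best)
            logp.append(prev[i_best] + math.log(trans[i_best][j] * emit[j][o]))
        back.append(ptr)
    state = max(range(n), key=lambda i: logp[i])
    path = [state]                       # backtrack from the best final state
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

With state 0 = foot (low center-of-pressure speed, discretized as 0) and state 1 = transition (high speed, discretized as 1), the decoder segments a measurement sequence into foot and transition phases.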
Tracking one or several persons using several Kinects requires solving the calibration problem, i.e., estimating the pose of each Kinect in the scene, knowing that the areas covered by the depth cameras do not overlap with each other (because of interference). We have addressed this issue using a SLAM approach implemented on a GPU.
A major public health problem is the loss of autonomy of elderly people, usually caused by falls. Since 2003, one of the goals of the MAIA team has been to develop a system able to detect falls and also to analyze gait deterioration in order to prevent falls. A first approach consisted in developing a markerless human motion capture system estimating the 3D positions of the body joints over time. This system used a dynamic Bayesian network and a factored particle filtering algorithm. Since 2011, we have used a new approach based on the Microsoft Kinect camera, which acquires an RGB and a depth image at the same time, to deal with the gait analysis problem. After extracting the human from the background, we compute the gait parameters from the center of mass of the person. Some parameters, such as step length, step duration and walking speed, make it possible to predict a deterioration of a person's gait and an increased risk of falls .
Another use of the extraction of a person's center of mass from the Kinect camera is to determine the person's activity. The method uses a hidden Markov model to distinguish eight activities of daily life (sitting, walking, lying (on a couch, on a bed), lying down, falling, climbing on obstacles, squatting and bending). We set up an experiment in a smart room to validate our results. Concerning the gait parameters, we compared them to the real values obtained by having young subjects walk on paper with ink-soaked pads under their shoes. The results show a difference of 3-4 cm between the step length provided by our Kinect algorithm and the real length measured on the paper. Concerning activity detection, we asked 28 subjects to perform eight situations (corresponding to the eight states of the HMM). The results showed that each situation is recognized except “bending”; falls are detected correctly and there are no false positives, except that “sitting” and “squatting” are sometimes detected instead of “bending”.
Arsène Fansi Tchango currently holds a CIFRE grant for his PhD on "Multi-Camera Tracking in Partially Observable Environments". This CIFRE grant results from the collaboration between Thales THERESIS and Inria Nancy Grand-Est (Section ).
Ye-Qiong Song (Madynes team, LORIA-Inria) is an external collaborator.
The CPER MIS is a Lorraine region and Inria-FEDER project. In this context, the Informatique Située action aims at studying and experimenting with AI models for human assistance and intelligent homes. We developed an experimental platform called the “Smart Apartment”, where we define and study the iTile network () and different multi-sensor systems for tracking functions. See http://infositu.loria.fr.
The “Approche Enactive pour la Gouvernance des Systèmes Socio-Techniques” (AEGSST) project follows from the work undertaken within the GEST project, funded by the IXXI (Institut Rhône-Alpin des Systèmes Complexes) and by the PEPS CNRS project GEST. It is labeled and funded by the Réseau National des Systèmes Complexes (RNSC).
This project aims, at a fundamental level, at proposing an enactive perspective on the governance of complex socio-technical systems, such as public transportation systems or smart grids in the energy domain. From a more applicative perspective, we seek to specify a participatory and reflexive simulation system based on a multi-agent model.
This project gathers researchers from different domains (social cognition, decision theory, simulation, serious games, etc.) in order to clarify interdisciplinary issues.
Several meetings were organized, and a workshop was held on 29 November in Paris.
Laurent Bougrain (CORTEX team, LORIA) is an external collaborator.
The COMAC
In the MAIA team, our research effort focuses more precisely on information-gathering problems involving active sensors, i.e., an intelligent system which has to select the observations to perform (which sensor, where, at which resolution). Mauricio Araya's ongoing PhD looks precisely at the topic of active sensing (Section ).
The project ended in December 2012. The main contributions of the MAIA and CORTEX teams are (1) the development of the iComac platform, which gathers the information concerning the diagnosis procedure results obtained by all the partners, and (2) the development of the Pie Diagnosis System (PDS), a demonstration application which uses a POMDP approach to compute the optimal active diagnosis strategy, and hypertrees for visualization.
IMAVO, for “Interactions entre Modules pour l'Apprentissage dans un environnement VOlatile”, is a PEPII project of the INSB institute of the CNRS. It involves Alain Marchand and Etienne Coutureau from the INCIA Lab of Bordeaux (Behavioral Neurosciences - INSB), Mehdi Khamassi and Benoît Girard from the ISIR Lab of Paris (Robotics and Neurosciences - INS2I), Alain Dutech and Nicolas Rougier from the Loria Lab of Nancy (Computational Neurosciences and Machine Learning - INS2I).
This project investigates model-based and model-free reinforcement learning approaches to rat learning in volatile environments (i.e., where context and reward can change during learning). It aims at designing new concepts for modularized decision-making systems, allowing a better understanding of the underlying neurobiological processes involved in rats and humans, as well as applications in the field of autonomous robotics.
The PAL project is a national Inria Large Scale Initiative (Action d'Envergure Nationale) involving several teams of the institute (Arobas, Coprin, E-motion, Lagadic, Demar, Maia, Prima, Pulsar and Trio). It is coordinated by David Daney (Inria Sophia-Antipolis EPI Coprin). The project focuses on the study and experimentation of models for health and well-being. Maia is particularly involved in the People Surveillance work package, studying and developing intelligent environments and distributed tracking devices for walking analysis and robotic assistance (smart tiles, 3D camera networks, assistant robots), cf. Sec. .
In 2012, we organized a PAL workshop in Nancy in November (http://
This project relies on results and questions arising from the SMAART project (2006-08). During this project, we adapted the EVAP algorithm, proposed in the PhD thesis of Arnaud Glad (Maia, 2011), to patrolling with UAVs, while providing a generic digital-pheromone-based patrolling simulator. Concerning authority sharing, we proposed an original interface to manipulate groups of UAVs.
The SUSIE project allowed us to progress on two questions: (i) studying and improving the parameters of the EVAP algorithm through the SUSIE simulator, and (ii) defining new ways to manipulate pheromone fields in order to improve authority sharing.
Percee, for “Perception Distribuée pour Environnements Intelligents”, is a project proposed by Maia and Madynes teams and funded by Inria. This ADT (Action de Developpement Technologique) supports our action in the PAL Inria National Scale Initiative (Personally Assisted Living, see ).
The project deals with the development and study of intelligent homes. For two years we have been developing an experimental platform, the smart apartment. It allows us to study models and technology for life assistance (walk analysis with iTiles and camera networks, robotic assistants, health diagnosis, domotic functions, wireless communication inside the home).
In particular we are developing a new tactile floor, the iTile network. Two engineers are funded by the ADT: Moutie Chaider (IJD) and Olivier Rochel (Inria research engineer), for two years.
This project was granted by the ANR in the Carotte robotics challenge (CArtographie par ROboT d'un TErritoire) of the Contenus et Interactions program (2009-2012). The project was funded with ca. 50,000 EUR to purchase the robotics platform. The Maia team was also funded with a PhD fellowship (Antoine Bautin, who will defend his PhD at the beginning of 2013). The Cartomatic consortium was formed by LISA/Angers University (leader) and the Maia/LORIA team (and, until 2011, Wany Robotics, Montpellier).
This project concerned the mapping of structured but unknown indoor environments, and the localization of objects, with one or several robots. We explored a decentralized multi-robot approach to tackle the challenge. We demonstrated the efficiency and robustness of the approach by winning the final edition of the contest (June 2012, Bourges). See Section and the Cartomatic project web page.
Dominique Martinez (Cortex team, Inria NGE) is an external collaborator and the coordinator of the project for Nancy members.
PHEROTAXIS is an “Investissements d’Avenir” ANR 2011-2014 (Coordination: J.-P. Rospars, UMR PISC, INRA Versailles).
The theme of the research is the localization of odour sources by insects and robots. By associating experimental data with models, the project will make it possible to define a behavioral model of olfactory processes. This work will also lead to several applications, in particular the development of highly sensitive and selective bio-inspired components.
The project is organized in five work packages and involves the PISC research unit (Versailles), Pasteur Institute (Paris) and LORIA/Inria institute (Nancy).
This project was funded by the ANR under the “Chaires d'Excellence” program, with ca. 400,000 EUR to hire four non-permanent researchers (PhD students and/or postdocs). Jörg Hoffmann is the project leader; Olivier Buffet and Bruno Scherrer collaborate. Other collaborators from LORIA are Stephan Merz, Ammar Oulamara, and Martin Quinson. The project also has several international collaborators, in particular Prof. Blai Bonet (Universidad Simon Bolivar, Caracas, Venezuela), Prof. Carmel Domshlak (Technion Haifa, Israel), Prof. Hector Geffner (Universitat Pompeu Fabra, Barcelona, Spain), Dr. Malte Helmert (University of Freiburg, Germany), and Prof. Stephen Smith (CMU, Pittsburgh, USA).
The project unites research from four different areas: classical planning, probabilistic planning, model checking, and scheduling. The common underlying theme is the development of new methods for computing lower bounds via state aggregation. Specifically, the basic technique investigated allows the explicit selection of states to aggregate, in exponentially large state spaces, via an incremental process that interleaves aggregation with state space reconstruction steps. The two main research questions are how to choose the states to aggregate, and how to effectively obtain, in practical scenarios, anytime methods providing solutions with increasingly tighter performance guarantees.
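As an illustration of why aggregation yields lower bounds (this is a minimal sketch, not the project's actual algorithm), consider a unit-cost shortest-path problem: a hypothetical mapping `alpha` that merges concrete states into abstract ones preserves every transition, so goal distances computed in the (smaller) abstract space can only shrink and are therefore admissible lower bounds on the concrete distances.

```python
from collections import deque

def bfs_dist(edges, goal):
    """Unit-cost distances to `goal`, by BFS over reversed edges."""
    rev = {}
    for u, v in edges:
        rev.setdefault(v, []).append(u)
    dist = {goal: 0}
    q = deque([goal])
    while q:
        v = q.popleft()
        for u in rev.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    return dist

def abstract_lower_bounds(edges, goal, alpha):
    """Lower bounds on goal distance via state aggregation.

    `alpha` maps each concrete state to an abstract state; every concrete
    transition (u, v) induces an abstract transition (alpha[u], alpha[v]),
    so abstract goal distances are admissible lower bounds.
    """
    abs_edges = {(alpha[u], alpha[v]) for (u, v) in edges}
    abs_dist = bfs_dist(abs_edges, alpha[goal])
    return {s: abs_dist.get(alpha[s], float("inf")) for s in alpha}
```

For example, on the chain 0 → 1 → 2 → 3 with goal 3, aggregating states 1 and 2 into a single abstract state gives the bound 2 for state 0, below its true distance of 3; the bound never exceeds the true distance, which is the anytime guarantee that successive refinements tighten.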
So far, we have hired Dr. Michael Katz as a postdoc (for 2 years) working on classical planning, and Manel Tagorti as a PhD student (for 3 years) working on probabilistic planning. The Conseil Régional de Lorraine agreed to co-finance, for 2011, 50% of the position of Michael Katz for a period of one year. Chao-Wen Perng was funded by BARQ for a 5-month internship, during which she worked on her MSc report, laying some of the groundwork for the research direction to be followed by Manel Tagorti.
The project stopped when Jörg Hoffmann left Inria.
Program: InterReg IV B
Project acronym: InTraDE
Project title: Intelligent Transportation for Dynamic Environment
Duration: 2010 - 2014
Coordinator: University of Science and Technology of Lille (Lille 1-LAGIS) (France)
Other partners: South East England Development Agency (United Kingdom), Centre Régional d’Innovation et de Transfert de Technologie – Transport et Logistique (CRITT TL) (France), AG Port of Oostende (AGHO) (Belgium), National Institute for Transport and Logistics, Dublin Institute of Technology (Ireland), Liverpool John Moores University (LOOM) (United Kingdom)
Abstract:
The InTraDE project (Intelligent Transportation for Dynamic
Environments, http://
The MAIA team partner focuses on decentralized approaches to the control of automated vehicle platooning and the adaptation of traffic. MAIA is funded with two PhD fellowships and one engineer. Both PhD theses started at the end of 2010. The PhD of Jano Yazbeck, supervised by F. Charpillet and A. Scheuer, studies “Secure and robust immaterial hanging for automated vehicles”. The PhD of Mohamed Tlig, supervised by O. Simonin and O. Buffet, addresses “Reactive coordination for traffic adaptation in large situated multi-agent systems”.
Dr. Iadine Chadès, Research Scientist at CSIRO, Ecosystem Sciences division (Brisbane, Australia), visited MAIA for 1 week in April 2012.
Prof. Sukanta Das, Professor at the Department of Information Technology, BESU (West Bengal, India), visited MAIA for three weeks in March 2012.
Amine Boumaza co-organized the 23rd JET (Journée Evolutionnaire Thématique) held at the University Pierre et Marie Curie in Paris on November 23rd.
Christine Bourjot was a co-organizer of the ARCO (Association pour la Recherche Cognitive) workshop “ROBOTS & CORPS, Immersion Écologique & Cognition incarnée”, October 2012.
Christine Bourjot was a board member of AFIA (Association Française pour l'Intelligence Artificielle).
Christine Bourjot was a co-organizer of the ARCO and AFIA workshop SCIA 2012 (Sciences Cognitives et Intelligence Artificielle), May 2012.
Olivier Buffet was a member of:
the organizing committee of the “Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes” 2012 (JFPDA'12),
the organizing committee of the “Conférence francophone sur l'apprentissage automatique” 2012 (CAp'12),
the editorial board of the “revue d'intelligence artificielle” (RIA), and
the editorial board of the “Journal of Artificial Intelligence Research” (JAIR).
Olivier Buffet was a reviewer for the journals: AMAI (Annals of Mathematics and Artificial Intelligence), JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle); and for the conferences AAAI'12 (National Conference on Artificial Intelligence), ICRA'12 (International Conference on Robotics and Automation), JFPDA'12 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
François Charpillet and Olivier Simonin co-organized the international workshop PAL (Personally Assisted Living) at LORIA, November 2012. (PAL workshop)
Vincent Chevrier was a member of the program committees of EUMAS'12 (European Workshop on Multi-Agent Systems), IAT'11 (Intelligent Agent Technology), RFIA'12 (Reconnaissance des Formes et Intelligence Artificielle), and JFSMA'12 (French conference on MAS).
Vincent Chevrier was a reviewer for the RIA journal.
Vincent Chevrier is the moderator of the mailing list of the French-speaking multi-agent systems community.
Alain Dutech was a reviewer for JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle), Journal of Adaptive Behavior; and for the conference JFPDA'12 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
Nazim Fatès was a co-organizer of the ACA (Asynchronous Cellular Automata) workshop at ACRI 2012, a member of the steering committee of Automata 2012 (Annual Workshop on Cellular Automata), and a member of the program committees of ACRI 2012, SCW'12 (Spatial Computing Workshop), ICAART'12, ICIST'12, and CAAA'12. He was an ad hoc reviewer for the following journals: Theoretical Computer Science, Entropy, and Advances in Complex Systems.
Nazim Fatès was an invited speaker at COLMOT'12, a workshop on “Collective motion in biological systems: from data to models”.
Alexis Scheuer was a reviewer for the IEEE Transactions on Robotics and the IEEE Transactions on Systems, Man, and Cybernetics, for the Elsevier journals Robotics and Autonomous Systems and Artificial Intelligence, and for the International Journal of Advanced Robotic Systems, as well as for the International Conference on Robotics and Automation (ICRA'13) and the IFAC International Symposium on Robot Control (SYROCO'12).
Bruno Scherrer was a reviewer for JAIR (Journal Of Artificial Intelligence Research), TAC (Transactions on Automatic Control), ICML'2012 (International Conference on Machine Learning), NIPS'2012 (Neural Information Processing Systems), ECAI'2012 (European Conference on Artificial Intelligence) and JFPDA'2012 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
Bruno Scherrer was an invited speaker at a workshop of the CEA-EDF-Inria Summer School on Stochastic Optimization at Cadarache (June 28).
Olivier Simonin co-organized the IROS'2012 international workshop “Assistance and Service Robotics in a Human Environment”, held in Vilamoura, Portugal, October 12th. (Web page)
Olivier Simonin co-organized the 7th National CAR'2012 Conference (Control Architectures of Robots) at LORIA, May 10-11 2012. (CAR'2012)
Olivier Simonin was chair and co-organizer of the SASO'2012 Demo&Contest Track (IEEE International Conference on Self-Adaptive and Self-Organizing Systems), Lyon, 2012. (saso2012)
Olivier Simonin was a reviewer for the journals JAAMAS (Journal of Autonomous Agents and Multi-Agent Systems), Natural Computing (Springer), the IEEE Robotics and Automation Magazine, and RIA (Revue d'Intelligence Artificielle). He also reviewed papers for ICRA'2012 (International Conference on Robotics and Automation).
Olivier Simonin was a program committee member of SASO'2012 (6th IEEE International Conference on Self-Adaptive and Self-Organizing Systems), ICINCO'2013 (10th Int. Conf. on Informatics in Control, Automation and Robotics), ICAART'2013 (5th Int. Conf. on Agents and AI.) and JFSMA'2012 (French conference on MAS).
Vincent Thomas was a board member of ARCO (Association pour la Recherche Cognitive)
Vincent Thomas was a reviewer for the “Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes” 2012 (JFPDA'2012).
Licence ISC (Informatique et Sciences Cognitives) : Christine Bourjot, Intelligence Artificielle et Résolution de Problèmes, 25HETD, niveau L3, Université de Lorraine, France
Master M2 SCMN (Sciences Cognitives et Médias Numériques) and Master SCA: Christine Bourjot, Université de Lorraine.
Master MIAGE (Méthodes Informatiques Appliquées à la Gestion): Christine Bourjot, Extraction Intelligente de Données, 18HETD, niveau M1, Université de Lorraine, France
Master SCA (Sciences Cognitives et Applications): Christine Bourjot, Système Multi-Agent, 20HETD, niveau M2, Université de Lorraine, France
Master : Vincent Chevrier, Modèles et Systèmes Multi-agents, 15CM, M2R, Université de Lorraine, France.
Master SCA (Sciences Cognitives et Applications): Alain Dutech, Apprentissage Numérique, 20HETD, niveau M1, Université de Lorraine, France.
Master : Nazim Fatès, Systèmes communicants, partie automates cellulaires, 10CM, M2R, Université de Lorraine, France.
Master : Alexis Scheuer & Olivier Simonin, Introduction à la robotique mobile, 37.5 HETD, M1 Informatique, Université de Lorraine (UHP), France.
Master SCA (Sciences Cognitives et Applications): Vincent Thomas, Agent Intelligent, 20HETD, niveau M1, Université de Lorraine, France.
Master SCA (Sciences Cognitives et Applications): Vincent Thomas, Game Design et Serious Game, 20HETD, niveau M2, Université de Lorraine, France.
Master Informatique: Vincent Thomas, Optimisation et Systemes Dynamiques Stochastiques, 22HETD, niveau M2, Université de Lorraine, France.
Supélec Metz, 5th year: Olivier Simonin, “Vie Artificielle”, 15CM.
PhD : Tomas Navarrete, “Une architecture de contrôle de systèmes complexes basée sur la simulation multi-agent”, Université de Lorraine, Oct. 24, Vincent Chevrier (advisor)
PhD in progress : Mihai Andries, “Calcul spatialisé pour l'assistance à la personne: étude d'un réseau de dalles intelligentes”, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress : Mauricio Araya, “Near-Optimal Algorithms for Sequential Information-Gathering Decision Problems”, Sept. 2009, F. Charpillet (advisor), O. Buffet, V. Thomas.
PhD in progress : Antoine Bautin, “Stratégie d'exploration multi-robot fondée sur les champs de potentiels artificiels”, Oct. 2009, F. Charpillet (advisor), O. Simonin.
PhD in progress : Olivier Bouré, “Robustesse des systèmes multi-agents réactifs: vers une informatique bio-inspirée ?”, Nazim Fatès, Vincent Chevrier (advisor).
PhD in progress : Benjamin Camus, “Un laboratoire virtuel pour la multi-modélisation”, Christine Bourjot, Vincent Chevrier (advisor).
PhD in progress : Timothé Collet, “Apprentissage actif par renforcement pour la classification”, Nov. 2012, O. Pietquin (advisor), O. Buffet.
PhD in progress : Amandine Dubois, “Assistance à la personne en perte d'autonomie : étude de l'apport d'un réseau de Kinects à la détection et la prévention des chutes”, Oct. 2011, F. Charpillet (advisor).
PhD in progress : Arsène Fansi Tchango, “Suivi multi-caméra en environnement partiellement observé”, Oct. 2011, A. Dutech (advisor), O. Buffet, V. Thomas.
PhD in progress : Nassim Kaldé, “Exploration et reconstruction d’un environnement inconnu par une flottille de robots”, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress : Manel Tagorti, “Approximating the Value Function for Heuristic Search in Factored MDPs”, Nov. 2011, J. Hoffmann (advisor), B. Scherrer, O. Buffet.
PhD in progress : Mohamed Tlig, “Reactive coordination for traffic adaptation in large situated multi-agent systems”, Dec. 2010, O. Simonin (advisor), O. Buffet.
PhD in progress : Jano Yazbeck, “Secure and robust immaterial hanging for automated vehicles”, Oct. 2010, F. Charpillet (advisor), A. Scheuer.
Vincent Chevrier was a member of the PhD committees of Jonathan Demange (Dec. 20, UTBM), as a referee, and of Shirley Hoet (Dec. 17, UPMC), as a committee member.
Vincent Chevrier was a member, as a referee, of the HDR committee of Pascal Ballet, April 6, Université de Bretagne Occidentale.
François Charpillet was a member (as a referee) of the PhD committees of:
Muhammad Ali, 11th July 2012, LAAS, University of Toulouse
Matthieu Warnier, 10th December 2012, LAAS, University of Toulouse
Senthilkumar Chandramohan, 25th September 2012, University of Avignon
Guogang Wen, 26th October 2012, LAGIS, University of Lille
Mohamed Amine Hamila, 3rd April 2012, University of Valenciennes
François Gaillard, 2nd February 2012, LIFL, University of Lille
François Charpillet was a member of the PhD committees of:
Wissam Khalil, 2nd February 2012, LAGIS, University of Lille
Sylvain Raybaud, 5th December 2012, LORIA, University of Lorraine
Rui Loureiro, 6th December 2012, LAGIS, University of Lille
François Charpillet was a member of the HDR committee of:
Amir Hajjam El Hassani, 8th December 2012, University of Besançon
Alain Dutech was a member (as a referee) of the PhD committee of Shirley Hoet, 17 Dec, UPMC.
Nazim Fatès was a member of the PhD committee of Julien Provillard, Université de Nice. The defence was held on the 6th of December in I3S laboratory, Nice.
Bruno Scherrer was a member of the PhD committee of Jean-François Hren, Université Lille 1, 21 Jun.
Olivier Simonin was a member of the PhD committee of M. Guezani, UTBM, 4 April.
Vincent Thomas was a member of the “Specialist committee” at Université Nancy 2.
Christine Bourjot was co-organizer of the “Forum des Sciences Cognitives”, Université de Lorraine, November 2012.
As part of the 2012 celebrations of Alan Turing's hundredth birthday:
Nazim Fatès recorded a video on Turing's heritage with Inria's audiovisual service. The video can be viewed at http://www.youtube.com/watch?v=6awK-FHBntc.
Nazim Fatès was interviewed by the Eureka magazine, and he co-organized the three conferences on Turing, aimed at a general audience, that LORIA held in September in Nancy (see http://turing2012.loria.fr).
Nazim Fatès participated in recording a program broadcast at the “Hôpital des enfants malades” in Brabois, whose main topic was Turing and computer science (see http://www.loria.fr/news/linformatique-aux-enfants-hospitalises-de-brabois). An article (containing some mistakes) about this event appeared in L'Est Républicain (November 17, 2012).
Nazim Fatès gave a talk entitled “Turing, l'intelligence des machines et le jeu” at the Lycée Jacques Marquette (Pont-à-Mousson) for the annual meeting of the “Associations des anciens élèves et professeurs” (October 7, 2012).
Olivier Simonin was invited to the Journée “Robotique et Numérique” organized by the GDR Robotique for a talk entitled “Robotique bio-inspirée : vers une intelligence collective ?” (see http://www.gdr-robotique.org/journee.php).
Vincent Thomas is preparing, in collaboration with the “Bibliothèque Universitaire du Campus Lettres”, an exhibition, “Jeux : les ateliers de la pensée”, whose main objective is to promote games as a subject of academic interest. The exhibition will include popular-science seminars with specialists from several scientific fields (economics, psychology, computer science) and animations.
Vincent Thomas is a participant of the Erasmus IP: "Learning Computer Programming in Virtual Environments" involving 8 foreign universities. Its objective is to promote teaching and learning of the fundamentals of computer programming through use of virtual and remote learning environments.