The objective of the MAIA
The first research activity is about sequential decision making. It has been influenced by Stuart Russell, who considers that an agent is rational. In this view: “For each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure”. This makes Markov decision processes (MDPs), and more generally sequential decision making, a good candidate for building the behavior of an agent. It probably explains why MDPs have received considerable attention from the artificial intelligence (AI) community in recent years.
The second activity is about understanding and engineering reactive multi-agent systems. It is influenced by research results from the field of behavioral biology which provides key insights for understanding how intelligent and adaptive behaviors appear in natural swarm systems. This encourages us to study principles of emergent behaviors in natural systems and apply them to the design of artificial intelligent systems. Reactive multi-agent systems are good candidates for building such autonomous and adaptive systems and our work mainly focuses on better understanding how we can soundly build such systems.
The MAIA team was recognized as “the most influential team of the research field” during the French conference on Planning, Decision and Learning (JFPDA 2013).
M. Tlig, O. Buffet, and O. Simonin received the Best Paper Award for their paper presented at RJCIA-13.
Sequential decision making consists, in a nutshell, in controlling the actions of an agent facing a problem whose solution requires not one but a whole sequence of decisions. This kind of problem occurs in a multitude of forms. For example, important applications addressed in our work include: Robotics, where the agent is a physical entity moving in the real world; Medicine, where the agent can be an analytic device recommending tests and/or treatments; Computer Security, where the agent can be a virtual attacker trying to identify security holes in a given network; and Business Process Management, where the agent can provide an auto-completion facility helping to decide which steps to include into a new or revised process. Our work on such problems is characterized by three main lines of research:
(A) Understanding how, and to what extent, to best model the problems.
(B) Developing algorithms solving the problems and understanding their behavior.
(C) Applying our results to complex applications.
Before we describe some details of our work, it is instructive to understand the basic forms of problems we are addressing. We characterize problems along the following main dimensions:
(1) Extent of the model: full vs. partial vs. none. This dimension concerns how complete we require the model of the problem – if any – to be. If the model is incomplete, then learning techniques are needed along with the decision making process.
(2) Form of the model: factored vs. enumerative. Enumerative models explicitly list all possible world states and the associated actions etc. Factored models can be exponentially more compact, describing states and actions in terms of their behavior with respect to a set of higher-level variables.
(3) World dynamics: deterministic vs. stochastic. This concerns our initial knowledge of the world the agent is acting in, as well as the dynamics of actions: is the outcome known a priori or are several outcomes possible?
(4) Observability: full vs. partial. This concerns our ability to observe what our actions actually do to the world, i.e., to observe properties of the new world state. Obviously, this is an issue only if the world dynamics are stochastic.
These dimensions are widespread in the AI literature and are not exhaustive; in particular, the MAIA team is also interested in discrete/continuous and centralized/decentralized problems. The complexity of solving a problem – both in theory and in practice – depends heavily on where it resides in this categorization. A common practice is to address simplified problems, leading to perhaps sub-optimal solutions while trying to characterize how far from the optimal solution we stand.
In what follows, we outline the main formal frameworks on which our work is based; while doing so, we highlight in a little more detail our core research questions. We then give a brief summary of how our work fits into the global research context.
Sequential decision making with deterministic world dynamics is most
commonly known as planning, or classical planning.
Obviously, in such a setting every
world state needs to be considered at most once, and thus enumerative
models do not make sense (the problem description would have the same
size as the space of possibilities to be explored). Planning
approaches support factored description languages in which
complex problems can be modeled in a compact way. Approaches to automatically learn
such factored models do exist, however most works – and also most of
our works on this form of sequential decision making – assume that the
model is provided by the user of the planning technology. Formally, a
problem instance, commonly referred to as a planning task, is a
four-tuple
Planning is PSPACE-complete even under strong restrictions on the formulas allowed in the planning task description. Research thus revolves around the development and understanding of search methods, which explore, in a variety of different ways, the space of possible action schedules. A particularly successful approach is heuristic search, where search is guided by information obtained in an automatically designed relaxation (simplified version) of the task. We investigate the design of relaxations, the connections between such design and the search space topology, and the construction of effective planning systems that exhibit good practical performance across a wide range of different inputs. Other important research lines concern the application of ideas successful in planning to stochastic sequential decision making (see next), and the development of technology supporting the user in model design.
Markov Decision Processes (MDPs) are a
natural framework for stochastic sequential decision making. An MDP is
a four-tuple
Once the optimal value function is computed, it is straightforward
to derive an optimal strategy, which is deterministic and memoryless,
i.e., a simple mapping from states to actions. Such a strategy is
usually called a policy. An optimal policy is any policy
An important extension of MDPs, known as Partially Observable MDPs
(POMDPs), makes it possible to account for the fact that the state may not
be fully available to the decision maker.
While the goal is the same
as in an MDP (optimizing the expected sum of discounted rewards), the
solution is more intricate. Any POMDP can be seen to be equivalent to
an MDP defined on the space of probability distributions on states,
called belief states. The Bellman machinery then applies to the
belief states. The specific structure of the resulting MDP makes it
possible to iteratively approximate the optimal value function –
which is convex in the belief space – by piecewise linear
functions, and to deduce an optimal policy that maps belief states to
actions.
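As a minimal sketch of the belief-state view described above, the following Python code performs one Bayesian belief update and evaluates a piecewise-linear convex value function given as a set of linear pieces. The formal POMDP definition is not spelled out in this text, so the table layouts (`transition[a][s][s2]`, `observation_model[a][s2][o]`) and the name `alpha_vectors` are assumptions made for illustration only.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Return the new belief after taking `action` and receiving `observation`.

    Assumed (hypothetical) layouts:
      transition[a][s][s2]        = P(s2 | s, a)
      observation_model[a][s2][o] = P(o | s2, a)
    """
    n = len(belief)
    unnormalized = [
        observation_model[action][s2][observation]
        * sum(transition[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnormalized]


def pwlc_value(belief, alpha_vectors):
    """Value of a belief under a piecewise-linear convex function,
    represented as the maximum over a set of linear pieces."""
    return max(
        sum(a * b for a, b in zip(alpha, belief)) for alpha in alpha_vectors
    )
```

For instance, with two hidden states, a single action that leaves the state unchanged, and an observation that is correct with probability 0.85, a uniform belief becomes `[0.85, 0.15]` after one observation pointing to the first state.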
A further extension, known as a DEC-POMDP, considers
The MDP model described above is enumerative, and the complexity of computing the optimal value function is polynomial in the size of that input. However, in examples of practical size, that complexity is still too high, so naïve approaches do not scale. We consider the following situations: (i) when the state space is large, we study approximation techniques from both a theoretical and practical point of view; (ii) when the model is unknown, we study how to learn an optimal policy from samples (this problem is also known as Reinforcement Learning); (iii) in factored models, where MDP models are a strict generalization of classical planning – and are thus at least PSPACE-hard to solve – we consider using search heuristics adapted from such (classical) planning.
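To make the enumerative setting concrete, here is a minimal value iteration sketch in Python. It is only an illustration under stated assumptions: the exact tuple defining an MDP is elided in this text, so the data layout (`transitions[s][a]` as a list of `(probability, next_state)` pairs, `rewards[s][a]` as an expected immediate reward) and the discount factor `gamma` are hypothetical choices, not the team's implementation.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-9, max_iter=100_000):
    """Approximate the optimal value function of a small enumerative MDP.

    Assumed (hypothetical) layout:
      transitions[s][a] = list of (probability, next_state) pairs
      rewards[s][a]     = expected immediate reward
    Returns (values, policy), where policy maps each state to an action index.
    """
    n_states = len(transitions)
    values = [0.0] * n_states

    def q_value(s, a):
        # One-step lookahead using the current value estimates.
        return rewards[s][a] + gamma * sum(
            p * values[t] for p, t in transitions[s][a]
        )

    for _ in range(max_iter):
        new_values = [
            max(q_value(s, a) for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values[:] = new_values  # update in place so q_value sees the new estimates
        if delta < tol:
            break

    # Greedy, deterministic, memoryless policy with respect to the final values.
    policy = [
        max(range(len(transitions[s])), key=lambda a: q_value(s, a))
        for s in range(n_states)
    ]
    return values, policy
```

Each sweep touches every state, action, and successor once, which is why the cost is polynomial in the size of the enumerative model yet quickly becomes impractical when the state space is large.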
Solving a POMDP is PSPACE-hard even given an enumerative model. In this framework, we are mainly looking for assumptions that could be exploited to reduce the complexity of the problem at hand, for instance when some actions have no effect on the state dynamics (active sensing). The decentralized version, DEC-POMDP, induces a significant increase in complexity (NEXP-complete). We tackle the challenging – even for (very) small state spaces – exact computation of finite-horizon optimal solutions through alternative reformulations of the problem. We also aim at proposing advanced heuristics to efficiently address problems with more agents and a longer time horizon.
There exist numerous examples of natural and artificial systems where self-organization and emergence occur. Such systems are composed of a set of simple entities interacting in a shared environment and exhibit complex collective behaviors resulting from the interactions of the local (or individual) behaviors of these entities. The properties that they exhibit, for instance robustness, explain why their study has been growing, both in academia and in industry. They are found in a wide range of fields such as sociology (opinion dynamics in social networks), ecology (population dynamics), economics (financial markets, consumer behaviors), ethology (swarm intelligence, collective motion), cellular biology (cells/organs), computer networks (ad-hoc or P2P networks), etc.
More precisely, the systems we are interested in are characterized by:
locality: Elementary components have only a partial perception of the system's state; similarly, a component can only modify its surrounding environment.
individual simplicity: components have a simple behavior, which in most cases can be modeled by stimulus/response laws or by look-up tables. One way to estimate this simplicity is, for instance, to count the number of stimulus/response rules.
emergence: It is generally difficult to predict the global behavior of the system from the local individual behaviors. This difficulty of prediction is often observed empirically and in some cases (e.g., cellular automata) one can show that the prediction of the global properties of a system is an undecidable problem. However, observations coming from simulations of the system may help us to find the regularities that occur in the system's behavior (even in a probabilistic sense). Our interest is to work on problems where a full mathematical analysis seems out of reach and where it is useful to observe the system with large simulations. In return, the properties observed empirically are frequently studied afterwards on an analytical basis. This approach should allow us to understand where the frontier between simulation and analysis lies.
levels of description and observation: Describing a complex system involves at least two levels: the micro level that regards how a component behaves, and the macro level associated with the collective behavior. Usually, understanding a complex system requires linking the description of a component's behavior with the observation of a collective phenomenon; establishing this link may require various levels, which can be obtained only with a careful analysis of the system.
We now describe the type of models that are studied in our group.
We represent these complex systems with reactive multi-agent systems (RMAS). Multi-agent systems are defined by a set of reactive agents, an environment, a set of interactions between agents and a resulting organization. They are characterized by a decentralized control shared among agents: each agent has an internal state, has access to local observations and influences the system through stimulus response rules. Thus, the collective behavior results from individual simplicity and successive actions and interactions of agents through the environment.
Reactive multi-agent systems present several advantages for modeling complex systems:
agents are explicitly represented in the system and have the properties of local action, interaction and observation;
each agent can be described regardless of the description of the other agents, so multi-agent systems allow explicit heterogeneity among agents, which is often at the root of collective emergent phenomena;
multi-agent systems can be executed through simulation and provide good models to investigate the complex link between global and local phenomena for which analytic studies are hard to perform.
By proposing two different levels of description, the local level of the agents and the global level of the phenomenon, and several execution models, multi-agent systems constitute an interesting tool to study the link between local and global properties.
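The ingredients listed above (local perception, stimulus/response rules, and a collective behavior that results from interactions) can be illustrated with a deliberately tiny Python sketch. Everything here is a hypothetical toy chosen for illustration, not one of the models studied in the group: agents live on a ring of cells, perceive only the agents within a small radius, and step toward the more crowded side.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    position: int  # cell index on a ring of `size` cells


def perceive(agent, agents, size, radius=2):
    """Local observation: number of other agents within `radius` cells
    on the left and on the right (requires size > 2 * radius)."""
    left = right = 0
    for other in agents:
        if other is agent:
            continue
        offset = (other.position - agent.position) % size
        if 1 <= offset <= radius:
            right += 1
        elif size - radius <= offset <= size - 1:
            left += 1
    return left, right


def react(observation):
    """Stimulus/response rule: move one cell toward the more crowded side."""
    left, right = observation
    if right > left:
        return 1
    if left > right:
        return -1
    return 0


def step(agents, size):
    """Synchronous update: every agent perceives first, then all agents move."""
    moves = [react(perceive(a, agents, size)) for a in agents]
    for agent, move in zip(agents, moves):
        agent.position = (agent.position + move) % size
```

The micro level is the three-line rule in `react`; the macro level is whatever spatial pattern (here, local clustering) appears when `step` is iterated, which is the kind of local-to-global link that simulation helps to investigate.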
Despite a widespread use of multi-agent systems, their framework still needs many improvements to be fully accessible to computer scientists from various backgrounds. For instance, there is no generic model to mathematically define a reactive multi-agent system and to describe its interactions. This situation is in contrast with the field of cellular automata, for instance, and underlines that a unification of multi-agent systems under a general framework is a question that still remains to be tackled. We now list the different challenges that, in part, contribute to such an objective.
Our work is structured around the following challenges that combine both theoretical and experimental approaches.
A widespread and consensual formal definition of a multi-agent system is lacking. Our research aims at translating the concepts from the field of complex systems into the multi-agent systems framework.
One objective of this research is to remove the potential ambiguities that can appear if one describes a system without explicitly formulating each aspect of the simulation framework. As a benefit, the reproduction of experiments is facilitated. Moreover, this approach is intended to provide better insight into the self-organization properties of the systems.
Another important question consists in monitoring the evolution of complex systems. Our objective is to provide some quantitative characteristics of the system such as local or global stability, robustness, complexity, etc. Describing our models as dynamical systems leads us to use specific tools of this mathematical theory as well as statistical tools.
Since there is no central control of our systems, one question of interest is to know under which conditions it is possible to guarantee a given property when the system is subject to perturbations. We tackle this issue by designing exogenous control architectures where control actions are envisaged as perturbations in the system. As a consequence, we seek to develop control mechanisms that can change the global behavior of a system without modifying the agent behavior (and not violating the autonomy property).
The aim is to design individual behaviors and interactions in order to produce a desired collective output. This output can be a collective pattern to reproduce when simulating natural systems. In that case, starting from individual behaviors and interactions, we study whether (and how) the collective pattern is produced. We also tackle “inverse problems” (decentralized gathering problem, density classification problem, etc.), which consist in finding individual behaviors in order to solve a given problem.
Our group is involved in several applications of its more fundamental work on autonomous decision making and complex systems. Applications addressed include:
Robotics, where the decision maker or agent is supported by a physical entity moving in the real world;
Medicine or Personally Assisted Living, where the agent can be an analytic device recommending tests and/or treatments, or can gather different sources of information (sensors, for example) in order to help an end user, for example by detecting abnormal situations that require rescuing a person (fall detection for elderly people, risk of hospitalization of a person suffering from a chronic disease);
Active Sensing, where decisions have to be made in order to gather information about a system. This can be applied to many fields, for example monitoring the integrity of airplane wings or the behavior of people in public areas.
As the scientific strategy of the Nancy – Grand Est Research Center pushes the development of platforms on Robotics and Smart Living Apartments, some members of the team have refocused their research on “ambient intelligence and AI”. This choice is backed by the Inria Large-scale initiative project PAL (Personally Assisted Living), in which we are strongly involved. The regional council of Lorraine also supports this new research line through the CPER (project "situated computing" or "INFOSITU", infositu.loria.fr), whose coordinator is a member of the MAIA team. Within this new research domain in MAIA, we explore how intelligent decentralized complex systems can help design intelligent environments dedicated to elderly people with loss of autonomy. This research domain is currently very active and takes up a societal challenge that developed countries have to address.
Laurent Ciarletta (Madynes team, LORIA) is a collaborator and correspondent for this software. Yannick Presse (Madynes team, LORIA) is a collaborator for this software.
AA4MM (Agents and Artefacts for Multi-modeling and Multi-simulation) is a framework for coupling existing and heterogeneous models and simulators in order to model and simulate complex systems. The first implementation of the AA4MM meta-model was proposed in Julien Siebert's PhD and written in Java. A newer version with more coupling models is currently under submission to the APP (Agence pour la protection des programmes).
This year, we used this software in a strategic action with EDF R&D in the context of the simulation of smart-grids.
This work was undertaken in the PhD thesis of Julien Siebert, a joint thesis between the MAIA and Madynes teams. Laurent Ciarletta (Madynes team, LORIA) was co-advisor of this PhD and is correspondent for this software.
Other contributors to this software were: Tom Leclerc, François Klein, Christophe Torin, Marcel Lamenu, Guillaume Favre and Amir Toly.
MASDYNE (Multi-Agent Simulator of DYnamic Networks usErs) is a multi-agent simulator for modeling and simulating user behaviors in mobile ad hoc networks. This software is part of joint work with the MADYNES team on modeling and simulation of ubiquitous networks. It has been updated by Tomas Navarrete with new functionalities for the simulation of scenarios.
FiatLux is a discrete dynamical systems simulator that allows the user to experiment with various models (for example 1D and 2D cellular automata, or moving agents on cellular automata) and to perturb them. Its main feature is to allow users to change the type of updating, for example from a deterministic parallel updating to an asynchronous random updating. FiatLux has a Graphical User Interface and can also be launched in batch mode for experiments that require statistics.
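The difference between parallel and asynchronous updating can be sketched in a few lines of Python. This is not FiatLux code or its API; it is a hypothetical stand-alone illustration on a 1D elementary cellular automaton with periodic boundaries, where the parameter `alpha` (an assumption of this sketch) is the probability that each cell applies the rule at a given step. Setting `alpha=1.0` gives the deterministic parallel update, and smaller values give one possible form of asynchronous random updating.

```python
import random


def local_rule(rule_number, left, center, right):
    """Look up the next state of a cell in a Wolfram-numbered elementary rule."""
    return (rule_number >> (left << 2 | center << 1 | right)) & 1


def update(cells, rule_number, alpha=1.0, rng=random):
    """Return the next configuration of a ring of binary cells.

    Each cell applies the rule with probability `alpha` and otherwise keeps
    its state; all cells read the same (old) configuration.
    """
    n = len(cells)
    return [
        local_rule(rule_number, cells[i - 1], cells[i], cells[(i + 1) % n])
        if rng.random() < alpha
        else cells[i]
        for i in range(n)
    ]
```

Passing a seeded `random.Random` instance as `rng` makes asynchronous runs reproducible, which matters for batch experiments that collect statistics over many runs.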
In 2013, FiatLux was officially registered by the Agence pour la protection des programmes (APP). A new release is available under the CeCILL licence on the FiatLux website: fiatlux.loria.fr
Philippe Lucidarme (Université d'Angers, LISA) is a collaborator and the coordinator of the Cart-o-matic project.
Cart-o-matic is a software platform for (multi-)robot exploration and mapping tasks. It was developed by MAIA and LISA (Univ. Angers) members during the robotics ANR/DGA Carotte challenge (2009-2012). This platform is composed of three software tools, which are protected by software copyrights (through the Agence pour la Protection des Programmes): Slam-o-matic, a SLAM algorithm developed by LISA members; Plan-o-matic, a robot trajectory planning algorithm developed by MAIA and LISA members; and Expl-o-matic, a distributed multi-agent strategy for multi-robot exploration developed by MAIA members (based on algorithms proposed in the PhD thesis of Antoine Bautin). Cf. illustration at Cart-o-matic.
The purchase of Cart-o-matic by some robotics companies is underway.
In the context of Mauricio Araya's PhD and PostDoc, we are working on how MDPs – or related models – can search for information. This has led to various research directions, such as extending POMDPs so as to optimize information-based rewards, or actively learning MDP models. This year has begun with the defense of Mauricio's PhD thesis in February. Since then, we have kept extending Mauricio's work and are preparing journal submissions.
While we have made some progress in this field, there are no concrete outcomes to present concerning optimistic approaches for model-based Bayesian Reinforcement Learning. Concerning POMDPs with information-based rewards, Mauricio's PhD thesis presents strong theoretical results that allow – in principle – deriving efficient algorithms from state-of-the-art “point-based” POMDP solvers. This year we have put this idea into practice, implementing variants of PBVI, PERSEUS and HSVI.
Preliminary results have been published (in French) in JFPDA'13. A journal paper with complete theoretical and empirical results is under preparation.
Samuel Nicol, Iadine Chadès (CSIRO), Takuya Iwamura (Stanford University) are external collaborators.
In the field of conservation biology, adaptive management is about managing a system, e.g., performing actions so as to protect some endangered species, while learning how it behaves. This is a typical reinforcement learning task that could for example be addressed through Bayesian Reinforcement Learning.
This year, we have worked in the context of bird migratory pathways, in particular the East Asian-Australasian (EAA) flyway, which is modeled as a network whose nodes are land areas where birds need to stay for some time. An issue is that these land areas are threatened by sea level rise. The adaptive management problem at hand is that of deciding which land areas to invest money in protecting, so as to preserve the migratory pathways as efficiently as possible.
The outcome of this work is a data challenge paper published at IJCAI'13, which presents the problem at hand, describes its POMDP model, gives empirical results obtained with state-of-the-art solvers, and challenges POMDP practitioners to find better solution techniques.
External collaborators: Christopher Amato (MIT), Arnaud Doniec (EMD), Charles Bessonnet (Telecom Nancy), Joni Pajarinen (Aalto University).
Decentralized partially observable Markov decision processes (DEC-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete), even using efficient heuristic search algorithms. In this work, we present an efficient methodology for solving decentralized stochastic control problems formalized as a DEC-POMDP or its subclasses. This methodology is threefold: (1) it converts the original decentralized problem into a centralized problem from the perspective of a solution method that can take advantage of the total data about the original problem that is available during the online execution phase; (2) it shows that the original and transformed problems are equivalent; (3) it solves the transformed problem using a centralized method and transfers the solution back to the original problem. We applied this methodology to various decentralized stochastic control problems.
Our results include the application of this methodology to DEC-POMDPs. We recast them into deterministic continuous-state MDPs, where states, called occupancy states, are probability distributions over states and action-observation histories of the original DEC-POMDPs. We also demonstrate that the occupancy state is a sufficient statistic for optimally solving DEC-POMDPs. We further show that the optimal value function is a piecewise-linear and convex function of the occupancy states. With these results as a background, we prove for the first time that POMDP (and more generally continuous-state MDP) solution methods can, at least in principle, apply to DEC-POMDPs. This work has been presented at IJCAI'2013 and (in French) at JFPDA'2013, and an in-depth journal article is currently under preparation. We have already extended the results obtained for general DEC-POMDPs to the case of transition- and observation-independent DEC-MDPs. Of particular interest, we demonstrated that the occupancy states can be further compressed into a probability distribution over the states, the first sufficient statistic in decentralized stochastic control problems that is invariant with time. This work has been presented at AAMAS'2013, and an in-depth journal article is currently under preparation.
We believe our methodology lays the foundation for further work on optimal as well as approximate solution methods for decentralized stochastic control problems in particular, and stochastic control problems in general.
Jörg Hoffmann, former member of MAIA, is an external collaborator (from Saarland University).
Abstraction is a common method to compute lower bounds in classical planning, imposing an equivalence relation on the state space and deriving the lower bound from the quotient system. It is a trivial and well-known fact that refined abstractions can only improve the lower bound. Thus, when we embarked on applying the same technique in the probabilistic setting, we firmly expected to find the same behavior there. We were wrong. Indeed, there are cases where every direct refinement step (splitting one equivalence class into two) yields strictly worse bounds. We give a comprehensive account of the issues involved, for two widespread methods to define and use abstract MDPs.
This work has been presented and published in the ICAPS-13 workshop on Heuristics and Search for Domain-Independent Planning (HSDIP) and (in French) in JFPDA-13.
Evolutionary Programming, initially introduced by Fogel in 1966, is an approach to build an automaton optimizing a fitness function. As in other evolutionary algorithms, an initial population of automata is given, and the evolutionary programming algorithm makes this population evolve by progressively modifying automata (mutations) and keeping the most efficient ones in the next generation.
This process is close to the progressive construction by a policy iteration algorithm in a POMDP and we are currently investigating the links between these approaches.
This work has begun this year through an internship (Benjamin Bibler) and preliminary development has been made to solve the Santa Fe trail problem proposed by Koza (1992) which has become a benchmark to compare genetic and evolutionary programming approaches.
Learning Tetris controllers is an interesting and challenging problem because of the size of its search space: traditional machine learning methods do not work, and approximate methods are necessary. In this work, we study the performance of a direct policy search algorithm, namely the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). We also proposed different techniques to reduce the learning time, one of which is racing. This approach concentrates the computational effort on promising policies and quickly discards bad ones in order to reduce the computation time. It made it possible to obtain policies with the same performance as those obtained without racing, at one fifth of the computation cost. The learned strategies are among the best-performing players at this time, scoring several million lines on average.
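The idea behind racing can be sketched as follows in Python. This is a generic, hypothetical illustration based on Hoeffding confidence intervals, not the specific racing procedure used in this work: the function `evaluate`, the parameters `delta` and `value_range`, and the elimination test are all assumptions of the sketch. Each surviving candidate receives one more noisy evaluation per round, and a candidate is dropped as soon as its upper confidence bound falls below the lower confidence bound of the current best.

```python
import math


def race(candidates, evaluate, max_evals=50, min_evals=5, delta=0.05,
         value_range=1.0):
    """Evaluate candidates round by round and discard clearly worse ones.

    evaluate(candidate) returns one noisy score assumed to lie in an interval
    of width `value_range`. Returns (surviving indices, evaluation counts).
    """
    alive = set(range(len(candidates)))
    totals = [0.0] * len(candidates)
    counts = [0] * len(candidates)

    for t in range(1, max_evals + 1):
        # One additional evaluation for every candidate still in the race.
        for i in sorted(alive):
            totals[i] += evaluate(candidates[i])
            counts[i] += 1
        if t < min_evals:
            continue
        # Hoeffding radius after t evaluations, at confidence level 1 - delta.
        radius = value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * t))
        means = {i: totals[i] / counts[i] for i in alive}
        best_lower = max(means.values()) - radius
        alive = {i for i in alive if means[i] + radius >= best_lower}
        if len(alive) == 1:
            break

    return sorted(alive), counts
```

In a policy search loop, `candidates` would be the policies sampled in one generation and `evaluate` would play one game with the given policy; the saving comes from the evaluations that eliminated candidates no longer receive.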
Evolutionary Robotics (ER) deals with the design of agent behaviors using artificial evolution. Within this framework, the problem of learning optimal policies (or controllers) is treated as a policy search problem in the parameterized space of candidate policies. The search for the optimal policies in this context is driven by a fitness function that associates a value to the candidate policy by measuring its performance on the given task.
The work shown here describes the results of the master's thesis of Iñaki Fernández, which will be extended during a Ph.D. thesis started in October 2014.
Incremental policy learning with shaping. Several methods have been proposed to accelerate the search for optimal policy in evolutionary robotics. In this work, we investigated the use of incremental learning and, more precisely, shaping, a well-known technique in behavioral psychology. The main idea is to learn to solve simple tasks and then exploit the learned behaviors to tackle increasingly harder tasks.
Our preliminary results show that the best performances are obtained either in the setups with shaping or in the control experiment where the task difficulty is maximal. Nevertheless, a closer look at the results indicates that the best controllers for the shaping setups are not obtained at the end of the evolution, but rather at an earlier stage. This means that, for these shaping techniques, the best controllers have learned to solve the task when its difficulty was at an easy level and their performance is maintained later when the task difficulty increases. Although this was unforeseen, the results seem promising and deserve further investigation.
Online evolutionary learning. As opposed to traditional evolutionary robotics, which treats the learning problem as an off-line, centralized process, online onboard distributed evolutionary algorithms consider the learning process as executed at the agent level in a decentralized way. In this sense, each agent has its own controller, or genome, which is broadcast locally from agent to agent, and the best-performing ones survive and spread. This gene-centered view of evolution is inspired by the theory introduced by Richard Dawkins in The Selfish Gene.
The online aspect of the algorithms means that the agents are learning at the same time as they are performing the task at hand. Another resulting property is that the agents learn continuously, which allows them to adapt to dynamically changing conditions and tasks. This is in contrast to the traditional (offline) view of evolutionary robotics, where the outcome of evolution is tailored toward a single task. Many challenging problems are raised in this framework, and this thesis will address the problem of defining fitness functions that drive a swarm of agents to learn to solve a task. Another question is to study the dynamics of these algorithms both experimentally and theoretically using tools from distributed systems. Some promising work in this direction has been proposed.
Jörg Hoffmann, former member of MAIA, and Michal Krajňanský are external collaborators from Saarland University.
In classical planning, a key problem is to exploit heuristic knowledge to efficiently guide the search for a sequence of actions leading to a goal state.
In some settings, one may have the opportunity to solve multiple small instances of a problem before solving larger instances, e.g., trying to handle a logistics problem with small numbers of trucks, depots and items before moving to (much) larger numbers. Then, the small instances may make it possible to extract knowledge that could be reused when facing larger instances. Previous work shows that it is difficult to directly learn rules specifying which action to pick in a given situation. Instead, we look for rules telling which actions should not be considered, so as to reduce the search space. But this approach requires considering multiple questions: What are examples of bad (or non-bad) actions? How to obtain them? Which learning algorithm to use?
This research work is conducted as part of Michal Krajňanský's master of science (to be defended in early 2014). Early experiments show encouraging results, and we consider participating in the learning track of the international planning competition in 2014.
We have this year improved the state-of-the-art upper bounds for the complexity of a standard algorithm for solving Markov Decision Processes: Policy Iteration.
Given a Markov Decision Process with
These results were presented at the JFPDA national workshop and at the NIPS 2013 international conference.
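As a reminder of the algorithm whose complexity is studied, the following is a minimal textbook-style sketch of Policy Iteration on a small finite MDP. The data layout (`P`, `R`), the iterative evaluation step and the two-state example are illustrative assumptions, not the setting analysed in the papers.

```python
def policy_iteration(P, R, gamma, tol=1e-10):
    """Policy Iteration for a finite MDP.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the expected immediate reward (assumed layout).
    """
    n_states = len(P)
    policy = [0] * n_states
    iterations = 0
    while True:
        iterations += 1
        # Policy evaluation: iterate the Bellman operator of the current policy.
        V = [0.0] * n_states
        while True:
            delta = 0.0
            for s in range(n_states):
                a = policy[s]
                v = R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break

        # Policy improvement: act greedily with respect to V.
        def q(s, a):
            return R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])

        new_policy = [max(range(len(P[s])), key=lambda a: q(s, a))
                      for s in range(n_states)]
        if new_policy == policy:
            return policy, V, iterations
        policy = new_policy

# Hypothetical two-state example: action 0 stays, action 1 switches state.
# Staying in state 1 yields reward 1; everything else yields 0.
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 1)], [(1.0, 0)]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]
policy, V, iterations = policy_iteration(P, R, gamma=0.9)
```

The number of outer iterations before the policy stops changing is the quantity that complexity bounds for Policy Iteration control.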
Victor Gabillon and Mohammad Ghavamzadeh are external collaborators (from the Inria Sequel EPI). Matthieu Geist is an external collaborator (from Supélec Metz).
We present here three results: the first is a unified review of algorithms that are used to estimate a linear approximation of the value of some policy in a Markov Decision Process; the second concerns the analysis of a class of approximate dynamic programming algorithms for large-scale Markov Decision Processes; the last is the successful application of similar dynamic programming algorithms to the Tetris domain.
In the framework of Markov Decision Processes, we have considered linear off-policy learning, that is, the problem of learning a linear approximation of the value function of some fixed policy from one trajectory possibly generated by some other policy. We have reviewed the on-policy learning algorithms of the literature (gradient-based and least-squares-based), adopting a unified algorithmic view. We have highlighted a systematic approach for adapting them to off-policy learning with eligibility traces. This led to some known algorithms and suggested new extensions. This work has recently been accepted to JMLR and should be published at the beginning of 2014.
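To make the setting concrete, here is a minimal sketch of one common form of off-policy TD(λ) with linear features and per-step importance weights. It is a generic illustration under stated assumptions (the transition format, the `phi` feature map, constant step size), not a reproduction of the algorithms reviewed in the paper. The small check at the end uses importance weights equal to 1, in which case the update reduces to ordinary on-policy TD(λ).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def off_policy_td_lambda(transitions, phi, n_features,
                         gamma=0.9, lam=0.5, alpha=0.1):
    """Linear TD(lambda) with importance-weighted eligibility traces.

    transitions: list of (s, rho, r, s_next), where rho is the ratio
    target_policy(a|s) / behaviour_policy(a|s) of the action taken (assumed format).
    phi: maps a state to its feature vector.
    """
    theta = [0.0] * n_features
    z = [0.0] * n_features       # eligibility trace
    prev_rho = 1.0               # irrelevant at the first step since z is zero
    for s, rho, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        # Decay the trace, correcting for the previous action's likelihood ratio.
        z = [gamma * lam * prev_rho * zi + fi for zi, fi in zip(z, f)]
        # Temporal-difference error of the current transition.
        delta = r + gamma * dot(theta, f_next) - dot(theta, f)
        theta = [th + alpha * rho * delta * zi for th, zi in zip(theta, z)]
        prev_rho = rho
    return theta

# Sanity check on a deterministic two-state cycle with one-hot features:
# state 0 -> state 1 (reward 0), state 1 -> state 0 (reward 1).
one_hot = lambda s: [1.0 if i == s else 0.0 for i in range(2)]
cycle = [(0, 1.0, 0.0, 1), (1, 1.0, 1.0, 0)] * 2000
theta = off_policy_td_lambda(cycle, one_hot, 2)
```

With these features the exact values are V(0) = 0.9 / 0.19 and V(1) = 1 / 0.19, which the learned weights approach.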
We have revisited the work of Bertsekas and Ioffe (1996), that
introduced
Tetris is a video game that has been widely used as a benchmark for various optimization techniques including approximate dynamic programming (ADP) algorithms. A look at the literature of this game shows that while ADP algorithms that have been (almost) entirely based on approximating the value function (value function based) have performed poorly in Tetris, the methods that search directly in the space of policies by learning the policy parameters using an optimization black box, such as the cross entropy (CE) method, have achieved the best reported results. We have applied an algorithm we proposed in the past, called classification-based modified policy iteration (CBMPI), to the game of Tetris. Our experimental results show that for the first time an ADP algorithm, namely CBMPI, obtains the best results reported in the literature for Tetris in both small
We consider decentralized control methods to operate autonomous vehicles at close spacings to form a platoon. We study models inspired by the flocking approach, where each vehicle computes its control from its local perceptions. We investigate different decentralized models in order to provide robust and scalable solutions. Open questions concern collision avoidance, stability and multi-platoon navigation.
In order to reduce the tracking error (i.e. the distance between each follower's path and the path of its predecessor), we developed both an innovative approach and a new lateral control law. This lateral control law reduces the tracking error faster than other existing control laws. An article presenting this control law, its integration with a previously defined secure longitudinal control law, and the experimental results obtained with it has been accepted to the 2014 IEEE International Conference on Robotics and Automation.
We addressed an important issue for intelligent transportation systems, namely the ability of vehicles to localize themselves safely and reliably within an a priori known road map network. For this purpose, we proposed an approach based on hybrid dynamic Bayesian networks, which makes it possible to implement, in a unified framework, two of the most successful families of probabilistic models commonly used for localization: linear Kalman filters and Hidden Markov Models. Combining these two models makes it possible to handle the multiple hypotheses and multi-modal observations that characterize map-matching problems, and it improves the integrity of the approach. Another contribution is a chained-form state space representation of the vehicle's evolution, which makes it possible to deal with the non-linearity of the odometry model used. Experimental results, using data from encoder sensors, a DGPS receiver and an accurate digital road map, illustrate the performance of this approach, especially in ambiguous situations.
The aim of the European project InTraDE is to propose more efficient ways to handle containers in seaports through the use of IAVs (Intelligent Autonomous Vehicles).
In his PhD thesis, Mohamed Tlig considers the displacements of numerous such IAVs whose routes are a priori planned by a supervisor. However, in such a large and complex system, different unexpected events can arise and degrade the traffic: failure of a vehicle, human mistake while driving, obstacle on roads, local re-planning, and so on.
After working on a simple decentralized strategy to allow two queues of vehicles to share a single lane (presented in 2012, and this year at AATMO-13), we have started looking at improving vehicle flows in complete road networks. In particular, we have proposed an approach that allows multiple flows of vehicles to cross an intersection without stopping, thereby reducing delays as well as energy consumption. Preliminary results have been presented (in French) at RJCIA-13, and more advanced work is under submission.
The next step is to coordinate the controller agents located in each of the network's intersections so as to create “green waves” that would improve the flows not just locally, but globally.
With LAR (Living Assistant Robot), a PIA project which started in March, Abdallah Dib joined our team for a PhD. His work concerns the development of a low-cost navigation system for a robot operating in an indoor environment. The main issue is to design a Simultaneous Localisation and Mapping algorithm that works in a dynamic environment in which people are moving. This is very challenging when the sensing capabilities of the robot are restricted to low-cost sensors such as an RGB-D camera. We also expect the robot to provide services similar to those described below: fall detection and activity recognition.
This work has been carried out during the ANR Cart-O-matic project. Antoine Bautin was hired by the Maia team as a PhD student for this project. The main objective of the project was to design and build a multi-robot system able to autonomously map an unknown building. This work has been done in the framework of a French robotics contest called Defi CAROTTE, organized by the General Delegation for Armaments (DGA) and the French National Research Agency (ANR). The scientific issues of this project deal with Simultaneous Localization And Mapping (SLAM), multi-robot collaboration and object recognition. The Maia team has been mainly involved in multi-robot collaboration and navigation.
Nassim Kaldé, a new PhD student, started last year in order to carry on the work done by Antoine Bautin. The new direction aims at addressing problems similar to those addressed in the Cart-O-matic project, but in dynamic environments, i.e. environments in which people move alongside robots. Another point that Nassim Kaldé will address is social navigation, which is important for robots and humans to coexist in a smart manner.
Yann Boniface (CORTEX Team, Loria) is an external collaborator.
In collaboration with the CORTEX team and supported by a M2R internship, many questions related to learning the control of a complex (mono)-agent system with a continuous sensori-motor space are explored. For several reasons, the classical framework of Reinforcement Learning is not easily used in that context:
the value function to be learned has to be encoded using features that are not known at start,
because of the richness of the sensori-motor space, a random exploration scheme is unlikely to find the rewarded states that are needed by the learning process,
exploiting what is learned is difficult as one would need to find the maximum of the value function while it is learned.
Our work is focused on a planar model of the human arm with 2 joints and 6 muscles (see figure). Control signals are the activity of the motor neurons that alter the length of the muscles, and thus the forces applied on the joints. This system is redundant but also highly non-linear, as many aspects of the model are described by non-linear differential equations (our model is a slight improvement over that of Li). The task to learn is to reach different positions from given starting points.
We have studied a developmental learning process with a simple muscle activation pattern. The idea is to start the learning process in an artificially reduced sensori-motor space (using rough perception and motor capacities) and slowly increase the size and complexity of this space when interesting behaviors are learned. Our approach gives results comparable to other developmental techniques and raises several important research questions. Our work showed that we need an abstraction mechanism in order to define or refine the features used in actions but also in perceptions. This is a very difficult challenge that is one of the keys to the understanding (and design) of cognition. There is also a need for stronger generalization capabilities in the function approximation used in the process.
In parallel, we are taking inspiration from the field of neuroscience, and in particular from the coupling between the cortex and the cerebellum in motor control. Models based on the work of Kaladjian should help us understand what control signals are used by the brain and how the learning of gestures is organized between these two regions. Our long-term goal is to design mechanisms for learning feature abstractions in the sensori-motor space while being guided by the improvement in behavioral performance.
This action is supported by the Inria IPL Personally Assisted Living (PAL), which gathers 9 Inria teams associated with 6 research partners (technological, medical or social) working together on three main themes: mobility assistance, assessment of the degree of frailty of persons, and analysis of home activities. The MAIA team is currently mainly involved in the latter two topics, plus fall detection.
Evaluation of the degree of frailty of the elderly. As argued in the well-known paper of Fried et al., estimating frailty is highly significant for evaluating the risk of falls, disability, hospitalization and mortality. The Maia team studies this issue with different sensing devices: single RGB-D cameras, networks of RGB-D cameras, and a sensing intelligent floor. One simple idea currently being developed in the team is to determine either the center of mass of a person using one or several Kinects, or the center of pressure and the location of footsteps using an intelligent floor. The idea is to infer from these simple measurements the walking speed, the step length and the position of the monitored persons.
People activity analysis. Following the activity of elderly people over long periods of time can be a good indicator of their well-being, but evaluating the behavior of a person at home is an open challenge.
To address this issue, we proposed this year an HMM-based model capable of following simple activities such as sitting, walking, etc. This model was evaluated in a real smart environment with 26 subjects who performed any of eight activities (sitting, walking, going up, squatting, lying on a couch, falling, bending and lying down). Seven of these eight activities were correctly detected, including falling, which was detected without false positives.
Fall detection. Falls are one of the major health issues affecting elderly people, especially at home. One of the objectives of the PhD work of Amandine Dubois is to design an automatic system to detect falls at home, which in its final version will be made up of a network of RGB-D sensors. A simple and robust method based on the identification and tracking of the center of mass of people moving in an indoor environment has been developed. Using a simple Hidden Markov Model whose observations are the position of the center of mass, its velocity and the general shape of the body, we can monitor the activity of a person with surprisingly high accuracy and thus detect falls with very good accuracy and without false positives. An experimental study, reported here, has been conducted in our smart apartment lab. 26 subjects were asked to perform a predefined scenario in which they took a set of eight postures. 2 hours of video (216,000 frames) were recorded for the evaluation, half of which was used for training the model. The system detected the falls without false positives. This result encourages us to use this system in real situations in order to better study its efficiency.
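The principle of monitoring activity with a Hidden Markov Model can be sketched with a basic forward filter. This is a simplified illustration: it assumes two hidden states, a single observation discretized into "high" and "low" (standing for the height of the center of mass), and made-up probabilities, whereas the model described above uses richer observations and more activities.

```python
def forward_filter(prior, transition, emission, observations):
    """HMM forward filtering: belief over hidden states after each observation.

    The prior is taken as the belief before the first observation, so a
    transition step is applied before every observation update.
    """
    belief = dict(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = {s2: sum(belief[s1] * transition[s1][s2] for s1 in belief)
                     for s2 in belief}
        # Correction: weight by the likelihood of the observation, then normalize.
        unnorm = {s: predicted[s] * emission[s][obs] for s in predicted}
        total = sum(unnorm.values())
        belief = {s: v / total for s, v in unnorm.items()}
        history.append(belief)
    return history

# Hypothetical parameters, chosen only for illustration.
prior = {"standing": 0.9, "fallen": 0.1}
transition = {"standing": {"standing": 0.95, "fallen": 0.05},
              "fallen": {"standing": 0.10, "fallen": 0.90}}
emission = {"standing": {"high": 0.9, "low": 0.1},
            "fallen": {"high": 0.1, "low": 0.9}}

history = forward_filter(prior, transition, emission,
                         ["high", "high", "low", "low"])
most_likely = [max(b, key=b.get) for b in history]
```

With these numbers a single "low" observation is not enough to switch the most likely state, while two consecutive ones are. This kind of temporal smoothing is one reason an HMM can help limit false positives compared with a per-frame threshold.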
We are also involved in the development of an innovative sensing device: a pressure-sensing floor with LED lighting, which provides a new way for people to interact with their environment. Sensitive or intelligent floors have attracted a lot of attention during the last two decades for different applications, ranging from interaction capture in immersive virtual environments to robotics, human tracking, fall detection or activity recognition. Different technologies have been proposed so far, based on optical fiber sensing, pressure sensing or electrical near field. In the Maia team, we have developed a more sophisticated approach in which both computation and sensing are distributed within the floor. This floor is made up of interconnected intelligent tiles which can communicate with each other, have internal computation power, sense the activity in the environment (through four weight sensors, an accelerometer and a magnetometer) and can interact with users, robots or other sensor networks, either by wireless or wired communication or through visual communication (each tile being equipped with 16 LEDs).
Several scientific challenges are open to us in the fields of decentralized spatial computing and in designing real application for assisting people suffering from loss of autonomy.
Some of these issues have been addressed this year. Mihai Andries, a PhD student, proposed two contributions demonstrating the relevance of an intelligent floor such as the one we have developed. The first contribution is about controlling a mobile robot through its interactions with the floor. The second, less developed, is about recognizing the activity of a person through his or her physical interaction with the floor. This approach has an important advantage compared to video-based activity recognition: people's privacy is guaranteed. Let us also mention the work of an intern who developed a gait evaluation algorithm using the variation over time of the center of pressure sensed by the floor when one or several persons walk on it.
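As an illustration of the kind of measurement involved, the center of pressure on a tile can be computed as the weight-averaged position of its sensors, and a walking speed can be derived from successive centers of pressure. The sketch below assumes a square tile of 0.6 m with one sensor at each corner and a fixed sampling period; these values and the function names are hypothetical and do not describe the actual tiles or the gait algorithm mentioned above.

```python
import math

def center_of_pressure(weights, positions):
    """Weight-averaged position of the sensor readings, or None if unloaded."""
    total = sum(weights)
    if total <= 0:
        return None
    x = sum(w * px for w, (px, _) in zip(weights, positions)) / total
    y = sum(w * py for w, (_, py) in zip(weights, positions)) / total
    return (x, y)

def mean_speed(cops, dt):
    """Average speed along a sequence of centers of pressure sampled every dt seconds."""
    distance = sum(math.hypot(x2 - x1, y2 - y1)
                   for (x1, y1), (x2, y2) in zip(cops, cops[1:]))
    return distance / (dt * (len(cops) - 1))

# Assumed sensor layout: four corners of a 0.6 m x 0.6 m tile.
CORNERS = [(0.0, 0.0), (0.6, 0.0), (0.0, 0.6), (0.6, 0.6)]

centered = center_of_pressure([10.0, 10.0, 10.0, 10.0], CORNERS)
on_corner = center_of_pressure([0.0, 40.0, 0.0, 0.0], CORNERS)
speed = mean_speed([(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)], dt=0.5)
```

A real system would also have to combine readings across neighbouring tiles and separate several walkers, which this sketch does not attempt.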
Fabien Flacher (Thales THERESIS) is an external collaborator.
In collaboration with Thales ThereSIS - SE&SIM Team (Synthetic Environment & Simulation), we focus on the problem of following the trajectories of several persons with the help of several controllable cameras. This problem is difficult since the set of cameras cannot cover simultaneously the whole environment, since some persons can be hidden by obstacles or by other persons, and since the behavior of each person is governed by internal variables which can only be inferred (such as his motivation or his hunger).
The approach we are working on is based on (1) POMDP formalisms to represent the state of the system (person and their internal states) and possible actions for the cameras, (2) a simulator provided and developed by Thales ThereSIS and (3) particle filtering approaches based on this simulator.
From a theoretical point of view, we are currently investigating how to use a deterministic simulator and to generate new particles in order to keep a good approximation of the posterior distribution.
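The difficulty with a deterministic simulator is that resampled particles that are copies of each other stay identical forever. One generic remedy, shown in the sketch below, is to add a small random jitter after each simulation step. This is only an illustration under assumptions: the one-dimensional `simulator`, the Gaussian observation model and the jitter scales are hypothetical, and this is not necessarily the approach under investigation.

```python
import math
import random

def simulator(state):
    """Stand-in deterministic simulator: a person moving at constant velocity."""
    x, v = state
    return (x + v, v)

def particle_filter(observations, n_particles=500, obs_sigma=0.5,
                    jitter=(0.1, 0.05), seed=0):
    """Bootstrap particle filter over (position, hidden velocity)."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0.0, 10.0), rng.uniform(-1.0, 2.0))
                 for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate through the deterministic simulator, then jitter so that
        # duplicated particles spread out again.
        moved = []
        for p in particles:
            x, v = simulator(p)
            moved.append((x + rng.gauss(0.0, jitter[0]),
                          v + rng.gauss(0.0, jitter[1])))
        # Weight by the likelihood of the observed position.
        weights = [math.exp(-0.5 * ((y - x) / obs_sigma) ** 2) for x, _ in moved]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles  # degenerate case: keep all particles
        # Multinomial resampling.
        particles = rng.choices(moved, weights=weights, k=n_particles)
        estimates.append((sum(x for x, _ in particles) / n_particles,
                          sum(v for _, v in particles) / n_particles))
    return estimates

# True trajectory: start at 2, velocity 1, observed positions at steps 1..20.
observations = [2.0 + t for t in range(1, 21)]
estimates = particle_filter(observations)
final_position, final_velocity = estimates[-1]
```

The velocity is never observed directly; it is recovered because particles with a wrong velocity drift away from the observed positions and are eliminated at resampling.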
Our research on emergent collective behavior focuses on the analysis of the robustness of discrete models of complex systems. We ask to what extent systems can withstand various perturbations in their definitions. We made progress in understanding how to tackle this issue in the case of cellular automata (CA) and multi-agent systems (MAS).
We proposed new definitions of asynchronism in lattice-gas cellular automata. Experimental work showed that observing an asynchronous version of a discrete model of swarm formation can help us gain insight into this well-studied model. The PhD thesis of O. Bouré provides a detailed view of this work.
A study on the density classification problem, a well-studied problem of consensus in cellular automata, was carried out for infinite systems in 1D and 2D and for infinite trees. Positive results were provided and important conjectures were raised.
We proposed a survey on asynchronous cellular automata and explained some of the difficulties in the classification of these objects.
In collaboration with colleagues from India, we proposed a complete characterisation of the reversibility of the set of the 256 Elementary Cellular Automata, which are known to be difficult to study in all generality. We also proposed a mathematical analysis of the second-order phase transitions that are observed in the simplest asynchronous cellular automata. We also coordinated a special issue on asynchronous cellular automata in the Natural Computing journal.
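For readers unfamiliar with asynchronous cellular automata, the sketch below shows one widely used updating scheme, often called α-asynchronous: at each step, every cell applies the local rule with probability α and otherwise keeps its state. The function names and the choice of a ring of cells are assumptions made for the example; the works cited above consider several other definitions of asynchronism.

```python
import random

def eca_rule(number):
    """Lookup table of an Elementary Cellular Automaton from its Wolfram number."""
    return {(a, b, c): (number >> (a * 4 + b * 2 + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def async_step(cells, rule, alpha, rng):
    """One alpha-asynchronous step on a ring of cells.

    Each cell is updated with probability alpha, always reading the
    configuration of the previous time step.
    """
    n = len(cells)
    new = list(cells)
    for i in range(n):
        if rng.random() < alpha:
            new[i] = rule[(cells[i - 1], cells[i], cells[(i + 1) % n])]
    return new

rng = random.Random(1)
rule90 = eca_rule(90)
start = [0, 0, 1, 0, 0]
synchronous = async_step(start, rule90, 1.0, rng)   # alpha = 1: classical update
frozen = async_step(start, rule90, 0.0, rng)        # alpha = 0: nothing changes
partial = async_step(start, rule90, 0.5, rng)       # about half the cells update
```

Varying α between 0 and 1 is a simple way to probe how robust a rule's behavior is to the loss of perfect synchrony.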
Laurent Ciarletta (Madynes team, LORIA) is an external collaborator.
Complex systems are present everywhere in our environment: the internet, electricity distribution networks, transport networks. These systems are characterized by a large number of autonomous entities, dynamic structures, different time and space scales, and emergent phenomena. The thesis work of Tomas Navarrete is centered on the problem of controlling such systems. The problem is defined as the need to determine, based on a partial perception of the system state, which actions to execute in order to avoid or favor certain global states of the system. This problem comprises several difficult questions: how to evaluate the impact at the global level of actions applied at a local level, how to model the dynamics of a heterogeneous system (different behaviors arise from different levels of interactions), and how to evaluate the quality of the estimations obtained through the modeling of the system dynamics.
We propose a control architecture based on an “equation-free” approach. We use a multi-agent model to evaluate the global impact of local control actions before applying the most pertinent set of actions.
A prototype of our architecture has been implemented in order to test its basic ideas on a simulated “free-riding” phenomenon in peer-to-peer file exchange networks. We have demonstrated that our approach makes it possible to drive the system to a state where most peers share files, even when the initial conditions would otherwise drive the system to a state where no peer shares. We have also performed experiments with different configurations of the architecture to identify ways to improve its performance.
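The principle of evaluating candidate control actions by simulation before applying one can be sketched as follows. Everything in this example is a hypothetical toy: the peer model (a peer shares when an incentive plus a social term exceeds its private cost), the `incentive` control action and the scoring function are assumptions made for illustration and do not reproduce the architecture or the experiments described above.

```python
def simulate(sharing, costs, incentive, horizon=10, social=0.4):
    """Toy multi-agent model of file sharing.

    At each step a peer shares iff incentive + social * (fraction of peers
    currently sharing) exceeds its private cost.
    """
    sharing = list(sharing)
    for _ in range(horizon):
        fraction = sum(sharing) / len(sharing)
        benefit = incentive + social * fraction
        sharing = [benefit > c for c in costs]
    return sharing

def choose_action(sharing, costs, candidates, score):
    """Simulate each candidate action and return the one with the best score."""
    return max(candidates,
               key=lambda a: score(simulate(sharing, costs, a)))

# Hypothetical population of 10 peers with increasing private costs,
# starting from a state where nobody shares.
costs = [0.05 + 0.1 * i for i in range(10)]
nobody = [False] * 10
fraction_sharing = lambda sharing: sum(sharing) / len(sharing)

outcomes = {a: fraction_sharing(simulate(nobody, costs, a))
            for a in (0.0, 0.2, 0.5)}
best = choose_action(nobody, costs, (0.0, 0.2, 0.5), fraction_sharing)
```

In this toy model the global outcome is not proportional to the local incentive, because sharing peers encourage others to share. This is the kind of effect that is hard to predict without running the model.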
This work helped us to better identify the key questions that arise when using the multi-agent paradigm in the context of the control of complex systems, concerning the relationship between the model entities and the target system entities.
Laurent Ciarletta and Yannick Presse (Madynes team, LORIA) are external collaborators.
Laurent Ciarletta is the co-advisor of the thesis of Julien Vaubourg.
Models of complex systems generally require different points of view (abstraction levels) at the same time in order to capture and understand all of their dynamics and complexity. Since a complex system consists of different interacting parts, its model also requires the joint and simultaneous use of modeling and simulation tools from different scientific fields.
We proposed the AA4MM meta-model, which addresses the core challenges of multi-modelling and simulation coupling from a homogeneous perspective. In AA4MM, we chose a multi-agent point of view: a multi-model is a society of models; each model corresponds to an agent and coupling relationships correspond to interactions between agents.
This year we have made progress in the definition of multi-level modeling. We identified several facets of multi-level modeling and implemented them as different kinds of interactions in the AA4MM framework. A demonstration of these different multi-level couplings has been developed on a collective motion phenomenon.
The MS4SG project, which involves MAIA, Madynes and EDF R&D and addresses smart-grid simulation, started in February. Julien Vaubourg started a PhD thesis in the MAIA team in October 2013 on confronting the AA4MM principles with the specificities of the smart-grid domain, seen as a kind of complex system.
Laurent Ciarletta and Yannick Presse (Madynes team, LORIA) are external collaborators.
The MS4SG (multi-simulation for smart grids) project is granted as a strategic action between Inria and EDF. This project is joint between the Inria teams Madynes and MAIA, and EDF R&D.
Smart grids are electric supply grids endowed with smart capabilities through the use of information and communication technologies. This perspective raises new challenges; in particular, one must re-think the way electricity is supplied to customers and the way the power supply network is regulated.
The simulation approach can deal with the supervision and regulation of these systems. Such an approach requires integrating simulators coming from different domains: electrical networks, communication networks and information systems. As these domains can influence each other, smart grids can be considered as a kind of complex system and we are faced with multi-modeling and multi-simulation issues; in particular, we must deal with the fact that the models used in the different simulators are not of the same kind (heterogeneous simulations) and that we must link and re-use existing simulators that were designed to work on their own.
The aim of the project is to provide primitives based on AA4MM in order to enable the multi-modeling and the multi-simulation of smart grids.
The economic mobilisation agency in Lorraine has launched a new project, SATELOR, providing it with 2.5 million euros of funding over 3 years, out of an estimated total of 4.7 million. The leader of the project is Pharmagest-Diatelic. PHARMAGEST is the French leader in computer systems for pharmacies, with a 43.5% share of the market, 9,800 clients and more than 700 employees, and is based in Nancy. Recently, the PHARMAGEST Group expanded its activities into e-health and the development of telemedicine applications. The SATELOR project will support its partners in developing new services for safely keeping at home elderly people with loss of autonomy or people with a chronic illness. The Maia team will play an important role in bringing research results, such as those presented earlier in this report, to an industrial level.
The PAL project is a national Inria Large Scale Initiative involving several teams of the institute (Arobas, Coprin, E-motion, Lagadic, Demar, Maia, Prima, Pulsar and Trio). It is coordinated by David Daney (Inria Sophia-Antipolis, EPI Coprin). The project focuses on the study of, and experimentation with, models for health and well-being. Maia is particularly involved in the People Surveillance work package, by studying and developing intelligent environments and distributed tracking devices for the analysis of people's walking and for robotic assistance (smart tiles, 3D camera network, assistant robots), as described in the corresponding sections of this report.
The PhD of Mihai Andries is funded by the PAL project.
Partners: Crédit Agricole, Diatelic, Robosoft
The LAR project aims at designing an assistant robot to improve the autonomy and quality of life of elderly and frail persons. The project started at the beginning of the year. The role of the Maia team is to develop a simultaneous localisation and mapping algorithm using an RGB-D camera. The main issue is to develop an algorithm able to deal with dynamic environments. Another issue is for the robot to behave with acceptable social skills.
Olivier Rochel, from SED, is an external collaborator. Moutie Chaider was hired as an IJD in 2012.
Percee, for “Perception Distribuée pour Environnements Intelligents”, is a project proposed by the Maia and Madynes teams and funded by Inria. This ADT (Action de Développement Technologique) supports our action in the PAL Inria National Scale Initiative (Personally Assisted Living, described above).
The project deals with the development and the study of intelligent homes. For two years we have been developing an experimental platform, the smart apartment. It allows us to study models and technology for life assistance (walk analysis with iTiles and camera networks, robotic assistants, health diagnosis, home automation functions, wireless communication inside the home).
In particular we develop a new tactile floor, which is the iTiles network. Two engineers are funded by the ADT: Moutie Chaider (IJD) and Olivier Rochel (Inria research engineer) for two years.
Dominique Martinez (Cortex team, Inria NGE) is an external collaborator and the coordinator of the project for Nancy members.
PHEROTAXIS is an “Investissements d’Avenir” ANR 2011-2014 (Coordination: J.-P. Rospars, UMR PISC, INRA Versailles).
The theme of the research is the localisation of odour sources by insects and robots. By associating experimental data with models, the project aims at defining a behavioral model of olfactory processes. This work has several applications, in particular the development of highly sensitive and selective bio-inspired components.
The project is organized in five work packages and involves the PISC research unit (Versailles), Pasteur Institute (Paris) and LORIA/Inria institute (Nancy).
Program: InterReg IV B
Project acronym: InTraDE
Project title: Intelligent Transportation for Dynamic Environment
Duration: 2010 - 2014
Coordinator: University of Science and Technology of Lille (Lille 1-LAGIS) (France)
Other partners: South East England Development Agency (United Kingdom), Centre Régional d’Innovation et de Transfert de Technologie – Transport et Logistique (CRITT TL) (France), AG Port of Oostende (AGHO) (Belgium), National Institute for Transport and Logistics, Dublin Institute of Technology (Ireland), Liverpool John Moores University (LOOM) (United Kingdom)
Abstract:
The InTraDE project (Intelligent Transportation for Dynamic
Environments, http://
The Maia team, as a partner, focuses on decentralized approaches to deal with the control of automated vehicle platooning and the adaptation of the traffic. Maia is funded with two PhD fellowships and one engineer. Both PhD theses started at the end of 2010. The PhD of Jano Yazbeck, supervised by F. Charpillet and A. Scheuer, aims at studying a “Secure and robust immaterial hanging for automated vehicles” (see above). The PhD of Mohamed Tlig, supervised by O. Simonin and O. Buffet, addresses “Reactive coordination for traffic adaptation in large situated multi-agent systems” (see above).
Dr. Iadine Chadès, Research Scientist at CSIRO, Ecosystem Sciences division (Brisbane, Australia), visited MAIA for 1 week in July 2013.
François Charpillet is a member of the scientific council of the Robotic GDR.
Amine Boumaza was a member of the program committee of GECCO'13 (The Genetic and Evolutionary Computation Conference) and CEC'13 (Congress on Evolutionary Computation). He was a reviewer for ANR jeune chercheur programme blanc.
Olivier Buffet is a member of the editorial boards of the “revue d'intelligence artificielle” (RIA), and the “Journal of Artificial Intelligence Research” (JAIR).
Olivier Buffet was a reviewer for the journals: AIJ (Artificial Intelligence Journal), JAIR (Journal of Artificial Intelligence Research), RIA (Revue d'Intelligence Artificielle); for the conferences AAAI'13 (National Conference on Artificial Intelligence), ICAPS'13 (International Conference on Automated Planning and Scheduling), IJCAI'13 (International Joint Conference on Artificial Intelligence), JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes); and for the workshop WRLComp'13 (ICML Workshop on the 2013 Reinforcement Learning Competition).
Vincent Chevrier was a reviewer for the journal Acta Biotheoretica (ACBI), and for the conferences IJCAI'13 (International Joint Conference on Artificial Intelligence) and JFSMA'13 (Journées francophones sur les Systèmes multi-Agents).
Vincent Chevrier was an expert for the following project calls: ANR Blanc Program and Digiteo, and for the best PhD thesis prize of the AFIA.
François Charpillet was in the program committees of ICAART, MSDM, WACAI, the 2013 IEEE/RSJ IROS Workshop “Assistance and Service Robotics in a Human Environment” and has also reviewed papers for IROS, ICRA, and RIA.
Alain Dutech was a reviewer for the journals JMLR (Journal of Machine Learning Research) and RIA (Revue d'Intelligence Artificielle), and for the conferences IJCAI'13 (International Joint Conference on Artificial Intelligence) and JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
Nazim Fatès co-organised WPCA'13, the First European workshop on probabilistic cellular automata, held in Eindhoven, June 10-12, 2013. He was a reviewer for the following journals: Theoretical Computer Science, Physica D, Journal of Cellular Automata. He was a program committee member of the following conferences: AUTOMATA'13 (Workshop on cellular automata), CAAA'13 (cellular automata Algorithms & Architectures), SCW'13 (spatial computing workshop). He was an external referee for the conference CiE'13 (Computation in Europe). He was invited to give a talk and a tutorial at AUTOMATA'13.
Bruno Scherrer was a reviewer for the journals Mathematics of Operations Research, Annals of Operations Research, Applied Mathematics and Optimization, and Transactions on Neural Networks and Learning Systems, for the international conferences ICML'13 (International Conference on Machine Learning) and NIPS'13 (Neural Information Processing Systems), and for the national conference on learning CAP'13 (Conférence Francophone sur l'Apprentissage Automatique).
Bruno Scherrer was invited to give a talk at the Reinforcement Learning seminar of the Gatsby Unit (University College London).
Alexis Scheuer was a reviewer for the journals Artificial Intelligence (Elsevier), Robotics and Autonomous Systems (Elsevier) and IEEE Transactions on Robotics, as well as for the conference ICRA'14 (IEEE International Conference on Robotics and Automation).
Olivier Simonin has co-organized the 2013 IEEE/RSJ IROS Workshop “Assistance and Service Robotics in a Human Environment”. He is a member of the program committees of ICINCO'13, ICAART'13, and national conferences CAR2013 and JFSMA'13. He was a reviewer for the following international journals, IEEE Transactions on Systems, Man and Cybernetics: Systems, International Journal of Advanced Robotic Systems, and the conference IEEE ICRA'13.
Vincent Thomas was a reviewer for the journal IEEE Transactions on Cybernetics and for the conference JFPDA'13 (Journées Francophones sur la Planification, la Décision et l'Action pour le contrôle de systèmes).
PhD & Master level: Amine Boumaza, Métaheuristiques, 15h eq TD., Master 1 Computer Science, Université de Lorraine, France.
Master level: Alain Dutech, Numerical Learning in AI, 15h eq TD, Master 1 Cognitive Sciences, Université de Lorraine, France.
PhD & Master level: Nazim Fatès, cellular automata, four lectures for a complex systems doctoral module, Universidad Adolfo Ibáñez, Santiago, Chile.
PhD & Master level: Alexis Scheuer & Olivier Simonin, Introduction to Mobile Robotics, 30h eq TD., Master 1 Computer Science, Université de Lorraine, France.
PhD & Master level: Alexis Scheuer & Olivier Simonin, Additional Mobile Robotics, 30h eq TD., Master 2 Computer Science, Université de Lorraine, France.
Master level: Olivier Simonin, Artificial Life, 15h eq TD., last year course (equivalent to Master 2), Ecole Supérieure d'Électricité de Metz, France.
PhD & Master level: Vincent Thomas, Métaheuristiques, 15h eq TD., Master 1 Computer Science, Université de Lorraine, France.
PhD & Master level: Vincent Thomas, Optimisation et systèmes dynamiques stochastiques, 22h eq TD., Master 2 Computer Science, Université de Lorraine, France.
Master level: Vincent Thomas, Agents Intelligents, 25h eq TD., Master 1 Cognitive Sciences, Université de Lorraine, France.
Master level: Vincent Thomas, Serious Game and Game Design, 15h eq TD., Master 2 Cognitive Sciences, Université de Lorraine, France.
PhD: Mauricio Araya, “Near-Optimal Algorithms for Sequential Information-Gathering Decision Problems”, Université de Lorraine, Feb. 4th, F. Charpillet (advisor), O. Buffet, V. Thomas.
PhD: Antoine Bautin, “Stratégie d'exploration multirobot fondée sur le calcul de champs de potentiels”, Université de Lorraine, Oct. 3rd, F. Charpillet (advisor), O. Simonin.
PhD: Olivier Bouré, “'Le simple est-il robuste ?' Une étude de la robustesse des systèmes complexes à travers les automates cellulaires”, Université de Lorraine, Sep. 13th, Nazim Fatès, Vincent Chevrier (advisor).
PhD: Lucie Daubigney, “Gestion de l'Incertitude pour l'Optimisation de Systèmes Interactifs”, Université de Lorraine, Oct. 1st, Olivier Pietquin, Alain Dutech (advisor).
PhD in progress: Mihai Andries, “Calcul spatialisé pour l'assistance à la personne: étude d'un réseau de dalles intelligentes”, Université de Lorraine, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress: Benjamin Camus, “Un laboratoire virtuel pour la multi-modélisation”, Université de Lorraine, Oct. 2012, Christine Bourjot, Vincent Chevrier (advisor).
PhD in progress: Amandine Dubois, “Assistance à la personne en perte d'autonomie: étude de l'apport d'un réseau de Kinects à la détection et la prévention des chutes”, Université de Lorraine, Oct. 2011, F. Charpillet (advisor).
PhD in progress: Abdallah Dib, “Assistance à la personne en perte d'autonomie: étude de l'apport d'un robot compagnon”, Université de Lorraine, Mar. 2013, F. Charpillet (advisor).
PhD in progress: Arsène Fansi Tchango, “Suivi multi-caméra en environnement partiellement observé”, Université de Lorraine, Oct. 2011, A. Dutech (advisor), O. Buffet, V. Thomas.
PhD in progress: Nassim Kaldé, “Exploration et reconstruction d’un environnement inconnu par une flottille de robots”, Université de Lorraine, Oct. 2012, F. Charpillet (advisor), O. Simonin.
PhD in progress: Manel Tagorti, “Approximating the Value Function for Heuristic Search in Factored MDPs”, Université de Lorraine, Nov. 2011, J. Hoffmann (advisor), B. Scherrer, O. Buffet.
PhD in progress: Mohamed Tlig, “Reactive coordination for traffic adaptation in large situated multi-agent systems”, Université de Lorraine, Dec. 2010, O. Simonin (advisor), O. Buffet.
PhD in progress: Julien Vaubourg, “Multi-modélisation, multi-simulation dans le cadre des Smart-grids”, Université de Lorraine, Oct. 2013, Laurent Ciarletta, Vincent Chevrier (advisor).
PhD in progress: Jano Yazbeck, “Secure and robust immaterial hanging for automated vehicles”, Université de Lorraine, Oct. 2010, F. Charpillet (advisor), A. Scheuer.
Amine Boumaza was a member of the PhD committee of
Charles Olion, October 18th 2013, ISIR/UPMC.
Olivier Buffet was a member of the PhD committee of
Caroline Ponzoni Carvalho Chanel, 12th Apr. 2013, Université de Toulouse / ISAE, and
Adrien Couëtoux, 30th Sept. 2013, Université Paris Sud.
Vincent Chevrier was a member of the PhD committee of
Yishuai Lin, 10th Sept. 2013, Université de Technologie de Belfort-Montbéliard, as a referee; and
Inaya Lahoud, 10th Sept. 2013, Université de Technologie de Belfort-Montbéliard, as a committee member.
François Charpillet was a member (as a referee) of the PhD committee of:
Jean-Baptiste Soyez, 3rd December, Lagis/Université de Lille 1
Asma Azim, 17th December, LIG, Université de Grenoble
Nicolas Coté, 10th December, GREYC, Université de Caen
Rémy Guyonneau, 19th November, LISA, Université d'Angers
Zhaoxia Peng, Lagis, Université de Lille
Xuan Son Nguyen, 28th Jun., LIP6, Université Pierre et Marie Curie, Paris 6
Caroline Ponzoni Carvalho Chanel, 12th Apr., ONERA, Université de Toulouse/ISAE
Pedro Chahuara Quispe, 27th March, LIG, Université de Grenoble
Joni Pajarinen, 7th February, Aalto University
François Charpillet was a member of the HdR committee of:
Frank Gechter, 4th December, SET, UTBM-Université de Franche-Comté
Stephane Galland, 11th December, SET, UTBM-Université de Franche-Comté
Alain Dutech was a member of the PhD committee of
Tony Pinville, 30th January 2013, Université Paris 6 (referee),
Mahuna Akplogan, 15th May 2013, INRA Toulouse (referee),
Jean-Baptiste Hoock, 12th June 2013, Université Paris 11 (referee),
Joseph El-Gemayel, 25th June 2013, Université de Toulouse 1 (referee),
Lucie Daubigney, 1st October 2013, Université de Lorraine.
Nazim Fatès was a member of the PhD committee of Markus Redeker, June 17th, 2013, Int. Center Unconventional Computing, Bristol University, UK.
Olivier Simonin was a member of the PhD committee of
Benjamin Cogrel, 18th November 2013, Univ. Paris Est Créteil,
Rémy Guyonneau, 19th November 2013, Université d'Angers,
Nicolas Coté, 10th December 2013, Univ. de Caen,
Jean-Baptiste Soyez, 3rd December 2013, Univ. de Lille.
Olivier Simonin was a member (as a referee) of the PhD committee of
Jorge Rios Martinez, 8th January 2013, LIG-Inria Rhône-Alpes Grenoble,
Cyril Poulet, 23rd April 2013, UPMC LIP6,
Yassine Gangat, 27th August 2013, Ile de la Réunion,
Feirouz Ksontini, 13th November 2013, Univ. de Valenciennes,
Madeleine El Zaher, 22nd November 2013, Univ. Technologique de Belfort-Montbéliard,
Nicolas Carlesi, 19th December, Univ. Montpellier 2.
François Charpillet was a member of the "Specialist committee" in Université de Grenoble.
Olivier Buffet was a member of the "Specialist committee" in Université de Caen Basse-Normandie.
Alain Dutech was a member of the “Specialist committee” in Université Paris 13.
François Charpillet participated in the popularization of the robotics and ambient intelligence activities of the MAIA team.
The popularization article by Alain Dutech, Bruno Scherrer and Christophe Thiéry on Reinforcement Learning for the game of Tetris, first published on Interstices, was revised and republished on Images des Mathématiques.
Vincent Thomas organized, in collaboration with the "Bibliothèque Universitaire du Campus Lettres", an exhibition "jeux: les ateliers de la pensée" whose main objective is to promote games as an interesting subject for academics. This exhibition included two days of popularization seminars with specialists from several scientific fields (economics, psychology, computer science) and hands-on activities. See http://ticri.inpl-nancy.fr/wicri-lor.fr/index.php?title=Exposition_Jeu. The exhibition material will be presented in several libraries of the Université de Lorraine in the first half of 2014.
Vincent Thomas participated in “Journée ISN-EPI” (Jun. 27 2013), an event for secondary school computer science teachers, where he gave a presentation on "games and artificial intelligence" and organized a workshop on the same subject. Vincent Thomas also takes part in the LORIA IDEES group dedicated to teaching activities http://idees.loria.fr/index.php?n=Main.ProgrammeJourneeISN-EPI.
Vincent Thomas participated in “Dans les coulisses d'un labo d'informatique” (Mar. 21 2013) by running an activity on “artificial intelligence and video games” for secondary school students https://iww.inria.fr/NanSciNum/dans-les-coulisses-dinria-nancy-grand-est-2/.
Vincent Thomas and Olivier Buffet participated in “Fête de la Science” by organizing a workshop on board games and artificial intelligence (Oct. 10 and 12). See https://iww.inria.fr/NanSciNum/un-ticet-pour-la-science/.
Nazim Fatès moderated two debates following screenings of the film Codebreaker, a “biopic” about Alan Turing: in January at the Lycée Jean-Moulin in Forbach, and in November at the main cinema of Saint-Dié-des-Vosges (organized by “Festival du film de chercheur”).
Christine Bourjot, Alain Dutech and Nazim Fatès participated in a debate on artificial intelligence at the café “l'Irlandais”, with an audience made up mainly of university students.