Data Mining (DM), acknowledged to be one of the main ten challenges of the 21st century
DM and ML problems can be set as optimization problems, thus leading to two possible approaches. Note that this alternative has been characterized by H. Simon (1982) as follows. In complex real-world situations, optimization becomes approximate optimization since the description of the real-world is radically simplified until reduced to a degree of complication that the decision maker can handle. Satisficing seeks simplification in a somewhat different direction, retaining more of the detail of the real-world situation, but settling for a satisfactory, rather than approximate-best, decision.
The first approach is to simplify the learning problem to make it tractable by standard statistical or optimization methods. The alternative approach is to preserve as much as possible the genuine complexity of the goals (yielding “interesting” models, accounting for prior knowledge): more flexible optimization approaches are therefore required, such as those offered by Evolutionary Computation.
Symmetrically, optimization techniques are increasingly used in all scientific and technological fields, from optimum design to risk assessment. Evolutionary Computation (EC) techniques, mimicking the Darwinian paradigm of natural evolution, are stochastic population-based dynamical systems that are now widely known for their robustness and flexibility, handling complex search spaces (e.g. mixed, structured, constrained representations) and non-standard optimization goals (e.g. multi-modal, multi-objective, context-sensitive), beyond the reach of standard optimization methods.
The price to pay for such properties of robustness and flexibility is twofold. On the one hand, EC is tuned, mostly by trial and error, using quite a few parameters. On the other hand, EC generates massive amounts of intermediate solutions. It is suggested that the principled exploitation of preliminary runs and intermediate solutions, through Machine Learning and Data Mining techniques, can offer sound ways of adjusting the parameters and finding shortcuts in the trajectories in the search space of the dynamical system.
The overall goals of the project are to model, to predict, to understand, and to control physical or artificial systems. The central claim is that Learning and Optimization approaches must be used, adapted and integrated in a seamless framework, in order to bridge the gap between the system under study on the one hand, and the expert's goal as to the ideal state/functionality of the system on the other hand.
Specifically, our research context involves the following assumptions:
The systems under study range from large-scale engineering systems to physical or chemical phenomena, including robotics and games. Such systems, sometimes referred to as complex systems, can hardly be modeled based on first principles due to their size, their heterogeneity and the incomplete information aspects involved in their behavior.
Such systems can be observed; indeed selecting the relevant observations and providing a reasonably appropriate description thereof is part of the problem to be solved. A further assumption is that these observations are sufficient to build a reasonably accurate model of the system under study.
The available expertise is sufficient to assess the system state, and any modification thereof, with respect to the desired states/functionalities. The assessment function is usually not a well-behaved function (differentiable, convex, defined on a continuous domain, etc.), which rules out standard optimization approaches and makes Evolutionary Computation a better suited alternative.
In this context, the objectives of TAO are threefold:
using Evolutionary Computation (EC) and more generally Stochastic Optimization to support Machine Learning (ML);
using Statistical Machine Learning to support Evolutionary Computation;
investigating integrated ML/EC approaches on diversified and real-world applications.
DAEYAHSP Winner of the Seventh International Planning Competition (Deterministic Temporal Satisficing track) at ICAPS 2011.
MoGo realized 20 wins out of 20 games in 7x7 Go against 10 different professional players. This is further documented in .
Results in (just accepted) prove that the Nash equilibrium of two-player zero-sum partially observable games is undecidable. This fundamental result notably contradicts published decidability results, which used as a decidability criterion a definition which is not equivalent to optimal play in the Nash sense.
This section describes TAO's main research directions, first presented during TAO's evaluation in November 2007. Four strategic issues had been identified at the crossroads of Machine Learning and Evolutionary Computation:
Issue | Question | Research theme
Where | What is the search space and how to search it? | Representations, Navigation Operators and Trade-offs
What | What is the goal and how to assess the solutions? | Optimal Decision under Uncertainty
How.1 | How to bridge the gap between algorithms and computing architectures? | Hardware-aware software and Autonomic Computing
How.2 | How to bridge the gap between algorithms and users? | Crossing the chasm
Six Special Interest Groups (SIGs) have been defined in TAO, investigating the above complementary issues from different perspectives. The comparatively small size of the TAO SIGs enables in-depth and lively discussions; the fact that all TAO members belong to several SIGs, on the basis of their personal interests, reinforces the strong and informal collaboration between the groups and the fast dissemination of information.
The choice of the solution space is known to be the crux of both Machine Learning (model selection) and Evolutionary Computation (genotypic-phenotypic mapping).
The first research theme in TAO thus concerns the definition of an adequate representation, or search space
Expressiveness/compacity trade-off (static property):
Stability/versatility trade-off (dynamic property): while most modifications of a given solution in
This research direction is investigated in:
the Complex System SIG (section ) focusing on developmental representations for Design and sequential representations for Temporal Planning;
the Large and Deep Networks SIG (section ) considering deep or stochastic Neural Network Topologies;
the Continuous Optimization SIG (section ), concerned with adaptive representations.
Benefiting from the MoGo expertise, TAO investigates several extensions of the Multi-Armed Bandit (MAB) framework and the Monte-Carlo tree search. Some main issues raised by optimal decision under uncertainty are the following:
Regret minimization and any-time behavior.
The any-time issue is tightly related to the scalability of Optimal Decision under Uncertainty; typically, MAB was found better suited than standard Reinforcement Learning to large-scale problems as its criterion (the regret minimization) is more amenable to fast approximations.
Dynamic environments (non stationary reward functions).
The dynamic environment issue, first investigated in TAO through the On-line Trading of Exploration vs Exploitation Challenge
Use of side information / Multi-variate MAB
The use of side information by MAB is meant to exploit prior knowledge and/or complementary information about the reward. Typically in MoGo, the end of the game can be described at different levels of precision (e.g., win/lose, difference in the number of stones); estimating the local reward from the available side information aims at better robustness.
Bounded rationality.
The bounded rationality issue actually covers two settings. The first one considers a number of options which is large relative to the time horizon, meaning that only a sample of the possible actions can be considered in the allotted time. The second one deals with a finite unknown horizon, as is the case for the Feature Selection problem.
Multi-objective optimization.
Many applications actually involve antagonistic criteria; for instance autonomous robot controllers might simultaneously want to explore the robot environment, while preserving the robot integrity. The challenge raised by Multi-objective MAB is to find the “Pareto-front” policies for a moderately increased computational cost compared to the standard mono-objective approach.
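The notion of a Pareto front used above can be made concrete with a small sketch. It is only an illustration: the two objectives (exploration and integrity), the policy scores and the function names are hypothetical, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (exploration, integrity) scores of five robot controllers.
policies = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9), (0.2, 0.3)]
front = pareto_front(policies)
print(front)
```

This brute-force filter is quadratic in the number of candidate policies, which is acceptable for an illustration but says nothing about the cost of finding such policies with a bandit algorithm.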
This research direction is chiefly investigated by the Optimal Decision Making SIG (section ), in interaction with the Complex System and the Crossing the Chasm SIGs (sections and ).
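As an illustration of the regret-minimization criterion discussed in this section, the following minimal sketch runs the classical UCB1 rule on Bernoulli arms. It is a textbook baseline rather than any of the TAO extensions; the arm means, the horizon and the function name are assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, cumulative pseudo-regret)."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    best_mean = max(arm_means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: play every arm once
        else:
            # empirical mean plus an exploration bonus that shrinks with the pull count
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best_mean - arm_means[arm]
    return counts, regret


counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, round(regret, 1))
```

The any-time character of the approach shows in the fact that the loop can be stopped at any step and still return a usable estimate of the best arm.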
Historically, the advent of parallel architectures only marginally affected the art of programming; the main focus has been on how to rewrite sequential algorithms to make them parallelism-compliant. The use of distributed architectures however calls for a radically different programming style/computational thinking, seamlessly integrating:
computation: aggregating the local information available with any information provided by other nodes;
communication: building abstractions of the local node state, to be transmitted to other nodes;
assessment: modeling other nodes in order to modulate the exploitation (respectively, the abstraction) of the received (resp. emitted) information.
Message passing algorithms such as PageRank or Affinity Propagation are prototypical examples of distributed algorithms. The analysis shifts from the static properties (termination and computational complexity) to the dynamic properties (convergence and approximation) of the algorithms, following the guiding principles of complex systems.
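To illustrate the shift from termination to convergence, here is a minimal sketch of PageRank written as an iterated local update: each node sends an equal share of its current score to the nodes it links to, and the loop stops when the scores stabilize. The toy graph, the damping value and the tolerance are assumptions for the example, and the updates are simulated centrally rather than on actual distributed nodes.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """links maps each node to the list of nodes it points to; returns node -> score."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # score held by nodes without outgoing links is spread uniformly
        dangling = sum(rank[v] for v in nodes if not links[v])
        incoming = {v: 0.0 for v in nodes}
        for v in nodes:
            for w in links[v]:
                incoming[w] += rank[v] / len(links[v])  # "message" from v to w
        new_rank = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in nodes
        }
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:  # convergence criterion rather than a fixed step count
            break
    return rank


# Hypothetical four-node graph; every link target must also appear as a key.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
print({v: round(s, 4) for v, s in scores.items()})
```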
Symmetrically, modern computing systems are increasingly viewed as complex systems of their own, due to their ever increasing resources and computational load. The pressing need for scalable administration tools, supporting grid monitoring and maintenance of the job running process, paved the way toward Autonomic Computing. Autonomic Computing (AC) systems are meant to feature self-configuring, self-healing, self-protecting and self-optimizing skills. A key milestone for Autonomic Computing is to provide the system with a phenomenological model of itself (self-aware system), built from the system logs using Machine Learning and Data Mining.
This research direction is investigated in the Complex System SIG (section ) and in the Autonomic Computing SIG (section ).
This fourth strategic priority, inspired by Moore's book, is motivated by the fact that many outstandingly efficient algorithms never make it out of research labs. One reason is the difference between the software vendor's and the programmer's views of algorithms. From the perspective of software vendors, an algorithm is best viewed as a single “Go” button. The programmer's perspective is radically different: as he/she sees that various functionalities can be implemented on the same algorithmic core, the number of options steadily increases (with the consequence that users usually master less than 10% of the available functionalities). Independently, the programmer gradually acquires some idea of the flexibility needed to handle different application domains; this flexibility is most usually achieved by defining parameters and tuning them. Parameter tuning thus becomes a barrier to the efficient use of new algorithms.
This research direction is chiefly investigated by the Crossing the Chasm SIG (section ) and also by the Continuous Optimization SIG (section ).
Since its creation, TAO's mainstream applications have concerned Numerical Engineering, Autonomous Robotics, and Control and Games. Two new fields of applications, due to the arrival of Cécile Germain (Pr UPS, 2005), Philippe Caillou (MdC, 2005), Balázs Kégl (CR CNRS LAL, 2006) and Cyril Furtlehner (CR INRIA, 2007), have been considered: Autonomic Computing and Complex Systems.
Numerical Engineering still is a major source of applications. The successful OMD (Optimization Multi-Disciplinaire) RNTL/ANR project is being continued by OMD2, started in July 2009. Collaborations with IFP and with the automobile manufacturer PSA respectively led to Zyed Bouzarkouna's and Mouadh Yagoubi's CIFRE PhDs. TAO leads the Work Package “Optimization” in the System@atic CSDL project, responsible for both fundamental research on surrogate models in multi-objective optimization and the setup of a software platform, which led to Ilya Loshchilov's PhD work. A collaboration with CEA DM2S was conducted as a Digiteo project and led to Philippe Rolet's PhD on simplified models.
Autonomous Software Robotics is rooted in our participation in the SYMBRION European IP and SyDiNMaLaS (ANR-JST, coll. University of Kyushu). On this topic, Jean-Marc Montanier started his PhD in Sept. 2009; Vladimir Skortsov did his post-doc from Sept. 2009 to Sept. 2010; Weijia Wang and Riad Akrour started their PhDs in Sept. 2010. See Section .
Our activity in Control and Games is chiefly visible through MoGo, already mentioned in the Highlights. Another application concerns Brain Computer Interfaces: the Digiteo project Digibrain (coll. with CEA List and Neurospin), with Cedric Gouy-Pailler's postdoc from October 2009 to October 2010.
Applications related to Autonomic Computing became an important part of TAO activities, led by Cécile Germain and Balázs Kégl in tight collaboration with the Laboratoire de l'Accélérateur Linéaire (section ). Applications related to Social Systems are led by Philippe Caillou and Cyril Furtlehner, respectively investigating multi-agent models for the labor market, and road traffic models (ANR project TRAVESTI, coordinated by C. Furtlehner, started in 2009). Last but not least, the arrival in TAO of Jamal Atif brought a new application field in image analysis and understanding.
MoGo and its Franco-Taiwanese counterpart MoGoTW are Monte-Carlo Tree Search programs for the game of Go, which reached several milestones of computer-Go in the past (first wins against professional players in 19x19; first win with the disadvantageous side in 9x9 Go); MoGo has had new developments as follows:
A Meta-MCTS module (inspired by the collaboration with Tristan Cazenave in the ANR EXPLO-RA project), which provided both a huge opening book in 9x9 and an approximate solving of 7x7 Go .
Following the “poolRave” modification, introduction of machine learning and statistics into MCTS, such as:
Bernstein Races (for offline educating Monte-Carlo simulations).
Variants of Go: we tested variants of Go, in particular blind variants; the results suggest that in such frameworks playing theoretically suboptimal moves helps a lot, because such unnatural moves are harder to memorize. A preliminary related publication is ; some additional results are to be published. Another interesting variant is random-Go, starting from a randomly generated board; such situations are much harder for humans, and, interestingly, our program was competitive against a 6D player (ranked 4th in a world amateur championship and former French champion) on a 19x19 board.
MoGo's development team was awarded the 2010 ChessBase award for the best contribution to Computer-Games.
The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is one of the most powerful continuous domain evolutionary algorithms. The CMA-ES is considered state-of-the-art in continuous domain evolutionary computation.
COCO (COmparing Continuous Optimizers) is a platform for systematic and sound comparisons of real-parameter global optimizers. COCO provides benchmark function testbeds (noiseless and noisy) and tools for processing and visualizing data generated by one or several optimizers. The code for processing experiments is provided in Matlab and C. The post-processing code is provided in Python. The code has been improved and used for the GECCO 2009 and 2010 workshops on “Black Box Optimization Benchmarking” (BBOB) (see Section ), and will serve as a basis for the test platform in the CSDL project.
The Grid Observatory software suite collects and publishes traces of the EGI (European Grid Initiative) grid usage. With the release and extensions of its portal, the Grid Observatory has made a database of grid usage traces available to the wider computer science community. These data are stored on the grid, and made accessible through a web portal without the need of grid credentials. More than 100 users are currently registered. The GO is supported by an INRIA ADT (Action de Développement Technologique).
In 2011, the suite was extended to energy consumption. The first barrier to improved energy efficiency of IT systems is the lack of large-scale collections of experimental data. The Green Computing Observatory (GCO), part of the GO initiative, monitors a large computing center (Laboratoire de l'Accélérateur Linéaire - LAL) within the EGI grid, and publishes the data through the Grid Observatory. The GCO is supported by the CNRS PEPS program, and by University Paris-Sud through the MRM (Moyens de Recherche Mutualisés) program.
Portal site:
http://
Within the classical objectives of Autonomics (self-*), two transversal lines of research have emerged.
Most existing work on modeling the dynamics of grid behavior assumes a steady-state system and concludes that the associated time series exhibit some form of long-range dependence (slowly decaying correlation). But the physical (economic and sociological) processes governing the grid behavior invalidate the stationarity hypothesis. proposes a categorization of the methods that integrate non-stationarity into grid modeling. considers a specific class of models: a sequence of stationary processes separated by breakpoints. The model selection question then becomes identifying the breakpoints and fitting the processes in each segment, together with a validation methodology that empirically addresses the current lack of theoretical results concerning the quality of the estimated model parameters. Even when stationarity is acceptable, the Markovian assumption might be too bold. integrate Echo State Network-based regression into reinforcement learning in continuous state space for fitting the Q function, with application to reactive grid scheduling.
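The idea of a sequence of stationary processes separated by breakpoints can be sketched in its simplest form: a single breakpoint, with each segment modeled by a constant mean. This is a deliberately reduced illustration, not the model class or the validation methodology of the cited work; the load values and the function name are hypothetical.

```python
def best_breakpoint(series, min_len=2):
    """Return (index, cost) of the split series[:index] / series[index:]
    that minimizes the summed squared deviation from each segment mean."""

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_idx, best_cost = None, float("inf")
    for k in range(min_len, len(series) - min_len + 1):
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_idx, best_cost = k, cost
    return best_idx, best_cost


# Hypothetical load trace with a regime change after the fifth sample.
load = [10, 11, 9, 10, 10, 30, 31, 29, 30, 30]
idx, cost = best_breakpoint(load)
print(idx, cost)
```

Handling several breakpoints or richer per-segment processes would require a penalized search over segmentations, which is where the model selection question mentioned above arises.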
In order for an autonomic system to continuously infer knowledge from its monitoring loop (the so-called MAPE-K loop, Monitor-Analyze-Plan-Execute-Knowledge), heterogeneous sources of data have to be integrated. exemplifies two use cases of the Grid Observatory data on evaluating the performance of the major EGI scheduler, and blackhole detection.
The Green Computing Observatory data include the detailed monitoring of the processors and motherboards, as well as the global site information, such as overall consumption and overall cooling. The data schema for publication is grounded in an ontology of measurements developed in collaboration with the MIS (Modélisation, Information et Systèmes) laboratory of University Picardie Jules Verne.
proposes a new approach for analyzing behavioral traces: as most of them are indeed text documents, state of the art techniques in text mining, including Latent Dirichlet Allocation, can be exploited. The advantages are twofold: providing some level of explanation inferred from the data; and a relatively scalable way to capture the temporal variability of the behavior of interest, while retaining the full dimensionality of the problem at hand. A promising perspective for combining this approach and inferred segmentation has been identified and is currently explored.
Divide And Evolve (DAE) solves AI-planning problems by using an Evolutionary Algorithm to sequentially divide them into hopefully simpler problems that are handled by some embedded “classical” planner. Within the ANR project DESCARWIN, work has continued in collaboration with Thalès Research & Technology and ONERA Toulouse. A large part of the work this year has been devoted to writing a brand new version of the DAE software, which raised parallelization difficulties. The resulting program entered the 7th International Planning Competition (IPC 2011) at the 21st International Conference on Automated Planning and Scheduling (ICAPS 2011) and won the Gold Medal in the Temporal Track. Note that the Silver Medal was won by Vincent Vidal, also a member of the DESCARWIN team, using his planner YAHSP2, the very planner that won the Gold Medal when embedded in DAE, thus demonstrating once more the added value of the DAE approach. Meanwhile, because DAE has many parameters (like most Evolutionary Algorithms), parameter tuning within DAE remains a difficult task, and an original approach has been proposed to learn the parameters based on some instance features. Note however that this method falls within the scope of the “Crossing the Chasm” SIG (see Section ), as it can be applied to any optimization algorithm that handles several instances of the same class.
Resuming work done in 2010, we further investigated the issue of robotic swarm control when the environment is partially or completely unknown. This research is at the crossroads of Evolutionary Computation, Machine Learning and Robotics, with a light influence from Evolutionary Ecology, but with a strong focus on engineering (i.e. the goal remains to design algorithms). The topic we are interested in is the design of environment-driven self-adaptive distributed algorithms to enable survival at the level of a population of independent robotic units. The population is limited in size, and hardware implementation within real robots has already been achieved. We have also focused our attention on specific aspects of swarm evolutionary dynamics under specific constraints, including the evolution of cooperative and/or altruistic behaviours. This research yielded interesting results, such as the emergence of altruistic behavior under simple, but specific, algorithmic constraints, as well as a tuning mechanism to control the level of altruistic behavior in a population of robots. Perspectives of this work are currently under investigation.
The work done in 2010 about the division of labor among asynchronous and decentralized agents, where each agent is modelled from the competition between two spiking neurons, was further analyzed within a spatio-temporal (simulated) frame. The phase transitions between the asynchronous, the aperiodic and periodic synchronous regimes (depending on the sociability and excitability of the agents) were confirmed, with some counter-intuitive results about the overall merits and efficiency of synchronous behaviors.
We have also explored objective-driven online learning within real robotic hardware, both for online behavior learning by a single robot and for pattern formation learning by a small group of robots. Our activity in Evolutionary Robotics has also been strengthened by the publication of a book which gathers several contributions from major actors in the field, including an introductory paper on current trends and challenges in this domain.
From a slightly different perspective, our work on evolving generative and developmental representations has been continued, with an extensive study of robustness within developmental systems and an investigation of the temporal dynamics at work within genetic regulatory networks for design. While not strictly related to robotics, these contributions share the distributed nature of computation and ultimately aim at providing an efficient representation for designing and controlling large-scale passive or active assemblies of units (e.g. robots with complex morphologies).
Additionally, at the crossroads of Machine Learning and Evolutionary Computation, a new Reinforcement Learning approach based on modelling the user's preferences was proposed; in the so-called Preference-based Policy Learning, the robot demonstrates some behaviors, is informed of the user's preferences, builds a model of these preferences and self-trains to build a new behavior, hopefully more satisfactory according to the conjectured user's preferences.
Basic tools from statistical physics (scaling, mean-field techniques and associated distributed algorithms, exactly-solvable models) and probability have been used to model and optimize complex systems, either standalone or combined with MABS approaches. Results are
Within the context of image understanding, a new sequential recognition framework has been proposed in . Sequential image understanding refers to the decision making paradigm where objects in an image are successively segmented/recognized following a predefined strategy. Such an approach generally raises some questions about the “best” segmentation sequence to follow and/or how to avoid error propagation. In , we propose original approaches to answer these questions in the case where the objects to segment/recognize are represented by a model describing the spatial relations between objects. The process is guided by a criterion derived from visual attention, and more precisely from a saliency map, along with some spatial information to focus the attention. This criterion is used to optimize the segmentation sequence. Spatial knowledge is also used to ensure the consistency of the results and to allow backtracking on the segmentation order if needed. The proposed approach was applied to the segmentation of internal brain structures in magnetic resonance images. The results show the relevance of the optimization criteria and the value of the backtracking procedure in guaranteeing good and consistent results.
From a logical standpoint, sequential object recognition is formulated as an abduction process in , . A scene is viewed as an observation and the task of interpretation is considered as finding the “best” explanation given the prior knowledge about the scene context. Towards this aim, we introduce an algebraic framework unifying mathematical morphology, description logics and formal concept analysis. We propose to compute the best explanations of an observation through algebraic erosion over the Concept Lattice of a background theory, which is efficiently constructed using tools from Formal Concept Analysis. We show that the defined operators are sound and complete and satisfy important rationality postulates of abductive reasoning.
Due to the departure of both PhD students funded within the Microsoft-INRIA joint lab after their successful defenses (Alvaro Fialho and Alejandro Arbelaez), some of the activities of this SIG have been slightly redefined this year, with the one-month visit of Prof. Th. Runarsson (University of Iceland) in October, and the arrival in November of two new post-docs, also funded by the joint lab (Nadjib Lazaar and Manuel Loth). A new direction of research has appeared, in line with both Adaptive Operator Selection (Alvaro Fialho's PhD) and Continuous Search (Alejandro Arbelaez' PhD).
This new direction of research concerns heuristic choice within an existing combinatorial solver using bandit-like algorithms; the very first results, on scheduling problems, will be published in early 2012.
In line with his PhD work, Alvaro Fialho has successfully applied his Adaptive Operator Selection method to the on-line tuning of Differential Evolution in the multi-objective case.
LaO is an instance-based parameter-tuning method. Though originally designed for the Divide-And-Evolve framework (see Section ), LaO is a generic method that learns the relationship between some instance features and the optimal parameters of the optimizer. The current version uses a Neural Network to directly learn the optimal parameters, and the average performance increase compared to the default parameter set (that won the temporal track in the IPC7 competition) is more than 10%. On-going work uses rankSVM to learn a partial order on the features
Alejandro Arbelaez defended his PhD on May 31, under the supervision of Youssef Hamadi and Michèle Sebag. A survey of his PhD work has been published as a book chapter, and some of his latest work was more concerned with optimizing the collaboration in distributed SAT solving in highly parallel environments.
In , we describe a learning-to-rank technique based on calibrated multi-class classification. We train a set of multi-class classifiers using AdaBoost.MH, we calibrate them using various techniques to obtain diverse class probability estimates, and, finally, we approximate the Bayes-scoring function (which optimizes the popular Information Retrieval performance measure NDCG), through mixing these estimates into an ultimate scoring function. Our method outperforms many standard ranking algorithms on the LETOR benchmark datasets, most of which are based on significantly more complex learning to rank algorithms than ours.
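For readers unfamiliar with NDCG, the sketch below computes it with the common 2^r - 1 gain and log2 discount, and ranks documents by their expected gain under a class-probability estimate. Treating the expected gain as the scoring function is our assumption about what scoring by calibrated class probabilities looks like in its simplest form; the mixing of several calibrated estimates described above is omitted, and all probabilities and labels are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with 2^r - 1 gain and log2 position discount."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(relevances))


def ndcg(relevances):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def expected_gain(class_probs):
    """Expected 2^r - 1 gain under probabilities over relevance levels 0, 1, 2, ..."""
    return sum(p * (2 ** r - 1) for r, p in enumerate(class_probs))


# Hypothetical class-probability estimates for three documents (levels 0, 1, 2)
# and their true relevance labels.
probs = [[0.1, 0.2, 0.7], [0.8, 0.15, 0.05], [0.2, 0.6, 0.2]]
labels = [2, 0, 1]

scores = [expected_gain(p) for p in probs]
order = sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)
ndcg_by_score = ndcg([labels[i] for i in order])
ndcg_worst = ndcg(sorted(labels))  # least relevant document ranked first
print(order, ndcg_by_score, round(ndcg_worst, 4))
```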
Our main expertise in continuous optimization is on stochastic search algorithms. We address theory, algorithm design and applications. The methods we investigate are adaptive techniques able to iteratively learn the parameters of the distribution used to sample solutions. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is nowadays one of the most powerful methods for continuous optimization without derivatives. We work on different variants of the CMA-ES to improve it in various contexts, as described below. In addition, we have contributed an information geometry perspective on stochastic optimization, unifying both continuous and discrete algorithms using a family of probability distributions parametrized by continuous parameters. The framework proposed in this context allows many existing stochastic optimization algorithms to be recovered when instantiated with different families of probability distributions, including the CMA-ES when using Gaussian distributions. We have moreover clarified important design principles based on invariances.
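The principle of iteratively adapting the sampling distribution can be illustrated with a much simpler relative of CMA-ES: a (1+1) evolution strategy that adapts only the step-size of an isotropic Gaussian with a one-fifth success rule. This is not CMA-ES (no covariance matrix is learned); the test function, the update factors and the iteration budget are assumptions chosen for the example.

```python
import random


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(f, x0, sigma0=1.0, iterations=2000, seed=1):
    """(1+1)-ES with one-fifth success rule; returns (best point, best value, final sigma)."""
    rng = random.Random(seed)
    x, fx, sigma = list(x0), f(x0), sigma0
    for _ in range(iterations):
        # sample one candidate from an isotropic Gaussian centred on the parent
        candidate = [v + sigma * rng.gauss(0.0, 1.0) for v in x]
        fc = f(candidate)
        if fc <= fx:
            x, fx = candidate, fc
            sigma *= 1.5            # success: enlarge the step-size
        else:
            sigma *= 1.5 ** -0.25   # failure: shrink it; stationary at a 1/5 success rate
    return x, fx, sigma


best_x, best_f, final_sigma = one_plus_one_es(sphere, [3.0] * 5)
print(best_f, final_sigma)
```

The step-size is the only learned parameter of the distribution here; CMA-ES additionally learns a full covariance matrix, which is what makes it effective on ill-conditioned and non-separable problems.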
A new variant of CMA-ES to address problems with mixed-integer variables (vectors with both discrete and continuous variables) has been designed. New algorithms using new selection schemes combined with derandomization have been designed and thoroughly investigated, both theoretically and empirically. A local search algorithm using an adaptive coordinate descent has been proposed. We have also investigated how to inject solutions into CMA-ES so as to improve performance when an oracle provides good solutions.
We have proposed simple modifications of evolutionary algorithms so that they reach asymptotically the optimal
We have continued our effort to improve standards in benchmarking and pursued the development of the COCO (COmparing Continuous Optimizers) platform. We are organizing the Black-Box-Optimization Workshop for the GECCO 2012 conference.
We have investigated optimization using a coupling of CMA-ES and surrogates and applied it to the optimization of well placement. We have proposed a new meta-model CMA-ES for the optimization of partially separable functions and shown that it improves performance on the well placement problem.
In we present hyper-parameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion. The sequential algorithms are applied to the most difficult DBN learning problems and find significantly better results than the best previously reported.
We have theoretically investigated multi-objective algorithms based on the hypervolume and proposed new selection operators based on tournaments and on the multi-armed bandit framework.
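As a reminder of what the hypervolume measures, the following sketch computes it for two minimized objectives, together with the contribution of each point, the quantity on which hypervolume-based selection typically relies. The points, the reference point and the function names are hypothetical, and the selection operators mentioned above are not reproduced.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both objectives minimized)."""
    # keep only points that are strictly better than the reference on both objectives
    pts = sorted(p for p in points if p[0] < reference[0] and p[1] < reference[1])
    area, best_y = 0.0, reference[1]
    for x, y in pts:  # sweep by increasing first objective
        if y < best_y:  # the point is non-dominated among those seen so far
            area += (reference[0] - x) * (best_y - y)
            best_y = y
    return area


def contributions(points, reference):
    """Hypervolume lost when each point is removed from the set."""
    total = hypervolume_2d(points, reference)
    return [
        total - hypervolume_2d(points[:i] + points[i + 1:], reference)
        for i in range(len(points))
    ]


# Hypothetical bi-objective population; (3, 3) is dominated by (2, 2).
population = [(1, 3), (2, 2), (3, 1), (3, 3)]
ref_point = (4, 4)
hv = hypervolume_2d(population, ref_point)
contrib = contributions(population, ref_point)
print(hv, contrib)
```

Dominated points contribute nothing, so a selection scheme driven by these contributions discards them first.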
We have shown in
a simple algorithm (a
The paper shows upper and lower confidence bounds and/or experiments with algorithms in the noisy optimization setting; in particular we compared an optimization algorithm based on bandits and a surrogate-model version: whereas the bandit approach is much faster if the noise decreases quickly to zero around the optimum, the surrogate-model version is faster if the noise does not decrease to zero.
Monte-Carlo Tree Search (MCTS) and Upper Confidence Trees (UCT) are core research areas of the team. In particular, we ultra-weakly solved 7x7 Go by winning 20 games out of 20 against professional players, thanks to a Meta-Monte-Carlo-Tree Search. The wins were with komi 9.5 as white, and 8.5 as black, suggesting that the ideal komi in 7x7 is 9. We also applied this algorithm to the recent “NoGo” framework, aimed at challenging MCTS with a game which looks like Go but has very different goals; our paper was the first one applying MCTS to NoGo and now all strong programs use the MCTS approach for NoGo. We extended RAVE (Rapid Action Value Estimates) to continuous settings. In his PhD, Fabien Teytaud proposed several generic improvements of MCTS, including the use of (fast) decisive and anti-decisive moves for games, and applied them to the game of Havannah. An industrial application (to energy management) is proposed in . An MCTS version for partially observable problems with bounded horizon was proposed in . This version is proposed for the two-player case, but for simulations starting at the root; a version in the one-player case, starting from an arbitrary state (and therefore much more efficient for large horizons) is proposed in . This work is extended by a belief state estimation by constraint satisfaction problems in . Other developments and research around MCTS/UCT are described in the MoGo module.
A related important algorithm is Nested Monte-Carlo; we obtained state-of-the-art results for some traveling salesman variants with a very simple algorithm in .
Fundamental analysis of partially observable games: we proved in that partially observable games are undecidable (a result also presented at the BIRS 2010 workshop and the Bielefeld seminar on Search Methodologies), even in the case of finite state spaces and deterministic transitions. This unexpected result appears a priori to contradict known decidability results; it emphasizes the subtle difference between the classical decision problem (the existence of a strategy that wins with certainty, whatever the strategy of the opponent), which is used in most analyses, and the choice of the move with optimal winning probability. We pointed out that the relevant decision problem is, without doubt, the latter; that the other decision problem has been used only because it is equivalent to choosing optimal play in the case of fully observable games; and, most importantly, that partially observable games are in fact undecidable, even in the finite deterministic case. On the other hand, in restricted settings, we have shown by some simple lemmas lower and upper bounds on the value of some partially observable games. We extended Monte-Carlo Tree Search to the case of short-term partial information in ; this was successfully applied to Urban Rival, a widely played internet card game (now 17 million registered users) from a French company.
Tuning of strategies: tuning strategies is a noisy optimization problem in which the convenient “variance of noise decreasing to zero around the optimum” usually does not hold. We have shown that in such a setting, the local bandit-style algorithms are slower than surrogate models; this is detailed in the continuous optimization part.
We organized various computer-Go events: thanks to the fame of our program MoGo, we are often invited to such events; reports can be found in .
We developed the “double progressive widening” trick, which is aimed at making consistent an algorithm from the finite case to the continuous stochastic case; we got good results in on Q-Learning (with no mathematical proof) and on MCTS (mathematical proof to be submitted soon).
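The widening test behind this kind of trick can be sketched as below. The sketch assumes a commonly used polynomial schedule, in which a node visited n times may hold at most ceil(c * n^alpha) children; in a "double" variant the same test is applied both to actions at decision nodes and to sampled outcomes at random nodes. The names `may_widen`, `children_after`, `c` and `alpha` are hypothetical, and the schedule is an assumption rather than the exact rule of the work cited above.

```python
import math


def may_widen(visits, num_children, c=1.0, alpha=0.5):
    """Return True if a node visited `visits` times (>= 1) may add a child.

    Assumed schedule: at most ceil(c * visits**alpha) children are allowed.
    """
    return num_children < math.ceil(c * visits ** alpha)


def children_after(n_visits, c=1.0, alpha=0.5):
    """Count the children created if the node widens whenever it is allowed."""
    children = 0
    for t in range(1, n_visits + 1):
        if may_widen(t, children, c, alpha):
            children += 1
    return children
```

With alpha = 1 and c = 1 every visit creates a new child, whereas smaller values of alpha make the number of children grow sublinearly, so that existing children are revisited often enough for their value estimates to improve.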
We have also worked on Nash equilibria of matrix games, for which we proposed an algorithm that finds Nash equilibria faster when the Nash equilibrium is sparse; a mathematical proof is ready and will be submitted soon.
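As background for the matrix-game setting, the sketch below approximates a Nash equilibrium of a zero-sum matrix game with classical fictitious play. It is not the sparse-equilibrium algorithm mentioned above, only a simple baseline; the function name `fictitious_play`, the payoff convention (the row player maximizes, the column player minimizes) and the initial beliefs are assumptions made for this illustration.

```python
def fictitious_play(payoff, iterations=10000):
    """Approximate mixed strategies for a zero-sum matrix game.

    payoff[i][j] is the gain of the row player (and the loss of the
    column player) when row i meets column j.
    """
    rows, cols = len(payoff), len(payoff[0])
    # Empirical play counts, seeded with one arbitrary initial move each.
    row_counts = [1] + [0] * (rows - 1)
    col_counts = [1] + [0] * (cols - 1)
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical mixture.
        row_gain = [sum(payoff[i][j] * col_counts[j] for j in range(cols))
                    for i in range(rows)]
        col_loss = [sum(payoff[i][j] * row_counts[i] for i in range(rows))
                    for j in range(cols)]
        best_row = max(range(rows), key=row_gain.__getitem__)
        best_col = min(range(cols), key=col_loss.__getitem__)
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([n / row_total for n in row_counts],
            [n / col_total for n in col_counts])
```

On matching pennies, `fictitious_play([[1, -1], [-1, 1]])` returns frequencies close to one half for each move. When the equilibrium puts weight on only a few rows and columns, most counts stay at zero, which is the kind of sparsity the work above aims to exploit.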
Some works are in progress around applications of previous tools to active learning; active learning has also been investigated through conditional random fields in .
Another related work, motivated by autonomous robotics, combines the exploration of the search space through UCT with an explicit model of the safe regions explored so far, called Deja-Vu. The Deja-Vu is used to constrain the exploration, mostly in the random phase, and is updated from the current explorations.
The Ilab “Metis” has just started; it is a joint Ilab between TAO, the Inria-Saclay team MaxPlus, and the SME Artelys.
http://
The two main families of deep networks are implemented and studied by TAO: stacked RBM (Restricted Boltzmann Machines) and stacked AA (Auto-Associators).
Inspired by the theory of compressed sensing and beyond the common methods based on dictionary learning, we have proposed to learn sparsity and accuracy simultaneously by alternating two constraints on the weights of an Auto-Associator .
The model "SpikeAnts" has been applied to a spatial robotic environment, in collaboration with Nicolas Bredeche (see Section ), and has further demonstrated its value in the context of swarm robotics.
IFP– 2008-2011 (24 kEur). Optimisation de puits non-conventionnels: type, position et trajectoire, side-contract to Zyed Bouzarkouna's CIFRE Ph.D.;
Participants: Anne Auger, Zyed Bouzarkouna, Marc Schoenauer.
PSA– 2009-2012 (45 kEur). , side-contract to Mouadh Yagoubi's CIFRE Ph.D.;
Participants: Marc Schoenauer, Mouadh Yagoubi.
THALES– 2011-2014 (40 kEur). , side-contract to Gaetan Marceau-Caron's CIFRE Ph.D.;
Participants: Marc Schoenauer, Gaetan Marceau-Caron.
EXQIM– 2011-2014 (40 kEur). , side-contract to Moez Hammami's CIFRE Ph.D.;
Participants: Michèle Sebag, Moez Hammami.
ILAB METIS– 2012- . Participants: O. Teytaud, M. Schoenauer, S. Gaubert (team Max-Plus), J.-J. Christophe (engineer), J. Decock (Ph.D.).
JASMIN– 2010-2012 (205 kEur). DRIRE programme FEDER.
Participants: CADLM, Intercim, TAO (Michèle Sebag).
CSDL– 2009-2012 (290 kEur). FUI System@tic (Région Ile de France grant). Complex System Design Lab
Participants: Anne Auger, Nikolaus Hansen, Ilya Loshchilov, Raymond Ros, Marc Schoenauer.
INNOV NATION– 2009-2011 (69 kEur). Fonds de Compétitivité des Entreprises. Simulation multi-agent de diffusion et d'évolution d'idées dans un réseau social dynamique (application à un serious game).
Participants: Philippe Caillou, Samuel Thiriot.
OMD2– 2009-2012 (131 kEur). Optimisation Multi-Disciplinaire Distribuée, ANR programme COSINUS. Coordinator Maryan Sidorkiewicz, RENAULT Technocentre;
Participants: Anne Auger, Yohei Akimoto, Nikolaus Hansen, Marc Schoenauer, Olivier Teytaud.
SyDiNMaLaS– 2009-2012 (158 kEur). Integrating Symbolic Discovery with Numerical Machine Learning for Autonomous Swarm Control, ANR programme BLANC. Coordinator Michèle Sebag, CNRS;
Participants: David Meunier, Marc Schoenauer, Michèle Sebag.
TRAVESTI– 2009-2012 (206 kEur). Estimation du volume de Trafic par Interface Spatio-temporelle, ANR programme SYSCOMM 2008. Coordinator Cyril Furtlehner, INRIA;
Participants: Anne Auger, Cyril Furtlehner, Victorin Martin, Maxim Samsonov.
ASAP– 2009-2012 (178 kEur). Apprentissage Statistique par une Architecture Profonde, ANR programme DEFIS 2009. Coordinator Alain Rakotomamonjy, LITIS, Université de Rouen, France;
Participants: Sylvain Chevallier, Hélène Paugam-Moisy, Sébastien Rebecchi, Michèle Sebag.
IOMCA– 2010-2013 (264 kEur). Including Ontologies in Monte-Carlo Tree Search and Applications, ANR international project coordinated by O. Teytaud (Tao, INRIA).
Participants: Adrien Couëtoux, O. Teytaud.
EXPLORA– 2010-2012 (289 kEur, to be shared with Inria Lille). EXPLOitation pour l'Allocation efficace de Ressources. Applications à l'optimisation. ANR project coordinated by R. Munos (INRIA Lille).
Participants: David Auger, Olivier Teytaud.
DESCARWIN– 2010-2013 (201 kEur). Coordinator P. Savéant, Thalès.
Participants: Matthias Brendel, Mostepha-Redouane Khouadjia, Marc Schoenauer.
SIMINOLE– 2010-2014 (1180k, 250k for TAO). Large-scale simulation-based probabilistic inference, optimization, and discriminative learning with applications in experimental physics, ANR project, Coordinator B. Kégl (CNRS LAL).
Participants: Balázs Kégl, Rémi Bardenet, Nikolaus Hansen, Michèle Sebag, Cécile Germain
Title: Symbiotic Evolutionary Robots Organisms
Type: COOPERATION (ICT)
Defi: Embedded systems design
Instrument: Integrated Project (IP)
Duration: February 2008 - January 2013
Coordinator: Universität Stuttgart (Germany)
Other partners: Vereniging voor christelijk hoger onderwijs, wetenschappelijk onderzoek en patiëntenzorg, Netherlands; Universität Graz, Austria; Universität Karlsruhe, Germany; Vlaams Interuniversitair Instituut Voor biotechnologie VZW, Belgium; University of the West of England, Bristol, United Kingdom; Eberhard Karls Universität Tübingen, Germany; University of York, United Kingdom; Université libre de Bruxelles, Belgium; INRIA, France.
See also:
http://
Title: Massive Sets of Heuristics For Machine Learning
Type: COOPERATION (ICT)
Defi: Cognitive Systems and Robotics
Instrument: Specific Targeted Research Project (STREP)
Duration: December 2010 - December 2012
Coordinator: IDIAP Research Institute (Switzerland)
Other partners: Centre National de la Recherche Scientifique, France; Weierstrass-Institut für Angewandte Analysis und Stochastik, part of Forschungsverbund Berlin e.V., Germany; INRIA, France; Ceske Vysoke Uceni Technicke V Praze, Czech Republic.
See also:
http://
Title: Design of a decision support tool for sustainable, reliable and cost-effective energy strategies in cities and industrial complexes
Type: COOPERATION (ICT)
Defi: Smart Cities and Communities
Instrument: Specific Targeted Research Project (STREP)
Duration: October 2011 - March 2014
Coordinator: Artelys SA (France)
Other partners: Austrian Institute of Technology, Austria; INESC Porto, Portugal; ARMINES (CMA), France; SCHNEIDER ELECTRIC, France; City of Cesena, Italy; City of Bologna, Italy; Tupras - Turkish Petroleum Refineries Corporation, Turkey; ERVET, Italy; INRIA, France.
See also: Artelys Web site
PASCAL2, Network of Excellence, 2008-2013 (34 kE in 2008, 70 kE in 2009). Coordinator John Shawe-Taylor, University of Southampton. M. Sebag is manager of the Challenge Programme.
EGI– FP7 Infrastructure, 2010-2013 (48 kEur). Participants: Cécile Germain, Michèle Sebag, Davy Feng, Julien Nauroy.
Grille Paris-Sud– MRM (Moyens de Recherche Mutualisés), 2010-2011 (23 kEur). Coordinator Balázs Kégl. Participants: Cécile Germain, Michèle Sebag, Xiangliang Zhang, Julien Perez, Davy Feng, Julien Nauroy.
DIGIBRAIN– 2007-2011 (48 kEur). DIGITEO grant, coordinator Jean-Denis Muller, CEA LIST, France
Participants: Cédric Gouy-Pailler, Michèle Sebag.
Unsupervised-Brain– 2011-2014 (5 kEur). DIGITEO grant, coordinator Michèle Sebag, LRI Université Paris Sud, France
Participants: Yoann Isaac, Cédric Gouy-Pailler, Michèle Sebag.
MetaModel– 2008-2011 (150 kEur). Advanced methodologies for modeling interdependent systems - applications in experimental physics, ANR “jeune chercheur” grant, coordinator Balázs Kégl
Participants: Michèle Sebag, Cécile Germain, Robert Busa-Fekete
NUTN (National University of Tainan, Taiwan). Collaboration of Olivier Teytaud around MoGo (see Invitation section below).
University of Iceland. Prof. Thomas Philip Runarsson was invited for one month in TAO (October 2011) to work on bandit-based choice of heuristics in combinatorial optimization .
Olivier Teytaud (CR1) is an invited researcher at NUTN (National University of Tainan, Taiwan) for one year.
Adrien Couëtoux (Ph.D. student) is on an internship at NUTN (National University of Tainan, Taiwan) for 6 months.
Jérémie Decock (Ph.D. student) will go to NUTN (National University of Tainan, Taiwan) for 5 months.
Jean-Baptiste Hoock (Ph.D. student) spent 12 days at Univ. Potsdam (October 2011) for the Mash project.
Anne Auger
THRaSH, Theory of Randomized Search Heuristics workshop, member of Steering Committee;
Editorial Board of Evolutionary Computation, MIT Press;
Olivier Teytaud: Committee at EvoStar, ICML, Lion, Advances in Computer Games.
Nicolas Bredeche
PC member at GECCO 2011, RIVF 2011, EuroGP 2011
Co-organiser of DevLeaNN, a two-day workshop on Development and Learning in Artificial Neural Networks, Paris, France.
Co-editor of New Horizons in Evolutionary Robotics, Springer.
Philippe Caillou
PC member at EPIA 2011, V2CS 2011
Coordinator of the SimTools Network (RNSC Network)
Nikolaus Hansen
Editorial Board of Evolutionary Computation, MIT Press;
Marc Schoenauer
Member of ACM-SIGEVO Executive since 2003 (Special Interest Group on Evolutionary Computation, which was the International Society on Genetic and Evolutionary Algorithms before 2006); member of ACM-GECCO Business Committee (2012-2013).
Parallel Problem Solving from Nature, Member of Steering Committee (since 1998);
Co-chair with Youssef Hamadi (MSR Cambridge) of the LION'6 conference (Learning and Intelligent OptimizatioN) in Paris, January 2012.
“Invited Speaker co-Chair” of IEEE-CEC (Congress on Evolutionary Computation) 2011, New Orleans, USA.
Editorial Board of Evolutionary Computation, MIT Press (Editor in Chief, 2002-2009); Genetic Programming and Evolvable Machines, Springer Verlag; Applied Soft Computing, Elsevier; Natural Computing Series, Springer Verlag.
PC member of all important conferences in the area of Evolutionary Computation
Michèle Sebag
Member of the European Machine Learning and Knowledge Discovery from Databases Steering Committee since 2010; ECCAI Fellow since 2011;
Workshop Chair of ECAI 2012 (Montpellier, August 2012);
Co-organization with Einoshin Suzuki of the International Workshop on LEarning and data Mining for Robots (LEMIR) at the IEEE Int. Conf. on Data Mining, Vancouver, Dec. 2011;
Area chair ICML11, Area chair ECML11;
PC member of ILP11, GECCO11; reviewer on 3 PhDs and 1 HdR; reviewer for ERC, CNRS, INRIA-Lille applications; member of LRI CCSU;
Member of the CoNRS; Senior Advisory Board CHIST-ERA; member of the CSFRS (Conseil Supérieur de la Formation et Recherche Stratégique);
Pattern Analysis, Statistical Learning and Computational Modelling NoE, Member of Steering Committee (PASCAL 2004-2008; PASCAL2, 2008- );
Editorial Board of Machine Learning Journal, Springer Verlag; Genetic Programming and Evolvable Machines, Springer Verlag.
Cécile Germain-Renaud
PC member for: IEEE/ACM Cluster, Cloud and Grid Computing (CCGRID) since 2009; IFIP International Conference on Network and Parallel Computing since 2005; SP Cloud workshop; EGEE/EGI user forum since 2008.
Elected member of the board of the Faculty of Science (Conseil d'UFR) and of the University scientific board (Conseil Scientifique de l'Université).
Marc Schoenauer: EVOLVE, Luxembourg (May).
Michèle Sebag: KAUST, Saudi Arabia (Feb.); U. Zurich, Switzerland (March); U. York, UK (Nov.); NeuroComp; ML@INRIA.
Nicolas Bredeche
Licence: approx. 80h (Artificial Life), L2, Univ. Paris-Sud, France.
Master: approx. 120h (Evolutionary Computation, Artificial Intelligence), L2, Univ. Paris-Sud, France. Including 15h Evolutionary Robotics, M2R.
Philippe Caillou
Licence: approx. 192h (Computer science for managers), L1, IUT Sceaux, Univ. Paris-Sud, France.
Master: approx. 27h (Multi-Agents Systems), M2R, Univ. Paris-Sud, France.
Master: 3h (Multi-Agent Based Simulation), M2R, Univ. Paris-Dauphine, France.
Cécile Germain-Renaud
Licence: approx. 120h (Computer Architecture, head of Licence) L2, L3, Polytech, Univ. Paris-Sud, France.
Master: approx. 50h (Parallelism), M1 Computer Science, Univ. Paris-Sud, France.
Michèle Sebag
Master: 18h (Machine Learning), M2R, Univ. Paris-Sud, France.
PhD & HdR
HdR: O. Teytaud, Artificial Intelligence and Optimization with Parallelism, Université Paris-Sud, April 22, 2011
PhD: A. Arbelaez, Learning During Search, Université Paris-Sud, May 31, 2011, Y. Hamadi and M. Sebag
PhD: F. Teytaud, Introduction of Statistics in Optimization, Université Paris-Sud, Dec. 8, 2011, M. Schoenauer and O. Teytaud
PhDs in progress
R. Akrour, Autonomous Robotics based on Information Theory, Université Paris-Sud, Nov. 02, 2010, M. Sebag
L. Arnold, Architectures Profondes pour la Vision Computationnelle, Université Paris-Sud, Jan. 01, 2010, H. Paugam-Moisy and Ph. Tarroux (LIMSI)
R. Bardenet, Méthodes d’Échantillonnage pour l'Inférence et l'Optimisation en Physique des Particules, Université Paris-Sud, Nov. 13, 2009, B. Kégl
Z. Bouzarkouna, Optimisation de Puits Non Conventionnels : Type, Position et Trajectoire, Université Paris-Sud, Dec. 01, 2008, M. Schoenauer
A. Chotard, Enhancement and Analysis of Evolution Strategies, Université Paris-Sud, Oct. 01, 2011, A. Auger and N. Hansen
A. Couëtoux, Monte-Carlo Tree Search and other Reinforcement Learning methods for Energy Management Applications, Université Paris-Sud, Sept. 01, 2010, O. Teytaud
F. Dawei, Détection et diagnostic d'anomalies dans les systèmes globalisés à grande échelle, Université Paris-Sud, Oct. 01, 2010, C. Germain
J. Decock, Comparison and Combination of Control and Reinforcement Learning methods for Energy Management Applications, Université Paris-Sud, Oct. 03, 2011, O. Teytaud
N. Galichet, Integrity Preserving Policy Learning, Université Paris-Sud, Oct. 01, 2011, M. Sebag
M. Hammami, Traitement de Données Financières Haute Fréquence : Exploration de Méthodes de Construction Inductive à Grande Échelle, Université Paris Diderot, May 01, 2011, M. Sebag
J.-B. Hoock, Goal Planning with Massive Sets of Heuristics, Université Paris-Sud, Nov. 01, 2009, O. Teytaud
Y. Isaac, Apprentissage Génératif pour les Interfaces Cerveau-Machine, Université Paris-Sud, Oct. 03, 2011, C. Gouy-Pailler (CEA) and M. Sebag
I. Loshchilov, Rank-based Meta-models for Costly Optimization, Université Paris-Sud, Oct. 01, 2009, M. Schoenauer and M. Sebag
G. Marceau-Caron, Optimisation Globale du Trafic Aérien, Université Paris-Sud, May 11, 2011, A. Hadjaz (Thalès Air Systems), P. Savéant (Thalès R&D) and M. Schoenauer
V. Martin, Modélisation Probabiliste et Inférence par Propagation de Croyances : Application au Trafic Routier, Université Paris-Sud, Dec. 01, 2009, A. de la Fortelle and J.-M. Lasgouttes (INRIA Rocquencourt)
J.-M. Montanier, Robotique Evolutionnaire pour l'Adaptation en Ligne d'un Essaim de Robot, Université Paris-Sud, Oct. 01, 2009, N. Bredeche
W. Wang, Théorie de l'Information pour l'Apprentissage statistique en Robotique Embarquée, Université Paris-Sud, Oct. 01, 2010, M. Sebag
M. Yagoubi, Optimisation multi-objectif d'un bloc-moteur diésel, Université Paris-Sud, Feb. 09, 2009, M. Schoenauer and L. Thobois (PSA)