Section: Research Program

The Five Pillars of TAO

This section describes TAO's main research directions at the crossroads of Machine Learning and Evolutionary Computation. Since 2008, TAO has been structured in several special interest groups (SIGs) to enable the agile investigation of long-term or emerging theoretical and applied issues. The comparatively small size of the TAO SIGs enables in-depth and lively discussions; the fact that all TAO members belong to several SIGs, on the basis of their personal interests, fosters strong and informal collaboration among the groups and fast dissemination of information.

The first two SIGs consolidate the key TAO scientific pillars, while the others evolve and adapt to new topics.

The Stochastic Continuous Optimization SIG (OPT-SIG) takes advantage of the fact that TAO is acknowledged as the best French research group and one of the top international groups in evolutionary computation from a theoretical and algorithmic standpoint. A main priority on the OPT-SIG research agenda is to provide theoretical and algorithmic guarantees for the current state-of-the-art continuous stochastic optimizer, CMA-ES, ranging from convergence analysis to a rigorous benchmarking methodology. Incidentally, the benchmark platform COCO has been acknowledged since 2009 as "the" international continuous optimization benchmark, and its extension is at the core of the ANR projects NumBBO and NumBBO2. Another priority is to address the current limitations of CMA-ES in terms of high-dimensional or expensive optimization and constraint handling (respectively Ouassim Ait El Hara's and Asma Atamna's PhDs). Note that most members of this SIG had moved to the recently created Inria team RANDOPT by December 2016.

The Optimal Decision Making under Uncertainty SIG (UCT-SIG) benefits from the MoGo expertise and its past and present world records in the domain of computer Go, which established the international visibility of TAO in sequential decision making. Since 2010, UCT-SIG has resolutely moved to address the problems of energy management from a fundamental and applied perspective. On the one hand, energy management offers a host of challenging issues, ranging from long-horizon policy optimization to the combinatorial nature of the search space, and from the modeling of prior knowledge to non-stationary environments, to name a few. On the other hand, the energy management issue can hardly be tackled from a purely academic perspective: tight collaborations with industrial partners are needed to access the true operational constraints. Such international and national collaborations were started by Olivier Teytaud during his three stays (1 year, 6 months, 6 months) in Taiwan, and are witnessed by the FP7 STREP Citines, the ADEME Post contract, and the METIS I-lab with SME Artelys. Note that Olivier Teytaud left TAO for Google-Zurich on June 6, 2016. The project is continuing in collaboration with RTE under the leadership of Isabelle Guyon and Marc Schoenauer, making connections with Data Science.

The Data Science SIG (DS-SIG) includes the activities conducted or started within the CDS and ISN Lidexes in Saclay. On the one hand, it replaces and extends the former Distributed Systems SIG, which was devoted to the modeling and optimization of (large scale) distributed systems, and which itself extended the goals of the original Autonomic Computing SIG, initiated by Cécile Germain-Renaud and investigating the use of statistical Machine Learning for large scale computational architectures (from data acquisition, with the Grid Observatory in the European Grid Initiative, to grid management and fault detection). This SIG has evolved under the application pressure from natural and social sciences (ranging from High Energy Physics to computational social sciences). A major result of this theme has been the creation three years ago of the Paris-Saclay Center for Data Science, co-chaired by Balázs Kégl, and the organization of the Higgs-ML challenge (http://higgsml.lal.in2p3.fr/), the most popular challenge ever on the Kaggle platform. Another large scale data challenge, sponsored by Microsoft with USD 60000 in prizes and run in 2015/2016 on the theme of Automatic Machine Learning (AutoML), was also a success: the winners developed a new tool called AutoSKlearn as a wrapper around the scikit-learn library, an open source project led by the Inria team Parietal.

On the other hand, several activities around Computational Social Sciences, involving Gregory Grefenstette, Cécile Germain-Renaud, Michèle Sebag, Philippe Caillou, Isabelle Guyon and Paola Tubaro, have widely extended previous work on the modeling of multi-agent systems and the exploitation of simulation results within the SimTools RNSC network. One research direction involves adding semantics to underspecified collections of societal information, either from a historical perspective (as in the new TAO H2020 project EHRI-II on Holocaust archives, or in the Gregorius project on church history) or from an individual perspective (as in the ongoing Personal Semantics project). Another research direction, developed within the Paris-Saclay Institute for Digital Society (ISN Lidex), examines societal questions from a data-driven perspective: frictional unemployment (Th. Schmitt's PhD), quality of life at work (O. Goudet's post-doc), and the activities of scientific institutions (F. Louistisserand's engineering stint on Cartolabe). The key challenge here is to use learning algorithms to find structure and extract knowledge from poorly structured or unstructured information, and to provide intelligible results and/or means to interact with the user. Novel approaches involving causal modeling are under exploration.

The Designing Criteria SIG (CRI-SIG) focuses on the design of learning and optimization criteria. It elaborates on the lessons learned from the former Complex Systems SIG, which showed that the key issue in challenging applications is often to design the objective itself. Such targeted criteria are pervasive in the study and building of autonomous cognitive systems, ranging from intrinsic rewards in robotics to the notion of saliency in vision and image understanding, and to automatic algorithm selection and parameterization. The desired criteria can also result from fundamental requirements, such as scale invariance in a statistical physics perspective, and guide the algorithmic design. Additionally, the criteria can be domain-driven and reflect the expert priors concerning the structure of the sought solution (e.g., spatio-temporal consistency); the challenge is to formulate such criteria as a mixed non-convex/non-differentiable objective function that nevertheless remains amenable to tractable optimization.

The Deep Learning and Information Theory SIG (DEEP-SIG) originated from extensions of the work done in the Distributed Systems SIG that were developed in the context of the TIMCO FUI project (started end 2012 and just ended); the challenge was not only to port ML algorithms to massively distributed architectures, but also to see how these architectures can inspire new ML criteria and methodologies. The coincidence of this project with the arrival of Yann Ollivier in TAO gradually led this work toward Deep Networks. Other research themes of this SIG concern various theoretical and practical aspects of deep learning, providing information-theoretic perspectives on the design and optimization of deep learning models, such as using the Fisher information matrix to optimize the parameters, or using minimum description length criteria to choose the right model structure (topology of the neural graph, addition or removal of parameters, etc.) and to provide regularization and model selection. This activity has also branched out into exploring various applications of Deep Learning. Isabelle Guyon has been involved in applications in computer vision, including the study of personality traits in video data and the verification of fingerprints. Energy Management (Section 4.1), Computational Social Sciences (Section 4.2), and anomaly detection are now also steered toward using Deep Networks for different variants of representation learning.