

Section: New Results

Toward Good AI

Causal Modeling

Participants: Philippe Caillou, Isabelle Guyon, Michèle Sebag

PhD: Diviyan Kalainathan

Collaboration: David Lopez-Paz (Facebook).

The search for causal models relies on quite a few hardly testable assumptions, e.g. causal sufficiency [160]; it is a data-hungry task, as it has the identification of independent and conditionally independent pairs of variables at its core. A new approach, investigated through the Cause-Effect Pairs (CEP) Challenge [112], formulates causality search as a supervised learning problem, considering the joint distributions of pairs of variables (e.g. (Age, Salary)) labelled with the proper causation relationship between the two variables (e.g. Age "causes" Salary); learning algorithms able to learn from such distributions have been proposed [114]. An edited book has been published [48] that summarizes the history of cause-effect pair research. Several chapters of this book have co-authors in TAU: Evaluation methods of cause-effect pairs [49], Learning Bivariate Functional Causal Models [47], Discriminant Learning Machines [51], and Results of the Cause-Effect Pair Challenge [50].
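
To give a concrete flavour of this distribution-as-input formulation, the sketch below summarizes each pair sample by a few hand-crafted statistics and trains an off-the-shelf classifier to predict the causal direction; the features, classifier and function names are illustrative assumptions, not the algorithms of [114].

```python
# Minimal sketch of the "causality as supervised learning" formulation.
# Each training example is a joint sample of a variable pair (x, y),
# labelled +1 if x -> y and -1 if y -> x.  Features and model choices
# are illustrative, not those of the published algorithms.
import numpy as np
from scipy.stats import skew
from sklearn.ensemble import RandomForestClassifier

def pair_features(x, y):
    """Summarize the joint distribution of a pair by a few statistics."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    rxy = np.corrcoef(x, y)[0, 1]
    # Asymmetry of regression residuals, a classic cue for causal direction.
    res_xy = y - np.polyval(np.polyfit(x, y, 3), x)   # residuals of y given x
    res_yx = x - np.polyval(np.polyfit(y, x, 3), y)   # residuals of x given y
    return np.array([rxy, skew(x), skew(y), skew(res_xy), skew(res_yx),
                     res_xy.var(), res_yx.var()])

def fit_pair_classifier(pairs, labels):
    """pairs: list of (x, y) sample arrays; labels: +1 (x->y) or -1 (y->x)."""
    X = np.stack([pair_features(x, y) for x, y in pairs])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, labels)
    return clf
```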

In D. Kalainathan's PhD [14] and O. Goudet's postdoc, the search for causal models has been tackled in the framework of generative networks [107], trained to minimize a Maximum Mean Discrepancy (MMD) loss; the resulting Causal Generative Neural Network (CGNN) improves on the state of the art on the CEP Challenge. CGNN also compares favorably with the state of the art w.r.t. the usual performance indicators (AUPR, SID) on the main causal benchmarks, though at a large computational cost.
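
For reference, the MMD criterion driving the training of such generative networks can be estimated from finite samples with Gaussian kernels; the snippet below is a generic, illustrative estimator written in PyTorch, not the CGNN implementation itself.

```python
# Illustrative (biased) estimator of the Maximum Mean Discrepancy between
# generated and observed samples, with a Gaussian kernel mixture.
import torch

def mmd_gaussian(x, y, bandwidths=(0.1, 1.0, 10.0)):
    """x, y: (n, d) and (m, d) tensors of samples from the two distributions."""
    xy = torch.cat([x, y], dim=0)
    d2 = torch.cdist(xy, xy).pow(2)              # pairwise squared distances
    n = x.shape[0]
    k = sum(torch.exp(-d2 / (2.0 * h ** 2)) for h in bandwidths)
    k_xx = k[:n, :n].mean()
    k_yy = k[n:, n:].mean()
    k_xy = k[:n, n:].mean()
    return k_xx + k_yy - 2.0 * k_xy              # to be minimized by the generator
```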

In an attempt to scale up causal discovery, we proposed the Structural Agnostic Model (SAM) approach [14], [120]. Working directly on the observational data, this global approach implements a variant of the popular adversarial game [99] between a discriminator, attempting to distinguish actual samples from fake ones, and generators producing each variable given real values of all the others. A sparsity (L1) penalty forces each generator to consider only a small subset of its input variables, yielding a sparse causal graph. SAM obtains state-of-the-art performance on causal benchmarks and scales up to a few hundred variables.
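
The sketch below illustrates the general mechanism: one neural generator per variable, a gating vector selecting its candidate parents, and an L1 penalty on the gates. It is a simplified illustration under assumed design choices (gate parametrization, noise handling), not the SAM code; the discriminator and the adversarial loss are omitted.

```python
# Schematic sketch of the SAM idea: each variable is generated from the others
# through a gating vector whose L1 norm is penalized, so that each generator
# keeps only a few parents.  Purely illustrative; no discriminator shown.
import torch
import torch.nn as nn

class VariableGenerator(nn.Module):
    """Generates one variable from (masked) values of all the others plus noise."""
    def __init__(self, n_vars, hidden=32):
        super().__init__()
        self.gates = nn.Parameter(torch.ones(n_vars))      # soft parent-selection weights
        self.net = nn.Sequential(nn.Linear(n_vars + 1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, data, j):
        mask = torch.ones_like(self.gates)
        mask[j] = 0.0                                      # a variable cannot cause itself
        x = data * (self.gates * mask)                     # keep only candidate parents
        noise = torch.randn(data.shape[0], 1)              # exogenous noise
        return self.net(torch.cat([x, noise], dim=1))

def sparsity_penalty(generators, lam=0.01):
    """L1 penalty on all gates, to be added to the adversarial loss."""
    return lam * sum(g.gates.abs().sum() for g in generators)
```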

An innovative use of causal models is educational training in sensitive domains such as medicine, along the following lines. Given a causal generative model, artificial data can be generated using a marginal distribution of causes; such data enable students to test their diagnostic inference (in principle, with no misleading spurious correlations), while preventing them from reverse-engineering the artificial data and guessing the original data. Some motivating applications of causal modeling are described in Section 4.1.
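
Assuming the causal graph and structural equations are available, a minimal sketch of such artificial data generation is ancestral sampling: root causes are drawn from chosen marginals and the remaining variables are propagated through their mechanisms. The toy Age/Salary model below is purely hypothetical.

```python
# Illustrative ancestral sampling from a causal generative model: root causes
# are drawn from chosen marginal distributions, then each variable is produced
# by its structural equation given its parents plus independent noise.
import numpy as np

def sample_from_causal_model(n, parents, mechanisms, root_samplers):
    """parents: dict var -> list of parents, in topological order;
    mechanisms: dict var -> f(parent_values, noise); root_samplers: dict var -> f(n)."""
    data = {}
    for var in parents:
        if not parents[var]:
            data[var] = root_samplers[var](n)              # draw causes from their marginals
        else:
            pa = np.stack([data[p] for p in parents[var]], axis=1)
            data[var] = mechanisms[var](pa, np.random.randn(n))
    return data

# Hypothetical toy model: Age -> Salary
artificial = sample_from_causal_model(
    1000,
    parents={"Age": [], "Salary": ["Age"]},
    mechanisms={"Salary": lambda pa, eps: 1000 + 50 * pa[:, 0] + 100 * eps},
    root_samplers={"Age": lambda n: np.random.uniform(20, 65, n)},
)
```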

Explainability

Participants: Isabelle Guyon, François Landes, Marc Schoenauer, Michèle Sebag

PhD: Marc Nabhan

Causal modeling is one particular way to tackle explainability, and TAU has been involved in other initiatives toward explainable AI systems. Following the LAP (Looking At People) challenges, Isabelle Guyon and co-organizers edited a book [143] that presents a snapshot of explainable and interpretable models in the context of computer vision and machine learning. Along the same line, they propose an introduction to, and a complete survey of, the state of the art of explainability and interpretability mechanisms in the context of first impression analysis [57].

The team is also involved in the proposal for the IPL HyAIAI (Hybrid Approaches for Interpretable AI), coordinated by the LACODAM team (Rennes) and dedicated to the design of hybrid approaches that combine state-of-the-art numerical models (e.g., deep neural networks) with explainable symbolic models, in order to integrate high-level (domain) constraints into ML models, to give model designers information on the ill-performing parts of the model, and to provide understandable explanations of its results. The kickoff took place in September 2019, and we are still looking for good post-doc candidates.

Note also that the ongoing work on identifying the border of the failure zone in the parameter space of the autonomous vehicle simulator [37] (Section 7.1.3) also pertains to explainability.

Finally, a completely original approach to DNN explainability might arise from the study of structural glasses (Section 7.2.3), with a parallel to Graph Neural Networks (GNNs), which could become an excellent non-trivial example for developing explainability protocols.

Robustness of AI Systems

Participants: Guillaume Charpiat, Marc Schoenauer, Michèle Sebag

PhDs: Julien Girard, Marc Nabhan, Nizam Makdoud

Collaboration: Zakaria Chihani (CEA); Hiba Hage and Yves Tourbier (Renault); Johanne Cohen (LRI-GALAC) and Christophe Labreuche (Thalès)

As said in Section 3.1.2, TAU is considering two directions of research related to the certification of ML systems. The first direction, related to formal approaches, is the topic of Julien Girard's PhD (see also Section 3.1.2). In contrast, the second axis aims to increase the robustness of systems that can only be validated experimentally. Two paths are investigated in the team: assessing the coverage of the datasets (here, those used to train an autonomous vehicle controller), the topic of Marc Nabhan's CIFRE PhD with Renault; and detecting flaws in the system by reinforcement learning, as done in Nizam Makdoud's CIFRE PhD with Thalès THERESIS.

Formal validation of Neural Networks

The topic of provable deep neural network robustness has raised considerable interest in recent years. Most research in the literature focuses on adversarial robustness, which studies the robustness of perceptive models in the neighbourhood of particular samples; other works have proved global properties of smaller neural networks. Yet formally verifying perception remains largely uncharted, notably due to the lack of relevant properties to verify, as the distribution of possible inputs cannot be formally specified. In Julien Girard-Satabin's PhD thesis, we propose to take advantage of the simulators often used either to train machine learning models or to check them with statistical tests, a growing trend in industry. Our formulation [34] allows us to formally express and verify safety properties on perception units, covering all cases that could ever be generated by the simulator, unlike statistical tests, which only cover the examples actually seen. Along with this theoretical formulation, we provide a tool to translate deep learning models into standard logical formulae. As a proof of concept, we train a toy example mimicking the perceptive unit of an autonomous car, and we formally verify that it never fails to capture the relevant information in the provided inputs.
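
To fix ideas, the sketch below encodes a tiny hand-written ReLU network as logical formulae and checks a simple safety property with the off-the-shelf SMT solver z3; it is a generic illustration of the translation principle, with made-up weights and property, not the tool developed in the thesis.

```python
# Toy illustration of translating a ReLU network into logical formulae and
# checking a property with an SMT solver; weights and property are made up.
from z3 import Real, Solver, If, And, sat

# Hypothetical 1-hidden-unit network: h = relu(w1*x + b1), y = w2*h + b2
w1, b1, w2, b2 = 2.0, -1.0, 3.0, 0.5
x, h, y = Real("x"), Real("h"), Real("y")

s = Solver()
s.add(h == If(w1 * x + b1 >= 0, w1 * x + b1, 0))   # ReLU encoded as a conditional
s.add(y == w2 * h + b2)
s.add(And(x >= 0, x <= 1))                         # input domain (e.g. simulator range)
s.add(y > 4.0)                                     # negation of the safety property y <= 4
print("property violated" if s.check() == sat else "property holds on the domain")
```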

Experimental validation of Autonomous Vehicle Command

Statistical guarantees (e.g., less than 10⁻⁸ failures per hour of operation) are obtained by empirical tests, involving millions of kilometers of driving in all possible road, weather and traffic conditions, as well as intensive simulations, the only way to fully control the driving conditions. The validation process thus involves three steps: i) making sure that all parts of the space of possible scenarios are covered by experiments/tests at a sufficiently fine grain; ii) identifying failure zones in the space of scenarios; iii) fixing the controller flaws that resulted in these failures.
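
As a back-of-the-envelope illustration of the scale of such empirical validation (and of why simulation is needed alongside physical testing), the classical rule of three, assuming independent failure-free test hours, gives the order of magnitude of test effort required to support a 10⁻⁸ per-hour bound:

```python
# Rule of three: with zero failures observed over n independent test hours,
# the 95% upper confidence bound on the hourly failure rate is about 3 / n.
# Illustrative order-of-magnitude computation only.
target_rate = 1e-8                      # desired bound, failures per hour
hours_needed = 3.0 / target_rate        # failure-free hours for 95% confidence
print(f"~{hours_needed:.0f} failure-free test hours needed")   # ~3e8 hours
```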

TAU is collaborating with Renault on step ii) within Marc Nabhan's CIFRE PhD (defense expected in Sept. 2020). The current target scenario is the insertion of a car on a motorway, the "drosophila" of autonomous car scenarios, and the goal is to identify the conditions under which the autonomous car controller fails. Only simulations are considered here, one scenario being defined as a parameter setting of the in-house simulator SCANeR. The goal is to detect as many failures as possible while running as few simulations as possible, and to identify the borders of the failure zone with as simple a description as possible, thus allowing engineers to understand the reasons for the flaws. A first paper was published [37], proposing several approaches for the identification of failures. Ongoing work is concerned with a precise yet simple definition of the border of the failure zone.
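
One generic way to obtain such a simple, readable description of the failure border (illustrative only; neither the SCANeR interface nor the approaches of [37] are reproduced here) is to label sampled scenario parameters as pass/fail and fit an interpretable surrogate such as a shallow decision tree, whose rules delimit the failure zone:

```python
# Illustrative sketch: sample scenario parameters, label each simulation as
# pass/fail, and fit a shallow decision tree whose rules give a simple,
# human-readable description of the failure zone.  run_scenario is a
# hypothetical stand-in for the actual simulator.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def run_scenario(params):
    """Hypothetical stand-in for a simulator run: True if the controller fails."""
    speed, gap = params
    return speed > 110.0 and gap < 15.0                # toy failure zone

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(60, 130, 500),        # inserting-car speed (km/h)
                     rng.uniform(5, 60, 500)])         # gap to following car (m)
y = np.array([run_scenario(p) for p in X])

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["speed", "gap"]))   # readable border description
```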

Reinforcement Learning from Advice

In the context of his CIFRE PhD with Thalès, Nizam Makdoud tests (in simulation) physical security systems, using reinforcement learning to learn the best sequence of actions that will break through the system. This led him to propose an original approach called LEarning from Advice (LEA), which uses knowledge from several policies learned on different tasks. Whereas learning by imitation uses the actions of the known policy, the proposed method uses the Q-functions of the known policies. The main advantage of this strategy is its robustness to poor advice, as the policy then reverts to standard DDPG [127]. The results (submitted) demonstrate that LEA learns faster than DDPG when given good-enough policies, and only slightly slower when given lousy advice.
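
Since the method itself is described in the submitted paper, the snippet below is only a plausible, purely illustrative way of exploiting advisor policies on top of a DDPG agent, showing how poor advice can simply be ignored; it should not be read as the LEA algorithm.

```python
# Purely illustrative sketch (not the LEA algorithm): advisor policies are
# consulted at action-selection time, and the agent falls back to its own
# DDPG actor whenever no advisor proposal is valued better by its critic.
import torch

def select_action(state, actor, critic, advisors):
    """actor/critic: the agent's DDPG networks; advisors: list of advisor actors."""
    own_action = actor(state)
    best_action, best_value = own_action, critic(state, own_action)
    for adv_actor in advisors:
        a = adv_actor(state)
        v = critic(state, a)                 # judge the advice with the agent's own critic
        if v > best_value:                   # poor advice is simply ignored,
            best_action, best_value = a, v   # reverting to standard DDPG behaviour
    return best_action
```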

Learning Multi-Criteria Decision Aids (Hierarchical Choquet Models)

In collaboration with Johanne Cohen (LRI-GALAC) and Christophe Labreuche (Thalès), the representation and data-driven elicitation of hierarchical Choquet models has been tackled. A specific neural architecture, enforcing the model constraints (monotonicity, additivity) by design and supporting the end-to-end training of the multi-criteria decision aid, has been proposed in Roman Bresson's PhD. Under mild assumptions, an identifiability result (existence and uniqueness of the sought model in the neural space) is obtained. The approach is empirically validated and successfully compared to the state of the art.
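
For illustration only (this is not the architecture of the PhD), the layer below shows one way of enforcing monotonicity and normalization by construction in a differentiable Choquet aggregation: a 2-additive capacity is restricted to non-negative Möbius masses summing to one, which is sufficient (though not necessary) for monotonicity.

```python
# Illustrative 2-additive Choquet aggregation layer with monotonicity and
# normalization enforced by construction (non-negative, normalized Moebius
# masses); not the neural architecture proposed in the PhD.
import torch
import torch.nn as nn
from itertools import combinations

class Choquet2Additive(nn.Module):
    def __init__(self, n_criteria):
        super().__init__()
        self.subsets = [[i] for i in range(n_criteria)] + \
                       [list(c) for c in combinations(range(n_criteria), 2)]
        self.logits = nn.Parameter(torch.zeros(len(self.subsets)))

    def forward(self, u):
        """u: (batch, n_criteria) tensor of criterion utilities in [0, 1]."""
        m = torch.softmax(self.logits, dim=0)              # Moebius masses >= 0, summing to 1
        mins = torch.stack([u[:, s].min(dim=1).values      # min of utilities over each subset
                            for s in self.subsets], dim=1)
        return mins @ m                                     # aggregated score, stays in [0, 1]
```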