Section: New Results
Generative Models and Data-driven Design
Learning generative models from observational data faces two critical issues: model selection (defining a loss criterion well suited to the considered distribution space) and tractable optimization.
A Statistical Physics Perspective
Restricted Boltzmann machines (RBMs) define generative models, and advanced mean-field methods of statistical physics can be leveraged to analyze their learning dynamics. Giancarlo Fissore's Master's thesis (he is now a PhD student), co-supervised by Aurélien Decelle and Cyril Furtlehner, has characterized the information content of an RBM from its spectral properties and derived a phenomenological equation of the learning process by means of the spectral dynamics of the weight matrix [5]. The learning dynamics have been analyzed in both linear and non-linear regimes, investigating the impact of the input data.
In a second step [37], the weight matrix ensemble resulting from this spectral representation is used to analyze the thermodynamic properties of RBMs in terms of a phase diagram. The conditions for RBM training, interpreted as a so-called ferromagnetic compositional phase, are given. Ferromagnetic order parameters are identified in the aforementioned phenomenological equation; a closed form is obtained through explicit integration in simple cases, yielding spectral learning dynamics that match the actual dynamics of standard RBM training (e.g. using contrastive divergence). According to this model, a repulsive interaction takes place among the singular modes of the weight matrix, as the lower modes exert some pressure on the higher modes during training. Remarkably, this repulsive interaction is observed in algorithmic experiments for low learning rates.
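For reference, the objects involved in this spectral analysis can be written down in standard form. The notation below is an assumption made for illustration (visible units s_i, hidden units σ_j, biases a_i and b_j) and is not necessarily that of [5]; it only recalls the usual RBM energy and the singular value decomposition of the weight matrix, whose singular values w_α are the quantities followed along training.

```latex
% Standard RBM energy (illustrative notation, not necessarily that of [5])
E(s, \sigma) = -\sum_{i,j} s_i W_{ij} \sigma_j - \sum_i a_i s_i - \sum_j b_j \sigma_j

% Singular value decomposition of the weight matrix:
% w_\alpha are the singular values, u^\alpha and v^\alpha the left and right singular vectors
W_{ij} = \sum_{\alpha} w_\alpha \, u_i^{\alpha} \, v_j^{\alpha}
```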
Functional Brain Dynamics
Generative models have also been used by Aurélien Decelle and Cyril Furtlehner to model the functional connectome dynamics (FCD) in the context of the BRAINTIME exploratory project, along two lines.
On the one hand, Restricted Boltzmann Machines have been used to learn the statistics of the time-varying resting state BOLD activity of 49 human subjects aged 18 to 80. RBM models trained on a per-individual basis show at least two statistically distinct pure states for each subject, between which the resting state activity wanders stochastically. Through mean-field TAP approximations of the free energy, we have evaluated the energy barrier between these two states for each individual. Interestingly, young and old individuals have different switching statistics: more regular for young subjects vs bursty and temporally irregular for elderly subjects. Furthermore, the switching probability is correlated with the energy difference between the two pure RBM states, opening the way to a personalized “landscape” analysis of the resting state FCD.
On the other hand, extremely sparse precision matrices describing the co-activation statistics of different brain regions during resting state based on BOLD time-series have been derived using sparse Gaussian copula models. Such extremely sparse models support direct inter-subject comparisons, in contrast with usually dense FC descriptions. A further step is to characterize the brain activity dynamics, e.g. through considering multi-temporal slice models.
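As an illustration of the first step of a Gaussian copula model, the sketch below maps each time series to Gaussian scores through its empirical ranks, so that dependence is measured independently of the marginal distributions. The function names and the toy series are hypothetical; the sparse estimation of the precision matrix itself is not shown, and nothing here is claimed to reproduce the procedure actually used in the project.

```python
from statistics import NormalDist


def normal_scores(series):
    """Map a 1-D series to Gaussian scores through its empirical ranks."""
    n = len(series)
    # 0-based rank of each value; ties are broken by position (a simplification).
    order = sorted(range(n), key=lambda i: series[i])
    ranks = [0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    std_normal = NormalDist()
    # (rank + 0.5) / n keeps the probabilities strictly inside (0, 1).
    return [std_normal.inv_cdf((r + 0.5) / n) for r in ranks]


def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


# Toy signals for two regions (made-up values); region_a has a heavy outlier.
region_a = [0.1, 0.4, 0.35, 0.8, 5.0, 0.6]
region_b = [1.0, 2.1, 1.9, 3.0, 3.2, 2.5]

z_a = normal_scores(region_a)
z_b = normal_scores(region_b)
latent_corr = correlation(z_a, z_b)
```

Because the two toy series are ordered identically, their Gaussian scores coincide and the latent correlation is maximal despite the outlier in `region_a`. A sparse estimator applied to the matrix of such latent correlations would then yield the precision matrix.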
Power Systems Design and Optimization
As the last work within the POST project, Vincent Berthier's PhD [2] addressed issues in global continuous optimization and proposed using unit commitment problems to go beyond classical benchmarks of analytical functions.
Benjamin Donnot (RTE Cifre PhD, now under Isabelle Guyon's supervision) has successfully started to disseminate his work in the power system community [20]. His main results concern the design of an original alternative to the one-hot encoding of the topology of the French power grid, termed Guided Dropout. Taking advantage of the high redundancy of network connections, the idea is to learn a random mapping between all possible "n-1" topologies and the connections of the neurons [65], [42].
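The idea of tying topologies to neuron connections rather than to a one-hot input can be pictured with a deliberately simplified sketch: each "n-1" topology is assigned a fixed random binary mask over the hidden units, and the mask selects which units are active when that topology is presented. This is only one possible reading of the description above, not the architecture of [65], [42]; the function names, the mask density and the seed are all assumptions.

```python
import random


def make_topology_masks(n_topologies, n_hidden, keep_prob, seed=0):
    """Draw one fixed random binary mask over hidden units per topology (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < keep_prob else 0 for _ in range(n_hidden)]
        for _ in range(n_topologies)
    ]


def apply_mask(hidden_activations, mask):
    """Silence the hidden units that are not associated with the current topology."""
    return [h * m for h, m in zip(hidden_activations, mask)]


# Four hypothetical "n-1" topologies sharing a hidden layer of eight units.
masks = make_topology_masks(n_topologies=4, n_hidden=8, keep_prob=0.5, seed=42)
hidden = [0.3, -1.2, 0.7, 0.0, 2.1, -0.4, 0.9, 1.5]
masked = apply_mask(hidden, masks[2])
```

The masks are drawn once and kept fixed, so the same topology always activates the same sub-network, while the weights of the shared network are what training would adjust.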
Multi-Objective Optimization
Dynamic Objectives: Within the E-Lucid project, coordinated by Thales Communications & Security, the on-going work on anomaly detection in network flows [74] led to an original approach to many-objective problems, in which the objectives are gradually introduced, preventing the population from being abruptly driven toward satisficing only the easy objectives at the beginning of the evolution [27] (runner-up for the Best Paper Award of the Evolutionary Multi-Objective track at GECCO 2017).
Dynamic Fitness Cases: In [22], we propose to gradually introduce the fitness cases in symbolic regression with Genetic Programming, so as to guide the search more smoothly. Experimental results demonstrate a better success rate on both static and dynamic problems.
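A minimal sketch of gradual objective introduction is given below, assuming a fixed linear schedule (one more objective every few generations) and minimization of all objectives. The schedule, the order in which objectives enter, and the function names are assumptions for illustration and are not taken from [27].

```python
def active_objective_count(generation, total_objectives, generations_per_objective):
    """Number of objectives taken into account at a given generation (assumed schedule)."""
    return min(total_objectives, 1 + generation // generations_per_objective)


def dominates(a, b, n_active):
    """True if a Pareto-dominates b on the first n_active objectives (minimization)."""
    a_act, b_act = a[:n_active], b[:n_active]
    no_worse = all(x <= y for x, y in zip(a_act, b_act))
    strictly_better = any(x < y for x, y in zip(a_act, b_act))
    return no_worse and strictly_better


def non_dominated(population, n_active):
    """Individuals not dominated by any other one on the active objectives."""
    return [
        p for p in population
        if not any(dominates(q, p, n_active) for q in population)
    ]


# Objective vectors of three hypothetical individuals (three objectives each).
population = [(1.0, 5.0, 9.0), (2.0, 1.0, 3.0), (3.0, 4.0, 1.0)]

fronts = {}
for generation in (0, 10, 20):
    k = active_objective_count(generation, total_objectives=3, generations_per_objective=10)
    fronts[generation] = non_dominated(population, k)
```

In this toy run the non-dominated set grows as objectives are added: an individual that looks poor on the first objectives is kept once a later objective on which it excels becomes active.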
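The gradual introduction of fitness cases can be sketched in the same spirit. The linear schedule, the mean squared error, and the toy target below are assumptions made for illustration; [22] may use a different schedule and error measure.

```python
def active_cases(fitness_cases, generation, initial, step):
    """Fitness cases visible at a given generation (assumed linear schedule, initial >= 1)."""
    n = min(len(fitness_cases), initial + step * generation)
    return fitness_cases[:n]


def mean_squared_error(candidate, cases):
    """Error of a candidate expression on the currently active fitness cases."""
    return sum((candidate(x) - y) ** 2 for x, y in cases) / len(cases)


# Seven fitness cases sampled from a toy target x^2 + x.
cases = [(x, x * x + x) for x in range(-3, 4)]


def candidate(x):
    """A hypothetical evolved expression, missing the linear term."""
    return x * x


early_error = mean_squared_error(candidate, active_cases(cases, generation=0, initial=2, step=1))
late_error = mean_squared_error(candidate, active_cases(cases, generation=5, initial=2, step=1))
```

Early generations are thus evaluated on a small subset of cases, and the full set is only reached once the schedule has run its course.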
Space Weather Forecasting
In the context of the MDG-TAU joint team project, focusing on space weather forecasting, Mhamed Hajaiej's Master's thesis (under Aurélien Decelle, Cyril Furtlehner and Michèle Sebag's supervision) has tackled the prediction of magnetic storms from solar magnetograms, more specifically considering the representation of solar magnetograms based on auto-encoders. Besides finding a well-suited neural network architecture, the difficulty was to find a loss function well suited to the data distribution. A next step (Mandar Chandorkar's PhD at CWI under Enrico Camporeale's supervision) is to estimate from the solar images the speed of the solar wind, and the time needed for solar storms to reach the first Lagrange point; this estimation is meant to build a well-defined supervised learning problem, associating a solar image taken at a given time with its effect measured at a later time at the first Lagrange point.
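The construction of such a supervised dataset can be sketched as follows, under the strong simplifying assumption that the propagation delay is known and constant, whereas in the work described above this delay is itself to be estimated. The function name, the integer time stamps and the measurement values are hypothetical.

```python
def build_pairs(images, measurements, delay):
    """Associate each image at time t with the measurement at time t + delay.

    images: dict mapping a time stamp to an image (any object).
    measurements: dict mapping a time stamp to a value measured at the first Lagrange point.
    delay: assumed propagation time, in the same unit as the time stamps.
    """
    pairs = []
    for t in sorted(images):
        target_time = t + delay
        # Images without a matching later measurement are skipped.
        if target_time in measurements:
            pairs.append((images[t], measurements[target_time]))
    return pairs


# Placeholder images and made-up measurement values, indexed by integer time steps.
images = {0: "img0", 1: "img1", 2: "img2"}
measurements = {2: 410.0, 3: 455.0}
dataset = build_pairs(images, measurements, delay=2)
```

With a variable delay, the fixed `delay` argument would be replaced by a per-image estimate of the propagation time.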