Section: New Results

Continuous Optimization

Participants : Yohei Akimoto, Anne Auger, Zyed Bouzarkouna, Alexandre Chotard, Nikolaus Hansen, Ilya Loshchilov, Verena Heidrich-Meisner, Raymond Ros, Marc Schoenauer, Olivier Teytaud, Fabien Teytaud.

Our main expertise in continuous optimization is in stochastic search algorithms. We address theory, algorithm design and applications. The methods we investigate are adaptive techniques that iteratively learn the parameters of the distribution used to sample solutions. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is nowadays one of the most powerful methods for continuous optimization without derivatives. We work on different variants of CMA-ES to improve it in various contexts, as described below. In addition, we have contributed an information-geometric perspective on stochastic optimization that unifies continuous and discrete algorithms through families of probability distributions parametrized by continuous parameters. The framework proposed in this context allows one to recover many existing stochastic optimization algorithms when it is instantiated with different families of probability distributions, including CMA-ES when using Gaussian distributions [85]. We have moreover clarified important design principles based on invariances [85], [11].
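The core idea of these distribution-based methods can be illustrated with a deliberately minimal sketch: sample from a parametrized family (here an isotropic Gaussian), rank the samples, and move the distribution parameters toward the better-ranked ones. This toy code is only an illustration of that principle under simplifying assumptions — real CMA-ES also adapts a full covariance matrix and the step-size.

```python
import numpy as np

def gaussian_search(f, m, sigma, iters=300, lam=20, seed=0):
    """Toy rank-based search with an isotropic Gaussian sampling
    distribution: only the mean is adapted from ranked samples,
    and the step-size follows a crude fixed decay (assumption for
    illustration; not the CMA-ES adaptation mechanisms)."""
    rng = np.random.default_rng(seed)
    mu = lam // 2
    # log-rank recombination weights for the best mu samples
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    for _ in range(iters):
        z = rng.standard_normal((lam, m.size))
        x = m + sigma * z
        order = np.argsort([f(xi) for xi in x])
        # move the mean toward the best-ranked samples
        m = m + sigma * (w @ z[order[:mu]])
        sigma *= 0.99  # crude decay instead of real step-size adaptation
    return m

sphere = lambda v: float(np.dot(v, v))
m_final = gaussian_search(sphere, m=np.ones(5) * 3.0, sigma=1.0)
```

Note that the update only uses the ranking of the sampled points, never the fitness values themselves — the invariance to monotone transformations of the objective mentioned above.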

New algorithms based on derandomization and for mixed-integer optimization

A new variant of CMA-ES has been designed to address problems with mixed-integer variables (vectors with both discrete and continuous components) [81]. New algorithms combining new selection schemes with derandomization have been designed and thoroughly investigated, both theoretically and empirically [16], [17]. A local search algorithm using adaptive coordinate descent has been proposed [45]. We have also investigated how to inject solutions into CMA-ES so as to improve performance when an oracle provides good solutions [82].
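The core difficulty of the mixed-integer setting — sampling integer coordinates from a continuous distribution — can be shown with a naive sketch that simply rounds the integer coordinates after sampling. This is an assumption-laden illustration only; the variant of [81] is more refined, e.g. handling the fact that plain rounding can stall the search when the step-size becomes small relative to the integer grid.

```python
import numpy as np

def sample_mixed(m, sigma, int_mask, lam, rng):
    """Naive mixed-integer sampling: draw lam points from a
    continuous Gaussian around mean m and round the coordinates
    flagged as integer in int_mask (hypothetical toy code)."""
    x = m + sigma * rng.standard_normal((lam, m.size))
    x[:, int_mask] = np.round(x[:, int_mask])
    return x

rng = np.random.default_rng(1)
# coordinates 0 and 2 are integer-valued, 1 and 3 continuous
pop = sample_mixed(np.zeros(4), 1.0, np.array([True, False, True, False]), 6, rng)
```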

Distributed optimization

We have proposed simple modifications of evolutionary algorithms so that they asymptotically reach the optimal log(λ) speed-up with λ processors, and the linear speed-up Θ(λ) when the number of processors is at most of the order of the dimensionality of the problem (for a pointwise solution); in particular, bounding the selected population size in the self-adaptation algorithm, and variants of this idea for speeding up the decrease of the step-size in a relevant manner. All these works and more are part of Olivier Teytaud's HDR thesis [3], defended on April 22, and also form the first part of Fabien Teytaud's PhD [2], defended on December 8. We wrote a chapter on lower bounds for distributed derivative-free optimization in [79].
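The log(λ) speed-up can be observed in a toy experiment: in a (1,λ)-ES, each iteration evaluates λ offspring, i.e. one parallel round on λ processors, and the number of such rounds needed to reach a target shrinks only logarithmically in λ. The sketch below uses an idealized step-size proportional to the known distance to the optimum (an assumption only a toy on the sphere can afford); it does not reproduce the modified algorithms discussed above.

```python
import numpy as np

def iterations_to_target(lam, dim=10, target=1e-6, seed=0):
    """(1,lambda)-ES on the sphere function with an idealized
    step-size (uses the known optimum at 0, for illustration).
    Returns the number of iterations, i.e. parallel rounds on
    lambda processors, needed to reach f(m) < target."""
    rng = np.random.default_rng(seed)
    m = np.ones(dim)
    for it in range(1, 100001):
        sigma = np.linalg.norm(m) / dim  # idealized step-size control
        x = m + sigma * rng.standard_normal((lam, dim))
        m = x[np.argmin((x ** 2).sum(axis=1))]  # comma selection: best offspring
        if (m ** 2).sum() < target:
            return it
    return None

it_small, it_large = iterations_to_target(2), iterations_to_target(64)
```

With λ = 64 the run needs far fewer parallel rounds than with λ = 2, but far less than 32 times fewer — the speed-up saturates logarithmically, which is why the modifications discussed above are needed to do better.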


Benchmarking

We have continued our effort to improve standards in benchmarking and pursued the development of COCO - COmparing Continuous Optimizers, a benchmarking platform [80]. We are organizing the Black-Box Optimization Benchmarking Workshop at the GECCO 2012 conference (see http://coco.gforge.inria.fr/doku.php?id=bbob-2012 ).

Optimization with meta-models and surrogates

We have investigated optimization using a coupling of CMA-ES with surrogates and applied it to the optimization of well placement [6]. We have proposed a new meta-model CMA-ES for the optimization of partially separable functions [19] and shown that it improves performance on the well placement problem [19].
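The generic idea of a meta-model — fit a cheap model to already-evaluated points and use its predictions to rank candidates, saving expensive evaluations — can be sketched with a simple least-squares surrogate. This is an illustrative assumption, not the meta-model machinery of [6], [19], which is considerably more elaborate (e.g. locally weighted models and the exploitation of partial separability).

```python
import numpy as np

def surrogate_rank(archive_x, archive_f, candidates):
    """Rank candidate points by a least-squares surrogate that is
    quadratic in each coordinate, fitted to already-evaluated
    points (hypothetical toy model). Returns candidate indices,
    best predicted first."""
    # features per point: [1, x_1..x_d, x_1^2..x_d^2]
    phi = lambda X: np.hstack([np.ones((len(X), 1)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(phi(archive_x), archive_f, rcond=None)
    pred = phi(candidates) @ coef
    return np.argsort(pred)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, (30, 2))          # evaluated archive
fX = (X ** 2).sum(axis=1)                # expensive function: sphere
C = rng.uniform(-2, 2, (8, 2))           # unevaluated candidates
order = surrogate_rank(X, fX, C)
```

Only the candidates ranked best by the surrogate would then be sent to the expensive simulator, which is the point of the approach for costly problems such as well placement.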

Hyperparameter optimization

In [18] we present hyper-parameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion. The sequential algorithms are applied to the most difficult DBN learning problems and find significantly better results than the best previously reported.
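The random-search baseline used in [18] is simple to state: draw each hyper-parameter configuration independently from the search space and keep the best. The sketch below shows only this baseline, with a hypothetical toy objective standing in for a validation loss; the sequential methods of [18], which choose the next configuration by an expected-improvement criterion, are not reproduced here.

```python
import random

def random_search(objective, space, budget, seed=0):
    """Random search over a hyper-parameter space given as a dict of
    name -> sampling function; returns the best configuration found
    and its objective value (smaller is better)."""
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(budget):
        cfg = {name: sample(rng) for name, sample in space.items()}
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# hypothetical toy objective standing in for a validation loss
space = {
    "lr": lambda r: 10 ** r.uniform(-5, -1),  # log-uniform learning rate
    "layers": lambda r: r.randint(1, 4),
}
obj = lambda c: (c["lr"] - 0.01) ** 2 + 0.1 * abs(c["layers"] - 2)
cfg, val = random_search(obj, space, budget=200)
```

Sampling the learning rate log-uniformly rather than uniformly is the kind of search-space design choice that matters in practice: it spends the budget evenly across orders of magnitude.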

Multi-objective optimization

We have theoretically investigated multi-objective algorithms based on the hypervolume indicator [4] and proposed new selection operators based on tournaments and the multi-armed bandit framework [46].
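For two objectives, the hypervolume indicator underlying this kind of selection can be computed with a simple sweep: sort the points by the first objective and accumulate the rectangular slices dominated by successive non-dominated points. A minimal sketch (minimization, 2-D only; not the selection schemes studied in [4], [46]):

```python
def hypervolume_2d(points, ref):
    """Area dominated by a 2-D point set with respect to a
    reference point ref, under minimization. Points that are
    dominated contribute nothing and are skipped by the sweep."""
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in sorted(points):
        if f2 < best_f2:
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

# three mutually non-dominated points against reference (4, 4)
hv = hypervolume_2d([(1, 3), (2, 2), (3, 1)], (4, 4))
```

Hypervolume-based selection then removes, among the candidates, the point whose deletion loses the least hypervolume; in higher dimension the exact computation becomes expensive, which is part of what makes its theoretical analysis interesting.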

Multimodal optimization

In [58] we have shown that a simple algorithm (a (1+1)-ES with quasi-random restarts and a murder operator) performs, at least in the sequential case, as efficiently as much more intricate algorithms.
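A sketch in the spirit of that simplicity: a (1+1)-ES with the classical 1/5th success rule, restarted from several initial points. The quasi-random choice of restart points and the "murder" operator (killing hopeless runs early) of [58] are not reproduced here; the restart points below are plain random, as an assumption for illustration.

```python
import numpy as np

def one_plus_one_es_restarts(f, dim, restarts=5, iters=400, seed=0):
    """(1+1)-ES with 1/5th-rule step-size control, restarted from
    random initial points in [-4, 4]^dim; returns the best point
    and value found over all restarts."""
    rng = np.random.default_rng(seed)
    best_x, best_f = None, float("inf")
    for _ in range(restarts):
        x = rng.uniform(-4, 4, dim)       # plain random restart point
        fx, sigma = f(x), 1.0
        for _ in range(iters):
            y = x + sigma * rng.standard_normal(dim)
            fy = f(y)
            if fy <= fx:                  # elitist (plus) selection
                x, fx = y, fy
                sigma *= 1.5              # 1/5th rule: grow on success...
            else:
                sigma *= 1.5 ** -0.25     # ...shrink slowly on failure
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

sphere = lambda v: float((v ** 2).sum())
x_best, f_best = one_plus_one_es_restarts(sphere, dim=3)
```

On a multimodal function, the restarts give each basin of attraction a chance; the point of [58] is that, sequentially, this simple scheme is already hard to beat.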

Mathematical bounds for noisy optimization

The paper [27] gives upper and lower bounds, together with experiments, for algorithms in the noisy optimization setting; in particular, we compared an optimization algorithm based on bandits with a surrogate-model version: the bandit approach is much faster if the noise decreases quickly to zero around the optimum, whereas the surrogate-model version is faster if the noise does not decrease to zero.
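The basic device behind bandit-style noisy optimization is resampling: spend repeated evaluations on candidate points so that their averaged values can be compared reliably despite the noise. A minimal sketch of that comparison step (a fixed number of repetitions, as a simplifying assumption; the algorithms compared in [27] allocate evaluations far more carefully):

```python
import numpy as np

def noisy_compare(f, x, y, reps, rng):
    """Compare two candidates under evaluation noise by averaging
    reps independent evaluations of each; returns the apparent
    winner. More repetitions shrink the standard error of the
    averages by a factor 1/sqrt(reps)."""
    fx = np.mean([f(x, rng) for _ in range(reps)])
    fy = np.mean([f(y, rng) for _ in range(reps)])
    return x if fx <= fy else y

# toy noisy sphere: the noise does not vanish at the optimum,
# i.e. the regime where the surrogate-model version wins in [27]
noisy = lambda v, rng: float((v ** 2).sum()) + rng.normal(0.0, 0.5)
rng = np.random.default_rng(0)
winner = noisy_compare(noisy, np.zeros(2), np.ones(2), reps=100, rng=rng)
```

When the noise does not vanish near the optimum, ever more repetitions are needed to separate ever closer candidates, which is why averaging over a surrogate model of the landscape becomes the faster option in that regime.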