Section: New Results

Optimal Decision Making under Uncertainty

The Tao UCT-SIG works mainly on mathematical programming tools for power systems. In particular, we advocate a data science approach in order to reduce the model error, which in most cases is much more critical than the optimization error. Real data are the best way to handle uncertainties. Our main results in 2016 are as follows:

  • Noisy optimization In the context of stochastic uncertainties, noisy optimization handles the model error by simulation-based optimization. Our results include:

    • It has been conjectured that gradient approximation by finite differences (hence, not a comparison-based method) is necessary for reaching a simple regret of O(1/N). We answer this conjecture in the negative [32] by providing a comparison-based algorithm that matches gradient methods, i.e. reaches O(1/N), under the condition, however, that the noise is Gaussian.

    • The concept of Regret is widely used in the bandit literature for assessing the performance of an algorithm. The same concept is also used in the framework of optimization algorithms, sometimes under other names or without a specific name. Our experimental results on the noisy sphere function show that the approximation of Simple Regret used in some optimization testbeds, termed Approximate Simple Regret, fails to estimate the convergence rate of the Simple Regret. We propose a new approximation of Simple Regret, the Robust Simple Regret [22].
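
As an illustration of the quantity discussed above, the following minimal sketch computes the Simple Regret of a recommended point on a noisy sphere function. The optimizer `toy_random_search`, the noise level `sigma`, the search domain and the resampling count are hypothetical choices made for this example only; they are not the algorithms of [32] or [22], and the Approximate and Robust Simple Regret are not reproduced here.

```python
import random

def sphere(x):
    """Noise-free objective; its minimum value 0 is attained at the origin."""
    return sum(v * v for v in x)

def noisy_sphere(x, rng, sigma=1.0):
    """One noisy evaluation: sphere value plus additive Gaussian noise (assumed model)."""
    return sphere(x) + rng.gauss(0.0, sigma)

def simple_regret(recommendation):
    """Noise-free loss of the recommended point minus the optimal value (0 here)."""
    return sphere(recommendation) - 0.0

def toy_random_search(budget, dim, rng, resamplings=10):
    """Hypothetical baseline: sample points uniformly in [-1, 1]^dim and
    recommend the one with the lowest averaged noisy value."""
    best, best_value = None, float("inf")
    for _ in range(budget // resamplings):
        candidate = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        value = sum(noisy_sphere(candidate, rng) for _ in range(resamplings)) / resamplings
        if value < best_value:
            best, best_value = candidate, value
    return best

rng = random.Random(0)
recommendation = toy_random_search(budget=1000, dim=2, rng=rng)
regret = simple_regret(recommendation)
```

The regret is computed from the noise-free function at the recommendation, which is only possible on a testbed where the optimum is known; this is why testbeds resort to approximations of it.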

  • Capacity Expansion Planning The optimization of capacities in large scale power systems is a stochastic problem, because the need for storage and connections (i.e. exchange capacities) varies considerably from one week or season to another. It is usually tackled through sample average approximation, i.e. by assuming that the system which is optimal on average over the last 40 years (corrected for climate change) is also approximately optimal in general. However, in many cases the data are high-dimensional: the sample complexity, i.e. the amount of data necessary for a relevant optimization of capacities, increases linearly with the number of parameters, and such data can be scarce at the relevant scale. This leads to an underestimation of capacities. We suggested the use of bias correction in capacity estimation, and investigated the importance of the bias phenomenon and the efficiency of both standard and original bias correction tools [53].
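
The optimistic bias of sample average approximation can be seen on a toy problem. The sketch below is an assumption-laden illustration, not the model or the correction tools of [53]: a single capacity is chosen to minimize a hypothetical cost (a unit cost per capacity plus a shortage penalty), demand is drawn from an assumed Gaussian distribution, and the cost measured on the few samples used for optimization is compared with the cost on fresh data.

```python
import random

def cost(capacity, demand, unit_cost=1.0, shortage_penalty=5.0):
    """Hypothetical cost: pay for installed capacity, plus a penalty on unmet demand."""
    return unit_cost * capacity + shortage_penalty * max(demand - capacity, 0.0)

def average_cost(capacity, demands):
    """Average cost of a capacity over a list of demand scenarios."""
    return sum(cost(capacity, d) for d in demands) / len(demands)

def saa_capacity(demands, candidates):
    """Sample average approximation: best candidate on the available sample."""
    return min(candidates, key=lambda c: average_cost(c, demands))

def bias_experiment(n_train=5, n_test=2000, repeats=500, seed=0):
    """Return (mean in-sample cost, mean out-of-sample cost) of SAA solutions."""
    rng = random.Random(seed)
    candidates = [0.5 * i for i in range(41)]              # capacities 0.0 .. 20.0
    test = [rng.gauss(10.0, 3.0) for _ in range(n_test)]   # assumed demand law
    in_sample, out_of_sample = [], []
    for _ in range(repeats):
        train = [rng.gauss(10.0, 3.0) for _ in range(n_train)]
        capacity = saa_capacity(train, candidates)
        in_sample.append(average_cost(capacity, train))
        out_of_sample.append(average_cost(capacity, test))
    return sum(in_sample) / repeats, sum(out_of_sample) / repeats

in_sample, out_of_sample = bias_experiment()
```

With few scenarios per optimization, the in-sample cost is on average lower than the cost on fresh data, which is the generic reason why estimates derived from the optimized sample call for a correction.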

  • Multi-armed bandits We studied the problem of sequential decision making in the context of multi-armed bandits. We provided:

    • An algorithm for a non-stationary formulation of the stochastic multi-armed bandit, in which the rewards are not assumed to be identically distributed. It achieves both competitive regret and competitive sampling complexity against a best sequence of arms. See [61].

    • An algorithm to handle the task of recommending items (actions) to users sequentially interacting with a recommender system. Users are modeled as latent mixtures of C representative user classes, where each class specifies a mean reward profile across actions. Both the user features (mixture distribution over classes) and the item features (mean reward vector per class) are unknown a priori. The user identity is the only contextual information available to the learner while interacting. This induces a low-rank structure on the matrix of expected rewards from recommending item a to user b. The problem reduces to the well-known linear bandit when either the user-side or the item-side features are perfectly known. In the setting where each user, with its stochastically sampled taste profile, interacts only for a small number of sessions, we develop a bandit algorithm for the two-sided uncertainty. It combines the Robust Tensor Power Method with the OFUL linear bandit algorithm. We provide the first rigorous regret analysis of this combination. See [63].
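
The low-rank structure mentioned above can be checked on a small example. In this sketch, all numbers, the helper names `expected_rewards` and `rank`, and the choice of C = 2 classes are hypothetical; it only illustrates that mixing C class profiles yields an expected-reward matrix of rank at most C, and does not implement the bandit algorithm of [63].

```python
from fractions import Fraction as F

def expected_rewards(user_mixtures, class_profiles):
    """R[b][a] = sum over classes c of user_mixtures[b][c] * class_profiles[c][a]."""
    n_items = len(class_profiles[0])
    return [[sum(w * profile[a] for w, profile in zip(weights, class_profiles))
             for a in range(n_items)]
            for weights in user_mixtures]

def rank(matrix):
    """Exact rank by Gaussian elimination over rational numbers."""
    rows = [[F(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# Hypothetical data: C = 2 classes, 4 items, 5 users.
class_profiles = [
    [F(9, 10), F(1, 10), F(1, 2), F(3, 10)],   # mean rewards of class 0
    [F(1, 5), F(4, 5), F(2, 5), F(7, 10)],     # mean rewards of class 1
]
user_mixtures = [
    [F(1), F(0)], [F(0), F(1)], [F(1, 2), F(1, 2)],
    [F(1, 4), F(3, 4)], [F(3, 4), F(1, 4)],
]
R = expected_rewards(user_mixtures, class_profiles)
reward_rank = rank(R)
```

Exact fractions are used so that the rank is not affected by floating-point rounding. If either factor were known, each row (or column) of R would be a known linear function of the unknown one, which is the linear bandit case.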

  • Confidence intervals for streaming data We consider, in a generic streaming regression setting, the problem of building a confidence interval (and distribution) on the next observation based on past observed data. The observations may depend arbitrarily on past observations, and they come from an external filtering process, which makes the number of observations itself a random stopping time. In this challenging context, we provide confidence intervals based on self-normalized vector-valued martingale techniques, applied to the estimation of the mean and of the variance. See [69].

  • Forecasting tool for Hydraulic networks We studied a prediction problem arising in the monitoring of a hydraulic network by the French company Prolog-ingenierie. The problem is to predict the value of a specific sensor in the next thirty minutes from the activity of the network (values of all other sensors) in the recent past. We designed a simple tool for that purpose, based on random forests. The tool has been tested on data generated from the activity recorded on the Parisian hydraulic network in 2010, 2011 and 2013.
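
One plausible way to cast this prediction task as supervised learning is sketched below. It is an assumption about the setup, not a description of the actual tool: the function `make_supervised`, the number of lagged steps `n_lags` and the forecast `horizon` (in time steps, since the sampling interval is not stated here) are hypothetical. Each input row gathers the recent values of all sensors except the target, and the output is the target sensor value `horizon` steps later.

```python
def make_supervised(readings, target, n_lags, horizon):
    """Build (X, y) from a time-ordered list of rows, one row of sensor values per step.

    X[i] holds the values of all sensors except `target` over the last `n_lags` steps;
    y[i] is the value of sensor `target` taken `horizon` steps after that window.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(readings) - horizon):
        window = readings[t - n_lags + 1 : t + 1]
        features = [value
                    for row in window
                    for sensor, value in enumerate(row)
                    if sensor != target]
        X.append(features)
        y.append(readings[t + horizon][target])
    return X, y

# Tiny synthetic example: sensor 0 counts steps, sensor 1 (the target) is 10 times that.
readings = [[t, 10 * t] for t in range(6)]
X, y = make_supervised(readings, target=1, n_lags=2, horizon=3)
```

The resulting `X` and `y` could then be passed to any random forest regressor, for instance scikit-learn's `RandomForestRegressor`; the features and implementation used in the actual tool may differ.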