Section: Overall Objectives

Context and overall goal of the project

Building upon the expertise in machine learning (ML) and stochastic optimization of the late TAO project-team, the TAU team aims to tackle the vagueness of the purposes of Big Data. Based on the claim that (sufficiently) big data can to some extent compensate for the lack of knowledge, Big Data is expected to fulfill all the commitments of Artificial Intelligence. This makes Big Data under-specified in three respects:

  • A first source of under-specification is related to common sense, and the gap between observation and interpretation. The acquired data do not report on "obvious" issues; yet what is obvious to a human is not necessarily obvious to the computer. Providing the machine with common sense is a many-faceted, long-standing AI challenge. A current challenge is to interpret the data and cope with its blind zones.

  • A second source of under-specification regards the steering of a Big Data system. Such systems commonly require constant learning in order to deal with open environments and users with diverse profiles, expertise and expectations. A Big Data system is thus a dynamic process, whose behavior will depend cumulatively on its future environment. The question is how to control such a lifelong learning system.

  • A third source of under-specification regards the social acceptability of Big Data. There is little doubt that Big Data can pave the way for Big Brother, and ruin the social contract by modeling benefits and costs at the individual level. What are the fair trade-offs between safety, freedom and efficiency? We do not know the answers. A first practical and scientific challenge is to assess the fairness of a solution.

To tackle these under-specified issues in Big Data, TAU currently relies on four core research dimensions, drawing inspiration and validation from four main application domains. These research dimensions are Causal Modelling (required to support prescriptive Big Data), Deep Learning (related to constructive representations, and their compositionality), Optimization and Meta-Optimization (including sequential decision making and categorization of problems), and Big-Data Driven Design. The application domains include the long-standing domains of Energy Management and High Energy Physics, the more recent focus of TAO/TAU on Computational Social and Economic Sciences, and, new this year, the Autonomous Vehicle and Population Genetics.