Section: Overall Objectives
Ubiquitous data collection is providing our society with tremendous volumes of data about human, environmental and industrial activity. These ever-increasing volumes of collected data hold the keys to new discoveries, in both the industrial and scientific domains. However, those keys will only be accessible to those who can make sense of such data. Making sense of data is a hard problem: it requires a good understanding of the data at hand and of the many data analysis tools and methods, as well as a good capacity to infer knowledge from the results of such tools. Such skills have been grouped under the umbrella term “Data Science”, and considerable effort is being devoted to education and research in this area. “Data Scientist” is currently the most sought-after job in the US, as demand far exceeds the number of competent professionals. The main problem of data science today is that, despite considerable improvements, it is still mostly a “manual” process: current data analysis tools still require significant human effort and know-how, making data analysis a lengthy, partial and error-prone process. This is true even for data science experts, and current approaches are mostly out of reach of non-specialists.
We claim that Data Science is currently in its “Iron Age”: good tools are available, but skilled craftsmen are required to use them in order to transform raw material (the data) into finished products (knowledge, decisions). We foresee that a decade from now, we should be in an “Industrial Age” of Data Science, where more elaborate tools will take over much of the human work that Data Science requires. Basic Data Science tasks will no longer require a skilled data scientist: software tools will enable small companies or even individuals to get valuable knowledge from their data, which is not possible currently. Skilled data scientists will thus be fully available to work on the hard tasks that matter, with a drastic productivity improvement thanks to better tools doing the tedious work for them.
The objective of the Lacodam team is to considerably facilitate the process of making sense of large quantities of data, either to derive new knowledge or to make better decisions. Nowadays, this process is mostly manual, and relies on the analyst’s understanding of the domain, of the data at hand and of a plethora of complex computational tools. We envision a novel generation of data analysis and decision support tools that require significantly less tedious human work, relying only on a few interactions with high added value. The solutions we foresee require bridging data mining techniques with artificial intelligence (AI) approaches, both to take knowledge into account in a principled way and to introduce automated reasoning techniques into knowledge discovery workflows. Such solutions can be seen as “second order” AI tasks: they exploit AI techniques (for example, planning) in order to pilot more classical AI tasks such as data mining and decision support.