Section: New Results

Statistics and detection of most unpredictable points in data sets

Participants: Nicolas Brodu, Hussein Yahia, Suman-Kumar Maji.

References: [21], [16].

The assumption that local regularity amounts to predictability can be challenged, depending on the model used to make predictions. A statistical framework, “computational mechanics”, has been developed over the past 30 years precisely to formalize notions of causality and predictability within discrete data sets. Patterns with similar causal influence on the data are clustered into equivalence classes. Taken together, these classes form a Markovian automaton by definition, since no extra information is needed from other classes to (statistically) predict the influence of a group of patterns on the rest of the data set. These automata are defined at the lowest data description scale, but it has been suggested that sub-automata (hence clusters at larger scales) form an ideal coarse-graining of the system in terms of predictability (and hence also of descriptive power). The theory is also deeply rooted in statistical physics, offering a unique perspective on how macroscopic variables could be derived from a microscopic description of the system under study. Preliminary results are promising: for example, edges can be detected in images using a preliminary continuous implementation of the extension of the theory that is currently under construction. Further progress requires advanced statistical and computational developments. To facilitate them, N. Brodu has submitted an application for a Marie Curie outgoing fellowship that, if accepted, would allow him to partner with leading Australian researchers in statistics and data processing (University of Melbourne, Department of Mathematics).
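The clustering of patterns into equivalence classes can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names `predictive_distributions` and `cluster_histories`, the fixed history length, the restriction to next-symbol prediction, and the greedy tolerance-based grouping are all simplifying assumptions made for illustration. The idea is only that two past patterns (histories) fall in the same class when they yield approximately the same distribution over what comes next.

```python
from collections import Counter, defaultdict

def predictive_distributions(seq, history_len):
    """Estimate P(next symbol | history) for every observed history.

    Simplification (assumption): the "future" is only the next symbol.
    """
    counts = defaultdict(Counter)
    for i in range(history_len, len(seq)):
        history = seq[i - history_len:i]
        counts[history][seq[i]] += 1
    alphabet = sorted(set(seq))
    dists = {}
    for history, c in counts.items():
        total = sum(c.values())
        # Counter returns 0 for symbols never seen after this history.
        dists[history] = tuple(c[s] / total for s in alphabet)
    return dists

def cluster_histories(dists, tol=0.05):
    """Greedily group histories whose predictive distributions differ
    by at most `tol` (maximum absolute difference).

    Assumption: a fixed tolerance stands in for a proper statistical test.
    """
    classes = []  # list of (representative distribution, member histories)
    for history in sorted(dists):
        d = dists[history]
        for rep, members in classes:
            if max(abs(a - b) for a, b in zip(rep, d)) <= tol:
                members.append(history)
                break
        else:
            classes.append((d, [history]))
    return [members for _, members in classes]

# Toy discrete data set: the period-3 pattern "001" repeated.
seq = "001" * 50
dists = predictive_distributions(seq, history_len=2)
classes = cluster_histories(dists)
print(classes)  # [['00'], ['01', '10']]
```

In this toy example the histories "01" and "10" are both always followed by "0", so they land in the same class, while "00" is always followed by "1" and forms its own class. With longer futures the classes would generally be finer; the sketch only shows the grouping principle.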