Singularity exponent. A measure of the local irregularity in a complex signal. Singularity exponents can be evaluated in many different ways; GEOSTAT focuses on a microcanonical formulation.

Microcanonical Multiscale Formalism (MMF). The formalism used and developed in GEOSTAT for the analysis of complex signals and systems.
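As a reminder of what the microcanonical formulation refers to, the sketch below recalls how a singularity exponent is commonly introduced in the MMF literature. The notation (signal $s$, gradient measure $\mu$, ball $B_r(\mathbf{x})$, prefactor $\alpha$, exponent $h$, signal dimension $d$) is assumed here for illustration and is not taken from the text above.

```latex
% gradient measure of the signal s over a ball of radius r centred at x
\mu\bigl(B_r(\mathbf{x})\bigr) = \int_{B_r(\mathbf{x})} \lVert \nabla s \rVert(\mathbf{y})\, d\mathbf{y}

% local power-law behaviour as r -> 0; h(x) is the singularity exponent at x
\mu\bigl(B_r(\mathbf{x})\bigr) = \alpha(\mathbf{x})\, r^{\,d + h(\mathbf{x})}
  + o\bigl(r^{\,d + h(\mathbf{x})}\bigr), \qquad r \to 0
```

Under this convention, smaller values of $h(\mathbf{x})$ correspond to sharper, less regular behaviour of the signal around $\mathbf{x}$.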

GEOSTAT is a research project in
**digital signal processing**, with the important
distinction that it considers signals as the realizations
of complex dynamical systems. Consequently, research in GEOSTAT
is oriented towards the determination, in real signals, of
quantities or phenomena that are known to play an important
role both in the evolution of the dynamical systems whose
acquisitions are the signals under study, and in the compact
representations of the signals themselves. Among these
parameters, we can mention:

various types of
*singularity exponents*,

Lyapunov exponents, how they are related to intermittency, large deviations and singularity exponents,

various forms of
*entropies*,

the cascading properties of associated random variables,

persistence along the scales,
*optimal wavelets*,

the determination of subsets where statistical information is maximized, their relation to reconstruction and compact representation,

and, above all,
**the ways that lead to effective, high-precision numerical
determination of these quantities in real
signals**. The MMF (Microcanonical Multiscale Formalism) is
one of the ways to partly unlock this type of analysis, most
notably with respect to singularity exponents and reconstructible
systems. We presently concentrate our
efforts on it, but GEOSTAT is intended to explore other ways.
GEOSTAT currently explores new
methods for analyzing and understanding complex signals in
different applicative domains through the theoretical
advances of the MMF, and the framework of
**reconstructible systems**. Derived from ideas in
Statistical Physics, the methods developed in GEOSTAT offer
new ways to relate and quantitatively evaluate the
*local irregularity* in complex signals and systems, and the
statistical concepts of
*information content* and
*most informative subset*. The latter notion is
developed through the notions of
*transition front* and
*Most Singular Manifold*. As a result, GEOSTAT aims
to provide
*radically new approaches* to the study of signals
acquired from different complex systems (their analysis,
their classification, the study of their dynamical properties,
etc.). The common characteristic of these signals, as
required by
*universality classes*, is the existence of a
*multiscale organization* of the systems. For instance,
the classical notion of
*edge* or
*border*, which is of multiscale nature, and whose
importance is well known in Computer Vision and Image
Processing, receives, through the MMF, profound and rigorous
new definitions. Used in conjunction with an appropriate
*reconstruction formula*, the MMF is capable of
generalizing in a consistent manner the notion of
*edge* so that the generalized definition is adequate for
chaotic data. The description is analogous to the
modelling of states far from equilibrium, that is to say,
there is no stationarity assumption. From this formalism we
derive methods able to determine geometrically the most
informative part of a signal, which also defines its global
properties and allows for
*compact representation*, in line with known
problems addressed, for instance, in
*time-frequency analysis*. In this way, the MMF allows
the reconstruction, at any prescribed quality threshold, of a
signal from its most informative subset, and is able to
quantitatively evaluate key features in complex signals
(unavailable with classical methods in Image or Signal
Processing). It appears that the notion of
*transition front* in a signal is much more complex than
previously expected and, most importantly, related to
multiscale notions encountered in the study of non-linearity.
For instance, we give new
insights into the computation of dynamical properties in
complex signals, in particular in signals for which the
classical tools for analyzing dynamics give poor results
(such as, for example, correlation methods or optical flow
for determining motion in turbulent datasets). The
research problems addressed in GEOSTAT can be summarized in the following
items:

The accurate determination, in any
n-dimensional complex signal, of
*singularity exponents*
**at every point in the signal domain**.

The geometrical determination and
organization of
*singular manifolds* associated to various transition
fronts in complex signals, the study of their geometrical
arrangement, and the relation of that arrangement with
statistical properties or other global quantities
associated to the signal, e.g.
*cascading properties*.

The study of the relationships between the dynamics in the signal and the distributions of singularity exponents.

The study of the relationships between
the distributions of singularity exponents and other
quantities associated to
*predictability* in complex signals and systems, such
as cascading properties, large deviations and Lyapunov
exponents.

The ability to compute
*optimal wavelets* and relate such wavelets to the
geometric arrangement of singular manifolds and cascading
properties.

The translation of
*recognition*,
*analysis* and
*classification problems* in complex signals to
simpler and more accurate determinations involving new
operators acting on singular manifolds using the
framework of reconstructible systems.

In the applicative domain, GEOSTAT will focus its research activities on the study of the following classes of signals: remote sensing satellite acquisitions in Oceanography (study of different phenomena, i.e. geostrophic or non-geostrophic complex oceanic dynamics, mixing phenomena, ocean/climate interaction), Speech processing (analysis, recognition, classification), signals in Astronomy (multi-dimensional implementation of the MMF, atmospheric perturbation of acquisitions with optical devices), and heartbeat signals.

V. Khanagha has been selected as an award finalist for the
best paper at the INTERSPEECH Conference (Makuhari, Japan)
for his paper
*A Novel text-independent phonetic segmentation algorithm
based on the Microcanonical Multiscale Formalism*.

Development of the
*Microcanonical Multiscale Formalism* or MMF for the
efficient computation of singularity exponents.

The determination of
*optimal wavelets* for unlocking cascade properties
in the microcanonical sense and their application to
various complex signals.

The development of the MMF with respect to
*reconstructible systems*, i.e. the reconstruction
and analysis of signals from compact representations.

The study of the microcanonical cascade in complex signals as a new tool for the determination of the dynamics.

Theoretical developments around the
relations between singularity exponents, various types of
Lyapunov exponents, various types of
*entropies*, large deviations, singular manifolds,
multiscale hierarchies and predictability in complex
signals.

The following application domains are investigated by the GEOSTAT team:

Complex signals acquired from remote sensing and earth observation, notably in Oceanography and ocean/climate interaction.

Speech signal.

Astronomical imaging and turbulence.

Analysis of heartbeat signals (in
collaboration with M. Haissaguerre, head of team INSERM
EA3668
*Electrophysiologie et Stimulation Cardiaque de
l'Université de Bordeaux* and the ANUBIS team).

*FluidExponents*: software implementation of the MMF,
written in Java, in a cooperative development mode on the
INRIA GForge, deposited at APP in 2010. Contact:
hussein.yahia@inria.fr.

Fundamental theory and development for the propagation of
dynamic information across the scales in remotely sensed
acquisitions of the oceans. Dynamic information about
geostrophic motion at low resolution (altimetry data, pixel
size: 22 km) is propagated down to high resolution SST data
(pixel size: 4 km) to derive a high resolution determination of
ocean dynamics at SST resolution. Propagation is currently
done for the orientation of the vector field, leaving aside
the propagation of norms for on-going studies. Propagation
across the scales is done through an evaluation of the
microcanonical cascade and approximation of an
*optimal wavelet*.
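To make the role of the optimal wavelet more concrete, the following sketch recalls how a microcanonical cascade is commonly written in the MMF literature. The symbols ($\alpha_p$ and $\alpha_c$ for the wavelet coefficients at a coarser "parent" scale and a finer "child" scale, $\eta$ for the cascade variable) are notation assumed for this illustration, not taken from the text above.

```latex
% multiplicative cascade between wavelet coefficients at two scales,
% required to hold at each point x (microcanonical), not only in distribution
\alpha_{c}(\mathbf{x}) = \eta(\mathbf{x})\, \alpha_{p}(\mathbf{x}),
\qquad \eta \ \text{statistically independent of}\ \alpha_{p}
```

In this reading, a wavelet is called optimal when the relation holds pointwise with $\eta$ independent of the parent coefficient, which is what allows information known at the coarse scale to be inferred at the fine scale.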

At the typical scales of remote-sensing maps, temperature diffusion is negligible, and temperature can be considered a tracer of the ocean circulation. Therefore, one of the alternatives for obtaining high-resolution velocity maps consists in tracking the turbulently distributed temperature. The maximum inference is achieved by using the optimal decomposition which, for a representation in terms of wavelets, corresponds to the optimal wavelet.

Software: A program in IDL has been developed that allows improved analysis and inference of oceanic turbulent cascades from satellite data. It consists of three main modules. The first module (developed by H. Yahia and J. Sudre) expresses the local propagation of information of the oceanic cascade. Performance depends on wavelet optimality, but the algorithm gives consistently correct estimates with suboptimal wavelets. The second module (developed by J. Sudre, O. Pont and H. Yahia) characterizes the invariant features of the underlying turbulent process in a robust and stable way from empirical satellite signals. The third module (developed by O. Pont and J. Sudre) runs an iterative minimization process to accurately retrieve the shape of the optimal wavelet for a given signal or ensemble of signals.

Software: A program in Matlab (developed by M. Milovanovic
(Belgrade Institute of Nuclear Research) and O. Pont) has been
developed to quantify the degree of optimality of arbitrarily
defined wavelets over digital signals in a stable way. The
optimality measure, which is called the
Q parameter, quantifies mutual statistical redundancy
between resolution levels in the same way as multiscale
mutual information, but its estimation from discrete
empirical data is more robust and requires fewer data points.
The code can process 1D, 2D and 3D signals.

In the context of a validation study of the evaluation of high resolution ocean dynamics with the microcanonical cascade, we have compared the results produced by luminance conservation methods (optical flow) with those of the microcanonical cascade. Validation is done using the output of the ROMS simulation model, which has been verified to produce outputs with multiscale characteristics, a property not shared by all simulation models. Results show a clear difference between the two approaches: with optical flow, the badly oriented vectors have no evident relation with the coherent structures of ocean dynamics, whereas with the microcanonical cascade, the remaining vectors with an opposite orientation (far fewer than with optical flow) are clearly located in coherent structures. See the associated figure.

The existence of non-linear and turbulent phenomena in the production process of the speech signal is theoretically and experimentally established. We have used the MMF to identify the key parameters quantifying the non-linear character of the speech signal. The first step is to show that such parameters exist and that they can be estimated precisely. We then need to develop efficient numerical methods to employ these parameters in practical applications. In summary, the following steps have been taken.

Verifying the intermittent character of the speech signal, for each family of basic acoustic units (phonemes), by evaluating the histogram of the gradient and the flatness function.

Verifying the local scale-invariance of each phoneme family by evaluating local power-law scaling. Consequently, it is shown that singularity exponents can be estimated very precisely for the speech signal.

Verifying the persistence of singularity spectra across different scales as a global multiscale property. The validation of the MMF for the analysis of the speech signal is concluded once all three of these conditions are verified.

Studying the time evolution of singularity exponents and observing how they convey instructive information regarding transition fronts between phonemes. We observed that the statistical properties of the exponents vary sharply at phoneme boundaries.

Developing a sample-based measure called ACC as a practical tool for phonetic segmentation. ACC is simply the primitive of the singularity exponents and reflects the accumulated average of the exponents. It has the useful property of being piecewise linear, with changes of slope at the phoneme boundaries.

Related Publication .
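The ACC measure described in the last step can be illustrated with a minimal sketch. It assumes that the singularity exponents are already available as a 1D array, and it takes "primitive" to mean a plain cumulative sum; whether the original method centres or normalises the exponents first is not stated here. The function name `acc` and the synthetic data are hypothetical.

```python
import numpy as np

def acc(exponents):
    """Discrete primitive (cumulative sum) of a sequence of singularity exponents."""
    h = np.asarray(exponents, dtype=float)
    return np.cumsum(h)

# Synthetic exponents (assumption): two hypothetical phonemes whose exponents
# have different constant values, 50 samples each.
h = np.concatenate([np.full(50, -0.2), np.full(50, 0.3)])
a = acc(h)

# Slope of ACC between consecutive samples: constant inside each segment,
# changing at the boundary between the two segments.
slopes = np.diff(a)
print(slopes[:3], slopes[-3:])
```

On real exponents the curve is only approximately piecewise linear, which is why a fitting step is needed to locate the slope changes.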

The practical use of the proposed sample-based methods required the development of an algorithm that automatically identifies the boundaries between different phonemes.

Developing the automatic segmentation algorithm by fitting a piecewise linear curve to the proposed sample-based measure and taking the break points as hypothesized boundaries.

Testing the proposed algorithm on the TIMIT database and comparing the results with the provided manual transcriptions, which showed very good performance compared to the state of the art. Related publication: .

Further improvement of the proposed method by adding a second stage that performs a Log-Likelihood Ratio Test (LLRT) to test whether a hypothesized boundary truly corresponds to a change in distribution. This markedly improved the performance of the proposed method, as confirmed by extensive comparisons with state-of-the-art text-independent segmentation methods. Related publication submitted to ICASSP 2011.
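The two stages above can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: the fitting procedure is reduced to an exhaustive search for a single break point, and the LLRT assumes Gaussian-distributed exponents on each side of the boundary. The function names `line_sse`, `best_breakpoint` and `gaussian_llr`, and the synthetic data, are all hypothetical.

```python
import numpy as np

def line_sse(x, y):
    """Sum of squared residuals of the least-squares line through (x, y)."""
    coeffs = np.polyfit(x, y, 1)
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

def best_breakpoint(y, min_len=3):
    """Index k minimising the error of one line fitted to y[:k] and another to y[k:]."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y), dtype=float)
    costs = {k: line_sse(x[:k], y[:k]) + line_sse(x[k:], y[k:])
             for k in range(min_len, len(y) - min_len + 1)}
    return min(costs, key=costs.get)

def gaussian_llr(left, right):
    """Log-likelihood ratio: two separate Gaussians versus one Gaussian for all samples."""
    def max_loglik(v):
        v = np.asarray(v, dtype=float)
        var = max(np.var(v), 1e-12)  # maximum-likelihood variance, floored to avoid log(0)
        return -0.5 * len(v) * (np.log(2 * np.pi * var) + 1.0)
    pooled = np.concatenate([left, right])
    return max_loglik(left) + max_loglik(right) - max_loglik(pooled)

# Synthetic exponents (assumption): the mean changes after sample 60.
rng = np.random.default_rng(0)
h = np.concatenate([rng.normal(-0.2, 0.02, 60), rng.normal(0.3, 0.02, 40)])
acc_curve = np.cumsum(h)              # sample-based measure, as a cumulative sum

k = best_breakpoint(acc_curve)        # stage 1: hypothesized boundary
llr = gaussian_llr(h[:k], h[k:])      # stage 2: evidence for a change in distribution
print(k, llr)
```

A large ratio supports keeping the hypothesized boundary; the decision threshold would have to be tuned on labelled data and is not specified here.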

Most speaker recognition systems rely on generative
learning of Gaussian Mixture Models (GMM). During the last
decade, discriminative approaches have been an interesting
and valuable alternative to address the
classification problem directly. For instance, Support Vector Machines
(SVM) combined with GMM supervectors are among the
state-of-the-art approaches in speaker recognition. Recently,
a new discriminative approach for multiway classification has
been proposed, the Large Margin Gaussian mixture models
(LM-GMM). The latter have the same advantage as SVM in terms
of the convexity of the optimization problem to solve.
However, they differ from SVM because they draw nonlinear
class boundaries directly in the input space, and thus no
kernel trick is required. We proposed a simplified version of
LM-GMM which has the advantage of leading to an efficient
training algorithm for speaker recognition. We did so by
following the same philosophy as traditional GMM, where the
adaptation of target speaker models is done only on the GMM
mean vectors. We carried out experiments on NIST-SRE data.
The results suggest that our simplified algorithm outperforms
both the original LM-GMM and the traditional GMM. Related
publication: . We participated in the 2010
NIST Speaker Recognition Evaluation (NIST'2010-SRE) campaign
(http://www.itl.nist.gov/iad/mig//tests/sre/2010/index.html).
Our system was jointly developed with IRIT (Toulouse) and is
based on the recently released ALIZE/SpkDet toolkit (
http://
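The mean-only adaptation used in traditional GMM systems, mentioned above, can be sketched as follows. This illustrates the standard relevance-MAP update for a diagonal-covariance GMM, not the simplified LM-GMM training itself, whose details are not given here. The function name `map_adapt_means`, the relevance factor of 16 and the toy data are assumptions.

```python
import numpy as np

def map_adapt_means(means, variances, weights, frames, relevance=16.0):
    """Relevance-MAP adaptation of GMM means only (diagonal covariances).

    means, variances: (K, D) arrays; weights: (K,); frames: (T, D).
    Weights and variances of the background model are left unchanged.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    frames = np.asarray(frames, dtype=float)

    # Log-density of every frame under every component: shape (T, K)
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)           # component posteriors

    n_k = post.sum(axis=0)                            # soft counts per component
    data_mean = (post.T @ frames) / np.maximum(n_k, 1e-12)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]        # data-dependent mixing factor
    return alpha * data_mean + (1.0 - alpha) * means

# Toy background model with two components, and speaker frames near (1, 1).
rng = np.random.default_rng(0)
ubm_means = np.array([[0.0, 0.0], [5.0, 5.0]])
ubm_vars = np.ones((2, 2))
ubm_weights = np.array([0.5, 0.5])
frames = rng.normal(1.0, 0.1, size=(200, 2))

adapted = map_adapt_means(ubm_means, ubm_vars, ubm_weights, frames)
print(adapted)
```

Components that explain many frames move towards the speaker data, while components that explain almost none stay close to the background model.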

We first implemented some state-of-the-art statistical methods for text-independent phonetic segmentation. We then used some principles from speaker segmentation methods to develop new and well-performing techniques for phonetic segmentation. Related publication: J. Winebarger's internship report, available on the GEOSTAT website: .

Many complex systems in natural phenomena organize through the interactions between processes at different scales. In certain cases, when the dynamics has no privileged spatial or temporal scale, the resulting structure becomes scale-invariant. In this context, fractal and multifractal models have often been proposed to describe such scale-invariance. In recent years, there has been a growing interest in the study of multifractal systems, made possible by the development of new signal processing methods based on singularity analysis. These techniques have the advantage of being robust to common problems of empirical data such as noise, discretization, data gaps and finite-size effects. The work consists of a study of multifractal datasets of very different nature: wind tunnel turbulence, concentration of phytoplankton in the ocean, heartbeat dynamics in different regimes and formation of prices in the stock market. When the relevant parameters of their respective multifractal hierarchies are appropriately characterized, we find evidence of common fractal attractors that have particular and very specific properties. This behaviour can be explained as the result of an effective dynamics that expands from a maximum-singular manifold, which would be the result of multifractal universality classes. Universality classes imply a common macroscopic behaviour independent of the particular microscopic dynamics of each system. Therefore, identifying such universality classes makes possible a better characterization of the studied phenomena, a more accurate and robust fitting of their dynamic parameters and better modelling. The result is a flexible methodology that can be adapted to the particularities of each system and unveil its effective dynamics. We show applications of these characterizations to the reconstruction of missing data and the forecasting of time series. Related publication: .

Start of Suman Kumar Maji's PhD on phase reconstruction for adaptive optics. Research objectives were defined in collaboration with the ONERA adaptive optics team.

Start of a research topic on the analysis of heartbeat signals under the MMF. First results are under publication.

Region Aquitaine research call. Funding of the OPTAD project on adaptive optics.

Strong partnership with the LEGOS team
(CNRS UMR 5566), the CNES and ICM-CSIC (Barcelona) around
the
**Hiresubcolor** research contract.

Participation in the ENSC 2010 jury (H. Yahia) for entrepreneurial funding from the Region Aquitaine (September 8, 2010).

Presentation of the scientific objectives of GEOSTAT at the preparation of IIT Rajasthan, Jodhpur, India (March 16, H. Yahia).

Meeting and presentation at LEGOS with European partners to prepare on-going calls (H. Yahia, March 31, 2010).

Presentation of GEOSTAT activities at CHU Bordeaux Haut-Leveque (Professor M. Haissaguerre's team, May 5, 2010, H. Yahia).

Presentation at IMB (Institut de Mathématiques Bordelais). H. Yahia, June 23, 2010.

Presentation of GEOSTAT activities at ONERA (September 21, H. Yahia).

Presentation of GEOSTAT activities for the new Basque Center for Mathematics (BCAM). H. Yahia, September 24.

Presentation of the results of the
**Hiresubcolor** contract (H. Yahia, November 22).

K. Daoudi is a member of the committee for Dr. Anthony
Brew's Ph.D. thesis, University College Dublin, Ireland
(11/06/2010). Title:
*One Class Classifiers in Speaker Verification*.

K. Daoudi is a member of the ISIVC'2010 program committee (2010 International Symposium on Image/video communication).

K. Daoudi is participating in a
"strategic Canadian project", funded by CRSNG; title:
*Profilage à des données hétérogènes du Web pour la
cybersécurité*.

V. Khanagha: classroom exercises on the Fourier Transform at Master level (University of Bordeaux, IMB).