Section: New Results
Phonetic segmentation
Participants: Vahid Khanagha, Joshua Winebarger, Khalid Daoudi, Oriol Pont, Hussein Yahia, Régine André-Obrecht.
We previously developed a novel phonetic segmentation method based on the Microcanonical Multiscale Formalism (MMF). The algorithm computes Local Predictability Exponents (LPEs) precisely at each point of the signal, then uses their integral over the time axis (ACC) as a quantitative indicator of how the distribution of these exponents changes between neighboring phonemes. A piecewise linear approximation of the ACC already gave very good segmentation precision. Based on an error analysis of this original algorithm, we proposed a 2-step technique that better exploits the LPEs to improve segmentation accuracy. In the first step, we detect the boundaries of the original signal and of a low-pass filtered version, and take the union of all detected boundaries as candidates. In the second step, we apply a hypothesis test to the local LPE distribution of the original signal to select the final boundaries. In summary, the following steps were taken:
- Detailed error analysis of the original method, which showed that low-pass filtering helps to detect some of the missed boundaries.
- Development of the hypothesis test, which uses a Log Likelihood Ratio Test to make the final decision over the list of candidates.
- Evaluation of the overall 2-step algorithm on the whole training part of the TIMIT database, for comparison with the original method.
- Evaluation on the test part of the TIMIT database, for comparison with state-of-the-art methods.
Related publications: [13], [14].
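To make the 2-step selection concrete, the following is a minimal sketch rather than the published implementation. It assumes that the LPEs and the two candidate lists (from the original and the low-pass filtered signal) are already available, and it models the LPEs on each side of a candidate with a single Gaussian. The Gaussian model, the function names, the merging tolerance `tol`, the window size `half_window` and the decision `threshold` are all illustrative assumptions that do not come from the report.

```python
import math
from statistics import fmean


def merge_candidates(first, second, tol):
    """Union of two candidate lists; candidates closer than `tol` samples
    to an already kept one are dropped (hypothetical merging rule)."""
    merged = []
    for b in sorted(set(first) | set(second)):
        if not merged or b - merged[-1] > tol:
            merged.append(b)
    return merged


def gaussian_loglik(x):
    """Maximised Gaussian log-likelihood of the samples in x
    (maximum-likelihood mean and variance, variance floored for stability)."""
    n = len(x)
    mu = fmean(x)
    var = max(sum((v - mu) ** 2 for v in x) / n, 1e-12)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def llr(lpe, boundary, half_window):
    """Log likelihood ratio of 'two distributions split at `boundary`'
    against 'one distribution over the whole window'."""
    lo = max(0, boundary - half_window)
    hi = min(len(lpe), boundary + half_window)
    left = list(lpe[lo:boundary])
    right = list(lpe[boundary:hi])
    if len(left) < 2 or len(right) < 2:
        return 0.0  # not enough samples to decide; treat as no evidence
    return (gaussian_loglik(left) + gaussian_loglik(right)
            - gaussian_loglik(left + right))


def select_boundaries(lpe, candidates, half_window, threshold):
    """Keep the candidates whose log likelihood ratio exceeds `threshold`."""
    return [b for b in candidates if llr(lpe, b, half_window) > threshold]
```

A call such as `select_boundaries(lpe, merge_candidates(orig, lowpass, tol=5), half_window=50, threshold=10.0)` would then return the retained boundaries. In practice the threshold and window size would have to be tuned on held-out data, for instance on the training part of TIMIT.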
We also continued and improved the adaptation of speaker segmentation methods to develop new (linear) techniques for phonetic segmentation. This yielded simple and efficient new algorithms that outperform existing ones. Even compared with these new approaches, our nonlinear approach was still competitive.