Section: New Results

QoE (Quality of Experience)

Participants: Sebastián Basterrech, Yassine Hadjadj-Aoul, Sofiene Jelassi, Adlen Ksentini, Gerardo Rubino, Kamal Singh, César Viho.

We continue the development of the PSQA (Pseudo-Subjective Quality Assessment) technology in the area of Quality of Experience (QoE). PSQA is today a stable technology for building measuring modules capable of quantifying the quality of a video or audio sequence, as perceived by the user, when it is received through an IP network. It provides an accurate and efficiently computed evaluation of quality. Accuracy means that PSQA gives values close to those that can be obtained from a panel of human observers in a controlled subjective testing experiment following an appropriate standard (which depends on the type of sequence or application). Efficiency means that our measuring tool can work in real time, if necessary. Observe that perceived quality is the main component of QoE. PSQA works by analyzing the networking environment of the communication and some of the technical characteristics of the latter. It does not need the original sequence (as such, it belongs to the family of no-reference techniques).

It must be pointed out that a PSQA measuring or monitoring module is network dependent and application dependent. Basically, for each specific networking technology, application, or service, the module must be built from scratch. Once built, however, it works automatically and very efficiently, so that it can be used in real time if necessary.

At the heart of the PSQA approach is the statistical learning process needed to develop measuring modules. So far we have been using Random Neural Networks (RNNs) as our learning tool (see [82] for a general description), but recently we have started to explore other approaches. For instance, in the last ten years a new computational paradigm was presented under the name of Reservoir Computing (RC) [78], addressing the main limitations in training time of recurrent neural networks while introducing no significant disadvantages. Two RC models were developed independently and simultaneously under the names Liquid State Machine (LSM) [81] and Echo State Network (ESN) [78], and they constitute today one of the basic paradigms for recurrent neural network modeling [79]. The main characteristic of the RC model is that it separates two parts: a static sub-structure called the reservoir, which uses cycles to provide dynamic memory in the network, and a parametric part composed of a function such as a multiple linear regression or a classical single-layer network. The reservoir can be seen as a dynamical system that expands the input stream into a space of states. Only the parametric part of the model is learned. In a collaboration during the first half of the year with the Applied Computational Intelligence Research Unit, Artificial Neural Networks Group, of the University of the West of Scotland, we developed an algorithm based on a combination of topology-preserving maps, such as the Self-Organising Map [80] and the Scale Invariant Map [77], to improve the performance of RC models. The results obtained are presented in two papers, [37] and [38].
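The separation between a fixed reservoir and a trained readout can be illustrated with a minimal Echo State Network sketch. This is a generic textbook-style ESN, not the algorithm of [37] or [38]. It assumes NumPy is available, and all function names, the reservoir size, the spectral radius of 0.9, the ridge term, and the toy task (recalling the previous input value) are illustrative choices that do not come from the report.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue modulus equals spectral_radius,
    # a common heuristic for obtaining a fading memory of past inputs.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Expand an input stream of shape (T, n_in) into states of shape (T, n_res)."""
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(w_in @ u + w @ x)  # recurrent state update
        states.append(x)
    return np.array(states)

def add_bias(states):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([states, np.ones((len(states), 1))])

def train_readout(states, targets, ridge=1e-6):
    """Fit the parametric part: a ridge-regularized linear regression."""
    X = add_bias(states)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def predict(states, w_out):
    return add_bias(states) @ w_out

def demo_delay_recall(n_steps=600, washout=100, split=400):
    """Toy task (an assumption of this sketch): output the previous input."""
    rng = np.random.default_rng(1)
    u = rng.uniform(-0.5, 0.5, size=(n_steps, 1))
    y = np.roll(u[:, 0], 1)  # y[t] = u[t-1]; y[0] is discarded by the washout
    w_in, w = make_reservoir(n_in=1, n_res=50)
    states = run_reservoir(w_in, w, u)
    w_out = train_readout(states[washout:split], y[washout:split])
    pred = predict(states[split:], w_out)
    # Normalized mean squared error on data not used for training
    return np.mean((pred - y[split:]) ** 2) / np.var(y[split:])

if __name__ == "__main__":
    print("test NMSE:", demo_delay_recall())
```

Only `train_readout` involves learning, and it reduces to solving one linear system. This is why training an RC model is typically much cheaper than training all the weights of a recurrent network.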

In [42] we developed a version of PSQA for evaluating perceived quality in the context of SVC video coding. The tool is based on the RNN model. The main difficulty in defining this tool concerns the dependency between the SVC layers, since the enhancement layers require the information of the base layer in order to be decoded.

In [61], we developed a tool for evaluating the perceived quality of an application distributing streamed video over HTTP (and thus TCP). The difficulties here center on possible playout interruptions and on quality variations due to the use of adaptive bitrate techniques. Our procedure belongs to the family of no-reference, learning-based techniques, and it is also based on the RNN tool.
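To illustrate why an RNN-based module is cheap enough to run in real time, the sketch below evaluates a small feed-forward Random Neural Network using the standard steady-state equations of the model: the activity of neuron i is q_i = T+_i / (r_i + T-_i), where T+_i and T-_i are the total arrival rates of positive and negative signals and r_i is the firing rate. The topology, the weights, and the two input names are purely hypothetical. An actual PSQA module obtains its weights by training on subjective test scores, and the sketch assumes no exogenous negative signals at the inputs.

```python
# Hypothetical weights for a 2-input, 2-hidden, 1-output network.
# w_plus[j][i] / w_minus[j][i]: excitatory / inhibitory rate from neuron j to i.
W1_PLUS = [[0.6, 0.1], [0.1, 0.2]]
W1_MINUS = [[0.1, 0.2], [0.3, 0.4]]
W2_PLUS = [[0.8], [0.1]]
W2_MINUS = [[0.2], [0.9]]
R_OUT = [1.0]  # firing rate of the output neuron (a free parameter)

def row_rates(w_plus, w_minus):
    """Firing rate of each sending neuron: the sum of its outgoing weights."""
    return [sum(p) + sum(m) for p, m in zip(w_plus, w_minus)]

def rnn_layer(q_prev, w_plus, w_minus, rates):
    """Activities of one layer given the activities of the previous one."""
    out = []
    for i, r in enumerate(rates):
        pos = sum(q * w_plus[j][i] for j, q in enumerate(q_prev))
        neg = sum(q * w_minus[j][i] for j, q in enumerate(q_prev))
        out.append(min(1.0, pos / (r + neg)))  # an activity is a probability
    return out

def evaluate(inputs):
    """Map normalized input parameters in [0, 1] to an output activity."""
    r_in = row_rates(W1_PLUS, W1_MINUS)
    # Input neurons: exogenous positive rate = input value, no negative flow
    q_in = [min(1.0, x / r) for x, r in zip(inputs, r_in)]
    r_hid = row_rates(W2_PLUS, W2_MINUS)
    q_hid = rnn_layer(q_in, W1_PLUS, W1_MINUS, r_hid)
    return rnn_layer(q_hid, W2_PLUS, W2_MINUS, R_OUT)[0]

if __name__ == "__main__":
    # Hypothetical normalized inputs, e.g. (loss rate, bitrate)
    print(evaluate([0.5, 0.2]))
```

Once the weights are fixed, evaluating the network takes a handful of multiplications and divisions per measurement. The output activity would then be rescaled to the quality scale used in the subjective tests.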

In [41] we compared PSQA used for video evaluation with other no-reference tools as well as with two objective evaluation tools. We showed that PSQA outperforms the majority of the other tools in terms of correlation with human evaluation. This version will be used as the main metric for evaluating QoE in the future Internet architecture proposed by the FP7 Alicante project.

We have also been developing single-ended parametric-model speech-quality assessors for VoIP conversations over future networks. To do so, we carefully identified and characterized the quality-degrading factors over next-generation networks. The recent progress and challenges in the accurate assessment of voice quality over evolving VoIP systems are detailed in the survey paper [19]. In [18], we study the perceived effects of packet loss processes, which are the principal source of quality degradation over IP networks. The perceived effect of a given packet loss process is highly related to the distribution of missing packets: basically, the higher the burstiness of the packet loss process, the greater the perceived quality degradation. Recently, several speech quality assessors sensitive to packet loss burstiness have been proposed in the literature. A comprehensive comparison of these bursty-packet-loss-aware artificial assessors is conducted in [18], and an extended and more elaborate version has been published in [47]. Moreover, we have developed novel artificial quality assessors that consider the transient loss of connectivity experienced by mobile users over mobile transport systems. A paper describing the developed tools and their performance is in preparation. Recently, we started to work on new analytical models of packet losses and delays for packet-based voice conversations over wireless ad-hoc networks. The developed models will be used to design specialized artificial quality assessors of multimedia services over wireless ad-hoc networks. Finally, we are working on enhancing the voice quality assessor version of PSQA by considering the features of removed speech signals.
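The role of burstiness can be made concrete with a small simulation. The sketch below uses a two-state Markov (Gilbert-type) loss model, a common way to generate bursty loss traces. It is chosen here for illustration and is not necessarily the model used in [18] or [47]. The parameter names `p` and `q` and the numerical values are assumptions of this example. Both traces have the same long-run loss rate p / (p + q) but very different mean burst lengths 1 / q, which is the kind of difference a burst-aware assessor is meant to capture.

```python
import random

def simulate_gilbert(n, p, q, seed=0):
    """Simulate n packets; True means the packet is lost.

    p: probability of moving from the no-loss state to the loss state.
    q: probability of moving from the loss state back to the no-loss state.
    """
    rng = random.Random(seed)
    lost = False
    trace = []
    for _ in range(n):
        if lost:
            lost = rng.random() >= q  # stay in the loss state with prob. 1 - q
        else:
            lost = rng.random() < p
        trace.append(lost)
    return trace

def loss_rate(trace):
    """Fraction of lost packets."""
    return sum(trace) / len(trace)

def mean_burst_length(trace):
    """Average number of consecutive lost packets per loss burst."""
    bursts = []
    run = 0
    for lost in trace:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0

if __name__ == "__main__":
    n = 200_000
    # Both settings target a 5% long-run loss rate, p / (p + q) = 0.05.
    isolated = simulate_gilbert(n, p=0.05, q=0.95)
    bursty = simulate_gilbert(n, p=0.05 * 0.25 / 0.95, q=0.25)
    for name, trace in (("isolated", isolated), ("bursty", bursty)):
        print(name, loss_rate(trace), mean_burst_length(trace))
```

A parametric assessor that takes only the loss rate as input cannot distinguish these two traces, which is why burst-related parameters such as the mean burst length are natural candidate inputs.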