Section: New Results

Service Transparency

From Network Traffic Measurements to QoE for Internet Video

Participants: Muhammad Jawad Khokhar, Thibaut Ehlinger, Chadi Barakat.  

Video streaming is a dominant contributor to global Internet traffic. Consequently, monitoring video streaming Quality of Experience (QoE) is of paramount importance to network providers. Monitoring video QoE is challenging because most of today's video traffic is encrypted. In this work, we address this challenge and present an approach based on controlled experimentation and machine learning (ML) to estimate QoE from encrypted video traces using network-level measurements only. We consider the case of YouTube and play out a wide range of videos under realistic network conditions to build ML models (classification and regression) that predict the subjective MOS (Mean Opinion Score) based on the ITU P.1203 model, along with the QoE metrics of startup delay, playout quality (spatial resolution), and quality variations, using only the underlying network Quality of Service (QoS) features. We comprehensively evaluate our approach with different sets of input network features and output QoE metrics. Overall, our classification models predict the QoE metrics and the ITU MOS with an accuracy of 63-90%, while the regression models show low error: the ITU MOS (1-5) and the startup delay (in seconds) are predicted with a root mean square error of 0.33 and 2.66, respectively. The results of this work were published in [26] and can be found with further details in the PhD manuscript of Muhammad Jawad Khokhar, who graduated in October 2019.
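As a reading aid for the two kinds of figures quoted above, the following minimal sketch shows how a root mean square error (for regression) and an accuracy (for classification) are typically computed. The sample values are made up for illustration and are not taken from the study; the `mos_class` binning (rounding the MOS to the nearest integer class) is an assumption, since the class definition used in the work is not given here.

```python
import math

def rmse(true_values, predicted_values):
    """Root mean square error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.sqrt(sum(squared) / len(squared))

def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

def mos_class(mos):
    """Hypothetical binning: round a MOS in [1, 5] to the nearest integer class."""
    return min(5, max(1, int(mos + 0.5)))

# Illustrative values only (not measurements from the study).
true_mos = [4.2, 3.1, 1.8, 4.9]
pred_mos = [4.0, 3.6, 2.1, 4.4]

regression_error = rmse(true_mos, pred_mos)
class_accuracy = accuracy([mos_class(m) for m in true_mos],
                          [mos_class(m) for m in pred_mos])
print(f"RMSE = {regression_error:.3f}, accuracy = {class_accuracy:.2f}")
```

The RMSE is expressed in the unit of the predicted quantity, which is why the MOS error and the startup delay error reported above are not directly comparable.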

When Deep Learning meets Web Measurements to infer Network Performance

Participants: Imane Taibi, Chadi Barakat.  

Web browsing remains one of the dominant applications of the Internet, so inferring network performance is crucial for both users and providers (access and content) in order to identify the root cause of any service degradation. Recent works have proposed several network troubleshooting tools, e.g., NDT, MobiPerf, SpeedTest, and Fathom. Yet, these tools are either computationally expensive, less generic, or greedy in terms of data consumption. The main purpose of this work, funded by the IPL BetterNet, is to leverage passive measurements freely available in the browser together with machine learning (ML) techniques to infer network performance (e.g., delay, bandwidth, and loss rate) without adding new measurement overhead. To enable this inference, we propose a framework based on extensive controlled experiments in which network configurations are artificially varied while the Web is browsed; ML is then applied to build models that estimate the underlying network performance. In particular, we contrast classical ML techniques (such as random forest) with deep learning models based on fully connected neural networks and convolutional neural networks (CNN). Results of our experiments show that neural networks achieve higher accuracy than classical ML approaches. Furthermore, the model accuracy improves considerably when using CNN. These results were published in [28].

On Accounting for Screen Resolution in Adaptive Video Streaming: A QoE-Driven Bandwidth Sharing Framework

Participants: Othmane Belmoukadam, Muhammad Jawad Khokhar, Chadi Barakat.  

Screen resolution and network conditions are major objective factors impacting user experience, in particular for video streaming applications. Terminals, for their part, feature increasingly advanced characteristics, resulting in different network requirements for a good visual experience. Previous studies tried to link MOS (Mean Opinion Score) to video bit rate for different screen types (e.g., CIF, QCIF, and HD). We leverage such studies and formulate a QoE-driven resource allocation problem to pinpoint the optimal bandwidth allocation that maximizes the QoE (Quality of Experience) over all users of a provider located behind the same bottleneck link, while accounting for the characteristics of the screens they use for video playout. For our optimization problem, QoE functions are built using curve fitting on data sets capturing the relationship between MOS, screen characteristics, and bandwidth requirements. We propose a simple heuristic based on Lagrangian relaxation and KKT (Karush-Kuhn-Tucker) conditions for a subset of constraints. Numerical simulations show that the proposed heuristic is able to increase overall QoE by up to 20% compared to an allocation with TCP look-alike strategies implementing max-min fairness. We then use an MPEG-DASH implementation in ns-3 and show that coupling our approach with a rate adaptation algorithm can help increase QoE while reducing both resolution switches and the number of interruptions. Our framework and the first validation results were published in [20].
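To illustrate how KKT conditions can drive such an allocation, here is a minimal sketch under stated assumptions; it is not the heuristic published in [20]. It assumes a hypothetical concave QoE curve per user, `w * log(1 + b / s)`, where `b` is the allocated bandwidth and `w` and `s` are made-up per-screen parameters (the actual QoE functions in the work come from curve fitting). Stationarity of the Lagrangian gives `b = max(0, w / lam - s)`, and the multiplier `lam` is found by bisection so that the allocations fill the bottleneck capacity.

```python
import math

def total_qoe(weights, scales, alloc):
    """Sum of the assumed per-user QoE curves w * log(1 + b / s)."""
    return sum(w * math.log(1.0 + b / s)
               for w, s, b in zip(weights, scales, alloc))

def allocate_bandwidth(weights, scales, capacity, iters=100):
    """Maximize total_qoe subject to sum(alloc) = capacity and alloc >= 0.

    KKT stationarity gives b_i = max(0, w_i / lam - s_i); the total
    allocation decreases as lam grows, so lam is found by bisection.
    """
    def alloc_for(lam):
        return [max(0.0, w / lam - s) for w, s in zip(weights, scales)]

    lo = 1e-12                                         # allocations exceed capacity
    hi = max(w / s for w, s in zip(weights, scales))   # all allocations are zero
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(alloc_for(mid)) > capacity:
            lo = mid
        else:
            hi = mid
    return alloc_for(hi)

# Hypothetical parameters: three users with increasingly demanding screens,
# sharing a 30 Mbps bottleneck.
weights = [4.0, 4.0, 4.0]
scales = [0.5, 2.0, 8.0]
shares = allocate_bandwidth(weights, scales, capacity=30.0)
equal = [10.0, 10.0, 10.0]
print(shares, total_qoe(weights, scales, shares), total_qoe(weights, scales, equal))
```

In this toy setting the allocation departs from an equal split, which is what max-min fairness yields when no user is otherwise limited, and reaches a higher total under the assumed curves. The size of the gain depends entirely on the chosen curves and says nothing about the 20% figure reported above.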

Tuning optimal traffic measurement parameters in virtual networks with machine learning

Participants: Karyna Gogunska, Chadi Barakat.  

With the increasing popularity of cloud networking and the widespread use of virtualization as a way to offer flexible virtual network and computing resources, monitoring this new virtual environment becomes more and more complex. Yet, monitoring remains crucial for network troubleshooting and analysis. Controlling the measurement footprint in the virtual network is one of the main priorities in the monitoring process, as resources are shared between the compute nodes of tenants and the measurement process itself. In this work, we first assess the capability of machine learning to predict the impact of measurement on the ongoing traffic between virtual machines; we then propose a data-driven solution that provides optimal monitoring parameters for virtual network measurement with minimum traffic interference. These results were published in [25] and are part of the PhD manuscript of Karyna Gogunska, who graduated in December 2019.

Collaborative Traffic Measurement in Virtualized Data Center Networks

Participants: Houssam Elbouanani, Chadi Barakat.  

Data center network monitoring can be carried out at hardware networking equipment (e.g., physical routers) and/or software networking equipment (e.g., virtual switches). While software switches offer high flexibility to deploy various monitoring tools, they have to use server resources, especially CPU and memory, which can then no longer be fully reserved for serving users' traffic. In this work, we closely examine the costs of (i) sampling packets; (ii) sending them to a user-space program for measurement; and (iii) forwarding them to a remote server where they will be processed when local resources are lacking. Starting from empirical observations, we derive an analytical model that accurately predicts (R² = 99.5%) the three aforementioned costs as a function of the sampling rate. We next introduce a collaborative approach for traffic monitoring and sampling that maximizes the amount of collected traffic without impacting the data center's operation. We analyze the performance of our collaborative solution through numerical simulations. The results show that it is able to take advantage of the uneven loads on the servers to maximize the amount of traffic that can be sampled at the scale of a data center. The resulting gain can reach 200% compared to a non-collaborative approach. These results were published in [23].
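The intuition behind exploiting uneven loads can be sketched as follows. This is a deliberately simplified illustration, not the model of [23]: it assumes each server has a hypothetical processing budget (packets per second it can measure without hurting user traffic) and treats forwarding as free, whereas the work above explicitly models the forwarding cost. The function name and the numbers are made up.

```python
def plan_offloading(demand, budget):
    """Greedy offloading of sampled traffic from overloaded to idle servers.

    demand[i]: rate (pkt/s) server i would like to sample.
    budget[i]: rate (pkt/s) server i can process locally (assumed known).
    Returns the locally processed rates and a list of (src, dst, rate)
    transfers. Forwarding cost is ignored in this sketch.
    """
    local = [min(d, b) for d, b in zip(demand, budget)]
    excess = [d - l for d, l in zip(demand, local)]   # cannot be handled locally
    spare = [b - l for b, l in zip(budget, local)]    # unused local budget
    transfers = []
    for src in range(len(demand)):
        for dst in range(len(demand)):
            if excess[src] <= 0:
                break
            if dst == src or spare[dst] <= 0:
                continue
            rate = min(excess[src], spare[dst])
            transfers.append((src, dst, rate))
            excess[src] -= rate
            spare[dst] -= rate
    return local, transfers

# Hypothetical uneven load: one busy server, two mostly idle ones.
demand = [100, 100, 900]
budget = [400, 400, 400]
local, transfers = plan_offloading(demand, budget)
standalone_total = sum(local)
collaborative_total = standalone_total + sum(rate for _, _, rate in transfers)
print(standalone_total, collaborative_total, transfers)
```

With perfectly even loads the transfer list is empty and both totals coincide, which is why the benefit of collaboration depends on how unevenly the servers are loaded.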

Distributed Privacy Preserving Platform for Ridesharing Services

Participants: Damien Saucez, Yevhenii Semenko.  

The sharing economy has fundamentally changed business and social interactions. Interestingly, while in essence this form of collaborative economy allows people to interact directly with each other, it has also led to the advent of eminently centralized platforms and marketplaces, such as Uber and Airbnb. One may be concerned with the risk of giving control of a market to a handful of actors that may unilaterally fix their own rules and threaten privacy. Within the Data Privacy project of the UCAJedi Idex Academy 5 and House of Human and Social Sciences, Technologies and Uses Theme, we have proposed a holistic solution to address privacy issues in the sharing economy. We considered the case of ridesharing and proposed a decentralized architecture that makes it possible to shift from centralized platforms to decentralized ones. Digital communications in our proposal are specifically designed to preserve data privacy and avoid any form of centralization. A blockchain is used to guarantee the essential roles of a marketplace, but in a decentralized way. Our evaluation shows that privacy protection without trusted entities comes at the cost of harder scalability than an approach with a trusted third party. However, our numerical evaluation on real data and our Android prototype show the practical feasibility of our approach. The results obtained in this activity were published at the 12th International Conference on Security, Privacy, and Anonymity in Computation, Communication, and Storage (SpaCCS) 2019, Atlanta [31] and documented in a research report [35].

Missed by Filter Lists: Detecting Unknown Third-Party Trackers with Invisible Pixels

Participants: Imane Fouad, Arnaud Legout, Natasa Sarafijanovic-Djukic.  

Web tracking has been extensively studied over the last decade. To detect tracking, previous studies and user tools rely on filter lists. However, it has been shown that filter lists miss trackers. In this paper, we propose an alternative method to detect trackers, inspired by analyzing the behavior of invisible pixels. By crawling 84,658 webpages from 8,744 domains, we detect that third-party invisible pixels are widely deployed: they are present on more than 94.51% of domains and constitute 35.66% of all third-party images. We propose a fine-grained behavioral classification of tracking based on the analysis of invisible pixels. We use this classification to detect new categories of tracking and uncover new collaborations between domains on the full dataset of 4,216,454 third-party requests. We demonstrate that two popular methods to detect tracking, based on EasyList & EasyPrivacy and on Disconnect lists, miss 25.22% and 30.34% of the trackers that we detect, respectively. Moreover, we find that if we combine all three lists, 379,245 requests originating from 8,744 domains still track users on 68.70% of websites. This work will appear in PETS 2020 [24].
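As a rough illustration of the first step of such an analysis, the sketch below flags third-party invisible pixels in a list of image responses. Several points are assumptions made for the example: an invisible pixel is taken to be a 1x1 image or an image with an empty body, the `ImageResponse` record and its fields are hypothetical, and the site comparison naively keeps the last two host labels, whereas a real crawler would rely on the Public Suffix List.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

@dataclass
class ImageResponse:
    """Hypothetical record of one image fetched while loading a page."""
    page_url: str
    image_url: str
    width: int
    height: int
    body_bytes: int

def site(url):
    """Naive site name: last two labels of the host (simplification)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def is_third_party(resp):
    return site(resp.page_url) != site(resp.image_url)

def is_invisible(resp):
    """Assumed definition: 1x1 image, or image without any content."""
    return resp.body_bytes == 0 or (resp.width == 1 and resp.height == 1)

def third_party_invisible_pixels(responses):
    return [r for r in responses if is_third_party(r) and is_invisible(r)]

# Made-up records on reserved example domains.
page = "https://news.example.com/article"
responses = [
    ImageResponse(page, "https://tracker.example.net/p.gif", 1, 1, 43),
    ImageResponse(page, "https://static.example.com/dot.png", 1, 1, 43),
    ImageResponse(page, "https://cdn.example.org/banner.jpg", 728, 90, 20000),
    ImageResponse(page, "https://ads.example.net/empty", 0, 0, 0),
]
pixels = third_party_invisible_pixels(responses)
print([p.image_url for p in pixels])
```

Flagging pixels is only the starting point: the behavioral classification described above looks at what the corresponding requests do, which this sketch does not attempt.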

Privacy implications of switching ON a light bulb in the IoT world

Participants: Mathieu Thiery, Arnaud Legout.  

The number of connected devices is increasing every day, creating smart homes and shaping the era of the Internet of Things (IoT), and most of the time, end-users are unaware of the impact of these devices on their privacy. In this work, we analyze the ecosystem around a Philips Hue smart white bulb in order to assess the privacy risks associated with the use of different devices (smart speaker or button) and smartphone applications to control it. We show that using different techniques to switch this bulb ON or OFF has significant consequences regarding the actors involved (who mechanically gather information on the user's home) and the volume of data sent to the Internet (we measured differences of up to a factor of 100, depending on the control technique we used). Even when the user is at home, these data flows often leave the user's country, creating a situation that is neither privacy friendly (the user is most of the time unaware of the situation), nor sovereign (the user depends on foreign actors), nor sustainable (the extra energy consumption is far from negligible). We therefore advocate a complete change of approach that favors local communications whenever they are sufficient. The preprint documenting this work has been published as a research report [40].


Participants: Arnaud Legout, Mondi Ravi, David Migliacci, Abdelhakim Akodadi, Yanis Boussad.  

We are currently evaluating the relevance of creating a startup for the ElectroSmart project. We are quite advanced in the process, and the creation is planned for June 2020. A "contrat de transfert" (transfer contract) is ready between Inria and ElectroSmart to transfer the intellectual property from Inria to the ElectroSmart company once it is created. Arnaud Legout, the future CEO of the company, obtained the "autorisation de création d'entreprise" (authorization to create a company) from Inria. ElectroSmart has been incubated at PACA Est since December 2018.

The three future co-founders of ElectroSmart (Arnaud Legout, Mondi Ravi, David Migliacci) followed the Digital Startup training from Inria/EM Lyon.

The goal of ElectroSmart is to help people reduce their exposure to EMF and to offer a solution to reduce symptoms associated with exposure to EMF. Electrosensitivity is known to be a complex and multifactorial syndrome that impacts hundreds of millions of people worldwide. We aim to commercialize the first treatment of electrosensitivity based on a non-deceptive placebo (called open-label placebo). It is known today that placebos are an effective treatment for subjective symptoms (which is the case for several symptoms associated with electrosensitivity). The problem with placebos was the assumption that they must be deceptive to be effective. Kaptchuk et al. showed recently that non-deceptive placebos are as effective as deceptive placebos, so the ethical use of placebos is now possible. ElectroSmart wants to be the first company to commercialize non-deceptive placebos for electrosensitive people. For details, see https://electrosmart.app/.