Section: New Results

Service Transparency

Participants : Chadi Barakat, Walid Dabbous, Maksym Gabielkov, Young-Hwan Kim, Arnaud Legout, Byungchul Park, Ashwin Rao, Riccardo Ravaioli, Damien Saucez, Thierry Turletti.

  • The Complete Picture of the Twitter Social Graph

     

    We made an in-depth study of the macroscopic structure of the Twitter social graph, unveiling the highways on which tweets propagate, the specific user activity associated with each component of this structure, and the evolution of the structure over the past 6 years. For this study, we crawled Twitter to retrieve all accounts and all social relationships (follow links) among accounts; the crawl completed in July 2012 with 505 million accounts interconnected by 23 billion links. We then presented a methodology to unveil the macroscopic structure of the Twitter social graph. This structure consists of 8 components defined by their connectivity characteristics. Each component groups users with a specific usage of Twitter; for instance, we identified components that gather spammers or celebrities. Finally, we introduced a method to approximate the macroscopic structure of the Twitter social graph in the past, validated this method using old datasets, and discussed the evolution of the structure during the past 6 years. This work has been accepted at Sigmetrics'14 [23].
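    To give a feel for what a decomposition "by connectivity characteristics" can look like, here is a minimal sketch of a classic bow-tie split of a directed follow graph into a strongly connected core, the nodes that reach it (IN), the nodes it reaches (OUT), and everything else. This is an assumption for illustration only: the summary above does not name the 8 components or the algorithm used, and the function names `reachable` and `bow_tie` are hypothetical. The quadratic core search is only suitable for toy graphs, not for a crawl of 505 million accounts.

```python
from collections import defaultdict, deque


def reachable(adj, sources):
    """Return every node reachable from `sources` by following edges in `adj`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bow_tie(edges):
    """Split a directed graph into core (largest SCC), IN, OUT and OTHER (assumed split)."""
    fwd, rev, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
        nodes.update((u, v))

    # The SCC of a node is its forward-reachable set intersected with its
    # backward-reachable set; keep the largest one as the core.
    core, assigned = set(), set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        scc = reachable(fwd, [n]) & reachable(rev, [n])
        assigned |= scc
        if len(scc) > len(core):
            core = scc

    seed = list(core)
    out_part = reachable(fwd, seed) - core   # reached from the core
    in_part = reachable(rev, seed) - core    # can reach the core
    other = nodes - core - in_part - out_part
    return {"core": core, "in": in_part, "out": out_part, "other": other}


# Toy follow graph: a, b, c follow each other in a cycle; d follows a; c follows e.
parts = bow_tie([("a", "b"), ("b", "c"), ("c", "a"),
                 ("d", "a"), ("c", "e"), ("f", "g")])
print(parts)
```

    A finer split (for example separating disconnected nodes from tendrils) would refine the `other` set further; how the actual 8 components are defined is described in [23], not here.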

  • Meddle: Middleboxes for Increased Transparency and Control of Mobile Traffic

     

    Meddle is a platform that relies on traffic indirection to diagnose mobile Internet traffic. Meddle is motivated by the absence of built-in support from ISPs and mobile OSes to freely monitor and control mobile Internet traffic; the restrictions imposed by mobile OSes and ISPs also make existing approaches impractical. Meddle overcomes these hurdles by relying on the native support for traffic indirection in mobile OSes. Specifically, Meddle proxies mobile Internet traffic through software-defined middleboxes configured for mobile traffic diagnosis. We use Meddle to test the limits of the network perspective of mobile Internet traffic offered by traffic indirection. We use this perspective to characterize and control the behavior of mobile applications and to provide a first look at ISP interference with mobile Internet traffic. We then performed controlled experiments on 100 popular iOS and Android applications to show how Meddle can be used to identify misbehavior and to block the traffic causing it. Unlike existing solutions, this can be done without voiding the device's warranty and can be activated on demand. This work was done in the context of Ashwin Rao's PhD thesis [11], in collaboration with Northeastern University and Berkeley.

  • Understanding of modern web traffic

     

    In recent years, with the advent of mobile devices, web traffic has shifted from statically to dynamically generated content. Interestingly, while it is well known that network protocols are intertwined in such a way that the characteristics of one layer are affected by those of the other layers, most measurement work done so far does not pay enough attention to this aspect. We therefore conducted a cross-layer measurement analysis that confronts all the layers, from low-level technological details to high-level user behavior, to shed new light on this issue. To support our study, we analysed an Internet packet trace and showed how this cross-layer approach can explain why TCP flows in mobile traffic are larger than usual. We are currently refining our study to characterise the discrepancies between network stack protocol implementations according to the mobile or non-mobile nature of the devices, as well as their operating system and version. This work is currently under submission.

  • Checking Traffic Differentiation at the Internet Access

     

    In the last few years, ISPs have been reported to discriminate against specific user traffic, especially traffic generated by bandwidth-hungry applications. Network neutrality, the principle that an ISP should treat all incoming packets equally, has been a hot topic ever since. We propose Chkdiff, a novel method to detect network neutrality violations that takes a radically different approach from existing work: it aims to be agnostic to both the application and the differentiation technique. We achieve this in three steps. First, we perform measurements with the user's real traffic instead of using specific application traces. Second, we assume that discrimination can take place on any packet field, which requires us to preserve the integrity of all the traffic we intend to test. Third, we detect differentiation by comparing the performance of a traffic flow against that of all other traffic flows from the same user, considered as a whole. The performance of Chkdiff strongly depends on the way routers reply to probe packets. We carried out large-scale experiments to understand how routers reply to our probes, and we calibrated models of these replies. The next step will be to evaluate the performance of Chkdiff under these models, before making the tool publicly available to the community. Chkdiff is currently the subject of a collaboration with I3S around the PhD thesis of Riccardo Ravaioli (funded by the Labex UCN@Sophia). The work is ongoing and will be submitted soon.
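    The third step, comparing one flow against all other flows of the same user taken as a whole, can be sketched as follows. This is a minimal illustration and not Chkdiff itself: the choice of the two-sample Kolmogorov-Smirnov statistic, the `threshold` value, the per-flow delay samples and the function names are all assumptions made for the example.

```python
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)  # fraction of xs that are <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of ys that are <= v
        gap = max(gap, abs(fx - fy))
    return gap


def flag_differentiated(delays_by_flow, threshold=0.5):
    """Return the flows whose delays deviate from all other flows pooled together.

    `threshold` is a hypothetical cut-off chosen for this toy example.
    """
    flagged = []
    for flow, delays in delays_by_flow.items():
        # Pool the samples of every other flow into one reference distribution.
        rest = [d for other, ds in delays_by_flow.items() if other != flow for d in ds]
        if delays and rest and ks_statistic(delays, rest) > threshold:
            flagged.append(flow)
    return flagged


# Made-up delay samples (ms) for four flows of one user.
samples = {
    "web":  [20, 21, 22, 20, 21],
    "dns":  [19, 20, 21, 22, 20],
    "mail": [20, 22, 21, 19, 21],
    "p2p":  [60, 62, 61, 63, 59],
}
print(flag_differentiated(samples))
```

    Pooling the other flows makes the reference independent of any single application, which is the point of the "considered as a whole" comparison; a real tool would also need a principled significance test rather than a fixed cut-off.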

  • Lightweight Enhanced Monitoring for High-Speed Networks

     

    As part of our collaboration with Politecnico di Bari, we worked on LEMON, a lightweight enhanced monitoring algorithm based on packet sampling. This solution targets a pre-assigned accuracy on bitrate estimates for each flow monitored at a router interface. To this end, LEMON takes into account some basic properties of the flows, which can be easily inferred from a sampled stream, and exploits them to dynamically adapt the monitoring time window on a per-flow basis. Its effectiveness was tested using real packet traces. Experimental results show that LEMON is able to finely tune, in real time, the monitoring window associated with each flow, and that its communication overhead can be kept low by choosing an appropriate aggregation policy for message exporting. Moreover, compared to a classic fixed-scale monitoring approach, it better satisfies the accuracy requirements of bitrate estimates. Finally, LEMON incurs a low processing overhead, which can be easily sustained by currently deployed routers, such as a Cisco 12000 device. This work has been published in [18].
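    The idea of adapting the monitoring window per flow until a target accuracy is met can be illustrated with a small sketch. It is not the LEMON algorithm from [18]: it assumes independent (Bernoulli) packet sampling with probability `p`, uses the standard Horvitz-Thompson estimator of the byte count and of its variance, and simply grows the window in fixed steps. All names and parameters (`adapt_window`, `step_s`, `max_window_s`) are hypothetical.

```python
import math
import random


def estimate_bitrate(sampled_sizes, p, window_s):
    """Bitrate estimate (bit/s) from sampled packet sizes, with its relative std error."""
    total_bytes = sum(sampled_sizes) / p  # scale up by the sampling probability
    variance = (1 - p) / p**2 * sum(s * s for s in sampled_sizes)
    rel_err = math.sqrt(variance) / total_bytes if total_bytes else float("inf")
    return 8 * total_bytes / window_s, rel_err


def adapt_window(packets, p, target_rel_err, step_s=1.0, max_window_s=60.0, seed=0):
    """Grow one flow's window until the estimated relative error meets the target.

    packets: list of (timestamp_s, size_bytes), timestamps starting near 0.
    Returns (window_s, bitrate_estimate, relative_error).
    """
    rng = random.Random(seed)
    sampled = [(t, s) for t, s in packets if rng.random() < p]  # packet sampling
    window = step_s
    while True:
        sizes = [s for t, s in sampled if t < window]
        rate, err = estimate_bitrate(sizes, p, window)
        if err <= target_rel_err or window >= max_window_s:
            return window, rate, err
        window += step_s


# Two synthetic flows of 1000-byte packets over 60 s: 100 and 1000 packets per second.
light = [(i / 100, 1000) for i in range(6000)]
heavy = [(i / 1000, 1000) for i in range(60000)]
print(adapt_window(light, p=0.1, target_rel_err=0.05))
print(adapt_window(heavy, p=0.1, target_rel_err=0.05))
```

    Under these assumptions a heavier flow collects enough samples sooner and therefore ends up with a shorter window, which is the kind of per-flow adaptation the paragraph describes.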

  • Packet Extraction Tool for Large Volume Network Traces

     

    Network packet tracing has been used for many different purposes during the last few decades, such as network software debugging, network performance analysis, and forensic investigation. Meanwhile, packet traces keep growing as network speeds rapidly increase. Handling such huge traces requires not only more hardware resources but also efficient software tools, and traditional tools are inefficient at dealing with such large packet traces. We proposed pcapWT, an efficient packet extraction tool for large traces. PcapWT provides fast packet lookup by indexing the original trace using a Wavelet Tree structure. In addition, pcapWT supports multi-threading to avoid synchronous I/O and blocking system calls during file processing, and is particularly efficient on machines with SSDs. PcapWT shows remarkable performance improvements over traditional tools such as tcpdump and more recent tools such as pcapIndex, in terms of both index data size and packet extraction time. Our benchmark using large and complex traces shows that pcapWT reduces the index data size to below 1% of the volume of the original traces. Moreover, packet extraction performance is 20% better than with pcapIndex. Furthermore, when only a small number of packets are retrieved, pcapWT is hundreds of times faster than tcpdump. These results, obtained in collaboration within the CIRIC, have just been submitted to Computer Networks [34].
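    To show why a wavelet tree suits packet lookup, the sketch below indexes a sequence of per-packet symbol IDs (for instance, destination ports mapped to small integers) and answers "which packets carry this value?" through rank and select queries. It is a simplified teaching version and not pcapWT's on-disk format: it stores plain prefix-count lists instead of compressed bit vectors, implements select by binary search over rank, and the class and the port-to-ID mapping are assumptions for the example.

```python
class WaveletTree:
    """Wavelet tree over a non-empty sequence of small non-negative integers."""

    def __init__(self, seq, lo=None, hi=None):
        self.n = len(seq)
        self.lo = min(seq) if lo is None else lo
        self.hi = max(seq) if hi is None else hi
        self.left = self.right = None
        # prefix[i] = how many of the first i symbols are routed to the left child
        self.prefix = [0]
        if self.lo == self.hi or not seq:
            return  # leaf (single symbol) or empty subtree
        mid = (self.lo + self.hi) // 2
        for s in seq:
            self.prefix.append(self.prefix[-1] + (s <= mid))
        self.left = WaveletTree([s for s in seq if s <= mid], self.lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, self.hi)

    def rank(self, symbol, i):
        """Count occurrences of `symbol` among the first i positions."""
        node = self
        if not (node.lo <= symbol <= node.hi):
            return 0
        while node.left is not None:
            mid = (node.lo + node.hi) // 2
            if symbol <= mid:
                i, node = node.prefix[i], node.left
            else:
                i, node = i - node.prefix[i], node.right
        return i if node.lo == node.hi else 0

    def select(self, symbol, k):
        """0-based position of the k-th occurrence of `symbol` (k >= 1), or -1."""
        if k < 1 or self.rank(symbol, self.n) < k:
            return -1
        lo, hi = 1, self.n
        while lo < hi:  # smallest prefix length containing k occurrences
            mid = (lo + hi) // 2
            if self.rank(symbol, mid) >= k:
                hi = mid
            else:
                lo = mid + 1
        return lo - 1

    def positions(self, symbol):
        """All packet positions whose symbol equals `symbol`."""
        return [self.select(symbol, k) for k in range(1, self.rank(symbol, self.n) + 1)]


# Destination ports of six packets, mapped to IDs in sorted order (53->0, 80->1, 443->2).
ports = [80, 443, 53, 80, 443, 80]
ids = {port: i for i, port in enumerate(sorted(set(ports)))}
index = WaveletTree([ids[p] for p in ports])
print(index.positions(ids[80]))  # packets sent to port 80
```

    Once the matching positions are known, an extraction tool only has to seek to those packets in the trace instead of scanning the whole file, which is where the lookup speed-up comes from.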

  • Impact of new transport protocols on BitTorrent performance

     

    In [27], we address the trade-off between data plane efficiency and control plane timeliness in BitTorrent performance. We argue that loss-based congestion control protocols can fill large buffers, leading to a higher end-to-end delay, unlike low-priority or delay-based congestion control protocols. We perform experiments with both the uTorrent and mainline BitTorrent clients, and we study the impact of uTP (a novel transport protocol proposed by BitTorrent) and several TCP congestion control algorithms (Cubic, New Reno, LP, Vegas and Nice) on the download completion time. In short, when all peers in the swarm use the same congestion control algorithm, we observe that the specific algorithm has only a limited impact on the swarm performance. Conversely, when a mix of TCP congestion control algorithms coexists, peers employing a delay-based low-priority algorithm exhibit shorter completion times.