DIONYSOS - 2017 - Annual activity report

Section: New Software and Platforms

AdaComp

Participants: Corentin Hardy, Bruno Sericola

Our recent work with Technicolor on deep learning and distributed learning led us to study a form of data parallelism called the Parameter Server model. In this model, the training of a deep neural network is shared among many devices (the workers) through a centralized Parameter Server (PS). We deployed a platform that allows us to experiment with different state-of-the-art algorithms based on the PS model. The platform consists of a single powerful machine running many Linux containers (LXC). Each container executes a TensorFlow session and acts as either a worker or a PS. The first experiments served to validate that the platform works correctly, to better understand its limitations, and to determine what can be measured in an unbiased way. Further experiments helped us understand the role of the different parameters of the overall model, mainly those related to the distribution over user devices, and their impact on learning (accuracy of the model, number of iterations needed to learn it). During these experiments, we observed that the main bottleneck is the ingress traffic of the PS during the learning phase. To reduce this traffic, we chose to compress the messages sent by the workers to the PS. In [43] we proposed a method that reduces this ingress traffic by up to two orders of magnitude while maintaining good accuracy of the learned model. This method, called AdaComp, is available on GitHub (https://github.com/Hardy-c/AdaComp).
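
To make the idea of compressing worker-to-PS messages concrete, here is a minimal sketch in plain Python. It is not the AdaComp algorithm and does not use TensorFlow or LXC: it assumes a toy quadratic objective and substitutes a simple top-k sparsification (each worker sends only its k largest-magnitude gradient entries) as a stand-in for the compression scheme. All names (`compress_top_k`, `ParameterServer`, `worker_gradient`, `train`) and all parameter values are hypothetical and chosen for illustration only.

```python
import random


def compress_top_k(gradient, k):
    """Keep only the k largest-magnitude entries, as (index, value) pairs.

    Assumption: plain top-k sparsification, used here as a stand-in for
    the compression scheme; this is NOT the AdaComp selection rule.
    """
    ranked = sorted(range(len(gradient)),
                    key=lambda i: abs(gradient[i]), reverse=True)
    return [(i, gradient[i]) for i in sorted(ranked[:k])]


class ParameterServer:
    """Toy PS: holds the parameters and applies sparse updates."""

    def __init__(self, size, learning_rate=0.1):
        self.params = [0.0] * size
        self.learning_rate = learning_rate
        self.values_received = 0  # crude proxy for ingress traffic

    def push(self, sparse_update):
        # Apply one gradient-descent step on the transmitted entries only.
        for index, value in sparse_update:
            self.params[index] -= self.learning_rate * value
        self.values_received += len(sparse_update)


def worker_gradient(params, target):
    """Gradient of the toy loss 0.5 * ||params - target||^2."""
    return [p - t for p, t in zip(params, target)]


def train(size=100, k=10, workers=4, rounds=200, seed=0):
    """Run a sequential simulation of workers pushing compressed gradients."""
    rng = random.Random(seed)
    target = [rng.uniform(-1, 1) for _ in range(size)]
    ps = ParameterServer(size)
    for _ in range(rounds):
        for _ in range(workers):
            grad = worker_gradient(ps.params, target)
            ps.push(compress_top_k(grad, k))
    error = max(abs(p - t) for p, t in zip(ps.params, target))
    return ps, error


if __name__ == "__main__":
    ps, error = train()
    dense_values = 200 * 4 * 100  # what uncompressed pushes would send
    print("values received by PS:", ps.values_received, "of", dense_values)
    print("max parameter error:", error)
```

With these illustrative settings, each worker sends 10 of 100 gradient entries per push, so the PS receives one tenth of the values an uncompressed run would send. In practice the indices must be transmitted too, which a real traffic measurement would have to account for.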
