OPIS - 2019

Section: New Results

Representation Learning on Real-World Graphs

Participants: Fragkiskos Malliaros, Abdulkadir Çelikkanat

Network representation learning (NRL) methods aim to map each vertex of a given network into a low-dimensional space while preserving both the local and the global structure of the network. In recent years, various approaches based on random walks have been proposed to learn node embeddings, owing to their success in several challenging problems. In this work, we have introduced two methodologies to compute latent representations of nodes based on random walks.
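As a rough illustration of the random walk ingredient shared by such methods, the following minimal sketch samples a uniform random walk and extracts the node pairs that co-occur within a window. The toy graph, the function names, and the window size are assumptions made for this example and are not taken from the models described in this section.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def cooccurrence_pairs(walk, window):
    """Yield (center, context) pairs for nodes at most `window` steps apart in the walk."""
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, walk[j]


# Hypothetical toy graph: the undirected 4-cycle 0-1-2-3-0.
adjacency = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
walk = random_walk(adjacency, start=0, length=6, rng=rng)
pairs = list(cooccurrence_pairs(walk, window=2))
```

Pairs of this kind are the raw material that random walk-based embedding models typically consume, whether through a Skip-Gram style objective or through a co-occurrence matrix.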

In particular, we have proposed Kernel Node Embeddings (KernelNE) [53], a model that aims to bring together two popular approaches for NRL, namely matrix factorization and random walk-based models. KernelNE is a weighted matrix factorization model which encodes random walk-based information about the nodes of the graph. The main benefit of this formulation is that it allows kernel functions to be used in the computation of the embeddings.
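To make concrete what it can mean to use a kernel function on embeddings, the sketch below compares a plain inner product with a Gaussian kernel evaluated on two embedding vectors. The choice of the Gaussian kernel, the bandwidth `sigma`, and the example vectors are assumptions made purely for illustration; the text above does not specify which kernels KernelNE uses or how they enter its objective.

```python
import math


def inner_product(x, y):
    """Standard similarity used in plain matrix factorization."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical 2-dimensional embeddings of three nodes.
emb_u = [0.5, -1.0]
emb_v = [0.5, -1.0]  # identical to emb_u
emb_w = [2.0, 1.0]   # far from emb_u

sim_linear = inner_product(emb_u, emb_w)
sim_same = gaussian_kernel(emb_u, emb_v)
sim_far = gaussian_kernel(emb_u, emb_w)
```

Unlike the inner product, the Gaussian kernel is bounded in (0, 1] and depends only on the distance between the two vectors, which is one way a kernel can change the notion of node similarity that a factorization model fits.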

Our second approach is motivated by the fact that the popular Skip-Gram algorithm models the conditional distribution of nodes within a random walk using the softmax function, which may prevent it from capturing richer types of interaction patterns among nodes that co-occur within a random walk. Here we argue that considering more expressive conditional probability models to relate nodes within a random walk sequence might lead to more informative representations. To this end, we have introduced the Exponential Family Graph Embedding (EFGE) model [54], which capitalizes on exponential family distribution models to capture interactions between nodes.
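For reference, the softmax model used by Skip-Gram and the general form of an exponential family distribution can be written as follows. The notation is an assumption introduced here: $V$ is the node set, $\alpha_v$ and $\beta_u$ denote the center and context embedding vectors, and $\eta$, $T$, $h$, $A$ are the usual natural parameter, sufficient statistic, base measure, and log-normalizer. How EFGE ties $\eta$ to the embeddings is not specified in this section and is left open below.

```latex
% Skip-Gram: softmax conditional probability of context node u given center node v
p(u \mid v) = \frac{\exp\!\left(\beta_u^{\top} \alpha_v\right)}
                   {\sum_{w \in V} \exp\!\left(\beta_w^{\top} \alpha_v\right)}

% General exponential family form for an observed interaction y
p(y \mid \eta) = h(y) \, \exp\!\left(\eta \, T(y) - A(\eta)\right)
```

Different choices of $h$, $T$, and $A$ recover familiar distributions such as the Bernoulli, Poisson, or Gaussian, which is what makes the exponential family a flexible alternative to a single softmax model.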

We have evaluated our methods on two downstream tasks, node classification and link prediction, in social, information, and biological networks. The experimental results demonstrate that random walk-based models combined with kernels, as well as with exponential family distributions, outperform widely known baseline NRL methods.