Thesis of Ikenna Oluigbo
Subject:
Defense date: 24/11/2022
Advisor: Hamida Seba
Coadvisor: Mohammed Haddad
Summary:
A graph or network is a mathematical structure which models the interactions between different entities. A typical graph consisting of vertices (or nodes) and edges (or links) connecting these vertices, can be used to model complex entities and the interactions between these entities. Graph representation learning is a technique aimed at learning low-dimensional vector embedding for nodes, edges, or subgraphs in a graph, while preserving the graph features such as structure, node/edge attributes, similarity property, etc. Existing representation learning models are faced with some limitations with respect to extracting network properties and embedding nodes. Firstly, most of these models are primarily focused on extracting and preserving only the structural patterns of the network without considering the rich contextual attributes of nodes. Secondly, many existing models are faced with high time and memory computational complexities. Thirdly, effectively preserving node neighborhood structures is a challenge, especially for models using totally randomized and not well guided random walk techniques. Fourthly, some models rely on tunable parameters to sample nodes in a network, which require expert domain knowledge to determine the right parameters. In light of these challenges, our main objective is to propose new and scalable algorithms to effectively learn embedding for nodes in a network using their structural information and contextual attributes. We leverage majorly on random walk based methods and we propose improved guided walk techniques for aggregating graph properties to learn embedding for nodes through four contributions: In our first contribution, we extensively reviewed various graph representation learning models and evaluated the performance of their learned embedding on different downstream learning tasks. In our second contribution, we propose the use of a novel node neighborhood encoding technique to improve the accuracy of node embedding in learning tasks. The compact node neighborhood representation captures the semantic and structural properties of a node’s neighborhood in a single dimension. Through the compact neighborhood encoding for each node as an additional layer of dimensionality, we are able to improve the accuracy of the embedding for learning tasks. In our third contribution, we developed a novel node representation framework which uses the node attributes as well as their neighborhood structural information to capture node similarities. For this, we improved upon the random walk approach through using a semi-supervised decision function to model a flexible probability walk, which explores different neighborhood spaces to sample nodes sharing similar features. The strength of this approach is that we are able to jointly preserve, in the learned embedding, the node’s neighborhood structures and the rich node attributes in a simple and efficient manner. For our fourth contribution, we propose a new graph representation learning model to capture the structural identity for nodes using a Poisson probability metric with kl-divergence estimation scores to sample node's structural identity similarity in a guided walk.
We implemented all the proposed approaches and conducted extensive experimentation with various benchmark datasets across several domains, to show the success of our techniques for effective graph representation learning.
Jury:
M. Couturier Raphaël | Professeur(e) | Université de Franche Comte | Rapporteur(e) |
M. Renault Eric | Professeur(e) | ESIEE Paris | Rapporteur(e) |
M. Cherifi Hocine | Professeur(e) | Université de Bourgogne | Examinateur(trice) |
Mme Faci Noura | Maître de conférence | LIRIS Université Claude Bernard Lyon 1 | Examinateur(trice) |
Mme Seba Hamida | Maître de conférence | LIRIS Université Claude Bernard Lyon 1 | Directeur(trice) de thèse |
M. Haddad Mohammed | Maître de conférence | LIRIS Université Claude Bernard Lyon 1 | Co-directeur (trice) |