Thesis of Yulian Yang


Subject:
Smantic overlay Peer-to-Peer networks for Information Retrieval

Defense date: 28/03/2014

Advisor: Sylvie Calabretto

Summary:

A Peer-to-Peer (P2P) platform is considered for collaborative Information Retrieval (IR). Each peer hosts a collection of text documents with subjects related to the interests of the peer owner. Without a global indexing mechanism, peers locally index their documents, and provide the service to answer queries. The object is to design a decentralized protocol, enabling the peers to collaboratively forward queries from the initiator to the peers with relevant documents.
Semantic Overlay Network (SON) is one of the state-of-the-art solutions. In SONs, peers with semantically similar resource are clustered. IR is performed by forwarding queries to the relevant peer clusters in an informed way. Current approaches for SONs mainly involves peer rewiring. In peer rewiring, random/greedy walkers are periodically sent by each peer. The walkers walk along the links between peers, aiming at exploring more similar peers to replace less similar neighbors of the peer. In this way, the P2P network gradually evolves from a random overlay network to a SON.
Random and greedy walk are individually applied or integrated in peer rewiring, as a constant strategy during the progress of network evolution. It is not considered that the evolution of the network topology may affect the performance of random and greedy walk. For example, when peers are randomly connected with each other, a random walk performs better than greedy walk for exploring similar peers. But as peer clusters gradually emerge in the network, a walker can explore more similar peers by following a greedy strategy. With this motivation, an evolving walking strategy based on Simulated Annealing (SA) is proposed, which evolves from a random walk to a greedy walk along the progress of network evolution. According to the simulation result, SA-based strategy improves the current approaches, both in the efficiency to build a SON and the effectiveness of the subsequent IR.
This thesis contains several advancements with respect to the state of the art in this field. First of all, we identify a generic peer rewiring pattern and formalize it as a three-step procedure. Our technique provides a consistent framework for peer rewiring, while allowing enough flexibility for the users/designers to specify its properties. Secondly, we formalize SON construction as a combinatorial optimization problem, with peer rewiring as its decentralized local search solution, Based on this model, we propose a novel SA-based approach to peer rewiring. Our approach is validated via an extensive experimental study on the eect of network wiring on (i) SON building and (ii) IR on SONs.