Thesis of Rémi Tournaire

Automatic discovery of ontology mappings

Start date: 01/10/2007
Defense date: 08/12/2010

Advisor: Jean-Marc Petit


In this thesis, we adopt a formal approach to defining and discovering probabilistic inclusion mappings between two taxonomies, with clear semantics, for the purpose of collaborative document exchange. We compare two ways of modeling probabilistic mappings that remain compatible, through a monotonicity property, with the logical constraints declared in each taxonomy, and we show that the two models are complementary for identifying the relevant mappings. We provide a way to estimate the probability of a mapping with a Bayesian technique based on statistics of the extensions of the classes involved in the mapping. If the sets of instances of the two taxonomies are disjoint, classifiers are used to merge them. We then present ProbaMap, a "generate and test" algorithm that uses the two mapping models to discover the most probable mappings between two taxonomies. We conduct a thorough experimental analysis of ProbaMap. We present a synthetic data generator that produces controlled inputs for quantitative and qualitative analysis over a wide spectrum of situations. We also present two sets of experimental results on real data: the alignment of the OAEI "Directory" dataset, and a comparison on the alignment of Web directories, on which ProbaMap [...] (... 2003). Future work includes designing a system that answers probabilistic queries by reusing probabilistic mappings, and converting the coefficients returned by matching methods into probabilities.
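
To make the estimation and "generate and test" steps more concrete, here is a minimal Python sketch. It is not the thesis's implementation. It assumes that the two mapping models can be read as a conditional probability P(B | A) and the probability of the implication "not A or B", that the Bayesian estimate is the posterior mean under a uniform Beta(1, 1) prior, and that a candidate is kept when its conditional estimate reaches a threshold. All function names, the toy taxonomies and the threshold are hypothetical.

```python
from itertools import product


def posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Posterior mean of a Bernoulli parameter under a Beta(alpha, beta) prior."""
    return (successes + alpha) / (trials + alpha + beta)


def conditional_estimate(ext_a, ext_b):
    """Estimate P(B | A) from the share of A's instances that also belong to B."""
    return posterior_mean(len(ext_a & ext_b), len(ext_a))


def implication_estimate(ext_a, ext_b, universe):
    """Estimate P(not A or B) from the share of instances that are not in A minus B."""
    return posterior_mean(len(universe) - len(ext_a - ext_b), len(universe))


def generate_and_test(tax1, tax2, universe, threshold):
    """Generate every candidate mapping 'A included in B' and keep those whose
    conditional estimate reaches the threshold (an assumed acceptance rule)."""
    kept = []
    for (a, ext_a), (b, ext_b) in product(tax1.items(), tax2.items()):
        p_cond = conditional_estimate(ext_a, ext_b)
        p_impl = implication_estimate(ext_a, ext_b, universe)
        if p_cond >= threshold:
            kept.append((a, b, p_cond, p_impl))
    # Most probable mappings first, ranked by the conditional estimate.
    return sorted(kept, key=lambda m: m[2], reverse=True)


# Hypothetical toy data: each class maps to its extension (a set of instance ids).
tax1 = {"Rock": {1, 2, 3, 4}, "Jazz": {5, 6}}
tax2 = {"Music": {1, 2, 3, 4, 5, 6, 7}, "Guitar": {1, 2, 8}}
universe = set(range(1, 9))

mappings = generate_and_test(tax1, tax2, universe, threshold=0.7)
for a, b, p_cond, p_impl in mappings:
    print(f"{a} -> {b}: conditional={p_cond:.3f}, implication={p_impl:.3f}")
```

This sketch enumerates every pair of classes exhaustively. It does not use the monotonicity property or the logical constraints of the taxonomies to prune candidates, and it assumes the instances of both taxonomies are already shared rather than merged by classifiers.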

