Thesis of Tarek Sayah

Subject:

Specification and management of access control policies in an environment of distributed data access.

Start date: 16/09/2013
Defense date: 08/09/2016

Advisor: Mohand-Said Hacid

Summary:

The emergence of the Semantic Web has led to the rapid adoption of the RDF (Resource Description Framework) format for describing data and the links between them. The RDF graph model is well suited to representing semantic links between Web objects identified by IRIs (Internationalized Resource Identifiers). Applications that publish and share potentially sensitive RDF data are multiplying in many areas, such as bioinformatics, e-government, and the open data movement. Controlling access to RDF content, and exposing information selectively according to the privileges of the requesters, is therefore becoming increasingly important. A solution is particularly awaited in the open data context, because it would encourage data providers to publish their data selectively, on their own terms.
Two scientific challenges motivate research into the selective exposure of distributed RDF data. The first is the distribution of the data sources and of the authorities that control them. In the naturally open context of RDF data exchange, assumptions of mutual trust or of a single shared authentication mechanism are unrealistic, so new mechanisms are needed to identify users and to integrate security policies enacted by different authorities. The execution of distributed queries under access control filtering must also be addressed. The second challenge arises from the deduction mechanisms available for RDF data (e.g., RDFS, OWL). When an owner wants to prohibit access to a piece of information, it is also necessary to ensure that the published data cannot be used to infer that supposedly secret information through RDF inference mechanisms.
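The inference risk described above can be illustrated with a minimal sketch. It is not the method developed in the thesis: it assumes a toy graph of plain Python tuples, hypothetical `ex:` identifiers, and only one RDFS entailment rule (the `rdfs:domain` rule), and the helper names `rdfs_domain_closure` and `leaked` are invented for this illustration.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def rdfs_domain_closure(triples):
    """Apply the RDFS domain rule once:
    (p rdfs:domain c) and (s p o)  =>  (s rdf:type c)."""
    # Collect (property, class) pairs declared with rdfs:domain.
    domains = {(s, o) for (s, p, o) in triples if p == RDFS_DOMAIN}
    inferred = {
        (s, RDF_TYPE, c)
        for (s, p, o) in triples
        for (prop, c) in domains
        if p == prop
    }
    return set(triples) | inferred


def leaked(published, secrets):
    """Return the secret triples that can be re-derived from the published ones."""
    return secrets & rdfs_domain_closure(published)


# Hypothetical data: the owner wants to hide that alice is a patient.
graph = {
    ("ex:alice", RDF_TYPE, "ex:Patient"),
    ("ex:alice", "ex:treatedBy", "ex:bob"),
    ("ex:treatedBy", RDFS_DOMAIN, "ex:Patient"),
}
secrets = {("ex:alice", RDF_TYPE, "ex:Patient")}

# Naive filtering: simply drop the secret triple before publishing.
published = graph - secrets

# The secret is absent from the published data but can still be inferred.
print(leaked(published, secrets))
```

In this toy case, removing the secret triple is not enough: the published `ex:treatedBy` statement together with its `rdfs:domain` declaration allows the hidden type to be re-derived, which is why an access control policy has to be checked against the inference closure and not only against the explicit triples.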
An initial survey of the literature will lead to criteria for assessing the state of the art in access control for RDF data, including the expressiveness of the policy specification language, the resolution of conflicts between rules that yield contradictory decisions, and the verification of unauthorized inferences. The tools proposed to address the two issues described above include static analysis of RDF queries in the presence of inference (in particular the query containment problem), data integration and schema mapping tools, and declarative models for distributed access control.