Thesis of Vanessa El Khoury


Subject:
Semantic protection of data in distributed multimedia databases

Defense date: 01/09/2011

Advisor: Lionel Brunie
Cotutelle: Harald Kosch

Summary:

Multimedia databases (MMDB) manage data of various types that are often large in size. Most multimedia DBMSs store media data separately from metadata. Because of their large volume and/or their geographical location, multimedia data lend themselves well to distributed management methods that let them be stored remotely (without the metadata).
In addition to data transmission problems, which have been studied extensively and for which selective and global encryption solutions have been proposed, and to problems related to authentication and intellectual property, this new context introduces specific data security issues.
The main issue is securing remote data, which risk being accessible despite the access control mechanisms of the DBMS. Traditional encryption solutions used for data transmission are not suitable in this case, for the following reasons:

* They are very expensive in terms of processing cost and risk undermining the performance of MMDBs.
* They increase the complexity of query processing, to the point of making it unworkable in some cases, since data, once encrypted, can no longer be compared.
* They are based on a physical view of the data (raw encryption) rather than on semantic aspects; this makes it impossible to specify which parts of the data should be protected with which level of security.

In summary, multimedia documents must be stored (locally or remotely) in a format that protects them from unauthorized access. That format must enable full or partial access to the documents and allow protection constraints to be established based on document semantics rather than on physical representation. At the same time, the format must be transparent to the DBMS, which must be able to access and handle the documents as if they were standard ones. Moreover, once a document has been retrieved, the user must be granted partial or full access to it upon presenting her access rights.
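As a purely illustrative sketch of what partial access driven by document semantics could look like, the Python fragment below attaches a required clearance level to each semantically labelled segment of a document and filters the segments according to the user's level. All names used here (Segment, Document, min_level, accessible_view) and the integer clearance model are assumptions made for this example only; they are not the mechanisms designed in the thesis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    label: str       # semantic annotation of the segment (hypothetical labels)
    min_level: int   # hypothetical clearance level required to access it
    payload: bytes   # placeholder for the media data of the segment


@dataclass(frozen=True)
class Document:
    name: str
    segments: Tuple[Segment, ...]


def accessible_view(doc: Document, user_level: int) -> List[Segment]:
    """Return only the segments whose required level the user meets."""
    return [s for s in doc.segments if user_level >= s.min_level]


# Hypothetical document: protection is attached to what a segment shows,
# not to byte ranges of the file.
video = Document("interview.mp4", (
    Segment("background", 0, b""),
    Segment("speaker_face", 2, b""),
    Segment("license_plate", 3, b""),
))

partial = [s.label for s in accessible_view(video, 2)]  # partial access
full = [s.label for s in accessible_view(video, 3)]     # full access
print(partial)
print(full)
```

In this sketch a user at level 2 obtains a partial view of the document, while a user at level 3 obtains all of it; the protection rule is expressed on the semantic label of each segment rather than on its physical encoding.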
This thesis deals with the design and implementation of these mechanisms, both at the level of the DBMS and at the level of the user's processing of the data files.
In this work, we intend to analyze and model various options for semantically protecting multimedia files, including the different levels of protection that may be offered. This involves a complete inventory of all semantic metadata that can influence access to the data, and the definition of the corresponding access policies. We will then study how these new protection rules impact the DBMS (query analysis, adaptation of query processing to the protection mode, query optimization, ...). We will also propose mechanisms to define semantic protection constraints within the document itself. The last part of the project will deal with subjects related to the use of the protected file by the user, such as encryption keys, representation of access rights, ...
The solutions defined for addressing these issues will require defining metadata specific to security. To this end, the MPEG-21 standard will be used.
The proposed solutions will be tested against a full-scale database.