Thesis of Peng Wang


Subject:
Handwriting analysis and information retrieval in large databases of 18th-century clandestine manuscripts

Defense date: 18/11/2014

Advisor: Christophe Garcia
Coadvisor: Véronique Eglin

Summary:

The thesis concerns the development of original and comprehensive methods for exploiting heritage manuscript collections. These methods, focused on the corpus of clandestine correspondence in 18th-century Europe, will lead to solutions that are today either partly non-existent or confined to small, restricted collections: aids to navigation within a manuscript collection, indexing (of text and of individual shapes such as graphemes), and reading assistance through a contribution to handwriting recognition. More generally, the thesis aims to develop solutions for content characterization and recognition in multi-writer corpora (more than 120,000 documents and dozens of different hands).

This multidisciplinary project has a fundamental component in the history of classical thought: it focuses on the role of written communication, namely the letters and manuscripts of philosophical scholars, in the development of the Republic of Letters and the formation of the philosophical spirit between 1685 and 1789. The electronic corpus on which the thesis will work allows original and fruitful research in terms of historical and philosophical analysis, electronic instrumentation, and image processing of manuscripts.

The project is generic in the sense that the objects under study cannot be modeled by standard representations, because their content is highly heterogeneous and composite. It therefore requires methods that are flexible and adaptive (adjusting to the specifics of the content, in particular the variability of handwriting) and robust (largely insensitive to noise and to variations in image quality), favoring a mixed approach that combines analysis and handwriting recognition, ranging from global modeling of a document's content to allograph modeling and dictionaries of shapes.