Thesis of Bastien Moysset


Subject:
Detection, localization and typing of text in heterogeneous document images with deep neural networks

Defense date: 31/12/2018

Advisor: Christian Wolf

Summary:

This PhD thesis addresses the detection and localization of text lines in highly heterogeneous document databases with complex layouts. To this end, we plan to use machine learning techniques, in particular neural networks. The two main contributions to date are the use of a recurrent neural network within a CTC (Connectionist Temporal Classification) framework to segment a paragraph into lines, and the adaptation of a convolutional neural network to directly predict line bounding box coordinates within a full document image. One of the main remaining challenges is to take the context of the document into account and to model the interactions between lines.