Thesis of Dimitri Gominski


Subject:
Description, matching and indexing of multi-date and multi-source images (original title: Description, appariement et indexation d'images multi-date et multi-source)

Defense date: 09/11/2021

Advisor: Liming Chen
Coadvisor: Mohsen Ardabilian

Summary:

As the volume of digitally accessible images keeps growing, establishing connections between them to organize and analyse data becomes all the more important. A typical formulation for connecting images without relying on metadata is content-based image retrieval (CBIR). Like other computer vision applications, CBIR has benefited from the expressivity of convolutional neural networks (CNNs) and has achieved unprecedented results on standard benchmarks. However, it is hard to say whether this performance stems from increasingly sophisticated architectures and models, or simply from the availability of a training dataset that matches the use case, i.e. one with similar visual and semantic characteristics. Indeed, the usual paradigm of pairing a model with a training dataset shows its limits as soon as one leaves the setting characterized by the training data: performance drops when the model is tested on different data, or on data with too much variability.

This thesis addresses this issue by taking a critical look at deep learning methods and their real application potential. In the context of multi-source geographical imagery, a benchmark is proposed to characterize a new research problem: heterogeneous, "low-data" (without training data) image retrieval, with a use case where defining a training dataset and a baseline method is not easy: the interconnection of iconographic collections from different heritage institutions. With this benchmark, new measures are proposed to qualify the generalization ability of a model in a CBIR context, followed by technical solutions that avoid the risky task of defining similar visual and semantic characteristics. The discussion of the results highlights that the architecture of neural networks is probably given too much importance, and points to promising ideas in CBIR that provide model-agnostic tools and make it possible to exploit the comparative advantages of different models trained on different datasets. Finally, the value of this generalist approach is confirmed by a second application to land-use classification with high-resolution satellite imagery, a case where, despite the abundance of methods and data, these are confined to a set of small datasets and therefore have limited application potential.


Jury:
Mr Bell Peter; Professor; Friedrich-Alexander Universität Erlangen-Nürnberg, Germany; Reviewer
Mr Joly Philippe; Associate Professor; Université Paul Sabatier, Toulouse; Reviewer
Ms Stoter Jantien; Professor; Delft University of Technology, Netherlands; President
Mr Samaras Dimitris; Professor; Stony Brook University, United States; Examiner
Ms Gouet-Brunet Valérie; Research Director; Université Gustave Eiffel; Co-advisor
Mr Chen Liming; Professor; Ecole Centrale de Lyon; Co-advisor