Thesis of Hoang Viet Tuan Nguyen

Handling data quality in extraction and selection of evolutions from displacement field time series obtained by satellite imagery

Defense date: 10/10/2018

Advisor: Nicolas Meger
Co-advisors: Catherine Pothier, Christophe Rigotti, Emmanuel Trouve


This PhD thesis deals with knowledge discovery from Displacement Field Time Series (DFTS) obtained by satellite imagery. Such series now occupy a central place in the study and monitoring of natural phenomena such as earthquakes, volcanic eruptions and glacier displacements. They are rich in both spatial and temporal information and can now be produced regularly at a lower cost thanks to space programs such as the European Copernicus program and its Sentinel satellites. Our proposals are based on the extraction of grouped frequent sequential patterns. These patterns, originally defined for the extraction of knowledge from Satellite Image Time Series (SITS), have shown their potential for analyzing DFTS in early work. Nevertheless, they cannot use the confidence indices that come with DFTS, and the swap randomization method used to select the most promising patterns does not take into account their spatiotemporal complementarities, each pattern being evaluated individually. Our contribution is thus twofold. The first proposal associates a measure of reliability with each pattern by using the confidence indices. This measure makes it possible to select patterns whose occurrences in the data are, on average, sufficiently reliable. We propose a corresponding constraint-based extraction algorithm. It relies on an efficient search for the most reliable occurrences by dynamic programming and on a pruning of the search space provided by a partial push strategy. This new method has been implemented on the basis of the existing prototype SITS-P2miner, developed by the LISTIC and LIRIS laboratories to extract and rank grouped frequent sequential patterns. The second contribution concerns the selection of the most promising patterns. Based on an informational criterion, it makes it possible to take into account both the confidence indices and the way the patterns complement each other spatially and temporally.
To this end, the confidence indices are interpreted as probabilities, and the DFTS are seen as probabilistic databases whose distributions are only partial. The informational gain associated with a pattern is then defined according to the ability of its occurrences to complete or refine the distributions characterizing the data. On this basis, a heuristic is proposed to select informative and complementary patterns. This method provides a set of weakly redundant patterns, which is therefore easier to interpret than the set provided by swap randomization. It has been implemented in a dedicated prototype. Both proposals are evaluated quantitatively and qualitatively using a reference DFTS covering Greenland glaciers, constructed from Landsat optical data. Another DFTS, which we built from TerraSAR-X radar data covering the Mont-Blanc massif, is also used. In addition to being constructed from different data and remote sensing techniques, these series differ drastically in terms of confidence indices, the series covering the Mont-Blanc massif having very low levels of confidence. In both cases, the proposed methods operate under standard conditions of resource consumption (time, space), and experts' knowledge of the studied areas is confirmed and complemented.
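As an illustration of the dynamic programming search mentioned in the abstract, here is a minimal Python sketch. It is not the algorithm of the thesis or of SITS-P2miner. It assumes a simplified setting in which each pixel carries a time-ordered series of (symbol, confidence) pairs, an occurrence of a pattern is any subsequence of that series matching the pattern's symbols in order, and the reliability of an occurrence is the mean of its confidence values. The function name `best_occurrence_reliability` and the data layout are hypothetical, and the spatial grouping constraint of grouped frequent sequential patterns is ignored.

```python
from typing import Optional, Sequence, Tuple


def best_occurrence_reliability(
    series: Sequence[Tuple[str, float]],
    pattern: Sequence[str],
) -> Optional[float]:
    """Return the mean confidence of the most reliable occurrence of
    `pattern` in `series`, or None if the pattern does not occur.

    Assumption (not from the source): reliability of an occurrence is
    the mean of the confidence values of the matched time steps.
    """
    m = len(pattern)
    if m == 0:
        return None

    neg_inf = float("-inf")
    # best[j] = highest total confidence of any match of the first j
    # pattern symbols within the part of the series scanned so far.
    best = [0.0] + [neg_inf] * m

    for symbol, confidence in series:
        # Go through j in decreasing order so that one time step is
        # used for at most one pattern position.
        for j in range(m, 0, -1):
            if symbol == pattern[j - 1] and best[j - 1] != neg_inf:
                best[j] = max(best[j], best[j - 1] + confidence)

    if best[m] == neg_inf:
        return None
    return best[m] / m


if __name__ == "__main__":
    # Toy series for one pixel: (displacement symbol, confidence index).
    pixel_series = [("a", 0.2), ("b", 0.9), ("a", 0.8), ("b", 0.5), ("b", 0.7)]
    print(best_occurrence_reliability(pixel_series, ["a", "b"]))
```

The scan costs time proportional to the series length times the pattern length. Under the same assumptions, a pattern could then be kept only when the average of these per-pixel values reaches a user-chosen threshold.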

Jury:
Ms. TUPIN Florence, Professor, Télécom ParisTech, President
Ms. FROMONT Elisa, Professor, Université de Rennes 1, Reviewer
Mr. CREMILLEUX Bruno, Université de Caen Normandie, Reviewer
Mr. IENCO Dino, Research Scientist, IRSTEA, Examiner
Mr. MEGER Nicolas, Associate Professor, Université Savoie Mont Blanc, Thesis advisor
Mr. RIGOTTI Christophe, Associate Professor, INSA Lyon, Co-advisor
Ms. POTHIER Catherine, Associate Professor, INSA Lyon, Co-advisor
Mr. TROUVE Emmanuel, Professor, Université Savoie Mont Blanc, Co-advisor