Thesis of Bonan Cuan


Subject:
Deep Similarity Metric Learning for Multiple Object Tracking

Defense date: 12/09/2019

Advisor: Christophe Garcia
Co-advisor: Khalid Idrissi

Summary:

Multiple object tracking, i.e. simultaneously tracking multiple objects in a scene, is an
important but challenging visual task. Objects need to be accurately detected against the
background and distinguished from each other to avoid erroneous trajectories. Since remarkable
progress has been made in the field of object detection, “tracking-by-detection” approaches are
widely adopted in multiple object tracking research. Objects in all frames are detected in
advance, and tracking reduces to an association problem: linking detections of the same
object across frames into trajectories.
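
The association step described above can be illustrated with a minimal sketch. It assumes a precomputed similarity matrix between existing tracks and the detections of the current frame, and it uses a simple greedy matching with a threshold. The function name, the threshold value, and the greedy strategy are illustrative assumptions, not the method developed in the thesis.

```python
def greedy_associate(similarity, threshold=0.5):
    """Greedily match tracks (rows) to detections (columns).

    similarity[i][j] is a hypothetical score in [0, 1]; a higher value means
    track i and detection j are more likely to show the same object.
    Returns a dict {track_index: detection_index}.
    """
    # Collect every candidate pair whose score passes the threshold.
    candidates = [
        (score, i, j)
        for i, row in enumerate(similarity)
        for j, score in enumerate(row)
        if score >= threshold
    ]
    # Visit the most similar pairs first.
    candidates.sort(key=lambda c: c[0], reverse=True)

    matches, used_detections = {}, set()
    for _score, i, j in candidates:
        # Each track and each detection may be used at most once.
        if i in matches or j in used_detections:
            continue
        matches[i] = j
        used_detections.add(j)
    return matches


# Three tracks, two detections: the third track finds no detection.
example = [[0.9, 0.2], [0.3, 0.8], [0.1, 0.4]]
matches = greedy_associate(example)
```

A track left without a match, such as the third one here, would typically be treated as temporarily lost, and an unmatched detection as a candidate new track.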
Most tracking algorithms employ both motion and appearance models for data association.
For multiple object tracking problems in which many objects of the same category are present, a
fine-grained, discriminative appearance model is indispensable. In this thesis,
we propose an appearance-based re-identification model using deep similarity metric learning
to deal with multiple object tracking in single-camera videos. Two main contributions are
reported in this dissertation:
First, we reviewed the state of the art in object re-identification, which provides the
intra-category recognition required in multiple object tracking. Specifically, we thoroughly
investigated the application of pairwise metric learning in this field. A deep Siamese network is
employed to learn an end-to-end mapping from input images to a discriminative embedding
space. Different metric learning configurations using various metrics, loss functions,
deep network structures, etc., are tested and compared in order to determine the
best re-identification model for tracking. With an intuitive and simple classification design,
the proposed model achieves satisfactory re-identification results, which are comparable to
state-of-the-art approaches using triplet loss when evaluated on benchmarks such as CUHK03.
Our approach is easy and fast to train, and the learned embedding can be readily transferred
to the domain of tracking tasks.
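
As a rough illustration of pairwise metric learning, the sketch below computes a contrastive loss on a pair of embedding vectors. The contrastive loss is one common pairwise loss and is used here only as an assumed example. The summary does not state which loss the thesis finally selects, and the function names and margin value are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Pairwise contrastive loss on two embeddings (illustrative only).

    Pairs of the same identity are pulled together (loss grows with distance).
    Pairs of different identities are pushed apart until they are at least
    `margin` away from each other, after which the loss is zero.
    """
    d = euclidean(emb_a, emb_b)
    if same_identity:
        return d ** 2
    return max(0.0, margin - d) ** 2


# In a Siamese setup, both embeddings would come from the same network
# applied to two input images; here they are hard-coded toy vectors.
loss_same = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=True)
loss_diff = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=False)
```

In this toy case the two embeddings are far apart, so the pair is penalized only if it is labeled as the same identity.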
Second, we integrated our proposed re-identification model into data association as appearance
guidance for multiple object tracking. For each object to be tracked in a video, we establish
an identity-related appearance model based on the learned embedding for re-identification.
Similarities among detected object instances are exploited for identity classification, which
determines the tracking result along with motion models. In addition, we investigated
the collaboration and interference between appearance and motion models. In contrast to
most existing tracking algorithms, which combine the two kinds of models via a simple sum of
their scores, we propose an online model coupling to further improve tracking performance.
When one model fails on ambiguous tracks, the other takes over the data association.
Experiments on the Multiple Object Tracking Challenge benchmark demonstrate the effectiveness
of our modifications, with state-of-the-art tracking accuracy.
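
The idea of one model taking over when the other is ambiguous can be sketched as follows. The ambiguity criterion (a small gap between the two best scores), the `min_gap` value, and the fallback to a plain sum are assumptions made for illustration. The summary does not specify how the thesis detects a failing model.

```python
def is_ambiguous(scores, min_gap=0.2):
    """Hypothetical criterion: the two best scores are too close to call."""
    if len(scores) < 2:
        return False
    best, second = sorted(scores, reverse=True)[:2]
    return best - second < min_gap


def choose_track(appearance_scores, motion_scores, min_gap=0.2):
    """Return the index of the candidate track chosen for one detection.

    Both lists hold one score per candidate track (higher is better).
    If exactly one model is ambiguous, the other model decides alone.
    Otherwise the scores are summed, as in the simple baseline.
    """
    appearance_unsure = is_ambiguous(appearance_scores, min_gap)
    motion_unsure = is_ambiguous(motion_scores, min_gap)

    if appearance_unsure and not motion_unsure:
        scores = motion_scores
    elif motion_unsure and not appearance_unsure:
        scores = appearance_scores
    else:
        scores = [a + m for a, m in zip(appearance_scores, motion_scores)]

    return max(range(len(scores)), key=scores.__getitem__)


# Two look-alike candidates: appearance cannot separate them, motion can.
chosen = choose_track([0.8, 0.75], [0.2, 0.9])
```

Here the appearance scores of the two candidates are nearly equal, so the decision is delegated to the motion scores and the second candidate is chosen.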

Jury:
Ms Alice CAPLIER, Professor, Université de Grenoble, Reviewer
Mr Thierry CHATEAU, Professor, Université de Clermont-Auvergne, Reviewer
Mr Michel PAINDAVOINE, Professor, Université de Bourgogne, Examiner
Mr Christophe GARCIA, Professor, INSA, Thesis advisor
Mr Khalid IDRISSI, Associate Professor, INSA, Thesis co-advisor