Thesis of Yuyao Zhang


Subject:
Non-linear Dimensionality Reduction and Sparse Representation Models for Facial Analysis

Defense date: 20/02/2014

Advisor: Jean-Michel Jolion
Coadvisor: Khalid Idrissi

Summary:

Face analysis techniques commonly require a proper representation of images by means of dimensionality reduction leading to embedded manifolds, which aims at capturing relevant characteristics of the signals.
In this thesis, we first provide a comprehensive survey on the state of the art of embedded manifold models.
Then, we introduce a novel non-linear embedding method, the Kernel Similarity Principle Component Analysis (KS-PCA), into Active Appearance Models, in order to model face appearances under variable illumination. The proposed algorithm successfully outperforms the traditional linear PCA transform to capture the salient features generated by different illuminations, and reconstruct the illuminated faces with high accuracy.
We also consider the problem of automatically classifying human face poses from face views with varying illumination, as well as occlusion and noise. Based on the sparse representation methods, we propose two dictionary-learning frameworks for this pose classification problem.
The first framework is the Adaptive Sparse Representation pose Classification (ASRC). It trains the dictionary via a linear model called Incremental Principle Component Analysis (Incremental PCA), tending to decrease the intra-class redundancy which may affect the classification performance, while keeping the extra-class redundancy which is critical for sparse representation.
The other proposed work is the Dictionary-Learning Sparse Representation model (DLSR) that learns the dictionary with the aim of coinciding with the classification criterion. This training goal is achieved by the K-SVD algorithm.
In a series of experiments, we show the performance of the two dictionary-learning methods which are respectively based on a linear transform and a sparse representation model.
Besides, we propose a novel Dictionary Learning framework for Illumination Normalization (DLIN). DLIN based on sparse representation in terms of coupled dictionaries. The dictionary pairs are jointly optimized from normally illuminated and irregularly illuminated face image pairs. We further utilize a Gaussian Mixture Model (GMM) to enhance the framework's capability of modeling data under complex distribution. The GMM adapt each model to a part of the samples and then fuse them together. Experimental results demonstrate the effectiveness of the sparsity as a prior for patch-based illumination normalization for face images