Thesis of Taygun Kekec


Subject:
Learning Representations from 3D Data and Temporal Data

Defense date: 31/08/2017

Advisor: Christian Wolf

Summary:

The overall goal of this thesis on learning representations of 3D and temporal data is to investigate innovative approaches based on computer vision and machine learning techniques, and also to explore the use of multiple cues, in particular geometric and visual cues.
Hand-designed features such as SIFT and HOG underpin many successful object recognition approaches. However, these capture only low-level edge information, and it has proven difficult to design features that effectively capture mid-level cues (e.g. edge intersections) or high-level representations (e.g. object parts). Recent developments in machine learning, known as deep learning, have shown how hierarchies of features can be learned in an unsupervised manner directly from data.
This thesis will start by studying how to learn features, rather than hand-craft them. It will then study several basic architectures, exploring how they learn features and analysing how they can be stacked into hierarchies that extract multiple layers of representation. The proposed PhD will tackle the problem of creating structured models for weakly structured data, e.g. video, where the application of traditional unstructured learning machines is unsuccessful. In particular, in the targeted situations, the motion present in the data may be vehicle-related motion, motion related to interactions between vehicles (relative to traffic conditions), or irrelevant motion. In these cases, a scene representation based on low-level features alone is not effective, and context needs to be taken into account.
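To make the idea of learning a hierarchy of features rather than hand-crafting them concrete, the following is a minimal sketch, not the thesis's implementation: two autoencoder layers are trained greedily in an unsupervised manner, each learning to reconstruct its input, and their encoders are then stacked to produce a multi-layer representation. The dimensions, learning rate, and toy data standing in for image patches are all assumptions for illustration.

# Minimal sketch of greedy layer-wise unsupervised feature learning
# (illustrative only): each autoencoder layer reconstructs its input,
# then its hidden codes become the input of the next layer.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=200):
    """Train one autoencoder layer on X (n_samples x n_features), tied weights."""
    n_features = X.shape[1]
    W = rng.normal(scale=0.1, size=(n_features, n_hidden))
    b = np.zeros(n_hidden)
    c = np.zeros(n_features)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        R = sigmoid(H @ W.T + c)      # decode with tied weights
        err = R - X                   # reconstruction error
        dR = err * R * (1 - R)        # gradient at decoder pre-activation
        dH = (dR @ W) * H * (1 - H)   # gradient at encoder pre-activation
        gW = X.T @ dH + dR.T @ H      # tied-weight gradient (both uses of W)
        W -= lr * gW / len(X)
        b -= lr * dH.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b

def encode(X, layers):
    """Pass data through the stack of trained encoder layers."""
    H = X
    for W, b in layers:
        H = sigmoid(H @ W + b)
    return H

# Toy data standing in for raw image patches (e.g. 8x8 = 64 pixels).
X = rng.random((500, 64))

# Greedy layer-wise training: layer 2 learns from layer 1's codes.
layers, H = [], X
for n_hidden in (32, 16):
    W, b = train_autoencoder(H, n_hidden)
    layers.append((W, b))
    H = sigmoid(H @ W + b)

features = encode(X, layers)   # 16-dimensional learned representation per patch
print(features.shape)          # (500, 16)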
Given these requirements, integrating different types of features at different levels appears to be a promising direction.
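As a purely illustrative sketch of such multi-cue integration (the descriptor names, dimensions, and random stand-in data are assumptions, not the thesis's pipeline), a simple early-fusion scheme normalizes each cue separately and concatenates the results into one joint representation; learning the fusion at intermediate layers of a deep model is an alternative design choice to this baseline.

# Minimal sketch of feature-level fusion of a visual cue and a geometric cue
# (illustrative only): per-cue L2 normalization, then concatenation.
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(X, eps=1e-8):
    """Scale each row to unit L2 norm so neither cue dominates by magnitude."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

# Stand-ins for real descriptors, e.g. appearance features and
# geometric features (depth statistics, surface normals) per region.
visual_feats = rng.random((200, 128))     # hypothetical visual cue
geometric_feats = rng.random((200, 32))   # hypothetical geometric cue

fused = np.hstack([l2_normalize(visual_feats),
                   l2_normalize(geometric_feats)])
print(fused.shape)   # (200, 160): one joint representation per region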