Research interests
- Action recognition
- Deep learning
- Neural networks
- Machine learning
- Computer vision
Ph.D. subject
Neural-based Action Recognition in Videos
Multimedia content indexing currently relies on global descriptors built from digital signatures, which are intended to summarize the image content in terms of the distribution of light intensity, color, or texture. These descriptive signatures, used as indexes, consist of low-level measures that are close to the image signal and particularly sensitive to noise. Although these descriptors are useful for comparing multimedia documents, they cannot describe their content semantically, and they are difficult for a user to handle when searching for a specific document. Search engines based on linguistic queries, however, require the detection of high-level indexes closer to the concept of visual objects, such as faces, human bodies, or buildings, to name but a few. They also require a categorization of video segments, that is, an automatic recognition of their content: news, commercials, football, etc.
This PhD aims to semantically categorize video segments obtained from the automatic detection of shots and from a macro-segmentation based on inter-program detection. First, we will focus on developing new techniques for modeling and localizing objects of interest based only on their visual appearance, without a priori modeling or heuristic filtering, by learning automatically from samples extracted directly from images. This work will build on previous activities carried out at France Telecom R&D, based on neural models. We will focus on the detection and recognition of deformable objects through a joint consideration of texture and movement in a video. An example application is detecting and tracking moving objects such as faces in TV news or players in sports videos. Then, we will focus on the automatic recognition of the theme of a video segment. To do this, we will follow up on previous research work aimed at categorizing collections of still images, and will extend it to video. In this case, each video frame will be processed globally to produce a signature, including color, texture, and movement measures, that summarizes its content. Robust statistical and neural learning techniques will be implemented to categorize the content according to an example database of the given concepts.
Biography
| Ph.D. candidate | since 2009 | Orange Labs / LIRIS INSA de Lyon |
| MS degree in Computer Vision | 2008-2009 | Télécom ParisTech |
| Telecommunication Engineer | 2003-2008 | Higher School of Communication, Tunisia |
For more information, please refer to my resume (in French, last updated in Nov. 2011).
PhD Advisors
| Franck Mamalet | Orange Labs - MAS team |
| Christian Wolf | LIRIS - Imagine team |
| Christophe Garcia | LIRIS - Imagine team |
| Atilla Baskurt | LIRIS - Imagine team |
Publications
Presentations
- Presentation at HBU'11 (November 16th 2011, Amsterdam, Netherlands).
- Presentation at ICANN'10 (September 18th 2010, Thessaloniki, Greece).
Useful Links
The Deep Learning Homepage
The Computer Vision Homepage
Schmidhuber's page on RNNs
LeCun's page
Ph.D. dissertation/research advice
Ph.D. Comics
Miscellaneous
- Member of the organizing committee of ICPR - HARL 2012: human activities recognition and localization competition
- Some radio stations I often listen to: WBGO 88.3 FM (Jazz, New York), WWOZ 90.7 FM (Blues, New Orleans), KBLX 102.9 FM (Talk/AC Music, San Francisco), Capital 95.8 FM (Top-40/Pop, London), Real Radio Northwest 105.4 FM (Top-40/Pop, Manchester), The Apple 97.0 AM (Talk/News, New York), LBC 97.3 FM (Talk/News, London).

