The LIRIS human activities dataset - database structure


You must cite the following journal paper in all publications that include results obtained with the LIRIS dataset:

C. Wolf, J. Mille, E. Lombardi, O. Celiktutan, M. Jiu, E. Dogan, G. Eren, M. Baccouche, E. Dellandrea, C.-E. Bichot, C. Garcia, B. Sankur, Evaluation of video activity localizations integrating quality and quantity measurements, In Computer Vision and Image Understanding (127):14-30, 2014.

The activity classes

1. Discussion between two or more people
2. Give an object to another person
3. Put/take an object into/from a box/desk
4. Enter/leave a room (pass through a door) without unlocking
5. Try to enter a room (unsuccessfully)
6. Unlock and enter (or leave) a room
7. Leave baggage unattended
8. Handshaking
9. Typing on a keyboard
10. Telephone conversation

Examples

The following links illustrate some of the dataset's activity classes. For convenience, the videos are encoded in MPEG-4 and can be played directly in a modern browser. Note, however, that their quality is lower than that of the distributed frames.

Set D1 (robot+Kinect): example 1 Kinect grayscale: MPEG-4 Kinect pseudocolor: MPEG-4
Set D1 (robot+Kinect): example 2 Kinect grayscale: MPEG-4 Kinect pseudocolor: MPEG-4
Set D2 (Sony camcorder): example 1 MPEG-4
Set D2 (Sony camcorder): example 2 MPEG-4

Dataset size and structure

D1: Video file numbers of the training+validation set:

0001 0002 0004 0005 0007 0008 0010 0011 0013 0014 0016 0019 0020 0022 0023 0025 0026 0028 0029 0031 0032 0034 0035 0037 0038 0040 0041 0043 0044 0046 0047 0049 0050 0055 0056 0058 0059 0061 0062 0064 0065 0067 0068 0070 0071 0073 0074 0076 0077 0079 0080 0082 0083 0085 0086 0088 0089 0091 0092 0094 0095 0097 0098 0100 0101 0103 0104 0106 0107 0109 0110 0112 0113 0115 0116 0118 0119 0121 0128 0133 0134 0136 0137 0139 0140 0142 0143 0145 0146 0148 0149 0151 0152 0154 0155 0157 0158 0160 0161 0163 0164 0166 0167 0169 0170 0172 0173 0175 0176 0178 0179

D1: Video file numbers of the test set:

0003 0006 0009 0012 0015 0017 0018 0021 0024 0027 0030 0033 0036 0039 0042 0045 0048 0051 0052 0053 0054 0057 0060 0063 0066 0069 0072 0075 0078 0081 0084 0087 0090 0093 0096 0099 0102 0105 0108 0111 0114 0117 0120 0122 0123 0124 0125 0126 0127 0129 0130 0131 0132 0135 0138 0141 0144 0147 0150 0153 0156 0159 0162 0165 0168 0171 0174 0177 0180

D2: Video file numbers of the training+validation set:

0001 0003 0004 0008 0009 0011 0012 0013 0014 0015 0016 0017 0019 0020 0022 0023 0025 0026 0028 0029 0033 0034 0036 0037 0038 0040 0042 0044 0045 0046 0048 0049 0051 0052 0053 0055 0056 0058 0059 0061 0062 0064 0066 0068 0069 0070 0072 0073 0074 0076 0077 0079 0080 0081 0083 0084 0086 0087 0088 0089 0090 0091 0092 0094 0101 0102 0103 0107 0108 0110 0112 0113 0115 0116 0118 0119 0121 0122 0124 0125 0127 0128 0129 0131 0132 0134 0135 0136 0138 0139 0140 0142 0144 0145 0148 0149 0150 0151 0152 0154 0155 0157 0159 0160 0162 0163 0164 0166 0167

D2: Video file numbers of the test set:

0002 0005 0006 0007 0010 0018 0021 0024 0027 0030 0031 0032 0035 0039 0041 0043 0047 0050 0054 0057 0060 0063 0065 0067 0071 0075 0078 0082 0085 0093 0095 0096 0097 0098 0099 0100 0104 0105 0106 0109 0111 0114 0117 0120 0123 0126 0130 0133 0137 0141 0143 0146 0147 0153 0156 0158 0161 0165
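As an illustration of how the lists above might be used, the following minimal Python sketch parses a whitespace-separated list of file numbers into a set and looks up the subset a video belongs to. The names `parse_ids` and `split_of` are hypothetical, and only the first few D1 numbers are copied here for brevity; in practice the full lists would be pasted in.

```python
def parse_ids(text):
    """Turn a whitespace-separated list of file numbers into a set of strings."""
    return set(text.split())


# Excerpts of the D1 lists above (assumption: truncated for brevity).
D1_TRAIN_VAL = parse_ids("0001 0002 0004 0005 0007 0008 0010 0011")
D1_TEST = parse_ids("0003 0006 0009 0012 0015 0017 0018 0021")


def split_of(video_number, train_val, test):
    """Return the subset name for a file number such as 3 or '0003'."""
    key = f"{int(video_number):04d}"  # file numbers are zero-padded to 4 digits
    if key in train_val:
        return "training+validation"
    if key in test:
        return "test"
    return None  # not present in the lists supplied


# The two subsets should never share a video.
assert D1_TRAIN_VAL.isdisjoint(D1_TEST)
print(split_of(3, D1_TRAIN_VAL, D1_TEST))
```

Keeping the numbers as zero-padded strings matches the way they are written in the lists, so they can be compared directly with file names.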

Action histograms per data subset

No.  Class  D1-training-validation  D1-test  D2-training-validation  D2-test
1    DI     40                      17       23                      15
2    GI     15                      9        13                      6
3    BO     48                      23       41                      20
4    EN     83                      37       56                      28
5    ET     16                      9        16                      7
6    LO     14                      9        14                      7
7    UB     14                      8        13                      8
8    HS     25                      20       21                      15
9    KB     33                      17       30                      13
10   TE     17                      7        15                      6
TOTAL       305                     156      242                     125

Total D1: 461 actions
Total D2: 367 actions
Total dataset: 828 actions
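The totals can be cross-checked against the per-class counts of the histogram above. The short Python sketch below copies the counts by hand; the dictionary layout and the name `subset_totals` are assumptions made for illustration only.

```python
# Counts per class, in the order:
# D1-training-validation, D1-test, D2-training-validation, D2-test.
HISTOGRAM = {
    "DI": (40, 17, 23, 15),
    "GI": (15, 9, 13, 6),
    "BO": (48, 23, 41, 20),
    "EN": (83, 37, 56, 28),
    "ET": (16, 9, 16, 7),
    "LO": (14, 9, 14, 7),
    "UB": (14, 8, 13, 8),
    "HS": (25, 20, 21, 15),
    "KB": (33, 17, 30, 13),
    "TE": (17, 7, 15, 6),
}


def subset_totals(histogram):
    """Sum each column of the histogram (one column per data subset)."""
    return tuple(sum(column) for column in zip(*histogram.values()))


d1_train, d1_test, d2_train, d2_test = subset_totals(HISTOGRAM)
total_d1 = d1_train + d1_test
total_d2 = d2_train + d2_test
print(d1_train, d1_test, d2_train, d2_test)     # 305 156 242 125
print(total_d1, total_d2, total_d1 + total_d2)  # 461 367 828
```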

Video file formats

The total database size is about 12.5 GB. The complete database contains the following parts:

Reading the data should be straightforward with standard tools such as MATLAB, or with C++ libraries such as OpenCV or libjpeg. Depth data is stored as 16-bit values in little-endian byte order (LSB, least significant byte first). On a standard Intel PC (or Mac), the data can be read with the standard imread() function. On other machines it may be necessary to convert the data to big-endian format (MSB, most significant byte first). One way to do this is the convert tool of the ImageMagick package with its -endian option.
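To make the byte-order issue concrete, here is a minimal Python sketch that uses only the standard library. It operates on a hypothetical raw byte string rather than on an actual dataset file, and the helper names `decode_words` and `swap_byte_order` are invented for this example.

```python
import struct
import sys


def decode_words(raw, byteorder="little"):
    """Decode a byte string (even length assumed) into unsigned 16-bit integers."""
    prefix = "<" if byteorder == "little" else ">"
    return list(struct.unpack(f"{prefix}{len(raw) // 2}H", raw))


def swap_byte_order(raw):
    """Swap the two bytes of every 16-bit word (even length assumed)."""
    swapped = bytearray(raw)
    swapped[0::2], swapped[1::2] = raw[1::2], raw[0::2]
    return bytes(swapped)


# Hypothetical sample: two 16-bit words stored least significant byte first.
sample = bytes([0x20, 0x00, 0xE0, 0xFF])

print(sys.byteorder)                                 # byte order of this machine
print(decode_words(sample))                          # [32, 65504]
print(decode_words(swap_byte_order(sample), "big"))  # [32, 65504]
```

Stating the byte order explicitly when decoding gives the same values on any machine, which avoids relying on the native order of the host.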

ATTENTION: the 11-bit depth data occupies the 11 most significant bits of the 16-bit words. It is therefore necessary to divide the depth values by 32 to bring them into the standard interval [0, 2047].
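The scaling amounts to dropping the five low bits of each word, since 32 = 2^5. A small Python sketch of that arithmetic follows; the function name `to_11bit` and the sample values are hypothetical, and the sketch assumes the five low bits carry no depth information.

```python
def to_11bit(depth16):
    """Map a stored 16-bit depth word to the 11-bit range [0, 2047]."""
    if not 0 <= depth16 <= 0xFFFF:
        raise ValueError("expected an unsigned 16-bit value")
    return depth16 >> 5  # right shift by 5 bits = integer division by 32


# Hypothetical stored values, already decoded with the correct byte order.
stored = [0, 32, 64, 65504]
print([to_11bit(value) for value in stored])  # [0, 1, 2, 2047]
```

With an array library, the same operation can be applied to a whole depth frame at once by integer-dividing the array by 32.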

Organization, contact

The dataset was collected and produced by members of the LIRIS Laboratory, CNRS, France:
Christian Wolf, Julien Mille, Eric Lombardi, Bülent Sankur (BUSIM, Bogazici University, Turkey), Emmanuel Dellandréa, Christophe Garcia, Charles-Edmond Bichot, Mingyuan Jiu, Oya Celiktutan, Moez Baccouche

Send questions to christian.wolf (at) liris.cnrs.fr
