Hollywood 3D

Leaderboard

The performance of various techniques reported on the Hollywood 3D dataset is listed below. Note that these results have not been independently verified. To have your technique added to the leaderboard, email S.Hadfield{at}surrey.ac.uk with the name of the technique, a reference to the publication and, if possible, a link to the PDF.

		Average Precision (mean and per class)
Algorithm	Correct Classification Rate	Mean	NoAction	Run	Punch	Kick	Shoot	Eat	Drive	UsePhone	Kiss	Hug	StandUp	SitDown	Swim	Dance
HOS' [5]	-	36.9	21.2	63.1	54.2	19.9	31.0	24.2	60.8	22.3	31.3	32.4	50.0	18.1	43.0	44.9
Multi-view neural networks [7]	35.71	30.79	-	-	-	-	-	-	-	-	-	-	-	-	-	-
Disp-Pyr{1,3} [4]	36.09	30.52	11.83	47.89	27.71	22.93	49.38	7.48	59.84	14.75	41.42	17.09	50.02	10.03	29.44	37.54
Enriched-IPs [6]	32.8	30.1	11.8	49.5	28.0	20.5	37.4	8.8	61.7	14.9	46.3	14.2	52.8	10.7	23.1	41.8
MVRELM [3]	33.44	29.86	-	-	-	-	-	-	-	-	-	-	-	-	-	-
Disparity-IPs [6]	35.7	28.7	11.6	53.2	34.4	17.4	36.3	7.3	63.5	14.4	34.9	16.6	39.8	9.8	31.3	30.9
SAE-MD(Av) [2]	30.13	26.11	12.77	50.44	38.01	7.94	35.51	7.03	59.62	23.92	16.40	7.02	34.23	6.95	29.48	36.26
HoG/HoF/HoDG + 3.5D-Harris [1]	21.8	14.1	13.7	27.0	5.7	4.8	16.6	5.6	69.6	7.6	10.2	12.1	9.0	5.6	7.5	7.5
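
The Mean column can be related to the per-class columns with a short calculation. The sketch below assumes that Mean is the unweighted average of the 14 per-class Average Precision values, NoAction included. The HOS' [5] row is consistent with that assumption, but other rows may differ slightly through rounding or a different averaging scheme. The names hos_ap and mean_ap are illustrative and are not part of any released code.

```python
# Per-class Average Precision values copied from the HOS' [5] row above.
hos_ap = {
    "NoAction": 21.2, "Run": 63.1, "Punch": 54.2, "Kick": 19.9,
    "Shoot": 31.0, "Eat": 24.2, "Drive": 60.8, "UsePhone": 22.3,
    "Kiss": 31.3, "Hug": 32.4, "StandUp": 50.0, "SitDown": 18.1,
    "Swim": 43.0, "Dance": 44.9,
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class Average Precision values (assumed definition)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Rounded to one decimal place, as in the table.
print(round(mean_ap(hos_ap), 1))  # prints 36.9
```

Rows that report only the Correct Classification Rate and Mean (marked with "-" in the per-class columns) cannot be checked this way.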

[1] Hadfield, S. and Bowden, R. Hollywood 3D: Recognizing Actions in 3D Natural Scenes. In Proceedings, Conference on Computer Vision and Pattern Recognition (CVPR), pg. 3398-3405, 2013.

[2] Konda, K. and Memisevic, R. Learning to combine depth and motion. Indian Conference on Computer Vision, Graphics and Image Processing, 2014.

[3] Iosifidis, A. and Tefas, A. and Pitas, I. Multi-view Regularized Extreme Learning Machine for Human Action Recognition. In Artificial Intelligence: Methods and Applications volume 8554, pg. 84-94, Springer International Publishing, 2014.

[4] Iosifidis, A. and Tefas, A. and Nikolaidis, N. and Pitas, I. Human action recognition in stereoscopic videos based on bag of features and disparity pyramids.

[5] Hadfield, S. and Lebeda, K. and Bowden, R. Natural action recognition using invariant 3D motion encoding. In Proceedings, European Conference on Computer Vision (ECCV), Springer Lecture Notes in Computer Science volume 8690, pg. 758-771, 2014. (Code below)

[6] Mademlis, I. and Iosifidis, A. and Tefas, A. and Nikolaidis, N. and Pitas, I. Stereoscopic Video Description for Human Action Recognition. In the IEEE Symposium Series on Computational Intelligence (SSCI), 2014.

[7] Iosifidis, A. and Tefas, A. and Pitas, I. Human action recognition based on bag of features and multi-view neural networks. In Proceedings International Conference on Image Processing (ICIP), pg. 1510-1514, 2014.

Data and Code

To access the data and code, please enter your details below.

Calibrations

The stereo calibrations for each sequence in the dataset (as described in this paper) are available here.