Welcome to the VIVA Hand Gesture Challenge!


This is a hand gesture dataset which was designed in order to study natural human activity under difficult settings of cluttered background, volatile illumination, and frequent occlusion. The dataset was captured using a Kinect device under real-world driving settings.
There are 19 hand gesture classes performed by 8 subjects. Results are reported as classification accuracy averaged over an 8-fold cross-subject cross-validation (in each fold, test on one subject and train on the remaining seven). For evaluation, please submit a txt file with one line per video in the following format:
[video_name label_prediction]
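The cross-subject protocol and the submission format can be sketched as follows. This is a minimal illustration, not the organizers' evaluation code: the function names and video names are hypothetical, and it assumes each submission line is a video name and an integer label separated by a single space, without literal brackets. Check the sample submission file to confirm the exact format.

```python
from statistics import mean


def leave_one_subject_out(subject_ids):
    """Yield (test_subject, train_subjects) pairs, one fold per subject."""
    subjects = sorted(set(subject_ids))
    for test_subject in subjects:
        train_subjects = [s for s in subjects if s != test_subject]
        yield test_subject, train_subjects


def fold_accuracy(predictions, ground_truth):
    """Fraction of videos in one fold whose predicted label is correct.

    Both arguments map video name -> label; a missing prediction
    counts as an error.
    """
    correct = sum(
        1 for video, label in ground_truth.items()
        if predictions.get(video) == label
    )
    return correct / len(ground_truth)


def cross_subject_accuracy(per_fold_accuracies):
    """Average the per-fold accuracies into the single reported number."""
    return mean(per_fold_accuracies)


def format_submission(predictions):
    """Render predictions as 'video_name label_prediction' lines.

    Assumption: space-separated fields, one video per line.
    """
    lines = [f"{name} {label}" for name, label in sorted(predictions.items())]
    return "\n".join(lines) + "\n"
```

With 8 subjects, `leave_one_subject_out(range(1, 9))` yields 8 folds, each training on 7 subjects; the string returned by `format_submission` can be written to a `.txt` file.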
A sample submission file can be found in the zip file below. For speed, we compute an average per-frame run-time for each video: the time it took to process the video divided by the number of frames in the video. The resulting number is then averaged over all of the videos.
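The run-time measure can be sketched as below. This is an illustrative example under stated assumptions, not the organizers' timing script: `time_video`, `process_fn`, and the `(seconds, num_frames)` tuple layout are hypothetical names chosen for the example.

```python
import time


def time_video(process_fn, frames):
    """Run process_fn on one video and return (elapsed_seconds, num_frames).

    process_fn is a hypothetical callable that takes the list of frames.
    """
    start = time.perf_counter()
    process_fn(frames)
    elapsed = time.perf_counter() - start
    return elapsed, len(frames)


def mean_seconds_per_frame(timings):
    """Average per-frame run-time over all videos.

    timings: iterable of (processing_seconds, num_frames), one per video.
    Each video's time is first divided by its own frame count, and the
    per-video values are then averaged, so every video counts equally
    regardless of its length.
    """
    per_video = [seconds / num_frames for seconds, num_frames in timings]
    return sum(per_video) / len(per_video)
```

For example, `mean_seconds_per_frame([(2.0, 4), (1.0, 4)])` averages 0.5 and 0.25 seconds per frame. Taking the reciprocal would give a frames-per-second figure, though whether the leaderboard values were derived exactly that way is an assumption.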

Download the dataset (~2GB)

Results
Rank | Method | Modality | Accuracy | Runtime | Environment
1 | CNN: LRN+HRN | RGBD | 77.5% | 110fps | 8GB RAM, Intel Core i5-3210m @ 2.5 GHz – 2 cores
2 | CNN: LRN | RGBD | 74.0% | 400fps | 8GB RAM, Intel Core i5-3210m @ 2.5 GHz – 2 cores
3 | HOG+HOG² | RGBD | 64.5% | 50fps | 8GB RAM, Intel Core i7 950 @ 3.07 GHz – 4 cores
4 | HON4D | D | 58.7% | 25fps | 8GB RAM, Intel Core i7 950 @ 3.07 GHz – 4 cores
5 | Dense Trajectories | RGBD | 54% | 18fps | 8GB RAM, Intel Core i7 950 @ 3.07 GHz – 4 cores
6 | HOG3D | RGBD | 44.6% | 3fps | 8GB RAM, Intel Core i7 950 @ 3.07 GHz – 4 cores
7 | Harris-3.5D | RGBD | 36.4% | 0.2fps | 8GB RAM, Intel Core i7 950 @ 3.07 GHz – 4 cores
Rank | Method Info
1 | Molchanov, P., Gupta, S., Kim, K. & Kautz, J. Hand Gesture Recognition With 3D Convolutional Neural Networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015.
2 | Molchanov, P., Gupta, S., Kim, K. & Kautz, J. Hand Gesture Recognition With 3D Convolutional Neural Networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2015.
3 | Ohn-Bar, E. & Trivedi, M.M. Hand Gesture Recognition in Real Time for Automotive Interfaces: A Multimodal Vision-Based Approach and Evaluations. IEEE Transactions on Intelligent Transportation Systems, 15(6):2368-2377, 2014. (code)
4 | Oreifej, O. & Liu, Z. HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 716-723, 2013. (code)
5 | Wang, H., Kläser, A., Schmid, C. & Liu, C.L. Dense Trajectories and Motion Boundary Descriptors for Action Recognition. International Journal of Computer Vision, 103(1):60-79, Springer, 2013. (code)
6 | Kläser, A., Marszałek, M. & Schmid, C. A Spatio-Temporal Descriptor Based on 3D-Gradients. In BMVC 2008 - 19th British Machine Vision Conference, pages 275-1, 2008. (code)
7 | Hadfield, S. & Bowden, R. Hollywood 3D: Recognizing Actions in 3D Natural Scenes. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3398-3405, 2013. (code)