Abstract
Dynamic Time Warping (DTW) is commonly used in gesture recognition to handle the variability in the temporal length of gestures. In the DTW framework, a set of gesture patterns is compared one by one to a possibly infinite test sequence, and a query gesture category is recognized if a warping cost below a certain threshold is found within the test sequence. However, taking either a single sample per gesture category or a set of isolated samples may fail to encode the variability of that category. In this paper, a probability-based DTW for gesture recognition is proposed. Different samples of the same gesture pattern, obtained from RGB-Depth data, are used to build a Gaussian-based probabilistic model of the gesture. The DTW cost is then adapted to the new model. The proposed approach is tested in a challenging scenario, where the probability-based DTW performs better than state-of-the-art approaches for gesture recognition on RGB-D data.
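The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation from the chapter, and it rests on several assumptions: the training samples are already aligned to a common length, each model frame is a single Gaussian with diagonal covariance, the frame cost is half the squared Mahalanobis distance, and the function names (`fit_frame_gaussians`, `frame_cost`, `subsequence_dtw`, `detect`) are hypothetical.

```python
import numpy as np


def fit_frame_gaussians(samples, eps=1e-6):
    """Fit one diagonal Gaussian per model frame.

    samples: array of shape (n_samples, T, D). Assumption: all samples
    have already been aligned to a common length T.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)        # (T, D)
    var = samples.var(axis=0) + eps    # (T, D), eps avoids division by zero
    return mean, var


def frame_cost(mean, var, seq):
    """Cost between every model frame and every test frame.

    Assumption: half the squared Mahalanobis distance under a diagonal
    covariance, used here as a stand-in for a probability-based cost.
    Returns an array of shape (T, N).
    """
    seq = np.asarray(seq, dtype=float)              # (N, D)
    diff = seq[None, :, :] - mean[:, None, :]       # (T, N, D)
    return 0.5 * np.sum(diff ** 2 / var[:, None, :], axis=2)


def subsequence_dtw(cost):
    """Open-begin DTW: the gesture may start at any test frame.

    Returns the accumulated warping cost of the best alignment that
    ends at each test frame (length N).
    """
    T, N = cost.shape
    acc = np.full((T + 1, N + 1), np.inf)
    acc[0, :] = 0.0  # free starting point anywhere in the test sequence
    for i in range(1, T + 1):
        for j in range(1, N + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    return acc[T, 1:]


def detect(mean, var, seq, threshold):
    """Return the test-frame indices where the warping cost falls below threshold."""
    end_costs = subsequence_dtw(frame_cost(mean, var, seq))
    return np.flatnonzero(end_costs < threshold), end_costs
```

A gesture is reported at every test frame whose accumulated cost falls below the threshold, mirroring the detection rule described in the abstract. Replacing a plain Euclidean frame distance with a variance-normalized one is what lets several training samples, rather than one, shape the cost.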
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bautista, M.Á. et al. (2013). Probability-Based Dynamic Time Warping for Gesture Recognition on RGB-D Data. In: Jiang, X., Bellon, O.R.P., Goldgof, D., Oishi, T. (eds) Advances in Depth Image Analysis and Applications. WDIA 2012. Lecture Notes in Computer Science, vol 7854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40303-3_14
DOI: https://doi.org/10.1007/978-3-642-40303-3_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40302-6
Online ISBN: 978-3-642-40303-3
eBook Packages: Computer Science, Computer Science (R0)