Abstract
We present a method to identify human-object interactions involved in complex, fine-grained activities. Our approach exploits recent improvements in range sensor technology and body tracking to detect and classify important events in a depth video. By combining global motion information with local video analysis, our method recognizes the time instants at which a person picks up or puts down an object. We introduce three novel datasets for evaluation and perform extensive experiments with promising results.
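As a rough illustration of what "recognizing the time instants" of a pick-up or put-down event can involve, the sketch below proposes candidate event frames from tracked 3D hand positions (e.g. from a Kinect body tracker). This is not the authors' method: it is a minimal, generic motion-based heuristic, flagging frames where the hand transitions from fast motion to a near-pause, as happens at the end of a reach. The thresholds `fast` and `slow` are illustrative assumptions.

```python
# Illustrative sketch only -- NOT the method described in the paper.
# Proposes candidate "pick up / put down" instants from a tracked hand
# trajectory by flagging fast-to-slow motion transitions (a pause after
# a reach). Thresholds are arbitrary, in meters per frame.

def hand_speeds(positions):
    """Per-frame speed (Euclidean distance between consecutive 3D points)."""
    speeds = []
    for (x0, y0, z0), (x1, y1, z1) in zip(positions, positions[1:]):
        speeds.append(((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2) ** 0.5)
    return speeds

def candidate_instants(positions, fast=0.05, slow=0.01):
    """Frames where motion goes fast -> slow: candidate interaction instants."""
    speeds = hand_speeds(positions)
    events = []
    for t in range(1, len(speeds)):
        if speeds[t - 1] >= fast and speeds[t] <= slow:
            events.append(t + 1)  # speeds[t] spans frames t..t+1
    return events

# Toy trajectory: the hand reaches toward a point, pauses, then withdraws.
traj = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
        (0.3, 0.0, 0.0), (0.305, 0.0, 0.0), (0.31, 0.0, 0.0),
        (0.2, 0.0, 0.0)]
print(candidate_instants(traj))  # -> [4]: the pause after the reach
```

In a real system such candidates would then be classified (pick-up vs. put-down vs. neither) using local appearance around the hand, which is closer in spirit to the paper's combination of global motion cues with local video analysis.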
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ubalde, S., Liu, Z., Mejail, M. (2014). Detecting Subtle Human-Object Interactions Using Kinect. In: Bayro-Corrochano, E., Hancock, E. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2014. Lecture Notes in Computer Science, vol 8827. Springer, Cham. https://doi.org/10.1007/978-3-319-12568-8_93
DOI: https://doi.org/10.1007/978-3-319-12568-8_93
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12567-1
Online ISBN: 978-3-319-12568-8
eBook Packages: Computer Science (R0)