Abstract
Automatic detection and pose estimation of humans is an important task in Human-Computer Interaction (HCI), user interaction, and event analysis. This paper presents a model-based approach for detecting humans and estimating their pose by fusing depth and RGB color data from a monocular view. The proposed system uses Haar cascade based detection and template matching to track the most reliably detectable parts, namely the head and torso. A stick figure model is used to represent the detected body parts. Fitting is then performed independently for each limb using the weighted distance transform map. Fitting each limb independently speeds up the process and makes it robust, avoiding the combinatorial complexity problems that are common with these types of methods. The output is a stick figure model consistent with the pose of the person in the given input image. The algorithm runs in real time, is fully automatic, and can detect multiple non-intersecting people.
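The independent limb fitting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a binary silhouette mask in place of the fused depth and RGB data, uses an unweighted two-pass city-block distance transform (the abstract does not specify the weighting), and fits a single limb of known length by scanning angles around a fixed joint. The function names `distance_transform` and `fit_limb`, and all parameters, are hypothetical.

```python
import math


def distance_transform(mask):
    """Two-pass city-block distance from each foreground pixel to the
    nearest background pixel. Pixels outside the image are treated as
    background (an assumption of this sketch)."""
    h, w = len(mask), len(mask[0])
    inf = h + w
    dist = [[inf if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                up = dist[y - 1][x] if y > 0 else 0
                left = dist[y][x - 1] if x > 0 else 0
                dist[y][x] = min(dist[y][x], up + 1, left + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if dist[y][x]:
                down = dist[y + 1][x] if y < h - 1 else 0
                right = dist[y][x + 1] if x < w - 1 else 0
                dist[y][x] = min(dist[y][x], down + 1, right + 1)
    return dist


def fit_limb(dist, joint, length, n_angles=72, n_samples=10):
    """Scan candidate angles around `joint` (x, y) and return the angle in
    radians, plus its score, whose segment of the given length collects the
    largest sum of distance-transform values. Each limb is scored on its
    own, so no joint search over limb combinations is needed."""
    h, w = len(dist), len(dist[0])
    jx, jy = joint
    best_angle, best_score = None, -1.0
    for k in range(n_angles):
        angle = 2 * math.pi * k / n_angles
        score = 0.0
        for s in range(1, n_samples + 1):
            t = length * s / n_samples
            x = int(round(jx + t * math.cos(angle)))
            y = int(round(jy + t * math.sin(angle)))
            if 0 <= x < w and 0 <= y < h:
                score += dist[y][x]
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle, best_score


# Toy silhouette: a horizontal bar (rows 4-6, columns 2-18) standing in for an arm.
mask = [[1 if 4 <= y <= 6 and 2 <= x <= 18 else 0 for x in range(21)]
        for y in range(11)]
dist = distance_transform(mask)
angle, score = fit_limb(dist, joint=(2, 5), length=12)
print(angle, score)
```

Because the score for one limb does not depend on any other limb, the cost of fitting grows linearly with the number of limbs rather than with the product of their candidate angles, which is the property the abstract attributes to independent fitting.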
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jain, H.P., Subramanian, A., Das, S., Mittal, A. (2011). Real-Time Upper-Body Human Pose Estimation Using a Depth Camera. In: Gagalowicz, A., Philips, W. (eds) Computer Vision/Computer Graphics Collaboration Techniques. MIRAGE 2011. Lecture Notes in Computer Science, vol 6930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24136-9_20
DOI: https://doi.org/10.1007/978-3-642-24136-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24135-2
Online ISBN: 978-3-642-24136-9
eBook Packages: Computer Science, Computer Science (R0)