Abstract
We address the problem of estimating human body pose from a single image with a cluttered background. We train multiple local linear regressors that estimate the 3D pose from a feature vector of gradient orientation histograms. Each linear regressor is trained on a pose cluster, a subset of the training samples with similar poses, which lets it select the components of the feature vector that are relevant to that pose. To discriminate between pose clusters, we use kernel Support Vector Machines (SVMs) with pose-dependent feature selection. We achieve feature selection for kernel SVMs by estimating the scale parameters of the RBF kernel: we minimize the radius/margin bound, an upper bound on the expected generalization error, using efficient gradient descent. The same SVMs can also be used for human detection. Quantitative experiments show the effectiveness of pose-dependent feature selection for both human detection and pose estimation.
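The two ideas in the abstract, per-feature kernel scales and cluster-specific linear regressors, can be illustrated with a minimal sketch. It is not the authors' implementation. The names `scaled_rbf`, `make_cluster`, and `predict_pose` are hypothetical, the scales and regressor weights are set by hand rather than learned by minimizing the radius/margin bound, and a kernel similarity to a single prototype stands in for a trained SVM decision value.

```python
import math


def scaled_rbf(x, z, scales):
    """RBF kernel with one non-negative scale per feature dimension.

    A scale of 0 removes that dimension from the kernel, which is how
    estimating the scales can act as feature selection.
    """
    sq_dist = sum(s * (a - b) ** 2 for a, b, s in zip(x, z, scales))
    return math.exp(-sq_dist)


def make_cluster(prototype, scales, weights, bias):
    """Bundle a cluster score with its local linear regressor.

    Assumption: the score is the kernel similarity to one prototype,
    a stand-in for the decision value of a trained kernel SVM.
    """
    return {
        "score": lambda x: scaled_rbf(x, prototype, scales),
        "W": weights,  # one row per output pose dimension
        "b": bias,     # one bias per output pose dimension
    }


def predict_pose(x, clusters):
    """Select the best-scoring pose cluster and apply its linear regressor."""
    best = max(clusters, key=lambda c: c["score"](x))
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(best["W"], best["b"])
    ]


# Toy data: feature 0 is informative, feature 1 plays the role of clutter.
clusters = [
    make_cluster([0.0, 0.0], [1.0, 0.0], [[1.0, 0.0]], [0.0]),
    make_cluster([2.0, 0.0], [1.0, 0.0], [[0.0, 0.0]], [10.0]),
]

# The clutter value 100.0 does not affect which cluster is chosen.
pose = predict_pose([1.9, 100.0], clusters)
```

With the scale of the second feature set to zero, two inputs that differ only in that feature have a kernel value of exactly 1, so clutter in that dimension cannot change the cluster assignment. In the paper the scales are estimated from data; here they are fixed only to make the effect visible.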
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okada, R., Soatto, S. (2008). Relevant Feature Selection for Human Pose Estimation and Localization in Cluttered Images. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88688-4_32
DOI: https://doi.org/10.1007/978-3-540-88688-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88685-3
Online ISBN: 978-3-540-88688-4
eBook Packages: Computer Science (R0)