Abstract
Human segmentation in still images is a complex task due to the wide range of body poses and drastic changes in environmental conditions. Usually, human body segmentation is treated in a two-stage fashion. First, a human body part detection step is performed, and then, human part detections are used as prior knowledge to be optimized by segmentation strategies. In this paper, we present a two-stage scheme based on Multi-Scale Stacked Sequential Learning (MSSL). We define an extended feature set by stacking a multi-scale decomposition of body part likelihood maps. These likelihood maps are obtained in a first stage by means of a ECOC ensemble of soft body part detectors. In a second stage, contextual relations of part predictions are learnt by a binary classifier, obtaining an accurate body confidence map. The obtained confidence map is fed to a graph cut optimization procedure to obtain the final segmentation. Results show improved segmentation when MSSL is included in the human segmentation pipeline.
Chapter PDF
Similar content being viewed by others
References
Andriluka, M., Roth, S., Schiele, B.: Pictorial structures revisited: People detection and articulated pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 1014–1021. IEEE (2009)
Bautista, M.A., Escalera, S., Baró, X., Radeva, P., Vitriá, J., Pujol, O.: Minimal design of error-correcting output codes. Pattern Recogn. Lett. 33(6), 693–702 (2012)
Bourdev, L., Maji, S., Brox, T., Malik, J.: Detecting people using mutually consistent poselet activations. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 168–181. Springer, Heidelberg (2010)
Chakraborty, B., Bagdanov, A.D., Gonzalez, J., Roca, X.: Human action recognition using an ensemble of body-part detectors. Expert Systems (2011)
Cohen, W.W., de Carvalho, V.R.: Stacked sequential learning. In: Proc. of IJCAI 2005, pp. 671–676 (2005)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, vol. 1, pp. 886–893 (2005)
Dantone, M., Gall, J., Leistner, C., van Gool, L.: Human pose estimation using body parts dependent joint regressors. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3041–3048 (June 2013)
Dietterich, T., Bakiri, G.: Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research 2, 263–286 (1995)
Dietterich, T.G.: Machine learning for sequential data: A review. In: Caelli, T.M., Amin, A., Duin, R.P.W., Kamel, M.S., de Ridder, D. (eds.) SPR 2002 and SSPR 2002. LNCS, vol. 2396, pp. 15–30. Springer, Heidelberg (2002)
Escalera, S., Tax, D., Pujol, O., Radeva, P., Duin, R.: Subclass problem-dependent design of error-correcting output codes. PAMI 30(6), 1–14 (2008)
Escalera, S., Pujol, O., Radeva, P.: On the decoding process in ternary error-correcting output codes. PAMI 32, 120–134 (2010)
Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient matching of pictorial structures. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 66–73. IEEE (2000)
Gatta, C., Puertas, E., Pujol, O.: Multi-scale stacked sequential learning. Pattern Recognition 44(10–11), 2414–2426 (2011)
Gkioxari, G., Arbelaez, P., Bourdev, L.D., Malik, J.: Articulated pose estimation using discriminative armlet classifiers. In: CVPR, pp. 3342–3349. IEEE (2013)
Hernández-Vela, A., Zlateva, N., Marinov, A., Reyes, M., Radeva, P., Dimov, D., Escalera, S.: Graph cuts optimization for multi-limb human segmentation in depth maps. In: CVPR, pp. 726–732 (2012)
Hernández-Vela, A., Reyes, M., Ponce, V., Escalera, S.: Grabcut-based human segmentation in video sequences. Sensors 12(11), 15376–15393 (2012)
Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Poselet conditioned pictorial structures. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 588–595. IEEE (2013)
Puertas, E., Escalera, S., Pujol, O.: Generalized multi-scale stacked sequential learning for multi-class classification. Pattern Analysis and Applications, 1–15 (2013)
Ramanan, D., Forsyth, D., Zisserman, A.: Strike a pose: tracking people by finding stylized poses. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, pp. 271–278 (June 2005)
Ramanan, D., Forsyth, D., Zisserman, A.: Tracking people by learning their appearance. PAMI 29(1), 65–81 (2007)
Rother, C., Kolmogorov, V., Blake, A.: “grabcut”: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23(3), 309–314 (2004)
Sánchez, D., Bautista, M.A., Escalera, S.: Hupba 8k+: Dataset and ecoc-graphcut based segmentation of human limbs. Neurocomputing (2014)
Sánchez, D., Ortega, J.C., Bautista, M.Á., Escalera, S.: Human body segmentation with multi-limb error-correcting output codes detection and graph cuts optimization. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds.) IbPRIA 2013. LNCS, vol. 7887, pp. 50–58. Springer, Heidelberg (2013)
Sapp, B., Jordan, C., Taskar, B.: Adaptive pose priors for pictorial structures. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 422–429. IEEE (2010)
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: CVPR, p. 3 (2011)
Vineet, V., Warrell, J., Ladicky, L., Torr, P.: Human instance segmentation from video using detector-based conditional random fields. In: BMVC (2011)
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR, vol. 1 (2001)
Yang, Y., Ramanan, D.: Articulated pose estimation with flexiblemixtures-of-parts. In: IEEE Conference on Computer Vision and PatternRecognition, pp. 1385–1392. IEEE (2011)
Yu, C.N.J., Joachims, T.: Learning structural svms with latentvariables. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 1169–1176. ACM (2009)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Puertas, E., Bautista, M.A., Sanchez, D., Escalera, S., Pujol, O. (2015). Learning to Segment Humans by Stacking Their Body Parts. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science(), vol 8925. Springer, Cham. https://doi.org/10.1007/978-3-319-16178-5_48
Download citation
DOI: https://doi.org/10.1007/978-3-319-16178-5_48
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16177-8
Online ISBN: 978-3-319-16178-5
eBook Packages: Computer ScienceComputer Science (R0)