Abstract
We focus on the task of hand pose estimation from egocentric viewpoints. For this problem setting, we show that depth sensors are particularly informative for extracting near-field interactions of the camera wearer with their environment. Despite recent advances in full-body pose estimation using Kinect-like sensors, reliable monocular hand pose estimation in RGB-D images is still an unsolved problem. The problem is exacerbated by a wearable sensor and a first-person viewpoint: the occlusions inherent to this camera view and its limited field of view make the problem even harder. We propose to use task- and viewpoint-specific synthetic training exemplars in a discriminative detection framework. We also exploit depth features for sparser and faster detection. We evaluate our approach on a real-world annotated dataset and propose a novel annotation technique for accurate 3D hand labelling even in cases of partial occlusion.
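The abstract does not detail the depth features used for detection; as a hedged illustration only, the sketch below shows the kind of depth-invariant, pair-offset depth feature (popularized by Shotton et al., CVPR 2011) that discriminative depth-based detectors commonly build on. All names and constants here are hypothetical, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a depth-invariant, pair-offset depth feature in the
# spirit of Shotton et al. (CVPR 2011). The paper's exact features are
# not given in this abstract; names and constants are illustrative.

FAR_BACKGROUND = 10.0  # metres; value used for off-image/missing probes

def depth_difference_feature(depth, x, u, v):
    """depth: HxW depth map in metres; x: (row, col) pixel;
    u, v: pixel offsets, scaled by 1/depth(x) for depth invariance."""
    d = depth[x]
    if not np.isfinite(d) or d <= 0:
        return 0.0  # no reliable depth at the feature centre

    def probe(offset):
        r = int(x[0] + offset[0] / d)
        c = int(x[1] + offset[1] / d)
        if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
            val = depth[r, c]
            return val if np.isfinite(val) and val > 0 else FAR_BACKGROUND
        return FAR_BACKGROUND  # off-image probes read as far background

    return probe(u) - probe(v)
```

Because invalid or distant depths short-circuit to a constant, a detector built on such features can also skip candidate windows whose centre depth falls outside the near-field working range of the wearer's hands, which is one way depth can yield the sparser, faster detection mentioned above.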
This research was supported by the European Commission under FP7 Marie Curie IOF grant “Egovision4Health” (PIOF-GA-2012-328288).
Cite this paper
Rogez, G., Khademi, M., Supančič III, J.S., Montiel, J.M.M., Ramanan, D. (2015). 3D Hand Pose Detection in Egocentric RGB-D Images. In: Agapito, L., Bronstein, M., Rother, C. (eds.) Computer Vision - ECCV 2014 Workshops. Lecture Notes in Computer Science, vol. 8925. Springer, Cham. https://doi.org/10.1007/978-3-319-16178-5_25
Print ISBN: 978-3-319-16177-8
Online ISBN: 978-3-319-16178-5
© 2015 Springer International Publishing Switzerland