Abstract
A large body of current linguistic research on sign language is based on analyzing large corpora of video recordings. This requires either manual or automatic annotation of the videos. In this paper we introduce methods for automatically detecting and classifying hand-head occlusions in sign language videos. Linguistically, hand-head occlusions are an important and interesting subject of study as the head is a structural place of articulation in many signs. Our method combines easily calculable local video properties with more global hand tracking. The experiments carried out with videos of the Suvi on-line dictionary of Finnish Sign Language show that the sensitivity of the proposed local method in detecting occlusion events is 92.6%. When global hand tracking is combined in the method, the specificity can reach the level of 93.7% while still maintaining the detection sensitivity above 90%.
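The sensitivity and specificity figures quoted above are the standard detection metrics computed from event counts. As a minimal sketch of how such figures arise (the counts below are illustrative round numbers chosen to reproduce the reported percentages, not data from the paper):

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual occlusion events the detector finds."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-occlusion cases correctly left undetected."""
    return true_neg / (true_neg + false_pos)

# Illustrative counts matching the reported figures:
# local method sensitivity 92.6%, combined method specificity 93.7%.
print(sensitivity(926, 74))   # 0.926
print(specificity(937, 63))   # 0.937
```

The trade-off described in the abstract is the usual one: adding the global hand-tracking cue suppresses false positives (raising specificity) at a small cost in detected events (sensitivity stays above 90%).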
This work has been funded by the following grants of the Academy of Finland: 140245, Content-based video analysis and annotation of Finnish Sign Language (CoBaSiL); 251170, Finnish Centre of Excellence in Computational Inference Research (COIN); 134433, Signs, Syllables, and Sentences (3BatS).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Viitaniemi, V., Karppa, M., Laaksonen, J., Jantunen, T. (2013). Detecting Hand-Head Occlusions in Sign Language Video. In: Kämäräinen, JK., Koskela, M. (eds) Image Analysis. SCIA 2013. Lecture Notes in Computer Science, vol 7944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38886-6_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38885-9
Online ISBN: 978-3-642-38886-6
eBook Packages: Computer Science