Abstract
A large body of current linguistic research on sign language is based on analyzing large corpora of video recordings. This requires either manual or automatic annotation of the videos. In this paper we introduce methods for automatically detecting and classifying hand-head occlusions in sign language videos. Linguistically, hand-head occlusions are an important and interesting subject of study as the head is a structural place of articulation in many signs. Our method combines easily calculable local video properties with more global hand tracking. The experiments carried out with videos of the Suvi on-line dictionary of Finnish Sign Language show that the sensitivity of the proposed local method in detecting occlusion events is 92.6%. When global hand tracking is combined in the method, the specificity can reach the level of 93.7% while still maintaining the detection sensitivity above 90%.
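The sensitivity and specificity figures quoted above are the standard detection metrics computed from event counts. As a minimal sketch of how such figures arise (the counts below are illustrative round numbers chosen to reproduce the reported percentages, not data from the paper):

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual occlusion events the detector finds."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-occlusion cases correctly left undetected."""
    return true_neg / (true_neg + false_pos)

# Illustrative counts matching the reported figures:
# local method sensitivity 92.6%, combined method specificity 93.7%.
print(sensitivity(926, 74))   # 0.926
print(specificity(937, 63))   # 0.937
```

The trade-off described in the abstract is the usual one: adding the global hand-tracking cue suppresses false positives (raising specificity) at a small cost in detected events (sensitivity stays above 90%).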
This work has been funded by the following grants of the Academy of Finland: 140245, Content-based video analysis and annotation of Finnish Sign Language (CoBaSiL); 251170, Finnish Centre of Excellence in Computational Inference Research (COIN); 134433, Signs, Syllables, and Sentences (3BatS).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Viitaniemi, V., Karppa, M., Laaksonen, J., Jantunen, T. (2013). Detecting Hand-Head Occlusions in Sign Language Video. In: Kämäräinen, JK., Koskela, M. (eds) Image Analysis. SCIA 2013. Lecture Notes in Computer Science, vol 7944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38886-6_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38885-9
Online ISBN: 978-3-642-38886-6
eBook Packages: Computer Science