Abstract
Analysis of co-speech human gestures is important for social conversational interaction through human-robot interfaces. Co-speech gestures require synchronous integration of speech, human posture, and motion. Iconic gestures, a major subclass of co-speech gestures, express entities and actions through attributes such as shape-contours, magnitude, and proximity, using synchronized motions of the fingers and palms together with spoken phrases. The attributes of entities and actions correlate directly with the displayed contours. In this research, we describe an integrated technique that combines motion analysis to derive contours, synchronization of motion with speech to identify the words corresponding to iconic gestures, and conceptual dependency of action words to interpret iconic gestures. The technique models a motion-sketched contour using synchronous colored Petri nets, extended to model composite motions and contour-segment patterns. We present high-level algorithms and a corresponding implementation of the proposed technique and evaluate its performance. Results show approximately 90% recognition accuracy for simple contours, including closed contours.
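To make the contour-modeling idea concrete, the sketch below illustrates one plausible reading of the pipeline's front end: a traced fingertip trajectory is quantized into discrete direction codes, collapsed into contour segments, and tested for closure. This is an illustrative reconstruction, not the authors' implementation; the function names, the 8-direction chain code, and the closure tolerance are assumptions for the example.

```python
import math

def direction_codes(points, n_dirs=8):
    """Quantize successive fingertip displacements into discrete
    direction codes (an 8-direction chain code by default)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ang = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(round(ang / (2 * math.pi / n_dirs))) % n_dirs)
    return codes

def segments(codes):
    """Collapse runs of identical direction codes into
    contour segments of the form (direction, length)."""
    segs = []
    for c in codes:
        if segs and segs[-1][0] == c:
            segs[-1] = (c, segs[-1][1] + 1)
        else:
            segs.append((c, 1))
    return segs

def is_closed(points, tol=0.1):
    """Treat a sketched contour as closed if the trajectory ends
    within tol * bounding-box span of its starting point."""
    (x0, y0), (xn, yn) = points[0], points[-1]
    span = max(max(p[0] for p in points) - min(p[0] for p in points),
               max(p[1] for p in points) - min(p[1] for p in points),
               1e-9)
    return math.hypot(xn - x0, yn - y0) <= tol * span

# Example: a square traced clockwise collapses to four segments
# and registers as a closed contour.
square = [(0, 0), (1, 0), (2, 0), (2, -1), (2, -2),
          (1, -2), (0, -2), (0, -1), (0, 0)]
```

In a full system, the resulting segment sequence would feed the contour-segment pattern matching that the synchronous colored Petri net model performs alongside the speech channel.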
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Singh, A., Bansal, A.K. (2024). An Integrated Analysis for Identifying Iconic Gestures in Human-Robot Interactions. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2023. Lecture Notes in Networks and Systems, vol 825. Springer, Cham. https://doi.org/10.1007/978-3-031-47718-8_18
Print ISBN: 978-3-031-47717-1
Online ISBN: 978-3-031-47718-8
eBook Packages: Intelligent Technologies and Robotics (R0)