Abstract
Motivated by the fundamental role that rhythms apparently play in speech and gestural communication among humans, this study was undertaken to substantiate a biologically motivated model for synchronizing speech and gesture input in human-computer interaction. Our approach presents a novel method that conceptualizes a multimodal user interface on the basis of timed agent systems. We use multiple agents to poll presemantic information from different sensory channels (speech and hand gestures) and to integrate it into multimodal data structures that can be processed by an application system, itself based on agents. This article motivates and presents technical work that exploits rhythmic patterns in the development of biologically and cognitively motivated mediator systems between humans and machines.
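The integration scheme the abstract describes — sensor agents polling timestamped percepts from separate channels, grouped into multimodal structures by a shared rhythmic timebase — can be illustrated with a minimal sketch. All names (`SensorAgent`, `RhythmicIntegrator`), the 0.5 s beat interval, and the example percepts are illustrative assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class PerceptItem:
    channel: str      # "speech" or "gesture"
    content: str      # presemantic token, e.g. a word or a pointing stroke
    timestamp: float  # seconds since session start

class SensorAgent:
    """Polls one sensory channel and buffers timestamped percepts."""
    def __init__(self, channel):
        self.channel = channel
        self.buffer = []

    def poll(self, content, timestamp):
        self.buffer.append(PerceptItem(self.channel, content, timestamp))

class RhythmicIntegrator:
    """Groups percepts from all channels into multimodal frames,
    one frame per rhythmic window of `beat` seconds."""
    def __init__(self, agents, beat=0.5):
        self.agents = agents
        self.beat = beat

    def integrate(self):
        frames = {}
        for agent in self.agents:
            for item in agent.buffer:
                # Percepts falling into the same beat window belong together.
                slot = int(item.timestamp // self.beat)
                frames.setdefault(slot, []).append(item)
        return [sorted(items, key=lambda i: i.timestamp)
                for _, items in sorted(frames.items())]

speech = SensorAgent("speech")
gesture = SensorAgent("gesture")
speech.poll("put", 0.10)
gesture.poll("point-at:block", 0.18)   # co-occurs with "put" in the same beat
speech.poll("there", 0.62)
gesture.poll("point-at:table", 0.70)   # co-occurs with "there"

frames = RhythmicIntegrator([speech, gesture], beat=0.5).integrate()
for frame in frames:
    print([(i.channel, i.content) for i in frame])
```

The key design choice sketched here is that temporal co-occurrence within a rhythmic window — rather than semantic analysis — does the first pass of binding gesture to speech, which is consistent with the presemantic integration the abstract describes.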
© 1999 Springer-Verlag Berlin Heidelberg
Wachsmuth, I. (1999). Communicative Rhythm in Gesture and Speech. In: Braffort, A., Gherbi, R., Gibet, S., Teil, D., Richardson, J. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 1999. Lecture Notes in Computer Science, vol 1739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46616-9_25
Print ISBN: 978-3-540-66935-7
Online ISBN: 978-3-540-46616-1