Abstract
It is widely acknowledged that users of Spoken Language Systems (SLS) want the ability to truncate system prompts by using a barge-in capability (e.g., Basson et al., 1995; Yankelovich et al., 1995). However, little has been published on how barge-in is used or whether it adversely affects Automatic Speech Recognition (ASR) and interface usability. Typically, user requests for barge-in are assumed to stem from the desire to make system interactions faster and therefore more similar to interactions with touch-tone systems. We believe instead that requests for a barge-in capability are rooted in the notion of discourse as a turn-taking event. On this view, we believe SLS can be enhanced to provide speech interfaces that users deem more natural and to increase system performance. This study addressed several of these issues. We found that users new to the system did not need to be informed about the barge-in capability before they attempted barge-in, that they used barge-in during almost half of their interactions with the system, and that they showed identifiable patterns of barge-in use consistent with the turn-taking model. Results are presented, and consequences for speech interface design and algorithm enhancement are discussed.
References
Aust, H., Oerder, M., Seide, F., and Steinbiss, V. (1994). Experience with the Phillips automatic train timetable information system. Proc. IEEE Workshop on Interactive Voice Technology for Telecommunications Applications. New York: IEEE Press, pp. 67–72.
Basson, S., Kalyanswamy, A., Man, E., Springer, S., and Yashchin, D. (1995). Establishing speech technology requirements: The money talks field trial. Proc. Annual International Voice Technologies Applications. San Jose, California: American Voice Input/Output Society, pp. 131–136.
Franzke, M., Marx, A.N., Roberts, T.L., and Engelbeck, G.E. (1993). Is Speech Recognition Usable? An exploration of the usability of a speech-based voice mail interface. SIGCHI Bulletin, 25:49–51. New York: Association for Computing Machinery Inc.
Marx, M. and Phillips, M. (1995). Against “Shoehorning:” Rethinking IVR architectures for speech recognition. Proc. Annual International Voice Technologies Applications. San Jose, California: American Voice Input/Output Society, pp. 187–195.
Rudnicky, A.I. and Hauptmann, A.G. (1988). Talking to computers: An empirical investigation. International Journal of Man-Machine Studies, 28:583–604.
Sacks, H., Schegloff, E., and Jefferson, G. (1975). A simplest systematics for the organization of turn-taking for conversation. Language, 50:696–735. Washington, D.C.: Linguistic Society of America.
Stuart, R., Desurvire, H., and Dews, S. (1991). The truncation of prompts in phone based interfaces: Using TOTT in evaluations. Proc. of the Human Factors Society 35th Annual Meeting. Santa Monica, CA: Human Factors Society, pp. 230–234.
Yankelovich, N., Levow, G., and Marx, M. (1995). Designing speech acts: Issues in speech user interfaces. SIGCHI, Human Factors in Computing System Proc., Annual Conference Series. New York: Association for Computing Machinery Inc., pp. 369–376.
Cite this article
Heins, R., Franzke, M., Durian, M. et al. Turn-taking as a design principle for barge-in in Spoken Language Systems. Int J Speech Technol 2, 155–164 (1997). https://doi.org/10.1007/BF02208827
DOI: https://doi.org/10.1007/BF02208827