Abstract
This paper presents an approach to achieving a fundamental understanding of unimodal and multimodal output and input representations, with the ultimate purpose of supporting the design of usable unimodal and multimodal human-human-system interaction (HHSI). The phrase ‘human-human-system interaction’ is preferred to the more common ‘human-computer interaction’ (HCI) because the former appears to provide a better model of our interaction with systems in the future, involving (i) more than one user, (ii) a complex networked system rather than a (desktop) ‘computer’, which in most applications may soon be a thing of the past, and (iii) a system which increasingly behaves as an equal to the human users (Bernsen, 2000). Whereas the enabling technologies for multimodal representation and exchange of information are growing rapidly, there is a lack of theoretical understanding of how to get from the requirements specification of some application of innovative interactive technology to a selection of input/output modalities which will optimise the usability and naturalness of interaction. Modality Theory is being developed to address this complex and, as it turns out, thorny problem, starting from what appears to be a simple and intuitively evident assumption: as long as we are in the dark about the nature of the elementary, or unimodal, modalities of which multimodal presentations must be composed, we do not really understand what multimodality is. To achieve at least part of the understanding needed, the following objectives should be pursued, defining the research agenda of Modality Theory (Bernsen, 1993):
(1) To establish an exhaustive taxonomy and systematic analysis of the unimodal modalities which go into the creation of multimodal output representations of information for HHSI.
(2) To establish an exhaustive taxonomy and systematic analysis of the unimodal modalities which go into the creation of multimodal input representations of information for HHSI. Together with Step (1) above, this will provide sound foundations for describing and analysing any particular system for interactive representation and exchange of information.
(3) To establish principles for how to legitimately combine different unimodal output modalities, input modalities, and input/output modalities for usable representation and exchange of information in HHSI.
(4) To develop a methodology for applying the results of Steps (1)–(3) above to the early design analysis of how to map from the requirements specification of some application to a usable selection of input/output modalities.
(5) To use the results in building, possibly automated, practical interaction design support tools.
References
Baber, C. & J. Noyes (Eds.). Interactive Speech Technology. London: Taylor & Francis, 1993.
Benoit, C., J.C. Martin, C. Pelachaud, L. Schomaker & B. Suhm. “Audio-Visual and Multimodal Speech Systems.” In: D. Gibbon (Ed.), Handbook of Standards and Resources for Spoken Language Systems – Supplement Volume. Kluwer, 2000.
Bernsen, N.O. “A research agenda for modality theory.” In: Cox, R., Petre, M., Brna, P., and Lee, J. (Eds.), Proceedings of the Workshop on Graphical Representations, Reasoning and Communication. World Conference on Artificial Intelligence in Education. Edinburgh, 1993: 43–46.
Bernsen, N.O. “Foundations of multimodal representations. A taxonomy of representational modalities.” Interacting with Computers 6, 4: 347–371, 1994.
Bernsen, N.O. “Why are analogue graphics and natural language both needed in HCI?” In: Paterno, F. (Ed.), Design, Specification and Verification of Interactive Systems. Proceedings of the Eurographics Workshop, Carrara, Italy. Focus on Computer Graphics. Springer Verlag, 1995: 235–251.
Bernsen, N.O. “Towards a tool for predicting speech functionality.” Speech Communication 23: 181–210, 1997.
Bernsen, N.O. “Natural human-human-system interaction.” In: Earnshaw, R., R. Guedj, A. van Dam & J. Vince (Eds.). Frontiers of Human-Centred Computing, On-Line Communities and Virtual Environments. Berlin: Springer Verlag, 2000.
Bernsen, N.O. & L. Dybkjær. “Working Paper on Speech Functionality.” Esprit Long-Term Research Project DISC Year 2 Deliverable D2.10. University of Southern Denmark. See www.disc2.dk, 1999a.
Bernsen, N.O. & L. Dybkjær. “A theory of speech in multimodal systems.” In: Dalsgaard, P., C.-H. Lee, P. Heisterkamp & R. Cole (Eds.). Proceedings of the ESCA Workshop on Interactive Dialogue in Multi-Modal Systems, Irsee, Germany. Bonn: European Speech Communication Association: 105–108, 1999b.
Bernsen, N.O., H. Dybkjær & L. Dybkjær. Designing Interactive Speech Systems. From First Ideas to User Testing. Springer Verlag, 1998.
Bernsen, N.O. & S. Lu. “A software demonstrator of modality theory.” In: Bastide, R. & P. Palanque (Eds.). Proceedings of DSV-IS’95: Second Eurographics Workshop on Design, Specification and Verification of Interactive Systems. Springer Verlag, 242–61, 1995.
Bernsen, N.O. & S. Verjans. “From task domain to human-computer interface. Exploring an information mapping methodology.” In: John Lee (Ed.). Intelligence and Multimodality in Multimedia Interfaces. Menlo Park, CA: AAAI Press. URL: http://www.aaai.org/Press/Books/Lee/lee.html, 1997.
Bertin, J. Semiology of Graphics: Diagrams, Networks, Maps. Trans. by J. Berg. Madison: The University of Wisconsin Press, 1983.
Bodart, F., A.M. Hennebert, J.-M. Leheureux, I. Provot, G. Zucchinetti & J. Vanderdonckt. “Key Activities for a Development Methodology of Interactive Applications.” In: Benyon, D. & P. Palanque (Eds.). Critical Issues in User Interface Systems Engineering. Springer Verlag, 1995.
Buxton, W. “Lexical and pragmatic considerations of input structures.” Computer Graphics 17, 1: 31–37, 1983.
Foley, J.D., V.L. Wallace & P. Chan. “The Human Factors of Graphic Interaction Techniques.” IEEE Computer Graphics and Applications 4, 11: 13–48, 1984.
Greenstein, J.S. & L.Y. Arnaut. “Input devices.” In: M. Helander (Ed.). Handbook of Human-Computer Interaction. Amsterdam: North-Holland, 495–519, 1988.
Holmes, N. Designer’s Guide to Creating Charts and Diagrams. New York: Watson-Guptill Publications, 1984.
Hovy, E. & Y. Arens. “When is a picture worth a thousand words? Allocation of modalities in multimedia communication.” Paper presented at the AAAI Symposium on Human-Computer Interfaces, Stanford, 1990.
Joslyn, C., C. Lewis & B. Domik. “Designing glyphs to exploit patterns in multidimensional data sets.” CHI’95 Conference Companion, 198–199, 1995.
Lenorovitz, D.R., M.D. Phillips, R.S. Ardrey & G.V. Kloster. “A taxonomic approach to characterizing human-computer interaction.” In: G. Salvendy (Ed.). Human-Computer Interaction. Amsterdam: Elsevier Science Publishers, 111–116, 1984.
Lockwood, A. Diagram. A visual survey of graphs, maps, charts and diagrams for the graphic designer. London: Studio Vista, 1969.
Lohse, G., N. Walker, K. Biolsi & H. Rueter. “Classifying graphical information.” Behaviour and Information Technology 10, 5: 419–436, 1991.
Luz, S. & Bernsen, N.O. “Interactive advice on the use of speech in multimodal systems design with SMALTO.” In: Ostermann, J., K.J. Ray Liu, J.Aa. Sorensen, E. Deprettere & W.B. Kleijn (Eds.). Proceedings of the Third IEEE Workshop on Multimedia Signal Processing, Elsinore, Denmark. Piscataway, NJ: IEEE, 489–494, 1999.
Mackinlay, J., S.K. Card & G.G. Robertson. “A semantic analysis of the design space of input devices.” Human-Computer Interaction 5: 145–90, 1990.
Mullet, K. & D.J. Schiano. “3D or not 3D: `More is better’ or `Less is more’?” CHI’95 Conference Companion, 174–175, 1995.
Rosch, E. “Principles of categorization.” In: Rosch, E. & B.B. Lloyd (Eds.). Cognition and Categorization. Hillsdale, NJ: Erlbaum, 1978.
SMALTO: http://disc.nis.sdu.dk/smalto/
Stenning, K. & J. Oberlander. “Reasoning with words, pictures and calculi: Computation versus justification.” In: Barwise, J., J.M. Gawron, G. Plotkin & S. Tutiya (Eds.). Situation Theory and Its Applications. Stanford, CA: CSLI, Vol. 2: 607–62, 1991.
Tufte, E.R. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 1983.
Tufte, E.R. Envisioning Information. Cheshire, CT: Graphics Press, 1990.
Twyman, M. “A schema for the study of graphic language.” In: Kolers, P., M. Wrolstad & H. Bouma (Eds.). Processing of Visual Language Vol. 1. New York: Plenum Press, 1979.
© 2002 Springer Science+Business Media Dordrecht
Bernsen, N.O. (2002). Multimodality in Language and Speech Systems — From Theory to Design Support Tool. In: Granström, B., House, D., Karlsson, I. (eds) Multimodality in Language and Speech Systems. Text, Speech and Language Technology, vol 19. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2367-1_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-6024-2
Online ISBN: 978-94-017-2367-1