Abstract
Speech production is multimodal and multisensory. We present a novel approach to segmental pronunciation instruction that directs foreign language (L2) learners to simultaneously monitor the visual, tactile-articulatory, and auditory information of their articulators’ movements in a video conference-based classroom. We reasoned that this would allow L2 learners to heighten their proprioceptive awareness for speech motor actions and thus facilitate pronunciation gains. We present preliminary results from a study with German learners of English who participated in a pronunciation training session with their video self-view activated. The focus was on the L2 /v-w/ contrast in singletons and /Cw/-clusters. Each participant completed one small-group lesson and a pre-, post-, and delayed post-test on Zoom. Tasks were designed to elicit data from controlled to spontaneous speech. Here, we report preliminary results from the controlled word-list reading task. L1 English raters judged words both a) categorically, i.e., whether speakers produced /v/ or /w/, and b) on a six-point goodness scale. Most participants started near ceiling; individual speakers with the most room for improvement showed slight gains from pre- to post-tests; /kw/-clusters had the lowest ratings at the delayed post-test. Although no significant differences between time points were found in the preliminary data, the results suggest that L2 learners might benefit from pronunciation training that facilitates the integration of multisensory feedback, in particular visual and somatosensory feedback.
Acknowledgments
We are tremendously grateful for the reviewers’ helpful comments, to everyone who participated in our study, and to all student assistants who helped with study preparations, data collection, or initial data processing: Clyn Baker, Maryellen Martin, Arnav Gupta, Jessica Jablonski, Mallory Evans, Chris Colon, Maike Rocker, and Anne Drobny. We also wish to thank our colleagues for helpful discussions on various aspects of this project at different stages, in particular Laura C. Smith and Miran Kim, as well as colleagues at TU Braunschweig for help with recruitment, in particular Holger Hopp. All errors are our own. This work was supported by the National Science Foundation (PIRE, grant number 1545900) and a Pennsylvania State University Center for Global Studies Research Award to the first author.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Citation
Schuhmann, K.S., Schaech, S., Catto, C. (2023). Multisensory Pronunciation Training in a Video Conference-Based Foreign Language Classroom. In: Georgiou, G.P., Giannakou, A., Savvidou, C. (eds) Advances in Second/Foreign Language Acquisition. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-38522-3_2
Publisher Name: Palgrave Macmillan, Cham
Print ISBN: 978-3-031-38521-6
Online ISBN: 978-3-031-38522-3
eBook Packages: Social Sciences (R0)