Abstract
For multilingual text-to-speech synthesis, it is desirable to have reliable grapheme-to-phoneme conversion algorithms that can be easily adapted to different languages. I propose a flexible dual-route neural network algorithm with two components: a constructor net, which exploits regularities in the mapping from graphemes to phonemes, and a self-organizing map (SOM), which stores the exceptions that the constructor net does not capture. The SOM transcribes one word at a time, whereas the constructor net transcribes one phoneme at a time. The constructor net's output is then classified by mapping it onto a set of codebook vectors generated by Learning Vector Quantisation (LVQ); these vectors capture the net's concept of each phoneme.
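The control flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: a plain dictionary stands in for the SOM exception store, a hypothetical `toy_net` function stands in for the trained constructor net, the codebook is hand-written rather than produced by LVQ training, and each letter is assumed to yield exactly one phoneme. All names (`nearest_phoneme`, `transcribe`, `toy_net`, `TOY_CODEBOOK`) are illustrative only.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Codebook = List[Tuple[str, Vector]]  # (phoneme label, codebook vector)


def nearest_phoneme(output: Vector, codebook: Codebook) -> str:
    """Classify a net output as the label of the closest codebook vector.

    Euclidean distance is an assumption; it is the usual choice for
    LVQ-style nearest-prototype classification.
    """
    label, _ = min(codebook, key=lambda entry: math.dist(output, entry[1]))
    return label


def transcribe(
    word: str,
    exceptions: Dict[str, List[str]],
    net: Callable[[str, int], Vector],
    codebook: Codebook,
) -> List[str]:
    """Dual-route transcription sketch.

    Route 1: whole-word lookup (a dict standing in for the SOM).
    Route 2: one net output per position, classified via the codebook.
    """
    if word in exceptions:
        return list(exceptions[word])
    return [nearest_phoneme(net(word, i), codebook) for i in range(len(word))]


# Hypothetical stand-ins, used only to exercise the sketch.
TOY_CODEBOOK: Codebook = [("V", (1.0, 0.0)), ("C", (0.0, 1.0))]


def toy_net(word: str, i: int) -> Vector:
    """Pretend net: vowels land near the 'V' prototype, other letters near 'C'."""
    return (0.9, 0.1) if word[i] in "aeiou" else (0.2, 0.8)
```

Calling `transcribe("ba", {}, toy_net, TOY_CODEBOOK)` takes the net route, while a word present in `exceptions` is returned from the lookup without consulting the net. Checking the exception store first reflects the abstract's description of the SOM as holding the cases the constructor net does not capture.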
Thanks to T. Mark Ellison, Joachim Buhmann, Paul Taylor, and David Willshaw for valuable comments. The financial support of the Studienstiftung des deutschen Volkes and of ERASMUS programme ICP 95 NL 1186 is gratefully acknowledged.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wolters, M. (1996). A dual route neural net approach to grapheme-to-phoneme conversion. In: von der Malsburg, C., von Seelen, W., Vorbrüggen, J.C., Sendhoff, B. (eds) Artificial Neural Networks — ICANN 96. ICANN 1996. Lecture Notes in Computer Science, vol 1112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61510-5_42
DOI: https://doi.org/10.1007/3-540-61510-5_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61510-1
Online ISBN: 978-3-540-68684-2