Abstract
This paper presents a dynamic method for phoneme synthesis based on the concatenation of elemental patterns. The vocal sound waveform can be decomposed into elementals whose shape changes slightly from one to the next as they chain in time, while preserving the dynamics specific to each phoneme. An approximation network, here a radial basis function (RBF) network, generates the elementals in time and allows the characteristics of the sound signal to be controlled. With this technique, a fairly realistic imitation of a natural sound was obtained.
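The idea of approximating one elemental with an RBF network and chaining slightly modified copies can be illustrated with a minimal sketch. It is not the network described in the chapter: the Gaussian basis, the least-squares fit, the synthetic target waveform, the weight-jitter scheme, and all names (`rbf_design`, `fit_rbf`, `generate_chain`, `centers`, `width`, `jitter`) are assumptions made only for illustration.

```python
import numpy as np

def rbf_design(t, centers, width):
    """Gaussian RBF design matrix: one row per time sample, one column per center."""
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def fit_rbf(t, y, centers, width):
    """Least-squares output weights for a fixed set of centers (assumed training rule)."""
    phi = rbf_design(t, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return weights

def generate_chain(weights, centers, width, n_samples, n_repeats, jitter=0.0, seed=0):
    """Concatenate n_repeats elementals; each copy uses slightly perturbed weights."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    phi = rbf_design(t, centers, width)
    pieces = []
    for _ in range(n_repeats):
        # Hypothetical control knob: small multiplicative noise on the weights
        # changes the shape of each elemental while keeping its overall form.
        w = weights * (1.0 + jitter * rng.standard_normal(weights.shape))
        pieces.append(phi @ w)
    return np.concatenate(pieces)

# Synthetic stand-in for one elemental (one pitch period), not real speech data.
n_samples = 200
t = np.linspace(0.0, 1.0, n_samples, endpoint=False)
elemental = np.sin(2 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t) + 0.25 * np.sin(6 * np.pi * t)

centers = np.linspace(0.0, 1.0, 20)  # assumed: evenly spaced centers
width = 0.05                         # assumed: common Gaussian width

weights = fit_rbf(t, elemental, centers, width)
approx = rbf_design(t, centers, width) @ weights
rms_error = float(np.sqrt(np.mean((approx - elemental) ** 2)))

signal = generate_chain(weights, centers, width, n_samples, n_repeats=5, jitter=0.02)
```

With `jitter=0.0` every elemental in the chain is identical to the fitted approximation; a small positive value gives the slight shape variation from one elemental to the next. The sketch makes no attempt to smooth the junctions between consecutive elementals.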
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Crisan, M. (2015). Approximation Neural Network for Phoneme Synthesis. In: Satapathy, S., Biswal, B., Udgata, S., Mandal, J. (eds) Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. Advances in Intelligent Systems and Computing, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-319-11933-5_1
DOI: https://doi.org/10.1007/978-3-319-11933-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11932-8
Online ISBN: 978-3-319-11933-5
eBook Packages: Engineering, Engineering (R0)