Abstract
Dysarthria is a set of congenital and traumatic neuromotor disorders that impair the physical production of speech. These impairments reduce or remove the normal control of the vocal articulators. The acoustic characteristics of dysarthric speech are very different from those of speech collected from a normative population, with relatively larger intra-speaker inconsistencies in the temporal dynamics of the dysarthric speech [1] [2]. These inconsistencies result in poor audible quality for dysarthric speech and in low phone/speech recognition accuracy. Further, collecting and labeling dysarthric speech is extremely difficult, both because few people have these disorders and because the poor quality of the speech makes the database hard to label. Hence, it would be of great interest to explore how to improve the efficiency of acoustic models built on small dysarthric speech databases such as Nemours [3], or how to use speech databases collected from a normative population to build acoustic models for dysarthric speakers. In this work, we explore the latter approach.
References
Weismer, G., Tjaden, K., Kent, R.D.: Can articulatory behavior in motor speech disorders be accounted for by theories of normal speech production? Journal of Phonetics 23, 149–164 (1995)
Duffy, J.: Motor Speech Disorders: Substrates, Differential Diagnosis, and Management. Mosby, St. Louis (2005)
Menendez-Pidal, X., Polikoff, J.B., Peters, S.M., Leonzio, J.E., Bunnell, H.T.: The Nemours database of Dysarthric speech. In: Proceedings of the Fourth International Conference on Spoken Language Processing, Philadelphia, USA (1996)
Murdoch, B.E. (ed.): Dysarthria: A Physiological Approach to Assessment and Treatment, ch. 1. Stanley Thornes Publishers Ltd., UK (1998)
Gauvain, J.L., Lee, C.H.: Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing 2, 291–298 (1994)
Leggetter, C.J., Woodland, P.C.: Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language 9, 171–185 (1994)
Young, S., Jansen, J., Odell, J., Ollason, D., Woodland, P.: The HTK book. Cambridge University Engineering Department, Cambridge (2003)
Deller, J.R., Hsu, D., Ferrier, L.J.: On the use of Hidden Markov Modelling for recognition of dysarthric speech. Computer Methods and Programs in Biomedicine 35, 125–139 (1991)
Reynolds, D.A.: A Gaussian Mixture Modeling Approach to Text-Independent Speaker Identification. Ph.D. thesis, Georgia Institute of Technology (1992)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
George, K.K., Kumar, C.S. (2013). Towards Enhancing the Acoustic Models for Dysarthric Speech. In: Duffy, V.G. (eds) Digital Human Modeling and Applications in Health, Safety, Ergonomics, and Risk Management. Healthcare and Safety of the Environment and Transport. DHM 2013. Lecture Notes in Computer Science, vol 8025. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39173-6_22
DOI: https://doi.org/10.1007/978-3-642-39173-6_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39172-9
Online ISBN: 978-3-642-39173-6
eBook Packages: Computer Science (R0)