Abstract
The aim of this work is to improve the automatic recognition of dysarthric speech. To this end, we compare two speech parameterization techniques: the recently proposed Power Normalized Cepstral Coefficients (PNCC) and the Mel-Frequency Cepstral Coefficients (MFCC). We concatenate several variants of jitter and shimmer with these parameterizations to improve an automatic recognition system for dysarthric words. The goal is to assist vulnerable people with speech impairments (dysarthric voice) and to help the doctor make a first diagnosis of the patient’s disease. For this purpose, an automatic recognition system for continuous pathological speech has been developed, based on Hidden Markov Models (HMMs) and the Hidden Markov Model Toolkit (HTK). For our tests, we used the Nemours database, which contains 11 speakers with dysarthric voices.
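As a rough illustration of the concatenation described in the abstract, the following minimal Python sketch computes one jitter value and one shimmer value for an utterance and appends them to every cepstral frame. The abstract does not say which jitter and shimmer variants are used or how they are attached to the frames, so the "local" definitions below (mean absolute difference between consecutive values, divided by the mean value) and the per-utterance appending scheme are assumptions. The function names `local_jitter`, `local_shimmer`, and `append_perturbation` are hypothetical, and the pitch periods, peak amplitudes, and cepstral frames are assumed to come from an external front end.

```python
from typing import List, Sequence


def _relative_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference of consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods: Sequence[float]) -> float:
    """Assumed 'local' jitter: cycle-to-cycle variation of the pitch period."""
    return _relative_perturbation(periods)


def local_shimmer(amplitudes: Sequence[float]) -> float:
    """Assumed 'local' shimmer: cycle-to-cycle variation of the peak amplitude."""
    return _relative_perturbation(amplitudes)


def append_perturbation(
    frames: Sequence[Sequence[float]], jitter: float, shimmer: float
) -> List[List[float]]:
    """Append the same jitter and shimmer values to every cepstral frame.

    Assumption: one jitter/shimmer pair per utterance, repeated on each frame.
    """
    return [list(frame) + [jitter, shimmer] for frame in frames]


# Toy inputs (illustrative values only, not taken from the Nemours database).
periods = [0.010, 0.011, 0.010, 0.009]        # pitch periods in seconds
amplitudes = [1.0, 0.9, 1.1, 1.0]             # peak amplitude per cycle
frames = [[0.0] * 13, [0.1] * 13]             # two frames of 13 coefficients

augmented = append_perturbation(
    frames, local_jitter(periods), local_shimmer(amplitudes)
)
```

With these toy inputs, each 13-coefficient frame becomes a 15-dimensional vector. A real system would obtain the periods and amplitudes from a pitch-marking step and the frames from an MFCC or PNCC extractor.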
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zaidi, BF., Boudraa, M., Selouani, SA., Addou, D., Yakoub, M.S. (2020). Automatic Recognition System for Dysarthric Speech Based on MFCC’s, PNCC’s, JITTER and SHIMMER Coefficients. In: Arai, K., Kapoor, S. (eds) Advances in Computer Vision. CVC 2019. Advances in Intelligent Systems and Computing, vol 944. Springer, Cham. https://doi.org/10.1007/978-3-030-17798-0_40
DOI: https://doi.org/10.1007/978-3-030-17798-0_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17797-3
Online ISBN: 978-3-030-17798-0
eBook Packages: Intelligent Technologies and Robotics (R0)