Abstract
Current developments in digital recording and in methods for processing recorded voice are useful for detecting most pathologies and diseases of the human vocal tract. Recognising the condition of the voice requires a model built from a range of acoustic parameters of the speech signal. In this study, a vector of 31 parameters for analysing the speech signal was created. The parameters were extracted from the time, frequency and cepstral domains. Principal Component Analysis reduced the number of parameters to 17. Detection of pathological voice signals was validated using tenfold cross-validation and a confusion matrix. The goal and novelty of this work was to analyse the applicability of the parameters selectively used to assess the pathology.
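The pipeline outlined in the abstract (a 31-parameter vector, reduction to 17 principal components, tenfold cross-validation, confusion matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the abstract does not name a classifier so a nearest-centroid rule stands in as a placeholder, and the function names `pca_fit` and `cross_validated_confusion` are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def cross_validated_confusion(X, y, n_components=17, n_folds=10, seed=0):
    """Tenfold CV of PCA + nearest centroid; returns a 2x2 confusion matrix."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    cm = np.zeros((2, 2), dtype=int)  # rows: true class, columns: predicted
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Standardise and fit PCA on the training fold only, to avoid leakage.
        mu = X[train].mean(axis=0)
        sigma = X[train].std(axis=0) + 1e-12
        Z_train = (X[train] - mu) / sigma
        Z_test = (X[test] - mu) / sigma
        mean, components = pca_fit(Z_train, n_components)
        P_train = (Z_train - mean) @ components.T
        P_test = (Z_test - mean) @ components.T
        # ASSUMPTION: nearest-centroid classifier as a placeholder, since the
        # abstract does not specify which classifier the study used.
        centroids = np.stack(
            [P_train[y[train] == c].mean(axis=0) for c in (0, 1)]
        )
        dists = np.linalg.norm(P_test[:, None, :] - centroids[None, :, :], axis=2)
        pred = dists.argmin(axis=1)
        np.add.at(cm, (y[test], pred), 1)
    return cm


# Synthetic stand-in for the 31 acoustic parameters (NOT the study's data):
# class 0 = healthy, class 1 = pathological, 100 recordings each.
rng = np.random.default_rng(42)
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 31))
X[y == 1, :5] += 1.5  # shift five features for the "pathological" class

cm = cross_validated_confusion(X, y)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(f"accuracy = {accuracy:.3f}")
```

Fitting the standardisation and the PCA inside each fold, rather than once on the full data set, keeps the held-out recordings from influencing the projection. Whether the study proceeded this way is not stated in the abstract.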
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Panek, D., Skalski, A., Gajda, J. (2014). Quantification of Linear and Non-linear Acoustic Analysis Applied to Voice Pathology Detection. In: Piętka, E., Kawa, J., Wieclawek, W. (eds) Information Technologies in Biomedicine, Volume 4. Advances in Intelligent Systems and Computing, vol 284. Springer, Cham. https://doi.org/10.1007/978-3-319-06596-0_33
DOI: https://doi.org/10.1007/978-3-319-06596-0_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06595-3
Online ISBN: 978-3-319-06596-0
eBook Packages: Engineering, Engineering (R0)