Abstract
Stuttering is a fluency disorder characterized by dysfluencies in the normal flow of speech, such as repetitions, prolongations, and interjections. It is one of the serious problems in speech pathology. This paper presents experimental results for the classification of three types of dysfluency in stuttered speech: syllable repetition, word repetition, and prolongation. Three speech parameterization techniques, Linear Prediction Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Mel Frequency Cepstral Coefficients (MFCC), are used for feature extraction, and their performance is compared experimentally. The speech samples are obtained from the University College London Archive of Stuttered Speech (UCLASS). The dysfluencies are extracted from these speech samples and used for feature extraction. A multi-class Support Vector Machine (SVM) is employed to classify the speech dysfluencies.
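As a rough illustration of two of the parameterizations named above, the following sketch computes LPC coefficients for a single frame with the autocorrelation method (Levinson-Durbin recursion) and converts them to LPCC with the standard cepstral recursion. It is not the authors' implementation: the function names, the predictor order, the number of cepstral coefficients, and the toy frame are all assumptions made for illustration, and pre-emphasis, windowing, and framing are omitted.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """LPC via Levinson-Durbin. Returns (a, err), where the predictor is
    x_hat[n] = sum(a[k-1] * x[n-k] for k in 1..order)."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)      # a[1..order] are used; a[0] is a placeholder
    err = r[0]                   # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:           # degenerate (e.g. all-zero) frame
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err            # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model
    1 / (1 - sum(a[k-1] * z**-k))."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


# Toy frame (an assumption, not UCLASS data): a decaying exponential,
# which a first-order predictor models almost exactly.
frame = [0.5 ** n for n in range(64)]
coeffs, residual = lpc(frame, order=1)
cepstra = lpc_to_lpcc(coeffs, n_ceps=3)
```

In a full pipeline, per-frame vectors like `cepstra` would be gathered for each dysfluent segment and passed to a multi-class classifier such as an SVM; that stage is outside this sketch.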
Keywords
- Hidden Markov Model
- Linear Discriminant Analysis
- Discrete Cosine Transform
- Dynamic Time Warping
- Speech Sample
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Mahesha, P., Vinod, D.S. (2013). Classification of Speech Dysfluencies Using Speech Parameterization Techniques and Multiclass SVM. In: Singh, K., Awasthi, A.K. (eds) Quality, Reliability, Security and Robustness in Heterogeneous Networks. QShine 2013. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37949-9_26
DOI: https://doi.org/10.1007/978-3-642-37949-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37948-2
Online ISBN: 978-3-642-37949-9
eBook Packages: Computer Science, Computer Science (R0)