Abstract
This work proposes a feature extraction scheme that combines the Wavelet Packet Transform (WPT) with the widely used Mel Frequency Cepstral Coefficients (MFCC) to obtain speech feature vectors for text-independent speaker identification. The proposed technique overcomes the single-resolution limitation of MFCC by incorporating the multiresolution analysis offered by WPT. To assess its accuracy in a realistic setting, the scheme is tested on a speaker database using a Hidden Markov Model (HMM) and a Gaussian Mixture Model (GMM) as classifiers, and the identification performance of the two classifiers is compared. The identification results obtained with MFCC features and with the Wavelet Packet based Mel Frequency Cepstral (WP-MFC) features are also compared to validate the proposed scheme. With WP-MFC features, identification accuracy as high as 100% was achieved in some cases.
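The general idea of replacing the fixed-resolution spectral analysis of MFCC with a wavelet packet decomposition can be illustrated with a minimal sketch. The abstract does not specify the wavelet family, the tree depth, or how the subbands are mapped to the mel scale, so everything below is an assumption made for illustration: a Haar wavelet, a full (uniform) packet tree of depth 3, log subband energies, and a DCT-II to obtain cepstral-like coefficients. The function names (`haar_step`, `wavelet_packet_leaves`, `wp_cepstral_features`) are hypothetical and are not taken from the paper.

```python
import math


def haar_step(x):
    """One level of orthonormal Haar analysis: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(frame, depth):
    """Full wavelet packet tree: both branches are split at every level.

    Returns 2**depth subbands in natural (filter-bank) order, which is
    not the same as increasing-frequency order.
    """
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def wp_cepstral_features(frame, depth=3, n_coeffs=6, eps=1e-12):
    """Cepstral-like features: DCT of log wavelet packet subband energies.

    Assumption: a uniform tree stands in for a mel-like subband layout.
    """
    leaves = wavelet_packet_leaves(frame, depth)
    log_energies = [math.log(sum(c * c for c in leaf) + eps) for leaf in leaves]
    return dct2(log_energies, n_coeffs)


# Usage: one 64-sample frame of a synthetic tone (frame length should be
# divisible by 2**depth so that no samples are dropped).
frame = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
features = wp_cepstral_features(frame)
```

A feature vector of this kind would be computed per frame and passed to a classifier such as a GMM or HMM. A mel-oriented variant would split the low-frequency branches more deeply than the high-frequency ones rather than using a uniform tree.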
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Srivastava, S., Bhardwaj, S., Bhandari, A., Gupta, K., Bahl, H., Gupta, J.R.P. (2013). Wavelet Packet Based Mel Frequency Cepstral Features for Text Independent Speaker Identification. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_26
DOI: https://doi.org/10.1007/978-3-642-32063-7_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering, Engineering (R0)