Analysis of Speaker’s Voice in Cepstral Domain Using MFCC Based Feature Extraction and VQ Technique for Speaker Identification System

  • Conference paper
Advanced Intelligent Systems for Sustainable Development (AI2SD’2018) (AI2SD 2018)

Abstract

Automatic Speaker Recognition technology has developed rapidly in recent years and integrates readily with existing biometric systems, where it can be deployed in identification systems to improve recognition and ensure security. An essential initial phase in a Speaker Recognition (SR) system is extracting accurate information from the human acoustic signal that captures the unique characteristics of the speaker’s voice. One popular choice for feature extraction is short-term spectral characteristics. In this paper, we investigate the performance of Mel frequency cepstral coefficients (MFCCs) for extracting features in the training phase of a text-dependent speaker identification system. To evaluate the reliability of the proposed MFCC feature sets, we use a Vector Quantization (VQ) classifier based on the well-known Linde-Buzo-Gray (LBG) algorithm, and results are reported for a dataset composed of eight subjects (5 male and 3 female). We also examine how changing the codebook size affects the identification rate. The results show the influence of the codebook size on the identification rate of the text-dependent speaker identification system, which yields an identification accuracy of 87.5% using codebooks of size 8, 16, 32 and 64.
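
To make the VQ/LBG classification step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each speaker is represented by a matrix of MFCC frames; here, synthetic Gaussian vectors stand in for real MFCCs. The function names (`lbg_codebook`, `distortion`, `identify`), the 12-coefficient frame size, the splitting factor `eps`, and the fixed iteration count are all illustrative assumptions. A codebook is grown by repeatedly splitting centroids and refining them, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion.

```python
import numpy as np

def lbg_codebook(features, size, eps=0.01, n_iter=20):
    """Train a VQ codebook by LBG splitting; `size` should be a power of two."""
    codebook = features.mean(axis=0, keepdims=True)  # start from a single centroid
    while codebook.shape[0] < size:
        # Split every centroid into two slightly perturbed copies.
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(n_iter):
            # Assign each feature vector to its nearest centroid.
            dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
            nearest = dists.argmin(axis=1)
            # Move each centroid to the mean of its cell (leave it if the cell is empty).
            for k in range(codebook.shape[0]):
                members = features[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average Euclidean distance from each frame to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def identify(features, codebooks):
    """Return the enrolled speaker whose codebook yields the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))

rng = np.random.default_rng(0)
# Synthetic stand-ins for per-speaker MFCC frames (200 frames x 12 coefficients).
train = {f"speaker_{i}": rng.normal(loc=3.0 * i, scale=1.0, size=(200, 12))
         for i in range(3)}
codebooks = {name: lbg_codebook(x, size=8) for name, x in train.items()}

# A hypothetical test utterance drawn from the same distribution as speaker_1.
test_utterance = rng.normal(loc=3.0, scale=1.0, size=(50, 12))
print(identify(test_utterance, codebooks))
```

Changing the `size` argument (for example to 16, 32 or 64) is how the effect of codebook size on the identification rate could be explored in such a sketch.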


Author information

Corresponding author

Correspondence to Mariame Jenhi .

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jenhi, M., Roukhe, A., Hlou, L. (2019). Analysis of Speaker’s Voice in Cepstral Domain Using MFCC Based Feature Extraction and VQ Technique for Speaker Identification System. In: Ezziyyani, M. (eds) Advanced Intelligent Systems for Sustainable Development (AI2SD’2018). AI2SD 2018. Advances in Intelligent Systems and Computing, vol 915. Springer, Cham. https://doi.org/10.1007/978-3-030-11928-7_78
