Abstract
Automatic speaker recognition technology has developed rapidly in recent years and integrates readily with existing biometric systems, where it can be deployed in identification systems to improve recognition and strengthen security. An essential first stage of a speaker recognition (SR) system is extracting, from the acoustic speech signal, accurate information that captures the unique characteristics of the speaker’s voice. A popular choice for feature extraction is the short-term spectral characteristics of the signal. In this paper, we investigate the performance of Mel-frequency cepstral coefficients (MFCCs) as the features extracted in the training phase of a text-dependent speaker identification system. To evaluate the reliability of the MFCC feature sets, we use a Vector Quantization (VQ) classifier based on the well-known Linde-Buzo-Gray (LBG) algorithm, and report results for a dataset of eight subjects (5 male and 3 female). We also examine how changing the codebook size affects the identification rate. The results show the influence of codebook size on the identification rate of the text-dependent speaker identification system, which yields an identification accuracy of 87.5% using codebooks of size 8, 16, 32 and 64.
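As a rough illustration of the VQ/LBG classification step named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes feature vectors are already available: instead of real MFCCs it uses synthetic three-dimensional Gaussian "frames", and the speaker names, cluster centres, split factor `eps`, and stopping tolerance `tol` are all hypothetical choices made for the example. Each speaker gets a codebook trained by LBG splitting, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def avg_distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def lbg(vectors, size, eps=0.01, tol=1e-4, max_iter=50):
    """Train a codebook of `size` codewords (a power of two) by LBG splitting."""
    codebook = [mean(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split every codeword c into c*(1+eps) and c*(1-eps), doubling the size.
        codebook = [[x * (1 + s) for x in c] for c in codebook for s in (eps, -eps)]
        prev = float("inf")
        for _ in range(max_iter):
            # Nearest-neighbour partition, then centroid update.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [mean(cell) if cell else c for cell, c in zip(cells, codebook)]
            d = avg_distortion(vectors, codebook)
            if prev - d <= tol * d:  # relative improvement too small: stop
                break
            prev = d
    return codebook

def make_frames(center, n, rng, spread=0.5):
    """Synthetic stand-in for MFCC frames: Gaussian noise around a centre."""
    return [[rng.gauss(m, spread) for m in center] for _ in range(n)]

def identify(frames, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(frames, codebooks[name]))

rng = random.Random(0)
# Hypothetical speakers and cluster centres, for illustration only.
centers = {
    "spk_a": [1.0, 2.0, 3.0],
    "spk_b": [4.0, -1.0, 0.5],
    "spk_c": [-3.0, 1.5, 2.5],
}
codebooks = {name: lbg(make_frames(c, 200, rng), size=8) for name, c in centers.items()}
test_frames = make_frames(centers["spk_b"], 50, rng)
print(identify(test_frames, codebooks))
```

Changing the `size` argument (for example to 16, 32 or 64) is the knob corresponding to the codebook-size comparison described above; on real speech the frames would come from an MFCC front end rather than `make_frames`.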
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Jenhi, M., Roukhe, A., Hlou, L. (2019). Analysis of Speaker’s Voice in Cepstral Domain Using MFCC Based Feature Extraction and VQ Technique for Speaker Identification System. In: Ezziyyani, M. (eds) Advanced Intelligent Systems for Sustainable Development (AI2SD’2018). AI2SD 2018. Advances in Intelligent Systems and Computing, vol 915. Springer, Cham. https://doi.org/10.1007/978-3-030-11928-7_78
DOI: https://doi.org/10.1007/978-3-030-11928-7_78
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11927-0
Online ISBN: 978-3-030-11928-7
eBook Packages: Intelligent Technologies and Robotics (R0)