Abstract
Speech is a biological or physical feature unique to each person, and it is widely used in speaker identification tasks such as access control, transaction authentication, and home automation, among others. The aim of this research is to propose a connected-words speaker recognition scheme, based on a closed-set, speaker-independent voice corpus in noisy environments, that can be applied in contexts such as forensics. Following a knowledge discovery in databases (KDD) analysis, Mel-frequency cepstral coefficients (MFCCs) were used as a filtering technique to extract speech features from 158 speakers, which were then used to carry out the speaker identification process. The paper presents a performance comparison of artificial neural network (ANN), k-nearest neighbors (KNN), and logistic regression models, which obtained F1 scores of 98%, 98.32%, and 97.75%, respectively. The results show that schemes such as KNN and ANN can achieve similar performance on full voice files when the proposed KDD framework is applied, generating robust models that can be applied in forensic environments.
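As a rough illustration of the kind of comparison described above, the following minimal sketch classifies speakers with a k-nearest-neighbors majority vote and scores the predictions with an F1 measure. It is not the authors' implementation. The function names, the 13-dimensional synthetic vectors (standing in for MFCC-derived features), the number of hypothetical speakers, and the choice of macro averaging for F1 are all assumptions, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def make_synthetic_features(n_speakers=5, per_speaker=20, dims=13, seed=0):
    """Hypothetical stand-in for per-file MFCC feature vectors: one noisy cluster per speaker."""
    rng = random.Random(seed)
    features, labels = [], []
    for speaker in range(n_speakers):
        for _ in range(per_speaker):
            features.append([speaker * 5.0 + rng.gauss(0.0, 1.0) for _ in range(dims)])
            labels.append(f"speaker_{speaker}")
    return features, labels


if __name__ == "__main__":
    x, y = make_synthetic_features()
    # Simple hold-out split: every fourth vector is used for testing.
    train = [(f, l) for i, (f, l) in enumerate(zip(x, y)) if i % 4 != 0]
    test = [(f, l) for i, (f, l) in enumerate(zip(x, y)) if i % 4 == 0]
    train_x, train_y = zip(*train)
    predictions = [knn_predict(train_x, train_y, f) for f, _ in test]
    print("macro F1:", macro_f1([l for _, l in test], predictions))
```

In a real pipeline the synthetic vectors would be replaced by features extracted from the recordings, and the hold-out split by whatever validation scheme the study used.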
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rodarte-Rodríguez, A. et al. (2023). Speaker Identification in Noisy Environments for Forensic Purposes. In: Mejia, J., Muñoz, M., Rocha, Á., Hernández-Nava, V. (eds) New Perspectives in Software Engineering. CIMPS 2022. Lecture Notes in Networks and Systems, vol 576. Springer, Cham. https://doi.org/10.1007/978-3-031-20322-0_21
DOI: https://doi.org/10.1007/978-3-031-20322-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20321-3
Online ISBN: 978-3-031-20322-0
eBook Packages: Intelligent Technologies and Robotics (R0)