Skip to main content

Speaker Identification in Noisy Environments for Forensic Purposes

  • Conference paper
  • First Online:
New Perspectives in Software Engineering (CIMPS 2022)

Abstract

The speech is a biological or physical feature unique to each person, and this is widely used in speaker identification tasks like access control, transaction authentication, home automation applications, among others. The aim of this research is to propose a connected-words speaker recognition scheme based on a closed-set speaker-independent voice corpus in noisy environments that can be applied in contexts such as forensic purposes. Using a KDD analysis, MFCCs were used as filtering technique to extract speech features from 158 speakers, to later carry out the speaker identification process. Paper presents a performance comparison of ANN, KNN and logistic regression models, which obtained a F1 score of 98%, 98.32% and 97.75%, respectively. The results show that schemes such as KNN and ANN can achieve a similar performance in full voice files when applying the proposed KDD framework, generating robust models applied in forensic environments.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 189.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 249.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Mohd Hanifa, R., Isa, K., Mohamad, S.: A review on speaker recognition: technology and challenges. Comput. Electr. Eng. 90 (2021)

    Google Scholar 

  2. Becerra, A., de la Rosa, J.I., González, E., Pedroza, A.D., Escalante, N.I.: Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition. Multi. Tools Appl. 77(20), 27231–27267 (2018). https://doi.org/10.1007/s11042-018-5917-5

    Article  Google Scholar 

  3. Campbell, J.P.: Speaker recognition: a tutorial. Proc. IEEE 85, 1437–1462 (1997)

    Article  Google Scholar 

  4. Basharirad, B., Moradhaseli, M.: Speech emotion recognition methods: a literature review. In: The 2nd International Conference on Applied Science and Technology (ICAST’17), p. 020105. AIP Publishing (2017)

    Google Scholar 

  5. Deng, L., Li, X.: Machine learning paradigms for speech recognition: an overview. IEEE Trans. Audio, Speech Lang. Process. 21, 1060–1089 (2013)

    Google Scholar 

  6. Togneri, R., Pullella, D.: An overview of speaker identification: accuracy and robustness issues. IEEE Circuits Syst. Mag. 11, 23–61 (2011)

    Article  Google Scholar 

  7. Pawar, R.V., Jalnekar, R.M., Chitode, J.S.: Review of various stages in speaker recognition system, performance measures and recognition toolkits. Analog Integr. Circ. Sig. Process 94(2), 247–257 (2017). https://doi.org/10.1007/s10470-017-1069-1

    Article  Google Scholar 

  8. Chaudhary, G., Srivastava, S., Bhardwaj, S.: Feature extraction methods for speaker recognition: a review. Int. J. Pattern Recognit. Artif. Intell. 31, 1750041 (2017)

    Article  Google Scholar 

  9. Lotia, P., Khan, M.R.: A review of various score normalization techniques for speaker identification system. Int. J. Adv. Eng. Technol. 3, 650–667 (2012)

    Google Scholar 

  10. Khalid, L.F., Abdulazeez, A.M.: Identifying speakers using deep learning: a review. Int. J. Sci. Bus. 5, 15–26 (2021)

    Google Scholar 

  11. Miao, X., Li, Y., Wen, M., Liu, Y., Julian, I.N., Guo, H.: Fusing features of speech for depression classification based on higher-order spectral analysis. Speech Commun. 143, 46–56 (2022)

    Article  Google Scholar 

  12. Simić, N., Suzić, S., Nosek, T., et al.: Speaker recognition using constrained convolutional neural networks in emotional speech. Entropy 24, 1–17 (2022)

    Article  MathSciNet  Google Scholar 

  13. Shahin, I., Nassif, A.B.: Emirati-accented speaker identification in stressful talking conditions. In: Int. Conf. Electr. Comput. Technol. Appl. ICECTA, pp. 1–6. IEEE Press, Ras Al Khaimah (2019)

    Google Scholar 

  14. Al Hindawi, N.A., Shahin, I., Nassif, A.B.: Speaker identification for disguised voices based on modified SVM classifier. In: Int. Multi-Conference Syst. Signals Devices, SSD, pp. 687–691. IEEE Press, Monastir (2021)

    Google Scholar 

  15. Ge, Z., Iyer, A.N., Cheluvaraja, S., Sundaram, R., Ganapathiraju, A.: Neural network based speaker classification and verification systems with enhanced features. In: Intelligent Systems Conference (Intel-liSys), pp. 1089–1094. IEEE Press, London, UK (2017)

    Google Scholar 

  16. Ozcan, Z., Kayikcioglu, T.: A speaker identification performance comparison based on the classifier, the computation time and the number of MFCC. In: 25th Signal Processing and Communications Applications Conference (SIU), pp. 1–4. IEEE Press, Antalya (2017)

    Google Scholar 

  17. AboElenein, N.M., Amin, K.M., Ibrahim, M., Hadhoud, M.M.: Improved text-independent speaker identification system for real time applications. In: Proc. 4th Int. Japan-Egypt Conf. Electron. Commun. Comput. JEC-ECC, pp. 58–62. IEEE Press, Cairo (2016)

    Google Scholar 

  18. Ye, F., Yang, J.: A deep neural network model for speaker identification. Appl. Sci. 11, 1–18 (2021)

    Article  Google Scholar 

  19. Liu, Z., Wu, Z., Li, T., Li, J., Shen, C.: GMM and CNN hybrid method for short utterance speaker recognition. IEEE Trans. Industr. Inf. 14, 3244–3252 (2018)

    Article  Google Scholar 

  20. Alsulaiman, M., Mahmood, A., Muhammad, G.: Speaker recognition based on Arabic phonemes. Speech Commun. 86, 42–51 (2017)

    Article  Google Scholar 

  21. Chakroun, R., Zouari, L.B., Frikha, M., Hamida, A.B.: A novel approach based on Support Vector Machines for automatic speaker identification. In: 12th International Conference of Computer Systems and Applications (AICCSA), pp. 1–5. IEEE Press, Marrakech (2015)

    Google Scholar 

  22. AbuAladas, F.E., Zeki, A.M., Al-Ani, M.S., Messikh, A.E.: Speaker identification based on curvlet transform technique. In: International Conference on Computing, Engineering, and Design (ICCED), pp. 1–4. IEEE Press, Kuala Lumpur (2017)

    Google Scholar 

  23. Tiwari, V.: MFCC and its applications in speaker recognition. Int. J. Emerg. Technol. 1, 19–22 (2010)

    Google Scholar 

  24. Prototyping Model: https://searchcio.techtarget.com/definition/Prototyping-Model

  25. Weitzenfeld, A., Guardati, S.: Ingeniería de software: el proceso para el desarrollo de software. In: Introducción a la Computación, pp. 355–396. Cengage Learning (2007)

    Google Scholar 

  26. Sommerville, I.: Software engineering. Pearson, México (2011)

    Google Scholar 

  27. Comendador, B., Rabago, L., Tanguilig, B.: An educational model based on knowledge discovery in databases (KDD) to predict learner’s behavior using classification techniques. In: ICSPCC2016. IEEE Press (2016)

    Google Scholar 

  28. Singh, D., Singh, B.: Investigating the impact of data normalization on classification performance. Appl. Soft Comput. 97, 105524 (2020)

    Article  Google Scholar 

  29. Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)

    Article  Google Scholar 

  30. Han, S.H., Kim, K.W., Kim, S., Youn, Y.C.: Artificial neural network: understanding the basic concepts without mathematics. Dement. Neurocognitive Disord. 17, 83–89 (2018)

    Article  Google Scholar 

  31. Zhou, I., et al.: Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020)

    Article  Google Scholar 

  32. Kingma, D.P., Ba, L.J.: Adam: A method for stochastic optimization. In: 3rd Int. Conf. Learn. Represent. ICLR, pp. 1–15. arXiv.org, Ithaca (2015)

    Google Scholar 

  33. LaValley, M.P.: Logistic regression. Circulation 117, 2395–2399 (2008)

    Google Scholar 

  34. Ranganathan, P., Pramesh, C.S., Aggarwal, R.: Common pitfalls in statistical analysis: logistic regression. Persp. Clin. Res. 8, 148–151 (2017)

    Google Scholar 

  35. Guptaa, P., Garg, S.: Breast cancer prediction using varying parameters of machine learning models. Proc. Comput. Sci. 172, 593–601 (2020)

    Article  Google Scholar 

  36. Goldberger, J., Roweis, S., Hinton, G., Salakhutdinov, R.: Neighbourhood components analysis. In: Adv in Neural Information Processing Systems. MIT Press, Vancouver (2004)

    Google Scholar 

  37. Zhang, D.: Methods and rules of voting and decision: a literature review. Open J. Soc. Sci. 8, 60–72 (2020)

    Google Scholar 

  38. Pillai, S.K., Raghuwanshi, M.M., Gaikwad, M.: Hyperparameter tuning and optimization in machine learning for species identification system. In: Dutta, M., Rama Krishna, C., Kumar, R., Kalra, M. (eds.) Proceedings of International Conference on IoT Inclusive Life (ICIIL 2019), NITTTR Chandigarh, India. LNNS, vol. 116, pp. 235–241. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-3020-3_22

    Chapter  Google Scholar 

  39. Wu, J., Chen, X.Y., Zhang, H., et al.: Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 17, 26–40 (2019)

    Google Scholar 

  40. Wong, P.Y.T.: Reliable accuracy estimates from k-fold cross validation. IEEE Trans. Knowl. Data Eng. 32, 1586–1594 (2020)

    Article  Google Scholar 

  41. Raschka, S.: Model evaluation, model selection, and algorithm selection in machine learning. Arxiv, pp. 1–49 (2020)

    Google Scholar 

  42. Grandini, M., Bagli, E., Visani, G.: Metrics for multi-class classification: an overview. Arxiv, pp. 1–17 (2020)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Aldonso Becerra-Sánchez .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Rodarte-Rodríguez, A. et al. (2023). Speaker Identification in Noisy Environments for Forensic Purposes. In: Mejia, J., Muñoz, M., Rocha, Á., Hernández-Nava, V. (eds) New Perspectives in Software Engineering. CIMPS 2022. Lecture Notes in Networks and Systems, vol 576. Springer, Cham. https://doi.org/10.1007/978-3-031-20322-0_21

Download citation

Publish with us

Policies and ethics