Speaker Identification in Noisy Environments for Forensic Purposes

Rodarte-Rodríguez, Armando; Becerra-Sánchez, Aldonso; De La Rosa-Vargas, José I.; Escalante-García, Nivia I.; Olvera-González, José E.; de J. Velásquez-Martínez, Emmanuel; Zepeda-Valles, Gustavo

doi:10.1007/978-3-031-20322-0_21


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 576)

Included in the following conference series:

International Conference on Software Process Improvement


Abstract

Speech is a biological or physical feature unique to each person and is widely used in speaker identification tasks such as access control, transaction authentication, and home automation. The aim of this research is to propose a connected-words speaker recognition scheme, based on a closed-set speaker-independent voice corpus in noisy environments, that can be applied in contexts such as forensics. Following a knowledge discovery in databases (KDD) process, Mel-frequency cepstral coefficients (MFCCs) were used as the filtering technique to extract speech features from 158 speakers for the subsequent speaker identification step. The paper presents a performance comparison of ANN, KNN, and logistic regression models, which obtained F1 scores of 98%, 98.32%, and 97.75%, respectively. The results show that schemes such as KNN and ANN can achieve similar performance on full voice files when the proposed KDD framework is applied, yielding robust models suitable for forensic environments.
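As a rough illustration of the kind of pipeline the abstract describes, the following sketch classifies per-frame feature vectors with KNN, assigns each voice file to the speaker with the most frame votes, and scores the result with a macro-averaged F1. It is a toy under stated assumptions, not the authors' implementation: the two-dimensional vectors stand in for real MFCC frames, the majority vote over frames and the macro averaging are assumed rather than taken from the paper, and the names `knn_predict`, `identify_speaker`, and `macro_f1` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label one feature vector by majority vote among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def identify_speaker(train, frames, k=3):
    """Classify every frame of a file, then pick the speaker with the most frame votes.

    Assumption: file-level decisions come from a simple majority vote over frames.
    """
    votes = Counter(knn_predict(train, frame, k) for frame in frames)
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy stand-ins for MFCC frames: (feature_vector, speaker_label).
train = [
    ((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
    ((1.0, 1.0), "B"), ((0.9, 1.1), "B"), ((1.1, 0.9), "B"),
]

# One hypothetical test file with three frames, one of them noisy.
test_file = [(0.05, 0.1), (0.15, 0.05), (0.9, 0.9)]
predicted = identify_speaker(train, test_file)
```

In a real system the frames would come from an MFCC extractor applied to each recording, and the closed-set assumption means every test file belongs to one of the enrolled speakers.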



Author information

Authors and Affiliations

Universidad Autónoma de Zacatecas, Campus Siglo XXI, Carr. Zacatecas-Guadalajara Km. 6, Ejido “La Escondida”, 98160, Zacatecas, Mexico
Armando Rodarte-Rodríguez, Aldonso Becerra-Sánchez, José I. De La Rosa-Vargas, Emmanuel de J. Velásquez-Martínez & Gustavo Zepeda-Valles
Tecnológico Nacional de México Campus Pabellón de Arteaga, Carretera a la Estación de Rincón Km. 1, Pabellón de Arteaga, 20670, Aguascalientes, Mexico
Nivia I. Escalante-García & José E. Olvera-González


Corresponding author

Correspondence to Aldonso Becerra-Sánchez.

Editor information

Editors and Affiliations

Centro de Investigación en Matemáticas, A.C., Unidad Zacatecas, Zacatecas, Mexico
Jezreel Mejia
Centro de Investigación en Matemáticas, A.C., Unidad Zacatecas, Zacatecas, Mexico
Mirna Muñoz
ISEG, Universidade de Lisboa, Lisbon, Portugal
Álvaro Rocha
Universidad Hipócrates, Acapulco de Juárez, Guerrero, Mexico
Víctor Hernández-Nava


About this paper

Cite this paper

Rodarte-Rodríguez, A. et al. (2023). Speaker Identification in Noisy Environments for Forensic Purposes. In: Mejia, J., Muñoz, M., Rocha, Á., Hernández-Nava, V. (eds) New Perspectives in Software Engineering. CIMPS 2022. Lecture Notes in Networks and Systems, vol 576. Springer, Cham. https://doi.org/10.1007/978-3-031-20322-0_21


DOI: https://doi.org/10.1007/978-3-031-20322-0_21
Published: 30 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20321-3
Online ISBN: 978-3-031-20322-0
eBook Packages: Intelligent Technologies and Robotics (R0)
