An Efficient Feature Fusion Technique for Text-Independent Speaker Identification and Verification

Bansal, Savina; Bansal, R. K.; Sharma, Yashender

doi:10.1007/978-981-16-8403-6_56

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 106)

Abstract

Speaker identification and verification is an important research area with applications in forensic voice verification, mobile banking, and security authentication for access control. Various feature extraction techniques are available in the literature. In this work, a speech feature extraction technique based on the fusion of time-domain, frequency-domain, and cepstral-domain features is proposed. Supervised machine learning classification algorithms are used for speaker feature classification. The performance of the proposed technique is evaluated on two open-source speech datasets. The performance metrics, training time and accuracy (validation and test), are measured with the help of a confusion matrix. The results indicate that, even with smaller training datasets, the average accuracy achieved is 2.97% and 8.97% better, and the training time is 1.95 s and 2.03 s lower, compared with MFCC and MFCC + Δ + Δ² (MFCC with delta and delta-delta coefficients), respectively.
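The abstract does not list the individual features that are fused, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes NumPy is available, picks short-time energy and zero-crossing rate as time-domain features, the spectral centroid as a frequency-domain feature, and low-order real cepstrum coefficients as a simple stand-in for cepstral features such as MFCC. The function names `frame_features` and `accuracy_from_confusion` and the parameter `n_cepstral` are hypothetical.

```python
import numpy as np


def frame_features(frame, sample_rate, n_cepstral=4):
    """Return one fused feature vector for a single speech frame.

    The feature choice is an assumption made for illustration only.
    """
    frame = np.asarray(frame, dtype=float)

    # Time domain: short-time energy and zero-crossing rate.
    energy = float(np.mean(frame ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))

    # Frequency domain: spectral centroid of the windowed magnitude spectrum.
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    centroid = float(np.sum(freqs * magnitude) / (np.sum(magnitude) + 1e-12))

    # Cepstral domain: first coefficients of the real cepstrum
    # (inverse transform of the log magnitude spectrum).
    cepstrum = np.fft.irfft(np.log(magnitude + 1e-12))
    cepstral = cepstrum[:n_cepstral]

    # Fusion here is plain concatenation into one vector.
    return np.concatenate(([energy, zcr, centroid], cepstral))


def accuracy_from_confusion(confusion):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    confusion = np.asarray(confusion, dtype=float)
    return float(np.trace(confusion) / confusion.sum())


if __name__ == "__main__":
    # Synthetic 440 Hz tone standing in for a speech frame.
    sample_rate = 8000
    t = np.arange(256) / sample_rate
    fused = frame_features(np.sin(2 * np.pi * 440 * t), sample_rate)
    print("fused feature vector:", fused)
    print("accuracy:", accuracy_from_confusion([[8, 2], [1, 9]]))
```

In a full pipeline, vectors like these would be computed per frame, aggregated per utterance, and passed to a supervised classifier; the confusion matrix of its predictions then yields the accuracy figure. Concatenation is the simplest form of feature-level fusion, and the chapter may combine or select features differently.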

Author information

Authors and Affiliations

Department of ECE, GZSCCET, MRSPTU Bathinda, Bathinda, Punjab, 151001, India
Savina Bansal, R. K. Bansal & Yashender Sharma

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, Kurukshetra, India
Pankaj Verma
Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, Kurukshetra, India
Chhagan Charan
Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada
Xavier Fernando
Department of Electrical and Computer Engineering, Oakland University, Rochester, MI, USA
Subramaniam Ganesan

About this paper

Cite this paper

Bansal, S., Bansal, R.K., Sharma, Y. (2022). An Efficient Feature Fusion Technique for Text-Independent Speaker Identification and Verification. In: Verma, P., Charan, C., Fernando, X., Ganesan, S. (eds) Advances in Data Computing, Communication and Security. Lecture Notes on Data Engineering and Communications Technologies, vol 106. Springer, Singapore. https://doi.org/10.1007/978-981-16-8403-6_56

DOI: https://doi.org/10.1007/978-981-16-8403-6_56
Published: 29 March 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8402-9
Online ISBN: 978-981-16-8403-6
eBook Packages: Intelligent Technologies and Robotics (R0)
