Feature classification criterion for missing features mask estimation in robust speaker recognition

Ribas González, Dayana; Calvo de Lara, José Ramón

doi:10.1007/s11760-012-0299-z


Original Paper
Published: 20 March 2012

Volume 8, pages 365–375, (2014)

Signal, Image and Video Processing

Abstract

Many speaker recognition applications must handle speech corrupted by additive environmental noise without a priori knowledge of the noise characteristics. Some previous works in speaker recognition have used the missing feature (MF) approach to compensate for noise. In most of those applications, the spectral reliability decision is made using the signal-to-noise ratio (SNR) criterion, which attempts to directly measure the relative signal and noise energy at each frequency. An alternative approach to spectral data reliability has been used with some success in the MF approach to speech recognition. Here, we compare this newer criterion with the SNR criterion for MF mask estimation in speaker recognition. The new reliability decision is based on the extraction and analysis of several spectro-temporal features, computed across the entire speech frame rather than across time, which highlight the differences between spectral regions dominated by speech and those dominated by noise. We call it the feature classification (FC) criterion. It uses several spectral features to establish spectrogram reliability, unlike the SNR criterion, which relies on only one feature: the SNR. We evaluated our proposal through speaker verification experiments on the Ahumada speech database corrupted by different types of noise at various SNR levels. The experiments showed that the FC criterion achieves considerably better recognition accuracy than the SNR criterion in the speaker verification tasks tested.
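As an illustration of the baseline the abstract describes, the following is a minimal sketch of an SNR-criterion reliability mask. It is not the authors' implementation: the function name `snr_mask`, the use of a fixed per-bin noise power estimate, the spectral-subtraction estimate of speech power, and the 0 dB threshold are all assumptions made for this example. The FC criterion is not sketched because the abstract does not specify its features or classifier.

```python
import math


def snr_mask(noisy_power, noise_power, threshold_db=0.0):
    """Return a binary mask (1 = reliable, 0 = unreliable) per time-frequency cell.

    noisy_power: list of frames, each a list of power values per frequency bin.
    noise_power: assumed noise power estimate per frequency bin (hypothetical input).
    threshold_db: assumed local SNR threshold in dB.
    """
    mask = []
    for frame in noisy_power:
        row = []
        for y, n in zip(frame, noise_power):
            if n <= 0.0:
                # No estimated noise in this bin: reliable if any energy is present.
                row.append(1 if y > 0.0 else 0)
                continue
            # Spectral-subtraction estimate of the speech power, floored at zero.
            speech = max(y - n, 0.0)
            if speech == 0.0:
                row.append(0)
                continue
            local_snr_db = 10.0 * math.log10(speech / n)
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask


# Two frames, three frequency bins, with made-up power values.
example_mask = snr_mask(
    [[10.0, 1.2, 4.5], [0.5, 8.0, 2.0]],
    [1.0, 1.0, 2.0],
)
```

In this sketch each cell is judged from a single quantity, the estimated local SNR, which is the limitation the FC criterion is said to address by using several spectral features.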



Author information

Authors and Affiliations

Advanced Technologies Application Center (CENATAV), 7a ave. 21812 Siboney, Playa, 12200, Havana, Cuba
Dayana Ribas González & José Ramón Calvo de Lara

Corresponding author

Correspondence to Dayana Ribas González.

About this article

Cite this article

Ribas González, D., Calvo de Lara, J.R. Feature classification criterion for missing features mask estimation in robust speaker recognition. SIViP 8, 365–375 (2014). https://doi.org/10.1007/s11760-012-0299-z

Received: 15 December 2010
Revised: 22 February 2012
Accepted: 27 February 2012
Published: 20 March 2012
Issue Date: February 2014
DOI: https://doi.org/10.1007/s11760-012-0299-z
