Abstract
Common statistical estimators for speech enhancement rely on several assumptions about the statistical properties of speech and noise processes. In real applications, these assumptions may not always be satisfied because of the effects of a nonstationary environment. In this work, we propose new robust spectral estimators for speech enhancement by incorporating rank-order statistics into existing speech enhancement estimators. The proposed estimators are better adapted to the nonstationary characteristics of speech signals and noise processes in real environments. By means of computer simulations, we show that the proposed estimators outperform the known estimators in terms of objective quality criteria.
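As a rough illustration of the general idea of combining rank-order statistics with a spectral gain, the following minimal sketch estimates the noise power in each frequency bin as a quantile (here the median) over a sliding window of neighbouring frames and feeds it into a Wiener-type gain. It is not the authors' estimator: the function names, the window length, the chosen rank, the gain floor, and the crude SNR estimate are all assumptions made for this example.

```python
import numpy as np

def rank_order_noise_psd(power_spec, window=9, rank=0.5):
    """Hypothetical helper: per-bin noise power taken as a rank-order
    statistic (quantile) over a sliding window of neighbouring frames.

    power_spec: array of shape (n_frames, n_bins) with |STFT|^2 values.
    window and rank are illustrative assumptions, not values from the paper.
    """
    n_frames = power_spec.shape[0]
    half = window // 2
    noise = np.empty_like(power_spec, dtype=float)
    for t in range(n_frames):
        # The window is clipped at the first and last frames.
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        noise[t] = np.quantile(power_spec[lo:hi], rank, axis=0)
    return noise

def wiener_gain(power_spec, noise_psd, floor=0.05):
    """Wiener-type gain from a crude SNR estimate, limited below by `floor`
    (an assumed value)."""
    snr = np.maximum(power_spec / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)
    return np.maximum(snr / (1.0 + snr), floor)

# Toy input: a flat noise floor with one high-energy frame.
power_spec = np.ones((11, 4))
power_spec[5] = 100.0
noise_psd = rank_order_noise_psd(power_spec, window=5)
gain = wiener_gain(power_spec, noise_psd)
```

In this toy input the median ignores the single high-energy frame, so the noise estimate stays at the floor level, whereas a local mean over the same window would be pulled upward by that frame. This robustness to outliers is the property that motivates rank-order statistics in nonstationary conditions.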
Additional information
Original Russian Text © Y. Sandoval-Ibarra, V.H. Diaz-Ramirez, V.I. Kober, V.N. Karnaukhov, 2015, published in Informatsionnye Protsessy, 2015, Vol. 15, No. 3, pp. 314–323.
About this article
Cite this article
Sandoval-Ibarra, Y., Diaz-Ramirez, V.H., Kober, V.I. et al. Speech enhancement with adaptive spectral estimators. J. Commun. Technol. Electron. 61, 672–678 (2016). https://doi.org/10.1134/S1064226916060218
DOI: https://doi.org/10.1134/S1064226916060218