Abstract
In speech communication systems, such as voice-controlled systems, hands-free mobile telephones and hearing aids, the received microphone signals are degraded by room reverberation, ambient noise and other interferences. This signal degradation can decrease the fidelity and intelligibility of speech and the word recognition rate of automatic speech recognition systems.
The reverberation process is often described using deterministic models that depend on a large number of unknown parameters. These parameters are often difficult to estimate blindly and depend on the exact spatial positions of the source and receiver. In recently emerged speech dereverberation methods, which are feasible in practice, the reverberation process is instead described using a statistical model. This model depends on a smaller number of parameters, such as the reverberation time of the enclosure, which can be assumed to be independent of the spatial location of the source and receiver. The model can be used to estimate the spectral variance of part of the reverberant signal component. Together with an estimate of the spectral variance of the ambient noise, this estimate can then be used to enhance the observed noisy and reverberant speech.
In this chapter we provide a brief overview of dereverberation methods. We then describe single and multiple microphone algorithms that are able to jointly suppress reverberation and ambient noise. Finally, experimental results demonstrate the beneficial use of the algorithms developed.
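As a rough illustration of the idea summarized above, the following minimal sketch assumes an exponentially decaying reverberation model governed only by the reverberation time. Under that assumption, the spectral variance of the late reverberant component is approximated by a delayed and attenuated copy of the spectral variance of the reverberant signal, and a simple spectral-subtraction gain then suppresses late reverberation and noise together. The function names, the 50 ms late-reverberation delay, and the gain floor are illustrative assumptions, not the chapter's actual algorithms.

```python
import numpy as np

def late_reverb_variance(psd, t60, hop_s, t_late=0.05):
    """Estimate the late-reverberant spectral variance (hypothetical helper).

    psd    : array (frames, bins), spectral variance of the reverberant signal
    t60    : reverberation time in seconds (assumed known or estimated)
    hop_s  : frame hop in seconds
    t_late : assumed start of the late reverberation, in seconds
    """
    delta = 3.0 * np.log(10.0) / t60               # decay rate implied by T60
    n_late = max(1, int(round(t_late / hop_s)))    # delay expressed in frames
    est = np.zeros_like(psd, dtype=float)
    # Energy decays as exp(-2 * delta * t); apply that decay over the delay.
    est[n_late:] = np.exp(-2.0 * delta * n_late * hop_s) * psd[:-n_late]
    return est

def suppression_gain(psd_noisy, psd_late, psd_noise, g_min=0.1):
    """Amplitude gain from power spectral subtraction, floored at g_min."""
    ratio = (psd_late + psd_noise) / np.maximum(psd_noisy, 1e-12)
    return np.sqrt(np.maximum(1.0 - ratio, g_min ** 2))
```

Multiplying the short-time spectrum of the observed signal by this gain gives an enhanced spectrum. More refined estimators and gain functions replace both steps in practice; the sketch only shows how a single parameter, the reverberation time, can drive the late-reverberation estimate.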
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Habets, E. (2010). Speech Dereverberation Using Statistical Reverberation Models. In: Naylor, P., Gaubitch, N. (eds) Speech Dereverberation. Signals and Communication Technology. Springer, London. https://doi.org/10.1007/978-1-84996-056-4_3
DOI: https://doi.org/10.1007/978-1-84996-056-4_3
Publisher Name: Springer, London
Print ISBN: 978-1-84996-055-7
Online ISBN: 978-1-84996-056-4
eBook Packages: Engineering, Engineering (R0)