Abstract
Speech enhancement cannot, and should not, be examined solely through the lens of time-frequency analysis. Approaching the problem from different perspectives, or incorporating knowledge from other fields, expands the options available when developing a speech enhancement system. Using multiple microphones at different locations makes it possible to develop more sophisticated source separation and dereverberation technologies, which enable man-made systems to extract a speech signal of interest in a noisy environment with competing speech and/or noise sources. Human beings and many other creatures accomplish this with little effort, a phenomenon referred to as the cocktail party effect. However, separating and dereverberating speech signals in reverberant environments is a very difficult problem, and state-of-the-art algorithms remain unsatisfactory. The challenge lies in the coexistence, in the observed microphone signals, of spatial interference from competing sources and temporal echoes due to room reverberation. Focusing only on optimizing the signal-to-interference ratio is inadequate for most speech processing systems, in which source separation and speech dereverberation are two fully integrated problems. In this chapter, we study these two problems in a unified framework. We show that spatial interference and temporal reverberation can be decoupled: a SIMO (single-input multiple-output) system driven by the speech signal of interest is extracted from the overall MIMO (multiple-input multiple-output) system. This interference-free SIMO system is then dereverberated using the MINT (multiple-input/output inverse theorem). The two-stage procedure leads to a novel sequential source separation and speech dereverberation algorithm based on blind multichannel identification. Simulations with impulse responses measured in the varechoic chamber at Bell Labs verify the proposed algorithm.
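The second stage described above, MINT-based dereverberation of a SIMO system, amounts to finding equalizer filters g_i such that the sum of the convolutions h_i * g_i equals a unit impulse, which is exactly solvable when the channel polynomials share no common zeros. The following is a minimal NumPy sketch of that least-squares formulation; the function names, channel coefficients, and filter lengths are illustrative assumptions, not the chapter's actual implementation.

```python
import numpy as np

def conv_matrix(h, Lg):
    """Build the Sylvester (convolution) matrix H so that H @ g == np.convolve(h, g)."""
    Lh = len(h)
    H = np.zeros((Lh + Lg - 1, Lg))
    for j in range(Lg):
        H[j:j + Lh, j] = h
    return H

def mint_inverse(channels, Lg):
    """Solve sum_i h_i * g_i = delta in the least-squares sense (MINT-style inversion)."""
    H = np.hstack([conv_matrix(h, Lg) for h in channels])
    d = np.zeros(H.shape[0])
    d[0] = 1.0  # target response: a unit impulse (perfect dereverberation)
    g = np.linalg.lstsq(H, d, rcond=None)[0]
    return np.split(g, len(channels))

# Toy 2-channel SIMO example with short, coprime channel impulse responses
h1 = np.array([1.0, 0.5, 0.25])
h2 = np.array([1.0, -0.4, 0.1])
g1, g2 = mint_inverse([h1, h2], Lg=2)

# The equalized overall response should be (close to) a unit impulse
eq = np.convolve(h1, g1) + np.convolve(h2, g2)
```

Note that with two channels of length L, filters of length L-1 make the stacked system square, so the inversion is exact whenever the channels are coprime; with a single channel, by contrast, an FIR inverse of a non-minimum-phase room response does not exist, which is the point of MINT.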
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Huang, Y., Benesty, J., Chen, J. (2005). Separation and Dereverberation of Speech Signals with Multiple Microphones. In: Speech Enhancement. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27489-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24039-6
Online ISBN: 978-3-540-27489-6
eBook Packages: Engineering (R0)