Abstract
Frequency-domain ICA has been used successfully to separate the utterances of interfering speakers in convolutive environments (see, e.g., [6], [7]). Separation can be improved further by applying a time-frequency mask to the ICA outputs. Once direction-of-arrival information has been used for permutation correction, the time-frequency mask can be obtained with little computational effort. The proposed postprocessing is applied in conjunction with two frequency-domain ICA methods and a beamforming algorithm, and it increases separation performance by an average of 3.8 dB for both reverberant and in-car speech recordings. With combined ICA and time-frequency masking, SNR improvements of up to 15 dB are obtained in the car environment. Because it is robust both to the environment and to the choice of ICA algorithm, time-frequency masking appears to be a good choice for enhancing the output of convolutive ICA algorithms at marginal computational cost.
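As a rough illustration of the masking idea, the sketch below applies a simple binary mask to a set of permutation-corrected separation outputs: each time-frequency bin is kept only in the output where its magnitude is largest. The abstract does not state how the mask is actually computed, so this dominance rule is an assumption, and the function name `apply_binary_tf_mask`, the `floor` parameter, and the sample values are hypothetical.

```python
from typing import List

Spectrogram = List[List[complex]]  # indexed as [frame][frequency bin]


def apply_binary_tf_mask(outputs: List[Spectrogram],
                         floor: float = 0.0) -> List[Spectrogram]:
    """Keep each time-frequency bin only in the output where it is strongest.

    Hypothetical illustration, not the method from the paper.
    `floor` is an assumed gain for the non-dominant outputs
    (0.0 gives a hard binary mask).
    """
    n_frames = len(outputs[0])
    n_bins = len(outputs[0][0])
    masked = [[[0j] * n_bins for _ in range(n_frames)] for _ in outputs]
    for t in range(n_frames):
        for f in range(n_bins):
            # Compare the magnitudes of all outputs in this bin.
            magnitudes = [abs(y[t][f]) for y in outputs]
            dominant = magnitudes.index(max(magnitudes))
            for k, y in enumerate(outputs):
                gain = 1.0 if k == dominant else floor
                masked[k][t][f] = gain * y[t][f]
    return masked


# Two outputs with 2 frames x 3 bins each (made-up values).
y1 = [[3 + 0j, 0.1j, 1 + 1j], [0.2 + 0j, 2j, 0.5 + 0j]]
y2 = [[0.5 + 0j, 2 + 0j, 0.1 + 0j], [1 + 0j, 0.3j, 0.4 + 0j]]
m1, m2 = apply_binary_tf_mask([y1, y2])
```

The mask itself needs only one magnitude comparison per bin, which is consistent with the abstract's remark that the postprocessing adds little computational effort. A nonzero `floor` would attenuate the non-dominant bins instead of zeroing them.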
References
[1] Balan, R., Rosca, J., Rickard, S.: Robustness of Parametric Source Demixing in Echoic Environments. In: Proc. Int. Workshop on Independent Component Analysis and Blind Signal Separation, San Diego, California, pp. 144–149 (2001)
[2] Baumann, W., Kolossa, D., Orglmeister, R.: Beamforming-based convolutive source separation. In: Proceedings ICASSP 2003, vol. 5, pp. 357–360 (2003)
[3] Baumann, W., Kolossa, D., Orglmeister, R.: Maximum Likelihood Permutation Correction for Convolutive Source Separation. In: Proc. Int. Workshop on Independent Component Analysis and Blind Signal Separation, Nara, Japan, pp. 373–378 (2003)
[4] Available at http://www2.ele.tue.nl/ica99/
[5] Cardoso, J.-F.: High order contrasts for independent component analysis. Neural Computation 11, 157–192 (1999)
[6] Kurita, S., Saruwatari, H., Kajita, S., Takeda, K., Itakura, F.: Evaluation of blind signal separation method using directivity pattern under reverberant conditions. In: Proceedings ICASSP ’00, vol. 5, pp. 3140–3143 (2000)
[7] Parra, L., Alvino, C.: Geometric Source Separation: Merging convolutive source separation with geometric beamforming. IEEE Trans. on Speech and Audio Processing 10(6), 352–362 (2002)
[8] Schobben, D., Torkkola, K., Smaragdis, P.: Evaluation of Blind Signal Separation. In: Proc. Int. Workshop on Independent Component Analysis and Blind Signal Separation, Aussois, France (1999)
[9] TIDigits Speech Database: Studio Quality Speaker-Independent Connected-Digit Corpus. Readme file on CD-ROM. See also http://morph.ldc.upenn.edu/Catalog/LDC93S10.html
[10] Yilmaz, Ö., Rickard, S.: Blind Separation of Speech Mixtures via Time-Frequency Masking. Submitted to IEEE Transactions on Signal Processing (2003)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kolossa, D., Orglmeister, R. (2004). Nonlinear Postprocessing for Blind Speech Separation. In: Puntonet, C.G., Prieto, A. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2004. Lecture Notes in Computer Science, vol 3195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30110-3_105
DOI: https://doi.org/10.1007/978-3-540-30110-3_105
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23056-4
Online ISBN: 978-3-540-30110-3
eBook Packages: Springer Book Archive