Musical Noise Reduction Capability of Various Speech Enhancement Algorithms

Saxena, Prateek; Gupta, V. K.; Chandra, Mahesh

doi:10.1007/978-81-322-2752-6_68


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 434)


Abstract

This paper presents a comparative analysis of spectral subtraction and Wiener denoising techniques for musical noise reduction. Among the methods compared, iterative spectral subtraction generates the least musical noise across the noisy environments tested. Musical noise generation is traced by observing the change in the kurtosis ratio of the noise spectrum under different denoising techniques for different noisy signals. A MATLAB simulation is performed for four noisy environments (car noise, babble noise, operation room noise and machine gun noise) at −10, −5, 0, 5 and 10 dB input SNR levels. Wiener-based methods provide more improvement in SNR than spectral-subtraction-based methods, but they also generate more musical noise. The Wiener-based method HRNR gives a maximum improvement of 35.77 dB in SNR for car noise at −10 dB input SNR. Iterative spectral subtraction gives the minimum kurtosis ratio for all noises at all input SNR levels.


1 Introduction

Speech quality deteriorates under adverse noise conditions in hearing aids and mobile phones, so noise reduction has received considerable attention from researchers. Spectral subtraction is a common and efficient method for noise reduction. It offers both effective noise reduction and low computational complexity [1–5]. Wiener-based denoising techniques such as TSNR (two-step noise reduction) and HRNR (harmonic regeneration noise reduction), in turn, effectively remove the reverberation effect. Classic short-time noise reduction techniques remove noise but also introduce harmonic distortion. For example, TSNR enhances speech but introduces harmonic distortion because its estimators are unreliable at low signal-to-noise ratios [6]. HRNR brings a significant improvement over TSNR. However, the major disadvantage of these methods is the generation of musical noise due to non-linear signal processing, which significantly degrades speech quality and intelligibility. This paper extends the paper titled “Comparative analysis of speech enhancement methods” [7] with a comparison of the musical noise reduction capability of various speech enhancement methods.

The spectral subtraction method [7] provides efficient noise reduction with low musical noise. The amount of musical noise generated is strongly correlated with the difference between the higher-order statistics of the power spectra before and after nonlinear signal processing [8, 9]. Here, the amounts of musical noise generated by spectral subtraction [10], iterative spectral subtraction [11–14], the geometric approach to spectral subtraction [15], and the Wiener-based denoising techniques TSNR and HRNR [16, 17] are compared.

2 Mathematical Analysis of Musical Noise Generation via Higher-Order Statistics

The amount of musical noise generated is strongly correlated with the number of isolated power spectral components and with the isolation level of these components [18]. A higher-order statistic, kurtosis, is adopted to measure these isolated components among all components. A higher value of kurtosis signifies a signal with many isolated components. However, kurtosis alone is not sufficient to measure the amount of musical noise generated, because the observed signal may already contain isolated components. The change in kurtosis between the signals before and after signal processing is therefore used to identify only the musical-noise components. The kurtosis ratio, used as the measure of musical noise, is defined as

$$ kurtosis\,ratio = \frac{{kurt_{proc} }}{{kurt_{org} }} $$

(1)

where kurt_proc is the kurtosis of the processed signal and kurt_org is the kurtosis of the observed signal. The kurtosis ratio increases as the amount of musical noise increases.
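Eq. (1) can be illustrated with a minimal Python sketch. It assumes the conventional definition of kurtosis (fourth central moment divided by the squared variance), which the text does not spell out, and the two lists of power-spectral values are hypothetical numbers chosen only to show the effect of isolated peaks.

```python
def kurtosis(values):
    """Kurtosis as fourth central moment over squared variance (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def kurtosis_ratio(processed, observed):
    """Eq. (1): kurtosis of the processed signal over kurtosis of the observed signal."""
    return kurtosis(processed) / kurtosis(observed)


# Hypothetical power-spectral values of a noise-only segment before processing.
observed = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7]
# The same segment after processing: most bins are suppressed, two isolated peaks remain.
processed = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 1.1, 0.1]

ratio = kurtosis_ratio(processed, observed)
print(f"kurtosis ratio = {ratio:.3f}")
```

In this toy example the isolated peaks left after processing raise the kurtosis, so the ratio exceeds 1.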

3 Speech Enhancement Algorithms

Two classes of enhancement algorithms are presented: three are spectral-subtraction-based methods and the other two are Wiener-based methods. The noisy speech signal is given by Eq. (2).

$$ y(n) = s(n) + d(n) $$

(2)

where s(n), d(n) and y(n) represent the pure speech signal, uncorrelated additive noise and the degraded speech signal respectively [10].

3.1 Spectral Subtraction

The principle of the spectral subtraction method [10] is to obtain an estimate of the clean signal spectrum by subtracting an estimated noise spectrum from the corrupted speech signal spectrum. The noise spectrum is estimated and updated during silence periods, when the speech signal is absent and only noise is present. The noise is assumed to be additive and stationary or nearly stationary. Taking the Fourier transform of Eq. (2) gives Eq. (3), where Y[w], X[w] and D[w] are the spectra of the noisy speech, the clean speech and the noise respectively.

$$ Y\left[ w \right] = X\left[ w \right] + D\left[ w \right] $$

(3)

Y[w] can be expressed in terms of its magnitude and phase as follows

$$ Y\left[ w \right] = \left| {Y\left[ w \right]} \right| e^{{j\phi_{y} }} $$

(4)

where |Y[w]| is the magnitude spectrum and $ \phi_{y} $ is the phase spectrum of the corrupted noisy speech signal. The noise signal can be expressed in the transformed domain, using the noisy phase, as follows

$$ D\left[ w \right] = \left| {D\left[ w \right]} \right| e^{{j\phi_{y} }} $$

(5)

The clean speech spectrum is then estimated by subtracting the noise magnitude spectrum from the noisy speech magnitude spectrum and reusing the noisy phase, as given in Eq. (6).

$$ \hat{X}\left[ w \right] = \left[ {\left| {Y\left[ w \right]} \right| - \left| {D\left[ w \right]} \right|} \right] e^{{j\phi_{y} }} $$

(6)

The unknown noise magnitude spectrum |D[w]| is estimated as its average value over periods in which the speech signal is absent. The spectral subtraction method is represented in Fig. 1.
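The steps of Eqs. (3) to (6) can be sketched for a single frame as follows. This is a minimal illustration rather than the implementation used in the simulations: it uses a naive DFT from the Python standard library, omits windowing and overlap-add, and clips negative magnitudes to zero (half-wave rectification), which is an assumption not stated in the text. All function names and the test signals are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def average_noise_magnitude(noise_frames):
    """Estimate |D[w]| as the mean magnitude spectrum over speech-free frames."""
    spectra = [dft(f) for f in noise_frames]
    return [sum(abs(s[k]) for s in spectra) / len(spectra)
            for k in range(len(spectra[0]))]


def spectral_subtract(noisy_frame, noise_mag):
    """Eq. (6): subtract the noise magnitude and keep the noisy phase."""
    enhanced = []
    for y_k, d_k in zip(dft(noisy_frame), noise_mag):
        mag = max(abs(y_k) - d_k, 0.0)  # assumption: negative results are clipped to zero
        enhanced.append(cmath.rect(mag, cmath.phase(y_k)))
    return idft(enhanced)


# Hypothetical 16-sample frame: a sinusoid plus a small alternating disturbance.
N = 16
noise_frames = [[0.1 * (-1) ** t for t in range(N)],
                [0.05 * math.cos(2 * math.pi * 5 * t / N) for t in range(N)]]
frame = [math.sin(2 * math.pi * 2 * t / N) + 0.1 * (-1) ** t for t in range(N)]

noise_mag = average_noise_magnitude(noise_frames)
enhanced = spectral_subtract(frame, noise_mag)
```

Because every spectral magnitude is reduced or left unchanged, the energy of the enhanced frame cannot exceed that of the noisy frame.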

3.2 Iterative Spectral Subtraction

The drawback of spectral subtraction is that distinct narrowband noise components remain in the spectrum even when the noise estimate is accurate. To overcome this drawback for weak signals, spectral subtraction is applied iteratively to the signal, an approach commonly known as iterative spectral subtraction [11–14].
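One possible reading of the iterative scheme is sketched below on magnitude spectra. The cited methods [11–14] differ in their details and the text does not specify them, so everything here is an assumption: in each pass the noise estimate is the mean of the current speech-free frames, the same subtraction is applied to the speech frame and to the speech-free frames, and the residual of the latter becomes the noise estimate for the next pass. The names `beta` and `floor` are hypothetical parameters of this sketch.

```python
def subtract(mag, noise_est, beta=1.0, floor=0.0):
    """One subtraction pass on a magnitude spectrum, with an optional spectral floor."""
    return [max(m - beta * d, floor * m) for m, d in zip(mag, noise_est)]


def mean_spectrum(frames):
    """Bin-wise mean over a list of magnitude spectra."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def iterative_spectral_subtraction(speech_mag, noise_frames,
                                   iterations=3, beta=1.0, floor=0.0):
    """Repeat subtraction, re-estimating the noise from the processed noise frames."""
    current = list(speech_mag)
    residual_frames = [list(f) for f in noise_frames]
    for _ in range(iterations):
        noise_est = mean_spectrum(residual_frames)
        current = subtract(current, noise_est, beta, floor)
        # The processed speech-free frames hold the residual noise for the next pass.
        residual_frames = [subtract(f, noise_est, beta, floor)
                           for f in residual_frames]
    return current


# Hypothetical magnitude spectra with three frequency bins.
noise_frames = [[1.0, 0.5, 1.5], [2.0, 1.5, 0.5]]
speech_mag = [10.0, 1.2, 1.1]

one_pass = iterative_spectral_subtraction(speech_mag, noise_frames, iterations=1)
three_passes = iterative_spectral_subtraction(speech_mag, noise_frames, iterations=3)
```

In this example the residual left in the two weak bins after one pass is removed by the later passes, while the strong bin is only slightly attenuated.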

3.3 Spectral Subtraction Using Geometrical Approach

The geometric approach [15] is used to overcome the problems of the basic spectral subtraction algorithm. This method involves estimating the phase difference between the noisy signal and the noise.

3.4 Decision-Directed (DD) Approach

The Wiener-based methods rely on an estimate of the a priori SNR, which is obtained with the decision-directed (DD) approach proposed by Ephraim and Malah [19]. The main disadvantage of the DD approach is that it introduces a reverberation effect. The Two-Step Noise Reduction (TSNR) technique minimizes this reverberation effect while keeping the benefits of the DD method. However, TSNR introduces harmonic distortion in the enhanced speech because the estimator is unreliable at low SNR. Harmonic Regeneration Noise Reduction (HRNR) is implemented to remove this harmonic distortion [16, 17, 20].
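The DD estimate and the second TSNR step can be sketched per frequency bin as follows. The formulas follow the usual statement of the decision-directed rule [19] and of the two-step refinement [6, 16]; they are not given in this paper, so the sketch should be read as an assumption. The Wiener gain ξ/(1 + ξ) is assumed as the suppression rule, the smoothing factor defaults to 0.98, and the harmonic regeneration step of HRNR is not shown. All function and argument names are hypothetical.

```python
def wiener_gain(xi):
    """Wiener gain for a given a priori SNR estimate."""
    return xi / (1.0 + xi)


def dd_prior_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frequency bin."""
    gamma = noisy_power / noise_power  # a posteriori SNR
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * max(gamma - 1.0, 0.0))


def tsnr_gains(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """Return the DD gain and the gain refined by the second TSNR step."""
    g_dd = wiener_gain(dd_prior_snr(noisy_power, noise_power,
                                    prev_clean_power, alpha))
    # Second step: re-estimate the a priori SNR from the DD-enhanced current frame.
    xi_tsnr = (g_dd ** 2) * noisy_power / noise_power
    return g_dd, wiener_gain(xi_tsnr)


# Hypothetical speech onset: strong current frame, no speech in the previous frame.
g_dd, g_tsnr = tsnr_gains(noisy_power=101.0, noise_power=1.0, prev_clean_power=0.0)
print(f"DD gain = {g_dd:.3f}, TSNR gain = {g_tsnr:.3f}")
```

At a speech onset the DD estimate still leans on the previous, speech-free frame, so its gain stays low; the second step uses the current frame and reacts faster, which is the frame-delay bias that TSNR is designed to remove.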

4 Simulation Results and Discussion

All algorithms are implemented and simulated for speech enhancement in MATLAB. The algorithms are then compared for four different noises: car noise, babble noise, operation room noise and machine gun noise. Sound quality is evaluated and compared on the basis of the improvement in output SNR and of higher-order statistics, by finding the kurtosis ratio before and after signal processing. One sample sentence from our database [21], “YAHA SAI LAGHBAG PANCH MEAL DAKSHIN PASCHIM MAI KATGHAR GAON HAI”, has been used to check performance.

The noisy versions of this sentence were prepared by adding noise from the NOISEX-92 database [22] to the clean sentence at −10, −5, 0, 5 and 10 dB SNR levels. In the spectral subtraction methods, β = 1.1 and η = 0.8 are used, where β and η are the over-subtraction factor and the spectral floor parameter respectively. The residual noise and the perceived musical noise are controlled by the parameter β. A small value of β gives audible musical noise but reduced residual noise, whereas for a large β the residual noise is audible but the musical noise is reduced. The amount of speech spectral distortion is strongly affected by the parameter α. For large α the resulting signal is highly distorted and suffers from poor intelligibility, while for small α noise remains in the enhanced speech signal. In the geometric approach, the parameter α is set to 0.98 and the parameter β to 0.6. Similarly, in the Wiener-based algorithms, α = 0.98 gives the best result for the enhanced speech. Simulation results are shown in Tables 1, 2, 3 and 4 for car noise, babble noise, operation room noise and machine gun noise respectively. Figure 2 shows the average improvement in SNR over all input SNR levels and all noises for each enhancement method.
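The preparation of the noisy signals and the SNR measurement can be sketched as follows. The sketch assumes that the noise is scaled so that the ratio of clean-speech power to noise power over the whole utterance equals the target SNR, and that the output SNR is measured against the clean reference; the paper does not state which SNR definition was used. The signals are short hypothetical sequences, not the database recordings.

```python
import math


def power(signal):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(clean, noise, target_db):
    """Scale the noise so that y(n) = s(n) + g * d(n) has the requested SNR in dB."""
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (target_db / 10)))
    return [s + gain * d for s, d in zip(clean, noise)]


def measure_snr_db(clean, degraded):
    """SNR of a degraded or enhanced signal relative to the clean reference."""
    error = [y - s for s, y in zip(clean, degraded)]
    return 10 * math.log10(power(clean) / power(error))


# Hypothetical stand-ins for the clean sentence and a noise recording.
clean = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
noise = [math.cos(2 * math.pi * 17 * t / 64) * (1 + 0.5 * (-1) ** t) for t in range(64)]

for target in (-10, -5, 0, 5, 10):
    noisy = mix_at_snr(clean, noise, target)
    print(f"target {target:+d} dB, measured {measure_snr_db(clean, noisy):+.2f} dB")
```

The improvement in SNR for a given method is then the output SNR of the enhanced signal minus the input SNR of the noisy signal.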

Table 1 Output SNR and Kurtosis ratio for car noise


Table 2 Output SNR and Kurtosis ratio for babble noise


Table 3 Output SNR and Kurtosis ratio for operation room noise

Table 4 Output SNR and Kurtosis ratio for machine gun noise

Figure 3 shows the average kurtosis ratio over all input SNR levels and all noises for each enhancement method. These figures show that the Wiener-based methods give better results than the basic spectral-subtraction-based methods in terms of the increase in output SNR. Fig. 2 shows that the HRNR algorithm gives the best output SNR among all the algorithms at all input SNR levels for all noises. Iterative spectral subtraction improves the SNR while generating less musical noise than the other methods, as indicated by its lowest kurtosis ratio.

5 Conclusion

Higher-order statistics have been used to analyse musical noise generation in nonlinear noise reduction. The Wiener-based HRNR algorithm provided the best output SNR among all algorithms at all input SNR levels. The higher-order statistics show that iterative spectral subtraction has the lowest kurtosis ratio, that is, it enhances the signal with the least musical noise generation.

References

1. P. C. Loizou, Speech Enhancement Theory and Practice, Boca Raton, FL: CRC, Taylor & Francis Group, 2007.
2. S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction”, IEEE Trans. Acoust., Speech, Signal Process., vol. 27, no. 2, 1979, pp. 113–120.
3. M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of speech corrupted by acoustic noise”, Proc. ICASSP, 1979, pp. 208–211.
4. R. McAulay and M. Malpass, “Speech enhancement using a soft-decision noise suppression filter”, IEEE Trans. Acoust., Speech, Signal Process., vol. 28, no. 2, 1980, pp. 137–145.
5. R. Martin, “Spectral subtraction based on minimum statistics”, Proc. EUSIPCO, 1994, pp. 1182–1185.
6. C. Plapous, C. Marro, and P. Scalart, “Improved Signal-to-Noise Ratio Estimation for Speech Enhancement”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 6, 2006, pp. 2098–2108.
7. P. Goel, P. Saxena, V. K. Gupta, and M. Chandra, “Comparative analysis of speech enhancement methods”, Proc. 10th IEEE Int. Conference on Wireless and Optical Networks, 2013, pp. 1–5.
8. Y. Uemura, Y. Takahashi, H. Saruwatari, K. Shikano, and K. Kondo, “Automatic optimization scheme of spectral subtraction based on musical noise assessment via higher-order statistics”, Proc. of Int. Workshop Acoust. Echo and Noise Control, 2008.
9. Y. Uemura, Y. Takahashi, H. Saruwatari, K. Shikano, and K. Kondo, “Musical noise generation analysis noise reduction methods based on spectral subtraction and MMSE STSA estimation”, Proc. of ICASSP, 2009, pp. 4433–4436.
10. P. Goel and A. Garg, “Developments in spectral subtraction for speech enhancement”, International Journal of Engineering Research and Applications, vol. 2, no. 1, 2012, pp. 055–063.
11. K. Yamashita, S. Ogata, and T. Shimamura, “Spectral subtraction iterated with weighting factors”, Proc. IEEE Speech Coding Workshop, 2002, pp. 138–140.
12. K. Yamashita, S. Ogata, and T. Shimamura, “Improved spectral subtraction utilizing iterative processing”, IEICE Trans. A, vol. 88, no. 11, 2005, pp. 1246–1257.
13. M. R. Khan and T. Hassan, “Iterative noise power subtraction technique for improved speech quality”, Proc. of Int. Conf. Elect. Comput. Eng., 2008, pp. 391–394.
14. X. Li, G. Li, and X. Li, “Improved voice activity detection based on iterative spectral subtraction and double thresholds for CVR”, Proc. of Workshop Power Electron. Intell. Transport. Syst., 2008, pp. 153–156.
15. Y. Lu and P. C. Loizou, “A geometric approach to spectral subtraction”, Speech Communication, vol. 50, 2008, pp. 453–466.
16. C. Plapous, C. Marro, P. Scalart, and L. Mauuary, “A Two-Step Noise Reduction Technique”, IEEE Intl. Conf. Acoust., Speech, Signal Processing, Canada, vol. 1, 2004, pp. 289–292.
17. C. Plapous, C. Marro, and P. Scalart, “Speech Enhancement Using Harmonic Regeneration”, IEEE Intl. Conf. Acoust., Speech, Signal Processing, USA, vol. 1, 2005, pp. 157–160.
18. R. Miyazaki, H. Saruwatari, T. Inoue, and Y. Takahashi, “Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 7, 2012, pp. 2080–2094.
19. Y. Ephraim and D. Malah, “Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator”, IEEE Trans. Acoust., Speech, Signal Processing, vol. 32, no. 6, 1984, pp. 1109–1121.
20. O. Cappé, “Elimination of the Musical Noise Phenomenon with the Ephraïm and Malah Noise Suppressor”, IEEE Trans. Speech and Audio Processing, vol. 2, no. 2, 1994, pp. 345–349.
21. Samudravijaya K et al., Hindi Speech Database, Proc. ICSLP00, Beijing, China, CDROM 00192.pdf.
22. A. Varga, H. J. M. Steeneken, and D. Jones, “The noisex-92 study on the effect of additive noise on automatic speech recognition system”, Reports of NATO Research Study Group (RSG.10), June 1992.


Author information

Authors and Affiliations

Department of ECE, RVIT Engineering College, Bijnor, India
Prateek Saxena
Department of ECE, IEC, Ghaziabad, India
V. K. Gupta
Department of ECE, BIT, Mesra, Ranchi, India
Mahesh Chandra


Corresponding author

Correspondence to Prateek Saxena.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, India
Suresh Chandra Satapathy
Kalyani University, Nadia, West Bengal, India
Jyotsna Kumar Mandal
University of Hyderabad, Hyderabad, India
Siba K. Udgata
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja


About this paper

Cite this paper

Saxena, P., Gupta, V.K., Chandra, M. (2016). Musical Noise Reduction Capability of Various Speech Enhancement Algorithms. In: Satapathy, S.C., Mandal, J.K., Udgata, S.K., Bhateja, V. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 434. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2752-6_68


DOI: https://doi.org/10.1007/978-81-322-2752-6_68
Published: 03 February 2016
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2750-2
Online ISBN: 978-81-322-2752-6
eBook Packages: Engineering, Engineering (R0)
