Abstract
Speech signal carries much more information than spoken words and speaker’s information. Noise present in the speech signal is an obstacle in the path of analysing the speech signal precisely. Therefore, speech signal denoising has an important role in the speech signal analysis. In this paper, Wavelets are explored for pre-processing and suppression of noise from the noisy speech signals. The main aim of the proposed methodology involves, the pre-processing of the speech signals by using the combination of Discrete Wavelet Transform (DWT) and hard thresholding and to evaluate the result to find the optimal wavelet family with there optimal order followed by the suitable level of decomposition. This paper involves the exploration of the Wavelet Family and an improved Signal to Noise Ratio (SNR) values of the acquired speech signal which is shown in the result analysis.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
- Speech signals
- Discrete wavelet transform (DWT)
- Wavelet transform (WT)
- Hard thresholding
- Signal to noise ratio (SNR)
1 Introduction
Communication is the necessity of all human beings. Speech is one of the most important communication sources. It is the most natural, intuitive and fastest means of interaction among humans [1]. The speech signal contains the background noise and the distorted speech signal. The noise present in the speech signal causes the degradation in the quality of the speech. Over the last few decades, the pre-processing of speech signal which involves the removal of noise from speech has been an area of interest for the researchers [2]. Therefore, to remove a noise from the speech is very important as it may affect the further processing of the speech signal. So, filtering the noise from the acquired speech signal plays an important role in the speech signal analysis. As the frequency domain based signal processing can be designed easily and most of the noise suppression methods require the use of Short Term Fourier Transform (STFT) but in today’s scenario Wavelet Transform (WT) is gaining much importance as it is simple and efficient method for speech signal de-noising. Wavelet transform can analyze the signal and can select the information present in it that other signal de-noising techniques lack [2]. Therefore, lot of study has been done by the researchers for the suppression of noise in speech signals. Aggarwal et al. [2] introduced an approach of multi-resolution analysis using WT and found that the modified universal threshold gives better results of denoising. Bhowmick et al. [3] proposed a method of Voiced Speech Probability Detection Wavelet Decomposition (VSPDWD) and compared it with seven different techniques. It was found that the VSPDWD technique gave an improved SNR values at all levels of decomposition. Akkar et al. [4] in their study made a comparison between different thresholding techniques. It was found that the square wavelet thresholding gave better results than the traditional thresholding techniques. Babichev et al. [5] introduced a de-noising method that is the Emperical Mode and the Wavelet Decomposition techniques. It was found that there was a relative change in the values of the Shannon Entropy used for the quality criterion. This indicates that the technique used was effective. Hadi et al. [6] in their study made a comparison between different threshold selection rule. It was found that the sawtooth wavelet thresholding gave better results than the traditional thresholding techniques.
In this paper, a combination of Discrete Wavelet transform and Hard Thresholding technique for noise reduction has been proposed. Wavelet Transform provides a multi-resolution and is a better technique than Fourier Transform and STFT. This paper consists of following sections: Section 2 gives the description of the pre-processing of noisy speech signals using wavelets. Section 3 consists of experimental setup, Sect. 4 consists of exploration of suitable Wavelet Family for speech signal analysis and Sect. 5 consists of simulation results. Section 6 consists of the conclusion of the analysis done in the proposed work.
2 Pre-processing of Noisy Speech Signals Using Wavelets
2.1 Discrete Wavelet Transform
For suppressing the noise present in the speech signal, Discrete Wavelet Transform has been used. It involves the decomposition of the speech signal in the time frequency domain. The noise present in the speech signal cannot be easily removed by using the Kalman or Chebyshev filters. Therefore, it can be removed by applying the wavelet transform [7,8,9]. A wavelet is a wave like oscillation that begins at zero, increases and then again goes to zero [1, 10]. The scaled and shifted version of fundamental or mother wavelet Ψ is elucidated below [3]:
where β is the scaling parameter and τ is the translation parameter. The noisy speech signal s(t) is decomposed into sub-bands through DWT into approximation and detailed coefficients. The detailed coefficient or the higher frequency component D(p, k) has been elucidated below [3]:
where p, k and n are integers and ψ*(t) is the complex conjugate of ψ(t). The approximation coefficients or the lower frequency component A(p, k), has been elucidated below [3]:
where *(n) is the complex conjugate of the scaling function(n). When DWT is applied to the noisy signal s(t) at different level then the speech signal decomposes to approximation and detailed coefficients [3, 11]. The detailed coefficients are obtained by filtering the high frequency component present in the noisy speech signals through high pass filter and the approximation coefficients are obtained by filtering the low frequency component present in the noisy speech signals through low pass filter. The reconstruction of the original speech signal is done by applying the Inverse Discrete Wavelet Transform (IDWT) to the filtered speech signal which is formed by combining the detailed and the approximation coefficients from the last level of decomposition to the first level. Figures 1 and 2, shows the two level wavelet decomposition and reconstruction process in which s is the noisy speech signal, cA1 and cD1 are the first level approximation and detailed coefficients and cA2 and cD2 are the second level approximation and detailed coefficients [2].
2.2 Universal Thresholding Based Filtering Method for Pre-processing of Speech Signals
The noise present in the speech signal is a major issue in the speech signal analysis. In the proposed work, a universal thresholding based filtering technique using DWT has been proposed. The higher frequency components acquired through DWT is having a residual noise that cannot be removed by applying the simple filtering process [9, 12, 13].
2.2.1 Threshold Selection
The universal threshold value T can be evaluated by the equation elucidated below [3]:
where L is the noisy speech signal sample. The standard deviation σ can be evaluated as [5]:
where MAD is the Median Absolute Deviation and \(D\left[ n \right]\) is the detailed coefficient of noisy speech signal.
2.2.2 Threshold Function
The universal threshold function, hard thresholding has been used in the proposed work. The calculation formula for hard \((H_{m,n} )\) threshold function is given below [1]:
where \(\omega_{m,n}\) is the wavelet decomposition coefficient of the noisy speech signal and μ is the threshold value. The threshold value μ is placed to zero and if the value of the coefficients is more than the threshold value then all the coefficients are threshold and this is known as Hard Thresholding.
3 Experimental Setup
The proposed method is performed and evaluated on NOIZEUS database [14]. The database accommodates 30 IEEE sentences contaminated with eight different noises at different SNRs. The noise is added to the sentences from the AURORA database [15] that includes train, babble, car, exhibition hall, restaurant, street, airport and train-station noise. In this experiment the noise from the noisy speech signal is removed using Discrete Wavelet Transform technique. In the proposed methodology the noisy speech signal is decomposed into Approximation and Detailed coefficients by using different types of wavelets like Daubechies, Coiflets, Symlets and Haar wavelet. The detail coefficients are difficult to remove through filters. Therefore, noise suppression in the noisy speech signals is done through hard thresholding [16,17,18]. The evaluation of the proposed work is done by calculating the SNR by using the mathematical expression as elucidated in Eq. (7) [3, 11]:
where \(L\) is the sample size for the filtered speech signal, \(s\left( m \right)\) is the noisy speech signal and \(\hat{s}\left( m \right)\) is the clean speech signal.
4 Exploration of Suitable Wavelet Family for Speech Signal Analysis
In the proposed work different Wavelet families has been explored for the suppression of noise from speech signals. From the different family used which are Daubechies, Symlets, Coiflets and Haar wavelet, Coiflets tends to give the optimal SNR value. The comparative analysis of SNR values for different wavelet family is shown in Table 1. The noise in the signal decreases from 0 dB to 10 dB and the SNR value of the respective signal tends to increase. The order 5 of the Coiflet wavelet gives the optimal SNR value. The comparative analysis of SNR values for different order of Coiflet Wavelet are shown in Table 2. Here it is examined that as the order of the Wavelet family increases, the SNR value increases. The comparative analysis of SNR values for different level of decomposition of noisy speech signal is shown in Table 3. Here it is analyzed that the SNR improves to a certain level of decomposition and then it stops as the sample number decreases in lower sub-bands. So, based on the explored parameters the simulation of result has been done.
5 Simulation Results
The speech signals contain the noise which is important to remove as it causes the difficulty in the further processing of the signal. Figure 3 shows the noisy and filtered speech signal. The decomposition of the noisy speech signal is done through DWT at various levels along with a different wavelet family. The hard thresholding is applied by calculating the threshold value \(T\) to the coefficients obtained through the decomposition of the noisy speech signal [19, 20]. From the exploration of suitable wavelet family the result is simulated. Therefore, an improved result of pre-processing of speech signal by using the combination of DWT and Hard Thresholding has been obtained. The reconstruction of the noisy speech signal is done through IDWT. The important information present in the reconstructed speech signal is not lost [21, 22].
6 Conclusion
In the proposed work, the pre-processing of the noisy speech signal through the combination of DWT and hard thresholding has been done and the Wavelet Family has been explored to obtain the improved result. Comparative analysis for the best wavelet family, order of the wavelet and the best level of decomposition has been obtained and based on the explored parameters the simulation of the result for the suppression of noisy speech signal has been done.
References
Zhong, X., Dai, Y., Dai, Y., Jin, T.: Study on processing of wavelet speech denoising in speech recognition system. Int. J. Speech Technol. 21, 563–569 (2018)
Aggarwal, R., Singh, J.K., Gupta, V.K., Rathore, S., Tiwari, M., Khare, A.: Noise reduction of speech signal using wavelet transform with modified universal threshold. Int. J. Comput. Appl. 20, 14–19 (2011)
Bhowmick, A., Chandra, M.: Speech enhancement using voiced speech probability based wavelet decomposition. Comput. Electr. Eng. 62, 706–718 (2017)
Akkar, H.A.R., Hadi, W.A.H., Al-Dosari, I.H.: A squared-Chebyshev wavelet thresholding based 1D signal compression. Defence Technol. 15, 426–431 (2019)
Babichev, S., Mikhalyov, O.: A hybrid model of 1-D signal adaptive filter based on the complex use of Huang transform and wavelet analysis. Int. J. Intell. Syst. Appl. 1–8 (2019)
Akkar, H.A.R., Hadi, W.A.H., Al-Dosari, I.H.: Implementation of sawtooth wavelet thresholding for noise cancellation in one dimensional signal. Int. J. Nanoelectron. Mater. 12, 67–74 (2019)
Bhateja, V., Urooj, S., Verma, R., Mehrotra, R.: A novel approach for suppression of powerline interference and impulse noise in ECG signals. In: Proc. of IMPACT-2013. pp. 103–107. Aligarh, India (2013)
Chieng, T.M., Hau, Y.W., Bin Omar, Z., Lim, C.W.: Qualitative and quantitative performance comparison of ECG noise reduction and signal enhancement method based on various digital filter designs and discrete wavelet transform. Int. J. Comput. Digital Syst. 9, 534–544 (2020)
Bhateja, V., Srivastava, A., Tiwari, D.K.: An approach for the preprocessing of EMG signals using canonical correlation analysis. Smart Comput. Inform. 78, 201–208 (2017)
Jakati, J.S., Shridhar, S.K.: Efficient speech de-noising algorithm using multi-level discrete wavelet transform and thresholding. Int. J. Emerg. Trends Eng. Res. 8, 2472–2480 (2020)
Taquee, A., Bhateja, V., Shankar, A., Srivastava, A.: Combination of wavelets and hard thresholding for analysis of cough signals. In: 2018 Second World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). pp. 266–270. IEEE Press, London (2018)
Wang, K., Su, G., Liu, L., Wang, S.: Wavelet packet analysis for speaker-independent emotion recognition. Neurocomputing 398, 257–264 (2020)
Venkateswarlu, S.C., Karthik, A., Kumar, D.N.: Performance on speech enhancement objective quality measures using hybrid wavelet thresholding. Int. J. Eng. Adv. Technol. (IJEAT) 8, 3523–3533 (2019)
NOIZEUS: A Noisy Speech Corpus for Evaluation of Speech Enhancement Algorithms. https://ecs.utdallas.edu/loizou/speech/noizeus/
The AURORA Experimental Framework For The Performance Evaluation of Speech Recognition Systems Under Noisy Conditions. https://www.isca-speech.org/archive_open/asr2000/asr0_181.html
Verma, R., Mehrotra, R., Bhateja, V.: A new morphological filtering algorithm for pre-processing of electrocardiographic signals. In: Proc. (SPRINGER) of the Fourth International Conference on Signal and Image Processing (ICSIP 2012). pp. 193–201. Coimbatore, India (2012)
Bhateja, V., Devi, S.: Combination of wavelet analysis and non-linear enhancement function for computer aided detection of microcalcifications. In: Proc. of International Conference on Biomedical Engineering and Assistive Technologies (BEATs-2010), p. 44. Jalandhar, India (2010)
Alabbasi, H.A., Jalil, A.M., Hasan, F.S.: Adaptive wavelet thresholding with robust hybrid features for text-independent speaker identification system. Int. J. Electr. Comput. Eng. 10, 2088–8708 (2020)
Mishra, A., Bhateja, V., Gupta, A., Mishra, A.: Noise removal in EEG signals using SWT–ICA combinational approach. Smart Intell. Comput. Appl. 105, 217–224 (2018)
Srivastava, A., Bhateja, V., Shankar, A., Taquee, A.: On analysis of suitable wavelet family for processing of cough signals. Front. Intell. Comput. Theory Appl. 1014, 194–200 (2019)
Bhateja, V., Taquee, A., Sharma D.K.: Pre-processing and classification of cough sounds in noisy environment using SVM. In: Proc. of 2019 4th International Conference on Information Systems and Computer Networks (ISCON), pp. 822–826. Mathura, India (2019)
Vani, H.Y., Anusuya, M.A.: Improving speech recognition using bionic wavelet features. AIMS Electron. Electr. Eng. 4, 200–215 (2020)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yadav, M.K., Bhateja, V., Singh, M. (2021). Exploration of Wavelets for Pre-processing of Speech Signals. In: Bhateja, V., Satapathy, S.C., Travieso-Gonzalez, C.M., Flores-Fuentes, W. (eds) Computer Communication, Networking and IoT. Lecture Notes in Networks and Systems, vol 197. Springer, Singapore. https://doi.org/10.1007/978-981-16-0980-0_44
Download citation
DOI: https://doi.org/10.1007/978-981-16-0980-0_44
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0979-4
Online ISBN: 978-981-16-0980-0
eBook Packages: EngineeringEngineering (R0)