Efficient subband fast adaptive algorithm based-backward blind source separation for speech intelligibility enhancement

Sayoud, Akila; Djendi, Mohamed

doi:10.1007/s10772-020-09715-w

Published: 18 May 2020

Volume 23, pages 471–479, (2020)

International Journal of Speech Technology

Akila Sayoud¹ &
Mohamed Djendi¹

Abstract

This paper addresses the problem of speech intelligibility enhancement by subband adaptive filtering algorithms in a blind framework. Recently, in Djendi and Sayoud (Int J Speech Technol 22:391–406, 2019), we proposed a subband adaptive algorithm based on the forward blind source separation structure that is efficient for acoustic noise reduction and speech intelligibility enhancement applications. In this paper, we propose a novel subband-domain implementation of the backward blind source separation structure combined with a modified version of the fast normalized least mean square (FNLMS) algorithm. The proposed subband algorithm improves speech intelligibility without introducing distortion at the output. The proposed backward subband FNLMS algorithm is compared with fullband algorithms on the basis of several objective criteria. The obtained results show that the proposed subband algorithm achieves the best convergence speed.

1 Introduction

In many modern speech communication systems, background noise degrades the quality and intelligibility of the communication. For this reason, noise reduction plays an important role in ensuring high-quality communication and remains an active research topic. Many noise reduction and speech enhancement techniques have been developed in the literature, depending on the number of sensors available for processing. These approaches can be classified into three basic categories: (i) temporal filtering techniques that use a single microphone, such as optimal filtering (Benesty and Chen 2011) and spectral subtraction (Boll 1979); (ii) adaptive noise cancellation, based on a primary sensor that picks up the noisy signal and a reference sensor that measures the noise field (Widrow and Goodlin 1975); and (iii) beamforming techniques that use a sensor array (Habets et al. 2009).

Moreover, adaptive filtering algorithms have become more popular and have proven effective in acoustic noise reduction and speech quality enhancement. The normalized least mean square (NLMS) algorithm is the most popular and widely used adaptive filtering algorithm because of its simplicity and robustness (Habets et al. 2009). In spite of these advantages, the use of the NLMS algorithm has been hampered by its slow convergence when the input signal is highly correlated (Sayed 2003). To tackle this issue, numerous algorithms have been proposed, such as the recursive least squares algorithms and their fast versions (Slock and Kailath 1991) and the affine projection algorithms and their fast versions (Ozeki and Umeda 1984; Bouchard 2003). In the same direction, adaptive filtering in subbands has been proposed to improve the convergence speed of the conventional fullband adaptive filtering algorithms (Pradhan and Reddy 1999; Lee and Gan 2004; Djendi and Bendoumia 2013). Subband adaptive filtering (SAF) employs multirate filter banks for signal decomposition and reconstruction. This technique leads to faster convergence and lower computational complexity (Lee et al. 2010; Gilloire and Vetterli 1988). In this paper, we propose a new subband implementation of the backward blind source separation (BBSS) structure combined with a modified version of the fast normalized least mean square (FNLMS) algorithm for noise reduction and speech intelligibility enhancement applications.

The organization of this paper is as follows: Sect. 1.1 presents the adopted acoustic environment model. Section 2 describes the principle of the proposed backward subband FNLMS algorithm. Section 3 presents the simulation results of the comparative study between the proposed backward subband FNLMS algorithm and other fullband algorithms. Finally, Sect. 4 concludes the paper.

1.1 Acoustical environment model presentation

In this paper, the acoustical environment is modeled by the two channel simplified convolutive mixture that was proposed in (Djendi and Zoulikha 2014), where two noisy observations m₁(n) and m₂(n) are generated by the propagation of two uncorrelated source signals of speech s(n) and noise b(n) as depicted in Fig. 1.

The two noisy observations m₁(n) and m₂(n) are modeled by the following two equations:

$$m_{1} \left( n \right) = s\left( n \right) + h_{21} \left( n \right)*b\left( n \right)$$

(1)

$$m_{2} \left( n \right) = b\left( n \right) + h_{12} \left( n \right)*s\left( n \right)$$

(2)

where * denotes linear convolution, and h₁₂(n) and h₂₁(n) represent the acoustic cross-coupling paths between the source signals and the microphones. We assume that the speech source is close to the first microphone and the noise source is close to the second microphone; hence the direct-path impulse responses h₁₁(n) and h₂₂(n) are equal to the Kronecker unit impulse δ(n) (Van Gerven and Van Compernolle 1995) (see Fig. 1).
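As an illustration of Eqs. (1) and (2), the mixing model can be sketched in a few lines of Python with NumPy. This is only a minimal sketch: the function name `convolutive_mixture`, the random test signals, and the exponentially decaying coupling responses are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def convolutive_mixture(s, b, h12, h21):
    """Two-channel simplified mixture; the direct paths are unit impulses."""
    n = len(s)
    m1 = s + np.convolve(b, h21)[:n]  # Eq. (1): speech plus noise coupled through h21
    m2 = b + np.convolve(s, h12)[:n]  # Eq. (2): noise plus speech coupled through h12
    return m1, m2

# Assumed test data: white sequences standing in for the speech and noise sources,
# and short decaying random responses standing in for the coupling paths.
rng = np.random.default_rng(0)
s = rng.standard_normal(1000)
b = rng.standard_normal(1000)
decay = np.exp(-0.1 * np.arange(32))
h12 = 0.5 * decay * rng.standard_normal(32)
h21 = 0.5 * decay * rng.standard_normal(32)

m1, m2 = convolutive_mixture(s, b, h12, h21)
```

Truncating each convolution to the length of the sources keeps the observations aligned with s(n) and b(n), which is a convenience of this sketch rather than a requirement of the model.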

2 Proposed subband backward algorithm description

In this section we describe the principle of the proposed backward subband FNLMS algorithm. The proposed algorithm is a subband implementation of the backward blind source separation (BBSS) structure combined with a modified version of the fast NLMS (FNLMS) algorithm. A general block diagram of the proposed backward subband FNLMS algorithm is presented in Fig. 2.

In this figure, we find the following signals:

m₁(n) and m₂(n) are the fullband mixing signals.
m_1i(n) and m_2i(n) are the subband signals of the fullband signals m₁(n) and m₂(n), respectively.
e_1i,D(p) and e_2i,D(p) are the decimated output sub-signals.
$\varvec{E}_{1i} \left( n \right) = \left[ {e_{1i} \left( n \right), e_{1i} \left( {n - 1} \right), \ldots ,e_{1i} \left( {n - l + 1} \right)} \right]$ and $\varvec{E}_{2i} \left( n \right) = \left[ {e_{2i} \left( n \right), e_{2i} \left( {n - 1} \right), \ldots ,e_{2i} \left( {n - l + 1} \right)} \right]$ are the vectors of the interpolated output sub-signals e_1i(n) and e_2i(n), which are obtained from e_1i,D(p) and e_2i,D(p).
e₁(n) and e₂(n) are the fullband output signals reconstructed from the interpolated sub-signals.

These signals are defined mathematically in the following subsections.

2.1 Analysis stage

As shown in Fig. 2 (stage 1), the two noisy input signals m₁(n) and m₂(n) are split into M subband signals m_1i(n) and m_2i(n) by means of the analysis filter bank $h_{1} \left( n \right), \ldots ,h_{M} \left( n \right)$, and are then decimated by a factor D = M. The subband signals and their decimated versions are defined as follows:

$$m_{1i,D} \left( p \right) = m_{1i} \left( {pM} \right)\quad i = 1, \ldots ,M.$$

(3)

$$m_{1i} \left( n \right) = \varvec{h}_{i}^{T} \left( n \right)\varvec{m}_{1} \left( n \right)\quad i = 1, \ldots ,M.$$

(4)

$$m_{2i,D} \left( p \right) = m_{2i} \left( {pM} \right)\quad i = 1, \ldots ,M.$$

(5)

$$m_{2i} \left( n \right) = \varvec{h}_{i}^{T} \left( n \right)\varvec{m}_{2} \left( n \right)\quad i = 1, \ldots ,M.$$

(6)

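where $\varvec{m}_{1} \left( n \right) = \left[ {m_{1} \left( n \right), m_{1} \left( {n - 1} \right), \ldots ,m_{1} \left( {n - l + 1} \right)} \right]$ and $\varvec{m}_{2} \left( n \right) = \left[ {m_{2} \left( n \right), m_{2} \left( {n - 1} \right), \ldots ,m_{2} \left( {n - l + 1} \right)} \right]$, and l is the length of the analysis filters h_i(n). The variable n is the time index of the original fullband signals, and p is the time index of the decimated sub-signals.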
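The analysis stage of Eqs. (3) to (6) can be sketched as follows. The paper does not specify its analysis filters in this section, so the two-band Haar pair used here (M = 2, l = 2) is purely an assumption for illustration, as is the function name `analysis_stage`.

```python
import numpy as np

def analysis_stage(m, filters):
    """Filter m(n) with each analysis filter h_i(n), then keep every M-th sample."""
    M = len(filters)                      # critically sampled case, D = M
    subbands = []
    for h in filters:
        m_i = np.convolve(m, h)[:len(m)]  # Eq. (4)/(6): m_i(n) = h_i^T m(n)
        subbands.append(m_i[::M])         # Eq. (3)/(5): m_{i,D}(p) = m_i(pM)
    return np.array(subbands)

# Assumed two-band Haar analysis pair; not the filter bank used in the paper.
h_low = np.array([1.0, 1.0]) / np.sqrt(2)
h_high = np.array([1.0, -1.0]) / np.sqrt(2)

m1 = np.arange(8, dtype=float)
m1_D = analysis_stage(m1, [h_low, h_high])
print(m1_D.shape)  # (2, 4)
```

Each row of `m1_D` is one decimated sub-signal, so an input of 8 samples yields M = 2 sub-signals of 4 samples each.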

2.2 Adaptation process stage

In the second stage, we apply the BBSS structure (Henni et al. 2019) to retrieve the original source signals s(n) and b(n) using only the decimated noisy observations m_1i,D(p) and m_2i,D(p). In this structure, two symmetric adaptive filters are used to estimate the enhanced output signals. The coefficients of these adaptive filters are updated with the modified FNLMS algorithm. We note that the output signals of the proposed backward subband FNLMS algorithm are estimated in subbands, while the coefficients of the adaptive filters are adapted in their fullband forms. A detailed descriptive scheme of the adaptation process (stage 2) is given in Fig. 3.

2.3 Synthesis stage

In the last stage, the decimated output sub-signals e_1i,D(p) and e_2i,D(p) are interpolated by a factor I = M. A synthesis filter bank $g_{1} \left( n \right), \ldots ,g_{M} \left( n \right)$ is then used to merge the interpolated sub-signals into their fullband forms e₁(n) and e₂(n), which are given by the following relations:

$$e_{1} \left( n \right) = \mathop \sum \limits_{i = 1}^{M} \varvec{g}_{i}^{T} \left( n \right)\varvec{E}_{1i} \left( n \right)$$

(7)

$$e_{2} \left( n \right) = \mathop \sum \limits_{i = 1}^{M} \varvec{g}_{i}^{T} \left( n \right)\varvec{E}_{2i} \left( n \right)$$

(8)

where

$$e_{1i} \left( n \right) = \begin{cases} e_{1i,D} \left( n/I \right), & n = 0, \pm I, \pm 2I, \ldots \\ 0, & \text{otherwise} \end{cases}\quad i = 1, \ldots ,M.$$

(9)

$$e_{2i} \left( n \right) = \begin{cases} e_{2i,D} \left( n/I \right), & n = 0, \pm I, \pm 2I, \ldots \\ 0, & \text{otherwise} \end{cases}\quad i = 1, \ldots ,M.$$

(10)

and $\varvec{E}_{1i} \left( n \right) = \left[ {e_{1i} \left( n \right), e_{1i} \left( {n - 1} \right), \ldots ,e_{1i} \left( {n - l + 1} \right)} \right]$, $\varvec{E}_{2i} \left( n \right) = \left[ {e_{2i} \left( n \right), e_{2i} \left( {n - 1} \right), \ldots ,e_{2i} \left( {n - l + 1} \right)} \right]$.
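The interpolation and synthesis steps of Eqs. (7) to (10) can be checked on a small round trip. The sketch below assumes a two-band Haar analysis/synthesis pair, which is not the filter bank of the paper, and passes the sub-signals through unchanged (no adaptive filtering) to show that the bank reconstructs its input up to a one-sample delay. The function names are hypothetical.

```python
import numpy as np

def analysis(x, h_bank):
    """Filter with each h_i(n), then keep the samples n = pM (D = M)."""
    M = len(h_bank)
    return [np.convolve(x, h)[:len(x)][::M] for h in h_bank]

def synthesis(sub_D, g_bank, length):
    """Zero-insert by I = M, filter with g_i(n), and sum the bands."""
    M = len(g_bank)
    out = np.zeros(length)
    for e_D, g in zip(sub_D, g_bank):
        e_i = np.zeros(length)
        e_i[::M] = e_D                       # Eq. (9)/(10): nonzero only at n = 0, I, 2I, ...
        out += np.convolve(e_i, g)[:length]  # one term of the sum in Eq. (7)/(8)
    return out

# Assumed Haar analysis/synthesis pair for M = 2 (illustrative only).
h_bank = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]
g_bank = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([-1.0, 1.0]) / np.sqrt(2)]

x = np.random.default_rng(0).standard_normal(16)
y = synthesis(analysis(x, h_bank), g_bank, len(x))
print(np.allclose(y[1:], x[:-1]))  # True: the output is the input delayed by one sample
```

With this particular pair the aliasing terms of the two bands cancel, so any distortion observed at the output of the full algorithm would come from the adaptation stage rather than from the filter bank itself.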

2.4 Mathematical formulation of the processing algorithm

We have adopted the FNLMS algorithm to update the two cross-filters of the BBSS structure in a subband framework. In this subsection, we present the mathematical formulation of the proposed backward subband FNLMS algorithm.

The estimated signals for M subbands of the proposed backward subband FNLMS algorithm are given as follows:

$$e_{1i,D} \left( p \right) = m_{1i,D} \left( p \right) - \varvec{w}_{1}^{T} \left( p \right)\varvec{e}_{2i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(11)

$$e_{2i,D} \left( p \right) = m_{2i,D} \left( p \right) - \varvec{w}_{2}^{T} \left( p \right)\varvec{e}_{1i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(12)

where $\varvec{e}_{1i,D} \left( p \right) = \left[ {e_{1i,D} \left( p \right), e_{1i,D} \left( {p - 1} \right), \ldots ,e_{1i,D} \left( {p - L + 1} \right)} \right]$ and $\varvec{e}_{2i,D} \left( p \right) = \left[ {e_{2i,D} \left( p \right), e_{2i,D} \left( {p - 1} \right), \ldots ,e_{2i,D} \left( {p - L + 1} \right)} \right]$, and L is the length of the adaptive filters. The vectors $\varvec{w}_{1} \left( p \right)$ and $\varvec{w}_{2} \left( p \right)$ are the two adaptive filters of the proposed backward subband FNLMS algorithm, which are updated as follows:

$$\varvec{w}_{1} \left( {p + 1} \right) = \varvec{w}_{1} \left( p \right) - \mu_{1} \mathop \sum \limits_{i = 1}^{M} \left[ {e_{1i,D} \left( p \right)\varvec{c}_{1i,D} \left( p \right)} \right]$$

(13)

$$\varvec{w}_{2} \left( {p + 1} \right) = \varvec{w}_{2} \left( p \right) - \mu_{2} \mathop \sum \limits_{i = 1}^{M} \left[ {e_{2i,D} \left( p \right)\varvec{c}_{2i,D} \left( p \right)} \right]$$

(14)

where $0 < \mu_{1}, \mu_{2} < 2$ are the step-size parameters, which affect the convergence behavior of the filter weights, and $\varvec{c}_{1i,D} \left( p \right)$, $\varvec{c}_{2i,D} \left( p \right)$ are the decimated subband adaptation gain vectors, which are given by the following relations:

$$\varvec{c}_{1i,D} \left( p \right) = \gamma_{1i,D} \left( p \right)\varvec{k}_{1i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(15)

$$\varvec{c}_{2i,D} \left( p \right) = \gamma_{2i,D} \left( p \right)\varvec{k}_{2i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(16)

where the scalars $\gamma_{1i,D} \left( p \right)$ and $\gamma_{2i,D} \left( p \right)$ are the decimated subband likelihood variables, which are defined as follows:

$$\gamma_{1i,D} \left( p \right) = \frac{1}{{1 - \varvec{k}_{1i,D}^{T} \left( p \right)\varvec{e}_{2i,D} \left( p \right)}}\quad i = 1, \ldots ,M.$$

(17)

$$\gamma_{2i,D} \left( p \right) = \frac{1}{{1 - \varvec{k}_{2i,D}^{T} \left( p \right)\varvec{e}_{1i,D} \left( p \right)}}\quad i = 1, \ldots ,M.$$

(18)

The decimated subband vectors $\varvec{k}_{1i,D} \left( p \right)$ and $\varvec{k}_{2i,D} \left( p \right)$ are the Kalman gains, which are given by the following relations:

$$\left[ {\begin{array}{*{20}c} {\varvec{k}_{1i,D} \left( p \right)} \\ * \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} { - \frac{{\varepsilon_{1i,D} \left( p \right)}}{{\lambda \alpha_{1i,D} \left( {p - 1} \right) + c_{0} }}} \\ { \varvec{k}_{1i,D} \left( {p - 1} \right)} \\ \end{array} } \right]\quad i = 1, \ldots ,M.$$

(19)

$$\left[ {\begin{array}{*{20}c} {\varvec{k}_{2i,D} \left( p \right)} \\ * \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} { - \frac{{\varepsilon_{2i,D} \left( p \right)}}{{\lambda \alpha_{2i,D} \left( {p - 1} \right) + c_{0} }}} \\ { \varvec{k}_{2i,D} \left( {p - 1} \right)} \\ \end{array} } \right]\quad i = 1, \ldots ,M.$$

(20)

where the asterisk * represents the last unused element of the Kalman gains, $\lambda$ ($0 < \lambda < 1$) is an exponential forgetting factor, and $c_{0}$ is a small positive constant used to avoid division by very small values in the absence of an input signal. The decimated subband parameters $\alpha_{1i,D}$ and $\alpha_{2i,D}$ are the forward prediction error variances, which are defined as follows:

$$\alpha_{1i,D} \left( p \right) = \lambda \alpha_{1i,D} \left( {p - 1} \right) + \varepsilon^{2}_{1i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(21)

$$\alpha_{2i,D} \left( p \right) = \lambda \alpha_{2i,D} \left( {p - 1} \right) + \varepsilon^{2}_{2i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(22)

The decimated subband prediction errors $\varepsilon_{1i,D} \left( p \right)$ and $\varepsilon_{2i,D} \left( p \right)$ that are used to evaluate the Kalman gains can be calculated using a first-order prediction model as follows:

$$\varepsilon_{1i,D} \left( p \right) = e_{2i,D} \left( p \right) - a_{1i,D} e_{2i,D} \left( {p - 1} \right)\quad i = 1, \ldots ,M.$$

(23)

$$\varepsilon_{2i,D} \left( p \right) = e_{1i,D} \left( p \right) - a_{2i,D} e_{1i,D} \left( {p - 1} \right)\quad i = 1, \ldots ,M.$$

(24)

where $a_{1i,D}$ and $a_{2i,D}$ are the decimated subband prediction coefficients, which are obtained by minimizing the functions $E\left[ {\varepsilon_{1i,D}^{2} \left( p \right)} \right]$ and $E \left[ {\varepsilon_{2i,D}^{2} \left( p \right)} \right]$. Setting the derivatives of these functions with respect to $a_{1i,D}$ and $a_{2i,D}$, respectively, to zero leads to the following relations:

$$a_{1i,D} \left( p \right) = \frac{{E\left[ {e_{2i,D} \left( p \right)e_{2i,D} \left( {p - 1} \right)} \right]}}{{E\left[ {e^{2}_{2i,D} \left( {p - 1} \right)} \right]}} = \frac{{r_{1i,D} \left( p \right)}}{{r_{2i,D} \left( p \right)}}\quad i = 1, \ldots ,M.$$

(25)

$$a_{2i,D} \left( p \right) = \frac{{E\left[ {e_{1i,D} \left( p \right)e_{1i,D} \left( {p - 1} \right)} \right]}}{{E\left[ {e^{2}_{1i,D} \left( {p - 1} \right)} \right]}} = \frac{{r_{3i,D} \left( p \right)}}{{r_{4i,D} \left( p \right)}}\quad i = 1, \ldots ,M.$$

(26)

where $r_{1i,D} \left( p \right)$ and $r_{2i,D} \left( p \right)$ represent, respectively, the first coefficient of the autocorrelation function and the power of the decimated output sub-signal $e_{2i,D} \left( p \right)$. Similarly, $r_{3i,D} \left( p \right)$ and $r_{4i,D} \left( p \right)$ represent, respectively, the first coefficient of the autocorrelation function and the power of the decimated output sub-signal $e_{1i,D} \left( p \right)$. These prediction coefficients can be estimated for each subband as follows:

$$a_{1i,D} \left( p \right) = \frac{{r_{1i,D} \left( p \right)}}{{r_{2i,D} \left( p \right) + c_{a} }}\quad i = 1, \ldots ,M.$$

(27)

$$a_{2i,D} \left( p \right) = \frac{{r_{3i,D} \left( p \right)}}{{r_{4i,D} \left( p \right) + c_{a} }}\quad i = 1, \ldots ,M.$$

(28)

where $r_{1i,D} \left( p \right)$, $r_{2i,D} \left( p \right)$, $r_{3i,D} \left( p \right)$, and $r_{4i,D} \left( p \right)$ are estimated recursively by the following relations:

$$r_{1i,D} \left( p \right) = \lambda_{a} r_{1i,D} \left( {p - 1} \right) + e_{2i,D} \left( p \right) e_{2i,D} \left( {p - 1} \right)\quad i = 1, \ldots ,M.$$

(29)

$$r_{2i,D} \left( p \right) = \lambda_{a} r_{2i,D} \left( {p - 1} \right) + e^{2}_{2i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(30)

$$r_{3i,D} \left( p \right) = \lambda_{a} r_{3i,D} \left( {p - 1} \right) + e_{1i,D} \left( p \right) e_{1i,D} \left( {p - 1} \right)\quad i = 1, \ldots ,M.$$

(31)

$$r_{4i,D} \left( p \right) = \lambda_{a} r_{4i,D} \left( {p - 1} \right) + e^{2}_{1i,D} \left( p \right)\quad i = 1, \ldots ,M.$$

(32)

where $\lambda_{a}$ is a forgetting factor and $c_{a}$ is a small positive constant.
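The chain of recursions in Eqs. (15) to (32) can be easier to follow as code. The sketch below computes the adaptation gain vector of one subband for one cross-filter, followed by the weight update of Eq. (13)/(14). It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary used for the state, the initial value of the prediction error variance, and the order in which the recursions are evaluated within one iteration are all choices made for the example. For the filter w₁ the input vector is e_2i,D(p); for w₂ it is e_1i,D(p).

```python
import numpy as np

def init_state(L, alpha0=1.0):
    """Per-subband state; alpha0 is an assumed initial prediction error variance."""
    return {"k": np.zeros(L), "alpha": alpha0, "r_num": 0.0, "r_den": 0.0}

def adaptation_gain(x_vec, state, lam=0.99, lam_a=0.99, c0=0.1, c_a=1e-6):
    """One iteration for x_vec = [x(p), x(p-1), ..., x(p-L+1)] with L >= 2.

    Returns the adaptation gain c(p) and the likelihood variable gamma(p).
    """
    x_new, x_prev = x_vec[0], x_vec[1]
    state["r_num"] = lam_a * state["r_num"] + x_new * x_prev  # Eq. (29)/(31)
    state["r_den"] = lam_a * state["r_den"] + x_new ** 2      # Eq. (30)/(32)
    a = state["r_num"] / (state["r_den"] + c_a)               # Eq. (27)/(28)
    eps = x_new - a * x_prev                                  # Eq. (23)/(24)
    first = -eps / (lam * state["alpha"] + c0)                # top entry of Eq. (19)/(20)
    state["k"] = np.concatenate(([first], state["k"][:-1]))   # shift in, drop the last element
    state["alpha"] = lam * state["alpha"] + eps ** 2          # Eq. (21)/(22)
    gamma = 1.0 / (1.0 - state["k"] @ x_vec)                  # Eq. (17)/(18)
    return gamma * state["k"], gamma                          # Eq. (15)/(16)

def update_filter(w, mu, errors, gains):
    """Eq. (13)/(14): w(p+1) = w(p) - mu * sum over subbands of e_i(p) c_i(p)."""
    return w - mu * sum(e * c for e, c in zip(errors, gains))

# Single-subband example with L = 3, starting from a zero state.
state = init_state(L=3)
c, gamma = adaptation_gain(np.array([2.0, 0.0, 0.0]), state, lam=0.9, lam_a=0.9, c0=0.1)
w = update_filter(np.zeros(3), mu=0.5, errors=[1.0], gains=[c])
```

Because only the first element of the gain vector is computed at each iteration and the rest is shifted, the cost per subband stays linear in L, which is the point of the fast NLMS family.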

3 Simulation results

3.1 Description of the used signals

In this simulation, the mixing model of Fig. 1 generates two noisy observations $m_{1} \left( n \right)$ and $m_{2} \left( n \right)$, where the original speech signal $s\left( n \right)$ is an utterance of a French male speaker of about 4 seconds and the noise source $b\left( n \right)$ is a USASI (United States of America Standards Institute, now ANSI) noise taken from the AURORA database (Sayoud et al. 2018). Fig. 4 shows the time evolution of the source signals $s\left( n \right)$ and $b\left( n \right)$. These source signals are sampled at 8 kHz with 16-bit quantization. We have used the model proposed in (Djendi et al. 2006) to generate the two impulse responses $h_{12} \left( n \right)$ and $h_{21} \left( n \right)$. Figure 5 shows an example of the two simulated impulse responses with $L = 128$. Fig. 6 shows the time evolution of the two noisy observations $m_{1} \left( n \right)$ and $m_{2} \left( n \right)$; the input SNR (signal-to-noise ratio) at the two microphones is set to 0 dB.

3.2 Time evolution of the enhanced output

Figure 7 presents the time evolution of the enhanced output signal $e_{1} \left( n \right)$ obtained by the proposed backward subband FNLMS algorithm with $M = 2$ and $M = 4$. As shown in Fig. 7, the enhanced output speech signal appears completely denoised, which indicates that the proposed backward subband FNLMS algorithm efficiently enhances noisy speech.

3.3 Performance evaluations

This subsection evaluates the performance of the proposed backward subband FNLMS algorithm in comparison with the following fullband adaptive algorithms: (i) the backward normalized least mean square (BNLMS) algorithm (Van Gerven and Van Compernolle 1992), which is based on the combination of the BBSS structure and the NLMS algorithm, and (ii) the backward fast normalized least mean square (BFNLMS) algorithm proposed recently in our previous work (Sayoud et al. 2018), which is based on the combination of the BBSS structure and the FNLMS algorithm and represents the fullband version of our proposed backward subband FNLMS algorithm. We recall here that all simulated algorithms are controlled by a manual activity voice detector (MAVD) to retrieve the speech signal at the first output $e_{1} \left( n \right)$. The simulation parameters of each algorithm are given in Table 1. This comparative study is based on the following objective criteria:

Table 1 Simulation parameters of the proposed backward subband FNLMS, BNLMS and BFNLMS algorithms

(i) System Mismatch (SM) criterion, which is computed between the adaptive filter $w_{1} \left( n \right)$ and the real one $h_{21} \left( n \right)$ as follows (Hu and Loizou 2008):

$$SM_{dB} = 20 log_{10} \left({\frac{\Vert {\varvec{h}_{21} - \varvec{w}_{1} \left( n \right)\Vert}}{\Vert{\varvec{h}_{21} \Vert}}} \right)$$

(33)

(ii) Segmental Mean Square Error (SegMSE) criterion is given by the following relation (Ghribi et al. 2016):

$$SegMSE_{dB} = \frac{10}{K}\mathop \sum \limits_{m = 0}^{K - 1} log_{10} \left( {\frac{1}{N}\mathop \sum \limits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right) - e_{1} \left( n \right)} \right|^{2} } \right)$$

(34)

where $N$ is the segment length of the original signal $s\left( n \right)$ and the enhanced one $e_{1} \left( n \right)$, and $K$ is the number of segments in silence periods. We note that the SegMSE criterion is evaluated only in silence periods.

(iii) Segmental signal-to-noise-ratio (SegSNR) criterion is calculated using the following formula (Rabiner and Juang 1993):

$$SegSNR_{dB} = \frac{10}{K}\mathop \sum \limits_{m = 0}^{K - 1} log_{10} \left( {\frac{{\mathop \sum \nolimits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right)} \right|^{2} }}{{\mathop \sum \nolimits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right) - e_{1} \left( n \right)} \right|^{2} }}} \right)$$

(35)

where $s\left( n \right)$ and $e_{1} \left( n \right)$ are the original and the enhanced speech signals, respectively. The parameters K and N are the number of segments and the segment length, respectively.
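The three criteria of Eqs. (33) to (35) translate directly into code. The sketch below is one possible reading of these formulas: the function names and the `segments` argument, which lists the indices m of the segments to average over (for example silence segments for SegMSE and speech-activity segments for SegSNR), are assumptions introduced for the example. No floor is added inside the logarithms, so segments with zero error or zero speech energy must be excluded by the caller.

```python
import numpy as np

def system_mismatch_db(h, w):
    """Eq. (33): distance between the true path h and the adaptive filter w, in dB."""
    return 20.0 * np.log10(np.linalg.norm(h - w) / np.linalg.norm(h))

def _segment_indices(s, N, segments):
    """Default to every full segment of length N when no selection is given."""
    return range(len(s) // N) if segments is None else segments

def seg_mse_db(s, e, N, segments=None):
    """Eq. (34): mean over segments of the log of the per-segment mean squared error."""
    vals = [np.log10(np.mean((s[m * N:(m + 1) * N] - e[m * N:(m + 1) * N]) ** 2))
            for m in _segment_indices(s, N, segments)]
    return 10.0 * np.mean(vals)

def seg_snr_db(s, e, N, segments=None):
    """Eq. (35): mean over segments of the log ratio of speech energy to error energy."""
    vals = []
    for m in _segment_indices(s, N, segments):
        seg_s = s[m * N:(m + 1) * N]
        seg_err = seg_s - e[m * N:(m + 1) * N]
        vals.append(np.log10(np.sum(seg_s ** 2) / np.sum(seg_err ** 2)))
    return 10.0 * np.mean(vals)

# Small check with a constant signal and a constant error of 0.1.
s = np.ones(8)
e = s - 0.1
sm = system_mismatch_db(np.array([1.0, 0.0]), np.array([0.9, 0.0]))
mse = seg_mse_db(s, e, N=4)
snr = seg_snr_db(s, e, N=4)
```

In this check the error power is 0.01 in every segment, so the SM and SegMSE values are both close to −20 dB and the SegSNR is close to 20 dB.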

3.4 System mismatch (SM) evaluation

We have used the SM criterion to evaluate the convergence speed of the proposed backward subband FNLMS algorithm in comparison with the fullband BNLMS and BFNLMS algorithms. The simulation parameters of each simulated algorithm are given in Table 1. The obtained results for three input SNRs (i.e. −3 dB, 0 dB, 3 dB) are shown in Fig. 8. These results clearly show the superiority of the proposed backward subband algorithm over the other ones (i.e. BNLMS, BFNLMS) in terms of convergence speed for every input SNR.

3.5 Segmental mean square error (SegMSE) evaluation

The obtained results of the SegMSE criterion for the three algorithms, i.e. the proposed backward subband FNLMS algorithm and the BNLMS and BFNLMS algorithms, are reported in Fig. 9. The simulation parameters of each simulated algorithm are given in Table 1. The experiment of Fig. 9 confirms again that the proposed backward subband FNLMS algorithm converges faster than the other algorithms, i.e. BNLMS and BFNLMS, especially when the number of subbands is high ($M = 4$).

3.6 Segmental signal-to-noise ratio (SegSNR) evaluation

In order to evaluate the steady-state noise reduction performance of the proposed backward subband FNLMS algorithm in comparison with the BNLMS and BFNLMS algorithms, we have used the SegSNR criterion to compute the final SNR values, in speech activity periods only. We recall here that the simulation parameters of Table 1 are used for each simulated algorithm. Fig. 10 presents the obtained results of the SegSNR evaluation for three input SNRs (i.e. −3 dB, 0 dB, 3 dB). According to these results, the proposed backward subband FNLMS algorithm has almost the same SegSNR values as the fullband BFNLMS algorithm when the number of subbands is $M = 2$. However, the output SegSNR values decrease when the number of subbands is high ($M = 4$). We have also noted that the output SegSNR values of the proposed backward subband FNLMS algorithm with 2 and 4 subbands are above 40 dB, which confirms the good behavior of the proposed algorithm in reducing the acoustic noise components. The BNLMS algorithm, by contrast, performs poorly.

4 Conclusion

In this paper, we have proposed a new backward subband FNLMS adaptive filtering algorithm for noise reduction and speech intelligibility enhancement applications. The proposed backward subband FNLMS algorithm is a subband implementation of the BBSS structure based on the modified FNLMS algorithm. The performance of the proposed algorithm has been compared with that of two fullband algorithms, i.e. BNLMS and BFNLMS. Intensive experiments were conducted in terms of three objective criteria, i.e. system mismatch (SM), segmental signal-to-noise ratio (SegSNR), and segmental mean square error (SegMSE). The results obtained with different noise levels (i.e. highly and slightly noisy observations) have confirmed that the proposed backward subband FNLMS algorithm improves the convergence speed in the transient phase, especially when the number of subbands is high. We have also noted a degradation of the output SegSNR values when the number of subbands is high; however, the proposed backward subband FNLMS algorithm reduces the acoustic noise components by about 40 dB at the output, with both low and high numbers of subbands. Finally, we can say that the proposed backward subband FNLMS algorithm is an interesting candidate for acoustic noise reduction and speech enhancement applications.

References

Benesty, J., & Chen, J. (2011). Optimal time-domain noise reduction filters, a theoretical study. Berlin, Germany: Springer-Verlag.
Boll, S. F. (1979). Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics Speech and Signal Processing, 27(2), 113–120.
Bouchard, M. (2003). Multichannel affine and fast affine projection algorithms for active noise control and acoustic equalization systems. IEEE Transactions on Speech and Audio Processing, 11(1), 54–60.
Djendi, M., & Bendoumia, R. (2013). A new adaptive filtering subband algorithm for two channel acoustic noise reduction and speech enhancement. Computers & Electrical Engineering, 39(8), 2531–2550.
Djendi, M., Gilloire, A., & Scalart, P. (2006). Noise cancellation using two closely spaced microphones: Experimental study with a specific model and two adaptive algorithms. In IEEE int. conf. ICASSP, Toulouse, France, 14–19 May 2006, Vol. 3, pp. 744–748.
Djendi, M., & Sayoud, A. (2019). A new dual subband fast NLMS adaptive filtering algorithm for blind speech quality enhancement and acoustic noise reduction. International Journal of Speech Technology, 22, 391–406.
Djendi, M., & Zoulikha, M. (2014). New automatic forward and backward blind sources separation algorithms for noise reduction and speech enhancement. Computers & Electrical Engineering, 40, 2072–2088.
Ghribi, K., Djendi, M., & Berkani, D. (2016). A new wavelet-based forward BSS algorithm for acoustic noise reduction and speech quality enhancement. Applied Acoustics, 105, 55–66.
Gilloire, A., & Vetterli, M. (1988). Adaptive filtering in subbands. In Proceedings of the ICASSP, pp. 1572–1575.
Habets, E. A. P., Benesty, J., Gannot, S., & Cohen, I. (2009). The MVDR beamformer for speech enhancement. In Speech processing in modern communication, pp. 225–254.
Haykin, S. (2002). Adaptive filter theory (4th ed.). Upper Saddle River, NJ: Prentice-Hall.
Henni, R., Djendi, M., & Djebari, M. (2019). A new efficient two-channel fast transversal adaptive filtering algorithm for blind speech enhancement and acoustic noise reduction. Computers & Electrical Engineering, 73, 349–368.
Hu, Y., & Loizou, P. C. (2008). Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech and Language Processing, 16(1), 229–238.
Lee, K. A., & Gan, W. S. (2004). Improving convergence of the NLMS algorithm using constrained subband updates. IEEE Signal Processing Letters, 11(9), 736.
Lee, K.-A., Gan, W.-S., & Kuo, S. M. (2010). Subband adaptive filtering theory and implementation. New York: Wiley.
Ozeki, K., & Umeda, T. (1984). An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties. Electron Commun Jpn, 67A(5), 19–27.
Pradhan, S. S., & Reddy, V. U. (1999). A new approach to subband adaptive filtering. IEEE Transactions on Signal Processing, 47(3), 655–664.
Rabiner, L., & Juang, B. H. (1993). Fundamentals of speech recognition. Englewood Cliffs: Prentice-Hall.
Sayed, A. H. (2003). Fundamentals of adaptive filtering. New York: Wiley.
Sayoud, A., Djendi, M., & Guessoum, A. (2018a). A two-sensor fast adaptive algorithm for blind speech enhancement. In International conference on engineering & MIS, Istanbul, Turkey.
Sayoud, A., Djendi, M., Medahi, S., & Guessoum, A. (2018b). A dual fast NLMS adaptive filtering algorithm for blind speech quality enhancement. Applied Acoustics, 135, 101–110.
Slock, D. T. M., & Kailath, T. (1991). Numerically stable fast transversal filters for recursive least-squares adaptive filtering. IEEE Trans Signal Proc, 39(1), 92–114.
Van Gerven, S., & Van Compernolle, D. (1992). Feedforward and feedback in symmetric adaptive noise canceller: Stability analysis in a simplified case. In Eusipco92, European Signal Processing Conf. Brussels, Belgium, pp. 1081–1084 (August).
Van Gerven, S., & Van Compernolle, D. (1995). Signal separation by symmetric adaptive decorrelation: Stability, convergence, and uniqueness. IEEE Trans Signal Proc, 74(3), 1602–1612.
Widrow, B., & Goodlin, R. C. (1975). Adaptive noise cancelling: Principles and applications. Proceedings of the IEEE, 63(12), 1692–1716.

Author information

Authors and Affiliations

Signal Processing and Image Laboratory (LATSI), University of Blida 1, Route de Soumaa, B.P. 270, Blida, 09000, Algeria
Akila Sayoud & Mohamed Djendi

Corresponding author

Correspondence to Mohamed Djendi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sayoud, A., Djendi, M. Efficient subband fast adaptive algorithm based-backward blind source separation for speech intelligibility enhancement. Int J Speech Technol 23, 471–479 (2020). https://doi.org/10.1007/s10772-020-09715-w

Received: 07 October 2019
Accepted: 11 May 2020
Published: 18 May 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10772-020-09715-w
