A novel cascade structure for joint backward blind acoustic noise and echo cancellation systems

Henni, Rahima; Djendi, Mohamed

doi:10.1007/s42452-020-2942-6

Research Article
Published: 02 June 2020

Volume 2, article number 1155, (2020)

SN Applied Sciences

Abstract

This paper addresses joint acoustic echo and punctual noise cancellation in hands-free communication systems. A novel cascade structure is presented that combines the backward blind source separation (BBSS) structure, used as an acoustic noise canceller (ANC), with a conventional acoustic echo canceller (AEC) intended to solve the problem of acoustic echo. Two adaptive filtering algorithms are used to update the filter coefficients of the two processes (ANC and AEC stages). In the first (ANC) stage, the BBSS structure is combined with the two-channel simplified fast transversal filter (SFTF) algorithm. In the second (AEC) stage, a single-channel SFTF algorithm is used. Experimental results show both the superiority and the efficiency of the new structure in comparison with the AEC-based SFTF system (without a joint process) in terms of the echo return loss enhancement and mean square error gain criteria.

1 Introduction

Echo- and noise-related problems are very common in many applications involving speech communication, e.g., audio conferencing and hands-free telephony. In these systems, the speech that comes from the far-end speaker and echoes back with a time delay produces perception problems. In such conditions, perception is further impaired when the speaker is situated in a noisy environment [1]. In order to provide a better communication service, an acoustic echo canceller (AEC) is required to cancel the echo returned to the transmission room and also to allow uninterrupted communication between the rooms [2,3,4].

Many state-of-the-art adaptive techniques based on one microphone [5, 6] or two microphones [7, 8] have been proposed in the literature to resolve this critical issue, i.e. acoustic echo. However, the presence of punctual noise impedes the convergence of acoustic echo cancellation algorithms, which leads to poor echo cancellation. For hands-free mobile equipment, an acoustic noise canceller is therefore strongly required in addition to the acoustic echo canceller. Several techniques that reduce the noise separately have been proposed; for example, in previous studies [9, 10], adaptive techniques based on one [11, 12] or two [13, 14] microphones were proposed to correct speech distortion. In [15,16,17], the forward and backward source separation structures were used to enhance corrupted speech signals [18, 19]. Combining these two structures with families of adaptive filtering algorithms has given new insight into the acoustic noise cancellation field. When both acoustical disturbances, i.e. acoustic noise and echo, are present at the same time, the aforementioned noise reduction techniques should be followed by a second process, i.e. acoustic echo cancellation, to improve the quality of the communication. Techniques for combining acoustic noise and echo cancellation systems can be found in the literature [20,21,22,23,24]. The order in which the processing blocks are applied is very important. When the AEC process is applied first and then the ANC, the acoustic echo canceller has to be repeated for each of the microphones, since most noise reduction techniques make use of multiple microphones, and these AEC systems need to be robust against the noise that is still present in their input signals. The advantage of the second case, where the ANC precedes the AEC, is that only one acoustic echo canceller is needed.

In this paper, we propose a novel cascade structure for joint backward blind acoustic noise and echo cancellation systems. In the new structure, we propose to use the BBSS structure with the two-channel SFTF algorithm [25] to suppress the punctual noise components from the far-end signal before it is processed by the AEC system, which is based on the single-channel SFTF algorithm [26,27,28,29,30].

The outline of this paper is as follows. Sect. 2 presents the development of the proposed approach, with a detailed description of the acoustical environment and of the new cascade structure that combines the acoustic noise and echo cancellation systems. Sect. 3 presents the experimental study, where we describe the signals used and present the simulation and evaluation results of the proposed approach. Finally, we conclude our work in Sect. 4.

2 Development of the proposed approach

In realistic hands-free communication applications, the observed speech signal is corrupted by the echo and noise components as well. As shown in Fig. 1, in addition to the punctual noise, the two microphones of the near-end pick up the amplified and broadcasted far-end signal.

In order to improve the AEC in the presence of punctual noise components, an acoustic noise canceller should be integrated to deal with the noise components. In this section, we present a novel cascade structure for joint backward blind acoustic noise and echo cancellation systems, where the blind ANC stage is placed before the AEC stage. Fig. 2 gives a detailed scheme of the proposed structure, and each step is described in the next sub-sections.

2.1 Modeling of the acoustical environment

In Fig. 2 [1st step], we give the mixing model, which is physically coherent [17]. Here $s\left( n \right)$ is the far-end signal, $b\left( n \right)$ is the punctual noise, the impulse response $h_{11} \left( n \right)$ models the direct acoustic path, and the impulse responses $h_{12} \left( n \right)$ and $h_{21} \left( n \right)$ model the cross acoustic paths between the source signals and the two microphones. In our work, it is assumed that the noise source is close to the second microphone; hence, the direct impulse response $h_{22} \left( n \right)$ is the Kronecker unit impulse $\delta \left( n \right)$ [13]. The signals $d_{1} \left( n \right)$ and $d_{2} \left( n \right)$ are the echo signals picked up by the near-end microphones. These echo signals are given by relations (1) and (2), respectively.

$$d_{1} \left( n \right) = s\left( n \right)*h_{11} \left( n \right)$$

(1)

$$d_{2} \left( n \right) = s\left( n \right)*h_{12} \left( n \right)$$

(2)

However, the observed and available signals are $p_{1} \left( n \right)$ and $p_{2} \left( n \right)$, which are given in function notation as follows:

$$p_{1} \left( n \right) = d_{1} \left( n \right) + b\left( n \right)*h_{21} \left( n \right)$$

(3)

$$p_{2} \left( n \right) = d_{2} \left( n \right) + b\left( n \right)$$

(4)

where ‘*’ denotes the linear convolution operator.
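The mixing model (1)-(4) can be sketched in a few lines of NumPy. This is only an illustration: the function name `mix` and the toy signals and impulse responses below are hypothetical and are not taken from the experiments of this paper; the convolutions are truncated to the length of the source signals.

```python
import numpy as np

def mix(s, b, h11, h12, h21):
    """Build the two microphone observations of Eqs. (1)-(4).

    h22 is assumed to be a unit impulse, as in the text.
    """
    n = len(s)
    d1 = np.convolve(s, h11)[:n]        # echo at microphone 1, Eq. (1)
    d2 = np.convolve(s, h12)[:n]        # echo at microphone 2, Eq. (2)
    p1 = d1 + np.convolve(b, h21)[:n]   # Eq. (3)
    p2 = d2 + b                         # Eq. (4)
    return p1, p2

# Toy inputs (hypothetical values, chosen so the result is easy to check by hand).
s = np.array([1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])
p1, p2 = mix(s, b, h11=[1.0, 0.5], h12=[0.3], h21=[0.2, 0.1])
print(p1)
print(p2)
```

With unit impulses as sources, each observation is simply a sum of shifted impulse responses, which makes the roles of the direct and cross paths easy to see.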

2.2 Problem formulation of joint backward blind ANC and AEC cancellation systems

The proposed system is depicted in Fig. 2 [2nd and 3rd steps]. It consists of a backward blind ANC process, which deals with the noise cancellation task, and an AEC process, which is designed for the echo cancellation task. The backward blind ANC process [see Fig. 2 (2nd step)] comprises a two-channel noise cancellation system based on the BBSS structure [25]. This blind approach aims to recover estimates of the original sources $s\left( n \right)$ and $b\left( n \right)$ by using only the noisy observations $p_{1} \left( n \right)$ and $p_{2} \left( n \right)$ that are generated by the model of Fig. 2 [1st step].

The outputs of the ANC process are given by the following relations:

$$u_{1} \left( n \right) = p_{1} \left( n \right) - u_{2} \left( n \right)*w_{21} \left( n \right)$$

(5)

$$u_{2} \left( n \right) = p_{2} \left( n \right) - u_{1} \left( n \right)*w_{12} \left( n \right)$$

(6)

By inserting (3), (4) and (6) in (5), and (3), (4) and (5) in (6), we get:

$$u_{1} \left( n \right)*\left[ {\delta \left( n \right) - w_{12} \left( n \right)*w_{21} \left( n \right) } \right] = \left[ {s\left( n \right)*\left( {h_{11} \left( n \right) - h_{12} \left( n \right)*w_{21} \left( n \right)} \right) + b\left( n \right)*\left( {h_{21} \left( n \right) - w_{21} \left( n \right)} \right)} \right]$$

(7)

$$u_{2} \left( n \right)*\left[ {\delta \left( n \right) - w_{12} \left( n \right)*w_{21} \left( n \right)} \right] = \left[ {s\left( n \right)*\left( {h_{12} \left( n \right) - h_{11} \left( n \right)*w_{12} \left( n \right)} \right) + b\left( n \right)*\left( {\delta \left( n \right) - h_{21} \left( n \right)*w_{12} \left( n \right)} \right)} \right]$$

(8)

By using the optimal solutions:

$$w_{21} \left( n \right) = h_{21} \left( n \right)$$

(9)

$$w_{12} \left( n \right) = h_{12} \left( n \right)*\frac{1}{{h_{11} \left( n \right)}}$$

(10)

we can recover the original signals, i.e. $s\left( n \right)$ and $b\left( n \right)$, as given in (11) and (12), where $s\left( n \right)$ is recovered with a distortion, namely a filtering by $h_{11} \left( n \right)$, as shown in (11).

$$u_{1} \left( n \right) = s\left( n \right)*h_{11} \left( n \right)$$

(11)

$$u_{2} \left( n \right) = b\left( n \right)$$

(12)
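The effect of the optimal solutions (9) and (10) can be checked numerically with fixed filters. The sketch below makes two simplifying assumptions that are not part of the paper: the direct path is $h_{11}(n) = \delta(n)$, so that (10) reduces to $w_{12}(n) = h_{12}(n)$, and both cross paths contain a one-sample delay, so that the backward recursion (5)-(6) only needs past outputs and has no delay-free loop. The function name and the coefficient values are hypothetical.

```python
import numpy as np

def backward_separation(p1, p2, w12, w21):
    """Backward recursion (5)-(6) with fixed, strictly causal filters.

    w12[k] and w21[k] multiply u1(n-1-k) and u2(n-1-k) respectively
    (assumption: one-sample delay in both cross paths).
    """
    L = len(w12)
    u1 = np.zeros(len(p1))
    u2 = np.zeros(len(p2))
    x1 = np.zeros(L)   # past samples of u1, newest first
    x2 = np.zeros(L)   # past samples of u2, newest first
    for n in range(len(p1)):
        u1[n] = p1[n] - w21 @ x2
        u2[n] = p2[n] - w12 @ x1
        x1 = np.concatenate(([u1[n]], x1[:-1]))
        x2 = np.concatenate(([u2[n]], x2[:-1]))
    return u1, u2

def delayed(x, h):
    """Convolve x with h preceded by a one-sample delay."""
    return np.convolve(x, np.concatenate(([0.0], h)))[:len(x)]

rng = np.random.default_rng(0)
s, b = rng.standard_normal((2, 2000))       # synthetic white sources
h12 = np.array([0.4, 0.2, -0.1])            # hypothetical cross paths
h21 = np.array([0.5, -0.3, 0.1])
p1 = s + delayed(b, h21)                    # Eq. (3) with h11 = delta
p2 = b + delayed(s, h12)                    # Eq. (4)

u1, u2 = backward_separation(p1, p2, w12=h12, w21=h21)
print(np.max(np.abs(u1 - s)), np.max(np.abs(u2 - b)))
```

For a general $h_{11}(n)$, the optimal $w_{12}(n)$ involves the inverse filter of $h_{11}(n)$, and the first output is the filtered signal of (11) rather than $s(n)$ itself.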

As depicted in Fig. 2 [3rd step], and since the echo signal is speech, we apply the AEC system only to the first output, i.e. $u_{1} \left( n \right)$. We thus obtain the following output signal relation:

$$u_{3} \left( n \right) = u_{1} \left( n \right) - s\left( n \right)*w_{13} \left( n \right)$$

(13)

Inserting (11) in (13), the AEC output signal can be written as:

$$u_{3} \left( n \right) = s\left( n \right)*\left( {h_{11} \left( n \right) - w_{13} \left( n \right)} \right)$$

(14)

To suppress the acoustic echo, the filter ${\text{w}}_{13} \left( {\text{n}} \right)$ must identify the impulse response ${\text{h}}_{11} \left( {\text{n}} \right)$, i.e. ${\text{w}}_{13} \left( {\text{n}} \right) = {\text{h}}_{11} \left( {\text{n}} \right)$.

2.3 Algorithms outline

In this analysis, we consider two algorithms to update the filter coefficients of the proposed system. The first one is the two-channel SFTF algorithm proposed in our previous work [25], which is used in the blind ANC block. The second one is the single-channel SFTF algorithm [26], which is used in the AEC block.

The output signals $u_{1} \left( n \right)$ and $u_{2} \left( n \right)$ of the blind ANC block are given in vector notation as follows:

$$u_{1} \left( n \right) = p_{1} \left( n \right) - \varvec{w}_{21}^{T} \left( n \right)\varvec{u}_{2} \left( n \right)$$

(15)

$$u_{2} \left( n \right) = p_{2} \left( n \right) - \varvec{w}_{12}^{T} \left( n \right)\varvec{u}_{1} \left( n \right)$$

(16)

where $\varvec{u}_{1} \left( n \right) = \left[ {u_{1} \left( n \right), \ldots ,u_{1} \left( {n - L + 1} \right)} \right]^{T}$ and $\varvec{u}_{2} \left( n \right) = \left[ {u_{2} \left( n \right), \ldots ,u_{2} \left( {n - L + 1} \right)} \right]^{T}$ are the two error vectors of the BBSS structure of Fig. 2 [2nd step], and $\varvec{w}_{21} \left( n \right) = \left[ {w_{21} \left( n \right), \ldots ,w_{21} \left( {n - L + 1} \right)} \right]^{T}$ and $\varvec{w}_{12} \left( n \right) = \left[ {w_{12} \left( n \right), \ldots ,w_{12} \left( {n - L + 1} \right)} \right]^{T}$ are the two filters, which are updated by the two-channel SFTF algorithm as follows:

$$\varvec{w}_{21} \left( {n + 1} \right) = \varvec{w}_{21} \left( n \right) - u_{1} \left( n \right)\gamma_{1} \left( n \right)\varvec{k}_{1} \left( n \right)$$

(17)

$$\varvec{w}_{12} \left( {n + 1} \right) = \varvec{w}_{12} \left( n \right) - u_{2} \left( n \right)\gamma_{2} \left( n \right)\varvec{k}_{2} \left( n \right)$$

(18)

The two-channel SFTF algorithm is obtained by eliminating the backward prediction process; thus, only the forward predictor is used to compute the dual Kalman gain vectors, i.e. $\varvec{k}_{1} \left( n \right)$ and $\varvec{k}_{2} \left( n \right)$. These two vectors can be written as:

$$\left[ {\begin{array}{*{20}c} {\varvec{k}_{1} \left( n \right)} \\ * \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ {\varvec{k}_{1} \left( {n - 1} \right)} \\ \end{array} } \right] - \frac{{\varepsilon_{1} \left( n \right)}}{{\lambda \alpha_{1} \left( {n - 1} \right) + \delta }} \left[ {\begin{array}{*{20}c} 1 \\ { - {\mathbf{a}}_{1} \left( {n - 1} \right)} \\ \end{array} } \right]$$

(19)

$$\left[ {\begin{array}{*{20}c} {\varvec{k}_{2} \left( n \right)} \\ * \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ {\varvec{k}_{2} \left( {n - 1} \right)} \\ \end{array} } \right] - \frac{{\varepsilon_{2} \left( n \right)}}{{\lambda \alpha_{2} \left( {n - 1} \right) + \delta }} \left[ {\begin{array}{*{20}c} 1 \\ { - {\mathbf{a}}_{2} \left( {n - 1} \right)} \\ \end{array} } \right]$$

(20)

where $\alpha_{1}$ and $\alpha_{2}$ are the variances of the forward prediction errors, given by the following relations:

$$\alpha_{1} \left( n \right) = \lambda \alpha_{1} \left( {n - 1} \right) + \gamma_{1} \left( {n - 1} \right) \varepsilon_{1}^{2} \left( n \right)$$

(21)

$$\alpha_{2} \left( n \right) = \lambda \alpha_{2} \left( {n - 1} \right) + \gamma_{2} \left( {n - 1} \right)\varepsilon_{2}^{2} \left( n \right)$$

(22)

We get the forward prediction coefficient vectors, i.e. ${\mathbf{a}}_{1} \left( {\text{n}} \right)$ and ${\mathbf{a}}_{2} \left( {\text{n}} \right)$, by minimizing the functions ${\text{E}}[{\text{u}}_{1} \left( {\text{n}} \right)]$ and ${\text{E}}[{\text{u}}_{2} \left( {\text{n}} \right)]$, respectively. The update formulas of the forward predictors ${\mathbf{a}}_{1} \left( {\text{n}} \right)$ and ${\mathbf{a}}_{2} \left( {\text{n}} \right)$ are given by:

$${\mathbf{a}}_{1} \left( n \right) = \rho \left[ {{\mathbf{a}}_{1} \left( {n - 1} \right) - \varepsilon_{1} \left( n \right)\gamma_{1} \left( n \right)\varvec{k}_{1} \left( {n - 1} \right)} \right]$$

(23)

$${\mathbf{a}}_{2} \left( n \right) = \rho \left[ {{\mathbf{a}}_{2} \left( {n - 1} \right) - \varepsilon_{2} \left( n \right)\gamma_{2} \left( n \right)\varvec{k}_{2} \left( {n - 1} \right)} \right]$$

(24)

The prediction errors $\varepsilon_{1} \left( n \right)$ and $\varepsilon_{2} \left( n \right)$ can be estimated as follows:

$$\varepsilon_{1} \left( n \right) = u_{2} \left( n \right) - {\mathbf{a}}_{1}^{T} \left( n \right)\varvec{u}_{2} \left( {n - 1} \right)$$

(25)

$$\varepsilon_{2} \left( n \right) = u_{1} \left( n \right) - {\mathbf{a}}_{2}^{T} \left( n \right)\varvec{u}_{1} \left( {n - 1} \right)$$

(26)

The likelihood variables $\gamma_{1} \left( n \right)$ and $\gamma_{2} \left( n \right)$ are calculated from the following definitions:

$$\gamma_{1} \left( n \right) = \frac{1}{{1 - \varvec{k}_{1}^{T} \left( n \right) \varvec{u}_{2} \left( n \right)}}$$

(27)

$$\gamma_{2} \left( n \right) = \frac{1}{{1 - \varvec{k}_{2}^{T} \left( n \right) \varvec{u}_{1} \left( n \right)}}$$

(28)
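The recursions (15)-(28) can be assembled into a sample-by-sample loop. The following is a minimal sketch under stated assumptions, not the authors' implementation (which is summarized in Table 1): the names `gain_step` and `bbss_sftf` are hypothetical; the filters act on past outputs only (one-sample delay in the cross paths, to avoid a delay-free loop); the prediction errors are computed with the predictor of the previous iteration; and the values of $\lambda$, $\rho$, $\delta$ and the initial $\alpha$ are illustrative choices for unit-variance white sources, not the values of Table 3.

```python
import numpy as np

def gain_step(x_new, x, k, a, alpha, gamma, lam, rho, delta):
    """Advance one forward-predictor / dual-Kalman-gain state by one sample.

    x holds the L previous input samples (newest first).
    """
    eps = x_new - a @ x                                   # cf. (25)-(26)
    scale = eps / (lam * alpha + delta)
    k_ext = np.concatenate(([0.0], k)) - scale * np.concatenate(([1.0], -a))  # (19)-(20)
    alpha = lam * alpha + gamma * eps ** 2                # (21)-(22), gamma is gamma(n-1)
    k_new = k_ext[:-1]                                    # drop the unused last element
    x = np.concatenate(([x_new], x[:-1]))
    gamma = 1.0 / (1.0 - k_new @ x)                       # (27)-(28)
    a = rho * (a - eps * gamma * k)                       # (23)-(24), k is k(n-1)
    return x, k_new, a, alpha, gamma

def bbss_sftf(p1, p2, L, lam=0.999, rho=0.98, delta=1e-6, alpha0=100.0):
    """Backward structure (15)-(16) with the filter updates (17)-(18)."""
    w21, w12 = np.zeros(L), np.zeros(L)
    st1 = (np.zeros(L), np.zeros(L), np.zeros(L), alpha0, 1.0)  # state driven by u2
    st2 = (np.zeros(L), np.zeros(L), np.zeros(L), alpha0, 1.0)  # state driven by u1
    u1, u2 = np.zeros(len(p1)), np.zeros(len(p2))
    for n in range(len(p1)):
        x2, k1, _, _, g1 = st1
        x1, k2, _, _, g2 = st2
        u1[n] = p1[n] - w21 @ x2                          # (15), past outputs only
        u2[n] = p2[n] - w12 @ x1                          # (16), past outputs only
        w21 = w21 - u1[n] * g1 * k1                       # (17)
        w12 = w12 - u2[n] * g2 * k2                       # (18)
        st1 = gain_step(u2[n], *st1, lam, rho, delta)
        st2 = gain_step(u1[n], *st2, lam, rho, delta)
    return u1, u2, w21, w12

def delayed(x, h):
    """Convolve x with h preceded by a one-sample delay."""
    return np.convolve(x, np.concatenate(([0.0], h)))[:len(x)]

rng = np.random.default_rng(1)
N = 20000
s, b = rng.standard_normal((2, N))          # synthetic white sources
h21 = np.array([0.5, -0.3, 0.1, 0.05])      # hypothetical cross paths
h12 = np.array([0.4, 0.2, -0.1, 0.05])
p1 = s + delayed(b, h21)                    # h11 assumed to be a unit impulse
p2 = b + delayed(s, h12)

u1, u2, w21, w12 = bbss_sftf(p1, p2, L=4)
print(np.linalg.norm(w21 - h21), np.linalg.norm(w12 - h12))
```

In this toy setting the two filters drift towards the cross paths, in line with (9) and (10); real speech, long room responses and the parameters of Table 3 would behave differently, so the sketch only illustrates the order of the operations.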

We now describe the single-channel SFTF algorithm that we use in the AEC stage. The a priori filtering error $u_{3} \left( n \right)$ and the adaptive filter $\varvec{w}_{13} \left( n \right)$ of this algorithm are given by:

$$u_{3} \left( n \right) = u_{1} \left( n \right) - \varvec{w}_{13}^{T} \left( n \right)\varvec{s}\left( n \right)$$

(29)

$$\varvec{w}_{13} \left( {n + 1} \right) = \varvec{w}_{13} \left( n \right) - u_{3} \left( n \right)\gamma_{3} \left( n \right)\varvec{k}_{3} \left( n \right)$$

(30)

where $\varvec{s}\left( n \right) = \left[ {s\left( n \right),s\left( {n - 1} \right), \ldots ,s\left( {n - L + 1} \right)} \right]^{T}$ is the vector of the last $L$ samples of the far-end signal, and $\varvec{k}_{3} \left( n \right)$ and $\gamma_{3} \left( n \right)$ are the Kalman gain vector and the likelihood variable, respectively. They are calculated as follows:

$$\left[ {\begin{array}{*{20}c} {\varvec{k}_{3} \left( n \right)} \\ * \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ {\varvec{k}_{3} \left( {n - 1} \right)} \\ \end{array} } \right] - \frac{{\varepsilon_{3} \left( n \right)}}{{\lambda \alpha_{3} \left( {n - 1} \right) + \delta }} \left[ {\begin{array}{*{20}c} 1 \\ { - {\mathbf{a}}_{3} \left( {n - 1} \right)} \\ \end{array} } \right]$$

(31)

$$\gamma_{3} \left( n \right) = \frac{1}{{1 - \varvec{k}_{3}^{T} \left( n \right)\varvec{ s}\left( n \right)}}$$

(32)

The forward prediction coefficient vector ${\mathbf{a}}_{3} \left( n \right)$ is calculated by minimizing the cost function $E\left[ {\varepsilon_{3}^{2} \left( n \right)} \right]$. The update formula of the forward predictor is given by the following relation:

$${\mathbf{a}}_{3} \left( n \right) = \rho \left[ {{\mathbf{a}}_{3} \left( {n - 1} \right) - \varepsilon_{3} \left( n \right)\gamma_{3} \left( n \right)\varvec{k}_{3} \left( {n - 1} \right)} \right]$$

(33)

The forward prediction error variance $\alpha_{3} \left( n \right)$ is given by:

$$\alpha_{3} \left( n \right) = \lambda \alpha_{3} \left( {n - 1} \right) + \gamma_{3} \left( {n - 1} \right) \varepsilon_{3}^{2} \left( n \right)$$

(34)

and the prediction error $\varepsilon_{3} \left( n \right)$ is calculated by the following relation:

$$\varepsilon_{3} \left( n \right) = s\left( n \right) - {\mathbf{a}}_{3}^{T} \left( n \right)\varvec{s}\left( {n - 1} \right)$$

(35)

The asterisk “$*$” represents the last, unused element of the dual Kalman gain vector. $\lambda$ is a smoothing factor, and $\delta$ is a small positive constant used to avoid division by very small values during silence periods. The parameter $\rho$ provides better robustness against the propagation of numerical errors.
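The single-channel recursions (29)-(35) translate into the following loop. This is a hedged sketch rather than the algorithm of Table 2: the function name `sftf_aec` is hypothetical, the prediction error is computed with the predictor of the previous iteration, and the default values of $\lambda$, $\rho$, $\delta$ and the initial $\alpha$ are illustrative choices for a unit-variance white input. The demo identifies a short synthetic echo path, assuming the ANC stage has already delivered $u_{1}(n) = s(n)*h_{11}(n)$ as in (11).

```python
import numpy as np

def sftf_aec(s, u1, L, lam=0.99, rho=0.98, delta=1e-6, alpha0=100.0):
    """Single-channel SFTF sketch; returns the residual u3 and the filter w13."""
    w = np.zeros(L)      # adaptive filter w13
    k = np.zeros(L)      # dual Kalman gain k3
    a = np.zeros(L)      # forward predictor a3
    x = np.zeros(L)      # far-end vector, newest sample first
    alpha, gamma = alpha0, 1.0
    u3 = np.zeros(len(s))
    for n in range(len(s)):
        eps = s[n] - a @ x                                # (35), x still holds s(n-1), ...
        scale = eps / (lam * alpha + delta)
        k_ext = np.concatenate(([0.0], k)) - scale * np.concatenate(([1.0], -a))  # (31)
        alpha = lam * alpha + gamma * eps ** 2            # (34), gamma is gamma(n-1)
        k_prev, k = k, k_ext[:L]                          # drop the unused last element
        x = np.concatenate(([s[n]], x[:-1]))              # now holds s(n), ..., s(n-L+1)
        gamma = 1.0 / (1.0 - k @ x)                       # (32)
        a = rho * (a - eps * gamma * k_prev)              # (33)
        u3[n] = u1[n] - w @ x                             # (29)
        w = w - u3[n] * gamma * k                         # (30)
    return u3, w

rng = np.random.default_rng(0)
s = rng.standard_normal(6000)               # synthetic white far-end signal
h11 = 0.5 * rng.standard_normal(8)          # hypothetical echo path
u1 = np.convolve(s, h11)[:len(s)]           # noise-free echo, cf. (11)

u3, w13 = sftf_aec(s, u1, L=8)
print(np.linalg.norm(w13 - h11) / np.linalg.norm(h11))
```

In this noise-free toy case the filter converges to the echo path, which is the condition $w_{13}(n) = h_{11}(n)$ required to suppress the echo.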

In Tables 1 and 2, the proposed cascade structure is summarized.

Table 1 1st proposed processing algorithm: two-channel SFTF algorithm

Table 2 2nd proposed processing algorithm: single-channel SFTF algorithm

3 Experimental results

3.1 Description of the used signals

The observed signals $p_{1} \left( n \right)$ and $p_{2} \left( n \right)$ are generated by the model of Fig. 2 [1st step]. These signals contain two statistically independent source signals: (1) the speech signal $s\left( n \right)$, a short, phonetically balanced French sentence of about 4 s pronounced by a male speaker, and (2) a punctual noise $b\left( n \right)$, which is USASI noise (United States of America Standards Institute, now ANSI). The impulse responses (IRs) $h_{12} \left( n \right)$ and $h_{21} \left( n \right)$ are generated by the model of [9] and have a length $L = 256$ (samples of these IRs are given in Fig. 4). These IRs are used to generate the observations, i.e. $p_{1} \left( n \right)$ and $p_{2} \left( n \right)$; samples of all these signals are given in Fig. 3. The sampling frequency is set to $f_{s} = 8\,{\text{kHz}}$ and the input SNR is set to $15\;{\text{dB}}$.

The black stepped curve superimposed on the original speech signal in Fig. 3a represents the manual voice activity detector (manual VAD). We use this system to extract the speech signal at the first output $u_{1} \left( n \right)$ of the ANC process. Note that all the reported results are averaged.

In order to evaluate the effect of the backward blind ANC process on the AEC stage, we compare in Fig. 5 the output signal obtained by the proposed structure with that obtained by the conventional AEC system. The conventional AEC system considered in our comparison is based on the adaptive single-channel SFTF algorithm. Fig. 5 shows the good behavior of the proposed structure in comparison with the conventional AEC system: both disturbances have been cancelled.

3.2 Objective criteria

In order to compare the proposed algorithm with the AEC based SFTF system, several experiments in different conditions were performed. We consider the echo return loss enhancement (ERLE) criterion given by [25]:

$$ERLE_{dB} = \frac{10}{M}\mathop \sum \limits_{m = 0}^{M - 1} log_{10} \left( {\frac{{\mathop \sum \nolimits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right)} \right|^{2} }}{{\mathop \sum \nolimits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right) - u_{3} \left( n \right)} \right|^{2} }}} \right)$$

(36)

and the mean square error (MSE) gain, computed as follows [25]:

$$MSE_{dB} = \frac{10}{M} \mathop \sum \limits_{m = 0}^{M - 1} log_{10} \left( {\frac{1}{N}\mathop \sum \limits_{n = Nm}^{Nm + N - 1} \left| {s\left( n \right) - u_{3} \left( n \right)} \right|^{2} } \right)$$

(37)

where $s\left( n \right)$ and $u_{3} \left( n \right)$ are respectively, the original speech signal and the output signal obtained by the proposed structure. The parameters $M$ and $N$ are the number of segments and the segment length, respectively.
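The two criteria can be computed segment by segment as written in (36) and (37). The sketch below is only an illustration: the function name `segmental_criteria` is hypothetical, incomplete trailing segments are discarded, and the small constant `eps` that guards the logarithm against zero-energy segments is an added assumption that does not appear in the definitions.

```python
import numpy as np

def segmental_criteria(s, u3, N, eps=1e-12):
    """Segmental ERLE (36) and MSE (37), both in dB, over M = len(s) // N segments."""
    s = np.asarray(s, dtype=float)
    u3 = np.asarray(u3, dtype=float)
    M = len(s) // N
    ref = s[:M * N].reshape(M, N)                       # one row per segment
    err = (s[:M * N] - u3[:M * N]).reshape(M, N)
    ref_energy = np.sum(ref ** 2, axis=1)
    err_energy = np.sum(err ** 2, axis=1)
    erle_db = 10.0 * np.mean(np.log10((ref_energy + eps) / (err_energy + eps)))
    mse_db = 10.0 * np.mean(np.log10(err_energy / N + eps))
    return erle_db, mse_db

# Toy check (hypothetical signals): a constant error of 0.1 on a unit signal.
s_demo = np.ones(8)
u3_demo = 0.9 * np.ones(8)
erle_demo, mse_demo = segmental_criteria(s_demo, u3_demo, N=4)
print(erle_demo, mse_demo)
```

In the toy check every segment has an energy ratio of 100 and a mean squared error of 0.01, so the two criteria are close to 20 dB and -20 dB, respectively.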

We have evaluated the performance of the proposed joint backward blind acoustic noise and echo cancellation system in comparison with the conventional AEC system based on the single-channel SFTF algorithm, and the results are reported in Figs. 6, 7 and 8. The simulation parameters of the proposed algorithms (selected by simulation) are summarized in Table 3, and the results are given for different test conditions and noise levels (high and low input SNR). In these experiments, three input SNRs, i.e. $5\,{\text{dB}}$, $20\,{\text{dB}}$ and $25\,{\text{dB}}$, are used. Furthermore, we have used three types of noise, i.e. USASI, white, and babble. The white noise is used to test the stability of the algorithms, the USASI noise is used to evaluate their convergence speed, and a real babble noise is used to evaluate their ability to track the non-stationarity of the input signal.

Table 3 Summary of the simulation parameters

According to Figs. 6, 7 and 8, we see that the proposed system gives better results in comparison with the AEC based SFTF system in terms of ERLE. This is due to the noise reduction improvement provided by the integrated backward blind ANC process that allows a better estimate of the acoustic echo. We can see that the proposed algorithm performs well even in high punctual noise (input $SNR = 5\,{\text{dB}}$), unlike the classical AEC system, where the algorithm is disturbed by the punctual noise present in the input signal.

To support the previous results, we have performed several other experiments in terms of MSE criterion and have selected one of them to evaluate the AEC performance of the proposed cascade structure in comparison with AEC based SFTF system. The simulation parameters are summarized in Table 3, and the obtained results for three input SNRs, i.e. $5\,{\text{dB}}$, $15\,{\text{dB}}$ and $25\,{\text{dB}}$, and three noise types, i.e. USASI, white and babble, are reported on Figs. 9, 10 and 11.

From the obtained results (which are averaged), we note the superiority of the proposed approach (cascade structure) over the AEC-based SFTF system in both the transient and steady-state regimes. This superiority is more pronounced when the input SNR is low. This is due to the backward blind ANC stage, which makes the AEC process of the proposed cascade structure more robust against acoustic echo components, whereas the AEC-based SFTF system is highly penalized by the presence of noise components. Finally, it can be concluded that the proposed cascade structure allows better cancellation of the acoustic echo and noise components at the same time.

3.2.1 A comparative study between the proposed approach, AEC NLMS [2], and joint AEC and ANC system [31]

In this section, the performance of the proposed approach is compared with the following algorithms: (1) the AEC based on NLMS [2], and (2) a joint AEC and ANC system that uses both the two-channel NLMS and the single-channel NLMS algorithms [31]. In this comparative study, the AEC NLMS algorithm [2] is taken as the reference algorithm. In this experiment, the real and adaptive filters have the same length, L = 256. The impulse response is that of a room and has about 256 points, so adaptive filters of the same length (256) model the room impulse response exactly. The parameters of each algorithm are summarized in Table 4.

Table 4 Parameter values of the following algorithms: (1) AEC based NLMS [2], (2) joint system [31], and (3) the proposed approach [in this paper]

The parameters in Table 4 are selected to get the best performance of each algorithm. The ERLE and MSE criteria were evaluated for different input SNRs and different noise types; a selection of the ERLE and MSE results is reported in Figs. 12 and 13, respectively. From these results, it can be concluded that the proposed approach behaves more efficiently than the AEC-based NLMS [2] and the joint AEC and ANC system of [31]. The superiority of our approach comes from the good behavior of the single- and two-channel SFTF algorithms that are integrated in the proposed joint AEC and ANC structure.

4 Conclusion

In this paper, we have proposed a novel algorithm for joint backward blind acoustic noise and echo cancellation systems. The proposed algorithm is based on two cascaded stages that cancel the punctual noise and then the acoustic echo signal. In the ANC stage, the BBSS structure is used to cancel the punctual noise; an AEC system is then used to suppress the echo signal. Both stages use the efficient SFTF algorithm in order to take advantage of its adaptive behaviour when combined with the proposed cascade system.

To evaluate the proposed system, we have carried out extensive tests in terms of the ERLE and MSE criteria under various conditions (highly and slightly noisy observations). In these experiments, the proposed approach is compared with an AEC-based SFTF system. The obtained results have shown the superiority of the proposed algorithm in terms of the objective criteria, even under low input SNR conditions (when more punctual noise is present). Finally, it can be concluded that the proposed system can be a good alternative to AEC techniques in the presence of strong punctual noise components, where the classical techniques fail.

References

1. Puder H, Dreiseitel P (2000) Implementation of a hands-free car phone with echo cancellation and noise-dependent loss control. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing, vol 6, pp 3622–3625
2. Widrow B, Stearns SD (1985) Adaptive signal processing. Prentice Hall, Upper Saddle River
3. Sondhi M, Kellermann W (1992) Adaptive echo cancellation for speech signal. In: Furui S, Sondhi M (eds) Chapter 11 in Advances in speech signal processing. Dekker, New York
4. Hänsler E, Schmidt GU (2004) Acoustic echo and noise control: a practical approach. Wiley, New York
5. Sakai Y, Tahir Akhtar M (2013) The performance of the acoustic echo cancelation using blind source separation to reduce double-talk interference. In: International symposium on intelligent signal processing and communication systems, Naha, Japan
6. Mader A, Puder H, Schmidt GU (2000) Step-size control for acoustic echo cancellation filters: an overview. Signal Process 80:1697–1719
7. Paleologu C, Benesty J, Ciochină S (2014) Widely linear general Kalman filter for stereophonic acoustic echo cancellation. Signal Process 94:570–575
8. Benesty J, Gänsler T, Morgan DR, Sondhi MM, Gay SL (2013) Advances in network and acoustic echo cancellation. Springer, Berlin
9. Djendi M, Scalart P, Gilloire A (2006) Noise cancellation using two closely spaced microphones: experimental study with a specific model and two adaptive algorithms. In: Proceedings of IEEE ICASSP, vol 3, pp 744–747
10. Gupta D, Gupta VK, Mishra AN (2017) Article: Pseudo affine projection algorithm based noise minimization from speech signals. In: IJCA proceedings on national conference on recent trends in electronics and electrical engineering NCRTEEE, no 1, pp 1–5
11. Gabrea M (2003) Double affine projection algorithm-based speech enhancement algorithm. In: Proceedings of the IEEE ICASSP Montréal, Canada, vol 2, pp 904–907
12. Al Kindi MJ, Dunlop J (1989) Improved adaptive noise cancellation in the presence of signal leakage on the noise reference channel. Signal Process 17(3):241–250
13. Van Gerven S, Van Compernolle D (1995) Signal separation by symmetric adaptive decorrelation: stability, convergence, and uniqueness. IEEE Trans Signal Process 74(3):1602–1612
14. Gupta VK, Gupta DK, Chandra M (2014) Real time noise canceller using modified sigmoid function RLS algorithm. In: Proceedings of international conference on computational vision and robotic (ICCVR-2014), 9th–10th, August, 2014 Bhubaneswar, India
15. Davis GM (2002) Noise reduction in speech applications. CRC Press, Boca Raton. ISBN 0-8493-0949-2
16. Buchner H, Aichner R, Kellermann W (2005) A generalization of blind source separation algorithms for convolutive mixtures based on second order statistics. IEEE Trans Speech Audio Process 13(1):120–134
17. Djendi M, Zoulikha M (2014) New automatic forward and backward blind sources separation algorithms for noise reduction and speech enhancement. Comput Electr Eng 40:2072–2088
18. Rupp M (2011) Pseudo affine projection algorithms robustness and stability analysis. IEEE Trans Signal Process 59:2017–2023
19. Gonzalez A, Ferrer M, Albu F, de Diego M (2012) Affine projection algorithm: evolution to smart and fast algorithms and application. In: 20th European signal processing conference, EURASIP, 2012. ISSN 2076-1465
20. Jeannès RLB, Faucon G (1997) How to reduce the noise influence in a joint system developed for echo and noise cancellation. IEEE Signal Process Lett 4(10):280–282
21. Kuo SM, Gan WS, Asthana P (2005) Integrated noise reduction and acoustic echo cancellation in hands-free systems. In: Proceedings of 2005 international symposium on intelligent signal processing and communication systems, Hong Kong
22. Hanshi SM, Chong YW, Ramadass S, Naeem AN, Ooi KC (2014) Efficient acoustic echo cancellation joint with noise reduction framework. In: International conference on computer, communication, and control technology (I4CT 2014), September 2–4, 2014, Langkawi, Kedah, Malaysia
23. Park SJ, Cho CG, Lee C, Youn DH (2002) Integrated echo and noise canceller for hands-free applications. IEEE Trans Circuits Syst II 49(3):186–195
24. Jeannes J, Scalart P, Faucon G, Beaugent C (2001) Combined noise and echo reduction in handsfree systems: a survey. IEEE Trans Speech Audio Process 9(8):808–820
25. Henni R, Djendi M, Djebari M (2019) A new efficient two-channel fast transversal adaptive filtering algorithm for blind speech enhancement and acoustic noise reduction. Comput Electr Eng 73:349–368
26. Djendi M (2015) New efficient adaptive fast transversal filtering (FTF)-type algorithms for mono and stereophonic acoustic echo cancelation. Int J Adapt Control Signal Process 29:273–301
27. Sundaradhas SN, Panchama moorthy SP, Ramapackiyam SSK (2020) Upgraded NLMS algorithm for speech enhancement with sparse and dispersive impulse responses. Indian J Phys. https://doi.org/10.1007/s12648-020-01688-5
28. Li H, Feng J, Wang Y, Zhang X (2019) A joint optimized robust acoustic echo cancellation algorithm. Int J Pattern Recognit Artif Intell 33(09):1959029
29. Sayoud A, Djendi M, Guessoum A (2019) A new speech enhancement adaptive algorithm based on fullband–subband MSE switching. Int J Speech Technol 22:993–1005
30. Hassani I, Arezki M, Benallal A (2020) A novel set membership fast NLMS algorithm for acoustic echo cancellation. Appl Acoust 163:107210
31. Henni R, Djendi M, Djebari M (2019) A new adaptive solution based on joint acoustic noise and echo cancellation for hands-free systems. Int J Speech Technol 22:407–420

Acknowledgements

This study was carried out without funding. The authors are grateful to Blida University, Algeria, and to Professor Abderreak Guessoum, Professor at Blida University, Algeria, and Director of the Signal Processing and Imaging Laboratory (LATSI), for providing the infrastructure to carry out this work.

Author information

Authors and Affiliations

Laboratory of Signal Processing and Imaging (LATSI), University of Blida 1, Route de Soumaa, B.P. 270, 09000, Blida, Algeria
Rahima Henni & Mohamed Djendi
Detection, Information and Communication (DIC) Laboratory, University of Blida 1, Route de Soumaa, B.P. 270, 09000, Blida, Algeria
Rahima Henni

Corresponding author

Correspondence to Mohamed Djendi.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Henni, R., Djendi, M. A novel cascade structure for joint backward blind acoustic noise and echo cancellation systems. SN Appl. Sci. 2, 1155 (2020). https://doi.org/10.1007/s42452-020-2942-6

Received: 02 November 2019
Accepted: 21 May 2020
Published: 02 June 2020
DOI: https://doi.org/10.1007/s42452-020-2942-6
