1 Introduction

Audio noise removal techniques aim to eliminate acoustic noise from a speech signal and thereby enhance its quality. There has been a large body of research on acoustic noise cancellation (ANC) using adaptive filters [1–14]. The original idea of noise cancellation via an adaptive algorithm was first proposed by Bernard Widrow [1]. Since then, several adaptive cancellation algorithms have been proposed to address stability, misadjustment, and computational complexity.

In [2], the least-mean-squares (LMS) algorithm with a variable step size is proposed, in which the adaptive weights are frozen when the target signal is strong. In [3], a robust variable step-size normalized LMS algorithm is presented: a constrained optimization problem is derived by minimizing the \(l_{2}\) norm of the error signal with a constraint on the filter weights. A novel LMS-based adaptive algorithm for the ANC application is introduced in [4], which applies nonlinearities to the input and error signals using the Lagrange multiplier method.

In some ANC applications, there is a secondary path between the output of the adaptive filter and the error signal. The secondary path causes phase shifts or delays in signal transmission, which conventional LMS algorithms cannot compensate for. For those applications, the filtered-x LMS (FXLMS) algorithm was proposed [5]. The adaptive Volterra filtered-x least-mean-squares (VFXLMS) algorithm was derived in [6] for ANC applications with nonlinear effects. The adaptive filtered-s least-mean-square (FSLMS) algorithm proposed in [7] achieves better noise cancellation performance with less computational complexity than the second-order VFXLMS algorithm. In [8], an adaptive noise cancellation scheme based on multiple sub-filters is studied, which improves the convergence performance but deteriorates the steady-state performance compared to single-filter approaches. A novel adaptive algorithm for cancelling residual echo is proposed in [9], where the complex-valued residual echo is estimated and corrected. Unlike conventional single-channel echo cancellation algorithms, this approach considers both the amplitude and the phase of the far-end signal. The idea of multichannel acoustic echo cancellation is addressed in [10] using multiple-input multiple-output (MIMO) adaptive filtering. In [11], multiple-reference adaptive filtering for ANC in propeller aircraft is studied. A multichannel structure for ANC is proposed in [12], where a filtered-x affine projection algorithm is derived for noise cancellation. An adaptive Bayesian algorithm is proposed in the frequency domain [13] to address the multichannel acoustic echo cancellation problem; the echo paths between the loudspeakers and the near-end microphone are modelled as a multichannel random variable with a first-order Markov property.
Multi-microphone speech enhancement methods are applied to remove background noise and undesired echoes in order to achieve high-quality speech [14].

In this paper, we propose a novel structure for the ANC application in which multiple adaptive filters are used to eliminate the acoustic noise from the speech signal. We measure the noise source with multiple microphones and use a bank of time-domain adaptive filters to remove the noise from the speech. Since the measurement noises of the microphones are uncorrelated with each other, we can enhance the filtering performance by using a linear combination of the error signals. The steady-state mean-square deviation (MSD) performance is also analysed. The theoretical findings are verified by computer simulation results.

The remainder of this paper is organized as follows. In Sect. 2, the problem formulation of ANC is introduced. In Sect. 3, we introduce our proposed time-domain adaptive filter bank for cancelling the noise signal. Section 4 presents the theoretical performance analysis of the proposed algorithm. Computer simulation results are presented in Sect. 5. Finally, Sect. 6 concludes the paper.

Notations: In this paper, small letters with a subscript, e.g. \(x_{i}\), denote vectors; capital letters, e.g. \(R_{i}\), denote matrices; and small letters with parentheses, e.g. x(i), denote scalars. The superscript \(x^{T}\) represents the transpose of a matrix or vector. All vectors are column vectors. Random variables are displayed in boldface; e.g. random scalars, vectors, and matrices are denoted by \(\varvec{x}(i)\), \(\varvec{w}_{i}\), and \(\varvec{R}_{i} \), respectively. The \(\textrm{Tr}\big (.\big )\) symbol denotes the trace operator, and the \(\textrm{E}\big (.\big )\) symbol denotes the expectation operator. The \(\textrm{vec}(.)\) operator stacks the columns of a matrix on top of each other into a vector. The \(\lambda _\textrm{max}(.)\) denotes the largest eigenvalue of its matrix argument.

2 Problem formulation

Figure 1 shows the configuration of a noise cancellation system. It consists of one adaptive filter \(\varvec{w}_{i}\) and two microphones, MIC1 and MIC2. MIC1 captures the noise source signal and MIC2 captures the speech signal. Both microphones' recordings are contaminated by measurement noise. The measurement noise of MIC1 is denoted by \(\varvec{v}(i)\) and the measurement noise of MIC2 by \(\varvec{z}(i)\). Assume that the noise source signal \(\varvec{x}(i)\) passes through the acoustic channel \(w^{o}\) with the following impulse response

$$\begin{aligned} \sum _{i=0}^{M-1} w^{o}(i) \delta (n-i). \end{aligned}$$
(1)

By sorting the channel coefficients \(w^{o}(i)\) into the column vector \(w^{o}\), the acoustic channel impulse response can be expressed as,

$$\begin{aligned} w^{o}=\big [ w^{o}(0),w^{o}(1),\dots ,w^{o}(M-1) \big ]^{T} \end{aligned}$$
(2)

where \((.)^{T}\) represents the transpose operator. According to Fig. 1, the second microphone, MIC2, measures the sum of the channel output and the speech signal \(\varvec{s}(i)\). Therefore, the desired signal at the adaptive filter is obtained as

$$\begin{aligned} \varvec{d}(i)=\varvec{x}_{i}^{T}w^{o}+\varvec{s}(i) +\varvec{z}(i), \end{aligned}$$
(3)

where \(\varvec{z}(i)\) is the measurement noise of the second microphone and

$$\begin{aligned} \varvec{x}_{i}=\big [ \varvec{x}(i), \varvec{x}(i-1), \dots , \varvec{x}(i-M+1) \big ]^{T}, \end{aligned}$$
(4)

is the vector form of the input signal \(\varvec{x}(i)\) which is measured by the first microphone MIC1.

The input signal of the adaptive filter is

$$\begin{aligned} \varvec{u}(i)=\varvec{x}(i)+\varvec{v}(i), \end{aligned}$$
(5)

where \(\varvec{v}(i)\) is the measurement noise of the first microphone MIC1. The output signal of the adaptive filter is denoted by \(\varvec{y}(i)\), that is

$$\begin{aligned} \varvec{y}(i)= \varvec{u}_{i}^{T} \varvec{w}_{i}, \end{aligned}$$
(6)

where \(\varvec{u}_{i}\) is a vector of input signal \(\varvec{u}(i)\) as follows:

$$\begin{aligned} \varvec{u}_{i} = \big [\varvec{u}(i), \varvec{u}(i-1), \dots , \varvec{u}(i-M+1) \big ]^{T}, \end{aligned}$$
(7)

and \(\varvec{w}_{i}\) is a vector of filter weights

$$\begin{aligned} \varvec{w}_{i} = \big [\varvec{w}_{0}(i),\varvec{w}_{1}(i), \dots , \varvec{w}_{M-1}(i) \big ]^{T}. \end{aligned}$$
(8)

The error signal \(\varvec{e}(i)\) is obtained by subtracting the filter output signal \(\varvec{y}(i)\) from the desired signal \(\varvec{d}(i)\) as

$$\begin{aligned} \varvec{e}(i)=\varvec{d}(i)-\varvec{y}(i). \end{aligned}$$
(9)

The LMS algorithm is used to estimate the acoustic channel impulse response \(w^{o}\) by updating the weight vector \(\varvec{w}_{i}\) of the adaptive filter [15] as

$$\begin{aligned} \varvec{w}_{i+1} = \varvec{w}_{i} + \mu \varvec{e}(i) \varvec{u}_{i}, \end{aligned}$$
(10)

where \(\mu \) is the step-size parameter. After a sufficiently long adaptation time, the adaptive filter weight vector \(\varvec{w}_{i}\) converges towards the optimal vector \(w^{o}\). As a result, the filter’s output signal is

$$\begin{aligned} \varvec{y}(i) \approx \varvec{u}_{i}^{T} w^{o}. \end{aligned}$$
(11)

Therefore, according to (3), (5), (9), and (11), the error signal is

$$\begin{aligned} \varvec{e}(i) \approx \varvec{s}(i)+\varvec{z}(i) -\varvec{v}_{i}^{T}w^{o}. \end{aligned}$$
(12)

In the ideal case, where the measurement noises \(\varvec{z}(i)\) and \(\varvec{v}_{i}\) are zero, the error signal \(\varvec{e}(i)\) is a good estimation of the noise-free speech signal, such that

$$\begin{aligned} \varvec{e}(i) \approx \varvec{s}(i). \end{aligned}$$
(13)
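The single-filter canceller of eqs. (3)–(10) can be sketched in a few lines. The channel, signal powers, and step size below are illustrative stand-ins (the speech is replaced by a Gaussian surrogate), not values taken from this paper:

```python
import numpy as np

# Illustrative single-filter LMS noise canceller following eqs. (3)-(10).
rng = np.random.default_rng(0)
M, T, mu = 4, 20000, 0.01
w_true = np.ones(M) / M            # stands in for the acoustic channel w^o
x = rng.standard_normal(T)         # noise source x(i)
s = 0.1 * rng.standard_normal(T)   # Gaussian surrogate for the speech s(i)
v = 0.05 * rng.standard_normal(T)  # MIC1 measurement noise v(i)
z = 0.03 * rng.standard_normal(T)  # MIC2 measurement noise z(i)
u = x + v                          # adaptive-filter input, eq. (5)

w = np.zeros(M)
e = np.zeros(T)
for i in range(M, T):
    xi = x[i:i - M:-1]             # regressor [x(i), ..., x(i-M+1)], eq. (4)
    ui = u[i:i - M:-1]             # noisy-input regressor, eq. (7)
    d = xi @ w_true + s[i] + z[i]  # desired signal at MIC2, eq. (3)
    e[i] = d - ui @ w              # error signal, eq. (9)
    w += mu * e[i] * ui            # LMS update, eq. (10)

# After convergence, e(i) ≈ s(i) + z(i) - v_i^T w^o, cf. eq. (12).
resid = e[-5000:] - s[-5000:]
print(np.mean(resid**2))
```

With the chosen powers, the steady-state residual is dominated by \(\varvec{z}(i)\) and the leaked term \(\varvec{v}_{i}^{T}w^{o}\), in line with (12).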

In order to decrease the effect of measurement noise, we can use multiple microphones with uncorrelated measurement noises. In this paper, we propose a novel structure for the ANC application in which we alleviate the effect of input measurement noise by using multiple microphones to measure the noise source.

Fig. 1

The acoustic noise cancellation (ANC) block diagram

3 A bank of parallel adaptive filters

In this section, we propose a novel structure for noise cancellation in which there are multiple adaptive filters each of which is connected to a microphone to capture the noise source signal \(\varvec{x}(i)\) as shown in Fig. 2. The LMS adaptation rule of each adaptive filter is as follows:

$$\begin{aligned} \varvec{w}_{k,i+1} = \varvec{w}_{k,i} + \mu \varvec{e}_{k}(i) \varvec{u}_{k,i}, \end{aligned}$$
(14)

where \(\varvec{w}_{k,i}\) is the kth adaptive filter weight vector at time instant i, and \(\varvec{u}_{k,i}\) is the input signal of the kth adaptive filter. Since the measurement noises \(\varvec{v}_{k}(i)\) are uncorrelated, we can reduce the noise power by linearly combining the error signals \(\varvec{e}_{k}(i)\) as

$$\begin{aligned} \varvec{e}_{A}(i) = \frac{1}{N}\sum _{k=1}^{N}\varvec{e}_{k}(i) \approx \hat{\varvec{s}}(i), \end{aligned}$$
(15)

where \(\varvec{e}_{A}(i) \) is the average error signal of the adaptive filter bank, which is approximately equal to the noise-free speech signal. To compute the error signal \(\varvec{e}_{k}(i)\) of the kth adaptive filter, we write

$$\begin{aligned} \varvec{e}_{k}(i) = \varvec{d}(i) - \varvec{y}_{k}(i), \end{aligned}$$
(16)

where \(\varvec{d}(i)\) is defined in (3) and \(\varvec{y}_{k}(i)\) is the output signal of kth adaptive filter and given as

$$\begin{aligned} \varvec{y}_{k}(i)=\varvec{u}_{k,i}^{T} \varvec{w}_{k,i}, \quad k=1,\dots ,N. \end{aligned}$$
(17)

Inserting (3) and (17) in (16), we obtain the kth error signal

$$\begin{aligned} \varvec{e}_{k}(i) =\varvec{s}(i)+\varvec{z}(i) + \varvec{u}_{k,i}^{T}\tilde{ \varvec{w} }_{k,i} - \varvec{v}_{k,i}^{T}w^{o}, \end{aligned}$$
(18)

where we define the error vector \(\tilde{ \varvec{w} }_{k,i}\) as follows:

$$\begin{aligned} \tilde{\varvec{w} }_{k,i} \triangleq w^{o} - \varvec{w}_{k,i}. \end{aligned}$$
(19)

By substituting (18) into (15), we have

$$\begin{aligned} \varvec{e}_{A}(i) =\varvec{s}(i) +\varvec{z}(i) + \frac{1}{N}\sum _{k=1}^{N} \varvec{u}_{k,i}^{T}\tilde{ \varvec{w} }_{k,i} - \frac{1}{N}\sum _{k=1}^{N} \varvec{v}_{k,i}^{T}w^{o}.\nonumber \\ \end{aligned}$$
(20)
Fig. 2

Adaptive filter bank for ANC application

The variance of the average error signal is equal to

$$\begin{aligned} \textrm{E} \vert \varvec{e}_{A}(i) \vert ^2= & {} \sigma _{s}^2 +\sigma _{z}^2 + \Vert w^{o} \Vert ^2_{\frac{1}{N}R_{v,k}}\nonumber \\{} & {} + \frac{1}{N^2}\sum _{k=1}^{N} \textrm{E} \Vert \tilde{\varvec{w}}_{k,i} \Vert ^2_{R_{u,k}}, \end{aligned}$$
(21)

where the input signals \(\varvec{u}_{k,i}\) and the noise vectors \(\varvec{v}_{k,i}\) are random variables with the following \(M\times M\) covariance matrices

$$\begin{aligned} R_{u,k}= & {} \textrm{E}\big [\varvec{u}_{k,i}\varvec{u}_{k,i}^{T}\big ] =R_{x}+R_{v,k}, \nonumber \\ R_{v,k}= & {} \textrm{E}\big [\varvec{v}_{k,i}\varvec{v}_{k,i}^{T}\big ], \quad R_{x}=\textrm{E}\big [\varvec{x}_{i}\varvec{x}_{i}^{T}\big ]. \end{aligned}$$
(22)

The noise scalar \(\varvec{z}(i)\) is a zero-mean Gaussian random variable with variance \(\sigma _{z}^2\), and the speech signal \(\varvec{s}(i)\) has variance \(\sigma _{s}^2\):

$$\begin{aligned} \sigma _{z}^{2}= \textrm{E} \vert z(i) \vert ^2, \quad \sigma _{s}^{2} =\textrm{E} \vert s(i) \vert ^2. \end{aligned}$$
(23)

It should be noted that the following signals are mutually independent, since they are generated by separate physical sources:

$$\begin{aligned} \varvec{x}_{i} \perp \varvec{v}_{k,i} \perp \varvec{z}(i) \perp \varvec{s}(i) \end{aligned}$$
(24)

In order to calculate the signal-to-noise ratio (SNR) of the enhanced signal, we need the variance of the local error signal (18), namely

$$\begin{aligned} \textrm{E} \vert \varvec{e}_{k}(i) \vert ^2 = \sigma _{s}^2 + \sigma _{z}^2 + \Vert w^{o} \Vert ^2_{R_{v,k}} + \textrm{E} \Vert \tilde{ \varvec{w} }_{k,i} \Vert ^2_{R_{u,k}}. \end{aligned}$$
(25)

Therefore, we can obtain the SNR of local error signal as

$$\begin{aligned} \textbf{SNR}_{k} = \frac{\sigma _{s}^2}{\sigma _{z}^2 + \Vert w^{o} \Vert ^2_{R_{v,k}} + \textrm{E} \Vert \tilde{ \varvec{w} }_{k,i} \Vert ^2_{R_{u,k}}}. \end{aligned}$$
(26)

On the other hand, according to (21), the SNR of the average error signal (20) is

$$\begin{aligned} \textbf{SNR}_{A} = \frac{\sigma _{s}^2}{\sigma _{z}^2 + \Vert w^{o} \Vert ^2_{\frac{1}{N}R_{v,k}} + \frac{1}{N^2}\sum _{k=1}^{N} \textrm{E} \Vert \tilde{\varvec{w} }_{k,i} \Vert ^2_{R_{u,k}}}. \end{aligned}$$
(27)

Comparing the SNR of the local error signal (26) with the SNR of the average error signal (27), we observe that the denominator of the SNR expression is smaller in our proposed ANC structure. In other words, the linear combination of the local error signals leads to a better output SNR.
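The SNR gain predicted by (27) can be illustrated with a short simulation of the N-branch bank of eqs. (14)–(15). The signal powers below are illustrative (the branch noise power matches the \(R_{v,k}=0.029I_{M}\) used later in Sect. 5):

```python
import numpy as np

# Illustrative N-branch adaptive filter bank, eqs. (14)-(15).
rng = np.random.default_rng(1)
M, T, mu, N = 4, 20000, 0.01, 7
w_true = np.ones(M) / M
x = rng.standard_normal(T)                        # common noise source x(i)
s = 0.1 * rng.standard_normal(T)                  # Gaussian surrogate for speech
z = 0.03 * rng.standard_normal(T)                 # MIC2 measurement noise z(i)
v = np.sqrt(0.029) * rng.standard_normal((N, T))  # uncorrelated v_k(i)
u = x[None, :] + v                                # branch inputs u_k(i) = x(i) + v_k(i)

W = np.zeros((N, M))
e = np.zeros((N, T))
for i in range(M, T):
    xi = x[i:i - M:-1]
    d = xi @ w_true + s[i] + z[i]                 # common desired signal, eq. (3)
    for k in range(N):
        uk = u[k, i:i - M:-1]
        e[k, i] = d - uk @ W[k]
        W[k] += mu * e[k, i] * uk                 # per-branch LMS update, eq. (14)

eA = e.mean(axis=0)                               # averaged error e_A(i), eq. (15)
tail = slice(-5000, None)
mse_local = np.mean((e[0, tail] - s[tail]) ** 2)  # one branch alone
mse_avg = np.mean((eA[tail] - s[tail]) ** 2)      # proposed combination
print(mse_local, mse_avg)
```

Averaging suppresses the uncorrelated \(\varvec{v}_{k,i}^{T}w^{o}\) terms by roughly a factor of N, while the common \(\varvec{z}(i)\) floor remains, consistent with the denominators of (26) and (27).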

4 Performance analysis

In this section, the behaviour of the error vector \(\tilde{ \varvec{w} }_{k,i}\) in the mean sense and in the mean-square sense is analysed. By substituting Eq. (18) into the adaptation rule (14), we have

$$\begin{aligned} \varvec{w}_{k,i+1} =\varvec{w}_{k,i}+ \mu \varvec{u}_{k,i} \big ( \varvec{s}(i)+\varvec{z}(i) +\varvec{u}_{k,i}^{T}\tilde{\varvec{w}}_{k,i} -\varvec{v}_{k,i}^{T}w^{o} \big ).\nonumber \\ \end{aligned}$$
(28)

By subtracting (28) from \(w^{o}\), the error vector of kth adaptive filter is obtained as

$$\begin{aligned} \tilde{\varvec{w}}_{k,i+1}= & {} \varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} -\mu \varvec{u}_{k,i}\varvec{s}(i) \nonumber \\{} & {} -\mu \varvec{u}_{k,i}\varvec{z}(i)+\mu \varvec{u}_{k,i}\varvec{v}_{k,i}^{T} w^{o}, \end{aligned}$$
(29)

where \(I_{M}\) is the identity matrix with dimension \(M\times M\) and

$$\begin{aligned} \varvec{B}_{k,i}= I_{M}-\mu \varvec{u}_{k,i} \varvec{u}_{k,i}^{T}. \end{aligned}$$
(30)

4.1 The mean analysis

By computing the expected value of both sides of (29), we have

$$\begin{aligned} \textrm{E}[ \tilde{\varvec{w}}_{k,i+1}]= & {} \textrm{E} [\varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} ] - \mu \textrm{E} [\varvec{u}_{k,i} \varvec{s}(i)]\nonumber \\{} & {} - \mu \textrm{E} [\varvec{u}_{k,i} \varvec{z}(i)] + \mu \textrm{E} [\varvec{u}_{k,i} \varvec{v}_{k,i}^{T}w^{o}]. \end{aligned}$$
(31)

Since the measurement noises \(\varvec{v}_{k,i}\) and \(\varvec{z}(i)\), as well as the noise source \(\varvec{x}_{i}\), are zero-mean and mutually independent, as stated in (24), we can simplify Eq. (31) as

$$\begin{aligned} \textrm{E}[\tilde{\varvec{w}}_{k,i+1}]= & {} B_{k} \textrm{E} [\tilde{\varvec{w}}_{k,i} ] - \mu \textrm{E} [\varvec{u}_{k,i}] \textrm{E} [\varvec{s}(i)]\nonumber \\{} & {} - \mu \textrm{E} [\varvec{u}_{k,i}] \textrm{E} [\varvec{z}(i)]\nonumber \\{} & {} + \mu \textrm{E} [ (\varvec{x}_{i} +\varvec{v}_{k,i}) \varvec{v}_{k,i}^{T}] w^{o}, \end{aligned}$$
(32)

where we have defined

$$\begin{aligned}&B_{k}\triangleq \textrm{E}[\varvec{B}_{k,i}]=I_{M}-\mu R_{u,k}, \end{aligned}$$
(33)
$$\begin{aligned}&\varvec{u}_{k,i}\triangleq \varvec{x}_{i} + \varvec{v}_{k,i}. \end{aligned}$$
(34)

Since the error vector \(\tilde{\varvec{w}}_{k,i}\) depends on the data up to time \(i-1\), while the matrix \(\varvec{B}_{k,i}\) is a function of the input data at time i, we adopt the standard independence assumption between \(\varvec{B}_{k,i}\) and \(\tilde{\varvec{w}}_{k,i}\) in Eq. (31). The mean equation of the error vector can then be further simplified as

$$\begin{aligned} \textrm{E}[ \tilde{\varvec{w}}_{k,i+1}] = B_{k} \textrm{E} [ \tilde{\varvec{w}}_{k,i}]+ \mu R_{v,k}w^{o}, \end{aligned}$$
(35)

where we use the following assumptions

$$\begin{aligned}{} & {} \textrm{E} [\varvec{u}_{k,i}] = \textrm{E}[\varvec{x}_{i}] +\textrm{E}[\varvec{v}_{k,i}] = 0, \nonumber \\{} & {} \textrm{E} [ \varvec{z}(i)]=0, \nonumber \\{} & {} \textrm{E}[\varvec{x}_{i} \varvec{v}_{k,i}^{T}] =\textrm{E}[\varvec{x}_{i} ] \textrm{E}[\varvec{v}_{k,i}^{T} ] = 0.\nonumber \\ \end{aligned}$$
(36)

By taking the limit of the mean relation (35) in the steady-state regime, as \(i\rightarrow \infty \), and using the approximation

$$\begin{aligned} \lim _{i \rightarrow \infty } \textrm{E} [ \tilde{\varvec{w}}_{k,i+1}] \simeq \lim _{i \rightarrow \infty } \textrm{E} [ \tilde{\varvec{w}}_{k,i} ], \end{aligned}$$
(37)

we can collect the two expectation terms, and the mean relation of the error vector results in

$$\begin{aligned} \lim _{i \rightarrow \infty } \textrm{E}[ \tilde{\varvec{w}}_{k,i}] =\mu \big ( I_{M} - B_{k} \big )^{-1} R_{v,k}w^{o}. \end{aligned}$$
(38)

This relation shows that when there is measurement noise \(\varvec{v}_{k,i}\) in the ANC problem, the mean of the error vector is nonzero and the estimate is therefore biased. The bias is directly related to the measurement noise power \(R_{v,k}\). We can reduce the bias by linearly combining the error signals of the adaptive filters, since the effective noise power decreases from \(R_{v,k}\) to \(\frac{1}{N}R_{v,k}\).
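Since \(I_{M}-B_{k}=\mu R_{u,k}\), the bias (38) simplifies to \(R_{u,k}^{-1}R_{v,k}w^{o}\), independent of the step size. The sketch below checks this numerically for white signals; the noise power is deliberately exaggerated so the bias stands out above the weight fluctuations:

```python
import numpy as np

# Numerical check of the steady-state bias (38) for a single branch.
rng = np.random.default_rng(2)
M, T, mu = 4, 200000, 0.01
w_true = np.ones(M) / M
sig_v2 = 0.1                               # exaggerated input noise power
x = rng.standard_normal(T)
v = np.sqrt(sig_v2) * rng.standard_normal(T)
u = x + v

R_u = (1 + sig_v2) * np.eye(M)             # R_uk = R_x + R_vk (white signals)
R_v = sig_v2 * np.eye(M)
bias_theory = np.linalg.solve(R_u, R_v @ w_true)   # eq. (38)

w = np.zeros(M)
w_sum = np.zeros(M)
count = 0
for i in range(M, T):
    xi = x[i:i - M:-1]
    ui = u[i:i - M:-1]
    e = (xi @ w_true) - ui @ w             # no speech or z(i): isolates the bias
    w += mu * e * ui
    if i > T // 2:                         # time-average the steady-state half
        w_sum += w
        count += 1

bias_sim = w_true - w_sum / count          # estimate of the steady-state E[w~]
print(bias_theory, bias_sim)
```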

4.2 The mean-square analysis

By inserting (34) into (29), we can rewrite the error vector as

$$\begin{aligned} \tilde{\varvec{w}}_{k,i+1}= & {} \varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} -\mu \varvec{u}_{k,i} \varvec{s}(i) -\mu \varvec{u}_{k,i}\varvec{z}(i)\nonumber \\{} & {} +\, \mu \varvec{x}_{i}\varvec{v}_{k,i}^{T} w^{o} + \mu \varvec{v}_{k,i}\varvec{v}_{k,i}^{T} w^{o}. \end{aligned}$$
(39)

In this section, we study the behaviour of the error vector variance by analysing the mean of the weighted squared norm, given as

$$\begin{aligned} \textrm{E} \Vert \tilde{\varvec{w}}_{k,i+1} \Vert _{\Sigma }^2= & {} \textrm{E} \Big [ \tilde{ \varvec{w} }_{k,i+1}^{T} \Sigma \tilde{\varvec{w}}_{k,i+1} \Big ] \nonumber \\= & {} \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} \Big ) \nonumber \\{} & {} + \, \mu ^2 \textrm{E} \Big ( \varvec{s}(i) \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{s}(i) \Big ) \nonumber \\{} & {} + \, \mu ^2 \textrm{E} \Big ( \varvec{z}(i) \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{z}(i) \Big ) \nonumber \\{} & {} + \, \mu ^2 \textrm{E} \Big ( w^{oT}\varvec{v}_{k,i}\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i}\varvec{v}_{k,i}^{T} w^{o}\Big ) \nonumber \\{} & {} + \, \mu ^2 \textrm{E} \Big ( w^{oT}\varvec{v}_{k,i} \varvec{v}_{k,i}^{T} \Sigma \varvec{v}_{k,i} \varvec{v}_{k,i}^{T} w^{o} \Big ) \nonumber \\{} & {} + \, \mu \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{v}_{k,i} \varvec{v}_{k,i}^{T} w^{o} \Big ), \end{aligned}$$
(40)

in which other terms are discarded because \(\varvec{v}_{k,i}\), \( \varvec{z}(i)\) and \(\varvec{x}_{i}\) are mutually independent and zero-mean samples. According to the LMS adaptation rule, the estimation vector \(\varvec{w}_{k,i}\) depends on the data up to time \(i-1\), \(\varvec{u}_{k,i-1}\). Thus, we can assume that the error vector \(\tilde{\varvec{w}}_{k,i}\) is independent of the input data \(\varvec{u}_{k,i}\). Therefore, we conclude that the error vector \(\tilde{\varvec{w}}_{k,i}\) is independent of matrix \(\varvec{ B}_{k,i}\), where \(\varvec{ B}_{k,i}\) is defined in (30). Thus, the first expectation term in (40) is

$$\begin{aligned}{} & {} \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} \Big ) \nonumber \\{} & {} \quad =\textrm{E} \Big [ \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{B}_{k,i} \tilde{\varvec{w}}_{k,i} \Big ) | \tilde{\varvec{w}}_{k,i}\Big ]\nonumber \\{} & {} \quad = \textrm{E} \Big [ \tilde{\varvec{w}}_{k,i}^{T} \textrm{E} \Big ( \varvec{B}_{k,i}^{T} \Sigma \varvec{B}_{k,i} \Big ) \tilde{\varvec{w}}_{k,i} | \tilde{\varvec{w}}_{k,i} \Big ] \nonumber \\{} & {} \quad = \textrm{E} \Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{\Sigma ^{\prime }}, \end{aligned}$$
(41)

where the weighting matrix \(\Sigma ^{\prime }\) is defined as follows:

$$\begin{aligned} \Sigma ^{\prime }=\textrm{E} \Big ( \varvec{B}_{k,i}^{T} \Sigma \varvec{B}_{k,i} \Big ). \end{aligned}$$
(42)

In linear algebra, we have the following property for Kronecker product [16]:

$$\begin{aligned} \textrm{vec}(A \Sigma B)= (B^{T} \otimes A) \sigma , \end{aligned}$$
(43)

where

$$\begin{aligned} \sigma = \textrm{vec}(\Sigma ). \end{aligned}$$
(44)
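This identity is easy to verify numerically. Note that \(\textrm{vec}(.)\) stacks columns, which corresponds to Fortran-order flattening in NumPy; the matrix sizes below are arbitrary:

```python
import numpy as np

# Check of the column-stacking identity vec(A @ Sigma @ B) = kron(B.T, A) @ vec(Sigma).
rng = np.random.default_rng(3)
M = 4
A = rng.standard_normal((M, M))
Sigma = rng.standard_normal((M, M))
B = rng.standard_normal((M, M))

vec = lambda X: X.reshape(-1, order="F")   # stack the columns of X
lhs = vec(A @ Sigma @ B)
rhs = np.kron(B.T, A) @ vec(Sigma)
print(np.max(np.abs(lhs - rhs)))
```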

Applying the \(\textrm{vec}(.)\) operation to both sides of (42), we have

$$\begin{aligned} \sigma ^{\prime } =\mathcal {F} \sigma , \end{aligned}$$
(45)

where

$$\begin{aligned} \mathcal {F} \triangleq \textrm{E} \Big [ \varvec{B}_{k,i}^{T} \otimes \varvec{B}_{k,i}^{T} \Big ]. \end{aligned}$$
(46)

By substituting \( \varvec{ B}_{k,i}\) from (30) into (46), we have

$$\begin{aligned} \mathcal {F}= \textrm{E} \Big [ \Big ( I_{M}-\mu \varvec{u}_{k,i}\varvec{u}_{k,i}^{T}\Big ) \otimes \Big ( I_{M}-\mu \varvec{u}_{k,i}\varvec{u}_{k,i}^{T}\Big ) \Big ] . \end{aligned}$$
(47)

One can conclude that the expectation in (47) is time-invariant and, thus,

$$\begin{aligned} \mathcal {F}= & {} \textrm{E} \Big [ \Big ( I_{M}-\mu \varvec{u}_{k,i}\varvec{u}_{k,i}^{T}\Big ) \otimes \Big ( I_{M}-\mu \varvec{u}_{k,i}\varvec{u}_{k,i}^{T}\Big ) \Big ]\nonumber \\= & {} (I_{M} \otimes I_{M} ) - ( I_{M} \otimes \mu R_{u,k})\nonumber \\{} & {} - (\mu R_{u,k} \otimes I_{M} ) + \mathcal {O}(\mu ^2), \end{aligned}$$
(48)

where \(\mathcal {O}(\mu ^{2})\) denotes

$$\begin{aligned} \mathcal {O}(\mu ^{2})=\mu ^2\textrm{E}\Big [ \varvec{u}_{k,i}\varvec{u}_{k,i}^{T} \otimes \varvec{u}_{k,i}\varvec{u}_{k,i}^{T} \Big ]. \end{aligned}$$
(49)

Calculating (49) requires knowledge of fourth-order statistics, which are not available. The effect of this term can be ignored by assuming a sufficiently small step size, as in [16]. Therefore, the matrix \(\mathcal {F}\) can be approximated as

$$\begin{aligned} \mathcal {F} \approx B_{k}^{T} \otimes B_{k}^{T}. \end{aligned}$$
(50)
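The accuracy of approximation (50) can be probed by a Monte Carlo estimate of \(\mathcal {F}\). The i.i.d. white regressor below is an illustrative simplification of the shift-structured input, chosen so that the neglected \(\mathcal {O}(\mu ^{2})\) term stays small:

```python
import numpy as np

# Monte Carlo estimate of F = E[B_i^T kron B_i^T] versus the small-step
# approximation (50), F ≈ B_k^T kron B_k^T with B_k = I - mu * R_uk.
rng = np.random.default_rng(4)
M, mu, trials = 3, 0.005, 20000
F_mc = np.zeros((M * M, M * M))
for _ in range(trials):
    ui = rng.standard_normal(M)             # white unit-variance regressor
    Bi = np.eye(M) - mu * np.outer(ui, ui)  # instantaneous B_{k,i}, eq. (30)
    F_mc += np.kron(Bi.T, Bi.T)
F_mc /= trials

B = np.eye(M) - mu * np.eye(M)              # B_k = I - mu * R_uk, with R_uk = I here
F_approx = np.kron(B.T, B.T)                # approximation (50)
print(np.max(np.abs(F_mc - F_approx)))
```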

Using the property \(\textrm{Tr}[AB] = \textrm{Tr}[BA]\), we can rewrite the second expectation of (40) as

$$\begin{aligned} \mu ^2 \textrm{E} \Big ( \varvec{s}(i) \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{s}(i) \Big )= & {} \mu ^2 \textrm{Tr} \Big [\textrm{E} \big (\varvec{s}(i) \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{s}(i) \big ) \Big ] \nonumber \\= & {} \mu ^2 \textrm{Tr} \Big [\textrm{E} \big ( \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{s}(i) \varvec{s}(i) \big ) \Big ]\nonumber \\= & {} \mu ^2 \textrm{Tr} \Big [\textrm{E} \big ( \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i} \big ) \sigma _{s}^{2} \Big ] \nonumber \\= & {} \mu ^2 \textrm{Tr} \Big [\textrm{E} \big (\varvec{u}_{k,i} \varvec{u}_{k,i}^{T} \Sigma \big ) \sigma _{s}^{2} \Big ]\nonumber \\= & {} \mu ^{2} \sigma _{s}^{2} \textrm{Tr} \big [ R_{u,k}\Sigma \big ], \end{aligned}$$
(51)

where \(R_{u,k}\) and \(\sigma _{s}^2\) are defined in (22) and (23), respectively. Similar to (51), the third expectation of (40) can be derived as

$$\begin{aligned} \mu ^2 \textrm{E} \Big ( \varvec{z}(i) \varvec{u}_{k,i}^{T} \Sigma \varvec{u}_{k,i}\varvec{z}(i) \Big ) = \mu ^{2} \sigma _{z}^{2} \textrm{Tr} \big [ R_{u,k}\Sigma \big ]. \end{aligned}$$
(52)

The fourth expectation term of (40) is also calculated as

$$\begin{aligned}{} & {} \mu ^2 \textrm{E} \Big ( w^{oT}\varvec{v}_{k,i}\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i}\varvec{v}_{k,i}^{T} w^{o} \Big ) \nonumber \\{} & {} \quad = \mu ^2 \textrm{E} \Big [ \textrm{E} \Big ( w^{oT}\varvec{v}_{k,i}\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i}\varvec{v}_{k,i}^{T} w^{o} \Big ) |\varvec{v}_{k,i} \Big ] \nonumber \\{} & {} \quad =\mu ^2 \textrm{E} \Big [ w^{oT}\varvec{v}_{k,i} \textrm{E} \Big (\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i} \Big ) \varvec{v}_{k,i}^{T} w^{o} |\varvec{v}_{k,i} \Big ]\nonumber \\{} & {} \quad =\mu ^2\textrm{Tr}\big [ R_{x}\Sigma \big ] w^{oT}R_{v,k}w^{o}, \end{aligned}$$
(53)

where we used the \(R_{v,k}\) definition in (22) and the following property of the trace operator

$$\begin{aligned} \textrm{E} \Big (\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i} \Big ) =\textrm{Tr}\big [ \textrm{E} \Big (\varvec{x}_{i}^{T} \Sigma \varvec{x}_{i} \Big ) \big ]=\textrm{Tr}\big [ R_{x}\Sigma \big ]. \end{aligned}$$
(54)

The fifth expectation term of (40) is approximated as follows:

$$\begin{aligned} \mu ^2 \textrm{E} \Big ( w^{oT}\varvec{v}_{k,i}\varvec{v}_{k,i}^{T} \Sigma \varvec{v}_{k,i}\varvec{v}_{k,i}^{T} w^{o} \Big ) \approx \mu ^2 w^{oT}R_{v,k}\Sigma R_{v,k} w^{o}.\nonumber \\ \end{aligned}$$
(55)

Since the error vector \(\tilde{\varvec{w}}_{k,i}\), the matrix \(\varvec{B}_{k,i}\), and the noise vector \(\varvec{v}_{k,i}\) are mutually independent, we can separate the sixth expectation term of (40) as

$$\begin{aligned}{} & {} \mu \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{v}_{k,i}\varvec{v}_{k,i}^{T} w^{o} \Big ) \nonumber \\{} & {} \quad = \mu \textrm{E} \Big (\tilde{\varvec{w}}_{k,i} \Big )^{T} \textrm{E} \Big ( \varvec{B}_{k,i}\Big )^{T} \Sigma \textrm{E} \Big (\varvec{v}_{k,i}\varvec{v}_{k,i}^{T}\Big ) w^{o} \nonumber \\{} & {} \quad = \mu \textrm{E} \Big (\tilde{\varvec{w}}_{k,i} \Big )^{T} B_{k}^{T} \Sigma R_{v,k} w^{o}. \end{aligned}$$
(56)

Using the steady-state mean of the error vector (38), the expectation in (56) is obtained, in steady state, as

$$\begin{aligned}{} & {} \lim _{i \rightarrow \infty } \mu \textrm{E} \Big (\tilde{\varvec{w}}_{k,i}^{T} \varvec{B}_{k,i}^{T} \Sigma \varvec{v}_{k,i}\varvec{v}_{k,i}^{T} w^{o} \Big )\nonumber \\{} & {} \quad =\mu ^{2} w^{oT} R_{v,k}^{T} \big ( I_{M} - B_{k} \big )^{-1T} B_{k}^{T} \Sigma R_{v,k} w^{o}. \end{aligned}$$
(57)

Finally, substituting (41), (51), (52), (53), (55), and (57) into (40) yields

$$\begin{aligned} \textrm{E}\Vert \tilde{\varvec{w}}_{k,i+1} \Vert ^{2}_{\sigma }= & {} \textrm{E}\Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{\mathcal {F}\sigma }+ \mu ^{2} \sigma _{s}^{2} \textrm{Tr} \big [ R_{u,k}\Sigma \big ] \nonumber \\{} & {} + \, \mu ^{2} \sigma _{z}^{2} \textrm{Tr} \big [ R_{u,k}\Sigma \big ]\nonumber \\{} & {} + \, \mu ^2\textrm{Tr}\big [ R_{x}\Sigma \big ] w^{oT}R_{v,k}w^{o} \nonumber \\{} & {} + \, \mu ^2 w^{oT}R_{v,k}\Sigma R_{v,k} w^{o} \nonumber \\{} & {} + \, \mu ^{2} w^{oT} R_{v,k}^{T} \big ( I_{M} - B_{k} \big )^{-1T} B_{k}^{T} \Sigma R_{v,k} w^{o}.\nonumber \\ \end{aligned}$$
(58)

Using the following property of the trace operator

$$\begin{aligned} \textrm{Tr}(W\Sigma )=[\textrm{vec}(W^{T})]^{T}\sigma , \end{aligned}$$
(59)

we can write the variance relation (58) in vector form as

$$\begin{aligned} \textrm{E}\Vert \tilde{\varvec{w}}_{k,i+1} \Vert ^{2}_{\sigma }= & {} \textrm{E}\Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{\mathcal {F}\sigma } + \mu ^{2} \sigma _{s}^{2} \textrm{vec} \big ( R_{u,k}^{T})^{T} \sigma \nonumber \\{} & {} + \, \mu ^{2} \sigma _{z}^{2} \textrm{vec} \big ( R_{u,k}^{T})^{T} \sigma \nonumber \\{} & {} + \, \mu ^2 w^{oT}R_{v,k}w^{o} \textrm{vec} \big ( R_{x}^{T} \big ) ^{T} \sigma \nonumber \\{} & {} + \, \mu ^2 \textrm{vec}\big ( R_{v,k}^{T} w^{o} w^{oT} R_{v,k} \big )^{T }\sigma \nonumber \\{} & {} + \, \mu ^{2} \textrm{vec} \big (B_{k} \big ( I_{M} - B_{k} \big )^{-1} R_{v,k} w^{o}w^{oT}R_{v,k}^{T} \big )^{T} \sigma . \nonumber \\ \end{aligned}$$
(60)

By taking the limit of the variance (58) in the steady-state regime, when \(i\rightarrow \infty \), and using the following approximation

$$\begin{aligned} \lim _{i \rightarrow \infty } \textrm{E}\Vert \tilde{\varvec{w}}_{k,i+1} \Vert ^{2}_{\sigma } \simeq \lim _{i \rightarrow \infty } \textrm{E}\Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{\sigma }, \end{aligned}$$
(61)

we can collect the two expectation terms in (58), and the variance of the error vector becomes

$$\begin{aligned}{} & {} \lim _{i \rightarrow \infty } \textrm{E}\Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{(I_{M^{2}}-\mathcal {F}) \sigma }\nonumber \\{} & {} \quad = \mu ^{2} \sigma _{s}^{2} \textrm{vec} \big (R_{u,k}^{T})^{T} \sigma \nonumber \\{} & {} \qquad + \mu ^{2} \sigma _{z}^{2} \textrm{vec} \big ( R_{u,k}^{T})^{T} \sigma \nonumber \\{} & {} \qquad + \mu ^2 w^{oT}R_{v,k}w^{o} \textrm{vec} \big ( R_{x}^{T} \big ) ^{T} \sigma \nonumber \\{} & {} \qquad + \mu ^2 \textrm{vec}\big ( R_{v,k}^{T} w^{o} w^{oT} R_{v,k} \big )^{T }\sigma \nonumber \\{} & {} \qquad + \mu ^{2} \textrm{vec} \big (B_{k} \big ( I_{M} - B_{k} \big )^{-1} R_{v,k} w^{o}w^{oT}R_{v,k}^{T} \big )^{T} \sigma . \end{aligned}$$
(62)

In order to compute the variance of the average error signal (21), we choose the vector \(\sigma \) such that

$$\begin{aligned} (I_{M^{2}}-\mathcal {F})\sigma = \textrm{vec}(R_{u,k}). \end{aligned}$$
(63)

By solving (63) for \(\sigma \), we have

$$\begin{aligned} \sigma = (I_{M^{2}}-\mathcal {F})^{-1} \textrm{vec}(R_{u,k}). \end{aligned}$$
(64)

Inserting (64) into (62), the steady-state MSD performance is obtained as

$$\begin{aligned} \textrm{MSD}^{\textrm{Total}}= & {} \lim _{i \rightarrow \infty } \textrm{E}\Vert \tilde{\varvec{w}}_{k,i} \Vert ^{2}_{R_{u,k}} \nonumber \\= & {} \,\mu ^2 \textrm{vec} \Bigg (\sigma _{s}^{2} R_{u,k}^{T} +\sigma _{z}^{2} R_{u,k}^{T} +w^{oT}R_{v,k}w^{o}R_{x}^{T}\nonumber \\{} & {} + \, R_{v,k}^{T} w^{o} w^{oT} R_{v,k} \nonumber \\{} & {} + \, B_{k} \big ( I_{M} - B_{k} \big )^{-1} R_{v,k} w^{o}w^{oT}R_{v,k}^{T} \Bigg )^{T} \nonumber \\{} & {} \times (I_{ M^{2}}-\mathcal {F})^{-1} \textrm{vec}(R_{u,k}). \end{aligned}$$
(65)
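Expression (65) can be evaluated numerically. The sketch below assumes white covariances and writes the fourth term in the symmetric form \(R_{v,k}w^{o}w^{oT}R_{v,k}\) implied by (55); the parameter values mirror those used in Sect. 5 but are otherwise illustrative:

```python
import numpy as np

# Direct evaluation of the steady-state MSD (65) under the
# small-step approximation (50) for F.
def msd_total(mu, w_o, R_x, R_v, sig_s2, sig_z2):
    M = len(w_o)
    vec = lambda X: X.reshape(-1, order="F")        # column-stacking vec(.)
    R_u = R_x + R_v                                 # eq. (22)
    B = np.eye(M) - mu * R_u                        # eq. (33)
    F = np.kron(B.T, B.T)                           # approximation (50)
    term = (sig_s2 * R_u.T + sig_z2 * R_u.T
            + (w_o @ R_v @ w_o) * R_x.T
            + R_v.T @ np.outer(w_o, w_o) @ R_v      # symmetric form from (55)
            + B @ np.linalg.solve(np.eye(M) - B, R_v) @ np.outer(w_o, w_o) @ R_v.T)
    sigma = np.linalg.solve(np.eye(M * M) - F, vec(R_u))
    return mu**2 * vec(term) @ sigma

M = 4
w_o = np.ones(M) / M
msd = msd_total(0.01, w_o, np.eye(M), 0.029 * np.eye(M), 1.0, 0.001)
print(msd)
```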

Using the following property of the Kronecker product [17],

$$\begin{aligned} \lambda _\textrm{max} (A \otimes C) = \lambda _\textrm{max}(A) \lambda _\textrm{max}(C), \end{aligned}$$
(66)

and considering (50), we can write

$$\begin{aligned} \lambda _\textrm{max} (\mathcal {F}) = \lambda _\textrm{max} (B_{k}) \lambda _\textrm{max} (B_{k}) . \end{aligned}$$
(67)

Thus, we conclude that matrix \(\mathcal {F}\) is stable when matrix \(B_{k}\) is stable, that is

$$\begin{aligned} \lambda _\textrm{max}\Big (B_{k}\Big )<1. \end{aligned}$$
(68)

In order to guarantee this stability, the following condition needs to be satisfied for the step size

$$\begin{aligned} 0<\mu <\frac{2}{\lambda _\textrm{max}(R_{u,k})}, \quad k=1,2, \ldots ,N, \end{aligned}$$
(69)

where \(R_{u,k}\) is defined in (22). Inserting (65) into (21), the variance of the total error signal in steady state equals

$$\begin{aligned} \lim _{i \rightarrow \infty } \textrm{E} \vert \varvec{e}_{A}(i) \vert ^2= & {} \sigma _{s}^2 +\sigma _{z}^2 + \Vert w^{o} \Vert ^2_{\frac{1}{N}R_{v,k}} \nonumber \\{} & {} + \, \frac{1}{N^2}\sum _{k=1}^{N} \mu ^2 \textrm{vec} \Bigg (\sigma _{s}^{2} R_{u,k}^{T} + \sigma _{z}^{2}R_{u,k}^{T}\nonumber \\{} & {} + \, w^{oT}R_{v,k}w^{o} R_{x}^{T} + R_{v,k}^{T} w^{o} w^{oT} R_{v,k} \nonumber \\{} & {} + \, B_{k} \big ( I_{M} - B_{k} \big )^{-1} R_{v,k} w^{o}w^{oT}R_{v,k}^{T} \Bigg )^{T} \nonumber \\{} & {} \times \, (I_{ M^{2}}-\mathcal {F})^{-1} \textrm{vec}(R_{u,k}). \end{aligned}$$
(70)

From (70), we conclude that the factors influencing the mean-square performance are: (i) the step size \(\mu \); (ii) the signal power \(\sigma _{s}^{2}\); (iii) the measurement noise power \(\sigma _{z}^{2}\); (iv) the noise covariance matrices \(R_{x}\) and \(R_{v,k}\); (v) the input regressors, via the matrix \( B_{k} \); and (vi) the optimal vector of the acoustic channel, \(w^{o}\).
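As a quick numerical illustration of the stability condition (69), the admissible step-size range can be computed directly from the branch input covariances; the white-input values below mirror those used in Sect. 5:

```python
import numpy as np

def step_size_bound(R_u_list):
    """Largest step size satisfying condition (69) across all N branches."""
    return min(2.0 / np.linalg.eigvalsh(R).max() for R in R_u_list)

M, N = 4, 7
sig_x2, sig_v2 = 1.0, 0.029                  # R_x = I_M, R_vk = 0.029 I_M
R_u = [(sig_x2 + sig_v2) * np.eye(M) for _ in range(N)]
bound = step_size_bound(R_u)
print(bound)                                 # 2 / 1.029
```

The step size \(\mu =0.01\) used in Sect. 5 lies well inside this range.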

5 Simulation results

In the first part of the simulations, we verify the theoretical expression for the MSE of the proposed ANC in (70). To this end, the noise source signal \(\varvec{x}_{i}\) and the measurement noises \(\varvec{v}_{k,i}\) and \(\varvec{z}(i)\) are generated from zero-mean Gaussian random variables with covariance matrices \(R_{x}= I_{M}\), \(R_{v,k}=0.029I_{M}\), and \(\sigma _{z}^{2}=0.001\), respectively. We assume the adaptive time-domain filter bank has \(N=7\) filters. In order to compare the theoretical findings with computer simulation results, we assume that the speech signal \(\varvec{s}(i)\) is a zero-mean Gaussian process. The step size is set to \(\mu =0.01\). Figure 3 compares the numerical learning behaviour of the proposed filter bank method with that of the single adaptive filter [4]. We also include the theoretical MSE of the proposed method derived in (70) and that of the single LMS, where \(N=1\). The numerical results are averaged over 250 experiments. The acoustic channel is modelled as

$$\begin{aligned} w^{o}=\frac{1}{M}[1,1, \ldots ,1]^{T} \in R^{M\times 1} , \end{aligned}$$

where \(M=4\).
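A minimal sketch of this setup is given below. It simulates a bank of \(N\) independent LMS filters whose error signals are averaged, using the signal statistics stated above; the exact weight-update equations of the proposed method are derived in the earlier sections, so the generic LMS recursion and the signal model here are simplifying assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, mu, n_iter = 4, 7, 0.01, 2000
w_o = np.ones(M) / M                      # acoustic channel w^o = (1/M)[1,...,1]^T

W = np.zeros((N, M))                      # one LMS weight vector per filter
mse = np.zeros(n_iter)

for i in range(n_iter):
    x = rng.normal(0.0, 1.0, M)           # noise source regressor, R_x = I_M
    s = rng.normal(0.0, 1.0)              # speech sample (zero-mean Gaussian)
    z = rng.normal(0.0, np.sqrt(0.001))   # measurement noise, sigma_z^2 = 0.001
    d = s + w_o @ x + z                   # primary-microphone signal

    e = np.empty(N)
    for k in range(N):
        # k-th reference microphone: noise source plus mic noise, R_vk = 0.029 I
        u = x + rng.normal(0.0, np.sqrt(0.029), M)
        e[k] = d - W[k] @ u               # k-th filter error signal
        W[k] += mu * e[k] * u             # generic LMS weight update
    e_A = e.mean()                        # linearly combined (averaged) error

    mse[i] = e_A ** 2

# After convergence, e_A is roughly the speech plus residual noise, so the
# averaged squared error settles near sigma_s^2 + sigma_z^2 plus excess terms,
# in line with the structure of (70).
print("steady-state MSE estimate:", mse[-500:].mean())
```

Averaging the \(N\) error signals reduces the variance of the residual reference-microphone noise, which is the mechanism behind the improvement with larger \(N\) observed in Figs. 3, 4 and 5.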

As seen in Fig. 3, the theoretical finding (70) is well matched to the numerical results.

We repeat the same experiment with \(N=2\) in the proposed method. The results are shown in Fig. 4. Comparing Figs. 3 and 4, one can conclude that, in the proposed method, the larger \(N\) is, the better the noise cancellation attained.

In the next simulation scenario, we run the proposed ANC algorithm for different numbers of adaptive filters \(N\). As seen in Fig. 5, the MSE for this experiment improves as the number of adaptive filters \(N\) increases.

Fig. 3

MSE performance of the conventional and the proposed adaptive method with \(N=7\)

Fig. 4

MSE performance of the conventional and the proposed adaptive method with \(N=2\)

Fig. 5

The MSE performance of the proposed method with different N

In the next part of the simulations, we use the proposed ANC method with \(N=10\) and compare it to the conventional ANC in [11] on a real speech signal. The speech and noise source signals are shown in Fig. 6a, b, respectively. The results for the conventional and the proposed ANC methods are shown in Fig. 6c, d, respectively. As seen, the output of the conventional ANC with one filter is still noisy, whereas the result of the proposed method is less noisy and very similar to the original speech signal in Fig. 6a.

The MSE for this experiment is also provided in Fig. 7. The results in Figs. 6 and 7 indicate how the proposed method improves the noise cancellation performance with \(N=8\) filters.

Fig. 6

a The speech signal \(\varvec{s}(i)\), b the noisy speech signal \(\varvec{d}(i)\), c the noise-suppressed speech signal by LMS filter \(\varvec{e}_{k}(i)\), d the noise-suppressed speech signal by LMS filter bank \(\varvec{e}_{A}(i)\)

Fig. 7

The MSE performance of the conventional method and the proposed method with \(N=8\) applied on the real speech signal

In the next scenario, we attempted to replicate real-life conditions as closely as possible. The recorded audio signal is depicted in Fig. 8. The noise signal is the sound of a vacuum cleaner, whose non-stationary characteristics are evident in Fig. 9. Figure 10 shows the recorded speech signal in the presence of the vacuum cleaner noise.

Fig. 8

The recorded real speech signal when the vacuum cleaner is OFF

Fig. 9

The recorded signal of a vacuum cleaner

Fig. 10

The recorded signal of a real speech when the vacuum cleaner is ON

The parameters of the weight update equations were chosen to ensure a rapid convergence rate. Figure 11 presents the output signal of a single adaptive filter (bottom figure) for noise cancellation, contrasted with the case of \(N=10\) parallel adaptive filters (top figure). The top figure exhibits superior noise cancellation performance, as indicated by the residual noise level highlighted by a red arrow.

Fig. 11

Output signals of parallel adaptive filters (top figure) and a single adaptive filter (bottom figure)

6 Conclusion

This work deals with the acoustic noise cancellation problem, where multiple microphones are used to record the noise source signal. A bank of least-mean-squares (LMS) time-domain adaptive filters is proposed to enhance the noise cancellation performance. The recorded noise signals are filtered by the adaptive filter bank, and we show that the noise cancellation performance is enhanced by linearly combining the error signals of the adaptive filters. We derived an expression for the noise cancellation behaviour in terms of the steady-state mean-square error. Numerical simulations verify the theoretical derivations. According to the simulation results, the proposed adaptive structure achieves better noise cancellation than the traditional ANC structure.