1 Introduction

Adaptive filtering algorithms have been widely used in practical applications such as acoustic echo cancelation, network echo cancelation, channel equalization and active noise control [3, 7, 8, 10, 19, 21]. Among these algorithms, the least mean squares (LMS) algorithm is a representative choice owing to its simplicity and ease of implementation. Its normalized version, the NLMS, was developed to overcome the gradient noise amplification that the LMS suffers when the input vector is large [16]. However, colored inputs, which are very common in practice, degrade the performance of both the LMS and NLMS algorithms. To address this problem, the normalized subband adaptive filter (NSAF) was investigated [9]; it splits the correlated fullband input into subband signals that approximate white Gaussian signals. Subsequently, to speed up convergence when identifying sparse impulse responses, a class of proportionate NSAF algorithms was extensively developed, such as the proportionate NSAF (PNSAF) and \( \mu \)-law PNSAF (MPNSAF) algorithms [1, 2].

Since impulsive noises, which may produce far more outliers than the Gaussian model predicts, are frequently encountered in practical applications, a series of algorithms robust against impulsive noise was developed, such as the sign-type algorithms [13, 14, 20], M-estimate-type algorithms [4, 22, 23] and maximum correntropy algorithms [5, 6]. Among these robust algorithms, the sign subband adaptive filter (SSAF) [13, 20], which utilizes the sign function, has attracted much attention because it is easy to implement and achieves desirable performance in the presence of strong impulsive noise. Unfortunately, the conventional SSAF is bound to a compromise between fast convergence rate and small steady-state misalignment. The variable step size (VSS) strategy is an effective method for overcoming this drawback [15, 18]. Following this idea, the variable step size sign subband adaptive filter (VSS-SSAF) was investigated to resolve this conflicting requirement [17]. Apart from the VSS scheme, a variable regularization parameter is another technique for addressing this tradeoff. Correspondingly, the variable regularization parameter (VRP) sign subband adaptive filter (VRP-SSAF) was proposed [11]; it is obtained by minimizing the L1-norm of the subband a posteriori error vector subject to a constraint on the tap-weight vector of the filter.

Motivated by the mean-square deviation (MSD) analysis, this paper develops an improved variable regularization parameter for the SSAF (IVRP-SSAF) by making the MSD undergo the steepest descent from the current iteration to the next, in contrast to the previous VRP-SSAF, which is based on minimizing the L1-norm of the subband a posteriori error. Since it is difficult to calculate the exact values of several quantities in this procedure, we obtain the proposed VRP by minimizing the upper bound of the term that drives the MSD update. To enhance the tracking ability of the scheme, we design a re-initialization mechanism for the regularization parameter that detects sudden changes in the system. The computational complexity of the proposed algorithm is also discussed. Extensive simulations demonstrate the superiority of the proposed IVRP-SSAF algorithm.

Notation For convenience, the dimensions of the vectors and matrices used in what follows are summarized in Table 1.

Table 1 Dimension of the vector or matrix, where \( M \) denotes the length of the filter, \( N \) is the number of subbands

2 Review of the SSAF

Consider an unknown M-dimensional vector \( {\mathbf{w}}_{o} \) that satisfies the following linear model

$$ d(n) = {\mathbf{u}}^{\text{T}} (n){\mathbf{w}}_{o} + \eta (n) $$
(1)

where \( d(n) \) is the desired signal, \( {\mathbf{u}}(n) = \left[ {u(n),{\kern 1pt} {\kern 1pt} {\kern 1pt} u(n - 1),{\kern 1pt} {\kern 1pt} \cdots ,{\kern 1pt} {\kern 1pt} u(n - M + 1)} \right]^{\text{T}} \) denotes the input vector, \( \eta (n) \) refers to the disturbance noise and \( ( \cdot )^{\text{T}} \) stands for vector or matrix transpose. Figure 1 presents the structure of the SSAF, where \( N \) denotes the number of subbands. \( H_{i} (z) \) and \( G_{i} (z) \) for \( i = 0,1, \ldots ,N - 1 \) are the analysis filters and the synthesis filters, and \( \downarrow N \) and \( \uparrow N \) stand for \( N \)-fold decimation and interpolation, respectively. The desired signal \( d(n) \) and input signal \( u(n) \) are partitioned into the subband signals \( d_{i} (n) \) and \( u_{i} (n) \) by the analysis filter bank \( H_{i} (z) \). Then, \( d_{i} (n) \) and \( y_{i} (n) \) are decimated to \( d_{{i,{\kern 1pt} D}} (k) \) and \( y_{{i,{\kern 1pt} D}} (k) \), where \( y_{{i,{\kern 1pt} D}} (k) = {\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{w}}(k) \) and \( {\mathbf{u}}_{i} (k) = \left[ {u_{i} (kN),{\kern 1pt} \,u_{i} (kN - 1),{\kern 1pt} \ldots ,u_{i} (kN - M + 1)} \right]^{\text{T}} \). Symbols \( n \) and \( k \) index the original sequences and decimated sequences, respectively, and they satisfy \( n = kN \).
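As a minimal illustration of the analysis-and-decimation step described above, the following sketch filters a signal with one analysis filter and keeps every \( N \)-th sample. The filter taps and signal here are placeholders, not the cosine-modulated bank used later in the simulations.

```python
import numpy as np

def analyze_and_decimate(x, h, N):
    """Filter x with one analysis filter h (causal FIR), then apply N-fold
    decimation, i.e. keep every N-th output sample (n = kN)."""
    y = np.convolve(x, h)[: len(x)]   # fullband subband signal, e.g. u_i(n)
    return y[::N]                     # decimated sequence, e.g. u_i(kN)
```

With an identity filter `h = [1.0]` and \( N = 2 \), the output is simply every second input sample.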

Fig. 1
figure 1

Structure of the SSAF

The update equation of the SSAF is expressed as

$$ {\mathbf{w}}(k + 1) = {\mathbf{w}}(k) + \mu \frac{{{\mathbf{U}}(k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta } } }} $$
(2)

where \( \mu \) represents the step size, \( \text{sgn} ( \cdot ) \) is the sign function, \( \delta \) denotes the regularization parameter to avoid numerical difficulty, and \( {\mathbf{e}}_{D} (k) = {\mathbf{d}}_{D} (k) - {\mathbf{U}}^{\text{T}} (k){\mathbf{w}}(k) \) with

$$ {\mathbf{d}}_{D} (k) = {\mathbf{U}}^{\text{T}} (k){\mathbf{w}}_{o} + {\varvec{\upeta}}_{D} (k) $$
(3)

and

$$ {\mathbf{U}}(k) = \left[ {{\mathbf{u}}_{0} (k),{\kern 1pt} {\kern 1pt} {\kern 1pt} {\mathbf{u}}_{1} (k), \ldots ,{\kern 1pt} {\kern 1pt} {\mathbf{u}}_{N - 1} (k)} \right] $$
(4)
$$ {\varvec{\upeta}}_{D} (k) = \left[ {\eta_{{0,{\kern 1pt} D}} (k),{\kern 1pt} {\kern 1pt} {\kern 1pt} \eta_{{1,{\kern 1pt} D}} (k),{\kern 1pt} {\kern 1pt} \ldots ,{\kern 1pt} {\kern 1pt} \eta_{{N - 1,{\kern 1pt} D}} } \right]^{\text{T}} $$
(5)

where \( \eta_{{i,{\kern 1pt} D}} (k) \) accounts for the \( i{\text{th}} \) subband measurement noise with zero mean and variance \( \sigma_{{\eta_{{i,{\kern 1pt} D}} }}^{2} \).
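The SSAF update (2) can be sketched in NumPy as follows; the matrix shapes follow Table 1 (\( {\mathbf{U}}(k) \) is \( M \times N \)) and the function and variable names are illustrative:

```python
import numpy as np

def ssaf_update(w, U, d_D, mu, delta):
    """One SSAF iteration, Eq. (2).
    w: (M,) weight vector, U: (M, N) subband input matrix, d_D: (N,) desired vector."""
    e_D = d_D - U.T @ w                # subband error vector e_D(k)
    energy = np.sum(U * U)             # sum_i u_i^T(k) u_i(k)
    return w + mu * (U @ np.sign(e_D)) / np.sqrt(energy + delta)
```

Note that only the sign of the subband errors enters the update, which is what makes the SSAF robust to impulsive noise in the error signal.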

2.1 Proposed IVRP-SSAF Algorithm

To facilitate the derivation, we follow the same method as in [11]: the step size is set to one and the regularization parameter is replaced with its time-varying version, yielding

$$ {\mathbf{w}}(k + 1) = {\mathbf{w}}(k) + \frac{{{\mathbf{U}}(k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }} $$
(6)

where \( \delta (k) \) accounts for the variable regularization parameter. The weight error vector \( {\tilde{\mathbf{w}}}(k) \) and noise-free a priori subband error \( e_{{i,{\kern 1pt} a}} (k) \) are defined, as follows:

$$ {\tilde{\mathbf{w}}}(k) = {\mathbf{w}}_{o} - {\mathbf{w}}(k) . $$
(7)
$$ e_{{i,{\kern 1pt} a}} (k) = {\mathbf{u}}_{i}^{\text{T}} (k){\tilde{\mathbf{w}}}(k) . $$
(8)

2.2 Design of the Variable Regularization Parameter

To proceed, an assumption that has been widely used in the literature [9, 12] needs to be introduced.

Assumption 1

The off-diagonal elements satisfy \( {\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{j} (k) \approx 0,{\kern 1pt} {\kern 1pt} {\kern 1pt} i \ne j \) if the frequency responses of the analysis filters do not significantly overlap.

Combining (6) with (7) gives the update of the weight error vector \( {\tilde{\mathbf{w}}}(k) \)

$$ {\tilde{\mathbf{w}}}(k + 1) = {\tilde{\mathbf{w}}}(k) - \frac{{{\mathbf{U}}(k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }} . $$
(9)

Taking the squared Euclidean norm and the mathematical expectation of both sides of (9) yields the recursion of the MSD

$$ \begin{aligned} {\text{MSD}}(k + 1) & = {\text{MSD}}(k) - 2E\left\{ {\frac{{{\mathbf{e}}_{a}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} \\ & \quad + \,E\left\{ {\frac{{\text{sgn} \left[ {{\mathbf{e}}_{D}^{\text{T}} (k)} \right]{\mathbf{U}}^{\text{T}} (k){\mathbf{U}}(k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} \\ \end{aligned} $$
(10)

where \( {\text{MSD}}(k) = E\left\{ {{\tilde{\mathbf{w}}}^{\text{T}} (k){\tilde{\mathbf{w}}}(k)} \right\} \) denotes the transient MSD at iteration \( k \), and \( {\mathbf{e}}_{a} (k) = \left[ {e_{{0,{\kern 1pt} a}} (k),{\kern 1pt} {\kern 1pt} {\kern 1pt} e_{{1,{\kern 1pt} a}} (k),{\kern 1pt} {\kern 1pt} \cdots ,{\kern 1pt} {\kern 1pt} e_{{N - 1,{\kern 1pt} a}} (k)} \right]^{T} \) refers to the noise-free a priori subband error vector.

Under Assumption 1, we have

$$ {\mathbf{U}}^{\text{T}} (k){\mathbf{U}}(k){\kern 1pt} = {\text{diag}}\left[ {{\mathbf{u}}_{0}^{\text{T}} (k){\mathbf{u}}_{0} (k),{\kern 1pt} {\kern 1pt} {\kern 1pt} {\mathbf{u}}_{1}^{\text{T}} (k){\mathbf{u}}_{1} (k), \ldots ,{\kern 1pt} {\mathbf{u}}_{N - 1}^{\text{T}} (k){\mathbf{u}}_{N - 1} (k)} \right] $$
(11)

and (10) can be simplified as:

$$\begin{aligned} {\text{MSD}}(k + 1) &= {\text{MSD}}(k) - 2E\left\{ {\frac{{{\mathbf{e}}_{a}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\}\\& \qquad + E\left\{ {\frac{{\sum\limits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\}\end{aligned} $$
(12)

Let

$$ \Delta (k) = - 2E\left\{ {\frac{{{\mathbf{e}}_{a}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} + E\left\{ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} $$
(13)

Since \( {\mathbf{e}}_{D} (k) = {\mathbf{e}}_{a} (k) + {\varvec{\upeta}}_{D} (k) \), \( \Delta (k) \) is expanded as

$$ \begin{aligned} \Delta (k) & = - 2E\left\{ {\frac{{\left( {{\mathbf{e}}_{D}^{\text{T}} (k) - {\varvec{\upeta}}_{D}^{\text{T}} (k)} \right)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} + E\left\{ {\frac{{\sum\limits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} \\ {\kern 1pt} & \quad {\kern 1pt} = \, - 2E\left\{ {\frac{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - {\varvec{\upeta}}_{D}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right]}}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} + E\left\{ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} \\ \end{aligned} $$
(14)

Considering that the term \( {\varvec{\upeta}}_{D}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right] \) is difficult to calculate, we employ the method designed in [17] to obtain its upper bound

$$ \begin{aligned} {\varvec{\upeta}}_{D}^{\text{T}} (k)\text{sgn} \left[ {{\mathbf{e}}_{D} (k)} \right] & = {\varvec{\upeta}}_{D}^{\text{T}} (k)\text{sgn} \left( {{\mathbf{U}}^{\text{T}} (k){\tilde{\mathbf{w}}}(k) + {\varvec{\upeta}}_{D} (k)} \right) \\ & \, \le \left\| {{\varvec{\upeta}}_{D} (k)} \right\|_{1} \le KN\sigma_{{\eta_{{i,{\kern 1pt} D}} }} = K\sqrt N \sigma_{\eta } \\ \end{aligned} $$
(15)

where \( \sigma_{\eta }^{2} = N\sigma_{{\eta_{{i,{\kern 1pt} D}} }}^{2} \) is widely used in the subband filtering algorithms [9, 12, 17], and \( K \) is a constant value with its range \( (0,{\kern 1pt} {\kern 1pt} {\kern 1pt} 3] \) [17].

Invoking (15), the upper bound of \( \Delta (k) \) is presented as follows:

$$ \begin{aligned} \Delta (k) & \le - 2E\left\{ {\frac{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - \left\| {{\varvec{\upeta}}_{D} (k)} \right\|_{1} }}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} + E\left\{ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} \\ & {\kern 1pt} \le - 2E\left\{ {\frac{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - K\sqrt N \sigma_{\eta } }}{{\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } }}} \right\} + E\left\{ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} }}} \right\} = \bar{\Delta }(k) \\ \end{aligned} $$
(16)

The regularization parameter \( \delta (k) \) that enables the MSD to undergo the steepest descent from iteration \( k \) to \( k + 1 \) can be achieved by taking the first-order derivative of (16) with respect to \( \delta (k) \), as follows:

$$ \begin{aligned} \frac{{\partial \bar{\Delta }(k)}}{\partial \delta (k)} & = E\left\{ {\frac{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - K\sqrt N \sigma_{\eta } }}{{\left[ {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } \right]^{{\frac{3}{2}}} }}} \right\} - E\left\{ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\left[ {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } \right]^{2} }}} \right\} \\ & {\kern 1pt} = E\left\{ {\frac{{\left[ {\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - K\sqrt N \sigma_{\eta } } \right]\sqrt {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } - \sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\left[ {\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k) + \delta (k)} } \right]^{2} }}} \right\} \\ \end{aligned} $$
(17)

A suboptimal \( \delta (k) \) is thus obtained by setting (17) to zero

$$ \delta (k) = E\left\{ {\left[ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - K\sqrt N \sigma_{\eta } }}} \right]^{2} - \sum\limits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} } \right\} . $$
(18)
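As a quick sanity check, with the expectations dropped and illustrative scalar values (assumed, not taken from the paper's experiments), the \( \delta (k) \) of (18) indeed zeroes the bracketed numerator of (17):

```python
import numpy as np

E_u = 8.0                       # sum_i u_i^T(k) u_i(k), illustrative value
g = 2.0                         # ||e_D(k)||_1 - K*sqrt(N)*sigma_eta, assumed positive
delta_k = (E_u / g) ** 2 - E_u  # Eq. (18) with the expectation dropped

# Bracketed numerator of (17): g * sqrt(E_u + delta) - E_u should vanish at delta_k
residual = g * np.sqrt(E_u + delta_k) - E_u
```

Here `delta_k = 8.0` and the residual is exactly zero, confirming that (18) is the stationary point of the instantaneous version of (16).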

2.3 Practical Considerations

As can be seen from (18), it is difficult to calculate the exact regularization parameter because it is essentially a statistical quantity. Following the method in [17], a moving-average strategy can be used to compute \( \delta (k) \), as follows:

$$ \delta (k) = \left\{ {\begin{array}{*{20}l} {\alpha \delta (k - 1) + (1 - \alpha )\hbox{max} \left( {\chi (k),{\kern 1pt} {\kern 1pt} {\kern 1pt} \delta (k - 1)} \right),} \hfill & {{\text{if}}\,\,\chi (k) > 0{\kern 1pt} } \hfill \\ {\delta (k - 1),{\kern 1pt} } \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. . $$
(19)

where \( \alpha \) denotes the smoothing factor close to one, and \( \chi (k) \) is expressed as

$$ \chi (k) = \left[ {\frac{{\sum\nolimits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} }}{{\left\| {{\mathbf{e}}_{D} (k)} \right\|_{1} - K\sqrt N \sigma_{\eta } }}} \right]^{2} - \sum\limits_{i = 0}^{N - 1} {{\mathbf{u}}_{i}^{\text{T}} (k){\mathbf{u}}_{i} (k)} . $$
(20)

Remark 1

Due to the stochastic approximation in (15), \( \chi (k) \) may be negative. Thus, the update of the regularization parameter is restricted to the case \( \chi (k) > 0 \). When \( 0 < \chi (k) \le \delta (k - 1) \), we have \( \delta (k) = \delta (k - 1) \); if \( \chi (k) > \delta (k - 1) \), then \( \delta (k) > \delta (k - 1) \). Hence \( \delta (k) \ge \delta (k - 1) \) is guaranteed whenever \( \delta (k) \) is updated, which is consistent with the desired evolution of the regularization parameter.
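A sketch of the practical update (19)–(20) in NumPy; the function and variable names are illustrative, and the division-by-zero guard is an implementation detail not discussed in the text:

```python
import numpy as np

def update_delta(delta_prev, U, e_D, sigma_eta, K=1.0, alpha=0.999):
    """Practical VRP update, Eqs. (19)-(20).
    U: (M, N) subband input matrix, e_D: (N,) subband error vector."""
    N = U.shape[1]
    energy = np.sum(U * U)                                 # sum_i u_i^T(k) u_i(k)
    denom = np.sum(np.abs(e_D)) - K * np.sqrt(N) * sigma_eta
    if denom == 0.0:                                       # guard against division by zero
        return delta_prev
    chi = (energy / denom) ** 2 - energy                   # Eq. (20)
    if chi > 0:                                            # Eq. (19)
        return alpha * delta_prev + (1 - alpha) * max(chi, delta_prev)
    return delta_prev
```

As Remark 1 explains, the returned value never decreases: large errors (small or negative \( \chi (k) \)) leave \( \delta (k) \) unchanged, while small errors push it upward.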

2.4 Re-initialization Mechanism

The monotonically increasing behavior of \( \delta (k) \) cannot ensure tracking capability when the system changes suddenly. Therefore, to improve the tracking ability of the algorithm, a re-initialization mechanism for \( \delta (k) \) is presented by learning from [18], as shown in Table 2. In this table, \( \bmod ({\kern 1pt} \cdot ) \) denotes the remainder operator, \( V_{T} \) and \( V_{D} \) are positive integers \( (V_{T} > V_{D} ) \), \( {\mathbf{C}} = {\text{sort}}\left[ {\frac{{\left| {e(k)} \right|}}{{\left\| {{\mathbf{u}}(k)} \right\|_{2} + \varepsilon }},{\kern 1pt} {\kern 1pt} \ldots ,{\kern 1pt} {\kern 1pt} \frac{{\left| {e(k - V_{T} + 1)} \right|}}{{\left\| {{\mathbf{u}}(k - V_{T} + 1)} \right\|_{2} + \varepsilon }}} \right]^{\text{T}} \) where \( {\text{sort}}( \cdot ) \) sorts in ascending order, \( {\mathbf{M}} = {\text{diag}}\left( {1,{\kern 1pt} {\kern 1pt} \ldots ,{\kern 1pt} {\kern 1pt} 1,{\kern 1pt} {\kern 1pt} 0,{\kern 1pt} {\kern 1pt} \ldots ,{\kern 1pt} {\kern 1pt} 0} \right) \) is a diagonal matrix whose first \( V_{T} - V_{D} \) diagonal elements are one, and \( \xi \) is a threshold value.

Table 2 Summary of the re-initialization mechanism

Note that the re-initialization mechanism is designed for nonstationary environments. When the noise level is high, \( V_{T} - V_{D} \) should be increased to discard the large noise samples that would otherwise act as outliers in the estimation of a system change. If the condition \( \Delta_{k} > \xi \) holds, a sudden change in the system has occurred at the current iteration, and several parameters, including \( \delta (k) \) and \( {\mathbf{w}}(k) \), need to be re-initialized. If the change is not too large (\( {\text{ctrl}}_{\text{new}} > {\text{ctrl}}_{\text{old}} \)), the second line of the "if" statement solves the tracking problem without restarting the \( \delta \) sequence. If no change is detected, the regularization parameter is updated as in (19) and (20).
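A heavily hedged sketch of the change detector described above: the vector \( {\mathbf{C}} \), the mask \( {\mathbf{M}} \) and the threshold \( \xi \) follow the description of Table 2, but the exact aggregation of the masked entries follows [18]; a plain sum over the retained entries is assumed here.

```python
import numpy as np

def change_detected(err_hist, norm_hist, V_T, V_D, xi, eps=1e-6):
    """Sort |e|/(||u||_2 + eps) over the last V_T samples (vector C), keep the
    smallest V_T - V_D entries (the mask M discards the largest V_D entries,
    which are likely impulsive outliers), and flag a system change when their
    sum exceeds the threshold xi. The sum is an assumed aggregation rule."""
    C = np.sort(np.abs(err_hist) / (norm_hist + eps))  # ascending order
    stat = np.sum(C[: V_T - V_D])                      # Delta_k under the assumed rule
    return stat > xi
```

Discarding the largest normalized errors is what keeps isolated impulsive-noise samples from being mistaken for a system change.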

2.5 Computational Complexity

In this subsection, we compare the computational complexity of the proposed IVRP-SSAF with several subband-type algorithms. For the IVRP-SSAF, the cost of the re-initialization mechanism is not taken into account because it is performed only once every \( V_{T} \) input samples; only the calculations in (6), (19) and (20) are counted. The total number of multiplications, additions and comparisons per fullband input sample is taken as the measure of computational complexity. Table 3 shows the results, where \( L \) stands for the length of the analysis and synthesis filters. As can be seen, the NSAF algorithm requires \( 3M + 3NL + 1 \) multiplications and \( 3M + 3N(L - 1) \) additions, while the SSAF algorithm demands fewer multiplications and additions. Because of the additional computation for the VRP, the VRP-SSAF algorithm requires more multiplications and additions than the SSAF algorithm, amounting to \( 3M + 2M/N + 3NL + 1 \) and \( 4M + (M - 1)/N + 3NL - 2N - 2 \), respectively. The VSS-SSAF algorithm slightly increases the computational complexity compared with the SSAF algorithm, demanding \( M + (2M + 3)/N + 3NL \) multiplications and \( 2M + (M + 2)/N + 3NL - 2N \) additions. Owing to the calculation of the proposed VRP, our scheme has the highest computational cost, needing \( 3M + (2M + 4)/N + 3NL \) multiplications and \( 4M + M/N + 3NL - 2N - 2 \) additions. This overhead is tolerable and is compensated by the improved performance.
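To make the counts concrete, the multiplication formulas quoted above can be evaluated for example sizes; the values of \( M \), \( N \) and \( L \) below are illustrative, not prescribed by the paper.

```python
# Multiplication counts quoted in Table 3, per fullband input sample
M, N, L = 512, 4, 16   # filter length, subbands, analysis/synthesis filter length

mults = {
    "NSAF":      3 * M + 3 * N * L + 1,
    "VRP-SSAF":  3 * M + 2 * M // N + 3 * N * L + 1,
    "VSS-SSAF":  M + (2 * M + 3) // N + 3 * N * L,
    "IVRP-SSAF": 3 * M + (2 * M + 4) // N + 3 * N * L,
}
```

For these sizes the counts are 1729 (NSAF), 1985 (VRP-SSAF), 960 (VSS-SSAF) and 1985 (IVRP-SSAF); the IVRP-SSAF's extra cost relative to the VRP-SSAF shows up in the addition counts rather than the multiplications.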

Table 3 Computational complexity of various subband-type algorithms

3 Simulations

In this section, the proposed algorithm is evaluated by simulations conducted for both system identification and echo cancelation. The sparse acoustic echo path with \( M = 512 \) coefficients (the sampling frequency is 8 kHz) is used as the unknown system, as depicted in Fig. 2. It is well known that increasing the number of subbands tends to speed up convergence but also incurs a higher computational cost [12]. Thus, to balance this conflict, the cosine-modulated filter bank with \( N = 4 \) subbands is employed. The measurement noise consists of two parts: one is a white Gaussian noise with variance \( \sigma_{\eta }^{2} = 10^{ - 3} \), and the other is an impulsive interference modeled as a Bernoulli–Gaussian process, i.e., \( v(k) = q(k)h(k) \), where \( q(k) \) is a white Gaussian process, and \( h(k) \) is a Bernoulli process with the probability mass function given by \( P\left[ {h(k) = 0} \right] = 1 - P_{r} \). (\( P_{r} \) denotes the probability of the occurrence of the impulsive interference.) In the experiments, the variance of \( v(k) \) is set to \( \sigma_{v}^{2} = 1000E\left[ {y^{2} (k)} \right] \) with \( y(k) = {\mathbf{u}}^{\text{T}} (k){\mathbf{w}}_{o} \). The initial vector of the adaptive filter is set to a zero vector, and the regularization parameter is initialized as \( \delta (0) = 10^{ - 4} \). The relevant parameters for the re-initialization mechanism are set to the values recommended in [18], namely \( V_{T} = 3M,{\kern 1pt} {\kern 1pt} {\kern 1pt} V_{D} = 0.75V_{T} ,{\kern 1pt} {\kern 1pt} {\kern 1pt} \xi = 1,{\kern 1pt} {\kern 1pt} {\kern 1pt} \varepsilon = 10^{ - 6} \). The initial values of \( {\text{ctrl}}_{\text{new}} \) and \( {\text{ctrl}}_{\text{old}} \) are set to zero. The NMSD, defined as \( 10\log_{10} \left[ {\left\| {{\mathbf{w}}_{o} - {\mathbf{w}}(k)} \right\|^{2} /\left\| {{\mathbf{w}}_{o} } \right\|^{2} } \right] \), is used to evaluate the performance. The results are the average of 100 independent trials.
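The Bernoulli–Gaussian interference and the NMSD metric described above can be sketched as follows; scaling \( q(k) \) so that the product has variance \( \sigma_{v}^{2} \) is one natural choice, not necessarily the authors' exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def impulsive_noise(n, p_r, sigma_v2):
    """Bernoulli-Gaussian process v(k) = q(k) h(k): h(k) ~ Bernoulli(p_r),
    q(k) white Gaussian, scaled so that Var[v(k)] = sigma_v2."""
    h = rng.random(n) < p_r                          # impulses occur with probability p_r
    q = rng.normal(0.0, np.sqrt(sigma_v2 / p_r), n)  # Var[qh] = p_r * Var[q] = sigma_v2
    return q * h

def nmsd_db(w_o, w):
    """Normalized mean-square deviation in dB."""
    return 10 * np.log10(np.sum((w_o - w) ** 2) / np.sum(w_o ** 2))
```

With a zero initial weight vector, `nmsd_db` starts at 0 dB by construction, which matches the usual starting point of the learning curves.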

Fig. 2
figure 2

Impulse response of the sparse acoustic echo path

3.1 System Identification

In this subsection, the correlated input is generated by filtering a zero-mean Gaussian random sequence through the first-order system \( G(z) = 1/(1 - 0.95z^{ - 1} ) \). The filter length is \( M = 512 \) and the number of subbands is \( N = 4 \).
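The colored-input generation amounts to an AR(1) recursion, which can be sketched directly (the function name and seed are illustrative):

```python
import numpy as np

def correlated_input(n, a=0.95, seed=1):
    """White Gaussian noise filtered through G(z) = 1/(1 - a z^-1),
    i.e. the AR(1) recursion u(k) = white(k) + a * u(k-1)."""
    rng = np.random.default_rng(seed)
    white = rng.normal(size=n)
    u = np.empty(n)
    prev = 0.0
    for k in range(n):
        prev = white[k] + a * prev
        u[k] = prev
    return u
```

The lag-1 autocorrelation of the output is approximately \( a = 0.95 \), which is what makes this input strongly colored and slows down fullband LMS-type algorithms.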

Figure 3 compares the results of the proposed algorithm with different \( \alpha \). As can be seen, the proposed algorithm with \( \alpha = 0.999 \) achieves better performance than the other values, especially in terms of steady-state misalignment, for every tested \( P_{r} \). As \( \alpha \) decreases, the performance of the algorithm degrades. Therefore, \( \alpha = 0.999 \) is an appropriate choice for the IVRP-SSAF.

Fig. 3
figure 3

NMSD curves of the proposed algorithm with different \( \alpha \), and \( K = 1 \). a\( P_{r} = 0.001 \), b\( P_{r} = 0.01 \), c\( P_{r} = 0.1 \)

Figure 4 compares the results of the proposed algorithm with respect to \( K \). For both \( P_{r} = 0.001 \) and \( P_{r} = 0.01 \), the IVRP-SSAF with \( K = 2 \) or \( K = 3 \) exhibits similar performance, whereas the IVRP-SSAF with \( K = 1 \) performs better, yielding the lowest steady-state misalignment. In the case of \( P_{r} = 0.1 \), the learning curves with different \( K \) are nearly identical, which means the value of \( K \) has a negligible impact on performance. Therefore, \( K = 1 \) is used for the remaining simulations.

Fig. 4
figure 4

NMSD curves of the proposed algorithm with different \( K \), and \( \alpha = 0.999 \). a\( P_{r} = 0.001 \), b\( P_{r} = 0.01 \), c\( P_{r} = 0.1 \)

Figure 5 compares the learning curves of various algorithms in the presence of impulsive noise, where the probability is selected as \( P_{r} = 0.001 \). To facilitate the comparison among the VRP-SSAF, VSS-SSAF and the proposed algorithm, the value of \( \rho \) for the VRP-SSAF is obtained by training, and the values of \( K \) and \( \alpha \) for the VSS-SSAF are the values recommended in [17]. As can be seen, the NSAF algorithm performs poorly since it has no capability to combat impulsive noise. In contrast, the SSAF, VRP-SSAF and VSS-SSAF algorithms show significant improvement owing to their robustness against impulsive noise. Importantly, the proposed IVRP-SSAF outperforms the other algorithms, providing the fastest convergence rate and the lowest steady-state misalignment. Note that all tested algorithms, except the VSS-SSAF algorithm, exhibit tracking ability.

Fig. 5
figure 5

Comparison of NMSD curves of the proposed algorithm, NSAF, SSAF, VRP-SSAF and VSS-SSAF, and \( P_{r} = 0.001 \). For the VRP-SSAF, \( \delta (0) = 0.001 \) and \( \delta_{p} = 0.0001 \). The unknown vector \( {\mathbf{w}}_{o} \) changes to \( - {\mathbf{w}}_{o} \) at the middle of iterations

Figure 6 compares the various algorithms in the case of \( P_{r} = 0.01 \). Similar to the results in Fig. 5, the proposed algorithm still achieves the best performance in terms of both convergence rate and steady-state misalignment.

Fig. 6
figure 6

Comparison of NMSD curves of the proposed algorithm, NSAF, SSAF, VRP-SSAF and VSS-SSAF, and \( P_{r} = 0.01 \). For the VRP-SSAF, \( \delta (0) = 0.001 \) and \( \delta_{p} = 0.0001 \). The unknown vector \( {\mathbf{w}}_{o} \) changes to \( - {\mathbf{w}}_{o} \) at the middle of iterations

Figure 7 presents the results of the algorithms in the case of \( P_{r} = 0.1 \). Because of the adverse effect of strong impulsive noise, all tested algorithms suffer a drastic performance degradation. With the same step sizes as in Figs. 5 and 6, the NSAF algorithm diverges. The remaining algorithms remain convergent since they prevent the impulsive noise from damaging the adaptation. Moreover, compared with the SSAF, VRP-SSAF and VSS-SSAF algorithms, the proposed algorithm retains a clear advantage, which is most obvious after the sudden change in the unknown system.

Fig. 7
figure 7

Comparison of NMSD curves of the proposed algorithm, NSAF, SSAF, VRP-SSAF and VSS-SSAF, and \( P_{r} = 0.1 \). For the VRP-SSAF, \( \delta (0) = 0.001 \) and \( \delta_{p} = 0.0001 \). The unknown vector \( {\mathbf{w}}_{o} \) changes to \( - {\mathbf{w}}_{o} \) at the middle of iterations

In Fig. 8, the evolution of the regularization parameter on a logarithmic scale, \( 10\log_{10} \delta (k) \), is presented for different impulsive noise environments. As can be seen, before the change in the system, in the case of \( P_{r} = 0.001 \) or \( P_{r} = 0.01 \), the steady-state value of \( 10\log_{10} \delta (k) \) is close to 120, while in the case of \( P_{r} = 0.1 \) it is approximately 90, which implies that strong impulsive noise can jeopardize the performance of the proposed scheme. After the change, the regularization parameter follows a similar evolutionary trend as before.

Fig. 8
figure 8

Evolutions of the regularization parameter for the proposed algorithm in different impulsive noise environments, \( K = 1 \) and \( \alpha = 0.999 \). The unknown vector \( {\mathbf{w}}_{o} \) changes to \( - {\mathbf{w}}_{o} \) at the middle of iterations

3.2 Echo Cancelation

The speech input, depicted in Fig. 9, is used to examine the proposed algorithm in an acoustic echo cancelation application. The filter length is \( M = 512 \) and the number of subbands is \( N = 4 \).

Fig. 9
figure 9

Waveform of the speech input

As can be seen from Fig. 10, the NSAF algorithm provides the worst result, diverging even when its step size is selected as \( \mu = 0.001 \). In contrast to the NSAF algorithm, the SSAF, VRP-SSAF and VSS-SSAF algorithms greatly improve the convergence rate and reduce the steady-state misalignment. The performance of the VRP-SSAF algorithm is inferior to that of the SSAF and VSS-SSAF algorithms. It is worth noting that the proposed IVRP-SSAF exhibits the fastest convergence rate and the lowest steady-state misalignment among all tested algorithms.

Fig. 10
figure 10

NMSD curves of the proposed algorithm, NSAF, SSAF, VRP-SSAF and VSS-SSAF for the acoustic echo cancelation, and \( P_{r} = 0.001 \). For the VRP-SSAF, \( \delta (0) = 0.001 \) and \( \delta_{p} = 0.0001 \)

4 Conclusions

In this paper, the IVRP-SSAF algorithm that can suppress impulsive noise has been proposed. The proposed VRP was derived by minimizing the upper bound of a certain term that is used to update the MSD. We have designed a re-initialization mechanism to enhance the tracking capability as well as discussed the computational complexity of the algorithm. Sufficient experiments have illustrated that the proposed IVRP-SSAF algorithm outperforms the NSAF, SSAF, VRP-SSAF and VSS-SSAF algorithms in the impulsive noise scenario for both system identification and acoustic echo cancelation applications.

Remark 2

For a better understanding of the proposed scheme, the nomenclature is summarized in the following table.

Summary of the nomenclature list

Symbol

Nomenclature

\( d(n) \)

Desired signal

\( \eta (n) \)

Measurement noise

\( H_{i} (z) \)

Analysis filters

\( G_{i} (z) \)

Synthesis filters

\( N \)

Subband number

\( \downarrow N \)

N-fold decimation

\( \uparrow N \)

N-fold interpolation

\( d_{{i,{\kern 1pt} D}} (k) \)

Subband desired signal

\( y_{{i,{\kern 1pt} D}} (k) \)

Subband output signal

\( \eta_{{i,{\kern 1pt} D}} (k) \)

Subband measurement noise

\( e_{{i,{\kern 1pt} a}} (k) \)

A priori subband error

MSD

Mean-square deviation

\( {\mathbf{w}}_{o} \)

Unknown vector

\( {\mathbf{u}}(n) \)

Input vector

\( {\mathbf{u}}_{i} (k) \)

Subband input vector

\( {\mathbf{w}}(k) \)

Weight vector

\( {\mathbf{U}}(k) \)

Subband input matrix

\( {\varvec{\upeta}}_{D} (k) \)

Subband noise vector

\( {\mathbf{d}}_{D} (k) \)

Subband desired vector

\( {\mathbf{e}}_{D} (k) \)

Subband error vector

\( {\tilde{\mathbf{w}}}(k) \)

Weight error vector

\( {\mathbf{e}}_{a} (k) \)

Noise-free a priori subband error vector