1 Introduction

Adaptive filtering algorithms are widely used in channel equalization, active interference control, echo cancellation, biomedical engineering [7, 15, 21, 28, 30, 31], and many other fields [23, 33, 34]. In recent years, diffusion adaptive filtering algorithms, which extend adaptive filtering over network graphs, have been studied intensively because of their attractive performance, especially in research on wireless sensor networks [31]. Three collaborative strategies are commonly used for adaptive filtering/estimation over distributed networks: incremental, consensus, and diffusion strategies. These strategies perform differently; in particular, the consensus technique suffers from an asymmetry problem that can cause unstable growth of the estimates. Diffusion strategies, in contrast, remain stable, remove this asymmetry problem, and allow real-time adaptation and learning over distributed networks. Consequently, diffusion strategies have been used frequently in the past decade; they include the adapt-then-combine (ATC) scheme [22] and the combine-then-adapt (CTA) scheme [5, 31]. The two schemes also perform differently: in the CTA formulation, as the name suggests, the first step is a combination step and the second is an adaptation step, and the ATC formulation is obtained by switching the order of these two steps. Cattivelli and colleagues analyzed both schemes and showed that ATC outperforms CTA [22]. With this conclusion, the ATC scheme has become the focus of subsequent research on distributed adaptive filtering algorithms [2,3,4, 16, 19, 22, 24, 25, 39].

The Wiener filter is the foundation of adaptive filtering: it constructs an efficient convex cost function from the minimum mean square estimation error [38], which highlights the central role of the cost function in the design of adaptive filtering algorithms. Most cost functions in the adaptive filtering field are designed to be symmetric, such as the LMS [38], the LMF [35], the symmetric Gaussian kernel function [21], and hyperbolic functions [18, 36]. It is worth noting that the mainstream cost function design takes the estimation error as the independent variable, and the estimation error is closely related to the measurement interference/noise. However, not all measurement interference/noise distributions are symmetric; among asymmetric interference/noise, impulsive noise is the most representative example. Impulsive noise significantly degrades adaptive estimation accuracy and affects most diffusion adaptive estimation/filtering algorithms over a network graph. Therefore, designing a distributed filtering algorithm that is robust to impulsive noise is necessary.

For this purpose, many algorithms have been proposed [1, 13, 17, 20, 26, 27, 37]. Wen proposed the diffusion least mean p-power (DLMP) algorithm [37], which is robust in generalized Gaussian noise environments and relies on prior knowledge of the noise distribution. However, the DLMP algorithm uses a fixed power p, and this parameter is critical: the performance of the DLMP algorithm is highly sensitive to the chosen p-value. Based on minimizing the L1-norm subject to a constraint on the adaptive estimate vectors, Ni and colleagues designed the diffusion sign subband adaptive filtering (DSSAF) algorithm [26]. The DSSAF algorithm performs well, but its computational complexity is relatively large.

In addition, by combining the diffusion least mean square (DLMS) algorithm [4] with a sign operation on the estimation error at each iteration, Ni and colleagues derived the diffusion sign-error LMS (DSELMS) algorithm [27]. The DSELMS algorithm has a simple structure, but it has a significant drawback: its steady-state estimation error is high [11]. Based on the Huber cost function, a similar set of diffusion adaptive filtering algorithms was proposed by Guan and colleagues [20], Wei and colleagues [17], and Soheila and colleagues [1], namely the DNHuber, DRVSSLMS, and RDLMS algorithms, respectively. Nevertheless, the RDLMS algorithm is impractical under impulsive noise, the DNHuber algorithm's treatment of impulsive noise and input signals could be more comprehensive, and the DRVSSLMS algorithm has a high computational complexity that is not conducive to practical engineering implementation. Moreover, inspired by the least mean logarithmic absolute difference (LLAD) operation, Chen and colleagues designed another distributed adaptive filtering algorithm, the DLLAD algorithm [5], but its robustness to the input signal and impulsive noise has yet to be analyzed. To address this problem, Guan and colleagues proposed the diffusion probabilistic least mean square (DPLMS) algorithm [13] by combining the ATC scheme with the probabilistic LMS algorithm [10, 16] at all distributed network nodes.

Can an asymmetric function be used to design the cost function of an adaptive filtering algorithm? This question deserves attention because it bears directly on the effectiveness of adaptive filtering algorithms. A symmetric cost function usually performs very well when the estimation error has a symmetric distribution. However, symmetry and asymmetry are complementary concepts, so does the estimation error remain symmetrically distributed while an adaptive filtering algorithm is running? In general, the distribution of the estimation error is shaped by the measurement interference/noise, and the distribution of most interference or noise is not symmetric. For an asymmetric estimation error distribution, a symmetric cost function is therefore inappropriate and cannot adapt to the error distribution well.

Therefore, for such asymmetric interference, it is reasonable to use asymmetric functions to design the cost function and construct novel adaptive filtering algorithms. As the preceding discussion shows, most existing distributed adaptive filtering algorithms are built on symmetric cost functions [29]. To address the asymmetric estimation error distribution issue, we propose a family of diffusion adaptive filtering algorithms based on three different asymmetric cost functions, namely the linear-linear cost (LLC), the quadratic-quadratic cost (QQC), and the linear-exponential cost (LEC) [6, 8, 9], yielding the DLLCLMS, DQQCLMS, and DLECLMS algorithms. The stability of the mean estimation error of the three proposed diffusion algorithms is analyzed, and their computational complexity is studied theoretically. Simulation results indicate that the DLLCLMS, DQQCLMS, and DLECLMS algorithms are more robust to the input signal and impulsive noise than the DSELMS, DRVSSLMS, and DLLAD algorithms.

The rest of this article is organized as follows. The proposed DLLCLMS, DQQCLMS, and DLECLMS algorithms are described in detail in Sect. 2. The statistical stability behavior, computational complexity, and parameters (a and b) of the DLLCLMS, DQQCLMS, and DLECLMS algorithms are studied in Sect. 3. Simulation experiments are described in Sect. 4. Finally, conclusions are provided in Sect. 5. Notation: bold type denotes vectors, \({\left[\cdot \right]}^{\mathrm{T}}\) denotes the transpose, \({\left[\cdot \right]}^{-1}\) denotes the inverse operation, and \(\left|\cdot \right|\) denotes the absolute value operation.

2 Proposed Diffusion Algorithms Using Asymmetric Cost Functions

This section describes the design of the DLLCLMS, DQQCLMS, and DLECLMS algorithms. The first step is to propose three adaptive filtering algorithms based on the asymmetric cost functions (the LLC, QQC, and LEC functions). The second step is to extend these three adaptive filtering algorithms to all agents of a distributed network, yielding the DLLCLMS, DQQCLMS, and DLECLMS algorithms.

2.1 Three Adaptive Filtering Algorithms Based on Asymmetric Cost Functions

Consider an unknown linear system whose coefficient vector \({\mathbf{W}}^{\mathrm{o}}\) has length M. Let \(\mathbf{W}\left(i\right)\) be the adaptive estimated weight vector at iteration \(i\), and let \(\mathbf{X}\left(i\right)\) denote the input signal vector of the adaptive filtering algorithm. The estimation error \(e\left(i\right)\) between the desired signal \(d\left(i\right)\) and the estimated output \(y\left(i\right)\) can be expressed as in Eqs. (1)–(2), where \(v\left(i\right)\) is the measurement noise of this unknown linear system.

$$ \left\{ {\begin{array}{*{20}l} {d\left( i \right) = {\mathbf{W}}^{{{\text{oT}}}} {\mathbf{X}}\left( i \right) + v\left( i \right)} \hfill \\ {y\left( i \right) = {\mathbf{W}}^{{\text{T}}} \left( i \right){\mathbf{X}}\left( i \right)} \hfill \\ \end{array} } \right. $$
(1)
$$e\left(i\right)=d\left(i\right)-y\left(i\right)={\mathbf{W}}^{\mathrm{oT}}\mathbf{X}\left(i\right)+v\left(i\right)-{\mathbf{W}}^{\mathrm{T}}\left(i\right)\mathbf{X}\left(i\right)$$
(2)
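
For concreteness, the following minimal sketch (Python/NumPy) instantiates the measurement model of Eqs. (1)–(2); the filter length and noise level are arbitrary illustrative choices, not values from this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 16                                   # filter length (illustrative)
w_o = rng.standard_normal(M)             # unknown system W^o
w = np.zeros(M)                          # current estimate W(i)

x = rng.standard_normal(M)               # input vector X(i)
v = 0.01 * rng.standard_normal()         # measurement noise v(i)

d = w_o @ x + v                          # desired signal, Eq. (1)
y = w @ x                                # filter output, Eq. (1)
e = d - y                                # estimation error, Eq. (2)
```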

Next, three asymmetric cost functions, including the LLC, QQC, and LEC functions, are used to design three adaptive filtering algorithms.

Firstly, the LLC adaptive filtering algorithm aims to minimize the LLC cost function of the estimation error, defined as

$$ J_{{{\text{LLC}}}} \left( i \right) = \left\{ {\begin{array}{*{20}c} {a\;e(i)} \\ { - b\;e(i)} \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; e\left( i \right) > 0} \\ {,\;{\text{if}}\; e\left( i \right) \le 0} \\ \end{array} } \right. $$
(3)

Secondly, the QQC adaptive filtering algorithm aims to minimize the QQC cost function of the estimation error, defined as

$$ J_{{{\text{QQC}}}} \left( i \right) = \frac{1}{2}\left\{ {\begin{array}{*{20}c} {ae^{2} \left( i \right)} \\ {be^{2} \left( i \right)} \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}} e\left( i \right) > 0} \\ {,\;{\text{if}} e\left( i \right) \le 0} \\ \end{array} } \right. $$
(4)

Thirdly, the LEC adaptive filtering algorithm aims to minimize the LEC cost function of the estimation error, defined as

$$ J_{{{\text{LEC}}}} \left( i \right) = b\left[ {{\text{exp}}\left( {ae\left( i \right)} \right) - ae\left( i \right) - 1} \right] $$
(5)

In Eqs. (3), (4), and (5), \(a, b>0\) are the cut-off values.

Parameters a and b determine the shape and characteristics of each cost function, so how they are set is critical. They allow the asymmetric error cost to be matched to the empirical cost situation, because they determine how severely each type of estimation error (positive or negative) is penalized. For example, setting a = b reduces the QQC to the mean squared error and the LLC to the mean absolute error. From Eq. (3), the LLC cost function behaves as a sign-error cost estimator, so it combines the sign-error cost function with an asymmetric treatment of the estimation error. From Eq. (4), the QQC cost function behaves as a squared-error cost estimator, so it combines the squared-error cost function with an asymmetric treatment of the estimation error. From Eq. (5), the LEC cost function behaves as an exponential estimator, so it combines the exponential cost function with an asymmetric treatment of the estimation error. These three cost functions are therefore worth further study and are used below to design adaptive filtering algorithms.
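
The following hedged sketch (Python/NumPy) evaluates the three cost functions in Eqs. (3)–(5); the function names are ours, and the parameter values are merely illustrative (they anticipate the values studied in Sect. 3.3).

```python
import numpy as np

def llc_cost(e, a, b):
    """Linear-linear cost, Eq. (3): a*e for e > 0, -b*e (i.e., b*|e|) for e <= 0."""
    return np.where(e > 0, a * e, -b * e)

def qqc_cost(e, a, b):
    """Quadratic-quadratic cost, Eq. (4)."""
    return 0.5 * np.where(e > 0, a * e**2, b * e**2)

def lec_cost(e, a, b):
    """Linear-exponential cost, Eq. (5)."""
    return b * (np.exp(a * e) - a * e - 1.0)

e = np.linspace(-2, 2, 5)
print(llc_cost(e, a=0.8, b=4.0))   # asymmetric around e = 0
print(qqc_cost(e, a=0.8, b=6.0))
print(lec_cost(e, a=0.32, b=6.0))
```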

According to the steepest descent method, the weight vector update of the LLC adaptive filter algorithm is

$$ \begin{aligned} {\mathbf{W}}\left( {i + 1} \right) = & \left\{ {\begin{array}{*{20}c} {{\mathbf{W}}\left( i \right) + \mu a\;{\text{sign}}\left( {e\left( i \right)} \right){\mathbf{X}}\left( i \right)} \\ {{\mathbf{W}}\left( i \right) + \mu b\;{\text{sign}}\left( {e\left( i \right)} \right){\mathbf{X}}\left( i \right)} \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; e\left( i \right) > 0} \\ {,\;{\text{if}}\; e\left( i \right) \le 0} \\ \end{array} } \right. \\ = & {\mathbf{W}}\left( i \right) + \frac{\mu }{2}\left[ {a\left( {1 + {\text{sign}}\left( {e\left( i \right)} \right)} \right) - b\left( {1 - {\text{sign}}\left( {e\left( i \right)} \right)} \right)} \right]{\mathbf{X}}\left( i \right) \\ \end{aligned} $$
(6)

In Eq. (6), \(\mathrm{sign}\left(\cdot \right)\) denotes the sign function, and μ is the step size.

According to the steepest descent method, the weight vector update of the QQC adaptive filter algorithm is

$$ {\mathbf{W}}\left( {i + 1} \right) = \left\{ {\begin{array}{*{20}l} {{\mathbf{W}}\left( i \right) + \mu ae\left( i \right){\mathbf{X}}\left( i \right)} \hfill \\ {{\mathbf{W}}\left( i \right) + \mu be\left( i \right){\mathbf{X}}\left( i \right)} \hfill \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; e\left( i \right) > 0} \\ {,\;{\text{if}}\; e\left( i \right) \le 0} \\ \end{array} } \right. $$
(7)

Similarly, according to the steepest descent method, the weight vector update of the LEC adaptive filter algorithm is

$$\mathbf{W}\left(i+1\right)=\mathbf{W}\left(i\right)+\mu ab\left(\mathrm{exp}\left(ae\left(i\right)\right)-1\right)\mathbf{X}\left(i\right)$$
(8)

In Eq. (8), the symbol \(\mathrm{exp}\left(\cdot \right)\) denotes the exponential function, and μ is the step size.
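
A minimal single-node sketch of the three updates in Eqs. (6)–(8) is given below (Python/NumPy). The function names are ours, and the step size μ and parameters a, b are assumed to satisfy the stability conditions derived later in Sect. 3.

```python
import numpy as np

def llc_update(w, x, e, mu, a, b):
    """LLC weight update, Eq. (6): sign-error step scaled by a or b."""
    gain = a if e > 0 else b
    return w + mu * gain * np.sign(e) * x

def qqc_update(w, x, e, mu, a, b):
    """QQC weight update, Eq. (7): LMS-type step scaled by a or b."""
    gain = a if e > 0 else b
    return w + mu * gain * e * x

def lec_update(w, x, e, mu, a, b):
    """LEC weight update, Eq. (8)."""
    return w + mu * a * b * (np.exp(a * e) - 1.0) * x
```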

2.2 Three Asymmetric Adaptive Diffusion Filtering Algorithms

As described in the introduction, diffusion algorithms over wireless sensor networks have been widely studied in recent years because of their attractive performance. Building on the three adaptive filtering algorithms designed in the previous subsection, and following the schematic distributed network structure used in our previous work [13, 20], we consider a distributed network of N agent sensor nodes (Fig. 1). \({\mathbf{X}}_{\mathrm{n}}\left(i\right)\) and \({d}_{\mathrm{n}}\left(i\right)\) denote the input signal and the desired (measured) signal at agent n, respectively. It should be stated that this paper differs from our previous work [13, 20]: the shared goal is practical distributed adaptive estimation under impulsive interference, but here three asymmetric functions are used to design the cost functions and construct a novel family of adaptive filtering algorithms that addresses the asymmetric estimation error distribution issue. Moreover, the robustness of the algorithms developed in this paper to the input signal and impulsive interference is explored.

Fig. 1 A distributed network with N agent sensor nodes [13, 20]

Based on Fig. 1, the optimal linear estimator at each time instant i is sought by minimizing the global cost function:

$$ J^{{{\text{global}}}} \left( {{\mathbf{W}}\left( i \right)} \right) = \mathop \sum \limits_{n} J_{{\text{n}}}^{{{\text{local}}}} \left( {{\mathbf{W}}\left( i \right)} \right) $$
(9)

Each sensor node \(n\in \left\{1,2,\cdots ,N\right\}\) has access to a zero-mean random process \(\left\{{d}_{\mathrm{n}}\left(i\right),{\mathbf{X}}_{\mathrm{n}}\left(i\right)\right\}\), where \({d}_{\mathrm{n}}\left(i\right)\) is a scalar and \({\mathbf{X}}_{\mathrm{n}}\left(i\right)\) is a regression vector. These measurement signals are assumed to follow the standard model:

$${d}_{\mathrm{n}}\left(i\right)={\mathbf{W}}^{\mathrm{oT}}{\mathbf{X}}_{\mathrm{n}}\left(i\right)+{v}_{\mathrm{n}}\left(i\right)$$
(10)

where \({\mathbf{W}}^{\mathrm{o}}\) is the unknown parameter vector of length M, and \({v}_{\mathrm{n}}\left(i\right)\) is the measurement noise of the unknown linear distributed network system at node n, with variance \({\sigma }_{v,n}^{2}\).

The DLMS algorithm [4] is obtained by minimizing a linear combination of the local mean square estimation error:

$$ J_{n}^{{{\text{local}}}} \left( {{\mathbf{W}}\left( i \right)} \right) = \mathop \sum \limits_{{l \in N_{n} }} c_{l,n} {\text{E}}\left[ {\left( {e_{l} \left( i \right)} \right)^{2} } \right] = \mathop \sum \limits_{{l \in N_{n} }} c_{l,n} {\text{E}}\left[ {\left( {d_{l} \left( i \right) - {\mathbf{W}}^{{\text{T}}} \left( i \right){\mathbf{X}}_{l} \left( i \right)} \right)^{2} } \right] $$
(11)

The set of nodes connected to the n-th node (including node n itself) is denoted by \({N}_{n}\) and is called the neighborhood of node n. The weighting coefficients \({c}_{l,n}\) are real, satisfy \(\sum_{l\in {N}_{n}}{c}_{l,n}=1\), and form a nonnegative combination matrix C.
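
As an illustration of this constraint, the sketch below builds the uniform (averaging) combination matrix that is also used in the simulations of Sect. 3.3; this is only one admissible choice of C, and the function name and the example adjacency matrix are our own.

```python
import numpy as np

def uniform_combination_matrix(adjacency):
    """Build C with c_{l,n} = 1/|N_n| for l in N_n, so each column sums to 1.

    `adjacency` is an N x N 0/1 matrix; self-loops are added so that node n
    belongs to its own neighborhood N_n.
    """
    A = np.asarray(adjacency, dtype=float)
    A = np.maximum(A, np.eye(A.shape[0]))    # include the node itself
    return A / A.sum(axis=0, keepdims=True)  # column n holds c_{l,n}

adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
C = uniform_combination_matrix(adj)
print(C.sum(axis=0))   # each column sums to 1, as required
```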

As noted above, Cattivelli and colleagues analyzed the ATC and CTA schemes and showed that ATC outperforms CTA [22]. Using the ATC scheme, the DLMS algorithm therefore consists of two steps, adaptation followed by combination:

$$ \left\{ {\begin{array}{*{20}l} {{\boldsymbol{\varphi }}_{{\text{n}}} \left( i \right) = {\mathbf{W}}_{{\text{n}}} \left( {i - 1} \right) + \mu_{{\text{n}}} {\mathbf{X}}_{{\text{n}}} \left( i \right)e_{{\text{n}}} \left( i \right)} \hfill \\ {{\mathbf{W}}_{{\text{n}}} \left( i \right) = \mathop \sum \limits_{{l \in N_{{\text{n}}} }} c_{l,n} {\boldsymbol{\varphi }}_{{\text{l}}} \left( i \right)} \hfill \\ \end{array} } \right. $$
(12)

where \({\mu }_{\mathrm{n}}\) is the step size (learning rate) at node n, and \({\boldsymbol{\varphi }}_{n}\left(i\right)\) is the local intermediate estimate at distributed network node n.
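
A minimal sketch of one ATC iteration of the DLMS algorithm in Eq. (12), vectorized over the N nodes, is given below (Python/NumPy; the function name and array layout are our own choices).

```python
import numpy as np

def atc_dlms_step(W, X, d, C, mu):
    """One ATC iteration of the DLMS algorithm, Eq. (12).

    W : (N, M) current estimates, one row per node
    X : (N, M) regression vectors X_n(i)
    d : (N,)   desired signals d_n(i)
    C : (N, N) combination matrix, columns sum to 1 (C[l, n] = c_{l,n})
    mu: (N,)   step sizes mu_n
    """
    e = d - np.sum(W * X, axis=1)         # e_n(i) = d_n(i) - W_n^T(i-1) X_n(i)
    phi = W + (mu * e)[:, None] * X       # adaptation step
    W_new = C.T @ phi                     # combination: W_n(i) = sum_l c_{l,n} phi_l(i)
    return W_new, e
```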

Combining Eqs. (6)–(8) with the ATC scheme in Eq. (12), three asymmetric adaptive diffusion filtering algorithms are designed as follows.

2.2.1 The DLLCLMS Algorithm

Combining Eqs. (6) and (12), a summary of the DLLCLMS algorithm procedure is given in Table 1. From Table 1, the DLLCLMS algorithm can be regarded as a generalization of the DSELMS algorithm: if \(a=b\), the DLLCLMS algorithm reduces to the DSELMS algorithm. In other words, the DLLCLMS algorithm can be seen as a DSELMS-type update whose gain switches dynamically between a and b at each network node according to the sign of the estimated error \(e\left(i\right)\).

Table 1 The DLLCLMS algorithm summary

2.2.2 The DQQCLMS Algorithm

Combining Eqs. (7) and (12), a summary of the DQQCLMS algorithm procedure is given in Table 2. From Table 2, the DQQCLMS algorithm can be regarded as a generalization of the DLMS algorithm: if \(a=b\), the DQQCLMS algorithm reduces to the DLMS algorithm. In other words, the DQQCLMS algorithm can be seen as a DLMS-type update whose gain switches dynamically between a and b at each network node according to the sign of the estimated error \(e\left(i\right)\).

Table 2 The DQQCLMS algorithm summary

2.2.3 The DLECLMS Algorithm

Combining Eqs. (8) and (12), a summary of the DLECLMS algorithm procedure is given in Table 3. Table 3 shows that the DLECLMS algorithm can be viewed as an adaptive filter based on an asymmetric exponential error estimator, extended over a distributed network (its computational complexity is compared in Table 4).

Table 3 The DLECLMS algorithm summary
Table 4 The computational complexity of the DSELMS, DRVSSLMS, DLLAD, and three proposed diffusion adaptive filtering algorithms
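
The three proposed diffusion algorithms summarized in Tables 1–3 share the ATC structure of Eq. (12) and differ only in the scalar term applied to the local error in the adaptation step. The following hedged sketch (Python/NumPy; function names and array layout are ours) illustrates this common structure under that assumption.

```python
import numpy as np

def error_gain(e, a, b, variant):
    """Per-node adaptation term used in place of e_n in the ATC step.

    'LLC' follows Eq. (6), 'QQC' follows Eq. (7), 'LEC' follows Eq. (8).
    """
    if variant == "LLC":
        return np.where(e > 0, a, b) * np.sign(e)
    if variant == "QQC":
        return np.where(e > 0, a, b) * e
    if variant == "LEC":
        return a * b * (np.exp(a * e) - 1.0)
    raise ValueError(variant)

def atc_step(W, X, d, C, mu, a, b, variant):
    """One ATC iteration of DLLCLMS / DQQCLMS / DLECLMS (cf. Tables 1-3)."""
    e = d - np.sum(W * X, axis=1)                                # local errors
    phi = W + (mu * error_gain(e, a, b, variant))[:, None] * X   # adaptation
    return C.T @ phi                                             # combination
```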

3 Performance of Proposed Diffusion Algorithms

Having designed the three diffusion adaptive filtering algorithms, we now analyze their performance theoretically. This section discusses the mean behavior and the computational complexity of the proposed diffusion asymmetric adaptive filtering algorithms.

To facilitate performance analysis, we make the following assumptions:

Assumption 1

The distributed network system measurement noises are independent of other signals.

Assumption 2

\({\varvec{X}}\left(i\right)\) is zero-mean Gaussian, temporally white, and spatially independent with \({{\varvec{R}}}_{xx,n}=E\left[{{\varvec{X}}}_{\mathbf{n}}\left(i\right){{{\varvec{X}}}_{\mathbf{n}}}^{T}\left(i\right)\right]\).

Assumption 3

The regression vector \({{\varvec{X}}}_{n}\left(i\right)\) is independent of \({\widehat{{\varvec{W}}}}_{n}\left(j\right)\) for all nodes n and all j < i. All distributed network system weight vectors are approximately independent of all input signals.

Assumption 4

The measurement noise \({v}_{\mathrm{n}}\left(i\right)\) at the n-th agent is assumed to be a mixture of zero-mean white Gaussian noise \({g}_{\mathrm{n}}\left(i\right)\) with variance \({\sigma }_{g,n}^{2}\) and impulsive noise \({Im}_{\mathrm{n}}\left(i\right)\), i.e., \({v}_{\mathrm{n}}\left(i\right)={g}_{\mathrm{n}}\left(i\right)+{Im}_{\mathrm{n}}\left(i\right)\). The impulsive noise can be described as \({Im}_{\mathrm{n}}\left(i\right)={B}_{\mathrm{n}}\left(i\right){G}_{\mathrm{n}}\left(i\right)\), where \({B}_{\mathrm{n}}\left(i\right)\) is a Bernoulli process with probabilities \(P\left[{B}_{\mathrm{n}}\left(i\right)=1\right]={P}_{\mathrm{r}}\) and \(P\left[{B}_{\mathrm{n}}\left(i\right)=0\right]=1-{P}_{\mathrm{r}}\), and \({G}_{\mathrm{n}}\left(i\right)\) is a zero-mean white Gaussian process with variance \({{I}_{\mathrm{n}}\sigma }_{g,n}^{2}\), where \({I}_{\mathrm{n}}\gg 1\).
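
A minimal sketch of the Bernoulli–Gaussian mixture noise of Assumption 4 is given below (Python/NumPy); all numerical values are illustrative assumptions, and the alpha-stable noise used in the experiments of Sect. 3.3 is a different, heavier-tailed model.

```python
import numpy as np

def impulsive_measurement_noise(num, sigma_g, impulse_ratio, pr, rng):
    """Bernoulli-Gaussian mixture noise of Assumption 4.

    v_n(i) = g_n(i) + B_n(i) * G_n(i), where B_n ~ Bernoulli(pr) and
    G_n is zero-mean Gaussian with variance impulse_ratio * sigma_g**2
    (impulse_ratio plays the role of I_n >> 1).
    """
    g = sigma_g * rng.standard_normal(num)
    B = rng.random(num) < pr
    G = np.sqrt(impulse_ratio) * sigma_g * rng.standard_normal(num)
    return g + B * G

rng = np.random.default_rng(1)
v = impulsive_measurement_noise(1000, sigma_g=0.1, impulse_ratio=1000, pr=0.05, rng=rng)
```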

Then, let us define the following quantities at agent \(n\) and time \(i\): \({\widehat{\mathbf{W}}}_{\mathrm{n}}\left(i\right)={\mathbf{W}}^{\mathrm{o}}-{\mathbf{W}}_{\mathrm{n}}\left(i\right)\) and \({\widehat{\boldsymbol{\varphi }}}_{\mathrm{n}}\left(i\right)={\mathbf{W}}^{\mathrm{o}}-{\boldsymbol{\varphi }}_{\mathrm{n}}\left(i\right)\), which are collected to form the network weight error vector and the intermediate network weight error vector, i.e., \(\mathbf{W}\left(i\right)=\mathrm{col}\left\{{\mathbf{W}}_{1}\left(i\right),{\mathbf{W}}_{2}\left(i\right),\cdots ,{\mathbf{W}}_{N}\left(i\right)\right\}\), \(\boldsymbol{\varphi }\left(i\right)=\mathrm{col}\left\{{\boldsymbol{\varphi }}_{1}\left(i\right),{\boldsymbol{\varphi }}_{2}\left(i\right),\cdots ,{\boldsymbol{\varphi }}_{N}\left(i\right)\right\}\), \(\widehat{\mathbf{W}}\left(i\right)=\mathrm{col}\left\{{\widehat{\mathbf{W}}}_{1}\left(i\right),{\widehat{\mathbf{W}}}_{2}\left(i\right),\cdots ,{\widehat{\mathbf{W}}}_{N}\left(i\right)\right\}\), \(\widehat{\boldsymbol{\varphi }}\left(i\right)=\mathrm{col}\left\{{\widehat{\boldsymbol{\varphi }}}_{1}\left(i\right),{\widehat{\boldsymbol{\varphi }}}_{2}\left(i\right),\cdots ,{\widehat{\boldsymbol{\varphi }}}_{N}\left(i\right)\right\}\), \({{\varvec{\upmu}}}_{{\varvec{A}}}=\mathrm{diag}\left\{a{\mu }_{1},a{\mu }_{2},\cdots ,a{\mu }_{N}\right\}\), \({{\varvec{\upmu}}}_{{\varvec{B}}}=\mathrm{diag}\left\{b{\mu }_{1},b{\mu }_{2},\cdots ,b{\mu }_{N}\right\}\), \({{\varvec{\upmu}}}_{{\varvec{A}}{\varvec{B}}}=\mathrm{diag}\left\{ab{\mu }_{1},ab{\mu }_{2},\cdots ,ab{\mu }_{N}\right\}\), and \(\mathbf{e}\left(i\right)=\mathrm{col}\left\{{e}_{1}\left(i\right),{e}_{2}\left(i\right),\cdots ,{e}_{N}\left(i\right)\right\}\).

3.1 Mean Weight Vector Error Behavior

Two noteworthy aspects of adaptive filtering performance are the convergence behavior and the steady-state characteristics. By studying the mean weight estimation error vector, the convergence and steady-state error properties of the three proposed diffusion adaptive filtering algorithms can be explored. The mean behavior of these three diffusion algorithms is analyzed below.

3.1.1 The DLLCLMS Algorithm

The adaptation and combination steps of the DLLCLMS algorithm (Table 1) can be written in terms of the weight error vectors as

$$ {\hat{\boldsymbol{\varphi }}}\left( i \right) = \left\{ {\begin{array}{*{20}l} {{\hat{\mathbf{W}}}\left( {i - 1} \right) - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\text{A}}} }} {\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \hfill \\ {{\hat{\mathbf{W}}}\left( {i - 1} \right) - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{B}}} }} {\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \hfill \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; {\mathbf{e}}\left( i \right) > 0 \, \left( I \right)} \\ {,\;{\text{if}}\; {\mathbf{e}}\left( i \right) \le 0 \, \left( {II} \right)} \\ \end{array} } \right. $$
(13)
$$\widehat{\mathbf{W}}\left(i\right)={\mathbf{C}}^{\mathrm{T}}\widehat{\boldsymbol{\varphi }}\left(i\right)$$
(14)

where \(\mathbf{C}=\mathbf{C}\otimes \mathbf{I}\), \({{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}}}={{\varvec{\upmu}}}_{{\varvec{A}}}\otimes \mathbf{I}\), \({\mathbf{S}}_{{\mathbf{S}}_{\mathbf{e}}}\left(i\right)={\mathbf{S}}_{\mathbf{e}}\left(i\right)\otimes \mathbf{I}\), \(\mathbf{X}\left(i\right)=\mathrm{col}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{e}}\left(i\right)=\mathrm{diag}\left\{\mathrm{sign}\left(\mathbf{e}\left(i\right)\right)\right\},\) and \(\otimes \) denotes the Kronecker product operation.

Taking the expectation of Eqs. (13) and (14),

$$ {\text{E}}\left[ {{\hat{\mathbf{W}}}\left( i \right)} \right] = \left\{ {\begin{array}{*{20}l} {{\mathbf{C}}^{{\text{T}}} {\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] - {\mathbf{C}}^{{\text{T}}} {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{A}}} }} {\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \right]} \hfill \\ {{\mathbf{C}}^{{\text{T}}} {\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] - {\mathbf{C}}^{{\text{T}}} {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{B}}} }} {\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\,\left( i \right)} \right]} \hfill \\ \end{array} \begin{array}{*{20}l} {,if \quad {\mathbf{e}}\left( i \right) > 0 \quad \left( I \right)} \hfill \\ {,if\,\quad {\mathbf{e}}\left( i \right) \le 0\, \quad \left( {II} \right)} \hfill \\ \end{array} } \right. $$
(15)

Denote the measurement noise vector by \(\mathbf{V}\left(i\right)=\mathrm{col}\left\{{v}_{1}\left(i\right),{v}_{2}\left(i\right),\cdots ,{v}_{N}\left(i\right)\right\}\), \(\mathbf{g}\left(i\right)=\mathrm{col}\left\{{g}_{1}\left(i\right),{g}_{2}\left(i\right),\cdots ,{g}_{N}\left(i\right)\right\}\), \(\mathbf{I}\mathbf{m}\left(i\right)=\mathrm{col}\left\{{Im}_{1}\left(i\right),{Im}_{2}\left(i\right),\cdots ,{Im}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{g}}\left(i\right)=\mathrm{diag}\left\{\mathrm{sign}\left(\mathbf{g}\left(i\right)\right)\right\}\), \({\mathbf{S}}_{\mathbf{I}\mathbf{m}}\left(i\right)=\mathrm{diag}\left\{\mathrm{sign}\left(\mathbf{I}\mathbf{m}\left(i\right)\right)\right\}\), and \({\mathbf{S}}_{{\varvec{X}}}\left(i\right)=\mathrm{diag}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\). So, from Eq. (1), we have \(\mathbf{e}\left(i\right)={{\mathbf{S}}_{{\varvec{X}}}}^{\mathrm{T}}\left(i\right)\widehat{\mathbf{W}}\left(i-1\right)+\mathbf{V}\left(i\right)={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+\mathbf{V}\left(i\right)\).

Then, let \(\left\{\begin{array}{c}{\mathbf{e}}_{{\varvec{g}}}\left(i\right)={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+g\left(i\right)\\ {\mathbf{e}}_{{\varvec{I}}{\varvec{m}}\left(i\right)}={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+Im\left(i\right)\end{array}\right.\), \(\left\{\begin{array}{c}{{\varvec{S}}}_{{\mathbf{S}}_{{\varvec{g}}}}\left(i\right)={\mathbf{S}}_{{\varvec{g}}}\left(i\right)\otimes I\\ {{\varvec{S}}}_{{\mathbf{S}}_{{\varvec{I}}{\varvec{m}}}}\left(i\right)={\mathbf{S}}_{{\varvec{I}}{\varvec{m}}}\left(i\right)\otimes I\end{array}\right.\).

So,

$$\mathbf{E}\left[{\mathbf{S}}_{{\mathbf{S}}_{\mathbf{e}}}\left(i\right)\mathbf{X}\left(i\right)\right]=\left(1-{P}_{r}\right)\mathbf{E}\left[{\mathbf{S}}_{{\mathbf{S}}_{\mathbf{g}}}\left(i\right)\mathbf{X}\left(i\right)\right]+{P}_{r}\mathbf{E}\left[{{\varvec{S}}}_{{\mathbf{S}}_{{\varvec{I}}{\varvec{m}}}}\left(i\right)\mathbf{X}\left(i\right)\right]$$
(16)

Let

$$ \left\{ {\begin{array}{*{20}l} {{\upsigma }_{{e_{g,n} }}^{2} \left( i \right) = Tr\left[ {{\mathbf{R}}_{ww,n} \left( {i - 1} \right){\mathbf{R}}_{xx,n} } \right] + {\upsigma }_{g,n}^{2} \left( i \right)} \hfill \\ {{\upsigma }_{{e_{Im,n} }}^{2} \left( i \right) = Tr\left[ {{\mathbf{R}}_{ww,n} \left( {i - 1} \right){\mathbf{R}}_{xx,n} } \right] + {\upsigma }_{Im,n}^{2} \left( i \right)} \hfill \\ \end{array} } \right., $$
$$ \left\{ {\begin{array}{*{20}l} {{\mathbf{X}}_{{\text{g}}} \left( i \right) = \sqrt {\frac{2}{\pi }} \;{\text{diag}}\left\{ {{\upsigma }_{g,1}^{ - 1} \left( i \right),{\upsigma }_{g,2}^{ - 1} \left( i \right), \cdots ,{\upsigma }_{g,N}^{ - 1} \left( i \right)} \right\}} \hfill \\ {{\mathbf{X}}_{{{\varvec{Im}}}} \left( i \right) = \sqrt {\frac{2}{\pi }} \,{\text{diag}}\left\{ {{\upsigma }_{Im,1}^{ - 1} \left( i \right),{\upsigma }_{Im,2}^{ - 1} \left( i \right), \cdots ,{\upsigma }_{Im,N}^{ - 1} \left( i \right)} \right\}} \hfill \\ \end{array} } \right., $$

\(\left\{ {\begin{array}{*{20}l} {{\varvec{S}}_{{{\mathbf{X}}_{{\varvec{g}}} }} \left( i \right) = {\mathbf{X}}_{{\varvec{g}}} \left( i \right) \otimes I} \hfill \\ {{\varvec{S}}_{{{\mathbf{X}}_{{{\varvec{Im}}}} }} \left( i \right) = {\mathbf{X}}_{{{\varvec{Im}}}} \left( i \right) \otimes I} \hfill \\ \end{array} } \right.\).

Then,

$$ \left\{ {\begin{array}{*{20}l} {E\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{g}}} }} \left( i \right){\mathbf{X}}\left( i \right)|{\mathbf{W}}\left( {i - 1} \right)} \right] = {\mathbf{X}}_{{\varvec{g}}} \left( i \right)\;{\text{diag}}\left\{ {{\mathbf{R}}_{xx,1} ,{\mathbf{R}}_{xx,2} , \cdots ,{\mathbf{R}}_{xx,N} } \right\}{\hat{\mathbf{W}}}\left( {i - 1} \right)} \hfill \\ {E\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{{\mathbf{Im}}}} }} \left( i \right){\mathbf{X}}\left( i \right)|{\mathbf{W}}\left( {i - 1} \right)} \right] = {\mathbf{X}}_{{{\varvec{Im}}}} \left( i \right)\;{\text{diag}}\left\{ {{\mathbf{R}}_{xx,1} ,{\mathbf{R}}_{xx,2} , \cdots ,{\mathbf{R}}_{xx,N} } \right\}{\hat{\mathbf{W}}}\left( {i - 1} \right)} \hfill \\ \end{array} } \right. $$
(17)

Substituting (17) into (16), we have

$$ \begin{gathered} {\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \right] \hfill \\ \approx {\mathbf{E}}\left[ {{\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)|{\mathbf{W}}\left( {i - 1} \right)} \right]} \right] \hfill \\ = \left[ {\left( {1 - P_{r} } \right){\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{g}}} }} \left( i \right) + P_{r} {\mathbf{S}}_{{{\mathbf{S}}_{{{\mathbf{Im}}}} }} \left( i \right)} \right]{\text{diag}}\left\{ {{\mathbf{R}}_{xx,1} ,{\mathbf{R}}_{xx,2} , \cdots ,{\mathbf{R}}_{xx,N} } \right\}{\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] \hfill \\ \end{gathered} $$
(18)

Finally, substituting (18) into (15) yields

$$ {\text{E}}\left[ {{{\hat{\mathbf{W}}}}\left( i \right)} \right] = \left\{ {\begin{array}{*{20}l} {{\mathbf{C}}^{{\text{T}}} \left[ {\user2{I}_{{\user2{NM}}} - \user2{S}_{{{\mathbf{\mu }}_{\user2{A}} }} \left( {\left( {1 - P_{r} } \right){\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{g}}} }} \left( i \right) + P_{r} {\mathbf{S}}_{{{\mathbf{S}}_{{{\mathbf{Im}}}} }} \left( i \right)} \right){\text{diag}}\left\{ {{\mathbf{R}}_{{xx,1}} ,{\mathbf{R}}_{{xx,2}} , \cdots ,{\mathbf{R}}_{{xx,N}} } \right\}} \right]{\mathbf{E}}\left[ {{{\hat{\mathbf{W}}}}\left( {i - 1} \right)} \right],\;{\text{if}}\;~{\mathbf{e}}\left( i \right) > 0~\left( I \right)} \hfill \\ {{\mathbf{C}}^{{\text{T}}} \left[ {\user2{I}_{{\user2{NM}}} - \user2{S}_{{{\mathbf{\mu }}_{\user2{B}} }} \left( {\left( {1 - P_{r} } \right){\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{g}}} }} \left( i \right) + P_{r} {\mathbf{S}}_{{{\mathbf{S}}_{{{\mathbf{Im}}}} }} \left( i \right)} \right){\text{diag}}\left\{ {{\mathbf{R}}_{{xx,1}} ,{\mathbf{R}}_{{xx,2}} , \cdots ,{\mathbf{R}}_{{xx,N}} } \right\}} \right]{\mathbf{E}}\left[ {{{\hat{\mathbf{W}}}}\left( {i - 1} \right)} \right],\;{\text{if}}\;~{\mathbf{e}}\left( i \right) \le 0~\left( {II} \right)} \hfill \\ \end{array} } \right. $$
(19)

From Eq. (19), one can see that the asymptotic unbiasedness of the DLLCLMS algorithm is guaranteed if the matrices \({\mathbf{C}}^{\mathrm{T}}\left[{{\varvec{I}}}_{{\varvec{N}}{\varvec{M}}}-{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}}}\left(\left(1-{P}_{r}\right){\mathbf{S}}_{{\mathbf{S}}_{\mathbf{g}}}\left(i\right)+{P}_{r}{\mathbf{S}}_{{\mathbf{S}}_{\mathbf{I}\mathbf{m}}}\left(i\right)\right)\mathrm{diag}\left\{{\mathbf{R}}_{xx,1},{\mathbf{R}}_{xx,2},\cdots ,{\mathbf{R}}_{xx,N}\right\}\right]\) and \({\mathbf{C}}^{\mathrm{T}}\left[{{\varvec{I}}}_{{\varvec{N}}{\varvec{M}}}-{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{B}}}}\left(\left(1-{P}_{r}\right){\mathbf{S}}_{{\mathbf{S}}_{\mathbf{g}}}\left(i\right)+{P}_{r}{\mathbf{S}}_{{\mathbf{S}}_{\mathbf{I}\mathbf{m}}}\left(i\right)\right)\mathrm{diag}\left\{{\mathbf{R}}_{xx,1},{\mathbf{R}}_{xx,2},\cdots ,{\mathbf{R}}_{xx,N}\right\}\right]\) are stable. Both matrices in brackets are block diagonal, and it can easily be verified that they are stable if their block-diagonal entries \(\left[{\varvec{I}}-a{\mu }_{n}{\mathrm{X}}_{v}\left(i\right){\mathbf{R}}_{xx,n}\right]\) and \(\left[{\varvec{I}}-b{\mu }_{n}{\mathrm{X}}_{v}\left(i\right){\mathbf{R}}_{xx,n}\right]\) are stable, where \({\mathrm{X}}_{v}\left(i\right)\le \sqrt{\frac{2}{\pi }}\left[\left(1-{P}_{r}\right){\upsigma }_{g,1}^{-1}\left(i\right)+{P}_{r}{\upsigma }_{Im,1}^{-1}\left(i\right)\right]\). So, the condition for stability of the mean weight error vector is given by

$$ \left\{ {\begin{array}{*{20}c} {0 < \mu_{{\text{n}}} < \frac{2}{{a{\text{X}}_{v} \left( i \right)\rho_{{{\text{max}}}} \left( {{\mathbf{R}}_{xx,n}} \right)}}}, \\ {0 < \mu_{{\text{n}}} < \frac{2}{{b{\text{X}}_{v} \left( i \right)\rho_{{{\text{max}}}} \left( {{\mathbf{R}}_{xx,n} } \right)}}}, \\ \end{array} \begin{array}{*{20}c} {{\text{if}}\; e_{{\text{n}}} \left( i \right) > 0 \, \left( I \right)} \\ {{\text{if}} e_{{\text{n}}} \left( i \right) \le 0 \, \left( {II} \right)} \\ \end{array} } \right. $$
(20)

where \({\rho }_{\mathrm{max}}\left({\mathbf{R}}_{xx,n}\right)\) denotes the maximal eigenvalue of \({\mathbf{R}}_{xx,n}\). So, based on Eqs. (15) and (20), we obtain \(\mathrm{E}[\widehat{\mathbf{W}}\left(\infty \right)]=0\).
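
As a quick numerical sanity check of the bound in Eq. (20), the sketch below verifies that the block-diagonal entries have spectral radius below one when μ_n satisfies both branches; every numerical value is an illustrative assumption, not a value taken from this paper.

```python
import numpy as np

# Illustrative values (assumptions), not values from the paper.
a, b = 0.8, 4.0
sigma_g, sigma_im, pr = 0.1, 3.0, 0.05

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16))
R = A @ A.T / 16                               # an example R_{xx,n}
rho_max = np.max(np.linalg.eigvalsh(R))        # maximal eigenvalue

Xv = np.sqrt(2 / np.pi) * ((1 - pr) / sigma_g + pr / sigma_im)  # bound on X_v(i)
mu_max = 2 / (max(a, b) * Xv * rho_max)        # satisfies both branches of Eq. (20)

mu = 0.5 * mu_max
for gain in (a, b):                            # check |1 - gain*mu*Xv*lambda| < 1
    eigs = np.linalg.eigvals(np.eye(16) - gain * mu * Xv * R)
    print(gain, np.max(np.abs(eigs)) < 1)      # True for both branches
```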

3.1.2 The DQQCLMS Algorithm

The adaptation and combination steps of the DQQCLMS algorithm (Table 2) can be written as

$$ {\hat{\boldsymbol{\varphi }}}\left( i \right) = \left\{ {\begin{array}{*{20}l} {{\hat{\mathbf{W}}}\left( {i - 1} \right) - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{A}}} }} {\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right),} \hfill \\ {{\hat{\mathbf{W}}}\left( {i - 1} \right) - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{B}}} }} {\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right),} \hfill \\ \end{array} \begin{array}{*{20}c} {\;{\text{if}}\; {\mathbf{e}}\left( i \right) > 0 \, \left( I \right)} \\ {\;{\text{if}}\; {\mathbf{e}}\left( i \right) \le 0 \, \left( {II} \right)} \\ \end{array} } \right. $$
(21)
$$\widehat{\mathbf{W}}\left(i\right)={\mathbf{C}}^{\mathrm{T}}\widehat{\boldsymbol{\varphi }}\left(i\right)$$
(22)

where \(\mathbf{C}=\mathbf{C}\otimes \mathbf{I}\), \({{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}}}={{\varvec{\upmu}}}_{{\varvec{A}}}\otimes \mathbf{I}\), \({\mathbf{S}}_{{\mathbf{S}}_{\mathbf{e}}}\left(i\right)={\mathbf{S}}_{\mathbf{e}}\left(i\right)\otimes \mathbf{I}\), \(\mathbf{X}\left(i\right)=\mathrm{col}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{e}}\left(i\right)=\mathrm{diag}\left\{\mathbf{e}\left(i\right)\right\},\) and \(\otimes \) denotes the Kronecker product operation.

Taking the expectation of Eqs. (21) and (22),

$$ {\text{E}}\left[ {{\hat{\mathbf{W}}}\left( i \right)} \right] = \left\{ {\begin{array}{*{20}l} {{\mathbf{C}}^{{\text{T}}} {\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] - {\mathbf{C}}^{{\text{T}}} {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{A}}} }} {\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \right]}, \hfill \\ {{\mathbf{C}}^{{\text{T}}} {\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] - {\mathbf{C}}^{{\text{T}}} {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{B}}} }} {\mathbf{E}}\left[ {{\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right)} \right]}, \hfill \\ \end{array} \begin{array}{*{20}c} {{\text{if}}\; {\mathbf{e}}\left( i \right) > 0\, \left( I \right)} \\ {{\text{if}}\; {\mathbf{e}}\left( i \right) \le 0 \, \left( {II} \right)} \\ \end{array} } \right. $$
(23)

Denote the measurement noise vector by \(\mathbf{V}\left(i\right)=\mathrm{col}\left\{{v}_{1}\left(i\right),{v}_{2}\left(i\right),\cdots ,{v}_{N}\left(i\right)\right\}\), \(\mathbf{g}\left(i\right)=\mathrm{col}\left\{{g}_{1}\left(i\right),{g}_{2}\left(i\right),\cdots ,{g}_{N}\left(i\right)\right\}\), \(\mathbf{I}\mathbf{m}\left(i\right)=\mathrm{col}\left\{{Im}_{1}\left(i\right),{Im}_{2}\left(i\right),\cdots ,{Im}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{g}}\left(i\right)=\mathrm{diag}\left\{\mathbf{g}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{I}\mathbf{m}}\left(i\right)=\mathrm{diag}\left\{\mathbf{I}\mathbf{m}\left(i\right)\right\}\),\({\mathbf{S}}_{{\varvec{X}}}\left(i\right)=\mathrm{diag}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\). So, from Eq. (1), we have \(\mathbf{e}\left(i\right)={{\mathbf{S}}_{{\varvec{X}}}}^{\mathrm{T}}\left(i\right)\widehat{\mathbf{W}}\left(i-1\right)+\mathbf{V}\left(i\right)={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+\mathbf{V}\left(i\right)\).

Then, let \(\left\{\begin{array}{c}{\mathbf{e}}_{{\varvec{g}}}\left(i\right)={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+g\left(i\right)\\ {\mathbf{e}}_{{\varvec{I}}{\varvec{m}}\left(i\right)}={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+Im\left(i\right)\end{array}\right.\), \(\left\{\begin{array}{c}{{\varvec{S}}}_{{\mathbf{S}}_{{\varvec{g}}}}\left(i\right)={\mathbf{S}}_{{\varvec{g}}}\left(i\right)\otimes I\\ {{\varvec{S}}}_{{\mathbf{S}}_{{\varvec{I}}{\varvec{m}}}}\left(i\right)={\mathbf{S}}_{{\varvec{I}}{\varvec{m}}}\left(i\right)\otimes I\end{array}\right.\).

So,

$$ {\text{E}}\left[ {{\hat{\mathbf{W}}}\left( i \right)} \right] = \left\{ {\begin{array}{*{20}l} {{\mathbf{C}}^{{\text{T}}} \left[ {{\varvec{I}}_{{{\varvec{NM}}}} - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{A}}} }} {\text{diag}}\left\{ {{\mathbf{R}}_{xx,1} ,{\mathbf{R}}_{xx,2} , \cdots ,{\mathbf{R}}_{xx,N} } \right\}} \right]{\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right]} \hfill \\ {{\mathbf{C}}^{{\text{T}}} \left[ {{\varvec{I}}_{{{\varvec{NM}}}} - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{\varvec{B}}} }} {\text{diag}}\left\{ {{\mathbf{R}}_{xx,1} ,{\mathbf{R}}_{xx,2} , \cdots ,{\mathbf{R}}_{xx,N} } \right\}} \right]{\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right]} \hfill \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; {\mathbf{e}}\left( i \right) > 0 \, \left( I \right)} \\ {,\;{\text{if}}\; {\mathbf{e}}\left( i \right) \le 0 \, \left( {II} \right)} \\ \end{array} } \right. $$
(24)

From Eq. (24), one can see that the asymptotic unbiasedness of the DQQCLMS algorithm is guaranteed if the matrices \({\mathbf{C}}^{\mathrm{T}}\left[{{\varvec{I}}}_{{\varvec{N}}{\varvec{M}}}-{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}}}\mathrm{diag}\left\{{\mathbf{R}}_{xx,1},{\mathbf{R}}_{xx,2},\cdots ,{\mathbf{R}}_{xx,N}\right\}\right]\) and \({\mathbf{C}}^{\mathrm{T}}\left[{{\varvec{I}}}_{{\varvec{N}}{\varvec{M}}}-{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{B}}}}\mathrm{diag}\left\{{\mathbf{R}}_{xx,1},{\mathbf{R}}_{xx,2},\cdots ,{\mathbf{R}}_{xx,N}\right\}\right]\) are stable. Both matrices in brackets are block diagonal, and it can easily be verified that they are stable if their block-diagonal entries \(\left[{\varvec{I}}-a{\mu }_{n}{\mathbf{R}}_{xx,n}\right]\) and \(\left[{\varvec{I}}-b{\mu }_{n}{\mathbf{R}}_{xx,n}\right]\) are stable. So, the condition for stability of the mean weight error vector is given by

$$ \left\{ {\begin{array}{*{20}c} {0 < \mu_{{\text{n}}} < \frac{2}{{a\rho_{{{\text{max}}}} \left( {{\mathbf{R}}_{xx,n} } \right)}}} \\ {0 < \mu_{{\text{n}}} < \frac{2}{{b\rho_{{{\text{max}}}} \left( {{\mathbf{R}}_{xx,n} } \right)}}} \\ \end{array} \begin{array}{*{20}c} {,\;{\text{if}}\; e_{n} \left( i \right) > 0 \quad \left( I \right)} \\ {,\;{\text{if}}\; e_{{\text{n}}} \left( i \right) \le 0\quad \left( {II} \right)} \\ \end{array} } \right. $$
(25)

where \({\rho }_{\mathrm{max}}\) denotes the maximal eigenvalue of \({\mathbf{R}}_{xx,n}\). So, based on Eqs. (23) and (25), we obtain \(\mathrm{E}[\widehat{\mathbf{W}}\left(\infty \right)]=0\).

3.1.3 The DLECLMS Algorithm

Based on the DLECLMS adaptation step (Table 3), let \({J}_{{f}_{DLECLMS}}\left(i\right)=\mathrm{exp}\left(a{e}_{n}\left(i\right)\right)-1\); its Taylor series expansion around \({e}_{\mathrm{n}}\left(i\right)=0\) is

$$ J_{{{\text{f}}_{{{\text{DLECLMS}}}} }} \left( i \right) = \exp \left( {ae_{n} \left( i \right)} \right) - 1 = \mathop \sum \limits_{k = 1}^{ + \infty } \frac{{a^{k} }}{k!}e_{n}^{k} \left( i \right) $$
(26)

As desired, the weight \(\frac{a^{k}}{k!}\) of the k-th error term decreases rapidly with k, so more weight is given to the lower-order terms. Notice also that for small error values, the error cost function becomes

$$ J_{{{\text{f}}_{{{\text{DLECLMS}}}} }} \left( i \right) = \exp \left( {ae_{{\text{n}}} \left( i \right)} \right) - 1 = \mathop \sum \limits_{k = 1}^{ + \infty } \frac{{a^{k} }}{k!}e_{n}^{k} \left( i \right) \approx ae_{{\text{n}}} \left( i \right) $$
(27)
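
A small numerical check of the approximation in Eq. (27) is given below, using the value a = 0.32 adopted later in Sect. 3.3; the error values in the loop are arbitrary.

```python
import numpy as np

a = 0.32
for e in (1e-3, 1e-2, 1e-1):
    exact = np.exp(a * e) - 1.0           # LEC error nonlinearity
    approx = a * e                        # first-order term, Eq. (27)
    print(e, exact, approx, abs(exact - approx))
# The gap shrinks roughly quadratically in e, consistent with the (a^2/2)e^2 term.
```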

The adaptation and combination steps of the DLECLMS algorithm (Table 3) can be written as

$$ {\hat{\boldsymbol{\varphi }}}\left( i \right) = {\hat{\mathbf{W}}}\left( {i - 1} \right) - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{{\varvec{AB}}}} }} {\mathbf{S}}_{{{\mathbf{S}}_{{\mathbf{e}}} }} \left( i \right){\mathbf{X}}\left( i \right) $$
(28)
$$\widehat{\mathbf{W}}\left(i\right)={\mathbf{C}}^{\mathrm{T}}\widehat{\boldsymbol{\varphi }}\left(i\right)$$
(29)

where \(\mathbf{C}=\mathbf{C}\otimes \mathbf{I}\), \({{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}{\varvec{B}}}}={{\varvec{\upmu}}}_{{\varvec{A}}{\varvec{B}}}\otimes \mathbf{I}\), \({\mathbf{S}}_{{\mathbf{S}}_{\mathbf{e}}}\left(i\right)={\mathbf{S}}_{\mathbf{e}}\left(i\right)\otimes \mathbf{I}\), \(\mathbf{X}\left(i\right)=\mathrm{col}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{e}}\left(i\right)=\mathrm{diag}\left\{\mathrm{exp}\left(a\mathbf{e}\left(i\right)\right)-1\right\}\approx \mathrm{diag}\left\{a\mathbf{e}\left(i\right)\right\},\) and \(\otimes \) denotes the Kronecker product operation.

Taking the expectation of Eqs. (28) and (29),

$$\mathrm{E}\left[\widehat{\mathbf{W}}\left(i\right)\right]={\mathbf{C}}^{\mathrm{T}}\mathbf{E}\left[\widehat{\mathbf{W}}\left(i-1\right)\right]-{\mathbf{C}}^{\mathrm{T}}{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}{\varvec{B}}}}\mathbf{E}\left[{\mathbf{S}}_{{\mathbf{S}}_{\mathbf{e}}}\left(i\right)\mathbf{X}\left(i\right)\right]$$
(30)

Denote the measurement noise vector by \(\mathbf{V}\left(i\right)=\mathrm{col}\left\{{v}_{1}\left(i\right),{v}_{2}\left(i\right),\cdots ,{v}_{N}\left(i\right)\right\}\), \(\mathbf{g}\left(i\right)=\mathrm{col}\left\{{g}_{1}\left(i\right),{g}_{2}\left(i\right),\cdots ,{g}_{N}\left(i\right)\right\}\), \(\mathbf{I}\mathbf{m}\left(i\right)=\mathrm{col}\left\{{Im}_{1}\left(i\right),{Im}_{2}\left(i\right),\cdots ,{Im}_{N}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{g}}\left(i\right)=\mathrm{diag}\left\{\mathbf{g}\left(i\right)\right\}\), \({\mathbf{S}}_{\mathbf{I}\mathbf{m}}\left(i\right)=\mathrm{diag}\left\{\mathbf{I}\mathbf{m}\left(i\right)\right\}\),\({\mathbf{S}}_{{\varvec{X}}}\left(i\right)=\mathrm{diag}\left\{{\mathbf{X}}_{1}\left(i\right),{\mathbf{X}}_{2}\left(i\right),\cdots ,{\mathbf{X}}_{N}\left(i\right)\right\}\). So, from Eq. (1), we have \(\mathbf{e}\left(i\right)={{\mathbf{S}}_{{\varvec{X}}}}^{\mathrm{T}}\left(i\right)\widehat{\mathbf{W}}\left(i-1\right)+\mathbf{V}\left(i\right)={\mathbf{e}}_{{\varvec{o}}}\left(i\right)+\mathbf{V}\left(i\right)\).

Then, let \(\left\{ {\begin{array}{*{20}l} {{\mathbf{e}}_{{\varvec{g}}} \left( i \right) = {\mathbf{e}}_{{\varvec{o}}} \left( i \right) + g\left( i \right)} \hfill \\ {{\mathbf{e}}_{{{\text{Im}}\left( i \right)}} = {\mathbf{e}}_{{\varvec{o}}} \left( i \right) + {\text{Im}}\left( i \right)} \hfill \\ \end{array} } \right.\), \(\left\{ {\begin{array}{*{20}l} {{\varvec{S}}_{{{\mathbf{S}}_{{\varvec{g}}} }} \left( i \right) = {\mathbf{S}}_{{\varvec{g}}} \left( i \right) \otimes I} \hfill \\ {{\varvec{S}}_{{{\mathbf{S}}_{{{\text{Im}}}} }} \left( i \right) = {\mathbf{S}}_{{{\text{Im}}}} \left( i \right) \otimes I} \hfill \\ \end{array} } \right.\).

So,

$$ {\text{E}}\left[ {{\hat{\mathbf{W}}}\left( i \right)} \right] = {\mathbf{C}}^{{\text{T}}} \left[ {{\varvec{I}}_{{{\varvec{NM}}}} - {\varvec{S}}_{{{{\varvec{\upmu}}}_{{{\varvec{AB}}}} }} {\text{diag}}\left\{ {a{\mathbf{R}}_{xx,1} ,a{\mathbf{R}}_{xx,2} , \cdots ,a{\mathbf{R}}_{xx,N} } \right\}} \right]{\mathbf{E}}\left[ {{\hat{\mathbf{W}}}\left( {i - 1} \right)} \right] $$
(31)

From Eq. (31), one can see that the asymptotic unbiasedness of the DLECLMS algorithm is guaranteed if the matrix \({\mathbf{C}}^{\mathrm{T}}\left[{{\varvec{I}}}_{{\varvec{N}}{\varvec{M}}}-{{\varvec{S}}}_{{{\varvec{\upmu}}}_{{\varvec{A}}{\varvec{B}}}}\mathrm{diag}\left\{{a\mathbf{R}}_{xx,1},{a\mathbf{R}}_{xx,2},\cdots ,{a\mathbf{R}}_{xx,N}\right\}\right]\) is stable. The matrix in brackets is block diagonal, and it can easily be verified that it is stable if its block-diagonal entries \(\left[{\varvec{I}}-{a}^{2}b{\mu }_{n}{\mathbf{R}}_{xx,n}\right]\) are stable. So, the condition for stability of the mean weight error vector recursion (31) is given by

$$0<{\mu }_{\mathrm{n}}<\frac{2}{{{a}^{2}b\rho }_{\mathrm{max}}\left({\mathbf{R}}_{xx,n}\right)}$$
(32)

where \({\rho }_{max}\) denotes the maximal eigenvalue of \({\mathbf{R}}_{xx,n}\). So, based on Eqs. (31) and (32), we obtain \(\mathrm{E}[\widehat{\mathbf{W}}\left(\infty \right)]=0\).

3.2 Computational Complexity

Another important performance indicator of an adaptive filtering algorithm is its computational complexity, because it directly determines how easily the algorithm can be implemented in engineering practice. The computational complexity of a diffusion adaptive filtering algorithm is the number of arithmetic operations per iteration of the weight (coefficient) vector update, that is, the number of multiplications, additions, and so on. Multiplications are far more time-consuming than additions, so they account for the bulk of the cost of a diffusion adaptive filtering algorithm. The DLLCLMS, DQQCLMS, and DLECLMS algorithms involve the two parameters a and b, and the DLECLMS algorithm additionally requires an exponential evaluation, so the computational complexity of the DLLCLMS, DQQCLMS, and DLECLMS algorithms is slightly larger than that of the DSELMS [27], DRVSSLMS [17], and DLLAD [5] algorithms. Furthermore, as M increases, all of these algorithms have the same order of computational complexity.

3.3 Parameters \({\varvec{a}}\) and \({\varvec{b}}\) for the Proposed Algorithms

As described earlier, parameters a and b determine the shape and characteristics of each cost function, so setting these essential parameters is critical. The choice of a and b in the update equations of Tables 1–3 therefore plays a vital role in the performance of the DLLCLMS, DQQCLMS, and DLECLMS algorithms. The optimum cut-off values a and b under different input signals, impulsive noises, and network structures can be obtained either by theoretical derivation or by simulation experiments. For the theoretical approach, the optimal parameters \(a\mathrm{ and }b\) of the three proposed diffusion adaptive filtering algorithms could be obtained by minimizing the mean-square deviation (MSD) at the current time, but the resulting iterative formulas would increase the computational complexity. In this paper, the parameters \(a\mathrm{ and }b\) of the three proposed diffusion adaptive filtering algorithms are therefore chosen by simulation, and the optimal values are explored below. In the simulation experiments with an unknown linear system, we set \(M=16\), and the unknown weight vector is selected randomly. Each distributed network topology consists of \(N=20\) nodes. Following [32], the impulsive noises are generated from a Lévy alpha-stable distribution by setting \(\alpha , \beta , \gamma , \mathrm{and}\, \delta \) in MATLAB (2016b), and they are set to be spatiotemporally independent. We apply the uniform rule for the combination weights in the combination step, i.e., \({c}_{l,n}=1/{N}_{n}\). We use the network MSD to evaluate the performance of the diffusion adaptive filtering algorithms, where \(\mathrm{MSD}\left(i\right)=\frac{1}{N}\sum_{n=1}^{N}\mathrm{E}[{\Vert {\mathbf{W}}_{o}-{\mathbf{W}}_{n}\left(i\right)\Vert }^{2}]\). In addition, the number of independent Monte Carlo runs is 20, and each run uses 1000 iterations.
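
A hedged sketch of the network MSD metric used in all parameter studies below is given in Python/NumPy (rather than the MATLAB environment mentioned above); the function name is ours, and the Monte Carlo averaging and alpha-stable noise generation are omitted.

```python
import numpy as np

def network_msd_db(W, w_o):
    """Network MSD in dB: (1/N) * sum_n ||w_o - W_n||^2, as in Sect. 3.3.

    W   : (N, M) node estimates W_n(i)
    w_o : (M,)   true weight vector
    """
    msd = np.mean(np.sum((w_o - W) ** 2, axis=1))
    return 10 * np.log10(msd)

# Example: average network_msd_db over independent Monte Carlo runs and plot
# it against the iteration index i to reproduce curves like those in Fig. 2.
```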

3.3.1 For the DLLCLMS Algorithm

3.3.1.1 Parameter a

We evaluate the DLLCLMS algorithm for different values of \(a\) based on the network MSD. The choice of \(a\) in the DLLCLMS update (Table 1) plays a vital role in the algorithm's performance. To choose the optimum cut-off value \(a\) under various input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 2 and 3, considering both the convergence rate and the steady-state MSD, the DLLCLMS algorithm is robust to different probability densities of impulsive noise when a = 0.8 and b = 6.

Fig. 2 MSD curves for different values of a in the DLLCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by connection probability (probability = 0.2) and \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\). (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

Fig. 3 MSD curves for different values of a in the DLLCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by closeness in distance (radius = 0.3) and \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

3.3.1.2 Parameter b

We evaluate the relative efficiency of the DLLCLMS algorithm for different values of b based on the network MSD. The choice of b in the DLLCLMS update (Table 1) plays a vital role in the algorithm's performance. To choose the optimum cut-off value b under different input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 4 and 5, considering both the convergence rate and the steady-state MSD, the DLLCLMS algorithm is robust to different probability densities of impulsive noise when b = 4 and a = 0.8.

Fig. 4 MSD curves for different values of b in the DLLCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by connection probability (probability = 0.2) and \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\), and Pr = 0.8. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

Fig. 5 MSD curves for different values of b in the DLLCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by closeness in distance (radius = 0.3) and \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

3.3.2 For the DQQCLMS Algorithm

3.3.2.1 Parameter a

We evaluate the DQQCLMS algorithm for different values of a based on the network MSD. The choice of a in the DQQCLMS update (Table 2) plays a vital role in the algorithm's performance. To choose the optimum cut-off value a under different input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 6 and 7, considering both the convergence rate and the steady-state MSD, the DQQCLMS algorithm is robust to different probability densities of impulsive noise when a = 0.8 with b = 6.

Fig. 6 MSD curves for different values of a in the DQQCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by connection probability (probability = 0.2) and \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\). (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

Fig. 7 MSD curves for different values of a in the DQQCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by closeness in distance (radius = 0.3) and \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

3.3.2.2 Parameter b

We evaluate the relative efficiency of the DQQCLMS algorithm for different values of b based on the network MSD. The choice of b in the DQQCLMS update (Table 2) plays a vital role in the algorithm's performance. To choose the optimum cut-off value b under different input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 8 and 9, considering both the convergence rate and the steady-state MSD, the DQQCLMS algorithm is robust to different probability densities of impulsive noise when b = 6 and a = 0.8.

Fig. 8 MSD curves for different values of b in the DQQCLMS algorithm (μ = 0.4), with the network topology and neighbors determined by connection probability (probability = 0.2) and \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\), and Pr = 0.8. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M},i=1,2,3,\cdots ,M\)

Fig. 9 MSD curve with different b of the DQQCLMS algorithm (μ = 0.4) when the network topology and neighbors are decided by closeness in distance (radius = 0.3) with \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1000\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\)

3.3.3 For the DLECLMS Algorithm

3.3.3.1 Parameter a

We evaluate the relative efficiency of different values of a based on the resulting MSD of the DLECLMS algorithm. The choice of a in Eq. (20) plays a vital role in the performance of the DLECLMS algorithm. To choose the optimum cut-off value a under different input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 10 and 11, considering both the convergence rate and the steady-state MSD, we observe that the DLECLMS algorithm is robust to different probability densities of impulsive noise when a = 0.32 and b = 6.

Fig. 10 MSD curve with different a of the DLECLMS algorithm (μ = 0.4) when the network topology and neighbors are decided by probability (probability = 0.2) with \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1500\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\). (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\)

Fig. 11 MSD curve with different a of the DLECLMS algorithm (μ = 0.4) when the network topology and neighbors are decided by closeness in distance (radius = 0.3) with \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1500\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\)

3.3.3.2 Parameter b

We evaluate the relative efficiency of different values of b based on the resulting MSD of the DLECLMS algorithm. The choice of b in Eq. (20) plays a vital role in the performance of the DLECLMS algorithm. To choose the optimum cut-off value b under different input signals, different intensities of impulsive noise, and different network structures, we set up four groups of experiments in a system identification application. From Figs. 12 and 13, considering both the convergence rate and the steady-state MSD, we observe that the DLECLMS algorithm is robust to different probability densities of impulsive noise when b = 6 and a = 0.32.

Fig. 12 MSD curve with different b of the DLECLMS algorithm (μ = 0.4) when the network topology and neighbors are decided by probability (probability = 0.2) with \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1500\): (Left) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\), and Pr = 0.8. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\)

Fig. 13 MSD curve with different b of the DLECLMS algorithm (μ = 0.4) when the network topology and neighbors are decided by closeness in distance (radius = 0.3) with \(\alpha =1.2, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =1500\): (Left) \({\mathbf{R}}_{xx,n}\) is a diagonal matrix with possibly different diagonal entries chosen randomly. (Right) \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\)

Based on Figs. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 13, and considering both the convergence rate and the steady-state MSD, we conclude that the DLLCLMS, DQQCLMS, and DLECLMS algorithms are robust to different probability densities of impulsive noise when (a = 0.8 and b = 6), (a = 0.8 and b = 6), and (a = 0.32 and b = 6), respectively.

4 Simulation Results

In this paper, we focus on distributed adaptive filtering and compare the DLLCLMS, DQQCLMS, and DLECLMS algorithms with the DSELMS [27], DRVSSLMS [17], and DLLAD [5] algorithms in linear system identification. We aim to demonstrate the robustness of the three proposed algorithms under different intensities of impulsive noise and different input signals, so several groups of simulation experiments are set up with varying intensities of impulsive noise and different input signal types. For the unknown linear system, we set \(M=16\), and its weight vector is selected randomly. Each distributed network topology consists of \(N=20\) nodes. Following [32], the impulsive noise is generated from the Lévy alpha-stable distribution by setting \(\alpha\), \(\beta\), \(\gamma\), and \(\delta\), and it is taken to be spatiotemporally independent. We apply the uniform rule for the adaptation weights in the adaptation step and for the combination weights in the combination step, i.e., \({c}_{l,n}=1/{N}_{n}\). We use the network MSD to evaluate the performance of the diffusion adaptive filtering algorithms, where \(\mathrm{MSD}\left(i\right)=\frac{1}{N}\sum_{n=1}^{N}\mathrm{E}\left[{\left\Vert {\mathbf{W}}_{o}-{\mathbf{W}}_{n}\left(i\right)\right\Vert }^{2}\right]\). In addition, 20 independent Monte Carlo runs are performed, each with 2000 iterations.
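As an illustration of this setup, the sketch below draws alpha-stable impulsive noise with SciPy and evaluates the network MSD defined above. The mapping of the paper's \((\gamma, \delta)\) onto SciPy's (loc, scale) arguments, the small illustrative scale value, and the helper names are assumptions made only for this example.

```python
import numpy as np
from scipy.stats import levy_stable

def impulsive_noise(alpha, beta, gamma, delta, size, rng=None):
    """Draw alpha-stable impulsive noise samples.

    Mapping the paper's (gamma, delta) onto SciPy's (loc, scale)
    arguments is an assumption made only for this sketch."""
    return levy_stable.rvs(alpha, beta, loc=gamma, scale=delta,
                           size=size, random_state=rng)

def network_msd(w_o, W):
    """Network MSD(i) = (1/N) * sum_n ||w_o - w_n(i)||^2 over all N nodes.

    W has shape (N, M): one length-M estimate per node."""
    return np.mean(np.sum((W - w_o) ** 2, axis=1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, M = 20, 16                     # nodes and filter length, as in the text
    w_o = rng.standard_normal(M)      # randomly selected unknown weight vector
    W = np.zeros((N, M))              # all node estimates start from zero
    v = impulsive_noise(1.6, 0.05, 0.0, 1.0, size=5, rng=rng)
    print("sample impulsive noise:", v)
    print("initial network MSD (dB):", 10 * np.log10(network_msd(w_o, W)))
```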

4.1 Simulation Experiment 1

To show that the proposed distributed adaptive filter algorithms are more robust to the input signal than the DSELMS, DRVSSLMS, and DLLAD algorithms, we verify that their convergence rate is faster and their steady-state MSD is lower. We set up three experiments, all with the same network topology and the same impulsive noise but with different input signals. Any two nodes in the network topology are declared neighbors with connection probability 0.2; the network topology is shown in Fig. 14. For the different types of input signal, the MSD iteration curves of the DRVSSLMS (\(\mu\) = 0.35), DSELMS (\(\mu\) = 0.35), DNLMS (\(\mu\) = 0.35), and DLLAD (\(\mu\) = 0.35) algorithms are shown in Figs. 15, 16, and 17, where the measurement noise in the unknown system is impulsive noise with \(\alpha =1.6, \beta =0.05, \gamma =0, \mathrm{and}\, \delta =2000\).
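The probability-based topology and the uniform combination weights \(c_{l,n}=1/N_{n}\) can be generated, for example, as in the following sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def random_topology(N=20, p=0.2, rng=None):
    """Probability-based topology: any two nodes are neighbors with
    probability p; every node is always its own neighbor."""
    rng = np.random.default_rng() if rng is None else rng
    A = (rng.random((N, N)) < p).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                  # symmetric links
    np.fill_diagonal(A, 1.0)     # each node is linked to itself
    return A

def uniform_combination_weights(A):
    """Uniform rule: c_{l,n} = 1/N_n for every neighbor l of node n."""
    degrees = A.sum(axis=0)      # N_n = number of neighbors (incl. node n)
    return A / degrees           # each column then sums to one

if __name__ == "__main__":
    A = random_topology(rng=np.random.default_rng(1))
    C = uniform_combination_weights(A)
    print("column sums (should all be 1):", C.sum(axis=0)[:5])
```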

Fig. 14 Random network topology decided by probability

Fig. 15 (Left_top) Variances of the input signals \(\left\{{\mathbf{X}}_{n}\left(i\right)\right\}\) at each network node with \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\) with possibly different diagonal entries chosen randomly; (Left_bottom) the measurement noise variances \({\varepsilon }_{n}\left(i\right)\) at each network node; (Right) transient network MSD (dB) iteration curves of the DSELMS, DRVSSLMS, DLLAD, DLLCLMS, DQQCLMS, and DLECLMS algorithms

Fig. 16 (Left_top) Variances of the input signals \(\left\{{\mathbf{X}}_{n}\left(i\right)\right\}\) at each network node with \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\) with the same value in each diagonal entry; (Left_bottom) the measurement noise variances \(\left\{{\varepsilon }_{n}\left(i\right)\right\}\) at each network node; (Right) transient network MSD (dB) iteration curves of the DSELMS, DRVSSLMS, DLLAD, DLLCLMS, DQQCLMS, and DLECLMS algorithms

Fig. 17 (Left_top) Variances of the input signals \(\left\{{\mathbf{X}}_{n}\left(i\right)\right\}\) at each network node with \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}\left(i\right){\mathbf{I}}_{M}, i=1,2,3,\cdots ,M\) with a different value in each diagonal entry; (Left_bottom) the measurement noise variances \(\left\{{\varepsilon }_{n}\left(i\right)\right\}\) at each network node; (Right) transient network MSD (dB) iteration curves of the DSELMS, DRVSSLMS, DLLAD, DLLCLMS, DQQCLMS, and DLECLMS algorithms

Figures 15, 16, and 17 show that, although different input signals are used, the DLLCLMS, DQQCLMS, and DLECLMS algorithms achieve a faster convergence rate and a lower steady-state MSD than the DSELMS, DRVSSLMS, and DLLAD algorithms. Moreover, the DLLCLMS, DQQCLMS, and DLECLMS algorithms are more robust to the input signal. In conclusion, Simulation Experiment 1 indicates that the DLLCLMS, DQQCLMS, and DLECLMS algorithms are superior to the DSELMS, DRVSSLMS, and DLLAD algorithms, and the order of performance superiority is DLECLMS, DQQCLMS, and then DLLCLMS. This result arises because the cost functions designed in this paper are asymmetric and can therefore better track changes of the estimation error caused by asymmetrically distributed noise.
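For orientation, the sketch below implements a generic adapt-then-combine (ATC) diffusion LMS baseline of the kind from which such MSD curves are produced. It does not reproduce the robust error weightings of Eqs. (16), (18), and (20); the normalized step is used only to keep this plain baseline numerically stable, and the helper names are illustrative.

```python
import numpy as np

def atc_diffusion_nlms(A, w_o, iters=2000, mu=0.35, noise_std=0.1, rng=None):
    """Generic adapt-then-combine (ATC) diffusion LMS over adjacency A.

    Adaptation step: each node updates its estimate with its own data
    (normalized step used here purely for numerical stability).
    Combination step: each node averages its neighbors' intermediate
    estimates with uniform weights c_{l,n} = 1/N_n."""
    rng = np.random.default_rng() if rng is None else rng
    N, M = A.shape[0], w_o.size
    C = A / A.sum(axis=0)                                  # uniform combination weights
    W = np.zeros((N, M))
    msd = np.empty(iters)
    for i in range(iters):
        X = rng.standard_normal((N, M))                    # per-node input regressors
        d = X @ w_o + noise_std * rng.standard_normal(N)   # per-node measurements
        e = d - np.einsum("nm,nm->n", X, W)                # local estimation errors
        psi = W + mu * (e / (np.sum(X * X, axis=1) + 1e-8))[:, None] * X
        W = C.T @ psi                                      # combine neighbors' estimates
        msd[i] = np.mean(np.sum((W - w_o) ** 2, axis=1))   # network MSD at iteration i
    return msd

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, M = 20, 16
    A = (rng.random((N, N)) < 0.2).astype(float)
    A = np.triu(A, 1); A = A + A.T
    np.fill_diagonal(A, 1.0)
    curve = atc_diffusion_nlms(A, rng.standard_normal(M), rng=rng)
    print("final network MSD (dB):", 10 * np.log10(curve[-1]))
```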

4.2 Simulation Experiment 2

We set up three experiments, all with the same network topology and the same input signal but with different intensities of impulsive noise. Any two nodes in the network topology are declared neighbors if the distance between them is within a radius of 0.3; the network topology is shown in Fig. 18 (Left). The MSD iteration curves of the DRVSSLMS (μ = 0.35), DSELMS (μ = 0.35), DNLMS (μ = 0.35), DLLAD (μ = 0.35), DLLCLMS, DQQCLMS, and DLECLMS algorithms are shown in Fig. 19 for α = 1.6, α = 1.1, α = 0.8, and α = 0.4 with β = 0.05, γ = 0, and δ = 2000, respectively. A faster convergence rate and a lower steady-state MSD show that the proposed distributed adaptive filter algorithms are more robust to impulsive noise than the DSELMS, DRVSSLMS, and DLLAD algorithms.
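The distance-based neighbor rule can be sketched as follows; placing the nodes uniformly in the unit square and the function name are assumptions made only for this illustration:

```python
import numpy as np

def geometric_topology(N=20, radius=0.3, rng=None):
    """Distance-based topology: nodes are scattered uniformly in the unit
    square, and two nodes are neighbors when their Euclidean distance is
    at most `radius`; every node is also its own neighbor."""
    rng = np.random.default_rng() if rng is None else rng
    pos = rng.random((N, 2))                                     # node coordinates
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    A = (dist <= radius).astype(float)                           # diagonal (dist = 0) included
    return A, pos

if __name__ == "__main__":
    A, pos = geometric_topology(rng=np.random.default_rng(2))
    print("average neighborhood size:", A.sum(axis=0).mean())
```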

Fig. 18 (Left) Random network topology decided by a given radius; (Right_top) variances of the input signals \(\left\{{\mathbf{X}}_{n}\left(i\right)\right\}\) at each network node with \({\mathbf{R}}_{xx,n}={\sigma }_{x,n}^{2}{\mathbf{I}}_{M}\) with possibly different diagonal entries chosen randomly; (Right_bottom) the measurement noise variances \({\varepsilon }_{n}\left(i\right)\) at each network node

Fig. 19 Transient network MSD (dB) iteration curves of the DSELMS, DRVSSLMS, DLLAD, DLLCLMS, DQQCLMS, and DLECLMS algorithms: (Up_left) with \(\alpha =1.6\), (Up_right) with \(\alpha =1.1\), (Down_left) with \(\alpha =0.8\), and (Down_right) with \(\alpha =0.4\)

From Fig. 19, we find that although different probability densities of impulsive noise are considered, the DLLCLMS, DQQCLMS, and DLECLMS algorithms converge slightly faster than the DSELMS, DRVSSLMS, and DLLAD algorithms, and they still achieve a smaller steady-state error. Simulation Experiment 2 therefore shows that the DLLCLMS, DQQCLMS, and DLECLMS algorithms are more robust to impulsive noise than the DSELMS, DRVSSLMS, and DLLAD algorithms, and the order of performance superiority is DLECLMS, DQQCLMS, and then DLLCLMS. When the noise distribution tends toward a Gaussian distribution, DLECLMS and DQQCLMS perform nearly identically, but both remain better than DLLCLMS. This result again stems from the asymmetry of the cost functions designed in this paper: whatever the intensity of the impulsive noise, its distribution remains asymmetric, so the proposed cost functions can better track changes of the estimation error.

5 Conclusion

This paper proposed a family of diffusion adaptive filtering algorithms based on three asymmetric cost-of-error functions; the three resulting distributed adaptive filtering algorithms are robust to impulsive noise and to the input signal. Specifically, the three algorithms are developed by modifying the DLMS algorithm and combining the LLC, QQC, and LEC functions at every node of the distributed network. The theoretical analysis demonstrates that these three distributed adaptive filtering algorithms estimate effectively from an asymmetric cost function perspective, and the theoretical mean behavior shows that they can achieve accurate estimation. Simulation results showed that the DLLCLMS, DQQCLMS, and DLECLMS algorithms are more robust to the input signal and to impulsive noise than the DSELMS, DRVSSLMS, and DLLAD algorithms, and that they perform better when estimating an unknown linear system under changing impulsive noise environments and different types of input signals, which is significant for real-world applications. Nevertheless, the environment in practical applications is more complex, nonlinear, and time-varying and requires adjustment for different application scenarios [28]. We also need to consider whether the system parameters to be estimated are sparse (such as brain networks based on fMRI or EEG signals) [12, 14]; in such cases, adding regularized constraint terms to these adaptive algorithms would be beneficial.