1 Introduction

Over the past decades, the rapid development of adaptive filtering (AF) algorithms has been of great significance for practical applications [20]. AF algorithms have been widely adopted in signal processing, including noise cancellation [11], channel equalization [8] and system identification [30]. The fundamental AF algorithms include the well-known least mean squares (LMS) [9], recursive least squares (RLS) [22] and conjugate gradient (CG) algorithms [29]. The LMS algorithm, with its fixed step size, typically converges slowly [19]. Although the conventional RLS algorithm converges faster, it suffers from higher computational complexity and numerical instability [25]. Compared with the well-known RLS and LMS algorithms, the CG algorithm provides a trade-off between computational complexity and convergence speed [31]. Because its step length varies with the real-time input and its derivation avoids matrix inversion, the CG algorithm achieves convergence performance comparable to the RLS algorithm with a smaller computational burden [7].

The mean square error (MSE) is one of the most popular criteria in AF algorithms and works well under Gaussian noise. However, in many physical applications, including wireless channel tracking [18], multipath estimation [12] and acoustic echo cancellation [15], signals are likely to be contaminated by non-Gaussian noise. Under these circumstances, an adaptive filter designed under the MSE criterion may perform poorly [3, 5]. Therefore, drawing on Information Theoretic Learning (ITL) [23], various AF algorithms have been proposed to cope with interference from non-Gaussian noise. The maximum correntropy criterion (MCC) [17, 24, 27] and the minimum error entropy (MEE) criterion [2, 4, 14], two typical ITL criteria, are insensitive to large outliers and improve performance under impulsive noise. Several algorithms based on the MCC have been proposed, such as the LMS-type algorithm [16, 21, 26], the RLS-type algorithm named the recursive maximum correntropy (RMC) [32] and the CG-type algorithm called the q-Gaussian MCC-CG [6]. Recent research has shown that the MEE criterion is more robust than the MCC [10], and it has been effectively utilized in the LMS-type algorithm [26], the RLS-type algorithm named the recursive minimum error entropy (RMEE) [23] and the Kalman filter called the MEE-KF [2]. To the best of our knowledge, the CG method has not yet been applied under the MEE criterion.

Under the MEE criterion, the LMS-type algorithm suffers from slow convergence or poor steady-state error [14], while the RLS-type algorithm, the RMEE, improves both the convergence speed and the steady-state error at the cost of high computational complexity [23]. In this paper, we derive a CG-type algorithm based on the MEE criterion, called MEE-CG. The MEE-CG algorithm can be expected to achieve a trade-off between computational complexity and convergence speed, compared with the LMS-type and RLS-type algorithms under the MEE criterion.

For comparison, the LMS-type, CG-type [6] and RLS-type [32] algorithms under the MCC criterion are called MCC-LMS, MCC-CG and MCC-RLS, respectively. The LMS-type, CG-type and RLS-type [23] algorithms under the MEE criterion are called MEE-LMS, MEE-CG (proposed in this paper) and MEE-RLS, respectively.

The paper is organized as follows: Sect. 2 states the problem; Sect. 3 derives the MEE-CG algorithm; Sect. 4 analyzes the convergence and computational complexity; Sect. 5 provides experimental simulations on system identification; and Sect. 6 concludes the paper.

2 Problem Statement

2.1 Adaptive Filtering Theory

The block diagram of the basic adaptive filter is shown in Fig. 1. In the adaptive filtering framework, the desired response \({{d}}\in {{\textbf{R}}^1}\) is generated from an input \({\textbf{u}}\in {{\textbf{R}}^M}\) at instant n

$$\begin{aligned} {d}\left( n \right) = {\textbf{w}}_o^T{\textbf{u}}\left( n \right) {\hspace{1.0pt}} {\hspace{1.0pt}} + {v}\left( n \right) , \end{aligned}$$
(1)

where \({\textbf {w}}_{o}\in {{\textbf{R}}^M}\) denotes the unknown coefficient vector, and v(n) represents the zero-mean observation noise with variance \(\sigma ^{2}_{v}\). The estimation error can be represented as

$$\begin{aligned} {e}\left( n \right)&= {d}\left( n \right) - {\textbf{w}}_{}^T\left( {n-1 } \right) {\textbf{u}}\left( n \right) \nonumber \\&={d}\left( n \right) - y(n), \end{aligned}$$
(2)

where \(y(n)={\textbf{w}}_{}^T\left( {n -1} \right) {\textbf{u}}\left( n \right) \), and \({\textbf {w}}(n-1)\) denotes the estimate of \({\textbf {w}}_{o}\) at instant \(n-1\). For simplification, we make the following assumptions: 1) The additive noise is white, i.e.

$$\begin{aligned} E\left\{ {v\left( m \right) v\left( n \right) {\hspace{1.0pt}} } \right\} {\hspace{1.0pt}} = 0,m \ne n. \end{aligned}$$
(3)

2) The input \({\textbf{u}}(n)\) is a zero-mean white sequence, so that

$$\begin{aligned} E\left\{ {{{\textbf{u}}^T}\left( m \right) {\textbf{u}}\left( n \right) } \right\} {\hspace{1.0pt}} = E\left\{ {{\textbf{u}}\left( m \right) {{\textbf{u}}^T}\left( n \right) } \right\} {\hspace{1.0pt}} = 0,m \ne n. \end{aligned}$$
(4)

3) The inputs are uncorrelated with the additive noise at instants \((m, n)\)

$$\begin{aligned} E\left\{ {{{\textbf{u}}^H}\left( m \right) {v}\left( n \right) } \right\} {\hspace{1.0pt}} {\hspace{1.0pt}} = 0,m \ne n. \end{aligned}$$
(5)
Fig. 1 Block diagram of the basic adaptive filter
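For illustration, a minimal sketch of the data model in (1)–(2) is given below; the function name, the noise level sigma_v and the white Gaussian input are our own illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_step(w_o, w_prev, sigma_v=0.1):
    """One step of the data model (1)-(2): an illustrative sketch only."""
    u = rng.standard_normal(w_o.size)     # zero-mean white input u(n)
    v = sigma_v * rng.standard_normal()   # zero-mean observation noise v(n)
    d = w_o @ u + v                       # desired response d(n), Eq. (1)
    e = d - w_prev @ u                    # a-priori estimation error e(n), Eq. (2)
    return u, d, e
```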

2.2 Minimum Error Entropy (MEE) Criterion

From Information Theoretic Learning (ITL), the empirical version of the quadratic information potential [14, 26] for a filter with a sliding window of length L is

$$\begin{aligned} V(e) = \frac{1}{{{L^2}}}{\hspace{1.0pt}} \sum \limits _{i = 1}^{L} {\sum \limits _{j = 1}^{L} {{\lambda ^{i + j}}{G_\sigma }\left( {{e(i)} - {e(j)}} \right) } }, \end{aligned}$$
(6)

where \(\lambda \left( {0 < \lambda \le 1} \right) \) denotes the forgetting factor and \({G_\sigma }\) stands for the Gaussian kernel with bandwidth \(\sigma \), given by

$$\begin{aligned} {G_\sigma }(x) = \frac{1}{{\sqrt{2\pi } \sigma }}\exp \left( { - \frac{{{x^2}}}{{2{\sigma ^2}}}} \right) {\hspace{1.0pt}} . \end{aligned}$$
(7)

Assume that the error is a random variable with a probability density function. An estimator of Renyi’s quadratic entropy [14] for the error can then be written as

$$\begin{aligned} R_q(e)=\log \frac{1}{V(e)}. \end{aligned}$$
(8)

According to ITL, minimizing the error entropy in (8) is equivalent to maximizing the information potential in (6). Therefore, under the MEE criterion [2, 4, 14], the cost function to be maximized is

$$\begin{aligned} {J_{MEE}}({\textbf{w}}) = \frac{1}{{{L^2}}}{\hspace{1.0pt}} \sum \limits _{i = 1}^{L} {\sum \limits _{j = 1}^{L} {{\lambda ^{i + j}}{G_\sigma }\left( {{e(i)} - {e(j)}} \right) } } . \end{aligned}$$
(9)
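As a concrete illustration of (6)–(9), the following sketch evaluates the weighted information potential for a window of L errors; the function and variable names are ours, and the window is assumed to be supplied as a plain array of errors.

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel G_sigma(x), Eq. (7)."""
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def mee_cost(e, lam, sigma):
    """Weighted information potential J_MEE of Eq. (9) for window errors e(1),...,e(L)."""
    e = np.asarray(e, dtype=float)
    L = e.size
    idx = np.arange(1, L + 1)
    weights = lam ** np.add.outer(idx, idx)                     # lambda^(i+j)
    kernels = gaussian_kernel(e[:, None] - e[None, :], sigma)   # G_sigma(e(i) - e(j))
    return np.sum(weights * kernels) / L**2
```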

We get the gradient of (9) as

$$\begin{aligned}&\nabla {J_{MEE}}\left( {\textbf{w}} \right) \nonumber \\&= \frac{1}{{{L^2}{\sigma ^2}}}{\hspace{1.0pt}} \sum \limits _{i =1}^L {\sum \limits _{j = 1}^L {{\lambda ^{i + j}}\left( {{{\textbf{u}}(i)} - {{\textbf{u}}(j)}} \right) {G_\sigma }\left( {{e(i)} - {e(j)}} \right) \left( {{e(i)} - {e(j)}} \right) } } . \end{aligned}$$
(10)

For each instant n, we adopt the shorthand notation

$$\begin{aligned} \left\{ \begin{array}{l} {{\textbf{u}}_i} = {\textbf{u}}\left( {n - i} \right) ,\\ {d_i} = d\left( {n - i} \right) ,\\ {e_i} = e\left( {n - i} \right) ,i = \mathrm{{0}},\mathrm{{1}}, \cdots ,L - \mathrm{{1}}\mathrm{{.}} \end{array} \right. \end{aligned}$$
(11)

Thus, the windowed quantities can be written in matrix form as

$$\begin{aligned} \left\{ \begin{array}{l} {{\textbf{U}}_L} = \left[ {{{\textbf{u}}_{\mathrm{{0}}}},{{\textbf{u}}_\mathrm{{1}}}, \ldots ,{{\textbf{u}}_{L - \mathrm{{1}}}}} \right] = \left[ {{{\textbf{u}}_\mathrm{{0}}},{{\textbf{U}}_{L - \mathrm{{1}}}}} \right] ,\\ {{\textbf{D}}_L} = {\left[ {{d_0},{d_1}, \ldots ,{d_{L - 1}}} \right] ^T} = {\left[ {{d_0},{{\textbf{D}}_{L - 1}}} \right] ^T},\\ {\varepsilon _L} = {\left[ {{e_0},{e_1}, \ldots ,{e_{L - 1}}} \right] ^T} = {\left[ {{e_0},{\varepsilon _{L - 1}}} \right] ^T}.\\ \end{array} \right. \end{aligned}$$
(12)

So (10) can be rewritten as

$$\begin{aligned}&\nabla {J_{MEE}}\left( {\textbf{w}} \right) \nonumber \\&\quad = \frac{1}{{{L^2}{\sigma ^2}}}{\hspace{1.0pt}} \sum \limits _{i = 0}^{L - 1} {\sum \limits _{j = 0}^{L - 1} {{\lambda ^{i + j}}\left( {{{\textbf{u}}_i} - {{\textbf{u}}_j}} \right) {G_\sigma }\left( {{e_i} - {e_j}} \right) \left( {{e_i} - {e_j}} \right) } } \nonumber \\&\quad = \frac{2}{{{L^2}{\sigma ^2}}}{{\textbf{U}}_L}({{\textbf{P}}_L} - {{\textbf{Q}}_L}){{\mathbf{\varepsilon }}_L}\nonumber \\&\quad = \frac{2}{{{L^2}{\sigma ^2}}}{{\textbf{U}}_L}{\Phi _L}{{\mathbf{\varepsilon }}_L}. \end{aligned}$$
(13)

where the quantities involved are defined as

$$\begin{aligned} \left\{ \begin{array}{l} {\left[ {\textbf{Q}} \right] _{(i + 1)(j + 1)}} = {\lambda ^{i + j}}{G_\sigma }\left( {{e_i} - {e_j}} \right) ,i,j = \mathrm{{0}},\mathrm{{1}}, \ldots ,L-1\\ {\left[ {\textbf{P}} \right] _{(i + 1)(j + 1)}} = \left\{ { \begin{array}{l} \sum \limits _{k = 0}^{L - 1} {{\lambda ^{i + k}}{G_\sigma }\left( {{e_i} - {e_k}} \right) } ,i = j\\ 0,\mathrm{{ }}i \ne j \end{array}} \right. \\ {\left[ \varphi \right] _i} = {\lambda ^i}{G_\sigma }\left( {{e_i} - {e_0}} \right) ,i = \mathrm{{1}}, \ldots ,L - \mathrm{{1}}\\ {\phi _\mathrm{{0}}} = \sum \limits _{k = \mathrm{{1}}}^{L - 1} {{\lambda ^k}{G_\sigma }\left( {{e_0} - {e_k}} \right) } . \end{array} \right. \end{aligned}$$
(14)

Through expressions (12)–(14), we obtain the matrix \({\Phi _L}\) of the objective function under the MEE criterion and its recursive computation (see formula (11) in [26] for details).
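A possible vectorized construction of \({\Phi _L}\) and the gradient (13) from the definitions in (14) is sketched below; the Gaussian kernel of (7) is repeated for self-containment, and all names are illustrative.

```python
import numpy as np

def gaussian_kernel(x, sigma):                                  # Eq. (7)
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def build_phi(e_win, lam, sigma):
    """Phi_L = P_L - Q_L built from the windowed errors e_0,...,e_{L-1}, Eq. (14)."""
    e_win = np.asarray(e_win, dtype=float)
    idx = np.arange(e_win.size)
    Q = lam ** np.add.outer(idx, idx) * gaussian_kernel(e_win[:, None] - e_win[None, :], sigma)
    P = np.diag(Q.sum(axis=1))          # diagonal entries of P in Eq. (14)
    return P - Q

def mee_gradient(U, e_win, lam, sigma):
    """Gradient of the MEE cost, Eq. (13); U is M x L with columns u_0,...,u_{L-1}."""
    e_win = np.asarray(e_win, dtype=float)
    L = e_win.size
    return (2.0 / (L**2 * sigma**2)) * U @ build_phi(e_win, lam, sigma) @ e_win
```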

3 Proposed Minimum Error Entropy Conjugate Gradient

Through the derivations in Sect. 2, we have

$$\begin{aligned} \left\{ \begin{array}{l} {{\textbf{R}}_L} = {{\textbf{U}}_L}{\Phi _L}{\textbf{U}}_L^T,\\ {{\textbf{r}}_L} = {{\textbf{U}}_L}{\Phi _L}{{\textbf{D}}_L}. \end{array} \right. \end{aligned}$$
(15)

From (12), (14) and (15), we have

$$\begin{aligned} \left\{ \begin{array}{l} {{\textbf{R}}_L} = {\lambda ^2}{{\textbf{U}}_{L - 1}}{\Phi _{L - 1}}{\textbf{U}}_{L - 1}^T \\ \qquad \quad + \left[ {{{\textbf{u}}_0}{\phi _0}{\textbf{u}}_0^T + {{\textbf{U}}_{L - 1}}\varphi {\textbf{u}}_0^T + {{\textbf{u}}_0}{\varphi ^T}{\textbf{U}}_{L - 1}^T} \right] ,\\ {{\textbf{r}}_L} = {\lambda ^2}{{\textbf{U}}_{L - 1}}{\Phi _{L - 1}}{{\textbf{D}}_{L - 1}} \\ \qquad \quad + \left[ {{{\textbf{u}}_0}{\phi _0}{d_0} + {{\textbf{U}}_{L - 1}}\varphi {d_0} + {{\textbf{u}}_0}{\varphi ^T}{{\textbf{D}}_{L - 1}}} \right] . \end{array} \right. \end{aligned}$$
(16)

After applying the attenuation (exponentially decaying) window to the correlation quantities of the data matrix in the CG, we obtain recursions of the same form as in the RLS-type algorithm [1, 23]

$$\begin{aligned} \left\{ \begin{array}{c} {{\textbf{R}}_L} = \lambda _{}^{\mathrm{{2}}}{{\textbf{R}}_{L - 1}} + {{\textbf{u}}_0}{\phi _0}{\textbf{u}}_0^T,\\ {{\textbf{r}}_L} = \lambda _{}^{\mathrm{{2}}}{{\textbf{r}}_{L - 1}} + {{\textbf{u}}_0}{\phi _0}d_0^{}. \end{array} \right. \end{aligned}$$
(17)
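Under these recursions, \({\textbf{R}}_L\) and \({\textbf{r}}_L\) can be updated at each instant from the newest sample only. A minimal sketch, assuming the window convention of (11) and with the kernel of (7) repeated for self-containment, is:

```python
import numpy as np

def gaussian_kernel(x, sigma):                                  # Eq. (7)
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def update_R_r(R_prev, r_prev, u0, d0, e_win, lam, sigma):
    """Recursive update of R_L and r_L per Eq. (17), with phi_0 from Eq. (14).
    e_win = [e_0, e_1, ..., e_{L-1}] holds the errors in the current window."""
    e_win = np.asarray(e_win, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    k = np.arange(1, e_win.size)
    phi0 = np.sum(lam**k * gaussian_kernel(e_win[0] - e_win[1:], sigma))
    R = lam**2 * R_prev + phi0 * np.outer(u0, u0)   # R_L = lambda^2 R_{L-1} + u_0 phi_0 u_0^T
    r = lam**2 * r_prev + phi0 * d0 * u0            # r_L = lambda^2 r_{L-1} + u_0 phi_0 d_0
    return R, r
```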

The online CG method aims to minimize the following cost function [7]

$$\begin{aligned} \min F\left( {\textbf{w}}_L^{}\right) = \min \left( \frac{1}{2}{\textbf{w}}_L^T{\textbf{R}}_L^{}{\textbf{w}}_L^{} - {\textbf{r}}_L^T{\textbf{w}}_L^{}\right) , \end{aligned}$$
(18)

and the solution of (18) is

$$\begin{aligned} {\textbf{R}}_L^{}{\textbf{w}}_L^{} = {\textbf{r}}_L^{}. \end{aligned}$$
(19)

The weight vector \({\textbf{w}}_L^{}\) and the direction vector \(\textbf{p}_L\) can be updated as

$$\begin{aligned} {\textbf{w}}_L^{} = {\textbf{w}}_{L - 1}^{} + \alpha _L\textbf{p}_L^{}, \end{aligned}$$
(20)
$$\begin{aligned} \textbf{p}_{L+ 1} = \textbf{g}_L + {\beta _L}{} \textbf{p}_L, \end{aligned}$$
(21)

where the step factor \(\alpha _L^{}\) and the direction factor \({\beta _L}\) are updated at each iteration, and \(\textbf{g}_L\) denotes the residual vector.

Substituting (17) and (20) into \(\textbf{g}_L = {\textbf{r}}_L - {\textbf{R}}_L{\textbf{w}}_L\), the residual vector \(\textbf{g}_L\) for the online CG method becomes

$$\begin{aligned} \textbf{g}_L =&\;{\textbf{r}}_L^{} - {\textbf{R}}_L^{}{\textbf{w}}_L^{}\nonumber \\ =&\; \lambda ^{\mathrm{{2}}} \textbf{g}_{L - 1} - \alpha _L{\textbf{R}}_L\textbf{p}_L + {\textbf{u}}_0{\phi _0}(d_0 - {\textbf{u}}_0^H{\textbf{w}}_{L - 1}). \end{aligned}$$
(22)

According to the conjugacy of the direction vectors \(\textbf{p}_L\) in the CG method, we have

$$\begin{aligned} \textbf{p}_{L + 1}^T{{\textbf{R}}_L}{} \textbf{p}_L =&\; (\textbf{g}_L + {\beta _L}{} \textbf{p}_L)_{}^T{{\textbf{R}}_L}{} \textbf{p}_L\nonumber \\ =&\; \textbf{g}_L^T{{\textbf{R}}_L}{} \textbf{p}_L + {\beta _L}{} \textbf{p}_L^T{{\textbf{R}}_L}{} \textbf{p}_L\nonumber \\ =&\; 0, \end{aligned}$$
(23)

and then the expression of \({\beta _L}\) can be derived as [1]

$$\begin{aligned} {\beta _L} = - \frac{{\textbf{g}_L^T{{\textbf{R}}_L}{} \textbf{p}_L}}{{\textbf{p}_L^T{{\textbf{R}}_L}{} \textbf{p}_L}}. \end{aligned}$$
(24)

To determine the step factor \(\alpha _L^{}\) defined in (20), we substitute (20) into the objective function \(F({\textbf{w}}_L^{})\) in (18), which yields

$$\begin{aligned} F({\textbf{w}}_L^{}) =&\; F({\textbf{w}}_{L - 1}^{} + \alpha _L \textbf{p}_L). \end{aligned}$$
(25)

Setting the derivative of (25) with respect to \(\alpha _L^{}\) to zero gives

$$\begin{aligned} \alpha _L = \frac{{{\textbf{r}}_L^T\textbf{p}_L - {\textbf{w}}_{L - 1}^T{\textbf{R}}_L \textbf{p}_L}}{{\textbf{p}_L^T{\textbf{R}}_L \textbf{p}_L}}. \end{aligned}$$
(26)

According to the derivations above, the detailed description of the proposed MEE-CG algorithm is given in Algorithm 1.

Algorithm 1 Proposed MEE-CG algorithm
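To make the structure of Algorithm 1 concrete, the following sketch assembles one time step of the proposed MEE-CG from (14), (17), (20)–(22), (24) and (26). It is a simplified illustration under our own naming and initialization assumptions (e.g. R initialized as a small multiple of the identity, p initialized to the first residual, and the error window initialized to zeros); safeguards such as protecting the divisions by \(\textbf{p}^T{\textbf{R}}\textbf{p}\) are omitted.

```python
import numpy as np

def gaussian_kernel(x, sigma):                                  # Eq. (7)
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def mee_cg_step(R, r, w, p, g, e_win, u0, d0, lam, sigma):
    """One iteration of the MEE-CG sketch: returns the updated state."""
    # newest error and sliding error window, Eq. (11)
    e0 = d0 - w @ u0
    e_win = np.concatenate(([e0], e_win[:-1]))

    # phi_0 from Eq. (14) and the recursions of Eq. (17)
    k = np.arange(1, e_win.size)
    phi0 = np.sum(lam**k * gaussian_kernel(e_win[0] - e_win[1:], sigma))
    R = lam**2 * R + phi0 * np.outer(u0, u0)
    r = lam**2 * r + phi0 * d0 * u0

    # step factor alpha_L, Eq. (26) (uses the previous weights w_{L-1})
    Rp = R @ p
    alpha = (r @ p - w @ Rp) / (p @ Rp)

    # residual g_L, Eq. (22), then weight update, Eq. (20)
    g = lam**2 * g - alpha * Rp + phi0 * (d0 - u0 @ w) * u0
    w = w + alpha * p

    # direction factor beta_L, Eq. (24), and new direction, Eq. (21)
    beta = -(g @ Rp) / (p @ Rp)
    p = g + beta * p
    return R, r, w, p, g, e_win
```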

4 Performance Analyses

4.1 Mean Value Behavior

Assuming that the input signal is ergodic and wide-sense stationary and that \(\alpha \left( n \right) \), \(\beta \left( n \right) \), \(p\left( n \right) \), \({\textbf{w}}\left( n \right) \), \(\textbf{g}\left( n \right) \), \({\textbf{r}}\left( n \right) \) and \({\textbf{R}}\left( n \right) \) are mutually independent, we write \(E\left[ {\alpha \left( n \right) } \right] = {\bar{\alpha }} \), \(E\left[ {\beta \left( n \right) } \right] = {\bar{\beta }} \), \(E\left[ {{\textbf{r}}\left( n \right) } \right] = r\) and \(E\left[ {{\textbf{R}}\left( n \right) } \right] = R\) for simplicity. After applying the Z-transform to the proposed MEE-CG, the stability of the described system requires the same ranges of \({\bar{\alpha }}\) and \({\bar{\beta }}\) as in [1]

$$\begin{aligned} \left\{ \begin{array}{l} - 1 \le {\bar{\beta }} + {\bar{\alpha }} {\lambda _i} - {\bar{\beta }} - 1,\\ - 1 \le {\bar{\beta }} - {\bar{\alpha }} {\lambda _i} + {\bar{\beta }} + 1,\\ - 1 \le {\bar{\beta }} \le 1. \end{array} \right. \Rightarrow 0 \le {\bar{\alpha }} \le \frac{{2{\bar{\beta }} + 2}}{{{\lambda _{\max }}}}. \end{aligned}$$
(27)

where \(\lambda _i\) denotes the ith eigenvalue of R. When \({\bar{\beta }} \rightarrow 0\), we have \(0 \le {\bar{\alpha }} \le \mathrm{{2}}\lambda _{\max }^{\mathrm{{ - 1}}}\), which coincides with the step-size bound of the steepest descent algorithm in [9].

Besides, after M iterations we have \(\tilde{\textbf{R}}(n)\tilde{\textbf{w}}(n) \approx \tilde{\textbf{r}}(n)\), where \( {\tilde{\textbf{x}}}\) denotes the estimate of \(\textbf{x}\). Hence, the norm of \( \textbf{g}(M)\) satisfies

$$\begin{aligned} \parallel \textbf{g}(M) \parallel =\parallel {\tilde{\textbf{r}}(n) - \tilde{\textbf{R}}(n)\tilde{\textbf{w}}({M})} \parallel < \varepsilon , \end{aligned}$$
(28)

where \(\varepsilon \) can be an arbitrarily small value. According to [1], the norm \(\parallel \textbf{w}_o -\textbf{w}(n) \parallel \) is bounded as follows

$$\begin{aligned} \parallel \textbf{w}_o -\textbf{w}(n) \parallel _{\textbf{R}_1 }\le 2 \parallel \textbf{w}_o -\textbf{w}(0) \parallel _{\textbf{R}_1 }\left( \frac{\sqrt{\kappa }-1}{\sqrt{\kappa }+1}\right) ^{n} \end{aligned}$$
(29)

where \(\textbf{R}_1 = \tilde{\textbf{R}}(n)\) is positive definite, \(\Vert \textbf{w} \Vert _{{\textbf{R}}_1} = \sqrt{{{\textbf{w}}^T}{{\textbf{R}}_1}{\textbf{w}}},\) and the condition number is defined as \(\kappa = {\left\| {\textbf{R}} \right\| _2}{\left\| {{{\textbf{R}}^{ - 1}}} \right\| _2}.\) Taking the expectation on both sides of (29), we conclude that the MEE-CG is convergent in the mean.
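For instance, the right-hand side of the bound (29) can be evaluated numerically once an estimate of \(\textbf{R}_1\) is available; the helper below is a small sketch of ours (the 2-norm condition number is computed directly from the matrix).

```python
import numpy as np

def cg_bound(R1, w_o, w0, n):
    """Right-hand side of the bound (29) for a positive definite R1."""
    kappa = np.linalg.cond(R1)                        # kappa = ||R||_2 ||R^{-1}||_2
    rho = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    init_dev = np.sqrt((w_o - w0) @ R1 @ (w_o - w0))  # ||w_o - w(0)||_{R_1}
    return 2.0 * init_dev * rho**n
```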

4.2 Mean Square Behavior

After reaching the steady state, we have \({{\textbf{w}}(n)} \approx {{\textbf{w}}(n - 1)}\), so from (20) we have \(\alpha (n) \textbf{p}(n) \approx \textbf{0}\). Multiplying (26) by \(\alpha (n)\) then gives \(\alpha (n)^{} \approx 0\), and thus \(\textbf{g}(n) \approx \textbf{g}(n-1)\). According to (24), \({\beta (n)} \approx 0\), so from (21) we obtain \(\textbf{p}(n+1) \approx \textbf{g}(n)\). Substituting this into (26), one obtains \(\textbf{g}(n) \approx \textbf{0}\). Since \(\textbf{g}(n) = {\textbf{r}}(n) - {\textbf{R}}(n)^{}{\textbf{w}}(n) \approx {\textbf{r}}(n)^{} - {\textbf{R}}(n){\textbf{w}}(n-1)\), we get \({\textbf{w}}(n) \approx {\textbf{R}}(n)^{ - 1}{\textbf{r}}(n)\), which coincides with the MEE-RLS solution. Therefore, the steady-state behaviors of the CG and RLS algorithms are equivalent [7]. Then, according to [23], the total weighted error of the proposed MEE-CG in the steady state is the same as that of the RMEE algorithm

$$\begin{aligned} \mathop {\lim }\limits _{n \rightarrow \infty } E\left\| {{\varvec{{\tilde{w}}}}(n)} \right\| _2^2 \approx \frac{{\mathrm{{1}} - {\lambda ^2}}}{{\mathrm{{1 + }}{\lambda ^2}}}M\sigma _u^{ - \mathrm{{2}}}E\left\{ {{{\textbf{v}}^2}(n)\phi (n)^2} \right\} {E^{ - \mathrm{{2}}}}\left\{ {{\phi (n)}} \right\} . \end{aligned}$$
(30)

The following numerical simulations also verify the correctness of the theoretical values.
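The theoretical value in (30) can be evaluated as below, assuming the two expectations \(E\{v^2(n)\phi^2(n)\}\) and \(E\{\phi(n)\}\) have already been estimated separately (e.g. by sampling the noise distribution). The helper is ours and only sketches how the curves labelled TH-MEE-CG in Sect. 5 could be produced.

```python
import numpy as np

def theoretical_msd_db(lam, M, sigma_u2, E_v2phi2, E_phi):
    """Steady-state weighted error of Eq. (30), returned in dB."""
    msd = (1.0 - lam**2) / (1.0 + lam**2) * M / sigma_u2 * E_v2phi2 / E_phi**2
    return 10.0 * np.log10(msd)
```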

4.3 Computational Complexity

Table 1 Computational complexities of the MSE-RLS, MSE-CG, MCC-RLS, MCC-CG, MEE-RLS and MEE-CG per iteration

Table 1 lists the computational complexities of the MSE-RLS, MSE-CG, MCC-RLS, MCC-CG, MEE-RLS and MEE-CG per iteration. It shows that the computational complexity of the MSE-CG, MCC-CG and MEE-CG is lower than that of the MSE-RLS, MCC-RLS and MEE-RLS, respectively.

Sections 4.2 and 4.3 show that the proposed MEE-CG algorithm is capable of achieving the same steady-state error as the MEE-RLS with a lower computational burden.

5 Experimental Results

In this section, simulations are carried out to verify the theoretical analysis and the superiority of the proposed MEE-CG algorithm. All results are averaged over 100 independent Monte Carlo trials, and the performance of the algorithms is evaluated by the steady-state mean-square deviation (MSD)

$$\begin{aligned} \text {MSD} = \mathop {\lim }\limits _{n \rightarrow \infty } E\left\{ {\left\| {{{\textbf{w}}_o} - {\textbf{w}}(n)} \right\| _2^2} \right\} . \end{aligned}$$
(31)
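In practice, the transient MSD curves reported below are obtained by averaging the squared deviation over trials at every iteration; a sketch of this bookkeeping (the array shape is our assumption) is:

```python
import numpy as np

def msd_curve_db(w_o, w_estimates):
    """MSD of Eq. (31) in dB; w_estimates has shape (trials, iterations, M)."""
    dev = w_estimates - w_o[None, None, :]            # w_o - w(n), per trial and iteration
    msd = np.mean(np.sum(dev**2, axis=-1), axis=0)    # average over the Monte Carlo trials
    return 10.0 * np.log10(msd)
```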
Fig. 2 Transient MSDs (dB) of the MCC-CG for different values of \(\lambda ^2\) when \(\sigma \) = 1

Fig. 3 Transient MSDs (dB) of the MEE-CG algorithm for different values of \(\lambda ^2\) when \(\sigma \) = 1

In the simulations, the unknown parameter \({\textbf {w}}_{o}\) is chosen to be a vector of size \(12\times 1\), and the length of the sliding window is set to L=15. The inputs \({\textbf {u}}\) are white Gaussian random sequences with covariance matrix \(E\{{\textbf {uu}}^{T}\}\)=\({\textbf {I}}_{5}\) and \(E\{{\textbf {u}}^{T}{} {\textbf {u}}\}\)=5. The additive impulsive noise is \(v(n) \sim 0.94N(0,0.01) + 0.06N(0,225).\) Figures 2 and 3 compare the MCC-CG and the proposed MEE-CG under \(\sigma \)=1. The \(\lambda ^{2}\) values for MEE-CG1, MEE-CG2, MEE-CG3, MEE-CG4 and MEE-CG5 are 0.942, 0.965, 0.980, 0.990 and 0.995, respectively. The corresponding theoretical values from (30) are denoted TH-MEE-CG1, TH-MEE-CG2, TH-MEE-CG3, TH-MEE-CG4 and TH-MEE-CG5, respectively.
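The impulsive measurement noise above is a two-component Gaussian mixture; it can be generated, for instance, as follows (a sketch with our own naming):

```python
import numpy as np

rng = np.random.default_rng(0)

def impulsive_noise(n_samples):
    """Samples from v(n) ~ 0.94 N(0, 0.01) + 0.06 N(0, 225)."""
    outlier = rng.random(n_samples) < 0.06                   # impulsive component, probability 0.06
    std = np.where(outlier, np.sqrt(225.0), np.sqrt(0.01))   # per-sample standard deviation
    return std * rng.standard_normal(n_samples)
```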

Fig. 4 Transient MSDs (dB) of MEE-CG and other MEE-based algorithms

Figure 2 shows that, for these five sets of parameters, the convergence rate of the MCC-CG algorithm is proportional to the value of \(\lambda \), while the steady-state errors are inversely proportional to the value of \(\lambda \). The theoretical steady-state error increases as the forgetting factor \(\lambda \) decreases or the bandwidth \(\sigma \) increases.

Figure 3 shows that, for these five sets of parameters, the convergence rate of the MEE-CG algorithm is proportional to the value of \(\lambda \), while the steady-state errors are inversely proportional to the value of \(\lambda \). The theoretical steady-state error increases when the forgetting factor \(\lambda \) decreases or the bandwidth \(\sigma \) increases. The convergence rates of the MCC-CG and MEE-CG can be compared from Figs. 2 and 3: the MEE-CG1 reaches the same MSD about 800 iterations faster than the MCC-CG1, and the MEE-CG4 is about 400 iterations faster than the MCC-CG4. Thus, the proposed MEE-CG algorithm achieves a faster convergence speed than the MCC-CG. Experimental results and related research show that, when the kernel width is set within an appropriate range, MEE-type algorithms achieve good filtering performance; in practice the kernel width can be chosen by cross-validation or by trial and error [2, 4, 14, 26]. In this paper, \(\sigma = 1\), chosen by trial and error, is adequate to ensure the performance of the proposed algorithm.

Then, the comparison among MEE-LMS1 \(\left( {\mu = 0.2} \right) \), MEE-LMS2 \(\left( {\mu = 0.4} \right) \), MEE-RLS1 [23] \(\left( {{\lambda ^2} = 0.990} \right) \), MEE-RLS2 [23] \(\left( {{\lambda ^2} = 0.965} \right) \), MEE-CG1 \(\left( {{\lambda ^2} = 0.990} \right) \) and MEE-CG2 \(\left( {{\lambda ^2} = 0.965} \right) \) is given in Fig. 4. The proposed MEE-CG has a smaller MSD and a faster convergence rate than the MEE-LMS algorithm. Under the same \({\lambda ^2}\) values, it reaches the same MSD as the MEE-RLS, while its convergence rate is slightly slower than that of the MEE-RLS.

6 Conclusion

We have proposed a new MEE-CG algorithm based on the MEE criterion, which has the same theoretical steady-state error as the MEE-RLS. Numerical simulations reveal that, when coping with non-Gaussian noise, the MEE-CG converges faster than the MEE-LMS algorithm and achieves a smaller steady-state error than the MCC-CG algorithm. In addition, the MEE-CG has performance comparable to the MEE-RLS, while its computational complexity is lower than that of the MEE-RLS.