
17.1 Introduction

This section provides a brief description of the Kalman filter, which can be considered an extension of the Wiener filtering concept [4]. The objective of the Kalman filter is to minimize the mean-square estimation error of a nonstationary signal buried in noise. The estimated signal itself is modeled utilizing the state–space formulation [1] describing its dynamical behavior. In summary, Kalman filtering deals with random processes described through state–space modeling, which generate signals that can be measured and processed utilizing time-recursive estimation formulas. The presentation here is brief and addresses the case of signals and noises represented in vector form; for more details on this subject the reader can consult the many available books on Kalman filtering, including [3, 5]. There are many different ways to describe the Kalman filtering problem and to derive its corresponding relations; here we follow the presentations of [2, 6].

17.2 State–Space Model

A convenient way of representing some dynamic systems is through what is called the state–space representation [1]. In such a description, the outputs of the memory elements are considered the system states. The state signals are collected in a vector denoted as x(k), which is in turn generated from the previous state x(k − 1) and from an external signal vector denoted as n(k). The observed or measured signals are collected in another vector denoted as y(k), whose elements originate from linear combinations of the previous state variables and of external signals represented in \({\mathbf{n}}_{1}(k)\). If we know the values of the external signals n(k) and \({\mathbf{n}}_{1}(k)\), we can determine the current values of the system states, which will become the delay inputs, and the system observation vector as follows:

$$\begin{array}{rcl} \left \{\begin{array}{l} \mathbf{x}(k) = \mathbf{A}(k - 1)\mathbf{x}(k - 1) + \mathbf{B}(k)\mathbf{n}(k) \\ \mathbf{y}(k) ={ \mathbf{C}}^{T}(k)\mathbf{x}(k - 1) + \mathbf{D}(k){\mathbf{n}}_{1}(k)\end{array} \right.& &\end{array}$$
(17.1)

where x(k) is the (N + 1) × 1 vector of the state variables. If M is the number of system inputs and L is the number of system outputs, we then have that A(k − 1) is \((N + 1) \times (N + 1)\), B(k) is (N + 1) × M, C(k) is (N + 1) × L, and D(k) is L × L.

Figure 17.1 shows the state–space system that generates the observation vector y(k) from the noise inputs n(k) and \({\mathbf{n}}_{1}(k)\), where the state variables x(k) are processes driven by the excitation noise n(k). The recursive solution of (17.1) can be described as

$$ \begin{array}{rcl} \mathbf{x}(k) ={ \prod \nolimits }_{l=0}^{k-1}\mathbf{A}(l)\mathbf{x}(0) +{ \sum \nolimits }_{i=1}^{k}\left [{\prod \nolimits }_{l=i}^{k-1}\mathbf{A}(l)\right ]\mathbf{B}(i)\mathbf{n}(i)& &\end{array}$$
(17.2)

where \({\prod \nolimits }_{l=k}^{k-1}\mathbf{A}(l)\) is defined as the identity matrix (an empty product).
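As a minimal illustration of the model (17.1), the following Python sketch simulates a small time-invariant instance of the state and observation equations; the dimensions and matrices are arbitrary assumptions used only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L, K = 3, 1, 2, 50                 # assumed state order, inputs, outputs, iterations

A = 0.5 * np.eye(N + 1)                  # A(k-1): (N+1) x (N+1), here time invariant
B = rng.standard_normal((N + 1, M))      # B(k): (N+1) x M
C = rng.standard_normal((N + 1, L))      # C(k): (N+1) x L
D = np.eye(L)                            # D(k): L x L

x = np.zeros(N + 1)                      # state x(0)
for k in range(1, K + 1):
    n  = rng.standard_normal(M)          # excitation noise n(k)
    n1 = rng.standard_normal(L)          # measurement noise n_1(k)
    x_prev = x
    x = A @ x_prev + B @ n               # state equation of (17.1)
    y = C.T @ x_prev + D @ n1            # observation equation of (17.1)
```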

Fig. 17.1 State-space model for Kalman filtering formulation

17.2.1 Simple Example

Let’s describe a particular example in which we assume the signal x(k) is a sample of an autoregressive process generated by a system described by the linear difference equation

$$\begin{array}{rcl} x(k) = -{\sum \nolimits }_{i=1}^{N+1}{a}_{i}(k - 1)x(k - i) + n(k)& &\end{array}$$
(17.3)

where n(k) is a white noise. The coefficients \({a}_{i}(k - 1)\), for \(i = 1,2,\ldots,N + 1\), are the time-varying parameters of the AR process. Part of the Kalman filtering procedure is the estimation of x(k) from noisy measurements denoted as \({y}_{l}(k)\) for \(l = 1,2,\ldots,L\).

We can collect a sequence of signals to be estimated and noise measurements in vector forms as

$$\begin{array}{rcl} \mathbf{x}(k)& =& \left [\begin{array}{c} x(k) \\ x(k - 1)\\ \vdots \\ x(k - N) \end{array} \right ] \\ \mathbf{y}(k)& =& \left [\begin{array}{c} {y}_{1}(k) \\ {y}_{2}(k)\\ \vdots \\ {y}_{L}(k) \end{array} \right ]\end{array}$$
(17.4)

where L represents the number of observations collected in y(k).

Each entry of the observation vector is considered to be generated through the following model:

$$\begin{array}{rcl} & {y}_{l}(k) ={ \mathbf{c}}_{l}^{T}(k)\mathbf{x}(k - 1) + {n}_{1,l}(k)&\end{array}$$
(17.5)

where \({n}_{1,l}(k)\) for \(l = 1,2,\ldots,L\) are also white noises, uncorrelated with each other and with n(k).

Applying the state–space formulation to the particular set of (17.3) and (17.5) leads to a block of state variables originating from an autoregressive process described by

$$\begin{array}{rcl} \mathbf{x}(k)& =& \left [\begin{array}{c} x(k) \\ x(k - 1)\\ \vdots \\ x(k - N) \end{array} \right ] \\ & =& \left [\begin{array}{cccccc} - {a}_{1}(k - 1)& - {a}_{2}(k - 1)&\cdots & - {a}_{N}(k - 1)& - {a}_{N+1}(k - 1) \\ 1 & 0 &\cdots & 0 & 0\\ 0 & 1 &\cdots & 0 & 0\\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 &\cdots & 1 & 0 \end{array} \right ]\left [\begin{array}{c} x(k - 1) \\ x(k - 2)\\ \vdots \\ x(k - N - 1) \end{array} \right ] \\ & & +\left [\begin{array}{c} 1\\ 0\\ \vdots \\ 0 \end{array} \right ]n(k) \\ \mathbf{y}(k)& =& \left [\begin{array}{c} {\mathbf{c}}_{1}^{T}(k) \\ {\mathbf{c}}_{2}^{T}(k)\\ \vdots \\ {\mathbf{c}}_{L}^{T}(k) \end{array} \right ]\left [\begin{array}{c} x(k - 1) \\ x(k - 2)\\ \vdots \\ x(k - N - 1) \end{array} \right ] +{ \mathbf{n}}_{1}(k)\end{array}$$
(17.6)

where, for this case of a single-input and multiple-output system, B(k) is an (N + 1) × 1 vector (since M = 1) whose only nonzero element is the entry (1, 1), equal to one, C(k) is (N + 1) × L, and D(k) is simply an identity matrix, since the measurement noise contributes to the elements of the observation vector in an uncoupled form.
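For this AR example, the matrices of (17.6) can be assembled as in the sketch below; the AR coefficients and the observation vectors \({\mathbf{c}}_{l}(k)\) are placeholders chosen only to show the structure.

```python
import numpy as np

N, L = 3, 2                                    # assumed AR order N + 1 = 4 and L = 2 observations
a = np.array([0.5, -0.3, 0.2, -0.1])           # placeholder coefficients a_1, ..., a_{N+1}

A = np.zeros((N + 1, N + 1))                   # companion-form A(k-1) of (17.6)
A[0, :] = -a
A[1:, :N] = np.eye(N)

B = np.zeros((N + 1, 1))                       # single input: only entry (1,1) is nonzero
B[0, 0] = 1.0
C = np.random.default_rng(1).standard_normal((N + 1, L))   # columns are the c_l(k) vectors
D = np.eye(L)                                  # uncoupled measurement noise
```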

17.3 Kalman Filtering

In the following discussion we derive the Kalman filter for the general state–space description of (17.1). For that it is assumed we know

$$\begin{array}{rcl}{ \mathbf{R}}_{{n}_{1}}(k)& =& E[{\mathbf{n}}_{1}(k){\mathbf{n}}_{1}^{T}(k)]\end{array}$$
(17.7)
$$\begin{array}{rcl}{ \mathbf{R}}_{n}(k)& =& E[\mathbf{n}(k){\mathbf{n}}^{T}(k)]\end{array}$$
(17.8)

A(k − 1) and C(k), and that n(k) and \({\mathbf{n}}_{1}(k)\) are zero-mean white processes uncorrelated with each other.

By assuming that we have the measurements y(k) available and that we employ all the data available up to a given iteration, we seek the optimal estimate of the state vector x(k), denoted by \(\hat{\mathbf{x}}(k\vert k)\). As will be justified in the course of the Kalman filter derivation, the optimal solution has the following general form:

$$\begin{array}{rcl} & \hat{\mathbf{x}}(k\vert k) = \mathbf{A}(k-1)\hat{\mathbf{x}}(k-1\vert k-1) + \mathbf{K}(k)\left [\mathbf{y}(k)-{\mathbf{C}}^{T}(k)\mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1)\right ]&\end{array}$$
(17.9)

where K(k) is the (N + 1) × L matrix called the Kalman gain. The reader can notice that:

  • The term \(\mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1)\) brings the contribution of the previous state-variable estimate to the current one, as suggested by the state–space equation (17.1).

  • The term \(\left [\mathbf{y}(k) -{\mathbf{C}}^{T}(k)\mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1)\right ]\) is a correction term consisting of the difference between the observation vector and its estimate given by \({\mathbf{C}}^{T}(k)\mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1)\), which in turn is a function of the previous state-variable estimate.

  • The Kalman gain aims at filtering out estimation errors and noise so that the state variable gets the best possible correction term, which minimizes the MSE.

In order to derive the optimal solution for the Kalman gain, let’s first consider two estimates of x(k): one computed using the observation data available up to iteration k and another using the data available up to iteration k − 1, denoted by \(\hat{\mathbf{x}}(k\vert k)\) and \(\hat{\mathbf{x}}(k\vert k - 1)\), respectively. The estimation error vectors in these cases are defined by

$$\begin{array}{rcl} \mathbf{e}(k\vert k)& =& \mathbf{x}(k) -\hat{\mathbf{x}}(k\vert k)\end{array}$$
(17.10)
$$\begin{array}{rcl} \mathbf{e}(k\vert k - 1)& =& \mathbf{x}(k) -\hat{\mathbf{x}}(k\vert k - 1)\end{array}$$
(17.11)

These errors have covariance matrices defined as

$$\begin{array}{rcl}{ \mathbf{R}}_{e}(k\vert k)& =& E[\mathbf{e}(k\vert k){\mathbf{e}}^{T}(k\vert k)]\end{array}$$
(17.12)
$$\begin{array}{rcl}{ \mathbf{R}}_{e}(k\vert k - 1)& =& E[\mathbf{e}(k\vert k - 1){\mathbf{e}}^{T}(k\vert k - 1)]\end{array}$$
(17.13)

Given an instant k − 1 when the information \(\hat{\mathbf{x}}(k - 1\vert k - 1)\) and \({\mathbf{R}}_{e}(k - 1\vert k - 1)\) are available, we first try to estimate \(\hat{\mathbf{x}}(k\vert k - 1)\) which does not require the current observation. Whenever a new observation y(k) is available, \(\hat{\mathbf{x}}(k\vert k)\) is estimated.

According to (17.1), at a given iteration the actual state–space vector evolves as

$$\begin{array}{rcl} \mathbf{x}(k) = \mathbf{A}(k - 1)\mathbf{x}(k - 1) + \mathbf{B}(k)\mathbf{n}(k)& &\end{array}$$
(17.14)

Since the elements of n(k) are zero mean, a possible unbiased MSE estimate for x(k) is provided by

$$\begin{array}{rcl} \hat{\mathbf{x}}(k\vert k - 1) = \mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1)& &\end{array}$$
(17.15)

since the previous estimate \(\hat{\mathbf{x}}(k - 1\vert k - 1)\) is available and A(k − 1) is assumed known.

As a result, the state-variable estimation error when the last available observation is related to iteration k − 1 is given by

$$\begin{array}{rcl} \mathbf{e}(k\vert k - 1)& =& \mathbf{x}(k) -\hat{\mathbf{x}}(k\vert k - 1) \\ & =& \mathbf{A}(k - 1)\mathbf{x}(k - 1) + \mathbf{B}(k)\mathbf{n}(k) -\mathbf{A}(k - 1)\hat{\mathbf{x}}(k - 1\vert k - 1) \\ & =& \mathbf{A}(k - 1)\mathbf{e}(k - 1\vert k - 1) + \mathbf{B}(k)\mathbf{n}(k) \end{array}$$
(17.16)

Assuming that \(E[\mathbf{e}(k - 1\vert k - 1)] = \mathbf{0}\), meaning that \(\hat{\mathbf{x}}(k - 1\vert k - 1)\) is an unbiased estimate of x(k − 1), and recalling that the elements of n(k) are white noise with zero mean, then it is possible to conclude that

$$\begin{array}{rcl} E[\mathbf{e}(k\vert k - 1)] = \mathbf{0}& &\end{array}$$
(17.17)

so that \(\hat{\mathbf{x}}(k\vert k - 1)\) is also an unbiased estimate of x(k).

The covariance matrix of e(k | k − 1) can be expressed as follows:

$$\begin{array}{rcl}{ \mathbf{R}}_{e}(k\vert k - 1)& =& E[\mathbf{e}(k\vert k - 1){\mathbf{e}}^{T}(k\vert k - 1)] \\ & =& \mathbf{A}(k - 1)E[\mathbf{e}(k - 1\vert k - 1){\mathbf{e}}^{T}(k - 1\vert k - 1)]{\mathbf{A}}^{T}(k - 1) \\ & & +\mathbf{B}(k)E[\mathbf{n}(k){\mathbf{n}}^{T}(k)]{\mathbf{B}}^{T}(k) \\ & =& \mathbf{A}(k - 1){\mathbf{R}}_{e}(k - 1\vert k - 1){\mathbf{A}}^{T}(k - 1) + \mathbf{B}(k){\mathbf{R}}_{ n}(k){\mathbf{B}}^{T}(k) \\ & & \end{array}$$
(17.18)
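The prediction relations (17.15) and (17.18) can be sketched in NumPy as below; the function name and argument layout are choices made for illustration, not part of the original algorithm statement.

```python
import numpy as np

def kalman_predict(x_est, Re, A, B, Rn):
    """Prediction step, (17.15) and (17.18):
    x_est = x^(k-1|k-1), Re = R_e(k-1|k-1)."""
    x_pred  = A @ x_est                        # x^(k|k-1) = A(k-1) x^(k-1|k-1)
    Re_pred = A @ Re @ A.T + B @ Rn @ B.T      # R_e(k|k-1)
    return x_pred, Re_pred
```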

The next step is to estimate \(\hat{\mathbf{x}}(k\vert k)\) from \(\hat{\mathbf{x}}(k\vert k - 1)\). In this case we linearly filter the most recent estimate of the state variable, \(\hat{\mathbf{x}}(k\vert k - 1)\), and properly combine it with a linearly filtered contribution of the most recent measurement vector y(k). The resulting estimation expression for \(\hat{\mathbf{x}}(k\vert k)\) has the following form:

$$\begin{array}{rcl} \hat{\mathbf{x}}(k\vert k) =\tilde{ \mathbf{K}}(k)\hat{\mathbf{x}}(k\vert k - 1) + \mathbf{K}(k)\mathbf{y}(k)& &\end{array}$$
(17.19)

The challenge now is to compute the optimal expressions for the linear filtering matrices \(\tilde{\mathbf{K}}(k)\) and K(k).

The state-variable estimation error e(k | k) that includes the last available observation can then be described as

$$\begin{array}{rcl} \mathbf{e}(k\vert k)& =& \mathbf{x}(k) -\tilde{\mathbf{K}}(k)\hat{\mathbf{x}}(k\vert k - 1) -\mathbf{K}(k)\mathbf{y}(k)\end{array}$$
(17.20)

This expression can be rewritten in a more convenient form by replacing \(\hat{\mathbf{x}}(k\vert k - 1)\) using the first relation of (17.16) and replacing y(k) by its state–space formulation of (17.6). The resulting relation is

$$\begin{array}{rlrlrl} \mathbf{e}(k\vert k) & = \mathbf{x}(k) +\tilde{ \mathbf{K}}(k)\left [\mathbf{e}(k\vert k - 1) -\mathbf{x}(k)\right ] -\mathbf{K}(k)\left [{\mathbf{C}}^{T}(k)\mathbf{x}(k) +{ \mathbf{n}}_{ 1}(k)\right ] & & \\ & = \left [\mathbf{I} -\tilde{\mathbf{K}}(k) -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]\mathbf{x}(k) +\tilde{ \mathbf{K}}(k)\mathbf{e}(k\vert k - 1) -\mathbf{K}(k){\mathbf{n}}_{ 1}(k) & \end{array}$$
(17.21)

We know that E[n 1(k)] = 0 and that \(E[\mathbf{e}(k\vert k - 1)] = \mathbf{0}\) since \(\hat{\mathbf{x}}(k\vert k - 1)\) is an unbiased estimate of x(k). However, \(\hat{\mathbf{x}}(k\vert k)\) should also be an unbiased estimate of x(k), that is, E[e(k | k)] = 0. The latter relation is true if we choose

$$\begin{array}{rcl} \tilde{\mathbf{K}}(k) = \mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)& &\end{array}$$
(17.22)

so that the first term in the last expression of (17.21) becomes zero.

By replacing (17.22) in (17.19), the estimate of the state variable using the current measurements becomes

$$\begin{array}{rcl} \hat{\mathbf{x}}(k\vert k)& =& \left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]\hat{\mathbf{x}}(k\vert k - 1) + \mathbf{K}(k)\mathbf{y}(k) \\ & =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{K}(k)\left [\mathbf{y}(k) -{\mathbf{C}}^{T}(k)\hat{\mathbf{x}}(k\vert k - 1)\right ]\end{array}$$
(17.23)

where according to (17.21) and (17.22) the corresponding estimation error vector is described by

$$\begin{array}{rcl} \mathbf{e}(k\vert k)& =& \left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]\mathbf{e}(k\vert k - 1) -\mathbf{K}(k){\mathbf{n}}_{ 1}(k) \\ & =& \tilde{\mathbf{K}}(k)\mathbf{e}(k\vert k - 1) -\mathbf{K}(k){\mathbf{n}}_{1}(k) \end{array}$$
(17.24)

where the last equality highlights the connection with (17.19).

The covariance matrix of e(k | k) can then be expressed as

$$\begin{array}{rcl}{ \mathbf{R}}_{e}(k\vert k)& =& E[\mathbf{e}(k\vert k){\mathbf{e}}^{T}(k\vert k)] \\ & =& \left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]{\mathbf{R}}_{ e}(k\vert k - 1){\left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]}^{T} + \mathbf{K}(k){\mathbf{R}}_{{ n}_{1}}(k){\mathbf{K}}^{T}(k) \\ & =& \left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]{\mathbf{R}}_{ e}(k\vert k - 1) \\ & & -\left \{\left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]{\mathbf{R}}_{ e}(k\vert k - 1)\mathbf{C}(k) -\mathbf{K}(k){\mathbf{R}}_{{n}_{1}}(k)\right \}{\mathbf{K}}^{T}(k) \end{array}$$
(17.25)

The trace of this covariance matrix determines how good the estimate of the state variables is at a given iteration. As a result, the Kalman gain should be designed to minimize the trace of \({\mathbf{R}}_{e}(k\vert k)\), since it corresponds to the estimation error variance. Defining

$$\begin{array}{rcl}{ \xi }_{K} = \mathrm{tr}[{\mathbf{R}}_{e}(k\vert k)]& &\end{array}$$
(17.26)

it then follows that

$$\begin{array}{rcl} \frac{\partial {\xi }_{K}} {\partial \mathbf{K}(k)} = -2\left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]{\mathbf{R}}_{ e}(k\vert k - 1)\mathbf{C}(k) + 2\mathbf{K}(k){\mathbf{R}}_{{n}_{1}}(k)& &\end{array}$$
(17.27)

By equating this derivative to zero, it is possible to simplify (17.25), since its last term becomes zero, allowing the covariance matrix update to take the rather simple form

$$\begin{array}{rcl}{ \mathbf{R}}_{e}(k\vert k) = \left [\mathbf{I} -\mathbf{K}(k){\mathbf{C}}^{T}(k)\right ]{\mathbf{R}}_{ e}(k\vert k - 1)& &\end{array}$$
(17.28)

The main purpose of (17.27) is of course to calculate the Kalman gain whose expression is given by

$$\begin{array}{rcl} \mathbf{K}(k) ={ \mathbf{R}}_{e}(k\vert k - 1)\mathbf{C}(k){\left [{\mathbf{C}}^{T}(k){\mathbf{R}}_{ e}(k\vert k - 1)\mathbf{C}(k) +{ \mathbf{R}}_{{n}_{1}}(k)\right ]}^{-1}& &\end{array}$$
(17.29)
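Similarly, the gain computation (17.29) together with the measurement-update relations (17.23) and (17.28) could be sketched as follows; again, the helper name and interface are illustrative assumptions.

```python
import numpy as np

def kalman_update(x_pred, Re_pred, y, C, Rn1):
    """Measurement update, (17.29), (17.23), and (17.28)."""
    S = C.T @ Re_pred @ C + Rn1                          # innovation covariance
    K = Re_pred @ C @ np.linalg.inv(S)                   # Kalman gain (17.29)
    x_est = x_pred + K @ (y - C.T @ x_pred)              # state update (17.23)
    Re = (np.eye(len(x_pred)) - K @ C.T) @ Re_pred       # covariance update (17.28)
    return x_est, Re, K
```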

Now we have all the expressions required to describe the Kalman filtering algorithm. First we should initialize \(\hat{\mathbf{x}}(0\vert 0)\) with x(0), if available; otherwise, generate a zero-mean white Gaussian noise vector. Then initialize the error covariance matrix as \({\mathbf{R}}_{e}(0\vert 0) = \mathbf{x}(0){\mathbf{x}}^{T}(0)\). After initialization, the algorithm computes \(\hat{\mathbf{x}}(k\vert k - 1)\) as per (17.15) and then the error covariance \({\mathbf{R}}_{e}(k\vert k - 1)\) using (17.18). Next we calculate the Kalman gain as in (17.29) and update the estimate \(\hat{\mathbf{x}}(k\vert k)\) using (17.23), which now takes the form

$$\begin{array}{rcl} \hat{\mathbf{x}}(k\vert k)& =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{K}(k)\left [\mathbf{y}(k) -{\mathbf{C}}^{T}(k)\hat{\mathbf{x}}(k\vert k - 1)\right ] \\ & =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{K}(k)\left [\mathbf{y}(k) -\hat{\mathbf{y}}(k\vert k - 1)\right ] \end{array}$$
(17.30)

where in the first expression we used (17.15), and in the second expression we observe that the term \({\mathbf{C}}^{T}(k)\hat{\mathbf{x}}(k\vert k - 1)\) represents an unbiased estimate of y(k), denoted as \(\hat{\mathbf{y}}(k\vert k - 1)\). Finally, (17.28) updates the error covariance \({\mathbf{R}}_{e}(k\vert k)\) to include the current measurement contribution. Algorithm 17.1 describes the Kalman filtering procedure, and Fig. 17.2 illustrates how the building blocks of the Kalman filtering algorithm interact. As can be observed, from the measurement signal y(k) we obtain the best possible estimate of the state variable, \(\hat{\mathbf{x}}(k\vert k)\). The Kalman filter solution corresponds to the optimal minimum MSE estimator whenever the noise and the state signal are jointly Gaussian; otherwise, it is the optimal linear minimum MSE solution, see [5] for details.

Fig. 17.2 Kalman filtering structure
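A minimal sketch of the complete recursion summarized above (one possible rendering of Algorithm 17.1, not a verbatim transcription of it) can drive the two helpers sketched earlier; the initial estimate x0, the number of iterations K, the measurements y, and the model matrices A, B, C, Rn, Rn1 are assumed to be given.

```python
# One pass of the Kalman filter over K measurements y[0], ..., y[K-1],
# using the kalman_predict and kalman_update helpers sketched above.
x_est = x0.copy()                              # x^(0|0) = x(0) if available
Re = np.outer(x0, x0)                          # R_e(0|0) = x(0) x^T(0)
for k in range(K):
    x_pred, Re_pred = kalman_predict(x_est, Re, A, B, Rn)
    x_est, Re, K_gain = kalman_update(x_pred, Re_pred, y[k], C, Rn1)
```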

The complex version of the Kalman filter algorithm is almost identical to Algorithm 17.1 and can be derived by replacing \({\mathbf{x}}^{T}(0)\) by \({\mathbf{x}}^{H}(0)\), \({\mathbf{C}}^{T}(k)\) by \({\mathbf{C}}^{H}(k)\), and \({\mathbf{A}}^{T}(k - 1)\) by \({\mathbf{A}}^{H}(k - 1)\).

Example 17.1.

In a nonstationary environment the optimal coefficient vector is described by

$${w}_{o}(k) = 0.9{w}_{o}(k - 1) - 0.81{w}_{o}(k - 2) + {n}_{w}(k)$$

for k ≥ 1, where \({n}_{w}(k)\) is a zero-mean Gaussian white process with variance 0.64. Assume \({w}_{o}(0) = {w}_{o}(-1) = 0\).

Assume this time-varying coefficient is observed through a noisy measurement described by

$$\begin{array}{rcl} y(k) = 0.9{w}_{o}(k) + {n}_{1}(k)& & \\ \end{array}$$

where \({n}_{1}(k)\) is another zero-mean Gaussian white process with variance 0.16.

Run the Kalman filter algorithm to estimate \({w}_{o}(k)\) from y(k). Plot \({w}_{o}(k)\), its estimate \(\hat{{w}}_{o}(k)\), and y(k).

Solution.

The results presented correspond to the average of 200 independent runs of the Kalman filter algorithm. Figure 17.3 shows the signal \({w}_{o}(k)\) being tracked by its estimate \(\hat{{w}}_{o}(k)\) from iteration 900 to 1,000, whereas Fig. 17.4 illustrates the measurement signal y(k) from which \(\hat{{w}}_{o}(k)\) was computed. As can be observed, the Kalman filter algorithm is able to track the signal \({w}_{o}(k)\) quite closely from the noisy measurements given by y(k). □

Fig. 17.3 Tracking performance of the Kalman filter

Fig. 17.4 Noisy measurement signal
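A single-run simulation of this example can be sketched as follows, reusing the kalman_predict and kalman_update helpers from the earlier sketches; the initialization of the state estimate and of the error covariance is an assumption, since the example statement does not specify it.

```python
import numpy as np

rng = np.random.default_rng(42)
K = 1000
A = np.array([[0.9, -0.81],
              [1.0,  0.0]])                    # companion form of the AR(2) model for w_o(k)
B = np.array([[1.0], [0.0]])
C = np.array([[0.9], [0.0]])                   # y(k) = 0.9 w_o(k) + n_1(k)
Rn  = np.array([[0.64]])                       # variance of n_w(k)
Rn1 = np.array([[0.16]])                       # variance of n_1(k)

w_o = np.zeros(K + 1)                          # w_o(0) = 0; w_o(-1) = 0 handled by w_prev
w_prev = 0.0
x_est, Re = np.zeros(2), np.eye(2)             # assumed initialization
w_hat, y = np.zeros(K), np.zeros(K)

for k in range(1, K + 1):
    w_o[k] = 0.9 * w_o[k - 1] - 0.81 * w_prev + np.sqrt(0.64) * rng.standard_normal()
    w_prev = w_o[k - 1]
    y[k - 1] = 0.9 * w_o[k] + np.sqrt(0.16) * rng.standard_normal()
    x_pred, Re_pred = kalman_predict(x_est, Re, A, B, Rn)
    x_est, Re, _ = kalman_update(x_pred, Re_pred, y[k - 1:k], C, Rn1)
    w_hat[k - 1] = x_est[0]                    # tracked estimate of w_o(k)
```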

17.4 Kalman Filter and RLS

As observed in the previous section, the Kalman filtering formulation requires the knowledge of the state–space model generating the observation vector. Such information is not available in a number of adaptive-filtering setups but is quite common in problems related to tracking targets, positioning of dynamic systems, and prediction and estimation of time-varying phenomena, just to mention a few. However, a proper analysis of the Kalman filtering setup allows us to disclose some links with the RLS algorithms. These links are the subject of this section.

Let’s start by observing that in the RLS context one tries to estimate the unknown system parameters, denoted as \({\mathbf{w}}_{o}(k)\), through the adaptive-filter coefficients w(k). The equivalent operation in Kalman filtering is the estimation of x(k), given by \(\hat{\mathbf{x}}(k\vert k)\). The reference signal in the RLS case is d(k), corresponding to the scalar observation denoted as y(k) in the Kalman case. The estimate of y(k) is given by \(\hat{y}(k\vert k - 1) ={ \mathbf{c}}^{T}(k)\hat{\mathbf{x}}(k\vert k - 1)\), since the matrix \({\mathbf{C}}^{T}(k)\) becomes a row vector in the single-output case. As such, it is easy to infer that \(\hat{y}(k\vert k - 1)\) corresponds to the adaptive-filter output, denoted as y(k) in the RLS case.

Equation (5.9), repeated here for convenience,

$$\begin{array}{rcl} \mathbf{w}(k) = \mathbf{w}(k - 1) + e(k){\mathbf{S}}_{D}(k)\mathbf{x}(k)& &\end{array}$$
(17.31)

is meant for coefficient update in the RLS algorithms. This equation is equivalent to

$$\begin{array}{rcl} \hat{\mathbf{x}}(k\vert k)& =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{k}(k)\left (y(k) -{\mathbf{c}}^{T}(k)\hat{\mathbf{x}}(k\vert k - 1)\right ) \\ & =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{k}(k)\left (y(k) -\hat{ y}(k\vert k - 1)\right ) \\ & =& \hat{\mathbf{x}}(k\vert k - 1) + \mathbf{k}(k){e}_{y}(k) \end{array}$$
(17.32)

where \({e}_{y}(k)\) is the a priori error in the estimate of y(k). It can be observed that the Kalman gain matrix K(k) becomes a vector, denoted as k(k). By comparing (17.32) with (17.31), we can infer that k(k) is equivalent to \({\mathbf{S}}_{D}(k)\mathbf{x}(k)\).

The updating of the Kalman gain in the scalar output case is given by

$$\begin{array}{rcl} \mathbf{k}(k) ={ \mathbf{R}}_{e}(k\vert k - 1)\mathbf{c}(k){\left [{\mathbf{c}}^{T}(k){\mathbf{R}}_{ e}(k\vert k - 1)\mathbf{c}(k) + {r}_{{n}_{1}}(k)\right ]}^{-1}& &\end{array}$$
(17.33)

where \({r}_{{n}_{1}}(k)\) is the additive measurement-noise variance. Again, by comparing (17.32) with (17.31), we can infer that k(k) is equivalent to

$$\begin{array}{rcl}{ \mathbf{S}}_{D}(k)\mathbf{x}(k)& =& \frac{1} {\lambda }\left [{\mathbf{S}}_{D}(k - 1) -\frac{{\mathbf{S}}_{D}(k - 1)\mathbf{x}(k){\mathbf{x}}^{T}(k){\mathbf{S}}_{D}(k - 1)} {\lambda +{ \mathbf{x}}^{T}(k){\mathbf{S}}_{D}(k - 1)\mathbf{x}(k)} \right ]\mathbf{x}(k) \\ & =& \frac{{\mathbf{S}}_{D}(k - 1)\mathbf{x}(k)} {\lambda +{ \mathbf{x}}^{T}(k){\mathbf{S}}_{D}(k - 1)\mathbf{x}(k)} \\ & =& \frac{\frac{1} {\lambda }{\mathbf{S}}_{D}(k - 1)\mathbf{x}(k)} {1 + \frac{1} {\lambda }{\mathbf{x}}^{T}(k){\mathbf{S}}_{D}(k - 1)\mathbf{x}(k)} \end{array}$$
(17.34)

Now, if we assume that the measurement noise in (17.33) has unit variance, it is straightforward to observe by comparing (17.33) and (17.34) that \({\mathbf{R}}_{e}(k\vert k - 1)\) plays the role of \(\frac{1} {\lambda }{\mathbf{S}}_{D}(k - 1)\) in the RLS algorithm.

The related quantities in the specialized Kalman filter and the RLS algorithm disclosed so far are

$$\begin{array}{rcl} \mathbf{x}(k)& \Longleftrightarrow&{ \mathbf{w}}_{o}(k) \\ y(k)& \Longleftrightarrow& d(k) \\ \hat{y}(k\vert k - 1)& \Longleftrightarrow& y(k) \\ \hat{\mathbf{x}}(k\vert k)& \Longleftrightarrow& \mathbf{w}(k) \\ {e}_{y}(k)& \Longleftrightarrow& e(k) \\ \mathbf{k}(k)& \Longleftrightarrow&{ \mathbf{S}}_{D}(k)\mathbf{x}(k) \\ {\mathbf{R}}_{e}(k\vert k - 1)& \Longleftrightarrow& \frac{1} {\lambda }{\mathbf{S}}_{D}(k - 1)\end{array}$$
(17.35)

These relations show that, provided x(k) in the Kalman filter algorithm follows the pattern of \({\mathbf{w}}_{o}(k)\) and \({r}_{{n}_{1}}(k)\) has unit variance (compare (17.33) and (17.34)), the Kalman filter and the RLS algorithms should lead to similar solutions.
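A quick numerical check of this correspondence is sketched below: with \({\mathbf{R}}_{e}(k\vert k - 1) = \frac{1} {\lambda }{\mathbf{S}}_{D}(k - 1)\) and unit measurement-noise variance, the gain of (17.33) coincides with the RLS gain of (17.34). The matrix S_prev and the vector x used here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n_coef, lam = 4, 0.98
S_prev = np.eye(n_coef)                      # plays the role of S_D(k-1)
x = rng.standard_normal(n_coef)              # regressor vector x(k), i.e., c(k) in (17.33)

# RLS gain, last line of (17.34)
g_rls = (S_prev @ x / lam) / (1.0 + x @ S_prev @ x / lam)

# Kalman gain (17.33) with R_e(k|k-1) = S_D(k-1)/lambda and r_{n_1}(k) = 1
Re_pred = S_prev / lam
g_kalman = Re_pred @ x / (x @ Re_pred @ x + 1.0)

print(np.allclose(g_rls, g_kalman))          # True
```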

As happens with the conventional RLS algorithm, the Kalman filter algorithm faces stability problems when implemented in finite precision, mainly related to the ill-conditioning of the estimation error covariance matrix \({\mathbf{R}}_{e}(k\vert k)\). In practical implementations, this matrix can be updated in a factorized form such as \({\mathbf{U}}_{e}(k\vert k){\mathbf{D}}_{e}(k\vert k){\mathbf{U}}_{e}^{T}(k\vert k)\), where \({\mathbf{U}}_{e}(k\vert k)\) is upper triangular with ones on the diagonal and \({\mathbf{D}}_{e}(k\vert k)\) is a diagonal matrix.
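As an illustration of the factorized representation mentioned above, the sketch below computes a \(\mathbf{U}\mathbf{D}{\mathbf{U}}^{T}\) factorization of a symmetric positive-definite matrix; it shows only the factorization itself, not the complete factorized Kalman filter recursion.

```python
import numpy as np

def udu_factorize(P):
    """Factor a symmetric positive-definite P as U @ diag(d) @ U.T,
    with U unit upper triangular and d the diagonal of D."""
    P = P.astype(float).copy()
    n = P.shape[0]
    U = np.eye(n)
    d = np.zeros(n)
    for j in range(n - 1, -1, -1):
        d[j] = P[j, j]                               # diagonal entry D[j, j]
        U[:j, j] = P[:j, j] / d[j]                   # column j of U above the diagonal
        P[:j, :j] -= d[j] * np.outer(U[:j, j], U[:j, j])  # peel off the rank-one term
    return U, d

# Quick check on a random covariance-like matrix
M = np.random.default_rng(3).standard_normal((4, 4))
P = M @ M.T + 4 * np.eye(4)
U, d = udu_factorize(P)
print(np.allclose(U @ np.diag(d) @ U.T, P))          # True
```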