1 Introduction

Code division multiple access (CDMA) is a channel access method used across a wide range of platforms worldwide. It is based on spread spectrum technology, as found, e.g., in third-generation (3G) cellular telephony, terrestrial and satellite communications systems, and indoor wireless networks [7, 25, 30]. Although LTE (4G) is deployed by several cellular companies inside and outside the USA, their networks are not yet fully built, and LTE coverage is not yet universal. Thus, most of the older 2G and 3G systems remain in service and operate in parallel with the newer 4G systems worldwide. In the USA, companies such as AT&T and T-Mobile use GSM/WCDMA/HSPA, while Verizon, Sprint, and MetroPCS use CDMA2000/EV-DO [9]. Moreover, the newer LTE wireless interface is incompatible with the 2G and 3G networks, so it must be operated on a separate wireless spectrum. While 4G technology is intended to eventually replace the 3G technologies, it is now evident that it will take some time before LTE coverage is fully developed and widely adopted, even in the developed countries [9].

As with any radio communication system, CDMA-based systems suffer from various types of interference. Specifically, they suffer from (i) internal multiple access interference (MAI) due to the non-ideal cross-correlations among the users’ spreading sequences, (ii) narrow-band inter-symbol interference (ISI), and (iii) background noise at the receiver. These impairments degrade the performance of a CDMA system. The conventional detectors most frequently utilized to counteract CDMA interference are based on second-order statistics. In highly loaded systems, conventional detectors are not considered a suitable choice. Most of the conventional detectors suffer from external interference sources and treat all interference as lumped background noise. In CDMA-based systems, however, the primary source of interference is MAI. This has motivated the development of numerous interference rejection techniques to overcome the MAI and the near-far problem in conventional receivers [10, 30]. Several state-of-the-art approaches have been proposed in the literature to overcome this challenge, e.g., using pilot signals and training [12, 33].

In CDMA-based systems, multiuser detection is desirable in order to enhance channel capacity and mitigate MAI [10, 11, 14,15,16,17,18,19]. Multiuser detection has been introduced to obtain an optimum multiuser detector for multi-Gaussian channels in [29,30,31]. Several suboptimal detectors have also been proposed in [6,7,8], to overcome the computational complexity in realizing optimal detectors. In [20,21,22,23,24,25, 35], training pilot sequence techniques have been used to present suboptimal detectors, namely an adaptive linear detector and a zero-forcing detector.

Wang and Poor [28,29,30] proposed the blind minimum mean square error (MMSE) and the blind decorrelating detectors. The suboptimal detector based on the linear minimum mean square error (LMMSE) method has been described in [21]. In [1,2,3,4,5,6,7,8,9, 13], adaptive blind detectors were proposed based on incorporating the minimum output energy with constrained optimization methods. Several subspace approaches were proposed in the literature, e.g., in [10,11,12, 18,19,20,21,22,23,24,25,26,27,28,29,30, 36, 37]. In [36], several types of group-blind linear detectors were proposed in order to enhance the performance for the uplink and downlink channels. The key idea of these detectors is to take advantage of the cross-correlation matrix which was constructed by exploiting the correlation between successive samples of received signals. These detectors, however, are too complex to be practically implemented, especially at the mobile unit. Also, they require information regarding signal timing and the spreading codes of all users.

The aforementioned techniques periodically require the base station to send a training sequence that must be known by the mobile receiver in order to enable the latter to estimate the parameters of the channel propagation model. These parameters attempt to capture the multiple reflections of the radio waves due to obstacles, e.g., buildings, cars, and trees. Furthermore, it has been reported in [15, 23] that 20% of the bandwidth in GSM, and up to 40% in UMTS CDMA, is devoted to the training sequence. In spite of the good performance of the training sequence techniques, their cost in terms of bandwidth tends to be significant. Adaptive signal processing techniques, on the other hand, provide more efficient methods for CDMA systems under highly dynamic conditions that result from the receiver mobility, the short channel codes, and the fortuitous channel access. In particular, the desire to ensure a high communication rate has made blind adaptive techniques a hot topic, driven by their potential to eliminate or reduce training sessions. Moreover, blind techniques help recover symbol signals in other situations, e.g., (i) eavesdropping, where the training sequence is not available, and (ii) tracking, when the receiver fails to keep the desired user locked in track. It is also noted that the underlying user symbol sequences are reasonably assumed to be statistically independent. Therefore, statistical independence, or near independence, is a key assumption that makes a CDMA system suitable for the blind techniques, e.g., using information maximization [30] or minimum mutual information [27]. In [11, 16,17,18,19], typical CDMA-based systems are represented by a wide-sense stationary, slowly fading multipath environment and are expressed by a linear multichannel convolution model. Thus, the received signals in a CDMA mobile can be considered as signals generated by the linear convolutive model of statistically independent components of independent users, as shown in [27, 31,32,33,34,35,36,37]. The adaptive LMMSE detector was originally proposed to overcome the need for complex matrix inversion operations [19]; however, it still requires the spreading codes of all users. While the LMMSE detector may be suitable for deployment at the base station, where computational resources are usually abundant, it is less practical at the mobile receiver, where computational resources are scarcer. In that context, blind techniques are crucial tools for estimating multiple symbol sequences at the mobile units of a communication system using only the received wireless data and without any knowledge of the user spreading codes.

This paper aims at recovering the source symbol sequences from the linear convolutive received mixture without any knowledge of the user short channelizing codes and in the absence of explicit channel identification. In essence, the paper proposes new improved blind adaptive detectors, based on the state space approach [26, 27], using the natural gradient method for multipath channels of CDMA-based systems. Three update laws are derived for various filtering structures, and then, three adaptive blind CDMA detectors are introduced for more effective MAI and ISI suppression and symbol estimation. The second contribution of the paper is three semi-blind adaptive stochastic gradient algorithms fused into the conventional Rake receiver. Specifically, we fuse algorithms based on, respectively, FastICA, RobustICA, and principal component analysis (PCA). Furthermore, higher-order statistics (HOS) are exploited in order to make the proposed methods robust and secure against incomplete cross-correlation and the near-far problem in conventional detectors [23]. Extensive Monte Carlo simulations have been carried out to verify and evaluate the effectiveness of the proposed methods in estimating the users’ symbols. In summary, we compare the bit error rate (BER) as a function of (i) the number of users, (ii) the number of symbols per user, and (iii) the signal-to-noise ratio (SNR). The comparisons include the proposed methods with existing and conventional ones in terms of BER performance and computational complexity.

We now set the notation used throughout the paper. Lowercase letters denote scalars, bold lowercase letters denote vectors, and bold uppercase letters denote matrices. Moreover, the following symbols are used:

  • \(({\cdot })^{\mathrm{T}}\) refers to the transpose operator;

  • \({({\cdot })^{\mathrm{H}}}\) refers to the Hermitian transpose operator;

  • \(\hbox {trace}\left( {\cdot }\right) \) refers to the trace operator;

  • \(j = \sqrt{ - 1} \) refers to the imaginary unit;

  • \(\hbox {diag}\left( {\cdot }\right) \) refers to the standard diagonal of a matrix;

  • \(\hbox {Diag}\left( {\cdot }\right) \) refers to the diagonal of a block matrix, where elements may be block matrices themselves;

  • \(\hbox {sgn}\left( {\cdot }\right) \) refers to the sign operator;

  • \(E[{\cdot }]\) refers to the statistical expectation operator.

The remainder of the paper is organized as follows. In Sect. 2, brief descriptions and derivations of synchronous CDMA signal models in multipath fading are presented. The conventional Rake receiver model is described in Sect. 3. Section 4 is dedicated to the derivation of adaptive update laws and to the proposed new detection schemes. The comparative simulations with summary results and conclusions are given in Sects. 5 and 6, respectively.

2 CDMA Signal Model

We now briefly present two signal models for a CDMA-based system using one layer of channel spreading codes. Specifically, we describe the DS–CDMA signal and WCDMA signal models in a typical synchronous CDMA system usually employed, e.g., for cellphones, indoor ATM, and certain ad hoc wireless networks [9, 30].

Fig. 1 Signal generation model for a typical QPSK DS–CDMA system

2.1 A DS–CDMA Receiver Signal Model

In a DS–CDMA system, several users share the medium simultaneously by using unique individualized code signatures. We refer to Fig. 1 for a typical system schematic block diagram. In this paper, we assume the data transmission to be quaternary phase-shift keying (QPSK). At the mobile unit receiver, assume a total of K active users in an L-path multipath environment and M transmitted symbols during the observation frame time. The simplest downlink received signal model r(t) at the time sample t over a single symbol interval is given by [29]

$$\begin{aligned} r\left( t \right) = \mathop \sum \limits _{m = 1}^M \mathop \sum \limits _{k = 1}^K \mathop \sum \limits _{l = 0}^{L-1 }{\alpha _{lm}}{b_{k,m}}{s_k}\left( {t - m{T_{\mathrm{b}}} - {d_l}{T_{\mathrm{c}}}} \right) + n\left( t \right) \end{aligned}$$
(1)

where

  • l, k, and m are path, user, and symbol indices, respectively.

  • \({\alpha _{lm}}\) is the path gain. In the downlink model, the path gain is assumed to be the same among users because all users’ signals are transmitted together. Thus, the path gain \({\alpha _{lm}}\) and propagation delay factor \({d_l}\) do not depend on the user k.

  • \({b_{k,m}}\) is the mth symbol of the kth user.

  • \({s_k}\left( {\cdot }\right) \) is the kth user spreading code (chip sequence).

  • \({d_l}\) is the propagation delay factor.

  • \(t,{T_{\mathrm{b}}},{T_{\mathrm{c}}}\) are time, symbol duration, and chip duration, respectively.

  • \(n\left( t \right) \) is the channel additive white Gaussian noise (AWGN) with zero mean and variance q.
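
The terms of Eq. (1) can be illustrated with a minimal chip-rate simulation. This is only a sketch under stated assumptions: the spreading codes are random binary sequences, the delays are whole chips, the path gains are held fixed over the frame (Eq. (1) allows them to vary with m), symbols are indexed from zero, and all numerical values and the function name `received_signal` are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K, G, M = 4, 16, 50                  # users, chips per symbol, symbols (illustrative)
delays = np.array([0, 1, 3])         # d_l in chips, one entry per path (assumed)
alpha = np.array([1.0, 0.5, 0.25])   # path gains, fixed over the frame (assumed)
q = 0.01                             # noise variance

# Random +/-1 spreading codes s_k with unit energy (hypothetical codes).
codes = rng.choice([-1.0, 1.0], size=(K, G)) / np.sqrt(G)

# Unit-power QPSK symbols b_{k,m}.
bits = rng.choice([-1.0, 1.0], size=(2, K, M))
b = (bits[0] + 1j * bits[1]) / np.sqrt(2)


def received_signal(b, codes, alpha, delays, q, rng):
    """Chip-rate samples of Eq. (1): sum over symbols, users, and paths."""
    n_users, n_symbols = b.shape
    n_chips = codes.shape[1]
    # Downlink chip stream: all users' spread symbols are added before transmission.
    tx = np.zeros(n_symbols * n_chips, dtype=complex)
    for m in range(n_symbols):
        tx[m * n_chips:(m + 1) * n_chips] = b[:, m] @ codes
    # Each path contributes a delayed, scaled copy of the same chip stream.
    r = np.zeros(tx.size + delays.max(), dtype=complex)
    for a, d in zip(alpha, delays):
        r[d:d + tx.size] += a * tx
    # Complex AWGN with total variance q per sample.
    noise = np.sqrt(q / 2) * (rng.standard_normal(r.shape)
                              + 1j * rng.standard_normal(r.shape))
    return r + noise


r = received_signal(b, codes, alpha, delays, q, rng)
```

Because the gains and delays are shared by all users, the loop over paths acts on the summed chip stream, which is the downlink property noted in the list above.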

The system is assumed to be time-invariant over a small duration, which means that the channel parameters vary much more slowly than the transmitted symbol rate. Let us assume that G is the number of chips per symbol, K is the number of users, and L is the number of paths. Thus, the scalar form of Eq. (1) can be transformed to a vector form [27, 29] as:

$$\begin{aligned} \mathbf{r} = \mathbf{H}{} \mathbf{S}{} \mathbf{b} + \mathbf{n} \end{aligned}$$
(2)

where \({\mathbf{r}}\) is the received \(G1\)-dimensional signal vector; \({\mathbf{H}}\) is a \(G1\times G\) matrix, with \(G1 \ge G + L - 1\), which represents the multipath propagation coefficients; \({\mathbf{S}}\) is a \(G\times K\) block diagonal matrix; \({\mathbf{b}}\) is a K-dimensional vector, which represents the users’ data symbols; and \({\mathbf{n}}\) is the \(G1\)-dimensional channel noise vector with covariance matrix, say, \({\mathbf{Q}}\). This standardized model of received signals has been used in deriving the conventional detectors, e.g., the matched filter, Rake filter, blind LMMSE, and other blind detectors [29]. We shall use it in our development as well. In addition, an alternative two-tap symbol signal model is given in [36]:

$$\begin{aligned} {\mathbf{r}_\mathbf{n}} = {\mathbf{H}_\mathbf{0}}{\mathbf{b}_\mathbf{n}} + {\mathbf{H}_\mathbf{1}}{\mathbf{b}_{\mathbf{n - 1}}} + {\mathbf{n}_\mathbf{n}} = {\bar{\mathbf{H}}}{\bar{\mathbf{b}}_\mathbf{n}} + {\mathbf{n}_\mathbf{n}} \end{aligned}$$
(3)

where

  • \({\mathbf{r}_\mathbf{n}}\) is the total received user’s signal vector;

  • \({{\mathbf{H}}_{0}} = \left[ {{{\mathbf{h}}_{1}}, \ldots ,{{\mathbf{h}}_{\mathbf{k}}}} \right] \) is the signature matrix of the current symbol vectors of all users including MAI, specifically,

    $$\begin{aligned} {\mathbf{h}_\mathbf{k}} = \left[ {\begin{array}{*{20}{c}} 0\\ {{\mathbf{h}_\mathbf{k}}\left( 0 \right) }\\ .\\ .\\ .\\ {\mathbf{h_k}\left( {G - {D_l} - 1} \right) } \end{array}} \right] \end{aligned}$$
    (4)
  • \({\mathbf{H}_1} = \left[ {{{\bar{\mathbf{h}}_1}} , \ldots ,{{\bar{\mathbf{h}}_\mathbf{k}}} } \right] \) is the signature matrix of the previous symbol vectors of all users including ISI, where

    $$\begin{aligned} {\overline{\mathbf{h}}_k} = \left[ {\begin{array}{*{20}{c}} {\mathbf{h_k}\left( {G - {D_l}} \right) }\\ .\\ .\\ .\\ {\mathbf{h_k}\left( {G + L - 1} \right) }\\ 0 \end{array}} \right] \end{aligned}$$
    (5)

    \({D_{l}} \in \{ 0,1, \ldots ,{{G - 1}}\}\) is the delay in chip periods.

  • \({\bar{\mathbf{H}}}= \left[ {\;{\mathbf{H}_\mathbf{0}}\;{\mathbf{H}_\mathbf{1}}} \right] \) is the signature matrix of all users;

  • \({\mathbf{b}_\mathbf{n}} = {\left[ {{b_1}\left( n \right) , \ldots ,{b_K}\left( n \right) } \right] ^{\mathrm{T}}}\) are the current symbols of all users;

  • \({\mathbf{b}_{\mathbf{n - 1}}} = \left[ {{b_1}\left( {n - 1} \right) , \ldots ,{b_K}\left( {n - 1} \right) } \right] ^{\mathrm{T}}\) are the previous symbols of all users;

  • \({{\bar{\mathbf{b}}_\mathbf{n}}} = \left[ {\mathbf{b}_\mathbf{n}^\mathbf{T}\;,\;\;\mathbf{b}_{\mathbf{n - 1}}^\mathbf{T}} \right] ^{\mathrm{T}}\) are the augmented two-tap symbols of all users;

  • \({\mathbf{n}_\mathbf{n}} = \left[ {\mathbf{n}\left( {nG} \right) , \ldots , \mathbf{n}\left( {nG + G - 1} \right) } \right] ^{\mathrm{T}}\) is the independent white composite Gaussian noise vector. We defer further details to [27, 29].
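
The two-tap structure of Eq. (3) can be checked numerically with a simplified construction. The sketch below assumes a chip-spaced channel with L taps, random binary codes, and an observation window of exactly G chips aligned with the symbol boundary, so the layout of the columns differs from the delayed form in Eqs. (4) and (5); the names `eff` and `two_tap_output` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, G, L = 3, 8, 3                          # users, spreading gain, channel taps (illustrative)
codes = rng.choice([-1.0, 1.0], size=(K, G)) / np.sqrt(G)
channel = np.array([1.0, 0.4, 0.2])        # chip-spaced taps shared by all users (downlink)

# Effective signature of each user: code convolved with the channel, length G + L - 1.
eff = np.stack([np.convolve(channel, s) for s in codes], axis=1)   # (G + L - 1, K)

H0 = eff[:G, :]                            # part that falls inside the current symbol window
H1 = np.zeros((G, K))
H1[:L - 1, :] = eff[G:, :]                 # tail of the previous symbol leaking into the window
H_bar = np.hstack([H0, H1])                # [H0 H1], size G x 2K


def two_tap_output(b_now, b_prev, noise_std=0.0):
    """r_n = H0 b_n + H1 b_{n-1} + n_n, written with the augmented symbol vector."""
    b_bar = np.concatenate([b_now, b_prev])
    noise = noise_std * (rng.standard_normal(G) + 1j * rng.standard_normal(G)) / np.sqrt(2)
    return H_bar @ b_bar + noise
```

In this construction \(\mathbf{H}_0\) carries the MAI of the current symbols, while the few nonzero rows of \(\mathbf{H}_1\) carry the ISI from the previous symbols.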

In the asynchronous uplink CDMA systems, one can assume that the columns of \({\mathbf{H}_\mathbf{0}}\) and \({\mathbf{H}_\mathbf{1}}\) are mutually independent. Therefore, \({{\bar{\mathbf{H}}}}\) is a full-rank matrix, whereas for the synchronous downlink CDMA communication, \({{\bar{\mathbf{H}}}}\) is full rank with some restrictions. The main focus in this paper is on the synchronous downlink CDMA communication system, although our proposed algorithms work well in the uplink asynchronous CDMA systems [18, 36].

2.2 WCDMA Receiver Signal Model

One difference between a WCDMA system and a DS–CDMA system is the presence of scrambling codes. The main cause of the MAI in WCDMA systems is the intra-cell multiple user signals sharing the same multipath channels. Figure 2 depicts a block diagram that shows the additional code scrambling before transmission through the air interface. In Fig. 2, DPDCH stands for dedicated physical data channel, a term adopted in universal mobile telecommunications systems (UMTS), and an S/P block stands for a serial-to-parallel converter. Consequently, the basic received signal model r(t) is given by [27]:

$$\begin{aligned} r\left( t \right) = \mathop \sum \limits _{m = 1}^M \mathop \sum \limits _{k = 1}^K \mathop \sum \limits _{l = 0}^{L-1} {\alpha _{lm}}{b_{k,m}}{c_k}\left( {t - {d_l}{T_{\mathrm{c}}}} \right) {s_k}\left( {t - m{T_{\mathrm{b}}} - {d_l}{T_{\mathrm{c}}}} \right) + n\left( t \right) \end{aligned}$$
(6)

where, in addition to the previous parameters, one adds \({c_k}\left( t \right) \in \left\{ { \pm 1 \pm j} \right\} \), the complex cell-specific scrambling sequences. The remaining variables are defined as in Eq. (1). The received signal at the mobile unit is passed through a chip-matched filter and sampled at the chip rate. The received discrete vector \(\mathbf{r}\) in this case can be expressed as [3, 27, 28, 36]:

$$\begin{aligned} \mathbf{r} = \mathbf{H}{} \mathbf{C}{} \mathbf{S}{} \mathbf{b} + \mathbf{n} \end{aligned}$$
(7)

where \(\mathbf{C}\) is the \(G\times G\) complex diagonal scrambling matrix with \({\mathbf{C}}{{\mathbf{C}}^{\mathrm{H}}} = {{\mathrm{\mathbf{I}}}_{{\mathrm{\mathbf{G}\times \mathbf{G}}}}}\) and the remaining variables are defined similarly as in (2). The form of \(\mathbf{C}\) is given by:

$$\begin{aligned} \mathbf{C} = \mathrm{diag}{\left( {{\mathbf{c}_\mathbf{1}},{\mathbf{c}_\mathbf{2}}, \ldots ,{\mathbf{c}_\mathbf{G}}} \right) } \end{aligned}$$
(8)

where \({c_i} \in \left\{ { \pm 1 \pm j} \right\} \) for all \(1 \le i \le G\).
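
A short numerical check helps make the unitarity condition on \(\mathbf{C}\) concrete. Each element of \(\{\pm 1 \pm j\}\) has magnitude \(\sqrt{2}\), so the sketch below assumes that the scrambling entries are scaled by \(1/\sqrt{2}\) for \(\mathbf{C}\mathbf{C}^{\mathrm{H}} = \mathbf{I}\) to hold; this normalization and the random choice of entries are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
G = 8

# Entries drawn from {+1+j, +1-j, -1+j, -1-j}; each has magnitude sqrt(2).
c = rng.choice([-1.0, 1.0], size=G) + 1j * rng.choice([-1.0, 1.0], size=G)

C_raw = np.diag(c)
C = np.diag(c / np.sqrt(2))          # unit-modulus version (normalization is an assumption)

raw_is_2I = np.allclose(C_raw @ C_raw.conj().T, 2 * np.eye(G))   # unscaled entries give 2 I
C_is_unitary = np.allclose(C @ C.conj().T, np.eye(G))            # scaled entries give I

# With a unitary C, descrambling is a multiplication by C^H.
x = rng.standard_normal(G) + 1j * rng.standard_normal(G)
x_back = C.conj().T @ (C @ x)
```

Since \(\mathbf{C}\) is diagonal, scrambling and descrambling reduce to element-wise multiplications by \(c_i\) and its conjugate.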

Fig. 2 Signal generation based on the proposed 3GPP UMTS FDD standard

3 Conventional Blind Linear Multiuser Detectors

We briefly describe the baseline conventional linear multiuser detectors, namely the matched filter (MF), the Rake receiver, and the LMMSE detector, in multipath environments. For further details, see [25, 30].

3.1 Single User Detector (SUD)

The SUD is a standard MF detector which exploits the user’s code signature to provide an estimate of the user’s symbol sequence from the received data. This detector completely ignores the presence of MAI due to other users. One can express the MF detector for the ith user in the DS–CDMA system as follows:

$$\begin{aligned} \mathbf{b}_{\mathbf{i,MF}}^{\mathbf{D}} = \mathbf{S}_{\mathbf{i}}^\mathbf{H}{} \mathbf{r} \end{aligned}$$
(9)

where \({{\mathbf{S}}_{\mathrm{i}}} = {\mathrm{Diag}}\left( {{{{{\bar{\mathbf{s}}}}}_{\mathbf{i}}},{{{{\bar{\mathbf{s}}}}}_{\mathbf{i}}}, \ldots ,{\mathbf{\;}}{{{{\bar{\mathbf{s}}}}}_{\mathbf{i}}}} \right) \), \({{{\bar{\mathbf{s}}}}_{\mathbf{i}}} = \left[ {0{\mathbf{\;}}0, \ldots ,{{\mathbf{s}}_{\mathbf{i}}} \ldots 0} \right] \). \({{\mathbf{s}}_{\mathbf{i}}}\) is the ith user’s signature code, \(\mathbf{r}\) is the received discrete signal vector, and \({\mathbf{b}}_{{\mathbf{i}},{{\mathbf{MF}}}}^{{\mathbf{D}}}\) is the estimated DS–CDMA ith symbol vector.

3.2 Rake Detector

Perhaps the most popular linear detector is the Rake detector, which consists of multiple parallel chip-delayed SUD fingers. In this paper, we implement the Rake detector with the estimated known channel gain coefficients, but not the channel delays. One can express the Rake detector for the DS–CDMA system mathematically as follows:

$$\begin{aligned} \mathbf{b}_{\mathbf{i,Rake}}^{\mathbf{D}} = \mathbf{S}_{\mathbf{i}}^{\mathbf{H}}{{\mathbf{H}}^{\mathbf{H}}} \mathbf{r} \end{aligned}$$
(10)

where \(\mathbf{H}\) represents the estimated channel matrix, and \(\mathbf{b}_{{\mathbf{i,Rake}}}^{\mathbf{D}}\) is the estimated ith user’s symbol vector.
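
The MF and Rake statistics in Eqs. (9) and (10) can be compared on a single-symbol instance of the model in Eq. (2). The sketch below is illustrative only: it assumes a known chip-spaced channel, random binary codes, one symbol per user, and a zero-padded code vector standing in for \(\bar{\mathbf{s}}_i\); the helper `qpsk_decision` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
K, G = 4, 32
channel = np.array([1.0, 0.6, 0.3])        # assumed known, chip-spaced path gains
L = len(channel)
G1 = G + L - 1

# Convolution (Toeplitz) channel matrix H of size G1 x G.
H = np.zeros((G1, G))
for l, a in enumerate(channel):
    H[l:l + G, :] += a * np.eye(G)

S = rng.choice([-1.0, 1.0], size=(G, K)) / np.sqrt(G)      # one code per column
b = (rng.choice([-1.0, 1.0], size=K)
     + 1j * rng.choice([-1.0, 1.0], size=K)) / np.sqrt(2)  # QPSK symbols

sigma = 0.05
n = sigma * (rng.standard_normal(G1) + 1j * rng.standard_normal(G1)) / np.sqrt(2)
r = H @ S @ b + n                                          # Eq. (2)

i = 0                                                      # user of interest
s_bar = np.concatenate([S[:, i], np.zeros(L - 1)])         # zero-padded code, length G1
b_mf = s_bar.conj() @ r                                    # Eq. (9): aligned with the first path only
b_rake = S[:, i].conj() @ H.conj().T @ r                   # Eq. (10): combines all paths


def qpsk_decision(z):
    """Hard QPSK decision on a soft statistic z."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)
```

The Rake statistic equals a matched filter against the channel-distorted signature \(\mathbf{H}\mathbf{s}_i\), which is why it collects the energy of every path, whereas the MF statistic uses the undistorted code alone. Neither statistic removes the MAI from the other users.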

3.3 LMMSE Detector

Conventional linear detectors based on the least squares (LS), zero-forcing (ZF), and best linear unbiased estimator (BLUE) algorithms [25] perform poorly, especially in the presence of colored noise. The LMMSE detector, however, is considered to be one of the best linear detectors for DS–CDMA systems. Mathematically, one can express the LMMSE as follows:

$$\begin{aligned} \mathbf{b}_{\mathbf{i,LMMSE}}^\mathbf{D} =\mathbf{S}_{\mathbf{i}}^{\mathbf{H}}{{\mathbf{H}}^{\mathbf{H}}}{\left( {{\sigma ^2}{\mathbf{H}}{{\mathbf{H}}^{\mathbf{H}}} + \mathbf{Q}} \right) ^{ - 1}}{} \mathbf{r} \end{aligned}$$
(11)

where \(\left( {{{\mathrm{\sigma }}^2}{\mathbf{H}}{{\mathbf{H}}^{\mathrm{H}}} + \mathbf{Q}} \right) = {\mathbf{R}} = {\mathrm{E}}\left[ {{\mathbf{r}}{{\mathbf{r}}^{\mathrm{H}}}} \right] \) is the autocorrelation of the received data at the mobile unit and \({\sigma ^2}\) is the average power of the received signal. There are several drawbacks in the implementation of the LMMSE receiver. The main drawback is that the computation of the autocorrelation R is very expensive. If possible, one may use eigenstructure decomposition instead of inverting the autocorrelation matrix \(\mathbf{R}\) directly to obtain

$$\begin{aligned} \mathbf{b}_{\mathbf{i,LMMSE}}^\mathbf{W} = \mathbf{S}_{\mathbf{i}}^{\mathbf{H}}{{\mathbf{H}}^{\mathbf{H}}}\left( {{\mathbf{V}_{\mathbf{s}}}{} \mathbf{D}_{\mathbf{s}}^{ - 1}\mathbf{V}_{\mathbf{s}}^{\mathbf{H}}} \right) \mathbf{r} \end{aligned}$$
(12)

where \(\mathbf{V}_{\mathbf{s}}\) is the estimated eigenvector matrix of the autocorrelation matrix \(\mathbf{R}\) and \(\mathbf{D}_{\mathbf{s}}\) is the corresponding diagonal eigenvalue matrix. Additionally, one can use adaptive algorithms to estimate the LMMSE user’s symbols as in [21].
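
The equivalence between the direct inverse in Eq. (11) and the eigenstructure form in Eq. (12) can be verified on a small example. The sketch below uses a random stand-in for \(\mathbf{H}\), takes \(\mathbf{Q} = q\mathbf{I}\), and keeps the full eigendecomposition; restricting to a signal subspace, as the subscript s in Eq. (12) may suggest, would make the result approximate rather than exact.

```python
import numpy as np

rng = np.random.default_rng(4)
G1, G = 10, 8
# Stand-in channel matrix (hypothetical values, for illustration only).
H = rng.standard_normal((G1, G)) + 1j * rng.standard_normal((G1, G))
sigma2, q = 1.0, 0.1
R = sigma2 * H @ H.conj().T + q * np.eye(G1)     # R as written under Eq. (11), with Q = q I
r = rng.standard_normal(G1) + 1j * rng.standard_normal(G1)

# Direct route: solve R x = r instead of forming the inverse explicitly.
x_direct = np.linalg.solve(R, r)

# Eigen route: R = V D V^H with V unitary, so R^{-1} r = V D^{-1} V^H r.
eigvals, V = np.linalg.eigh(R)                   # eigh is intended for Hermitian matrices
x_eig = V @ ((V.conj().T @ r) / eigvals)

routes_agree = np.allclose(x_direct, x_eig)
```

Either vector can then be multiplied by \(\mathbf{S}_i^{\mathrm{H}}\mathbf{H}^{\mathrm{H}}\) to form the detector output. Once the eigenpairs are available, the eigen route can be reused for many received vectors without refactoring \(\mathbf{R}\).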

4 The Proposed Adaptive Blind Detection Schemes

In this section, we introduce new blind detection strategies for the filtering structures. We propose three blind multiuser detectors based on (i) a feed-forward structure, (ii) a feedback structure I, and (iii) a feedback structure II, as in [27]. These filtering structures are depicted in Figs. 3, 4, and 5, respectively.

To that end, one recalls the discrete received signal model (3), namely

$$\begin{aligned} {\mathbf{r}_\mathbf{n}} = {\mathbf{H}_\mathbf{0}}{\mathbf{b}_\mathbf{n}} + {\mathbf{H}_\mathbf{1}}{\mathbf{b}_{\mathbf{n} - \mathbf{1}}} + {\mathbf{n}_\mathbf{n}} \end{aligned}$$

The aim here is to detect the symbol vector \({\mathbf{b}_\mathbf{n}}\) from the received data vector \({\mathbf{r}_\mathbf{n}}\), over the discrete index n, under the following assumptions:

  • AS1 the \(G1\times K\) matrices \({\mathbf{H}_\mathbf{0}}\) and \({\mathbf{H}_\mathbf{1}}\) are of full column rank.

  • AS2 the symbol signal vector series, \({\mathbf{b}_\mathbf{n}}\), have statistically independent components and are identically distributed (i.i.d).

  • AS3 the additive noise vector \({\mathbf{n}_\mathbf{n}}\) is white, Gaussian, and independent of the symbol source signals.

  • AS4 the powers of the transmitted symbol signals are normalized to unity.

  • AS5 the maximum lag in the entire multipath channels is smaller than the spreading gain G of the CDMA codes.

  • AS6 the CDMA system is not over-saturated, which means the number of users (K) is less than the spreading gain (G).

  • AS7 the channel is assumed to be slowly fading and wide-sense stationary.

For methodical convenience, each detector algorithm involves two steps: first, a preprocessing stage, and second, the (matrix) rotation stage based on the filtering structures. In the next subsection, we will present the common preprocessing stage (i.e., whitening processes), and then, we will derive each of the three algorithms based on each filtering structure in individual subsections.

4.1 Step 1: Preprocessing (i.e., Data Whitening)

The outcome of this step is that the symbol signals are detected up to a unitary rotational matrix. This step uses second-order statistics (SOS) in order to normalize the variance (or power) of the received discrete signal vector. It may also be used to eliminate redundancy in the data based on PCA. Under assumptions AS1–AS4, the \(G1\times G1\) covariance matrix, say (\({\mathbf{Cov}}\)), of the noiseless received discrete signal vector can be expressed as

$$\begin{aligned} {\mathbf{Cov}} = {\mathbf{E}}\left[ {{{\mathbf{r}_\mathbf{n}}}{} \mathbf{r}_\mathbf{n}^\mathbf{H}} \right] - {q}{{\mathbf{I}}_{{\mathbf{G1}}}} \end{aligned}$$
(13)

We will now consider the two-tap signal model. Then we may generalize it using induction techniques. Under assumptions AS1–AS7, substituting \(\mathbf{r}_\mathbf{n}\) from Eq. (3) into (13) results in the following covariance matrix:

$$\begin{aligned} {\mathbf{Cov}}= & {} {\mathbf{H}_\mathbf{0}}E\left[ {{\mathbf{b}_\mathbf{n}}{\mathbf{b}_\mathbf{n}^\mathbf{H}}} \right] {\mathbf{H}_\mathbf{0}^\mathbf{H}} + {\mathbf{H}_\mathbf{1}}E\left[ {{\mathbf{b}_{\mathbf{n - 1}}}{\mathbf{b}_{\mathbf{n - 1}}^\mathbf{H}}} \right] {\mathbf{H}_\mathbf{1}^\mathbf{H}}\nonumber \\ = & {} {\mathbf{H}_\mathbf{0}}{\mathbf{H}_\mathbf{0}^\mathbf{H}} + {\mathbf{H}_\mathbf{1}}{\mathbf{H}_\mathbf{1}^\mathbf{H}} = \left[ {{\mathbf{H}_\mathbf{0}}\;\;{\mathbf{H}_\mathbf{1}}} \right] {\left[ {{\mathbf{H}_\mathbf{0}}\;\;{\mathbf{H}_\mathbf{1}}} \right] ^{\mathbf{H}}} \end{aligned}$$
(14)

Observe that under AS2 and AS4, \(E\left[ {{{\mathbf{b}_\mathbf{n}}}{\mathbf{b}_\mathbf{n}^\mathbf{H}}} \right] = {{\mathbf{I}}_{\mathbf{K}}}\) and \(E\left[ {{{\mathbf{b}_{\mathbf{n} - \mathbf{1}}}}{\mathbf{b}_{\mathbf{n} - \mathbf{1}}^\mathbf{H}}} \right] = {{\mathbf{I}}_{\mathbf{K}}}\), while the cross terms between \({\mathbf{b}_\mathbf{n}}\) and \({\mathbf{b}_{\mathbf{n} - \mathbf{1}}}\) vanish because successive symbol vectors are independent and zero mean. Without loss of generality, we proceed with the basic algebraic procedure by adopting the eigenstructure decomposition of the Hermitian matrix \({\mathbf{Cov}}\) and use it to obtain a singular value decomposition for the combined matrix \([{{\mathbf{H}_\mathbf{0}}}\;\;{{\mathbf{H}_\mathbf{1}}}]\). Thus, let

$$\begin{aligned} {\mathbf{Cov}} = {\mathbf{VD}}{{\mathbf{V}}^{\mathbf{H}}} \end{aligned}$$
(15)

where \({\mathbf{V}}\) is a \(G1\times G1\) matrix of orthonormal eigenvectors satisfying

$$\begin{aligned} {\mathbf{V}}{{\mathbf{V}}^{\mathbf{H}}} = {{\mathbf{V}}^{\mathbf{H}}}{\mathbf{V}} = {{\mathbf{I}}_{{\mathbf{G}1}}} \end{aligned}$$
(16)

and \(\mathbf{D}\) is the corresponding \(G1\times G1\) diagonal matrix containing the eigenvalues along its diagonal. Thus, from (14), the \(G1\times K\) matrices \({\mathbf{H}_\mathbf{0}}\) and \({\mathbf{H}_\mathbf{1}}\) can be represented, respectively, as

$$\begin{aligned} {{\mathbf{H}_\mathbf{0}}}= & {} {{\mathbf{V}_\mathbf{0}}}{{{\varvec{\Lambda } }_\mathbf{0}}}{{\mathbf{U}_\mathbf{0}}}^{\mathbf{H}}\nonumber \\ {{\mathbf{H}_\mathbf{1}}}= & {} {{\mathbf{V}_\mathbf{1}}}{{{\varvec{\Lambda } }_\mathbf{1}}}{{\mathbf{U}_\mathbf{1}}}^{\mathbf{H}} \end{aligned}$$
(17)

where \({\mathbf{V}_\mathbf{0}}\) and \({\mathbf{V}_\mathbf{1}}\) are composed of ordered, non-overlapping columns of the \(G1\times G1\) unitary matrix \(\mathbf{V}\). \(\mathbf{U}_\mathbf{0}\) and \(\mathbf{U}_\mathbf{1}\) are constant but unknown \(K\times K\) unitary matrices of right singular vectors with \({{\mathbf{U}_{\mathbf{0}}}}{{\mathbf{U}_{\mathbf{0}}}}^{\mathbf{H}} = \;{{\mathbf{U}_\mathbf{1}}}{{\mathbf{U}_\mathbf{1}}}^{\mathbf{H}} = {{\mathbf{I}}_{\mathbf{K}}}\), and \({{{\varvec{\Lambda } }_\mathbf{0}}}\) and \({{{\varvec{\Lambda } }_\mathbf{1}}}\) are the appropriate \(G1\times K\) singular value matrices. We note that the whitening or algebraic PCA procedure can (i) estimate the noise power in Eq. (13) and (ii) reduce the whitened signal dimension to that of the signal subspace, in this case K. Now, we process the received data to obtain the whitened data; specifically, we define:

$$\begin{aligned} {\mathbf{r}_\mathbf{n}^\mathbf{w}} = {{{\varvec{\Lambda } }}^ + }{{\mathbf{V}}^{\mathbf{H}}}{\mathbf{r}_\mathbf{n}} \end{aligned}$$
(18)

where the \(K\times G1\) matrix \({{{\varvec{\Lambda } }}^ + }\) denotes the pseudo-inverse of the singular value matrices. One simplifies (18) to eventually obtain:

$$\begin{aligned} {\mathbf{r}}_{\mathbf{n}}^{\mathbf{w}} = {{\mathbf{U}_\mathbf{0}}}^{\mathbf{H}}{{\mathbf{b}}_{\mathbf{n}}} + {{\mathbf{U}_\mathbf{1}}}^{\mathbf{H}}{{\mathbf{b}}_{{\mathbf{n} - \mathbf{1}}}} + \left( {{{{\varvec{\Lambda } }^ +} }{{\mathbf{V}}^{\mathbf{H}}}} \right) {{\mathbf{n}}_{\mathbf{n}}} \end{aligned}$$
(19)

Thus, the whitened data expressed in (18) or (19) have a dimension reduced to that of the symbol space and a covariance matrix equal to the identity, that is, \(E\left[ {{\mathbf{r}}_{\mathbf{n}}^{\mathbf{w}}{\mathbf{r}}_{\mathbf{n}}^{{\mathbf{wH}}}} \right] = {\mathbf{I}_\mathbf{K}}\).
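
The preprocessing step can be sketched with sample statistics in place of the expectations. The example below is a minimal illustration under stated assumptions: the signature matrices are random stand-ins, the noise variance q is taken as known, the retained dimension p is left as a parameter (set to K, following the text), and the whitening matrix is formed from the dominant eigenpairs of the sample covariance; the function name `whiten` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
K, G1, N = 3, 12, 20000              # users, observation length, snapshots (illustrative)
q = 0.01                             # noise variance, assumed known here

H0 = rng.standard_normal((G1, K))    # stand-in signature matrices (hypothetical)
H1 = 0.3 * rng.standard_normal((G1, K))


def qpsk(shape):
    """Unit-power QPSK symbols with independent components."""
    return (rng.choice([-1.0, 1.0], size=shape)
            + 1j * rng.choice([-1.0, 1.0], size=shape)) / np.sqrt(2)


b = qpsk((K, N + 1))
noise = np.sqrt(q / 2) * (rng.standard_normal((G1, N))
                          + 1j * rng.standard_normal((G1, N)))
r = H0 @ b[:, 1:] + H1 @ b[:, :-1] + noise       # Eq. (3), one column per index n


def whiten(r, q, p):
    """Return the whitened data and the p x G1 whitening matrix."""
    cov = r @ r.conj().T / r.shape[1] - q * np.eye(r.shape[0])   # sample version of Eq. (13)
    eigvals, V = np.linalg.eigh(cov)                             # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:p]                          # keep the p dominant ones
    W = np.diag(eigvals[idx] ** -0.5) @ V[:, idx].conj().T       # scale, then project
    return W @ r, W


r_w, W = whiten(r, q, p=K)
cov_w = r_w @ r_w.conj().T / N       # close to the identity after whitening
```

The sample covariance of the whitened data deviates from the identity only through the residual noise term on the retained eigen-directions, which is small when the noise variance is small relative to the dominant eigenvalues.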

Note that, after the preprocessing step, the detection of the symbol signal \({{{\hat{\mathbf{b}}}}_{\mathbf{n}}}\) reduces to determining or compensating for the unknown \(K\times K\) (rotation) unitary matrices \(\mathbf{U}_\mathbf{0}\) and \(\mathbf{U}_\mathbf{1}\). Next, we proceed with the development and derivations of the three proposed adaptive filtering structures, based on (i) feed-forward structure (FF), (ii) feedback structure I (FB-I), and (iii) feedback structure II (FB-II) [26, 27].

Remark

For the purposes of the adaptive filtering to be discussed next, we shall relabel these unknown (but fixed) unitary matrices as the starred values for the environment. Specifically, in Eq. (19), we set

$$\begin{aligned}\begin{array}{l} {\mathbf{U}_\mathbf{0}} = {\mathbf{U}_\mathbf{0}^*}\\ {\mathbf{U}_\mathbf{1}} = {\mathbf{U}_\mathbf{1}^*} \end{array} \end{aligned}$$

The developed adaptive filtering will have parameter matrices that, when adaptation is successful, will converge to (approximately) these fixed starred environment parameters.

4.2 Step 2a: Determining the Rotation Unitary Matrix \(\mathbf{U}\) for the Feed-Forward Structure

Fig. 3 Feed-forward (FF) demixing structure

The output from the FF structure, as depicted in Fig. 3, is expressed as

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = {{\mathbf{U}_\mathbf{0}}} \mathbf{r}_\mathbf{n}^\mathbf{w} + \sum \limits _{\mathbf{k} = \mathbf{1}}^\mathbf{K} {{\mathbf{U}_\mathbf{k}}} \mathbf{r}_{\mathbf{n - k}}^\mathbf{w} \end{aligned}$$
(20)

For simplicity of presentation, we begin with a two-tap model; thus, the two-tap FF structure becomes

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = {{\mathbf{U}_\mathbf{0}}}{} \mathbf{r}_\mathbf{n}^\mathbf{w} + {{\mathbf{U}}_\mathbf{1}}{} \mathbf{r}_{\mathbf{n - 1}}^\mathbf{w} \end{aligned}$$
(21)

The goal of a successful adaptive algorithm is to bring about the convergence of the parameter matrices to the (starred) environment parameters. Specifically, the adaptive algorithm succeeds when its parameter matrices converge to \(\mathbf{U}_\mathbf{0}^*\) and \(\mathbf{U}_\mathbf{1}^*\), respectively.

We now proceed with the development. One can rewrite this convolutive filter (21) as the following (static) map

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {\mathbf{r}_{\mathbf{n - 1}}^\mathbf{w}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\;{\mathbf{U}_\mathbf{1}}}\\ {\mathbf{0}\;\;\;\;\;\;\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {\mathbf{r}_{\mathbf{n - 1}}^\mathbf{w}} \end{array}} \right] \end{aligned}$$
(22)
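
The equivalence between the two-tap filter in Eq. (21) and the block form in Eq. (22) can be confirmed numerically. The sketch below uses arbitrary complex matrices for the filter taps and arbitrary vectors for the whitened samples; the helper `crandn` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
K = 3


def crandn(*shape):
    """Complex standard normal array of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


U0, U1 = crandn(K, K), crandn(K, K)       # arbitrary filter taps, for illustration only
r_now, r_prev = crandn(K), crandn(K)      # whitened samples r_n^w and r_{n-1}^w

y = U0 @ r_now + U1 @ r_prev              # two-tap FF output, Eq. (21)

# Block matrix of Eq. (22): the first block row filters, the second passes r_{n-1}^w through.
A = np.block([[U0, U1],
              [np.zeros((K, K)), np.eye(K)]])
out = A @ np.concatenate([r_now, r_prev])
```

The first K entries of `out` reproduce the filter output and the last K entries reproduce the delayed input, so the convolutive filter is expressed as a memoryless map on the augmented vectors.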

Then, one defines, respectively, the new augmented output, the static map, and the augmented input as

$$\begin{aligned} \mathrm{{\;}}\tilde{\mathbf{Y}}= & {} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {\mathbf{r}_{\mathbf{n - 1}}^\mathbf{w}} \end{array}} \right] \\ \tilde{\mathbf{U}}= & {} \left[ {\begin{array}{*{20}{c}} {{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\;\;\mathbf{0}}\\ {\;{\mathbf{U}_\mathbf{1}}\;\;\;\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \\ \tilde{\mathbf{R}}= & {} \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {\mathbf{r}_{\mathbf{n - 1}}^\mathbf{w}} \end{array}} \right] \end{aligned}$$

Thus, the expression in (22) becomes the static map

$$\begin{aligned} \tilde{\mathbf{Y}} = {\tilde{\mathbf{U}}^{\mathrm{H}}}\tilde{\mathbf{R}} \end{aligned}$$
(23)

Based on the natural gradient approach [4, 13], the update law for the columns of the augmented demixing matrix \(\tilde{\mathbf{U}}\) can be expressed as

$$\begin{aligned} {\mathbf{u}^ + } = \mathbf{u} - \mu \mathbf{E}\left[ {\tilde{\mathbf{R}}\left( {\mathbf{g}\left( {{\mathbf{u}^\mathbf{H}}\tilde{ \mathbf{R}}}\right) } \right) } \right] \end{aligned}$$
(24)

where \({\mathbf{u} }\) and \({\mathbf{u}^ + }\) are, respectively, the current and updated values of one column vector of \({\tilde{\mathbf{U}}}\), \(\mu \) is the step size, and g is the chosen score function. Noting the structure of the demixing matrix in (23), one decomposes the column vector as

$$\begin{aligned} \mathbf{u} = \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}}\\ {{\mathbf{u} _\mathbf{1}}} \end{array}} \right] \end{aligned}$$
(25)

Hence, the update law is correspondingly decomposed as (note that we have suppressed the \(E[{\cdot }]\) operator):

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}^ + }\\ {{\mathbf{u}_\mathbf{1}}^ + } \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}}\\ {{\mathbf{u}_\mathbf{1}}} \end{array}} \right] - \mu \;\left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {\mathbf{r}_{\mathbf{n - 1}}^\mathbf{w}} \end{array}} \right] g\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(26)

where \(\mathbf{u}_\mathbf{0}\), \(\mathbf{u}_\mathbf{1}\) are the column vectors of \(\mathbf{U}_\mathbf{0}\) and \(\mathbf{U}_\mathbf{1}\) in (22), respectively. Therefore, the update laws for the individual (sub-)columns are

$$\begin{aligned} {\mathbf{u}_\mathbf{0}}^ += & {} {\mathbf{u}_\mathbf{0}} - \mu \mathbf{r}_\mathbf{n}^{\mathbf{w}}{} \mathbf{g}\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(27)
$$\begin{aligned} {\mathbf{u}_\mathbf{1}}^ += & {} {\mathbf{u}_\mathbf{1}} - \mu \mathbf{r}_{\mathbf{n-1}}^{\mathbf{w}}{} \mathbf{g}\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(28)

Now, by induction, the update law for the kth lag element \(\mathbf{u}_\mathbf{k}\) is

$$\begin{aligned} {\mathbf{u}_\mathbf{k}}^ + = {\mathbf{u}_\mathbf{k}} - \mu \mathbf{r}_{\mathbf{n-k}}^{\mathbf{w}}{} \mathbf{g}\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(29)
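As a minimal sketch of the update in (29), the following Python fragment performs one instantaneous step (with the \(E[{\cdot }]\) operator suppressed, as in (26)) for a single column of the augmented demixing matrix. It assumes the sub-Gaussian score function of (54); the function names `g_sub`, `ff_output`, and `ff_update` and the numerical values are illustrative and do not come from the original work.

```python
import math

def g_sub(b):
    """Score function for sub-Gaussian sources, as in Eq. (54)."""
    return b - complex(math.tanh(b.real), math.tanh(b.imag))

def ff_output(u_taps, r_taps):
    """Single output y_n = sum_k u_k^H r_{n-k}^w for one column of the demixing matrix."""
    y = 0j
    for u_k, r_k in zip(u_taps, r_taps):
        y += sum(u.conjugate() * r for u, r in zip(u_k, r_k))
    return y

def ff_update(u_taps, r_taps, mu):
    """One instantaneous step of Eq. (29): u_k <- u_k - mu * r_{n-k}^w * g(y_n)."""
    y = ff_output(u_taps, r_taps)
    gy = g_sub(y)
    new_taps = [[u - mu * r * gy for u, r in zip(u_k, r_k)]
                for u_k, r_k in zip(u_taps, r_taps)]
    return new_taps, y

# Two taps, two-dimensional whitened input; all values are arbitrary (assumed).
u_taps = [[1 + 0j, 0j], [0j, 0j]]          # [u_0, u_1]
r_taps = [[2 + 0j, 1j], [1 + 0j, 1 + 1j]]  # [r_n^w, r_{n-1}^w]
u_next, y_n = ff_update(u_taps, r_taps, mu=0.1)
print(y_n, u_next)
```

Every lag shares the same scalar factor \(g(\mathbf{y}_\mathbf{n})\); only the regressor \(\mathbf{r}_{\mathbf{n-k}}^\mathbf{w}\) changes from tap to tap, which is what the induction step expresses.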

4.3 Step 2b: Determining the Rotation Unitary Matrix \(\mathbf{U}\) Based on Feedback Structure I (FB-I)

Fig. 4 Feedback demixing structure I (FB-I)

The output of FB-I, as depicted in Fig. 4, results in the filtering expression

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = \mathbf{U}_\mathbf{0}^{- 1}\left( {\mathbf{r}_\mathbf{n}^\mathbf{w} - \sum \limits _{\mathbf{k = 1}}^\mathbf{K} {\mathbf{U}_\mathbf{k}}{\mathbf{y}_{\mathbf{n - k}}}} \right) \end{aligned}$$
(30)

Consider now just two taps of FB-I, i.e.,

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = {\mathbf{U}_\mathbf{0}}^{\mathbf{- 1}}\left( {\mathbf{r}_\mathbf{n}^\mathbf{w} - {\mathbf{U}_\mathbf{1}}{\mathbf{y}_{\mathbf{n - 1}}}} \right) \end{aligned}$$
(31)

One can rewrite this convolutive filter into the following augmented static form

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\;{\mathbf{U}_\mathbf{1}}}\\ {\mathbf{0}\;\;\;\;\;\;\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \end{aligned}$$
(32)

Or

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right]&= {\left[ {\begin{array}{*{20}{c}} {\mathbf{U}_\mathbf{0}} & {\mathbf{U}_\mathbf{1}}\\ \mathbf{0} & \mathbf{I} \end{array}} \right] ^{ - 1}}\left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \nonumber \\&= \left[ {\begin{array}{*{20}{c}} {\mathbf{U}_\mathbf{0}^{\mathbf{- 1}}} & { - \mathbf{U}_\mathbf{0}^{\mathbf{- 1}}{\mathbf{U}_\mathbf{1}}}\\ \mathbf{0} & \mathbf{I} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \end{aligned}$$
(33)
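The block inverse used in (33) can be checked numerically. The sketch below does so for the simplest case in which every block is a \(1 \times 1\) (scalar) matrix; this restriction, the helper `matmul2`, and the numerical values are assumptions made only for illustration.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Scalar stand-ins for the blocks U_0 and U_1 (arbitrary values).
u0, u1 = 2 + 1j, 0.5 - 0.25j

M = [[u0, u1], [0, 1]]                  # block matrix of Eq. (32)
M_inv = [[1 / u0, -u1 / u0], [0, 1]]    # claimed inverse, Eq. (33)
P = matmul2(M, M_inv)                   # should be the identity

# The first row of (33) reproduces the two-tap FB-I output of Eq. (31).
r_n, y_prev = 1 - 1j, 0.3 + 0.2j
y_from_33 = M_inv[0][0] * r_n + M_inv[0][1] * y_prev
y_from_31 = (r_n - u1 * y_prev) / u0
print(P, y_from_33, y_from_31)
```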

Thus, in this case, one defines the augmented output, demixing matrix, and input as follows:

$$\begin{aligned} \tilde{\mathbf{Y}}= & {} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \\ \tilde{\mathbf{U}}= & {} \left[ {\begin{array}{*{20}{c}} {{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\;\;\;\;\mathbf{0}}\\ {{\mathbf{U}_\mathbf{1}}\;\;\;\;\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \\ \tilde{\mathbf{R}}= & {} \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \end{aligned}$$

One then re-expresses (32) into the compact equation

$$\begin{aligned} \tilde{\mathbf{R}} = {\tilde{\mathbf{U}}^{\mathrm{H}}}\tilde{\mathbf{Y}} \end{aligned}$$
(34)

Again, using the natural gradient approach, the update law for a column of the demixing matrix \({\tilde{\mathbf{U}}}\) is

$$\begin{aligned} {\mathbf{u}^ + } = {\mathbf{u}} - \mu \mathbf{E}\left[ \tilde{\mathbf{Y}}\left( {\mathbf{g}\left( {{\mathbf{u}^\mathbf{H}}{\tilde{\mathbf{Y}}}} \right) } \right) \right] \end{aligned}$$
(35)

As before, \({\mathbf{u} }\) and \({\mathbf{u}^ + }\) are, respectively, the current and updated values of one column vector of \({\tilde{\mathbf{U}}}\), \(\mu \) is the step size, and g is the chosen score function.

One can exploit the block matrix structure of the demixing matrix and simplify the update law. To that end, consider the block matrix

$$\begin{aligned} {\mathbf{u}^\mathbf{0}} = \left[ {\begin{array}{*{20}{c}} {{{\mathbf{u}_\mathbf{0}}^{\mathbf{0}}}}\\ { {\mathbf{u}_\mathbf{1}}^{\mathbf{0}}} \end{array}} \right] \end{aligned}$$
(36)

Thus, the update laws can be calculated to produce

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{{\mathbf{u}_\mathbf{0}}^ + }}\\ {{\mathbf{u}_\mathbf{1}}^ + } \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}}\\ {{\mathbf{u}_\mathbf{1}}} \end{array}} \right] - \;\mu \left[ {\begin{array}{*{20}{c}} {\mathbf{y}_{\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] g\left( {{\mathbf{u}_\mathbf{0}}{} \mathbf{y}_\mathbf{n} + {\mathbf{u}_\mathbf{1}}{\mathbf{y}_{\mathbf{n - 1}}}} \right) \end{aligned}$$
(37)

Similarly, the next block matrices can be defined as

$$\begin{aligned} {\mathbf{u}^1} = \left[ {\begin{array}{*{20}{c}} {{\mathbf{0}}}\\ {{\mathbf{i}^1}} \end{array}} \right] \end{aligned}$$
(38)

This leads to the specialized form

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{0}^ + }}\\ {{\mathbf{i}^ + }} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} \mathbf{0}\\ {{\mathbf{i}}} \end{array}} \right] - \mu \;\left[ {\begin{array}{*{20}{c}} {\mathbf{y}_\mathbf{n}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] g\left( {{\mathbf{y}_{\mathbf{n - 1}}}} \right) \end{aligned}$$
(39)

Thus, the update laws for the individual columns are

$$\begin{aligned} {\mathbf{u}_\mathbf{0}}^ + = {\mathbf{u}_\mathbf{0}} - \mu {\mathbf{y}_{\mathbf{n}}}g\left( {\mathbf{r}_\mathbf{n}^\mathbf{w}} \right) \end{aligned}$$
(40)

and

$$\begin{aligned} {\mathbf{u}_\mathbf{1}}^ + = {\mathbf{u}_\mathbf{1}} - \mu {\mathbf{y}_{\mathbf{n - 1}}}g\left( {\mathbf{r}_\mathbf{n}^\mathbf{w}} \right) \end{aligned}$$
(41)

Analogously, by induction, the update law for the kth lag element \(\mathbf{u}_\mathbf{k}\) is

$$\begin{aligned} {\mathbf{u}_\mathbf{k}}^ + = {\mathbf{u}_\mathbf{k}} - \mu {\mathbf{y}_{\mathbf{n - k}}}g\left( {\mathbf{r}_\mathbf{n}^\mathbf{w}} \right) \end{aligned}$$
(42)
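To make the FB-I recursion concrete, the following sketch runs the two-tap filter (31) together with the updates (40) and (41) in the scalar case, i.e., assuming \(\mathbf{U}_\mathbf{0}\) and \(\mathbf{U}_\mathbf{1}\) are \(1 \times 1\) so that the inverse in (31) becomes a division. The score function is taken from (54); the name `fb1_step`, the initial values, and the input samples are hypothetical.

```python
import math

def g_sub(b):
    """Score function for sub-Gaussian sources, as in Eq. (54)."""
    return b - complex(math.tanh(b.real), math.tanh(b.imag))

def fb1_step(u0, u1, r_n, y_prev, mu):
    """Scalar two-tap FB-I: output of Eq. (31), then updates (40) and (41)."""
    y_n = (r_n - u1 * y_prev) / u0      # Eq. (31) with 1x1 blocks
    g_r = g_sub(r_n)                    # score term as written in (40)-(41)
    u0_next = u0 - mu * y_n * g_r       # Eq. (40)
    u1_next = u1 - mu * y_prev * g_r    # Eq. (41)
    return u0_next, u1_next, y_n

# Single step with arbitrary values, kept for inspection.
step = fb1_step(1 + 0j, 0j, 2 + 0j, 0.5 + 0j, mu=0.1)

# Short run over a few assumed whitened samples, feeding y_n back as y_{n-1}.
u0, u1, y_prev = 1 + 0j, 0j, 0j
for r_n in [1 + 1j, -1 + 1j, 1 - 1j]:
    u0, u1, y_prev = fb1_step(u0, u1, r_n, y_prev, mu=0.01)
print(step, u0, u1)
```

In the matrix case the division is replaced by the solution of a linear system with \(\mathbf{U}_\mathbf{0}\), which must therefore remain invertible throughout the adaptation.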

4.4 Step 2c: Determining the Rotation Unitary Matrix \(\mathbf{U}\) Based on Feedback Structure II

Fig. 5 Feedback demixing structure II (FB-II)

The output of FB-II, as depicted in Fig. 5, is expressed as:

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = {\mathbf{U}_\mathbf{0}}{} \mathbf{r}_\mathbf{n}^\mathbf{w} - \sum \limits _{\mathbf{k = 1}}^\mathbf{K} {\mathbf{U}_\mathbf{k}}{\mathbf{y}_{\mathbf{n - k}}} \end{aligned}$$
(43)

Again, consider two taps of FB-II, i.e.,

$$\begin{aligned} {\mathbf{y}_\mathbf{n}} = {\mathbf{U}_\mathbf{0}}{} \mathbf{r}_\mathbf{n}^\mathbf{w} - {\mathbf{U}_\mathbf{1}}{\mathbf{y}_{\mathbf{n - 1}}} \end{aligned}$$
(44)

Hence, one rewrites this convolutive filter in the following augmented static form

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\; - {\mathbf{U}_\mathbf{1}}}\\ {\mathbf{0}\;\;\;\;\;\;\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \end{aligned}$$
(45)

Similarly, define the augmented entities as

$$\begin{aligned} \tilde{\mathbf{Y}}= & {} \left[ {\begin{array}{*{20}{c}} {{\mathbf{y}_\mathbf{n}}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \\ \tilde{\mathbf{U}}= & {} \left[ {\begin{array}{*{20}{c}} {\;{\mathbf{U}_\mathbf{0}}\;\;\;\;\;\;\;\;\mathbf{0}}\\ { - {\mathbf{U}_\mathbf{1}}\;\;\;\;\;\;\;\mathbf{I}} \end{array}} \right] \\ \tilde{\mathbf{R}}= & {} \left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] \end{aligned}$$

Thus, one rewrites (45) into the compact mapping

$$\begin{aligned} \tilde{\mathbf{Y}} = {\tilde{\mathbf{U}}^{\mathrm{H}}}\tilde{\mathbf{R}} \end{aligned}$$
(46)

Using the natural gradient approach, the update laws for a weight column of the demixing matrix \(\tilde{\mathbf{U}}\) are expressed as

$$\begin{aligned} {\mathbf{u}^ + } = \mathbf{u} - \mu \mathbf{E}\left[ {\tilde{\mathbf{R}}\left( {\mathbf{g}\left( {{\mathbf{u}^\mathbf{H}}\tilde{\mathbf{R}} }\right) } \right) } \right] \end{aligned}$$
(47)

where, as before, \({\mathbf{u} }\) and \({\mathbf{u}^ + }\) are, respectively, the current and updated values of one column vector of \({\tilde{\mathbf{U}}}\), \(\mu \) is the step size, and \(\mathbf{g({\cdot })}\) is the chosen score function. One can appropriately decompose a column vector in order to simplify the update expressions as:

$$\begin{aligned} \mathbf{u} = \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}}\\ {{\mathbf{u}_\mathbf{1}}} \end{array}} \right] \end{aligned}$$
(48)

The update law then decomposes as follows:

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}^ + }\\ {{\mathbf{u}_\mathbf{1}}^ + } \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{\mathbf{u}_\mathbf{0}}}\\ {{\mathbf{u}_\mathbf{1}}} \end{array}} \right] - \mu \;\left[ {\begin{array}{*{20}{c}} {\mathbf{r}_\mathbf{n}^\mathbf{w}}\\ {{\mathbf{y}_{\mathbf{n - 1}}}} \end{array}} \right] g\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(49)

Thus, the update laws for the individual subcolumns are

$$\begin{aligned} {\mathbf{u}_\mathbf{0}}^ + = {\mathbf{u}_\mathbf{0}} - \mu \mathbf{r}_\mathbf{n}^\mathbf{w} \mathbf{g}\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(50)

and

$$\begin{aligned} {\mathbf{u}_\mathbf{1}}^ + = {\mathbf{u}_\mathbf{1}} - \mu {\mathbf{y}_{\mathbf{n - 1}}}g\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(51)

Finally, by induction, the update law for the kth lag element \(\mathbf{u}_\mathbf{k}\) is

$$\begin{aligned} {\mathbf{u}_\mathbf{k}}^ + = {\mathbf{u}_\mathbf{k}} - \mu {\mathbf{y}_{\mathbf{n - k}}}g\left( {{\mathbf{y}_\mathbf{n}}} \right) \end{aligned}$$
(52)
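The FB-II updates (50) and (51) can be sketched as an online loop. The example below is a simplification that tracks a single output, so that the feedback term \(\mathbf{y}_{\mathbf{n-1}}\) is a scalar, and it treats \(\mathbf{u}_\mathbf{1}\) as the lower part of a column of \(\tilde{\mathbf{U}}\), which already absorbs the minus sign of (44). The score function is that of (54); the name `fb2_run` and all numerical values are assumptions.

```python
import math

def g_sub(b):
    """Score function for sub-Gaussian sources, as in Eq. (54)."""
    return b - complex(math.tanh(b.real), math.tanh(b.imag))

def fb2_run(r_seq, u0, u1, mu):
    """Run a single-output, two-tap FB-II filter over r_seq with online updates."""
    y_prev = 0j
    outputs = []
    for r_n in r_seq:
        # y_n = u^H R~ with R~ = [r_n^w; y_{n-1}], cf. Eq. (46)
        y_n = sum(a.conjugate() * b for a, b in zip(u0, r_n))
        y_n += u1.conjugate() * y_prev
        gy = g_sub(y_n)
        u0 = [a - mu * b * gy for a, b in zip(u0, r_n)]  # Eq. (50)
        u1 = u1 - mu * y_prev * gy                       # Eq. (51)
        outputs.append(y_n)
        y_prev = y_n                                     # feed the output back
    return outputs, u0, u1

# Two whitened input vectors of dimension 2 (arbitrary values).
r_seq = [[2 + 0j, 0j], [0j, 1 + 0j]]
outputs, u0_final, u1_final = fb2_run(r_seq, [1 + 0j, 0j], 0.5 + 0j, mu=0.1)
print(outputs, u0_final, u1_final)
```

Compared with the FF structure, only the second part of the regressor changes: the delayed input \(\mathbf{r}_{\mathbf{n-1}}^\mathbf{w}\) is replaced by the delayed output \(\mathbf{y}_{\mathbf{n-1}}\).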

4.5 The Proposed Adaptive Rake-Based Detectors

While the previous filtering structures constitute new adaptive filters, one can also augment the existing conventional Rake detector to improve its performance adaptively. We now develop three adaptive modifications of the conventional Rake detector based on, respectively, independent component analysis (ICA) [24], RobustICA [34], and principal component analysis (PCA) [16]. Recalling the Rake detector’s structure as given in (10), one can mathematically express the adaptive modified Rake detector for DS–CDMA systems as follows:

$$\begin{aligned} \mathbf{b}_{\mathbf{i,Rake}}^\mathbf{D} = \mathbf{S}_\mathbf{i}^\mathbf{H}{\mathbf{W}}{\mathbf{H}^\mathbf{H}}{} \mathbf{r} \end{aligned}$$
(53)

where, as before, \(\mathbf{H}\) is the crudely estimated (inverse) channel matrix, \(\mathbf{S}_\mathbf{i}\) is a vector associated with the ith user’s signature code, and \(\mathbf{b}_{\mathbf{i,Rake}}^\mathbf{D}\) is the estimated ith user’s symbol. A \(G \times G\) matrix \(\mathbf{W}\) is inserted to adaptively augment and improve the estimate of the channel inverse. In the following, we summarize the process in Algorithms 1, 2, and 3, which adaptively estimate the matrix \(\mathbf{W}\) using the FastICA, RobustICA, and PCA algorithms, respectively.
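The detector in (53) is a chain of matrix and vector products. The sketch below evaluates it with plain Python lists, assuming for simplicity that \(\mathbf{H}\) is square (\(G \times G\)) and using \(G=2\) with arbitrary values; the helper names are hypothetical, and the estimation of \(\mathbf{W}\) itself is not shown.

```python
def hermitian(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[m[i][j].conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def adaptive_rake(s_i, w, h, r):
    """Symbol estimate b = S_i^H W H^H r, as in Eq. (53)."""
    z = matvec(w, matvec(hermitian(h), r))
    return sum(s.conjugate() * x for s, x in zip(s_i, z))

# Toy example with G = 2 (all values are assumptions for illustration).
s_i = [1 + 0j, -1 + 0j]            # signature code of user i
h = [[1 + 0j, 0j], [0j, 1j]]       # crude channel estimate
w = [[1 + 0j, 0j], [0j, 1 + 0j]]   # W = I recovers the conventional Rake form
r = [1 + 1j, 2j]                   # received vector
b_hat = adaptive_rake(s_i, w, h, r)
print(b_hat)
```

With \(\mathbf{W}=\mathbf{I}\) the expression reduces to \(\mathbf{S}_\mathbf{i}^\mathbf{H}\mathbf{H}^\mathbf{H}\mathbf{r}\), so the adaptive matrix can be read as a correction applied on top of the crude channel estimate.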

figure a
figure b
figure c

5 Simulation Results

Extensive simulations are carried out to verify and evaluate the performance of the proposed adaptive filters and algorithms in the multipath downlink DS–CDMA system in the presence of AWGN. We summarize the case study results as follows. We assume a constant spreading gain, which is \(G=63\) for Gold codes and \(G=64\) for orthogonal variable spreading factor (OVSF) codes. The received CDMA signal experiences a multipath channel with \(L=5\) paths and delays of 0, 1, 2, 3, and 4 chips, respectively. The complex attenuation coefficients of the paths are set to \( h_0=0.3684 + 0.5364i\), \( h_1=0.1982 + 0.0187i\), \(h_2=0.0237 + 0.5683i\), \( h_3=0.1112 + 0.0835i\), and \(h_4=0.2203 + 0.2756i \), respectively. We use the following model function for sub-Gaussian sources, i.e., sources whose kurtosis is negative:

$$\begin{aligned} {{\mathrm{g}}^{{\mathrm{SUB}}}}\left( {{\hat{\mathbf{b}}}} \right) = {\hat{\mathbf{b}}} - \left( {\tanh \left( {{\mathrm{Re}}\left\{ {{\hat{\mathbf{b}}}} \right\} } \right) + {\mathrm{j}}\tanh \left( {{\mathrm{Im}}\left\{ {{\hat{\mathbf{b}}}} \right\} } \right) } \right) \end{aligned}$$
(54)
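As a short illustration, (54) can be evaluated on a single complex sample with the Python standard library. The function name `g_sub` and the test points are illustrative choices.

```python
import math

def g_sub(b):
    """Eq. (54): b - (tanh(Re{b}) + j tanh(Im{b})), applied to one complex sample."""
    return b - complex(math.tanh(b.real), math.tanh(b.imag))

# The function acts separately on the real and imaginary parts and is odd.
for b in (0j, 1 + 1j, -1 - 1j, 0.1 + 2j):
    print(b, g_sub(b))
```

Near the origin the output is close to zero because \(\tanh (x) \approx x\) for small x, whereas for large magnitudes the function grows almost linearly, since the \(\tanh \) terms saturate at \(\pm 1\).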

Monte Carlo simulations have been run to verify the validity of the algorithms. We use the signal-to-noise ratio (SNR) as a figure of merit, defined as the ratio of the energy per symbol to the power spectral density (PSD) of the noise. Moreover, all user symbols are assumed to be transmitted with the same power. Figure 6a, b shows the simulated BER versus SNR for the proposed detectors compared with the existing conventional ones for \(K=30\) and \(K=50\) users, respectively. The other parameters were set as (i) number of symbols \(M=1000\) and (ii) number of paths \(L=5\), with SNR values in the range of \({-}\)10 to 30 dB. The simulations were carried out in MATLAB on a 2.4 GHz Intel Core i5 processor with 4 GB of RAM. Finally, we repeat the experiments for 1000 realizations of the M symbols to compute the BER. Moreover, we assume the data transmission to be quaternary phase-shift keying (QPSK).

Fig. 6 Average BER as a function of SNR for the DS–CDMA downlink using Gold codes (\(G=63\)). a 30 users, b 50 users

Figure 6 shows that the proposed algorithms improve the performance of the CDMA system. The blind multiuser detector based on FB-II yields the lowest BER and thus outperforms all other detectors, including the BMUD algorithm presented in [27]. The proposed algorithms also work in cases that cause difficulties for the LMMSE receiver, such as at high SNR and when the sample set is fairly small. Moreover, the performance of blind multiuser detection degrades as the number of users increases, as seen by comparison with Fig. 6b. In comparing the aforementioned algorithms, we employ several metrics, including computational load (CPU time) and performance accuracy. With the advent of more powerful computing platforms, including graphics processing units (GPUs), however, performance accuracy holds more merit. CPU time is primarily used as an indicator of the comparative computational load and convergence speed. As shown in Table 1, the convergence speeds of the proposed blind adaptive approaches are comparable to that of the BMUD algorithm presented in [27], even though they improve the BER performance. Table 1 also shows that the proposed fused semi-blind methods, especially the RICA semi-blind algorithm, exhibit faster convergence than both the BMUD algorithm and the proposed blind detectors.

Table 1 CPU time comparison among different detectors (in seconds)

Furthermore, we have evaluated the effect of the OVSF codes. As depicted in Fig. 7, using OVSF codes generally enhances the performance of the proposed methods.

Fig. 7 Average BER as a function of SNR for the DS–CDMA downlink using OVSF codes (\(G=64\)). a 30 users, b 50 users

In the WCDMA system case, we assume that the channel coefficients are \( h_0=0.3684 + 0.5364i\), \( h_1=0.1982 + 0.0187i\), \(h_2=0.0237 + 0.5683i\), \( h_3=0.1112 + 0.0835i\), and \(h_4=0.2203 + 0.2756i \), respectively. Also, we consider two types of user-specific spreading codes, namely Gold codes with spreading gain \(G=63\) and OVSF (or Walsh–Hadamard) codes with spreading gain \(G=64\).

In Figs. 8 and 9, we show the performance of the various methods in terms of BER for the WCDMA downlink scenario. We observe that the LMMSE detector is slightly better than some of the presented detectors under good (i.e., high) SNR conditions. However, the proposed algorithm based on FB-II outperforms all detectors over the entire depicted SNR range and again produces the lowest BER among all methods.

Fig. 8 Average BER as a function of SNR for the WCDMA downlink using Gold codes (\(G=63\)). a 30 users, b 50 users

Fig. 9 Average BER as a function of SNR for the WCDMA downlink using OVSF codes (\(G=64\)). a 30 users, b 50 users

It is also worthwhile to compare the presented algorithms on a relatively large data sample set. Thus, Figs. 10 and 11 present the performance of the various detectors with a fairly long sample set, namely \(M=30{,}000\), in the DS–CDMA and WCDMA systems, respectively. The benchmark LMMSE detector performs much better at high SNR in this setting, and it is plausible to assume that it becomes better than the other detectors under good SNR conditions. However, the proposed algorithm based on FB-II outperforms the LMMSE detector at all SNRs below 22 dB.

Fig. 10 Average BER as a function of SNR for the DS–CDMA downlink with 30 users. a Gold codes (\(G=63\)), b OVSF codes (\(G=64\))

Fig. 11 Average BER as a function of SNR for the WCDMA downlink with 30 users. a Gold codes (\(G=63\)), b OVSF codes (\(G=64\))

Fig. 12 Average BER as a function of SNR for various numbers of users K

Fig. 13 Average BER as a function of SNR for various sample sets M

We clarify the BER computation in the following. We consider \(M=30{,}000\) and repeat the experiment for 1000 realizations. Thus, we have \(30{,}000 \times 1000 = 30 \times 10^6\) symbols, which allows an error rate on the order of \({\sim } 10^ {-6}\) to be measured. The BER is calculated by comparing the transmitted sequence of bits to the received bits and counting the number of errors; it is the ratio of the number of bits received in error to the total number of bits received. As before, the data transmission is assumed to be quaternary phase-shift keying (QPSK).
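The bit-counting procedure described above can be sketched as follows. The Gray-style bit-to-symbol mapping, the additive Gaussian perturbation (standing in for the detector output), the noise level, and the short sequence length are all assumptions chosen to keep the example small; they are not the simulation setup of this section.

```python
import random

def qpsk_map(b0, b1):
    """Assumed mapping: bit 0 -> +1 and bit 1 -> -1 on each axis (unnormalized)."""
    return complex(1 - 2 * b0, 1 - 2 * b1)

def qpsk_demap(sym):
    """Hard decision on the signs of the real and imaginary parts."""
    return (0 if sym.real >= 0 else 1, 0 if sym.imag >= 0 else 1)

def bit_error_rate(tx_bits, rx_bits):
    """Number of differing bits divided by the total number of bits."""
    errors = sum(t != r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)

rng = random.Random(0)
tx_bits = [rng.randint(0, 1) for _ in range(2000)]   # 1000 QPSK symbols
symbols = [qpsk_map(tx_bits[i], tx_bits[i + 1])
           for i in range(0, len(tx_bits), 2)]
# Stand-in for the detector output: symbols plus Gaussian noise (assumed model).
noisy = [s + complex(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for s in symbols]
rx_bits = [b for s in noisy for b in qpsk_demap(s)]
ber = bit_error_rate(tx_bits, rx_bits)

total_symbols = 30_000 * 1000   # symbols per run times number of realizations
print(ber, total_symbols)
```

Since each QPSK symbol carries two bits, the number of compared bits is twice the number of symbols.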

Finally, we evaluate the effect of the number of users and the size of the sample set on the performance of the proposed FB-II method in Figs. 12 and 13, respectively. Figure 12 shows the BER versus SNR for various numbers of users K, with 500 symbols per user, for blind multiuser detection based on the FB-II detector. As expected, the performance of the FB-II detector decreases as the number of users K increases. Moreover, Fig. 13 shows the BER versus SNR with 30 users \((K=30)\) for various sample set sizes M. The proposed FB-II algorithm appears robust and performs reasonably well, and its performance improves consistently as M increases, since the MAI is mitigated more effectively.

Overall, the proposed variant detectors and algorithms perform well in solving the symbol estimation problem in the DS/WCDMA downlink system, especially when the size of the sample set is reasonably small.

6 Conclusion

We have presented formulations, derivations, and extensive simulations of various filtering algorithms with various structures for multiuser detection in CDMA-based systems. Specifically, we have developed three blind multiuser detectors with different filtering structures, namely the FF, FB-I, and FB-II detectors. In addition, we have introduced three adaptive semi-blind algorithms fused into the conventional Rake detector based on ICA, RICA, and PCA. The results appear to show that the proposed structures perform well in the symbol estimation problem in DS/CDMA systems; more specifically, they outperform all other detectors in the comparative study, including the LMMSE detector. Our results also show that MAI can be mitigated by the proposed detectors and algorithms, particularly the proposed FB-II detector. Although the FB-II detector further improves as the size of the sample set increases, the results show that it performs well even when the sample sets are relatively small. Finally, the proposed algorithms, unlike the adaptive LMMSE detector, do not require the spreading codes of the interfering users. While these detectors are intended for the mobile unit, they can be used at the base station as well.