Keywords

1 Introduction

In traditional MIMO system, a base station is usually mounted with multiple antennas and simultaneously serves multiple users. This kind of system has been widely utilized in mobile communication to enhance data throughput and link range without demanding additional bandwidth or transmit power [1, 2]. Beneficial from this advantage, MIMO technology plays a significant role in the majority of up-to-date wireless communication standards, such as 4G LTE and LTE-Advanced [3]. However, due to the constantly increasing demands for higher data rates, these systems are already approaching their throughout limits. In order to utilize resources more efficiently, reduce interference, improve the transmission rate and robustness, an emerging technique referred to as massive MIMO which employs antenna arrays with a few hundred of antennas at base station is proposed in recent years [4, 5]. It has been regarded as an enabler for the development of future broadband wireless networks and the next generation mobile communication systems [6, 7].

It is not trivial to establish a practical system to gain the extremely attractive advantages of massive MIMO technology and low-complexity signal detection algorithms are of an actual interest in system uplink when the number of single-antenna users is getting tremendously large [4]. Many signal detection algorithms that work very efficiently in conventional MIMO systems fail in massive MIMO systems because of computational complexity or performance. For examples, the complexity of the maximum likelihood (ML) detector, which is optimal among the hard decision methods, grows exponentially with the modulation order and the number of transmit antennas. The fix-complexity sphere decoding (FSD) [8] and tabu search(TS) [9] algorithms are put forward to obtain quasi-optimal performance, but their complexity is not affordable when the configuration of MIMO system is large or the modulation order is high [10]. One has no choice but to turn to linear detection algorithms such as zero-forcing (ZF) and MMSE due to their relatively low complexity and good bit error(BER) performance for multiuser massive MIMO systems [4], but such algorithms still require complexity of \(O(K^3) \) for calculating a matrix inversion, where K is the number of single antennas user. Therefore, many efforts have been dedicated to relieving the burdensome high complexity problem for practical detector design.

In [11], Neumann series expansion was proposed to approximate the matrix inversion in LMMSE detection, the performance and computational complexity of which scaled with the number of selected terms of Neumann series. However, when the number of selected terms was larger than two, the complexity of Neumann series expansion method was the same as that of the exact matrix inversion based detection method. In [12], Richardson iteration was proposed to avoid complicated matrix inversion, but the tradeoff between the signal detection performance and computational complexity did not meet expectations. In this paper, we first propose the WSSOR method to avoid direct matrix inversion on the premise of LMMSE filtering matrix is symmetric positive definite in massive MIMO systems and maintain good performance at same time. Then we present a proper initial solution, relaxation parameter and scope of the weighting factor to speed up the convergence rate.

The rest of paper is organized as follows: Sect. 2 introduces the general massive MIMO system model. Section 3 proposes the WSSOR-based signal detector. Simulation results are given in Sect. 4. Section 5 provides a summary of our findings and concludes the paper.

Notations: Lower-case and upper-case boldface symbols are used to represent column vectors and matrices, respectively. The superscripts T, H and \(-1\) respectively denote the transpose, conjugate-transpose and inverse of a matrix. \(\mathbf {Tr}(.)\) denotes the trace and \(\mathbf {I}_K \) is the \(K\times K \) identity matrix.

2 Massive MIMO System Model

Suppose an uplink multiuser massive MIMO system is composed of N receive antennas at the base station and K single-antenna user equipments \((K\le N) \). Let \(\mathbf {x}_c=[x_1,x_2,\dots ,x_K]^T\) stand for the vector containing the symbols simultaneously transmitted by all the users, where \(x_k\in \mathbb {B}\) is the symbol transmitted from the k-th user and \(\mathbb {B}\) is the modulation alphabet. Let \(\mathbf {H}_c\in \mathbb {C}^{N\times K}\) represent the channel coefficient matrix, whose entries are assumed to be independently and identically distributed. Therefore, the received signal vector \(\mathbf {y}_c \) at the base station can be denoted as

$$\begin{aligned} \mathbf {y}_c=\mathbf {H}_c\mathbf {x}_c+\mathbf {z}_c. \end{aligned}$$
(1)

where \(\mathbf {z}_c \) is the additive white Gaussian noise vector with its entries follow the Gaussian distribution \(\mathbb {C}\mathcal {N}(0,\sigma ^2) \).

Focusing on the uplink signal detection, when the subscript is dropped for convenience, the complex-valued system model (1) can be written in the real domain as

$$\begin{aligned} \mathbf {y}=\mathbf {H}\mathbf {x}+\mathbf {z}, \end{aligned}$$
(2)

where \(\mathbf {H}\in \mathbb {R}^{2N\times 2K} \), \(\mathbf {y}\in \mathbb {R}^{2N} \) and \(\mathbf {z}\in \mathbb {R}^{2N} \).

The task of massive MIMO signal detection at the base station is to detect the transmitted signal vector \(\mathbf {x}\) on the basis of the received signal vector \(\mathbf {y}\). It is worth mentioning that the channel coefficient matrix \(\mathbf {H} \) can be obtained by time or frequency domain training pilots without loss of generality [13, 14]. It has been testified that LMMSE detection algorithm is quasi-optimal for recovering the transmitted signal vector \(\hat{\mathbf {x}}\) from all the K single-antenna users

$$\begin{aligned} \hat{\mathbf {x}}=(\mathbf {H}^{H}\mathbf {H}+\sigma ^2\mathbf {I}_{2K})^{-1}\mathbf {H}^{H}\mathbf {y}=\mathbf {W}^{-1}\mathbf {y}_{_\text {MF}}. \end{aligned}$$
(3)

where \(\mathbf {y}_{_\text {MF}}=\mathbf {H}^{H}\mathbf {y}\) is regarded as the matched-filter output of \(\mathbf {y} \), and LMMSE filtering matrix \(\mathbf {W} \) is described as

$$\begin{aligned} \mathbf {W}=\mathbf {G}+\sigma ^2\mathbf {I}_\text {2K}. \end{aligned}$$
(4)

where \(\mathbf {G}=\mathbf {H}^H\mathbf {H} \) stands for the Gram matrix. It is worth noting that the exact computation of matrix inversion \(\mathbf {W}^{-1} \) needs unbearable complexity of \(O(K^3)\).

3 Near-Optimal Massive MIMO Signal Detector with Low Complexity

In this section, first of all, we propose a low complexity signal detection algorithm employing WSSOR without exact matrix inversion. Then we present the proper initial solution, relaxation parameter, and scope of the weighting factor for the WSSOR method, which are able to enhance the convergence rate in the case of high-order modulation. In addition, we analyze the computational complexity of the proposed algorithm in detail.

3.1 Signal Detection Based on WSSOR Method

Inspired by the special characteristics that the LMMSE filtering matrix \(\mathbf {W} \) has the property of being symmetric positive definite in massive MIMO systems uplink [15], we can utilize the WSSOR method to efficiently solve Eq. (3) with low complexity. Unlike that the LMMSE signal detector straightforwardly computes \(\mathbf {W}^{-1}\mathbf {y}_{_\text {MF}}\), the WSSOR method converts the matrix inversion problem into the one of solving linear equation

$$\begin{aligned} \mathbf {A}\mathbf {s}=\mathbf {b}, \end{aligned}$$
(5)

where \(\mathbf {A}\) denotes the symmetric positive definite matrix, \(\mathbf {s}\) the \(N\times 1\) solution vector, and \(\mathbf {b}\) the \(N\times 1\) measurement vector. The successive over-relaxation (SOR) method [15] can solve the linear equation efficiently in an iterative way. It greatly helps one avoid the complicated matrix inversion calculation and it is entirely different from the conventional method that directly computes \(\mathbf {A}^{-1}\mathbf {b}\) to estimate \(\mathbf {s}\). Due to the fact that matrix \(\mathbf {A}\) is symmetric positive definite, we can decompose it into a diagonal matrix \(\mathbf {D}_{\mathbf {A}}\), a strictly lower triangular matrix \(\mathbf {L}_{\mathbf {A}}\), and a strictly upper triangular matrix \(\mathbf {L}_{\mathbf {A}}^H\). Hence, the iterations of SOR can be represented as

$$\begin{aligned} \begin{aligned} \mathbf {s}^{(t+1)}=(\mathbf {A}_{\mathbf {A}}+\frac{1}{\omega }\mathbf {D}_{\mathbf {A}})^{-1}\left[ ((\frac{1}{\omega }-1)\mathbf {D}_{\mathbf {A}}-\mathbf {L}_{\mathbf {A}}^H)\mathbf {s}^{(t)}+\mathbf {b}\right] , \end{aligned} \end{aligned}$$
(6)

where the superscript t represents the number of iterations, and \(\omega \) indicates the relaxation parameter, which imposes an strong impact on the convergence rate.

Observing that LMMSE filtering matrix \(\mathbf {W} \) has the property of being symmetric positive definite in massive MIMO systems uplink, we may decompose \(\mathbf {W}\) in another manner as

$$\begin{aligned} \mathbf {W}=\mathbf {D}+\mathbf {L}+\mathbf {L}^H, \end{aligned}$$
(7)

where \(\mathbf {D} \), \(\mathbf {L} \) and \(\mathbf {L}^H \) represent the diagonal matrix, the strictly lower triangular matrix, and the strictly upper triangular matrix of the LMMSE filtering matrix \(\mathbf {W}\), respectively. By using the SOR method, the transmitted signal vector \(\mathbf {x}\) can be expressed as

$$\begin{aligned} \begin{aligned}\mathbf {x}^{(t+1)}=(\mathbf {L}+\frac{1}{\omega }\mathbf {D})^{-1}\left[ ((\frac{1}{\omega }-1)\mathbf {D}-\mathbf {L}^H)\mathbf {x}^{(t)}+\mathbf {y}_{_\text {MF}}\right] ,\end{aligned} \end{aligned}$$
(8)

where \(\mathbf {x}^{(0)}\) is the initial solution of SOR and it is set as a \(2K\times 1\) zero vector in general [16]. Consequently, the signal detection problem in Eq. (3) can be solved by SOR method in accordance with

$$\begin{aligned} (\mathbf {D}+\omega \mathbf {L})\mathbf {x}^{(t+1)}=(1-\omega )\mathbf {D}\mathbf {x}^{(t)}-\omega \mathbf {L}^H\mathbf {x}^{(t)}+\omega \mathbf {y}_{_\text {MF}}. \end{aligned}$$
(9)

As \(\mathbf {D}+\omega \mathbf {L} \) is a lower triangular matrix, we can solve Eq. (9) to obtain \(\mathbf {s}^{(t+1)} \) with low complexity and set relaxation parameter \(\omega \) within value scope \(0 \!<\!\omega \!<\!2 \).

However, when we encounter the more complex problems, very complicated eigenvalue needs to be analyzed. Thus, [17] proposed Chebyshev acceleration and symmetric successive over-relaxation (SSOR). SSOR is the improved method of symmetry of the SOR, whose basic idea is to combine SOR iterative method and backward SOR. The iterations of SSOR can be carried out in the following two steps:

  • Step 1: Compute the previous half iteration which is identical with the SOR iteration [16] by

    $$\begin{aligned} \begin{aligned} (\mathbf {D}+\omega \mathbf {L})\overline{\mathbf {x}}^{(t+1/2)}=(1-\omega )\mathbf {D}\overline{\mathbf {x}}^{(t)}-\omega \mathbf {L}^H\overline{\mathbf {x}}^{(t)}+\omega \mathbf {y}_{_\text {MF}}, \end{aligned} \end{aligned}$$
    (10)
  • Step 2: Compute the latter half iteration which is the SOR method with the equations taken in reverse order by

    $$\begin{aligned} \begin{aligned} (\mathbf {D}+\omega \mathbf {L}&^H)\overline{\mathbf {x}}^{(t+1)}=(1-\omega )\mathbf {D}\overline{\mathbf {x}}^{(t+1/2)}-\omega \mathbf {L}^H\overline{\mathbf {s}}^{(t+1/2)}+\omega \mathbf {y}_{_\text {MF}}. \end{aligned} \end{aligned}$$
    (11)

    where \(\overline{\mathbf {x}}\) represents the vector that needs to be estimated in SSOR for the propose of weighting the solution of SOR and SSOR, and \(\overline{\mathbf {x}}^{(0)}\) indicates the initial solution of SSOR, which is chosen as a \(2K \times 1\) zero vector in particular.

Compared with the SOR method, the SSOR method has two advantages. Firstly, the structure of SSOR method is symmetric, which implies that the convergence rate of SSOR can be improved by using Chebyshev acceleration. Secondly, a simple and quantified relaxation parameter can be employed to approximate a precise relaxation parameter with negligible performance loss, considering the convergence rate of SSOR method is not very sensitive to the relaxation parameter \(\omega \). A detailed description of the relaxation parameters is given in the next subsection.

Based on the basic idea of the SOR and the SSOR iterative method, we employ the averaging weight to deal with the vector derived by the iteration of Eq. (9) and the vector derived by iteration of Eq. (11). The WSSOR method can be described as

$$\begin{aligned} \hat{\mathbf {x}}=\theta \mathbf {x}^{(t+1)}+(1-\theta )\overline{\mathbf {x}}^{(t+1)}, \end{aligned}$$
(12)

where superscript t represents the number of iterations, and \(\theta \) indicates the weighting factor, \(\theta \in [0,1]\). When \(\theta =0\), the WSSOR iteration will degenerate into SSOR iteration; as for \(\theta =1\), it just boils down to the SOR iteration.

Applying the WSSOR method mentioned above to solve the equation. In consequence, we obtain the estimated signal vector in the tth iteration is

$$\begin{aligned} \hat{\mathbf {x}}^{(t+1)}=\mathbf {B}\hat{\mathbf {x}}^{(t)}+\mathbf {C}, \end{aligned}$$
(13)

where \(\mathbf {B}=\theta (1-\omega )((\mathbf {D}+\omega \mathbf {L})^{-1}-(\mathbf {D}+\omega \mathbf {L}^H)^{-1})\mathbf {D}+\theta \omega ((\mathbf {D}+\omega \mathbf {L})^{-1}\mathbf {L}^H-(\mathbf {\mathbf {D}+\omega \mathbf {L}^H})^{-1}\mathbf {L})+(1-\omega )(\mathbf {D}+\omega \mathbf {L}^H)^{-1}\mathbf {D}+\omega (\mathbf {D}+\omega \mathbf {L}^H)^{-1}\mathbf {L}.\) \(\mathbf {C}=\omega (\mathbf {D}+\omega \mathbf {L})\mathbf {y}_{_\text {MF}}, \) the weighting factor \(\theta \in [0,1] \). Proper relaxation parameter and initial solution will be given in subsection B.

3.2 Proper Relaxation and Parameter and Initial Solution

From Eqs. (9), (11), and (13), it is clear that the setting of relaxation parameter may result in some effects on the convergence rate of the WSSOR method. In [18], the optimal relaxation parameter \(\omega ^{opt} \) has been proposed as

$$\begin{aligned} \omega ^{opt}=\frac{2}{1+\sqrt{2(1-\rho (\mathbf {B}_{J}))}}, \end{aligned}$$
(14)

where \(\rho (\mathbf {B}_J)\) denotes the spectral radius of Jacobi iteration matrix \(\mathbf {B}_J \), which can be represented by

$$\begin{aligned} \mathbf {B}_J=\mathbf {D}^{-1}\mathbf {W}-\mathbf {I}_\text {2K}. \end{aligned}$$
(15)

Each element of the diagonal matrix \(\mathbf {D}\) will tend towards a fixed value N in massive MIMO systems [4], which indicates that we have

$$\begin{aligned} \mathbf {D}^{-1}\approx \frac{1}{N}\mathbf {I}_\text {2K}. \end{aligned}$$
(16)

Furthermore, as \(\mathbf {W}\) is a central Wishart matrix, when N and K are large enough and the system configuration ratio \(\beta =K/N\) remains fixed, the largest eigenvalue \(\lambda _{max} \) of \(\mathbf {W}\) can be well approached by [4]

$$\begin{aligned} \lambda _{max}=N(1+\beta )^2. \end{aligned}$$
(17)

Therefore, we can exploit a simple proper relaxation parameter \(\overline{\omega }\) to replace \(\omega ^{opt} \) (14) with insignificant error as

$$\begin{aligned} \overline{\omega }=\frac{2}{1+\sqrt{2(1-c)}}, \quad c=(1+\beta )^2-1. \end{aligned}$$
(18)

which signifies that the relaxation parameter \(\overline{\omega } \) only depends on the system configuration ratio \(\beta \). The relaxation parameter \(\overline{\omega }\) will be a constant in case that the configuration of massive MIMO system is kept as a fixed value.

For convenience, the initial vectors of tradition iterative algorithms are often selected as zero vectors. However, better performance can be achieved by choosing a proper initial solution than zero vectors under the same number of iterations. As \(\mathbf {D}^{-1} \) is a very good approximation for \(\mathbf {W}^{-1} \) when K / N is large enough, and \(\mathbf {G}\approx N\mathbf {I}_\text {2K} \) according to the channel hardening phenomenon, we can obtain \(\mathbf {W}^{-1}\approx \mathbf {D}^{-1}\approx (N+\sigma ^2/E_x)\mathbf {I}_\text {2K}^{-1}\approx N^{-1}\mathbf {I}_\text {2K} \), where \(E_{x}\) represents the transmission power. Then, the proper initial solution of (12), (13) and (14) can be selected as [19]

$$\begin{aligned} \mathbf {x}^{(0)}=\frac{1}{N}\mathbf {I}_\text {2K}\mathbf {y}_{_\text {MF}}. \end{aligned}$$
(19)

3.3 Computational Complexity Analysis

Owing to that fact that number of multiplication is dominant in computational complexity, in this subsection, we evaluate the complexity with respect to the required number of multiplications in each iteration. The whole complexity is mainly composed of two parts. The first part originates from the calculation of Eq. (9), for which the solution can be expressed by

$$\begin{aligned} \begin{aligned}\overline{x}_m^{(t+1/2)}&=(1-\omega )\overline{x}_m^{(t)}+\frac{\omega }{W_{m,m}}({y}_{_{\text {MF}}}^m-\sum _{k<m}W_{m,k}\overline{x}_k^{(t+1/2)}-\sum _{k>m}W_{m,k}\overline{x}_k^{(t)}),\\ \end{aligned} \end{aligned}$$
(20)

where \(\overline{x}_m^{(t+1/2)}\), \(\overline{x}_m^{(t)}\), and \({y}_{_\text {MF}}^m\) indicate the m-th element of \(\overline{\mathbf {x}}^{(t+1/2)}\), \(\overline{\mathbf {x}}^{(t)}\) and \(\mathbf {y}_{_\text {MF}}\) in Eq. (13), respectively. The entry \(W_{m,k} \) indicates the mth row and kth column of the matrix \(\mathbf {W}\). It is easy to know that the required number of multiplications in the computation of each element of \(\mathbf {s}^{(t+1/2)} \) is \(K+1 \). Due to the K elements in \(\overline{\mathbf {s}}^{(t+1/2)}\), the overall number of multiplications needed for this part is \(K^2+K\).

The second part is from the computation of (11). Same as (20), the solution to (11) can be written as

$$\begin{aligned} \begin{aligned} \overline{x}_m^{(t+1)}&=(1-\omega )\overline{x}_m^{t+1/2}+\frac{\omega }{W_{m,m}}({y}_{_\text {MF}}^m-\sum _{k<m}W_{m,k}\overline{x}_k^{(t+1/2)}-\sum _{k>m}W_{m,k}\overline{x}_k^{(t+1)}),\\ \end{aligned} \end{aligned}$$
(21)

where \(\overline{x}_m^{(t+1)} \) indicates the mth element of \(\overline{\mathbf {x}}^{(t+1)} \) in (11). According to (11), we can conclude that this part also requires \(K^2+K \) times of multiplications.

In a word, the entire complexity of the proposed WSSOR based signal detector is \(t(2K^2+2K) \). Compared with MMSE algorithm, the complexity has been reduced from \(O(K^3) \) to \(O(K^2) \). A comparison between Neumann series expansion and WSSOR about computational complexity is shown in Fig. 1. It is clear that the complexity of proposed WSSOR method is significantly lower than that of the Neumann series expansion after two iterations. When the number of iteration is larger than two, Neumann series expansion based signal detector loses the advantage in computational complexity.

Fig. 1.
figure 1

Complexity comparison against the number of users K

Fig. 2.
figure 2

BER performance comparison between Neumann and WSSOR without proper relaxation parameter and initial solution

Fig. 3.
figure 3

BER performance comparison between Neumann and WSSOR with proper relaxation parameter and initial solution

4 Simulation Result

To verify the performance of the proposed WSSOR signal detection algorithm compared with the Neumann series expansion one, we provide the BER simulation results in this section. The BER performance of the MMSE algorithm with exact matrix inversion and just inversion of its diagonal elements are included as the benchmark for comparison. We consider massive MIMO systems with \(N\times K=128\times 16 \). The modulation scheme of 16QAM is adopted and weighting factor \(\theta \) is chosen as 0.75. In the following simulation diagrams, t will denote the number of iterations for the algorithm based on WSSOR method, but the first terms of the algorithm based on Neumann series expansion.

Figure 2 shows the BER performance comparison between Neumann-based signal detector and WSSOR-based signal detector without proper relaxation parameter and initial solution. We can observed from Fig. 2, the BER performance of both Neumann-based signal detector and WSSOR-based signal detector improves when increasing of the number of iterations. However, the proposed algorithm outperforms the Neumann-based one with the same number of iterations t. For example, when \(t=4 \), to achieve the BER performance of \(10^{-4} \), the required SNR by WSSOR-based signal detector just requires 10 dB, while the Neumann-based one is about 14 dB. Furthermore, we can conclude that WSSOR-based signal detector obtains quasi-optimal BER performance of LMMSE signal detector though fewer iterations.

Instead of choosing zero vector as the initial solution and \(0<\omega <2\), we find a proper initial solution and relaxation parameter in Fig. 3. When the BER performances in Figs. 2 and 3 are compared, it is clear that proper initial solution and relaxation parameter are helpful for accelerating the convergence rate evidently. For instance, the algorithm with proper initial solution and relaxation parameter outperforms the conventional one, especially in cases where the number of iteration t is small. When \(t=2\), the BER performance of the WSSOR method with proper initial solution and relaxation parameter is nearly similar to the one without them when \(t=4\), which implies we can be close to the optimal BER performance of LMMSE signal detector through only a smaller number of iterations.

5 Conclusion

In this paper, in accordance with the special characteristics of massive MIMO systems, we propose a low complexity detection method based on the WSSOR method, which exploits an iterative strategy to detect the transmitted signal vectors without demanding complicated matrix inversion. The complexity has been reduced from \(O(K^3)\) to \(O(K^2)\). Meanwhile, we present proper initial solution and relaxation parameter, which improve the detection performance and convergence rate. Simulation results illustrate that the proposed algorithm outperforms the Neumann expansion-based signal detector, and achieves near-optimal detection performance via only a small number of iterations.