1 Introduction

The rapid traffic growth driven by the mobile Internet and the Internet of Things has made system capacity enhancement one of the most important goals for next-generation wireless communication systems. The general consensus is that the aggregate data rate will increase roughly 1000-fold by 2020. Massive multiple-input multiple-output (MIMO), which was first advocated in [1], has been identified as one of the key enabling technologies for achieving this goal because of its strong potential to boost the spectral efficiency (SE) of wireless networks [1, 2]. The term massive MIMO indicates that the base station (BS) employs a number of antennas (typically several tens to hundreds) much larger than the number of active data streams per time-frequency resource. The benefits of massive MIMO are twofold. First, massive MIMO produces a large surplus of degrees of freedom, which can be used to create asymptotically orthogonal channels and deliver nearly interference-free signals to each user equipment (UE). In this way, the network SE is enhanced significantly because more UEs can be served in parallel and each UE suffers from less interference. Second, the large array gain of the large-scale antenna array helps to save transmit power and thus potentially improves the energy efficiency.

Massive MIMO was originally designed for time-division duplex (TDD) systems [1,2,3,4,5,6,7], because by exploiting channel reciprocity in the TDD setting, the channel state information (CSI) required at the BS for downlink transmission can be easily obtained via uplink training [1]. The training overhead scales linearly with the number of UEs and is independent of the number of BS antennas, which is acceptable in most typical scenarios. As frequency-division duplex (FDD) dominates current wireless cellular systems, applying massive MIMO in FDD systems is even more desirable. In FDD massive MIMO, however, the downlink training and the corresponding CSI feedback yield an unacceptably high overhead, which poses a significant bottleneck on the achievable SE. One practical approach to FDD massive MIMO is joint spatial division and multiplexing (JSDM) [8], in which the correlation between channels is exploited to reduce the training and feedback dimensions. Another scheme that enables FDD massive MIMO is beam division multiple access (BDMA) [9]. BDMA removes the need for CSI at the transmitter and offers strong potential to realize the massive MIMO gain in FDD systems. Moreover, other innovative approaches, such as phase-only beamforming [10] and two-stage beamforming [11], are also promising solutions for FDD massive MIMO.

In TDD and FDD massive MIMO systems (namely, half-duplex (HD) massive MIMO systems), the uplink and downlink UEs must be allocated orthogonal time slots or frequency bands, which results in inefficient utilization of time-frequency resources. Inspired by the recent development of full-duplex (FD) communication [17], co-time co-frequency uplink and downlink (CCUD) transmission has become another option for cellular systems. Although attractive in terms of SE, CCUD transmission is considered challenging because of the strong self-interference (SI) caused by signal leakage between the BS transmitter and receiver, especially when the BS is equipped with large-scale antenna arrays. In small-scale MIMO systems, the SI can be mitigated by active SI cancellation (SIC) schemes, such as digital/circuit-domain SIC and spatial suppression [17]. However, these techniques require instantaneous knowledge of the high-dimensional SI channel, which makes them difficult to apply in large-scale antenna systems. Passive SIC can be applied without SI channel knowledge, but it fails to provide a satisfactory SIC level when used alone [17]. In addition, to support CCUD transmission, the BS employs a separate antenna configuration in which two separate large-scale antenna arrays are used for transmission and reception, respectively [18]. In this case, downlink channel reciprocity is commonly considered unavailable [19]. Without reciprocity, the training overhead needed to obtain the downlink CSI scales linearly with the number of BS antennas, which poses another major challenge. In [18], to make the system feasible, the authors assumed that each transmit antenna of the BS is also connected to a receive radio-frequency chain so that it can receive the pilot signal. In this case, downlink reciprocity can still be exploited, although at the cost of additional hardware complexity.

Note that the CCUD transmission in the cellular system with massive MIMO BS has been investigated recently in several works (see [12,13,14,15,16] and the references therein). For example, the authors in [12,13,14] studied the SE performance of CCUD transmission in both macro-cell and small-cell environments. The linear beamforming design of the BS for CCUD transmission has been considered in [15] and [16]. However, most of these works are based on the assumption that the SI has been suppressed to a reasonable level and the uplink/downlink channel can be efficiently obtained. As a result, the aforementioned challenges are still not fully addressed.

In this chapter, we investigate the feasibility of CCUD transmission in the cellular system with massive MIMO BS. The contributions are summarized as follows.

  1. By exploiting the beam-domain representation of channels based on the basis expansion model [23], we prove that the massive MIMO channel matrix (vector) can be represented by a low-dimensional effective beam-domain channel matrix (vector). Based on this property, we propose a beam-domain full-duplex (BDFD) massive MIMO scheme (BDFD scheme for short) to enable CCUD transmission in the cellular system. We show that the BDFD scheme achieves significant savings in uplink/downlink training and achieves the uplink and downlink sum capacities simultaneously as the number of BS antennas approaches infinity.

  2. Then, we investigate several important components for the practical implementation of the BDFD scheme in the cellular system, including UE grouping, effective beam-domain channel estimation, beam-domain data transmission, and interference control between uplink and downlink.

  3. Finally, we examine the SE of the BDFD scheme using the third-generation partnership project long-term evolution (3GPP LTE) simulation model for the macro-cell environment. The results demonstrate the superiority of the BDFD scheme over TDD/FDD massive MIMO.

The rest of the paper is organized as follows. The system and channel models are described in Sect. 2. Sections 3 and 4 consider the basic idea and the practical implementation of the BDFD scheme, respectively. Section 5 presents the simulation results. Section 6 draws the conclusions.

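Notation \( \mathbb{E}\left(\cdot \right) \) denotes the expectation. \( \delta \left(\cdot \right) \) denotes the Dirac delta function. \( {\mathbf{A}}^{\left\{{B}_1,{B}_2\right\}} \) denotes the submatrix of \( \mathbf{A} \) obtained by keeping its rows indexed by set \( {B}_1 \) and its columns indexed by set \( {B}_2 \). \( {\mathbf{A}}^{\left\{B,:\right\}} \) (\( {\mathbf{A}}^{\left\{:,B\right\}} \)) denotes the submatrix of \( \mathbf{A} \) obtained by keeping its rows (columns) indexed by set \( B \). \( {\left(\cdot \right)}^T \), \( {\left(\cdot \right)}^{\ast } \), \( {\left(\cdot \right)}^H \), \( \left|\cdot \right| \), \( \left\Vert \cdot \right\Vert \), and \( \mathrm{tr}\left(\cdot \right) \) denote the transpose, conjugate, conjugate transpose, determinant, Frobenius norm, and trace of a matrix, respectively. \( \mathbf{A}\succeq 0 \) means that \( \mathbf{A} \) is a Hermitian positive semi-definite matrix. The frequently used symbols in this paper are summarized in Table 1.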

Table 1 Summary of frequently used symbols

2 System and Channel Models

Consider a single-cell system with an FD BS, a number of uplink UEs, and a number of downlink UEs, as shown in Fig. 1b. We assume that all UEs are HD and have a single antenna. To support CCUD transmission, the BS employs two separate large-scale antenna arrays for transmission and reception, respectively. Uniform linear arrays are assumed. In a practical implementation, the transmit and receive antenna arrays of the BS can be deployed on opposite sides of a building, tens of meters apart, to reduce the SI.

Fig. 1 TDD/FDD massive MIMO and FD massive MIMO systems. (a) TDD/FDD massive MIMO. (b) Full-duplex massive MIMO

We use \( {\mathbf{h}}_{k_u}\in {\mathbb{C}}^{N\times 1} \) to denote the channel vector from uplink UE \( {k}_u \) to the receive antenna array of the BS and \( {\mathbf{h}}_{k_d}\in {\mathbb{C}}^{N\times 1} \) to denote the channel vector from the transmit antenna array of the BS to downlink UE \( {k}_d \), where \( N \) denotes the number of transmit/receive antennas at the BS. We use \( {\mathbf{H}}_{SI}\in {\mathbb{C}}^{N\times N} \) to denote the SI channel matrix from the transmit antenna array to the receive antenna array of the BS.

We consider the general cluster-based channel model [21], where the received signal at the BS from uplink UE \( {k}_u \) is a sum of the contributions from \( {M}_u \) scattering clusters. The direction of arrival (DOA) of the signals resulting from the ith cluster lies within the region \( \left[{\theta}_{k_u,i}^{\mathrm{min}},{\theta}_{k_u,i}^{\mathrm{max}}\right] \). Thus, the channel vector between uplink UE \( {k}_u \) and the BS can be expressed as [21]

$$ {\mathbf{h}}_{k_u}=\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}\mathbf{a}\left(\theta \right){r}_{k_u,i}\left(\theta \right) d\theta $$
(1)

where \( \mathbf{a}\left(\theta \right)={\left[1,\exp \left(j2\pi d\sin \left(\theta \right)/\lambda \right),\cdots, \exp \left(j2\pi d\left(N-1\right)\sin \left(\theta \right)/\lambda \right)\right]}^T \) is the array response vector, with \( d \) and \( \lambda \) denoting the antenna spacing and carrier wavelength, respectively. \( {r}_{k_u,i}\left(\theta \right) \) denotes the complex-valued response gain. In the above model, the DOA regions of signals from different scattering clusters are disjoint (otherwise, these signals should be considered as coming from the same scattering cluster). Therefore, the number of scattering clusters is finite because \( {\sum}_{i=1}^{M_u}\left({\theta}_{k_u,i}^{\mathrm{max}}-{\theta}_{k_u,i}^{\mathrm{min}}\right)\le 2\pi \).
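As an illustration of the model in (1), the following minimal sketch draws one channel realization by replacing each integral with a Riemann sum. The function names, the flat power angle spectrum inside each cluster, and the number of integration points are assumptions made for the example and are not taken from the chapter.

```python
import cmath
import math
import random


def array_response(theta, n_ant, d_over_lambda=0.5):
    """Array response a(theta) of an n_ant-element uniform linear array."""
    phase = 2.0 * math.pi * d_over_lambda * math.sin(theta)
    return [cmath.exp(1j * phase * k) for k in range(n_ant)]


def cluster_channel(clusters, n_ant, d_over_lambda=0.5, n_points=200, rng=None):
    """Riemann-sum approximation of the cluster-based channel vector.

    clusters: list of (theta_min, theta_max, power) tuples in radians.
    Assumption: the power angle spectrum is flat inside each cluster.
    """
    rng = rng or random.Random(0)
    h = [0j] * n_ant
    for theta_min, theta_max, power in clusters:
        d_theta = (theta_max - theta_min) / n_points
        spectrum = power / (theta_max - theta_min)
        # Gains at different angles are uncorrelated; each angular bin
        # contributes a complex Gaussian gain with variance S(theta) * d_theta.
        sigma = math.sqrt(spectrum * d_theta / 2.0)
        for q in range(n_points):
            theta = theta_min + (q + 0.5) * d_theta
            gain = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            a = array_response(theta, n_ant, d_over_lambda)
            for k in range(n_ant):
                h[k] += gain * a[k]
    return h


# One cluster spanning 24.3 to 35.7 degrees, 32 antennas, half-wavelength spacing.
h_ku = cluster_channel([(math.radians(24.3), math.radians(35.7), 1.0)], n_ant=32)
```

With this scaling, the average power of each antenna element equals the cluster power, which is a convenient normalization for small experiments.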

Similarly, let \( \left[{\theta}_{k_d,i}^{\mathrm{min}},{\theta}_{k_d,i}^{\mathrm{max}}\right] \) be the direction of departure (DOD) region of the signals resulting from the ith scattering cluster, and let \( {r}_{k_d,i}\left(\theta \right) \) denote the associated complex-valued response gain. The channel vector from the BS to downlink UE \( {k}_d \) can then be written as

$$ {\mathbf{h}}_{k_d}=\sum \limits_{i=1}^{M_d}{\int}_{\theta_{k_d,i}^{\mathrm{min}}}^{\theta_{k_d,i}^{\mathrm{max}}}\mathbf{a}\left(\theta \right){r}_{k_d,i}\left(\theta \right) d\theta $$
(2)

The SI signal can be viewed as the sum of the contributions of signals from \( {M}_{SI} \) scattering clusters with different DOA and DOD regions. Thus, the SI channel matrix \( {\mathbf{H}}_{SI} \) can be expressed as

$$ {\mathbf{H}}_{SI}=\sum \limits_{i=1}^{M_{SI}}{\int}_{\theta_{R,i}^{\mathrm{min}}}^{\theta_{R,i}^{\mathrm{max}}}{\int}_{\theta_{T,i}^{\mathrm{min}}}^{\theta_{T,i}^{\mathrm{max}}}{r}_{SI,i}\left({\theta}_R,{\theta}_T\right)\mathbf{a}\left({\theta}_R\right){\mathbf{a}}^H\left({\theta}_T\right)d{\theta}_Rd{\theta}_T $$
(3)

where \( {r}_{SI,i}\left({\theta}_R,{\theta}_T\right) \) denotes the complex-valued response gain. In real systems, the BS is commonly elevated at a relatively high altitude, e.g., on top of a high building or a dedicated tower, so that there are few surrounding scatterers [24]. Moreover, we assume that the passive SIC scheme for infrastructure nodes in [25] has been used, so that the direct path between the transmit and receive antenna arrays of the BS is virtually cancelled. Therefore, in this chapter, we assume that the number of scattering clusters for the SI channel is small.

In (1), (2), and (3), the complex-valued response gains with different incidence angles are uncorrelated [22], that is,

$$ \mathbb{E}\left[{r}_{k_u,i}\left(\theta \right){r}_{k_u,i}^{\ast}\left({\theta}^{\prime}\right)\right]={S}_{k_u,i}\left(\theta \right)\delta \left(\theta -{\theta}^{\prime}\right) $$
$$ \mathbb{E}\left[{r}_{k_d,i}\left(\theta \right){r}_{k_d,i}^{\ast}\left({\theta}^{\prime}\right)\right]={S}_{k_d,i}\left(\theta \right)\delta \left(\theta -{\theta}^{\prime}\right) $$
$$ \mathbb{E}\left[{r}_{SI,i}\left({\theta}_R,{\theta}_T\right){r}_{SI,i}^{\ast}\left({\theta}_R^{\prime},{\theta}_T^{\prime}\right)\right]={S}_{SI,i}\left({\theta}_R,{\theta}_T\right)\delta \left({\theta}_R-{\theta}_R^{\prime}\right)\delta \left({\theta}_T-{\theta}_T^{\prime}\right) $$
(4)

where \( {S}_{\omega, i}\left(\cdot \right) \), \( \omega \in \left\{{k}_u,{k}_d, SI\right\} \), represents the product of the large-scale fading and the channel power angle spectrum. Note that the considered model can be easily transformed into several well-known massive MIMO channel models. For example, by setting \( {M}_u={M}_d=1 \), we obtain the “one-ring” model studied in [8]. The “one-ring” model is typically used in the macro-cell environment, where the uplink/downlink received signals result from the scattering process in the vicinity of the UEs [21]. Moreover, by setting

$$ {r}_{k_u,i}\left(\theta \right)=\sum \limits_j{r}_{k_u,i,j}\delta \left(\theta -{\theta}_j\right) $$
$$ {r}_{k_d,i}\left(\theta \right)=\sum \limits_j{r}_{k_d,i,j}\delta \left(\theta -{\theta}_j\right) $$
$$ {r}_{SI,i}\left({\theta}_R,{\theta}_T\right)=\sum \limits_{j,l}{r}_{SI,i,j,l}\delta \left({\theta}_R-{\theta}_{R,j}\right)\delta \left({\theta}_T-{\theta}_{T,l}\right) $$
(5)

we arrive at the ray-cluster-based spatial channel model which is usually used for millimeter wave MIMO systems [26]. Therefore, the results in this chapter can be readily applied in these scenarios.

3 Beam-Domain Full-Duplex Transmission Scheme

In this section, we propose a BDFD scheme to realize CCUD transmission in the cellular system. Using the basis expansion model, we first derive the beam-domain channel representation, which is the projection of the channel vector (matrix) onto a common basis. The benefit of the beam-domain representation is that the channel becomes compressible in the beam domain under a suitable basis. Using this property, the channel dimension that must be estimated can be greatly reduced. Moreover, by exploiting the structure of the SI channel in the beam domain, it is possible to eliminate the SI without using instantaneous SI channel knowledge and hence realize efficient CCUD transmission.

3.1 Beam-Domain Channel Representation

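Under the basis expansion model [23], the uplink channel vector can be expanded over a set of uniform basis vectors \( \left\{{\mathbf{f}}_1,{\mathbf{f}}_2,\cdots, {\mathbf{f}}_N\right\} \) with \( {\mathbf{f}}_m\in {\mathbb{C}}^{N\times 1} \), that is,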

$$ {\mathbf{h}}_{k_u}=\sum \limits_{m=1}^N{\tilde{h}}_{k_u,m}{\mathbf{f}}_m=\mathbf{F}{\tilde{\mathbf{h}}}_{k_u} $$
(6)

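where \( \mathbf{F}=\left[{\mathbf{f}}_1,{\mathbf{f}}_2,\cdots, {\mathbf{f}}_N\right] \). Following [9], the basis vector \( {\mathbf{f}}_i \) is also called a beam, and \( {\tilde{\mathbf{h}}}_{k_u}={\left[{\tilde{h}}_{k_u,1},{\tilde{h}}_{k_u,2},\cdots, {\tilde{h}}_{k_u,N}\right]}^T \) is called the beam-domain channel. According to (1) and (6), and provided that the beams are orthonormal so that \( {\mathbf{F}}^H\mathbf{F}=\mathbf{I} \), we have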

$$ {\tilde{\mathbf{h}}}_{k_u}={\mathbf{F}}^H{\mathbf{h}}_{k_u}=\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\mathbf{F}}^H\mathbf{a}\left(\theta \right){r}_{k_u,i}\left(\theta \right) d\theta $$
(7)
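The relation between (6) and (7) can be checked numerically. The sketch below assumes the DFT-type beams introduced later in Lemma 1, with beam angles \( {\theta}_n=\frac{\lambda }{d}\left(\frac{n}{N}-\frac{1}{2}\right) \) as used in the caption of Fig. 3. The helper names and the test vector are illustrative only.

```python
import cmath
import math


def beam_basis(n_ant, d_over_lambda=0.5):
    """Beams f_1..f_N as lists, with theta_n = (lambda/d) * (n/N - 1/2)."""
    basis = []
    for n in range(1, n_ant + 1):
        theta_n = (n / n_ant - 0.5) / d_over_lambda
        beam = [cmath.exp(2j * math.pi * d_over_lambda * k * theta_n) / math.sqrt(n_ant)
                for k in range(n_ant)]
        basis.append(beam)
    return basis


def to_beam_domain(h, basis):
    """h_tilde = F^H h: inner product of every beam with the channel vector."""
    return [sum(f_k.conjugate() * h_k for f_k, h_k in zip(f, h)) for f in basis]


def from_beam_domain(h_tilde, basis):
    """h = F h_tilde: weighted sum of the beams."""
    n_ant = len(basis[0])
    return [sum(c * f[k] for c, f in zip(h_tilde, basis)) for k in range(n_ant)]


basis = beam_basis(16)
h = [complex(k + 1, -k) for k in range(16)]  # arbitrary test vector
h_rec = from_beam_domain(to_beam_domain(h, basis), basis)
max_err = max(abs(x - y) for x, y in zip(h, h_rec))  # close to machine precision
```

The round trip recovers the original vector because these beams are orthonormal. For a basis without this property, (7) would require the inverse of \( \mathbf{F} \) rather than its conjugate transpose.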

To investigate the compressibility of beam-domain channel, we propose the following lemma.

Lemma 1

Consider the basis \( \mathbf{F}=\left[{\mathbf{f}}_1,{\mathbf{f}}_2,\cdots, {\mathbf{f}}_N\right] \) with \( {\mathbf{f}}_n=\frac{1}{\sqrt{N}}{\left[1,\exp \left(j2\pi d{\theta}_n/\lambda \right),\cdots, \exp \left(j2\pi d\left(N-1\right){\theta}_n/\lambda \right)\right]}^T \), where \( {\theta}_n \) is defined as the beam angle of the nth beam \( {\mathbf{f}}_n \) and is selected so that different beams are orthogonal. As the number of BS antennas \( N \) tends to infinity, the average beam-domain channel gain of uplink UE \( {k}_u \) associated with the nth beam \( {\mathbf{f}}_n \), i.e., \( \mathbb{E}\left[{\left|{\tilde{h}}_{k_u,n}\right|}^2\right] \), has a non-negligible value only when \( {\theta}_n\in {\cup}_{i=1}^{M_u}\left[\sin {\theta}_{k_u,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \right] \), where \( \varepsilon \ge 0 \) and \( {\lim}_{N\to \infty}\varepsilon =0 \).

Proof

Using (7) and [28, Eq. (5)], \( \mathbb{E}\left[{\left|{\tilde{h}}_{k_u,n}\right|}^2\right] \) can be written as

$$ {\displaystyle \begin{array}{c}\mathbb{E}\left[{\left|{\tilde{h}}_{k_u,n}\right|}^2\right]=\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\left|{\mathbf{f}}_n^H\mathbf{a}\left(\theta \right)\right|}^2{S}_{k_u,i}\left(\theta \right) d\theta \\ {}=\sum \limits_{i=1}^{M_u}\underset{Y_i}{\underbrace{N\int_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\mathrm{asinc}}_N^2\left(\frac{d}{\lambda }{\theta}_n-\frac{d}{\lambda}\sin \theta \right){S}_{k_u,i}\left(\theta \right) d\theta}}\end{array}} $$
(8)

where \( {\mathrm{asinc}}_N(x) \) is the aliased sinc function, defined as \( {\mathrm{asinc}}_N(x)=\sin \left( N\pi x\right)/\left(N\sin \left(\pi x\right)\right) \). The envelope of the squared aliased sinc function is shown in Fig. 2. Assuming \( {\theta}_n=\sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \) with \( \varepsilon >0 \) and \( i\in \left\{1,2,\cdots, {M}_u\right\} \), the ith term in the summation of (8), denoted by \( {Y}_i \), can be upper bounded as

$$ {\displaystyle \begin{array}{c}{Y}_i\le N\kern0em \underset{\theta \in \left[{\theta}_{k_u,i}^{\mathrm{min}},{\theta}_{k_u,i}^{\mathrm{max}}\right]}{\max}\kern0em {S}_{k_u,i}\left(\theta \right){\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\mathrm{asinc}}_N^2\left(\frac{d}{\lambda }{\theta}_n-\frac{d}{\lambda}\sin \theta \right) d\theta \\ {}\le N\kern0em \underset{\theta \in \left[{\theta}_{k_u,i}^{\mathrm{min}},{\theta}_{k_u,i}^{\mathrm{max}}\right]}{\max}\kern0em {S}_{k_u,i}\left(\theta \right)\kern0em \sum \limits_{l={L}_i^{\mathrm{min}}}^{L_i^{\mathrm{max}}}\kern0em {\int}_{\vartheta_{l+1}}^{\vartheta_l}\kern0em {\mathrm{asinc}}_N^2\left(\frac{d}{\lambda }{\theta}_n-\frac{d}{\lambda}\sin \theta \right) d\theta \\ {}\le N\left({L}_i^{\mathrm{max}}-{L}_i^{\mathrm{min}}\right)\underset{\theta \in \left[{\theta}_{k_u,i}^{\mathrm{min}},{\theta}_{k_u,i}^{\mathrm{max}}\right]}{\max }{S}_{k_u,i}\left(\theta \right){\int}_{\vartheta_{L_i^{\mathrm{min}}+1}}^{\vartheta_{L_i^{\mathrm{min}}}}{\mathrm{asinc}}_N^2\left(\frac{d}{\lambda }{\theta}_n-\frac{d}{\lambda}\sin \theta \right) d\theta \\ {}\le \frac{\left({L}_i^{\mathrm{max}}-{L}_i^{\mathrm{min}}\right)\kern0em \left({\vartheta}_{L_i^{\mathrm{min}}}-{\vartheta}_{L_i^{\mathrm{min}}+1}\right)}{N{\left(\pi {L}_i^{\mathrm{min}}\right)}^2}\kern0em \underset{\theta \in \left[{\theta}_{k_u,i}^{\mathrm{min}},{\theta}_{k_u,i}^{\mathrm{max}}\right]}{\max }{S}_{k_u,i}\left(\theta \right)\end{array}} $$
(9)

where \( {L}_i^{\mathrm{min}}=\left\lceil Nd\left({\theta}_n-\sin {\theta}_{k_u,i}^{\mathrm{max}}\right)/\lambda \right\rceil \) and \( {L}_i^{\mathrm{max}}=\left\lfloor Nd\left({\theta}_n-\sin {\theta}_{k_u,i}^{\mathrm{min}}\right)/\lambda \right\rfloor \). \( {\vartheta}_l=\arcsin \left({\theta}_n-\lambda l/\left( dN\right)\right) \) is the lth zero of the function \( {\mathrm{asinc}}_N^2\left(d\left({\theta}_n-\sin \theta \right)/\lambda \right) \). The lth term in the summation of the second step is the integral over the lth side lobe of the squared aliased sinc function, as shown in Fig. 2. The third step is based on the fact that the power of the side lobes of the aliased sinc function decreases with the side-lobe index. The last step is obtained by using the property that the aliased sinc function converges to the standard sinc function as N → ∞ [29, Ch. 3] and the fact that \( \left|\sin x\right|\le 1 \).

Fig. 2 Envelope of the squared aliased sinc function, where N = 100

Note that \( {L}_i^{\mathrm{max}}-{L}_i^{\mathrm{min}} \) scales as \( \mathcal{O}(N) \). Moreover, as N → ∞, \( {\vartheta}_{L_i^{\mathrm{min}}}-{\vartheta}_{L_i^{\mathrm{min}}+1} \) can be replaced by the differential of \( \arcsin x \) at the point \( x={\theta}_n-\lambda {L}_i^{\mathrm{min}}/\left( dN\right) \), i.e.,

$$ {\displaystyle \begin{array}{c}{\vartheta}_{L_i^{\mathrm{min}}}-{\vartheta}_{L_i^{\mathrm{min}}+1}=\frac{\lambda }{dN}{\left.\frac{\mathrm{d}\left(\arcsin x\right)}{\mathrm{d}x}\right|}_{x={\theta}_n-\frac{\lambda {L}_i^{\mathrm{min}}}{dN}}\\ {}=\frac{\lambda }{dN}{\left(1-{\left({\theta}_n-\frac{\lambda {L}_i^{\mathrm{min}}}{dN}\right)}^2\right)}^{-1/2}=\mathcal{O}\left(\frac{1}{N}\right)\end{array}} $$
(10)

Recall that we have assumed \( {\theta}_n=\sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \). Thus, we have \( {L}_i^{\mathrm{min}}=\left\lfloor dN\varepsilon /\lambda \right\rfloor \). Based on the above results, the numerator of the upper bound of \( {Y}_i \) given by the last step of (9) scales as \( \mathcal{O}(1) \), whereas its denominator scales as \( {N}^3{\varepsilon}^2 \). The bound therefore converges to zero as N → ∞ as long as \( \varepsilon \ge \mathcal{O}\left({N}^u\right) \) with u >  − 3/2. In particular, choosing \( \varepsilon =\mathcal{O}\left({N}^{-1}\right) \) satisfies this condition, and ε approaches 0 as N → ∞. In the same way, we can obtain a similar result if \( {\theta}_n=\sin {\theta}_{k_u,i}^{\mathrm{min}}-\varepsilon \). Therefore, as N → ∞, \( {Y}_i \) (∀i ∈ {1, 2, ⋯, \( {M}_u \)}) has a non-negligible value only if \( {\theta}_n\in \left[\sin {\theta}_{k_u,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \right] \), where \( {\lim}_{N\to \infty}\varepsilon =0 \). This completes the proof.

Note that the upper bound of Y i in (9) is generally not tight. However, this does not impact the analysis because \( {Y}_i\overset{N\to \infty }{\to }0 \) as long as its upper bound converges to 0.
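The second equality in (8) rests on the identity \( {\left|{\mathbf{f}}_n^H\mathbf{a}\left(\theta \right)\right|}^2=N\,{\mathrm{asinc}}_N^2\left(\frac{d}{\lambda }{\theta}_n-\frac{d}{\lambda}\sin \theta \right) \). The following sketch compares both sides numerically for one arbitrarily chosen angle pair. The function names and the test values are illustrative assumptions.

```python
import cmath
import math


def asinc_sq(x, n):
    """Squared aliased sinc (sin(n*pi*x) / (n*sin(pi*x)))**2, with limit 1 at integer x."""
    den = n * math.sin(math.pi * x)
    if abs(den) < 1e-12:
        return 1.0
    return (math.sin(n * math.pi * x) / den) ** 2


def beam_gain(theta, theta_n, n, d_over_lambda=0.5):
    """|f_n^H a(theta)|^2 evaluated directly from the definitions of f_n and a(theta)."""
    x = d_over_lambda * (math.sin(theta) - theta_n)
    s = sum(cmath.exp(2j * math.pi * k * x) for k in range(n)) / math.sqrt(n)
    return abs(s) ** 2


n = 64
theta, theta_n = math.radians(30.0), 0.62
direct = beam_gain(theta, theta_n, n)
closed = n * asinc_sq(0.5 * theta_n - 0.5 * math.sin(theta), n)
# direct and closed agree up to rounding; the gain peaks at N when theta_n = sin(theta).
peak = beam_gain(theta, math.sin(theta), n)
```

The peak value of \( N \) at \( {\theta}_n=\sin \theta \) and the rapid decay away from it are what make the beam-domain gains concentrate around the virtual DOA region.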

If we consider sinθ as the virtual DOA of the uplink signal, in Lemma 1 we actually select the beam angle to mimic virtual DOA. That is why only the beam-domain channel elements with beam angles within \( {\cup}_{i=1}^{M_u}\left[\sin {\theta}_{k_u,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \right] \) have non-negligible gains. From Lemma 1, the beam-domain channel vector exhibits the desired compressibility in the large N regime when the considered basis is used. Therefore, the basis in Lemma 1 will be employed in the following.

Example 1

As a concrete example of the compressibility, we consider a scenario with N = 128, \( {M}_u=1 \), and \( \left[{\theta}_{k_u,1}^{\mathrm{min}},{\theta}_{k_u,1}^{\mathrm{max}}\right]=\left[{24.3}^{{}^{\circ}},{35.7}^{{}^{\circ}}\right] \). This corresponds to the “one-ring” model with a scattering radius of about 30 m and a BS-to-UE distance of 300 m [21]. The normalized average beam-domain channel gain, which is defined as \( \mathbb{E}\left[{\left|{\tilde{h}}_{k_u,n}\right|}^2\right]/\underset{n^{\prime }=1,2,\cdots, N}{\max}\mathbb{E}\left[{\left|{\tilde{h}}_{k_u,{n}^{\prime }}\right|}^2\right] \), is plotted in Fig. 3. The beam-domain channel elements whose beam angles are within \( \left[\sin {\theta}_{k_u,1}^{\mathrm{min}},\sin {\theta}_{k_u,1}^{\mathrm{max}}\right] \) are marked in red. From the figure, we can see that the gains of these channel elements are much higher than those of the remaining elements, which matches well with the result in Lemma 1. In fact, according to the simulation results, about 96.6% of the channel power is captured by these channel elements (less than 10% of all elements) when N = 128. This value becomes 96.9% and 98.1% if we increase N to 256 and 512, respectively.

Fig. 3 Normalized average beam-domain channel gain as a function of the index of the associated beam, where \( {\theta}_n=\frac{\lambda }{d}\left(\frac{n}{N}-\frac{1}{2}\right) \) and d = λ/2
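The statement in Example 1 that the marked beams make up less than 10% of all elements can be reproduced with a short calculation. The sketch below uses the beam angles from the caption of Fig. 3 and counts the beams whose angles fall inside \( \left[\sin {\theta}_{k_u,1}^{\mathrm{min}},\sin {\theta}_{k_u,1}^{\mathrm{max}}\right] \), i.e., with ε = 0. The function name is an illustrative assumption.

```python
import math


def beams_in_region(n_ant, theta_min_deg, theta_max_deg, d_over_lambda=0.5):
    """1-based indices n whose beam angle lies in [sin(theta_min), sin(theta_max)]."""
    lo = math.sin(math.radians(theta_min_deg))
    hi = math.sin(math.radians(theta_max_deg))
    selected = []
    for n in range(1, n_ant + 1):
        theta_n = (n / n_ant - 0.5) / d_over_lambda  # (lambda/d) * (n/N - 1/2)
        if lo <= theta_n <= hi:
            selected.append(n)
    return selected


active = beams_in_region(128, 24.3, 35.7)
fraction = len(active) / 128
# active covers beams 91 to 101, i.e. 11 of 128 beams (about 8.6%).
```

The captured power fractions quoted in Example 1 additionally depend on the power angle spectrum used in the simulation, which this count does not model.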

Based on Lemma 1, we can approximate the channel vector from uplink UE k u to BS as

$$ {\mathbf{h}}_{k_u}\approx \sum \limits_{m\in {B}_{k_u}}{\tilde{h}}_{k_u,m}{\mathbf{f}}_m={\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}{\tilde{\mathbf{h}}}_{k_u}^{\left\{{B}_{k_u},:\right\}} $$
(11)

where \( {B}_{k_u} \) is called the active beam set, which contains the indices of the beams with non-negligible beam-domain channel gains. \( {\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}} \) is called the active beam space, whose columns consist of the beams in \( {B}_{k_u} \). The reduced-dimension beam-domain channel vector \( {\tilde{\mathbf{h}}}_{k_u}^{\left\{{B}_{k_u},:\right\}}\in {\mathbb{C}}^{\mid {B}_{k_u}\mid \times 1} \) is called the effective beam-domain channel. Note that (11) holds with equality as N → ∞ according to Lemma 1. Based on (1) and (11), the effective beam-domain channel vector can be expressed as

$$ {\tilde{\mathbf{h}}}_{k_u}^{\left\{{B}_{k_u},:\right\}}={\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H{\mathbf{h}}_{k_u}=\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H\mathbf{a}\left(\theta \right){r}_{k_u,i}\left(\theta \right) d\theta $$
(12)

From (11), the original channel vector can be recovered from the effective beam-domain channel vector if the DOA (and hence the active beam set) information is known. As a result, in order to obtain \( {\mathbf{h}}_{k_u} \), it is enough to estimate the \( \left|{B}_{k_u}\right| \)-dimension effective beam-domain channel during the training phase. This can potentially result in significant saving of the training resource.

In the practical scenario with arbitrary finite number of BS antennas, the active beam set \( {B}_{k_u} \) can be obtained by solving a cardinality minimization problem, with constraint that most of the channel power is captured by the effective beam-domain channel vector, i.e.,

$$ {\displaystyle \begin{array}{l}\underset{B_{k_u}}{\min}\left|{B}_{k_u}\right|\\ {}\mathrm{s}.\mathrm{t}.\left\{\begin{array}{l}\frac{\mathbb{E}\left[{\left\Vert {\tilde{\mathbf{h}}}_{k_u}^{\left\{{B}_{k_u},:\right\}}\right\Vert}^2\right]}{\mathbb{E}\left[{\left\Vert {\tilde{\mathbf{h}}}_{k_u}\right\Vert}^2\right]}=\frac{\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\left\Vert {\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H\mathbf{a}\left(\theta \right)\right\Vert}^2{S}_{k_u,i}\left(\theta \right) d\theta}{\sum \limits_{i=1}^{M_u}{\int}_{\theta_{k_u,i}^{\mathrm{min}}}^{\theta_{k_u,i}^{\mathrm{max}}}{\left\Vert {\mathbf{F}}^H\mathbf{a}\left(\theta \right)\right\Vert}^2{S}_{k_u,i}\left(\theta \right) d\theta}\ge \eta \\ {}\eta <1\end{array}\right.\end{array}} $$
(13)

where η < 1 denotes the threshold, which should be chosen close to 1 in a real implementation. The problem is combinatorial and difficult to solve in closed form. However, since (13) depends only on the DOA information, which varies slowly over time, we can build an off-line table of \( {B}_{k_u} \) for different DOA regions to reduce the computational load. If N is large enough, to further reduce the complexity, we can simply select the beams whose angles satisfy \( {\theta}_m\in {\cup}_{i=1}^{M_u}\left[\sin {\theta}_{k_u,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{k_u,i}^{\mathrm{max}}+\varepsilon \right] \) to constitute \( {B}_{k_u} \). According to Lemma 1 (and as demonstrated in Example 1), the beam-domain channel elements associated with these beams contain almost all the channel power in the large N regime.
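One simple way to populate such an off-line table can be sketched as follows, under the assumption that the average per-beam gains \( \mathbb{E}\left[{\left|{\tilde{h}}_{k_u,n}\right|}^2\right] \) have already been computed, for instance by numerical integration of (8). Because the captured power in the constraint of (13) is a sum of per-beam terms, adding beams in decreasing order of gain until the threshold is met yields a smallest set that satisfies the constraint. The function name and the toy gain values are hypothetical, and this is not necessarily the procedure used by the authors.

```python
def select_active_beams(avg_gains, eta):
    """Smallest beam set whose summed average gain reaches eta of the total.

    avg_gains[n] is the average gain of beam n (0-based here); 0 < eta < 1.
    """
    total = sum(avg_gains)
    # Visit beams from strongest to weakest.
    order = sorted(range(len(avg_gains)), key=lambda n: avg_gains[n], reverse=True)
    chosen, captured = [], 0.0
    for n in order:
        if captured >= eta * total:
            break
        chosen.append(n)
        captured += avg_gains[n]
    return sorted(chosen)


gains = [0.01, 0.02, 0.40, 0.35, 0.20, 0.02]  # toy values, not from the chapter
active_set = select_active_beams(gains, eta=0.9)
# active_set is [2, 3, 4]: three beams capture 95% of the total gain.
```

In this toy case the selected beams are contiguous, which is the behavior Lemma 1 predicts for a single scattering cluster.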

Similarly, with the basis expansion model, the channel vector from the BS to downlink UE \( {k}_d \) and the associated effective beam-domain channel vector can be expressed as

$$ {\displaystyle \begin{array}{c}{\mathbf{h}}_{k_d}\approx {\mathbf{F}}^{\left\{:,{B}_{k_d}\right\}}{\tilde{\mathbf{h}}}_{k_d}^{\left\{{B}_{k_d},:\right\}}\\ {}{\tilde{\mathbf{h}}}_{k_d}^{\left\{{B}_{k_d},:\right\}}={\left({\mathbf{F}}^{\left\{:,{B}_{k_d}\right\}}\right)}^H{\mathbf{h}}_{k_d}=\sum \limits_{i=1}^{M_d}{\int}_{\theta_{k_d,i}^{\mathrm{min}}}^{\theta_{k_d,i}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_d}\right\}}\right)}^H\mathbf{a}\left(\theta \right){r}_{k_d,i}\left(\theta \right) d\theta \end{array}} $$
(14)

where the active beam set \( {B}_{k_d} \) can be designed by solving a problem similar to (13).

To exploit the compressibility of the SI channel, the basis expansion is performed for the column and row spaces of \( {\mathbf{H}}_{SI} \) simultaneously, which results in \( {\mathbf{H}}_{SI}=\mathbf{F}{\tilde{\mathbf{H}}}_{SI}{\mathbf{F}}^H \). As a generalization of Lemma 1, we have the following lemma.

Lemma 2

The average beam-domain SI channel gain \( \mathbb{E}\left[{\left|{\left[{\tilde{\mathbf{H}}}_{SI}\right]}_{m,n}\right|}^2\right]=\mathbb{E}\left[{\left|{\mathbf{f}}_m^H{\mathbf{H}}_{SI}{\mathbf{f}}_n\right|}^2\right] \) has a non-negligible value only when the beam angle of \( {\mathbf{f}}_m \) lies in \( {\cup}_{i=1}^{M_{SI}}\left[\sin {\theta}_{R,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{R,i}^{\mathrm{max}}+\varepsilon \right] \) and, at the same time, the beam angle of \( {\mathbf{f}}_n \) lies in \( {\cup}_{i=1}^{M_{SI}}\left[\sin {\theta}_{T,i}^{\mathrm{min}}-\varepsilon, \sin {\theta}_{T,i}^{\mathrm{max}}+\varepsilon \right] \), where ε approaches zero in the large N regime.

Lemma 2 can be proved by following a procedure similar to that in the proof of Lemma 1. Therefore, the detailed proof is omitted due to space limitations. From Lemma 2, the SI channel matrix can be approximated as \( {\mathbf{H}}_{SI}\approx {\mathbf{F}}^{\left\{:,{B}_{SI,R}\right\}}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{SI,R},{B}_{SI,T}\right\}}{\left({\mathbf{F}}^{\left\{:,{B}_{SI,T}\right\}}\right)}^H \), where the effective beam-domain SI channel matrix \( {\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{SI,R},{B}_{SI,T}\right\}}\in {\mathbb{C}}^{\mid {B}_{SI,R}\mid \times \mid {B}_{SI,T}\mid } \) can be expressed as

$$ {\displaystyle \begin{array}{c}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{SI,R},{B}_{SI,T}\right\}}={\left({\mathbf{F}}^{\left\{:,{B}_{SI,R}\right\}}\right)}^H{\mathbf{H}}_{SI}{\mathbf{F}}^{\left\{:,{B}_{SI,T}\right\}}\\ {}=\sum \limits_{i=1}^{M_{SI}}{\int}_{\theta_{R,i}^{\mathrm{min}}}^{\theta_{R,i}^{\mathrm{max}}}{\int}_{\theta_{T,i}^{\mathrm{min}}}^{\theta_{T,i}^{\mathrm{max}}}{r}_{SI,i}\left({\theta}_R,{\theta}_T\right){\left({\mathbf{F}}^{\left\{:,{B}_{SI,R}\right\}}\right)}^H\mathbf{a}\left({\theta}_R\right){\mathbf{a}}^H\left({\theta}_T\right){\mathbf{F}}^{\left\{:,{B}_{SI,T}\right\}}d{\theta}_Rd{\theta}_T\end{array}} $$
(15)

The active beam sets \( {B}_{SI,R} \) and \( {B}_{SI,T} \) can be determined by solving problem (16), where \( {\tilde{\mathbf{H}}}_{SI} \) is defined as \( {\tilde{\mathbf{H}}}_{SI}={\mathbf{F}}^H{\mathbf{H}}_{SI}\mathbf{F} \) and η < 1 denotes the threshold.

$$ {\displaystyle \begin{array}{l}\underset{B_{SI,R},{B}_{SI,T}}{\min}\max \left\{\left|{B}_{SI,R}\right|,\left|{B}_{SI,T}\right|\right\}\\ {}\mathrm{s}.\mathrm{t}.\kern0.5em \frac{\mathbb{E}\left[{\left\Vert {\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{SI,R},{B}_{SI,T}\right\}}\right\Vert}^2\right]}{\mathbb{E}\left[{\left\Vert {\tilde{\mathbf{H}}}_{SI}\right\Vert}^2\right]}\\ {}\kern2em =\frac{\sum \limits_{i=1}^{M_{SI}}{\int}_{\theta_{R,i}^{\mathrm{min}}}^{\theta_{R,i}^{\mathrm{max}}}{\int}_{\theta_{T,i}^{\mathrm{min}}}^{\theta_{T,i}^{\mathrm{max}}}{S}_{SI,i}\left({\theta}_R,{\theta}_T\right){\left\Vert {\left({\mathbf{F}}^{\left\{:,{B}_{SI,R}\right\}}\right)}^H\mathbf{a}\left({\theta}_R\right){\mathbf{a}}^H\left({\theta}_T\right){\mathbf{F}}^{\left\{:,{B}_{SI,T}\right\}}\right\Vert}^2d{\theta}_Rd{\theta}_T}{\sum \limits_{i=1}^{M_{SI}}{\int}_{\theta_{R,i}^{\mathrm{min}}}^{\theta_{R,i}^{\mathrm{max}}}{\int}_{\theta_{T,i}^{\mathrm{min}}}^{\theta_{T,i}^{\mathrm{max}}}{S}_{SI,i}\left({\theta}_R,{\theta}_T\right){\left\Vert {\mathbf{F}}^H\mathbf{a}\left({\theta}_R\right){\mathbf{a}}^H\left({\theta}_T\right)\mathbf{F}\right\Vert}^2d{\theta}_Rd{\theta}_T}\ge \eta \end{array}} $$
(16)

3.2 Beam-Domain Full-Duplex Transmission

The key idea of the BDFD scheme lies in partitioning the UEs according to their active beam sets to realize efficient CCUD transmission. In particular, we divide the UEs into groups according to the following two criteria (UE grouping criteria):

  1. 1.

    Criterion 1: The uplink/downlink UEs with the same active beam set are collected in the same group. The active beam sets of different uplink/downlink groups are non-overlapping. Mathematically, letting \( {B}_{g_u} \) and \( {B}_{g_u^{\prime }} \) be the active beam sets of two arbitrary uplink groups g u and \( {g}_u^{\prime } \), and letting \( {B}_{g_d} \) and \( {B}_{g_d^{\prime }} \) be the active beam sets of two arbitrary downlink groups g d and \( {g}_d^{\prime } \), we have \( {B}_{g_u}\cap {B}_{g_u^{\prime }}=\varnothing \) and \( {B}_{g_d}\cap {B}_{g_d^{\prime }}=\varnothing \).

  2.

    Criterion 2: Let \( {G}_u \) and \( {G}_d \) be the sets of uplink UE groups and downlink UE groups, respectively. The active beam sets \( {B}_{g_u} \) and \( {B}_{g_d} \) satisfy \( \left({\cup}_{g_u\in {G}_u}{B}_{g_u}\right)\cap {B}_{SI,R}=\varnothing \) or \( \left({\cup}_{g_d\in {G}_d}{B}_{g_d}\right)\cap {B}_{SI,T}=\varnothing \).
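The two grouping criteria are statements about index sets, so they can be checked directly. The following is a minimal sketch, assuming each active beam set is represented as a Python set of integer beam indices; the function names and the example indices are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def satisfies_criterion_1(group_beams):
    """True if the active beam sets of all groups are pairwise disjoint."""
    return all(a.isdisjoint(b) for a, b in combinations(group_beams, 2))


def satisfies_criterion_2(uplink_beams, downlink_beams, si_rx, si_tx):
    """True if the uplink beams avoid B_SI,R or the downlink beams avoid B_SI,T."""
    uplink_union = set().union(*uplink_beams)
    downlink_union = set().union(*downlink_beams)
    return uplink_union.isdisjoint(si_rx) or downlink_union.isdisjoint(si_tx)


# Hypothetical beam indices, chosen only for illustration.
uplink = [{0, 1, 2}, {5, 6, 7}]
downlink = [{10, 11, 12}, {20, 21, 22}]
si_rx, si_tx = {10, 11}, {0, 1}

ok_1 = satisfies_criterion_1(uplink) and satisfies_criterion_1(downlink)
ok_2 = satisfies_criterion_2(uplink, downlink, si_rx, si_tx)
print(ok_1, ok_2)
```

Criterion 1 is applied separately to the uplink groups and to the downlink groups, since it places no restriction on overlap between an uplink set and a downlink set.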

For the sake of illustration, in the following we assume that each active beam set contains the same number of beams, i.e., \( \left|{B}_{g_u}\right|={b}_u \) and \( \left|{B}_{g_d}\right|={b}_d \). Define the index \( {g}_{u,k}=k+{\sum}_{g_u^{\prime }=1}^{g_u-1}{K}_{g_u^{\prime }} \) to denote the \( k \)th uplink UE of the uplink group \( {g}_u \), where \( {K}_{g_u} \) is the number of UEs in the group \( {g}_u \). Similarly, letting \( {K}_{g_d} \) be the number of UEs in the downlink group \( {g}_d \), we define the index \( {g}_{d,k}=k+{\sum}_{g_d^{\prime }=1}^{g_d-1}{K}_{g_d^{\prime }} \) to denote the \( k \)th downlink UE of the group \( {g}_d \). Define \( {\mathbf{H}}_{g_u}=\left[{\mathbf{h}}_{g_{u,1}},\cdots, {\mathbf{h}}_{g_{u,{K}_{g_u}}}\right] \) and \( {\mathbf{H}}_{g_d}=\left[{\mathbf{h}}_{g_{d,1}},\cdots, {\mathbf{h}}_{g_{d,{K}_{g_d}}}\right] \) as the channel matrix from the uplink group \( {g}_u \) to the BS and that from the BS to the downlink group \( {g}_d \), respectively, and define \( {\tilde{\mathbf{H}}}_{g_u} \) and \( {\tilde{\mathbf{H}}}_{g_d} \) as the corresponding beam-domain channel matrices.

During the data transmission phase, the uplink UEs transmit data to the BS while the BS simultaneously transmits data to the downlink UEs. Under the above UE grouping criteria, the received signals at the BS and at downlink group \( {g}_d \) can be expressed as

$$ {\mathbf{y}}_u=\sum \limits_{g_u^{\prime}\in {G}_u}{\mathbf{H}}_{g_u^{\prime }}{\mathbf{s}}_{g_u^{\prime }}+{\mathbf{H}}_{SI}\sum \limits_{g_d^{\prime}\in {G}_d}{\mathbf{x}}_{g_d^{\prime }}+{\mathbf{n}}_u $$
(17)
$$ {\mathbf{y}}_{g_d}={\mathbf{H}}_{g_d}^H\sum \limits_{g_d^{\prime}\in {G}_d}{\mathbf{x}}_{g_d^{\prime }}+\sum \limits_{g_u^{\prime}\in {G}_u}{\mathbf{H}}_{g_u^{\prime}\to {g}_d}{\mathbf{s}}_{g_u^{\prime }}+{\mathbf{n}}_{g_d} $$
(18)

where \( {\mathbf{n}}_u \) and \( {\mathbf{n}}_{g_d} \) denote the additive white Gaussian noise (AWGN) vectors with variance \( \sigma \). \( {\mathbf{H}}_{g_u^{\prime}\to {g}_d} \) denotes the interference channel from uplink group \( {g}_u^{\prime } \) to downlink group \( {g}_d \). Since each UE has a single antenna and the UEs are geographically distributed, the elements of \( {\mathbf{H}}_{g_u^{\prime}\to {g}_d} \) are assumed to be independent Gaussian random variables with zero mean. \( {\mathbf{s}}_{g_u}\in {\mathrm{\mathbb{C}}}^{K_{g_u}\times 1} \) denotes the transmit signal of uplink group \( {g}_u \). \( {\mathbf{x}}_{g_d}\in {\mathrm{\mathbb{C}}}^{N\times 1} \) denotes the precoded transmit signal of the BS. In the BDFD scheme, the UEs of the downlink group \( {g}_d \) detect signals only on their active beam space. Therefore, we let \( {\mathbf{x}}_{g_d}={\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{x}}}_{g_d} \), where \( {\tilde{\mathbf{x}}}_{g_d}\in {\mathrm{\mathbb{C}}}^{b_d\times 1} \) is referred to as the beam-domain precoded transmit signal.

By multiplying both sides of (17) by \( {\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H \) and substituting the definition \( {\mathbf{x}}_{g_d}={\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{x}}}_{g_d} \) into (18), we arrive at the beam-domain received signal at the BS from uplink UE group \( {g}_u \) and the beam-domain received signal at downlink group \( {g}_d \)

$$ {\tilde{\mathbf{y}}}_{g_u}={\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}{\mathbf{s}}_{g_u}+\sum \limits_{g_u^{\prime}\in {G}_u/\left\{{g}_u\right\}}{\tilde{\mathbf{H}}}_{g_u^{\prime}}^{\left\{{B}_{g_u},:\right\}}{\mathbf{s}}_{g_u^{\prime }}+\sum \limits_{g_d^{\prime}\in {G}_d}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}{\tilde{\mathbf{x}}}_{g_d^{\prime }}+{\tilde{\mathbf{n}}}_{g_u} $$
(19)
$$ {\tilde{\mathbf{y}}}_{g_d}={\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\tilde{\mathbf{x}}}_{g_d}+\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}\kern0em {\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d^{\prime }},:\right\}}\right)}^H{\tilde{\mathbf{x}}}_{g_d^{\prime }}+\sum \limits_{g_u^{\prime}\in {G}_u}{\mathbf{H}}_{g_u^{\prime}\to {g}_d}{\mathbf{s}}_{g_u^{\prime }}+{\mathbf{n}}_{g_d} $$
(20)

where \( {\tilde{\mathbf{n}}}_{g_u}={\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H{\mathbf{n}}_u \). Note that (20) is in fact the same as (18). To emphasize that (20) is the beam-domain received signal and to be consistent with (19), we introduce the new notation \( {\tilde{\mathbf{y}}}_{g_d} \). The second terms on the right-hand sides (RHSs) of (19) and (20) represent the inter-group interference (IGI). The third term on the RHS of (19) is the received SI. From (19), \( {b}_u \) should satisfy \( {b}_u\ge {K}_{g_u} \) in order to support \( {K}_{g_u} \) independent data streams. Similarly, we require \( {b}_d\ge {K}_{g_d} \) in the downlink according to (20). As will be shown in Sect. 4.4, with simple UE scheduling, the interference from uplink UEs to downlink UEs can be made negligible compared to the AWGN. Thus, we temporarily neglect this interference term in the analysis below. In this case, we establish the optimality of the BDFD scheme in the following theorem.

Theorem 1

Assuming the UE grouping criteria are satisfied and the effective beam-domain channel matrices for all uplink and downlink groups are perfectly known, the BDFD scheme achieves the uplink and downlink sum capacities (Footnote 5) simultaneously as the number of BS antennas approaches infinity.

Proof

Recalling the UE grouping Criterion 1 and Lemma 1, we can deduce that the IGI approaches zero in the large \( N \) regime. Since we require \( \left({\cup}_{g_u\in {G}_u}{B}_{g_u}\right)\cap {B}_{SI,R}=\varnothing \) or \( \left({\cup}_{g_d\in {G}_d}{B}_{g_d}\right)\cap {B}_{SI,T}=\varnothing \) in UE grouping Criterion 2, the elements of \( {\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}} \) converge to zero (Footnote 6) for very large \( N \) according to Lemma 2. Therefore, as \( N\to \infty \) the beam-domain received signal \( {\tilde{\mathbf{y}}}_{g_u} \) reduces to

$$ {\tilde{\mathbf{y}}}_{g_u}={\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}{\mathbf{s}}_{g_u}+{\tilde{\mathbf{n}}}_{g_u} $$
(21)

Let \( {R}_u^{\mathrm{sum}}\left({P}_u\right) \) be the uplink achievable sum rate with total power constraint \( {P}_u \), and let \( {\boldsymbol{\Lambda}}_{g_u}=\mathbb{E}\left[{\mathbf{s}}_{g_u}{\mathbf{s}}_{g_u}^H\right] \) be the diagonal input covariance matrix of the uplink group \( {g}_u \). Assuming that minimum mean square error detection with successive interference cancellation (MMSE-SIC) is employed to detect \( {\mathbf{s}}_{g_u} \) from (21), the uplink achievable sum rate can be expressed as (22) [30], where the second step is based on the property \( \left|\mathbf{I}+\mathbf{AB}\right|=\left|\mathbf{I}+\mathbf{BA}\right| \). The last step follows from Lemma 1, i.e., \( {\mathbf{H}}_{g_u}={\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}{\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}} \) is satisfied as \( N\to \infty \). Note that the last line of (22) is exactly the uplink sum capacity [30].

$$ \begin{aligned}{R}_u^{\mathrm{sum}}\left({P}_u\right)&=\underset{{\boldsymbol{\Lambda}}_{g_u}\succcurlyeq 0,\sum \limits_{g_u\in {G}_u}\mathrm{tr}\left({\boldsymbol{\Lambda}}_{g_u}\right)\le {P}_u}{\max}\sum \limits_{g_u\in {G}_u}{\log}_2\left|{\mathbf{I}}_{b_u}+\frac{1}{\sigma }{\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}{\boldsymbol{\Lambda}}_{g_u}{\left({\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}\right)}^H\right|\\ &=\underset{{\boldsymbol{\Lambda}}_{g_u}\succcurlyeq 0,\sum \limits_{g_u\in {G}_u}\mathrm{tr}\left({\boldsymbol{\Lambda}}_{g_u}\right)\le {P}_u}{\max}\sum \limits_{g_u\in {G}_u}{\log}_2\left|{\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}{\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}{\boldsymbol{\Lambda}}_{g_u}{\left({\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H\right|\\ &=\underset{{\boldsymbol{\Lambda}}_{g_u}\succcurlyeq 0,\sum \limits_{g_u\in {G}_u}\mathrm{tr}\left({\boldsymbol{\Lambda}}_{g_u}\right)\le {P}_u}{\max}\sum \limits_{g_u\in {G}_u}{\log}_2\left|{\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{H}}_{g_u}{\boldsymbol{\Lambda}}_{g_u}{\mathbf{H}}_{g_u}^H\right|\end{aligned} $$
(22)
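The second step of (22) can be spelled out as follows. This is a sketch under the assumption that the selected columns of \( \mathbf{F} \) are orthonormal, i.e., \( {\mathbf{F}}_B^H{\mathbf{F}}_B={\mathbf{I}}_{b_u} \); the shorthand \( {\mathbf{F}}_B={\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}} \), \( \tilde{\mathbf{H}}={\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}} \), and \( \boldsymbol{\Lambda}={\boldsymbol{\Lambda}}_{g_u} \) is introduced here only for brevity. Applying \( \left|\mathbf{I}+\mathbf{AB}\right|=\left|\mathbf{I}+\mathbf{BA}\right| \) twice gives

$$ \begin{aligned}\left|{\mathbf{I}}_{b_u}+\frac{1}{\sigma}\tilde{\mathbf{H}}\boldsymbol{\Lambda}{\tilde{\mathbf{H}}}^H\right|&=\left|{\mathbf{I}}_{K_{g_u}}+\frac{1}{\sigma}\boldsymbol{\Lambda}{\tilde{\mathbf{H}}}^H\tilde{\mathbf{H}}\right|\\ &=\left|{\mathbf{I}}_{K_{g_u}}+\frac{1}{\sigma}\boldsymbol{\Lambda}{\tilde{\mathbf{H}}}^H{\mathbf{F}}_B^H{\mathbf{F}}_B\tilde{\mathbf{H}}\right|\\ &=\left|{\mathbf{I}}_N+\frac{1}{\sigma}{\mathbf{F}}_B\tilde{\mathbf{H}}\boldsymbol{\Lambda}{\tilde{\mathbf{H}}}^H{\mathbf{F}}_B^H\right|\end{aligned} $$

so the determinant is unchanged when the \( {b}_u \)-dimensional beam-domain expression is lifted back to the \( N \)-dimensional antenna domain.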

Similarly, as N → ∞, according to Criterion 1 and Lemma 1, the downlink beam-domain received signal \( {\tilde{\mathbf{y}}}_{g_d} \) reduces to

$$ {\tilde{\mathbf{y}}}_{g_d}={\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\tilde{\mathbf{x}}}_{g_d}+{\mathbf{n}}_{g_d} $$
(23)

Let \( {R}_d^{\mathrm{sum}}\left({P}_d\right) \) be the downlink achievable sum rate with total power constraint \( {P}_d \). Assuming the beam-domain transmit signal \( {\tilde{\mathbf{x}}}_{g_d} \) is generated according to dirty paper coding and using the MAC-BC duality [30], we have

$$ {R}_d^{\mathrm{sum}}\left({P}_d\right)=\underset{{\boldsymbol{\Lambda}}_{g_d}\succcurlyeq 0,\sum \limits_{g_d\in {G}_d}\mathrm{tr}\left({\boldsymbol{\Lambda}}_{g_d}\right)\le {P}_d}{\max }{\log}_2\left|{\mathbf{I}}_N+\frac{1}{\sigma}\sum \limits_{g_d\in {G}_d}{\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}{\boldsymbol{\Lambda}}_{g_d}{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H\right| $$
(24)

where \( {\boldsymbol{\Lambda}}_{g_d} \) denotes the diagonal input covariance matrix of the dual MAC channel. To show the optimality of the BDFD scheme, we examine the following equality. Letting \( {G}_d^{\prime}\subset {G}_d \) and \( {G}_d^{\prime}\ne \varnothing \), for arbitrary \( {g}_d\in {G}_d^{\prime } \), we have

$$ \begin{aligned}&\left|{\mathbf{I}}_N+\frac{1}{\sigma}\sum \limits_{g_d\in {G}_d}{\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}{\Lambda}_{g_d}{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H\right|\\ &\quad=\left|{\mathbf{K}}_{g_d}\right|\left|{\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{K}}_{g_d}^{-1}\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}{\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}{\Lambda}_{g_d^{\prime }}{\left({\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}\right)}^H\right|\\ &\quad=\left|{\mathbf{K}}_{g_d}\right|\left|{\mathbf{I}}_N+\frac{1}{\sigma}\left({\mathbf{I}}_N-{\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}{\left({\Lambda}_{g_d}^{-1}+{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^{-1}{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H\right)\right.\\ &\qquad\times \left.\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}{\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}{\Lambda}_{g_d^{\prime }}{\left({\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}\right)}^H\right|\\ &\quad=\left|{\mathbf{K}}_{g_d}\right|\left|{\mathbf{I}}_N+\frac{1}{\sigma}\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}{\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}{\Lambda}_{g_d^{\prime }}{\left({\tilde{\mathbf{H}}}_{g_d^{\prime}}^{\left\{{B}_{g_d^{\prime }},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}\right)}^H\right|\end{aligned} $$
(25)

where \( {\mathbf{K}}_{g_d}={\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}{\Lambda}_{g_d}{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H \). The second step is obtained by applying the matrix inversion lemma to \( {\mathbf{K}}_{g_d}^{-1} \), and the third step is based on Criterion 1, since the columns of \( \mathbf{F} \) indexed by non-overlapping beam sets are mutually orthogonal. Using (25) repeatedly, we can rewrite the achievable downlink sum rate (24) as

$$ \begin{aligned}{R}_d^{\mathrm{sum}}\left({P}_d\right)&=\underset{{\Lambda}_{g_d}\succcurlyeq 0,\sum \limits_{g_d\in {G}_d}\mathrm{tr}\left({\Lambda}_{g_d}\right)\le {P}_d}{\max}\sum \limits_{g_d\in {G}_d}{\log}_2\left|{\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}{\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}{\Lambda}_{g_d}{\left({\tilde{\mathbf{H}}}_{g_d}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H\right|\\ &=\underset{{\Lambda}_{g_d}\succcurlyeq 0,\sum \limits_{g_d\in {G}_d}\mathrm{tr}\left({\Lambda}_{g_d}\right)\le {P}_d}{\max}\sum \limits_{g_d\in {G}_d}{\log}_2\left|{\mathbf{I}}_N+\frac{1}{\sigma }{\mathbf{H}}_{g_d}{\Lambda}_{g_d}{\mathbf{H}}_{g_d}^H\right|\end{aligned} $$
(26)

which is exactly the sum capacity of the dual MAC channel with total power constraint \( {P}_d \).

Theorem 1 reveals that the reduced-dimension effective beam-domain CSI alone is enough for the BDFD scheme to achieve the uplink and downlink capacities simultaneously in the large \( N \) regime. Therefore, the BDFD scheme reduces the difficulty of channel acquisition. Note that conventional TDD/FDD massive MIMO can only achieve the uplink or the downlink capacity on each time-frequency unit, even if full CSI is available. Meanwhile, the BDFD scheme avoids the deployment of active SIC (which is costly in hardware and energy) and does not need instantaneous knowledge of the SI channel. However, in practical applications, passive SIC may still be needed to suppress the residual SI when the SI channel is unknown.

Remark 1

Note that Theorem 1 is valid for a single-cell system. In a multicell system with FD BSs, each transmission experiences more interference than in the single-cell situation, including SI, UE-to-UE interference from both within the cell and neighboring cells, and BS-to-BS interference. To realize the gain of FD massive MIMO, efficient multicell interference mitigation techniques, which may include UE scheduling, power control, and multiuser precoding with limited interference channel knowledge, should be studied.

4 Practical Implementation of BDFD Scheme

In this section, we consider several key components of the BDFD scheme in a practical implementation, which include UE grouping, effective beam-domain channel acquisition, beam-domain data transmission, and interference control between uplink and downlink.

4.1 K-Means-Based UE Grouping

In a real cellular system, UEs will not naturally partition into groups with exactly the same active beam set. In order to implement the BDFD scheme efficiently, the UEs with different active beam sets must be partitioned so that the UE grouping criteria are satisfied as closely as possible. In this subsection, we propose a UE grouping scheme to achieve this task. Our scheme consists of the following three steps.

Step 1: Compute the active beam sets of all UEs and the SI channel based on the DOA/DOD information using the method in Sect. 3.1.

Step 2: The aim of the second step is to gather the UEs with similar active beam spaces into a group based on the K-means principle. Without loss of generality, we consider the uplink UEs; the operation for downlink UEs is similar. To apply the K-means algorithm, we first need to define the “distance” between UEs. In the proposed scheme, we employ the chordal distance between the active beam spaces of UEs. In particular, the distance between UEs \( {k}_u \) and \( {k}_u^{\prime } \) can be expressed as

$$ {\displaystyle \begin{array}{c}{D}_{\mathrm{chordal}}\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}},{\mathbf{F}}^{\left\{:,{B}_{k_u^{\prime }}\right\}}\right)={\left\Vert {\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H-{\mathbf{F}}^{\left\{:,{B}_{k_u^{\prime }}\right\}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_u^{\prime }}\right\}}\right)}^H\right\Vert}^2\\ {}\qquad\qquad\ =\left|{B}_{k_u}\right|+\left|{B}_{k_u^{\prime }}\right|-2\left|{B}_{k_u}\cap {B}_{k_u^{\prime }}\right|\end{array}} $$
(27)

where the second equality is based on the orthogonality between columns of \( {\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}} \) and \( {\mathbf{F}}^{\left\{:,{B}_{k_u^{\prime }}\right\}} \). Moreover, for a group of UEs \( \mathcal{U} \), the “centroid” of their active beam spaces is defined as [31]

$$ \overline{\mathbf{F}}={\mathrm{eig}}_{b_u}\left\{\sum \limits_{k_u\in \mathcal{U}}{\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H\right\} $$
(28)

where \( {\mathrm{eig}}_{b_u}\left\{\mathbf{A}\right\} \) indicates the unitary matrix whose columns are the \( {b}_u \) dominant eigenvectors of the matrix \( \mathbf{A} \). Since the columns of \( \mathbf{F} \) are all eigenvectors of \( {\sum}_{k_u\in \mathcal{U}}{\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}{\left({\mathbf{F}}^{\left\{:,{B}_{k_u}\right\}}\right)}^H \), we can obtain \( \overline{\mathbf{F}}={\mathbf{F}}^{\left\{:,\overline{B}\right\}} \), where \( \overline{B} \) denotes the active beam set of the “centroid.” By examining (28), it is easy to see that \( \overline{B} \) can be expressed as

$$ \overline{B}=\left\{{f}_1,{f}_2,\cdots, {f}_{b_u}\right\} $$
(29)

where \( {f}_i \) is the index of the \( i \)th most frequently appearing beam in the sets \( {\left\{{B}_{k_u}\right\}}_{k_u\in \mathcal{U}} \). With (27) and the notion of the active beam set, we can construct a very simple UE grouping algorithm based on the K-means principle, as shown in Algorithm 1. Note that, as the output of Algorithm 1, the active beam set of the group “centroid” is treated as the active beam set of that group.
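The quantities in (27) to (29) can be illustrated with a short script. The sketch below is not a reproduction of Algorithm 1; it is a generic K-means style loop written under stated assumptions: \( \mathbf{F} \) is taken to be the unitary DFT matrix, the initial centroids are supplied by the caller, and ties between equally frequent beams are broken by the smaller beam index. All function names and the example beam sets are hypothetical.

```python
import cmath
from collections import Counter


def dft_matrix(n):
    """Unitary n x n DFT matrix (assumed model for F)."""
    return [[cmath.exp(-2j * cmath.pi * r * c / n) / n ** 0.5 for c in range(n)]
            for r in range(n)]


def chordal_from_matrices(f, a, b):
    """First line of (27): squared Frobenius norm of the projector difference."""
    n = len(f)

    def proj(beams, r, c):
        return sum(f[r][m] * f[c][m].conjugate() for m in beams)

    return sum(abs(proj(a, r, c) - proj(b, r, c)) ** 2
               for r in range(n) for c in range(n))


def chordal(a, b):
    """Second line of (27): closed form on the beam index sets."""
    return len(a) + len(b) - 2 * len(a & b)


def centroid(beam_sets, size):
    """Centroid beam set as in (29): the `size` most frequent beam indices."""
    counts = Counter(beam for beams in beam_sets for beam in beams)
    # Descending count, then ascending index so that ties are deterministic.
    ranked = sorted(counts, key=lambda beam: (-counts[beam], beam))
    return set(ranked[:size])


def group_ues(ue_beams, initial_centroids, size, max_iter=20):
    """K-means style grouping of UEs by active beam set (illustrative only)."""
    centroids = [set(c) for c in initial_centroids]
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {
            ue: min(range(len(centroids)),
                    key=lambda g: chordal(beams, centroids[g]))
            for ue, beams in ue_beams.items()
        }
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for g in range(len(centroids)):
            members = [ue_beams[ue] for ue, idx in assignment.items() if idx == g]
            if members:  # keep the old centroid if a group becomes empty
                centroids[g] = centroid(members, size)
    return assignment, centroids


# Check that both lines of (27) agree on a small example.
F = dft_matrix(8)
print(chordal_from_matrices(F, {0, 1, 2}, {2, 3}), chordal({0, 1, 2}, {2, 3}))

# Hypothetical active beam sets of four uplink UEs.
ue_beams = {"ue0": {0, 1, 2}, "ue1": {1, 2, 3},
            "ue2": {10, 11, 12}, "ue3": {11, 12, 13}}
assignment, centroids = group_ues(ue_beams, [{0, 1, 2}, {10, 11, 12}], size=3)
print(assignment)
print(centroids)
```

Because the closed form in (27) depends only on set sizes, the grouping loop never needs to form \( \mathbf{F} \) explicitly; the matrix-based function is included only to confirm the equality numerically.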

Step 3: After step 2, we get a set of uplink UE groups, a set of downlink UE groups, and their active beam sets. To meet the SI cancellation condition in Criterion 2, for each uplink group, if its active beam set (denoted by \( {B}_u \)) (partially) overlaps with \( {B}_{SI,R} \), we update \( {B}_u \) as \( {B}_u={B}_u/\left({B}_u\cap {B}_{SI,R}\right) \). The active beam sets of the downlink groups are kept unchanged. Alternatively, we can update the active beam set of each downlink group (denoted by \( {B}_d \)) as \( {B}_d={B}_d/\left({B}_d\cap {B}_{SI,T}\right) \) if \( {B}_d\cap {B}_{SI,T}\ne \varnothing \), while keeping the active beam sets of the uplink groups unchanged. On the other hand, the active beam sets of different uplink/downlink UE groups after step 2 may also partially overlap, which is not allowed according to Criterion 1. To deal with this problem, we further classify the uplink/downlink UE groups into several clusters so that the active beam sets of uplink/downlink UE groups in the same cluster are non-overlapping with a certain guard interval. On a given time-frequency resource, only one uplink cluster and one downlink cluster are served using the BDFD scheme. Moreover, the UE groups from different clusters are served using orthogonal time-frequency resources to eliminate the interference.
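Step 3 can be sketched in a few lines. The text does not specify how groups are assigned to clusters, so the greedy first-fit rule below is an assumption made for illustration, as is the linear (non-circular) beam index distance used for the guard interval. The function names and the example beam sets are hypothetical.

```python
def remove_si_beams(group_beams, si_beams):
    """Update each set as B = B / (B ∩ B_SI), i.e., drop the beams shared with the SI set."""
    return [beams - si_beams for beams in group_beams]


def separated(a, b, guard):
    """True if every beam in a is more than `guard` indices away from every beam in b."""
    return all(abs(x - y) > guard for x in a for y in b)


def cluster_groups(group_beams, guard=0):
    """Greedy first-fit clustering (assumed rule): groups in one cluster are separated."""
    clusters = []
    for idx, beams in enumerate(group_beams):
        for cluster in clusters:
            if all(separated(beams, group_beams[other], guard) for other in cluster):
                cluster.append(idx)
                break
        else:
            # No existing cluster can host this group, so open a new one.
            clusters.append([idx])
    return clusters


# Hypothetical uplink group beam sets and SI receive beam set.
uplink = [{0, 1, 2}, {2, 3, 4}, {8, 9}, {10, 11}]
si_rx = {4, 8}

trimmed = remove_si_beams(uplink, si_rx)
clusters = cluster_groups(trimmed, guard=1)
print(trimmed)
print(clusters)
```

With `guard=0` the separation test reduces to plain disjointness, which is the condition in Criterion 1; a positive guard additionally keeps adjacent beams of different groups apart.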

Remark 2

Here, we note that orthogonal time-frequency resource allocation for different clusters does not imply more time-frequency resource consumption. The reason is that, in a cellular system, the number of UEs within each cell is commonly large, so it is impossible to serve all UEs using the same time-frequency resource. To access the network, UEs which cannot be served simultaneously should be allocated to other time-frequency resources using technologies such as orthogonal frequency-division multiple access. Thus, the UE clustering operation can be viewed as an additional constraint on time-frequency resource allocation and will not degrade the system SE significantly.

Remark 3

When the active beam sets of uplink or downlink groups (partially) overlap with that of the SI channel, these groups get fewer beams due to Criterion 2. This may cause fairness problems between uplink and downlink UEs. Note that Criterion 2 can be satisfied by reducing the active beams of either the uplink groups or the downlink groups in step 3. Therefore, if we reduce the active beams of the uplink groups to meet Criterion 2 in odd time slots and reduce the active beams of the downlink groups to meet Criterion 2 in even time slots, the fairness can be improved to some extent. More intelligently, a resource allocation algorithm in the time-frequency dimension can be investigated to achieve some kind of fairness (e.g., max-min fairness) among all UEs. This is an interesting topic for future research.

4.2 Full-Duplex Effective Beam-Domain Channel Estimation

In this subsection, we propose a full-duplex channel estimation scheme to estimate the effective beam-domain channels. During the training phase, all the uplink UEs transmit pilot signals to the BS while the BS simultaneously transmits pilot signals to the downlink UEs. Let \( {\Phi}_u\in {\mathbb{C}}^{\tau_u\times \underset{g_u\in {G}_u}{\max }{K}_{g_u}} \) be the orthogonal pilot sequence set for uplink training, where \( {\tau}_u \) denotes the pilot sequence length, which satisfies \( {\tau}_u\ge \underset{g_u\in {G}_u}{\max }{K}_{g_u} \). The pilot sequences allocated to group \( {g}_u \) are given by \( {\Phi}_{g_u}={\Phi}_u^{\left\{:,1:{K}_{g_u}\right\}} \). Meanwhile, let \( {\Phi}_d\in {\mathbb{C}}^{\tau_d\times {b}_d} \) be the orthogonal downlink pilot sequences, where \( {\tau}_d\ge {b}_d \) denotes the pilot sequence length. The downlink pilot sequence for group \( {g}_d \) is precoded by multiplying it by the matrix \( {\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}} \). This operation is essential to suppress the IGI during the training phase, as will be seen below. The received pilot signals at the BS and the downlink group \( {g}_d \) can be expressed as

$$ {\mathbf{Y}}_u=\sum \limits_{g_u^{\prime}\in {G}_u}{\mathbf{H}}_{g_u^{\prime }}{\Phi}_{g_u^{\prime}}^T+{\mathbf{H}}_{SI}\sum \limits_{g_d^{\prime}\in {G}_d}{\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{\dot{\Phi}}_d^T+{\mathbf{N}}_u $$
(30)
$$ {\mathbf{Y}}_{g_d}={\mathbf{H}}_{g_d}^H\sum \limits_{g_d^{\prime}\in {G}_d}{\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{\Phi}_d^T+{\mathbf{N}}_{g_d} $$
(31)

where \( {\dot{\Phi}}_d={\Phi}_d^{\left\{1:{\tau}_u,:\right\}} \) if \( {\tau}_d\ge {\tau}_u \) and \( {\dot{\Phi}}_d={\left[{\Phi}_d^T,{0}_{\left({\tau}_u-{\tau}_d\right)\times {b}_d}^T\right]}^T \) if \( {\tau}_d<{\tau}_u \). \( {\mathbf{N}}_u \) and \( {\mathbf{N}}_{g_d} \) denote the AWGN matrices with variance \( \sigma \).
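The piecewise definition of \( {\dot{\Phi}}_d \) simply aligns the downlink pilot matrix with the \( {\tau}_u \) symbol times observed by the BS. A minimal sketch, assuming the pilot matrix is stored as a list of rows (one row per symbol time); the function name and the example values are hypothetical.

```python
def align_downlink_pilot(phi_d, tau_u):
    """Return a tau_u x b_d matrix: the first tau_u rows of phi_d, or phi_d zero-padded."""
    tau_d = len(phi_d)
    b_d = len(phi_d[0])
    if tau_d >= tau_u:
        # Truncate to the uplink training window.
        return [row[:] for row in phi_d[:tau_u]]
    # Append (tau_u - tau_d) all-zero rows.
    padding = [[0] * b_d for _ in range(tau_u - tau_d)]
    return [row[:] for row in phi_d] + padding


phi_d = [[1, 1], [1, -1], [1, 1]]  # illustrative values only: tau_d = 3, b_d = 2
truncated = align_downlink_pilot(phi_d, tau_u=2)
padded = align_downlink_pilot(phi_d, tau_u=5)
print(truncated)
print(padded)
```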

4.2.1 Uplink Effective Beam-Domain Channel Estimation

By multiplying both sides of (30) by \( {\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H \), we arrive at the beam-domain received pilot signal from uplink group \( {g}_u \)

$$ {\displaystyle \begin{array}{c}{\mathbf{Y}}_{g_u}={\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H{\mathbf{Y}}_u\\ {}={\tilde{\mathbf{H}}}_{g_u}^{\left\{{B}_{g_u},:\right\}}{\Phi}_{g_u}^T+\sum \limits_{g_u^{\prime}\in {G}_u/\left\{{g}_u\right\}}{\tilde{\mathbf{H}}}_{g_u^{\prime}}^{\left\{{B}_{g_u},:\right\}}{\Phi}_{g_u^{\prime}}^T+\sum \limits_{g_d^{\prime}\in {G}_d}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}{\dot{\Phi}}_d^T+{\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H{\mathbf{N}}_u\end{array}} $$
(32)

With (32), the least squares (LS) estimator of the effective beam-domain channel vector for the uplink UE \( {g}_{u,k} \) can be obtained as

$$ {\displaystyle \begin{array}{c}{\tilde{\mathbf{h}}}_{g_{u,k},\mathrm{LS}}^{\left\{{B}_{g_u},:\right\}}=\frac{1}{\tau_u{p}_u}{\mathbf{Y}}_{g_u}{\Phi}_u^{\ast }{\mathbf{e}}_k\\ {}={\tilde{\mathbf{h}}}_{g_{u,k}}^{\left\{{B}_{g_u},:\right\}}+\sum \limits_{g_u^{\prime}\in {G}_u/\left\{{g}_u\right\}}{\tilde{\mathbf{h}}}_{g_{u,k}^{\prime}}^{\left\{{B}_{g_u},:\right\}}+\frac{1}{\tau_u{p}_u}\sum \limits_{g_d^{\prime}\in {G}_d}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}{\dot{\Phi}}_d^T{\Phi}_u^{\ast }{\mathbf{e}}_k\\ {}+\frac{1}{\tau_u{p}_u}{\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H{\mathbf{N}}_u{\Phi}_u^{\ast }{\mathbf{e}}_k\end{array}} $$
(33)

where \( {p}_u \) denotes the power of each uplink pilot symbol, i.e., \( {\left|{\left[{\Phi}_{g_u}\right]}_{i,j}\right|}^2={p}_u \). The second term on the RHS of (33) represents the pilot contamination due to the use of the same pilot sequences over all the uplink groups. The third term is the SI due to the simultaneous uplink and downlink training. Recalling Criterion 1 and Criterion 2 in the last section, and using Lemma 1 and Lemma 2, we can deduce that the pilot contamination and SI approach zero in the large \( N \) regime.
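The first term of (33) relies on the pilot orthogonality \( {\Phi}_u^T{\Phi}_u^{\ast }={\tau}_u{p}_u\mathbf{I} \). The sketch below checks this for a single uplink group in the noise-free case, so the pilot contamination, SI, and noise terms of (33) are deliberately left out. The DFT-based pilot construction is an assumption chosen for illustration, the function names and channel values are hypothetical, and the UE index `k` is zero-based in the code.

```python
import cmath


def dft_pilots(tau, num_ues, power=1.0):
    """tau x num_ues pilot matrix with orthogonal columns and |entry|^2 = power."""
    amp = power ** 0.5
    return [[amp * cmath.exp(-2j * cmath.pi * t * k / tau) for k in range(num_ues)]
            for t in range(tau)]


def received_pilots(h, phi):
    """Noise-free Y = H Phi^T, with H of size b x K and Phi of size tau x K."""
    return [[sum(h_row[k] * phi[t][k] for k in range(len(h_row)))
             for t in range(len(phi))] for h_row in h]


def ls_estimate(y, phi, k, power=1.0):
    """(1 / (tau p)) Y Phi^* e_k: LS estimate of the channel of UE k (zero-based)."""
    tau = len(phi)
    return [sum(y_row[t] * phi[t][k].conjugate() for t in range(tau)) / (tau * power)
            for y_row in y]


# Hypothetical beam-domain channel: b = 3 active beams, K = 2 UEs.
H = [[1 + 2j, 0.5j], [-1j, 2.0], [0.3, -0.7 + 0.1j]]
Phi = dft_pilots(tau=4, num_ues=2, power=2.0)
Y = received_pilots(H, Phi)
h_hat = ls_estimate(Y, Phi, k=1, power=2.0)
print(h_hat)
```

In this idealized setting the estimate coincides with the second column of `H` up to floating-point error; the remaining terms of (33) describe what is added when other groups, the downlink pilots, and noise are present.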

In a practical scenario with a finite number of BS antennas, the LS estimate in (33) can be further refined by a linear minimum mean square error (LMMSE) procedure to mitigate the residual pilot contamination and SI. Based on the general expression of the LMMSE estimator [32, Ch. 12], the refined estimates can be expressed as

$$ \begin{aligned}{\tilde{\mathbf{h}}}_{g_{u,k},\mathrm{LM}}^{\left\{{B}_{g_u},:\right\}}&=\sum \limits_{i=1}^{M_u}{\int}_{\theta_{g_{u,k,i}}^{\mathrm{min}}}^{\theta_{g_{u,k,i}}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H\mathbf{a}\left(\theta \right){\mathbf{a}}^H\left(\theta \right){\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}{S}_{g_{u,k},i}\left(\theta \right) d\theta \\ &\quad\times \left(\frac{\sigma }{\tau_u{p}_u}{\mathbf{I}}_{b_u}+\sum \limits_{g_u^{\prime}\in {G}_u}\sum \limits_{i=1}^{M_u}{\int}_{\theta_{g_{u,k,i}^{\prime}}^{\mathrm{min}}}^{\theta_{g_{u,k,i}^{\prime}}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H\mathbf{a}\left(\theta \right){\mathbf{a}}^H\left(\theta \right){\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}{S}_{g_{u,k}^{\prime },i}\left(\theta \right) d\theta \right.\\ &\quad+{\left(\frac{1}{\tau_u{p}_u}\right)}^2\sum \limits_{g_d^{\prime },{g}_d^{{\prime\prime}}\in {G}_d}\sum \limits_{i=1}^{M_{SI}}{\int}_{\theta_{R,i}^{\mathrm{min}}}^{\theta_{R,i}^{\mathrm{max}}}{\int}_{\theta_{T,i}^{\mathrm{min}}}^{\theta_{T,i}^{\mathrm{max}}}{\mathbf{G}}_{\theta_R,{\theta}_T}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}{\Psi}_k{\left({\mathbf{G}}_{\theta_R,{\theta}_T}^{\left\{{B}_{g_u},{B}_{g_d^{{\prime\prime} }}\right\}}\right)}^H\\ &\quad\times {\left.{S}_{SI,i}\left({\theta}_R,{\theta}_T\right)d{\theta}_Rd{\theta}_T\right)}^{-1}{\tilde{\mathbf{h}}}_{g_{u,k},\mathrm{LS}}^{\left\{{B}_{g_u},:\right\}}\end{aligned} $$
(34)

where \( {\mathbf{G}}_{\theta_R,{\theta}_T}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}\triangleq {\left({\mathbf{F}}^{\left\{:,{B}_{g_u}\right\}}\right)}^H\mathbf{a}\left({\theta}_R\right){\mathbf{a}}^H\left({\theta}_T\right){\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}} \) and \( {\Psi}_k\triangleq {\left(\frac{1}{\tau_u{p}_u}\right)}^2{\dot{\Phi}}_d^T{\Phi}_u^{\ast }{\mathbf{e}}_k{\mathbf{e}}_k^H{\Phi}_u^T{\dot{\Phi}}_d^{\ast } \).

4.2.2 Downlink Effective Beam-Domain Channel Estimation

By exploiting the beam-domain representation and using (31), the LS estimator for the effective beam-domain channel vector of the downlink UE \( {g}_{d,k} \) can be obtained as

$$ {\displaystyle \begin{array}{c}{\tilde{\mathbf{h}}}_{g_{d,k},\mathrm{LS}}^{\left\{{B}_{g_d},:\right\}}=\frac{1}{\tau_d{p}_d}{\left({\mathbf{Y}}_{g_d}{\Phi}_d^{\ast}\right)}^H{\mathbf{e}}_k\\ {}={\tilde{\mathbf{h}}}_{g_{d,k}}^{\left\{{B}_{g_d},:\right\}}+\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}{\tilde{\mathbf{h}}}_{g_{d,k}}^{\left\{{B}_{g_d^{\prime }},:\right\}}+\frac{1}{\tau_d{p}_d}{\left({\mathbf{N}}_{g_d}{\Phi}_d^{\ast}\right)}^H{\mathbf{e}}_k\end{array}} $$
(35)

where \( {p}_d \) denotes the power of each downlink pilot symbol, i.e., \( {\left|{\left[{\Phi}_d\right]}_{i,j}\right|}^2={p}_d \). The second term represents the pilot contamination due to the use of the same pilot sequences over all the downlink UE groups. Using Criterion 1 and Lemma 1, the pilot contamination converges to zero as \( N\to \infty \). Similarly, we can refine the estimates with the LMMSE procedure, resulting in

$$ {\displaystyle \begin{array}{c}{\tilde{\mathbf{h}}}_{g_{d,k},\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}=\sum \limits_{g_d^{\prime}\in {G}_d}\sum \limits_{i=1}^{M_d}{\int}_{\theta_{g_{d,k,i}}^{\mathrm{min}}}^{\theta_{g_{d,k,i}}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{g_d}\right\}}\right)}^H\mathbf{a}\left(\theta \right)\\ {}\times {\mathbf{a}}^H\left(\theta \right){\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}{S}_{g_{d,k},i}\left(\theta \right) d\theta \left(\sum \limits_{g_d^{\prime },{g}_d^{{\prime\prime}}\in {G}_d}\sum \limits_{i=1}^{M_d}{\int}_{\theta_{g_{d,k,i}}^{\mathrm{min}}}^{\theta_{g_{d,k,i}}^{\mathrm{max}}}{\left({\mathbf{F}}^{\left\{:,{B}_{g_d^{\prime }}\right\}}\right)}^H\mathbf{a}\left(\theta \right)\right.\\ {}{\left.\times {\mathbf{a}}^H\left(\theta \right){\mathbf{F}}^{\left\{:,{B}_{g_d^{{\prime\prime} }}\right\}}{S}_{g_{d,k},i}\left(\theta \right) d\theta +\frac{\sigma }{\tau_d{p}_d}\right)}^{-1}{\tilde{\mathbf{h}}}_{g_{d,k},\mathrm{LS}}^{\left\{{B}_{g_d},:\right\}}\end{array}} $$
(36)

To estimate the channels of all \( {\sum}_{g_u\in {G}_u}{K}_{g_u} \) uplink UEs and \( {\sum}_{g_d\in {G}_d}{K}_{g_d} \) downlink UEs, the minimum required lengths of pilot sequences in the proposed full-duplex estimation scheme and conventional schemes used in the TDD/FDD/FD massive MIMO systems are summarized in Table 2. It is seen that the proposed scheme improves the training efficiency significantly.

Table 2 Minimum required lengths of pilot sequences in the proposed estimation scheme and conventional schemes

Example 2

Considering the channel model in Example 1, we have \( {b}_d=12 \) if the BS is equipped with \( N=128 \) antennas. If three uplink groups and three downlink groups are scheduled and each group contains five UEs, the minimum required length of pilot sequences in the proposed scheme is 12 (symbol times).

However, this number becomes 30, 17, and 143 (symbol times), respectively, in the reference schemes listed in Table 2. After downlink channel estimation, the estimated CSI must be fed back to the BS in order to perform downlink transmission. This affects the system in two ways. First, the feedback error due to quantization error, noise, and feedback delay decreases the accuracy of the downlink CSI. Second, CSI feedback increases the load on the feedback channel and, hence, can degrade the overall system SE. However, the results in [33] showed that the CSI error (in terms of mean-square error) due to imperfect feedback can be made much smaller than that caused by estimation error in the downlink training phase, especially in the high signal-to-noise ratio (SNR) region. Moreover, since the effective downlink channel dimension is greatly reduced in the proposed BDFD scheme, we assume that the additional load caused by CSI feedback is negligible compared with the other feedback information. Therefore, for simplicity, we consider the optimistic situation of error-free CSI feedback and neglect the SE penalty due to feedback. A similar approach is also adopted in [8].

4.3 Beam-Domain Data Transmission and Achievable Rate with Noisy CSI

To keep the complexity low, we assume that the BS employs linear processing in the beam domain. In the uplink, to detect the signals from group \( {g}_u \), the BS combines the beam-domain received signal in (19) by multiplying it by the receive beamforming matrix \( {\mathbf{W}}_{g_u}=\left[{\mathbf{w}}_{g_{u,1}},{\mathbf{w}}_{g_{u,2}},\cdots, {\mathbf{w}}_{g_{u,{K}_{g_u}}}\right]\in {\mathbb{C}}^{b_u\times {K}_{g_u}} \), i.e., \( {\dot{\mathbf{y}}}_{g_u}={\mathbf{W}}_{g_u}^H{\tilde{\mathbf{y}}}_{g_u} \). The \( k \)th entry of \( {\dot{\mathbf{y}}}_{g_u} \)

$$ {\displaystyle \begin{array}{c}{\dot{y}}_{g_{u,k}}={\mathbf{w}}_{g_{u,k}}^H{\tilde{\mathbf{h}}}_{g_{u,k}}^{\left\{{B}_{g_u},:\right\}}{s}_{g_{u,k}}+{\mathbf{w}}_{g_{u,k}}^H\sum \limits_{k^{\prime }=1,{k}^{\prime}\ne k}^{K_{g_u}}{\tilde{\mathbf{h}}}_{g_{u,{k}^{\prime}}}^{\left\{{B}_{g_u},:\right\}}{s}_{g_{u,{k}^{\prime }}}\\ {}+{\mathbf{w}}_{g_{u,k}}^H\sum \limits_{g_u^{\prime}\in {G}_u/\left\{{g}_u\right\}}{\tilde{\mathbf{H}}}_{g_u^{\prime}}^{\left\{{B}_{g_u},:\right\}}{\mathbf{s}}_{g_u^{\prime }}+{\mathbf{w}}_{g_{u,k}}^H\sum \limits_{g_d^{\prime}\in {G}_d}{\tilde{\mathbf{H}}}_{SI}^{\left\{{B}_{g_u},{B}_{g_d^{\prime }}\right\}}{\tilde{\mathbf{x}}}_{g_d^{\prime }}+{\mathbf{w}}_{g_{u,k}}^H{\tilde{\mathbf{n}}}_{g_u}\end{array}} $$
(37)

is used to decode the symbol of UE \( g_{u,k} \). In the downlink, the intended signal of group \( g_d \), i.e., \( {\mathbf{s}}_{g_d}\in {\mathbf{C}}^{K_{g_d}\times 1} \), is precoded by the beamforming matrix \( {\mathbf{W}}_{g_d}=\left[{\mathbf{w}}_{g_{d,1}},{\mathbf{w}}_{g_{d,2}},\cdots, {\mathbf{w}}_{g_{d,{K}_{g_d}}}\right]\in {\mathbf{C}}^{b_d\times {K}_{g_d}} \) in the beam domain. Thus, the beam-domain transmit signal vector for group \( g_d \) can be expressed as \( {\tilde{\mathbf{x}}}_{g_d}={\mathbf{W}}_{g_d}{\mathbf{s}}_{g_d} \). Substituting these into (III-B), the beam-domain received signal at UE \( g_{d,k} \) can be expressed as

$$ {\displaystyle \begin{array}{c}{\tilde{y}}_{g_{d,k}}={\left({\tilde{\mathbf{h}}}_{g_{d,k}}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\mathbf{w}}_{g_{d,k}}{s}_{g_{d,k}}+{\left({\tilde{\mathbf{h}}}_{g_{d,k}}^{\left\{{B}_{g_d},:\right\}}\right)}^H\sum \limits_{k^{\prime }=1,{k}^{\prime}\ne k}^{K_{g_d}}{\mathbf{w}}_{g_{d,{k}^{\prime }}}{s}_{g_{d,{k}^{\prime }}}\\ {}+\sum \limits_{g_d^{\prime}\in {G}_d/\left\{{g}_d\right\}}{\left({\tilde{\mathbf{h}}}_{g_{d,k}}^{\left\{{B}_{g_d^{\prime }},:\right\}}\right)}^H{\mathbf{W}}_{g_d^{\prime }}{\mathbf{s}}_{g_d^{\prime }}+{n}_{g_{d,k}}\end{array}} $$
(38)

Finding the optimal beamforming scheme that maximizes the sum rate has been proved to be NP-hard [34]. Thus, we consider a suboptimal scheme, which provides a lower bound on the system performance. In general, the (suboptimal) beamforming matrices can be designed with different criteria, e.g., maximizing the desired signal power, which corresponds to eigen beamforming, or minimizing the inter-UE interference, which corresponds to zero-forcing (ZF) beamforming. In this work, we adopt the latter, since ZF beamforming is known to approach the asymptotic limit of the achievable rate faster as the number of BS antennas increases [2]. Assuming the channel estimators in (34) and (36), the receive and transmit beamforming matrices of the BS can be expressed as

$$ {\mathbf{W}}_{g_u}={\tilde{\mathbf{H}}}_{g_u,\mathrm{LM}}^{\left\{{B}_{g_u},:\right\}}{\left({\left({\tilde{\mathbf{H}}}_{g_u,\mathrm{LM}}^{\left\{{B}_{g_u},:\right\}}\right)}^H{\tilde{\mathbf{H}}}_{g_u,\mathrm{LM}}^{\left\{{B}_{g_u},:\right\}}\right)}^{-1} $$
$$ {\mathbf{W}}_{g_d}={\tilde{\mathbf{H}}}_{g_d,\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}{\left({\left({\tilde{\mathbf{H}}}_{g_d,\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\tilde{\mathbf{H}}}_{g_d,\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}\right)}^{-1}{\Upsilon}_{g_d}^{-1/2} $$
(39)

where \( {\Upsilon}_{g_d} \) is a diagonal normalization matrix with \( {\left[{\Upsilon}_{g_d}\right]}_{l,l}={\mathbf{e}}_l^H{\left({\left({\tilde{\mathbf{H}}}_{g_d,\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}\right)}^H{\tilde{\mathbf{H}}}_{g_d,\mathrm{LM}}^{\left\{{B}_{g_d},:\right\}}\right)}^{-1}{\mathbf{e}}_l \). Because of the required matrix inversion, the complexity of ZF beamforming becomes high when the number of UEs in each group is large. Low-complexity linear beamforming schemes, such as eigen beamforming, can be employed to deal with this problem at the cost of some performance loss. To do so, we need only replace (39) with the beamforming matrices of these schemes. No other change is required.
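The structure of (39) can be checked numerically. The sketch below is only an illustration under stated assumptions: the estimated effective beam-domain channel is replaced by a random complex Gaussian matrix, the dimensions (12 active beams, five UEs) are borrowed from the example in Sect. 4.2, and the function name `zf_beamformers` is hypothetical. It shows that the uplink combiner removes inter-UE interference within the group and that the normalization by \( {\Upsilon}_{g_d}^{-1/2} \) gives each downlink beamforming vector unit norm.

```python
import numpy as np


def zf_beamformers(H):
    """ZF matrices in the form of (39) for a (beams x UEs) channel estimate H.

    Returns the uplink combiner W_u, the normalized downlink precoder W_d,
    and the diagonal of the normalization matrix Upsilon.
    """
    gram_inv = np.linalg.inv(H.conj().T @ H)
    W_u = H @ gram_inv
    upsilon = np.real(np.diag(gram_inv))
    # Dividing column l by sqrt(upsilon[l]) equals right-multiplying by Upsilon^{-1/2}.
    W_d = W_u / np.sqrt(upsilon)
    return W_u, W_d, upsilon


# Random stand-in for the estimated channel (assumption, not the paper's model).
rng = np.random.default_rng(0)
num_beams, num_ues = 12, 5
H = (rng.standard_normal((num_beams, num_ues))
     + 1j * rng.standard_normal((num_beams, num_ues))) / np.sqrt(2)

W_u, W_d, upsilon = zf_beamformers(H)

# Uplink: W_u^H H is the identity, so UEs in the same group do not interfere.
uplink_gain = W_u.conj().T @ H
# Downlink: the columns of W_d have unit norm and H^H W_d is diagonal.
column_norms = np.linalg.norm(W_d, axis=0)
downlink_gain = H.conj().T @ W_d
```

The diagonal of `downlink_gain` equals \( {\left[{\Upsilon}_{g_d}\right]}_{k,k}^{-1/2} \), so the useful downlink signal power is proportional to \( {\left[{\Upsilon}_{g_d}^{-1}\right]}_{k,k} \).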

According to (37) and (38) and using the bounding technique in [35], the average achievable rates at the uplink UE \( g_{u,k} \) and downlink UE \( g_{d,k} \) can be expressed as

$$ {\displaystyle \begin{array}{l}{R}_{g_{u,k}}=\frac{T-\max \left\{{\tau}_u,{\tau}_d\right\}}{T}{\log}_2\left(1+\frac{p_{g_{u,k}}}{\mathbb{E}\left[{\mathrm{CE}}_{g_{u,k}}\right]+\mathbb{E}\left[{\mathrm{IUI}}_{g_{u,k}}\right]+\mathbb{E}\left[{\mathrm{IGI}}_{g_{u,k}}\right]+\mathbb{E}\left[{\mathrm{SI}}_{g_{u,k}}\right]+\mathbb{E}\left[{\left\Vert {\mathbf{w}}_{g_{u,k}}\right\Vert}^2\right]}\right)\\ {}{R}_{g_{d,k}}=\frac{T-\max \left\{{\tau}_u,{\tau}_d\right\}}{T}{\log}_2\left(1+\frac{p_{g_{d,k}}\mathbb{E}\left[{\left[{\Upsilon}_{g_d}^{-1}\right]}_{k,k}\right]}{\mathbb{E}\left[{\mathrm{CE}}_{g_{d,k}}\right]+\mathbb{E}\left[{\mathrm{IUI}}_{g_{d,k}}\right]+\mathbb{E}\left[{\mathrm{IGI}}_{g_{d,k}}\right]+1}\right)\end{array}} $$
(40)

where T denotes the channel coherence time, and \( {p}_{g_{u,k}}=\mathbb{E}\left[{\left|{s}_{g_{u,k}}\right|}^2\right] \) and \( {p}_{g_{d,k}}=\mathbb{E}\left[{\left|{s}_{g_{d,k}}\right|}^2\right] \) denote the transmit powers. \( {\mathrm{CE}}_i \), \( {\mathrm{IUI}}_i \), \( {\mathrm{IGI}}_i \), and \( {\mathrm{SI}}_i \) (\( i\in \left\{{g}_{u,k},{g}_{d,k}\right\} \)) denote the powers of channel estimation error, inter-UE interference (IUI) within the group, IGI, and SI, respectively, whose expressions are summarized in Table 3.

Table 3 Expressions of powers of channel estimation error, IUI within the group, IGI, and SI
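As a reading aid for (40), the following sketch evaluates the two rate expressions once the averaged powers are known. The function names and all numeric inputs are hypothetical; in the paper the averaged terms come from the expressions in Table 3, which are not reproduced here.

```python
import math


def prelog(T, tau_u, tau_d):
    """Fraction of the coherence time T left after uplink/downlink training."""
    return (T - max(tau_u, tau_d)) / T


def uplink_rate(T, tau_u, tau_d, p, ce, iui, igi, si, w_norm_sq):
    """Uplink rate of (40); w_norm_sq is the averaged squared combiner norm."""
    sinr = p / (ce + iui + igi + si + w_norm_sq)
    return prelog(T, tau_u, tau_d) * math.log2(1.0 + sinr)


def downlink_rate(T, tau_u, tau_d, p, inv_upsilon_kk, ce, iui, igi):
    """Downlink rate of (40); the noise power is normalized to 1."""
    sinr = p * inv_upsilon_kk / (ce + iui + igi + 1.0)
    return prelog(T, tau_u, tau_d) * math.log2(1.0 + sinr)


# Illustrative numbers only (assumptions, not simulation results).
r_up = uplink_rate(T=200, tau_u=12, tau_d=12, p=2.0,
                   ce=0.1, iui=0.0, igi=0.01, si=0.01, w_norm_sq=1.0)
r_down = downlink_rate(T=200, tau_u=12, tau_d=12, p=2.0,
                       inv_upsilon_kk=4.0, ce=0.1, iui=0.0, igi=0.01)
```

Both rates share the same pre-log factor, so a longer training phase in either direction penalizes the uplink and the downlink equally.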

The exact expressions of the average achievable rates are difficult to obtain under the considered channel model. Instead, in the following we focus on how the above negative factors (i.e., channel estimation error, IUI, IGI, and SI) affect the achievable rate performance of the BDFD scheme. To answer this question, we present the scaling behaviors of the powers of channel estimation error, IUI, IGI, and SI in the following theorem.

Theorem 2

Assume the cardinalities of the active beam sets scale linearly with N, i.e., \( \underset{N\to \infty }{\lim}\frac{b_u}{N}>0 \) and \( \underset{N\to \infty }{\lim}\frac{b_d}{N}>0 \). The scaling behaviors of the average powers of channel estimation error, IUI, IGI, and SI in the large N regime are given in Table 4.

Table 4 Scaling behaviors of channel estimation error, IUI within the group, IGI, and SI in the large N regime

Proof

See the Appendix.

Theorem 2 reveals that, in the BDFD scheme, the powers of IGI and SI decrease faster than the other terms as the number of BS antennas increases. As a result, the effect of IGI and SI diminishes in the large N regime. In this sense, the BDFD scheme in fact decomposes the original system into several lower-dimensional uplink or downlink massive MIMO systems operating on (asymptotically) orthogonal beam spaces. Another important observation from Theorem 2 is that the SI power decreases faster than \( \mathcal{O}\left({N}^{-1}\right) \) in the BDFD scheme. This is quite different from the FD massive MIMO with linear transceiver [18], where the SI power scales exactly as \( \mathcal{O}\left({N}^{-1}\right) \) in the large N regime. The reason is that, under the UE grouping criteria (Criterion 2), the signals of the uplink or downlink groups occupy beam spaces that are asymptotically orthogonal to that of the SI. Thus, better SI suppression can be achieved in the BDFD scheme.

Remark 4

In Theorem 2, we have assumed that \( b_u \) and \( b_d \) scale linearly with N. This is a standard assumption in the field of massive MIMO [8], made in order to use the analytic tools developed for large-scale antenna systems. The assumption implies that \( b_u \) and \( b_d \), and hence the required length of the pilot sequences, tend to infinity as N → ∞, which is contrary to the purpose of this paper. However, in a practical implementation, the BS cannot be equipped with too many antennas because of realistic constraints on hardware complexity and power consumption. With a reasonable N, the training overhead is still low (see Example 2 in Sect. 4.2).

4.4 Interference Control Between Uplink and Downlink

When multiple uplink UEs and downlink UEs are active simultaneously in the same frequency band, the resulting network suffers from increased interference from uplink UEs to downlink UEs. One simple approach to alleviate this interference is cell sectorization. As shown in Fig. 4a, on a particular time-frequency resource, we only schedule the uplink UEs and downlink UEs in two opposite 120° sectors. This ensures that an uplink UE and a downlink UE separated by a small distance will not be scheduled on the same time-frequency resource. In the worst case, where the uplink and downlink UEs are both located on the boundaries of the sectors as shown in Fig. 4a, the interference channel between uplink and downlink is still much weaker than the useful channel. For example, when the distance between the BS and the UEs is 300 m, the interference channel between two boundary UEs is 46 dB weaker than the useful channel according to the 3GPP LTE BS-to-UE and UE-to-UE path loss models [36, Table 6.4–1] (note that the UE-to-UE channel suffers from more path loss than the BS-to-UE channel even though the transmission distances are the same [36]). On the other hand, to cover the whole cell evenly, we can schedule the UEs in the rotated sectors, as shown in Fig. 4b, c, using different time-frequency resources.
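The 46 dB figure can be reproduced with a short calculation. The sketch below rests on assumptions that are not stated in the text: the path-loss formulas are commonly quoted forms of the 3GPP macro-cell BS-to-UE model (128.1 + 37.6 log10 R, R in km) and the UE-to-UE model beyond 50 m (55.78 + 40 log10 R, R in m), which we take to be the ones intended by [36], and the two worst-case boundary UEs are assumed to be separated by the 60° gap between two opposite 120° sectors. All function names are hypothetical.

```python
import math


def bs_to_ue_path_loss_db(distance_m):
    """Assumed macro-cell BS-to-UE model: 128.1 + 37.6 log10(R), R in km."""
    return 128.1 + 37.6 * math.log10(distance_m / 1000.0)


def ue_to_ue_path_loss_db(distance_m):
    """Assumed UE-to-UE model beyond 50 m: 55.78 + 40 log10(R), R in m."""
    if distance_m <= 50.0:
        raise ValueError("this sketch only covers UE-to-UE distances above 50 m")
    return 55.78 + 40.0 * math.log10(distance_m)


def boundary_ue_distance_m(radius_m, gap_deg=60.0):
    """Distance between two UEs at the same radius, gap_deg apart (law of cosines)."""
    gap = math.radians(gap_deg)
    return math.sqrt(2.0 * radius_m ** 2 * (1.0 - math.cos(gap)))


radius_m = 300.0
ue_distance_m = boundary_ue_distance_m(radius_m)
gap_db = ue_to_ue_path_loss_db(ue_distance_m) - bs_to_ue_path_loss_db(radius_m)
print(f"UE-to-UE distance: {ue_distance_m:.1f} m")
print(f"interference channel weaker by {gap_db:.1f} dB")
```

With a 60° gap the two boundary UEs are as far from each other as they are from the BS, which matches the remark that the transmission distances are the same while the UE-to-UE link suffers more path loss.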

Fig. 4 Cell sectorization

5 Simulation Results

In this section, the performance of the BDFD scheme is evaluated using the 3GPP LTE simulation model for the macro-cell environment [36]. The simulation parameters are summarized in Table 5. It is assumed that the passive SIC scheme for infrastructure nodes proposed in [25] is employed at the BS. In this scheme, the suppression comes from two sources, namely, (i) the path loss introduced by the 20 m separation between the transmit and receive antenna arrays and (ii) an additional cancellation of 45 dB provided by techniques such as radio-frequency absorber material and cross-polarization. No active SIC scheme is used.

Table 5 Simulation parameters

We first consider a scenario where the uplink/downlink UEs gather perfectly in three groups and the DOA/DOD regions of the UEs in each group are identical. We assume that each group contains five UEs. The DOA regions of the three uplink groups are [−33°, −23°], [7°, 17°], and [34°, 44°], respectively. Since we assume \( M_u=1 \), stating that the DOA region of an uplink group is [a, b] means that \( \left[{\theta}_{g_{u,k},1}^{\mathrm{min}},{\theta}_{g_{u,k},1}^{\mathrm{max}}\right]=\left[a,b\right] \). Similarly, the DOD regions of the three downlink groups are [−39°, −29°], [10°, 20°], and [19°, 30°], respectively.

The DOA and DOD regions of the SI channel are set to [−15°, −5°], [54°, 66°], and [−25°, −35°], [19°, 30°], respectively. The resulting active beam sets for all the groups satisfy the UE grouping criteria.

Figure 5 compares the SEs of the BDFD scheme, TDD massive MIMO with linear transceiver [3], FDD massive MIMO with JSDM [8], FD massive MIMO with linear transceiver [18], and FD massive MIMO with spatial SI suppression [37]. For the scheme with spatial SI suppression [37], the instantaneous CSI of the SI channel is required at the BS in order to perform SI cancellation in the spatial domain. With perfect effective beam-domain CSI, the SE of the BDFD scheme approaches the sum of the uplink and downlink capacities as the number of BS antennas increases. With estimated effective beam-domain channels, the performance gap increases as N becomes larger. The reason is that, although the BDFD scheme significantly reduces the required length of the pilot sequences, the training overhead still increases linearly with N. For the same reason, when downlink reciprocity is available (at the cost of higher hardware complexity), the reference schemes in [18] and [37] achieve better SE than the BDFD scheme in the large N region. On the other hand, the BDFD scheme achieves a significant SE gain over the TDD and FDD massive MIMO systems. Interestingly, the performance gain can even exceed 2× (e.g., a 2.08× gain is observed over the TDD massive MIMO when N = 200), which is impossible in a conventional FD system. This is because the TDD massive MIMO spends more resources on pilot signaling, as discussed in Sect. 4.2. Finally, without downlink reciprocity, the FD massive MIMO with linear transceiver becomes infeasible if the number of BS antennas exceeds 175, since almost all the time resource is allocated to downlink training.

Fig. 5 Spectral efficiency with perfect UE grouping. The transmit powers of the BS and uplink UEs are 20.2 dBm. The distance between the UEs and the BS is set to 500 m. This setup ensures that the average uplink/downlink receive SNR is 3 dB

To examine the scaling results in Theorem 2, Fig. 6 shows the simulated average powers of the useful signal, channel estimation error, IUI, IGI, and SI in the BDFD scheme. The figure shows that the powers of IGI and SI decrease faster than the other negative factors in the large N regime, which coincides with Theorem 2. Then, in Figs. 7 and 8, we consider a more realistic scenario where the UEs are not naturally partitioned into groups with exactly the same active beam set. We assume that 50 uplink UEs and 50 downlink UEs are located in two opposite 120° sectors, as shown in Fig. 4a. The BS is equipped with N = 128 transmit/receive antennas. The signal of each uplink/downlink UE lies within a 10° DOA/DOD region that is randomly distributed in the sectors. The distance between each uplink/downlink UE and the BS is randomly distributed in the interval [200, 1000] m. After UE grouping, three UE clusters are formed using the method in Sect. 4.1, and the UE groups in different clusters are served with orthogonal time-frequency resources. Without loss of generality, the active beam sets of uplink groups that (partially) overlap with that of the SI channel are updated using the method in Sect. 4.1 (step 3).

Fig. 6 Average powers of useful signal, channel estimation error, and interferences in the BDFD scheme. The simulation setup is the same as in Fig. 5. (a) Uplink. (b) Downlink

Fig. 7 Spectral efficiency of BDFD scheme with imperfect UE grouping for different average receive SNRs

Fig. 8 Spectral efficiency of BDFD scheme for different numbers of scattering clusters for SI channel

Figure 7 depicts the SE of the BDFD scheme as a function of the average receive SNR. Since the UE groups are divided into three clusters, the SE is defined as the average of the SEs of the three clusters. The number of scattering clusters for the SI channel is set to \( M_{\mathrm{SI}}=2 \). The DOA and DOD regions of the SI channel are [−15°, −5°], [54°, 66°], and [−25°, −35°], [19°, 30°], respectively. Again, the BDFD scheme achieves the best performance. In particular, it achieves 1.80× and 1.87× SE gains over the TDD massive MIMO when the SNRs are 3 dB and 12 dB, respectively. The performance gain is generally smaller than that in Fig. 5. The reason is that, unlike in Fig. 5, the UEs in each group may not have exactly the same active beam set. Thus, not all the beams in the active beam set of a group can be fully used by all of its UEs, which results in some performance degradation. Moreover, there is no performance floor in the large SNR region for the BDFD scheme, even though interference from uplink UEs to downlink UEs exists. The explanation is that, with the UE scheduling scheme in Sect. 4.4, the powers of the interference from uplink UEs to downlink UEs are much smaller than the background noise.

In the previous simulations, the number of scattering clusters for the SI channel is fixed to \( M_{\mathrm{SI}}=2 \). In Fig. 8, we consider the SE of the BDFD scheme with larger \( M_{\mathrm{SI}} \). In particular, we let \( M_{\mathrm{SI}} \) increase from 2 to 10. The SI signal from each scattering cluster lies within a 10° DOA/DOD region that is randomly distributed in [−90°, 90°]. It is seen that the SE of the BDFD scheme approaches that of the TDD/FDD massive MIMO as \( M_{\mathrm{SI}} \) increases. This is because the number of uplink groups and the number of active beams for each uplink group decrease according to the algorithm in Sect. 4.1. In fact, for large \( M_{\mathrm{SI}} \), the performance gain of the BDFD scheme over TDD/FDD massive MIMO is mainly due to the saving in training resources. Moreover, since the scheme in [37] cancels the SI completely using the instantaneous CSI, it achieves better SE when \( M_{\mathrm{SI}} \)  6. However, this gain can be realized only when the instantaneous SI channel can be estimated efficiently.

6 Conclusion

This paper proposes a BDFD massive MIMO scheme to realize CCUD transmission in cellular systems. By exploiting the compressibility of the beam-domain channel, the BDFD scheme can efficiently eliminate the SI caused by CCUD transmission. The simulation results show that the BDFD massive MIMO scheme significantly outperforms the TDD/FDD massive MIMO and the FD massive MIMO with linear transceiver in the macro-cell environment. Given these advantages, we suggest BDFD massive MIMO as a potential enabling technology for the evolution toward future wireless cellular systems.