1 Introduction

The explosive growth of mobile data traffic has triggered an unrelenting demand for high capacity and ubiquitous coverage of wireless networks. In this regard, small cell networks, served by low-cost low-power small cell base stations (SBSs), have been proposed as a promising solution to tackle this data tsunami. In general, types of small cells include outdoor picocells as well as indoor femtocells. Due to their short transmit–receive distance, small cells can improve the network throughput with a more efficient spectrum spatial reuse [1]. In addition, a vast amount of data traffic can be offloaded from macrocells with the aid of small cells. The massive deployment of small cells, nevertheless, inevitably induces serious co-channel interference in Orthogonal Frequency Division Multiple-Access (OFDMA)-based cellular networks.

Sub-channel assignment and interference alignment (IA) are two potential interference management approaches in small cell networks. As a representative of the conventional interference management strategies, sub-channel assignment could eliminate interference via assigning orthogonal sub-channels to mutually interfering links, yet with an underutilized spectrum efficiency, especially when small cells are densely-deployed. Meanwhile, IA is a precoding technique which aligns interference to a reduced-dimensional subspace, so that the interference-free dimensions remained for the desired signal can be maximized. Authors in [2] have proved that each user can achieve 1 / 2 degrees of freedom (DoF) in a K-user interference channel with IA, which is K/2 times that with sub-channel assignment. As such, the total DoF IA achieves is proportional to the number of transmit–receive user pairs. However, the size of a single IA network could not be infinitely large for two reasons. On one hand, a larger size of an IA network means a tighter constraint to align all interference, which may result in the failure to acquire a feasible precoding solution [3]. On the other hand, global channel state information (CSI) must be available at transmitters, and the amount of CSI overhead is a quadratic function of the number of small cells [4, 5]. Hence, if the size of an IA network is too large, then the CSI overhead might be extremely high.

To guarantee the IA feasibility and retain the amount of overhead within an acceptable range, some related works have been conducted in [613]. Specifically, several clustering-based IA approaches were proposed in [69]. The case where some pre-selected clusters of base stations inside a network collaborate to align interference was studied in [6]. In [7], the network was divided into several IA clusters, with interference among clusters treated as noise. As such, the CSI overhead is significantly reduced, whereas the proposed clustering method in [7] is of high complexity. Moreover, the authors in [8] proposed a novel partial IA scheme, where two interference link scheduling algorithms were introduced to enhance the sum throughput of multi-cell networks. In addition, the application of semi-blind IA methods in clustered small cell networks was investigated in [9], with the objective to minimize the super-symbol length by grouping the users that can be served in the same time slot. However, the semi-blind IA scheme is difficult to realize, because the knowledge of distinct channel coherence intervals is typically unavailable at transmitters.

More specifically, clustered IA combined with conventional interference management schemes was investigated in [1013]. The scheme of IA in conjunction with code-division multiple access (CDMA) was proposed in [10]. In this scheme, intra-cluster interference can be eliminated by IA, while inter-cluster interference be mitigated through low cross-correlation nature of the pseudo-noise codes. However, the authors in [10] did not provide an effective clustering method. Moreover, the authors in [11] incorporated opportunistic resource allocation (ORA) into IA, and investigated the interactions and trade-offs between these two strategies. Furthermore, a time division multiple access (TDMA)-related IA approach was investigated in [12], where three low-complexity algorithms were proposed to partition users into orthogonal groups. Note that, IA is exploited inside groups and TDMA is leveraged among groups in [12]. However, the feasibility constraint for IA is overlooked in both [11] and [12]. In addition, a joint sub-channel assignment and IA optimization was solved to maximize the number of satisfactory users in femtocell networks in [13]. Nevertheless, in [13], the size of an IA cluster is pre-determined and static, which is not adaptive to the dynamic network topology.

All these aforementioned works make no quantitative analysis to the impact of CSI overhead on IA performance. But in reality, the payload transmission would be delayed due to the existence of CSI overhead, which may result in a great loss of the effective system throughput of small cell networks [14, 15]. In this paper, we study the joint clustered IA and sub-channel assignment scheme to cope with the co-channel interference in small cell networks. Our contributions can be summarized as follows.

  • Based on the widely-used pilot model with analog feedback for IA, we make a quantitative analysis to the amount of CSI overhead, followed by the expression of payload ratio. Meanwhile, the IA feasibility with CSI overhead involved is also considered.

  • We propose a new metric named average effective degrees of freedom (AEDoF), which embodies the average DoF of small cell networks with CSI overhead considered. As such, maximizing the AEDoF could enable the best tradeoff between multiplexing gain and overhead reduction.

  • For reducing the computational complexity, we propose a graph-based clustering algorithm to solve the formulated AEDoF maximization problem. The basic principle behind our proposed algorithm is to firstly chordalize the given interference graph, and then cluster partial small cells to perform IA, followed by the final sub-channel assignment to all small cells.

The rest of this paper is organized as follows. Section 2 describes the network model and the pilot overhead model with analog feedback for IA. The problem of the AEDoF maximization is also formulated. Section 3 specifies the graph-based algorithm with three sub-phases. The comparison of complexity between exhaustive search and our proposed algorithm is also conducted. Section 4 provides simulation results and performance comparisons. Finally, Sect. 5 concludes this paper.

2 System model

A standard downlink of a small cell network is considered where OFDMA is applied. Without loss of generality, assume that the network is comprised of \(K_{\mathrm {tot}}\) small cells, each small cell serving only one user. When a user is located in the coverage area of small cells, it is served by the SBS from which it receives the strongest power. Each SBS is equipped with M omni-directional antennas while each small cell user (SUE) is equipped with N omni-directional antennas. The transmit power of each SBS is identical. According to the exponential-decayed path loss model [16], we can calculate \(P_{j,i}\), which is the received signal of SUE j from SBS i. Define \(P_{\text {thr}}\) as a threshold: If \(P_{j,i}>P_{\text {thr}}\ (i \ne j)\), the interference is considered as strong; otherwise it can be treated as white noise. In this paper, we mainly focus on eliminating the strong interference by performing IA combined with orthogonal sub-channel assignment. In addition, there is a macro-cell base station (MBS) locating in the center of the network. The users out of the coverage of small cells is served by the MBS. we assume that the macro-cell is pre-allocated with distinct frequency bands, which can not be reused by any small-cell because the basic quality of service (QoS) requirements of macro users must be guaranteed. Thereby this paper takes no account of the cross-tier interference, and the co-tier interference is the only barrier we try to break through. The integral network model is depicted in Fig. 1.

Fig. 1
figure 1

Network model

In order to facilitate the analysis, we can transform Fig. 1 to an interference graph, as shown in Fig. 2. In this graph, vertices \(A\sim J\) denote small cells. If a small cell is strongly interfered by another one, we draw an edge across them. For instance, the user in small cell B locates close to and is interfered by small cell A, C and D, and then the edge AB, BC as well as BD are established.

Fig. 2
figure 2

Interference graph

2.1 IA feasibility

If at least three small cells interfering with each other are selected to perform IA, they can be denoted as an IA cluster. Suppose that each SBS sends d independent data streams to its SUE. The received signal at SUE j is composed of both interference from other SBSs and its intended signal, which can be written as

$$\begin{aligned} {\mathbf {Y}}_j = \sum _{i=1}^{K} \sqrt{\frac{P_f}{d}} {\mathbf {H}}_{j,i} {\mathbf {V}}_i {\mathbf {s}}_i + {\mathbf {Z}}_j, \end{aligned}$$
(1)

where \(K\le K_{\mathrm {tot}}\) is the number of small cells in this IA cluster, \({\mathbf {H}}_{j,i} \in {\mathbb {C}}^{N \times M}\) represents the narrowband channel matrix from SBS i to SUE j, \({\mathbf {V}}_i \in {\mathbb {C}}^{M \times d}\) represents the precoding matrix of SBS i, \({\mathbf {s}}_i \in {\mathbb {C}}^{d \times 1}\) is the data symbol vector of SBS i, and \({\mathbf {Z}}_j \in {\mathbb {C}}^{N \times 1}\) is the circularly symmetric additive white Gaussian noise. The received signal \({\mathbf {Y}}_j\) is filtered by a unitary interference suppression matrix \({\mathbf {U}}_j \in {\mathbb {C}}^{N \times d}\). \(P_f\) is the forward link power.

The conditions for perfect IA can be stated as

$$\begin{aligned}&{\mathbf {U}}^*_j {\mathbf {H}}_{j,i} {\mathbf {V}}_i= \mathbf {0}, \forall j \ne i \end{aligned}$$
(2)
$$\begin{aligned}&\mathrm {rank}({\mathbf {U}}^*_j {\mathbf {H}}_{j,j} {\mathbf {V}}_j)= d, \forall j \end{aligned}$$
(3)

where (2) ensures interference aligned in a reduced-dimensional subspace, while (3) guarantees the required dimensionality for the desired signal space. In general, a generic system of multivariate polynomial equations is solvable if and only if the number of equations does not exceed the number of variables. Therefore, to make IA feasible, we must ensure that the number of equations is no larger than that of variables in (2) and (3). Along this line, a necessary and sufficient condition for the IA feasibility turns out to be [17, 18]

$$\begin{aligned} M+N \ge (K+1)d. \end{aligned}$$
(4)

Note that the maximum value of d is equal to \(\frac{\min (M,N)}{2}\) [19]. In this case, the number of user-pairs in the IA cluster must satisfy

$$\begin{aligned} K \le \frac{2(M+N)}{\min (M,N)}-1. \end{aligned}$$
(5)

2.2 Pilot overhead model for IA

We introduce a typical pilot overhead model as in [4, 20]. Frequency Division Duplex (FDD) is employed, which implies that the forward and reverse links occupy distinct frequency bands. Thus, the forward and reverse links are statistically independent. The IA transmission consists of five steps: forward channel estimation, reverse channel estimation, analog CSI feedback, precoded channel estimation and payload transmission. We adopt a block-fading channel model in which channels remain fixed for a period, but vary independently from block to block. In addition, suppose that \(N_{\text {b}}\) symbols can be transmitted during each block and the IA solution is recomputed at the start of each block.

For forward channel estimation, each SBS broadcasts an orthogonal pilot sequence matrix spanning \(N_{\text {pf}}=KM\) symbols. Then, each SUE estimates the forward channel matrices corresponding to each SBS. Similarly, for reverse channel estimation, each SUE broadcasts an orthogonal pilot sequence matrix over \(N_{\text {pr}}=KN\) symbols to inform SBSs of the reverse channel matrices. For analog CSI feedback, the forward channel estimates are transmitted via the reverse link by each SUE, using unquantized quadrature-amplitude modulation over \(N_{\text {fb}}=K^2 M\) symbols. And for precoded channel estimation, each SBS sends orthogonal pilots along its precoder over \(N_{\text {pd}}=Kd\) symbols. Note that the role of this step is to enable each SUE to learn the precoded channels.

Fig. 3
figure 3

Overhead pilot model

When ignoring the estimation error caused by noise, SBSs could obtain perfect knowledge of CSI after the aforementioned four steps. Then the remaining symbols are reserved for payload transmission, as depicted in Fig. 3. That is to say, the amount of CSI overhead is a quadratic function of the number of user-pairs in an IA cluster. Then the payload ratio can be expressed as

$$\begin{aligned} \alpha =1- \eta K^2- \lambda K, \end{aligned}$$
(6)

where \(\eta =M/N_{\text {b}}\), \(\lambda =(M+N+d)/N_{\text {b}}\). It follows that the values of \(\eta\) and \(\lambda\) are inversely proportional to the length of a block. Since the payload ratio can not be less than zero, the number of user-pairs in the IA cluster must satisfy

$$\begin{aligned} K \le \frac{\sqrt{\lambda ^2+4\eta }-\lambda }{2\eta }. \end{aligned}$$
(7)

2.3 Average effective degrees of freedom (AEDoF)

The degrees of freedom (DoF) indicates the number of dimensions in which data streams can be independently delivered over per sub-channel. It represents the perfect multiplexing gain, yet without considering the transmission delay caused by CSI overhead. Therefore, we propose a new metric, named Average Effective Degrees of Freedom (AEDoF), to incorporate the payload ratio into channel multiplexing gain. The definition of AEDoF is given by

$$\begin{aligned} {\overline{d}}_{\text {eff}}=\frac{1}{K_{\mathrm {tot}} L} \sum _{i=1}^{K_{\mathrm {tot}}} \alpha _i d_i l_i, \end{aligned}$$
(8)

where L is the total number of sub-channels assigned, \(d_i\) is the DoF achieved by small cell i, \(l_i\) is the number of sub-channels assigned to small cell i, \(\alpha _i\) is the payload ratio of small cell i. Note that the payload ratio of each small cell in the same cluster is identical.

By quantifying the negative impact incurred by the overhead, the AEDoF can reflect the network multiplexing gain more accurately, and be regarded as a full combination of the transmission efficiency in time, frequency and spatial domain. It is obvious that the application of IA helps to reduce the number of sub-channels in demand to satisfy users’ requirement, yet with a decline of payload ratio. In the following, we propose a joint IA and sub-channel assignment strategy to maximize the network AEDoF, and it can be proved to be the best tradeoff between multiplexing gain and overhead reduction.

2.4 Problem formulation

Let \({\mathcal {A}}=\{{\mathcal {A}}_1,\ldots ,{\mathcal {A}}_n,\ldots ,{\mathcal {A}}_{N_{\mathcal {A}}}\}\) be the set of all candidate clusters in the interference graph, with the set of their indexes denoted as \({\mathcal {N}}=\{1,\ldots ,n,\ldots ,N_{\mathcal {A}}\}\). According to conditions (5) and (7), the size of a candidate cluster must satisfy

$$\begin{aligned} 3 \le |{\mathcal {A}}_n| \le \min \left\{ \frac{2(M+N)}{\min (M,N)}-1,\frac{\sqrt{\lambda ^2+4\eta }-\lambda }{2\eta }\right\} . \end{aligned}$$
(9)

A cluster combination consists of an arbitrary number of candidate clusters. The set of all cluster combinations is denoted as \({\mathcal {C}}=\{{\mathcal {C}}_1,\ldots ,{\mathcal {C}}_m,\ldots ,{\mathcal {C}}_{2^{N_{\mathcal {A}}}}\}\) with \({\mathcal {C}}_m \subset {\mathcal {A}}\). The optimization target is to maximize the network AEDoF by selecting the optimal cluster combination to perform IA. Note that, the small cells inside the combination perform IA, while those outside the combination are assigned with orthogonal sub-channels (Non-IA).

For the convenience of notation, we introduce an indicator variable \(x_n\) for each candidate cluster \({\mathcal {A}}_n\). Let \(\mathcal {X}=\{x_1,\ldots ,x_n,\ldots ,x_{N_{\mathcal {A}}}\}\) be the set of indicator variables. \(\forall {\mathcal {C}}_m\in {\mathcal {C}}\), the indicator variable of cluster \({\mathcal {A}}_n\) is denoted as

$$\begin{aligned} x_n=\left\{ \begin{array}{rcl} 1, &{} &{} {{\mathcal {A}}_n \in {\mathcal {C}}_m, n \in {\mathcal {N}}}\\ 0, &{} &{} {{\mathcal {A}}_n \notin {\mathcal {C}}_m, n \in {\mathcal {N}}} \end{array} \right. \end{aligned}$$
(10)

Thus, different cluster combinations in \({\mathcal {C}}\) correspond to diverse outputs of \(\mathcal {X}\).

Let \({\mathcal {K}}=\{1,\ldots ,K_{\mathrm {tot}}\}\) be the set of small cells. We assume that each small cell requires \(\min (M,N)\) dimensions for independent data streams; that is, \(\forall i \in {\mathcal {K}}\), \(d_i l_i=\min (M,N)\). Then the optimization problem can be formulated as follows:

$$\begin{aligned}&\max \limits _{\mathcal {X}} \frac{\left\{ K_{\mathrm {tot}}-\sum \limits _{n=1}^{N_{{\mathcal {A}}}} x_n (1-\alpha ^{(n)}) |{\mathcal {A}}_n|\right\} \cdot \min (M,N)}{K_{\mathrm {tot}} L_{\mathcal {X}}}, \end{aligned}$$
(11a)
$$\begin{aligned} s.t. \quad&x_n x_{n'}=0, \forall n \in {\mathcal {N}}, n' \in {\mathcal {N}} \setminus n, {\mathcal {A}}_n \cap {\mathcal {A}}_{n'} \ne \emptyset , \end{aligned}$$
(11b)
$$\begin{aligned}&x_n=\{0,1\}, \forall n \in {\mathcal {N}}. \end{aligned}$$
(11c)

where \(|{\mathcal {A}}_n|\) is the size of cluster \({\mathcal {A}}_n\), \(\alpha ^{(n)}\) is the payload ratio of each small cell in cluster \({\mathcal {A}}_n\), \(L_{\mathcal {X}}\) is the necessary number of sub-channels to satisfy users’ requirement when the indicator vector \(\mathcal {X}\) varies. Specifically, to ensure that each small cell gets \(\min (M,N)\) dimensions for independent data streams, we should assign two sub-channels to each IA cluster and one sub-channel to each non-IA small cell with the assistance of multi-input multi-output (MIMO) [21]. Note that small cells that do not interfere with each other can reuse sub-channels. Constraint (11b) guarantees any two candidate clusters that have intersections can not perform IA simultaneously. Constraint (11c) restricts the indicator variable for each candidate cluster to be binary.

Note that the overhead of IA is a quadratic function of \(|{\mathcal {A}}_n|\), while that of sub-channel assignment approximates to zero. Therefore, \(\alpha ^{(n)}\) can be given by

$$\begin{aligned} \alpha ^{(n)}= {\left\{ \begin{array}{ll} \qquad 1,&{} \text {Non-IA}\\ 1-\eta |{\mathcal {A}}_n|^2-\lambda |{\mathcal {A}}_n|,&{} \text {IA} \end{array}\right. } \end{aligned}$$
(12)

It follows that (11a) is a 0–1 integer programming problem and exhaustive search is a possible approach to solve it. Nevertheless, the process to enumerate all the IA clusters and their various combinations is exponentially complex. In the next section, we will resort to a graph-based alternative, which is of low complexity.

3 Graph-based algorithm

The algorithm can be divided into three phases: A. Chordalize the given interference graph and enumerate all the maximal cliques. B. Select one IA cluster combination with Multistep locally clustering (MLC) sub-algorithm. C. Assign sub-channels to small cells.

3.1 Phase A

First, we introduce several definitions for a graph [22].

Clique: A set of pairwise adjacent vertices.

Maximal clique: A maximal set of pairwise adjacent vertices.

Maximum clique: The maximum set of pairwise adjacent vertices.

Clique number: The maximum size of a set of pairwise adjacent vertices.

Chromatic number: Minimum number of colors to color all the vertices.

Chordal graph: An undirected graph in which any cycle with more than three edges has at least one chord.

In a general undirected graph, the number of maximal cliques is exponential and there exists no polynomial-time algorithm to enumerate them. By contrast, a chordal graph with n vertices has no more than n maximal cliques, which can be enumerated in polynomial time. Furthermore, a chordal graph’s chromatic number is equal to its clique number, and this property is a foundation of our following analysis [22].

In view of this, we can chordalize the interference graph by adding a minimal set of virtual edges firstly, with the classical method named minimal triangulation [23]. Figure 4 depicts a simple case of chordalization, in which AD and CG are two added virtual edges.

Fig. 4
figure 4

A simple case of chordalization

After the chordalization, we can enumerate all the maximal cliques following the triangulating direction in polynomial time.

3.2 Phase B

The chordal graph of an interference graph \({\mathcal {G}}=(\mathcal {V},{\mathcal {E}})\) is denoted as \({\mathcal {G}}^\prime =(\mathcal {V},{\mathcal {E}}^\prime )\), where \(\mathcal {V}\) is the set of vertices, \({\mathcal {E}}\) is the set of edges in the original interference graph and \({\mathcal {E}}^\prime\) is the set of edges in the chordalized graph. We further define \(\omega ({\mathcal {G}}^\prime )\) as the clique number of \({\mathcal {G}}^\prime\).

Candidate clusters are subsets of the maximal cliques. Although we have found all the maximal cliques in polynomial time, the process to substitute all the combinations of candidate clusters to the objective function still has exponential complexity. In this part, we propose a heuristic method termed MLC sub-algorithm with low complexity.

Basically, the MLC sub-algorithm is a reverse recursion method. The candidate cluster combinations explode in a large quantity, while the number of required sub-channels can never be larger than the clique number of \({\mathcal {G}}^\prime\), i.e., \(L \le \omega ({\mathcal {G}}^\prime )\) must hold. That is to say, multiple cluster combinations map one result of the number of required sub-channels. Hence, rather than enumerate all the candidate cluster combinations one by one, we resort to enumerating the number of required sub-channels and then backward deriving the optimal cluster combination. The derivation is described as follows: Firstly, let \(L=\omega ({\mathcal {G}}^\prime )\) and choose one cluster combination through multiple local iterations to drive the max-sum resource requirement down to L. Then let \(L=L-1\) and repeat the previous step, until \(L=0\) or no cluster combination can meet the conditions. Finally, we select the one with the maximum AEDoF over all the cluster combinations which have been chosen at different L. In summary, the MLC sub-algorithm presents an approach to select the optimal cluster combination in \({\mathcal {G}}^\prime\) to perform IA, as illustrated in Algorithm 1.

figure e

Several requisite explanations to Algorithm 1 are given as follows.

  • Resource requirement: Resource requirement refers to the number of sub-channels each node requires; that is, each non-IA small cell requires one sub-channel, while each IA cluster requires two sub-channels. Hence, different IA cluster combinations results in different configurations of resource requirement. In particular, the maximum sum resource requirements over all the maximal cliques is named max-sum resource requirement.

  • Transition graph: If cluster \({\mathcal {A}}_n\) is selected to perform IA, the small cells inside \({\mathcal {A}}_n\) can be integrated into one node and they require two sub-channels in total. This cluster suffers the aggregation interference of its each member. Following the aforementioned rules, we can transform an interference graph \({\mathcal {G}}^T\) to a transition graph \(\varphi ({\mathcal {G}}^T)\). Note that a chordal graph’s transition graph remains chordal [13]. Besides, the maximal clique with the max-sum resource requirement \(R_x\) in \(\varphi ({\mathcal {G}}^T)\) is denoted as \(C_x^T\). Furthermore, we define \(\mathcal {R}\) as the set of candidate clusters which can drive \(R_x\) down to L by performing IA. For example, Fig. 5(a) is a transition graph in which no cluster performs IA; Fig. 5(b) is a transition graph in which \(\{B,C,D,E\}\) and \(\{H,I,L\}\) are selected to perform IA. \(A \sim N\) are labels of small cells and each number surrounded by a circle represents the resource requirement of each node.

  • Weighting: In step 13, \(w_1\), \(w_2\) and \(w_3\) are the non-negative weighting factors for three heuristic features of \(\mathcal {R}(j)\), satisfying \(w_1+w_2+w_3=1\). \(y_1(j)\), \(y_2(j)\) and \(y_3(j)\) are the indicator variables. If \(\mathcal {R}(j)\) has no intersection with other maximal cliques, then \(y_1(j)=1\); otherwise, \(y_1(j)=0\). If \(\mathcal {R}_{j}\) does not increase the sum resource requirement of its neighboring maximal cliques, then \(y_2(j)=1\); otherwise, \(y_2(j)=0\). \(y_3(j)\) is the local AEDoF of \(C_x^T\) when \(\mathcal {R}(j)\) is selected to perform IA.

Fig. 5
figure 5

a Transition graph 1. b Transition graph 2

3.3 Phase C

After the clustering, we need to assign sub-channels to all small cells. The principle behind the assignment is to guarantee that each IA cluster obtains two sub-channels and each non-IA small cell acquires one sub-channel. Meanwhile, any two neighboring nodes in the transition graph must occupy different sub-channels. Mathematically, the assignment can be formulated as a vertex-coloring problem and minimum coloring approach can be leveraged to solve it [22].

After the three phases above, all the small cells are free from interference and the network achieves the maximum AEDoF. Unfortunately, the graph-based algorithm is not always optimal for two reasons. On one hand, the chordlization leads to a conservative allocation of sub-channels. On the other hand, the MLC sub-algorithm is a heuristic search method, which sometimes only enables a sub-optimal solution.

3.4 Complexity analysis

If exhaustive search is used to solve (11a), the complexity reaches \(O(2^{|\mathcal {V}||{\mathcal {E}}|})\). With respect to our proposed algorithm, the complexity analysis consists of three parts. For phase A, the complexity of chordalization is \(O(|\mathcal {V}||{\mathcal {E}}|)\) and that of enumerating maximal cliques is \(O(|\mathcal {V}|)\). For phase B, the MLC sub-algorithm runs in time \(O(|\mathcal {V}|^2)\). For phase C, the sub-channel assignment scheme runs in time \(O(|\mathcal {V}|)\). Taken together, our proposed algorithm has a polynomial complexity of \(O\{|\mathcal {V}|(|{\mathcal {E}}|+|\mathcal {V}|+2)\}\).

4 Simulation results

In this section, we evaluate the performance of our proposed algorithm with different block lengths and small cell densities respectively. In addition, for performance comparison, the following five clustering methods are presented:

  1. 1.

    SM1: Min-WLI—The method in which the interference level is differentiated by weights, with the clustering objective of minimizing the aggregated inter-cluster interference, as described in [7].

  2. 2.

    SM2: Non-IA—The method in which no cluster is selected to perform IA and sub-channel assignment is the only way to cope with interference, as described in [24].

  3. 3.

    SM3: Random—The method in which IA clusters are randomly selected.

  4. 4.

    SM4: Max-DoF—The method in which IA overhead is not in consideration, with the objective to maximize the total DoF of the network, as described in [13].

  5. 5.

    SM5: Max-AEDoF—The method in which our proposed algorithm is applied to maximize the AEDoF.

4.1 System parameters

Consider a typical scenario of small cell networks, consisting of 20 randomly-deployed outdoor picocells, each picocell serving only one user. As in [25], we give some basic parameter settings in our simulation: Each pico station is equipped with four antennas while each user two antennas. The pico transmit power is 24 dBm, with the carrier frequency of 2 GHz. The path loss is give by \(98.4+20\text {lg}(\text {R})\), where \(\text {R}\) is the transmit–receive distance in kilometers. Meanwhile, the shadowing standard deviation is 10 dB, rayleigh fading standard deviation is 10 dB, and noise figure is 9 dB. For each user, the interference threshold \(P_{\text {thr}}\) is set to −98.5 dBm.

4.2 Effects of block length

Since \(\eta =M/N_{\text {b}}\) with M being a fixed value, the effect of block length \(N_{\text {b}}\) can also be evaluated by the different settings of \(\eta\). Figure 6 depicts the different clustering results with different \(\eta\) when applying our proposed algorithm. 20 picocells are randomly deployed in an area of \(3\mathrm {km} \times 3\mathrm {km}\). If we set \(\eta =0.01\), there are two clusters containing six picocells are selected to perform IA, as illustrated in Fig. 6(a). For the identical scenario, no cluster is proper for IA if we set \(\eta =0.05\) , as shown in Fig. 6(b). More clusters bring in more overhead, which results in a decline of payload ratio. Therefore, greedily selecting clusters to perform IA is not the best choice for the AEDoF maximization. Especially when channel matrices are fast-changing [20], the clustering strategy should be more conservative. Simulation results in Fig. 6 demonstrate that our proposed clustering algorithm is adaptive to the variation of block length, and may do help to design network for operators.

Fig. 6
figure 6

a Clustering results with \(\eta =0.01\). b Clustering results with \(\eta =0.05\)

We illustrate the AEDoF obtained with our proposed algorithm versus \(\eta\) with different pico transmit powers, as depicted in Fig. 7. In this part of simulation, 20 picocells are randomly deployed in an area of \(2\mathrm {km} \times 2\mathrm {km}\). As shown in Fig. 7, under the premise that each SUE is covered by at least one picocell, a smaller transmit power results in a better performance. This is due to the fact that the interference would be reduced in this case. When the transmit power is 16 dBm, the AEDoF keeps constant with the increase of \(\eta\). The reason is that the interference is so weak that there is no need to perform IA. When the transmit power is 20–28 dBm, the performance curve is a multi-piece polyline. The AEDoF decreases with the increase of \(\eta\) until IA is no longer beneficial to the AEDoF maximization. Different clustering results correspond to different slopes of the curve. Therefore, once our proposed algorithm is combined with a reasonable power optimization category, the network performance may have a further enhancement.

Fig. 7
figure 7

The AEDoF our proposed algorithm achieves versus \(\eta\) with different transmit powers

Figure 8 compares the spectrum efficiency of five clustering methods versus \(\eta\). In this part of simulation, 20 picocells are randomly deployed in an area of \(2\mathrm {km} \times 2\mathrm {km}\). As observed, with the increase of \(\eta\), SM2 keeps a constant spectrum efficiency due to the fact that there is no picocell performing IA in SM2. SM3 underperforms SM2 due to its blind selection. Both SM1 and SM4 are designed without taking overhead into account. Hence, Even if \(\eta\) increases, their selection results remain unchanged. Indeed, the amount of overhead has great influence on the actual performance. As a result, the performance of SM1 and SM4 linearly degrades with the increase of \(\eta\), until the payload ratio of each selected cluster goes down to zero. Since our proposed algorithm is adaptive to the variation of \(\eta\), the curve of SM5 is a three-piece polyline, corresponding to three different clustering results. Therefore, it is deserved that SM4 underperforms SM2 when \(\eta\) is large, whereas SM5 can never perform worse than SM2. Meanwhile, SM1 has the worst performance among all theses methods because it always selects the clusters with the top size. Thereby the considerable overhead causes a drastic performance degradation.

Fig. 8
figure 8

Comparison among five clustering methods versus \(\eta\)

4.3 Effects of cell density

We illustrate the AEDoF obtained with our proposed algorithm versus area of region in different settings of \(\eta\), as depicted in Fig. 9. The area of region represents the size of square-shaped space where 20 picocells are deployed. A larger area of region means a lower cell density. For each value of area, we take an average over 100 simulation results. As observed, all the curves ascend with the increase of area. This is due to the fact that a lower cell density creates a more interference-free scenario. We can also see that a smaller \(\eta\) results in a better performance, which is consistent with the previous simulation results in Fig. 8.

Fig. 9
figure 9

The AEDoF our proposed algorithm achieves versus area of region with different \(\eta\)

Figure 10 compares the spectrum efficiency of five clustering methods versus area of region, with \(\eta =0.01\). For each value of area, we take an average over 100 simulation results. As anticipated, all the curves ascend with the increase of area. There is no doubt that SM1 achieves a poor performance together with SM3. Moreover, we note that SM2 and SM4 are close to each other in terms of spectrum efficiency. SM2 performs slightly better than SM4 in small areas because the interference is so serious that IA performance is greatly impacted by substantial overhead. Nevertheless, with the increase of area, the performance of SM4 improves faster than that of SM2 due to the fact that the impact of overhead is reduced. Finally, Although SM5 still takes the lead in all these clustering methods, its superiority declines with the increase of area. Especially when the area becomes infinitely large, the spectrum efficiency of all these clustering methods can approximate the theoretical maximum because there is no interference at all in such low cell density.

Fig. 10
figure 10

Comparison among five clustering methods versus area of region

Figure 11 compares the number of iterations between two optimization algorithms: exhaustive search and our proposed algorithm, with \(\eta =0.01\) and area of region being the variable. For each value of area, we take an average over 100 simulation results. Figure 11 verifies that the number of iterations for exhaustive search is too large to run in polynomial time, especially when the area is small. This is because the denser the network is, there would be more available cluster combinations. Rather, the number of iterations of our proposed algorithm keeps polynomial stably, which is consistent with our complexity analysis above.

Fig. 11
figure 11

Comparison of complexity between exhaustive search and our proposed algorithm

Figure 12 depicts the total number of picocells selected to perform IA by our proposed algorithm, with \(\eta =0.01\) and area of region being the variable. For each value of area, we give 100 simulation results, which are all depicted in color-blocks along the Y-axis. The colorbar on the right side of Fig. 12 represents the number of selected picocells. In general, the number of selected picocells decreases with the increase of area, which is consistent with our anticipation. That is to say, to improve the AEDoF of denser networks, more small cells are required to perform IA.

Fig. 12
figure 12

Total number of picocells selected by our proposed algorithm to perform IA

5 Conclusions

In this paper, the joint clustered IA and sub-channel assignment scheme has been studied. First, we presented a new metric to reveal the average DoF of small cell networks named AEDoF, with taking CSI overhead into account. To better quantify the impact of overhead on the AEDoF, a typical pilot overhead model for IA was also introduced. Then we proposed a graph-based algorithm for clustering partial small cells to perform IA, aiming at the AEDoF maximization. The algorithm consists of three phases: chordalization, clustering with MLC sub-algorithm and sub-channel assignment. In the end, we evaluated the impact of block length and small cell density on the performance of our proposed algorithm with simulations. Meanwhile, Performance comparison was also made among several existing clustering methods to show that our proposed algorithm is efficient to achieve the best tradeoff between multiplexing gain and overhead reduction with polynomial complexity. Future work is in progress to study the maximum number of satisfactory users in small cell networks with considering IA overhead.