1 Introduction

In recent times there is a significant growth in wireless traffic due to modern hand-held device applications of online video streaming like watching online movie, attending online tutorial classes, watching live telecast of sports and sending online medical reports [1]. These applications require larger bandwidth and higher data rate to maintain the quality of service (QoS). The ever increasing traffic load of wireless users with their larger bandwidth applications restrict the performance of the existing cellular network. This is due to the spectrum crisis and high energy consumption of conventional cellular network where the wireless traffic needs to be routed through the cellular base station (BS) or the core network [2]. As oppose to this, device to device (D2D) communication has been introduced as a new communication paradigm in fifth generation (5G) cellular network where two nearby devices can communicate directly without the involvement of the existing cellular BS. This results better spectrum utilization, enhanced throughput and high energy efficiency [2,3,4]. The data rate of D2D communication is significantly higher in comparison to the cellular network due to the close proximity of the communicating devices. Furthermore, direct communication results better spectrum utilization in comparison to two-hop communication of the cellular network.

There is a significant research on 5G networks with the objective of mitigating spectrum crisis and providing high data rate [2, 5]. The advantage of 5G D2D communication is further extended in cluster based D2D communication as it reduces the traffic of the cellular BS [5], provides better resource utilization at high data rate to a significant number of wireless users. This also results energy efficient system capacity [1, 2, 6,7,8,9]. In cluster based D2D communication a group of mobile terminals (MTs) form a cluster also known as cluster members. A common MT, called cluster-head (CH) present within the nearby proximity (D2D distance) of all the cluster-members serve them through the technique known as D2D multi-casting. This helps to increase the service coverage of the existing cellular network by providing QoS to the edge users by serving them through the CH. Furthermore, spectrum utilization is also enhanced since all cluster members are served through a common multicast channel and thereby making it cost effective [6, 10].

Cluster based D2D communication significantly reduces the load of the cellular BS and enhances the spectrum efficiency. Traditionally the requesting users are grouped based on their requesting content which is delivered via a selected CH through a common multicast channel [4]. However in a real scenario, it is very hard to ensure that users with similar requesting content will coexist in the same geographic region, instead they may be scattered throughout the entire networking coverage. So how to address an efficient clustering technique for the requesting D2D users irrespective of their communication content is a challenging issue which is one of the major motivations of the present work. Furthermore, the performance of D2D multicast significantly depends on appropriate CH selection as it decides service coverage, resource utilization and fair QoS [10]. Therefore how to provide an uniform QoS to maximum number of requesting D2D users with efficient resource utilization is another motivations of this study. In this paper we have addressed an overlay D2D multicasting where the objective is to form clusters such that each cluster admits maximum number of requesting D2D users with uniform signal coverage and efficient resource utilization through considering inter-cluster interference.

1.1 Our Contributions

We have considered a set of requesting D2D pairs in the presence of cellular BS and proposed an overlay D2D multicasting. The major objective is to provide maximum service coverage while maintaining appropriate QoS. This essentially increases the spectrum utilization because independent D2D communications may create severe interference due to limited availability of spectrum resources. Furthermore, instead of considering the content of the requesting D2D pairs as described in [7, 11, 12] we have considered the locations of the MTs to form the clusters with their appropriate CH selection. This helps to increase the service coverage as well as spectrum efficiency since most of the users are served through D2D multicasting. The proposed approach also reduces the number of independent D2D communications. More specifically the main contributions of the paper are as follows:

  • We have developed a framework for location based cluster formation (LBCF) approach in an overlay D2D multicasting. The framework describes the required parameters of the proposed technique. An analytical model based on sparse matrix has been developed to determine an appropriate CH for each cluster. The proposed analytical model considers the locations of the transmitting MTs while selecting the CH in order to provide an uniform signal coverage to the receivers of each cluster.

  • We have proposed an LBCF algorithm to form the clusters using the analytical model of CH selection. Each cluster admits maximum number of requesting users which obtain uniform signal coverage from their CH. The approach becomes cost effective as it reduces the number of independent D2D communications by means of accommodating them through common D2D multicasting hence better spectrum efficiency. The proposed approach also takes care of the inter-cluster interference with appropriate power assignment and resource allocation which leads to a cost effective solution.

  • The performance of the proposed approach has been evaluated through extensive simulations. The entire simulation is performed using network simulator2 (NS2). It has been shown that the proposed approach significantly achieves better service coverage and enhanced network throughput in comparison to the non-cluster approach. This is because a non-cluster approach admits independent D2D communications thereby increases the possibility of interference and hence restricts the service coverage. Furthermore we have also established that the receiving users of each cluster obtain an uniform signal coverage. The fairness of the proposed approach is also established by appropriate sensitivity analysis. Finally the novelty of the proposed LBCF approach is established by comparing with other existing approaches.

The rest of the paper is organized as follows. Related work is described in Sect. 2 followed by the System model in Sect. 3. In Sect. 4 we have described the proposed framework for location based clustering and based on this the proposed analytical model for CH selection is described in Sect. 5. Section 6 describes the proposed LBCF algorithm with flowchart. We evaluate the performance of the proposed approach in Sect. 7. In Sect. 7.2.2 we compare the performance of the proposed approach with the other existing approaches as described in [13, 14]. Sensitivity analysis on the proposed approach is performed in Sect. 7.2.3. Finally Sect. 8 concludes the paper.

2 Related Works

5G cellular network addresses high data rate. However performance of conventional cellular network becomes limited due to increased load of modern wireless traffic and enhanced spectrum crisis. In this context, D2D multicasting has been evolved as a promising approach for achieving high data rate with enhanced spectrum and energy efficiency [2,3,4]. In D2D multicasting requesting users are grouped together on the basis of their similar content to receive their service from a common MT known as CH. However it is very difficult to ensure the content similarity among the users [4]. In [5], a clustering approach for vehicular network has been proposed depending on communication range of the vehicles. However, in such technique addressing stability for the clusters and CH is a challenging concern. Clustering based on the required data of the users is proposed in [7] where the objective is to minimize the overall energy consumption of the network. An energy efficient channel allocation for D2D multicasting is proposed in [11]. Content based D2D clustering has been addressed in [15] where weighted directed graph is designed to measure the physical social preferences for choosing appropriate pair among multiple requesting pairs. Content based clustering has also been addressed in [16, 17]. Many researchers have also addressed D2D multicasting for appropriate CH selection because it influences the service coverage. A two-step algorithm based on Chinese Restaurant Process (CRP) for appropriate CH selection has been described in [18]. Though user mobility and other networking parameters like social attribute, energy transfer rate are considered in this work, the method still suffers from the drawback of content sharing. The authors in [19] have introduced a clustering algorithm based on the channel condition and battery energy of all the users. The objective is to select appropriate CH focusing on the channel quality of the network. Unlike these centralized approaches, a distributed muticast based content sharing has been proposed in [20] to select the CH based on social maximum weight (SMW). They have applied a game theoretic approach to maximize the energy efficiency of the whole network. Similar attention has been given in [21] for multicasting data where D2D clusters are made based on a joint power and channel allocation algorithm. Optimization algorithms have been proposed in [22] to reduce the network’s load through clustering technique by considering the user’s interest for spreading data and the content of shared data. A cluster formation and resource allocation algorithm is designed in [23] based on social awareness. Content based D2D clustering technique has also been addressed in [24] to overcome the worst link delay of multicasting data from BS to users using relay node. Enhanced network throughput is one of the major advantages of D2D communications. Similarly to enhance the data rate, an admission policy based on D2D clustering technique has been proposed in [25] where arrival of new D2D node is taken care by CRP. All these works consider clustering approach based on the type of the requesting content of the communicating users. As oppose to these works we argue for a content independent location based clustering technique. In our work requesting users can be part of the cluster irrespective of their content of communication.

Resource allocation is another major issue of D2D multicasting that is often related to appropriate CH selection. In this context D2D communication may take place either in underlay or in overlay mode [26]. In this context, a CH selection strategy has been addressed using Fuzzy C-means algorithm [27]. The authors in [28] have addressed the issue of load balancing where they have formulated a hedonic coalition game to establish the user’s association with the CH and the BS. Again the practical disadvantage is the issue of homogeneous traffic. In a similar context a power control mechanism is proposed using game theoretic approach for controlling interference between both cellular users and D2D users [29]. We have proposed an overlay based content independent D2D multicasting that judiciously selects an appropriate CH. The objective is to serve maximum number of D2D users. Furthermore, instead of assigning a fixed power we have dynamically determine the power of each CH depending on its service coverage. In this context an appropriate CH selection strategy is proposed irrespective of the geographic locations of the nodes and the content of the communication.

3 System Model

In this section we describe the network model of 5G D2D communication using the concept of clustering technique. Based on that we also present the data rate and throughput computation of the individual user and the entire network.

Fig. 1
figure 1

Network model

3.1 Network Scenario

We consider a network-controlled D2D communications as shown in Fig. 1, where a cellular BS such as evolved-Network-B (eNB) of long term evolution-advanced (LTE-A) network covers multiple mobile terminals (MTs) and thus forming a single cell. Among the MTs the set of m receivers and n transmitters are identified by \(R_x=\{R_1,R_2\ldots R_m\}\) and \(T_x=\{T_1,T_2,\ldots T_n\}\) respectively. The data requirement of different users may be of heterogeneous types such as online video streaming like online movie or watching online sports and simple data downloading. A receiver \(R_p \in R_x\), \(1 \le p \le m\) is interested to establish D2D link with a transmitter \(T_q \in T_x\), \(1 \le q \le n\) of close proximity in order to satisfy their data communication requirement through a suitable peer discovery by the BS [30]. We consider that MTs are smart 5G devices, having enough buffer space [12, 16, 31] that can be used to store the data content for some time interval. Requesting receivers of nearby locations can form a cluster and served by a common MT present within the D2D coverage of the receivers known as CH. In our work the formation of the cluster is considered to be content independent. Let k different clusters are formed and we represent them by the set \({\mathcal {C}}=C_1, C_2,\ldots C_k\) where each cluster \(C_i\) has a CH, \(CH_i\), controlling a set of receivers included in \(\psi _i\), \(1 \le i \le k\). All the receivers in a cluster are served through a common multicast channel and let F such channels are reserved for cluster based communication. We have considered an overlay in-band communication and hence there is no interference between the cellular communication with cluster-based communication

3.2 Data Rate Computation

In every cluster, a CH along with its members form individual D2D links and the CH acts as transmitter for all the members of the concern cluster. Considering the free space propagation model as described in [32], the received power by a cluster member acting as receiver from an active \(CH_i\) with \(P_t\) transmitting power is represented by the following equation:

$$\begin{aligned} P_r=P_t \cdot G_{ip} \cdot d_{ip}^{-\alpha } \cdot |h_0|^2\xi . \end{aligned}$$
(1)

In Eq. (1) \(G_{ip}\) is the power gain factor from \(CH_i\) to a member receiver p, located at a distance of \(d_{ip}\). Note that, \(\alpha\) is the path loss exponent, \(|h_0|\) represents complex Gausian variable (0, 1) of Rayleigh fading and \(\xi\) represents the log normal distribution of shadow fading. Using the above equation the received SINR of the cluster member is given by the following equation:

$$\begin{aligned} \gamma _{ip} = \frac{P_t \cdot G_{ip} \cdot d_{ip}^{-\alpha } \cdot |h_0|^2\xi }{\sigma ^2+I_E}. \end{aligned}$$
(2)

Note that, \(\sigma ^2\) and \(I_E\) is the received noise power and received external interference power on same frequency respectively [33]. In our model the external interference \(I_E=0\) because we have considered orthogonal frequency assignment. The individual data rate (\(\omega _{ip}\)) on band-width B of the concern D2D link is given by Shannon capacity formula and expressed as follows:

$$\begin{aligned} \omega _{ip}=B\log _2(1+\gamma _{ip}). \end{aligned}$$
(3)

Let k number of active clusters are formed and each controlled by a specific cluster-head, \(CH_i\) (\(1 \le i \le k\)) with a sum-rate of \({\mathbb {W}}_i\) in each cluster, then the cluster based network throughput (\({\mathbb {T}}\)) is given by:

$$\begin{aligned} {\mathbb {T}}=\sum _{i=1}^{k}{\mathbb {W}}_i. \end{aligned}$$
(4)

In addition to the cluster based network throughput, using the capacity formula presented in Eq. (3), we compute direct D2D throughput as the sum of the obtained individual data rates. The entire network throughput is the sum of both cluster based throughput and the direct D2D throughput.

4 Proposed Framework for Location Based Clustering

Fig. 2
figure 2

Cluster formation

In this section we present a novel framework for location based clustering technique where the requesting users are included in a cluster irrespective of their content of communication. The main objective is to provide maximum service coverage with a uniform QoS to all the cluster members of each cluster. Furthermore, multicast communications enhance the spectrum utilization because a significant number of users are served through a common multicast channel instead of each involve in a two-hop communications through the BS [6]. This also reduces the load of the cellular BS. Consider that the BS helps to select k number of CHs from n transmitting MTs (\(k<n\)), each of which is capable of covering a well defined set of requesting receivers. Each such CH with their covering receivers form a cluster. The location of of \(CH_i\) must ensure a good channel condition to the BS as well as to the receivers of \(\psi _i\) (\(1 \le i \le k\)) and their corresponding transmitters. This is to confirm path loss fading [17]. Furthermore, each cluster must ensure inter-cluster interference with suitable determination of power \(P_i\) to \(CH_i\) since the assigned power determines the transmission radius [34]. The process of cluster formation in a single region is shown in Fig. 2. As shown, the BS selects one of the transmitters as CH such that all the remaining transmitters of nearby locations can send their data intended for their receivers to the selected CH through uplink communications. Such data will be sent to the corresponding receivers through the downlink communications. Thus, the requirement of multiple D2D links is eliminated for better resource planning. The resultant cluster framework for the entire network shown in Fig. 3. The proposed framework of cluster formation in a specific region must satisfy the following requirements:

  1. 1.

    Selection of a cluster-head \(CH_i\) must ensure an uniform signal quality to the serving set of receivers \(\psi _i\), \(1 \le i \le k\).

  2. 2.

    The transmit power \(P_i\) of \(CH_i\) needs to satisfy the minimum SINR requirement (\(\gamma _{min}\)) to all its associated receivers in \(\psi _i\). In addition to this, the assigned power must provide a disjoint service coverage between the neighboring clusters (as shown in Fig. 3) to ensure the avoidance of inter-cluster interference.

  3. 3.

    The set of D2D links that have not been included into any cluster are considered to operate independently subject to the avoidance of interference.

Advantage of such clustering technique is of two folds such as a better resource planning with less interference due to reduction in direct D2D links and maximum service coverage irrespective of the content.

Fig. 3
figure 3

Framework for cluster

4.1 Possible Use Cases of the Proposed LBCF Technique

D2D multicasting has been evolved as a promising approach to reach spectrum efficiency when a group of users with similar needs coexist in the same geographic region. However realistically it is also possible that users with heterogeneous data requirement are distributed through out the entire networking coverage area. In such situation proposed content independent LBCF approach would be an effective solution. More specifically the possible use cases from the proposal’s perspective are as follows:

  • Spectrum Efficiency The proposed approach is effective when the number of requesting D2D users with high data rate requirement is increased. The performance of traditional cellular network may become limited due to high spectrum scarcity to satisfy such large number of users with appropriate QoS. In such scenario the proposed approach has potential use as it reduces the number of independent D2D communications and thus enhancing the spectrum efficiency.

  • Coverage Extension The set of users present under the weak coverage of cellular BS may not obtain potential data rate through traditional cellular communications. Such users can access the network with required QoS through the proposed D2D multicasting. This is due to the fact that the proposed LBCF takes care of an appropriate CH selection with the objective of providing uniform service coverage to the receivers included in the same cluster. This way the coverage area of network accessibility for the traditional cellular network is enhanced.

  • Content Independency The proposed LBCF can have a potential impact on a networking scenario where users need of data requirement may vary in the same geographic region. In this case content based D2D clustering may not become much effective to provide significant network coverage. Unlike such case the proposed technique considers the locations of the MTs for appropriate CH selection to enhance the service coverage.

In addition to the above scenarios the proposed technique has an effective applications in network offloading of the cellular BS.

5 Proposed Analytical Model for CH Selection

Based on the proposed framework mentioned in Sect. 4, we develop an analytical model using sparse matrix for CH selection from a set of transmitting MTs of the requesting D2D pairs. The proposed model considers the locations of the MTs to select appropriate CH ensuring appropriate signal range from the cellular BS as well as provides uniform service coverage to a well defined set of receivers. Let us consider N number of requesting D2D pairs with \(r_1, r_2 \ldots r_{{\mathbb {N}}}\) paring distances. The maximum value of these distances, \(r_{max}\) is given by \(\displaystyle r_{max}=\max {\{r_1, r_2 \ldots r_{{\mathbb {N}}}\}}\). Considering \(d_{max}\) is the maximum allowable D2D pairing distance, we compute the threshold distance (\(d_{th}\)) by the following equation:

$$\begin{aligned} d_{th}=d_{max}-r_{max}. \end{aligned}$$
(5)

In order to construct a cluster and to find out a CH, a specific transmitting MT is selected, we call it as reference transmitter. The set of transmitters present within \(d_{th}\) distance of the reference transmitter is determined. Let, the total number of such transmitters is \({\mathbb {N}}\) including the reference transmitter. We then define an upper triangular sparse matrix \({\mathcal {D}}= (d)_{{\mathbb {N}} \times {\mathbb {N}}}\) where non-zero elements represent the intermediate distances among them represented as follows:

$$\begin{aligned}{\mathcal {D}}= \left( \begin{array}{ccccccc} 0 &{}\quad d_{12} &{}\quad d_{13} &{}\quad d_{14} &{}\quad . &{}\quad . &{}\quad d_{1{\mathbb {N}}} \\ 0 &{}\quad 0 &{}\quad d_{23} &{}\quad d_{24} &{}\quad . &{}\quad . &{}\quad d_{2{\mathbb {N}}} \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad d_{34} &{}\quad . &{}\quad . &{}\quad d_{3{\mathbb {N}}} \\ . &{}\quad . &{}\quad . &{}\quad . &{}\quad . &{}\quad . &{}\quad .\\ . &{}\quad . &{}\quad . &{}\quad . &{}\quad . &{}\quad . &{}\quad .\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad . &{}\quad . &{}\quad 0 \end{array} \right) \end{aligned}$$

From matrix \({\mathcal {D}}\) we derive row-sum vector (\({\mathcal {V}}_{row}^{\mathcal {D}}\)) and column-sum vector (\({\mathcal {V}}_{col}^{\mathcal {D}}\)), where each element from \({\mathcal {V}}_{row}^{\mathcal {D}}\) and \({\mathcal {V}}_{col}^{\mathcal {D}}\) represents row-sum and column-sum for each row and column of \({\mathcal {D}}\) respectively. For each element of \({\mathcal {V}}_{row}^{\mathcal {D}}\) and \({\mathcal {V}}_{col}^{\mathcal {D}}\) the average value (\(a_j\)) of row-sum and column-sum is computed and the transmitter with the minimum value (\(\min {\{a_j\}}\), \(1 \le j \le {\mathbb {N}}\)) is selected as CH. The receivers are included into the cluster. We demonstrate the proposed model of CH selection with a small example.

Fig. 4
figure 4

Example for cluster-head selection

Example 1

Consider seven transmitters as shown in Fig. 4. Let \(T_1\) be the reference transmitter and the rest of them are within \(d_{th}\) distance of \(T_1\). The computed distance based matrix is given by:

$$\begin{aligned}{\mathcal {D}}= \left( \begin{array}{ccccccc} 0 &{}\quad 2.11 &{}\quad 1.21 &{}\quad 1.48 &{}\quad 2.69 &{}\quad 2.88 &{}\quad 3.71 \\ 0 &{}\quad 0 &{}\quad 2.91 &{}\quad 2.05 &{}\quad 2.11 &{}\quad 3.31 &{}\quad 3.52 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 1.22 &{}\quad 2.53 &{}\quad 2.08 &{}\quad 3.11 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 1.34 &{}\quad 1.5 &{}\quad 2.24 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 1.5 &{}\quad 1.41 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 1.12 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ \end{array} \right) \end{aligned}$$

Note that we consider the coordinate positions of the transmitters to compute the above distance matrix. Based on the computed upper triangular sparse matrix we compute the row-sum and column-sum vector as follows:

$$\begin{aligned}&{\mathcal {V}}_{row}^{\mathcal {D}}= \left( \begin{array}{c} 14.08 \\ 13.9 \\ 8.94 \\ 5.08 \\ 2.91 \\ 1.12 \\ 0 \\ \end{array} \right) \\&{\mathcal {V}}_{col}^{\mathcal {D}}= \left( \begin{array}{c} 0 \\ 2.11 \\ 4.12 \\ 4.75 \\ 8.67 \\ 11.27 \\ 15.11 \\ \end{array} \right) \end{aligned}$$

The average values of row-sum and column-sum are listed in Table 1. In this case \(T_4\) is selected as CH because \(a_4\) has the minimum value. All the receivers are included in the cluster controlled by \(T_4\).

Table 1 Table of average values

6 The Proposed Location Based Cluster Formation (LBCF) Algorithm

Considering the framework for location based clustering described in Sect. 4 and the proposed analytical model described in Sect. 5, we design our LBCF algorithm. The basic objective is to provide maximum service coverage with optimum resource utilization irrespective of the content of communications. An appropriate CH is selected from the set of transmitting MTs \(\{T_1, T_2,\ldots ,T_n\}\in T_x\) which are interested in D2D communication to their corresponding receiving links

\(\{R_1,R_2,\ldots R_m\}\in R_x\). The location information of such D2D links already exists at the controlling BS. In this context we consider, \(d_{max}\) as the maximum distance within which the interested communicating peer may involve in D2D communication. The proposed LBCF divides the MTs in the network into k number of clusters \(\{C_1,C_2,\ldots C_k\} \in \mathcal {C}\), controlled by \(CH_1\), \(CH_2\),...,\(CH_k\) respectively such that each \(CH_i \in C_i\) (\(1 \le i \le k\)) is at the uniform signal coverage from all the transmitters whose receivers are included in the same cluster and thereby leading to an uniform delay while buffering the data. We consider an overlay based communication where the reserved channel is further divided into two parts. While one part is used for serving a set of users through cluster based D2D communications represented by the set \({\mathcal {N}}_c\) and the other is used for serving another set of users by the direct D2D communications which has not been included in any cluster also represented by the set \({\mathcal {N}}_d\). This way the interference between the users of \({\mathcal {N}}_c\) and \({\mathcal {N}}_d\) is eliminated. Let \(\eta\) be the set of requesting D2D links with pairing distances \(d_1, d_2,.., d_n\) from which BS computes \(r_{max}\), such that \(r_{max}=\max {\{d_1, d_2,\ldots d_n\}}\). Using \(r_{max}\), and taking a reference transmitter \(T_p \in T_x\), the CH is selected along with the receivers of the constructed cluster is formed based on the analytical model described earlier in Sect. 5. Note that each time a cluster is formed its service coverage must be disjoint to the already constructed cluster(s) if any with a suitable power tuning as explained in Sect. 4. This way inter-cluster interference is avoided. When all possible clusters are constructed, the set of unserved D2D links outside the clusters represented by \(\eta '\) are considered to operate in a conflict-free way using a frequency from the reserved channel set. We represent such links by the set \({\mathcal {N}}_d\). The formal representation of the proposed approach is presented in Algorithm 1. The flowchart of the overall algorithm is presented in Fig. 5. Note that the notations used in the flowchart and the algorithm with their meaning are presented in Table 2.

Fig. 5
figure 5

Flowchart of LBCF algorithm

Table 2 Notations with their meaning
figure c

6.1 Complexity of the Proposed Algorithm

We derive the time complexity of the proposed algorithm by considering the key steps. Let m number of clusters are formed from n number of requesting links. The outer loop denoted by the while statement will execute a maximum number of m times. Computing the maximum of the pairing distance among n number of links requires a complexity of O(n). \(Step\;4\) introduces a complexity overhead O(mn) to select the transmitter of \(d_{th}\) distance. When number of such transmitters is more than 3 the algorithm formulates a suitable cluster in else part of \(Step\;12\). Formulation of sparse matrix results a complexity overhead \(O(mk'^{2})\), where \(k'\) is the total number of transmitters considered in a cluster. In reality \(k'<n\) and varies at different instance of executions, therefore the algorithm executes with a worst case time complexity of O(mn).

7 Performance Evaluation

In this section we first describe the simulation environment followed by the evaluation of the proposed clustering technique. Next the proposed approach is compared with other existing approaches namely 0.2 Force D2D and 0.4 Force D2D as described in [13].

7.1 Simulation Environment

We have performed the entire simulations in Network simulator 2 (NS2). NS2 is a discrete event simulator where programming is written in TCL language [35]. It is mostly used for developing realistic model of networking environment. In this work 5G D2D communication system is modelled on NS2. We consider a single cell controlled by a BS. MTs are placed uniformly in an area of \(500 \times 500\) square meters [32]. The number of MTs may vary from 50 to 300. An overlay based D2D communication is considered where the number of reserved channel is fixed at 3 and is not shared with existing cellular communications [36]. In this context a free space propagation path loss model with Rayleigh fading is considered [32]. We have assigned the power dynamically for cluster based D2D communications and for direct D2D communications a fixed power of 10 dBm is considered [32]. The other different parameter values are mentioned in Table 3.

Table 3 Parameters list

7.2 Simulation Results

The performance of the proposed approach is evaluated in terms of several networking parameters such as service coverage, obtained sum-rate, obtained SINR to the users and the energy consumption. Note that the sum-rate is computed by the method described in Sect. 3.2. The results presented are the average value of 10 different runs of simulations.

7.2.1 Performance Evaluation of the Proposed Approach

Fig. 6
figure 6

Relationship between the number of communications and number of requesting pairs

Fig. 7
figure 7

Relationship between the obtained sum-rate and number of requesting pairs

Providing maximum service coverage is one of the objectives of the proposed cluster based approach. To establish the same we measure the number of successful D2D communications against different number of requesting pairs and compare it with an approach that relies on direct D2D communications. We vary the number of requesting pairs from 25 to 150 and the available number of frequency is fixed. It is observed in Fig. 6, when the number of requesting pair is less, the performance in terms of admitted successful pair is comparable for both the approaches however the proposed approach surprisingly outperforms the non-cluster approach when the density of requesting pairs is significantly higher. Less number of requesting pair causes very few number of clusters to form and most of the communications are admitted on direct pairing hence comparable performance with non-cluster approach that relies on direct D2D communication only. When the density of requesting pair increases the proposed approach forms disjoint clusters accommodating maximum number of served users in each of them while avoiding inter-cluster interferences with appropriate transmission power to the CHs. This results significantly improved performance in comparison to the non-cluster approach that suffers from more interference on increased user density due to limited availability of spectrum resources. In a similar reason, same performance trend is observed in Fig. 7 when we measure the obtained sum-rate on the above mentioned simulation parameters. Note that Fig. 6 also establishes that the proposed approach is cost effective because to reach to the similar standard of service coverage through non-cluster approach it would require more spectrum resource.

Fig. 8
figure 8

Relationship between different types of communications and number of requesting pairs

The proposed location based clustering causes improved service coverage with cluster based communication for a suitable number of requesting users as well as allowing some direct communications to a set of D2D pairs which have not been included in any cluster. Note that proposed approach considers interference while admitting cluster based D2D or direct D2D. In Fig. 8, we plot these two types of served users against different number of requesting pairs as shown. We observed that initially when the number of requesting pair is less the resultant direct communications dominates the cluster based communications. The number of cluster based D2D communications significantly increases in comparison to the direct communications when the density of the requesting pairs increases. Less number of requesting pairs in a given service area restricts the number of clusters as the MTs are scattered through out the entire region and hence causes more number of direct D2D communications with the available channels. Increased user density restricts direct D2D communications due to spectrum scarcity but the proposed approach enhances the coverage by satisfying them through cluster based communications. It is also important to note that unlike the direct D2D communications the number of cluster-based communications increases with the increase in number of requesting pairs. Hence the proposed approach provides better service coverage by satisfying communication requests through cluster-based and direct D2D communications.

Fig. 9
figure 9

Relationship between average SINR received from CH of each cluster and clusters for proposed approach

Fig. 10
figure 10

Relationship between obtained sum-rate and SINR threshold for proposed approach

The objective of the proposed cluster based communication is not only to provide better service coverage but also to maintain QoS by providing uniform signal coverage to the receiving members of the clusters. In Fig. 9, we plot the average SINR received by the members of each cluster for 250 MTs. As observed in the figure that in all 19 clusters the average SINR received from each CH to their members are uniform as well as exceeds a certain threshold value sufficient for receiving quality data. This also establishes the fact that the proposed approach appropriately selects the CH of all the clusters.

We have considered three different number of requesting pairs to measure the obtained aggregate sum-rate from all the members of active clusters for varying SINR threshold values. As observed in Fig. 10, for all three different number of requesting pairs increasing the SINR value also increases the aggregate sum-rate obtained from the members of all clusters. Proposed clustering technique dynamically determines the assigned power to the cluster-head (CH) of each cluster to satisfy the required SINR threshold of their members. Furthermore, LBCF results an appropriate set of cluster based communications to increase the coverage while respecting the interference. The increased SINR results an increased data rate to the members of the clusters resulting a higher aggregate sum-rate. It is important to note that for a fixed SINR, the obtained aggregate sum-rate is increased with increased number of requesting pairs. Though increasing requesting pairs increases the network load but the proposed approach suitably accommodate them into different clusters to serve them and hence increased throughput. This surely justifies the effectiveness of the proposed approach.

Fig. 11
figure 11

Relationship between different types of communications and physical proximity for proposed approach

Fig. 12
figure 12

Relationship between aggregate power and requesting power for different approaches

Next, we conduct simulations to count the number of communications by varying the physical proximity threshold of D2D communications from 30 to 80 meters. We have considered two different set of requesting pairs 50 and 100 for the same and count the number of successful direct D2D communications and cluster based communications. It is observed in Fig. 11 that for both types of requesting pairs, increasing the physical proximity threshold results an increased number of cluster based communications which is opposite to that of direct communications. Number of direct communications decreases with the increase in proximity threshold. This is because increased proximity threshold also increases the size of each cluster covering increased number of receiving terminals and hence increase in the number of cluster-based communications. Whereas, the number of direct communications supersede the number of cluster based communications when the physical proximity threshold is less due to reduction in the size of the cluster.

The energy consumption of a communication depends on the assigned power of the transmitter. We compare the aggregate allocated power from the proposed approach with the non-cluster approach against different number of requesting pair. We measure the allocated aggregate power from two different approaches by varying the number of requesting pair from 25 to 150. It is observed in Fig. 12 that the proposed approach requires much less power than that of the non-cluster approach for all the requesting set of communications which is surely beneficial from the perspective of energy consumption. Instead of making the assigned power fixed as happens in the non-cluster approach the proposed approach dynamically determines the power of CH to accommodate a suitable number of cluster members. It significantly demonstrates that a judicial determination of transmitting power leads improved performance in terms of energy consumption observed from Fig. 12 even by serving more number of successful pair as shown in Fig. 6, described earlier.

7.2.2 Comparison with Existing Approach

In this section we compare the performance of the proposed clustering technique with the approach known as Force D2D as explained in [13] as well as with work described in [14]. In [13], the number of communicating users is obtained by multiplying the total number of users with a parameter known as direct communication probability (\(\rho\)). Considering one of the users as CH, the rest of the users are distributed randomly within a fixed distance from the concern CH, also known as cluster-radius. The user which does not communicate locally with probability \(1-\rho\) is allowed to operate in normal cellular mode. We compare such existing approach with our proposed LBCF approach from the perspective of formulated clusters, number of served cluster-based users and the obtained throughput on similar parameters. To do so, against two different probability values such as \(\rho =0.2\) (0.2 Force D2D ) and \(\rho =0.4\) (0.4 Force D2D ) [13], represented as Proposed clustering scenario 1 and Proposed clustering scenario 2 are considered respectively for our LBCF approach. As explained in [13] cluster radius is considered to be 25 meters. In our case the cluster size is selected dynamically. The simulations are made on the same user distributions and hence the distributions considered in [13] for \(\rho =0.2\), is considered as \(scenario\;1\) of the proposed approach and similarly for \(\rho =0.4\) we consider \(scenario\;2\) of the proposed approach. The two approaches differ in formation of CH. While the proposed approach dynamically selects the CH, the approach described in [13] selects the CH at fixed positions. All simulations are taken by averaging the results of 10 different runs of the same.

Fig. 13
figure 13

Relationship between number of D2D clusters and number of users for different approaches

In Fig. 13, we plot the number of formulated clusters against different number of users distributions. We vary the number of users from 40 to 80 and count the average number of formulated cluster for two different scenarios namely 0.2 Force D2D and 0.4 Force D2D and compared it with Proposed clustering scenario 1 and Proposed clustering scenario 2 respectively. It is found that in most of the cases the number of formulated clusters is increased with increase number of users as expected. It is observed from Fig. 13, for a given number of users increasing the value of \(\rho\) from 0.2 to 0.4 significantly increases the number of formulated clusters and it dominates proposed approach in the case of \(scenario\;2\) however such increased number of cluster may not result an increased performance in terms of number of served users as observed in Fig. 14.

Fig. 14
figure 14

Relationship between cluster based served users and number of users for different approaches

Fig. 15
figure 15

Relationship between average cluster based sum-rate and number of users for different approaches

Fig. 16
figure 16

Cluster based served users vs requesting pair for different approaches

It is clear from from Fig. 14, the proposed approach in both scenario 1 and scenario 2 surprisingly outperforms 0.2 Force D2D and 0.4 Force D2D respectively in terms of service coverage by satisfying more requesting users. In cluster-based approach increased number of clusters may also result an increased delay overhead. The proposed approach dynamically formulates the clusters where the selection of the CH is not fixed beforehand, instead it is also dynamic with the objective of improving the service coverage. Such dynamic formulation of clusters with judicious selection of CH results improved number of served users. Hence in Fig. 14, it is observed that in both scenario 1 and scenario 2 the proposed approach significantly outperforms 0.2 Force D2D and 0.4 Force D2D respectively in terms of number of served users for the same simulation parameters.

In Fig. 15, we also compare the performance in terms of obtained throughput by considering same simulation parameters. It is important to note that the proposed approach in both the scenarios result improved sum-rate in comparison to 0.2 Force D2D and 0.4 Force D2D respectively in all cases. Improved number of served users as evident from Fig. 14 results improved throughput in terms of obtained sum-rate as expected.

We also compare our work with existing ’interference aware clustering technique’ described in [14]. In this work the authors have addressed a clustering approach to include maximum number of served users considering inter-cluster interference. However, unlike our approach they have chosen the CH randomly and the member are included or excluded depending on the interferences. In order to compare the performance we have measured number of cluster based served users and obtained sum-rate for different requesting pairs.

In Fig. 16, we compare the performance in terms of service coverage. We observe that the proposed approach is fairly comparable when the number of requesting pair is less and it outperforms the existing interference aware clustering when the density of the requesting pairs is increased. Our approach judiciously selects the CH and forms the cluster to improve the service coverage. It is evident from Fig. 16 that our proposed LBCF provides better service coverage than the existing interference aware clustering technique. Similar performance is noticed in Fig. 17 where we plot the cluster based sum rate for the same number of requesting pairs. This surely justifies the novelty of the proposed approach specially in dense environment.

Fig. 17
figure 17

Cluster based served users versus requesting pair for different approaches

7.2.3 Sensitivity Analysis

In order to establish the fairness of the approach, we conduct a sensitivity analysis on our proposed LBCF. Sensitivity analysis is an effective tool widely used to observe how the variation of an input parameter influences a specific output under certain predetermined conditions [37]. Let us denote \(\beta\) as sensitivity parameter which is defined as follows [38]:

$$\begin{aligned} \beta = \frac{\%\; \text {of}\; \text {change}\; \text {in}\; \text {output}}{\%\; \text {of}\; \text {change}\; \text {in}\; \text {input}}. \end{aligned}$$
(6)
Table 4 Sensitivity analysis on different performance metrics for changing input parameter

In this context we have performed sensitivity analysis by considering number of requesting pairs as input and number of communications, obtained sum-rate and the average power consumption as the outputs shown in Table 4. We present the results of the sensitivity parameter for an increased variations of input from \(100\%\) to \(500\%\). The corresponding simulations have already been represented in Figs. 6, 7 and 12 respectively of Sect. 7.2.1. From the insight analysis of the results presented in Table 4, we conclude that the respective values of \(\beta\) changes in a quite proportional manner for all three outputs. It is important to note, for all three outputs, \(\beta\) is computed by varying the number of requesting pairs while keeping the other input parameters such as proximity threshold and SINR threshold constant. The results presented in Table 4 significantly implies that our proposed approach is fair enough as no such abrupt variation in output is noticed with the variations of input.

8 Conclusion and Future Work

In this paper a content independent location based clustering technique has been proposed in order to maximize the service coverage with improved throughput. Since the communicating users may be distributed through out the entire covering region, the performance of content based clustering technique may become limited. This is because in such cases clusters are mostly formed by considering the users of similar needs of data requirement. Our proposed LBCF approach takes care of this issue by including a group of receivers which may be with different data requirement under the same cluster. In addition to this a novel analytical model based on sparse matrix for appropriate CH selection is formulated. The objective is to provide uniform signal coverage to all the receivers under its control. It has also been established through extensive simulations that the proposed LBCF algorithm results enhanced system throughput and maximum service coverage in comparison to the non-cluster approach. We have also shown that our approach is highly comparable with other existing approaches from the literature. The robustness of the same has also been established through a sensitivity analysis considering different input variations.

The proposed LBCF algorithm clusters the requesting MTs irrespective of their content. This essentially eliminates the drawbacks of the content based D2D clustering by reducing the number of independent D2D communications and thereby enhancing spectrum utilizations. However, there may be a need of appropriate scheduling so that users within a cluster can receive their content through a common multicast channel. Such scheduling policy may influence the obtained network performance. Furthermore the availability of the number of overlay channels will put a limitation on the performance of existing underlay cellular communication. We consider these limitations as our challenging future concern in this direction.