1 Introduction

Dynamic Spectrum Leasing (DSL) is an approach that smooth the way for practical implementation of cognitive radio (CR). Two methods are developed for DSL. In the first one, PUs temporarily lease their spectrum in exchange for receiving cooperation from SUs [1, 2]. In fact, SUs transmit their own data during leasing periods while in the remaining part of time slots, each SU acts as a relay station and forwards information originated by PUs. As a result, both PUs and SUs achieve higher transmission rates. In the second DSL method, PU receives monetary compensation for leasing its spectrum to SUs [3, 4].

This paper focuses on the second approach for DSL. Game theory is the convenient tool for analyzing PU–SU interactions in DSL. Some works propose non-cooperative game to model DSL method [37]. In these methods, PU defines tolerable interference from SUs as interference cap (IC). PU broadcasts different ICs as a Leader. For each IC, SUs (Follower) adjust their transmission powers in a non-cooperative manner. Nash Equilibrium (NE) is proved to be the outcome of this Leader–follower game. On the other hand, cooperative game is also developed for DSL [8]. It is proposed that SUs bargain over transmission powers subject to a fixed IC.

In this paper, we first consider necessary conditions on PU’s utility to represent its incentive for spectrum leasing. Then, the inefficiency of Leader–follower game is discussed and Nash bargaining is proposed for both PUs and SUs. The Nash bargaining solution (NBS) secures a Pareto efficient outcome. PU announces an interference cap (IC) and SUs adjust their transmission powers in order to avoid aggregate interference higher than IC. PU and SUs iteratively change their IC and powers in order to reach an agreement. To facilitate the distributed development of NBS for DSL, a simple pricing method is used which results in the Pareto optimal solution. The new method is called decentralized bargaining for dynamic spectrum leasing (DB-DSL).

Our approach is different from previous non-cooperative DSL games [37]. The advantage is Pareto efficiency of outcome. The difference between our method and the cooperative DSL game proposed in [8] are threefold: first, PU’s incentive for DSL is considered in terms of a utility function. Second, both PUs and SUs play the bargaining game. Finally, PU dynamically changes its IC to attain better SINR or revenue. If the required resource by SUs is low, PU chooses a lower IC and achieves higher SINR in its regular transmission; otherwise by setting a higher IC, PU would receive remuneration from SUs by trading (sacrificing) some of its SINR in a way that preserves its QoS. As a result, the term Dynamic could be realized for spectrum leasing. In fact, depending on requests from SUs, PUs could provide higher (lower) spectrum and earn much (less) remuneration.

The paper is organized as follows. Section 2 discusses the related work in the literature. In Sect. 3, general assumptions and system model is described. Comparison between cooperative and non-cooperative games’ outcomes is provided in Sect. 4. Section 5 comprises the proposed cooperative game. Numerical analysis is provided in Sect. 6. The relation between proposed method and coalitional games is discussed in Sect. 7 and the paper is concluded in Sect. 8.

2 Related work

Dynamic spectrum leasing (DSL) has been considered in different scenarios. In [9] decentralized spectrum access is proposed. Matching theory is utilized to find the best assignment between negotiating multi-PU multi-SU scenario. As a result, minimum rate requirement of PUs and SUs could be satisfied. In [10] DSL has been implemented for 802.11-based wireless networks. SUs compete with each other to act as a relay for PUs. Access point gets the channel state information (CSI) form all SUs and selects the one that has the best channel to it. The winner SU has also been rewarded to send its own data. In [3], PU announces an interference cap (IC) which represents maximum tolerable interference from SUs. But, PU’s utility reflects concern about efficient spectrum utilization rather than PU’s regular preferences such as SINR or profit. Jayaweera et al. [5] improved their method by considering PU’s utility as an increasing function of IC (and therefore profit). Generalizations to multiple primary systems are provided in [6, 7]. In [37] PU and SUs play a Leader–follower game to maximize their utilities and making proper decisions about IC and transmission powers, respectively. In this game, each player (PU/SU) individually maximizes its utility and Nash Equilibrium (NE) would be the outcome.

Auction theory provides a distributed framework for spectrum leasing. In [11], an auction based algorithm for radio resource management is developed. Secondary system leases unused TV white spaces from spectrum broker in a way that PUs’ QoS is guaranteed. Huang et al. [12] proposed an approach in which PU announces its transmission power and the price of spectrum. Then, each SU submits a bid that has direct relation with its transmission power. Fixing other SUs bids, each SU selects a bid to maximize its utility. Generalization to multi-PU multi-SU scenario is provided in [13]. It is assumed that each SU first chooses the best PU and then selects a bid that maximizes its utility. In [14], multi-winner auction is proposed to lease one spectrum band to a number of SUs with the constraint that their mutual interference on each other must be negligible. Adian et al. [15] utilized auction theory to find the best time and amount of leased spectrum by PUs. Regarding selfish behavior of PU/SUs, the solution of these auction-based approaches is obtained by performing non-cooperative game. So, PU–SU interactions cast as Leader–follower game. Moreover, SU can lease just from one of the PUs and some PUs have low spectrum utilization which is not desirable from the cognitive radio’s ultimate goal.

Nash bargaining [16] was previously applied for spectrum sharing in different scenarios. In [17], resource allocation problem for cellular OFDM-based cognitive radio network is considered. Total transmission power of PUs/SUs is minimized such that required rates by PUs and maximum power constraint in each sub-channel would be met. Both PUs and SUs should send their channel state and power information to the base station (BS) and power minimization is developed in a centralized manner. This approach requires regular signaling between PUs/SUs and BS which imposes unnecessary overhead and complexity. Suris et al. defined cognitive user’s utility as the summation of transmission rates on different OFDM sub-channels [18]. In [19] both channel and power allocation for OFDMA-based CR have been considered. The sum of SUs’ throughputs is maximized subject to power constraints. These requirements ensure minimum SINR for primary transmissions. Each SU can lease just one primary channel. To compute power requirement the pairwise distances between n PUs and m SUs should be determined. Also, the distances between n PUs and BS are required. This \(nm+n\) computed distances are utilized to compute the maximum SU’s power on each primary channel. In this setting, all computations should be repeated as a result of the movement of mobile SUs. Moreover, PUs’ incentive from spectrum leasing is not considered and modeled.

Considering interference zone as a region in which transmissions from other nodes have detrimental effect on a given node, an asymptotic optimal bargaining solution is proposed. Guan et al. [20] develop cooperative based DSL between SUs and PUs. It is proposed that a secondary base station (SBS) collects channel state information of SUs and then bargains with PU. SUs compete with each other and the winner has the role of relay for PU while it is benefited from transmission opportunity. Toroujeni et al. [21] generalize [20] to OFDM networks. In fact, spectrum leasing has been performed both in time and frequency domains. In [8], distributed power control bargaining game is introduced for SUs who are trying to maximize their rates subject to interference constraint on primary system. This interference constraint is fixed and PU’s incentive for spectrum leasing is not justified and considered. Saraydar et al. [22] utilized pricing method to obtain Pareto efficient solution to optimize energy consumption of wireless ad-hoc users. In [23], Azimi developed a special utility for SUs that secures a Pareto efficient outcome for non-cooperative game. In this method, all SUs are treated in the same way and proportional fairness is not guaranteed. In comparison with [23], we are going to develop a cooperative approach to optimize both PU and SU’s utilities.

3 System model

The framework developed in this research consists of a cognitive radio system with multiple PUs and multiple SUs. Each PU has its own channel and temporarily leases the channel to one or more SUs subject to a constraint that total interference from SUs should be lower than interference cap. In the upcoming sections, first signal model is introduced. Then, required conditions on PU and SU’s utilities are considered. Finally, the inefficiency of leader–follower game between PU and SU is investigated and viability of our method are provided.

3.1 Preliminary

It is assumed that there are L PUs who tend to lease their unused spectrum or could tolerate interference without sacrificing their desired QoS. Each PU has an exclusive channel. The set of PUs is denoted by \({\fancyscript{L}}=\{1,2,\ldots ,L\}\). There are K SUs from the set \({\fancyscript{K}}=\{1,2,\ldots ,K\}\). SUs could lease from more than one PUs. The channel gain between one SU and common secondary receiver is denoted by \(H_{i}^{j}\) with i as the index of transmitter \((i \in {\fancyscript{K}})\) and j as the index of leased primary channel \(j \in {\fancyscript{L}}\). Also, the interference channel from ith SU on jth PU’s channels at primary receiver is denoted by \(h_{i}^{j}\). The channel gain between lth PU and secondary receiver is represented by \(g_{l}\). Channel gain between lth primary transmitter–receiver is \(G_{l}\). Table 1 summarizes the introduced symbols throughout this paper. Fading is considered as quasi-static, so the channel gains would be fixed for short periods of time. lth PU broadcasts the maximum tolerable interference from SUs. This parameter is called interference cap (IC) and is denoted by \(Q_{l}\) with \(l \in {\fancyscript{L}}\). In return, PU is monetarily benefited from adapting interference. \(p_{l}\) stands for transmission power of lth PU. Also, \(p_{k}^{l}\) is the transmission power of kth SU on the lth PU’s channel with \(l \in {\fancyscript{L}}\) and \(k \in {\fancyscript{K}}\). Matched-filter (MF) receivers are used at primary/secondary systems. Let \(A_{n}^{l}= h_{n}^{l}\sqrt{p_{n}^{l}}\) is the received signal amplitude at lth primary receiver from nth SU. Also, \(B_{n}^{k}= H_{n}^{k}\sqrt{p_{n}^{k}}\) is the received signal amplitude at kth secondary receiver from nth SU with \(k,n \in {\fancyscript{K}}\). Denoting \(\rho _{n}^{l}\) as the cross correlation between nth SU and lth PU and \(\rho _{n}^{k}\) as the cross correlation between nth and kth SU, the received signals at lth PU and kth SU (on the lth PU’s channel) are given by (1) and (2), respectively:

$$y_{\textit{PU}}^{(l)}= G_{l}\sqrt{p_{l}}c_{l}+\sum _{n=1}^{K} \rho _{n}^{l}A_{n}^{l}b_{n}+\sigma _{p}\eta ^{(p)}$$
(1)
$$y_{\textit{SU}}^{(k)}= B_{k}^{k}b_{k}+\rho _{l}^{k}g_{l}\sqrt{p_{l}}c_{k}+\sum _{n=1,n\ne k}^{K}\rho _{n}^{k}B_{n}^{k}b_{n}+\sigma _{s}\eta _{k}^{(s)}$$
(2)

where \(b_{k}\) and \(c_{k}\) are transmitted symbols by kth SU and PU, respectively. \(\sigma _{p}^{2}\) and \(\sigma _{s}^{2}\) are the variances of noise at the primary and secondary receivers, respectively. Since, noise is considered as AWGN with unit spectral height, the terms \(\eta ^{(p)}\) and \(\eta _{k}^{(s)}\) follow an \({\fancyscript{N}}(0,1)\).

Table 1 A list of symbols

Denoting \(\tilde{A_{k}^{l}}=\rho _{k}^{l} g_{k}^{l}\) as the efficient channel coefficient of the kth SU as seen by the lth primary receiver, the total interference \((I_{l})\) from all SUs to the lth PU could be written as:

$$\begin{aligned} I_{l}=\sum _{k=1}^{K}\tilde{A_{k}^{l}}^{2}p_{k}^{l} \end{aligned}$$
(3)

Similarly, the total interference from all SUs to the kth SU, excluding the PU signal, would be:

$$\begin{aligned} i_{k}=\sum _{n=1, n\ne k}^{K}\left( {\rho _{n}^{k}}H_{n}^{k}\right) ^{2}p_{n}^{k} \end{aligned}$$
(4)

So, the received SINR of kth SU on the lth channel is computed as follows:

$$\begin{aligned} \gamma _{k}^{l}=\frac{\left| H_{k}^{l}\right| ^{2}p_{k}^{l}}{i_{k} +\left( {\rho _{l}^{k}}\right) ^{2}|g_{l}|^{2}p_{l}+\sigma _{s}^{2}} \end{aligned}$$
(5)

The selected ICs by PUs are denoted by \({\mathbf{Q}}=(Q_{1},\ldots ,Q_{L})^{T}\). Furthermore, the transmission powers by jth SU are represented by \({\mathbf{p}}_j=(p_{j}^{1},\ldots ,p_{j}^{L})^{T}\). So, the action vector set is defined by \({\mathbf{a}}=({\mathbf{Q}},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K})^{T}\). The action vector set excluding the kth user is denoted by \({\mathbf{a}}_{-k}\).

3.2 Utility design

The action vector comprises PUs’ ICs and SUs’ transmission powers. PU is monetarily benefited from leasing its spectrum to secondary system. Therefore, the PU’s utility is proportional to capacity (spectral efficiency) leased to secondary system. The utility of lth PU is defined as:

$$\begin{aligned} U_{l}(Q_{l}) = { r }.\log \left( 1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}}\right) \end{aligned}$$
(6)

where r is a coefficient that represents revenue per leased capacity. Also, \(Q_{l}/(G_{l}p_{l}+\sigma _{p}^{2})\) is SINRFootnote 1 perceived by secondary system. PU does not change its transmission power \(p_{l}\) regarding to SUs’ need. So, PU has a guaranteed QoS. From (6), PU’s utility is an increasing function of IC \((Q_{l})\). The following explanation sets some constraints on r.

When PU decides to lease its spectrum, the provided (leased) resource should not affect its regular QoS. Considering IC as \(Q_{l}\), and PU’s transmission power as \(p_{l}\) and the channel gain to PU receiver as \(G_{l}\), the required SINR for the lth PU would be:

$$\begin{aligned} {\textit{SINR}}_{\textit{req.}}=\frac{G_{l}p_{l}}{Q_{l}+\sigma _{p}^{2}} \end{aligned}$$
(7)

Regarding to IC, SUs adjust their powers so the total interference at the lth PU’s channel would be \(I_{l}\). Therefore, instantaneous SINR for the lth PU could be simplified as:

$$\begin{aligned} {\textit{SINR}}_{\textit{ins.}}=\frac{G_{l}p_{l}}{I_{l}+\sigma _{p}^{2}}= \frac{G_{l}p_{l}}{Q_{l}+\sigma _{p}^{2}}\left( 1+\frac{Q_{l}-I_{l}}{I_{l}+\sigma _{p}^{2}}\right) \end{aligned}$$
(8)

So the instantaneous SINR is higher than the required SINR and it could be decreased to obtain better revenue. The only means is the perfect matching between \(Q_{l}\) and \(I_{l}\). In fact, the \(Q_{l}-I_{l}\) plays the role of price on the PU’s utility. So the lth PU utility is improved as:

$$\begin{aligned} U_{l}(Q_{l})&= { r }.\log \left( 1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}}\right) \\&-(Q_{l}-I_{l})\log \left( 1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}}\right) \end{aligned}$$
(9)

The second term in (9) implies that the difference between supply \((Q_{l})\) and demand \((I_{l})\) poses cost on the utility. When none of the SUs utilize the lth PU’s channel \((I_{l}=0)\), the (9) could be rewritten as:

$$\begin{aligned} U_{l}(Q_{l})|_{I_{l}=0}=({ r }-Q_{l}). \log \left( 1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}}\right) \end{aligned}$$
(10)

Since, the utility in (10) should be non-negative, it results in \(r \ge Q_{l}\). Also, \(\bar{Q_{l}}\) is defined as the maximum value for lth PU’s IC, so \(Q_{l} \le \bar{Q_{l}}\). To earn highest revenue per leased capacity, r is considered as equal to \(\bar{Q_{l}}\) and by substituting this in (9), the utility of lth PU would be:

$$\begin{aligned} U_{l}(Q_{l}) = \left( \bar{Q_{l}}-(Q_{l}-I_{l})\right)\cdot\log \left( 1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}}\right) \end{aligned}$$
(11)

From (11), it should be noted that \(U_{l}\) is concave in \(Q_{l}\). More information about utility design in cognitive radio systems can be found in [24].

The price on the PU’s utility represents the revenue achieved by secondary system. So, the second term in (9) namely \((Q_{l}-I_{l})\log (1+\frac{Q_{l}}{G_{l}p_{l}+\sigma _{n}^{2}})\) embodies aggregate utility of SUs. As a result, kth SU’s utility could be written as [5]:

$$\begin{aligned} u_{k}({\mathbf{p}}_{k})={\sum }_{l=1}^{L}(Q_{l}-I_{l})\log \left( 1+\gamma _{k}^{l}\right) \end{aligned}$$
(12)

The first term in (12) emphasizes on the fact that interference should be lower than IC \((I_{l}\le Q_{l})\). Furthermore, rate is embedded in SU’s utility to ensure its incentive to utilize PU’s spectrum. \(\gamma _{k}^{l}\) is the received SINR of kth SU on the lth PU’s channel which is defined in (5). It should be noted that \(I_{l}\) is a function of \(p_{k}^{l}\). By substituting (3) and (5) in (12), the kth SU’s utility would be:

$$\begin{aligned} u_{k}({\mathbf{p}}_{k})&= {\sum }_{l=1}^{L}\left( Q_{l}-{\sum }_{j=1, j\ne k}^{K} \tilde{A_{j}^{l}}^{2}p_{j}^{l}-\tilde{A_{k}^{l}}^{2}p_{k}^{l}\right) \\&\quad \times \,\log \left( 1+\frac{\left| H_{k}^{l}\right| ^{2}p_{k}^{l}}{i_{k}+\left( {\rho _{l}^{k}}\right) ^{2} |g_{l}|^{2}p_{l}+\sigma _{s}^{2}}\right) \end{aligned}$$
(13)

Taking second derivative with respect to \({\mathbf{p}}_{k}\) proves the concavity of \(u_{k}({\mathbf{p}}_{k})\).

4 The viability of cooperative game

Existing spectrum leasing methods relies on the leader–follower game in which PU sets an IC as a leader and SU responds by adjusting its transmission power. This game casts as non-cooperative game. Figure 1 shows the utility space of one PU-one SU case, considering PU and SU utilities which are defined by (11) and (12). For each IC \((Q_{1})\), the SU can respond with different transmission powers \((p_{1})\), leading to a horizontal parabola on the utility space \((u_{1},U_{1})\). Six parabolas corresponding to \(Q_{1}=5,6,7,8,9\) and 10 are plotted. For each parabola, the SU chooses a transmission power \(p_{1}\) to maximize its utility. Connecting these points yields the solid parabola which is called PU’s best response curve. From Fig. 1, it is clear that the maximum utility is reached if PU sets IC \(Q_{1}=6\). So, SU responds by setting transmission power \(p_{1}=2.7\). The outcome of such non-cooperative game is called Nash Equilibrium (NE) at which no user could be better off by unilaterally changing its decision.

Fig. 1
figure 1

PU/SU utility space and the outcomes of non-cooperative game

From Fig. 1, it is obvious that the outcome of the leader–follower game is not on the Pareto boundary. Thus, the utilities for both PU and SU can be improved. However, such improvement cannot be achieved without cooperation between both parties. For example, if point A is the desirable output (which corresponds to \((Q_{1},p_{1})=(10,3.6)\)) SU could reduce its power to \(p_{1}=2\) and be better off (point B). Then, PU changes its IC to \(Q_{1}=9\) and the outcome would be point C. This alternating changes will be terminated at the NE which is the outcome of the leader–follower game. So, the Pareto boundary would be achievable just by introducing cooperation between PU and SU. This behaviour was previously investigated in the Internet pricing game [25]. It was shown that introducing cooperation between Internet service provider (ISP) and users would be beneficial for both parties rather than leader–follower game; while at the first glance they have opposing interests.

We propose pricing method to define cooperation between PU and SU. First, the original Nash bargaining objective function is considered as the product of PU and SU’s utilities. Then, dual decomposition method is utilized to convert the problem into two separate subproblems which are solved by SU and PU. Each player solves its corresponding subproblem by determining the best strategy (power or IC). The price is updated with respect to new strategies and is utilized to solve the subproblems in the next iteration. The optimum values of power and IC are obtained after the convergence of prices. As a result, pricing method forces the outcome of game to satisfy Pareto efficiency criterion. In the following section, we elaborate the distributed implementation of bargaining for DSL.

5 Decentralized bargaining for dynamic spectrum leasing (DB-DSL)

Let \(G=({\fancyscript{L}},{\fancyscript{K}},{\fancyscript{A}}_{l}^{p},{\fancyscript{A}}_{k}^{s},U_{l},u_{k})\) denotes the cooperative game at which \({\fancyscript{L}}=\{1,\ldots ,L\}\) represents the set of PUs and \({\fancyscript{K}}=\{1,\ldots ,K\}\) is the set of SUs. \({\fancyscript{A}}_{l}^{p}\) denotes the action space of lth PU, where \({\fancyscript{A}}_{l}^{p}=[0,\bar{Q_{l}}]\) represents the lth PU’s action set and \(\bar{Q_{l}}\) is the maximum possible IC of PU. \({\fancyscript{A}}_{k}^{s}=[0,\bar{P_{k}}]\) represents the action set of kth SU with \(\bar{P_{k}}\) as the maximum transmission power of kth SU. Both \({\fancyscript{A}}_{l}^{p}\) and \({\fancyscript{A}}_{k}^{s}\) are convex and compact for \(l \in {\fancyscript{L}}\) and \(k \in {\fancyscript{K}}\), respectively. \(U_{l}\) and \(u_{k}\) could be obtained by (11) and (12). Let \({\fancyscript{E}}\) and \({\fancyscript{D}}\) stands for the set of possible agreements and disagreements between players, respectively. So, \({\fancyscript{E}} \cup {\fancyscript{D}}= {\fancyscript{A}}_{1}^{p} \times \cdots \times {\fancyscript{A}}_{L}^{p}\times {\fancyscript{A}}_{1}^{s} \times \cdots \times {\fancyscript{A}}_{K}^{s}\subset {\fancyscript{R}}^{{\fancyscript{K}} \cup {\fancyscript{L}}}\). Defining, S as the set of \((U_{1}(e),\ldots ,U_{L}(e),u_{1}(e),\ldots ,u_{K}(e))\) for all \(e \in {\fancyscript{E}}\) and \(d=(U_{1}({\fancyscript{D}}),\ldots ,U_{L}({\fancyscript{D}}),u_{1}({\fancyscript{D}}),\ldots ,u_{K}({\fancyscript{D}}))\), the bargaining problem [16] is a pair (Sd), where \(S \subset {\fancyscript{R}}^{{\fancyscript{K}} \cup {\fancyscript{L}}}\) is compact and convex, \(d \in S\), and there exists \(s \in S\) such that \(s_{i} > d_{i}\), for \(i \in {\fancyscript{K}} \cup {\fancyscript{L}}\).

Denoting \({\fancyscript{B}}\) as the set of all bargaining solutions, it is described by a function \({ f }: {\fancyscript{B}} \rightarrow {\fancyscript{R}}^{{\fancyscript{K}} \cup {\fancyscript{L}}}\) that maps each bargaining problem \((S,d) \in {\fancyscript{B}}\) a unique element of S. Nash considers four axioms of which Pareto efficiency is the most significant.Footnote 2 It states that players never agree on an outcome when there is a solution in which they all could be better off. According to the Nash’s theorem [16], the unique Nash Bargaining Solution (NBS) is obtained by:

$$\begin{aligned} f(S,d)&= \arg \max {\prod }_{l=1}^{L}(s_{l}-d_{l}){\prod }_{k=1}^{K}(s_{k}-d_{k}) \\&\quad (d_{1},\ldots ,d_{L},d_{1},\ldots ,d_{K}) \preceq (s_{1},\ldots ,s_{L},s_{1},\ldots ,s_{K}) \in S \end{aligned}$$
(14)

The bargaining problem for \(G=({\fancyscript{L}},{\fancyscript{K}},{\fancyscript{A}}_{l}^{p},{\fancyscript{A}}_{k}^{s},U_{l},u_{k})\) is as follows:

$$\begin{aligned}&\max {\prod }_{l=1}^{L}\left( U_{1}(Q_{l})-U_{l}^{\textit{NE}}\right) {\prod }_{l=1}^{K}\left( u_{k}({\mathbf{p}}_{k})-u_{k}^{\textit{NE}}\right) \\&\quad {\mathbf{a}}=(Q_{1},\ldots ,Q_{L},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K}) \\&\quad 0 \le Q_{l} \le \bar{Q_{l}} \quad 0 \le {\mathbf{1}}^{T}.{\mathbf{p}}_{k} \le \bar{P_{k}} \\&\quad U_{l}^{\textit{NE}} \le U_{l}(Q_{l})\quad u_{k}^{\textit{NE}} \le u_{k}({\mathbf{p}}_{k}) \\&\quad I_{l}={\sum }_{k=1}^{K}\tilde{A_{k}^{l}}^{2}p_{k}^{l} \le Q_{l}\quad l \in {\fancyscript{L}}\quad k \in {\fancyscript{K}} \end{aligned}$$
(15)

The last constraint follows from the fact that the total interference \((I_{l})\) from all secondary transmissions to the lth PU should be less than IC \((Q_{l})\). In (15), it is assumed that when players (PUs/SUs) fail to reach an agreement, they choose the outcome of leader–follower (or non-cooperative) game and individually maximize their utilities (namely \(U_{l}^{\textit{NE}}\) for lth PU or \(u_{k}^{\textit{NE}}\) for kth SU). Since the objective function in (15) is concave (see Sect. 3.2), by using concave function ln(.), (15) could be rewritten as:

$$\begin{aligned}&\max {\sum }_{l=1}^{L}ln\left( U_{l}(Q_{l})-U_{l}^{\textit{NE}}\right) + {\sum }_{k=1}^{K}ln\left( u_{k}({\mathbf{p}}_{k})-u_{k}^{\textit{NE}}\right) \\&\quad {\mathbf{a}}=(Q_{1},\ldots ,Q_{L},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K}) \\&\quad 0\le Q_{l} \le \bar{Q_{l}} \quad 0 \le {\mathbf{1}}^{T}.{\mathbf{p}}_{k} \le \bar{P_{k}} \\&\quad U_{l}^{\textit{NE}} \le U_{l}(Q_{l})\quad u_{k}^{\textit{NE}} \le u_{k}({\mathbf{p}}_{k}) \\&\quad I_{l}={\sum }_{k=1}^{K}\tilde{A_{k}^{l}}^{2}p_{k}^{l} \le Q_{l}\quad l \in {\fancyscript{L}}\quad k \in {\fancyscript{K}} \end{aligned}$$
(16)

The Lagrangian associated with the problem (15) would be:

$$\begin{aligned}&L(\varvec{\lambda },\varvec{\nu },\varvec{\mu },Q_{1},\ldots ,Q_{L},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K}) \\&\quad ={\sum }_{l=1}^{L}ln\left( U_{l}(Q_{l})-U_{l}^{\textit{NE}}\right) + {\sum }_{k=1}^{K}ln\left( u_{k}({\mathbf{p}}_{k})-u_{k}^{\textit{NE}}\right) \\&\quad\quad -\,{\sum }_{k=1}^{K}\mu _{k}\left( u_{k}^{\textit{NE}}-u_{k}({\mathbf{p}}_{k})\right) -{\sum }_{l=1}^{L}\nu _{l}\left( U_{l}^{\textit{NE}}-U_{l}(Q_{l})\right) \\&\quad\quad -\,{\sum }_{l=1}^{L}\lambda _{l} \left( {\sum }_{k=1}^{K}\tilde{A_{k}^{l}}^{2}p_{k}^{l} - Q_{l}\right) \end{aligned}$$
(17)

where \(\varvec{\lambda }=(\lambda _{1},\ldots ,\lambda _{L})\), \(\varvec{\nu }=(\nu _{1},\ldots ,\nu _{L})\) and \(\varvec{\mu }=(\mu _{1},\ldots ,\mu _{K})\) are Lagrange multipliers. The dual function for (15) is as follows [26]:

$$\begin{aligned} g(\varvec{\lambda },\varvec{\nu },\varvec{\mu })&= \sup L(\varvec{\lambda },\varvec{\nu },\varvec{\mu },Q_{1},\ldots ,Q_{L},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K}) \\&(Q_{1},\ldots ,Q_{L},{\mathbf{p}}_{1},\ldots ,{\mathbf{p}}_{K}) \\&={\sum }_{l=1}^{L}{\sum }_{k=1}^{K}\Big ( A(\lambda _{l},\nu _{l},\mu _{k}) + B(\lambda _{l},\nu _{l},\mu _{k}) \Big ) \end{aligned}$$
(18)

where

$$\begin{aligned} A(\lambda _{l},\nu _{l},\mu _{k})&= \sup \Big (ln\left( U_{l}(Q_{l})-U_{l}^{\textit{NE}}\right) \Big )+ \lambda _{l} Q_{l} \\&0 \le Q_{l} \le \bar{Q_{l}} \\&-\,\nu _{l}\Big (U_{l}^{\textit{NE}}-U_{l}(Q_{l}))\Big ) \end{aligned}$$
(19)

and

$$\begin{aligned} B(\lambda _{l},\nu _{l},\mu _{k})&= \sup \Big ( ln\left( u_{k}({\mathbf{p}}_{k})-u_{k}^{\textit{NE}}\right) - \lambda _{l} \tilde{A_{k}^{l}}^{2}p_{k}^{l} \\&0 \le {\mathbf{1}}^{T}.{\mathbf{p}}_{k} \le \bar{P_{k}} \\&-\,\mu _{k}\Big (u_{k}^{\textit{NE}}-u_{k}({\mathbf{p}}_{k})\Big ) \end{aligned}$$
(20)

As a result, the dual problem of (15) would be:

$$\begin{aligned}&\min g(\varvec{\lambda },\varvec{\nu },\varvec{\mu }) \\&\varvec{\lambda } \succ 0, \varvec{\mu } \succ 0, \varvec{\nu } \succ 0 \end{aligned}$$
(21)

where \(g(\varvec{\lambda },\varvec{\nu },\varvec{\mu })\) is defined by (18). \(A(\lambda _{l},\nu _{l},\mu _{k})\) is solved by the lth PU and \(B(\lambda _{l},\nu _{l},\mu _{k})\) is solved by the kth SU. The dual problem (21) could be solved in an iterative manner using the gradient projection method [27]:

$$\begin{aligned} \lambda ^{(i+1)}_{l}&= \Big [\lambda ^{(i)}_{l}-\alpha ^{(i)} \Big (\frac{dg(\varvec{\lambda },\varvec{\nu },\varvec{\mu })}{d\lambda _{l}}\Big )\Big ]^{+} \\&= \Big [\lambda ^{(i)}_{l}+\alpha ^{(i)}\Big ({\sum }_{k=1}^{K}\tilde{A_{k}^{l}}^{2}(p_{k}^{l})^{*}-Q_{l}^{*} \Big )\Big ]^{+} \end{aligned}$$
(22)
$$\begin{aligned} \nu _{l}^{(i+1)}&= \Big [\nu _{l}^{(i)}-\beta ^{(i)} \Big (\frac{dg(\varvec{\lambda },\varvec{\nu },\varvec{\mu })}{d\nu _{l}}\Big )\Big ]^{+} \\&= \Big [\nu _{l}^{(i)}+\beta ^{(i)}\Big (U_{l}^{\textit{NE}}-U_{l}(Q_{l}^{*})\Big )\Big ]^{+} \end{aligned}$$
(23)
$$\begin{aligned} \mu _{k}^{(i+1)}&= \Big [\mu _{k}^{(i)}-\xi ^{(i)} \Big (\frac{dg(\varvec{\lambda },\varvec{\nu },\varvec{\mu })}{d\mu _{k}}\Big )\Big ]^{+} \\&= \Big [\mu _{k}^{(i)}+\xi ^{(i)}\Big (u_{k}^{\textit{NE}}-u_{k}({\mathbf{p}}_{k}^{*})\Big )\Big ]^{+} \end{aligned}$$
(24)

where i is the iteration index, \(\alpha _{i}\), \(\beta _{i}\) and \(\xi _{i}\) are sufficiently small positive step-sizes, and \([a]^{+}=max(0,a)\). Here, \({\mathbf{p}}_{k}^{*}\) is the local optimizer of (20), and \(Q_{l}^{*}\) is the local optimizer of (19) for given \(\lambda _{l}^{(i)}, \nu _{l}^{(i)}, \mu _{k}^{(i)}\). Following theorem gives the final result [27]:

Theorem 1

Dual variables \(\lambda _{l}^{(i)}, \nu _{l}^{(i)}, \mu _{k}^{(i)}\) converges to the optimal dual solution \(\lambda _{l}, \nu _{l}, \mu _{k}\) if the step size is chosen such that \(\alpha ^{(i)}\rightarrow 0\), \(\sum _{i=0}^{\infty }\alpha ^{(i)}=\infty\), \(\beta ^{(i)}\rightarrow 0\), \(\sum _{i=0}^{\infty }\beta ^{(i)}=\infty\), \(\xi ^{(i)}\rightarrow 0\), \(\sum _{i=0}^{\infty }\xi ^{(i)}=\infty\).

Proof

The step sizes (i.e. \(\alpha ^{(i)}, \beta ^{(i)}, \xi ^{(i)}\) ) are tending to zero with respect to iteration index (diminishing step sizes). To avoid convergence to a nonstationary point, the sum of step sizes are infinity. As provided in [27], these conditions satisfy the convergence of the result.

By Theorem 1, since the primal problem (15) is convex and the constraint is strictly positive, Slater’s condition for strong duality of primal problem (15) is hold. As a result, the duality gap is zero and the solutions of (19) and (20) for \(\lambda _{l}^{(i)}, \nu _{l}^{(i)}, \mu _{k}^{(i)}\) are global maximizer of the primal problem.

5.1 Mechanism design

Following Eqs. (15) up to (24), we propose two algorithms for lth PU and kth SU actions in DB-DSL. At ith iteration, primary receiver computes total interference from SUs \((I_{l})\) and sends this value in the reverse channel to its corresponding PU. Primary user updates the price \((\lambda ^{(i)}_l)\) then maximizes its net-utility (2.b Algorithm 1) by adjusting a new IC \((Q_{l}^{*})\). Finally, lth PU broadcasts three parameters \((Q_{l}^{*}, I_{l},\lambda ^{(i)}_l)\). Upon the reception of parameters by kth SU, he adjusts his transmission power (2.c Algorithm 2) in order to maximize his net-utility. SU’s welfare quantifies the comparison of cooperative and non-cooperative utilities which is then subtracted by PU’s charged price. These procedures are performed by each PU \((l\in {\fancyscript{L}})\) and SU \((k\in {\fancyscript{K}})\). As a result, the problem (15) is solved in a distributed manner using dual decomposition method. The shadow price \(\lambda _l^{(i)}\) and local parameters such as \(\nu _l^{(i)}\) and \(\mu _k^{(i)}\) forces the outcome of Algorithms 1 and 2 to be the NBS.

figure d

Utilizing feedback channel between primary receiver and SUs, each SU could acquire its interference channel information on PU. Since PU broadcasts instantaneous interference \((I_l)\), kth SU could compute interference by others SUs via \(I_{l,-k}=I_l-\tilde{A_{k}^{l}}^{2}p_{k}^{l}\) which is readily available by knowing its channel state information (using feedback channel from PU). As a result, there is no need to exchange CSI between all SUs and PUs directly. In other words, a regular broadcast of total interference by PUs would be sufficient. It should be noted that a similar method is used at [57].

Aforementioned discussion in Sect. 4 represents the fact that PU could not broadcast a selfishly adjusted shadow price. It should be noted that the NBS preserves Pareto efficiency in which players would be better off rather than individually maximizing their utilities.

6 Numerical results

A scenario with two PUs and two SUs is considered. Randomly generated channel gains between PUs/SUs are shown in Table 2. Noise is assumed to be AWGN with zero mean and unit variance.

Table 2 Channel gains between PUs and SUs

The cross-correlation coefficients are assumed to be the same, \(\rho _{01}=\rho _{10}=1\). Maximum possible ICs of the PUs are assumed to be \(\bar{Q}_{1}=\bar{Q}_{2}=10\) and maximum possible transmission powers of SUs are \(\bar{P}_{1}=\bar{P}_{2}=12\). Bandwidth of SUs are \(W_{1}=W_{2}=1\,MHz\). These assumptions are the same as [5], so that DB-DSL could be compared with non-cooperative game which is proposed in [5].

figure e

Following the proposed DB-DSL method and also non-cooperative game, we can compute the equilibrium strategy profile for SUs and PUs transmission powers. The convergence of powers/ICs are depicted in Fig. 2. All parameters converges to optimum values after ten iterations.

Fig. 2
figure 2

The convergence of PUs’ ICs and SUs’ transmission powers

Decentralized power control provides the opportunity to adjust power with respect to channel gain and also avoiding unnecessary mutual interference on other transmissions. Adjusting powers or ICs in a collaborative manner provides a room for PUs/SUs to attain better utility levels. The result is shown in Fig. 3. It should be noted that the improvement of SU1’s utility is negligible. The reason is that the channel gain between SU1 and PU1 or PU2 is similar and therefore DB-DSL could not improve the result of non-cooperative game.

Fig. 3
figure 3

The utility of PUs and SUs for conventional non-cooperative game and proposed DB-DSL

Although DB-DSL is designed to improve the utility of PUs/SUs, the dependency between PU’s utility and its instantaneous SINR reveals another aspect of proposed method. Transmission powers and ICs are selected to provide a socially optimum outcome. As a result, SUs’ interference on the lth PU’s channel \((I_{l})\) is reduced and instantaneous SINR of PU (8) will be increased with respect to minimum required SINR (7). Figure 4 depicts this fact for the developed scenario.

Fig. 4
figure 4

Required and Instantaneous SINRs of PUs for conventional non-cooperative game and proposed DB-DSL

7 DB-DSL in terms of cooperative games

In the preceded sections, we have developed bargaining game to model PU–SU interactions. Cooperative games casts into two major branches, Coalition games and Bargaining games. Coalition games model the formation of cooperating groups of players. In our model, each PU leases its channel to a number of SUs. As a result, a coalition consists of one PU and some SUs. The members of coalition receives non-negative utilities. Since PUs have exclusive channels and the value of each coalition only depends on its members, the game would be in characteristic form with transferable utility (TU). Two coalition games could be applied for the developed model [28]: Canonical coalitional game and coalition formation game. The former one relies on the fact that the extension of a coalition is never detrimental to the members of coalition and the value of new coalition will be increased. In coalition formation game, a subset of all members form a coalition and extension or reduction of coalition is not beneficial for its members.

As discussed in the next paragraph, our developed system model results in Canonical coalitional games. So, the proposed Nash bargaining game in Sect. 4 would be a viable approach to determine PUs’ ICs and SUs’ transmission powers within the coalition. This allocation provides a proportionally fair solution. Moreover, the utilities of PUs/SUs in the grand coalition are more than individual decision making that is associated with non-cooperative game.

The most important concept in canonical coalitional games is the existence and formation of grand coalition. Shapley provided four axioms for the formation of grand coalition in the canonical coalitional TU games [29]. The most important axiom is superadditivity. This axiom states that the value of a large coalition out of two disjoint coalitions is greater or equal to the sum of values obtained by the disjoint coalitions separately. We provide a simple illustrative example (see Fig. 5.) that shows the proposed model has superadditivity property. Recall that each coalition is formed by at least one PU and a number of SUs, i.e. coalitions such as \(S_1\),\(S_2\) and \(S_3\) could be formed but \(S_4\) never happens. Simple game captures this model. PU is Veto player because he is in every formed coalitions. In [29] it has been showed that the core of a simple game is not empty if there is at least one Veto player. As a result, bargaining could be utilized among the members of grand coalition.

Fig. 5
figure 5

Possible coalitions of PUs and SUs comprising at least one PU and some SUs \((S_1,S_2,S_3)\)

8 Conclusion

In this paper, the concept of DB-DSL is proposed as a new paradigm for spectrum sharing. Departing from previous works which relied on the usage of non-cooperative game, DB-DSL exploits the bargaining method to model PU/SUs interactions. Different from other cooperative spectrum sharing methods, both PU and SUs act as bargainers. The problem is solved in a distributed manner via pricing method. lth PU broadcasts IC \((Q_{l})\), total interference from SUs \((I_{l})\) and price \((\lambda ^{(i)}_l)\), so SUs utilize these parameters and solve their own problems via local information. In fact, PU chooses an IC (based on SUs’ demand) to maximize its profit. On the other hand, SUs choose their transmission powers to utilize unused spectrum in a collaborative manner.