1 Introduction

One of the fundamental tasks performed frequently by ad hoc networks is congestion control, whose main objective is to regulate the allowed source rates so that the total traffic load on any link does not exceed its available capacity. At the same time, the data rates that can be achieved for wireless links depend on the interference levels, which in turn are determined by the power control and spectrum allocation policies. It is well known that both power and wireless spectrum are scarce resources in wireless ad hoc networks. To improve the spectrum utilization significantly, cognitive radio (CR) has been recently proposed as a smart technology which allows detecting vacant spectrum holes and jumping into them [1, 2], which is the primary motivation of the use of cognitive radio networks. In [34, 35], cognitive cross-layer architecture for the cognitive ad hoc networks was proposed and many new and challenging problems regarding the design of CR systems are formulated. Thus, in cognitive ad hoc networks with multicarrier communications (e.g., orthogonal frequency division multiple access (OFDMA)), which allow multiple users to transmit simultaneously on different subcarriers, it is a serious challenge how to efficiently allocate the available spectrums detected by spectrum sensing to different subcarrier links of cognitive radio inter-/intra-users in a distributed fashion, and how to mitigate congestion of links while maintaining a higher utility of subcarriers. This is one of motivations of our work.

Another motivation of our work is that unlike in wired networks where links are disjoint resources with fixed capacities, in ad hoc wireless networks the link capacities are “elastic”, which depends on transmission power and frequency spectrum since wireless channel is a shared medium and interference-limited. The third motivation of our work is that because of multi-path routing induced by adding spectrum allocation, the objective functions are non-convex and non-separable, which is also very difficult to obtain their optimal solutions. However, the case exists extensively in ad hoc networks.

Congestion control has been widely studied as a network utility maximization (NUM) problem, and much of existing research on NUM in wireless networks focuses on the jointly optimal transport layer congestion control and media access control (MAC) layer contention control and channel assignment [37, 11, 12]. They consider joint design of congestion control and contention control to find optimal end-to-end source rate at the transport layer and per link persistence probabilities at MAC layer to maximize the aggregate source utility. In [33], the utility-lifetime tradeoff in wireless sensor networks (WSNs) was studied by optimal flow control in a more practical way by taking into account link congestion and energy efficiency in their network model. But they do not consider the impact of transmission power on congestion control and channel utilization.

To address the issue of the jointly optimal congestion control and per-link power control, more recently, in [8, 9], the problem of joint congestion control and resource allocation was considered in wireless networks. By jointly considering contention control problem, Gürses in [10] proposed a cross-layer optimized congestion, contention and power control algorithm for transport, MAC and physical layers respectively by adopting the generalized network utility maximization (GNUM) approach in ad hoc networks. However, those results in [810] are only feasible for the network model with fixed nodes and fixed single-path routing, where the NUM problem is convex and can be decomposed into two separate maximization problems.

Besides the interaction between congestion control and power control, the influence of spectrum allocation on congestion control cannot be neglected as well since both the elastic capacities of links and the transmission rates of sources are dependent on spectrum allocation in cognitive ad hoc networks with multicarrier communications. There are several pieces of work on the subcarrier allocation of OFDMA systems in [13, 14]. However, those subcarrier allocation schemes are based on the centralized approach. Obviously, they are infeasible for ad hoc networks. In [1517], the distributed opportunistic access schemes were presented through a judicious design of a novel backoff mechanism to utilize the channel information and reduce collisions for OFDMA systems. In [18], a primal–dual optimization framework was developed to schedule any-to-any CR communications based on OFDMA and to allocate powers so as to maximize the weighted average sum-rate of all users. However, those frameworks in [17, 18] do not consider the subcarrier multi-path nature of ad hoc networks with OFDMA, which generally leads to a primal non-convex and non-separable optimization problem. In [36], the authors considered the problem of gathering correlated sensor data by a single sink node in a wireless sensor network, and designed efficient distributed protocols to maximize the network lifetime. In [19], close to our work, Xi and Yeh proposed distributed algorithms in wireless networks for spectrum allocation, power control, routing, and congestion control in accordance with the duplexing constraint. One of their core contributions is that the spectrum allocation is transformed into an equivalent graphic-theoretic link-coloring problem to activate different subsets of conflict-free links on different sub-bands. However, these results are not suitable for cognitive ad hoc networks with OFDMA because the available subcarrier set for CR users is detected by an appropriate spectrum sensing approach, thus there is conflict-free among each other.

Therefore, in this paper, to address the above challenges and achieve high end-to-end throughput and efficient resource utilization, we study and analyze a distributed coordination problem of jointly optimal end-to-end congestion control and per-link power control and spectrum allocation while the architectural separation among different layers is preserved. Furthermore, we propose a new distributed coordinative algorithm that integrates together transport layer congestion control, and physical layer power control coupled with spectrum allocation for cognitive ad hoc networks with OFDMA. Based on elastic link capacity that is regarded as a function of transmission power, we construct a joint optimization formulation of congestion control and power control to maximize the network utility with multiple alternative subcarrier links, which is a non-convex and non-separable optimization problem. By introducing new variables and small disturbance quadratic terms, the original joint optimization problem becomes decoupled and dual-decomposable. That is, it can be vertically decomposed to the congestion control subproblem and power control subproblem. The former is a primal algorithm where the congestion prices are generated based on local aggregate source rate. The latter is a dual subgradient algorithm where the transmission power of each source is updated by the congestion prices. The heuristics spectrum allocation sub-algorithm is the suboptimal solution of the min–max effective subcarrier utilization based on the congestion prices.

Remarkably, when the convergence takes place, for all links along the multi-hop route, our distributed coordinative algorithm not only provides the optimal transmission rates for each subcarrier-to-subcarrier link, but also determines which portion of the rates is designated for transmission in what powers and which subcarriers under the adjustment of congestion prices. We show that this interaction and coordination among the three sub-problems are crucial to the convergence and performance of the joint solution. In comparison with [19], simulation results demonstrate that our distributed algorithm converges much faster and achieves significant gains of network utilization and throughput. More importantly, our algorithm is robust to node mobility and packet losses.

The rest of this paper is organized as follows: Sect. 2 describes the network model and parameters with cognitive ad hoc networks. Section 3 introduces the subcarrier-to-subcarrier link construction for each node-to-node path of source. Section 4 formulates the utility-maximization problem of joint congestion control and resource allocation with elastic link capacities. In Sects. 5 and 6, we propose the distributed congestion control, power control and spectrum allocation sub-algorithms based on dual-decomposable and subgradient approach and prove their convergence. Several numerical examples are given in Sect. 7 to demonstrate the effectiveness, convergence and robustness of our proposed algorithm. The conclusions are drawn in Sect. 8.

2 Network model

We consider a cognitive ad hoc wireless network with multicarrier communications. We assume that \(\mathcal{N}\) denotes a set of CR nodes with the node number N = |\(\mathcal{N}\)| and ɛ is a set of logical node-to-node paths with the link number E = |ɛ| which represents the ability of successful transmission between nodes within wireless range. The whole channel consists of K subcarriers and the bandwidth of each subcarrier is not identical. A CR node n ∈ \(\mathcal{N}\) represents a wireless transceiver. We assume that the CR node n has M n available subcarriers detected by the optimal linear cooperation framework (OLCF) given in [21], and the collection of M n available subcarriers is denoted by \(\mathcal{M}_n.\)

Each node-to-node path e ∈ ɛ consists of one or more subcarrier-to-subcarrier logical links l, formed between the subcarriers of e′s endpoints. The set of all subcarrier-to-subcarrier links in the ad hoc network is denoted by \(\mathcal{L}\), with L = |\(\mathcal{L}\)|. Each node-to-node path is unidirectional and can be active on one or more of the subcarriers. A spectrum allocation is given by the collection \( \pi = \left\{ {k_{n,i} ,n = 1, \ldots ,N,i = 1, \ldots ,M_{n} } \right\}, \) where subcarrier i of node n is operating at channel k n,i. We consider a slot-synchronized system with a periodic frame consisting of multiple slots. That is, a node cannot transmit or receive on the same subcarrier at the same slot, and simultaneous operation of different subcarriers of the same node at the same slot is permitted only if the subcarriers have been assigned to different channels.

We consider a set of sources \(\mathcal{S}\), with S = |\(\mathcal{S}\)| ≤ N. We assume that packets from sources are routed to their destinations according to the established links which consist of a path from source node to destination node and are fixed by the network layer at the time of optimization. In our network model the routing matrix R of size L × S corresponds to these established links where entries r ls of the matrix take the value of 1 if source s is using subcarrier link l, and 0 otherwise. Although node-to-node path are fixed, it is possible to split traffic across the subcarriers and channels of each path. The descriptions of most of notations in this paper are summarized in Table 1.

Table 1 Notation table

3 Path construction

In an ad hoc network with multicarrier communications, multi-subcarriers enable the multiple transmission options for a transmitter through different subcarriers and sub-channels. The total incoming traffic at each intermediate node is aggregated from multiple subcarriers and should be split to each of outgoing links which might be incident to different subcarriers.

By the approach similar to that of [20], we construct a set of subcarrier links for each node-to-node path of source s as follows. Firstly, we start from the subcarrier-to-subcarrier links of the path e near to source s, and create one link i for each available subcarrier between the two end nodes, tr(e) and rcv(e), of the path e, then add the link i into the set of paths ϖ s ; secondly, for subsequent path e′of path e, for each available subcarrier between its end nodes, tr(e′) and rcv(e′), the corresponding subcarrier-to-subcarrier link is appended to all the paths constructed for the previous link, and the link is added to the set of paths ϖ s ; and finally, this incremental path construction procedure iterates for all links until the end of the node-to-node path, and results in a set of paths ϖ s for each source s. As shown in Fig. 1, source s 1 is associated with an origin–destination node pair denoted by \( (h(s_{1} ), \, d(s_{1} )). \) We assume that the node h(s 1) has three available subcarriers, k 1, k 2 and k 3, then by the above procedures, it has three subcarrier-to-subcarrier links to its next neighbor node R1 corresponding to subcarriers k 1, k 2 and k 3, respectively.

Fig. 1
figure 1

Construction of the set of paths

Our congestion control design distributes the traffic of each source s across its set of paths ϖ s . Based on the construction procedure and slot-synchronized nature, the number of subcarrier links ω s  = |ϖ s | of source s is equal to the number of available subcarriers at each node-to-node path, such as \( \omega_{{s_{1} }} = 6. \)

4 Problem statement

In network utility maximization (NUM) framework, each source s has its utility function and network resource (power, spectrum and bandwidth) are allocated so that network utility (i.e., the sum of all users’ utilities) is maximized. A utility function can be interpreted as the level of satisfaction attained by a user as a function of resource allocation. Different shapes of utility functions lead to different types of fairness. In this paper, the utility of a source transmitting at rate x s is expressed by

$$ U(x_{s} ) = \left\{ {\begin{array}{*{20}c} {w_{s} {\frac{{x_{s}^{1 - \alpha } }}{1 - \alpha }},\quad {\text{if }}\alpha \ne 1,} \\ {w_{s} \log x_{s} ,\quad {\text{otherwise,}}} \\ \end{array} } \right. $$
(1)

where U(·) is a strictly concave, non-decreasing, twice differentiable function. If we set α = 0, NUM reduces to system throughput maximization. If α = 1, proportional fairness among competing users is attained; if α = 2, then harmonic mean fairness; and if α → ∞, then max–min fairness. One of our aims is to find a source rate vector that maximizes aggregate utility and can lead to the realization of various fairness objectives. The fairness region depends on the positive priority parameters \( W = \left\{ {w_{s} > 0,\,s = 1, \ldots ,S} \right\} \) and parameter α ≥ 0.

The rates of links depend on the medium access scheme, channel condition and the resource allocations such as transmit powers, spectrums and time-slots. A source s perceives utility U(x s ) when data are transmitted from h(s) to d(s) at a total rate of x s . Since rate x s is the aggregate traffic achieved by transmission to each subcarrier path i with rate x (i) s , hence, \( x_{s} = \sum\nolimits_{i = 1}^{{\omega_{s} }} {x_{s}^{(i)} } \) where ω s denotes the number of paths of each route from h(s) to d(s).

We assume that the capacity c (i) l (p (i) l ) of link l on subcarrier i of source s is a function of the signal-to-interference-plus-noise ratio (SINR) γ (i) l (p (i) l ) of link l over i, and p (i) l denotes the transmission power used by link l on subcarrier i. Thus, in interference-limited ad hoc networks, we can write the physical layer link capacity as

$$ c_{l}^{(i)} \left( {p_{l}^{(i)} } \right) = B_{l}^{(i)} \log \left( {1 + k \cdot \gamma_{l}^{(i)} \left( {p_{l}^{(i)} } \right)} \right), $$

where B (i) l is the bandwidth of link l on subcarrier i, constant k = (−ϕ 1)/log(ϕ 2 · BER), ϕ 1 and ϕ 2 are constants depending on the modulation and BER is the required bit-error-rate [22]. However, high SINR can always be maintained since CSMA/CA based MAC layer prevents the nearby (i.e., 2-hop) nodes to operate at the same time. High SINR allows us to approximate c (i) l (p (i) l ) as below

$$ c_{l}^{(i)} (p_{l}^{(i)} ) \approx B_{l}^{(i)} \log (k \cdot \gamma_{l}^{(i)} (p_{l}^{(i)} )). $$
(2)

Let σ (i) l be the thermal noise power at the receiver of link l on subcarrier i, G (i) lm be the direct link gain between the transmitter of link m and the receiver of link l on subcarrier i. Then SINR of link l on subcarrier i can be defined as

$$ \gamma_{l}^{(i)} \left( {p_{l}^{(i)} } \right) = {\frac{{G_{ll}^{(i)} p_{l}^{(i)} }}{{\sum\nolimits_{m \ne l} {G_{lm}^{(i)} p_{m}^{(i)} } + \sigma_{l}^{(i)} }}}. $$
(3)

It is manifest from (2) and (3) that each link capacity c (i) l of link l on subcarrier i is “elastic”, and is a function of the transmission power p (i) l , subcarrier allocation and the channel conditions (such as link gain G (i) lm and thermal noise σ (i) l ). Note that in fact, in an OFDMA system, since each subcarrier can be used by only one link at any time, the sum interference to the receiver of link l on subcarrier i should be zero.

We denote by \( \vec{x}_{s} = \left[ {x_{s}^{(i)} ,i = 1, \ldots ,\omega_{s} } \right] \) the source rate distribution vector, and by \( X = \left[ {\vec{x}_{s} ,s = 1, \ldots ,S} \right]^{T} \) the network rate distribution. We assume that the binary routing variable r (i) ls indicates if subcarrier-to-subcarrier link l is used by the path i of source s or not. The rate of a subcarrier-to-subcarrier link equals the sum of the individual rates of all the paths of the network crossing this link. Those individual rates also indicate the portion of the aggregate traffic of the link that should be routed in each subcarrier path. Therefore, given subcarrier links, we formulate the joint congestion control and resource allocation as the following utility-maximization problem with elastic link capacities:

$$ \mathop {\max }\limits_{x \ge 0,p \ge 0} \sum\limits_{s = 1}^{S} {U\left( {\sum\limits_{i = 1}^{{\omega_{s} }} {x_{s}^{(i)} } } \right)} , $$
(4)
$$ {\text{subject}}\;{\text{to}}\;\sum\limits_{s = 1}^{S} {r_{ls}^{(i)} x_{s}^{(i)} } \le c_{l}^{(i)} \left( {p_{l}^{(i)} } \right),\quad \forall l \in \mathcal{L},\quad\forall i \in \varpi_{s} $$
(5)
$$ p_{l}^{\min } \le \sum\limits_{i = 1}^{{\omega_{s} }} {p_{l}^{(i)} } \le p_{l}^{\max } ,\quad \forall l \in \mathcal{L}, $$
(6)
$$ x_{s}^{\min } \le \sum\limits_{i = 1}^{{\omega_{s} }} {x_{s}^{(i)} } \le x_{s}^{\max } ,\quad \forall s \in \mathcal{S}, $$
(7)

where the optimization variables are both source rate x > 0 and transmission powers p > 0. Constraint (5) indicates that traffic traversing a wireless subcarrier-to-subcarrier link l do not exceed the capacity of the link; Constraint (6) describes that the transmission power at each transmitter on link l is constrained by the maximum and minimum transmission powers, p max l and p min l , in order to save the finite energy of each transmitter and satisfy the diverse quality of service requirement, respectively; Constraint (7) demonstrates that the transmitting rate of each source is bounded by the maximum and minimum transmitting rates, x max s and x min s in order to keep the fairness and improve the efficiency of wireless links. The GNUM formulations given by (4)–(7) bind the transport and physical layer. The solution gives the optimal transport layer source rates and physical layer power level p l that maximize the total network utility.

The multi-path nature of the problem (4) leads to several difficulties in constructing solutions suitable for online implementation. One of the main difficulties is that once some sources have multiple alternative paths, even if U(·) is a strictly concave function in x s , it is not in x (i) s due to the linearity of \( \sum\nolimits_{i = 1}^{{\omega_{s} }} {x_{s}^{(i)} } ; \) and, since the dual of the problem may not be differentiable at every point, hence, the objective function in (4) may not be strictly concave [23].

Another major challenge of solving the optimization problem (4) is two global dependencies: (1) between source rates x s and link capacities c l , and (2) between each link capacity c l and all the interfering powers, in terms of the attainable data rate under a given power vector \( \vec{p}_{l} = \left[ {p_{l}^{(i)} ,i = 1, \ldots ,\omega_{s} } \right]. \) Let \( P = \left[ {\vec{p}_{l} ,l = 1, \ldots ,L} \right]^{T} . \) Our primary goal in this paper is to distributively find the joint global optimal solution (X *, P *) to problem (4) by breaking down the two global dependencies.

5 Distributed algorithm

The core goal of this section is to propose the distributed congestion control and power control sub-algorithms by using Layering as Optimization Decomposition (LOD) [26] approach to solve the optimization problem (4) systematically and distributively. LOD is a popular approach for solving the cross-layer optimization problems in the form of GNUM. It simply decomposes the optimization problem into subproblems using dual decomposition and assigns them to different layers. The necessary conditions for applying LOD are convexity and separability. However, the optimization problem (4) is neither convex and nor separable on optimization variables in its dual. Therefore, to circumvent the difficulties due to the lack of strict concavity and variable separability, we will first convert the original optimization problem (4) into a new convex and separable one by using the ideas from Proximal Optimization Algorithms [23, 28, p. 232]; secondly, we will propose two sub-algorithms by employing LOD; finally, we will discuss the robustness of the two distributed sub-algorithms.

At first, we convert the original optimization problem according to the ideas from Proximal Optimization Algorithms, which is

  1. (i)

    to add a small penalty quadratic term to objective function: \( \sum\nolimits_{s = 1}^{S} {\sum\nolimits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {x_{s}^{(i)} - y_{s}^{(i)} } \right)^{2} } } , \) where δ is a small positive constant, y (i) s is an auxiliary variable for each x (i) s . Then the optimization objective function becomes

    $$ U^{\prime}\left( {x_{s}^{(i)} ,y_{s}^{(i)} } \right) = U\sum\limits_{i = 1}^{{\omega_{s} }} {x_{s}^{(i)} } - \sum\limits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {x_{s}^{(i)} - y_{s}^{(i)} } \right)^{2} } ; $$
  2. (ii)

    to make the logarithm change of variables and constants: \( \hat{x}_{s}^{(i)} = \log x_{s}^{(i)} , \) \( \hat{y}_{s}^{(i)} = \log y_{s}^{(i)} , \) \( \hat{p}_{l}^{(i)} = \log p_{l}^{(i)} , \) \( \hat{p}_{l}^{\min } = \log p_{l}^{\min } , \) \( \hat{p}_{l}^{\max } = \log p_{l}^{\max } , \) \( \hat{x}_{s}^{\min } = \log x_{s}^{\min } , \) \( \hat{x}_{s}^{\max } = \log x_{s}^{\max } . \)

After the above two-step transformation, the optimization problem (4) can be rewritten as

$$ \max \sum\limits_{s = 1}^{S} {\hat{U}} \left( {\hat{x}_{s}^{(i)} ,\;\hat{y}_{s}^{(i)} } \right), $$
(8)
$$ {\text{subject to }}\sum\limits_{s = 1}^{S} {r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} } \le \hat{c}_{l}^{(i)} (\hat{p}_{l}^{(i)} ), \quad \forall l \in \mathcal{L}, \forall i \in \varpi_{s} , $$
(9)
$$ \hat{p}_{l}^{\min } \le \sum\limits_{i = 1}^{{\omega_{s} }} {\hat{p}_{l}^{(i)} } \le \hat{p}_{l}^{\max } , \, \quad\forall l \in \mathcal{L}, $$
(10)
$$ \hat{x}_{s}^{\min } \le \sum\limits_{i = 1}^{{\omega_{s} }} {\hat{x}_{s}^{(i)} } \le \hat{x}_{s}^{\max } ,\quad \forall s \in \mathcal{S}, $$
(11)

where \( \hat{U}\left( {\hat{x}_{s}^{(i)} ,\hat{y}_{s}^{(i)} } \right) = U\left( {\sum\limits_{i = 1}^{{\omega_{s} }} {e^{{\hat{x}_{s}^{(i)} }} } } \right) - \sum\limits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {e^{{\hat{x}_{s}^{(i)} }} - e^{{\hat{y}_{s}^{(i)} }} } \right)^{2} } . \) It is easy to show that the optimal value of (8) coincides with that of (4). In fact, let \( \vec{y}_{s} = \left[ {y_{s}^{(i)} ,i = 1, \ldots ,\omega_{s} } \right] \) and \( Y = \left[ {\vec{y}_{s} ,s = 1, \ldots ,S} \right]^{T} . \) If X * denote the maximizer of (4), then X = X *, Y = X * are the maximizer of (8).

Then the new link SINR is obtained as (12) after setting \( \hat{\gamma }_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right) = \hat{\gamma }_{l}^{(i)} \left( {e^{{\hat{p}_{l}^{(i)} }} } \right). \)

$$ \hat{\gamma }_{l}^{(i)} (\hat{p}_{l}^{(i)} ) = {\frac{{G_{ll}^{(i)} e^{{\hat{p}_{l}^{(i)} }} }}{{\sum\nolimits_{m \ne l} {G_{lm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{l}^{(i)} }}}. $$
(12)

Similarly, new physical layer link capacities are given by (13) after setting \( \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right) = \hat{c}_{l}^{(i)} \left( {e^{{\hat{p}_{l}^{(i)} }} } \right) \) and using (12)

$$ \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right) = B_{l}^{(i)} \left[ {\log \left( {kG_{ll}^{(i)} } \right) + \hat{p}_{l}^{(i)} - \log \left( {\sum\nolimits_{m \ne l} {G_{lm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{l}^{(i)} } \right)} \right]. $$
(13)

It can be verified, e.g., through the second derivative test, that the term \( \log \left( {\sum\nolimits_{m \ne l} {G_{lm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{l}^{(i)} } \right) \) in (13) is a convex function. Thus, it is not difficult to obtain that the constraint set formed by (9)–(11) is convex as well.

Next, in order to prove (8) to be a convex optimization problem, we need verify the concavity of the objective function.

The optimization problem (8) can be solved by two steps at t-th iteriation:

(S 1 ) The maximization is alternately taken over X while keeping Y fixed, i.e., fix Y = Y(t) and maximize the augmented objective function with respect to X.

Define

$$ g\left( {x_{s}^{(i)} } \right) = {\frac{{\partial^{2} U^{\prime}\left( {x_{s}^{(i)} } \right)}}{{\partial x_{s}^{(i)2} }}}x_{s}^{(i)} + {\frac{{\partial U^{\prime}\left( {x_{s}^{(i)} } \right)}}{{\partial x_{s}^{(i)} }}}, $$
(14)

It is easy to obtain the following theorem:

Theorem 1

With the addition of the quadratic term \( \sum\nolimits_{s = 1}^{S} {\sum\nolimits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {x_{s}^{(i)} - y_{s}^{(i)} } \right)^{2} } } , \) for any fixed Y, if \( g\left( {x_{s}^{(i)} } \right) \le 0, \) then the primal objective function \( \hat{U}(\hat{x}_{s}^{(i)} ) \) is strictly concave with respect to X.

Proof

This can be proven in the same way as in [25, Proof of Lemma 1].□

From Theorem 1, considering the utility functions (1) parameterized by α, we can easily show that if α ≥ 1, \( \hat{U}\left( {\hat{x}_{s}^{(i)} } \right) \) becomes a strictly concave function. Throughout this paper, we assume that the condition in Theorem 1 is satisfied. Hence, the maximizer of (8) exists and is unique. Let X(t) be the solution to this optimization.

(S 2 ) Based on the results obtained in (S 1 ), then the maximization is resolved over Y while keeping X fixed, i.e., Y(t + 1) = X(t).

It is easy to show that such iterations will converge to the optimal solution of problem (4) as t → ∞ [24, p. 233].

Since the objective function in (8) is now strictly concave, we can use LOD approach to solve the optimization problem (8)–(11) as follows.

Let λ (i) l , i = 1, …, ω s , be the Lagrange Multiplier for constraints (9). Let \( \vec{\lambda }_{l} = \left[ {\lambda_{l}^{(i)} ,i = 1, \ldots ,\omega_{s} } \right] \) and \( \lambda = \left[ {\vec{\lambda }_{l} ,l = 1, \ldots ,L} \right]^{T} . \) To proceed, let us define the Lagrangian function

$$ \begin{aligned} L(X,\lambda ,P,Y) & = \sum\limits_{s = 1}^{S} {\left[ {U\left( {\sum\limits_{i = 1}^{{\omega_{s} }} {e^{{\hat{x}_{s}^{(i)} }} } } \right) - \sum\limits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {e^{{\hat{x}_{s}^{(i)} }} - e^{{\hat{y}_{s}^{(i)} }} } \right)^{2} } } \right]} - \sum\limits_{l = 1}^{L} {\sum\limits_{i = 1}^{{\omega_{s} }} {\lambda_{l}^{(i)} } } \left[ {\sum\limits_{s = 1}^{S} {r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} } - \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right)} \right] \\ \, & = \sum\limits_{s = 1}^{S} {\left[ {U\left( {\sum\limits_{i = 1}^{{\omega_{s} }} {e^{{\hat{x}_{s}^{(i)} }} } } \right) - \sum\limits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {e^{{\hat{x}_{s}^{(i)} }} - e^{{\hat{y}_{s}^{(i)} }} } \right)^{2} } - \sum\limits_{i = 1}^{{\omega_{s} }} {\sum\limits_{l = 1}^{L} {\lambda_{l}^{(i)} r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} } } } \right]} + \sum\limits_{l = 1}^{L} {\sum\limits_{i = 1}^{{\omega_{s} }} {\lambda_{l}^{(i)} \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right).} } \\ \end{aligned} $$
(15)

Moreover, the objective function of the dual problem is

$$ D(X,\lambda ,P,Y) = \mathop {\max }\limits_{x \ge 0,p \ge 0} L(X,\lambda ,P,Y) = \mathop {\max }\limits_{x \ge 0,p \ge 0} \left[ {\sum\limits_{s = 1}^{S} {B_{s} (X,\lambda ,Y)} + \sum\limits_{l = 1}^{L} {\sum\limits_{i = 1}^{{\omega_{s} }} {\lambda_{l}^{(i)} \hat{c}_{l}^{(i)} (\hat{p}_{l}^{(i)} )} } } \right], $$
(16)

where

$$ B_{s} (X,\lambda ,Y) = U(\sum\limits_{i = 1}^{{\omega_{s} }} {e^{{\hat{x}_{s}^{(i)} }} } ) - \sum\limits_{i = 1}^{{\omega_{s} }} {{\frac{\delta }{2}}\left( {e^{{\hat{x}_{s}^{(i)} }} - e^{{\hat{y}_{s}^{(i)} }} } \right)^{2} } - \sum\limits_{i = 1}^{{\omega_{s} }} {\sum\limits_{l = 1}^{L} {\lambda_{l}^{(i)} r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} } } . $$
(17)

By the linearity of the differentiation operator, the objective function (16) can be decomposed into two separate maximization subproblems

$$ \begin{gathered} {\mathbf{P1:}}\,D_{1} \left( {X,\lambda ,Y} \right) = \mathop {\max }\limits_{x \ge 0} \sum\limits_{s = 1}^{S} {B_{s} \left( {X,\lambda ,Y} \right)} \hfill \\ {\text{subject to }}\hat{x}_{s}^{\min } \le \sum\nolimits_{i = 1}^{{\omega_{s} }} {\hat{x}_{s}^{(i)} } \le \hat{x}_{s}^{\max } , \quad \forall s \in \mathcal{S}; \hfill \\ \end{gathered} $$
(18)
$$ \begin{gathered} {\mathbf{P2:}}\,D_{2} (P,\lambda ) = \mathop {\max }\limits_{p \ge 0} V_{power} (P,\lambda ) = \sum\limits_{l = 1}^{L} {\sum\limits_{i = 1}^{{\omega_{s} }} {\lambda_{l}^{(i)} \hat{c}_{l}^{(i)} (\hat{p}_{l}^{(i)} )} } , \hfill \\ {\text{subject to }}\hat{p}_{l}^{\min } \le \sum\limits_{i = 1}^{{\omega_{s} }} {\hat{p}_{l}^{(i)} } \le \hat{p}_{l}^{\max } , \quad\forall l \in \mathcal{L}. \hfill \\ \end{gathered} $$
(19)

The first maximization P1 involves in congestion control mechanism for different \( \hat{x}_{s}^{(i)} , \) and the second one P2 concentrates on power control mechanism in physical layer. Note that the maximization involved in the calculation of the dual function has been decomposed into many simple maximization subproblems, which imply that the optimal resource allocation problems can be separated and solved in a distributed manner.

In the following, we will present congestion control sub-algorithm and power control sub-algorithm by solving the two maximization subproblems (18) and (19), respectively, to effectively mitigate congestion at the network bottlenecks and increasing the link data rates, using the Lagrange multipliers λ as congestion prices to allocate exactly the right power to each transmitter.

5.1 Congestion control sub-algorithm

The dual problem of (18), given Y, then corresponds to minimizing D 1(XλY) over the dual variables λ, i.e.,

$$ \mathop {\min }\limits_{\lambda \ge 0} D_{1} (X,\lambda ,Y). $$
(20)

Since the objective function of the primal problem (8) is strictly concave, the dual is always differentiable. The subgradient formulae of D 1 on the dual variable λ (i) l and the data rate \( \hat{x}_{s}^{(i)} \) are given by

$$ \nabla D_{1} (\lambda_{l}^{(i)} ) = - r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} , $$
(21)

and

$$ \nabla D_{1} \left( {\hat{x}_{s}^{(i)} } \right) = {\frac{{{\text{d}}U(x_{s}^{(i)} )}}{{{\text{d}}x_{s}^{(i)} }}}e^{{\hat{x}_{s}^{(i)} }} - \delta \left( {e^{{\hat{x}_{s}^{(i)} }} - e^{{\hat{y}_{s}^{(i)} }} } \right)e^{{\hat{x}_{s}^{(i)} }} - \lambda_{l}^{(i)} r_{ls}^{(i)} e^{{\hat{x}_{s}^{(i)} }} . $$
(22)

The solutions for the maximization problem P1 can then be distributively solved by subgradient descent iterations algorithm given by

$$ \hat{x}_{s}^{(i)} (t + 1) = \left[ {\hat{x}_{s}^{(i)} (t) + \left. {\beta_{r} (t)\nabla D_{1} \left( {\hat{x}_{s}^{(i)} } \right)} \right|_{{\hat{x}_{s}^{(i)} = \hat{x}_{s}^{(i)} (t),\lambda_{l}^{(i)} = \lambda_{l}^{(i)} (t)}} } \right]_{{\hat{x}_{s}^{\min } }}^{{\hat{x}_{s}^{\max } }} , $$
(23)
$$ \lambda_{l}^{(i)} (t + 1) = \left[ {\lambda_{l}^{(i)} (t) + \left. {\beta_{r} (t)\nabla D_{1} \left( {\lambda_{l}^{(i)} } \right)} \right|_{{\hat{x}_{s}^{(i)} = \hat{x}_{s}^{(i)} (t)}} } \right]^{ + } , $$
(24)

where [·]+denotes the projection on [0, +∞). It is again easy to show that, given Y, the dual update (23) and (24) will converge to the minimizer of D 1(XλY) as t → ∞, provided that the step sizes β r (t) are sufficiently small [24, p. 214].

Remark 1

From (17), we can observe that the potential oscillation may be caused by the multipath nature of problem (4), but the additional quadratic term plays the crucial role in inhibiting the oscillation. Assume that there is no additional quadratic term, i.e., δ = 0, then we can easily find from network flow competition viewpoint that only paths that have at least competition price λ (i) l will have positive transmission rate x (i) s , when (17) is solved for any source s that has multiple alternative paths. That is, if \( \lambda_{l}^{(i)} > \mathop {\min }\limits_{k} \lambda_{l}^{(k)} , \) then x (i) s  = 0. The property can easily lead to oscillation of x (i) s when the dual variable λ (i) l is updated by (24). However, when δ > 0, it makes the maximum point \( \vec{x}_{s} \) of (17) continuous in \( \vec{\lambda }_{l} . \) Hence, the quadratic term serves a crucial role to restrict the oscillation and stabilize the system.

5.2 Power control sub-algorithm

As without fixed infrastructure of ad hoc networks, the power control operated by the optimization problem P2 must also be implemented distributively, just like congestion control part. Since the data rate on each wireless link is a global function of all the transmit powers, it is difficult that the power control subproblem can be nicely decoupled into local problem for each link. However, we show that distributed solution is still feasible, provided that an appropriate set of limited information is passed among the nodes.

We first establish that the objective function of problem P2 is a strictly concave function of a logarithmically transformed power vector.

Theorem 2

The objective function V power (P, λ) of problem P2 in ( 19 ) is concave over \( \hat{p}_{l}^{(i)} . \)

Proof

It is not difficult to see from (13) that the second term in the square bracket is linear in \( \hat{p}_{l}^{(i)} , \) and the last term is concave in \( \hat{p}_{l}^{(i)} \) because the log of a sum of exponentials of linear function of \( \hat{p}_{l}^{(i)} \) is convex, as proven in [6]. Therefore, the function \( \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right) \) is concave.□

It is easy to verify that the subgradient of V power (Pλ) with respect to \( \hat{p}_{l}^{(i)} \) is

$$ \nabla_{l}^{(i)} V_{power} \left( {\hat{p}_{l}^{(i)} } \right) = B_{l}^{(i)} \left[ {\lambda_{l}^{(i)} - e^{{\hat{p}_{l}^{(i)} }} \sum\limits_{j \ne l} {{\frac{{\lambda_{j}^{(i)} G_{jl}^{(i)} }}{{\sum\nolimits_{m \ne j} {G_{jm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{j}^{(i)} }}}} } \right]. $$
(25)

Remark 2

The subgradient (25) is obtained under the high SINR assumption. If this assumption is not maintained, the expression (13) of \( \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right) \) will become

$$ \hat{c}_{l}^{(i)} (\hat{p}_{l}^{(i)} ) = B_{l}^{(i)} \left[ {\log \left( {\sum\nolimits_{m \ne l} {G_{lm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{l}^{(i)} + kG_{ll}^{(i)} e^{{\hat{p}_{l}^{(i)} }} } \right) - \log \left( {\sum\nolimits_{m \ne l} {G_{lm}^{(i)} e^{{\hat{p}_{m}^{(i)} }} } + \sigma_{l}^{(i)} } \right)} \right]. $$

The subgradient (25) will be also modified according to the above expression of \( \hat{c}_{l}^{(i)} \left( {\hat{p}_{l}^{(i)} } \right). \)

Then the subgradient projection algorithm that solves problem P2 is obtained as follows. The transmission power \( \hat{p}_{l}^{(i)} \) used by link l on subcarrier i is adjusted iteratively

$$ \hat{p}_{l}^{(i)} \left( {t + 1} \right) = \left[ {\left. {\hat{p}_{l}^{(i)} (t) + \beta_{p} (t)\nabla_{l}^{(i)} V_{power} \left( {\hat{p}_{l}^{(i)} } \right)} \right|_{{\hat{p}_{l}^{(i)} = \hat{p}_{l}^{(i)} (t),\lambda_{l}^{(i)} = \lambda_{l}^{(i)} (t)}} } \right]_{{\hat{p}_{l}^{\min } }}^{{\hat{p}_{l}^{\max } }} . $$
(26)

Simplifying the equation and using the definition (12) of SINR, we can write (26) as the following distributed power control sub-algorithm with message passing:

$$ \hat{p}_{l}^{(i)} (t + 1) = \left[ {\hat{p}_{l}^{(i)} (t) + \beta (t)B_{l}^{(i)} \lambda_{l}^{(i)} (t) - \beta_{p} (t)B_{l}^{(i)} \sum\limits_{j \ne l} {G_{lj}^{(i)} \psi_{j} (t)} } \right]_{{\hat{p}_{l}^{\min } }}^{{\hat{p}_{l}^{\max } }} , $$
(27)

where \( \psi_{j} (t) = {\frac{{\lambda_{j}^{(i)} (t)\hat{\gamma }_{j}^{(i)} (t)}}{{e^{{\hat{p}_{j}^{(i)} (t)}} G_{jj}^{(i)} }}} \) are messages passed from the transmitter of link j ≠ l. That is, the transmitter of each link j calculates a message ψ j (t) based on locally measurable quantities, and passes the message to the transmitters of all other links by a flooding protocol. Each transmitter updates its power based on locally measurable quantities and the received messages by iterative equation (27). More importantly, the link price λ (i) j (t) in the message ψ j (t) is obtained by subgradient descent iterations algorithm (24) in congestion control mechanism. Thus, we can implement the joint control of congestion and power by the Lagrange multipliers λ.

It is worth noting that the distributed sub-algorithms of joint congestion control and power control requires two level convergence structure, i.e., each outer iteration (S1) consists of an inner level of iterations (24). In order to guarantee the convergence of the distributed algorithm, the inner level of iterations (24) must converge before each outer iteration can proceed from steps (S1) to (S2). In practice, it is difficult for network elements to decide in a distributed manner when the inner level iterations should stop. However, all computation in the constructed algorithms can be carried out based on locally measurable information, and hence can be easily distributed. More precisely, the dual objective function D(XλPY) in (16) has been decomposed into S separate congestion control subproblems for each source s = 1,…, S and L power control subproblems for each link l = 1,…, L.

Remark 3

According to (27), a message ψ j (t) can be updated at each link j based on locally measurable quantities such as SINR \( \hat{\gamma }_{j}^{(i)} (t) \) and link gain G (i) jj , which can be estimated over a certain time window. Hence, no per-source information needs to be stored or maintained. That is, an advantage of the measurement-based approach is that the algorithm does not need to rely on prior knowledge of the parameters of the system. This facilitates all the computation in a distributed fashion.

5.3 Robustness discuss of distributed algorithm

It is known from the previous parts that the path loss factor G (i) lj , which determines the message ψ j (t) and the power level in (27), is perfectly estimated by receiver. However, mobility of the nodes and fast variation of the fading process may lead to a mismatch between the G (i) lj used in the power update algorithm and the G (i) lj that actually appear in the link data rate formula. Therefore, it is useful to know how much error in the estimation of G (i) lj can be tolerated without losing the convergence of joint power control and congestion control.

Let ΔG (i) lj (t) denote the error in the estimation of G (i) lj at t-th iteration. We provide a sufficient condition using results about gradient algorithm with errors in gradient circulation [28].

Theorem 3

The distributed algorithm ( 21 )( 27 ) converge to the optimum of ( 16 ) with G (i) lj estimation errors, if there exists a T such that for all times t ≥ T, the following inequality holds:

$$ \begin{gathered} \sum\limits_{l} {\sum\limits_{j \ne l} {\sum\limits_{k \ne l} {\left[ {B_{l}^{(i)} (t)p_{l}^{(i)} (t)} \right]} } }^{2} \left( {G_{jl}^{(i)} (t)G_{kl}^{(i)} (t) - \Updelta G_{jl}^{(i)} (t)\Updelta G_{kl}^{(i)} (t)} \right){\frac{{\lambda_{j}^{(i)} (t)\lambda_{k}^{(i)} (t)\gamma_{j}^{(i)} (t)\gamma_{k}^{(i)} (t)}}{{p_{j}^{(i)} (t)p_{k}^{(i)} (t)G_{jj}^{(i)} (t)G_{kk}^{(i)} (t)}}} \hfill \\ > 2\sum\limits_{l} {B_{l}^{(i)} (t)p_{l}^{i} (t)\sum\limits_{j \ne l} {{\frac{{\lambda_{l}^{(i)} (t)\lambda_{j}^{(i)} (t)G_{jl}^{(i)} (t)}}{{p_{j}^{(i)} (t)G_{jj}^{(i)} (t)}}}} } \gamma_{j}^{(i)} (t) - \sum\limits_{l} {\left( {\lambda_{l}^{(i)} (t)B_{l}^{(i)} (t)} \right)}^{2} . \hfill \\ \end{gathered} $$
(28)

Proof

The proof can be found in “Appendix”.□

Theorem 3 gives an analytic condition of convergence with inaccurate estimations of G (i) lj for path losses. Numerical experiments can be carried out in simulations, where the G (i) lj factors in (27) are perturbed randomly within a range. Results of one typical experiment are shown in Fig. 6(b) for the same network topology and connections as those in Fig. 3. In this simulation, compared with Fig. 3, the G (i) lj factors are generated at random between +20 and −20% of their true values. The algorithm converges to the same optimum after a much longer and wider transient period.

6 Spectrum allocation

In this section, one of our focuses is how to efficiently find a feasible spectrum allocation strategy when frequency bands are enough to alleviate congestion. The key idea of our spectrum allocation is to use the overload spectrum utilization and the Lagrange multipliers of the congestion control sub-problem to identify the most congested links as local areas of highest priority. The new spectrum allocation π′ should remove traffic from highly congested resources or add bandwidth to highly congested resources if possible. Therefore, in this paper, we will allocate a subcarrier which can minimize the maximum spectrum utilization of the heavy overload link to this link to reduce congestion. To achieve the goal, we first formulate the constrained min–max problem to minimize the maximum efficient subcarrier utilization (ESU) of traffic source; secondly, we will give the expression of ESU related to transmission probability and link utilization; finally, we will propose a heuristic distributed spectrum allocation algorithm by solving the min–max optimization problem, in which congestion control and spectrum allocation are resolved sequentially and iteratively.

We assume that there are A (i) l active CR nodes within the coverage area of source s when it operates on link l over subcarrier i. These nodes are indexed by \( m \in I_{{A_{l}^{(i)} }} = \left\{ {1,2, \ldots ,A_{l}^{(i)} } \right\}. \) In cognitive ad hoc networks, A (i) l is not constant value because nodes can join and leave the network at any time due to their mobility, or become inactive, i.e., there is no message to be sent. Each active node has an average volume of traffic demand, denoted by d m , which is the data rate that must be provided by its associated traffic source in order to support the required QoS in the application layer. Let ϕ (i) l denote the utilization of link l on subcarrier i, and Ψ (i) l (n) be the ESU of source s when it operates on link l over subcarrier i during the n-th spectrum adjustment interval, where n ∈ I n  = {1, 2,…, n T }, and n T is the total number of spectrum adjustment allowed within a spectrum allocation and reallocation period, denoted by T.

Remark 4

It is worth noting that ϕ (i) l is fraction of time interval during which its link l on subcarrier i is sensed busy or used for transmission, the larger the link utilization, the busier the link and the higher the probability that congestion occurs. Therefore, one of our aims is to control the larger ESU to reduce the probability that congestion occurs in this link.

To effectively allocate the M n orthogonal subcarrier to the L links in a distributed fashion, the following constrained objective function for spectrum allocation can be achieved at the n-th spectrum adjustment:

$$ \mathop {\min }\limits_{i} \mathop {\max }\limits_{l} \Uppsi_{l}^{(i)} (n) \quad{\text{for}}\;i \in \mathcal{M}_{n} ,l \in \mathcal{L},n \in I_{n}, $$
(29)

subject to

$$ \mathop {\max }\limits_{m} \left\{ {d_{m} } \right\} \le \Uppsi_{l}^{(i)} (n)c_{l}^{(i)} ,\quad{\text{for }}m \in I_{{A_{l}^{(i)} }} ,i \in \mathcal{M}_{n} ,l \in \mathcal{L},n \in I_{n} . $$
(30)

The objective function in (29) is to allocate spectrums such that the ESU of the most heavily loaded link is minimized. The min–max operation results in more resource available for the most heavily loaded link at each spectrum allocation period. Constraint (30) states that the traffic demand for a link from all its associated nodes should be less than the capacity the link can provide. It can be seen that the spectrum allocation problem (29) is a global optimization problem. In theory, it is usually formulated as an integer linear programming (ILP) problem and found to be NP-complete or NP-hard. Therefore, we will have to develop heuristic algorithm to find suboptimal solutions locally so that the NP-hard problems can be avoided.

A virtual-transmission time (VTT), which refers to the time interval between two successful transmissions, consists of one successful transmission and several idle [29]. For a source with A (i) l active neighbor CR nodes, each neighbor transmits with a probability ρ (i) l , the probabilities that the link l on subcarrier i is in idle and successful transmission, denoted as ρ (i)l0 and ρ (i)l1 , respectively, can be provided as follows:

$$ \rho_{l0}^{(i)} = \left( {1 - \rho_{l}^{(i)} } \right)^{{A_{l}^{(i)} }} \;{\text{and}}\;\rho_{l1}^{(i)} = A_{l}^{(i)} \rho_{l}^{(i)} \left( {1 - \rho_{l}^{(i)} } \right)^{{A_{l}^{(i)} - 1}} . $$
(31)

Let \( \bar{u}_{l}^{(i)} \) denote the average time interval that the link l on subcarrier i spends on a successful transmission; and \( \bar{\upsilon }_{l}^{(i)} \) be its average VTT. The utilization of link l on subcarrier i can be defined at n-th subcarrier adjustment as [29, 30].

$$ \phi_{l}^{(i)} (n) = \bar{u}_{l}^{(i)} (n)/\bar{\upsilon }_{l}^{(i)} (n), $$
(32)

where

$$ \bar{u}_{l}^{(i)} (n) = \bar{L}_{msg} t_{b} \rho_{l1}^{(i)} (n), $$
(33)
$$ \bar{\upsilon }_{l}^{(i)} (n) = t_{slot} \rho_{l0}^{(i)} (n) + E_{suc} \rho_{l1}^{(i)} (n), $$
(34)

where \( \bar{L}_{msg} \) is the average message length in bytes; t b and t slot are the time needed to transmit a byte and a slot, respectively; E suc is the average duration time of a successful transmission.

By substituting (31), (33) and (34) into (32), we find

$$ \phi_{l}^{(i)} (n) = {\frac{{\bar{L}_{msg} t_{b} }}{{t_{slot} N_{l}^{(i)} (n) + E_{suc} }}}, $$
(35)

where \( N_{l}^{(i)} (n) = {\frac{{1 - \rho_{l}^{(i)} (n)}}{{A_{l}^{(i)} \rho_{l}^{(i)} (n)}}}. \) It is easily seen that the utilization ϕ (i) l (n) is a function of the transmission probability ρ (i) l , the number of the active CR nodes A (i) l and the duration time of successful transmission E suc .

Remark 5

Although the channel utilization is also defined in [30], the difference is that we do not consider the interference and collision in spectrum allocation due to with OFDMA modulation. However, it needs to be taken into consideration that the impact on the ESU of source s is caused by the collective interference generated by its active neighbor CR nodes.

By the above consideration, Ψ (i) l (n) consists of two parts: the first part accounts for the utilization of link l on subcarrier i as shown in (35); the other accounts for the wasted time slots due to the interference caused by a group of active CR nodes that are transmitting simultaneously. Therefore, we have

$$ \Uppsi_{l}^{(i)} (n) = \lambda_{l}^{(i)} (n)\left[ {\phi_{l}^{(i)} (n) + \sum\limits_{k = 1}^{{A_{l}^{(i)} }} {\phi_{k}^{(i)} (n)\rho_{k}^{(i)} (n)} } \right], $$
(36)

where the definition of λ (i) l (n) is the same as that in (24) and ρ (i) k denotes the transmission probability of active CR node k other than l on subcarrier i. The link on subcarrier i of the co-link interfering nodes transmits at the same probability as the link l of source s. Thus we have

$$ \Uppsi_{l}^{(i)} (n) = \lambda_{l}^{(i)} (n)\left[ {\phi_{l}^{(i)} (n) + A_{l}^{(i)} (n)\phi_{l}^{(i)} (n)\rho_{l}^{(i)} (n)} \right], $$
(37)

where it is regarded as the fact that all the active neighbor CR nodes virtually join the competition for subcarrier i with probability ρ (i) l .

Especially, we define the ESU function as the product of the link utilization ϕ (i) l (n) and the congestion price (Lagrange Multiplier) λ (i) l (n), which is reasonable since the higher link congestion price implies the link has the more traffic load while ϕ (i) l (n) effects on the ESU level. By the bridge role played by the Lagrange Multiplier, we can build up the correlation among congestion control, power control and spectrum allocation to achieve the joint optimization. Note that since each active CR node has only one link on subcarrier ito transmit data simultaneously, it is feasible to take the upper bound of active CR node link as A (i) l .

In the following, we will develop a heuristic distributed spectrum allocation algorithm in two aspects:

First, the search for a link l in \( \mathop {\max }\limits_{l} \Uppsi_{l}^{(i)} (n) \) in (25) is replaced by a research for a ρ (i) l (n) that maximizes the Ψ (i) l (n). By taking the partial derivative of Ψ (i) l (n) with respect to ρ (i) l (n) equal 0, and replacing the time step n by n − 1 in the right hand side of equation, we can find

$$ \rho_{l}^{(i)} (n) = - \phi_{l}^{(i)} (n - 1)/(\partial \phi_{l}^{(i)} (n - 1)/\partial \rho_{l}^{(i)} (n - 1)) - 1/A_{l}^{(i)} (n - 1). $$
(38)

Let W (i) l (n) denote the maximum ESU achieved by using ρ (i) l (n) in (38). Therefore, each subcarrier of source s can operate independently to minimize its W (i) l (n) without relying on extensive message exchanges with other nodes.

Second, we need search for an optimal subcarrier i, i ∈ M n , in \( \mathop {\min }\limits_{i} W_{l}^{(i)} (n) \) in (25).

The distributed spectrum allocation algorithm can be summarized as follows:

  • Step1: Initially, set n = 1, for n ∈ I n . Each link of source s is allocated with one of the available subcarriers randomly.

  • Step2: For link l on subcarrier i, collect network congestion status information, i.e., λ (i) l and x (i) l from congestion control sub-algorithm and c (i) l information from power control sub-algorithm, then the link l on subcarrier i estimates its optimal transmission probability ρ (i) l (n) by using (38) to achieve maximum subcarrier utilization. Compute ESU by using (37) and obtain the maximum ESU W (i) l (n) during the n-th subcarrier updating cycle.

  • Step3: After switching to a few different subcarriers during different subcarrier updating cycles, the link l has experienced several different values of W (i) l (n). Select the subcarrier that corresponds to the minimum W (i) l (n) value. Denote \( Z_{l} (n) = \min \left\{ {W_{l}^{(i)} (n)} \right\} \) and K l (n) = i, which is the optimal subcarrier for link l at this time.

  • Step4: Check the inequality

    $$ \left| {Z_{l} (n) - } Z_{l} (n - 1)\right| \le {{\upxi}}_{0} $$
    (39)

    where ξ0 is predefined constant that determines the degradation of QoS of the network. If (39) holds, stop the current subcarrier adjustment. Go to Step 2, to monitor the network status.

  • Step5: Check whether or not the constraint (30) is satisfied. If not, let i = K l (n) for link l and set n = n + 1, then go to Step (2). If yes, select K l (n) = i as the best subcarrier for link l and go to Step (6).

  • Step6: Switch to the best channel related to an optimal subcarrier with switching probability ρ s , for example ρ s  = 0.5, for uniformly distributed subcarrier layout.

  • Step7: Let n = n + 1, then go to Step 2.

Therefore, each source node can run the sub-algorithm periodically and independently. It periodically scans its available subcarrier set and maintains a record for each subcarrier. Among them, it finds and allocates its best subcarrier locally to the link l. The convergence of this algorithm can be proved by using the Theorem 2 in [31], which states that the algorithm does not have infinite looping if 0 < ρ s  < 1.

7 Numerical results

We evaluate the performance of the proposed distributed joint optimization algorithms, and compare it to the approach of Xi and Yeh in [19]. We first describe the implementation of our iterative congestion control, power control and spectrum allocation algorithms. Then we discuss the results of the comparative performance evaluation.

7.1 Implementation

We have implemented our joint congestion control, power control and spectrum allocation framework and the algorithms in [19] using Network Simulator-2 (NS-2) [32] in C++, respectively. The implementation of our algorithms includes the following components.

7.2 Path construction sub-algorithm

We first make use of the optimal linear cooperation framework (OLCF) [21] to detect the availability of a subcarrier and obtain the available subcarrier set of a CR node. A set of subcarrier paths is constructed for each node-to-node route of source. Since each subcarrier of nodes has one subcarrier-to-subcarrier link, every node has multiple subcarrier paths to its destination. The construction procedure is shown in Sect. 3. In addition, node-to-node paths are pre-computed separately by using Dijkstra’s shortest path algorithm and minimum hop count as metric.

7.3 Congestion control sub-algorithm

The CCA distributes optimally the traffics of each source to these paths to avoid congestion by adjusting the congestion prices. We interpret the congestion prices as the congestion information of links, which is indeed consistent with the contention relationship among flows in ad hoc networks since transmissions on one link contribute the congestion over all links. Source s can solve (18) separately without the need to coordinate with other sources because congestion price summarizes the entire congestion information source s needs to know. This is enabled by the multi-path congestion control problem formulation presented in Sect. 5A.

7.4 Power control sub-algorithm

By congestion price derived by CCA, we iteratively update the power of each transmitter to assign the optimal power for every link. The change of transmission power leads to the adjustment of the link capacity, which can reduce the allowed transmission rates to reach the goal of reduce congestion. The distributed algorithm is presented in Sect. 5B

7.5 Heuristic spectrum allocation sub-algorithm

Based on the link utilization obtained in (37) and congestion price in Sect. 5, the spectrum allocation sub-algorithm distributes the optimal subcarrier for each link. The algorithm is of low complexity since it only involves local searches in the most congested links. The detail allocation procedure is shown in Sect. 6.

The relationship among congestion, power and spectrum allocation can be summarized as shown in Fig. 2. As stated in Introduction, the main objective of congestion control is to regulate the allowed source rates so that the total traffic load on any link does not exceed its available capacity, which depends on the interference level, and in turn the power control and spectrum allocation policy. It follows from these observations that a tradeoff exists between subcarrier allocation, power levels, data rates, and congestion in ad hoc networks. These elements need to be coordinated in order to arrive at enhanced strategies that not only maintain a minimum variance between the actual and desired SINR levels but that also ensure a low probability of packet loss in the network.

Fig. 2
figure 2

Nonlinearly coupled dynamics of congestion, power control and spectrum allocation

It is not difficult from Fig. 2 to see that the couple of spectrum allocation with power control and the effect of spectrum allocation on the overall performance work as follows:

  1. (1)

    The HSA always monitor network congestion and data rate status in a real time from the transport layer. Once congestion occurs, at first, the HSA begins to run iteratively to turn out a spectrum allocation policy;

  2. (2)

    According to the policy and channel conditions, the PCA adjusts the corresponding transmission powers, and in turn modifies the link capacities;

  3. (3)

    The CCA adjusts the date rates of traffic sources. Certainly, it is possible that the data rates are adjusted while the allocated subcarriers and the computed link capacity keep invariable.

Therefore, congestion price information from transport layer need be used by power control and spectrum allocation in physical layer to either adapt its existing algorithm or create new diversity. Link capacity information from power control component and source rate information from congestion control component will be transmitted to spectrum allocation component to make a new subcarrier allocation strategy. Furthermore, the strategy is provided with power control component as criteria of adjusting power together with congestion price. Then congestion control component modifies transmitting rates of sources according the link capacity given by power control component. Therefore, these tasks need be jointly accomplished across the layers.

7.6 Overhead and complexity analysis for distributed joint optimization algorithm

  • Overhead—The messages ψ j (t) passed from the transmitter of link j ≠ l for every subcarrier i to the transmitters of all other links by a flooding protocol; the congestion price messages λ (i) j (t) transmitted among these cross-layer sub-algorithms; the average duration time E suc of a successful transmission; link capacity c (i) l information of link l on subcarrier i; data rate x information from its associated active CR nodes; the propagation delay and interframe spacing time.

  • Complexity of distributed joint optimization algorithm

For path construction sub-algorithm, we employ the Dijkstra’s shortest path algorithm to pre-compute node-to-node paths, and a CR node n has at most M n available subcarriers. Therefore, the complexity of this sub-algorithm is O(MN logN), where Ndenotes the number of the CR nodes, M denotes the number of the available subcarriers detected.

For congestion control sub-algorithm, it is shown in [27] that given λ (i) l , the subproblem (18) can be solved in at most O[ω s  log ω s ] steps. For each subcarrier i of source s, the number of the data rate \( \hat{x}_{s}^{(i)} \) updating iterations is bound and does depend on the number of the subcarrier-to-subcarrier links. For a source s, the complexity of this sub-algorithm is O((N + L) s  log ω s ), where ω s denotes the number of subcarrier-to-subcarrier links from h(s) to d(s). Therefore, the complexity of this sub-algorithm is O[(N + L)M 2 S log (SM)].

For power control sub-algorithm, since it need not additionally compute the congestion price λ (i) l and the number of power updating iterations is bound, which does not depend on the number of the CR nodes, the complexity of the power allocation process is, therefore, O(LM).

For spectrum allocation sub-algorithm, as stated in step 2 of the spectrum allocation sub-algorithm, we need obtain the maximum ESU W (i) l (n) according to the optimal transmission probability ρ (i) l (n) by using (38). The complexity of this step is O(M log M), where M the total number of spectrum adjustment allowed. The complexity of the whole spectrum allocation sub-algorithm is O(LM 2 log M).

If we combine the complexity of these sub-algorithms, the overall complexity of the distributed joint optimization algorithm is O[(N + L)M 2 S log (SM)].

7.7 Performance evaluation

In our simulations, for giving a justified evaluation in a large scale wireless multi-hop ad hoc network, network topology is composed of N = 100 nodes, E = 60 node-to-node links, S = 30 sources. We model an OFDMA system in which the entire bandwidth is divided into 2048 subcarriers, i.e., K = 2048, which is provided by OFDMA standard. The bandwidth Bof each subcarrier is randomly distributed in [1, 5] kbps. All the nodes are distributed randomly in an area of 1,000 × 1,000 m2. Without loss of generality, we take sources s1, s2, s3, their corresponding destinations h(s1), h(s2), h(s3) and the intermediate nodes R1, R2 as the observation object in our simulation. We set BER = 10−3, ϕ 1 = ϕ 2 = 1, and G lk  = d −4 lk where d lk is the distance between transmitter of link l and receiver of link k. The thermal noise power density of each subcarrier is −145 dBm/Hz. We assume that the minimum and maximum data rates for all the links, x min and x max are uniformly distributed in (0, 1) kbps and (12, 15) kbps, respectively, and let the minimum and maximum transmission powers for all the sources, p min and p max, have uniform distribution in (0.2, 0.8) and (6, 8) mW, respectively, for reflecting the diversity of different source rates and subcarrier powers. Finally, we set the parameters related to algorithms as iteration step size β r (t) = β p (t) = 0.01, δ = 0.1, t slot  = 20 μs, \( \bar{L}_{msg} = 10\,{\text{KB}}, \) E suc  = 300 μs, ξ 0 = 0.01 and t b be distributed randomly in (5, 20) μs since the time needed to transmit a byte is different. All of results are obtained by taking the mean values collected from 100 trial runs of corresponding appropriately experimental scenarios.

7.8 The effect of power control and spectrum allocation on utility

In this part, we will verify the efficiency and convergence of our proposed distributed algorithms. We take the utility function with proportional fairness, i.e., w s  = 1 and α = 1. The initial transmission powers for all the sources are 2.25 mW. After the algorithm achieves stable, the obtained spectrum allocation for each node is list in Table 2. Figures 3 and 4 indicate the convergences of data rate and power for each subcarrier link of source h(s1), respectively. A key question for practical implementation of the proposed cross-layer design is whether the coupled nonlinear dynamics among these sub-algorithms can proceed close to the equilibrium before the network topology, routing and source characteristics change dramatically. It is desirable to choose a constant step size that is neither so large that the algorithm diverges (e.g., violating the conditions in Sect. 5), nor so small that the convergence is too low. We find that when β r (t) and β p (t) are less than 0.05, the proposed algorithms can converge to the equilibrium. One way to accomplish this is to let each source and each transmitter autonomously decrease the step size at each time slot t according to the following rule: β r (t) = β p (t) = w/t, w > 0 is a constant. Such a diminishing sequence of step size also make the algorithm even robust: errors in queueing delay λ and path loss G can be tolerated. It is also possible to speed up the convergence of the algorithm by diagonally scaling the distributed subgradient method

$$ \hat{p}_{l}^{(i)} (t + 1) = \left[ {\hat{p}_{l}^{(i)} (t) + \left. {\beta_{p} (t)W\nabla_{l}^{(i)} V_{power} \left( {\hat{p}_{l}^{(i)} } \right)} \right|_{{\hat{p}_{l}^{(i)} = \hat{p}_{l}^{(i)} (t),\lambda_{l}^{(i)} = \lambda_{l}^{(i)} (t)}} } \right]_{{\hat{p}_{l}^{\min } }}^{{\hat{p}_{l}^{\max } }} , $$

where W should be the inverse of the Hessian H of \( \nabla_{l}^{(i)} V_{power} \left( {\hat{p}_{l}^{(i)} } \right). \) Approximately, we can take W = diag(H −1 ii ).

Table 2 The efficiency evaluation of power control and spectrum allocation
Fig. 3
figure 3

Evaluation of data rate for each subcarrier of source h(s1)

Fig. 4
figure 4

Evaluation of transmission power for each subcarrier of source h(s1)

We verify the effect of power control and spectrum allocation on the network utility from two aspect, i.e., without power control and spectrum allocation (case 1) and with them (case 2), where there exists the link congestion between R1 and R2 and the interference to R1 and d(s2) from h(s3). Table 2 shows that as more sources begin to transmit data, some subcarriers are reallocated and powers are also readjusted. Some node link capacities actually decrease, while the capacities on the bottleneck links rise to maximize the total network utility. This is achieved through a distributive adaptation of power, which lowers the power levels that cause most interference on the links that are becoming a bottleneck in the dynamic demand–supply matching process. The end-to-end throughput per watt of power transmitted, i.e., the throughput power ratio (TPR), is 20.5% higher with power control and spectrum allocation, compared TPR without power control and spectrum allocation. Therefore, power control, spectrum allocation and congestion control, which are running distributively and coordinated through the congestion prices, work together to enhance energy efficiency of multi-hop transmission across a multi-hop ad hoc network.

Moreover, we compare the convergence rate and utility by our algorithms with those by approach in [19] using equal weights w s for all sources. The main reason behind this choice is that it is not straightforward to determine optimal weights in advance. It can be seen from Fig. 5 that the convergences of our proposed algorithms are faster and the throughputs obtained by our algorithms are larger by comparison with those obtained by [19]. The main reasons are that (i) in spectrum allocation in [19], the connectivity and interference graphs need be constructed, and the link coloring problem need also be resolved, which is NP-hard. However, we need only solve the linear combination optimization problem to detect and allocate those available subcarriers, which has less overhead; (ii) in [19], the power and spectrum modifications are based on route links of nodes. Link-based modifications involve switching both radios of a radio link to a different channel. However, our modifications are based on radio spectrums, which involve switching only a single radio to a different channel. Link-based modifications are more drastic because they result in more links switching channels.

Fig. 5
figure 5

Comparison of convergence and throughput between our algorithm and [19]

7.9 The impact of node mobility on convergence and throughput

In this part, we further verify the impact of the path losses induced by mobility of the nodes and fast variation of the fading process on the convergence and the utility of joint distributed coordinative algorithms proposed in this paper. The mobility model in the simulator works in the following way: a node randomly select a destination point within the network limits and then moves toward that point at a speed of 0, 5, 10, 15 and 20 m/s. In this simulation, G lk in (27) are perturbed randomly between 20 and −20% of their true values to verify the robustness of the algorithms with inaccurate estimations of G (i) lj . From the results of Fig. 6, we have the following observations. First, our proposed algorithms are robust to the low speed mobility, and can be fast converged. Second, the aggregated throughput of the ad hoc network with lower speed mobility is larger than that with higher speed mobility. Third, the higher speed of node mobility, the larger fluctuation of the aggregated throughput. Meanwhile, it can be observed from Fig. 7 that for a given power, the higher the speed of node mobility is, and the lower the transmission rate of the node has; similarly, to keep a larger transmission rate, the node with higher mobility speed is allocated to larger power.

Fig. 6
figure 6

a The impact of node mobility on convergence b The algorithm is robust to wrong estimates of path losses

Fig. 7
figure 7

Relationships among the transmission rate, power and mobility speed for node h(s1)

The reason behind the observations is that the higher speed mobility leads to larger transmission power allocated by power control algorithm in order to keep the connectivity between source and destination, moreover, induces larger interference range and contention region. Thus, to share the same capacity of the wireless channel, flows in the ad hoc network with higher mobility speed faces more contention than that with lower speed mobility. Thus the allocated rate to flows is lower, and with more fluctuation.

7.10 The impact of packet losses on convergence

The distributed implementation of our proposed algorithms depends heavily on information feedback through message passing (such as congestion prices and messages ψ j (t)) among network elements. Obviously, if the newest congestion prices and the newest messages ψ j (t) are lost, the rate, power and price can not be timely updated. Thus, the packet losses will impact on the convergence of the distributed algorithms. It is of significance to study the robustness of our distributed algorithms to packet losses. We experimented with different percentages of packet losses and evaluated its impact on convergence. The simulation results of are shown in Fig. 8, where RNI stands for the Required Number of Iterations to achieve the expected optimal precision which refers to the ratio between the optimal solution obtained in the current iteration and that in the previous iteration. It can be observed from Fig. 8 that a large percentage of packet losses increase the convergence time and the higher optimal precision need the larger RNI to guarantee the convergence satisfied the required convergence precision.

Fig. 8
figure 8

Impact of packet losses on convergence

7.11 The impact of the dynamic change of routes on throughput and fairness

In a mobile ad hoc network, the topology change of the network induced by node mobility and the change of the total offered traffic loads will lead to the dynamic change of routes between source and destination pairs. While simulating the ad hoc network with mobile nodes under different offered traffic loads from 0 to 100 kps, we choose the random waypoint mobility model with zero pause time, and each node in the network moves at either 5, 10 or 20 m/s speed. Obviously, the faster the node moves and the more the offered traffic loads, the more frequent the routes adjust. The impact of the dynamic change of routes on the end-to-end throughput and consumed power are shown in Figs. 9 and 10, respectively. It is straightforward from Figs. 9 and 10 that the faster the change of routes, the higher the consumed power whereas the lower the achieved throughput. We further observe that when total offered loads increase, the increase of throughput is slower as the change of routes becomes fast. This is reasonable because an increase of the node movement speed will induce that the more intermediate nodes move in or out of existing routes, requiring the routing protocol to take periodic actions to repair these routes in time. It in turn leads to the higher number of signaling packets required by the routing protocol to discover and maintain routes. Those signaling packets need consume capacity and power resource in the network. The faster change of routes may generate too many signaling packets in the presence of node mobility, and therefore, a higher transmission power may be consumed.

Fig. 9
figure 9

The impact of the dynamic change of routes on throughput

Fig. 10
figure 10

The impact of the dynamic change of routes on consumed power

Furthermore, we will employ fairness index for all flows x i (1 ≤ i ≤ N)to evaluate the impact of the dynamic change of routes on fairness among flows in terms of end-to-end throughput, i.e.,

$$ f = {{\left( {\sum\limits_{i = 1}^{N} {x_{i} } } \right)^{2} } \mathord{\left/ {\vphantom {{\left( {\sum\limits_{i = 1}^{N} {x_{i} } } \right)^{2} } {\left( {N \cdot \sum\limits_{i = 1}^{N} {x_{i}^{2} } } \right)}}} \right. \kern-\nulldelimiterspace} {\left( {N \cdot \sum\limits_{i = 1}^{N} {x_{i}^{2} } } \right)}}, $$

where x i denotes the end-to-end throughput of the ith flow.

Figure 11 shows that the low movement speed will turn out the high fairness. The reason behind the observation is that the high movement speed will generate the more signaling packets. These packets waste a lot of wireless bandwidth and consume significant power. It is unfair to the flows of the nodes with low movement speed. The fairness index is still much less than 1 for the scenario of low movement speed because the traffic distribution is unbalanced in the presence of node mobility and the flows with shorter paths still have advantages over the flows with longer paths. In addition, the dynamic change of routes may also introduce instability into the network for the power control to converge, and this may result in a higher number of link drops.

Fig. 11
figure 11

The impact of the dynamic change of routes on fairness

7.12 Performance analysis of heuristic spectrum allocation

In wireless ad hoc networks, the selection criterion that a link chooses a subcarrier to provide the best connection service can be either the stability of the connection, such as the duration of connection in a mobile environment, or signal quality, such as signal strength and bit error rate. In the first case, the subcarrier allocation is somewhat random because it is not related to subcarrier utilization. We call this scheme Random Subcarrier Allocation (RSA). In the second case, the allocation is based on the interference level detected by the traffic source. We call this scheme Interference-based subcarrier allocation (ISA).

For the same throughput, compared to RSA, our heuristic spectrum allocation (HSA) improves ESU for about 40% as shown in Fig. 12; compared to ISA, HSA still improves ESU for about 25% for the simulated layout and load. This is due to the fact that HSA keeps monitoring the subcarrier utilization and reallocates subcarrier whenever it is degraded below a threshold value. An advantage is that our HSA takes into account link available capacity, the attained data rate and congestion status information.

Fig. 12
figure 12

Comparison among RSA, ISA and HAS

In terms of complexity, HSA needs much more computations than RSA and ISA due to min–max optimization. In terms of fairness, RSA and ISA may achieve an equal share of the number of the available subcarriers. For HSA, the fairness is related to the load distribution. If the load is uniformly distributed, HSA also achieves a fair subcarrier allocation, particularly for light-to-moderate load. If the load is non-uniformly distributed, HSA allocates subcarriers in a unfair way that heavily loaded nodes get more shares of the available subcarriers. The reason is that HSA allocates subcarriers in order to maximize network utility and alleviate network congestion, in stead of fairness.

8 Conclusions

In this paper, we have presented a joint cross-layer design scheme of congestion control, power control and spectrum allocation for cognitive ad hoc networks with OFDMA modulation, where the resulting algorithms are to solve a utility maximization problem with elastic link capacities. Although the original optimization problem is non-convex and non-decoupled due to multiple alternative subcarrier paths, we are still able to provide a dual-decomposable convex formulation by adding the penalty quadratic term and transforming the variables. We have derived two sub-algorithms that are not only distributed spatially, but more interestingly, they decompose vertically into the transport layer and physical layer to jointly solve the optimization problem by use of subgradient iterative project approach. The first one is a primal algorithm, i.e., congestion control sub-algorithm, where sources adjust their end-to-end rates, and the congestion prices are generated. The second is a dual subgradient algorithm, i.e., power control sub-algorithm, where the transmission power of each source is adjusted. Moreover, to dynamically adapt for the change of channel status and meet heterogeneous traffic demands, a distributed spectrum allocation sub-algorithm is proposed by iteratively solving the min–max optimization problem of effective spectrum utilization. The three sub-algorithms interact and are coordinated through congestion prices. We prove that the nonlinear coupled algorithms converges to the optimum of the joint congestion control, power control and spectrum allocation problems, even in the case of path and packet losses. Simulation results have further verified the convergence and effectiveness of the proposed algorithms and the robustness to node mobility and packet losses.