Abstract
Recently, non-orthogonal multiple access (NOMA) small cell networks (SCNs) have been studied to meet the stringent requirements for spectral efficiency and massive connectivity in the emerging 5G networks. This paper aims at addressing the overall resource orchestration issue in 5G SCNs, by considering the problem of joint user to small cell association and uplink power allocation, employing NOMA technology. In particular, the power allocation is performed under an incomplete information scenario, where the users’ channel conditions are unknown to the small cell. To treat this issue in an effective manner, contract theory is adopted in order to incentivize each user to select the power level that optimizes its own utility, while each small-cell base station (SBS) rewards them inversely proportionally to their respective sensed interference. The proposed framework is complemented by a distributed user-cell association mechanism based on reinforcement learning (RL). Indicative numerical results are provided to validate the operation and effectiveness of the proposed contract-theoretic approach.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
- Small cell networks (SCNs)
- Non-orthogonal multiple access (NOMA)
- Reinforcement learning (RL)
- Contract theory
- Incomplete information
- User association
- Power allocation
1 Introduction
The deployment of small cells co-existing with the legacy macrocell network, constitutes a straightforward and effective approach to support the deluge of data traffic induced in the uplink of 5G wireless networks. Though small cells are typically low power, low cost, short range wireless transmission systems (base stations), they have all the basic characteristics of conventional base stations and are capable of handling high data rate for individual users [1, 2]. Accordingly, they can efficiently reuse the spectrum across different geographical areas, thus, improving spectrum efficiency and enhancing network coverage and capacity. Towards the same direction of contributing to improved spectral efficiency and massive connectivity, non-orthogonal multiple access (NOMA) technology has been adopted by 5G networks, allowing multiple users to multiplex in the power domain over the same time/frequency/code resources. The successive interference cancellation (SIC) technique is, then, applied at the receiver to mitigate part of the interference, by sequentially decoding the received signals based on their signal strength [3].
One of the key problems and challenges in this emerging environment, is the inter-user interference caused by different users transmitting over the same resources. The latter heavily depends on the joint user to small cell association and corresponding power allocation problem. Therefore, the development of efficient resource allocation schemes that deal with this problem in NOMA small cell networks (SCNs) is of significant research and practical importance.
1.1 State of the Art and Motivation
Several research works (e.g., [4, 5]) have addressed the emerging resource orchestration problem in NOMA SCNs, and in particular the joint problem of user association and power allocation. For instance, in [4], the problem is formulated as a coalition formation game that associates users with the small cell, for which the total power is minimized in the optimal partition of users. In [5], a similar problem focusing on the joint user clustering and power allocation, is treated in two iterative stages to avoid the exhaustive search of user clusters. Nevertheless, the majority of existing works on the topic of user association and/or interference mitigation, presume perfect knowledge of the global channel and/or network information (e.g., channel state information (CSI)), which is either impossible to have or impractical to obtain. Considering the situation when only statistical CSI is available at the small base stations (SBSs), the private information of the user devices regarding their experienced channel conditions could be utilized to ease the resource allocation procedure. In this context, contract theory, enables the modeling of an incentive mechanism in 5G SCNs under a practical scenario of incomplete information, where the users’ private information (i.e., transmission power, wireless channel characteristics) are not known by the SBSs. Based on contract theory, the negotiations among the SBSs and the users are modeled under a network economics framework aiming to identify the SBSs’ optimal rewards provided to the users in order the latter to determine their optimal transmission power. Indicatively, we note that the work in [6] has attempted to deal with the specific CSI incompleteness problem based on contract theory principles, for heterogeneous long-term evolution advanced (LTE-A) networks. Nonetheless, the proposed approach assumes a central entity determining the optimal user association and contracts, while it is targeting orthogonal frequency division multiple access (OFDMA) environments in order to mitigate or eliminate interference, which in turn reduces the spectral efficiency due to the requirement of channel access orthogonality.
1.2 Contribution and Outline
Targeting both spectral efficiency and application feasibility, it is practically beneficial to assume that the total system available bandwidth is subdivided into orthogonal frequency chunks (i.e., channels). In this regard, adjacent small cells are allocated to different frequency chunks that do not interfere with each other. Within such communication environment, our contribution aims at introducing a resource orchestration framework through a distributed user association and power allocation scheme which reduces computational complexity, while accounting for the CSI incompleteness. Specifically, we propose a contract-based approach, in which the small cell, acting as a monopolist, determines the users’ transmit power level so as to be able of decoding their signals, and rewards them inversely proportionally to their interference in order to incentivize them to accept the proposed contract. The key idea is to offer the right contract item to each user, so that all users have the incentive to truthfully reveal their CSI. Furthermore, the determination of the small cell to which each user is associated with, is performed by the users individually based on the evaluation of the contract items, through a reinforcement learning (RL) algorithm. In a nutshell, our work offers a resource orchestration framework applicable in SCNs that differs from previous existing efforts in the literature, in the sense that it: a) considers an interference-limited wireless environment, adopting the use of NOMA technology that has arisen as a promising access technology in 5G networks, b) formulates the overall resource orchestration problem under an incomplete CSI scenario, and c) promotes the use of distributed approaches in the decision making, eliminating the need for centralized decision making entities.
The rest of the paper is organized as follows. Section 2 contains the considered system model and the introduced SBSs’ and users’ utility functions. In Sect. 3, we present the optimal contract design under incomplete information, while we determine the SBSs’ optimal rewards to incentivize the users to transmit with their optimal uplink transmission powers. The users’ association to the small cells based on reinforcement learning is highlighted in Sect. 4. In Sect. 5, indicative numerical results validating the operation of the contract theoretic framework are presented, while Sect. 6 concludes the paper.
2 System Model
We consider a heterogeneous dense wireless network consisting of a set of users \(U=\{1,\dots , |U|\}\) and a set of SBSs \(C=\{1,\dots , |C|\}\). A set \(\mathbb {U}_c\) of cardinality \(|\mathbb {U}_c|\) users represents the users associated with cell c. The channel gain of user u communicating with the SBS c is denoted by \(G_u^c=k_u/(d_u^c)^a\), where \(k_u\) is a lognormal distributed random variable with mean 0 and variance \(\sigma ^2\), \(d_u^c\) [m] is the distance between user u and SBS c and a is the corresponding path loss exponent. To consider realistic scenarios, the SBS lacks specific information of users’ transmission characteristics, i.e., CSI, type of user. Adopting the principles of contract theory [7], each SBS aims at incentivizing the users to exhibit their improved transmission characteristics by providing them some transmission-related rewards. The goal is to determine an equilibrium point, where both the SBS and the users maximize their utilities, in terms of collecting and transmitting information, respectively. The type of the user is defined as \(t_u^c=(G_u^c\cdot \sum \{1/G_u^c\})^{-1/2}, t_u^c\in (0,1]\), and without loss of generality, we consider \(|U_c|\) types of users with \(t_1^c<\dots<t_u^c<\dots <t_{|\mathbb {U}_c|}^c\). Adopting the NOMA technique and by applying SIC at the receiver in the uplink, the users with worse channel gain conditions are alleviated by the interference caused by the users with better channel conditions, as the signals of the latter are decoded first at the receiver and excluded from the interference sensed by the users with worse channel conditions. Thus, the SBS c rewards the user u with \(r_u^c=\rho /I_u^c\), where \(\rho \in \mathbb {R}^+\) is the reward factor and \(I_u^c\) is the interference that user u senses, while being associated with SBS c. The physical meaning and interpretation of the reward is that the SBS provides a greater reward to users that experience less interference. Thus, a user of higher type \(t_u^c\) (i.e., worse channel conditions), senses less interference and it is rewarded more by the SBS (for fairness purposes), as it is expected to transmit with higher power (i.e., invest greater effort).
Figure 1 summarizes the overall operation of the proposed framework. At the beginning of each time slot, the users select an SBS to be associated with in a distributed manner following a reinforcement learning approach (Sect. 4). Then, the optimal contracts are determined by each SBS for the corresponding users residing within each small cell (Sect. 3). The overall nested procedure is executed over the time guaranteeing the smooth operation of the 5G SCNs.
3 Optimal Contract Design
The considered heterogeneous dense network is characterized by asymmetry of information, as the exact users’ types and transmission powers are unknown to each SBS. Instead each SBS c only knows the probability \(Pr_u^c\) that the user u is of type \(t_u^c\) and \(\sum \limits _{u=1}^{|\mathbb {U}_c|}Pr_u^c=1\). The utility that SBS c experiences from user u is defined as \(U_c^u=P_u^c-\mathcal {C}\cdot r_u^c\), expressing the SBS’s satisfaction in terms of its operation by receiving a signal with high power strength \(P_u^c\), while being charged with the cost \(\mathcal {C}\cdot r_u^c\) of providing incentives through the rewards to the user. Thus, the overall SBS’s utility is \(U_c=\sum \limits _ {u=1}^{|\mathbb {U}_c|}[Pr_u^c\cdot (P_u^c-\mathcal {C}\cdot r_u^c)]\). On the other hand, the user’s utility is defined as \(U_u^c(P_u^c)=t_u^c\cdot e( r_u^c)-\mathcal {C'}\cdot P_u^c\), which expresses the user’s satisfaction \(t_u^c\cdot e(r_u^c)\) from receiving the reward \(r_u^c\) from the SBS c, while considering its personal transmission cost \(\mathcal {C'}\cdot P_u^c\). It is noted that the user’s satisfaction depends on its type \(t_u^c\) and on the evaluation function \(e(r_u^c)\) (with \(e(0)=0, e'(r_u^c)>0, e''(r_u^c)<0\)), which captures the user’s perception and personal satisfaction from the reward.
Following the principles of contract theory, the SBS establishes a personalized contract \((r_u^c,P_u^c)\) with each user in the small cell, where the user invests its personal effort \(P_u^c\) (i.e., uplink transmission power) and the SBS rewards the user with \(r_u^c\). A contract is feasible if the following conditions hold true: (i) the Individual Rationality (IR), i.e., the contract should guarantee that the user’s utility is non-negative \(U_u^c(P_u^c)\ge 0, \forall u\in \mathbb {U}_c\); and (ii) the Incentive Compatibility (IC), i.e., each user must select that contract which is designed for its type \(t_u^c\cdot e( r_u^c)-\mathcal {C'}\cdot P_u^c\ge t_u^c\cdot e(r_{u'}^c)-\mathcal {C'}\cdot P_{u'}^c, \forall u,u'\in \mathbb {U}_c, u\ne u'\). Additionally, the following three conditions must hold true in order the contract to be feasible.
Proposition 1
For any feasible contract \((r_u^c,P_u^c)\), the following must hold true: \(r_u^c>r_{u'}^c \Longleftrightarrow t_u^c>t_{u'}^c\) and \(r_u^c=r_{u'}^c \Longleftrightarrow t_u^c=t_{u'}^c\).
Proposition 1 can be proven by arguing as follows. We can prove separately the sufficiency and the necessity of the described condition. Regarding the sufficiency, we can write the incentive compatibility condition for two types of users, e.g., \(t_u^c>t_{u'}^c\), add the inequalities, and exploit the strictly increasing property of the evaluation function \(e( r_u^c)\) to conclude that \(r_u^c>r_{u'}^c\). Regarding the necessity, we can work backwards by considering the strictly increasing property of \(e( r_u^c)\), build the summation of the incentive compatibility constraints for two users, and conclude that \(t_u^c>t_{u'}^c\). Similarly, we can show \(r_u^c=r_{u'}^c \Longleftrightarrow t_u^c=t_{u'}^c\).
Proposition 2
A user of higher type, i.e., \(t_1^c<\dots<t_u^c<\dots <t_{|\mathbb {U}_c|}^c\) (i.e., worse channel conditions), will receive a greater reward from the SBS c to be incentivized to be served by this SBS, i.e., \(r_1^c<\dots<r_u^c<\dots <r_{|\mathbb {U}_c|}^c\), and it will transmit with greater power (i.e., effort), i.e., \(0<P_1^c<\dots<P_u^c<\dots <P_{|\mathbb {U}_c|}^c\).
The proof of this proposition intuitively stems from Proposition 1, given that \(t_1^c<\dots<t_u^c<\dots <t_{|\mathbb {U}_c|}^c\).
Proposition 3
A user of higher type, i.e., \(t_1^c<\dots<t_u^c<\dots <t_{|\mathbb {U}_c|}^c\), receives higher utility to be incentivized by the SBS c, i.e., \(U_1^c<\dots<U_u^c<\dots <U_{|\mathbb {U}_c|}^c\).
Proposition 3 proof can be concluded based on the following steps. We consider the incentive compatibility constraint for one user, and we analyze the inequality by considering a user of lower type, i.e., \(t_u^c>t_{u'}^c\), and sequentially we conclude that \(U_u^c>U_{u'}^c\). Due to space limitations only the key arguments required for the proofs of Propositions 1–3 are provided here, while the detailed intermediate steps of the proofs are presented in [8, 9].
Each SBS aims at maximizing its overall utility in order to be able to collect and properly decode the users’ transmitted signals (as dictated by the received SINR). In parallel, each user should satisfy all of its personal constraints, as they have been described above, in order to be willing to be associated with the specific cell. Thus, the following distributed optimization problem is solved at each SBS, by jointly considering the SBS and corresponding users’ sides.
The optimization problem P1 is non-convex, thus, in order to solve it, we reduce its constraints. By performing appropriate derivations we can easily show that the constraints (1b) and (1c) can be reduced to (2b) and (2c) [8, 9]. In particular, to show that the constraint (1b) can be reduced to (2b), we consider the incentive compatibility constraint for a user and by considering the strictly increasing property of the evaluation function, as well as the monotonicity of the users’ types (proposition 2), we can sequentially rewrite the incentive compatibility constraint to be reduced to the inequality \(t_u^c\cdot e( r_u^c)-\mathcal {C'}\cdot P_u^c\ge t_1^c\cdot e( r_{1}^c)-\mathcal {C'}\cdot P_{1}^c\). Then, by exploiting the individual rationality condition, we can conclude to the constraint (2b). Moreover, in order to show that the constraint (1c) is reduced to (2c), we consider the downward (i.e., \(u,u', u'\in \{1,\dots ,u-1\}\)), the upward (i.e., \(u,u', u'\in \{u+1,\dots ,|\mathbb {U}_c|\}\)), and the local downward (i.e., \(u,u-1, \forall u,u-1\in \mathbb {U}_c\)) incentive compatibility constraints. Then, we can show that all the downward incentive compatibility constraints can be represented by the local downward incentive compatibility constraint, while all the upward incentive compatibility constraints can be equivalently captured by the local downward incentive compatibility constraints. Thus, the optimization problem P1 can be rewritten as:
We can easily prove that P2 is a convex programming problem by checking the Hessian matrix. Thus, P2 can be solved by applying the KKT conditions, and accordingly the optimal users’ transmission power vector \(\mathbf {P_c^*}=[P_1^{c*},\dots ,P_{|\mathbb {U}_c|}^{c*}]\) and the optimal SBS’s rewards vector \(\mathbf {r_c^*}=[r_1^{c*},\dots ,r_{|\mathbb {U}_c|}^{c*}]\), can be determined.
4 Users’ Association Based on Reinforcement Learning
In order to support the operation of the aforementioned framework, users are enabled to autonomously select the specific cell to be associated with, based on a reinforcement learning (RL)-based approach. In this work, we adopted the reinforcement learning mechanism of the stochastic learning automata (SLA) to realize this process in a distributed manner, though other alternatives may be considered as well (e.g., Max-logit, Binary-logit) [10]. In particular, each user acts as a learning agent, where at each iteration of the SLA algorithm makes a probabilistic-based selection of an SBS to be associated with, following a simple probabilistic rule. To realize this, each cell is characterized by a reputation function \(REP_c\) depending on socio-physical characteristics, e.g., average users’ distance from an SBS, average users’ transmission power invested to communicate with an SBS, average users’ received rewards, etc., similarly to [11]. Accordingly, the cell’s reputation functions are incorporated within the RL algorithms to enable the users to probabilistically learn their best cell association.
5 Numerical Results
In this section, we present some numerical results that validate the operation and performance of the contract-theoretic component of our proposed framework, obtained through modeling and simulation. Given that adjacent cells do not interfere with each other, we focus on the operation of one indicative cell, assuming that users have already performed their association with each SBS. In the following, we consider a small cell of 500-meter radius with an SBS placed in the center of the cell and \(|U_c|=10\) users placed in increased distance from the SBS (with a step of 50-meters). Specifically, we define one type for each user as analyzed in Sect. 2, presuming that all users have the same probability \(Pr^c_u\) of belonging to each type. As mentioned earlier, the channel gain of user u communicating with the SBS c is modeled as \(G_u^c=k_u/(d_u^c)^a\), where \(k_u\) is assumed to be a lognormal distributed random variable with mean 0 and variance \(\sigma ^2=8dB\), and the corresponding path loss exponent is \(a=4\). Furthermore, the maximum uplink transmission power of the users is \(P^c_{u,max}=0.7\) [W], while the reward factor is \(\rho =10^{-15}\), the SBS’s unit cost is \(\mathcal {C}=0.7\), the user’s unit power cost is \(\mathcal {C'}=0.4\), and the user’s evaluation function is assumed \(e(r_u)=\sqrt{r_u}\).
Figure 2a presents the users’ effort, i.e. optimal uplink transmission power, and reward vs. their index, while similarly Fig. 2b shows the achievable utilities for both the users and the cell. Logarithmic scale is used to better visualize the curves’ trend and differences in the values. The results in both Fig. 2a and 2b, validate the monotonicity behavior in the offered contracts. That is the higher the user type, the more effort is required and thus, the more the reward it receives, leading to larger utility for the user itself and the cell. This is well aligned with the fact that the measured interference at the receiver after performing the SIC technique for the user with the lowest channel gain (i.e., the highest type user) is impacted only by the background noise, which is very low. Moreover, from Fig. 2b it can be seen that all types of users receive a non-negative utility, being consistent with the individual rationality constraint imposed. To complement the evaluation of the contract feasibility, the utilities of two selected users (i.e. users with type 5 and 8) are examined in Fig. 2c, for different possible contracts offered by the SBS (represented in the horizontal axis by user types). Observing these results we note that following the proposed approach, each user achieves equal or higher utility from the other types, if and only if selects the contract item intended for its own type (dictated by the red dashed vertical line in the graphs), demonstrating the satisfaction of the incentive compatibility constraint.
6 Conclusions and Future Work
In this paper, the problem of resource orchestration, in terms of user to small cell association and power allocation, in the uplink of 5G NOMA-based small cell networks is studied. A reinforcement learning technique is adopted to enable the users to select the optimal small cell to be connected with, in an autonomous and distributed manner. Thereafter, a contract-theoretic approach is introduced to design user specific contracts in terms of determining the users’ optimal uplink transmission power and the small cell’s optimal rewards provided to the users, to incentivize them to perform in an interference limited manner in the network.
Our current and future work contains the extension of this framework by considering multiple reinforcement learning approaches for the user association and comparing them in terms of efficiency and computation complexity. Finally, a natural extension of this work focuses on the use of the proposed contract-theoretic framework for the end-to-end study of the network operation, by including the backhauling communication of the SBSs to the macro base station.
References
Muirhead, D., Imran, M.A., Arshad, K.: A survey of the challenges, opportunities and use of multiple antennas in current and future 5g small cell base stations. IEEE Access 4, 2952–2964 (2016)
Tsiropoulou, E.E., Vamvakas, P., Papavassiliou, S.: Supermodular game-based distributed joint uplink power and rate allocation in two-tier femtocell networks. IEEE Trans. Mob. Comput. 16(9), 2656–2667 (2016)
Dai, L., Wang, B., Yuan, Y., Han, S., Chih-Lin, I., Wang, Z.: Non-orthogonal multiple access for 5g: solutions, challenges, opportunities, and future research trends. IEEE Commun. Mag. 53(9), 74–81 (2015)
Qian, L.P., Wu, Y., Zhou, H., Shen, X.: Joint uplink base station association and power control for small-cell networks with non-orthogonal multiple access. IEEE Trans. Wirel. Commun. 16(9), 5567–5582 (2017)
Ali, M.S., Tabassum, H., Hossain, E.: Dynamic user clustering and power allocation for uplink and downlink non-orthogonal multiple access (noma) systems. IEEE Access 4, 6325–6343 (2016)
Asheralieva, A., Miyanaga, Y.: Optimal contract design for joint user association and intercell interference mitigation in heterogeneous lte-a networks with asymmetric information. IEEE Trans. Veh. Technol. 66(6), 5284–5300 (2017)
Bolton, P., Dewatripont, M.: Contract Theory, MIT, Cambridge (2005)
Fragkos, G., Patrizi, N., Tsiropoulou, E.E., Papavassiliou, S.: Socio-aware public safety framework design: a contract theory based approach, pp. 69–79. IEEE ICC, 2020. (preprint available at shorturl.at/CPRU9) (2020)
Zhang, Y., Song, L., Saad, W., Dawy, Z., Han, Z.: Contract-based incentive mechanisms for device-to-device communications in cellular networks. IEEE J. Sel. Areas Commun. 33(10), 2144–2155 (2015)
Xu, Y., Wang, J., Wu, Q.: Distributed learning of equilibria with incomplete, dynamic, and uncertain information in wireless communication networks. In: IGI Global, Game Theory Framework Applied to Wireless Communication Networks, pp. 63–86 (2016)
Sikeridis, D., Tsiropoulou, E.E., Devetsikiotis, M., Papavassiliou, S.: Wireless powered public safety iot: a uav-assisted adaptive-learning approach towards energy efficiency. J. Netw. Comput. Appl. 123, 69–79 (2018)
Acknowledgments
The research work was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment grant” (Project Number: HFRI-FM17-2436). The research of Dr. Tsiropoulou was conducted as part of the NSF CRII-1849739.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Diamanti, M., Fragkos, G., Tsiropoulou, E.E., Papavassiliou, S. (2021). Resource Orchestration in Interference-Limited Small Cell Networks: A Contract-Theoretic Approach. In: Lasaulce, S., Mertikopoulos, P., Orda, A. (eds) Network Games, Control and Optimization. NETGCOOP 2021. Communications in Computer and Information Science, vol 1354. Springer, Cham. https://doi.org/10.1007/978-3-030-87473-5_10
Download citation
DOI: https://doi.org/10.1007/978-3-030-87473-5_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87472-8
Online ISBN: 978-3-030-87473-5
eBook Packages: Computer ScienceComputer Science (R0)