
1 Introduction

With the advancement of information, communication, and aerospace technology, global information services have expanded into every domain of human production, life, and scientific research, spanning land, sea, air, and space. The Space-Air-Ground (SAG) integrated network is composed of a space-based network, formed by interconnected satellites, and a ground-based network [1]. The integrated development of space-based and ground-based networks has extended wide-area network coverage. This offers clear advantages for delivering communication and information services in remote areas, and it has become an important development area for wide-area communication guarantees and information applications [2].

With the commercialization of 5G technology, mobile communications are expanding from serving human communication needs to providing wireless connections for the Internet of Vehicles (IoV), the Internet of Things (IoT), and the Industrial Internet [3]. These new service targets often lie beyond the scope of conventional activities, whereas current network coverage is still concentrated in urban areas and some suburbs [4]. Remote areas still lack high-bandwidth, high-quality communication services, and closing this gap will guide the future development of 6G.

SAG integration is a possible form of future 6G. It can combine the advantages of space-based and ground-based systems, using satellites, unmanned aerial vehicles (UAVs), and ground facilities together to cover the globe in a three-dimensional and efficient manner and meet the ubiquitous needs of future communication networks [5]. Satellites cover a large area, but their communication rate is limited and their communication delay is relatively large. The ground network can directly use new 5G technology, but the deployment of network facilities in remote areas is limited. In recent years, UAV-assisted communication and High Altitude Platform Stations (HAPS) have become important ways to compensate for insufficient ground network coverage and excessive satellite communication delay [6, 7].

However, integrating communication sub-networks that originally operated in different spatial domains raises the problem of how to schedule and use spectrum resources [8]. The SAG integration design should fully consider the advantages and disadvantages of the different subsystems and integrate them all within an open and flexible overall network architecture.

The 6G technology based on SAG integration makes a unified design of the access network and the core network possible. To achieve SAG integrated networking, however, the first task is to solve the problem of spectrum usage. Because the available spectrum is limited, flexible spectrum sharing is the only viable approach.

For simple integrated networks with a single drone, Hua M et al. consider a coordinated multi-point model of drones and ground base stations; by optimizing the transmit power, they enhance the service performance of ground users and reduce co-channel interference to the satellite system [9]. Li X et al. consider an offshore scenario and, under the drone's interference constraints, optimize the drone's flight trajectory and resource allocation strategy so that the drone provides accompanying coverage for a target vessel [10]. To avoid co-channel interference in the integrated network, Kong H et al. use free-space optical communication to establish the aerial link between the satellite and the drone, while the drone's transmission link serves communication with the ground users [11]. These studies have laid an important foundation for integrated network design. However, the capability of a single drone is often limited, so a formation of multiple drones is a feasible way to improve mobile network coverage. Zhang S et al. consider two schemes for forwarding satellite data to rapidly rebuild communication networks in disaster areas [12]. Liu C et al. consider forming aerial multi-cells with multiple drones and improve the communication energy efficiency of the drone formation by optimizing the drones' transmission power [13]. Because drone formations are more widely distributed in space, spectrum sharing by drone formations in integrated networks leads to more complex co-channel interference, a problem that existing research has not yet effectively solved.

In this paper, the organizational form of the SAG integrated communication network is discussed first. Then the HAPS spectrum requirements in the SAG integrated network are analyzed. Finally, the theoretical derivation and basis of dynamic spectrum allocation for the SAG integrated network are given.

2 SAG Integrated Communication Network

The SAG integrated network relies on the ground-based network, expands on the space-based network, and adopts a unified structure, technical system, and standard specifications. It interconnects a space-based network, the ground-based Internet, and the mobile communication network, as shown in Fig. 1.

Fig. 1. SAG integrated network structure

A space-based network is composed of a space segment and a ground segment. The space-based constellation network can include Geostationary Earth Orbit (GEO), Medium Earth Orbit (MEO), and Low Earth Orbit (LEO) satellite nodes, or it can contain only one of these types. The gateway (GW) node network of the ground segment is formed by interconnecting the satellite ground GW stations associated with the space-based constellation network. On the one hand, it interconnects and exchanges information with the space-based constellation network; on the other hand, it interconnects and exchanges information with the terrestrial Internet and the mobile communication network.

The ground cellular mobile network in the ground-based network is the land-based public mobile communication system with the widest coverage. A fixed base station is set up in each cell to provide users with access and information forwarding services. The ground base station is generally connected to the core network by wire. The core network is mainly responsible for user subscription management, Internet access and other services, mobility management, and session management.

There are many solutions for integrating space-based networks with ground-based cellular mobile networks. These different integration architectures will likely coexist for a long time as the network evolves, and will eventually achieve deep integration. The simplest form of integration uses the satellite network as a backhaul between ground base stations and core networks, or as a backup for the ground wired backhaul. In addition, satellites can access the 6G core network through non-3GPP access methods and share the core network with the ground mobile network. The satellite can also be connected to the 6G core network as a special 6G base station through the 3GPP access method, which is a deep integration of the satellite network and the ground network.

3 SAG Integration Spectrum Requirements

HAPS is the connecting layer in the SAG integrated network, which is 20 km to 50 km above the ground. It provides broadband access services for ground-based equipment in the ground coverage area, provides connections for network access points to connect to the backbone network, and provides emergency communication services for temporary deployment.

HAPS communication services include special application services and connectivity services. Special application services are mainly emergency communications. Connectivity services provide broadband connections between nodes in areas where communication infrastructure is difficult to deploy, including connections between ground user equipment (GUE) and gateways and user equipment. Because the usage scenarios for connectivity services defined by different countries differ markedly, the results of spectrum demand analysis also differ markedly. The specific numerical ranges are shown in Table 1.

Table 1. Frequency requirements of HAPS communication system (MHz)

The number of ground users covered by each HAPS is

$$c=\pi \times {a}^{2}\times b,$$
(1)

where \(a\) is the radius of the coverage area in kilometers, and \(b\) is the number of users per square kilometer, that is, the user density. Suppose the penetration rate is \(d\), then the number of connected users per HAPS is

$$e=c\times d.$$
(2)

The amount of forward link data is

$$h=f\times g,$$
(3)

where \(f\) represents the amount of data used by each user per month in GB, and \(g\) is the forward link ratio.

When the data rate ratio \(i\) and the utilization rate \(j\) during busy and idle hours are known, the capacity demand per user (in kbps) can be expressed as

$$k=\frac{h\times 8\times i\times 100\times {10}^{6}}{30\times 24\times 3600\times j}.$$
(4)

Therefore, we can measure the forward link capacity of each HAPS platform as

$$l=e\times k,$$
(5)

and the spectrum demand ratio of each HAPS platform as

$$n=\frac{l\times m}{100},$$
(6)

where \(m\) is the forward/reverse link ratio.
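The chain of Eqs. (1) to (6) can be evaluated step by step. The following is a minimal sketch that applies the formulas exactly as written; the function name and all input values are hypothetical, and it assumes that \(j\) and \(m\) are given in percent (which is how the factors of 100 in Eqs. (4) and (6) are read here) and that \(i\) is a dimensionless ratio.

```python
import math


def haps_forward_demand(a_km, b_users_per_km2, d, f_gb, g, i, j_percent, m_percent):
    """Evaluate Eqs. (1)-(6) in order and return every intermediate value."""
    c = math.pi * a_km ** 2 * b_users_per_km2  # Eq. (1): users in the coverage area
    e = c * d                                  # Eq. (2): connected users per HAPS
    h = f_gb * g                               # Eq. (3): forward link data, GB per month
    # Eq. (4): 8 bits per byte, 1e6 kilobits per gigabit, 30*24*3600 seconds
    # per month; the utilization j is assumed to be a percentage.
    k = (h * 8 * i * 100 * 1e6) / (30 * 24 * 3600 * j_percent)
    l = e * k                                  # Eq. (5): forward link capacity, kbps
    n = l * m_percent / 100                    # Eq. (6): m is assumed to be a percentage
    return {"c": c, "e": e, "h": h, "k": k, "l": l, "n": n}


# Hypothetical inputs, chosen only to exercise the formulas.
result = haps_forward_demand(
    a_km=50, b_users_per_km2=1.0, d=0.1, f_gb=10, g=0.8,
    i=2, j_percent=50, m_percent=20,
)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

Returning the intermediate values makes it easy to check each equation separately against Table 1 once real inputs are available.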

The spectrum efficiency of the forward and reverse links of the HAPS system is shown in Table 2.

Table 2. Spectrum efficiency of forward and reverse links

The position between the ground GW and the HAPS is relatively fixed, and the communication link uses a higher-gain directional antenna, so the highest value of 5.5 bit/s/Hz is adopted. There are three recommended values (low, average, high) for the link between GUE and HAPS to estimate the system bandwidth requirements under different channel conditions.
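Since spectral efficiency is the data rate carried per unit of bandwidth, a capacity figure can be turned into a bandwidth estimate by a single division. The sketch below illustrates this under stated assumptions: the 5.5 bit/s/Hz gateway value comes from the text, while the three GUE efficiencies and the capacity figure are hypothetical placeholders and are not the values of Table 2.

```python
def required_bandwidth_mhz(capacity_kbps, spectral_efficiency_bps_per_hz):
    """Bandwidth (MHz) needed to carry capacity_kbps at the given efficiency."""
    if spectral_efficiency_bps_per_hz <= 0:
        raise ValueError("spectral efficiency must be positive")
    bandwidth_hz = capacity_kbps * 1e3 / spectral_efficiency_bps_per_hz
    return bandwidth_hz / 1e6


GW_EFFICIENCY = 5.5  # bit/s/Hz, value adopted in the text for the GW-HAPS link

# Hypothetical placeholders for the low / average / high GUE-HAPS cases.
GUE_EFFICIENCY = {"low": 1.0, "average": 2.0, "high": 4.0}

capacity_kbps = 550_000  # hypothetical forward link capacity of one platform

gw_bandwidth = required_bandwidth_mhz(capacity_kbps, GW_EFFICIENCY)
gue_bandwidth = {
    label: required_bandwidth_mhz(capacity_kbps, eff)
    for label, eff in GUE_EFFICIENCY.items()
}
print(gw_bandwidth, gue_bandwidth)
```

The lower the assumed spectral efficiency, the more bandwidth the same capacity requires, which is why the three channel conditions yield three bandwidth estimates.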

The frequency resources that match the satellite mobile communication requirements of the SAG integrated network are mainly concentrated in the L-band. In the L-band, most of the bands from 1518 to 1559 MHz (downlink) and 1610 to 1675 MHz (uplink) are allocated to mobile satellite services. Within the 1518 to 1675 MHz range, frequency resources are also allocated to fixed, mobile, and meteorological satellite services, and mobile satellite services must share frequency resources with these services, which also hold primary allocations, within the framework of the Radio Regulations.

Table 3 analyzes the proportion of the main frequency groups in the currently declared data, together with the declaration stage of the corresponding data. If notification data (N) have been declared for a frequency range, the countries with the most frequency groups are counted. If the frequency range has only coordination data (C) declarations, the countries with the earliest receipt time for the frequency group are counted.

Table 3. Proportion of L-band interval declaration and its data declaration stage

In terms of the coordination status of each frequency group, Geostationary Satellite Orbit (GSO) satellite networks have N declarations in all frequency ranges, and the UK has the largest amount of data in all four frequency groups. For Non-Geostationary Satellite Orbit (NGSO) satellite networks, France occupies a dominant position in coordination: among the frequency groups that have no N declaration, France holds the C declaration with the highest coordination status. In terms of the proportion of frequency declarations, the most concentrated frequency groups are 1525 to 1559 MHz and 1626.5 to 1660.5 MHz, which account for almost one-third of the declarations in satellite networks.

4 Dynamic Spectrum Allocation

As a key technique of cognitive radio, dynamic spectrum allocation can greatly improve the utilization efficiency of spectrum resources and reduce the current imbalance in how spectrum resources are developed and used, so it has attracted extensive attention and in-depth study in the industry. At present, dynamic spectrum allocation algorithms based on non-intelligent techniques can be divided into three directions: algorithms based on graph theory, on game theory, and on trading theory.

The dynamic spectrum allocation algorithm based on graph theory maps the problem to the vertex coloring problem in graph theory. Each cognitive radio user, together with its available channels, is represented as a vertex of the graph, and two vertices are connected by an edge when the corresponding users cannot share the same channel. The spectrum assignment process is thus abstracted as coloring the vertices of this graph, which is called the interference graph.
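To make the coloring view of spectrum assignment concrete, here is a minimal sketch of a greedy coloring of an interference graph, in which colors play the role of channels. It is a generic textbook heuristic (highest-degree vertex first), not any of the specific algorithms cited in this section, and the graph and user names are hypothetical.

```python
def greedy_channel_assignment(interference):
    """Assign each user the lowest channel index not used by its neighbors.

    interference maps each user to the set of users that cannot share
    a channel with it (the edges of the interference graph).
    """
    # Color high-degree users first; ties are broken by name for determinism.
    order = sorted(interference, key=lambda u: (-len(interference[u]), str(u)))
    channel = {}
    for user in order:
        taken = {channel[v] for v in interference[user] if v in channel}
        ch = 0
        while ch in taken:
            ch += 1
        channel[user] = ch
    return channel


# Hypothetical interference graph: a triangle u1-u2-u3 plus u4 attached to u3.
graph = {
    "u1": {"u2", "u3"},
    "u2": {"u1", "u3"},
    "u3": {"u1", "u2", "u4"},
    "u4": {"u3"},
}
assignment = greedy_channel_assignment(graph)
print(assignment)
```

The greedy rule always produces a conflict-free assignment, but because vertex coloring is hard in general it does not guarantee the minimum number of channels.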

Vertex coloring of the interference graph is an NP-hard problem, so the optimal solution is difficult to obtain. Peng et al. proposed a heuristic algorithm that seeks a suboptimal solution. The algorithm requires node priority levels to be set in advance for each application environment, and high-priority nodes are allocated spectrum first; when the channel conditions are complex, its convergence is slow [14]. Liao Chulin et al. proposed decomposing a complex interference graph into simple graphs, which turns the sequential coloring of nodes into parallel coloring of the simple graphs and reduces the time overhead of sequential coloring [15]. Wang et al. proposed a list-coloring algorithm in which, after each round of random channel allocation, the allocated channel is deleted from the list, which speeds up convergence [16]. Liu Peng et al. proposed a dynamic spectrum allocation algorithm based on quantum genetics and graph coloring. By combining the niche technique with the quantum genetic algorithm, it addresses the problem of falling into local optima, and by dynamically adjusting the rotation gate and improving the chromosome threshold it increases the overall convergence speed [17]. He Jianqiang et al. proposed an improved method based on color-sensitive graph coloring that takes bandwidth maximization as the objective function and applies the maximum-fairness criterion in a secondary allocation; it outperforms both the plain color-sensitive graph coloring algorithm and the maximum-fairness algorithm [18].

Dynamic spectrum allocation algorithms based on game theory, which seek the maximum spectrum utilization efficiency when multiple cognitive radio users compete for spectrum, have achieved good results. Neel et al. analyzed the application prospects of game theory in cognitive radio systems and proposed dynamic spectrum allocation under the exact potential game model [19, 20]. They showed that the dynamic spectrum allocation eventually converges to a Nash equilibrium, and then analyzed the convergence of cognitive radio models under repeated games, myopic games, S-modular games, and potential games. Teng Zhijun et al. proposed a distributed algorithm based on the potential game and verified its convergence by simulation [21]. Xu et al. proposed a game-theoretic dynamic spectrum allocation model with an improved pricing function and verified it under both static and dynamic games [22].
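Convergence to a Nash equilibrium can be illustrated with best-response dynamics in a simple channel selection game. The sketch below is a generic illustration, not the method of any cited work: it assumes a symmetric interference graph and takes each user's cost to be the number of interfering neighbors on its channel, in which case the total number of conflicting edges acts as a potential function that every strict improvement lowers. All names and the example topology are hypothetical.

```python
def conflicts(user, ch, choice, neighbors):
    """Number of neighbors of `user` currently on channel `ch`."""
    return sum(1 for v in neighbors[user] if choice[v] == ch)


def best_response_dynamics(neighbors, num_channels, max_rounds=1000):
    """Let users switch channels one at a time until nobody can improve."""
    choice = {u: 0 for u in neighbors}  # everyone starts on channel 0
    for _ in range(max_rounds):
        changed = False
        for u in neighbors:
            current = conflicts(u, choice[u], choice, neighbors)
            best = min(range(num_channels),
                       key=lambda ch: conflicts(u, ch, choice, neighbors))
            # Switch only on strict improvement, so the potential decreases.
            if conflicts(u, best, choice, neighbors) < current:
                choice[u] = best
                changed = True
        if not changed:
            return choice  # no unilateral improvement left: Nash equilibrium
    raise RuntimeError("best-response dynamics did not converge")


# Hypothetical ring of four users sharing two channels.
ring = {
    "u1": {"u2", "u4"},
    "u2": {"u1", "u3"},
    "u3": {"u2", "u4"},
    "u4": {"u3", "u1"},
}
equilibrium = best_response_dynamics(ring, num_channels=2)
print(equilibrium)
```

Because each strict improvement removes at least one conflicting edge and the number of edges is finite, the loop must stop, and the stopping point is a pure-strategy Nash equilibrium of this game.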

In addition to the methods based on graph theory and game theory, dynamic spectrum allocation algorithms based on spectrum market theory and auction mechanisms have also produced many results. Auction-based dynamic spectrum allocation treats the active cognitive users as bidders, the cognitive users holding idle spectrum as sellers, and the base station as the auctioneer that runs the auction and allocation process. Chen et al. proposed a spectrum auction algorithm based on a simplified Vickrey-Clarke-Groves (VCG) model. It introduces a new pricing method, built on the first-price sealed-bid auction and driven by each user's cumulative number of participations and successful accesses, which reduces communication interruptions during spectrum switching and improves the fairness of spectrum assignment. Zhou et al. proposed a truthful double spectrum auction model to solve the untruthfulness problem in double auctions with spectrum reuse. Wang et al. take the maximization of spectrum utilization as the objective function and introduce the concept of approximate truthfulness, balancing spectrum utilization against truthfulness while maximizing the auctioneer's profit [16]. Although auction-based methods can converge to maximum spectrum utilization efficiency under defined primary-user conditions, they lack flexibility.
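The difference between the two basic sealed-bid pricing rules mentioned here can be shown on a single idle channel. The following sketch is a generic illustration and not the algorithm of any cited work; for a single item, the VCG payment reduces to the second-highest bid, which is the "second_price" rule below. The bidder names and bid values are hypothetical.

```python
def sealed_bid_auction(bids, rule="second_price"):
    """Run a single-channel sealed-bid auction.

    bids maps each bidder to its bid; returns (winner, price paid).
    """
    if not bids:
        raise ValueError("at least one bid is required")
    # Highest bid first; ties are broken by bidder name for determinism.
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winner, top_bid = ranked[0]
    if rule == "first_price":
        price = top_bid  # the winner pays its own bid
    elif rule == "second_price":
        # The winner pays the highest losing bid (0 if it is the only bidder).
        price = ranked[1][1] if len(ranked) > 1 else 0.0
    else:
        raise ValueError(f"unknown pricing rule: {rule}")
    return winner, price


# Hypothetical bids from three secondary users for one idle channel.
bids = {"su1": 5.0, "su2": 8.0, "su3": 6.0}
print(sealed_bid_auction(bids, "first_price"))
print(sealed_bid_auction(bids, "second_price"))
```

Under the second-price rule a bidder cannot lower its payment by shading its bid, which is the intuition behind the truthfulness property discussed above.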

Although the above algorithms can handle the constraints and optimization trade-offs among spectrum utilization, user communication efficiency, and network communication performance in dynamic spectrum allocation, they suffer from poor flexibility and slow convergence, and they cannot meet the requirements of distributed operation. Such centralized allocation places high demands on the communication conditions between the control center and the users and on the accuracy of spectrum sensing, and is therefore difficult to realize in practice.

With the rapid development of machine learning research in recent years, intelligent dynamic spectrum allocation methods based on machine learning algorithms have attracted growing attention from researchers.

5 Dynamic Spectrum Allocation Method Based on Multi-agent Reinforcement Learning

Traditional algorithms based on graph coloring, game theory, and trading theory require a central control entity to distribute spectrum resources. Their common problems are that communication between the users and the spectrum allocation control center consumes a large amount of resources, and that the allocation must be recomputed whenever the environment changes. The resulting time overhead is relatively large, so the real-time requirements of dynamic spectrum allocation are not met in practical applications.

Multi-agent reinforcement learning (MARL) methods can solve such problems: each agent makes distributed decisions according to the rewards it has learned from its experience of the channel environment, and converges toward the optimum. When the external environment changes, each user (agent) can respond quickly according to its well-trained policy and converge quickly. This intelligent dynamic spectrum allocation method therefore has a large advantage over the traditional algorithms in adaptability to dynamic environments and in the real-time performance of spectrum allocation.

5.1 Dynamic Spectrum Allocation Model Analysis Based on DEC-POMDP

Research on dynamic spectrum allocation models is the basis for studying dynamic allocation algorithms, and it is also an important aspect of cognitive radio theory. Dynamic spectrum access models are commonly divided into three categories, among them the dedicated model and the layered access model. The dedicated model is divided into the spectrum property rights model and the dynamic proprietary model; the layered access model is divided into the spectrum underlay access model and the opportunistic spectrum access model.

Guo Bingjie proposed adding an acknowledgment status word to each user's time slot in the DEC-POMDP dynamic spectrum allocation model, extending the observed channel state to four values: idle, busy, success, and failure. The observation information is also augmented with the number of times the channel has been observed busy up to the current time slot and the total number of observations of that channel, so that the occupancy of each channel is characterized statistically. However, the reward function of this model does not consider the users' quality of service (QoS); only successful channel access is rewarded. In actual dynamic spectrum allocation, it is not enough that a user can access the spectrum: the effect on QoS of interference caused by accessing the same channel (especially interference to the primary user) must also be considered, and the trade-off weighed under this constraint.
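The observation structure described above, and the kind of reward adjustment the criticism points toward, can be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the linear interference penalty is one possible way to reflect QoS in the reward, not a design taken from the cited work.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelObs(Enum):
    """The four observed channel states named in the text."""
    IDLE = 0
    BUSY = 1
    SUCCESS = 2
    FAILURE = 3


@dataclass
class ChannelStats:
    """Per-channel counters that accompany the observed state."""
    busy_count: int = 0
    total_count: int = 0

    def update(self, obs):
        self.total_count += 1
        if obs is ChannelObs.BUSY:
            self.busy_count += 1

    @property
    def busy_ratio(self):
        return self.busy_count / self.total_count if self.total_count else 0.0


def qos_aware_reward(obs, interference, penalty_weight=1.0):
    """Hypothetical reward: access success minus a weighted interference cost."""
    access = 1.0 if obs is ChannelObs.SUCCESS else 0.0
    return access - penalty_weight * interference


stats = ChannelStats()
for obs in (ChannelObs.BUSY, ChannelObs.IDLE, ChannelObs.BUSY, ChannelObs.SUCCESS):
    stats.update(obs)
print(stats.busy_ratio, qos_aware_reward(ChannelObs.SUCCESS, interference=0.25))
```

With `penalty_weight` set to zero the reward falls back to the access-only form criticized above; a positive weight trades access success against the interference caused.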

5.2 Dynamic Spectrum Allocation Method Based on DEC-POMDP Model

At present, MARL-based dynamic spectrum allocation algorithms built on the DEC-POMDP model are divided into methods based on independent Q-learning (IQL), methods based on cooperative Q-learning (CQL), methods based on joint Q-learning (JQL), and centralized-training, distributed-execution methods based on multi-agent actor-critic.

Dynamic Spectrum Allocation Method Based on IQL.

In methods based on independent Q-learning, each agent estimates state values and updates its policy from its own independent observations, and converges to a stable point after a large amount of training. Teng et al. proposed a dynamic spectrum allocation method with an IQL-based bidding mechanism [21]: secondary users learn optimal bidding strategies through the IQL algorithm, while the primary user generates acceptable price vectors to protect its own interests, so that bidding efficiency is effectively improved according to the secondary users' policies. Wu et al., considering the mutual interference among users caused by spectrum access behavior, combined the K-means clustering method with the IQL algorithm: users are first clustered to reduce the number of agents, and policies are then learned with a variable-learning-rate IQL method.

The dynamic spectrum allocation method based on the IQL algorithm ignores the fact that, from a single user's perspective, the changing external environment is non-Markovian and its state transition model is non-stationary. It also does not consider user cooperation when optimizing the value function, and no equilibrium strategy constraint is imposed. As a result, the method applies only to a small number of users, converges slowly, and often converges to a suboptimal strategy rather than the optimal one.
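The core of IQL can be shown in a deliberately small setting. The sketch below is a simplified, stateless variant (each agent keeps one Q value per channel and there is no state transition), which is an assumption made for brevity and not the model of any cited work. Two agents choose among two channels, a transmission succeeds only if the agent is alone on its channel, and each agent updates its own Q table from its own reward only. All names and hyperparameters are hypothetical.

```python
import random


def train_iql(num_agents=2, num_channels=2, steps=2000,
              alpha=0.1, epsilon=0.1, seed=0):
    """Stateless independent Q-learning for collision-free channel access."""
    rng = random.Random(seed)
    q = [[0.0] * num_channels for _ in range(num_agents)]
    for _ in range(steps):
        actions = []
        for agent in range(num_agents):
            if rng.random() < epsilon:
                # Explore: pick a channel uniformly at random.
                actions.append(int(rng.random() * num_channels))
            else:
                # Exploit: pick the channel with the highest Q value.
                actions.append(max(range(num_channels),
                                   key=lambda c: q[agent][c]))
        for agent, ch in enumerate(actions):
            # Reward 1 if the agent is alone on its channel, else 0.
            reward = 1.0 if actions.count(ch) == 1 else 0.0
            # Each agent updates only its own table from its own reward.
            q[agent][ch] += alpha * (reward - q[agent][ch])
    return q


q_tables = train_iql()
greedy = [max(range(len(q)), key=lambda c: q[c]) for q in q_tables]
print(greedy)
```

Each agent treats the other as part of the environment, so the reward it sees for a channel changes whenever the other agent changes its behavior; this is the non-stationarity that the paragraph above identifies as the weakness of IQL when the number of users grows.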

Dynamic Spectrum Allocation Method Based on CQL.

Methods based on cooperative Q-learning consider not only the agent's own current state but also the actions and strategy trends of the other agents, so that each user's Q function converges faster to a stable point (or Nash equilibrium point). The CQL algorithm needs to obtain the actions and Q functions of all other agents as well as the joint state information of the environment. Under distributed decision conditions, obtaining the overall state is not easy, and in real communication networks this complete information exchange causes heavy communication overhead, which makes the method difficult to implement.

Dynamic Spectrum Allocation Method Based on JQL.

The JQL-based approach is a centralized-training, centralized-execution method. It treats all users' actions as one joint action in the global environment, so the partially observable Markov decision problem is simplified to a general Markov decision problem, to which single-agent reinforcement learning can be applied directly. Wang et al. applied a DQN in this centralized manner and verified the convergence of the algorithm experimentally; a comparison with the Whittle index heuristic and with the myopic policy that is optimal under positively correlated channels indicated that the DQN can converge to the result of the optimal algorithm [16]. However, the JQL algorithm has three drawbacks. First, it requires centralized decisions: in every state the center must have full control of the users, which incurs communication overhead. Second, the algorithm requires complete information about the environment, which is difficult to obtain in practice because of multipath, shadow fading, and path loss. Third, as the number of users increases, the dimension of the joint action space grows exponentially, which makes the value function difficult to train. Therefore, it is suitable for problems with few users, and not for dynamic spectrum allocation in ultra-dense networks (UDNs) with a large number of users.
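The exponential growth of the joint action space is easy to quantify. As a minimal sketch, assume each user selects exactly one of a fixed set of channels per time slot (an assumption made for illustration); the number of joint actions is then the number of channels raised to the number of users. The function names are hypothetical.

```python
from itertools import product


def joint_action_count(num_users, num_channels):
    """Size of the joint action space when every user picks one channel."""
    return num_channels ** num_users


def joint_actions(num_users, num_channels):
    """Enumerate all joint actions; practical only for very small problems."""
    return list(product(range(num_channels), repeat=num_users))


for users in (2, 5, 10, 20):
    print(users, joint_action_count(users, num_channels=10))
```

With ten channels, going from 2 to 20 users raises the count from one hundred to ten to the power of twenty, which is why a single centralized value function over joint actions becomes impractical in dense networks.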

6 SAG Integration Spectrum Allocation

The SAG integrated network of UAV formations and satellites sharing spectrum is shown in Fig. 2. Among them, ground-based base stations cover areas within 100 km, and satellites cover all areas, but mainly remote users. UAV formations are deployed in an on-demand manner to fill the blind spots of broadband coverage in ground-based networks.

UAV formations often move randomly over a large area, and satellite systems cover a large area. This makes it impractical for either satellites or UAV formations to occupy a dedicated section of spectrum efficiently across the entire network. Therefore, we consider the scenario in which UAV formations and satellites share spectrum. Under the UAV load constraint, the joint allocation of UAV formation power and frequency-domain sub-channels is studied, with the aim of satisfying the interference constraints of the UAV formation on the satellite system while maximizing the communication energy efficiency of the UAV formation.

Fig. 2. UAV formation and satellite sharing spectrum in SAG integrated network

Assume that \(K\) UAVs form a formation, and there are \(N\) frequency domain sub-channels available for use. Without loss of generality, it is assumed that each sub-channel is used by a satellite user and a UAV user, that is, there are \(N\) UAV users and \(N\) satellite users sharing the spectrum in the network. If the m-th UAV user uses the g-th sub-channel, the received signal can be expressed as follows

$${y}_{m,g}={H}_{m,g}{x}_{g}+{n}_{m,g},$$
(7)

where \({H}_{m,g}\) represents the channel matrix and \({x}_{g}\) represents the transmission signal of the UAV formation on the g-th subchannel. \({n}_{m,g}\) represents additive white Gaussian noise, each element of which obeys a complex Gaussian distribution with a mean value of 0 and a variance of \({\sigma }^{2}\), and the elements are independent of each other.

Considering the classic UAV channel model, \({H}_{m,g}\) can be expressed as

$${H}_{m,g}={S}_{m,g}{L}_{m,g},$$
(8)

where the elements of \({S}_{m,g}\) obey the independent and identically distributed standard complex Gaussian distribution, which characterizes small-scale channel fading. \({L}_{m,g}=diag\left\{{l}_{m,g,1},\cdots ,{l}_{m,g,K}\right\}\) is a diagonal matrix, where \({l}_{m,g,k}\) is a large-scale channel parameter, which represents the path loss between the k-th UAV and the m-th UAV user when using the g-th sub-channel.

The average communication rate \({R}_{m,g}\) of UAV user \(m\) on the g-th sub-channel over a period \(T\) is

$${R}_{m,g}={E}_{{S}_{m,g}}\left({\log}_{2}\det\left({I}_{{N}_{a}}+\frac{1}{{\sigma }^{2}}{S}_{m,g}{L}_{m,g}{P}_{g}{L}_{m,g}{S}_{m,g}^{H}\right)\right),$$
(9)

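where \({P}_{g}=E\left\{{x}_{g}{x}_{g}^{H}\right\}=diag\left\{{p}_{g,1},\cdots ,{p}_{g,K}\right\}\) is a diagonal matrix that represents the transmission signal power matrix of the UAV formation, and \({I}_{{N}_{a}}\) is the \({N}_{a}\times {N}_{a}\) identity matrix, with \({N}_{a}\) the dimension of the received signal \({y}_{m,g}\).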
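The expectation in Eq. (9) has no simple closed form in general, but it can be estimated by averaging over random draws of the small-scale fading. The sketch below does this for the special case of a single-antenna user (\({N}_{a}=1\), an assumption made so that only the standard library is needed), where the determinant reduces to the scalar \(1+\frac{1}{{\sigma }^{2}}\sum_{k}{\left|{s}_{k}\right|}^{2}{l}_{k}^{2}{p}_{k}\). The function name, sample count, and inputs are hypothetical.

```python
import math
import random


def average_rate(path_loss, power, noise_var, samples=20000, seed=0):
    """Monte Carlo estimate of Eq. (9) for a single-antenna user (N_a = 1).

    path_loss: large-scale parameters l_k, one per UAV
    power:     transmit powers p_k, one per UAV
    """
    rng = random.Random(seed)
    scale = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    total = 0.0
    for _ in range(samples):
        received = 0.0
        for l_k, p_k in zip(path_loss, power):
            # Standard complex Gaussian small-scale fading coefficient.
            s = complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
            received += abs(s) ** 2 * l_k ** 2 * p_k
        total += math.log2(1.0 + received / noise_var)
    return total / samples


# Hypothetical formation of three UAVs serving one user.
rate = average_rate(path_loss=[1.0, 0.8, 0.5], power=[1.0, 1.0, 1.0], noise_var=1.0)
print(round(rate, 3))
```

Fixing the seed makes the estimate reproducible, which is convenient when the rate is used inside a power or sub-channel optimization loop that compares candidate allocations.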

The total data transmission volume of all users in T time is

$$D\left(P,A\right)=\sum_{m=1}^{N}\sum_{g=1}^{N}{\alpha }_{m,g}T{R}_{m,g},$$
(10)

where \(P=\left\{{P}_{1},\cdots ,{P}_{N}\right\}\) represents a collection of power matrices, and the element \({\alpha }_{m,g}\in \left\{0,1\right\}\) in \(A\) indicates whether the m-th UAV user uses the g-th sub-channel for communication.
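Eq. (10) is a weighted double sum and can be evaluated directly once the assignment indicators and the rates are known. The sketch below uses hypothetical rates and a hypothetical assignment; the validity check encodes the assumption stated earlier that each sub-channel is used by one UAV user, together with the additional assumption that each user uses one sub-channel.

```python
def is_valid_assignment(alpha):
    """Check that every user has one sub-channel and every sub-channel one user."""
    n = len(alpha)
    rows_ok = all(sum(row) == 1 for row in alpha)
    cols_ok = all(sum(alpha[m][g] for m in range(n)) == 1 for g in range(n))
    return rows_ok and cols_ok


def total_data(alpha, rate, duration):
    """Eq. (10): total data of all users over `duration`.

    alpha[m][g] is 1 if user m uses sub-channel g, else 0;
    rate[m][g] is the average rate R_{m,g}.
    """
    n = len(alpha)
    return sum(alpha[m][g] * duration * rate[m][g]
               for m in range(n) for g in range(n))


# Hypothetical two-user, two-sub-channel example.
alpha = [[0, 1],
         [1, 0]]
rate = [[1.0, 2.0],
        [3.0, 4.0]]
print(is_valid_assignment(alpha), total_data(alpha, rate, duration=10.0))
```

Only the entries selected by the assignment contribute, so changing the assignment changes the total even when the rates are fixed; this is what makes the sub-channel allocation part of the optimization.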

For satellite users, suppose the g-th satellite user uses the g-th sub-channel. Then its channel can be expressed as

$${h}_{g}={s}_{g}{\overline{L} }_{g},$$
(11)

where the elements in \({s}_{g}\) obey the standard complex Gaussian distribution and are independent of each other. \({\overline{L} }_{g}=diag\left\{{\overline{l} }_{g,1},\cdots ,{\overline{l} }_{g,K}\right\}\) represents the large-scale channel information of satellite users.

Based on this, the average interference caused by the UAV formation to the satellite user \(g\) can be expressed as

$$ I_{g} = E_{{s_{g} }} \left\{ {s_{g} \overline{L}_{g} P_{g} \overline{L}_{g} s_{g}^{H} } \right\} = \sum\nolimits_{k = 1}^{K} {p_{g,k} \overline{l}_{g,k}^{2} } . $$
(12)
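The closed form in Eq. (12) follows because each small-scale coefficient has unit average power, so only the large-scale terms survive the expectation. The sketch below evaluates the closed form and compares it with a simulated average; the function names and input values are hypothetical, and the second argument holds the per-UAV large-scale parameters whose squares appear in Eq. (12).

```python
import math
import random


def average_interference(power, sat_path_loss):
    """Closed form of Eq. (12): sum over UAVs of p_k times the squared loss term."""
    return sum(p * l ** 2 for p, l in zip(power, sat_path_loss))


def simulated_interference(power, sat_path_loss, samples=20000, seed=0):
    """Monte Carlo average of the instantaneous interference power."""
    rng = random.Random(seed)
    scale = math.sqrt(0.5)  # unit-variance complex Gaussian coefficients
    total = 0.0
    for _ in range(samples):
        for p, l in zip(power, sat_path_loss):
            s = complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
            total += abs(s) ** 2 * l ** 2 * p
    return total / samples


# Hypothetical two-UAV formation interfering with one satellite user.
powers = [1.0, 2.0]
losses = [0.5, 0.1]
print(average_interference(powers, losses), simulated_interference(powers, losses))
```

Because the closed form is linear in the transmit powers, an interference limit on a satellite user becomes a linear constraint on the power allocation of the formation.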

7 Conclusion

The SAG integrated network will be an important public information infrastructure in the future. This paper introduces and analyzes the concept of dynamic spectrum allocation and the related algorithms. It focuses on the spectrum resources of the SAG integrated network that can match mobile communication needs; the feasibility of using these spectrum resources is assessed from the proportion of declarations in the different frequency groups of each service. It also examines the dynamic allocation of spectrum resources in the SAG integrated network in the scenario where UAV formations and satellites share spectrum. The conclusions provide a reference for improving the feasibility of spectrum resource allocation in the SAG integrated network.