
1 Introduction

The Space-Air-Ground Integrated Network (SAGIN) has attracted attention from both academia and industry in recent years as a novel network architecture that integrates satellite, aerial platform, and ground communication systems. Owing to its heterogeneity, self-organization, and time-varying characteristics, SAGIN offers significant benefits to a wide range of services and applications, but it also faces numerous challenges, such as routing, resource allocation and management, power control, and end-to-end quality of service (QoS) requirements. Compared with conventional ground communication systems and satellite networks, SAGIN is constrained by the imbalance of network resources across its segments, which makes it difficult to achieve optimal network performance during data transfer. Network optimization and system design are therefore critical in the space-earth integrated network.

The traditional ground wireless communication network has evolved over the last few decades from the first-generation analog system to second-generation digital mobile communication, and then to third- and fourth-generation digital mobile communication. Because of the exponential growth in both the number of users and the types of service, the wireless communication network has become the fundamental network for information exchange in human society. To meet the growing demand for data, future communication networks will need to supply more resources than present systems [1]. The fifth-generation mobile communication system (5G) [2] has drawn wide attention because it is intended to support new application technologies such as cloud computing, big data, and the Internet of Things and to meet more wireless access service requirements, and the relevant technologies and standards have been successively proposed and implemented. Unlike 3G, which is technology-centric [3], and 4G, which is service-centric [4], the 5G network is user-centric [4] and application-driven [5]. It will deploy open, flexible, secure, and reconfigurable network infrastructure and provide high-bandwidth, low-delay data access for billions of user devices of many types and modes, in order to meet the huge data demand generated by new applications and ever-growing new services. However, because of limits on network capacity and coverage, ground communication systems alone cannot provide high-speed and reliable wireless access everywhere on the planet, particularly in remote and environmentally sensitive areas such as mountains and seas. A new network architecture must therefore be designed and developed to meet the diverse application needs and service quality requirements of future wireless communication.

As a novel network architecture, the Space-Air-Ground Integrated Network (SAGIN) makes extensive use of contemporary information, communication, and computer network technologies. Building on the interconnection of heterogeneous satellite, air, and ground networks, it integrates the three segments to achieve the convergence, management, and distribution of network resources, as well as the acquisition, transmission, and processing of real-time data. The benefits of SAGIN include wide coverage, a high data transfer rate, and high network dependability. Earth observation and mapping, intelligent transportation systems, military missions, homeland security, disaster relief, and other fields can all benefit from it. High-throughput satellites can provide global wireless access services, air networks can meet high-quality service needs in the areas they serve, and densely deployed ground network equipment can enable high-speed data access. The combination of space, air, and ground networks can bring substantial benefits to the future 5G wireless communication system and expand the range of applications and services available. In recent years, the United States, Japan, and other developed countries have launched their own space-ground integration projects, such as the United States' Global Information Grid (GIG), the Space Communication and Navigation architecture [6], OneWeb [7], and SpaceX [8], as well as the European Union's Multinational Space-based Imaging System [9], Japan's Basic Plan on Space Policy [10], and others.

As a multi-dimensional network, the space-earth integrated network employs different communication protocols in its satellite, air, and ground segments to provide high-speed and reliable data transfer. Unlike a standalone ground network or satellite system, the integrated network is affected by data distribution, resource allocation, load balancing, power control, routing strategy, and end-to-end quality of service (QoS) requirements across all of its segments. Network designers must therefore consider how to achieve the best end-to-end data transmission performance with restricted network resources. For SAGIN, a converged architecture that includes several communication systems, using limited network resources to achieve the best information exchange performance is difficult, particularly for cross-layer data transmission between the satellite, air, and ground segments. As a result, SAGIN's system design and performance optimization are critical for the collaborative control and interconnection of satellite networks, air platforms, and ground communication systems, and for the development of a real-time, dependable, stable, and efficient integrated information system.

2 Space-Air-Ground-Sea Integrated Network

The goal of the space-earth integrated network is to offer a mobile communication network that can be used by anybody, anywhere, at any time. The handover algorithm of the space-earth integrated network seeks to improve the network's overall utilization efficiency and to minimize user access delay. Figure 1 depicts the architecture of the integrated network.

Fig. 1. Space-Air-Ground-Sea integrated network architecture

2.1 Overview of the Research on Space Earth Integrated Network

Researchers at home and abroad have studied the satellite network, the air network, and the integrated satellite-ground network within the space-earth integrated system, covering topics such as handover and mobility management, small satellite systems, unmanned aerial vehicle (UAV) communication, and QoS provision and support. As early as 2006, K. Chowdhury and colleagues conducted an extensive study of handover mechanisms in LEO satellites [11], comparing link-layer and network-layer handover mechanisms against various QoS criteria. Gupta et al. reviewed research hotspots in UAV communication networks, including high mobility, dynamic topology, link discontinuity, energy constraints, and link quality variation [12]. From the standpoint of communication and networking, Hayat et al. summarized the features and needs of UAV networks in civil applications reported from 2000 to 2015 [13]. In 2016, Niephaus and colleagues surveyed the state of research on integrated satellite-ground networks, focusing on QoS provision and guarantees when satellite and ground links are used together [14].

Compared with a typical ground communication network, the space-earth integrated network is hampered by resource restrictions in its multiple network segments, such as restricted spectrum, limited bandwidth, and unstable wireless connections. Network operators must nevertheless deliver the highest possible network performance in order to provide a communication environment with high bandwidth, high dependability, and high throughput. To this end, extensive studies of the performance of the integrated network have been conducted with the aim of increasing system bandwidth, dependability, and throughput.

2.2 Selection of Cross Layer Data Communication Gateway

In the space-earth integrated network, cross-layer data transfer from the ground to the satellite through the air platform is a technical challenge. As in multi-domain wireless communication in mobile ad hoc networks (MANETs), the most common approach for connecting several network domains is to select a certain number of nodes in each domain as gateways. Many gateway selection strategies for MANETs have been proposed over the last 10 years. The application scenarios of these approaches, as well as the problems they address, are summarized in Table 1.

Table 1. Gateway selection methods

For gateway selection from an IoT cluster to LTE-Advanced infrastructure, Zhioua and colleagues proposed a cooperative data transmission method based on fuzzy logic [15], which takes into account the received signal strength, the load of the candidate gateway, and the duration of the vehicle-to-vehicle link connection. In a system integrating the Internet of Vehicles with mobile communication, Alawi et al. designed a simple gateway selection scheme based on three network performance indicators to select multi-hop relay nodes, expand network coverage, and keep vehicles continuously connected to the mobile communication infrastructure [16]. Reference [17] presents a distributed gateway selection technique for obtaining accurate and effective path performance results in hybrid wireless networks. For converged Internet-MANET networks, Zaman et al. designed a gateway selection method based on data priority to minimize the average end-to-end delay, packet loss rate, and routing overhead. Dhaou et al. [19] developed an evolutionary algorithm to tackle the gateway selection problem in hybrid MANET-satellite networks, with the goal of reducing gateway load and the number of gateways.

To increase network stability, Luo and colleagues proposed a distributed method for choosing gateways in an aerial ad hoc network made up of UAVs. The method splits the whole UAV network into sub-regions based on the network's application features [20].

Although several successful gateway selection algorithms have been presented, they share the same flaw: they evaluate only a single ground or air network and disregard the other interconnected networks. Zhong et al. devised an optimal gateway selection method based on game theory for multi-domain wireless networks [21], but they did not account for the constraints of cross-domain communication. The network parameters of the space, air, and ground segments (layers) of the integrated system can affect the performance of cross-layer data transmission, particularly data volume distribution and cross-layer link quality and capacity, which are critical for ensuring QoS in cross-layer communication. Optimizing cross-layer data transfer to obtain the best QoS is therefore critical.

3 Access of Space-Air-Ground-Sea Integrated Network

3.1 Wireless Access Control Based on Artificial Intelligence

Artificial intelligence and big data mining technologies have advanced to the point where the construction of intelligent wireless communication networks is on the rise. Leading domestic and international communication companies have recognized that the knowledge contained in big data can improve the operation and efficiency of wireless communication networks. China Mobile has developed a big data-based platform for network operation, maintenance, and optimization [22]. By mining and analyzing signaling data, the platform performs network performance monitoring, fault detection, intelligent dynamic planning of access resources, and other tasks. It also presents a design for an access network with flexible function deployment and intelligent awareness of user and service context, for example an access network that anticipates user demand through big data analysis and pushes content locally. Huawei identifies the characteristics of user behavior and network traffic by mining and analyzing user, traffic, network, and other related data, and then applies machine learning to the mined features so that the network can manage wireless resources adaptively and carry out intelligent network planning driven by user experience. Orange, a French telecommunications company, proposes using big data analysis to achieve intelligent scheduling and optimization of network parameters. The European Telecommunications Standards Institute (ETSI) proposes adding a network data analytics function (NWDAF) to the 5G network to provide customized slice-level load analytics to policy control function entities and slice selection function entities, thereby aiding network resource allocation, service steering, and slice selection. Basic research on intelligent access control aided by wireless big data has recently concentrated on how to use big data prediction for proactive resource allocation and proactive service scheduling.
"Proactive" here means using predictive knowledge to plan or take future actions in advance. Using user mobility prediction information, including the cell the user will access next and the dwell time in that cell, reference [23] extracts the future resource competition relationship among multiple users and analyzes the queuing model of mobile users under a proactive handover mechanism, which is used to coordinate competition among users in advance. According to reference [24], with predictions of the content a user will request and of the channel state within a prediction window, the user can be connected to the base station in advance to transmit the requested file, and the longer the prediction window, the shorter the service delay. Similarly, reference [25] analyzes the queuing model of proactive service scheduling and uses service arrival predictions to transmit the data a user will request in advance. Reference [26] predicts the expected transmission rate with a deep learning approach and plans the allocation of users' frequency band resources in advance. Reference [27] pre-plans the user access choices for each frame of the prediction window; it predicts the user's future position and uses that information to arrange base station sleep management, resource reservation, and data transmission. Because the performance gain of a proactive mechanism depends largely on prediction accuracy, reference [28] presents a robust optimization approach to counter the uncertainty induced by prediction error. That study predicts how traffic flow changes over time and uses the prediction to model the load of a geographic region. The prediction error is assumed to follow a Gaussian distribution, and a gradient descent based algorithm is proposed to optimize the pre-planning of user access selection so as to balance base station load, with the goal of minimizing the expected sum of squared regional loads.
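The objective described for [28] can be written out under a few simplifying assumptions that are ours, not taken from the reference: $x_{u,r} \in [0,1]$ is a relaxed fraction of user $u$'s traffic pre-planned onto the base station of region $r$, $\hat{d}_u$ is the predicted demand of user $u$, and the prediction errors $\varepsilon_u \sim \mathcal{N}(0, \sigma_u^2)$ are independent. The sketch below only illustrates why the Gaussian assumption yields a closed-form objective that gradient descent can handle.

```latex
\begin{align}
  L_r &= \sum_{u} x_{u,r}\,\bigl(\hat{d}_u + \varepsilon_u\bigr), \\
  \mathbb{E}\Bigl[\sum_{r} L_r^2\Bigr]
      &= \sum_{r} \Bigl[\Bigl(\sum_{u} x_{u,r}\,\hat{d}_u\Bigr)^{2}
         + \sum_{u} x_{u,r}^{2}\,\sigma_u^{2}\Bigr], \\
  \frac{\partial}{\partial x_{u,r}}\,\mathbb{E}\Bigl[\sum_{r'} L_{r'}^2\Bigr]
      &= 2\,\hat{d}_u \sum_{v} x_{v,r}\,\hat{d}_v + 2\,x_{u,r}\,\sigma_u^{2}.
\end{align}
```

The second line uses $\mathbb{E}[L_r^2] = (\mathbb{E}[L_r])^2 + \operatorname{Var}(L_r)$. A projected gradient step would then update each $x_{u,r}$ along the negative gradient and re-normalize so that $\sum_r x_{u,r} = 1$ for every user. The variance term penalizes concentrating uncertain demand on a single region.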

Although industry and academia have studied how to improve the service capability of wireless communication networks through intelligent means and how to learn from the communication environment, research on how to transform wireless data into knowledge that is useful for access control, and on when and how to use that knowledge, is still limited.

3.2 Multiple Access Selection in Heterogeneous Wireless Networks

The era of big data has arrived in mobile wireless networks. Rising traffic demand imposes stricter requirements on the service capabilities of future wireless networks, such as a 10–100 fold increase in network capacity and a 1 ms end-to-end latency. Effective technologies, such as MIMO, millimeter wave, and mobile edge computing, have emerged to reach these performance targets. Furthermore, wireless networks based on different standards, such as LTE and WiMAX, will coexist for a long time and form heterogeneous wireless networks. Making full use of coexisting wireless networks for parallel transmission in this heterogeneous scenario yields a radio access technology (RAT) multiplexing gain [29], greatly improves network capacity and reliability, reduces service delay, and becomes a key scheme for enhancing network service capability. Thanks to advances in electronics, wireless communication equipment can now be fitted with several RAT interfaces.

Intelligent user access selection and band resource allocation strategies can substantially enhance throughput and QoS in heterogeneous wireless networks. On the user side, the service types of wireless communication networks will become more diverse, and different services have quite different requirements. Moreover, the variety of terminal equipment, user preferences, and other variables leads to different QoS needs even for the same service. Because services are clearly differentiated, the wireless network is required to deliver differentiated services. On the network side, heterogeneous networks bring a wide range of network characteristics to the access environment, including transmission, energy consumption, and coverage characteristics. The network's diverse features must therefore be matched to the wide range of service requirements. The term "context" refers to all of the information that can be used to characterize the properties of an entity. For users, it may be their location, QoS needs, or other factors; for the network, it may be changes in base station load, interference, network characteristics, or other factors. Adapting to both network-side and user-side context is a fundamental requirement of intelligent access, and neglecting either side results in performance loss. For example, the maximum signal-to-noise ratio (max-SNR) access technique used in current networks overlooks network load, which results in network congestion, low transmission rates, and frequent handoffs.
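The max-SNR shortcoming mentioned above can be illustrated with a minimal sketch. The rate model (Shannon capacity shared equally among attached users), the bandwidth, and all numeric values are illustrative assumptions and do not come from the cited literature.

```python
import math

def max_snr_choice(snr_db):
    """Pick the base station with the highest SNR, ignoring load."""
    return max(snr_db, key=snr_db.get)

def load_aware_choice(snr_db, users_on_bs, bandwidth_hz=10e6):
    """Pick the base station with the highest estimated per-user rate.

    Assumption: each station's Shannon capacity is shared equally among
    the users already attached plus the newcomer.
    """
    def rate(bs):
        snr_linear = 10 ** (snr_db[bs] / 10)
        return bandwidth_hz * math.log2(1 + snr_linear) / (users_on_bs[bs] + 1)
    return max(snr_db, key=rate)

# Hypothetical context: a strong but crowded macro cell, a weaker idle small cell.
snr_db = {"macro": 20.0, "small_cell": 15.0}
users_on_bs = {"macro": 9, "small_cell": 1}

print(max_snr_choice(snr_db))                  # signal-only decision
print(load_aware_choice(snr_db, users_on_bs))  # decision that also uses load
```

In this toy setting the signal-only rule sends the user to the crowded macro cell, while the load-aware rule prefers the small cell because the per-user share there is larger.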

Among studies of multi-access selection, reference [30] proposes carrying user services over multiple wireless transmission paths to achieve load balancing, with bandwidth allocation optimized by maximizing a utility function subject to network capacity. Reference [31] additionally takes the return link latency into account. However, these studies consider only a single service and ignore the differences among multi-service requirements in heterogeneous networks. Joint optimization of access selection and resource allocation is addressed in [32–34]. Reference [32] considers the file transmission service and selects the best group of wireless networks for access, together with the size of the file block sent over each network in the group, in order to reduce transmission cost. However, it does not account for the cost of interference and congestion induced by resource competition among users. Reference [33] uses a network congestion threshold to decide user access selection and then further optimizes the allocation of time slot and frequency band resources after access. Reference [34] optimizes the best wireless network group for each user, but in practice each wireless network has numerous access points (APs), and AP selection is not considered in that study. Studies of the multiplexing gain of multi-RAT access also remain scarce. To summarize, past research on user access and resource allocation in heterogeneous wireless networks has not jointly considered user and network context. Furthermore, it does not fully utilize coexisting heterogeneous network resources: it either considers transmission over a single wireless network only, or optimizes only the wireless network group for parallel transmission without addressing AP selection within each network. These shortcomings motivate the intelligent access approaches discussed in the next section.

4 Reinforcement Learning Based Intelligent Access of Space-Air-Ground-Sea Integrated Network

4.1 Heterogeneous Wireless Network Access Algorithm

When a terminal device is in a heterogeneous wireless network environment, it automatically detects the available networks and can access those that match its requirements. Deciding which network to access is essentially a combination of network resource reassignment and resource scheduling. The user's terminal continuously generates different services and selects networks based on network QoS and user QoE; how to use this information effectively so that users access the best network is the main difficulty of the access algorithm.

Existing network access algorithms are based on game theory, historical data, optimization theory, and policy-based strategies. The most basic access method compares the SINR of each candidate network and connects to the best one. Together with RSS-based selection, it forms the standard access and association algorithm, although performance after access is frequently poor.

Multi-attribute decision making (MADM) is the most widely studied approach among network access algorithms, and MADM strategies are mainly divided into those based on cost functions and other strategies. Network QoS and user QoE are typically combined into a utility function that serves as the criterion by which the terminal selects a network. The service type generated by the terminal also influences the selection, and using it as an access parameter is a typical technique. Terminal applications communicate over HTTP/HTTPS, and they can be classified into different service types according to the traffic they generate.
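As a concrete instance of MADM, the sketch below uses simple additive weighting (SAW), one of the simplest MADM methods, chosen here purely for illustration. The attributes, weights, and candidate values are hypothetical.

```python
def simple_additive_weighting(networks, weights, benefit):
    """Score candidate networks with simple additive weighting (SAW).

    networks: {name: {attribute: value}}
    weights:  {attribute: weight}, assumed to sum to 1
    benefit:  {attribute: True if larger is better, False if it is a cost}
    """
    scores = {name: 0.0 for name in networks}
    for attr, w in weights.items():
        values = [attrs[attr] for attrs in networks.values()]
        lo, hi = min(values), max(values)
        for name, attrs in networks.items():
            if hi == lo:
                norm = 1.0  # attribute does not discriminate between candidates
            elif benefit[attr]:
                norm = (attrs[attr] - lo) / (hi - lo)
            else:
                norm = (hi - attrs[attr]) / (hi - lo)
            scores[name] += w * norm
    return scores

# Hypothetical candidates and preferences of one terminal.
candidates = {
    "LTE":       {"rate_mbps": 50,  "delay_ms": 30,  "cost": 3},
    "WiFi":      {"rate_mbps": 100, "delay_ms": 20,  "cost": 1},
    "satellite": {"rate_mbps": 20,  "delay_ms": 300, "cost": 5},
}
weights = {"rate_mbps": 0.5, "delay_ms": 0.3, "cost": 0.2}
benefit = {"rate_mbps": True, "delay_ms": False, "cost": False}

scores = simple_additive_weighting(candidates, weights, benefit)
best = max(scores, key=scores.get)
print(best, scores)
```

Changing the weights per service type (for example, a larger delay weight for real-time traffic) is one way the service type can act as an access parameter.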

Raschellà et al. propose an algorithm based on load and priority. Services are classified as real-time or non-real-time, a utility function is defined for the state of each network, an adaptation factor is calculated from the utility together with the network's load, and the final result for the user is derived. The network with the largest adaptation factor is chosen for access. This makes efficient use of network resources, since it both balances network load and lowers the access blocking rate. Chen J et al. examine the influence of user preferences on network access. They present a multi-criteria decision technique on the client side that takes traffic cost into account and splits network traffic into three categories: inelastic, streaming, and elastic. Different throughput requirements are recommended for the different traffic types, and end users can pick a suitable network to connect to. Udhayakumar S et al. use the user's speed and channel occupancy time to predict the user's dwell time in the WLAN. They likewise distinguish real-time from non-real-time services and consider preferences for non-real-time services. The load on the cell and the WLAN is assessed from WLAN and real-time traffic. Furthermore, when high-priority users need access, low-priority users are moved to other networks to free up resources for them. Under a software-defined networking (SDN) management architecture, changes in network state can be monitored more effectively during terminal access. Wu X et al. present a network access policy for an SDN-based framework that uses an adaptation factor to evaluate the QoS needs of the downlink stream. The framework relies on the flexibility of SDN to centralize the monitoring and assessment of network capacity and thereby steer the terminal to a suitable network.

Fuzzy logic can be used to solve access selection in heterogeneous wireless networks, and its computational complexity is lower than that of solving hard problems such as resource allocation directly. Wu X et al. use fuzzy logic to rate the priority of users requesting WLAN access, which decides who is connected to the WLAN. Fuzzy logic, however, examines only fixed indicators and cannot adapt to complicated and dynamic user-defined rules.
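A fuzzy priority rating of the kind described above might look like the following minimal sketch. The two inputs (normalized signal quality and WLAN load), the membership breakpoints, and the four rules are hypothetical and are not the rule base of the cited work. The inference is a zero-order Sugeno scheme with weighted-average defuzzification.

```python
def ramp_up(x, lo, hi):
    """Membership that rises linearly from 0 at lo to 1 at hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def fuzzy_priority(signal, load):
    """Rate a user's WLAN access priority in [0, 1] from two normalized inputs."""
    strong = ramp_up(signal, 0.3, 0.7)
    weak = 1.0 - strong
    busy = ramp_up(load, 0.4, 0.8)
    idle = 1.0 - busy
    # (rule firing strength, crisp output); min() plays the role of fuzzy AND.
    rules = [
        (min(strong, idle), 1.0),  # strong signal, idle WLAN -> high priority
        (min(strong, busy), 0.5),
        (min(weak, idle), 0.5),
        (min(weak, busy), 0.0),    # weak signal, busy WLAN -> low priority
    ]
    total = sum(w for w, _ in rules)  # never zero: memberships are complementary
    return sum(w * out for w, out in rules) / total

print(fuzzy_priority(0.9, 0.1), fuzzy_priority(0.5, 0.6), fuzzy_priority(0.1, 0.9))
```

Because the indicators and rules are fixed in code, this sketch also shows the limitation noted above: adapting to new user-defined rules requires rewriting the rule base.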

Ontological reasoning approaches have the benefit of allowing the network's access rules to be modified. Q. Zhou and colleagues propose a semantic access point resource allocation service for heterogeneous wireless networks, built on a knowledge-based autonomous network management system. The knowledge base can automatically recommend the access point that delivers the highest quality of service. With sufficient flexibility, this ontology-based knowledge system can also automatically modify the access point selection policy according to user-defined requirements.

Ontology and fuzzy reasoning can also be combined for heterogeneous network access. Al-Saadi A et al. apply semantic reasoning to cross-layer QoS parameters of the heterogeneous network architecture to manage and optimize network performance. The fuzzy reasoner infers the next action in the heterogeneous wireless network using the knowledge base's rule set and selects an action that the architecture can execute. The decision is sent to the layer of the Internet protocol stack that is responsible for carrying out the operation. The results demonstrate that under heavy load the cognitive network architecture based on ontology and fuzzy reasoning can substantially enhance throughput and packet delivery rate (PDR). In a heterogeneous wireless network, the terminal access problem can naturally be framed as a classification problem in which each candidate network represents a class. The aim is to assign an unknown object to a known class based on prior knowledge or statistical information, and a network user is treated as an object described by a set of features corresponding to the decision factors. Machine learning is a field of research that focuses on learning systems and is often used to acquire this knowledge from a set of input items (also known as training data). Much work has used machine learning to explore access strategies for heterogeneous wireless networks, since machine learning algorithms are powerful tools for classification problems. Naive Bayes, decision trees, and neural networks, for example, can classify the type of service during network access. Lee D K et al. use a decision tree method to split user traffic into four types: high-definition video streams, Internet telephony, audio streams, and files. Before the terminal selects an AP, the user's request is first prioritized by service type and forwarded through the AP to the controller, which then proposes an access scheme that considers both the QoS needs of the service and the state of the AP's backhaul connection. This achieves acceptable load balancing among APs and provides the terminal with the QoS it requires.

Users must take numerous factors into account during network access, such as QoS, RSS, and user preferences, and how to weigh these factors to give users better access is a critical concern. Artificial neural networks can produce correct outputs for inputs not seen during training, which makes neural network algorithms a viable solution for such problems.

In recent years, reinforcement learning has been increasingly used in network access algorithms, and it has been coupled with supervised or unsupervised learning algorithms to improve algorithm intelligence. Reinforcement learning has the character of online learning. If the network deployment changes, the previously learned policy may no longer be the best option, and the cumulative reward may decrease. Online learning detects this deviation, and the optimal policy is adjusted through another round of training. Reinforcement learning is well suited to decision-making problems because the agent acts autonomously, and its continual updates allow it to adapt to changes in the network environment. For these reasons, its importance in network access management is growing.

4.2 Heterogeneous Wireless Network Access Algorithm Based on Reinforcement Learning

The space-earth integrated network can achieve worldwide coverage thanks to its satellite segment. Multiple networks (such as satellite, terrestrial LTE, and WiFi) may have overlapping coverage in areas with many users (such as cities), and the way users access these networks has a significant influence on network performance and user experience. Meanwhile, spectrum resources, access methods, and protocols differ among space-based, air-based, and ground-based networks. As a result, user access selection, that is, optimizing the network each user accesses in order to increase network performance, has become a key priority in research on the space-earth integrated network. Unlike conventional network handoff, where the aim is to ensure service continuity, the goal of radio access technology selection is to maximize network performance in real time.

Accordingly, the implementation changes from a passive handover triggered by a change of location to an active selection approach in which the user's access network is determined for each time period. This type of network access selection is also known as the user association problem. In the space-earth integrated network, conventional optimization-based access selection strategies face two problems. First, most user association problems are formulated as integer or mixed-integer combinatorial optimization problems, which are not only non-convex but have also been shown to be NP-hard. Solving them requires a large amount of computation and a long calculation time, so the optimization approach cannot handle the large scale and high complexity of the space-earth integrated network. Second, the optimization-based approach relies on prior knowledge and modeling assumptions (such as a network topology model, user distribution model, user mobility model, statistical channel model, and service arrival model), which are suitable only for modeling network behavior at coarse granularity or for special network scenarios. At present, none of these models meets the demands of the space-earth integrated network, which reduces the effectiveness of the optimization results.
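The scale problem can be made concrete with a brute-force sketch of a toy user association problem. The equal-sharing rate model, the capacity limits, and the numbers are illustrative assumptions. The point is only that exhaustive search evaluates K^N assignments for N users and K networks.

```python
from itertools import product

def best_association(rates, capacity):
    """Exhaustively search user-to-network assignments.

    rates[u][n] is the rate user u would get alone on network n;
    capacity[n] is the maximum number of users network n accepts.
    Assumption: users on the same network share its rate equally.
    """
    n_users, n_nets = len(rates), len(capacity)
    best_value, best_assign, evaluated = float("-inf"), None, 0
    for assign in product(range(n_nets), repeat=n_users):
        evaluated += 1
        counts = [assign.count(n) for n in range(n_nets)]
        if any(c > cap for c, cap in zip(counts, capacity)):
            continue  # violates a capacity constraint
        value = sum(rates[u][n] / counts[n] for u, n in enumerate(assign))
        if value > best_value:
            best_value, best_assign = value, assign
    return best_assign, best_value, evaluated

# Hypothetical instance: 3 users, 2 networks, at most 2 users per network.
rates = [[10, 4], [9, 6], [8, 5]]
assign, value, evaluated = best_association(rates, capacity=[2, 2])
print(assign, value, evaluated)
```

Three users and two networks need only 2^3 = 8 evaluations, but 20 users and 3 networks would already need 3^20 (about 3.5 billion), which is why exhaustive or exact methods do not scale to the integrated network.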

Unlike optimization-based methods, the reinforcement learning (RL) approach learns a new network environment through observation and trial and error, without requiring a prior model. Furthermore, after a period of operation, the RL neural network can in theory fit a very complicated network environment and output optimization results very quickly, allowing real-time network access selection with low computational complexity.

The study of RL-based network access selection is still in its early stages. Network access selection can be divided into two categories according to where the RL agent is deployed. 1) An RL agent is installed on each user, and each user performs fully distributed access selection [35]. By collecting data and making decisions locally, this approach can respond quickly to changes in the user's current environment and reduces the signaling overhead of data collection. However, because a single user's observation ability is limited, joint optimization of multi-user access selection over a wide area is difficult to achieve. 2) RL agents are deployed on access nodes (base stations, UAVs) or edge controllers, so that many users share particular access resources [36]. This centralized deployment can readily optimize a large number of users jointly, but it is restricted by the wireless transmission and signaling overhead of collecting user data and distributing decisions, which makes real-time reaction to the user environment difficult.
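The distributed deployment in 1) can be sketched as a per-user agent that learns only from its own observed rewards. The sketch below is a stateless, epsilon-greedy value learner (a simplification of Q-learning), and it is not the algorithm of [35]. The network names, mean rates, and noise model are hypothetical.

```python
import random

class AccessAgent:
    """Per-user epsilon-greedy agent over candidate access networks."""

    def __init__(self, networks, epsilon=0.1, alpha=0.1, seed=0):
        self.networks = list(networks)
        self.epsilon = epsilon  # exploration probability
        self.alpha = alpha      # learning rate
        self.q = {n: 0.0 for n in self.networks}
        self.rng = random.Random(seed)

    def select(self):
        """Explore with probability epsilon, otherwise pick the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.networks)
        return max(self.networks, key=self.q.get)

    def update(self, network, reward):
        """Move the estimate for the chosen network toward the observed reward."""
        self.q[network] += self.alpha * (reward - self.q[network])

# Toy environment: noisy data rate per network (hypothetical values, Mbit/s).
mean_rate = {"LTE": 5.0, "WiFi": 8.0, "LEO": 3.0}
env_rng = random.Random(1)
agent = AccessAgent(mean_rate)
for _ in range(2000):
    net = agent.select()
    reward = mean_rate[net] + env_rng.gauss(0.0, 1.0)
    agent.update(net, reward)
best = max(agent.q, key=agent.q.get)
print(best, agent.q)
```

The agent needs no signaling beyond its own measurements, which matches the low-overhead property described above. It also cannot see other users' choices, which is the limitation that motivates the centralized deployment in 2).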

Multiple wireless access technologies are widely used to serve the diverse services of vehicle users in the space-earth integrated network, so access control is one of the most important issues to address in order to enhance network performance. For this purpose, a simulation platform can be used to simulate several access techniques. With global information gathered by the cloud controller, the access strategies available to vehicle users are analyzed and the access scheme with the highest network data rate is selected. In this study, the simulation platform is used for preliminary training of the deep reinforcement learning (DRL) model, which addresses the difficulty of obtaining training samples and the low training efficiency in the space-earth integrated network. Figure 2 depicts the training and use of the DRL model with the help of the simulation platform.

Fig. 2. Reinforcement learning based intelligent access of Space-Air-Ground Integrated Network

At the same time, because the network's statistical characteristics may vary, environment data are also collected in the real network, and the model is updated to track the dynamic environment of the space-earth integrated network.

Vehicle users can be served through several access modes, namely the network's terrestrial base stations, UAVs, and LEO satellites. RL continually learns from the external environment (such as user channel conditions and other complicated factors) in order to identify the best strategy, and it adapts effectively to the complex network environment and multi-user scenarios of the space-earth integrated network. The actor-critic RL algorithm is used to find the best access strategy. The parameters of two neural networks, the actor and the critic, are defined and initialized. The actor network determines each user's access mode based on current network information, whereas the critic network evaluates the chosen access mode in order to guide the actor network toward better decisions in the future. At each learning step, the current position of each vehicle user is fed into the actor network, and each vehicle user's access mode is decided according to the probability distribution output by the actor network. Then, based on each vehicle user's access choice and position, the reward at that step, namely the vehicles' average data rate, is calculated. At the same time, the critic network computes the error between the actual reward and its estimated reward at the current step in order to improve the accuracy with which it evaluates the actor network's actions. With a large amount of input data and iterative learning, the performance of the RL-based strategy approaches the optimum. Figure 5 depicts the experimental results of network access selection. Although the complexity of the network makes it difficult for a neural network based learning algorithm to converge to the global optimum, forward propagation through a neural network requires little computation, runs quickly, and responds fast, which makes it well suited to the wide range of real-time services in the space-earth integrated network.
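The actor-critic loop described above can be sketched in a few lines. To stay self-contained, the sketch replaces the two neural networks with lookup tables, discretizes vehicle position into three hypothetical zones, and treats each step as a one-shot decision, so that the critic's error is simply the observed reward minus its estimate. The rate table and learning rates are invented for illustration and are not the paper's simulation setup.

```python
import math
import random

ACTIONS = ["base_station", "uav", "leo_satellite"]
# Hypothetical mean data rates (Mbit/s) per position zone and access mode.
MEAN_RATE = {
    "urban":    {"base_station": 10.0, "uav": 6.0, "leo_satellite": 2.0},
    "suburban": {"base_station": 4.0,  "uav": 8.0, "leo_satellite": 3.0},
    "remote":   {"base_station": 0.5,  "uav": 3.0, "leo_satellite": 5.0},
}

def softmax(prefs):
    """Turn actor preferences into a probability distribution over actions."""
    m = max(prefs.values())
    exps = {a: math.exp(p - m) for a, p in prefs.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}

def train(steps=6000, actor_lr=0.05, critic_lr=0.1, seed=0):
    rng = random.Random(seed)
    zones = list(MEAN_RATE)
    theta = {z: {a: 0.0 for a in ACTIONS} for z in zones}  # actor parameters
    value = {z: 0.0 for z in zones}                        # critic estimates
    for _ in range(steps):
        zone = rng.choice(zones)                 # observed vehicle position
        probs = softmax(theta[zone])
        action = rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]
        reward = MEAN_RATE[zone][action] + rng.gauss(0.0, 0.5)
        td_error = reward - value[zone]          # critic's estimation error
        value[zone] += critic_lr * td_error      # critic update
        for a in ACTIONS:                        # actor (policy-gradient) update
            indicator = 1.0 if a == action else 0.0
            theta[zone][a] += actor_lr * td_error * (indicator - probs[a])
    return theta, value

theta, value = train()
policy = {z: max(theta[z], key=theta[z].get) for z in theta}
print(policy)
```

After training, selecting an access mode is a single table lookup and softmax here, or a single forward pass in the neural-network version, which is the source of the fast response noted above.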

5 Conclusion

This article discusses intelligent access technology in the Space-Air-Ground Integrated Network. In light of current research and open issues in intelligent access for the space-earth integrated network, it analyzes the use of reinforcement learning in such a network and proposes an intelligent access scheme for network optimization.