Abstract
Demands from cellular users are increasing day by day. If the traffic through a tower at a specific time is predictable, a suitable tower can be selected and a chosen set of issues can be addressed. In the future, such traffic prediction will help to identify networks that offer better conditions and the desired performance. Specifically, this paper proposes a Long Short-Term Memory (LSTM) model to predict the network traffic at a given 4G LTE cell tower. The LSTM model uses one full year of traffic data records from a network to predict its traffic pattern in a particular week. In addition, the 4G LTE access networks are operated via Blockchain. Each LTE tower acts as a separate node, as in a Peer-to-Peer network. The processes and conditions for accessing mobile networks are coded as smart contracts. The Blockchain - Radio Access Network (B-RAN) establishes a secure connection between the Access Point (AP) and the User Equipment (UE) using smart contracts. If all the conditions of the contract code are satisfied, data transmission takes place between the user and the AP. The proposed LSTM model for predicting LTE data traffic on blockchain-enabled network APs is validated. It is found that the approach makes the network encrypted and secure and improves its functioning. The proposed multilayer LSTM improves performance by 8.2% to 17.7% compared with the Autoregressive Integrated Moving Average (ARIMA) model, which serves as the baseline prediction model.
1 Introduction
In mobile communications, network traffic prediction offers significant advantages in numerous areas, for example predictive congestion control, network security and planning, and effective bandwidth allocation. Long-horizon traffic prediction yields detailed traffic models for assessing future demand, which permits careful planning and better decisions [1]. The need is driven by smartphone traffic: it was predicted that the traffic created by smartphones would exceed 86% of all mobile data traffic [2]. Many researchers have therefore made considerable efforts to predict cellular traffic in the recent past. Some proposed methods to predict cellular traffic based on the hours of mobile use [3, 4]. These methods failed because frequent GPS sampling shortens battery life. Cellular traffic could also be estimated by comparing the cellular trace with the reference trace of the road [5], but this method failed to measure the strength of cellular traffic correctly. Traffic could also be estimated by monitoring the handover mechanism: the frequent handovers of a large number of users were monitored and a particular base station was identified to measure cellular traffic [6,7,8]. However, these methods had a stringent limitation, since the base station had to be searched for every time, which overloaded the system.
Predicting periods of low traffic and then putting the access points (APs) into sleep mode to save power was investigated in [9]. Analyzing the requirements of cellular traffic prediction would therefore improve resource utilization efficiency [10]. On the other hand, the diversity of applications makes traffic analysis a complex problem. State-of-the-art prediction schemes can be categorized into statistical learning approaches [11, 12] and machine learning-based approaches [13,14,15,16]. The former category was modeled and analyzed using statistical time series techniques, which included the Auto-Regressive Integrated Moving Average (ARIMA) [17, 18], the α-stable distribution [12] and the covariance function [19]. The latter category utilized Artificial Intelligence (AI) based cellular traffic prediction methods [20, 21]. These were strong alternatives to the statistical methods.
AI is often referred to as machine intelligence. It means a programmed system that emulates human intelligence and human decision making. The field of machine learning has developed considerably in recent years, owing to the increase in computer processing power and to developments in algorithms. Machine learning with Deep Neural Networks (DNNs) could predict high-level dependencies in a health care dataset [22]. Deep learning is a part of machine learning that teaches a computer to filter inputs (video, audio, or text) through layers in order to learn how to predict and classify data. Deep learning has played a vital role in classification and prediction in many fields such as health care, smart cities, agriculture, automated vehicles and wireless communication. Recurrent Neural Networks (RNNs) are deep learning networks that are called recurrent because they perform the same task for each element of a sequence, with the output depending on the previous computations. Hence the RNN was, until recently, referred to as a memory-based architecture that finds and retains data. The main drawbacks of the RNN are the vanishing gradient and exploding gradient problems. LSTM is an enhanced RNN architecture with improved memory for long sequences of data. LSTM-based RNNs have been demonstrated successfully on language translation and speech recognition. LSTM delivers better results than other AI techniques for time-series traffic prediction analysis [20]. Experiments on several real data sets concluded that LSTM could be used as a good time-sequence forecasting model and that it offered better performance than the other traditional models [21].
The work in [23] examined the relationship between the performance and the power consumption of three distinct classes of predictors using a number of real network traces. Its authors concluded that network traffic is commonly predictable and that the choice of predictor depends on the characteristics of the system. Jaffry [24] proposed cellular data traffic prediction using an RNN, specifically the LSTM network. Using LSTM and vanilla feed-forward neural networks, that work verified that these networks predicted traffic with greater accuracy than the statistical ARIMA model.
Using a Blockchain-based mobile system with an LTE network gave fast traffic prediction and managed local connectivity to provide trustworthy service to the client. A blockchain is a collection of information on numerous transactions [25]. After each transaction, a hash is created. A transaction is written into a block only if most of the nodes verify it. The timestamp and the link to the previous block are the major elements of a block in the digital record. Units of value pass from one owner to another through the network nodes, and only that trail remains in the records. Receipt is confirmed through the links to each past transaction. Verification is done by consensus, and trust is created by the transparent and auditable flow of data. Blockchain was the backend of decentralized digital currencies. Bitcoin is one such blockchain, which provides transparent, provable and secure digital asset transactions with ownership and proof of rights. Blockchain is one of the groundbreaking innovations that could radically change the way parties interact, automate payments and secure transactions. Blockchain can be cost-effective because it eliminates the centralized authority that verifies users' transactions.
AI permits a machine to keep acquiring knowledge based on the data it collects. Blockchain is a decentralized ledger that can be used to store and maintain the transactions of untrusted parties. Blockchain gives mobile carriers great opportunities to transform their business models through new network layers. Tier 1 mobile carriers were likely to form new blockchain consortia with universal digital ledgers to improve industry-wide interoperability. This would control costs and give wide coverage. A few blockchain projects rely on ideas that are closely aligned with the objectives of specific 5G models. They were moving from distributed systems and databases to high-performance distributed computing systems [26]. Mobile carriers were well-positioned to provide a 'blockchain-as-a-service' ecosystem for content providers. Content providers could use the blockchain infrastructure for access control and monetization. Copyrights and the autonomous digital sale of content would be monitored by smart contracts. Smart contracts removed the danger of illegal copying and redistribution of content. Additionally, blockchains accommodated variable, demand-based content pricing through automated discovery. Data mining (DM) was generally used for the discovery of knowledge, and network traffic analysis involved DM. The various data mining methodologies used by different practitioners for analyzing network traffic were explained in [27].
1.1 Limitations of the existing system
The major drawback of the existing cellular network is roaming fraud. It happens when a user connects to a visited network service provider and uses its services but later does not pay for them. When this happens, the home network service provider has to pay for the services used by the subscriber in the visited network. This type of fraud goes unnoticed because it takes a very long time to detect. Additionally, by the time the home network is notified of the fraud, it is already too late to respond or to take firm action. To tackle these drawbacks of the existing system, a blockchain is deployed on the network access points, and all network service providers hold their roaming agreements as smart contracts, which builds trust and is very secure.
When installing new APs in a society or community, the host needs to be aware of the traffic patterns and how they may vary in the future. This awareness saves a lot of network resources and assures the customer of a quality service. The main consequence of lacking it is network congestion, which occurs when the host sends huge amounts of data in packets that the network cannot carry effectively. This leads to poor network performance and increased delays. Traffic prediction on mobile networks is therefore performed in order to maintain a steady flow of data. It helps to understand the data traffic patterns, which is useful for congestion control. It also benefits other applications such as network admission control, bandwidth allocation and anomaly detection.
1.2 Research contributions
Based on the above literature it is observed that there is a lot of demand for suitable cellular traffic prediction techniques along with reducing roaming fraud. The major contributions of this proposed research are as follows:
- Configure the existing 4G LTE cellular network as a blockchain-based mobile network, by running a blockchain network on the 4G LTE access points (eNodeBs) connected in a fully decentralized manner.
- Analyze 4G LTE cellular traffic and propose a suitable multilayer LSTM technique for optimal traffic prediction.
- Propose a Blockchain - Radio Access Network (B-RAN) for greater security and validate the feasibility of implementing it in 4G LTE.
- Compare parameters such as the R² score, mean squared error (MSE) and mean absolute error (MAE) of the LSTM-based traffic prediction method with those of the existing ARIMA baseline model.
- Analyze the security issues and deployment feasibility of the Blockchain-based network on existing 4G LTE networks.
2 Materials and methods
The mathematical modeling of the cellular traffic, the B-RAN and the multilayered LSTM is presented in detail in this section.
2.1 Traffic prediction
The problem formulation is derived mathematically in this section. Let A be a UE (UE-1) that runs on a particular cellular network. Depending on the UE behavior, for a defined time interval (t, t + λ) of span λ, each link can be in either an active mode or a silent mode. One can then describe features such as the number of packets and the downlink and uplink packet sizes of the combined traffic of a particular UE in any cellular network within the interval (t, t + λ). Let f(t) be the vector of cellular traffic features defined over the interval (t, t + λ). Further, let Fn(t) be a matrix containing the n most recent feature vectors of cellular traffic, for n ≥ 0. This can be illustrated with an example as follows:
Let F2(t) = [f(t-1), f(t)] and let x be a binary indicator vector. The submatrix Fnx(t) of the matrix Fn(t) is obtained by removing all rows whose indicator in x is zero.
Let
Based on the above example, cellular traffic prediction problem can be defined as
where m is the one step ahead future prediction, i.e. n > 0, G(t) denotes the predicted matrix of the traffic for a given time t, and J(.) is the error function which is to be minimized.
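The feature-matrix notation can be made concrete with a small sketch. It assumes NumPy, treats each f(t) as one row of a hypothetical per-interval feature table (the three features and their values are invented for illustration), and stacks the most recent vectors as columns so that the indicator vector x selects feature rows, as described for F2(t) above.

```python
import numpy as np

def feature_matrix(features, t, n):
    """Stack the n most recent feature vectors f(t-n+1), ..., f(t) as columns."""
    if n < 1 or t - n + 1 < 0:
        raise ValueError("not enough history for the requested window")
    return np.column_stack([features[k] for k in range(t - n + 1, t + 1)])

def masked_rows(F, x):
    """Keep only the rows of F whose binary indicator in x is non-zero."""
    return F[np.asarray(x, dtype=bool), :]

# Hypothetical features per interval: [packet count, uplink bytes, downlink bytes]
features = np.array([
    [10, 1200, 5400],   # f(0)
    [12, 1500, 6100],   # f(1)
    [9, 900, 4800],     # f(2)
])

F2 = feature_matrix(features, t=2, n=2)   # columns are f(1) and f(2)
Fx = masked_rows(F2, x=[0, 0, 1])         # keep the downlink-bytes row only
print(F2)
print(Fx)
```

Here the indicator x = [0, 0, 1] keeps a single feature row, which corresponds to predicting only one traffic feature.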
2.2 Blockchain
Blockchains can change the digital economy and enable organizations to meet their objectives for open and encrypted transactions in mobile networks. In a blockchain, every node keeps an identical copy of the chain. Chain consistency is achieved through a specific consensus algorithm. Blockchain-based network management, characterized by a completely decentralized control mechanism, enables connections to be established directly between users' networks at the P2P level. This is done without relying on intermediaries, which leads to lower communication costs and better security [28]. The increase in mobile traffic demands more efficient and effective management in fifth generation communication networks [29]. Blockchains sit on top of the network layer and help to integrate various networks by providing seamless access between them. This is where smart contracts are used. They are one of the most important means of automating the negotiation of agreements between network operators and subscribers, and they can considerably reduce operational expenses. The authors of [30] designed a two-layer blockchain architecture for efficient communication, which also updated the blocks in a decentralized traffic data collection system. In addition, they proposed a privacy-preserving scheme to protect users' identities and driving routes. The work in [31] shows how a permissioned blockchain framework such as Hyperledger Fabric can be re-architected to support almost 20,000 transactions per second. The authors implemented a series of independent optimizations focusing on I/O, caching, parallelism and efficient data access. The two applications, smart contracts and B-RAN, are described below:
A blockchain is a chain of interconnected data blocks that forms a public ledger for recording a list of transactions [32]. Transactions on the blockchain, which are also called smart contracts, are carried out in two stages. First, the smart contracts that carry the digital actions are created and broadcast to the network. Second, the network nodes that maintain consensus, known as miners, validate the transactions by checking the digital signature and confirming its authenticity. The miners then pack the valid digital actions into a new block, which is appended to the end of the blockchain through a puzzle-solving procedure called mining.
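As a rough illustration of how blocks are linked, the sketch below chains blocks by storing the SHA-256 hash of the previous block together with a timestamp. It is a toy model using only the Python standard library: the field names and the sample transactions are hypothetical, and mining, signatures and consensus are deliberately left out.

```python
import hashlib
import json
import time

def block_hash(block):
    """SHA-256 digest over the block's canonical JSON form."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_block, transactions, timestamp=None):
    """Create a block that points at prev_block through its hash."""
    return {
        "index": 0 if prev_block is None else prev_block["index"] + 1,
        "timestamp": time.time() if timestamp is None else timestamp,
        "transactions": transactions,
        "prev_hash": "0" * 64 if prev_block is None else block_hash(prev_block),
    }

def chain_is_valid(chain):
    """Every block must store the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

genesis = make_block(None, [], timestamp=0.0)
b1 = make_block(genesis, [{"from": "UE-1", "to": "AP-1", "amount": 5}], timestamp=1.0)
b2 = make_block(b1, [{"from": "UE-2", "to": "AP-1", "amount": 3}], timestamp=2.0)
chain = [genesis, b1, b2]

valid_before = chain_is_valid(chain)
b1["transactions"][0]["amount"] = 500     # tamper with an already recorded transaction
valid_after = chain_is_valid(chain)
print(valid_before, valid_after)
```

Changing a recorded transaction alters the hash of its block, so the link stored in the following block no longer matches and the tampering is detectable.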
2.2.1 Advantages
Greater transparency
Because the blockchain is a distributed system, all members of the network share the same record of transactions rather than keeping individual, redundant copies. This shared record can be updated only by consensus, meaning that all the members agree to the change.
Improved security
All blockchain transactions are first approved and recorded. They are then encrypted and linked to the previous blocks. Hence, the information is stored across many nodes instead of on a centralized server. This ensures the security of the transaction information.
Improved traceability
Consider a company whose items are exchanged through an intricate supply chain. Tracing an item back to its origin then depends on staff who are familiar with the tracking procedure. If the same traded items are logged on a blockchain, it is simple to identify the origin of an item and its transaction at each stage.
Speed and efficiency
When transactions are implemented with blockchains instead of traditional processes, they can be completed very quickly and efficiently. Record-keeping is done in a single digital ledger that is shared among the participants, so there is no need to reconcile multiple ledgers.
Reduced costs
For any business, reducing expenditure is essential. Using the blockchain for transactions keeps the associated costs very small, because intermediaries are not needed for any trading transaction.
2.3 Blockchain - radio access network (B-RAN)
The B-RAN is proposed to coordinate cross-network resources, including spectrum and infrastructure, to meet the growth of diverse connectivity requirements [33, 34]. The structure of B-RAN is designed on the principle of the blockchain network and is shown in Fig. 1.
The participants in B-RAN act as both access clients and access providers. They can self-organize into a large network by removing the intermediaries and the security risks that accompany them [35]. The B-RAN can organize a large cooperative network and protect the participants' interests. The agreements between the UEs and the APs are implemented by smart contracts. Such contracts thereby establish trust between the initially trustless access point and user equipment.
Mafakheri et al. [36] integrated the blockchain into mobile networks and implemented a Blockchain-based decentralized method for data transfer. In addition, they showed that trusted communications between peer-to-peer nodes increase system throughput and cut costs. A Decentralized Application (DApp) is a peer-to-peer application that is executed on a specific Blockchain and written for a particular use case [37]. A smart contract is an autonomous agreement in which the terms between the seller and the buyer are pre-agreed and written into the code. It is a collection of functions and data that resides at a certain address on the Blockchain. The existing data may be queried or modified by making a transaction to the smart contract. The called functions are executed automatically on every device in the network that holds the data, and these executions trigger transactions.
2.3.1 Working framework for B-RAN
If operator 1 wishes to transmit information to operator 2, this transaction is treated as a block. The transaction block is broadcast to every node in the B-RAN. The remaining nodes execute a consensus protocol to confirm that the transaction within the block is valid. The transaction is validated using the smart contract code. Every node has to cross-check the results of a transaction that successfully invokes the smart contract. If any node fails to invoke the smart contract, the transaction is terminated and produces no result.
The LTE network has two important elements: the RAN and the Evolved Packet Core (EPC). The RAN consists of a large number of small cells and macro eNodeBs interconnected through the standard X2 interface, and the terminals attach to these. All these elements are interconnected using the standardized LTE interfaces. Mobile network operators form a consortium by sharing their eNodeBs and constructing the blockchain network. An eNodeB is the element of the mobile network that communicates directly with the remote user equipment, like a base transceiver station in the Global System for Mobile Communications (GSM). Traditionally, a Node B has minimal functionality and is controlled by a Radio Network Controller (RNC). With an eNodeB, however, there is no separate controller element, which simplifies the architecture and permits lower response times. eNodeBs can validate smart contracts, since the smart contracts send information using a USIM issued by the mobile network operators.
Figure 2 shows a working framework for a better understanding of B-RAN. The user equipment and the host access point reach an agreement on the terms of the contract. A smart contract authorized by the client's digital signature records these terms. The smart contracts are examined by the mining network: the miners verify them to make sure that the credit balance of the user equipment is enough to pay the host access point and that the host access point has sufficient spectrum assets. Once the conditions of the contracts are verified, a new block is created from all the verified contracts. The new block is then added to the existing Blockchain. The user equipment can then access the spectrum assets under a time-limit condition, and in return the host access point receives the agreed payment from the user equipment. In this way the Blockchain in B-RAN can organize large cooperative networks well.
Introducing blockchain into mobile networks can reduce unnecessary overhead costs. For example, when roaming services are provided, there is strong trust between the user and the service provider because the blockchain is secure and transparent. One of the main advantages is that a blockchain network eliminates the brokers involved, which also improves security. Blockchain thus enables roaming services across different networks and network service providers. When B-RAN acts as a virtual public network, it is secure and in high market demand. Competition among service providers can reduce data service rates without installing or implementing any additional network infrastructure.
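The contract conditions described for Fig. 2 can be sketched as plain Python checks. This is only an illustration of the verification logic: the class names, fields, prices and units are hypothetical, and a boolean flag stands in for the verification of a real digital signature.

```python
from dataclasses import dataclass

@dataclass
class UserEquipment:
    ue_id: str
    credit: float

@dataclass
class AccessPoint:
    ap_id: str
    free_spectrum_units: int
    earned: float = 0.0

@dataclass
class AccessContract:
    ue_id: str
    ap_id: str
    spectrum_units: int
    price: float
    duration_s: int        # time limit on the granted access
    signed_by_ue: bool     # stand-in for a verified digital signature

def verify_contract(contract, ue, ap):
    """Checks a miner would run before the contract goes into a block."""
    return (
        contract.signed_by_ue
        and contract.ue_id == ue.ue_id
        and contract.ap_id == ap.ap_id
        and ue.credit >= contract.price
        and ap.free_spectrum_units >= contract.spectrum_units
    )

def settle(contract, ue, ap):
    """Apply the contract only if every condition holds."""
    if not verify_contract(contract, ue, ap):
        return False
    ue.credit -= contract.price
    ap.earned += contract.price
    ap.free_spectrum_units -= contract.spectrum_units
    return True

ue = UserEquipment("UE-1", credit=10.0)
ap = AccessPoint("AP-1", free_spectrum_units=4)
contract = AccessContract("UE-1", "AP-1", spectrum_units=2, price=6.0,
                          duration_s=600, signed_by_ue=True)

first = settle(contract, ue, ap)    # credit and spectrum are sufficient
second = settle(contract, ue, ap)   # remaining credit (4.0) is below the price
print(first, second, ue.credit, ap.free_spectrum_units)
```

The second attempt is rejected because the remaining credit no longer covers the price, which mirrors the balance check that the miners perform.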
2.3.2 Consensus protocol
Since the blockchain-based mobile network is a decentralized network, there is no central base station to verify and validate every transaction in the blockchain. A common agreement between peers is therefore required to achieve reliability in B-RAN. This common agreement is called a consensus protocol. Various consensus protocols have been reported for different applications. Of these, Proof of Elapsed Time (PoET) is the most suitable for B-RAN [38]. Hence, PoET is used as the consensus protocol in this research paper.
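The basic idea behind PoET, in which every node waits for a randomly assigned time and the node whose timer expires first proposes the next block, can be sketched as follows. This is a simplified simulation under stated assumptions: a seeded pseudo-random generator stands in for the trusted timer that a real PoET deployment relies on, and the node names and the exponential wait-time distribution are illustrative choices, not taken from the paper.

```python
import random

def poet_round(node_ids, rng, mean_wait=1.0):
    """Give each node a random wait time; the shortest wait wins the round."""
    waits = {node: rng.expovariate(1.0 / mean_wait) for node in node_ids}
    leader = min(waits, key=waits.get)
    return leader, waits

rng = random.Random(42)                      # seeded only to make the run repeatable
nodes = ["eNodeB-A", "eNodeB-B", "eNodeB-C"]

leader, waits = poet_round(nodes, rng)
print("first leader:", leader)

# Over many rounds each node should win a roughly equal share of the blocks.
wins = {node: 0 for node in nodes}
for _round in range(3000):
    winner, _round_waits = poet_round(nodes, rng)
    wins[winner] += 1
print(wins)
```

Because the wait times are drawn independently from the same distribution, leadership is spread evenly across nodes without the puzzle-solving effort of mining.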
2.3.3 Security analysis - roaming fraud prevention using B-RAN
Current framework: In any network, the Home Location Register (HLR) is responsible for maintaining the location of the nodes. When a new call is set up, the Visited Public Mobile Network (VPMN) queries the Home Public Mobile Network (HPMN) about the services to which the roaming UE is subscribed, by querying the HLR. The Call Detail Records (CDRs) are then forwarded to the billing systems at the respective premises. These systems are responsible for handling CDRs and generating the invoice. The VPMN sends the CDR information to the HPMN in a Transfer Account Procedure (TAP) file. Some operators may forward TAP files to a Data Clearing House (DCH). The DCH is then responsible for the transmission and conversion of the TAP files on behalf of the Communication Service Provider (CSP) that has hired it. Once the TAP files are received, the HPMN settles the costs incurred with the VPMN according to the price in the corresponding roaming agreement.
Current difficulties: Roaming fraud happens when a UE accesses the resources of the HPMN via the VPMN, yet the HPMN is unable to charge the subscriber for the service provided. Roaming fraud exploits two characteristics:
- Long detection time: The time required to detect roaming fraud is very long because of latencies in the exchange of information between the VPMN and the HPMN.
- Long response time: The time needed to respond to roaming fraud is very long compared with the time in which the fraud occurs in the HPMN. This may be due to poor control over the systems.
These issues can be solved by implementing a permissioned blockchain between each pair of operators that have a roaming agreement. Figure 3 shows the working of the roaming agreement as a smart contract in a blockchain network. B-RAN can establish a roaming agreement between the home network access point and the visited network access point, which builds trust between users and network providers. Designated nodes from the two operators act as miners to confirm the validity of every transaction broadcast in the network. The consensus protocol is deployed between the HPMN and the VPMN and is triggered when a CDR transaction is transmitted over the blockchain network. Each time a UE triggers an event in a visited network, the VPMN sends the CDR data to the HPMN. The transmitted CDR data activates the smart contract, and the terms of the agreement are executed. In this way the HPMN can automatically calculate the billing amount based on the services rendered and send this information back to the VPMN.
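The roaming agreement logic can be sketched as a small contract object that prices each incoming CDR immediately. This is a minimal illustration and not the paper's contract code: the operator names, service names and tariff values are hypothetical, and the shared ledger is modeled as an in-memory list rather than a blockchain.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class CDR:
    subscriber_id: str
    service: str      # for example "voice_min" or "data_mb"
    units: float

@dataclass
class RoamingContract:
    hpmn: str
    vpmn: str
    tariff: dict                                   # agreed price per unit, per service
    ledger: list = field(default_factory=list)     # stand-in for the shared chain

    def on_cdr(self, cdr):
        """Triggered when the VPMN submits a CDR; returns the computed charge."""
        if cdr.service not in self.tariff:
            raise ValueError(f"service not covered by the agreement: {cdr.service}")
        charge = round(self.tariff[cdr.service] * cdr.units, 2)
        self.ledger.append((cdr.subscriber_id, cdr.service, cdr.units, charge))
        return charge

    def amount_due(self, subscriber_id):
        """Total the HPMN owes the VPMN for one roaming subscriber."""
        return round(sum(e[3] for e in self.ledger if e[0] == subscriber_id), 2)

contract = RoamingContract(
    hpmn="HomeNet", vpmn="VisitNet",
    tariff={"voice_min": 0.10, "data_mb": 0.02},
)
charge_voice = contract.on_cdr(CDR("sub-001", "voice_min", 12))
charge_data = contract.on_cdr(CDR("sub-001", "data_mb", 250))
print(charge_voice, charge_data, contract.amount_due("sub-001"))
```

Since every CDR is priced as soon as it arrives, the home operator sees the accumulated charges without waiting for a batch file exchange, which is the detection delay that roaming fraud exploits.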
2.4 Long short-term memory
LSTM is a type of RNN architecture used for making predictions on time-series data. LSTM networks are capable of learning long-term dependencies. An LSTM unit contains the following four parts:
- Cell
- Input Gate
- Output Gate
- Forget Gate
These components are specifically designed to address the vanishing gradient problem, which was a major drawback of conventional RNNs. Figure 4 shows the internal cell state of an LSTM cell.
The cell state can be considered a long continuous line that runs through the entire chain of units with only small interactions with each unit. It is the line connecting Ct-1 and Ct in Fig. 4. First, the information that needs to be removed or forgotten from the cell must be identified. This is handled by a sigmoid layer called the forget gate layer, which can be expressed as:
\[ f_t = \sigma\left( w_f \cdot [h_{t-1}, x_t] + b_f \right) \qquad (3) \]
where xt, wf, ht − 1 and bf are the new input, the weight function, the output from the previous timestamp and the bias value respectively. The above equation gives an output between 0 and 1 for each value in the cell state. An output of 0 means discard the value completely, and an output of 1 means retain it. In the next step, the new data to be stored in the cell state is decided. A sigmoid layer, known as the input gate layer, decides which values are to be updated. A tanh layer generates a vector of new candidate values, which can be combined with the current state:
\[ i_t = \sigma\left( w_i \cdot [h_{t-1}, x_t] + b_i \right) \qquad (4) \]
\[ \tilde{c}_{t} = \tanh\left( w_c \cdot [h_{t-1}, x_t] + b_c \right) \qquad (5) \]
where it and \( \tilde{c}_{t} \) are the input gate layer and the vector of new candidate values respectively, and wi, bi, wc and bc are the corresponding weights and biases. The new cell state is computed by combining eqns. (4) and (5) with the forget gate output. Hence, Ct becomes,
\[ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{c}_{t} \qquad (6) \]
The last step is to produce the output. A sigmoid layer selects which parts of the cell state are output. The cell state is then passed through a tanh layer (which pushes the values to lie between -1 and 1) and multiplied by the output of the sigmoid gate, so that only the selected parts are output:
\[ o_t = \sigma\left( w_o \cdot [h_{t-1}, x_t] + b_o \right) \qquad (7) \]
\[ h_t = o_t \odot \tanh\left( C_t \right) \qquad (8) \]
Here, ot is the output of the sigmoid gate and ht is the output of the selected parts. This new output is fed into another layer as an input and the chain continues. The multilayered LSTM architecture is shown in Fig. 5.
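The gate computations described above can be traced in a few lines of NumPy. This is a sketch of the standard LSTM cell update rather than the authors' implementation: the weights are random, the hidden size of 4 and the three input values are arbitrary, and a real model learns the weights during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W and b hold the parameters of gates f, i, c and o."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate values
    c_t = f_t * c_prev + i_t * c_tilde         # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # new hidden state / output
    return h_t, c_t

rng = np.random.default_rng(0)
hidden, inputs = 4, 1                          # arbitrary sizes for illustration
W = {g: rng.normal(scale=0.1, size=(hidden, hidden + inputs)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}

h = np.zeros(hidden)
c = np.zeros(hidden)
for value in [0.1, 0.4, 0.3]:                  # three scaled traffic samples
    h, c = lstm_step(np.array([value]), h, c, W, b)
print(h)
```

In a multilayered arrangement, the sequence of h values produced by one layer would serve as the input sequence of the next layer.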
In Fig. 5, C0 is the initial cell state, to which an input X0 is fed with weights Wc and Wi to give the output H1 in the state C1. The output obtained at C1 is fed as the input Y0 to the succeeding layer, and the process continues. This arrangement is called a multi-layered LSTM.
3 Proposed methodology
The overall system structure of the proposed traffic prediction on blockchain-based mobile networks using the LSTM model is illustrated in Fig. 6. In the proposed approach, the blockchain-based mobile network is realized by installing a B-RAN network in order to prevent roaming fraud. Traffic prediction is then performed on these mobile networks using the proposed LSTM prediction architecture. This yields better bandwidth allocation and network congestion control.
Figure 6 depicts how the inclusion of blockchain in mobile networks helps to control roaming fraud, while traffic prediction improves bandwidth allocation and controls network congestion within the area. To prevent roaming fraud, a 'Roaming Pact' is formed between the HPMN and the VPMN via blockchain in the form of smart contract code. This roaming pact, which runs on a blockchain network, acts as a decentralized bridge between the HPMN and the VPMN. For example, when a subscriber is on a roaming network (i.e., the VPMN, which is the visited network), the VPMN provides services to the subscriber. It then sends a record of the services used to the HPMN through the blockchain network.
The HPMN pays the VPMN for the services used by the subscriber and later asks the subscriber to pay for these services on returning from roaming. In this way, fraudulent subscribers cannot escape without paying for the services they have used. Hence, transparency and security between the HPMN and the VPMN are achieved through the blockchain network B-RAN. The arrangement then predicts the network traffic patterns for these mobile networks, in order to deliver strong and seamless data services in the area for official or personal use by preventing congestion. This is done using the proposed LSTM network, which is shown to outperform the ARIMA baseline prediction model.
3.1 Proposed traffic prediction model
The given raw dataset must be transformed in such a way so that it can act as an input for the prediction model. Before feeding the training data to the prediction model, certain transformations and reshaping have been done on them. First, the training data samples are read, and then the column which contains the data traffic is acquired, and all the remaining columns are eliminated. Then, feature scaling is performed on this new data. MinMaxScaler is used to transform this data. MinMaxScaler is a python model imported from scikit package, it is used to transform the data traffic samples from training data between 0 to 1. The overview of the proposed prediction model is illustrated in Fig. 7.
In the next step, these scaled traffic values are then reshaped with 24 timesteps and 1 output. Here, the time-series data is transformed into input and output data. So, the observation at the previous time step will be used as an input to predict the observation of the current time step. In the proposed model, 24 inputs from the dataset are used to predict the next time step. All things considered, for each time step, that numerous rows must be expelled from the initial point of the dataset. This is because there are no earlier perceptions to use as time steps for the initial values in the dataset. The training data is now complete and is ready to be trained in the LSTM model. The major focus and type of ‘Keras’ and ‘Tensorflow’ which defined in a sequential stack of linear layers. The required model is selected and also the required number of layers in the constructor is defined. There are 4 layers in this model, and each layer has a dropout of 20%. As shown in Fig. 7, each step has a desired LSTM units and a dropout of 20% for each layer. Dropout can be applied to the layers of the training algorithm. A dropout on the input implies that for a given probability, the block excludes the input connection from nodes and weights. After defining the model with 4 layers of training, it is compiled by specifying the optimizer and loss function. Finally, the LSTM model is trained with 120 epochs and a batch size of 32. Epoch is nothing but how many times the model is used for training. The batch size is the number of training data that has appeared to a model before a weight update is performed. Once the model is ready, it is used to run a prediction on the test data.
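The scaling and windowing steps can be sketched with NumPy alone. This is a stand-in for the scikit-learn scaler mentioned above and not the authors' code: the synthetic sine-shaped series and the assumption of one sample per hour are purely illustrative, and the helper names are hypothetical.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1-D series to [0, 1]; also return the bounds for inversion."""
    lo, hi = float(values.min()), float(values.max())
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return (values - lo) / (hi - lo), (lo, hi)

def inverse_scale(scaled, bounds):
    """Map scaled values (for example predictions) back to traffic units."""
    lo, hi = bounds
    return scaled * (hi - lo) + lo

def make_windows(series, timesteps=24):
    """Turn a series into (samples, timesteps, 1) inputs and next-step targets."""
    X = np.array([series[i - timesteps:i] for i in range(timesteps, len(series))])
    y = series[timesteps:]
    return X.reshape(-1, timesteps, 1), y

# Synthetic stand-in for one traffic column (72 samples, assumed hourly).
traffic = 50.0 + 30.0 * np.sin(np.arange(72) * 2 * np.pi / 24)

scaled, bounds = min_max_scale(traffic)
X, y = make_windows(scaled, timesteps=24)
print(X.shape, y.shape)
```

The first 24 samples have no complete history and therefore yield no training pairs, which is why the number of samples is 24 fewer than the length of the series. The three-dimensional shape (samples, timesteps, features) is the input layout that Keras recurrent layers expect.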
The LSTM model is used to predict the traffic data of 3 cell towers. Traffic prediction helps in congestion control, and long-term prediction gives more detailed traffic models, so that upcoming requirements can be anticipated. On these LTE cell towers, Blockchain is implemented so that the LTE eNodeBs are interconnected through the blockchain network. Their functionality is maintained by smart contract code, and the AP and the UE reach an agreement based on that contract code. This architecture is called B-RAN. The predicted samples from the LSTM model are inversely transformed to the original scale so that they can be compared with the test dataset. A graph of the ground truth (i.e., test data) against the LSTM prediction is plotted to compare them. A further comparison is made between the proposed LSTM model and the ARIMA model.
3.2 Proposed algorithm for blockchain-based mobile network
The proposed blockchain simulation algorithm for n UEs using a priority queue is shown in Table 1. The algorithm assumes that n UEs run the standard B-RAN simultaneously for m logical iterations. It maintains a list B of n block sequences, each of which imitates the local blockchain copy of one UE. Every copy is initialized with the origin block 0 and is extended while the B-RAN is simulated for a random number of iterations m, during which transactions are added to the blocks.
The transactions and blocks of the proposed B-RAN are iterated through the following processes. First, the algorithm waits for a new block, which is produced based on the priority queue. Next, it receives the new notifications that arrive within a predefined waiting period. It then adds the new block to the blockchain of a randomly chosen UE. Finally, it broadcasts the updated blockchain of the chosen UE to the remaining UEs. To simulate communication latencies, the transactions/messages are held in the priority queue qp for future delivery. Each transaction/message is represented as (t’, k, B′), where t’ is the arrival time of the transaction, k is the receiving UE and B′ is the content, which informs k that an unspecified UE holds the sequence of blocks B′. The arrival time is advanced by σ() and λ(), which are samples drawn from a positive probability distribution (σ) and a non-negative probability distribution (λ), respectively.
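The four processes described above can be sketched as a small discrete-event simulation. This is not the algorithm of Table 1 but a minimal illustration under several assumptions: σ() is read as the block production interval and λ() as the broadcast delay, both are drawn from exponential distributions, and a UE adopts a received chain only if it is longer than its own. All names are hypothetical.

```python
import heapq
import random


def simulate_bran(n_ues=5, iterations=20, mean_sigma=1.0, mean_lambda=1.0, seed=0):
    """Simulate n_ues local blockchains for a number of block-creation steps."""
    rng = random.Random(seed)
    chains = [[0] for _ in range(n_ues)]  # every local copy starts at origin block 0
    queue = []                            # priority queue of (t', tie, k, B')
    now = 0.0
    tie = 0                               # tie-breaker so chains are never compared
    for block_id in range(1, iterations + 1):
        # 1) Wait for the next block: advance time by sigma() (assumed exponential).
        now += rng.expovariate(1.0 / mean_sigma)
        # 2) Deliver every queued message whose arrival time t' has passed.
        while queue and queue[0][0] <= now:
            _, _, k, received = heapq.heappop(queue)
            if len(received) > len(chains[k]):  # assumed longest-chain rule
                chains[k] = list(received)
        # 3) Append the new block to the chain of a randomly chosen UE.
        producer = rng.randrange(n_ues)
        chains[producer] = chains[producer] + [block_id]
        # 4) Broadcast the producer's chain; each copy arrives after lambda().
        for k in range(n_ues):
            if k != producer:
                delay = rng.expovariate(1.0 / mean_lambda)
                tie += 1
                heapq.heappush(queue, (now + delay, tie, k, list(chains[producer])))
    return chains


chains = simulate_bran()
```

Varying `mean_sigma` relative to `mean_lambda` changes how often the local copies diverge, which is the behaviour the speed classification is concerned with.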
The blockchain system is classified by speed according to the relation between the mean values of σ and λ. Let \( \overline{\sigma} \) and \( \overline{\lambda} \) be the mean values of σ and λ respectively; the blockchain system is then classified as:
- Slow Blockchain System if \( \overline{\sigma} \) ≫ \( \overline{\lambda} \)
- Chaotic Blockchain System if \( \overline{\sigma} \) ≪ \( \overline{\lambda} \)
- Fast Blockchain System if \( \overline{\sigma} \) ≈ \( \overline{\lambda} \)
In the slow blockchain system, the time between the creation of two consecutive blocks is much greater than the time needed to synchronize the local blockchains. In a chaotic blockchain system, synchronization of the local blockchains is not possible because the time between two consecutive blocks is too short. In a fast blockchain system, the time taken to create a block is almost equal to the time required for broadcasting. Therefore, the fast blockchain system is the most suitable for blockchain-based mobile networks.
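The classification can be expressed as a small helper. The text does not quantify "≫", "≪" or "≈", so the factor of 10 used here is purely an assumption for illustration, as is the function name.

```python
def classify_blockchain(mean_sigma, mean_lambda, factor=10.0):
    """Label a blockchain system from the means of sigma and lambda.

    `factor` is an assumed threshold standing in for "much greater than".
    """
    if mean_sigma <= 0 or mean_lambda < 0:
        raise ValueError("sigma must be positive and lambda non-negative")
    if mean_sigma >= factor * mean_lambda:
        return "slow"      # block creation dominates synchronization
    if mean_lambda >= factor * mean_sigma:
        return "chaotic"   # blocks appear faster than they can be synchronized
    return "fast"          # both times are of comparable magnitude
```

For example, `classify_blockchain(1.0, 1.1)` falls in the "fast" category under this assumed threshold.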
4 Results and discussions
This section presents the dataset utilized for training the LSTM network, the performance analysis of B-RAN and the performance of the proposed traffic prediction method. The performance has been compared with the ARIMA baseline traffic prediction model.
4.1 Dataset
In this research work, the data traffic information of 4G LTE networks, which is available as raw data on Kaggle (Footnote 1), has been used as the dataset for training and testing. When subscribers use the mobile data service on their devices, the data is provided by their nearby 4G cell. The traffic of a cell within 1 hour is the total volume of data consumed by all users served by that cell within the hour. For example, suppose a cell serves 50 subscribers and each subscriber uses an average of 10 MB in 1 hour. The product of the average data consumed per user (in Megabytes) and the number of users gives the traffic of that cell, so the traffic of this cell for the hour is x = 50 * 10 = 500 MB. This is the traffic value of one sample in the gathered dataset. Data is collected hourly, 24 hours a day, for approximately 1 year. The traffic varies from hour to hour (e.g., from 10:00 to 12:00 and from 19:00 to 23:00 the traffic is very high, whereas from 0:00 to 6:00 it is very low). It also differs between days of the week (e.g., for office buildings, traffic is high on Mondays from 14:00 to 18:00 and low on Saturdays and Sundays) and during special events such as festivals and holidays. The data consists of 8733 entries from 23-10-2017 to 22-10-2018 (one year). It is further divided into training and testing datasets.
The test dataset consists of the last week’s data (i.e., from 16-10-2018 to 22-10-2018), which is 168 samples. The training dataset consists of the remaining data traffic (from 23-10-2017 to 15-10-2018), which is 8565 samples. The traffic values of 3 LTE cell towers (Cell_000111, Cell_000112, and Cell_000113) are predicted to check the consistency of the proposed LSTM model. The RNN algorithm requires training and test datasets to predict data traffic. It uses the training dataset to train the algorithm based on the model and the number of layers required for training to get the best output.
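The chronological split can be checked with a short sketch. The sample counts come from the text; the placeholder series and the function name are hypothetical.

```python
TOTAL_SAMPLES = 8733   # hourly entries, 23-10-2017 to 22-10-2018
TEST_HOURS = 7 * 24    # last week of hourly samples


def split_last_week(series, test_size=TEST_HOURS):
    """Hold out the final `test_size` samples; no shuffling for time series."""
    if len(series) <= test_size:
        raise ValueError("series is too short for the requested test size")
    return series[:-test_size], series[-test_size:]


# Placeholder values standing in for the real hourly traffic.
series = list(range(TOTAL_SAMPLES))
train, test = split_last_week(series)
```

Keeping the split chronological ensures that the model is always evaluated on hours that come after everything it was trained on.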
4.2 Performance evaluation of B-RAN
This subsection provides the implementation of blockchain-based mobile networks and the performance evaluation of B-RAN. The parameters evaluated are the fitness of the blockchain with respect to speed, the convergence rate of the different blockchain systems and the latency with respect to the number of blocks. MATLAB R2019 is used to analyze the performance of the proposed B-RAN. The effect of speed on the effective blocks is simulated and illustrated in Fig. 8.
Figure 8 shows the effect of speed on the effective blocks in terms of the ratio of the expected values of the probability distributions, i.e., \( \overline{\lambda} \) / \( \overline{\sigma} \). It provides an insight into how the proportion of valid blocks changes with the speed. In Fig. 8, the right side indicates fast but unstable growth, which corresponds to the chaotic system and is unsuitable for validating the blocks in the blockchain. The left side indicates slow but stable growth, from which it is inferred that the slow blockchain system takes more time to create and validate a new block. The ‘fit line’, plotted in red, is the boundary between the slow and chaotic blockchain systems. On this boundary line, almost 99% of the UEs are able to create and validate their blocks. This indicates that B-RAN is suitable for implementation in a 4G LTE system.
The convergence rate of the three B-RAN systems is analyzed by increasing the number of UEs from 10 to 50, as shown in Fig. 9. It is observed that the chaotic B-RAN (blockchain system) has a slow convergence, so it may not be able to create and synchronize the blocks within the specified time. The slow B-RAN (blockchain system) has an average convergence, whereas the fast B-RAN (blockchain system) has a high convergence and is therefore more suitable for blockchain-based mobile networks. The latency of the proposed blockchain-based mobile network has been simulated with respect to the number of valid blocks produced and is shown in Fig. 10.
From Fig. 10, it is found that even when the number of blocks is increased to 50, the maximum latency produced by the proposed blockchain-based mobile network is 39 ms. The number of blocks is equal to the number of UEs present in a particular cell. The upper bound for packet latency in 4G mobile networks is 100 ms. Therefore, from the above three performance evaluations, it is clear that the proposed blockchain-based mobile network meets the requirements of a 4G LTE system, which supports the feasibility of B-RAN for real-time implementation.
4.3 Performance evaluation of traffic prediction architecture
The mobile traffic data from 3 different LTE cell towers is used to evaluate the performance of the proposed training architecture, and a side-by-side comparison is made with the state of the art prediction model ‘Autoregressive Integrated Moving Average (ARIMA)’. ARIMA is one of the best prediction models available, yet at times it fails in the prediction of mobile traffic, and this is where LSTM outperforms ARIMA. The performance of both LSTM and ARIMA is compared against the given ground truth values. The dataset, which has hourly data traffic values for a year, has 8733 samples and is divided into training and testing sets for the proposed LSTM model as mentioned above. This applies to one cell tower; the same procedure is performed on 3 cell towers to check the consistency of the proposed model. The LSTM prediction model is implemented in Python using Keras with Tensorflow as backend. The algorithm uses a regression model for this prediction because the quantity being predicted is a continuous value; for continuous outputs, a regression model is chosen in place of a classification model. To fine-tune the prediction model, proper hyperparameters need to be set. Refer to Table 2 for the training hyperparameters. In this model, adjusting the number of epochs and hidden layers is crucial for training the algorithm. Increasing the number of states increases the precision of the prediction, but in this case the training dataset is long. Therefore, a balance is needed between the amount of training data observed and the precision of the predicted data. It determines the amount of information that needs to be remembered by the algorithm. For this reason, the number of epochs is fixed at 120. A year’s training data is required to validate the architecture.
The optimizer used for the proposed prediction model is Nadam (Nesterov Adam). Much as the Adam optimizer is essentially RMSProp with momentum, Nadam is RMSProp with Nesterov momentum. Optimizers are essential for a prediction model. They connect the loss function to the model parameters by updating the model in response to the output of the loss function, and they bring the model into its most precise form by adjusting the weights. Here the number of timesteps is 24, as the traffic values are recorded every hour of the day for one year. In the proposed model, the timesteps act as a memory for the RNN, which holds 24 values to predict the next output. The lookback window is the number of previous time steps used as input to predict the next time period. So, the proposed model has a lookback window of 24, which means that the prediction model is trained on the previous 24 time steps to predict the next time period. The metrics chosen for this model are Mean Absolute Error (MAE), Mean Squared Error (MSE) and R2 Score. MSE is the mean of the squared differences between the target variable and the predicted values.
The R2 Score is the proportion of the variance in the ground truth that is explained by the predicted data. It is used as the testing metric for both LSTM and ARIMA. Its value is at most 1 and can be negative when the predictions fit worse than the mean of the data. The higher the R2 Score, the better the prediction model. The R2 Score should not be confused with model accuracy, because a good prediction model can have a low R2 Score, and a model that does not fit the values properly can still have a high R2 Score. The R2 Score can turn out mediocre on large datasets (as in the present case) because the variance is high in a large dataset, even though the prediction is good. In this research work, R2 metrics are used to compare the proposed LSTM model with ARIMA.
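The three metrics can be written out explicitly. The sketch below assumes the usual definitions, with R2 computed as one minus the ratio of the residual sum of squares to the total sum of squares (the convention also used by scikit-learn's `r2_score`); the sample values are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes y_true is not constant."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical ground truth and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
scores = (mse(y_true, y_pred), mae(y_true, y_pred), r2_score(y_true, y_pred))
```

Under this definition, predicting the mean of the ground truth for every sample gives an R2 Score of 0, and predictions worse than that give a negative score.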
When all the hyperparameters are set accordingly, the proposed LSTM prediction model is trained on the dataset. The predicted data of the LSTM and ARIMA models are compared with the ground truth values, i.e., the test dataset of each of the 3 datasets, as shown in Figs. 11, 12 and 13.
From Fig. 11, it is observed that the predictions of LSTM and ARIMA almost match the ground truth values. To measure the testing efficiency, the R2 score has been computed for both models and is found to be 0.5014 and 0.45302 for the LSTM and ARIMA models respectively. Hence, the proposed LSTM model is better than ARIMA by approximately 9.6% for the cell tower 000113.
The traffic prediction for cell tower 000231 has been computed for both the LSTM and the ARIMA baseline model and is shown in Fig. 12. The performance is compared using the R2 score, which is found to be 0.54233 and 0.44584 for the LSTM and ARIMA models respectively. Hence, the proposed LSTM model is better than ARIMA by nearly 17.7% for the cell tower 000231.
Similarly, the traffic prediction for cell tower 000233 has been computed using both models and is shown in Fig. 13. The R2 score computed for the performance comparison is found to be 0.54077 and 0.49619 for the LSTM and ARIMA baseline models respectively. It is inferred that the proposed LSTM model is better than ARIMA by approximately 8.2% for the cell tower 000233.
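The reported percentages can be reproduced from the R2 scores. The text does not state the formula; the figures appear to be consistent with dividing the R2 gap by the LSTM score, so that formula is used here as an assumption. The dictionary and function names are hypothetical.

```python
# (LSTM R2, ARIMA R2) per cell tower, as reported in the text.
R2_SCORES = {
    "000113": (0.5014, 0.45302),
    "000231": (0.54233, 0.44584),
    "000233": (0.54077, 0.49619),
}


def relative_gain_percent(lstm_r2, arima_r2):
    """R2 gap expressed as a percentage of the LSTM score (assumed formula)."""
    return 100.0 * (lstm_r2 - arima_r2) / lstm_r2


gains = {cell: relative_gain_percent(*pair) for cell, pair in R2_SCORES.items()}
for cell, gain in gains.items():
    print(f"Cell {cell}: {gain:.2f}%")
```

With this formula the three values come out close to the 9.6%, 17.7% and 8.2% quoted above; taking the ARIMA score as the denominator instead would give noticeably larger percentages.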
Figure 14 shows the training loss of the proposed LSTM model for all 3 cell towers over 120 epochs. The loss values shown are the Mean Squared Errors. Each epoch trains on 8542 training samples, and the graph plots the resulting MSE value for each of the 120 epochs. The performance of the proposed multilayered LSTM has been compared with that of the ARIMA baseline model. The performance metrics considered are Mean Squared Error (MSE), Mean Absolute Error (MAE) and R2 score. The comparison is tabulated in Table 3.
From Table 3, it is observed that the MSE and MAE of the proposed LSTM are lower than those of the ARIMA baseline model. It is also observed that the proposed model improves the prediction performance, measured by R2 score, by 8.2% to 17.7% when compared with the ARIMA model. ARIMA falls behind the LSTM-based traffic prediction because the 4G LTE dataset calls for long-term forecasting. For this type of traffic prediction, it is found that the proposed LSTM model achieves a better performance.
5 Conclusion
In this research article, an LSTM prediction model has been proposed to predict the cellular traffic on blockchain-based mobile networks in 4G LTE. Using a Blockchain-based mobile network alongside the prediction and management of LTE network traffic yields quicker local connectivity and offers reliable service to users. The multi-layered LSTM RNN model is trained with one year of traffic data to predict the traffic during the last week of the same year. The performance of the proposed LSTM model has been validated on three cell towers and compared with the state of the art prediction model ARIMA. From the validation, it is inferred that the predictions of the proposed LSTM model are close to the actual traffic. Also, when compared with the results of the ARIMA model, it is inferred that LSTM outperforms ARIMA in cellular traffic time series prediction. Hence it is concluded that LSTM performs better while working with large datasets and is more useful in cellular traffic prediction than the ARIMA baseline model.
Code availability
Custom code. It will be provided on request.
References
Xu F, Li Y, Wang H, Zhang P, Jin D (2017) Understanding mobile traffic patterns of large scale cellular towers in urban environment. IEEE/ACM Trans Networking 25(2):1147–1161
Cisco, Visual Networking Index (2017) Global mobile data traffic forecast update, 2017–2022 white paper. Document ID, 1454457600805266. https://bit.ly/3q6gfMZ
Hansapalangkul T, Keeratiwintakorn P, Pattara-Atikom W (2007) Detection and estimation of road congestion using cellular phones. In Proceedings of the IEEE International Conference on ITS Telecommunications, 143–146. https://doi.org/10.1109/ITST.2007.4295851
Hongsakham W, Pattara-Atikom W, Peachavanish R (2008) Estimating road traffic congestion from cellular handoff information using cell-based neural networks and K-means clustering. In proceedings of the IEEE international conference on electrical engineering/electronics computer. Telecommun Inform Technol 1:13–16
Chandrasekaran G, Vu T, Varshavsky A, Gruteser M, Martin RP, Yang J, Chen Y (2011) Tracking vehicular speed variations by warping mobile phone signal strengths. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications, 213–221. https://doi.org/10.1109/PERCOM.2011.5767589
Liu HX, Danczyk A, Brewer R, Starr R (2008) Evaluation of cell phone traffic data in Minnesota. Transp Res Rec 2086:1–7
Gundlegård D, Karlsson JM (2009) Handover location accuracy for travel time estimation in GSM and UMTS. IET Intell Transp Syst 3:87–94
Dong H, Man J, Jia L, Wang X, Qin Y, Liu K (2018) Traffic speed estimation using mobile phone location data based on longest common subsequence. In Proceedings of the IEEE International Conference on Intelligent Transportation Systems, 2819–2824. https://doi.org/10.1109/ITSC.2018.8569470
Wang K, Li X, Ji H, Du X (2016) Modeling and optimizing the LTE discontinuous reception mechanism under self-similar traffic. IEEE Trans Veh Technol 65(7):5595–5610
Azari A, Ozger M, Cavdar C (2019) Risk-aware resource allocation for URLLC: challenges and strategies with machine learning. IEEE Commun Mag 57(3):42–48
Rebane J, Karlsson I, Denic S, Papapetrou P (2018) Seq2seq RNNs and ARIMA models for cryptocurrency prediction: a comparative study. Proceedings of the SIGKDD Workshop on Fintech, 1–5
Li R, Zhao Z, Zheng J, Mei C, Cai Y, Zhang H (2017) The learning and prediction of application-level traffic data in cellular networks. IEEE Trans Wirel Commun 16(6):3899–3912. https://doi.org/10.1109/TWC.2017.2689772
Trinh HD, Giupponi L, Dini P (2018) Mobile traffic prediction from raw data using LSTM networks. In: IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications, 1827–1832. https://doi.org/10.1109/PIMRC.2018.8581000
Wang J, Tang J, Xu Z, Wang Y, Xue G, Zhang X, Yang D (2017) Spatiotemporal modeling and prediction in cellular networks: A big data enabled deep learning approach. In: IEEE Conference on Computer Communications, 1–9. https://doi.org/10.1109/INFOCOM.2017.8057090
Huang C-W, Chiang C-T, Li Q (2017) A study of deep learning networks on mobile traffic forecasting. In IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications, 1–6. https://doi.org/10.1109/PIMRC.2017.8292737
Byoungsuk J, Hong EJ (2019) Deep-learning-based real-time road traffic prediction using long-term evolution access data. Sensors 19:5327. https://doi.org/10.3390/s19235327
Shu Y, Yu M, Liu J, Yang OWW (2003) Wireless traffic modeling and prediction using seasonal ARIMA models. IEEE Int Conf Commun 3:1675–1679
Zhou B, He D, Sun Z (2006) Traffic predictability based on ARIMA/GARCH model. 2nd Conference on Next Generation Internet Design and Engineering, 200–207. https://doi.org/10.1109/NGI.2006.1678242
Chen X, Jin Y, Qiang S, Hu W, Jiang K (2015) Analyzing and modeling spatio-temporal dependence of cellular traffic at city scale. IEEE International Conference on Communications, 3585–3591. https://doi.org/10.1109/ICC.2015.7248881
Zhang C, Zhang H, Qiao J, Yuan D, Zhang M (2019) Deep transfer learning for intelligent cellular traffic prediction based on cross-domain big data. IEEE J Select Areas Commun 37(6):1389–1401
Shihao W, Qinzheng Z, Han Y, Qianmu L, Yong Q (2019) A Network Traffic Prediction Method Based on LSTM. ZTE Communications 17(2):19–25
Polina M, Vieira A, Putin E, Zhavoronkov A (2016) Applications of deep learning in biomedicine. Mol Pharm 13(5):1445–1454
Iqbal MF, Zahid M, Habib D, John LK (2019) Efficient prediction of network traffic for real-time applications. J Comput Networks Commun 2019:1–11. https://doi.org/10.1155/2019/4067135
Jaffry S, Hasan SF (2020) Cellular Traffic Prediction using Recurrent Neural Networks. 2020 IEEE 5th International Symposium on Telecommunication Technologies, 94–98. https://doi.org/10.1109/ISTT50966.2020.9279373
Decker C, Wattenhofer R (2013) Information propagation in the Bitcoin network. In Proc IEEE Peer-to-Peer Computing, 1–10. https://doi.org/10.1109/P2P.2013.6688704
Hedman P (2016) White paper on description of network slicing concept. NGMN 5G Project Requirements and Architectures, v1.0, 1–7
Manish J, Hadi TH (2015) A review of network traffic analysis and prediction techniques. arXiv preprint. https://arxiv.org/abs/1507.05722
Jeffrey AG, Buzzi S, Choi W, Hanly SV, Lozano A, Soong ACK, Zhang JC (2014) What will 5G be? IEEE J Select Areas commun 32(6):1065–1082
Nakamoto S (2018) Bitcoin: a peer-to-peer electronic cash system. HN Publication
Wang Q, Ji T, Guo Y, Yu L, Chen X, Li P (2020) TrafficChain: a Blockchain-based secure and privacy-preserving traffic map. IEEE Access 8:60598–60612
Gorenflo C, Lee S, Golab L, Keshav S (2019) Fastfabric: scaling hyperledger fabric to 20,000 transactions per second. In IEEE International Conference on Blockchain and Cryptocurrency, 455–463. https://doi.org/10.1109/BLOC.2019.8751452
Chaer A, Khaled S, Claudio L, Ray PP, Sheltami T (2019) Blockchain for 5G: opportunities and challenges. In IEEE Globecom Workshops, 1–6. https://doi.org/10.1109/GCWkshps45667.2019.9024627
Ling X, Jiaheng W, Bouchoucha T, Levy BC, Ding Z (2019) Blockchain radio access network (B-RAN): towards decentralized secure radio access paradigm. IEEE Access 7:9714–9723
Ling X, Le Y, Wang J, Ding Z, Gao X (2020) Practical modeling and analysis of Blockchain radio access network. IEEE Trans Commun. https://doi.org/10.1109/TCOMM.2020.3029779
Gao J, Asamoah KO, Sifah EB, Smahi A, Xia Q, Xia H, Zhang X, Dong G (2018) Grid monitoring: secured sovereign blockchain based monitoring on smart grid. IEEE Access 6:9917–9925
Mafakheri B, Subramanya T, Goratti L, Riggio R (2018) Blockchain-based Infrastructure Sharing in 5G Small Cell Networks. In 14th International Conference on Network and Service Management, Rome, 313–317
Pascale ED, McMenamy J, Macaluso I, Doyle L (2017) Smart contract SLAs for dense small-cell-as-a-service. arXiv preprint. https://arxiv.org/abs/1703.04502
Salimitari M, Chatterjee M, Fallah YP (2020) A survey on consensus methods in blockchain for resource-constrained IoT networks. Internet of Things 11:100212. https://doi.org/10.1016/j.iot.2020.100212
Ethics declarations
Conflict of interests
The authors declare that there is no conflict of interest regarding the publication of this paper and that the work presented in this article is not supported by any funding agency.
Cite this article
Kurri, V., Raja, V. & Prakasam, P. Cellular traffic prediction on blockchain-based mobile networks using LSTM model in 4G LTE network. Peer-to-Peer Netw. Appl. 14, 1088–1105 (2021). https://doi.org/10.1007/s12083-021-01085-7
DOI: https://doi.org/10.1007/s12083-021-01085-7