1 Introduction

Cognitive radio (CR) sits at the intersection of personal wireless technology and computational intelligence [1]. The term refers to radio devices that are capable of learning from and adapting to their environment. With the rapid development of wireless communication, the number of devices sharing the wireless spectrum has increased tremendously, which has led to a scarcity of available spectrum. CRs were designed to mitigate this scarcity by improving spectrum utilization: they give unlicensed users opportunistic access to the spectrum, which also addresses the under-utilization of some frequency bands. To perform cognitive tasks, a CR must be aware of its RF environment; it should sense its surroundings and identify all types of RF activity.

Spectrum sensing is the technique by which a CR learns about its environment. It helps the CR find opportunities for spectrum access and allows a Secondary User (SU) to share the licensed spectrum of a Primary User (PU), provided that the PU is not subjected to interference. Spectrum sensing is therefore considered a primary function of any CR [2].

Conventional spectrum sensing techniques such as matched filtering, energy detection, and cyclostationary feature detection sense the spectrum efficiently. This work instead applies machine learning algorithms to CR spectrum sensing, with the aim of increasing the precision of the sensing [3]. The applied algorithm uses residual learning to train a classifier model that classifies the Primary User’s (PU) transmission pattern into 10 different classes.

1.1 Problem Description

  • CRs can sense the PU transmission pattern and can occupy the spectrum such that there is no potential interference.

  • Existing spectrum sensing techniques, even though efficient and robust, can be replaced by machine learning-based spectrum sensing techniques.

  • In this paper, a 2-layer CNN model is compared with a residual learning model (ResNet-50) with transferable weights to achieve better detection.

The following section gives a brief description of the dataset used. Section 3 explains the system model used in this work, followed by its performance analysis in Sect. 4. Section 5 summarizes the paper and discusses future extensions.

2 Description of the Dataset

The dataset used in this paper is obtained from [4] with appropriate licenses. As described in [4], the spectrograms characterize the PU’s scenarios into different classes. A spectrogram is a visual representation of the frequency content of a signal as it varies with time. The spectrogram images, which are 64 × 64 greyscale images, are used as inputs for the CNN models in this paper.
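As an illustration of how a spectrogram image of this kind could be produced, the sketch below computes a log-power spectrogram from complex baseband samples using non-overlapping FFT frames in NumPy. The helper name `spectrogram_image`, the Hann window, the frame layout, and the synthetic test tone are all assumptions made for illustration; the dataset in [4] was not necessarily generated this way.

```python
import numpy as np

def spectrogram_image(iq, n_fft=64, n_frames=64):
    """Return an (n_frames, n_fft) greyscale image in [0, 1] from complex samples.

    Hypothetical helper: non-overlapping Hann-windowed FFT frames, log power,
    then min-max scaling. Rows are time, columns are frequency.
    """
    needed = n_fft * n_frames
    if iq.size < needed:
        raise ValueError("need at least n_fft * n_frames samples")
    frames = iq[:needed].reshape(n_frames, n_fft) * np.hanning(n_fft)
    # fftshift places the most negative frequency in column 0
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    power_db = 10.0 * np.log10(np.abs(spectrum) ** 2 + 1e-12)
    lo, hi = power_db.min(), power_db.max()
    return (power_db - lo) / (hi - lo + 1e-12)

# Synthetic input (assumption): one complex tone at +0.25 cycles/sample plus weak noise
rng = np.random.default_rng(0)
n = np.arange(64 * 64)
noise = 0.01 * (rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size))
iq = np.exp(2j * np.pi * 0.25 * n) + noise

img = spectrogram_image(iq)
peak_column = int(img.mean(axis=0).argmax())  # column where the tone appears
```

The result has the same 64 × 64 shape as the dataset images, and the tone shows up as a bright vertical line in a single frequency column.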

Figure 1 shows a sample image from the dataset. Each spectrogram covers 4 channels for 50 ms with a total bandwidth of 10 MHz (4 × 2.5 MHz channels), so that the spectrum occupancy of each channel can be assessed. The dataset was divided into 10 mutually exclusive classes, each containing approximately 8000 training and 2000 testing images.

Fig. 1 Sample image of the dataset
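To make the channel layout concrete, the following sketch splits a 64 × 64 image into four equal frequency bands and estimates how often each band is occupied. It assumes that columns correspond to frequency and rows to time, and it uses a hypothetical intensity threshold of 0.5; neither the axis convention nor the threshold is taken from the dataset description.

```python
import numpy as np

TOTAL_BANDWIDTH_MHZ = 10.0
NUM_CHANNELS = 4

def channel_occupancy(image, threshold=0.5):
    """Fraction of time frames in which each channel's mean intensity exceeds threshold.

    Assumes rows are time and columns are frequency, split into equal-width channels.
    The threshold is a hypothetical value, not taken from the dataset.
    """
    bands = np.split(image, NUM_CHANNELS, axis=1)           # four (64, 16) blocks
    per_frame = np.stack([b.mean(axis=1) for b in bands])    # shape (4, 64)
    return (per_frame > threshold).mean(axis=1)

channel_width_mhz = TOTAL_BANDWIDTH_MHZ / NUM_CHANNELS      # 2.5 MHz per channel

# Synthetic image: channel 1 busy for the whole window, channel 3 busy for half of it
image = np.zeros((64, 64))
image[:, 16:32] = 1.0
image[:32, 48:64] = 1.0
occupancy = channel_occupancy(image)
```

For this synthetic image the four occupancy values are 0, 1, 0, and 0.5, that is, channel 1 is always busy and channel 3 is busy for half of the observation window.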

From the complete collection of 89,000 spectrograms, 25,000 spectrograms were labeled by one of the authors. Note that the human-applied labels are unverified and based on subjective visual interpretation, as we did not have access to measurement methods.

Each class in the dataset was labeled according to its channel occupancy: in classes 0 to 4 the PU is absent from some of the channels, whereas in classes 5 to 9 all channels are occupied. A detailed description of all the classes is given in Fig. 2.

Fig. 2 Labels of different classes

3 System Model

The system uses a residual learning-based deep convolutional neural network, also called a deep residual network, to detect the different classes in the dataset. Deep neural networks have shown high performance on image classification tasks, but they are harder to train [5]. Because of their complexity and the vanishing gradient problem, deeper networks usually take a long time and a lot of computational resources to train. Deep Residual Networks (ResNets), however, make the training process easier and faster while achieving better accuracy than their equivalent plain networks [6], and they have proven to be very successful on image classification [7]. The images in the dataset were described in the previous section; this section discusses the concept underlying the deep residual network. This work initially considered a two-layer convolutional neural network, first for classification between 2 categories and then for multi-category classification. As the two-layer model failed for more correlated classes, a deeper model was built using the residual learning architecture ResNet-50.
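The core idea of residual learning can be sketched in a few lines: each block computes a residual branch F(x) and adds the input back, giving relu(F(x) + x). The NumPy example below is a minimal illustration under simplifying assumptions; dense matrices stand in for the convolutions of a real ResNet block, and the names `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = relu(x @ w1) @ w2.

    Dense layers stand in for the convolutions of a real ResNet block;
    w1 and w2 are hypothetical weight matrices of shape (d, d).
    """
    fx = relu(x @ w1) @ w2   # residual branch F(x)
    return relu(fx + x)      # skip connection adds the input back

d = 8
rng = np.random.default_rng(0)
x = np.abs(rng.standard_normal((4, d)))   # non-negative activations, e.g. after a ReLU

# With all-zero weights the residual branch vanishes and the block is the identity.
zeros = np.zeros((d, d))
identity_out = residual_block(x, zeros, zeros)

# With small random weights the output is a small perturbation of the input.
w1 = 0.01 * rng.standard_normal((d, d))
w2 = 0.01 * rng.standard_normal((d, d))
perturbed_out = residual_block(x, w1, w2)
```

Because a block with near-zero weights passes its input through almost unchanged, adding more blocks does not have to degrade the signal or its gradient, which is one common explanation for why deep residual networks are easier to train.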

Figure 3 describes the system model used for spectrum sensing in CR using ResNet-50. The input image is passed to the convolutional ResNet and interpolated to an acceptable image size, after which it is treated as a matrix (or vector) of pixels. As the image passes through successive layers, the model learns at each layer, and a weight vector is obtained. The last convolution stage, Conv 5_x, has four 512 layers and four 2048 layers, whose output is flattened to a 1 × 8192 vector. The fully connected layer is defined as a dense layer of 1000 nodes with a ReLU activation function. A multi-class output layer with the SoftMax activation function then generates a 1 × 10 label vector containing the probabilities of the different target classes. This vector is used to predict the class of a test image and to report the corresponding probabilities.

Fig. 3 ResNet-50 Model for spectrum sensing
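The shapes in the classification head can be checked with a short NumPy sketch. It assumes that the backbone output for one image is a 2 × 2 × 2048 feature map, which is one reading consistent with the 1 × 8192 vector mentioned above, and it uses randomly initialised, untrained weights purely to verify dimensions. The variable names are hypothetical and this is not the TensorFlow implementation used in this work.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Assumed backbone output for one image: 2 x 2 spatial positions, 2048 channels.
features = rng.standard_normal((1, 2, 2, 2048))
flat = features.reshape(1, -1)                        # 1 x 8192

# Randomly initialised (untrained) weights, for shape checking only.
w_fc = 0.01 * rng.standard_normal((8192, 1000), dtype=np.float32)
b_fc = np.zeros(1000)
w_out = 0.01 * rng.standard_normal((1000, 10))
b_out = np.zeros(10)

hidden = np.maximum(flat @ w_fc + b_fc, 0.0)          # 1000-node dense layer with ReLU
probs = softmax(hidden @ w_out + b_out)               # 1 x 10 class probabilities
predicted_class = int(probs.argmax(axis=-1)[0])       # index of the most likely class
```

The ten entries of `probs` sum to one, and the index of the largest entry is taken as the predicted class.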

4 Performance Analysis of the Classifier Model

The model is trained on a training set and tested on a test (validation) set. The training and testing accuracy are reported at the end of each epoch, where each epoch is one pass over the training set in batches. The model is trained on 20,000 spectrograms and tested on 5000 spectrograms for 15 epochs, with a batch size of 32 and therefore 625 steps per epoch. The ResNet-50 algorithm was implemented using the open-source TensorFlow Python library running on an Nvidia GTX 1050 graphics processing unit (GPU) card. A comparison of the 2-Layer CNN and ResNet-50 is shown in Table 1.
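The relationship between the dataset size, batch size, and step count quoted above can be verified with simple arithmetic. The variable names below are illustrative only.

```python
import math

train_images = 20_000
test_images = 5_000
batch_size = 32
epochs = 15

steps_per_epoch = math.ceil(train_images / batch_size)        # 625
total_updates = steps_per_epoch * epochs                      # 9375
test_fraction = test_images / (train_images + test_images)    # 0.2
```

With these settings, the 625 steps correspond to exactly one pass over the 20,000 training images, and the test set is 20% of the 25,000 labeled spectrograms.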

Table 1 Comparison of 2-Layer CNN and ResNet-50

A confusion matrix was also used as an evaluation metric; it shows the classification accuracy for each class.

A key point to be noted from the matrix (see Fig. 4) is that class 6 was predicted perfectly, with an area under the curve (AUC) of 1. Classes 1, 3, and 9 had relatively poor AUC, i.e., more images of those classes were wrongly classified into other classes. More training is therefore needed for these classes.

Fig. 4 Confusion Matrix for the ResNet-50 model
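Per-class accuracy can be read off a confusion matrix as the diagonal entry divided by the row sum. The sketch below shows this on a made-up 3-class matrix; the counts are illustrative only and are not results from this work, and the convention that rows are true classes and columns are predicted classes is an assumption.

```python
import numpy as np

def per_class_recall(confusion):
    """Diagonal over row sums: the fraction of each true class predicted correctly.

    Assumes rows are true classes and columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    return np.diag(confusion) / confusion.sum(axis=1)

# Made-up 3-class counts for illustration only; these are not results from this work.
confusion = [
    [50,  0,  0],
    [ 5, 40,  5],
    [ 0, 10, 40],
]
recall = per_class_recall(confusion)
overall_accuracy = np.trace(np.asarray(confusion)) / np.sum(confusion)
```

In this toy example the first class is classified perfectly, while the other two each lose 20% of their images to neighbouring classes, which is the kind of pattern described for the poorly performing classes above.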

5 Summary and Discussion

Spectrum sensing is a key aspect of the paradigm of CR networks. Because conventional sensing techniques have certain constraints, machine learning was used to predict the classes of the PU’s spectrum scenarios.

In this work, convolutional neural networks were used as the deep learning network to predict PU scenarios. Initially, a 2-layer CNN model was used, and it was then upgraded to a residual learning-based ResNet-50 model. Simulation and prediction results indicated that the ResNet-50 model was more precise and robust in classifying the PU classes. ResNet-50 was also able to separate more correlated classes into their appropriate individual classes, although a few closely similar classes scored a relatively lower accuracy than the others.

As a future extension, a Region-based CNN (RCNN) model can be trained using a self-annotated dataset. Such a model can divide a single-class test image into regions of interest, where the regions of interest are the white spaces. It can be trained further to predict the mask region more accurately, and the predicted regions can then be used to allocate the SU’s spectrum in the available channels without potential interference.