A Deep Neural Network Model for Hybrid Spectrum Sensing in Cognitive Radio

Nasser, A.; Chaitou, M.; Mansour, A.; Yao, K. C.; Charara, H.

doi:10.1007/s11277-020-08013-7

Published: 03 January 2021

Volume 118, pages 281–299, (2021)

Wireless Personal Communications

A. Nasser¹,² (ORCID: orcid.org/0000-0002-7768-8953),
M. Chaitou⁴,
A. Mansour¹,
K. C. Yao³ &
H. Charara⁴

Abstract

Spectrum sensing (SS) is an essential task of the secondary user (SU) in a cognitive radio system. SS monitors the primary user (PU) activity in order to avoid any collision with the SU, which must remain silent while the PU is active on a given channel. Hybrid SS (HSS) is a powerful method for monitoring PU activity: it combines several detectors to reach a final decision on the PU status. In this manuscript, artificial neural networks (ANN) are used to perform HSS. Since our data is composed of the test statistics (TSs) of several detectors, it can be modeled as tabular data, for which fully connected neural networks are the most suitable ANN model. We applied cutting-edge deep learning techniques in order to obtain the most accurate neural network model for our application, namely embedding, regularization, batch normalization and smart learning rate selection. With the help of the TSs of several detectors, the ANN is trained to distinguish between two hypotheses, $H_0$: PU is absent and $H_1$: PU is active. Numerical results show the effectiveness of our proposed ANN-based HSS, as it outperforms the classical ANN-based energy detector and proves its capability to detect the PU signal at very low SNR.

1 Introduction

Cognitive radio (CR) has been proposed in order to overcome the spectrum scarcity problem. An unlicensed user, known as the secondary user (SU), may opportunistically access the channel of the licensed user, known as the primary user (PU), when the latter is absent [1]. Thus, one of the most important functions in CR is spectrum sensing (SS), which is responsible for verifying whether the primary channel is occupied or not. Several detectors have been proposed to perform the SS task, such as the energy detector (ED), the auto-correlation detector (ACD) and the cyclo-stationary detector (CSD) [2].

In classical SS, i.e. signal detection, the SU applies a test statistic (TS) to the received signal and compares it to a predefined threshold in order to make a decision on the PU status. If the TS is above the threshold, the PU is considered active. However, setting the optimal threshold that meets the target detection and false alarm rates presupposes that the statistical distribution of the TS is known, which is not always possible because the statistical properties of the noise, the PU signal or the transmission channel may be unstable or unknown.

To overcome the analytical statistical problems of classical SS and improve its performance, several published works propose adopting machine learning (ML) and neural network (NN) techniques to make decisions on the PU channel occupancy [3,4,5,6,7,8,9]. The main aim of these works is to tune ML or NN systems with the statistics of both hypotheses: $H_0$, when the PU is assumed to be absent, and $H_1$, when the PU is assumed to be active.

In [5], ML techniques such as K-means and the support-vector machine (SVM) are used to distinguish between the $H_0$ and $H_1$ hypotheses in cooperative SS. Two low-dimensional probability vectors related to $H_0$ and $H_1$ of the ED are used to train the system, and the SVM sets the threshold curve between the $H_0$ and $H_1$ clusters. K-nearest-based ML is adopted in [10] for cooperative SS. The proposed mechanism is divided into two phases: training and classification. The global decision on the presence or absence of the PU, taken at the end of the classification phase, takes into consideration the reliability of each CR user when reporting to the fusion center during the training phase.

For local SS, an ensemble classifier is proposed in [11]. The classifier seeks to discriminate between the $H_0$ and $H_1$ hypotheses by being trained with the extracted cyclic features of the PU’s signal in low SNR conditions. This ensemble classifier is based on decision trees and the AdaBoost algorithm. Wideband SS is tackled in [12], where three ML techniques (neural networks, expectation maximization and k-means) are used to detect the presence of one or multiple primary users in a wideband spectrum.

In order to enhance the accuracy of the ML system in deciding on the PU status, hybrid SS (HSS) has been proposed [6, 7]. HSS makes a sensing decision based on several detectors instead of only one, as in classical SS. In [6, 7], an artificial neural network (ANN) is applied to perform HSS. The ANN is trained using the TSs of two detectors under $H_0$ and $H_1$ (ED and the cyclostationary detector (CSD) in [6]; ED and likelihood ratio statistics in [7]).

The strength of HSS lies in compensating the weak points of a given detector with the advantages of another. For instance, ED suffers from noise uncertainty at low SNR, which is overcome by ACD. In return, ACD is adversely affected by a low oversampling rate of the PU signal, whereas ED is not affected by this issue. An HSS scheme is proposed in [13], where ED and CSD are adopted. First, ED is evaluated to verify whether the primary user is present or not; the CSD is used when the energy detector is not sure about the presence or absence of the PU. Moghimi et al. [14] and Cardenas-Juarez et al. [15] exploit ED and the waveform detector (WFD), a coherent detector based on the correlation of the received PU signal with a known reference of this signal. An optimal hybrid detector based on ED and WFD is derived as a linear combination of an energy detection metric and a coherent correlation metric.

However, the classical treatment of HSS requires knowledge of some statistical features of the combined detectors. These may be hard to obtain since the PU signal’s statistical parameters are not always known or available, which makes numerical techniques such as NN an efficient solution. Yet even when NN was used in the literature, the hybridization was limited to two detectors, as in [6, 7], which does not reflect the real potential of such a technique.

In this paper, we present a more general study of the performance of HSS by admitting up to six different detectors. ANNs are trained with the TSs of the detectors using data related to $H_0$ and $H_1$. The performance is discussed according to several criteria related to the ANN itself and to the number of detectors combined in HSS. Regarding the ANN system, we detail the effect of the number of layers and the number of nodes in each layer on the accuracy of the decision on the PU channel status. For the adopted detectors, the performance is evaluated based on the probability of detection (PD) and the false alarm rate (FAR). In addition, the impact of the number of combined detectors in HSS on the performance is detailed.

The remainder of this paper is organized as follows. In Sect. 2, our system model of the PU signal and the noise is presented. The data model, the neural network model, and the discrimination process between the two hypotheses $H_0$ and $H_1$ are given in Sect. 3. Numerical results and discussions are provided in Sect. 4. Finally, Sect. 5 concludes our work.

2 System Model

The decision in SS is binary: two hypotheses, $H_0$ and $H_1$, must be distinguished:

$$\begin{aligned} {\left\{ \begin{array}{ll} H_0:\text {PU is absent}\\ H_1: \text {PU is active} \end{array}\right. } \end{aligned}$$

(1)

The measured TS value leads SU to decide on the PU activity by comparing TS to a predefined threshold.

Accordingly, two classes of TS values have to be defined: the $H_0$-class and the $H_1$-class, related to the hypotheses $H_0$ and $H_1$ respectively. The $H_0$-class only depends on system parameters such as the noise and the hardware imperfections; in other words, it is independent of the PU signal, since the received signal r(n) can be written as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} r(n)&{}=w(n) \text { under } H_0\\ r(n)&{}=s(n)+w(n)\text { under } H_1 \end{array}\right. } \end{aligned}$$

(2)

where w(n) stands for an additive white Gaussian noise (AWGN) and s(n) is assumed to be the received PU signal to be detected.
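The signal model of Eq. (2) can be illustrated with a short simulation. This is only a sketch in plain Python: the function names are hypothetical, the noise power is normalised to one, and a unit-power QPSK symbol stands in for the PU signal (it is a placeholder, not the modulation used for the dataset of this paper).

```python
import math
import random


def received_signal(n_samples, snr_db, pu_active, rng):
    """Return n_samples complex samples of r(n) following Eq. (2)."""
    sigma = math.sqrt(0.5)                 # per-component std so that E|w|^2 = 1
    amp = math.sqrt(10 ** (snr_db / 10))   # signal amplitude for the requested SNR
    samples = []
    for _ in range(n_samples):
        w = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))  # AWGN sample
        if pu_active:
            # Placeholder PU signal: unit-power QPSK symbol (assumption)
            symbol = complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
            samples.append(amp * symbol + w)   # H1: r(n) = s(n) + w(n)
        else:
            samples.append(w)                  # H0: r(n) = w(n)
    return samples


def mean_power(r):
    """Average squared modulus of the samples."""
    return sum(abs(x) ** 2 for x in r) / len(r)


rng = random.Random(0)
r_h0 = received_signal(4000, snr_db=0, pu_active=False, rng=rng)
r_h1 = received_signal(4000, snr_db=0, pu_active=True, rng=rng)
print(mean_power(r_h0), mean_power(r_h1))
```

At 0 dB the average power under $H_1$ is roughly twice that under $H_0$, which is the kind of separation a TS is meant to capture.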

For HSS, the SU evaluates an $m\times 1$ vector V related to m detectors:

$$\begin{aligned} V=[T_{D_1}, T_{D_2}, \ldots , T_{D_m}]^{tr} \end{aligned}$$

(3)

where the superscript tr stands for the transpose operation and $T_{D_i}$ is the TS related to the detector $D_i,\ i\in [1,m]$. Each TS is a mathematical function applied to r(n). For instance, ED evaluates the sum of squares of the samples of r(n), whereas ACD evaluates the correlation between r(n) and a shifted version of itself, and so on. In classical SS, the SU evaluates only one TS related to a given detector, and this TS is compared to a threshold to decide on the PU status. In HSS, however, a vector of TSs related to several detectors is evaluated and combined in order to examine the PU channel status. In our work, an ANN is used to combine the data of these detectors and exploit them to produce a final decision on the PU status.

3 The Data Model

In this section we present the details of our dataset and of the ANN model used to combine the evaluated TSs of the adopted detectors. Once trained with hybrid data, the ANN is used to make a decision on the PU status.

3.1 Dataset

The data consists of two categories according to the two hypotheses $H_0$ and $H_1$. The data was generated from the TSs of six detectors: ED [16], ACD [17], the maximum eigenvalue detector (EVM) [18], the maximum–minimum eigenvalue detector (EVMM) [18], the cumulative power spectral density detector (CPSD) [19] and the goodness-of-fit detector (GoF) [20]. The data assumes AWGN and a 16-QAM modulated PU signal with an oversampling rate $N_s=4$. The TSs of the adopted detectors are given in the “Appendix”. Our dataset, described in Fig. 1, consists of seven features, $\{\text {ED, ACD, EVM, EVMM, CPSD, GoF, SNR}\}$, and a label. The label value is 0 under hypothesis $H_0$ and 1 under $H_1$. In particular, the dataset contains $9 \times 10^6$ rows. Including the SNR in the set of features is an important choice: it avoids building and training a separate neural network model (NN model) for each SNR value.

We split the dataset randomly into an $80\%$ training set and a $20\%$ validation set. Figure 2 illustrates the count of rows with $H_0$ and $H_1$ respectively (i.e. labels 0 and 1) in the validation set. It can be observed that the data is uniformly distributed among all SNR values. This also applies to the training set.
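A random 80/20 split of this kind can be sketched as follows. The helper name, the fixed seed and the toy list of row indices are illustrative assumptions; the text does not state which tool was used to split the real dataset.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and cut it into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)       # seeded for reproducibility
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(1000))          # stand-in for dataset row indices
train, valid = split_rows(rows)
print(len(train), len(valid))
```

Because the shuffle is uniform, each SNR value and each label should appear in both parts in roughly the same proportion as in the full dataset.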

In order to examine the data in more depth, we picked 1000 random samples from the validation dataset and plotted the scatter of two detectors, ED and ACD, as depicted in Fig. 3. The $H_1$ data drifts away from the $H_0$ data class as the SNR increases, whereas the $H_0$ data keeps the same place in the scatter space for all SNR values because it is only related to the noise. Low SNR values (e.g. − 21 dB) make the discrimination between $H_0$ and $H_1$ a tough task due to the heavy overlap of the $H_0$ and $H_1$ data (see Fig. 3). At a relatively good SNR value (e.g. 6 dB), however, the classification becomes an easy task.

The data input of the model is a batch of 64 rows (see Sect. 4.1 for a discussion on the batch size). Figure 4 illustrates the first 10 rows of a batch drawn randomly from the dataset.

We iterate over the training set by selecting a batch at each step and feeding it as an input to Algorithm 1. After completing a whole pass over the training set, we switch to the validation set and apply Algorithm 2 in order to assess the accuracy of the model. This completes one epoch. This procedure can be repeated until an acceptable accuracy is reached (e.g. an accuracy value $> 95\%$).

3.2 The Neural Network Model

Since our data is in tabular form, we select a fully connected neural network (FCNN). An FCNN consists of one input layer, several hidden layers and one output layer. The feature set is the input layer of our model. The output layer simply consists of two nodes because we are trying to predict whether a row of feature values belongs to hypothesis $H_0$ or $H_1$. That is, the values of the two output nodes are two probabilities that sum to one. It remains to specify the number of hidden layers, i.e. the ones between the input and output layers.

The number of hidden layers and the number of nodes in each layer are considered as hyper-parameters and can be tweaked. Two hidden layers are considered: the first with 1000 nodes and the second with 500 nodes. We discuss the tweaking of the model parameters in Sect. 4.1.

As a subtle point, notice that the SNR takes discrete values, hence it is considered as a categorical variable, as opposed to the six detector variables, which are continuous. It is common practice to use embedding [21] for a categorical variable since it improves the model accuracy. The embedding process is shown in Fig. 5. In this figure, we take a one-hot encoded vector [21] of the SNR concatenated with a bias (i.e. a real value which will be learnt by the NN), which yields a vector of length 10. This vector is mapped to a vector of length 6, called the embedding vector. The embedding vector dimension is a hyper-parameter and can be tweaked (Sect. 4.1). A bias is added because this is required by the embedding process. We concatenate this 6-D vector with the six detectors (Eq. 3) in order to produce the input layer of the FCNN (Fig. 5). Then, we add two hidden layers with [1000, 500] nodes and an output layer with 2 nodes.
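The construction of the 12-dimensional input can be sketched in plain Python. Several points here are assumptions rather than facts from the text: the nine SNR levels are taken as − 24 to 0 dB in 3 dB steps, the bias is modelled as a constant input of 1 whose matrix row is learnt, the embedding matrix is filled with random numbers in place of trained values, and the embedding is placed before the six TSs.

```python
import random

SNR_LEVELS_DB = list(range(-24, 1, 3))   # assumption: 9 levels, -24 ... 0 dB
EMBED_DIM = 6                            # embedding length used in the text


def one_hot_with_bias(snr_db):
    """One-hot vector of the SNR level plus a constant bias input (length 10)."""
    vec = [1.0 if level == snr_db else 0.0 for level in SNR_LEVELS_DB]
    return vec + [1.0]


rng = random.Random(0)
# 10 x 6 matrix that would be learnt during training; random values stand in here
embedding_matrix = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)]
                    for _ in range(len(SNR_LEVELS_DB) + 1)]


def embed(snr_db):
    """Map the length-10 vector to the length-6 embedding vector."""
    x = one_hot_with_bias(snr_db)
    return [sum(x[i] * embedding_matrix[i][j] for i in range(len(x)))
            for j in range(EMBED_DIM)]


def build_input(test_stats, snr_db):
    """Concatenate the SNR embedding with the six detector TSs (12 values)."""
    return embed(snr_db) + list(test_stats)


x = build_input([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], snr_db=-12)
print(len(x))
```

Since the one-hot vector has a single non-zero entry, the product simply selects one row of the matrix and adds the bias row, which is why embeddings are usually implemented as a table lookup.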

For the performance metrics, we select the binary negative log likelihood (NLL) loss function [22] because the type of our problem is binary classification.

A brief explanation of the NLL loss function follows. Take one row of features from the dataset. The ground truth label (or target) of this row is 0 or 1 (e.g. the first row in Fig. 4 has a ground truth label $=0$). After SNR embedding and concatenation with the other features as explained before, we get a vector x of dimension (12, 1) (the input layer in Fig. 5). Call the output layer $\hat{y} = [\hat{y_0}, \hat{y_1}]^{tr}$, where $\hat{y_i}, i=0,1$ is the predicted probability of $H_i$ and the superscript tr is the transpose operator. We encode the ground truth label using one-hot encoding [21]: label 0 is encoded as the vector $y=[1,0]^{tr}$ whereas label 1 is encoded as $y=[0,1]^{tr}$. That is, $y=[y_0, y_1]^{tr}$ where $y_0=1$ if label $=0$ and $y_0=0$ if label $=1$. Note that $y_1=1-y_0$. The binary NLL loss function for this row (e.g. row 1) is expressed as:

$$\begin{aligned} L_1=-y_0\log (\hat{y_0})-(1-y_0)\log (1-\hat{y_0}) \end{aligned}$$
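The per-row loss can be evaluated directly from this expression. The sketch below is plain Python; the function name and the small clamping constant, which only guards against taking the logarithm of zero, are additions for illustration.

```python
import math


def row_nll(y0, y0_hat, eps=1e-12):
    """Binary NLL of one row: y0 is the encoded label, y0_hat the predicted probability."""
    y0_hat = min(max(y0_hat, eps), 1.0 - eps)   # keep the logarithms finite
    return -y0 * math.log(y0_hat) - (1 - y0) * math.log(1.0 - y0_hat)


# A row with label 0 is encoded with y0 = 1
confident_and_right = row_nll(1, 0.9)
confident_and_wrong = row_nll(1, 0.1)
print(confident_and_right, confident_and_wrong)
```

The loss is small when the predicted probability agrees with the encoded label and grows quickly when the model is confident in the wrong hypothesis.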

For a batch of 64 rows, the loss function becomes:

$$\begin{aligned} L=\frac{1}{64}\sum _{n=1}^{64}\Big [-y_{n0}\log (\hat{y}_{n0})-(1-y_{n0})\log (1-\hat{y}_{n0})\Big ] \end{aligned}$$

(4)

where $y_{n0}$ (resp. $\hat{y}_{n0}$) is the encoded label value (resp. predicted probability) of row n of the batch. During the training phase (Algorithm 1) the model tries to minimize the loss function. During the validation phase (Algorithm 2), the loss is also calculated. In addition, we compute the confusion matrix and derive the model accuracy from it. Furthermore, we obtain two other important metrics, the detection probability and the false alarm rate, which are also derived from the confusion matrix. An example is given in Fig. 6, where the True Positive count is $TP=898{,}399$, the False Positive count is $FP=62{,}934$, the False Negative count is $FN=2246$ and the True Negative count is $TN=836{,}421$. Hence, we get the accuracy as $\frac{TP+TN}{P+N}=0.9637$ ($P+N$ is the total count of the validation set, which is 1,800,000). Consequently, the detection probability PD can be evaluated as $PD=\frac{TP}{TP+FP}=0.9345$ and the false alarm rate FAR as $FAR=\frac{FN}{FN+TN}=0.002678$.
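The three metrics can be reproduced from the four counts of the example. The sketch below uses the ratios exactly as they are written in the text; the function name is hypothetical, and other normalisations of the confusion matrix would give different values.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, PD and FAR computed with the ratios given in the text."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    p_d = tp / (tp + fp)      # PD as written in the text
    far = fn / (fn + tn)      # FAR as written in the text
    return accuracy, p_d, far


accuracy, p_d, far = confusion_metrics(tp=898399, fp=62934, fn=2246, tn=836421)
print(accuracy, p_d, far)
```

The four counts add up to the 1,800,000 rows of the validation set, and the computed values agree with the quoted ones to within rounding of the last digit.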

The details of the model are described in Fig. 7. First, an embedding layer is constructed as discussed before. Then, we apply a regularization technique called Dropout (see Note 1) [23]. Dropout consists of randomly dropping a percentage of a layer’s nodes during training. This percentage is determined by the value p in Fig. 7. For the embedding layer, we set $p=0$, which means that we do not drop any node, since the number of nodes in this layer is small (6 nodes). Normalization is also an important procedure in FCNNs, normally used to avoid cases where the NN parameters vanish or explode. Batch normalization [24] is very efficient and hence we applied it to all the layers except the output. Equation 5 is the core operation in batch normalization.

$$\begin{aligned} y = \frac{x-E(x)}{\sqrt{Var(x)+\epsilon }}*\gamma +\beta \end{aligned}$$

(5)

Here x represents a batch, E(x) and Var(x) are the mean and the variance of x respectively, $\epsilon$ is added to ensure numerical stability, and $\beta$ and $\gamma$ (affine$=True$) are two learnable parameters. By default, during training this layer also keeps running estimates of its computed mean and variance (track_running_stats$=True$), which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1 (see Note 2). After normalization, a linear layer is added (Eq. 6):
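Equation 5 and the running estimate can be illustrated for a single feature. This is a plain-Python sketch with hypothetical function names; the update rule for the running estimate follows the convention of common deep learning frameworks, which is an assumption about the implementation used here.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply Eq. (5) to the values of one feature across a batch."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)   # biased batch variance
    out = [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
    return out, mean, var


def update_running(running, batch_stat, momentum=0.1):
    """Exponential running estimate (assumed update convention)."""
    return (1.0 - momentum) * running + momentum * batch_stat


values = [2.0, 4.0, 6.0, 8.0]
normed, batch_mean, batch_var = batch_norm(values)
running_mean = update_running(0.0, batch_mean)
print(normed, running_mean)
```

With $\gamma=1$ and $\beta=0$ the normalized values have zero mean and a variance very close to one; during evaluation the running estimates would replace the batch statistics.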

$$\begin{aligned} y=W^{tr}.x + b \end{aligned}$$

(6)

where W is a learnable parameter matrix, x is the batch, . is the dot product and b is a learnable bias vector. For instance, the first linear layer connects the input layer (12 nodes) to the first hidden layer (1000 nodes), as shown in Fig. 5. With a batch size of 64, the dimensions of the matrix W are (12, 1000), those of x are (12, 64) and those of b, once broadcast over the batch, are (1000, 64).
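The dimensions quoted for the first linear layer can be checked with a small plain-Python sketch of Eq. 6. The helper names and the random initial values are illustrative assumptions. The bias is stored once per output node and added to every column of the batch, which is equivalent to a (1000, 64) array with identical columns.

```python
import random


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def linear(W, x, b):
    """Eq. (6): y = W^tr . x + b, with b added to every column of the batch."""
    wx = matmul(transpose(W), x)
    return [[wx[i][j] + b[i] for j in range(len(wx[0]))] for i in range(len(wx))]


rng = random.Random(0)
n_in, n_out, batch_size = 12, 1000, 64
W = [[rng.gauss(0, 0.1) for _ in range(n_out)] for _ in range(n_in)]    # (12, 1000)
x = [[rng.gauss(0, 1) for _ in range(batch_size)] for _ in range(n_in)]  # (12, 64)
b = [0.0] * n_out                                                        # one bias per node
y = linear(W, x, b)
print(len(y), len(y[0]))
```

The output has one row per hidden node and one column per row of the batch, i.e. shape (1000, 64).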

After adding the linear layer, we introduce a non-linearity by applying an activation function. In our case, it is the ReLU (Rectified Linear Units) function [25]. ReLU is simply max(0, y), to get rid of negative values.

As mentioned before, the model has two phases: training and validation (see Algorithms 1 and 2). Note that the backward pass, in which the parameters of the model are updated in order to minimize the loss function, is applied during the training phase only. The validation phase contains only a forward pass. Note also that Dropout is turned off during validation.
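The difference between the two phases for the activation and Dropout steps can be sketched as follows. The code is plain Python with hypothetical function names; the rescaling of kept nodes by 1/(1 − p) is the usual "inverted dropout" convention and is an assumption here, and p = 0.5 is chosen only to make the effect visible (the values used in this work are much smaller).

```python
import random


def relu(values):
    """ReLU activation: max(0, y) applied element-wise."""
    return [max(0.0, v) for v in values]


def dropout(values, p, training, rng):
    """Drop each node with probability p during training; identity otherwise."""
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    # Kept nodes are rescaled so the expected activation is unchanged (assumption)
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)
pre_activation = [-1.5, 0.2, 3.0, -0.1]
activated = relu(pre_activation)
train_out = dropout(activated, p=0.5, training=True, rng=rng)
valid_out = dropout(activated, p=0.5, training=False, rng=rng)
print(activated, train_out, valid_out)
```

In validation mode the output equals the activations, which corresponds to Dropout being turned off.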

4 Results

4.1 Model Tweaking

We tested several model architectures with various numbers of layers and different number of nodes per layer.

Table 1 Accuracy as function of different model architectures

The results reported in Table 1 were obtained after one epoch of training, since the accuracy was almost independent of the number of epochs. We conducted our experiments on an AWS (Amazon Web Services) cloud machine equipped with a K80 GPU (12 GB integrated RAM; 5.6 TFLOPS [27]). It is clear that increasing the number of layers and the number of nodes per layer leads to better accuracy. However, we did not notice an accuracy improvement with more than two layers. Also, we increased the number of nodes to the maximum value allowed by the machine RAM. In addition to the number of layers and the number of nodes, there are other hyperparameters to tweak. The most important one is the learning rate. We applied the methodology suggested in [28] in order to select a learning rate which minimizes the loss function. The result is illustrated in Fig. 8. We obtained this figure by applying Algorithm 1 to a small percentage of the training set (5% in our case).

According to [28], the learning rate should be selected from the decreasing zone in Fig. 8, that is, in the range $[10^{-5}, 10^{-1}]$. In our experiments we used the value $10^{-5}$.
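The idea of scanning learning rates and looking at the resulting loss can be illustrated on a toy problem. The sketch below is a simplified sweep on a one-dimensional quadratic loss, not the exact procedure of [28] and not the model of this paper; the loss function, starting point and number of steps are assumptions chosen for illustration.

```python
def loss(w):
    """Toy quadratic loss with its minimum at w = 0."""
    return w * w


def grad(w):
    """Gradient of the toy loss."""
    return 2.0 * w


def sweep(learning_rates, w0=1.0, steps=20):
    """Run a few gradient-descent steps per learning rate and record the final loss."""
    results = []
    for lr in learning_rates:
        w = w0
        for _ in range(steps):
            w -= lr * grad(w)
        results.append((lr, loss(w)))
    return results


learning_rates = [10.0 ** e for e in range(-5, 2)]   # 1e-5 ... 10
results = sweep(learning_rates)
for lr, final_loss in results:
    print(f"{lr:g} {final_loss:.3g}")
```

On this toy problem the final loss decreases as the learning rate grows up to a point and then blows up, which mirrors the decreasing zone followed by divergence that such a plot is used to identify.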

Other parameters are: batch size, momentum, epsilon, dropout probability and the length of the embedding vector.

For the batch size, we selected a value of 64 (a larger value can be used but this requires more RAM). For the embedding vector length, the best practice [21] is to reduce the dimension of the categorical input vector (the SNR vector in Fig. 5). Hence, any value less than 9 is acceptable. In our experiments, we fixed this value to 6. For the remaining parameters, we used momentum = 0.1 ([26]), epsilon = $10^{-5}$ (this should be a number close to 0 [24]) and dropout probability $p = 0.001$ for hidden layer 1 and $p=0.01$ for hidden layer 2 (p should be a small percentage of the layer’s nodes). With these parameters, we obtained a high accuracy value (0.96) for the model architecture with 2 layers and [1000, 500] nodes. Also, as illustrated in Fig. 9, the validation and training losses are very close, which means that our model does not over-fit, i.e. it generalizes well to unseen data.
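For reference, the hyperparameter values stated in this section and in Sect. 3.2 can be gathered in one place. The sketch below is only a convenience container; the class and field names are hypothetical, while the values are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HssConfig:
    """Hyperparameters quoted in the text (names are illustrative)."""
    hidden_nodes: tuple = (1000, 500)     # two hidden layers
    embedding_dim: int = 6                # length of the SNR embedding vector
    batch_size: int = 64
    learning_rate: float = 1e-5
    bn_momentum: float = 0.1              # batch normalization momentum
    bn_epsilon: float = 1e-5              # batch normalization epsilon
    dropout_p: tuple = (0.0, 0.001, 0.01)  # embedding layer, hidden 1, hidden 2


cfg = HssConfig()
print(cfg)
```

Keeping the values in a single frozen object makes it harder to change one of them by accident between the training and validation phases.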

4.2 Sensing Performance Evaluation

In this section, we present the results obtained from our model. We focus on two performance measures: the probability of detection (PD) and the false alarm rate (FAR). Our dataset contains six detectors: ED, ACD, EVM, EVMM, CPSD and GoF. We could present results for any combination of these detectors; however, this would be time consuming. Instead, we take the following set of combinations: $\{ED,\ ED-EVM,\ ED-EVM-GoF,\ ED-EVM-GoF-EVMM,\ ED-EVM-GoF-EVMM-CPSD,$ all detectors$\}$. ED is common to all the adopted combinations because it is the classical detector in SS and is widely considered as the reference one.

Figure 10 shows the evolution of PD and FAR of the ANN-based HSS detector in terms of SNR for all the adopted combinations. Note that adopting ED alone reflects the classical case where the ANN is trained and validated with only one detector, so it can be considered as the non-HSS reference. For the combination $ED-EVM$, PD increases from 0.6 at SNR = − 24 dB to a value greater than 0.95 at an SNR of − 12 dB. This evolution of PD is accompanied by a decrease of FAR from 0.06 at SNR = − 24 dB to a value less than 0.1 at − 12 dB. On the other hand, for the ANN-based ED (no HSS is adopted), PD increases from 0.65 to 0.85 over the SNR range [− 24 ; − 12] dB, while FAR presents very high values compared to $ED-EVM$ over the same SNR range.

Furthermore, Fig. 10 shows that PD increases with the number of detectors used, whereas FAR decreases. When three detectors are used, i.e. $ED-EVM-GoF$, PD reaches 0.92 at − 12 dB and FAR becomes less than 0.06 for the same SNR. These two performance indicators, PD and FAR, become respectively higher than 0.95 and less than 0.001 when six detectors are used. This reflects the efficiency of hybrid sensing in terms of both protecting the PU from interference (when PD is high) and exploiting the available spectrum resources (when FAR is low).

Moreover, even at a very low SNR of − 24 dB, PD is above 0.825 with a FAR of less than 0.001, which reveals the robustness of such a hybrid detector in achieving good performance where the other techniques fail.

In Fig. 11, we present the average values of PD and FAR over all SNRs. The average can be interpreted as the robustness of the proposed technique with respect to the SNR. The data corresponding to $H_0$ is related to the noise only and is not affected by the SNR, so the scatter of its detector values remains stable in the space independently of the SNR. On the other hand, the data under $H_1$ depends on the PU signal, and is therefore related to the SNR of the received PU signal. Hence, the performance analysis based on the average PD and FAR gives an in-depth view of the efficiency of the proposed technique in distinguishing between $H_0$ and $H_1$ over a wide range of SNR ([− 24 ; 0] dB). For the case where no HSS is used, i.e. only ED is used, the average PD is around 0.84 for an average FAR of 0.25, as shown in Fig. 11. In contrast, for HSS the average PD increases with the number of detectors used, whereas the average FAR decreases. An average PD higher than 0.93 is observed when more than 3 detectors are used, while an almost zero FAR is obtained.

5 Conclusion

In this paper, we presented a hybrid spectrum sensing (HSS) technique using an artificial neural network (ANN). Instead of using one detection method as in classical spectrum sensing, the test statistics (TSs) of several detectors are combined using an ANN. The ANN system is trained with the TSs of the detectors used, both for the noise-only case and for the case where the PU is active. The numerical results corroborate the efficiency of the proposed HSS compared to the non-hybrid detection technique, where the ANN is trained with the TS of only one detector. In addition, the results showed that the detection outcome becomes more reliable as the number of detectors increases.

Notes

1. Regularization is used in order to give the model the ability to generalize on unseen datasets.
2. Momentum is a hyperparameter, i.e. it can be tweaked. However, the value of 0.1 is generally adopted in the literature [24].

References

Mitolal, J. (1999). Cognitive radio: Making software radios more personal. IEEE Personal Communication, 6(4), 13–18.
Article Google Scholar
Yucek, T., & Arslan, H. (2009). A survey of spectrum sensing algorithms for cognitive radio applications. IEEE Communication Surveys & Tutorials, 11(1), 116–130. First Quarter.
Article Google Scholar
Clancy, C., Hecker, J., Stuntebeck, E., & O’Shea, T. (2007). Applications of machine learning to cognitive radio networks. IEEE Wireless Communications, 14(4), 47–52.
Article Google Scholar
Thilina, K. M., Choi, K. W., Saquib, N., & Hossain, E. (2013). Machine learning techniques for cooperative spectrum sensing in cognitive radio networks. IEEE Journal on Selected Areas in Communications, 31(11), 2209–2221.
Article Google Scholar
Lu, Y., Zhu, P., Wang, D., & Fattouche M. (2016). Machine learning techniques with probability vector for cooperative spectrum sensing in cognitive radio networks. In 2016 IEEE wireless communications and networking conference (pp. 1–6).
Vyas, M. R., Patel, D. K., & Lopez-Benitez, M. (2017). Artificial neural network based hybrid spectrum sensing scheme for cognitive radio. In 2017 IEEE 28th annual international symposium on personal, indoor, and mobile radio communications (PIMRC) (pp. 1–7).
Tang, Y., Zhang, Q., & Lin, W. (2010). Artificial neural network based spectrum sensing method for cognitive radio. In 2010 6th international conference on wireless communications networking and mobile computing (WiCOM) (pp. 1–4).
Li, Z., Wu, W., Liu, X., & Qi, P. (2018). Improved cooperative spectrum sensing model based on machine learning for cognitive radio networks. IET Communications, 12(19), 2485–2492.
Article Google Scholar
Guo, C., Jin, M., Guo, Q., & Li, Y. (2018). Spectrum sensing based on combined eigenvalue and eigenvector through blind learning. IEEE Communications Letters, 22(8), 1636–1639.
Article Google Scholar
Shah, I., & Koo, H. A. (2018). Reliable machine learning based spectrum sensing in cognitive radio networks. Wireless Communications and Mobile Computing, 2018
Ahmad, H. B. (2019). Ensemble classifier based spectrum sensing in cognitive radio networks. Wireless Communications and Mobile Computing, 2018
Molina-Tenorio, Y., Prieto-Guerrero, A., Aguilar-Gonzalez, R., & Ruiz-Boqué, S. (2019). Machine learning techniques applied to multiband spectrum sensing in cognitive radios. Sensors, 19(21).
Shirazi, S. F., Shirazi, S. H., Shah, S. M., & Shahid, M. K. (2012). Article: Hybrid spectrum sensing algorithm for cognitive radio network. International Journal of Computer Applications, 45(17), 25–30.
Google Scholar
Moghimi, F., Schober, R., & Mallik, R. K. (2010). Hybrid coherent/energy detection for cognitive radio networks. In 2010 IEEE global telecommunications conference GLOBECOM 2010 (pp. 1–6).
Cardenas-Juarez, M., Ghogho, M., Pineda-Rico, U., & Stevens-Navarro, E. (2016). Improved semi-blind spectrum sensing for cognitive radio with locally optimum detection. IET Signal Processing, 10(7), 524–531.
Article Google Scholar
Digham, F., Alouini, M.-S., & Simon, K. (2007). On the energy detection of unknown signals over fading channels. IEEE Transactions on Communications, 55(1), 21–24.
Article Google Scholar
Naraghi-Poor, M., & Ikuma, T. (2010). Autocorrelation-based spectrum sensing for cognitive radio. IEEE Transactions on Vehicular Technology, 59(2), 718–733.
Article Google Scholar
Zeng, Y., & Liang, Y.-C. (2009). Eigenvalue-based spectrum sensing algorithms for cognitive radio. IEEE Transactions on Communications, 57(6), 1784–1793.
Article Google Scholar
Nasser, A., Mansour, A., Yao, K. C., Abdallah, H., & Charara, H. (2017). Spectrum sensing based on cumulative power spectral density. EURASIP Journal on Advances in Signal Processing, 2017(1), 38.
Article Google Scholar
Teguig, D., Le Nir, V., & Scheers, B. (2015). Spectrum sensing method based on the likelihood ratio goodness of fit test. IEEE Electronic Letters, 51(3), 253–255.
Article Google Scholar
Guo, C., & Berkhahn, F. (2016). Entity embeddings of categorical variables. arXiv, http://arxiv.org/abs/1604.06737.
Zhu, D., Yao, H., Jiang, B., & Yu, P. (2018). Negative log likelihood ratio loss for deep neural network classification. arXiv, http://arxiv.org/abs/1804.10690.
Labach, A., Salehinejad, H. & Valaee, S. (2019). Survey of dropout methods for deep neural networks. arXiv, http://arxiv.org/abs/1904.13310.
Ioffe, S. & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv, http://arxiv.org/abs/1502.03167.
Arora, R., Basu, A., Mianjy, P. & Mukherjee, A. (2016). Understanding deep neural networks with rectified linear units. arXiv, http://arxiv.org/abs/1611.01491.
Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv, http://arxiv.org/abs/1609.04747.
Markidis, S., Der Chien, S. W., Laure, E., Peng, I. B., & Vetter, J. S. (2018). Nvidia tensor core programmability, performance & precision. In 2018 IEEE international parallel and distributed processing symposium workshops (IPDPSW) (pp. 522—531).
Smith, L. N. (2015). Cyclical learning rates for training neural networks. In 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 464–472).
Zhang, G., Wang, X., Liang, Y.-C., & Liu, J. (2010). Fast and robust spectrum sensing via Kolmogorov–Smirnov test. IEEE Transactions on Communications, 58(12), 3410–3416.

Author information

Authors and Affiliations

LABSTICC, CNRS, UMR 6285, ENSTA Bretagne, 2 Rue François Verny, 29806, Brest, France
A. Nasser & A. Mansour
ICCS-Lab, Computer Science Department, American University of Culture and Education, Beirut, Lebanon
A. Nasser
LABSTICC, CNRS, UMR 6285, UBO, 6 Avenue le Gorgeu, 29238, Brest, France
K. C. Yao
Computer Science department, Faculty of Science, Lebanese University, Nabatieh, Lebanon
M. Chaitou & H. Charara

Authors

A. Nasser
M. Chaitou
A. Mansour
K. C. Yao
H. Charara

Corresponding author

Correspondence to A. Nasser.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Mathematical Formulae of Adopted Detectors

The energy detector (ED) is defined as the average of the squared modulus of the received signal r(n):

$$\begin{aligned} T_{ED}=\frac{1}{N}\sum _{n=1}^N |r(n)|^2 \end{aligned}$$

(7)

where N is the number of received samples.
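
As a minimal illustration of (7), the following Python sketch computes $T_{ED}$ for a short list of samples. The function name `energy_statistic` and the toy input are assumptions made for illustration and are not taken from the original work.

```python
def energy_statistic(r):
    """Return T_ED = (1/N) * sum_n |r(n)|^2 for a sequence of samples."""
    n_samples = len(r)
    if n_samples == 0:
        raise ValueError("need at least one sample")
    # abs() gives the modulus for both real and complex samples.
    return sum(abs(x) ** 2 for x in r) / n_samples


# Toy input (illustrative only): four complex samples.
samples = [1 + 1j, -1 + 1j, 2 + 0j, 0 - 2j]
t_ed = energy_statistic(samples)
```

For this toy input the squared moduli are 2, 2, 4 and 4, so the statistic is their mean, 3.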

The autocorrelation detector (ACD) evaluates the inter-sample correlation of the received signal and is defined as follows:

$$\begin{aligned} T_{ACD}=\frac{1}{N_sNT_{ED}} \sum _{l=1}^{N_s-1}Re\bigg \{\sum _{n=1}^Nr(n)r^*(n-l)\bigg \} \end{aligned}$$

(8)

where $*$ denotes complex conjugation and $N_s$ is the number of samples per symbol. The symbol $\sigma _w^2$, used in (9), denotes the AWGN variance.
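
Equation (8) can be sketched in Python as follows. The formula does not state how $r(n-l)$ is handled when $n \le l$; this sketch assumes that samples before the observation window are unavailable and starts the inner sum at $n=l+1$. The name `acd_statistic` and the two test signals are likewise hypothetical.

```python
def acd_statistic(r, n_s):
    """Sketch of T_ACD in (8) for a list of complex samples r."""
    n = len(r)
    t_ed = sum(abs(x) ** 2 for x in r) / n  # energy statistic of (7)
    if t_ed == 0:
        raise ValueError("signal has zero energy")
    total = 0.0
    for lag in range(1, n_s):
        # Assumption: r(n - lag) is only available inside the window,
        # so the inner sum runs over indices lag .. n-1 (zero-based).
        corr = sum(r[i] * r[i - lag].conjugate() for i in range(lag, n))
        total += corr.real
    return total / (n_s * n * t_ed)


# Illustrative signals: a constant sequence and a sign-alternating one.
constant = [1 + 0j] * 8
alternating = [(-1) ** i + 0j for i in range(8)]
acd_constant = acd_statistic(constant, n_s=4)
acd_alternating = acd_statistic(alternating, n_s=4)
```

With these inputs the constant sequence gives 0.5625 and the alternating sequence gives -0.1875, so the statistic separates positively correlated samples from samples whose lagged products cancel.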

The cumulative power spectral density (CPSD) detector evaluates the non-flatness of the noise in the frequency domain and is given by [19, eq. 26]:

$$\begin{aligned} T_{CPSD}=\dfrac{2}{N^2\sigma _w^2}\sum _{k=1}^{N/2}\bigg (\frac{N}{2}-k+1\bigg )\dfrac{|R(k)|^2+|R(-k+1)|^2}{2} \end{aligned}$$

(9)

where R(k) is the discrete Fourier transform of r(n).
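
A possible Python reading of (9) is sketched below. Two points are assumptions rather than statements from the source: the frequency index is taken as 1-based and periodic with period $N$, so that the two spectral terms for a given summation index map to zero-based bins `k - 1` and `N - k`; and the DFT is computed with a naive $O(N^2)$ loop to keep the example dependency-free. The names `dft` and `cpsd_statistic` are hypothetical.

```python
import cmath


def dft(r):
    """Naive discrete Fourier transform of a list of samples."""
    n = len(r)
    return [
        sum(r[t] * cmath.exp(-2j * cmath.pi * m * t / n) for t in range(n))
        for m in range(n)
    ]


def cpsd_statistic(r, noise_var):
    """Sketch of T_CPSD in (9) under the indexing assumption stated above."""
    n = len(r)
    if n % 2:
        raise ValueError("this sketch expects an even number of samples")
    spectrum = dft(r)
    half = n // 2
    total = 0.0
    for k in range(1, half + 1):
        weight = half - k + 1                # (N/2 - k + 1)
        pos = abs(spectrum[k - 1]) ** 2      # assumed positive-side bin
        neg = abs(spectrum[n - k]) ** 2      # assumed mirrored bin
        total += weight * (pos + neg) / 2
    return 2 * total / (n ** 2 * noise_var)


# Two unit-energy toy signals: an impulse (flat spectrum) and a constant
# sequence (all energy in one bin).
impulse = [1.0] + [0.0] * 7
constant = [8 ** -0.5] * 8
t_flat = cpsd_statistic(impulse, noise_var=1.0)
t_peaked = cpsd_statistic(constant, noise_var=1.0)
```

Under these assumptions the flat-spectrum impulse yields 0.3125 and the constant sequence yields 0.5, which is consistent with the statistic growing as the spectrum departs from flatness.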

The EVM detector takes the maximum eigenvalue of the covariance matrix $R_{\mathcal {r}}$ of ${\mathcal {r}}(n)$, which is a set of shifted versions of r(n):

$$\begin{aligned} T_{EVM}=\lambda _{max}=||\lambda _1, \lambda _2, \ldots , \lambda _L||_{\infty } \end{aligned}$$

(10)

where $\lambda _i,\ i\in [1,L]$ is the ith eigenvalue of $R_{\mathcal {r}}$, L is related to the number of shifted versions of r(n), and $|| \cdot ||_{\infty }$ is the $\infty$ norm.

Similarly to EVM, EVMM is based on the ratio of the maximum eigenvalue to the minimum eigenvalue of $R_{\mathcal {r}}$:

$$\begin{aligned} T_{EVMM}=\dfrac{\lambda _{max}}{\lambda _{min}} \end{aligned}$$

(11)

where $\lambda _{min}=\min \{\lambda _1, \lambda _2, \ldots , \lambda _L\}$.
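
The two eigenvalue-based statistics in (10) and (11) can be sketched together with NumPy. The construction of the covariance matrix is an assumption: the appendix only says that ${\mathcal {r}}(n)$ collects shifted versions of r(n), so this sketch stacks $L$ delayed copies and uses a time-averaged sample covariance. The function name `eigen_statistics`, the parameter name `n_shifts` (standing for $L$), and the test signals are hypothetical.

```python
import numpy as np


def eigen_statistics(r, n_shifts):
    """Return (T_EVM, T_EVMM) from a sample covariance of shifted copies."""
    r = np.asarray(r, dtype=complex)
    n = r.size
    if not 1 <= n_shifts <= n:
        raise ValueError("n_shifts must be between 1 and len(r)")
    # Row l holds r delayed by l samples, restricted to the overlap
    # where all n_shifts delayed copies are available.
    shifted = np.stack(
        [r[n_shifts - 1 - l : n - l] for l in range(n_shifts)]
    )
    cov = shifted @ shifted.conj().T / shifted.shape[1]
    # eigvalsh returns the real eigenvalues of a Hermitian matrix,
    # sorted in ascending order.
    eigvals = np.linalg.eigvalsh(cov)
    lam_min, lam_max = eigvals[0], eigvals[-1]
    if lam_min <= 0:
        raise ValueError("covariance estimate is singular")
    return float(lam_max), float(lam_max / lam_min)


# Illustrative data: unit-variance complex noise, with and without a tone.
rng = np.random.default_rng(0)
n = 4000
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
tone = np.exp(2j * np.pi * 0.05 * np.arange(n))
evm_h0, evmm_h0 = eigen_statistics(noise, n_shifts=4)
evm_h1, evmm_h1 = eigen_statistics(tone + noise, n_shifts=4)
```

For white noise the eigenvalues of the estimated covariance are close to one another, so the ratio stays near 1; adding a correlated component raises the largest eigenvalue and therefore both statistics.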

Finally, the goodness-of-fit (GoF) detector decides on the presence of the PU signal by testing whether the received samples are drawn from the noise distribution with cumulative distribution function F [29]:

$$\begin{aligned} T_{GoF}=-\sum _{n=1}^{N}\Bigg [\frac{\log (F\{ r(n)\})}{N-n+1/2}+\frac{\log (1-F\{r(n)\})}{n-1/2} \Bigg ] \end{aligned}$$

(12)
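
A minimal Python sketch of (12) is given below under several assumptions that the appendix does not state: the samples are real-valued, they are sorted in ascending order before the sum is evaluated (statistics of this form are usually computed on order statistics), and F is the CDF of zero-mean Gaussian noise with known standard deviation `sigma`. The names `gaussian_cdf` and `gof_statistic` and the small clipping constant are illustrative choices.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def gof_statistic(samples, sigma=1.0):
    """Sketch of T_GoF in (12) for real samples (assumed sorted first)."""
    ordered = sorted(samples)  # assumption: use order statistics
    n_total = len(ordered)
    total = 0.0
    for n, x in enumerate(ordered, start=1):
        f = gaussian_cdf(x, sigma)
        # Guard against log(0) for samples far in the tails.
        f = min(max(f, 1e-12), 1.0 - 1e-12)
        total += math.log(f) / (n_total - n + 0.5)
        total += math.log(1.0 - f) / (n - 0.5)
    return -total


# Illustrative inputs: samples centred on zero and the same samples shifted.
centred = [-1.5, -0.5, 0.5, 1.5]
shifted = [x + 2.0 for x in centred]
t_centred = gof_statistic(centred)
t_shifted = gof_statistic(shifted)
```

In this sketch the shifted samples, which fit the assumed noise distribution poorly, produce a larger statistic than the centred ones.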

About this article

Cite this article

Nasser, A., Chaitou, M., Mansour, A. et al. A Deep Neural Network Model for Hybrid Spectrum Sensing in Cognitive Radio. Wireless Pers Commun 118, 281–299 (2021). https://doi.org/10.1007/s11277-020-08013-7

Accepted: 27 November 2020
Published: 03 January 2021
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11277-020-08013-7
