1 Introduction

In the early years of radio spectrum use, non-communication RF energy applications (ovens and heating appliances, to name a few) simply occupied the band. More recently, the growing use of communication-based applications made it necessary to create more space. The International Telecommunication Union (ITU) therefore modified the band regulations to cover both energy-based and communication-based applications. This worldwide regulation dedicated a spectrum slot mainly to applications in the industrial, scientific and medical fields. The Industrial, Scientific and Medical (ISM) band provides free radio spectrum for operating a cordless telephone, a Bluetooth device, Wi-Fi, a garage door opener, an animal tracker or even a baby monitor. Owing to rapid advances in wireless technology, artificial intelligence and smartphones, and the surge of social networking platforms (Instagram, Facebook, and Twitter), demand for wireless networks is increasing exponentially (Fig. 1).

Fig. 1 Cognitive radio environment

Due to this exponential demand, the ISM band is heading towards saturation. Legacy spectrum allocation and its governing policies have led to underutilized spectrum resources. Licensed users such as television, cellular, and radio services are assigned large portions of spectrum whose usage varies (or is even switched off) with time and geographical location, resulting in underutilization or complete wastage of a limited resource [1,2,3].

To summarize, we work in an environment that is congested at times and carries no traffic at others. A 2005 study at the Berkeley Wireless Research Center (BWRC) indicates that spectrum utilization is high at lower frequencies and drops rapidly at higher frequencies [2]. The study indirectly supports the Federal Communications Commission (FCC) statement “the average utilization of numerous spectrum bands varies between 15 and 85% as most of them have been assigned through a fixed spectrum allocation policy, confined to specific geographical areas” [4]. Because spectrum is scarce and underutilized at the same time, the FCC released analog TV bands, known as white space (spectrum holes), to unlicensed users to relieve the congestion without disrupting the routine flow of licensed traffic.

A mechanism is needed that can supervise the overall traffic, admit unlicensed traffic, and ensure the continuous, smooth movement of licensed traffic (even in peak hours) without compromising security. The cognitive radio network (CRN), an intelligent radio technology that works on a dynamic spectrum access scheme, addresses the emerging demands of users and efficient spectrum allocation [5]. There are two types of users in a CRN: the primary user (PU), i.e. licensed traffic, and the secondary user (SU), i.e. unlicensed traffic. A CRN uses artificial intelligence to assess all working parameters, observe, learn, and dynamically accommodate new parameters without reducing overall functionality or efficiency.

Dynamic spectrum access is more vulnerable to security threats than a fixed spectrum allocation scheme. Cognitive radio (CR) faces the security threats common to wireless networks, which affect all layers of the protocol stack, and its dynamic nature exposes it to additional ones. Security is a serious concern because attacks on cognitive radio occur at different layers of the communication system. A cognitive radio monitors the spectrum using its sensing ability and identifies spectrum holes. If the sensing results are not accurate, secondary users may interrupt primary user transmission. Interfering with primary users violates the FCC mandate: “There should be no modification or interference to the primary user transmission.” The primary user emulation attack (PUEA) is therefore a significant threat to all the functions of CR [6].

In this attack, a malicious secondary user behaves like the primary user by imitating its signal characteristics and forces the other secondary users to vacate the channel. The impact of this attack is highest on the spectrum sensing operation. Therefore, the security mechanism at the spectrum sensing stage should be strong.

In recent years, there has been significant research on security threats and countermeasures in cognitive radio networks. Overviews of CR security threats and defense algorithms can be found in [7,8,9,10,11]. Although many survey papers enhance our knowledge of cognitive radio security, this paper attempts to consolidate recent findings with older ones to keep pace with the research community. This survey aims to present recent developments in security, with a focus on PUEA countermeasures.

The paper is arranged in five sections. After this brief introduction, Sect. 2 describes the basic operation of a PUE attacker with the help of a diagram. Sect. 3 explains the principles and limitations of the PUEA countermeasures. Sect. 4 describes unexplored areas related to the future development of cognitive radio security. Finally, Sect. 5 concludes the paper.

2 Primary User Emulation Attack

The primary user emulation attack is a significant obstacle to the spectrum sensing operation of a CRN and occurs at the physical layer of the protocol stack. A key challenge in spectrum sensing is to differentiate the primary transmitter signal from secondary user signals accurately. When the primary user reclaims the spectrum, the cognitive user leaves the frequency band and switches to another band to continue its transmission [11]. A PUEA exploits this behavior and affects spectrum sensing as well as other functions of the cognitive network. In this attack, a malicious secondary user imitates the characteristics of the primary user and behaves as a primary user to access the available frequency band when the primary user is inactive. The malicious user (MU) can thus force a secondary user (SU) to vacate the spectrum by behaving as a primary user (PU). In this way, attackers can occupy the entire licensed spectrum themselves or waste valuable radio spectrum [12] (Fig. 2).

Fig. 2 Primary user emulation attack scenario

3 Countermeasures

Various defense mechanisms against the PUE attack have been introduced over the last two decades. All of them aim at a common goal: enhanced spectrum management. The Federal Communications Commission (FCC), an independent body that supervises communication regulations, has mandated “No interference or modification in Primary User transmission.” Since a successful transmission in a CR network cannot supersede the FCC mandate, all countermeasures for PUEA need to work within the constraints specified by the FCC, which increases the complexity of CR implementation.

All these strategies require a precise categorization of security methods to arrive at a holistic approach that is robust enough to tackle various attacks. Existing countermeasures for PUEA are broadly classified under nine categories: (1) location and distance, (2) analytical model, (3) cryptography, (4) belief propagation, (5) wireless microphone, (6) game theory, (7) machine learning, (8) proactive MAC design and (9) blockchain for security in CRN against PUEA.

3.1 Distance and Location Based

Chen and Park presented the first method [12] for identifying a PUEA based on location and distance. This method uses a transmitter verification scheme to differentiate between signals from a licensed user (PU) and a secondary user (SU) imitating a licensed user. A master location verifier (LV) and a slave LV are the two essential nodes in this process. The master LV is equipped with secure GPS and holds a database of the coordinates of all TV towers present outside the cognitive radio network. The authors make a few assumptions for the detection process: (1) all LVs need close synchronization and communicate with each other through a common control channel, and master and slave location verifiers use identical radio propagation models; (2) the TV broadcast tower is assumed to be the primary network, with a transmission range of tens of miles and an output power of thousands of watts. Two tests are conducted to verify the location of the PU and the MU.

  1. Distance ratio test (DRT): This method is based on the received signal strength (RSS) measured by the LVs, combined with a cooperative distance ratio verification scheme. In each DRT iteration, the RSS values measured by a pair of LVs are sent to the master verifier, which compares the resulting value with each TV tower's coordinates from the database. If the received signal does not match any of the known TV tower coordinates, location verification fails and the signal is classified as coming from an attacker.

  2. Distance difference test (DDT): This method uses the phase of a licensed signal measured by the master–slave LV pair to verify the transmitter location. A synchronized pair of LVs sends the time difference of the TV signal pulse and their coordinates to the master verifier. Once these parameters are received, the master LV calculates the difference in distance from the time difference of the signal and compares it with the entire TV tower database. If the received signal does not match any of the known TV tower coordinates, location verification fails and the signal is classified as coming from an attacker.
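As a rough illustration of the two tests above, the sketch below checks a pair of LV measurements against a tower database. It assumes a log-distance path loss model with a hypothetical exponent `n`, 2-D coordinates in metres, and tolerance values chosen only for this example; the function names and parameters are illustrative and are not taken from [12].

```python
import math

C = 299_792_458.0  # speed of light in m/s


def dist(a, b):
    """Euclidean distance between two 2-D points (metres)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def drt_matches(rss1_dbm, rss2_dbm, lv1, lv2, towers, n=2.0, tol=0.1):
    """Distance ratio test (sketch).

    Under the assumed log-distance model, rss1 - rss2 = 10*n*log10(d2/d1),
    so the measured ratio d1/d2 is 10 ** ((rss2 - rss1) / (10*n)).
    Returns True if the ratio fits at least one known tower.
    """
    measured = 10 ** ((rss2_dbm - rss1_dbm) / (10.0 * n))
    for tower in towers:
        expected = dist(lv1, tower) / dist(lv2, tower)
        if abs(measured - expected) <= tol * expected:
            return True
    return False


def ddt_matches(delta_t_s, lv1, lv2, towers, tol_m=50.0):
    """Distance difference test (sketch).

    delta_t_s is the arrival time at lv1 minus the arrival time at lv2,
    so C * delta_t_s estimates d1 - d2.
    """
    measured = C * delta_t_s
    for tower in towers:
        expected = dist(lv1, tower) - dist(lv2, tower)
        if abs(measured - expected) <= tol_m:
            return True
    return False
```

A signal for which both functions return False for every tower in the database would be flagged as a suspected attacker; the tolerances would in practice have to reflect measurement noise.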

Techniques mentioned above have certain limitations:

  1. Tight synchronization is needed between master and slave location verifiers in DRT and DDT, which makes the process expensive.

  2. The large number of iterations required for DRT can increase system overhead.

  3. The false-negative ratio increases when there are fewer LVs.

The limitations of DRT and DDT [12] are addressed to some extent by Chen et al. in [13] using an RSS-based localization method. The primary signal transmitter is verified by estimating its location and observing its signal characteristics (carrier frequency, modulation frequency, power, etc.). The localization technique collects synchronized RSS measurements with the help of a wireless sensor network (WSN) attached to each secondary user. These RSS measurements help determine the TV transmitter location. To distinguish between the primary transmitter and an attacker, the measured RSS peak is then compared with the TV transmitter location. However, the mechanisms described in this paper do not consider two major aspects of the wireless channel: fading and shadowing. Also, the paper is limited to detection (and not mitigation) of the primary user emulation attack. RSS measurements also have a few drawbacks, as given below:

  1. The use of a WSN makes the system more expensive and complicated.

  2. The attacker’s transmission power is assumed to be constant; in practice, it may vary.

  3. Computation time is the sum of the running time of the localization algorithm and the data collection time of the WSN, so the wireless sensor network can introduce delay into the network.

In [14], Rehman and Saeed proposed mitigating PUEA in CRN by using radio-frequency (RF) fingerprinting. The work explores the viability of applying RF fingerprinting through software-defined cognitive radios. The results show that, with a high signal-to-noise ratio (SNR) at the receiver end (in an ad hoc CRN), PUEA can be mitigated successfully. However, the same technique may not yield similar results for a centralized network. Fingerprinting-based solutions need large data samples, extra storage, and substantial computation with signal processing overhead. Also, impairments in low-end receivers affect the transmitter classification accuracy, and this accuracy differs across receivers. A slight improvement in false probability is seen when RSS is combined with maximum likelihood estimation [15].

In [16], Zhao and Caidan use a wavelet transform (WT) method to distinguish between the primary transmitter signal and a PUEA signal. To counter this threat, transmitter location fingerprints are extracted and analyzed in a multipath propagation environment. The assumptions made to implement this approach are as follows: (1) the primary user transmitter location is fixed; (2) primary and secondary users are low-power handheld devices; (3) a verifier (secondary node) is used to distinguish between the primary transmitter and the PUE attacker. In this scheme, the verifier extracts the signal from the frequency band of interest using a bandpass filter. The time-domain samples (transmitter location fingerprints) are then converted into the frequency domain by the power spectral density function. After that, the characteristics of the transmitter fingerprints are extracted by the wavelet transform technique, with feature extraction done by obtaining statistical parameters of the wavelet coefficients. The authors consider a multipath fading indoor office scenario with a time-invariant propagation channel to extract the transmitter fingerprints. A real-time experiment is conducted at four locations using a spectrum analyzer and a signal generator. This approach may identify the PUE attacker; however, mitigation of the attack is again not addressed. Since the primary transmitter location is assumed fixed, the method is not efficient for ad hoc CRNs or mobile PUs.

The location-based approach proposed by León et al. in [17] is a cooperative localization method designed specifically for a centralized IEEE 802.22 network [39]. The primary objective of IEEE 802.22 is to enable access to vacant portions (white space) of the digital television (DTV) channels. The CRN comprises a base station (BS) and a group of SUs with static positions (distributed arbitrarily in the network). This approach employs the time difference of arrival (TDOA) observed by each node pair to distinguish between a PU and a malicious secondary user (an attacker), which requires tight synchronization between the nodes of a pair. The base station records the positions of all fixed PUs and SUs in a database. SUs are used as anchor nodes to determine the location of the signal emitter, and secure synchronization is required between these anchor nodes, whose positions are stationary and known. All anchor nodes carry out spectrum sensing and send their sensing results to the base station. If a primary transmission is sensed in the submitted reports, the CRN starts the localization procedure to determine whether the signal is genuine or fake. IEEE 802.22 bounds the time to detect a PU transmitter to 2 s. The location detection time in this method depends on the following factors: (1) the time required by the anchor nodes to measure and record the primary user signal and send the recorded signal to the base station, and (2) the calculation time taken at the base station using the weighted least squares (WLS) and Taylor series (TS) estimation techniques. Convergence of the TS estimation depends on a reasonable initial guess, so accuracy suffers when the initial value is poor. This technique provides accurate results only if the secondary nodes cooperate. If secondary nodes are compromised, they may provide incorrect results to the base station, resulting in false signal detection. This method does not consider multiple attack scenarios.

In [18], the authors focus on two CRN attacks, spectrum sensing data falsification (SSDF) and PUEA, that affect centralized CRNs. A transmitter verification scheme (localization) is applied for PUEA detection. For SSDF attacks, an optimum nonlinear cooperation framework is proposed to reduce the interference they induce. A secure distributed CR system is implemented in which the fusion center (FC) makes the spectrum sensing decision after combining the results of individual SUs. A two-tier cognitive network is proposed: the first tier consists of a cluster of SUs, and the second tier consists of relay nodes that process local spectrum sensing information from the SUs and forward it to the FC in compressed form. The probability of false detection is shown through simulations for three cases: no attacker, two SSDF attackers, and two PUEA attackers. This detection method needs extra relay nodes, which adds infrastructure overhead.

A recent work [19] by Adebo et al. combines the RSS and angle of arrival (AoA) localization techniques. This method measures (1) the distance between the SUs and the transmitter and (2) the angle at which a secondary user receives the PU's signal; (3) the TV tower location is assumed known. The transmitter position determined by this hybrid technique is then matched with the known location of the PU to detect PUEA. The scheme is simulated with only two secondary users. A comparison is shown between the RSS, AoA, and hybrid localization schemes. The hybrid method requires only 20 iterations to converge, which is 30 fewer than AoA and RSS. Whether the hybrid scheme is suitable for long communication ranges is not covered, so the limitations related to RSS measurement may apply to this scheme as well.
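The idea of combining a range estimate from RSS with a bearing from AoA can be sketched as follows. This is only a geometric illustration under a log-distance path loss model; the reference power `p0_dbm`, the exponent `n`, the averaging of per-SU estimates, and all function names are assumptions made for this example and do not reproduce the iterative estimator of [19].

```python
import math


def rss_to_distance(rss_dbm, p0_dbm, n=2.0, d0=1.0):
    """Invert the assumed model rss = p0 - 10*n*log10(d/d0); returns d in metres."""
    return d0 * 10 ** ((p0_dbm - rss_dbm) / (10.0 * n))


def estimate_emitter(su_pos, rss_dbm, aoa_rad, p0_dbm, n=2.0):
    """Place the emitter at range d (from RSS) along bearing aoa_rad from the SU."""
    d = rss_to_distance(rss_dbm, p0_dbm, n)
    return (su_pos[0] + d * math.cos(aoa_rad),
            su_pos[1] + d * math.sin(aoa_rad))


def is_puea(estimates, tower, max_error_m):
    """Average the per-SU estimates and compare with the known tower position."""
    x = sum(e[0] for e in estimates) / len(estimates)
    y = sum(e[1] for e in estimates) / len(estimates)
    return math.hypot(x - tower[0], y - tower[1]) > max_error_m
```

With two SUs, each contributes one position estimate; if the averaged estimate lies farther than `max_error_m` from the known TV tower, the emitter is treated as a suspected PUE attacker.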

Another modified localization solution in [20] combines trilateration and the RSSI technique with a Bayesian decision model whose cost matrix involves conditional risk. Based on the position of the PUE estimated by RSSI at the secondary node, the Bayesian model decides the legitimacy of the PU signal using the cost matrix. The conditional risk of each decision is calculated to reduce the false alarm probability and increase the detection rate. The problem of localizing the PUE attack is reformulated as a multi-objective optimization problem, and game theory is used to solve it. This approach can be explored for other wireless communication applications as well.

In [21], another method for PUEA detection and mitigation is proposed that examines received signal power using adaptive learning. The learning procedure implements cyclostationary feature analysis and distance variance estimation to differentiate between the malicious user and the primary user in a CRN. Simulations in a network simulator show that the proposed learning method is stable and provides enhanced SU throughput, shorter signal classification time, and reduced miss detection probability. Hence this approach can detect attacks at different layers with minimum detection time.

In most of the above solutions, a static PU and MU are assumed in the simulations. In practical scenarios, however, the locations of the PU, SU and MU are dynamic. Considering this aspect, a new detection scheme based on a Kalman filter is designed for mobile primary users in [22]. In this model, the position of the dynamic primary user is tracked, and the source of the incumbent transmission is verified using a Kalman filter. Then, the free-space path loss model is used to calculate the distance between the secondary node and the incumbent transmitter from the received power. If the difference between the estimated distances is more than a predefined threshold, the signal is assumed to come from a malicious user; otherwise it is treated as a genuine primary signal. The results of the Kalman filter are satisfactory in a non-static environment, and the scheme performs better than the RSS-based location method.
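The distance comparison described above can be sketched as follows. The example uses the standard free-space path loss relation, FSPL(dB) = 20 log10(d_km) + 20 log10(f_MHz) + 32.44, and a scalar random-walk Kalman filter as a simplified stand-in for the tracker; the noise parameters `q` and `r`, the threshold, and all names are assumptions for illustration rather than the model of [22].

```python
import math


def fspl_distance_km(pt_dbm, pr_dbm, freq_mhz):
    """Distance implied by free-space path loss for given transmit/receive power."""
    fspl_db = pt_dbm - pr_dbm
    return 10 ** ((fspl_db - 20.0 * math.log10(freq_mhz) - 32.44) / 20.0)


class ScalarKalman:
    """Random-walk Kalman filter tracking one quantity (here a distance in km)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.25):
        self.x, self.p, self.q, self.r = x0, p0, q, r

    def update(self, z):
        self.p += self.q                 # predict: state unchanged, uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct the estimate with measurement z
        self.p *= (1.0 - k)              # shrink the uncertainty
        return self.x


def is_malicious(tracked_km, pt_dbm, pr_dbm, freq_mhz, threshold_km):
    """Flag the signal if the path-loss distance disagrees with the tracked one."""
    measured_km = fspl_distance_km(pt_dbm, pr_dbm, freq_mhz)
    return abs(tracked_km - measured_km) > threshold_km
```

Here `tracked_km` would come from the filter following the known PU, while the path-loss distance comes from the power actually received; a large mismatch suggests that the emitter is not where the PU is expected to be.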

In [23], a combination of energy detection sensing and location identification is proposed to detect PUEA. This model depends on three major components: an energy detector with multi-threshold sensing, RSS location verification, and a two-level database (local and global). The local database comprises two components: the RSS probability function and fingerprint data. Location details of the PUE attacker and the primary user are stored in the global database. The location verification technique identifies whether the received signal originates from the PU or from a PUE attacker. Finally, the local and global databases, the RSS probability function, and the thresholds are updated.

In [24], a modified energy detection scheme is proposed for PUEA detection. A hypothesis problem is modeled with three channel states: (1) idle channel, (2) channel accessed by the PU and (3) channel attacked by a malicious user. Energy statistics of the SUs are extracted for accurate detection of the PU and MU, using detection statistics and predefined thresholds. Three threshold values are set: D0, D1, and D2. When the energy statistic is less than D0, the channel is assumed idle and can be accessed by a genuine SU. When the detection statistic is between D1 and D2, the channel is assumed to be accessed by the primary user. Finally, a statistic higher than D2 indicates the presence of an MU in the channel. This modified energy detection method is simulated in MATLAB, and the detection accuracy is computed.
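The three-threshold decision rule can be sketched as below. The energy statistic used here (mean squared magnitude of the samples) and the "undecided" outcome for values between D0 and D1 are assumptions of this example, since the text does not specify that region; the function names are hypothetical.

```python
def energy_statistic(samples):
    """Average energy of the received samples (real or complex)."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


def classify_channel(stat, d0, d1, d2):
    """Map an energy statistic to a channel state using thresholds d0 <= d1 <= d2."""
    if not d0 <= d1 <= d2:
        raise ValueError("thresholds must satisfy d0 <= d1 <= d2")
    if stat < d0:
        return "idle"
    if d1 <= stat <= d2:
        return "primary user"
    if stat > d2:
        return "malicious user"
    return "undecided"  # d0 <= stat < d1 is not covered by the description
```

For example, `classify_channel(energy_statistic(samples), d0, d1, d2)` would be evaluated once per sensing interval.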

Most of the existing literature on attack detection in a cognitive environment assumes that the attacker's location is fixed and considers only the physical layer. In [25], a cross-layer technique is applied to detect an attacker whose location changes. A testbed consisting of a mobile phone base station with software-defined radio is set up for real-time experimental validation. The technique combines energy detection, motion estimation, and analysis of information from the physical layer, the MAC layer, and the user’s application data to detect a PUE attacker in motion. The experimental results show that the signal is detected with more than 93% accuracy at an SNR of −9 dB (Table 1).

Table 1 Location and distance methods

3.2 Analytical Model

A few analytical methods detect PUEA based on the probability density function (PDF) of the power received at the secondary users.

An analytical model of the probability statistics of the signal power received by an SU (from both the PU and the attacker) is described in [26]. This method is a pioneering attempt to assess the viability of PUEA using the Fenton approximation (FA) and the Markov inequality (MI). The SU measures the received powers (from the PU and the MU), assuming Rayleigh fading and shadowing. If the difference between the received powers at the SU is less than a set threshold, the spectrum is considered ‘available’ for SUs. Otherwise, a decision is made to determine whether the signal is from the PU or the MU. Mathematical expressions are derived for (1) the PDF of the power received at the SU from the PU and (2) the PDF of the power received at the SU from the MU. Subsequently, the FA method is applied to calculate the mean and variance of the received power. Finally, a lower bound on the probability of a successful PUEA is estimated using MI.
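The two tools named above can be illustrated in isolation. The sketch below implements the usual moment-matching form of the Fenton approximation (a sum of independent log-normal powers is replaced by a single log-normal with the same mean and variance) and the plain Markov inequality, which bounds a tail probability by mean/a. How [26] combines them to obtain its lower bound is not reproduced here; the function names and the natural-log parameterization are assumptions of this example.

```python
import math


def fenton_wilkinson(params):
    """Approximate a sum of independent log-normals by one log-normal.

    params: iterable of (mu, sigma) of the underlying normals (natural-log units).
    Returns (mu_s, sigma_s) matching the mean and variance of the sum.
    """
    params = list(params)
    mean = sum(math.exp(mu + sigma ** 2 / 2.0) for mu, sigma in params)
    var = sum((math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
              for mu, sigma in params)
    sigma_s_sq = math.log(1.0 + var / mean ** 2)
    mu_s = math.log(mean) - sigma_s_sq / 2.0
    return mu_s, math.sqrt(sigma_s_sq)


def markov_upper_bound(mean, a):
    """Markov inequality for a non-negative variable: P(X >= a) <= mean / a."""
    return min(1.0, mean / a)
```

A quick sanity check is that a "sum" with a single term returns that term's own parameters, which follows directly from the moment-matching equations.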

There are a few limitations:

  1. This analytical model cannot perform well in a highly dynamic environment: the PU’s location is assumed static and known to all users in the system.

  2. The assumptions made do not hold in a realistic hostile environment: (1) the transmission power of the MU is assumed constant, and (2) SUs and MUs are assumed uniformly distributed.

An extension of the above model is presented in [27]. The author explores the feasibility of PUEA using the Fenton approximation (FA) and Wald's sequential probability ratio test (WSPRT). The simulation follows the same methodology of measuring powers and finding the mean and variance through FA. However, instead of MI, the author uses WSPRT (hypothesis testing) to detect PUEA (H0: PU, H1: MU). The simulation results show that (1) if MUs are too close to the SUs, an SU receives more power from the attackers than from the PU, and the probabilities of false alarm and miss detection rise; (2) if MUs are too far from the SU, the false alarm and miss detection probabilities fall. Limitations of this approach:

  1. The assumption of uniformly distributed SUs and MUs remains. Moreover, the MU’s transmission power is constant and the PU’s location is static; both may differ in a realistic scenario.

  2. Accurate results may require a huge sample size and a long testing time.
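The sequential test used in this approach can be sketched generically. The example below is a textbook Wald SPRT with the usual thresholds ln((1 − β)/α) and ln(β/(1 − α)); it assumes, purely for illustration, Gaussian received-power models under H0 (PU) and H1 (MU), whereas [27] works with PDFs obtained through the Fenton approximation. All names and default error rates are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution at x."""
    return (-0.5 * math.log(2.0 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2.0 * std ** 2))


def wsprt(samples, h0, h1, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test (sketch).

    h0, h1: (mean, std) of the received power under PU / MU.
    alpha: tolerated P(decide H1 | H0); beta: tolerated P(decide H0 | H1).
    Returns (decision, number of samples consumed).
    """
    upper = math.log((1.0 - beta) / alpha)   # decide H1 at or above this value
    lower = math.log(beta / (1.0 - alpha))   # decide H0 at or below this value
    llr = 0.0
    used = 0
    for used, x in enumerate(samples, start=1):
        llr += gaussian_logpdf(x, *h1) - gaussian_logpdf(x, *h0)
        if llr >= upper:
            return "H1", used
        if llr <= lower:
            return "H0", used
    return "undecided", used
```

The test stops as soon as the accumulated log-likelihood ratio crosses either threshold, which is why the number of samples needed is not fixed in advance and can grow large when the two hypotheses are hard to tell apart.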

To achieve higher efficiency, the same author compared the Neyman–Pearson composite hypothesis test (NPCHT) with Wald’s sequential probability ratio test (WSPRT) in [28]. The simulation follows the same methodology of measuring received signal power and calculating the mean and variance through FA. However, instead of WSPRT, the author uses NPCHT to detect PUEA (H1: PU, H2: MU). The simulation results show that WSPRT reduces successful PUEA by almost 50% compared to NPCHT, but at the cost of an enormous sample size and testing time.

Similar to [28], the power received at the SU is statistically analyzed in [29]. This work uses channel transmission characteristics and the probability density function of the received power at the SUs. The author proposes a variance-based method to resist attackers and compares it with a naïve detection method based solely on primary user power. An advanced attacker is modeled, which may have variable transmitting power and knowledge of the path loss exponent and the variance at the SUs. The attacker uses a maximum likelihood estimator and a mean-field approach to infer the transmission power of the primary user and produce signals that emulate it. The author claims that the attacker can imitate various primary user signal characteristics except for the communication channel feature. However, these claims rest on various assumptions, for example that the distances between the SU and the PU, the attacker and the PU, and the SU and the attacker are known. The assumption that the attacker’s location is known is also unsuitable for a real environment.

3.3 Cryptography

Given the limitations of location- and PDF-based countermeasures, researchers have explored cryptography to detect and mitigate PUEA. Authors have used channel impulse response and cryptographic techniques to determine the location of a primary transmitter (Table 2).

Table 2 Analytical model

The author in [30] proposes a scheme that combines cryptographic signatures with wireless link signatures generated from the channel impulse response. A helper node (HN) is placed at a fixed location close to the primary user. The HN acts as a signal messenger to the SU, enabling the SU to learn the signatures (both link and cryptographic) and thus to confirm the PU’s authenticity. The helper node transmits cryptographic signatures to SUs using the channel assigned to its PU (when the channel is idle). An attacker may copy PU signals and send false signals on the target channel. Finally, the helper node differentiates between genuine PU signals and those of an SU imitating the PU.

In [31], a cryptography-based method modified by a DNA algorithm is proposed. The authors focus mainly on two aspects: (1) a new member is added to the cognitive group only after integrity verification, and data are encrypted between the spectrum manager and the CR for secure transmission; (2) a reliable node situated close to the PU detects and mitigates PUEA. When the spectrum is idle, this node sends a DNA-algorithm-based authentication tag to the CR. Information received by the CR without the authentication tag is discarded and reported to the spectrum manager. The spectrum manager then locates the malicious node through distance measurement and removes it from the cognitive group. The simulation results compare the detection probability with and without the authentication tag. To determine whether the signal comes from the PU or an attacker, a helper node that uses the amplitude ratio (AR) of multipath components is developed. This is a physical layer authentication approach. The helper node calculates the AR from the channel impulse response and compares it with a set threshold. If the AR is greater than the threshold, the received signal is marked as a PU signal; otherwise it is considered an attacker signal and is discarded. The efficiency of this approach is shown through mathematical modeling in terms of the false-negative probability (attacker mistaken for the PU) and the false alarm probability (PU mistaken for an attacker). There are a few drawbacks associated with this work:

  1. It is assumed that an attacker’s transmission power is much higher than PU’s transmitting power.

  2. An attacker cannot be close to PU.

  3. It is assumed that an attacker cannot compromise the helper node, which is a possibility in a real scenario.

  4. A security mechanism is required between helper nodes and secondary users to avoid the possibility of an attacker modifying messages.

The author in [32] illustrates denial of service (DoS) in a DSA network architecture based on the IEEE 802.22 standard. This approach involves four elements: a certification authority (CA), the PU, the SU, and a secondary base station (SBS). Public-key cryptography is adopted, in which the PU signs its data with a digital signature before transmitting. A current timestamp, the private key, and the PU ID are used to generate the digital signature. The SU continuously scans the spectrum for the digital signature during the sensing period. Once a signal is transmitted through the channel, the SU detaches the signature from the PU data unit and forwards it to the base station for verification. The certification authority maintains a database of the public keys used by PUs in a confined geographical area. The BS and the CA use this database to confirm whether the received signature belongs to a PU or an attacker. If no existing public key decrypts the signature, the sender is considered an attacker and is discarded. Thus, this method can successfully detect and mitigate MUs in the network, with a few limitations as stated below.

  1. Attaching a signature to the PU transmission violates the FCC rule (no modification or interference to PU transmission).

  2. The infrastructure is costly for a substantial geographical area.

  3. The complete detection and mitigation process depends on the CA. If the CA fails or is compromised, the entire system goes down.

  4. If the MU sends fake signals continuously, it can drain network resources badly. This would congest the common control channel and make the secondary network inoperable.

  5. For the encryption scheme, the SU must be able to synchronize with and demodulate primary signals.

Another defense approach, built on a hash-based message authentication code (HMAC), is implemented in [33]; a key is shared between the PU and the SU. A tag generated with the shared key is attached to the message before it is transmitted to the receiver. At the receiver, the SU regenerates the tag using the hash function and the shared key. If the regenerated tag matches the received one, the signal is considered to have arrived from the PU; otherwise it is an attacker's signal. This process is effective, but a few limitations make it impractical:

  1. Bandwidth efficiency is degraded.

  2. The method is noise sensitive and adds attenuation to the PU signal.

  3. Modification of the PU transmitter may disrupt the synchronization between the transmitter and receiver. It reduces the coverage area of the primary network and violates the FCC statement (“no modification to the primary user system is mandatory to allow the opportunistic use of the spectrum by secondary users”).
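The tag check at the heart of the HMAC-based scheme of [33] can be sketched with Python's standard `hmac` module. The choice of SHA-256, the key and message values, and the function names are assumptions for this example; how the tag is embedded in the PU transmission is outside the scope of the sketch.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the authentication tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: regenerate the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Example: a tag produced with the shared key verifies, a forged one does not.
shared_key = b"hypothetical-shared-key"
frame = b"primary user frame"
tag = make_tag(shared_key, frame)
```

An emitter that does not hold the shared key cannot produce a tag that passes `verify`, which is what lets the SU separate PU transmissions from emulated ones.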

In [34], the Advanced Encryption Standard (AES) is used for PUE detection. In this method, the digital TV (PU) transmitter sends a reference signal (RS) that serves as the segment synchronization bits of the digital TV data frames. The RS is created in two steps. Step 1: a pseudo-random (PN) sequence is generated. Step 2: the AES algorithm encrypts the PN sequence. For robust security, a 256-bit secret key (SK) is applied in Step 2. The RS is regenerated at the receiver (by virtue of the SK shared between receiver and transmitter) to identify the PU and MU accurately. To validate a PU signal, the RS is correlated with the received signal and the result is compared with a predefined threshold (T). If the correlation is greater than or equal to T, the PU’s presence is confirmed; otherwise the PU is absent. A malicious user (MU) is detected by evaluating the autocorrelation of the RS. The detection performance for MU and PU is gauged through the false alarm and miss-detection probabilities. A four-hypothesis detection model is formed: (1) H00: MU is absent given that PU is absent (alpha = 0); (2) H01: MU is present given that PU is absent (alpha = 0); (3) H10: MU is absent given that PU is present (alpha = 1); (4) H11: MU is present given that PU is present (alpha = 1).
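The correlation test against the threshold T can be sketched as follows. To keep the example self-contained, the AES-encrypted PN sequence is replaced by a seeded pseudo-random ±1 sequence, which is a stand-in and not the construction of [34]; the seed plays the role of the shared secret, and the function names and threshold value are hypothetical.

```python
import random


def reference_signal(seed, length):
    """Stand-in for the AES-encrypted PN sequence: a seeded +/-1 sequence."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def normalized_correlation(rs, received):
    """Mean of rs[i] * received[i]; equals 1.0 when received is exactly rs."""
    return sum(a * b for a, b in zip(rs, received)) / len(rs)


def pu_present(rs, received, threshold):
    """Declare the PU present when the correlation reaches the threshold T."""
    return normalized_correlation(rs, received) >= threshold
```

Because both ends regenerate the same reference signal from the shared secret, a receiver can run `pu_present` without any side channel to the transmitter; an emitter that cannot reproduce the sequence should yield a low correlation.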

Limitations of the AES method:

  1. A plug-in AES chip is needed, which increases the cost of the process.

  2. AES is a symmetric-key algorithm; the key needs to be shared with each recipient.

  3. Due to the key size (256 bits), encrypting and decrypting messages takes excessive time, which can hinder effective communication and increase overhead (Table 3).

Table 3 Cryptographic based detection technique

3.4 Belief Propagation

After the location-based, distance-based analytical, and cryptography-based methods, there was a strong need for a more robust algorithm. Researchers proposed belief propagation (BP), claiming that it is more effective than localization and cryptography. Yuan and Zhou described a belief propagation algorithm [35] on a Markov random field (MRF) for the detection and mitigation of PUEA. Each SU takes received signal strength (RSS) measurements to identify the location of the transmitter. Each SU then compares the known location of the PU with the received signal and computes the probability that the sender is an MU or the PU. The probability calculated by each SU is called a belief, and beliefs are exchanged as messages. When a signal is received by the cognitive radio network, the SUs iteratively exchange beliefs with each other in the form of messages. Once the beliefs of all secondary nodes have been exchanged, the mean of the final beliefs is computed. If this mean is lower than the set threshold, the incoming signal is considered a PUEA; otherwise, it is considered to come from an honest SU seeking spectrum. Subsequently, the characteristics of the PUEA signal are broadcast to all SUs in the network so that it can be circumvented in the future. Approximately eight iterations are needed for the process to converge.
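The exchange-then-average decision can be sketched with generic loopy sum-product belief propagation on a fully connected binary MRF. This is a simplified illustration, not the local and compatibility functions of [35]: each SU's local probability is assumed to be already available from its RSS comparison, and the coupling strength, threshold, and output labels are hypothetical.

```python
def loopy_bp(local, coupling=0.7, iterations=8):
    """Return each SU's belief that the signal is genuine after message passing.

    local[i] is SU i's own probability (strictly between 0 and 1) that the
    signal is genuine. States: 0 = genuine, 1 = attacker.
    """
    n = len(local)
    phi = [(p, 1.0 - p) for p in local]                      # local functions
    psi = ((coupling, 1.0 - coupling), (1.0 - coupling, coupling))  # compatibility
    msg = [[(0.5, 0.5) for _ in range(n)] for _ in range(n)]  # msg[i][j]: i -> j
    for _ in range(iterations):
        new = [[(0.5, 0.5) for _ in range(n)] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                out = []
                for xj in (0, 1):
                    total = 0.0
                    for xi in (0, 1):
                        prod = phi[i][xi] * psi[xi][xj]
                        for k in range(n):
                            if k != i and k != j:
                                prod *= msg[k][i][xi]
                        total += prod
                    out.append(total)
                norm = out[0] + out[1]
                new[i][j] = (out[0] / norm, out[1] / norm)
        msg = new  # synchronous update: all SUs exchange messages at once
    beliefs = []
    for i in range(n):
        b = [phi[i][0], phi[i][1]]
        for k in range(n):
            if k != i:
                b = [b[0] * msg[k][i][0], b[1] * msg[k][i][1]]
        beliefs.append(b[0] / (b[0] + b[1]))
    return beliefs


def classify(local, threshold=0.5):
    """Mean of the final beliefs below the threshold is flagged as PUEA."""
    beliefs = loopy_bp(local)
    mean_belief = sum(beliefs) / len(beliefs)
    return "PUEA" if mean_belief < threshold else "legitimate"


print(classify([0.9, 0.8, 0.85, 0.7]))   # SUs mostly trust the signal
print(classify([0.1, 0.2, 0.15, 0.3]))   # SUs mostly distrust the signal
```

The nested loops over all SU pairs make the per-iteration cost grow quickly with the number of SUs, which relates to the scalability concern raised for this method.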

Limitations of this BP approach in terms of CR security are:

  1. Cooperation between secondary users becomes challenging due to the high computational complexity of the local and compatibility functions.

  2. As the number of SUs rises, the scalability and accuracy of the algorithm decrease.

  3. The PU's position must be known well in advance.

  4. When the distance between the MU and the PU is small, a firmer belief is achieved as the probability of suspecting a PU is higher.

The belief propagation approach is extended further in [36] in an effort to reduce the number of iterations and the computation time. In the previous attempt, the BP algorithm required approximately eight iterations before converging to a final belief. In this work, the new BP method redefines the protocol for exchanging messages and is modified to compute more straightforward beliefs at each SU. The modified BP method can detect and mitigate PUE attackers in a single iteration, and its results are as accurate as those of the former approach (eight iterations). A BP framework based on pairwise Markov random fields (MRF) is exploited to achieve high accuracy and scalability. The location of the PU transmitter is recognized using relative power observations at the SUs. The computation time of both BP algorithms is compared in [36]. Results show that computation time is directly proportional to the number of SUs for both approaches. Limitations of this modified BP algorithm are:

  1. When the distance between the PU and the MU reduces, the mean of the final belief is affected.

  2. If secondary users are compromised, inaccurate results are produced.

3.5 Wireless Microphone

This subsection describes the detection of emulation attacks when the primary user is mobile. Wireless microphones (WMs) and TV towers exist in the same white space. A wireless microphone is not stationary, and its transmission power is low. These properties make attack detection more difficult than for stationary users. In [37], a novel experimental method is described to spot wireless microphone emulation attacks (WMEA).

In that work, a real-time experiment is conducted to detect PUEA against wireless microphones, which are authorized to operate in TV bands. The white space contains both TV towers and WMs. The relation between the received RF signal energy level and the audio information received from sound sensors (attached to the SUs) is exploited to verify the authenticity of the PU. Ambient noise is mitigated using collaborative sensing. The communication range of the WM radio frequency signal is < 100–150 m. The correlation between the signals received by the SUs and the output of the sound sensors is calculated. If the received signal does not meet the correlation test criteria, WM emulation attackers are assumed to be present. According to the IEEE 802.22 standard, sensors should be capable of identifying wireless microphone signals over 200 kHz bands within 2 s with both misdetection and false alarm probabilities < 0.1, so timing is an essential parameter in emulation attack detection. A spectrum analyzer is used to measure RF signal power [38, 39]. The detection time of this method is approximately 3 s, which can be further reduced. Because every SU is equipped with an extra sound sensor, the number of sound sensors grows with the number of secondary users, making the system complicated and expensive. Nevertheless, this is a unique contribution to the literature on wireless microphones in cognitive radio networks. There remains a need for practical, robust methodologies that apply to both stationary and mobile PUs and thereby enhance CRN security.
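The correlation test between RF energy and sound-sensor output can be illustrated with a short sketch. It assumes a plain Pearson correlation and a hypothetical acceptance threshold of 0.6; the sample values are invented for illustration and are not measurements from [37].

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sample lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def is_genuine_microphone(rf_energy_dbm, audio_level, min_corr=0.6):
    """Accept the signal only if RF energy tracks the locally sensed audio."""
    return pearson(rf_energy_dbm, audio_level) >= min_corr


# Invented samples: audio level from the SU's sound sensor and RF energy in dBm.
audio = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
rf_genuine = [-71.0, -58.0, -67.0, -56.0, -69.0, -60.0]   # follows the audio
rf_emulated = [-60.0, -60.0, -61.0, -60.0, -59.0, -60.0]  # unrelated to the audio

print(is_genuine_microphone(rf_genuine, audio))
print(is_genuine_microphone(rf_emulated, audio))
```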

3.6 Game Theory

Game theory (GT) provides a mathematical model for analyzing the strategic interaction among multiple decision makers. It is an emerging and effective framework for designing CRN security mechanisms, as it models interactions among rational entities with conflicting interests. A classic multi-user game comprises three elements: (1) the active users, known as players; (2) the actions taken by the active users, known as strategies; and (3) the outcome for each player based on the adopted strategies, known as the payoff or utility [40,41,42,43].

In [44], GT is adopted for PUEA detection using Nash equilibrium. A dynamic non-cooperative multistage game is developed between SUs and attackers. It is a two-player game in which both players are rational and have conflicting purposes. The SU strives to use an idle frequency band without intruding on PU transmission, whereas the MU tries to acquire the entire bandwidth by pushing the SU out. It is assumed that the schedule of primary user arrivals is unknown to both secondary and malicious users. Thus, both users learn and build a probabilistic model of PU arrival. Further, to learn more about the state of the PU, a belief updating system is applied for the SU. This system helps the SU fine-tune its strategy and defend against the malicious attacker. Simulation results show that, in comparison with the other models, the belief updating system attains a considerably better payoff and is more robust to inaccurate estimation of the primary user's state.

In [45], a game model is designed to detect a PUE attack. The game-theory-based defense strategy deals with selfish and malicious PUE attackers. The time frame is divided into a sensing frame and a data frame. A selfish attacker performs a PUE attack during the sensing period and then occupies the spectrum band for the data transmission duration. Channel surveillance is adopted to monitor the attacker while it accesses the channel. Once such an attacker is identified, it is punished by either limiting its bandwidth or isolating it completely from the network. It is assumed that the spectrum sensing process cannot differentiate between emulated and licensed primary user signals; thus, an attacker cannot be detected in the sensing time frame. An extra sensing process is adopted, which helps in detecting a channel under PUEA. Once such a channel is sensed, it is set free again for the next data frame. This extra sensing process is cheaper than channel surveillance. The paper does not deal with avoiding the attacker; it detects the attacker before switching to the channel surveillance procedure. The key role is played by the network manager, who acts as a superintendent. The strategies of the attacker and the superintendent are derived in closed form as a Nash equilibrium (NE) point. The model is effective when the attacker targets only a single channel; however, it is not suitable for a CRN facing a multi-channel attack.
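A closed-form equilibrium of this kind can be illustrated on a toy 2x2 game between a superintendent (surveil or stay idle) and an attacker (attack or wait). The sketch below computes the mixed-strategy Nash equilibrium from the standard indifference conditions; the payoff numbers are hypothetical and are not those of [45].

```python
from fractions import Fraction


def mixed_nash_2x2(row_payoff, col_payoff):
    """Interior mixed Nash equilibrium of a 2x2 bimatrix game.

    Returns (p, q): p is the probability that the row player plays row 0 and
    q is the probability that the column player plays column 0. Each player
    mixes so that the opponent is indifferent between its two actions.
    """
    a, b = row_payoff, col_payoff
    p_den = b[0][0] - b[1][0] - b[0][1] + b[1][1]
    q_den = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    if p_den == 0 or q_den == 0:
        raise ValueError("no interior mixed equilibrium for these payoffs")
    p = Fraction(b[1][1] - b[1][0], p_den)
    q = Fraction(a[1][1] - a[0][1], q_den)
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("indifference solution is not a valid probability")
    return p, q


# Hypothetical payoffs. Rows: superintendent (0 = surveil, 1 = idle).
# Columns: attacker (0 = attack, 1 = wait).
SUPERINTENDENT = ((2, -1), (-4, 0))
ATTACKER = ((-3, 0), (3, 0))

p_surveil, q_attack = mixed_nash_2x2(SUPERINTENDENT, ATTACKER)
print(p_surveil, q_attack)
```

With these numbers no pure-strategy equilibrium exists, so both sides randomize: the superintendent surveils often enough to make attacking unprofitable on average.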

Because the superintendent's role in [45] is limited to handling a single-channel attack, its capabilities were enhanced in [46] to address multichannel attacks by adopting a channel surveillance procedure. In the previous approach, Nash equilibrium (NE) is used to analyze a formulated non-zero-sum game covering selfish, malicious, and mixed PUE attackers. However, a smart attacker can learn and adapt to the surveillance strategy by monitoring the spectrum for a fixed period before it attempts to occupy the channel or perform a selfish PUEA. In such a case, NE may not give the defender an efficient strategy.

An advanced form of game theory based on the strong Stackelberg equilibrium (SSE) is proposed in [47]. The algorithm revolves around a leader and a follower (corresponding to the superintendent and the attacker in [45]): the leader moves first, and its strategy drives and defines the follower's strategy. The idea is to have a commitment model in which the follower's actions respond to the strategy the leader has committed to. A non-commitment model is analyzed using NE, whereas a commitment model is analyzed using SSE. The benefits and losses of both players are evaluated and compared with the non-commitment model. PUEA cannot be detected during the sensing process, as it is assumed that the sensing engine cannot distinguish between emulated and genuine PU signals. The commitment model is examined through proper modeling of the strategic interaction between leader and follower. The attacker then responds to the strategy used by the network manager, which lowers the model's computation time and maximizes the expected payoff. If the network size increases, the game theory model becomes complicated and impractical.
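The commitment model can be sketched on a toy 2x2 game: the leader commits to a surveillance probability, the follower best-responds, and ties are broken in the leader's favor (the "strong" part of SSE). The payoffs and the grid search over commitment probabilities are hypothetical simplifications, not the formulation of [47].

```python
from fractions import Fraction

# Hypothetical payoffs[leader_action][follower_action].
# Leader: 0 = surveil, 1 = idle. Follower: 0 = attack, 1 = wait.
LEADER = ((2, -1), (-4, 0))
FOLLOWER = ((-3, 0), (3, 0))


def expected(payoff, p_surveil, follower_action):
    """Expected payoff when the leader surveils with probability p_surveil."""
    return (p_surveil * payoff[0][follower_action]
            + (1 - p_surveil) * payoff[1][follower_action])


def strong_stackelberg(steps=100):
    """Grid search over leader commitments; returns (p, follower_action, leader_value)."""
    best = None
    for k in range(steps + 1):
        p = Fraction(k, steps)
        follower_values = [expected(FOLLOWER, p, a) for a in (0, 1)]
        top = max(follower_values)
        best_responses = [a for a in (0, 1) if follower_values[a] == top]
        # Strong Stackelberg: among the follower's best responses, pick the
        # one that is best for the leader.
        action = max(best_responses, key=lambda a: expected(LEADER, p, a))
        value = expected(LEADER, p, action)
        if best is None or value > best[2]:
            best = (p, action, value)
    return best


p_star, follower_action, leader_value = strong_stackelberg()
print(p_star, follower_action, leader_value)
```

Exact fractions are used so that the follower's tie at the optimal commitment is detected reliably, which floating-point arithmetic might miss.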

In [48], another game theory approach is proposed by Mohsen et al. In this work, a non-zero-sum game is developed between good and bad secondary users in the synchronization phase. During the data transmission phase, a genuine SU sends data randomly on the frequency channels in the presence of a bad SU. The Nash equilibrium point is obtained for this game, with an improvement in SU throughput. Simulation results show that the NE point is consistent with the Lemke and Howson algorithm.

Another GT based scheme is proposed to reduce the false alarm and miss detection probability [49]. A non-zero sum game is formulated between SUs and PUEA, which does not allow the attacker to use the channel. Each cluster is embedded with this game model so that SU does not switch channels when the attacker node arrives. A trust list table consisting of PU node ID and attacker ID is created and updated after every miss-detection. Therefore, SU switches and moves to another channel or stays on the same channel, which depends on the arriving node ID and trust list table. Simulation results show that chances of miss-detection and false alarms are reduced substantially.

General limitations of the GT model:

  1. The number of players must be finite. As the network size increases, the game theory approach becomes complicated and impractical.

  2. GT revolves around mathematical models that assume logical responses; real-world responses may vary.

3.7 Machine Learning

The ability to learn and improve at tasks with experience is part of being human. At birth, we know almost nothing and can do little for ourselves; we learn with time and become more capable, and computers have the same potential. Machine learning (ML) brings together statistics and computer science to enable computers to learn and accomplish a given task without being explicitly programmed to do so. Just as the human brain uses experience to improve at a given task, computers can improve with experience too [50, 51]. Feeding in information such as images helps the computer recognize the data and represent it numerically. It identifies patterns and builds models for better predictions. In the learning stage, it may give inaccurate results; however, with incremental data input, the algorithm is fine-tuned and becomes more accurate in its predictions. Machine learning is the technology behind facial recognition, text and speech recognition, spam filters, credit card fraud detection, and online shopping and viewing recommendations. From medical diagnosis to social media, the horizon of machine learning is vast [52].

Researchers strive to combine statistics and computer science to build algorithms that can solve complex problems more efficiently using less computing power. ML makes a system capable of analyzing, predicting, responding, and automatically learning from previous experience without the need for explicit programming. Research on ML in the context of cognitive radio is broadly categorized into two groups: pattern classification and decision making. Learning algorithms are broadly classified as supervised, unsupervised, and reinforcement learning (RL) [53, 54].

A CR that is well versed in its radio frequency environment performs intellectual tasks smoothly. Besides being aware and truly cognitive, an effective CR is well equipped with learning and reasoning capabilities. These capabilities are embedded in the engine of a cognitive radio, which acts as its nucleus. The CR engine uses machine learning algorithms to coordinate all the actions of a CR.

3.7.1 Unsupervised and supervised learning solutions

The work in [55] describes secure spectrum sensing using unsupervised machine learning in the presence of PUEA and spectrum sensing data falsification (SSDF). The coexistence of PUEA and SSDF makes spectrum sensing unreliable. Even if the emulation attacker is identified, a neighbor can send a wrong sensing report, leading to falsification. The solution suggests that the SUs responsible for both kinds of attack should be removed from the cooperative sensing process. A secure sensing algorithm is proposed, which uses a clustering mechanism to detect malicious SUs. This algorithm does not need prior information about the SUs' locations or attacking strategy. The identity of each SU is checked for errors through a uniquely assigned identity value, and these values are updated intermittently. The process focuses on securing spectrum sensing and making the decision-making process more reliable irrespective of the threat (PUEA or SSDF).

One way to make the sensing report more accurate is to discard the attackers' reports from the averaged observation before finalizing the spectrum decision. To achieve this, K-means unsupervised machine learning is used to filter out anomalous reports that could cause a deviation in spectrum sensing. These reports are omitted from the cooperative spectrum sensing (CSS) process, so the sensing decision rests strictly on the results of trusted SUs. A bigger risk exists, however, if the fusion centre (FC) is compromised, as it is responsible overall for identifying attacks, securing spectrum sensing, and ensuring reliable decision making. Other limitations of K-means unsupervised ML are: (1) the value of K must be chosen manually; (2) K-means clustering cannot handle clusters of varying size and density; (3) K-means clustering assumes spherical clusters with equal numbers of observations, so it does not work for clusters of unusual size; (4) clustering quality is affected by the dimensions of the sensing history: larger dimensions lead to fewer identification errors but increase computational complexity.
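The filtering step can be sketched with a one-dimensional K-means (K = 2) over the energy values reported by the SUs. This is a minimal illustration under two assumptions that are not taken from [55]: reports are scalar energies, and the larger cluster is treated as the trusted one (honest SUs are assumed to be the majority). The report values are invented.

```python
def kmeans_two_clusters(values, iterations=20):
    """Plain 1-D K-means with K = 2, initialized at the min and max values."""
    centroids = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each report joins its nearest centroid.
        labels = [0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        for k in (0, 1):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids


def trusted_reports(reports):
    """Keep only the reports in the larger cluster (assumed honest majority)."""
    labels, _ = kmeans_two_clusters(reports)
    majority = 0 if labels.count(0) >= labels.count(1) else 1
    return [v for v, label in zip(reports, labels) if label == majority]


# Invented sensing reports: five consistent SUs and two outliers.
reports = [0.92, 1.05, 0.98, 1.10, 0.95, 4.80, 5.10]
trusted = trusted_reports(reports)
fused_energy = sum(trusted) / len(trusted)

print(trusted)
print(fused_energy)
```

Note that K is fixed by hand here, which mirrors the first K-means limitation listed above.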

To address the spectrum occupancy issue, researchers have explored integrating conventional wireless sensor networks with cognitive features, giving cognitive wireless sensor networks (CWSN). A new method [56] has been devised, based on two non-parametric algorithms: data clustering (DC) and cumulative sum (CUSUM). The method is analyzed in terms of detection delay, scenario dependency, resource scalability, and learning time. A CWSN simulator verifies the validity of the method. The DC algorithm is more appropriate for dynamic or intricate scenarios, whereas CUSUM is comparatively slow to respond to large shifts. Neither algorithm is suitable for applications with large datasets. Both non-parametric algorithms have shown the ability to detect PUEA behavior, but they are slow (an increase in the number of users reduces the network speed).
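The cumulative sum idea can be illustrated with the textbook one-sided CUSUM recursion, which accumulates deviations above a target level and raises an alarm once the sum exceeds a threshold. This is the standard parametric form with a known target mean, not the non-parametric variant of [56]; the samples, slack, and threshold are hypothetical.

```python
def cusum_alarm_index(samples, target_mean, slack, threshold):
    """Return the index of the first alarm, or None if no alarm is raised.

    Recursion: s = max(0, s + (x - target_mean - slack)); alarm when s > threshold.
    """
    s = 0.0
    for index, x in enumerate(samples):
        s = max(0.0, s + (x - target_mean - slack))
        if s > threshold:
            return index
    return None


# Invented received-power samples: normal behavior, then a sustained upward shift.
normal = [1.0, 1.1, 0.9, 1.0, 1.05]
shifted = normal + [2.0, 2.1, 1.9, 2.2]

print(cusum_alarm_index(normal, target_mean=1.0, slack=0.25, threshold=1.5))
print(cusum_alarm_index(shifted, target_mean=1.0, slack=0.25, threshold=1.5))
```

The slack term absorbs small fluctuations, so only a sustained shift accumulates; the alarm therefore arrives a few samples after the change begins, which is the detection delay the method trades against false alarms.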

In [57], an effective PUEA detection method is presented that takes advantage of a recurrent neural network (RNN). In a centralized CRN, during the preliminary stages of spectrum sensing (at the SU or FC), the RNN model can be trained on the standard PU activity series. During spectrum sharing with primary users, the SU can continuously check signal behavior using the PUEA detector and the trained RNN. Because of the well-known vanishing gradient problem, it is difficult for a basic RNN to process activity series with long temporal dependencies: the RNN forgets long-term data, so it cannot use patterns that occurred long before the current input series. Several other recurrent neural network structures have been introduced to overcome these limitations.

An extension of the RNN known as long short-term memory (LSTM) is widely used in deep learning. An LSTM has feedback connections and comprises a memory cell with input, output, and forget gates. The performance of the basic RNN detector is compared with that of the LSTM detector and the multi-layer LSTM detector in the presence of PUEA. The three-layer LSTM detector attains the best performance because it is capable of learning intricate features and storing lengthy historical behavior [58].

In [59], a classification model based on machine learning is used for detecting primary user emulation attacks. The detection process is divided into three stages: (1) channel estimation is done by treating the channel impulse response (CIR) as a link signature. (2) A four-dimensional feature space (mean, variance, skewness, and difference between the maximum and minimum CIR values) is created from the raw CIR samples. (3) Pattern recognition on the extracted features is carried out by six classification models: logistic regression (LR), linear discriminant analysis (LDA), K-nearest neighbors (KNN), decision tree classifier (DTC), Gaussian Naïve Bayes (NB), and support vector machine (SVM). The techniques are evaluated in terms of accuracy, recall, precision, and F score. A real-time software-defined radio testbed was used to test the performance of the proposed scheme.
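The four-dimensional feature extraction in stage (2) can be sketched as below. The sketch assumes real-valued CIR magnitudes and population (biased) estimators for variance and skewness; [59] may use different estimators, and the sample values are invented.

```python
import math


def cir_features(cir_magnitudes):
    """Return (mean, variance, skewness, max - min) of one raw CIR sample."""
    n = len(cir_magnitudes)
    mean = sum(cir_magnitudes) / n
    variance = sum((x - mean) ** 2 for x in cir_magnitudes) / n
    std = math.sqrt(variance)
    third_moment = sum((x - mean) ** 3 for x in cir_magnitudes) / n
    skewness = 0.0 if std == 0 else third_moment / std ** 3
    spread = max(cir_magnitudes) - min(cir_magnitudes)
    return mean, variance, skewness, spread


# Invented CIR magnitudes for two links.
symmetric_link = [1.0, 2.0, 3.0, 4.0, 5.0]
skewed_link = [0.1, 0.1, 0.2, 0.2, 3.0]

print(cir_features(symmetric_link))
print(cir_features(skewed_link))
```

Each feature vector would then be passed to a classifier such as those listed in stage (3).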

In [60], genetic crossover and mutation operators are combined with the artificial bee colony algorithm to balance exploration and exploitation of solutions. In this paper, the primary user emulation attack is treated as an intelligent model and formulated as an optimization problem. Every secondary user is equipped with an energy detector to carry out spectrum sensing in the following channel scenarios: (a) white Gaussian noise; (b) PU with noise; (c) PUEA with noise and PU; (d) PUEA with noise. The received signal strength at the SUs is compared with two predefined thresholds instead of a single threshold. This double-threshold method improves the probability of detection compared with other recently proposed detection algorithms. Further, in [61], the artificial bee colony algorithm is combined with K-means to explore data clustering effectively.
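A generic double-threshold energy detector can be sketched as follows. The three-way decision rule, the labels, and the threshold values are assumptions made for illustration; the optimization of the thresholds by the bee colony algorithm in [60] is not reproduced here.

```python
def average_energy(samples):
    """Average energy of the received samples (mean of squared amplitudes)."""
    return sum(s * s for s in samples) / len(samples)


def double_threshold_decision(samples, low, high):
    """Classify the channel using two thresholds instead of one.

    Energies between the thresholds are reported as "uncertain" so that the
    final decision can be deferred, for example to cooperative sensing.
    """
    energy = average_energy(samples)
    if energy >= high:
        return "occupied"
    if energy <= low:
        return "idle"
    return "uncertain"


LOW, HIGH = 0.05, 0.5   # hypothetical thresholds

print(double_threshold_decision([0.1, -0.1, 0.1, -0.1], LOW, HIGH))
print(double_threshold_decision([0.5, -0.5, 0.5, -0.5], LOW, HIGH))
print(double_threshold_decision([1.0, -1.0, 1.0, -1.0], LOW, HIGH))
```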

Another ML secure sensing algorithm is proposed, which uses a clustering mechanism to detect the malicious SU. This algorithm does not need prior information about SUs location and attacking strategy. Identification of SUs is checked for errors through the uniquely assigned identity value for each SU. Such values are updated intermittently. The process focuses on securing the spectrum sensing and making the decision making process more reliable irrespective of the threat—PUEA or SSDF [62].

Traditionally, machine learning comes with the capability of learning, analyzing, and predicting radio frequency signals. However, it is challenging to classify RF signals disrupted by malicious activities such as jammer, eavesdropper and PUEA. A new machine learning model known as generative adversarial network (GAN) is implemented to discriminate between real and fake radio signals in the cognitive radio network. As this is a recently explored learning model, few research techniques have been proposed for attack detection based on GAN [63, 64].

In [65], a new solution is provided for the detection of primary user emulation attacks. A generative adversarial network (GAN) model is designed to generate and detect malicious SUs. The GAN arrangement consists of two neural networks: (a) a generator that generates data probabilistically and (b) a discriminator that differentiates fake from real signals after being trained as an artificial neural network. For the detection of MUs, a dumb generator with no PU signal information and a smart generator with sufficient PU signal data are designed. The data produced by these generators are classified by the discriminator model, which is trained by the neural network on real (PU) and fake (malicious SU) signals. A Universal Software Radio Peripheral (USRP) testbed is used to train the discriminator on primary user signal data. The model detects malicious users and the PU with more than 98% accuracy. The GAN model is therefore a good solution for PUEA detection and for the security issues of wireless communication networks.

In [66], a discriminator model is built to distinguish between trusted and fake RF transmitters. This model learns the characteristics of the RF signal using a convolutional neural network (CNN) and a deep neural network (DNN). Results show that the CNN and DNN can detect trusted and malicious transmitters with 81.6% and 96.6% accuracy, respectively.

GAN is applied to create a secure and self-aware CR network in [67]. This work focuses on physical layer security by using two abnormality detection techniques: (1) a conditional generative adversarial network (C-GAN) and (2) a dynamic Bayesian network (DBN). A high-dimensional state vector of the radio frequency spectrum is extracted as input and learned by both detection networks.

In [68], a power allocation algorithm based on GAN is used to set up private communication under primary transmitter guidance. A generator and discriminator model is framed to learn power allocation solution strategy, covert or non-covert communication, and detection accuracy. This model learns from a fully connected deep neural network to provide a near-optimal power allocation solution to achieve fast convergence.

In [69], a distributed semi-supervised machine learning method running on the cloud is proposed. Data is first classified and labeled by a cognitive engine incorporated with self organizing map (SOM). Then, this labeled data is forwarded for the second classification based on past experiences to cloud core induced with convolutional deep learning network (CDLN). Both CDLN and rule-based learning are compared based on error rate percentage, false alarm probability, and detection accuracy with and without noise. The simulation outcome shows that the algorithm is 25% better than the conventional neural network and 40% better than a rule-based method. This is an effective technique to enhance system security when attack signature changes frequently.

Cloud computing has become a promising technology that supports rapid decision making thanks to simplified scaling and powerful virtual-machine-based compute engines. The work in [70] explores real-time supervised-learning-based PUEA detection that combines edge computing engines with the core cloud.

3.7.2 Reinforcement learning solutions

Reinforcement learning is a type of machine learning with no predefined input dataset. The learning mechanism operates on four elements: environment, agent, state, and action. For example, the CR environment consists of the secondary nodes, the primary nodes, and the cognitive and primary base stations [71, 72]. An agent observes, learns, and acts on the fly based on the environment, state, and reward. The state is the decision-making factor affecting the rewards and the agent's actions. An agent's action on its surroundings leads to a positive or a negative reward. Every action of an agent aims at maximizing the network's rewards and improving the next state [73].
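The state, action, and reward loop can be made concrete with the update rule of tabular Q-learning, one widely used RL algorithm. The state and action names, learning rate, and discount factor below are hypothetical and serve only to show how a reward changes the stored value of a state-action pair.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One Q-learning step:
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error


ACTIONS = ("stay", "switch")        # hypothetical SU actions
q_table = defaultdict(float)        # unseen state-action pairs start at 0.0

# The SU stays on an idle channel twice and receives a positive reward each time.
q_update(q_table, "channel_idle", "stay", 1.0, "channel_idle", ACTIONS)
q_update(q_table, "channel_idle", "stay", 1.0, "channel_idle", ACTIONS)

# The SU stays on a channel under attack and receives a negative reward.
q_update(q_table, "channel_attacked", "stay", -1.0, "channel_attacked", ACTIONS)

print(q_table[("channel_idle", "stay")])
print(q_table[("channel_attacked", "stay")])
```

The table holds one entry per state-action pair, so it grows with the product of the number of states and actions.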

Reinforcement learning and cluster size adjustment based security systems for CRN are presented in [76]. The size of the cluster is adjusted to enhance network scalability, stability, and spectrum utilization efficiency. In this approach, the cluster head (CH) distributes the free spectrum as tokens to the SU member nodes in the cluster. It is the duty of the CH to monitor the token utilization performance of the secondary users. If a secondary node wastes a specific amount of tokens, its performance ratio decreases, and the node is removed from the cluster. An MU can intentionally waste the tokens it receives to degrade utilization of the radio frequency spectrum without being detected. The RL model is integrated inside the cluster head as well as in all SUs, which are presumed to be potential attackers. The combination of RL and cluster size adjustment provides excellent security and scalability for the network when attack intensity is low [74]. As the probability and intensity of attack increase, the cluster head cannot detect malicious nodes in the cluster. This RL approach is well suited to a cognitive radio network in which an SU can turn malicious over time. Continuous observation and learning of each node's behavior and surroundings are essential for deploying a smooth security system.

In [74,75,76], RL and clustering applications, models, and features are explored. These papers explain step by step how RL can address security, routing, scalability, and stability in cognitive radio networks.

Most of the prevailing reinforcement-learning-based structures are implemented using simulations. To the best of our knowledge, there are only a few implementations of RL-defined schemes on CR hardware platforms such as the universal software radio peripheral and software-defined radio [77]. Real-time implementation of RL algorithms is significant for confirming their accuracy and precision in a practical CR environment. To this end, further research is needed to investigate the implementation and roadblocks of RL-based schemes on CR hardware platforms [78, 79].

3.8 Blockchain

Blockchain, a recently popularized technology, is used for implementing security in cognitive radio networks. A chain of blocks (referred to as a blockchain) is linked in a distributed form using cryptography and is used for storing information that is difficult to modify. Each block comprises three critical parts: the data, its own hash, and the hash of the previous block. An example of blockchain technology is the exchange of cryptocurrency (Bitcoin), where the transactions are recorded in a public ledger (the chain of blocks). The ledger contains details about the sender of the Bitcoin, the receiver, and the volume of coins traded. If the hash of the second block changes for some reason, all the following blocks become invalid, since every block depends on the previous block's hash value [80]. A mechanism called ‘proof of work’ slows down the creation of new blocks, which makes tampering with a block costly. When a new block is created, each node in the network verifies it to check its authenticity. Once consensus is reached among all the network nodes, the new block is added to the chain. Therefore, the security of blockchain comes from the creative combination of hashing, proof of work, and decentralized and distributed time servers [82, 83].
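The way each block depends on the previous block's hash can be shown with a minimal hash chain. The sketch uses SHA-256 from the Python standard library and a JSON encoding of the block contents, both of which are illustrative assumptions; proof of work and network consensus are deliberately left out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64   # placeholder "previous hash" for the first block


def block_hash(data, previous_hash):
    """Hash of a block's data together with the previous block's hash."""
    payload = json.dumps({"data": data, "previous_hash": previous_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block holding its data, the previous hash, and its own hash."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"data": data,
                  "previous_hash": previous_hash,
                  "hash": block_hash(data, previous_hash)})


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, else None."""
    previous_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if (block["previous_hash"] != previous_hash
                or block["hash"] != block_hash(block["data"], block["previous_hash"])):
            return index
        previous_hash = block["hash"]
    return None


chain = []
for record in ("A pays B 5", "B pays C 2", "C pays A 1"):   # invented records
    add_block(chain, record)

print(first_invalid_block(chain))     # untouched chain
chain[1]["data"] = "B pays C 200"     # tamper with the second block
print(first_invalid_block(chain))     # verification now fails at that block
```

Even if the attacker recomputes the tampered block's own hash, the next block still stores the old value, so the mismatch simply moves one block further down the chain.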

In a recent paper [81], blockchain technology is implemented to detect and mitigate malicious users and improve the spectrum sensing process in the cognitive radio network. All primary and secondary users are converted into blocks and form a decentralized network. Each cognitive radio block consists of four kinds of information: (1) Hash: the SHA256 algorithm generates a unique key known to the next block of the user. (2) Sensing result: the outcome of spectrum sensing. (3) Private key: a 16-bit unique key known only to the user. (4) Previous hash: the hash value of the previous node. Malicious users are detected by verifying the digital signature (hash and private key) of the nodes. A detected MU is excluded from future spectrum sensing. Complex simulations are performed in MATLAB to compute the probabilities of miss-detection and false alarm using public and private keys.

In [84], another blockchain-defined spectrum sharing protocol is designed against malicious users in a CR Internet of Things environment. This work defines a proactive protocol that learns an attacker's behavior and uses blockchain to enhance CR system security. Alongside the enormous growth of blockchain technology, its complexity is also very low compared with other existing models. However, the number of blocks in a blockchain can grow very large over time, so storage can become a bottleneck. High energy consumption is also a critical and currently concerning aspect of blockchain technology (Tables 4, 5).

Table 4 Countermeasure for BP, WM, game theory and machine learning
Table 5 ML countermeasure: reinforcement learning, GAN and blockchain

4 Past Present and Future

Steering the ship in the right direction is vital. A step towards security is helpful only when the underlying assumptions are sound; otherwise, the ship may drift into unsafe waters.

  1. Impractical Assumptions: The contributions on PUEA detection covered in this paper rely largely on an underlying hypothesis that the location and transmission power of all users (primary, secondary, and malicious) are fixed and well known. This assumption alone is naive and largely impractical for CR applications.

  2. Learning and Reasoning: To perform intellectual tasks, a CR is expected to be well versed in its radio frequency (RF) environment and to have learning and reasoning capabilities. Several factors and policies need to be adjusted instantaneously (e.g., transmission power, coding method, modulation technique, sensing process, policy, and communication protocol), and a one-dimensional approach that assumes ‘location’ and ‘transmission power’ may not help in synchronizing multiple parameters simultaneously [52]. In existing approaches, security comes at the cost of several system requirements. Therefore, research efforts in this direction would be a worthwhile contribution toward a security mechanism that suits any kind of network (cooperative or non-cooperative, distributed or centralized) and does not trade off system quality specifications (delay, throughput, reliability, energy consumption, and spectral efficiency). Advanced machine learning methods are moving towards enhancing security and are independent of the coordinates of the PU and SU [53]. Numerous introductory studies use various machine learning techniques (supervised and unsupervised), such as clustering, object classification, reinforcement learning, pattern recognition, and artificial neural networks (ANN) [55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78].

  3.

    Deep Reinforcement Learning: Research in the field of cognitive radio security has found a fresh opening in the form of reinforcement learning (RL). An RL-enabled CR learns from its state and surrounding environment, which eliminates the need to know the location of the PU and MU. In a practical CR scenario, there is a possibility that a genuine secondary user turns malicious during operation. Such scenarios have pushed researchers to evolve their approach as well. In a recent paper [76], Mee Hong Ling et al. integrated RL and clustering algorithms to detect and mitigate online malicious SUs. This approach provides a pragmatic solution for the case in which an SU turns malicious, which most existing models do not handle.

    Q learning is the RL algorithm used in this trust model. Q learning builds a matrix of states (S) and actions (A) by observing the environment. However, the algorithm has a few limitations. If the number of users rises or the network is large, it has to maintain a huge S × A matrix, which can exhaust the computational capacity of the system. To address this, deep reinforcement learning (DRL) needs to be thoroughly explored. Recent research on applying deep reinforcement learning to cognitive radio operation and security is limited. In [100] a reinforcement learning method is combined with a graph neural network to enhance energy optimization in a distributed CSS process. In [101] resource allocation is carried out in a cognitive network by applying a deep learning concept called doctive learning. Combining artificial neural networks (ANN) with RL appears to be the most promising route towards these goals.

  4.

    Physical Layer: Enhanced security at the outermost layer is the foundation of all security controls in a cognitive radio network. The primary user emulation attack takes place in the physical layer of the protocol stack; mitigating it therefore depends on achieving security at the physical layer. A number of researchers are working on improving the secrecy capacity and outage probability of cooperative communication. Secrecy rate analysis shows how well a security mechanism can perform against potential threats in cognitive radio networks. Various methods have been implemented to improve physical layer security with energy harvesting and cooperative communication [88,89,90,91].

  5.

    Cross Layer Design: Research may not require a narrowly focused strategy, but it does require focused execution; strategy formulation can remain balanced and flexible. Excessive focus on the physical layer alone has kept early researchers from changing perspective. Most of the countermeasures covered in this paper optimize physical layer features without considering changes to upper layer parameters, yet an attack at the physical layer can affect the MAC and upper layers of the network directly or indirectly. Historically, the notion of cross-layer security has not been heavily investigated. An effective security design needs interaction between the physical and upper layers, and this has motivated only a few researchers to propose new security ideas. The work in [85] combines detection of PUEA with upper-layer authentication, using the cross-layer learning ability of SUs. A radio-frequency (RF) fingerprint is used to detect PUEA, considering a multipath Rayleigh fading channel for mobile SUs. To achieve secrecy, the work in [86] proposes a cross MAC-PHY layer security design that combines MAC layer ARQ (automatic repeat request) with a physical layer artificial noise (AN) mechanism [87].

  6.

    Proactive Design: Most of the countermeasures for PUEA detection reduce the network’s efficiency because they rely on external algorithms. As the number of attacks increases, the number of times these algorithms (countermeasures) must run also increases, which slows down the network and drains its energy. A better alternative would be an integral (not external) mechanism for running the detection algorithms. These issues can be addressed by designing a robust and secure MAC protocol [92]. An ideal MAC protocol may exhibit the following characteristics. (1) The MAC can detect and discard an MU and its data from the network at the proactive stage of the spectrum sensing and decision-making process. (2) After the spectrum sensing stage, the MAC is not required to run external algorithms to mitigate attacks and therefore avoids high energy consumption. A design that takes a proactive outlook and can predict when and where the communication pathway will exist is significant. Recent work in [93], based on a similar security concept, proposed a proactive learning MAC protocol for defense against two significant attacks in centralized and distributed CRNs: PUEA and SSDF. This protocol is compared with PROMAC [94] and POMAC [95] in terms of channel utilization, back-off rate, and sensing delay. Designing a CR MAC model that supports a sturdy security mechanism remains a difficult research topic.

  7.

    Energy Harvesting: Transmission power has always been a critical assumption for early researchers (recall the one-dimensional assumption of fixed location and ‘power’ noted above). Researchers have assumed a constant transmission power, on the premise that the available energy always remains constant. On this basis, they have envisaged particular test cases or environments in which a PUEA occurs. In achieving the countermeasures (in this envisaged and controlled environment), signal strength was consumed by the external algorithms and kept decreasing. As power fell, there were rarely any attempts to modify the test scenarios so as to restore signal strength and compensate for the lower efficiency and drained energy. This gave rise to the concept of Energy Harvesting [95, 96]. Meng-Lin Ku has compiled various efforts on this concept [97] and has shown a promising path towards a mechanism for conserving energy in CRNs. Energy concerns fall mainly into two areas: energy efficiency and energy harvesting. In a CR network that harvests energy, each node is powered by energy sources such as wind, solar, or downlink radio frequency signals from the base stations. The energy harvesting process allows the CR network to do away with intermittent recharging or replacement of batteries. It has the potential to lead towards a green (communication) revolution [98, 99].
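
As a point of reference for the secrecy capacity and outage probability mentioned under Physical Layer (item 4), the sketch below uses the textbook Gaussian wiretap expression, in which the secrecy capacity is the positive part of the difference between the legitimate link capacity and the eavesdropper link capacity. It is not taken from any of the cited works [88,89,90,91]; the function names, the linear-SNR inputs and the empirical "secrecy outage" estimate are assumptions made for illustration only.

```python
import math


def secrecy_capacity(snr_legit: float, snr_eve: float) -> float:
    """Secrecy capacity in bits/s/Hz for a Gaussian wiretap channel.

    Both SNRs are linear (not dB). The result is clipped at zero because
    no positive secrecy rate exists when the eavesdropper's channel is
    at least as good as the legitimate one.
    """
    return max(0.0, math.log2(1.0 + snr_legit) - math.log2(1.0 + snr_eve))


def secrecy_outage(snr_pairs, target_rate: float) -> float:
    """Empirical fraction of channel realizations whose secrecy capacity
    falls below target_rate (hypothetical helper, for illustration)."""
    pairs = list(snr_pairs)
    outages = sum(1 for legit, eve in pairs
                  if secrecy_capacity(legit, eve) < target_rate)
    return outages / len(pairs)
```

For example, a legitimate SNR of 15 and an eavesdropper SNR of 3 give log2(16) − log2(4) = 2 bits/s/Hz, while swapping the two gives zero.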

Blockchain and DRL are two promising areas with significant results that need further exploration, both for deploying security and for other spectrum management operations. Researchers have already begun work in this direction [102, 103].
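
To make the motivation for DRL concrete, the following minimal sketch shows the standard tabular Q-learning update and how quickly the state-action matrix grows. The state encoding is an assumption chosen for illustration (one state per idle/busy pattern over N channels, one action per channel); it is not the model used in [76], and the names `q_table_entries` and `q_update` are hypothetical.

```python
def q_table_entries(num_channels: int) -> int:
    """Number of entries in a tabular Q matrix under the assumed encoding:
    2**N occupancy patterns as states, N channel choices as actions."""
    num_states = 2 ** num_channels
    num_actions = num_channels
    return num_states * num_actions


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a list of rows (one per state), each row a list of action values.
    alpha is the learning rate and gamma the discount factor.
    """
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, "channels ->", q_table_entries(n), "Q entries")
```

Under this assumed encoding, 10 channels already require 10,240 entries and 20 channels about 21 million, which is the kind of growth that motivates replacing the table with a neural network approximator.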

5 Conclusion

Cognitive radio is a captivating technology for improving the efficiency of valuable radio frequency resources. The dynamic nature of CR makes it more vulnerable to security threats than traditional wireless networks. The primary user emulation attack (PUEA) is a significant threat to the spectrum sensing operation of CR. This paper presented the most significant contributions for countering PUEA, describing their principles and shortcomings. Despite their utility, most of the countermeasures discussed in this paper are not suitable for a practical CR environment, and most of them reduce the network’s efficiency because they rely on external algorithms. There is thus a need for security mechanisms that proactively detect and discard attacks from the system without affecting crucial parameters such as delay, throughput, energy consumption and reliability. An innovative and sophisticated mechanism that provides a secure, reliable, high-speed, energy-efficient and low-complexity network is needed to make CR networks a feasible solution for future spectrum management requirements. Accomplishing these quality features with a cross-layer security design is still a stimulating area of research.