1 Introduction

Developing sophisticated information systems for civilian and military applications in congested environments is difficult [1]. Under such imperfect conditions, advanced systems are required to monitor and process signals at regular intervals. The last decade has seen a large number of innovations in communications [2]. Adaptive modulation and coding (AMC) is one such innovation; it enables higher transmission reliability and a higher transmission rate by altering the modulation format according to channel characteristics. Implementing AMC requires the receiver to know which modulation technique is in use in order to demodulate the signal [3]. To achieve this, additional data is included in each frame so that the receiver learns of changes in the modulation technique and responds accordingly. However, this overhead greatly reduces spectrum efficiency [4].

To overcome this problem, automatic modulation recognition (AMR) was proposed to recognize the modulation type without requiring additional data. AMR has therefore become an essential part of the receiver, especially for future adaptive radio systems [5]. Figure 1 shows the block diagram of an adaptive modulation system (AMS).

Fig. 1 Block diagram of AMS [6]

In general, automatic modulation classification approaches fall into three groups: (1) time–frequency analysis (TFA), (2) decision-theoretic methods and (3) pattern recognition solutions. TFA efficiently traces amplitude and frequency variations of the signal, but it cannot trace phase variations [7]. The decision-theoretic approach is likelihood based [8]: likelihood-based (LB) classifiers perform hypothesis testing, which leads to optimal solutions [9].
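The likelihood-based decision can be illustrated with a minimal sketch. It assumes an AWGN channel with known noise variance, unit-energy constellations, equiprobable symbols and perfect synchronization; the two candidate classes (BPSK and QPSK) and all function names are illustrative choices, not taken from the cited works.

```python
import cmath
import math
import random

# Unit-energy reference constellations (the candidate set is an assumption).
CONSTELLATIONS = {
    "BPSK": [1 + 0j, -1 + 0j],
    "QPSK": [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)],
}

def log_likelihood(samples, points, noise_var):
    """Log-likelihood of the samples under one constellation (constants dropped)."""
    total = 0.0
    for r in samples:
        # Average the Gaussian likelihood over equiprobable symbols.
        avg = sum(math.exp(-abs(r - s) ** 2 / noise_var) for s in points) / len(points)
        total += math.log(avg + 1e-300)  # guard against log(0)
    return total

def classify(samples, noise_var):
    """Hypothesis test: pick the modulation with the largest log-likelihood."""
    return max(CONSTELLATIONS,
               key=lambda m: log_likelihood(samples, CONSTELLATIONS[m], noise_var))

def simulate(name, n, noise_var, rng):
    """Draw n noisy symbols; noise_var is the total complex noise variance."""
    sd = math.sqrt(noise_var / 2)
    return [rng.choice(CONSTELLATIONS[name]) + complex(rng.gauss(0, sd), rng.gauss(0, sd))
            for _ in range(n)]

received = simulate("QPSK", 500, noise_var=0.1, rng=random.Random(0))
print(classify(received, noise_var=0.1))
```

The sketch needs the noise variance and the exact constellations in advance, which is the practical drawback of likelihood-based classifiers.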

Pattern recognition approaches comprise two stages: (1) feature extraction and (2) the pattern recognizer [10,11,12,13,14,15]. In feature extraction, features are extracted from the signal samples. In the pattern recognizer, the features (usually spectrograms) are processed and used to train a classifier. This article mainly focuses on pattern recognizer techniques. Various authors have used digital modulation classes such as PSK, quadrature PSK (QPSK) and QAM, representing the modulated signal with higher-order cumulants (HOCs) and with in-phase (I) and quadrature (Q) components. In this paper, we consider analog modulation signals rather than digital modulation classes.
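As an illustration of HOC features, the sketch below estimates the normalized fourth-order cumulants C40 and C42 from complex I/Q samples, using the standard moment-based definitions for a zero-mean signal. The function names and the noiseless test constellations are assumptions made for this example.

```python
import cmath
import math

def moment(x, p, q):
    """Sample estimate of M_pq = E[x^(p-q) * conj(x)^q]."""
    return sum(v ** (p - q) * v.conjugate() ** q for v in x) / len(x)

def normalized_cumulants(samples):
    """Return (C40, C42), each normalized by the squared signal power."""
    mean = sum(samples) / len(samples)
    x = [v - mean for v in samples]          # remove any DC component
    m20, m21 = moment(x, 2, 0), moment(x, 2, 1)
    m40, m42 = moment(x, 4, 0), moment(x, 4, 2)
    c40 = m40 - 3 * m20 ** 2
    c42 = m42 - abs(m20) ** 2 - 2 * m21 ** 2
    power = m21.real                          # M21 = E[|x|^2]
    return c40 / power ** 2, c42.real / power ** 2

# Noiseless, balanced symbol sequences (illustrative inputs only).
bpsk = [1 + 0j, -1 + 0j] * 100
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)] * 50
print(normalized_cumulants(bpsk))
print(normalized_cumulants(qpsk))
```

For these ideal inputs the two constellations give clearly different cumulant values, which is why such statistics serve as classification features.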

The rest of this paper is organized as follows. Section 2 discusses deep learning (DL) approaches. Section 3 describes the CNN approach. Section 4 presents the simulation results. Section 5 concludes the work.

2 Deep Learning Approaches

DL is a subfield of machine learning (ML). Almost all DL algorithms are deep neural networks (DNNs), which are capable of learning from unstructured data. Figure 2 shows the structure of an artificial neural network (ANN). In DL, the layers are interconnected and learn layer by layer. DL uses a larger number of hidden layers for efficient feature extraction [16]. The demand for DL in this era stems from its extended learning capability, its high performance and recent developments in ML.

Fig. 2 Basic structure of ANN

Typical DL approaches are the long short-term memory (LSTM) network, the convolutional neural network (CNN), the autoencoder (AE) and the recurrent neural network (RNN). LSTM is one of the most important RNN variants. LSTM cells have an internal memory that lets them learn efficiently. LSTM architectures use an input gate (I), an output gate (O) and a forget gate (F) [17]. The gate values are computed from the current input and the previous state. This gating approach gives LSTM cells long-term memory, so they can learn temporal dependencies efficiently. LSTMs are used in fields such as speech recognition and machine translation [18].
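The gating mechanism can be sketched as a single LSTM time step in the commonly used formulation. The parameter names (`Wi`, `Wf`, `Wo`, `Wg` and the matching biases) and the all-zero toy weights are hypothetical and serve only to show how the three gates combine the input, the previous hidden state and the cell memory.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vector):
    """Return weights @ vector + bias for list-based matrices."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step; p maps hypothetical names to weights and biases."""
    v = h_prev + x  # concatenate previous hidden state and current input
    i = [sigmoid(z) for z in affine(p["Wi"], p["bi"], v)]    # input gate
    f = [sigmoid(z) for z in affine(p["Wf"], p["bf"], v)]    # forget gate
    o = [sigmoid(z) for z in affine(p["Wo"], p["bo"], v)]    # output gate
    g = [math.tanh(z) for z in affine(p["Wg"], p["bg"], v)]  # candidate memory
    # The forget gate scales the old memory; the input gate admits new content.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # The output gate decides how much of the memory is exposed as hidden state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Toy example: one hidden unit, one input, all weights and biases zero.
zeros = {name: [[0.0, 0.0]] for name in ("Wi", "Wf", "Wo", "Wg")}
zeros.update({name: [0.0] for name in ("bi", "bf", "bo", "bg")})
h, c = lstm_step(x=[1.0], h_prev=[0.0], c_prev=[2.0], p=zeros)
print(h, c)
```

With zero weights every gate evaluates to 0.5, so half of the previous cell memory is retained, which makes the role of the forget gate easy to see.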

Recently, deep learning has been applied in various fields such as image classification [19], medicine [20], authentication [21] and speech recognition [22]. In [23], a deep CNN was proposed for image classification. In [24], a 3-D input CNN model was proposed for P300 signal detection. In [25], user authentication based on mouse movements was presented; for feature extraction, the authors used a CNN, an RNN and a hybrid model combining both, and a layer-wise relevance propagation (LRP) algorithm was proposed to calculate relevance scores for mouse movements. In [26], LSTM was combined with RNNs for improved speech recognition. In [27], deep learning algorithms were used for tasks such as detection, segmentation and classification of microscopy images.

In [24], the authors proposed a Bhattacharyya distance-based feature selection (BDFS) algorithm, in which the dissimilarity between probability density functions (PDFs) acts as the criterion for feature selection. The classifiers used are a radial basis function network (RBFN), a CNN and a sparse autoencoder. MPSK and 16QAM are used as modulated signals. The channel includes frequency-selective fading and additive white Gaussian noise (AWGN), with SNR values ranging from 0 to 15 dB. Their feature selection model achieved an accuracy of 100% at 15 dB.
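The idea of ranking features by the dissimilarity between class-conditional PDFs can be sketched as follows. This is not the BDFS algorithm of the cited work: it assumes each feature is Gaussian within a class (with nonzero variance) and uses the closed-form Bhattacharyya distance between two univariate Gaussians. The feature names and values are made up for illustration.

```python
import math
from statistics import mean, pvariance

def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between Gaussians fitted to samples a and b."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a), pvariance(b)
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term

def rank_features(class_a, class_b):
    """Order feature names from most to least separable between two classes."""
    scores = {name: bhattacharyya_gaussian(class_a[name], class_b[name])
              for name in class_a}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical feature values for two modulation classes.
class_a = {"c40": [-2.1, -1.9, -2.0, -2.05], "env_var": [0.10, 0.12, 0.11, 0.09]}
class_b = {"c40": [-1.0, -1.1, -0.9, -0.95], "env_var": [0.11, 0.10, 0.12, 0.10]}
print(rank_features(class_a, class_b))
```

A larger distance means less overlap between the two PDFs, so features at the front of the ranking are the better candidates to keep.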

In [25], the authors presented AMC based on the VGGNet model, a CNN with a larger number of hidden layers. Sampled signal values are first converted into grayscale images, which are then used to train VGGNet. 4ASK, 2PSK, 4PSK, 2FSK, 4FSK and 8FSK are used as modulated signals. Performance was evaluated at SNRs from −5 dB to 10 dB. The proposed model achieved an accuracy of 98% even at a low SNR of −2 dB.

In [26], auxiliary classifier generative adversarial networks (ACGANs) were proposed to enlarge the dataset, with AlexNet as the classifier. 4ASK, BPSK, OQPSK, QPSK, 16QAM, 8PSK, 64QAM and 32QAM are used as modulated signals. Contour stellar images are used to train the algorithm. Comparing the original and enlarged datasets, the authors found a 6% rise in classification accuracy with the ACGAN-based dataset.

In [27], the authors considered the carrier phase offset (PO) effect, which most AMR studies neglect. They proposed a CNN model and evaluated it with and without the PO effect, using HOCs and instantaneous values as features. They compared the proposed model with a decision tree (DT), a random forest (RF) and a DNN, with and without PO. The four modulation classes chosen are BPSK, QPSK, 8PSK and 16QAM. Performance was evaluated at SNRs from −20 dB to 20 dB. The proposed model nullified the PO effect and reached its highest classification accuracy, 100%, at 10 dB.

In [28], an adversarial transfer learning architecture (ATLA) is proposed for AMC. It unifies adversarial training and knowledge transfer, and it improves performance when data are insufficient. I and Q components are used as features, and AM-DSB, 8PSK, BPSK, AM-SSB, 4PAM, GFSK, 64QAM, 16QAM, WBFM and QPSK are used as modulated signals. Classification accuracy was evaluated at SNRs from −20 dB to 18 dB. The proposed model worked exceptionally well even when the amount of training data was reduced.

The merits of DL approaches are their ability to handle high-dimensional data, their robustness to changes in the data and their good classification accuracy when the dataset is large. Their demerits are long training times and high computational cost.

3 Convolutional Neural Network (CNN)

CNNs are similar to traditional ANNs in some respects, and they optimize themselves through learning. The main difference between CNNs and traditional ANNs is that CNNs are mostly suited to pattern recognition. The schematic diagram of a CNN architecture is shown in Fig. 3. CNNs comprise three types of layers: (1) convolutional layers, (2) pooling layers and (3) fully connected layers. CNN architectures are formed by stacking these three layer types and vary according to the application [29, 30].

Fig. 3 Basic CNN architecture [29]

The convolutional layer plays a key role in how CNNs operate. This layer uses learnable kernels followed by an activation function. The activation, typically the rectified linear unit (ReLU) or alternatively a sigmoid, is applied element-wise to the layer output. In the next stage, the pooling layer downsamples the activations, which reduces the number of parameters in the following layers. In the last stage, the fully connected layers generate class scores from the activations. To improve performance, ReLU can also be used between these layers.
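The layer sequence described above can be sketched as a forward pass through one convolution, a ReLU, a max-pooling step and a fully connected layer. The input size, kernel and weights are arbitrary illustrative values rather than the architecture used in this paper, and the convolution is written as the cross-correlation that most DL frameworks implement.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation of an image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; downsamples each dimension by `size`."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def dense(vector, weights, biases):
    """Fully connected layer producing one score per class."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

# Illustrative 5x5 input and 2x2 kernel (values are arbitrary).
image = [[float((r * 5 + c) % 7) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, -1.0]]
pooled = max_pool(relu(conv2d(image, kernel)))           # 5x5 -> 4x4 -> 2x2
flat = [v for row in pooled for v in row]                 # flatten to length 4
scores = dense(flat, [[0.1] * 4, [-0.1] * 4], [0.0, 0.5])  # two class scores
print(scores)
```

In a trained network the kernel and dense weights are learned from data, and several such convolution and pooling stages are usually stacked before the fully connected layers.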

4 Simulation Results

Analog modulated signals such as double-sideband amplitude modulation (DSB-AM), 4-ary PAM, frequency modulation (FM) and single-sideband amplitude modulation (SSB-AM) are considered for classification. These signals are used for training, validation and testing under different scenarios [31, 32]. For each modulation class, 10,000 frames are generated. Signal frames for the modulation classes are shown in Fig. 4. The CNN classifier is trained under imperfect conditions. Spectrograms of the modulation classes are computed using the short-time Fourier transform (STFT). Figure 5 shows the spectrograms of the modulation types.
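The spectrogram computation can be sketched with a direct STFT, written with the standard library only so that it is self-contained. The sampling rate, carrier and message frequencies, modulation index, window length and hop size are assumptions chosen for illustration, not the simulation settings of this paper.

```python
import cmath
import math

def stft_spectrogram(signal, win_len=64, hop=32):
    """Power spectrogram: list of time frames, each a list of frequency bins."""
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        row = []
        for k in range(win_len // 2 + 1):  # non-negative frequencies only
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                      for n in range(win_len))
            row.append(abs(acc) ** 2)
        frames.append(row)
    return frames

# Illustrative DSB-AM frame: 100 Hz tone on a 1 kHz carrier, sampled at 8 kHz.
fs, fc, fm, depth = 8000.0, 1000.0, 100.0, 0.5
frame = [(1 + depth * math.cos(2 * math.pi * fm * n / fs))
         * math.cos(2 * math.pi * fc * n / fs) for n in range(512)]
spec = stft_spectrogram(frame)
print(len(spec), "time frames x", len(spec[0]), "frequency bins")
```

Each spectrogram is a two-dimensional array, so it can be treated as an image and passed to a CNN. In practice an FFT-based routine would replace the direct DFT loop for speed.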

Fig. 4 Signal frames for modulation classes

Fig. 5 Spectrograms of four modulation classes

Figure 4 presents the confusion matrix of the CNN at 30 dB SNR. The test accuracies observed at training rates of 80%, 70% and 60% are 100%, 99.5% and 99.5%, respectively.
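For reference, a confusion matrix and the corresponding test accuracy can be computed from true and predicted labels as in the sketch below. The label lists are invented for illustration and do not reproduce the results reported here.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns index the predicted class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the main diagonal (correct decisions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

classes = ["DSB-AM", "SSB-AM", "FM", "4-PAM"]
# Made-up labels: one SSB-AM frame is mistaken for DSB-AM.
y_true = ["DSB-AM", "SSB-AM", "SSB-AM", "FM", "4-PAM", "FM"]
y_pred = ["DSB-AM", "SSB-AM", "DSB-AM", "FM", "4-PAM", "FM"]
cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:>7}", row)
print("accuracy:", accuracy(cm))
```

Off-diagonal entries show which classes are confused with each other, which the overall accuracy figure alone does not reveal.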

Table 1 lists the accuracy of each modulation class at different SNR values. Table 2 summarizes the test accuracies for the AMS at different training rates and SNRs.

Table 1 Accuracies of modulation class at different SNR
Table 2 Test accuracy summary for different training rates

5 Conclusion

This study first presents a comprehensive review of AMR using DL approaches. It then reports simulations for analog modulation signals using the CNN approach. The simulation results show that the proposed approach achieves 100% test accuracy at an SNR of 30 dB with an 80% training rate.