1 Introduction

Protecting multimedia data against unauthorized access is increasingly important [1]. E-learning relies on multimedia data such as text, audio, and video, and it faces security threats such as unauthorized access to information, theft of personal information, and tracking of online activity. Multimedia data therefore needs information security. Encryption is a cryptographic technique frequently used for this purpose, and symmetric-key ciphers such as AES, DES, IDEA, and 3DES are widely used. However, AES is not well suited to multimedia data security because of the strong correlation, high redundancy, and public nature of such data, which degrade encryption performance.

In this paper, a novel audio encryption method is proposed, based on a new Henon–Tent chaotic pseudo-random number generation algorithm. A symmetric-key approach is used in which the random number sequence acts as the secret key, generated at both the sender’s and the receiver’s end. In the encryption phase, the original audio data is XORed with the random number sequence to create the cipher audio file. In the decryption phase, the cipher audio file is decrypted with the same random number sequence. The encrypted audio file can serve as an audio password in many e-learning processes, in place of invitation links. A statistical analysis is also performed, and the encryption and decryption times of the audio file are much lower than those of other methods.

The proposed audio encryption method has several advantages. The encrypted audio file has a uniform spectrogram and histogram, which helps it resist statistical attacks, and the keyspace is large enough to resist brute-force attacks. The correlation coefficient values indicate no dependency between the original and cipher audio data, and the entropy is high. The negative SNR values and low PSNR values indicate strong encryption and a high level of noise, respectively. The high NSCR value shows that the method resists differential attacks. An audio password will provide better security than invitation links for joining online e-learning activities.

2 Preliminaries

Chaos theory describes complex, unpredictable behaviour that is highly sensitive to small changes in initial values and control parameters [2, 3]. The chaotic tent map [4, 5] is a piecewise linear, continuous map with a unique maximum, given by

$$\begin{aligned} x _{i+1}={\left\{ \begin{array}{ll} \mu *x _{i} \quad \text {if}\,\, x _{i} < 0.5 \, \\ \mu *(1-x _{i}) \quad \text {otherwise}\\ \end{array}\right. } \mu \in (0,2) \; \text {and} \; x \in (0,1) \end{aligned}$$
(1)

We take \(\mu \in [1, 2]\) and the initial value \(x_{0} = 0.4\). The Henon map is one of the most studied examples of dynamical systems [6, 7] and can be written as

$$\begin{aligned} x_{n+1}=1- a \times x_n^2+y_n \, , \, y_{n+1}=b \times x_n \end{aligned}$$
(2)

The values of a and b are set to 1.4 and 0.3. The initial point \(x_{0}\), \(y_{0}\) is set to (0.1, 0.3). The 2D Henon map and 1D Tent map are used in our encryption method to generate the pseudo-random number sequence.
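The two maps in Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `tent_map`, `henon_map`, and `orbit` are hypothetical, and the default parameters are the values quoted in this section.

```python
def tent_map(x, mu):
    """One iteration of the tent map, Eq. (1)."""
    return mu * x if x < 0.5 else mu * (1.0 - x)


def henon_map(x, y, a=1.4, b=0.3):
    """One iteration of the 2D Henon map, Eq. (2)."""
    return 1.0 - a * x * x + y, b * x


def orbit(n, x0=0.1, y0=0.3):
    """Return the first n Henon points starting from (x0, y0)."""
    points = []
    x, y = x0, y0
    for _ in range(n):
        x, y = henon_map(x, y)
        points.append((x, y))
    return points
```

Both maps are deterministic, so two parties that share the same initial values and control parameters obtain the same orbit.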

3 Literature review

A chaotic system with both confusion and diffusion techniques has been proposed to encrypt dual-channel audio data with a one-time key; the method has a large keyspace to prevent brute-force attacks [8]. The cosine number transform has been applied block by block to non-compressed 16-bit audio data and used to create a secret key [9]. A novel combination of Henon and economic maps is used to create the sequence, and confusion and diffusion are applied repeatedly to the plain audio data to compute the cipher audio data [10]. DNA coding and a chaotic system have also been used for confusion and diffusion of audio data, with the hash value of the audio used to compute the initial value of the chaotic system [11]. In another encryption approach, the audio signal is converted into data using a lifting wavelet scheme and then encrypted using a chaotic dataset and a hyperbolic function [12].

The concepts of block ciphers and chaotic maps are used to encrypt .wav files block by block. A chaotic tent map is used in the permutation step, the resulting block is XORed with a key block, and the result is substituted using a multiplicative-inverse-based substitution method [13]. A new method of audio transmission is discussed that uses self-adaptive scrambling, chaotic maps, DNA coding, and a cipher feedback mechanism; five different chaotic maps with eight control parameters are combined to create a pseudo-random number [14]. An encryption algorithm for audio data is proposed in which the chaotic circle map and modified rotation equations are used to generate the pseudo-random number [15]. An audio encryption method is proposed that permutes the audio samples using a discrete modified Henon map and then applies a substitution operation. The keystream is obtained from the modified Lorenz-Hyperchaotic system, and different quality metrics are used to evaluate the encryption algorithm [16].

A novel method for encrypting speech signals is discussed that uses multiple chaotic maps and cryptographic protocols. In the scrambling process, the input signal is divided into four segments using a cubic map. The Blowfish algorithm with a private key is used to secure all the parameters of the chaotic maps. A hashing algorithm is implemented between the sender’s and receiver’s ends to authenticate the shared data and the Blowfish key of the system. The message digest supports secure communication by providing authentication and verification of the chaotic map parameters. Several statistical tests are carried out to demonstrate the method’s efficiency [17]. A new multiuser speech encryption method has been proposed using a chaos-based cryptosystem. Chua chaotic systems are implemented at the transmitters and the receiver to produce the chaotic encryption and decryption keys. A chaotic matrix operation for randomization is combined with an XOR operation to encrypt the speech signal. The security analysis shows sensitivity to the secret keys and a keyspace large enough to resist brute-force attacks. The strong diffusion and confusion mechanisms are reported to increase the battery lifetime of the transmitter [18]. A new audio encryption scheme based on a substitution-permutation algorithm using DNA encoding has been studied. The key generation uses a key chaining mode that produces a new key block for every plain block using the chaotic logistic map. Several security attacks are performed to evaluate the system, and chosen-ciphertext, chosen-plaintext, and cycle attacks are successfully demonstrated [19].

4 Henon–Tent pseudo random number generation algorithm

The new Henon–Tent pseudo random number generation algorithm is given below.

  1. Step 1.

    Read the audio signal and save it to audiodata

  2. Step 2.

    Compute the size of the audio signal and save it to a variable s

  3. Step 3.

    If s%2 == 0 then

  4. Step 4.

    Read the parameters a = 1.3, b = 0.3, X_new = 0.1, Y_new = 0.3 and Initialize two lists X_list = [], Y_list = []

  5. Step 5.

    Loop i = 0 to s/2

  6. Step 6.

    X_new,Y_new = HenonMap (X_new,Y_new)

  7. Step 7.

    X_list.append (X_new) and Y_list.append(Y_new)

  8. Step 8.

    End Loop

  9. Step 9.

    Save list C_list = X_list+Y_list

  10. Step 10.

    Compute C_list = C_list \(\times\) \(10^{5}\), convert it to an integer sequence, and save it to HenonSeq.

  11. Step 12.

    Else

  12. Step 13.

    Go to step 1

  13. Step 14.

    End If

  14. Step 15.

    Read parameters r = 1.0, rmax = 2.0, rstep = 0.001, x = 0.4, k; initialize float array TentSeq and integer array nTentSeq

  15. Step 17.

    While k<s

  16. Step 18.

    Loop

  17. Step 19.

    x = TentMap(x,r)

  18. Step 20.

    If r\(<=\)rmax then

  19. Step 21.

    r = r+rstep

  20. Step 22.

    Else

  21. Step 23.

    r = 1.0

  22. Step 24.

    End If

  23. Step 25.

    TentSeq[k] = x and k = k+1

  24. Step 26.

    End Loop

  25. Step 27.

    TentSeq = TentSeq \(\times\) max(audiodata)

  26. Step 28.

    Convert TentSeq to integer sequence nTentSeq

  27. Step 29.

    Compute MixSeq = HenonSeq XOR nTentSeq

  28. Step 30.

    Stop.

The block diagram of this process is also given in Fig. 1.

Fig. 1 Block diagram of pseudo random number generation using Henon–Tent map
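The listed steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `henon_tent_sequence` and its argument names are hypothetical, and `peak` stands in for max(audiodata). An odd sample count raises an error instead of returning to Step 1. Every tent value is stored, and r is reset before it can exceed rmax so that the tent map stays within [0, 1]. The default a = 1.3 follows Step 4, whereas Section 2 quotes a = 1.4, so a is left as a parameter.

```python
def henon_tent_sequence(s, peak, a=1.3, b=0.3, x0=0.1, y0=0.3,
                        r0=1.0, rmax=2.0, rstep=0.001, t0=0.4):
    """Return s integers obtained by XORing a Henon and a tent sequence."""
    if s % 2 != 0:
        # Assumption: the listed steps only cover an even sample count.
        raise ValueError("sample count s must be even")

    # Henon part: s/2 iterations, x values followed by y values.
    x, y = x0, y0
    xs, ys = [], []
    for _ in range(s // 2):
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
        ys.append(y)
    henon_seq = [int(v * 10**5) for v in xs + ys]

    # Tent part: one value per sample, sweeping r from r0 towards rmax.
    r, t = r0, t0
    tent_seq = []
    while len(tent_seq) < s:
        t = r * t if t < 0.5 else r * (1.0 - t)
        # Deviation from the listing: reset r before it exceeds rmax.
        r = r + rstep if r + rstep <= rmax else r0
        tent_seq.append(t)
    n_tent_seq = [int(v * peak) for v in tent_seq]

    # Final step: element-wise XOR of the two integer sequences.
    return [h ^ n for h, n in zip(henon_seq, n_tent_seq)]
```

Because the Henon values can be negative, the resulting integers can be negative as well; how they are mapped onto the sample range of the audio format is not specified in the steps and is left open here.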

5 Audio encryption and decryption algorithm

The encryption algorithm has the following steps.

  1. Step 1.

    Read the original audio file and save it to audiodata

  2. Step 2.

    Read the pseudo random number sequence MixSeq

  3. Step 3.

    Compute audiodata XOR MixSeq and Save it to CipherAudio

  4. Step 4.

    Write CipherAudio to EncAudio.wav

The decryption algorithm has the following steps.

  1. Step 1.

    Read the EncAudio.wav file and save it to EncryptedAudiodata

  2. Step 2.

    Read the pseudo random number sequence MixSeq

  3. Step 3.

    Compute EncryptedAudiodata XOR MixSeq and save it to DecipherAudio

  4. Step 4.

    Write DecipherAudio to original audio file

Fig. 2 Block diagram of encryption and decryption process

Fig. 2 shows the block diagram of the encryption and decryption processes, in which the pseudo-random numbers generated from the mixed chaotic maps are used as the secret key.
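Since XOR is its own inverse, the same routine can serve for both phases. The sketch below illustrates this on plain lists of integer samples. The function name `xor_samples` and the sample and key values are hypothetical, and reading and writing the .wav files is omitted.

```python
def xor_samples(samples, key_seq):
    """XOR each audio sample with the matching key value."""
    if len(samples) != len(key_seq):
        raise ValueError("samples and key sequence must have equal length")
    return [s ^ k for s, k in zip(samples, key_seq)]


# Hypothetical values, for illustration only.
audiodata = [12, -7, 300, 0]
mix_seq = [5, 9, 1024, 77]

cipher_audio = xor_samples(audiodata, mix_seq)        # encryption
decipher_audio = xor_samples(cipher_audio, mix_seq)   # decryption
```

Applying the routine twice with the same key sequence returns the original samples, which is why the sender and the receiver must generate an identical MixSeq.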

6 Simulation result and security analysis

Simulation is done in SageMath 8.0 and Matlab R2016b on a 1.70 GHz Intel processor with 4 GB of RAM. The contents of the 16-bit uncompressed audio files (.wav) are given in Table 1 [20, 21].

Table 1 Content of standard audio files

The recorded audio file password_1.wav contains “154bca401”. The security analysis of these audio password files is given in the following subsections.

6.1 Key space analysis

The keyspace is the set of all possible keys. Here it is defined by the initial values and control parameters of Eqs. (1) and (2), which give four changing variables. According to the IEEE floating-point standard, the precision of a 64-bit double variable is about \(10^{-15}\). The proposed algorithm has four double variables, \(\mu\), \(x_{i}\), \(x_{n}\), and \(y_{n}\), so the keyspace is about \((10^{15})^{4} = 10^{60} \approx 2^{199.3}\). A keyspace of this size is large enough to resist brute-force attacks.

6.2 Spectrogram analysis

A spectrogram is a visual representation of how the frequency spectrum of an audio file varies with time, and it is used to analyze audio signals [14, 15]. A uniform spectrogram of the encrypted audio indicates that the audio signal has been successfully encrypted [6]. In our proposed method, the encrypted audio file has a uniform spectrogram, as shown in Fig. 3.

Fig. 3 Spectrogram of original and encrypted audio files

6.3 Histogram analysis

Histogram analysis is used to examine the distribution of sample values and to measure the quality of encrypted speech signals [15, 16]. An encrypted speech file should consist of equally probable sample values to resist statistical attacks. Fig. 4 shows that the histogram of the encrypted speech file is almost uniform, so the algorithm resists statistical attacks.

Fig. 4 Histogram of original and encrypted audio files

6.4 Correlation

The correlation coefficient between two audio files represents the dependency between their sample values. A coefficient whose absolute value lies between 0 and 0.3 is considered a weak correlation. A lower correlation indicates a good encryption method with desirable resistance properties [6].

$$\begin{aligned} \rho (A,B)=\frac{cov(A,B)}{\sigma _{A}\sigma _{B}} \end{aligned}$$
(3)

where \(cov(A,B)\) is the covariance of the two audio files A and B, and \(\sigma _{A}\) and \(\sigma _{B}\) are their standard deviations.

Table 2 Correlation analysis of standard audio files

Table 2 shows that the correlation coefficient values are close to zero or negative. So, there is no significant dependence between the original and encrypted files, and the encryption scheme has the desired resistance property.
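Eq. (3) can be evaluated directly on two equal-length sample lists. The sketch below assumes population (divide-by-n) statistics and non-constant inputs; the function name `correlation` is hypothetical.

```python
import math


def correlation(a, b):
    """Correlation coefficient of two equal-length sample lists, Eq. (3)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    sigma_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sigma_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    # Assumes neither list is constant, so both deviations are non-zero.
    return cov / (sigma_a * sigma_b)
```

Perfectly linearly related inputs give a coefficient of magnitude 1, while a well-encrypted file is expected to give a value near zero against its original.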

6.5 Signal to noise ratio

Signal to noise ratio (SNR) determines the quality of a signal. It is also used to validate the encryption algorithm’s performance [16]. The more negative the SNR value, the more powerful the algorithm [14]. This test needs both the plain and the encrypted audio file, and the SNR is calculated as

$$\begin{aligned} SNR =10\log _{10} \frac{\sum\limits_{i=1}^{n} x_{i}^{2}}{\sum\limits_{i=1}^{n}(x_{i}-y_{i})^{2}}\,(dB) \end{aligned}$$
(4)

where \(x_{i}\) and \(y_{i}\) are the corresponding sample values of the plain and encrypted audio files, and n is the number of samples.

Table 3 SNR of standard audio files

Table 3 shows that the proposed method gives negative SNR values, which indicates a powerful encryption method.
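The SNR computation can be sketched as below. The sketch assumes the usual definition in which the noise energy is the sum of squared differences between plain and encrypted samples; the function name `snr_db` is hypothetical.

```python
import math


def snr_db(plain, cipher):
    """SNR in dB between plain and encrypted samples."""
    signal = sum(x * x for x in plain)
    # Assumption: noise energy is the sum of squared sample differences.
    noise = sum((x - y) ** 2 for x, y in zip(plain, cipher))
    if noise == 0:
        return math.inf  # identical files: no noise at all
    return 10.0 * math.log10(signal / noise)
```

The value turns negative as soon as the noise energy exceeds the signal energy, which is the situation the test looks for in an encrypted file.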

6.6 Information entropy

Information entropy analysis measures the degree of uncertainty. A higher entropy value is desired to prevent statistical attacks [6, 13]. The entropy of the encrypted audio files is calculated with Eq. (5) [22], where peakval is the maximum value of the audio data and \(p(i)\) is the probability of occurrence of the value i.

$$\begin{aligned} entropy=-\sum _{i=1}^{peakval}(p(i)\log _{2}(p(i))) \end{aligned}$$
(5)

Table 4 shows that the encrypted files have higher entropy values, so they are better protected against statistical attacks.

Table 4 Entropy of standard audio files
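Eq. (5) can be estimated from the empirical frequencies of the sample values. In this sketch the probabilities p(i) are taken as relative counts, values that never occur contribute nothing to the sum, and the function name `entropy_bits` is hypothetical.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy (in bits) of the empirical sample distribution."""
    n = len(samples)
    counts = Counter(samples)
    # p(i) is estimated as count(i) / n for each value that occurs.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A file whose samples are spread evenly over many values scores high, while a constant signal scores zero.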

6.7 Peak Signal to Noise Ratio

Peak Signal to Noise Ratio (PSNR) measures the power of the clean signal relative to the power of the noise. Low PSNR values are desired, because they indicate a high level of noise in the encrypted audio files [14]. PSNR is calculated as follows:

$$\begin{aligned} PSNR=10\log _{10}\left( \frac{peakval^{2}}{MSE}\right) , \quad MSE=\frac{1}{n}\sum _{i=1}^{n}(a[i]-b[i])^{2} \end{aligned}$$
(6)

where peakval is the maximum possible value of the audio stream, MSE is the mean square error between the plain and encrypted files, and a and b represent the plain and encrypted audio files. Table 5 shows that our encryption gives low PSNR values for the encrypted audio files, so a high level of noise is present, which helps the algorithm resist attacks.

Table 5 PSNR of standard audio files
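A direct transcription of Eq. (6) might look like the sketch below. The function name `psnr_db` is hypothetical, and `peakval` is passed in by the caller because the maximum possible value depends on the audio format.

```python
import math


def psnr_db(plain, cipher, peakval):
    """PSNR in dB between plain and encrypted samples, Eq. (6)."""
    n = len(plain)
    mse = sum((a - b) ** 2 for a, b in zip(plain, cipher)) / n
    if mse == 0:
        return math.inf  # identical files: no error
    return 10.0 * math.log10(peakval ** 2 / mse)
```

The larger the mean square error between the two files, the lower the PSNR.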

6.8 Number of sample change rate

The number of sample change rate (NSCR) is used to test the robustness of encryption algorithms. The test compares the sample values of the original and encrypted audio files and reports the percentage of samples that differ; the ideal value is \(100\%\). In Eq. (7), N is the total number of samples, and \(x_{i}\) and \(y_{i}\) are the corresponding sample values of the plain and encrypted files. The value of \(D_{i}\) is 1 when \(x_{i} \ne y_{i}\) and 0 otherwise. Table 6 shows that the NSCR values are close to the ideal value, so the proposed method has a high security level.

$$\begin{aligned} NSCR=\frac{\sum\limits _{i=1}^{N} D_{i}}{N}\times 100 \% \end{aligned}$$
(7)
Table 6 NSCR of standard audio files
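Eq. (7) amounts to counting the positions at which the two files differ. A short sketch, with the hypothetical function name `nscr_percent`:

```python
def nscr_percent(plain, cipher):
    """Percentage of sample positions whose values differ, Eq. (7)."""
    changed = sum(1 for x, y in zip(plain, cipher) if x != y)  # sum of D_i
    return 100.0 * changed / len(plain)
```

For example, if three of four samples change under encryption, the function returns 75.0.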

6.9 Speed and performance analysis

Table 7 lists the encryption time in seconds for the different audio files. Table 8 compares the encryption time of other audio files with ours. The comparative analysis in Table 9 indicates that our audio encryption method performs better than the others on the security parameters considered.

Table 7 Encryption time of standard audio
Table 8 Encryption time of other audio files with ours
Table 9 Security analysis of proposed audio encryption algorithm with others

7 Conclusion and future scope

In this paper, a novel audio encryption and decryption method has been presented that uses a chaotic pseudo-random number sequence as the secret key. The experimental results indicate that the method is robust against the brute-force, statistical, and differential attacks considered. In the future, the audio password can be used as a login credential in the e-learning process to provide information security. Other file formats, such as .mp3, may be considered for the audio password in further experiments. Elliptic curve cryptography may be implemented to exchange the secret key, and the concept of a session for the audio password may be incorporated.