Securing Digital Audio Files Using Rotation and XOR Operations

Gaffar, Abdul

doi:10.1007/978-981-99-2229-1_38

Abdul Gaffar ORCID: orcid.org/0000-0002-3388-9067

Part of the book series: Algorithms for Intelligent Systems (AIS)

Included in the following conference series:

International Conference on Cryptology & Network Security with Machine Learning

Abstract

In this paper, we propose a WORD-oriented technique for encrypting (and decrypting) digital audio files based on rotation and XOR operations. The key concepts of the designed encryption algorithm are the RX (Rotation-XOR) operations, i.e., the plain audio samples are first left-rotated by the sum-of-digits of the previous audio samples, and then XOR-ed with the previous audio samples. The designed encryption algorithm encodes a digital audio file into a random (noise-like) audio file, from the human visual as well as the statistical points of view. Several encryption and decryption evaluation metrics, such as the oscillogram, the number of sample change rate, etc., are applied to digital audio files of varying sizes in order to empirically assess the performance and efficiency of the proposed technique. The results of these metrics validate the robustness of the designed technique.

1 Introduction

Every day, millions (perhaps billions) of messages in the form of text, audio, images, and videos are communicated over the Internet, which is an open (insecure) network. So, robust techniques are needed in order to communicate secretly. In the context of secure communication, encryption is the natural choice: it encodes a secret message into a form that is unrecognizable to everyone except the intended recipient. Broadly, there are two types of encryption schemes: symmetric-key encryption and asymmetric-key encryption. Symmetric-key encryption, also known as (a.k.a.) private-key encryption, uses the same secret key for encoding and decoding a message. The foremost application of private-key encryption is to provide confidentiality. On the other hand, asymmetric-key encryption, a.k.a. public-key encryption, uses different keys for encoding and decoding a message. In particular, the public key is used for encoding, while the private (secret) key is used for decoding a message. The foremost applications of public-key encryption are authentication and non-repudiation, besides confidentiality.

Since symmetric-key encryption methods are much faster and more efficient for attaining confidentiality than asymmetric-key encryption methods, we adopt the symmetric-key encryption method in the proposed technique. Note that the Rotation-XOR (RX) operations used in the proposed technique are primitive operations, which are efficiently and directly supported by most computer processors. These operations help to improve the speed of the designed technique.

The rest of the paper has been put in the following order: Sect. 2 provides related works; Sect. 3 gives preliminaries; Sect. 4 describes the encryption and decryption algorithms of the proposed technique; Sect. 5 describes the implementation and experimental results; Sect. 6 discusses security analyses of the proposed technique; Sect. 7 gives comparison of the proposed technique with the recent state-of-the-art techniques; and Sect. 8 concludes the paper, followed by the references.

2 Related Works

Abouelkheir and El-Sherbiny [1] in 2022 proposed a technique for the security of digital audio files based on a modified RSA (Rivest, Shamir, and Adleman) algorithm. The authors modified the RSA algorithm by using dynamic keys, to enhance the security of the technique, and five numbers (two primes and three random numbers), to enhance its speed. Several metrics have been utilized in order to validate the aims of the designed scheme. Although the scheme performs well in terms of encryption, it does not perform well in terms of decryption: it performs lossy decryption, i.e., the decrypted audio files are not exactly identical to the original audio files.

Shah et al. [2] in 2021 proposed a technique for the secure communication of digital audio files based on the finite fields. The authors generated a sequence of pseudo-random numbers via an elliptic curve, which is used to scramble the samples of the plain audio files. Further, the scrambled audio samples are substituted via the newly constructed S-boxes, to ensure the confusion–diffusion properties [3] required for a secure encryption algorithm. Faragallah and El-Sayed [4] in 2021 proposed an encryption scheme for securing the audio files based on the XOR (eXclusive OR) operation and the Hartley Transform (HT). First of all, a plain audio file is reshaped into a two-dimensional (2D) data block, and then it is XOR-ed with a grayscale image (treated as a secret key). The obtained XOR-ed blocks are then transposed via a chaotic map, followed by an optical encryption using HT. Naskar et al. [5] in 2021 suggested an encryption scheme for audio files based on the distinct key blocks together with the Piece-Wise Linear Chaotic Map (PWLCM) and the Elementary Cellular Automata (ECA). The scheme encrypts a plain audio file in three stages: cyclic shift, substitution, and scrambling. The cyclic shift is used for reducing the correlation between the samples of each audio block. The shifted audio data blocks are substituted (modified) via PWLCM, and finally, modified blocks are scrambled via ECA for better diffusion.

Abdelfatah [6] in 2020 proposed an algorithm for securing audio files in three phases utilizing three secret keys. The first phase is the self-adaptive scrambling of the plain audio files via the first secret key. The second phase is the dynamic DNA (deoxyribonucleic acid) encoding of the scrambled audio data via the second secret key. The last phase is the cipher feedback mode via the third secret key, which aids in achieving better confusion and diffusion properties.

3 Preliminaries

3.1 Digital Audio

A digital audio file, say, P is an l-by-c matrix consisting of elements called samples, where l and c denote the number of samples and the number of channels in P, respectively. If $c = 1$, then P is said to be a single (or mono) channel audio file, and if $c = 2$, then P is said to be a dual (or stereo) channel audio file. Note that the samples in P are floating-point values, i.e., real values. Figure 1 shows the oscillogram (a graph between amplitude and time) of the audio file “handel.wav”, which is of size 73113 $\times $ 1, i.e., a single-channel audio file containing 73113 samples. For other details of the audio file “handel.wav”, namely, sample rate (in Hz), duration (in seconds), bits per sample, bit rate (in kbps, i.e., 1000 bits per second), and size (in KB, i.e., 1024 bytes), see Table 1.

3.2 Rotation operation

By rotation operation, we mean “circular shift” or “bit-wise” rotation. It is of two types:

1.
Left rotation. It is denoted by “$\lll $”. By $x \lll y$, it is meant that x is left rotated by y bits. For example, if $x =$ 0001 0111 and $y = 1$, then $x \lll y$ gives 0010 1110. Figure 2a demonstrates the concept, wherein MSB is the Most Significant Bit and LSB is the Least Significant Bit.
2.
Right rotation. It is denoted by “$\ggg $”. By $x \ggg y$, it is meant that x is right rotated by y bits. For example, if $x =$ 0001 0111 and $y = 1$, then $x \ggg y$ gives 1000 1011. Figure 2b demonstrates the concept.
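The two examples above can be checked with a minimal Python sketch of bit-wise rotation on a fixed width. The function names `rotl` and `rotr` and the `width` parameter are illustrative assumptions and do not come from the paper.

```python
def rotl(x: int, y: int, width: int = 8) -> int:
    """Left-rotate the width-bit value x by y bits (circular shift)."""
    y %= width
    mask = (1 << width) - 1
    return ((x << y) | (x >> (width - y))) & mask


def rotr(x: int, y: int, width: int = 8) -> int:
    """Right-rotate the width-bit value x by y bits (circular shift)."""
    y %= width
    mask = (1 << width) - 1
    return ((x >> y) | (x << (width - y))) & mask


x = 0b00010111
print(format(rotl(x, 1), "08b"))  # 00101110
print(format(rotr(x, 1), "08b"))  # 10001011
```

Because the two rotations are inverses of each other, `rotr(rotl(x, y), y)` returns `x` for any rotation amount, which is the property the decryption algorithm relies on.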

3.3 XOR Operation

It is one of the simplest operations in a computer’s processor. It is a bit-wise operation that takes two strings of bits of equal length and performs the XOR (denoted by $\oplus $) operation as follows: if two bits are the same, the result is 0; otherwise, the result is 1. Equivalently, it is bit-wise addition modulo 2.

For example, if $a =$ 1010 1011 and $b =$ 0101 1100, then $a \oplus b =$ 1111 0111.
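As a small illustration, the example above can be reproduced in Python, where `^` is the bit-wise XOR operator. The variable names are only for illustration.

```python
a = 0b10101011
b = 0b01011100

c = a ^ b                  # bit-wise XOR
print(format(c, "08b"))    # 11110111

# XOR-ing with b a second time recovers a, which is why an XOR step
# can be undone with the same key material.
recovered = c ^ b
print(recovered == a)      # True
```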

4 Description of the Proposed Encryption and Decryption Algorithms

4.1 Preprocessing on the Audio File

Input. An audio file P of size $l \times 1$.

1.
Convert the audio samples of P from the floating point values (real values) to binary (matrix) via single-precision floating point (32-bit).^{Footnote 1}
2.
Convert the binary (matrix) to non-negative integers (bytes) array, i.e., P is of size $1 \times l$. Note that, here samples of P are in bytes (0–$2^8 - 1$).
3.
Now, if l is a multiple of 4, then no padding is required; otherwise, pad ($4 - r$) zero elements at the end of P (“post” padding), where r is the remainder on dividing l by 4.
4.
Convert the bytes of P into WORDS, where WORD is a collection of 4 bytes, and rename the audio file P as $P_w$.

Output. The audio file $P_w$ of size $1 \times m$, where m denotes the number of WORDS in $P_w$.

4.2 Reverse Preprocessing on the Audio File

Input. The audio file $P_w$ of size $1 \times m$, where m is the number of WORDS in $P_w$.

1.
Convert the WORDS of the audio file $P_w$ into bytes (0–2$^8 -$ 1), and now, the size of $P_w$ is $1 \times 4m$. Rename $P_w$ as P.
2.
Remove the “last” zero (padded) bytes, if any, from P, so that the size of P becomes $1 \times l$ bytes.
3.
Convert the bytes (non-negative integers—0–$2^8 - 1$) into a binary (matrix).
4.
Convert the binary (matrix) into the floating-point values via the single-precision floating point (32-bit).
5.
Take the transpose of P so that the size of P becomes $l \times 1$.

Output. The audio file P of size $l \times 1$.
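The preprocessing of Sect. 4.1 and its reverse can be sketched in Python as follows. This is only a sketch under stated assumptions: the byte order (big-endian) is not specified in the paper, the helper names are hypothetical, and the number of padded bytes is tracked explicitly, since blindly stripping trailing zero bytes could also remove genuine zero bytes of the data.

```python
import struct


def bytes_to_words(data: bytes):
    """Zero-pad data at the end to a multiple of 4 bytes and group it into
    32-bit WORDS. Returns (words, pad); pad is the number of padded bytes."""
    pad = (-len(data)) % 4
    padded = data + b"\x00" * pad
    words = [int.from_bytes(padded[i:i + 4], "big")   # assumed byte order
             for i in range(0, len(padded), 4)]
    return words, pad


def words_to_bytes(words, pad: int = 0) -> bytes:
    """Split WORDS back into bytes and drop the pad trailing bytes."""
    data = b"".join(w.to_bytes(4, "big") for w in words)
    return data[:len(data) - pad]


def samples_to_bytes(samples) -> bytes:
    """Encode floating-point samples as single-precision (32-bit) values."""
    return struct.pack(f">{len(samples)}f", *samples)


def bytes_to_samples(data: bytes):
    """Decode single-precision (32-bit) values back into Python floats."""
    return list(struct.unpack(f">{len(data) // 4}f", data))


words, pad = bytes_to_words(b"\x01\x02\x03\x04\x05")
print([hex(w) for w in words], pad)   # ['0x1020304', '0x5000000'] 3
```

Samples that are exactly representable in single precision (for example 0.5 or -0.25) survive the round trip `bytes_to_samples(words_to_bytes(*bytes_to_words(samples_to_bytes(s))))` unchanged.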

4.3 Preprocessing on Secret Key

Input. Secret key $K =$ {$k_1$, $k_2$, $k_3$, $k_4$, $k_5$, $k_6$, $k_7$, $k_8$, $k_9$, $k_{10}$, $k_{11}$, $k_{12}$, $k_{13}$, $k_{14}$, $k_{15}$, $k_{16}$, $k_{17}$, $k_{18}$, $k_{19}$, $k_{20}$, $k_{21}$, $k_{22}$, $k_{23}$, $k_{24}$, $k_{25}$, $k_{26}$, $k_{27}$, $k_{28}$, $k_{29}$, $k_{30}$, $k_{31}$, $k_{32}$} of 32 bytes.

1.
Split the secret key K into two equal parts, say, $K_1$ and $K_2$ as $K_1 =${$k_1$, $k_2$, $k_3$, $k_4$, $k_5$, $k_6$, $k_7$, $k_8$, $k_9$, $k_{10}$, $k_{11}$, $k_{12}$, $k_{13}$, $k_{14}$, $k_{15}$, $k_{16}$} and $K_2 =${$k_{17}$, $k_{18}$, $k_{19}$, $k_{20}$, $k_{21}$, $k_{22}$, $k_{23}$, $k_{24}$, $k_{25}$, $k_{26}$, $k_{27}$, $k_{28}$, $k_{29}$, $k_{30}$, $k_{31}$, $k_{32}$}.
2.
Convert the key-bytes of $K_1$ and $K_2$ into WORDS as $K_{1w} =${$q_{1w}$, $q_{2w}$, $q_{3w}$, $q_{4w}$}, and $K_{2w} =${$r_{1w}$, $r_{2w}$, $r_{3w}$, $r_{4w}$}, where $q_{1w} = k_1k_2k_3k_4$, $q_{2w} = k_5k_6k_7k_8$, $q_{3w} = k_9k_{10}k_{11}k_{12}$, and $q_{4w} = k_{13}k_{14}k_{15}k_{16}$; $r_{1w} = k_{17}k_{18}k_{19}k_{20}$, $r_{2w} = k_{21}k_{22}k_{23}k_{24}$, $r_{3w} = k_{25}k_{26}k_{27}k_{28}$, and $r_{4w} = k_{29}k_{30}k_{31}k_{32}$.
3.
Expansion of $K_{1w}$.
- $\blacktriangleright $ Expand $K_{1w}$ to the size m as:
  (a)
    For $i =$ 1, 2, 3, 4; $T_1[i] = K_{1w}[i]$, i.e., $T_1[1] = q_{1w}$, $T_1[2] = q_{2w}$, $T_1[3] = q_{3w}$, and $T_1[4] = q_{4w}$.
  (b)
    Calculate $T_1[5]$ as
    $$\begin{aligned} T_1[5] = \text {mod}(\lceil mean(T_1[i])\rceil \text {,} \ 2^{32})\text {,} \qquad i = \text {1, 2, 3, 4.} \end{aligned}$$
    where “mean” denotes the average function, $\lceil \cdot \rceil $ denotes the ceiling function, and “mod” denotes the modulus function.
  (c)
    Calculate $T_1[i]$, for $i =$ 6, 7, ..., m, as
    $$\begin{aligned} T_1[i] = \text {mod}(T_1[i - 1] + T_1[i - 2]\text {,}\ 2^{32})\text {,} \qquad i = \text {6, 7,} \dots \text {,}\ m{.} \end{aligned}$$
4.
Expansion of $K_{2w}$.
- $\blacktriangleright $ Expand $K_{2w}$ to the size m as:
  (a)
    For $i =$ 1, 2, 3, 4; $T_2[i] = K_{2w}[i]$, i.e., $T_2[1] = r_{1w}$, $T_2[2] = r_{2w}$, $T_2[3] = r_{3w}$, and $T_2[4] = r_{4w}$.
  (b)
    Calculate $T_2[5]$ as
    $$\begin{aligned} T_2[5] = \text {mod}(\lceil mean(T_2[i])\rceil \text {,} \ 2^{32})\text {,} \qquad i = \text {1, 2, 3, 4.} \end{aligned}$$
    where symbols have their usual meanings.
  (c)
    Calculate $T_2[i]$, for $i =$ 6, 7, ..., m, as
    $$\begin{aligned} T_2[i] = \text {mod}(T_2[i - 1] + T_2[i - 2]\text {,}\ 2^{32})\text {,} \qquad i = \text {6, 7,} \dots \text {,}\ m{.} \end{aligned}$$
5.
Generation of a third key.
- $\blacktriangleright $ Generate a third key $K_{3w}$ from $K_{1w}$ and $K_{2w}$ as:
  $$\begin{aligned} K_{3w} = \text {mod}(K_{1w} \cdot K_{2w}\text {,}\ 2^{32}) \end{aligned}$$
  where “$\cdot $” denotes component-wise multiplication.

Output. The expanded keys $T_1$ and $T_2$ of size m, and the generated key $K_{3w}$ of size 4.
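The key preprocessing steps above can be sketched in Python as follows. The function names are hypothetical, and the big-endian packing of four key bytes into one WORD is an assumption based on the concatenation $q_{1w} = k_1k_2k_3k_4$; the paper does not state the byte order explicitly.

```python
MOD = 2 ** 32


def key_to_words(part: bytes):
    """Group 16 key bytes into four 32-bit WORDS (assumed big-endian)."""
    return [int.from_bytes(part[i:i + 4], "big") for i in range(0, 16, 4)]


def expand(words, m: int):
    """Expand four key WORDS to m WORDS, following steps (a) to (c)."""
    t = list(words)                        # (a) T[1..4] are the key WORDS
    ceil_mean = -(-sum(words) // 4)        # exact integer ceiling of the mean
    t.append(ceil_mean % MOD)              # (b) T[5]
    while len(t) < m:                      # (c) T[i] = T[i-1] + T[i-2] mod 2^32
        t.append((t[-1] + t[-2]) % MOD)
    return t[:m]


def preprocess_key(key: bytes, m: int):
    """Return the expanded keys T1, T2 (m WORDS each) and K3w (4 WORDS)."""
    if len(key) != 32:
        raise ValueError("the secret key must be exactly 32 bytes")
    k1w = key_to_words(key[:16])
    k2w = key_to_words(key[16:])
    k3w = [(a * b) % MOD for a, b in zip(k1w, k2w)]   # component-wise product
    return expand(k1w, m), expand(k2w, m), k3w


# Placeholder key for illustration only; a real key should be random.
t1, t2, k3w = preprocess_key(bytes(range(1, 33)), m=8)
print(hex(t1[4]))   # 0x708090a
```

Integer arithmetic is used for the ceiling of the mean so that no precision is lost for large 32-bit WORDS, which could happen with a floating-point average.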

4.4 Encryption Algorithm

Input. An audio file P of size $l \times 1$ and the secret key K of 32-byte.

1.
Apply preprocessing on the audio file P (see Sect. 4.1), and let the obtained file be $P_w$ of size $1 \times m$.
2.
Apply preprocessing on the secret key K (see Sect. 4.3) to obtain the expanded keys $T_1$ & $T_2$ of size m, and the generated key $K_{3w}$ of size 4 (in WORDS).
3.
Initial round substitution. XOR $P_w$ with $T_1$, i.e.,
$$\begin{aligned} B[i] = P_w[i] \oplus T_1[i], \qquad i =1, 2, \ldots , m. \end{aligned}$$
4.
First round substitution.
(a)
  Let $B =${$b_1$, $b_2$, $\dots $, $b_m$}, then do the following:
  $$\begin{aligned} \begin{array}{ll} \qquad \text {for} \ i = \text {1 to}\ m \\ \qquad \quad b_{i - 1} = c_{i - 1} \\ \qquad \quad c_i = [b_i \lll \sigma (b_{i - 1})] \oplus b_{i - 1}\\ \qquad \text {end for} \end{array} \end{aligned}$$
  where $b_0 = c_0 = b_m$; “$\lll $” denotes the left rotation operator; and “$\sigma $” in $\sigma (b_{i - 1})$ denotes sum-of-digits function, and $\sigma (b_{i - 1})$ denotes sum-of-digits of $b_{i - 1}$. For instance, if $ b_{i - 1} = 123$, then $\sigma (123) = 1 + 2 + 3 = 6$.
(b)
  Let $C =${$c_1$, $c_2$, $\dots $, $c_m$}, then do the following:
  $$\begin{aligned} \begin{array}{l} C[i] = C[i] \oplus K_{3w}[i] \qquad i =\text {1, 2, 3, and} \\ C[m] = C[m] \oplus K_{3w}[4]. \end{array} \end{aligned}$$
5.
Second round substitution.
(a)
  Do the following:
  $$\begin{aligned} \begin{array}{l} \qquad \text {for} \ j = \text {1 to}\ m \\ \qquad \quad c_{j - 1} = d_{j - 1} \\ \qquad \quad d_j = [c_j \lll \sigma (c_{j - 1})] \oplus c_{j - 1}\\ \qquad \text {end for} \end{array} \end{aligned}$$
  where $d_0 = c_m$ and the remaining symbols have their usual meanings.
(b)
  Let $D =${$d_1$, $d_2$, $\dots $, $d_m$}, then do the following:
  $$\begin{aligned} E[j] = D[j] \oplus T_2[j]\text {,} \qquad j = \text {1, 2,} \dots \text {,}\ m. \end{aligned}$$
6.
Apply the reverse preprocessing on the audio file E of size $1 \times m$ (see Sect. 4.2), and let the obtained audio file be F of size $l \times 1$.

Output. The encrypted audio file F of size $l \times 1$.
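One possible reading of a single RX substitution round (steps 4(a) and 5(a)) is sketched below in Python. It assumes that each output WORD is chained to the previous output WORD, that the initial value is the last input WORD ($c_0 = b_m$), that the sum-of-digits is taken over the decimal digits, and that the rotation amount is reduced modulo 32; the last point is an assumption, since the digit sum of a 32-bit value can exceed 31 and the paper does not say how this case is handled.

```python
MASK = 0xFFFFFFFF


def digit_sum(x: int) -> int:
    """Sum of the decimal digits of x, e.g. digit_sum(123) == 6."""
    return sum(int(ch) for ch in str(x))


def rotl32(x: int, r: int) -> int:
    """Left-rotate the 32-bit WORD x by r bits (r is reduced modulo 32)."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK


def rx_round(words):
    """One RX round: c_i = (b_i <<< sigma(c_{i-1})) XOR c_{i-1}, c_0 = b_m."""
    prev = words[-1]                 # c_0 is the last input WORD
    out = []
    for w in words:
        c = rotl32(w, digit_sum(prev)) ^ prev
        out.append(c)
        prev = c                     # chain on the WORD just produced
    return out


print(digit_sum(123))      # 6
print(rx_round([1, 2]))    # [6, 134]
```

In the small example, $c_0 = 2$, so $c_1 = (1 \lll 2) \oplus 2 = 6$ and $c_2 = (2 \lll 6) \oplus 6 = 134$.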

4.5 Decryption Algorithm

Input. The encrypted audio file F of size $l \times 1$ and the secret key K (32-byte).

1.
Apply the preprocessing on the audio file F (see Sect. 4.1) to obtain an audio file E of size $1 \times m$, m being number of WORDS in E.
2.
Second round substitution.
(a)
  XOR the audio file E with $T_2$, i.e.,:
  $$\begin{aligned} D[j] = E[j] \oplus T_2[j]\text {,} \qquad j =\text {1, 2,} \dots \text {,}\ m. \end{aligned}$$
(b)
  Let $D =${$d_1$, $d_2$, ..., $d_m$}, then do the following:
  $$\begin{aligned} \begin{array}{ll} \qquad \text {for} \ j = m \ \text {to 1} \\ \qquad \quad c_j = [d_j \oplus d_{j - 1}] \ggg \sigma (d_{j - 1}) \\ \qquad \text {end for} \end{array} \end{aligned}$$
  where “$j = m$ to 1” means $j = m\text {,} \ m - 1$, ..., 2, 1; $d_0 = d_m$; and “$\ggg $” denotes the right rotation.
3.
First round substitution.
(a)
  Let $C =${$c_1$, $c_2$, ..., $c_m$}, then do the following:
  $$\begin{aligned} \begin{array}{ll} C[i] = C[i] \oplus K_{3w}[i]\text {,} \qquad i =\text {1, 2, 3, and} \\ C[m] = C[m] \oplus K_{3w}[4]. \end{array} \end{aligned}$$
(b)
  Do the following:
  $$\begin{aligned} \begin{array}{ll} \qquad \text {for} \ i = m \ \text {to 1} \\ \qquad \quad b_i = [c_i \oplus c_{i - 1}] \ggg \sigma (c_{i - 1}) \\ \qquad \text {end for} \end{array} \end{aligned}$$
  where $c_0 = c_m$ and the remaining symbols have their usual meanings.
4.
Initial round substitution. Let $B =${$b_1$, $b_2$, ..., $b_m$}, then do the following:
$$\begin{aligned} P_w[i] = B[i] \oplus T_1[i]\text {,} \qquad i ={1, 2, \ldots ,} \ m. \end{aligned}$$
5.
Apply the reverse preprocessing on the audio file $P_w$ (see Sect. 4.2) of size $1 \times m$, to obtain the audio file P of size $l \times 1$.

Output. The decrypted (original) audio file P of size $l \times 1$.
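To illustrate that the decryption steps undo the encryption steps, the following Python sketch implements both directions at the WORD level and checks the round trip. It is a sketch under the same assumptions as stated for the RX round: chaining on the previous output WORD, decimal sum-of-digits, rotation amounts reduced modulo 32, and at least four WORDS of data. All function names are hypothetical, and the key arrays used in the demonstration are arbitrary placeholders rather than the output of the key preprocessing.

```python
MASK = 0xFFFFFFFF


def digit_sum(x):
    return sum(int(ch) for ch in str(x))


def rotl32(x, r):
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK


def rotr32(x, r):
    r %= 32
    return ((x >> r) | (x << (32 - r))) & MASK


def rx_forward(words):
    """c_i = (b_i <<< sigma(c_{i-1})) XOR c_{i-1}, with c_0 = b_m."""
    prev, out = words[-1], []
    for w in words:
        prev = rotl32(w, digit_sum(prev)) ^ prev
        out.append(prev)
    return out


def rx_inverse(words):
    """Undo rx_forward, processing the WORDS from the last to the first."""
    m = len(words)
    out = [0] * m
    for i in range(m - 1, 0, -1):
        prev = words[i - 1]
        out[i] = rotr32(words[i] ^ prev, digit_sum(prev))
    prev = out[-1]                   # c_0 is the last WORD, recovered first
    out[0] = rotr32(words[0] ^ prev, digit_sum(prev))
    return out


def xor_k3(words, k3w):
    """XOR K3w into the first three WORDS and into the last WORD."""
    out = list(words)
    for i in range(3):
        out[i] ^= k3w[i]
    out[-1] ^= k3w[3]
    return out


def encrypt_words(p, t1, t2, k3w):
    b = [x ^ k for x, k in zip(p, t1)]       # initial round
    c = xor_k3(rx_forward(b), k3w)           # first round
    d = rx_forward(c)                        # second round
    return [x ^ k for x, k in zip(d, t2)]


def decrypt_words(e, t1, t2, k3w):
    d = [x ^ k for x, k in zip(e, t2)]
    c = xor_k3(rx_inverse(d), k3w)
    b = rx_inverse(c)
    return [x ^ k for x, k in zip(b, t1)]


p = [10, 20, 30, 40, 50]
t1, t2, k3w = [1, 2, 3, 4, 5], [9, 8, 7, 6, 5], [11, 22, 33, 44]
e = encrypt_words(p, t1, t2, k3w)
print(decrypt_words(e, t1, t2, k3w) == p)   # True
```

The inverse round starts from the last WORD because that WORD only depends on its already known predecessor; once it is recovered, it supplies the initial value needed to recover the first WORD.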

5 Implementation and Experimental Results

The proposed technique is implemented in MATLAB (R2021a) under the Windows 10 operating system. To evaluate the performance (encryption and decryption qualities) of the proposed technique, two test audio files of different sample lengths are taken from the MATLAB IPT (Image Processing Toolbox).^{Footnote 2} The details of these audio files are provided in Table 1. Also, the oscillograms of the original, encrypted, and decrypted audio files are shown in Fig. 3.

Table 1 Description of the test audio files

6 Security Analyses

6.1 Key Space Analysis

The set of all possible keys constitutes the key space of an encryption/decryption algorithm. The key space should be very large so that attacks such as brute-force [9], known/chosen plaintext [10], etc., become unsuccessful. Our proposed technique is based on a secret key of 32 bytes (256 bits), which gives a key space of size $2^{256}$; as of today, an exhaustive search over a key space of this size is believed to be computationally infeasible.

6.2 Encryption Evaluation Metrics

Since no single metric can fully evaluate an encryption algorithm (or an encrypted audio file), we employ two important metrics, namely, the oscillogram and the number of sample change rate.

6.2.1 Oscillogram Analysis

The oscillogram is a 2D graph of an audio file (or signal) between the amplitude and the time, generated by the oscilloscope, a.k.a. oscillograph [11, Chap. 5]. It represents the change in amplitude of an audio file over the time. X-axis represents the time in seconds, while Y-axis represents the amplitude in volts. The oscillograms of the original, encrypted, and the decrypted audio files are shown in Fig. 3.

From Fig. 3, we observe that the oscillograms of the encrypted audio files are uniform, unlike those of the corresponding original audio files. Also, the oscillograms of the decrypted audio files are identical to those of the corresponding original files. Thus, our proposed technique performs a robust encryption. Also, since the audio files are successfully decrypted without any data loss, the designed technique performs lossless decryption.

6.2.2 Number of Sample Change Rate (NSCR) Test

The NSCR [13] is used to test the resistance to the differential attack [14], or to judge Shannon’s diffusion property [3]. The NSCR score between the encrypted audio files $E_1$ and $E_2$ can be calculated via Eq. 1:

$$\begin{aligned} {} NSCR = \sum _{s=1}^{l}\dfrac{\beta (s\text {,}\ 1)}{l} \times 100\% \ \text {,} \end{aligned}$$

(1)

where $\beta (s\text {,}\ 1)$ is given by Eq. 2:

$$\begin{aligned} {} \beta (s\text {,}\ 1) = {\left\{ \begin{array}{ll} 0 \text {,} &{} \text {if} \ E_1(s\text {,}\ 1) = E_2(s\text {,}\ 1)\\ 1 \text {,} &{} \text {if} \ E_1(s\text {,}\ 1) \ne E_2(s\text {,}\ 1) \end{array}\right. } \end{aligned}$$

(2)

where $E_1(s\text {,}\ 1)$ and $E_2(s\text {,}\ 1)$ are the samples of the encrypted audio files prior to and after alteration of only one sample of the original audio file.
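Equations 1 and 2 amount to counting the positions at which the two encrypted files differ. A minimal Python sketch, with a hypothetical function name and toy input values, is:

```python
def nscr(e1, e2) -> float:
    """Percentage of sample positions at which e1 and e2 differ (Eqs. 1, 2)."""
    if len(e1) != len(e2):
        raise ValueError("both encrypted files must have the same length")
    differing = sum(1 for a, b in zip(e1, e2) if a != b)   # sum of beta(s, 1)
    return 100.0 * differing / len(e1)


# Toy example: two of the four samples differ.
print(nscr([1, 2, 3, 4], [1, 0, 3, 5]))   # 50.0
```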

We have calculated the NSCR scores by changing only one sample of the test audio files at different positions (at the beginning, i.e., the (1, 1)th sample, as well as at the end, i.e., the (l, 1)th sample), l being the total number of samples in an audio file. The obtained NSCR scores are shown in Table 2. Note that if the calculated/reported NSCR score is greater than the theoretical NSCR value, which is 99.5527% at the 0.01 significance level and 99.5693% at the 0.05 level [13], then the NSCR test is passed. The proposed technique passes the NSCR test for all the audio files, and thus ensures the property of diffusion, and also outperforms the methods listed in Table 2, which are vulnerable to the differential attack.

Table 2 NSCR scores of the encrypted audio files

6.3 Decryption Evaluation Metric

To evaluate the decryption algorithm, i.e., the decrypted audio files, we use an important metric: the mean square error.

6.3.1 Mean Square Error (MSE) Analysis

The MSE [15] is used to judge the decryption quality of any decrypted audio file. The MSE value can be any non-negative real number. The lower the MSE, the better the decryption quality; in particular, the value 0 denotes perfect decryption, i.e., the original and the decrypted audio files are exactly identical (lossless decryption). The MSE can be calculated via Eq. 3:

$$\begin{aligned} MSE = \sum _{j = 1}^{l} \dfrac{(P_j - D_j)^2}{l} \ \text {,} \end{aligned}$$

(3)

where $P_j$ and $D_j$ denote the jth samples of the original and the decrypted audio files, respectively, while the other symbols have their usual meanings.
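A minimal Python sketch of the MSE computation is given below, assuming the usual definition in which the sample differences are squared before averaging. The function name and the toy values are illustrative only.

```python
def mse(original, decrypted) -> float:
    """Mean of the squared differences between corresponding samples."""
    if len(original) != len(decrypted):
        raise ValueError("both audio files must have the same length")
    return sum((p - d) ** 2 for p, d in zip(original, decrypted)) / len(original)


print(mse([0.5, -0.25], [0.5, -0.25]))   # 0.0
print(mse([1.0, 2.0], [1.0, 4.0]))       # 2.0
```

A result of exactly 0 occurs only when every decrypted sample equals the corresponding original sample, which is the lossless case.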

The values of the MSE between the original and the decrypted audio files are provided in Table 3. From the table, we observe that the MSE values are 0 (zero), endorsing that the decrypted audio files are perfectly identical to the original audio files.

Table 3 The MSE values between the decrypted and the original audio files

6.4 Key Sensitivity Analysis

This test is utilized to judge the confusion property [3] of an encryption/decryption algorithm. According to Shannon [3], a secure cryptographic algorithm must have the confusion property to thwart statistical attacks. It is the property of confusion that hides the relationship between the encrypted data and the secret key. The sensitivity of the secret key is assessed in two aspects:

1.
Encryption. It is used to measure the dissimilarity between the two encrypted audio files $E_1$ and $E_2$ w.r.t. the same plain audio file P using two different encryption keys $\lambda _1$ and $\lambda _2$, where $\lambda _1$ and $\lambda _2$ are obtained from the original secret key K by altering merely the LSB corresponding to the last and the first bytes of K, respectively.
2.
Decryption. It is used to measure the dissimilarity between the two decrypted audio files $D_1$ and $D_2$ w.r.t. the same encrypted audio file E, encrypted via secret key K, using the decryption keys $\lambda _1$ and $\lambda _2$, respectively. Note that each of the encryption/decryption keys $\lambda _1$ and $\lambda _2$ differs from the secret key K by merely 1 bit (and hence the two keys differ from each other by 2 bits).
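The construction of the two test keys can be sketched in Python as follows. The helper names are hypothetical, and the key shown is a placeholder rather than a key used in the experiments.

```python
def flip_lsb(key: bytes, index: int) -> bytes:
    """Return a copy of key with the least significant bit of one byte flipped."""
    out = bytearray(key)
    out[index] ^= 0x01
    return bytes(out)


def bit_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


K = bytes(range(32))          # placeholder 32-byte key
lam1 = flip_lsb(K, -1)        # LSB of the last byte altered
lam2 = flip_lsb(K, 0)         # LSB of the first byte altered

print(bit_distance(K, lam1), bit_distance(K, lam2))   # 1 1
```

A key-sensitive cipher is expected to produce very different outputs for K, `lam1`, and `lam2`, even though each altered key is only one bit away from K.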

The results of the key sensitivity analysis w.r.t. the encryption (enc—in short) and decryption (dec—in short) aspects are shown in Figs. 4 and 5, respectively, whence we infer that the proposed technique has a very high bit-level sensitivity, and thus, ensures the property of confusion.

7 Comparison with the Existing Techniques

The proposed technique is compared with the recent state-of-the-art techniques based on the commonly available metrics, namely, the NSCR and the MSE. The comparisons of the proposed approach with the recent approaches based on the NSCR and the MSE metrics are provided in Tables 2 and 3, respectively. From these tables, we infer that our proposed technique performs well in terms of the respective compared metrics.

8 Conclusion

In this paper, we proposed a technique for securing digital audio files based on the WORD-oriented RX operations. Several performance evaluation metrics, i.e., encryption and decryption evaluation metrics, have been used on audio files of varying sizes from the standard database, in order to empirically assess the efficiency and robustness of the designed approach. The results of these performance evaluation metrics validate the goals of the proposed approach. Moreover, a thorough comparison with the recent state-of-the-art techniques, based on several metrics, has also been made.

Notes

1.
See [7, 8].
2.
Available in, C:$\backslash $Program Files$\backslash $Polyspace$\backslash $R2021a$\backslash $toolbox$\backslash $images$\backslash $imdata.

References

Abouelkheir E, Sherbiny SE (2022) Enhancement of speech encryption/decryption process using RSA algorithm variants. Hum-Centric Comput Inf Sci 12(6). https://doi.org/10.22967/HCIS.2022.12.006
Shah D, Shah T, Hazzazi MM, Haider MI, Aljaedi A, Hussain I (2021) An efficient audio encryption scheme based on finite fields. IEEE Access 9:144385–144394. https://doi.org/10.1109/ACCESS.2021.3119515
Shannon CE (1949) Communication theory of secrecy systems. Bell Syst Tech J 28(4):656–715. https://doi.org/10.1002/j.1538-7305.1949.tb00928.x
Faragallah OS, El-Sayed HS (2021) Secure opto-audio cryptosystem using XOR-ing mask and Hartley transform. IEEE Access 9:25437–25449. https://doi.org/10.1109/ACCESS.2021.3055738
Naskar PK, Bhattacharyya S, Chaudhuri A (2021) An audio encryption based on distinct key blocks along with PWLCM and ECA. Nonlinear Dyn 103:2019–2042. https://doi.org/10.1007/s11071-020-06164-7
Abdelfatah RI (2020) Audio encryption scheme using self-adaptive bit scrambling and two multi chaotic-based dynamic DNA computations. IEEE Access 8:69894–69907. https://doi.org/10.1109/ACCESS.2020.2987197
Available at https://in.mathworks.com/help/matlab/matlab_prog/floating-point-numbers.html. Accessed 05 Nov 2022
Available at https://en.wikipedia.org/wiki/Single-precision_floating-point_format. Accessed 05 Nov 2022
ECRYPT II yearly report on algorithms, keysizes NS (eds) (BRIS) 2011–2012. https://www.ecrypt.eu.org/ecrypt2/documents/D.SPA.20.pdf. Accessed 05 Nov 2022
Stinson DR (2006) Cryptography: theory and practice. Chapman and Hall CRC, UK
Kularatna N (2002) Digital and analogue instrumentation: testing and measurement. IET, UK. https://doi.org/10.1049/PBEL011E
Belmeguenai A, Ahmida Z, Ouchtati S, and Dejmii R (2017) A novel approach based on stream cipher for selective speech encryption. Int J Speech Technol 20:685–698. https://doi.org/10.1007/s10772-017-9439-8
Wu Y, Noonan JP, Agaian S (2011) NPCR and UACI randomness tests for image encryption. J Sel Areas Telecommun 31–38
Biham E, Shamir A (1993) Differential cryptanalysis of the data encryption standard (DES). Springer, US
Hossein PN (2014) Introduction to probability, statistics, and random processes. Kappa Research LLC, USA

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Integral University, Lucknow, 226 026, UP, India
Abdul Gaffar

Corresponding author

Correspondence to Abdul Gaffar.

Editor information

Editors and Affiliations

Applied Statistics Unit, Indian Statistical Institute, Kolkata, West Bengal, India
Bimal Kumar Roy
Department of Mathematics, Pranveer Singh Institute of Technology, Kanpur, India
Atul Chaturvedi
Department of Mathematics, Bar-Ilan University, Ramat Gan, Israel
Boaz Tsaban
Department of Mathematics, Indian Institute of Technology Jammu, Jammu, Jammu and Kashmir, India
Sartaj Ul Hasan

Cite this paper

Gaffar, A. (2024). Securing Digital Audio Files Using Rotation and XOR Operations. In: Roy, B.K., Chaturvedi, A., Tsaban, B., Hasan, S.U. (eds) Cryptology and Network Security with Machine Learning. ICCNSML 2022. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-2229-1_38

DOI: https://doi.org/10.1007/978-981-99-2229-1_38
Published: 18 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2228-4
Online ISBN: 978-981-99-2229-1
eBook Packages: Engineering, Engineering (R0)
