Securing encrypted image information in audio data

Shamsi, Zeba; Laiphrakpam, Dolendro Singh

doi:10.1007/s11042-023-14735-0

Securing encrypted image information in audio data

Published: 06 March 2023

Volume 82, pages 33063–33085, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Multimedia Tools and Applications Aims and scope Submit manuscript

Securing encrypted image information in audio data

Download PDF

352 Accesses
2 Citations
Explore all metrics

Abstract

Advances in communication technologies have fueled growth in digital data transfer. Images are one of the most often conveyed types of digital information. Crytographic techniques help create cipher but also lure attackers as it indicates secret communication. To overcome, a method for encrypting a secret image and hiding it in audio data is proposed. The Ikeda map is used to create the encrypted image, which is then hidden in the audio’s lifting wavelet transform. Various statistical experiments show that the proposed approach conceals the encrypted image while causing minimal changes to the audio. The proposed algorithm shows robustness towards noise addition or random audio crop attack by retrieving a visually perceivable image after the attack. The suggested approach outperforms the current algorithms in terms of imperceptibility and embedding capacity.

Secure Audio Cryptosystem Using Hashed Image LSB watermarking and Encryption

Article 25 October 2017

Secret Audio Messages Hiding in Images

Reversible Data Hiding for Encrypted Audios by High Order Smoothness

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

As communication technology improves, data transmission through the internet has a manifold. Cryptographic encryption techniques help in providing confidentiality by generating unintelligible data. However, this unintelligible data indicates some secret communication that lures attackers to perform various attacks to decipher the secret communication. Concealing cryptographic encryption data in some cover data helps reduce attacks. Concealing through embedding on the cover data can be carried out using spatial or frequency domain [13]. The challenges in embedding the information on cover data include high imperceptibility, embedding capacity and robustness. Increasing any of the mentioned three qualities reduces the other two qualities. The three qualities are interrelated [7]. Embedding on the spatial domain has better embedding capacity than the frequency domain, but frequency domain embedding has got better concerning imperceptibility. Various authors have developed various methods to conceal multimedia data like text, audio, and image in some other multimedia data to conceal confidential data. Some of the text hiding methods are given in [6, 10, 15, 18, 19, 30]. Karakus et al. [15] proposed a method to conceal the doctor’s comments in a medical image using Genetic Algorithm-Optimum Pixel Similarity. Karakus et al. method is compared with the classical similarity-based method and found to have more data hiding capacity. Ditta et al. [6] proposed a method to conceal personal text data using Arabic text as a carrier. The method hides the secret information using Unicode characters like Zero Width Joiner and Zero Width character. In order to achieve high security, the secret information is encrypted using bit inversion before concealing in the cover medium. Younus et al. [30] proposed a method to secure secret text by hiding it in a cover image. The secret text is compression using the Huffman technique and encrypted using the Vigenere cipher. Random blocks are selected from the cover image using the Knight tour algorithm (KTA) and the compressed, encrypted data are embedded using Exploiting Modification Direction (EMD) technique. Ren et al. [19] proposed a method to hide secret messages using Advanced Audio Coding (ACC) audio as a carrier. The method is designed to reduce the distortion in the carrier audio by combining Huffman codeword histogram, modifying quantized modify discrete cosine transformation and perceptual masking using the psychoacoustic model. Qi et al. [18] inspired by natural data hiding technique, proposed a method to securely transmit confidential data by creating a synthetic haze image that resembles some natural weather condition. The private data is embedded using the HILL cost function. Some demerits are possibilities of false during parameter estimation and binary embedding only. Rahman [10] proposed a method for securely sending confidential information for a nuclear reactor by hiding the information in the middle frequency of a DCT transformed cover image. The private data is converted into a binary sequence and every two bits are hidden in LSB1 and LSB2 of the middle frequency DCT transformed cover image. Since the direct secret data are embedded, the possibility is that an attacker recovers the confidential data if he or she knows that communication is taking place using this method. Basu et al. [3] proposed a method to hide the secret image into another cover image. A differential evolution optimization algorithm is deployed to hide the secret data constructively.

As an image is one of the most communicated data, various authors have developed methods for concealing the secret image using various multimedia covers such as images, audio or videos. Some image hiding methods are given in [1, 8, 9, 11, 17, 22, 23, 25, 28, 31, 32]. Thanki et al. [23] proposed a method to hide a secret image using another cover image. The cover image’s Ridgelet coefficient and wavelet coefficient are obtained by applying Finite Ridgelet Transform and single-level DWD, respectively. The secret image is scrambled using Arnold’s transform and embedded in the LL subband of the cover image. The drawback of the method is that Arnold’s transformation takes a lot of execution time. Valander et al. [25] proposed a method to hide a secret image using a cover image using Integer Wavelet Transform (IWT). The secret image is encrypted using a modified Logistic chaotic map to increase the security of the secret image. Banik et al. [11] proposed a robust binary image hiding technique using audio as the carrier. The image undergoes Arnold’s scrambling operation and is embedded using the cocktail party effect at the LH subband of the Discrete Wavelet Transform of the audio carrier. El-Latif et al. [1] proposed a method to hide an image half the size of the cover image using an S-box created from quantum walks. The S-box determines the position of embedding the secret image data to the cover image. Mukherjee et al. [17] proposed a multi-bit embedding method that can embed 2-6 bits with an image as a carrier or 6-13 bits with audio as a carrier. Embedding is done using pixel value differencing (PVD) on the spatial domain. Zhang [32] proposed a data hiding technique using an image as a cover. The cover image undergoes Haar IWT to avoid truncation error. The data are hidden in the edges of the LL sub-band based on multidirectional line encoding. Wahab et al. [26] proposed a method for hiding secret data on a cover image. The secret data is first encrypted using RSA (Rivest Shamir Adleman) encryption scheme and compressed using Huffman code. The encoded data is hidden using the LSB embedding technique on the LH, HL, and HH sub-band for DFT decomposed cover image. Abdulhammed et al. proposed a novel method to hide secret data into a cover image. The secret image data is stored in the edges of a cover image identified using a strong edges detection algorithm (SEDA). The position for embedding is computed using a random sequence generated using the Chirikov map. Yu et al. [31] proposed a reversible data hiding technique using audio as a carrier. The personal data is converted into novenary digits, embedded into the audio (single channel) carrier using a magic matrix, and converted into dual-channel stego-audio. The proposed method has got better signal-to-noise-ratio (SNR), and objective difference grade (ODG) than that of the method given by Xiang et al. [28]. Shafi et al. [22] proposed a method to hide data using audio as cover based on amplitude differencing. The method uses two audio covers of similar size to embed the data. The amplitude difference from the two cover audio with multiple indexes ranging from 0 to 255 (each at an interval of 16) is used to generate the stego-audio, where 4 bits of the secret data are embedded sequentially. The proposed method has good imperceptibility, but the maximum embedding ratio is just 12.5% of the cover audio and low robustness. El-Khamy et al. [8] proposed a method to hide image data in the audio signal. The image is encrypted using a random sequence generated from the Logistic map with XOR operation to increase security. The audio signal undergoes a two-level Cohen-Daubechies-Feauveau Integer to Integer lifting wavelet transform. The binary bits 0 or 1 of the encrypted image are hidden using a threshold technique in the second detail sub-band coefficients. El-Khamy et al. [9] proposed a method to hide image data into the audio signal using sample comparison with Discrete Wavelet Transform (DWT). The original image is first encrypted using the RSA encryption technique and binary data is generated from the cipher image. The position of embedding is based on a pseudo-random number generated. The cover audio is decomposed using DWD. The cipher bits are embedded by comparing the sample of DWD with a threshold value and making necessary changes to the coefficient of DWD if required based on the threshold. The method is robust to noise attacks, but it has a low embedding capacity.

Cipher images lure attackers to perform cryptanalysis as it indicates secret communication. Concealing the cipher image into a cover image while maintaining high imperceptibility, robustness and high embedding capacity are of research importance. Motivated by the various data hiding works, a method is proposed to conceal secret image information in audio data. The proposed method aims to improve the embedding capacity while maintaining high imperceptibility and robustness. The contribution of the paper includes:

1.
The secret image is converted to non-intelligible data by scrambling and encrypting using Ikeda chaotic map. The initial conditions for the Ikeda map are secretly shared using Elliptic Curve Cryptography (ECC) cryptosystem.
2.
To avoid a known-plaintext attack, the initial conditions for the Ikeda map depend on the Secure Hash Algorithm (SHA3-384) bits applied to the secret image.
3.
To avoid suspicious communication taking place to an attacker, the secret image’s intelligible data is concealed in the Lifting Wavelet Transform (LWT) of the carrier audio, maintaining a good embedding capacity with significantly less compromise on the carrier audio quality.
4.
To decrease the computation involved in random number generation using the Ikeda map, a base conversion operation is applied so that each loop generates 12 random values.

The following is the organization of this paper. In Section 3, the proposed scheme is described, which includes key exchange and keys generation for the Ikeda map, generating the scrambled-encrypted image and concealing the encrypted image into audio data along with reversing the process to get back the secret image. Section 4 shows the simulation results and the analyses of the proposed method. The conclusion is given in Section 5.

2 Ikeda map

Kensuke Ikeda pioneered the development of the Ikeda map [14]. The Ikeda map is a discrete time dynamic system given by:

$$ \begin{array}{@{}rcl@{}} x_{k+1}&=&\alpha+\beta(x_{k}Cos[t]+y_{k}Sin[t]); \\ y_{k+1}&=&\beta(x_{k}Sin[t]-y_{k}Cos[t]); \end{array} $$

(1)

where $t =c-\frac {d}{1+{x_{k}^{2}}+{y_{k}^{2}}}$, α ∈ (0,8), β ∈ (0.6,1), c ∈ (2,5), d ∈ (35,50)

Figure 1 shows the attractor, bifurcation diagram and Lyapunov characteristic exponents of the Ikeda map.

Equation 1 is iterated and plotted to depict the attractor as shown in Fig. 1a. The bifurcation plot with α ∈ (0,8) with a step size of 0.005 and β = 0.6 is shown in Fig. 1b. A bifurcation diagram helps in visually examining the chaotic behavior. From Fig. 1b, it is seen that the chaotic behavior is more ergodicity for α > 2.3. Lyapunov exponent measures the exponential separation for two minuscule close orbits with respect to variation in the control parameter. A positive Lyapunov exponent indicates that the system is chaotic. The Lyapunov exponent plot given in Fig. 1c denotes that the Ikeda system is chaotic. In order to generate a chaotic sequence for image scrambling and cipher image generation, the Ikeda map is used in the proposed scheme as it poses the desired chaotic behavior.

3 Proposed scheme

3.1 Key exchange and keys generation for Ikeda map

1
Get the 384 bits hash value (h) by applying (SHA3-384) on the input image.
2
Apply elliptic curve point multiplication operation between the hash value h and the generator G of a finite field elliptic curve given by ε_ρ : y² ≡ x³ + ax + b mod ρ to get an elliptic coordinate hG(hG_x,hG_y). hG(hG_x,hG_y) is secretly shared using ECC to the other communicating party [21].
3
hG_x and hG_y are converted into binary bits each of 384 bits.
4
Two sets of initial parameters for the Ikeda map are computed from the values of the bits from Step [3] as follows. For image scrambling, the binary bits from hG_x are used and for image encryption, the binary bits from hG_y are used. Where, x₀ = [b₁...b₆₄]₂/2⁶⁴; y₀ = [b₆₅...b₁₂₈]₂/2⁶⁴; α = 5 + [b₁₂₉...b₁₉₂]₂/2⁶⁴; β = e + [b₁₉₃...b₂₅₆]₂/(2⁶⁴ × 10); $t =c-\frac {d}{1+{x_{k}^{2}}+{y_{k}^{2}}}$; c = 3 + [b₂₅₇...b₃₂₀]₂/2⁶⁴; d = 43 + [b₃₂₁...b₃₈₄]₂/2⁶⁴; [b_i...b_i+ 63]₂ is the equivalent binary to integer conversion using 2 as base for each 64 binary bits, e = 0.7 and e = 0.9 for image scrambling and image encryption respectively.

3.2 Image scrambling

To generate a scrambled image from an colour input plain image the following steps are performed:

1
Input the plain RGB colour image and determine the image dimension (M × N).
2
The Ikeda map is run for a loop of (M × N × 3) to generate a list of X and Y sequence using the initial parameters derived from hG_x.
3
The X and Y sequence are partitioned into three parts (X_Red,X_Green,X_Blue) and (Y_Red,Y_Green,Y_Blue) for each colour channel.
4
Each list is sorted and stored as (X_RSort, X_GSort, X_BSort) and (Y_RSort, Y_GSort, Y_BSort).
5
A permutation table (P_XRed,P_XGreen,P_XBlue,P_{Y Red},P_{Y Green},P_{Y Blue}) is generated by determining the position of each values of X_Red in X_RSort, X_Green in X_GSort, X_Blue in X_BSort, Y_Red in Y_RSort, Y_Green in Y_GSort and Y_Blue in Y_BSort respectively.
6
The input image is separated into the corresponding colour channels (I_R,I_G,I_B).
7
Each I_R,I_G and I_B are partitioned vertically into M parts and vertically scrambled using permutation table (P_XRed,P_XGreen and P_XBlue respectively to generate RGB vertically scrambled images.
8
The RGB vertically scrambled images are partitioned horizontally into N parts and horizontally scrambled using permutation table (P_{Y Red}, P_{Y Green} and P_{Y Blue} respectively to generate the horizontally scrambled images.
9
The output RGB images of Step [9] are combined to form the required scrambled image (S_img).

3.3 Image encryption

To generate the cipher image from the scrambled image (S_img) the following steps are performed:

1
The Ikeda map is run for a loop of (M × N × 3)/12 to generate a list of X and Y sequence using the initial parameters derived from hG_y.
2
Each values in X and Y are converted into 6 integers as follows:
$$ \begin{array}{@{}rcl@{}} \begin{array}{l} S_{\chi}=[X_{i} \times 10^{16}]_{256}[[2...7]]\\ S_{\Psi}=[Y_{i} \times 10^{16}]_{256}[[2...7]] \end{array} \end{array} $$
(2)
where, [X_i × 10¹⁶]₂₅₆[[2..7]] and [Y_i × 10¹⁶]₂₅₆[[2..7]] are the integer to base 256 conversion generating a list of values ∈ (0 − 255) and the values from position 2 to 7 are taken.
3
The values in S_X = (sχ₁,sχ₂,...,sχ_n) and S_Y = (sΨ₁,sΨ₂,...,sΨ_n) are riffle as S = (sχ₁,sΨ₁,sχ₂,sΨ₂,...,sχ_n,sΨ_n) and used to generate a chaotic image C_img.
4
The chaotic image C_img is XORed with the scrambled image S_img to generate the encrypted image E_img.

3.4 Hiding encrypted image in audio data

To hide the encrypted image in an audio data the following steps are performed:

1
Import the audio file (.wav).
2
Extract the number of audio channel (ac), the sample rate (as) and the audio data (value ranges from -1 to 1) from the imported audio.
3
Combine all the channel (if more than one channel) and store as a single list. Apply Lifting Wavelet Transform (LWT) using Cohen-Daubechies-Feaveau (CDF) wavelet for 2 level (0,1,00,01) of refinement.
4
Generate three random integers (r₁,r₂,r₃) between 1 and the length of each list in the LWT 1, 00 and 01 refinement levels and share (r₁,r₂,r₃) secretly to the receiver.
5
The pixel values in encrypted image E_img is divided into 4 parts (P₁, P₂, P₃, P₄), where P_i are represented as triplet digits with necessary 0 padding at the left (for instance, if pixel value is 20, triplet digits= 020). P₁ and P₂ are concatenated to form P₁₂.
6
The values in P₁₂, P₃ and P₄ replaces the fifth to seventh fractional part of the real digits starting at position (r₁,r₂,r₃) of the 1, 00 and 01 Level of LWT respectively with wrapping around if needed.
7
The Lifting Wavelet Data at level 0 along with the cipher data embedded DWD in Step [6] are combined together and Inverse Wavelet Transform is applied.
8
The data in Step [7] is partitioned based upon the number of audio channel (ac) and represented as audio (.wav) using the same sample rate (as).

3.5 Extracting encrypted image information

The following operations are performed to extract the encrypted image information from the stego-audio.

1
Import the stego-audio file.
2
Extract the audio data.
3
Apply Lifting Wavelet Transform (LWT) using Cohen-Daubechies-Feaveau wavelet for 2 level (0,1,00,01) of refinement.
4
Obtain the values of (r₁,r₂,r₃).
5
Extract the values of P₁₂, P₃ and P₄ from 1, 00, 01 level of LWT (step 2) from positions (r₁,r₂,r₃) respectively with wrapping around if necessary.
6
The encrypted image E_img information is obtained by combining P₁₂, P₃ and P₄.

Block diagram for enciphering and concealing in the proposed method is shown in Fig. 2a.

3.6 Image decryption

To generate the deciphered image from the encrypted image (E_img) the following steps are performed:

1
The Ikeda map is run for a loop of (M × N × 3)/12 to generate a list of X and Y sequence using the initial parameters derived from hG_y.
2
Each values in X and Y are used to generate S_X and S_Y as given in Step 2 of Section 3.3.
3
The chaotic image C_img is generated using the same process as given in Step 3 of Section 3.3.
4
The chaotic image C_img is XORed with the encrypted image E_img to generate the deciphered scrambled image S_img.

3.7 Image descrambling

To generate a descrambled image from the scrambled image S_img the following steps are performed:

1
Import the scrambled image S_img.
2
Generate X and Y sequence using initial parameters hG_x derived from the shared elliptic coordinate hG_x.
3
The X and Y sequence are partitioned into three parts (X_Red, X_Green, X_Blue) and (Y_Red, Y_Green, Y_Blue) for each colour channel.
4
Each list is sorted and stored as (X_RSort, X_GSort, X_BSort) and (Y_RSort, Y_GSort, Y_BSort).
5
Using the same process given in Step 5 of Image scrambling, a permutation table (P_XRed,P_XGreen,P_XBlue,P_{Y Red},P_{Y Green},P_{Y Blue}) is generated.
6
An inverse permutation table (IP_XRed,IP_XGreen,IP_XBlue is generated by determining the position of each values of 1 to M in (P_XRed, P_XGreen, P_XBlue. Similarly, an inverse permutation table (IP_{Y Red}, IP_{Y Green}, IP_{Y Blue} is generated by determining the position of each values of 1 to N in (P_{Y Red}, P_{Y Green}, P_{Y Blue}).
7
The scrambled image S_img is separated into the corresponding colour channels (SI_R,SI_G,SI_B).
8
Each SI_R,SI_G and SI_B are partitioned vertically into M parts and vertically descrambled using permutation table (IP_XRed,IP_XGreen and IP_XBlue respectively to generate RGB vertically descrambled images.
9
The RGB vertically descrambled images are partitioned horizontally into N parts and horizontally descrambled using permutation table (IP_{Y Red}, IP_{Y Green} and IP_{Y Blue} respectively to generate the horizontally descrambled images.
10
The output RGB images of Step [9] are combined to form the required descrambled image (I_img).

Block diagram for deciphering and revealing the secret image in the proposed method is shown in Fig. 2b.

4 Simulation and analysis of the proposed scheme

The proposed algorithm is simulated using Wolfram Mathematica 12.3 on Fujitsu Celsius workstation with configuration Intel(R) Xeon(R) W-2133 CPU @ 3.60 GHz 32 GB RAM. The sample images used are taken from the USC-SIPI Image Database [24]. The audios are taken from BBC Sound Effects [4]. The ECC technique uses the Brainpool [16] elliptic curve parameters for key exchange. Figure (3a) shows the plain image used as input. The input image is vertically and horizontally scrambled using the image scrambling technique given in Subsection 3.2 and shown in Fig. (3b) and (c) respectively. Figure (3d) shows the cipher image generated using the proposed method. The cover audio and the stego-audio is shown as an audio plot in Fig. (3e) and (f) respectively and the corresponding spectrogram is shown in Fig. (3g) and (h). Figure (3i-l) show the recovered cipher image, deciphered descrambled image, horizontally descrambled and vertically descrambled image, respectively. The absolute difference is calculated and depicted as an image to check the difference between the input image and deciphered descrambled image, as shown in Fig. (3m). Blacker the image, the lesser the difference. Figure (3n) depicts the absolute difference in audio magnitude between the cover audio and steg-audio. Lesser the difference, the magnitude of the amplitude tends to zero. The difference between the cover audio and stego-audio is minimal and lies between + 0.0001 to − 0.0001. The encrypted image data is concealed and distributed evenly across the audio data. The PSNR and SSIM value for the cover audio and stego-audio is tabulated in Table 1.

Table 1 PSNR and SSIM of the stego-audio

Full size table

The PSNR and SSIM values of the stego-audio show that the stego-audio is very close to the cover audio, indicating high imperceptibility. The PSNR and SSIM value for the stego-audio, cipher image and the decrypted image is tabulated in Table 1. The PSNR and SSIM values of the decrypted image indicate that the original input image and the deciphered image are the same.

4.1 Embedding capacity

Embedding capacity is computed as:

$$ \begin{array}{@{}rcl@{}} \text{Embedding capacity}=\frac{\text{Size of secret data}}{\text{Size of cover data}} \times 100 \% \end{array} $$

(3)

The proposed method uses a 2200KB audio data to hide a colour image of size 768KB. The embedding capacity is 34.9%.

4.2 Noise attack

In a noise attack, random noise is induced in the stego-audio. The stego-audio is induced with noise whose magnitude ranges from (− 1,1) for a certain percentage of the stego-audio duration to determine the robustness of the proposed algorithm against noise attack. Table 2 shows the PSNR, SSIM and BER values for the decrypted images under random noise attack (Fig. 4).

Table 2 PSNR, SSIM and BER under noise attack and random cropping attack

Full size table

4.3 Random cropping attack

In a random cropping attack, certain audio signal parts are replaced by zeros. The stego-audio is applied with cropping attacks for specific durations to determine the robustness of the proposed algorithm against random cropping attacks. The random audio crop attack for (12.5%,25%,50%) on the stego audio are shown in Fig. (5a - c). The respective deciphered plain images are shown in Fig. 5d-f.

The PSNR, SSIM and BER values for the deciphered images under random cropping attack are tabulated in Table 2. The values show that the proposed method is robust against random cropping attacks and the generated deciphered images are visually perceivable.

4.4 Correlation analysis

The correlation coefficient shows how strongly two variables are related. Images are made up of pixel values arranged as a 2D matrix. The correlation coefficient analysis in cipher image helps assess the encryption algorithm’s statistical characteristic. The correlation coefficient is computed as:

$$ \begin{array}{@{}rcl@{}} \text{Correlation} = \frac{{\sum}_{i=1}^{n}(\chi_{i}-\overline{\chi})(\psi_{i}-\overline{\psi})} {\sqrt{{\sum}_{i=1}^{n}(\chi_{i}-\overline{\chi})^{2}} \sqrt{{\sum}_{i=1}^{n}(\psi_{i}-\overline{\psi})^{2}}} \end{array} $$

(4)

where n is the number of pixels under consideration. Ten thousand points are randomly selected to calculate the correlation coefficient and horizontal, vertical and diagonal (HVD) directions. The correlation coefficient ranges from -1 to 1, where values toward -1 indicate anti-correlation, values towards 0 show no correlation and values toward 1 indicate high correlation. Standard images are usually highly correlated, indicated by values tending towards 1 and cipher images are lowlily correlated, indicated by values tending towards 0, as shown in Table 3.

Table 3 Correlation coefficients for plain and cipher images

Full size table

Graphical plots of the correlation along HVD directions for plain and cipher images are shown in Fig. 6. The correlation graph is concentrated for plain images, indicating a higher correlation, while the correlation graph is dispersed for cipher images, indicating a lower correlation.

4.5 Attack analysis

The cipher-text-only attack is an attack where the adversary has access to only the cipher-text information. For the proposed model, if the adversary has access to the cipher-text, the adversary cannot decipher the plain-text as he/she does not have the correct key. Determining the key would require solving ECDLP, a challenging problem, and trying a brute force attack is impractical. A known-plaintext attack is a type of attack where the adversary has access to cipher-text and the corresponding plain-text and tries to decipher a new cipher-text. A hash value is used for the proposed method to generate the secret keys using (SHA3-384). Every different image will have a different set of secret keys. So, the proposed method will be safe from cipher-text-only and known-plaintext attacks.

4.6 Histogram analysis

A histogram plot of an image depicts the frequency of each pixel value. A standard image usually has specific intensities conglomerated to provide meaningful information. So in plotting the histogram of a plain image, uneven distribution of pixels frequency is seen as shown in Fig. 7a. A good cipher image conceals any meaningful information with a uniform distribution of pixels frequency. So in plotting the histogram of a cipher image, uniform distribution of pixels frequency is seen as shown in Fig. 7b.

4.6.1 Variance

A variance of the histogram provides the mathematical confirmation for the histogram distribution. Lower the variance, the more uniform the histogram. Higher the variance, the more uneven the histogram. Variance is given by:

$$ \begin{array}{@{}rcl@{}} Variance=\frac{1}{N^{2}} \sum\limits_{i=0}^{N-1} \sum\limits_{j=0}^{N-1} \frac{1}{2} (h_{i}-h_{j})^{2} \end{array} $$

(5)

where, N = 256, h_i and h_j are number of grey pixels. Variance for the plain and cipher image are tabulated in Table 4.

4.6.2 Maximum deviation

The maximum deviation (MD) [5] is computed as:

$$ \begin{array}{@{}rcl@{}} MD=\frac{h_{0}+h_{255}}{2}+\sum\limits_{i=1}^{255}h_{i} \end{array} $$

(6)

where, h_i is the absolute difference of the i th pixel count between the cipher and the plain image. MD measures the deviation of the cipher image from the plain image. Higher the difference better the encryption algorithm. The maximum deviation of the proposed method is tabulated in Table 4.

Table 4 Variance, maximum deviation and irregular deviation for plain and cipher images

Full size table

4.6.3 Irregular deviation

The irregular deviation (ID) [5] is computed as:

$$ \begin{array}{@{}rcl@{}} ID=\sum\limits_{i=0}^{255}(\lvert H_{i}-A_{h} \rvert) \end{array} $$

(7)

where, H_i is the absolute difference of the i th pixel count between the cipher and the plain image; A_h is the mean of the histogram values. ID measures the closeness of statistical distribution between histogram deviation and the uniform distribution. A smaller ID indicated a better encryption algorithm. Irregular deviation for the proposed method is tabulated in Table 4.

4.7 Key space and key sensitivity analysis

The proposed algorithm uses 384 bits Brainpool parameters. The sender and receiver use a 384 bits private key to share the elliptic curve coordinate hG. Solving the private key from a public key in ECC requires solving the elliptic curve discrete logarithmic problem (ECDLP). Using Brute force requires computing CO computation, where CO is the cyclic order of the given 384 bits Brainpool parameter for a given Generator G. The best-known techniques, such Baby-step-Giant step method or the Pollard’s rho method, requires $\sqrt {CO}$ steps to solve the ECDLP. $\sqrt {CO}=4.65395 \times 10^{57}$ steps are large enough to baffle the attacker. The proposed algorithm is susceptible to keys. A single bit change drastically changes the data of the output. Figure (8a-b) shows the images of the deciphered image using the correct key and another utilizing a key that is just a bit different from the original key.

4.8 Differential attack

Cryptanalysis using differential attack tries to find non-random behavior in the cipher data generated from two minimally different plain inputs. Two input images are considered that differ just by a bit to test the robustness of the proposed encryption scheme against differential attacks. The cipher images’ corresponding Number of Changing Pixel Rate (NPCR) and Unified Averaged Changed Intensity (UACI) values are analyzed. Wu et al. [27] proposed that a cipher image NPCR should be greater than the threshold criteria $N_{\alpha }^{*}$, where α is the significance level. For UACI, the value should be in the interval of $U_{\alpha }^{*-}$ and $U_{\alpha }^{*+}$. An image with dimension 512 × 512, should have $N_{\alpha }^{*}$ greater than 99.5717%, 99.5810%, 99.5893% and the interval of $U_{\alpha }^{*-}$ and $U_{\alpha }^{*+}$ of 33.3115% − 33.6156%, 33.3445% − 33.5826% and 33.3730% − 33.5541% for significance levels 0.001, 0.01 and 0.05 respectively. Figure (9a) and (b) shows the two different images that just differ by a bit and the corresponding cipher image are given in Fig. 9c and d. Figure (9e) shows the image generating through absolute pixel difference between the two cipher images given in Fig. 9c and d.

The NPCR and UACI values are computed and tabulated in Table 5. The proposed encryption scheme passes the NPCR and UACI threshold criteria $N_{\alpha }^{*}$, $U_{\alpha }^{*-}$ and $U_{\alpha }^{*+}$.

Table 5 Entropy, NPCR, UACI and avalanche effect

Full size table

4.9 Avalanche effect

The Avalanche effect is the desired property of a cryptographic algorithm where a slight change in the key or the plain image should drastically change the cipher image. The computed avalanche value for a single bit flipped in the input image and a single bit flipped in the input key is tabulated in Table 5. The avalanche values indicate that for a single bit flip in the input image or the key 50% of the cipher data got changed.

4.10 Entropy

In an image, entropy provides the randomness information on the distribution of pixels. Entropy is given by:

$$ \begin{array}{@{}rcl@{}} \text{Entropy} = -\sum\limits_{i=1}^{2^{n}-1}P(p_{i})\log_{2}\frac{1}{ P(p_{i})} \end{array} $$

(8)

where, n is the number of pixels and P(p_i) denotes the probability of pixel p_i. In an image with pixels represented using 8 bits, the theoretical maximum entropy value is $log_{2}{2}^{8}=8$. The higher the entropy values, the more uniformly the pixels are distributed. So, a cipher image tends to have entropy close to 8 compared to plain images. The Entropy value for the plain and cipher images is tabulated in Table 5.

4.11 Randomness test

Rukhin et al. [20] develop a standard statistical test suite (NIST test suite 800-22 revision 1a) that can be used to test the randomness of the cipher data. In each test a p-value is computed. A p-value greater than 0.01 suggests that the cipher data is random with the confidence of 99%. The fifteen tests are applied to each cipher image of Peppers, Lena, Splash and Baboon, and the p-values are tabulated in Table 6. The p-values indicates the encryption algorithm passes the randomness test with confidence of 99%.

Table 6 Randomness test

Full size table

4.12 Complexity of the proposed method

The proposed method consists of:

1.
Computing the hash value (SHA3-384) of the input image: O(n), where n is the number of bits in the input image.
2.
Computation of elliptic curve point multiplication: O(log₂h), where h is the hash value of the input image.
3.
Scrambling of the input image: O(N²), where N is the dimension of the image.
4.
Chaotic sequence generation: O(p), where p is the number of the pixels in the input image.
5.
Computing LWT: O(NlogN), where N is the audio data.
6.
Concealing operation: O(p), where p is the number of the pixels in the input image.

Overall, the complexity of the proposed method is O(n) + O(log₂h) + O(N²) + 2O(p) + O(NlogN).

4.13 Comparison and discussion

Performance comparison of the proposed method with other techniques of data hiding is tabulated in Table 7. The proposed method had a good PSNR value for the embedded data, which is better than the existing techniques. The SSIM value of the embedded data for the proposed method is close to 1, where SSIM 1 indicates exact similarity with the original data. The embedding capacity of the proposed method is better than other existing methods. Discussion of the compared data hiding techniques are as follows:

1.
In Ref. [30], the plain text is secure using the Vigenere cipher, and the payload is increased using the Huffman lossless compression technique. However, the Huffman dictionary table must be shared between the communicating parties for each secret message.
2.
In Ref. [10], the confidential information of the nuclear reactor is hidden in middle band DCT coefficients by replacing the LSB. The technique possesses imperceptibility but has got low embedding capacity.
3.
In Ref. [3], the confidential data is hidden in the insignificant perceptual region in the cover image determined by deploying the Differential Evolution optimization algorithm. The technique possesses imperceptibility and robustness but lacks embedding capacity.
4.
In Ref. [23], the scrambled secret image is hidden in the DWT coefficient. The technique has a high embedding capacity; however, the scrambled secret image is obtained through Arnold’s transformation, a permutation-only technique with finite cyclic order.
5.
In Ref. [25], the secret image is enciphered using a modified Logistic map and embedded in the cover image using IWT. The modified Logistic map has got better key-space, thereby increasing the security. The method possess possesses imperceptibility and high embedding capacity.
6.
In Ref. [11], the technique is blind steganography that uses both DCT and DWT to hide the secret image. Arnold’s transformation is used to scramble the pixels to increase security. The technique possesses a strong imperceptibility and robustness but has a moderate embedding capacity. The security can be improved because Arnold’s transformation is a permutation-only technique with finite cyclic order.
7.
In Ref. [1], a substitution box (S-Box) is developed using quantum walks. The secret data is expanded to the size of the cover image. The secret data is hidden in the cover image based on the S-box entries. The technique possesses a strong imperceptibility and embedding capacity but lacks robustness.
8.
In Ref. [32], the secret image is embedded in the edge obtained from 3 × 3 non-overlapping blocks on the cover image. To avoid truncation errors, IWT is deployed. The problem of overflow or underflow in embedding for pixels with values close to 0 or 255 is also handled. The technique possesses a strong imperceptibility and shows robustness to some common attacks but has low embedding capacity.
9.
In Ref. [26], the secret data is ciphered using RSA and compressed using Huffman coding, followed by embedding in the DWD LH, HL, and HH sub-bands. The method has got a moderate imperceptibility and embedding capacity. Choosing the primes of the RSA is a concern in this method. The smaller primes will generate a smaller value for the cipher, which will help better payload but is vulnerable to integer factorization attacks. Bigger primes will be secure, but the payload will decrease.
10.
In Ref. [2], the plain text is secure through the elliptic curve encryption technique. However, the chosen elliptic parameter has a small cyclic order, and the technique uses a static table for each character.
11.
In Ref. [29], the secret image is stored in the strong edges detected using SEDA, which helps in maintaining the robustness and imperceptibility, but it has low embedding capacity.

Table 7 Comparison with existing methods of data hiding

Full size table

5 Conclusion

Images are one of the most transmitted digital data where specific images are confidential. Various image encryption methods are proposed by different authors that convert the plain image to an unintelligible image that looks noisy to maintain confidentiality. From an attacker’s perceptive, these unintelligible noisy images indicate that something important is transmitted. Many such encryption schemes are cryptanalysed. Concealing the noise like encrypted data helps to baffle from cryptanalysis attack from the attacker as no clear indication is shown about the transmission of critical data. A method is proposed to safeguard the confidential image transmission by first converting the plain image into a cipher image based on scrambling and encryption operation using the Ikeda map. The cipher image is concealed in LWT audio data. Various statistical analyses are carried out to show that the technique blends the cipher image in the audio data without affecting the quality of the audio with high embedding capacity. The encryption algorithm passes the statistical and security analyses. The proposed technique shows robustness to noise attacks and random cropping attacks. Amongst the compared techniques given in Table 7, the proposed method has got the highest embedding capacity and better imperceptibility with PSNR and SSIM values of 86.33 and 0.99, respectively. The code of the proposed method can be obtained from https://github.com/Dolendro/Securing-encrypted-image-information-in-audio-dataon demand. As a future work, concealing secret data in video using natural noise [12] for key generation can be researched.

Data Availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

References

Abd EL-Latif A A, Abd-El-Atty B, Venegas-Andraca S E (2019) A novel image steganography technique based on quantum substitution boxes. Opt Laser Technol 116:92–102
Article MATH Google Scholar
Bansal R, Badal N (2022) A novel approach for dual layer security of message using steganography and cryptography. Multimed Tools Appl 81:20669–20684
Basu S, Debnath A, Basu A, et al. (2022) An image data hiding technique using differential evolution. Multimed Tools Appl 1–18
Bbc sound effects. http://bbcsfx.acropolis.org.uk. Accessed 29 June 2020
Chattopadhyay C, Sarkar B, Mukherjee D (2015) Encoding by dna relations and randomization through chaotic sequences for image encryption. arXiv preprint arXiv:1505.01795
Ditta A, Azeem M, Naseem S et al (2022) A secure and size efficient algorithm to enhance data hiding capacity and security of cover text by using unicode. Journal of King Saud University - Computer and Information Sciences 34 (5):2180–2191
Article Google Scholar
Djebbar F, Ayad B, Meraim K A, et al. (2012) Comparative study of digital audio steganography techniques. EURASIP J Audio Speech Music Process 2012(1):1–16
Article Google Scholar
El-Khamy S E, Korany N, El-sherif MH (2017a) Robust image hiding in audio based on integer wavelet transform and chaotic maps hopping. In: 2017 34Th national radio science conference (NRSC), IEEE, pp 205–212
El-Khamy SE, Korany NO, El-Sherif MH (2017b) A security enhanced robust audio steganography algorithm for image hiding using sample comparison in discrete wavelet transform domain and rsa encryption. Multimedia Tools Appl 76 (22):24091–24106
Article Google Scholar
El-Rahman SA (2018) A comparative analysis of image steganography based on dct algorithm and steganography tool to hide nuclear reactors confidential information. Comput Electr Eng 70:380–399
Article MathSciNet Google Scholar
Gupta Banik B, Bandyopadhyay S K (2018) Blind key based attack resistant audio steganography using cocktail party effect. Secur Commun Netw. 2018:1781384:1–1781384:21
Hongjun Liu A K, Xu C (2020) Color image encryption with cipher feedback and coupling chaotic map. Int J Bifurcation Chaos 30 (12):2050173(1)–2050173(14)
MathSciNet MATH Google Scholar
Hussain M, Wahab A W A, Idris Y I B, et al. (2018) Image steganography in spatial domain: a survey. Signal Process Image Commun 65:46–66
Article Google Scholar
Ikeda K, Daido H, Akimoto O (1980) Optical turbulence: Chaotic behavior of transmitted light from a ring cavity. Phys Rev Lett 45:709–712. https://doi.org/10.1103/PhysRevLett.45.709
Article Google Scholar
Karakus S, Avci E (2020) A new image steganography method with optimum pixel similarity for data hiding in medical images. Med Hypotheses 139:109691
Article Google Scholar
Lochter M, Merkle J (2010) Elliptic curve cryptography (ecc) brainpool standard curves and curve generation. https://doi.org/10.17487/RFC5639. https://www.rfc-editor.org/info/rfc5639
Paul G, Saha S K, et al. (2018) An efficient multi-bit steganography algorithm in spatial domain with two-layer security. Multimed Tools Appl 77(14):18,451–18,481
Article Google Scholar
Qi B, Yang C, Tan L, et al. (2020) A novel haze image steganography method via cover-source switching. J Vis Commun Image Represent 70:102814
Article Google Scholar
Ren Y, Cai S, Wang L (2021) Secure aac steganography scheme based on multi-view statistical distortion (sofmvd). J Inf Secur Appl 59:102863
Google Scholar
Rukhin A, Soto J, Nechvatal J, et al. (2001) A statistical test suite for random and pseudorandom number generators for cryptographic applications. Booz-allen and hamilton inc mclean va, Tech. rep.
Google Scholar
Sahasrabuddhe A, Laiphrakpam D S (2021) Multiple images encryption based on 3d scrambling and hyper-chaotic system. Inf Sci 550:252–267
Article MathSciNet MATH Google Scholar
Shafi K, Sankaranarayanan A, Prashanth G et al (2010) A novel audio steganography scheme using amplitude differencing. In: Trendz in Information Sciences & Computing (TISC2010), IEEE, pp 163–167
Thanki R, Borra S (2018) A color image steganography in hybrid FRT–DWT domain. J Inf Secur Appl 40:92–102. https://doi.org/10.1016/j.jisa.2018.03.004
Article Google Scholar
The usc-sipi image database. http://sipi.usc.edu/database. Accessed 01 July 2020
Valandar M Y, Ayubi P, Barani M J (2017) A new transform domain steganography based on modified logistic chaotic map for color images. J Inf Secur Appl 34:142–151
Google Scholar
Wahab O F A, Khalaf A A, Hussein A I, et al. (2021) Hiding data using efficient combination of rsa cryptography, and compression steganography techniques. IEEE Access 9:31805–31815
Article Google Scholar
Wu Y, Noonan J P, Agaian S, et al. (2011) Npcr and uaci randomness tests for image encryption. Cyber journals: multidisciplinary journals in science and technology. J Sel Areas Telecommun (JSAT) 1(2):31–38
Google Scholar
Xiang S, Li Z (2017) Reversible audio data hiding algorithm using noncausal prediction of alterable orders. EURASIP J Audio Speech Music Process 2017(1):1–16
Article Google Scholar
Younis A O (2022) A novel approach of steganography by using strong edge detection and chaos theory. Multimed Tools Appl 81(13):17875–17888
Article Google Scholar
Younus ZS, Hussain MK (2019) Image steganography using exploiting modification direction for compressed encrypted data
Yu H, Wang R, Dong L, et al. (2020) A high-capacity reversible data hiding scheme using dual-channel audio. IEEE Access 8:162271–162278
Article Google Scholar
Zhang H, Hu L (2019) A data hiding scheme based on multidirectional line encoding and integer wavelet transform. Signal Process Image Commun 78:331–344
Article Google Scholar

Download references

Funding

We confirm that there are no funding Sources.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Silchar, Silchar, 788010, Assam, India
Zeba Shamsi & Dolendro Singh Laiphrakpam

Authors

Zeba Shamsi
View author publications
You can also search for this author in PubMed Google Scholar
Dolendro Singh Laiphrakpam
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dolendro Singh Laiphrakpam.

Ethics declarations

Competing interests

We wish to confirm that there are no known conflicts of interest associated with this publication. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Dolendro Singh Laiphrakpam contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Shamsi, Z., Laiphrakpam, D.S. Securing encrypted image information in audio data. Multimed Tools Appl 82, 33063–33085 (2023). https://doi.org/10.1007/s11042-023-14735-0

Download citation

Received: 18 March 2022
Revised: 04 July 2022
Accepted: 04 February 2023
Published: 06 March 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s11042-023-14735-0

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Securing encrypted image information in audio data

Abstract

Similar content being viewed by others

Secure Audio Cryptosystem Using Hashed Image LSB watermarking and Encryption

Secret Audio Messages Hiding in Images

Reversible Data Hiding for Encrypted Audios by High Order Smoothness

1 Introduction

2 Ikeda map

3 Proposed scheme

3.1 Key exchange and keys generation for Ikeda map

3.2 Image scrambling

3.3 Image encryption

3.4 Hiding encrypted image in audio data

3.5 Extracting encrypted image information

3.6 Image decryption

3.7 Image descrambling

4 Simulation and analysis of the proposed scheme

4.1 Embedding capacity

4.2 Noise attack

4.3 Random cropping attack

4.4 Correlation analysis

4.5 Attack analysis

4.6 Histogram analysis

4.6.1 Variance

4.6.2 Maximum deviation

4.6.3 Irregular deviation

4.7 Key space and key sensitivity analysis

4.8 Differential attack

4.9 Avalanche effect

4.10 Entropy

4.11 Randomness test

4.12 Complexity of the proposed method

4.13 Comparison and discussion

5 Conclusion

Data Availability

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation