1 Introduction

When an enterprise is started, substantial investments are made to keep its secret data safe, ranging from buying and maintaining a server to paying the salary of a network administrator [1]. Nowadays, instead of maintaining an independent server to store an organization’s confidential data, server space in the public cloud can be hired and used [2]. Cloud computing, an emerging technology in recent times, represents a host of advanced technologies, services, marketing techniques and exciting business opportunities. Cloud computing services can be broadly grouped into three categories: Infrastructure as a Service (IaaS), Software as a Service (SaaS) and Platform as a Service (PaaS) [3]. In coherence with another emerging technology, the Internet of Things (IoT), cloud computing offers seamless and consistent connection, intelligent algorithms, applications and social networks to a wide range of devices and people [4]. The marriage of IoT with multiple technologies creates a wide range of exciting applications. One such core application, Software Defined Networking (SDN), offers a centralized control architecture that is both flexible and powerful [5]. In the midst of this dynamic progress, security and privacy concerns pose integral challenges. As the world expands to consume multimedia through IoT services, providing a basic form of enterprise security and authentication is mandatory. Though a considerable number of flaws exist in present-day web services, simple and straightforward measures go a long way in safeguarding the service [6]. To improve the security of multimedia transmission, countermeasures are taken to respond to attacks and to specify ways of recovering from them [7].

Information security is a broad branch of study that covers methods for keeping user multimedia contents safe on personal computers, in transit across a network, and in storage in cloud space. It is broadly classified into two types: encryption and information hiding. Encryption satisfies the confidentiality property of cryptography. The cipher produced by encryption is meaningless or unintelligible, which may invite suspicion [8]. If, instead, multimedia content is hidden inside some cover media, the very existence of the content is concealed, which may evade suspicion. Information hiding thus provides confidentiality and at the same time suppresses suspicion, so it is preferred over encryption. Information hiding is of two types: steganography and watermarking. Steganography is the art of sending a secret to a destination without the knowledge of others. An image, audio or video can be used as a cover for transmitting secret data in steganography. The cover medium is chosen based on the amount of secret data to be hidden and on operational simplicity. If an image is used as a cover for transmitting multimedia contents, the process is called image steganography [9].

Image steganography is done in both the spatial and frequency domains. The Least Significant Bit (LSB) based method in spatial domain steganography embeds a payload of any size into a cover medium [10]. LSB embedding may introduce some distortion in the resultant image. To avoid introducing such errors, embedding is done in the frequency domain, which provides better imperceptibility and robustness [11]. Once the image is transformed, it is divided into four sub-bands: LL, HL, LH and HH. LL is the low-frequency sub-band, and HH, LH, and HL are the high-frequency sub-bands. Embedding can be done in any of the sub-bands, but if it is done in the HH sub-band, the output stego image is not robust enough against attacks, so embedding in the HH sub-band is avoided [12].

1.1 SVM based steganography

Recently, efforts have been made to exploit the advantages of machine learning techniques such as the Support Vector Machine (SVM) for the embedding and extraction of secret images. SVM is utilized for the extraction of secret images from tampered stego images. The probability of extracting the original secret image from an attacked stego image is higher when SVM is utilized in the extraction process [13]. SVM can also be used in the wavelet domain: a relation between the statistical characteristics of the wavelet coefficients of successive data is established using SVM and is then used to extract the original secret from the tampered stego image [14].

1.2 Salient features of our approach

The proposed steganographic scheme enhances existing approaches in the following aspects:

1. The secret image is transformed using a Discrete Rajan Transform (DRT) before embedding. DRT is used to increase the security of the secret image: it performs many rounds of shifting and arithmetic operations on the secret image, which hinders steganalysis based on pattern matching or other image processing techniques.

2. DRT transformed images are subjected to Base b conversion, where \( b=2{k}^2+2k+1 \) and k ≥ 1 is the embedding parameter. This increases the embedding rate to some extent.

3. In DWT, integer pixel values are converted into floating-point values during transformation. When the inverse DWT is applied, these floating-point values may not map back to exact integer values, which introduces round-off error in the stego image. To overcome the round-off error, the Integer Wavelet Transform (IWT) is used for transformation.

4. An SVM classifier is introduced in the extraction phase, which increases the probability of recovering the original secret image even after image processing attacks on the stego images.

Because the above four techniques are combined, the resulting approach provides better security, an enhanced embedding rate, less round-off error and robustness against attacks.

We propose an adaptive steganographic scheme for providing multimedia security in the cloud. Multimedia refers to audio, image, video or text. In our proposed work, an image is considered as the multimedia content throughout the article, but the proposed method can easily be used with other types of multimedia. Prior to sending a secret image into cloud space, it is first subjected to DRT and then embedded, using the Diamond Encoding method, into an IWT transformed cover image. The resulting stego image is stored in the cloud; whenever the multimedia content is required, the stego image is downloaded from the cloud and the multimedia content is extracted using a similar process. The use of a Support Vector Machine increases the robustness of extraction, which makes it possible to recover the original secret image from a tampered stego image.

The remainder of the article is organized as follows: Section 2 summarizes similar contributions, Section 3 describes the basic concepts used in our proposed work, Section 4 explains the proposed method, Section 5 gives the detailed experimental results and analysis, and Section 6 provides a conclusion and future work.

2 Literature survey

In the earlier stages of spatial steganography, the LSBs of cover image pixels were replaced so that multimedia content could be embedded into them to produce stego images. The stego images thus produced are of low distortion and high quality [15]. Not all image pixels need to be used for embedding; the cover image may be divided into blocks and LSB embedding can be done in some of the pixels within a block [16]. If the multimedia content is too large to be embedded into a single cover image, it can be divided into many segments that are LSB embedded separately into many cover images [17]. LSB replacement does not change the odd intensity values of the image; it changes only the even values. LSB replacement therefore causes statistical changes, which may pave the way for steganalysis [18]. To overcome this problem, LSB matching is used. In this approach, before a secret bit is embedded in a pixel, the bit is compared with the pixel's LSB and, if they differ, one is either added to or subtracted from the pixel value. Because of this modification, statistical changes are minimized [19]. However, both LSB replacement and LSB matching may modify 50 % of the pixel intensity values. The LSB matching revisited approach, which is based on a block of pixels, was proposed to mitigate this problem. Using specific functions and tables, embedding is done so that almost all possible pixel modifications are utilized optimally to preserve the quality of stego images [20].

Steganography is done in the transform domain as well. A given cover image is transformed from the spatial domain into its equivalent frequency domain using a mathematical function. In the earlier stages of transform domain usage, the Fourier Transform (FT) and Discrete Cosine Transform (DCT) were used to transform multimedia content [21]. Nowadays, the Discrete Wavelet Transform (DWT) is used because of its high quality and flexibility. However, the wavelet transform converts integer pixel intensity values of the spatial domain into floating-point coefficients of the frequency domain. In the extraction stage, the reverse conversion is done from floating-point coefficients into integer pixel values, and some amount of error is introduced because of this conversion [22]. To overcome this limitation, the Integer Discrete Wavelet Transform (IDWT) was proposed [23]. Usually, the cover image alone is transformed and the multimedia content is embedded into it without any transform. To increase the security of the multimedia in the embedding algorithm, the multimedia content is also transformed and then embedded into the transformed coefficients of the cover image [24].

Enterprise Security in hardware remains a major challenge for the successful implementation of IoT in the industry. Solutions have been explored by researchers and organizations to prevent unauthenticated attacks on a device commonly termed as outside device attack [25]. Successful solutions are a critical requirement for the deployment of the fog computing environment and the Internet of Things (IoT) in the industry of Information Technology. Data science, an interdisciplinary track, allows various experts to work in cohesion and obtain meaningful information through investigation of data [26].

In [27], the multimedia content is compressed using Huffman encoding. The generated bitstream is divided into blocks of three bits each, which are converted into decimal notation. A cover image is chosen and transformed using the Haar wavelet. The multimedia content is embedded in the high-frequency components of the transformed image, as this may not reveal the existence of the secret image within it [27]. Cover images can be represented in YCbCr color space. Both the Cr component and the multimedia content are transformed using DWT. Coefficients of the multimedia content are embedded into coefficients of the Cr component of the cover image by dividing both the multimedia content and the cover image into pixel pairs of equal size and comparing them before embedding a bit [28]. The cover image can be transformed using DWT twice, with embedding done in sub-band LL2: the five most significant bits of each pixel of the multimedia content are embedded into the five least significant bits of each coefficient of LL2 [29]. Alternatively, the cover image is transformed using DWT and the multimedia content is subjected to Base b conversion, where \( b=2{k}^2+2k+1 \) and k is the embedding parameter. The Base b digits are then embedded into the coefficients of the sub-bands using the Diamond Encoding scheme, which treats a pair of pixels as an embedding unit [30].

The Discrete Rajan Transform performs a coding morphism. In DRT, a number sequence whose length is a power of two is mapped into a highly correlated sequence of the same length. It is most useful in data compression and pattern recognition [31]. If a stego image is subjected to an image processing attack, such as the addition of noise, clipping, or blurring of a portion of the image, extraction may not be accurate. With the use of a support vector machine, however, the secret can still be extracted accurately even if the stego image has been subjected to some image processing attacks [32].

3 Technical background

3.1 Integer wavelet transform(IWT)

In the Integer Wavelet Transform (IWT), a cover image undergoes a frequency domain transform to obtain detail coefficients, and those coefficients are used for classification based on their most significant bits (MSBs). As IWT prevents modifications in MSBs, it helps to extract the secret image precisely, without errors, from the stego image. IWT maps the integer value of each pixel into integer coefficients in the forward transformation. It thereby reduces the error that occurs when the coefficients are mapped back to their corresponding integer pixel values during the inverse transformation. The forward and inverse transformations of IWT are given in the following equations

$$ \begin{array}{l}{d}_{1,n}={s}_{0,2n+1}-{s}_{0,2n}\\ {s}_{1,n}={s}_{0,2n}+\left\lfloor \frac{d_{1,n}}{2}\right\rfloor \end{array} $$
(1)
$$ \begin{array}{l}{s}_{0,2n+1}={d}_{1,n}+{s}_{0,2n}\\ {s}_{0,2n}={s}_{1,n}-\left\lfloor \frac{d_{1,n}}{2}\right\rfloor \end{array} $$
(2)

where \( {s}_{i,n} \) and \( {d}_{i,n} \) are the nth low-frequency and high-frequency coefficients of the ith level wavelet, respectively. After a successful transformation, one approximation coefficient matrix and three detail coefficient matrices are obtained. Finally, the stego image is generated by concealing bits of the confidential image within the frequency coefficients of the cover media. By reversing the embedding algorithm, the original confidential image bits are recovered from the stego image by employing the inverse IWT in the extraction phase. HH matrix is first selected for embedding to provide reduced distortion, and then other detailed matrices are used for embedding subsequently if required.
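The lifting steps of Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal one-dimensional, single-level illustration, not the article's implementation; the function names are hypothetical, and a two-dimensional transform is commonly obtained by applying the same steps along rows and then columns.

```python
def iwt_forward(signal):
    """One level of the integer lifting transform of Eq. (1) on a 1-D list."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for n in range(len(signal) // 2):
        even, odd = signal[2 * n], signal[2 * n + 1]
        d = odd - even        # d_{1,n} = s_{0,2n+1} - s_{0,2n}
        s = even + d // 2     # s_{1,n} = s_{0,2n} + floor(d_{1,n} / 2)
        detail.append(d)
        approx.append(s)
    return approx, detail


def iwt_inverse(approx, detail):
    """Inverse lifting of Eq. (2); recovers the original integers exactly."""
    signal = []
    for s, d in zip(approx, detail):
        even = s - d // 2     # s_{0,2n} = s_{1,n} - floor(d_{1,n} / 2)
        odd = d + even        # s_{0,2n+1} = d_{1,n} + s_{0,2n}
        signal.extend([even, odd])
    return signal


pixels = [12, 15, 200, 198, 0, 255, 7, 7]
low, high = iwt_forward(pixels)
restored = iwt_inverse(low, high)
```

Because Python's `//` operator rounds toward negative infinity, it matches the floor in Eqs. (1) and (2) for negative detail values as well, so `restored` equals `pixels` with no round-off error.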

3.2 Discrete Rajan transform (DRT)

The Discrete Rajan Transform is an efficient and fast algorithm derived from the Decimation-In-Frequency (DIF) algorithm of the Fast Fourier Transform (FFT). Let x(n) be a number sequence of length N, where N is a power of 2; then x(n) is decomposed into two equal parts as follows

$$ g(j)=x(i)+x\left(i+\frac{N}{2}\right),\quad 0\le j<\frac{N}{2},\ 0\le i<\frac{N}{2} $$
(3)
$$ h(j)=x(i)-x\left(i-\frac{N}{2}\right),\quad 0\le j<\frac{N}{2},\ \frac{N}{2}\le i<N $$
(4)

The division process continues until no further division is possible, so the number of division levels is \( {\log}_2N \). The operators + and − denote addition and subtraction respectively. For a number sequence of length 8, the one-dimensional signal flow of DRT is illustrated in Fig. 1.

Fig. 1 One dimensional signal flow of DRT of length 8

Suppose that the length of the sequence is \( {2}^k \), where k is greater than 0; then its DRT is denoted by X(k). DRT supports isomorphism, which means it maps a domain set containing dyadic and cyclic permutations onto a range set of the form X(k)E(r), where E(r) indicates the encryption code and X(k) indicates the permutation invariant. DRT is treated as a transform because it is a one-to-one and onto correspondence, so an inverse DRT exists. The forward transform is performed by recursively dividing the input sequence and partitioning the matrix operator R as follows

$$ {A}_{N\times 1}={R}_N\times {X}_{N\times 1} $$
(5)
$$ {R}_N=\left[\begin{array}{cc}{I}_{N/2}& {I}_{N/2}\\ {}-{e}_k{I}_{N/2}& {e}_k{I}_{N/2}\end{array}\right] $$
(6)

where \( {X}_{N\times 1} \) denotes the column matrix of an input sequence of length N, \( {R}_N \) is the N × N sized R matrix, \( {I}_{N/2} \) is the identity matrix and \( {e}_k \) represents the encryption function. The value of k is determined as follows

$$ {e}_k={\left(-1\right)}^k\;\mathrm{such}\ \mathrm{that}\ \mathrm{k}=\Big\{{\displaystyle \begin{array}{cc}1& \mathrm{for}\ x\left(i+N/2\right)<x(i);0\le i\le N/2\\ {}0& \mathrm{Otherwise}\end{array}} $$
(7)

The signal x(n) can be recovered by the inverse DRT with the help of DRT coefficients and its corresponding encryption values used in the forward DRT [31].
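The recursion described above can be sketched in Python as follows. This is one possible reading, not the authors' implementation: following Eqs. (6) and (7), the lower half of each stage is taken as the absolute difference \( \left|x\left(i+N/2\right)-x(i)\right| \), and the bit k of Eq. (7) is stored as the encryption key needed for inversion. The nested key layout and the function names are assumptions made for illustration.

```python
def drt_forward(x):
    """Forward DRT sketch: returns (coefficients, keys) for len(x) a power of 2."""
    n = len(x)
    if n == 1:
        return list(x), []
    half = n // 2
    # Upper half: sums, as in Eq. (3).
    g = [x[i] + x[i + half] for i in range(half)]
    # Key bit k of Eq. (7): 1 when x(i + N/2) < x(i), else 0.
    bits = [1 if x[i + half] < x[i] else 0 for i in range(half)]
    # Lower half: e_k * (x(i + N/2) - x(i)), i.e. the absolute difference.
    h = [abs(x[i + half] - x[i]) for i in range(half)]
    g_coef, g_keys = drt_forward(g)
    h_coef, h_keys = drt_forward(h)
    return g_coef + h_coef, [bits, g_keys, h_keys]


def drt_inverse(coef, keys):
    """Inverse DRT sketch: rebuilds the sequence from coefficients and keys."""
    n = len(coef)
    if n == 1:
        return list(coef)
    half = n // 2
    bits, g_keys, h_keys = keys
    g = drt_inverse(coef[:half], g_keys)
    h = drt_inverse(coef[half:], h_keys)
    x = [0] * n
    for i in range(half):
        diff = -h[i] if bits[i] else h[i]   # signed x(i + N/2) - x(i)
        x[i] = (g[i] - diff) // 2           # sum and difference share parity
        x[i + half] = (g[i] + diff) // 2
    return x


sequence = [3, 8, 5, 6, 0, 2, 7, 6]
coefficients, key = drt_forward(sequence)
recovered = drt_inverse(coefficients, key)
```

In this sketch the coefficients alone do not determine the input: a cyclic shift of `sequence` yields the same `coefficients` but different key bits, which is why the key must accompany the coefficients for recovery.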

3.3 Support vector machine (SVM)

SVM, a binary classification algorithm, is utilized for classification and error detection. Based on statistical learning theory, SVM is trained so as to minimize an upper bound on the generalization error, that is, the prediction error on unknown or new instances. Figure 2 depicts the classes and hyperplane of the SVM classifier. Let us assume that there are m data points \( {x}_i \), where i = 1, 2, 3, …, m, with class labels \( {y}_i=\pm 1 \); the decision function f(x) is given as follows

$$ f(x)=\operatorname{sign}\left(w\cdot x+b\right) $$
(8)

Fig. 2 SVM classifier with hyperplane and classes

Various classification algorithms are available for datasets with linear classes. But for non-linear class datasets, SVM is best employed with kernel substitutions. Image processing techniques provide better results when combined with SVM classifiers [32].
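For the linear case, the decision function of Eq. (8) amounts to checking on which side of the hyperplane a point lies. The sketch below illustrates this in Python; the weight vector and bias are made-up values chosen for the example, not parameters of a trained model, and the function name is hypothetical.

```python
def linear_svm_decision(w, x, b):
    """Eq. (8): sign(w . x + b), returning the class label +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


weights = [1.0, -1.0]   # hypothetical normal vector of the hyperplane
bias = 0.0              # hypothetical offset
label_a = linear_svm_decision(weights, [2.0, 1.0], bias)   # above the line x1 = x2
label_b = linear_svm_decision(weights, [1.0, 3.0], bias)   # below the line x1 = x2
```

With a kernel substitution, the dot product is replaced by a kernel evaluation against the support vectors, which is what allows non-linearly separable classes to be handled.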

3.4 Diamond encoding

Diamond Encoding (DE) is a spatial domain-based embedding method used in the proposed hybrid transform-based image steganography to conceal secret bits in the cover media. The embedding units of DE are neighbouring pairs (p, q) of cover pixels. The image to be protected is first converted into digits of base b, where \( b=2{k}^2+2k+1 \) and k ≥ 1 is the embedding parameter (or embedding rate). The lower limit of k is estimated as follows

$$ \left\lfloor \frac{M\times N}{2}{\log}_2\left(2{k}^2+2k+1\right)\right\rfloor \ge \mid S\mid $$
(9)

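where M × N indicates the cover media size and |S| denotes the size of the secret image. After the embedding parameter is estimated, the Diamond Characteristic Value (DCV) is computed for every pixel pair (a, b) in the neighbourhood set φ(p, q). Note that the modulus b in Eq. (11) denotes the base \( 2{k}^2+2k+1 \), not the second element of the pair (a, b). The neighbourhood set and the DCV are defined by the following equations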

$$ \varphi \left(p,q\right)=\left\{\left(a,b\right)\ \Big|\ \left|a-p\right|+\left|b-q\right|\le k\right\} $$
(10)
$$ DCV\left(a,b\right)=\left(\left(2k+1\right)a+b\right)\operatorname{mod}b $$
(11)

The important property of φ(p, q) is that the DCVs of its members are all distinct, so every digit of base b corresponds to exactly one member of φ(p, q). The embedding is performed by changing the difference values of pixel pairs. As each pixel pair of the stego image holds \( {\log}_2b \) bits, the maximum payload embedding capacity of DE is \( \frac{1}{2}{\log}_2b \) bits per pixel. To conceal the digit \( {S}_b \) of a secret message within the pixel pair (p, q), the DCVs are examined to find the coordinates (p′, q′) for which DCV(p′, q′) = \( {S}_b \). The pixel pair (p, q) is then replaced by (p′, q′). This substitution may cause underflow or overflow; to overcome this problem, the values of p′ and q′ are modified into p″ and q″ as follows

$$ {p}^{{\prime\prime} }=\Big\{{\displaystyle \begin{array}{ll}{p}^{\prime }-b,& \mathrm{if}\ {p}^{\prime }>255;\\ {}{p}^{\prime }+b,& \mathrm{if}\ {p}^{\prime }<0;\end{array}} $$
(12)
$$ {q}^{{\prime\prime} }=\Big\{{\displaystyle \begin{array}{ll}{q}^{\prime }-b,& \mathrm{if}\ {q}^{\prime }>255;\\ {}{q}^{\prime }+b,& \mathrm{if}\ {q}^{\prime }<0;\end{array}} $$
(13)

In the extraction phase, DCV(p′, q′) is computed for each stego pixel pair and the digit \( {S}_b \) is recovered from it. In the DE method, the distortion introduced in the cover image during the embedding process does not exceed the embedding parameter k. If the embedding parameter is increased to embed large payloads, the resulting distortion is also large and can be detected easily by steganalysis. So the embedding parameter should be as small as possible.
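The embedding and extraction of a single base-b digit can be sketched in Python as below. The sketch works on one pair of 8-bit pixel values and follows Eqs. (10) to (13); it is an illustration under those assumptions rather than the article's implementation, and the function names are hypothetical. The second element of the pair is called `c` in the code to avoid the clash with the base b.

```python
def diamond_base(k):
    """Base b = 2k^2 + 2k + 1 for embedding parameter k."""
    return 2 * k * k + 2 * k + 1


def dcv(a, c, k):
    """Diamond Characteristic Value of Eq. (11); the modulus is the base b."""
    return ((2 * k + 1) * a + c) % diamond_base(k)


def embed_digit(p, q, digit, k):
    """Return the stego pair carrying `digit` (0 <= digit < b) in place of (p, q)."""
    b = diamond_base(k)
    for da in range(-k, k + 1):
        reach = k - abs(da)
        for dc in range(-reach, reach + 1):      # |da| + |dc| <= k, Eq. (10)
            a, c = p + da, q + dc
            if dcv(a, c, k) == digit:
                # Eqs. (12) and (13): shifting by b keeps the DCV unchanged.
                if a > 255:
                    a -= b
                elif a < 0:
                    a += b
                if c > 255:
                    c -= b
                elif c < 0:
                    c += b
                return a, c
    raise ValueError("digit must lie in the range 0..b-1")


def extract_digit(p2, q2, k):
    """The hidden digit is simply the DCV of the stego pair."""
    return dcv(p2, q2, k)


stego_pair = embed_digit(100, 200, 3, 2)
hidden = extract_digit(*stego_pair, 2)
```

The overflow correction works because \( \left(2k+1\right)b \) and b are both multiples of b, so adding or subtracting b from either pixel leaves the DCV, and hence the extracted digit, unchanged.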

3.5 Cloud computing

Flexibility and cost-effectiveness are provided through cloud computing architecture for many services on the Internet. Cloud computing has gained momentum because of the following factors: elimination of planning overhead for the novice user by providing facilities which are available on-demand, lower initial investment cost by using the infrastructure of the cloud and pay as they use, etc. The cloud gives a major boost to Information Technology as it facilitates a new model of competing over computing resources i.e., bandwidth, processing power, storage capacity and so on. Prior to submitting content into the cloud, content is encrypted using an encryption key and then uploaded into the cloud. All cloud users can access the encrypted data so long as they possess the valid key to decrypt; at the same time, unauthorized users cannot access the information, as shown in Fig. 3. According to the National Institute of Standards and Technology (NIST), there are three cloud models. Through Software as a Service (SaaS), users can make use of some software or application deployed in the cloud. The Platform as a Service (PaaS) provides computing environment for the user to deploy their application or programs. Infrastructure as a Service (IaaS) provides infrastructure to run different operating systems or sophisticated applications.

Fig. 3 Cloud architecture for secure image sharing

Even though the cloud provides many benefits, some security issues are also present. Once data is stored in the cloud, the Cloud Service Provider (CSP) has full control over the data. Either intentionally or accidentally, data may be deleted or shared with an unauthorized user. Sometimes the CSP itself may steal confidential data such as credit card numbers, Aadhaar numbers and mobile numbers. So, the CSP cannot always be trusted.

The user of the cloud can take some measures to protect the confidentiality of essential information. Before uploading data to a cloud server, encryption can be done over the data. But if the CSP wants to take secret information from the user data and if the user data are encrypted, the CSP may become suspicious due to the encryption. To avoid this issue, image steganography can be employed where multimedia contents are embedded into cover images to produce stego images. These stego images can be uploaded to the cloud server. The cloud user can download these stego images back from the server and then do an extraction operation from the stego image and produce the original multimedia contents.

4 Proposed method

In this section, a novel steganographic scheme based on the Diamond Encoding (DE) scheme and the Discrete Rajan Transform (DRT) is presented, as shown in Fig. 4. The proposed scheme embeds the coefficients of the secret image, transformed using DRT, into the coefficients of the cover image, transformed using the Integer Wavelet Transform (IWT). The inverse IWT is then applied. The stego images thus produced are stored in the cloud. When the secret image is needed, the stego image is downloaded from the cloud and is first subjected to IWT. The transformed image then goes through DE extraction. The result is subjected to the inverse DRT, which reveals the secret image. The proposed scheme reduces the distortion created in the stego image by exploiting DE.

Fig. 4 Architecture for embedding and extraction operations of our proposed steganographic approach

4.1 Embedding secret image

Let C be a cover image of size M × N and S be the secret image which needs to be embedded of size m × n. Following are steps for the embedding phase:

Pseudo code for embedding secret image in the cover image

Input: M × N sized cover image C, m × n sized secret image S, and random seeds r1, r2, and r3

Output: Stego-image C′

Step 1:

Secret image S is transformed using Discrete Rajan Transform. The transformed image is subjected to base b conversion based on the embedding parameter k.

Step 2:

First, cover image C undergoes the integer wavelet transform, which divides it into three detail sub-bands LH, HL, HH and one approximation sub-band LL. The LL, LH, and HL sub-bands are then preferred for embedding because embedding in these sub-bands produces less distortion.

Step 3:

C is split into nonoverlapping blocks of two pixels each (Block size is 2 × 1 pixel).

Step 4:

Diamond Encoding is utilized to conceal the secret image pixels within the pixels of the cover image based on the DCV values of neighbourhood sets. The locations of coefficients in LL, LH, and HL are determined by random seeds r1, r2 and r3 respectively.

Step 5:

Finally, the inverse Integer Wavelet Transform is executed to generate stego image C′.
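The choice of the embedding parameter from Eq. (9) and the base b conversion of Step 1 can be illustrated with a short Python sketch. Several points here are assumptions that the article does not state: |S| is measured in bits, the data to convert is available as a byte string, and the conversion treats that byte string as one large integer. The function names are hypothetical, and the bound follows Eq. (9) literally, without accounting for embedding in only three of the four sub-bands.

```python
import math


def min_embedding_parameter(cover_rows, cover_cols, secret_bits):
    """Smallest k >= 1 satisfying Eq. (9), with |S| given in bits (assumption)."""
    pairs = cover_rows * cover_cols / 2
    k = 1
    while math.floor(pairs * math.log2(2 * k * k + 2 * k + 1)) < secret_bits:
        k += 1
    return k


def to_base_digits(data, b):
    """Convert a byte string to base-b digits, most significant digit first."""
    value = int.from_bytes(data, "big")
    digits = []
    while value > 0:
        value, remainder = divmod(value, b)
        digits.append(remainder)
    return digits[::-1] or [0]


def from_base_digits(digits, b, length):
    """Inverse conversion; `length` restores any leading zero bytes."""
    value = 0
    for digit in digits:
        value = value * b + digit
    return value.to_bytes(length, "big")


k_small = min_embedding_parameter(512, 512, 128 * 128 * 8)
k_large = min_embedding_parameter(512, 512, 256 * 256 * 8)
payload = bytes([0, 17, 255, 3])
digits = to_base_digits(payload, 2 * k_small * k_small + 2 * k_small + 1)
restored = from_base_digits(digits, 5, len(payload))
```

Under these assumptions, and taking 8 bits per secret pixel, a 512 × 512 cover gives k = 1 for a 128 × 128 secret and k = 3 for a 256 × 256 secret.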

The problem of extracting the secret image from the stego image is treated as a binary classification problem. Embedded secret image components are associated with particular patterns of singular values. A non-linear functional relationship may exist between two singular values of the cover image and the secret image bits \( {W}_i\in \left\{0,1\right\} \). As SVM efficiently models non-linear relationships, it is a good choice for learning this relationship. Moreover, many kinds of image processing attacks upon stego images are possible, so the extraction design must address the problem of extracting from corrupted stego images. As SVM has a good ability to generalize, our scheme uses SVM for extracting secret images from stego images.

4.2 Retrieving secret image

Once the stego-image C′, the embedding parameter k, and the random seeds r1, r2, and r3 are obtained, the secret image can be readily extracted from C′. The extraction steps are listed as follows:

Pseudo code for reconstructing confidential image from stego image

Input: A stego-image C′, random seeds r1, r2, and r3 and embedding parameter k.

Output: Secret image S.

Step 1:

Stego-image C′ is initially subjected to integer wavelet transform.

Step 2:

The transformed image is partitioned into four sub-bands, namely, LL, LH, HL, and HH sub-band. The sub-bands LH, HL and LL are used for the extraction process.

Step 3:

Pixel values are extracted from subbands LL, LH, and HL according to random seeds r1, r2, and r3 respectively from the image generated in step 2. The extracted pixel values are subjected to reverse base b conversion.

Step 4:

The SVM training pattern ψ is created from the feature sets and singular values at the locations where embedding was previously performed:

\( \psi =\left\{\left({x}_i,{y}_i\right)\in {R}^N\times R\mid i=1,2,\cdots, {L}_r\right\} \)

where \( {x}_i \) indicates the feature set, \( {y}_i \) indicates the singular values and R characterizes the preferred output. The optimal solution can be modelled as follows:

\( \max \sum \limits_{i=1}^{L_r}{\alpha}_i-\frac{1}{2}\sum \limits_{i=1}^{L_r}\sum \limits_{j=1}^{L_r}{\alpha}_i{\alpha}_j{y}_i{y}_jK\left({x}_i,{x}_j\right) \)

subject to \( \sum \limits_{i=1}^{L_r}{\alpha}_i{y}_i=0,\kern0.5em 0\le {\alpha}_i\le C,\kern0.5em i=1,2,\cdots, {L}_r \)

Here, \( {\alpha}_i \) (i = 1, 2, …, \( {L}_r \)) denotes the training parameter, C represents the control parameter, and y indicates the decision function

\( y=\sum \limits_{i=1}^{L_r}{\alpha}_i{y}_iK\left({x}_i,x\right)+b \) where b ∈ R is the bias.

Step 5:

Required factors are calculated for each block

\( {x}_u^{\prime }=\left({x}_i^{\prime },{y}_i^{\prime}\right)\in {\psi}^{\prime },\mathrm{where}\ u=1,2,\cdots, {L}_r\;\mathrm{and}\;{y}_u^{\prime }=f\left({x}_u^{\prime}\right) \)

and the secret image bit is computed as

\( {W}_u^{\prime }=\left\{\begin{array}{ll}1,& \mathrm{if}\;{y}_u^{\prime }=1\\ {}0,& \mathrm{if}\;{y}_u^{\prime }=-1\end{array}\right. \)

Step 6:

The generated image is subjected to inverse DRT to produce secret image S.
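Steps 4 and 5 can be illustrated with a small Python sketch that evaluates the kernel decision function and maps its sign to a recovered bit. The article does not name the kernel, so the RBF kernel used here is an assumption, and the support vectors, labels, multipliers and bias are made-up values rather than the result of actually solving the optimization problem; the function names are hypothetical.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2); the kernel is an assumption."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decision_value(x, support_vectors, labels, alphas, bias):
    """y = sum_i alpha_i * y_i * K(x_i, x) + b, as in Step 4."""
    return sum(
        alpha * label * rbf_kernel(sv, x)
        for alpha, label, sv in zip(alphas, labels, support_vectors)
    ) + bias


def recover_bit(x, support_vectors, labels, alphas, bias):
    """Step 5: bit 1 when the predicted class is +1, bit 0 when it is -1."""
    value = decision_value(x, support_vectors, labels, alphas, bias)
    return 1 if value >= 0 else 0


# Hypothetical model: one support vector per class.
vectors = [[0.0, 0.0], [2.0, 2.0]]
classes = [-1, 1]
multipliers = [1.0, 1.0]
bit_near_positive = recover_bit([1.9, 2.1], vectors, classes, multipliers, 0.0)
bit_near_negative = recover_bit([0.1, -0.1], vectors, classes, multipliers, 0.0)
```

A feature vector that has been slightly perturbed by an attack still falls on the same side of the decision boundary in this toy model, which is the intuition behind using the classifier for robust extraction.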

5 Experimental results and analysis

In this section, the performance of our proposed method is compared with similar schemes: Nag et al. [27], Hemalatha et al. [28], Verma et al. [29] and Atawneh et al. [30]. To assess the performance of our scheme against these schemes, sixteen cover images of size 512 × 512 are considered, four of them benchmark images from the UCID dataset and twelve non-benchmark images, as shown in Fig. 5, together with two secret images of size 128 × 128 and 256 × 256, as shown in Fig. 6. Figure 7 shows the produced stego images and Fig. 8 shows the extracted secret images. The proposed scheme has been applied to practical images such as military and bio-medical images as well. However, as the research community is accustomed to the images provided in benchmark databases, the effectiveness of our approach is tested with standard images rather than practical images. The effectiveness of the steganographic system is determined by the imperceptibility, structural similarity, strength and robustness of the stego images produced by our proposed scheme. These four characteristics are discussed in Sections 5.1 through 5.4.

Fig. 5 Cover images of size 512 × 512

Fig. 6 Original Secret images

Fig. 7 Stego images

Fig. 8 Extracted Secret images

5.1 Imperceptibility

Imperceptibility is the measure of the quality of produced stego images. The amount of distortion occurring in stego images is measured using PSNR. It is computed using the following formula:

$$ \mathrm{PSNR}=10\ {\log}_{10}\left(\frac{255^2}{\mathrm{MSE}}\right) $$
(14)

Where MSE denotes the mean squared error between cover and stego images determined by the following formula:

$$ \mathrm{MSE}=\frac{\sum_{p=0}^{m-1}\ {\sum}_{q=0}^{n-1}{\left(x\left(p,q\right)-{x}^{\prime}\left(p,q\right)\right)}^2}{m\times n} $$
(15)
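Eqs. (14) and (15) translate directly into code. The following Python sketch assumes 8-bit grayscale images stored as lists of rows; the function names are hypothetical, and returning infinity for identical images is a convention chosen here, since the MSE is then zero.

```python
import math


def mse(cover, stego):
    """Mean squared error of Eq. (15) between two equally sized images."""
    rows, cols = len(cover), len(cover[0])
    total = sum(
        (cover[p][q] - stego[p][q]) ** 2
        for p in range(rows)
        for q in range(cols)
    )
    return total / (rows * cols)


def psnr(cover, stego):
    """PSNR of Eq. (14) in dB for 8-bit images (peak value 255)."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf   # identical images: no distortion
    return 10 * math.log10(255 ** 2 / error)


cover_image = [[10, 20], [30, 40]]
stego_image = [[10, 21], [30, 40]]   # one pixel changed by one grey level
quality = psnr(cover_image, stego_image)
```

Since the PSNR depends only on the MSE, smaller pixel modifications during embedding lead to higher PSNR values.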

Table 1 lists PSNR values produced by our scheme for stego images of 128 × 128 and 256 × 256. Average PSNR values for the image of size 128 × 128 and 256 × 256 are 53.52 and 50.43 respectively. Figure 9 shows the PSNR values chart produced by our scheme for sixteen images. From the figure, it is shown that our scheme produces stego images of good quality, higher than the acceptable PSNR value of similar schemes.

Table 1 PSNR of our scheme for sixteen test images
Fig. 9 PSNR values of sixteen test images

Table 2 shows the PSNR of the stego images for the sixteen cover images produced by our scheme and by the other schemes with the secret image of size 128 × 128. Figure 10 shows the comparison of PSNR for secret image1. The average PSNR is 48.31 for Nag et al., 46.27 for Hemalatha et al. and 38.24 for Verma et al. These three schemes produce images of lower quality, whereas the average PSNR of the Atawneh et al. scheme is 52.37, which is better than the previous three. Our scheme achieves 53.52, which is better than the other four schemes. Table 3 shows the PSNR of the stego images for the sixteen cover images produced by our scheme and by the other schemes with the secret image of size 256 × 256. Figure 11 shows the comparison of PSNR for secret image2. The average PSNR is 44.36 for Nag et al., 42.35 for Hemalatha et al. and 34.80 for Verma et al. Again these three schemes produce images of lower quality, whereas the average PSNR of the Atawneh et al. scheme is 49.23, which is better than the previous three. Our scheme achieves 50.43, which is better than the other four schemes. For secret images of both sizes, 128 × 128 and 256 × 256, the PSNR values of our scheme are greater than those of the other schemes, which indicates that our scheme produces stego images of high quality.

Table 2 Comparison of PSNR of Secret image1 of our scheme with other schemes
Fig. 10 Comparison of PSNR values for secret image1

Table 3 Comparison of PSNR of Secret image2 of our scheme with other schemes
Fig. 11 Comparison of PSNR values for secret image2

5.2 Structural similarity

Structural similarity index measure (SSIM) is a human visual system-based quality measure to assess image distortion in structural information. This is measured by the following formula:

$$ \mathrm{SSIM}\ \left(\ \mathrm{X},\mathrm{Y}\right)=\frac{\left(2{\upmu}_{\mathrm{X}}{\upmu}_{\mathrm{Y}}+{\mathrm{C}}_1\right)\left(2{\upsigma}_{\mathrm{X}\mathrm{Y}}+{\mathrm{C}}_2\right)}{\left({\upmu}_{\mathrm{X}}^2+{\upmu}_{\mathrm{Y}}^2+{\mathrm{C}}_1\right)\left(\ {\upsigma}_{\mathrm{X}}^2+{\upsigma}_{\mathrm{Y}}^2+{\mathrm{C}}_2\right)} $$
(16)

A hundred pairs of input cover images and output stego images are considered to test similarity for our scheme as well as the other four schemes. If two images are structurally identical, the SSIM value equals 1; otherwise, it lies between 0 and 1. Table 4 lists the SSIM values produced by our scheme for sixteen images with secret images of size 128 × 128 and 256 × 256. The SSIM values of all sixteen stego images for both secret images are plotted in Fig. 12. The SSIM values of the stego images produced for the secret image of size 128 × 128 are closer to 1 than those for the secret image of size 256 × 256. Table 5 shows the average SSIM produced by our scheme as well as by the other four schemes. From a human visual system point of view, our scheme produces a similarity of around 98.89% for the image of size 128 × 128 and 98.78% for the image of size 256 × 256, which is slightly higher than the other schemes: 97.26% and 96.46% for Nag et al., 97.89% and 97.33% for Hemalatha et al., 96.44% and 96.94% for Verma et al., and 98.62% and 98.48% for Atawneh et al. Thus, the results indicate that our scheme produces visually similar stego images.
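Eq. (16) can be evaluated directly from the image statistics. The sketch below is a simplified illustration that treats the whole image as a single window, whereas practical SSIM implementations usually average the index over local windows. The constants C_1 = (0.01 × 255)² and C_2 = (0.03 × 255)² are the commonly used defaults and are an assumption here, as are the function name `ssim_global` and the sample images.

```python
def ssim_global(x, y, peak=255, k1=0.01, k2=0.03):
    """Single-window SSIM of Eq. (16) for two equally sized grayscale images."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((p - mu_y) ** 2 for p in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (k1 * peak) ** 2  # assumed default, stabilises the luminance term
    c2 = (k2 * peak) ** 2  # assumed default, stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical sample data, not taken from the experiments.
cover = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
stego = [[52, 56, 61], [59, 78, 61], [76, 62, 60]]
print(ssim_global(cover, cover))  # identical images
print(ssim_global(cover, stego))  # slightly distorted copy
```

Identical images give a value of 1, and any distortion lowers the value, which matches the interpretation used for Tables 4 and 5.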

Table 4 SSIM of our scheme for sixteen test images
Fig. 12 SSIM values for all sixteen stego images for both secret images

Table 5 Comparison of SSIM of our scheme with other four schemes

5.3 Resistance strength to steganalysis

This section evaluates the resistance of the produced stego images to steganalysis. A classifier is used to assess our scheme. For the classifier, one thousand arbitrary images of size 512 × 512 were used as cover images. All the cover images were subjected to embedding using our scheme and the other four schemes, producing five sets of stego images. The stego images from each set were steganalysed separately. The results of this test are given in Table 6. The failure rate (test error) of the classifier on the set of stego images produced by our scheme is 42.18% for 128 × 128 and 30.42% for 256 × 256 secret images. This is superior to the other four schemes. The use of the Discrete Rajan Transform increases the strength of the stego images. The comparison is shown in Fig. 13. Thus, the results indicate that our scheme produces stego images that are strong enough to withstand steganalytic attacks.

Table 6 Comparison of Steganalysis result (Test error) of ours and other schemes
Fig. 13 Comparison of the failure rate of our scheme with the other schemes

A steganalyser may also compare the histogram of the stego image with that of its original cover image (in case the cover image is taken from a benchmark data set). If a difference between the histograms is observed, the steganalyser would learn that some kind of operation has been performed on the image, which may prompt further steganalysis. Fig. 14 gives a comparison of the histograms: (a) and (b) are the actual cover image and its histogram, respectively; (c) and (d) are the original secret image and its histogram, respectively; (e) is the histogram of the stego image; and (f) is the histogram of the extracted secret image. From Fig. 14b and e, it is evident that the histograms of the stego image and its corresponding cover image are almost identical, which makes a histogram attack impractical.
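The visual comparison of histograms can also be expressed as a number. The following Python sketch computes 256-bin grey-level histograms and a normalised L1 distance between them. The choice of this particular distance, the function names `histogram` and `histogram_distance`, and the sample images are assumptions made for illustration; the comparison in Fig. 14 itself is visual.

```python
def histogram(image, levels=256):
    """Count how often each grey level occurs in a grayscale image (list of rows)."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    return counts

def histogram_distance(a, b, levels=256):
    """Normalised L1 distance in [0, 1]: 0 for identical histograms, 1 for disjoint ones."""
    hist_a = histogram(a, levels)
    hist_b = histogram(b, levels)
    n_a = sum(hist_a)
    n_b = sum(hist_b)
    return 0.5 * sum(abs(x / n_a - y / n_b) for x, y in zip(hist_a, hist_b))

# Hypothetical sample data, not taken from the experiments.
cover = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
stego = [[52, 56, 61], [59, 78, 61], [76, 62, 60]]
print(histogram_distance(cover, cover))
print(histogram_distance(cover, stego))
```

A distance close to 0 corresponds to the situation described above, in which the histogram of the stego image gives the steganalyser little reason for suspicion.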

Fig. 14 Comparison of the histograms of the cover image and its stego image

5.4 Robustness

This section evaluates the robustness of our proposed scheme against the following attacks: salt and pepper noise, Gaussian noise, cropping and compression. Each secret image is first embedded into all the cover images, and the resultant image is subjected to noise and other image processing attacks. For salt and pepper noise, a noise density of 0.05 is applied to the stego image. For Gaussian noise, noise with a mean of 0 and a variance of 0.000001 is added to the stego images. In the cropping attack, a region of 30 × 30 pixels is cropped from the stego image. In the compression attack, the stego image is compressed in a lossless manner. In all the above cases, the original images are recovered almost exactly, with only a meager trace of noise or of the effects of the image processing attacks. This is evident from the stego images subjected to the different image processing attacks, shown in Fig. 15, and from the MSSIM, NCC and BER values between the extracted secret image and the original secret image, shown in Table 7. MSSIM and NCC values have to be close to 1 to conclude that no error is introduced in the extracted secret images; their average values are 0.7118 and 0.9568, respectively, in our experimental results. The BER value has to be close to 0 to conclude that little error is introduced in the extracted secret images; its average value is 0.0141 in our experimental results. The use of a Support Vector Machine increases the robustness of the extraction from stego images. Thus, the results indicate that our scheme is robust enough to withstand different kinds of attacks on the stego images.
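The two noise attacks and two of the reported metrics can be sketched as follows. This is an illustrative Python sketch under several assumptions: the Gaussian variance is interpreted on intensities scaled to [0, 1], NCC is taken in its cosine form (other definitions exist), BER is counted over 8-bit pixel values, and all function names and sample data are hypothetical. The embedding and extraction steps are omitted, so the example compares an image directly with its attacked copy purely to exercise the metrics.

```python
import random

def salt_and_pepper(image, density=0.05, seed=0):
    """Set roughly `density` of the pixels to 0 (pepper) or 255 (salt)."""
    rng = random.Random(seed)
    noisy = [row[:] for row in image]
    for row in noisy:
        for c in range(len(row)):
            if rng.random() < density:
                row[c] = 255 if rng.random() < 0.5 else 0
    return noisy

def gaussian_noise(image, mean=0.0, variance=1e-6, seed=0):
    """Add Gaussian noise on intensities scaled to [0, 1], then rescale and clip."""
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [[min(255, max(0, round((p / 255 + rng.gauss(mean, sigma)) * 255)))
             for p in row] for row in image]

def ncc(original, extracted):
    """Normalised cross-correlation (cosine form); 1 means a perfect match."""
    o = [p for row in original for p in row]
    e = [p for row in extracted for p in row]
    numerator = sum(a * b for a, b in zip(o, e))
    denominator = (sum(a * a for a in o) * sum(b * b for b in e)) ** 0.5
    return numerator / denominator

def ber(original, extracted, bits=8):
    """Fraction of differing bits between two images with `bits`-bit pixels."""
    o = [p for row in original for p in row]
    e = [p for row in extracted for p in row]
    wrong = sum(bin(a ^ b).count("1") for a, b in zip(o, e))
    return wrong / (bits * len(o))

# Hypothetical sample data, not taken from the experiments.
secret = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
attacked = gaussian_noise(salt_and_pepper(secret))
print(ncc(secret, attacked), ber(secret, attacked))
```

In the evaluation described above, the second argument of `ncc` and `ber` would be the secret image extracted from the attacked stego image rather than the attacked image itself.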

Fig. 15 Stego images under different image processing attacks

Table 7 MSSIM, NCC and BER values between the extracted secret image and the original secret images

6 Conclusion and future works

A new scheme for providing multimedia security in cloud space is presented in this work. In the proposed scheme, the multimedia content is first subjected to the DRT, which increases the strength of the scheme. It is then converted to Base B, which increases the embedding rate. The cover image is subjected to the Integer Wavelet Transform (IWT). The Base B converted coefficients of the transformed multimedia content are embedded into the coefficients of the transformed cover image using the diamond embedding method. The resulting image is subjected to the inverse IWT to produce the stego image, which is then stored in cloud space. On demand, these stego images are downloaded from the cloud, and the original multimedia content is extracted from them using a similar procedure. The experimental results show that the produced stego image is structurally similar to the cover image and that our scheme is robust against the tested attacks. They also show that our proposed approach outperforms the existing methods in the literature considered here. In the future, we plan to extend our work with a color image steganographic technique for cloud storage.