1 Introduction

With the rapid development of the Internet and information processing technology, more and more video and image information is transmitted and stored over the network [57]. However, the Internet is an open and shared platform. The transmission and storage of multimedia information through the Internet may have many security risks, especially in military, medical, and commercial fields. Encrypting images is an effective way to protect them [53].

Early data encryption methods were mainly designed for text. For example, Anand et al. [1] proposed a symmetric key encryption algorithm for data security, and Jha et al. [26] presented a data encryption algorithm based on matrix properties. Many scholars have applied such text encryption techniques to images. Patel et al. [34] designed a multilevel data encryption algorithm for image and textual information based on AES and RSA, and Arab et al. [3] suggested an image encryption scheme based on a chaotic system and the AES algorithm. However, many text encryption algorithms are not well suited to images, because images differ from text in several ways. First, images contain a very large amount of data. Second, adjacent pixels of an image are strongly correlated. Third, a decrypted image can tolerate a certain amount of distortion, whereas text encryption techniques do not consider distortion.

Therefore, more and more scholars have put forward new schemes designed specifically for image encryption. Hua et al. [25] put forward a color image encryption scheme based on a new two-dimensional logistic tent chaotic system, and Mou et al. [44] proposed an image encryption scheme based on a fractional-order hyper-chaotic complex system and the Galois field (GF). Exploiting the good pseudo-randomness and ergodicity of chaotic systems, Zhou et al. [47] suggested an image encryption scheme based on the phase-truncated short-time fractional Fourier transform and a hyper-chaotic system, Ghazvini et al. [19] designed an image encryption scheme based on chaotic mapping and a genetic algorithm, and Anwar et al. [2] used chaos theory to propose an image encryption technique based on pixel permutation. These algorithms effectively protect the security of the image. However, as the size of images transmitted over the Internet grows, it becomes essential to compress images to improve the efficiency of transmission and storage while still protecting their security. Therefore, image compression and encryption schemes have attracted more and more attention.

In 2006, D. L. Donoho [17] and E. Candès [5] formally proposed the theory of compressed sensing (CS). CS was quickly adopted in image encryption because it can compress and encrypt an image simultaneously. For example, Chen et al. [11] proposed an ICES based on Kronecker compressed sensing (KCS) and elementary cellular automata (ECA), in which ECA was used to scramble the sparse image and KCS was used to encrypt and compress the scrambled image. Yao et al. [45] presented a CS-based color image encryption scheme in which the plain image was split into its R, G, and B components, each component was measured by a Gaussian random matrix, and the resulting measurement matrices were scrambled by multi-image cross pixel scrambling. Chai et al. [7] combined CS and least significant bit (LSB) embedding to propose a visually meaningful ICES: after the plain image was compressed and shuffled by CS and a zigzag path, the resulting cipher image was embedded into an independent carrier image by a dynamic LSB embedding technique, producing a visually meaningful cipher image of the same size as the original image. Luo et al. [31] proposed an image encryption scheme based on Chua’s circuit and CS.

After CS is performed on the plain image, the resulting cipher image has a smaller dimension than the original image, so transmission bandwidth and storage space can be effectively reduced when it is transmitted and stored over the Internet. However, when the original image is recovered from the compressed cipher image on the decryption side, the optimization involved in CS reconstruction may require a lot of time, which seriously limits the use of the cryptosystem in real-time applications. To solve this problem, some scholars use a cloud platform with high computing power to carry out the CS reconstruction [23, 49, 51, 54]. However, since the cloud is semi-trusted, the data stored in the cloud may be stolen, and the cloud itself may sometimes deceive users maliciously. Therefore, measures should be taken to ensure that private data is neither leaked nor tampered with [28, 30]. For instance, Zhang et al. designed a secure transmission and sharing scheme for large image data based on a hybrid cloud [49], and then proposed a CS reconstruction outsourcing scheme based on a cloud platform [51]. Besides, Xiao et al. [23] proposed a CS-based image outsourcing reconstruction and identity authentication service in a cloud environment. These schemes ensure the efficiency and secrecy of image data transmission.

In summary, current CS-based ICESs provide good compression performance for images, but some problems remain. For example, the keys used by the algorithms in [11, 20, 36, 40, 58] are independent of the plain image, which makes these algorithms vulnerable to chosen-plaintext attacks. Additionally, the schemes in [14, 35, 42, 45] have no diffusion operation in the encryption process, resulting in low information entropy of the cipher images. Since CS is a lossy process, the methods in [12, 16, 31, 33, 48, 59] lose more data in the compression stage, resulting in poor reconstruction quality. At the same time, the algorithm in [51] needs two different samplings to ensure the security of cloud reconstruction, and the methods in [4, 23, 49] cannot prevent malicious deception by the cloud.

To solve the above problems, we combine CS with double random phase encoding (DRPE) and design a secure and efficient image compression-encryption system with cloud-assisted reconstruction. In our scheme, the SHA-256 hash value of the plain image is used to perturb the generation of the chaotic key sequences, ensuring that the key streams are highly correlated with the plain image. Our main contributions are as follows:

1) An ICES based on CS and DRPE is presented. Firstly, the plain image is decomposed by the discrete wavelet transform (DWT) to obtain the approximate component and the detail components. Secondly, the approximate component is scrambled, and the detail components are compressed by CS. Finally, DRPE, scrambling and diffusion are performed on the scrambled approximate component and the compressed measurement values to obtain the cipher image. The combination of CS, DRPE, confusion and diffusion may improve the security level of the proposed ICES while effectively compressing the image. Besides, by scrambling the approximate component, which contains the important information, and compressing the detail components, which contain less important information, the volume of the cipher image becomes smaller than that of the original image while the image reconstruction quality is preserved.

2) A random pixel scrambling method is proposed to shuffle the phase and amplitude matrices of the plain image. The summation of the two matrices is used to decide which elements of these two matrices need to be exchanged, and the index sequence generated by sorting the chaotic sequence is utilized to give the new position for scrambling. The adaptive scrambling process is controlled by the plaintext image and chaotic sequence, which makes the proposed algorithm resistant to statistical attacks and plaintext attacks.

3) A joint decryption scheme involving the cloud platform and the user side is designed to shorten the decryption time. In the decryption process, the cipher image is first partially decrypted on the cloud. Secondly, the scrambled measurement matrix is transmitted to the cloud by the user, and the scrambled sparse coefficient matrix is reconstructed on the cloud. It is then transmitted to the user and further decrypted to recover the plain image. At the same time, by comparing the contour similarity between the approximate component and the detail components returned by the cloud, the user can judge whether the cloud has cheated. The proposed decryption method may not only improve decryption efficiency by delegating CS reconstruction to a powerful cloud platform, but also prevent malicious deception by the cloud and protect the security of the image data.

The rest of the paper is organized as follows. The compressed sensing and chaotic system are described in Section 2. Section 3 presents the proposed image encryption system and decryption system. In Section 4, we present the simulation results. The performance and security analysis of the proposed image encryption algorithm is demonstrated in Section 5. Finally, Section 6 concludes the paper.

2 Preliminaries

This section mainly introduces the basic knowledge of compressed sensing, and the used Logistic-tent chaotic system and 2D Logistic-adjusted-Sine map.

2.1 Compressed sensing

The CS architecture includes the sampling process on the encryption side and the reconstruction process on the decryption side. In CS theory, the low-dimensional signal obtained by compression contains enough information about the original signal, so the signal can be accurately reconstructed by solving a convex optimization problem [6]. Meanwhile, the number of samples required for signal reconstruction is smaller than that required by the Nyquist sampling theorem, which makes CS highly efficient.

Assume that the signal \( x\in {\mathbb{R}}^{n\times 1} \) is K-sparse under an orthonormal basis Ψ, so that it can be expressed as

$$ x=\Psi s $$
(1)

where the vector s has K (K ≪ n) non-zero coefficients. The signal x is sampled and compressed by a measurement matrix \( \Phi \in {\mathbb{R}}^{m\times n} \) independent of Ψ and represented as

$$ y=\Phi x=\Phi \Psi s=\Theta s $$
(2)

where \( y\in {\mathbb{R}}^{m\times 1} \) (m ≪ n) is the measurement vector, and \( \Theta =\Phi \Psi \in {\mathbb{R}}^{m\times n} \) is the sensing matrix.

The signal x may be reconstructed from y by solving an l0 optimization problem, illustrated as,

$$ \left\{\begin{array}{l}\underset{s}{\min }{\left\Vert s\right\Vert}_0\kern0.5em \\ {}\ s.t.\kern0.5em \Theta s=y\end{array}\right. $$
(3)

Solving Eq. (3) is an NP-hard problem, so it is usually relaxed to the following l1-norm problem,

$$ \left\{\begin{array}{l}\underset{s}{\min }{\left\Vert s\right\Vert}_1\kern0.5em \\ {}\ s.t.\kern0.5em \Theta s=y\end{array}\right. $$
(4)

The basic model of CS theory comprises three parts: sparse representation of the original signal, compressive measurement, and signal reconstruction [55]. A natural image is not sparse in the spatial domain, but it becomes sparse after a DWT or a discrete cosine transform (DCT). The measurement matrix needs to satisfy the restricted isometry property (RIP) and to be uncorrelated with the sparse basis; Gaussian random matrices and circulant matrices are commonly used [6]. Reconstruction methods fall into two categories. The first comprises greedy iterative algorithms, such as orthogonal matching pursuit (OMP) and block orthogonal matching pursuit (BOMP). The second comprises convex optimization algorithms, such as basis pursuit (BP) and gradient projection (GP).

In this paper, the DWT is used to sparsify the original image, the random sequences generated by the chaotic system are used to construct the measurement matrix, and OMP is used to reconstruct the plain image.

2.2 Logistic-tent system

The Logistic-tent system (LTS) [37] combines the Logistic map and the tent map, and is defined by,

$$ {z}_{n+1}=\left\{\begin{array}{l}\left[\mu {z}_n\left(1-{z}_n\right)+\frac{\left(4-\mu \right){z}_n}{2}\right]\operatorname{mod}1,\kern0.5em {z}_n<0.5\\ {}\left[\mu {z}_n\left(1-{z}_n\right)+\frac{\left(4-\mu \right)\left(1-{z}_n\right)}{2}\right]\operatorname{mod}1,\kern0.5em {z}_n\ge 0.5\end{array}\right. $$
(5)

where control parameter μ∈(0,4] and state values zn∈[0, 1]. Performance analyses verify that the LTS has better chaotic performance than the Logistic and Tent maps [37]. In the proposed method, the chaotic sequences generated by LTS are applied for constructing the measurement matrix for CS.
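As a rough illustration of Eq. (5), the LTS iteration can be sketched in Python as follows. The function name `lts_sequence`, the seed 0.3456, the parameter 3.99 and the default of 500 discarded iterations are illustrative assumptions, not values taken from the paper.

```python
def lts_sequence(z0, mu, length, discard=500):
    """Iterate the Logistic-tent system of Eq. (5) and return `length` states.

    The first `discard` states are dropped to skip the transient phase
    (the value 500 is an assumption made for this sketch).
    """
    z = z0
    states = []
    for step in range(discard + length):
        logistic = mu * z * (1.0 - z)
        if z < 0.5:
            z = (logistic + (4.0 - mu) * z / 2.0) % 1.0
        else:
            z = (logistic + (4.0 - mu) * (1.0 - z) / 2.0) % 1.0
        if step >= discard:
            states.append(z)
    return states


# Hypothetical seed and control parameter, chosen only for demonstration.
seq = lts_sequence(z0=0.3456, mu=3.99, length=8)
```

Because of the final `mod 1`, every state stays in [0, 1), which is what allows the sequence to be reshaped directly into a measurement matrix.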

2.3 2D logistic-adjusted-sine map

The 2D Logistic-adjusted-Sine map (2D-LASM) has extremely complex dynamic behavior and it is defined as [40],

$$ \left\{\begin{array}{l}{x}_{i+1}=\sin \left(\pi \mu \left({y}_i+3\right){x}_i\left(1-{x}_i\right)\right)\\ {}{y}_{i+1}=\sin \left(\pi \mu \left({x}_{i+1}+3\right){y}_i\left(1-{y}_i\right)\right)\end{array}\right. $$
(6)

When the system parameter μ∈[0.37, 0.38]∪[0.4, 0.42]∪[0.44, 0.93]∪{1}, the 2D-LASM is in a chaotic state, and the resulting sequences {(xn, yn), n = 0, 1, 2, 3...} are aperiodic, non-convergent and sensitive to initial values. In this article, the chaotic sequences generated by the 2D-LASM are used to scramble and diffuse the image, and to produce the mask matrices used in the DRPE.
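A minimal Python sketch of Eq. (6) is given below. Note that the second line of Eq. (6) uses the freshly updated x value, which the code mirrors. The function name `lasm_sequences`, the seeds and the choice μ = 0.9 are assumptions made for illustration only.

```python
import math


def lasm_sequences(x0, y0, mu, length, discard=500):
    """Iterate the 2D-LASM of Eq. (6) and return two lists of `length` states."""
    x, y = x0, y0
    xs, ys = [], []
    for step in range(discard + length):
        x = math.sin(math.pi * mu * (y + 3.0) * x * (1.0 - x))
        # The y update uses the new x (x_{i+1}), as written in Eq. (6).
        y = math.sin(math.pi * mu * (x + 3.0) * y * (1.0 - y))
        if step >= discard:
            xs.append(x)
            ys.append(y)
    return xs, ys


# Hypothetical initial values; mu = 0.9 lies in the chaotic range [0.44, 0.93].
X1, Y1 = lasm_sequences(x0=0.21, y0=0.57, mu=0.9, length=16)
```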

3 The proposed image encryption and decryption scheme

The proposed image cryptosystem includes the encryption process and decryption process, as illustrated in Fig. 1. In the encryption stage, the plain image is firstly transmitted to the client side, and then compressed and encrypted to the cipher image. Next, the client side sends the cipher image and some keys to the cloud for storage, and also sends other keys to the user for decryption.

Fig. 1 The framework of the proposed image cryptosystem

In the decryption stage, the user side transmits the measurement matrix to the cloud, and then the cipher image is partially decrypted and reconstructed in the cloud. Subsequently, the resulting image is sent to the user and decrypted by use of the keys to obtain the final plain image. The detailed encryption and decryption steps are as follows.

3.1 The proposed ICES

In this section, the proposed ICES is described in detail. It is composed of five stages, as shown in Fig. 2. The first stage generates the chaotic sequences: some sequences are used to construct the measurement matrices, and the others are used for DRPE, confusion and diffusion. The second stage scrambles the approximate component of the plain image and compresses the detail components by CS. In the third stage, the scrambled approximate component and the compressed detail components are recombined into a complex matrix, and DRPE is performed on this matrix to obtain the phase and amplitude matrices. The phase and amplitude matrices are then quantized and scrambled in the fourth stage, and diffused in the fifth stage to obtain the final cipher image.

Fig. 2 The flow chart of the proposed ICES

Before encryption, the plain image P of size M × M is decomposed by DWT into an approximate component LL and three detail components HL, LH and HH. Each component has size m × m, where m = M/2. The detailed encryption steps are given as follows.

3.1.1 The generation of the chaotic sequences

In this paper, the SHA-256 function is used to generate a 256-bit hash value of the plain image P (M × M), which is divided into 8-bit groups to obtain ki, where i = 1, 2, …, 32. Then, ki and the external keys t1, t2, t3, t4 are used to generate the initial values x0, y0, z0, w0, u0, k0 of the chaotic systems by Eq. (7),

$$ \left\{\begin{array}{l}{x}_0=\dfrac{t_1\times {t}_2}{256\times {t}_3\times {t}_4}\left[\left({k}_1\mid {k}_5\right)\oplus \left({k}_9\mid {k}_{13}\right)\oplus \left({k}_{17}\ \&\ {k}_{21}\right)\oplus \left({k}_{25}\ \&\ {k}_{29}\right)\right]\\ {}{y}_0=\dfrac{t_1\times {t}_3}{256\times {t}_2\times {t}_4}\left[\left({k}_2\mid {k}_6\oplus {k}_{10}\ \&\ {k}_{14}\right)\oplus \left({k}_{18}\mid {k}_{22}\oplus {k}_{26}\ \&\ {k}_{30}\right)\right]\\ {}{z}_0=\dfrac{t_1\times {t}_2}{256\times {t}_3\times {t}_4}\left[\left({k}_3\oplus {k}_7\oplus {k}_{11}\oplus {k}_{15}\right)\mid \left({k}_{19}\oplus {k}_{23}\oplus {k}_{27}\oplus {k}_{31}\right)\right]\\ {}{w}_0=\dfrac{t_1\times {t}_3}{256\times {t}_2\times {t}_4}\left({k}_4\oplus {k}_8\oplus {k}_{12}\oplus {k}_{16}\oplus {k}_{20}\oplus {k}_{24}\oplus {k}_{28}\oplus {k}_{32}\right)\\ {}{u}_0=\left({x}_0+{y}_0+{z}_0+{w}_0\right)/4\\ {}{k}_0=\left({x}_0+{y}_0+{z}_0+{w}_0+{u}_0\right)/5\end{array}\right. $$
(7)

where x ⊕ y is the XOR operation of x and y, x | y is the OR operation of x and y, x & y is the AND operation of x and y.
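The key derivation of Eq. (7) can be sketched in Python with the standard `hashlib` module. Several points are assumptions of this sketch rather than statements from the paper: the image is hashed as a raw byte string, the denominators of z0 and w0 are read as 256 × t × t (as for x0 and y0), the OR and AND terms inside y0 are evaluated before the XOR, and the key values t1 = t2 = t3 = t4 = 1 are placeholders.

```python
import hashlib


def initial_values(image_bytes, t1, t2, t3, t4):
    """Derive (x0, y0, z0, w0, u0, k0) from the SHA-256 digest, following Eq. (7)."""
    # k[1]..k[32] are the 32 hash bytes; k[0] is padding for 1-based indexing.
    k = [0] + list(hashlib.sha256(image_bytes).digest())
    a = (t1 * t2) / (256 * t3 * t4)
    b = (t1 * t3) / (256 * t2 * t4)
    x0 = a * ((k[1] | k[5]) ^ (k[9] | k[13]) ^ (k[17] & k[21]) ^ (k[25] & k[29]))
    # Assumed grouping for y0: OR/AND terms first, then XOR.
    y0 = b * (((k[2] | k[6]) ^ (k[10] & k[14])) ^ ((k[18] | k[22]) ^ (k[26] & k[30])))
    z0 = a * ((k[3] ^ k[7] ^ k[11] ^ k[15]) | (k[19] ^ k[23] ^ k[27] ^ k[31]))
    w0 = b * (k[4] ^ k[8] ^ k[12] ^ k[16] ^ k[20] ^ k[24] ^ k[28] ^ k[32])
    u0 = (x0 + y0 + z0 + w0) / 4
    k0 = (x0 + y0 + z0 + w0 + u0) / 5
    return x0, y0, z0, w0, u0, k0


# Placeholder image bytes and external keys, for demonstration only.
vals = initial_values(b"demo image bytes", 1, 1, 1, 1)
```

Since each bracketed term is a single byte (0 to 255), the external keys should be chosen so that the scaling factors keep the initial values inside the valid range of the chaotic maps; with all keys equal to 1 every value falls in [0, 1).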

Next, the initial values x0, y0 and the parameter μ1 are substituted into the 2D-LASM, which is iterated m² + n0 times, where n0 ≥ 500 and m = M/2. To eliminate the transient effect, the first n0 values are discarded, giving two chaotic sequences X1 and Y1. In the same way, the initial values z0, w0 and the parameter μ2 are substituted into the 2D-LASM to obtain the chaotic sequences Z1 and W1.

Subsequently, the LTS is iterated m² + n0 times with the initial value u0 and the parameter μ3, and the chaotic sequence U1 is obtained after discarding the first n0 values. Similarly, the chaotic sequence K1 is obtained by substituting the initial value k0 and the parameter μ4 into the LTS.

3.1.2 Scrambling of approximate component and compression of detail components

The approximate component LL is shuffled by the chaotic sequence X1. In detail, LL is transformed into a 1D sequence l1 of size 1 × m², the chaotic sequence X1 of size 1 × m² is sorted in ascending order according to Eq. (8), and the index sequence s1 is obtained. Subsequently, as shown in Eq. (9), the sequence l2 is obtained by scrambling l1 with s1.

$$ \left[{X}_1^{\prime },{s}_1\right]= sort\left({X}_1\right) $$
(8)
$$ {l}_2\left({s}_1(t)\right)={l}_1(t) $$
(9)

where 1 ≤ t ≤ m². Finally, l2 is converted to a matrix LL1 of size m × m.
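The sort-based scrambling of Eqs. (8) and (9) amounts to applying a permutation derived from the chaotic sequence. A small Python sketch follows; indices are 0-based here, whereas the paper uses 1-based indices, and the helper names and sample values are hypothetical.

```python
def sort_index(chaotic):
    """Eq. (8): indices that would sort `chaotic` in ascending order."""
    return sorted(range(len(chaotic)), key=chaotic.__getitem__)


def scramble(l1, s1):
    """Eq. (9): l2(s1(t)) = l1(t)."""
    l2 = [None] * len(l1)
    for t, target in enumerate(s1):
        l2[target] = l1[t]
    return l2


def unscramble(l2, s1):
    """Inverse of Eq. (9): l1(t) = l2(s1(t))."""
    return [l2[target] for target in s1]


x1 = [0.7, 0.1, 0.9, 0.4]        # stand-in for the chaotic sequence X1
l1 = [10, 20, 30, 40]            # stand-in for the flattened LL component
s1 = sort_index(x1)              # [1, 3, 0, 2]
l2 = scramble(l1, s1)            # [30, 10, 40, 20]
```

Because `s1` is a permutation, the receiver can regenerate it from the same chaotic sequence and undo the scrambling exactly.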

Next, the measurement matrices are produced by chaotic sequences, and then the detail components of the plain image are compressed and encrypted by CS. The specific steps are as below.

Step 1: The DWT sparse basis ψ is used to sparsify the three detail components HL, LH and HH, and three sparse coefficient matrices HL1, LH1 and HH1 are obtained by,

$$ \left\{\begin{array}{l}{HL}_1=\psi HL{\psi}^{\prime}\\ {}{LH}_1=\psi LH{\psi}^{\prime}\\ {}{HH}_1=\psi HH{\psi}^{\prime}\end{array}\right. $$
(10)

where ψ´ denotes the transposition of matrix ψ.

Step 2: Use the two chaotic sequences U1 and K1 to construct two random matrices Φ01 and Φ02 of size m × m, and then stack them into a random matrix Φ0 of size 2m × m. At the same time, pick a sequence Um of size 1 × 2m from the sequence U1, and sort Um in ascending order to obtain the index sequence s2. Next, scramble the rows of Φ0 according to s2, and select the first m rows of the shuffled matrix to produce the measurement matrix Φ of size m × m.

Step 3: Divide the measurement matrix Φ into three measurement matrices Φ1, Φ2, Φ3 of size m1 × m, m2 × m and m3 × m, where m1 = m/2, m2 = m/4, m3 = m/4. Then, measure the three sparse coefficient matrices HL1, LH1 and HH1 with these three measurement matrices to get three measurement value matrices HL2, LH2, HH2, respectively. The specific process is as follows:

$$ \left\{\begin{array}{l}{HL}_2={\Phi}_1{HL}_1\\ {}{LH}_2={\Phi}_2{LH}_1\\ {}{HH}_2={\Phi}_3{HH}_1\end{array}\right. $$
(11)
Step 4: The matrices HL2, LH2 and HH2 are stacked into a matrix H1 of size m × m through the following Eq. (12).

$$ {H}_1=\left[\begin{array}{l}{HL}_2\\ {}{LH}_2\\ {}{HH}_2\end{array}\right] $$
(12)
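Steps 2 to 4 above can be sketched with plain Python lists for a toy size m = 4. The paper does not spell out how the chaotic sequences are reshaped or which 2m elements form Um, so this sketch assumes row-major reshaping, takes Um as the first 2m elements of U1, and scrambles rows with the same convention as Eq. (9). All function names and input values are hypothetical.

```python
def reshape(seq, rows, cols):
    """Row-major reshape of a flat sequence (an assumption of this sketch)."""
    return [list(seq[r * cols:(r + 1) * cols]) for r in range(rows)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def measurement_matrices(u1, k1, m):
    """Step 2 and Step 3: build Phi and split it into Phi1, Phi2, Phi3."""
    phi0 = reshape(u1, m, m) + reshape(k1, m, m)      # 2m x m
    um = u1[:2 * m]                                   # assumed choice of Um
    s2 = sorted(range(2 * m), key=um.__getitem__)
    shuffled = [None] * (2 * m)
    for t, target in enumerate(s2):
        shuffled[target] = phi0[t]                    # row scrambling by s2
    phi = shuffled[:m]                                # first m rows
    return phi[:m // 2], phi[m // 2:3 * m // 4], phi[3 * m // 4:]


def compress(hl1, lh1, hh1, phis):
    """Eq. (11) and Eq. (12): measure each component and stack the results."""
    phi1, phi2, phi3 = phis
    return matmul(phi1, hl1) + matmul(phi2, lh1) + matmul(phi3, hh1)


m = 4
u1 = [((i * 37) % 101) / 101 for i in range(m * m)]   # stand-in for U1
k1 = [((i * 53) % 103) / 103 for i in range(m * m)]   # stand-in for K1
phis = measurement_matrices(u1, k1, m)
identity = [[1.0 if r == c else 0.0 for c in range(m)] for r in range(m)]
h1 = compress(identity, identity, identity, phis)     # toy sparse inputs
```

The row counts m/2, m/4 and m/4 add up to m, so the stacked matrix H1 has the same m × m size as the scrambled approximate component, which is what allows the two to be paired in the DRPE stage.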

3.1.3 Double random phase encoding

DRPE has been widely utilized in image encryption for its parallel processing and easy configuration [59]. In this subsection, DRPE is applied for encrypting the approximate component and three detail components of the plain image.

Step 1: The matrices LL1 and H1 are merged into a complex matrix LLH1 of size m × m, where LLH1 = LL1 + j H1 and j is the imaginary unit.

Step 2: The chaotic sequences X1, Y1, Z1, W1 are used to generate two 1 × m² sequences XY0 and ZW0, where XY0 = (X1 + Y1) / 2, ZW0 = (Z1 + W1) / 2.

Step 3: The sequences XY0 and ZW0 are converted to two random matrices q1 and q2 of size m × m, respectively. Then, two random phase mask matrices M1, M2 are obtained by

$$ {M}_i={e}^{j2\pi {q}_i\left(x,y\right)} $$
(13)

where i = 1, 2, and the size of matrix Mi is m × m.

Step 4: Apply the DRPE transform to the matrix LLH1 to obtain the matrix LLH2 via,

$$ {LLH}_2= IFFT\left( FFT\left({LLH}_1\cdot {M}_1\right)\cdot {M}_2\right) $$
(14)

where FFT() and IFFT() indicate the Fourier transform and inverse Fourier transform, respectively.

Step 5: Extract the real and imaginary parts of the matrix LLH2 to obtain the matrices LL2 and H2 of size m × m, which serve as the phase and amplitude matrices in the following stages, respectively. The specific operations are as follows.

$$ \left\{\begin{array}{l}{LL}_2= real\left({ LL H}_2\right)\\ {}{H}_2= imag\left({ LL H}_2\right)\end{array}\right. $$
(15)
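To make Eqs. (13) to (15) concrete, the following dependency-free Python sketch applies DRPE to a 2 × 2 toy matrix using a naive 2D DFT. It assumes that the dot in Eq. (14) denotes element-wise multiplication. The inverse shown is simply Eq. (14) undone from the outside in (forward transform, conjugate second mask, inverse transform, conjugate first mask); the function names and sample values are hypothetical.

```python
import cmath


def dft2(a, inverse=False):
    """Naive 2D DFT (or inverse DFT) of a small list-of-lists matrix."""
    rows, cols = len(a), len(a[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = 2 * cmath.pi * (u * r / rows + v * c / cols)
                    total += a[r][c] * cmath.exp(sign * 1j * angle)
            out[u][v] = total / (rows * cols) if inverse else total
    return out


def hadamard(a, b):
    """Element-wise product of two matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mask(q, sign=1):
    """Eq. (13): M = exp(j 2 pi q); sign=-1 gives the conjugate mask."""
    return [[cmath.exp(sign * 2j * cmath.pi * v) for v in row] for row in q]


def drpe(llh1, q1, q2):
    """Eq. (14): IFFT(FFT(LLH1 . M1) . M2)."""
    return dft2(hadamard(dft2(hadamard(llh1, mask(q1))), mask(q2)), inverse=True)


def idrpe(llh2, q1, q2):
    """Algebraic inverse of Eq. (14)."""
    spectrum = hadamard(dft2(llh2), mask(q2, -1))
    return hadamard(dft2(spectrum, inverse=True), mask(q1, -1))


ll1 = [[1.0, 2.0], [3.0, 4.0]]                 # toy scrambled approximation
h1 = [[5.0, 6.0], [7.0, 8.0]]                  # toy measurement values
llh1 = [[complex(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(ll1, h1)]
q1 = [[0.11, 0.52], [0.73, 0.94]]              # stand-ins for chaotic matrices
q2 = [[0.35, 0.06], [0.87, 0.48]]
llh2 = drpe(llh1, q1, q2)
ll2 = [[z.real for z in row] for row in llh2]  # Eq. (15)
h2 = [[z.imag for z in row] for row in llh2]
recovered = idrpe(llh2, q1, q2)
```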

3.1.4 Random pixel scrambling

Before scrambling, the obtained matrices LL2 and H2 are quantized to get the matrices LL3 and H3, respectively. Taking LL2 as an example, its quantization process is as follows:

$$ {LL}_3= round\left(\frac{255\times \left({LL}_2-\min \right)}{\max -\min}\right) $$
(16)

where min is the minimum value of matrix LL2, max is the maximum value of matrix LL2, and round (x) is the integer closest to x.
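The quantization of Eq. (16), together with the corresponding inverse mapping used during decryption, can be sketched as follows. The function names and sample values are hypothetical; note also that Python's `round` rounds exact halves to the nearest even integer, which may differ from other environments.

```python
def quantize(matrix):
    """Eq. (16): map real values linearly onto the integers 0..255."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        raise ValueError("constant matrix cannot be quantized with Eq. (16)")
    q = [[round(255 * (v - lo) / (hi - lo)) for v in row] for row in matrix]
    return q, lo, hi


def dequantize(q, lo, hi):
    """Inverse mapping: value = q * (max - min) / 255 + min."""
    return [[v * (hi - lo) / 255 + lo for v in row] for row in q]


ll2 = [[-1.0, 0.2], [0.6, 1.0]]          # toy real-valued matrix
ll3, lo, hi = quantize(ll2)
approx = dequantize(ll3, lo, hi)
```

The inverse mapping needs the same min and max values, so they have to be available to the decrypting party. The rounding step is one source of reconstruction error: each recovered value differs from the original by at most half a quantization step, (max − min)/510.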

Subsequently, the random pixel scrambling is applied to the matrices LL3 and H3, and the detailed steps are as below.

Step 1: Pixel exchanging. Compute the element-wise sum of the two matrices LL3 and H3 and take its parity to form a new binary matrix, in which an even element is marked with 0 and an odd element with 1. The resulting matrix determines whether the elements at the same position of LL3 and H3 need to be exchanged.

Step 2: Pixel scrambling. Firstly, sort the chaotic sequence Z1 in ascending order to obtain a 1 × m² index sequence s3, and transform it into an m × m index matrix S(i, j). Secondly, use the element values of the index matrix to give the new positions of the pixels to be shuffled.

Through the above operations, the scrambled matrices LL4 and H4 are obtained. The specific confusion process may be represented as follows:

$$ \mathrm{When}\ \left({LL}_3\ \left(i,j\right)+{H}_3\ \left(i,j\right)\right) \operatorname {mod}\ 2=1,\mathrm{then}\ {H}_4\ \left(s,t\right)={LL}_3\ \left(i,j\right),{LL}_4\ \left(s,t\right)={H}_3\ \left(i,j\right); $$
$$ \mathrm{When}\ \left({LL}_3\ \left(i,j\right)+{H}_3\ \left(i,j\right)\right) \operatorname {mod}\ 2=0,\mathrm{then}\ {H}_4\ \left(s,t\right)={H}_3\ \left(i,j\right),{LL}_4\ \left(s,t\right)={LL}_3\ \left(i,j\right). $$

where

$$ \left\{\begin{array}{l}s=S\left(i,j\right)\operatorname{mod}m+1\\ {}t= ceil\left(S\left(i,j\right)/m\right)\end{array}\right. $$
(17)

ceil (x) means to return the smallest integer greater than or equal to x, and 1 ≤ i, j, s, t ≤ m. Figure 3 shows an example of the random pixel scrambling.

Fig. 3 Illustration of the random pixel scrambling
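The random pixel scrambling described above can be sketched in Python for a 2 × 2 example. The sketch assumes that the divisor in Eq. (17) is the matrix size m and that the index matrix S holds a permutation of 1, …, m²; function names and data are hypothetical. The inverse relies on the fact that exchanging two values does not change the parity of their sum, so the same parity test can be repeated on the scrambled matrices.

```python
def target_position(value, m):
    """Eq. (17), 1-based: s = S mod m + 1, t = ceil(S / m)."""
    return value % m + 1, -(-value // m)


def random_pixel_scramble(ll3, h3, index_matrix):
    m = len(ll3)
    ll4 = [[None] * m for _ in range(m)]
    h4 = [[None] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            a, b = ll3[i][j], h3[i][j]
            if (a + b) % 2 == 1:          # odd sum: exchange the two pixels
                a, b = b, a
            s, t = target_position(index_matrix[i][j], m)
            ll4[s - 1][t - 1], h4[s - 1][t - 1] = a, b
    return ll4, h4


def inverse_random_pixel_scramble(ll4, h4, index_matrix):
    m = len(ll4)
    ll3 = [[None] * m for _ in range(m)]
    h3 = [[None] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            s, t = target_position(index_matrix[i][j], m)
            a, b = ll4[s - 1][t - 1], h4[s - 1][t - 1]
            if (a + b) % 2 == 1:          # parity is unchanged by the exchange
                a, b = b, a
            ll3[i][j], h3[i][j] = a, b
    return ll3, h3


ll3 = [[1, 2], [3, 4]]
h3 = [[10, 22], [30, 41]]
s_matrix = [[3, 1], [4, 2]]               # toy index matrix (permutation of 1..4)
ll4, h4 = random_pixel_scramble(ll3, h3, s_matrix)
restored = inverse_random_pixel_scramble(ll4, h4, s_matrix)
```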

3.1.5 Diffusion

This subsection is provided to modify the pixel values of the scrambled matrices LL4 and H4 to improve the encryption effect.

Step 1: Recombine the scrambled matrices LL4 and H4 into a matrix BLH of size m × 2m, where BLH = [LL4, H4], and then convert BLH to a 1 × 2m² sequence B.

Step 2: The chaotic sequences X1, Y1, Z1, W1 are used to generate two random sequences XY and ZW of size 1 × 2m², where XY = [X1, Y1], ZW = [Z1, W1].

Step 3: According to the following Eq. (18), the sequence B is diffused using the sequences XY and ZW to obtain an encrypted sequence C, and the final cipher image is obtained by reshaping C into a matrix. The first element C1 of the sequence C is generated from the first elements of XY, ZW and B, and each later element also depends on the previous cipher element. The diffusion process is as follows:

$$ \left\{\begin{array}{l}{C}_1={B}_1\bigoplus {XY}_1\bigoplus {ZW}_1\\ {}{C}_i={B}_i\bigoplus {C}_{i-1}\bigoplus {XY}_i\bigoplus {ZW}_i\end{array}\right. $$
(18)

where Bi, Ci, XYi and ZWi represent the i-th elements of the sequences B, C, XY and ZW, respectively, Ci-1 represents the (i-1)-th element of the sequence C, and i = 2, 3, ..., 2m².
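A minimal Python sketch of the chained XOR diffusion in Eq. (18) is shown below. The paper does not state how the real-valued chaotic sequences are turned into integers before the XOR, so the `to_bytes` mapping here is purely an assumption, as are the function names and sample values. Solving Eq. (18) for Bi gives the inverse used in `undiffuse`: Bi = Ci ⊕ Ci-1 ⊕ XYi ⊕ ZWi.

```python
def to_bytes(seq):
    """Assumed mapping from chaotic reals to integers in 0..255."""
    return [int(abs(v) * 10 ** 10) % 256 for v in seq]


def diffuse(b, xy, zw):
    """Eq. (18): C1 = B1 ^ XY1 ^ ZW1, Ci = Bi ^ C(i-1) ^ XYi ^ ZWi."""
    c = []
    prev = 0                      # no feedback term for the first element
    for bi, x, z in zip(b, xy, zw):
        prev = bi ^ prev ^ x ^ z
        c.append(prev)
    return c


def undiffuse(c, xy, zw):
    """Inverse: Bi = Ci ^ C(i-1) ^ XYi ^ ZWi, with C0 treated as 0."""
    b = []
    prev = 0
    for ci, x, z in zip(c, xy, zw):
        b.append(ci ^ prev ^ x ^ z)
        prev = ci
    return b


xy = to_bytes([0.123456789, -0.987654321, 0.5, 0.25])   # stand-in for XY
zw = to_bytes([0.314159265, 0.271828182, -0.75, 0.1])   # stand-in for ZW
b_seq = [12, 200, 7, 255]                                # toy pixel sequence B
c_seq = diffuse(b_seq, xy, zw)
```

Because each cipher element feeds into the next one, changing a single pixel of B changes every later element of C, which is the purpose of the diffusion stage.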

3.2 The image decryption process

Image decryption is the inverse process of image encryption, and its flow chart is shown in Fig. 4. Before decryption, the client transmits the SHA-256 hash value of the plain image and the external keys t1, t2, t3, t4, μ1, μ2 to the cloud, and sends the hash value and the external keys t1, t2, t3, t4, μ3, μ4 to the user. As described in Section 3.1.1, the chaotic sequences X1, Y1, Z1, W1, U1 and K1 are first generated from the secret keys. The specific decryption process consists of the following ten steps.

Fig. 4 The flow chart of the image decryption algorithm

Step 1: Inverse diffusion. As described in Section 3.1.5, the sequences XY and ZW are produced from the sequences X1, Y1, Z1, W1. Then, the sequence C obtained from the cipher image of size m × 2m is inversely diffused with XY and ZW to recover the sequence B. Since Eq. (18) feeds back the previous cipher element, the inverse diffusion process is as follows:

$$ \left\{\begin{array}{l}{B}_1={C}_1\oplus {XY}_1\oplus {ZW}_1\\ {}{B}_i={C}_i\oplus {C}_{i-1}\oplus {XY}_i\oplus {ZW}_i\end{array}\right. $$
(19)

where i = 2, 3, ..., 2m². Subsequently, the sequence B is converted into a matrix and divided into two non-overlapping matrices LL4 and H4 of size m × m.

Step 2: Inverse random pixel scrambling. According to the following rules, the matrices LL4 and H4 are inversely scrambled by the chaotic sequence Z1 to obtain the matrices LL3 and H3.

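For each position (i, j), the target position (s, t) is computed as in encryption. Because exchanging two values does not change the parity of their sum, the same parity test can be applied to the scrambled matrices:

When (LL4 (s, t) + H4 (s, t)) mod 2 = 1, then LL3 (i, j) = H4 (s, t), H3 (i, j) = LL4 (s, t);

When (LL4 (s, t) + H4 (s, t)) mod 2 = 0, then LL3 (i, j) = LL4 (s, t), H3 (i, j) = H4 (s, t),

where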

$$ \left\{\begin{array}{l}s=S\left(i,j\right)\operatorname{mod}m+1\\ {}t= ceil\left(S\left(i,j\right)/m\right)\end{array}\right. $$
(20)

Here ceil (x) returns the smallest integer greater than or equal to x, and 1 ≤ i, j, s, t ≤ m.

Step 3: Inverse quantization. The matrices LL3 and H3 are inversely quantized to obtain LL2 and H2. Taking LL3 as an example, its inverse quantization process is as follows:

$$ {LL}_2=\frac{LL_3\times \left(\max -\min \right)}{255}+\min $$
(21)

where min, max are the minimum and maximum values of the matrix LL2, respectively.

Step 4: Inverse double random phase encoding (IDRPE).

(1) The matrices LL2 and H2 are combined into a complex matrix LLH2 of size m × m, where LLH2 = LL2 + j × H2 and j is the imaginary unit.

(2) As described in Section 3.1.3, two random sequences XY0 and ZW0 are generated from the chaotic sequences X1, Y1, Z1, W1 and then converted to two random matrices q1 and q2, respectively. Then, two conjugate random phase mask matrices \( {M}_1^{\prime } \) and \( {M}_2^{\prime } \) are obtained by

$$ {M}_i^{\prime }={e}^{-j2\pi {q}_i\left(x,y\right)} $$
(22)

where i = 1, 2.

(3) Perform the IDRPE transform on the matrix LLH2 to obtain the matrix LLH1 via Eq. (23),

$$ {LLH}_1= IFFT\left( FFT\left({LLH}_2\right)\cdot {M}_2^{\prime}\right)\cdot {M}_1^{\prime } $$
(23)

After the above operation, the real and imaginary parts of the matrix LLH1 are extracted, in the same way as in Eq. (15), to obtain the matrices LL1 and H1 of size m × m, respectively.

Step 5: Generation and scrambling of the measurement matrices. As described in Section 3.1.2, three measurement matrices Φ1, Φ2, Φ3 are constructed from the chaotic sequences U1 and K1. To protect the data on the cloud, the user selects three sequences US1, US2 and US3 of size 1 × m from the chaotic sequence U1 and sorts them in ascending order to obtain the index sequences s4, s5 and s6, respectively. Subsequently, the matrices Ma1, Ma2, Ma3 are obtained by shuffling the columns of the measurement matrices Φ1, Φ2, Φ3 via,

$$ \left\{\begin{array}{l}{M}_{a1}\left(:,{s}_4(t)\right)={\Phi}_1\left(:,t\right)\\ {}{M}_{a2}\left(:,{s}_5(t)\right)={\Phi}_2\left(:,t\right)\\ {}{M}_{a3}\left(:,{s}_6(t)\right)={\Phi}_3\left(:,t\right)\end{array}\right. $$
(24)

where 1 ≤ t ≤ m. Afterwards, the three scrambled measurement matrices are transmitted to the cloud for CS reconstruction.

Step 6: CS reconstruction on the cloud. The matrix H1 is divided into three non-overlapping matrices y1, y2, y3, where y1 has size m/2 × m, and y2 and y3 both have size m/4 × m. When the cloud receives the scrambled measurement matrices from the user, it uses its strong computing power to reconstruct the sparse coefficient matrices HL11, LH11, HH11 of size m × m, and then transmits them to the user.

Step 7: Inverse scrambling of the sparse coefficient matrices. After the user receives the sparse coefficient matrices HL11, LH11, HH11 from the cloud, the index sequences s4, s5, s6 are used to inversely scramble their rows and obtain the three sparse coefficient matrices HL1, LH1, HH1. The detailed inverse scrambling operation is,

$$ \left\{\begin{array}{l}{\mathrm{HL}}_1\left(t,:\right)={\mathrm{HL}}_{11}\left({s}_4(t),:\right)\\ {}{\mathrm{LH}}_1\left(t,:\right)={\mathrm{LH}}_{11}\left({s}_5(t),:\right)\\ {}{\mathrm{HH}}_1\left(t,:\right)={\mathrm{HH}}_{11}\left({s}_6(t),:\right)\end{array}\right. $$
(25)

where 1 ≤ t ≤ m.
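The reason the cloud can reconstruct with a column-scrambled measurement matrix, as in Eqs. (24) and (25), is that permuting the columns of Φ is equivalent to permuting the rows of the unknown coefficient matrix. The Python sketch below checks this identity on small integer matrices; it does not run an actual CS solver, indices are 0-based, and all names and values are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scramble_columns(phi, s):
    """Eq. (24): Ma(:, s(t)) = Phi(:, t)."""
    ma = [[0] * len(s) for _ in phi]
    for t, target in enumerate(s):
        for r in range(len(phi)):
            ma[r][target] = phi[r][t]
    return ma


def scramble_rows(x, s):
    """The coefficient matrix as seen by the cloud: X11(s(t), :) = X1(t, :)."""
    out = [None] * len(s)
    for t, target in enumerate(s):
        out[target] = x[t]
    return out


def unscramble_rows(x11, s):
    """Eq. (25): X1(t, :) = X11(s(t), :)."""
    return [x11[target] for target in s]


phi = [[1, 2, 3, 4], [5, 6, 7, 8]]             # toy measurement matrix
x1 = [[1, 0], [0, 2], [3, 0], [0, 4]]          # toy sparse coefficients
s4 = [2, 0, 3, 1]                              # toy index sequence
ma = scramble_columns(phi, s4)
x11 = scramble_rows(x1, s4)                    # what a solver on the cloud would return
y = matmul(phi, x1)                            # measurements produced at encryption
```

Since `matmul(ma, x11)` equals `y`, a solver given only `ma` and `y` recovers the row-scrambled coefficients `x11`, and only a party holding the index sequence can restore `x1`.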

Step 8: Inverse sparsification. According to the following Eq. (26), the three detail components HL, LH and HH of size m × m are obtained by inversely transforming the sparse coefficient matrices HL1, LH1, HH1.

$$ \left\{\begin{array}{l} HL={\psi}^{\prime }{HL}_1\psi \\ {} LH={\psi}^{\prime }{LH}_1\psi \\ {} HH={\psi}^{\prime }{HH}_1\psi \end{array}\right. $$
(26)

where ψ´ is the transposition of matrix ψ.

Step 9: Inverse scrambling of the approximate component. The index sequence s1 is obtained by sorting the chaotic sequence X1 in ascending order, and the matrix LL1 is transformed into a one-dimensional sequence l2. Then l1 is obtained by inversely shuffling l2 with s1, and is converted to the approximate component LL of size m × m.

$$ {l}_1(t)={l}_2\left({s}_1(t)\right) $$
(27)
Step 10: Inverse discrete wavelet transform (IDWT). By performing IDWT on the approximate component LL and the detail components HL, LH, HH, the plain image P is obtained. The decryption process is finished.

In the above decryption process, Steps 1, 2, 3, 4 and 6 are carried out on the cloud platform, and the remaining steps are completed on the user side. In general, a cloud is curious about the content it computes on and attempts to extract sensitive information from it, even if it performs its computation duties faithfully. Besides this curiosity, malicious behavior should also be considered. To cope with this kind of semi-trusted cloud, input/output privacy designs are essential. In the proposed method, the measurement matrices are shuffled by the user in Step 5 and sent to the cloud for CS reconstruction, and the reconstructed sparse coefficient matrices are the output information. If they are obtained by an attacker, the correct image cannot be recovered without the index sequences s4, s5 and s6, which indicates that the input/output information is secure. Besides, before decrypting the plain image, the user can compare the contour information of the approximate component and the detail components to determine whether the cloud has cheated.

4 Simulation results

In this section, MATLAB R2016a is employed to verify the encryption and decryption effects of the proposed algorithm on a personal computer with an Intel(R) Core(TM) i7–6700 CPU at 3.40 GHz and 8 GB of memory, running Microsoft Windows 10. The parameters used in the algorithm are shown in Table 1. t1, t2, t3 and t4 are the parameters of the external key, and μ1, μ2, μ3, μ4 are the control parameters of the chaotic system. Four different 512 × 512 images, “Lena”, “Baboon”, “Cameraman” and “Peppers”, are used as the plain images. Moreover, in order to fully illustrate the effectiveness of the algorithm, the related test images are collected from the USC-SIPI image database [39].

Table 1 Experimental parameters

4.1 Encryption and decryption results

The encryption and decryption results for “Lena”, “Baboon”, “Cameraman” and “Peppers” with the size of 512 × 512 are shown in Fig. 5. The simulation results show that when the compression ratio (CR) is 0.5, the cipher images shown in the second column cannot be recognized, which protects the information of the plain images, and their data volume is halved. In addition, the reconstructed images shown in the third column have good visual quality and closely resemble their corresponding plain images.

Fig. 5

Plain images, cipher images and decrypted images. (a), (b), (c) represent the original, cipher and decrypted images of “Lena”, respectively. (d), (e), (f) represent the original, cipher and decrypted images of “Baboon”, respectively. (g), (h), (i) represent the original, cipher and decrypted images of “Cameraman”, respectively. (j), (k), (l) represent the original, cipher and decrypted images of “Peppers”, respectively

4.2 Compression performance

This section mainly shows the compression performance of the algorithm from peak signal-to-noise ratio (PSNR) and structural similarity index measurement (SSIM), and compares with the existing algorithms to show that the proposed algorithm has good reconstruction quality.

4.2.1 Peak signal-to-noise ratio (PSNR)

The PSNR is usually employed to estimate the quality of the decrypted image, and it can be calculated by [46].

$$ PSNR=10{\log}_{10}\frac{255^2}{\left(1/{N}^2\right){\sum}_{i=1}^N{\sum}_{j=1}^N{\left[D\left(i,j\right)-I\left(i,j\right)\right]}^2} $$
(28)

where D(i, j) and I(i, j) represent the reconstructed image and the plain image, respectively, and N × N is the image size. The PSNR test results of different images are shown in Tables 2 and 3. In Table 2, the plain images are all 512 × 512; at a compression ratio of 0.5, the PSNR values between the reconstructed and plain images exceed 30 dB, and the maximum value, for “Cameraman”, exceeds 36 dB. For the 256 × 256 images in Table 3, the PSNR values are about 30 dB. Both sets of results demonstrate that the decrypted images obtained by our algorithm have good visual quality and can be well recognized. Additionally, Table 4 shows that our algorithm has higher PSNR values than those in [12, 31, 59], which means that the proposed ICES has better reconstruction quality.

Table 2 PSNR values of different plain images of size 512 × 512
Table 3 PSNR values of different images of size 256 × 256
Table 4 Comparison of different algorithms for PSNR values
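As an illustration of Eq. (28), the following is a minimal NumPy sketch of the PSNR computation for 8-bit images. The function name `psnr` and the handling of identical images (returning infinity) are choices made for this sketch, not details given in the paper.

```python
import numpy as np

def psnr(decrypted, plain):
    """PSNR in dB between two 8-bit images of equal size, following Eq. (28)."""
    d = np.asarray(decrypted, dtype=np.float64)
    p = np.asarray(plain, dtype=np.float64)
    if d.shape != p.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((d - p) ** 2)          # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical images (sketch convention)
    return 10.0 * np.log10(255.0 ** 2 / mse)

# Toy example: every pixel differs by exactly one gray level
plain = np.full((8, 8), 100, dtype=np.uint8)
decrypted = np.full((8, 8), 101, dtype=np.uint8)
value = psnr(decrypted, plain)
```

The conversion to floating point before subtraction avoids the wrap-around that unsigned 8-bit arithmetic would otherwise cause.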

4.2.2 Structural similarity index measurement (SSIM)

Besides, SSIM is employed to measure the similarity of the decrypted image and plain image. Its value range is [0, 1], and the larger the value is, the higher the similarity between these two images is. SSIM is defined by [43].

$$ SSIM=\frac{\left(2{\mu}_X{\mu}_Y+{C}_1\right)\left(2{\sigma}_{XY}+{C}_2\right)}{\left({\mu}_X^2+{\mu}_Y^2+{C}_1\right)\left({\sigma}_X^2+{\sigma}_Y^2+{C}_2\right)} $$
(29)

where μX and μY are the mean values of the plain image X and the decrypted image Y, respectively, \( {\sigma}_X^2 \) and \( {\sigma}_Y^2 \) are the variances of X and Y, σXY is the covariance of X and Y, \( {C}_1={\left({k}_1\times L\right)}^2 \), \( {C}_2={\left({k}_2\times L\right)}^2 \), k1 = 0.01, k2 = 0.03, and L = 255.

The SSIM values of different images are listed in Table 5. When the compression ratio of the original images is 0.5, the SSIM values of the decrypted images and the original images are all above 0.95 for 256 × 256 images; in the meantime, the SSIM values of the decrypted images and original images are all more than 0.97 for 512 × 512 images, indicating that the original images and the reconstructed images are very similar, and the proposed method has good compression and reconstruction effect.

Table 5 SSIM values of different images
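Eq. (29) can be sketched in NumPy as a single-window (global) SSIM, in which the means, variances and covariance are taken over the whole image. This is an assumption of the sketch: common SSIM implementations average the same expression over local windows, and the paper does not state which variant was used.

```python
import numpy as np

def ssim_global(x, y, k1=0.01, k2=0.03, L=255):
    """Single-window SSIM of Eq. (29), using whole-image statistics."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * L) ** 2
    c2 = (k2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()                 # sigma_X^2 and sigma_Y^2
    cov_xy = np.mean((x - mu_x) * (y - mu_y))       # sigma_XY
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

# Toy images (assumed): a random image and a noisy copy of it
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
noisy = np.clip(img + rng.normal(0, 20, size=img.shape), 0, 255)
same = ssim_global(img, img)
degraded = ssim_global(img, noisy)
```

For identical inputs the numerator and denominator coincide, so the value is 1; any distortion lowers it.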

5 Performance analyses

This section assesses the performance of the proposed cipher from the key space, key sensitivity, information entropy, histogram, adjacent pixel correlation, cropping attack and noise attack.

5.1 Key space analysis

The secret key of the proposed algorithm consists of: ① the external keys t1, t2, t3, t4; ② the control parameters μ1, μ2, μ3, μ4 of the chaotic systems; ③ the 256-bit hash value from the SHA-256 hash function of the original image. If the computing precision of the computer is \( {10}^{-14} \), the key space is about \( {\left({10}^{14}\right)}^4\times {\left({10}^{14}\right)}^4={10}^{112}>{2}^{372} \). If the 256-bit hash value is also considered, the overall key space is much larger than \( {2}^{100} \), which is sufficient to resist brute-force attacks [8]. In addition, as shown in Table 6, the proposed scheme has a larger key space than the encryption schemes in [12, 21, 31, 59], so it has higher security.
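The key-space arithmetic above can be checked with exact integer arithmetic. The sketch below assumes, as the text does, that each of the eight real-valued parameters contributes \( {10}^{14} \) distinguishable values; the variable names are illustrative.

```python
# Exact integer check of the key-space estimate (Python ints are unbounded).
values_per_parameter = 10 ** 14      # precision 10^-14 gives 10^14 values
n_parameters = 8                     # t1..t4 and mu1..mu4

key_space = values_per_parameter ** n_parameters
bits = key_space.bit_length() - 1    # floor(log2(key_space))

# Including the 256-bit SHA-256 value multiplies the space by 2^256.
key_space_with_hash = key_space * 2 ** 256
bits_with_hash = key_space_with_hash.bit_length() - 1
```

Here `bits` evaluates to 372, consistent with \( {10}^{112}>{2}^{372} \), and `bits_with_hash` to 628, far above the \( {2}^{100} \) threshold quoted in the text.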

Table 6 Comparison of key space of the proposed method with other methods

5.2 Key sensitivity analysis

A good image encryption algorithm should be highly sensitive to its secret keys [15]. This means that a tiny change in the keys should cause a great distortion [24]. In this subsection, the Lena image shown in Fig. 5(a) is used as the test image; the correct cipher image and decrypted image are displayed in Fig. 5(b) and (c).

The number of pixels change rate (NPCR) is an important criterion for measuring the pixel-wise difference between two images. The NPCR is computed by [22].

$$ NPCR=\frac{\sum \limits_{i,j}D\left(i,j\right)}{M\times N}\times 100\% $$
(30)
$$ D\left(i,j\right)=\left\{\begin{array}{l}1,{C}_1\left(i,j\right)\ne {C}_2\left(i,j\right)\\ {}0, otherwise\end{array}\right. $$
(31)

where C1 and C2 represent two images, respectively, and M and N are the sizes of the image.
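Eqs. (30) and (31) translate directly into a few lines of NumPy. The sketch below is illustrative; the function name `npcr` and the toy arrays are assumptions.

```python
import numpy as np

def npcr(c1, c2):
    """NPCR in percent: share of positions where the two images differ."""
    c1 = np.asarray(c1)
    c2 = np.asarray(c2)
    if c1.shape != c2.shape:
        raise ValueError("images must have the same shape")
    d = (c1 != c2)                    # D(i, j) of Eq. (31) as a boolean mask
    return 100.0 * float(d.mean())    # Eq. (30)

# Toy example: two 2 x 2 images that differ in exactly one pixel
a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
b = np.array([[10, 20], [30, 41]], dtype=np.uint8)
rate = npcr(a, b)
```

For two independent, uniformly distributed 8-bit images, the expected NPCR is 255/256 ≈ 99.61%, which is the reference point for the values of about 99.60% reported in this subsection.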

In the simulation, key sensitivity is analyzed in both the encryption process and the decryption process. Firstly, a change of \( {10}^{-14} \) is added to each of the key parameters t1, t2, t3 and t4 to obtain the new parameters t1 + \( {10}^{-14} \), t2 + \( {10}^{-14} \), t3 + \( {10}^{-14} \) and t4 + \( {10}^{-14} \). Next, the changed parameters are used to encrypt the plain image in Fig. 5(a), and the resulting cipher images are shown in Fig. 6. The subtraction images and NPCR values between them and Fig. 5(b) are given in Fig. 7 and Table 7, respectively. When the key changes slightly, about 99.60% of the pixels change, which means that our algorithm has high key sensitivity in the encryption stage.

Fig. 6

Encrypted “Lena” using incorrect keys (a) t1 + \( {10}^{-14} \), (b) t2 + \( {10}^{-14} \), (c) t3 + \( {10}^{-14} \), (d) t4 + \( {10}^{-14} \)

Fig. 7

The subtraction images between Fig. 5(b) and (a) Fig. 6(a), (b) Fig. 6(b), (c) Fig. 6(c), (d) Fig. 6(d)

Table 7 Sensitivity analysis of encryption keys

Besides, the slightly changed parameters t1 + \( {10}^{-14} \), t2 + \( {10}^{-14} \), t3 + \( {10}^{-14} \) and t4 + \( {10}^{-14} \) are used to decrypt the Lena cipher image shown in Fig. 5(b), with only one parameter modified each time. The results are illustrated in Fig. 8(a)-(d). As shown in Fig. 8, the correct plain image cannot be obtained even if the parameters change only slightly. In addition, the NPCR between the correctly decrypted image and the image decrypted with a changed key is listed in Table 8. As can be observed from Table 8, the NPCR values are about 99.5%, which means that the proposed scheme has high key sensitivity and makes brute-force attacks ineffective.

Fig. 8

Decrypted “Lena” using incorrect keys (a) t1 + \( {10}^{-14} \), (b) t2 + \( {10}^{-14} \), (c) t3 + \( {10}^{-14} \), (d) t4 + \( {10}^{-14} \)

Table 8 Sensitivity analysis of decryption keys

5.3 Histogram analysis

In order to prevent attackers from cracking the image by statistically analyzing the gray value distribution, the histogram of the cipher image is required to be smooth and uniform [10, 50]. The histograms of the plain images “Lena”, “Baboon”, “Cameraman” and “Peppers” and of the corresponding cipher images are plotted in Fig. 9. It can be seen from Fig. 9 that the histograms of the plain images are steep and fluctuating, whereas the histograms of the cipher images are very uniform and significantly different from those of the plain images, so the scheme can effectively withstand statistical analysis attacks.

Fig. 9

Histograms of different plain images (512 × 512) and cipher images (256 × 512): (a) “Lena”, (b) encrypted “Lena”, (c) “Baboon”, (d) encrypted “Baboon”, (e) “Cameraman”, (f) encrypted “Cameraman”, (h) “Peppers”, (i) encrypted “Peppers”

In addition, the chi-square test (\( {\chi}^2 \)) and the histogram variance are used to quantitatively evaluate the uniformity of the cipher image. The chi-square statistic is calculated by [41].

$$ {\chi}_{test}^2=\sum \limits_{i=0}^{255}\frac{{\left({o}_i-o\right)}^2}{o} $$
(32)
$$ o=\frac{M\times N}{256} $$
(33)

where oi is the number of occurrences of the pixel value i in the image of size M × N. The chi-square test results under a significance level of 0.05 are shown in Table 9, and Table 10 gives the histogram variance of the plain images and cipher images. The results indicate that all cipher images have passed the chi-square test and that the histogram variance of each cipher image is much smaller than that of its plain image. The cipher images therefore have a nearly uniform distribution and do not provide any clue for statistical analysis attacks.

Table 9 Chi-square test of different images
Table 10 Variance of histogram of different images
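A minimal NumPy sketch of the chi-square statistic of Eqs. (32) and (33) follows. Two points are assumptions of the sketch rather than statements from the paper: the pass threshold is the commonly quoted critical value 293.2478 for 255 degrees of freedom at the 0.05 level, and the histogram variance is taken as the population variance of the 256 bin counts.

```python
import numpy as np

CHI2_CRITICAL_255_005 = 293.2478   # commonly quoted critical value (assumed here)

def histogram_counts(img):
    """Number of occurrences o_i of each gray level 0..255."""
    flat = np.asarray(img, dtype=np.uint8).ravel()
    return np.bincount(flat, minlength=256)

def chi_square(img):
    """Chi-square statistic of Eq. (32) with expected count o = M*N/256."""
    counts = histogram_counts(img).astype(np.float64)
    expected = counts.sum() / 256.0                  # Eq. (33)
    return float(np.sum((counts - expected) ** 2 / expected))

def histogram_variance(img):
    """Population variance of the bin counts (assumed definition)."""
    return float(np.var(histogram_counts(img)))

def passes_chi_square(img):
    return chi_square(img) < CHI2_CRITICAL_255_005

# Toy images: a perfectly uniform histogram and a constant image
uniform = np.tile(np.arange(256, dtype=np.uint8), 4).reshape(32, 32)
constant = np.zeros((16, 16), dtype=np.uint8)
chi_uniform = chi_square(uniform)
chi_constant = chi_square(constant)
```

A perfectly flat histogram gives a statistic of 0, while the constant 16 × 16 image gives 255² + 255 = 65280 and fails the test by a wide margin.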

5.4 Correlation coefficient analysis

Correlation coefficient analyses of plain images and cipher images are utilized to qualitatively evaluate the encryption scheme. The correlation coefficient may be defined as [59],

$$ CC=\frac{Cov\left({v}_{ij},{v}_{xy}\right)}{\sqrt{D\left({v}_{ij}\right)\cdot D\left({v}_{xy}\right)}} $$
(34)
$$ Cov\left({v}_{ij},{v}_{xy}\right)=\frac{1}{MN}\sum \limits_{i,x=1}^M\sum \limits_{j,y=1}^N\left(\left[{v}_{ij}-E(v)\right]\cdot \left[{v}_{xy}-E(v)\right]\right) $$
(35)

where vij and vxy are the values of two adjacent pixels located at (i, j) and (x, y) in the image, respectively, E (·) represents the mean value of image pixels, and D (·) expresses the variance of the values, M and N are the sizes of the image.

In this section, we randomly choose 5000 pairs of adjacent pixels from the plain image and the cipher image of “Lena”, and analyze the correlations from horizontal, vertical and diagonal directions. Figure 10 plots the correlation distribution among adjacent pixels of “Lena” before and after encryption. It can be observed from Fig. 10 that the adjacent pixels in the horizontal, vertical and diagonal directions of the plain image are linearly distributed, while the pixels of the cipher image are uniformly distributed, indicating that the correlation between adjacent pixels of the plaintext image can be effectively eliminated.
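The sampling experiment described above can be sketched as follows. The sketch draws 5000 random adjacent pairs and evaluates the standard Pearson coefficient with `np.corrcoef`, which uses a separate sample mean for each pixel sequence; the function name, the fixed seed and the toy images are assumptions.

```python
import numpy as np

# Offsets (row, column) from a pixel to its neighbor in each direction
OFFSETS = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}

def adjacent_correlation(img, direction="horizontal", n_pairs=5000, seed=0):
    """Correlation coefficient of randomly chosen adjacent pixel pairs."""
    img = np.asarray(img, dtype=np.float64)
    dr, dc = OFFSETS[direction]
    rows, cols = img.shape
    rng = np.random.default_rng(seed)
    r = rng.integers(0, rows - dr, size=n_pairs)   # keeps r + dr inside the image
    c = rng.integers(0, cols - dc, size=n_pairs)   # keeps c + dc inside the image
    first = img[r, c]
    second = img[r + dr, c + dc]
    return float(np.corrcoef(first, second)[0, 1])

# Toy images: a smooth gradient (highly correlated) and uniform noise
gradient = np.add.outer(np.arange(64), np.arange(64))
noise = np.random.default_rng(1).integers(0, 256, size=(64, 64))
cc_gradient = adjacent_correlation(gradient, "horizontal")
cc_noise = adjacent_correlation(noise, "diagonal")
```

The gradient image plays the role of a plain image with strongly correlated neighbors, and the noise image plays the role of an ideal cipher image.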

Fig. 10

Correlation of two adjacent pixels in the plain and cipher images. (a) and (b) are the horizontal correlation of the plain image “Lena” and its cipher image, respectively; (c) and (d) are the vertical correlation of the plain image “Lena” and its cipher image, respectively; (e) and (f) are the diagonal correlation of the plain image “Lena” and its cipher image, respectively

In addition, the correlation coefficients in the horizontal, vertical and diagonal directions of the test images are calculated and shown in Table 11. Table 11 shows that the correlation coefficients of the plain images are close to 1, while those of the corresponding cipher images tend to 0, which means that the strong correlation of the plain images has been effectively removed. Table 12 shows the comparison results of correlation coefficients. According to Table 12, the correlation coefficients obtained by the proposed algorithm are similar to those in [12, 13, 29, 59], demonstrating that our method has a good confusion effect and can resist statistical analysis attacks.

Table 11 Correlation coefficient between the plain image and cipher image
Table 12 Comparison with other algorithms for Lena image

5.5 Information entropy analysis

To a certain extent, the information entropy of the image can represent the degree of disturbance of the image [32]. Moreover, the image is divided into non overlapping blocks of the same size. The average value of the information entropy of each block is the local entropy of the image. The local entropy can better represent the randomness of the pixel values in the image, and its ideal value is 8. At the same time, for 8-bit grayscale image, the image entropy divided by 8 is called relative entropy, which can be used to indicate whether the image can be compressed and its ideal value is 1. The difference between 1 and relative entropy is called redundancy [56]. Redundancy indicates whether the image has information redundancy, and its ideal value is 0. The information entropy, local entropy, relative entropy and redundancy are calculated by [18],

$$ \left\{\begin{array}{l}H(m)={\sum}_{i=0}^{255}p\left({m}_i\right){\log}_2\frac{1}{p\left({m}_i\right)}\\ {}{\overline{H}}_{k,{T}_B}(m)=\sum \limits_{i=1}^k\frac{H\left({M}_i\right)}{k}\\ {}{H}_r=\frac{H}{8}\\ {}{R}_{ed}=1-{H}_r\end{array}\right. $$
(36)

where H(m) is the information entropy and p(mi) denotes the occurrence probability of symbol mi. \( {\overline{H}}_{k,{T}_B}(m) \) is the local entropy: the image is divided into k sub-blocks, the information entropy of each sub-block Mi is H(Mi), and the pixel number of every sub-block is TB. Hr represents the relative entropy, and Red represents the redundancy.

In the simulation, the number of sub-blocks is k = 256. For the plain image, the pixel number of every sub-block is TB = 1024; for the cipher image, it is TB = 512. The information entropy, local entropy, relative entropy and redundancy of different plain images are shown in Table 13, and the corresponding results for the encrypted images are given in Table 14.

Table 13 Information entropy, local entropy, relative entropy and redundancy of plain images
Table 14 Information entropy, local entropy, relative entropy and redundancy of cipher images
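The four quantities of Eq. (36) can be sketched in NumPy as below. The rectangular block shape is an assumption: the paper fixes only k = 256 and TB, so a 32 × 32 block for a 512 × 512 plain image or a 16 × 32 block for a 256 × 512 cipher image would be one way to satisfy those values. Function names are illustrative.

```python
import numpy as np

def entropy(pixels):
    """Shannon entropy H(m) in bits of an 8-bit pixel array."""
    flat = np.asarray(pixels, dtype=np.uint8).ravel()
    counts = np.bincount(flat, minlength=256)
    p = counts[counts > 0] / flat.size        # skip empty bins (0 * log 0 = 0)
    return float(-np.sum(p * np.log2(p)))

def local_entropy(img, block_shape):
    """Mean entropy over non-overlapping blocks of the given (assumed) shape."""
    img = np.asarray(img, dtype=np.uint8)
    bh, bw = block_shape
    rows, cols = img.shape
    if rows % bh or cols % bw:
        raise ValueError("image size must be a multiple of the block shape")
    blocks = img.reshape(rows // bh, bh, cols // bw, bw).swapaxes(1, 2)
    blocks = blocks.reshape(-1, bh * bw)      # one row per sub-block
    return float(np.mean([entropy(b) for b in blocks]))

def relative_entropy(img):
    return entropy(img) / 8.0                 # H_r = H / 8

def redundancy(img):
    return 1.0 - relative_entropy(img)        # R_ed = 1 - H_r

# Toy image: each gray level 0..255 appears exactly once
toy = np.arange(256, dtype=np.uint8).reshape(16, 16)
h = entropy(toy)
h_local = local_entropy(toy, (8, 8))
```

For the toy image the global entropy reaches the ideal value 8, while each 8 × 8 block holds only 64 distinct values and therefore has entropy 6, which shows why local entropy depends on the block size TB.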

Comparing Tables 13 and 14, one can conclude that: (1) the information entropy of the plain image is significantly less than the ideal value 8, while the information entropy of the cipher image is very close to 8; (2) the local entropy of every plain image is less than its cipher image, and the cipher image is more random; (3) the relative entropy of each plain image is less than 1, indicating that the plain image can be compressed, and the relative entropy of the cipher image is very close to 1, demonstrating that the cipher image is difficult to compress; (4) the redundancy of each plain image is greater than 5%, and that of each cipher image is less than 0.02%, which means that the cipher image has little information redundancy.

Furthermore, the information entropy values of different cipher images obtained by different cryptosystems are shown in Table 15. It may be observed from Table 15 that the information entropy of our algorithm is slightly higher than that of [29, 59] and marginally lower than that of [12, 13], which shows that the proposed scheme is robust against entropy attacks and has a high security level.

Table 15 Comparison of information entropy for different images

5.6 Noise attack analysis

In the simulation, the “Lena” image (shown in Fig. 5(a)) is used as test image, Salt & Pepper noise (SPN), Speckle noise (SN), and Gaussian noise (GN) with different intensities are added to the “Lena” cipher image (shown in Fig. 5(b)), and then the obtained noisy images are decrypted and shown in Fig. 11. The PSNR values between the decrypted images and the plain image are calculated and displayed in Fig. 12.

Fig. 11

The reconstructed images under different noise intensities

Fig. 12

PSNR values under different noise intensities

From Figs. 11 and 12, it can be seen that when the noise intensity increases from 0.000001 to 0.000007, the proposed encryption algorithm has strong resistance to SPN, with the PSNR value remaining at 35.3844 dB. Our method has moderate resistance to SN and GN: under the same noise levels, the PSNR values vary from 35.3844 dB to 26.0047 dB for SN and from 26.8283 dB to 25.8966 dB for GN. In summary, the proposed scheme is robust to noise attacks.
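The noise-injection step of this experiment can be sketched as follows. The paper does not define how the noise intensities are parameterized, so the conventions here are assumptions: the salt-and-pepper intensity is the fraction of corrupted pixels, and the Gaussian and speckle intensities are variances on a [0, 1] intensity scale, with zero-mean uniform noise used for speckle.

```python
import numpy as np

def _to_uint8(x):
    """Map a [0, 1] float image back to 8-bit with rounding and clipping."""
    return np.clip(np.rint(x * 255.0), 0, 255).astype(np.uint8)

def add_salt_pepper(img, density, rng):
    """Set about density/2 of the pixels to 0 and density/2 to 255."""
    out = np.array(img, dtype=np.uint8, copy=True)
    u = rng.random(out.shape)
    out[u < density / 2] = 0
    out[u >= 1 - density / 2] = 255
    return out

def add_gaussian(img, variance, rng):
    """Additive zero-mean Gaussian noise with the given variance."""
    x = np.asarray(img, dtype=np.float64) / 255.0
    return _to_uint8(x + rng.normal(0.0, np.sqrt(variance), size=x.shape))

def add_speckle(img, variance, rng):
    """Multiplicative noise x + x*n, n uniform with zero mean and given variance."""
    x = np.asarray(img, dtype=np.float64) / 255.0
    half_width = np.sqrt(3.0 * variance)      # uniform on [-a, a] has variance a^2/3
    n = rng.uniform(-half_width, half_width, size=x.shape)
    return _to_uint8(x + x * n)

# Toy cipher image (assumed) and one noisy version per noise type
rng = np.random.default_rng(0)
cipher = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
noisy_spn = add_salt_pepper(cipher, 0.000007, rng)
noisy_gn = add_gaussian(cipher, 0.000007, rng)
noisy_sn = add_speckle(cipher, 0.000007, rng)
```

Each noisy cipher image would then be decrypted and compared with the plain image by PSNR, as in Fig. 12.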

5.7 Cropping attack analysis

During the transmission of cipher images, some information may be lost. Therefore, assessing the ability to resist cropping attacks is necessary for a cryptosystem. Here, the “Lena” cipher image (shown in Fig. 5(b)) is selected for testing. Blocks of 16 × 16, 32 × 32, 64 × 64 and 128 × 128 data are cropped to obtain the images shown in Fig. 13(a)-(d), and the corresponding decrypted images are illustrated in Fig. 13(e)-(h). The PSNR values between the decrypted images and the plain image are computed and plotted in Fig. 14.

Fig. 13

Cropping attack test images. (a), (b), (c), (d) are the cipher images with the cropping size of 16 × 16, 32 × 32, 64 × 64, 128 × 128 data, respectively; (e), (f), (g), (h) are the corresponding decrypted images of (a), (b), (c), (d), respectively

Fig. 14

PSNR values under different cropping sizes

It can be observed from Figs. 13 and 14 that when the lost data increases, the decrypted image becomes blurred, but the outline of the plain image can still be recognized. In addition, the PSNR value is another indicator to evaluate the performance of cropping attacks. When 1/64 of the encrypted image is lost, the PSNR value between the decrypted image and plain image drops to 30.9005 dB. When the encrypted image loses 1/8 of data, the PSNR is 26.8660 dB, which shows that the proposed algorithm can resist certain cropping attacks.
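A cropping attack on the cipher image can be simulated with a short NumPy sketch. The position of the cropped block (top-left corner) and the fill value (zero) are assumptions, since the paper does not specify them; the 256 × 512 cipher size is taken from the caption of Fig. 9.

```python
import numpy as np

def crop_attack(cipher, size, fill=0):
    """Return a copy of the cipher image with a size x size block overwritten."""
    out = np.array(cipher, copy=True)
    out[:size, :size] = fill          # assumed location: top-left corner
    return out

def lost_fraction(shape, size):
    """Fraction of cipher data removed by a size x size crop."""
    return (size * size) / (shape[0] * shape[1])

# Toy cipher image with nonzero pixels so the cropped block is easy to count
cipher = np.full((256, 512), 200, dtype=np.uint8)
cropped = crop_attack(cipher, 128)
fraction = lost_fraction(cipher.shape, 128)
```

Under these assumptions, a 128 × 128 crop removes 1/8 of a 256 × 512 cipher image.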

5.8 Running time analysis

The encryption process mainly includes scrambling of approximate component and compression of detail components, double random phase encoding, random pixel scrambling and diffusion. The process of decryption on cloud mainly includes the generation and scrambling of measurement matrix, inverse scrambling, inverse sparsity, inverse scrambling of approximate component and inverse discrete wavelet transform (IDWT). It can be seen from Table 16 that for images with the size of 512 × 512, the average encryption time of the proposed algorithm is 0.582940 s, the average decryption time is 8.623362 s without cloud, the average decryption time is 0.157799 s using cloud, and the decryption time with cloud is 8.465562 s less than that without cloud.

Table 16 Encryption and decryption times of different images (Unit: s)

At the same time, the specific process of encryption and decryption of Lena image is analyzed. As shown in Fig. 15, in the encryption process, scrambling and diffusion represent the scrambling of approximate components and compression of detail components. The others mainly include the generation of key and chaotic sequences, and they take a long time, accounting for about 85% of the total encryption time. In the decryption process on the cloud, the most time-consuming convex optimization process is carried out on the cloud, so the decryption time on the cloud is very short, in which the inverse discrete wavelet transformation and inverse sparsity account for 90% of the total decryption time.

Fig. 15

The specific running time of encryption and decryption processes for Lena image

In addition, the encryption and decryption times of our proposed algorithm without cloud-assisted decryption are compared with those of other methods in Table 17. It can be seen from this table that even if the cloud platform is not used to assist decryption, for 256 × 256 images, the encryption time of our algorithm is shorter than that of the algorithms in [6, 9, 31, 38], and the decryption time is shorter than that of the algorithm in [6]. For 512 × 512 images, the encryption time of our algorithm is shorter than that of the algorithms in [6, 9, 31, 38, 43], and the decryption time is shorter than that of the algorithm in [6]. To sum up, the encryption and decryption times of the proposed algorithm are short, which shows that the algorithm is efficient and can be applied to real-time secure image communication.

Table 17 Comparison results with other algorithms in running times for Lena images (Unit: s)

5.9 Security analysis of cloud-assisted decryption strategy

In the proposed image encryption scheme, the client transmits the cipher image directly to the cloud after encryption. In the decryption process, the cipher image is first decrypted on the cloud, and then the partially decrypted cipher image is transmitted to the authorized user for the second decryption to obtain the original image. The cipher images transmitted to the cloud platform are noise-like images, and the partially decrypted images output by the cloud are also noise-like. The original image information cannot be obtained from these images with the naked eye. Besides, before the user transmits the measurement matrix to the cloud platform for CS reconstruction, it is scrambled by use of the index sequences generated from the chaotic sequences. Thus, without the correct index sequences, the correct detail components of the plain image cannot be recovered by applying the inverse sparse operation to the sparse coefficient matrices reconstructed by the cloud platform. Therefore, the cloud cannot obtain the original image, which ensures the privacy and security of the image.

Moreover, decryption on the cloud also needs to consider whether the cloud deceives the user. To prevent cloud deception, after obtaining the three detail components and the approximate component by inverse scrambling, the user first constructs a zero matrix of the same size as a detail component. The IDWT is then performed on the three reconstructed detail components and the zero matrix to obtain a new image. Because the authorized user has the approximate component of the decrypted image, the user can compare the new image with the approximate component. If the contours of the two images are similar, the detail components obtained from the cloud are correct, and the plain image can be obtained by applying the IDWT to the detail components and the approximate component. If the contours are completely different, the cloud has deceived the user, and the user can reject the reconstruction results from the cloud.
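The construction of the "new image" used in this check can be sketched with a single-level Haar transform written directly in NumPy. The choice of the Haar wavelet and the naming of the HL and LH sub-bands are assumptions of this sketch; the paper does not name its wavelet in this section, and the contour comparison itself is done by visual inspection, so it is not implemented here.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar transform (assumed wavelet); returns LL, HL, LH, HH."""
    x = np.asarray(x, dtype=np.float64)
    x00, x01 = x[0::2, 0::2], x[0::2, 1::2]
    x10, x11 = x[1::2, 0::2], x[1::2, 1::2]
    ll = (x00 + x01 + x10 + x11) / 2
    hl = (x00 - x01 + x10 - x11) / 2
    lh = (x00 + x01 - x10 - x11) / 2
    hh = (x00 - x01 - x10 + x11) / 2
    return ll, hl, lh, hh

def haar_idwt2(ll, hl, lh, hh):
    """Inverse of haar_dwt2."""
    m, n = ll.shape
    x = np.empty((2 * m, 2 * n), dtype=np.float64)
    x[0::2, 0::2] = (ll + hl + lh + hh) / 2
    x[0::2, 1::2] = (ll - hl + lh - hh) / 2
    x[1::2, 0::2] = (ll + hl - lh - hh) / 2
    x[1::2, 1::2] = (ll - hl - lh + hh) / 2
    return x

def detail_only_image(hl, lh, hh):
    """IDWT of the three detail components with a zero approximate component."""
    zero = np.zeros_like(hl)          # zero matrix of the same size as a detail component
    return haar_idwt2(zero, hl, lh, hh)

# Toy plain image (assumed)
rng = np.random.default_rng(0)
plain = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
LL, HL, LH, HH = haar_dwt2(plain)
new_image = detail_only_image(HL, LH, HH)
approx_only = haar_idwt2(LL, np.zeros_like(LL), np.zeros_like(LL), np.zeros_like(LL))
```

Because the transform is linear, `new_image` carries exactly the edge information that the approximate component lacks, which is what makes a contour comparison between the two meaningful.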

In short, the proposed cloud-assisted decryption method not only ensures the authenticity and security of the decrypted image, but also prevents the user from being deceived. Even if illegal users obtain the reconstructed results from the cloud, they cannot obtain the original image without the index sequences held by the user.

5.10 Discussion

This part shows the advantages of our algorithm in comparison with existing algorithms from two aspects: qualitative analysis and quantitative analysis. The qualitative analysis shows the advantages of the proposed algorithm in terms of the security and efficiency of image data transmission, while the quantitative analysis compares our algorithm with state-of-the-art methods on several indicators: adjacent pixel correlation, information entropy, key space, PSNR and NPCR.

5.10.1 Qualitative analysis with other algorithms

The proposed image security transmission algorithm has some advantages.

First of all, some existing image data security algorithms [7, 11, 12, 20, 31, 35, 42, 45, 48, 58, 59] only involve encryption and decryption between the client and the user: the image is encrypted on the client side, and the decryption is performed on the user side. Although they protect the security of image data, the convex optimization computation of CS reconstruction may require a lot of time, which seriously affects the application of the cryptosystem in real-time fields. In addition, some cloud-based image security transmission algorithms [51, 52, 54, 57] do not consider performing CS reconstruction on the cloud. To solve this problem, in this paper we designed a secure and efficient cloud-decryption-assisted image compression-encryption scheme based on CS. Our scheme not only protects the security of image data, but also effectively relieves the burden of data storage and calculation, and it is suitable for the transmission and storage of large amounts of images on the Internet.

Secondly, some existing cloud-based image security transmission algorithms have shortcomings. For example, although the algorithms in [4, 23, 49] effectively encrypt the image, they lack authentication of the image, which makes it impossible to prevent malicious tampering by the cloud. When data is stored on the cloud, the cloud may distort the stored image data or return arbitrary calculation results because of laziness or curiosity. In this paper, we judge the authenticity of the image data by the contour information of the detail components and the approximate component obtained by the decryption scheme, which can effectively prevent the tampering of image data stored on the cloud.

Finally, the algorithms in [51, 52, 54, 57] require two different sampling and reconstruction passes to ensure the security of CS reconstruction. In this paper, we use only one sampling and reconstruction pass to achieve the security authentication of image data stored on the cloud, and the time-consuming CS reconstruction process is completed on the cloud, which effectively improves the transmission efficiency of image data. To sum up, compared with other existing algorithms, the proposed algorithm has several advantages.

5.10.2 Quantitative analysis with other algorithms

In order to compare conveniently with other algorithms, the image Lena (512 × 512) is chosen as the test image, and the information entropy, the correlation of adjacent pixels, PSNR, key space and NPCR are all computed and listed in Table 18.

Table 18 Comparison results with other studies for Lena image

As shown in Table 18, the information entropy obtained by our algorithm is close to the theoretical value of 8 and is higher than that of the algorithms in [12, 27, 29, 59], which illustrates the effectiveness of our algorithm. Meanwhile, except for the correlation coefficient of adjacent pixels in the vertical direction, the correlation coefficients of the cipher image in the other two directions are better than those of the algorithms in [12, 27, 29, 59], which shows that our method can effectively reduce the correlation between adjacent pixels. The PSNR of our algorithm is better than that of the algorithms in [12, 29, 59], which shows that our method has good reconstruction quality. In addition, the key space of the proposed method is larger than that of the algorithms in [12, 27, 59], and the NPCR value is higher than that of the algorithm in [12] and close to those of the algorithms in [27, 29, 59], which shows that the proposed algorithm has a good ability to resist differential attacks. In conclusion, the comparative analyses between our method and other algorithms verify the effectiveness and advantages of our algorithm, and it can be applied to the field of image security.

6 Conclusion

In this paper, we designed a secure and efficient cloud-decryption-assisted image compression-encryption scheme based on CS and DRPE. Specifically, the original image is first transformed into an approximate component and three detail components by DWT; the important approximate component is then shuffled, and the less important detail components are compressed by CS. Subsequently, DRPE, random pixel scrambling and diffusion are performed to obtain the final cipher image. In addition, the SHA-256 hash of the plain image is used to generate the initial values of the chaotic systems, and the resulting chaotic sequences are used in the encryption stages, which ties the algorithm to the plain image. During the decryption process, partial decryption is first performed on the cloud to obtain the intermediate cipher image, and the final decryption is then performed on the user side to obtain the plain image, which not only saves a lot of time, but also protects user privacy and prevents cloud deception.

Simulation results demonstrate that the proposed algorithm not only reduces the amount of image data transmitted and stored, but also has good security performance and achieves fast decryption, so it can be applied in medical, commercial and other fields that require real-time processing of big image data. However, the algorithm designed in this paper also has shortcomings. In particular, whether the image stored on the cloud has been tampered with is judged manually, by comparing the contour information of the detail components and the approximate component. In the future, it will be more effective to design a contour comparison algorithm to automate this process.