1 Introduction

Owing to the high speed of 2D processing, parallelism, and high degree of freedom across multiple parameters, optical information processing techniques have been widely employed in image encryption. Since double-random phase encoding (DRPE) in the Fourier domain using two random-phase masks (RPMs) was first proposed by Refregier and Javidi [1], it has been extended to the fractional Fourier domain [2], Fresnel domain [3], multiple-parameter fractional Fourier (MPFrF) domain [4], gyrator domain [5] and others [6,7,8]. However, these DRPE-based cryptosystems are linear and symmetric, which makes them vulnerable to some attacks. Phase-truncated Fourier transforms (PTFTs) [9] were suggested to remove the linearity and symmetry of DRPE. However, the initial PTFTs have no resistance to a specific attack. Various attack-free PTFTs [9,10,11,12,13] were proposed to enhance the security, but at the cost of more complex cryptosystems. A simple asymmetric cryptosystem that utilizes interference to encrypt a plain image into two phase-only masks (POMs) was first suggested by Zhang et al. [14], but this approach suffered from the silhouette problem. Various improved schemes have been suggested to eliminate the silhouette problem. For example, Zhong et al. [15] suggested encryption that utilizes three POMs in the MPFrF domain, and Lin et al. [16] suggested generating private keys using conditional decomposition. However, these schemes could encrypt only one image. To improve the efficiency and capacity, interference has also been combined with various schemes to encrypt multiple images simultaneously. Niu et al. [17] introduced wavelength multiplexing, but could encrypt only two images. Chen et al. [18] employed multiplane phase retrieval using iteration, and Qin et al. [19] applied position multiplexing and utilized POM multiplexing [20]. However, these schemes suffered from either heavy computational complexity or crosstalk. Recently, Zhang et al. [21] adopted a vector stochastic decomposition algorithm based on a cascaded interference structure. This scheme eliminated time-consuming iteration, but required a complex structure with multiple cascaded POMs, resulting in transmission and storage burdens.

To realize simple multiple-image encryption free from the silhouette problem, we propose an asymmetric cryptosystem based on optical interference that incorporates the discrete cosine transform (DCT) and the conditional decomposition. During the encryption process, the DCT is employed to multiplex multiple original images into one synthetized signal, and subsequently, interference combined with the conditional decomposition is used to encode the synthetized signal into three POMs. In contrast to Refs. [18,19,20,21], our approach can be directly employed in image encryption without crosstalk between the decrypted images and without requiring iteratively generated or cascaded POMs. As a result, our approach eliminates the silhouette problem without increasing the computational complexity, time-consumption, or storage and transmission burdens. Numerical simulation results demonstrate the efficiency and capacity of our approach.

2 Theoretical analysis of the encryption algorithm

As is well known, the upper-left corner of the DCT spectral plane contains most of the information of general images. Therefore, by retaining just the upper-left part, the original images can be compressed without noticeably reducing their visibility, to a certain extent [22, 23]. In addition, the retained spectral parts can be shifted in space and multiplexed into a new synthetized spectrum, which can be employed to realize multiple-image encryption.

The scheme of our encryption process is illustrated in Fig. 1. Suppose that \(O_{i}(x, y)\) (i = 1, 2, …, m) denotes the ith original image with N × N pixels, where m is the total number of original images. During encryption, the DCT is applied to \(O_{i}\), and its spectrum is cropped with a retaining coefficient c so that only the upper-left part, with N/c × N/c pixels, is retained. The result is

$$ {\text{CF}}_{i} (u,v) = {\text{SC}}_{c} \left\{ {{\text{DCT}} \left[ {O_{i} \left( {x,y} \right)} \right]} \right\} $$
(1)

where (x, y) and (u, v) denote the 2D coordinates in the spatial and spectral domains, respectively; \({\text{SC}}_{c}[\cdot]\) denotes the cropping process through a low-pass (LP) filter, and \(m = c^{2}\).

Fig. 1 Scheme of the encryption process

Each retained spectrum is then shifted in space and multiplexed into one synthetized signal [22] with N × N pixels, which can be written as

$$ {\text{SF}}(u,v) = \sum\limits_{i = 1}^{m} {\text{SM}} \left[ {{\text{CF}}_{i} (u,v)} \right] $$
(2)

where SM [·] denotes the process of shifting and multiplexing.
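The cropping and multiplexing steps of Eqs. 1 and 2 can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the function names (`dct2`, `multiplex`, `demultiplex`) are hypothetical, an orthonormal DCT-II is assumed, the retained blocks are assumed to tile the N × N plane in row-major order, and N is assumed to be divisible by c. The paper does not specify these details.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix D: D*x is the DCT of x, and D^T undoes it."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
             for j in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(img):
    """2D DCT of a square image: D * img * D^T."""
    d = dct_matrix(len(img))
    return matmul(matmul(d, img), transpose(d))

def idct2(spec):
    """Inverse 2D DCT of a square spectrum: D^T * spec * D."""
    d = dct_matrix(len(spec))
    return matmul(matmul(transpose(d), spec), d)

def block_origin(i, c, b):
    """Top-left corner of the i-th b x b block (row-major tiling, an assumption)."""
    return (i // c) * b, (i % c) * b

def multiplex(images, c):
    """Eqs. 1-2: crop each DCT spectrum to its upper-left N/c x N/c part and tile."""
    n = len(images[0])
    b = n // c
    sf = [[0.0] * n for _ in range(n)]
    for i, img in enumerate(images):
        spec = dct2(img)
        r0, c0 = block_origin(i, c, b)
        for u in range(b):
            for v in range(b):
                sf[r0 + u][c0 + v] = spec[u][v]
    return sf

def demultiplex(sf, c):
    """Split the synthetized spectrum and return the reduced (N/c x N/c) images."""
    b = len(sf) // c
    reduced = []
    for i in range(c * c):
        r0, c0 = block_origin(i, c, b)
        block = [row[c0:c0 + b] for row in sf[r0:r0 + b]]
        # With an orthonormal DCT, the smaller inverse transform scales values by c.
        reduced.append([[p / c for p in row] for row in idct2(block)])
    return reduced

# Four constant 8 x 8 test images with gray values 1, 2, 3 and 4 (c = 2, m = 4).
N, C = 8, 2
images = [[[float(i + 1)] * N for _ in range(N)] for i in range(C * C)]
sf = multiplex(images, C)
reduced = demultiplex(sf, C)
```

Constant images are used here because they contain only the lowest DCT frequency, so the round trip is exact; for natural images the cropping discards high frequencies and the recovered images are only approximations.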

The synthetized signal is transformed by the inverse discrete cosine transform (IDCT) to the image domain and then it undergoes pixel-scrambling (PS) by PS [·]. The result is

$$ {\text{OM}}\;(x,\;y) = \sqrt {{\text{PS}} \left\{ {{\text{IDCT}} \left[ {{\text{SF}}\;(u,\;v)} \right]} \right\}} $$
(3)

It can be deduced that the multiplexed signal OM (x, y) is real-valued; however, it contains most of the information of the original images.
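The role of PS [·] and its inverse can be illustrated with a keyed permutation of pixel positions. The sketch below is a stand-in rather than the scrambling rule used in this work: it assumes a pseudo-random permutation driven by a hypothetical integer `seed`, and the function names are illustrative only.

```python
import random

def permutation(n_pixels, seed):
    """Pseudo-random ordering of pixel indices, reproducible from the seed."""
    order = list(range(n_pixels))
    random.Random(seed).shuffle(order)
    return order

def scramble(img, seed):
    """Move the pixel at flat index order[k] to flat index k."""
    h, w = len(img), len(img[0])
    flat = [p for row in img for p in row]
    order = permutation(h * w, seed)
    mixed = [flat[src] for src in order]
    return [mixed[r * w:(r + 1) * w] for r in range(h)]

def unscramble(img, seed):
    """Inverse pixel scrambling: put every pixel back where it came from."""
    h, w = len(img), len(img[0])
    flat = [p for row in img for p in row]
    order = permutation(h * w, seed)
    restored = [0] * (h * w)
    for dst, src in enumerate(order):
        restored[src] = flat[dst]
    return [restored[r * w:(r + 1) * w] for r in range(h)]

# Round trip on a small 4 x 4 test image.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
scrambled = scramble(image, seed=7)
restored = unscramble(scrambled, seed=7)
```

Because scrambling only moves pixels, the set of gray values is unchanged, and applying the inverse with the same key restores the image exactly.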

Optical interference and conditional decomposition are then employed to encode OM (x, y). During this process, OM (x, y) is bonded with an RPM of exp [ip1 (x, y)] and regarded as the object function

$$ I_{1} \;(x,\;y) = {\text{OM}}\;(x,\;y)\exp \left[ {{\text{ip}}_{1} \;(x,\;y)} \right] $$
(4)

where p1 (x, y) is uniformly distributed in [0, 2π].

For the conditional decomposition, another RPM, exp [ip2 (x, y)], directly serves as the cyphertext C (u, v), and a new object function can be expressed as

$$ {\text{FD}}\;(u,\;v) = F_{{(M_{L} ,M_{R} )}}^{{( - \alpha_{L} , - \alpha_{R} )}} \;({\mathbf{n}}_{L} ^{\prime},{\mathbf{n}}_{R} ^{\prime})\;\left[ {I_{1} \;(x,\;y)} \right] - C\;(u,\;v) $$
(5)

where \(F_{{(M_{L} ,M_{R} )}}^{{(\alpha_{L} ,\alpha_{R} )}} \left[ {} \right]\) represents the operation of the discrete multiple-parameter fractional Fourier transform (DMPFrFT) [8, 15] with parameters (ML, MR; αL, αR; mL, nL; mR, nR); (αL, αR) are the fractional orders, which may take any value other than 0 or ±2; (ML, MR) is the periodicity; (nL, nR) is the vector parameter; and n′ is defined as

$$ {\mathbf{n}}^{\prime} = \left( {km_{k} + Mm_{k} n_{k} + n_{k} } \right)\quad k = 0,1,2, \ldots ,\left( {M - 1} \right) $$
(6)

where \({\mathbf{m}} = (m_{0}, m_{1}, \ldots, m_{M-1}) \in {\mathbb{Z}}^{M}\); \({\mathbf{n}} = (n_{0}, n_{1}, \ldots, n_{M-1}) \in {\mathbb{Z}}^{M}\); and M is an arbitrary integer greater than 2.

Following the principle of interference, two plaintext-dependent private keys can be obtained as

$$ M_{1} \;(u,\;v) = \arg \left[ {{\text{FD}}\;(u,\;v)} \right] - \arccos \left\{ {\frac{{{\text{abs}} \left[ {{\text{FD}}\;(u,\;v)} \right]}}{2}} \right\} $$
(7)
$$ M_{2} \;(u,\;v) = \arg \left\{ {{\text{FD}}\;(u,\;v) - \exp \left[ {iM_{1} \;(u,\;v)} \right]} \right\} $$
(8)

where M1 and M2 are POMs generated analytically in [0, 2π]; arg[·] and abs[·] return the phase angle and modulus of the complex signal, respectively.
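To see why Eqs. 7 and 8 split FD into two phase-only terms, write FD = A exp(iφ). Equation 7 gives M1 = φ − arccos(A/2), and exp(iM1) + exp(iM2) equals FD when M2 = φ + arccos(A/2), which is what Eq. 8 returns. The per-pixel sketch below checks this numerically. It rests on assumptions that are not taken from the paper: the amplitude of FD is assumed not to exceed 2 (otherwise the arccos is undefined), the DMPFrFT is replaced by the identity so that `target` merely stands in for the transformed object function in Eq. 5, and all names are hypothetical.

```python
import cmath
import math
import random

TWO_PI = 2.0 * math.pi

def decompose(fd):
    """Eqs. 7-8 for one pixel: return phases (M1, M2) with exp(iM1) + exp(iM2) = fd."""
    amplitude = abs(fd)
    if amplitude > 2.0 + 1e-12:
        raise ValueError("two unit-amplitude phasors cannot sum to |FD| > 2")
    half_angle = math.acos(min(1.0, amplitude / 2.0))
    m1 = cmath.phase(fd) - half_angle                 # Eq. 7
    m2 = cmath.phase(fd - cmath.exp(1j * m1))         # Eq. 8
    return m1 % TWO_PI, m2 % TWO_PI

rng = random.Random(0)
pixels = []
for _ in range(16):
    # Stand-in for one sample of the transformed object function (|target| <= 1).
    target = rng.uniform(0.0, 1.0) * cmath.exp(1j * rng.uniform(0.0, TWO_PI))
    # Plaintext-independent cyphertext sample: a random unit phasor.
    cipher = cmath.exp(1j * rng.uniform(0.0, TWO_PI))
    m1, m2 = decompose(target - cipher)               # FD = target - C, as in Eq. 5
    recovered = cipher + cmath.exp(1j * m1) + cmath.exp(1j * m2)
    pixels.append((target, recovered, m1, m2))
```

In this toy setting the three unit phasors always sum back to `target`, mirroring the superposition of the three POMs used for decryption, while each phasor on its own has unit modulus.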

Our approach yields three POMs consisting of one plaintext-independent cyphertext and two plaintext-dependent private keys. No information from the original images is encoded into the cyphertext. Because the existing approaches for extracting keys or plaintext through cyphertext look for the mathematical relationship between the cyphertext and the plaintext or keys [16, 24], we believe that our approach can resist the current chosen-cyphertext, known-plaintext, and cyphertext-only attacks.

The scheme of our decryption process is illustrated in Fig. 2, which contains the steps for decrypting and demultiplexing. As shown in Fig. 2a, the decrypting step is straightforward and can be carried out through the superposition of diffraction fields from the three POMs. In this step, the first spatial light modulator (SLM1) is used to produce the summation of the three POMs. A second spatial light modulator (SLM2) and a lens are used to perform the DMPFrFT, in which the optical system is a typical fractional Fourier transformer (FRFT) of the order of 4/M [8, 15]. A parallel laser beam is modulated by SLM1 and then transformed by SLM2 and a lens. After being acquired by a charge-coupled device (CCD) camera, the result can be expressed as

$$ \left| {{\text{OM}}\;(x,\;y)} \right|^{2} = \left| \begin{gathered} F_{{(M_{L} ,M_{R} )}}^{{(\alpha_{L} ,\alpha_{R} )}} ({\mathbf{n}}_{L} ^{\prime},\;{\mathbf{n}}_{R} ^{\prime})\left[ {C\;(u,\;v)} \right]\;{ + }\;F_{{(M_{L} ,M_{R} )}}^{{(\alpha_{L} ,\alpha_{R} )}} ({\mathbf{n}}_{L} ^{\prime},\;{\mathbf{n}}_{R} ^{\prime})\left\{ {\exp \left[ {iM_{1} \;\left( {u,\;v} \right)} \right]} \right\} \hfill \\ { + }\;F_{{(M_{L} ,M_{R} )}}^{{(\alpha_{L} ,\alpha_{R} )}} ({\mathbf{n}}_{L} ^{\prime},\;{\mathbf{n}}_{R} ^{\prime})\left\{ {\exp \left[ {iM_{2} \left( {u,\;v} \right)} \right]} \right\} \hfill \\ \end{gathered} \right|^{2} $$
(9)

Fig. 2 Scheme of the decryption process. a Decrypting, and b demultiplexing

As shown in Fig. 2b, the demultiplexing step can be digitally executed on a computer. During this step, the synthetized spectrum SF (u, v) can be achieved in the spectral domain by applying the inverse pixel scrambling (IPS) and the DCT in sequence to OM (x, y). Each reduced image can be retrieved by taking the IDCT after correctly splitting and choosing its corresponding spectrum, and then enlarging it to yield the final decrypted image.

To quantify the performance of our approach, following previous studies [16, 18-20, 25], the correlation coefficient (CC) is employed to evaluate the similarity between the original image O and the decrypted image O′. It is defined as

$$ CC\; = \;\frac{{\sum {\sum {\left[ {O - E\left( O \right)} \right]\left[ {O^{\prime} - E\left( {O^{\prime}} \right)} \right]} } }}{{\sqrt {\left\{ {\sum {\sum {\left[ {O - E\left( O \right)} \right]^{2} } } } \right\}\left\{ {\sum {\sum {\left[ {O^{\prime} - E\left( {O^{\prime}} \right)} \right]^{2} } } } \right\}} }}, $$
(10)

where E is used to obtain the mean value of the input.
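Equation 10 can be evaluated directly from its definition. The following is a minimal sketch in plain Python; the function name and the list-of-lists image representation are illustrative choices, not part of the original method.

```python
import math

def correlation_coefficient(original, decrypted):
    """Eq. 10: correlation coefficient between two equally sized gray images."""
    o = [p for row in original for p in row]
    d = [p for row in decrypted for p in row]
    if len(o) != len(d):
        raise ValueError("images must have the same number of pixels")
    mean_o = sum(o) / len(o)
    mean_d = sum(d) / len(d)
    covariance = sum((a - mean_o) * (b - mean_d) for a, b in zip(o, d))
    energy_o = sum((a - mean_o) ** 2 for a in o)
    energy_d = sum((b - mean_d) ** 2 for b in d)
    # A constant image has zero energy and makes the denominator vanish.
    return covariance / math.sqrt(energy_o * energy_d)

image = [[0.0, 1.0], [2.0, 3.0]]
negative = [[3.0 - p for p in row] for row in image]
brighter = [[2.0 * p + 5.0 for p in row] for row in image]

cc_same = correlation_coefficient(image, image)
cc_negative = correlation_coefficient(image, negative)
cc_brighter = correlation_coefficient(image, brighter)
```

Because the mean is subtracted and the result is normalized, the CC is insensitive to a uniform change in brightness or contrast; a value near 1 indicates a faithful reconstruction, whereas a value near 0 indicates no linear relationship.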

3 Results and analysis

To demonstrate the validity of the proposed asymmetric cryptosystem, various numerical experiments were conducted. First, we chose four original images with 256 × 256 pixels, as shown in Fig. 3a–d. Their DCT spectra, illustrated in Fig. 3e–h, were cropped so that only the upper-left parts marked by the white squares were retained. The retained spectra were then multiplexed into a synthetized spectrum with the same size as the original image, as shown in Fig. 3i. The synthetized spectrum was transformed by the IDCT back to the spatial domain to yield a synthetized image, as shown in Fig. 3j. The PS operation was then applied to the synthetized image, breaking it up into 65,536 subsections of 2 × 2 pixels, in which the gray value of the pixel at point (x, y) was interchanged with that at point (x′, y′) [26]. The result of the PS operation is shown in Fig. 3k. The scrambled synthetized image was then bonded with an RPM using Eq. 4.

Fig. 3 a–d Four original images; e–h DCT spectra corresponding to (a–d); i synthetized spectrum; j synthetized image by IDCT on (i); and k PS-synthetized image

For the encryption process, the parameters of the DMPFrFT were set as (αL, αR; ML, MR) = (0.34, 0.73; 15, 20). The vector parameters (mL, nL) and (mR, nR) were 1 × 15 and 1 × 20 random vectors, respectively, that contained independent integer values. Figure 4a shows the RPM chosen to act as the cyphertext C (u, v), and Fig. 4b, c shows the corresponding generated private keys M1 and M2, respectively. For display, the phase angle of each POM was taken. Clearly, no information from the original images can be identified.

Fig. 4 a Cyphertext; private keys b M1 and c M2

After completing the decryption process with the correct keys, the images were reproduced as shown in Fig. 5a–d and can be recognized easily. However, owing to the cropping operation on the DCT spectrum, the four decrypted images are subject to lossy compression. The corresponding CC values were calculated as 0.9826, 0.9847, 0.9920 and 0.9717, respectively.

Fig. 5 a–d Decrypted images using the correct keys

We further illustrate the importance of the encryption keys in our proposed method. For brevity, we show only the first decrypted image. Figure 6 shows the influence of a deviation in the fractional order of the DMPFrFT on the decrypted image, and Figs. 7 and 8 show the decrypted image extracted using an incorrect periodicity (ML, MR) and incorrect vector parameters (mL, nL; mR, nR), respectively. These results indicate that any deviation in the DMPFrFT parameters can result in poor image identification.

Fig. 6 Influence of the deviation in the fractional order in the DMPFrFT on the decrypted image

Fig. 7 Decrypted image with a ML = 14 and b MR = 21

Fig. 8 Decrypted image with a mL + 1, b nL + 1, c mR + 1 and d nR + 1

For interference-based cryptosystems, the silhouette problem is a key issue. As other researchers have stated [15, 25,26,27,28], some silhouette information can be recognized using only one POM owing to the equipollence of the three POMs. However, the conditional decomposition algorithm overcomes this drawback. We evaluated this by using only one or two of the three POMs in Eq. 9 to reconstruct the images; the results are shown in Fig. 9. No visible information associated with the original images can be seen in any of the decrypted images. This is because the cyphertext was generated randomly by the computer, and although the other two POMs, M1 and M2, were obtained analytically [15], the relation between the original images and M1 (and/or M2) was disturbed by the conditional decomposition.

Fig. 9 Decrypted images using a only cyphertext; b cyphertext together with M1; c cyphertext together with M2; d only M1; e only M2; and f M1 and M2

Our approach can be utilized to encode a greater number of images by cropping smaller parts of the DCT spectrum. Figure 10 shows the decrypted images when 4, 9, and 16 images are encoded. As the number of original images increases, the quality of the decrypted images decreases, but the images can still be visually recognized.
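As a rough guide to this trade-off, and assuming that the retained block is exactly N/c × N/c pixels with \(m = c^{2}\) as in Section 2, the fraction of each image's DCT spectrum that survives cropping is

$$ \frac{{\left( {N/c} \right)^{2} }}{{N^{2} }} = \frac{1}{{c^{2} }} = \frac{1}{m} $$

so that encoding 4, 9 and 16 images retains 1/4, 1/9 and 1/16 of each spectrum, respectively.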

Fig. 10 Decrypted images with encoded numbers of a 4, b 9 and c 16

Finally, to further verify the effectiveness of our proposed method, two different binary images and two random patterns were also taken as the original images. As shown in Fig. 11, the decrypted images were identical to the corresponding original images without any noise or distortion.

Fig. 11 Original a, b binary images and c, d random patterns; e–h decrypted images corresponding to (a–d)

4 Conclusion

In summary, we presented an asymmetric cryptosystem based on optical interference using the DCT and conditional decomposition. In our approach, one plaintext-independent cyphertext is generated through conditional decomposition, and two plaintext-dependent POMs are yielded by interference to act as private keys. Therefore, the security strength is improved owing to the inherent non-linearity and asymmetry. Our numerical simulations demonstrate the validity and feasibility of our approach.