1 Introduction

Digital steganography is a covert communication technique that embeds secret messages in multimedia objects such as images, audio, video, and text [1], so that the messages can be transmitted through innocuous-looking stego objects. Because JPEG images are widely used on the Internet, JPEG image steganography has received extensive attention. In recent years, many JPEG steganography algorithms have been proposed, such as Uniform Embedding Distortion (UED) steganography [2] and JPEG UNIversal WAvelet Relative Distortion (J-UNIWARD) steganography [3]. They usually constrain the embedding changes to complex texture regions that are difficult to model, and thereby achieve good anti-detection ability. In other words, the primary goal of these content-adaptive steganography algorithms is strong resistance to steganalysis. However, they are not robust to lossy image processing such as image compression and image resizing. For example, instant messaging tools, social platforms, and multimedia sharing websites have become increasingly popular in recent years, and they transmit and exchange massive numbers of digital images every day. With these images as covers, more secure covert communication can be achieved. However, to save transmission and storage costs, the images are often compressed during transmission and sharing, so the extraction error rate of the secret messages is very high when the messages are embedded by non-robust steganography algorithms such as J-UNIWARD. Therefore, robust JPEG steganography against lossy image processing is becoming a research hotspot in the field of information hiding.

Several robust JPEG steganographic schemes have been proposed. Zhang et al. [4] constructed a robust embedding domain based on the relative relationship of inter-block DCT coefficients and proposed a robust and adaptive JPEG steganography algorithm against JPEG compression. Qian et al. [5] proposed a robust steganography algorithm using texture synthesis; however, the extraction error rate of the secret messages is relatively high. Zhang et al. [6] also proposed a steganography algorithm based on dither modulation that resists both JPEG compression and detection, under the assumption that the quantization table of the JPEG compression is known. Zhao et al. [7] proposed a robust and adaptive JPEG image steganography algorithm based on transmission channel matching; however, repeatedly uploading images for re-compression is very suspicious behavior. Tao et al. [8] proposed a robust JPEG steganography that generates an “intermediate image” whose JPEG-compressed version, at a specific quality factor, is exactly the stego image; however, the quality factor of the JPEG compression must be known in advance. Yu et al. [9] proposed a robust image steganography algorithm based on generalized dither modulation and embedding domain expansion, which achieves better robustness and anti-detection ability; however, the quantization table of the JPEG compression is again assumed to be known. Recently, Zhang et al. [10] proposed a robust steganography algorithm with multiple robustness enhancements; however, its anti-detection ability is quite weak.

In this paper, a robust JPEG steganography algorithm is proposed based on inter-block singular value correlation in the DCT domain. The inter-block DCT coefficients of a JPEG image are known to be strongly correlated, and these correlations remain relatively stable even when the image undergoes lossy processing. Furthermore, considering the stability of singular values [11], we construct the robust embedding domain by exploiting the correlation between the maximum singular values of the two matrices generated from the middle-frequency DCT coefficients of two adjacent 8 × 8 DCT blocks. The binary cover elements are obtained from the correlation of the maximum singular values of the generated matrices. In addition, based on the embedding distortion function of J-UNIWARD, the embedding distortion of the proposed algorithm is defined according to the changes of the DCT coefficients caused by modifying the maximum singular values of the generated matrices. To reduce message extraction errors, the secret messages are encoded by a Reed-Solomon (RS) error correcting code, and the encoded messages are embedded using STCs [12], which are widely used for minimal-distortion steganography. Finally, the stego elements are embedded by modifying the maximum singular values of the corresponding matrices, and the stego image is generated from the modified DCT coefficients.

The rest of the paper is organized as follows: Sect. 2 introduces the construction of the robust embedding domain; Sect. 3 proposes the robust JPEG steganography algorithm and describes its implementation in detail; Sect. 4 verifies the effectiveness of the proposed algorithm by comparing it with state-of-the-art robust image steganography algorithms; Sect. 5 concludes the paper.

2 Robust Embedding Domain Construction Using SVD

2.1 Singular Value Decomposition

SVD is an orthogonal transform used for matrix diagonalization. Let \(\mathbf{A}\in {\mathbf{R}}^{m\times n}\) be an \(m\times n\) matrix. Then the matrix \(\mathbf{A}\) can be represented by its SVD in the following form,

$$\mathbf{A}=\mathbf{U}\mathbf{S}{\mathbf{V}}^{T},$$
(1)

where \(\mathbf{U}\) and \(\mathbf{V}\) are orthogonal \(m\times m\) and \(n\times n\) matrices, respectively, and \(\mathbf{S}\) is an \(m\times n\) diagonal matrix with nonnegative elements. The diagonal terms \({\lambda }_{1},{\lambda }_{2},\cdots ,{\lambda }_{r}\) of \(\mathbf{S}\) are the nonzero singular values of \(\mathbf{A}\) in descending order, and \(r\) is the rank of \(\mathbf{A}\).

SVD has many attractive mathematical properties. In particular, the singular values \({\lambda }_{1},{\lambda }_{2},\cdots ,{\lambda }_{r}\) are unique and stable: when a small perturbation is added to a matrix, the singular values change very little compared with the changes of the matrix elements.
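The stability property can be checked numerically. The following sketch is an illustration only and not part of the proposed algorithm: it perturbs a random matrix and compares the change of the largest singular value with the spectral norm of the perturbation, which bounds that change by Weyl's inequality. The matrix size, the random seed, and the noise level are arbitrary assumptions.

```python
import numpy as np


def max_singular_value(a):
    """Return the largest singular value of matrix a."""
    # np.linalg.svd returns the singular values in descending order.
    return np.linalg.svd(a, compute_uv=False)[0]


rng = np.random.default_rng(0)        # arbitrary seed (assumption)
A = rng.normal(size=(4, 4)) * 10.0    # arbitrary test matrix (assumption)
E = rng.normal(size=(4, 4)) * 0.1     # small perturbation (assumption)

shift = abs(max_singular_value(A + E) - max_singular_value(A))
bound = np.linalg.norm(E, 2)          # spectral norm of the perturbation

# Weyl's inequality: |lambda_1(A + E) - lambda_1(A)| <= ||E||_2
print(f"shift of lambda_1 = {shift:.4f}, bound ||E||_2 = {bound:.4f}")
```

The bound holds for any perturbation, which is why a decision based on the largest singular value tolerates moderate coefficient changes.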

2.2 DCT Coefficient Properties of JPEG Image

JPEG is one of the most popular image formats on the Internet because it achieves a good tradeoff between storage size and image quality. JPEG is a lossy compression technique based on the DCT (Discrete Cosine Transform). In JPEG compression, the two-dimensional (2D) DCT is applied to each 8 × 8 block of the image, the DCT coefficients are then quantized according to the quality factor or the JPEG quantization table, and finally the quantized DCT blocks are encoded using Huffman coding.
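The transform and quantization steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a JPEG codec: the function names are hypothetical, the flat quantization table in the usage note is a placeholder rather than a standard JPEG table, and Huffman coding is omitted.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    x = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def block_dct_quantize(block, q_table):
    """Level-shift an 8 x 8 pixel block, apply the 2D DCT, and quantize."""
    c = dct_matrix(8)
    coeffs = c @ (np.asarray(block, dtype=float) - 128.0) @ c.T
    return np.rint(coeffs / q_table).astype(int)   # rounding is the lossy step


def dequantize_idct(q_coeffs, q_table):
    """Inverse quantization followed by the inverse 2D DCT."""
    c = dct_matrix(8)
    return c.T @ (q_coeffs * q_table) @ c + 128.0
```

For example, with the placeholder table `np.full((8, 8), 16.0)`, a constant block of value 136 has a single nonzero coefficient, the DC term, and is reconstructed exactly; blocks with texture are generally not, because of the rounding.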

Fig. 1. Zig-zag ordering and frequency bands for DCT coefficients.

A DCT block of a JPEG image can be separated into low, middle, and high frequency bands, as shown in Fig. 1. The DCT coefficients in the low frequency band are the most important for image quality, and even small changes to them degrade the image quality significantly. On the other hand, although changes to the DCT coefficients in the high frequency band do not significantly degrade image quality, these coefficients and their statistics are not robust to lossy image processing. Therefore, to obtain a high-quality and robust stego image, the middle frequency band is the most suitable for message embedding.
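The zig-zag ordering that underlies this band partition can be generated as sketched below. The function names are hypothetical, and the index range used for the middle band in the usage note is an assumption made for illustration, since the exact band boundaries are given only in Fig. 1.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in JPEG zig-zag order."""
    order = []
    for s in range(2 * n - 1):                    # s = row + col of one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd anti-diagonals run from top-right to bottom-left,
        # even ones from bottom-left to top-right.
        order.extend(diag if s % 2 else diag[::-1])
    return order


def band_positions(start, stop, n=8):
    """Positions whose zig-zag index lies in [start, stop)."""
    return zigzag_order(n)[start:stop]


# Hypothetical middle band: zig-zag indices 6 to 27 (an assumption).
middle_band = band_positions(6, 28)
```

Index 0 is the DC coefficient, and increasing zig-zag indices correspond to increasing spatial frequency.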

2.3 Robust Embedding Domain Construction

For JPEG images, it is generally known that the inter-block DCT coefficients are strongly correlated. That is to say, the DCT coefficients at the same positions in two adjacent DCT blocks have close values. Moreover, the correlations between them are relatively stable against lossy image processing. Unlike [4], we do not use the coefficient correlation directly to extract the cover elements. Considering the stability of singular values, we instead extract the cover elements from the correlation between the maximum singular values of the matrices generated from the middle-frequency DCT coefficients of two adjacent 8 × 8 DCT blocks. In this way, one binary cover element can be extracted from each pair of adjacent DCT blocks.

Fig. 2 shows the adjacency relations of the DCT blocks and the construction of the matrices for SVD. To obtain the adjacency relations, all the DCT blocks are scanned in a snake-like order, so that each DCT block has an adjacent DCT block according to the scan order. To extract a binary cover element, two matrices are constructed from the middle-frequency DCT coefficients of two adjacent DCT blocks such as \({B}_{(i,j)}\) and \({B}_{(i,j+1)}\). In particular, to ensure that the reference DCT coefficients are not altered during the embedding process, the matrices of consecutive block pairs are constructed from DCT coefficients at different positions. As shown in Fig. 2, if the traversal order of \({B}_{(i,j)}\) is an odd number, the matrices of the two adjacent blocks for SVD are \({\mathbf{M}}_{i,j}^{odd}\) and \({\mathbf{M}}_{i,j+1}^{odd}\). Conversely, if the traversal order of \({B}_{(i,j)}\) is an even number, the matrices of the two adjacent blocks for SVD are \({\mathbf{M}}_{i,j}^{even}\) and \({\mathbf{M}}_{i,j+1}^{even}\).

Fig. 2. DCT blocks scan order and matrix construction for SVD.
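The snake-like scan can be sketched as follows. Because the figure is not reproduced here, the direction convention is an assumption: rows are visited top to bottom, even-numbered rows left to right and odd-numbered rows right to left. The function name is hypothetical.

```python
def snake_scan(rows, cols):
    """Return block indices (i, j) in a snake-like (boustrophedon) order."""
    order = []
    for i in range(rows):
        # Reverse the column direction on every other row so that
        # consecutive blocks in the scan are always spatial neighbours.
        js = range(cols) if i % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((i, j) for j in js)
    return order


# A 512 x 512 image has 64 x 64 blocks of size 8 x 8.
scan = snake_scan(64, 64)
```

With this convention the k-th and (k+1)-th blocks of the scan always share an edge, including at the row turns, which is what makes the pairing of adjacent blocks well defined.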

After all the matrices for SVD are constructed, SVD is performed on each matrix and the maximum singular value is obtained. Suppose the matrices of two adjacent DCT blocks \({B}_{(i,j)}\) and \({B}_{(i,j+1)}\) are \({\mathbf{M}}_{i,j}^{odd}\) and \({\mathbf{M}}_{i,j+1}^{odd}\) respectively, and the corresponding maximum singular values are \({\lambda }_{i,j}\) and \({\lambda }_{i,j+1}\). Then a binary cover element can be extracted according to Eq. (2),

$${c}_{i,j}=\left\{\begin{array}{ll}1, & \text{if } {\lambda }_{i,j}\ge {\lambda }_{i,j+1}\\ 0, & \text{if } {\lambda }_{i,j}< {\lambda }_{i,j+1}\end{array}\right..$$
(2)

Therefore, one binary cover element can be extracted for each DCT block, and all the cover elements together form the cover object. For example, if the size of the cover image is 512 × 512, the number of 8 × 8 DCT blocks and the number of cover elements are both 4096. Finally, the extracted cover object is used for robust message embedding.

3 Proposed Robust JPEG Steganography Algorithm

3.1 Framework of Proposed JPEG Steganography Algorithm

Fig. 3 shows the overall framework of the proposed robust JPEG steganography algorithm, which includes the message embedding procedure and the message extraction procedure.

Fig. 3. Framework of proposed robust and adaptive JPEG steganography algorithm.

The embedding procedure in Fig. 3 proceeds as follows. First, inverse quantization is performed by multiplying the DCT coefficients by the corresponding quantization steps. Then all the 8 × 8 DCT blocks are scanned in the snake-like order shown in Fig. 2, and the adjacency relations of the DCT blocks are determined according to the scan order. For each DCT block, a DCT coefficient matrix is constructed from the middle frequency band and SVD is performed on it; the cover elements are then extracted according to the inter-block correlation of the maximum singular values. To obtain the embedding cost of each cover element, the distortion of a ±1 modification at each position of the 8 × 8 DCT block is first computed, and the embedding cost is then measured according to the coefficient changes caused by modifying the maximum singular value. In addition, to improve robustness, the secret messages are encoded by RS codes, which can correct a limited number of errors. Based on the cover elements and the corresponding embedding costs, the encoded secret messages are embedded using STCs, which generates the stego elements. Finally, the maximum singular values of the constructed DCT coefficient matrices are modified according to the stego elements, all the 8 × 8 DCT blocks are reconstructed, and the stego image is generated.

For message extraction, as shown in Fig. 3, inverse quantization is performed on the stego image to obtain the corresponding 8 × 8 DCT blocks. For each DCT block, the DCT coefficient matrix is constructed from the middle frequency band and SVD is performed on it. Then the stego elements are extracted according to the inter-block correlation of the maximum singular values. Finally, the encoded messages are extracted by STCs and the secret messages are recovered using the RS decoder.

3.2 Extract Cover Elements

Let \(\mathbf{X}\) denote the cover JPEG image of size \(M\times N\) and let \({B}_{(i,j)}\) denote the (i, j)-th 8 × 8 DCT block. According to the steganography framework shown in Fig. 3, the DCT blocks are scanned in snake-like order, and a binary cover element can be extracted from two adjacent DCT blocks such as \({B}_{(i,j)}\) and \({B}_{(i,j+1)}\).

For ease of description, we denote by \({B}_{k}\) the k-th DCT block in the block scanning order, so that its adjacent DCT block is \({B}_{k+1}\). The number of DCT blocks is \(\left\lfloor {M/8} \right\rfloor \times \left\lfloor {N/8} \right\rfloor \), where \(\left\lfloor \cdot \right\rfloor \) denotes the floor function. The matrices \({\mathbf{M}}_{k}\) and \({\mathbf{M}}_{k+1}\) are constructed from the middle-frequency DCT coefficients of the DCT blocks \({B}_{k}\) and \({B}_{k+1}\), respectively. Next, SVD is performed on \({\mathbf{M}}_{k}\) and \({\mathbf{M}}_{k+1}\), which gives the corresponding maximum singular values \({\lambda }_{k}\) and \({\lambda }_{k+1}\). Finally, the cover element \({c}_{k}\) is extracted according to Eq. (3),

$${c}_{k}=\left\{\begin{array}{ll}1, & \text{if } {\lambda }_{k}\ge {\lambda }_{k+1}\\ 0, & \text{if } {\lambda }_{k}< {\lambda }_{k+1}\end{array}\right..$$
(3)

According to Eq. (3), the cover element \({c}_{k}\) equals 0 or 1. Since each block generates one cover element, \(L\) cover elements \({c}_{1},{c}_{2},\cdots ,{c}_{L}\) are generated in total, where \(L\) equals \(\left\lfloor {M/8} \right\rfloor \times \left\lfloor {N/8} \right\rfloor\). These cover elements form the cover object, which is a binary sequence.
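The extraction rule of Eq. (3) can be sketched as follows. This is an illustration under several assumptions: the coefficient positions `ODD_POS` and `EVEN_POS` are hypothetical stand-ins for the positions defined in Fig. 2, the matrices are taken to be 2 × 2, and the blocks are assumed to be dequantized and already arranged in scan order. The text does not state how the last block is paired, so the sketch stops at the last pair and returns one element fewer than the number of blocks.

```python
import numpy as np

# Hypothetical middle-band positions (assumptions, not taken from the paper).
ODD_POS = [(0, 3), (1, 2), (2, 1), (3, 0)]
EVEN_POS = [(0, 4), (1, 3), (2, 2), (3, 1)]


def band_matrix(block, positions):
    """Arrange the selected DCT coefficients of one block into a square matrix."""
    vals = [block[r, c] for r, c in positions]
    n = int(round(len(vals) ** 0.5))
    return np.array(vals, dtype=float).reshape(n, n)


def max_singular_value(m):
    """Largest singular value of matrix m."""
    return np.linalg.svd(m, compute_uv=False)[0]


def cover_elements(blocks):
    """Extract binary cover elements from 8 x 8 DCT blocks given in scan order."""
    cover = []
    for k in range(len(blocks) - 1):
        # k + 1 is the 1-based traversal order of the first block of the pair.
        pos = ODD_POS if (k + 1) % 2 == 1 else EVEN_POS
        lam_k = max_singular_value(band_matrix(blocks[k], pos))
        lam_k1 = max_singular_value(band_matrix(blocks[k + 1], pos))
        cover.append(1 if lam_k >= lam_k1 else 0)   # Eq. (3)
    return cover
```

Alternating between two disjoint position sets means that a block shared by two consecutive pairs contributes different coefficients to each pair, so modifying one pair does not disturb the next.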

3.3 Define Embedding Costs of Cover Elements

As shown in Fig. 3, the cover object is constructed from the maximum singular value correlation in the DCT domain, and the secret messages are then embedded by STCs. Therefore, an embedding cost needs to be defined for each cover element. J-UNIWARD is one of the most powerful JPEG steganography algorithms and has good anti-detection ability against state-of-the-art steganalysis techniques. Therefore, we use the embedding distortion function of J-UNIWARD to measure the cost of a ±1 modification of a DCT coefficient.

The embedding distortion function of J-UNIWARD is defined in Eq. (4),

$$D(X,Y)={\sum }_{k=1}^{3}{\sum }_{i,j}\frac{\left|{W}_{ij}^{(k)}(X)-{W}_{ij}^{(k)}(Y)\right|}{\varepsilon +\left|{W}_{ij}^{(k)}(X)\right|},$$
(4)

where \(X\) and \(Y\) denote the cover image and the stego image respectively, \({W}_{ij}^{(k)}(X)\) and \({W}_{ij}^{(k)}(Y)\) denote the (i, j)-th wavelet coefficients obtained using the k-th wavelet filter, and \(\varepsilon\) is a stabilizing constant that avoids division by zero.

Furthermore, \(D(X,Y)\) can be written in additive form as follows,

$$D(X,Y)={\sum }_{i,j}{\rho }_{ij}(X,{Y}_{ij}),$$
(5)

where \({\rho }_{ij}(X,{Y}_{ij})\) denotes the embedding distortion with only (i,j)-th element changed. In other words, \({\rho }_{ij}(X,{Y}_{ij})\) is the cost of ±1 modification for (i,j)-th DCT coefficient.
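Eq. (4) can be evaluated directly once the wavelet coefficients are available. The sketch below assumes that the three directional subbands of the cover and stego images have already been computed and are passed in as arrays; the function name is hypothetical and the default value of the stabilizing constant is an assumption, not a value stated in this paper.

```python
import numpy as np


def uniward_distortion(w_cover, w_stego, eps=2.0 ** -6):
    """Relative wavelet distortion of Eq. (4).

    w_cover, w_stego: sequences of equally shaped arrays, one per wavelet
    filter (k = 1, 2, 3). eps is the stabilizing constant (assumed default).
    """
    total = 0.0
    for wx, wy in zip(w_cover, w_stego):
        wx = np.asarray(wx, dtype=float)
        wy = np.asarray(wy, dtype=float)
        # Changes in smooth regions (small |wx|) are penalized more heavily.
        total += np.sum(np.abs(wx - wy) / (eps + np.abs(wx)))
    return float(total)
```

Under this reading, the cost \({\rho }_{ij}\) of Eq. (5) would be obtained by calling the function with a stego image that differs from the cover in only the (i,j)-th DCT coefficient.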

In the proposed robust and adaptive JPEG steganography algorithm, a cover element is changed by modifying the maximum singular values of the matrices \({\mathbf{M}}_{k}\) and \({\mathbf{M}}_{k+1}\) constructed from the middle-frequency DCT coefficients of two adjacent DCT blocks. Therefore, the modification of one cover element changes multiple DCT coefficients, and the embedding cost of the cover element is measured by summing the embedding costs of all the affected DCT coefficients. The specific form is shown in Eq. (6),

$$\rho \left({c}_{k}\right)={\sum }_{{Y}_{ij}\in {\mathbf{M}}_{k}}{d}_{ij}\left(X,{Y}_{ij}\right)\times {\rho }_{ij}\left(X,{Y}_{ij}\right)+{\sum }_{{Y}_{ij}\in {\mathbf{M}}_{k+1}}{d}_{ij}\left(X,{Y}_{ij}\right)\times {\rho }_{ij}\left(X,{Y}_{ij}\right),$$
(6)

where \({d}_{ij}\left(X,{Y}_{ij}\right)\) denotes the amount by which the (i,j)-th DCT coefficient is modified when the cover element \({c}_{k}\) is changed.
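Eq. (6) amounts to a weighted sum over the coefficients of the two matrices. In the sketch below the function name is hypothetical, and taking the absolute value of each coefficient change is an assumption made so that the cost stays non-negative.

```python
def cover_element_cost(changes_k, costs_k, changes_k1, costs_k1):
    """Embedding cost of one cover element following Eq. (6).

    changes_*: coefficient changes d_ij caused by flipping the cover element,
    costs_*:   per-coefficient costs rho_ij of a +-1 modification,
    given for the coefficients of M_k and M_{k+1} respectively.
    """
    cost_k = sum(abs(d) * rho for d, rho in zip(changes_k, costs_k))
    cost_k1 = sum(abs(d) * rho for d, rho in zip(changes_k1, costs_k1))
    return cost_k + cost_k1


# Example with made-up numbers: two coefficients of M_k and one of M_{k+1}.
example_cost = cover_element_cost([1, -2], [0.5, 1.0], [0], [3.0])
```

Coefficients that do not change contribute nothing, so the cost is driven by how far the singular value has to be moved and by how detectable changes are at the affected positions.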

3.4 Embed Stego Elements by Modifying Maximum Singular Values

Based on the cover object and the corresponding embedding cost, the encoded secret messages can be embedded by STCs. Then, the stego object is generated and it includes \(L\) binary stego elements. According to the steganography framework shown in Fig. 3, the stego elements need to be embedded by modifying the maximum singular value correlation.

Suppose the maximum singular values of the matrices \({\mathbf{M}}_{k}\) and \({\mathbf{M}}_{k+1}\) are \({\lambda }_{k}\) and \({\lambda }_{k+1}\) respectively, and let \({s}_{k}\) denote the k-th stego element. Then the stego element \({s}_{k}\) can be embedded by modifying \({\lambda }_{k}\) and \({\lambda }_{k+1}\) according to Eqs. (7) and (8),

$${\lambda {^{\prime}}}_{k}=\left\{\begin{array}{ll}E+\alpha , & \text{if } {s}_{k}=1\\ E-\alpha , & \text{if } {s}_{k}=0\end{array}\right.$$
(7)
$${\lambda {^{\prime}}}_{k+1}=\left\{\begin{array}{ll}E-\alpha , & \text{if } {s}_{k}=1\\ E+\alpha , & \text{if } {s}_{k}=0\end{array}\right.,$$
(8)

where \({\lambda {^{\prime}}}_{k}\) and \({\lambda {^{\prime}}}_{k+1}\) denote the modified maximum singular values, \(E=({\lambda }_{k}+{\lambda }_{k+1})/2\), and \(\alpha \) is the embedding strength factor. After \({\lambda }_{k}\) and \({\lambda }_{k+1}\) are modified, the new matrices \({\mathbf{M}{^{\prime}}}_{k}\) and \({\mathbf{M}{^{\prime}}}_{k+1}\) are reconstructed using \({\lambda {^{\prime}}}_{k}\) and \({\lambda {^{\prime}}}_{k+1}\), and the corresponding DCT coefficients are modified accordingly.

Eqs. (7) and (8) show that the parameter \(\alpha \) is the key factor for the robustness of the stego elements against lossy image processing. A large \(\alpha \) gives strong robustness, but the corresponding modifications are also large, which weakens the anti-detection ability. Therefore, \(\alpha \) should be selected to achieve a good tradeoff between robustness and anti-detection ability.
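The modification rule can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up example matrices; it omits the rounding and re-quantization of the reconstructed DCT coefficients, and it assumes that the modified value remains the largest singular value of each matrix, which a practical implementation would have to check.

```python
import numpy as np


def embed_bit(m_k, m_k1, s_k, alpha):
    """Return modified copies of m_k and m_k1 that carry the stego bit s_k."""
    u1, sv1, vt1 = np.linalg.svd(m_k, full_matrices=False)
    u2, sv2, vt2 = np.linalg.svd(m_k1, full_matrices=False)
    e = (sv1[0] + sv2[0]) / 2.0            # E = (lambda_k + lambda_{k+1}) / 2
    sign = 1.0 if s_k == 1 else -1.0
    sv1[0] = e + sign * alpha              # modified lambda_k
    sv2[0] = e - sign * alpha              # modified lambda_{k+1}
    # Reconstruct the matrices as U * diag(singular values) * V^T.
    return (u1 * sv1) @ vt1, (u2 * sv2) @ vt2


def extract_bit(m_k, m_k1):
    """Recover the bit by comparing the maximum singular values."""
    lam_k = np.linalg.svd(m_k, compute_uv=False)[0]
    lam_k1 = np.linalg.svd(m_k1, compute_uv=False)[0]
    return 1 if lam_k >= lam_k1 else 0


# Made-up example matrices and strength factor.
m_a = np.array([[10.0, 0.0], [0.0, 1.0]])
m_b = np.array([[6.0, 0.0], [0.0, 1.0]])
stego_a, stego_b = embed_bit(m_a, m_b, 0, alpha=2.0)
```

After embedding, the two maximum singular values differ by \(2\alpha \). Since a perturbation shifts the largest singular value by at most its spectral norm, the bit is preserved whenever the perturbation of each matrix has spectral norm below \(\alpha \), which makes the tradeoff controlled by \(\alpha \) concrete.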

3.5 Extract Stego Elements

As shown in Fig. 3, for message extraction, the stego object \(S=({s}_{1},{s}_{2},\cdots ,{s}_{L})\) is extracted first, and the embedded messages are then recovered by the STC and RS decoders.

According to the embedding rule of stego elements in Eqs. (7) and (8), the stego elements can be extracted using Eq. (9),

$${s}_{k}=\left\{\begin{array}{ll}1, & \text{if } {\lambda {^{\prime}}}_{k}\ge {\lambda {^{\prime}}}_{k+1}\\ 0, & \text{if } {\lambda {^{\prime}}}_{k}< {\lambda {^{\prime}}}_{k+1}\end{array}\right.,$$
(9)

where \({\lambda {^{\prime}}}_{k}\) and \({\lambda {^{\prime}}}_{k+1}\) denote the maximum singular values of the two matrices constructed from two adjacent DCT blocks of the stego image.

To summarize, the embedding and extraction procedures of the proposed robust and adaptive JPEG steganography algorithm are described in Algorithms 1 and 2, respectively.

Algorithm 1. Message embedding procedure.
Algorithm 2. Message extraction procedure.

4 Experimental Results and Analyses

In the experiments, the robustness against JPEG compression attack and the anti-detection ability of the proposed steganography algorithm are compared with those of other robust JPEG steganography algorithms. Moreover, the proposed algorithm is applied on the WeChat platform and the corresponding experimental results are reported. For the robustness and anti-detection experiments, the 10000 grayscale images from BOSSbase1.01 [13] are used as sample images. The size of the sample images is 512 × 512, and all the sample images in PGM format are converted to JPEG images with quality factor (QF) 85. The RS code parameters are (31, 19).
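The error correction capability implied by the (31, 19) RS code follows from standard coding theory and can be computed as below. The function name is hypothetical, and the remark about the symbol size is an assumption based on the code length, since the paper does not state the underlying field.

```python
def rs_parameters(n, k):
    """Basic parameters of an (n, k) Reed-Solomon code."""
    return {
        "parity_symbols": n - k,
        # An (n, k) RS code corrects up to floor((n - k) / 2) symbol errors.
        "correctable_symbol_errors": (n - k) // 2,
        "code_rate": k / n,
    }


params = rs_parameters(31, 19)
# Assumption: a length-31 RS code is naturally defined over GF(2^5),
# in which case every symbol carries 5 bits.
bits_per_symbol = 5
```

Each codeword of 31 symbols therefore tolerates up to 6 erroneous symbols at a code rate of about 0.61.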

4.1 Robustness Against JPEG Compression Attack

Images with complex content are known to resist detection better. Therefore, images with complex texture should be selected for message embedding. Here, a one-level wavelet transform is applied to each image, and the energy of the wavelet coefficients in the three high-pass subbands is used to measure the complexity of the image.
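This complexity measure can be sketched as follows. The wavelet is not specified in the text, so the orthonormal Haar wavelet is used here as an assumption, and the function name is hypothetical.

```python
import numpy as np


def highpass_energy(img):
    """Energy of the three high-pass subbands of a one-level Haar transform."""
    x = np.asarray(img, dtype=float)
    # Crop to even dimensions so that the image splits into 2 x 2 cells.
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each cell
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    lh = (a + b - c - d) / 2.0            # differences between rows
    hl = (a - b + c - d) / 2.0            # differences between columns
    hh = (a - b - c + d) / 2.0            # diagonal differences
    return float(np.sum(lh ** 2) + np.sum(hl ** 2) + np.sum(hh ** 2))


def most_complex(images, count):
    """Indices of the `count` images with the largest high-pass energy."""
    energies = [highpass_energy(im) for im in images]
    return sorted(range(len(images)), key=lambda i: energies[i], reverse=True)[:count]
```

Under these assumptions, the 2000 most complex images used below would be obtained with `most_complex(images, 2000)`.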

To evaluate the robustness of the proposed steganography algorithm against JPEG compression attack, the 2000 most complex images from BOSSbase1.01 are used to generate the stego images. The robust steganography algorithms used for comparison are MREAS-PS and MREAS-PJ [10]. For the proposed algorithm, the number of cover elements is 4096 because the image size is 512 × 512 and the DCT block size is 8 × 8. Therefore, the length of the embedded message cannot exceed 4096 bits. The payload is set to 0.001, 0.002, 0.003, 0.004, and 0.005 bpnzAC (bits per non-zero AC DCT coefficient) respectively. Thus, for each robust steganography algorithm, we have one group of cover images and five groups of stego images.
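The conversion from a payload in bpnzAC to a message length can be sketched as follows. The function names and the array layout (one 8 × 8 block of quantized coefficients per entry of the last two axes) are assumptions, and truncating to an integer number of bits is a choice made for illustration.

```python
import numpy as np


def count_nonzero_ac(q_blocks):
    """Number of non-zero AC coefficients in an array of shape (..., 8, 8)."""
    q_blocks = np.asarray(q_blocks)
    total = np.count_nonzero(q_blocks)
    dc = np.count_nonzero(q_blocks[..., 0, 0])   # DC terms are excluded
    return int(total - dc)


def message_bits(q_blocks, payload):
    """Message length in bits for a payload given in bpnzAC."""
    return int(payload * count_nonzero_ac(q_blocks))
```

The resulting length can then be compared with the 4096 available cover elements to confirm that the chosen payload fits into the cover object.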

Table 1. Average extraction error rates (\(\times {10}^{-3}\)) of three robust steganography algorithms for the 2000 most complex images in BOSSbase1.01.

According to the average extraction error rates shown in Table 1, the proposed robust and adaptive JPEG steganography algorithm achieves competitive robustness.

As shown in Table 1, the average extraction error rates of MREAS-PJ are low when the QF of the JPEG compression attack is 85 or 95. However, its extraction error rates become very high when the QF of the JPEG compression attack is 65, which corresponds to a strong attack.
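The extraction error rate is taken here to be the fraction of message bits that differ between the embedded and the extracted message; this definition and the function names below are assumptions for illustration, since the paper does not spell the metric out.

```python
def extraction_error_rate(embedded_bits, extracted_bits):
    """Fraction of message bits that differ after extraction."""
    if len(embedded_bits) != len(extracted_bits):
        raise ValueError("bit sequences must have the same length")
    if not embedded_bits:
        raise ValueError("bit sequences must not be empty")
    errors = sum(a != b for a, b in zip(embedded_bits, extracted_bits))
    return errors / len(embedded_bits)


def average_error_rate(pairs):
    """Mean extraction error rate over (embedded, extracted) pairs, one per image."""
    rates = [extraction_error_rate(e, x) for e, x in pairs]
    return sum(rates) / len(rates)
```

Multiplying the average by 1000 gives a value in the unit used in Table 1.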

4.2 Detection Resistance for Typical Steganalysis Features

The anti-detection ability is important for robust image steganography. Here, the proposed steganography algorithm is compared with MREAS-PS and MREAS-PJ using CC-PEV [14] and DCTR [15], which are typical steganalysis features.

As in Sect. 4.1, the 2000 most complex images from BOSSbase1.01 are used to generate the stego images. The payloads range from 0.001 to 0.050 bpnzAC. The ensemble classifier [16] is trained on the steganalysis features and used as the final detector. The images are split equally into training and test sets. The detection accuracy is quantified using the minimal total error probability under equal priors \({P}_{E}={\min }_{{P}_{\text{FA}}}({P}_{\text{FA}}+{P}_{\text{MD}})/2\), where \({P}_{\text{FA}}\) denotes the false-alarm probability and \({P}_{\text{MD}}\) denotes the missed-detection probability. The reported value \({\bar{P}}_{E}\) is the average of \({P}_{E}\) over ten random splits of the image database.
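Given detector outputs on the test images, \({P}_{E}\) can be computed by sweeping the decision threshold, as sketched below. The function name is hypothetical, and the convention that a score at or above the threshold means "stego" is an assumption.

```python
def min_total_error(cover_scores, stego_scores):
    """Minimal total error probability P_E under equal priors."""
    # Candidate thresholds: every observed score, plus infinity
    # (which classifies every image as cover).
    thresholds = sorted(set(cover_scores) | set(stego_scores))
    thresholds.append(float("inf"))
    best = 1.0
    for t in thresholds:
        p_fa = sum(s >= t for s in cover_scores) / len(cover_scores)   # false alarms
        p_md = sum(s < t for s in stego_scores) / len(stego_scores)    # missed detections
        best = min(best, (p_fa + p_md) / 2.0)
    return best
```

A value of 0.5 corresponds to random guessing, so a larger \({P}_{E}\) indicates stronger resistance to detection.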

According to the detection results, the detection resistance of the proposed steganography algorithm is relatively weak compared with MREAS-PS and MREAS-PJ. A possible reason is that the embedding changes of the proposed algorithm are larger.

4.3 Application in WeChat Platform

WeChat is the most popular chat app in China and provides a good public channel for covert communication by image steganography. However, the compression algorithm of WeChat is unknown, so UEDR-P cannot realize robust steganography over the WeChat channel. Therefore, only the robustness of MREAS-PJ and the proposed steganography algorithm is evaluated for the WeChat channel.

Fig. 4. Cover images for covert communication by WeChat. (a) ‘8.jpg’ in BOSSbase1.01 and (b) ‘4226.jpg’ in BOSSbase1.01.

The two cover images are shown in Fig. 4, and both are of size 512 × 512. The left cover image is ‘8.jpg’ in BOSSbase1.01 with a file size of 157 KB, and the right cover image is ‘4226.jpg’ with a file size of 258 KB. First, the stego images are generated by MREAS-PJ and the proposed steganography algorithm with (31, 19) RS codes at a payload of 0.01 bpnzAC. Next, the stego images are posted to WeChat Moments. Then the stego images are downloaded from WeChat Moments; the file sizes of the downloaded images are 54 KB and 105 KB respectively. Finally, the embedded messages are extracted from the downloaded images and the extraction error rates are calculated. The experimental results are shown in Table 2.

Table 2. Comparisons of robustness of MREAS-PJ and the proposed steganography algorithm against WeChat compression.

Table 2 shows that both MREAS-PJ and the proposed algorithm can transmit the messages correctly over the lossy WeChat channel.

5 Conclusions

Robust image steganography techniques are important for covert communication over lossy transmission channels. In this paper, a robust JPEG steganography algorithm is proposed based on singular value decomposition in the DCT domain. Considering the inter-block DCT coefficient correlation and the stability of singular values, the robust embedding domain is constructed using the correlation of the maximum singular values generated from two adjacent DCT blocks. Then the overall framework of the proposed steganography algorithm is given, and the procedures of message embedding and extraction are described in detail. The experimental results show that the proposed JPEG steganography algorithm is effective against JPEG compression and WeChat channel compression.

Furthermore, we notice that the anti-detection ability of current robust image steganography techniques is weak when detection is performed by a classifier trained on the cover images and the corresponding stego images. This is because the embedding changes of robust JPEG steganography schemes are much larger than those of non-robust JPEG steganography schemes such as J-UNIWARD. In other words, achieving robustness currently requires large embedding changes. In the future, robust embedding domains that lead to both stronger robustness and stronger anti-detection ability should be studied.