
1 Introduction

Data hiding is a technique for embedding secret messages into a digital cover, and it has been developed over the last few decades. It is widely used in military, commercial, medical, and financial applications. With the rapid development of computer technology, people transmit and share multimedia data ever more widely. To transmit or share such data quickly over the network, most multimedia data are stored in compressed formats. JPEG is the most popular file format for digital images and compresses image data effectively. Therefore, research on data hiding for JPEG images has practical significance.

Traditional data hiding techniques usually change the original data irreversibly, for example by modifying the DC components [1] or by modifying the AC coefficients through histogram shifting [2]. However, the distortion caused by embedding is unacceptable in some application areas, such as medicine, the military, and law. Hence, lossless data hiding techniques are required. As a branch of data hiding, lossless data hiding techniques embed additional data into the cover signal and leave no signal distortion.

Recently, several lossless data hiding methods for JPEG images have been proposed. For instance, Fridrich et al. [3] proposed three lossless methods that embed data reversibly in the quantized DCT coefficients, so that the original image can be reconstructed from the marked JPEG file. Xuan et al. [4] proposed a lossless data hiding method based on histogram shifting, in which the histogram of the quantized DCT coefficients is shifted to embed high-capacity secret data. However, the filesize increases after data embedding with the lossless methods introduced above. Subsequently, lossless methods that also preserve the filesize of JPEG images appeared. Mobasseri et al. [5] proposed a novel lossless method based on code mapping, which operates on the JPEG bitstream and preserves both the image quality and the filesize. To improve the embedding capacity, Qian and Zhang [6] improved the code mapping relationships and achieved a higher embedding capacity. More recently, Hu et al. [7] improved the mapping relationships further, but not optimally.

Similar to the previous methods, in this paper we only use the unused-VLCs as replacements to hide additional data, so both the visual quality of the image and the JPEG filesize are preserved. The embedding capacity, however, is improved by the proposed optimal VLC mapping rules. The rest of this paper is organized as follows. Section 2 briefly introduces the structure of JPEG images and Hu et al.’s method [7]. Section 3 presents the proposed scheme in detail. The experimental results, with analysis and comparisons, are given in Sect. 4. Section 5 concludes the paper.

2 Related Works

Since the proposed method works on the JPEG bitstream and aims at an optimal VLC mapping based on Hu et al.’s method [7], we first introduce the structure of the JPEG bitstream in Subsect. 2.1. Because Hu et al. [7] improved the payload of Qian and Zhang’s method [6], Subsect. 2.2 describes the method proposed by Qian and Zhang [6]. Finally, Subsect. 2.3 introduces Hu et al.’s method.

2.1 The Structure of JPEG Bitstream

In general, a JPEG bitstream consists of a sequence of segments, each beginning with a marker. Each marker begins with a 0xFF byte followed by a byte indicating what kind of marker it is. For the decoding phase, there are two key segments: the Define Huffman Table (DHT) segment and the Start of Scan (SOS) segment. Figure 1 shows the detailed structure of the two segments.
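The marker layout described above can be walked with a few lines of code. The following is a minimal sketch rather than a full JPEG parser: the function name `iter_segments` and the toy byte string are illustrative assumptions, and the code relies only on the standard layout in which each header segment after the SOI marker carries a two-byte big-endian length that counts the length field itself.

```python
import struct

DHT = 0xC4  # Define Huffman Table
SOS = 0xDA  # Start of Scan


def iter_segments(data: bytes):
    """Yield (marker_byte, payload) for each header segment, stopping after SOS.

    Sketch only: assumes a well-formed stream with no fill bytes between segments.
    """
    if data[:2] != b"\xff\xd8":  # SOI marker has no length field
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # The length counts its own two bytes, so the payload has length - 2 bytes.
        yield marker, data[pos + 4:pos + 2 + length]
        if marker == SOS:
            break  # the entropy-coded data follows the SOS header
        pos += 2 + length


# Toy stream (hypothetical bytes): SOI, a 3-byte "DHT" payload, a 1-byte SOS header.
demo = b"\xff\xd8\xff\xc4\x00\x05abc\xff\xda\x00\x03\x00entropy"
for marker, payload in iter_segments(demo):
    print(hex(marker), payload)
```

In a real file the DHT payload would then be decoded into the \(L_i\) counts and RLVs, and the bytes after the SOS header would be scanned for VLCs.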

Fig. 1. The structure of the JPEG bitstream.

According to the JPEG guideline, after some preprocessing, the pixel values of each \(8\times 8\) block in the image are transformed into DC and AC coefficients by the Discrete Cosine Transform (DCT). The first coefficient, in the upper-left corner of each block, is the DC coefficient, and the remaining 63 coefficients of each block are the AC coefficients. To compress the image filesize, the AC coefficients are run-length encoded as intermediate symbols, (Run, Length) and (Amplitude). To further improve the compression rate, (Run, Length) is encoded as a Variable Length Code (VLC), and (Amplitude) is encoded as a Variable Length Integer (VLI), whose bits are called appended bits.
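The run-length step can be illustrated with a short sketch. It assumes the 63 AC coefficients of one block are already quantized and in zigzag order; the function name `rle_ac` is hypothetical, and the second element of each symbol is the bit length of the amplitude, which is what the text calls Length.

```python
def rle_ac(ac):
    """Turn 63 quantized AC coefficients into (run, length, amplitude) symbols.

    (15, 0, None) stands for 'F/0' (a run of 16 zeros) and (0, 0, None)
    for '0/0' (End of Block). Sketch only; no entropy coding is done here.
    """
    symbols = []
    # Zeros after the last non-zero coefficient are covered by End of Block.
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    run = 0
    for v in ac[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:  # a run of 16 zeros is emitted as 'F/0'
                symbols.append((15, 0, None))
                run = 0
            continue
        symbols.append((run, abs(v).bit_length(), v))
        run = 0
    if last_nonzero < len(ac) - 1:
        symbols.append((0, 0, None))  # End of Block
    return symbols


print(rle_ac([5, 0, 0, -3] + [0] * 59))
```

Each (run, length) pair is then looked up in the Huffman table to obtain its VLC, and the amplitude is written as appended bits.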

The DHT segment contains the canonical Huffman table information, which is used to obtain the VLCs. Figure 1 shows the structure of the DHT segment, in which \(L_i\) is the number of VLCs of length \(i\) and \(RLV_{i,j}\) is the run/length value (RLV) corresponding to the \(j\)-th VLC of length \(i\). Each VLC corresponds to a specific RLV that represents an AC coefficient in the entropy-coded data. The VLCs are canonical Huffman codes; for instance, if the run/length value is ‘0/4’, the corresponding VLC is ‘1011’. For the AC coefficients of the luminance component, 162 different VLCs correspond to the run/length values from ‘0/1’ to ‘F/A’ together with ‘0/0’ (End of Block) and ‘F/0’ (Zero Run Length). The length of a VLC is between 2 and 16 bits. Statistical results indicate that not all of the VLCs appear in the entropy-coded data. As discussed in Sect. 1, several methods exploit this property to embed data [6, 7].
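To make the relation between the \(L_i\) counts and the VLCs concrete, the sketch below rebuilds canonical Huffman codes from a DHT-style description. The function name `canonical_codes` is an assumption for illustration; the counts and RLVs are only the first four code lengths of the typical luminance AC table given in the JPEG standard, which is enough to reproduce the ‘0/4’ example.

```python
def canonical_codes(counts, rlvs):
    """Map each RLV byte to its VLC bit string.

    counts[i - 1] is L_i, the number of codes of length i;
    rlvs lists the RLV bytes in the order they appear in the DHT segment.
    """
    codes = {}
    code = 0
    k = 0
    for length, n in enumerate(counts, start=1):
        for _ in range(n):
            codes[rlvs[k]] = format(code, f"0{length}b")
            code += 1
            k += 1
        code <<= 1  # moving to the next length appends one bit
    return codes


# First four lengths of the typical luminance AC table: L_1..L_4 = 0, 2, 1, 3.
counts = [0, 2, 1, 3]
rlvs = [0x01, 0x02, 0x03, 0x00, 0x04, 0x11]  # '0/1', '0/2', '0/3', '0/0', '0/4', '1/1'
table = canonical_codes(counts, rlvs)
print(table[0x04])  # VLC for run/length value '0/4'
```

Because the codes are fully determined by the counts and the order of the RLVs, changing an RLV entry in the DHT segment changes what a VLC decodes to without changing any code length.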

2.2 Qian and Zhang’s Method

Qian and Zhang proposed a lossless data hiding method [6] based on Huffman code mapping, which leaves the modified image without distortion and provides more embedding capacity than [5]. In addition, their method keeps the filesize unchanged.

Fig. 2. An embedding instance by VLC mapping.

As mentioned in Sect. 1 and Subsect. 2.1, not every VLC appears in the entropy-coded data. Qian and Zhang used this property to establish mapping relationships between used-VLCs and unused-VLCs, and then replaced a used-VLC with any VLC in its mapping set to embed data. Figure 2 shows an instance of embedding data by VLC mapping. In Fig. 2, the mapping set includes four VLCs: the used-VLC, \(unused-VLC_1\), \(unused-VLC_2\) and \(unused-VLC_3\). The four VLCs can therefore stand for all \(2^2\) values of two binary bits. To embed, each occurrence of the used-VLC in the entropy-coded data is replaced by the member of the mapping set that stands for the bits to be embedded. For instance, if we replace the used-VLC with \(unused-VLC_3\), the data ‘11’ is embedded.

Their embedding phase can be described by the following steps.

Step 1: Parse the JPEG bitstream and extract all the VLCs in the entropy-coded data.

Step 2: Establish the mapping relationships according to the statistical results of the used-VLCs and unused-VLCs.

Step 3: Modify the corresponding run/length values in the DHT segment according to the mapping relationships.

Step 4: Replace the VLCs in the entropy-coded data with the corresponding unused-VLCs to embed data.

To keep the filesize unchanged, Qian and Zhang only replace a used-VLC with an unused-VLC of the same length. Therefore, all the VLCs are classified into 16 categories

$$\begin{aligned} \{ C_1,C_2,\ldots ,C_{16}\} \end{aligned}$$

by their code length, and category \(C_i\) has \(L_i\) VLCs of length \(i\) \((i=1,2,\ldots ,16)\), which can be represented by the following equation:

$$\begin{aligned} C_i = \{ VLC^{u}_{i,1},\ldots ,VLC^{u}_{i,p_i};VLC^n_{i,1},\ldots ,VLC^n_{i,q_i}\} \end{aligned}$$
(1)

where \(VLC^{u}_{i,1},\ldots ,VLC^{u}_{i,p_i}\) are used-VLCs, \(VLC^n_{i,1},\ldots ,VLC^n_{i,q_i}\) are unused-VLCs, and \(p_i + q_i = L_i\).
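The classification in Eq. (1) amounts to grouping the code table by code length and splitting each group by whether the VLC occurs in the scan. The sketch below is one way to do this; the function name `classify`, the dictionary layout, and the toy table are illustrative assumptions.

```python
from collections import defaultdict


def classify(codes, used_rlvs):
    """Group VLCs by code length i and split each category into used and unused.

    codes: RLV -> VLC bit string; used_rlvs: set of RLVs seen in the entropy-coded data.
    Returns {i: {"used": [...], "unused": [...]}}, so p_i and q_i are the list lengths.
    """
    categories = defaultdict(lambda: {"used": [], "unused": []})
    for rlv, vlc in codes.items():
        kind = "used" if rlv in used_rlvs else "unused"
        categories[len(vlc)][kind].append(vlc)
    return dict(categories)


# Toy table (first entries of a typical luminance AC table) and a hypothetical usage set.
codes = {0x01: "00", 0x02: "01", 0x03: "100", 0x00: "1010", 0x04: "1011", 0x11: "1100"}
cats = classify(codes, used_rlvs={0x01, 0x00, 0x04})
for i in sorted(cats):
    print(i, len(cats[i]["used"]), len(cats[i]["unused"]))  # i, p_i, q_i
```

Only categories with both \(p_i > 0\) and \(q_i > 0\) can contribute mapping sets, since a replacement must keep the code length unchanged.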

To obtain more payload, they proposed two mapping manners:

(1) For the case \(p_i \ge q_i\) and \(q_i > 0\), the VLCs in each category are mapped in a one-to-one manner:

$$\begin{aligned} C_i = \{ \{VLC^{u}_{i,1}\leftrightarrow VLC^n_{i,1}\},\ldots ,\{VLC^{u}_{i,q_i}\leftrightarrow VLC^n_{i,q_i}\}\} \end{aligned}$$
(2)

where “\(\leftrightarrow \)” represents the mapping relationship.

(2) For the case \(q_i \ge p_i\) and \(p_i > 0\), the VLCs in each category are mapped in a one-to-\(k\) manner. A one-to-\(k\) mapping set can represent \(log_2 (k+1)\) bits of data, where \(log_2 (k+1)\) is a positive integer.

Qian and Zhang’s method maps the same number of unused-VLCs to each used-VLC in a category. The mapping can be described as follows:

$$\begin{aligned} M_i =\left\{ \begin{aligned}&\{VLC^{u}_{i,1}\leftrightarrow \{VLC^n_{i,1},\ldots ,VLC^n_{i,k_i}\}\},\ldots ,\\&\{VLC^{u}_{i,p_i}\leftrightarrow \{VLC^n_{i,(p_i - 1)\times k_i + 1},\ldots ,VLC^n_{i,p_i\times k_i}\}\} \end{aligned} \right\} \end{aligned}$$
(3)

where \(k_i=2^{\lfloor log_2 {(q_i/p_i+1)} \rfloor }-1\) and “\(\lfloor x \rfloor \)” stands for the floor function.
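The value of \(k_i\) can be computed with integer arithmetic only. The sketch below reads the definition as \(k_i = 2^{\lfloor log_2 (q_i/p_i+1) \rfloor } - 1\), which is the reading under which \(log_2 (k_i+1)\) is an integer as required by the one-to-\(k\) manner; the function name `qz_k` is hypothetical.

```python
def qz_k(p, q):
    """Number of unused-VLCs mapped to each used-VLC in the uniform one-to-k manner.

    Requires q >= p > 0. Uses floor(log2(y)) == floor(log2(floor(y))) for y >= 1,
    so no floating-point logarithm is needed.
    """
    if p <= 0 or q < p:
        raise ValueError("the one-to-k manner needs q >= p > 0")
    n = ((p + q) // p).bit_length() - 1  # bits carried by each occurrence
    return (1 << n) - 1


for p, q in [(2, 4), (2, 6), (1, 125)]:
    k = qz_k(p, q)
    print(p, q, k, (k + 1).bit_length() - 1)  # p_i, q_i, k_i, bits per occurrence
```

Since every used-VLC receives the same \(k_i\), up to \(q_i - p_i\cdot k_i\) unused-VLCs in a category can remain unassigned, which is the free space the later methods try to use.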

After the mapping relationships are established, the additional data can be embedded into the JPEG bitstream. However, there is still free space to obtain more payload.

2.3 Hu et al.’s Method

Hu et al. discovered potential free space for improving the payload of Qian and Zhang’s method [6]. They considered the statistics of used-VLCs and unused-VLCs and exploited them to increase the payload. The key contribution of Hu et al.’s method is to improve the one-to-\(k\) mapping manner by considering the frequency of each used-VLC in the entropy-coded data.

For the VLCs, the largest category is \(C_{16}\), which has 125 VLCs, and the total number of VLCs is 162. Hence, the one-to-\(k\) manner can represent at most 6 (\( \lfloor log_2 (125+1) \rfloor \)) bits of data in \(C_{16}\). In each category \(C_i\), Hu et al.’s method assumes that the number of one-to-(\(2^j-1\)) mapping sets is \(m_{i,j}\) \((1\le j\le 6)\). The selection of \(m_{i,j}\) should satisfy the following condition:

$$\begin{aligned} \begin{aligned} \mathbf{max}\quad&\mathbf{Z} = \; \sum ^6_{j=1} j\cdot m_{i,j}\\ \mathbf{s.t.}\quad&\left\{ \begin{aligned}&\sum ^6_{j=1} m_{i,j}\le p_i\\&\sum ^6_{j=1} (2^j-1)\cdot m_{i,j}\le q_i \\&m_{i,j}\ge 0,j=1,2,\ldots ,6\\&m_{i,j}\in N^*,j=1,2,\ldots ,6 \end{aligned}\right. \end{aligned} \end{aligned}$$
(4)

After Eq. (4) is solved, all code mapping relationships are established, which can be described as:

$$\begin{aligned} M_i=\left\{ \begin{aligned}&\{VLC^{u}_{i,1}\leftrightarrow \{VLC^n_{i,1},\ldots ,VLC^n_{i,63}\}\},\ldots ,\\&\{VLC^{u}_{i,m_{i,6}}\leftrightarrow \{VLC^n_{i,63\cdot (m_{i,6}-1)+1},\ldots ,VLC^n_{i,63\cdot m_{i,6}}\}\},\\&\{VLC^{u}_{i,m_{i,6}+1}\leftrightarrow \{VLC^n_{i,63\cdot m_{i,6}+1},\ldots ,VLC^n_{i,63\cdot m_{i,6}+31}\}\},\ldots ,\\&\{VLC^{u}_{i,\sum ^6_{j=1} m_{i,j}}\leftrightarrow VLC^n_{i,\sum ^6_{j=1} [(2^j-1)\cdot m_{i,j}]}\} \end{aligned} \right\} \end{aligned}$$
(5)

where \(VLC^{u}_{i,j}\) are the used-VLCs sorted in descending order of frequency and \(VLC^n_{i,1},\ldots ,VLC^n_{i,j}\) are unused-VLCs.

This means that the first \(m_{i,6}\) sorted used VLCs follow the one-to-63 manner, the next \(m_{i,5}\) sorted used VLCs follow the one-to-31 manner, ... , and the last \(m_{i,1}\) sorted used VLCs follow the one-to-one manner.

Since each occurrence of a used-VLC in the entropy-coded segment can carry \(log_2 (k+1)\) bits of data, Hu et al.’s method maps more unused-VLCs to the used-VLCs that occur more often in the entropy-coded data. By considering the frequency (number of occurrences) of each used-VLC in the mapping relationships, Hu et al.’s method gains more payload than Qian and Zhang’s method. However, considering only the first six high-frequency VLCs may overlook potential free space. Statistical results indicate that the frequencies of the used-VLCs within the same category also vary distinctly. Hence, the payload can be further increased by considering the frequency of each used-VLC. This is the key point of the proposed method, which is described in detail in Sect. 3.
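As a small worked example of the difference between the two earlier methods (the numbers are hypothetical and chosen only for illustration), consider a category with \(p_i = 2\) used-VLCs that occur 100 and 10 times, and \(q_i = 4\) unused-VLCs. Qian and Zhang’s uniform rule gives every used-VLC the same \(k_i\), whereas the condition in Eq. (4) is satisfied by one one-to-3 set and one one-to-one set (\(1+1\le 2\) sets, \(3+1\le 4\) unused-VLCs), with the one-to-3 set assigned to the more frequent used-VLC:

$$\begin{aligned} \text{Qian and Zhang:}\quad&k_i = 2^{\lfloor log_2 (4/2+1) \rfloor }-1 = 1,\quad 100\cdot 1 + 10\cdot 1 = 110 \text{ bits}\\ \text{Hu et al.:}\quad&m_{i,2}=1,\; m_{i,1}=1,\quad 100\cdot 2 + 10\cdot 1 = 210 \text{ bits} \end{aligned}$$

In this toy case all four unused-VLCs are used by Hu et al.’s mapping, while the uniform rule leaves two of them unassigned.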

3 Proposed Method

The proposed method further increases the payload of Hu et al.’s method [7] by improving the mapping relationships of the VLCs. The improved algorithm and the data embedding and extraction procedures are introduced in the following subsections. Figure 3 illustrates the framework of the proposed method. First, the JPEG bitstream is parsed and all the VLCs are extracted. Then the VLC mapping relationships are established by the proposed mapping rules. With the optimal mapping relationships, more additional data can be embedded.

Fig. 3. The framework of the proposed method.

3.1 Algorithm of Optimized VLC Mapping

According to the JPEG guideline, there are 162 kinds of VLCs for AC coefficients and 12 kinds of VLCs for DC coefficients. Statistical results show that the VLCs for DC coefficients are all used in the entropy-coded data, whereas not all of the VLCs for AC coefficients are used. Hence, the algorithm is applied to the AC coefficients. The RLV of an unused-VLC can be changed to the RLV of a used-VLC, so that the unused-VLC is mapped to that used-VLC. To preserve the image filesize, a used-VLC and the unused-VLCs mapped to it must have the same code length.

The steps of classifying the VLCs, recording their occurrences, and sorting them are the same as in Hu et al.’s method. After these steps are completed, each category can be represented in the following form:

$$\begin{aligned} C_i = \{ VLC^{(u)'}_{i,1},\ldots ,VLC^{(u)'}_{i,p_i};VLC^n_{i,1},\ldots ,VLC^n_{i,q_i}\} \end{aligned}$$
(6)

where \(VLC^{(u)'}_{i,1},\ldots ,VLC^{(u)'}_{i,p_i}\) are the used-VLCs sorted in descending order and \(VLC^n_{i,1},\ldots ,VLC^n_{i,q_i}\) are unused-VLCs.

After the above steps are completed, the mapping relationships are established. The mapping rule of each category depends on \(p_i\) and \(q_i\). For the case \(p_i \ge q_i\) and \(q_i>0\), \(C_i\) employs the one-to-one mapping rule. For the case \(q_i \ge p_i\) and \(p_i>0\), \(C_i\) employs the one-to-\(k\) mapping rule. If neither case is satisfied, \(C_i\) does not employ any mapping rule. The mapping rules are described in detail below.

One-to-One Mapping Rule. For the case \(p_i \ge q_i\) and \(q_i>0\), the VLCs in each category are mapped by the one-to-one rule, which is the same as in Hu et al.’s method:

$$\begin{aligned} C_i = \{ \{VLC^{(u)'}_{i,1}\leftrightarrow VLC^n_{i,1}\},\ldots ,\{VLC^{(u)'}_{i,q_i}\leftrightarrow VLC^n_{i,q_i}\}\} \end{aligned}$$
(7)

where “\(\leftrightarrow \)” represents the mapping relationship. Figure 4 illustrates the one-to-one mapping rule. As shown in Fig. 4, each unused-VLC is mapped to a different used-VLC.

Fig. 4. The diagram of the one-to-one mapping rule.

One-to-\({{\varvec{k}}}\) Mapping Rule. For the case \(q_i \ge p_i\) and \(p_i>0\), the mapping rule changes to the one-to-\(k\) manner. Hu et al.’s method assigns the unused-VLCs to the used-VLCs according to their sorted order. Both Qian and Zhang’s method [6] and Hu et al.’s method [7] are particular cases of this manner. Different from these two methods, the proposed method takes the frequency of each used-VLC into consideration. Deciding how many unused-VLCs to map to each used-VLC is a pure integer nonlinear programming (\(PINLP\)) problem. Let \(x_{i,j}\) be the number of unused-VLCs mapped to the \(j\)-th used-VLC in category \(C_i\) and \(f_{i,j}\) be the frequency of that used-VLC; the selection of \(x_{i,j}\) should then satisfy the following condition:

$$\begin{aligned} \begin{aligned} \mathbf{max}\quad&Z_i = \; \sum ^{p_i}_{j=1} f_{i,j}\cdot log_2 (x_{i,j}+1)\\ \mathbf{s.t.}\quad&\left\{ \begin{aligned}&\sum ^{p_i}_{j=1} x_{i,j}\le q_i\\&x_{i,j}\ge 0 \\&x_{i,j}\le q_i \\&x_{i,j}\in N^*\\&log_2 (x_{i,j}+1)\in N^* \end{aligned}\right. \end{aligned} \end{aligned}$$
(8)

where \(Z_i\) represents the embedding capacity of the category \(C_i\), and \(log_2 (x_{i,j}+1)\) is the number of bits of additional data that each occurrence of a VLC in the corresponding mapping set can represent. For instance, if 63 unused-VLCs are mapped to a used-VLC, each occurrence of that used-VLC can represent 6 bits of additional data. According to Eq. (8), \(x_{i,j}\) is positively correlated with \(f_{i,j}\). After Eq. (8) is solved, all code mapping relationships are established, which can be described as:

$$\begin{aligned} M_i=\left\{ \begin{aligned}&\{VLC^{u}_{i,1}\leftrightarrow \{VLC^n_{i,1},\ldots ,VLC^n_{i,x_{i,1}}\}\},\\&\{VLC^{u}_{i,2}\leftrightarrow \{VLC^n_{i,x_{i,1}+1},\ldots ,VLC^n_{i,x_{i,1}+x_{i,2}}\}\},\ldots ,\\&\{VLC^{u}_{i,p_i}\leftrightarrow \{VLC^n_{i,\sum ^{p_i-1}_{j=1} x_{i,j}+1},\ldots ,VLC^n_{i,\sum ^{p_i}_{j=1} x_{i,j}}\}\} \end{aligned} \right\} \end{aligned}$$
(9)
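The text does not state how Eq. (8) is solved. Because each \(x_{i,j}\) can only take values of the form \(2^b-1\), one possible approach for a single category is a small dynamic program over the used-VLCs and the number of unused-VLCs still available. The sketch below is such an illustration, not the authors’ solver; the function name `optimal_mapping` is hypothetical, and it assumes that \(x_{i,j}=0\) (a used-VLC that receives no unused-VLC) is allowed.

```python
def optimal_mapping(freqs, q):
    """Maximise sum(f_j * log2(x_j + 1)) subject to sum(x_j) <= q, x_j in {0, 1, 3, 7, ...}.

    freqs: occurrence counts f_{i,j} of the used-VLCs in one category.
    q: number of unused-VLCs q_i in that category.
    Returns (payload_in_bits, [x_1, ..., x_p]).
    """
    p = len(freqs)
    # best[j][r]: best payload from used-VLCs j..p-1 with r unused-VLCs left.
    best = [[0] * (q + 1) for _ in range(p + 1)]
    choice = [[0] * (q + 1) for _ in range(p)]
    for j in range(p - 1, -1, -1):
        for r in range(q + 1):
            top, top_bits, bits = -1, 0, 0
            while (1 << bits) - 1 <= r:  # try x = 2**bits - 1
                x = (1 << bits) - 1
                value = freqs[j] * bits + best[j + 1][r - x]
                if value > top:
                    top, top_bits = value, bits
                bits += 1
            best[j][r], choice[j][r] = top, top_bits
    # Walk the recorded choices to recover the allocation.
    xs, r = [], q
    for j in range(p):
        x = (1 << choice[j][r]) - 1
        xs.append(x)
        r -= x
    return best[0][q], xs


print(optimal_mapping([100, 1, 1], 3))
print(optimal_mapping([100, 10], 4))
```

For the hypothetical frequencies (100, 1, 1) with three unused-VLCs, giving all three to the most frequent used-VLC carries \(100\cdot 2 = 200\) bits, whereas three one-to-one sets, which maximise \(Z\) in Eq. (4) as written, would carry 102 bits. The table has \(p_i\cdot q_i\) entries with at most seven candidates each for \(q_i \le 125\), so the search is cheap.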

Figure 5 illustrates the one-to-\(k\) mapping rule. In Fig. 5, \(VLC^{u}_{i,1}\) is mapped to \(k_1\) different unused-VLCs, so the mapping rule for \(VLC^{u}_{i,1}\) is one-to-\(k_1\). In the same way, the mapping rule for \(VLC^{u}_{i,k}\) is one-to-\(k_2\).

Fig. 5. The diagram of the one-to-k mapping rule.

Fig. 6. The diagram of embedding additional data by replacing VLCs.

3.2 Data Embedding and Extraction

The embedding procedure can be summarized as follows:

Input: An original JPEG bitstream and a secret bitstream.

Output: A stego JPEG bitstream.

Step 1: Parse the entropy-coded data, extract all of the used-VLCs and unused-VLCs.

Step 2: Establish the mapping relationships based on the mapping method mentioned in Sect. 3.1.

Step 3: Modify the corresponding RLVs in the DHT segment, and embed the secret data by replacing the VLCs. Figure 6 shows the process of embedding additional data by replacing VLCs. In Fig. 6, \(VLC^u_{i,m}\) is mapped to three different unused-VLCs, \(VLC^n_{i,k}\), \(VLC^n_{i,l}\), and \(VLC^n_{i,n}\). After the RLVs of the three unused-VLCs are replaced by \(RLV^u_{i,m}\), the RLV of \(VLC^u_{i,m}\), all four VLCs represent the same RLV. We can then embed additional data by replacing each occurrence of the original VLC with one of the four VLCs. Figure 7 shows one possible assignment of additional data to the four VLCs. With the assignment in Fig. 7, replacing the original VLC with \(VLC^n_{i,k}\) embeds the additional data ‘01’ in the bitstream.

Fig. 7. One of the groups of additional data.

Fig. 8. The ten test grey-scale images.

On the receiver side, the data extraction procedure can be summarized as follows:

Input: A stego JPEG bitstream.

Output: An original JPEG bitstream and a secret bitstream.

Step 1: Read the DHT segment and find the RLVs with the same values.

Step 2: Group the VLCs that share an RLV into a set, and record the RLV, the corresponding VLCs, and the additional data that each VLC stands for.

Step 3: Parse the JPEG bitstream and extract the embedded data according to the records from the entropy-coded data.
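The embedding and extraction steps above can be sketched on a simplified model of the entropy-coded data. The sketch makes several assumptions that are not in the text: the scan is modelled as a list of VLC bit strings with the appended bits omitted, the mapping is passed to the receiver directly instead of being recovered from the DHT segment, the \(b\)-bit value carried by a VLC is its index in the mapping set (the used-VLC first), and the receiver knows the payload length. The names `embed` and `extract` are hypothetical.

```python
def embed(stream, mapping, bits):
    """Replace each mapped used-VLC by the set member whose index encodes the next bits.

    mapping: used-VLC -> list of 2**b - 1 unused-VLCs of the same length.
    bits: secret bit string; it is padded with '0' if it runs out.
    """
    out, pos = [], 0
    for vlc in stream:
        group = [vlc] + mapping.get(vlc, [])
        assert all(len(c) == len(vlc) for c in group)  # filesize is preserved
        b = len(group).bit_length() - 1  # len(group) == 2**b
        chunk = bits[pos:pos + b].ljust(b, "0")
        pos += b
        out.append(group[int(chunk, 2)] if b else vlc)
    return out


def extract(stego, mapping):
    """Return (restored_stream, extracted_bits) from a stego stream."""
    lookup = {}
    for used, unused in mapping.items():
        group = [used] + unused
        b = len(group).bit_length() - 1
        for index, vlc in enumerate(group):
            lookup[vlc] = (used, format(index, f"0{b}b") if b else "")
    restored, bits = [], []
    for vlc in stego:
        used, chunk = lookup.get(vlc, (vlc, ""))
        restored.append(used)
        bits.append(chunk)
    return restored, "".join(bits)


# Toy data (hypothetical): one used-VLC of length 4 mapped to three unused-VLCs.
mapping = {"1011": ["1101", "1110", "1111"]}
stream = ["00", "1011", "01", "1011"]
stego = embed(stream, mapping, "1101")
print(stego)
print(extract(stego, mapping))
```

Because every replacement has the same length as the VLC it replaces, the stego stream has the same number of bits as the original, and restoring the used-VLCs recovers the original stream.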

4 Experimental Results and Analysis

4.1 Experimental Results

To test the performance of the proposed method, we first used images from the USC-SIPI database [8]. Figure 8 shows the ten test images. We converted these images to grey-scale and then compressed them to generate JPEG images with different quality factors.

Since we only replace used-VLCs with unused-VLCs of the same length, the filesize of the modified JPEG bitstream is unchanged. Moreover, because the RLV of the replacing VLC is the same as that of the original VLC, the decoded results are unchanged, and the decoded image has no distortion. Because the proposed method is lossless and preserves the filesize, we only compare its embedding capacity with that of the earlier methods.

The embedding capacities of the proposed method and Hu et al.’s method are shown in Table 1 for JPEG quality factors from 10 to 90. Table 1 shows the capacity improvement achieved by the proposed method.

Table 1. Embedding capacity comparison with JPEG quality factors from 10 to 90 (bits)

To compare the embedding capacity of these methods further, we also tested all 1338 images from the UCID image database [9]. Figure 9 shows the average payload of these 1338 JPEG images at different quality factors for these methods, and Table 2 lists the average payload of the three methods. As shown in Fig. 9, the higher the quality factor, the more obvious the improvement of the proposed method. Table 3 shows the specific improvements of the proposed method over the other methods. To show the improvement more clearly, we also generated three scatter plots, Figs. 10, 11 and 12, for quality factors 70, 80, and 90, respectively. The three scatter plots show that the improvement is steady and clear.

Fig. 9. The comparison of average payload of the JPEG images from UCID.

Table 2. The average payload of the three methods.

4.2 Analysis

From Table 3 and Fig. 9, we can see that the improvement of the proposed method is most obvious for high-QF JPEG images. However, Table 2 shows that the average payload of high-QF JPEG images is lower than that of low-QF JPEG images for all three methods. The reason can be summarized in the following two points.

First point: In general, the shorter a VLC, the higher its frequency (number of occurrences) in the entropy-coded data. This is because the VLCs are canonical Huffman codes.

Second point: The lower the QF, the more likely it is that short unused-VLCs exist. The QF indicates the degree of compression of a JPEG image; a high QF means a low degree of compression. The lower the QF, the more zero AC coefficients there are, and hence the fewer kinds of used-VLCs appear in the entropy-coded data. Correspondingly, short unused-VLCs are more likely to exist.

Table 3. The improvements of proposed method compared to the other methods.

Fig. 10. The payload of each JPEG image with QF = 70 from UCID.

Fig. 11. The payload of each JPEG image with QF = 80 from UCID.

Fig. 12. The payload of each JPEG image with QF = 90 from UCID.

Based on these two points, more data can be embedded into low-QF JPEG images than into high-QF JPEG images. The payload is determined by both the frequency of each used-VLC and the amount of information that each used-VLC can stand for, but the frequency has the greater effect. Because short unused-VLCs are more likely to exist in low-QF JPEG images, short used-VLCs are more likely to obtain mapping relationships. In addition, the frequency of short used-VLCs is much higher, so more data can be embedded than in high-QF JPEG images.

However, high-QF JPEG images are used more widely than low-QF JPEG images because they have less distortion. Therefore, for most JPEG images in daily use, our method obtains an obvious improvement over the other methods.

5 Conclusions

In this paper, a lossless data hiding scheme for JPEG images is proposed. After introducing the existing VLC mapping algorithms for the JPEG bitstream, we develop the optimal VLC mapping according to the statistical results of the VLCs. The proposed method thereby obtains more free space for embedding additional data.