1 Introduction

Data hiding (DH) is a technique for embedding secret data in an innocuous cover medium such as an image, audio, or text while maintaining the quality of the cover. For a DH scheme, the cover medium should not be unduly degraded, and the embedded data should not be noticeable to the human visual system. That is, even if a cover image carries a large message, its existence should not be detected by an observer [1, 8, 16]. In practice, however, most stego images can be detected by statistical analysis [12, 14]. To evade steganalysis, the number of pixels modified to embed a message should be reduced, so that the modifications do not seriously change the original features of the cover image.

Least significant bit (LSB) substitution is one family of DH schemes. LSB replacement has been widely studied in the DH field, and its high embedding capacity is a major merit [11, 20]. The LSB approach is simple, but it has a critical weakness: there is a statistical difference between modified and unmodified pixels in the stego image. Matrix coding based data hiding (MCDH) can scatter the embedded secret message across the stego image and thereby mitigate this difference.

Unlike naive LSB replacement, the optimal pixel adjustment process (OPAP) [3] is widely used to improve imperceptibility, because OPAP significantly improves the peak signal-to-noise ratio (PSNR) of simple LSB embedding. Recently, Yang et al. [15] also adopted LSB and OPAP to improve the visual quality of a cover image generated by neighbor mean interpolation.

For an MCDH adopting the covering function COV(r, n, k), embedding efficiency can be improved by decreasing the number of changes [4]. The F5 algorithm [13] is an MCDH that uses the HC(n, n − k) Hamming code with minimum Hamming distance dmin = 3 and COV(1, n = 2^k − 1, k) [2]. Zhang and Wang [19] introduced exploiting modification direction (EMD), a ternary HC that fully exploits the LSBs. Fridrich et al. [6] also proposed a simple one-dimensional ternary code suitable for embedding large payloads. Earlier MCDHs are based on covering codes such as the Hamming, Golay, and BCH codes [17, 18]. For instance, the Hamming+1 DH (H1DH) [18] used OPAP to improve stego image quality. The high capacity information hiding (HCIH) scheme [21] was introduced for a high embedding rate (ER) by combining H1DH and LSB replacement.

Kim and Yang [10] introduced a scheme that improves embedding capacity by overlapping two consecutive blocks based on HC(7, 4) DH (HDH). That is, it is possible to embed 6 bits in 11 pixels (note: 7 + 7 − 3 = 11). However, to exploit the overlapped pixels they simply used the second LSB. Recently, Kim et al. [9] proposed a Hamming+k data hiding (Hk DH) scheme that hides 2k bits in a block by matrix encoding over the overlapped LSB and 2nd LSB planes, adding or subtracting two when necessary. These coarse approaches somewhat reduce the quality of the cover image.

To tackle this problem, in this paper we extend the pixel overlapping approach to Hk DH with m overlapped pixels (referred to as Hk_mDH). We apply the syndrome function of HC and OPAP simultaneously to the overlapped pixels. The proposed Hk_mDH conceals 2k secret bits in (2^{k+1} − 2 − m) pixels, changing only two LSBs in a block with large probability and making at most three modifications.

The rest of this paper is organized as follows. Section 2 briefly reviews the Hamming code, HDH, Kim and Yang's DH, and Hk DH. Section 3 presents Hk_mDH together with some theoretical analysis. In Section 4, the proposed Hk_mDH is tested on several images, and a comparison with HDH, H1DH, HCIH, Kim and Yang's DH, and Hk DH is provided. Finally, Section 5 concludes the paper.

2 Related works

2.1 Hamming code

We first introduce a few basic concepts of Hamming code theory for covering codes. The Hamming code HC(n, n − k) with dmin = 3 is a single-error-correcting linear block code, where k is the number of parity bits and (n − k) is the number of information bits. Let \(\mathbb{F}_{2}^{n}\) denote the space of all n-bit column vectors y = (y1,…, yn)^T. Let \(y \in \mathbb{F}_{2}^{n}\) (n = 2^k − 1) be a codeword obtained from an information word \(x \in \mathbb{F}_{2}^{n-k}\) via an (n − k) × n generator matrix G, where y = xG. Suppose that H is a k × n parity check matrix with GH^T = [0]_{(n−k)×k}.

For any \(y \in \mathbb {F}_{2}^{n}\), the vector \(S = Hy \in \mathbb {F}_{2}^{k}\) is called the syndrome of y. For each syndrome \(S \in \mathbb {F}_{2}^{k}\), the set \(C(S) = \{y \in \mathbb {F}_{2}^{n} \mid Hy = S\}\) is called a coset. From the syndrome we can identify a one-bit error pattern and thus correctly decode a codeword containing a single-bit error. Suppose that the received codeword is \(\hat {y}\) with error pattern \(e=(y \oplus \hat {y})\).

The position of e can be obtained from (1).

$$ \left\{\begin{array}{ll} \hat{y} \cdot H^{T} = (e \oplus y) \cdot H^{T} = e \cdot H^{T} + y \cdot H^{T}\\ \quad (\text{cf.,}\ y \cdot H^{T} = (x \cdot G) \cdot H^{T} = 0)\\ = e \cdot H^{T} + 0 = e \cdot H^{T} \end{array}\right. $$
(1)

For example, the parity check matrix H of HC(7, 4) code is shown in (2).

$$ H=\left[ \begin{array}{ccccccc} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{array} \right] $$
(2)

Suppose that one error bit occurs in \(\hat {y}\), say the 6th bit from the left, i.e., e = (e1, e2,…, e7) = (0000010). The extracted syndrome is S = (110), which indicates that the error is in the 6th position from the left. To embed message bits in each subset while making at most one embedding change, we divide the cover image into N/n subsets, each consisting of n pixels, where n is the code length. For a codeword of length n, an error can occur in any of the n positions, or no error occurs at all. Thus, 2^k ≥ n + 1 for an (n, k) code. If this bound is achieved with equality, n = 2^k − 1, the code is a perfect code. Every binary HC is a perfect code.

The decoding procedure for HC is demonstrated as follows.

Step 1::

Compute the syndrome \(S(\hat {y})=b(\hat {y})\cdot H^{T}\), where b(i) = i mod 2 and \(\hat {y}\) is the received codeword.

Step 2::

Find the error pattern e from the syndrome.

Step 3::

Modify \(\hat {y}\) so that \(y=(\hat {y} \oplus e)\), and then decode x = yG^{−1}.
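As an illustration, the decoding steps above can be sketched in a few lines of Python (our own illustrative sketch, not code from the paper; the codeword y used in the check is an arbitrary valid HC(7, 4) codeword):

```python
import numpy as np

# Parity check matrix H of HC(7,4) from (2); column i is the binary representation of i.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def syndrome(y):
    """Step 1: S(y) = y . H^T over GF(2)."""
    return H.dot(y) % 2

def correct(y_hat):
    """Steps 2-3: read the syndrome as the 1-based error position and flip that bit."""
    s = syndrome(y_hat)
    pos = s[0] * 4 + s[1] * 2 + s[2]
    y = y_hat.copy()
    if pos != 0:
        y[pos - 1] ^= 1
    return y

# Example from the text: an error in the 6th bit yields the syndrome (1, 1, 0).
y = np.array([0, 1, 1, 0, 0, 1, 1])            # a valid HC(7,4) codeword (H.y = 0)
e = np.array([0, 0, 0, 0, 0, 1, 0])
assert list(syndrome(y ^ e)) == [1, 1, 0]
assert np.array_equal(correct(y ^ e), y)
```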

2.2 HDH scheme

For simplicity, we adopt HC(7, 4) to describe HDH. HDH hides 3 secret bits δ in 7 cover bits by using the covering function COV(1, 7, 3). For pixel-domain DH, these 7 cover bits are the LSBs of 7 pixels. For the embedding procedure, we need the error position e determined by the difference between δ and the syndrome, i.e., S′ = S ⊕ δ. The number of cosets of HC(7, 4) is eight, and δ can be embedded by changing at most one bit, since a codeword can be moved into any other coset by a single bit change. For COV(1, n, k), the codewords have length n = 2^k − 1, the change density is d = 1/2^k, the ER is φ = k/n = k/(2^k − 1), and the embedding efficiency is EE = φ/d = (2^k/(2^k − 1)) ⋅ k. That is, the embedding efficiency is the expected number of bits embedded per embedding change.
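For concreteness, the COV(1, 7, 3) embedding just described can be sketched as follows (an illustrative Python sketch, not the authors' implementation): the stego LSBs are chosen so that their syndrome equals the 3 secret bits, at the cost of at most one flipped bit.

```python
import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],      # parity check matrix of HC(7,4), see (2)
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def hdh_embed(lsbs, delta):
    """Embed 3 secret bits `delta` into 7 cover LSBs, flipping at most one bit."""
    s = H.dot(lsbs) % 2                   # syndrome of the cover LSBs
    d = s ^ delta                         # S' = S xor delta: position to flip
    pos = d[0] * 4 + d[1] * 2 + d[2]
    stego = lsbs.copy()
    if pos != 0:
        stego[pos - 1] ^= 1
    return stego

def hdh_extract(stego_lsbs):
    """The 3 secret bits are simply the syndrome of the stego LSBs."""
    return H.dot(stego_lsbs) % 2

lsbs = np.array([1, 0, 1, 1, 0, 0, 1])
delta = np.array([1, 0, 1])
stego = hdh_embed(lsbs, delta)
assert np.array_equal(hdh_extract(stego), delta) and np.sum(stego != lsbs) <= 1
```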

2.3 Kim and Yang’s DH scheme

Kim and Yang's DH is a scheme based on HC(7, 4) with 3 overlapped pixels among 11 consecutive pixels. In this case, it is possible to embed 6 secret bits into the 11 pixels (i.e., 7 + 7 − 3 = 11) (x1,…, x11), as shown in Fig. 1. That is, one 11-pixel block is treated as two HDHs using HC(7, 4) that share 3 overlapped pixels. The first 3 secret bits (δ1, δ2, δ3) are embedded into the LSBs (c1,⋯ , c7) of the first 7 pixels (x1,⋯ , x7) of the block by COV(1, 7, 3) (see (3)), where ci = b(xi) for 1 ≤ i ≤ 7 and b(⋅) extracts the 1st LSB of a pixel. The other three secret bits (δ4, δ5, δ6) are embedded into 7 bits (\(c_{1}^{\prime },\dots ,c_{7}^{\prime } \)) by COV(1, 7, 3) (see (4)); these 7 bits are constructed as \(c_{i}^{\prime }=b(\lfloor x_{i + 4}/2\rfloor )\) for 1 ≤ i ≤ 3 and \(c_{i}^{\prime }=b(x_{i + 4})\) for 4 ≤ i ≤ 7.

$$ (\delta_{1}, \delta_{2}, \delta_{3})^{T} = H \cdot (c_{1}, \dots, c_{7})^{T} = H \cdot (b(x_{1}),\dots, b(x_{7}))^{T} $$
(3)
$$ \left\{\begin{array}{llllll} (\delta_{4}, \delta_{5}, \delta_{6})^{T} = H \cdot (c_{1}^{\prime},\cdots,c_{7}^{\prime})^{T} \\ = H \cdot (b(\lfloor x_{5}/2 \rfloor), \cdots, b(\lfloor x_{7}/2 \rfloor), b(x_{8}), \cdots, b(x_{11}))^{T} \end{array}\right. $$
(4)
Fig. 1 A block with three overlapped pixels for Kim and Yang's DH
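The two 7-bit vectors used in (3) and (4) can be assembled as in the following sketch (illustrative; the helper name kim_yang_codewords is ours):

```python
def kim_yang_codewords(x):
    """Build the vectors of (3) and (4) from an 11-pixel block x[0..10] (x1..x11)."""
    lsb  = lambda v: v & 1          # 1st LSB, b(v)
    lsb2 = lambda v: (v >> 1) & 1   # 2nd LSB, b(floor(v/2))
    c  = [lsb(x[i]) for i in range(7)]                                          # (3): LSBs of x1..x7
    cp = [lsb2(x[i]) for i in range(4, 7)] + [lsb(x[i]) for i in range(7, 11)]  # (4)
    return c, cp
```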

2.4 Hk DH scheme

In Hk DH, the number of overlapped pixels is 2^k − 1, and 2k bits (δ1,⋯ , δ2k) can be embedded in a block of 2^k − 1 pixels by exploiting the LSB and 2nd LSB planes. For k = 3, Hk DH is based on HC(7, 4); it is similar to Kim and Yang's DH but with all seven pixels overlapped. Moreover, the stego image quality is maintained by combining OPAP-style manipulation with HC. Hk DH is briefly described as follows. The first k secret bits are embedded into the XOR of the 1st and 2nd LSBs of all 2^k − 1 pixel values using the syndrome of HC, and the other k bits are embedded into the 2nd LSBs using HC. Compared with HDH, H1DH, and HCIH, the capacity of Hk DH is improved while the PSNR is not seriously degraded.

Embedding algorithm of Hk DH

Step 1::

Read one block of pixels \((x_{1},\dots ,x_{2^{k}-1})\) and generate a codeword \(C_{1} \in \mathbb {F}_{2}^{2^{k}-1}\) by XORing the 1st LSB and the 2nd LSB of each pixel, as in (5),

$$ C_{1} = \left( b\left( \left\lfloor \frac{x_{1}}{2} \right\rfloor \right) \oplus b(x_{1}), \dots, b\left( \left\lfloor \frac{x_{2^{k}-1}}{2} \right\rfloor \right) \oplus b(x_{2^{k}-1}) \right) $$
(5)
Step 2::

Compute the syndrome S1 (see (6)) from C1 using the parity check matrix H, and compute a new syndrome \(S_{1}^{\prime } = S_{1} \oplus {\delta _{1}^{k}}\), where \({\delta _{1}^{k}}\) denotes the first k secret bits. If \(S_{1}^{\prime } \neq 0\): if the two LSBs of \(x_{S_{1}^{\prime }}\) are [00] or [10], set \(x_{S_{1}^{\prime }}+ 1\); otherwise set \(x_{S_{1}^{\prime }}-1\).

$$ S_{1} = ([H\cdot {C_{1}^{T}}] \text{ mod } 2)^{T} $$
(6)
Step 3::

Compute the codeword C2 of second LSBs (see (7)) and its syndrome S2 using (8), and let \(S_{2}^{\prime } = S_{2} \oplus \delta _{k+1}^{2k}\). If \(S_{2}^{\prime } \neq 0\): if the two LSBs of \(x_{S_{2}^{\prime }}\) are [00] or [10], set \(x_{S_{2}^{\prime }}-1\); otherwise set \(x_{S_{2}^{\prime }}+ 1\).

$$ C_{2} = \left( b\left( \left\lfloor \frac{x_{1}}{2} \right\rfloor \right), \dots, b\left( \left\lfloor \frac{x_{2^{k}-1}}{2} \right\rfloor \right) \right) $$
(7)
$$ S_{2} = ([H \cdot {C_{2}^{T}}] \text{ mod } 2)^{T} $$
(8)
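The three steps above can be condensed into the following sketch for k = 3 (illustrative Python, not the authors' implementation; boundary pixel values 0 and 255 would need extra care):

```python
import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def pos(bits, delta):
    """Nonzero value = 1-based position whose bit must be flipped so H.bits = delta."""
    s = (H.dot(bits) % 2) ^ delta
    return s[0] * 4 + s[1] * 2 + s[2]

def hk_embed_block(x, delta):
    """Embed 6 secret bits into a 7-pixel block (k = 3) following Steps 1-3."""
    x = x.copy()
    # Steps 1-2: the +-1 of Step 2 flips only the 1st LSB, so it changes the
    # XOR codeword C1 at that position without disturbing the 2nd LSBs (C2).
    c1 = ((x >> 1) & 1) ^ (x & 1)
    p = pos(c1, delta[:3])
    if p:
        x[p - 1] += 1 if x[p - 1] & 1 == 0 else -1
    # Step 3: the -+1 here flips the 2nd LSB and the 1st LSB together, so the
    # XOR value (and hence the first 3 embedded bits) is preserved.
    p = pos((x >> 1) & 1, delta[3:])
    if p:
        x[p - 1] += -1 if x[p - 1] & 1 == 0 else 1
    return x

def hk_extract_block(x):
    return np.concatenate([H.dot(((x >> 1) & 1) ^ (x & 1)) % 2,
                           H.dot((x >> 1) & 1) % 2])

x = np.array([244, 131, 97, 160, 58, 77, 200])
delta = np.array([1, 0, 1, 0, 1, 1])
assert np.array_equal(hk_extract_block(hk_embed_block(x, delta)), delta)
```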

3 The proposed Hamming+k scheme with m overlapped pixels

In Kim and Yang's DH using HC(7, 4), one modified bit among \((c_{1}^{\prime },c_{2}^{\prime },c_{3}^{\prime })\) causes a squared error of 2^2 = 4, because these three bits are the 2nd LSBs of the pixels (x5, x6, x7). This seriously degrades the stego image quality. To solve this problem, we extend the pixel overlapping approach and adopt OPAP and LSB simultaneously to reduce the average mean square error. The proposed Hk_mDH is an extension of Hk DH that uses m overlapped pixels, where 0 ≤ m ≤ 2^k − 1; our scheme reduces to HDH for m = 0 and to Hk DH for m = 2^k − 1.

3.1 Design concept

We introduce the design concept of the proposed Hk_mDH using COV(1, 2^k − 1, k). Hk_mDH embeds k secret bits (\({\delta _{1}^{k}}\)) into the first (2^k − 1) pixels using COV(1, 2^k − 1, k) and then embeds the other k secret bits (\(\delta _{k + 1}^{2k}\)) into the last (2^k − 1) pixels. One block has size 2^{k+1} − m − 2 and consists of the pixels (\(x_{1}, \dots , x_{2^{k + 1}-m-2}\)) with m overlapped pixels. With the pixel overlapping approach, one stego block is divided into two (2^k − 1)-pixel subblocks: the first (2^k − 1) pixels are (\(x_{1},\dots ,x_{2^{k}-1}\)), and the last (2^k − 1) pixels are (\(x_{2^{k} - m}, \dots , x_{2^{k + 1}-m-2}\)). As shown in (9) and (10), the secret bits (\({\delta _{1}^{k}}\)) and (\(\delta _{k + 1}^{2k}\)) are embedded in the block, respectively.

$$ \left\{\begin{array}{llllll} (\delta_{1},\dots,\delta_{k})^{T} = H \cdot (b(x_{1}),\dots,b(x_{2^{k}-m-1}),\\ b(x_{2^{k}-m}) \oplus b\left( \left\lfloor \frac{x_{2^{k}-m}}{2} \right\rfloor \right),\dots, b(x_{2^{k}-1}) \oplus b \left( \left\lfloor \frac{x_{2^{k}-1}}{2} \right\rfloor \right) ) \end{array}\right. $$
(9)
$$ \left\{\begin{array}{llllll} (\delta_{k + 1},\dots,\delta_{2k})^{T} = H \cdot (b \left( \left\lfloor \frac{x_{2^{k}-m}}{2}\right\rfloor \right), \dots, b\left( \left\lfloor \frac{x_{2^{k}-1}}{2}\right\rfloor \right), \\ b(x_{2^{k}}), \dots, b(x_{2^{k+1}-m-2})) \end{array}\right. $$
(10)

Here, the design concept (Fig. 2) of Hk_mDH is briefly described using (9) and (10). For one cover block, the first k secret bits are embedded into the 7 bits (\({c_{1}^{7}}\)) obtained from the right-hand side (RHS) of (9), and the other k secret bits are embedded into the other 7 bits (\({c_{1}^{7}}{'}\)) obtained from the RHS of (10) (here 7 = 2^k − 1 for k = 3). The overlapped pixels contribute the XOR of their 1st and 2nd LSBs in (9) and their 2nd LSBs in (10), respectively.
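Under these definitions, the two codewords of (9) and (10) can be assembled for general k and m as in the sketch below (an illustrative helper of ours, not part of the paper):

```python
def hk_m_codewords(x, k, m):
    """Build the two (2**k - 1)-bit vectors of (9) and (10) from one block
    x[0 .. 2**(k+1) - m - 3], i.e. the pixels x1 .. x_{2^(k+1)-m-2}."""
    n = 2 ** k - 1
    lsb  = lambda v: v & 1          # b(v)
    lsb2 = lambda v: (v >> 1) & 1   # b(floor(v/2))
    # (9): plain LSBs, then LSB xor 2nd-LSB on the m overlapped pixels
    p = [lsb(x[i]) for i in range(n - m)] + \
        [lsb(x[i]) ^ lsb2(x[i]) for i in range(n - m, n)]
    # (10): 2nd LSBs of the overlapped pixels, then plain LSBs of the rest
    q = [lsb2(x[i]) for i in range(n - m, n)] + \
        [lsb(x[i]) for i in range(n, 2 * n - m)]
    return p, q
```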

Fig. 2 A conceptual diagram of the proposed Hk_mDH, m = 3

The proposed method applies a different procedure in each of the following three cases:

Case 1

When S1 and S2 (from (9) and (10)) indicate different positions and neither falls on an overlapped pixel, the two corrections apply only to 1st LSBs of the block. In this case, OPAP is not necessary (see Fig. 3a).

Fig. 3 Sketchy diagram for the proposed concept (assuming m = 3)

Case 2

When S1 or S2 falls in the overlapped positions {5, 6, 7} while S1 ≠ S2, the optimization by OPAP and LSB is required (see Fig. 3b).

Case 3

When S1 = S2, perform xi ± 1, where i (= 8,…,11) is randomly selected (see Fig. 3c). The recomputed S2 then points to a non-overlapped pixel; afterward, apply HC to the pixel indicated by S2.

A collision of the two syndromes (i.e., S1 = S2) would otherwise lead to a squared error of 2^2 = 4; the procedure of Case 3 removes the cause of this error.

For example, given a pixel x with grayscale value 244 (11110100), the XOR of its 1st and 2nd LSBs is 0 ⊕ 0 = 0. If x is changed to x − 1 = 243 (11110011), both the 1st and 2nd LSBs flip from "0" to "1", but the XOR of the 1st and 2nd LSBs is unchanged, because 0 ⊕ 0 = 1 ⊕ 1. Thus, the change in the pixel value is only 1. When the changed positions in (9) and (10) fall on the same overlapped pixel, we move the modification in (10) to two other positions so that the mean square error is reduced.
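This property holds for every interior grayscale value, as the following one-line check confirms (an illustrative verification, not from the paper):

```python
# For every interior grayscale value v, the +-1 chosen so that the 2nd LSB flips
# (v - 1 if v is even, v + 1 if v is odd) never changes LSB(v) xor 2ndLSB(v),
# so a correction for (10) leaves the bits embedded via (9) intact.
for v in range(1, 255):
    w = v - 1 if v % 2 == 0 else v + 1
    assert ((v & 1) ^ ((v >> 1) & 1)) == ((w & 1) ^ ((w >> 1) & 1))
```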

Example 1

Consider embedding the first k secret bits by (9) and modifying the LSB of one overlapped pixel from x = 10 = (00001010)2 to x = 11 = (00001011)2. If the modification position in (10) is the same pixel, OPAP yields x = 12 = (00001100)2. The difference from the original value 10 is then 2 and the squared error is 2^2 = 4. If instead the modification is moved to two other positions, the squared error is reduced from 2^2 = 4 to (1^2 + 1^2) = 2. Through this optimization, many of the errors that would otherwise occur when hiding messages with Hk_mDH can be avoided.

In the final analysis, modifying two 1st LSBs theoretically yields a better PSNR than modifying one 2nd LSB. Assume that the k-tuple secret δ is to be embedded as δ = y′H^T, where y′ is obtained from the (2^k − 1)-tuple y (with S = yH^T) by changing one bit (or no bit). If the required syndrome change is decomposed as S′ = S1′ ⊕ S2′, then S′ = (y1H^T) ⊕ (y2H^T), where y1 and y2 are two (2^k − 1)-tuples each obtained by changing one bit of y. Hence, we may embed the k bits by changing two positions in one block.

3.2 Embedding procedure

For brevity, we assume k = 3 (i.e., the COV(1, 7, 3) covering function) and m = 3 for the proposed Hk_mDH, which embeds 6 secret bits into 11 cover pixels. The extraction procedure is simply the reverse of the embedding procedure, so we omit it. The detailed embedding procedure is as follows.

Step 1::

Read the next block of (2^{k+1} − m − 2) pixels x from the cover object and generate a codeword \(p \in \mathbb {F}_{2}^{2^{k}-1}\) by (11).

$$ p = \left( b(x_{1}),\dots,b(x_{4}),\left( b(x_{5}) \oplus b\left( \left\lfloor \frac{x_{5}}{2}\right\rfloor \right) \right),\dots, \left( b(x_{7}) \oplus b\left( \left\lfloor \frac{x_{7}}{2}\right\rfloor \right) \right) \right) $$
(11)
Step 2::

Calculate the syndrome S1 = HpT and \(S_{1}^{\prime }=S_{1} \oplus ({\delta _{1}^{3}})\).

Step 3::

If \(1 \le S_{1}^{\prime } \le 7-m\), modify \(x_{S_{1}^{\prime }}\) by ±1 (an LSB flip). Else, if \(S_{1}^{\prime } \in \{7-m + 1,\dots ,7\}\): when the two LSBs of \(x_{S_{1}^{\prime }}\) are [00] or [10], set \(x_{S_{1}^{\prime }}+ 1\); when they are [01] or [11], set \(x_{S_{1}^{\prime }}-1\). (If \(S_{1}^{\prime }= 0\), no change is needed.)

Step 4::

Generate another codeword, \(q \in \mathbb {F}_{2}^{2^{k}-1}\), by (12).

$$ q = \left( b\left( \left\lfloor\frac{x_{5}}{2} \right\rfloor \right),\dots, b\left( \left\lfloor \frac{x_{7}}{2}\right\rfloor \right), b(x_{8}),\dots,b(x_{11}) \right) $$
(12)
Step 5::

Calculate the syndrome S2 = HqT and \(S_{2}^{\prime } = S_{2} \oplus ({\delta _{4}^{6}})\).

Step 6::

If \(S_{1}^{\prime }=S_{2}^{\prime }\) (both corrections fall on the same overlapped pixel), modify \(x_{i}\) by ±1 for one randomly chosen i ∈ {8,…,11}, recompute \(S_{2}^{\prime }\), and flip the LSB of the non-overlapped pixel it then indicates. Else, if \(S_{1}^{\prime } \neq S_{2}^{\prime }\): if \(S_{2}^{\prime } \ge 8\) (taking \(S_{2}^{\prime }\) as a pixel position within the block), modify \(x_{S_{2}^{\prime }}\) by ±1; else if \(S_{2}^{\prime } \in \{7-m + 1,\dots ,7\}\), when the two LSBs of \(x_{S_{2}^{\prime }}\) are [00] or [10], set \(x_{S_{2}^{\prime }}-1\), and when they are [01] or [11], set \(x_{S_{2}^{\prime }}+ 1\).

Step 7::

Repeat from Step 1 until the end of the cover image (or of the message) is reached.
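The whole procedure for k = 3 and m = 3 is summarized in the following Python sketch (illustrative only, not the authors' implementation; it assumes interior pixel values so the ±1 operations stay within [0, 255], and the ±1 of Steps 3 and 6 on LSB-only positions is realized as an LSB flip, which is the same operation):

```python
import numpy as np

H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def pos(bits, delta):
    s = (H.dot(bits) % 2) ^ delta
    return s[0] * 4 + s[1] * 2 + s[2]           # 0 means "already satisfied"

def q_of(x):                                     # codeword of (12)
    return np.concatenate([(x[4:7] >> 1) & 1, x[7:11] & 1])

def embed_block(x, delta, rng=None):
    """Embed 6 secret bits into an 11-pixel block (k = 3, m = 3), Steps 1-6."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    # Steps 1-3: first 3 bits via p of (11); x ^ 1 is the +-1 that flips
    # only the 1st LSB, so q of (12) is not disturbed.
    p = np.concatenate([x[:4] & 1, (x[4:7] & 1) ^ ((x[4:7] >> 1) & 1)])
    j1 = pos(p, delta[:3])
    if j1:
        x[j1 - 1] ^= 1
    # Steps 4-6: last 3 bits via q of (12), built from the updated pixels.
    j2 = pos(q_of(x), delta[3:])                 # codeword position 1..7 (pixel j2 + 4)
    if j2 and j2 + 4 == j1:                      # collision on the same overlapped pixel
        i = int(rng.integers(8, 12))             # random non-overlapped pixel x8..x11
        x[i - 1] ^= 1
        j2 = pos(q_of(x), delta[3:])             # now indicates a non-overlapped pixel
        x[j2 + 4 - 1] ^= 1
    elif j2 >= 4:                                # non-overlapped pixel x8..x11: LSB flip
        x[j2 + 4 - 1] ^= 1
    elif j2 >= 1:                                # overlapped pixel x5..x7: flip the 2nd LSB;
        x[j2 + 4 - 1] += -1 if x[j2 + 4 - 1] & 1 == 0 else 1   # the XOR in p is preserved
    return x

def extract_block(x):
    p = np.concatenate([x[:4] & 1, (x[4:7] & 1) ^ ((x[4:7] >> 1) & 1)])
    return np.concatenate([H.dot(p) % 2, H.dot(q_of(x)) % 2])

x = np.array([160, 158, 97, 52, 244, 77, 200, 31, 66, 129, 180])
delta = np.array([0, 1, 1, 1, 1, 0])
y = embed_block(x, delta)
assert np.array_equal(extract_block(y), delta)
assert np.max(np.abs(y - x)) <= 1                # only +-1 changes in this example
```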

3.3 Theoretical estimation of average mean square error

In HDH, i ∈ {0, 1} LSBs need to be modified per block of 2^k − 1 pixels, with probability \(\binom {2^{k}-1}{i}/2^{k}\). Obviously, the ER of HDH is φHDH = k/(2^k − 1), and the average mean square error of HDH, AMSEHDH, is derived in (13). For k = 3, we have φHDH = 3/7 = 0.428 and AMSEHDH = 1/8 = 0.125.

$$ AMSE_{H} = \left( 0^{2} \times \binom{2^{k}-1}{0}/2^{k} + 1^{2} \times \binom{2^{k}-1}{1}/2^{k} \right) / (2^{k}-1) = \frac{1}{2^{k}} $$
(13)

The H1DH [18], an extension of HDH, embeds (k + 1) secret bits into 2^k pixels. Thus, the ER is φH1DH = (k + 1)/2^k. The AMSEH1DH is given in (14) [18]. For k = 3, we have φH1DH = 4/8 = 0.5 and AMSEH1DH = 1/8 − 1/128 ≈ 0.117.

$$ AMSE_{H1}=\left\{\begin{array}{llllll} (0^{2} \times \binom{2^{k}-1}{0}/2^{k} + 1^{2} \times \left( \binom{2^{k}-1}{0}/2^{k} \times 1/2 \right) \\ + 1^{2} \times \binom{2^{k}-1}{1}/2^{k})/2^{k} = \frac{1}{2^{k}} - \frac{1}{2^{2k + 1}} \end{array}\right. $$
(14)

From the encoding algorithm of Hk DH, the event \(S_{1}^{\prime } \neq S_{2}^{\prime }\) covers three cases: both (6) and (8) are already satisfied (no change is needed), with probability 1/2^{2k}; exactly one of the two is satisfied, each such case with probability (2^k − 1)/2^{2k}; and neither is satisfied but the changed pixels are at different positions, with probability (2^k − 1)(2^k − 2)/2^{2k}. The remaining case, in which neither is satisfied and the changed pixels coincide (i.e., \(S_{1}^{\prime }=S_{2}^{\prime }\)), has probability (2^k − 1)/2^{2k}. Therefore, AMSEHk is derived directly in (15).

$$ \left\{\begin{array}{llllll} AMSE_{Hk}= \Big( 0^{2} \times \frac{1}{2^{2k}} + 1^{2} \times \frac{2^{k}-1}{2^{2k}} + 1^{2} \times \frac{2^{k}-1}{2^{2k}} \\ \quad +\, (1^{2}+ 1^{2}) \times \frac{(2^{k}-1)(2^{k}-2)}{2^{2k}} \ (S_{1}^{\prime} \neq S_{2}^{\prime}) \ +\, (1^{2}+ 1^{2}+ 1^{2}) \times \frac{2^{k}-1}{2^{2k}} \ (S_{1}^{\prime} = S_{2}^{\prime}) \Big) \Big/ (2^{k}-1) \\ = \frac{2^{k + 1}+ 1}{2^{2k}}= \frac{1}{2^{k-1}} + \frac{1}{2^{2k}} \end{array}\right. $$
(15)

For the proposed Hk_mDH, by using LSB and OPAP, we embed the 2k secret bits into (2^{k+1} − 2 − m) pixels with only two modifications (each changing a grayscale value by one, like flipping two LSBs) with large probability, and never with more than three modifications.

The value of \(\varphi _{Hk\_mDH}\) follows easily from (16). As for AMSE\(_{Hk\_mDH}\): if both (9) and (10) are already satisfied, the contribution is (0^2 × 1/2^{2k}), while it is (1^2 × (2^k − 1)/2^{2k} + 1^2 × (2^k − 1)/2^{2k}) when exactly one of (9) or (10) is satisfied. Next consider the cases with one change for (9). The contribution is (1^2 + 1^2) × (2^k − 1)/2^k × (((2^k − 1 − m)/(2^k − 1)) × (2^k − 1)/2^k) when that change falls in the first (2^k − 1 − m) bits, and (1^2 + 1^2) × (2^k − 1)/2^k × ((m/(2^k − 1)) × (2^k − 2)/2^k) when it falls in the last m bits. Finally, for the case of two changes in the last m bits for (9), the contribution is (1^2 + 1^2 + 1^2) × (2^k − 1)/2^k × ((m/(2^k − 1)) × 1/2^k). Summing, the average AMSE\(_{Hk\_mDH}\) is derived in (17). For k = 3 and m = 3, we have \(\varphi _{Hk\_mDH} = 6/(16-2-3) = 0.545\) and AMSE\(_{Hk\_mDH} = 0.256\).

$$ \left\{\begin{array}{llllll} \varphi_{Hk\_mDH} = 2k/((2^{k}-1)+(2^{k}-1)-m) \\ = 2k/(2^{k + 1}-2-m) \end{array}\right. $$
(16)
$$ \left\{\begin{array}{llll} AMSE_{Hk\_m} = \{(0^{2} \times 1/2^{2k} + 1^{2} \times (2^{k} - 1) / 2^{2k} + 1^{2} \times (2^{k} - 1)/2^{2k})\\ \text{(note: both equations are satisfied, or any one equation satisfied)}\\ +(1^{2} + 1^{2}) \times (2^{k} - 1)/2^{k} \times\\ \left( \overbrace{\left( ((2^{k} - 1-m )/(2^{k} - 1)) \times (2^{k} - 1)/2^{k} \right)}^{\text{one change in the first \((2^{k} - 1-m)\) bits for (9)}}+ \overbrace{\left( (m/(2^{k} - 1)) \times (2^{k} - 2)/2^{k} \right)}^{\text{one change in the last \textit{m} bits for (9)}} \right)\\ + (1^{2} + 1^{2} + 1^{2}) \times (2^{k} - 1)/2^{k} \times \overbrace{\left( (m/(2^{k} - 1)) \times 1/2^{k} \right)}^{\text{two changes in the last \textit{m} bits for (9)}} \} / (2^{k} - 1)\\ = \{ (2 \times (2^{k} - 1)/2^{2k}) + (2 \times ((2^{k} - 1)^{2} - m)/2^{2k}) + (3 \times m / 2^{2k}) \} /(2^{k} - 1)\\ = ({2^{k + 1} + m/ (2^{k} - 1)} ) / ({2^{2k}} ) = \frac{1}{2^{k-1}} + \frac{m/(2^{k} - 1)}{2^{2k}} \end{array}\right. $$
(17)
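The closed forms (13)-(17) are easy to check numerically; the following snippet reproduces the values quoted in the text for k = 3 (an illustrative check):

```python
# Numeric check of (13)-(17) for k = 3 (values quoted in the text).
k, m = 3, 3
n = 2 ** k - 1
amse_hdh  = 1 / 2 ** k                                    # (13): 0.125
amse_h1dh = 1 / 2 ** k - 1 / 2 ** (2 * k + 1)             # (14): 0.1171875 ~ 0.117
amse_hk   = 1 / 2 ** (k - 1) + 1 / 2 ** (2 * k)           # (15): 0.265625
er_hk_m   = 2 * k / (2 ** (k + 1) - 2 - m)                # (16): 6/11 ~ 0.545
amse_hk_m = 1 / 2 ** (k - 1) + (m / n) / 2 ** (2 * k)     # (17): ~0.2567 ~ 0.256
print(amse_hdh, amse_h1dh, amse_hk, er_hk_m, amse_hk_m)
```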

4 Experiment and comparison

4.1 Experimental results

We have shown analytically that Hk_mDH improves PSNR by reducing the AMSE while retaining the embedding payload. To verify this, we compare Hk_mDH with previous schemes. To evaluate performance, we measure the stego image quality of HDH (k = 3), H1DH (k = 3), HCIH (n = 7), Kim and Yang's DH (k = 3 with 3 overlapped pixels), Hk DH, and the proposed Hk_mDH (k = 3, m = 3) in MATLAB using 512 × 512 grayscale cover images [7] (see Fig. 4).

Fig. 4 Original images used in the experiment

Generally, MSE (see (18)) and PSNR (see (19)) are widely used to measure image quality. The MSE measures the average squared intensity difference between a distorted image and the reference image; an image with a small MSE has relatively good visual quality. The PSNR describes the ratio of the maximum possible power of a signal to the power of the corrupting noise, and is a convenient objective measure of image quality. The MSE measures the energy of the error signal using the L2 norm, which is energy preserving. The MSE between two images x and y is

$$ MSE(x,y) = \frac{1}{N} \sum\limits_{i = 1}^{N} (x_{i} - y_{i})^{2} $$
(18)

The error signal, ei = xi − yi, is the difference between the original and the distorted signal. The MSE is usually expressed through the PSNR measure

$$ PSNR = 10 log_{10} \frac{L^{2}}{MSE}, $$
(19)

where L is the dynamic range of allowable pixel intensities; for example, for an 8-bit image, L = 2^8 − 1 = 255. Capacity refers to how much information can be hidden in the cover image and is an important criterion for DH. For this reason, we seek algorithms that increase capacity without degrading the stego image. Equation (20) gives the ratio φ of the number of message bits (||d||) to the total number of pixels.

$$ \varphi = \frac{||d||}{N \times N} $$
(20)
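For reference, (18)-(20) translate directly into code (an illustrative sketch):

```python
import numpy as np

def mse(cover, stego):
    """Mean square error between two images, eq. (18)."""
    return np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2)

def psnr(cover, stego, L=255):
    """Peak signal-to-noise ratio in dB, eq. (19)."""
    m = mse(cover, stego)
    return float('inf') if m == 0 else 10 * np.log10(L ** 2 / m)

def embedding_rate(num_message_bits, N):
    """Embedding rate phi for an N x N cover image, eq. (20)."""
    return num_message_bits / (N * N)

# The theoretical PSNR implied by an AMSE value follows the same formula,
# e.g. 10*np.log10(255**2 / 0.265) for the AMSE of Section 4.3.
```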

Table 1 compares HDH, H1DH, HCIH, Kim and Yang's DH, Hk DH, and Hk_mDH, all of which are DH methods based on HC. Their ERs are 0.43, 0.5, 0.75, 0.54, 0.86, and 0.54, respectively. The ER of HCIH is higher than those of HDH, H1DH, and Kim and Yang's DH. The maximum ER of Hk DH and of the proposed Hk_mDH is the same. Hk_mDH achieves good PSNR even though its ER is higher than those of HDH, H1DH, and Kim and Yang's DH.

Table 1 Performance comparison of all Hamming-like DHs

H1DH is the best in terms of quality, while Hk DH and Hk_mDH achieve the highest ER while maintaining good PSNR. In particular, Hk_mDH can trade PSNR for ER through the variable m; the two quantities are inversely related. That is, in the proposed Hk_mDH we can trade AMSEHk_mDH for φHk_mDH by selecting m. For example, we may have φHk_mDH = 0.666 and AMSEHk_mDH = 0.261 for k = 3 and m = 5.

Figure 5 compares the original and the stego images: (a) original, (b) HDH, (c) H1DH, (d) HCIH, (e) Kim and Yang's DH, (f) Hk DH, and (g) Hk_mDH. The PSNR of H1DH is the highest in this experiment, and the PSNR difference between H1DH and Hk_mDH is only about 1.43 dB; that is, the additional error is slight.

Fig. 5 Comparison of Lena images derived from various schemes: a original, b HDH (57.16 dB), c H1DH (57.44 dB), d HCIH (53.43 dB), e Kim and Yang's DH (53.95 dB), f Hk DH (53.63 dB), and g Hk_mDH (m = 3; 56.01 dB)

Table 2 shows performance comparisons for the control variable m (m = 1, 3, 5, and 7), for which the proposed Hk_mDH has ERs of 0.46, 0.54, 0.67, and 0.86, respectively. ER and PSNR can be adjusted by increasing or decreasing m, so its value may be chosen according to the user's needs; a larger m allows a larger message to be embedded.

Table 2 Performance comparisons for various m variations

Figure 6 compares stego images derived from Hk_mDH for m = 1,…,7. Since the stego images are produced by the optimized DH, it is not easy to distinguish them from the original with the human visual system.

Fig. 6 PSNRs of Lena images when m = 1, 3, 5, 7: a m = 1 (56.75 dB), b m = 3 (55.87 dB), c m = 5 (54.86 dB), d m = 7 (53.64 dB)

In Table 3, we measured PSNRs under the same condition (embedding capacity: 90000 random bits) for the Hamming-like schemes HDH, H1DH, HCIH, Kim and Yang's DH, Hk DH, and Hk_mDH. Although the embedding capacity is the same for the same cover images, the PSNRs of the stego images differ because the embedding algorithms differ. In this experiment, Hk_mDH (with COV(1, 15, 4)) had higher PSNRs than HDH, HCIH, Kim and Yang's DH, and Hk DH, while the PSNR of H1DH was slightly higher than that of Hk_mDH.

Table 3 Performance comparison of proposed scheme and previous schemes (using 90000 embedding bits)

4.2 Steganalysis

Fridrich et al. [5] proposed a steganalysis method based on statistical analysis for detecting stego images. The method defines two mappings: F1 with 0 ⇔ 1, 2 ⇔ 3,…, 254 ⇔ 255 and F−1 with −1 ⇔ 0, 1 ⇔ 2,…, 255 ⇔ 256. When the LSB of the cover image differs from the hidden bit, F1 is applied. The smoothness of a group (x1, x2,…, xn) is measured as follows:

$$ f(x_{1},x_{2},\dots,x_{n}) = \sum\limits_{i = 1}^{n-1} |x_{i + 1} -x_{i}| $$
(21)

Rm denotes the ratio of blocks whose f value increases when F1 is applied to a part of each block, and Sm the ratio of blocks whose f value decreases. R−m and S−m denote the corresponding ratios when F−1 is applied to a part of each block. If the cover image carries no embedded secret data, F1 and F−1 increase the f value of the blocks about equally, i.e., Rm ≅ R−m and Sm ≅ S−m. When secret bits are embedded, the difference between Rm and Sm decreases whereas the difference between R−m and S−m increases, i.e., R−m − S−m > Rm − Sm. Figure 7 shows the RS analysis results for the stego Lena image; the x-axis denotes the ER and the y-axis the rate of regular and singular pixel groups, with masks m = [0 1 1 0] and −m = [0 −1 −1 0]. Figure 7a shows that when LSB replacement is used naively, the larger the payload, the greater the separation between (Rm, Sm) and (R−m, S−m), whereas Fig. 7b shows that Rm ≅ R−m and Sm ≅ S−m continue to hold as the payload increases. Therefore, the proposed scheme is secure against the RS detection attack.
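The RS statistics described above can be sketched as follows (our own illustrative helper, assuming 4-pixel groups to match the mask m = [0 1 1 0]; this is not the implementation of [5]):

```python
import numpy as np

def f(group):
    """Smoothness of a pixel group, eq. (21)."""
    return np.sum(np.abs(np.diff(group)))

def F1(v):                      # 0<->1, 2<->3, ..., 254<->255
    return v ^ 1

def Fm1(v):                     # -1<->0, 1<->2, ..., 255<->256
    return ((v + 1) ^ 1) - 1

def rs_statistics(img, mask=(0, 1, 1, 0)):
    """Fractions (Rm, Sm, R-m, S-m) of regular/singular 4-pixel groups."""
    px = img.astype(int).flatten()
    groups = px[: px.size // 4 * 4].reshape(-1, 4)
    mask = np.array(mask, dtype=bool)
    R = S = Rn = Sn = 0
    for g in groups:
        base = f(g)
        g1, gm1 = g.copy(), g.copy()
        g1[mask] = F1(g1[mask])          # apply F1 on the masked positions
        gm1[mask] = Fm1(gm1[mask])       # apply F-1 on the masked positions
        R += f(g1) > base
        S += f(g1) < base
        Rn += f(gm1) > base
        Sn += f(gm1) < base
    n = len(groups)
    return R / n, S / n, Rn / n, Sn / n
```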

Fig. 7 Comparison of RS-diagrams between cover images produced by simple LSB-embedding DH and Hk_mDH

4.3 Performance evaluation

In the proposed Hk_mDH, m denotes the number of overlapped pixels, and a higher ER is obtained by increasing m. For example, our Hk_mDH with the COV(1, 7, 3) covering function has φHk_mDH = 6/13 ≈ 0.46 and AMSEHk_mDH = 0.252 for m = 1.

On the other hand, we obtain φHk_mDH = 6/7 and AMSEHk_mDH = 0.265 for m = 7. Therefore, Hk_mDH can control the stego image quality via the value of m. If a very high ER is desired, Hk_mDH using COV(1, 7, 3) with m = 7 offers a capacity of 6/7; even in this case, the AMSE of 0.265 corresponds to a PSNR of approximately 53.8 dB.

In addition, the AMSE can be controlled by the choice of k, with a larger k decreasing the AMSE. Table 4 lists the ERs φHk_mDH and average mean square errors AMSEHk_mDH of the proposed Hk_mDH for various values of m.

Table 4 Comparison of φ and AMSE for various values of m for the proposed Hk_mDH
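The trade-off shown in Table 4 can be reproduced directly from (16) and (17), for example (illustrative):

```python
# ER and AMSE of Hk_mDH versus m for k = 3, from (16) and (17).
k = 3
for m in range(0, 2 ** k):
    er = 2 * k / (2 ** (k + 1) - 2 - m)
    amse = 1 / 2 ** (k - 1) + (m / (2 ** k - 1)) / 2 ** (2 * k)
    print(f"m={m}: ER={er:.3f}, AMSE={amse:.4f}")
```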

Figure 8 shows the histograms of the original Lena image and of the Hk_mDH stego image. The bar chart and the line plot are the histograms of the original image and of its modified version using Hk_mDH (m = 3, COV(1, 15, 4), and φ = 0.3), respectively. Only the frequencies of the pixel values between 50 and 100 are shown, to highlight the differences clearly. As Fig. 8 shows, it is not easy to find clues about the hidden secret message from the stego image alone; a blind test would hardly detect such differences.

Fig. 8 Histogram of original Lena image (bars) and image modified using Hk_mDH (line plot)

5 Conclusion

In this paper, we discussed matrix coding based data hiding (MCDH) methods. Methods based on MCDH are generally superior in terms of embedding efficiency, but their ER is lower than that of the simple LSB method. To address this, we proposed a way to optimize both ER and image quality using OPAP and LSB. The proposed Hk_mDH is designed to trade ER against stego image quality through the value of m, which allows it to be tuned for a specific application such as secure communication: m can be increased when the amount of concealed data is large and decreased when a high PSNR is required. For example, if m is reduced to its minimum, the scheme suits secure communication applications demanding high imperceptibility. In future work, we will further improve both the embedding capacity and the PSNR of the cover image by upgrading Hk_mDH.