1 Introduction

Nowadays, digital documents often are transmitted on the internet network, but this is an unsafe medium, so these documents are easily tampered with by a hacker. So, protecting the integrity and the copyright of the data is an urgent problem in the field of information security. Two main approaches in this field are encryption and data hiding. Encryption approaches generate ciphertexts consisting of a series of meaningless symbols that are curious to the hackers. The data hiding methods overcome this drawback because the image containing data and the original image are difficult to distinguish. But when images containing secret data are sent on the internet, hackers can attack and modify the content of images, and the receivers can extract false information. So protecting the integrity of the image becomes very important. Image authentication is a technique to protect the integrity of images from being illegally modified. The existing image authentication methods can be divided into two categories, that are digital signature-based methods [1, 2] and fragile watermarking-based methods [3, 4]. For the digital signature methods, a signature of an image is obtained using a hash function, and the hashed result is then encrypted and stored in a trusted third party. To authenticate an image, the signature is obtained from the image and compared with the signature stored in the third party. In fragile watermarking methods, the authentication information used as watermarks embedded into the host image to obtain the watermarking image. In general, the authentication information is the feature generated from the host image or a random bit-stream obtained by a pseudo-random number generator. Because the embedded data is fragile if the watermarking image has attacked then the embedded watermarks are likely different from the extracted ones and thus the tampered regions can be defined.

The image authentication methods can be divided into two classes: irreversible methods [5,6,7,8,9,10] and reversible methods[11,12,13,14,15]. In the irreversible image authentication methods, the lossy data hiding techniques [16,17,18,19] often are used for embedding watermarks. Because lossy data hiding techniques often provide a large embedding capacity, the irreversible image authentication methods have a good detection ability while maintaining high watermarking image quality. However, in the lossy data hiding techniques, the original image cannot be restored exactly from a watermarking image. Therefore, the irreversible image authentication methods are not suitable for applications where the original image is demanded such as medical and military images, or for the image carrying secret data mentioned above.

In these cases, we can use the reversible image authentication (RIA) methods, they not only detect the tampered regions but also can restore the original image from the watermarking image. Reversible data hiding (RDH) techniques have been extensively developed in the past decades. The difference expansion (DE) [20] and histogram shifting (HS) [21] are the two most important techniques used in RDH techniques. DE method is the first remarkable RDH method, in which the difference of each pixel pair is expanded to the left enough for embedding one bit on the right. The method has been improved and developed by many researchers to enlarge its embedding capacity such as the methods based on pixel blocks [22,23,24], the methods for reducing the location map [25,26,27,28], and especially the prediction error expansion (PEE) methods [29,30,31,32]. In general, DE-based RDH methods provide a high embedding capacity, but stego image quality is still low.

Ni et al. [21] proposed another important RDH method called histogram shifting (HS). First, the histogram of pixels is generated by using a statistical method. Then the bins at which the histogram reaches the largest values are selected to embed the data. In the HS method, each pixel must be modified at most one unit, so its stego image quality is very high, however, the embedding capacity is not very large. To overcome this drawback, prediction error histogram shifting (PEHS) is proposed in [33,34,35,36,37], in which prediction errors of pixels are calculated, and then a histogram of errors is shifted. Recently, Li et al. [38] proposed a new RDH method based on pixel value ordering (PVO), in which firstly the host image is divided into non-overlapped blocks, and the pixels of each block are sorted in ascending order. Then the largest pixel is predicted by the second-largest pixel and the smallest pixel is predicted by the second-smallest pixel. Next, one bit is embedded in the largest pixel (or the smallest pixel) if the prediction error is 1 (or -1), respectively. The advantage of this method is that it has very high stego image quality, but its embedding capacity is still low. The PVO method has attracted the attention of many researchers and recently there have been many improvements to enlarge the embedding capacity as in [39,40,41,42,43,44,45].

Since the RIA methods often detect the tampered regions of an image in a block-wise manner, the block-based RDH techniques are therefore suitable to be used in the RIA methods. Lo and Hu [12] proposed an RIA method in which a sequence of authentication codes (AC) is generated by a random seed. Each \(4\times 4\) pixel block is used to embed an AC bit using a PEHS technique. Later, Nguyen et al. [13] proposed an RIA method based on adaptive prediction-error expansion (PEE).

Yin et al. [15] also proposed an RIA method in which the pixels of an original image are visited by the Hilbert curve and divided into non-overlapped blocks of size \(4\times 4\). Since four consecutive pixels in the Hilbert curve are always within a \(2\times 2\) sub-block, the Yin method essentially embeds the authentication bits on the \(2\times 2\) blocks using the IPVO technique. This method used two thresholds to classify blocks into three types: flat, normal, and rough. At most eight bits and four bits are embedded in a flat and a normal block, respectively. Whereas rough blocks are skipped, and no bit is embedded in them. Nguyen and Vo [14] used an idea similar to in [13] to create an RIA method in which an original image is divided into non-overlapped blocks with the size of \(3\times 3\). The pixel in the centre of the block is selected as a prediction value of pixels on the left and right sides. This method has the same stego image quality as in [13], but an embedding capacity smaller.

All four above methods used a pseudo-random bit sequence as authentication information. So the validation content will remain the same even if the pixel values have been changed. This is one reason for the inability to detect attacked blocks. Moreover, the authentication bits only are embedded in embeddable blocks. That means non-embeddable blocks will not be protected. Hong et al. [11] proposed an RIA method in which, the above two disadvantages have been overcome. This method used the same embedding algorithm as in [15], namely an original image is divided into non-overlapped blocks with the size \(4\times 4\), then at most eight bits are embedded in four \(2\times 2\) sub-blocks by the IPVO technique. Authentication code (AC) bits of a block are generated by using the hash function MD5 of the block’s pixel values and location information. The non-embeddable blocks are not ignored. In each such block one AC bit is embedded by the LSB replacement technique. Due to these improvements, the method of Hong et al. has a higher ability to detect and locate tampered blocks than the aforementioned methods.

In addition to the RDH methods, dual-image RDH techniques have also been proposed, that generate two stego images after embedding data bits into a single host image. Because of using two stego images, the dual-image RDH methods have a larger embedding capacity and higher security than the RDH methods. Without two stego images being simultaneously obtained, hackers can’t extract the complete secret data. A dual-image RDH scheme was first proposed by Chang et al. [46], in which a \(256\times 256\) modulus function matrix is used. Qin et al. [47] used Exploiting Modification Direction (EMD) to construct a dual-image RDH method in which four bits are embedded in a pixel pair. Lu et al. [48] applied the LBS matching [17] to enhance stego image quality in which four bits are embedded in a pixel pair (xy). Lee and Huang [49] proposed a dual image RDH method using 25 orientation combinations to embed more than 4 bits in a pair of two pixels. Lu et al. [50] proposed a dual image RDH method by using the center folding strategy (CFS) to reduce the distortion of the image. Yao et al. [51] proposed an improvement of Lu et al.’s method in which an extra bit is embedded when a bit sequence embedded in a pixel consists of all bits equal to 1. Recently, Chen et al. [52] proposed an efficient dual-image reversible data hiding method by using EMD.

To correctly authenticate a pixel block, it is important to embed multiple authentication bits on the block. If the number of embedded bits is n, the conclusion that the block is attacked has an average accuracy rate of \(1-(\frac{1}{2}) ^{n}\). Since dual-image RDH methods have a very large embedding capacity, applying them in RIA schemes will achieve high detection accuracy. Peng et al. [53] proposed a new reversible image authentication scheme based on a dual image RDH method developed from Yin et al.’s irreversible hiding technique [54] in which four bits are embedded in a pixel pair. In this scheme, AC bits are made from pseudo-random numbers and the original image is divided into non-overlapped pixel blocks with the size of \(4\times 4\). Two stego images are used to embed data bits independently. the 16 AC bits and the 16 data bits used to restore the original image are embedded alternately in each \(4\times 4\) block of two stego images. Since only half of the pixel pairs of each watermarking image are used to embed the AC bits, if pixel pairs that do not contain the AC bits are hacked, then this attack cannot be detected. This is why Peng et al.’s scheme is not able to detect tampered blocks with high accuracy.

This paper has the following contributions:

+:

Based on Lu et al.’s paper [50], proposed a new dual-image reversible data-hiding algorithm that can embed about \(m \times n \times log_{2}5 = m \times n \times 2.32\) AC bits in an \(m \times n\) sized block. This embedding capacity is higher than related methods.

+:

Using the above hiding algorithm to construct the RIA method in which an original image is divided into \(m \times n\) non-overlapping blocks. The AC bits are generated from the MD5 hash with input data including pixel values and the position of each block. Both watermarking images are used to embed the AC bits.

+:

The proposed method can detect forged blocks with 100% accuracy with \(3 \times 3\) and \(4 \times 4\) blocks in all attack types. Existing methods have only an attack detection ability for \(4 \times 4\) block sizes with about 0% to 85.91 % accuracy depending on every case.

The paper consists of an Introduction and the following sections. Section 2 introduces related works and gives some remarks on their disadvantages and advantages. The proposed method is presented in Section 3. The experimental results comparing the embedding capacity, watermarking image quality, and ability to detect tampered blocks between methods are shown in Section 4. Finally, Section 5 provides some conclusions.

2 Related works

2.1 Lu et al.’s method

Lu et al. [50] used the central folding strategy to create a dual image RDH method in which embedding s bits \(b_{1}, . . , b_{s}\) in a pixel x of an original image I to obtain the two watermarking pixels \(x'\) and \(x''\). In the case of \(s = 2\), the embedding algorithm is performed as follows. First, convert \(b_{1}, b_{2}\) into a decimal value d with \(0 \le d \le 3\). Fold d in the centre by the operation \(d-2\), then embed \(d-2\) in the pixel x to get \(x'\) and \(x{''}\) as follows

$$\begin{aligned} x' = x + \lfloor \frac{d-2}{2} \rfloor , \quad x{''} = x - \lceil \frac{d-2}{2} \rceil . \end{aligned}$$
(1)

When two stego pixels \(x'\) and \(x{''}\) are known, the original pixel x is restored by the formula:

$$\begin{aligned} x = \lceil \frac{x'+x{''}}{2} \rceil , \end{aligned}$$
(2)

and two embedded bits are extracted as follows:

$$\begin{aligned} d = (x{'} - x{''} + 2) \rightarrow b_{1}b_{2}. \end{aligned}$$
(3)

That means converting the decimal number d into the two binary bits.

2.2 Yao et al.’s method

Yao et al.’s method [51] is an improvement to Lu et al.’s algorithm for embedding an extra bit when both first bits are equal to one. Specifically, when \(d\le 2\) Yao et al. embed two bits by formula (1). But when \(d=3\), i.e. \(b_{1}b_{2}=11\), Yao et al. have embedded an extra as follows:

$$\begin{aligned} x' = x, x'' = x-1, \, \text {if} \, b_{3}=0, \end{aligned}$$

and

$$\begin{aligned} x' = x+1, x{''} = x-1, \, \text {if} \, b_{3}=1. \end{aligned}$$

When knowing \(x'\) and \(x{''}\), the original pixel x is restored by the formula (2), and embedded bits are extracted as follows. Compute \(d = x{'}- x{''}+2\). If \(d \le 2\), two bit \(b_{1}b_{2}\) are extracted by formula (3). Otherwise, the three bits are extracted according to formulas:

$$\begin{aligned} b_{1}b_{2}b_{3} = {\left\{ \begin{array}{ll} 110, \, \text {if} \, d=3,\\ 111, \, \text {if} \, d=4. \end{array}\right. } \end{aligned}$$

To avoid underflow/overflow, only pixels with values between 1 and 254 are used for embedding. Pixels with a value of 0 or 255 are ignored. The embedding ratio of Lu et al.’s algorithm is 2 bits per pixel (2 bpp), while the embedding ratio of Yao et al.’s algorithm is 2.25 bpp. The maximal magnitude of the difference between each watermarking pixel and the corresponding original pixel of both methods is equal to one. That proves that both methods have high image quality.

2.3 Image authentication method of Yin et al.

Yin et al. [15] divided the original image I into \(4\times 4\) non-overlapped pixel blocks \(\left\{ B_k \right\} _{k=1}^{N}\) . Then embed AC bits in four consecutive pixels of the Hilbert Curve. As these four pixels are in a sub-block with the size of \(2\times 2\), so the embedding technique used in Yin et al.’s method is IPVO in sub-blocks of \(2\times 2\) pixels. Yin et al. used two thresholds \(T_{1}\) and \(T_{2}\) to classify \(4\times 4\) blocks into three categories: flat, normal and rough. The flat and normal blocks can embed at most 8 AC bits and 2 AC bits, respectively. While the rough blocks will be omitted in the embedding procedure. When both thresholds \(T_{1}\) and \(T_{2}\) are equal to 255, all \(4\times 4\) blocks can embed at most 8 bits, then the method has the highest embedding capacity. Consider a \(4\times 4\) pixel block \(B_k\) that consists of four sub-blocks \(B_k^t\) with \(t=1,...,4\). Suppose that \(B_k^t\) has four pixels \(B_{k,i}^t , i=1,...,4\). The embedding algorithm IPVO in sub-blocks \(B_k^t\) is carried out as follows. First sort the sequence of pixels \(B_{k,i}^t\) in ascending order to get:

$$\begin{aligned} B_{k,\sigma _1}^t \le B_{k,\sigma _2}^t \le B_{k,\sigma _3}^t \le B_{k,\sigma _4.}^t \end{aligned}$$

Then compute \(d_{min}\) and \(d_{max}\) by formulas:

$$\begin{aligned} d_{k,min}^t= & {} {\left\{ \begin{array}{ll} B_{k,\sigma _1}^t - B_{k,\sigma _2}^t, \, \text {if} \, \sigma _1< \sigma _2,\\ B_{k,\sigma _2}^t - B_{k,\sigma _1}^t, \, \text {if} \, \sigma _2< \sigma _1, \end{array}\right. }\\ d_{k,max}^t= & {} {\left\{ \begin{array}{ll} B_{k,\sigma _3}^t - B_{k,\sigma _4}^t, \, \text {if} \, \sigma _3< \sigma _4,\\ B_{k,\sigma _4}^t - B_{k,\sigma _3}^t, \, \text {if} \, \sigma _4 < \sigma _3. \end{array}\right. } \end{aligned}$$

Next a bit \(w_1\) is embedded in \(B_{k,\sigma _1}^t\) and a bit \(w_2\) is embedded in \(B_{k,\sigma _4}^t\) (if possible) to obtain \(C_{k, \sigma _1}^t\) and \(C_{k, \sigma _4}^t\) as follows:

$$\begin{aligned} C_{k, \sigma _1}^t= & {} {\left\{ \begin{array}{ll} B_{k, \sigma _1}^t - w_1, \, \text {if} \, d_{k,min}^t = 0 \, \text {or} \, d_{k,min}^t = 1, \\ B_{k, \sigma _1}^t - 1, \, \text {and no bit is embedded, otherwise,} \end{array}\right. }\\ C_{k, \sigma _4}^t= & {} {\left\{ \begin{array}{ll} B_{k, \sigma _4}^t + w_2, \, \text {if} \, d_{k,max}^t = 0 \, \text {or} \, d_{k,max}^t = 1, \\ B_{k, \sigma _4}^t + 1, \, \text {and no bit is embedded, otherwise}. \end{array}\right. } \end{aligned}$$

In Yin et al.’s method, authentication code bits (AC bits) are a sequence of N pseudo random bits generated by a secret key, where N is the number of \(4\times 4\) blocks obtained from the original image I. Non-embeddable blocks are ignored in the embedding procedure, and as such, they are not protected. An AC bit is embedded repeatedly in each embeddable block by mentioned above formulas. After embedding AC bits in the image I is completed, the watermarking image \(I{'}\) consisting of \(4\times 4\) blocks \(C_{k}, k = 1, ... ,N\) is obtained. During authentication, if all bits extracted from an embeddable block \(C_{k}\) are equal to the embedded AC bit, the k-th block is considered unchanged. On the contrary, it is considered an attacked block.

Remark of Yin et al.’s method

It is easily noted that if a \(4\times 4\) block \(C_{k}\) of the watermarking image \(I{'}\) is modified in the following ways, then Yin et al.’s method cannot detect.

  1. 1.

    All pixels of a sub-block \(C_k^t\) with \(t\in \{1, ... ,4\}\) are increased or decreased by the same value. Because then the result of extracting information according to IPVO does not change.

  2. 2.

    The last sub-blocks of a \(4\times 4\) block \(C_{k}\) are modified such that these sub-blocks become non-embeddable. Because then \(C_{k}\) becomes non-embeddable or all bits extracted are equal to the embedded AC bit.

  3. 3.

    A non-embeddable \(4\times 4\) block \(C_{k}\) is modified so that it remains non-embeddable. Moreover, because the number of embedded AC bits is not large, the accuracy of the detection is not high.

2.4 Image authentication method of Hong et al.

Hong et al. [11] used the same block division and embedding technique as in [15]. So symbols of [11] are used here. The original image I is partitioned into non-overlapped \(4\times 4\) pixel blocks \(\left\{ B_k \right\} _{k=1}^{N}\). Each \(4\times 4\) block \(B_k\) is divided into 4 sub-blocks with the size of \(2\times 2\) \(B_k^t\) with \(t=1,...,4\). Embedding data bits are implemented in \(2\times 2\) sub-blocks by using the technique IPVO as in Section 2.3. Thus each \(4\times 4\) block can embed at most 8 bits. A block with the number of embedded bits greater than zero is called embeddable, otherwise is called non-embeddable.

In this method, AC bits are generated for each \(4\times 4\) block. Specifically, eight values \(B_{k,\sigma _2}^t \; \text {and}\; B_{k,\sigma _3}^t \; \text {with}\; t=1,..., 4\) and the location position of the block \(B_k\) are used as the input data of the hash function MD5 to generate 128 AC bits for block \(B_k\). To enlarge detection capacity, an AC bit is embedded into the LSB of the first pixel of a key-selected sub-block of each \(4\times 4\) non-embeddable block by using the LSB replacement technique. The LSB of pixel used in the LSB replacement technique and AC bits will be embedded in embeddable \(4\times 4\) blocks by formulas described in Section 2.3. After embedding AC bits in \(4\times 4\) blocks \(B_k\) of the original image I, the watermarking image \(I'\) consists of \(4\times 4\) blocks \(C_k\) is obtained.

Compared with Yin et al.’s method, Hong et al. introduced two critical improvements: AC bits are generated from the features of each block and an AC bit is embedded in a non-embeddable block.

Remark of Hong et al.’s method Although the attack detection ability of Hong et al.’s method is significantly improved compared to Yin et al.’s method, it still cannot detect the change of blocks in the following cases:

  1. 1.

    For a non-embeddable \(4\times 4\) block \(C_k\) consisting of four sub-block \(C_k^t\), each \(C_k^t\) has four pixels \(C_{k,\sigma _1}^t \le C_{k,\sigma _2}^t \le C_{k,\sigma _3}^t \le C_{k,\sigma _4}^t\) if decreasing \(C_{k,\sigma _1}^t\) or/and increasing \(C_{k,\sigma _4}^t\) by an even value for at least \(t \in \{1,2,3,4\}\), then \(C_k\) is still non-embeddable and the bit extracted from \(C_k\) is still equal to the AC embedded bit. Therefore, in this case, the modification of \( C_k \) cannot be detected.

  2. 2.

    For an embeddable \(4\times 4\) block \(C_k\), if Pixels \(C_{k, \sigma _1}^t\) and \(C_{k, \sigma _4}^t\) with \(t=s,...,4\) (where s is a number between 2 and 4) are modified so that sub-blocks \(C_k^t , t=s,..,4\) become non-embeddable, but \(C_k\) is still embeddable. Then the bits extracted from \(C_k\) are still equal to the AC bits. Therefore in this case the change of \(C_k\) cannot be detected.

In this method, the number of embedded AC bits is not large, so the accuracy of the detection is not high. In addition, since an embeddable block contains the LSB value of a pixel in a non-embeddable block, a change in the latter block can lead to false conclusions about the change in the former block. For example, suppose \(C_s\) is embeddable that can embed two bits. In the embedding process, an AC bit and an LSB from a non-embeddable \(C_r\) are embedded in \(C_s\). Now if \(C_r\) is attacked so that it becomes embeddable, then two bits embedded in \(C_s\) are considered as two AC bits. Then the bits extracted from \(C_s\) can be different from the AC bits. This leads to the false conclusion that \(C_s\) was attacked.

2.5 Image authentication method of Peng et al.

In Peng et al.’s scheme [53], authentication information is the pseudo random bits which are generated by keys. The original image I with pixels \(x_i\) is copied into two images denoted as \(I_1\) with pixels \(p_i\) and \(I_2\) with pixels \(q_i\), that means \(x_i=p_i=q_i\) with \(i=1,..., H \times W\), where \(H \times W\) is the size of the image. Peng et al. improved the irreversible data hiding algorithm SOS [54] of Yin et al. to get a reversible data hiding method in two images and then use this method to create a reversible image authentication scheme.

Fig. 1
figure 1

Matrix MB of size \(4\times 4\)

First, four AC bits \(b_1,b_2,b_3,b_4\) are embedded in the pair of two pixels \(p_1,p_2\) of \(I_1\) based on matrix MB (Fig. 1) as follows. Convert \(b_1,b_2\) and \(b_3,b_4\) into two decimal numbers \(d_1\) and \(d_2\) with values between 0 and 3. Define two stego pixels \(p'_1,p'_2\) from \(p_1,p_2\) and \(d_1,d_2\) by formulas

$$\begin{aligned}&MB(p'_1 \% 4, p'_2 \% 4) = d_1, \\&MB(p'_1 \% 4, (p'_2 + 1) \% 4) = d_2, \\&p_1-1 \le p'_1 \le p_1 +2, \\&p_2-1 \le p'_2 \le p_2 +2. \\ \end{aligned}$$

To be able to recover \(p_1,p_2\) from \(p'_1 , p'_2\) can calculate values:

$$\begin{aligned} e_1=2+(p_1-p'_1), \, e_2=2+(p_2-p'_2), \end{aligned}$$

where \( 0 \le e_1, e_2 \le 3\). Embed \(e_1, e_2\) in the pair of two pixels \(q_1, q_2\) of the image \(I_2\) according to the same formulas above. That means

$$\begin{aligned}&MB(q'_1 \% 4, q'_2 \% 4) = e_1, \\&MB(q'_1 \% 4, (q'_2 + 1) \% 4) = e_2, \\&q_1-1 \le q'_1 \le q_1 +2, \\&q_2-1 \le q'_2 \le q_2 +2. \end{aligned}$$

To ensure that both images \(I_1^{'}\) and \(I_2^{'}\) contain the same number of authentication bits, the embedding of the authentication bits and the recovery information are performed alternately on the \(I_1\) and \(I_2\) images. That is, use the pair of pixels \(q_3, q_4 \) of \(I_2\) to embed the 4 AC bits and the pair of pixels \(p_3, p_4\) of \(I_1\) to embed recovery information, and so on. It is easy to see that only the pairs \((p_i, p_{i+1})\) as well as \((q_i,q_{i+1})\) satisfy the condition

$$\begin{aligned} 1 \le p_i, p_{i+1}, q_i, q_{i+1} \le 253, \end{aligned}$$

can be used to embed AC bits without causing underflow/overflow. In Peng et al.’s scheme, the original is divided into non-overlapped pixel blocks with the size of \(4\times 4\), so each block of \(I{'}_1\) and \(I{'}_2\) contains at most 16 AC bits. For \(i = 1, 5, 9,...,\), extracting the AC bits and restoring the original pixels \((x_i, x_{i+1})\) from \((p'_i, p'_{i+1})\) and \((q'_i, q'_{i+1})\) are carried out as follows:

$$\begin{aligned}&e_i=MB(q'_i \% 4,q'_{i+1}) \% 4, \\&e_{i+1}=MB(q'_i) \% 4,q'_{i+1}+1) \% 4,\\&x_i= p'_i+ e_i-2,\\&x_{i+1}= p'_{i+1}+ e_{i+1}-2, \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&d_i=MB(p'_i \% 4,p'_i \% 4), \\&d_{i+1}=MB(p'_i \% 4,(p'_{i+1}+1) \% 4). \end{aligned} \end{aligned}$$
(4)

Convert \(d_i\) and \(d_{i+1}\) into 4 AC bits:

$$\begin{aligned} d_i = (b_1,b_2 )_2 ,\, d_{i+1} = (b_3,b_4 )_2 \end{aligned}$$
(5)

For \(i = 3, 7, 11,...,\) extracting the AC bits and restoring the original pixels is done similarly but with a change of roles between \(p'_i\) and \(q'_i\).

Remark of Peng et al.’s method Although the number of embedded AC bits is large, this method has some limitations in its ability to detect attacks as follows.

  1. 1.

    From (4) and (5) , it follows that if \(p'_i\) or/and \(p'_{i+1}\) (with i=1,5,9,...) are incremented or decremented by a positive integer that is a multiple of four, then the same AC bits are extracted, so the change of \(I'_1\) is undetectable.

  2. 2.

    Since with \(i = 3, 7, 11, .., p'_i \)\(p'_{i+1}\)do not contain AC bits, so if they are modified in any way, this change will not be detected.

  3. 3.

    From (a) and (b), deduce that if each pixel of \(I_1^{'}\) is increased or decreased by a positive integer equal to a multiple of 4, such modification is also undetected.

The limitations in attack detection ability for the watermarking image \(I_2^{'}\) are similar to the image \(I_1^{'}\).

3 Proposed reversible image authentication scheme

3.1 Improvements of Lu et al.’s and Yao et al.’s algorithms

The algorithm used in the proposed scheme is an improvement of Lu et al.’s [50] and Yao et al.’s [51] algorithms by embedding a 5-nary digit d in a pixel x according to the following formula:

$$\begin{aligned} x'= x + \left\lfloor \frac{(d-2)}{2} \right\rfloor ,\, x''=x- \left\lceil \frac{(d-2)}{2} \right\rceil . \end{aligned}$$
(6)

Consider a block X of size \(m \times n\) consisting of \(k = m \times n\) pixels with a value between 1 and 254. Then can embed k 5-nary digits in the block X. Now we will evaluate the embedding capacity of this technique. Set:

$$\begin{aligned}&k5Max = 5^k - 1,\\&bk5Max: \text { binary representation of \,} k5Max, \\&l: \text {length of\, } bk5Max, \end{aligned}$$

where k5Max is the maximum 5-nary value of k digits and is the maximum value that can be embedded in the block X. Denote

$$\begin{aligned} l2Min = 2^{l-1} , \, l2Max = 2^l - 1, \end{aligned}$$

are the minimum \(l-bit\) binary value and the maximum \(l-bit\) binary value, respectively. Let emValue be a embedded binary number of l bits, then it can be divided into the following cases:

  1. 1.

    emValue is between l2Min and k5Max, emValue can be embedded in block X. In this case, l bits are embedded.

  2. 2.

    emValue is greater than k5Max, then can not embed emValue in X. In this case, can only embed the first \((l-1)\) bits of emValue.

  3. 3.

    emValue is smaller than l2Min, then the first bit of emValue is equal to zero. Although we can embed all l bits of emValue, but to avoid ambiguity with case 2, only the first \((l-1)\) bits of emValue are embedded.

Since can consider that values of emValue are uniformly distributed over the domain from zero to l2max, so it is deduced that the average number of embedded bits in X is:

$$\begin{aligned} Cap(X)= & {} [(k5Max - l2Min + 1)\times l + (l2Max + l2Min - k5Max)\\{} & {} \times (l - 1)]/(l2Max+1). \end{aligned}$$

For block X with sizes \(2\times 2\), \(3\times 3\) and \(4\times 4\) corresponding to 4, 9 and 16 pixels, the value of Cap(X) is equal to 9.1104, 20.4313 and 37.0551, respectively. Finally, the embedding ratio of the proposed algorithm is 2.2776, 2.2701 and 2.3159 corresponds to block sizes \(2\times 2\), \(3\times 3\) and \(4\times 4\). All these embedding ratios are higher than Lu et al.’s embedding ratio of 2 bpp and Yao et al.’s embedding ratio of 2.25 bpp.

Our other improvement is using the pixels with the value of 0 or 255 to embed, while these pixels are ignored in the methods of Lu et al. and Yao et al. Namely, a bit b is embedded in the pixel x with the value of 255 as follows:

$$\begin{aligned} \begin{aligned} x{'} = 255, x{''} = 255 \; \text {if} \; b=0,\\ \text {\,} x{'} = 254, x{''} = 255 \; \text {if} \; b=1. \end{aligned} \end{aligned}$$
(7)

For a block X consisting of all pixels with the value of 0 (\(x(i)=0,\) \(i=1,...,k) \), one-bit b is embedded in the first pixel of the block by formulas:

$$\begin{aligned} \begin{aligned} x{'}(1) = 4, x{''}( 1) = 0,\; \text {if} \; b=0,\\ \text {\,} x{'}(1) = 3, x{''}(1) = 0, \; \text {if} \; b=1,\\ x{'}(i) = x{''}( i) = 0, i=2,...,k. \end{aligned} \end{aligned}$$
(8)

From formulas (6)-(8) it is easy to see that the embedding algorithms in cases: \(0<x<255,x=255\), and \(x=0\) will not give ambiguous results.

3.2 Embedding data bits in a pixel block

Let X be a block of size \(m\times n\) consisting of k (with \(k=m\times n\)) pixels \(x_i=1,...,k, \text {and}\; B={b_1,b_2,...}\) be a given bit sequence. For embedding B in X, consider three cases:

The first case: All pixels x(i) of X with a value of 0. In this case, one bit of B is embedded in x(1) by formula (8) to get \(x{'}(i)\) and \(x{''}( i)\), \(i=1,...,k\).

The second case: There is at least a pixel of X with a value of 255, the remaining pixels with a value of 0. In this case, a bit of B is embedded in each pixel of X with a value of 255 by formula (7). For \(x(i)=0\), no bit is embedded, and \(x{'}(i)=x{''}( i)=0\).

The third case: X has at least a pixel with a value between 1 and 254. Suppose there are s such pixels.

First, scan by rows pixels of X. Each pixel with a value of 255 (if any) is used to embed one bit according to formula (7). For a pixel \(x(i)=0\), no bit is embedded, and \(x{'}(i)=x{''}( i)=0\). Then scan again X to find s pixels with a value between 1 and 254. The subsequent bits will be embedded in these s pixels according to Section 3.1. Namely, calculate the \(s5Max= 5^s-1\), the length l of the binary representation of s5Max. Then compute \(l2Min = 2^{l-1}\). Select l bits from B to generate a binary number called emValue. Compare emValue with s5Max and l2Min. If emValue is in the domain between l2Min and s5Max then emValue is preserved, otherwise, emValue is adjusted by obtaining only the first \((l - 1)\) bits. Convert emValue to a 5-nary number consisting of s digits: \(D=(d1, d2,..,ds)\) with \(0 \le dj \le 4\). Embed each digit of D in a pixel of X with a value from 1 to 254 according to the formula (6).

The illustrating examples:

  1. 1.

    Consider the block:

    $$\begin{aligned} X = \begin{bmatrix} 255 &{} 255 \\ 255 &{} 0 \end{bmatrix}, \end{aligned}$$

    and : B = (1,0,1). Scan pixels by rows, as far as the second case goes, there are:

    $$\begin{aligned} X^{'} = \begin{bmatrix} 254 &{} 255 \\ 254 &{} 0 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 255 &{} 255 \\ 255 &{} 0 \end{bmatrix}. \end{aligned}$$
  2. 2.

    Consider the block:

    $$\begin{aligned} X = \begin{bmatrix} 0 &{} 0 \\ 0 &{} 0 \end{bmatrix}, \end{aligned}$$

    and : B = (0). By the formulas in the first case we get:

    $$\begin{aligned} X^{'} = \begin{bmatrix} 4 &{} 0 \\ 0 &{} 0 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 0 &{} 0 \\ 0 &{} 0. \end{bmatrix}. \end{aligned}$$
  3. 3.

    Consider the block:

    $$\begin{aligned} X = \begin{bmatrix} 0 &{} 6 \\ 255 &{} 4 \end{bmatrix} ,\, \text {and} : B = (1,1,0,1,1, 1). \end{aligned}$$

    First, embed one bit in a pixel \(x_{3}\) with a value of 255. Since \(b_{1} = 1\), so \(x^{'}_{3} = 254\), and \(x^{''}_{3} = 255\). Two pixels \(x_{2} = 6\) and \(x_{4} = 4\) will be used to embed next bits. In this case, \(s=2\), \(s5Max= 5^{2}-1 = 24\), \(bs5Max = 11000\), \(l = 5\), and \(l2Min = 10000\). Select the next 5 bits: 10111, it is between l2Min and s5Max, so five of these bits can be embedded in pixels \(x_{2}\) and \(x_{4}\). Convert this binary value to a 5-nary number to get \(emValue = (43)_{5}\). Embed numbers 4 and 3 in the pixels \(x_{2}\) and \(x_{4}\) according to formula (6), we obtain:

    $$\begin{aligned} \begin{aligned} x'_{2} = 7, \, x{''}_{2} = 5 \; \text {} \end{aligned} \\ \begin{aligned} x_{4}' = 4, \, x{''}_{4} = 3 \; \text {} \end{aligned} \end{aligned}$$

    Thus, the obtained watermarking blocks are:

    $$\begin{aligned} X^{'} = \begin{bmatrix} 0 &{} 7 \\ 254 &{} 4 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 0 &{} 5 \\ 255 &{} 3 \end{bmatrix}. \end{aligned}$$
  4. 4.

    Consider the block:

    $$\begin{aligned} X = \begin{bmatrix} 3 &{} 5 \\ 6 &{} 5 \end{bmatrix} ,\, \text {and} \, B = (1,0,0,0,1,0,0,0,0,1). \end{aligned}$$

    This block belongs to the third case with \(s= 4\). In this case, no pixel has a value of 255, so all bits will be embedded in pixels with values 1 to 254. We have \(s5Max= 5^{4}-1 = 624,\, bs5Max = 1001110000, \;\ l = 10\), and \(l2Min = 2^{l-1} = 2^9\). Select 10 bits of \(B: emValue = 1000100001\). It is clear that emValue is smaller than s5Max and greater than l2Min, so convert this binary value to a 5-nary number to get \((4140)_5\). Embed the 5-nary digits 4, 1, 4 and 0 in pixels with the values 3, 5, 6 and 5 of X according to formula (6), respectively, we get:

    $$\begin{aligned} X^{'} = \begin{bmatrix} 4 &{} 4 \\ 7 &{} 4 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 2 &{} 5 \\ 5 &{} 6 \end{bmatrix}. \end{aligned}$$

3.3 Extracting and restoring from a pair of two watermarking blocks

Suppose \(X^{'}\) and \(X^{''}\) are a pair of known watermarking blocks of size \(m \times n\). The process of extracting embedded bits and restoring the original block is performed by the following steps: First restoring the original block \(X=(x_1,...,x_k)\), with \(k=m\times n\) by formulas:

$$\begin{aligned} x_i = {\left\{ \begin{array}{ll} 0, \, \text {if} \,i=1 \, \text {,} \,x_1^{'} \in \{3,4\}\, \text {and} \,x_1^{''}=0,\\ \bigl \lceil \frac{x'_i + x_i{''}}{2} \bigr \rceil , \, \text {otherwise.} \end{array}\right. } \end{aligned}$$
(9)

where \(i=1,...,k\). Then we divide X into three cases as in Section 3.2. In the first case, one bit is extracted by the formula:

$$\begin{aligned} b = {\left\{ \begin{array}{ll} 0, \, \text {if} \, x^{'}_1 = 4, \\ 1, \, \text {if} \, x^{'}_1 = 3. \\ \end{array}\right. } \end{aligned}$$
(10)

In the second case, one bit is extracted from each \(x'\in \{254,255\}\) by the formula:

$$\begin{aligned} b = {\left\{ \begin{array}{ll} 0, \, \text {if} \, x{'} = 255, \\ 1, \, \text {if} \, x{'} = 254. \\ \end{array}\right. } \end{aligned}$$
(11)

In the third case, first scan \(X'\) and \(X''\) by rows to get pairs of \(x^{'}\) and \(x^{''}\). If \(x^{'}\) and \(x^{''}\) correspond to a pixel x with a value of 255, extract a bit by formula (11). Then scan again \(X^{'}\) and \(X^{''}\), for each pair of pixels \(x'\) and \(x''\) corresponds with an original pixel x with a value between 1 and 254, extract a 5-nary digit by the formula:

$$\begin{aligned} d = x^{'} - x^{''} + 2. \end{aligned}$$
(12)

Suppose have s such original pixels. Compute values s5Maxl,  and l2Min as in Section 3.2. Convert s of obtained 5-nary digits into a decimal number denoted emValue. If emValue is between l2Min and s5Max, convert emValue in to l binary bits. Otherwise, convert emValues into \(l - 1\) binary bits. The sequence of embedded bits is obtained by collecting bits extracted. The illustrating example:

  1. 1.

    Consider a pair of two watermarking blocks (example 4 of 3.2)

    $$\begin{aligned} X^{'} = \begin{bmatrix} 4 &{} 4 \\ 7 &{} 4 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 2 &{} 5 \\ 5 &{} 6 \end{bmatrix}. \end{aligned}$$

    First, restore the original block X by (9), obtain:

    $$\begin{aligned} X = \begin{bmatrix} 3 &{} 5 \\ 6 &{} 5 \end{bmatrix} \end{aligned}$$

    We see that the original pixel block belongs to case 3. In this case, have \(s=4\), \(s5Max = 624\), \(bs5Max = 1001110000\), \(l = 10\), \(l2Mins = 512\). Extract four 5-nary digits by formula (12), we obtain a 5-nary number: \((4, 1, 4, 0)_5\). Convert it into a decimal number denoted emValue, we have \(emValue = 545\). Since emValue is between l2Min and s5Max, so convert emValue to a sequence of l bits.We obtain 1000100001, that are the bits to be extracted.

  2. 2.

    Consider two watermarking blocks (example 2 of 3.2)

    $$\begin{aligned} X^{'} = \begin{bmatrix} 4 &{} 0 \\ 0 &{} 0 \end{bmatrix} ,\, X^{''} = \begin{bmatrix} 0 &{} 0 \\ 0 &{} 0 \end{bmatrix}. \end{aligned}$$

    From formulas (9) and (10) it follows

    $$\begin{aligned} X = \begin{bmatrix} 0 &{} 0 \\ 0 &{} 0 \end{bmatrix}, \end{aligned}$$

    and an embedded bit is 0.

3.4 Embedding AC bits in an original image

Let I be an original image of size \(H \times W\). Divide I into non-overlapped blocks \(I_{r,c}\) with the size of \(m \times n\) where \(r=1,...,\lfloor H/m \rfloor , c=1,...,\lfloor W/n \rfloor \). The process of embedding authentication information on image I is carried out according to the following steps:

Step 1: For each block \(I_{r,c}\) with pixels \(I_{r,c}(i), i=1,...,k \, (\text {with} \, k=m \times n)\), first generate 128 AC bits by applying hash function MD5 as follows

$$\begin{aligned} {\begin{matrix} I_{r,c} AC &{}= hashMD5(r,c,I_{r,c} (1), ... ,I_{r,c} (k)) \\ &{}=(ac(1),...,ac(128)). \end{matrix}} \end{aligned}$$
(13)

Step 2: Choose AC bits defined in the previous step as the embedded bits:

$$\begin{aligned} {\begin{matrix} (b(1), ..., b(128)) =(ac(1),...,ac(128)). \end{matrix}} \end{aligned}$$
(14)

Embed bits \(b(1), b(2), ..., b(\alpha )\) in the block \(I_{r,c}\) by formulas in Section 3.2 to get two watermarking blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\), where \(\alpha \) is a maximum number of bits that can be embedded.

Step 3: Repeat steps 1 and 2 for all blocks \(I_{r,c}\) with \(r=1,...,\lfloor H/m \rfloor \) and \(c = 1, ... ,\lfloor W/n \rfloor \) to get all watermarking blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\).

Step 4: Combine blocks \(I_{r,c}^{'}\) to get the watermarking image \(I^{'}\) and blocks \(I_{r,c}^{''}\) to get the watermarking image \(I^{''}\). The watermarking images \(I^{'}\) and \(I^{''}\) are sent to a certain recipient. This person needs to check if each block of the watermarking images is hacked or not. If a pair of two watermarking blocks is not attacked, then from those two blocks we can restore the corresponding original block correctly. Therefore, if both watermarking images are not attacked or the number of tampered block pairs is small, the recovered image is still available.

3.5 Authenticating the watermarking images

After embedding AC bits in the original image I of size \(H\times W\) as in Section 3.4 we get two watermarking images \(I^{'}\) and \(I^{''}\) of the same size as I. These images are sent, but the recipient receives images \(\widetilde{I^{'}}\) and \( \widetilde{I^{''}}\) that can be equal to or different from \(I^{'}\) and \(I^{''}\). If images \(I^{'}\) and \(I^{''}\) are attacked, then \(\widetilde{I^{'}}\) and \(\widetilde{I^{''}}\) are different from \(I^{'}\) and \(I^{''}\), otherwise, they are equal. To detect and locate attacked regions, first divide images \(\widetilde{I^{'}}\) and \(\widetilde{I^{''}}\) into non-overlapped blocks \(\widetilde{I_{r,c}^{'}}\) and \(\widetilde{I_{r,c}^{''}}\) with the size of \(m \times n\) as in Section 3.4. Then the process of authenticating each pair of blocks \(\widetilde{I_{r,c}^{'}}\) and \(\widetilde{I_{r,c}^{''}}\) is carried out according to the following steps:

Step 1: Preliminary Check: First from Section 3.4, it is easy to see that the pair of two blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) has the following two properties:

  1. 1.

    All pixels of \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are between 0 and 255.

  2. 2.

    Except for the case \(I_{r,c}^{'} (1) \in \{3,4\}, I_{r,c}^{''} (1) = 0,\) and \(I_{r,c}^{'} (i) = I_{r,c}^{''} (i) = 0, i=2,...,k. \) In other cases, the values of pixels \(I_{r,c}^{'} (i)\) and \(I_{r,c}^{''} (i) \) differ by no more than 2, that is \(abs(I_{r,c}^{'} (i) - I_{r,c}^{''} (i)) \le 2\), with \(i=1,...,k\).

Therefore, we check if \(\widetilde{I_{r,c}^{'}}\) and \(\widetilde{I_{r,c}^{''}}\) satisfy these two properties. If the condition is not satisfied, then it can be confirmed that \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) have been attacked, and skipped steps 2-5. Otherwise, go to step 2.

Step 2: From the pair of two blocks \(\widetilde{I_{r,c}^{'}}\) and \(\widetilde{I_{r,c}^{''}}\), restore the block \(\widetilde{I_{r,c}}\) by formula (9). Note that if blocks \(I_{r,c}^{'} \) and \(I_{r,c}^{''}\) are not attacked, then \(\widetilde{I_{r,c}}\) is the original block, that is: \(\widetilde{I_{r,c}} = I_{r,c}\).

Step 3: Generate AC bits for \(\widetilde{I_{r,c}}\) by formula (13):

$$\begin{aligned} \widetilde{I_{r,c}}AC&= hashMD5(r,c,\widetilde{I_{r,c}}(1), ... , \widetilde{I_{r,c}}(k))\\&= (\widetilde{ac}(1), ... ,\widetilde{ac}(128)) \end{aligned}$$

Step 4: From blocks \(\widetilde{I_{r,c}^{'}}\) and \(\widetilde{I_{r,c}^{''}}\) extract bits by formulas in Section 3.3. Denote extracted bits as \(\tilde{B} = (\tilde{b}(1), ... ,\tilde{b}(\alpha ))\).

Step 5: Compare \(\tilde{b}(i)\) with \(\widetilde{ac}(i))\), \(i=1,...,\alpha \). If all are equal, we conclude that \( I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are not attacked, that is \( \widetilde{I_{r,c}^{'}} = I_{r,c}^{'}\) and \( \widetilde{I_{r,c}^{''}} = I_{r,c}^{''}\). Otherwise consider as \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are attacked.

Step 6: Repeat steps 1-5 to authenticate and locate tampered pairs of two watermarking blocks. In the case \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are not attacked then block \(\widetilde{I_{r,c}}\) defined in step 2 is the original block \(I_{r,c}\). Thus, we can locate the attacked blocks and recover the original blocks correctly in case of no attack.

3.6 Accuracy of the authentication algorithm

First, it is noted that almost original pixel blocks \(I_{r,c}\) satisfy the condition:

$$\begin{aligned} I_{r,c}(i) \in \{1,...,254\} \end{aligned}$$
(15)

with \(i \in \{1,...,k\}\). For such pixel blocks, restoring original pixels and extracting embedded bits are performed according to formulas (9) and (12), respectively.

For simplicity, in this section only blocks satisfying condition (15) are considered. The authentication accuracy of the proposed method in the below three attack types of a block will be discussed. .

The first type of attack: Together increase or decrease \(I_{r,c}^{'}(i)\) and \(I_{r,c}^{''}(i)\) by the same value, that is,

$$\begin{aligned} \widetilde{I_{r,c}^{'}}(i) = I_{r,c}^{'}(i) + \Delta (i), \, \widetilde{I_{r,c}^{''}}(i) = I_{r,c}^{''}(i) + \Delta (i), \, i=1,...,k \end{aligned}$$

with at least one \(\Delta (i) \ne 0\). Then by formula (9)\(_{b}\), pixels \(\widetilde{I_{r,c}}(i)\) restored by step 2 of Section 3.5 differ from the original pixels. That means \((\widetilde{I_{r,c}}(1), ... , \widetilde{I_{r,c}}(k)) \ne (I_{r,c}(1), ... , I_{r,c}(k))\) . So, AC bits obtained in step 3 of the authentication procedure (Section 3.5) are not equal to AC bits created in the process of embedding AC bits (Section 3.4), which means

$$\begin{aligned} (\widetilde{ac}(1), ... , \widetilde{ac}(128)) \ne (ac(1), ... , ac(128)). \end{aligned}$$
(16)

While from (12), the bits extracted in step 4 of Section 3.5 are the same as the embedded bits, which means

$$\begin{aligned} \widetilde{b}(i) = b(i), \, i = 1,...,\alpha . \end{aligned}$$

From this and (14) it follows that

$$\begin{aligned} \widetilde{b}(i) = ac(i), \, i = 1,...,\alpha . \end{aligned}$$
(17)

By formulas (16)-(17), deduce that

$$\begin{aligned} (\widetilde{b}(1), ... , \widetilde{b}(\alpha )) \ne (\widetilde{ac}(1), ... , \widetilde{ac}(\alpha )) \end{aligned}$$

So, by step 5 of Section 3.5, we conclude that blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are attacked.

However, when the value of \(\alpha \) is small, from (16) it can still deduce that

$$\begin{aligned} (\widetilde{ac}(1), ... , \widetilde{ac}(\alpha )) = (ac(1),...,ac(\alpha )) \end{aligned}$$
(18)

From this and (17) we have \((\widetilde{b}(1), ... , \widetilde{b}(\alpha )) = (\widetilde{ac}(1), ... , \widetilde{ac}(\alpha )). \) So, by step 5 of Section 3.5, we conclude that blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are not attacked. Thus, we get the wrong conclusion. However, from (16) it follows that the probability of occurrence of the condition (18) is \(\frac{1}{2^{\alpha }}\). When \(\alpha \) is greater than 10, this probability is very low.

The second type of attack: Increase \(I_{r,c}^{'}(i)\) by one value and decrease \(I_{r,c}^{''}(i)\) by the same value or vice versa, that is

$$\begin{aligned} \widetilde{I_{r,c}^{'}}(i) = {I_{r,c}^{'}}(i) + \Delta (i), \, \widetilde{I_{r,c}^{''}}(i) = {I_{r,c}^{''}}(i) - \Delta (i), \, i=1,..,k, \end{aligned}$$

with at least one \(\Delta (i) \ne 0\). Similar to the first case, by formula (9), have

$$\begin{aligned} \widetilde{I_{r,c}}(i) = I_{r,c}(i), \, i=1,..,k. \end{aligned}$$

It follows that

$$\begin{aligned} \widetilde{ac}(i) = ac(i), \, i=1,..,128. \end{aligned}$$

On the other hand, from (12), deduce that

$$\begin{aligned} \widetilde{b}(i) \ne b(i), \end{aligned}$$

with at least one i. From this and (14), deduce \((\widetilde{b}(1),...,\widetilde{b}(\alpha )) \ne (\widetilde{ac}(1), ... , \widetilde{ac}(\alpha ))\). So, by step 5 of Section 3.5, we conclude that blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are attacked. Similar to the first case, the probability of a false conclusion in this case is \(\frac{1}{2^{\alpha }}\).

The third type of attack: This attack type differs from both the first type and the second type. Then we have

$$\begin{aligned}&(\widetilde{ac}(1), ... , \widetilde{ac}(128)) \ne (ac(1), ... , ac(128), \\&(\tilde{b}(1), ... ,\tilde{b}(\alpha )) \ne (b(1), ... ,b(\alpha )). \end{aligned}$$

If

$$\begin{aligned} (\widetilde{ac}(1), ... , \widetilde{ac}(\alpha )) = (\tilde{b}(1), ... , \tilde{b}(\alpha )) \end{aligned}$$
(19)

According to step 5 of Section 3.5, we conclude that two blocks \(I_{r,c}^{'}\) and \(I_{r,c}^{''}\) are not attacked. In this case, bits \(\widetilde{ac}(i)\) and \(\tilde{b}(i)\) do not have any relationship, so the probability of occurrence of the condition (19) is equal to is \(\frac{1}{2^{\alpha }}\). In summary, for the proposed method, the true attack detection rate is \((1-\frac{1}{2^{\alpha }})\), with \(\alpha \) is the number of AC bits embedded in a pixel block. This rate depends on the size of the block \(I_{r,c}\). For the block sizes of \(4\times 4\), \(3\times 3\) and \(2\times 2\), the embedding capacities \(\alpha \) are 37, 20 and 9, respectively. In all of these cases, the correct attack detection rate is close to 100%.

Fig. 2
figure 2

Experimental images

4 Experimental results

This section presents experiment results to compare the detection effectiveness of the proposed method with the methods of Yin et al. [15], Hong et al. [11], and Peng et al. [53]. The computations of methods have been performed using MATLAB running on Lenovo computer IdeaPad Slim 5 15 ITL 05 with Intel Core i5, 8-GB Ram, 512-GB SSD. The experiments are performed on the 12 standard grayscale images of size \(512\times 512\) as shown in Fig. 2.

4.1 Embedding capacity and image quality comparisons between methods

The embedding capacity of each method is measured by the average number of AC bits embedded in 12 standard grayscale images. Similarly, stego-image quality is calculated by the average PSNR value obtained in these 12 test images. The comparison results are presented in Table 1. The information in the second column is the number of non-overlapping blocks partitioned from each \(512\times 512\) test image. For methods using \(4\times 4\) blocks, such as Yin et al., Hong et al., Peng et al., and the proposed method with \(4\times 4\) blocks, the number of blocks in each test image is 16384. With the proposed method using \(3\times 3\) and \(2\times 2\) blocks, the numbers of blocks are 28900 and 65536, respectively. The third column presents the average number of embeddable blocks in 12 test images. This number for methods of Yin et al. and Hong et al. is 13418 over 16384 blocks. For the remaining methods, all blocks are embeddable.

Table 1 Embedding capacity and image quality comparisons

The fourth column introduces the embedding capacity of methods. In Sections 2.3 and 2.4, it is noted that the methods of Yin et al. and Hong et al. used the IPVO technique in \(2\times 2\) sub-blocks. So, in each block of \(4\times 4\) pixels at most 8 AC bits are embedded and at least no pixel is embedded, the average number is 4 bits. The average number of embedded bits in each \(4\times 4\) block of these two methods is 3,3060 as shown in Table 1 which is in agreement with the theoretical analysis. Section 2.5 showed that in each \(4\times 4\) block of 16 pixels with values from 1 to 253, the method of Peng et al. can embed 32 bits, but only half of the bits are AC bits, the remaining bits are used to restore the original image. The mean number of embedded AC bits of Peng et al’s method in a block is 15.9851. It is shown in Section 3.1 that for a pixel block X of sizes \(2\times 2\), \(3\times 3\), and \(4\times 4\), the proposed method can embed a number of 9.1104, 20.4313 and 37.0551 bits, respectively. Thus, the embedding capacity of the proposed method presented in Table 1 is consistent with the theoretical analysis.

The 5th column presents the average PSNR values of the methods. In Peng et al.’s method and the proposed method, PSNR is the mean of PSNR1 and PSNR2. For Yin et al.’s method, only \(50\%\) number of pixels in an image is increased or decreased by one, so its average PSNR is the largest. Compared with Yin et al.’s method, Hong et al.’s method must perform embedding in non-embeddable blocks. Therefore, the PSNR of Hong et al.’s method is smaller than that of Yin et al.’s method. In the proposed method, each pixel is also increased or decreased by one, but the number of modified pixels is larger than that of Yin et al.’s method. Therefore, the PSNR of the proposed method is slightly smaller than the PSNR of Yin et al.’s method. The pixels in Peng et al.’s method are modified at most two units, so the PSRN coefficient of this method is the smallest.

4.2 Detection capacity comparison between methods

4.2.1 Small and uniform distribution attack

In this attack manner, first, a noise matrix of size \(512\times 512\), denoted by NM, is created. This matrix consists of non-overlapped \(4\times 4\) blocks, denoted by \(NM_{r,c}\) with \(1 \le r,c \le 128\). All blocks are equal to zero except blocks \(NM_{r,c}\) with rc satisfy the condition:

$$\begin{aligned} (r-1)\,mod\,4 = 0 \, \text {and} \, (c-1)\,mod\,2 = 0 \end{aligned}$$
(20)

Each of these blocks has two elements with values belonging to the set \(\{-2, -1, 1, 2\}\), other elements are equal to zero. The positions and values of non-zero elements are determined randomly. For example, a sub-matrix consisting of the first 20 rows and 20 columns of NM is shown in Fig. 3. The matrix NM used for attacking a watermarking image \(I^{'}\) of size \(512\times 512\) to get an attacked image \(\widetilde{I}^{'}\) as follows:

$$\begin{aligned} \widetilde{I}^{'}(i,j) = {\left\{ \begin{array}{ll} I^{'}(i,j) + NM(i,j),\; \text {if}\; 0 \le I^{'}(i,j) + NM(i,j) \le 255 \\ I^{'}(i,j) - NM(i,j), \; \text {otherwise.} \end{array}\right. } \end{aligned}$$
(21)

For Peng et al.’s method and the proposed method, the second watermarking image \(I^{''}\) will be modified according to formula (21) to get an attacked image \(\widetilde{I}^{''}\) but with a noise matrix NM2 that only differs NM in positions of non-zero elements in each block \(NM_{r,c}\). These attacks make a small modification to the watermarking images. Even this change is difficult to perceive with the naked eye. Figure 4 below illustrates a watermarking image \(I^{'}\) and an attacked image \(\widetilde{I}^{'}\) obtained by attacking it in the above manner.

Fig. 3
figure 3

The first part of noise matrix NM

Fig. 4
figure 4

Watermarking image and attacked image by the small and uniform distribution attack

Table 2 Attack detection ability comparison between methods

The average detection ability of attacked blocks on 12 test images of the methods is presented in Table 2.

We now explain some notations in Table 2. The TP in column 2 is the exact number of blocks detected as fake. In other words, TP is the number of blocks that are confirmed to have been hacked and that has in fact been modified. FN is the number of blocks that are confirmed to be unchanged but in fact, they have been tampered with. TN is the exact number of blocks detected as not being hacked. FP is the number of blocks detected as hacked but in reality, they have not changed at all. TPR is the ratio of the exact number of blocks detected as hacked to the total number of hacked blocks. This coefficient evaluates the ability to detect hacked blocks. If TPR = 1, it means that \(100\%\) of hacked blocks are detected. The FPR is the ratio of the number of blocks that are falsely detected as hacked to the total number of blocks that are not attacked. If FPR = 0, it means that \(100\%\) of not-hacked blocks are correctly detected. According to Table 2, Yin et al.’s method detects 42.30% and Hong et al.’s method detects 64.59% of the number of tampered blocks. Peng et al.’s method can detect 85.91 % of tampered blocks. The attack detection ratio of the proposed method is 99.91% if \(2\times 2\) block sizes are used and is always 100% if \(3\times 3\) and \(4\times 4\) block sizes are used. It should also be mentioned that for all methods except Hong’s method, every block that is not attacked is detected as unmodified. Particularly for Hong’s method, \(FPR = 0.05 \%\). This means there are some blocks that do not change, but the method of Hong et al. concluded that they have been attacked. This is explained in Section 2.4.

4.2.2 A special attack on the watermarking images of Sailboat image

This section presents a special attack that is performed in each \(4\times 4\) block \(bI^{'}\) of the watermarking image \(I^{'}\) independently. Denote \(mi^{'}\) (\(ma^{'}\)) as the smallest (largest) value of pixels in the \(bI^{'}\). Now we want to define a quantity, denoted \(deV^{'}\), that is a multiple of 4 with as large a value as possible, such that after decreasing all pixels of \(bI^{'}\) by \(deV^{'}\), the decreased pixels are retained in segment [0, 255]. Obviously, \(deV^{'}\) is determined by the formula:

\(deV^{'} = 4 \times \lfloor (mi^{'}/4) \rfloor .\)

Similarly, if define

\(inV^{'} = 4 \times \lfloor (255 -ma^{'})/4) \rfloor ,\)

then \(inV^{'}\) is the largest possible multiple of 4 such that after increasing all the pixels of \(bI^{'}\) by \(inV^{'}\), the increased pixels will be retained in the [0,255] segment.

To make \(bI^{'}\) as variable as possible, \(deV^{'}\) is chosen If \(deV^{'} \ge inV^{'}\), and \(inV^{'}\) is chosen, if otherwise. The following are some illustrative examples. Consider a block:

$$\begin{aligned} bI^{'} = \begin{bmatrix} 103 &{} 150 &{} 152 &{} 154\\ 208 &{} 156 &{} 160 &{} 134\\ 200 &{} 150 &{} 170 &{} 144\\ 228 &{} 158 &{} 180 &{} 124\\ \end{bmatrix}, \end{aligned}$$

In this case, \(mi^{'}=103\), \(ma^{'}=228\), \(deV^{'}= 100\), \(inV^{'}= 24\). Choose \(deV^{'}= 100\) to attack \(bI^{'}\). In other words, all pixels of \(bI^{'}\) are decreased by \(deV^{'}\). The attacked block is equal to:

$$\begin{aligned} \widetilde{bI}^{'} = \begin{bmatrix} 3 &{} 50 &{} 52 &{} 54\\ 108 &{} 56 &{} 60 &{} 34\\ 100 &{} 50 &{} 70 &{} 44\\ 128 &{} 58 &{} 80 &{} 24\\ \end{bmatrix}, \end{aligned}$$

Consider a block:

$$\begin{aligned} bI^{'} = \begin{bmatrix} 23 &{} 150 &{} 152 &{} 150\\ 108 &{} 153 &{} 140 &{} 134\\ 100 &{} 150 &{} 130 &{} 144\\ 128 &{} 148 &{} 120 &{} 124\\ \end{bmatrix}, \end{aligned}$$

In this case, \(mi^{'}=23\), \(ma^{'}=153\), \(deV^{'}= 20\), \(inV^{'}= 100\). Choose \(inV^{'}= 100\) to attack \(bI^{'}\). In other words, all pixels of \(bI^{'}\) are increased by \(inV^{'}\). The attacked block is equal to:

$$\begin{aligned} \widetilde{bI}^{'} = \begin{bmatrix} 123 &{} 250 &{} 252 &{} 250\\ 208 &{} 253 &{} 240 &{} 234\\ 200 &{} 250 &{} 230 &{} 244\\ 228 &{} 248 &{} 220 &{} 224\\ \end{bmatrix}, \end{aligned}$$

In Peng et al.’s method and the proposed method, there are two watermarking images \(I^{'}\) and \(I^{''}\). Therefore, for a pair of two blocks \(bI^{'}\) and \(bI^{''}\). First, need to compute \(deV^{'}\), \(inV^{'}\) of \(bI^{'}\), and \(deV^{''}\), \(inV^{''}\) of \(bI^{''}\). Then define \(deV = min(deV^{'}, deV^{''})\), and \(inV = min(inV^{'}, inV^{''})\). Finally, if \(deV \ge inV\), using deV to attack both blocks \(bI^{'}\) and \(bI^{''}\) simultaneously. Otherwise, inV is used. Figure 5 illustrates a watermarking image of the Sailboat image and an image obtained after attacking this watermarking image by modifying 7680 blocks of size \(4\times 4\) located in the first 240 rows of the image.

This attack makes significant changes to the watermarking images that we can easily perceive with the naked eye. However, according to Table 3, both Yin et al.’s and Peng et al.’s methods failed to detect any block that is attacked. This has been explained in Sections 2.3 and 2.5. The method of Hong et al. detected 55.29% of \(4\times 4\) blocks being hacked.

Fig. 5
figure 5

Watermarking image and attacked image by the special attack

Table 3 Detection ability comparison for a special attack on the Sailboat image

While the proposed method with block sizes \(4\times 4\), \(3\times 3\) and \(2\times 2\) have a detection ratio of 100%, 100% and 99.80% respectively.

In summary, the experimental results confirmed that the proposed method has superior attack detection ability compared to other methods, while always maintaining high watermarking image quality (PSNR more than 50).

4.3 Computational complexity comparison

The proposed method and the method of Hong et al. used the characteristics of each block as input data to the MD5 hash function to generate AC bits for each block. Both of these methods achieve higher attack detection capabilities than the other two methods. Therefore their computational complexity is also higher. The running time of the proposed method and the method of Hong et al. are similar. This conclusion holds for the methods of Peng et al. and Yin et al. Specifically, the running time in the AC bit embedding phase is 1.0552, 0.7716, 4.9216 and 4.8619 (seconds) for the methods of Peng et al., Yin et al., Hong et al., and the proposed method. The running time in the hacked block detection phase is 0.8741, 1.0636, 4.9498 and 5.1961 for the above methods. Thus, the proposed method has a significantly higher ability to detect attack blocks than existing methods, but its computational complexity is only approximately that of Hong et al.’s method.

5 Conclusion

In this paper, we have improved the reversible data hiding methods on two images of Lu et al. and Yao et al. by embedding a 5-nary number of k digits on k pixels with values from 1 to 254. The average embedding ratios of our improvement are 2.3159, 2.2701 and 2.2776 bpp for block sizes of \(4 \times 4\), \(3 \times 3\) and \(2 \times 2\) respectively, which is larger than the embedding ratios 2 and 2.25 bpp of Lu et al.’s and Yao et al.’s methods. Due to the large embedding capacity and the AC bits generated for each block individually by using the MD5 hash function on the information about the value and location of the block, the proposed method can detect any type of attack with high accuracy. For the attack type of modifying slightly 2048 blocks that are distributed uniformly out of a total of 16384 blocks of size \(4 \times 4\), the correct detection ratios of Yin et al.’s, Hong et al.’s and Peng et al.’s methods are 42.30%, 64.59%, and 85.91%, respectively, while the correct detection ratios of the proposed method with block sizes of \(4 \times 4\), \(3 \times 3\) and \(2 \times 2\) are 100%, 100%, and 99.91%. In the second attack, all 7680 blocks of size \(4 \times 4\) in the first 240 rows of a watermarking image are transformed by increasing or decreasing all pixels of each block \(4 \times 4\) by the same value which is the largest possible multiple of 4. For this attack, the methods of Yin et al. and Peng et al. can not detect any hacked block, Hong et al.’s method has a correct detection rate of \(55.29\%\), while the proposed method with block sizes of \(4 \times 4\), \(3 \times 3\) and \(2 \times 2\) has a correct detection rate of 100%, 100% and 99.80%, respectively. In summary, the proposed reversible image authentication method has a superior attack detection ability compared to existing methods, while maintaining high watermarking image quality (values of PSNR greater than 50 dB).