1 Introduction

With the rapid development of the Internet and multimedia applications, a great amount of digital data, e.g., images, videos, audio, and text, is transmitted through the Internet each second. Such data can easily be infringed upon by illegal copying, editing, or tampering with image processing software. Therefore, protecting the integrity and privacy of digital images has become an increasingly important issue in both academia and industry [8, 10,11,12, 21, 22]. Digital watermarking is a popular technique for protecting the integrity and privacy of digital images [27], and it can be classified into three categories, i.e., robust, semi-fragile, and fragile watermarking. Robust watermarking [1, 9, 19] is designed to resist both intentional and accidental attacks, while semi-fragile watermarking [2, 15, 18, 24, 26] is designed to tolerate some content-preserving manipulations, e.g., Gaussian noise and JPEG compression. Fragile watermarking [3,4,5,6,7, 13, 14, 16, 17, 20, 25] is very sensitive to any modification; even a minor operation on the image destroys the embedded watermark. Two important applications of robust watermarking are copyright protection and digital forensics. In contrast, the main use of the other two techniques is to protect the integrity of digital images. This paper concentrates on fragile watermarking for image authentication.

In the last few decades, several fragile watermarking algorithms for image authentication have been proposed in both the spatial and transform domains to detect tampered regions in watermarked images [3,4,5,6,7, 13, 14, 16, 17, 20, 25]. Schemes based on the spatial domain embed the watermark by directly modifying a group of pixels of the host image, with the goal of preserving high image quality in the watermarked image. In contrast, schemes based on the transform domain, e.g., vector quantization (VQ), block truncation coding (BTC), the discrete cosine transform (DCT), and the discrete wavelet transform (DWT), convert the host image into coefficients and use them to carry the watermark. Many of the earlier studies were performed in the spatial domain [4, 16]. In 2011, Chan [4] utilized the Hamming code for image authentication. He rearranged the bits of pixels to construct parity check bits that can be used to reconstruct the exact value of a pixel that was modified by an attacker. To further improve the quality of watermarked images, Qin et al. proposed a fragile watermarking scheme that used both image hashing and the folding operation [16]. In their scheme, the low-frequency component of the non-subsampled contourlet transform coefficients is used to encode the restoration bits, and they obtained good-quality reconstructed images.

In addition to schemes based on the spatial domain, some recent fragile watermarking studies were implemented in the transform domain [3, 5,6,7, 13, 14, 17, 20, 25]. In 2011, Chuang and Hu [5] used vector quantization in an image authentication scheme to detect tampered regions in watermarked images. In this scheme, two sets of authentication codes are used to verify the modifications and achieve accurate tamper detection. Qin et al. [17] combined traditional VQ and inpainting techniques to restore watermarked VQ-compressed images after tamper detection. The complexity of each block is calculated to determine whether the VQ technique or the inpainting technique is used to generate the restoration bits. Hu et al. [6, 7] proposed two schemes for fragile watermarking and tamper detection that operate on block truncation coding-compressed (BTC-compressed) images instead of VQ-compressed images. However, the image quality of these two schemes was unsatisfactory; the PSNR values for various images were always less than 41 dB. To increase the quality of the watermarked BTC-compressed images in [6, 7], Nguyen et al. [13] used a reference table to embed the authentication code into BTC-compressed images. This reference table is also required in the watermark extraction phase. The quality of the watermarked images increased significantly; Nguyen et al.’s scheme maintained PSNR values greater than 42 dB for various test images. Later, Tiwari et al. [20] proposed a new image authentication and tamper detection algorithm that uses the VQ mechanism. In Tiwari et al.’s scheme, to obtain high localization accuracy, a robust zero-level watermark that uses properties of the indices of the vector-quantized image is embedded into the cover image. Then, the watermark is embedded by using a modified index-key-based algorithm. However, the image quality of this scheme was limited to approximately 42 dB. To improve the image quality of watermarked images, Azeroual and Afdel [3] proposed a tamper detection scheme based on fragile watermarking and the Faber-Schauder wavelet. Their scheme obtained an average PSNR greater than 51 dB. To improve both the accuracy of tamper detection and the quality of the watermarked image, Peng et al. [14] used two identical host images, where one image contains the secret data and the other is used to embed the distortion information. In Peng et al.’s scheme, the SOS algorithm [23] is used to construct the reference matrix for embedding the watermark. However, this scheme offered limited image quality, with an average PSNR smaller than 50 dB.

In this paper, a fragile watermarking scheme for image authentication is proposed to improve the image quality provided by previous schemes and to achieve highly accurate tamper detection. Three algorithms, i.e., the DWT algorithm, the SVD algorithm, and the DCT algorithm, were considered carefully to extract the feature coefficients for carrying the authentication code. The authentication code is generated by a pseudo-random generator with a secret key K. Then, quantization index modulation (QIM) is used to embed the authentication code into the feature coefficients. To guarantee that the same authentication code is extracted in the watermark extraction, the Gram-Schmidt process is used to adjust these coefficients. The experimental results indicated that the proposed scheme achieved high visual quality of watermarked images and provided highly accurate tamper detection under different attacks, i.e., direct cropping and object insertion attacks.

The rest of this paper is organized as follows. Section 2 provides the details of the proposed fragile watermarking scheme. Our experimental results and the performance of the proposed scheme are presented in Section 3. In Section 4, our conclusions are presented to summarize the main contributions of the proposed scheme.

2 The proposed scheme

The proposed scheme consists of two main phases, i.e., the watermark embedding phase and the watermark extraction phase. These two phases are discussed in detail in Subsections 2.1 and 2.2, respectively.

2.1 Watermark embedding phase

In this phase, the host image is first partitioned into non-overlapping 8 × 8 blocks. Then, we use the 1-level DWT [25] to decompose each image block into four coefficient sub-bands, i.e., LL, HL, LH, and HH, as shown in Fig. 1. In principle, most of the energy of a grayscale image is concentrated in the low-frequency sub-band; thus, the authentication code is embedded into this sub-band to protect it against accidental modifications. However, directly altering the LL sub-band to embed the authentication code degrades the host image significantly. To avoid this shortcoming, we use both SVD and DCT to extract suitable features from the LL sub-band for embedding the authentication code.

Fig. 1 Discrete wavelet transform algorithm
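
As an illustration of the decomposition in Fig. 1, the following minimal Python sketch applies a 1-level 2-D DWT to one 8 × 8 block and inverts it. The wavelet used by the scheme is not specified in this section, so the Haar wavelet is assumed here; the function names `haar_dwt2` and `haar_idwt2` and the labeling of the HL and LH detail sub-bands are likewise illustrative assumptions.

```python
def haar_dwt2(block):
    """1-level 2-D Haar DWT (assumed wavelet) of a square block with even side.

    Returns four sub-bands (LL, HL, LH, HH), each half the input size.
    """
    half = len(block) // 2
    LL = [[0.0] * half for _ in range(half)]
    HL = [[0.0] * half for _ in range(half)]
    LH = [[0.0] * half for _ in range(half)]
    HH = [[0.0] * half for _ in range(half)]
    for r in range(half):
        for c in range(half):
            a, b = block[2 * r][2 * c], block[2 * r][2 * c + 1]
            d, e = block[2 * r + 1][2 * c], block[2 * r + 1][2 * c + 1]
            # The factor 0.5 is (1/sqrt(2))**2, i.e., the orthonormal Haar scaling.
            LL[r][c] = 0.5 * (a + b + d + e)  # low-pass in both directions
            HL[r][c] = 0.5 * (a - b + d - e)  # difference between columns
            LH[r][c] = 0.5 * (a + b - d - e)  # difference between rows
            HH[r][c] = 0.5 * (a - b - d + e)  # diagonal difference
    return LL, HL, LH, HH


def haar_idwt2(LL, HL, LH, HH):
    """Inverse of haar_dwt2: rebuilds the block from its four sub-bands."""
    half = len(LL)
    block = [[0.0] * (2 * half) for _ in range(2 * half)]
    for r in range(half):
        for c in range(half):
            ll, hl, lh, hh = LL[r][c], HL[r][c], LH[r][c], HH[r][c]
            block[2 * r][2 * c] = 0.5 * (ll + hl + lh + hh)
            block[2 * r][2 * c + 1] = 0.5 * (ll - hl + lh - hh)
            block[2 * r + 1][2 * c] = 0.5 * (ll + hl - lh - hh)
            block[2 * r + 1][2 * c + 1] = 0.5 * (ll - hl - lh + hh)
    return block


# Example: decompose an 8 x 8 block and rebuild it.
block = [[(8 * r + c) % 256 for c in range(8)] for r in range(8)]
LL, HL, LH, HH = haar_dwt2(block)
restored = haar_idwt2(LL, HL, LH, HH)
```

For an 8 × 8 block, each returned sub-band is 4 × 4, which matches the LL block size used in the SVD step.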

The LL sub-band is divided into non-overlapping blocks with the size of 4 × 4. Then, the SVD algorithm is used to decompose each block further by using Eq. (1).

$$ \mathrm{A}=\mathrm{U}\mathrm{S}{\mathrm{V}}^{\mathrm{T}}=\left[\begin{array}{cccc}\mid & \mid & \mid & \mid \\ {u}_1& {u}_2& {u}_3& {u}_4\\ \mid & \mid & \mid & \mid \end{array}\right]\left[\begin{array}{cccc}{\lambda}_1& 0& 0& 0\\ 0& {\lambda}_2& 0& 0\\ 0& 0& {\lambda}_3& 0\\ 0& 0& 0& {\lambda}_4\end{array}\right]{\left[\begin{array}{cccc}\mid & \mid & \mid & \mid \\ {v}_1& {v}_2& {v}_3& {v}_4\\ \mid & \mid & \mid & \mid \end{array}\right]}^{\mathrm{T}}, $$
(1)

where S is a diagonal matrix that contains the non-negative, real singular values λk, and U and V are the left and right unitary matrices, respectively, that consist of the singular vectors uk and vk. Since the image layer with the largest singular value, i.e., \( {\lambda}_1{u}_1{v}_1^T \), is less affected by digital signal processing (DSP) attacks, it is used to embed the authentication bits. Then, we apply a one-dimensional (1-D) DCT to the singular vectors of both matrices, i.e., U and V, by using Eq. (2):

$$ {\displaystyle \begin{array}{c}\mathrm{DCT}\left(\mathrm{i},\mathrm{j}\right)=\mathrm{C}\left(\mathrm{i}\right)\times \mathrm{C}\left(\mathrm{j}\right)\times \sum \limits_{x=1}^N\sum \limits_{y=1}^N pixel\left(x,y\right)\times \cos \left[\frac{\left(2x+1\right) i\pi}{2N}\right]\cos \left[\frac{\left(2y+1\right) j\pi}{2N}\right];\\ {}N=8; pixel\left(x,y\right)\ \mathrm{are}\ \mathrm{values}\ \mathrm{of}\ \mathrm{matrix}\ U\ or\ V\\ {}\mathrm{C}\left(\mathrm{i}\right),\mathrm{C}\left(\mathrm{j}\right)=\left\{\begin{array}{c}\sqrt{\frac{1}{N}}\kern1em for\ i,j=1\kern3.5em \\ {}\sqrt{\frac{2}{N}}\kern0.75em for\ i,j=2,3,\dots, N.\end{array}\right.\end{array}} $$
(2)
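
A minimal Python sketch of the 1-D transform applied to a singular vector is given below. It assumes the orthonormal DCT-II with the same scaling factors C(·) as Eq. (2), specialized to one dimension and to a vector of length 4 (the length of u1 and v1 for a 4 × 4 block). It uses 0-based indices, so the indices k = 2 and 3 of the text correspond to positions 1 and 2 in the code. The function names `dct_1d` and `idct_1d` are hypothetical.

```python
import math


def _scale(k, n):
    # Scaling factor C(k) of the orthonormal DCT-II (0-based index k).
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence x (assumed form, 0-based)."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[t] * math.cos((2 * t + 1) * k * math.pi / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * coeffs[k] * math.cos((2 * t + 1) * k * math.pi / (2 * n))
            for k in range(n)
        )
        for t in range(n)
    ]


# Example: transform a unit-length vector and recover it.
u1 = [0.5, 0.5, 0.5, 0.5]
dct_u1 = dct_1d(u1)
u1_back = idct_1d(dct_u1)
```

Because the transform is orthonormal, modifying a single coefficient and applying `idct_1d` changes the vector by the same Euclidean amount as the change made to that coefficient.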

The authentication code is embedded by modifying one of the DCT coefficients \( {DCT}_{u_1}(k)\ or\ {DCT}_{v_1}(k) \) in the middle index range, e.g., k = 2 or 3, by using the QIM technique. With these two values of k, an embedding set E of four elements, i.e.,\( {DCT}_{u_1}(2),{DCT}_{v_1}(2),{DCT}_{u_1}(3)\ \mathrm{and}\ {DCT}_{v_1}(3) \), is formed and denoted as E = (E0, E1, E2, E3). To improve the security of the proposed scheme, one more secret key SK is used to seed a pseudo-random number generator R. Then, one element Epos is selected from the embedding set E to carry the authentication code, where pos = Ri mod 4 and Ri is the i-th output of R. Because the selected position cannot be predicted without SK, the security of the proposed scheme is strengthened. Figure 2 shows the main processes of the proposed watermark embedding phase.

Fig. 2 Main processes of the watermark embedding phase
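
The key-driven choice of the embedding position, pos = Ri mod 4, can be sketched as follows. This is only an illustration under stated assumptions: Python's `random.Random` (a Mersenne Twister, which is not a cryptographically secure generator) stands in for the unspecified generator R, the key SK is taken to be an integer seed, and the helper name `select_positions` is hypothetical.

```python
import random

# The four candidate coefficients E = (E0, E1, E2, E3) named in the text.
EMBEDDING_SET = ("DCT_u1(2)", "DCT_v1(2)", "DCT_u1(3)", "DCT_v1(3)")


def select_positions(secret_key_sk, count):
    """Return `count` indices pos = R_i mod 4 driven by the key SK.

    Assumption: random.Random seeded with SK plays the role of R.
    """
    rng = random.Random(secret_key_sk)
    return [rng.getrandbits(32) % 4 for _ in range(count)]


# Example: choose the carrier coefficient for the first six draws.
positions = select_positions(secret_key_sk=2024, count=6)
carriers = [EMBEDDING_SET[pos] for pos in positions]
```

The same key reproduces the same sequence of positions, which is what allows the extraction phase to locate the coefficient that carries each bit.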

The input variables are a grayscale host image H and a secret key K. The output is a watermarked image. The watermark embedding phase consists of the following seven steps:

  1. Step 1:

    Partition the image H into non-overlapping blocks Bm, n(i, j) of 8 × 8 pixels, where i = 1, 2, ⋯, 8 and j = 1, 2, ⋯, 8 index the pixels within a block; a 512 × 512 host image, for example, yields 64 × 64 such blocks. The authentication code W in binary form is generated by a pseudo-random number generator seeded with the secret key K.

  2. Step 2:

    Apply the 1-level 2-D DWT to each block to generate four sub-bands, i.e., LL, LH, HL, and HH. Then, use Eq. (1) to decompose the 4 × 4 low-frequency sub-band LL of each block further by the SVD algorithm, thereby obtaining two matrices, i.e., U and V.

  3. Step 3:

    Apply the 1-D DCT to u1 and v1 to get two DCT sequences, i.e., \( {DCT}_{u_1}(k)\ and\ {DCT}_{v_1}(k) \).

  4. Step 4:

    Use the QIM technique to embed the authentication code bit wb ∈ {0, 1} into the coefficient \( {DCT}_{u_1}(k)\ or\ {DCT}_{v_1}(k) \), where k = 2 or 3:

$$ {\displaystyle \begin{array}{c}\hat{DCT_{u_1}(k)}=2\left\lfloor \frac{DCT_{u_1}(k)}{2}\right\rfloor +{w}_b,\\ {}\hat{DCT_{v_1}(k)}=2\left\lfloor \frac{DCT_{v_1}(k)}{2}\right\rfloor +{w}_b.\end{array}} $$
(3)
  5. Step 5:

    Adjust the singular vectors as follows. After embedding, the inverse DCT is used to obtain two new singular vectors, \( \hat{u_1}\ and\ \hat{v_1} \). Since the vectors u1 and v1 are modified to \( \hat{u_1}\ \mathrm{and}\kern0.5em \hat{v_1} \), the orthogonal relationship within the two matrices U and V is also altered. To ensure that \( \hat{u_1}\ and\ \hat{v_1} \) are recovered exactly so that the correct embedded authentication code can be determined, the Gram-Schmidt algorithm is applied to the set {\( \hat{u_1},{u}_2,{u}_3,{u}_4 \)}, as defined in Eq. (4).

(4)

Because the orthonormal property must also hold for matrix V, Eq. (4) is applied to {\( \hat{v_1},{v}_2,{v}_3,{v}_4 \)} as well. Before decomposing by SVD, let A be sub-band LL,

Along with the modification of matrices U and V, the singular values in matrix S must be modified as follows:

(5)
(6)

where ||.|| is the Frobenius norm. After adjustment with the values of

, the reconstructed image block is computed by Eq. (7):

(7)
  6. Step 6:

    Replace each block in LL by the value of A′.

  7. Step 7:

    Obtain the watermarked image by using the 1-level inverse DWT with the new LL sub-band, and the three original sub-bands, i.e., LH, HL, and HH.
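
The orthonormalization in Step 5 can be illustrated by the following Python sketch. It assumes the standard Gram-Schmidt process in its modified (numerically steadier) form and is not a transcription of Eq. (4); the function name `gram_schmidt` is hypothetical. The modified vector is placed first, so it is only rescaled to unit length, while the remaining vectors are adjusted to be orthogonal to it.

```python
import math


def gram_schmidt(vectors):
    """Orthonormalize `vectors` in order (modified Gram-Schmidt, assumed form).

    The first vector keeps its direction; each later vector has its
    components along the earlier ones removed and is then normalized.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            proj = sum(a * b for a, b in zip(w, q))
            w = [a - proj * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([a / norm for a in w])
    return basis


# Example: a perturbed first vector followed by three original vectors.
u_hat_1 = [0.6, 0.5, 0.5, 0.4]  # stands in for the modified vector
u2 = [0.5, -0.5, 0.5, -0.5]
u3 = [0.5, 0.5, -0.5, -0.5]
u4 = [0.5, -0.5, -0.5, 0.5]
orthonormal = gram_schmidt([u_hat_1, u2, u3, u4])
```

The same function would be applied to the set that starts with the modified right singular vector.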

2.2 Watermark extraction phase

When an image that is suspected of having been tampered with is published on the Internet, the watermark extraction algorithm is used to detect whether any regions of this image have been altered. The watermark extraction algorithm is designed to be as simple as possible and mirrors the watermark embedding algorithm. Figure 3 shows the flowchart of the proposed watermark extraction phase.

Fig. 3 Flowchart of the watermark extraction phase

In the watermark extraction algorithm, the first three steps are performed in the same manner as in the watermark embedding algorithm. Then, the two DCT coefficients, i.e., \( \hat{DCT_{u_1}(k)}\ \mathrm{and}\ \hat{DCT_{v_1}(k)} \), are determined, and the authentication code bits wb are extracted by Eq. (8):

$$ {w}_b=\hat{DCT_i} \bmod 2, $$
(8)

where \( \hat{DCT_i} \) denotes the DCT coefficient that is selected by the secret key SK, as was done in the watermark embedding phase.
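
Equations (3) and (8) form an embed/extract pair, which the following Python sketch illustrates on plain numbers. The function names are hypothetical, and rounding to the nearest integer before taking the remainder is an assumption added here to absorb floating-point noise; it is not part of Eq. (8).

```python
import math


def qim_embed(coefficient, bit):
    """Eq. (3): quantize to an even integer, then add the bit (0 or 1)."""
    return 2 * math.floor(coefficient / 2) + bit


def qim_extract(coefficient):
    """Eq. (8): the bit is the parity of the (rounded) coefficient.

    Assumption: rounding guards against small floating-point errors.
    """
    return int(round(coefficient)) % 2


# Example: embed both bit values into a positive and a negative coefficient.
for value in (5.7, -3.2):
    for bit in (0, 1):
        marked = qim_embed(value, bit)
        recovered = qim_extract(marked)  # equals `bit`
```

Because Eq. (3) forces the parity of the coefficient to equal the embedded bit, any change that shifts the rounded coefficient by an odd amount flips the extracted bit.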

After obtaining the extracted authentication code bit wb, we regenerate the original authentication code W from the secret key K to detect whether the watermarked image block has been tampered with. Then, the corresponding authentication code bit w of W is read and compared with the extracted bit wb. If wb = w, the current block is marked as a legal block, meaning that no modification has been detected in the current block; otherwise, it is marked as an illegal block.

The process described above is repeated until every block of the image has been checked. The detected image is obtained by gathering all the legal and illegal blocks.
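
The block-wise comparison can be sketched in Python as follows. The sketch assumes that Python's `random.Random`, seeded with an integer key K, stands in for the unspecified pseudo-random generator and that one bit is checked per block; the names `generate_code` and `detect_tampering` are hypothetical.

```python
import random


def generate_code(secret_key_k, num_blocks):
    """Regenerate the binary authentication code W from the key K.

    Assumption: random.Random plays the role of the scheme's generator.
    """
    rng = random.Random(secret_key_k)
    return [rng.getrandbits(1) for _ in range(num_blocks)]


def detect_tampering(extracted_bits, secret_key_k):
    """Return one flag per block: True marks an illegal (mismatching) block."""
    expected = generate_code(secret_key_k, len(extracted_bits))
    return [w != wb for w, wb in zip(expected, extracted_bits)]


# Example: flip the bit extracted from block 3 and run the check.
key = 12345
extracted = generate_code(key, 16)
extracted[3] ^= 1
flags = detect_tampering(extracted, key)
```

With a single bit per block, an altered block may still match by chance, which is one reason a subsequent refinement of the detected image is useful.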

The detected image is then processed further by a refinement method to enhance its accuracy. Specifically, the refinement method is applied to the detected image several times. Each time, the refinement method divides the detected image into non-overlapping 3 × 3 blocks. Then, each white block is tested to determine whether its color should be changed to black. Fig. 4 shows the four conditions, i.e., Con1, Con2, Con3, and Con4, that may occur in this refinement method, with B being the current white test block. Taking Con1 in Fig. 4a as an example, if the top-left and bottom-right adjacent blocks of B are black, the color of B is changed to black. The refined detected image is obtained once all of the white blocks in the detected image have been tested.

Fig. 4 Four conditions of the refinement method. a Con1, b Con2, c Con3, d Con4
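
One refinement pass can be sketched in Python as follows. Several points are assumptions made for illustration: black is taken to mark an illegal block, the 3 × 3 neighbourhood of each white block is examined directly, and only Con1 (top-left and bottom-right neighbours) is built in because it is the only condition spelled out in the text. Further neighbour pairs can be passed in by the caller; the pair used in the example below is hypothetical and is not taken from Fig. 4.

```python
# Con1 as described in the text: offsets (row, column) of the two neighbours.
CON1 = ((-1, -1), (1, 1))


def refine(mask, conditions=(CON1,)):
    """One refinement pass over a detected image.

    mask[r][c] is True for a black block and False for a white block.
    A white block turns black when both neighbours of any condition are black.
    """
    rows, cols = len(mask), len(mask[0])
    refined = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                continue  # only white blocks are tested
            for (dr1, dc1), (dr2, dc2) in conditions:
                r1, c1, r2, c2 = r + dr1, c + dc1, r + dr2, c + dc2
                inside = 0 <= r1 < rows and 0 <= c1 < cols and 0 <= r2 < rows and 0 <= c2 < cols
                if inside and mask[r1][c1] and mask[r2][c2]:
                    refined[r][c] = True
                    break
    return refined


# Example: the centre block has black top-left and bottom-right neighbours.
detected = [
    [True, False, False],
    [False, False, False],
    [False, False, True],
]
once = refine(detected)
# A hypothetical extra condition (top and bottom neighbours), for illustration.
HYPOTHETICAL_PAIR = ((-1, 0), (1, 0))
twice = refine(once, conditions=(CON1, HYPOTHETICAL_PAIR))
```

Each pass reads from the input mask and writes to a copy, so the result does not depend on the order in which the blocks are visited.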

3 Experimental results

In this section, five 512 × 512 images, i.e., Lena, Peppers, Boat, Barbara, and Airplane, serve as the host images to illustrate the performance of the proposed scheme.

The peak signal-to-noise ratio (PSNR) was used to measure the image quality of watermarked images and PSNR is calculated as

$$ PSNR=10{\log}_{10}\left(\frac{255^2}{\left(1/M\right)\sum \limits_{i=1}^M{\left({I}_i-{I}_i^{\prime}\right)}^2}\right), $$
(9)

where M is the number of pixels in the watermarked image, and Ii and I′i are the pixel values of the original image and the watermarked image, respectively.
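
As a worked illustration of Eq. (9), the following Python sketch computes the PSNR of two 8-bit images given as flat lists of pixel values. The function name `psnr` and the convention of returning infinity for identical images are assumptions made for this example.

```python
import math


def psnr(original, watermarked):
    """PSNR in dB between two equally sized 8-bit images (flat pixel lists)."""
    if len(original) != len(watermarked):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images: the ratio in Eq. (9) is unbounded
    return 10 * math.log10(255 ** 2 / mse)


# Example: every pixel differs by one gray level, so the MSE equals 1.
value = psnr([10, 20, 30, 40], [11, 21, 31, 41])
```

With an MSE of 1, Eq. (9) reduces to 20 log10(255), which is the value returned in the example.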

Figure 5 shows the five test images before and after embedding the authentication code. The watermarked images of the proposed scheme have high quality; the PSNR values were always greater than 84 dB for these images.

Fig. 5 Illustrations of images before and after the authentication code was embedded

To evaluate the proposed scheme’s accuracy of tamper detection, the watermarked images were subjected to two different attacks, i.e., a cropping attack and an object insertion attack. The results are shown in the following two subsections.

3.1 Results of direct cropping

To simulate this attack, a certain region was deleted from the watermarked image. Figure 6 shows examples of the direct cropping attack on three watermarked images, i.e., Lena, Airplane, and Peppers. Figures 6a, c, and e depict enlarged parts of the three watermarked images with cropped blocks of sizes 8 × 8, 16 × 16, and 32 × 32, respectively. The proposed scheme succeeded in detecting this type of attack: the detected images are shown in Figs. 6b, d, and f, and the normalized correlation coefficient (NCC), defined in Eq. (10), was always greater than 0.99. Even when the cropped block is very small (Fig. 6a), the proposed scheme showed highly precise localization, as shown in Fig. 6b.

Fig. 6 Illustration of direct cropping attack and tamper detection results

3.2 Results of object insertion

A common attack is the cut-and-paste attack, which is also referred to as the object insertion attack. In this attack, the attacker intentionally cuts a region from another image and pastes it somewhere in the watermarked image. Figure 7 shows the results of object insertion attacks with tampered objects of various sizes. Figures 7a and b illustrate the case in which relatively small objects (a pepper or a flower) were inserted into the watermarked images Peppers and Lena. These insertions resulted in a normalized correlation coefficient (NCC) larger than 0.85. NCC is used to measure the similarity between the refined detected image and the tampered image, and it is defined by Eq. (10):

$$ \mathrm{NCC}=\frac{\sum_{i=1}^H{\sum}_{j=1}^W\left[{I}_{i,j}-{I}_{mean}\right]\left[{D}_{i,j}-{D}_{mean}\right]}{\sqrt{\left({\sum}_{i=1}^H{\sum}_{j=1}^W{\left[{I}_{i,j}-{I}_{mean}\right]}^2\right)\left({\sum}_{i=1}^H{\sum}_{j=1}^W{\left[{D}_{i,j}-{D}_{mean}\right]}^2\right)}}, $$
(10)

where I denotes the tampered object in binary form, Imean is the mean value of all of the pixels in image I, D is the refined detected image, and Dmean is the mean value of all of the pixels in image D. In Eq. (10), H and W denote the height and width of the images.
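
Eq. (10) can be evaluated with the short Python sketch below, which takes two binary images of equal size as nested lists. The function name `ncc` is hypothetical, and raising an error when one image is constant (where the denominator of Eq. (10) is zero) is a choice made for this example.

```python
import math


def ncc(tampered_mask, detected_mask):
    """Normalized correlation coefficient of Eq. (10) for two equal-size images."""
    flat_i = [p for row in tampered_mask for p in row]
    flat_d = [p for row in detected_mask for p in row]
    if len(flat_i) != len(flat_d):
        raise ValueError("images must have the same size")
    mean_i = sum(flat_i) / len(flat_i)
    mean_d = sum(flat_d) / len(flat_d)
    numerator = sum((a - mean_i) * (b - mean_d) for a, b in zip(flat_i, flat_d))
    energy_i = sum((a - mean_i) ** 2 for a in flat_i)
    energy_d = sum((b - mean_d) ** 2 for b in flat_d)
    denominator = math.sqrt(energy_i * energy_d)
    if denominator == 0:
        raise ValueError("NCC is undefined when an image is constant")
    return numerator / denominator


# Example: a detected mask that misses one pixel of the tampered region.
tampered = [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
detected = [[0, 0, 1], [0, 1, 0], [0, 0, 0]]
score = ncc(tampered, detected)
```

The value lies between −1 and 1 and equals 1 when the detected mask coincides with the tampered region.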

Fig. 7 Illustration of object insertion attack and tamper detection results

Figure 7c shows the case in which a medium-sized object was added, and Fig. 7d presents the case in which a large object was added. In all of these cases, the proposed scheme provided a high value of NCC. The proposed scheme was able to detect all inserted objects with high tamper detection accuracy, as demonstrated in these figures.

To demonstrate the superiority of the proposed scheme, we compared it with recent image authentication schemes [14, 18,19,20]. In Table 1, in addition to the five above-mentioned images, five more general grayscale test images, i.e., Baboon, GoldHill, House, Sailboat, and Elaine, are tested to further evaluate the effectiveness of the proposed scheme. The first column of Table 1 lists the technique applied by each fragile or semi-fragile watermarking algorithm. The remaining columns compare performance evaluation metrics, i.e., the average PSNR of the watermarked images, the similarity between the refined detected image and the tampered image (NCC), and the localization capacity. The proposed scheme obtained an average PSNR of 84 dB, an NCC of 0.893, and very high efficiency of tamper detection and localization. The other schemes achieved good PSNR, but the average PSNR obtained by the proposed scheme is dramatically higher than those of the other four schemes. All algorithms are able to locate tampering, and Tiwari et al.’s scheme [20] obtained the highest average NCC among the five schemes. However, the average NCC of the proposed scheme is still better than those of the other three schemes [14, 18, 19]. Moreover, the average execution time was measured, with all computing performed on a computer with an Intel i3 processor @ 1.7 GHz, 4 GB of DDR RAM, and the Windows 10 OS. Table 1 also presents the average execution times required by our scheme and the four existing schemes [14, 18,19,20]. Tiwari et al.’s scheme [20] requires the highest computation time, followed by the proposed scheme, Shojanazeri et al.’s scheme [18], Singh et al.’s scheme [19], and then Peng et al.’s scheme [14]. Tiwari et al.’s scheme is the slowest because its two-stage VQ coding takes much more time than the operations of the other schemes. Overall, however, the proposed algorithm offers advantages in terms of imperceptibility, exact extraction of the embedded watermark, and high accuracy of tamper detection.

Table 1 Performance comparison of the proposed scheme with four recent image authentication schemes [14, 18,19,20]

4 Conclusions

A novel fragile watermarking scheme for image authentication was proposed in this paper. The advantages of the DWT, SVD, and DCT algorithms were exploited to select the feature coefficients. Then, the authentication code was generated and embedded in these features using the QIM technique. The Gram-Schmidt process was used to adjust these coefficients to ensure that the embedded authentication code was extracted exactly. The experimental results indicated that the proposed scheme provided highly accurate tamper detection under different attacks, i.e., direct cropping and object insertion attacks. In addition, the proposed scheme produced watermarked images with very high visual quality while maintaining the high accuracy of tamper detection.