1 Introduction

Today, owing to powerful computers, advanced photo-editing software packages and high resolution capturing devices, digital images can be easily manipulated and edited [1, 15]. An image maybe changed inadvertently or intentionally when it spread through the Internet. Digital image forensics aims to verify their authenticity. Image authentication solution is classified into two types: active methods and passive methods [1]. In contrast with the active methods, the passive methods needn’t additional information and used more widely.

There are many ways to falsify an image. Copy-move forgery is a common and no perceptible method of image tampering. As shown in Fig. 1, it means that parts of an image are copied and pasted another part of the same image [5]. As a result, some information of the image will be appended or hidden. Copy-move forgery detection is one of the passive methods.

Fig. 1
figure 1

An example of copy-move forgery from CASIA database

Various methods have been proposed to detect copy-move forgery. Most of them follow a common pipeline [4], as shown in Fig. 2. Copy-move forgery detection methods are categorized either keypoint-based methods or block-based methods [4], which have their respective pros and cons. For feature extraction, block-based methods used different feature vectors to represent the sub-blocks, like images’ brightness [14], Polar Sine Transform (PST) [9], Zernike moments [11], Discrete Cosine Transform (DCT) [2] and so on. However, for matching, the block-based matching algorithm often fixed (e.g., [2, 9,10,11, 14]). The step of algorithm as follows:

Fig. 2
figure 2

General pipeline of copy-move detection methods [4]

Step1. Lexicographically sort the matrix that is formed by feature vectors.

Step2. Calculate the Euclidean distance between adjacent feature vectors in the matrix. If the results are less than a preset threshold value V e u r , two sub-blocks represented by the adjacent feature vectors will classify the matching sub-blocks.

We call above-mentioned algorithm the classical block-based matching algorithm (CBMA). Figure 3 is an example of CBMA. S is the matrix that is formed by feature vectors. After lexicographically sorting matrix S, we can get S . For S1 (the first row of matrix S), we respectively compute the Euclidean distance between S1 and S2,S1 and S3, S1 and S4. The Euclidean distance between S1 and S2 is the smallest. But S1 and S2 isn’t adjoining in the matrix S . As a result, CBMA will most probably miss many matching sub-blocks.

Fig. 3
figure 3

An example of classical block-based matching algorithm

In order to solve this problem, we devise an improved block-based matching algorithm (IBMA). IBMA can pick out all matched sub-blocks. Either CBMA or IBMA is one step of copy-move forgery detection methods. Compared with CBMA, IBMA is able to detect more matched sub-blocks. Hence, IBMA let copy-move forgery detection methods more robust. Besides, the threshold V e u r is very important, because it determines whether two sub-blocks are matched. We put forward a scheme to get the threshold V e u r .

2 Methods

Flow diagrams of the proposed algorithm can be seen from Fig. 4. First, if the tested image is a RGB color image, we can turn it to grayscale. Second, the input image is divided into sub-blocks. Third, each sub-block will become a feature vector. The means of feature extraction are brightness [14], PST [9], Zernike moments [11], DCT [7] and Hu moments [6]. Fourth, we respectively apply CBMA and IBMA to matching. Finally, the filtering method proposed by [8] is used to remove wrong matched sub-blocks. The whole algorithm aims to compare the different results between CBMA and IBMA.

Fig. 4
figure 4

The algorithm framework of our method

2.1 Feature extraction

If the input image is a RGB color image, we can turn it to grayscale by (1).

$$ I=0.228R+0.587G+0.114B $$
(1)

We assume the size of tested image is M × N pixels. Then we divide it into overlapped sub-blocks of b × b pixels. The number of sub-blocks is N a l l .

$$ {{N}_{all}}=(M-b+1)\times (N-b+1) $$
(2)

To better compare CBMA and IBMA, we take five feature vectors: brightness [14], PST [9], Zernike moments [11], DCT [7] and Hu moments [6]. We extract each feature separately rather than concatenate five different existing features. For every feature, we use CBMA and IBMA to make experiments. It must be added that the quantization table of [7] become improper because the size of sub-blocks is changed. We use the quantization table of [5], and the quantization factor is 80.

2.2 Matching

2.2.1 Classical block-based matching algorithm

The purpose of matching algorithm is to select matched sub-blocks. If a pair of sub-blocks is matching, perhaps they are the blocks in either copy region or move region. After extracting feature vectors of dimension n, we can obtain the matrix M of \({{N}_{all}}\times \left (n+2 \right )\). For the matrix M, the n-dimensional column vector at the head is the feature vectors and the 2-dimensional column vector at the end is the sub-blocks’ coordinate of top left corner. In the chapter Sections 2.2.1 and 2.2.2 of this article, we define the top right corner of matrix notation represents the column of the matrix and the lower right corner represents the row of the matrix. For example, M12 means the first row and the second column of matrix M.

The steps of CBMA are shown as follows. First, lexicographically sort the matrix M that is formed by feature vectors and position coordinates of sub-blocks. Then we can obtain the matrix Q. By this way, similar feature vectors will be adjacent as far as possible. Second, we calculate the Euclidean distance between adjoining two feature vectors in the matrix Q like (3). If the results are less than the threshold V e u r , we calculate the distance of the corresponding sub-blocks’ position coordinates like (4). If the results are more than the threshold V d i s , the position coordinates of sub-blocks will be persisted to matrix T. The threshold V e u r ensures that the two sub-blocks are resembled. On account of the copy region and move region have a certain distance, we set the threshold V d i s .

We assume \(Q1=\left [ {{I}^{1}},{{I}^{2}},{\ldots } ,{{I}^{n}} \right ]\) and \(Q2=\left [ {{L}^{1}},{{L}^{2}},{\ldots } ,{{L}^{n}} \right ]\). The Euclidean distance between Q1 and Q2 can be calculated as (3).

$$ SIM(Q1,Q2)=\sqrt{\sum\limits_{r=1}^{n}{{{({{I}^{r}}-{{L}^{r}})}^{2}}}} $$
(3)

We assume \(Q3=\left [ x,y \right ]\) and \(Q4=\left [ x^{\prime },y^{\prime } \right ]\). The distance between Q3 and Q4 can be calculated as (4).

$$ DIS(Q3,Q4)=\sqrt{{{(x-x^{\prime})}^{2}}+{{(y-y^{\prime})}^{2}}} $$
(4)
figure e

2.2.2 Improved block-based matching algorithm

The signs of many feature vectors (e.g., DCT [7] and Hu moments [6]) are positive and negative. We deem that only the signs of the feature vectors are the same can the sub-blocks probably become matched sub-blocks. In other words, if not all values of the feature vectors is either more than zero or less than zero, we firstly classify the matrix M based on the signs of the feature vectors. Then we continue the follow steps of IBMA. Furthermore, for each sub-block, if the number of matching is more than M a x , we abandon them. We set M a x to 10. Namely, if a sub-block matches lots of other sub-blocks, the sub-block is unqualified.

Now we provide the approach of IBMA. Firstly, we can count the sum of the feature vectors in the matrix M and put them in the column vector D. Then we can have a new matrix P by combining M with D. Secondly, we gain Q after sorting the matrix P based on the first column. Thirdly, every row of the matrix Q will search the following rows until the difference in the first column is larger than a threshold value \(\sqrt {n}\times {{V}_{eur}}\). If the Euclidean distance (calculate by (3)) of the two rows is less than V e u r and the distance (calculate by (4)) is more than V d i s , the position coordinates of sub-blocks will keep in matrix T.

The key of IBMA is the stopping condition of searching. Without the stopping condition, every row of the matrix Q will search the following rows until the end. Moreover, the threshold \(\sqrt {n}\times {{V}_{eur}}\) is based on the threshold V e u r .

We suppose \(Q1=\left [ {{I}^{1}},{{I}^{2}},{\ldots } ,{{I}^{n}} \right ]\) and \(Q2=\left [ {{L}^{1}},{{L}^{2}},\ldots ,{{L}^{n}} \right ]\). The threshold V e u r is a constant and more than zero.

Condition \(\mathcal {A}\,:\,\left | \left ({{I}^{1}}+{{I}^{2}}+{\ldots } +{{I}^{n}} \right )-\left ({{L}^{1}}+{{L}^{2}}+{\ldots } +{{L}^{n}} \right ) \right |>\sqrt {n}\times {{V}_{eur}}\).

Condition \(\mathcal {B}\,:\,SIM\left (Q1,Q2 \right )>{{V}_{eur}}\).

We have proved that condition \(\mathcal {A}\) can deduce condition \(\mathcal {B}\). The details of proof procedure are in Appendix. The proof mainly used average inequality. For two feature vectors, Q1 and Q2, if the difference value between the sum of Q1 and the sum of Q2 is more than \(\sqrt {n}\times {{V}_{eur}}\), we can deduce that the Euclidean distance between Q1 and Q2 is more than V e u r . So the condition \(\mathcal {A}\) can turn into stopping condition of searching.

figure f

2.2.3 The calculation method of the threshold V e u r

The threshold V e u r is important because it determines whether two sub-blocks are matched or not. The value of the threshold V e u r must be suitable. Most of copy-move forgery detection methods of block-based obtain the threshold V e u r through experience. In view of this problem, we put forward a scheme to compute the threshold V e u r . After a tampering image was distorted by Gaussian noise, salt-pepper noise, or JPEG compression, the corresponding sub-blocks in copy region and move region may be changed. Our method gains the threshold V e u r by calculating the change.

If we should detect an image database, the details of our scheme are as follows. First, 10 images are randomly picked out from the database. For each image, we copy a region (size is 70 × 70) of the image and paste the region to another part of the same image. At the same time, we record the locations of the two regions. Second, we respectively add Gaussian noise (variance is 0.001), salt-pepper noise (parameter is 0.003), and JPEG compression (quality factor is 80) to the 10 images. Then we can get 30 images. Third, we take 10 images of JPEG compression as an example. For every JPEG tampered image, we randomly select 100 sub-blocks (size is b × b) from the middle part of copy region. The size of the middle part is 60 × 60. After that, we calculate the Euclidean distance between the corresponding sub-blocks of copy region and move region. And e i (i = 1,2,...,100) represent the results of Euclidean distance. Then we sort e i (i = 1,2,...,100) in ascending order. The front 50 e i (i = 1,2,...,50) is remained to compute their average value. And E j (j = 1,2,...,10) represent the results of average values from 10 JPEG images. We can sort E j (j = 1,2,...,10) in ascending order. The front 5 E j (j = 1,2,...,5) is remained to get their average value. We can receive the final result V e u r1 from the 10 JPEG tampered images. By the same way, we can obtain the result V e u r2 of 10 tampered images with Gaussian noise and the result V e u r3 of 10 tampered images with salt-pepper noise. Finally, we can get the threshold V e u r through reckoning the average value of V e u r1, V e u r2 and V e u r3.

2.3 Filtering

The aim of filtering algorithm is to reduce the probability of false matches. Our method of filtering is based on the filtering algorithm of [8].

As shown in Fig. 5, for every matched sub-blocks (e.g., a and a ), we can achieve the number of adjacent matched sub-blocks (e.g., b and b , c and c ). Take a pair of matched sub-blocks, a and a , as an example. If a sub-block is located in the circular area whose center is a and the matched sub-block is situated in the circular area that the center is a , we define the pair of matched sub-blocks is an adjacent matched sub-block of a and a . We set the radius to R r e . The threshold of the number of adjacent matched sub-blocks is N r e . For a pair of matched sub-blocks (e.g., a and a ), when the number of adjacent matched sub-blocks is bigger than N r e , we keep the pair of matched sub-blocks (a and a ).

Fig. 5
figure 5

Adjacent matched sub-blocks

As shown in Fig. 6, for a and a , we define angle \({{\beta }_{k}}\left (k=1,{\ldots } ,{{N}_{re}} \right )\) as the included angle between a and the adjacent sub-block of a. The value of β k is range from − π to π. And we define \({{\beta }_{k}}^{\prime }\left (k=1,{\ldots } ,{{N}_{re}} \right )\) as the included angle between a and the adjacent sub-block of a . The value of β k is range from − π to π. Then we define:\({{\theta }_{k}}=\left ({{\beta }_{k}}-{{\beta }_{k}}^{\prime } \right )\bmod \left (2\pi \right )\left (k=1,{\ldots } ,{{N}_{re}} \right )\). Ideally, 𝜃 k equal to the rotating angle α of move region. We calculate the variance of the assemblage \(\left [ {{\theta }_{1}},{{\theta }_{2}},{\ldots } ,{{\theta }_{{{N}_{re}}}} \right ]\). If the variance is less than φ r e , we remain the pair of matched sub-blocks (a and a ).

Fig. 6
figure 6

The angle of adjacent matched sub-blocks

In summary, for a pair of matched sub-blocks, if the number of adjacent matched sub-blocks is bigger than N r e and the variance of the assemblage \(\left [ {{\theta }_{1}},{{\theta }_{2}},{\ldots } ,{{\theta }_{{{N}_{re}}}} \right ]\) is less than φ r e , we hold it.

3 Experimental results and analysis

The experiments were performed on a computer with Intel core i5 CPU (3.20 GHz) and 4 GB RAM. The software environment of tests is Matlab R2012a. The images that we used in the experiments were formed by two parts. The one was 100 images from UCID [13]. The size of the images are 512 × 384. In real life, most people tamper with the images deliberately by Photoshop. In order to simulate the situation, we make use of Photoshop CS3 to tamper images. The other was 100 images of CASIA V2.0 [3]. The size of the images range from 384 × 256 to 528 × 318. All the images of CASIA V2.0 are formed by authentic images and tampered images. The tampered ways of CASIA V2.0 are splicing and copy-move. We randomly chose 100 images that the tampered method was copy-move.

3.1 Thresholds setting

The thresholds of our experiments were shown as Table 1. The threshold V e u r of UCID and CASIA V2.0 were obtained by the method of chapter Section 2.2.3.

Table 1 Thresholds setting

3.2 UCID data

For the 100 images of UCID, we randomly copied an area which size is 70 × 70 from an image and pasted it to another part of the same image. Moreover, the tampered images were added with Gaussian noise, salt-pepper noise, and JPEG compression. We detected these images by CBMA and IBMA to compare the results.

In order to display the results objectively, we adopted the True Positive Rate (TPR) and False Positive Rate (FPR) of [12].

$$ TPR=\frac{\left| TP \right|}{\left| {{R}_{clone}} \right|} $$
(5)
$$ FPR=\frac{\left| FP \right|}{\left| {{R}_{nomal}} \right|} $$
(6)

In (5), \(\left | TP \right |\) represents the number of pixels correctly classified as tampered in the copy region and the move region. And \(\left | {{R}_{clone}} \right |\) represents the number of real tampered pixels in the copy regions and the move regions. In (6), \(\left | FP \right |\) represents the number of pixels wrongly classified as tampered in the normal region that isn’t tampered. And \(\left | {{R}_{normal}} \right |\) represents the number of real normal pixels in the region that do not belong to the copy regions or move regions. In general, the higher are the TPR and the lower are the FPR, the better are the copy-move detection methods.

Table 2 shows the tampered images’ TPR and FPR of CBMA and IBMA. The tampered images without post-processing are from UCID. From Table 2, we can see that the TPR of CBMA is slightly higher than IBMA and the FPR of CBMA is slightly lower than IBMA. In short, CBMA performs slightly better than IBMA. IBMA can detect all matched sub-blocks that contain false matches and right matches. Because more false matches are detected, a part of right matches are deleted through filtering algorithm in chapter Section 2.3.

Table 2 The TPR and FPR of UCID images without post-processing

Table 3 is the tampered images’ TPR of CBMA and IBMA. The tampered images had been handled by JPEG compression. The JPEG quality factors (QF) are 60, 70, 80 and 90. When the QF s are less than 90, CBMA almost can’t detect the tampered regions. But the change of QF has influenced IBMA not too much. Table 4 shows the JPEG images’ FPR of CBMA and IBMA. On the whole, the FPR of CBMA is slightly lower than IBMA. If we consider the TPR, it’s acceptable.

Table 3 The TPR of UCID images with JPEG compression
Table 4 The FPR of UCID images with JPEG compression

Table 5 is the tampered images’ TPR of CBMA and IBMA. The tampered images were added with Gaussian noise. The variances (V a r ) of Gaussian noise are 0.0001, 0.0003, 0.0005 and 0.0007. With different variances, the most TPR of CBMA close to zero. However, the most TPR of IBMA are higher than 0.8. Table 6 shows the FPR of CBMA and IBMA. In general, the FPR of CBMA is slightly lower than IBMA. From the TPR and FPR, we can see that IBMA is more robust than CBMA when the tampered images are distorted by Gaussian noise.

Table 5 The TPR of UCID images with Gaussian noise
Table 6 The FPR of UCID images with Gaussian noise

After the tampered images were added with salt-pepper noise, we calculated their TPR of CBMA and IBMA, as shown in Table 7. The noise densities (d) of salt-pepper noise are 0.001, 0.002, 0.003 and 0.004. The noise densities (d) mean that the percentage of affected pixels. When the feature vectors are brightness [14], the TPR of either CBMA or IBMA are very high. For the other feature vectors, the most TPR of CBMA are close to zero with different noise densities. But the most TPR of IBMA are higher than 0.7. Table 8 shows the FPR of CBMA and IBMA. In a word, the FPR of CBMA is slightly lower than IBMA. When the tampered images are distorted by salt-pepper noise, we can see that IBMA is more robust than CBMA except for the feature vectors are brightness [14].

Table 7 The TPR of UCID images with salt-pepper noise
Table 8 The FPR of UCID images with salt-pepper noise

For an image, the running time of copy-move forgery detection methods are different when the matching algorithm is either CBMA or IBMA. As shown in Table 9, the running time of copy-move forgery detection methods using CBMA is less than using IBMA. There are two reasons to consider. First, is that the time complexity of IBMA is higher than CBMA. Second, when the matching algorithm is IBMA, the running time of filtering algorithm is more than CBMA because IBMA will detect more matched sub-blocks.

Table 9 The running time of using CBMA and IBMA

In all, although the running time of copy-move forgery detection methods using IBMA is more than using CBMA, IBMA is more robust than CBMA when the tampered images are distorted by JPEG compression Gaussian noise or salt-pepper noise.

3.3 CASIA data

We randomly got 100 images that the tampered way was copy-move. The sizes of the tampered regions are different and irregular. Some of the tampered images had multiple tampering regions. From CASIA V2.0, we can only extract the information of move regions and can’t obtain the copy regions. What’s more, the tampered images were added with Gaussian noise, salt-pepper noise, and JPEG compression. We detected these images by CBMA and IBMA to contrast their results.

Due to only have the information of move regions, we define Referenced True Positive Rate (RTPR).

$$ RTPR=\frac{\left| TP^{\prime} \right|}{\left| R_{clone}^{\prime} \right|} $$
(7)

In (7), \(\left | TP^{\prime } \right |\) represents the number of pixels correctly classified as tampered in the move regions. And \(\left | R_{clone}^{\prime } \right |\) represents the number of real tampered pixels in the move regions. Thinking from a certain degree, the higher are the RTPR, the better are the copy-move detection methods.

Table 10 shows the tampered images’ RTPR of CBMA and IBMA. The tampered images without post-processing are from CASIA V2.0. From Table 10, we can see that the performances of CBMA and IBMA are almost equal. To compare with CBMA, IBMA has some advantage when the tampered regions are irregular.

Table 10 The RTPR of CASIA images without post-processing

Table 11 is the tampered images’ RTPR of CBMA and IBMA. The tampered images had been processed by JPEG compression. The JPEG quality factors (QF) are 60, 70, 80 and 90. CBMA almost can’t detect the tampered regions when the QF s are less than 90. However, the change of QF has influenced IBMA not too much.

Table 11 The RTPR of CASIA images with JPEG compression

Table 12 is the tampered images’ RTPR of CBMA and IBMA. The tampered images from CASIA V2.0 were added with Gaussian noise. The variances (V a r ) of Gaussian noise are 0.0001, 0.0003, 0.0005 and 0.0007. With different variances, the most RTPR of CBMA nearly equal zero. But the most TPR of IBMA are higher than 0.7. From the RTPR, we can think that IBMA is more robust than CBMA when the tampered images are distorted by Gaussian noise.

Table 12 The RTPR of CASIA images with Gaussian noise

After the tampered images from CASIA V2.0 were added with salt-pepper noise, we calculated their RTPR of CBMA and IBMA, as shown in Table 13. The noise densities (d) of salt-pepper noise are 0.001, 0.002, 0.003 and 0.004. When the feature vectors are brightness [14], the TPR of either CBMA or IBMA are very high. For the other feature vectors, CBMA almost can’t detect the tampered regions when the noise densities are more than 0.001. And the change of noise densities has influenced IBMA not too much. When the tampered images are distorted by salt-pepper noise, we can prefer that IBMA is more robust than CBMA except for the feature vectors are brightness [14].

Table 13 The RTPR of CASIA images with salt-pepper noise

In brief, although tampered regions are different and irregular, IBMA is more robust than CBMA when the tampered images are distorted by JPEG compression, Gaussian noise or salt-pepper noise.

From all experiments of UCID and CASIA V2.0, we can get the conclusion that IBMA is more robust than CBMA. For every sub-block, we can search their all matched sub-blocks by using IBMA. CBMA only search the sub-block represented by adjacent feature vector after lexicographically sorting the feature vectors’ matrix. When an image was distorted by Gaussian noise, salt-pepper noise, or JPEG compression, the matched sub-blocks probably are not adjacent in the feature vectors’ matrix. This leads to CBMA become invalid.

4 Conclusions

We have two contributions in copy-move forgery detection. Firstly, IBMA can detect all matched sub-blocks. The reason is that IBMA is based on strict mathematical proofs. In the step of matching, most existing copy-move forgery detection methods can’t detect all matched sub-blocks, such as [9, 11, 14] and [2]. Secondly, the method we propose to calculate the threshold V e u r is reasonable. For different image databases and feature vectors, we can obtain relatively high TPR or RTPR.