1 Introduction

Owing to highly sophisticated software for editing digital images, it has become easy to modify images without leaving any perceptible traces. Copy-move forgery is the most commonly used technique, in which a part of the image is copied and then pasted into another part of the same image. Copy-move forgery detection (CMFD) algorithms therefore aim at detecting identical or similar regions in forged images. Figure 1 shows an example of copy-move forgery, where the pocket of the child’s shirt is copied from his left-hand side and pasted onto the other side of the shirt. Post-processing operations can be applied to the forged image after the copy-move operation, which makes forgery detection harder. Typically, such operations, for example geometric transformations (e.g. scaling), are applied to cover up the forgery.

Several algorithms for detecting image copy-move forgery have been introduced; they are reviewed in the surveys [1, 6]. Generally, these methods can be classified into two main categories: block-based methods [14] and keypoint-based methods [2]. Because of the limitations of block-based methods, especially in robustness against scaling manipulations and in time complexity, keypoint-based methods have attracted many researchers’ attention. Keypoint-based methods detect keypoints and then use local features, instead of overlapping blocks, to identify duplicated regions [15]. Typically, SIFT [13] and SURF [3] are used as keypoint detectors, and their corresponding descriptors are used to find matches between the keypoints. The random sample consensus (RANSAC) algorithm [9] can be used to estimate the geometric transformations applied to the forged regions.

Fig. 1. Copy-move forgery example

Huang et al. [12] introduced a method to detect copy-move forgery based on local statistical features, known as the scale invariant feature transform (SIFT). However, that method does not estimate the geometric transformation parameters, and its performance is weak. Another method was proposed in [15], but it cannot handle affine transformations. Bo et al. [5] presented another CMFD method based on the Speeded Up Robust Features (SURF) descriptor to cope with geometric operations such as scaling; their experimental results were presented visually. The algorithm appears promising, but it still needs improvement to localize the duplicated regions automatically. The existing CMFD methods reviewed in [1, 6] can achieve acceptable performance, but challenges remain, especially under affine transformations (e.g. scaling).

In this paper, the Scale Invariant Feature Operator (SFOP) detector [11] is used as a local feature detector to extract keypoints from the forged image. The detected keypoints are described using the MROGH descriptor and then matched. The matched points are clustered according to the distance between them. The RANSAC algorithm is then used to estimate the affine transformation parameters and remove false matches. The rest of this paper is organized as follows: Sect. 2 explains each step of the proposed method; Sect. 3 presents and discusses the experimental results; finally, Sect. 4 summarizes the paper and outlines future research.

2 Proposed Method

2.1 Local Features Detector and Descriptor

In our proposed method, the SFOP detector is used to detect keypoints in the forged image. SFOP, a scale-space extension of the detector proposed by Förstner [10], is a local feature detector proposed by Förstner et al. in [11]. It uses the general spiral feature model of [4] to unify different types of features within the same framework, and it achieves better coverage under various geometric transformations than other local feature detectors [7].

Given a forged image \(I_{forged}\), we first applied SFOP to detect the keypoints \( P = \{p_{1},p_{2},\ldots ,p_{n}\} \). We then used the Multi-support Region Order-based Gradient Histogram (MROGH) [8] to generate a descriptor vector \( F_{i} \) for each keypoint \( p_{i}\in P, i=1, 2, \ldots , n \). A two-dimensional MROGH histogram of length \( \lambda _{1} \times \lambda _{2} \times \lambda \) is obtained, where \( \lambda \) is the number of support regions, \( \lambda _{1} \) is the number of quantization levels, and \( \lambda _{2} \) is the number of order segments. We chose \( \lambda _{1} = 8\) and \( \lambda _{2} = 6 \) because these values achieve good performance in the experiments presented in [8].

2.2 Feature Matching

For each feature \( f_{j}\in F ; j=1,2,\ldots ,48 \), we used a k-d tree to obtain the k nearest neighbors \( N_{l}, l=1,2,\ldots ,k \), with corresponding sorted Euclidean distances \( D_{z}, z=1,2,\ldots ,k \). In the standard nearest-neighbor test, keypoints are matched if the ratio between \( D_{1} \) and \( D_{2} \) is less than a threshold (\( D_{1}/D_{2} < thr\)). However, this matching strategy cannot deal with multiple keypoint matching, so we used the g2NN matching procedure presented in [2]. This method iterates the nearest-neighbor ratio test on \( D_{r} , D_{r+1} \) while:

$$\begin{aligned} D_{r}/D_{r+1} < g2NN_{thr} \end{aligned}$$
(1)

This yields the set of all matched points. The matched points are kept for post-processing, and the unmatched keypoints are removed.
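
The g2NN test of Eq. 1 can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the function name `g2nn_matches`, the defaults `k=10` and `thr=0.5`, and the use of SciPy's `cKDTree` are assumptions, and the sketch assumes that a neighbor r is accepted as long as every ratio up to and including \( D_{r}/D_{r+1} \) stays below the threshold.

```python
import numpy as np
from scipy.spatial import cKDTree


def g2nn_matches(descriptors, k=10, thr=0.5):
    """Return sorted index pairs (i, j), i < j, accepted by the iterated ratio test.

    descriptors: (n, dim) array with one descriptor per keypoint.
    k and thr are illustrative defaults, not values taken from the paper.
    """
    desc = np.asarray(descriptors, dtype=float)
    n = len(desc)
    k = min(k, n - 1)
    if k < 2:
        return []  # the ratio test needs at least two neighbours
    # Query k + 1 neighbours because each point is its own nearest neighbour.
    dists, idxs = cKDTree(desc).query(desc, k=k + 1)
    matches = set()
    for i in range(n):
        keep = idxs[i] != i  # drop the query point itself
        d, nbrs = dists[i][keep][:k], idxs[i][keep][:k]
        for r in range(k - 1):
            # Stop at the first r where D_r / D_{r+1} is not below thr.
            if d[r + 1] == 0 or d[r] / d[r + 1] >= thr:
                break
            j = int(nbrs[r])
            matches.add((min(i, j), max(i, j)))
    return sorted(matches)
```

Because the loop continues past the first neighbor, one keypoint can be matched to several others, which is the case that the plain \( D_{1}/D_{2} \) test cannot handle.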

2.3 Post-processing

In this step, the matched keypoints are clustered according to the distance between them, using a threshold \( D_{thr} \). Clusters with fewer than \( \zeta \) members are discarded. For the remaining clusters, we used the RANSAC algorithm [9] to estimate the affine transformation parameters and remove false alarms. For each estimated homography matrix, the matched pairs whose distance D is less than \( \alpha \) are taken as the inliers that fit this transformation, with D computed as:

$$\begin{aligned} D = \begin{Vmatrix} H \begin{pmatrix} x\\ y\\ 1 \end{pmatrix} - \begin{pmatrix} x^{'}\\ y^{'}\\ 1 \end{pmatrix}\end{Vmatrix}_{2} \end{aligned}$$

where \( (x,y,1)^{T}\), \( (x^{'},y^{'},1)^{T} \) are the homogeneous coordinates of a pair of matched points and H is the estimated affine homography matrix that can be defined as follows:

$$\begin{aligned} H = \begin{bmatrix} a_{11}&a_{12}&t_{x}\\ a_{21}&a_{22}&t_{y}\\ 0&0&1 \end{bmatrix} \end{aligned}$$
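
The affine estimation and inlier test can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the function names, the three-pair minimal sample, the iteration count `iters`, the fixed `seed`, and the default `alpha=3.0` are hypothetical choices, not values from the paper.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine matrix H (3x3) mapping src to dst; both are (n, 2)."""
    A = np.hstack([src, np.ones((len(src), 1))])  # rows (x, y, 1)
    # Solve A @ M = dst for M (3x2); M.T holds the top two rows of H.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    H = np.eye(3)
    H[:2, :] = M.T
    return H


def transfer_distance(H, src, dst):
    """D = || H (x, y, 1)^T - (x', y', 1)^T ||_2 for every matched pair."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    dst_h = np.hstack([dst, np.ones((len(dst), 1))])
    return np.linalg.norm(src_h @ H.T - dst_h, axis=1)


def ransac_affine(src, dst, alpha=3.0, iters=500, seed=0):
    """Return (H, inlier_mask); H is None if fewer than 3 inliers are found."""
    src, dst = np.asarray(src, dtype=float), np.asarray(dst, dtype=float)
    if len(src) < 3:
        raise ValueError("an affine model needs at least 3 matched pairs")
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        sample = rng.choice(len(src), size=3, replace=False)
        H = fit_affine(src[sample], dst[sample])
        inliers = transfer_distance(H, src, dst) < alpha
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        return None, best
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best], dst[best]), best
```

Pairs outside the returned mask are the candidates for removal as false matches.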

Some false alarms can still remain. To remove them, we again applied distance-based clustering for each homography whose number of corresponding inlier pairs is less than \( \gamma \). Then, all clusters with fewer than \( \zeta \) members are removed.
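
The distance-based clustering and minimum-size filtering used in this step can be sketched as follows. The clustering algorithm is not specified in the text, so single-linkage hierarchical clustering from SciPy is used here purely as an assumption; the function name `cluster_and_filter` and the label -1 for discarded points are also hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_and_filter(points, d_thr, zeta):
    """Cluster keypoint coordinates and discard clusters with fewer than zeta members.

    points: (n, 2) array of keypoint coordinates.
    Returns one integer label per point; discarded points get label -1.
    """
    points = np.asarray(points, dtype=float)
    if len(points) < 2:
        # Fewer than two points: nothing to cluster, treat as discarded.
        return np.full(len(points), -1, dtype=int)
    # Single linkage (an assumption): clusters are cut at distance d_thr.
    Z = linkage(points, method="single")
    labels = fcluster(Z, t=d_thr, criterion="distance").astype(int)
    counts = np.bincount(labels)          # cluster sizes, indexed by label
    labels[counts[labels] < zeta] = -1    # drop clusters that are too small
    return labels
```

The same helper can serve both the initial clustering with \( D_{thr} \) and the second clustering pass, since both end by removing clusters with fewer than \( \zeta \) members.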

Finally, we applied some morphological operations to get the final detected duplicated regions.

3 Experimental Results and Discussion

3.1 Dataset

We evaluate the performance of the proposed method by conducting a series of experiments on the benchmark evaluation database presented in [6]. The dataset contains 48 high-resolution color images of different sizes, varying from \(533\times 800\) (giraffe image) to \(3900\times 2613\) (sailing image). The tampered images were generated by cutting and pasting image region(s). The image region(s) selected for duplication can be geometrically transformed before being pasted, and the duplicated region can vary in size (e.g., small, medium, or large). A sample of original images, forged images, and their ground truth maps indicating the forged regions is shown in Fig. 2.

To evaluate the performance of the proposed method, precision-recall (PR) curves [16] and the \( F_{1} \) score are employed. Equations 2, 3, and 4 show how the precision, recall, and \( F_{1} \) score are calculated:

$$\begin{aligned} Precision=\frac{T_{P}}{T_{P}+F_{P}} \end{aligned}$$
(2)
$$\begin{aligned} Recall=\frac{T_{P}}{T_{P}+F_{N}} \end{aligned}$$
(3)
$$\begin{aligned} F_{1}= 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall} \end{aligned}$$
(4)

where:

  • \( T_{P} \) (True Positive) represents the number of tampered pixels, which are classified as tampered.

  • \( F_{P} \) (False Positive) represents the number of authentic pixels, which are classified as tampered.

  • \( F_{N} \) (False Negative) represents the number of tampered pixels, which are classified as authentic.
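
Equations 2 to 4 can be computed at pixel level from a binary detection map and a binary ground truth map, as in the sketch below. The function name `pixel_scores` and the convention of returning 0 when a denominator is zero are assumptions made for this illustration.

```python
import numpy as np


def pixel_scores(detected, ground_truth):
    """Return (precision, recall, f1) for two binary masks of equal shape."""
    detected = np.asarray(detected, dtype=bool)
    ground_truth = np.asarray(ground_truth, dtype=bool)
    tp = np.sum(detected & ground_truth)    # tampered pixels marked as tampered
    fp = np.sum(detected & ~ground_truth)   # authentic pixels marked as tampered
    fn = np.sum(~detected & ground_truth)   # tampered pixels marked as authentic
    # Assumed convention: a score with an empty denominator is reported as 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return float(precision), float(recall), float(f1)
```

For example, a detection map that covers two of four tampered pixels and one authentic pixel gives a precision of 2/3, a recall of 1/2, and an \( F_{1} \) score of 4/7.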

Fig. 2. Examples of copy-move forgery: the top row shows the original images, the middle row the corresponding forged images, and the bottom row the ground truth maps

3.2 Parameters Setup

We set up the parameters of the proposed method as in Table 1.

Table 1. Parameters setup for the proposed method

3.3 Detection Performance of the Proposed Method

Plain Copy-move Forgery. We evaluate how the proposed method performs under plain copy-move forgery without any post-processing operations. Figure 3 shows some visual detection examples in which the forged regions are correctly localized by the proposed method. All images in the dataset are tested, and the values of precision, recall, and \( F_{1} \) score are computed.

The proposed method has a very high precision (\( 95.66\,\% \)), an acceptable recall (\( 49.12\,\% \)), and an overall \( F_{1} \) score of \( 64.91\,\% \). The proposed method is therefore very accurate in finding correct matches (very few false positives), but it is not able to cover all the areas of the copy-move region (more false negatives). Furthermore, in our experiments the forgery could still be observed and easily identified even when the forged regions were not detected correctly, as shown in Fig. 4(d).
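
As a consistency check, substituting the reported precision and recall into Eq. 4 reproduces the reported \( F_{1} \) score:

$$\begin{aligned} F_{1} = 2 \cdot \frac{0.9566 \cdot 0.4912}{0.9566 + 0.4912} \approx \frac{0.9398}{1.4478} \approx 0.6491 \end{aligned}$$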

Fig. 3. Some visual results: (a) the tampered images, (b) the corresponding binary mask, (c) the correctly detected matches, (d) the detection map produced by our proposed method

Fig. 4. An example of copy-move forgery detection (proposed method)

Robustness against Scaling Manipulations. In the benchmark dataset, the cloned regions are scaled by different scaling factors s before being pasted. To evaluate the robustness of our proposed method against scaling manipulations, we test its performance under 10 different scaling factors relative to the original size of the forged region (\( s = 91\,\%, 93\,\%, 95\,\%, 97\,\%, 99\,\%, 101\,\%, 103\,\%, 105\,\%, 107\,\%,\ and \ 109\,\%\)). The detection results for some forged images with different scaling factors s are shown in Fig. 5.

Table 2 shows the performance evaluation results of the proposed method against scaling manipulations for all Copy-Scale-move forgery images in the benchmark dataset.

We observed that the proposed method achieves good performance against scaling manipulations. We also observed that, as the scaling factor increases, the number of matched points decreases because of the impact of scaling, but enough matched points remain for the forgery to be detected (see, for example, Fig. 5(d) and Table 2 when s is more than \( 101\,\% \)).

Fig. 5. The detection results with different scaling factors s

Hence, we can conclude that our method is robust to scaling manipulations, since the scale invariant feature operator (SFOP) is able to detect sufficient keypoints in the forged image under different scaling factors. However, it still needs improvement in flat regions.

Table 2. Detection Performance of the proposed method against scaling manipulations

4 Conclusion

In this paper, a forensic method has been proposed to detect and localize copy-move regions under scaling manipulations. We used the scale invariant feature operator (SFOP) as a feature point detector because of its scale invariance. The proposed method is able to find the correct matches, but it is not able to cover all of the copy-move region because of false negatives. Nevertheless, the forgery can still be observed and easily identified in the detection results. The experimental results show the robustness of our method against scaling manipulations, especially in non-flat regions, although it still needs improvement in flat regions. In the future, we will try to solve this problem by using dense interest points to find more matched keypoints. We will also try other post-processing techniques to recover missing matches and hence increase the recall of the method.