1 Introduction

With the rapid development of Internet and multimedia technology, much research has been done on multimedia image and video processing [21,22,23, 33, 44]. Digital images are increasingly used in daily life and on social networks. Faced with massive numbers of digital images, quickly and accurately searching for a desired image has become a problem, and many image search schemes have been proposed [28, 38, 39]. However, digital image forensics has become a more serious problem. With the rapid development of image editing software and processing technology, it has become easy to tamper with digital images without leaving any visual trace. If forged images are produced with malicious intent, they can harm people’s lives, society and even the country. Digital image forensics has therefore gained extensive attention in recent years.

According to the feature extraction method, approaches to digital image authentication can be divided into two main categories: active detection methods [29, 30, 37] and passive detection methods [1, 8, 25, 26]. Active detection methods verify the authenticity and originality of digital images through previously embedded information [37], and they are widely used to protect image integrity. Passive detection methods gather evidence of tampering directly from the images without any extra information [8]. Because active detection methods depend on the extra embedded information, passive detection methods are more suitable for practical image forensics, and research on them has become a challenging and meaningful focus.

There are two common kinds of image tampering: copy-move tampering and image splicing. The primary task of copy-move detection is to determine whether a single image contains two or more similar regions, and to locate them if so. Some novel copy-move detection methods were proposed recently [3, 4, 20, 43]. The primary task of image splicing detection is to determine whether a given image is a composite generated by cutting and joining two or more photographs. Many splicing detection schemes based on Markov features in transform domains have been proposed; they are introduced in detail in the next section. From these schemes, we observe that the traditional DWT based schemes do not perform better than the DCT based schemes, and that the features extracted in the DWT domain served only as a supplement for image splicing detection. Based on these observations and on research on the DWT, a block DWT based scheme is proposed to improve the detection performance of the DWT based scheme. As for why the traditional DWT based schemes do not perform better than the DCT based schemes, we hypothesize that the datasets used were generated with a DCT process (JPEG compression). To verify this hypothesis, new datasets are generated without any DCT process and a set of experiments is conducted on them. In summary, the main objectives of this paper are as follows:

  • A novel block DWT based scheme is proposed to improve the detection performance of image splicing detection.

  • A detailed comparison between the DWT based scheme and the DCT based scheme is presented to show that the DWT based scheme is more applicable and powerful, and that the DCT based scheme is more suitable for datasets generated with JPEG compression.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the proposed algorithm framework based on the block DWT strategy and the Markov model. Section 4 presents the experiment results of the proposed scheme and the comparison between the two kinds of schemes. Finally, concluding remarks and future work are given in Section 5.

2 Related work

In recent years, many effective methods have been proposed to improve image splicing detection performance. Among them, methods based on Markov features extracted in several transform domains usually achieve promising detection performance [15, 31, 46]. The correlation between image pixels can be modeled by the spatial Markov property, and transition probability matrices can be used to characterize a Markov process, so Markov features are effective statistical features that reflect the correlation between image pixels or coefficients. In [31], Shi et al. proposed a natural image model to classify spliced and authentic images. The model consists of statistical features extracted from the test image as well as from 2-D arrays produced by applying multi-size block DCT to the test image. The statistical features include moments of characteristic functions and Markov transition probabilities. On the Columbia Image Splicing Detection Evaluation Dataset (DVMM dataset) [27], the 98-D Markov features achieve a detection rate of 88.31% while the 168-D moment features achieve 86.82%. The 98-D Markov features are thus more accurate than the 168-D moment features and contribute most to the effectiveness and efficiency of the whole approach. The whole method achieves a detection rate of 91.87% on the DVMM dataset.

Inspired by the method based on natural image model, an expanded Markov based scheme was proposed by He et al. [15]. The Markov features were extracted in DCT domain and DWT domain. The transition probability matrices of intra-block and inter-block between block DCT coefficients were extracted in DCT domain, and more features were constructed in DWT domain to characterize three kinds of dependency among wavelet coefficients across positions, scales and orientations of the source image. Then, in order to fulfill the task of feature reduction and make the computational cost more manageable, the feature selection method of SVM-RFE [13] and Support Vector Machine (SVM) [16] were utilized to deal with the large number of obtained features, and a detection rate of 93.55% on the DVMM dataset was reported in [15].

After that, Zhang et al. [46] proposed a Markov model based on the block DCT domain and the Contourlet transform domain for image splicing detection. Improved Markov features were extracted in the DCT domain, and the feature domains of splicing detection were extended to the Contourlet transform domain, where additional features characterizing the dependency of positions among Contourlet subband coefficients were extracted. An ensemble classifier [7, 17] and SVM were exploited to handle the final features. The method achieves a promising detection rate of 94.10% on the DVMM dataset.

A scheme based on Markov features in the quaternion discrete cosine transform (QDCT) domain for image splicing detection was proposed by Li et al. [18]. QDCT was introduced into the image splicing detection algorithm to make use of the whole color information. Expanded Markov features were obtained from the intra-block and inter-block relations between block QDCT coefficient matrices. SVM was utilized to deal with the large number of obtained features, and a detection rate of 92.38% on the CASIA TIDE V2.0 dataset was reported.

From these splicing detection schemes based on Markov features in transform domains, it can be observed that although the approaches mentioned above achieve high detection rates, they focus on improving the DCT domain features, while the DWT domain feature is just a supplemental feature that has not been fully utilized. Our block DWT scheme can improve the detection performance of the DWT based scheme without increasing the feature dimensionality. The block DWT strategy is proposed for three reasons:

  1. Effects of the inconformity introduced by different regions: Different regions have different content and texture information. We propose to apply non-overlapping block DWT to the source image, which avoids the effects of the inconformity between different regions of a natural image during the DWT.

  2. Effects of the correlation between image pixels: Natural image pixels exhibit smoothness, consistency, continuity, regularity and periodicity [31], and this correlation changes gradually over a distance of about 15 to 20 pixels, so the block DWT strategy can decrease the influence of distant pixels on the proposed statistical features.

  3. Computational complexity: Applying DWT to the whole image leads to high computational complexity, and the block DWT strategy reduces it effectively. Block DWT is also used in the encoding process of JPEG2000 [5, 45].

3 The proposed approach

In this section, we first present the algorithm framework of the proposed approach for image splicing detection, followed by detailed description of the Markov features extracted in block DWT domain and the feature classification method exploited in this paper.

3.1 Algorithm framework

Among all digital image tampering operations, image splicing is a fundamental and popular one. Image splicing can produce a realistic-looking composite forged image by well-designed cutting and pasting operations. However, the splicing operation may disturb the smoothness, consistency, continuity, regularity and periodicity of the original image, so the correlations between image pixels and the underlying characteristics of the whole image are changed. Similar correlations also exist in the transform domains. These correlations can be modeled by the spatial Markov property [19, 32, 40]. Therefore, the splicing operation can be detected by Markov features extracted in several different transform domains. Building on these schemes [15, 31, 46], we propose a block DWT based scheme. The algorithm framework of the proposed scheme is shown in Fig. 1.

Fig. 1 The proposed algorithm framework

As shown in Fig. 1, given a digital image, we extract Markov features in the block DWT domain. First, the block DWT is applied to the source image, resulting in a set of 2-D arrays. Then, the Markov transition probability matrices (TPM) generated from the horizontal and vertical difference matrices of these 2-D arrays are extracted to characterize the dependency among wavelet coefficients across positions. Finally, the feature selection method SVM-RFE is used to reduce the dimensionality of the features, and SVM is exploited to classify authentic and spliced images.

3.2 Feature extraction

DWT is a high performance image signal analysis tool, especially for detecting sudden changes and non-stationary signals. It offers adaptive time-frequency localization and multi-resolution analysis, so DWT is well suited to image signal analysis [34]. An example of DWT is shown in Fig. 2. A is a spliced image in which the seagull is the spliced part, and B shows the two-level DWT subbands of A. It can be observed that the edge information, which carries much of the splicing information, is preserved in these DWT subbands. For such reasons, features extracted in wavelet subbands by considering the dependency among wavelet coefficients across positions, scales and orientations were used for image splicing detection in [9, 15, 24]. In their experiment results, the features extracted in the DCT domain achieve a detection rate of 90.07% while the features extracted in the DWT domain achieve 86.50% on the DVMM dataset, so the DCT domain features outperform the DWT domain features in both detection rate and efficiency. However, considering that wavelet analysis is good at capturing the relationship between the spatial domain and the frequency domain, and that it can make up for the lack of location information in DCT analysis [10], there are reasons to believe that more effective features can be extracted in the DWT domain for image splicing detection. The features extracted in the DWT domain for the dependency among wavelet coefficients across positions are effective, and novel Markov features that characterize this dependency are proposed in this paper. These Markov features in the DWT domain are calculated as follows.

Fig. 2 A spliced image and the corresponding two-level DWT decomposition result. (A is a spliced image in which the seagull is the spliced part. B is the corresponding two-level DWT decomposition result of A: (a)\(\sim \)(d) are the approximation subband, horizontal detail subband, vertical detail subband, and diagonal detail subband of the two-level DWT decomposition; (e)\(\sim \)(g) are the horizontal detail subband, vertical detail subband, and diagonal detail subband of the one-level DWT decomposition.)

Firstly, apply 3-level N × N block DWT to the source image pixel array to obtain the corresponding DWT subbands. Different wavelets perform differently; we choose the discrete Meyer wavelet for image splicing detection because it is symmetrical and has compact support in the frequency domain, so that more effective features can be extracted [15]. After the block DWT, the approximation subband, horizontal detail subband, vertical detail subband and diagonal detail subband are obtained at each level of the transform, denoted as \(A_i\), \(H_i\), \(V_i\) and \(D_i\) (\(i = 1,2,3\)), respectively. We round all the coefficients of the 12 obtained subbands to the nearest integer and take the absolute value. The source image pixel array is viewed as the 0-th level approximation subband and denoted as \(A_0\). In total, 13 arrays are obtained; we denote them as \(W_k\) with \(k \in \{1,2,\ldots,12,13\}\), and \(S_{ku}\) and \(S_{kv}\) denote the dimensions of \(W_k\).
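The block decomposition step can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses a one-level Haar wavelet rather than the 3-level discrete Meyer wavelet (so that the example stays in pure Python), the function names `haar_dwt2` and `block_dwt` are hypothetical, and the per-block subbands are assumed to be tiled back at the position of their source block.

```python
def haar_dwt2(block):
    """One-level 2-D Haar DWT of a square block with even side length.

    Stand-in for the discrete Meyer wavelet used in the paper.
    Returns (A, H, V, D), each half the size of the input block.
    """
    half = len(block) // 2
    A = [[0.0] * half for _ in range(half)]
    H = [[0.0] * half for _ in range(half)]
    V = [[0.0] * half for _ in range(half)]
    D = [[0.0] * half for _ in range(half)]
    for r in range(half):
        for c in range(half):
            a, b = block[2 * r][2 * c], block[2 * r][2 * c + 1]
            e, f = block[2 * r + 1][2 * c], block[2 * r + 1][2 * c + 1]
            A[r][c] = (a + b + e + f) / 2.0  # local average (approximation)
            H[r][c] = (a + b - e - f) / 2.0  # top minus bottom: horizontal edges
            V[r][c] = (a - b + e - f) / 2.0  # left minus right: vertical edges
            D[r][c] = (a - b - e + f) / 2.0  # diagonal detail
    return A, H, V, D


def block_dwt(image, n):
    """Apply haar_dwt2 to each non-overlapping n x n block of the image.

    Coefficients are rounded to the nearest integer and made non-negative.
    Note that Python's round() rounds exact halves to the nearest even integer.
    Assumption: each block's subband is tiled at the block's own position.
    """
    rows, cols = len(image), len(image[0])
    assert n % 2 == 0 and rows % n == 0 and cols % n == 0
    half = n // 2
    out = {name: [[0] * (cols // 2) for _ in range(rows // 2)] for name in "AHVD"}
    for r0 in range(0, rows, n):
        for c0 in range(0, cols, n):
            block = [row[c0:c0 + n] for row in image[r0:r0 + n]]
            for name, sub in zip("AHVD", haar_dwt2(block)):
                for r in range(half):
                    for c in range(half):
                        out[name][r0 // 2 + r][c0 // 2 + c] = abs(round(sub[r][c]))
    return out


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
subbands = block_dwt(image, 2)
print(subbands["A"], subbands["H"])
```

With the Haar filter, the block transform and the whole-image transform coincide, because each 2 × 2 group of pixels lies inside a single block. Block boundaries only change the result for longer filters such as the discrete Meyer wavelet, which is where the block strategy matters.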

Secondly, calculate the horizontal and vertical difference arrays using (1) and (2) respectively; the two resulting difference arrays are denoted by \(D_{kh}\) and \(D_{kv}\). To limit the feature dimensionality, only the horizontal and vertical difference arrays are used here.

$$ D_{kh}(u,v) = W_{k}(u,v) - W_{k}(u + 1,v) \tag{1} $$
$$ D_{kv}(u,v) = W_{k}(u,v) - W_{k}(u,v + 1) \tag{2} $$

where u and v denote coordinates in the corresponding matrix. The difference arrays in the DWT domain reflect how the coefficients in \(W_k\) change: the statistics around zero reflect regions whose pixels change smoothly, and the arrays capture the disturbances to the correlation between image pixels caused by image splicing, image edges and image textures. With the aid of DWT, these statistics are obtained at different frequency levels. Then a truncation value \(T \in \mathbb{N}^{+}\) is introduced to reduce the computational complexity and to limit the risk of over-fitting. If an element of \(D_{kh}\) or \(D_{kv}\) is larger than T or smaller than − T, it is replaced by T or − T correspondingly. The truncation value T thus limits the number of states needed by the statistics.
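As an illustration of (1), (2) and the truncation step, the following sketch computes both difference arrays for a small array and clips them to [−T, T]. The function names are hypothetical, and the array is simply indexed as `w[u][v]` in the same order as in the equations, since the text does not state which index runs along rows.

```python
def difference_arrays(w):
    """Return (d_h, d_v) following Eqs. (1) and (2), with w indexed as w[u][v]."""
    su, sv = len(w), len(w[0])
    # Eq. (1): difference along the first index u
    d_h = [[w[u][v] - w[u + 1][v] for v in range(sv)] for u in range(su - 1)]
    # Eq. (2): difference along the second index v
    d_v = [[w[u][v] - w[u][v + 1] for v in range(sv - 1)] for u in range(su)]
    return d_h, d_v


def truncate(d, t):
    """Clip every element of d to the range [-t, t]."""
    return [[max(-t, min(t, x)) for x in row] for row in d]


w = [[1, 5, 2], [9, 3, 3], [4, 4, 10]]
d_h, d_v = difference_arrays(w)
d_h_t, d_v_t = truncate(d_h, 4), truncate(d_v, 4)
print(d_h_t)
print(d_v_t)
```

After truncation every element takes one of 2T + 1 values, which is what bounds the size of the transition probability matrices computed in the next step.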

Thirdly, calculate the horizontal and vertical transition probability matrices of \(D_{kh}\) and \(D_{kv}\) using (3) to (6); the resulting transition probability matrices are denoted by \(M_{khh}\), \(M_{khv}\), \(M_{kvh}\) and \(M_{kvv}\) correspondingly.

$$ M_{khh}(i,j) = \frac{\sum\limits_{u = 1}^{S_{ku}-2}\sum\limits_{v = 1}^{S_{kv}}\delta(D_{kh}(u,v)=i,D_{kh}(u + 1,v)=j)} {\sum\limits_{u = 1}^{S_{ku}-2}\sum\limits_{v = 1}^{S_{kv}}\delta(D_{kh}(u,v)=i)} \tag{3} $$
$$ M_{khv}(i,j) = \frac{\sum\limits_{u = 1}^{S_{ku}-1}\sum\limits_{v = 1}^{S_{kv}-1}\delta(D_{kh}(u,v)=i,D_{kh}(u,v + 1)=j)} {\sum\limits_{u = 1}^{S_{ku}-1}\sum\limits_{v = 1}^{S_{kv}-1}\delta(D_{kh}(u,v)=i)} \tag{4} $$
$$ M_{kvh}(i,j) = \frac{\sum\limits_{u = 1}^{S_{ku}-1}\sum\limits_{v = 1}^{S_{kv}-1}\delta(D_{kv}(u,v)=i,D_{kv}(u + 1,v)=j)} {\sum\limits_{u = 1}^{S_{ku}-1}\sum\limits_{v = 1}^{S_{kv}-1}\delta(D_{kv}(u,v)=i)} \tag{5} $$
$$ M_{kvv}(i,j) = \frac{\sum\limits_{u = 1}^{S_{ku}}\sum\limits_{v = 1}^{S_{kv}-2}\delta(D_{kv}(u,v)=i,D_{kv}(u,v + 1)=j)} {\sum\limits_{u = 1}^{S_{ku}}\sum\limits_{v = 1}^{S_{kv}-2}\delta(D_{kv}(u,v)=i)} \tag{6} $$

where i, j ∈{−T,−T + 1,⋯ ,− 1,0,1,⋯ , T − 1, T}, δ(⋅) = 1 only when its arguments are satisfied and δ(⋅) = 0 otherwise, and \(M_k\) denotes the transition probability matrix corresponding to the k-th subband.
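Equations (3) to (6) share one pattern: count how often value i is followed by value j at a fixed offset, then normalize by how often i occurs as the first element of a pair. A minimal sketch of that pattern follows. The function name and the offset arguments `du`, `dv` are hypothetical, and returning a zero row for a state that never occurs is an assumption, since the equations leave that case undefined.

```python
def transition_matrix(d, t, du, dv):
    """Transition probability matrix of a truncated difference array d.

    d must already be clipped to [-t, t]. (du, dv) = (1, 0) gives the form of
    Eqs. (3) and (5); (du, dv) = (0, 1) gives the form of Eqs. (4) and (6).
    Entry [i + t][j + t] estimates P(d[u + du][v + dv] = j | d[u][v] = i).
    """
    size = 2 * t + 1
    counts = [[0] * size for _ in range(size)]
    rows, cols = len(d), len(d[0])
    # Visit every pair of positions that fits inside the array.
    for u in range(rows - du):
        for v in range(cols - dv):
            counts[d[u][v] + t][d[u + du][v + dv] + t] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state that never occurs gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


d = [[0, 1, 0], [1, 0, -1]]
m = transition_matrix(d, 1, 0, 1)
for row in m:
    print(row)
```

The loop bounds reproduce the summation limits of the equations: for an array with `rows` × `cols` elements, all pairs at offset (du, dv) that stay inside the array are counted.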

As a result, (2T + 1) × (2T + 1) × 4 × 13 elements are obtained from the four transition probability matrices of all 13 arrays. All of these elements are used as the Markov features of block DWT coefficients for image splicing detection.
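The feature count grows quadratically with the truncation value, which is why T has to stay small. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def markov_feature_count(t, n_arrays=13, n_matrices=4):
    """Number of Markov features: (2T + 1)^2 entries per matrix,
    4 matrices per array, 13 arrays."""
    states = 2 * t + 1
    return states * states * n_matrices * n_arrays


for t in (3, 4, 5):
    print(t, markov_feature_count(t))
```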

3.3 Feature classification

For gray images, features are extracted from the gray channel. Since the main objective of the proposed scheme is to increase the detection performance of features extracted in the DWT domain, only the R channel is used for color images, which allows a better comparison with the features extracted in other domains (extra experiment results on the Y, G and B components are similar to those on the R channel). Then, (2T + 1) × (2T + 1) × 4 × 13 Markov features of the block DWT domain are obtained, that is 4212 if the threshold T is set to 4. The dimensionality of the training features influences the computational complexity and the detection performance of the resulting classifier. Here the dimensionality is too large for SVM to handle the features directly at reasonable cost, so we use a feature reduction method. Many feature reduction methods are available [11, 12, 14, 15]; to compare different schemes under fair conditions, we use a method similar to that of He et al. [15]. SVM-RFE is an application of RFE that uses the weight magnitude as the ranking criterion; it is used to reduce the obtained features to an n-D feature vector and to improve the detection performance. Finally, SVM is applied to the selected features for image splicing detection. The overall algorithm based on block DWT for image splicing detection is summarized in Algorithm 1.
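The elimination loop behind RFE with a weight magnitude criterion can be sketched as follows. This is only an illustration under stated assumptions: `rfe` and `fit_weights` are hypothetical names, one feature is removed per iteration, and the example plugs in a class-mean difference as the weight function. That stand-in is not an SVM; in SVM-RFE the weights would come from a trained linear SVM.

```python
def mean_difference_weights(x, y):
    """Stand-in weight function (NOT an SVM): per-feature difference
    between the mean of class +1 and the mean of class -1."""
    pos = [row for row, label in zip(x, y) if label == 1]
    neg = [row for row, label in zip(x, y) if label == -1]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(x[0]))
    ]


def rfe(x, y, n_keep, fit_weights):
    """Recursive feature elimination with squared weight as ranking criterion.

    Repeatedly fits weights on the remaining features and drops the feature
    whose squared weight is smallest, until n_keep features remain.
    Returns the indices of the surviving features in their original order.
    """
    remaining = list(range(len(x[0])))
    while len(remaining) > n_keep:
        reduced = [[row[j] for j in remaining] for row in x]
        weights = fit_weights(reduced, y)
        worst = min(range(len(remaining)), key=lambda j: weights[j] ** 2)
        del remaining[worst]
    return remaining


x = [[2.0, 1.0, 0.5], [2.2, 1.0, 0.4], [-2.0, 1.0, 0.1], [-2.1, 1.0, 0.2]]
y = [1, 1, -1, -1]
print(rfe(x, y, 2, mean_difference_weights))
```

Refitting after every removal is what distinguishes recursive elimination from a single ranking pass: the weights of the remaining features can change once a correlated feature has been dropped.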

Algorithm 1 The overall algorithm based on block DWT for image splicing detection

4 Experiments and discussions

In this section, we first introduce the classical and frequently used datasets for evaluating image splicing detection methods. To allow a fair comparison between the DWT features and the DCT features, we also generate two new datasets, denoted NIKON and GrayNIKON. Then we present the experiment parameters. Next, a set of experiments is presented to demonstrate the performance and effectiveness of the proposed scheme and to clarify why the traditional DWT based scheme does not perform better than the DCT based scheme. Finally, the two kinds of schemes are discussed and compared based on the experiment results.

4.1 Experiment conditions

The DVMM dataset [27] and the IFS-TC dataset [6, 36] are classical and frequently used datasets. To evaluate the effectiveness of the proposed block DWT strategy, experiments are conducted on both of them.

The DVMM dataset is provided by the DVMM Laboratory of Columbia University to evaluate image splicing detection methods. It consists of 933 authentic and 912 spliced gray images with a fixed size of 128 × 128 pixels. Some example images of the dataset are shown in Fig. 3, where the authentic images are given in the first row and the spliced images in the second row. It can be observed that the splicing processes in the DVMM dataset are simple and that the spliced images are generated without post-processing, so classifiers trained on the DVMM dataset are more effective for direct splicing operations. More details about the DVMM dataset may be found in [27].

Fig. 3 Some samples of the DVMM dataset. The first row shows authentic images and the second row shows spliced images

The IFS-TC dataset was used in the First Image Forensics Challenge, an international competition organized by the IEEE Information Forensics and Security Technical Committee (IFS-TC). It consists of 1050 pristine and 1150 forged color images. The forged images comprise a set of different manipulation techniques such as copy/pasting and splicing with different degrees of photorealism. Some example images of the dataset are shown in Fig. 4, where the authentic images are given in the first row and the spliced images in the second row. It can easily be observed that these spliced images are more realistic and harder to distinguish, so classifiers trained on the IFS-TC dataset are more suitable for real-life use. More details about the IFS-TC dataset may be found in [6, 36].

Fig. 4 Some samples of the IFS-TC dataset. The first row shows authentic images and the second row shows spliced images

In our experiments, we compared the proposed features extracted in DWT domain [15] and the classical features in DCT domain [15, 18, 46]. If the images in a dataset have undergone a DCT process, features extracted in the DCT domain have a natural advantage over features extracted in the DWT domain. To demonstrate the detection performance of the proposed DWT domain features under fair conditions, two new datasets, denoted NIKON and GrayNIKON, are generated without any DCT process. To build them, 300 original-format images are randomly selected from the image dataset [35], which was generated by a Nikon_D70 camera without any DCT process; these images have a fixed size of 3008 × 2000 or 2000 × 3008 pixels and are in PNG format. From these 300 images, 2400 image blocks with a fixed size of 1024 × 1024 pixels are generated directly by Matlab and used as the authentic images of the NIKON dataset. At the same time, 2400 spliced image blocks are generated from the authentic images of the NIKON dataset through splicing processes similar to those used in the DVMM dataset, so the spliced set also consists of 2400 images with a fixed size of 1024 × 1024 pixels. To remove the influence of color and further validate our assumptions, we build a grayscale image dataset, denoted GrayNIKON, in the same way as the NIKON dataset. These processes provide fair conditions for comparing different schemes and enrich the datasets available for image splicing detection. Classifiers trained on the NIKON and GrayNIKON datasets are also more effective for direct splicing operations. Some example images of the datasets are shown in Figs. 5 and 6, where the authentic images are given in the first row and the spliced images in the second row.

Fig. 5 Some samples of the NIKON dataset. The first row shows authentic images and the second row shows spliced images

Fig. 6 Some samples of the GrayNIKON dataset. The first row shows authentic images and the second row shows spliced images

Image splicing detection is treated as a binary decision problem of pattern recognition in our experiments. We label all the authentic images as + 1 and all the spliced images as − 1. To make the comparison results more intuitive, the features are extracted from the gray channel for gray images and from the R channel for color images (for color images, extra experiment results on the Y, G and B components are similar to those on the R channel). SVM is utilized to train the classifiers, and we choose LIBSVM [2] with an RBF kernel. Before training, to handle the high dimensional features, the feature selection method SVM-RFE is adopted to reduce the dimensionality of the feature vector to n-D (n ∈{50,100,150,200}); the related experiment results show that feature selection can improve the detection performance. After feature selection, the grid-search method is applied to find the optimal values of the penalization parameter c and the kernel parameter g for SVM.
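A grid search over c and g can be sketched as below. The paper does not state the grid, so the exponentially spaced ranges used here (powers of two, following common LIBSVM practice) are an assumption, and `score` is a hypothetical callable standing in for the cross-validated accuracy of an RBF-kernel SVM trained with the given parameters.

```python
import math
from itertools import product


def exponent_grid(lo, hi, step=2):
    """Values 2**lo, 2**(lo + step), ..., up to at most 2**hi."""
    return [2.0 ** e for e in range(lo, hi + 1, step)]


def grid_search(score, c_values, g_values):
    """Return the (c, g) pair with the highest score over the full grid."""
    return max(product(c_values, g_values), key=lambda cg: score(*cg))


def toy_score(c, g):
    """Toy stand-in for cross-validated accuracy, peaking at c = 2**3, g = 2**-5."""
    return -((math.log2(c) - 3) ** 2 + (math.log2(g) + 5) ** 2)


# Assumed ranges: c in 2^-5 .. 2^15 and g in 2^-15 .. 2^3, in steps of 2^2.
c_grid = exponent_grid(-5, 15)
g_grid = exponent_grid(-15, 3)
best = grid_search(toy_score, c_grid, g_grid)
print(best)
```

In a real run, `toy_score` would be replaced by a function that trains and validates the classifier on the training split only, so that the test split plays no part in parameter selection.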

In the classification phase of each experiment, 50 independent random training and testing runs are performed to reduce the variation caused by different selections of training and testing samples. In each of the 50 runs, 5/6 of the authentic images and 5/6 of the spliced images in the dataset are randomly selected to train the classifier, and the remaining 1/6 of the authentic and spliced images are used to test it. We measure the detection performance by the following criteria: TP (true positive) rate, TN (true negative) rate, accuracy and AUC. The TP rate is the ratio of correctly classified authentic images, while the TN rate is the ratio of correctly classified spliced images. Accuracy is the weighted average of the TP rate and the TN rate, and AUC is the area under the ROC curve. The software platform is Matlab R2010b, and the hardware platform is a PC with a 4 G duo core processor.
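The split and the evaluation criteria can be sketched as follows. All function names are hypothetical. Two details are assumptions: accuracy is computed as the average of the TP and TN rates weighted by the class sizes, and AUC is computed from pairwise score comparisons, which equals the area under the ROC curve but is not necessarily how the paper computed it.

```python
import random


def split_five_sixths(items, rng):
    """Randomly split one class into 5/6 for training and 1/6 for testing."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = (5 * len(shuffled)) // 6
    return shuffled[:cut], shuffled[cut:]


def rates(y_true, y_pred):
    """TP rate (authentic, +1), TN rate (spliced, -1) and weighted accuracy."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == -1)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    tp_rate, tn_rate = tp / n_pos, tn / n_neg
    # Assumption: the weights are the class sizes.
    accuracy = (n_pos * tp_rate + n_neg * tn_rate) / (n_pos + n_neg)
    return tp_rate, tn_rate, accuracy


def auc(pos_scores, neg_scores):
    """Fraction of (authentic, spliced) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


rng = random.Random(0)
train, test = split_five_sixths(list(range(12)), rng)
tp_rate, tn_rate, accuracy = rates([1, 1, 1, 1, -1, -1], [1, 1, 1, -1, -1, 1])
print(len(train), len(test), tp_rate, tn_rate, accuracy)
```

Splitting each class separately keeps the ratio of authentic to spliced images the same in the training and testing sets; repeating the split with 50 different seeds and averaging the criteria gives the protocol described above.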

4.2 Results of different block size

Because the images of the DVMM dataset are small and of fixed size, the experiments on the effect of the block size in the block DWT domain, which also give a straightforward insight into the detection performance of the proposed block DWT based scheme, are carried out on the DVMM dataset. The images of the DVMM dataset have a fixed size of 128 × 128 pixels, so we set N ∈{8,16,32,64}. The dimensionality n of the reduced feature vector is set to different values to evaluate its effect on the detection performance of the trained classifier. The detailed results are given in Table 1 and the corresponding ROC curves are shown in Fig. 7 (all features are reduced to 200-D feature vectors).

Table 1 Experiment results of the proposed features with different block size N (T = 4)

Fig. 7 The ROC curves of the proposed features with different block sizes

As shown in Table 1 and Fig. 7, the proposed features with N = 8 achieve a detection rate of 89.88%. The detection performance of the proposed features increases gradually as the block size N decreases. More effective features are obtained with smaller blocks because the features then focus more intensively on the correlation between local, adjacent pixels.

4.3 Detection performance on DVMM dataset

To demonstrate the excellent performance and effectiveness of the proposed features, in our experiments over the DVMM dataset, we make a comparison between the proposed features extracted in block DWT domain and the classical features in DCT domain [15, 46] and the original features which consider the dependency across positions in DWT domain [15]. We denote these features as BDWT, HeDCT, ZhangDCT and HeDWT correspondingly. To ensure the fairness and validity of the reported results of different detection features, all the comparison experiments are conducted in the same experiment setup described in Section 4.1. The detailed results of comparisons and the corresponding ROC curves are given in Table 2 and Fig. 8.

Table 2 Detection results on the comparison between the proposed features and the other features on the DVMM dataset (T = 4)

Fig. 8 The ROC curves of different feature vectors on the DVMM dataset

As shown in Table 2 and Fig. 8, the proposed features can achieve a detection rate of 89.88% and outperform other features introduced in our experiments on the DVMM dataset except ZhangDCT feature. Because block DWT can avoid effects of the inconformity introduced by different regions of natural image during the process of DWT and decrease the influence introduced by far pixels on the proposed statistics features, Markov features extracted from block DWT domain are more conducive to splicing detection. So the detection rate of the features extracted in DWT domain for gray image splicing detection can be improved by the strategy of block DWT effectively.

It can also be noticed that the traditional DWT based scheme does not perform better than the DCT based scheme. We hypothesize that this is because the DVMM dataset was generated with a DCT process, which makes the DCT features more applicable to this dataset. However, the official description of the DVMM dataset does not state that a DCT process was involved. To clarify the phenomenon and to better demonstrate the potential of the proposed scheme, we conduct the following experiments on other datasets.

4.4 Detection performance on IFS-TC dataset

Color images are widely used in our real world, so splicing detection for color image is more realistically significant. A set of experiments are implemented over the IFS-TC dataset in our experiments. In addition to the several features introduced above, we also added the QDCT feature [18] denoted LiQDCT in our experiments. In the proposed scheme, the Markov features are extracted from the R components only and 4212 Markov features are obtained when T = 4. To ensure the fairness and validity of the reported results of different detection features, all the comparison experiments are conducted in the same experiment setup described in Section 4.1. The detailed results of comparisons and the corresponding ROC curves are given in Table 3 and Fig. 9.

Table 3 Detection results on the comparison between the proposed features and the other features on the IFS-TC dataset (T = 4)

Fig. 9 The ROC curves of different feature vectors on the IFS-TC dataset

As shown in Table 3 and Fig. 9, the proposed BDWT features perform better than the other features and achieve a detection rate as high as 92.10%. We notice that the LiQDCT feature also achieves good detection results compared with the HeDCT and ZhangDCT features, possibly because the LiQDCT feature makes full use of the inherent connection between the three color components of the color image. The ZhangDCT feature performs better than the HeDCT feature, possibly because it improves the inter-block characteristics of the DCT domain.

It can also be noticed that the traditional DCT features do not perform better than the traditional DWT features on the IFS-TC dataset, and the official description of the IFS-TC dataset likewise does not state that a DCT process was involved. The relative performance of the DWT based and DCT based schemes is thus the opposite of that on the DVMM dataset. To better explain this opposite behavior on the DVMM and IFS-TC datasets, a new dataset that is definitely generated without any DCT process is needed.

4.5 Detection performance on NIKON dataset

In order to ensure a fair experimental environment, the NIKON dataset is generated by us without any process of DCT. The NIKON dataset is provided for color image splicing detection. We make a comparison among the proposed BDWT feature and HeDCT feature, HeDWT feature, ZhangDCT feature and LiQDCT feature over the NIKON dataset. More experiment details are described in Section 4.1. The detailed results of comparisons and the corresponding ROC curves on NIKON datasets are given in Table 4 and Fig. 10.

Table 4 Detection results on the comparison between the proposed features and the other features on the NIKON dataset (T = 4)

Fig. 10 The ROC curves of different feature vectors on the NIKON dataset

It can be noticed from Table 4 and Fig. 10 that the proposed BDWT feature outperforms the other features on the NIKON dataset, achieving a detection rate of 85.80%. Interestingly, the detection performance of the LiQDCT feature is similar to that of the HeDWT feature, and even better. We conjecture that this is because the LiQDCT feature uses the quaternion discrete cosine transform (QDCT) to treat the three color components of the color image as a whole, making full use of the inherent connection between the color channels, and not because the DCT based scheme performs better than the DWT based scheme.

The HeDCT feature achieves a detection rate of only 77.84% on the NIKON dataset, and the ZhangDCT feature only 80.81%; neither performs as well as the HeDWT feature or the proposed BDWT feature. From the experiments so far, we conclude that the DWT based scheme performs better than the DCT based scheme on color images. However, an alternative explanation remains open: the HeDCT and ZhangDCT features may perform better on the DVMM dataset than on the IFS-TC and NIKON datasets because the DCT based scheme is good at processing gray images, rather than because the dataset may have been generated with some DCT processing. To test this, we need to build a grayscale image dataset without DCT compression.

4.6 Detection performance on GrayNIKON dataset

To verify that the HeDCT and ZhangDCT features perform better on the DVMM dataset than on the IFS-TC and NIKON datasets because the DCT based scheme is good at handling datasets generated with some DCT processing, rather than because it is good at handling gray images, we build a grayscale image dataset without DCT compression, denoted GrayNIKON. Because LiQDCT can only process color images, we compare the proposed BDWT feature only with the HeDCT, HeDWT and ZhangDCT features on the GrayNIKON dataset. Further experimental details are described in Section 4.1. The detailed comparison results and the corresponding ROC curves on the GrayNIKON dataset are given in Table 5 and Fig. 11.
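The text does not state how the grayscale conversion was carried out. As one illustrative possibility, a color image can be converted pixel by pixel with a weighted sum of its channels. The sketch below assumes the ITU-R BT.601 luma weights and 8-bit channels; the function names and the toy image are hypothetical.

```python
def rgb_to_gray(pixel):
    """Convert one (R, G, B) pixel with 8-bit channels to an 8-bit gray level.

    Assumption: ITU-R BT.601 luma weights (0.299, 0.587, 0.114).
    """
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Round to the nearest integer and clamp to the valid 8-bit range.
    return min(255, max(0, int(round(y))))


def image_to_gray(image):
    """Convert an RGB image given as a list of rows of (R, G, B) tuples."""
    return [[rgb_to_gray(p) for p in row] for row in image]


# Toy 2x2 image: white, black, pure red, pure blue.
toy_image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
toy_gray = image_to_gray(toy_image)
```

To keep such a dataset free of DCT processing, the converted images would have to be stored in a lossless format rather than as JPEG files.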

Table 5 Detection results on the comparison between the proposed features and the other features on the GrayNIKON dataset (T = 4)
Fig. 11 The ROC curves of different feature vectors on the GrayNIKON dataset

Table 5 and Fig. 11 show that the proposed BDWT feature outperforms the other features on the GrayNIKON dataset. The proposed BDWT feature achieves a detection rate of 85.80%, and the DCT based scheme still does not outperform the DWT based scheme. Based on the above experiments, we have reason to believe that the DWT based scheme is more widely applicable and more powerful than the DCT based scheme, and that the DCT based scheme is better suited to datasets generated with JPEG compression.

4.7 The discussions on experiment results

First, the experimental results on the four datasets show that the proposed block DWT features achieve a higher detection rate for image splicing detection, and that the block DWT strategy improves the detection rate of DWT features without increasing the feature dimensionality. The block DWT yields features that better reflect the correlation among image pixels, which is why the proposed block DWT features achieve promising detection performance on all four datasets.
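The idea of a block-wise transform can be illustrated with a minimal sketch, assuming that the strategy applies a single-level 2-D DWT independently to each non-overlapping block of the image. The Haar wavelet, the 8 × 8 block size, and all function names below are assumptions made for illustration; the subsequent feature extraction is not shown.

```python
import math


def haar_1d(values):
    """Single-level orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_2d(block):
    """Single-level 2-D Haar DWT of a square block with even side length.

    Returns four subbands (LL, LH, HL, HH) as lists of rows. The first letter
    refers to the filtering along rows, the second to the filtering along
    columns; naming conventions for LH and HL differ between texts.
    """
    # Step 1: transform every row into a low-pass and a high-pass half.
    low_rows, high_rows = [], []
    for row in block:
        a, d = haar_1d(row)
        low_rows.append(a)
        high_rows.append(d)

    def transform_columns(rows):
        # Step 2: transform every column of the given half.
        lows, highs = [], []
        for col in zip(*rows):
            a, d = haar_1d(list(col))
            lows.append(a)
            highs.append(d)
        # Transpose back so that each result is again a list of rows.
        return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

    ll, lh = transform_columns(low_rows)
    hl, hh = transform_columns(high_rows)
    return ll, lh, hl, hh


def block_dwt(image, block_size=8):
    """Apply haar_2d independently to each non-overlapping block.

    The image height and width are assumed to be multiples of block_size.
    Returns a dict mapping (block_row, block_col) to its four subbands.
    """
    height, width = len(image), len(image[0])
    result = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in image[top:top + block_size]]
            result[(top // block_size, left // block_size)] = haar_2d(block)
    return result


# Toy 16x16 image: gray level 10 on the left half, 50 on the right half.
toy = [[10] * 8 + [50] * 8 for _ in range(16)]
coeffs = block_dwt(toy, block_size=8)
```

Because every block is transformed on its own, the coefficients of a block depend only on the pixels inside it, which is one way to read the statement that the features reflect local pixel correlation. In the toy image each block is constant, so all detail subbands are zero and the step between the two halves does not leak into any block's coefficients.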

Second, regarding the observation that the traditional DWT based scheme does not outperform the DCT based scheme, the DCT features perform worse on the NIKON and GrayNIKON datasets than on the DVMM and IFS-TC datasets, whereas the performance of the DWT features does not differ significantly across the four datasets. Since the NIKON and GrayNIKON datasets were generated without any DCT processing, there is reason to believe that the DVMM dataset may have been generated with some DCT processing and that the DCT features are adept at handling such datasets. This would explain why the traditional DCT features achieve a promising detection rate on the DVMM dataset but an unsatisfactory one on the NIKON and GrayNIKON datasets. The traditional DCT based scheme therefore has a natural advantage over the DWT based scheme on the traditional, frequently used datasets that involve JPEG compression. As for the IFS-TC dataset, the favorable detection rate of the traditional DCT features gives reason to suspect that it also involves JPEG compression.

Finally, the results on the four datasets and the related analysis suggest that the DWT based scheme is more widely applicable and more powerful than the DCT based scheme for image splicing detection, especially on datasets generated without any JPEG compression, while the DCT based scheme is better suited to datasets generated with JPEG compression.

4.8 The comparison between two schemes

Effective features for image splicing detection can be obtained in both the DCT domain and the DWT domain. Based on the experimental results in this paper and previous studies [15, 18, 46], some characteristics of the DWT based scheme and the DCT based scheme can be summarized as follows.

The DCT based scheme for image splicing detection has the following characteristics:

  • The DCT domain is the most frequently used transform domain for image splicing detection; the DCT allows an image to be analyzed in the frequency domain.

  • The features extracted in the DCT domain have lower dimensionality and low computational complexity, yet the detection rate is relatively high on traditional datasets such as the DVMM dataset and the IFS-TC dataset.

  • The features extracted in the DCT domain consist of features of intra-block DCT coefficients and features of inter-block DCT coefficients, which capture the correlation among image pixels and among image blocks, respectively.

  • The detection rate of single-layer coefficients increases gradually as the number of decomposition layers increases.

  • The DCT based scheme is better suited to datasets that were generated with JPEG compression.

The DWT based scheme for image splicing detection has the following characteristics:

  • The features extracted in the DWT domain have often been used as a supplement to the features extracted in the DCT domain. The DWT allows both the frequency information and the corresponding pixel position information to be analyzed.

  • The features extracted in the DWT domain have higher dimensionality, because more features can be constructed to characterize the three kinds of dependency among wavelet coefficients: across positions, scales and orientations. The computational complexity is also higher.

  • More effective features for image splicing detection can be obtained in the DWT domain, and the detection performance can be improved by the proposed block DWT strategy.

  • The detection rate of a single-layer subband decreases gradually as the number of decomposition layers increases.

  • A well-constructed DWT based scheme can achieve better detection performance than the DCT based scheme.

  • The DWT based scheme is more widely applicable and more powerful than the DCT based scheme on uncompressed images.

5 Conclusion and future work

In this paper, motivated by the observation that traditional DWT based schemes do not outperform DCT based schemes, we first propose a block DWT based scheme to improve the detection performance of the DWT based scheme. Features based on the block DWT strategy capture the correlation between local and adjacent pixels, so the proposed scheme yields more effective features. The experimental results show that the detection performance of features extracted in the DWT domain can be improved without increasing the feature dimensionality. To further clarify why traditional DWT based schemes do not outperform DCT based schemes, we then present a detailed comparison between the two kinds of schemes based on an analysis of the experimental results. The results suggest that the cause may be that the frequently used datasets involve JPEG compression. They also show that the DWT based scheme is more widely applicable and more powerful than the DCT based scheme, and that the DCT based scheme is adept at handling datasets generated with JPEG compression. Multi-core CPU and many-core GPU techniques and advanced optimization methods are becoming increasingly prominent in computer science [41, 42, 47, 48]. In future work, we will use them to enhance our algorithm for processing large-scale image data.