1 Introduction

Digital images have become an essential part of our day-to-day life since they convey rich information. With the wide availability of photo-editing software such as Adobe Photoshop, GNU Image Manipulation Program (GIMP), etc., digital images can easily be manipulated to serve a user’s interest [16]. For example, in the medical field, physicians make diagnoses based on images; since large sums of money are tied to medical images, these images are sometimes manipulated to claim medical insurance [43,44,45,46]. This creates a need for advanced methods to determine the legitimacy and truthfulness of digital images used in law, the military, science, medicine, journalism, and other domains of extreme importance. Intrusive (active) and non-intrusive (passive) techniques are used to authenticate images. In intrusive methods, information such as a digital watermark or signature is inserted into the image. Various watermarking techniques [1,2,3,4, 49] have been proposed for verifying the authenticity of images and detecting forgery. Their major shortcoming is that they require special software or hardware either to insert authentication information into the images or to extract it from them. Although researchers previously preferred digital watermarking and digital signature algorithms, non-intrusive techniques have recently become more popular since they do not insert any secondary data into the image; instead, they authenticate images by examining their intrinsic properties. Two common forgery types addressed by non-intrusive procedures are copy-move forgery (CMF) and splicing forgery. In CMF, one portion of an image is copied and pasted within the same image to obscure some significant content. In splicing forgery, a portion of one image is copied and inserted into a different image to generate a new composite image [16, 29, 50]. An illustration of splicing forgery is given in Fig. 1. During splicing, the inserted regions are generally compressed, resampled, and blurred to produce a convincing forged image. Spliced images are therefore used for malicious purposes, since image splicing can be performed with ease and forged images are difficult to detect with the human eye. Thus, developing reliable splicing detection techniques to determine the genuineness of images has become a significant issue, which motivates researchers to introduce various procedures to detect splicing forgery. The central idea of most image splicing detection approaches is to detect regions of irregularity using features of the image [18, 35].

Fig. 1 An illustration of splicing forgery

Recently, several effective approaches have been introduced to improve the performance of splicing forgery detection. He et al. [18] fused Markov features in the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT) domains. Although the paper demonstrated the effectiveness of Markov features, the accuracy rate still needs improvement. Zhang et al. [47] introduced a forgery detection approach based on Local Binary Patterns (LBP) applied to multi-size block DCT (MBDCT) coefficients; combining DCT and LBP improved the accuracy, but meeting everyday forgery challenges demands a still more accurate method, and the proposed scheme aims at improving the detection accuracy rate. Zhang et al. [48] detected splicing forgery by extracting Markov features in the Contourlet transform and DCT domains; the Contourlet features describe the positional dependence between Contourlet sub-band coefficients. Suthiwan et al. [41] used Markov features in the multi-block DCT (MBDCT) domain to detect splicing forgery and the artifacts created by post-processing in the dataset. El-Alfy et al. [15] proposed an image splicing forgery detection procedure by extracting Markov features in the spatial and DCT domains. Sheng et al. [36] and Zhang et al. [51] extracted Markov features in the Discrete Octonion Cosine Transform (DOCT) domain and the block DWT domain, respectively, using an SVM classifier. Prakash et al. [34] used BDCT and an enhanced threshold for feature extraction. These techniques have difficulty capturing the correlation among pixels, so the modeling of pixel correlation needs to be improved to avoid degradation of image quality.

Zhao et al. [52] introduced an approach that models an image as a 2-D non-causal signal. The model is applied in the BDCT and discrete Meyer wavelet transform (DMWT) domains, and the combined extracted features are used for classification. This approach achieves a better detection rate, but at the cost of a high feature dimensionality. Shi et al. [37] treated the neighboring differences of the BDCT coefficients of an image as a 1-D signal; the dependencies between neighboring nodes along a certain direction (horizontal or vertical) were modeled as a causal Markov model, and the TPM was taken as the discriminative feature vector for SVM classification. Kirchner et al. [25] detected median filtering of JPEG-compressed images using SPAM features, and the results demonstrate that this can serve as a detector for image forgery. Agarwal et al. [5] extracted internal statistical properties using rotation-invariant co-occurrences of the LBP operator. Muhammad et al. [30] proposed an image forgery detection technique based on the Steerable Pyramid Transform (SPT) together with LBP: SPT produces multi-oriented sub-bands, LBP histograms are evaluated from each sub-band and concatenated to generate the feature vector, and SVM is used for classification. Dong et al. [13] detected splicing by extracting statistical features from image run-length and image edge statistics. Li et al. [28] introduced a technique based on Markov features in the quaternion DCT (QDCT) domain to detect image splicing; both intra-block and inter-block correlations are extracted in the QDCT domain, and SVM classifies authentic and spliced images. Alahmadi et al. [8] proposed a technique based on LBP and DCT to detect forgeries: LBP code blocks are converted into the DCT domain, the standard deviation is evaluated, and the result is fed to the SVM classifier. Agarwal et al. [6] used the Undecimated Wavelet Transform (UWT) to highlight the details of the image and then applied the Markov process to extract features. Hussain et al. [20] examined the effect of two texture descriptors, multi-scale LBP and multi-scale WLD, for the robust detection of splicing forgery; the experimental results show that multi-scale WLD performs better than multi-scale LBP. Alahmadi et al. [7] applied the 2D-DCT in the LBP domain to extract discriminative localized features. Kumar et al. [27] applied Markov features in the BDCT and DMWT domains together with an enhanced threshold method; the flaw of this method is the loss of feature-vector detail caused by the thresholding process, and its detection performance is quite low, with a highest reported accuracy of 88.43% on the DVMM dataset. Jalab et al. [21] detected splicing forgery with a texture descriptor based on approximated Machado fractional entropy (AMFE); although it attains good results, it is not robust to post-processing operations such as JPEG compression. Kanwal et al. [23] introduced an overlapping block-based approach for detecting image splicing forgery, extracting features with an Otsu-based enhanced local ternary pattern (OELTP) and using energy to reduce the feature dimensionality; nevertheless, the approach is not computationally efficient because of the overlapping blocks.
Several approaches for image splicing forgery detection have been proposed in the literature, but the detection accuracy rate still needs improvement. The proposed approach detects image splicing forgery with an improved accuracy rate and a reduced running time. Moreover, it is robust against the post-processing operation of JPEG compression.

As discussed earlier, many procedures for detecting image splicing forgery have been proposed in previous years, and this progression seems never-ending, so it is difficult to discuss all of them in this paper. Instead, the trend of the literature in the field of image splicing forgery detection is demonstrated in Fig. 2. Since the search strategy is a significant part of the survey process, Semantic Scholar was used to gather the relevant literature. It is observed that this field is still progressing and has become a topic of interest for many researchers.

Fig. 2 Trends of literature analysis of image splicing forgery detection in the past decade

Previous approaches [22, 31, 38] have used the combination of LBP and DWT in applications such as image retrieval, face recognition, and object recognition. In prior work, image splicing forgery has been detected either by fusing LBP and DWT [24, 53] or by combining Markov features and DWT [39, 51]; however, to the best knowledge of the authors, the combination of Markov features, LBP, and DWT has not yet been used in any application. Thus, in the proposed method, Markov features in both the DWT and LBP domains are extracted and combined to detect image splicing forgery efficiently. Since image splicing produces sharp edges in a forged image, capturing the artifacts introduced by forgery is the key to image splicing detection. Because the edges introduced by forgery differ from their neighbors, the relationships between the spliced part and the normal part can be used to expose image forgery, and the proposed method uses the Markov TPM to describe these relationships. Furthermore, DWT is used because wavelet analysis is good at capturing the localized changes created by the splicing operation, and DWT has better joint spatial and frequency resolution than transforms such as DCT and DFT. In addition, LBP is used because it is an effective texture operator that captures local deviations in the texture of forged images, since the original texture of an image is distorted when manipulation is performed. Consequently, the proposed Markov-based technique is effective for image splicing forgery detection.

The technical contributions of the proposed work are:

  • Inspired by the strong capability of Markov TPM in characterizing pixel correlation, Markov features from both LBP and DWT domains are extracted and combined, for the first time, to the best knowledge of the authors.

  • The abrupt changes that occur during the creation of an image splicing forgery are highlighted using a standard deviation filter in the proposed approach.

  • Experiments performed on six datasets indicate that the proposed approach offers better results than the existing techniques in terms of accuracy, TNR, TPR, and informedness, as presented in Table 5.

  • The comparison between existing state-of-the-art techniques and the proposed technique is given in Table 6.

  • Also, a run time analysis is performed to validate the efficiency of the proposed work, as shown in Table 7.

  • To validate the robustness of the proposed method, JPEG compression is applied and superior results are attained in comparison to the existing techniques.

  • Furthermore, a statistical analysis test using ANOVA is performed to confirm the efficacy of the proposed scheme.

The remainder of the paper is structured as follows: the proposed technique is described in detail in Section 2, Section 3 illustrates the experimental results, and Section 4 concludes the paper.

2 Proposed methodology

In the proposed method, the Markov process is applied in both the LBP and DWT domains. The features from the two domains are then combined and normalized. The extracted features rely on the observation that falsification alters the correlation pattern among pixels. Consequently, features extracted from the DWT domain are fused with features from the LBP domain, and in both domains the statistical fluctuations are modeled through the Markov procedure. The layout of the proposed algorithm is depicted in Fig. 3.

Fig. 3 The framework of the proposed scheme for image splicing detection

2.1 Pre-processing

Pre-processing operations are executed on the images before the subsequent steps. In this step, the RGB image Z of size W × V is converted to a grayscale image as given below [16]:

$$ Z=0.299R+0.587G+0.114B $$
(1)

where R, G, and B are the red, green, and blue components of the image Z, respectively.
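
For illustration, a minimal Python/NumPy sketch of this conversion (illustrative only; the authors' implementation is in MATLAB) could look as follows, using the weights of eq. (1):

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to grayscale using eq. (1)."""
    rgb = rgb.astype(np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
```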

2.2 STD filter

After pre-processing, a standard deviation (STD) filter is used to highlight the inconsistencies in forged images. The STD filter is chosen because of its ability to measure the inconsistencies in spliced images, since the intensity level changes by a large amount at the edges of a spliced region. Moreover, the edges of the spliced part differ from the rest of the image, so the relationship between the spliced part and the normal part can be used to reveal image splicing forgery. The STD filter is capable of capturing these relationships because it is a reliable measure of local variation.

This filter replaces the value of each pixel with the standard deviation of its neighborhood, including the pixel itself. In spliced images, the portion that has been cut and pasted to generate the forgery is highlighted by the STD filter [5, 26, 40]. Since the STD filter suppresses isolated noise points in the image, the edge details of the spliced image are preserved, whereas other filters are more sensitive to noise when used for edge detection. For example, in Fig. 4 a bird is removed from one image and inserted into another to generate a forgery. It can be observed from Fig. 4 that the standard deviation filter detects the edges of the spliced part more smoothly than other filters such as the variance, skewness, and kurtosis filters. Thus, the proposed technique uses the STD filter to highlight the abrupt changes occurring in spliced images.
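
As a sketch of this step, the following Python snippet applies a local standard-deviation filter; the 3 × 3 window size is an assumption, since the paper does not state the neighborhood used:

```python
import numpy as np
from scipy.ndimage import generic_filter

def std_filter(gray, size=3):
    """Replace each pixel by the standard deviation of its size x size
    neighborhood (including the pixel itself); the window size is assumed."""
    return generic_filter(gray.astype(np.float64), np.std, size=size)
```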

Fig. 4 Example of the standard deviation filter

2.3 Discrete wavelet transform (DWT)

Wavelet analysis works well at capturing short-term or localized changes in signals. Common wavelet families include Haar, Daubechies, Coiflet, Symlet, and Meyer. In this paper, the DWT is applied with the discrete Haar wavelet since it is a fast, memory-efficient, and conceptually simple wavelet. The information concerning edges, which carries much of the image content, is retained in the DWT sub-bands. The first-level DWT divides an input image into four sub-band images: one low-frequency sub-band LL (the approximation coefficients) and three high-frequency sub-bands LH, HL, and HH (the detail coefficients) in the vertical, horizontal, and diagonal directions, respectively. For a third-level decomposition, the low-frequency sub-band is decomposed twice more to reduce the image dimensions and extract features; decomposing the low-frequency sub-band also retains more information and less noise. The DWT performs decomposition and reconstruction of signals using a scaling function γ(t) and a basic wavelet function ξ(t). The original signal is approximated using the scaling function, and the detailed variations are captured by the wavelet function, as given below [11, 39].

$$ \gamma (t)=\sum \limits_nb(n)\sqrt{2}\gamma \left(2t-n\right) $$
(2)
$$ \xi (t)=\sum \limits_na(n)\sqrt{2}\gamma \left(2t-n\right) $$
(3)

Also, the basic wavelet function is built from the scaling function. a(n) and b(n) are the filter coefficients, and their relation is indicated in eq. (4). In this transform, b(n) and a(n) correspond approximately to the low-pass and high-pass filters, respectively.

$$ a(n)={\left(-1\right)}^nb\left(m-n-1\right) $$
(4)

Here, m denotes the length of the filter and n is the coefficient index. The decomposition of the wavelet transform is depicted in eqs. (5) and (6).

$$ eA1(l)=\sum \limits_nb\left(n-2l\right)R(n) $$
(5)
$$ eD1(l)=\sum \limits_na\left(n-2l\right)R(n) $$
(6)

R(n) is the original signal and eA1(l) is the approximation coefficient that preserves the low-frequency content of R(n). The detail coefficient, which preserves the high-frequency content of R(n), is denoted eD1(l). The DWT is used to detect splicing forgery because it can represent the image at various resolutions and positions. Other transforms such as DCT and DFT are full-frame transforms, so the entire image is affected by any change in their coefficients. In contrast, the DWT has spatial-frequency locality, meaning that an embedded signal affects the image only locally. Hence, the wavelet transform provides both spatial and frequency descriptions of an image.
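
A minimal sketch of the three-level Haar decomposition described above is given below (Python with PyWavelets, illustrative only); which sub-bands feed the subsequent Markov feature extraction is left open here, so the sketch simply returns all of them:

```python
import pywt

def haar_dwt_level3(img):
    """Three-level Haar DWT: the LL (approximation) sub-band is decomposed
    twice more; returns the final approximation and all detail sub-bands."""
    approx, details = img, []
    for _ in range(3):
        approx, (cH, cV, cD) = pywt.dwt2(approx, 'haar')
        details.append((cH, cV, cD))  # horizontal, vertical, diagonal details
    return approx, details
```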

2.4 Local binary pattern (LBP)

LBP is an effective texture operator that captures local deviations in the texture of altered images. In this technique, every pixel is labeled by comparing the gray levels of its neighbors with its own: a neighbor is assigned the value one if its gray level is greater than or equal to that of the center pixel, and zero otherwise. A binary pattern is thus obtained for each center pixel, and the weighted sum of the pattern bits is called the LBP code. The LBP operator is calculated using eq. (7) as shown below [42, 47]:

$$ L\left(x,y\right)=\sum \limits_{i=0}^{q-1}Z\left({h}_i-h\left(x,y\right)\right){2}^i $$
(7)

where q is the total number of pixels in the circular neighborhood of radius R, h(x, y) is the gray value of the center pixel at (x, y), hi is the ith pixel in the neighborhood, and Z(hi − h(x, y)) is the threshold function.

$$ Z\left({h}_i-h\left(x,y\right)\right)=\begin{cases}1 & {h}_i-h\left(x,y\right)\ge 0\\ 0 & {h}_i-h\left(x,y\right)<0\end{cases} $$
(8)

The original texture of an image is distorted when the image is forged. As LBP is proficient at capturing such texture differences, it is used in the proposed approach to distinguish forged images from authentic ones.
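
As a sketch, the LBP code image of eqs. (7)–(8) can be obtained with scikit-image as shown below; the parameters q = 8 and R = 1 are those selected later in Section 3.2, and the library's neighbor ordering may differ in detail from the notation above:

```python
from skimage.feature import local_binary_pattern

def lbp_code_image(gray, q=8, R=1):
    """LBP code image: q neighbors on a circle of radius R (eqs. 7-8)."""
    return local_binary_pattern(gray, P=q, R=R, method='default')
```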

2.5 2-D difference arrays

The features that discriminate falsification are determined by the artifacts produced at the image edges by the tampering procedure. For this reason, the relationship among neighboring pixels is captured by evaluating differences in the horizontal (H), vertical (V), main-diagonal (D), and minor-diagonal (M) directions for both the LBP and DWT coefficients. The difference arrays Lz(x, y), z ∈ {H, V, D, M}, for LBP are calculated as [9]:

$$ {L}_H\left(x,y\right)=L\left(x,y\right)-L\left(x+1,y\right) $$
(9)
$$ {L}_V\left(x,y\right)=L\left(x,y\right)-L\left(x,y+1\right) $$
(10)
$$ {L}_D\left(x,y\right)=L\left(x,y\right)-L\left(x+1,y+1\right) $$
(11)
$$ {L}_M\left(x,y\right)=L\left(x+1,y\right)-L\left(x,y+1\right) $$
(12)

where L(x, y) is the computed LBP code, 1 ≤ x ≤ Rx, 1 ≤ y ≤ Ry, and Rx × Ry is the size of the image. For the DWT-based Markov features, the differences are evaluated in all four directions in the same way as for LBP: L(x, y) is replaced by W(x, y) in the above equations to obtain Wz(x, y), z ∈ {H, V, D, M}, for DWT.
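
A compact NumPy sketch of eqs. (9)–(12) is given below; it assumes x indexes the rows and y the columns of the 2-D array F (either the LBP code image L or a DWT coefficient array W):

```python
import numpy as np

def difference_arrays(F):
    """Directional difference arrays of eqs. (9)-(12)."""
    F = np.asarray(F, dtype=np.float64)
    return {
        'H': F[:-1, :] - F[1:, :],      # F(x,y) - F(x+1,y)
        'V': F[:, :-1] - F[:, 1:],      # F(x,y) - F(x,y+1)
        'D': F[:-1, :-1] - F[1:, 1:],   # F(x,y) - F(x+1,y+1)
        'M': F[1:, :-1] - F[:-1, 1:],   # F(x+1,y) - F(x,y+1)
    }
```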

2.6 Markov transition probability matrix (TPM)

The Markov process is a proficient tool for feature extraction because it identifies the relationships among features. As stated in the theory of random processes, the Markov TPM can be used to describe the relationships between the spliced portion and the normal portion of a forged image; Markov-based features are therefore a measure that can reveal the statistical changes produced by splicing. Since image splicing produces sharp edges in the forged image, capturing the artifacts introduced by forgery is the key to splicing detection: the edges introduced by forgery differ from their neighbors, so the relationships between the spliced part and the normal part can expose the forgery. Consequently, Markov-based techniques are effective for the image splicing problem. Initially, the STD filter is applied to the input image, which partially highlights the inconsistencies of the tampering artifacts; the Markov TPM is then used to capture these inconsistencies completely and discover the forged regions. Since applying the Markov process to difference arrays reduces the dimensionality of the TPM, the Markov process is applied to the difference arrays rather than directly to the image or to the 2-D coefficient arrays. The TPMs obtained from the difference arrays of both the LBP and DWT domains capture pixel or coefficient correlations to detect spliced artifacts. The general block diagram of Markov feature extraction is given in Fig. 5, and the difference arrays for the horizontal (H), vertical (V), main-diagonal (D), and minor-diagonal (M) directions are shown in Fig. 6. The equations of the difference arrays for all directions are given in Section 2.5 [15, 37].

Fig. 5 Block diagram of the Markov feature extraction procedure

Fig. 6 Difference 2-D arrays: (i) horizontal difference array (H), (ii) vertical difference array (V), (iii) main diagonal difference array (D), and (iv) minor diagonal difference array (M)

The difference arrays of both the LBP and DWT domains are limited to the range [−T, +T]. If L(x, y) or W(x, y) is less than −T or greater than +T, it is replaced by −T or +T, respectively, as given in the following equation [9].

$$ {Z}_z\left(x,y\right)=\begin{cases}+T & {F}_z\left(x,y\right)>+T\\ -T & {F}_z\left(x,y\right)<-T\\ {F}_z\left(x,y\right) & \mathrm{otherwise}\end{cases} $$
(13)

where Fz(x, y) is either Lz(x, y) or Wz(x, y), for z ∈ {H, V, D, M}. The threshold T constrains the number of states required to model the data; in the proposed scheme, T is set to 4 to balance computational efficiency and classifier performance. The Markov process is characterized by its TPM, and the total number of elements in each direction for a one-step TPM is (2T + 1) × (2T + 1). The TPMs for the H, V, D, and M directions are given by the following equations [24]:

$$ P\left[{Z}_h\left(x+1,y\right)=q\mid {Z}_h\left(x,y\right)=p\right]=\frac{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y}\delta \left({Z}_h\left(x,y\right)=p,{Z}_h\left(x+1,y\right)=q\right)}{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y}\delta \left({Z}_h\left(x,y\right)=p\right)} $$
(14)
$$ P\left[{Z}_v\left(x,y+1\right)=q\mid {Z}_v\left(x,y\right)=p\right]=\frac{\sum \limits_{x=1}^{R_x}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_v\left(x,y\right)=p,{Z}_v\left(x,y+1\right)=q\right)}{\sum \limits_{x=1}^{R_x}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_v\left(x,y\right)=p\right)} $$
(15)
$$ P\left[{Z}_d\left(x+1,y+1\right)=q\mid {Z}_d\left(x,y\right)=p\right]=\frac{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_d\left(x,y\right)=p,{Z}_d\left(x+1,y+1\right)=q\right)}{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_d\left(x,y\right)=p\right)} $$
(16)
$$ P\left[{Z}_m\left(x,y+1\right)=q\mid {Z}_m\left(x+1,y\right)=p\right]=\frac{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_m\left(x+1,y\right)=p,{Z}_m\left(x,y+1\right)=q\right)}{\sum \limits_{x=1}^{R_x-1}\sum \limits_{y=1}^{R_y-1}\delta \left({Z}_m\left(x+1,y\right)=p\right)} $$
(17)

where p, q ∈ {−T, −T + 1, …, 0, …, T − 1, T} and Rx × Ry is the size of the image. δ(⋅) = 1 if its arguments are satisfied and δ(⋅) = 0 otherwise, as shown in the following equation:

$$ \delta \left(A=p,B=q\right)=\left\{\begin{array}{cc}1& A=p,B=q\\ {}0& otherwise\end{array}\right. $$
(18)

After evaluating the Markov features from both domains, i.e. LBP and DWT, with T = 4, the feature vector is generated and fed to the SVM classifier for classification. In this way, the computational complexity is kept low while the detection performance is improved.
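
The following Python sketch summarizes this step under the definitions above: it clips each difference array to [−T, T] (eq. 13), builds the four one-step TPMs (eqs. 14–17), and concatenates them into (2T + 1)² × 4 = 324 features per domain, i.e. 648 after fusing the LBP and DWT domains. Rounding non-integer (DWT) differences before clipping, and the ordering of the concatenated features, are assumptions not stated in the text:

```python
import numpy as np

T = 4  # Markov threshold (Sec. 2.6)

# (source, destination) index pairs realizing the transitions of eqs. (14)-(17):
# H, V and D move one step forward; M moves from (x+1, y) to (x, y+1).
_PAIRS = {
    'H': (np.s_[:-1, :],   np.s_[1:, :]),
    'V': (np.s_[:, :-1],   np.s_[:, 1:]),
    'D': (np.s_[:-1, :-1], np.s_[1:, 1:]),
    'M': (np.s_[1:, :-1],  np.s_[:-1, 1:]),
}

def transition_matrix(Z, direction):
    """One-step TPM over a clipped difference array Z for one direction."""
    s_src, s_dst = _PAIRS[direction]
    src, dst = Z[s_src].ravel() + T, Z[s_dst].ravel() + T
    tpm = np.zeros((2 * T + 1, 2 * T + 1))
    np.add.at(tpm, (src, dst), 1)               # joint counts of (p, q) pairs
    rows = tpm.sum(axis=1, keepdims=True)
    return tpm / np.maximum(rows, 1)            # conditional probabilities

def markov_features(diffs):
    """Concatenate the four TPMs of the clipped difference arrays (eq. 13)."""
    feats = []
    for z in ('H', 'V', 'D', 'M'):
        Zc = np.clip(np.round(diffs[z]), -T, T).astype(int)
        feats.append(transition_matrix(Zc, z).ravel())
    return np.concatenate(feats)
```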

2.7 SVM classification

The support vector machine (SVM) is a standard classifier based on the concept of a separating hyperplane. The optimal separating hyperplane, which distinguishes the positive patterns from the negative patterns, is found using Lagrangian multipliers. The SVM can handle feature spaces that are both linearly and nonlinearly separable. Authentic images are labeled +1 and forged images −1, resulting in a two-class classification problem that can be solved by the SVM.
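
A minimal training sketch with scikit-learn is shown below; the RBF kernel, the default hyperparameters, and the use of standardization as the feature normalization step are assumptions, since the paper does not specify them. The trained pipeline can then be applied to unseen feature vectors with clf.predict(X_test).

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_classifier(X_train, y_train):
    """Train an SVM on fused Markov feature vectors (+1 authentic, -1 forged)."""
    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
    clf.fit(X_train, y_train)
    return clf
```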


3 Experimental results and discussion

The proposed scheme is implemented using MATLAB R2017b (9.9.0.713579). The experiments are carried out on an Intel(R) Core(TM) i5-4210U CPU @ 2.4 GHz with 4.00 GB of memory, running Microsoft Windows 8.1.

3.1 Description of datasets

In this section, experiments are conducted to evaluate the efficacy of the proposed algorithm. The benchmark datasets used in the experimental analysis are discussed below. Figure 7 shows example images from each dataset, and Table 1 lists certain characteristics of these datasets.

  • CASIA v1.0: The CASIA image tampering detection evaluation dataset (CITDE) offers challenging and realistic images for tampering detection. The dataset comprises 800 authentic and 921 forged images [14].

  • CASIA v2.0: This dataset comprises 7491 authentic and 5123 forged images in 9 categories: animal, scene, architecture, plant, nature, indoor, character, article, and texture [14].

  • Columbia Uncompressed Image Splicing Detection Evaluation Dataset: It contains 363 images in total, of which 183 are authentic and the rest are forged. The images are in uncompressed formats, i.e. BMP and TIFF, and depict indoor scenes such as bookshelves, computers, or desks [19].

  • DVMM: Contributed by Columbia University to appraise detection approaches, it has 933 authentic and 912 forged images. The tampering in this dataset was created by cutting and pasting along object boundaries or perpendicular/parallel strips, from either the same image or a different image [32].

  • IFS-TC: This dataset was initially used in an international competition organized by the IFS-TC. It encompasses 1150 forged and 1050 authentic images [10].

  • DSO-1: It comprises 100 authentic and 100 forged images, covering both indoor and outdoor scenes. The forged images are generated by inserting one or more persons into an original image, and operations such as changes in color and illumination are applied to make the forged images realistic [12].

Fig. 7 Examples of authentic and forged images from the various datasets: (i) CASIA v1.0 (ii) CASIA v2.0 (iii) Columbia (iv) DVMM (v) IFS-TC (vi) DSO-1

Table 1 Characteristics of evaluated datasets

3.2 Performance parameters

The efficacy of the procedure is evaluated using several performance parameters: detection accuracy (Accuracy), recall (R), F2 score (F2), True Positive Rate (TPR), F1 score (F1), precision (P), True Negative Rate (TNR), informedness (Inf), markedness (Mkd), and the Matthews correlation coefficient (MCC). The F1 score combines precision and recall into a single value (their harmonic mean), while the F2 score weights recall more heavily than precision. TPR, also called sensitivity, is the probability of identifying a forged image as forged. TNR, also called specificity, is the probability of identifying an authentic image as authentic. Accuracy is the ratio of the sum of true positives and true negatives to the total number of images used in the experiment. MCC is the correlation coefficient between the predicted and actual classes of the classifier. Informedness (Inf) states the probability that the classifier is informed about the condition, and markedness (Mkd) specifies the probability that the condition is marked by the classifier. These terms are described in the equations below [15, 17, 33].

$$ P=\frac{T_P}{T_P+{F}_P} $$
(19)
$$ TPR=R= Sensitivity=\frac{T_P}{T_P+{F}_N} $$
(20)
$$ TNR= Specificity=\frac{T_N}{T_N+{F}_P} $$
(21)
$$ {F}_1=2\frac{P\cdot R}{P+R} $$
(22)
$$ {F}_2=5\frac{P\cdot R}{4\cdot P+R} $$
(23)
$$ Accuracy=\frac{T_p+{T}_N}{T_p+{T}_N+{F}_p+{F}_N} $$
(24)
$$ MCC=\frac{T_P\times {T}_N-{F}_P\times {F}_N}{\sqrt{\left(\left({T}_P+{F}_P\right)\left({T}_P+{F}_N\right)\left({T}_N+{F}_P\right)\left({T}_N+{F}_N\right)\right)}} $$
(25)
$$ Informedness= TPR+ TNR-1 $$
(26)
$$ Markedness=\frac{T_P}{T_P+{F}_P}+\frac{T_N}{T_N+{F}_N}-1 $$
(27)

where TP is the number of forged images correctly identified as forged, FP is the number of authentic images incorrectly identified as forged, FN is the number of missed forged images, and TN is the number of authentic images correctly identified as authentic. Furthermore, a brief explanation of how the parameters of the proposed scheme are chosen in the experiments is also given. The proposed scheme involves several parameters, such as the LBP parameters and the Markov threshold T. Extensive experiments have been performed on CASIA v1.0 with different LBP parameters (q, R) to find the setting that gives the best performance; here, q is the number of pixels in the circular neighborhood of radius R [7, 8]. From these experiments, it is observed in Fig. 8 that the LBP parameters q = 8, R = 1 give the best performance with a high accuracy rate. Therefore, the subsequent experiments are executed using these optimal LBP parameter values.
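
For reference, the following sketch computes the main parameters of eqs. (19)–(27) directly from the confusion-matrix counts:

```python
import numpy as np

def metrics(tp, fp, tn, fn):
    """Performance parameters of eqs. (19)-(27) from confusion-matrix counts."""
    p   = tp / (tp + fp)                           # precision, eq. (19)
    tpr = tp / (tp + fn)                           # recall / sensitivity, eq. (20)
    tnr = tn / (tn + fp)                           # specificity, eq. (21)
    f1  = 2 * p * tpr / (p + tpr)                  # eq. (22)
    f2  = 5 * p * tpr / (4 * p + tpr)              # eq. (23)
    acc = (tp + tn) / (tp + tn + fp + fn)          # eq. (24)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))   # eq. (25)
    inf = tpr + tnr - 1                            # informedness, eq. (26)
    mkd = p + tn / (tn + fn) - 1                   # markedness, eq. (27)
    return dict(P=p, TPR=tpr, TNR=tnr, F1=f1, F2=f2,
                Accuracy=acc, MCC=mcc, Inf=inf, Mkd=mkd)
```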

Fig. 8 Effect of the LBP parameters (a) q and (b) R on the performance

Moreover, to select a suitable value for T, a few factors should be taken into account. If the threshold is too small, it is hard to capture the spliced artifacts; if it is too large, the dimensionality of the feature vector becomes very large and the computational cost may become unmanageable. Thus, the choice of T is a trade-off between the detection accuracy and the computational cost of the algorithm. In most papers using Markov features [18, 28, 36, 48], the threshold is set to 4, so we empirically choose T = 4 in our simulations.

3.3 Experimental results

As discussed earlier, six datasets are used to evaluate the proposed scheme. Since, in almost all datasets, the numbers of forged and authentic images are unequal, a balance is maintained between the two classes: images are randomly chosen so that an equal number of authentic and forged images is selected for detection. The training images are assigned the corresponding class label, which is used for training the classifier, while the test images are unlabeled and are used to validate the algorithm's efficacy. For evaluating the performance of the proposed scheme, 80% of the images are used for training and 20% for testing. A confusion matrix summarizes the classifier's performance on the testing data. For instance, the CASIA v1.0 dataset contains 800 authentic and 921 forged images; to maintain balance, 800 authentic and 800 forged images are taken, giving 1600 images in total. With the 80:20 split, 1280 images are used for training the classifier and 320 for testing. The confusion matrix is therefore created from the 320 test images to visualize the accuracy of the classifier by comparing actual and predicted classes. The confusion matrices for the test images of all the datasets, i.e. CASIA v1.0, CASIA v2.0, Columbia, DVMM, IFS-TC, and DSO-1, are given in Table 2 from left to right, respectively.
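
The balanced sampling and 80:20 split described above can be sketched as follows (illustrative; the random selection and stratification details are assumptions). For CASIA v1.0 this yields 800 + 800 = 1600 images, split into 1280 training and 320 testing samples:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def balanced_split(X_auth, X_forged, seed=0):
    """Balance the two classes by random sampling, then split 80:20."""
    rng = np.random.default_rng(seed)
    n = min(len(X_auth), len(X_forged))
    Xa = X_auth[rng.choice(len(X_auth), n, replace=False)]
    Xf = X_forged[rng.choice(len(X_forged), n, replace=False)]
    X = np.vstack([Xa, Xf])
    y = np.concatenate([np.ones(n), -np.ones(n)])   # +1 authentic, -1 forged
    return train_test_split(X, y, test_size=0.2, stratify=y, random_state=seed)
```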

Table 2 Confusion matrices for the respective datasets

The values of the different performance metrics on all the datasets are specified in Table 3. The graphical representation of the performance parameters on all the datasets, i.e. CASIA v1.0, CASIA v2.0, Columbia, DVMM, IFS-TC, and DSO-1, is depicted in Fig. 9.

Table 3 Performance parameters of the proposed method on various datasets
Fig. 9 Graphical representation of performance parameters on the datasets

Several experiments have been carried out on the six mentioned datasets. Moreover, the results of the combined Markov features from both the LBP and DWT domains are compared with the LBP and DWT Markov features individually, as shown in Table 4 for the respective datasets.

Table 4 Results for datasets with various features

Table 4 reveals that significantly better results are attained when Markov features are extracted from both domains, i.e. LBP and DWT, and combined. The graphical representation of the results obtained on all the datasets for the various feature sets (LBP, DWT, and the combination of LBP and DWT) is shown in Fig. 10.

Fig. 10 Graphical representation of results evaluated for various features on the (a) CASIA v1.0 (b) CASIA v2.0 (c) Columbia (d) DVMM (e) IFS-TC (f) DSO-1 datasets

From the observations, Markov features in the LBP domain perform better than Markov features in the DWT domain for the DVMM dataset, and the combination of both domains attains improved results compared to either domain individually. The CASIA v1.0, IFS-TC, CASIA v2.0, DSO-1, and Columbia datasets have also been used. For the Columbia and CASIA v1.0 datasets, the LBP-based Markov features attain better results than the DWT-based Markov features in terms of accuracy and specificity, and merging DWT and LBP improves the detection performance further. Although the DVMM, CASIA v1.0, and Columbia datasets are widely used, they are not large. So, to validate the performance of the proposed approach on a larger dataset, the same technique is applied to CASIA v2.0, where the Markov features in the DWT domain perform better than the Markov features in the LBP domain.

Nevertheless, the best performance is observed when combining the features from both domains. Two more datasets, IFS-TC and DSO-1, have also been used in the experimentation. These datasets are manipulated by cutting and pasting with various degrees of photorealism. To check the results on a very small dataset, DSO-1 is used, which comprises 100 authentic and 100 forged images. For this dataset, the LBP-based features perform better than the DWT-based features, with higher accuracy, and the combination of both feature sets exceeds the detection accuracy of either individual set. The Receiver Operating Characteristic (ROC) curve is plotted to visualize the performance of the classifier and to describe the trade-off between the TPR and the False Positive Rate (FPR). TPR represents the fraction of forged images correctly identified as forged, and FPR represents the fraction of authentic images wrongly classified as forged. The ROC curves for the LBP, DWT, and combined Markov features on all the datasets are shown in Fig. 11. The ROC curves for the CASIA v1.0 and CASIA v2.0 datasets are zoomed for better visualization.

Fig. 11 ROC curves for various features evaluated on the (a) CASIA v1.0 (b) CASIA v2.0 (c) Columbia (d) DVMM (e) IFS-TC (f) DSO-1 datasets

From Fig. 11, it is observed that the ROC curves for the combined features are closest to the upper left corner, which indicates the highest accuracy. The performance of the classifier is calculated using the LBP-based and DWT-based Markov features separately as well as in combination. From Table 4 and the ROC curves for the respective datasets, it is clear that the combination of Markov features from both the LBP and DWT domains provides better results than either domain individually. The accuracy achieved by fusing the LBP and DWT domains on CASIA v1.0, CASIA v2.0, Columbia, DVMM, IFS-TC, and DSO-1 is 99.69%, 99.76%, 98.61%, 97.80%, 96.90%, and 92.50%, respectively.

3.4 Comparative analysis of the proposed scheme with existing schemes

To exhibit the efficacy of the proposed scheme, its performance parameters are compared with those of several existing image splicing detection techniques, as shown in Table 5. The ROC curve is plotted to visualize the performance of the classifier; a curve close to the upper left corner indicates high performance. The comparison of ROC curves for all the datasets is given in Fig. 12, zoomed for the CASIA v1.0 and CASIA v2.0 datasets for better visualization. Moreover, the differences between the state-of-the-art techniques and the proposed technique are given in Table 6.

Table 5 Comparison of performance parameters of the proposed scheme with existing schemes
Fig. 12 Comparative analysis of ROC curves for the proposed scheme evaluated on the (a) CASIA v1.0 (b) CASIA v2.0 (c) DVMM (d) Columbia (e) IFS-TC (f) DSO-1 datasets

Table 6 Comparison of the proposed technique with other state-of-art techniques

Table 5 reveals that the proposed scheme outperforms the existing techniques in terms of performance metrics such as accuracy, sensitivity, specificity, and informedness. As shown in Fig. 12, the ROC curves of the proposed scheme for all the datasets are closer to the upper left corner, indicating that it attains a better accuracy rate than the other existing procedures. It is observed from the experimental results that fusing Markov TPM features from the LBP and DWT domains yields superior sensitivity, specificity, and accuracy. Moreover, compared with the other techniques, the proposed scheme attains excellent detection performance on the CASIA v1.0, DVMM, CASIA v2.0, IFS-TC, Columbia, and DSO-1 datasets.

It is observed from Table 6 that the effectiveness of the proposed technique is validated on six datasets, more than for the other state-of-the-art techniques, and that the proposed method attains better results across several performance parameters. Furthermore, most of the existing techniques do not perform run time analysis or statistical analysis, nor are they robust against post-processing operations. The proposed approach addresses these drawbacks by validating its performance with a run time analysis, a statistical analysis, and a post-processing operation.

3.5 Run time analysis of the proposed approach

In this section, the run time of the proposed approach for the detection of splicing forgery is evaluated on all six mentioned datasets. Table 7 reports the average running time of the proposed approach. The execution time differs across datasets since it depends on the image sizes and the total number of images in each dataset. The average run time of the proposed approach for CASIA v1.0, CASIA v2.0, Columbia, DVMM, IFS-TC, and DSO-1 is 0.372, 0.508, 2.478, 0.110, 4.482, and 2.748 seconds per image, respectively. The average run time on the DVMM dataset is the lowest of the six because each of its images is small, i.e. 128 × 128. On the other hand, the IFS-TC dataset has large images, so it takes the most processing time to execute the proposed algorithm. For the small DSO-1 dataset, the per-image run time is nevertheless higher than that of four of the other datasets. Meanwhile, the CASIA v1.0 and CASIA v2.0 datasets take less time to process than the Columbia, IFS-TC, and DSO-1 datasets. It is observed from Table 7 that the average run time of the proposed approach increases significantly with the size of the images and with the total number of images.

Table 7 Run time analysis of the proposed approach on different datasets

3.6 Robustness test

In this section, a post-processing operation, JPEG compression, is applied to the DVMM dataset to validate the robustness of the proposed technique. The level of JPEG compression is measured by the quality factor: a higher quality factor means higher quality (i.e. less compression), and vice versa. The proposed method extracts Markov features from the DWT and LBP domains. He et al. [18] fused Markov features in the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT) domains, Zhao et al. [52] extracted features from the BDCT and discrete Meyer wavelet transform (DMWT) domains, Shi et al. [37] treated the neighboring differences of BDCT coefficients of an image as a 1-D signal, and Kirchner et al. [25] used SPAM features. After performing the experiments, the detection accuracies of the existing techniques [18, 25, 37, 52] and the proposed method are evaluated, as shown in Fig. 13. It is observed from Fig. 13 that the detection accuracies of all the techniques decrease as the quality factor decreases. The proposed method outperforms the other existing methods at all JPEG quality factors, which shows that it is robust against the post-processing operation of JPEG compression.
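
As an illustrative sketch of how the compressed test images can be generated, each image can be re-encoded at a chosen quality factor, e.g. with Pillow as below; the set of quality factors evaluated is that reported in Fig. 13:

```python
import io
import numpy as np
from PIL import Image

def jpeg_compress(gray, quality):
    """Re-encode a grayscale image at the given JPEG quality factor."""
    buf = io.BytesIO()
    Image.fromarray(np.uint8(np.clip(gray, 0, 255))).save(
        buf, format='JPEG', quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf), dtype=np.float64)
```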

Fig. 13 Detection results for JPEG compression

The detection accuracies at different quality factors for the various techniques are compared using Analysis of Variance (ANOVA). ANOVA is used to determine whether there is a statistically significant difference among the means of two or more independent techniques. The ANOVA test is performed on the results of the proposed technique and of existing techniques, namely He [18], Zhao [52], Shi [37], and Kirchner [25]. Figure 14 shows the statistical analysis of detection accuracy for the different techniques when tested against JPEG compression. From Fig. 14, it can be seen that the whiskers (which indicate the maximum and minimum values) of the proposed technique reach nearly 100, higher than those of the other existing techniques, and that its median values are also better than those of the existing techniques at the 95% confidence level. The representation of the data in Fig. 14 shows that the existing detection techniques are considerably weaker than the proposed technique.

Fig. 14 Statistical analysis of detection accuracy for different techniques, when tested against JPEG compression

4 Conclusion

A passive forgery detection methodology is proposed for the detection of splicing forgery. At the outset, the STD filter is used to highlight the irregularities in forged images. Markov features are then extracted from the LBP and DWT domains separately and combined to detect image splicing forgery, and an SVM classifier is used to evaluate the effectiveness of the algorithm. Accuracy is calculated on six different datasets, i.e. CASIA v1.0, DVMM, CASIA v2.0, IFS-TC, Columbia, and DSO-1. The proposed technique attains 99.69% and 99.76% accuracy on CASIA v1.0 and CASIA v2.0, 97.80% and 98.61% accuracy on DVMM and Columbia, and 96.90% and 92.50% accuracy on IFS-TC and DSO-1, respectively. The experimental results show that fusing Markov features from the LBP and DWT domains improves detection accuracy, sensitivity, specificity, and informedness in comparison to other existing techniques. Moreover, the robustness of the proposed method is validated under JPEG compression, and its efficacy is confirmed by a statistical analysis using ANOVA. In future work, it is planned to localize the tampered regions in spliced images. Moreover, the authentication of medical images has received little attention in the research community. Since medical images may be misrepresented to claim medical insurance, the concerned patient may face social embarrassment or distress, while others may obtain an illegal benefit. This field therefore needs more attention in order to gain the trust of patients and to avoid such harm. Thus, the proposed scheme can be extended and applied to medical images in future work, which will benefit both society and the research community.