Detecting median filtering via two-dimensional AR models of multiple filtered residuals

Yang, Jianquan; Ren, Honglei; Zhu, Guopu; Huang, Jiwu; Shi, Yun-Qing

doi:10.1007/s11042-017-4691-0

Detecting median filtering via two-dimensional AR models of multiple filtered residuals

Published: 18 April 2017

Volume 77, pages 7931–7953, (2018)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Multimedia Tools and Applications Aims and scope Submit manuscript

Detecting median filtering via two-dimensional AR models of multiple filtered residuals

Download PDF

Jianquan Yang^1,2,
Honglei Ren¹,
Guopu Zhu ORCID: orcid.org/0000-0001-7956-5343^1,3,
Jiwu Huang⁴ &
…
Yun-Qing Shi⁵

739 Accesses
39 Citations
Explore all metrics

Abstract

Median filtering, being an order statistic filtering, has been widely used in image denoising and recently also in image anti-forensics and anti-steganalysis. In the past few years, several methods have been developed for median filtering detection. However, it is still a challenging task to detect median filtering in JPEG compressed images. In this paper, we propose a novel method to solve this challenging task. We first generate median filtered residual (MFR), average filtered residual (AFR) and Gaussian filtered residual (GFR) by calculating the differences between an original image and its filtered images. Then, we propose to use two-dimensional autoregressive (2D-AR) model to characterize MFR, AFR and GFR separately, and further combine the 2D-AR coefficients of these three residuals into a set of features. Finally, the extracted feature set is fed into a support vector machine classifier for training and detection. Extensive experiments have demonstrated that compared with existing methods, the proposed one can achieve a considerable improvement in detecting median filtering in heavily compressed images.

Robust median filtering detection based on the difference of frequency residuals

Article 13 November 2018

A forensic algorithm against median filtering based on coefficients of image blocks in frequency domain

Article 29 January 2018

Blind Forensics of Median Filtering Based on Markov Statistics in Median-Filtered Residual Domain

Discover the latest articles, news and stories from top researchers in related subjects.

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Nowadays, digital images are fairly prevalent on the Internet. However, among these images, there are often forged ones, which might result in harming individual and public interests, such as, injuring someone’s reputation, and misleading public opinion. Therefore, as an effective technique to unveil the forgeries in digital images, image forensics has drawn increasing attention over the past more than ten years [11, 30, 33]. So far, extensive studies have been conducted in the field of image forensics, including the detection of median filtering [4, 23, 37], the estimation of JPEG compression history [9, 14, 38], the identification of source camera [18, 26], the localization of tampered regions in forgery images [7, 25, 31], and so on.

This paper focuses on the study of median filtering detection. Median filtering is an order statistic operation. Because of its capability of removing image noise meanwhile preserving image detail [17], median filtering has been widely studied and applied in image denoising. In recent years, much attention has also been paid to the detection of median filtering. This is mainly due to the following reasons. First, one goal of image forensics is to detect the use of image editing and retouching in digital photos [21, 33]. Since median filtering is one of the basic tools for image editing and retouching, there is a need to detect median filtering applied to images. Second, as a nonlinear image operation, median filtering can effectively conceal the artifacts caused by some other operations, but without much further deterioration of image quality. Therefore, median filtering has been employed as a tool to attack image forensics, such as to hide the trace of image resampling [22] or image compression [35]. In addition, it could be used to degrade the performance of steganalysis methods [19, 23]. Median filtering detection is a simple but effective way to counter these anti-forensics and anti-steganalysis. Third, in the scenario of image tampering, it may happen that a portion of an image is pasted on another image to create a tampered image. If the two original images are filtered in different ways, i.e., one is smoothed with median filter, while the other is not, then the tampered region can be localized by a block-by-block detection of median filtering in the tampered image, as done in [4, 19, 37]. That is to say, median filtering detection is also useful to expose image tampering.

Several studies have been conducted on the detection of median filtering. In [23], Kirchner and Fridrich proposed a feature to detect median filtering by measuring streaking artifacts^{Footnote 1} in digital images. They found that the number of pixels whose first-order differences are zero, denoted as h ₀, is likely to increase relative to that of pixels whose first-order differences are one, denoted as h ₁. Hence, they introduced the ratio h ₀/h ₁ as a detection feature. Experimental results have shown that the streaking-based feature is effective for uncompressed images, but not robust to JPEG compression. In order to reliably detect median filtering in JPEG compressed images, the authors of [23] also proposed to apply SPAM, a set of steganalysis features proposed in [29], for median filtering detection. It is shown that the application of SPAM can achieve better performance in detecting median filtering in JPEG compressed images. Almost simultaneous to the work reported in [23], Cao et al. [5] independently proposed a detection feature similar to the above streaking-based feature. A year later, Yuan [37] made several interesting observations on median filtered images. These are, firstly, the pixels in a small block tend to have the same value as the median of the block; secondly, the gray value of the center pixel of a block occurs more frequently in the block than before filtering; and thirdly, median filtering makes the quantity of gray values in a small block decrease. Based on these observations, Yuan proposed a set of features, named as median filtering forensics features (MFF), including the distribution of the block median, the occurrence of the block-center gray value, the quantity of gray levels in a block, and so on. Experiments have shown that MFF can improve the performance of detecting median filtering in low-quality images. Recently, Chen et al. [4] proposed two sets of new features extracted from the difference domain of images. The first set of features, namely global probability feature set, was derived from the cumulative histograms of the first- and second-order difference images; and the second set of features, namely local correlation feature set, was computed based on the correlation between adjacent pixels in the difference images. They combined these two sets of features as the final global and local feature set (GLF). According to the experiments reported in [4], GLF can achieve good performance in detecting median filtering in low resolution and highly compressed images. Later, Kang et al. [19] found that median-filtered residuals (MFR) have a higher capability of suppressing both image textures and block artifacts than difference images. Then, they made a contribution to median filtering detection by extracting features from MFR. This is quite different from the earlier works, which usually extract features from the difference domain of images. Another contribution of [19] is to apply one-dimensional autoregressive (1D-AR) model to describe the difference of MFRs between median-filtered and non-filtered images. More recently, Zhang et al. [39] proposed to use high-order local ternary pattern (LTP), which is an improved local binary pattern, for median filtering detection. Their results have shown that LTP is more robust to JPEG compression than SPAM and MFF. Lai et al. [24] proposed to detect the presence of median filtering based on the local binary patterns in both the spatial and DCT domains. Qiu et al. [32] reported that spatial rich model (SRM) [13], originally proposed for universal steganalysis, can effectively detect median filtering when the test images have a large size (in their tests, the size of the images varies from 480 × 640 to 4288 × 4752) and are not compressed. However, the dimension of SRM features is too high. Different from the above-mentioned methods that use handcrafted features for detection, Chen et al. [3] are the first to introduce convolutional neural network (CNN) to the field of median filtering detection. Their proposed CNN-based method can learn discriminative features automatically from image samples, and outperforms the MFF [37], GLF [4], and 1D-AR [19] methods in detecting median filtering in 32 × 32 and 64 × 64 images. From an interesting point of view, Pasquini et al. [28] performed a theoretical analysis on the deterministic property of median filter, and then proposed a novel approach to detect the presence of median filtering in 1D signal, which can guarantee 0% false negative in theory. It is also worth to note that a few studies [6, 10, 12, 36, 40] have been done on anti-forensics and countering anti-forensics of median filtering.

In this paper, inspired by the work of Kang et al. [19], we propose a novel method for median filtering detection. We regard not only the median filtered residual (MFR) but also the average filtered residual (AFR) and Gaussian filtered residual (GFR) as forensic fingerprints. Considering that each of the filtered residuals should be a two-dimensional stationary random process, we propose to use 2D-AR model instead of 1D-AR model to characterize the three kinds of the filtered residuals separately. The AR coefficients of the three filtered residuals of an image are then used together as features to distinguish between median filtered and non-median filtered images. Our experimental results indicate that the proposed method outperforms existing methods in detecting median filtering in JPEG compressed images.

The rest of the paper is organized as follows. In Section 2, we present the proposed detection method in detail. Database and experimental methodology are described in Section 3. Then, in Section 4, experimental results are reported and discussed. Finally, conclusions are summarized in Section 5.

2 Proposed detection method

The proposed detection method is a learning-based method. It can be divided into two parts: the feature extraction, and the classifier training and testing. The second part is very similar to that applied in previous works. Therefore, in this section we focus on the description of the feature extraction. Before describing it, we first briefly introduce the median filtering applied to images.

2.1 Median filtering

Median filtering is an order statistic operation, which is implemented by replacing the gray value of each pixel in an image with the median of its neighboring pixels. Given an image I to be filtered, the process of median filtering can be formulated as

$$\begin{array}{@{}rcl@{}} \hat{\mathbf{I}}(i,j) &=& \text{median }\{\mathbf{I}(i+r,j+s), \; (r,s)\in W \},\\ && \text{for}\quad (i,j)\in(1,2,\ldots,M)\times(1,2,\ldots,N), \end{array} $$

(1)

where I(i,j) denotes the pixel value of the image I at coordinate (i,j), W denotes the set of coordinates within a square window centered at (0,0), and M and N denote the image height and width, respectively. Since 3 × 3 and 5 × 5 windows are the most widely used for median filtering, this paper concentrates on the detection of 3 × 3 and 5 × 5 median filterings, as done in previous works [4, 19, 23, 37].

2.2 Feature extraction

In [19], Kang et al. proposed to extract features only from the median filtered residual (MFR). More generally, the proposed method extracts features from multiple filtered residuals, which are defined as

$$ \begin{array}{rcl} \mathbf{R}_{t}(i,j) = \mathbf{I}(i,j)-\hat{\mathbf{I}}_{t}(i,j), \end{array} $$

(2)

where I and $\hat {\mathbf {I}}_{t}$ denote the original and filtered images, respectively, and the subscript t denotes the index of the filtered image. As a specific example, in this paper we let t ∈ {“m”, “a”, “g”}, which represent median, average and Gaussian filterings, respectively. The window size of 3 × 3 can achieve a low computational complexity, and was adopted in [19]. For simplicity and a fair comparison with [19], in this paper the window size of the three kinds of filterings is also set as 3 × 3, and the mean and variance of the Gaussian filtering are set as 0 and 0.5, respectively. Therefore, as described above, not only the MFR used in [19], but also the average filtered residual (AFR) and the Gaussian filtered residual (GFR) are applied as the forensic fingerprints of median filtering. Note that compared with the first- and second-order differences of images, the filtered residuals have better capability of removing the interference caused by image edges and textures. Besides, the three kinds of filtered residuals are different from each other, thus it is possible to extract different distinguishing features from different kinds of filtered residuals.

In [19], 1D-AR model was utilized to fit the statistical properties of MFR in the horizontal and vertical directions. However, a digital image is a 2D signal, and median filtering is usually performed by sliding a square filtering window over the entire image. This results in that each pixel in a filtered image has strong correlations with its neighboring pixels in all directions. Therefore, 1D-AR model is not good enough to fit the statistical properties of MFR. In this paper, we consider each of the filtered residuals as a 2D stationary random process. Based on this, we propose to apply 2D-AR model to characterize the filtered residuals, and to extract the coefficients of the 2D-AR model as our features.

The 2D-AR model A R(p,q) of each filtered residual R _t (t ∈ {“m”, “a”, “g”}) can be given by

$$ \begin{array}{ll} \mathbf{R}_{t}(i,j) = & \displaystyle \underset{k+l\neq0}{\sum\limits_{k=0}^{p}\sum\limits_{l=0}^{q}} \mathbf{a}_{t}(k,l)\mathbf{R}_{t}(i-k,j-l)+\boldsymbol{\varepsilon}_{t}(i,j), \\ & \text{for}\quad (i,j)\in(1,2,\ldots,M)\times(1,2,\ldots,N), \end{array} $$

(3)

where a _t and ε _t denote the AR coefficient and prediction error matrices of R _t, respectively, p and q ($p, q\in \mathbb {N}^{+}$) denote the orders of the 2D-AR model in the row and column directions, respectively, (k,l) indexes the elements of the matrix a _t of size (p + 1) × (q + 1), and a _t(0,0) is specified as 1. Note that owing to the row and column symmetry of 3 × 3 and 5 × 5 median filterings, in this paper we use the AR model A R(p,q) with p = q to characterize the filtered residuals.

It is difficult to estimate the AR coefficient matrix a _t directly from (3). But fortunately, a _t can be obtained via the correlation method introduced by Kashyap [20]. According to the correlation method, the estimation of a _t is reduced to solving the following linear equation sets

$$ \begin{array}{ll} \boldsymbol{\rho}_{t}(m,n)= & \displaystyle \underset{k+l\neq0}{\sum\limits_{k=0}^{p}\sum\limits_{l=0}^{q}} \mathbf{a}_{t}(k,l)\boldsymbol{\rho}_{t}(m-k,n-l), \\ & \text{for}\quad (m,n)\in(0,1,\ldots,p)\times(0,1,\ldots,q),\; m+n\neq0, \end{array} $$

(4)

where ρ _t denotes the autocorrelation matrix of each filtered residual, which is calculated by

$$ \begin{array}{rcl} \boldsymbol{\rho}_{t}(m,n) = \left\{ \begin{array}{ll} \frac{\displaystyle\sum\limits_{i=1}^{M-m}\sum\limits_{j=1}^{N-n}\mathbf{R}_{t}(i,j)\mathbf{R}_{t}(i+m,j+n)} {\displaystyle \left( M-m \right)\times \left( N-n\right)} & \text{if} \quad m \ge 0,n\ge 0,\\ &\\ \frac{\displaystyle\sum\limits_{i=1}^{M-m}\sum\limits_{j=1-n}^{N}\mathbf{R}_{t}(i,j)\mathbf{R}_{t}(i+m,j+n)} {\displaystyle \left( M-m \right)\times \left( N-n\right)} & \text{if} \quad m > 0,n< 0,\\ \end{array} \right. \end{array} $$

(5)

and ρ _t(−m,−n) = ρ _t(m,n).

Next, we investigate the statistical difference of the AR coefficients a _t(k,l) among MFR, AFR and GFR. Figure 1 shows the element values (except for $\bar {\mathbf {a}}_{t}(0,0)$) of the average matrices $\bar {\mathbf {a}}_{t}$ of MFR, AFR and GFR. Each $\bar {\mathbf {a}}_{t}$ was averaged over the matrices a _t of 6000 images, and in Fig. 1 was rearranged into a vector by scanning it in zigzag order. Note that considering the trade-off between feature dimension and discriminability, this paper uses the A R(p,q) model with p = q = 6 to calculate the AR coefficents a _t(k,l). It is shown in Fig. 1 that the average AR coefficients $\bar {\mathbf {a}}_{t}(k,l)$ among MFR, AFR and GFR are obviously different, whether for the original non-filtered images or for the median filtered images. Moreover, by comparing the plots in Fig. 1a with those in Fig. 1b and c, we can see that not only for MFR but also for AFR and GFR, the average AR coefficients of the non-filtered images take quite different values from those of the median filtered images. The above findings indicate that the AR coefficients have different statistical characteristics among the three different kinds of filtered residuals, and the AR coefficients of any kind of the filtered residuals have the ability to distinguish the median filtered from the non-filtered images. Hence, it is reasonable to combine the three sets of the AR coefficients extracted from MFR, AFR and GFR as distinguishing features. Figure 1 also shows that the AR coefficient values of both the median filtered and non-filtered images almost vanish when their indices are larger than about 30. This means the AR coefficients a _t(k,l) have almost no distinguishability when k + l > 6. Therefore, only the AR coefficients a _t(k,l) with k + l ≤ 6 and k + l ≠ 0 (i.e., except for a _t(0,0)) are chosen as features.

The 2D-AR coefficients of the MFR a _m(k,l), for k + l ≤ 6 and k + l ≠ 0, can be divided into three parts: the horizontal coefficients, the vertical coefficients and the rest coefficients (the fifteen coefficients labeled in blue-filled circles as shown in Fig. 2). The horizontal and vertical 2D-AR coefficients of the MFR are closely correlated to the 1D-AR coefficients of the MFR [22], whereas the rest 2D-AR coefficients of the MFR are the additional coefficients that 1D-AR cannot model. In order to validate the effectiveness of these additional 2D-AR coefficients, we visualize them by projecting them onto a 2D plane with linear discriminant analysis (LDA), as shown in Fig. 3. It is seen that the additional 2D-AR coefficients of the original non-filtered images, the 3 × 3 median filtered images, and the 5 × 5 median filtered images are clearly distinguishable from each other, indicating that the 2D-AR model is more suitable to describe the statistical properties of the filtered residuals than the 1D-AR model of [19].

2.3 Summary of the proposed method

Based on the above description, we summarize the proposed method as follows.

1.
Generate the MFR, AFR and GFR for each input image according to (2).
2.
Calculate the three AR coefficient matrixes a _t (t ∈ {“m”, “a”, “g”}) for the MFR, AFR and GFR, respectively, according to (4) and (5).
3.
For each of the three AR coefficient matrices a _t (t ∈ {“m”, “a”, “g”}), choose the elements a _t(k,l) with k + l ≤ 6 and k + l ≠ 0 as a set of features. Then the three sets of features are combined into the proposed feature set.
4.
Train SVM classifier with the proposed feature sets extracted from median filtered and non-median filtered images. Finally, the trained classifier is used for median filtering detection.

Note that, as mentioned above, the order of the A R(p,q) model is set as (p,q) = (6,6) in this paper, thus there are 27 features extracted from each a _t, and totally 81 features in the proposed feature set.

The time complexity of the proposed method can be considered from two aspects, namely the feature extraction and the classifier training. The feature extraction of the proposed method consists of two parts: the generation of the filtered residuals and the computation of 2D-AR coefficients. The time complexities of the two parts are dominated by image filtering and autocorrelation computation, respectively. Both image filtering and autocorrelation computation can be implemented to run efficiently. Therefore, the feature extraction of the proposed method is not very time-consuming. It is known that the time complexity of SVM classifier training is related to the dimension of features. The dimension of the proposed features is not large compared with existing features. Therefore, the classifier training of the proposed method is also not time-consuming.

3 Database and experimental methodology

In order to evaluate the performance of the proposed method, we build a database of 6000 images. These images are taken from three different image databases, namely NRCS [27], BOWS2 [15] and BOSS [1], each of which provides 2000 images. Each image in our database is centrally cropped to a size of 512 × 512 pixels, and then converted to an 8-bit grayscale image. 3 × 3 and 5 × 5 median filterings are performed on each of the cropped grayscale images to generate 2 × 6000 median-filtered images. For testing the robustness of the proposed method, the filtered and non-filtered images are further central-cropped into sizes of 256 × 256, 128 × 128, and 64 × 64, and all of the cropped grayscale images are then compressed with quality factors (QFs) 90, 70, 50, and 30, respectively.

Five existing methods, including SPAM [23], MFF [37], GLF [4], 1D-AR [19], and LTP [39], are implemented for comparison. All these compared methods applied the SVM classifier for training and detection. In order to fairly compare our proposed feature set with the above five ones, the SVM classifier is also chosen in this paper, as done in [4, 19, 23, 37, 39]. 40% of the images are used for training, and the remaining 60% are used for testing. A five-fold cross validation is performed in the training to search the two hyper-parameters of the SVM classifier over the following grid:

$$ \begin{array}{rcl} (\textit{C},\gamma)\in\left\{(2^{i},2^{j})|i\in\{-5,-4,\cdots,4\},j\in\{-13,-12,\cdots,13\}\right\}. \end{array} $$

(6)

Note that the feature dimensions of SPAM, MFF, GLF, 1D-AR, LTP, and the proposed method are 686, 44, 56, 10, 2048 and 81, respectively. The time consumption of SVM classifier training is related to feature dimension. 1D-AR has only 10 features, which makes the training of its SVM classifier very fast. MFF, GLF, and the proposed method have less than 100 features, and the trainings of their SVM classifiers can be performed in an acceptable time. Compared to the other four methods, SPAM and LTP have a large number of features, so their classifier trainings are quite time-consuming.

The performance of each classifier is mainly evaluated by the minimum average decision error under equal probability of the positive and negative samples (i.e., the median-filtered and non-median filtered images), which is expressed by

$$ P_{e}=\min\left( \frac{P_{fp}+1-P_{tp}}{2}\right), $$

(7)

where P _{f
p} and P _{t
p} denote the false positive and true positive rates, respectively. Besides, in some cases we also plot receiver operating characteristic (ROC) curves to illustrate the tradeoff between the false positive and true positive rates.

4 Experimental results

Extensive experiments have been carried out to investigate the detection performance of the proposed method on the above-described database.

4.1 Performance evaluation of different sets of the proposed features

As described in Section II, the proposed 2D-AR features consist of three sets of features extracted from MFR, AFR and GFR, respectively. In this subsection, we present experiments to evaluate the performance of each of the three sets of features, as well as the superiority of the combination of the three sets over either set alone. The experiments are conducted on the images with various image sizes and different QFs of JPEG compression. Table 1 lists the detection results obtained by using the three sets of features from MFR, AFR, and GFR (denoted as 2D-AR-MFR, 2D-AR-AFR, and 2D-AR-GFR, respectively) and the combination of the three feature sets (denoted as 2D-AR-Comb). The detection results of the 1D-AR features [19] are also reported for comparison. In Table 1 and the following tables, “MF3 (or MF5) vs ORI” denotes that the positive and negative samples used for the tests are the 3 × 3 (or 5 × 5) median-filtered images and the original non-filtered images, respectively, and “UnC” denotes that the test images are not compressed. It can be seen from Table 1 that all of the three sets of the proposed features, i.e., 2D-AR-MFR, 2D-AR-AFR, and 2D-AR-GFR, have the capability to detect median filtering, especially when the test images have a large size and are not (or slightly) compressed. Moreover, 2D-AR-Comb is superior to any of the three sets of the proposed features in all the tests. This indicates that not only the features extracted from MFR, but also those from AFR and GFR contribute to the superior performance of the proposed feature set. It is also shown in Table 1 that 2D-AR-MFR outperforms 1D-AR in all the tests, which validates that 2D-AR model is more suitable than 1D-AR model to fit the statistical property of the filtered residuals.

Table 1 Detection results of the three sets and the whole of the proposed features

Full size table

4.2 Performance comparison with existing methods

A set of experiments have been performed to compare the proposed method with existing methods, including SPAM [23], MFF [37], GLF [4], 1D-AR [19], and LTP [39]. The comparative results are given in Table 2. We can see that the six compared methods generally perform better in “MF5 vs ORI” than in “MF3 vs ORI”. This is not surprising since 5 × 5 median filtering leaves more significant filtering artifacts and thus is more easily detected than 3 × 3 median filtering. We can also see that, in general, the performance of these detection methods becomes worse as the image size decreases. The same happens as the QF of the JPEG compression goes down. These are consistent with the following two facts: first, the smaller the image size is, the less the information can be used for feature extraction; and second, the compression can disturb the artifacts left by the median filtering, and the more severe the compression is, the larger the disturbance becomes. It is also shown in Table 2 that for the uncompressed images, all of the six compared methods perform quite well in the median filtering detections. LTP [39] and SPAM [23] are clearly superior to the other four compared methods, but have much larger dimensions of features, especially the dimension of the LTP features is more than 2000. The proposed 2D-AR-Comb method has a detection error of less than 1.00% when the size of the test images is larger than 64 × 64. Whereas, for the compressed images, most of the compared methods no longer perform very well, particularly when the image size is small and “MF3 vs ORI” is tested. In contrast to the other compared methods, the proposed method is relatively robust to JPEG compression. 2D-AR-Comb achieves the best performance among the six methods in all cases that the QF of the compression is not larger than 70, and also in some cases when the QF takes 90. The superiority of 2D-AR-Comb over the other five methods is evident when the test images are heavily compressed. For example, when the image size and the QF are 512 × 512 and 50, respectively, and “MF3 vs ORI” is tested, 2D-AR-Comb has a detection error of 3.86%, which is 37.5% lower than that of 1D-AR (i.e., the best result of the other five methods). The ROC curves of some representative tests are also provided in Fig. 4. It can be seen that these results represented by the ROC curves are consistent with those reported in Table 2.

Table 2 Comparison between the proposed and five existing methods in distinguishing the median filtered images from the original non-filtered images

Full size table

4.3 Discrimination between median filtering and other manipulations

For median filtering detection, a good classifier should have the ability to distinguish the median filtered images not only from original images but also from those processed by some other image manipulations, such as, Gaussian filtering, average filtering, and image resizing. In order to assess this ability of the proposed method, a set of experiments have also been conducted, including “MF vs AVE”, “MF vs GAU”, “MF vs D-SCA” and “MF vs U-SCA”. Note that “MF vs AVE (or GAU, D-SCA, and U-SCA)” denotes that the positive and negative samples are the median filtered images and those processed by an average filtering with a window size of 3 × 3 (or a Gaussian filtering with a standard deviation of 0.5, a downscaling with a factor of 0.9, and an upscaling with a factor of 1.1). In practice, we may not know the kind of manipulation to be distinguished from median filtering, as discussed in [19]. To simulate a more practical scenario, we created a set of negative samples, denoted as “ALL”, which consists of equal amounts of the original images, the Gaussian filtered images, the average filtered images, the downscaled images, and the upscaled images. Then the experiments of “MF vs ALL” were also conducted. For brevity, we only report the results for the case that the test images are compressed with a QF value of 70. It is interesting to note from Table 3 that GLF [4] performs quite well (it obtains the second best performance in general) in 3 × 3 median filtering detection, but has no superiority over most of the compared methods in 5 × 5 median filtering detection. Note that here 3 × 3 (or 5 × 5) median filtering detection refers to the classification between 3 × 3 (or 5 × 5) median filtering and other manipulations including Gaussian filtering, average filtering and image resizing. In contrast to GLF, the proposed 2D-AR-Comb method always achieves the best results among the six compared methods in all the tests. Moreover, the improvement of the proposed method is often significant when compared with SPAM [23], MFF [37], 1D-AR [19], and LTP [39] in detecting 3 × 3 median filtering. The good performance of the proposed method can also be seen from Fig. 5, where the ROC curves of some representative tests are reported.

Table 3 Comparison between the proposed and five existing methods in distinguishing between median filtering and other manipulations

Full size table

4.4 Discrimination between 3 × 3 and 5 × 5 Median filterings

To identify the parameter of an image manipulation is another important issue for image forensics. Hence, we have conducted a set of experiments to examine the ability of the proposed method in identifying the window size of median filtering (i.e., in distinguishing between 3 × 3 and 5 × 5 median filterings). In this set of experiments, the positive and negative samples are the 3 × 3 and 5 × 5 median filtered images, respectively, and all of the filtered images are compressed with QF = 70,50. The detection results are presented in Table 4 and Fig. 6. It is shown that the proposed method performs very well in these tests, and still achieves the lowest detection error among the six compared methods.

Table 4 Comparison between the proposed and five existing methods in distinguishing between 3 × 3 and 5 × 5 median filterings

Full size table

4.5 Performance comparison on more image databases

We have also conducted a set of experiments to compare the proposed method with the existing methods on three more image databases, namely, UCID [34], RAISE2k [8], and DRESDEN [16]. The UCID database has 1338 images with size of 384 × 512; the RAISE2k database has 2000 images with size of 4928 × 3264 or 4288 × 2848; and the DRESDEN database has 1388 images with size of 3872 × 2592 or 3008 × 2000. We centrally cropped each images of these databases into the size of 256 × 256. Then, the cropped images were used as the test images. The detection error rates of the six compared methods are listed in Table 5. It is shown that the compared methods perform better on the UCID database than on the RAISE2k and the DRESDEN databases. This may be explained as follows. The images from UCID database have a much smaller size than those from the RAISE2k and the DRESDEN databases, which results in that the 3 × 3 (or 5 × 5) local neighbourhoods of the UCID images usually appear less smooth than those of the RAISE2k and the DRESDEN images. Therefore, after median filtering, the filtering artifacts left on UCID images are usually more obvious than those on the RAISE2k and the DRESDEN images, hence the median filtering is more easily detected in the UCID images than in the RAISE2k and the DRESDEN images. In addition, Table 5 also shows that the proposed method still outperforms the other compared methods in most of the test cases.

Table 5 Performance comparison between the proposed and five existing methods on UCID, RAISE2k, and DRESDEN databases

Full size table

4.6 Image forgery detection via median filtering detection

Finally, we present an experiment to illustrate the application of median filtering detection in image forgery detection. One scenario of image forgery can be described as follows: Given two images, one of which is a median filtered image and the other is a non-filtered image, as shown in Fig. 7a and b, respectively. A forged image is generated by pasting a region cut from the median filtered image into the non-filtered image and then compressing it with QF=70, as shown in Fig. 7c. To identify the cut-and-paste region in Fig. 7c, we first divide the forged image in Fig. 7c into blocks of size 64 × 64. Then, the SVM classifiers that have been trained in Subsection IV-B are used to make binary decision on whether each of the blocks is a median filtered block or not. According to the outputs of the classifiers for all of the blocks, we generate six binary maps for the six compared methods, respectively, as shown in Fig. 8. For reference, the ground truth of the forged region is already given in Fig. 7d. The miss detection rates (i.e., the percentage of wrongly classifying median filtered blocks as non-median filtered ones) of SPAM, MFF, GLF, 1D-AR, LTP, and the proposed method are 26.92%,14.93%,31.42%,24.53%,25.25%, and 22.81%, respectively. The false alarm rates (i.e., the percentage of wrongly classifying non-median filtered blocks as median filtered ones) of the six methods are 14.07%,17.29%,8.12%,8.72%,9.92%, and 6.95%, respectively. On the whole, the error rates of the six methods are 20.49%,16.11%,19.77%,16.63%,17.58%, and 14.88%, respectively. It is shown that among the six compared methods, the proposed method has the second lowest miss detection rate and the lowest false alarm rate, which leads to the lowest error rate in detecting the image forgery. Note that the detection errors mainly occur at the top left corner of the forged image for all the six compared methods. The reason might be that the top left corner of the forged image is a low-contrast region so that the statistical traces left by median filtering are too weak to be detected.

5 Conclusion

Inspired by the work of Kang et al. [19], in this paper we have proposed a novel method for median filtering detection. The main contributions of this paper are as follows:

First, considering that the filtered residuals are two-dimensional stationary random processes, we have proposed to apply 2D autoregressive model instead of 1D-AR model to characterize the filtered residuals. According to our experiments, 2D-AR model is more effective than 1D-AR model to fit the filtered residuals.

Second, we found that, in addition to median filtered residual (MFR), average filtered residual (AFR) and Gaussian filtered residual (GFR) also exhibit different characteristics between median filtered images and non-median filtered images. Hence, we use not only MFR but also AFR and GFR as fingerprints for median filtering detection. Our experiments have validated that the use of more kinds of filtered residuals can evidently improve the detection performance.

Extensive experiments have been conducted to compare the proposed method with five existing methods. The results show that the proposed method performs quite well in all the tests. It is often superior to the existing methods in detecting median filtering in heavily compressed images. Although only grayscale natural images have been tested in out experiments, other types of images, such as medical images, should be modified in a similar way after median filtering, and hence leave similar statistical traces. With appropriate modification, the proposed method should be applicable to other types of images. Our future work includes investigating the discriminability of other residuals to detect median filtering, and countering the anti-forensics of median filtering detection. Moreover, it is feasible to combine the proposed feature set with other existing feature sets, such as the 1D-AR feature set [19], to achieve improved performance, which could also be an interesting future work.

Notes

¹ Streaking artifacts, first analyzed by Bovik [2], refer to the phenomenon that the probability of two adjacent pixels being the same increases greatly after median filtering, which can thus be evaluated by using first-order difference images.

References

Bas P, Filler T, Pevný T (2011) Break our steganographic system: the ins and outs of organizing BOSS Information hiding. Springer, Berlin, pp 59–70
Chapter Google Scholar
Bovik AC (1987) Streaking in median filtered images. IEEE Trans Acoust Speech Signal Process 35(4):493–503
Article MATH Google Scholar
Chen J, Kang X, Liu Y, Wang ZJ (2015) Median filtering forensics based on convolutional neural networks. IEEE Signal Process Lett 22(11):1849–1853
Article Google Scholar
Chen C, Ni J, Huang J (2013) Blind detection of median filtering in digital images: a difference domain based approach. IEEE Trans Image Process 22 (12):4699–4710
Article MathSciNet MATH Google Scholar
Cao G, Zhao Y, Ni R, Yu L, Tian H (2010) Forensic detection of median filtering in digital images Proceedings IEEE Int. Conf. Multimedia and Expo, pp 89–94
Google Scholar
Dang-Nguyen D-T, Gebru ID, Conotter V, Boato G, De Natale FGB (2013) Counter-forensics of median filtering Proceedings IEEE Int. Workshop on Multimedia Signal Processing, pp 260–265
Google Scholar
Dirik AE, Memon N (2009) Image tamper detection based on demosaicing artifacts Proceedings IEEE Int. Conf. Image Processing, pp 1497–1500
Google Scholar
Dang-Nguyen D-T, Pasquini C, Conotter V, Boato G (2015) RAISE: a raw images dataset for digital image forensics Proceedings 6th ACM Multimedia Systems Conference, pp 219–224
Google Scholar
Fan Z, de Queiroz RL (2003) Identification of bitmap compression history: JPEG detection and quantizer estimation. IEEE Trans Image Process 12(2):230–235
Fan W, Wang K, Cayre F, Xiong Z (2015) Median filtered image quality enhancement and anti-forensics via variational deconvolution. IEEE Trans Inf Forensics Secur 10(5):1076–1091
Article Google Scholar
Farid H (2009) Image forgery detection. IEEE Signal Proc Mag 26:16–25
Article Google Scholar
Fontani M, Barni M (2012) Image tamper detection based on demosaicing artifacts Proceedings 20th European Signal Processing Conference, pp 1239–1243
Google Scholar
Fridrich J, Kodovsky J (2012) Rich models for steganalysis of digital images. IEEE Trans Inf Forensics Secur 7(3):868–882
Article Google Scholar
Fu D, Shi YQ, Su W (2007) A generalized Benford’s law for JPEG coefficients and its applications in image forensics Proceedings SPIE, Electronic Imaging, Security and Watermarking of Multimedia Contents IX, vol 6505, pp 1L1–1L11
Furon T, Bas P (2008) Broken arrows. EURASIP J Inf Secur 2008, p 3
Gloe T, Böhme R The Dresden Image Database for benchmarking digital image forensics Proceedings 25th Symposium On Applied Computing (ACM SAC 2010), vol 2, pp 1585–1591
Gonzalez RC, Woods RE (2002) Digital image processing. Upper Saddle River NJ: Prentice-hall
Kang X, Li Y, Qu Z, Huang J (2012) Enhancing source camera identification performance with a camera reference phase sensor pattern noise. IEEE Trans Inf Forensics Secur 7(2):393–402
Article Google Scholar
Kang X, Stamm MC, Peng A, Liu KJR (2013) Robust median filtering forensics using an autoregressive model. IEEE Trans Inf Forensics Secur 8(9):1456–1468
Article Google Scholar
Kashyap RL (1984) Characterization and estimation of two-dimensional arma models. IEEE Trans Inf Theory 30(25):736–745
Article MathSciNet MATH Google Scholar
Kee E, Farid H (2011) A perceptual metric for photo retouching. Proc Natl Acad Sci 108(50):19907–19912
Article Google Scholar
Kirchner M, Böhme R (2008) Hiding traces of resampling in digital images. IEEE Trans Inf Forensics Secur 3(4):582–592
Article Google Scholar
Kirchner M, Fridrich J (2010) On detection of median filtering in digital images Proceedings SPIE, Electronic Imaging, Media Forensics and Security II, vol 7541, pp 1–12
Lai Y-N, Gao T-G, Li J-X, Sheng G-R (2015) Forensic detection of median filtering in digital images using the coefficient-pair histogram of DCT value and LBP pattern Proceedings Int Conf Intelligent, pp 421–432
Google Scholar
Lin Z, He J, Tang X, Tang C-K (2009) Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern Recogn 42 (11):2492–2501
Article MATH Google Scholar
Lukas J, Fridrich J, Goljan M (2006) Digital camera identification from sensor pattern noise. IEEE Trans Inf Forensics Secur 1(2):205–214
Article Google Scholar
(2002). Natural resources conservation service photo gallery, http://photogallery.nrcs.usda.gov/, United States Department of Agriculture
Pasquini C, Boato G, Alajlan N, De Natale FGB (2016) A deterministic approach to detect median filtering in 1D data. IEEE Trans Inf Forensics Secur 11 (7):1425–1437
Article Google Scholar
Pevný T, Bas P, Fridrich J (2010) Steganalysis by subtractive pixel adjacency matrix. IEEE Trans Inf Forensics Secur 5(2):215–224
Article Google Scholar
Piva A (2013) An overview on image forensics, ISRN Signal Processing, vol. 2013
Popescu AC, Farid H (2004) Exposing digital forgeries by detecting duplicated image regions, Dept Comput Sci, Dartmouth College, Tech Rep TR2004-515
Qiu X, Li H, Luo W, Huang J (2014) A universal image forensic strategy based on steganalytic model Proceedings 2nd ACM Workshop on Information hiding and multimedia security, pp 165–170
Google Scholar
Rocha A, Scheirer W, Boult T, Goldenstein S (2011) Vision of the unseen: Current trends and challenges in digital image and video forensics. ACM Comput Surv 43(4):Article 26
Article Google Scholar
Schaefer G, Stich M (2004) UCID: An Uncompressed color image database Proceedings SPIE 5307, Storage and Retrieval Methods and Applications for Multimedia, vol 5307, pp 472–480
Stamm MC, Liu KJR (2011) Anti-forensics of digital image compression. IEEE Trans Inf Forensics Secur 6(3):1050–1065
Article Google Scholar
Wu Z-H, Stamm MC, Liu KJR (2013) Anti-forensics of median filtering Proceedings IEEE Int. Conf. acoustics, Speech, Signal Process, pp 3043–3047
Google Scholar
Yuan H-D (2011) Blind forensics of median filtering in digital images. IEEE Trans Inf Forensics Secur 6(4):1335–1345
Article Google Scholar
Yang J, Xie J, Zhu G, Kwong S, Shi YQ (2014) An effective method for detecting double JPEG compression with the same quantization matrix. IEEE Trans Inf Forensics Secur 9(11):1933–1942
Article Google Scholar
Zhang Y, Li S, Wang S, Shi YQ (2014) Revealing the traces of median filtering using high-order local ternary patterns. IEEE Signal Process Lett 21(3):275–279
Article Google Scholar
Zeng H, Qin T, Kang X, Liu L (2014) Countering anti-forensics of median filtering Proceedings IEEE Int Conf acoustics, Speech, Signal Process, pp 2704–2708
Google Scholar

Download references

Acknowledgments

The authors appreciate the anonymous reviewers for their constructive comments. This work was supported in part by the National Natural Science Foundation of China under Grant 61572489 and Grant 61672554, and in part by the Youth Innovation Promotion Association of CAS under Grant 2015299.

Author information

Authors and Affiliations

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, GD, 518055, China
Jianquan Yang, Honglei Ren & Guopu Zhu
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, GD, 518055, China
Jianquan Yang
State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Guopu Zhu
College of Information Engineering, Shenzhen University, Shenzhen, GD, 518055, China
Jiwu Huang
Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, 07102, USA
Yun-Qing Shi

Authors

Jianquan Yang
View author publications
You can also search for this author in PubMed Google Scholar
Honglei Ren
View author publications
You can also search for this author in PubMed Google Scholar
Guopu Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Jiwu Huang
View author publications
You can also search for this author in PubMed Google Scholar
Yun-Qing Shi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Guopu Zhu.

Additional information

Jianquan Yang and Honglei Ren contributed equally to this work

Rights and permissions

Reprints and permissions

About this article

Cite this article

Yang, J., Ren, H., Zhu, G. et al. Detecting median filtering via two-dimensional AR models of multiple filtered residuals. Multimed Tools Appl 77, 7931–7953 (2018). https://doi.org/10.1007/s11042-017-4691-0

Download citation

Received: 05 September 2016
Revised: 01 April 2017
Accepted: 04 April 2017
Published: 18 April 2017
Issue Date: April 2018
DOI: https://doi.org/10.1007/s11042-017-4691-0

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Detecting median filtering via two-dimensional AR models of multiple filtered residuals

Abstract

Similar content being viewed by others

Robust median filtering detection based on the difference of frequency residuals

A forensic algorithm against median filtering based on coefficients of image blocks in frequency domain

Blind Forensics of Median Filtering Based on Markov Statistics in Median-Filtered Residual Domain

1 Introduction