1 Introduction

With the rapid development of computer graphics (CG), rendering software such as 3D Max, Photoshop and Maya has emerged. With these tools, many highly photorealistic CG images are generated, and they have entered applications in politics, the economy and daily life. Unfortunately, some CG images are used as forgeries in academic research, journalism, and other fields. CG technology keeps improving, and CG images are becoming increasingly photorealistic. Thus, recognizing CG images among photographic (PG) images is becoming critical, because the human eye can hardly distinguish them from PG images. Fortunately, image forensics is a possible technology to solve this problem [4, 9, 11, 20].

Nowadays, many researchers focus on forensics for distinguishing PG from CG images [3, 5, 6, 10, 13-15]. Most existing schemes can be classified into two categories: transform-domain statistics schemes [3, 5, 6, 13, 15] and spatial-domain statistics schemes [10, 14]. This paper focuses on the former.

In [1], a statistical model built on the discrete wavelet transform (DWT) was proposed in 1999 to capture the relationship between subband coefficients and color channels. However, the first DWT statistical model designed for forensics was proposed in 2003: in [6], Farid et al. presented a statistical model, composed of original and prediction statistics, to capture intrinsic features for steganalysis. Then, in 2005, an improved scheme [13] was presented. These two schemes use the first four-order statistics as features, namely the mean, variance, skewness and kurtosis, computed from the first three levels of wavelet coefficients and the first two levels of prediction errors. In 2007, Chen et al. [3] proposed a DWT- and DFT-based forensics scheme, in which the histograms of the original DWT coefficients and of the predicted DWT values are transformed by the discrete Fourier transform (DFT), and the feature is extracted from the transformed histograms. In 2011, Özparlak et al. [15] proposed ridgelet and contourlet wavelet statistical models to capture image regularity on the basis of Farid's schemes; their scheme boosts forensics accuracy by using sequential floating forward selection (SFFS) and the support vector machine (SVM). A different contourlet wavelet model was proposed by Wang et al. [5], which uses a different prediction model for the coefficients and works on HSV channels instead of the RGB channels used in Farid's and Özparlak's schemes. However, due to the drawbacks of DWT and CWT (discussed in detail in Sections 2 and 3), the extracted features limit the forensics performance.

In this paper, we use the first four statistics extracted from the QWT domain as features for CG and PG classification. QWT has already been successfully applied to image processing, for example texture classification [8, 19], image fusion [16] and image coding [18]. In [19], the authors proposed the use of QWT for texture classification; experimental results show that combined features of the QWT magnitude and phase are superior to those of the standard wavelet transform. In [8], a new multi-level Copula model is constructed based on the magnitude-phase dependence of QWT, which achieves higher texture classification performance than a DWT Copula model. Pang et al. [16] used QWT to address the blur problem of multifocus color images; their scheme outperforms existing fusion methods, including DWT-based schemes. In [18], QWT, and in particular its phase, is applied to image coding; at equal bit rates, this scheme achieves better visual quality than DWT. In summary, QWT offers better performance than DWT in several applications.

The purpose of this paper is to help readers understand wavelet-based features for forensics and, furthermore, to achieve the best recognition accuracy. The main contributions of this paper are as follows.

  1) QWT, as a new member of the wavelet family, is applied to forensics;

  2) The use of DWT and CWT is also investigated and compared with QWT.

The rest of this paper is arranged as follows. Related work is introduced in Sections 2 and 3. In Section 4, the quaternion and the QWT are introduced, and the forensics scheme based on QWT is proposed. In Section 5, comparative experimental results are given to show the performance of the proposed QWT-based scheme and its superiority over the DWT-based and CWT-based schemes. Finally, conclusions are drawn and future work is outlined in Section 6.

2 Farid’s scheme

In [13], Farid proposed a linear predictor scheme that predicts a wavelet coefficient from other wavelet coefficients in the same-scale subband, the coarser-scale subband and the same-scale subbands of the other two color channels. The prediction error between the original wavelet coefficient and the predicted coefficient is then computed, and the first four-order statistics of the prediction errors are used as features. Taking the R channel as an example, the predictor is expressed as follows.

$$\begin{array}{@{}rcl@{}} |{D_{i}^{R}}(x,y)|&=&w_{1}|{D_{i}^{R}}(x-1,y)|+w_{2}|{D_{i}^{R}}(x+1,y)|\\ &&+w_{3}|{D_{i}^{R}}(x,y-1)|+w_{4}|{D_{i}^{R}}(x,y+1)|\\ &&+w_{5}|D_{i+1}^{R}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)|+w_{6}|{H_{i}^{R}}(x,y)|\\ &&+w_{7}|{V_{i}^{R}}(x,y)|+w_{8}|{D_{i}^{G}}(x,y)|\\ &&+w_{9}|{D_{i}^{B}}(x,y)| \end{array} $$
$$\begin{array}{@{}rcl@{}} |{V_{i}^{R}}(x,y)|&=&w_{1}|{V_{i}^{R}}(x-1,y)|+w_{2}|{V_{i}^{R}}(x+1,y)|\\ &&+w_{3}|{V_{i}^{R}}(x,y-1)|+w_{4}|{V_{i}^{R}}(x,y+1)|\\ &&+w_{5}|V_{i+1}^{R}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)|+w_{6}|{D_{i}^{R}}(x,y)|\\ &&+w_{7}|D_{i+1}^{R}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)|+w_{8}|{D_{i}^{G}}(x,y)|\\ &&+w_{9}|{V_{i}^{B}}(x,y)| \end{array} $$
$$\begin{array}{@{}rcl@{}} |{H_{i}^{R}}(x,y)|&=&w_{1}|{H_{i}^{R}}(x-1,y)|+w_{2}|{H_{i}^{R}}(x+1,y)|\\ &&+w_{3}|{H_{i}^{R}}(x,y-1)|+w_{4}|{H_{i}^{R}}(x,y+1)|\\ &&+w_{5}|H_{i+1}^{R}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)|+w_{6}|{D_{i}^{R}}(x,y)|\\ &&+w_{7}|D_{i+1}^{R}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)|+w_{8}|{H_{i}^{G}}(x,y)|\\ &&+w_{9}|{H_{i}^{B}}(x,y)| \end{array} $$
(1)

Here, R, G, B denote the three color channels, and D, V, H denote the diagonal, vertical and horizontal subbands of the DWT, respectively. Additionally, \(w_{k}, k = 1, 2,\cdots,9\) are scalar weights, the subscript \(i\) is the scale number, \((x, y)\) is the spatial coordinate within a subband, \(|x|\) denotes the absolute value of \(x\), and \(\lceil x\rceil\) denotes the smallest integer no less than \(x\).
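For readers who want to reproduce the predictor in (1), the following sketch (ours, not the authors' code; it assumes the NumPy and PyWavelets packages, and all function and variable names are illustrative) estimates the weights by least squares for the diagonal subband of the R channel and returns the prediction error whose statistics are used as features.

```python
# Minimal sketch of the linear predictor in (1), restricted for brevity to the
# diagonal subband D_i of the R channel. Not the authors' implementation.
import numpy as np
import pywt

def dwt_subbands(channel, levels=3, wavelet="db4"):
    """Return per-level detail subbands (pywt's H/V/D naming convention)."""
    coeffs = pywt.wavedec2(channel, wavelet, level=levels)
    # coeffs = [cA_n, (cH_n, cV_n, cD_n), ..., (cH_1, cV_1, cD_1)]
    details = coeffs[1:][::-1]            # reorder so index 0 is the finest scale
    return [{"H": h, "V": v, "D": d} for (h, v, d) in details]

def predict_diag(r_bands, g_bands, b_bands, i):
    """Least-squares fit of the 9 weights in (1) for the diagonal subband at scale i."""
    D = np.abs(r_bands[i]["D"])
    Dp = np.abs(r_bands[i + 1]["D"])      # parent (coarser-scale) diagonal subband
    H, V = np.abs(r_bands[i]["H"]), np.abs(r_bands[i]["V"])
    Dg, Db = np.abs(g_bands[i]["D"]), np.abs(b_bands[i]["D"])

    ys, xs = np.mgrid[1:D.shape[0] - 1, 1:D.shape[1] - 1]     # interior pixels only
    cols = [D[ys, xs - 1], D[ys, xs + 1], D[ys - 1, xs], D[ys + 1, xs],
            Dp[ys // 2, xs // 2],         # 0-based approximation of the ceil(./2) parent index
            H[ys, xs], V[ys, xs], Dg[ys, xs], Db[ys, xs]]
    Q = np.stack([c.ravel() for c in cols], axis=1)           # design matrix
    target = D[ys, xs].ravel()
    w, *_ = np.linalg.lstsq(Q, target, rcond=None)            # the weights w_1..w_9
    error = target - Q @ w                                    # prediction error
    return w, error
```

The same fit, with the neighbor sets changed according to (1), yields the predictors for the V and H subbands; with three decomposition levels, i can be 0 or 1.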

The framework of Farid's scheme is shown in Fig. 1.

Fig. 1 Framework of Farid's scheme

However, the DWT-based feature for forensics suffers from the following drawbacks of DWT [12, 17]:

  1) Oscillations. The oscillating wavelet filters used to decompose the image produce a complicated representation of even a simple structure, spreading it over several coefficients in one neighborhood.

  2) Shift-variance. A small shift of the image causes a substantial change in the energy of the wavelet coefficients, which makes it harder to extract stable features from them (see the code sketch following Fig. 2).

  3) Lack of directionality. The conventional wavelet transform only provides local information along three directions, 0°, 45° and 90°, as shown in Fig. 2 for an image decomposed by DWT. Owing to this, the local information needed for recognition cannot be fully encoded.

Fig. 2 DWT decomposition of Lena
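As a quick numerical illustration of drawback 2), the following sketch (ours, assuming the NumPy and PyWavelets packages) compares the per-level detail energies of a step-edge image with those of the same image shifted by one pixel; for the standard DWT the energies typically differ, whereas a shift-invariant transform would leave them essentially unchanged.

```python
# Minimal illustration (ours, not from the paper) of DWT shift-variance:
# the per-level detail energy of a step edge changes when the edge is moved
# by a single pixel.
import numpy as np
import pywt

img = np.zeros((256, 256))
img[:, 128:] = 1.0                          # vertical step edge at column 128
shifted = np.roll(img, 1, axis=1)           # move the edge by one pixel

def detail_energies(x, wavelet="db4", levels=3):
    coeffs = pywt.wavedec2(x, wavelet, level=levels)
    # coeffs[1:] holds (cH, cV, cD) tuples from coarsest to finest level
    return [sum(float((band ** 2).sum()) for band in lvl) for lvl in coeffs[1:]]

print("original :", detail_energies(img))
print("shifted  :", detail_energies(shifted))   # the per-level energies typically differ
```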

3 Özparlak’s scheme

In [15], the authors addressed the drawbacks of Farid's scheme and proposed a CWT-based statistical model that captures image features from the prediction error. The linear predictor for the R channel is written as follows.

$$\begin{array}{@{}rcl@{}} |C_{j}^{R,i}(x,y)|&=&w_{1}|C_{j}^{R,i}(x-4,y)|+w_{2}|C_{j}^{R,i}(x-3,y)|\\ &&+w_{3}|C_{j}^{R,i}(x-2,y)|+w_{4}|C_{j}^{R,i}(x-1,y)|\\ &&+w_{5}|C_{j}^{R,i}(x+1,y)|+w_{6}|C_{j}^{R,i}(x+2,y)|\\ &&+w_{7}|C_{j}^{R,i}(x+3,y)|+w_{8}|C_{j}^{R,i}(x+4,y)|\\ &&+w_{9}|C_{j}^{R,i}(x,y-2)|+w_{10}|C_{j}^{R,i}(x,y-1)|\\ &&+w_{11}|C_{j}^{R,i}(x,y+1)|+w_{12}|C_{j}^{R,i}(x,y+2)|\\ &&+w_{13}|C_{j}^{G,i}(x,y)|+w_{14}|C_{j}^{B,i}(x,y)|\\ &&+w_{15}|C_{j}^{R,i+1}(\lceil\frac{x}{2}\rceil,\lceil\frac{y}{2}\rceil)| \end{array} $$
(2)

where \(C_{j}^{R,i}(x,y)\) denotes the CWT coefficient at position (x, y) in the jth subband of the ith scale.
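Since the predictor in (2) differs from the one in (1) only in its neighbor set, the least-squares sketch given after (1) can be reused; the offset list below (purely illustrative, our own naming) encodes the within-subband context of (2), while the cross-channel terms and the parent term are handled exactly as before.

```python
# Illustrative only: the within-subband neighbor offsets used by the predictor in (2).
# The cross-channel terms (w13, w14) and the parent term (w15) are added exactly as
# in the sketch for (1); only the spatial context changes.
EQ2_SPATIAL_OFFSETS = [(-4, 0), (-3, 0), (-2, 0), (-1, 0), (1, 0), (2, 0), (3, 0), (4, 0),
                       (0, -2), (0, -1), (0, 1), (0, 2)]   # (dx, dy) relative to (x, y)
```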

Figure 3 shows an image decomposed by CWT.

Fig. 3 CWT decomposition of Lena

Compared with DWT, CWT overcomes some of these shortcomings and is more effective in directionality and anisotropy, capturing the smooth contours of an image along different directions. However, it still has two drawbacks:

  1) Discontinuity of local phase. The number of directional subbands is \(2^{N}\), as shown in Fig. 3, and thus CWT cannot capture smooth local phase information for texture.

  2) Directionality redundancy. In order to cover many directions, CWT introduces some redundancy at the algorithmic level.

4 The proposed QWT-based scheme

In this section, the proposed QWT forensics scheme is introduced in detail. As shown in Fig. 4, the magnitude part of the proposed scheme is the same as that of Farid's scheme, while the phase part can only take Farid's scheme as a reference, since no previous scheme has applied QWT to forensics.

Fig. 4 Framework of the proposed scheme

4.1 The quaternion

The quaternion, invented by Hamilton in 1843, is one of the simplest hypercomplex numbers. It is composed of four parts: one real part and three mutually orthogonal imaginary units, i.e., i, j, k. It is expressed as follows.

$$q = a+bi+cj+dk, \quad a,b,c,d\in\mathbf{R}$$

where i, j, k are constrained by

$$i^{2} = j^{2} = k^{2} = -1$$
$$ij = -ji = k, jk = -kj = i, ki = -ik = j$$

It can be observed that quaternion multiplication is not commutative.

The polar representation of the quaternion is formed by the magnitude and three angles, called phases, i.e., 𝜃, υ, φ.

$$q = |q|e^{i\theta}e^{j\upsilon}e^{k\varphi}$$

where \(|q| = \sqrt {a^{2}+b^{2}+c^{2}+d^{2}}\) and \((\theta, \upsilon, \varphi) \in [-\pi ,\pi ] \times [-\frac {\pi }{2}, \frac {\pi }{2}]\times [-\frac {\pi }{4},\frac {\pi }{4}]\).
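To make the algebra above concrete, the short sketch below (ours; plain NumPy, no quaternion library assumed) implements the Hamilton product, verifies that ij = k while ji = -k, and composes the polar form from a given magnitude and phases.

```python
# Minimal quaternion helpers (ours) for the algebra in Section 4.1.
# A quaternion q = a + bi + cj + dk is stored as a NumPy array [a, b, c, d].
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def from_polar(mag, theta, upsilon, phi):
    """Compose |q| e^{i theta} e^{j upsilon} e^{k phi} as in Section 4.1."""
    ei = np.array([np.cos(theta),   np.sin(theta), 0.0, 0.0])
    ej = np.array([np.cos(upsilon), 0.0, np.sin(upsilon), 0.0])
    ek = np.array([np.cos(phi),     0.0, 0.0, np.sin(phi)])
    return mag * qmul(qmul(ei, ej), ek)

i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
print(qmul(i, j), qmul(j, i))          # [0 0 0 1] and [0 0 0 -1]: ij = k, ji = -k
q = from_polar(2.0, 0.3, 0.2, 0.1)
print(np.linalg.norm(q))               # equals the magnitude, 2.0
```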

4.2 The QWT

QWT differs from DWT and CWT in that it provides a local magnitude-phase analysis of images. Its coherent representation of local structural information introduces redundancy in the local phase rather than in directionality.

Bülow proposed that a quaternionic analytic signal can be defined by both the partial Hilbert transforms (HT), i.e., \(H_{0}\) and \(H_{\pi/2}\), and the total HT, \(H_{T} = H_{0}H_{\pi/2}\).

$$f_{A}(x,y) = f(x,y)+iH_{0}f(x,y)+jH_{\pi/2}f(x,y)+kH_{T}f(x,y)$$

The mother wavelet Ψ is a quaternionic 2D analytic filter and is separable, i.e., Ψ = ψ(x)ψ(y). QWT implements a 2D analytic wavelet transform on the basis of this generalization, with a separable scaling function Φ and mother wavelets \({\Psi}^{D}\), \({\Psi}^{V}\), \({\Psi}^{H}\). Therefore, the analytic 2D wavelets of the QWT can be written as separable products as follows [2, 18].

$$\begin{array}{@{}rcl@{}} {\Psi}^{D}&=&\psi_{h}(x)\psi_{h}(y)+i\psi_{g}(x)\psi_{h}(y)+j\psi_{h}(x)\psi_{g}(y)\\ &&+k\psi_{g}(x)\psi_{g}(y)\\ {\Psi}^{V}&=&\phi_{h}(x)\psi_{h}(y)+i\phi_{g}(x)\psi_{h}(y)+j\phi_{h}(x)\psi_{g}(y)\\ &&+k\phi_{g}(x)\psi_{g}(y)\\ {\Psi}^{H}&=&\psi_{h}(x)\phi_{h}(y)+i\psi_{g}(x)\phi_{h}(y)+j\psi_{h}(x)\phi_{g}(y)\\ &&+k\psi_{g}(x)\phi_{g}(y)\\ {\Phi}&=&\phi_{h}(x)\phi_{h}(y)+i\phi_{g}(x)\phi_{h}(y)+j\phi_{h}(x)\phi_{g}(y)\\ &&+k\phi_{g}(x)\phi_{g}(y) \end{array} $$
(3)

Here, due to the fact that the 2D \(H_{T}\) is equivalent to 1D HTs along the x axis and/or the y axis, 1D Hilbert pairs of wavelets and scaling functions are considered:

$$\psi_{h},\psi_{g}=H\psi_{h}$$
$$\phi_{h},\phi_{g}=H\phi_{h}$$

In QWT, the magnitude |q| represents the signal strength at each spatial position in each frequency subband, similar to the DWT magnitude. The phases (𝜃, υ, φ) depict the structure of the signal: the first two QWT phases 𝜃, υ indicate the spatial shift of image features in the x/y coordinate system, while the third phase φ indicates edge orientation mixtures and textural information.
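Equation (3) states that each QWT subband is assembled from four separable real wavelet transforms, one per combination of a wavelet with its Hilbert pair along the two axes. The sketch below (ours, not the authors' implementation) illustrates this assembly with PyWavelets; the Hilbert-pair filter banks `fb_h` and `fb_g` are assumed to be supplied by the user (for example dual-tree q-shift filters), and the exact axis/filter convention may need adjusting to match (3).

```python
# Sketch (ours) of a one-level QWT built from four separable DWTs, following (3).
# `fb_h` and `fb_g` are assumed to be a Hilbert pair of filter banks, each given as
# (dec_lo, dec_hi, rec_lo, rec_hi); obtaining such filters is outside this sketch.
import numpy as np
import pywt

def qwt_level(channel, fb_h, fb_g):
    wav_h = pywt.Wavelet("h_tree", filter_bank=fb_h)
    wav_g = pywt.Wavelet("g_tree", filter_bank=fb_g)

    # Four separable transforms; pywt accepts one wavelet per axis (axis 0 = y, axis 1 = x).
    combos = {"hh": (wav_h, wav_h), "gh": (wav_g, wav_h),
              "hg": (wav_h, wav_g), "gg": (wav_g, wav_g)}
    parts = {name: pywt.dwt2(channel, wavelet=(wy, wx)) for name, (wy, wx) in combos.items()}

    subbands, mag = {}, {}
    for band, name in enumerate(("H", "V", "D")):   # detail subbands returned by dwt2
        a = parts["hh"][1][band]                    # real part:   psi_h(x) psi_h(y)
        b = parts["hg"][1][band]                    # i component: psi_g(x) psi_h(y)
        c = parts["gh"][1][band]                    # j component: psi_h(x) psi_g(y)
        d = parts["gg"][1][band]                    # k component: psi_g(x) psi_g(y)
        subbands[name] = np.stack([a, b, c, d])
        mag[name] = np.sqrt((subbands[name] ** 2).sum(axis=0))   # |q| per coefficient
    return subbands, mag
```

Extracting the three phases from the four components then follows the polar form of Section 4.1; that step is omitted here.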

An image decomposed by QWT is shown in Figs. 5 and 6.

Fig. 5 QWT magnitude decomposition of Lena

Fig. 6 QWT phase decomposition of Lena. In a, the first row is the horizontal direction, the second row the vertical one and the third row the diagonal one. In a and b, the first column is 𝜃, the second column is υ and the third column is φ

4.3 QWT analysis for forensics

Compared with DWT and CWT, QWT includes not only the magnitude, which encodes the frequency information, but also three phases, which carry richer edge and texture information. These three phases contain extra information that is not included in the high-frequency subbands LH, HL and HH of DWT and CWT. The key difference lies in what the high-frequency subbands express: in DWT and CWT they indicate the edge and noise changes along specific directions between the low-frequency approximation and the original image, whereas the low-frequency part of the three QWT phases shows the edges and noise of the whole image across general directions, and the high-frequency part reflects the changes of the edges and textures [12].

As shown in Figs. 5 and 6, the QWT magnitude and phases carry complementary information that can be used for texture analysis. At present, the third phase φ of QWT is widely applied to texture segmentation and classification due to its ability to discriminate different textures and their superpositions [18, 19].

To the best of our knowledge, QWT has not yet been applied to forensics. Based on the above analysis, QWT may provide more information than DWT and CWT for forensics and is expected to achieve better forensics performance. Experimental results validating this expectation are given in Section 5.

4.4 Feature dimension

In Farid's scheme, the first four-order statistics, i.e., mean, variance, skewness and kurtosis, are used to represent the difference between computer-generated and photographic images. A 3-scale DWT is used to decompose both classes of images. Twelve features, called the original features, are extracted per scale and channel from the DWT coefficients of the three subbands LH, HL and HH. Thus, the total dimension of the original feature vector is 12 × 3 × 3 = 108 for the 3 color channels. The prediction error is generated by applying the predictor introduced in Section 2 to the DWT coefficients of the same directional subbands over two scales. The same four statistics are then computed from the prediction errors, giving the prediction features, whose total dimension is likewise 12 × 3 × 3 = 108 for the 3 color channels. In sum, the dimension of the combined feature vector is 216.

Özparlak's scheme is the same as Farid's scheme except that CWT replaces DWT. The images are decomposed by CWT into 3 scales with 28 directional subbands. The dimension of the original feature vector is 4 × 28 × 3 = 336 for the 3 color channels, and that of the prediction feature vector is likewise 4 × 28 × 3 = 336, so the combined feature vector has dimension 672.

In Farid's scheme, DWT coefficients with magnitude less than 1 are discarded. Similarly, in QWT, coefficients with magnitude less than 1 are deleted. Moreover, the phases corresponding to these small coefficients are removed as well, for two reasons: first, there exists a strong dependence between the magnitude and the three phases of QWT; second, the phases of small coefficients carry little meaning and are unstable. In the proposed scheme, a 3-scale QWT is used to decompose the images. Therefore, the dimension of the whole feature vector is 576, comprising 4 × 3 × 3 × 4 = 144 magnitude features (4 statistics, 3 subbands, 3 scales and 4 complex planes) and 144 × 3 = 432 phase features (for the 3 phases).

It is noted that the phase features are obtained by the product of the phase coefficients and the corresponding normalized magnitude coefficients.
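As a concrete reading of this subsection, the sketch below (ours) turns the coefficients of one QWT subband into the four statistics used as features: coefficients with magnitude below 1 are discarded, and the phases are weighted by normalized magnitudes. The choice of normalization is our own assumption, since the text does not specify it, and the full 576-dimensional vector is obtained by repeating this over all subbands, scales, channels and planes as counted above.

```python
# Sketch (ours) of the per-subband statistics of Section 4.4: the four statistics
# of the QWT magnitudes and of the magnitude-weighted phases.
import numpy as np
from scipy.stats import skew, kurtosis

def four_stats(values):
    v = np.asarray(values, dtype=float).ravel()
    return np.array([v.mean(), v.var(), skew(v), kurtosis(v)])  # kurtosis is excess by default

def subband_features(magnitude, theta, upsilon, phi, threshold=1.0):
    keep = magnitude >= threshold                 # drop small, unstable coefficients
    mag = magnitude[keep]
    weight = mag / mag.max()                      # "normalized magnitude" (max-normalization assumed)
    feats = [four_stats(mag)]                     # magnitude statistics
    for phase in (theta, upsilon, phi):           # magnitude-weighted phase statistics
        feats.append(four_stats(phase[keep] * weight))
    return np.concatenate(feats)                  # 16 values for one subband at one scale
```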

5 Experiments and discussions

In the experiments, 7500 CG images and 7500 PG images with high visual quality are used to evaluate the performance of the proposed feature vector (http://www.creative-3d.net and http://www.3dlinks.com) [7, 10], as shown in Fig. 7. In order to enrich the image database, the image content covers all kinds of indoor and outdoor scenes, for example plants, animals, characters, sculptures and architecture. The CG images are generated by more than 50 software packages, including 3D Max, Auto CAD and Maya.

Fig. 7 Examples from the CG/PG database

In order to guarantee a fair comparison, the RGB color model is used in all three schemes. LibSVM (http://www.csie.ntu.edu.tw/cjlin/libsvm/) is used to classify CG and PG images. The kernel function is polynomial with the following parameters: degree 2, gamma 1, coef0 1 and cost 1. The proposed QWT feature vector is split into two parts, i.e., magnitude features and phase features, and the two parts are tested separately for comparison with the DWT and CWT features. Finally, the whole QWT feature vector is evaluated and three ROC curves are depicted.

All experimental results are obtained by 5-fold cross validation. In order to obtain convincing results and validly evaluate the performance of the proposed scheme, all the images of various sizes in the PG and CG databases are used to calculate the statistical features for the DWT-, CWT- and QWT-based schemes. The width of the images in the PG database ranges from 128 to 2704 pixels and the height from 85 to 2259, while the width of the images in the CG database ranges from 130 to 2400 and the height from 140 to 3200. The image sizes thus vary widely, which makes our evaluation conform to realistic conditions.
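The classification protocol above can be reproduced either with the LibSVM command-line tools or with a wrapper such as scikit-learn's SVC (which is built on LibSVM). The sketch below (ours) uses the latter with the stated kernel parameters and 5-fold cross validation; the `features` matrix and `labels` vector are assumed to have been computed as in Section 4.4.

```python
# Sketch (ours) of the classification protocol: a polynomial-kernel SVM with the
# parameters given above, evaluated by 5-fold cross validation.
# `features` (n_images x 576) and `labels` (0 = PG, 1 = CG) are assumed precomputed.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

def evaluate(features, labels, seed=0):
    clf = SVC(kernel="poly", degree=2, gamma=1, coef0=1, C=1)   # degree 2, gamma 1, coef0 1, cost 1
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = cross_val_score(clf, features, labels, cv=cv, scoring="accuracy")
    return scores.mean(), scores.std()
```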

5.1 Tests on the proposed feature

In this section, three kinds of features are tested: the magnitude and phase features, the original and prediction features, and the four statistical features.

For the first kind, the performance is tested by considering the sub-features both independently and in combination, i.e., the magnitude feature, the three phase features, and combinations of the magnitude and the three phases. As shown in Table 1, the accuracy of any single feature is lower than that of the combination of all features. However, it is noted that some combinations perform worse than single features; for example, the accuracy of the φ feature is higher than that of the 𝜃 + φ combination on the PG database. Table 1 also shows that more features do not necessarily mean higher accuracy; for example, the 𝜃 + υ + φ combination is inferior to the υ + φ and 𝜃 + φ combinations on the CG database. Besides, the accuracy of the phase features exceeds that of the magnitude feature, which indicates that the phase features provide better classification performance. Moreover, the combination of the magnitude and phase features is superior to the magnitude feature alone, which confirms the complementary effect of the magnitude and phase features.

Table 1 Tests on the magnitude features and phase features

For the second kind, the original features and the prediction features are tested and compared, with the results shown in Table 2. It is found that the prediction features achieve a better balance between PG and CG, although their accuracy is lower than that of the original features.

Table 2 Tests on original features (OF) and prediction features (PF)

For the third kind, the four statistical features, i.e., mean, variance, skewness and kurtosis, are tested, as shown in Table 3. It is easily understood that the mean feature performs worst, because it only reflects the average over the whole image, which results in weak recognition capability. Similar to Table 1, the accuracy of some combinations is lower than that of single features; for example, mean+skewness is inferior to kurtosis alone. It is noted that the variance+skewness+kurtosis combination obtains higher accuracy than any other combination except the whole feature set, which shows that these three sub-features provide the most complementary information. This test also shows skewness+kurtosis to be the best combination feature.

Table 3 Tests on mean, variance, skewness and kurtosis

5.2 Feature comparison

In this section, the three wavelet-transform-based schemes are compared in order to select the best feature among them. That is, the advantages and disadvantages of their features are analyzed so as to achieve the best recognition accuracy.

From the results in Table 4, it can be observed that Farid's DWT feature has both the lowest dimension and the lowest forensics accuracy, showing that Farid's scheme has the weakest classification capability. Compared with Farid's scheme, Özparlak's scheme improves the forensic accuracy by increasing the feature dimension from 216 to 672. The proposed scheme outperforms the other two: the QWT feature achieves the highest forensics accuracy with a lower feature dimension than Özparlak's scheme.

Table 4 Comparison of the proposed scheme with Farid’s scheme and Özparlak’s scheme

The receiver operating characteristic (ROC) curves of the three wavelet-based schemes are depicted in Fig. 8. It is clearly seen that the QWT feature achieves the best performance among the three schemes.

Fig. 8 ROC curves of QWT, CWT and DWT

Table 4 and Fig. 8 also show that QWT helps to boost forensics performance. The reason is that QWT takes advantage of its rich phase information to represent the fine-grained changes in the image.
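For readers who wish to regenerate curves of the kind shown in Fig. 8, the decision values of a trained SVM can be passed to a standard ROC routine; a minimal sketch (ours, scikit-learn and matplotlib assumed) follows.

```python
# Sketch (ours): ROC curve of one trained SVM on a held-out split, as in Fig. 8.
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def plot_roc(features, labels):
    X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2,
                                              stratify=labels, random_state=0)
    clf = SVC(kernel="poly", degree=2, gamma=1, coef0=1, C=1).fit(X_tr, y_tr)
    scores = clf.decision_function(X_te)                 # signed distance to the margin
    fpr, tpr, _ = roc_curve(y_te, scores)
    plt.plot(fpr, tpr, label=f"QWT feature (AUC = {auc(fpr, tpr):.3f})")
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```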

6 Conclusions and future work

In this paper, QWT is applied to obtain more feature information and thereby improve the identification performance of forensics schemes. First, Farid's scheme and Özparlak's scheme are investigated. Then, QWT is introduced and analyzed, and the proposed scheme is constructed by generating the magnitude features and the phase features. In the experiments, various QWT features are analyzed and tested, including the magnitude and phase features, the original and prediction features, and the four statistical features, and they are compared with features extracted in the DWT and CWT domains. The theoretical analysis and experimental results prove the superiority of the proposed QWT-based scheme over the DWT- and CWT-based schemes. To our knowledge, this may be the first time QWT has been introduced to image forensics. The improvements are apparent, but the advantage of QWT has not been fully explored; for example, the current transform treats the three color channels separately rather than jointly. In future work, a quaternion color wavelet transform for forensics will be studied.