
1 Introduction

Digital images are powerful sensory perceptions, given the natural human tendency to believe what we see. Image-sharing statistics for social media show that Snapchat users share 8796 images per second, WhatsApp users share 8102 images per second, and Facebook users share 4501 images every second [26]. This reveals how deeply digital images are ingrained in our daily lives. People are conditioned to believe in images even though an image can be altered effortlessly. Different types of image manipulation include copy-move forgery, image filtering, and image splicing. In copy-move forgery, an image region is copied and pasted onto the same image, often to amplify the impact of the scene; for example, regions containing vehicles may be copied and pasted within the same image to depict heavy traffic. Image filtering is often carried out to enhance image quality, but it may also alter the original meaning of the image; for example, the color of a vehicle may be altered in a crime-scene photograph. In image splicing, a region from one image is copied and pasted onto another image; for example, a region containing a vehicle may be added to another picture. Examples of image forgery are shown in Fig. 1. Redi et al. give a detailed discussion of the impact and importance of image forgery detection [29].

Fig. 1.

(a) An example of copy-move forgery. (b) The mask showing the copy-moved regions. Both the image and the mask are taken from the CoMoFoD dataset [39]. (c) An example of image splicing. (d) and (e) Source and destination images used for creating the spliced image in (c). Images shown in (c) and (e) are taken from the CASIA V2.0 dataset [14].

Just like any act of crime, image alteration leaves behind pieces of evidence: the traces left by the image processing operations carried out during the alteration. Even re-saving an altered JPEG image in JPEG format can introduce JPEG double-compression artifacts [37]. Forensic analysis of digital images involves detecting and analyzing the evidence left behind during a forgery. The pieces of evidence considered include similarities in pixel-wise regularities (for detecting copy-move forgery) and interpolation pattern inconsistencies (for detecting image splicing). Image splicing introduces several pieces of evidence, including inconsistencies in the underlying camera noise pattern and inconsistencies in scene illuminant color.

Several surveys have been published in the broad area of digital image forensics [3, 17, 28, 33]. In this paper, we survey existing image splicing detection techniques that exploit inconsistencies in illuminant color. This work elaborates on the different illuminant representation models and clarifies how the illuminant color is computed in each model. We hope that the discussion of this background will help readers gain a better understanding of the illuminant color estimation process underlying each technique.

1.1 Motivation

Recently, researchers have started analyzing the illumination of a scene at the time of image capture as evidence for exposing image forgery. Compared to the independent research advancing separately in illuminant color estimation and in image forgery detection, forgery detection based on illuminant color estimation is still at an evolving stage. So far, research bottlenecks and opportunities in illuminant color estimation based forgery detection have been discussed only sparingly, in research theses and a few specific papers. This motivated us to survey the existing work based on illuminant color inconsistency and to give researchers hints regarding possible future research directions.

1.2 Contribution

The main contributions of this survey are

  • A discussion on illuminant color estimation approaches for a better understanding of illuminant color inconsistency based techniques.

  • A survey of existing illuminant color inconsistency based image forgery detection techniques.

  • A detailed listing of future research directions including the need for specific datasets and the possibility of applying new technologies.

The rest of the paper is organized as follows. Section 2 introduces the basic concept of scene illumination. Section 3 explains the different illumination estimation approaches employed in existing illuminant color inconsistency based forgery detection techniques. Section 4 examines various illuminant color based forgery detection techniques. Finally, Sect. 5 elaborates on future research directions, followed by the conclusion in Sect. 6.

2 Scene Illumination

Scene illumination refers to the illumination that prevailed in the scene at the time of image capture. The illumination of an outdoor scene is typically uniform, whereas the illumination of an indoor scene is often non-uniform, as indoor scenes are frequently lit by a mix of multiple light sources. Scene illumination influences the color, or pixel value, recorded by the camera sensor. Hence, the color perceived in an image is not the actual color of the object but a combination of the object color and the color of the scene illumination. The color of the scene illumination is termed the illuminant color.

Humans have the ability to see objects in their actual color irrespective of the illuminant color. This capability of the human visual system is known as color constancy. Incorporating color constancy into computer vision applications is an active research area. For object recognition purposes, the illuminant color is estimated and then removed to recover the actual color of the object. When an image is altered by copy-pasting a region from another image, the illumination of the copy-pasted region may not match that of the rest of the image. Therefore, inconsistency in illuminant color across an image can be considered a clue for detecting forgery. If the illuminant color estimated from a suspect region differs from the illuminant color estimated from the rest of the image, the suspect region may have been copy-pasted from another image captured under different scene illumination. The illuminant color is usually estimated by following one of the illuminant color estimation approaches discussed in Sect. 3.

3 Different Approaches for Estimating Scene Illuminant Color

Several illuminant color estimation techniques are available; for further reading, please refer to the survey by Gijsenij et al. [21], in which various color constancy techniques are surveyed and evaluated. In our survey, the discussion is restricted to statistics-based and physics-based approaches, as these are the two illuminant color estimation approaches commonly employed in current image forgery detection techniques.

3.1 Statistics-Based Approach

In the statistics-based approach, the techniques rely on the color distribution in the image and are therefore influenced by the number of distinct colors present. For example, the traditional Gray-World algorithm [5] assumes that the average color in an image is gray (achromatic); hence, any deviation of the average from gray is attributed to the illuminant color. A generalization of this idea is the Generalized Gray-Edge (GGE) framework proposed by Van De Weijer et al. [40], which assumes that the average color of edges in an image is gray. In this model, the illuminant color is computed from a Minkowski norm (a power-mean integral) of the image derivatives.
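
To make these assumptions concrete, the sketch below (ours, not taken from [5] or [40]) estimates the illuminant under the Gray-World assumption and under a first-order Gray-Edge instance of the GGE framework. The NumPy/SciPy implementation, the Minkowski order p, and the smoothing scale sigma are illustrative choices, not prescribed values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gray_world(img):
    # Gray-World: the mean image color is assumed achromatic, so the
    # per-channel means give the illuminant direction.
    e = img.reshape(-1, 3).mean(axis=0)
    return e / np.linalg.norm(e)

def gray_edge(img, p=6, sigma=1.0):
    # First-order Gray-Edge, a special case of the GGE framework: the
    # Minkowski p-norm of the per-channel derivative magnitudes is assumed
    # achromatic, so channel-wise deviations estimate the illuminant.
    e = np.empty(3)
    for c in range(3):
        dx = gaussian_filter(img[..., c].astype(float), sigma, order=(0, 1))
        dy = gaussian_filter(img[..., c].astype(float), sigma, order=(1, 0))
        grad = np.hypot(dx, dy)               # derivative magnitude per pixel
        e[c] = (grad ** p).mean() ** (1.0 / p)
    return e / np.linalg.norm(e)
```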

3.2 Physics-Based Approach

Physics-based techniques are based on an understanding of the physical properties of light reflection and hence perform well even if the number of colors in an image is small [18]. A popular physics-based illuminant color estimation approach is based on the Dichromatic Reflection Model (DRM) [34]. According to the DRM, homogeneous objects (objects with a uniform surface) show only interface reflection, whereas inhomogeneous objects show both interface and body reflection [38].

In the DRM, the light reflected from an inhomogeneous dielectric surface is considered a combination of specular reflectance (specular highlights, e.g., the bright cheek region in a facial image) and diffuse/body reflectance (light reflected from the surface albedo/matte). Specular reflectance is also known as interface reflectance, since it is the part of the light immediately reflected from the surface, causing specular highlights. The estimation of the illuminant color in the DRM is explained below.

Tan et al. define chromaticity (normalized RGB) as the ratio of an RGB component to the sum of the R, G, and B components [38]. When pixels from a uniformly colored object are plotted in a chromaticity-intensity space, the interface (specular) pixels appear as a varying cluster, whereas the body (diffuse) pixels appear as a straight vertical line, showing that the diffuse pixels are independent of the image intensity, as shown in Fig. 2 [38].

The varying cluster of specular pixels can be clearly understood in the Inverse Intensity-Chromaticity (IIC) space where the x-axis represents Inverse Intensity, defined as,

$$\begin{aligned} \text {Inverse Intensity} = \frac{1}{\sum _{i \in \{R,G,B\}} I_{i}(x)} \end{aligned}$$
(1)

and the y-axis represents chromaticity, with \(I_{i}(x)\) denoting the intensity of color channel \(i\) at pixel \(x\). When pixels from a uniformly colored surface are projected onto the IIC space, the illuminant chromaticity is obtained, as illustrated in Fig. 3 [38].
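
As a concrete illustration (our own sketch, not code from [38]), the snippet below maps the pixels of an RGB image into the IIC space of Eq. (1) for a given channel. Tan et al. additionally restrict the projection to highlight (specular) pixels of a uniformly colored surface; that selection step is omitted here.

```python
import numpy as np

def iic_coordinates(img, c, eps=1e-6):
    # Map the pixels of an RGB image into the Inverse Intensity-Chromaticity
    # space of Eq. (1) for color channel c:
    #   x = 1 / sum_i I_i(x),  y = I_c(x) / sum_i I_i(x).
    pix = img.reshape(-1, 3).astype(np.float64)
    total = pix.sum(axis=1)
    keep = total > eps                          # drop near-black pixels
    inv_intensity = 1.0 / total[keep]           # x-axis of IIC space
    chromaticity = pix[keep, c] / total[keep]   # y-axis of IIC space
    return inv_intensity, chromaticity
```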

Fig. 2.

(a) A green colored synthetic object. (b) The projection of specular and diffuse pixels (green plane) in the chromaticity-intensity space [38] (Reprinted from Tan, R.T., Nishino, K., Ikeuchi, K.: Color constancy through inverse-intensity chromaticity space. Journal of Optical Society of America A 21(3), 321–334 (2004)).

Fig. 3.

(a) The Inverse Intensity-Chromaticity space showing the specular and diffuse pixel clusters. (b) The specular cluster extension pointing towards the illuminant chromaticity (green plane) on the y-axis [38] (Reprinted from Tan, R.T., Nishino, K., Ikeuchi, K.: Color constancy through inverse-intensity chromaticity space. Journal of Optical Society of America A 21(3), 321–334 (2004)).

To compute the illuminant chromaticity, the IIC space is transformed into a Hough space. In the Hough space, the x-axis represents the illuminant chromaticity and the y-axis represents the image chromaticity, as shown in Fig. 4(a). The chromaticity value receiving the maximum number of line intersections is taken as the illuminant chromaticity, as illustrated in Fig. 4(b). To get the illuminant color in the RGB color space, the illuminant chromaticity computation is carried out for the R, G, and B channels separately.
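
A minimal voting sketch of this step is given below (our illustration of the idea, not Tan et al.'s implementation). In IIC space, each specular pixel lies on a line chromaticity = m * inv_intensity + gamma whose intercept gamma is the illuminant chromaticity of the channel; pixels therefore vote for the intercepts implied by a grid of candidate slopes, and the best-supported intercept wins. The slope range and bin count are assumptions.

```python
import numpy as np

def vote_illuminant_chromaticity(inv_intensity, chromaticity, bins=256):
    # Each specular pixel satisfies chromaticity = m * inv_intensity + gamma,
    # with gamma the per-channel illuminant chromaticity. Sweep candidate
    # slopes m, let every pixel vote for the intercept it implies, and take
    # the intercept where the most pixel lines intersect.
    acc = np.zeros(bins)
    for m in np.linspace(-2.0, 2.0, 129):       # candidate slopes (assumption)
        gamma = chromaticity - m * inv_intensity
        idx = (gamma * (bins - 1)).astype(int)
        valid = (idx >= 0) & (idx < bins)       # keep gamma within [0, 1]
        np.add.at(acc, idx[valid], 1)
    return acc.argmax() / (bins - 1)            # illuminant chromaticity
```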

Ideally, in an authentic image, the illuminant chromaticity will be consistent throughout the image. During image splicing, however, when an image region is copy-pasted from another image with a different illuminant chromaticity, the illuminant distribution across the spliced image becomes inconsistent.

Fig. 4.

(a) Hough space. (b) The intersection of lines in Hough space [38] (Reprinted from Tan, R.T., Nishino, K., Ikeuchi, K.: Color constancy through inverse-intensity chromaticity space. Journal of Optical Society of America A 21(3), 321–334 (2004)).

4 Illuminant Color Inconsistency Based Image Forgery Detection Techniques

Recently, researchers have started analyzing illuminant color inconsistency for image forgery detection. Illuminant color based forensics is challenging because (i) most existing illuminant color estimation methods assume a single illuminant, whereas real-world scenes can be multi-illuminated [4], (ii) illuminant color estimation is an ill-posed problem, and (iii) illuminant colors can only be compared across similar materials, since material properties affect the surface reflection.

In general, the illuminant color comparison is therefore restricted to similar material surfaces. In this survey, illuminant color inconsistency based techniques are grouped into two categories, since research is progressing in parallel in two directions based on the objects analyzed. In the first direction, forgery detection is carried out by analyzing the illuminant color of similar objects [6, 16, 20, 43]. In the second direction, techniques concentrate on detecting forgery by analyzing skin regions [13, 19, 31, 41]. This classification is significant in the digital forensics domain, since many reported image forgery cases involve copy-pasted human skin regions.

4.1 Forgery Detection Techniques Analyzing Object Regions

Gholap and Bora devised an illuminant color estimation based forgery detection technique built on the dichromatic reflection model [20]. The R, G, B values of pixels in the specular highlight regions are arranged as a matrix, and Principal Component Analysis (PCA) is carried out via Singular Value Decomposition (SVD). The eigenvectors corresponding to the two most significant eigenvalues constitute the dichromatic plane. The dichromatic planes of different objects are projected as lines in the normalized r-g chromaticity space, and the point of intersection of these lines indicates the illuminant color. In an authentic image, the dichromatic lines estimated from different objects intersect at the same point.

But if the image is forged by copy-pasting regions from different images, the dichromatic lines obtained may not intersect at the same point, indicating forgery. This method assumes that there is only a single light source in the image.
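
A compact sketch of this idea is shown below (ours, following the description above rather than the authors' code): the dichromatic plane of each object's highlight pixels is fitted via SVD, reduced to a line in r-g chromaticity space, and two such lines are intersected to estimate the illuminant.

```python
import numpy as np

def dichromatic_line(highlight_pixels):
    # Fit the dichromatic plane to an Nx3 matrix of specular-highlight RGB
    # pixels via SVD (no centering: the plane passes through the RGB origin)
    # and return the line it traces in normalized r-g chromaticity space as
    # coefficients (a, b, c) of a*r + b*g + c = 0.
    _, _, vt = np.linalg.svd(highlight_pixels, full_matrices=False)
    n = np.cross(vt[0], vt[1])        # normal of the dichromatic plane
    # Substitute b_chroma = 1 - r - g into n . (r, g, b_chroma) = 0:
    return n[0] - n[2], n[1] - n[2], n[2]

def illuminant_from_two_objects(pix1, pix2):
    # Intersect the dichromatic lines of two objects; in an authentic,
    # single-illuminant image the crossing point estimates the illuminant
    # chromaticity, while spliced regions yield mismatched intersections.
    (a1, b1, c1), (a2, b2, c2) = dichromatic_line(pix1), dichromatic_line(pix2)
    return np.linalg.solve([[a1, b1], [a2, b2]], [-c1, -c2])  # (r, g)
```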

Cao et al. developed an image splicing detection technique based on differences in local color statistics and on the difference in illuminant color between the suspect region and the background region [6]. The method rests on the observation that local color statistics are consistent in natural, authentic images. The image is segmented into a foreground region and a background region, and color histograms are extracted from each separately. The distance between the foreground and background histograms is computed using histogram distance measures such as the Chi-square distance and the Kullback-Leibler (K-L) distance, constituting the first set of features. The second set of features is obtained from the illuminant color inconsistency between the foreground and background regions.

To obtain the difference in illuminant color, a new illuminant color estimation method is proposed: the illuminant color is estimated as the average color of near-white pixels, i.e., pixels that exhibit near-zero color difference and near-white luminance. The average color differences of these near-white pixels between the foreground and background regions are computed in both the U and V planes of the YUV color space. Finally, the features are fed to an SVM classifier with an RBF kernel. The test dataset contained 180 realistic forged images, 360 unrealistic forged images, and 540 real images. Experiments are conducted with different color spaces (RGB, YUV, HSV, XYZ, La*b*, and L\(\alpha \beta \)) and different histogram distance measures; the K-L distance in the HSV color model gave the best results. Overall, this illuminant color based method obtained an Average Precision of 56% at a low False Positive (FP) rate of 0.2%.
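
The two histogram distances can be sketched as follows (our illustration; the exact histogram configuration used by Cao et al. is not specified above, so the per-channel binning is an assumption):

```python
import numpy as np

def color_histogram(region, bins=16):
    # Normalized per-channel histogram of an HxWx3 uint8 region,
    # concatenated over the three channels (binning is illustrative).
    h = np.concatenate([np.histogram(region[..., c], bins=bins,
                                     range=(0, 256))[0] for c in range(3)])
    h = h.astype(np.float64)
    return h / h.sum()

def chi_square_distance(h1, h2, eps=1e-10):
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def kl_distance(h1, h2, eps=1e-10):
    # Kullback-Leibler divergence D(h1 || h2); eps avoids log(0).
    return np.sum(h1 * np.log((h1 + eps) / (h2 + eps)))
```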

Wu and Fang proposed another illuminant color inconsistency based image splicing detection technique assuming a single illuminant [43]. Here, the image is divided into overlapping blocks. The method makes use of three illuminant color estimation techniques: the Grey-Shadow, the first-order Grey-Edge, and the second-order Grey-Edge algorithms, all represented by the Generalized Grey-Edge framework proposed by Van De Weijer et al. [40]. The most suitable algorithm for each block is adaptively selected using the maximum likelihood classifier proposed by Gijsenij et al. [22], based on block properties such as color distribution and color edges. The illuminant color of each block is estimated using the selected algorithm and compared with that of a reference block by computing the angular error between the two illuminant colors. If this angular error is greater than a threshold, the corresponding block is considered spliced.
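
The angular error between two illuminant estimates is simply the angle between them as RGB vectors; a minimal implementation (ours) is:

```python
import numpy as np

def angular_error(e_block, e_reference):
    # Angle (in degrees) between a block's illuminant estimate and the
    # reference illuminant; blocks whose error exceeds the chosen threshold
    # (7 in Wu and Fang's experiments) are flagged as spliced.
    cos = np.dot(e_block, e_reference) / (
        np.linalg.norm(e_block) * np.linalg.norm(e_reference))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```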

The image database by Ciurea and Funt [10] is used for classifier training and for determining the block size. A block size of 30\(\,\times \,\)30 is selected, since increasing the block size further reduced the localization accuracy. The adaptive illuminant algorithm selection achieves an accuracy of 94% when tested on this database. The angular error threshold is determined using 100 spliced images taken from the CASIA image tampering database [14]; a detection accuracy of 75% is obtained when the angular error threshold is set to 7.

Fan et al. [16] overcame the need for manual selection of reference objects. Here, the image is divided into vertical and horizontal bands. Illuminant colors are estimated using five algorithms: Grey-World, Max-RGB, Shades of Grey, first-order Grey-Edge, and second-order Grey-Edge. Thus, five illuminant color values are obtained for each band. For each illuminant estimation algorithm, two reference illuminant colors are obtained by taking the median over the vertical bands and over the horizontal bands separately. If the distance between the illuminant color of a band and the reference illuminant is greater than a preset value, that band is considered spliced. For each illuminant color estimation algorithm, the intersection of the vertical and horizontal bands marked as spliced forms a detection map. Finally, the spliced region is detected by intersecting the detection maps of all the algorithms.
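
A sketch of the reference-free flagging step follows (our reading of the method; the distance measure and threshold are illustrative choices):

```python
import numpy as np

def flag_suspect_bands(band_illuminants, threshold):
    # band_illuminants: (num_bands, 3) illuminant estimates from one
    # algorithm over, e.g., the vertical bands. The median across bands
    # serves as the reference illuminant, so no manual reference object is
    # needed; bands deviating by more than the preset threshold are flagged.
    reference = np.median(band_illuminants, axis=0)
    deviation = np.linalg.norm(band_illuminants - reference, axis=1)
    return deviation > threshold          # boolean mask of suspect bands
```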

Experiments are conducted on two sets of images taken from CASIA V2.0 [14]. The first dataset contains images without reference objects; based on the image contents, it is divided into four categories: ‘People’, ‘Animals’, ‘Plants’, and ‘Objects’. The second dataset contains reference objects, where manual marking is required to identify three objects, including one object belonging to the spliced region of the image. The proposed method obtained the highest True Positive Rate (TPR) of 90.00% in the Plants group, the lowest False Positive Rate (FPR) of 20.00% in the Objects group, and the highest Accuracy (ACC) of 76.62% in the Objects group. The overall TPR, FPR, and ACC are 50.75%, 26.34%, and 67.59%, respectively. The performance of the method on the second dataset, with the reference object, is compared with two other illumination based splicing detection methods: the Method based on Dichromatic Line (MDL) [24] and the Method based on Illumination Map (MIM) [31]. Experimental results show that, though MIM gave the highest TPR of 60.00%, the proposed method gave the lowest FPR of 19.58% and the highest ACC of 76.69%. In terms of computation time, Fan et al.’s method required 713.66 s, whereas MDL and MIM required 296.66 s and 640.14 s, respectively.

Table 1. Summary of forgery detection techniques analyzing object regions.
Table 2. Comparison of the performance of forgery detection techniques analyzing object regions.

Illuminant inconsistency based techniques address two kinds of tasks: forgery detection and forgery localization. In forgery detection, the technique classifies an image as either forged or authentic, whereas in forgery localization, the region of forgery is identified. Another point considered when comparing the techniques is the underlying assumption regarding illumination: some of the methods discussed assume a uniform single illuminant, whereas others allow multiple illuminants. Among the four techniques described above, the technique proposed by Gholap and Bora [20] classifies an image as spliced or authentic, whereas the techniques by Wu and Fang [43], Cao et al. [6], and Fan et al. [16] perform forgery localization. For the techniques proposed by Cao et al. [6] and Wu and Fang [43], a reference region must be specified to detect the inconsistency in illuminant color, whereas the techniques proposed by Gholap and Bora [20] and Fan et al. [16] do not require any reference region.

Table 1 summarizes the methods discussed in Sect. 4.1. A comparison of the performance of the methods proposed by Cao et al. [6], Wu and Fang [43], and Fan et al. [16] is given in Table 2. All techniques for which experiments were conducted on a dataset are included in Table 2.

4.2 Forgery Detection Techniques Analyzing Facial Skin Regions

While altering an image, a human facial region may be copied from an image taken in a different lighting environment. This introduces a discrepancy between the copy-pasted facial region and the authentic facial regions. Among the various forgery detection techniques that consider facial skin regions [8, 13, 19, 25, 31, 41, 42], those that consider illuminant color inconsistency are discussed here.

Fig. 5.

(a) Forged image with the third person copy-pasted. (b) Illuminant map. (c) Distance map. The illuminant map and the distance map clearly show the third face as an inconsistent region [31] (Reprinted from Riess, C., Angelopoulou, E.: Scene illumination as an indicator of image manipulation. In: Information Hiding. vol. 6387, pp. 66–80 (2010) with permission from Springer).

Riess and Angelopoulou proposed a method based on the dichromatic reflection model that identifies variation in illuminant color using two maps generated from the image: an illuminant map and a distance map [31]. The image is first segmented into sub-regions based on color similarity. Each sub-region is then partitioned into small patches, and an illuminant color is estimated from each patch using the Inverse Intensity Chromaticity (IIC) space described in Sect. 3.2. In the IIC space, the diffuse pixels of a patch form a horizontal line, whereas the bright specular pixels point toward the illuminant color on the chromaticity axis. Only pixel groups that satisfy two constraints in the IIC space, one on the shape of the pixel distribution and one on its slope, are considered. Illuminants are estimated from these patches, and the illuminant of each sub-region is finally selected through majority voting. The illuminant colors thus obtained are used to generate an illuminant map, in which each sub-region is colored with its selected illuminant color. The distance map is generated by representing the deviation of the illuminant color computed from specially selected sub-regions relative to the rest of the sub-regions.

Both the illuminant map and the distance map expose the inconsistency in illuminant color in the altered region; an example is shown in Fig. 5. A manual examination of the illuminant map and the distance map reveals the copy-pasted image region. The advantage of this method is that it estimates the illuminant color over local regions and hence works well on real-world multi-illuminant images.

Carvalho et al. proposed a machine learning based method [13] that automates the image splicing detection method of Riess and Angelopoulou [31]. The method analyzes facial skin pixels to detect image forgery and consists of five stages.

In the first stage, two variants of illuminant maps are generated after partitioning the image into groups of pixels of similar color. The first variant, the IIC based illuminant map, is generated as proposed by Riess and Angelopoulou [31]. The second variant is the statistics-based Generalized Gray World (GGW) illuminant map, in which the illuminant color of each small pixel group is estimated using the method proposed by Van De Weijer et al. [40]. In the second stage, the facial regions are extracted from the illuminant maps, with the user specifying a bounding box around each face. In the third stage, a feature set consisting of texture and edge descriptors is extracted from both the IIC and GGW illuminant maps: edge features are generated using a new edge-based Histogram of Gradients (HOGedge) descriptor built on the HOG descriptor [12], and texture features use the Statistical Analysis of Structural Information (SASI) descriptor [7]. The method identifies an image as tampered if any face pair in the image is inconsistently illuminated; thus, in the fourth stage, all face pairs are considered and the features of each face pair are concatenated. In the fifth and final stage, the SASI-Gray-World, SASI-IIC, HOGedge-IIC, and HOGedge-GGW features are each fed to an independent Support Vector Machine (SVM) classifier; an SVM meta-fusion then combines the outputs of the independent classifiers into a new feature set, which is fed to another SVM classifier to categorize the image as tampered or original.
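
A minimal late-fusion sketch in the spirit of this meta-fusion stage is shown below (scikit-learn based; the kernels and the use of class probabilities as meta-features are our assumptions, not the authors' exact configuration):

```python
import numpy as np
from sklearn.svm import SVC

def fit_meta_fusion(feature_sets, y):
    # feature_sets: list of per-face-pair feature matrices (e.g., SASI and
    # HOGedge features over the GGW and IIC maps); y: 1 = inconsistent pair.
    base = [SVC(kernel='rbf', probability=True).fit(X, y) for X in feature_sets]
    # Stack base-classifier scores as the meta-level feature set (in
    # practice these would come from held-out predictions to avoid bias).
    meta_X = np.column_stack([clf.predict_proba(X)[:, 1]
                              for clf, X in zip(base, feature_sets)])
    meta = SVC(kernel='rbf').fit(meta_X, y)
    return base, meta

def predict_meta_fusion(base, meta, feature_sets):
    meta_X = np.column_stack([clf.predict_proba(X)[:, 1]
                              for clf, X in zip(base, feature_sets)])
    return meta.predict(meta_X)          # 1 = tampered, 0 = original
```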

In this work, Carvalho et al. introduced two datasets, DSO-1 and DSI-1. DSO-1 contains 200 images (100 original and 100 spliced) with a resolution of 2,048\(\,\times \,\)1536 pixels. The DSI-1 dataset consists of 25 authentic and 25 tampered images downloaded from the Internet. When the meta-fusion SVM classifier is tested on the DSO-1 dataset, it obtains an overall Area Under the Curve (AUC) of 86.3%, whereas a manual evaluation of the same dataset achieved only 38.3% on tampered images. The DSI-1 dataset is used for a cross-database experiment with the classifier trained on DSO-1, yielding an AUC of 82.6% and indicating generalization to images from other sources as well.

Francis et al. devised illumination based forgery detection using human skin highlight pixels [19]. The method works as follows. The input image is segmented into the facial regions of the different persons present in the image. For each person, pixels on the nose tip are selected (manually, or automatically using any face detection technique). Principal Component Analysis (PCA) performed on the pixels sorted from the darkest onward estimates the body reflection vector, while PCA on the pixels sorted from the brightest onward yields the specular reflection vector. The direction of the specular reflection vector is mapped onto the RGB chromaticity space; this direction gives the estimate of the illuminant color. The chromaticity coordinates of the illuminant colors obtained for the different persons are plotted in the normalized r-g space, the Euclidean distances between the points are calculated, and a distance greater than a threshold indicates forgery. This method requires frontal facial regions, since the illuminant color is estimated from nose-tip highlights.

Carvalho et al. extended their previous work [13] by considering more color spaces in addition to YCbCr and by using a more powerful classifier fusion and selection method [8, 9]. In this method, both GGW and IIC based illuminant maps are generated from the color-segmented input image. The facial regions are represented in four different color spaces, HSV, Lab, YCbCr, and RGB, since different features are highlighted in different color spaces. Various visual properties of the image, such as texture, shape, and color, are extracted and represented as image descriptors. A combined image descriptor representing the illuminant map, the color space, and the visual properties is computed. A feature vector is then obtained for each pair of faces by concatenating their image descriptors. An optimum combination of these feature vectors is then selected and classified through a classifier fusion and selection technique. A classification rate of 94% is achieved, reducing the error of the previous method [13] by 72%.

In addition to forgery detection, forgery localization is performed by computing the probability of a face being spliced using an SVM classifier, once an image has been classified as spliced. Forgery localization is based on the finding that the difference between the illuminant maps (GGW and IIC) for a spliced facial region is higher than that for an original face. For forgery localization, SVM classifiers with various color descriptors, namely color correlograms [23], Border/Interior pixel Classification (BIC) [35], color coherence vectors [27], and local color histograms [36], are used, obtaining detection accuracies of 76%, 85%, 83%, and 69%, respectively.

Vidyadharan and Thampi proposed another forgery localization technique [41] using the illuminant maps introduced by Riess and Angelopoulou [31]. Both Generalized Gray-World (GGW) and Inverse Intensity Chromaticity (IIC) based illuminant maps are used. In this technique, the facial regions are extracted manually from the illuminant maps and converted from RGB to grayscale. All the facial regions are arranged as an M\(\,\times \,\)N matrix in which each row represents a facial region as an N-dimensional vector, where N is the number of pixels in a facial region and M is the number of faces in the image. PCA is carried out on this M\(\,\times \,\)N matrix by decomposing it using Singular Value Decomposition (SVD). When the facial regions are projected onto the principal component space, the illumination variance between the faces becomes apparent: facial regions with similar illumination properties group together along the principal component axes, while a face with dissimilar illumination properties is projected as an outlier. When the proposed method is evaluated on images containing three or more faces from the DSO-1 dataset, the detection accuracies obtained on the GGW and IIC illuminant maps are 64% and 62%, respectively. For images with three or more faces in the DSI-1 dataset, a detection accuracy of 42% is obtained for both the IIC and GGW maps.
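
A compact sketch of the projection step follows (ours, based on the description above; the choice of two principal components and the centroid-distance criterion are illustrative):

```python
import numpy as np

def illumination_outlier_face(face_matrix, n_components=2):
    # face_matrix: MxN matrix, one grayscale face region per row. Project
    # the faces onto their top principal components via SVD and return the
    # index of the face farthest from the group centroid, i.e., the
    # candidate inconsistently illuminated (spliced) face.
    X = face_matrix - face_matrix.mean(axis=0)     # center the rows
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ vt[:n_components].T               # PC coordinates per face
    dist = np.linalg.norm(scores, axis=1)          # distance from centroid
    return int(dist.argmax()), dist
```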

Mazumdar and Bora devised an illumination signature capable of detecting forged images [25]. The signature, the Dichromatic Plane Histogram (DPH), is based on the DRM. From each face, a DPH is generated using the 2-D Hough transform, and the DPHs obtained from two faces are compared using a correlation measure. If the correlation value is higher than a pre-specified threshold, the illumination is considered consistent between the faces and the image is identified as authentic; otherwise, the illumination is considered inconsistent and the image is identified as forged. The method is evaluated on a combination of subsets of images from the DSO-1 and DSI-1 datasets and a new dataset created by the authors, the Face Splicing Detection (FSD) dataset. The proposed method obtained an AUC of 91.2% when tested on the combined dataset containing 55 spliced and 55 authentic images. Further experiments with images compressed at JPEG quality factors 90, 80, and 70 gave AUC values of 90.8%, 90.6%, and 89.6%, respectively, showing that the method is robust to JPEG compression.

Table 3. Summary of forgery detection techniques analyzing facial skin regions.
Table 4. Comparison of forgery detection techniques analyzing facial skin regions.

The illuminant color inconsistency based forgery detection methods that consider facial regions are studied along three dimensions: the task (forgery detection or localization), the approach followed (machine learning or non-machine learning), and the assumption regarding the illuminant (single or multi-illuminant). A summary of the techniques discussed in Sect. 4.2 is given in Table 3, and a performance comparison is given in Table 4. Only techniques with experimental results available on a dataset are included in Table 4.

5 Future Research Directions

Illuminant inconsistency based forgery detection is an evolving research area, given the lack of multi-illuminant estimation techniques and of proper datasets. Here, we discuss the future research opportunities identified by other researchers, along with the directions revealed during our literature survey.

The effect of the camera’s inbuilt color constancy algorithm. Riess has observed that it is not possible to tell how the illuminant color is affected by the camera’s own color constancy methods [30]. Riess clearly states the need for a proper dataset of images captured using different camera models, with a color chart present in each image, to support illuminant color analysis. According to Riess, such a dataset could also help in exploring interpolation patterns or the camera response function for forensic analysis of digital images.

The effect of JPEG compression, noise, and blur. Another research direction is the study of the effect of compression schemes such as JPEG on illuminant color based techniques, as mentioned in the work of Carvalho et al. [13]. Current state-of-the-art methods work better on uncompressed data than on JPEG compressed images. Hence, how the JPEG compression process and the presence of compression artifacts affect the illuminant maps needs to be explored. Similarly, the effects of noise, camera defocus, and blur also require further exploration.

The effect of known illuminant-color variation and skin-tone variation. Carvalho et al., in their recent work [8], note that three kinds of experiments can be considered in the future. First, images captured under known lighting can be used, and the proposed detector can be tested on image pairs whose illumination differs by a known amount. Second, experiments analyzing the distribution of illumination across the different people in an image should be carried out. Finally, the influence of different skin tones can be studied.

The effect of Fresnel rectification of skin pixels. Illuminant color estimation using the Inverse Intensity-Chromaticity space [38] assumes that the color of specular pixels is the color of the illuminant, an assumption known as Neutral Interface Reflection (NIR). However, the geometry of the scene and the refractive index of the surface affect the specular pixels. The wavelength-dependent refractive index, the color of the object, and the geometry are captured as a function of wavelength, known as the Fresnel term, in the reflectance model proposed by Cook and Torrance [11]. The NIR based model neglects this Fresnel effect. Eibenberger and Angelopoulou found that ignoring the Fresnel effect introduces an error into specular based illuminant color estimation methods [15], and showed that rectifying the resulting illuminant color shift for human skin pixels can improve illuminant color estimation by 30%. In the future, illumination estimation from human skin regions should incorporate this correction as well.

Application of recent illuminant color estimation methods. Although Riess and Angelopoulou’s pioneering work on illumination representation [31] and the subsequent works by Carvalho et al. [8, 13] handle multi-illumination, researchers can also consider recent multi-illuminant estimation methods, such as the method proposed by Beigpour et al. [1]. Similarly, the adaptive color constancy from skin pixels proposed by Bianco and Schettini [2] can be considered for skin-pixel based forgery detection [13]. Illuminant estimation from multiple dissimilar surface materials can also be attempted, as in the recent work that removed the need for similar surface materials when detecting the direction of the light source [32].

Application of deep neural networks. The machine learning based illumination inconsistency detection techniques proposed by Carvalho et al. rely on texture, edge, and color features extracted from images [8, 13]. Recently, feature-extraction-driven computer vision applications have increasingly been addressed by deep learning, and deep learning techniques can likewise be explored for image forgery detection. Currently, the lack of large datasets sets back research in this direction.

6 Conclusion

Inconsistency in illumination can be considered a potential clue when authenticating digital images during a digital crime investigation. This survey gives an overview of recently devised illuminant color inconsistency based image forgery detection mechanisms. The underlying illumination models are explained to help researchers understand the techniques clearly. Illuminant color inconsistency based forgery detection schemes are grouped into two categories based on the type of image regions considered; since images with human skin regions are important in many forensic investigations, techniques that deal with human facial regions are categorized separately. In a nutshell, researchers need to consider the creation of new datasets with ground truth illuminant color information for each image, explore research directions that account for the effects of JPEG compression, noise, blur, and Fresnel rectification of skin pixels, and incorporate new multi-illuminant estimation techniques.