
1 Introduction

Personal authentication has emerged as a primary requirement of modern society. It is realized using biometric characteristics, either physiological (such as face [18, 19], fingerprint [7, 25], iris [5, 16, 20], palmprint [21], ear [23, 24], knuckleprint [4, 17, 22]) or behavioural (such as gait and speech), which are assumed to be unique for each individual. However, every trait has its own challenges and trait-specific issues; hence no single biometric trait can be considered the best. The human eye is a well-protected internal organ with rich texture information that can be used for identification. Its layered pigmentation structure is composed of various colors and patterns, including ridges, furrows, spots and curves.

Estimating the amount and location of iris occlusion is a critical step and has to be carried out before recognition. In [10], parabola fitting has been used to detect the upper and lower eyelids in the original iris image. In [3], eyelashes are detected using the gray-level co-occurrence matrix (GLCM): a GLCM is computed for windows of fixed size, and the fuzzy K-means algorithm clusters these windows into skin, eyelash, sclera, pupil and iris using the computed GLCM values. In [13], the eyelids are detected by line fitting on the edge map of the raw image within the inner and outer boundaries of the iris.

In [8], iris recognition is performed by creating a binary representation, and a real-valued representation similar to [27] is also extracted. Several filters have been proposed to extract useful features from the unwrapped iris image. In [15], the discrete cosine transform (DCT) is applied to overlapping rectangular patches rotated at \(45^\circ \) from the circumferential direction; differences of DCT coefficients of adjacent patches are binarised to create the template. In [12], the dyadic wavelet transform of a group of \(1-D\) row signals obtained from the unwrapped iris image is used to create a template. Ordinal measures proposed in [26] use relative intensities of image regions to generate a binary template. In [13], quantised phase data from 1D Log-Gabor filters is used to generate iris templates. The disadvantage of Gabor-filter based approaches is that extensive parameter optimisation is required for accurate recognition. Approaches that use real-valued features rely on distance measures such as Euclidean and cosine distance. In [2], four different features, extracted using Gabor filtering, histograms of phase coefficients, Daubechies wavelets and the DCT, are used to train a neural network. Statistical methods like PCA [9] and ICA [11] have been used as supplements to wavelets. Band-limited phase-only correlation (BLPOC) based on \(2D-\)DFTs of two images has been used for matching in [14].

In this paper, an occlusion mask is first determined for each iris. Robust features based on the relative Gaussian filtering responses of neighboring regions are then computed and matched using the Hamming distance. The paper is organized as follows: Sect. 2 describes the proposed algorithm, Sect. 3 presents the experimental results of the proposed system, and Sect. 4 concludes the work.

2 Proposed Approach

Iris segmentation is done using the technique proposed in [6]. The iris region is normalized to a fixed size to deal with iris dilations. One of the major hurdles in iris recognition is occlusion (hiding of the iris) due to eyelids, eyelashes, specular reflections and shadows: occlusion hides useful iris texture and introduces irrelevant parts such as eyelids and eyelashes that do not belong to the iris.

2.1 Occlusion Detection

Occlusion is detected from the normalized image instead of the original iris image. It is done in three steps: eyelid, eyelash and specular reflection detection.

[A] Eyelid Detection - The major portion of the occluded iris area is constituted by the lower and upper eyelids. Instead of traditional parabola/ellipse fitting, a region-growing approach [1] is used to determine the eyelids. For a normalized image of size \(r \times c\), two seed points \((r,\frac{c}{4})\) and \((r,\frac{3c}{4})\) are chosen to perform region-growing for the lower and upper eyelid respectively. These points are chosen because, after normalization, the upper and lower eyelids mostly lie at angles of \(\left( \frac{\pi }{2}\right) ^{c}\) and \(\left( \frac{3\pi }{2}\right) ^{c}\) w.r.t. the horizontal axis. Region-growing begins at these seeds with a low threshold and expands each region until it encounters dissimilar pixels, yielding the expected lower and upper eyelid regions. Region-growing overcomes the problem of the shape irregularity of eyelids and gives the exact area occluded by them. It fails when the eyelid boundary does not have good contrast, in which case the region grows outside the eyelid. Therefore region-growing is repeated with a lower threshold so that the region stays within the eyelid; if it still grows beyond a limit, it indicates that there is no eyelid. Finally, a binary mask is generated in which every eyelid pixel is set to 1, as shown in Fig. 1(b).
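A minimal Python sketch of this region-growing step is given below, assuming a grayscale normalized iris. The similarity threshold and the area limit used to detect overgrowth (here 12 gray levels and 35 % of the image) are illustrative assumptions; the paper does not specify these values.

```python
import numpy as np
from collections import deque

def grow_region(img, seed, thresh):
    """Grow a region from `seed`, absorbing 4-connected neighbours whose
    intensity is within `thresh` of the running region mean."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(img[seed]), 1
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < h and 0 <= ny < w and not mask[nx, ny] \
                    and abs(img[nx, ny] - total / count) < thresh:
                mask[nx, ny] = True
                total += img[nx, ny]
                count += 1
                queue.append((nx, ny))
    return mask

def detect_eyelids(norm_iris, thresh=12.0, max_area=0.35):
    """Seed region-growing at (r, c/4) and (r, 3c/4); if a region covers
    more than `max_area` of the image it has leaked past a low-contrast
    eyelid boundary, so retry with a lower threshold, and finally treat
    a still-overgrown region as 'no eyelid present'."""
    img = norm_iris.astype(float)
    r, c = img.shape
    eyelid = np.zeros((r, c), dtype=bool)
    for seed in ((r - 1, c // 4), (r - 1, 3 * c // 4)):
        region = grow_region(img, seed, thresh)
        if region.mean() > max_area:          # leaked: lower the threshold
            region = grow_region(img, seed, thresh / 2)
        if region.mean() <= max_area:         # plausible eyelid region
            eyelid |= region
    return eyelid
```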

[B] Eyelash Detection - Eyelashes are of two types: separable and multiple. Separable eyelashes are thin, thread-like structures, whereas multiple eyelashes constitute a shadow-like region. Eyelashes have lower intensity than the iris texture, but an absolute threshold separating them from the rest of the iris cannot be determined reliably because it is very sensitive to illumination. Eyelashes have high contrast with their surrounding pixels while having low intensity; as a result, the standard deviation in a small region around separable eyelashes is high. The normalized image is convolved with a \(3 \times 3\) standard deviation filter, and a high filter response is used to localize the separable eyelashes. Multiple eyelashes do not produce a high filter response, but they do have low intensity. Hence, to give some weight to low pixel intensities, the filter response of each pixel is normalized with respect to the maximum response. A combined feature CF(i,j) for every pixel P(i,j) is computed as:

$$\begin{aligned} CF(i,j) = 0.5\times SD(i,j) + 0.5\times (1-N(i,j)) \end{aligned}$$
(1)

where SD is the normalized standard deviation filter response and N represents the normalized intensity value (in [0, 1]) of pixel (i,j) in the normalized iris. This feature widens the gap between eyelash and non-eyelash parts. The histogram \(CF_H\) of CF has two distinct clusters: one with low CF values belonging to iris pixels, the other with high CF values belonging to eyelash pixels. To separate these two clusters, Otsu thresholding is used, which considers all possible thresholds and chooses the one that minimizes the intra-cluster variance. It thus separates the eyelash portion from the iris portion, as shown in Fig. 1(c).
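A possible sketch of this eyelash step follows; the local standard deviation is computed from windowed first and second moments, and skimage's Otsu threshold stands in for the histogram clustering described above.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.filters import threshold_otsu

def eyelash_mask(norm_iris):
    """Combined-feature eyelash detection of Eq. (1): blend the 3x3 local
    standard deviation (high around thin separable lashes) with inverted
    intensity (multiple lashes are dark), then cluster with Otsu."""
    img = norm_iris.astype(float)
    # 3x3 local standard deviation via sqrt(E[x^2] - E[x]^2)
    mean = uniform_filter(img, size=3)
    mean_sq = uniform_filter(img ** 2, size=3)
    sd = np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))
    sd /= sd.max()                                   # normalize response
    n = (img - img.min()) / (img.max() - img.min())  # intensity in [0, 1]
    cf = 0.5 * sd + 0.5 * (1.0 - n)                  # Eq. (1)
    return cf > threshold_otsu(cf)                   # True = eyelash pixel
```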

[C] Specular Reflection Detection - Specular reflections can be detected by simple thresholding: pixels exceeding a threshold value of 200 (very bright) are declared specular reflections. An absolute threshold can be used here because specular reflections are always very bright. A binary mask is generated in which all specular reflection pixels are set to 1, as shown in Fig. 1(d). The final occlusion mask is generated by logical OR-ing of the binary masks of eyelid, eyelash and specular reflection, as shown in Fig. 1(e).

Fig. 1. Determination of overall occlusion mask
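Combining the three masks is then a one-liner, reusing the two sketches above:

```python
def occlusion_mask(norm_iris, reflection_thresh=200):
    """Overall occlusion mask (Fig. 1(e)): logical OR of the eyelid,
    eyelash and specular reflection masks."""
    reflection = norm_iris > reflection_thresh   # absolute threshold of 200
    return detect_eyelids(norm_iris) | eyelash_mask(norm_iris) | reflection
```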

2.2 Enhancement

The normalized iris is enhanced to highlight its rich texture. Non-uniform illumination is an important artifact introduced by varying illumination conditions: it adds different intensities to different iris regions and introduces noise. An example of such an image is shown in Fig. 2(a). The mean intensity of each \(8\times 8\) block is used as an estimate of the background illumination (e.g. Fig. 2(b)), which is subtracted from the original image to obtain a uniformly illuminated image (e.g. Fig. 2(c)). Contrast-Limited Adaptive Histogram Equalization (CLAHE) [28] is then applied to the uniformly illuminated image to obtain the enhanced image (e.g. Fig. 2(d)). The entire enhancement process is shown in Fig. 2.

Fig. 2. Overall enhancement process
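A compact sketch of this pipeline using OpenCV is shown below; it assumes the image dimensions are multiples of the block size, and the CLAHE clip limit and tile grid are illustrative defaults, not values from the paper.

```python
import cv2
import numpy as np

def enhance(norm_iris, block=8):
    """Background subtraction followed by CLAHE (Fig. 2): 8x8 block means
    estimate the background illumination, which is upsampled and
    subtracted; CLAHE then stretches local contrast."""
    img = norm_iris.astype(np.float32)
    h, w = img.shape
    # per-block means -> coarse background, resized back to full size
    bg = img.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    bg = cv2.resize(bg, (w, h), interpolation=cv2.INTER_LINEAR)
    flat = cv2.normalize(img - bg, None, 0, 255, cv2.NORM_MINMAX)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(flat.astype(np.uint8))
```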

2.3 Feature Extraction Approach

A raw iris image is very sensitive to translation, rotation, blurring, noise, occlusion and non-uniform illumination. This sensitivity increases intra-class differences and may also reduce inter-class differences, which can lead to incorrect identification. Hence, a template is generated that represents the iris image in a robust, compact and unique manner.

In the proposed feature extraction approach, relational measures (RM) are calculated over various regions of the enhanced image for iris encoding. The average intensity of an image region is compared with that of each of its four equi-spaced neighboring regions (as shown in Fig. 3(a)) at a fixed distance, and the sign of each comparison is encoded into a bit. Hence for every selected region, four bits corresponding to the signs of the comparisons with its four neighbors are obtained. Vertically and horizontally overlapping regions are chosen from the normalized image, and the four bits per region are concatenated to create a 2-D binary template.

Feature Extraction using Relational Measures (RM) - Relational measures are features based on relational operators like \(>\), \(<\), \(=\). These ordinal relationships are more robust than absolute differences. A single order relation (i.e., greater than or less than) is encoded in one bit. A central region of size \(b\times b\) is chosen along with its four neighboring regions of size \(b\times b\) (as shown in Fig. 3(b)) at a particular distance d, where d is large compared to b. The value of d is kept larger than b because distant regions provide comparisons between uncorrelated regions, which is more robust than comparisons between closer regions. A symmetric 2D Gaussian filter centrally clipped to size \(b\times b\) is convolved with each of these five regions, as shown in Fig. 3(a). The symmetric 2D Gaussian filter is a bell-shaped probability density function (PDF) defined as:

$$\begin{aligned} G(\mathbf {\mu }, \sigma ) = \frac{1}{2\pi \sigma ^2}e^{-\frac{(\mathbf {X}-\mathbf {\mu })(\mathbf {X}-\mathbf {\mu })^{T}}{2\sigma ^2}} \end{aligned}$$
(2)

where \(\mathbf {\mu }\) is the spatial location of the peak (mean), \(\sigma \) is the standard deviation of the Gaussian and \(\mathbf {X}\) is the spatial location. Assume that the x-coordinate increases from top to bottom of the image whereas the y-coordinate increases from left to right. If the midpoint of the central region is (0, 0), then the peaks of the four identical Gaussian filters used for its neighbors are at \((-d, 0)\), (0, d), (d, 0) and \((0,-d)\), as shown in Fig. 3(a).

Fig. 3. Region filtering

The response of the central region is compared with those of its neighbors: if its response is greater than that of a neighbor, the comparison is encoded as 1, else as 0. Thus, a four-bit code is obtained for each central region. Vertically and horizontally overlapping rectangular patches over the entire image are chosen as candidates for the central region, as shown in Fig. 3(b). These bits are concatenated according to the spatial locations of their corresponding central regions to generate a 2-D binary template (feature vector).

An occlusion mask is also generated for every iris image. A second level mask, aligned with the feature vector, is generated as follows: if the central block has more than 80 % occluded pixels, the four-bit code for that block is masked by the values [1, 1, 1, 1] in the second level mask; otherwise it is left unmasked, i.e. [0, 0, 0, 0] is put in the mask. The second level mask is essential because the feature vector is block-based rather than pixel-based; hence the masking must also be brought down to the block level. The algorithm to compute the feature vector and second level mask is given in Algorithm 1.

A \(b\times b\) Gaussian filter GC is obtained by clipping the generalized Gaussian \(G(\mathbf {0},\sigma )\) centrally around its zero mean. The feature template RMT and block-level mask BM are initialized to zero. A central block of size \(b\times b\) is chosen from the normalized image NI and convolved with GC to obtain the scalar response RC. The same convolution is applied to the neighboring regions to obtain responses \(RT,\ RR,\ RB,\ RL\) corresponding to the top, right, bottom and left directions respectively. RC is compared with each of \(RT,\ RR,\ RB,\ RL\), and based on the sign of each comparison a 0 or 1 is saved. Similarly, if the number of occluded pixels in the central block exceeds a threshold, [1, 1, 1, 1] is placed in BM. This process is repeated for all overlapping central blocks, chosen according to the parameters \(d_1, d_2\) shown in Fig. 3(b). The feature and mask bits so obtained are concatenated according to spatial position to generate RMT and BM.

Algorithm 1. Computation of the feature template RMT and the block-level mask BM
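The following Python sketch mirrors Algorithm 1 under stated assumptions: the block size b, distance d, step (standing in for \(d_1, d_2\)) and Gaussian scale are illustrative choices, boundary blocks whose neighbors fall outside the image are simply skipped, and occ_mask is the pixel-level occlusion mask of Sect. 2.1.

```python
import numpy as np

def gaussian_kernel(b, sigma):
    """Symmetric 2D Gaussian of Eq. (2), centrally clipped to b x b."""
    ax = np.arange(b) - (b - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

def rm_template(enh, occ_mask, b=8, d=24, step=4, sigma=3.0, occ_frac=0.8):
    """Algorithm 1 sketch: compare each central block's Gaussian-filtered
    response RC with the responses RT, RR, RB, RL of its four neighbours
    at distance d, encoding each comparison as one bit of RMT; blocks
    more than occ_frac occluded get [1,1,1,1] in the block mask BM."""
    gc = gaussian_kernel(b, sigma)
    img = enh.astype(float)
    h, w = img.shape
    resp = lambda x, y: float((img[x:x + b, y:y + b] * gc).sum())
    rmt, bm = [], []
    for x in range(d, h - d - b + 1, step):      # overlapping central blocks
        rmt_row, bm_row = [], []
        for y in range(d, w - d - b + 1, step):
            rc = resp(x, y)
            neighbours = [resp(x - d, y), resp(x, y + d),   # top, right
                          resp(x + d, y), resp(x, y - d)]   # bottom, left
            rmt_row += [1 if rc > rn else 0 for rn in neighbours]
            occluded = occ_mask[x:x + b, y:y + b].mean() > occ_frac
            bm_row += [1, 1, 1, 1] if occluded else [0, 0, 0, 0]
        rmt.append(rmt_row)
        bm.append(bm_row)
    return np.array(rmt, dtype=np.uint8), np.array(bm, dtype=np.uint8)
```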

2.4 Matching

Templates and second level masks of all images in the database are created during the feature extraction stage. Matching between two irides considers their respective templates and corresponding masks and calculates a dissimilarity score. The Hamming distance metric is used as the dissimilarity score between binary templates. The Hamming distance between two 2D binary templates \(t_1\) and \(t_2\) of the same size \(M\times N\) is defined as in Eq. (3).

$$\begin{aligned} HD_{(t_1,t_2)} =\frac{ \Sigma _{i=1}^{M}\Sigma _{j=1}^{N} (t_{1_{(i,j)}} \oplus t_{2_{(i,j)}})}{M\times N} \end{aligned}$$
(3)

where \(\oplus \) stands for bitwise exclusive-OR and \(t_{1_{(i,j)}}\) is the \((i,j)^{th}\) bit of \(t_1\). Thus, HD is zero iff all bits of both templates agree; hence HD is low for genuine matchings and high for impostor matchings. Occluded pixels are not considered while matching: to restrict matching to the valid bits of the templates, the second level occlusion masks are used. Equation (3) is modified using the occlusion masks \(m_1\) and \(m_2\) of the respective templates as:

$$\begin{aligned} HD_{(t_1,t_2)}=\frac{ \Sigma _{i=1}^{M}\Sigma _{j=1}^{N}\ [ t_{1_{(i,j)}}\oplus t_{2_{(i,j)}} ]\ \wedge \ \overline{[m_{1_{(i,j)}}+m_{2_{(i,j)}}]}}{M\times N-\Sigma _{i=1}^{M}\Sigma _{j=1}^{N} [m_{1_{(i,j)}} +m_{2_{(i,j)}}]} \end{aligned}$$
(4)

where the operators \(\oplus \), \(\wedge \), \(+\) and the overline stand for binary XOR, AND, OR and NOT respectively; \(a \oplus b=1\) if a and b differ, else 0. A bit position occluded in either mask thus contributes neither to the numerator nor to the denominator.
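Equation (4) translates directly into a few lines of NumPy; in this sketch, templates and masks are uint8 arrays of the same shape, with mask value 1 marking occluded bits.

```python
import numpy as np

def masked_hd(t1, t2, m1, m2):
    """Masked Hamming distance of Eq. (4): count disagreeing bits only
    where neither mask flags occlusion, normalized by the valid count."""
    occluded = (m1 | m2).astype(bool)            # 1 = exclude this bit
    disagree = (t1 ^ t2).astype(bool) & ~occluded
    valid = t1.size - occluded.sum()             # M*N minus occluded bits
    return disagree.sum() / valid if valid else 1.0
```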

To perform recognition for a particular iris, the template (and second level mask) of a probe image P of that iris is computed, and its HD with all templates (and corresponding masks) stored in the database is calculated. If the minimum HD is obtained with the template of an image belonging to the same iris, it is counted as a hit; otherwise, a miss. Recognition accuracy is defined as the percentage of hits obtained over all probe images.

Rotational Invariance - Rotation of the eye in Cartesian coordinate space corresponds to horizontal translation in the normalized image. To account for head tilt during image acquisition, matching is applied multiple times: while matching a template A with another template B, template B is circularly shifted in the horizontal direction, and the minimum Hamming distance over all shifts is taken as the final dissimilarity score. While rotating the gallery template, it is essential to rotate the corresponding mask too. This rotational matching is demonstrated in Fig. 4 and makes the matching rotation invariant.
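A sketch of this shift-and-match loop is given below; the shift range is an assumption, and since each central block contributes four adjacent bits in the template sketch above, one block position corresponds to four template columns.

```python
import numpy as np

def rotation_invariant_hd(probe_t, probe_m, gal_t, gal_m, max_shift=8):
    """Minimum masked HD over horizontal circular shifts of the gallery
    template and, crucially, of its mask (Fig. 4)."""
    best = 1.0
    for s in range(-max_shift, max_shift + 1):
        t = np.roll(gal_t, 4 * s, axis=1)        # 4 bits per block position
        m = np.roll(gal_m, 4 * s, axis=1)
        best = min(best, masked_hd(probe_t, t, probe_m, m))
    return best
```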

Fig. 4. Calculation of minimum HD

3 Experimental Results

Performance evaluation of the proposed iris recognition system on challenging and voluminous databases is essential to assess its applicability. The iris database is partitioned into a gallery set and a probe set, and all probe images are matched against the gallery images.

3.1 Databases

Two publicly available databases, CASIA-4.0 Interval and CASIA-4.0 Lamp, and the self-collected IITK database have been used for performance analysis. The CASIA-4.0 Interval database consists of 2,639 images, each of size \(320 \times 280\), from both eyes of 249 people; the images have clear iris texture, are taken with an NIR camera, and are acquired in two sessions separated by a month. The CASIA-4.0 Lamp database consists of 16,212 images from both eyes of 411 subjects, with 20 images per eye; all images are taken in a single session, each of size \(640 \times 480\) pixels, with variable illumination introduced by switching an external lamp on and off. The IITK database contains 20,420 images, each of size \(640 \times 480\), from both eyes of 1021 subjects; it has been collected using a circular LED based NIR camera in two sessions, with 5 images per eye per subject captured in each session.

3.2 Recognition Results

The database is divided into a probe set P and a gallery set G, and each image from P is matched with all images of G. Matchings where the images from G and P belong to the same iris are called genuine matchings GM, while those belonging to different irides are called impostor matchings IM. Genuine and impostor matchings whose combined mask covers more than \(85\,\%\) of the image are discarded as highly occluded: when occlusion is very high, too little evidence is available for the matching score to be reliable.

For CASIA-4.0 Interval, first-session images are taken as the gallery set and second-session images as the probe set; eyes with fewer than 4 images are not used for identification because not enough gallery images are available for them. Thus, there are 1047 gallery and 1509 probe images. For CASIA-4.0 Lamp, the first ten images per eye are taken as gallery and the remaining ten as probe images, giving 7830 images in each of the gallery and probe sets. For the IITK database, first-session images are taken as gallery and second-session images as probe, giving 10,210 images in each set.

The performance of the proposed system on these databases is summarized in Table 1. The ROC curves for the CASIA-4.0 Interval, CASIA-4.0 Lamp and IITK databases are shown in Fig. 5(a), (b) and (c) respectively. The EER on the Lamp database is higher because its images contain severe occlusion compared to the other databases. The best EER is obtained on the IITK database due to its better image acquisition conditions.

Table 1. Recognition Performance on Various Databases

Since iris template extraction in the proposed approach is based on comparisons between regions rather than absolute measurements, it remains fairly stable under variable illumination and small amounts of noise. Template extraction is also efficient, since only a single filter is applied to multiple regions using basic convolution, which can be optimized by pre-computing the filter. Since binary values are stored instead of real values, the feature vectors are compact. Flexibility is provided by varying the distance parameters and the scale of the Gaussian filter to achieve optimal performance.

However, the proposed approach cannot handle large translations or torsional eye rotation, because it does not track features across two irides; it assumes a one-to-one correspondence between image regions and is flexible only to a limited extent. Also, in case of failed segmentation, it cannot perform recognition, because this correspondence no longer holds.

Fig. 5. ROC curves for various databases

3.3 Comparison with Previous Approach

A comparison with Gabor filtering has been performed on the basis of feature extraction capability. Image preprocessing, involving segmentation, normalization, occlusion masking and enhancement, has been kept common for the sake of comparison. All possible cross-session matchings have been performed on the CASIA-4.0 Interval, Lamp and IITK databases using the Gabor filtering approach. The performance metrics obtained are summarized in Table 2. The results on the CASIA-4.0 Interval database show that the proposed approach gives slightly better results than Gabor filtering, while the results on the Lamp and IITK databases show a significant improvement. It should be noted that the CASIA-4.0 Interval database is small (2,639 images) compared to Lamp (16,212 images) and IITK (20,420 images), and results on larger and more challenging databases are usually more reliable. Hence, it can be concluded that the proposed approach is on par with, and even better than, the Gabor filtering approach.

Table 2. Comparative Results on Various Databases

3.4 Effect of Enhancement

The effect of the enhancement approach has also been studied. Results of the proposed approach and the Gabor filtering approach are obtained both with and without enhancement on the CASIA-4.0 Interval database, as shown in Table 3. It can be inferred that enhancement produces significant discrimination among irises, leading to a better EER. Hence, enhancement has proved to be a key factor in improving the recognition performance of the proposed approach.

Table 3. Recognition Performance on CASIA-4.0 Interval database

4 Conclusion

In this paper, a robust and efficient iris recognition approach based on Relational Measures (RM) has been proposed. Experimental results on the large CASIA-4.0 Interval, CASIA-4.0 Lamp and IITK databases show that the efficiency and reliability of the proposed approach are comparable to state-of-the-art recognition approaches. The novel occlusion detection approaches of region-growing for eyelids and a combined feature for eyelashes improve the quality of the generated mask. The proposed approach has obtained a CRR of 99.07 % on CASIA-4.0 Interval, 98.7 % on CASIA-4.0 Lamp and 98.66 % on the IITK database. The CRR becomes 100 % when the top 10 matches are considered for identification instead of the top 1. It has also achieved an EER of 1.82 % on CASIA-4.0 Interval, 4.2 % on CASIA-4.0 Lamp and 2.12 % on the IITK database. Since this is a new algorithm, there remains scope for improvement through additional constraints and fine parameter tuning.