1 Introduction

Much of the research on human authentication has used features extracted from the face. In recent years, texture feature extraction [10] from iris images has drawn attention as a source of soft biometric attributes, such as the gender of a person. The major advantage of soft biometrics is that they enable faster retrieval of identities when combined with the corresponding biometric data. Iris information has been applied effectively in diverse areas such as airport check-in and refugee control [1], and it can be used in cross-spectral matching scenarios [5] that compare RGB images with NIR images. Improving attribute recognition and its accuracy provides additional semantic information about an unfamiliar area, which helps fill the gap between machine and human descriptions of entities [1].

The iris is well protected because it is an internal organ of the eye, yet it is externally visible from a distance, is unique, and has a highly complex pattern. The pattern is stable over a lifetime except for pigmentation. Iris images are captured in visible and near-infrared (NIR) light. The eye has three layers: the outer layer, which includes the sclera and cornea, is fibrous and protective; the middle layer, which includes the choroid, ciliary body, and iris, is vascular; and the innermost layer, which includes the retina, is nervous or sensory [11].

The major challenges in extracting iris information are the distance between the camera and the eye, occlusion by eyelids and eyelashes, eye rotation, and the illumination during image acquisition. Variation in the camera distance produces irises of inconsistent size. Occlusion by eyelids and eyelashes may result in inappropriate or insufficient features. Variation in light causes pupil dilation, which affects the segmentation method. Eye rotation or head tilt introduces intra-class variation, which further complicates segmentation.

The aim of this paper is to examine the factors on which gender prediction depends: whole eye images versus normalized iris images, the split ratio between training and testing data, feature extraction with traditional machine learning models versus neural network models, and a small dataset versus an augmented dataset. The rest of the paper discusses the general steps of gender prediction, related work on gender prediction using iris images, the experimental results, and the conclusion of the work.

2 General Steps

As iris recognition is safe, authentic, and stable, the iris is regarded as an accurate source of soft biometrics, and the same processing steps are adopted for predicting gender. Researchers can experiment with the freely available databases listed in Table 1 to extract information specific to humans. The major steps common to these tasks are shown in Fig. 1. The first important step in iris recognition is iris localization, i.e., segmenting the iris region from the eye image. The major challenges to be addressed in localization are occlusion by eyelashes and eyelids, a tilted head during capture, and illumination effects. Once the iris region is localized, it is normalized to reduce or suppress unwanted information (noise), a step also called enrollment. Up to this phase, the iris region is an annulus in the eye image. Daugman’s rubber sheet model [6] unwraps this annulus into a fixed-size rectangular block whose two axes correspond to the radial and angular coordinates. After unwrapping, feature extraction algorithms such as LBP, BSIF, LPQ, Gabor filters, or a CNN are applied to extract the features used for classification, depending on the application.
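The unwrapping step can be illustrated with a minimal, dependency-free Python sketch. It is not the implementation from [6]: it assumes concentric circular pupil and iris boundaries with a known centre and radii (real boundaries are generally not concentric), it uses nearest-neighbour sampling, and the function name `unwrap_iris` and all of its parameters are hypothetical.

```python
import math


def unwrap_iris(image, cx, cy, r_pupil, r_iris, radial_res=8, angular_res=32):
    """Sample the annulus between the pupil and iris boundaries onto a
    radial_res x angular_res rectangle (rows = radius, columns = angle).

    image is a list of rows of grey levels; (cx, cy) is the assumed common
    centre of both circular boundaries.
    """
    height, width = len(image), len(image[0])
    strip = []
    for i in range(radial_res):
        # Normalised radius: 0 at the pupil boundary, 1 at the iris boundary.
        rho = i / (radial_res - 1) if radial_res > 1 else 0.0
        radius = r_pupil + rho * (r_iris - r_pupil)
        row = []
        for j in range(angular_res):
            theta = 2.0 * math.pi * j / angular_res
            # Nearest-neighbour sample, clamped to the image bounds.
            x = min(max(int(round(cx + radius * math.cos(theta))), 0), width - 1)
            y = min(max(int(round(cy + radius * math.sin(theta))), 0), height - 1)
            row.append(image[y][x])
        strip.append(row)
    return strip
```

Because every eye is mapped to the same `radial_res` × `angular_res` grid, irises captured at different distances or with different pupil dilation become comparable, which is the purpose of the normalization step.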

Table 1 Iris dataset
Fig. 1 Steps involved in gender prediction using iris data. The flowchart depicts seven steps, with the image corresponding to each step shown on the right.

Sensors collect images in the visible spectrum (380–750 nm) or in the near-infrared (NIR) band (700–900 nm). Visible-spectrum images can be saved as either color or intensity images, whereas NIR images are always saved as intensity images. The literature shows that, in experiments that use person-disjoint training and testing sets, NIR images yield higher accuracy than visible-light images because visible-light sensors are more prone to noise.

3 Related Work

Thomas et al. [18] published the first paper on gender prediction from geometric and texture features of iris images. The researchers combined the CASIA, UPOL, and UBIRIS Datasets (a total of 57,137 images) with an equal distribution of genders, normalized the iris images using Daugman’s rubber sheet method, generated a feature vector by applying 1D Gabor filters to the normalized images, used information gain for feature selection, and then applied the C4.5 decision tree algorithm for classification. The authors initially tried SVM and neural networks for classification but could not obtain better results than with the decision tree techniques. They achieved 75% accuracy and improved it to 80% by combining bagging and random subspaces with a decision tree. Only the left iris was considered in these experiments.
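As an illustration of information-gain-based feature selection, the following minimal Python sketch scores a single feature by the entropy reduction obtained when the samples are split at a threshold. It is not the procedure of [18]; the function names, the threshold-based split, and the toy values in the usage note are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels, threshold):
    """Entropy reduction from splitting the samples at feature <= threshold."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    total = len(labels)
    # Weighted entropy of the two partitions (empty partitions contribute 0).
    remainder = sum(len(part) / total * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder
```

For example, with labels `["M", "M", "F", "F"]`, a feature with values `[0.1, 0.2, 0.8, 0.9]` split at 0.5 separates the classes perfectly and has a gain of 1 bit, whereas values `[0.1, 0.9, 0.1, 0.9]` give a gain of 0. Features can then be ranked by gain and the top-ranked ones retained.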

Lagree and Bowyer [9] carried out gender prediction using an SVM classifier. The classification was based on features generated by applying simple texture feature extraction methods, such as spot detectors, line detectors, and Laws texture features, to normalized iris images of size 40 × 240 from which occlusions such as eyelids and eyelashes had been eliminated. The accuracy achieved using twofold, fivefold, and tenfold cross-validation with the Weka SMO SVM classifier was about 62%. The authors attribute their lower accuracy compared with Thomas et al. to the smaller size of their dataset. They used the same dataset for predicting both gender and ethnicity.

Tapia et al. [16] reported an accuracy of 91.33% in gender prediction using an SVM classifier with uniform LBP and conventional LBP features, using subject-disjoint training and testing sets and tenfold cross-validation. Gabor filters were applied to the normalized image, which was then transformed into a binary iris code with four levels; this code was considered to be more stable iris information for predicting gender.
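The uniform LBP descriptor mentioned above can be sketched in plain Python as follows. This is a generic 8-neighbour, radius-1 variant given for illustration, not the exact configuration of [16]; the function names and the neighbour ordering are assumptions. A pattern is "uniform" when its circular bit string has at most two 0/1 transitions, which yields 58 uniform codes plus one shared bin for all other codes.

```python
def lbp_code(image, y, x):
    """8-neighbour LBP code of pixel (y, x); a neighbour >= centre sets a bit."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # clockwise from top-left
    centre = image[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code


def is_uniform(code):
    """True when the circular 8-bit pattern has at most two 0/1 transitions."""
    rotated = ((code << 1) | (code >> 7)) & 0xFF
    return bin(code ^ rotated).count("1") <= 2


UNIFORM_CODES = [c for c in range(256) if is_uniform(c)]   # 58 codes
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}


def uniform_lbp_histogram(image):
    """Normalised 59-bin histogram: 58 uniform codes + 1 bin for the rest."""
    hist = [0] * (len(UNIFORM_CODES) + 1)
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[BIN_OF.get(lbp_code(image, y, x), len(UNIFORM_CODES))] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The resulting 59-dimensional histogram, computed on the normalized iris image or on blocks of it, is the kind of feature vector that would be passed to a classifier such as an SVM.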

Tapia et al. [17] clarified that the 1500 images used in [16], reported as coming from unique subjects, carried incorrect labels, and that the 91% accuracy obtained there was due to overlapping training and testing datasets. In [17], the train-test sets were made disjoint with respect to the subject, mutual information measures (mRMR, CMIM, weighted mRMR, and weighted CMIM) were used for feature selection, and the distribution of gender information across the different bands of the iris was tested for statistical significance using an ANOVA test. Three datasets were used in that work: the UND Dataset, the ND-Gender-From-Iris (NDGFI) Dataset, and a subject-disjoint validation set (UND V). The authors observed that CMIM gives better accuracy than mRMR and obtained 89% prediction accuracy by fusing the best features from the left and right iris codes.
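The difference between an overlapping and a subject-disjoint split can be made concrete with a small Python sketch. The function name, the `(subject_id, image_id)` sample layout, and the default test fraction are assumptions made for the example; the key point is that whole subjects, not individual images, are assigned to one side of the split.

```python
import random


def subject_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (subject_id, image_id) samples so that no subject appears in
    both the training and the testing set."""
    subjects = sorted({subject for subject, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Hold out whole subjects (at least one) for testing.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [s for s in samples if s[0] not in test_subjects]
    test = [s for s in samples if s[0] in test_subjects]
    return train, test
```

Splitting at the image level instead would allow several images of the same iris to land on both sides, so a classifier could score well by recognizing the individual rather than the gender.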

Tapia and Aravena [14] proposed a modified LeNet-5 CNN model to achieve a better gender prediction rate. The modified network consists of four convolution layers and one fully connected layer with a minimal number of neurons. The small number of neurons was chosen to reduce the risk of overfitting while solving the two-class gender prediction problem. The authors adopted data augmentation to increase the dataset size from 1500 to 9500 images for each eye. They conclude that fusing the CNNs for the right and left eyes gives better prediction than using either eye separately.

Tapia and Perez [14] used a 2D quadrature quaternionic filter for classification and replaced the 1D log-Gabor filter with 2D Gabor filters. The 2D Gabor filters encode the phase information of the normalized image with 4 bits per pixel. The authors conducted five experiments. The first used all the features from the normalized image for classification, and the other experiments were built on this model. The second experiment used transfer learning with a VGG19 model for feature extraction. The third applied a genetic algorithm to selected blocks of the normalized images and used raw pixel values, principal component analysis (PCA), and local binary patterns (LBP) as features. The fourth experiment used different variants of mutual information for feature selection, with SVM and ten ensemble classifiers for classification. In the last experiment, gender classification was performed on images encoded with the quaternionic code (QC) at 3 and 4 bits per pixel, and 4 bits per pixel gave better results than 3 bits per pixel. The authors achieved a maximum accuracy of 95.45% for gender prediction.

Tapia and Arellano [15] proposed modified binary statistical image features (MBSIF) for gender prediction. The experiments were carried out with filter sizes ranging from 5 × 5 to 13 × 13 and with 5 to 12 bits. The 11 × 11 filter gave the best prediction accuracy for the MBSIF histogram, with 94.66% for the left eye and 92% for the right eye at 10 bits per pixel.

Bobeldyk and Ross [1] compared the gender prediction accuracy obtained from the extended ocular region, the iris-excluded ocular region, the iris-only region, and the normalized iris-only region. The authors used BSIF codes for feature extraction and an SVM classifier to distinguish males from females. They applied a geometric adjustment so that the iris was at the center of the image and tessellated the image into blocks. A histogram of BSIF codes is then computed for each block, and the histograms are normalized before being concatenated into a feature vector, which is used for classification. The authors also studied the prediction accuracy for varying BSIF window sizes and obtained an accuracy of 85.7%. The BioCOP2009 Dataset was used for this research.
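The tessellate, histogram, normalize, and concatenate pipeline described above can be sketched as follows. The sketch assumes that an integer code image (for example, BSIF or LBP codes per pixel) has already been computed; the BSIF filters themselves are not reproduced here, and the function name and block layout are hypothetical.

```python
def block_histogram_features(code_image, block_h, block_w, n_codes):
    """Concatenate normalised per-block histograms of an integer code image.

    code_image: list of rows of codes in range(n_codes).
    Blocks are non-overlapping; partial blocks at the borders are ignored.
    """
    rows, cols = len(code_image), len(code_image[0])
    area = block_h * block_w
    features = []
    for top in range(0, rows - block_h + 1, block_h):
        for left in range(0, cols - block_w + 1, block_w):
            hist = [0] * n_codes
            for y in range(top, top + block_h):
                for x in range(left, left + block_w):
                    hist[code_image[y][x]] += 1
            # Normalise so that each block histogram sums to 1.
            features.extend(count / area for count in hist)
    return features
```

Keeping one histogram per block, rather than a single global histogram, preserves coarse spatial information about where in the ocular region each texture code occurs.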

Bobeldyk and Ross [2, 3] expanded their earlier work [1] by considering local binary pattern (LBP) features along with BSIF features and achieved a maximum accuracy of 87.9%. The authors also studied the impact of the number of bits in the BSIF code on computational time and memory, examined the impact of race on gender prediction, and tested the results in a cross-dataset setting. They used three different datasets (BioCOP2009 Dataset, Cosmic contact Dataset, and GIF Dataset) for their research.

Bobeldyk and Ross [4] investigated the impact of resolution on gender prediction without reconstructing a high-resolution image from the low-resolution image. They used the BioCOP2009 Dataset and the Cosmic contact Dataset for their research. In this work, the researchers used BSIF codes with an SVM classifier and a CNN-based classifier and observed 72.1% and 77.1% accuracy, respectively, for the 30-pixel image. The authors used small CNNs with fewer neurons because the input images are small and such networks need fewer training samples. They also carried out experiments on gender prediction accuracy by varying the window size from 340 × 400 to 2 × 3 and concluded that 5 × 6 ocular images still contain gender information at reduced complexity.

Singh et al. [12] utilized a variant of an auto-encoder in which the attribute class label is included in conjunction with the reconstruction layer. They used NIR ocular images that had been scaled down to 48 × 64 pixels. The GFI and ND-Iris-0405 Datasets were used to evaluate their method. The authors applied RDF and NNet classifiers and achieved an accuracy of 83.17%. They claim that the deep class encoder takes only a quarter of the overall training time and that their results outperform those of Tapia et al. [17].

Sreya and Jones [13] used the IITD Dataset for their investigation and an ANN for iris recognition. The authors explained the steps involved in recognition in detail. The experiments were carried out on cropped NIR images to locate the pupil region. The authors conclude that the prediction accuracy depends on the processing steps applied.

Kuehlkamp and Bowyer [7] investigated the impact of mascara on iris gender prediction. They obtained 60% gender prediction accuracy using only the occlusion mask from each image and 66% accuracy when LBP was used in conjunction with an MLP network. They also attained up to 80% accuracy on the complete eye image using CNNs and MLPs. The authors used the GFI Dataset and divided it into three classes: Males, Females With Cosmetics (FWC), and Females No Cosmetics (FNC).

4 Experiments and Results

In this work, experiments are conducted with different approaches to identify suitable criteria for gender prediction. We used two publicly available datasets: the IITD Dataset [8], with an image size of 320 × 240, and the SDUMLA-HMT Dataset, with an image size of 768 × 576. Both datasets contain fewer female eye images than male eye images, so the eye images were augmented to produce 11,512 male eye images and 11,906 female eye images for the experiments.
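The text does not state which augmentation operations were applied. Purely as an illustration of how a class can be grown to a target count, the following Python sketch cycles through the original images and applies a horizontal flip and small brightness shifts; these transforms, the shift of 10 grey levels, and all function names are assumptions, not the procedure used in the experiments.

```python
import itertools


def hflip(image):
    """Mirror a grey-level image (list of rows) left to right."""
    return [row[::-1] for row in image]


def brighter(image, delta=10):
    """Shift grey levels by delta, clipped to the 8-bit range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]


def darker(image, delta=10):
    return brighter(image, -delta)


TRANSFORMS = [hflip, brighter, darker]  # hypothetical choice of transforms


def augment_to_target(images, target_count):
    """Return the originals plus augmented copies until target_count is
    reached. Once every (transform, image) pair has been used, the cycle
    repeats and produces duplicate copies."""
    out = list(images)
    for transform, image in itertools.cycle(itertools.product(TRANSFORMS, images)):
        if len(out) >= target_count:
            break
        out.append(transform(image))
    return out
```

Calling `augment_to_target` separately on the male and female images with different targets is one way to reduce the class imbalance before training.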

Initial experiments were conducted using traditional machine learning classification methods based on normalized iris texture features, following the steps shown in Fig. 1 and the approaches cited in the literature study. We used local binary pattern (LBP) and Gabor filter-based feature extraction methods to obtain texture features from the normalized iris image, and SVM and random forest for classification. The experiments were carried out on the IITD Dataset [8] and the SDUMLA-HMT Dataset. The results are given in Table 2; SVM with Gabor features shows the best results.
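A Gabor filter-based texture feature can be sketched in plain Python as follows. The kernel is the standard real-valued 2D Gabor function (a Gaussian envelope multiplied by a cosine carrier); the parameter values used in the experiments are not stated in the text, so the defaults here, the function names, and the choice of mean and standard deviation of the response as features are assumptions.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter on a size x size grid (size is odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_features(image, kernel):
    """Mean and standard deviation of the valid-region filter response."""
    k = len(kernel)
    responses = []
    for top in range(len(image) - k + 1):
        for left in range(len(image[0]) - k + 1):
            responses.append(sum(image[top + i][left + j] * kernel[i][j]
                                 for i in range(k) for j in range(k)))
    mean = sum(responses) / len(responses)
    var = sum((r - mean) ** 2 for r in responses) / len(responses)
    return mean, math.sqrt(var)
```

Repeating `gabor_features` over a bank of kernels with several orientations and wavelengths and concatenating the results gives a texture feature vector of the kind that can be passed to an SVM or random forest.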

Table 2 Experimentation results

The next experiment used a convolutional neural network for feature extraction from whole eye images and normalized iris images, followed by a dense neural network with 20% dropout for classification. The deep neural network gives an accuracy of 73.96% and 90.97% on the SDUMLA-HMT Dataset when trained on whole eye images and normalized iris images, respectively, and an accuracy of 98.92% on the normalized IITD Dataset.

Another experiment was conducted by varying the split ratio between training and testing data. The data were split 60:40, 80:20, and 90:10 for training and testing. The support vector machine (SVM) gave better results on the smaller dataset, whereas the deep neural network gave better accuracy on the larger dataset, independent of the split ratio, as shown in Table 2.
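The split-ratio experiment can be outlined with a short Python sketch. The helper names are hypothetical, and the sketch splits at the level of individual samples because the text does not state whether the splits were subject-disjoint.

```python
import random


def split_by_ratio(items, train_fraction, seed=0):
    """Shuffle items reproducibly and cut them into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# The three ratios used in the experiment: 60:40, 80:20, and 90:10.
SPLIT_FRACTIONS = (0.6, 0.8, 0.9)
```

For each value in `SPLIT_FRACTIONS`, a classifier would be trained on the first part and `accuracy` computed on the second, so that the effect of the ratio can be compared across models.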

5 Conclusion

The experiments were carried out to identify feature extraction and classification methods appropriate for gender prediction. The results in Table 2 show that SVM performs better on the smaller dataset, independent of the feature extraction method. Gender prediction accuracy increases when normalized iris images are used as input to the feature extraction methods, and Gabor filter-based feature extraction gives better gender prediction accuracy. The neural network model was trained for gender prediction using whole eye images and normalized iris images; its accuracy is higher for normalized input and a larger dataset. The SDUMLA-HMT images contain the full eye including the eyelids, are noisy, and do not have the pupil at the center of the image, which makes iris localization and normalization more challenging. Consequently, the IITD Dataset shows better accuracy than the SDUMLA-HMT Dataset, as its images are focused on the region of interest with minimal noise. Further, the same setup could be used to predict other soft biometric attributes such as age and ethnicity.