1 Introduction

In machine learning, feature extraction reduces the amount of resources required to describe a large set of data accurately. A proper feature vector should be highly informative and invariant to a given set of transformations (such as rotation and scale). In pattern recognition, feature extraction is considered the most important step for achieving a robust and efficient system. The purpose of this framework is to guide the development of a compact, relevant and consistent set of features for the classification task [1] by choosing well-designed ordinal measures for image representation.

The measurements used in science can be classified into four types of scales: nominal, ordinal, interval and ratio. Nominal scales are used for labeling variables, without any quantitative value (e.g. gender (male, female) or hair color (dark, brown, blonde and grey)). Ordinal measurements describe order, but not the relative size or degree of difference between the items measured. Interval scales are numeric scales in which we know not only the order, but also the exact differences between the values. The classic example of an interval scale is Celsius temperature, because the difference between consecutive values is always the same. Ratio scales provide the exact difference between units and also have an absolute zero. Good examples of ratio variables include height and weight (Table 1).

Biological and psychological measurement usually operates on ordinal scales. Computer vision researchers prefer interval or ratio measures for object description and pattern recognition. As the lowest level of measurement, nominal scale is too weak for classification. But the power of ordinal measures for feature representation has been largely underestimated [2, 3].

Ordinal features come from a straightforward concept that we often use: one can easily rank the heights or weights of two persons, but it is hard to state the precise difference between them. In computer vision, the absolute intensity information associated with an image varies with the illumination settings. However, ordinal relationships among neighboring pixels or regions remain fairly stable under such changes and reflect the intrinsic nature of the object [4]. Ordinal features are efficient to compute and encode an ordinal relationship between two quantities; "less than" and "greater than" are meaningful terms for ordinal variables. Figure 1 gives an example in which the average intensities of regions A and B are compared to give the ordinal code of 1 or 0.
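The comparison of two regions can be illustrated with a minimal Python sketch. The function name `ordinal_code`, the region layout and the convention that the code is 1 when region A is brighter are assumptions made for illustration; the text does not fix which relation maps to 1.

```python
import numpy as np

def ordinal_code(image, region_a, region_b):
    """Return 1 if region A is brighter on average than region B, else 0.

    The mapping "A brighter -> 1" is an assumed convention.
    """
    mean_a = image[region_a].mean()
    mean_b = image[region_b].mean()
    return int(mean_a > mean_b)

rng = np.random.default_rng(0)
image = rng.uniform(50, 100, size=(8, 16))
image[:, 8:] += 60  # make the right half clearly brighter

region_a = (slice(0, 8), slice(0, 8))    # left half
region_b = (slice(0, 8), slice(8, 16))   # right half

code = ordinal_code(image, region_a, region_b)

# A global illumination change (gain and offset) alters every intensity,
# but not the order of the two regional averages.
relit = 0.4 * image + 20
code_relit = ordinal_code(relit, region_a, region_b)
```

Any monotonically increasing affine change of intensity leaves the code unchanged, which is the stability property the paragraph refers to.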

Fig. 1 Ordinal measure of the relationship between two regions. Region A is darker than B, A \(<\) B; region A is brighter than B, A \(>\) B [5]

Table 1 Primary scales of measurement

Object recognition is a fundamental task that humans perform many times each day without noticeable effort. Furthermore, one can identify an object despite different changes that affect its appearance, including illumination, viewing direction, occlusion etc.

Inspired by the biological plausibility of ordinal measures, we propose to use them for object recognition. Ordinal data call for non-parametric statistics, including the median and the mode, rank-order correlation to measure the strength of the association between two variables, non-parametric analysis of variance, etc.

The rest of this chapter is organized as follows. Section 2 summarizes different methods based on ordinal measures proposed in the literature for pattern recognition, particularly with applications in biometrics. In Sect. 3, a general algorithm for image analysis and coding based on ordinal measures is presented. Section 4 discusses how to explore effective ordinal features for iris recognition. We conclude the chapter in Sect. 5.

2 Ordinal Measures, a Key Issue in Pattern Recognition

The advantages of ordinal measures for image representation have already been verified by some researchers. Sinha [6] was probably the first to introduce ordinal measures to computer-based vision systems. Based on the fact that several ordinal measures on facial images, such as eye-forehead and mouth-cheek, are invariant to individuals and imaging conditions, Sinha developed a ratio template for face detection [3]. He proposed a representation that is a collection of several pair-wise ordinal contrast relationships across facial regions. For instance, the average brightness of the left eye is always less than that of the forehead, regardless of the lighting conditions. The relative magnitudes of the two brightness values may change, but the sign of the inequality does not (Fig. 2).

In other words, the ordinal relationship between the average brightness of the \(<\)left-eye, forehead\(>\) pair is invariant under lighting changes. Starting from the idea that the human visual system is far better at making relative brightness judgments than absolute ones, he suggested the structure presented in Fig. 3, for detecting faces under significant illumination variations [6].

Fig. 2 Pair-wise ordinal relationships invariant under lighting changes [6]

Fig. 3 Invariant ordinal structure of the image brightness on a human face under widely varying illumination conditions [6]

Assuming that the ordinal relationship between neighboring image regions is stable and robust, several researchers proposed different multi-lobe differential filters (MLDFs) for ordinal image analysis and coding [3, 7]. MLDF can encode ordinal measures of multiple image regions with a flexible parameter configuration. Some of the variable parameters that can be used are presented in Fig. 4: the number of positive and negative lobes, orientation, scale and location of each lobe, inter-lobe distance, etc.

Fig. 4 Some typical multi-lobe differential filters [3]

Sun et al. proposed in [3] multi-lobe differential filters for ordinal iris feature extraction. An MLDF operator is passed across the normalized iris image and each comparison is encoded as one bit, 1 or 0, according to the sign of the filtering result. For classification, the Hamming distance was employed. For example, the ordinal measures for a group of two lobes may denote a point, line, edge, corner, ridge, slope, etc., as shown in Fig. 5.
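The filter-sign-match pipeline can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of [3]: the two Gaussian lobes, the filter size, the lobe distance and the helper names (`dilobe_filter`, `ordinal_code`, `hamming_distance`) are all hypothetical choices, and random matrices stand in for normalized iris images.

```python
import numpy as np

def gaussian_lobe(size, cx, cy, sigma):
    """Unit-sum Gaussian lobe centred at (cx, cy) on a size x size grid."""
    ys, xs = np.mgrid[0:size, 0:size]
    lobe = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    return lobe / lobe.sum()

def dilobe_filter(size=9, distance=4, sigma=1.5):
    """One positive and one negative lobe, so the coefficients sum to zero."""
    c = size // 2
    positive = gaussian_lobe(size, c - distance / 2, c, sigma)
    negative = gaussian_lobe(size, c + distance / 2, c, sigma)
    return positive - negative

def ordinal_code(image, kernel):
    """Filter the image ('valid' positions only) and keep the sign as one bit."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    response = np.einsum("ijkl,kl->ij", windows, kernel)
    return (response > 0).astype(np.uint8)

def hamming_distance(code_a, code_b):
    """Fraction of bits that differ between two codes of equal shape."""
    return float(np.mean(code_a != code_b))

rng = np.random.default_rng(1)
iris_a = rng.random((32, 64))   # stand-in for a normalized iris image
iris_b = rng.random((32, 64))   # stand-in for a different iris

kernel = dilobe_filter()
code_a = ordinal_code(iris_a, kernel)
code_a_relit = ordinal_code(1.7 * iris_a + 0.3, kernel)  # contrast/brightness change
code_b = ordinal_code(iris_b, kernel)

same_eye = hamming_distance(code_a, code_a_relit)
different_eye = hamming_distance(code_a, code_b)
```

Because the filter coefficients sum to zero, a global gain and offset do not change the sign of the response, so `same_eye` stays at zero while unrelated codes disagree on roughly half of their bits.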

Fig. 5 Ordinal measures and their visual meanings. a Point. b Line. c Edge. d Corner. e Ridge. f Slope [3]

An effective scheme for matching noisy iris images under visible lighting is described in [8]. For feature representation and matching, multiple cues are employed, including ordinal measures (Fig. 6), color histogram, texton representation and semantic information [8].

Fig. 6 a–c Ordinal filters; d–e iris sub-regions; f–k ordinal code [8]

In [9] Tan introduced ordinal measures (OM) for iris, face and palmprint representation by using a Multi-lobe Ordinal Filter (MLOF) with different parameters, such as distance, orientation, scale and location. Each biometric region is binary encoded according to the sign of the filtering results (Fig. 7). Their experiments demonstrated that the method achieves significantly higher accuracy than the state-of-the-art systems, at a lower computational cost.

Fig. 7 Feature extraction with MLOF. a Input images; b ordinal code [9]

Orthogonal Line Ordinal Features are proposed in [4] for palm print representation. Gabor Ordinal Measures were also used for hand vein recognition [10] and for face recognition [5, 11, 12]. Local Ordinal Contrast was employed for lip-based speaker authentication in [13]. A robust ear recognition system is proposed in [14] using gradient ordinal relationship pattern. Based on the previous studies it can be stated that the ordinal measurements represent a feasible solution for the personal identification requirements. The existing architectures also provide directions for developing new and improved algorithms for image analysis and coding based on ordinal representations.

Object classification is a natural task for the human visual system: we can classify a novel object, without effort, based on its appearance. It is therefore natural to study the biological mechanisms used for object classification and to propose similar approaches for computer vision systems.

The biological plausibility of visual ordinal measures has been verified by several neuroscience researchers. DeAngelis et al. [15] found that the visual responses of many striate cortical neurons saturate rapidly with the magnitude of the input contrast, which indicates that the determining factor of visual cognition is not the absolute value of contrast, but the contrast polarity. Rullen et al. [16] suggested that temporal order coding might form a rank-based image representation in the visual cortex [3]. Inspired by the human visual system, Ullman et al. proposed in [17] a part-based method for pattern recognition. In their approach, objects within a class are represented in terms of image fragments. The classification is based on a direct grey-level comparison between stored fragments and the input image. The method measures the qualitative shape similarity (using the ordinal ranking of the pixels in the regions) and the orientation difference (using gradient amplitude and direction) [17].

Based on ordinal pattern analysis, a mutual information technique is proposed in [18] to describe correlations of electromyogram signals during hand open/hand close states.

Several arguments in favor of applying ordinal measures are listed below:

  • features based on high-level measurements are useful for image reconstruction but unnecessary for object recognition;

  • high-level measures involve complex and time-consuming computations;

  • ordinal measures are simple to implement and compact in feature template [3].

3 A New Algorithm for Image Analysis and Coding, Based on Ordinal Measures

The aim of this chapter is to describe a novel and robust image feature vector extraction method based on ordinal measures. Starting from the idea that a ‘machine learning experimenter’ needs to address three questions, (i) what to measure, (ii) how to measure it, and (iii) how to interpret it, the proposed algorithm follows the sequence of processing steps shown in Fig. 8.

Fig. 8 Image feature vector extraction based on isolines

The input image is viewed as a matrix of pixels, where pixels are interpreted as heights with respect to the \(x-y\) plane as shown in Fig. 9.

Fig. 9 Input image, viewed as a matrix

Fig. 10 Surfaces delimited by isolines

First, the isolines of the matrix are computed. An isoline (also called a contour line) is a curve in the \(x-y\) plane along which the function \(f(x, y)\) has a constant value \(C_j\). These curves are defined by the following equation:

$$\begin{aligned} f(x, y) = C_j, \quad j = 1, 2, \ldots , N \end{aligned}$$
(1)

where N is the number of detected contours. A local maximum or minimum is surrounded by several contour lines, as presented in Fig. 10. The number of contour lines can be chosen automatically, based on the minimum and maximum values of the matrix, or a fixed number of levels can be selected by the user. In our example, four surfaces are delimited for the input matrix:

  • S2 and S4 correspond to local maxima;

  • S1 and S3 correspond to local minima (Fig. 10).
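The automatic choice of contour levels can be sketched as follows. The even spacing between the minimum and maximum of the matrix and the helper names `contour_levels` and `region_masks` are assumptions for illustration; the chapter does not state how the constants \(C_j\) are spaced. The synthetic 40 \(\times \) 40 surface with one bump and one dip is likewise only a stand-in for the input matrix.

```python
import numpy as np

def contour_levels(matrix, n_levels):
    """Evenly spaced constants C_j strictly between the matrix min and max."""
    low, high = float(matrix.min()), float(matrix.max())
    return np.linspace(low, high, n_levels + 2)[1:-1]

def region_masks(matrix, levels):
    """For each C_j, mark pixels with f(x, y) >= C_j.

    The isoline f(x, y) = C_j is the boundary of the corresponding mask.
    """
    return [matrix >= c for c in levels]

# Synthetic matrix: one local maximum near (12, 12), one local minimum near (28, 28).
ys, xs = np.mgrid[0:40, 0:40]
bump = np.exp(-((xs - 12) ** 2 + (ys - 12) ** 2) / 40.0)
dip = np.exp(-((xs - 28) ** 2 + (ys - 28) ** 2) / 40.0)
surface = bump - dip

levels = contour_levels(surface, n_levels=5)
areas = [int(mask.sum()) for mask in region_masks(surface, levels)]
```

The masks are nested: raising the level can only shrink the region above it, which is why a local maximum ends up surrounded by several contour lines.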

Three descriptors are associated with each surface:

  • L (Level) which describes the depth of the surface by counting the number of internal isolines;

  • A (Area) characterizes the area of the surface;

  • C (Contour) defines the constant value related to external contour line of the surface.

In the proposed coding procedure, pairs of surfaces are compared and each comparison yields a symbol. There are 27 different symbols (3 \(\times \) 3 \(\times \) 3), indicating how the individual L, A and C traits of the two surfaces involved in the ordinal comparison relate to each other (\(>,\,<\) or \(=\)).
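A minimal sketch of the symbol computation is given below. The numbering of the 27 symbols (a base-3 code over the L, A and C relations) is an assumption made for illustration; the actual assignment used by the authors is the one given in Fig. 11, and the `Surface` type and function names are hypothetical.

```python
from typing import NamedTuple

class Surface(NamedTuple):
    level: int      # L: number of internal isolines
    area: float     # A: area of the surface
    contour: float  # C: constant value of the external contour line

def relation(a, b):
    """Encode an ordinal relation as 0 (a < b), 1 (a = b) or 2 (a > b)."""
    return (a > b) - (a < b) + 1

def symbol(first, second):
    """Combine the three relations into one symbol in the range 0..26."""
    l_rel = relation(first.level, second.level)
    a_rel = relation(first.area, second.area)
    c_rel = relation(first.contour, second.contour)
    return 9 * l_rel + 3 * a_rel + c_rel

s1 = Surface(level=3, area=20.0, contour=0.9)
s2 = Surface(level=1, area=5.0, contour=0.1)

all_greater = symbol(s1, s2)   # every trait of s1 exceeds that of s2
all_smaller = symbol(s2, s1)   # the mirrored comparison
all_equal = symbol(s1, s1)     # identical traits
```

Only the order of the traits enters the symbol, so any change that preserves the three orderings leaves the symbol unchanged.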

Each ordinal relation by itself is not reliable enough and is too coarse to provide a good discriminant function for distinguishing between classes; therefore, many relations must be considered together to achieve satisfactory performance.

The order in which surfaces are compared may be established in different ways. The following two procedures are proposed. The first determines the center of gravity of each surface and orders the surfaces according to the distance of their centers of gravity from the origin of the coordinate system. The second compares the delimited surfaces two by two. In this case the computational cost is higher, but a rotation-invariant feature vector is obtained. In many practical computer vision applications, rotation invariance is a crucial issue.
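The two ordering procedures can be contrasted with a short sketch. Two points are assumptions rather than statements of the chapter: in the first procedure, consecutive surfaces in the sorted order are taken as the pairs to compare, and in the second, "two by two" is read as every ordered pair of distinct surfaces. The function names are hypothetical.

```python
import math
from itertools import permutations

def pairs_by_centroid_distance(centroids):
    """Sort surfaces by centroid distance from the origin; pair neighbours.

    Returns index pairs into `centroids`; n surfaces give n - 1 comparisons.
    """
    order = sorted(range(len(centroids)), key=lambda i: math.hypot(*centroids[i]))
    return list(zip(order, order[1:]))

def all_ordered_pairs(n_surfaces):
    """Every ordered pair (i, j) with i != j; n surfaces give n * (n - 1) comparisons."""
    return list(permutations(range(n_surfaces), 2))

centroids = [(5.0, 5.0), (1.0, 1.0), (3.0, 0.0)]  # centers of gravity of three surfaces

sequential = pairs_by_centroid_distance(centroids)
exhaustive = all_ordered_pairs(len(centroids))
```

The first list depends on where the surfaces lie relative to the origin, while the set of pairs in the second does not depend on their position or enumeration order, at the price of a quadratic number of comparisons.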

A histogram of the 27 possible symbols is then produced, forming a feature vector of fixed dimension 1 \(\times \) 27, which carries information about symbol frequencies. Because of their fixed length, the resulting feature vectors are an ideal form of input for many classifiers (Fig. 11).
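As a brief illustration, the histogram step amounts to counting symbol occurrences. The sketch assumes that symbols are integers in the range 0 to 26, which is an indexing convention introduced here and not taken from the chapter.

```python
def symbol_histogram(symbols, n_symbols=27):
    """Count how often each symbol occurs; the result always has n_symbols bins."""
    histogram = [0] * n_symbols
    for s in symbols:
        histogram[s] += 1
    return histogram

short_stream = [13, 26, 0, 13, 5]
long_stream = [13] * 40 + [0] * 7 + [26] * 3

h_short = symbol_histogram(short_stream)
h_long = symbol_histogram(long_stream)
```

Both streams produce a vector of 27 entries even though their lengths differ, which is what makes the representation convenient as classifier input.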

Fig. 11 Symbol assignment

A detailed description of the entire process is presented in Tables 2 and 3. By scaling the input matrix (from 40 \(\times \) 40 pixels to 10 \(\times \) 10) (Fig. 12) we get the contour plot given by Fig. 13.

The histogram of the obtained symbols is the same for both matrices (original and scaled). This observation indicates that the proposed coding procedure is robust to scale changes.

In less than 20 years, wavelets have established themselves as a fruitful tool for both signal and image processing. The theory behind wavelets has been developed independently by mathematicians, scientists and engineers. Pattern recognition is a privileged area of application in which wavelet methods have proved relevant. Ordinal measurements can be combined with DWT2 (Bi-dimensional Discrete Wavelet Transform). After a one-level decomposition, DWT2 yields four components: the approximation coefficients (cA) and the detail coefficients in three orientations, namely horizontal, vertical and diagonal (cH, cV, cD) [19], as shown in Fig. 14.
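The one-level decomposition can be made concrete for the simplest case, the Haar wavelet. The sketch below is written directly in NumPy and is not the implementation used in the chapter; the sign and naming conventions of the detail sub-bands are an assumption and differ between libraries.

```python
import numpy as np

def haar_dwt2(image):
    """One-level 2-D Haar decomposition into (cA, cH, cV, cD).

    Each output has half the height and half the width of the input.
    """
    x = np.asarray(image, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("this sketch expects even image dimensions")
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    cA = (a + b + c + d) / 2.0  # approximation (local average, scaled)
    cH = (a + b - c - d) / 2.0  # horizontal detail (difference between rows)
    cV = (a - b + c - d) / 2.0  # vertical detail (difference between columns)
    cD = (a - b - c + d) / 2.0  # diagonal detail
    return cA, cH, cV, cD

rng = np.random.default_rng(2)
image = rng.random((40, 40))
cA, cH, cV, cD = haar_dwt2(image)

energy_in = float((image ** 2).sum())
energy_out = float(sum((band ** 2).sum() for band in (cA, cH, cV, cD)))

flat = haar_dwt2(np.full((8, 8), 3.0))  # a constant image has no detail
```

The transform is orthonormal, so the energy of the four sub-bands equals that of the input, and a constant image produces zero detail coefficients.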

Figure 15 presents the surfaces obtained with the method described above. The feature vector derived from the approximation coefficients is the same as the one obtained from the input image and, together with the corresponding vectors derived from the detail coefficients, ensures a highly informative final feature vector.

Table 2 Corresponding surfaces and descriptors for the input matrix
Table 3 Symbols assignation for the input matrix

The presented method generates simple data structures which are both compact and of known size, so that the limited memory resources of devices such as Smart Cards can be employed efficiently. This aspect has important benefits for data storage and transfer operations.

The advantages offered by the proposed method recommend it for several areas of application including biomedical image analysis, biometry (iris recognition), etc.

For one-dimensional signals a similar procedure, TESPAR DZ (Time Encoded Signal Processing and Recognition), was proposed by King [20]. The TESPAR DZ method is based on an approximation model employing the theory of zeros, such that the signal is divided into epochs between successive zero crossings of the waveform. Duration (D), Shape (S) and Amplitude (A) are used as descriptors for each epoch. Then, pairs of epochs are compared and a symbol is provided [21] (Fig. 16).
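The epoch segmentation can be sketched for a sampled signal as follows. Several details are assumptions introduced for illustration: duration is measured in samples, amplitude is the peak magnitude of the epoch, and shape is taken as the number of local minima of the magnitude inside the epoch, since the chapter does not define S explicitly. Samples equal to zero are treated as positive.

```python
import numpy as np

def split_epochs(signal):
    """Split a sampled waveform into epochs between successive zero crossings."""
    signal = np.asarray(signal, dtype=float)
    sign = np.where(signal >= 0, 1, -1)
    boundaries = np.flatnonzero(sign[1:] != sign[:-1]) + 1
    return np.split(signal, boundaries)

def describe_epoch(epoch):
    """Return (D, S, A) for one epoch under the assumptions stated above."""
    magnitude = np.abs(epoch)
    duration = len(epoch)
    inner = magnitude[1:-1]
    shape = int(np.sum((inner < magnitude[:-2]) & (inner < magnitude[2:])))
    amplitude = float(magnitude.max())
    return duration, shape, amplitude

signal = [1, 2, 1, -1, -3, -1, 2, 2]
epochs = split_epochs(signal)
descriptors = [describe_epoch(e) for e in epochs]
```

Pairs of consecutive epochs could then be compared trait by trait, in the same way as pairs of surfaces in the two-dimensional case.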

Fig. 12 Scaled version of the input matrix

Fig. 13 Surfaces delimited by isolines

Fig. 14 Resulting matrices after DWT2 decomposition of the input image. a cA; b cH; c cV; d cD

Fig. 15 Surfaces delimited by isolines for wavelet coefficients. a cA; b cH; c cV; d cD

4 Ordinal Representation for Iris Recognition

4.1 Biometrics

Nowadays, computers and the Internet are used in every major function of our society, so security is increasingly important. For many years, passwords, PINs or identity cards have been used for personal identification. Their advantage is that they do not change over time and are not affected by the environment, but they can be forgotten or guessed by others. Biometrics provides a promising solution for reliable personal identification [22]. The availability of faster computers and improved sensing technology, coupled with significant advances in pattern recognition, allows researchers to develop robust biometric systems. The goal of pattern recognition is to classify objects into different categories or classes. In biometry the objects could be images (iris, palm vein) or signal waveforms (voice, signature, etc.).

Biometry has changed and will continue to change the way many activities are carried out by each of us. Biometric applications cover a variety of areas: civil and criminal identification, surveillance and screening, health care, eCommerce, eGovernment, physical and logical access. Biometric systems are used in airports, financial service institutions, banking ATMs, houses, etc. Many computers, laptops and smartphones incorporate webcams, microphones and even fingerprint scanners, giving users the possibility to adopt biometric authentication based on fingerprint, iris, face or voice. Although several biometric modalities (e.g. fingerprint, voice and face) have already been used in large-scale deployments, there are many other attractive and “new” modalities in various stages of development and assessment (hand vein, iris, DNA etc.). However, no biometric modality is flawless, and each one has advantages and disadvantages for a given use case.

Table 4 compares several biometric technologies with each other against seven criteria.

  • Universality describes how commonly a biometric is found in each individual.

  • Uniqueness is how well the biometric separates one individual from another.

  • Permanence measures how well a biometric resists aging.

  • Collectability explains how easy it is to acquire a biometric for measurement.

  • Performance indicates the accuracy, speed, and robustness of the system capturing the biometric.

  • Acceptability indicates the degree of approval of a technology by the public in everyday life.

  • Circumvention is how easy it is to fool the authentication system [23].

Table 4 Comparison of various biometric technologies [24]

Among all biometric techniques, iris-based systems are regarded as the most promising for high-security environments. Although diverse iris recognition methods have been proposed, there is still no unified answer regarding the fundamentals of this biometric.

Though the theory behind iris recognition was studied as early as the 19th century, most research has been done in the last few decades. Two prototype iris recognition systems have been developed, by Daugman [25] and by Wildes et al. [26]. Promising results were obtained by Boles and Boashash [27] using the wavelet transform, and by Sanchez-Reillo and Sanchez-Avila in [28], where Gabor filters are employed.

The iris is a pigmented tissue (usually blue, brown or green) that surrounds the pupil and regulates the amount of light entering through it. It has an extraordinary structure and provides many interlacing minute characteristics such as freckles, coronas, stripes, furrows, crypts and so on.

The human iris has several benefits when compared with other biometric modalities: it is very stable over a long period of time. In addition, its inherent isolation from the external environment and the impossibility of modifying it surgically without a high risk of damaging the user’s health provide strong immunity to forgery [29]. Some important factors that may affect the performance of an iris recognition system are [2]:

  (a) Registration: Although the rotation difference between two images can be resolved by a brute-force registration process, its large computational cost makes it unsuitable for real-time applications.

  (b) Normalization is expected to provide scale and position invariance of input images. Complex normalization methods have been developed, but they proved to be computationally intensive, and real-time applications require very fast digital image processing hardware.

  (c) Contrast variation

  (d) Noise may disturb the precision of traditional algorithms. However, the richness of sharp inter-region intensity differences provides a good source of ordinal measures for iris coding. Qualitative relationships across distinct iris regions can be insensitive to contrast variations.

A proper iris recognition algorithm should be tolerant to the drawbacks mentioned above and should also encode efficiently image properties [2].

Iris scan technology has traditionally been used for surveillance and security purposes. At present, iris-based systems are used in airports for passenger authentication, in financial service institutions, for ATM access, etc.

The proposed system is intended to show that ordinal image representation provides a better trade-off between accuracy, robustness and efficiency for biometric recognition.

4.2 The Proposed Iris Based Biometric System

The processing flow used to implement the iris recognition system is presented in Fig. 17.

Fig. 16 TESPAR DZ coding procedure

Fig. 17 Biometric system block diagram

Fig. 18 Iris, segmented image (ROI selection)

The qualities of the image acquisition, segmentation, normalization, feature extraction and the used classifier define the performance of the system.

Database: For the experiments a public database was used. It contains 3 \(\times \) 128 iris images (i.e. 3 \(\times \) 64 left and 3 \(\times \) 64 right). The images are 24-bit RGB, 576 \(\times \) 768 pixels, in PNG format [30].

Iris Segmentation and Normalization: Segmentation requires a proper detection of the inner and outer boundaries of the iris texture. The specular reflections inside the pupil area are contained in the images from the database.

First, Canny edge detection is employed in order to generate an edge map. Then, the circular Hough transform establishes the radius and the centre coordinates of the iris region within the edge map. Further, the pupil’s radius and centre coordinates are searched within a crop of the image that contains only the region of interest (ROI) [31].

The detected iris region (Fig. 18) is unwrapped by remapping each point to a pair of polar coordinates (\(r,\,\theta \)) using the Cartesian to polar reference transform suggested by Daugman [32]. Hence, after normalization, we obtain a rectangular representation of the iris (Fig. 19).
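The unwrapping step can be sketched as below. The sketch makes simplifying assumptions that are not stated in the chapter: the pupil and iris boundaries are taken as concentric circles, sampling uses nearest-neighbour lookup, and the output resolution (16 \(\times \) 64) and the function name `unwrap_iris` are arbitrary illustrative choices.

```python
import numpy as np

def unwrap_iris(image, center, r_pupil, r_iris, n_radial=16, n_angular=64):
    """Remap the annulus between two concentric circles to a rectangle.

    Rows correspond to the radius r, columns to the angle theta.
    """
    cx, cy = center
    r = np.linspace(r_pupil, r_iris, n_radial)
    theta = np.linspace(0.0, 2.0 * np.pi, n_angular, endpoint=False)
    rr, tt = np.meshgrid(r, theta, indexing="ij")
    # Cartesian coordinates of every (r, theta) sample, rounded to a pixel
    x = np.clip(np.rint(cx + rr * np.cos(tt)).astype(int), 0, image.shape[1] - 1)
    y = np.clip(np.rint(cy + rr * np.sin(tt)).astype(int), 0, image.shape[0] - 1)
    return image[y, x]

# Test image whose intensity equals the distance from the centre (50, 50):
ys, xs = np.mgrid[0:101, 0:101]
radial_image = np.hypot(xs - 50, ys - 50)

polar = unwrap_iris(radial_image, center=(50, 50), r_pupil=10, r_iris=40)
row_means = polar.mean(axis=1)
```

For this test image every row of the rectangular representation is nearly constant and close to its sampling radius, which confirms that rows index radius and columns index angle.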

Fig. 19 Unwrapped iris (ROI transformation into polar coordinates)

Feature Extraction: The feature vectors are extracted from the polar representation of the iris. First, the bi-dimensional discrete wavelet transform (DWT2) was applied, using different mother functions. Researchers are faced with an ever increasing variety of wavelets to choose from, and the choice of the best wavelet is application-dependent. We selected several well-known wavelet functions: Daubechies of order 1 and 3 (Db1, Db3), Battle–Lemarie of order 1 and 2 (Lem1, Lem2), Biorthogonal of order 1.3 (Bior1.3), Reverse Biorthogonal of order 2.4 (Rbio2.4) and Coiflet of order 1 (Coif1). Comparative studies are therefore useful for selecting a particular wavelet function. After a one-level decomposition, there are four components: the approximation (cA) and the details in three orientations, horizontal, vertical and diagonal (cH, cV and cD, respectively).

After several experiments we concluded that, for the approximation, 13 contour levels should first be computed, of which 6 are then selected (the first three and the last three) (Figs. 20 and 21). For the details, 3 contour levels are considered (Fig. 22). As presented in the previous section, three descriptors are associated with each surface: L (Level), A (Area) and C (Contour value).

Further, pairs of surfaces are compared and a symbol indicates the differences between the individual traits of the two surfaces being compared. The symbol stream is then condensed into fixed-size feature vectors by simply counting how many times each symbol occurs. The individual vectors obtained from the approximation and from the details are then fused into a final feature vector, which has 4 \(\times \) 27 = 108 coefficients for every iris. Thus the image is transformed into an encoded stream of discrete numerical symbols. This compact iris code greatly facilitates the matching process.

Fig. 20 Surfaces delimited by isolines for unwrapped iris, approximation coefficients (full size image)

Fig. 21 Surfaces delimited by isolines for unwrapped iris, approximation coefficients (zoomed area)

Fig. 22 Surfaces delimited by isolines for unwrapped iris, diagonal detail coefficients (zoomed area)

Classification: To perform the training and classification tasks for the identification experiments, the WEKA toolkit was used. WEKA is a data mining workbench that allows comparison between many different machine learning algorithms. The first step was to represent our learning problem using an .arff file, where each instance is represented as a feature vector. The header of this file identifies the types of the features and the classes being predicted. WEKA summarizes the classification results by generating a confusion matrix [33]. In the present study, different classifiers were used, based on the following algorithms: Bayes Net (BN), Naive Bayes (NB), Radial Basis Function Neural Network (RBF), k-Nearest Neighbor (1-NN for k \(=\) 1 and 3-NN for k \(=\) 3) and Support Vector Machine (SVM).
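For illustration, a feature matrix can be serialized to the ARFF text format with a few lines of Python. The relation name, the attribute names `f0, f1, ...`, the class labels and the helper `to_arff` are hypothetical; only the overall layout (a header with `@relation` and `@attribute` declarations followed by `@data` rows) follows the ARFF format.

```python
def to_arff(relation, feature_vectors, labels):
    """Build the text of an ARFF file with numeric features and a nominal class."""
    classes = sorted(set(labels))
    n_features = len(feature_vectors[0])
    lines = [f"@relation {relation}", ""]
    # Header: one numeric attribute per feature, then the nominal class attribute
    lines += [f"@attribute f{i} numeric" for i in range(n_features)]
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    # Data section: one comma-separated instance per line, class label last
    for vector, label in zip(feature_vectors, labels):
        lines.append(",".join(str(v) for v in vector) + "," + label)
    return "\n".join(lines) + "\n"

# Toy example with 3 features per instance (a real iris code would have 108).
vectors = [[3, 0, 1], [0, 2, 2]]
labels = ["person01", "person02"]
arff_text = to_arff("iris_ordinal", vectors, labels)
```

Writing `arff_text` to a file with the `.arff` extension gives an input that WEKA can load, with the last attribute acting as the class to be predicted.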

Fig. 23 Performance of different classifiers in terms of accuracy rates

A Bayes Classifier is a simple probabilistic classifier based on applying Bayes’ theorem with strong (naive) independence assumptions. In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature [34].
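Written out, the independence assumption leads to the following decision rule, where \(x = (x_1, \ldots , x_n)\) denotes the feature vector (here n = 108 for the 4 \(\times \) 27 histogram) and c ranges over the enrolled classes; this is the standard textbook form rather than a formula taken from the cited toolkit:

$$\begin{aligned} \hat{c} = \arg \max _{c} \; P(c) \prod _{i=1}^{n} P(x_i \mid c) \end{aligned}$$

The prior \(P(c)\) and the per-feature likelihoods \(P(x_i \mid c)\) are estimated from the training data.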

The RBF Network class implements a normalized Gaussian radial basis function network. It uses the k-means clustering algorithm to provide the basis functions.

The Nearest Neighbor Classifier (NNC) uses normalized Euclidean distance to find the training instance closest to the given test instance, and predicts the same class as this training instance. If multiple instances have the same (smallest) distance to the test instance, the first one found is used.
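A minimal sketch of this rule is shown below. It assumes that "normalized" means min-max scaling of each attribute to the range observed in the training data; the function name and the toy data are hypothetical.

```python
import numpy as np

def nearest_neighbor(train_x, train_y, query):
    """Predict the class of the closest training instance (1-NN)."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Min-max normalization per attribute, using the training range
    low = train_x.min(axis=0)
    span = train_x.max(axis=0) - low
    span[span == 0] = 1.0  # constant attributes contribute no distance
    train_n = (train_x - low) / span
    query_n = (query - low) / span
    distances = np.sqrt(((train_n - query_n) ** 2).sum(axis=1))
    # argmin returns the first index on ties, i.e. the first instance found
    return train_y[int(np.argmin(distances))]

train_x = [[0, 100], [10, 300], [10, 300]]
train_y = ["a", "b", "c"]

pred_far = nearest_neighbor(train_x, train_y, [9, 280])
pred_near = nearest_neighbor(train_x, train_y, [1, 120])
```

The second and third training instances are identical, so the query `[9, 280]` is equally close to both and the first of them, `"b"`, is returned.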

Support Vector Machines are based on the statistical learning theory of structural risk minimization. They are built by mapping the training patterns into a higher dimensional feature space where the points can be separated by a hyperplane [35]. The WLSVM software toolbox (Weka LibSVM, integrating LibSVM into Weka) was employed for this classifier; it may be seen as an implementation of LibSVM working inside Weka [36]. Its main advantage is that LibSVM works considerably faster than WEKA's own SVM implementation. There are four kernels available for the SVM classifier: linear, polynomial, radial basis function and sigmoid. Optimal values for the SVM kernel’s parameters were found by performing a grid search on the training data.
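For reference, the four kernels have the following standard forms in LibSVM, where u and v are feature vectors, \(\gamma \) is a scale parameter, r is the offset (called coef0 in LibSVM) and d is the polynomial degree:

$$\begin{aligned} K_{\text {linear}}(u, v)&= u^{T} v \\ K_{\text {polynomial}}(u, v)&= (\gamma \, u^{T} v + r)^{d} \\ K_{\text {RBF}}(u, v)&= \exp \left( -\gamma \, \Vert u - v\Vert ^{2}\right) \\ K_{\text {sigmoid}}(u, v)&= \tanh (\gamma \, u^{T} v + r) \end{aligned}$$

Together with the penalty parameter C, these kernel parameters are the quantities tuned by the grid search.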

Experiments: Identification experiments were carried out, using two iris images/person for training (left and right eye) and four iris images/person for testing (two for the left and two for the right eye).

The SVM classification performance was tested for all available kernels. A proper choice of parameters is crucial for an SVM to achieve good recognition rates. In the experiments the simple grid-search approach was used: parameters were varied with a fixed step size through a wide range of values and the performance of every combination was measured. A 10-fold cross-validation technique was also employed. The training data was randomly split into 10 sets, 9 of which were used for training and the 10th for validation. The procedure was then repeated so that each set served once as the validation set.
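The combination of grid search and k-fold cross-validation can be sketched independently of any particular classifier. In the sketch below, `score_fn` is a hypothetical callable standing in for "train the SVM with these parameters on the training folds and return its accuracy on the validation fold"; the parameter names and the toy scoring function are assumptions for illustration only.

```python
import itertools
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists; each sample validates exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation

def grid_search(param_grid, score_fn, n_samples, k=10):
    """Return the parameter combination with the best mean validation score."""
    best_params, best_score = None, float("-inf")
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = [score_fn(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Toy stand-in for classifier accuracy, peaking at C = 100 and degree = 7.
def toy_score(params, train, validation):
    return -abs(params["C"] - 100) - abs(params["degree"] - 7)

grid = {"C": [1, 10, 100, 1000], "degree": [1, 3, 5, 7]}
best_params, best_score = grid_search(grid, toy_score, n_samples=25)
splits = list(k_fold_indices(25, k=10))
```

In a real experiment `score_fn` would wrap the classifier of choice; the search itself only needs the mean validation score of each parameter combination.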

The best classification rates obtained in the identification experiments are presented in Fig. 23. Comparative studies are useful for the selection of a particular classifier. The best results were obtained by using the Db1 wavelet function, which is actually the Haar wavelet, together with the polynomial kernel (C \(=\) 100 and d \(=\) 7).

5 Conclusions

This chapter addresses concepts of pattern recognition and their application to solving real-life problems such as biometric identification. Biometrics offers the promise of much stronger identity verification, and identity management is becoming ever more important to economic and social life. The need for enhanced security persists more than ever in a more electronically dependent and interconnected world. The traditional authentication methods are neither secure enough nor convenient for many automatic identification systems.

First, a new idea for image feature representation using ordinal measures was presented. The proposed algorithm is invariant to rotation and scale. Moreover, it provides fixed-size descriptors regardless of the image dimensions, which makes it well suited as input for many classifiers. Besides the fixed dimension, the small number of coefficients used to store and characterize an image should be mentioned. The ordinal measures make the image representation robust to different intra-class variations (such as illumination), signal noise, misalignment or nonlinear deformations. The new method is considered suitable for portable applications due to its low computational cost.

The proposed algorithm was further integrated to implement a biometric system, based on iris. The performances of several classification algorithms were investigated. The Support Vector Machines classifier was tested for linear, polynomial, radial basis function and sigmoid kernels, using 10-fold cross-validation on the training set. The obtained best parameters were then used for testing. The ability of several wavelets was also investigated, results showing that Haar functions seem to be the best suited in this case.