1 Introduction

Security of connection and identity management is one of the challenges in networking [11]. Biometrics provide a way to obtain a “key” that cannot be lost or forgotten. Currently, iris recognition is the most precise biometric modality among user-friendly and non-intrusive methods. Moreover, a well-grounded approach to obtaining a high-quality digital key from the iris has been proposed [7]. Hence the development of even more precise iris identification methods remains relevant.

Iris identification involves selecting the annular region of the eye image enclosed between the pupil and the sclera and comparing two such regions for a pair of images. The annular iris region is commonly transformed from the original image to a rectangle of specified size (normalization). Of all possible ways of normalization, the so-called “rubber-sheet” model proposed in [3] is the most popular and, apparently, close to optimal. This work stays within that framework.

As a rule, the texture of the iris is uniform throughout its area, but it has a large number of small details. For this reason, the most successful matching methods are based on a comparison of local texture characteristics, which are calculated by spatially and spectrally local transformations, such as Gabor or LoG wavelets. This approach was proposed in the earliest papers [4, 23]. Many other types of features and matching methods developed since then are also essentially local, for example key points of the iris [25], ordinal statistics [20], and correlators [16]. The use of global descriptors such as PCA, LDA or ICA [5, 22], histograms [10], Fourier transforms [15], and a number of others [1, 2] has not proved successful. These methods either show a recognition error of \(EER \geqslant 1\%\), which is significantly worse than that of local methods, or report results obtained on small and specific image databases.

The use of local characteristics for matching requires the matched objects to be aligned as accurately as possible. Inaccuracy in determining the iris region by border detection algorithms is a serious problem, since it generates nonlinear distortions.

Fig. 1 Distortions of normalized image

Figure 1 schematically shows the results of normalizing the initial image (a) with correctly calculated boundaries (b), with a relative error of \(10\%\) in the pupil radius (c), and with a relative error of \(10\%\) in the iris radius (d). It is easy to see that the deformation has a regular character. However, an accurate (analytical) description is rather difficult, since it must account for at least four parameters that can be determined inaccurately (two coordinates of the eye center and two radii of the circles approximating the inner and outer boundaries of the iris). For more accurate iris models the number of parameters increases; for example, a model of two nonconcentric circles has six parameters. This problem is ill-conditioned, and its solution is practically inapplicable due to instability. Therefore, mainly heuristic ways of matching are investigated.

The main heuristic is the understanding that these distortions change the local texture only weakly, so the similarity of local areas is preserved. Considering this, templates should be matched through a set of their local areas: each area of the first template should find its counterpart in the second template with some offset, and the offsets for different pairs of areas need not be equal. In this formulation, the main issue is what kind of offsets are allowed and how the offsets of different areas are related to each other. In [12] the case of unbound offsets is considered, which are then combined using a hidden Markov model. In [25] nonlinear deformations of the iris are calculated by tracking corresponding points using a special kind of correlator. In [13] global corrections (to compensate for the angle of view) and local corrections (to compensate for the inaccuracy in determining the pupil and iris) are proposed. In [17, 19, 21] neighboring areas are aligned by elastic graphs. These methods are quite computationally complex. In this paper we propose a simpler approach based on the calculation of the optimal path.

The following section briefly describes the procedure for converting an eye image to a template. Then the application of the optimal path to template matching is presented. The last section describes the setup of the computational experiment, its results and conclusions.

2 Making template out of image

Consider a test database containing a set of eye images. Images are labelled with the persons’ unique identifiers in order to verify the correctness of identification. There is more than one person in the database and more than one image per person. The basic processing scheme is depicted in Fig. 2. Here blocks represent the “state of the world” and are labelled with Latin letters: source image (a), segmentation (b), normalized image and mask (c), extracted features forming the template (d), cross-matching decisions (e), and matching statistics and aggregated error values (f). Arrows represent the processes that transform each state to the next one.

Fig. 2 The processing scheme

The source image \(I(x, y)\) is segmented by the method of [6], which outputs the segmentation results as a pupil contour, an iris contour and an occlusion mask. The pupil contour is the circle given by its center and radius \((x_P, y_P, r_P)\), which is the best approximation of the pupil-iris boundary. The iris contour is the circle \((x_I, y_I, r_I)\) approximating the iris-sclera boundary. The occlusion mask \(M(x, y)\) is an image of the same size as the source, with zero pixels where the iris is covered by eyelids, eyelashes, flashes, etc.

Then the iris normalization is performed. It is a mapping of the ring enclosed between the iris and pupil circles to a rectangular region. The normalized image coordinate system is rectilinear \(O\phi \rho\), where the horizontal axis \(O\phi\) corresponds to the angle measured along the pupil and iris circles in the source image, and the vertical axis \(O\rho\) corresponds to the radial shift from the pupil circle to the iris circle. Both the source image and the mask are subjected to the transformation, which yields their normalized versions \(N(\phi , \rho )\) and \(M(\phi , \rho )\). Figure 2c shows a sample normalized image obtained from the image in Fig. 2a, together with its occlusion mask. There are several possible models of this transformation; here the so-called “rubber-sheet model” [3] is used. The source point \((x, y)\) corresponding to the point \((\phi , \rho )\) of the normalized image is expressed as:

$$\begin{aligned} \begin{aligned} x(\phi , \rho ) =&\;(1-\rho ) x_1(\phi ) + \rho x_2(\phi )\;, \\ x_1(\phi ) =&\; x_P + r_P \cos (\phi ), \\ x_2(\phi ) =&\;x_I + r_I \cos (\phi ), \\ \end{aligned} \end{aligned}$$
(1)

The coordinate \(y\) is computed analogously, with sine in place of cosine. The dimensions of the normalized image are set in the ranges \(\rho \in [0;1]\), \(\phi \in [0;2\pi )\). The brightness of the normalized image is obtained by bilinear interpolation:

$$\begin{aligned} \begin{aligned}&N(\phi , \rho ) \\&\quad =\left( 1- \left\{ x \right\} \right) \left( 1- \left\{ y \right\} \right) I\left( \lfloor x \rfloor , \lfloor y \rfloor \right) \\&\qquad +\left\{ x \right\} \left( 1- \left\{ y \right\} \right) I\left( \lfloor x \rfloor +1, \lfloor y \rfloor \right) \\&\qquad +\left( 1- \left\{ x \right\} \right) \left\{ y \right\} I\left( \lfloor x \rfloor , \lfloor y \rfloor +1 \right) \\&\qquad +\left\{ x \right\} \left\{ y \right\} I\left( \lfloor x \rfloor +1, \lfloor y \rfloor +1 \right) , \end{aligned} \end{aligned}$$
(2)

where \(\lfloor a \rfloor\) and \(\left\{ a \right\}\) are the integer and fractional parts of \(a\), respectively.
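Equations (1) and (2) can be illustrated with a short sketch. It is not the authors' implementation: the function names, the list-of-rows image layout (`image[y][x]`), the sampling grid and the clamping of coordinates at the image border are assumptions made only for this example.

```python
import math

def rubber_sheet_point(phi, rho, pupil, iris):
    """Map a normalized point (phi, rho) to source coordinates (x, y), Eq. (1).

    pupil and iris are (x_center, y_center, radius) tuples.
    """
    xp, yp, rp = pupil
    xi, yi, ri = iris
    x1, y1 = xp + rp * math.cos(phi), yp + rp * math.sin(phi)  # point on pupil circle
    x2, y2 = xi + ri * math.cos(phi), yi + ri * math.sin(phi)  # point on iris circle
    return (1 - rho) * x1 + rho * x2, (1 - rho) * y1 + rho * y2

def bilinear(image, x, y):
    """Bilinear interpolation of image[y][x], Eq. (2).

    Coordinates are clamped to the image area (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0  # fractional parts {x}, {y}
    return ((1 - fx) * (1 - fy) * image[y0][x0]
            + fx * (1 - fy) * image[y0][x1]
            + (1 - fx) * fy * image[y1][x0]
            + fx * fy * image[y1][x1])

def normalize(image, pupil, iris, width, height):
    """Sample the iris ring on a width x height grid of (phi, rho) values."""
    rows = []
    for j in range(height):
        rho = j / (height - 1) if height > 1 else 0.0
        row = []
        for i in range(width):
            phi = 2 * math.pi * i / width
            x, y = rubber_sheet_point(phi, rho, pupil, iris)
            row.append(bilinear(image, x, y))
        rows.append(row)
    return rows
```

The occlusion mask can be passed through the same `normalize` call, since both the image and the mask undergo the same transformation.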

Iris features \(V(\phi ,\rho )\) are calculated as the convolution of the normalized image (2) with a Gabor filter:

$$\begin{aligned} \begin{aligned} V(\phi , \rho )&= N(\phi , \rho ) * g_{\sigma \lambda }(\phi , \rho ) \;, \\ g_{\sigma \lambda }(\phi )&= \exp \left( -\frac{\phi ^2}{2 \sigma ^2} \right) \exp \left( -i \frac{\phi }{\lambda }\right) , \end{aligned} \end{aligned}$$
(3)

where \(\sigma\) and \(\lambda\) determine the spread of the wavelet in the spatial domain and the wavelength of modulation. Note that the Gabor wavelet (3) is one-dimensional and spans the angular coordinate \(\phi\). To form 2D functions one might multiply the spatial representations of the filters by the delta function \(\delta (r)\). Finally, the features used for matching are obtained by binarizing the real and imaginary parts of the array \(V(\phi ,\rho )\):

$$\begin{aligned} \begin{aligned}&T_{Re}(\phi ,\rho ) = \left\{ \begin{array}{ll} 1,&{}\quad \mathfrak {R}\left( V(\phi ,\rho )\right)> 0,\\ 0,&{}\quad \mathrm{otherwise}, \end{array} \right. \\&T_{Im}(\phi ,\rho ) = \left\{ \begin{array}{ll} 1,&{}\quad \mathfrak {I}\left( V(\phi ,\rho ) \right) > 0,\\ 0,&{}\quad \mathrm{otherwise}, \end{array} \right. \end{aligned} \end{aligned}$$
(4)

The two components of (4) are joined together to form a template. Binary templates are used in the experiments of this work, but without loss of generality one can speak about any system of local features calculated on a regular mesh. Thus, each eye image \(I(x, y)\) is converted to a template \(T(\phi ,\rho )\) and an accompanying mask \(M(\phi ,\rho )\).
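The feature extraction of (3) and (4) can be sketched for a single row of the normalized image as follows. This is a minimal illustration under stated assumptions: \(\sigma\) and \(\lambda\) are expressed in samples, the kernel is truncated to a hypothetical `half_width`, and the convolution is taken as cyclic along \(\phi\). None of these choices are specified in the text.

```python
import cmath
import math

def gabor_kernel(sigma, lam, half_width):
    """Sampled 1D Gabor wavelet of Eq. (3) on t = -half_width..half_width."""
    return [math.exp(-(t * t) / (2 * sigma * sigma)) * cmath.exp(-1j * t / lam)
            for t in range(-half_width, half_width + 1)]

def encode_row(row, kernel):
    """Cyclic convolution of one normalized-image row with the kernel,
    followed by binarization of the real and imaginary parts, Eq. (4)."""
    n = len(row)
    half = len(kernel) // 2
    re_bits, im_bits = [], []
    for i in range(n):
        v = 0j
        for k, g in enumerate(kernel):
            t = k - half
            v += row[(i - t) % n] * g  # V(i) = sum_t N(i - t) g(t)
        re_bits.append(1 if v.real > 0 else 0)
        im_bits.append(1 if v.imag > 0 else 0)
    return re_bits, im_bits
```

Applying `encode_row` to every row of the normalized image and joining the two bit arrays gives a binary template of the kind described above. Because the convolution is cyclic, rotating the input row rotates the resulting bits by the same amount.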

3 Template matching

Any two binary templates can be matched with the normalized Hamming distance:

$$\begin{aligned} d_0(T_1,T_2) = \frac{1}{\left| \varOmega \right| } \sum \limits _{(\phi ,\rho ) \in \varOmega } T_1(\phi ,\rho ) \oplus T_2(\phi ,\rho ), \end{aligned}$$
(5)

where \(\varOmega = M_1 \cap M_2\) is the intersection of the non-occluded areas of the two matched templates. In practice, a more complex distance function should be used, which accounts for the uncertainty of the iris angle due to image rotation. A rotation of the source eye image becomes a cyclic shift along the \(\phi\) coordinate in the normalized image. One of the templates (together with its mask) is shifted and matched at each allowed shift, and the minimum distance is taken:

$$\begin{aligned} \begin{aligned} d(T_1, T_2)&= \min \limits _{\psi \in [-S;S]} d(T_1, T_2, \psi ),\\ d(T_1, T_2, \psi )&= \frac{1}{\left| \varOmega (\psi ) \right| } \sum \limits _{(\phi ,\rho ) \in \varOmega (\psi )} T_1(\phi +\psi ,\rho ) \oplus T_2(\phi ,\rho ),\\ \varOmega (\psi )&= M_1(\phi +\psi ,\rho ) \cap M_2(\phi ,\rho ), \end{aligned} \end{aligned}$$
(6)

where \(\psi\) is a possible rotation limited by maximum allowed rotation angle S of tested iris. The distance is normalized to the range [0; 1].
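A minimal sketch of the distances (5) and (6) is given below. Templates and masks are assumed to be lists of rows (one row per \(\rho\)) of 0/1 values, with mask value 1 marking a non-occluded cell. The shift is counted in template columns, and returning 1.0 when no valid cells overlap is a convention of this sketch rather than something stated in the text.

```python
def hamming_shifted(t1, m1, t2, m2, shift):
    """Masked normalized Hamming distance d(T1, T2, psi) for one cyclic shift."""
    width = len(t1[0])
    diff = valid = 0
    for r in range(len(t1)):
        for c in range(width):
            c1 = (c + shift) % width       # T1 is read at phi + psi
            if m1[r][c1] and m2[r][c]:     # cell belongs to Omega(psi)
                valid += 1
                diff += t1[r][c1] ^ t2[r][c]
    return diff / valid if valid else 1.0  # assumption: no overlap counts as worst case

def min_shift_distance(t1, m1, t2, m2, max_shift):
    """Minimum over all shifts in [-S; S], Eq. (6)."""
    return min(hamming_shifted(t1, m1, t2, m2, s)
               for s in range(-max_shift, max_shift + 1))
```

With `shift = 0` the first function reduces to the plain distance (5).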

Let us now split the template \(T_1\) into N equally sized, non-intersecting segments \(T_1^{(n)}\), \(n \in [1;N]\), which together cover the whole template and are located along the angular axis as shown in Fig. 3. Each such segment may be displaced by some angle \(\psi _n \in [-S;S]\) and matched against the corresponding part of the template \(T_2\) according to (5). The partial distances obtained in this way can be organized as a matrix \(D = \left\{ d_{\psi ,n}\right\}\) of size \((2S+1)\times N\). Note that the computational complexity of obtaining this matrix does not exceed that of determining the distance (6). In these terms, the distance (6) is obtained as the minimum over the rows of the matrix D of the row sums:

$$\begin{aligned} d(T_1, T_2) = \min \limits _{\psi } \sum \limits _n d_{\psi ,n}. \end{aligned}$$
(7)

Fig. 3 Matching segments with angular shift

That is, the angular displacements of all segments are the same, which can be called the model of a non-deformed “rigid body”. On the other hand, the angular displacements can be made independent by minimizing each partial distance separately:

$$\begin{aligned} d(T_1, T_2) = \sum \limits _n \min \limits _{\psi } d_{\psi ,n}. \end{aligned}$$
(8)

This corresponds to the model of a body that is perfectly elastic with respect to rotations. A computational experiment with independent segment displacements was carried out as described below. When border detection is inaccurate, the model (8) gives a smaller identification error than (7) if the number of segments is small: \(2 \leqslant N \leqslant 6\). As the number of segments increases, the error grows because small segments produce false matches. The question arises whether the alignment can be improved by introducing some restrictions on the mutual motion of the segments. The template should then behave as something between the completely restricted “rigid body” (7) and the model (8), which is completely unconstrained in angle. For this situation, [12] proposes to relate the displacements by introducing a hidden Markov chain. In this paper, we propose to use the optimal path method.
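The partial distance matrix and the two limiting models (7) and (8) can be sketched as follows. The layout is an assumption of this example: templates are lists of rows of 0/1 values, the template width is divisible by the number of segments, row `s` of the matrix corresponds to the shift `s - max_shift`, and a segment with no valid cells gets the distance 1.0. The sums follow (7) and (8) literally, so the results range over \([0; N]\).

```python
def partial_distance_matrix(t1, m1, t2, m2, n_segments, max_shift):
    """Matrix D of size (2S+1) x N: D[s][n] is the masked Hamming distance
    of segment n for the shift psi = s - max_shift."""
    width = len(t1[0])
    seg = width // n_segments  # assumes width is divisible by n_segments
    D = []
    for shift in range(-max_shift, max_shift + 1):
        row = []
        for n in range(n_segments):
            diff = valid = 0
            for r in range(len(t1)):
                for c in range(n * seg, (n + 1) * seg):
                    c1 = (c + shift) % width
                    if m1[r][c1] and m2[r][c]:
                        valid += 1
                        diff += t1[r][c1] ^ t2[r][c]
            row.append(diff / valid if valid else 1.0)
        D.append(row)
    return D

def rigid_from_matrix(D):
    """Eq. (7): one common shift for all segments (minimum row sum)."""
    return min(sum(row) for row in D)

def unconstrained_from_matrix(D):
    """Eq. (8): every segment picks its own best shift (sum of column minima)."""
    return sum(min(D[s][n] for s in range(len(D))) for n in range(len(D[0])))
```

By construction the value of (8) never exceeds that of (7), since a sum of column minima cannot be larger than the smallest row sum.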

The smoothness of the normalization transformation means that the values \(\psi _n\) for neighboring indices are close. Values for distant indices can vary greatly, but because the angle is cyclic, the values of \(\psi _1\) and \(\psi _N\) should also be close. Thus, we obtain the task of selecting a sequence of elements of the matrix \(\left\{ d_{\psi ,n}\right\}\) with the following requirements: one and only one element is selected in each column; the row of the selected element changes by no more than a small value between adjacent columns; the sum of the selected elements is minimized. This problem can be stated as finding the minimum cost of a cyclic path in a matrix:

$$\begin{aligned} \begin{aligned} d(T_1,T_2)&= \min \limits _{\left( \psi _1,\dots ,\psi _N\right) } \sum \limits _{n=1}^{N} \left[ d_{\psi _n,n} + C(\psi _n, \psi _{n+1}) \right] ,\\ C(\psi _n, \psi _{n+1})&= {\left\{ \begin{array}{ll} 0,&{}\quad \mathrm{if}\quad \left| \psi _n - \psi _{n+1} \right| \leqslant 1,\quad n \ne N,\\ 0,&{}\quad \mathrm{if}\quad \left| \psi _1 - \psi _N \right| \leqslant 1,\quad n = N,\\ \infty ,&{}\quad \mathrm{otherwise}. \end{array}\right. } \end{aligned} \end{aligned}$$
(9)

The problem (9) is an optimization of a path on a grid, which can be solved by various methods [24]. Figure 4 shows a typical view of the matrices D for three cases, together with the optimal path found. The matrices are computed for \(N=15\), \(S=12\). The darker the cell, the smaller the partial distance. The optimal path is outlined with white circles. The first matrix (a) is the case of matching two iris images of one person (i.e. a “genuine” match), when all border parameters are detected correctly (or, possibly, have very similar errors). In this case no rotation is necessary, the optimal path is a straight line, and the “rigid body” model would suffice as well. Matrix (b) is the case of a genuine match in which the normalization is distorted by a border detection error. Here one can see that the column minima appear in a regular order, which forms a dark “valley”, and the optimal path is a curve running through it. Matrix (c) is the case of an impostor match. Whether or not border detection is correct, the matrix of an impostor match has a chaotic location of minima in its columns, and the optimal path is forced through many large values of the matrix, yielding a high total cost. The solution of (9) produces the distance and the angles \(\left( \psi _1,\dots ,\psi _N\right)\) by which the template segments are offset to find their counterparts.
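One possible way to solve (9) is dynamic programming repeated for every admissible starting row, which enforces the cyclic condition on \(\psi _1\) and \(\psi _N\) directly. The sketch below is an illustration only and is not necessarily the solver used in the experiments. The function name, the `max_step` parameter and the row-index convention (row `s` stands for the shift `s - S`) are assumptions of this example. Its cost is \(O((2S+1)^2 N)\) operations per comparison.

```python
def optimal_cyclic_path(D, max_step=1):
    """Minimum-cost cyclic path through D (rows: shifts, columns: segments).

    Returns (cost, path), where path[n] is the row selected in column n.
    Adjacent rows, including the last and the first, differ by <= max_step.
    """
    rows, cols = len(D), len(D[0])
    INF = float("inf")
    best_cost, best_path = INF, None
    for start in range(rows):
        # Forward pass with the first column fixed to row `start`.
        cost = [INF] * rows
        cost[start] = D[start][0]
        back = []
        for n in range(1, cols):
            new_cost = [INF] * rows
            step = [-1] * rows
            for r in range(rows):
                for q in range(max(0, r - max_step), min(rows, r + max_step + 1)):
                    if cost[q] + D[r][n] < new_cost[r]:
                        new_cost[r] = cost[q] + D[r][n]
                        step[r] = q
            back.append(step)
            cost = new_cost
        # Close the cycle: the last row must be within max_step of the first.
        for end in range(max(0, start - max_step), min(rows, start + max_step + 1)):
            if cost[end] < best_cost:
                best_cost = cost[end]
                path = [end]
                for step in reversed(back):
                    path.append(step[path[-1]])
                path.reverse()
                best_path = path
    return best_cost, best_path
```

For a matrix with a single column the result coincides with the minimum element of that column, that is, with the “rigid body” distance for \(N = 1\).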

Fig. 4 Matrices D for undistorted, distorted genuine and impostor matches

All the calculations of this section involve a free parameter N, the number of segments into which the template is divided. This parameter was selected experimentally, by calculating the classification error obtained with the distance for a given N.

4 Experiment setup and results

Applying a threshold \(\varTheta \in (0;1)\) to the distance \(d(T_1,T_2)\) yields a classifier:

$$\begin{aligned} Class = {\left\{ \begin{array}{ll} genuine,&{}\quad \mathrm{if} \quad d(T_1, T_2)\leqslant \varTheta , \\ impostor,&{}\quad \mathrm{if}\quad d(T_1, T_2)>\varTheta . \end{array}\right. } \end{aligned}$$
(10)

Since the persons’ unique identifiers are known for the test database, the decisions of the classifier can be compared with the ground truth, and the quality of the classifier can be evaluated from the number of wrong classification events. The equal error rate (EER) is the common value of the false match and false non-match rates at the classification threshold \(\varTheta\) where the two rates are equal:

$$\begin{aligned} EER \equiv \frac{fn(\varTheta )}{fn(\varTheta )+tp(\varTheta )} = \frac{fp(\varTheta )}{fp(\varTheta )+tn(\varTheta )}, \end{aligned}$$
(11)

where \(fn(\varTheta )\), \(fp(\varTheta )\) are the numbers of false non-match and false match events (also referred to as false negatives and false positives), and \(tn(\varTheta )\), \(tp(\varTheta )\) are the numbers of true non-match and true match events, respectively. Since EER depends on the number of template segments, one can think of it as a function EER(N) and search for the optimum:

$$\begin{aligned} EER^{*} = \min \limits _{N} EER(N),\;\; N^{*} = \arg \min \limits _{N} EER(N). \end{aligned}$$
(12)
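The error rates of (11) can be estimated from two lists of distances, one for genuine and one for impostor comparisons. The sketch below is a simple illustration, not the evaluation code used for the experiments. On finite data the two rates rarely coincide exactly, so it scans the observed distances as candidate thresholds and reports the mean of the two rates where they are closest, which is an assumption of this example.

```python
def error_rates(genuine, impostor, theta):
    """False non-match and false match rates of classifier (10) at threshold theta."""
    fnmr = sum(d > theta for d in genuine) / len(genuine)     # fn / (fn + tp)
    fmr = sum(d <= theta for d in impostor) / len(impostor)   # fp / (fp + tn)
    return fnmr, fmr

def equal_error_rate(genuine, impostor):
    """Approximate EER and the threshold at which it is reached."""
    best = None
    for theta in sorted(set(genuine) | set(impostor)):
        fnmr, fmr = error_rates(genuine, impostor, theta)
        gap = abs(fnmr - fmr)
        if best is None or gap < best[0]:
            best = (gap, (fnmr + fmr) / 2, theta)
    return best[1], best[2]
```

Evaluating `equal_error_rate` on the distances obtained for each number of segments N and taking the smallest value corresponds to the search (12).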

Two publicly available iris image databases were used: the ICE2005 subset of ND-Iris-0405 [18] and CASIA4-Lamp [9]. All images in the datasets have 480 rows and 640 columns. The ICE2005 subset contains 2593 images of 132 persons. The majority of the subjects are Caucasian. All images were acquired with the LG 2200 iris biometrics system. The subset of all left irises, comprising 1527 images of 119 persons, was used for the experiments. The number of images per iris is very uneven in this database and ranges from one (33 persons have only one image) to 31. The number of genuine matches is 15,357 and the number of impostor matches is 1,149,744. CASIA4-Lamp contains images of over 800 irises; each iris is represented by 20 images or, rarely, a couple fewer. In total, the database contains 16,312 images. All irises are of Asian type. Images were collected under near-infrared illumination using the IKEMB-100 camera produced by OKI-IrisPass (http://www.oki.com). This gives approximately \(800 \cdot 20 \cdot 20/2 = 160\)k genuine matches and \(16{,}312^2/2-160{,}000 \approx 133\)M impostor matches.

Fig. 5 Graphs of EER(N) for various databases and matchers

Figure 5 presents graphs of the dependence of EER on the number of segments N, \(N \in [1;30]\), for the two databases. The graphs labelled “NoLink” are obtained for the model of unrelated segments with distance (8); the graphs labelled “OptPath” are obtained for the optimal path model with distance (9). Note that the initial points of these graphs, \(N = 1\), coincide and correspond to the “rigid body” model (7).

Based on the graphs, the following conclusions can be drawn. The “rigid body” model almost always loses to models with division into segments. For a small number of segments, models with unrelated segments are better than optimal path models, but as the number of segments increases they quickly saturate, and then their error grows. Optimal path models saturate more slowly, but they achieve substantially better results. Comparing the databases (ICE2005 against CASIA), one can see that at \(N=1\) ICE2005 yields a larger EER value, but with growing N it allows better classification for both matchers. The reason is that ICE2005 has a larger share of low-quality images, which lead to imprecise border detection and hence to template matching errors with the “rigid body” method. On the other hand, ICE2005 has more diverse iris types and in general fewer occlusions; both of these factors allow better precision to be achieved once the influence of border detection errors is compensated.

Table 1 gives the EER (11) in percent for the proposed approach and some state-of-the-art methods on the two databases.

Table 1 EER for various methods

The “rigid body” model, which is a straightforward implementation of Daugman’s scheme [4], performs poorly and loses to every state-of-the-art approach. Splitting into segments with unconstrained matching (8) improves the situation substantially, but is still inferior to other methods. The optimal path model closes the gap, and the proposed method demonstrates the same recognition quality as known solutions according to EER. At the same time, the optimal path method is quite simple both algorithmically and computationally.

5 Conclusions

The influence of splitting the iris into segments upon the classification error was studied. The models of the “rigid body”, unconstrained segment offset and alignment by optimal path were investigated. Numerical experiments have shown that alignment by optimal path can significantly improve the accuracy of recognition and reach the performance of complex state-of-the-art approaches. The highest accuracy is achieved with five segments for the unconstrained offset model and with 16 segments for the optimal path model. The running time of the algorithm with \(N=16\) is 150 microseconds per comparison on an Intel Core i7-3770 CPU, and the algorithm can be accelerated further on multiprocessor systems.