Introduction

Recently, an emerging biometric technology based on human finger veins has attracted increasing attention. Since the finger veins are located inside the living body, they provide a recognition system with high accuracy, immunity to forgery, and robustness to interference from the outer skin (e.g., skin disease, humidity, or dirt).

Finger vein systems have the advantages of low cost, easy contactless collection, and small devices. Finger vein patterns are viewable under infrared illumination because oxygenated and de-oxygenated hemoglobin in the blood absorbs infrared light at specific frequencies far more strongly than the surrounding flesh [1]. In practice, however, finger vein images suffer from the selectivity of the imaging mode and from changes in physical conditions and blood flow, which can make them unstable and low in contrast, or cause the veins to vary in apparent thickness and brightness, as shown in Fig. 1. This makes it difficult to achieve reliable and accurate finger vein recognition and places high generalization demands on the feature extraction and matching algorithms.

Vein extraction has been widely researched, usually based on the intensity characteristics of cross-sectional profiles, since a vein point is darker than its surroundings. Miura et al. proposed repeated line tracking [1] and the maximum curvature detection method [2, 3] based on the cross-sectional profiles. Hoover et al. [4, 5] proposed an approximated Gaussian-shaped model to simulate the profile curve. Although these methods can extract the veins from a low-contrast image, they are strongly affected by temporal changes in vein width. Intensity thresholding-based methods [6] are easily affected by the image brightness, due to the difficulty of tuning the threshold.

In the matching phase, the similarities between the registered image and the test image are calculated based on a distance measure [7], the chi-square distance [8], or machine learning methods. Over the past few decades, many cognition-inspired computational works have contributed to image processing and pattern recognition [9–12], including neural networks [13–15], genetic algorithms, support vector machines [16], and a new type of feed-forward classifier, the extreme learning machine (ELM) [17–20]. The ELM has recently attracted increasing attention as an emergent technology that overcomes some of the challenges faced by other classifiers. The ELM works well for generalized single-hidden-layer feed-forward networks (SLFNs); its essence is that the hidden layer of an SLFN need not be tuned. Compared with traditional classifiers, the ELM provides better generalization performance at a much faster learning speed with less human intervention [21–24].

In this paper, we propose a novel finger vein recognition system that is more robust to variations in external factors such as lighting and user positioning, and that improves stability, complexity, and recognition accuracy, rendering the system more practical in real-world applications and enabling it to handle datasets of increasing size. For feature extraction, a novel explicit guided directional filter is proposed to obtain high-quality finger vein contours from noisy, non-uniform, low-contrast images without introducing any segmentation process. This filter enhances an input image with the help of a supervisor image that instructs the filter to preserve the vein patterns and reduce the impact of the background, such as haze and illumination. After the guided directional filter, the veins are sufficiently magnified to directly extract the average absolute deviation (AAD) features, the strengths of the directional block information at eight different angles, even from images with thin, vague ridges and non-uniform backgrounds. Finally, a variant of the original ensemble ELM, called feature component-based ELMs (FC-ELMs), is introduced. FC-ELMs are designed to exploit the characteristics of the AAD features, to improve the recognition accuracy, speed, and generalization stability for large datasets, and to substantially reduce the number of hidden units.

Fig. 1

Finger vein images with diverse qualities: a low-quality images affected by illumination and low contrast and b high-quality images

Related Study: Ensemble Extreme Learning Machine

The ELM algorithm was first proposed by Huang et al. [21, 22, 25] based on the single-hidden-layer feed-forward network (SLFN). The main concept of the ELM is that the hidden node parameters are randomly generated without tuning. Consider a set of \(N\) arbitrary distinct samples \((x_i, t_i)\), where \(x_i=[x_{i1}, x_{i2}, \ldots ,x_{in}]^T\in {R^n}\) is an \(n\times {1}\) input vector and \(t_i=[t_{i1},t_{i2}, \ldots , t_{im}]^T\in {R^m}\) is an \(m\times {1}\) target vector. For the given training samples \(\left\{ (x_i,t_i)\right\} _{i=1}^N \in {{R^n}\times {R^m}}\), the output of an SLFN with \(L\) hidden nodes can be represented by

$$\begin{aligned} f_S(x_j)=\sum _{i=1}^L\beta _iK(a_i,b_i,x_j)=t_j,\quad j=1, \ldots , N \end{aligned}$$
(1)

where \(a_i\) and \(b_i\) are the hidden node parameters, which could be randomly generated. \(K(a_i,b_i,x_j)\) is an activation function and \(\beta _i\) is the weight connecting the \(i^{th}\) hidden node to the output nodes, which can be written compactly as:

$$\begin{aligned} H\beta =T \end{aligned}$$
(2)

\(H\) is called the hidden layer output matrix of the network. Given the randomly generated hidden node parameters \((a_i,b_i)\) and the training inputs \(x_i\), the hidden layer output matrix \(H\) can be computed simply.

$$\begin{aligned}&H(a_1,\ldots ,a_L,b_1,\ldots ,b_L,x_1,\ldots ,x_N)=\left[ \begin{array}{ccc} K(a_1,b_1,x_1) &{} \cdots &{} K(a_L,b_L,x_1) \\ \vdots &{} \cdots &{} \vdots \\ K(a_1,b_1,x_N) &{} \cdots &{} K(a_L,b_L,x_N) \end{array}\right] _{N\times {L}} \nonumber \\&\beta =(\beta _1,\ldots ,\beta _L)^T \quad \hbox {and} \quad T =(t_1,\ldots ,t_N)^T\qquad \qquad \end{aligned}$$
(3)

Therefore, training the SLFNs simply amounts to solving the linear system of output weights \(\beta\). With the computed \(H\) and given output \(T\), the output weight \(\beta\) is estimated as:

$$\begin{aligned} \beta =H^{\dagger }T \end{aligned}$$
(4)

where \(H^{\dagger }\) is the Moore–Penrose generalized inverse of the hidden layer output matrix H. There are several methods of calculating the Moore–Penrose generalized inverse of H, such as the SVD-based method. The single ELM network shown in Fig. 2a is widely used for real-time applications due to its simple steps and very high speed.
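As a concrete illustration, the closed-form training of Eqs. (1)–(4) can be sketched in a few lines of NumPy. This is a minimal sketch only: the sigmoid activation, the uniform random initialization, and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def elm_train(X, T, L, rng=None):
    """Train a single-hidden-layer ELM (Eqs. 1-4).

    X : (N, n) input samples; T : (N, m) one-hot targets;
    L : number of hidden nodes. The hidden parameters (a_i, b_i)
    are drawn randomly and never tuned.
    """
    rng = np.random.default_rng(rng)
    N, n = X.shape
    A = rng.uniform(-1.0, 1.0, size=(n, L))   # hidden weights a_i
    b = rng.uniform(-1.0, 1.0, size=L)        # hidden biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))    # hidden layer output matrix, Eq. (3)
    beta = np.linalg.pinv(H) @ T              # Moore-Penrose solution, Eq. (4)
    return A, b, beta

def elm_predict(X, A, b, beta):
    # Recompute H for new inputs and apply the learned output weights
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta
```

Note that the only learned quantity is \(\beta\); this is what makes ELM training a single linear solve rather than an iterative optimization.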

Hansen and Salamon [26] proposed that the single network performance can be improved by using an ensemble of neural networks with a plurality consensus scheme. An integration of several ELMs connected in parallel was first proposed by Lan et al. [27]. It was confirmed that the method worked well for both stationary and non-stationary time series prediction [28, 29] and sales prediction [30] with better generalization performance.

The average of the ELM outputs was used as the final decision. Assume that the output of each ELM network is \(f_s^{(j)}(X),\,j=1, \ldots , M\). The final output of the ensemble ELMs (E-ELMs) shown in Fig. 2b can be represented as:

$$\begin{aligned} f_E(X)= \sum _{j=1}^{M}w_j\cdot f_s^{(j)}(X), \quad w_j=\frac{1}{M} \end{aligned}$$
(5)

where \(f_E(X)\) is the output of the whole system with input \(X\). We expect the ensemble ELM to work better than the single ELM, because the randomly generated parameters make each ELM network in the ensemble distinct. The variance of the ensemble network is lower than the average variance of all of the single networks. Let \(f(x)\) denote the true output of the predicted input and \(\widehat{f_i}(x)\) be the estimated value of network \(i\). Then, the error \(e_i(x)\) between the predicted \(\widehat{f_i}(x)\) and true output \(f(x)\) is expected to be at a minimum:

$$\begin{aligned} e_i(x)= \left| \widehat{f_i}(x) -f(x) \right| \end{aligned}$$
(6)

Then, the expected square error of a single network becomes

$$\begin{aligned} E[e_i(x)^2]=E\left[ \left\{ \widehat{f_i}(x) -f(x) \right\} ^2 \right] \end{aligned}$$
(7)

The average error made by M networks is given by

$$\begin{aligned} E_{\mathrm{avg}}=\frac{1}{M}\sum _{i=1}^{M}E[e_i(x)^2] \end{aligned}$$
(8)

Similarly, the expected error of the ensemble is given by

$$\begin{aligned} E_{\mathrm{ens}}=E\left[ \left\{ \frac{1}{M}\sum _{i=1}^{M}\widehat{f_i}(x)-f(x) \right\} ^2 \right] =E\left[ \left\{ \frac{1}{M}\sum _{i=1}^{M}e_i(x) \right\} ^2 \right] \end{aligned}$$
(9)

If the errors \(e_i(x)\) are zero-mean and mutually uncorrelated, then

$$\begin{aligned} E_{\mathrm{ens}}=\frac{1}{M}E_{\mathrm{avg}} \end{aligned}$$
(10)

Thus, the ensemble ELM network produces a lower expected error than the average of \(M\) single ELM (S-ELM) networks. Since each of the \(M\) S-ELM networks adapts differently to new data, the ensemble can overcome the problem of individual networks that do not adapt well to new data.
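The variance-reduction argument of Eqs. (6)–(10) can be checked numerically with a toy simulation; the Gaussian error model and all numbers below are illustrative assumptions, chosen only to make the 1/M factor visible.

```python
import numpy as np

# Toy illustration of Eqs. (6)-(10): averaging M estimators with
# independent zero-mean errors cuts the expected squared error by ~1/M.
rng = np.random.default_rng(42)
M, trials = 10, 20000
errors = rng.normal(0.0, 0.5, size=(trials, M))   # e_i(x), independent, zero mean
single_mse = (errors ** 2).mean()                 # E_avg, Eq. (8)
ensemble_mse = (errors.mean(axis=1) ** 2).mean()  # E_ens, Eq. (9)
assert ensemble_mse < single_mse / (M - 1)        # close to E_avg / M, Eq. (10)
```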

Fig. 2

Architecture of the extreme learning machine classifier: a single ELM classifier and b ensemble ELM classifier

Proposed Finger Vein Recognition System

As shown in Fig. 3, the proposed finger vein recognition system consists of two modules: a feature extraction module and an ELM-based matching module. The feature extraction module consists of three main steps: pre-processing, vein contour extraction, and AAD feature extraction. An ROI image with a pre-defined smaller size speeds up the overall feature extraction process. Vein contour extraction using a guided directional filter extracts high-quality vein contours in eight directions. Instead of pixel-based features, the AAD strengthens the direction information by extracting features on non-overlapping blocks. The matching module is implemented as an ensemble ELM network, which consists of eight small ELMs, each trained with the AAD sub-features of one pre-defined angle, and an output layer combining the outputs of the eight ELMs.

Fig. 3

Procedures of the proposed finger vein recognition system

Pre-processing

Due to the user's informal placement, distortion, and rotation, it is necessary to determine a reliable region of interest (ROI) of a pre-defined size but adjustable position and rotation in the finger vein image. As the finger target is brighter than the surrounding background pixels, a convex structure is formed at the profiles of the finger, which can be detected by the open top-hat filter defined in Eq. (11). \(F\circ B\) represents the morphological opening operation with the structuring element \(B\), a disk of size 5.

$$\begin{aligned} \hbox {OTH}\left( t \right) =(F-F\circ B)(t) \end{aligned}$$
(11)

Since the finger profile can be approximated as a line, the Hough transform is used to detect the positions and angles of the finger lines, since it is tolerant of gaps in the edge descriptions and is relatively unaffected by image noise [31]. The group of edge points \(\left\{ \left( x_1,y_1 \right) ,\left( x_2,y_2 \right) ,\ldots ,\left( x_k,y_k \right) \right\}\) is transformed into a sinusoidal curve in the plane \(\left( \theta ,\rho \right) ,\left( \rho \geqslant 0,0\leqslant \theta \leqslant \pi \right)\) defined by:

$$\begin{aligned} \rho =x_i\cos \theta +y_i\sin \theta \quad \left( i=1,2,\ldots ,k \right) \end{aligned}$$
(12)

The accumulator cells that lie along the curve are incremented, and the resulting peak in the accumulator array provides strong evidence that a corresponding straight line exists in the image. As shown in Fig. 4c, two peaks \(\left( \rho _1,\theta _1 \right) ,\,\left( \rho _2,\theta _2 \right)\) are detected corresponding to the two horizontal finger contour lines. When considering the finger curvature itself, a simple rotation correction will involve a rotation of \((\theta _1+\theta _2)/2\) degrees when two detected peaks satisfy the condition:

$$\begin{aligned} (\theta _1-\pi )+(\theta _2-\pi )\geqslant \frac{\pi }{18} \end{aligned}$$
(13)

Finally, the ROI is centered at the point \((C_x,C_y)=(width/2,\,(\rho _1+\rho _2)/2)\) and cropped with a size of [256, 96] for the rotation-corrected images, as shown in Fig. 4d.
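The Hough voting step of Eq. (12) can be sketched as below. This is a minimal sketch only: the one-pixel \(\rho\) binning and the function name are assumptions for illustration, not the paper's implementation, which would also apply peak selection for two lines and the rotation correction of Eq. (13).

```python
import numpy as np

def hough_peak(edge_points, img_shape, n_theta=180):
    """Locate the strongest line (rho, theta) via the voting of Eq. (12).

    edge_points : (k, 2) array of (x, y) coordinates from the
    top-hat edge image of Eq. (11).
    """
    h, w = img_shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    max_rho = int(np.hypot(h, w))
    acc = np.zeros((max_rho + 1, n_theta), dtype=int)  # accumulator cells
    for x, y in edge_points:
        rhos = x * np.cos(thetas) + y * np.sin(thetas)  # Eq. (12)
        valid = rhos >= 0
        acc[rhos[valid].astype(int), np.flatnonzero(valid)] += 1
    r, t = np.unravel_index(acc.argmax(), acc.shape)    # accumulator peak
    return r, thetas[t]
```

A horizontal finger contour at height \(y_0\) would yield a peak near \((\rho ,\theta )=(y_0,\pi /2)\), matching the two peaks illustrated in Fig. 4c.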

Fig. 4

The procedure of pre-processing: a the captured image, b the edge image by the top-hat filter, c accumulator array obtained by Hough transform, d finger contour line and ROI detection

Guided Directional Filter for Vein Contour Extraction

Since finger vein images are not always of high quality, due to varying tissues and bones or uneven illumination, an efficient enhancement method is necessary to compensate for the factors that make the veins appear different in thickness and brightness at each acquisition. As the finger vein network is composed of a series of ridges in particular orientations, a properly tuned directional filter, such as the even symmetric Gabor filter [32], has been proved to provide excellent performance for ridge extraction. Here, a guided directional filter is constructed from an even symmetric Gabor filter and the guided filter [33]. Using a supervisor image, which can be the input image itself or another image, the guided filter is instructed to preserve the vein pattern and reduce the impact of the background.

The key assumption of the guided filter is the existence of a local linear model between the supervisor image \(S\) and the filtered image. In each window \(w_k\) centered at pixel \(k\), the guided output \(G_u\) is a linear transform of \(S\) with coefficients \((a_k,b_k)\), which can be represented as:

$$\begin{aligned} G_{ui}=a_kS_i+b_k,\quad i\in w_k \end{aligned}$$
(14)

This local linear model ensures that \(G_u\) has an edge only if \(S\) has an edge, because \(\triangledown G_u=a\triangledown S\). This has been proven to be useful in image matting, image super-resolution, and haze removal [33]. The relationship between \(S,\,I\) and \(G_u\) can be described in the form of image filtering as follows:

$$\begin{aligned} G_{ui}(I,S,w,\varepsilon )=\underset{j}{\sum }W_{ij}(S,w,\varepsilon )I_j \end{aligned}$$
(15)

The kernel weight can be explicitly expressed by:

$$\begin{aligned} W_{ij}(S,w,\varepsilon )=\frac{1}{\left| w \right| ^2}\underset{k:(i,j)\in w_k}{\sum }(1+\frac{(S_i-\mu _k)(S_j-\mu _k)}{\sigma _k^2+\varepsilon }) \end{aligned}$$
(16)

where \(\mu _k\) and \(\sigma _k^2\) are the mean and variance of window \(k\), respectively. It can be proven that the kernel weights satisfy \(\underset{j}{\sum }W_{ij}\left( S \right) =1\) without any extra normalization. Then, the guided directional filter can be represented in the following general form:

$$\begin{aligned} G(I,f_0,\theta _k,\delta )=G_a(I,f_0,\theta _k,\delta )*G_u(I,S,w,\varepsilon ) \end{aligned}$$
(17)

where,

$$\begin{aligned}&G_a(I,f_0,\theta _k,\delta )=\exp \left\{ -\frac{1}{2} \left( \frac{I_{x_{\theta _k}}^2+I_{y_{\theta _k}}^2}{\delta ^2} \right) \right\} \cos (2\pi f_0 I_{x_{\theta _k}})\nonumber \\&\quad \quad \hbox {where}, \quad \left[ \begin{array}{c} I_{x_{\theta _k}}\\ I_{y_{\theta _k}} \end{array}\right] =\left[ \begin{array}{cc} \cos \theta _k &{} \sin \theta _k \\ -\sin \theta _k &{} \cos \theta _k \end{array}\right] \left[ \begin{array}{c} I_x \\ I_y \end{array}\right] \qquad \qquad \end{aligned}$$
(18)

where \(*\) denotes a convolution in two dimensions, \(\theta _k=k\pi /8\) with \(k=1,2,\ldots ,8\) denotes the orientation, and \(f_0\) is the center frequency of the Gabor filter. The bank of guided directional filters, as shown in Fig. 7, generates eight filtered components. Since the linear edge-preserving coefficient \(a_k\) decreases with increasing \(\varepsilon\) in Eq. (16), \(\varepsilon\) controls the degree of edge preservation. As shown in Fig. 5, the edge preservation performance is enhanced with increasing \(\varepsilon\) and window size \(w\).
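For illustration, the even symmetric Gabor component \(G_a\) of Eq. (18) can be sketched as a discrete kernel, reading \(I_{x_{\theta _k}},I_{y_{\theta _k}}\) as rotated spatial coordinates; the kernel size, parameter values, and function name below are assumptions, not the paper's settings.

```python
import numpy as np

def gabor_kernel(size, f0, theta_k, delta):
    """Even-symmetric Gabor kernel G_a of Eq. (18) for one of the
    eight orientations theta_k = k*pi/8. In Eq. (17) this response
    is convolved with the guided-filter output G_u.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_t = x * np.cos(theta_k) + y * np.sin(theta_k)    # rotated coordinates
    y_t = -x * np.sin(theta_k) + y * np.cos(theta_k)
    # Gaussian envelope modulated by an even (cosine) carrier along x_t
    return (np.exp(-0.5 * (x_t**2 + y_t**2) / delta**2)
            * np.cos(2.0 * np.pi * f0 * x_t))
```

Because the carrier is a cosine, the kernel is even symmetric, \(G_a(-x,-y)=G_a(x,y)\), which is what makes it respond to dark ridge-like vein cross-sections regardless of side.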

Meanwhile, the haze removal and vein enhancement performance vary depending on the supervisor image. A proper supervisor image will benefit the vein extraction process, as demonstrated in Fig. 6e–g, where the supervisor image is the input image itself in Fig. 6e, f, while in Fig. 6g it is the image already enhanced by the guided filter. Although Fig. 6g shows a darker and clearer vein contour than Fig. 6e, f because of the iteratively enhanced supervisor image, it may be more effective for other matching methods, such as local binary pattern-based methods. When a directional filter is applied for further vein extraction, Fig. 6g yields much more noise than the image in Fig. 6f, because the additional enhancement amplifies the noise as well. To optimize the settings of the guided filter, the performance of vein contour extraction is quantitatively evaluated by the matching performance in the “Vein Contour Extraction Performance” section. Compared with typical enhancement methods, the guided Gabor filter achieved superior accuracy with \(\varepsilon =1^2,\,w=15\), and \(S=I\).

Fig. 5

Enhancement performance of the guided filter: a input image I, b supervisor image S, c–f enhanced images under various \(w,\,\varepsilon\)

Fig. 6

Performance comparisons of vein contour extraction based on enhanced image obtained using: a original image, b global histogram, c local histogram with a block size of [32, 16], d wavelet normalization, e guided filter when \(S=I,\,\varepsilon =0.05^2\), and \(w=15\), f guided filter when \(S=I,\,\varepsilon =1^2\), and \(w=15\), g guided filter when \(S\) is the enhanced image of Fig. 5f, \(\varepsilon =1^2\) and \(w=15\)

Block-Based Average Absolute Deviation Feature Extraction

The guided directional filter outputs eight vein contour images, as shown in Fig. 7a. The finger vein images can be discriminated by the variation of the finger vein contours in the eight directions. Instead of pixel-based features, the directional filtered image is segmented into non-overlapping blocks of size \([T_1\times {T_2}]\). For instance, \((256 \times {96})/({T_1}\times {T_2})\) features can be extracted from a normalized image with a size of \(256\times {96}\) based on the statistical information. The selection of the splitting block size is analyzed in the experimental section. Assuming that \(F_{mn}\) represents the block matrix of a filtered image (the component of \(F\) in column \(m\) and row \(n\), where \(m=1, 2,\ldots , 256/{T_1}\) and \(n=1, 2,\ldots ,96/{T_2}\)), the statistics can be computed per block. The AAD [34] \(\delta _{mn}^k\) of the magnitudes of \(G(I,f_0,\theta _k,\delta )\) corresponding to \(F_{mn}\) is calculated as:

$$\begin{aligned} \left\{ \begin{array}{l} \delta _{mn}^k=\frac{1}{N}\underset{F_{mn}}{\sum }\left| \left| G(I,f_0,\theta _k,\delta ,w,\varepsilon ) \right| -\mu _{mn}^k \right| \\ \mu _{mn}^k=\frac{1}{N}\underset{F_{mn}}{\sum }\left| G(I,f_0,\theta _k,\delta ,w,\varepsilon ) \right| \end{array}\right. \end{aligned}$$
(19)

where \(N\) is the number of pixels in \(F_{mn}\), and \(\mu _{mn}^k\) is the mean value of the magnitudes of \(G(I,f_0,\theta _k,\delta )\) in \(F_{mn}\). The feature vector for matching can be represented by: \(X=[C_1,C_2,\ldots ,C_8]\), where,

$$\begin{aligned} C_k=\left[ \begin{array}{ccc} \delta _{11}^k &{} \dots &{} \delta _{1n}^k \\ \vdots &{} \delta _{ij}^k &{} \vdots \\ \delta _{m1}^k &{} \dots &{} \delta _{mn}^k \end{array}\right] _{m\times n} \quad k=1,2,\dots ,8 \end{aligned}$$
(20)

Eight-directional AAD features \(X\) corresponding to the eight contour images are obtained in this way, as shown in Fig. 7b, when the non-overlapping block size is \(16 \times {16}\). For each normalized contour image with a size of [256, 96], \(96\,(=16\times 6)\) AAD features can be extracted to match a query image with a template.
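The block-based AAD computation of Eqs. (19)–(20) for one directional output can be sketched with a NumPy reshape; the block dimensions and the function name are illustrative assumptions.

```python
import numpy as np

def aad_features(filtered, t1=16, t2=16):
    """Block-wise average absolute deviation (Eq. 19) of one
    directional output G(I, f0, theta_k, delta). Returns the
    component matrix C_k of Eq. (20)."""
    mag = np.abs(filtered)                       # |G(...)|
    h, w = mag.shape
    # Split the (h, w) image into non-overlapping (t2, t1) blocks F_mn
    blocks = mag.reshape(h // t2, t2, w // t1, t1)
    mu = blocks.mean(axis=(1, 3), keepdims=True)  # mu_mn^k, block means
    return np.abs(blocks - mu).mean(axis=(1, 3))  # delta_mn^k
```

For a \(96\times 256\) contour image with \(16\times 16\) blocks this yields a \(6\times 16\) matrix, i.e., the 96 features per direction stated above.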

Fig. 7

Feature extraction results: a vein contour features on the eight directions, b block (\([16\times {16}]\))-based average absolute deviation features

Proposed Feature Component-Based Extreme Learning Machines

In face recognition systems, facial component features derived from the eyes, nose, and mouth can be extracted separately and then batched as one feature set for recognition [35]. For finger vein recognition, the global feature is highly sensitive to image variations caused by user operation or environmental conditions, such as finger rotation, translation, or illumination. In contrast to general recognition systems based on structural component features, which are extracted from the local positions or properties of objects, the proposed recognition system, called feature component-based ELMs (FC-ELMs), selects the feature components from the global features directionally, since the veins are composed of a series of directional information.

In the finger vein recognition system, eight S-ELM networks are constructed in parallel on the eight extracted component features, as shown in Fig. 8. The parallel recognition systems based on the selected independent feature components are linearly combined for the final recognition decision. The eight directional components, called \(C_1,\,C_2\), ..., \(C_8\), correspond to the directional filter at \(0^{\circ },\,22.5^{\circ }, \ldots , 157.5^{\circ }\), respectively. For each of the eight directional components, 96 AAD features are extracted with the selected block size of \(16\times {16}\). One of the eight components \(C_k\) from the total feature vector set is assigned as the input of each S-ELM network. Thus, the feature size of each S-ELM network decreases to 1/8 of that of the feature vectors in the S-ELM and E-ELM models. The output of the FC-ELM model is defined as follows:

$$\begin{aligned} f_{c}(X)=\sum _{k=1}^{8}w_k\cdot f(C_k),\quad k=1,2,\ldots ,8 \end{aligned}$$
(21)

where \(k\) denotes the \(k\)th of the eight directional components. Although each component network runs independently, the final output \(f_{c}(X)\) depends on the proper assignment of the components \(C_k\), the performance of each \(f(C_k)\), and the cooperation among them through the adaptive weights \(w_k\). To ensure the matching performance of the recognition system, the principle employed for feature component extraction is that each component should have sufficient uniqueness for recognition and robustness to user operation or illumination.

Fig. 8

Overview of the finger vein recognition system based on the proposed FC-ELM network

With the component correlation analysis, a proper weight assignment to the component features will improve their cooperation. An adaptive weighting method is proposed based on the analysis of the independence and correlation of the eight components, considering the following two factors:

  1. Not only the AAD features but also the component distribution of each image contributes to the matching.

  2. Components with high confidence are assigned larger weights to decrease the matching error.

Assuming that both a fingerprint image and a finger vein image are convolved with the proposed guided directional filter, the fingerprint energy will spread almost equally in each direction, since the fingerprint ridges are connected in roughly circular form. The finger vein energy of the eight components, however, behaves more like a Gaussian distribution than an approximately uniform one. The main blood vessels, such as the main branches, flow from one side to the other in the vertical direction and form an energy peak. The minor vessels, which connect to the main vessels randomly, exhibit more energy degeneration than the main vessels, and less of their energy is focused. The energy distribution in the eight directions for the finger vein is shown in Fig. 9. The energy of each component, \(E_k\), is defined in Eq. (22).

$$\begin{aligned} E_{k}=\underset{x,y}{\sum }(255-G(x,y,k))^2 \end{aligned}$$
(22)

where \(G(x,y,k)\) is the intensity value of pixel \((x,y)\) in the \(k\)th filtered image. A dark vein contour with intensity value 0 has the highest energy. The weights \(W_{k}\) for the FC-ELMs are assigned inversely, so that a component with larger energy receives a smaller weight.

$$\begin{aligned} W_{k}=\frac{1}{n-1} \cdot \left( {1-\frac{E_k}{{\sum _{k=1,2,\ldots ,n} E_k }}} \right) \end{aligned}$$
(23)
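The adaptive weighting of Eqs. (22)–(23) can be sketched as below; the code assumes 8-bit images (intensities in [0, 255]) and is an illustration, not the authors' implementation.

```python
import numpy as np

def component_weights(filtered_imgs):
    """Adaptive FC-ELM weights, Eqs. (22)-(23): components with
    larger energy (darker, more dominant veins) get smaller weights.

    filtered_imgs : list of n directional images G(x, y, k),
    assumed to hold 8-bit intensities (0 = darkest vein contour).
    """
    E = np.array([((255.0 - g) ** 2).sum() for g in filtered_imgs])  # Eq. (22)
    n = len(E)
    return (1.0 - E / E.sum()) / (n - 1)                             # Eq. (23)
```

By construction the weights sum to one, since \(\sum _k (1-E_k/\sum E)/(n-1)=(n-1)/(n-1)=1\), so no extra normalization is needed in Eq. (21).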
Fig. 9

Energy distribution of the dataset on the eight directional components

When the input features are the same, Eq. (10) showed that the ensemble ELM networks could decrease the squared error. Although the input features vary from component to component, the single-component test demonstrated that the matching accuracy of each component, more than 94 %, is sufficient. In the single-component test, \(\hbox {Component}_1\) to \(\hbox {Component}_8\) are evaluated with the basic ELM network on the dataset [3, 3] over five trials. The average training and testing results under hidden neuron tuning are shown in Fig. 10. We found that the components at \(0^{\circ },\,22.5^{\circ },\,135^{\circ }\), and \(157.5^{\circ }\) performed better than those at \(45^{\circ },\,67.5^{\circ },\,90^{\circ }\), and \(112.5^{\circ }\). In other words, the major veins contributed less to the matching than the minor veins, since most vein images contain the major veins, which decreases their uniqueness. This is also consistent with Shannon entropy, which can be interpreted as the degree of disorder or uncertainty.

To analyze the correlation of the directional components, leave-one-out tests were also performed, as shown in Fig. 11. The matching results for \(\hbox {Component}_i\) denote the matching based on all of the components except \(\hbox {Component}_i\). The results show that all of the components contribute to the matching, and the matching performance is degraded when any of the components is left out. \(\hbox {Component}_1\) contributed the most, since the matching performance is seriously degraded when it is removed. To evaluate the feature stability, the variances of six images per individual were computed for both the eight single-component features and the global features, as shown in Fig. 12. All of the component features have smaller variances than the global features, which means that the stability and robustness of the feature space are increased. The stabilities of \(\hbox {Component}_1,\,\hbox {Component}_2,\,\hbox {Component}_5\), and \(\hbox {Component}_8\) are improved by 20 % compared with the global features.

The selected component features thus provide sufficient accuracy for matching and higher stability than the global features. In addition, similar to the ensemble ELM networks, with randomly generated nodes for the eight component features, the FC-ELM network can improve its stability to a level comparable to that of the E-ELM (\(M=10\)) network, as shown in the “Performance of S-ELM, TER, E-ELM, FC-ELM, and EC-ELM” section.

Fig. 10

Matching performance of the eight single components

Fig. 11

Matching performance of the leave-one-component-out test

Fig. 12

Feature variance comparisons for the component features and global features

Proposed Ensemble Components-Based ELM

Compared with the ensemble ELM model, the proposed feature component-based ELM model is much smaller, since the size of the input features, the number of hidden neurons, and the number of S-ELM networks are all decreased substantially. To combine the advantages of the ensemble ELM model and the component-based ELM model, we propose the ensemble component-based ELM network (EC-ELM), shown in Fig. 13, in which the average of the FC-ELM outputs is used as the final decision. Assuming that the output of each FC-ELM network is \(f_c^{(j)}(X),\,j=1, \ldots , M\), the final output of the EC-ELMs, \(f_{\mathrm{EC}}(X)\), can be represented as:

$$\begin{aligned} f_{\mathrm{EC}}(X)= \sum _{j=1}^{M}w_j\cdot f_c^{(j)}(X) \quad \mathrm {where},\quad w_j=\frac{1}{M} \end{aligned}$$
(24)

where \(f_{\mathrm{EC}}(X)\) is the output of the whole system with input \(X\). The scale of the proposed EC-ELM model is smaller than that of the ensemble ELM model in Fig. 2b: although each FC-ELM module is larger than each S-ELM module, the number of modules, \(M\), participating in the ensemble operation is much smaller than in the ensemble ELM.

Fig. 13

Ensemble component-based extreme learning machines network

Experimental Results

Dataset

The dataset mainly used in this study is a public finger vein dataset covering 106 individuals. The Group of Machine Learning and Applications at Shandong University (SDUMLA) set up the homologous multi-modal traits dataset [36], which consists of face images, finger vein images, gait videos, iris images, and fingerprint images. Each individual was asked to provide images of the index, middle, and ring fingers of both hands, and the collection for each of the six fingers was repeated six times, yielding thirty-six finger vein images per individual. The finger vein dataset is thus composed of 3,816 images with a size of \(320\times {240}\) pixels.

To evaluate the effect of the proposed FC-ELMs and EC-ELMs, a new finger vein dataset of 1,000 images constructed by the Multi-Media Lab of Chonbuk National University (MMCBNU) is added for the evaluation in the “Performance of S-ELM, TER, E-ELM, FC-ELM, and EC-ELM” section. This dataset is composed of finger images from 100 individuals, with each finger imaged ten times.

Experimental Protocol

Due to the random initialization of the hidden neurons of the ELM classifier, and to obtain statistically reliable results, we use ten runs of tenfold stratified cross-validation for all final accuracy results. Two types of training and testing sets, [3, 3] and [5, 1], are generated randomly; the set [5, 1] means that five images per individual are employed for training and one image per individual for testing. The programs are run on a 3.00 GHz Intel Core 2 Quad processor using Matlab 7.0.1.

To optimize the proposed recognition system, the following measures are used to evaluate the performance:

  1. Vein contour extraction: As accurate contour images can improve the matching performance, the quality of the contour image is quantitatively evaluated using the matching performance.

  2. AAD feature extraction: To optimize the AAD feature, different block sizes are evaluated in terms of the matching accuracy and time consumption using the S-ELM network.

  3. Hidden neurons and the number of ELMs: For the four types of ELM classifier, S-ELM, E-ELMs (\(M=5, 10, 15, 20\)), FC-ELMs, and EC-ELMs (\(M=5, 10, 20\)), the matching performance, stability, and computational complexity of the finger vein recognition systems are evaluated with hidden neuron tuning.

  4. Genuine matching and imposter matching: The false acceptance rate (FAR) and false rejection rate (FRR) defined in Eqs. (25) and (26), respectively, are evaluated with \(634 \times (3804-6)=2{,}407{,}932\) impostor matches versus \(634 \times 5=3{,}170\) genuine matches.

  5. Comparison with existing finger vein recognition methods: The proposed recognition system is compared with the minutiae feature-based method [37], local binary pattern-based methods [38, 39], and the SVM-based method [16].

    $$\begin{aligned} \hbox {FAR}&= \frac{\hbox {Number\ of\ accepted\ imposter\ claims}}{\hbox {Total\ number\ of\ imposter\ accesses}}\end{aligned}$$
    (25)
    $$\begin{aligned} \hbox {FRR}&= \frac{\hbox {Number\ of\ rejected\ genuine\ claims}}{\hbox {Total\ number\ of\ genuine\ accesses}} \end{aligned}$$
    (26)
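Equations (25) and (26) amount to threshold counts over the match scores. A minimal sketch, assuming the convention that scores at or above the decision threshold are accepted:

```python
def far_frr(imposter_scores, genuine_scores, threshold):
    """FAR and FRR per Eqs. (25)-(26): the fraction of imposter claims
    accepted and of genuine claims rejected at a given decision
    threshold (scores >= threshold are accepted; assumed convention)."""
    far = sum(s >= threshold for s in imposter_scores) / len(imposter_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr
```

For example, `far_frr([0.1, 0.9], [0.8, 0.2], 0.5)` returns `(0.5, 0.5)`: one of two imposter claims is accepted and one of two genuine claims is rejected.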

Evaluation of Performance

Vein Contour Extraction Performance

The vein contour extraction performance is evaluated in terms of the matching accuracy using the S-ELM network. For the typical image enhancement methods mentioned in Fig. 6, a comparison of the matching performance is shown in Table 1. The guided filter achieved the highest accuracy.

Table 1 Matching performance comparison with the typical enhancement methods

AAD Feature Extraction

The AAD feature estimates the similarity between each pair of split blocks. The difference decreases as the local block size increases, so the characteristics of individuals become less distinctive. Conversely, a smaller block size describes the vein contour features in more detail, but incurs a larger computational burden. To choose a proper block size, block sizes of \(16\times {16}\) and \(32\times {32}\) are evaluated in terms of matching accuracy and time consumption using the S-ELM network. According to Table 2, the block size of \(16\times {16}\) performs better than the larger block size of \(32\times {32}\).
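Interpreting AAD as a block-wise average absolute deviation (a plausible reading; the paper's exact definition may differ), the block-size trade-off can be sketched as:

```python
import numpy as np

def aad_features(img, block=16):
    """Split the image into block x block tiles and take, for each tile,
    the mean absolute deviation from the tile's mean intensity. Smaller
    blocks yield more (finer) features at a higher computational cost."""
    h, w = img.shape
    feats = []
    for r in range(0, h - h % block, block):
        for c in range(0, w - w % block, block):
            tile = img[r:r + block, c:c + block].astype(float)
            feats.append(float(np.mean(np.abs(tile - tile.mean()))))
    return np.array(feats)
```

On a \(64\times 64\) contour image this yields 16 features at block size 16 but only 4 at block size 32, illustrating why the smaller block retains more individual detail.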

Table 2 Matching performance comparisons with block sizes of \(16\times {16}\) and \(32\times {32}\)

Performance of S-ELM, TER, E-ELM, FC-ELM, and EC-ELM

The matching performance, stability, and computational complexity of the finger vein recognition systems were evaluated for the four types of ELM classifier: S-ELM, E-ELMs (\(M=5, 10, 15, 20\)), FC-ELMs, and EC-ELMs (\(M=5, 10, 20\)). In Fig. 14, the matching performance under hidden neuron tuning is compared for the S-ELM and the E-ELM models with \(M=5,10,15\), and 20. For \(M=20\), the E-ELM networks achieve the best matching performance, with a score of 97.19 %, which is much higher than that of the S-ELM model. Regarding the training and testing times, it is worth mentioning that although the E-ELM can be constructed in parallel, the time consumption is measured in series, since the simulation is performed on a single computer. As shown in Fig. 15, the matching accuracy of the proposed FC-ELMs is 97.69 %, which is higher than that of the E-ELM networks, because the weights are adaptively assigned, according to the distribution of the vein directional components, to strengthen weak learners. The EC-ELM model with a small number (\(M=5\)) of FC-ELM networks provides slightly improved matching performance, reaching 97.75 %. Based on the results of the tuned hidden neuron tests in Figs. 14 and 15, the structural complexities of the optimal versions of the four types of ELM are compared in Table 3. While the S-ELM has the fewest nodes and the E-ELMs (\(M=20\)) the most, the FC-ELMs need less than 20 % of the nodes required by the E-ELMs (\(M=20\)). This shows that the FC-ELMs (and EC-ELMs) are superior to the E-ELMs in terms of structural complexity, since their basic networks are considerably smaller.
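For reference, the single-ELM building block shared by the S-ELM, E-ELM, FC-ELM, and EC-ELM variants can be sketched as a random, untuned hidden layer followed by a least-squares output layer. This is a minimal illustration of the standard ELM scheme, not the paper's exact implementation:

```python
import numpy as np

class ELM:
    """Minimal single-hidden-layer ELM: random input weights and biases,
    sigmoid hidden activations, output weights solved in closed form by
    the Moore-Penrose pseudo-inverse (no iterative tuning)."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid hidden-layer output matrix H
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, T):
        n_in = X.shape[1]
        self.W = self.rng.normal(size=(n_in, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T  # least-squares output weights
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

Because only `beta` is solved (in one linear step), training is fast; ensembles such as E-ELM simply train \(M\) such networks with different random seeds and combine their outputs.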

Fig. 14 Matching performance of the S-ELM and E-ELM when the number of ELM networks is \(M=5,10,15\), and 20: a training time, b testing time, c training accuracy, d testing accuracy

Fig. 15 Matching performance of the FC-ELMs and EC-ELMs when the number of FC-ELM networks is \(M=5,10\), and 20: a training time, b testing time, c training accuracy, d testing accuracy

Table 3 Comparison of the size of the input features and hidden neuron setting of the S-ELM, E-ELMs, FC-ELMs, and EC-ELMs models
Table 4 Matching performance for individual trials with respect to the dataset SDUMLA
Table 5 Matching performance for individual trials with respect to the dataset MMCBNU

To evaluate stability, ten trials of the tenfold stratified cross-validation test are performed for each network, and the standard deviations of the matching accuracy are shown in Tables 4 and 5 for the two testing datasets. Compared with the ELM method and the TER method [40], which minimizes the total error rate by adjusting class-specific normalization, the EC-ELMs show better stability. The E-ELM and FC-ELM networks improve stability by more than 50 % compared with the S-ELM. The FC-ELM network achieves the same stability as the E-ELMs (\(M=10\)), but with far fewer hidden neurons. From the evaluation of matching performance, stability, and complexity for the S-ELM, E-ELMs (\(M=5, 10, 15, 20\)), FC-ELMs, and EC-ELMs (\(M=5, 10, 20\)), the proposed FC-ELMs emerge as the optimal classifier among the four. Moreover, the statistical significance of the improvements was assessed with a paired t test between each pair of compared means at a significance level of 0.05, as shown in Tables 4 and 5.
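The paired t test for Tables 4 and 5 compares the per-trial accuracies of two classifiers over the same cross-validation runs. The statistic can be computed as below (a sketch; the value is then compared against the t distribution with \(n-1\) degrees of freedom at the 0.05 level):

```python
import math

def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length lists of per-trial
    accuracies (e.g. the ten cross-validation runs of two classifiers):
    t = mean(d) / sqrt(var(d) / n), with d the per-trial differences
    and var the sample variance (n - 1 denominator)."""
    n = len(a)
    d = [x - y for x, y in zip(a, b)]
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)
```

A large \(|t|\) (beyond the two-sided 0.05 critical value for \(n-1\) degrees of freedom) indicates that the accuracy difference between the two classifiers is statistically significant.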

Genuine Matching and Imposter Matching

The match score distributions of the two kinds of FC-ELMs are shown in Fig. 16. The \(X\)-axis represents the matching score, which is the final output decision value \(f_C(X)\) obtained from the FC-ELM network in Eq. (21), and the \(Y\)-axis its frequency. Genuine matches can be separated from imposter matches by a clear threshold for both the adaptive and average weighted networks. The adaptive weighted FC-ELMs provide a larger discrimination distance between the genuine and imposter matches than the average weighted FC-ELMs, and would therefore adapt better to a growing dataset.

The receiver operating characteristic (ROC) curve, a plot of the genuine acceptance rate (GAR = 1 \(-\) FRR) versus the FAR, is shown in Fig. 17. The adaptive weighted FC-ELMs are slightly superior to the average weighted FC-ELMs, achieving an FAR of 0.16 % and an FRR of 0.58 %.
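An ROC curve like Fig. 17 is traced by sweeping the decision threshold over the score range and plotting GAR \(=1-\) FRR against FAR at each setting. A minimal sketch, again assuming scores at or above the threshold are accepted:

```python
def roc_points(genuine, imposter, thresholds):
    """Trace ROC points: at each threshold, FAR is the fraction of
    imposter scores accepted and GAR = 1 - FRR is the fraction of
    genuine scores accepted (scores >= threshold are accepted)."""
    pts = []
    for t in thresholds:
        far = sum(s >= t for s in imposter) / len(imposter)
        gar = sum(s >= t for s in genuine) / len(genuine)
        pts.append((far, gar))
    return pts
```

Sweeping from the highest to the lowest observed score moves the operating point from the bottom-left toward the top-right of the ROC plane; the equal error rate is read off where FAR equals FRR.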

Fig. 16 Genuine and imposter match scores based on the adaptive and average weighted FC-ELM networks

Fig. 17 ROC curves of the adaptive and average weighted FC-ELM networks

Table 6 Performance comparison of the proposed and existing methods

Comparison with the Existing Methods

A comparison of the correct classification rate (CCR), training time, and testing time obtained from the minutiae feature-based methods [37], local binary pattern-based methods [38, 39], and the proposed directional feature-based methods is shown in Table 6. Based on the proposed features, several classifiers are compared including the modified Hausdorff distance [37], SVM [16], S-ELM, E-ELMs, FC-ELMs, and EC-ELMs. The results show that the proposed FC-ELMs achieve higher CCRs of 97.69 and 99.53 % for the [3, 3] and [5, 1] training and testing sets, and the EC-ELMs with \(M=5\) afford the highest CCRs of 97.75 and 99.60 % for the [3, 3] and [5, 1] training and testing sets, respectively. Although the testing time of EC-ELMs is higher than that of the S-ELM and FC-ELMs models, it is several hundred times less than that of the E-ELMs and the other matching methods.

Conclusions

This paper presented an efficient finger vein recognition system based on novel feature component-based ELM models. With adaptive weights assigned to the FC-ELMs, a high matching performance of CCR = 99.21 % is achieved with FAR = 0.16 % and FRR = 0.58 %, which is much better than those of the S-ELM, E-ELMs, SVM, and the other distance-based methods. Moreover, owing to the smaller input feature vectors, fewer hidden neurons, and fewer ELM networks, the FC-ELM model provides superior performance in terms of both recognition rate and matching speed, reaching 0.87 ms per image, which is satisfactory for real-time recognition. The FC-ELM and EC-ELM networks thus combine the stability of E-ELM networks with higher CCRs and lower computational complexity.