1 Introduction

As a burgeoning identity authentication technology, finger vein recognition has captured extensive attention from the biometric community. Compared with currently popular biometric identification technologies (e.g. fingerprint recognition, iris recognition, face recognition), finger vein recognition offers the following distinct merits: anti-counterfeiting, user friendliness, living-body recognition and high security for user information [2, 12, 34]. On the grounds of these merits, this recognition technology is regarded as a highly promising solution for individual identification. In practice, because the haemoglobin in blood vessels absorbs more near-infrared radiation than other substances in skin tissue, finger vein images are usually obtained by transmission of near-infrared (NIR) light (700 nm–900 nm) [15]. Unfortunately, since finger veins are located in the inner part of the skin tissue, it is quite challenging to capture clear edges between venous and non-venous regions. Moreover, given that biological tissue is a heterogeneous medium [23], NIR light scattering in skin tissue degrades the quality of finger vein images. Under these limiting conditions, finger vein recognition performance is not satisfactory. Hence, it is important to improve the visibility of finger vein features by removing scattering. To attack these problems, considerable effort has been devoted to improving image quality; the resulting methods fall into two categories, one based on enhancement technology and the other on restoration technology.

Enhancement-based methods

To improve the quality of degraded images, numerous methods based on image enhancement algorithms have been proposed. In [24], a histogram template equalization method was designed to improve the contrast of finger vein images to a certain extent. Fu et al. [5] combined fuzzy theory with the Retinex algorithm to enhance near-infrared images. Yang et al. [29,30,31] used different oriented filtering strategies to emphasize the texture of finger veins and yielded positive results. Shin et al. [18] adopted a four-direction Gabor filter and a Retinex filter to process two finger vein images respectively before fusing them. Though the above methods can enhance finger vein images in some respects, they do not address the critical role of light scattering in the image degradation process, which leads to limited improvement of finger vein visibility.

Restoration-based methods

Owing to scattering in skin tissue during imaging, the collected images suffer quality deterioration. In view of light transmission in human skin tissue, Lee et al. [9, 10] adaptively utilized a depth point spread function (D-PSF) and a constrained least squares (CLS) filter to restore the venous pattern. In [28], taking the optical properties of skin layers into account, a Gaussian-PSF model and two D-PSF models were developed to restore finger vein images step by step. However, for D-PSF-based image restoration, proper estimation of the biological parameters is an arduous task in practice. To settle this matter, Yang et al. [27, 32] further established a Biological Optical Model (BOM) to describe the process of finger vein image degradation, which facilitated detailed description and analysis of image blurring. Although these methods took the effect of light scattering into consideration and produced acceptable restoration results, the estimation of the biological optical model parameters remained laborious, limiting their usage in practice.

In view of the deficiencies of these methods, we propose a novel and efficient CNN-based finger vein restoration method that is capable of restoring images end-to-end. The contributions of this paper are summarized as follows:

  1.

    Instead of separately estimating the non-scattered transmission matrix and the intensity of scattered radiation as most previous methods did, an improved biological optical model is presented that integrates the non-scattered transmission matrix and the intensity of scattered radiation into one predictor variable.

  2.

    Furthermore, to remove scattering effectively, an end-to-end finger vein image scattering removal model, abbreviated FVSR-Net, is designed with the assistance of a light-weight CNN.

  3.

    Compared with other representative methods, our proposed model overcomes light scattering interference and achieves better visual performance and a lower equal error rate (EER) in finger vein recognition.

The rest of this paper is organized as follows. Section 2 introduces the process of finger vein degradation and the establishment of the biological optical model. In Section 3, the proposed improved biological optical model and the FVSR-Net framework are presented in detail. We then report the experimental results in Section 4, and conclude the paper in Section 5.

2 Related work

Human skin is composed of three layers, the epidermis, dermis and subcutaneous tissue, and it can be regarded as a collection of absorbing and scattering particles. NIR light propagating through human skin tissue is refracted, absorbed and scattered by these particles, so we cannot assume that light propagates in a straight line without scattering. Actually, from a bio-photonics perspective, transmitted light consists of ballistic photons, snake photons and diffuse photons, as shown in Fig. 1. These three kinds of photons propagate through skin tissue in different ways. Ballistic photons travel in a straight line through the medium. Snake photons undergo slight scattering, but still propagate in a forward or near-forward manner. Diffuse photons undergo multiple scattering and propagate randomly. Consequently, NIR light in skin tissue inevitably suffers a series of multiple scattering events, which is the main cause of blurred finger vein images.

Fig. 1
figure 1

The motion of photons when light travels through skin tissue

To depict the effects of direct attenuation and scattering attenuation on finger vein imaging, our previous work [32] capitalized on biological optics and introduced the finger vein image scattering removal model. The biological optical model developed in [32] can be described as

$$ I(x)={I}_0(x)T(x)+\left(1-T(x)\right){I}_r(x) $$
(1)

where the vector x represents the pixel coordinates [x, y], I(x) denotes the captured finger vein image, and I0(x) is the scatter-free latent finger vein image to be recovered. Ir(x) represents the local background illumination map, whose value is bound up with the local optical properties of skin tissues. T(x) is the non-scattered transmission map, which can be further expressed as

$$ T(x)={e}^{-\mu D(x)} $$
(2)

where μ is the scattering coefficient of skin tissue and D(x) is the depth of the object in the skin layer. The first term of (1), I0(x)T(x), represents the direct attenuation component of the incident light in skin tissues. The second term of (1), (1 − T(x))Ir(x), approximates the scattering component arising from skin tissues. As shown in Fig. 2, under the combined effect of direct attenuation and scattering attenuation, the collected finger vein image appears blurred and degraded. Given Ir(x) and T(x), I0(x) can easily be reconstructed from (1).
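As a numerical illustration, the degradation process of (1) and (2) can be sketched as follows; all arrays and parameter values here are synthetic stand-ins, and the transmission is written in its decaying Beer–Lambert form:

```python
import numpy as np

# Synthetic scatter-free image I0, vein depth map D, and background
# illumination Ir (all values illustrative, normalized to [0, 1]).
rng = np.random.default_rng(0)
I0 = rng.uniform(0.0, 1.0, size=(100, 200))
D = rng.uniform(0.5, 2.0, size=(100, 200))   # depth of veins in the skin layer
Ir = np.full((100, 200), 0.6)                # local background illumination
mu = 0.8                                     # scattering coefficient of skin tissue

# Eq. (2): non-scattered transmission map (deeper veins transmit less light).
T = np.exp(-mu * D)

# Eq. (1): observed image = direct attenuation + scattering component.
I = I0 * T + (1.0 - T) * Ir
```

Because the observed image is a pixel-wise blend of I0 and Ir weighted by T, it stays within the intensity range of its inputs.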

Fig. 2
figure 2

Schematic diagram of light substance interaction in skin tissue

Conventional finger vein image restoration methods put the emphasis on estimating T(x) and Ir(x) separately. For instance, [32] coped with scattering removal by proposing the biological optical model (BOM): Ir(x) was computed by considering the interaction of particles around the object, and T(x) was estimated from the last term of (1) using the method of [20]. After estimating Ir(x) and T(x), I0(x) can be calculated with little hindrance. To describe the forward probability of the scattered energy more representatively, [27] further improved BOM by multiplying the last term of (1) by a weighting factor α1, yielding the Weighted Biological Optical Model (WBOM); its estimation of Ir(x) and T(x) was based on anisotropic diffusion and Gamma correction. Although the approaches mentioned above improved visual and recognition performance to a certain extent, they did not explore a one-step model in which T(x) and Ir(x) are integrated into one variable. To improve upon and augment the previous work, we take advantage of a multi-scale CNN to restore finger vein images.

3 The proposed restoration method

In this section, the proposed FVSR-Net is expounded. The first subsection introduces the improved biological optical model, which enables end-to-end output of restored images. The second subsection presents the architecture and parameter settings of FVSR-Net in detail.

3.1 Improved biological optical model

As explained in the related work, it is important to choose parameter estimation methods appropriately according to the characteristics of different datasets. Based on the BOM in (1), the restored finger vein image can be generated by

$$ {I}_0(x)=\frac{1}{T(x)}I(x)-\frac{\left(1-T(x)\right){I}_r(x)}{T(x)} $$
(3)

The traditional methods put forward in [27, 32] are incapable of directly minimizing the restoration error on I0(x). Such indirect optimization causes errors to accumulate, or even be amplified, when T(x) and Ir(x) are combined to calculate I0(x) in (3). Under this circumstance, a new variable E(x), defined in (5), is adopted to serve as a bridge connecting T(x) and Ir(x). In addition, the existence of E(x) makes it possible to minimize restoration errors directly in the pixel domain. Based on the description above, the BOM in (3) can be re-expressed as:

$$ {I}_0(x)=E(x)I(x)-E(x)+a $$
(4)

where

$$ E(x)=\frac{\frac{1}{T(x)}\left(I(x)-{I}_r(x)\right)+{I}_r(x)-a}{I(x)-1} $$
(5)

In this re-expressed formula, E(x) is regarded as the estimation map of the finger vein image, and it can roughly outline the veins; a is a constant bias with a default value of 1. Moreover, the joint estimation of T(x) and Ir(x) allows the two to constrain each other mutually, giving rise to a more credible estimation map. We then concentrate on constructing an input-adaptive CNN model, whose weights change with the input finger vein images, so that the model can minimize the restoration error between the output I0(x) and the ground truth. To justify the importance and effectiveness of jointly estimating T(x) and Ir(x), we conduct a comparative experiment against the baseline [32]: the baseline estimates T(x) and Ir(x) in separate steps, while the proposed method learns them together in E(x). As observed in Fig. 3, the restoration performance of the baseline is overshadowed by FVSR-Net. From the red boxes marked in Fig. 3 (b), it can be seen that parts of the veins are even broken owing to error accumulation, which precisely emphasizes the importance of joint estimation.
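The algebraic equivalence of the one-step form (4)–(5) and the two-step form (3) can be checked with a short sketch; all maps below are synthetic stand-ins, not values estimated from real images:

```python
import numpy as np

rng = np.random.default_rng(1)
I = rng.uniform(0.1, 0.9, size=(8, 8))   # observed image (synthetic)
T = rng.uniform(0.2, 0.9, size=(8, 8))   # transmission map (synthetic)
Ir = rng.uniform(0.3, 0.7, size=(8, 8))  # background illumination (synthetic)
a = 1.0                                  # constant bias, as in the paper

# Eq. (3): two-step restoration from T and Ir separately.
I0_direct = I / T - (1.0 - T) * Ir / T

# Eq. (5): the bridging variable E(x) (note I < 1 here, so I - 1 != 0).
E = ((I - Ir) / T + Ir - a) / (I - 1.0)

# Eq. (4): one-step restoration through E(x).
I0_via_E = E * I - E + a
```

Expanding E(x)(I(x) − 1) + a recovers exactly the right-hand side of (3), which the sketch confirms numerically.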

Fig. 3
figure 3

Visual comparison between FVSR-Net using (4) and baseline using (3). (a) Input images from Dataset A (to be introduced in Section 4.1); (b) Baseline using (3); (c) FVSR-Net using (4)

3.2 End-to-end scattering removal network

As shown in Fig. 4 (a), FVSR-Net comprises two parts: an E-Net that estimates E(x) from the input image I(x), and a degraded-image restoration module that outputs the restored image end-to-end. The architecture of E-Net is depicted in Fig. 4 (b). It is the critical module of FVSR-Net, responsible for estimating the relative scattering level and generating the estimation map. Our network design is based on the following considerations:

Fig. 4
figure 4

The framework and configuration of FVSR-Net. (a) The framework of FVSR-Net; (b) E-net architecture used in FVSR-Net

Early fusion of different feature maps

Classic works such as DenseNet [7] and U-Net [17] show that fusing features of different scales is a vital means of improving image processing performance. To enable E-Net to take the semantics of the whole input image into account and impose coherence among local structures, we merge the features extracted at different scales into more discriminative fused features. Inspired by DenseNet [7], E-Net includes 3 dense blocks, each made up of two convolutions followed by a concatenation layer. The concatenation layers are then convolved with larger convolutional kernels to gain richer semantic information.
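A minimal PyTorch sketch of such a dense block is given below; the channel counts and kernel sizes are illustrative, not the exact Table 1 configuration:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Schematic dense block: two convolutions whose outputs are
    concatenated with the block input, so later layers can mine
    features from non-adjacent layers."""
    def __init__(self, in_ch=8, growth=8):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(in_ch, growth, 3, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(growth, growth, 3, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.conv1(x)
        f2 = self.conv2(f1)
        # Concatenation layer: fuse the input with both feature maps.
        return torch.cat([x, f1, f2], dim=1)

x = torch.randn(1, 8, 100, 200)
y = DenseBlock()(x)   # output channels: 8 (input) + 8 (f1) + 8 (f2) = 24
```

In the full E-Net the concatenated output would then feed a larger-kernel convolution, as described above.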

Various filter sizes design

In addition, recent attention has been paid to the use of various filter sizes within one CNN. In [16], coarse-scale feature maps are fed to a fine-scale network to generate a refined transmission map, and the inception architecture in GoogLeNet [19] adopts parallel convolutions with a variety of filter sizes. Since the finger vein pattern is a complex texture feature containing many detailed vein branches, employing various filter sizes enriches the receptive fields when features are extracted at different layers. Inspired by these works, E-Net builds multi-scale features by applying convolutional filters of different sizes, each followed by a Rectified Linear Unit (ReLU).

Light-weight structure

Moreover, to improve training speed and accelerate convergence in practice, a light-weight structure is adopted. We introduce 1 × 1 convolution kernels to decrease the number of parameters without losing network performance, and employ only ten convolutional layers in E-Net (the specific parameter settings of the 10 convolutional layers are listed in Table 1). The later experiments show that this design achieves good results in fewer epochs and less time.
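The parameter saving from a 1 × 1 bottleneck can be illustrated with a quick count; the channel numbers here are hypothetical, not taken from Table 1:

```python
# Parameter count of a conv layer: k*k*C_in*C_out weights (+ C_out biases).
def conv_params(k, c_in, c_out, bias=True):
    return k * k * c_in * c_out + (c_out if bias else 0)

# Reducing 64 channels to 16 with a 1x1 bottleneck before a 3x3 conv
# is far cheaper than a single wide 3x3 conv over all 64 channels.
wide = conv_params(3, 64, 64)                               # 36928 params
bottleneck = conv_params(1, 64, 16) + conv_params(3, 16, 64)  # 10320 params
```

The bottleneck variant uses less than a third of the parameters while preserving the spatial receptive field of the 3 × 3 stage.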

Table 1 Network parameter setting details

Overall, such a design brings good practical value. It not only compensates for the spatial information lost during convolution, but also fully mines features from non-adjacent layers. The whole-image information is likewise fully utilized in the transmission process, which guarantees a good restoration effect.

Besides, to further reveal the merits of the multi-scale design, we construct a simple CNN model without any concatenation, i.e. “CONV1 → CONV2 → CONV3 → ⋯ → CONV10” (with the same parameter settings as in Table 1). As Fig. 5 shows, it is difficult to estimate E(x) with a CNN structure lacking concatenation; in Fig. 5 (b) especially, the boundaries between veins and background are even more blurred than in the original images.

Fig. 5
figure 5

Visual comparison of CNN without concatenation and FVSR-Net. (a) Original images; (b) Restored images using CNN without concatenation; (c) Restored images using FVSR-Net

In addition, although CNNs excel at extracting image features, in our task the assistance of the improved BOM is also indispensable. We therefore compare two approaches: a CNN model without the improved BOM and FVSR-Net based on the improved BOM. As shown in Fig. 6, FVSR-Net achieves a more complete restoration of the original image, and a purer background is obtained with our method. This confirms that the assistance of the improved BOM is effective and necessary.

Fig. 6
figure 6

Visual comparison of CNN without improved BOM and FVSR-Net. (a) Original images; (b) Restored images using CNN without improved BOM; (c) Restored images using FVSR-Net

4 Experiments and results

In this section, we demonstrate the effectiveness of the proposed method through several experiments and compare it with other current approaches. The experimental results indicate that FVSR-Net performs satisfactorily in scattering removal and finger vein recognition.

4.1 Datasets

  (1)

    Dataset A used in our experiments is collected from a lab-made finger vein image acquisition system with a 760 nm NIR LED array source, as shown in Fig. 7 (a); Region-of-Interest (ROI) images are then obtained from the original images by the localization and segmentation method proposed in [25]. Figure 7 (b) shows some preprocessed samples from Dataset A. The homemade dataset includes a total of 5850 finger vein images from 585 individuals, with 10 images per individual. We randomly divide Dataset A into two parts: 5270 finger vein images for training and 580 for validation.

Fig. 7
figure 7

Finger vein image acquisition. (a) Our lab-made finger vein image acquisition system; (b) Samples of lab-made dataset

  (2)

    Dataset B is from the Shandong University finger vein dataset [33] (abbreviated as the SDU dataset). Here, 106 volunteers contributed finger vein images of the index, middle and ring fingers of both hands. In one session, each finger provides 6 images, so 3816 images were obtained in total. To test the proposed FVSR-Net, we randomly select 100 fingers, for a total of 600 finger vein images. ROI images of size 200 × 100 are then obtained via the same method as for Dataset A.

4.2 The acquirement of approximate ground truth

As a matter of fact, it is practically impossible to obtain high-visibility venous images, as described in Section 1. Consequently, unlike other degraded-image restoration tasks [1], the troublesome problem in recovering finger vein images is acquiring reliable ground truth corresponding to the collected images. Considering that the vein backbone and branches are affected comparably by scattering when incident light propagates through skin tissue, we can reasonably regard the vein backbone as the approximate ground truth. As depicted in Fig. 8, extracting the finger vein backbone involves several steps.

Fig. 8
figure 8

The process of acquiring finger vein backbone

First, the original finger vein images are preprocessed to determine stable vein backbone regions before the vein backbone is extracted. As shown in Fig. 9 (b), the vein backbone regions are faintly visible after preprocessing, but are still surrounded by a large amount of noise. The band-pass property of the Gabor wavelet and its direction selectivity make the Gabor transform well suited to noise reduction and local feature analysis in image processing. Hence, finger vein image enhancement is then performed by even-symmetric Gabor filters [26], which can be expressed as:

$$ {G}_{mk}^e\left(x,y\right)=\frac{\gamma }{2{{\pi \sigma}_m}^2}\exp \left\{-\frac{1}{2}\left(\frac{x_{\theta_k}^2+{\gamma}^2{y}_{\theta_k}^2}{{\sigma_m}^2}\right)\right\}\times \left(\cos \left(2\pi {f}_m{x}_{\theta_k}\right)-\exp \left(-\frac{v^2}{2}\right)\right) $$
(6)

where m is the scale index and k is the orientation index; m and k are set to 3 and 8, respectively. Finally, more stable vein backbone regions with less noise are obtained, as shown in Fig. 9 (c).
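A sketch of the even-symmetric Gabor filter bank of (6) is shown below; the kernel size and the σ, f and v values are illustrative defaults rather than the settings used in our experiments:

```python
import numpy as np

def even_gabor_kernel(size=15, sigma=3.0, gamma=1.0, f=0.1, theta=0.0, v=2.0):
    """Even-symmetric Gabor kernel following Eq. (6); parameter values
    are illustrative defaults, not the paper's settings."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates to orientation theta.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = (gamma / (2 * np.pi * sigma**2)) * np.exp(
        -0.5 * (x_t**2 + gamma**2 * y_t**2) / sigma**2)
    # The exp(-v^2/2) term suppresses the DC response of the cosine carrier.
    return envelope * (np.cos(2 * np.pi * f * x_t) - np.exp(-v**2 / 2))

# A bank of m = 3 scales and k = 8 orientations, as in the paper.
bank = [even_gabor_kernel(theta=k * np.pi / 8, sigma=2.0 + m)
        for m in range(3) for k in range(8)]
```

Each image would be convolved with every kernel in the bank and the oriented responses combined to enhance the vein texture.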

Fig. 9
figure 9

Ground truth acquisition. (a) Original images; (b) The results after preprocessing; (c) Finger vein backbone

What calls for special attention is that the ground truth obtained by the above method is approximate rather than completely accurate. In point of fact, finger vein images suffer many disturbances during imaging, such as uneven brightness, position variations, and interference from other substances in skin tissue. These interferences are random, unavoidable and unstable, and finally appear as texture branches after image processing, as observed in Fig. 9 (c).

4.3 Implementation details

For training FVSR-Net, we adopt Adam [8] as the optimizer with a weight decay of 0.0001 and a momentum of 0.9. The learning rate and batch size are set to 0.001 and 4, respectively. The network weights are initialized with Gaussian random variables. We also clip gradients to restrict their values to [−0.1, 0.1]. The FVSR-Net model is trained for 10 epochs with the PyTorch framework on an NVIDIA GTX 1080Ti GPU with an Intel Xeon(R) Silver 4110 CPU @ 2.10 GHz and 32 GB RAM.
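The stated training configuration can be sketched in PyTorch as follows; the one-layer model and random tensors are stand-ins for FVSR-Net and the real data:

```python
import torch
import torch.nn as nn

# Stand-in model; the real network would be FVSR-Net's E-Net.
model = nn.Conv2d(1, 1, 3, padding=1)
# Adam with lr = 0.001, weight decay = 0.0001; beta1 = 0.9 plays the
# role of the stated momentum.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001,
                             betas=(0.9, 0.999), weight_decay=0.0001)

x = torch.randn(4, 1, 100, 200)       # batch size 4
target = torch.randn(4, 1, 100, 200)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), target)
loss.backward()
# Clip each gradient element to [-0.1, 0.1] before the update step.
torch.nn.utils.clip_grad_value_(model.parameters(), 0.1)
optimizer.step()
```

One such step would be repeated over all training batches for the stated 10 epochs.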

Mean Square Error (MSE) [37] is often applied as a loss function, but a traditional MSE-based loss cannot meet the demand of assessing images in a way that imitates the human visual system. Considering that finger vein image restoration is a real-world application, a perceptually motivated metric should be employed in addition to MSE. The Structural Similarity Index (SSIM) [21] is a perception-based model that is well suited to evaluating the perceptual quality of images. To produce visually pleasing finger vein images, we combine SSIM and MSE as the loss function. In our experiments, given a training sample patch P, the MSE loss function can be written as:

$$ {\mathrm{\ell}}^{MSE}(P)=\frac{1}{N}{\sum}_{p\in P}{\left[I(p)-K(p)\right]}^2 $$
(7)

where N is the number of pixels in patch P, p is the index of pixel [x, y], and I(p) and K(p) are the pixel values of the generated finger vein image and the ground truth image, respectively. The SSIM formula and SSIM loss function are defined as:

$$ SSIM(P)=\frac{2{\mu}_x{\mu}_y+{C}_1}{\mu_x^2+{\mu}_y^2+{C}_1}\cdot \frac{2{\sigma}_{xy}+{C}_2}{\sigma_x^2+{\sigma}_y^2+{C}_2} $$
(8)
$$ {\mathrm{\ell}}^{SSIM}(P)=\frac{1}{N}{\sum}_{p\in P}1- SSIM(p) $$
(9)

where \( {\mu}_x=\frac{1}{N}\sum \limits_{p=1}^N{x}_p \) is the mean luminance of x, \( {\sigma}_x^2=\frac{1}{N}\sum \limits_{p=1}^N{\left({x}_p-{\mu}_x\right)}^2 \) is the variance of x, and \( {\sigma}_{xy}=\frac{1}{N-1}\sum \limits_{p=1}^N\left({x}_p-{\mu}_x\right)\left({y}_p-{\mu}_y\right) \) is the covariance of x and y; the constants C1 and C2 prevent the denominators from being 0. Combining the MSE loss function and the SSIM loss function, the final loss function used in this task can be written as:

$$ {L}_{loss}={\mathrm{\ell}}^{MSE}(P)+{\mathrm{\ell}}^{SSIM}(P) $$
(10)

4.4 Finger vein images restoration

After training FVSR-Net, the restoration results for degraded finger vein images are shown in Fig. 10. The estimation maps in Fig. 10 (b) outline the vein backbone roughly, and the corresponding extracted vein backbones in Fig. 10 (d) are interconnected and abundant. We also observe that the scattering effect is suppressed effectively, and the contrast between venous regions and background is increased notably.

Fig. 10
figure 10

Finger vein image restoration. (a) Original images I(x); (b) Estimation map E(x); (c) E(x)I(x); (d) The restored images I0(x); (e) Approximate ground truth

Qualitative visual comparison results

As illustrated in Fig. 11, we evaluate FVSR-Net against several representative approaches for restoring finger vein images: the biological optical model (BOM) [32] and the weighted biological optical model (WBOM) [27]. Additionally, a dehazing-based restoration method (AOD-Net) [11] is directly applied to deblur finger vein images, simply treating degraded finger vein images as hazy images. As displayed in Fig. 11 (b), restoration results based on BOM are somewhat sensitive to noise, even though BOM performs well on the recovery task. Figure 11 (c) shows better restoration performance than Fig. 11 (b), but compared with Fig. 11 (d), scattering residue in Fig. 11 (c) is still evident. The dehazing-based method in Fig. 11 (d) removes scattering satisfactorily, but the contrast between the finger vein region and background can still be improved. Overall, the experimental results in Fig. 11 (e) imply that the proposed FVSR-Net is superior to these methods in terms of finger vein information preservation and scattering removal, and is more visually faithful to the ground truth.

Fig. 11
figure 11

Image restoration comparison. (a) Original images; (b) BOM [32]; (c) WBOM [27]; (d) AOD-Net [11]; (e) The proposed method

In practical application, we find that the WBOM algorithm does not perform well in restoring strongly scattered and low-quality images, even though it yields pleasing results on weakly scattered images. To verify the advantage of FVSR-Net, five low-quality finger vein images with low contrast and serious distortion are selected in Fig. 12 (a). The restoration results of the BOM and WBOM methods are shown in Fig. 12 (b) and Fig. 12 (c). It is not hard to see that the images output by BOM and WBOM still make it difficult to distinguish venous regions from the background, which undoubtedly leads to inferior recognition performance. It is particularly worth mentioning that the proposed FVSR-Net not only obtains better restoration results for weakly scattered images, but also makes up for the inadequacy of traditional algorithms in restoring strongly scattered finger vein images, as shown in Fig. 12 (e).

Fig. 12
figure 12

Low quality image restoration comparison. (a) Original images; (b) BOM [32]; (c) WBOM [27]; (d) AOD-Net [11]; (e) The proposed method

Quantitative comparison results

Beyond the previous visual comparisons, PSNR [13], SSIM and Scoot results for restoration image quality are reported in Tables 2 and 3. Since FVSR-Net is trained under the MSE and SSIM losses, it attains higher PSNR and SSIM than the other methods. Scoot [4] is a state-of-the-art perceptual metric that simultaneously considers block-level spatial structure and co-occurrence texture statistics, so we also adopt Scoot to evaluate the restoration performance of FVSR-Net. More attractively, FVSR-Net still maintains a clear Scoot advantage over the other competitors, even though Scoot is not directly used as an optimization criterion.

Table 2 Average PSNR, SSIM and Scoot results on Test Set A
Table 3 Average PSNR, SSIM and Scoot results on Test Set B

4.5 Running time comparison

Moreover, running-time comparisons between the proposed method and other typical restoration methods are presented. The first image of each of the 100 individuals in Dataset B is selected for the time cost comparison, and the average time for restoring the 100 images with the various approaches is given in Table 4. The time cost experiments for BOM and WBOM are performed in MATLAB R2014a on a PC with an i5-4590 CPU @ 3.30 GHz and 4 GB RAM; those for AOD-Net and FVSR-Net are performed with the PyTorch framework on an Intel Xeon(R) Silver 4110 CPU @ 2.10 GHz with 32 GB RAM, without GPU acceleration. As displayed in Table 4, restoring one image with FVSR-Net takes much less time than with the traditional methods, owing to the light-weight design of E-Net. Meanwhile, the time consumption of our restoration approach is slightly larger than that of AOD-Net, plausibly because FVSR-Net contains more convolutional layers. On the whole, it still meets real-time requirements and is acceptable in practical applications.

Table 4 The average restoration time of 100 images in Dataset B (image size: 200 × 100)

4.6 Matching test

To further demonstrate that the proposed method also improves recognition accuracy, a simple but effective matching method (the matrix matching method) is employed in our experiments. The matrix matching method calculates the correlation coefficient between two matrices, where the similarity of two matrices is expressed as

$$ {M}_s=\frac{\sum \limits_{j=1}^m\sum \limits_{k=1}^n\left(A\left(j,k\right)-\overline{A}\right)\left(B\left(j,k\right)-\overline{B}\right)}{\sqrt{\left(\sum \limits_{j=1}^m\sum \limits_{k=1}^n{\left(A\left(j,k\right)-\overline{A}\right)}^2\right)\left(\sum \limits_{j=1}^m\sum \limits_{k=1}^n{\left(B\left(j,k\right)-\overline{B}\right)}^2\right)}} $$
(11)

where \( \overline{A} \) and \( \overline{B} \) represent the means of matrices A and B, respectively. To assess finger vein recognition performance, 1000 finger vein images of 100 fingers from Dataset A form test set A, and 600 finger vein images of 100 fingers from Dataset B form test set B. The ROC (Receiver Operating Characteristic) curves of the different restoration methods are plotted in Fig. 13 (a) and Fig. 13 (b); the x-axis is the FAR (False Acceptance Rate), the y-axis is the FRR (False Rejection Rate), and the EER (Equal Error Rate) is the error rate at which FAR and FRR are equal. Table 5 lists the EERs corresponding to Fig. 13 (a) and Fig. 13 (b). We can clearly observe from Fig. 13 and Table 5 that the proposed FVSR-Net achieves the lowest EER and the best recognition performance, indicating that FVSR-Net represents finger vein features reliably and effectively.
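Equation (11) is the Pearson correlation of the two matrices and can be sketched as follows (the probe and gallery images are synthetic stand-ins):

```python
import numpy as np

def matrix_match_score(A, B):
    """Eq. (11): correlation coefficient between two matrices."""
    A = A - A.mean()
    B = B - B.mean()
    return np.sum(A * B) / np.sqrt(np.sum(A**2) * np.sum(B**2))

rng = np.random.default_rng(2)
probe = rng.uniform(size=(100, 200))    # synthetic "restored" image
gallery = rng.uniform(size=(100, 200))  # synthetic enrolled template
```

A matched pair yields a score near 1, while unrelated images yield a score near 0, so thresholding the score gives the accept/reject decision behind the ROC curves.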

Fig. 13
figure 13

ROC curves of different finger-vein restoration results. ROC curve (a) belongs to the result of test set A; ROC curve (b) belongs to the result of test set B

Table 5 Equal Error Rates (%) of different restoration methods

In Section 4.2, we noted that the ground truth employed in this paper is approximate rather than completely accurate, because it contains some unstable, random interference branches. During training, it is hard for the convolutional filters in a CNN to find a uniform response that outputs those unstable branches. By contrast, the vein backbone is more reliable and robust, and can therefore be output stably and distinctly. More importantly, finger vein recognition performance is determined to a large extent by the robust vein backbone; randomly distributed, unstable branches do not contribute to recognition, but rather disrupt the structure of the vein backbone and thereby hinder recognition performance. To further illustrate this, we randomly select 2000 finger vein images (200 fingers × 10 images from Dataset A) and 1000 finger vein images (100 fingers × 10 images from Dataset A) for a recognition experiment. From the results shown in Fig. 14, finger vein images restored by FVSR-Net achieve lower EER values and better recognition performance.

Fig. 14
figure 14

ROC curves of proposed method and ground truth. ROC curve (a) belongs to 1000 finger vein images (100 fingers×10 images per finger in Dataset A); ROC curve (b) belongs to 2000 finger vein images (200 fingers×10 images per finger in Dataset A)

5 Conclusion

In this study, we propose an end-to-end convolutional neural network named FVSR-Net to address important problems in finger vein image restoration. First, based on the biological optical model, an improved biological optical model is put forward to estimate all parameters without any intermediate step and to output the restored image end-to-end. Then, we apply a multi-scale CNN to the finger vein scattering removal task and extract clear vein backbone features. Experimental results indicate that the proposed method obtains better visual and recognition performance. Finally, FVSR-Net not only works well in restoring weakly scattered images, but also achieves favorable restoration results on strongly scattered and low-quality images.

Moreover, there is room for improvement in future work: (1) Our proposed method is based on the Biological Optical Model, which is a simplified and idealized physical model, whereas the transformation between the blurred image and the ground truth may be highly nonlinear. Whether the design of the image restoration algorithm must rely on this physical model deserves further exploration. (2) Introducing attention-based models [3, 14, 22, 35, 38] may also help acquire more reliable results. For example, it is known that the estimation accuracy at one scale of a multi-scale design affects the next scale; inspired by [39], introducing a channel-wise attention mode could alleviate this multi-scale bottleneck. (3) How the ground truth is acquired affects restoration results to some degree. As such, training on raw images without ground truth via unsupervised [6] or weakly supervised attention models [36] is worth further investigation.