1 Introduction

Face recognition technology has been widely used in many fields. In practice, however, captured face images may suffer from variations caused by uncontrolled acquisition conditions such as illumination, expression and angle variations and glasses occlusion, which degrade the performance of face recognition systems.

Researchers have spared no effort to develop robust and effective face recognition methods, and texture-feature-based approaches have attracted considerable attention. The texture information of a face image captures the intensity distribution (i.e. the gray values of a grayscale image) of facial structures such as wrinkles, bumps and dents. Eleyan and Demirel used the grey level co-occurrence matrix (GLCM) operator to describe image texture in terms of the intensity distribution and the relative position of neighborhood pixels [1]. Yu et al. [2] enhanced this method by combining GLCM with a weighted Euclidean distance. The local binary pattern (LBP) operator [3] has been widely used to extract texture details of face images [4, 5], but the computed textures become inaccurate when the image resolution changes. Zhang et al. [6] proposed the high-order local derivative pattern (LDP) descriptor, which encodes distinctive spatial relationships in given regions rather than the relationship between the central point and its neighbors used in LBP; this method is claimed to be more effective than LBP, but the size of the extracted features is large. Mehta et al. [7] extracted directional and textural features by applying a modified LBP operator to optimized directional faces. LBP mainly describes fine texture details and therefore characterizes shape poorly over broader scales.

The wavelet transform has been widely used to extract local texture information owing to its spatial-frequency localization, multi-scale and multi-orientation properties [8]. The Gabor function, which attains the uncertainty-principle limit in the time–frequency domain, is used as the wavelet basis function of the 2D wavelet transform, so the 2D Gabor wavelet transform readily achieves the best resolution in the time–frequency domain [9]. With its multi-scale and multi-orientation properties, the 2D Gabor wavelet has been widely used for face recognition [10–23]. Since both the magnitude and the phase of the Gabor feature contain rich information, several related approaches and fusion methods have been proposed [11, 13, 14, 16, 17, 20, 22, 24]. Yu et al. [19] combined Gabor magnitude-based and Gabor phase-based texture representations to exploit the Gabor feature more fully. Xia et al. [16] applied block Gabor directed derivative layer local radii-changed binary patterns (BG2D2LRP) to capture static texture differences and dynamic contour trends; the BG2D2LRP feature extracts more information that is insensitive to expression interferences. However, the Gabor feature describes appearance details in high-frequency regions poorly.

It is worth noting that most of these approaches, such as local Gabor binary patterns [13], learned local Gabor patterns [21] and the histogram of Gabor phase patterns [14, 23], combine the Gabor wavelet with other texture descriptors (e.g. LBP) to improve efficiency [13, 14, 16, 18, 20–23]. Gabor and LBP features have also been fused to gain better performance than either alone [18]. The methods mentioned above fall into two categories: one fuses the Gabor feature with local-pattern features, and the other uses local patterns to encode Gabor-filtered face images. The former captures most facial information to a certain extent, while the latter improves the representation capability of local patterns at multiple scales and orientations.

NSCT has the characteristics of shift-invariance, multi-scale and multi-orientation analysis, and good directional frequency localization [25]; it therefore performs favorably in capturing the contour structure of a signal. Compared with the wavelet transform, the support interval of the NSCT basis function has an elongated structure whose aspect ratio changes with scale. NSCT is anisotropic and can represent texture information with fewer coefficients. Xu et al. [26] applied NSCT to capture the facial contour information of face images and used a support vector machine (SVM) to learn and classify the NSCT features; the effectiveness of this approach was verified by Wang [27]. Xie et al. [28] introduced the logarithm transform into NSCT and proposed the logarithm nonsubsampled contourlet transform: the face image is first logarithm-transformed, then decomposed by NSCT into low-frequency and high-frequency subbands, and finally the illumination invariant is obtained by applying the inverse NSCT to the high-frequency subbands processed with Bayes shrink. The illumination-invariant extraction used by Cheng et al. [29, 30] is similar to that of [28]; the difference is that Cheng et al. [29, 30] process the high-frequency subbands with adaptive normal shrink to obtain the illumination invariant. However, both approaches consider only the high-frequency subbands, whereas the low-frequency subbands of the face image still contain useful facial information. Fan et al. [31] applied histogram equalization to the low-frequency subbands and then performed the inverse NSCT on the processed high-frequency subbands and the modified low-frequency subbands to obtain more facial information in the extracted illumination invariant. However, histogram equalization does not provide multi-orientation analysis of the low-frequency domain.

Even though the NSCT features extracted in [26, 28–31] are multi-scale, multi-orientation and shift-invariant, they are not rotation-invariant. The rotation of textures in a face image changes the coefficient distribution of the NSCT subbands and reduces the recognition rate. To improve the stability of the NSCT subband coefficients, a novel rotation-invariant feature named NSCTLBP is constructed in this paper. The recognition methods in [28–30] use the high-frequency subbands of NSCT to depict facial textures, whereas beneficial facial information also remains in the low-frequency regions. Studies show that the Gabor transform can extract almost all the information in the low-frequency regions, while NSCT takes advantage of its tree-structured filter bank to decompose the image in frequency and properly extract rich texture information in the high-frequency regions. Thus we use the NSCTLBP feature to enhance the performance of the method in the presence of interferences. Moreover, to further improve the utilization of facial information, a weighted measure rule combining the Euclidean distance and the EWC distance is proposed to fuse the NSCTLBP and Gabor features. The proposed method integrates the good high-frequency characterization of the NSCTLBP feature and the favorable low-frequency performance of the Gabor wavelet transform to increase the recognition rate. Experimental results on the Yale and ORL databases show that the proposed method performs better than methods based on the NSCT, NSCTLBP or Gabor feature alone under illumination, expression and angle variations and glasses occlusion.

The rest of the paper first describes the construction of the NSCTLBP feature and the extraction of the Gabor feature, then introduces the proposed face recognition method fusing the two features, and finally verifies the performance of the proposed method by experiments.

2 Feature Construction and Extraction

2.1 Construction and Extraction of NSCTLBP Feature

The contourlet transform (CT) [32] is a multi-scale and multi-orientation image representation scheme composed of a Laplacian pyramid (LP) and a directional filter bank (DFB). CT is not shift-invariant and suffers from frequency aliasing due to the downsamplers and upsamplers present in both the LP and the DFB. NSCT is the shift-invariant version of CT and consists of a nonsubsampled pyramid (NSP) and a nonsubsampled directional filter bank (NSDFB). Shift-invariance overcomes the large changes in subband coefficient distributions caused by image shifts. The NSP filter gives NSCT the multi-resolution analysis ability to effectively extract facial information at different scales, while the NSDFB provides the multi-orientation property of NSCT. The NSCT decomposition is shown in Fig. 1. Figure 1a displays an overview of NSCT, where the NSP filter performs scale decomposition on the original image signal to obtain a lowpass image signal and differential bandpass image signals. The NSDFB performs a tree-structured directional decomposition on the bandpass image signals to obtain directional subbands. The lowpass image signal obtained at each stage is further decomposed into a lowpass image signal and a differential bandpass image signal of the next scale. Repeating this decomposition implements the multi-scale and multi-orientation decomposition of the original image. The idealized frequency partitioning into directional subbands produced by the NSDFB of Fig. 1a is illustrated in Fig. 1b, and a toy sketch of this cascade structure is given after Fig. 1.

Fig. 1 Nonsubsampled contourlet transform. a Nonsubsampled filter bank structure that implements NSCT. b Idealized frequency partitioning obtained by NSCT
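To give a concrete picture of this data flow, the following Python/NumPy sketch imitates the cascade of Fig. 1a with crude stand-in filters (Gaussian smoothing for the NSP stage and oriented derivatives for the NSDFB stage). It is only an illustration under these assumptions, not a faithful NSCT implementation; a real implementation uses the nonsubsampled filter banks of [25].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nsct_like_decompose(image, scales=3, orientations=8):
    """Crude stand-in for the NSP/NSDFB cascade of Fig. 1a.

    The NSP stage is approximated by Gaussian smoothing without
    downsampling (hence shift-invariant) and the NSDFB stage by oriented
    derivative filtering, only to show how the lowpass signal is peeled
    off scale by scale and each bandpass signal is split into
    directional subbands.
    """
    low = np.asarray(image, dtype=float)
    subbands = []                                    # [[C_{1,1}, ...], ...]
    for v in range(1, scales + 1):
        smooth = gaussian_filter(low, sigma=2 ** v)  # lowpass at this scale
        band = low - smooth                          # differential bandpass
        gy, gx = np.gradient(band)
        directions = []
        for u in range(orientations):
            theta = np.pi * u / orientations
            # toy directional split: derivative of the bandpass along theta
            directions.append(np.cos(theta) * gx + np.sin(theta) * gy)
        subbands.append(directions)
        low = smooth                                 # feed the next scale
    return low, subbands                             # C_0 and {C_{u,v}}
```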

NSCT is adopted to decompose a face image of size M × N into one low-frequency subband coefficient matrix \( \varvec{C}_{0} \) and a set of high-frequency subband coefficient matrices \( \{ \varvec{C}_{1,1} ,\varvec{C}_{1,2} , \ldots ,\varvec{C}_{u,1} , \ldots ,\varvec{C}_{u,v} \} \), each with the same size as the face image. \( \varvec{C}_{u,v} \) describes the facial information in the u-th orientation subband of the v-th scale. An LBP calculation is introduced to enhance the representation capacity of \( \varvec{C}_{u,v} \) and to construct the NSCTLBP feature with multi-scale, multi-orientation, shift-invariance and rotation-invariance properties. Following the LBP prototype, the NSCTLBP feature calculation is given by [3]

$$ \varvec{C}_{u,v - LBP} \left( {x_{c} ,y_{c} } \right) = \sum\limits_{k = 0}^{7} {s\left( {g_{k} - g_{c} } \right)2^{k} } $$
(1)
$$ s\left( x \right) = \left\{ {\begin{array}{*{20}c} {1,} & {x \ge 0} \\ {0,} & {x < 0} \\ \end{array} } \right. $$

where the LBP value of \( \varvec{C}_{u,v} \) at point \( (x_{c}, y_{c}) \) is obtained through binary coding of the coefficients in a 3 × 3 neighborhood centered at \( (x_{c}, y_{c}) \). \( g_{c} \) is the value of the center coefficient \( \varvec{C}_{u,v} \left( {x_{c} ,y_{c} } \right) \) and \( g_{k} \) is the value of the k-th neighborhood coefficient of \( \varvec{C}_{u,v} \left( {x_{c} ,y_{c} } \right) \) in clockwise order. The rotation-invariant LBP value (i.e. \( \varvec{C}_{u,v - LBP}^{ri} \)) is calculated as [3]

$$ \varvec{C}_{u,v - LBP}^{ri} = min\left\{ {ROR\left( {\varvec{C}_{u,v - LBP} ,k} \right)\left| {k = 0,1, \ldots ,7} \right.} \right\} $$
(2)

where \( ROR\left( {\varvec{C}_{u,v - LBP} ,k} \right) \) performs a circular bit-wise right shift on the 8-bit number \( \varvec{C}_{u,v - LBP} \) by k positions. Finally, the NSCTLBP feature extracted from the face image is computed as

$$ f_{nsctlbp} = \left[ {\varvec{C}_{1,1 - LBP}^{ri} ,\varvec{C}_{1,2 - LBP}^{ri} , \ldots ,\varvec{C}_{u,1 - LBP}^{ri} , \ldots ,\varvec{C}_{u,v - LBP}^{ri} } \right]. $$
(3)

As the value at point \( (x_{c}, y_{c}) \) in each feature map \( \varvec{C}_{u,v - LBP}^{ri} \) is the minimum over all circular bit rotations of the binary code of the coefficients in the neighborhood of \( (x_{c}, y_{c}) \), the value remains stable when local textures in the face image rotate. Thus NSCTLBP is robust to image rotation. However, \( f_{nsctlbp} \), obtained from the high-frequency subband coefficients, does not contain low-frequency facial information.
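A minimal Python/NumPy sketch of Eqs. (1)–(3) on a single subband is given below. The nested loops are written for clarity rather than speed, and the closing comment assumes the high-frequency subbands are already available from an NSCT decomposition (the list name high_freq_subbands is a hypothetical placeholder).

```python
import numpy as np

def rotation_invariant_lbp(coeff):
    """Rotation-invariant LBP map of one NSCT subband (Eqs. 1-2).

    coeff : 2-D array of subband coefficients C_{u,v}.
    Returns the map of C^ri_{u,v-LBP} values over the interior pixels.
    """
    H, W = coeff.shape
    # clockwise 3x3 neighbourhood offsets around the centre pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros((H - 2, W - 2), dtype=np.uint8)
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            gc = coeff[y, x]
            # binary code of the neighbourhood (Eq. 1)
            code = 0
            for k, (dy, dx) in enumerate(offsets):
                if coeff[y + dy, x + dx] >= gc:
                    code |= (1 << k)
            # minimum over all circular bit rotations (Eq. 2)
            best = code
            for k in range(1, 8):
                rotated = ((code >> k) | (code << (8 - k))) & 0xFF
                best = min(best, rotated)
            out[y - 1, x - 1] = best
    return out

# NSCTLBP feature of Eq. (3): concatenate the maps of all high-frequency
# subbands produced by an NSCT decomposition (high_freq_subbands is a
# placeholder for whichever NSCT toolbox output is used).
# f_nsctlbp = [rotation_invariant_lbp(c).ravel() for c in high_freq_subbands]
```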

2.2 Extraction of Gabor Wavelet Feature

The 2D Gabor wavelet is often applied to capture local structure information corresponding to spatial localization, spatial frequency selectivity and orientation selectivity [10]. These properties enable the 2D Gabor wavelet to be widely used for facial texture extraction. The 2D Gabor wavelet kernel function is defined by [33]

$$ \varPsi_{u,v} \left( z \right) = \frac{{\left\| {k_{u,v} } \right\|^{2} }}{{\sigma^{2} }}exp\left( { - \frac{{\left\| {k_{u,v} } \right\|^{2} \left\| z \right\|^{2} }}{{2\sigma^{2} }}} \right)\left[ {exp\left( {ik_{u,v} z} \right) - exp\left( { - \frac{{\sigma^{2} }}{2}} \right)} \right] $$
(4)
$$ k_{u,v} = \left( {\begin{array}{*{20}c} {k_{v} \cos \phi_{u} } \\ {k_{v} \sin \phi_{u} } \\ \end{array} } \right), \quad k_{v} = \frac{{k_{max} }}{{f^{v} }},\quad \phi_{u} = \frac{\pi u}{8} $$

where \( \frac{{\left\| {k_{u,v} } \right\|^{2} }}{{\sigma^{2} }} \) compensates for the frequency-dependent attenuation of the energy spectrum of natural images. The Gaussian envelope function \( exp\left( { - \frac{{\left\| {k_{u,v} } \right\|^{2} \left\| z \right\|^{2} }}{{2\sigma^{2} }}} \right) \) is a window function that makes \( \varPsi_{u,v}(z) \) locally valid. \( exp(ik_{u,v} z) \) is a complex plane wave whose real part is a cosine plane wave and whose imaginary part is a sine plane wave. \( exp\left( { - \frac{{\sigma^{2} }}{2}} \right) \) eliminates the effect of the DC component of the image on the Gabor wavelet and makes \( \varPsi_{u,v}(z) \) insensitive to illumination variations. \( z = (x, y) \) denotes the pixel position, \( k_{u,v} \) is the filter center frequency, and \( k_{v} \) and \( \phi_{u} \) describe the multi-scale and multi-orientation capability of the Gabor filter respectively. By choosing different scales v and orientations u, and appropriate values of \( k_{max} \), σ and f, we obtain a series of Gabor kernels. From the equation of \( k_{u,v} \), the frequency coverage of the 2D Gabor filter is a circular area of radius \( k_{v} \). Although the 2D Gabor filter covers the horizontal and vertical high-frequency regions adequately, it covers the diagonal high-frequency regions of the face image only weakly.
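The kernel of Eq. (4) can be sketched directly in Python/NumPy as follows, using the parameter values of Section 4.2 (σ = 2π, k_max = π/2, f = √2) as defaults; the 31 × 31 spatial support is an illustrative assumption rather than a value taken from the paper.

```python
import numpy as np

def gabor_kernel(u, v, size=31, sigma=2 * np.pi,
                 k_max=np.pi / 2, f=np.sqrt(2)):
    """2D Gabor wavelet kernel of Eq. (4) at orientation u and scale v."""
    k_v = k_max / (f ** v)
    phi_u = np.pi * u / 8.0
    k = np.array([k_v * np.cos(phi_u), k_v * np.sin(phi_u)])   # k_{u,v}
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    z_sq = xs ** 2 + ys ** 2
    k_sq = k @ k
    # Gaussian envelope scaled by ||k||^2 / sigma^2
    envelope = (k_sq / sigma ** 2) * np.exp(-k_sq * z_sq / (2 * sigma ** 2))
    # complex plane wave minus the DC-compensation term
    wave = np.exp(1j * (k[0] * xs + k[1] * ys)) - np.exp(-sigma ** 2 / 2)
    return envelope * wave

# Kernel bank for 3 scales and 8 orientations, as used in Section 4.2.
bank = [gabor_kernel(u, v) for v in range(3) for u in range(8)]
```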

The Gabor feature representation of an image \( \varvec{I}\left( {x,y} \right) \) is derived by convolving the image with the 2D Gabor wavelet kernel function at different scales and orientations [34]:

$$ \varvec{W}_{u,v} \left( {x,y} \right) = \varvec{I}\left( {x,y} \right)*\varPsi_{u,v} \left( {x,y} \right) $$
(5)

where \( \varvec{W}_{u,v} \left( {x,y} \right) \) denotes the convolution of the image \( \varvec{I}\left( {x,y} \right) \) with \( \varPsi_{u,v}(x, y) \) at scale v and orientation u. Applying the FFT and inverse FFT to Eq. (5) speeds up the calculation:

$$ \varvec{W}_{u,v} \left( {x,y} \right) = {\mathcal{F}}^{ - 1} \left( {{\mathcal{F}}\left( {\varvec{I}\left( {x,y} \right)} \right) \times {\mathcal{F}}\left( {\varPsi_{u,v} \left( {x,y} \right)} \right)} \right) $$
(6)

where \( \varvec{W}_{u,v} \) contains the responses of the real and imaginary parts of the Gabor kernel. The magnitude of \( \varvec{W}_{u,v} \) captures local energy variations of the image and is therefore used as the image feature. The Gabor feature of \( \varvec{I}\left( {x,y} \right) \) is generated by combining all the computed responses \( \varvec{W}_{u,v} \):

$$ f_{Gabor} = \left( {\varvec{W}_{1,1} , \varvec{W}_{1,2} , \ldots ,\varvec{W}_{u,1} , \ldots ,\varvec{W}_{u,v} } \right) $$
(7)
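A sketch of Eqs. (5)–(7) using FFT-based filtering is given below, assuming the kernel bank from the previous sketch; the frequency-domain multiplication realizes a circular convolution, which is sufficient for illustration.

```python
import numpy as np

def gabor_features(image, bank):
    """Gabor magnitude features of Eqs. (5)-(7) via FFT-based convolution."""
    H, W = image.shape
    F_img = np.fft.fft2(image)
    feats = []
    for kernel in bank:
        # pad the kernel to the image size before transforming (Eq. 6)
        F_ker = np.fft.fft2(kernel, s=(H, W))
        response = np.fft.ifft2(F_img * F_ker)
        feats.append(np.abs(response))                 # magnitude of W_{u,v}
    return np.concatenate([w.ravel() for w in feats])  # f_Gabor, Eq. (7)
```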

3 Fusion of NSCTLBP Feature and Gabor Feature

The Gabor and NSCT features are complementary to some extent, since the Gabor feature is mainly extracted from the low-frequency regions while the NSCT feature is extracted from the high-frequency regions of the face image. LBP is a fine-scale descriptor that captures small texture details and is resistant to illumination variations, so combining NSCT with LBP is a good choice for coding fine details of facial appearance and texture. This complementary nature makes the Gabor and NSCTLBP features good candidates for fusion: the fused method not only achieves a complete characterization of high-frequency detail textures but also preserves the analysis of the low-frequency regions. In view of the varying importance of the elements of the feature vector, the EWC distance assigns each element a different weight, which makes the low-frequency components more discriminative [34]. The Euclidean distance is one of the most widely used similarity measures for face recognition. In this paper, the Euclidean distance is used for the NSCTLBP features, the EWC distance is chosen for the Gabor features, and a weighted measure rule is applied to integrate the advantages of the two features. The procedure is briefly described in Fig. 2.

Fig. 2 Flow chart depicting the procedure involved in the fusion of the NSCTLBP and Gabor features

The Euclidean distance is used to compute the NSCTLBP feature distance:

$$ D_{{nsct{\text{lbp}}}} \left( {f_{nsctlbp},\, f_{nsctlbp}^{*} } \right) = \left( {\mathop \sum \limits_{p = 1}^{u \times v} \left( {f_{nsctlbp,p} - f_{nsctlbp,p}^{*} } \right)^{2} } \right)^{{\frac{1}{2}}} $$
(8)

where \( f_{nsctlbp} \) and \( f_{nsctlbp}^{*} \) are the NSCTLBP features of the two face images to be matched, and \( f_{nsctlbp,p} \) and \( f_{nsctlbp,p}^{*} \) denote their p-th components respectively.

According to the EWC distance [35], the Gabor feature distance is given by

$$ D_{Gabor} \left( {f_{Gabor} , f_{Gabor}^{*} } \right) = \frac{{\mathop \sum \nolimits_{p = 1}^{u \times v} \left( {\left( {\varvec{W}_{p} \varvec{W}_{p}^{*} } \right)/\lambda_{p}^{2} } \right)}}{{\left( {\mathop \sum \nolimits_{p = 1}^{u \times v} \left( {\varvec{W}_{p} /\lambda_{p} } \right)^{2} \mathop \sum \nolimits_{p = 1}^{u \times v} \left( {\varvec{W}_{p}^{*} /\lambda_{p} } \right)^{2} } \right)^{{\frac{1}{2}}} }} $$
(9)

where \( f_{Gabor} \) and \( f_{Gabor}^{*} \) are the Gabor features of the two face images to be matched, and \( \varvec{W}_{p} \) and \( \varvec{W}_{p}^{*} \) denote their p-th components respectively. The eigenvalue \( \lambda_{p} \) is calculated as follows.

Firstly, let \( \fancyscript{g}= \left\{ {f_{Gabor,q} } \right\}_{q = 1}^{Q} \) denote the set of Q Gabor feature vectors extracted from the Q training faces of one person in the face database; the length of each feature vector is u × v. The average facial feature of the set \( \fancyscript{g} \) is computed by

$$ \overline{f}_{Gabor} = \frac{1}{Q}\mathop \sum \limits_{q = 1}^{Q} f_{Gabor,q} $$
(10)

Secondly, calculate the covariance matrix of the training faces \( \varvec{\xi}= \varvec{GG}^{\varvec{T}} \), where \( \varvec{G}^{\varvec{T}} \) is the transpose of \( \varvec{G} \):

$$ \varvec{G} = \left[ {f_{Gabor,1} - \overline{f}_{Gabor},\, f_{Gabor,2} - \overline{f}_{Gabor} , \ldots ,f_{Gabor,Q} - \overline{f}_{Gabor} } \right] $$
(11)

\( \lambda_{p} \) is the p-th eigenvalue of \( \varvec{\xi} \).
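The distance computations of Eqs. (8)–(11) can be sketched as follows, treating each feature as a one-dimensional vector. The pairing of the p-th feature component with the p-th eigenvalue follows Eq. (9) as written; for long feature vectors a practical implementation would compute the eigenvalues from the smaller Q × Q matrix \( \varvec{G}^{\varvec{T}}\varvec{G} \) instead of \( \varvec{\xi} \).

```python
import numpy as np

def ewc_setup(train_feats):
    """Eigenvalues lambda_p from the training Gabor features (Eqs. 10-11).

    train_feats : array of shape (Q, d) holding Q training feature vectors.
    """
    mean = train_feats.mean(axis=0)          # Eq. (10)
    G = (train_feats - mean).T               # Eq. (11), shape (d, Q)
    xi = G @ G.T                             # covariance matrix, shape (d, d)
    lam = np.linalg.eigvalsh(xi)[::-1]       # eigenvalues in descending order
    return np.maximum(lam, 1e-12)            # guard against zero eigenvalues

def ewc_distance(w, w_star, lam):
    """EWC similarity of Eq. (9); 1 means the two features are identical."""
    num = np.sum(w * w_star / lam ** 2)
    den = np.sqrt(np.sum((w / lam) ** 2) * np.sum((w_star / lam) ** 2))
    return num / den

def euclidean_distance(f, f_star):
    """NSCTLBP feature distance of Eq. (8)."""
    return np.sqrt(np.sum((f - f_star) ** 2))
```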

The measure principles of the Euclidean distance and the EWC distance are different. A lower \( D_{nsctlbp} \) indicates higher similarity of the two features to be matched; in particular, \( D_{nsctlbp} = 0 \) denotes that they are identical. In contrast, \( D_{Gabor} \) lies in the interval [−1, 1], and \( D_{Gabor} = 1 \) indicates that the training face and the test face are identical.

Before fusion, normalization is required to map the matching scores obtained from the two frameworks to a common range so that they can be easily combined. These individual matching distances are then combined by the sum rule to generate a single scalar score used to make the final decision. The Gabor feature matching score is computed by

$$ \widetilde{D}_{Gabor} = \left( {D_{Gabor} - 1} \right)/2 $$
(12)

where \( \widetilde{D}_{Gabor} \) lies in the interval [−1, 0]; \( \widetilde{D}_{Gabor} = 0 \) indicates that the training face and the test face are identical.

Therefore, on the basis of the NSCTLBP feature matching score and the Gabor feature matching score, a weighted measure rule is proposed to fuse \( \widetilde{D}_{nsctlbp} \) and \( \widetilde{D}_{Gabor} \):

$$ \begin{aligned} D & = \omega \widetilde{D}_{nsctlbp} + \left( {1 - \omega } \right)\left( { - \widetilde{D}_{Gabor} } \right) \\ & = \omega \widetilde{D}_{nsctlbp} - \left( {1 - \omega } \right)\widetilde{D}_{Gabor} \\ \end{aligned} $$
(13)

where \( \widetilde{D}_{nsctlbp} \) is the NSCTLBP feature matching score, i.e. the normalized \( D_{nsctlbp} \), which avoids the influence of large values of \( D_{nsctlbp} \); ω is a weighting parameter ranging from 0 to 1; and D is the matching score between the test face and the training face. The smaller the value of D, the greater the similarity of the two faces; in particular, D = 0 denotes that they are identical.
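The score-level fusion of Eqs. (12) and (13) and the final nearest-match decision can be sketched as follows. The paper does not spell out how \( D_{nsctlbp} \) is normalized, so min-max scaling over the gallery is assumed here purely for illustration.

```python
import numpy as np

def fuse_and_classify(d_nsctlbp, d_gabor, omega=0.4):
    """Score-level fusion of Eqs. (12)-(13) and nearest-match decision.

    d_nsctlbp : array of Euclidean NSCTLBP distances (Eq. 8), one per
                training face; smaller means more similar.
    d_gabor   : array of EWC similarities (Eq. 9), one per training face,
                each in [-1, 1]; 1 means identical.
    omega     : weighting parameter in [0, 1].
    """
    d_nsctlbp = np.asarray(d_nsctlbp, dtype=float)
    d_gabor = np.asarray(d_gabor, dtype=float)
    # Assumed normalization of D_nsctlbp: min-max scaling over the gallery.
    rng = d_nsctlbp.max() - d_nsctlbp.min()
    d_nsctlbp_tilde = (d_nsctlbp - d_nsctlbp.min()) / (rng if rng > 0 else 1.0)
    d_gabor_tilde = (d_gabor - 1.0) / 2.0               # Eq. (12), in [-1, 0]
    D = omega * d_nsctlbp_tilde - (1.0 - omega) * d_gabor_tilde   # Eq. (13)
    return int(np.argmin(D)), D   # index of the best-matching training face
```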

4 Experimental Results and Analyses

4.1 Face Databases Explanation

In this paper, the typical Yale and ORL face databases are adopted to verify the validity of the proposed method.

Table 1 gives the details of the Yale and ORL databases, and Figs. 3 and 4 show the sample images of one person from each database. Both databases are divided into a test subset and a training subset. The ORL test subset consists of 200 images (5 images per person) and the ORL training subset contains the remaining 200 images. The Yale test subset contains 75 images (5 images per person) and the Yale training subset contains the remaining 90 images (6 images per person).

Table 1 Description of ORL and Yale face databases
Fig. 3 Sample images for one person from the Yale face database

Fig. 4 Sample images for one person from the ORL face database

4.2 Experiment Results and Analysis

In this paper, the ‘maxflat’ filter and the ‘dmaxflat7’ filter are selected as the NSP and NSDFB filters respectively in the NSCT decomposition. We repeated 20 trials with random training and test sets. Each face image is decomposed into three scales with eight orientations per scale. Gabor feature extraction also uses three scales and eight orientations per scale; the Gabor kernel parameters are set as σ = 2π, \( k_{max} = \pi/2 \), \( f = \sqrt 2 \).

(1) Comparison of the average face recognition rates of methods based on different features

The proposed method fuses the NSCTLBP and Gabor features to measure the similarity between the test images and the training images. From Eq. (13), ω = 0 indicates that only the Gabor feature is used for identification, while ω = 1 means that only the NSCTLBP feature is used. To illustrate the effectiveness of the proposed method, we compare its average recognition rate with that of methods based on the NSCTLBP, NSCT and Gabor features separately on the Yale and ORL databases.

Table 2 shows that the proposed method with ω = 0.4 is more effective than the other three methods. The proposed method extracts facial information properly from both the low-frequency and high-frequency regions, which agrees with the above theoretical analysis.

Table 2 Average recognition rates of the methods based on different features
(2) Anti-interference ability

To further investigate the anti-interference performance of the proposed method, the test subsets of the Yale and ORL databases are classified according to the interference type: angle, expression and illumination variations and glasses occlusion. The average face recognition rates of the different methods under the different interferences are listed in Tables 3 and 4.

Table 3 Average recognition rates of the methods under different interferences on Yale face database
Table 4 Average recognition rates of the methods under different interferences on ORL face database
(a) Comparison of performance between the NSCTLBP feature and the NSCT feature

The NSCT and NSCTLBP features perform equivalently under illumination variations on the Yale database and under glasses occlusion on the ORL database, but in general the NSCTLBP feature is more robust than the NSCT feature on both databases, because LBP captures small texture details of face images. As expression and angle variations cause partial rotation or translation of local facial textures, the NSCTLBP feature, being shift-invariant and rotation-invariant, adapts effectively to these variations.

(b) Comparison of anti-interference performance between the proposed method and others

The recognition rates in Tables 3 and 4 indicate that the proposed method performs better than the other methods in the case of illumination, expression and angle variations. In addition to rotating and translating the facial textures, expression and angle variations affect the distribution of both the low-frequency and high-frequency regions of the face image; hence, the fused texture feature is robust to expression and angle variations. Illumination variations primarily affect the low-frequency regions of the face image, and the fused texture feature describes the high-frequency details better than the NSCTLBP feature or the Gabor feature alone. In the case of glasses occlusion, the proposed method contributes only slightly to the recognition performance, because the glasses occlusion severely obstructs feature extraction in the eye area.

(3) The role of the NSCTLBP and Gabor features in facial feature representation

To examine the roles of the NSCTLBP and Gabor features in facial feature representation, we take the Yale database as an example and perform the following experiments following [36].

The vertical coordinate in Fig. 6 represents the matching score between the training image and the test images.

In this experiment, three test images, with illumination variations, expression variations and glasses occlusion respectively, are selected for the same person to extract the NSCTLBP and Gabor features. Figure 5 shows an example of the training image and the test images with the different interferences. Note that Fig. 6 presents the average results over ten different persons from the Yale database.

Fig. 5 Example of the training image and the test images with different interferences. a Training image. b Test image with glasses occlusion. c Test image with expression variations. d Test image with illumination variations

Fig. 6 Demonstration of the different roles of the NSCTLBP and Gabor features in face representation

From Fig. 6, the matching scores of the Gabor feature (i.e. \( - \widetilde{D}_{Gabor} \)) and the NSCTLBP feature (i.e. \( \widetilde{D}_{nsctlbp} \)) between the training image and the test images with illumination variations are larger than those with expression variations and glasses occlusion, which demonstrates that both features are sensitive to illumination variations. Under illumination variations, the matching score of the NSCTLBP feature is lower than that of the Gabor feature, that is, the NSCTLBP feature is more robust to illumination variations than the Gabor feature; in other words, in this case the contribution of the NSCTLBP feature is larger than that of the Gabor feature. The same analysis applies to the NSCTLBP and Gabor features under glasses occlusion and expression variations. We can conclude that the contribution of the Gabor feature is larger under expression variations, while the contribution of the NSCTLBP feature is larger under glasses occlusion.

(4) Performance of the proposed method with fewer training images for each person

To verify the effectiveness and adaptability of the proposed method with fewer training images, experiments are performed on the ORL and Yale databases by randomly selecting different numbers of training images. The average recognition rates over 20 different runs of training and test sets are presented in Tables 5, 6 and 7.

Table 5 Average recognition rates of the proposed method with fewer training images
Table 6 Average recognition rates of the proposed method under different interferences on Yale
Table 7 Average recognition rates of the proposed method under different interferences on ORL

As the constructed NSCTLBP feature is rotation-invariant and the fused feature extracts almost all the desired information from both the low-frequency and high-frequency regions, the proposed method is especially robust to angle and expression variations; thus the average recognition rate of the proposed method on the ORL database is higher than that on the Yale database. From the data obtained, it is observed that the recognition rates decline with fewer training images, because fewer desired features can be extracted as the number of training images decreases. Additionally, illumination variations affect the distribution of the gray values of the original face image; the average recognition rates of the proposed method with different numbers of training images in Table 6 also confirm this conclusion. With varying numbers of training images per person, the exhaustive experiments show the proposed method to be reasonable and robust.

(5) Impact of the weighting parameter ω on recognition performance

To examine the effect of the weighting parameter ω on the proposed method, experiments on the ORL and Yale databases are performed with ω ranging from 0 to 1. Figure 7 illustrates the results, with ω on the abscissa and the recognition rate on the vertical coordinate. The distribution of the face recognition rate also reveals the recognition performance of the proposed method. It can be seen from Fig. 7 that the method achieves a good recognition rate when ω ranges from 0.1 to 0.3, reaching 88.89 % on the ORL database and 93.33 % on the Yale database when ω = 0.2. Outside this range, the recognition rate declines: as ω decreases, the Gabor feature plays a growing role, which leads to a poor description of facial information in the high-frequency regions, while as ω increases, the NSCTLBP feature plays a growing role, which leads to less facial information from the low-frequency regions. We conclude that the proposed method performs best, offering a good tradeoff between the constructed NSCTLBP feature and the Gabor feature, when ω ranges from 0.1 to 0.3.

Fig. 7 Face recognition rate of the proposed method with different ω

5 Conclusions

A face recognition method fusing texture features is proposed, and experiments on the Yale and ORL databases are carried out to verify its recognition performance. Experimental results show that the NSCTLBP feature constructed in this paper is rotation-invariant and more robust than the NSCT feature, and that the proposed method achieves better recognition performance than methods based on the NSCTLBP feature or the Gabor feature alone. The proposed method not only retains the multi-scale and multi-orientation analysis capability of NSCT and the Gabor transform, but also uses the NSCTLBP feature to compensate the Gabor feature for its lack of facial information in the high-frequency regions. Hence, it is robust to illumination, expression and angle variations and glasses occlusion.