Abstract
Linear discriminant analysis (LDA) seeks a projective transformation that maximizes separability among classes in a reduced-dimensional space. However, LDA requires the within-class scatter matrix to be non-singular, so it cannot be applied directly to small-sample-size (SSS) problems, in which the image dimensionality is much higher than the limited number of available samples. In SSS problems both the between-class and within-class scatter matrices are severely ill-posed, and many algorithms still suffer from this. To solve SSS problems, many methods have been proposed, including regularized linear discriminant analysis (RLDA). This article presents an optimized regularized linear discriminant analysis for feature extraction in face recognition (FR), which fixes not only the singularity problem of the scatter matrices but also the problem of parameter estimation. Experiments conducted on several databases yield promising results compared with some state-of-the-art methods, demonstrating the effectiveness of the proposed approach.
1 Introduction
With the steady growth of artificial intelligence applications and technology, face recognition (FR) has attracted broad industrial and research interest. FR offers advantages such as low cost, minimal intrusiveness, and abundant data sources. However, processing and analyzing high-dimensional data in FR is still a challenge [1,2,3], and robust FR remains a difficult task in small-sample environments. A common face recognition system has two main stages. (1) Robust and discriminant feature extraction, such as principal component analysis (PCA) [4], linear discriminant analysis (LDA) [5], and regularized linear discriminant analysis (RLDA) [6]. In null LDA (NLDA) [7], the orientation matrix is computed in two steps: first the data are projected onto the null space of SW, and then a W is found that maximizes \(\left| {{{\text{W}}^T}{S_B}W} \right|\); spectral regression discriminant analysis (SRDA) [8] is another example. Sparse graph-based discriminant analysis (SGDA) [9] was developed to preserve sparse connections in a block-structured affinity matrix built from class-specific samples. Using low-rank constraints, low-rank graph-based discriminant analysis (LGDA) [10] preserves the global structure of the data, and the sparse and low-rank graph-based discriminant analysis (SLGDA) method of [10] pursues a block-diagonal structured affinity matrix under joint sparsity and low-rank constraints. (2) Classifier construction, e.g., nearest neighbor (NN) [11]. However, many methods, including LDA-based statistical learning methods, are affected by "small-sample-size" (SSS) problems [12].
Feature extraction has proved effective at transforming a high-dimensional space into a lower-dimensional one while retaining most of the intrinsic information in the original data [13, 14]. PCA was originally used to remove the null space of Sw, after which LDA was executed in the reduced-dimensional subspace. It has since been shown that the removed null space contains discriminatory information that cannot be ignored. Moreover, for supervised dimensionality-reduction methods, which are only suitable for single-model data, classification performance is closely related to between-class separation, within-class compactness, and equal emphasis on the separation between classes [15]. In RLDA, the matrix Sw is regularized to deal with its singularity: Sw is approximated by Sw + ηI. However, this does not consider whether the definition of the scatter matrix itself is reasonable. Li and Tang [16] argue that the between-class scatter matrix of traditional LDA is not optimally defined: it does not help separate classes other than the edge classes, and it may cause the others to overlap, degrading discriminant performance. Second, RLDA introduces a fixed regularization parameter, which may not give the best classification. In [17], an approach that estimates η by maximizing a modified Fisher's criterion achieves better performance than other methods. In addition, close class pairs are prone to overlap in the subspace, which is referred to as the class-separation problem. A number of weighting methods have been put forward to deal with it [18, 19]; their fundamental idea is to assign large weights to close class pairs, but they do not solve the problem thoroughly [20]. The proximity function proposed in [21] overcomes the shortcomings of traditional distance functions in high-dimensional data. In view of all this, this paper proposes an improved RLDA algorithm.
It redefines the between-class scatter matrix and introduces a precisely estimated regularization parameter to control the bias and variance of the eigenvalues. Finally, the improved scatter matrices are combined with a better method of parameter estimation.
This study is motivated by the previous studies [6, 17, 21]. The remainder of this article is organized as follows. Section 2 gives a detailed mathematical derivation of the regularized linear discriminant analysis algorithm. Section 3 introduces the proposed method, with the construction of the improved scatter matrices and a precisely estimated regularization parameter. Section 4 presents the simulation results and analysis. Section 5 concludes the paper.
2 Regularized linear discriminant analysis
Among supervised dimensionality-reduction methods, RLDA is the most popular discriminant analysis method for the SSS problem and is widely used in pattern recognition. The degree of bias and the variance are both determined by the severity of the SSS problem. Friedman improved a related algorithm under similar conditions, observing that estimating a covariance matrix Si from every class's samples may not be appropriate. The remedy he put forward is to add a regularization term η·I, giving Si = Si + ηI, where I is the identity matrix and η a regularization parameter. This regularization increases the smaller eigenvalues and decreases the larger ones, thereby counteracting the bias; a further effect is that it stabilizes the smallest eigenvalues.
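The eigenvalue effect of the η·I term can be seen on a toy spectrum. The following small NumPy illustration (ours, not from the paper) shows the smallest eigenvalue being lifted so the matrix becomes safely invertible:

```python
import numpy as np

# A nearly singular "scatter matrix": one eigenvalue is almost zero.
S = np.diag([4.0, 1.0, 1e-6])
eta = 0.1
S_reg = S + eta * np.eye(3)

# Regularization lifts the smallest eigenvalue (1e-6 -> ~0.1) while the
# largest grows only slightly (4.0 -> 4.1), so the spectrum is far less
# spread out and the condition number drops dramatically.
print(np.linalg.eigvalsh(S_reg)[0])               # ~0.1
print(np.linalg.cond(S_reg) < np.linalg.cond(S))  # True
```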
Consider a training set \({\text{z}}=\{ {z_i}\} _{{i=1}}^{C}\) of C classes, with every class \({{\text{z}}_i}=\{ {z_{ij}}\} _{{j=1}}^{{{C_i}}}\) consisting of facial images zij, so that a total of \(N=\sum\nolimits_{{i=1}}^{C} {{C_{\text{i}}}}\) images is available in the set. For ease of computation, each face image is represented by a lexicographic ordering of its pixel elements (i.e., \({z_{ij}} \in {R^J}\)); its length is J (= Iw × Ih), and RJ denotes the J-dimensional data space. The method obtains discriminant vectors by maximizing the ratio of the between-class scatter measure to the within-class scatter measure, which can be formulated as:
where Sb is the between-class scatter matrix, Sw is the within-class scatter matrix, \(W \in {R^m}\), and 0 ≤ η ≤ 1 is the regularization parameter, with
where zi is the mean (or center) of class i and \(\bar{\text{z}}\) is the grand mean (or center) of all classes. In general, they can be estimated from the training samples, i.e., \({{\text{z}}_i}=\frac{1}{{{N_i}}}\sum\nolimits_{{j=1}}^{{{N_i}}} {{x_{ij}}}\) and \(\bar{\text{z}}=\frac{1}{N}\sum\nolimits_{{i=1}}^{C} {\sum\nolimits_{{j=1}}^{{{N_i}}} {{x_{ij}}} }\).
A series of discriminant vectors can be obtained by eigenvalue decomposition of \(S_{w}^{{ - 1}}{S_b}\) according to (1) when Sw is full rank. The projection matrix is constructed from the eigenvectors associated with the d largest eigenvalues, which is a suboptimal solution to (2). However, as mentioned in Sect. 1, RLDA has some disadvantages for FR and can be improved. In summary, the algorithm proceeds in the following way.
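As a concrete sketch, classical RLDA with a fixed η can be written in a few lines of NumPy. This is a minimal illustration under our own variable names, not the authors' implementation:

```python
import numpy as np

def rlda_projection(X, y, eta=0.001, d=2):
    """Classical RLDA sketch: eigen-decompose (Sw + eta*I)^{-1} Sb.

    X: (N, J) data matrix, y: (N,) integer class labels,
    eta: fixed regularization parameter, d: target dimension.
    """
    mean_all = X.mean(axis=0)
    J = X.shape[1]
    Sw = np.zeros((J, J))
    Sb = np.zeros((J, J))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)              # within-class scatter
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)            # between-class scatter
    # Regularize Sw so it is invertible even when N < J (the SSS case),
    # then keep the eigenvectors of the d largest eigenvalues.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + eta * np.eye(J), Sb))
    order = np.argsort(-evals.real)
    return evecs[:, order[:d]].real                # (J, d) projection matrix W
```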
3 Improved regularized linear discriminant analysis
Equation (2) is defined so that the class means are separated from the overall sample mean as much as possible, but the means of different classes may still be close to each other, so that many samples of adjacent classes overlap and recognition performance decreases. The cause of this problem is that the most discriminant projection direction obtained by the previous algorithm is the direction of largest variance, along which the edge classes are separated from the other classes as much as possible. It should be noted, however, that this direction does not help separate the classes other than the edge classes, and it may cause them to overlap with each other, resulting in a decline in discriminant performance.
Therefore, the existing definition of the between-class scatter is not optimal: the edge classes dominate the eigen-decomposition, so the dimension-reducing transformation matrix places too much emphasis on classes that are already well separated, and adjacent classes are left overlapping.
3.1 The model of improved between-class scatter matrices
The improved between-class scatter matrix model is expressed as:
where the Close function is defined as:
Each sample value lies in m-dimensional space. The range of the Close function is (0, 1); it indicates the proximity of the class mean \(\bar{\text {z}}_{\text{i}}\) to the class mean \(\bar{\text{z}}_{\text{j}}\): the closer they are, the greater the value of the function, and conversely, the farther apart, the smaller the value. Here Pi and Pj are the prior probabilities of classes i and j, respectively, and \(\bar{\text {z}}_{\text{i}}\) and \(\bar{\text{z}}_{\text{j}}\) are the means of the i-th and j-th classes.
From Eq. (3), the larger the value of \(\left\| {\bar{\text{z}}_{i} - \bar{z}_{j} } \right\|\), the smaller the weight assigned to the class pair; conversely, the smaller the distance, the greater the weight.
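The weighted between-class scatter can be sketched as follows. Since the exact Close function appears in the paper's equations, we substitute a Gaussian kernel exp(−‖mi − mj‖²/t) as a stand-in with the same qualitative behavior (close pairs get weights near 1, distant pairs near 0); the kernel and the parameter t are our assumptions, not the paper's definition:

```python
import numpy as np

def weighted_between_scatter(means, priors, t=1.0):
    """Weighted Sb: close class pairs receive large weights so the
    eigen-decomposition is not dominated by well-separated edge classes.

    means: (C, J) class means, priors: (C,) prior probabilities.
    The Gaussian proximity below is an assumed stand-in for Close(i, j).
    """
    C, J = means.shape
    Sb = np.zeros((J, J))
    for i in range(C):
        for j in range(C):
            if i == j:
                continue
            diff = (means[i] - means[j])[:, None]
            w = np.exp(-np.sum(diff ** 2) / t)   # in (0, 1), larger when closer
            Sb += priors[i] * priors[j] * w * (diff @ diff.T)
    return Sb
```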
3.2 The model of improved within-class scatter matrices
Many algorithms need large databases to train the scatter matrices, but in practice the number of training samples is limited. Under small-sample conditions the model cannot correctly and effectively represent the underlying structure and characteristics of the data, and the scatter matrices are easily over-fitted, which significantly lowers face-recognition performance. Because the feature differences within the same person are more susceptible to other factors, and may even exceed the differences between different people, the within-class scatter changes more than the between-class scatter, making its estimation error greater. Therefore, under small-sample conditions, the within-class scatter is clearly more sensitive, and this paper pays particular attention to reducing the sensitivity of the within-class scatter matrix to small samples.
When a data set contains few effective samples, the useful information of a sample can be robustly estimated by making full use of the local data structure around it. When the data set contains an outlier, the local structure of its neighboring samples can likewise be used to represent the characteristics of the outlier. The over-fitting caused by small samples can be addressed by smoothing. In this paper, the KNN algorithm is used to select the within-class scatter matrices of adjacent classes, taking advantage of the local data structure; by smoothing the within-class scatter matrices, the over-fitting problem caused by small samples can be solved.
Let the training data set be \({z_{ij}} \in R\), i = 1,..., C and j = 1,..., Ci, where Ci is the number of samples in class i, C is the number of classes, N is the total number of samples, and zij denotes the j-th face image of the i-th class. The within-class scatter matrix model can be expressed as:
The general within-class scatter matrix is formulated as:
The within-class scatter matrix is smoothed using the adjacent classes:
where \(k \in {\text{KNN}}(i)\) denotes the K nearest-neighbor classes of class i and \(\beta \in [0,1]\) is the trade-off parameter. The weight of each neighbor k is determined by the nearest-neighbor system: the smaller the distance, the greater the weight.
The improved within-class scatter matrix model can be expressed as:
By this definition, \({\tilde {S}_i}\) is the result of smoothing Si with the scatter matrices Sk of the K nearest-neighbor classes of class i, so the over-fitting problem can be solved by making full use of both the samples of class i and the samples of its adjacent classes. When a class has only one sample, its scatter matrix cannot be estimated effectively, but it can be approximated using the samples of neighboring classes. The smoothing method takes full advantage of the local data structure and reduces the adverse effect of outliers in each class. The improved algorithm thus solves the over-fitting problem of the general within-class scatter matrix and obtains a more accurate within-class scatter matrix.
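The KNN smoothing step can be sketched as follows. The paper states that neighbor weights decrease with distance; the inverse-distance weighting below is our assumed concrete form of that rule, and β plays the trade-off role of Eq. (8):

```python
import numpy as np

def smoothed_within_scatter(class_scatters, class_means, K=2, beta=0.5):
    """Smooth each per-class scatter Si with the scatters of its K
    nearest-neighbor classes (by class-mean distance).

    beta trades off the class's own scatter against the neighborhood term.
    Inverse-distance neighbor weights are an assumed concrete form.
    """
    C = len(class_scatters)
    smoothed = []
    for i in range(C):
        d = np.linalg.norm(class_means - class_means[i], axis=1)
        nbrs = np.argsort(d)[1:K + 1]        # K nearest classes, self excluded
        w = 1.0 / (d[nbrs] + 1e-8)           # closer class -> larger weight
        w = w / w.sum()
        S_nbr = sum(wk * class_scatters[k] for wk, k in zip(w, nbrs))
        smoothed.append(beta * class_scatters[i] + (1 - beta) * S_nbr)
    return smoothed
```

With only one sample in a class, Si is (near-)zero, and the neighborhood term supplies the missing estimate, which is exactly the single-sample case discussed above.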
3.3 A deterministic approach to RLDA
Let ST, SW, and SB denote the total, within-class, and between-class scatter matrices, respectively. Under SSS conditions these matrices are singular. It is well known that no discriminant information exists in the null space of ST. Thus, the feature dimensionality can be reduced from d to rt (where rt is the rank of ST) by a PCA preprocessing step. The range space of ST, \({P_1} \in {R^{d \times rt}}\), is applied as the transformation matrix. In the reduced dimension the scatter matrices are \({{\text{S}}_{\text{w}}}={\text{P}}_{1}^{{\text{T}}}{{\text{S}}_{\text{W}}}{{\text{P}}_1}\) and \({{\text{S}}_{\text{b}}}={\text{P}}_{1}^{{\text{T}}}{{\text{S}}_{\text{B}}}{{\text{P}}_1}\); after this procedure \({S_w} \in {R^{rt \times rt}}\) and \({S_b} \in {R^{rt \times rt}}\) are reduced-dimensional scatter matrices.
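This preprocessing step amounts to keeping the eigenvectors of ST with non-zero eigenvalues. A minimal NumPy sketch (the rank tolerance is our choice):

```python
import numpy as np

def range_space_projection(X):
    """Return P1, whose columns span the range space of the total scatter
    St, removing its null space (which carries no discriminant information)."""
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc                          # total scatter, (J, J)
    evals, evecs = np.linalg.eigh(St)
    keep = evals > 1e-10 * evals.max()      # retain rank(St) directions
    return evecs[:, keep]                   # P1 in R^{J x rt}
```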
In RLDA, the within-class scatter matrix SW is regularized by adding η to its diagonal elements, i.e., Sw = Sw + ηI. This makes SW non-singular and invertible, which allows the revised Fisher's criterion to be maximized:
where \(w \in {R^{rt \times 1}}\) is the orientation vector. To avoid any heuristic in determining η, Eq. (10) is solved in the following way. Denote
Constraint condition:
where b > 0 is a constant. The constrained relative maximum of f under the constraint curve g can be obtained by setting its derivative to zero, which gives
Or
where λ is the Lagrange multiplier (λ ≠ 0). Substituting \(\eta\)W from Eq. (13) into Eq. (12), we conclude
And from Eqs. (12) and (14), we obtain
We can observe that the left-hand side of Eq. (15) is the Lagrange multiplier, and the right-hand side is the same as the revised Fisher's criterion. To enlarge the modified Fisher's criterion, we need to maximize λ. An approximate value of λ can therefore be obtained by maximizing WTSbW/WTSwW, with W corresponding to the largest eigenvalue of \({\text{S}}_{w}^{{ - 1}}{{\text{S}}_b}\). When Sw is singular and non-invertible, \({\text{S}}_{w}^{{ - 1}}\) can be replaced by its pseudoinverse. We obtain λmax by eigen-decomposing the matrix \({\text{S}}_{w}^{+}{{\text{S}}_b}\), where \({\text{S}}_{w}^{+}\) is the pseudoinverse of Sw. The value of λmax is then substituted as follows:
Equation (16) lets us seek the value of η by eigen-decomposing 1/λSb − Sw, which gives \({r_b}= rank({S_b})\) finite eigenvalues. Since the dominant eigenvalue corresponds to the largest discriminant eigenvector, η is taken to be the maximum eigenvalue. Then,
where \(1/\lambda {S_{b}} - {S_{w}}= E\Lambda {E^T}\), \(E \in {R^{rt \times rt}}\) is the matrix of eigenvectors, and Λ is the diagonal matrix of corresponding eigenvalues. Once η is determined, the projection vectors W are obtained by eigen-decomposing \({({S_w}+\eta I)^{ - 1}}{S_b}\), which can be formulated as:
The m eigenvectors obtained from Eq. (18) corresponding to the m largest eigenvalues form W.
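The deterministic η estimation described above can be sketched as follows; this follows the λmax-then-η two-step structure of Eqs. (16)–(17), with a non-negativity guard that is our own safety assumption:

```python
import numpy as np

def deterministic_eta(Sw, Sb):
    """Estimate eta without heuristics: lambda_max is the largest
    eigenvalue of pinv(Sw) @ Sb, and eta is then the largest eigenvalue
    of (1/lambda_max) Sb - Sw, following Eqs. (16)-(17)."""
    lam = np.max(np.linalg.eigvals(np.linalg.pinv(Sw) @ Sb).real)
    M = Sb / lam - Sw
    eta = np.max(np.linalg.eigvals(M).real)
    return max(eta, 0.0)   # guard (our assumption): keep Sw + eta*I PSD
```

Once η is computed, the projection follows by eigen-decomposing (Sw + ηI)^{-1}Sb exactly as in the fixed-η case, but without any manual tuning of the parameter.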
4 Simulation results and analysis
In this section, our approach is compared with a number of related state-of-the-art methods, including LGDA [10], NLDA [7], SRDA [8], SGDA [9], and SLGDA [10]. The range of the regularization parameter in RLDA [6] is [0, 1]; in the following experiments RLDA gives better results with the value 0.001. Our algorithm avoids the difficulty of determining this parameter value in RLDA. The parameters introduced in our algorithm are β and k, which appear in formula (8); with the values 0.5 and 10 respectively, the algorithm achieves comparatively good results in the following experiments. Under different dimensions, numbers of training samples, or numbers of classes, the parameters can be tuned to other values for even better performance. NLDA verifies that the null space of the within-class scatter matrix contains important discriminative information, but in some cases that matrix may have no null space, so its results are sometimes not as good. The Tikhonov regularizer is used in SRDA to control model complexity, but the projection matrix it obtains is not orthogonal, which is not conducive to eliminating information redundancy between samples. All these feature-extraction algorithms are combined with the NN classifier for face recognition. The experiments are conducted on three face datasets, the Extended Yale B [22], CMU PIE [23], and AR, to evaluate performance. Details of the datasets are given in Table 1 and Fig. 1.
The parameters in competing methods are adjusted to their best performance according to the suggestions in original papers.
4.1 2-D visualization experiment on CMU PIE dataset
In this part, the discriminant ability of different methods is shown on part of the CMU PIE [23] face database. In the experiment, 7 images per individual are randomly selected for training, and the remaining roughly 17 images are used for testing. Figure 2a–j visualizes the testing-data distribution along the first two dimensions obtained by the different methods. From Fig. 2 we may draw several conclusions. First, on the small-sample problem, NLDA [7], RLDA [6], and SRDA [8] are superior to PCA [4] and LDA [5], but overlaps remain serious. Second, SGDA [9] only uses the local neighborhood structure through sparse representation and does not perform very well; parts of the 5 classes are mixed together in Fig. 2g. LGDA [10] shows better separation ability by introducing global low-rank regularization, but significant overlaps remain among classes 2, 3, and 5, and classes 1 and 3 are not far apart. With both sparse and low-rank constraints, SLGDA [10] performs better than the previous two; however, classes 2, 3, and 5 are still not separated, as shown in Fig. 2i. By contrast, the proposed method shows clearer boundaries among the classes and stronger robustness in the following experiments.
4.2 Experiments on face recognition
4.2.1 CMU PIE database
The CMU PIE database contains more than four thousand images of sixty-eight individuals, each captured in 13 different poses. Here we use a near-frontal subset, pose C07, for the experiment; it contains 1629 images of 68 people, about 24 per person, and all facial images are cropped to 32 × 32 pixels. For each individual, a subset of p (= 2, 3, ...) samples is selected for training and the rest for testing. For each p, we ran all methods 10 times independently and report the average results in Table 2. The FR rates under different dimensions are shown in Table 3. Table 2 shows that our method exceeds the other methods in almost all experimental settings. The results of LGDA and SLGDA are similar to ours, but clearly lower when few training samples per subject are available; they can also match ours under different dimensions when the number of training samples per subject is fixed. RLDA outperforms SRDA under different numbers of training samples per subject except p = 2, and NLDA outperforms both RLDA and SRDA. Our method achieves a higher recognition rate under different dimensions. Figure 2 shows the recognition rate versus the number of training samples and feature dimensions on CMU PIE for several methods. From Fig. 2b it is obvious that using the whole training information provides no significant advantage for classification and instead increases computational cost; feature extraction is therefore necessary.
4.2.2 Experiments on AR database
A subset of AR consists of 50 men and 50 women in two sessions, with 6 lighting and 8 expression changes. From session 1, only the seven images with lighting and expression changes are used, plus seven samples from the other session. For each individual, a random subset of p (= 2, 3, ...) samples is selected for training. For each p, we ran the experiment 10 times independently and report the average results in the tables. From Table 4 one can conclude that all algorithms perform better as the number of training samples per class increases, and that our method achieves a higher recognition rate than the other methods under all numbers of training samples per individual. RLDA outperforms SRDA under different numbers of training samples per subject except p = 2, and NLDA outperforms both RLDA and SRDA. The FR rates under different dimensions are listed in Table 5, which shows that NLDA and our method exceed the other methods in different experimental settings. NLDA obtains the best result on AR, slightly better than our method at 50 feature dimensions, but not better at higher dimensions. In particular, because each subject has just 14 images in the AR database, the performance of SGDA, LGDA, and SLGDA drops sharply when p is below half of that number. Figure 3 shows the recognition rate versus the number of training samples per class and feature dimensions for several methods. From Fig. 3b it is obvious that using the whole training information provides no significant advantage for classification and instead increases computational cost.
4.2.3 Extended Yale B database
The Extended Yale B database contains about 2414 frontal-face images of 38 people, each with 64 images under different lighting. In this experiment the cropped images, resized to 32 × 32 pixels, are used; Fig. 1 shows some example images. For each individual, p (= 3, 4, 5, ...) labeled samples are taken for training and the remainder for testing; the experiment was run 10 times. From Table 6 one can conclude that all methods perform better as the number of training images per subject grows. We then randomly select four images from every person for training and use the remaining samples for testing. RLDA outperforms SRDA under different numbers of training samples per subject except p = 3, 4, 5, and NLDA outperforms both RLDA and SRDA. SGDA does not perform as well on this dataset as on CMU PIE. The results of LGDA and SLGDA are similar to ours in some cases, but clearly lower when few training samples per subject are available; they can also match ours under different dimensions when the number of training samples per subject is fixed. The rates under different dimensions are listed in Table 7. Our method exceeds the other methods in different experimental settings and is more robust to the illumination problem in FR. Figure 4 illustrates the recognition rate versus the number of training samples per class and feature dimensions for several methods. From Fig. 4b it is obvious that using the whole training information provides no significant advantage for classification and instead increases computational cost (Fig. 5).
5 Conclusions
This paper studies the small-sample-size problem in FR. The standard regularized linear discriminant analysis algorithm still has some disadvantages in fixing SSS problems. Considering that the scatter-matrix model can be made more reasonable and the regularization parameter can be obtained without heuristics, an improved algorithm is introduced that fixes not only the singularity problem of the scatter matrices but also the problem of parameter estimation. PCA is simple to compute and performs well in some cases, but its performance is limited by its unsupervised nature. By introducing different discriminant criteria to fix SSS problems, RLDA, NLDA, SRDA, etc. perform well to some extent. SGDA, LGDA, and SLGDA can adaptively select neighbors for graph construction and use the labeled samples of each class to find block-diagonal structured representations. However, owing to the limited number of samples per class, this process may incur large representation error and may not reveal the within-class adjacency relations as well as our method does, so SGDA, LGDA, and SLGDA hardly perform better than the proposed method when training samples are scarce. The simulation results on well-known databases illustrate that the proposed method performs much better than the other methods and improves face recognition.
References
Zhang L, Zhang D (2017) Evolutionary cost-sensitive extreme learning machine. IEEE Trans Neural Netw Learn Syst 28(12):3045–3060
Zhang L, Zhang D (2016) Visual understanding via multi-feature shared learning with global consistency. IEEE Trans Multimed 18(2):247–259
Sha C, Zhao H (2017) Design and analysis of associative memories based on external inputs of continuous bidirectional associative networks. Neurocomputing 266:433–444
Jolliffe I (2005) Principal component analysis. Wiley Online Library, New York
Lee S, Park YT, d’Auriol BJ et al (2012) A novel feature extraction method based on normalized mutual information. Appl Intell 37(1):100–120
Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175
Chen L-F, Liao H-Y, Ko M-T, Lin J-C, Yu G-J (2000) A new LDA-based face recognition system which can solve the small sample size problem. Pattern Recognit 33:1713–1726
Cai D, He X, Han JS (2008) An efficient algorithm for large-scale discriminant analysis. IEEE Trans Knowl Data Eng 20(1):1–12
Ly NH, Du Q, Fowler JE (2014) Sparse graph-based discriminant analysis for hyperspectral imagery. IEEE Trans Geosci Remote Sens 52:3872–3884
Li W, Liu J, Du Q (2016) Sparse and low-rank graph for discriminant analysis of hyperspectral imagery. IEEE Trans Geosci Remote Sens 54:4094–4105
Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
Lu GF, Wang Y, Zou J (2016) Graph maximum margin criterion for face recognition. Neural Process Lett 44(2):1–19
Kasun LLC, Yang Y, Huang GB et al (2016) Dimension reduction with extreme learning machine. IEEE Trans Image Process 25(8):1–1
Zhang Q, Deng K, Chu T (2016) Sparsity induced locality preserving projection approaches for dimensionality reduction. Neurocomputing 200:35–46
Shao G, Sang N (2014) Max–min distance analysis by making a uniform distribution of class centers for dimensionality reduction. Neurocomputing 143:208–221
Li K, Tang P (2014) An improved linear discriminant analysis method and its application to face recognition. Appl Mech Mater 556:4825–4829
Sharma A, Paliwal KK (2015) A deterministic approach to regularized linear discriminant analysis. Neurocomputing 151:207–214
Tao D, Li X, Wu X, Maybank SJ (2009) Geometric mean for subsPACe selection. IEEE Trans Pattern Anal Mach Intell 31(2):260–274
Loog M, Duin R, Haeb-Umbach R (2001) Multiclass linear dimension reduction by weighted pairwise Fisher criteria. IEEE Trans Pattern Anal Mach Intell 23(7):762–766
Bian W, Tao D (2011) Max–min distance analysis by using sequential SDP relaxation for dimension reduction. IEEE Trans Pattern Anal Mach Intell 33:1037–1050
Shao C, Lou W, Yan L-M (2011) Optimization of algorithm of similarity measurement in high-dimensional data. Comput Technol Dev 21(2):1–4
Georghiades AS, Belhumeur PN, Kriegman DJ (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 23(6):643–660
Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618
Acknowledgements
This work was supported by the National Natural Science Foundation of China (no. 61571069) and Project no. 106112017CDJQJ168817 supported by the Fundamental Research Funds for the Central Universities.
Tan, X., Deng, L., Yang, Y. et al. Optimized regularized linear discriminant analysis for feature extraction in face recognition. Evol. Intel. 12, 73–82 (2019). https://doi.org/10.1007/s12065-018-0190-0