Abstract
The attribute coordinate comprehensive evaluation method relies on subjective weighting: evaluators determine the weights of the indicators, which can make the weights arbitrary. When there are many indicators, it is also difficult for evaluators to judge accurately whether one sample is better or worse than another. To address this problem, this paper combines principal component analysis with the attribute coordinate comprehensive evaluation method. Principal component analysis reduces a large set of indicators to a few new indicators, each of which is given an interpretable meaning. This simplification makes it much easier for experts to rate samples, and these ratings are the basis from which the attribute coordinate comprehensive evaluation method captures the experts' preferences and then calculates the satisfaction degrees of all objects to be evaluated. Experimental results show the advantages of the improved algorithm over the original algorithm.
The work was supported by the Key Disciplines of Computer Science and Technology of Shanghai Polytechnic University (No. XXKZD1604).
Keywords
- Attribute coordinate comprehensive evaluation
- Principal component analysis
- Barycentric coordinates
- Global satisfaction degree
1 Introduction
In comprehensive evaluation, the most important problem is how to set the weight of each evaluation indicator. Weighting methods fall into two categories: subjective weighting, such as AHP [1, 2], and objective weighting, such as the least squares method and principal component analysis [3, 4]. Each type has its own advantages and disadvantages. In subjective weighting the weights are given by experts and can be arbitrary in some cases, while objective weighting cannot reflect the experience or preferences of experts. Attribute coordinate comprehensive evaluation belongs to the former category. Its characteristic is that evaluators rate sample data in light of their own experience or preferences, and the method constructs the corresponding psychological preference curve from those ratings. It has made progress both in theory and in practice [5,6,7,8,9,10,11,12,13]. However, when there are many indicators, it is difficult for experts to distinguish satisfactory samples from unsatisfactory ones accurately, which may result in arbitrary ratings for some samples. To overcome this obstacle, principal component analysis is used to reduce the number of indicators and to give the new indicators interpretable meanings, so that experts can rate samples more easily with the new indicators.
This paper first introduces the steps for simplifying indicators by principal component analysis, then presents the core idea of the attribute coordinate comprehensive evaluation method, and finally describes how the two methods are combined, using a simulation that compares the results before and after the model is improved.
2 Reduction of Indicators by Principal Component Analysis
Principal component analysis is a mathematical method of dimensionality reduction. The basic idea is to recombine the original indicators X1, X2, …, Xp (p indicators in total) into a smaller set of mutually uncorrelated comprehensive indicators F1, F2, …, Fm (m < p). The specific steps of principal component analysis are as follows:
(1) Calculate the covariance matrix \( S = (s_{ij} )_{p \times p} \) of the sample data:
\[ s_{ij} = \frac{1}{n - 1}\sum\limits_{k = 1}^{n} {(x_{ki} - \bar{x}_{i} )(x_{kj} - \bar{x}_{j} )} \]
Here \( s_{ij} \) (i, j = 1, 2, …, p) is the covariance between the original variables \( x_{i} \) and \( x_{j} \), p is the number of indicators, and n is the number of samples. \( \bar{x}_{i} \) and \( \bar{x}_{j} \) are the means of indicators i and j, respectively. \( x_{ki} \) is the value of indicator i for the kth sample, and \( x_{kj} \) is the value of indicator j for the kth sample.
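(2) Calculate the eigenvalues \( \lambda_{i} \) of S and the corresponding orthogonal unit eigenvectors \( a_{i} \).
The m largest eigenvalues of S, \( \lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{m} > 0 \), are the variances of the first m principal components, and the unit eigenvector \( a_{i} = (a_{1i} ,a_{2i} , \ldots ,a_{pi} )^{T} \) corresponding to \( \lambda_{i} \) gives the coefficients of the principal component Fi. The ith principal component Fi is therefore:
\[ F_{i} = a_{1i} X_{1} + a_{2i} X_{2} + \cdots + a_{pi} X_{p} \]
The variance (information) contribution rate \( \gamma_{i} \) of a principal component reflects how much information it carries:
\[ \gamma_{i} = \frac{\lambda_{i}}{\sum\limits_{k = 1}^{p} {\lambda_{k}}} \]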
(3) Determine the principal components
The principal components finally selected are F1, F2, …, Fm, where m is determined by the cumulative variance contribution rate G(m):
\[ G(m) = \frac{\sum\limits_{i = 1}^{m} {\lambda_{i}}}{\sum\limits_{k = 1}^{p} {\lambda_{k}}} \]
When the cumulative contribution rate exceeds 85%, the first m principal components are considered sufficient to reflect the information in the original variables, and these m components are extracted.
(4) Calculate the loadings of the principal components
The principal component loading \( l_{ij} \) (i = 1, 2, …, m; j = 1, 2, …, p) of the original variable Xj on the principal component Fi reflects the degree of correlation between Fi and Xj.
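(5) Calculate the scores of the principal components
The score of the kth sample on each of the m principal components is obtained by substituting its indicator values into the expression for Fi, where \( a_{ji} \) is the jth component of the eigenvector \( a_{i} \):
\[ F_{ki} = a_{1i} x_{k1} + a_{2i} x_{k2} + \cdots + a_{pi} x_{kp} ,\quad i = 1,2, \ldots ,m \]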
(6) Select the principal components and give them new meanings
Interpret each new evaluation indicator Fi (i = 1, 2, …, m) and provide its meaning to the experts, who then rate the samples expressed in the new indicators.
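Steps (1) to (3) and (5) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `pca_reduce`, the n − 1 normalisation that `np.cov` applies by default, and the synthetic demo data are all introduced here for the example. The sign of each eigenvector is arbitrary, so scores may come out with flipped signs.

```python
import numpy as np


def pca_reduce(X, threshold=0.85):
    """Sketch of steps (1)-(3) and (5) for an (n samples x p indicators) array X."""
    X = np.asarray(X, dtype=float)

    # Step (1): p x p sample covariance matrix (np.cov divides by n - 1).
    S = np.cov(X, rowvar=False)

    # Step (2): eigenvalues and orthonormal eigenvectors of the symmetric
    # matrix S, re-sorted so that lambda_1 >= lambda_2 >= ...
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    gamma = eigvals / eigvals.sum()          # contribution rates

    # Step (3): smallest m whose cumulative contribution rate G(m)
    # reaches the threshold (85% in the text).
    G = np.cumsum(gamma)
    m = min(int(np.searchsorted(G, threshold)) + 1, len(eigvals))

    # Step (5): scores on the first m components. The coefficients are
    # applied to the raw indicator values; centring X first would only
    # shift each score by a constant.
    A = eigvecs[:, :m]                       # p x m coefficient matrix
    scores = X @ A
    return eigvals, gamma, A, scores


# Synthetic demo data (hypothetical): 4 indicators driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.8, 0.1, 0.0],
                   [0.0, 0.2, 0.9, 1.0]])
X_demo = 70 + 10 * latent @ mixing + rng.normal(scale=1.0, size=(200, 4))

eigvals, gamma, A, scores = pca_reduce(X_demo)
print("contribution rates:", np.round(gamma, 4))
print("components kept:", A.shape[1])
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; it returns eigenvalues in ascending order, hence the explicit re-sort.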
3 Attribute Coordinate Comprehensive Evaluation Model
3.1 Explore Barycentric Coordinates Reflecting Evaluators’ Preference Weight
The attribute coordinate comprehensive evaluation method combines machine learning with experts' ratings of sample data. Let T0 be the critical total score and Tmax the largest total score. We select several total scores T1, T2, …, Tn−1 evenly from (T0, Tmax), as required for curve fitting, then select some samples at each total score Ti (i = 1, 2, …, n − 1) and have the experts rate them according to their preferences or experience. This is the sample-learning process, and it yields the barycentric coordinate for each Ti (i = 1, 2, …, n − 1) according to (7).
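\[ b(\{ f^{h} (z)\} ) = \frac{\sum\limits_{h = 1}^{t} {v^{h} (z)f^{h}}}{\sum\limits_{h = 1}^{t} {v^{h} (z)}} \tag{7} \]
Here \( \{ f^{k} ,k = 1, \ldots ,s\} \subseteq S_{T} \cap F \) is the set of samples whose total score equals T. In Formula (7), \( b(\{ f^{h} (z)\} ) \) is the barycentric coordinate of the rated samples, \( \{ f^{h} ,h = 1, \ldots ,t\} \) are the indicator values of the t samples that the evaluator Z selects from \( \{ f^{k} \} \), and \( \{ v^{h} (z)\} \) are the ratings that the evaluator gives to these samples, which serve as weights.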
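A small numerical sketch may help here. It reads the barycentric coordinate as the rating-weighted mean of the rated samples on one total-score plane, which is what the description of the ratings as weights suggests. The function name `barycentric_coordinate` and the three sample rows are hypothetical and are not taken from the paper's tables.

```python
import numpy as np


def barycentric_coordinate(samples, ratings):
    """Rating-weighted mean of the rated samples on one total-score plane.

    samples: t x m array, one row of indicator values per rated sample.
    ratings: length-t array of non-negative ratings used as weights.
    """
    samples = np.asarray(samples, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    if samples.shape[0] != ratings.shape[0]:
        raise ValueError("one rating is required per sample")
    if np.any(ratings < 0) or ratings.sum() == 0:
        raise ValueError("ratings must be non-negative and not all zero")
    return ratings @ samples / ratings.sum()


# Hypothetical samples, all with total score 240 (invented for illustration).
samples = [[90, 80, 70],
           [80, 80, 80],
           [70, 90, 80]]
ratings = [3, 1, 0]          # the evaluator prefers the first profile

b = barycentric_coordinate(samples, ratings)
print(b.tolist())            # [87.5, 80.0, 72.5]
```

Because every rated sample lies on the same total-score plane, the weighted mean does too: the components above sum to 240.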
3.2 Calculate the Most Satisfactory Solution
Use the interpolation formula \( G_{j} (T) = a_{0j} + a_{1j} T + a_{2j} T^{2} + \cdots + a_{n + 1,j} T^{n + 1} \) and the barycentric coordinates obtained above to fit a curve and construct the psychological barycentric line (or most satisfactory local solution line) \( L(b(\{ f^{h} (z)\} )) \). Then calculate the global satisfaction degree of each evaluated object according to (8), and sort the objects in descending order of satisfaction degree to obtain the most satisfactory solution.
Here sat(f, Z) is the satisfaction of evaluator Z with the evaluated object f, whose value is expected to lie between 0 and 1. \( f_{j} \) is the value of the jth indicator. \( \left| {f_{j} - b(f^{h} (z_{j} ))} \right| \) measures the difference between each attribute value and the corresponding barycentric value. \( w_{j} \) and \( \delta_{j} \) are adjustable factors that can be tuned to make the satisfaction values comparable when the original results are not desirable. \( \sum\limits_{j = 1}^{m} {F_{j} } \) is the sum of the full scores \( F_{j} \) of all indicators. \( \sum\limits_{ij = 1}^{m} {f_{ij} } \) is the sum of the values \( f_{ij} \) of all indicators of the object \( f_{i} \).
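The curve-fitting step can be illustrated as follows. This is only a sketch: it assumes one polynomial per indicator that passes exactly through the available barycentric coordinates (degree equal to the number of planes minus one), and the helper names and the three barycentric coordinates are hypothetical. The satisfaction degree of Formula (8) is not covered.

```python
import numpy as np


def fit_barycentric_curves(totals, barycenters):
    """Fit one interpolating polynomial per indicator.

    totals:      length-k array of total scores T_i.
    barycenters: k x m array, row i is the barycentric coordinate at T_i.
    Returns a (k, m) array; column j holds the coefficients of G_j(T),
    highest power first.
    """
    totals = np.asarray(totals, dtype=float)
    barycenters = np.asarray(barycenters, dtype=float)
    degree = len(totals) - 1          # assumption: exact fit through k points
    return np.polyfit(totals, barycenters, degree)


def evaluate_curves(coeffs, T):
    """Evaluate every G_j at a single total score T."""
    return np.array([np.polyval(coeffs[:, j], T)
                     for j in range(coeffs.shape[1])])


# Hypothetical barycentric coordinates on three total-score planes.
totals = [180.0, 240.0, 300.0]
barycenters = [[62.0, 60.0, 58.0],
               [87.5, 80.0, 72.5],
               [100.0, 100.0, 100.0]]

coeffs = fit_barycentric_curves(totals, barycenters)
point = evaluate_curves(coeffs, 270.0)   # a point on the fitted barycentric line
print(np.round(point, 3))
```

Since each barycentric coordinate sums to its total score and the fit is linear in the data, the interpolated point at T = 270 also sums to 270, so the fitted line stays on the total-score planes.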
4 Simulation Experiment
To verify the effectiveness of the improved method, we chose the grades of nine courses from 2008 students in the final exam in a high school as the experimental data, nine courses being taken as nine indicators including Chinese, mathematics, English, physics, chemistry, politics, history, geography and biology. The sample data is shown in Table 1.
First, we apply the attribute coordinate comprehensive evaluation method without principal component analysis and construct the psychological barycentric curves of several courses. We then apply the improved method: principal component analysis is used to simplify the indicators, and the attribute coordinate comprehensive evaluation method is applied to construct the psychological barycentric curves of the new indicators.
We also compare the global satisfaction degrees of two students under the original method and the improved method.
4.1 Attribute Coordinate Comprehensive Evaluation Without Using Principal Component Analysis
We choose the total scores 1000, 701 and 620 as the three evaluation planes and select some samples on each plane for the experts to rate. The last column (Rating) of Tables 2 and 3 gives the rating data for total scores 701 and 620, respectively.
According to (7), the barycentric coordinates for (Chinese, math, geography) at total scores 701 and 620 are (88.65625, 92.625, 76.4375) and (88.79069767, 83.93023256, 67.51162791), respectively.
Next, using the interpolation formula, we calculate the barycentric curves of Chinese, mathematics and geography (shown in Figs. 1, 2 and 3, respectively). The barycentric curve of Chinese is clearly unreasonable: the curve should increase monotonically, yet its value at a total score of 650 is even lower than at a total score of 600. Figs. 2 and 3 show that the barycentric curves of mathematics and geography are almost the same, so it is not clear whether the expert puts more weight on arts or on science.
The most likely reason for this result is that, with nine indicators, it is difficult for the experts to distinguish good samples from bad ones accurately, which can lead to arbitrary ratings.
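The monotonicity requirement can also be checked directly on the two barycentric coordinates reported above for total scores 620 and 701. The sketch below only compares those reported values; the helper name `monotone_increasing` is introduced for the example, and the fitted curves in Figs. 1 to 3 are not reproduced.

```python
import numpy as np

# Barycentric coordinates reported in Sect. 4.1 for (Chinese, mathematics, geography).
indicators = ["Chinese", "mathematics", "geography"]
totals = np.array([620.0, 701.0])
barycenters = np.array([
    [88.79069767, 83.93023256, 67.51162791],   # total score 620
    [88.65625, 92.625, 76.4375],               # total score 701
])


def monotone_increasing(totals, values):
    """For each indicator, test whether its barycentric value rises with the total score."""
    order = np.argsort(totals)
    return np.all(np.diff(values[order], axis=0) > 0, axis=0)


flags = monotone_increasing(totals, barycenters)
for name, ok in zip(indicators, flags):
    print(f"{name}: {'increasing' if ok else 'not increasing'}")
```

The Chinese coordinate is slightly lower at 701 (88.65625) than at 620 (88.79069767), which is consistent with the non-monotonic curve described for Chinese, while the mathematics and geography coordinates both rise.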
4.2 Attribute Coordinate Comprehensive Evaluation with Principal Component Analysis
We apply the improved algorithm, first carrying out principal component analysis to reduce the quantity of indicators.
(1) Calculate the covariance matrix S (correlation coefficient matrix) between the indicators.
(2) Calculate the vector of eigenvalues of S:
(1.5315, 0.2945, 0.2291, 0.1658, 0.1331, 0.1170, 0.1006, 0.0881, 0.0778)
(3) Calculate the principal component contribution rate vector \( \gamma \) and the cumulative contribution rate G(m).
The contribution rate vector (in %) is \( \gamma \) = (55.9456, 10.7591, 8.3684, 6.0553, 4.8631, 4.2733, 3.6763, 3.2172, 2.8418).
The cumulative contribution rate of the first three principal components is G(3) = 75.0731%. Although this is below the 85% threshold and some information is lost, the loss is not large enough to affect the overall picture.
Using the unit eigenvectors of S as coefficients, the first three principal components (f1, f2, f3) are expressed as follows.
f1 = 0.1725x1 + 0.5151x2 + 0.2543x3 + 0.4274x4 + 0.3474x5 + 0.1596x6 + 0.231x7 + 0.3152x8 + 0.3983x9
f2 = 0.3276x1 − 0.3978x2 + 0.7953x3 − 0.1859x4 − 0.0728x5 + 0.1882x6 + 0.1357x7 − 0.0823x8 + 0.0389x9
f3 = −0.2918x1 + 0.5263x2 + 0.4751x3 + 0.0203x4 − 0.0053x5 − 0.2401x6 − 0.3541x7 − 0.4071x8 − 0.2510x9
Here x1, x2, …, x9 denote the scores in Chinese, mathematics, English, physics, chemistry, politics, history, geography and biology, respectively.
In the expression of the first principal component f1, every variable has a positive loading, which indicates that f1 represents a student's comprehensive level.
In the expression of the second principal component f2, the value of f2 decreases as x2 (mathematics), x4 (physics) and x5 (chemistry) increase, and increases as x3 (English), x6 (politics), x7 (history) and x9 (biology) increase, which indicates that f2 reflects a student's level in liberal arts.
In the expression of the third principal component f3, the value of f3 increases as x2 (mathematics), x3 (English) and x4 (physics) increase, and decreases as x1 (Chinese), x6 (politics), x7 (history), x8 (geography) and x9 (biology) increase, which indicates that f3 reflects a student's level in science.
In this way the nine indicators are reduced to three: f1, f2 and f3. Students' scores can now be calculated in the new indicator system; Table 4 shows the sample data expressed in the new indicators.
(4) Attribute coordinate comprehensive evaluation
We provide three total score planes, 460, 345 and 311, for the expert to rate. The samples on the last two planes are shown in Tables 5 and 6, respectively. The expert's preference can be seen directly from the ratings (the last column). When the total score is higher, the expert pays more attention to the comprehensive level of students. When the total score is lower, the expert values students' science scores more. Rating the samples is easier here than without principal component analysis.
We obtain the barycentric coordinates for total scores 460, 345 and 311 as (268.2157, 92.1146, 100), (223.98, 51.46642, 69.22313) and (200.2936, 44.12158, 66.05498), respectively. We then draw the barycentric curves of the indicators f1, f2 and f3 (shown in Figs. 4, 5 and 6, respectively). All three curves increase monotonically, which is more reasonable than the curves drawn with the original model.
(5) Comparison of satisfaction degrees before and after the improvement
Finally, we compare the satisfaction degrees obtained with the two models for two students, No. 466 and No. 196. They have almost the same total score, but No. 196 is clearly better at science than No. 466. Since the evaluator values science scores more, the satisfaction degree of No. 196 should be greater than that of No. 466. With the original method, however, the result is the opposite, which is unreasonable (shown in Table 7). The improved algorithm corrects this flaw and gives a reasonable result that better reflects the evaluator's preference (shown in Table 8).
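The figures reported in steps (2) and (3) can be cross-checked with a short script. The sketch below recomputes the contribution rates from the printed (rounded) eigenvalues, checks that the printed coefficients of f1, f2 and f3 behave like orthonormal eigenvectors, and applies them to one student. The student's scores are invented for illustration and do not come from Table 1; the variable names are likewise assumptions.

```python
import numpy as np

# Eigenvalues reported in step (2), rounded to four decimals in the text.
eigvals = np.array([1.5315, 0.2945, 0.2291, 0.1658, 0.1331,
                    0.1170, 0.1006, 0.0881, 0.0778])

gamma = 100 * eigvals / eigvals.sum()      # contribution rates in %
G = np.cumsum(gamma)                       # cumulative contribution rates in %
m85 = int(np.searchsorted(G, 85.0)) + 1    # components needed to reach 85%
print(np.round(gamma[:3], 2), round(float(G[2]), 2), m85)

# Coefficients of f1, f2, f3 as printed in step (3); rows are components.
A = np.array([
    [0.1725, 0.5151, 0.2543, 0.4274, 0.3474, 0.1596, 0.2310, 0.3152, 0.3983],
    [0.3276, -0.3978, 0.7953, -0.1859, -0.0728, 0.1882, 0.1357, -0.0823, 0.0389],
    [-0.2918, 0.5263, 0.4751, 0.0203, -0.0053, -0.2401, -0.3541, -0.4071, -0.2510],
])

# Unit eigenvectors of a symmetric matrix are orthonormal, so A A^T
# should be close to the 3 x 3 identity matrix.
print(np.round(A @ A.T, 3))

# Hypothetical student (scores invented for illustration), in the order
# Chinese, mathematics, English, physics, chemistry, politics, history,
# geography, biology.
x = np.array([80, 90, 85, 88, 92, 75, 70, 78, 84], dtype=float)
f = A @ x                                  # scores on f1, f2, f3
print(np.round(f, 2))
```

Recomputed from the rounded eigenvalues, the rates agree with the reported ones to within about 0.01 percentage points. With these eigenvalues the 85% threshold of Sect. 2 would first be reached with five components, which makes the trade-off behind keeping only three explicit.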
5 Conclusion
The improved method integrates principal component analysis into the original method to reduce the number of indicators, which makes the experts' rating process simpler and more effective. The simulation compares the results obtained with and without principal component analysis and shows that, with it, the barycentric curves are more reasonable and the satisfaction degrees of the evaluated objects reflect the preferences and experience of the experts more accurately.
References
1. Ji, Z.: Research on personal information security evaluation based on analytic hierarchy process. Value Eng. 5, 57–60 (2018)
2. Gong, X., Chang, X.: Comprehensive prejudgment of coal mine fire rescue path optimization based on fault tree-analytic hierarchy process. Mod. Electron. Tech. 8, 151–154 (2018)
3. Pan, X., Liu, F.: Evaluation mode of risk investment based on principal component analysis. Sci. Technol. Prog. Policy 3, 65–67 (2004)
4. Zhou, B., Wang, M.: A method of cloud manufacturing service QoS evaluation based on PCA. Manuf. Autom. 14, 28–33 (2013)
5. Xu, X., Xu, G., Feng, J.: Study on updating algorithm of attribute coordinate evaluation model. In: Huang, D.-S., Hussain, A., Han, K., Gromiha, M.M. (eds.) ICIC 2017. LNCS (LNAI), vol. 10363, pp. 653–662. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63315-2_57
6. Xu, G., Xu, X.: Study on evaluation model of attribute barycentric coordinates. Int. J. Grid Distrib. Comput. 9(9), 115–128 (2016)
7. Xu, X., Feng, J.: A quantification method of qualitative indices based on inverse conversion degree functions. In: Enterprise Systems Conference, pp. 261–264 (2014)
8. Xu, G., Min, S.: Research on multi-agent comprehensive evaluation model based on attribute coordinate. In: IEEE International Conference on Granular Computing (GrC), pp. 556–562 (2012)
9. Xu, X., Xu, G., Feng, J.: A kind of synthetic evaluation method based on the attribute computing network. In: IEEE International Conference on Granular Computing (GrC), pp. 644–647 (2009)
10. Xu, X., Xu, G.: Research on ranking model based on multi-user attribute comprehensive evaluation method. In: Applied Mechanics and Materials, pp. 644–650 (2014)
11. Xu, X., Xu, G.: A recommendation ranking model based on credit. In: IEEE International Conference on Granular Computing (GrC), pp. 569–572 (2012)
12. Xu, X., Feng, J.: Research and implementation of image encryption algorithm based on zigzag transformation and inner product polarization vector. In: IEEE International Conference on Granular Computing, vol. 95, no. 1, pp. 556–561 (2010)
13. Xu, G., Wang, L.: Evaluation of aberrant methylation gene forecasting tumor risk value in attribute theory. J. Basic Sci. Eng. 16(2), 234–238 (2008)
© 2018 IFIP International Federation for Information Processing
Xu, X., Liu, Y., Feng, J. (2018). Attribute Coordinate Comprehensive Evaluation Model Combining Principal Component Analysis. In: Shi, Z., Pennartz, C., Huang, T. (eds) Intelligence Science II. ICIS 2018. IFIP Advances in Information and Communication Technology, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-030-01313-4_7
DOI: https://doi.org/10.1007/978-3-030-01313-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01312-7
Online ISBN: 978-3-030-01313-4
eBook Packages: Computer Science, Computer Science (R0)