Abstract
Interval-valued data appear as a way to represent the uncertainty affecting the observed values. Dealing with interval-valued information systems is helpful to generalize the applications of rough set theory. Attribute reduction is a key issue in analysis of interval-valued data. Existing attribute reduction methods for single-valued data are unsuitable for interval-valued data. So far, there have been few studies on attribute reduction methods for interval-valued data. In this paper, we propose a framework for attribute reduction in interval-valued data from the viewpoint of information theory. Some information theory concepts, including entropy, conditional entropy, and joint entropy, are given in interval-valued information systems. Based on these concepts, we provide an information theory view for attribute reduction in interval-valued information systems. Consequently, attribute reduction algorithms are proposed. Experiments show that the proposed framework is effective for attribute reduction in interval-valued information systems.
1 Introduction
Rough set theory is an extension of set theory for the study of intelligent systems characterized by insufficient and incomplete information (Pawlak, 1991). It has attracted the attention of many researchers who have studied its theories and applications in recent years (Dai and Xu, 2012; 2013; Dai et al., 2012; 2013a; 2013c; Hu, 2015; Lin et al., 2015; Zhang X et al., 2015; Zhang XH et al., 2016).
The classical rough set model is not appropriate for handling interval-valued data. However, in real applications, many data are interval-valued (Billard et al., 2008; Hedjazi et al., 2011; Dai et al., 2012; 2013b). Thus, dealing with interval-valued information systems becomes an interesting problem in rough set theory (Leung et al., 2008; Qian et al., 2008; Yang et al., 2009). Dai (2008) investigated the algebraic structures for interval-set-valued rough sets generated from an approximation space. Recently, Dai et al. (2012; 2013b) studied the uncertainty measurement issue in interval-valued information systems. Leung et al. (2008) presented a rough set approach based on the misclassification rate for interval-valued information systems. Qian et al. (2008) proposed a dominance relation for ordered interval information systems. Yang et al. (2009) investigated a dominance relation in incomplete interval-valued information systems. However, there have been few studies dealing with the attribute reduction issue in interval-valued information systems. Dai et al. (2013b) considered this issue from the viewpoint of uncertainty measurement. In this study, we aim to introduce a heuristic approach for attribute reduction in interval-valued information systems based on information theory.
2 Basic concepts
2.1 Similarity degree between interval values
Unlike classical real values, two interval values are difficult to compare using traditional methods. Motivated by this fact, some research efforts have been directed toward finding efficient methods to measure or rank two interval values, mainly in the fuzzy set community (Bustince et al., 2006; Zhang and Fu, 2006; Galar et al., 2011). Galar et al. (2011) defined the similarity between two intervals as an interval. Zhang and Fu (2006) defined the similarity between two intervals as a real number. In this study, motivated by a similarity measure for interval-valued fuzzy sets proposed by Zhang and Fu (2006), we give a similarity measure for general interval-valued data:

Definition 1 Let \(U = \left\{ {{u_1},\,{u_2},\, \ldots ,\,{u_n}} \right\}\) be the universe of the interval values, with \({u_i} = \left[ {u_i^- ,u_i^+} \right],\;i = 1,2, \ldots ,n\). Let \({m^-} = {\min\nolimits_{{u_i} \in U}}\{ u_i^- \} ,\;\;\;{m^+} = {\max\nolimits_{{u_i} \in U}}\{ u_i^+ \}\). The relative bound difference similarity degree between u i and u j is defined as
Proposition 1 The relative bound difference similarity degree υ ij has the following properties:
1. 0 ≤ υ ij ≤ 1;
2. υ ij = 1 if and only if u i equals u j ;
3. υ ij = υ ji .
Proof It can be proved easily according to Definition 1.
Example 1 Assume U = {u1, u2}, u1 = [3, 4], u2 = [2, 5]. We have
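The displayed formula of Definition 1 is elided above. A minimal Python sketch follows, assuming the measure takes the form υij = 1 − (|ui^− − uj^−| + |ui^+ − uj^+|) / (2(m^+ − m^−)); this form satisfies Proposition 1 but is an assumption, not necessarily the paper's exact definition:

```python
def rbd_similarity(ui, uj, m_minus, m_plus):
    """Relative bound difference similarity between intervals ui and uj,
    given the global bounds m_minus/m_plus over the universe U.
    NOTE: the exact formula is an assumption reconstructed from context."""
    if m_plus == m_minus:          # degenerate universe: every value equal
        return 1.0
    diff = abs(ui[0] - uj[0]) + abs(ui[1] - uj[1])
    return 1.0 - diff / (2.0 * (m_plus - m_minus))

# Example 1: U = {[3, 4], [2, 5]}, so m- = 2 and m+ = 5
v = rbd_similarity((3, 4), (2, 5), 2, 5)   # 2/3 under the assumed formula
```

Under this assumed form, identical intervals give similarity 1 and the measure is symmetric, matching properties 1–3 of Proposition 1.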
2.2 Similarity classes and generalized decisions in interval-valued information systems
Let IVIS = (U, A) denote an interval-valued information system, where \(U = \left\{ {{u_1},\,{u_2},\, \ldots ,\,{u_n}} \right\}\) is a non-empty finite set called the universe of discourse, and \(A = \left\{ {{a_1},\,{a_2},\, \ldots ,\,{a_m}} \right\}\) is a non-empty finite set of m attributes called conditional attributes.
Definition 2 Assume that IVIS = (U, A) is an interval-valued information system. For a given similarity rate α ∈ [0, 1] and an attribute subset B ⊆ A, an α-similarity class of an object u i ∈ U is denoted as
where \(v_{ij}^\kappa \) represents the similarity degree of u i and u j at the κth attribute.
Remark 1 \(S_B^\alpha \) denotes the family set \(\left\{ {S_B^\alpha \left( {{u_i}} \right)\,\left| {\,{u_i} \in U} \right.} \right\}\).
\(S_B^\alpha \left( {{u_i}} \right)\) is the maximum set of objects which are possibly indiscernible with object u i by attribute set B under similarity rate α. In other words, \(S_B^\alpha \left( {{u_i}} \right)\) is an α-similarity class of u i .
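The construction of an α-similarity class (whose displayed formula is elided in Definition 2) can be sketched as follows; `sim` is a pluggable similarity helper such as the one in Definition 1, and the membership condition assumed is that the similarity exceeds α on every attribute in B:

```python
def similarity_class(i, data, B, alpha, sim):
    """alpha-similarity class S_B^alpha(u_i): all objects u_j whose
    similarity to u_i exceeds alpha on every attribute in B.
    data[j][k] is the interval value of object j on attribute k."""
    return {j for j in range(len(data))
            if all(sim(data[i][k], data[j][k]) > alpha for k in B)}
```

Because membership requires the condition on every attribute of B, the class for B equals the intersection of the single-attribute classes, which is exactly property 1 of Proposition 2.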
Proposition 2 Given an interval-valued information system IVIS, and assuming that the attribute subset B ⊆ A, then \(S_B^\alpha \) has the following properties for any u i ∈ U:
1. \(S_B^\alpha \left( {{u_i}} \right) = \bigcap\limits_{b \in B} {S_{\{ b\} }^\alpha \left( {{u_i}} \right)} \);
2. if C ⊆ B, then \(S_B^\alpha \left( {{u_i}} \right) \subseteq \,S_C^\alpha \left( {{u_i}} \right)\).
Proof
1. By definition, we have \(S_B^\alpha ({u_i}) = \left\{ {{u_j}\vert v_{ij}^b > \alpha ,\;\forall b \in B,\;{u_j} \in U} \right\} = \bigcap\limits_{b \in B} {S_{\{ b\}}^\alpha ({u_i})}\).
2. By definition, we have \(S_B^\alpha ({u_i}) = \left\{ {\left. {{u_j}} \right\vert v_{ij}^b > \alpha ,\;\forall b \in B,\;{u_j} \in U} \right\}\) and \(S_C^\alpha ({u_i}) = \left\{ {\left. {{u_j}} \right\vert v_{ij}^b > \alpha ,\;\forall b \in C,\;{u_j} \in U} \right\}\). For any \({u_j} \in S_B^\alpha ({u_i})\), we know \(v_{ij}^b > \alpha \) holds for any b ∈ B. Since C ⊆ B, \(v_{ij}^b > \alpha \) holds for any b ∈ C. Hence, \({u_j} \in S_C^\alpha \left( {{u_i}} \right)\).
Let IVDS = (U, A ∪ {d}) denote an interval-valued decision system where d is the decision attribute, also called the class label.
Definition 3 Assume that IVDS = (U, A ∪ {d}) is an interval-valued decision system. The decision class of an object u i ∈ U is denoted as
Remark 2 We use \(U/d = \left\{ {{D_i}\vert \;d({u_x}) = d({u_y}),\;\forall {u_x},{u_y} \in {D_i}} \right\}\) to denote the partition of U based on the decision attribute d; i.e., D(u i ) represents the set of objects that have the same decision attribute value as u i .
Similar to the definition of generalized decision proposed by Kryszkiewicz (1998) for incomplete decision systems, the α-generalized decision can be defined as follows:
Definition 4 The α-generalized decision of an object u i ∈ U is denoted as
Remark 3 \(\partial_B^\alpha\) denotes the family set \(\{ \partial_B^\alpha ({u_i})\vert {u_i} \in U\}\).
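The displayed formula of Definition 4 is elided above. Following Kryszkiewicz's notion of a generalized decision, a plausible sketch is the set of decision values occurring in the α-similarity class, and consistency then amounts to every generalized decision being a singleton (Proposition 3):

```python
def generalized_decision(sim_class, d):
    """alpha-generalized decision of an object: the set of decision values
    taken by the objects in its alpha-similarity class.
    d maps an object index to its decision value."""
    return {d[j] for j in sim_class}

def is_consistent(sim_classes, d):
    """An IVDS is consistent iff every generalized decision is a singleton."""
    return all(len(generalized_decision(S, d)) == 1 for S in sim_classes)
```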
Let \({\rm{IVDS}} = \left( {U,\;A \cup \{ d\}} \right)\) be an interval-valued decision system. If \(S_A^\alpha ({u_i}) \subseteq D({u_i})\) for any object u i ∈ U, then the interval-valued decision system (U, A ∪ {d}) is called a consistent (deterministic, definite) interval-valued decision system. Otherwise, it is called an inconsistent (non-deterministic, non-definite) interval-valued decision system.
Proposition 3 Given a consistent decision system \({\rm{IVDS}} = \left( {U,\;A \cup \{ d\}} \right)\), for an attribute set B ⊆ A, the following conditions are equivalent for all objects u i ∈ U:
1. \(S_B^\alpha ({u_i}) \subseteq D({u_i})\);
2. \(\left\vert {\partial_B^\alpha \left( {{u_i}} \right)} \right\vert = 1\).
Proof Suppose \(S_B^\alpha ({u_i}) \subseteq \;D({u_i})\). For any \({u_j} \in S_B^\alpha \left( {{u_i}} \right)\), we have u j ∈ D(u i ). It follows that d(u j ) = d(u i ). In other words, d(u j ) = d(u i ) holds for any u j satisfying \(v_{ij}^b > \alpha\), ∀ b ∈ B. Hence, we know \(\left\vert {\partial_B^\alpha \left( {{u_i}} \right)} \right\vert = 1\) by Definition 4.
Suppose \(\left\vert {\partial_B^\alpha \left( {{u_i}} \right)} \right\vert = 1\). For any \({u_j} \in S_B^\alpha \left( {{u_i}} \right)\), we have d(u i ) = d(u j ). It means u j ∈ D(u i ). Thus, we have \(S_B^\alpha \left( {{u_i}} \right) \subseteq D\left( {{u_i}} \right)\).
3 Information entropy and conditional entropy for interval-valued information systems
Let us introduce an information measure for the discernibility power of an α-similarity class denoted as \(S_B^\alpha \left( {{u_i}} \right)\), which is equivalent to Shannon’s entropy if \(S_B^\alpha \left( {{u_i}} \right)\) is a partition of U.
Definition 5 Given \({\rm{IVDS}} = (U,A \cup \{ d\} )\) and B ⊆ A, the information entropy of B is defined as follows:
where |·| denotes the number of elements in the set.
At the same time, the conditional entropy of B to d is defined as follows:
The joint entropy of B and d is defined as follows:
Note that the corresponding term in \({H_{{\rm{SIM}}}}(d\vert B)\) is defined as 0 when \(\vert S_B^\alpha ({u_i}) \cap {D_j}\vert = 0\).
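The displayed formulas for these entropies are elided above. A minimal sketch follows, assuming forms analogous to those in Dai et al. (2013b): H_SIM(B) = −(1/|U|) Σ_i log2(|S_B^α(u_i)|/|U|) and H_SIM(d|B) = −(1/|U|) Σ_i log2(|S_B^α(u_i) ∩ D(u_i)|/|S_B^α(u_i)|). These exact expressions are an assumption, chosen so that the properties of Proposition 4 hold:

```python
import math

def entropy(classes, n):
    """H_SIM(B): classes[i] is the similarity class S_B^alpha(u_i) as a set."""
    return -sum(math.log2(len(S) / n) for S in classes) / n

def conditional_entropy(classes, decision, n):
    """H_SIM(d|B): decision[i] is the decision class D(u_i) as a set.
    Terms with an empty intersection are taken as 0, per the note above."""
    total = 0.0
    for S, D in zip(classes, decision):
        k = len(S & D)
        if k > 0:
            total -= math.log2(k / len(S))
    return total / n

def joint_entropy(classes, decision, n):
    """H_SIM(d ∪ B) = H_SIM(d|B) + H_SIM(B) (Proposition 4, property 4)."""
    return conditional_entropy(classes, decision, n) + entropy(classes, n)
```

Under these assumed forms, H_SIM(d|B) vanishes exactly when every similarity class is contained in the corresponding decision class, matching property 3 of Proposition 4.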
Proposition 4 Let \({\rm{IVDS}} = (U,A \cup \{ d\} )\) be an interval-valued decision system. For the attribute subset B ⊆ A, we have
1. \({H_{{\rm{SIM}}}}(B) \geq 0\);
2. \({H_{{\rm{SIM}}}}(d \cup B) = \max \left\{ {{H_{{\rm{SIM}}}}(d),{H_{{\rm{SIM}}}}(B)} \right\}\);
3. \({H_{{\rm{SIM}}}}(d\vert B) = 0\) and \({H_{{\rm{SIM}}}}(d \cup B) = {H_{{\rm{SIM}}}}(B)\) if and only if \(S_B^\alpha ({u_i}) \subseteq D({u_i})\), ∀ u i ∈ U;
4. \({H_{{\rm{SIM}}}}(d \cup B) = {H_{{\rm{SIM}}}}(d\vert B) + {H_{{\rm{SIM}}}}(B)\).
Proof
Note that
4 Attribute reduction framework for interval-valued information systems and interval-valued decision systems based on information entropies
One significant problem in rough set theory is searching for particular subsets of attributes that provide the same information for classification as the whole attribute set. This process is called attribute reduction.
4.1 Information theory view for attribute reduction of interval-valued information systems
Definition 6 Assume that IVIS = (U, A) is an interval-valued information system. Given an attribute subset B ⊆ A and α ∈ [0, 1], B is a reduct of IVIS if and only if
1. \(S_B^\alpha = S_A^\alpha\);
2. \(\forall b \in B,\;S_{B - \{ b\}}^\alpha \neq S_A^\alpha\).
Definition 7 Let \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) be an interval-valued decision system. Given an attribute subset B ⊆ A and α ∈ [0, 1], B is a relative reduct of IVDS if and only if
1. \(\partial_B^\alpha = \partial_A^\alpha\);
2. \(\forall b \in B,\;\partial_{B - \{ b\}}^\alpha \neq \partial_A^\alpha\).
To construct our information theory based attribute reduction methods, we provide an information theory view for attribute reduction in interval-valued information systems.
Theorem 1 Assume that IVIS = (U, A) is an interval-valued information system. Then \(S_B^\alpha = S_A^\alpha\) and HSIM(B) = HSIM(A) are equivalent.
Proof
Theorem 2 Let IVIS = (U, A) be an interval-valued information system. If C ⊆ A is redundant, then \({H_{{\rm{SIM}}}}(A) - {H_{{\rm{SIM}}}}(A - C) = 0\).
Proof If C is redundant in IVIS, then we know \(S_{\{ A - C\}}^\alpha = S_A^\alpha\) by Definition 6. It is easy to obtain \({H_{{\rm{SIM}}}}(A - C) = {H_{{\rm{SIM}}}}(A)\) by Theorem 1.
Theorem 3 Let IVIS = (U, A) be an interval-valued information system. Then C ⊆ A is indispensable if and only if \({H_{{\rm{SIM}}}}(A) - {H_{{\rm{SIM}}}}(A - C) > 0\).
Proof Suppose the attribute subset C is indispensable. Then we have \(S_{\{ A - C\}}^\alpha \neq S_A^\alpha\). By Theorem 1, we have \({H_{{\rm{SIM}}}}(A - C) \neq {H_{{\rm{SIM}}}}(A)\). Since \({H_{{\rm{SIM}}}}(A - C) \leq {H_{{\rm{SIM}}}}(A)\), it follows that \({H_{{\rm{SIM}}}}(A) - {H_{{\rm{SIM}}}}(A - C) > 0\).
Definition 8 Assume that IVIS = (U, A) is an interval-valued information system. The significance of an attribute a i ∉ B relative to B is defined as
Definition 9 If \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) is an interval-valued decision system, the significance of attribute a i relative to B is defined as
From the above theorems, we obtain the information theory view for attribute reduction in interval-valued information systems.
Theorem 4 Assume that IVIS = (U, A) is an interval-valued information system. The attribute subset B ⊆ A is a reduct of A if and only if
1. HSIM(B) = HSIM(A);
2. ∀ b ∈ B, Sig(b, A) > 0.
Theorem 5 Given \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) and B ⊆ A, if IVDS is consistent, then \(\partial_B^\alpha = \partial_A^\alpha\) and \({H_{{\rm{SIM}}}}(d\vert B) = {H_{{\rm{SIM}}}}(d\vert A)\) are equivalent.
Proof Assume \(\partial_B^\alpha = \partial_A^\alpha\). Since \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) is consistent, we have \(\left\vert {\partial_A^\alpha ({u_i})} \right\vert = 1\) and \(S_A^\alpha ({u_i}) \subseteq D({u_i})\) for all objects u i ∈ U. Hence, \(\left\vert {\partial_B^\alpha ({u_i})} \right\vert = 1\) and \(S_B^\alpha ({u_i}) \subseteq D({u_i})\) for all objects u i ∈ U. By Proposition 4, we have \({H_{{\rm{SIM}}}}(d\vert B) = 0\). Since \({H_{{\rm{SIM}}}}(d\vert A) = 0\), we have \({H_{{\rm{SIM}}}}(d\vert B) = {H_{{\rm{SIM}}}}(d\vert A)\).
Suppose \({H_{{\rm{SIM}}}}(d\vert B) = {H_{{\rm{SIM}}}}(d\vert A)\). Since IVDS is consistent, we have \({H_{{\rm{SIM}}}}(d\vert B) = 0\). Hence, \(S_B^\alpha ({u_i}) \subseteq D({u_i})\) for all objects u i ∈ U. Since \(S_A^\alpha ({u_i}) \subseteq D({u_i})\), we have \(\partial_B^\alpha = \partial_A^\alpha\). This completes the proof.
Theorem 6 Let \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) be a consistent interval-valued decision system. Then C ⊆ A is dispensable if and only if \({H_{{\rm{SIM}}}}(d\vert A - C) = 0\).
Proof Suppose that \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) is a consistent interval-valued decision system and that C is a dispensable attribute subset. Then we have \(\partial_{A - C}^\alpha = \partial_A^\alpha\) by Definition 7. According to Theorem 5, we obtain \({H_{{\rm{SIM}}}}(d\vert A - C) = {H_{{\rm{SIM}}}}(d\vert A)\). Since IVDS is consistent, we have \({H_{{\rm{SIM}}}}(d\vert A) = 0\). Therefore, it follows that \({H_{{\rm{SIM}}}}(d\vert A - C) = 0\).
Theorem 7 Let \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) be a consistent interval-valued decision system. Then C ⊆ A is indispensable if and only if \({H_{{\rm{SIM}}}}(d\vert A - C) > 0\).
Proof Assume that \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) is a consistent interval-valued decision system and that the attribute subset C is indispensable. Then we have \(\partial_{A - C}^\alpha \neq \partial_A^\alpha\), and by Theorem 5, \({H_{{\rm{SIM}}}}(d\vert A - C) \neq {H_{{\rm{SIM}}}}(d\vert A)\). According to Proposition 4, \({H_{{\rm{SIM}}}}(d\vert A - C) \geq {H_{{\rm{SIM}}}}(d\vert A)\), and since IVDS is consistent, HSIM(d | A) = 0. Consequently, we have \({H_{{\rm{SIM}}}}(d\vert A - C) > 0\).
From the above theorems, we obtain the information theory view for attribute reduction in consistent interval-valued decision systems.
Theorem 8 Assume that \({\rm{IVDS}} = (U,A \cup \;\{ d\} )\) is a consistent interval-valued decision system. The attribute subset B ⊆ A is a reduct of IVDS if and only if
1. \({H_{{\rm{SIM}}}}(d\vert B) = {H_{{\rm{SIM}}}}(d\vert A)\);
2. ∀ b ∈ B, Sig(b, B, d) > 0.
4.2 Attribute reduction algorithms for IVIS and IVDS
The attribute reduction algorithms for IVIS and IVDS are given in Algorithms 1 and 2, respectively.
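Algorithm 2 itself is not reproduced here, but the greedy forward-selection scheme that Example 2 walks through can be sketched as follows, assuming a callable `cond_entropy(B)` that computes H_SIM(d|B) for an attribute subset B (the final pass removing superfluous attributes from the candidate reduct is omitted for brevity):

```python
def attribute_reduction(attrs, cond_entropy):
    """Greedy forward selection: repeatedly add the attribute that
    minimizes the conditional entropy H_SIM(d|B), stopping once the
    conditional entropy of the full attribute set is reached."""
    target = cond_entropy(set(attrs))
    B = set()
    while cond_entropy(B) > target:
        best = min((a for a in attrs if a not in B),
                   key=lambda a: cond_entropy(B | {a}))
        B.add(best)
    return B
```

With hypothetical entropy values shaped like those of Example 2 (attribute c minimal alone, then a tie between a and b broken in favor of the attribute explored first), this sketch returns {a, c}.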
Now we provide an example to illustrate the proposed method. Assume that an IVDS is as listed in Table 1.
Example 2 Suppose the similarity rate α = 0.8 and A = {a, b, c}. We confirm that the IVDS shown in Table 1 is consistent when α = 0.8. Let us compute \(S_A^{0{.}8}\left( {{u_1}} \right)\) in detail. By Definition 1, we have
Then, we have
Similarly, we have
Thus, we obtain
In the same way, we can compute \(S_A^{0{.}8}\left( {{u_i}} \right)\) for all objects u i ∈ U:
By decision attribute d, we have the partition
The decision classes based on the decision attribute are also listed as follows:
First, an empty set is assigned to the original attribute reduction set B, i.e., B = ∅, and the conditional entropy for each single attribute a, b, or c can be computed.
By definition, we have
Hence, we obtain
Then, we find HSIM(d | c) is the minimum, and attribute c is added to attribute reduction set B, i.e., B = {c}. The next step is to compute the conditional entropies of c ∪ a and c ∪ b:
Since \({H_{{\rm{SIM}}}}(d\vert a \cup c) = {H_{{\rm{SIM}}}}(d\vert b \cup c)\), we choose the one explored first, i.e., attribute a. The algorithm terminates since the conditional entropy is equal to zero. Hence, we obtain the reduct B = {a, c}.
5 Experiments
To test the effectiveness of the proposed algorithm, experiments on three real-world datasets are performed. All values of conditional attributes in the datasets are interval values.
5.1 Fish dataset (or ecotoxicology dataset)
The Fish dataset has been introduced to test the effectiveness of attribute reduction for symbolic interval data (Hedjazi et al., 2011). This dataset is composed of observations for abnormal levels of mercury contamination in some Amerindian areas, taken from several studies in French Guyana by researchers from the LEESA Laboratory.
There are 13 interval conditional attributes: length, weight, muscle, intestine, stomach, gills, liver, kidneys, liver/muscle, kidneys/muscle, gills/muscle, intestine/muscle, and stomach/muscle, labeled a1−a13. In addition, a reference classification with respect to the fish diet (carnivorous, detritivorous, omnivorous, or herbivorous) is taken as the decision attribute.
The conditional entropy for each conditional attribute is illustrated in Fig. 1. We can find that the conditional entropy of attribute a9, which represents the liver/muscle, is the minimum. This means liver/muscle is more significant than the other attributes.
5.2 Face Recognition dataset
The Face Recognition dataset focuses on face recognition. Each interval value represents the measurement of a local feature in a face image. For each face image, salient features such as the nose, mouth, and eyes are localized using morphological operators. Distances between specific points delimiting each feature boundary are measured and described as interval values.
The dataset contains 9 men with 3 sequences for each, giving a total of 27 observations (Dai et al., 2013b). The decision attribute identifies which person it is. There are 6 conditional attributes, including the length spanned by the eyes, the length between the eyes, the length from the outer right eye to the upper middle lip at the point between the nose and mouth, and so on. We can represent the six conditional attributes as a1−a6.
Using our reduction algorithm, the reduct B is first initialized to the empty set, and the conditional entropy of each single conditional attribute is computed (Fig. 2). The conditional entropy of attribute a3 (the length from the outer right eye to the upper middle lip at the point between the nose and the mouth) is the minimum. This means this length is more significant than the other attributes.
5.3 Car dataset
The Car dataset contains 33 car models described by 7 interval variables, 2 categorical multivalued variables, and 1 nominal variable. This dataset has been used in research on clustering for interval values (Hedjazi et al., 2011; Dai et al., 2013b).
In this study, we take only the 7 interval conditional attributes into consideration, which are price, engine capacity, top speed, step, length, width, and height, denoted as a1−a7. The nominal variable ‘car category’ represents the decision attribute, which has been used as a priori classification.
In Fig. 3, we list the conditional entropy of each attribute a1−a7 to decision attribute d. It is easy to find that HSIM(d | a1) is the minimum, and we can conclude that the attribute price provides more information for classification.
5.4 Comparison of performance
To verify the effectiveness of the proposed approach, attribute selection experiments based on the proposed uncertainty measurements are conducted on the above datasets.
Besides the similarity measure constructed in Definition 1, we consider two other similarity measures for comparison. One is constructed by Dai et al. (2012; 2013b) based on possible degree. The similarity is called possible degree similarity. Let \(A = \left[ {{a^-},{a^+}} \right]\) and \(B = \left[ {{b^-},{b^+}} \right]\) be two interval values. The possible degree similarity between the two interval values is defined as
where
\({P_{(A \geq B)}}\) and \({P_{(B \geq A)}}\) are the possible degree of A relative to B and the possible degree of B relative to A, respectively.
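The displayed formulas are elided above. One common formulation of the possible degree, and a plausible way to combine the two directed degrees into a similarity, can be sketched as follows; both expressions are assumptions, not necessarily the exact definitions used by Dai et al.:

```python
def possible_degree(a, b):
    """P(A >= B) for intervals a = (a-, a+), b = (b-, b+); a common
    formulation, clipped to [0, 1]."""
    la, lb = a[1] - a[0], b[1] - b[0]
    if la + lb == 0:                       # both intervals degenerate
        return 1.0 if a[0] >= b[0] else 0.0
    return min(1.0, max(0.0, (a[1] - b[0]) / (la + lb)))

def pd_similarity(a, b):
    """Possible degree similarity (assumed combination): the closer
    P(A >= B) and P(B >= A) are, the more similar the intervals."""
    return 1.0 - abs(possible_degree(a, b) - possible_degree(b, a))
```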
The other similarity measure can be adapted from that used by Dai and Tian (2013) to handle set-valued data. For interval values \(A = \left[ {{a^-},{a^+}} \right]\) and \(B = \left[ {{b^-},{b^+}} \right]\), the intersection-union similarity is defined as
It is easy to prove that the similarity has the three properties in Proposition 1.
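A sketch of the intersection-union similarity, assuming interval lengths are used for |A ∩ B| and |A ∪ B|, with the span of the two intervals taken as the union length:

```python
def iu_similarity(a, b):
    """Intersection-union similarity |A ∩ B| / |A ∪ B| for intervals
    a = (a-, a+), b = (b-, b+), measured by lengths (assumed reading)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])      # span of the two intervals
    return inter / union if union > 0 else 1.0
```

Under this reading the measure is bounded by [0, 1], symmetric, and equals 1 exactly for identical intervals, i.e., it satisfies the three properties of Proposition 1.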
Very few classifiers can be used to address interval data; therefore, to compare the performance of classification based on the selected attributes, Dai et al. (2013b) extended the classical k-nearest neighbor (KNN) classifier and probabilistic neural network (PNN) classifier to handle interval-valued data by redefining the distance between two objects:
Definition 10 (Dai et al., 2013b) Suppose that X and Y are two objects in interval-valued information systems and that \(u_i^\kappa\) and \(u_j^\kappa\) are two interval values at the κth attribute. The distance between X and Y is defined as follows:
where m is the number of conditional attributes and \({P_{(u_i^\kappa \geq u_j^\kappa )}}\) is the possible degree between two interval values.
Due to the limited number of objects in the dataset, we use a leave-one-out cross-validation approach to evaluate the classification performances.
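The leave-one-out evaluation of the extended KNN classifier can be sketched as follows. The interval distance of Definition 10 is elided above, so the distance is left as a pluggable parameter; the midpoint-based distance in the test below is purely illustrative:

```python
from collections import Counter

def loocv_accuracy(data, labels, distance, k=1):
    """Leave-one-out cross-validation of a k-nearest-neighbor classifier
    with a pluggable distance between interval-valued objects."""
    n, correct = len(data), 0
    for i in range(n):
        # rank all other objects by distance to object i
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: distance(data[i], data[j]))[:k]
        votes = Counter(labels[j] for j in neighbors)
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / n
```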
In the experiments, the relative bound difference similarity, possible degree similarity, and intersection-union similarity are used in the proposed attribute reduction framework, denoted as RBD, PD, and IU, respectively. We also compare these three methods with the attribute selection method based on uncertainty measurement for interval-valued information systems constructed by Dai et al. (2013b), called uncertainty measurement attribute reduction (UMAR).
The results are shown in Tables 2–4. The parameter α represents the similarity threshold (Section 2.2), used to construct similarity relations and similarity classes. Tables 2–4 show the accuracy rates on the Fish, Face Recognition, and Car datasets, respectively, obtained by the extended KNN classifier and the extended PNN classifier. From the results, we find that RBD clearly outperforms the existing method UMAR. Moreover, PD is slightly better than UMAR, and IU performs as well as UMAR. The results indicate that the proposed attribute reduction method is feasible and effective. Applying different similarity measures to the proposed algorithm leads to different results: among RBD, IU, and PD, RBD performs best in most cases, and PD performs slightly better than IU. The reason may lie in differences among the three measures in their ability to describe the similarity between two interval values.
6 Conclusions
The classical rough set model is not appropriate for handling interval-valued data. In this paper, we present a new framework for attribute reduction in interval-valued information systems from the viewpoint of information theory. Some information theory concepts, including entropy, conditional entropy, and joint entropy, are defined in interval-valued information systems. Based on these concepts, we provide an information theory view for attribute reduction in interval-valued information systems. Consequently, attribute reduction algorithms are proposed. To test the proposed algorithms, experiments on three datasets are conducted. Experiments show that the proposed framework is effective for attribute reduction in interval-valued information systems.
In the future, we plan to investigate other similarity measures and introduce them into our information theory framework of attribute reduction for interval-valued information systems.
References
Billard, L., Douzal-Chouakria, A., Diday, E., 2008. Symbolic Principal Component for Interval-Valued Observations. Available from https://hal.archivesouvertes.fr/hal-00361053.
Bustince, H., Barrenechea, E., Pagola, M., 2006. Restricted equivalence functions. Fuzzy Sets Syst., 157(17): 2333–2346. http://dx.doi.org/10.1016/j.fss.2006.03.018
Dai, J.H., 2008. Rough 3-valued algebras. Inform. Sci., 178(8): 1986–1996. http://dx.doi.org/10.1016/j.ins.2007.11.011
Dai, J.H., Tian, H.W., 2013. Fuzzy rough set model for set-valued data. Fuzzy Sets Syst., 229: 54–68. http://dx.doi.org/10.1016/j.fss.2013.03.005
Dai, J.H., Xu, Q., 2012. Approximations and uncertainty measures in incomplete information systems. Inform. Sci., 198: 62–80. http://dx.doi.org/10.1016/j.ins.2012.02.032
Dai, J.H., Xu, Q., 2013. Attribute selection based on information gain ratio in fuzzy rough set theory with application to tumor classification. Appl. Soft Comput., 13(1): 211–221. http://dx.doi.org/10.1016/j.asoc.2012.07.029
Dai, J.H., Wang, W.T., Xu, Q., et al., 2012. Uncertainty measurement for interval-valued decision systems based on extended conditional entropy. Knowl.-Based Syst., 27: 443–450. http://dx.doi.org/10.1016/j.knosys.2011.10.013
Dai, J.H., Tian, H.W., Wang, W.T., et al., 2013a. Decision rule mining using classification consistency rate. Knowl.-Based Syst., 43: 95–102. http://dx.doi.org/10.1016/j.knosys.2013.01.010
Dai, J.H., Wang, W.T., Mi, J.S., 2013b. Uncertainty measurement for interval-valued information systems. Inform. Sci., 251: 63–78. http://dx.doi.org/10.1016/j.ins.2013.06.047
Dai, J.H., Wang, W.T., Xu, Q., 2013c. An uncertainty measure for incomplete decision tables and its applications. IEEE Trans. Cybern., 43(4): 1277–1289. http://dx.doi.org/10.1109/TSMCB.2012.2228480
Galar, M., Fernandez, J., Beliakov, G., et al., 2011. Interval-valued fuzzy sets applied to stereo matching of color images. IEEE Trans. Image Process., 20(7): 1949–1961. http://dx.doi.org/10.1109/TIP.2011.2107525
Hedjazi, L., Aguilar-Martin, J., Le Lann, M.V., 2011. Similarity-margin based feature selection for symbolic interval data. Patt. Recogn. Lett., 32(4): 578–585. http://dx.doi.org/10.1016/j.patrec.2010.11.018
Hu, Y.C., 2015. Flow-based tolerance rough sets for pattern classification. Appl. Soft Comput., 27: 322–331. http://dx.doi.org/10.1016/j.asoc.2014.11.021
Kryszkiewicz, M., 1998. Rough set approach to incomplete information systems. Inform. Sci., 112(1–4): 39–49. http://dx.doi.org/10.1016/S0020-0255(98)10019-1
Leung, Y., Fischer, M.M., Wu, W.Z., et al., 2008. A rough set approach for the discovery of classification rules in interval-valued information systems. Int. J. Approx. Reason., 47(2): 233–246. http://dx.doi.org/10.1016/j.ijar.2007.05.001
Lin, S.H., Huang, C.C., Che, Z.X., 2015. Rule induction for hierarchical attributes using a rough set for the selection of a green fleet. Appl. Soft Comput., 37: 456–466. http://dx.doi.org/10.1016/j.asoc.2015.08.016
Pawlak, Z., 1991. Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht & Boston. http://dx.doi.org/10.1007/978-94-011-3534-4
Qian, Y.H., Liang, J.Y., Dang, C.Y., 2008. Interval ordered information systems. Comput. Math. Appl., 56(8): 1994–2009. http://dx.doi.org/10.1016/j.camwa.2008.04.021
Yang, X.B., Yu, D.J., Yang, J.Y., et al., 2009. Dominance-based rough set approach to incomplete interval-valued information system. Data Knowl. Eng., 68(11): 1331–1347. http://dx.doi.org/10.1016/j.datak.2009.07.007
Zhang, C.Y., Fu, H.Y., 2006. Similarity measures on three kinds of fuzzy sets. Patt. Recogn. Lett., 27(12): 1307–1317. http://dx.doi.org/10.1016/j.patrec.2005.11.020
Zhang, X.H., Dai, J.H., Yu, Y.C., 2015. On the union and intersection operations of rough sets based on various approximation spaces. Inform. Sci., 292: 214–229. http://dx.doi.org/10.1016/j.ins.2014.09.007
Zhang, X.H., Miao, D.Q., Liu, C.H., et al., 2016. Constructive methods of rough approximation operators and multigranulation rough sets. Knowl.-Based Syst., 91: 114–125. http://dx.doi.org/10.1016/j.knosys.2015.09.036
Project supported by the National Natural Science Foundation of China (Nos. 61473259, 61502335, 61070074, and 60703038), the Zhejiang Provincial Natural Science Foundation (No. Y14F020118), and the PEIYANG Young Scholars Program of Tianjin University, China (No. 2016XRX-0001)
ORCID: Jian-hua DAI, http://orcid.org/0000-0003-1459-0833
Dai, Jh., Hu, H., Zheng, Gj. et al. Attribute reduction in interval-valued information systems based on information entropies. Frontiers Inf Technol Electronic Eng 17, 919–928 (2016). https://doi.org/10.1631/FITEE.1500447