
14.1 Introduction

The theory of rough sets, proposed by Pawlak, is an efficient mathematical theory for dealing with uncertainty and vagueness in decision systems, and it has many successful applications in artificial intelligence fields such as expert systems, pattern recognition, machine learning and knowledge discovery [1, 2]. Attribute reduction plays an important role in analyzing enormous data, for it provides the same descriptive ability as the entire set of attributes [3].

Skowron and Rauszer introduced two basic notions in 1991, namely the discernibility matrix and the discernibility function [4], which provide a canonical and precise mathematical model for finding the core and the reductions of a decision system [5, 6]. However, this method has some disadvantages: generating the discernibility matrix consumes enormous time and space, and the method is difficult to adapt to inconsistent decision systems [7, 8].

In this paper, the disadvantages mentioned above are analyzed in detail, and an efficient method of simplifying the decision system is proposed to find the minimum discernibility set. This method not only handles inconsistent decision systems, but also saves execution time and space greatly. Theoretical analysis and a simulated instance show that this algorithm is feasible and effective in practice.

14.2 Preliminaries

Formally, a decision information system \( DS \) is denoted \( DS = \left\langle {U,A,V,f} \right\rangle \), where \( U = \{ x_{1} ,x_{2} , \ldots ,x_{n} \} \) is a nonempty finite set of objects, also called the universe; \( A = C \cup D \) is a nonempty finite set of attributes, where C is the condition attribute set and D is the decision attribute set; \( V = \cup_{a \in A} V_{a} \), where \( V_{a} \) is the value domain of the attribute a; and \( f:U \times A \to V \) is an information mapping function such that \( f(x_{i} ,a) \in V_{a} \,(x_{i} \in U,a \in A) \).

Definition 1

Given a subset of attributes \( R \subseteq A \), the indiscernibility relation \( IND(R) \subseteq U \times U \) is defined as \( IND(R) = \{ (x_{i} ,x_{j} ) \in U \times U:\forall a \in R,\,f(x_{i} ,a) = f(x_{j} ,a)\} \). For any object pair \( \left\langle {x_{i} ,x_{j} } \right\rangle \), if \( x_{i} \,IND(R)\,x_{j} \) then \( x_{i} \) and \( x_{j} \) take the same values on the attribute set R, i.e. they are indiscernible with respect to R. The partitions of the universe induced by C and D are denoted \( U/IND(C) \) and \( U/IND(D) \), and are called the condition partition and the decision partition respectively.
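The partition \( U/IND(R) \) can be computed by grouping objects on their attribute values. The following sketch uses a hypothetical six-object table (not the paper's Table 14.1) with condition attributes a, b and decision attribute d:

```python
from collections import defaultdict

# Hypothetical decision table for illustration (not the paper's data).
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def partition(table, attrs):
    """Return U/IND(attrs): objects with identical values on every
    attribute in attrs fall into the same equivalence class."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

print(partition(TABLE, ["a", "b"]))  # four classes; x1 and x2 are indiscernible
```

Here x1 and x2 agree on both condition attributes but disagree on d, which makes the table inconsistent; this example is reused below.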

Definition 2

Given a set of objects \( X \subseteq U \), the lower and upper approximations of X are defined as follows:

$$ \begin{aligned} &\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{R} (X) = \cup \left\{ {X_{i} :X_{i} \in U/IND(R) \wedge X_{i} \subseteq X} \right\}, \hfill \\ &\bar{R}(X) = \cup \left\{ {X_{i} :X_{i} \in U/IND(R) \wedge X_{i} \cap X \ne \emptyset } \right\}. \hfill \\ \end{aligned} $$

With respect to the indiscernibility relation IND(R), the lower approximation \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{R} (X) \) is the greatest definable set contained in X, and the upper approximation \( \bar{R}(X) \) is the least definable set containing X.
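Both approximations follow directly from the partition. A minimal sketch, assuming the hypothetical dictionary-of-rows table introduced earlier:

```python
from collections import defaultdict

# Hypothetical table (not the paper's data): attributes a, b; decision d.
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def partition(table, attrs):
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def approximations(table, attrs, X):
    """Lower approximation: union of classes contained in X.
    Upper approximation: union of classes intersecting X."""
    lower, upper = set(), set()
    for cls in partition(table, attrs):
        if cls <= X:
            lower |= cls
        if cls & X:
            upper |= cls
    return lower, upper

lo, up = approximations(TABLE, ["a", "b"], {"x1", "x3"})
# lo == {"x3"}; up == {"x1", "x2", "x3"}
```

The inconsistent pair x1, x2 is excluded from the lower approximation but pulled whole into the upper one, exactly as the definitions require.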

Definition 3

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \), the positive region is defined by:

$$ POS_{C} (D) = \cup_{X \in U/IND(D)} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{C} (X). $$

The positive region is the union of the lower approximations of the decision classes, i.e. the union of those condition equivalence classes that are fully contained in some decision class.
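Continuing the hypothetical table, the positive region is obtained by checking each condition class against the decision classes:

```python
from collections import defaultdict

# Hypothetical table (not the paper's data): attributes a, b; decision d.
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def partition(table, attrs):
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def positive_region(table, cond, dec):
    """POS_C(D): union of condition classes fully inside one decision class."""
    dec_classes = partition(table, [dec])
    pos = set()
    for cls in partition(table, cond):
        if any(cls <= Y for Y in dec_classes):
            pos |= cls
    return pos

print(positive_region(TABLE, ["a", "b"], "d"))
# {"x3", "x4", "x5", "x6"}: the inconsistent objects x1, x2 are excluded
```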

Definition 4

Given a decision system \( DS = \left\langle {U,A,V,f} \right\rangle \) and a characteristic \( \Uppsi \) of DS, an attribute set \( B \subseteq A \) is called a reduction of DS iff:

  1. \( \Uppsi (B) = \Uppsi (A); \)

  2. \( \forall B^{\prime} \subset B,\,\Uppsi (B^{\prime}) \ne \Uppsi \left( B \right). \)

The intersection of all reductions is called the core. Different characteristics \( \Uppsi \) correspond to different reduction criteria, for example keeping the condition partition unchanged, keeping the relative positive region unchanged, or keeping the information entropy unchanged.

As an important development of attribute reduction in rough set theory, Skowron and Rauszer introduced the notion of the discernibility matrix, on which the discernibility function is built, and proved that the prime implicants of the minimal disjunctive form of the discernibility function correspond to all reductions of a given information system.

Definition 5

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \), the discernibility matrix is a \( |U| \times |U| \) matrix denoted M, whose element \( M(i,j) \) is defined by:

$$ M\left( {i,j} \right) = \left\{ {\begin{array}{*{20}c} {\left\{ {a:a\left( {x_{i} } \right) \ne a\left( {x_{j} } \right),\left( {a \in C} \right) \wedge \left( {x_{i} ,x_{j} \in U} \right)} \right\}} & {\left( {D(x_{i} ) \ne D(x_{j} )} \right) \wedge \left( {\hbox{min} \left\{ {\left| {V_{D} \left( {\left[ {x_{i} } \right]_{IND(C)} } \right)} \right|,\left| {V_{D} \left( {\left[ {x_{j} } \right]_{IND(C)} } \right)} \right|} \right\} = 1} \right)} \\ \emptyset & {otherwise} \\ \end{array} } \right. $$

\( M(i,j) \) is the element located on the ith row and the jth column of the discernibility matrix M; \( [x_{i} ]_{IND(C)} \) is the equivalence class of \( x_{i} \) with respect to the condition attribute set C, i.e. \( [x_{i} ]_{IND(C)} \in U/IND(C) \); \( V_{D} \left( {[x_{i} ]_{IND(C)} } \right) \) denotes the set of decision values taken by \( \left[ {x_{i} } \right]_{IND\left( C \right)} \); and \( \left| Y \right| \) denotes the cardinality of the set Y.

The condition \( \hbox{min} \left\{ {\left| {V_{D} \left( {\left[ {x_{i} } \right]_{IND\left( C \right)} } \right)} \right|,\left| {V_{D} \left( {\left[ {x_{j} } \right]_{IND\left( C \right)} } \right)} \right|} \right\} = 1 \) indicates that at least one of the two objects \( x_{i} \) and \( x_{j} \) is contained in the positive region.

\( M(i,j) \) consists of all attributes on which the two objects \( x_{i} \) and \( x_{j} \) take different values, provided that at least one of \( x_{i} \) and \( x_{j} \) has only one decision value. Definition 5 ensures that the discernibility matrix is capable of discerning the objects of the positive region from the other objects; therefore this matrix may be regarded as a discernibility matrix based on an unchanged positive region.
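A direct, unoptimized sketch of Definition 5 over the hypothetical six-object table; `eq_class` and `decision_values` are helper names introduced here, not from the paper:

```python
# Hypothetical table (not the paper's data): attributes a, b; decision d.
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def eq_class(table, cond, x):
    """[x]_{IND(cond)}: objects agreeing with x on every condition attribute."""
    return {y for y in table
            if all(table[y][a] == table[x][a] for a in cond)}

def decision_values(table, cls, dec):
    """V_D of an equivalence class: the set of decision values it takes."""
    return {table[y][dec] for y in cls}

def discernibility_matrix(table, cond, dec):
    """Definition 5: fill M(i,j) only when decisions differ and at least one
    of the two objects lies in a consistent (single-decision) class."""
    objs = sorted(table)
    M = {}
    for i, xi in enumerate(objs):
        for xj in objs[:i]:
            sizes = (len(decision_values(table, eq_class(table, cond, xi), dec)),
                     len(decision_values(table, eq_class(table, cond, xj), dec)))
            if table[xi][dec] != table[xj][dec] and min(sizes) == 1:
                M[(xi, xj)] = {a for a in cond if table[xi][a] != table[xj][a]}
            else:
                M[(xi, xj)] = set()
    return M

M = discernibility_matrix(TABLE, ["a", "b"], "d")
# M[("x2", "x1")] is empty: x1 and x2 form an inconsistent class.
```

Note the quadratic number of object comparisons, each of which scans equivalence classes again; this is exactly the cost that Sect. 14.3 sets out to reduce.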

14.3 Discernibility Matrix Based on Decision Vector

The discernibility matrix is an efficient method for finding reductions. However, generating it consumes enormous time and space, and it is also difficult to adapt to inconsistent decision systems. This section analyzes these disadvantages in detail and introduces the decision vector, which simplifies the decision system and improves the discernibility matrix. The improved matrix is capable of describing inconsistent objects while decreasing the number of comparisons and the storage space.

Definition 6

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \), \( U = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\} \) and \( U/IND\left( D \right) = \left\{ {Y_{1} ,Y_{2} , \ldots ,Y_{{\left| {U/IND\left( D \right)} \right|}} } \right\} \), the value domain of the system decision is \( F_{D} = \left\{ {f\left( {Y_{1} ,D} \right),f\left( {Y_{2} ,D} \right), \ldots ,f\left( {Y_{{\left| {U/IND\left( D \right)} \right|}} ,D} \right)} \right\} \). For any \( x_{i} \in U \) and \( B \subseteq C \), the decision of \( \left[ {x_{i} } \right]_{B} \) with respect to D is denoted as a decision vector:

$$ DV\left( {\left[ {x_{i} } \right]_{B} } \right) = \left( {\left( {\mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{1} ,D} \right)} \right),f\left( {Y_{1} ,D} \right)} \right), \ldots ,\left( {\mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{{\left| {U/IND\left( D \right)} \right|}} ,D} \right)} \right),f\left( {Y_{{\left| {U/IND\left( D \right)} \right|}} ,D} \right)} \right)} \right) $$

where \( f\left( {Y_{j} ,D} \right) \) is the decision value of the class \( Y_{j} \,\left( {1 \le j \le \left| {U/IND\left( D \right)} \right|} \right) \) on the decision attribute D, and \( \mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{j} ,D} \right)} \right) = \left| {Y_{j} \cap \left[ {x_{i} } \right]_{B} } \right| / \left| {\left[ {x_{i} } \right]_{B} } \right| \).

Obviously, the memberships \( \mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{1} ,D} \right)} \right), \ldots ,\mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{{\left| {U/IND\left( D \right)} \right|}} ,D} \right)} \right) \) form the probability distribution of the equivalence class \( \left[ {x_{i} } \right]_{B} \) over the decision equivalence classes, so \( \mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{1} ,D} \right)} \right) + \cdots + \mu_{{DV\left( {\left[ {x_{i} } \right]_{B} } \right)}} \left( {f\left( {Y_{{\left| {U/IND\left( D \right)} \right|}} ,D} \right)} \right) = 1. \)
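The membership part of the decision vector is just the empirical distribution of a condition class over the decision values. A sketch, again assuming the hypothetical table (decision values sorted so that the tuple order is fixed):

```python
# Hypothetical table (not the paper's data): attributes a, b; decision d.
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def decision_vector(table, cond, dec, x):
    """Memberships of DV([x]_cond): the fraction of the class of x that
    falls into each decision class, one component per decision value."""
    cls = [y for y in table
           if all(table[y][a] == table[x][a] for a in cond)]
    values = sorted({row[dec] for row in table.values()})
    return tuple(sum(1 for y in cls if table[y][dec] == v) / len(cls)
                 for v in values)

print(decision_vector(TABLE, ["a", "b"], "d", "x1"))  # inconsistent class
print(decision_vector(TABLE, ["a", "b"], "d", "x3"))  # consistent class
```

For the inconsistent class {x1, x2} the vector is (0.5, 0.5); for a consistent class one component is 1 and the rest are 0, and in every case the components sum to 1.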

Definition 7

For a decision vector \( DV \), its \( \lambda \)-cut is defined for \( \lambda \in (0,1] \) by: \( DV_{\lambda } = \left\{ {x:DV\left( x \right) \ge \lambda } \right\} \).

All elements whose membership is at least \( \lambda \) constitute the \( \lambda \)-cut of the decision vector \( DV \).

Definition 8

For a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \), its simplified decision system is defined by \( DS^{\prime} = \left\langle {U/IND\left( C \right),C \cup D^{\prime},V^{\prime},f^{\prime}} \right\rangle \), where \( V^{\prime} = \cup_{{a \in C \cup D^{\prime}}} V_{a} \), \( V_{a} \) is the value domain of the attribute \( a \), and \( f^{\prime}:U/IND\left( C \right) \times \left( {C \cup D^{\prime}} \right) \to V^{\prime} \) is an information mapping function such that

$$ f^{\prime}\left( {X_{i} ,a} \right) \in V_{a}^{\prime } \left( {X_{i} \in U/IND\left( C \right),a \in C \cup D^{\prime}} \right),\;{\text{here}}\;f^{\prime}\left( {X_{i} ,D^{\prime}} \right) = \tilde{F}_{D} \left( {X_{i} } \right). $$

Based on the simplified decision system, we can define discernibility matrices that keep the positive region and the information entropy unchanged, respectively.

Definition 9

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \) and its simplified decision system \( DS^{\prime} = \left\langle {U/IND\left( C \right),C \cup D^{\prime},V^{\prime},f^{\prime}} \right\rangle \), the discernibility matrix is a \( \left| {U/IND\left( C \right)} \right| \times \left| {U/IND\left( C \right)} \right| \) matrix denoted \( M^{\prime} \), whose element \( M^{\prime}\left( {i,j} \right) \) is defined by:

\( M^{\prime}\left( {i,j} \right) = \left\{ {\begin{array}{*{20}c} {\left\{ {a:a(X_{i} ) \ne a(X_{j} ),(a \in C) \wedge (X_{i} ,X_{j} \in U/IND(C))} \right\}} & {\left| {\left( {DV(X_{i} )} \right)_{1} } \right| = 1 \vee \left| {\left( {DV(X_{j} )} \right)_{1} } \right| = 1} \\ \emptyset & {otherwise} \\ \end{array} } \right. \) where \( \left( {DV(X_{i} )} \right)_{1} \) is the \( 1 \)-level set (\( 1 \)-cut) of \( DV\left( {X_{i} } \right) \), and \( \left| {\left( {DV\left( {X_{i} } \right)} \right)_{1} } \right| = 1 \) indicates that \( X_{i} \) is contained in the positive region. This discernibility matrix ensures that at least one of any two compared classes is contained in the positive region, so it keeps the positive region unchanged.

Definition 10

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \) and its simplified decision system \( DS^{\prime} = \left\langle {U/IND(C),C \cup D^{\prime},V^{\prime},f^{\prime}} \right\rangle \), the discernibility matrix is a \( \left| {U/IND\left( C \right)} \right| \times \left| {U/IND\left( C \right)} \right| \) matrix denoted \( M^{\prime} \), whose element \( M^{\prime}(i,j) \) is defined by:

\( M^{\prime}\left( {i,j} \right) = \left\{ {\begin{array}{*{20}c} {\left\{ {a:a(X_{i} ) \ne a(X_{j} ),(a \in C) \wedge (X_{i} ,X_{j} \in U/IND(C))} \right\}} & {DV(X_{i} ) \ne DV(X_{j} )} \\ \emptyset & {otherwise} \\ \end{array} } \right. \) where \( DV(X_{i} ) \ne DV(X_{j} ) \) indicates that the probability distributions of \( X_{i} \) and \( X_{j} \) over the decision equivalence classes are not equal. Therefore, this discernibility matrix keeps the information entropy unchanged.

The improved discernibility matrices of Definitions 9 and 10 take the equivalence class as the elementary comparison unit. Any two condition equivalence classes are compared at most once, so the improved discernibility matrix saves time and space during generation compared with the original construction.
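The two definitions can be sketched in one routine over the hypothetical table used throughout. One interpretation is made explicit here: in the positive-region mode the decision vectors are also required to differ (as in Definition 10), since attributes separating classes with identical decision behaviour discern nothing:

```python
from collections import defaultdict

# Hypothetical table (not the paper's data): attributes a, b; decision d.
TABLE = {
    "x1": {"a": 0, "b": 0, "d": 0},
    "x2": {"a": 0, "b": 0, "d": 1},
    "x3": {"a": 0, "b": 1, "d": 0},
    "x4": {"a": 1, "b": 1, "d": 1},
    "x5": {"a": 1, "b": 0, "d": 1},
    "x6": {"a": 1, "b": 1, "d": 1},
}

def partition(table, attrs):
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def decision_vector(table, cls, dec):
    values = sorted({row[dec] for row in table.values()})
    return tuple(sum(1 for y in cls if table[y][dec] == v) / len(cls)
                 for v in values)

def improved_matrix(table, cond, dec, mode="entropy"):
    """One row/column per condition class; each pair compared exactly once."""
    classes = sorted(partition(table, cond), key=lambda c: sorted(c))
    dvs = [decision_vector(table, sorted(c), dec) for c in classes]
    reps = [sorted(c)[0] for c in classes]   # any member carries the values
    M = {}
    for i in range(len(classes)):
        for j in range(i):
            differ = dvs[i] != dvs[j]
            if mode == "positive":           # 1-cut singleton <=> consistent
                differ = differ and (max(dvs[i]) == 1.0 or max(dvs[j]) == 1.0)
            M[(i, j)] = ({a for a in cond
                          if table[reps[i]][a] != table[reps[j]][a]}
                         if differ else set())
    return M

M = improved_matrix(TABLE, ["a", "b"], "d")
# Classes (sorted): 0:{x1,x2}  1:{x3}  2:{x4,x6}  3:{x5}
```

The matrix has only 4 classes instead of 6 objects, and the class pair (2, 3) with equal decision vectors yields an empty entry: the inconsistent class {x1, x2} is handled by its vector (0.5, 0.5) rather than being forced into a single decision.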

Definition 11

For a simplified decision system \( DS^{\prime} = \left\langle {U/IND\left( C \right),C \cup D^{\prime},V^{\prime},f^{\prime}} \right\rangle \) and its improved discernibility matrix \( M^{\prime} \), the discernibility function is a Boolean expression defined as follows:

$$ f\left( {M^{\prime}} \right) = \wedge \left\{ { \vee M^{\prime}(i,j):1 \le j < i \le \left| {U/IND\left( C \right)} \right|,M^{\prime}(i,j) \ne \emptyset } \right\} $$

The expression \( \vee M^{\prime}(i,j) \) denotes the disjunction of all attributes belonging to the nonempty element \( M^{\prime}(i,j) \), and \( \wedge \left\{ { \vee M^{\prime}(i,j)} \right\} \) denotes the conjunction of all such disjunctions. The physical meaning of the discernibility function lies in the fact that the pair \( \left( {X_{i} ,X_{j} } \right) \) can be discerned by any attribute in \( M^{\prime}(i,j) \), so a disjunction is taken over the attributes of \( M^{\prime}(i,j) \); and every discernible pair of condition equivalence classes must remain discernible, so a conjunction is taken over all \( \vee M^{\prime}(i,j) \). Thus \( f\left( {M^{\prime}} \right) \) becomes true by selecting appropriate attributes: if some \( a \in M^{\prime}\left( {i,j} \right) \) is selected, then \( \vee M^{\prime}(i,j) \) is true, and if every \( \vee M^{\prime}(i,j) \) in \( f\left( {M^{\prime}} \right) \) is true, then \( f\left( {M^{\prime}} \right) \) is true.

Based on the discernibility matrix and the discernibility function, attribute reduction and the core can be considered in association with knowledge reduction. The core attributes are the common part of all reductions, and they are indispensable for preserving the original knowledge of the information system.

Theorem 1

Given a decision system \( DS = \left\langle {U,A = C \cup D,V,f} \right\rangle \) and its improved discernibility matrix \( M^{\prime} \), \( R \subseteq C \) is a reduction if and only if R satisfies:

  1. \( \forall M^{\prime}(i,j) \ne \emptyset \,(1 \le j < i \le n) \Rightarrow R \cap M^{\prime}(i,j) \ne \emptyset ; \)

  2. \( \forall R^{\prime} \subset R \), \( \exists M^{\prime}(i,j) \ne \emptyset \) such that \( R^{\prime} \cap M^{\prime}(i,j) = \emptyset \).

Condition (1) shows that R is jointly sufficient to distinguish all object pairs that are discernible in the original information system, and condition (2) indicates that R is a minimal attribute subset satisfying condition (1). Together these two conditions ensure that \( R \subseteq C \) is a minimal subset of attributes preserving the particular property of the given information system, so R is a reduction.
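Theorem 1 translates into a simple membership check. The matrix below is a small hypothetical example, not the paper's; minimality only needs to be tested on the maximal proper subsets R − {r}:

```python
def is_reduction(matrix, R):
    """Theorem 1: R hits every nonempty matrix entry (condition 1) and
    no proper subset of R does the same (condition 2)."""
    entries = [e for e in matrix.values() if e]
    if not all(R & e for e in entries):
        return False
    for r in R:
        if all((R - {r}) & e for e in entries):
            return False                     # a smaller set already suffices
    return True

# Toy discernibility matrix over attributes a, b (hypothetical values).
M = {(1, 0): {"b"}, (2, 0): {"a", "b"}, (2, 1): {"a"}}
print(is_reduction(M, {"a", "b"}))   # True
print(is_reduction(M, {"a"}))        # False: misses the entry {"b"}
```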

Theorem 2

Suppose \( f^{\prime}(M) \) is the minimal disjunctive form equivalent to \( f(M) \). Then \( R \subseteq C \) is a reduction if and only if \( \wedge R \) is a prime implicant of \( f^{\prime}(M) \).

Every disjunctive term of the minimal disjunctive form is called a prime implicant, and \( \wedge R \) denotes the conjunction of all attributes in R. Theorem 2 shows that the problem of finding all reductions is equivalent to transforming the conjunctive form \( f(M) \) into the minimal disjunctive form \( f^{\prime}(M) \) by repeatedly applying the absorption and distribution laws of Boolean algebra.
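For small systems this transformation can be sketched by brute force: distribute the clauses (one attribute picked per clause) and then apply the absorption law by dropping every term that strictly contains another. The clause set below is the discernibility function read off Table 14.3 in the next section; the expansion is exponential in general, so this is a sketch, not a scalable implementation:

```python
from itertools import product

def reductions(clauses):
    """CNF -> minimal DNF for a monotone Boolean function:
    distribute, then keep only the minimal terms (absorption law).
    The survivors are the prime implicants, i.e. the reductions."""
    terms = {frozenset(choice) for choice in product(*clauses)}
    return {t for t in terms if not any(other < t for other in terms)}

# f'(M) of Table 14.3: (a | b | c) & (c | d) & (b | c | f)
clauses = [{"a", "b", "c"}, {"c", "d"}, {"b", "c", "f"}]
print(reductions(clauses))
# three prime implicants: {c}, {b, d}, {a, d, f}
```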

Theorem 3

For the improved discernibility matrix \( M^{\prime} \), the attribute core is the collection of singleton entries of \( M^{\prime} \), i.e. \( Core\left( {M^{\prime}} \right) = \left\{ {M^{\prime}(i,j):\left| {M^{\prime}(i,j)} \right| = 1} \right\} \).

where \( M^{\prime}(i,j) \) is an element of the discernibility matrix \( M^{\prime} \), and \( \left| {M^{\prime}(i,j)} \right| \) is its cardinality, namely the number of attributes it contains. \( \left| {M^{\prime}(i,j)} \right| = 1 \) means that \( M^{\prime}(i,j) \) is a singleton; its attribute must appear in every prime implicant of the minimal disjunctive form \( f^{\prime}(M^{\prime}) \). Hence \( Core\left( {M^{\prime}} \right) \) is indispensable to all reductions and is contained in every reduction.
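Theorem 3 makes core extraction a one-pass scan of the matrix; a minimal sketch over a hypothetical toy matrix:

```python
def core(matrix):
    """Theorem 3: collect the attribute of every singleton matrix entry."""
    return {next(iter(e)) for e in matrix.values() if len(e) == 1}

# Toy discernibility matrix (hypothetical values, not the paper's example).
M = {(1, 0): {"b"}, (2, 0): {"a", "b"}, (2, 1): {"a"}}
print(core(M))
# {"a", "b"}: both singleton entries contribute a core attribute
```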

14.4 Analysis of an Instance

In order to illustrate the method described above, we give an application example. The example is simple, but it illustrates the essential characteristics. Consider the decision system in Table 14.1, where \( U = \left\{ {x_{1} , \ldots ,x_{8} } \right\} \) and \( C = \left\{ {a,b,c,d,e,f} \right\} \).

Table 14.1 Decision system

Its simplified decision system \( DS^{\prime} \) can be computed as Table 14.2.

Table 14.2 Simplified decision system \( DS^{\prime} \)

The improved discernibility matrices based on the unchanged positive region and on information entropy are shown in Tables 14.3 and 14.4 respectively.

Table 14.3 The improved discernibility matrix based on positive region
Table 14.4 The improved discernibility matrix based on information entropy

Based on Table 14.3, we obtain the discernibility function \( f^{\prime}(M) = \left( {a \vee b \vee c} \right) \wedge \left( {c \vee d} \right) \wedge \left( {b \vee c \vee f} \right) = c \vee \left( {b \wedge d} \right) \vee \left( {a \wedge d \wedge f} \right) \), so the reduction set based on the positive region is \( \left\{ {\left\{ c \right\},\left\{ {b,d} \right\},\left\{ {a,d,f} \right\}} \right\} \). Based on Table 14.4, we obtain the discernibility function \( f^{\prime}(M) = \left( {a \vee b \vee c} \right) \wedge \left( {a \vee b \vee d} \right) \wedge \left( {c \vee d} \right) \wedge \left( {b \vee c \vee f} \right) \wedge \left( {b \vee d \vee f} \right) \), so the reduction set based on information entropy is \( \left\{ {\left\{ {a,d,f} \right\},\left\{ {a,c,f} \right\},\left\{ {b,c} \right\},\left\{ {b,d} \right\},\left\{ {c,d} \right\}} \right\} \).
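The five entropy-based reductions can be re-derived mechanically with the same distribute-and-absorb routine sketched in Sect. 14.3 (brute force, suitable only for examples of this size); the clauses are read off Table 14.4:

```python
from itertools import product

def reductions(clauses):
    """Distribute the clauses, then keep only minimal terms (absorption)."""
    terms = {frozenset(choice) for choice in product(*clauses)}
    return {t for t in terms if not any(other < t for other in terms)}

# Clauses of the entropy-based discernibility function (Table 14.4).
entropy = [{"a", "b", "c"}, {"a", "b", "d"}, {"c", "d"},
           {"b", "c", "f"}, {"b", "d", "f"}]
print(sorted("".join(sorted(t)) for t in reductions(entropy)))
# ['acf', 'adf', 'bc', 'bd', 'cd']
```

The output matches the five reductions stated above, confirming the hand computation.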

14.5 Conclusions

Attribute reduction based on the discernibility matrix is an important research topic in rough set theory. This paper analyzes some disadvantages of the earlier discernibility matrix in detail, and proposes an efficient method of simplifying the decision system to find the minimum discernibility set. By introducing the decision vector, this paper designs an improved discernibility matrix that can handle inconsistency in the decision system and greatly reduces the number of comparisons and the storage space needed to build the discernibility matrix. Theoretical analysis and a simulated instance show that this algorithm is feasible and effective in practice.