
1 Introduction

Recently, computer technology and automatic control technology have developed rapidly, and research on uncertain information systems has attracted more and more researchers’ attention [18, 22]. Fuzzy set theory, rough set theory and quotient space theory are three basic granular computing models which have been successfully applied to process uncertain information. As a simple computing model, rough set theory [4, 13] is an important method for handling uncertain problems, as are probability theory, fuzzy set theory and evidence theory. In the view of Pawlak’s rough sets, people usually study how to acquire decision rules from the upper approximation set \(\overline{{R}}(X)\) and the lower approximation set \(\underline{R}(X)\). Furthermore, many extended rough set models have been proposed to deal with real-life uncertain information, such as the variable precision rough set model [26], the probability rough set model [11], game-theoretic rough sets [2] and so on [10, 12, 25]. Pawlak and Skowron analyzed and summarized these extended models in [11]. Mi analyzed the variable precision rough set model and discussed how to use this model to obtain attribute reduction [8]. Yao and Ziarko et al., combining probability and inclusion degree, proposed the probability rough set model and obtained many related results [17, 19, 20]. However, these methods mainly focus on constructing extended approximation operators for the traditional rough set model. There is little research on how to use the existing knowledge granules in a knowledge base to construct an approximation set of X. Could we construct an approximation set which is more approximate to X than \(\overline{{R}}(X)\) or \(\underline{R}(X)\)? And does an optimal approximation set exist? The first problem was solved in our previous work [21], and the second problem is the main motivation of this paper.

Based on the above analysis, the related models and results on the approximation set of an uncertain set were proposed in our other papers [21, 23]. In these papers, the basic idea is to translate a rough set into a fuzzy set according to the different membership degrees of elements in the boundary region, and to construct an approximation set of an uncertain concept by using a cut-set of the fuzzy set with some threshold. With this construction method, the approximation sets can be obtained directly from the existing knowledge granules. In the literature [21], a general approximation set was constructed and shown to have many good properties. Experimental results show that \(R_{0.5} (X)\) is a better model for dealing with uncertain information systems: better classification results can be obtained with \(R_{0.5} (X)\), since the amount of correctly classified objects increases and the amount of uncertainly classified objects reduces. However, whether \(R_{0.5} (X)\) is the optimal approximation set of an uncertain set X remained undetermined, and the related concepts and results on the optimal approximation were not presented in [21]. It is difficult to search for the optimal approximation set of an uncertain set directly. Based on the research in [21], through maximizing the similarity between the target concept and its approximation sets, the optimal approximation set \(R_\textit{Best} (X)\) is defined, and an algorithm for constructing \(R_\textit{Best} (X)\) is proposed in this paper. Then, several comparative analyses between \(R_\textit{Best} (X)\) and other approximation sets are carried out. In addition, the operation properties of \(R_\textit{Best} (X)\) are analyzed. Finally, we discuss the change rules of the similarity degree between X and its \(R_\textit{Best} (X)\) in different knowledge granularity levels.

The rest of this paper is organized as follows. In Sect. 2, the related basic concepts and preliminary knowledge are reviewed. The \(R_\textit{Best} (X)\) of an uncertain set X in rough approximation spaces is proposed in Sect. 3, together with several comparative analyses between \(R_\textit{Best} (X)\) and other approximation sets. The operation properties of \(R_\textit{Best} (X)\) are presented in Sect. 4. The change rules of the similarity degree between X and its \(R_\textit{Best} (X)\) in different knowledge granularity levels are discussed in Sect. 5. Finally, the conclusions are drawn in Sect. 6.

2 Preliminaries

In order to better present the context of this paper, some preliminary concepts, definitions and results related to rough sets and uncertainty measurement are reviewed as follows.

Definition 1

(Information table of knowledge system [9, 14]). A knowledge system can be described as \(S=\left\langle {U, A, V, f} \right\rangle \), where U is the domain; \(A=C\cup D\) is the set of all attributes, in which subset C is the set of conditional attributes and D is the set of decision attributes; \(V=\cup _{r\in A} V_{r} \) is the set of attribute values, and \(V_{r} \) describes the range of values of attribute \(r\in A\); and \(f: U\times A\rightarrow V\) is a function which gives the attribute values of each object x in U.

Definition 2

(Indiscernibility Relation [9, 14]). For any attribute set \(R\subseteq A\), an indiscernibility relation is defined as

$$ \textit{IND}(R)=\{(x, y)|(x, y)\in U^{2}\wedge \forall _{b\in R}({b(x)=b(y)})\}. $$

Definition 3

(Upper Approximation Set and Lower Approximation Set [9, 14]). Let \(S=\left\langle {U, A, V, f} \right\rangle \) be a knowledge system. For any \(X\subseteq U\) and \(R\subseteq A\), the upper approximation set \(\overline{{R}}(X)\) and the lower approximation set \(\underline{R}(X)\) of X are defined as follows,

$$\begin{aligned}&\overline{{R}}(X)=\cup \left\{ {Y_{i} |Y_{i} \in U/\textit{IND}(R)\wedge Y_{i} \cap X\ne \emptyset } \right\} ,\\&\underline{R}(X)=\cup \left\{ {Y_{i} |Y_{i} \in U/\textit{IND}(R)\wedge Y_{i} \subseteq X} \right\} , \end{aligned}$$

where \(U/\textit{IND}(R)=\{X|X\subseteq U\wedge \forall _{x, y\in X, b\in R} ({b(x)=b(y)})\}\) is the partition induced by the equivalence relation \(\textit{IND}(R)\) on U. The upper approximation set and the lower approximation set of X on R can be defined in another form as follows,

$$\begin{aligned}&\overline{{R}}(X)=\left\{ {x|x\in U\wedge [x]_{R} \cap X\ne \emptyset } \right\} ,\\&\underline{R}(X)=\left\{ {x|x\in U\wedge [x]_{R} \subseteq X} \right\} , \end{aligned}$$

where \([x]_{R} \in U/\textit{IND}(R)\), and \(\left[ x \right] _{R} \) is the equivalence class of x under relation R. \(\underline{R}(X)\) is the set of objects which certainly belong to X according to knowledge R; \(\overline{{R}}(X)\) is the set of objects which possibly belong to X according to knowledge R. \(\textit{BND}_{R} (X)=\overline{{R}}(X)-\underline{R}(X)\) is called the boundary region of the target concept X on attribute set R. \(\textit{POS}_{R} (X)=\underline{R}(X)\) is called the positive region of the target concept X on attribute set R. \(\textit{NEG}_{R} (X)=U-\overline{{R}}(X)\) is called the negative region of the target concept X on attribute set R. \(\textit{BND}_{R} (X)\) is the set of objects which only possibly belong to the target concept X.

Let U be a finite domain, \(X\subseteq U\) and \(x\in U\). The membership degree of x belonging to the set X is defined as

$$ \mu _{X}^{R} (x)=\frac{|X\cap [x]_{R} |}{|[x]_{R} |}, $$

obviously, \(0\le \mu _{X}^{R} (x)\le 1\).
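For concreteness, the membership degree can be computed directly from the partition \(U/\textit{IND}(R)\). The following Python sketch is illustrative only: the function names `equivalence_classes` and `membership` and the representation of the information table by an attribute function `f` are our own choices, not notation from the cited literature. Exact fractions are used to avoid floating-point error.

```python
from fractions import Fraction

def equivalence_classes(U, attrs, f):
    """Partition U by IND(attrs): x and y share a block iff
    f(x, a) == f(y, a) for every attribute a in attrs (Definition 2)."""
    blocks = {}
    for x in U:
        key = tuple(f(x, a) for a in attrs)
        blocks.setdefault(key, set()).add(x)
    return list(blocks.values())

def membership(x, X, classes):
    """Rough membership degree mu_X^R(x) = |X & [x]_R| / |[x]_R|."""
    block = next(c for c in classes if x in c)
    return Fraction(len(X & block), len(block))
```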

Definition 4

(\(\lambda \)-Approximation Sets of X [21]). Let X be a subset (the target concept) of U, and let

$$ R_{\lambda } (X)=\{x|x\in U\wedge \mu _{X}^{R} (x)\ge \lambda \}~~(1\ge \lambda >0), $$

then \(R_{\lambda } (X)\) is called the \(\lambda \)-approximation set of X. Let

$$ \underline{R}_{\lambda } (X)=\{x|x\in U\wedge \mu _{X}^{R} (x)>\lambda \}~~(1>\lambda \ge 0), $$

then \(\underline{R}_{\lambda } (X)\) is called the \(\lambda \)-strong approximation set of X.

So when \(\lambda =0.5\), \(R_{0.5} (X)\) is called the 0.5-approximation set of X and \(\underline{R}_{0.5} (X)\) is called the 0.5-strong approximation set of X.
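Both sets are unions of whole equivalence classes, so they can be computed block by block. A minimal sketch continuing the helpers above (the `strict` flag, our own device, switches between \(\ge \lambda \) and \(>\lambda \), i.e. between the \(\lambda \)-approximation set and the \(\lambda \)-strong approximation set):

```python
def lambda_approximation(X, classes, lam, strict=False):
    """R_lambda(X): union of blocks whose membership degree is >= lam,
    or > lam when strict=True (the lambda-strong approximation set)."""
    result = set()
    for block in classes:
        mu = Fraction(len(X & block), len(block))
        if mu > lam or (not strict and mu == lam):
            result |= block
    return result
```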

Definition 5

[14]. Let \(U=\{x_{1}, x_{2}, \cdots , x_{n} \}\) be a non-empty finite set, and \({P}'=\{{P}'_{1}, {P}'_{2}, \cdots , {P}'_{l} \}\) and \({P}''=\{{P}''_{1}, {P}''_{2}, \cdots , {P}''_{m} \}\) be two partition spaces on U. If \(\forall _{P_{i}' \in {P}'} (\exists _{P_{j}'' \in {P}''} (P_{i}' \subseteq P_{j}''))\), then \({P}'\) is said to be finer than \({P}''\), denoted by \({P}'\preceq {P}''\).

Definition 6

[14]. Let \(U=\{x_{1}, x_{2}, \cdots , x_{n} \}\) be a non-empty finite set, and \({P}'=\{{P}'_{1}, {P}'_{2}, \cdots , {P}'_{l} \}\) and \({P}''=\{{P}''_{1}, {P}''_{2}, \cdots , {P}''_{m} \}\) be two partition spaces on U. If \({P}'\preceq {P}''\) and \(\exists _{P_{i}' \in {P}'} (\exists _{P_{j}'' \in {P}''} (P_{i}' \subset P_{j}''))\), then \({P}'\) is said to be strictly finer than \({P}''\), denoted by \({P}'\prec {P}''\).

Definition 7

(Similarity Degree [21]). Let A and B be two subsets of U, and consider the mapping \(S: 2^{U}\times 2^{U}\rightarrow [0, 1]\). \(S(A, B)\) is called the similarity degree between A and B if and only if \(S(A, B)\) satisfies the following properties:

(1) For any \(A, B\subseteq U\), \(0\le S(A, B)\le 1\) (Boundedness),

(2) For any \(A, B\subseteq U\), \(S(A, B)=S(B, A)\) (Symmetry),

(3) For any \(A\subseteq U\), \(S(A, A)=1\); \(S(A, B)=0\) if and only if \(A\cap B=\emptyset \).

Any formula that satisfies the above (1), (2) and (3) is a similarity degree formula between two sets. In the similarity measurement of rough sets, because of its universality and effectiveness, the following similarity degree formula from reference [7] is widely adopted,

$$ S\left( {A, B} \right) =\frac{\left| {A\cap B} \right| }{\left| {A\cup B} \right| }, $$

where \(\left| \cdot \right| \) denotes the cardinality of a finite set. Obviously, this formula satisfies the above (1), (2) and (3).
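Computationally this is the Jaccard similarity. One detail worth noting in an implementation is the empty case: \(|A\cup B|=0\) leaves the quotient undefined, and property (3) (\(S(A, A)=1\)) suggests returning 1 there. A small helper continuing the sketches above:

```python
def similarity(A, B):
    """S(A, B) = |A & B| / |A | B| (Jaccard similarity).
    S(empty, empty) is returned as 1, consistent with S(A, A) = 1."""
    if not A and not B:
        return Fraction(1)
    return Fraction(len(A & B), len(A | B))
```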

3 Optimal Approximation Set of Rough Set

In our previous works, we found that \(R_{0.5} (X)\), as an approximation set of X, has many excellent properties. However, is \(R_{0.5} (X)\) the optimal approximation set of X, or can some \(R_{\lambda } (X)\) with \(0<\lambda <0.5\) be closer to X? Let us analyze this question with the following example.

Example 1

In a decision information table (Table 1), let \(U=\left\{ {x_{1}, x_{2}, \ldots , x_{9}} \right\} \), the condition attribute set \(C=\left\{ {a, b, c} \right\} \) and the decision attribute set \(D=\left\{ d \right\} \).

Table 1. Decision information table

According to rough set theory, the following partitions can be obtained easily,

$$\begin{aligned}&U/\textit{IND}(C)=\{\{x_{1}, x_{4}, x_{7} \}, \{x_{2}, x_{5}, x_{8} \}, \{x_{3}, x_{6}, x_{9} \}\},\\&U/\textit{IND}(D)=\{\{x_{1}, x_{2}, x_{3} \}, \{x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9} \}\}. \end{aligned}$$

Let \(X_{1} =\{x_{1}, x_{2}, x_{3} \}\), \(X_{2} =\{x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9} \}\). According to the definition \(\mu _{X}^{R} (x)=\frac{\left| {[x]_{R} \cap X} \right| }{\left| {[x]_{R}} \right| }\), a fuzzy set can be obtained as

$$ F_{X_{1}} (U )=\left\{ \frac{1/3}{x_{1}}, \frac{1/3}{x_{2} }, \frac{1/3}{x_{3}}, \frac{1/3}{x_{4}}, \frac{1/3}{x_{5}},\right. \left. \frac{1/3}{x_{6} }, \frac{1/3}{x_{7}}, \frac{1/3}{x_{8}}, \frac{1/3}{x_{9}} \right\} . $$

Then

$$ R_{0.5} (X_{1})=\emptyset , ~~R_{0.3} (X_{1})=U.$$

While

$$ S\left( {X_{1}, R_{0.5} (X_{1})} \right) =\frac{\left| {X_{1} \cap R_{0.5} (X_{1})} \right| }{\left| {X_{1} \cup R_{0.5} (X_{1})} \right| }=\frac{0}{3}=0, S\left( {X_{1}, R_{0.3} (X_{1})} \right) =\frac{1}{3}. $$

So, \(S\left( {X_{1}, R_{0.3} (X_{1})} \right) >S\left( {X_{1}, R_{0.5} (X_{1})} \right) \).

In the same way, the following fuzzy set can be obtained,

$$ F_{X_{2}} (U )=\left\{ \frac{2/3}{x_{1}}, \frac{2/3}{x_{2} }, \frac{2/3}{x_{3}}, \frac{2/3}{x_{4}}, \frac{2/3}{x_{5}},\right. \left. \frac{2/3}{x_{6} }, \frac{2/3}{x_{7}}, \frac{2/3}{x_{8}}, \frac{2/3}{x_{9}} \right\} . $$

Then we have \(R_{0.5} (X_{2})=U, ~~R_{0.3} (X_{2})=U\). Here we have \(S\left( {X_{2}, R_{0.3} (X_{2})} \right) =S\left( {X_{2}, R_{0.5} (X_{2})} \right) \).
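These values can be checked with the helpers sketched in Sect. 2, entering the partition \(U/\textit{IND}(C)\) directly (objects are represented by their indices):

```python
U = set(range(1, 10))
classes = [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}]               # U/IND(C)
X1 = {1, 2, 3}
R05 = lambda_approximation(X1, classes, Fraction(1, 2))   # empty set
R03 = lambda_approximation(X1, classes, Fraction(3, 10))  # all of U
print(similarity(X1, R05), similarity(X1, R03))           # 0 and 1/3
```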

So, we find that the approximation set \(R_{0.5} (X_{1})\) is not the optimal approximation set of \(X_{1} \); on the contrary, \(R_{0.5} \left( {X_{2}} \right) \) is the optimal approximation set of \(X_{2}\). Therefore, the approximation set \(R_{0.5} (X)\) may not be the optimal approximation set of X. Moreover, since \(\mu _X^R \left( x \right) \) takes only finitely many values on a finite domain, \(S(X, R_{\lambda } (X))\) takes finitely many values as well, so an optimal approximation set of X in rough approximation spaces must exist. Thus, based on the membership degree function \(\mu _X^R\left( x \right) \), the optimal approximation set of X in rough approximation space is proposed through maximizing the similarity between the target concept and its approximation sets.

Definition 8

(Optimal approximation set). Let X be a subset (target concept) of U, \(R_{\lambda } (X)\) be a \(\lambda \)-approximation set of X, and \({S_{Best}} = \mathop {\max }\limits _{0 < \lambda \le 1} \left\{ {S\left( {X,{R_\lambda }\left( X \right) } \right) } \right\} \). For any \(\lambda \) \((0 < \lambda \le 1)\), if \(S\left( {X,{R_\lambda }\left( X \right) } \right) = {S_{Best}}\), then \(R_{\lambda } (X)\) is called the optimal approximation set of X, denoted by \({R_{Best}}\left( X \right) \). Namely, \(S\left( {X,{R_{Best}}\left( X \right) } \right) = {S_{Best}}\).
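Since \(\mu _{X}^{R} (x)\) takes only finitely many distinct values on a finite domain, \(R_{\lambda } (X)\) changes only when \(\lambda \) crosses one of those values, so \(S_{Best}\) can be found by exhaustive search over them. A possible sketch of such an algorithm, continuing the helpers above (ties are resolved by keeping the first maximizer found; the definition itself does not prescribe a tie-breaking rule):

```python
def best_approximation(X, classes):
    """Search for R_Best(X): try every distinct positive membership
    degree as threshold lambda and keep the R_lambda(X) maximizing
    S(X, R_lambda(X))."""
    candidates = sorted({Fraction(len(X & b), len(b)) for b in classes} - {0})
    best = (None, set(), Fraction(0))        # (lambda, R_lambda(X), S)
    for lam in candidates:
        R = lambda_approximation(X, classes, lam)
        s = similarity(X, R)
        if s > best[2]:
            best = (lam, R, s)
    return best
```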

According to Definition 8, we know that if \(R_{\lambda } (X)\) is \(R_\textit{Best} (X)\), then \(\lambda \) must lie in the interval (0, 0.5]. In order to illustrate Definition 8 more clearly, an example with a decision information table (Table 2) is presented.

Example 2

Let domain \(U=\{x_{1}, x_{2}, \ldots , x_{15} \}\), the condition attribute set \(C=\left\{ {a, b, c} \right\} \) and the decision attribute set \(D=\left\{ d \right\} \).

Table 2. Decision information table

According to rough set theory, the following partition is obtained easily,

$$U/\textit{IND}(D) =\left\{ \{x_{1}, x_{2}, x_{3}, x_{4}, x_{5} \}, \{x_{6}, x_{7}, x_{8},x_{9}, x_{10} \},\right. \left. \{x_{11}, x_{12}, x_{13}, x_{14}, x_{15} \} \right\} . $$

Here, three decision concepts induced by decision attribute set are generated, and they are \(X_{1} =\{x_{1}, x_{2}, x_{3},\) \( x_{4}, x_{5} \}\), \(X_{2} =\{x_{6}, x_{7}, x_{8}, x_{9}, x_{10} \}\) and \(X_{3} =\{x_{11}, \) \(x_{12}, \) \(x_{13}, x_{14}, x_{15} \}\) respectively. For the decision concept \(X_{1}\), computing \(U/\textit{IND}(C)\),

$$U/\textit{IND}(C) =\{\{x_{1}, x_{2}, x_{7}, x_{11} \}, \{x_{3}, x_{4}, x_{8}, x_{9}, x_{13} \},\{x_{5}, x_{10}, x_{14} \}, \{x_{6}, x_{12}, x_{15} \} \}. $$

Then let \(Y_{1}=\{x_{1}, x_{2}, x_{7}, x_{11} \}\), \(Y_{2}=\{x_{3}, x_{4}, x_{8}, x_{9}, x_{13} \}\), \(Y_{3}=\{x_{5}, x_{10}, x_{14} \}\), \(Y_{4}=\{x_{6}, x_{12}, x_{15} \}\). Compute the membership degree \(\mu (x)\) of each \(x\in U\), where \(\mu (x) = \frac{{\left| {{Y_i} \cap X} \right| }}{{\left| {{Y_i}} \right| }}\); that is to say, every object in an equivalence class \({Y_i}\) has the same membership degree. For \(X_{1}\), the membership degrees are shown as follows,

(1) For equivalence class \(Y_{1}\), the membership degree is 1/2;

(2) For equivalence class \(Y_{2}\), the membership degree is 2/5;

(3) For equivalence class \(Y_{3}\), the membership degree is 1/3;

(4) For equivalence class \(Y_{4}\), the membership degree is 0.

Sorting the membership degrees \(\mu _{i}\) and computing \(R_{\mu _{i}} \left( {X_{1} } \right) \) for each of them, we can get

$$\begin{aligned}&R_{1/3} ({X_{1}} )=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{7}, x_{8}, x_{9},x_{10}, x_{11}, x_{13}, x_{14} \};\\&R_{2/5} ({X_{1}} )=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{7}, x_{8}, x_{9}, x_{11}, x_{13} \};R_{0.5} ({X_{1}} )=\left\{ {x_{1}, x_{2}, x_{7}, x_{11}}\right\} . \end{aligned}$$

Further, we can obtain:

$$ S\left( {{X_1},{R_{{1/3}}}\left( {{X_1}} \right) } \right) =\frac{5}{12}, S\left( {{X_1},{R_{{2/5}}}\left( {{X_1}} \right) } \right) =\frac{2}{5}, S\left( {{X_1},{R_{{0.5}}}\left( {{X_1}} \right) } \right) =\frac{2}{7}. $$

We find that \(S\left( {X_{1}, R_{1/3} (X_{1})} \right) \) is the maximum value, that is to say, the approximation set \(R_{1/3} (X_{1})\) is the optimal approximation set of \(X_{1}\). Then we can obtain \(R_\textit{Best} (X_{1})\) as follows,

$$R_\textit{Best} (X_{1})=R_{1/3} (X_{1})=\{ x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{7}, x_{8}, x_{9}, x_{10}, x_{11}, x_{13}, x_{14} \}.$$

Similarly,

$$\begin{aligned} R_\textit{Best} (X_{2})=&\{x_{3}, x_{4}, x_{5}, x_{6}, x_{8}, x_{9}, x_{10}, x_{12}, x_{13}, x_{14}, x_{15} \};\\ R_\textit{Best} (X_{3})=&\left\{ {x_{5}, x_{6}, x_{10}, x_{12}, x_{14}, x_{15}} \right\} . \end{aligned}$$
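As a usage check, the result for \(X_{1}\) can be reproduced with the search sketch above by entering the partition \(U/\textit{IND}(C)\) directly:

```python
classes = [{1, 2, 7, 11}, {3, 4, 8, 9, 13}, {5, 10, 14}, {6, 12, 15}]
X1 = {1, 2, 3, 4, 5}
lam, R, s = best_approximation(X1, classes)
print(lam, sorted(R), s)
# 1/3  [1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 14]  5/12
```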

The purpose of selecting \(R_\textit{Best} (X)\) is to characterize a target concept and further acquire rules. So, compared with \(\overline{{R}}(X)\), \(\underline{R}(X)\) and \(R_{0.5}(X)\), what advantages does \(R_\textit{Best} (X)\) have? The related analysis is shown as follows.

Example 3 (continued from Example 2). In Table 2, we can get three decision concepts \(X_{1} =\left\{ {x_{1}, x_{2}, x_{3}, x_{4}, x_{5}} \right\} \), \(X_{2} =\left\{ {x_{6}, x_{7}, x_{8}, x_{9}, x_{10}} \right\} \) and \(X_{3} =\{{x_{11}, x_{12}, x_{13}, x_{14}, x_{15}} \}\). With these decision concepts, we can obtain,

$$\begin{aligned}&\underline{{R}}(X_{1})=\emptyset , \underline{{R}}(X_{2})=\emptyset , \underline{{R}}(X_{3})=\emptyset ;\\&\overline{{R}}(X_{1})=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{7}, x_{8}, x_{9},x_{10}, x_{11}, x_{13}, x_{14} \}, \\&\overline{{R}}(X_{2})=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9}, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}, x_{15} \}, \\&\overline{{R}}(X_{3})=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9}, x_{10}, x_{11}, x_{12}, x_{13}, x_{14}, x_{15} \};\\&{R}_{0.5}(X_{1})=\{{x_{1}, x_{2}, x_{7}, x_{11}}\}, {R}_{0.5}(X_{2})=\emptyset , {R}_{0.5}(X_{3})=\{{x_{6}, x_{12}, x_{15}}\};\\&R_\textit{Best} (X_{1})=\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{7}, x_{8}, x_{9}, x_{10}, x_{11}, x_{13}, x_{14} \}, \\&R_\textit{Best} (X_{2})=\{x_{3}, x_{4}, x_{5}, x_{6}, x_{8}, x_{9}, x_{10}, x_{12}, x_{13},x_{14}, x_{15} \}, \\&R_\textit{Best} (X_{3})=\{{x_{5}, x_{6}, x_{10}, x_{12}, x_{14}, x_{15}} \}. \end{aligned}$$

From the decision information Table 2, the decision rules acquired based on \(\underline{{R}}(X)\) are shown as follows,

(1) For \(X_{1}\), the corresponding approximation set is \(\emptyset \);

(2) For \(X_{2}\), the corresponding approximation set is \(\emptyset \);

(3) For \(X_{3}\), the corresponding approximation set is \(\emptyset \).

The decision rules acquired based on \(\overline{{R}}(X)\) are shown as follows,

(1) For \(X_{1}\), the decision rule is \((a=1\wedge b=2\wedge c=0)\vee (a=1\wedge b=0\wedge c=2)\vee (a=2\wedge b=1\wedge c=1) \rightarrow d=1\);

(2) For \(X_{2}\), the decision rule is \((a=2\wedge b=2\wedge c=1)\vee (a=1\wedge b=2\wedge c=0)\vee (a=1\wedge b=0\wedge c=2)\) \(\vee (a=2\wedge b=1\wedge c=1)\rightarrow d=2\);

(3) For \(X_{3}\), the decision rule is \((a=1\wedge b=2\wedge c=0)\vee (a=2\wedge b=2\wedge c=1)\vee (a=1\wedge b=0\wedge c=2)\vee (a=2\wedge b=1\wedge c=1)\rightarrow d=3\).

The decision rules acquired based on \(R_{0.5} (X)\) are shown as follows,

(1) For \(X_{1}\), the decision rule is \((a=1\wedge b=2\wedge c=0)\rightarrow d=1\);

(2) For \(X_{2}\), the corresponding approximation set is \(\emptyset \);

(3) For \(X_{3}\), the decision rule is \((a=2\wedge b=2\wedge c=1)\rightarrow d=3\), since \({R}_{0.5}(X_{3})=Y_{4}\).

The decision rules acquired based on \(R_\textit{Best} (X)\) are shown as follows,

(1) For \(X_{1}\), the decision rule is \((a=1\wedge b=2\wedge c=0)\vee (a=1\wedge b=0\wedge c=2)\vee (a=2\wedge b=1\wedge c=1)\rightarrow d=1\);

(2) For \(X_{2}\), the decision rule is \((a=2\wedge b=1\wedge c=1)\vee (a=1\wedge b=0\wedge c=2)\vee (a=2\wedge b=2\wedge c=1) \rightarrow d=2\);

(3) For \(X_{3}\), the decision rule is \((a=2\wedge b=1\wedge c=1)\vee (a=2\wedge b=2\wedge c=1)\rightarrow d=3\).

Table 3. Comparative analysis

A comparative analysis (Table 3) is constructed according to the above decision rules. From the rules acquired from \(\underline{{R}}(X)\), \(\overline{{R}}(X)\), \(R_{0.5} (X)\) and \(R_\textit{Best} (X)\) and Table 3, qualitative and quantitative comparisons can be made. It is obvious that many objects cannot be assigned a decision classification if the decision rules are acquired based on \(R_{0.5} (X)\) or \(\underline{{R}}(X)\), and this is not an expected result in actual decision problems. Although each object belongs to some decision classification if the decision rules are acquired based on \(\overline{{R}}(X)\), the objects in the boundary region cannot be assigned to a certain decision classification, and misclassifications will be produced. Compared with \(R_{0.5} (X)\) and \(\underline{{R}}(X)\), the amount of correctly classified objects increases and the amount of unclassified objects reduces if the decision rules are acquired based on \(R_\textit{Best} (X)\). Compared with \(\overline{{R}}(X)\), although the amount of correctly classified objects declines slightly, the amount of misclassified objects also reduces if the decision rules are acquired based on \(R_\textit{Best} (X)\).

According to the above comparison results, we find that the decision rules based on \(R_\textit{Best} (X)\) have more powerful classification ability than the decision rules based on \(\underline{R}(X)\), \(\overline{{R}}(X)\) and \(R_{0.5} (X)\). \(R_\textit{Best} (X)\) provides a novel perspective for the approximate characterization of a target concept in multi-granularity spaces. Furthermore, it would be an effective method suitable for real-life knowledge discovery from uncertain information systems.

4 Operation Properties of \(R_{\textit{Best}} (X)\)

It is well known that \(\underline{R}(X)\) and \(\overline{{R}}(X)\) have many important operation properties, as shown in the literature [8, 24]. Now, we will prove that the optimal approximation set \(R_\textit{Best} (X)\) has many operation properties similar to those of the upper approximation set and the lower approximation set. For convenience, let \(R_\textit{Best} (X)\) \(=\) \( R_{k} (X)\) \((0 < k \le 0.5)\), let \(\underline{R}_{k} (X)\) denote the corresponding \(k\)-strong approximation set of Definition 4, let U be a finite domain, and let X, Y be two subsets of U. We have

(1) \(\sim R_{k} (X)=\underline{R}_{1-k} (\sim X)\), \(\sim \underline{R}_{k} (X)=R_{1-k} (\sim X)\), where \(\sim X=U-X\);

(2) if \(X\subseteq Y\), then \(R_{k} (X)\subseteq R_{k} (Y)\), \(\underline{R}_{k} (X)\subseteq \underline{R}_{k} (Y)\);

(3) \(R_{k} ({X\cap Y} )\subseteq R_{k} (X)\cap R_{k} (Y)\), \(\underline{R}_{k} ({X\cap Y} )\subseteq \underline{R}_{k} (X)\cap \underline{R}_{k} (Y)\);

(4) \(R_{k} ({X\cup Y} )\supseteq R_{k} (X)\cup R_{k} (Y)\), \(\underline{R}_{k} ({X\cup Y} )\supseteq \underline{R}_{k} (X)\cup \underline{R}_{k} (Y)\).

Proof

(1) Because \(\mu _{\sim X}^{R} (x)=1-\mu _{X}^{R} (x)\) for any \(x\in U\), we have \(\sim R_{k} (X)=\{x|x\in U\wedge \mu _{X}^{R} (x)<k\}=\{x|x\in U\wedge \mu _{\sim X}^{R} (x)>1-k\}=\underline{R}_{1-k} (\sim X)\). Similarly, \(\sim \underline{R}_{k} (X)=R_{1-k} (\sim X)\) holds. Hence, the proposition (1) is proved.

(2) \(\forall x\in R_{k} (X)\), we have \([x]_{R}\) satisfying \(\frac{| {[x]_{R} \cap X} |}{| {[x]_{R}}|}\) \(\ge \) k; because \(X\subseteq Y\), we have \(| [x]_{R}\) \( \cap \) \( X|\le | [x]_{R} \) \(\cap \) Y|, then \(\frac{| {[x]_{R} \cap X} |}{| {[x]_{R}} |}\le \frac{| {[x]_{R} \cap Y} |}{| {[x]_{R}} |}\), and then \(\frac{| {[x]_{R} \cap Y}|}{| {[x]_{R}} |}\ge k\). So we can get \(x\in R_{k} (Y )\), and then \(R_{k} (X)\subseteq R_{k} (Y)\). Similarly, \(\forall x\in \underline{R}_{k} (X)\), we have \([x]_{R} \) satisfying \(\frac{| {[x]_{R} \cap X}|}{| {[x]_{R}} |}>k\); because \(X\subseteq Y\), we have \(| {[x]_{R} \cap X} |\le | {[x]_{R} \cap Y} |\), then \(\frac{| {[x]_{R} \cap X} |}{| {[x]_{R}} |}\le \frac{| {[x]_{R} \cap Y} |}{| {[x]_{R}} |}\), and we can get \(\frac{| {[x]_{R} \cap Y} |}{| {[x]_{R}} |}>k\). Therefore, we have \(x\in \underline{R}_{k} (Y)\), and then \(\underline{R}_{k} (X)\subseteq \underline{R}_{k} (Y)\). So, the proposition (2) is proved.

(3) Because \(X\cap Y\subseteq X\) and \(X\cap Y\subseteq Y\), according to proposition (2) we have \(R_{k} ({X\cap Y} )\subseteq R_{k} (X)\) and \(R_{k} ({X\cap Y} )\subseteq R_{k} (Y)\), and hence \(R_{k} ({X\cap Y} )\subseteq R_{k} (X)\cap R_{k} (Y)\). Similarly, we can get \(\underline{R}_{k} ({X\cap Y} )\subseteq \underline{R}_{k} (X)\cap \underline{R}_{k} (Y)\). So the proposition (3) holds.

(4) Because \(X\subseteq X\cup Y\) and \(Y\subseteq X\cup Y\), according to proposition (2), \(R_{k} (X)\subseteq R_{k} ({X\cup Y} )\) and \(R_{k} (Y)\subseteq R_{k} ({X\cup Y} )\), so \(R_{k} ({X\cup Y} )\supseteq R_{k} (X)\cup R_{k} (Y)\) holds. Similarly, because \(X\subseteq X\cup Y\) and \(Y\subseteq X\cup Y\), both \(\underline{R}_{k} (X)\subseteq \underline{R}_{k} ({X\cup Y} )\) and \(\underline{R}_{k} (Y)\subseteq \underline{R}_{k} ({X\cup Y} )\) hold, so we have \(\underline{R}_{k} ({X\cup Y} )\supseteq \underline{R}_{k} (X)\cup \underline{R}_{k} (Y)\). Then the proposition (4) is proved.
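These propositions can also be spot-checked numerically. The sketch below is our own test harness (property (1) is asserted in the duality form reconstructed above); it draws random partitions and random subsets and checks propositions (1)-(4):

```python
import random

def check_operation_properties(trials=1000, k=Fraction(2, 5)):
    U = set(range(12))
    for _ in range(trials):
        labels = {x: random.randrange(4) for x in U}
        classes = [{x for x in U if labels[x] == i}
                   for i in set(labels.values())]
        X = {x for x in U if random.random() < 0.5}
        Y = {x for x in U if random.random() < 0.5}
        Rk = lambda S: lambda_approximation(S, classes, k)
        Rk_s = lambda S: lambda_approximation(S, classes, k, strict=True)
        # (1) duality: complement of R_k(X) equals the (1-k)-strong
        #     approximation set of the complement of X
        assert U - Rk(X) == lambda_approximation(U - X, classes, 1 - k,
                                                 strict=True)
        # (2) monotonicity, instantiated with X subset of X | Y
        assert Rk(X) <= Rk(X | Y) and Rk_s(X) <= Rk_s(X | Y)
        # (3) intersection and (4) union
        assert Rk(X & Y) <= Rk(X) & Rk(Y)
        assert Rk(X | Y) >= Rk(X) | Rk(Y)
```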

5 Change Rules of \(R_\textit{Best} (X)\) with Changing Knowledge Granularity

Currently, the change rules of rough set uncertainty in different knowledge granularity levels are one of the important issues in measuring the uncertainty of knowledge [1, 3, 5, 6, 15, 16]. Therefore, the change rules of \(S(X, R_\textit{Best} (X))\) in different knowledge granularity levels are also the focus of our attention. Next, we will discuss these change rules in detail. Firstly, suppose a, b, c, d, e and f are all positive real numbers; some basic results and lemmas are reviewed in order to discuss the relevant theorems easily.

Lemma 1

[21] If \(0<a<b\) and \(0<c<d\), then \(a/b<{({a+d} )}/{({b+c} )}\).

Lemma 2

[21] If \(f/{e={({b+d} )}/{({a+c} )}}\) and \(b/{a<f/e}\), then \(d/{c>f/e}\).

Lemma 3

[19] Let \(0<c<a\), \(0<d<b\). If \(a/b\ge c/d\), then \(a/b\le {\left( {a-c} \right) }/{\left( {b-d} \right) }\). On the contrary, if \(a/b\le c/d\), then \(a/b\ge {\left( {a-c} \right) }/{\left( {b-d} \right) }\).

Next, we discuss the relationship between \(S( {X, R_\textit{Best} (X)} )\) and \(S(X, {R}'_\textit{Best} (X) )\) in different knowledge granularity levels. Let \(R_\textit{Best} (X)=R_{\lambda } (X)\) \(\left( {0<\lambda \le 0.5} \right) \). Let \([x_{i_{1}}]_{R}, [x_{i_{2}}]_{R}, \ldots ,\) \( [x_{i_{k} }]_{R} \) be the equivalence classes induced by an equivalence relation R on U, and \([x_{i_{1}}]_{{R}'}, [x_{i_{2}}]_{{R}'},\) \( \ldots , [x_{i_{k}}]_{{R}'} \) be the equivalence classes induced by another equivalence relation \({R}'\) on U. If the partition \(U/{{R}'}\) is finer than \(U/R\), namely \(U/{{R}'}\preceq U/R\), then for any \(x\in U\) we have \([x]_{{R}'} \subseteq [x]_{R}\). For convenience, let \(R_{\lambda } (X)=\underline{R}(X)\cup [x_{i_{1}}]_{R} \cup [x_{i_{2}}]_{R} \cup \ldots \cup [x_{i_{k} }]_{R}\), and \(\textit{BND}_{R} (X)=[x_{i_{1}}]_{R} \cup [x_{i_{2}} ]_{R} \cup \ldots \cup [x_{i_{m}}]_{R}\) with \(k\le m\). The equivalence classes that are subdivided into finer equivalence classes (sub-granules) may lie in \(\textit{NEG}_{R} (X)\), \(\textit{POS}_{R} (X)\) or \(\textit{BND}_{R} (X)\). When the subdivided equivalence classes lie in \(\textit{NEG}_{R} (X)\) or \(\textit{POS}_{R} (X)\), \(S({X, R_{\lambda } (X)} )=S({X, {R}'_{\lambda } (X)} )\) holds. Next we will focus on \(S({X, {R}'_{\lambda } (X)} )\) when the subdivided equivalence classes lie in \(\textit{BND}_{R} (X)\). For simplicity, suppose only one granule is subdivided into two disjoint granules and the others remain unchanged, that is, suppose \([x_{i_{t}}]_{R} =[x_{i_{t}} ^{1}]_{{R}'} \cup [x_{i_{t}} ^{2}]_{{R}'}\). This situation will be discussed as follows.

Theorem 1

If \(\lambda =0.5\), \(1\le t\le k\), that is \([x_{i_{t}} ]_{R} \subset R_{\lambda } (X)\):

(1) If \([x_{i_{t}} ^{1}]_{{R}'} \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2}]_{{R}'} \subset {R}'_{\lambda } (X)\), then \(S\left( {X, R_{\lambda } (X)} \right) =S\left( {X, {R}'_{\lambda } (X)} \right) \);

(2) If \([x_{i_{t}} ^{1}]_{{R}'} \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2}]_{{R}'} \not \subset {R}'_{\lambda } (X)\), then \(S\left( {X, R_{\lambda } (X)} \right) \le S\left( {X, {R}'_{\lambda } (X)} \right) \).

Proof

\(\forall x\in R_{0.5} (X)\), we have \(\mu _{X}^{R} (x)=\frac{\left| {[x]_{R} \cap X} \right| }{\left| {[x]_{R}} \right| } \ge 0.5\). Then we can obtain,

$$R_{0.5} (X)=\left\{ {x|\mu _{X}^{R} (x)\ge 0.5} \right\} =\left\{ {x|\mu _{X}^{R} (x)=1} \right\} \cup \left\{ {x|0.5\le \mu _{X}^{R} (x)<1} \right\} . $$

Obviously, \(\left\{ {x|\mu _{X}^{R} (x)=1} \right\} =\underline{R}(X)\). Let \(\left\{ {x|0.5\le \mu _{X}^{R} (x)<1} \right\} =[x_{i_{1}}]_{R} \cup [x_{i_{2}}]_{R} \cup \ldots \cup [x_{i_{k}} ]_{R}\), then we have

$$X\cap R_{0.5} (X) = X\cap \left( {\underline{R}(X)\cup [x_{i_{1}}]_{R} \cup [x_{i_{2}}]_{R} \cup \ldots \cup [x_{i_{k}}]_{R}}\right) .$$

Since the intersection of any two of \(\underline{R}(X)\), \([x_{i_{1}}]_{R}, [x_{i_{2}}]_{R}, \ldots ,\) \( [x_{i_{k}}]_{R} \) is the empty set, we have

$$\begin{aligned} | {X\cap R_{0.5} (X)} |=&| {X\cap \underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R}} |+| {X\cap [x_{i_{2}}]_{R}} |+\ldots +| {X\cap [x_{i_{k}}]_{R}} | \\ =&| {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R}} |+| {X\cap [x_{i_{2}}]_{R}} |+\ldots +| {X\cap [x_{i_{k}}]_{R}} |. \end{aligned}$$

Since \(X\cup R_{0.5} (X)=X\cup ({[x_{i_{1}}]_{R} -X} )\) \(\cup \) \( ({[x_{i_{2}}]_{R} -X} )\) \(\cup \) \( \ldots \) \(\cup \) \( ( {[x_{i_{k}}]_{R} -X} )\) and the intersection of any two of X, \(({[x_{i_{1}}]_{R} -X} ), ({[x_{i_{2}}]_{R} -X} ), \ldots ,\) \( ({[x_{i_{k}}]_{R} -X} )\) is the empty set, we have

$$ | {X\cup R_{0.5} (X)} |=|X|+| {[x_{i_{1}}]_{R} -X} |+| {[x_{i_{2}} ]_{R} -X} |+\ldots +| {[x_{i_{k}}]_{R} -X} |. $$

Therefore,

$$S({X, R_{0.5} (X)} )=\frac{| {\underline{R}(X)} |+| {X\cap [x_{i_{1} }]_{R}} |+| {X\cap [x_{i_{2}}]_{R}} |+\ldots +| {X\cap [x_{i_{k}}]_{R}} |}{|X|+| {[x_{i_{1}} ]_{R} -X} |+| {[x_{i_{2}}]_{R} -X} | +\ldots +| {[x_{i_{k}}]_{R} -X} |}. $$

While \([x_{i_{t}} ^{1}]_{{R}'} \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2}]_{{R}'} \subset {R}'_{\lambda } (X)\), we can get

$$\begin{aligned}&S({X, {R}'_{0.5} (X)} )\\ =\,&\frac{| {{\underline{R}}'( X )} |+| {X\cap [x_{i_{1}}]_{{R}'}} |+\ldots +| {X\cap [x_{i_{t}} ^{1}]_{{R}'}} | +| {X\cap [x_{i_{t}} ^{2} ]_{{R}'}} |+\ldots +| {X\cap [x_{i_{k}}]_{{R}'}} |}{| X |+| {[x_{i_{1}}]_{{R}'} -X} |+\ldots +| {[x_{i_{t} }^{1}]_{{R}'} -X} | +| {[x_{i_{t}} ^{2}]_{{R}'} -X} |+\ldots +| {[x_{i_{k}}]_{{R}'} -X} |} \\ =\,&\frac{| {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R} } |+| {X\cap [x_{i_{2}}]_{R}} | +\ldots +| {X\cap [x_{i_{k}}]_{R}} |}{|X|+| {[x_{i_{1}}]_{R} -X} |+| {[x_{i_{2}}]_{R} -X} | +\ldots +| {[x_{i_{k}}]_{R} -X} |} \\ =\,&S({X, R_{0.5} (X)} ). \end{aligned}$$

So the part (1) is proved.

For the part (2) when \([x_{i_{t}} ^{2}]_{{R}'} \not \subset {R}'_{\lambda } (X)\), we can have the equality as follows,

$$S({X, {R}'_{0.5} (X)} )= \frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}}]_{{R}'}} |+\ldots +| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |+\ldots +| {X\cap [x_{i_{k}}]_{{R}'}} | \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}}]_{{R}'} -X} |+\ldots +| {[x_{i_{t} }^{1}]_{{R}'} -X} |+\ldots +| {[x_{i_{k}}]_{{R}'} -X} | \end{array} }. $$

With Lemma 2, we have the inequality \(\frac{| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |}{| {[x_{i_{t}} ^{1}]_{{R}'} -X} |}>\frac{| {X\cap [x_{i_{t}}]_{R}} |}{| {[x_{i_{t}}]_{R} -X} |}\). Let \(\frac{| {X\cap [x_{i_{t} }^{1}]_{{R}'}} |}{| {[x_{i_{t}} ^{1}]_{{R}'} -X} |}=\frac{| {X\cap [x_{i_{t}}]_{R}} |+p}{| {[x_{i_{t}}]_{R} -X} |+q}\), then \(\frac{p}{q}>\frac{| {X\cap [x_{i_{t}}]_{R}} |}{| {[x_{i_{t}}]_{R} -X} |}\), and then we can get

$$\begin{aligned}&S({X, {R}'_{\lambda } (X)} )\\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}}]_{{R}'}} |+| {X\cap [x_{i_{2}}]_{{R}'}} |+\ldots +| {X\cap [x_{i_{t}} ^{1}]_{{R}'}} |+\ldots +| {X\cap [x_{i_{k}}]_{{R}'}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}}]_{{R}'} -X} |+| {[x_{i_{2}}]_{{R}'} -X} |+\ldots +| {[x_{i_{t}} ^{1}]_{{R}'} -X} |+\ldots +| {[x_{i_{k}}]_{{R}'} -X} | \end{array} } \\ =&\frac{ \begin{array}{l} | {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R} } |+| {X\cap [x_{i_{2}}]_{R}} | +\ldots +| {X\cap [x_{i_{k}}]_{R}} |+p \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}}]_{R} -X} |+| {[x_{i_{2}}]_{R} -X} | +\ldots +| {[x_{i_{k}} ]_{R} -X} |+q \end{array} }. \end{aligned}$$

We know

$$S({X, R_{0.5} (X)} ) =\frac{ \begin{array}{l} | {\underline{R}( X )} |+| {X\cap [x_{i_{1}}]_{R}} |+| {X\cap [x_{i_{2}}]_{R}} | +\ldots +| {X\cap [x_{i_{k}}]_{R}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}}]_{R} -X} |+| {[x_{i_{2}}]_{R} -X} | +\ldots +| {[x_{i_{k}}]_{R} -X} | \end{array} }, $$

according to Definition 7 we have \(0\le S({X, R_{0.5} (X)} )\) \(\le \) 1, then we can easily get the following inequality,

$$\begin{aligned}&| {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R}} |+| {X\cap [x_{i_{2}}]_{R}} | +\ldots +| {X\cap [x_{i_{k}}]_{R}} |\\&\le | X |+| {[x_{i_{1}}]_{R} -X} | +| {[x_{i_{2}}]_{R} -X} |+\ldots +| {[x_{i_{k}} ]_{R} -X} |. \end{aligned}$$

Since \(\frac{p}{q}>\frac{| {X\cap [x_{i_{t}}]_{R}} |}{| {[x_{i_{t}}]_{R} -X} |}\ge 1\) (the class \([x_{i_{t}}]_{R}\) satisfies \(\mu _{X}^{R} (x)\ge 0.5\)), we have \(p>q\). With Lemma 1,

$$\frac{| {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R}} |+\ldots +| {X\cap [x_{i_{k}}]_{R}} |+p}{| X |+| {[x_{i_{1}}]_{R} -X} |+\ldots +| {[x_{i_{k}}]_{R} -X} |+q} \ge \frac{| {\underline{R}(X)} |+| {X\cap [x_{i_{1}}]_{R}} |+\ldots +| {X\cap [x_{i_{k}}]_{R}} |}{| X |+| {[x_{i_{1}}]_{R} -X} |+\ldots +| {[x_{i_{k}}]_{R} -X} |}, $$

namely \(S({X, R_{0.5} (X)} )\le S( {X, {R}'_{0.5} (X)} )\). Hence the part (2) is proved.

Theorem 1 shows that when \(\lambda =0.5\), no matter whether the equivalence classes subdivided into sub-granules lie in \(\textit{NEG}_{R} (X)\), \(\textit{POS}_{R} (X)\) or \(\textit{BND}_{R} (X)\), the similarity degree \(S( {X, {R}'_{\lambda } (X)} )\) is not smaller than \(S( {X, R_{\lambda } (X)} )\).
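A tiny numeric illustration of part (2), using the helpers from Sect. 2 with our own toy data: splitting a boundary class so that one sub-granule stays at or above the 0.5 threshold while the other falls below it raises the similarity degree.

```python
X = {1, 2, 3}
coarse = [{1, 2, 3, 4, 5, 6}]        # one class with mu = 1/2
fine = [{1, 2, 3, 4}, {5, 6}]        # split into mu = 3/4 and mu = 0
half = Fraction(1, 2)
s_coarse = similarity(X, lambda_approximation(X, coarse, half))  # 1/2
s_fine = similarity(X, lambda_approximation(X, fine, half))      # 3/4
assert s_coarse <= s_fine
```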

Theorem 2

If \(0<\lambda <0.5\), \(k<t\le m\), that is \([x_{i_{t}} ]_{R} \not \subset R_{\lambda } (X)\):

(1) If \([x_{i_{t}} ^{1}]_{{R}'} \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2}]_{{R}'} \not \subset {R}'_{\lambda } (X)\) and \(S( {X, R_{\lambda } (X)} )\le {| {X\cap [x_{i_{t}} ^{1}]_{{R}'}} |}/ {| {[x_{i_{t}} ^{1}]_{{R}'} -X} |}\), then \(S( {X, R_{\lambda } (X)} )\le S({X, {R}'_{\lambda } (X)} )\);

(2) If \([x_{i_{t}} ^{1}]_{{R}'} \not \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\), then \(S({X, R_{\lambda } (X)} )=S( {X, {R}'_{\lambda } (X)} )\).

Proof

According to the proof of Theorem 1, we can get the following equality,

$$S({X, R_{\lambda } (X)} )=\frac{| {\underline{R}(X)} |+| {X\cap [x_{i_{1}} ]_{R}} |+| {X\cap [x_{i_{2}} ]_{R}} | +\ldots +| {X\cap [x_{i_{k}} ]_{R}} |}{| X |+| {[x_{i_{1}} ]_{R} -X} |+| {[x_{i_{2}} ]_{R} -X} | +\ldots +| {[x_{i_{k}} ]_{R} -X} |}. $$

Since \([x_{i_{t}} ^{1} ]_{{R}'} \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\) we can obtain the following equality,

$$\begin{aligned}&S({X, {R}'_{\lambda } (X)} )=\frac{| {X\cap {R}'_{\lambda } (X)} |}{| {X\cup {R}'_{\lambda } (X)} |} \\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}} ]_{{R}'}} |+| {X\cap [x_{i_{2}} ]_{{R}'}} | +\ldots +| {X\cap [x_{i_{k}} ]_{{R}'}} |+| {X\cap [x_{i_{t}} ^{1} ]_{{R}'} } | \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}} ]_{{R}'} -X} |+| {[x_{i_{2}} ]_{{R}'} -X} | +\ldots +| {[x_{i_{k}} ]_{{R}'} -X} |+| {[x_{i_{t}} ^{1} ]_{{R}'} -X} | \end{array} } \\ =&\frac{ \begin{array}{l} | {\underline{R}(X)} |+| {X\cap [x_{i_{1}} ]_{R} } |+| {X\cap [x_{i_{2}} ]_{R}} | +\ldots +| {X\cap [x_{i_{k}} ]_{R}} |+| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} | \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}} ]_{R} -X} |+| {[x_{i_{2}} ]_{R} -X} | +\ldots +| {[x_{i_{k}} ]_{R} -X} |+| {[x_{i_{t}} ^{1} ]_{{R}'} -X} | \end{array} }. \end{aligned}$$

Let \(S({X, R_{\lambda } (X)} )=s\), \(| {\underline{R}(X)} |+| {X\cap [x_{i_{1}} ]_{R}} |+\ldots +| {X\cap [x_{i_{k}} ]_{R}} |=k_{1} \) and \(| X |+| {[x_{i_{1}} ]_{R} -X} |+\ldots +| {[x_{i_{k}} ]_{R} -X} |=k_{2}\). We can get the following equality,

$$ S({X, {R}'_{\lambda } (X)} )=\frac{k_{1} +| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |}{k_{2} +| {[x_{i_{t}} ^{1} ]_{{R}'} -X} |}. $$

For \(S({X, R_{\lambda } (X)} )=s=\frac{k_{1}}{k_{2} }\le \frac{| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |}{| {[x_{i_{t}} ^{1} ]_{{R}'} -X} |}\), and according to Lemma 2 we can easily obtain the following inequality,

$$ S({X, {R}'_{\lambda } (X)} )=\frac{k_{1} +| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |}{k_{2} +| {[x_{i_{t}} ^{1} ]_{{R}'} -X} |} \ge \frac{k_{1}}{k_{2}} =S( {X, R_{\lambda } (X)} ). $$

Therefore the part (1) is proved.

For the part (2), since \([x_{i_{t}} ^{1} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\), we can easily get the equality,

$$\begin{aligned}&S({X, {R}'_{\lambda } (X)} )=\frac{| {X\cap {R}'_{\lambda } (X)} |}{| {X\cup {R}'_{\lambda } (X)} |}\\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}} ]_{{R}'}} |+| {X\cap [x_{i_{2} } ]_{{R}'}} | +\ldots +| {X\cap [x_{i_{k}} ]_{{R}'}} | \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}} ]_{{R}'} -X} |+| {[x_{i_{2}} ]_{{R}'} -X} | +\ldots +| {[x_{i_{k}} ]_{{R}'} -X} | \end{array} } \\ =&\frac{ \begin{array}{l} | {\underline{R}(X)} |+| {X\cap [x_{i_{1}} ]_{R} } |+| {X\cap [x_{i_{2}} ]_{R}} | +\ldots +| {X\cap [x_{i_{k}} ]_{R}} | \end{array} }{ \begin{array}{l} | X |+| {[x_{i_{1}} ]_{R} -X} |+| {[x_{i_{2}} ]_{R} -X} | +\ldots +| {[x_{i_{k}} ]_{R} -X} | \end{array} } \\ =&S({X, R_{\lambda } (X)} ). \end{aligned}$$

Therefore the part (2) is proved.

Note: since \([x_{i_{t}} ]_{R} \not \subset R_{\lambda } (X)\), according to Lemma 2, the inclusion relations \([x_{i_{t}} ^{1} ]_{{R}'} \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2} ]_{{R}'} \subset {R}'_{\lambda } (X)\) cannot hold simultaneously.

Theorem 3

If \(0<\lambda <0.5\), \(1\le t\le k\), that is \([x_{i_{t}} ]_{R} \subset R_{\lambda } (X)\):

(1) If \([x_{i_{t}} ^{1} ]_{{R}'} \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2} ]_{{R}'} \subset {R}'_{\lambda } (X)\), then \(S\left( {X, R_{\lambda } (X)} \right) =S\left( {X, {R}'_{\lambda } (X)} \right) \).

(2) If \([x_{i_{t}} ^{1} ]_{{R}'} \subset {R}'_{\lambda } (X)\), \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\), as well as \(S\left( {X, R_{\lambda } (X)} \right) \ge {\left| {X\cap [x_{i_{t}} ^{2} ]_{{R}'}} \right| }/ {\left| {[x_{i_{t}} ^{2} ]_{{R}'} -X} \right| }\), then \(S\left( {X, R_{\lambda } (X)} \right) \le S\left( {X, {R}'_{\lambda } (X)} \right) \).

Proof

For the part (1), since \([x_{i_{t}} ^{1} ]_{{R}'} \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2} ]_{{R}'} \subset {R}'_{\lambda } (X)\), we can easily get the following equality,

$$\begin{aligned}&S({X, {R}'_{\lambda } (X)} )=\frac{| {X\cap {R}'_{\lambda } (X)} |}{| {X\cup {R}'_{\lambda } (X)} |} \\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}} ]_{{R}'}} |+| {X\cap [x_{i_{2}} ]_{{R}'}} | +\ldots +| {X\cap [x_{i_{k}} ]_{{R}'}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}} ]_{{R}'} -X} |+| {[x_{i_{2}} ]_{{R}'} -X} | +\ldots +| {[x_{i_{k}} ]_{{R}'} -X} | \end{array} }\\ =&\frac{ \begin{array}{l} | {\underline{R}(X)} |+| {X\cap [x_{i_{1}} ]_{R} } |+| {X\cap [x_{i_{2}} ]_{R}} | +\ldots +| {X\cap [x_{i_{k}} ]_{R}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}} ]_{R} -X} |+| {[x_{i_{2}} ]_{R} -X} | +\ldots +| {[x_{i_{k}} ]_{R} -X} | \end{array} } =S({X, R_{\lambda } (X)} ). \end{aligned}$$

Hence the part (1) is proved.

For the part (2), since \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\), we can easily get the following equality,

$$\begin{aligned}&S({X, {R}'_{\lambda } (X)} )=\frac{| {X\cap {R}'_{\lambda } (X)} |}{| {X\cup {R}'_{\lambda } (X)} |} \\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}} ]_{{R}'}} |+| {X\cap [x_{i_{2}} ]_{{R}'}} | +\ldots +| {X\cap [x_{i_{t}} ^{1} ]_{{R}'}} |+\ldots +| {X\cap [x_{i_{k}} ]_{{R}'}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}} ]_{{R}'} -X} |+| {[x_{i_{2}} ]_{{R}'} -X} | +\ldots +| {[x_{i_{t}} ^{1} ]_{{R}'} -X} |+\ldots +| {[x_{i_{k}} ]_{{R}'} -X} | \end{array} } \\ =&\frac{ \begin{array}{l} | {{\underline{R}}'(X)} |+| {X\cap [x_{i_{1}} ]_{{R}'}} |+\ldots +| {X\cap [x_{i_{t}} ]_{{R}'}} | +\ldots +| {X\cap [x_{i_{k}} ]_{{R}'}} |-| {X\cap [x_{i_{t}} ^{2} ]_{{R}'}} | \end{array} }{ \begin{array}{l} |X|+| {[x_{i_{1}} ]_{{R}'} -X} |+\ldots +| {[x_{i_{t}} ]_{{R}'} -X} | +\ldots +| {[x_{i_{k}} ]_{{R}'} -X} |-| {[x_{i_{t}} ^{2} ]_{{R}'} -X} | \end{array} } \\ =&\frac{| {X\cap R_{\lambda } (X)} |-| {X\cap [x_{i_{t}} ^{2} ]_{{R}'}} |}{| {X\cup R_{\lambda } (X)} |-| {[x_{i_{t}} ^{2} ]_{{R}'} -X} |}. \end{aligned}$$

Since \(S({X, R_{\lambda } (X)} )\ge {| {X\cap [x_{i_{t}} ^{2} ]_{{R}'}} |}/{| {[x_{i_{t}} ^{2} ]_{{R}'} -X} |}\), according to Lemma 3 we can get \(S({X, R_{\lambda } (X)} )\) \(\le \) \( S({X, {R}'_{\lambda } (X)} )\). So the part (2) is proved.

Note: since \([x_{i_{t}} ]_{R} \subset R_{\lambda } (X)\), according to Lemma 2, the inclusion relations \([x_{i_{t}} ^{1} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\) and \([x_{i_{t}} ^{2} ]_{{R}'} \not \subset {R}'_{\lambda } (X)\) cannot hold simultaneously either.

Theorems 2 and 3 show that when \(0<\lambda <0.5\) and equivalence classes are subdivided into finer equivalence classes (sub-granules) by \({R}'\), the similarity degree between \({R}'_{\lambda } (X)\) and X is in general not lower than the similarity degree between \(R_{\lambda } (X)\) and X.

6 Conclusions

Since rough set theory was proposed in 1982, it has been developed for more than 30 years. Many scholars have improved the traditional model and obtained many extended rough set models which overcome some of its shortcomings. Combining rough sets with fuzzy set theory, we previously constructed an approximation set of an uncertain set X with cut-sets and proposed a general approximation model \(R_{0.5} (X)\), but the optimal approximation set of X was still not established. In order to solve this problem, in this paper the optimal approximation set is defined through maximizing the similarity between the uncertain concept and its approximation sets. Then, a comparative analysis between \(R_\textit{Best} (X)\) and other approximation sets is given. Next, the operation properties of \(R_\textit{Best} (X)\) are presented and proved. Finally, the change rules of the similarity degree between X and \(R_\textit{Best} (X)\) in different knowledge granularity levels are discussed in detail. This research presents a computational method for establishing or searching for the optimal approximation set of X from the perspective of similarity. We hope these works can expand the range of rough set models for dealing with uncertain problems in the real world and promote the development of uncertainty artificial intelligence.