Keywords

1 Introduction

Attribute reduction in information systems is a fundamental aspect of rough set theory. A reduction is a subset of attributes which reserves the same information for classification purposes as the entire set of attributes. Attribute reduction has been successfully applied in many fields, such as pattern recognition, machine learning and data mining. There are many different types of attribute reductions [1, 8, 11, 12, 19], for example, positive region reduction [14], variable precision reduction [18], distribution reduction [10], partial reduction [7], three-way decision based reduction [9] and so on. Jia et al. [2] gave a brief description of twenty-two kinds of existing reduction approaches. Pawlak [13, 14] was the first to propose the concept of attribute reduction, Skowron and Rauszer [15, 16] proposed discernibility matrix based attribute reduction algorithms for finding all reduction sets in information systems. Recently, Ma and Yao [9] studied class-specific attribute reductions in a decision table from the three-way decision perspective. We [3,4,5,6,7] extended some existing reduction approaches to general relation systems or relation decision systems. For a relation system (UA) and a fixed non-empty subset \(X\subseteq U\), the universal set U is partitioned into the positive, boundary and negative regions via the lower and upper approximations of X. This partition is the theoretical basis of three-way decisions. In fact, we considered the positive and negative region reductions [7] for relation systems. This paper considers the boundary region reduction for a given relation system and gives the corresponding reduction algorithm for finding all reduction sets. We also discuss the relationship among positive, boundary and negative region reductions.

The remainder of the paper is organized as follows. In Sect. 2, we briefly recall some basic concepts and properties of binary relations, rough sets and relation systems. In Sect. 3, we present the definition of boundary reduction for a given relation system and a given subset and give a boundary reduction algorithm. Section 4 discusses the relationship among positive, boundary and negative region reductions. Finally, Sect. 5 concludes the paper.

2 Preliminaries

Relationships between numbers, sets and many other entities can be formalized in the idea of a binary relation. This section reviews briefly some basic notations and notions based on binary relations, rough sets and relation systems.

Let \(U = \{ x_1, x_2,\cdots , x_n\}\) be a finite universal set and P(U) be the power set of U. Suppose that R is an arbitrary binary relation on U. The left and right R-relative sets of an element x in U are defined as

$$l_R(x) =\{y|y\in U,yRx\} \text{ and } r_R(x) =\{y|y\in U,xRy\},$$

respectively. The left and right R-relative sets are a common generalization of equivalence classes. Recall the following terminology: (1) R is reflexive if xRx for each \(x\in U\); (2) R is symmetric if \(l_R(x) = r_R(x)\) for each \(x\in U\); (3) R is transitive if, for each \(x,y,z\in U\), \(y\in r_R(x)\) and \(z\in r_R(y)\) imply \(z\in r_R(x)\); and (4) R is an equivalence relation if R is reflexive, symmetric, and transitive. Based on the right R-relative set, for subset \(X\subseteq U\), the lower and upper approximations [13, 14, 17] of X are defined as

$$\underline{R}(X) =\{x|x\in U, r_R(x)\subseteq X\} \text{ and } \overline{R}(X) =\{x|x\in U, r_R(x)\cap X\ne \emptyset \},$$

respectively.

Definition 2.1

 [5]. Let U be a finite universal set and A be a family of binary relations on U, then (UA) is called a relation system.

If A consists of equivalence relations on U, then (UA) is just a usual information system. Thus a relation system is a generalization of an information system. Let (UA) be a relation system, with respect to a subset \(\emptyset \ne B\subseteq A\), we always associate a relation \(R_B\), which is defined as \(R_B=\cap _{R\in B}R\).

For a given information system, Pawlak [14] defined the concept of positive, negative and borderline regions of \(X\subseteq U\). We extend his definition.

Definition 2.2

Let (UA) be a relation system and \(\emptyset \ne X\subseteq U\), then the positive region \(POS_A(X)\), the boundary region \(BND_A(X)\) and the negative region \(NEG_A(X)\) of X are respectively defined as follows:

$$\begin{aligned} POS_A(X)&=\underline{R_A}(X),\\ BND_A(X)&=\overline{R_A}(X)-\underline{R_A}(X),\\ NEG_A(X)&=U-\overline{R_A}(X). \end{aligned}$$

This paper studies the boundary region reduction for relation systems. The following proposition gives some basic properties of the boundary region \(BND_A(X)\) of X.

Proposition 2.1

Let (UA) be a relation system, \(\emptyset \ne X\subseteq U\) and \(\emptyset \ne B\subseteq A\), then the following conditions are equivalent:

  1. (1)

    \(BND_A(X)=BND_B(X)\).

  2. (2)

    \(\overline{R_A}(X)=\overline{R_B}(X)\) and \(\underline{R_A}(X)=\underline{R_B}(X)\).

  3. (3)

    \((\underline{R_A}(X),\underline{R_A}(X^C))=(\underline{R_B}(X),\underline{R_B}(X^C))\), where \(X^C=U-X\) is the complement of X.

Proof

\((2)\Rightarrow (1)\) is clear. By using the negative property \((\overline{R_A}(X))^C=\underline{R_A}(X^C)\), \((2) \Leftrightarrow (3)\) is also clear.

\((1)\Rightarrow (2)\): Since \(R_A\subseteq R_B\), we have \(\overline{R_A}(X)\subseteq \overline{R_B}(X)\) and \(\underline{R_B}(X)\subseteq \underline{R_A}(X)\). \(BND_B(X)=\overline{R_B}(X)-\underline{R_B}(X)=\overline{R_A}(X)-\underline{R_A}(X)\subseteq \overline{R_B}(X)-\underline{R_A}(X)\subseteq \overline{R_B}(X)-\underline{R_B}(X)\) implies \(\overline{R_A}(X)-\underline{R_A}(X)=\overline{R_B}(X)-\underline{R_A}(X)\), thus \(\overline{R_A}(X)=\overline{R_B}(X)\). Similarly, \(\underline{R_A}(X)=\underline{R_B}(X)\).    \(\Box \)

3 Boundary Region Reductions

Ma and Yao [9] considered a boundary reduction from the three-way decision perspective on special decision classes for a decision table. Now we extend their definition to a given relation system (UA) and a given non-empty subset \(X\subseteq U\). This section studies such a type of reduction, which keeps \(BND_A(X)\) unchanged, we call such a type of reduction a boundary reduction. We first give its definition.

Definition 3.1

Let (UA) be a relation system and a given subset \(\emptyset \ne X\subseteq U\). \(\emptyset \ne B\subseteq A\), B is called an X-boundary reduction of (UA) if B satisfies the following conditions:

  1. (1)

    \(BND_A(X)=BND_B(X)\).

  2. (2)

    For any \(\emptyset \ne B'\subset B\), \(BND_A(X)\ne BND_{B'}(X)\).

By Proposition 2.1, an X-boundary reduction of (UA) keeps both \(\underline{R_A}(X)\) and \(\overline{R_A}(X)\) unchanged. We [7] considered two types of reductions that keep \(\underline{R_A}(X)\) and \(\overline{R_A}(X)\) unchanged, respectively. Now, via the strict mathematical proofs, we give an X-boundary reduction algorithm for a given relation system (UA) and a given non-empty subset \(X\subseteq U\). Suppose that \(U=\{x_1,x_2,\cdots ,x_n\}\), we define the discernibility matrix \(M=(m_{ij})_{n\times n}\) as follows:

$$m_{ij}=\left\{ \begin{array}{cc}\{a|a\in A, (x_i,x_j)\notin a\}, &{} \text{ if } x_i\in \underline{R_A}(X^C) \text{ and } x_j\in X \\ &{} \text{ or } x_i\in \underline{R_A}(X) \text{ and } x_j\notin X\\ \emptyset ,&{} otherwise \end{array}\right. .$$

Where \(X^C\) denotes the complement of X. We need a technical lemma.

Lemma 3.1

Let (UA) be a relation system and \(\emptyset \ne X\subseteq U\), if \(x_i\) and \(x_j\) satisfy one of the following conditions:

  1. (1)

    \(x_i\in \underline{R_A}(X^C), x_j\in X\).

  2. (2)

    \(x_i\in \underline{R_A}(X), x_j\notin X\).

Then \(m_{ij}\ne \emptyset \).

Proof

Suppose that \(x_i\in \underline{R_A}(X^C)\) and \(x_j\in X\), if \(m_{ij}=\emptyset \), then \(x_iR_Ax_j\), so \(x_j\in r_{R_a}(x_i)\subseteq X^C\), that is, \(x_j\notin X\), which contradicts \(x_j\in X\). Similarly, if \(x_i\in \underline{R_A}(X), x_j\notin X\), then \(m_{ij}\ne \emptyset \).    \(\Box \)

Theorem 3.1

Let (UA) be a relation system, \(\emptyset \ne X\subseteq U\), and \(\emptyset \ne B\subseteq C\). Then the following conditions are equivalent:

  1. (1)

    \(BND_A(X)=BND_B(X)\).

  2. (2)

    If \(m_{ij}\ne \emptyset \), then \(B\cap m_{ij}\ne \emptyset \).

Proof

\((1)\Rightarrow (2)\): By Proposition 2.1, we have \(\underline{R_A}(X^C)=\underline{R_B}(X^C)\) and \(\underline{R_A}(X)=\underline{R_B}(X)\). Suppose that \(m_{ij}\ne \emptyset \) and \(B\cap m_{ij}=\emptyset \), then

  1. (i)

    \(x_i\in \underline{R_A}(X^C)\) and \(x_j\in X\) or

  2. (ii)

    \(x_i\in \underline{R_A}(X)\) and \(x_j\notin X\).

\(B\cap m_{ij}=\emptyset \) implies \(x_iR_Bx_j\) and \(x_j\in R_{R_B}\).

If \(x_i\in \underline{R_A}(X^C)\) and \(x_j\in X\), by condition (1), \(x_i\in \underline{R_B}(X^C)\) and \(x_j\in X\), so \(x_j\in r_{R_B}(x_i)\subseteq X^C\), which contradicts \(x_j\in X\).

If \(x_i\in \underline{R_A}(X)\) and \(x_j\notin X\), then \(x_i\in \underline{R_B}(X)\) and \(x_j\notin X\), thus \(x_j\in r_{R_B}(x_i)\subseteq X\), which contradicts \(x_j\notin X\).

\((2)\Rightarrow (1)\): We first show that \(\underline{R_A}(X)=\underline{R_B}(X)\). Note that \(\underline{R_B}(X)\subseteq \underline{R_A}(X)\) is clear. If \(\underline{R_A}(X)\ne \underline{R_B}(X)\), let \(x_i\in \underline{R_A}(X)-\underline{R_B}(X)\), by definition of a lower approximation, we have \(r_{R_A}(x_i)\subseteq X\), and \(r_{R_B}(x_i)\nsubseteq X\). Let \(x_j\in r_{R_B}(x_i)\) and \(x_j\notin X\), by Lemma 3.1, \(m_{ij}\ne \emptyset \), and from condition (2), \(B\cap m_{ij}\ne \emptyset \). Thus \((x_i,x_j)\notin R_B\), which contradicts \(x_j\in r_{R_B}\). This shows that \(\underline{R_A}(X)=\underline{R_B}(X)\). Similarly, we can show that \(\overline{R_A}(X)=\overline{R_B}(X)\).    \(\Box \)

From Theorem 3, we have the following corollary.

Corollary 3.1

Let (UA) be a relation system, \(\emptyset \ne X\subseteq U\), and \(\emptyset \ne B\subseteq C\), then B is an X-boundary reduction of A if and only if it is a minimal subset satisfying \(m_{ij}\cap B\ne \emptyset \) for any \(m_{ij}\ne \emptyset \).

According to Corollary 3.1, we propose an X-boundary reduction algorithm for a given relation system (UA) and a given subset \(\emptyset \ne X\subseteq U\) as follows.

figure a

We illustrate the algorithm introduced previously with a simple example.

Example 3.1

Let (UA) be a relation system, where \(U=\{1,2,3,4,5\}\), \(A=\{R_1,R_2,R_3,R_4,R_5\}\) and \(X=\{1,3,5\}\). Each \(R_i(i=1,2,\cdots ,5)\) is given by its Boolean matrix \(M_{R_i}\).

$$M_{R_1}=\left( \begin{array}{ccccc} 0&{}1&{}1&{}1&{}0 \\ 0&{}1&{}1&{}0&{}1 \\ 1&{}0&{}0&{}0&{}0 \\ 1&{}1&{}0&{}1&{}1 \\ 0&{}1&{}0&{}1&{}0 \end{array}\right) , M_{R_2}=\left( \begin{array}{ccccc} 1&{}1&{}1&{}1&{}0 \\ 0&{}1&{}1&{}0&{}1 \\ 1&{}1&{}0&{}1&{}0 \\ 1&{}1&{}0&{}1&{}0 \\ 0&{}1&{}1&{}1&{}0 \end{array}\right) , M_{R_3}=\left( \begin{array}{ccccc} 1&{}1&{}0&{}1&{}0 \\ 0&{}0&{}1&{}0&{}1 \\ 1&{}0&{}1&{}0&{}0 \\ 1&{}0&{}0&{}1&{}1 \\ 0&{}1&{}1&{}1&{}0 \end{array}\right) ,$$
$$M_{R_4}=\left( \begin{array}{ccccc} 0&{}1&{}0&{}1&{}0 \\ 1&{}0&{}1&{}1&{}1 \\ 1&{}0&{}1&{}0&{}1 \\ 1&{}0&{}0&{}1&{}0 \\ 1&{}1&{}1&{}1&{}0 \end{array}\right) , \text{ and } M_{R_5}=\left( \begin{array}{ccccc} 1&{}1&{}1&{}1&{}0 \\ 0&{}1&{}1&{}1&{}1 \\ 1&{}1&{}0&{}1&{}1 \\ 1&{}0&{}0&{}1&{}1 \\ 0&{}1&{}0&{}1&{}1 \end{array}\right) . \text{ Clearly, } M_{R_A}=\left( \begin{array}{ccccc} 0&{}1&{}0&{}1&{}0 \\ 0&{}0&{}1&{}0&{}1 \\ 1&{}0&{}0&{}0&{}0 \\ 1&{}0&{}0&{}1&{}0 \\ 0&{}1&{}0&{}1&{}0 \end{array}\right) .$$

By direct computation, \(\overline{R_A}(X)=\{2,3,4\}\), \(\underline{R_A}(X)=\{2,3\}\) and \(BND_A(X)=\overline{R_A}(X)-\underline{R_A}(X)=\{4\}\). The following Table 1 gives the discernibility matrix of the boundary region reduction. Since \(1\in \underline{R_A}(X^C)\) and \(1\in X\), it follows that both \(R_1\) and \(R_4\) are in the entry (1, 1) of Table 1, because \((1,1)\notin R_1\) and \((1,1)\notin R_4\). The discernibility function

$$\begin{aligned} f&=(R_1+R_3)(R_1+R_4)(R_1+R_5)(R_3+R_4)\\&=(R_1+R_3R_4R_5)(R_3+R_4)\\&=R_1R_3+R_1R_4+R_3R_4R_5. \end{aligned}$$

Thus all boundary region reduction sets are \(\{R_1,R_3\}\), \(\{R_1,R_4\}\), and \(\{R_3,R_4,R_5\}\).

4 The Relationship Among Positive, Boundary and Negative Region Reductions

This section will illustrate the relationship among positive, boundary and negative region reductions. Let (UA) be a relation system and \(X\subseteq U\), recall that an X-positive region reduction keeps \(\underline{R_C}(X)\) unchanged. Its formal definition is as follows.

Table 1. The discernibility matrix of the reduction

Definition 4.1

Let (UA) be a relation system and a given subset \(\emptyset \ne X\subseteq U\). \(\emptyset \ne B\subseteq A\), set B is called an X-positive reduction of (UA) if B satisfies the following conditions:

  1. (1)

    \(POS_A(X)=POS_B(X)\).

  2. (2)

    For any \(\emptyset \ne B'\subset B\), \(POS_A(X)\ne POS_{B'}(X)\).

Similarly, an X-negative region reduction keeps \(U-\overline{R_C}(X)=\underline{R_C}(X^C)\) unchanged, however, we omit its formal definition. The discernibility matrices \(M=(m_{ij})_{s\times (n-t)}\) and \(N=(n_{ij})_{u\times t}\) of an X-positive region and X-negative region reduction are given as follows:

$$\begin{aligned} m_{ij}&=\left\{ \begin{array}{cc}\{a|a\in A, (x_i,x_j)\notin a\}, &{} x_i\in \underline{R_A}(X), x_j\notin X\\ \emptyset ,&{} otherwise \end{array}\right. , \text{ and } \\ n_{ij}&=\left\{ \begin{array}{cc}\{a|a\in A, (x_i,x_j)\notin a\}, &{} x_i\in \underline{R_A}(X^C), x_j\in X \\ \emptyset ,&{} otherwise \end{array}\right. , \end{aligned}$$

respectively. Where \(s=|\underline{R_A}(X)|\) denotes the cardinality of \(\underline{R_A}(X)\), \(t=|X|\) and \(u=|\underline{R_C}(X^C)|\).

Using the matrices M and N, we can calculate all positive and negative region reduction sets, respectively. Moreover, we can also derive the boundary region reduction from the positive and negative region reductions. This provides another boundary region reduction algorithm. We use the example below to show the detailed method.

Example 4.1

Let (UA) and \(X\subseteq U\) be as in Example 3.1, the discernibility matrices \(M=(m_{ij})_{s\times (n-t)}\) of the X-positive region reduction and \(N=(n_{ij})_{u\times t}\) of the X-negative region reduction are shown in Tables 2 and 3:

Table 2. The discernibility matrix of an X-positive reduction
Table 3. The discernibility matrix of an X-negative reduction

Since the discernibility function of the X-positive region reduction \(f_1=R_3+R_1R_4\), so that all the X-positive region reduction sets are \(\{R_3\}\) and \(\{R_1,R_4\}\), similarly, the discernibility function of the X-negative region reduction \(f_2=R_1R_3+R_1R_4+R_4R_5\), so that all the X-negative region reduction sets are \(\{R_1,R_3\}\), \(\{R_1,R_4\}\) and \(\{R_4,R_5\}\). The discernibility function of the X-boundary region reduction is

$$\begin{aligned} f&=f_1f_2=(R_3+R_1R_4)(R_1R_3+R_1R_4+R_4R_5) \\&=R_1R_3+R_1R_4+R_3R_4R_5. \end{aligned}$$

Thus all boundary region reduction sets are \(\{R_1,R_3\}\), \(\{R_1,R_4\}\), and \(\{R_3,R_4,R_5\}\).

Remark 1

Let BC and D be respectively X-positive, boundary and negative region reductions of a relation system (UA), then

  1. (1)

    \(B\cap C\) keeps the negative region unchanged,

  2. (2)

    \(C\cap D\) keeps the positive region unchanged, and

  3. (3)

    \(B\cap D\) keeps the boundary region unchanged.

5 Conclusions

The boundary region consists of hesitation objects. In other words, for these objects, we can neither accept nor reject and, hence, make a non-commitment decision. Naturally, it is an interesting problem to consider the reduction that keeps the boundary region unchanged. Thus we propose the concept of the boundary region reduction for relation systems and obtain a corresponding reduction algorithm for finding all reduction sets. We have also established a relationship among the positive, boundary and negative region reductions. We have provided a way to derive the boundary region reduction sets from the positive and negative region reduction sets. The future work is to apply the reduction model given in this paper to discover knowledge in real life data sets.