Keywords

1 Introduction

Incomplete information systems consist of objects whose attribute values are described by a set of values. When a set of values is obtained as an attribute value, one element of the set is the actual one, but we cannot know it without additional information. Such a situation frequently appears in our daily life. For example, when we obtain the incomplete information “John’s age is round 20,” “round 20” is expressed by \(\{19,20,21\}\). The actual age is in \(\{19,20,21\}\), but it is unknown which is actual.

The framework of rough sets, constructed by Pawlak [10], is used as an effective tool for data science including various fields such as data analysis, pattern recognition, machine learning, data mining, and so on. The rough sets are based on indiscernibility of objects whose characteristic values are indistinguishable. The fundamental framework is given by rough approximations that consist of lower and upper approximations. The original rough approximations are usually derived from interrelationships, inclusion and intersection, between equivalence classes. The equivalence classes are obtained from indiscernibility relations in an information table containing only complete information.

Some extensions are imposed on the original rough approximations to deal with incomplete information. Kryszkiewicz constructed a discernibility relation by giving indiscernibility of a missing value with any value under an assumption [2]. Some authors propose indiscernibility relations under different assumptions from Kryszkiewicz [1, 3, 11]. This approach creates poor results of rough approximations [6, 11], because it considers only the possibility that a missing value may be equal to another value. In addition, the approach does not give the same rough approximations as the method based on Lipski’s one under possible world semantics [6].

A missing value has two possibilities. One possibility is that it may be equal to another value. The other is that it may not be equal to the value. It is unknown which possibility is true without additional information. From this standpoint, Nakata and Sakai have developed an approach based on possible equivalence classes [7]. The number of possible equivalence classes exponentially increases, as the number of missing values does. They avoid the computational complexity by using minimum and maximum possible equivalence classes. Their approach gives the same rough approximations as the work based on Lipski. However, the approach is limited in the case of obtaining possible equivalence classes.

To remove the limitation, we show an approach directly using indiscernibility relations, but not equivalence classes obtained from the indiscernibility relations. The approach is applicable to various types of information. Nakata and Sakai develop the approach for possibilistic information and give successful results [8, 9]. In this paper, we apply the approach to incomplete information expressed by a set of values. We formulate rough approximations and rule induction from the viewpoint of both certainty and possibility, as Lipski did in incomplete databases, to deal with incomplete information that includes present but unknown type of missing values as special cases.

The paper is organized as follows. In Sect. 2, an approach based on indiscernibility relations is briefly addressed in the case of information tables with complete information, called complete information systems. In Sect. 3, we develop the approach in the case of information tables with incomplete information, called incomplete information systems. The approach is described from the viewpoint of both certainty and possibility. In Sect. 4, conclusions are addressed.

2 Rough Sets by Indiscernibility Relations in Complete Information Systems

A data set is represented as a table, called an information table, where each row and each column represent an object and an attribute, respectively. A mathematical model of an information table with complete information is called a complete information system. The complete information system is a triplet expressed by \((U,AT,\{D(a_{i}) \mid a_{i} \in AT \})\). U is a non-empty finite set of objects called the universe, AT is a non-empty finite set of attributes such that \(a_{i} : U \rightarrow D(a_{i})\) for every \(a_{i} \in AT\) where \(D(a_{i})\) is the domain of attribute \(a_{i}\). Binary relation \(R_{a_{i}}\) for indiscernibility of objects on attribute \(a_{i} \in AT\), which is called the indiscernibility relation for \(a_{i}\), is:

$$\begin{aligned} R_{a_{i}} = \{(o,o') \in U \times U \mid a_{i}(o) = a_{i}(o') \}, \end{aligned}$$
(1)

where \(a_{i}(o)\) is the value for attribute \(a_{i}\) of object o. From the indiscernibility relation, indiscernible class \([o]_{a_{i}}\) for object o is obtained:

$$\begin{aligned}{}[o]_{a_{i}} = \{o' \mid (o,o') \in R_{a_{i}} \}. \end{aligned}$$
(2)

The condition \(a_{i}(o) = a_{i}(o')\) in formula (1) makes \([o]_{a_{i}}\) an equivalence class. The condition can be replaced by another condition. For example, \(a_{i}(o) \) and \(a_{i}(o')\) are similar. In this case, \([o]_{a_{i}}\) is not always an equivalence class. Family \(\mathcal{E}_{a_{i}}\) of indiscernible classes on \(a_{i}\) is:

$$\begin{aligned} \mathcal{E}_{a_{i}} = \{[o]_{a_{i}} \mid o \in U \}. \end{aligned}$$
(3)

When \([o]_{a_{i}}\) is an equivalence class, U is uniquely partitioned by \(a_{i}\); namely, this is the classification induced by \(a_{i}\).

Using indiscernibility relation \(R_{a_{i}}\), lower approximation \(\underline{apr}_{a_{i}}(\mathcal{O})\) and upper approximation \(\overline{apr}_{a_{i}}(\mathcal{O})\) for \(a_{i}\) of set \(\mathcal{O}\) of objects are:

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in R_{a_{i}} \vee o' \in \mathcal{O}\}, \end{aligned}$$
(4)
$$\begin{aligned} \overline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in R_{a_{i}} \wedge o' \in \mathcal{O}\}. \end{aligned}$$
(5)

When we focus on object o, o is an element of the lower approximation of \(\mathcal{O}\), if all objects that are indiscernible with o are included in \(\mathcal{O}\). On the other hand, the object is an element of the upper approximation of \(\mathcal{O}\), if some objects that are indiscernible with o are in \(\mathcal{O}\). Thus, if \(o \in \underline{apr}_{a_{i}}(\mathcal{O})\), then \(o \in \overline{apr}_{a_{i}}(\mathcal{O})\); namely, \(\underline{apr}_{a_{i}}(\mathcal{O}) \subseteq \overline{apr}_{a_{i}}(\mathcal{O})\). It is well known that the lower and upper approximations are linked with each other, which is called complementarity property:

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O}) = U - \overline{apr}_{a_{i}}(U - \mathcal{O}). \end{aligned}$$
(6)

From this formula, \(o \in \underline{apr}_{a_{i}}(\mathcal{O})\) if and only if \(o \not \in \overline{apr}_{a_{i}}(U - \mathcal{O})\) and \(o \in \overline{apr}_{a_{i}}(\mathcal{O})\) if and only if \(o \not \in \underline{apr}_{a_{i}}(U - \mathcal{O})\).

When objects are characterized by values of attributes, a set of objects being approximated have some structures. In the case where \(a_{i}(o) = a_{i}(o')\) is used in formula (1), the set of objects is partitioned by equivalence classes obtained from the values of attribute \(a_{i}\) being equal. Under this consideration, lower approximation \(\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) and upper approximation \(\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) for \(a_{i}\) are:

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) = \{ o \mid \exists o'' \in \mathcal{O} \ \forall o' \in U \ (o, o') \not \in R_{a_{i}} \vee (o',o'') \in R_{a_{j}} \wedge o' \in \mathcal{O}\}, \end{aligned}$$
(7)
$$\begin{aligned} \overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) = \{ o \mid \exists o'' \in \mathcal{O} \ \exists o' \in U \ (o, o') \in R_{a_{i}} \wedge (o',o'') \in R_{a_{j}} \wedge o' \in \mathcal{O}\}. \end{aligned}$$
(8)

In the two approximations, if \(o \in \underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), then \(o \in \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\); namely inclusion relation \(\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) holds. On the other hand, the complementarity property does not hold. If \(o \in \underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), then \(o \not \in \overline{apr}_{a_{i}}((U - \mathcal{O})/a_{j})\) and if \(o \in \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), then \(o \not \in \underline{apr}_{a_{i}}((U - \mathcal{O})/a_{j})\).

We induce rules that hold between attributes from lower and upper approximations. From the lower approximation, when \(o \in \underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), \(\exists E_{a_{j}=v} \in \mathcal{E}_{a_{j}}\ [o]_{a_{i}=u} \subseteq E_{a_{j}=v}\), where \(E_{a_{j}=v}\) is the indiscernible class characterized by value v of \(a_{j}\) and \([o]_{a_{i}=u}\) is the indiscernible class including o characterized by value u for \(a_{i}\):

$$\begin{aligned} E_{a_{j}=v}= & {} \{o \mid a_{j}(o)=v \wedge v \in D(a_{j}) \}, \\ ~[o]_{a_{i}=u}= & {} \{o' \mid a_{i}(o') = a_{i}(o) \wedge a_{i}(o) = u \wedge u \in D(a_{i})\}. \end{aligned}$$

All objects in \([o]_{a_{i}=u}\) supports the rule denoted by \(a_{i} = u \rightarrow a_{j} = v\) where o has u and v of \(a_{i}\) and \(a_{j}\); namely , \(a_{i}(o)=u\) and \(a_{j}(o)=v\), respectively. Thus, o consistently supports \(a_{i} = u \rightarrow a_{j} = v\). This is denoted by \((o, a_{i}=u \rightarrow a_{j}=v)\). From the upper approximation, when \(o \in \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), \(\exists E_{a_{j}=v} \in \mathcal{E}_{a_{j}} \ [o]_{a_{i}=u} \cap E_{a_{j}=v} \ne \emptyset \) if o has u of \(a_{i}\). From \(\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), \(o \in (\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) - \underline{apr}_{a_{i}}(\mathcal{O}/a_{j}))\) inconsistently supports a rule denoted by \(a_{i} = u \rightarrow a_{j} = v\) where o has u of \(a_{i}\), but o does not always have v of \(a_{j}\), although this is also expressed by \((o, a_{i}=u \rightarrow a_{j}=v)\). All objects included in \([o]_{a_{i}=u}\) do not support \(a_{i}=u \rightarrow a_{j}=v\). The consistency degree, called accuracy, is evaluated by \(|[o]_{a_{i}=u} \cap E_{a_{j}=v}|/|[o]_{a_{i}=u}|\). Clearly, this degree is equal to 1, if \(o \in \underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\).

For formulae on sets A and B of attributes,

$$\begin{aligned}&R_{A} = \cap _{a_{i} \in A}R_{a_{i}}, \end{aligned}$$
(9)
$$\begin{aligned}&\,[o]_{A} = \{o' \mid (o,o') \in R_{A} \} = \cap _{a_{i} \in A}[o]_{a_{i}}, \end{aligned}$$
(10)
$$\begin{aligned}&\underline{apr}_{A}(\mathcal{O}) = \{o \mid \forall o' \in U \ (o, o') \not \in R_{A} \vee o' \in \mathcal{O}\}, \end{aligned}$$
(11)
$$\begin{aligned}&\overline{apr}_{A}(\mathcal{O}) = \{o \mid \exists o' \in U \ (o, o') \in R_{A} \wedge o' \in \mathcal{O}\}, \end{aligned}$$
(12)
$$\begin{aligned}&\underline{apr}_{A}(\mathcal{O}/B) =\nonumber \\&\qquad \quad ~~ \{ o \mid \exists o'' \in \mathcal{O} \ \forall o' \in U \ (o, o') \not \in R_{A} \vee (o',o'') \in R_{B} \wedge o' \in \mathcal{O}\}, \end{aligned}$$
(13)
$$\begin{aligned}&\overline{apr}_{A}(\mathcal{O}/B) = \nonumber \\&\qquad \quad ~~ \{ o \mid \exists o'' \in \mathcal{O} \ \exists o' \in U \ (o, o') \in R_{A} \wedge (o',o'') \in R_{B} \wedge o' \in \mathcal{O}\}. \end{aligned}$$
(14)

3 Rough Sets by Indiscernibility Relations in Incomplete Information Systems

In incomplete information systems, \(a_{i} : U \rightarrow s_{a_{i}}\) for every \(a_{i} \in AT\) where \(s_{a_{i}}\) is the set of all subsets over domain \(D(a_{i})\) of attribute \(a_{i}\). \(v \in a_{i}(o)\) is a possible value that may be the actual one as the value of attribute \(a_{i}\) in object o. The possible value is the actual one if \(|a_{i}(o)| = 1\).

The indiscernibility relation for \(a_{i}\) in an incomplete information system is expressed by using two relations \(CR_{a_{i}}\) and \(PR_{a_{i}}\). \(CR_{a_{i}}\) is a certain indiscernibility relation and \(PR_{a_{i}}\) is a possible one:

$$\begin{aligned} CR_{a_{i}}= & {} \{(o,o') \mid o = o' \vee a_{i}(o) = a_{i}(o') \ \text{ with } \ |a_{i}(o)| = |a_{i}(o')| = 1\}, \end{aligned}$$
(15)
$$\begin{aligned} PR_{a_{i}}= & {} \{(o,o') \mid o = o' \vee u = v \wedge u \in a_{i}(o) \wedge v \in a_{i}(o') \}. \end{aligned}$$
(16)

The certain indiscernibility relation is reflexive, symmetric, and transitive, but the possible one is not transitive although it is reflexive and symmetric. We have three patterns. One case is that a pair of objects are not in both certain and possible indiscernibility relations, which means that they are discernible. Another is that they are not in the certain indiscernibility relation, but in the possible one, which means that they are discernible and indiscernible. The other is that they are in both certain and possible indiscernibility relations, which means that they are indiscernible.

Example 1

Let information table T be obtained as follows:

figure a

In information table T, \(U = \{o_{1},o_{2},o_{3},o_{4},o_{5}, o_{6}\}\), where domains \(D(a_{1})\) and \(D(a_{2})\) of attributes \(a_{1}\) and \(a_{2}\) are \(\{w,x,y,z\}\) and \(\{a,b,c\}\), respectively. Using formulae (15) and (16), certain and possible indiscernibility relations for \(a_{1}\) in T are:

$$\begin{aligned} CR_{a_{i}}= & {} \{(o_{1},o_{1}), (o_{2},o_{2}), (o_{3},o_{3}), (o_{3},o_{4}), (o_{4},o_{3}), (o_{4},o_{4}), (o_{5},o_{5}), (o_{6},o_{6})\}, \\ PR_{a_{i}}= & {} \{(o_{1},o_{1}), (o_{1},o_{2}), (o_{2},o_{1}), (o_{2},o_{2}), (o_{2},o_{3}), (o_{2},o_{4}), (o_{3},o_{2}), (o_{3},o_{3}), \\&(o_{3},o_{4}), (o_{4},o_{2}), (o_{4},o_{3}), (o_{4},o_{4}), (o_{5},o_{5}), (o_{5},o_{6}), (o_{6},o_{5}), (o_{6},o_{6})\}. \end{aligned}$$

Lipski showed that certain and possible answers, not the actual answer, are obtained in query processing under incomplete information [4, 5]. This is true for rough approximations. We cannot definitely obtain whether or not an object belongs to rough approximations, but we can know whether or not the object certainly or possibly belongs to rough approximations. Therefore, we show certain rough approximations (resp. possible rough approximations) whose object certainly (resp. possibly) belongs to the actual rough approximations.

Let \(\mathcal{O}\) be a set of objects. Certain lower approximation \(C\underline{apr}_{a_{i}}(\mathcal{O})\) and possible one \(P\underline{apr}_{a_{i}}(\mathcal{O})\) are:

$$\begin{aligned} C\underline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in PR_{a_{i}} \vee o' \in \mathcal{O}\}, \end{aligned}$$
(17)
$$\begin{aligned} P\underline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in CR_{a_{i}} \vee o' \in \mathcal{O}\}. \end{aligned}$$
(18)

Proposition 1

\(C\underline{apr}_{a_{i}}(\mathcal{O}) \subseteq P\underline{apr}_{a_{i}}(\mathcal{O})\).

Similarly, Certain upper approximation \(C\overline{apr}_{a_{i}}(\mathcal{O})\) and possible one \(P\overline{apr}_{a_{i}}(\mathcal{O})\) are:

$$\begin{aligned} C\overline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in CR_{a_{i}} \wedge o' \in \mathcal{O}\}, \end{aligned}$$
(19)
$$\begin{aligned} P\overline{apr}_{a_{i}}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in PR_{a_{i}} \wedge o' \in \mathcal{O}\}. \end{aligned}$$
(20)

Proposition 2

\(C\overline{apr}_{a_{i}}(\mathcal{O}) \subseteq P\overline{apr}_{a_{i}}(\mathcal{O})\).

Proposition 3

\(C\underline{apr}_{a_{i}}(\mathcal{O}) \subseteq C\overline{apr}_{a_{i}}(\mathcal{O})\) and \(P\underline{apr}_{a_{i}}(\mathcal{O})\) \(\subseteq \) \(P\overline{apr}_{a_{i}}(\mathcal{O})\).

Proposition 4

\(C\underline{apr}_{a_{i}}(\mathcal{O})\) \(\subseteq \) \(P\underline{apr}_{a_{i}}(\mathcal{O})\) \(\subseteq \) \(\mathcal{O}\) \(\subseteq \) \(C\overline{apr}_{a_{i}}(\mathcal{O})\) \(\subseteq \) \(P\overline{apr}_{a_{i}}(\mathcal{O})\).

Four approximations are linked with each other.

Proposition 5

\(P\underline{apr}_{a_{i}}(\mathcal{O}) = U - C\overline{apr}_{a_{i}}(U - \mathcal{O})\) and \(C\underline{apr}_{a_{i}}(\mathcal{O}) = U - P\overline{apr}_{a_{i}}(U - \mathcal{O})\).

Using four approximations denoted by formulae (17)–(20), lower and upper approximations are expressed by interval sets as follows:

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O})= & {} [C\underline{apr}_{a_{i}}(\mathcal{O}), P\underline{apr}_{a_{i}}(\mathcal{O}) ], \end{aligned}$$
(21)
$$\begin{aligned} \overline{apr}_{a_{i}}(\mathcal{O})= & {} [C\overline{apr}_{a_{i}}(\mathcal{O}), P\overline{apr}_{a_{i}}(\mathcal{O}) ]. \end{aligned}$$
(22)

Certain and possible approximations are the lower and upper bounds of the actual approximation. The lower and upper approximations depend on each other; namely, the complementarity property linked with them holds, as is so in complete information systems.

Proposition 6

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O}) = U - \overline{apr}_{a_{i}}(U - \mathcal{O}). \end{aligned}$$

Example 2

Let us go back to Example 1. Let set \(\mathcal{O}\) of objects be \(\{o_{2},o_{3},o_{4}\}\). Using formulae (17)–(20),

$$\begin{aligned} C\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{3}, o_{4}\}, \\ P\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{2}, o_{3}, o_{4}\}, \\ C\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{2}, o_{3}, o_{4}\}, \\ P\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{1}, o_{2}, o_{3}, o_{4}\}. \end{aligned}$$

Thus, using formulae (21)–(22),

$$\begin{aligned} \underline{apr}_{a_{1}}(\mathcal{O})= & {} [\{o_{3}, o_{4}\}, \{o_{2}, o_{3}, o_{4}\}], \\ \overline{apr}_{a_{1}}(\mathcal{O})= & {} [\{o_{2}, o_{3}, o_{4}\}, \{o_{1}, o_{2}, o_{3}, o_{4}\}]. \end{aligned}$$

Subsequently, we describe the case where a set of objects characterized by incomplete information is approximated by objects with complete information. Let objects in U have complete information for \(a_{i}\) and \(\mathcal{O}\) be characterized by \(a_{j}\) with incomplete information. Four approximations are:

$$\begin{aligned} C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} \{o \mid \exists o'' \in \mathcal{O} \ \forall o' \in [o]_{a_{i}} (o',o'') \in CR_{a_{j}} \wedge o' \in \mathcal{O} \}, \end{aligned}$$
(23)
$$\begin{aligned} P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} \{o \mid \exists o'' \in \mathcal{O} \ \forall o' \in [o]_{a_{i}} (o',o'') \in PR_{a_{j}} \wedge o' \in \mathcal{O} \}, \end{aligned}$$
(24)
$$\begin{aligned} C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} \{o \mid \exists o'' \in \mathcal{O} \ \exists o' \in [o]_{a_{i}} (o',o'') \in CR_{a_{j}} \wedge o' \in \mathcal{O} \}, \end{aligned}$$
(25)
$$\begin{aligned} P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} \{o \mid \exists o'' \in \mathcal{O} \ \exists o' \in [o]_{a_{i}} (o',o'') \in PR_{a_{j}} \wedge o' \in \mathcal{O} \}. \end{aligned}$$
(26)

Combining the above two cases, we can obtain four approximations in the case where both objects used to approximate and objects approximated are characterized by attributes with incomplete information. Certain lower approximation \(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) and possible one \(P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) are:

$$\begin{aligned}&C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \nonumber \\&\qquad = \{o \mid \exists o'' \in \mathcal{O} \ \forall o' \in U \ (o, o') \not \in PR_{a_{i}} \vee (o',o'') \in CR_{a_{j}} \wedge o' \in \mathcal{O} \}, \end{aligned}$$
(27)
$$\begin{aligned}&P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \nonumber \\&\qquad = \{o \mid \exists o'' \in \mathcal{O} \ \forall o' \in U \ (o, o') \not \in CR_{a_{i}} \vee (o',o'') \in PR_{a_{j}} \wedge o' \in \mathcal{O} \}. \end{aligned}$$
(28)

Proposition 7

\(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\).

Similarly, certain upper approximation \(C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) and possible one \(P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) are:

$$\begin{aligned}&C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \nonumber \\&\qquad = \{o \mid \exists o'' \in \mathcal{O} \ \exists o' \in U \ (o, o') \in CR_{a_{i}} \wedge (o',o'') \in CR_{a_{j}} \wedge o' \in \mathcal{O} \}, \end{aligned}$$
(29)
$$\begin{aligned}&P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \nonumber \\&\qquad = \{o \mid \exists o'' \in \mathcal{O} \ \exists o' \in U \ (o, o') \in PR_{a_{i}} \wedge (o',o'') \in PR_{a_{j}} \wedge o' \in \mathcal{O} \}. \end{aligned}$$
(30)

Proposition 8

\(C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\).

Proposition 9

\(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) and \(P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\).

Proposition 10

\(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) \(\subseteq \) \(P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) \(\subseteq \) \(\mathcal{O}\) \(\subseteq \) \(C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) \(\subseteq \) \(P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\).

Lower and upper approximations are:

$$\begin{aligned} \underline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} [C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}), P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j}) ], \end{aligned}$$
(31)
$$\begin{aligned} \overline{apr}_{a_{i}}(\mathcal{O}/a_{j})= & {} [C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}), P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j}) ]. \end{aligned}$$
(32)

Example 3

Let us go back to information table T in Example 1. Let \(\mathcal{O}\) be \(\{o_{2},o_{3},o_{4}\}\) that is characterized by values of attribute \(a_{2}\). Using formulae (27)–(32),

$$\begin{aligned} \underline{apr}_{a_{1}}(\mathcal{O}/a_{2})= & {} [\{ \emptyset \}, \{o_{2}, o_{3}, o_{4} \}], \\ \overline{apr}_{a_{1}}(\mathcal{O}/a_{2})= & {} [\{o_{2}, o_{3}, o_{4} \}, \{o_{1}, o_{2}, o_{3}, o_{4} \}]. \\ \end{aligned}$$

An object that belongs to certain rough approximations does not certainly support a rule. For example, \(C\underline{apr}_{a_{1}}(U/a_{2}) = \{o_{5},o_{6}\}\) in T of Example 1. \(o_{5}\) certainly supports rule \(a_{1}=w \rightarrow a_{2}=c\), but \(o_{6}\) does not certainly supports rule \(a_{1}=w \rightarrow a_{2}=c\), because \(a_{1}=\{w,z\}\). To clarify how an object supports a rule, we derive certain and possible indiscernibility relations \(CR_{a_{i} = u}\) and \(PR_{a_{i} = u}\) where all pairs of objects are characterized by value u of attribute \(a_{i}\).

$$\begin{aligned} CR_{a_{i} = u}= & {} \{(o,o') \mid a_{i}(o) = a_{i}(o') = u \}, \end{aligned}$$
(33)
$$\begin{aligned} PR_{a_{i} = u}= & {} \{(o,o') \mid u \in a_{i}(o) \wedge u \in a_{i}(o') \}. \end{aligned}$$
(34)

Example 4

In T of Example 1, using formulae (33) and (34),

$$\begin{aligned} CR_{a_{1} = x}= & {} \{(o_{1},o_{1})\}, \\ PR_{a_{1} = x}= & {} \{(o_{1},o_{1}), (o_{1},o_{2}), (o_{2},o_{1}), (o_{2},o_{2})\}, \\ CR_{a_{1} = y}= & {} \{(o_{3},o_{3}), (o_{3},o_{4}), (o_{4},o_{3}), (o_{4},o_{4}),\}, \\ PR_{a_{1} = y}= & {} \{(o_{2},o_{2}), (o_{2},o_{3}), (o_{2},o_{4}), (o_{3},o_{2}), (o_{3},o_{3}), (o_{3},o_{4}), (o_{4},o_{2}), (o_{4},o_{3}), \\&(o_{4},o_{4}),\}, \\ CR_{a_{1} = z}= & {} \emptyset , \\ PR_{a_{1} = z}= & {} \{(o_{6},o_{6}) \}, \\ CR_{a_{1} = w}= & {} \{(o_{5},o_{5})\}, \\ PR_{a_{1} = w}= & {} \{(o_{5},o_{5}), (o_{5},o_{6}), (o_{6},o_{5}), (o_{6},o_{6})\}. \end{aligned}$$

Using indiscernibility relations \(CR_{a_{i} = u}\) and \(PR_{a_{i} = u}\) that are characterized by a value of an attribute, we obtain four sets of pairs of an object and a rule that it supports: certain lower, possible lower, certain upper, and possible upper sets, which correspond to the above four approximations.

Certain lower set \(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j})\), which corresponds to \(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), is:

(35)

Possible lower set \(P\underline{r}_{a_{i}}(\mathcal{O}/a_{j})\), which corresponds to \(P\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), is:

(36)

Proposition 11

\(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \).

This proposition shows that the possible lower set includes the certain lower set.

Certain upper set \(C\overline{r}_{a_{i}}(\mathcal{O}/a_{j})\), which corresponds to \(C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), is:

(37)

Possible upper set \(P\overline{r}_{a_{i}}(\mathcal{O}/a_{j})\), which corresponds to \(P\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), is:

(38)

Proposition 12

\(C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \).

This proposition shows that the possible upper set includes the certain upper set.

Proposition 13

\(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \) and \(P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \).

This proposition shows that the certain upper set includes the certain lower set and the possible upper set includes the possible lower set.

Using the above four sets, sets \(\underline{r}_{a_{i}}(\mathcal{O}/a_{j})\) and \(\overline{r}_{a_{i}}(\mathcal{O}/a_{j})\), which correspond to \(C\underline{apr}_{a_{i}}(\mathcal{O}/a_{j})\) and \(C\overline{apr}_{a_{i}}(\mathcal{O}/a_{j})\), are also expressed by interval sets:

$$\begin{aligned} \underline{r}_{a_{i}}(\mathcal{O}/a_{j})= & {} [C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}), P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) ], \end{aligned}$$
(39)
$$\begin{aligned} \overline{r}_{a_{i}}(\mathcal{O}/a_{j})= & {} [C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}), P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) ]. \end{aligned}$$
(40)

Pairs of an object and a rule are classified into four cases: certain and consistent, certain and inconsistent, possible and consistent, and possible inconsistent pairs. Objects that appear in certain set \(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j})\) certainly support rules with consistency. From Proposition 13, \(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \). So, objects that appear in \((C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) - C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}))\) certainly support rules with inconsistency. From Proposition 11, \(C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\underline{r}_{a_{i}}(\mathcal{O}/a_{j})\). So, objects that appear in \((P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) - C\underline{r}_{a_{i}}(\mathcal{O}/a_{j}))\) possibly support rules with consistency. From Propositions 12 and 13, \(C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \) and \(P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) \subseteq P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) \). So, objects that appear in \((P\overline{r}_{a_{i}}(\mathcal{O}/a_{j}) - P\underline{r}_{a_{i}}(\mathcal{O}/a_{j}) - C\overline{r}_{a_{i}}(\mathcal{O}/a_{j}))\) possibly support rules with inconsistency.

Example 5

Let us go back to Example 1. Using formulae (35)–(38) and then gathering objects supporting the same rule in a set,

$$\begin{aligned} C\underline{r}_{a_{1}}(U/a_{2})= & {} \{(\{ o_{5} \}, a_{1} = w \rightarrow a_{2} =c)\}, \\ C\overline{r}_{a_{1}}(U/a_{2})= & {} \{(\{ o_{3},o_{4} \}, a_{1} = y \rightarrow a_{2} =b), (\{ o_{5} \}, a_{1} = w \rightarrow a_{2} =c)\},\\ P\underline{r}_{a_{1}}(U/a_{2})= & {} \{ (\{ o_{1} \}, a_{1} = x \rightarrow a_{2} =c), (\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =a), \\&(\{ o_{2}, o_{3}, o_{4} \}, a_{1} = y \rightarrow a_{2} =b), (\{ o_{5}, o_{6} \}, a_{1} = w \rightarrow a_{2} =c), \\&(\{o_{6} \}, a_{1} = z \rightarrow a_{2} =c)\}, \\ P\overline{r}_{a_{1}}(U/a_{2})= & {} \{(\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =a), (\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =b), \\&(\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =c), (\{ o_{2}, o_{3}, o_{4} \}, a_{1} = y \rightarrow a_{2} =a), \\&(\{ o_{2}, o_{3}, o_{4} \}, a_{1} = y \rightarrow a_{2} =b), (\{ o_{5}, o_{6} \}, a_{1} = w \rightarrow a_{2} =c), \\&(\{ o_{6} \}, a_{1} = z \rightarrow a_{2} =c) \}. \end{aligned}$$

Using these formulae, the certain set of pairs of objects and a rule with consistency, which are in \(C\underline{r}_{a_{1}}(U/a_{2})\), is:

$$\begin{aligned} \{(\{ o_{5} \}, a_{1} = w \rightarrow a_{2} =c)\}. \end{aligned}$$

The certain set of pairs with inconsistency, which are in \((C\overline{r}_{a_{1}}(U/a_{2}) - C\underline{r}_{a_{1}}(U/a_{2}))\), is:

$$\begin{aligned} \{(\{ o_{3},o_{4} \}, a_{1} = y \rightarrow a_{2} =b)\}. \end{aligned}$$

The possible set of pairs with consistency, which are in \((P\underline{r}_{a_{1}}(U/a_{2}) - C\underline{r}_{a_{1}}(U/a_{2}))\), is:

$$\begin{aligned}&\{(\{ o_{1} \}, a_{1} = x \rightarrow a_{2} =c), (\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =a), \\&(\{ o_{2}, o_{3}, o_{4} \}, a_{1} = y \rightarrow a_{2} =b), (\{ o_{6} \}, a_{1} = w \rightarrow a_{2} =c), \\&(\{o_{6} \}, a_{1} = z \rightarrow a_{2} =c)\}. \end{aligned}$$

A possible set of pairs with inconsistency, which are in \((P\overline{r}_{a_{1}}(U/a_{2}) - P\underline{r}_{a_{1}}(U/a_{2}) - C\overline{r}_{a_{1}}(U/a_{2}) )\), is:

$$\begin{aligned}&\{(\{ o_{1}, o_{2} \}, a_{1} = x \rightarrow a_{2} =b), (\{ o_{2} \}, a_{1} = x \rightarrow a_{2} =c), \\&(\{ o_{2}, o_{3}, o_{4} \}, a_{1} = y \rightarrow a_{2} =a)\}. \end{aligned}$$

4 Conclusions

We have described an approach based on rough sets in an incomplete information system. An attribute value is expressed by a set of values in the incomplete information system. The approach is based on directly using indiscernibility relations from the viewpoint of both certainty and possibility. First, We have shown rough approximations for the case where only objects used to approximate are characterized by attributes with incomplete information. Second, we have shown the case where only objects in a set approximated have incomplete information. Finally, rough approximations have been shown in the case where both objects used to approximate and objects approximated are characterized by attributes with incomplete information.

We have four approximations: certain lower, possible lower, certain upper, and possible upper ones. These are linked with each other. Lower and upper approximations consists of a pair of certain and possible lower ones and a pair of certain and possible upper ones, respectively. This is essential in incomplete information systems. As a result, the complementarity property linked with lower and upper approximations holds, as is valid under complete information.

Objects that belongs to certain rough approximations do not always support certain rule. To clarify how rules an object supports, we have introduced expressions where we deal with pairs of an object and a rule that it supports. By using the expressions, we can obtain four sets of pairs of an object and a rule that it supports: certain and consistent, possible and consistent, certain and inconsistent, and possible and inconsistent sets. In other words, pairs of an object and a rule are classified into four types.

Our approach is applicable to the case where equivalence classes are not obtained, because we directly use indiscernibility relations.