1 Introduction

Information generated in the real world includes various types of data. When we deal with character string data, the data is broadly classified into discrete data and continuous data.

Rough sets, introduced by Pawlak [18], are used as an effective method for feature selection, pattern recognition, data mining and so on. The framework consists of lower and upper approximations. It is traditionally applied to complete information tables with nominal attributes, and fruitful results are reported in various fields. However, when we are faced with real-world objects, it is often necessary to handle attributes that take continuous values. Furthermore, objects with incomplete information ubiquitously exist in the real world. Without processing incomplete and continuous information, the information generated in the real world cannot be fully utilized. Therefore, extended versions of rough sets have been proposed to handle incomplete information in continuous domains.

An approach to handling incomplete information that is often adopted [7, 20, 21, 22] follows the method that Kryszkiewicz applied to nominal attributes [8]. This approach fixes in advance the indiscernibility between objects that have incomplete information and other objects. However, it is natural that there are two possibilities for an object with incomplete information. One possibility is that the object may have the same value as another object. That is, the two objects may be indiscernible. The other possibility is that the object may have a different value from another object. That is, they may be discernible. Fixing the indiscernibility in advance corresponds to neglecting one of the two possibilities. Therefore, the approach leads to loss of information and yields poor results [11, 19].

Another approach is to directly use indiscernibility relations extended to handle incomplete information [14]. Yet another approach is to use possible classes obtained from the indiscernibility relation on a set of attributes [15]. These two approaches avoid the growth in computation that otherwise comes with the number of values with incomplete information. We need to give some justification to these extended approaches. It is known for discrete data tables that an approach using possible classes has some justification from the viewpoint of possible world semantics [12]. We focus on an approach directly using indiscernibility relations (Footnote 1). To give it some justification, we need to develop an approach that is based on possible world semantics. The previous approaches are developed under possible tables derived from an incomplete and continuous information table. Unfortunately, an infinite number of possible tables can be generated from an incomplete and continuous information table, and possible world semantics cannot be applied to an infinite number of possible tables.

The starting point for a rough set is the indiscernibility relation on a set of attributes. When an information table contains values with incomplete information, we obtain many possible indiscernibility relations in place of the single indiscernibility relation. Their number is finite, even if the number of possible tables is infinite, because the number of objects is finite. We note this finiteness and develop an approach based on the possible indiscernibility relations, not the possible tables.

The paper is organized as follows. Section 2 describes an approach directly using indiscernibility relations in a complete and continuous information table. Section 3 develops an approach applying possible world semantics to an incomplete and continuous information table. Section 4 describes rule induction in a complete and continuous information table. Section 5 addresses rule induction in an incomplete and continuous information table. Section 6 presents the conclusions.

2 Rough Sets by Directly Using Indiscernibility Relations in Complete and Continuous Information Systems

A continuous data set is represented as a two-dimensional table, called a continuous information table. In the continuous information table, each row and each column represent an object and an attribute, respectively. A mathematical model of an information table with complete and continuous information is called a complete and continuous information system. The complete and continuous information system is a triplet expressed by \((U,AT,\{D(a) \mid a \in AT \})\). U is a non-empty finite set of objects, which is called the universe. AT is a non-empty finite set of attributes such that \(a : U \rightarrow D(a)\) for every \(a \in AT\) where D(a) is the continuous domain of attribute a.

We have two approaches for handling continuous values. One approach is to discretize a continuous domain into disjoint intervals in which objects are considered as indiscernible [4]. The choice of discretization heavily influences the results. The other approach is to use neighborhoods [10]. The indiscernibility of two objects is derived from the distance between the values that characterize them. A threshold is given, which is the indiscernibility criterion. When the distance between two objects is less than or equal to the threshold, they are considered as indiscernible. As the threshold changes, the results change gradually. Therefore, we take the neighborhood-based approach.

Binary relation \(R_{A}\) (Footnote 2) that represents the indiscernibility between objects on set \(A \subseteq AT\) of attributes is called the indiscernibility relation on A:

$$\begin{aligned} R_{A} = \{(o,o') \in U \times U \mid |A(o) - A(o')| \le \delta _{A} \}, \end{aligned}$$
(1)

where A(o) is the value sequence for A of object o, \((|A(o) - A(o')| \le \delta _{A}) = (\wedge _{a \in A} |a(o) - a(o')| \le \delta _{a})\), and \(\delta _{a}\) (Footnote 3) is a threshold indicating the range of indiscernibility between a(o) and \(a(o')\).
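As an illustration of formula (1), the following minimal Python sketch builds \(R_{A}\) from a small table. The table `values`, the thresholds in `delta`, and the function name `indiscernibility_relation` are hypothetical and are not taken from the paper; the values are chosen so that no distance falls exactly on a threshold, which avoids floating-point boundary effects.

```python
from itertools import product

def indiscernibility_relation(values, attrs, delta):
    """Return R_A of formula (1) as a set of ordered pairs of object names.

    values: dict object -> dict attribute -> float (hypothetical layout)
    attrs:  the attribute set A
    delta:  dict attribute -> threshold delta_a
    """
    return {
        (o, p)
        for o, p in product(values, repeat=2)
        # (o, p) is kept only if every attribute in A is within its threshold.
        if all(abs(values[o][a] - values[p][a]) <= delta[a] for a in attrs)
    }

# Hypothetical complete and continuous table (not from the paper).
values = {
    "o1": {"a1": 1.00, "a2": 2.00},
    "o2": {"a1": 1.03, "a2": 2.50},
    "o3": {"a1": 1.20, "a2": 2.02},
}
delta = {"a1": 0.05, "a2": 0.1}

R_a1 = indiscernibility_relation(values, ["a1"], delta)
R_A = indiscernibility_relation(values, ["a1", "a2"], delta)
```

In this sketch `R_A` is contained in `R_a1`, because adding an attribute can only remove pairs. The relation is reflexive and symmetric by construction, but it need not be transitive.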

Proposition 1

If \(\delta 1_{A} \le \delta 2_{A}\), equal to \(\wedge _{a \in A}(\delta 1_{a} \le \delta 2_{a})\), then \(R_{A}^{\delta 1_{A}} \subseteq R_{A}^{\delta 2_{A}}\), where \(R_{A}^{\delta 1_{A}}\) and \(R_{A}^{\delta 2_{A}}\) are the indiscernibility relations with thresholds \(\delta 1_{A}\) and \(\delta 2_{A}\), respectively and \(R_{A}^{\delta 1_{A}} = \cap _{a \in A} R_{a}^{\delta 1_{a}}\) and \(R_{A}^{\delta 2_{A}}= \cap _{a \in A} R_{a}^{\delta 2_{a}}\).

From indiscernibility relation \(R_{A}\), indiscernible class \([o]_{A}\) for object o is obtained:

$$\begin{aligned}{}[o]_{A} = \{o' \mid (o,o') \in R_{A} \}, \end{aligned}$$
(2)

where \([o]_{A} = \cap _{a \in A}[o]_{a}\).

Directly using indiscernibility relation \(R_{A}\), lower approximation \(\underline{apr}_{A}(\mathcal{O})\) and upper approximation \(\overline{apr}_{A}(\mathcal{O})\) for A of set \(\mathcal{O}\) of objects are:

$$\begin{aligned} \underline{apr}_{A}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in R_{A} \vee o' \in \mathcal{O}\}, \end{aligned}$$
(3)
$$\begin{aligned} \overline{apr}_{A}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in R_{A} \wedge o' \in \mathcal{O}\}. \end{aligned}$$
(4)

Proposition 2

[14] Let \(\underline{apr}_{A}^{\delta 1_{A}}(\mathcal{O})\) and \(\overline{apr}_{A}^{\delta 1_{A}}(\mathcal{O})\) be lower and upper approximations under threshold \(\delta 1_{A}\) and let \(\underline{apr}_{A}^{\delta 2_{A}}(\mathcal{O})\) and \(\overline{apr}_{A}^{\delta 2_{A}}(\mathcal{O})\) be lower and upper approximations under threshold \(\delta 2_{A}\). If \(\delta 1_{A} \le \delta 2_{A}\), then \(\underline{apr}_{A}^{\delta 1_{A}}(\mathcal{O}) \supseteq \underline{apr}_{A}^{\delta 2_{A}}(\mathcal{O})\) and \(\overline{apr}_{A}^{\delta 1_{A}}(\mathcal{O}) \subseteq \overline{apr}_{A}^{\delta 2_{A}}(\mathcal{O})\).

For object o in the lower approximation of \(\mathcal{O}\), all objects with which o is indiscernible are included in \(\mathcal{O}\); namely, \([o]_{A} \subseteq \mathcal{O}\). On the other hand, for object o in the upper approximation of \(\mathcal{O}\), some objects indiscernible from o are in \(\mathcal{O}\). That is, \([o]_{A} \cap \mathcal{O} \ne \emptyset \). Since \(R_{A}\) is reflexive, \(o \in [o]_{A}\), and thus \(\underline{apr}_{A}(\mathcal{O}) \subseteq \overline{apr}_{A}(\mathcal{O})\).
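Formulae (2)–(4) and the monotonicity stated in Proposition 2 can be checked on a toy example. The sketch below is only illustrative: the single-attribute table `values`, the two thresholds, and all function names are assumptions introduced here, not part of the paper.

```python
def relation(values, delta):
    """Indiscernibility relation on one attribute under threshold delta."""
    return {(o, p) for o in values for p in values
            if abs(values[o] - values[p]) <= delta}

def indiscernible_class(R, o):
    """[o] of formula (2): all objects related to o."""
    return {p for (q, p) in R if q == o}

def lower_approx(R, universe, target):
    """Formula (3): objects whose class lies entirely inside target."""
    return {o for o in universe if indiscernible_class(R, o) <= target}

def upper_approx(R, universe, target):
    """Formula (4): objects whose class meets target."""
    return {o for o in universe if indiscernible_class(R, o) & target}

# Hypothetical values of one continuous attribute (not from the paper).
values = {"o1": 1.00, "o2": 1.03, "o3": 1.10, "o4": 1.50}
U = set(values)
target = {"o1", "o2"}

R_small = relation(values, 0.05)   # smaller threshold
R_large = relation(values, 0.08)   # larger threshold

low_small, up_small = lower_approx(R_small, U, target), upper_approx(R_small, U, target)
low_large, up_large = lower_approx(R_large, U, target), upper_approx(R_large, U, target)
```

With the smaller threshold both approximations equal `{"o1", "o2"}`; with the larger one the lower approximation shrinks to `{"o1"}` and the upper approximation grows to `{"o1", "o2", "o3"}`, as Proposition 2 predicts.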

3 Rough Sets from Possible World Semantics in Incomplete and Continuous Information Systems

An information table with incomplete and continuous information is called an incomplete and continuous information system. In incomplete and continuous information systems, \(a : U \rightarrow s_{a}\) for every \(a \in AT\) where \(s_{a}\) is the union of or-sets of values over domain D(a) of attribute a and sets of intervals on D(a). Note that an or-set is a disjunctive set [9]. Single value \(v \in a(o)\) is a possible value that may be the actual value of attribute a in object o. The possible value is the actual one if a(o) is single; namely, \(|a(o)| = 1\).

We obtain many possible indiscernibility relations from an incomplete and continuous information table. The smallest possible indiscernibility relation is the certain one. Certain indiscernibility relation \(CR_{A}\) is:

$$\begin{aligned} CR_{A}= & {} \cap _{a \in A}CR_{a}, \end{aligned}$$
(5)
$$\begin{aligned} CR_{a}= & {} \{(o,o') \in U \times U \mid (o = o') \vee (\forall u \in a(o) \forall v \in a(o') |u - v| \le \delta _{a})\}. \end{aligned}$$
(6)

In this binary relation, which is unique on A, two objects o and \(o'\) of \((o,o') \in CR_{A} \) are certainly indiscernible on A. Such a pair is called a certain pair. Family \(\mathcal{F}(R_{A})\) of possible indiscernibility relations is:

$$\begin{aligned} \mathcal{F}(R_{A}) = \{e \mid e = CR_{A} \cup e' \wedge e' \in \mathcal{P}(MPPR_{A})\}, \end{aligned}$$
(7)

where each element is a possible indiscernibility relation and \(\mathcal{P}(MPPR_{A})\) is the power set of \(MPPR_{A}\) and \(MPPR_{A}\) is:

$$\begin{aligned} MPPR_{A}= & {} \{\{(o',o),(o,o')\}|(o',o) \in MPR_{A}\}, \nonumber \\ MPR_{A}= & {} \cap _{a \in A}MPR_{a}, \end{aligned}$$
(8)
$$\begin{aligned} MPR_{a}= & {} \{(o,o') \in U \times U \mid \exists u \in a(o) \exists v \in a(o') \ |u - v| \le \delta _{a} \}\backslash CR_{a}. \end{aligned}$$
(9)

A pair of objects in \(MPR_{A}\) is called a possible pair. \(\mathcal{F}(R_{A})\) has a lattice structure for set inclusion. \(CR_{A}\) is the minimum possible indiscernibility relation in \(\mathcal{F}(R_{A})\) on A, which is the minimum element, whereas \(CR_{A} \cup MPR_{A}\) is the maximum possible indiscernibility relation on A, which is the maximum element. One of the possible indiscernibility relations is the actual one. However, we cannot know which one without additional information.
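Formulae (6) and (9) can be evaluated for or-sets and intervals alike by comparing the largest and the smallest attainable distance between two values. The following Python sketch assumes a representation that is not in the paper: every attribute value is a list of closed intervals `(lo, hi)`, so a precise value v is `[(v, v)]`, an or-set is a list of degenerate intervals, and an interval is a single pair. The table and all names are hypothetical.

```python
from itertools import product

def max_dist(x, y):
    """Largest |u - v| over all possible values u of x and v of y."""
    return max(max(h1 - l2, h2 - l1) for (l1, h1), (l2, h2) in product(x, y))

def min_dist(x, y):
    """Smallest |u - v| over all possible values u of x and v of y."""
    return min(max(0.0, l2 - h1, l1 - h2) for (l1, h1), (l2, h2) in product(x, y))

def certain_relation(table, delta):
    """CR_a of formula (6): pairs that are indiscernible whatever the actual values."""
    return {(o, p) for o, p in product(table, repeat=2)
            if o == p or max_dist(table[o], table[p]) <= delta}

def possible_pairs(table, delta):
    """MPR_a of formula (9): pairs that may be indiscernible but are not certain."""
    may = {(o, p) for o, p in product(table, repeat=2)
           if min_dist(table[o], table[p]) <= delta}
    return may - certain_relation(table, delta)

# Hypothetical incomplete and continuous table on one attribute.
table = {
    "o1": [(1.00, 1.00)],                # precise value 1.00
    "o2": [(1.02, 1.02), (1.20, 1.20)],  # or-set <1.02, 1.20>
    "o3": [(1.18, 1.30)],                # interval [1.18, 1.30]
    "o4": [(2.00, 2.00)],                # precise value 2.00
    "o5": [(1.01, 1.03)],                # interval [1.01, 1.03]
}
CR = certain_relation(table, 0.05)
MPR = possible_pairs(table, 0.05)
```

Here `o1` and `o5` form a certain pair, while `o2` forms possible pairs with `o1`, `o3`, and `o5`, so this toy table would yield \(2^{3} = 8\) possible indiscernibility relations.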

Fig. 1. Incomplete and continuous information table T

Example 1

Or-set \(<1.25,1.31>\) means 1.25 or 1.31. Let threshold \(\delta _{a_{1}}\) be 0.05 in T of Fig. 1. The set of certain pairs of indiscernible objects on \(a_{1}\) is:

$$\begin{aligned} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),(o_{4},o_{4}),(o_{5},o_{5})\}. \end{aligned}$$

The set of possible pairs of indiscernible objects is:

$$\begin{aligned} \{(o_{1},o_{2}),(o_{2},o_{1}),(o_{2},o_{3}),(o_{3},o_{2}),(o_{3},o_{5}),(o_{5},o_{3})\}. \end{aligned}$$

Applying formulae (5)–(7) to these sets, the family of possible indiscernibility relations and each possible indiscernibility relation \(pr_{i}\) with \(i =1, \ldots , 8\) are:

$$\begin{aligned} \mathcal{F}(R_{a_{1}})= & {} \{pr_{1},\cdots ,pr_{8}\}, \\ pr_{1}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5})\}, \\ pr_{2}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{1},o_{2}),(o_{2},o_{1}) \}, \\ pr_{3}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{2},o_{3}),(o_{3},o_{2}) \}, \\ pr_{4}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{3},o_{5}),(o_{5},o_{3}) \}, \\ pr_{5}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{1},o_{2}),(o_{2},o_{1}), (o_{2},o_{3}), (o_{3},o_{2}) \}, \\ pr_{6}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{1},o_{2}),(o_{2},o_{1}), (o_{3},o_{5}),(o_{5},o_{3}) \}, \\ pr_{7}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{2},o_{3}),(o_{3},o_{2}), (o_{3},o_{5}),(o_{5},o_{3}) \}, \\ pr_{8}= & {} \{(o_{1},o_{1}),(o_{1},o_{3}),(o_{3},o_{1}),(o_{1},o_{5}),(o_{5},o_{1}),(o_{2},o_{2}),(o_{3},o_{3}),\\&(o_{4},o_{4}),(o_{5},o_{5}), (o_{1},o_{2}),(o_{2},o_{1}), (o_{2},o_{3}), (o_{3},o_{2}), (o_{3},o_{5}),(o_{5},o_{3}) \}. \end{aligned}$$

The family of these possible indiscernibility relations has the lattice structure for set inclusion shown in Fig. 2. \(pr_{1}\) is the minimum element, whereas \(pr_{8}\) is the maximum element.
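The enumeration in Example 1 can be reproduced mechanically from the certain and possible pairs listed there. The sketch below is a minimal illustration: objects \(o_{i}\) are encoded as the integers i, and the function name `possible_relations` is an assumption made here.

```python
from itertools import combinations

# Certain and possible pairs on a_1 as listed in Example 1 (o_i written as i).
certain = {(1, 1), (1, 3), (3, 1), (1, 5), (5, 1), (2, 2), (3, 3), (4, 4), (5, 5)}
possible = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 5), (5, 3)}

def possible_relations(certain, possible):
    """Family of formula (7): the certain relation plus any subset of symmetric possible pairs."""
    # Each unit {(o', o), (o, o')} is added or left out as a whole, as in formula (8).
    units = sorted({frozenset({(o, p), (p, o)}) for (o, p) in possible}, key=sorted)
    family = []
    for k in range(len(units) + 1):
        for chosen in combinations(units, k):
            family.append(certain.union(*chosen))
    return family

family = possible_relations(certain, possible)
pr_min = min(family, key=len)
pr_max = max(family, key=len)
```

Three symmetric units give \(2^{3} = 8\) relations; `pr_min` coincides with \(pr_{1}\) and `pr_max` with \(pr_{8}\).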

Fig. 2. Lattice structure

We develop an approach based on possible indiscernibility relations in an incomplete and continuous information table. Applying formulae (3) and (4) to a possible indiscernibility relation pr, the lower and upper approximations in pr are:

$$\begin{aligned} \underline{apr}_{A}(\mathcal{O})^{pr}= & {} \{o \mid \forall o' \in U \ ((o, o') \not \in pr \wedge pr \in \mathcal{F}(R_{A})) \vee o' \in \mathcal{O}\}, \end{aligned}$$
(10)
$$\begin{aligned} \overline{apr}_{A}(\mathcal{O})^{pr}= & {} \{o \mid \exists o' \in U \ ((o, o') \in pr \wedge pr \in \mathcal{F}(R_{A})) \wedge o' \in \mathcal{O} \}. \end{aligned}$$
(11)

Proposition 3

If \(pr_{k} \subseteq pr_{l}\) for possible indiscernibility relations \(pr_{k}, pr_{l} \in \mathcal{F}(R_{A})\),

then \(\underline{apr}_{A}(\mathcal{O})^{pr_{k}} \supseteq \underline{apr}_{A}(\mathcal{O})^{pr_{l}}\) and \(\overline{apr}_{A}(\mathcal{O})^{pr_{k}} \subseteq \overline{apr}_{A}(\mathcal{O})^{pr_{l}}\).

From this proposition the families of lower and upper approximations in possible indiscernibility relations also have the same lattice structure for set inclusion as the family of possible indiscernibility relations.

By aggregating the lower and upper approximations in possible indiscernibility relations, we obtain four kinds of approximations: certain lower approximation \(C\underline{apr}_{A}(\mathcal{O})\), certain upper approximation \(C\overline{apr}_{A}(\mathcal{O})\), possible lower approximation \(P\underline{apr}_{A}(\mathcal{O})\), and possible upper approximation \(P\overline{apr}_{A}(\mathcal{O})\):

$$\begin{aligned} C\underline{apr}_{A}(\mathcal{O})= & {} \{o \mid \forall pr \in \mathcal{F}(R_{A}) o \in \underline{apr}_{A}(\mathcal{O})^{pr}\}, \end{aligned}$$
(12)
$$\begin{aligned} C\overline{apr}_{A}(\mathcal{O})= & {} \{o \mid \forall pr \in \mathcal{F}(R_{A}) o \in \overline{apr}_{A}(\mathcal{O})^{pr}\}, \end{aligned}$$
(13)
$$\begin{aligned} P\underline{apr}_{A}(\mathcal{O})= & {} \{o \mid \exists pr \in \mathcal{F}(R_{A}) o \in \underline{apr}_{A}(\mathcal{O})^{pr}\}, \end{aligned}$$
(14)
$$\begin{aligned} P\overline{apr}_{A}(\mathcal{O})= & {} \{o \mid \exists pr \in \mathcal{F}(R_{A}) o \in \overline{apr}_{A}(\mathcal{O})^{pr}\}. \end{aligned}$$
(15)

Using Proposition 3,

$$\begin{aligned} C\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(\mathcal{O})^{pr_{\max }}, \end{aligned}$$
(16)
$$\begin{aligned} C\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(\mathcal{O})^{pr_{\min }}, \end{aligned}$$
(17)
$$\begin{aligned} P\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(\mathcal{O})^{pr_{\min }}, \end{aligned}$$
(18)
$$\begin{aligned} P\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(\mathcal{O})^{pr_{\max }}, \end{aligned}$$
(19)

where \(pr_{\min }\) and \(pr_{\max }\) are the minimum and the maximum possible indiscernibility relations on A.

Using formulae (16)–(19), we can obtain the four approximations from only the minimum and maximum possible indiscernibility relations, without enumerating the whole family, although the number of possible indiscernibility relations grows exponentially as the number of values with incomplete information increases linearly.

Definability on set A of attributes is defined as follows:

Set \(\mathcal{O}\) of objects is certainly definable if and only if \(\forall pr \in \mathcal{F}(R_{A}) \exists S \subseteq U \ \mathcal{O} = \cup _{o \in S}[o]_{A}^{pr}\).

Set \(\mathcal{O}\) of objects is possibly definable if and only if \(\exists pr \in \mathcal{F}(R_{A}) \exists S \subseteq U \ \mathcal{O} = \cup _{o \in S}[o]_{A}^{pr}\).

These definitions are equivalent to:

Set \(\mathcal{O}\) of objects is certainly definable if and only if \(\forall pr \in \mathcal{F}(R_{A}) \ \underline{apr}_{A}(\mathcal{O})^{pr} = \overline{apr}_{A}(\mathcal{O})^{pr}\).

Set \(\mathcal{O}\) of objects is possibly definable if and only if \(\exists pr \in \mathcal{F}(R_{A}) \ \underline{apr}_{A}(\mathcal{O})^{pr} = \overline{apr}_{A}(\mathcal{O})^{pr}\).

Example 2

We use the possible indiscernibility relations in Example 1. Let set \(\mathcal{O}\) of objects be \(\{o_{2},o_{4}\}\). Applying formulae (10) and (11) to \(\mathcal{O}\), lower and upper approximations from each possible indiscernibility relation are:

$$\begin{aligned}&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{1}} = \{o_{2},o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{1}} = \{o_{2},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{2}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{2}} = \{o_{1},o_{2},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{3}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{3}} = \{o_{2},o_{3},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{4}} = \{o_{2},o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{4}} = \{o_{2},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{5}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{5}} = \{o_{1},o_{2},o_{3},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{6}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{6}} = \{o_{1},o_{2},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{7}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{7}} = \{o_{2},o_{3},o_{4}\}, \\&\underline{apr}_{a_{1}}(\mathcal{O})^{pr_{8}} = \{o_{4}\}, \overline{apr}_{a_{1}}(\mathcal{O})^{pr_{8}} = \{o_{1},o_{2},o_{3},o_{4}\}. \end{aligned}$$

By using formulae (16)–(19),

$$\begin{aligned} C\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{4}\}, \\ C\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{2},o_{4}\}, \\ P\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{2},o_{4}\}, \\ P\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{1},o_{2},o_{3},o_{4}\}. \end{aligned}$$

\(\mathcal{O}\) is possibly definable on \(a_{1}\).
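The results of Example 2 can be cross-checked by aggregating over all eight possible indiscernibility relations, as in formulae (12)–(15), and comparing with the shortcut of formulae (16)–(19). The following Python sketch uses the pairs from Example 1 with \(o_{i}\) encoded as the integer i; the helper names are assumptions introduced for illustration.

```python
from itertools import combinations

U = {1, 2, 3, 4, 5}
certain = {(1, 1), (1, 3), (3, 1), (1, 5), (5, 1), (2, 2), (3, 3), (4, 4), (5, 5)}
possible = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 5), (5, 3)}
O = {2, 4}

def cls(R, o):
    """Indiscernible class of o under relation R."""
    return {p for (q, p) in R if q == o}

def lower(R):
    return {o for o in U if cls(R, o) <= O}

def upper(R):
    return {o for o in U if cls(R, o) & O}

# All possible indiscernibility relations (formula (7)).
units = {frozenset({(o, p), (p, o)}) for (o, p) in possible}
family = [certain.union(*chosen)
          for k in range(len(units) + 1)
          for chosen in combinations(units, k)]

# Aggregation over the whole family, formulae (12)-(15).
c_lower = set.intersection(*(lower(pr) for pr in family))
c_upper = set.intersection(*(upper(pr) for pr in family))
p_lower = set.union(*(lower(pr) for pr in family))
p_upper = set.union(*(upper(pr) for pr in family))

# Shortcut through the minimum and maximum relations, formulae (16)-(19).
pr_min, pr_max = certain, certain | possible
shortcut = (lower(pr_max), upper(pr_min), lower(pr_min), upper(pr_max))

# Definability: lower and upper approximations coincide in some / all relations.
possibly_definable = any(lower(pr) == upper(pr) for pr in family)
certainly_definable = all(lower(pr) == upper(pr) for pr in family)
```

Both routes give the same four sets, and the flags confirm that \(\mathcal{O}\) is possibly but not certainly definable on \(a_{1}\).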

As with the case of nominal attributes [12], the following proposition holds.

Proposition 4

\(C\underline{apr}_{A}(\mathcal{O})\) \(\subseteq \) \(P\underline{apr}_{A}(\mathcal{O})\) \(\subseteq \) \(\mathcal{O}\) \(\subseteq \) \(C\overline{apr}_{A}(\mathcal{O})\) \(\subseteq \) \(P\overline{apr}_{A}(\mathcal{O})\).

Using the four approximations denoted by formulae (16)–(19), lower approximation \(\underline{apr}_{A}^{\bullet }(\mathcal{O})\) and upper approximation \(\overline{apr}_{A}^{\bullet }(\mathcal{O})\) are expressed in interval sets, as is described in [13] (Footnote 4):

$$\begin{aligned} \underline{apr}_{A}^{\bullet }(\mathcal{O}) = [C\underline{apr}_{A}(\mathcal{O}), P\underline{apr}_{A}(\mathcal{O}) ], \end{aligned}$$
(20)
$$\begin{aligned} \overline{apr}_{A}^{\bullet }(\mathcal{O}) = [C\overline{apr}_{A}(\mathcal{O}), P\overline{apr}_{A}(\mathcal{O}) ]. \end{aligned}$$
(21)

The two approximations \(\underline{apr}_{A}^{\bullet }(\mathcal{O})\) and \(\overline{apr}_{A}^{\bullet }(\mathcal{O})\) are dependent through the complementarity property \(\underline{apr}_{A}^{\bullet }(\mathcal{O}) = U - \overline{apr}_{A}^{\bullet }(U - \mathcal{O})\).

Example 3

Applying four approximations in Example 2 to formulae (20) and (21),

$$\begin{aligned} \underline{apr}_{a_{1}}^{\bullet }(\mathcal{O})= & {} [\{o_{4}\}, \{o_{2},o_{4}\}], \\ \overline{apr}_{a_{1}}^{\bullet }(\mathcal{O})= & {} [\{o_{2},o_{4}\}, \{o_{1},o_{2},o_{3},o_{4}\}]. \end{aligned}$$

Furthermore, the following proposition is valid from formulae (16)–(19).

Proposition 5

$$\begin{aligned} C\underline{apr}_{A}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in (CR_{A} \cup MPR_{A}) \vee o' \in \mathcal{O}\}, \\ C\overline{apr}_{A}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in CR_{A} \wedge o' \in \mathcal{O}\}, \\ P\underline{apr}_{A}(\mathcal{O})= & {} \{o \mid \forall o' \in U \ (o, o') \not \in CR_{A} \vee o' \in \mathcal{O}\}, \\ P\overline{apr}_{A}(\mathcal{O})= & {} \{o \mid \exists o' \in U \ (o, o') \in (CR_{A} \cup MPR_{A}) \wedge o' \in \mathcal{O}\}. \end{aligned}$$

Our extended approach directly using indiscernibility relations [14] is justified from this proposition. That is, approximations from the extended approach using two indiscernibility relations are the same as the ones obtained under possible world semantics. A correctness criterion for justification is formulated as

$$\begin{aligned} q(R_{A}) = \bigodot q'(\mathcal{F}(R_{A})), \end{aligned}$$

where \(q'\) is the approach for complete and continuous information, which is described in Sect. 2, q is an extended approach of \(q'\), which directly handles incomplete and continuous information, and \(\bigodot \) is an aggregate operator. This is represented in Fig. 3.

This kind of correctness criterion is usually used in the field of databases handling incomplete information [1,2,3, 6, 17, 23].

Fig. 3. Correctness criterion of extended method q

When objects in \(\mathcal{O}\) are specified by a restriction containing set B of nominal attributes with incomplete information, elements in domain \(D(B)(=\cup _{b \in B}D(b))\) are used. For example, \(\mathcal{O}\) is specified by restriction \(B = X(=\wedge _{b \in B}(b=x_{b}))\) with \(B \subseteq AT\) and \(x_{b} \in D(b)\). The four approximations, namely the certain lower, certain upper, possible lower, and possible upper ones, are:

$$\begin{aligned} C\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(C\mathcal{O}_{B = X})^{pr_{\max }}, \end{aligned}$$
(22)
$$\begin{aligned} C\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(C\mathcal{O}_{B = X})^{pr_{\min }}, \end{aligned}$$
(23)
$$\begin{aligned} P\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(P\mathcal{O}_{B = X})^{pr_{\min }}, \end{aligned}$$
(24)
$$\begin{aligned} P\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(P\mathcal{O}_{B = X})^{pr_{\max }}. \end{aligned}$$
(25)

where

$$\begin{aligned} C\mathcal{O}_{B = X }= & {} \{o \in \mathcal{O}\mid B(o) = X \}, \end{aligned}$$
(26)
$$\begin{aligned} P\mathcal{O}_{B = X}= & {} \{o \in \mathcal{O} \mid B(o) \cap X \not = \emptyset \}. \end{aligned}$$
(27)

When \(\mathcal{O}\) is specified by a restriction containing set B of numerical attributes with incomplete information, set \(\mathcal{O}\) is specified by an interval where precise values of \(b \in B\) are used.

$$\begin{aligned} C\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(C\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]})^{pr_{\max }}, \end{aligned}$$
(28)
$$\begin{aligned} C\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(C\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]})^{pr_{\min }}, \end{aligned}$$
(29)
$$\begin{aligned} P\underline{apr}_{A}(\mathcal{O})= & {} \underline{apr}_{A}(P\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]})^{pr_{\min }}, \end{aligned}$$
(30)
$$\begin{aligned} P\overline{apr}_{A}(\mathcal{O})= & {} \overline{apr}_{A}(P\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]})^{pr_{\max }}, \end{aligned}$$
(31)

where

$$\begin{aligned} C\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]}= & {} \{o \in \mathcal{O}\mid \forall b \in B \ b(o) \subseteq [b(o_{m_{b}}), b(o_{n_{b}})] \}, \end{aligned}$$
(32)
$$\begin{aligned} P\mathcal{O}_{\wedge _{b \in B} [b(o_{m_{b}}),b(o_{n_{b}})]}= & {} \{o \in \mathcal{O} \mid \forall b \in B \ b(o) \cap [b(o_{m_{b}}), b(o_{n_{b}})] \not = \emptyset \}, \end{aligned}$$
(33)

where \(b(o_{m_{b}})\) and \(b(o_{n_{b}})\) are precise and \(\forall b \in B \ b(o_{m_{b}}) \le b(o_{n_{b}})\).
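The certain and possible target sets of formulae (32) and (33) amount to containment and overlap tests against the restricting interval. The Python sketch below assumes, as a representation not found in the paper, that each attribute value is a list of closed intervals `(lo, hi)`, with precise values and or-set members written as degenerate intervals. The table, the bounds, and the function names are hypothetical.

```python
def certain_target(table, bounds):
    """Formula (32): every possible value of every b in B lies inside [lo_b, hi_b]."""
    return {o for o, row in table.items()
            if all(lo <= l and h <= hi
                   for b, (lo, hi) in bounds.items()
                   for (l, h) in row[b])}

def possible_target(table, bounds):
    """Formula (33): for every b in B, some possible value lies inside [lo_b, hi_b]."""
    return {o for o, row in table.items()
            if all(any(l <= hi and h >= lo for (l, h) in row[b])
                   for b, (lo, hi) in bounds.items())}

# Hypothetical table on one attribute b (not the table T of Fig. 1).
table = {
    "o1": {"b": [(3.0, 3.0)]},              # precise value 3.0
    "o2": {"b": [(2.0, 2.0), (3.5, 3.5)]},  # or-set <2.0, 3.5>
    "o3": {"b": [(3.2, 3.8)]},              # interval [3.2, 3.8]
    "o4": {"b": [(3.9, 4.5)]},              # interval [3.9, 4.5]
    "o5": {"b": [(5.0, 5.0)]},              # precise value 5.0
}
bounds = {"b": (3.0, 4.0)}  # restriction given by two precise values

CO = certain_target(table, bounds)
PO = possible_target(table, bounds)
```

In this toy case `CO` is `{"o1", "o3"}` and `PO` is `{"o1", "o2", "o3", "o4"}`; the certain target set is always contained in the possible one.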

Example 4

In incomplete information table T of Example 1, let \(\mathcal{O}\) be specified by values \(a_{2}(o_{3})\) and \(a_{2}(o_{4})\). Using formulae (32) and (33),

$$\begin{aligned} C\mathcal{O}_{[a_{2}(o_{3}),a_{2}(o_{4})]}= & {} \{o_{3},o_{4}\}, \\ P\mathcal{O}_{[a_{2}(o_{3}),a_{2}(o_{4})]}= & {} \{o_{2},o_{3},o_{4}\}. \end{aligned}$$

Possible indiscernibility relations \(pr_{\min }\) and \(pr_{\max }\) on \(a_{1}\) are \(pr_{1}\) and \(pr_{8}\) in Example 1, respectively. Using formulae (28)–(31),

$$\begin{aligned} C\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{4}\}, \\ C\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{3},o_{4}\}, \\ P\underline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{2},o_{3},o_{4}\}, \\ P\overline{apr}_{a_{1}}(\mathcal{O})= & {} \{o_{1},o_{2},o_{3},o_{4}\}. \end{aligned}$$

4 Rule Induction in Complete and Continuous Information Systems

Single rules supported by objects are derived from the lower and upper approximations of O specified by restriction \(B = X\).

  • Object \(o \in \underline{apr}_{A}(O)\) supports rule \(A = A(o) \rightarrow B = X\) consistently.

  • Object \(o \in \overline{apr}_{A}(O)\) supports rule \(A = A(o) \rightarrow B = X\) inconsistently.

The accuracy, which means the degree of consistency, is \(|[o]_{A} \cap O|/|[o]_{A} |\). This degree is equal to 1 for \(o \in \underline{apr}_{A}(O)\).

When the attributes that characterize objects have continuous domains, single rules supported by individual objects in an approximation usually have different antecedent parts, so we obtain many single rules. The disadvantage of the single rule is that it lacks applicability. For example, let two values a(o) and \(a(o')\) be 4.53 and 4.65 for objects o and \(o'\) in \(\underline{apr}_{a}(O)\). When O is specified by restriction \(b=x\), o and \(o'\) consistently support single rules \(a = 4.53 \rightarrow b = x\) and \(a = 4.65 \rightarrow b = x\), respectively. By using these single rules, we can say that an object with value 4.57 of a, which is indiscernible from 4.53 under \(\delta _{a} = 0.05\), supports \(a = 4.57 \rightarrow b = x\). However, we cannot say anything about a rule for an object with value 4.59, which is discernible from both 4.53 and 4.65 under \(\delta _{a} = 0.05\). This shows that the single rule has low applicability.

To improve applicability, we merge serial single rules into one combined rule. Let the objects \(o \in U\) be arranged in ascending order of a(o) and be given serial superscripts from 1 to |U|. \(\underline{apr}_{A}(O)\) and \(\overline{apr}_{A}(O)\) then consist of collections of serially superscripted objects. For instance, \(\underline{apr}_{A}(O) = \{\cdots , o_{i_{h}}^{h}, o_{i_{h+1}}^{h+1}, \cdots , o_{i_{k-1}}^{k-1}, o_{i_{k}}^{k}, \cdots \}\) \((h \le k)\). The following processing is applied to each attribute in A. A single rule supported by \(o^{l} \in \underline{apr}_{A}(O)\) has antecedent part \(a = a(o^{l})\) for attribute a. Then, the antecedent parts of the serial single rules induced from collection \((o_{i_{h}}^{h}, o_{i_{h+1}}^{h+1}, \cdots , o_{i_{k-1}}^{k-1}, o_{i_{k}}^{k})\) can be merged into one combined antecedent part \(a= [a(o_{i_{h}}^{h}), a(o_{i_{k}}^{k})]\). Finally, a combined rule is expressed as \(\wedge _{a \in A}(a= [a(o_{i_{h}}^{h}), a(o_{i_{k}}^{k})]) \rightarrow B =X\). The combined rule has accuracy

$$\begin{aligned} \min _{h \le j \le k}|[o^{j}_{i_{j}}]_{A} \cap O|/|[o^{j}_{i_{j}}]_{A} |. \end{aligned}$$
(34)

Proposition 7

Let \(\underline{r}\) be the set of combined rules obtained from \(\underline{apr}_{A}(O)\) and \(\overline{r}\) be the set from \(\overline{apr}_{A}(O)\). If \((A = [l_{A}, u_{A}] \rightarrow B = X) \in \underline{r}\), then \(\exists l'_{A} \le l_{A}, \exists u'_{A} \ge u_{A} \ (A = [l'_{A}, u'_{A}] \rightarrow B = X) \in \overline{r}\), where O is specified by restriction \(B = X\) and \((A = [l_{A}, u_{A}]) = \wedge _{a\in A}(a = [l_{a}, u_{a}])\).

Proof

A single rule obtained from \(\underline{apr}_{A}(O)\) is also derived from \(\overline{apr}_{A}(O)\). This means that the proposition holds.

Example 7

Let continuous information table T0 in Fig. 4 be given, where U consists of \(\{o_{1}, o_{2}, \cdots , o_{19}, o_{20}\}\). Tables T1, T2, and T3 in Fig. 4 are created from T0: T1 is the projection of T0 onto set \(\{a_{1},a_{4}\}\) of attributes, T2 is the projection onto \(\{a_{2},a_{3}\}\), and T3 is the projection onto \(\{a_{3}\}\). In addition, objects included in T1, T2, and T3 are arranged in ascending order of values of attributes \(a_{1}\), \(a_{2}\), and \(a_{3}\), respectively.

Fig. 4. T0 is an incomplete and continuous information table. T1, T2, and T3 are derived from T0

Indiscernible classes on \(a_{1}\) of each object under \(\delta _{a_{1}}=0.05\) are:

$$\begin{aligned}&[o_{1}]_{a_{1}} = \{o_{1},o_{10},o_{14}\}, [o_{2}]_{a_{1}} = \{o_{2},o_{11},o_{16},o_{17}\},\\ \qquad&[o_{3}]_{a_{1}} = \{o_{3}\}, [o_{4}]_{a_{1}} = \{o_{4}\}, [o_{5}]_{a_{1}} = \{o_{5},o_{20}\}, \\ \qquad&[o_{6}]_{a_{1}} = \{o_{6},o_{10},o_{15}\}, [o_{7}]_{a_{1}} = \{o_{7}\}, [o_{8}]_{a_{1}} = \{o_{8}\}, \\ \qquad&[o_{9}]_{a_{1}} = \{o_{9}\}, [o_{10}]_{a_{1}} = \{o_{1},o_{6},o_{10},o_{14},o_{15}\}, \\ \qquad&[o_{11}]_{a_{1}} = \{o_{2},o_{11},o_{16}\}, [o_{12}]_{a_{1}} = \{o_{12}\}, [o_{13}]_{a_{1}} = \{o_{13},o_{19}\}, \\ \qquad&[o_{14}]_{a_{1}} = \{o_{1},o_{10},o_{14}\}, [o_{15}]_{a_{1}} = \{o_{6},o_{10},o_{15}\}, \\ \qquad&[o_{16}]_{a_{1}} = \{o_{2},o_{11},o_{16}\}, [o_{17}]_{a_{1}} = \{o_{2},o_{17}\}, [o_{18}]_{a_{1}} = \{o_{18}\}, \\ \qquad&[o_{19}]_{a_{1}} = \{o_{13},o_{19}\}, [o_{20}]_{a_{1}} = \{o_{5},o_{20}\}. \end{aligned}$$

When O is specified by \(a_{4} = x\), \(O = \{o_{1}, o_{2}, o_{5}, o_{9}, o_{11}, o_{14}, o_{16}, o_{19}, o_{20}\}\). Let O be approximated by objects characterized by attribute \(a_{1}\) whose values are continuous. Using formulae (3) and (4), two approximations are:

$$\begin{aligned} \underline{apr}_{a_{1}}(O)= & {} \{ o_{5}, o_{9}, o_{11}, o_{16}, o_{20} \}, \\ \overline{apr}_{a_{1}}(O)= & {} \{ o_{1}, o_{2}, o_{5}, o_{9}, o_{10}, o_{11}, o_{13}, o_{14}, o_{16}, o_{17}, o_{19}, o_{20} \}. \end{aligned}$$
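The two approximations above follow directly from the indiscernible classes on \(a_{1}\) listed in this example, and the same classes give the accuracy of each single rule. The Python sketch below encodes \(o_{i}\) as the integer i; the function names `accuracy` and `combined_accuracy` are assumptions made for illustration, with `combined_accuracy` following formula (34).

```python
from fractions import Fraction

# Indiscernible classes on a_1 under delta = 0.05, copied from the example.
classes = {
    1: {1, 10, 14}, 2: {2, 11, 16, 17}, 3: {3}, 4: {4}, 5: {5, 20},
    6: {6, 10, 15}, 7: {7}, 8: {8}, 9: {9}, 10: {1, 6, 10, 14, 15},
    11: {2, 11, 16}, 12: {12}, 13: {13, 19}, 14: {1, 10, 14},
    15: {6, 10, 15}, 16: {2, 11, 16}, 17: {2, 17}, 18: {18},
    19: {13, 19}, 20: {5, 20},
}
# O specified by a_4 = x.
O = {1, 2, 5, 9, 11, 14, 16, 19, 20}

# Formulae (3) and (4) written in terms of indiscernible classes.
lower = {o for o, c in classes.items() if c <= O}
upper = {o for o, c in classes.items() if c & O}

def accuracy(o):
    """Degree of consistency |[o] & O| / |[o]| of the single rule supported by o."""
    c = classes[o]
    return Fraction(len(c & O), len(c))

def combined_accuracy(collection):
    """Accuracy of a combined rule, formula (34): the minimum over its objects."""
    return min(accuracy(o) for o in collection)
```

For instance, `accuracy(10)` evaluates to 2/5 because only \(o_{1}\) and \(o_{14}\) of the five objects in \([o_{10}]_{a_{1}}\) belong to O, whereas every object in the lower approximation has accuracy 1.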

In continuous information table T1, which is created from T0, objects are arranged in ascending order of values of attribute \(a_{1}\) and each object is given a serial superscript from 1 to 20. Using the serial superscript, the two approximations are rewritten:

$$\begin{aligned} \underline{apr}_{a_{1}}(O)= & {} \{ o_{16}^{7}, o_{11}^{8}, o_{9}^{14}, o_{5}^{15}, o_{20}^{16} \}, \\ \overline{apr}_{a_{1}}(O)= & {} \{ o_{17}^{5}, o_{2}^{6}, o_{16}^{7}, o_{11}^{8}, o_{10}^{11}, o_{1}^{12}, o_{14}^{13}, o_{9}^{14}, o_{5}^{15}, o_{20}^{16},o_{19}^{17},o_{13}^{18} \}. \end{aligned}$$

The lower approximation creates consistent combined rules:

$$\begin{aligned} a_{1} = [3.96, 3.98] \rightarrow a_{4} = x, a_{1} = [4.23, 4.43] \rightarrow a_{4} = x, \end{aligned}$$

from collections \(\{o_{16}^{7}, o_{11}^{8}\}\) and \(\{o_{9}^{14}, o_{5}^{15}, o_{20}^{16}\}\), respectively, where \(a_{1}(o_{16}^{7}) = 3.96, \ a_{1}(o_{11}^{8}) = 3.98, \ a_{1}(o_{9}^{14}) = 4.23,\) and \(a_{1}(o_{20}^{16}) = 4.43\). The upper approximation creates inconsistent combined rules:

$$\begin{aligned} a_{1} = [3.90, 3.98] \rightarrow a_{4} = x, a_{1} = [4.08, 4.92] \rightarrow a_{4} = x, \end{aligned}$$

from collections \(\{o_{17}^{5}, o_{2}^{6}, o_{16}^{7}, o_{11}^{8}\}\) and \(\{o_{10}^{11}, o_{1}^{12}, o_{14}^{13}, o_{9}^{14}, o_{5}^{15}, o_{20}^{16}, o_{19}^{17}, o_{13}^{18}\}\), respectively, where \(a_{1}(o_{17}^{5}) = 3.90\), \(a_{1}(o_{10}^{11}) = 4.08\), and \(a_{1}(o_{13}^{18}) = 4.92\).
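The construction of combined rules from serial collections can be sketched as follows. The sketch assumes, as in the example above, that each maximal run of consecutive superscripts yields one rule whose interval is bounded by the attribute values of the first and last objects of the run. The helper names `serial_runs` and `combined_rules` are hypothetical, and only the endpoint values quoted in the text are used.

```python
def serial_runs(ranks):
    """Split ranks into maximal runs of consecutive integers."""
    runs = []
    for r in sorted(ranks):
        if runs and r == runs[-1][-1] + 1:
            runs[-1].append(r)   # r continues the current run
        else:
            runs.append([r])     # r starts a new run
    return runs


def combined_rules(ranks, value_at):
    """Return one interval (l, u) per serial run, taken from the run's endpoints."""
    return [(value_at[run[0]], value_at[run[-1]]) for run in serial_runs(ranks)]


# Values of a_1 at the superscripts quoted in the text (endpoints only).
value_at = {5: 3.90, 7: 3.96, 8: 3.98, 11: 4.08, 14: 4.23, 16: 4.43, 18: 4.92}

lower_ranks = [7, 8, 14, 15, 16]                              # lower approximation
upper_ranks = [5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17, 18]    # upper approximation

lower_rules = combined_rules(lower_ranks, value_at)
upper_rules = combined_rules(upper_ranks, value_at)
print(lower_rules)
print(upper_rules)
```

With these inputs the sketch reproduces the intervals of the consistent and inconsistent combined rules stated above.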

Next, let O be specified by \(a_{3}\), which takes continuous values. In information table T3, projected from T0, the objects are arranged in ascending order of values of \(a_{3}\) and each object is given a serial superscript from 1 to 20. Let the lower and upper bounds be \(a_{3}(o_{15}^{6}) = 4.23\) and \(a_{3}(o_{8}^{11}) = 4.50\), respectively. Then, \(O = \{o_{15}^{6}, o_{3}^{7}, o_{17}^{8}, o_{2}^{9}, o_{16}^{10}, o_{8}^{11}\}\). We approximate O by objects restricted by attribute \(a_{2}\). Under \(\delta _{a_{2}} = 0.05\), the indiscernible classes of objects \(o_{1}, \ldots , o_{20}\) are:

$$\begin{aligned}&[o_{1}]_{a_{2}} = \{o_{1},o_{4},o_{7},o_{8}\}, [o_{2}]_{a_{2}} = \{o_{2},o_{3},o_{16}\}, \\ \qquad&[o_{3}]_{a_{2}} = \{o_{2},o_{3},o_{13},o_{16}\}, [o_{4}]_{a_{2}} = \{o_{1},o_{4},o_{7},o_{8}\}, \\ \qquad&[o_{5}]_{a_{2}} = \{o_{5},o_{20}\}, [o_{6}]_{a_{2}} = \{o_{6}\}, [o_{7}]_{a_{2}} = \{o_{1},o_{4},o_{7}\}, \\ \qquad&[o_{8}]_{a_{2}} = \{o_{8}\}, [o_{9}]_{a_{2}} = \{o_{9}\}, [o_{10}]_{a_{2}} = \{o_{10}\}, \\ \qquad&[o_{11}]_{a_{2}} = \{o_{11},o_{18}\}, [o_{12}]_{a_{2}} = \{o_{12}\}, [o_{13}]_{a_{2}} = \{o_{3},o_{13}\}, \\ \qquad&[o_{14}]_{a_{2}} = \{o_{14}\}, [o_{15}]_{a_{2}} = \{o_{15}\}, [o_{16}]_{a_{2}} = \{o_{2},o_{3},o_{16}\}, \\ \qquad&[o_{17}]_{a_{2}} = \{o_{17}\}, [o_{18}]_{a_{2}} = \{o_{11},o_{18}\}, [o_{19}]_{a_{2}} = \{o_{19}\}, [o_{20}]_{a_{2}} = \{o_{5},o_{20}\}. \end{aligned}$$

Formulae (3) and (4) yield the following approximations:

$$\begin{aligned} \underline{apr}_{a_{2}}(O)= & {} \{o_{2}, o_{8}, o_{15}, o_{16}, o_{17} \}, \\ \overline{apr}_{a_{2}}(O)= & {} \{o_{1}, o_{2}, o_{3}, o_{4}, o_{8}, o_{13}, o_{15}, o_{16}, o_{17} \}. \end{aligned}$$

In continuous information table T2, objects are arranged in ascending order of values of attribute \(a_{2}\) and each object is given a serial superscript from 1 to 20. Using objects with superscripts, the two approximations are rewritten:

$$\begin{aligned} \underline{apr}_{a_{2}}(O)= & {} \{ o_{8}^{7}, o_{15}^{8}, o_{17}^{10}, o_{2}^{11}, o_{16}^{12} \}, \\ \overline{apr}_{a_{2}}(O)= & {} \{ o_{1}^{5}, o_{4}^{6}, o_{8}^{7}, o_{15}^{8}, o_{17}^{10}, o_{2}^{11}, o_{16}^{12}, o_{3}^{13}, o_{13}^{14} \}. \end{aligned}$$

Consistent combined rules from collections \(\{o_{8}^{7}, o_{15}^{8}\}\) and \(\{o_{17}^{10}, o_{2}^{11}, o_{16}^{12}\}\) are

$$\begin{aligned} a_{2} = [2.10, 2.28]\rightarrow & {} a_{3} = [4.23, 4.50], \\ a_{2} = [2.50, 2.64]\rightarrow & {} a_{3} = [4.23, 4.50], \end{aligned}$$

where \(a_{2}(o_{8}^{7}) = 2.10, \ a_{2}(o_{15}^{8}) = 2.28, \ a_{2}(o_{17}^{10}) = 2.50,\) and \(a_{2}(o_{16}^{12}) = 2.64\). Inconsistent combined rules from collections \(\{o_{1}^{5}, o_{4}^{6}, o_{8}^{7}, o_{15}^{8}\}\) and \(\{o_{17}^{10}, o_{2}^{11}, o_{16}^{12}, o_{3}^{13}, o_{13}^{14}\}\) are

$$\begin{aligned} a_{2} = [1.97, 2.28]\rightarrow & {} a_{3} = [4.23, 4.50], \\ a_{2} = [2.50, 2.70]\rightarrow & {} a_{3} = [4.23, 4.50], \end{aligned}$$

where \(a_{2}(o_{1}^{5}) = 1.97\) and \(a_{2}(o_{13}^{14}) = 2.70\).

Example 7 shows that a combined rule has higher applicability than single rules. For example, under the consistent combined rule \(a_{2} = [2.10, 2.28] \rightarrow a_{3} = [4.23, 4.50]\), an object whose value of attribute \(a_{2}\) is 2.16 supports the rule, because 2.16 lies in [2.10, 2.28]. On the other hand, with the single rules \(a_{2} = 2.10 \rightarrow a_{3} = [4.23, 4.50]\) and \(a_{2} = 2.28 \rightarrow a_{3} = [4.23, 4.50]\), we cannot say which rule the object supports, because 2.16 is discernible from both 2.10 and 2.28 under threshold 0.05.
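The comparison in this paragraph can be checked with a small sketch. It assumes that two values of a continuous attribute are indiscernible when their absolute difference does not exceed the threshold, and that an object supports a combined rule when its value lies in the closed interval of the antecedent. The function names `indiscernible` and `supports_combined` are hypothetical.

```python
def indiscernible(x, y, delta):
    """Assumed threshold test: x and y are indiscernible when |x - y| <= delta."""
    return abs(x - y) <= delta


def supports_combined(value, interval):
    """An object supports a combined rule when its value lies in [l, u]."""
    low, high = interval
    return low <= value <= high


value = 2.16   # value of a_2 for the new object
delta = 0.05   # threshold for a_2

in_combined = supports_combined(value, (2.10, 2.28))
matches_single = [indiscernible(value, v, delta) for v in (2.10, 2.28)]

print(in_combined)      # the combined rule applies
print(matches_single)   # neither single rule applies
```

The combined rule covers the value 2.16, whereas neither of the two single rules does, since the distances to 2.10 and 2.28 both exceed 0.05.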

5 Rule Induction in Incomplete and Continuous Information Tables

When O is specified by restriction \(B = X\), the following holds for rules induced from objects in the approximations:

  • Object \(o \in C\underline{apr}_{A}(O)\) certainly supports rule \(A = A(o) \rightarrow B = X\) consistently.

  • Object \(o \in C\overline{apr}_{A}(O)\) certainly supports rule \(A = A(o) \rightarrow B = X\) inconsistently.

  • Object \(o \in P\underline{apr}_{A}(O)\) possibly supports \(A = A(o) \rightarrow B = X\) consistently.

  • Object \(o \in P\overline{apr}_{A}(O)\) possibly supports \(A = A(o) \rightarrow B = X\) inconsistently.

We create combined rules from these single rules. Let \(U^{C}_{a}\) be the set of objects with complete and continuous information for attribute a, and let \(U^{I}_{a}\) be the set of objects with incomplete and continuous information for a. For a set A of attributes,

$$\begin{aligned} U^{C}_{A} = \cap _{a \in A}U^{C}_{a}, \end{aligned}$$
(35)
$$\begin{aligned} U^{I}_{A} = \cup _{a \in A}U^{I}_{a}. \end{aligned}$$
(36)

A combined rule is represented by:

$$\begin{aligned} (A = [l_{A}, u_{A}] \rightarrow B = X) = (\wedge _{a \in A} (a = [l_{a}, u_{a}]) \rightarrow B = X). \end{aligned}$$
(37)

The following treatment is applied to each attribute \(a \in A\). Objects \(o \in U^{C}_{a}\) are arranged in ascending order of a(o) and given serial superscripts from 1 to \(|U^{C}_{a}|\). The objects in \((C\underline{apr}_{A}(O) \cap U^{C}_{a})\), \((C\overline{apr}_{A}(O) \cap U^{C}_{a})\), \((P\underline{apr}_{A}(O) \cap U^{C}_{a})\), and \((P\overline{apr}_{A}(O) \cap U^{C}_{a})\) are each arranged in ascending order of their values of attribute a. Each set is then expressed as a collection of objects with serial superscripts, such as \(\{\cdots , o_{i_{h}}^{h}, o_{i_{h+1}}^{h+1}, \cdots , o_{i_{k-1}}^{k-1}, o_{i_{k}}^{k}, \cdots \}\) \((h \le k)\). From a serial collection \(\{o_{i_{h}}^{h}, o_{i_{h+1}}^{h+1}, \cdots , o_{i_{k-1}}^{k-1}, o_{i_{k}}^{k}\}\), the antecedent part for a of the combined rule \(A = [l_{A}, u_{A}] \rightarrow B = X\) is created. For a certain and consistent combined rule,

$$\begin{aligned} l_{a}= \min (a(o_{i_{h}}^{h}), \min _{Y} e) \ \text{ and }~ u_{a} = \max (a(o_{i_{k}}^{k}), \max _{Y} e), \nonumber \\ Y = \left\{ \begin{array}{ll} e< a(o_{i_{k+1}}^{k+1}), &{} \,\text{ for }~h = 1 \wedge k \ne |U^{C}_{a}| \\ a(o_{i_{h-1}}^{h-1})< e< a(o_{i_{k+1}}^{k+1}), &{} \, \text{ for }~h \ne 1 \wedge k \ne |U^{C}_{a}| \\ a(o_{i_{h-1}}^{h-1}) < e, &{} \, \text{ for }~h \ne 1 \wedge k = |U^{C}_{a}| \end{array} \right. \nonumber \\ \text{ with } \, e \in a(o') \wedge o' \in Z, \end{aligned}$$
(38)

where Z is \((C\underline{apr}_{A}(O) \cap U^{I}_{a})\).

In the case of certain and inconsistent, possible and consistent, and possible and inconsistent combined rules, Z is \((C\overline{apr}_{A}(O) \cap U^{I}_{a})\), \((P\underline{apr}_{A}(O) \cap U^{I}_{a})\), and \((P\overline{apr}_{A}(O) \cap U^{I}_{a})\), respectively.
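A minimal sketch of formula (38) for one attribute a and one serial collection is given below. It rests on several assumptions that are not taken from the text: the possible values a(o') of each object in Z are given as a finite list of candidates, the case \(h = 1 \wedge k = |U^{C}_{a}|\), which formula (38) does not list, is treated as imposing no restriction on e, and all numbers are hypothetical toy data. The function name `combined_interval` is also hypothetical.

```python
import math


def combined_interval(values, h, k, incomplete):
    """Sketch of formula (38) for a single attribute.

    values:     a(o) for the objects in U^C_a, sorted ascending (rank 1 is index 0)
    h, k:       1-based ranks of the first and last objects of a serial collection
    incomplete: one list of candidate values per object in Z (assumed finite)
    """
    # Window Y: strictly between the neighbours just outside the run.
    # A missing neighbour is treated as no bound (an assumption of this sketch).
    left = values[h - 2] if h > 1 else -math.inf
    right = values[k] if k < len(values) else math.inf

    # Candidate values e of the objects in Z that fall inside the window.
    inside = [e for cand in incomplete for e in cand if left < e < right]

    l_a = min([values[h - 1]] + inside)
    u_a = max([values[k - 1]] + inside)
    return l_a, u_a


# Hypothetical toy data: four complete objects and two incomplete ones.
values = [1.0, 2.0, 2.1, 3.0]
incomplete = [[1.5, 2.5], [0.5]]

interval = combined_interval(values, 2, 3, incomplete)
print(interval)
```

In this toy case the run of ranks 2 and 3 alone would give [2.0, 2.1]. The candidates 1.5 and 2.5 lie strictly between the neighbouring values 1.0 and 3.0 and therefore widen the interval, whereas 0.5 lies outside the window and is ignored.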

Proposition 8

Let \(C\underline{r}\) be the set of combined rules induced from \(C\underline{apr}_{A}(O)\) and \(P\underline{r}\) the set from \(P\underline{apr}_{A}(O)\). When O is specified by restriction \(B = X\), if \((A = [l_{A}, u_{A}] \rightarrow B = X) \in C\underline{r}\), then \(\exists l'_{A} \le l_{A}, \exists u'_{A} \ge u_{A} \ (A = [l'_{A}, u'_{A}] \rightarrow B = X) \in P\underline{r}\).

Proof

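A single rule created from \(C\underline{apr}_{A}(O)\) is also derived from \(P\underline{apr}_{A}(O)\) because \(C\underline{apr}_{A}(O) \subseteq P\underline{apr}_{A}(O)\). Hence every serial collection that yields a combined rule in \(C\underline{r}\) is contained in a serial collection that yields a combined rule in \(P\underline{r}\), whose interval is at least as wide. This means that the proposition holds.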

Proposition 9

Let \(C\overline{r}\) be the set of combined rules induced from \(C\overline{apr}_{A}(O)\) and \(P\overline{r}\) the set from \(P\overline{apr}_{A}(O)\). When O is specified by restriction \(B = X\), if \((A = [l_{A}, u_{A}] \rightarrow B = X) \in C\overline{r}\), then \(\exists l'_{A} \le l_{A}, \exists u'_{A} \ge u_{A} \ (A = [l'_{A}, u'_{A}] \rightarrow B = X) \in P\overline{r}\).

Proof

The proof is similar to that of Proposition 8.

Proposition 10

Let \(C\underline{r}\) be the set of combined rules induced from \(C\underline{apr}_{A}(O)\) and \(C\overline{r}\) the set from \(C\overline{apr}_{A}(O)\). When O is specified by restriction \(B = X\), if \((A = [l_{A}, u_{A}] \rightarrow B = X) \in C\underline{r}\), then \(\exists l'_{A} \le l_{A}, \exists u'_{A} \ge u_{A}\) \( \ (A = [l'_{A}, u'_{A}] \rightarrow B = X) \in C\overline{r}\).

Proof

The proof is similar to that of Proposition 8.

Proposition 11

Let \(P\underline{r}\) be the set of combined rules induced from \(P\underline{apr}_{A}(O)\) and \(P\overline{r}\) the set from \(P\overline{apr}_{A}(O)\). When O is specified by restriction \(B = X\), if \((A = [l_{A}, u_{A}] \rightarrow B = X) \in P\underline{r}\), then \(\exists l'_{A} \le l_{A}, \exists u'_{A} \ge u_{A} \ (A = [l'_{A}, u'_{A}] \rightarrow B = X) \in P\overline{r}\).

Proof

The proof is similar to that of Proposition 8.

Fig. 5 Information table IT2 containing incomplete information

Example 8

Let O be specified by restriction \(a_{4}=x\) in IT2 of Fig. 5.

$$\begin{aligned} CO_{a_{4} = x}= & {} \{o_{2}, o_{5}, o_{9}, o_{11}, o_{14}, o_{16}, o_{20} \}, \\ PO_{a_{4} = x}= & {} \{o_{1}, o_{2}, o_{5}, o_{9}, o_{11}, o_{14}, o_{16}, o_{17}, o_{19}, o_{20}\}. \end{aligned}$$

Each \(C[o_{i}]_{a_{1}}\) with \(i =1,\dots ,20\) is, respectively,

$$\begin{aligned}&C[o_{1}]_{a_{1}} = \{o_1,o_{10}\}, C[o_{2}]_{a_{1}} = \{o_{2},o_{11},o_{16},o_{17}\}, \\&C[o_{3}]_{a_{1}} = \{o_{3}\}, C[o_{4}]_{a_{1}} = \{o_{4}\}, C[o_{5}]_{a_{1}} = \{o_{5}, o_{20}\}, \\&C[o_{6}]_{a_{1}} = \{o_{6},o_{10},o_{15}\}, C[o_{7}]_{a_{1}} = \{o_{7}\}, \\&C[o_{8}]_{a_{1}} = \{o_{8}\}, C[o_{9}]_{a_{1}} = \{o_{9}\}, \\&C[o_{10}]_{a_{1}} = \{o_{1},o_{6},o_{10},o_{14},o_{15} \}, \\&C[o_{11}]_{a_{1}} = \{o_{2},o_{11},o_{16}\}, C[o_{12}]_{a_{1}} = \{o_{12}\}, \\&C[o_{13}]_{a_{1}} = \{o_{13},o_{19}\}, C[o_{14}]_{a_{1}} = \{o_{10},o_{14}\}, \\&C[o_{15}]_{a_{1}} = \{o_{6},o_{10},o_{15}\}, C[o_{16}]_{a_{1}} = \{o_{2},o_{11},o_{16}\}, \\&C[o_{17}]_{a_{1}} = \{o_{2},o_{17}\}, C[o_{18}]_{a_{1}} = \{o_{18}\}, \\&C[o_{19}]_{a_{1}} = \{o_{13},o_{19}\}, C[o_{20}]_{a_{1}} = \{o_{5}, o_{20}\}. \end{aligned}$$

Each \(P[o_{i}]_{a_{1}}\) with \(i =1,\dots ,20\) is, respectively,

$$\begin{aligned}&P[o_{1}]_{a_{1}} = \{o_{1}, o_{6}, o_{10}, o_{14}, o_{15} \}, \\&P[o_{2}]_{a_{1}} = \{o_{2}, o_{9}, o_{11}, o_{16}, o_{17}\}, \\&P[o_{3}]_{a_{1}} = \{o_{3}\}, P[o_{4}]_{a_{1}} = \{o_{4}\}, P[o_{5}]_{a_{1}} = \{o_{5}, o_{20}\}, \\&P[o_{6}]_{a_{1}} = \{o_{1}, o_{6}, o_{10}, o_{15}\}, P[o_{7}]_{a_{1}} = \{o_{7}\}, P[o_{8}]_{a_{1}} = \{o_{8}\}, \\&P[o_{9}]_{a_{1}} = \{o_{2}, o_{9}, o_{11}, o_{16}, o_{17}\}, \\&P[o_{10}]_{a_{1}} = \{o_{1}, o_{6}, o_{10}, o_{14}, o_{15} \}, \\&P[o_{11}]_{a_{1}} = \{o_{2}, o_{9}, o_{11}, o_{16}, o_{17}\}, P[o_{12}]_{a_{1}} = \{o_{12}\}, \\&P[o_{13}]_{a_{1}} = \{o_{13},o_{19}\}, P[o_{14}]_{a_{1}} = \{o_{1}, o_{10}, o_{14}\}, \\&P[o_{15}]_{a_{1}} = \{o_{1}, o_{6}, o_{10}, o_{15} \}, \\&P[o_{16}]_{a_{1}} = \{o_{2}, o_{9}, o_{11}, o_{16}, o_{17}\}, \\&P[o_{17}]_{a_{1}} = \{o_{1}, o_{2}, o_{9}, o_{11}, o_{16}, o_{17}\}, P[o_{18}]_{a_{1}} = \{o_{18}\}, \\&P[o_{19}]_{a_{1}} = \{o_{13},o_{19}\}, P[o_{20}]_{a_{1}} = \{o_{5}, o_{20}\}. \end{aligned}$$

The four approximations are:

$$\begin{aligned} C\underline{apr}_{a_{1}}(O)= & {} \{o_{5}, o_{20}\}, \\ P\underline{apr}_{a_{1}}(O)= & {} \{o_{2}, o_{5}, o_{9}, o_{11}, o_{16}, o_{17}, o_{20}\}, \\ C\overline{apr}_{a_{1}}(O)= & {} \{o_{2}, o_{5}, o_{9}, o_{10}, o_{11}, o_{14}, o_{16}, o_{17}, o_{20}\}, \\ P\overline{apr}_{a_{1}}(O)= & {} \{o_{1}, o_{2}, o_{5}, o_{6}, o_{9}, o_{10}, o_{11}, o_{13}, o_{14}, o_{15}, o_{16}, o_{17},o_{19}, o_{20}\}. \end{aligned}$$

The sets of objects with complete information and with incomplete information for attribute \(a_{1}\) are:

$$\begin{aligned} U_{a_{1}}^{C}= & {} \{o_{2}, o_{3}, o_{4}, o_{5}, o_{6}, o_{7}, o_{8}, o_{10}, o_{12}, o_{13}, o_{14}, o_{15}, o_{16}, o_{20} \}, \\ U_{a_{1}}^{I}= & {} \{o_{1}, o_{9}, o_{11}, o_{17}, o_{18}, o_{19} \}. \end{aligned}$$

Objects in \(U_{a_{1}}^{C}\) are arranged in ascending order of \(a_{1}(o)\) as follows:

$$\begin{aligned} o_{3}, o_{12}, o_{7}, o_{2}, o_{16}, o_{6}, o_{15}, o_{10}, o_{14}, o_{5}, o_{20}, o_{13}, o_{8}, o_{4}. \end{aligned}$$

Giving serial superscripts to these objects yields

$$\begin{aligned} o_{3}^{1}, o_{12}^{2}, o_{7}^{3}, o_{2}^{4}, o_{16}^{5}, o_{6}^{6}, o_{15}^{7}, o_{10}^{8}, o_{14}^{9}, o_{5}^{10}, o_{20}^{11}, o_{13}^{12}, o_{8}^{13}, o_{4}^{14}. \end{aligned}$$

The four approximations are then rewritten as:

$$\begin{aligned} C\underline{apr}_{a_{1}}(O)= & {} \{o_{5}^{10}, o_{20}^{11} \}, \\ P\underline{apr}_{a_{1}}(O)= & {} \{o_{2}^{4}, o_{16}^{5}, o_{5}^{10}, o_{20}^{11}, o_{9},o_{11},o_{17}\}, \\ C\overline{apr}_{a_{1}}(O)= & {} \{o_{2}^{4}, o_{16}^{5}, o_{10}^{8}, o_{14}^{9}, o_{5}^{10}, o_{20}^{11}, o_{9},o_{11},o_{17}\}, \\ P\overline{apr}_{a_{1}}(O)= & {} \{o_{2}^{4}, o_{16}^{5}, o_{6}^{6}, o_{15}^{7}, o_{10}^{8}, o_{14}^{9}, o_{5}^{10}, o_{20}^{11}, o_{13}^{12}, o_{1},o_{9},o_{11}, o_{17},o_{19}\}. \end{aligned}$$

The objects are separated into two parts: those with a superscript, which have complete information for attribute \(a_{1}\), and those with only a subscript, which have incomplete information for \(a_{1}\). That is,

$$\begin{aligned} C\underline{apr}_{a_{1}}(O) \cap U^{C}_{a_{1}}= & {} \{o_{5}^{10}, o_{20}^{11}\}, \\ C\underline{apr}_{a_{1}}(O) \cap U^{I}_{a_{1}}= & {} \emptyset , \\ P\underline{apr}_{a_{1}}(O) \cap U^{C}_{a_{1}}= & {} \{o_{2}^{4}, o_{16}^{5}, o_{5}^{10}, o_{20}^{11}\}, \\ P\underline{apr}_{a_{1}}(O) \cap U^{I}_{a_{1}}= & {} \{o_{9},o_{11},o_{17}\}, \\ C\overline{apr}_{a_{1}}(O) \cap U^{C}_{a_{1}}= & {} \{o_{2}^{4}, o_{16}^{5}, o_{10}^{8}, o_{14}^{9}, o_{5}^{10}, o_{20}^{11}\}, \\ C\overline{apr}_{a_{1}}(O) \cap U^{I}_{a_{1}}= & {} \{o_{9},o_{11},o_{17}\}, \\ P\overline{apr}_{a_{1}}(O) \cap U^{C}_{a_{1}}= & {} \{o_{2}^{4}, o_{16}^{5}, o_{6}^{6}, o_{15}^{7}, o_{10}^{8}, o_{14}^{9}, o_{5}^{10}, o_{20}^{11}, o_{13}^{12}\}, \\ P\overline{apr}_{a_{1}}(O) \cap U^{I}_{a_{1}}= & {} \{o_{1},o_{9},o_{11},o_{17},o_{19}\}. \end{aligned}$$
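The separation above can be reproduced with a short sketch that uses only the sets and the ordering given in this example. Object \(o_{i}\) is represented by its index i; the dictionary keys and the helper name `split` are illustrative, not notation from the text.

```python
# Objects of U^C_{a_1} in ascending order of a_1, as listed above.
order = [3, 12, 7, 2, 16, 6, 15, 10, 14, 5, 20, 13, 8, 4]
rank = {o: i for i, o in enumerate(order, start=1)}   # serial superscripts
U_C = set(order)

# The four approximations of this example.
approximations = {
    "C_lower": {5, 20},
    "P_lower": {2, 5, 9, 11, 16, 17, 20},
    "C_upper": {2, 5, 9, 10, 11, 14, 16, 17, 20},
    "P_upper": {1, 2, 5, 6, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20},
}


def split(approx):
    """Return (complete part as (superscript, object) pairs, incomplete part)."""
    complete = sorted((rank[o], o) for o in approx if o in U_C)
    incomplete = sorted(o for o in approx if o not in U_C)
    return complete, incomplete


parts = {name: split(objs) for name, objs in approximations.items()}
for name, (complete, incomplete) in parts.items():
    print(name, complete, incomplete)
```

Each pair in the complete part is (superscript, object index), so (10, 5) stands for \(o_{5}^{10}\).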

From these expressions and formula (38), four kinds of combined rules are created. A certain and consistent rule is:

$$\begin{aligned} a_{1} = 4.43 \rightarrow a_{4} = x. \end{aligned}$$

Possible and consistent rules are:

$$\begin{aligned} a_{1}= & {} [3.90,3.98] \rightarrow a_{4} = x, \\ a_{1}= & {} [4.23,4.43] \rightarrow a_{4} = x. \end{aligned}$$

Certain and inconsistent rules are:

$$\begin{aligned} a_{1}= & {} [3.90,3.98] \rightarrow a_{4} = x, \\ a_{1}= & {} [4.08,4.43] \rightarrow a_{4} = x. \end{aligned}$$

A possible and inconsistent rule is:

$$\begin{aligned} a_{1} = [3.90,4.93] \rightarrow a_{4} = x. \end{aligned}$$

6 Conclusions

We have described rough sets that consist of lower and upper approximations and rule induction from the rough sets in continuous information tables.

First, we have handled complete and continuous information tables. Rough sets are derived directly using the indiscernibility relation on a set of attributes.

Second, we have coped with incomplete and continuous information tables under possible world semantics. We use a possible indiscernibility relation as a possible world, because the number of possible indiscernibility relations is finite, whereas the number of possible tables, which are traditionally used under possible world semantics, is infinite. The family of possible indiscernibility relations has a lattice structure with minimum and maximum elements. The families of lower and upper approximations derived from the possible indiscernibility relations also have a lattice structure under set inclusion. The approximations are obtained by using the minimum and the maximum possible indiscernibility relations. Therefore, computational complexity poses no difficulty with respect to the number of attribute values with incomplete information, even though the number of possible indiscernibility relations increases exponentially as that number grows linearly.

Consequently, we derive four kinds of approximations. These approximations are the same as those obtained from an extended approach directly using indiscernibility relations. Therefore, this justifies the extended approach in our previous work.

From these approximations, we derive four kinds of single rules that are supported by individual objects. These single rules have weak applicability. To improve the applicability, we have merged serial single rules into one combined rule. The combined rule has greater applicability than the single rules from which it is created.