
1 Introduction

Fuzzy rough sets skillfully combine rough sets, which deal with classification uncertainty, and fuzzy sets, which deal with boundary uncertainty, in order to handle various types of data [1]. Subsequently, many scholars extended and developed fuzzy rough sets [2,3,4,5,6,7]. To deal with multimodality attributes, Hu proposed multikernel fuzzy rough sets [8], in which different kernel functions are used to process attributes of different modalities.

In fact, semantic hierarchies exist among most data types [9]. Chen et al. constructed a decision tree among classes with a tree-like structure [10]. Inspired by fuzzy rough set theory, Wang proposed deep fuzzy trees [11]. Zhao embedded the hierarchical structure into fuzzy rough sets [12]. Qiu proposed a fuzzy rough set method that uses the Hausdorff distance between sample sets for hierarchical feature selection [13].

The expansion and updating of data will also cause data loss. Therefore, in a multimodality incomplete information system, an incremental algorithm for updating the approximations is particularly important. Zeng proposed incremental updating algorithms for the variation of the object set [14] and of attribute values, based on a hybrid distance [6]. Dong proposed an incremental algorithm for attribute reduction when samples and attributes change simultaneously [15]. Huang proposed a multi-source hybrid rough set model (MCRS) [16] and studied a matrix-based incremental mechanism under the variation of objects, attribute sets, and attribute values. However, an incremental algorithm for multikernel fuzzy rough sets based on hierarchical classification has not yet been considered.

This paper is organized as follows: Sect. 2 introduces the basic knowledge. In Sect. 3, the hierarchical structure among decision attribute values is incorporated into multikernel fuzzy rough sets, the tree-based hierarchical class structure is introduced, and the upper and lower approximations of all nodes are proposed. In Sect. 4, an incremental updating algorithm of the upper and lower approximations for the immigration and emigration of a single object is proposed, and corresponding examples are given for illustration. The paper ends with conclusions and further research topics in Sect. 5.

2 An Introduction to Fuzzy Rough Sets

In this section, some related content of fuzzy rough sets will be briefly introduced.

Definition 1

[17]. Given a fuzzy approximation space \(\left\{ {U,R} \right\}\), \(\forall x,y,z \in U\). If R satisfies:

Reflexivity: \(R\left( {x,x} \right) = 1\);

Symmetry: \(R\left( {x,y} \right) = R\left( {y,x} \right)\);

Min-max transitivity: \(min \left( {R\left( {x,y} \right),R\left( {y,z} \right)} \right) \le R\left( {x,z} \right)\).

Then R is said to be a fuzzy equivalence relation on U.

Definition 2

[4]. Given a fuzzy approximation space \(\left\{ {U,R} \right\}\), where R is a fuzzy equivalence relation on U and X is a fuzzy subset of U, the fuzzy lower and upper approximations of X are respectively defined as follows:

$$ \left\{ {\begin{array}{*{20}l} {\underline{{R_{S} }} X(x) = \mathop {inf }\limits_{y \in U} S\left( {N\left( {R\left( {x,y} \right)} \right),X\left( y \right)} \right)} \hfill \\ {\overline{{R_{T} }} X(x) = \mathop {sup }\limits_{y \in U} T\left( {R\left( {x,y} \right),X\left( y \right)} \right)} \hfill \\ \end{array} } \right. $$
(1)

T and S are a triangular norm and a triangular conorm, respectively, and N is a monotonically decreasing negation function.
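For illustration, the following is a minimal NumPy sketch of Eq. (1) under the common operator choices T = min, S = max, and N(a) = 1 - a; these particular operators are an assumption of the sketch, since the definition admits any t-norm, t-conorm, and negator.

```python
import numpy as np

def fuzzy_approximations(R, X):
    """Eq. (1) with T = min, S = max and N(a) = 1 - a (assumed operators).

    R: n x n fuzzy-relation matrix; X: length-n membership vector."""
    # Lower approximation: inf_y S(N(R(x, y)), X(y)) for every x.
    lower = np.min(np.maximum(1.0 - R, X[np.newaxis, :]), axis=1)
    # Upper approximation: sup_y T(R(x, y), X(y)) for every x.
    upper = np.max(np.minimum(R, X[np.newaxis, :]), axis=1)
    return lower, upper
```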

In a multimodality information system, the attributes of samples are multimodal, and multikernel learning is an effective method that often uses different kernel functions to extract information from different attributes [8]. At the same time, there may also be unknown values among the attribute values. In this paper, the case where unknown attribute values exist is considered in the multimodality information system.

Definition 3

[6]. Given a multimodality incomplete information system \(\left\{ {U,MC} \right\}\), where MC is a set of multimodality conditional attributes that may contain unknown values, \(\forall x,y \in U\), \(\forall M \in MC\). \(M\left( x \right)\) and \(M\left( y \right)\) are the values of attribute M for x and y, respectively, and unknown values are marked as “?”. The similarity relationship is extracted from such data using a matching kernel:

$$ K\left( {M\left( x \right),M\left( y \right)} \right) = \left\{ {\begin{array}{*{20}l} 1 \hfill & {M\left( x \right) = M\left( y \right)\;or\;M\left( x \right) = ?\;or\;M\left( y \right) = ?} \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right. $$
(2)

It is easy to prove that this kernel function satisfies Definition 1. Therefore, the calculated similarity relation is a fuzzy equivalence relation.
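A hedged one-function sketch of the matching kernel, under the reading of Eq. (2) in which an unknown value “?” matches any value:

```python
def matching_kernel(mx, my, unknown="?"):
    """Eq. (2): attribute values are fully similar (1) when they agree
    or when either value is unknown, and dissimilar (0) otherwise."""
    return 1.0 if mx == unknown or my == unknown or mx == my else 0.0
```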

Definition 4

[8]. Given a \(\left\{ {U,MC} \right\}\), MC divides the attributes of the multimodality information system into p subsets, referred to as \(MC = \left\{ {M_{1} ,M_{2} ,...,M_{p} } \right\}\), \(M_{i} \in MC\), where \(K_{i}\) is the fuzzy similarity relation computed from the single attribute subset \(M_{i}\). \(\forall x,y \in U\), the fuzzy similarity relation based on combination kernels is defined as follows:

$$ K_{{T_{\cos } }} \left( {x,y} \right) = max \left( {\prod\limits_{i = 1}^{p} {K_{i} \left( {x,y} \right)} - \prod\limits_{i = 1}^{p} {\sqrt {1 - K_{i} \left( {x,y} \right)^{2} } } ,0} \right) $$
(3)

Example 1.

Given a \(\left\{ {U,MC,D} \right\}\), there are eight objects, \(MC = \left\{ {K_{1} ,K_{2} ,K_{3} ,K_{4} } \right\}\), \(D = \left\{ {d_{1} ,d_{2} ,d_{3} ,d_{4} ,d_{5} } \right\}\). The details are shown in Table 1.

Table 1. Dataset used in the example

In Table 1, there are three types of conditional attribute values: numerical, categorical, and unknown. The table can therefore be regarded as a multimodality incomplete decision system, and different kernel functions are used to extract the fuzzy similarity relationships of the different attributes.

Taking \(x_{1}\) and \(x_{2}\) as an example, the fuzzy similarity relationship based on combination kernels is calculated:

$$ K_{1} \left( {x_{1} ,x_{2} } \right) = 0.990,K_{2} \left( {x_{1} ,x_{2} } \right) = 0.930,K_{3} \left( {x_{1} ,x_{2} } \right) = 0.965,K_{4} \left( {x_{1} ,x_{2} } \right) = 1. $$
$$ K_{{T_{\cos } }} \left( {x_{1} ,x_{2} } \right) = max (\prod\limits_{i = 1}^{4} {K_{i} \left( {x_{1} ,x_{2} } \right)} - \prod\limits_{i = 1}^{4} {\sqrt {1 - K_{i} \left( {x_{1} ,x_{2} } \right)^{2} } } ,0) = 0.888. $$
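A minimal sketch of Eq. (3) reproduces this value; the function name is illustrative.

```python
import math

def combination_kernel(ks):
    """Eq. (3): combination kernel K_Tcos from single-attribute kernel values."""
    prod_k = math.prod(ks)
    prod_c = math.prod(math.sqrt(1.0 - k * k) for k in ks)
    return max(prod_k - prod_c, 0.0)

# K_4 = 1 makes the second product vanish, so the result is just
# 0.990 * 0.930 * 0.965, which rounds to 0.888.
print(round(combination_kernel([0.990, 0.930, 0.965, 1.0]), 3))  # 0.888
```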

In the same way, the fuzzy similarity relationships based on combination kernels among the other objects can be obtained, giving the fuzzy similarity matrix \(K_{{T_{\cos } }} \left( {x,y} \right)\) as follows:

$$ K_{{T_{\cos } }} \left( {x,y} \right) = \left( {\begin{array}{*{20}c} 1 & {0.888} & 0 & 0 & {0.918} & {0.863} & {0.001} & {0.973} \\ {0.888} & 1 & {0.855} & {0.824} & {0.965} & {0.918} & {0.001} & {0.956} \\ 0 & {0.855} & 1 & {0.779} & {0.956} & 0 & 0 & 0 \\ 0 & {0.824} & {0.779} & 1 & {0.827} & 0 & 0 & 0 \\ {0.918} & {0.965} & {0.956} & {0.827} & 1 & {0.983} & {0.001} & {0.965} \\ {0.863} & {0.918} & 0 & 0 & {0.983} & 1 & {0.002} & {0.906} \\ {0.001} & {0.001} & 0 & 0 & {0.001} & {0.002} & 1 & {0.001} \\ {0.973} & {0.956} & 0 & 0 & {0.965} & {0.906} & {0.001} & 1 \\ \end{array} } \right) $$

3 Multikernel Fuzzy Rough Sets Based on Hierarchical Classification

Given a multimodality incomplete decision system, in addition to the multimodality attributes of objects, there may be hierarchical relationships among the decision attribute values. Hierarchical classification is mostly based on a tree structure. Multikernel fuzzy rough sets based on hierarchical classification take the hierarchical relationships of the decision attribute values into account within fuzzy rough sets.

Definition 5

[4]. Given a \(\left\{ {U,MC} \right\}\), \(K_{{T_{\cos } }}\) is a fuzzy equivalence relation based on combination kernels and X is a fuzzy subset of U. The approximations are respectively defined as follows:

$$ \left\{ {\begin{array}{*{20}l} {\underline{{K_{T} }} X(x) = \mathop {inf }\limits_{y \in U} S\left( {N\left( {K_{{T_{\cos } }} \left( {x,y} \right)} \right),X\left( y \right)} \right)} \hfill \\ {\overline{{K_{T} }} X(x) = \mathop {sup }\limits_{y \in U} T\left( {K_{{T_{\cos } }} \left( {x,y} \right),X\left( y \right)} \right)} \hfill \\ \end{array} } \right. $$
(4)

Definition 6

[12]. \(\left\{ {U,MC,D_{tree} } \right\}\) is a multimodality decision system based on hierarchical classification, where \(D_{tree}\) is a decision attribute based on hierarchical classification that divides U into q subsets, referred to as \(U/D_{tree} = \left\{ {d_{1} ,...,d_{q} } \right\}\). \(sib\left( {d_{i} } \right)\) denotes the sibling nodes of \(d_{i}\). \(\forall x \in U\), if \(x \in sib\left( {d_{i} } \right)\), then \(d_{i} \left( x \right) = 0\); else \(d_{i} \left( x \right) = 1\). The approximations of the decision class \(d_{i}\) are defined as follows:

$$ \left\{ {\begin{array}{*{20}l} {\underline{{K_{T} }}_{sibling} d_{i} \left( x \right) = \mathop {inf }\limits_{{y \in \left\{ {sib(d_{i} )} \right\}}} \{ \sqrt {1 - K_{{T_{\cos } }}^{2} \left( {x,y} \right)} \} } \hfill \\ {\overline{{K_{T} }}_{sibling} d_{i} \left( x \right) = \mathop {sup }\limits_{{y \in d_{i} }} \{ K_{{T_{\cos } }} \left( {x,y} \right)\} } \hfill \\ \end{array} } \right. $$
(5)

Proposition 1.

This paper extends the approximations of the decision classes to arbitrary nodes. When the decision class is a non-leaf node, the upper approximation is obtained by taking the least upper bound of the upper approximations of its child nodes. leaf(d) denotes the set of nodes with no children, and the child nodes of \(d_{i}\) are marked as \(d_{ich} = \left\{ {d_{i1} ,d_{i2} ,...,d_{ik} } \right\}\), where k is the number of child nodes. There are:

$$ \underline{{K_{T} }}_{sibling} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\mathop {inf }\limits_{{y \in \left\{ {sib\left( {d_{i} } \right)} \right\}}} \left\{ {\sqrt {1 - K_{{T_{\cos } }}^{2} \left( {x,y} \right)} } \right\}} \hfill & {sib\left( {d_{i} } \right) \ne \emptyset } \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right. $$
(6)
$$ \overline{{K_{T} }}_{sibling} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\mathop {sup }\limits_{{y \in d_{i} }} \left\{ {K_{{T_{\cos } }} \left( {x,y} \right)} \right\}} \hfill & {d_{i} \in leaf(d)} \hfill \\ {sup \left\{ {\overline{{K_{T} }}_{sibling} d_{ich} \left( x \right)} \right\}} \hfill & {else} \hfill \\ \end{array} } \right. $$
(7)
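A minimal sketch of the recursion in Eqs. (6) and (7), assuming the class tree and class memberships are stored in plain dictionaries; the container names are illustrative, not the paper's notation.

```python
import math

def lower_sibling(K, sib_objs, x):
    """Eq. (6): infimum over the objects of the sibling classes of d_i,
    or 0 when d_i has no sibling."""
    if not sib_objs:
        return 0.0
    return min(math.sqrt(1.0 - K[x][y] ** 2) for y in sib_objs)

def upper_sibling(K, children, members, d, x):
    """Eq. (7): supremum over the class's own objects at a leaf node,
    otherwise the least upper bound over the child classes."""
    if not children.get(d):  # leaf node
        return max(K[x][y] for y in members[d])
    return max(upper_sibling(K, children, members, c, x) for c in children[d])
```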

Given a \(\left\{ {U,MC,D_{tree} } \right\}\), the algorithm for computing the lower and upper approximations is given as Algorithm 1.

[Algorithm 1: computing the lower and upper approximations of all decision classes]

Example 2.

On the basis of Example 1, the decision attribute values are divided into five subsets, namely \(d_{1}\), \(d_{2}\), \(d_{3}\), \(d_{4}\), \(d_{5}\). From Table 1, \(d_{1} = \left\{ {x_{4} ,x_{6} } \right\}\), and from Fig. 1, \(sib(d_{1} ) = d_{4} = \left\{ {x_{1} ,x_{8} } \right\}\). According to Proposition 1, the lower and upper approximations of the decision class are calculated as follows:

$$ \underline{{K_{T} }}_{sibling} d_{1} \left( {x_{2} } \right) = \mathop {inf }\limits_{{y \in \left\{ {x_{1} ,x_{8} } \right\}}} \left\{ {\sqrt {1 - K_{{T_{\cos } }}^{2} \left( {x_{2} ,y} \right)} } \right\} = min \left\{ {0.460,0.293} \right\} = 0.293 $$
$$ \underline{{K_{T} }}_{sibling} d_{1} \left( {x_{1} } \right) = 0,\underline{{K_{T} }}_{sibling} d_{1} \left( {x_{3} } \right) = 0,\underline{{K_{T} }}_{sibling} d_{1} \left( {x_{4} } \right) = 0,\underline{{K_{T} }}_{sibling} d_{1} \left( {x_{5} } \right) = 0.262. $$
$$ \underline{{K_{T} }}_{sibling} d_{1} \left( {x_{6} } \right) = 0.423,\underline{{K_{T} }}_{sibling} d_{1} \left( {x_{7} } \right) = 1,\underline{{K_{T} }}_{sibling} d_{1} \left( {x_{8} } \right) = 0. $$

Because \(d_{1}\) is a non-leaf node, according to Proposition 1 there are:

$$ \overline{{K_{T} }}_{sibling} d_{1} \left( {x_{1} } \right) = \mathop {sup }\limits_{{y \in d_{3} }} \left\{ {\overline{{K_{T} }}_{sibling} d_{3} \left( {x_{1} } \right)} \right\} = \sup \left\{ {K_{{T_{\cos } }} \left( {x_{1} ,x_{3} } \right)} \right\} = 0 $$
$$ \overline{{K_{T} }}_{sibling} d_{1} \left( {x_{2} } \right) = 0.855,\overline{{K_{T} }}_{sibling} d_{1} \left( {x_{3} } \right) = 1,\overline{{K_{T} }}_{sibling} d_{1} \left( {x_{4} } \right) = 0.778. $$
$$ \overline{{K_{T} }}_{sibling} d_{1} \left( {x_{5} } \right) = 0.956,\overline{{K_{T} }}_{sibling} d_{1} \left( {x_{6} } \right) = 0,\overline{{K_{T} }}_{sibling} d_{1} \left( {x_{7} } \right) = 0,\overline{{K_{T} }}_{sibling} d_{1} \left( {x_{8} } \right) = 0. $$

So the lower and upper approximations of \(d_{1}\) are:

$$ \underline{{K_{T} }}_{sibling} d_{1} = \left\{ {0.293/x_{2} ,0.262/x_{5} ,0.423/x_{6} ,1/x_{7} } \right\} $$
$$ \overline{{K_{T} }}_{sibling} d_{1} = \left\{ {0.855/x_{2} ,1/x_{3} ,0.778/x_{4} ,0.956/x_{5} } \right\} $$

On the basis of Table 1, the tree-based hierarchical class structure is established, as shown in Fig. 1.

Fig. 1. The tree of decision classes

4 Incremental Updating of the Lower and Upper Approximations Under the Variation of a Single Object

A \(\left\{ {U,MC,D_{tree} } \right\}\) at time t is given, where \(\underline{K}_{{T_{sibling} }}^{\left( t \right)} X\) and \(\overline{K}_{{T_{sibling} }}^{(t)} X\) represent the lower and upper approximations, respectively. \(\left\{ {\overline{U} ,\overline{MC} ,\overline{D}_{tree} } \right\}\) represents the multimodality decision system based on hierarchical classification at time t + 1, and \(x^{ + }\) and \(x^{ - }\) represent the immigration and emigration of one object, respectively. The fuzzy lower and upper approximations at time t + 1 are denoted by \(\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} X\) and \(\overline{K}_{{T_{sibling} }}^{(t + 1)} X\), respectively.

4.1 Immigration of a Single Object

The object \(x^{ + }\) immigrates into the \(\left\{ {\overline{U} ,\overline{MC} ,\overline{D}_{tree} } \right\}\) at time t + 1, in which \(\overline{U} = U \cup \left\{ {x^{ + } } \right\}\). If no new decision class is generated at time t + 1, the tree-based hierarchical class structure does not need to be updated; otherwise, the tree is updated. Next, the incremental updating is discussed according to whether a new decision class is generated.

Proposition 2.

\(\forall d_{i} \in \overline{U} /\overline{D}_{tree}\), \(x \in \overline{U}\). Suppose \(x^{ + }\) generates a new decision class, marked as \(\overline{d}^{ + }_{n + 1}\), which needs to be inserted into the tree-based hierarchical class structure. Then the approximations of the decision classes are updated as follows:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\mathop {inf }\limits_{{y = x^{ + } }} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} ,\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \right\}} \hfill & {x^{ + } \in \left\{ {sib\left( {d_{i} } \right)} \right\}\,and} \hfill \\ {} \hfill & {\left\{ {sib\left( {d_{i} } \right)} \right\} \ne \overline{d}^{ + }_{n + 1} } \hfill \\ {\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \hfill & {x^{ + } \notin \left\{ {sib\left( {d_{i} } \right)} \right\}\,and\,x \ne x^{ + } } \hfill \\ {} \hfill & {} \hfill \\ {\mathop {inf }\limits_{{y \in \left\{ {sib\left( {d_{i} } \right)} \right\}}} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\}} \hfill & {else} \hfill \\ \end{array} } \right. $$
(8)
$$ \overline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}c} {\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} & {x \ne x^{ + } \, and \, d_{i} \in leaf\left( d \right)} \\ {\mathop {sup }\limits_{{y \in d_{i} }} K_{{_{{T_{\cos } }} }} \left( {x,y} \right)} & {x = x^{ + } \, and \, d_{i} \in leaf\left( d \right)} \\ {\sup \left\{ {\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{ich} \left( x \right)} \right\}} & {else} \\ \end{array} } \right. $$
(9)

Proof. For the lower approximation of \(d_{i}\), the approximation at time t + 1 is determined by the objects belonging to the sibling nodes of \(d_{i}\). When \(x \ne x^{ + }\) and \(x^{ + } \notin \left\{ {sib\left( {d_{i} } \right)} \right\}\), the lower approximation is the same as at time t; when \(x = x^{ + }\) or the sibling nodes of \(d_{i}\) at time t + 1 include the newly generated decision class, the lower approximation is calculated directly according to Proposition 1; when \(x^{ + } \in \left\{ {sib\left( {d_{i} } \right)} \right\}\) and none of the sibling nodes of \(d_{i}\) is a newly generated class, \(\forall x \ne x^{ + }\), there is:

$$ \begin{aligned} \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) & = \mathop {inf }\limits_{{y \in \left\{ {sib(d_{i} )} \right\}}} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\} \\ & = \mathop {inf }\limits_{{y \in \left\{ {sib\left( {d_{i} } \right) - x^{ + } } \right\} \cup x^{ + } }} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\} \\ & = \underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right) \wedge \mathop {inf }\limits_{{y = x^{ + } }} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\} \\ & = \mathop {inf }\limits_{{y = x^{ + } }} \left\{ {\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right),\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\} \\ \end{aligned} $$

For the upper approximation of \(d_{i}\), the upper approximation is determined by the objects that belong to \(d_{i}\). When \(d_{i}\) is a leaf node and \(x \ne x^{ + }\) at time t + 1, that is, the newly generated decision subset is not equal to \(d_{i}\), its upper approximation is the same as at time t. When \(d_{i}\) is a leaf node and \(x = x^{ + }\) at time t + 1, the approximation is computed directly according to Proposition 1. In the tree-based hierarchical class structure, decision classes with the same parent node belong to one major class, so \(d_{i}\) and \(d_{ich}\) belong to the same major class; when \(d_{i}\) is not a leaf node, we directly take the least upper bound over all child nodes of \(d_{i}\) to obtain the updated upper approximation.
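The efficiency of Proposition 2 lies in the first case of Eq. (8): only one new term has to be compared against the stored value, instead of recomputing the infimum over all sibling objects. A hedged sketch, with assumed argument names:

```python
import math

def update_lower_immigration(K, lower_t, x, x_plus):
    """First case of Eq. (8): x+ joined an existing sibling class of d_i,
    so the stored lower approximation is combined with a single new term."""
    return min(lower_t[x], math.sqrt(1.0 - K[x][x_plus] ** 2))
```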

Proposition 3.

\(\forall d_{i} \in \overline{U} /\overline{D}_{tree}\), \(x \in \overline{U}\). Suppose \(x^{ + }\) does not generate a new decision class; then the tree-based hierarchical class structure does not need to be updated. The approximations of the decision classes are updated as follows:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} (x) = \left\{ {\begin{array}{*{20}l} {\mathop {inf }\limits_{{y = x^{ + } }} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} ,\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \right\}} \hfill & {x \ne x^{ + }\, and} \hfill \\ {} \hfill & {x^{ + } \in \left\{ {sib\left( {d_{i} } \right)} \right\}} \hfill \\ {\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \hfill & {x \ne x^{ + }\, {{and\,}}} \hfill \\ {} \hfill & {x^{ + } \notin \left\{ {sib\left( {d_{i} } \right)} \right\}} \hfill \\ {\mathop {inf }\limits_{{y \in \left\{ {sib\left( {d_{i} } \right)} \right\}}} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} } \right\}} \hfill & {else} \hfill \\ \end{array} } \right. $$
(10)
$$ \overline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\overline{K}_{{T_{sibling} }}^{(t)} d_{i} \left( x \right)} \hfill & {x \ne x^{ + } , \, x^{ + } \notin d_{i} } \hfill \\ {} \hfill & {and \, d_{i} \in leaf\left( d \right)} \hfill \\ {\mathop {sup }\limits_{{y = x^{ + } }} \left\{ {K_{{_{{T_{\cos } }} }} \left( {x,y} \right),\overline{K}_{{T_{sibling} }}^{(t)} d_{i} \left( x \right)} \right\}} \hfill & {x \ne x^{ + } {, }x^{ + } \in d_{i} } \hfill \\ {} \hfill & {and \, d_{i} \in leaf\left( d \right)} \hfill \\ {\mathop {sup }\limits_{{y \in d_{i} }} \left\{ {K_{{_{{T_{\cos } }} }} \left( {x,y} \right)} \right\}} \hfill & {x = x^{ + } } \hfill \\ {} \hfill & {and\,d_{i} \in leaf\left( d \right)} \hfill \\ {\sup \left\{ {\overline{K}_{{T_{sibling} }}^{(t)} d_{ich} \left( x \right)} \right\}} \hfill & {else} \hfill \\ \end{array} } \right. $$
(11)

Proof. The proof of the lower approximation updating is similar to that of Proposition 2. For the upper approximation at time t + 1, when \(d_{i}\) is a non-leaf node, directly take the least upper bound over the child nodes of \(d_{i}\) to obtain the update. When \(d_{i}\) is a leaf node, the upper approximation at time t + 1 is determined by the objects that belong to \(d_{i}\): when \(x = x^{ + }\), the upper approximation is calculated directly according to Proposition 1; when \(x \ne x^{ + }\) and \(x^{ + } \notin d_{i}\), the approximation is the same as at time t; when \(x \ne x^{ + }\) and \(x^{ + } \in d_{i}\), there is:

$$ \begin{aligned} \overline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) & = \mathop {sup }\limits_{{y \in d_{i} }} \left\{ {K_{{_{{T_{\cos } }} }} \left( {x,y} \right)} \right\} \\ & = \mathop {sup }\limits_{{y \in \left\{ {d_{i} - x^{ + } } \right\} \cup x^{ + } }} \left\{ {K_{{_{{T_{\cos } }} }} \left( {x,y} \right)} \right\} \\ & = \mathop {sup }\limits_{{y = x^{ + } }} \left\{ {\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right),K_{{_{{T_{\cos } }} }} \left( {x,y} \right)} \right\} \\ \end{aligned} $$
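Analogously, the second case of Eq. (11) updates the upper approximation of a leaf class with a single comparison; a hedged sketch with assumed names:

```python
def update_upper_immigration(K, upper_t, x, x_plus):
    """Second case of Eq. (11): x+ joined the leaf class d_i, so the
    stored upper approximation is combined with one new kernel value."""
    return max(upper_t[x], K[x][x_plus])
```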

Given a \(\left\{ {\overline{U} ,\overline{MC} ,\overline{D}_{tree} } \right\}\), when one object immigrates, the algorithm for updating the lower and upper approximations is given as Algorithm 2.

[Algorithm 2: updating the lower and upper approximations under the immigration of a single object]
Table 2. Information about the immigration of a single object

Example 3.

On the basis of Table 1, one object \(x_{9}^{ + }\) immigrates into the system, and its information is shown in Table 2. A new decision class \(d_{6}\) is generated. First, the fuzzy similarity relationships with the other objects based on combination kernels are calculated according to Definition 4, as follows:

$$ K_{{T_{\cos } }} \left( {x_{9}^{ + } ,x} \right) = \left( {\begin{array}{*{20}c} 0 & {0.299} & {0.306} & {0.643} & {0.306} & 0 & 0 & 0 & 1 \\ \end{array} } \right) $$

On the basis of Fig. 1, the newly generated decision class \(d_{6}\) is inserted into the tree, and the tree is updated as follows:

Fig. 2. The tree of decision classes after inserting \(d_{6}\)

As shown in Fig. 2, \(d_{6} \in leaf\), \(sib\left( {d_{1} } \right) = \left\{ {d_{4} ,d_{6} } \right\} = \left\{ {x_{1} ,x_{8} ,x_{9}^{ + } } \right\}\).

According to Proposition 2:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{1} } \right) = \mathop {inf }\limits_{{y = x_{9}^{ + } }} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x_{1} ,y} \right)} ,\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{1} \left( {x_{1} } \right)} \right\} = min \left( {1,0} \right) = 0 $$
$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{2} } \right) = 0.293,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{3} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{4} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{5} } \right) = 0.262. $$
$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{6} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{7} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{8} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{9} } \right) = 0. $$

From Fig. 2, \(d_{1}\) is a non-leaf node; according to Proposition 2:

$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{1} } \right) = \mathop {sup }\limits_{{y \in d_{3} }} \left\{ {\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1ch} \left( {x_{1} } \right)} \right\} = \mathop {sup }\limits_{{y \in d_{3} }} \left\{ {\overline{K}_{{T_{sibling} }}^{(t)} d_{3} \left( {x_{1} } \right)} \right\} = 0 $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{2} } \right) = 0.855,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{3} } \right) = 1,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{4} } \right) = 0.778. $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{5} } \right) = 0.956,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{6} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{7} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{8} } \right) = 0. $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{9} } \right) = 0.306. $$

So the lower and upper approximations of \(d_{1}\) are:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} = \left\{ {0.293/x_{2} ,0.262/x_{5} } \right\} $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} = \left\{ {0.855/x_{2} ,1/x_{3} ,0.778/x_{4} ,0.956/x_{5} ,0.306/x_{9} } \right\} $$

From Example 3, it is obvious that when one object immigrates into the multimodality decision system based on hierarchical classification, the upper and lower approximations of the other decision classes are affected. Only a small number of update operations need to be performed according to Propositions 2 and 3, which greatly reduces the amount of calculation and the time cost.

4.2 Emigration of a Single Object

The object \(x^{ - }\) emigrates from the \(\left\{ {\overline{U} ,\overline{MC} ,\overline{D}_{tree} } \right\}\) at time t + 1, in which \(\overline{U} = U - \left\{ {x^{ - } } \right\}\). If no decision class is removed at time t + 1, the tree-based hierarchical class structure does not need to be updated; otherwise, the tree is updated. Next, the incremental updating is discussed according to whether a decision class is removed.

Proposition 4.

\(\forall d_{i} \in \overline{U} /\overline{D}_{tree}\), \(x \in \overline{U}\). Suppose the emigration of \(x^{ - }\) leads to a decision class being removed, marked as \(\overline{d}_{l}\). The tree-based hierarchical class structure is updated by removing \(\overline{d}_{l}\) from the tree. Then the approximations of the decision classes are updated as follows:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\mathop {inf }\limits_{{y \in sib(d_{i} )}} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} ,\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \right\}} \hfill & {x^{ - } \in sib\left( {d_{i} } \right)and} \hfill \\ {} \hfill & {sib\left( {d_{i} } \right) \ne \emptyset } \hfill \\ {\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \hfill & {x^{ - } \notin sib\left\{ {d_{i} } \right\}} \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right. $$
(12)
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{i} \left( x \right) = \overline{K}_{{T_{sibling} }}^{(t)} d_{i} \left( x \right) $$
(13)

Proof. At time t + 1, the removed decision class no longer has upper and lower approximations, but \(x^{ - }\) affects the upper and lower approximations of the other decision classes. If \(x^{ - }\) is the object among the sibling nodes of \(d_{i}\) that is closest to x, the lower approximation needs to be recalculated; otherwise, the lower approximation remains unchanged. If the sibling nodes of \(d_{i}\) no longer exist after \(x^{ - }\) emigrates, the lower approximation is 0. For a decision class, the upper approximation is only related to the objects belonging to that class; at time t + 1, \(x^{ - } \notin d_{i}\), so the upper approximation remains the same.
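Since removing an object can only raise the infimum, Eq. (12) recomputes the bound over the remaining sibling objects only when \(x^{ - }\) was among them; a hedged sketch, with assumed container names:

```python
import math

def update_lower_emigration(K, lower_t, sib_objs, x, x_minus):
    """Eq. (12): lower approximation of d_i at x after x- emigrates."""
    if x_minus not in sib_objs:   # x- did not affect d_i's bound
        return lower_t[x]
    remaining = [y for y in sib_objs if y != x_minus]
    if not remaining:             # all sibling objects are gone
        return 0.0
    return min(math.sqrt(1.0 - K[x][y] ** 2) for y in remaining)
```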

Proposition 5.

\(\forall d_{i} \in \overline{U} /\overline{D}_{tree}\), \(x \in \overline{U}\). Suppose the emigration of \(x^{ - }\) does not cause any decision class to be removed; then the tree-based hierarchical class structure does not need to be updated. The approximations of the decision classes are updated as follows:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\mathop {inf }\limits_{{y \in sib(d_{i} )}} \left\{ {\sqrt {1 - K^{2}_{{T_{\cos } }} \left( {x,y} \right)} ,\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \right\}} \hfill & {x^{ - } \in sib\left( {d_{i} } \right)} \hfill \\ {\underline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \hfill & {else} \hfill \\ \end{array} } \right. $$
(14)
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{i} \left( x \right) = \left\{ {\begin{array}{*{20}c} {\mathop {sup }\limits_{{y = x^{ - } }} \left\{ {K_{{_{{T_{\cos } }} }} \left( {x,y} \right),\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} \right\}} & {x^{ - } \in d_{i}\, {{ and\, }}d_{i} \in leaf\left( d \right)} \\ {\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{i} \left( x \right)} & {x^{ - } \notin d_{i}\, {{ and\, }}d_{i} \in leaf\left( d \right)} \\ {\sup \left\{ {\overline{K}_{{T_{sibling} }}^{\left( t \right)} d_{ich} \left( x \right)} \right\}} & {else} \\ \end{array} } \right. $$
(15)

The proof is similar to that of Proposition 4.

Given a \(\left\{ {\overline{U} ,\overline{MC} ,\overline{D}_{tree} } \right\}\), when one object emigrates from the system, the algorithm for updating the lower and upper approximations is given as Algorithm 3.

[Algorithm 3: updating the lower and upper approximations under the emigration of a single object]

Example 4.

On the basis of Table 1, the object \(x_{3}\) emigrates from the system, which causes the decision class \(d_{3}\) to be removed, and the tree is updated as follows:

Fig. 3. The new tree after the emigration of the object \(x_{3}\)

It can be seen from Fig. 3 that the child node \(d_{3}\) has been removed and \(d_{1}\) is now a leaf node. According to Proposition 4, it can be obtained that:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{1} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{2} } \right) = 0.293,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{3} } \right) = 0,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{4} } \right) = 0. $$
$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{5} } \right) = 0.262,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{6} } \right) = 0.423,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{7} } \right) = 1,\underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} \left( {x_{8} } \right) = 0. $$

Because \(d_{1}\) becomes a leaf node, there are:

$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{1} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{2} } \right) = 0.855,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{3} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{4} } \right) = 0.778,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{5} } \right) = 0.956. $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{6} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{7} } \right) = 0,\overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} \left( {x_{8} } \right) = 0. $$

So the lower and upper approximations of \(d_{1}\) are:

$$ \underline{K}_{{T_{sibling} }}^{{\left( {t + 1} \right)}} d_{1} = \left\{ {0.293/x_{2} ,0.262/x_{5} ,0.423/x_{6} ,1/x_{7} } \right\} $$
$$ \overline{K}_{{T_{sibling} }}^{(t + 1)} d_{1} = \left\{ {0.855/x_{2} ,0.778/x_{4} ,0.956/x_{5} } \right\} $$

From Example 4, it can be seen that when one object emigrates from the multimodality decision system based on hierarchical classification, the emigration affects the upper and lower approximations of the other decision classes, and only a small number of update operations need to be performed according to Propositions 4 and 5, which greatly reduces the calculation and time cost.

5 Conclusions and Further Research

In a multimodality decision system based on hierarchical classification, the attributes of samples are multimodal, the decision attribute values often have a hierarchical structure, and the data change frequently. This paper takes into account the fact that attribute values may be unknown in the multimodality information system based on hierarchical classification, and proposes an incremental updating algorithm for multikernel fuzzy rough sets based on hierarchical classification. The specific process of the algorithm is demonstrated through examples, and the algorithm can effectively reduce the time cost caused by changes in the object set. In future work, the focus of research will be on an approximation updating algorithm for the variation of multiple objects in multikernel fuzzy rough sets based on hierarchical classification, and on testing the performance of the algorithm using UCI datasets.