1 Introduction

The decision-theoretic rough set (DTRS) model was first introduced by Yao et al. [1] in the early 1990s. As a probabilistic rough set model, it has been successfully applied in many research areas, such as knowledge representation [2,3,4], data mining [5], machine learning [6], artificial intelligence [7, 8] and pattern recognition.

Attribute reduction [9,10,11,12,13,14] aims to remove unnecessary attributes from an information system while preserving a particular property, and it has become one of the most active topics in rough set theory. Yao and Zhao [9] studied attribute reduction in decision-theoretic rough set models with respect to different classification properties (confidence, coverage, decision-monotonicity, generality and cost) and gave a general definition of probabilistic attribute reduction. Jia et al. [10] provided a minimum-cost attribute reduction in the decision-theoretic rough set model, in which the decision cost induced from the reduct is minimal. Dou et al. [11] proposed a parameterized decision-theoretic rough set model, in which the smallest possible cost and the largest possible cost are computed respectively. Li et al. [12] introduced a non-monotonic attribute reduction for the decision-theoretic rough set model, which keeps the expanded positive region of an information system. To extend the classical indiscernibility relation in Yao's decision-theoretic rough sets, Ju et al. [13] gave the δ-cut decision-theoretic rough set, in which attribute reductions under the decision-monotonicity criterion and the cost-minimum criterion are proposed respectively. By constructing variants of conditional entropy in the decision-theoretic rough set model, Ma et al. [14] proposed solutions to attribute reduction based on decision region preservation.

The remainder of this paper is organized as follows. Some basic notions of the utility-based decision-theoretic rough set (UDTRS) model are briefly recalled in Sect. 2. The definition of attribute reduction in UDTRS and the corresponding heuristic reduction algorithm are investigated in Sect. 3. We give the experimental analysis in Sect. 4. The paper is summarized in Sect. 5.

2 Utility-Based Decision-Theoretic Rough Sets

By considering the subjective factors in risk decision making, Zhang et al. [15] proposed the utility-based decision-theoretic rough set (UDTRS) model based on Yao's decision-theoretic rough set model [1]. In this section, we briefly recall some basic notions of the utility-based decision-theoretic rough set model. Detailed information about UDTRS can be found in [15].

A decision system is defined as the 3-tuple \( S = (U,AT = C \cup D,V_{a} ) \), where the universe \( U \) is a finite set of objects; \( AT \) is a nonempty set of attributes such that \( a:U \to V_{a} \) for all \( a \in AT \), where \( V_{a} \) is the domain of attribute \( a \); \( C \) is the set of conditional attributes and \( D \) is the set of decision attributes.

For each nonempty subset \( A \subseteq AT \), the indiscernibility relation \( IND(A) \) is defined as \( IND(A) = \{ (x,y) \in U^{2} :a(x) = a(y),\,\forall a \in A\} \). Two objects in \( U \) satisfy \( IND(A) \) if and only if they have the same value on every attribute \( a \in A \). The relation \( IND(A) \) divides \( U \) into a family of disjoint subsets, called the quotient set of \( U \): \( U/IND(A) = \{ [x]_{A} :x \in U\} \), where \( [x]_{A} = \{ y \in U:(x,y) \in IND(A)\} \) denotes the equivalence class determined by \( x \) with respect to \( A \). The set of states is given by \( \Omega = U/D = \{ X,X^{c} \} \), indicating that an object is in state \( X \) or \( X^{c} \).
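For illustration, the quotient set \( U/IND(A) \) can be computed by grouping objects on their attribute-value tuples. The following minimal Python sketch is ours (with a toy decision table), not code from the paper:

```python
from collections import defaultdict

def quotient_set(table, attrs):
    """Group the object indices of `table` (a list of dicts) into the
    equivalence classes of IND(attrs)."""
    classes = defaultdict(list)
    for i, row in enumerate(table):
        key = tuple(row[a] for a in attrs)  # equal values on every a in attrs
        classes[key].append(i)
    return list(classes.values())

# Toy universe U = {x0, ..., x3} over attributes a1, a2:
U = [{"a1": 0, "a2": 1}, {"a1": 0, "a2": 1},
     {"a1": 1, "a2": 0}, {"a1": 0, "a2": 0}]
print(quotient_set(U, ["a1", "a2"]))  # [[0, 1], [2], [3]]
```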

Utility is an important economic concept: it reflects the degree of a decision maker's satisfaction with the cost or profit in a decision procedure. For \( \forall x \in U \) and \( [x] \in U/\pi \), \( u(\lambda ) \) is a utility function, where \( \lambda \) denotes the cost of taking an action. The expected utilities of the different actions can be expressed as:

$$ \begin{array}{*{20}l} {\Psi (a_{P} |[x]) = u(\lambda_{PP} )P(X|[x]) + u(\lambda_{PN} )P(X^{c} |[x])} \hfill \\ {\Psi (a_{N} |[x]) = u(\lambda_{NP} )P(X|[x]) + u(\lambda_{NN} )P(X^{c} |[x])} \hfill \\ {\Psi (a_{B} |[x]) = u(\lambda_{BP} )P(X|[x]) + u(\lambda_{BN} )P(X^{c} |[x])} \hfill \\ \end{array} $$

According to the maximal-utility criterion in Bayesian decision procedures, we have the following:

$$ \begin{aligned} & {\text{If}}\,\,\Psi (a_{P} |[x]) \ge\Psi (a_{N} |[x])\,{\text{and}}\,\Psi (a_{P} |[x]) \ge\Psi (a_{B} |[x]),\,{\text{then}}\,[x] \subseteq POS_{\pi } (X), \\ & {\text{If}}\,\,\Psi (a_{N} |[x]) \ge\Psi (a_{P} |[x])\,{\text{and}}\,\Psi (a_{N} |[x]) \ge\Psi (a_{B} |[x]),\,{\text{then}}\,[x] \subseteq NEG_{\pi } (X), \\ & {\text{If}}\,\,\Psi (a_{B} |[x]) \ge\Psi (a_{P} |[x])\,{\text{and}}\,\Psi (a_{B} |[x]) \ge\Psi (a_{N} |[x]),\,{\text{then}}\,[x] \subseteq BND_{\pi } (X). \\ \end{aligned} $$
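Choosing the action is simply an argmax over the three expected utilities above. The following Python sketch is ours, for illustration only; the dictionary keys are hypothetical names for the six utilities \( u(\lambda_{ij}) \):

```python
def expected_utilities(p, u):
    """Expected utility Psi(a|[x]) of each action for a class [x],
    given p = P(X|[x]) and a dict u of the six utilities u(lambda_ij)."""
    return {
        "P": u["PP"] * p + u["PN"] * (1 - p),  # accept: Psi(a_P|[x])
        "N": u["NP"] * p + u["NN"] * (1 - p),  # reject: Psi(a_N|[x])
        "B": u["BP"] * p + u["BN"] * (1 - p),  # defer:  Psi(a_B|[x])
    }

def best_action(p, u):
    """Pick the action with maximal expected utility (ties broken arbitrarily)."""
    psi = expected_utilities(p, u)
    return max(psi, key=psi.get)
```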

If \( P(X|[x]) = p \), then \( P(X^{c} |[x]) = 1 - p \), and we can derive the following decision rules:

$$ \begin{aligned} & {\text{If}}\,\,P(X|[x]) \ge \alpha_{u} ,\,{\text{then}}\,[x] \subseteq POS_{\pi } (X), \\ & {\text{If}}\,\,P(X|[x]) \le \beta_{u} ,\,{\text{then}}\,[x] \subseteq NEG_{\pi } (X), \\ & {\text{If}}\,\,\beta_{u} < P(X|[x]) < \alpha_{u} ,\,{\text{then}}\,[x] \subseteq BND_{\pi } (X); \\ \end{aligned} $$

where

$$ \begin{aligned} & \alpha_{u} = \frac{{u(\lambda_{BN} ) - u(\lambda_{PN} )}}{{(u(\lambda_{BN} ) - u(\lambda_{PN} )) + (u(\lambda_{PP} ) - u(\lambda_{BP} ))}}, \\ & \beta_{u} = \frac{{u(\lambda_{NN} ) - u(\lambda_{BN} )}}{{(u(\lambda_{NN} ) - u(\lambda_{BN} )) + (u(\lambda_{BP} ) - u(\lambda_{NP} ))}}. \\ \end{aligned} $$

Since \( u(\lambda_{PP} ) \ge u(\lambda_{BP} ) > u(\lambda_{NP} ) \) and \( u(\lambda_{NN} ) \ge u(\lambda_{BN} ) > u(\lambda_{PN} ) \), we have \( \alpha_{u} \in (0,1] \) and \( \beta_{u} \in [0,1) \). Then, we can obtain

$$ \begin{aligned} & \alpha_{u} = \frac{1}{{1 + \Delta (\alpha_{u} )}} = \frac{1}{{1 + \frac{{u(\lambda_{PP} ) - u(\lambda_{BP} )}}{{u(\lambda_{BN} ) - u(\lambda_{PN} )}}}}, \\ & \beta_{u} = \frac{1}{{1 + \Delta (\beta_{u} )}} = \frac{1}{{1 + \frac{{u(\lambda_{BP} ) - u(\lambda_{NP} )}}{{u(\lambda_{NN} ) - u(\lambda_{BN} )}}}}. \\ \end{aligned} $$
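As a quick sanity check, the two thresholds can be computed directly from the six utility values. The sketch below is illustrative (ours), reusing the hypothetical dictionary keys from the previous snippet, and follows the formulas above:

```python
def thresholds(u):
    """Compute (alpha_u, beta_u) from the utilities, assuming
    u(PP) >= u(BP) > u(NP) and u(NN) >= u(BN) > u(PN)."""
    alpha_u = (u["BN"] - u["PN"]) / ((u["BN"] - u["PN"]) + (u["PP"] - u["BP"]))
    beta_u = (u["NN"] - u["BN"]) / ((u["NN"] - u["BN"]) + (u["BP"] - u["NP"]))
    return alpha_u, beta_u
```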

For \( \forall X \subseteq U \), the \( (\alpha_{u} ,\beta_{u} ) \)-lower and upper approximations in the utility-based decision-theoretic rough set model are presented as:

$$ \begin{aligned} & \underline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X) = \{ x \in U|P(X|[x]) \ge \alpha_{u} \} , \\ & \overline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X) = \{ x \in U|P(X|[x]) > \beta_{u} \} . \\ \end{aligned} $$

Based on the definition of rough approximations in UDTRS, the positive, boundary and negative regions are defined as

$$ \begin{aligned} & POS_{\pi } (X) = \underline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X), \\ & BND_{\pi } (X) = \overline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X) - \underline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X), \\ & NEG_{\pi } (X) = U - \overline{apr}_{{(\alpha_{u} ,\beta_{u} )}} (X). \\ \end{aligned} $$
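In code, the three regions amount to comparing each class's conditional probability with the two thresholds. A minimal sketch (ours; `prob` is a hypothetical callable returning \( P(X|[x]) \) for a class):

```python
def three_regions(classes, prob, alpha_u, beta_u):
    """Split equivalence classes into POS/BND/NEG regions by comparing
    P(X|[x]) with the thresholds (alpha_u, beta_u)."""
    pos, bnd, neg = [], [], []
    for c in classes:
        p = prob(c)
        if p >= alpha_u:
            pos.append(c)   # in the lower approximation
        elif p <= beta_u:
            neg.append(c)   # outside the upper approximation
        else:
            bnd.append(c)   # between the two approximations
    return pos, bnd, neg
```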

3 Attribute Reduction in UDTRS

In this section, we give the definition of attribute reduction based on maximal utility in UDTRS. By attribute reduction, the maximal utility will be obtained in decisions. According to the proposed definition of reduction, a heuristic algorithm with respect to the maximal utility is then investigated.

Similar to the Bayesian expected cost [10] in the decision-theoretic rough set model, the Bayesian expected utility [15] of each rule is expressed as:

$$ \begin{array}{*{20}l} {{\text{Utility of the positive rule}}:p \cdot u(\lambda_{PP} ) + (1 - p) \cdot u(\lambda_{PN} );} \hfill \\ {{\text{Utility of the boundary rule}}:p \cdot u(\lambda_{BP} ) + (1 - p) \cdot u(\lambda_{BN} );} \hfill \\ {{\text{Utility of the negative rule}}:p \cdot u(\lambda_{NP} ) + (1 - p) \cdot u(\lambda_{NN} ).} \hfill \\ \end{array} $$

From the above, we can easily obtain the Bayesian expected utilities of the three kinds of decision rules:

Utility of positive rules:

$$ Utility_{A}^{POS} = \sum\limits_{{x_{i} \in POS_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{i} \cdot u(\lambda_{PP} ) + (1 - p_{i} ) \cdot u(\lambda_{PN} ))} \,; $$

Utility of boundary rules:

$$ Utility_{A}^{BND} = \sum\limits_{{x_{j} \in BND_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{j} \cdot u(\lambda_{BP} ) + (1 - p_{j} ) \cdot u(\lambda_{BN} ))} \,; $$

Utility of negative rules:

$$ Utility_{A}^{NEG} = \sum\limits_{{x_{k} \in NEG_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{k} \cdot u(\lambda_{NP} ) + (1 - p_{k} ) \cdot u(\lambda_{NN} ))} . $$

For any subset \( A \subseteq AT \), the whole utility consists of three parts: the utility of the positive region, the utility of the boundary region and the utility of the negative region. Then, we have the whole utility of all decision rules in a decision system as follows [15]:

$$ \begin{aligned} Utility_{A} & = Utility_{A}^{POS} + Utility_{A}^{BND} + Utility_{A}^{NEG} \\ & = \sum\limits_{{x_{i} \in POS_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{i} \cdot u(\lambda_{PP} ) + (1 - p_{i} ) \cdot u(\lambda_{PN} ))} \\ & \quad + \sum\limits_{{x_{j} \in BND_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{j} \cdot u(\lambda_{BP} ) + (1 - p_{j} ) \cdot u(\lambda_{BN} ))} \\ & \quad + \sum\limits_{{x_{k} \in NEG_{{(\alpha_{u} ,\beta_{u} )}} (\pi_{D} /\pi_{A} )}} {(p_{k} \cdot u(\lambda_{NP} ) + (1 - p_{k} ) \cdot u(\lambda_{NN} ))} \\ \end{aligned} $$
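The whole utility is therefore a single pass over the equivalence classes, with each object contributing the expected utility of the rule its class falls under. A sketch under the same assumptions as the earlier snippets (ours, not the authors' code):

```python
def total_utility(classes, prob, u, alpha_u, beta_u):
    """Whole utility Utility_A over all decision rules: each object in a
    class [x] contributes the expected utility of the rule chosen for [x]."""
    total = 0.0
    for c in classes:
        p = prob(c)
        if p >= alpha_u:      # positive rules
            total += len(c) * (p * u["PP"] + (1 - p) * u["PN"])
        elif p <= beta_u:     # negative rules
            total += len(c) * (p * u["NP"] + (1 - p) * u["NN"])
        else:                 # boundary rules
            total += len(c) * (p * u["BP"] + (1 - p) * u["BN"])
    return total
```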

In real applications, it is desirable to obtain more utility in decision procedures. Thus, according to the “non-decreasing” principle, we define attribute reduction in the utility-based decision-theoretic rough set model as follows:

Definition 1.

Given a decision system \( S = (U,C \cup D,V_{a} ) \), \( R \subseteq C \) is a reduct of \( C \) with respect to \( D \) if it satisfies the following two conditions:

(1) \( Utility_{R} \ge Utility_{C} ; \)

(2) \( \forall R^{{\prime }} \subset R,\,Utility_{{R^{{\prime }} }} < Utility_{R} . \)

From Definition 1, the decision utility will be increased or remain unchanged after reduction. Condition (1) is the jointly sufficient condition and condition (2) is the individually necessary condition: condition (1) guarantees that the utility induced from the reduct is maximal, and condition (2) guarantees that the reduct is minimal.

The fitness function, which measures the significance of an attribute, is usually used to construct a heuristic algorithm in rough set theory. In the UDTRS model, the fitness function is defined as:

Definition 2.

Given a decision system \( S = (U,C \cup D,V_{a} ) \) and \( A \subseteq C \), the utility fitness function of attribute \( a_{i} \in A \) is defined as:

$$ Sig_{Utility} (A,a_{i} ) = \frac{{Utility_{{A - \{ a_{i} \} }} - Utility_{A} }}{{Utility_{A} }}. $$
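Computationally, the fitness of \( a_{i} \) is the relative change of the whole utility when \( a_{i} \) is deleted. A one-function sketch (ours; `utility_of` is a hypothetical callable mapping an attribute subset to \( Utility_{A} \)):

```python
def sig_utility(A, a_i, utility_of):
    """Sig_Utility(A, a_i) from Definition 2: relative utility change
    caused by deleting attribute a_i from the subset A."""
    u_A = utility_of(A)
    u_del = utility_of([a for a in A if a != a_i])
    return (u_del - u_A) / u_A
```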

Three strategies for heuristic reduction algorithms are summarized in [9]. In this paper, we adopt the deletion strategy to design an algorithm in UDTRS. The heuristic algorithm (the algorithm of maximal-utility attribute reduction, AMUAR) based on the utility fitness function is described as follows:

[Figure: pseudocode of algorithm AMUAR]

The fitness function measures the significance of an attribute. In the process of deleting attributes, the algorithm checks whether \( Utility_{B} \ge Utility_{C} \) still holds; once no further attribute can be deleted without violating this condition, it stops the deletion procedure and outputs \( B \) as a reduct of the decision system.
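Since the paper's pseudocode appears only in the figure above, the following is a hedged reconstruction of the deletion strategy from the prose (ours, not the authors' listing): greedily delete the attribute whose removal leaves the highest utility, and stop once no deletion keeps \( Utility_{B} \ge Utility_{C} \).

```python
def amuar(C, utility_of):
    """Reconstruction (ours) of the AMUAR deletion loop: start from B = C
    and greedily delete attributes while Utility_B >= Utility_C holds."""
    B = list(C)
    u_C = utility_of(C)
    while len(B) > 1:
        # attribute whose deletion is least harmful, i.e. has maximal fitness
        a = max(B, key=lambda x: utility_of([b for b in B if b != x]))
        reduced = [b for b in B if b != a]
        if utility_of(reduced) >= u_C:
            B = reduced   # deletion keeps the utility condition
        else:
            break         # no further attribute can be deleted
    return B
```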

4 Experimental Analysis

In this section, we verify the effectiveness of the algorithm AMUAR and examine the monotonicity of utility with respect to attributes through experiments. All experiments were carried out on a personal computer with Windows 7, an Intel(R) Pentium(R) G640 CPU (2.8 GHz) and 6.00 GB of memory. The programming language is Matlab 2010b.

We take \( u(\lambda ) = a( - \lambda + c)^{b} \) as the utility function. If \( 0 < b < 1 \), the UDTRS model is risk-averse; if \( b = 1 \), it is risk-neutral; if \( b > 1 \), it is risk-loving. Six data sets from the UCI Machine Learning Repository are used. For each data set, the utility values are randomly generated in the interval [100, 1000] and satisfy the following constraints: \( u(\lambda_{BP} ) > u(\lambda_{NP} ),\,u(\lambda_{BN} ) > u(\lambda_{PN} ),\,u(\lambda_{PP} ) = u(\lambda_{NN} ) = 1 \). Ten different groups of utility functions are randomly generated. Table 1 shows the average length of the derived reduct for each data set.
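For concreteness, the power utility function can be coded as below; the parameter values are illustrative only (not the experiment's actual constants), and `c` must exceed the largest cost so that \( -\lambda + c \) stays positive.

```python
def power_utility(lam, a=1.0, b=0.5, c=1000.0):
    """u(lambda) = a * (-lambda + c)**b.
    0 < b < 1: risk-averse; b == 1: risk-neutral; b > 1: risk-loving."""
    return a * (-lam + c) ** b
```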

Table 1. Average length of a reduct

To examine the monotonicity of utility with respect to attributes, the utility is calculated as the number of attributes increases from 1 to the total number of attributes in each data set. In Fig. 1, the x-coordinate represents the number of attributes, and the y-coordinate represents the utility of the three models. Figure 1 shows that the utility of the three models does not strictly increase as the number of attributes grows; for example, the utility decreases after adding an attribute in the data sets credit_a, forestfires and vote. Hence, utility is not strictly monotonic with respect to the number of attributes.

Fig. 1. Utility comparison of three attribute reductions

5 Conclusions

The utility-based decision-theoretic rough set model is introduced in this paper. The utilities of the positive region, the boundary region and the negative region are given respectively. We provide a definition of reduction that aims to obtain the maximal utility in decisions, and propose a heuristic reduction algorithm with respect to this definition. Finally, experimental results show that the proposed algorithm is effective.