1 Introduction

In this paper, we consider one more extension of the notion of a decision table: a decision table with many-valued decisions. In a table with many-valued decisions, each row is labeled with a nonempty finite set of decisions, and for a given row, we should find a decision from the set of decisions attached to this row.

Such tables arise in problems of discrete optimization, pattern recognition, computational geometry, decision making, etc. [10, 17]. However, the main sources of decision tables with many-valued decisions are datasets filled with statistical or experimental data. In such datasets, we often have groups of objects with equal values of conditional attributes but, possibly, different values of the decision attribute. Instead of a group of objects, we can consider one object given by the values of the conditional attributes. We attach to this object a set of decisions: either all decisions for objects from the group, or the k most frequent decisions for objects from the group, etc. As a result, we obtain a decision table with many-valued decisions. In real-life applications, we encounter multi-label data when we study, e.g., the problems of semantic annotation of images [4], music categorization into emotions [35], functional genomics [3], and text categorization [36].

In rough set theory [22, 30, 31], decision tables that have equal rows labeled with different decisions are often considered. The set of decisions attached to equal rows is called the generalized decision for those rows [23–25]. There, the aim is to find the generalized decision for a given row; in this paper, we will call this approach the generalized decision approach. However, the problem of finding an arbitrary decision, or one of the most frequent decisions, from the generalized decision is also interesting. Such a study of decision tables with many-valued decisions can give a new tool for rough set theory. In [2] and [18] we considered the problems of constructing tests (super-reducts) and decision trees for decision tables with many-valued decisions. To choose attributes, we used an uncertainty measure equal to the number of boundary subtables.

A decision table with many-valued decisions can be considered as a decision table with incomplete information, because we do not know which decision should be chosen from the set of decisions. Incomplete information also arises in decision tables where, instead of a single value of a conditional attribute, we have a subset of values of the attribute domain. In [13, 14], approaches to interpreting queries in a database with such incomplete information were discussed. Z. Pawlak [22] and E. Orłowska [21] proposed Non-deterministic Information Systems for dealing with incomplete information. Information incompleteness is also connected with missing values of attributes or intervals of attribute values. M. Kryszkiewicz [11] proposed a method for computing all optimal generalized rules from a decision table with missing values. In [27–29], the authors proposed a rule generation system, based on the Apriori algorithm, in which incomplete information was treated as nondeterministic information.

In the literature, problems connected with multi-label data are often considered from the point of view of classification (multi-label classification problems) [7, 8, 15, 19, 33, 34, 37]. Here, our aim is not to deal with classification but to show that the proposed approach to the construction of decision rules for decision tables with many-valued decisions can be useful for knowledge representation. In various applications, we often deal with decision tables which contain noisy data. In this case, exact rules can be “over-fitted”, i.e., depend essentially on the noise. So, instead of exact rules with many attributes, it is more appropriate to work with approximate rules with a smaller number of attributes. Besides, classifiers based on approximate decision rules often have better accuracy than classifiers based on exact decision rules.

In the proposed approach, a greedy algorithm constructs \(\alpha \)-decision rules (\(\alpha \) is a degree of rule uncertainty), and the number of rules constructed for a given row is equal to the cardinality of the set of decisions attached to this row. Then, for each row of a decision table, we choose a rule with the minimum length. The choice of shorter rules is connected with the Minimum Description Length principle [26].

The problem of constructing rules with minimum length is NP-hard. Therefore, we consider an approximate polynomial algorithm for rule optimization. Based on results of U. Feige [9], it was proved in [16] for decision tables with one-valued decisions that, under some natural assumptions on the class NP, the greedy algorithm is close to the best polynomial approximate algorithms for partial decision rule minimization. It is natural to use these results in our approach. Note that each decision table with one-valued decisions can also be interpreted as a decision table where each row is labeled with a set of decisions containing one element.

The paper, extending a conference publication [6] and some results presented in [17], is devoted to the study of a greedy algorithm for construction of approximate decision rules for decision tables with many-valued decisions. The greedy algorithm for rule construction has polynomial time complexity for the whole set of decision tables with many-valued decisions.

We also discuss a problem of recognition of labels of points in the plane, which illustrates the considered approach and the obtained bounds on the precision of this algorithm relative to the length of rules.

In this paper, we study only binary decision tables with many-valued decisions. However, the obtained results can be extended to decision tables filled with numbers from the set \(\{0, \ldots , k-1\}\), where \(k \ge 3\). We present experimental results based on data sets from the UCI Machine Learning Repository [12], modified (by removal of some conditional attributes) into the form of decision tables with many-valued decisions. The experiments are connected with the length of \(\alpha \)-decision rules, the number of different rules, lower and upper bounds on the minimum length of \(\alpha \)-decision rules, and the 0.5-hypothesis for \(\alpha \)-decision rules. We also present experimental results for the generalized decision approach. This allows us to make a comparative study of the length and the number of different rules for the proposed approach and the generalized decision approach.

The paper consists of eight sections. In Sect. 2, the main notions are discussed. In Sect. 3, a parameter M(T) and an auxiliary statement are presented. This parameter is used for the analysis of the greedy algorithm. Section 4 is devoted to the consideration of a set cover problem. In Sect. 5, the greedy algorithm for construction of approximate decision rules is studied. In this section, we also present lower and upper bounds on the minimum rule length, based on information obtained during the work of the greedy algorithm, and the 0.5-hypothesis for tables with many-valued decisions. In Sect. 6, we discuss the problem of recognition of labels of points in the plane. In Sect. 7, experimental results are presented. Section 8 contains conclusions.

2 Main Definitions

In this section, we consider definitions corresponding to decision tables with many-valued decisions.

A (binary) decision table with many-valued decisions is a rectangular table T filled with numbers from the set \(\{0,1\}\). Columns of this table are labeled with attributes \(f_{1},\ldots ,f_{n}\). Rows of the table are pairwise different, and each row is labeled with a nonempty finite set of natural numbers (a set of decisions). Note that each decision table with one-valued decisions can also be interpreted as a decision table with many-valued decisions: in such a table, each row is labeled with a set of decisions containing one element. An example of a decision table with many-valued decisions, \(T_0\), is presented in Table 1.

Table 1. Decision table \(T_0\) with many-valued decisions

We will say that T is a degenerate table if either T is empty (has no rows), or the intersection of sets of decisions attached to rows of T is nonempty.

A decision which belongs to the maximum number of sets of decisions attached to rows in T is called the most common decision for T. If we have more than one such decision, we choose the minimum one. If T is empty then 1 is the most common decision for T.

Let \(r = (b_1,\ldots ,b_n)\) be a row of T labeled with a set of decisions D(r), and let \(d\in D(r)\). By \(U(T,r,d)\) we denote the set of rows \(r^\prime \) of T for which \(d \notin D(r^\prime )\). We will say that an attribute \(f_i \) separates a row \(r^\prime \in U(T,r,d)\) from the row r if the rows r and \(r^\prime \) have different values at the intersection with the column \(f_i\). The pair \((T,r)\) will be called a decision rule problem.

Let \(\alpha \) be a real number such that \(0\le \alpha < 1\). A decision rule

$$\begin{aligned} f_{i_{1}}=b_{1}\wedge \ldots \wedge f_{i_{m}}=b_{m}\rightarrow d \end{aligned}$$
(1)

is called an \(\alpha \)-decision rule for the pair \((T,r)\) and decision \(d\in D(r)\) if the attributes \(f_{i_{1}},\ldots ,f_{i_{m}}\) separate from r at least \((1-\alpha )|U(T,r,d)|\) rows from \(U(T,r,d)\). The number m is called the length of the rule (1). For example, a 0.01-decision rule means that the attributes contained in the rule should separate from the row r at least \(99\,\%\) of the rows from \(U(T,r,d)\). If \(\alpha \) is equal to 0, we have an exact decision rule (0-decision rule) for \((T,r)\) and d. If \(U(T,r,d)=\emptyset \), then for any \(f_{i_{1}},\ldots ,f_{i_{m}}\in \{f_{1},\ldots ,f_{n}\}\) the rule (1) is an \(\alpha \)-decision rule for \((T,r)\) and d; in particular, the rule (1) with empty left-hand side (when \(m=0\)) is an \(\alpha \)-decision rule for \((T,r)\) and d if \(U(T,r,d)=\emptyset \).

We will say that a decision rule is an \(\alpha \)-decision rule for the pair \((T,r)\) if this rule is an \(\alpha \)-decision rule for the pair \((T,r)\) and some decision \(d\in D(r)\). We denote by \(L_{\mathrm {min}}(\alpha ,T,r,d)\) the minimum length of an \(\alpha \)-decision rule for the pair \((T,r)\) and decision \(d\in D(r)\), and by \(L_{\mathrm {min}}(\alpha ,T,r)\) the minimum length of an \(\alpha \)-decision rule for the pair \((T,r)\). It is clear that

$$L_{\mathrm {min}}(\alpha ,T,r)=\min \{L_{\mathrm {min}}(\alpha ,T,r,d):d\in D(r)\}.$$

Let \(\alpha \), \(\beta \) be real numbers such that \(0\le \alpha \le \beta <1 \). One can show that \(L_{\mathrm {min}}(\alpha ,T,r,d)\ge L_{\mathrm {min}}(\beta ,T,r,d)\) and \(L_{\mathrm {min}}(\alpha ,T,r)\ge L_{\mathrm {min}}(\beta ,T,r)\).
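The notions above translate directly into code. The following minimal Python sketch (the representation and all names are ours, introduced only for illustration; the experiments in Sect. 7 were performed with the DAGGER tool written in C++) represents a row as a pair of a value tuple and a decision set, and checks whether a set of attributes yields an \(\alpha \)-decision rule for \((T,r)\) and d.

```python
# A table T is a list of rows; each row is a pair (values, decisions),
# where values is a tuple over {0, 1} and decisions is a nonempty frozenset.

def U(table, r, d):
    """U(T, r, d): the rows r' of the table with d not in D(r')."""
    return [row for row in table if d not in row[1]]

def is_alpha_rule(attrs, r, d, table, alpha):
    """Check whether f_{i_1} = b_{i_1} /\ ... /\ f_{i_m} = b_{i_m} -> d,
    with attrs = {i_1, ..., i_m} given as 0-based column indices and the
    b_i taken from r, is an alpha-decision rule for (T, r) and d."""
    u = U(table, r, d)
    separated = [row for row in u
                 if any(row[0][i] != r[0][i] for i in attrs)]
    return len(separated) >= (1 - alpha) * len(u)
```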

3 Parameter M(T)

In this section, we consider the definition of a parameter M(T) and an auxiliary statement from [17]. For completeness, we give this statement with a proof. We will use the parameter M(T) to evaluate the precision of the greedy algorithm relative to the length of rules.

Let T be a decision table with many-valued decisions, which has n columns labeled with attributes \(\{f_1,\ldots ,f_n\}\).

Now, we define the parameter M(T) of the table T. If T is a degenerate table then \(M(T)=0\). Let now T be a nondegenerate table. Let

$$\bar{\delta }=(\delta _1,\ldots ,\delta _n)\in \{0,1\}^n.$$

Then \(M(T,\bar{\delta })\) is the minimum natural m such that there exist attributes \(f_{i_{1}},\ldots ,f_{i_{m}}\in \{f_1,\ldots ,f_n\}\) for which \(T(f_{i_{1}},\delta _{i_{1}})\ldots (f_{i_{m}},\delta _{i_{m}})\) is a degenerate table. Here \(T(f_{i_{1}},\delta _{i_{1}})\ldots (f_{i_{m}},\delta _{i_{m}})\) is the subtable of T consisting only of rows that have numbers \(\delta _{i_{1}},\ldots ,\delta _{i_{m}}\) at the intersection with the columns \(f_{i_{1}},\ldots ,f_{i_{m}}\). We denote

$$M(T)=\max \{M(T,\bar{\delta }):\bar{\delta }\in \{0,1\}^n\}.$$
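For small tables, \(M(T,\bar{\delta })\) can be computed directly from the definition by trying attribute subsets in order of increasing cardinality. The following brute-force sketch (exponential in n, so suitable for illustration only; it reuses the row representation from Sect. 2) makes the definition concrete. Note that fixing all n attributes leaves at most one row, since rows are pairwise different, so the search always terminates.

```python
from itertools import combinations, product

def is_degenerate(table):
    """T is degenerate if it is empty or the intersection of the sets of
    decisions attached to its rows is nonempty."""
    return not table or bool(set.intersection(*(set(row[1]) for row in table)))

def M_delta(table, delta):
    """M(T, delta-bar): the minimum m such that fixing some m attributes to
    the corresponding components of delta-bar yields a degenerate subtable."""
    n = len(delta)
    if is_degenerate(table):
        return 0
    for m in range(1, n + 1):
        for attrs in combinations(range(n), m):
            sub = [row for row in table
                   if all(row[0][i] == delta[i] for i in attrs)]
            if is_degenerate(sub):
                return m

def M(table):
    """M(T): maximum of M(T, delta-bar) over all delta-bar in {0, 1}^n.
    Assumes a nonempty table."""
    n = len(table[0][0])
    return max(M_delta(table, delta) for delta in product((0, 1), repeat=n))
```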

Lemma 1

Let T be a nondegenerate decision table with many-valued decisions which has n columns labeled with attributes \(f_{1},\ldots ,f_{n}\), let \(\bar{\delta }=(\delta _{1},\ldots ,\delta _{n})\in \{0,1\}^{n}\), and let \(\bar{\delta }\) be a row of T. Then

$$L_{\mathrm {min}}(0,T,\bar{\delta })\le M(T,\bar{\delta })\le M(T).$$

Proof

By definition, \(M(T,\bar{\delta })\) is the minimum natural m such that there exist attributes \(f_{i_{1}},\ldots ,f_{i_{m}}\in \{f_{1},\ldots ,f_{n}\}\) for which subtable

$$T^{\prime }=T(f_{i_{1}},\delta _{i_{1}})\ldots (f_{i_{m}},\delta _{i_{m}})$$

is a degenerate table. The subtable \(T^{\prime }\) is nonempty since \(\bar{\delta }\) is a row of this subtable. Therefore there is a decision d which, for each row of \(T^{\prime }\), belongs to the set of decisions attached to this row.

One can show that a decision rule

$$ f_{i_{1}}=\delta _{i_{1}}\wedge \ldots \wedge f_{i_{m}}=\delta _{i_{m}}\rightarrow d $$

is a 0-decision rule for the pair \((T,\bar{\delta })\) and decision d. Therefore \(L_{\mathrm {min}}(0,T,\bar{\delta })\le m = M(T,\bar{\delta })\). By definition, \(M(T,\bar{\delta })\le M(T)\).    \(\square \)

4 Set Cover Problem

In this section, we consider a greedy algorithm for construction of an approximate cover (an \(\alpha \)-cover).

Let \(\alpha \) be a real number such that \(0\le \alpha <1\). Let A be a set containing \(N>0\) elements, and let \(F=\{S_{1},\ldots ,S_{p}\}\) be a family of subsets of the set A such that \(A=\bigcup _{i=1}^{p}S_{i}\). We will refer to the pair \((A,F)\) as a set cover problem. A subfamily \(\{S_{i_{1}},\ldots ,S_{i_{t}}\}\) of the family F will be called an \(\alpha \)-cover for \((A,F)\) if \(|\bigcup _{j=1}^{t}S_{i_{j}}|\ge (1-\alpha )|A|\). The problem of searching for an \(\alpha \)-cover with minimum cardinality for a given set cover problem \((A,F)\) is NP-hard [20, 32].

We now consider a greedy algorithm for construction of an \(\alpha \)-cover (see Algorithm 1). At each step, this algorithm chooses a subset from F which covers the maximum number of uncovered elements of A. The algorithm stops when the constructed subfamily is an \(\alpha \)-cover for \((A,F)\).

Algorithm 1. Greedy algorithm for construction of an \(\alpha \)-cover
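Since Algorithm 1 is given as pseudocode, we add a short Python rendering of the same greedy procedure (a sketch; the function name and data layout are ours).

```python
def greedy_alpha_cover(A, F, alpha):
    """Greedy construction of an alpha-cover for the set cover problem (A, F):
    A is a set, F is a list of subsets of A whose union equals A, and
    0 <= alpha < 1.  Returns the indices of the chosen subsets."""
    need = (1 - alpha) * len(A)   # an alpha-cover covers >= (1-alpha)|A| elements
    covered, chosen = set(), []
    while len(covered) < need:
        # choose a subset covering the maximum number of uncovered elements
        i = max(range(len(F)), key=lambda j: len(F[j] - covered))
        chosen.append(i)
        covered |= F[i]
    return chosen
```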

We denote by \(C_{\mathrm {greedy}}(\alpha ,A,F)\) the cardinality of the constructed \(\alpha \)-cover for \((A,F)\), and by \(C_{\mathrm {min}}(\alpha ,A,F)\) the minimum cardinality of an \(\alpha \)-cover for \((A,F)\).

The following statement was obtained by J. Cheriyan and R. Ravi in [5]. We present it with our own proof.

Theorem 1

Let \(0<\alpha <1\) and let \((A,F)\) be a set cover problem. Then

$$ C_{\mathrm {greedy}}(\alpha ,A,F)\le C_{\mathrm {min}}(0,A,F)\ln (1/\alpha )+1. $$

Proof

We denote \(m=C_{\mathrm {min}}(0,A,F)\). If \(m=1\) then, as it is not difficult to show, \(C_{\mathrm {greedy}}(\alpha ,A,F)=1\) and the considered inequality holds. Let \(m\ge 2\) and \(S_{i}\) be a subset of maximum cardinality in F. It is clear that \(|S_{i}|\ge N/m\). So, after the first step we will have at most \(N-N/m=N(1-1/m)\) uncovered elements in the set A. After the first step we have the following set cover problem: the set \(A\setminus S_{i}\) and the family \(\{S_{1}\setminus S_{i},\ldots ,S_{p}\setminus S_{i}\}\). For this problem, the minimum cardinality of a cover is at most m. So, after the second step, when we choose a set \(S_{j}\setminus S_{i}\) with maximum cardinality, the number of uncovered elements in the set A will be at most \(N(1-1/m)^{2}\), etc.

Let the greedy algorithm make g steps in the process of \(\alpha \)-cover construction, thus constructing an \(\alpha \)-cover of cardinality g. Then, after step number \(g-1\), more than \(\alpha N\) elements of A are uncovered. Therefore \(N(1-1/m)^{g-1}>\alpha N\) and \(1/\alpha >(1+1/(m-1))^{g-1}\). Taking the natural logarithm of both sides of this inequality, we obtain \(\ln (1/\alpha ) >(g-1)\ln (1+1/(m-1))\). It is known that for any natural p, the inequality \(\ln (1+1/p)>1/(p+1)\) holds. Therefore \(\ln (1/\alpha )>(g-1)/m\) and \(g<m\ln (1/\alpha )+1\). Since \(m=C_{\mathrm {min}}(0,A,F)\) and \(g=C_{\mathrm {greedy}}(\alpha ,A,F)\), we have \(C_{\mathrm {greedy}}(\alpha ,A,F)< C_{\mathrm {min}}(0,A,F)\ln (1/\alpha )+1\).    \(\square \)

5 Greedy Algorithm for \(\alpha \)-Decision Rule Construction

In this section, we present a greedy algorithm for \(\alpha \)-decision rule construction, lower and upper bounds on the minimum length of \(\alpha \)-decision rules (Sect. 5.1), and the 0.5-hypothesis connected with the work of the greedy algorithm (Sect. 5.2).

We use the greedy algorithm for construction of \(\alpha \)-covers to construct \(\alpha \)-decision rules. Let T be a table with many-valued decisions containing n columns labeled with attributes \(f_{1},\ldots ,f_{n} \), \(r=(b_{1},\ldots ,b_{n})\) be a row of T, D(r) be the set of decisions attached to r, \(d\in D(r)\), and \(\alpha \) be a real number such that \(0<\alpha <1\).

We consider a set cover problem \((A(T,r,d), F(T,r,d))\), where \(A(T,r,d)=U(T,r,d)\) is the set of all rows \(r^{\prime }\) of T such that \(d\notin D(r^{\prime })\), and \(F(T,r,d)=\{S_{1},\ldots ,S_{n}\}\). For \(i=1,\ldots ,n\), the set \(S_{i}\) coincides with the set of all rows from \(A(T,r,d)\) which differ from r in the column \(f_{i}\). One can show that the decision rule

$$ f_{i_{1}}=b_{i_{1}}\wedge \ldots \wedge f_{i_{m}}=b_{i_{m}}\rightarrow d $$

is an \(\alpha \)-decision rule for \((T,r)\) and decision \(d\in D(r)\) if and only if \(\{S_{i_{1}},\ldots ,S_{i_{m}}\}\) is an \(\alpha \)-cover for the set cover problem \((A(T,r,d), F(T,r,d))\). Evidently, for the considered set cover problem, \(C_{\mathrm {min}}(0,A(T,r,d),F(T,r,d))=L_{\mathrm {min}}(0,T,r,d)\), where \(L_{\mathrm {min}}(0,T,r,d)\) is the minimum length of a 0-decision rule for \((T,r)\) and decision \(d\in D(r)\).

Let us apply the greedy algorithm (see Algorithm 1) to the considered set cover problem. This algorithm constructs an \(\alpha \)-cover which corresponds to an \(\alpha \)-decision rule \(rule(\alpha ,T,r,d)\) for \((T,r)\) and decision \(d\in D(r)\). From Theorem 1 it follows that the length of this rule is at most

$$ L_{\mathrm {min}}(0,T,r,d)\ln (1/\alpha )+1. $$

We denote by \(L_{\mathrm {greedy}}(\alpha ,T,r)\) the length of the rule constructed by the following polynomial time algorithm: for a given \(\alpha \), \(0<\alpha <1\), decision table T, row r of T, and decision \(d\in D(r)\), we construct the set cover problem \((A(T,r,d), F(T,r,d))\) and then apply to this problem the greedy algorithm for construction of an \(\alpha \)-cover. We transform the obtained \(\alpha \)-cover into an \(\alpha \)-decision rule \(rule(\alpha ,T,r,d)\). Among the \(\alpha \)-decision rules \(rule(\alpha ,T,r,d)\), \(d\in D(r)\), we choose a rule with the minimum length. This rule is the output of the considered algorithm. Recall that \(L_{\mathrm {min}}(\alpha ,T,r)\) denotes the minimum length of an \(\alpha \)-decision rule for \((T,r)\). From what has been said above, we obtain the following statement.

Theorem 2

Let T be a nondegenerate decision table with many-valued decisions, r be a row of T, and \(\alpha \) be a real number such that \(0<\alpha <1\). Then

$$ L_{\mathrm {greedy}}(\alpha ,T,r)\le L_{\mathrm {min}}(0,T,r)\ln (1/\alpha )+1. $$

Note that the considered algorithm is a generalization of an algorithm studied in [16].

Example 1

Let us apply the considered greedy algorithm with \(\alpha = 0.1\) to the decision table \(T_0\) (see Table 1) and the second row \(r_2\) of this table.

For each \(d\in D(r_2)=\{1,3\}\), we construct the set cover problem \((A(T,r_2,d), F(T,r_2,d))\), where \(A(T,r_2,d)\) is the set of all rows \(r^\prime \) of T such that \(d\notin D(r^\prime )\), \(F(T,r_2,d)=\{S_1,S_2,S_3\}\), and \(S_i\) coincides with the set of rows from \(A(T,r_2,d)\) which differ from \(r_2\) in the column \(f_i\), \(i=1,2,3\). We have:

  • \(A(T,r_2,1)=\{r_3,r_4\}\), \(F(T,r_2,1)=\{S_1=\{r_3\}\), \(S_2=\{r_4\}\), \( S_3=\{r_4\}\}\),

  • \(A(T,r_2,3)=\{r_1,r_3,r_5\}\), \(F(T,r_2,3)=\{S_1=\{r_1,r_3,r_5\}\), \(S_2=\{r_5\}\), \(S_3=\{r_1\}\}\).

Now, we apply the greedy algorithm for the set cover problem (with \(\alpha =0.1\)) to each of the constructed set cover problems, and transform the obtained 0.1-covers into 0.1-decision rules.

For the case \(d=1\), we obtain the 0.1-cover \(\{S_1,S_2\}\) and corresponding 0.1-decision rule \(f_1=0\wedge f_2=1\rightarrow 1\).

For the case \(d=3\), we obtain the 0.1-cover \(\{S_1\}\) and the corresponding 0.1-decision rule \(f_1=0\rightarrow 3\). We choose the shortest rule, \(f_1=0\rightarrow 3\), which is the output of our algorithm.
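The computation of Example 1 can be reproduced with a few lines built on top of the greedy_alpha_cover sketch from Sect. 4. Since Table 1 is not reproduced here, the row values and the decision sets of the rows other than \(r_2\) below are hypothetical, chosen only to be consistent with the sets \(A(T,r_2,d)\) and \(S_i\) listed above.

```python
# Hypothetical data consistent with Example 1 (Table 1 itself is not shown).
T0 = [
    ((1, 1, 1), frozenset({1})),     # r1
    ((0, 1, 0), frozenset({1, 3})),  # r2
    ((1, 1, 0), frozenset({2})),     # r3
    ((0, 0, 1), frozenset({3})),     # r4
    ((1, 0, 0), frozenset({1, 2})),  # r5
]

def rule_for_row(table, r, alpha):
    """For each d in D(r), build (A(T,r,d), F(T,r,d)), run the greedy
    algorithm, and return the shortest resulting rule as (attrs, d).
    Column indices in attrs are 0-based, so index 0 stands for f1."""
    values, decisions = r
    best = None
    for d in sorted(decisions):
        A_rows = [row for row in table if d not in row[1]]   # U(T, r, d)
        F = [{j for j, row in enumerate(A_rows) if row[0][i] != values[i]}
             for i in range(len(values))]                    # S_1, ..., S_n
        attrs = greedy_alpha_cover(set(range(len(A_rows))), F, alpha)
        if best is None or len(attrs) < len(best[0]):
            best = (attrs, d)
    return best

print(rule_for_row(T0, T0[1], 0.1))   # -> ([0], 3), i.e. the rule f1 = 0 -> 3
```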

In order to show that the problem of minimization of \(\alpha \)-decision rule length is NP-hard, let us consider a set cover problem \((A,F)\) where \(A=\{a_{1},\ldots ,a_{N}\}\) and \(F=\{S_{1},\ldots ,S_{m}\}\). We define a decision table \(T(A,F)\); this table has m columns, corresponding to the sets \(S_{1},\ldots ,S_{m}\) respectively, and \(N+1\) rows. For \(j=1,\ldots ,N\), the j-th row corresponds to the element \(a_{j}\). The last, \((N+1)\)-th, row is filled with 0s. For \(j=1,\ldots ,N\) and \(i=1,\ldots ,m\), the value at the intersection of the j-th row and the i-th column is 1 if and only if \(a_{j}\in S_{i}\). The set of decisions corresponding to the last row is \(\{2\}\); all other rows are labeled with the set of decisions \(\{1\}\).

One can show that, for any \(\alpha \), \(0\le \alpha <1\), a subfamily \(\{S_{i_{1}},\ldots ,S_{i_{t}}\}\) is an \(\alpha \)-cover for \((A,F)\) if and only if the decision rule

$$ f_{i_{1}}=0\wedge \ldots \wedge f_{i_{t}}=0\rightarrow 2 $$

is an \(\alpha \)-decision rule for \(T(A,F)\) and the last row of \(T(A,F)\).

So, we have a polynomial time reduction of the problem of minimization of \(\alpha \)-cover cardinality to the problem of minimization of \(\alpha \)-decision rule length for decision tables with many-valued decisions. Since the first problem is NP-hard [20, 32], we have the following.

Proposition 1

For any \(\alpha \), \(0\le \alpha <1\), the problem of minimization of \(\alpha \)-decision rule length for decision tables with many-valued decisions is NP-hard.
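The reduction used above is straightforward to implement. A sketch (under the row representation used earlier; it assumes that distinct elements of A belong to distinct combinations of subsets, so that the resulting rows are pairwise different):

```python
def table_from_cover_problem(A, F):
    """Build the table T(A, F) from the NP-hardness reduction: one column per
    subset in F, one row per element of A, plus a final all-zero row."""
    rows = [(tuple(1 if a in S else 0 for S in F), frozenset({1})) for a in A]
    rows.append((tuple(0 for _ in F), frozenset({2})))   # the (N+1)-th row
    return rows
```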

5.1 Upper and Lower Bounds on \(L_{\mathrm {min}}(\alpha ,T,r)\)

In this section, we present some results connected with lower and upper bounds on the minimum length of \(\alpha \)-decision rules, based on information obtained during the work of the greedy algorithm.

Let T be a decision table with many-valued decisions, r be a row of T, and \(\alpha \) be a real number such that \(0\le \alpha <1\). We apply the greedy algorithm to T, r, \(\alpha \), and each \(d\in D(r)\) (in fact, to the corresponding set cover problem) and obtain, for every \(d\in D(r)\), an \(\alpha \)-decision rule for the pair \((T,r)\) and decision d. Among these rules we choose a rule with the minimum length, and denote its length by \(u(\alpha ,T,r)\). It is clear that

$$ L_{\mathrm {min}}(\alpha ,T,r)\le u(\alpha ,T,r). $$

Let \(d\in D(r)\). We apply the greedy algorithm to T, r, \(\alpha \), and d, and construct the \(\alpha \)-decision rule \(rule(\alpha ,T,r,d)\). Let the length of this rule be equal to t, and let \(\delta _{i}\), \(i=1,\ldots ,t\), be the number of rows from \(U(T,r,d)\) separated from the row r at the i-th step of the work of the greedy algorithm. We denote

$$ l(\alpha ,T,r,d)=\max \left\{ \left\lceil \frac{\lceil (1-\alpha )|U(T,r,d)|\rceil -(\delta _0+\ldots +\delta _i)}{\delta _{i+1}}\right\rceil : i=0,\ldots ,t-1\right\} , $$

where \(\delta _0=0\). Let us denote

$$ l(\alpha ,T,r)=\min _{d\in D(r)}l(\alpha ,T,r,d). $$

We can almost repeat the first part of the proof of Theorem 1.67 from [16] to obtain the following lower bound:

$$ L_{\mathrm {min}}(\alpha ,T,r,d)\ge l(\alpha ,T,r,d), $$

where \(L_{\mathrm {min}}(\alpha ,T,r,d)\) is the minimum length of an \(\alpha \)-decision rule for \((T,r)\) and d. From this inequality it follows that

$$ L_{\mathrm {min}}(\alpha ,T,r)\ge l(\alpha ,T,r). $$
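The bound \(l(\alpha ,T,r,d)\) is a direct function of the greedy trace \(\delta _1,\ldots ,\delta _t\), so it comes almost for free once the greedy algorithm has run. A sketch of its computation (names are ours):

```python
from math import ceil

def lower_bound(alpha, deltas, size_u):
    """l(alpha, T, r, d): deltas[i] is the number of rows of U(T, r, d)
    separated at step i+1 of the greedy algorithm; size_u = |U(T, r, d)|."""
    target = ceil((1 - alpha) * size_u)   # rows that must be separated
    covered, bound = 0, 0
    for delta in deltas:                  # delta runs over delta_1, ..., delta_t
        bound = max(bound, ceil((target - covered) / delta))
        covered += delta
    return bound
```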

5.2 0.5-Hypothesis

In the book [16], the following 0.5-hypothesis was formulated for decision tables with one-valued decisions: for the most part of decision tables, for each row r, during each step of decision rule construction the greedy algorithm chooses an attribute which separates from r at least one-half of the unseparated rows whose decisions differ from the decision attached to the row r.

Let T be a decision table with many-valued decisions and r be a row of T. We will say that the 0.5-hypothesis is true for T and r if, for any decision \(d\in D(r)\), during each step of the construction of a decision rule for the pair \((T,r)\) and decision d, the greedy algorithm chooses an attribute which separates from r at least 50 % of the unseparated rows from \(U(T,r,d)\).

We will say that the 0.5-hypothesis is true for T if it is true for each row of T.
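Whether the 0.5-hypothesis is true for a given table and row can be checked by replaying the greedy algorithm and testing each step. A sketch under the representation used earlier:

```python
def half_hypothesis_holds(table, r):
    """Check the 0.5-hypothesis for row r: for every d in D(r), each step of
    the greedy algorithm must separate at least half of the still-unseparated
    rows of U(T, r, d)."""
    values, decisions = r
    for d in decisions:
        A_rows = [row for row in table if d not in row[1]]
        F = [{j for j, row in enumerate(A_rows) if row[0][i] != values[i]}
             for i in range(len(values))]
        uncovered = set(range(len(A_rows)))
        while uncovered:
            best = max(F, key=lambda S: len(S & uncovered))
            if 2 * len(best & uncovered) < len(uncovered):
                return False
            uncovered -= best
    return True
```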

Now we consider some theoretical results regarding the 0.5-hypothesis for decision tables with many-valued decisions.

A binary information system I is a table with n rows (corresponding to objects) and m columns labeled with attributes \(f_1,\ldots ,f_m\). This table is filled with numbers from \(\{0,1\}\) (values of attributes). For \(j=1,\ldots ,n\), we denote by \(r_j\) the j-th row of the table I.

The information system I will be called strongly saturated if, for any row \(r_j=(b_1,\ldots ,b_m)\) of I, for any \(k\in \{0,\ldots ,n-1\}\) and for any k rows with numbers different from j, there exists a column \(f_i\) which has at least \(\frac{k}{2}\) numbers \(\lnot b_{i}\) (where \(b_i\) is the value of the column \(f_i\) for the row \(r_j\)) at the intersection with the considered k rows.

First, we evaluate the number of strongly saturated binary information systems. After that, we study the work of the greedy algorithm on a decision table with many-valued decisions obtained from a strongly saturated binary information system by adding a set of decisions to each row. It is clear that the 0.5-hypothesis holds for every such table.

Theorem 3

[16]. Let us consider binary information systems with n rows and \(m\ge n+\log _{2}n\) columns labeled with attributes \(f_1,\ldots ,f_m\). Then the fraction of strongly saturated information systems is at least \(1-1/2^{m-n-\log _{2}n+1}\).

For example, if \(m\ge n+\log _{2}n+6\), then at least \(99\,\%\) of binary information systems are strongly saturated.

Let us consider the work of the greedy algorithm on an arbitrary decision table T with many-valued decisions obtained from a strongly saturated binary information system. Let r be an arbitrary row of the table T and \(d\in D(r)\). For \(i=1,2,\ldots \), after step number i at most \(|U(T,r,d)|/2^{i}\) rows from \(U(T,r,d)\) are unseparated from r. It is not difficult to show that \(L_{\mathrm {greedy}}(\alpha ,T,r)\le \lceil \log _{2}(1/\alpha )\rceil \) for any real \(\alpha \), \(0<\alpha <1\), where \(L_{\mathrm {greedy}}(\alpha ,T,r)\) is the length of the \(\alpha \)-decision rule constructed by the greedy algorithm for \((T,r)\). One can prove that \(L_{\mathrm {greedy}}(0,T,r)\le \log _{2}|U(T,r,d)|+1\). It is easy to check that \(l(0,T,r)\le 2\).

6 Problem of Recognition of Labels of Points in the Plane

In this section, we present a problem of recognition of colors of points in the plane (note that we recognize labels attached to the points, and the labels are named colors), which illustrates the considered approach and the obtained bounds on the precision of the greedy algorithm relative to the length of \(\alpha \)-decision rules.

Suppose we have a finite set \(S=\{(a_1,b_1),\ldots ,(a_n,b_n)\}\) of points in the plane and a mapping \(\mu \) which associates with each point \((a_p,b_p)\) a nonempty subset \(\mu (a_p,b_p)\) of the set \(\{green, yellow, red\}\). Colors are interpreted as decisions, and for each point from S we need to find a decision (color) from the set of decisions attached to this point. We denote this problem by \((S,\mu )\).

To solve the problem \((S,\mu )\), we use attributes corresponding to straight lines given by equations of the form \(x=\beta \) or \(y=\gamma \). These attributes are defined on the set S and take values from the set \(\{0,1\}\). Consider the line given by the equation \(x=\beta \). The value of the corresponding attribute on a point \((a,b)\in S\) is equal to 0 if and only if \(a<\beta \). Consider the line given by the equation \(y=\gamma \). The value of the corresponding attribute is equal to 0 if and only if \(b<\gamma \).

We now choose a finite set of straight lines which allows us to construct a decision rule with the minimum length for the problem \((S,\mu )\). It is possible that \(a_i=a_j\) or \(b_i=b_j\) for different i and j. Let \(a_{i_{1}},\ldots ,a_{i_{m}}\) be all pairwise different numbers from the set \(\{a_1,\ldots ,a_n\}\), ordered such that \(a_{i_{1}}<\ldots <a_{i_{m}}\). Let \(b_{j_{1}},\ldots ,b_{j_{t}}\) be all pairwise different numbers from the set \(\{b_1,\ldots ,b_n\}\), ordered such that \(b_{j_{1}}<\ldots <b_{j_{t}}\).

One can show that there exists a decision rule with minimum length which uses only attributes corresponding to the straight lines defined by the equations \(x=a_{i_{1}}-1\), \(x=(a_{i_{1}}+a_{i_{2}})/2,\ldots \), \(x=(a_{i_{m-1}}+a_{i_{m}})/2\), \(x=a_{i_{m}}+1\), \(y=b_{j_{1}}-1\), \(y=(b_{j_{1}}+b_{j_{2}})/2,\ldots \), \(y=(b_{j_{t-1}}+b_{j_{t}})/2\), \(y=b_{j_{t}}+1\).

Now, we describe a decision table \(T(S,\mu )\) with \(m+t+2\) columns and n rows. Columns of this table are labeled with attributes \(f_1,\ldots ,f_{m+t+2}\), corresponding to the considered \(m+t+2\) lines. Attributes \(f_1,\ldots ,f_{m+1}\) correspond to the lines defined by the equations \(x=a_{i_{1}}-1\), \(x=(a_{i_{1}}+a_{i_{2}})/2,\ldots , x=(a_{i_{m-1}}+a_{i_{m}})/2, x=a_{i_{m}}+1\), respectively. Attributes \(f_{m+2},\ldots ,f_{m+t+2}\) correspond to the lines defined by the equations \(y=b_{j_{1}}-1\), \(y=(b_{j_{1}}+b_{j_{2}})/2,\ldots , y=(b_{j_{t-1}}+b_{j_{t}})/2, y=b_{j_{t}}+1\), respectively. Rows of the table \(T(S,\mu )\) correspond to the points \((a_1,b_1),\ldots ,(a_n,b_n)\). The value \(f_l(a_p,b_p)\) is placed at the intersection of the column \(f_l\) and the row \((a_p,b_p)\). For \(p=1,\ldots ,n\), the row \((a_p,b_p)\) is labeled with the set of decisions \(\mu (a_p,b_p)\).
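The construction of \(T(S,\mu )\) can be sketched as follows (\(\mu \) is assumed to be given as a dictionary from points to sets of colors; all names are ours):

```python
def table_from_points(points, mu):
    """Build T(S, mu): one attribute per line x = beta or y = gamma described
    above; a point takes value 0 on x = beta iff its x-coordinate is < beta,
    and similarly for y = gamma."""
    xs = sorted({a for a, _ in points})
    ys = sorted({b for _, b in points})
    x_cuts = [xs[0] - 1] + [(u + v) / 2 for u, v in zip(xs, xs[1:])] + [xs[-1] + 1]
    y_cuts = [ys[0] - 1] + [(u + v) / 2 for u, v in zip(ys, ys[1:])] + [ys[-1] + 1]
    return [(tuple([0 if a < beta else 1 for beta in x_cuts] +
                   [0 if b < gamma else 1 for gamma in y_cuts]),
             frozenset(mu[(a, b)]))
            for (a, b) in points]
```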

Example 2

A problem \((S,\mu )\) with four points and corresponding decision table \( T(S,\mu )\) is depicted in Fig. 1. We write “g” instead of “green”, “r” instead of “red”, and “y” instead of “yellow”.

Fig. 1. Problem \((S,\mu )\) and corresponding decision table \(T(S, \mu )\)

Let us evaluate the parameter \(M(T(S,\mu ))\).

Proposition 2

\(M(T(S,\mu ))\le 4\).

Proof

We denote \(T=T(S,\mu )\). Let \(\bar{\delta }=(\delta _1,\ldots ,\delta _{m+t+2})\in \{0,1\}^{m+t+2}\). If \(\delta _1=0\), or \(\delta _{m+1}=1\), or \(\delta _{m+2}=0\), or \(\delta _{m+t+2}=1\), then \(T(f_1,\delta _1)\), or \(T(f_{m+1},\delta _{m+1})\), or \(T(f_{m+2},\delta _{m+2})\), or \(T(f_{m+t+2},\delta _{m+t+2})\), respectively, is an empty table, and \(M(T,\bar{\delta })\le 1\). Let \(\delta _1=1\), \(\delta _{m+1}=0\), \(\delta _{m+2}=1\) and \(\delta _{m+t+2}=0\). One can show that in this case there exist \(i\in \{1,\ldots ,m\}\) and \(j\in \{m+2,\ldots ,m+t+1\}\) such that \(\delta _i=1\), \(\delta _{i+1}=0\), \(\delta _j=1\), and \(\delta _{j+1}=0\). It is clear that the table \(T(f_i,\delta _i)(f_{i+1},\delta _{i+1})(f_j,\delta _j)(f_{j+1},\delta _{j+1})\) contains exactly one row. So \(M(T,\bar{\delta })\le 4\) and \(M(T)\le 4\).    \(\square \)

From Lemma 1, Theorem 2 and Proposition 2 the next statement follows:

Corollary 1

For any real \(\alpha \), \(0<\alpha <1\), and any row r of the table \(T(S,\mu )\),

$$ L_{\mathrm {greedy}}(\alpha ,T(S,\mu ),r)<4\ln (1/\alpha )+1. $$

Note that \(4\ln (1/0.01)+1<19.43\), \(4\ln (1/0.1)+1<10.22\), \(4\ln (1/0.2)+1<7.44\), and \(4\ln (1/0.5)+1<3.78\).

7 Results of Experiments

This section consists of three parts:

  • experimental results for the many-valued decisions approach (Sect. 7.1),

  • experimental results for the generalized decision approach (Sect. 7.2),

  • comparative study (Sect. 7.3).

We consider a number of decision tables from the UCI Machine Learning Repository [12]. In some tables there were missing values; each such value was replaced with the most common value of the corresponding attribute. Some decision tables contain conditional attributes that take a unique value for each row; such attributes were removed. In some tables there were equal rows with, possibly, different decisions.

Table 2. Characteristics of decision tables with many-valued decisions

In this case, each group of identical rows was replaced with a single row from the group, labeled with the set of decisions attached to the rows from the group. To obtain rows labeled with sets containing more than one decision, we removed additional conditional attributes from the decision tables. Information about these decision tables can be found in Table 2. This table contains the name of the initial table, the number of rows (column “Rows”), the number of attributes (column “Attr”), the spectrum of the table (column “Spectrum”), and the list of names of removed attributes (column “Removed attributes”). The spectrum of a decision table with many-valued decisions is a sequence \(\#1\), \(\#2\),..., where \(\#i\), \(i=1,2,\ldots \), is the number of rows labeled with sets of decisions of cardinality i. All experiments were performed using the DAGGER software tool [1], written in C++.
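The preprocessing just described (merging identical rows and computing the spectrum) is easy to express in code; the following fragment is only an illustration of the transformation, since the actual experiments were performed with DAGGER.

```python
from collections import defaultdict

def merge_equal_rows(raw_rows):
    """Replace each group of rows with equal attribute values by a single row
    labeled with the set of all decisions met in the group.  Each raw row is
    a pair (values, decision) carrying a single decision."""
    groups = defaultdict(set)
    for values, decision in raw_rows:
        groups[tuple(values)].add(decision)
    return [(values, frozenset(ds)) for values, ds in groups.items()]

def spectrum(table):
    """Spectrum #1, #2, ...: #i is the number of rows whose set of decisions
    has cardinality i."""
    counts = defaultdict(int)
    for _, ds in table:
        counts[len(ds)] += 1
    return [counts[i] for i in range(1, max(counts, default=0) + 1)]
```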

7.1 Proposed Approach

We made four groups of experiments which are connected with:

  • length of constructed \(\alpha \)-decision rules,

  • number of different \(\alpha \)-decision rules,

  • lower and upper bounds on the minimum length of \(\alpha \)-decision rules,

  • 0.5-hypothesis.

The first group of experiments is the following. For the decision tables described in Table 2 and \(\alpha \in \{0.0,0.001,0.01,0.1,0.2,0.3\}\), we apply the greedy algorithm to each row of the table. After that, among the constructed rules, we find the minimum (column “min”), average (column “avg”), and maximum (column “max”) length of such rules. Results can be found in Tables 3 and 4.

Table 3. Length of \(\alpha \)-decision rules for \(\alpha \in \{0.0,0.001,0.01\}\) constructed by greedy algorithm
Table 4. Length of \(\alpha \)-decision rules for \(\alpha \in \{0.1,0.2,0.3\}\) constructed by greedy algorithm

One can see that the length of the constructed \(\alpha \)-decision rules decreases as the value of \(\alpha \) increases, and that the greedy algorithm constructs relatively short \(\alpha \)-decision rules.

Table 5 presents the number of different rules constructed by the greedy algorithm for \(\alpha \in \{0.0,0.001,0.01,0.1,0.2,0.3\}\). In the worst case, the number of different rules can be equal to the number of rows in the decision table T. One can see that, with the exception of three tables, the number of different rules is non-increasing as the value of \(\alpha \) increases.

Table 5. Number of different rules constructed by greedy algorithm

The next group of experimental results is connected with lower and upper bounds on the minimum length of \(\alpha \)-decision rules. Figures 2 and 3 present average values of the bounds \(l(\alpha ,T,r)\) and \(u(\alpha ,T,r)\) among all rows r of T for \(\alpha \), \(0\le \alpha <1\), with step 0.01.

The last group of experiments is connected with the 0.5-hypothesis. Table 6 contains, for \(i=1,2,\ldots \), the average percentage of rows separated at the i-th step of the greedy algorithm (average over all rows r and decisions \(d\in D(r)\)).

For the decision tables described in Table 2, we find the number of rows for which the 0.5-hypothesis is true. Table 7 contains the name of the decision table, the number of rows, and the number of rows for which the 0.5-hypothesis is true.

Results in Table 6 show that the average percentage of rows separated at the i-th step of the greedy algorithm during exact decision rule construction is greater than or equal to 50 %, with the exception of the 7-th step of the greedy algorithm for “spect-test-1”. Based on results in Table 7, we can see that the 0.5-hypothesis is true for 12 decision tables and is not true for 9 decision tables: “breast-cancer-1”, “kr-vs-kp-5”, “kr-vs-kp-4”, “lymphography-5”, “poker-hand-train-5”, “poker-hand-train-5a”, “spect-test-1”, “tic-tac-toe-3”, and “zoo-data-5”.

Fig. 2. Lower and upper bounds on \(L_{\mathrm {min}}(\alpha ,T,r)\) (“balance-scale-1” and “nursery-1”)

Fig. 3. Average values of lower and upper bounds on \(L_{\mathrm {min}}(\alpha ,T,r)\) (“kr-vs-kp-5” and “spect-test-1”)

Table 6. Average percentage of rows separated at i-th step of the greedy algorithm work
Table 7. Number of rows in decision tables for which 0.5-hypothesis is true

7.2 Generalized Decision Approach

In this section, we present experimental results for \(\alpha \)-decision rules constructed under the generalized decision approach, relative to:

  • length of constructed \(\alpha \)-decision rules,

  • number of different \(\alpha \)-decision rules,

  • 0.5-hypothesis.

In the generalized decision approach [23–25], the greedy algorithm constructs for each row one \(\alpha \)-decision rule which has on the right-hand side the generalized decision (a number encoding the set of decisions attached to the given row); see Table 8.

Table 8. Transformation of the set of decisions for the generalized decision approach
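Any injective encoding of decision sets by numbers serves the purpose of this transformation; the concrete encoding used in the experiments is the one shown in Table 8. A sketch:

```python
def generalized_decision_table(table):
    """Replace each set of decisions by a one-element set containing a number
    that encodes the original set, so equal sets get equal codes."""
    codes = {}
    result = []
    for values, ds in table:
        code = codes.setdefault(frozenset(ds), len(codes) + 1)
        result.append((values, frozenset({code})))
    return result
```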

For the decision tables described in Table 2 and \(\alpha \in \{0.0,0.001,0.01,0.1,0.2,0.3\}\), we apply the greedy algorithm to each row of the table. After that, among the constructed rules, we find the minimum (column “min”), average (column “avg”), and maximum (column “max”) length of such rules. Results can be found in Tables 9 and 10.

Table 9. Length of \(\alpha \)-decision rules for \(\alpha \in \{0.0,0.001,0.01\}\)–generalized decision approach
Table 10. Length of \(\alpha \)-decision rules for \(\alpha \in \{0.1,0.2,0.3\}\)–generalized decision approach
Table 11. Number of different rules–generalized decision approach
Table 12. Average percentage of rows separated at i-th step of the greedy algorithm work–generalized decision approach
Table 13. Number of rows in decision tables for which 0.5-hypothesis is true–generalized decision approach

We can say that, for this approach as well, the greedy algorithm constructs relatively short \(\alpha \)-decision rules.

We computed the number of different rules constructed by the greedy algorithm for \(\alpha \in \{0.0,0.001,0.01,0.1,0.2,0.3\}\); results can be found in Table 11. For the generalized decision approach, in the worst case, the number of different rules can be equal to the number of rows in the decision table T. With the exception of one table, the number of different rules is non-increasing with the growth of \(\alpha \).

The last group of experiments is connected with the 0.5-hypothesis. Table 12 contains, for \(i=1,2,\ldots \), the average percentage of rows separated at the i-th step of the work of the greedy algorithm (average over all rows r). For two decision tables, the average percentage of separated rows is less than 50 %: for “spect-test-1” at the 7-th and the 8-th steps, and for “mushroom-5” at the 6-th step.

Recall that the 0.5-hypothesis is true for T if it is true for each row of T. Table 13 contains, for the decision tables described in Table 2, the number of rows for which the 0.5-hypothesis is true. Out of 21 decision tables, the 0.5-hypothesis is not true for 8: “breast-cancer-1”, “kr-vs-kp-5”, “kr-vs-kp-4”, “lymphography-5”, “mushroom-5”, “spect-test-1”, “tic-tac-toe-3”, and “zoo-data-5”.

7.3 Comparative Study

In this section, we make a comparative study of \(\alpha \)-decision rules for the proposed approach and the generalized decision approach, relative to:

  • length of constructed \(\alpha \)-decision rules,

  • number of different \(\alpha \)-decision rules,

  • 0.5-hypothesis.

Table 14, based on results from Tables 3 and 9, presents, for \(\alpha \in \{0.0, 0.001, 0.01\}\), a comparison of the minimum (column “min”), average (column “avg”), and maximum (column “max”) length of \(\alpha \)-decision rules for both approaches. Each entry of Table 14 is equal to the (min, avg, max) length of \(\alpha \)-decision rules for the generalized decision approach divided by the corresponding (min, avg, max) length of \(\alpha \)-decision rules for the proposed approach.

Table 14. Comparison of length of \(\alpha \)-decision rules for \(\alpha \in \{0.0,0.001,0.01\}\)

We can find decision tables for which the minimum, average, and maximum length of \(\alpha \)-decision rules constructed using the proposed approach are two or more times smaller than the minimum, average, and maximum length of \(\alpha \)-decision rules constructed using the generalized decision approach. However, for the maximum length of 0.01-decision rules for the decision tables “poker-hand-train-5” and “poker-hand-train-5a”, we have the opposite situation.

Table 15, based on results from Tables 4 and 10, presents, for \(\alpha \in \{0.1, 0.2, 0.3\}\), a comparison of the minimum (column “min”), average (column “avg”), and maximum (column “max”) length of \(\alpha \)-decision rules for both approaches. Each entry of Table 15 is equal to the corresponding entry of Table 10 divided by the corresponding entry of Table 4.

Table 15. Comparison of length of \(\alpha \)-decision rules for \(\alpha \in \{0.1,0.2,0.3\}\)
Table 16. Comparison of number of different rules

The results are similar to those in Table 14.

Table 16, based on results from Tables 5 and 11, presents, for \(\alpha \in \{0.0, 0.001, 0.01, 0.1, 0.2, 0.3\}\), a comparison of the number of different \(\alpha \)-decision rules for both approaches. Each entry of Table 16 is equal to the number of different \(\alpha \)-decision rules for the generalized decision approach divided by the number of different \(\alpha \)-decision rules for the proposed approach. We can see that the number of different \(\alpha \)-decision rules for the generalized decision approach is often two or more times greater than the number of different rules for the proposed approach.

The last group of results is connected with the 0.5-hypothesis. Based on results from Tables 7 and 13, we can see that the 0.5-hypothesis is not true for 9 decision tables under the proposed approach and for 8 decision tables under the generalized decision approach. So, the difference is not significant.

8 Conclusions

We studied a greedy algorithm for construction of approximate decision rules. This algorithm has polynomial time complexity on the whole set of decision tables with many-valued decisions. We obtained a bound on the precision of this algorithm relative to the length of rules, and considered lower and upper bounds on the minimum length of \(\alpha \)-decision rules. We studied binary decision tables with many-valued decisions, but the considered approach can also be used for decision tables with more than two values of attributes, as presented in Sect. 7. The experimental results are connected with the construction of exact and approximate decision rules. Based on them, we can see that the greedy algorithm constructs relatively short \(\alpha \)-decision rules. We also presented results on the length and the number of different \(\alpha \)-decision rules, and on the 0.5-hypothesis, for the approach based on the generalized decision.

Based on the results connected with the comparison of the two approaches, we can see that the length and the number of different rules constructed in the framework of our approach (one decision from the set of decisions attached to a row) are usually smaller than the length and the number of different rules constructed in the framework of the generalized decision approach (all decisions from the set of decisions attached to a row).

Future investigations will be connected with the study of other greedy algorithms and the construction of classifiers for decision tables with many-valued decisions.