1 Introduction

The rapid development of computer technology has led to an exponential growth of data. According to a study cited in (Witten and Frank 2005), the amount of data in the world doubles every 20 months. Such rapidly growing big data is inevitably redundant, with many repetitions, gaps, inconsistencies, vagueness and uncertainties, and is therefore difficult to interpret and analyze. This low data quality creates serious problems for areas such as classification, clustering, feature selection, feature extraction, knowledge discovery, image processing, prediction and text mining. Among these areas, data mining is the one most affected by missing and hidden data, and the focus of this paper is therefore to reduce classification problems caused by missing and hidden data.

This study focuses on classification and proposes a new classification method. Classification, in its simplest definition, is the process of finding the class that an object belongs to. The number and labels of the classes are known in advance. Moreover, each object has features, and the relation between these features and the classes is what classification exploits. Nevertheless, some problems make correct classification difficult. For example, classifying within a dataset that is inconsistent, incomplete and contains redundant data is hard. To address these problems in data mining, preprocessing procedures such as data cleaning, relevance analysis, data transformation and reduction are applied. Recently, however, rough sets theory (RST) has been widely and effectively used for this purpose in the literature. Some researchers have also proposed classification methods for inconsistent and incomplete data. For example, Khoo et al. (1999) proposed a novel approach for the classification and rule induction of incomplete information systems. Xiang-Wei and Yian-Fang (2012) proposed a novel and effective preprocessing algorithm using RST. Chakhar and Saad (2012) proposed a methodology to support groups in multi-criteria classification problems.

RST makes a dataset more consistent by working on the incomplete and inconsistent data. Additionally, RST can find which features of an object are more effective than the others in the classification of the object. The experimental results as well as related works in the literature show that more effective classification can be achieved by using these aspects of RST.

2 Rough sets theory

Rough sets theory (RST) (Pawlak 1982, 1991, 1998; Pawlak and Skowron 2007) was proposed by Pawlak in 1982 as a mathematical method for generating and evaluating rules in expert systems for tasks such as pattern recognition, machine learning and knowledge discovery. RST has recently attracted attention due to its ability to obtain useful information from missing and inconsistent data, which has opened up very wide application areas (Kryszkiewicz 1998; Dubois and Prade 1990; Pawlak and Sowinski 1994; Jensen and Shen 2004; Duntsch and Gediga 1998; Pawlak 2002; Swiniarski and Skowron 2003; Hu and Cercone 1995; Slowinski and Zopounidis 1995; Pawlak 1984), and to resolve problems related to vague, uncertain and incomplete information. RST is also widely used for feature extraction and reduction in data mining.

Constructing a wide and comprehensive information system (IS) related to the problem is the first step of RST (Cekik and Telceken 2014).

$$\begin{aligned} \hbox {IS} = (U, A, D) \end{aligned}$$
(1)

U is a non-empty finite set of objects \((U=\{x_1,x_2,\ldots ,x_n\})\), A is a non-empty finite set of features \((A=\{a_1,a_2,\ldots ,a_m\})\), and D is a non-empty finite set of class decisions \((D=\{d_1,d_2,\ldots ,d_k\})\). Every \(a\in A\) defines an information function \(f_a:U \rightarrow V_a\), where \(V_a\) is the set of values of a, called the domain of feature a.
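As a concrete illustration, an information system can be represented as a simple lookup table. The following Python sketch uses hypothetical objects, features and decision values; none of them come from the paper's datasets:

```python
# A toy information system IS = (U, A, D). All objects, features and values
# below are hypothetical, not drawn from the paper's datasets.
U = ["x1", "x2", "x3", "x4"]
A = ["color", "size"]
values = {
    "x1": {"color": "red",  "size": "small"},
    "x2": {"color": "red",  "size": "large"},
    "x3": {"color": "blue", "size": "small"},
    "x4": {"color": "blue", "size": "large"},
}
D = {"x1": "yes", "x2": "yes", "x3": "no", "x4": "no"}

def f(a, x):
    """The information function f_a: U -> V_a is simply a table lookup."""
    return values[x][a]

print(f("color", "x3"))  # blue
```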

$$\begin{aligned} Z_{i,j}=\{ \forall a\in A |\;\; a(x_i )\ne a(x_j )\} \quad i,j = 1,2,\ldots ,n \end{aligned}$$
(2)

The core and reducts of the features are computed using a discernibility matrix of size \(n\times n\), where n denotes the number of objects. \(Z_{i,j}\) is defined as the set of all features which discern objects \(x_i\) and \(x_j\). The discernibility function is a Boolean function of m Boolean variables \(a_1^*,a_2^*,\ldots ,a_m^*\), which correspond to the features \(a_1,a_2,\ldots ,a_m\), defined as below:

$$\begin{aligned} f_{\mathrm{A}} \left( a_1^{*},a_2^{*},\ldots ,a_m^{*}\right) ={\wedge }\{{\vee } Z_{i,j}^{*} \;\;| \;\; 1\le j\le i\le n, \;\;Z_{i,j}\ne \varnothing \}\nonumber \\ \end{aligned}$$
(3)

where

$$\begin{aligned} Z_{i,j}^*=\{ a^* |\;\;a\in Z_{i,j} \}. \end{aligned}$$

The set of all prime implicants of \(f_{\mathrm{A}}\) determines the set of all reducts of A (Komorowski 1999). In this study, the core and reduct attributes of every object are obtained using the discernibility matrix and discernibility functions of RST. The goal is to identify, for each object, the characteristics that determine its class relative to the objects of the other classes. For example, if the first and the third attributes are obtained as the discernibility attributes of an object, these are the attributes that discern this object from the other objects. Let \(d_k\) denote a class; then,

$$\begin{aligned} i,j= & {} 1,2,\ldots ,n\nonumber \\ Z_{i,j}= & {} \left\{ \forall a\in A |\;\; a(x_i )\ne a(x_j ) \;\; \mathrm{and} \;\;d(x_i )\ne d(x_j )\right\} \end{aligned}$$
(4)
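The discernibility matrix of Eq. 2 and a search for the reducts determined by Eq. 3 can be sketched as follows. This is an illustrative brute-force implementation on hypothetical data, not the authors' code; practical RST toolkits compute prime implicants far more efficiently than exhaustive enumeration:

```python
from itertools import combinations

# Hypothetical information system: 4 objects, 3 features.
features = ["a1", "a2", "a3"]
objects = {
    "x1": ("red",  "small", "round"),
    "x2": ("red",  "large", "round"),
    "x3": ("blue", "small", "square"),
    "x4": ("blue", "large", "round"),
}

def discernibility_matrix(objs):
    """Z[(i, j)] = set of features whose values differ between objects i and j (Eq. 2)."""
    names = list(objs)
    Z = {}
    for i, j in combinations(range(len(names)), 2):
        xi, xj = objs[names[i]], objs[names[j]]
        Z[(i, j)] = {features[k] for k in range(len(features)) if xi[k] != xj[k]}
    return Z

def reducts(objs):
    """Minimal feature subsets hitting every non-empty Z_{i,j}: the prime
    implicants of the discernibility function (Eq. 3), found by brute force."""
    Z = [s for s in discernibility_matrix(objs).values() if s]
    found = []
    for r in range(1, len(features) + 1):
        for cand in combinations(features, r):
            if any(set(f) <= set(cand) for f in found):
                continue  # a smaller reduct is contained in cand: not minimal
            if all(set(cand) & s for s in Z):
                found.append(cand)
    return found

print(reducts(objects))  # [('a1', 'a2')]
```

Here both \(a_1\) and \(a_2\) appear as singleton entries in the matrix, so they form the core, and together they already discern every pair of objects.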

Another concept in RST is upper/lower approximation sets. Let \(R \subseteq A \) and \(X \subseteq U \). Set X can be approximated by using the features contained in R by defining the lower and upper approximations of X, denoted \(\underline{R}(X)\) and \(\overline{R}(X)\), respectively, where \(\underline{R}(X)=\{x \in U |\;\; [x]_R \subseteq X\}\) and \(\overline{R}(X)=\{x \in U | \;\;[x]_R \cap X \ne \emptyset \}\). The objects in \(\underline{R}(X)\) can be classified as members of X on the basis of the knowledge in R. On the other hand, the objects in \(\overline{R}(X)\) can only be classified as possible members of X. The boundary set of X is defined as \(\hbox {BN}_R(X)= \overline{R}(X)-\underline{R}(X)\) and thus consists of those objects that cannot be classified exactly into X on the basis of the knowledge in R. Four basic classes of rough sets can be defined as follows:

  • X is roughly R-definable, \(\Leftrightarrow \underline{R}(X) \ne \emptyset \;\; \textit{and} \;\;\overline{R}(X) \ne U. \)

  • X is internally R-undefinable, \(\Leftrightarrow \underline{R}(X) = \emptyset \;\; \textit{and} \;\;\overline{R}(X) \ne U.\)

  • X is externally R-undefinable, \(\Leftrightarrow \underline{R}(X) \ne \emptyset \;\; \textit{and} \;\;\overline{R}(X) = U. \)

  • X is totally R-undefinable, \(\Leftrightarrow \underline{R}(X) = \emptyset \;\; \textit{and} \;\;\overline{R}(X) = U. \)

The last RST concept we discuss is the positive, negative and boundary regions of features. Let P and Q be sets of attributes inducing equivalence relations over U. Then the positive, negative and boundary regions are defined as \(\hbox {POS}_P(Q)=\bigcup _{X \in U/Q}\underline{P}X\), \(\hbox {NEG}_P(Q)= U - \bigcup _{X \in U/Q}\overline{P}X\) and \(\hbox {BND}_P(Q)= \bigcup _{X \in U/Q}\overline{P}X-\bigcup _{X \in U/Q}\underline{P}X\), respectively.
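The approximations and regions above can be illustrated in a few lines of Python. The data, names and feature indices here are hypothetical:

```python
from collections import defaultdict

def partition(objs, R):
    """U / IND(R): group objects with identical values on the feature indices in R."""
    blocks = defaultdict(set)
    for name, vals in objs.items():
        blocks[tuple(vals[k] for k in R)].add(name)
    return list(blocks.values())

def lower(objs, R, X):
    """R-lower approximation: union of equivalence classes fully contained in X."""
    return set().union(*([b for b in partition(objs, R) if b <= X] or [set()]))

def upper(objs, R, X):
    """R-upper approximation: union of equivalence classes intersecting X."""
    return set().union(*([b for b in partition(objs, R) if b & X] or [set()]))

# Hypothetical data: the single feature cannot distinguish x2 from x3.
objs = {"x1": ("a",), "x2": ("b",), "x3": ("b",)}
X = {"x1", "x2"}
print(lower(objs, [0], X))  # {'x1'}
print(upper(objs, [0], X))  # all of U
# BN_R(X) = {'x2', 'x3'}; since the lower approximation is non-empty and the
# upper approximation equals U, X is externally R-undefinable in this example.
```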

Fig. 1 FWRSC classification model

3 A new rough sets classification method

The data mining discipline emerged to analyze big data and extract meaningful information in order to cope with the problem of data explosion. Data mining draws on statistics, machine learning and artificial intelligence, and it has attracted great interest from industry, the economy, academia and the business community. Data mining methods, which comprise numerous algorithms, are used in the health industry, basic sciences, banking, finance and market research. In the most general sense, data mining methods can be classified as unsupervised clustering, supervised learning and market basket analysis. Supervised learning can in turn be divided into classification, estimation and prediction.

Classification is the process of putting an object into a predefined class using its features. The classification technique is a systematic approach, which builds classification models from input datasets. Each technique relies on a learning algorithm, which tries to construct a relation between features of the input data and the class labels (Han et al. 2011; Tan et al. 2006; Cios et al. 2012).

Classification is accomplished by computing relations between features and classes according to the common and distinct features among the classes. Classification becomes easier when this relation is simple and the number of common features is small. Unfortunately, the reverse holds for most real-life applications: a feature can be common to more than one class, which makes classification hard and complex. Additionally, some attributes or attribute values can be redundant and unnecessary for the classification of an object. In practice, this means that an object can be classified even though the values of some attributes are missing, since the remaining attributes or attribute values are sufficient for classification or for making a decision.

In this study, discernible features were determined for each class. Hence, the relationship between a class and a feature value is simplified avoiding the problem of hidden classification. Consequently, the performance of the classification process is increased. In order to handle the problems explained above, a new rough-set-based classification method, named feature weighted rough set classification (FWRSC), is proposed in this paper. Steps of FWRSC are shown in Fig. 1. The preprocessing phase shown in the figure is used to fill missing values in the dataset using discretization, interpolation, etc. This step was not used in this paper since there are no missing values in the used datasets. In the second step of the processing pipeline in Fig. 1, i.e., the split data phase, the dataset is first divided into two parts as training and testing. The training set is then divided into several classes based on the class label. In this step, FWRSC processes the objects in the training dataset to determine the best attributes that discriminate the objects in their classes. To achieve this, FWRSC creates what is called the weight matrix scoring system. The rows of this matrix are the objects, and the columns are the attributes. Each cell of the matrix signifies the weight of the attribute in determining the particular object’s class label. The higher the weight of an attribute, the higher its contribution toward determining the class of that object. Similarly, the attributes with smaller values have smaller effect on determining the class label of the object. The details of how the weight matrix scoring system is calculated are given in Sect. 3.1. After the feature vectors are determined using the training set, they are used to classify the objects in the test set. The classification of a test object works as follows: Each attribute value of the test object is compared with the attributes of a training object. 
If the attribute values are the same, a score of 1 is assigned to that attribute; otherwise, 0 is assigned. The score values of all attributes are then added up to compute a cumulative score for the object. The test object is then assigned to the class of the training object for which this score is maximized. The details of this procedure are discussed in Sect. 3.2.

3.1 Weight matrix calculation

The discernibility relations are closely related to indiscernibility and belong to the most important relations considered in RST. The ability to discern between perceived objects is important for constructing many structures such as reducts, classifications, decision rules or algorithms (Pawlak and Skowron 2007). In particular, the efficiency of this relation affects the performance of the whole classifier in data mining. It reveals the core and reduct attributes, so the important attributes of an object are made clearly visible. Based on this information, a scoring system for each attribute of an object becomes applicable. The scoring process works similarly to term frequency (TF) (Manning et al. 2008) in text document classification. Here the discernibility function can be thought of as the text document, and the reducts, which are combined with the logical OR operator, as the sentences of the document. The weight of a feature is then computed like a term frequency; the only difference is that the number of OR operators plus one is used instead of the number of terms. For example, in an information system with five attributes, if the core attributes of an object are the first and the third attributes, then the weights of these two attributes must be increased. In this way, the influence of the core attributes, which better define the class of an object, is increased while the influence of the non-core attributes is reduced. Thus, a more accurate classification can be carried out.

In this study, a matrix for scoring the attributes of the system, called the weight matrix (\(W_{i,j}\)), is created. Let \(f_i\) be the discernibility function of the ith object, and let \(\hbox {Count}(f_i, a_j)\) be the frequency of attribute \(a_j\) in \(f_i\). If \(\bigvee \) denotes the Boolean \(\hbox {OR}\) operator in \(f_i\), then \(\hbox {Count}(f_i, \bigvee )\) is the number of \(\hbox {OR}\) operators in \(f_i\). Formally:

$$\begin{aligned} N= & {} \hbox {Count}\left( f_i,\bigvee \right) +1 \end{aligned}$$
(5)
$$\begin{aligned} d_{i,j}= & {} \frac{\hbox {Count}(f_i,a_j )}{N} \end{aligned}$$
(6)

Weight matrix (\(W_{i,j}\)):

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} d_{1,1}&{}d_{1,2}&{}...&{}d_{1,n}\\ d_{2,1}&{}d_{2,2}&{}...&{}d_{2,n}\\ ...&{}...&{}...&{}...\\ ...&{}...&{}...&{}...\\ d_{m,1}&{}d_{m,2}&{}...&{}d_{m,n}\\ \end{array} \right) \end{aligned}$$

Let \(\hbox {IS} = (U, A)\) be an information system with 10 objects and four attributes. The discernibility function for the fifth object is:

Fig. 2 Sets for classes

$$\begin{aligned} f_5= (a_1 \wedge a_2) \vee (a_2\wedge a_3 \wedge a_4) \end{aligned}$$

The attribute counts for the fifth object are:

$$\begin{aligned} \hbox {Count}(f_5,a_1 )= & {} 1,\;\; \hbox {Count}(f_5,a_2 )=2,\;\;\\ \hbox {Count}(f_5,a_3 )= & {} 1,\;\;\hbox {Count}(f_5,a_4 )=1,\;\;\\ N= & {} \hbox {Count}\left( f_5,\bigvee \right) +1=2 \end{aligned}$$

Accordingly,

$$\begin{aligned} d_{5,1}= & {} \frac{\hbox {Count}(f_5,a_1)}{N}= \frac{1}{2}=0.5,\\ d_{5,2}= & {} \frac{\hbox {Count}(f_5,a_2)}{N}= \frac{2}{2}=1.0,\\ d_{5,3}= & {} \frac{\hbox {Count}(f_5,a_3)}{N}= \frac{1}{2}=0.5, \\ d_{5,4}= & {} \frac{\hbox {Count}(f_5,a_4)}{N}= \frac{1}{2}=0.5, \end{aligned}$$

The weight of each attribute for the fifth object:

$$\begin{aligned} W_{5,j}=[0.5\quad 1\quad 0.5\quad 0.5] \end{aligned}$$

Here, attribute \(a_1\) appears only in \(a_1\wedge a_2\); hence, \(\hbox {Count}(f_5,a_1 )\) is equal to 1. Attribute \(a_2\) appears in both \(a_1\wedge a_2\) and \(a_2\wedge a_3\wedge a_4\); hence, \(\hbox {Count}(f_5,a_2 )\) is equal to 2. Vector \(W_{5,j}\) shows the weight of each attribute for the fifth object. As seen from the example, attribute \(a_2\) is more important than the other attributes for the fifth object.
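The worked example above can be checked with a short sketch. Representing the discernibility function as a list of conjunctions is our own illustrative choice, not notation from the paper:

```python
# f_5 = (a1 AND a2) OR (a2 AND a3 AND a4), encoded as a list of conjunctions.
f5 = [["a1", "a2"], ["a2", "a3", "a4"]]
attributes = ["a1", "a2", "a3", "a4"]

def weights(f, attrs):
    """Row of the weight matrix for one object (Eqs. 5 and 6)."""
    # N = number of OR operators + 1 = number of conjunctions.
    N = len(f)
    counts = {a: sum(conj.count(a) for conj in f) for a in attrs}
    return [counts[a] / N for a in attrs]

print(weights(f5, attributes))  # [0.5, 1.0, 0.5, 0.5]
```

This reproduces the vector \(W_{5,j}\) computed above, with \(a_2\) receiving the largest weight.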

3.2 Determining the class of an object

The first step in the classification process is to split the training data into its classes. Let \(C_t\) denote the set of objects belonging to the \(t{\mathrm{th}}\) class, and let D be the set of class decisions (Fig. 2).

Formal definitions of each class are as follows:

$$\begin{aligned} r= & {} \{d_i,d_j \in D \;\; |\;\; \forall d_i \not = \forall d_j,\;\; 0<i\le j \le S(D)\}\nonumber \\ t= & {} 1,2,\ldots ,S(r)\nonumber \\ C_t= & {} \{x_k\in U;\;\; d_i,d_j \in D \;\;| \;\; d_i= d_j , \;\;x_k\} \end{aligned}$$
(7)

Here, X is a finite set, and the function S(X) gives the number of elements in X. Accordingly, S(D) denotes the total number of objects in \(\hbox {IS}\), r denotes the set of distinct classes in \(\hbox {IS}\), and S(r) the number of classes.

Let \(\hbox {IS} = (U,A,D)\) be an information system, and let \(\beta , \gamma \subset \hbox {IS}\) be the test and the training sets, respectively. Then the class score for a test object for class ‘t’ is calculated as follows:

$$\begin{aligned}&R^*= S(U) \quad \hbox { and } \quad r^*=S(A)-1\nonumber \\&\hbox {Class}^t_{\mathrm{score}}=\frac{\sum _{\begin{array}{c} i=1,\\ \gamma _i \in C_t \end{array}}^{R^*}\sum _{\begin{array}{c} j=1,\\ \beta (a_j) = \gamma (a_j) \end{array}}^{r^*}(1+W_{i,j})}{S(C_t)} + \frac{S(C_t)}{S(U)} \end{aligned}$$
(8)

where \(S(C_t)\) is the number of objects (or elements) in the \( t{\mathrm{th}} \) class set, S(U) is the total number of objects in \(\hbox {IS}\), and the ratio of \(S(C_t)\) to S(U) is the class rate.

As seen from Eq. 8, the object to be classified in the test dataset is compared with all objects in the training dataset and a score is computed for each class. The computation of this score proceeds as follows: When the value of an attribute is the same for the object in question and for the object in the training set, its score value for similarity is 1, and this score is added to the weight of that attribute in \(W_{i,j}\). The reason for adding the feature weight to the similarity score is to prevent hidden classification and thus enable more sensitive classification by increasing the accuracy. Furthermore, it exposes the important and unimportant attributes that determine the class of an object. The total weight thus computed becomes the feature weight for the object for the particular class. Finally, the class rate is added to the score obtained thus far to come up with the overall score for the class.

To summarize, the computation of the class score for a test object consists of three parts: the similarity value, the feature value and the class rate. The similarity value incorporates how similar an object in test dataset is to each object in the training set, but it does not prevent hidden classification and does not expose hidden pattern in data. At this point, the feature value comes into play and assigns a weight to each attribute. By giving more weight to the important attributes, the hidden classification and pattern problems are avoided. Finally, the class rate ensures internal balance.

Having computed the class scores, the class of an object in the test dataset is determined by:

$$\begin{aligned} \hbox {Calculated class}=\hbox {max} \left( \hbox {Class}^t_{\mathrm{score}}\right) \end{aligned}$$
(9)

Clearly, an object in the test dataset is included in the class having the highest score value. If two or more classes have the same score value, then one of the classes is randomly chosen for the object.
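Under these definitions, Eqs. 8 and 9 can be sketched as follows. The helper names and the toy training set are hypothetical, and the precomputed weight vectors stand in for the weight matrix of Sect. 3.1:

```python
import random

def class_score(test_obj, train, labels, W, cls, n_total):
    """Sketch of Eq. 8: average weighted attribute-match score for one class,
    plus the class rate S(C_t)/S(U). `train` maps object id -> feature tuple,
    `labels` maps object id -> class, `W` maps object id -> weight vector."""
    members = [o for o in train if labels[o] == cls]
    total = 0.0
    for o in members:
        for j, v in enumerate(train[o]):
            if test_obj[j] == v:      # matching attribute value scores 1 ...
                total += 1 + W[o][j]  # ... plus that attribute's weight
    return total / len(members) + len(members) / n_total

def classify(test_obj, train, labels, W):
    """Eq. 9: the class with the highest score wins; ties are broken at random."""
    scores = {c: class_score(test_obj, train, labels, W, c, len(train))
              for c in set(labels.values())}
    best = max(scores.values())
    return random.choice([c for c, s in scores.items() if s == best])

# Hypothetical two-class training set with precomputed weight vectors.
train = {"x1": ("a", "b"), "x2": ("a", "c"), "x3": ("d", "c")}
labels = {"x1": "p", "x2": "p", "x3": "q"}
W = {"x1": [0.5, 0.5], "x2": [1.0, 0.5], "x3": [0.5, 1.0]}
print(classify(("a", "b"), train, labels, W))  # p
```

For the test object ("a", "b"), class p scores (3 + 2)/2 + 2/3 against roughly 0.33 for class q, so the object is assigned to p.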

The algorithm for the proposed FWRSC method is given below:

figure a
Table 1 Datasets
Table 2 Accuracy results for different classification methods

3.3 Experimental results

In this section, we quantitatively evaluate the performance of the proposed classification method. The datasets used in the experiments were taken from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html). The classification performance of FWRSC is compared with as many classification methods as possible from the WEKA workbench (Garner 1995; Markov and Russell 2006; Bouckaert 2013) on five different datasets. The datasets were chosen to cover a wide variety of application areas; for example, if one dataset is from the health industry, the other datasets belong to areas outside of it. Table 1 gives the details of the selected datasets.

Several criteria exist for the evaluation of a classification method. In this paper, accuracy, F-score (F-measure) and mean absolute error (MAE) are used. Accuracy is the proportion of test objects assigned to their correct class. In the statistical analysis of binary classification, the F-score (also F1 score or F-measure) is a measure of a test's accuracy that considers both the precision p and the recall r: p is the number of correct positive results divided by the number of all positive results, and r is the number of correct positive results divided by the number of positive results that should have been returned. The F-score can be interpreted as a weighted average of precision and recall, reaching its best value at 1 and its worst at 0. The mean absolute error (MAE) measures how close predictions are to the eventual outcomes.
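As a quick illustration of the three criteria, they can be computed by hand on a small binary test set (the labels below are illustrative only, not from the experiments):

```python
# Hand-computed accuracy, F-score and MAE on a hypothetical binary test set.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # correct positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # spurious positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # missed positives

accuracy = sum(t == p for t, p in pairs) / len(pairs)
precision = tp / (tp + fp)                     # p in the text
recall = tp / (tp + fn)                        # r in the text
f1 = 2 * precision * recall / (precision + recall)
mae = sum(abs(t - p) for t, p in pairs) / len(pairs)

print(round(accuracy, 3), round(f1, 3), round(mae, 3))  # 0.667 0.75 0.333
```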

Table 3 Classification methods having an average accuracy percentage above 60%
Table 4 MAE for different classification methods

The performance of the classification methods is reported in Tables 2, 4 and 5 for accuracy, MAE and F-score, respectively. The best results for each dataset are highlighted in all tables, and the results for the proposed FWRSC method are given in the last row of each table. Consider first the accuracy results in Table 2. While FWRSC has the highest accuracy on the Balloons dataset, its performance on the other datasets is close to that of the best classification method. For example, on the Tic-Tac-Toe dataset, the best accuracy, 63.55%, is obtained with the Rules-OneR classification method, while FWRSC produces the second highest accuracy of 62.30%. To summarize the accuracy results of Table 2 more compactly, only the classification methods with an average accuracy greater than 60% are listed in Table 3. It can clearly be observed from Table 3 that FWRSC produces the best average classification accuracy of 67.74% over all datasets, while giving the best performance on the Balloons dataset.

Table 5 F-measure results for different classification methods

Analyzing the MAE and F-score results given in Tables 4 and 5, respectively, we see that FWRSC gives the best MAE performance on the Haberman's Survival and Abalone datasets and the best F-score performance on the Balloons and Tic-Tac-Toe datasets.

We can also say that the performance of FWRSC on all datasets for MAE and F-score remains only slightly below the overall average of all classifiers. Moreover, FWRSC is among the classifiers that perform best after the OneR classifier for the MAE criterion on the Tic-Tac-Toe dataset in Table 4, which shows that FWRSC performs classification with a low error rate. Additionally, although OneR shows the best performance on the Tic-Tac-Toe dataset in Table 2, FWRSC shows the best performance on this dataset in Table 5. This means that FWRSC produces a more balanced classification than OneR and the other classifiers in the tables. All in all, it can be concluded that FWRSC ranks among the best classification methods found in the literature.

4 Conclusions and future work

In this article, a new classification method named feature weighted rough set classification (FWRSC) was proposed to alleviate classification problems caused by hidden patterns in data. The most important advantages of FWRSC are its reduced complexity, its handling of inconsistency in the dataset and its treatment of hidden classification. RST can find the core and reduct attributes; thus, the core attributes that influence the class of an object more than the others, as well as the attributes that have little impact on classification performance, are determined. As a result, the adverse effects of these classification problems are minimized, and a more efficient classification is made possible. FWRSC was compared with the classification methods in WEKA on five different datasets. The experimental results show that FWRSC gives higher performance than most of the methods in WEKA for many of the datasets. Furthermore, FWRSC gives the best average accuracy, with an overall average score of 67.74% over the five datasets. We believe that FWRSC will find its rightful place among the best classification methods in the literature. Our future goal is to improve the performance of FWRSC by incorporating hybrid RST techniques.