
1 Introduction

Interest in machine learning tools and algorithms has grown rapidly in recent years and continues to grow. There is a wide range of applications on the market that use various machine learning routines. There are, however, still only a few solutions compatible with the Microsoft .NET framework that provide machine learning algorithms. Those that exist focus on numerical rather than symbolic methods, and so far none of them has included rough set [1] based algorithms.

Machine learning models that are based on mathematically sophisticated methods may achieve high accuracy, but they are hardly understandable by users who expect not only accurate results but also an easy yet meaningful explanation of how these results were obtained. Models relying on symbolic, e.g., rule based methods may be less accurate but more intuitive and understandable for humans [2]. In both cases, feature subset selection increases the interpretability and practical usefulness of machine learning models.

Symbolic methods focus on finding relationships in data, typically reported in the form of rules in a feature-value language. The rules are built with the use of basic logical operators. Examples include rule induction methods such as learning if-then rules [3] or decision trees [4].

Rough sets have proven to be a successful tool in feature selection (see e.g. [5]). The rough set approach is based on decision reducts – irreducible subsets of features which determine specified decision classes to (almost) the same degree as the original set of features. Determining decisions can be interpreted analogously to, e.g., functional or multi-valued dependencies in relational databases. Subsets of features providing exactly the same degree of determination as the original set are often referred to as crisp decision reducts, as opposed to approximate decision reducts [6], where some controlled decrease of determination is allowed. By specifying a threshold for the allowed decrease of determination, one can address the balance between a decision model’s simplicity and its accuracy. Indeed, it is easier to search for smaller subsets of features yielding simpler sub-models under loosened constraints for decision determination, although too weak constraints may also cause poorer accuracy. However, even relatively less accurate sub-models may lead towards a very accurate final model, if the processes of sub-model design are appropriately synchronized.

NRough is a set of libraries written in the C# programming language focusing on rough sets and other symbolic machine learning methods. It contains a number of algorithms for searching for approximate decision reducts and constructing decision models. All presented algorithms have been successfully used in our previous research and have proven their value. The framework is aimed at researchers, who can extend it and test their methods against already implemented models. The second user group are developers and system integrators, who can include the described routines in their applications.

NRough can be downloaded from its GitHub repository [7] as well as from its dedicated website [8] in the form of a Microsoft Visual Studio solution containing the source code of all described libraries. The sources include unit test code that presents use case examples as well as unit testing procedures.

We present the framework’s key features as well as the formal definitions behind the implemented algorithms in Sect. 2. In this section we describe data representation, approximate decision reducts and decision reduct classifier ensembles. Moreover, we list other supervised and unsupervised machine learning algorithms included in NRough, as well as features related to model evaluation. Next, in Sect. 3 we describe the architecture of our solution. Licensing information is given in Sect. 4. In Sect. 5 we describe other rough set based frameworks. The final Sect. 6 concludes this paper and includes a draft road map as well as the direction in which we would like the framework to evolve.

2 Key Features

The NRough framework contains a number of algorithms for searching for approximate decision reducts and constructing decision models based on rough set theory. Moreover, we added a number of decision rule based classifiers known from machine learning. Last but not least, the framework contains routines for classifier validation and result presentation. Below, we present the list of implemented features, starting with the data representation.

2.1 Data Representation

We represent data in the form of a decision table, which is a well known structure in the rough sets domain. A decision table is a tuple \(\mathbb {A}=(U, A \cup \{d\})\), where U is a finite set of objects, A is a finite set of attributes and \(d \notin A\) is a distinguished decision attribute. We refer to elements of U using their ordinal numbers \(i=0,...,|U|-1\) as well as by a unique record identifier (if one exists in the data set). We treat attributes \(a\in A\) as functions \(a:U\rightarrow V_a\), where \(V_a\) denotes the set of values of a. Values \(v_d\in V_d\) correspond to decision classes that we want to describe using the values of attributes in A. The framework internally encodes attribute values using the signed long base type, which allows generic data access and avoids boxing/unboxing from the original types, i.e., it yields faster computations. Internal values can, however, be converted back to their original typed values using dictionary lookup methods. There are no restrictions on input types, which are automatically recognized during data loading.
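The encoding idea can be sketched as follows (a minimal illustration with a hypothetical AttributeEncoder helper, not the framework’s actual classes): each distinct value receives a consecutive long code, and a reverse dictionary restores the original value on demand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of long-based value encoding: every distinct
// attribute value gets a consecutive long code; a reverse lookup restores
// the original typed value.
public sealed class AttributeEncoder<T>
{
    private readonly Dictionary<T, long> toCode = new Dictionary<T, long>();
    private readonly Dictionary<long, T> fromCode = new Dictionary<long, T>();

    public long Encode(T value)
    {
        if (!toCode.TryGetValue(value, out long code))
        {
            code = toCode.Count;          // next unused code
            toCode[value] = code;
            fromCode[code] = value;
        }
        return code;
    }

    public T Decode(long code) => fromCode[code];
}

public static class EncoderDemo
{
    public static void Main()
    {
        var enc = new AttributeEncoder<string>();
        long sunny = enc.Encode("sunny");     // 0
        long rainy = enc.Encode("rainy");     // 1
        Console.WriteLine(enc.Decode(sunny)); // prints "sunny"
        Console.WriteLine(rainy);             // prints 1
    }
}
```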

Decision tables can be loaded from text files (including comma delimited files and the RSES 1.0 format [9]) as well as from a System.Data.DataTable instance, which is often used in .NET to store SQL query results. The framework includes a number of filters to manipulate the data, such as removal of selected attributes or records based on given user criteria. The filter concept is also used to define more sophisticated data manipulations such as numeric value discretization.
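On the .NET side, such a DataTable is typically obtained with standard ADO.NET code, as sketched below; the query and connection string are placeholders, and the resulting table is what the framework’s DataTable-based loading consumes.

```csharp
using System.Data;
using System.Data.SqlClient;

// Filling a System.Data.DataTable from an SQL query -- the standard .NET
// route; the resulting table can then be handed over to the framework.
public static class DataLoading
{
    public static DataTable LoadWeather(string connectionString)
    {
        var table = new DataTable("weather");
        using (var conn = new SqlConnection(connectionString))
        using (var adapter = new SqlDataAdapter("SELECT * FROM weather", conn))
        {
            adapter.Fill(table);   // opens the connection, executes, fills rows
        }
        return table;
    }
}
```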

One of the key concepts in rough set theory is the indiscernibility relation. For any subset of attributes \(B \subseteq A\) and object \(x \in U\) we can define an information vector \(B(x) = [ a_{i_{1}}(x),\dots ,a_{i_{|B|}}(x) ]\), where \(a_{i_{j}}(x)\) are values of attributes \(a_{i_{j}} \in B\) for \(j = 1, \dots ,|B|\). We denote the set of all B-information vectors occurring in \(\mathbb {A}\) by \(V_{B} = \{B(x) : x \in U \}\) and enumerate its elements as \(v_{1}, \dots ,v_{|V_{B}|}\). Each subset \(B \subseteq A\) partitions the universe U into so called equivalence classes, yielding the partition space \(U/B = \{E_{1},\dots ,E_{|V_{B}|}\}\), where each equivalence class is defined as \(E_{t} = \{x \in U : B(x) = v_{t} \}\) for \(t = 1,\dots ,|V_{B}|\). NRough implements this concept as a dedicated data structure which is used in many scenarios, such as calculating the information entropy, majority or relative gain functions, to name a few. The majority function is used in many ways by the framework’s algorithms, e.g., for approximate decision reduct computation as well as for decision tree pre-pruning or branch split calculation.
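For illustration, the partition U/B and the majority function mentioned above can be computed with a few lines of LINQ. The sketch below is our own illustration over a plain long[][] table of encoded values, not the framework’s dedicated data structure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration: objects are rows of encoded values; grouping rows by their
// B-information vectors yields the partition U/B, and the majority measure
// follows by counting the most frequent decision inside each class.
public static class Partitions
{
    public static double MajorityMeasure(long[][] data, int[] B, int decisionCol)
    {
        var classes = data.GroupBy(
            row => string.Join("|", B.Select(a => row[a])));   // key = B(x)

        double hits = classes.Sum(E =>
            E.GroupBy(row => row[decisionCol])                 // split E by decision
             .Max(g => g.Count()));                            // majority count

        return hits / data.Length;                             // fraction covered
    }
}
```

The returned value is the fraction of objects whose decision agrees with the majority decision of their own equivalence class, i.e., the M(B) measure defined formally in the next subsection.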

Last but not least, the library contains a number of benchmark data sets taken from the UCI repository [10]. These data can be accessed with predefined methods loading the data into memory, including their metadata.

2.2 Approximate Decision Reducts

Attribute selection plays an important role in knowledge discovery. It establishes the basis for more efficient classification, prediction and approximation models. Attribute selection methods originating from the theory of rough sets aim at searching for so called decision reducts – irreducible subsets of attributes that satisfy predefined criteria for keeping enough information about decision classes. NRough contains a number of algorithms for computing approximate decision reducts (as well as crisp decision reducts, when the approximation threshold is set to 0). All reduct computation algorithms are based on a heuristic approach and many utilize parallel computing.

We define an \((F,\varepsilon )\)-approximate decision reduct [11], where F is a measure \(F(d|\cdot ) : 2^{A} \rightarrow \mathfrak {R}\) which evaluates the degree of influence F(d|B) of a subset \(B \subseteq A\) on d. Below we present the definition as well as the general routine for computing \((F,\varepsilon )\)-approximate decision reducts, given as Algorithm 1 and called \((F,\varepsilon )\)-REDORD [11].

Definition 1

Let \(\varepsilon \in [0,1)\) and \(\mathbb {A}=(U, A \cup \{d\})\) be given. We say that \(B \subseteq A\) is an \((F,\varepsilon )\)-approximate decision reduct, iff it is an irreducible subset of attributes satisfying the following condition:

$$\begin{aligned} F(B) \ge (1 - \varepsilon )F(A) \end{aligned}$$
(1)
(Algorithm 1: the \((F,\varepsilon )\)-REDORD procedure, rendered as a figure in the original.)
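Since Algorithm 1 is only available here as a figure, the C# sketch below reconstructs its general idea under our reading of [11]: an attribute permutation is fixed and attributes are greedily dropped as long as condition (1) still holds. It is an illustration, not a transcription of NRough’s implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reconstruction sketch of the permutation-based reduction idea behind
// (F,eps)-REDORD: start from the full attribute set and drop attributes
// one by one, in random order, whenever condition (1) is preserved.
public static class RedOrd
{
    public static ISet<int> Reduce(
        int[] attributes,                 // indices of A
        Func<ISet<int>, double> F,        // measure F(d|.)
        double eps)
    {
        var B = new HashSet<int>(attributes);
        double threshold = (1.0 - eps) * F(B);    // (1 - eps) * F(A)

        var rnd = new Random();
        foreach (int a in attributes.OrderBy(_ => rnd.Next()))  // permutation
        {
            B.Remove(a);
            if (F(B) < threshold)
                B.Add(a);                 // removal violates (1) -> restore
        }
        return B;                         // irreducible w.r.t. the visited order
    }
}
```

The measure delegate F can be instantiated, for instance, with the majority measure sketched in Sect. 2.1, or with any of the measures defined below.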

The framework defines three types of F measures: \(\gamma (B)\) [1], which is based on the so called positive region, the Majority measure M(B) [11] and the Relative Gain measure R(B) [12]. In the formulas below, \(X_k = \{x \in U : d(x) = k\}\), \(k \in V_d\), denote the decision classes. Other user defined measures can also be used with the \((F,\varepsilon )\)-approximate reduct computation algorithm.

$$\begin{aligned} M(B) = \frac{1}{|U|}\sum _{E\in U/B} \max _{k \in V_d} |X_k \cap E| \end{aligned}$$
(2)
$$\begin{aligned} R(B) = \frac{1}{|V_d|}\sum _{E\in U/B}\max _{X\in U/\{d\}}\frac{|X \cap E|}{|X|} \end{aligned}$$
(3)
$$\begin{aligned} \gamma (B) = \frac{1}{|U|}|POS(B)| = \frac{1}{|U|} \sum _{E\in U/B:\, \exists _{k}\, P(X_k|E)=1}|E| \end{aligned}$$
(4)

In [13] it was shown how to compute approximate decision reducts over a universe of weighted objects, and that two different weighting schemes lead to a unified way of computing the M(B) and R(B) measures: for \(\varvec{1} : U \rightarrow \{1\}\) we have \(M_{\varvec{1}}(B)=M(B)\), while for \(r(u) = \frac{1}{|\{x \in U : d(x) = d(u)\}|}\) we obtain \(M_{r}(B) = R(B)\).

Definition 2

Let \(\varepsilon \in [0,1)\), \(\mathbb {A}=(U, A \cup \{d\})\) and \(\omega : U \rightarrow [0,+\infty )\) be given. We say that \(B \subseteq A\) is an \((\omega ,\varepsilon )\)-approximate decision reduct, iff it is an irreducible subset of attributes satisfying the following condition:

$$\begin{aligned} M_{\omega }(B) \ge (1 - \varepsilon )M_{\omega }(A) \end{aligned}$$
(5)
$$\begin{aligned} M_{\omega }(B) = \frac{1}{|U|_{\omega }}\sum _{E\in U/B} \max _{k \in V_d} |X_k \cap E|_{\omega } \end{aligned}$$
(6)
$$\begin{aligned} |Y|_{\omega } = \sum _{u \in Y} \omega (u) \end{aligned}$$
(7)

Moreover, the framework contains algorithms for computing decision bireducts and their derivatives: \(\gamma \)-bireducts and relative-bireducts [14]. Below we present the definitions as well as pseudocode as Algorithm 2. In [15] we showed relationships between \((F,\varepsilon )\)-approximate decision reducts and different types of bireducts.

Definition 3

Let \(\mathbb {A}=(U,A\cup \{d\})\) be a decision system. A pair \((B,X)\), where \(B \subseteq A\) and \(X \subseteq U\), is called a decision bireduct, iff B discerns all pairs \(i,j \in X\) where \(d(i) \ne d(j)\), and the following properties hold:

1. There is no \(C \subsetneq B\) such that C discerns all pairs \(i,j \in X\) where \(d(i) \ne d(j)\);

2. There is no \(Y \supsetneq X\) such that B discerns all pairs \(i,j \in Y\) where \(d(i) \ne d(j)\).

Definition 4

Let \(\mathbb {A}=(U,A\cup \{d\})\) be a decision system. A pair \((B,X)\), where \(B \subseteq A\) and \(X \subseteq U\), is called a decision \(\gamma \)-bireduct, iff B discerns all pairs \(i \in X, j \in U\) where \(d(i) \ne d(j)\), and the following properties hold:

1. There is no \(C \subsetneq B\) such that C discerns all pairs \(i \in X, j \in U\) where \(d(i) \ne d(j)\);

2. There is no \(Y \supsetneq X\) such that B discerns all pairs \(i \in Y, j \in U\) where \(d(i) \ne d(j)\).

(Algorithm 2: decision bireduct computation, rendered as a figure in the original.)
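The discernibility condition shared by Definitions 3 and 4 can be checked directly. The sketch below verifies the Definition 3 variant for a candidate pair (B, X) and is our own illustration; irreducibility of B and maximality of X require the additional checks listed in the definition.

```csharp
using System.Collections.Generic;
using System.Linq;

// Core bireduct condition of Definition 3: within the object subset X,
// any two objects with different decisions must be discerned by at least
// one attribute of B.
public static class Bireducts
{
    public static bool Discerns(
        long[][] data, IList<int> X, ISet<int> B, int decisionCol)
    {
        for (int p = 0; p < X.Count; p++)
            for (int q = p + 1; q < X.Count; q++)
            {
                long[] u = data[X[p]], v = data[X[q]];
                if (u[decisionCol] != v[decisionCol] &&
                    B.All(a => u[a] == v[a]))      // indiscernible conflict
                    return false;
            }
        return true;
    }
}
```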

In [16] we presented a new definition of the so called generalized majority decision function, which can be treated as an extension of the well known generalized decision function. We also gave the definition of generalized approximate majority decision reducts. The pseudocode is presented as Algorithm 3. An interesting extension is to use so called exceptions, which on the one hand allow further feature reduction in the main model and on the other hand store details about outliers. Both definitions are implemented in NRough.

Definition 5

For any decision table \({\mathbb {A}} = (U,A\cup \{d\})\) and approximation threshold \(\varepsilon \in [0, 1)\) one can consider the generalized approximate majority decision function \(m_{d}^{\varepsilon } : 2^U \rightarrow 2^{V_d}\) of the following form:

$$\begin{aligned} m_{d}^{\varepsilon }(E) = \{ k : |X_k \cap E| \ge (1 - \varepsilon ) \max _{j} |X_j \cap E| \} \end{aligned}$$
(8)
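A direct reading of formula (8) as code (our own illustration; decisions are assumed to be encoded as long values):

```csharp
using System.Collections.Generic;
using System.Linq;

// Generalized approximate majority decision function (8): given the
// decision values of the objects in an equivalence class E, return every
// decision whose support is within a (1 - eps) factor of the most
// frequent one.
public static class GeneralizedDecision
{
    public static ISet<long> MajorityDecisions(
        IEnumerable<long> decisionsInE, double eps)
    {
        var counts = decisionsInE.GroupBy(d => d)
                                 .ToDictionary(g => g.Key, g => g.Count());
        double bound = (1.0 - eps) * counts.Values.Max();
        return new HashSet<long>(
            counts.Where(kv => kv.Value >= bound).Select(kv => kv.Key));
    }
}
```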

Definition 6

Let \({\mathbb {A}} = (U,A\cup \{d\})\) be given. We say that \(B \subseteq A\) is an \(m_{d}^{\varepsilon }\)-decision superreduct, if and only if the following condition holds:

$$\begin{aligned} \forall _{x \in U} \;\; m_{d}^{\varepsilon }([x]_B) \subseteq m_{d}^{\varepsilon }([x]_A) \end{aligned}$$
(9)
where \([x]_B \in U/B\) denotes the B-indiscernibility class of x.

We say that B is an \(m_{d}^{\varepsilon }\)-decision reduct, if and only if it is an \(m_{d}^{\varepsilon }\)-superreduct and none of its proper subsets satisfies the above condition.

(Algorithm 3: generalized majority decision reduct computation, rendered as a figure in the original.)

2.3 Approximate Decision Reduct Classifier Ensembles

Approximate decision reducts usually include fewer attributes than classical reducts. On the other hand, they may generate if-then rules that make mistakes even on the training samples. For noisy data sets this is to some extent desirable. Nevertheless, some methods for controlling those mistakes should be considered. For example, if the goal is to construct a classification model based on several approximate decision reducts, then – following ideas taken from machine learning [17] – one may wish to ensure that if-then rules generated by different reducts do not repeat the same mistakes on the training data. For this purpose, we can consider a mechanism aiming at diversifying the importance of particular objects while searching for different approximate reducts. The same mechanisms are used in classifier ensemble methods. These methods usually perform better than their components used independently [18, 19]. Combining classifiers is efficient especially if they are substantially different from each other. In fact, the feature subsets applied in ensembles can be relatively smaller than in the case of a single feature subset approach, if we can guarantee that the combination of less accurate classifier components (further referred to as weak classifiers) will lead back to a satisfactory level of determining the decision or preserving information about it.

NRough includes several mechanisms for learning ensembles of approximate decision reduct classifiers. One method is based on the well known Adaptive Boosting (AdaBoost) algorithm [20]. In NRough we introduced an AdaBoost version which uses decision rules derived from approximate decision reducts [21]; the pseudocode is presented as Algorithm 5.
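For reference, a compact AdaBoost.M1-style training loop is sketched below. It follows the textbook scheme [20] with an abstract weighted weak learner (in [21] the weak models are rule sets derived from approximate decision reducts) and is not a transcription of Algorithm 5.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified AdaBoost.M1 loop: weak classifiers are fitted iteratively on
// weighted data, and object weights are increased on the examples the
// last model misclassified.
public static class Boosting
{
    public static List<(Func<long[], long> Model, double Alpha)> Train(
        long[][] X, long[] y,
        Func<long[][], long[], double[], Func<long[], long>> trainWeak,
        int rounds)
    {
        int n = X.Length;
        double[] w = Enumerable.Repeat(1.0 / n, n).ToArray();
        var ensemble = new List<(Func<long[], long> Model, double Alpha)>();

        for (int t = 0; t < rounds; t++)
        {
            Func<long[], long> h = trainWeak(X, y, w);         // weighted fit
            double err = Enumerable.Range(0, n)
                                   .Where(i => h(X[i]) != y[i])
                                   .Sum(i => w[i]);
            if (err <= 0.0 || err >= 0.5) break;               // degenerate model
            double alpha = 0.5 * Math.Log((1.0 - err) / err);  // model weight
            for (int i = 0; i < n; i++)                        // boost mistakes
                w[i] *= Math.Exp(h(X[i]) == y[i] ? -alpha : alpha);
            double z = w.Sum();
            for (int i = 0; i < n; i++) w[i] /= z;             // renormalize
            ensemble.Add((h, alpha));
        }
        return ensemble;
    }
}
```

Prediction then combines the weak models by an alpha-weighted vote over their outputs.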

When a reduct ensemble is used to create decision rules, one needs a method for combining the weak classifier outputs. We implemented several voting mechanisms described in [22]. We present the different voting options in Table 1 in a way compliant with \((\omega , \varepsilon )\)-approximate decision reducts. The voting weights are presented in a slightly changed form, where

$$\begin{aligned} X^{\omega }_{E} = \mathop {\text {argmax}}\limits _{X\in U/\{d\}}|X \cap E|_{\omega } \end{aligned}$$
(10)
(Algorithm pseudocode, rendered as a figure in the original.)
Table 1. Six options of weighting decisions by if-then rules, corresponding to the consequent coefficient types plain, \(\omega \)-confidence and \(\omega \)-coverage, and antecedent coefficient types single and \(\omega \)-support. \(|E|_{\omega }\) denotes the support of a rule’s left side. \(X_{E}^{\omega }\) is defined by formula (10).
(Table 1 body, rendered as an image in the original.)

Another method introduced for decision reduct diversification is based on hierarchical clustering of decision reducts, where the distance between reducts is based on binary vectors created according to Formula (11). Figure 1 presents an example of a dendrogram created by hierarchical clustering of approximate decision reducts. By choosing a cut level we create a number of reduct groups. From each group a single reduct is selected and used as a base for creating a weak classifier. The weak classifiers together form a classifier ensemble.

$$\begin{aligned} \overrightarrow{v_{B}}[k] = {\left\{ \begin{array}{ll} 1, &{} \text {if} \quad x_k \in \mathop {\text {argmax}}\limits _{X \in U/\{d\}} |X \cap [x_k]_B| \\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(11)
where \([x_k]_B \in U/B\) denotes the B-equivalence class of object \(x_k\), i.e., \(\overrightarrow{v_{B}}[k]=1\) iff the decision of \(x_k\) agrees with the majority decision of its own B-class.
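A sketch of Formula (11) and of the Hamming distance used to compare reducts (our own illustration over a plain encoded table):

```csharp
using System.Linq;

// Characteristic vector of a reduct B per Formula (11): mark the objects
// whose decision agrees with the majority decision of their B-class.
// Hamming distance between two such vectors measures how differently two
// reducts behave on the training data.
public static class ReductDiversity
{
    public static int[] CharacteristicVector(long[][] data, int[] B, int decisionCol)
    {
        return data
            .Select((row, i) => new { i, row })
            .GroupBy(o => string.Join("|", B.Select(a => o.row[a])))
            .SelectMany(E =>
            {
                long top = E.GroupBy(o => o.row[decisionCol])
                            .OrderByDescending(g => g.Count())
                            .First().Key;                  // majority decision of E
                return E.Select(o => new { o.i, hit = o.row[decisionCol] == top });
            })
            .OrderBy(o => o.i)                             // restore object order
            .Select(o => o.hit ? 1 : 0)
            .ToArray();
    }

    public static int Hamming(int[] u, int[] v) =>
        u.Zip(v, (a, b) => a == b ? 0 : 1).Sum();
}
```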
Fig. 1. Dendrogram created by hierarchical clustering of 18 approximate decision reducts. (Image in the original.)

The framework also includes selection mechanisms which allow one to select from a reduct pool those reducts that meet user defined criteria, e.g., contain the smallest number of features, generate the fewest decision rules, etc.

2.4 Other Machine Learning Algorithms

Apart from rough set inspired classifiers, the framework includes a number of decision rule induction algorithms. These algorithms can be combined with the rough set based feature selection methods described in the previous section. The current implementation includes the following routines:

 

Decision lists:

generation routine based on feature subsets and a given decision table.

Majority voting:

based on a feature subset and a given decision table [23].

Decision trees:

(C4.5 [24], ID3 [25]) implementation supporting numerical and nominal attribute types. The impurity functions can be easily exchanged with one another and include information entropy, the Gini index, the majority function or other user defined methods (a sketch of such exchangeable impurity functions follows this list). The decision tree base class includes a pruning option; the current implementation provides error based pruning and reduced error pruning [26]. We also introduce a new pre-pruning method based on the majority function.

Random forest:

implementation with options to select the ensemble size, the base decision tree type and the data sampling method.

1R rule inducer:

which, as suggested in [27], can be used to calculate a data set’s baseline accuracy.

Constant decision:

classifier which classifies all objects as the majority decision from the training data set.

 
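The exchangeable impurity functions mentioned for decision trees above can be illustrated as simple functions over a class-frequency vector (a sketch of the pluggable design; NRough’s actual signatures may differ):

```csharp
using System;
using System.Linq;

// Illustration of exchangeable impurity functions over a vector of
// per-class object counts.
public static class Impurity
{
    // Shannon entropy: -sum p_k * log2(p_k)
    public static double Entropy(double[] counts)
    {
        double n = counts.Sum();
        return -counts.Where(c => c > 0)
                      .Sum(c => (c / n) * Math.Log(c / n, 2));
    }

    // Gini index: 1 - sum p_k^2
    public static double Gini(double[] counts)
    {
        double n = counts.Sum();
        return 1.0 - counts.Sum(c => (c / n) * (c / n));
    }

    // Majority-based impurity: fraction of objects outside the majority class
    public static double Majority(double[] counts) =>
        1.0 - counts.Max() / counts.Sum();
}
```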

The framework includes a set of unsupervised algorithms based on hierarchical clustering [28] with different linkage and distance methods. Model construction algorithms that work only with nominal data can utilize a number of numerical attribute discretization methods, both supervised and unsupervised. The supervised ones include hierarchical methods based on information entropy [29, 30] and the majority function; the unsupervised ones include equal width and equal frequency binning. Most of the implemented algorithms can work with weighted instances.
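The two unsupervised binning schemes can be sketched as follows (our own illustration; the returned cut points define the discretization intervals):

```csharp
using System;
using System.Linq;

// Equal width: split the value range into k intervals of the same length.
// Equal frequency: put (roughly) the same number of objects in each interval.
public static class Binning
{
    public static double[] EqualWidthCuts(double[] values, int k)
    {
        double min = values.Min(), step = (values.Max() - min) / k;
        return Enumerable.Range(1, k - 1).Select(i => min + i * step).ToArray();
    }

    public static double[] EqualFrequencyCuts(double[] values, int k)
    {
        var sorted = values.OrderBy(v => v).ToArray();
        return Enumerable.Range(1, k - 1)
                         .Select(i => sorted[i * sorted.Length / k])
                         .ToArray();
    }
}
```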

2.5 Model Evaluation

Proper evaluation and error estimation are crucial in constructing and comparing decision models. One of the key features of NRough is decision model evaluation. Currently the framework provides the following evaluation methods [31]: n-repeated k-fold cross validation (CV), leave-one-out CV, bootstrap with out-of-bag testing and, finally, n-repeated hold out.

Each evaluation test can return detailed information about experiment results in the form of a formatted table which can be saved as a .CSV or .TEX file. Additionally, results can be presented in graphical form using an interface to the R environment [32]; the graphical presentation as well as the LaTeX tabular output are still under development, but working code is available in the repository and can be customized as needed. The classification result class exposes information such as error and accuracy rates, balanced accuracy (useful for testing imbalanced data sets), error standard deviation, confidence, coverage, F-score, recall and precision, as well as the classification confusion matrix. If the model definition allows it, it is possible to include information about complexity, e.g., the size of a decision tree, the number of rules, the average length of rules, etc.
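For illustration, the k-fold CV scheme can be sketched as follows (our own minimal version, not the framework’s evaluation classes; the trainAndTestError delegate stands for any model construction and testing routine):

```csharp
using System;
using System.Linq;

// Plain k-fold cross validation: indices are shuffled once, split into k
// folds, and the average error over held-out folds is reported.
public static class Evaluation
{
    public static double CrossValidate(
        int n, int k,
        Func<int[], int[], double> trainAndTestError,  // (trainIdx, testIdx) -> error
        int seed = 0)
    {
        var rnd = new Random(seed);
        int[] idx = Enumerable.Range(0, n).OrderBy(_ => rnd.Next()).ToArray();
        var folds = Enumerable.Range(0, k)
                              .Select(f => idx.Where((_, i) => i % k == f).ToArray())
                              .ToList();
        return folds.Average(test =>
            trainAndTestError(idx.Except(test).ToArray(), test));
    }
}
```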

3 Architecture

NRough is a Microsoft .NET based framework written in the C# programming language. The source code is CLS compliant, which enables its use from other .NET languages. Currently the framework is provided as a set of libraries targeting .NET 4.6.1 and the 64 bit architecture.

The libraries have the following structure:  

NRough.Core:

Contains generic data structures and extension methods.

NRough.Data:

Responsible for data handling. It defines the decision table interface and the equivalence class collection, as well as routines for providing metadata and an interface for data filtering.

NRough.MachineLearning:

Contains approximate decision reduct computation algorithms as well as other described machine learning models and routines.

NRough.Math:

Contains special functions used across other modules e.g. statistical functions or distance metrics used in clustering.

NRough.Tests.*:

Contains a set of test fixtures which, besides their unit testing purpose, serve as a code sample repository. Each of the above listed libraries has its own unit test library, e.g., NRough.Data is tested by NRough.Tests.Data.

 

The framework currently depends on the following external libraries (besides the standard .NET libraries): Math.NET.Numerics [33], NUnit [34] and R.NET [35].

4 License

NRough libraries are provided under the GNU Lesser General Public License ver. 3 (GNU LGPLv3). This means that the provided source can be used for research, commercial and non-commercial purposes without any charges, as long as the GNU LGPLv3 restrictions are satisfied. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. However, a larger work using the licensed work through interfaces provided by the licensed work may be distributed under different terms and without source code for the larger work. The source code is provided “as is”, without warranty of any kind, express or implied. In no event shall the authors or copyright holders be liable for any claim, damages or other liability. The complete license can be found in [36].

5 Other Frameworks

Let us focus on other rough set related frameworks developed by various researchers. First of all, let us mention LERS, a system for Learning from Examples based on Rough Sets [37]. LERS contained two rule induction algorithms (LEM1 and LEM2) that could cope with inconsistent data. LERS also contained a number of algorithms for handling missing data and numerical attribute discretization. Its performance was comparable with the AQ15 and C4.5 algorithms.

Secondly, let us mention two more complete GUI-based systems: the Rough Set Exploration System (RSES) [9] and ROSETTA [38], a toolkit for analyzing tabular data within the framework of rough sets. Both solutions shared the same computational kernel, developed at the Group of Logic, Institute of Mathematics, University of Warsaw, Poland. Finally, there is ROSE2 (Rough Sets Data Explorer) [39], a software package implementing basic elements of rough set theory and rule discovery techniques. It was created at the Laboratory of Intelligent Decision Support Systems of the Institute of Computing Science in Poznan, Poland. RSES, ROSETTA as well as ROSE2 contained many different algorithms, ranging from data preprocessing, filtering, discretization and rule induction to classification and feature selection.

All of the above mentioned solutions are, to the authors’ knowledge, no longer maintained. These solutions were based on Java or C++ and their source code was not open. More recently, there were several attempts to revive RSES in the form of an open source Java based library distributed under the GNU GPL license, called RSESLib [40].

Last but not least, we need to mention the RoughSets [41] and RapidRoughSets [42] packages as the most recent developments in the rough sets domain. The former contains the core computational methods for rough and fuzzy sets and is based on the already mentioned R statistical software. The latter is a GUI extension based on the RapidMiner software.

In the .NET domain there have been, to the authors’ knowledge, no attempts so far to publish a machine learning framework containing rough set based algorithms. We would like to mention Microsoft Azure Machine Learning [43], a cloud based service integrating many different solutions, resulting in a platform worth considering for system integrators. It is, however, a commercial solution. We would also like to mention the Accord.NET framework [44], an open source machine learning framework focusing on numerical methods as well as on machine vision. This framework could complement some general machine learning routines not yet implemented in NRough.

6 Conclusions and Future Work

We have created the NRough framework based on our experience and research focused on approximate decision reducts over the past few years. Initially, the presented methods were developed separately, but recently the whole source code went through a major refactoring process, resulting in the presented solution. We have added other machine learning routines to the framework for two reasons: first, to combine well known, proven machine learning algorithms with rough sets; secondly, to be able to compare their performance against rough set inspired classifiers.

The whole framework has so far been developed by a single person and there is still much to be done. First of all, the framework needs strong API documentation and more examples; we are planning to add these in the near future and publish them on-line. Secondly, we would like to add more rough set related algorithms in order to create a comprehensive library of different decision reduct computation routines. Thirdly, we are planning to extend the approximate decision reduct selection and diversification criteria. Last but not least, there are some development tasks to complete, such as the graphical presentation methods using the R interface. Moreover, the framework currently targets .NET version 4.6.1, which allows it to be compiled only on the Windows platform. We plan to extend its compatibility to .NET Core in order to use it on Linux and OSX platforms, but so far .NET Core lacks some important functionality, so we are waiting for its new releases.