
1 Introduction

The development of methods for nonlinear classification focuses mostly on reaching high accuracy. Another important goal is to achieve good clarity and interpretability of the classification rules, which allows the considered problem to be better understood. These two aims are contradictory, so the balance between the accuracy and interpretability of a classifier is often investigated in the literature (see e.g. [6, 7, 8, 18]).

Nonlinear classification can be based on many types of approaches. Among them, for example, neuro-fuzzy systems (see e.g. [13, 17]) can be found. In these systems knowledge is gathered in the form of \(if \ldots then \ldots\) rules. These rules contain linguistic variables and variables corresponding to fuzzy sets and their parameters. Methods created to increase the interpretability of neuro-fuzzy system rules take an important place in the literature. Interpretability arises not only from the complexity of the system, but also from the semantics of the rules (see e.g. [2, 7, 19]). In this research area it is worth listing methods focused on: (a) definition and implementation of new criteria of interpretability of fuzzy rules (see e.g. [1, 7]); (b) appropriate aggregation of these criteria (see e.g. [8, 18]) and the use of multi-objective methods (see e.g. [1, 18]); (c) the use of population-based algorithms to obtain interpretable systems (see e.g. [12]), etc.

In this paper we propose a new approach which allows fuzzy classifiers to be selected taking into account different interpretability criteria (including, among others, semantics). This approach is based on a hybrid population-based algorithm, which is a fusion of a genetic algorithm (see e.g. [17]) and the imperialist competitive algorithm (ICA) (see e.g. [3]). The genetic part of the algorithm allows for automatic selection of the structure of the neuro-fuzzy system, while the imperialist part simultaneously selects the parameters of these structures. The ICA was chosen as a part of the proposed hybrid method because: (a) it was created taking inspiration from social evolution, (b) it is a multi-population algorithm which provides migration and competition of sub-populations in order to improve the obtained solutions, and (c) it is distinguished by two interesting operators: assimilation and revolution. It is worth mentioning that the system presented in our previous paper [14] was used for the classification process. Our approach additionally focuses on the trade-off between accuracy and interpretability of the system and allows accuracy-interpretability dependences to be presented using an estimated Pareto front (see e.g. [17]).

This paper is organized as follows: in Sect. 2 a description of the proposed system and its tuning process for nonlinear classification is presented. In Sect. 3 the interpretability criteria for neuro-fuzzy systems are shown. The results of simulations are presented in Sect. 4; finally, the conclusions are drawn in Sect. 5.

2 Description of Neuro-Fuzzy System for Classification and Algorithm for Its Tuning

2.1 Description of the System

We consider a multi-input, multi-output neuro-fuzzy system mapping \({\mathbf{X}} \to {\mathbf{Y}}\), where \({\mathbf{X}} \subset {\mathbf{R}}^{n}\) and \({\mathbf{Y}} \subset {\mathbf{R}}^{m}\). The flexible fuzzy rule base consists of a collection of N fuzzy if-then rules in the form:

$$R^{k} :\left[ {\left( \begin{aligned} {\text{IF}}\left( {\bar{x}_{1}\; {\text{is}}\; A_{1}^{k} } \right)\left| {w_{k,1}^{A} } \right.{\text{AND}} \, \ldots \, {\text{AND}}\left( {\bar{x}_{n} \; {\text{is}}\; A_{n}^{k} } \right)\left| {w_{k,n}^{A} } \right. \\ {\text{THEN}}\left( {y_{1}\; {\text{is}}\; B_{1}^{k} } \right)|w_{1,k}^{B} , \ldots ,\left( {y_{m}\; {\text{is}}\; B_{m}^{k} } \right)|w_{m,k}^{B} \\ \end{aligned} \right)\left| {w_{k}^{\text{rule}} } \right.} \right],$$
(1)

where n is the number of inputs, m is the number of outputs, \({\bar{\mathbf{x}}} = \left[ {\bar{x}_{1} , \ldots ,\bar{x}_{n} } \right] \in {\mathbf{X}}\), \({\mathbf{y}} = \left[ {y_{1} , \ldots ,y_{m} } \right] \in {\mathbf{Y}},A_{1}^{k} , \ldots ,A_{n}^{k}\) are fuzzy sets characterized by membership functions \(\mu_{{A_{i}^{k} }} \left( {x_{i} } \right),i = 1, \ldots ,n,k = 1, \ldots ,N,B_{1}^{k} , \ldots ,B_{m}^{k}\) are fuzzy sets characterized by membership functions \(\mu_{{B_{j}^{k} }} \left( {y_{j} } \right),j = 1, \ldots ,m,k = 1, \ldots ,N,w_{k,i}^{A} \in \left[ {0,1} \right],i = 1, \ldots ,n,k = 1, \ldots ,N\), are weights of antecedents, \(w_{j,k}^{B} \in \left[ {0,1} \right],k = 1, \ldots ,N,j = 1, \ldots ,m\), are weights of consequents, and \(w_{k}^{\text{rule}} \in \left[ {0,1} \right],k = 1, \ldots ,N\), are weights of rules. The flexibility of the rule base results from using weights of the antecedents and consequents of the rules. The use of weights requires a properly defined aggregation function, whose definition can be found in our previous work (see [5]). In the logical approach the output signal \(\bar{y}_{j} ,j = 1, \ldots ,m,\) of the neuro-fuzzy system can be described by the formula:

$$\bar{y}_{j} = \frac{{\sum\nolimits_{r = 1}^{R} {\bar{y}_{j,r}^{\text{def}} \cdot\mathop {\mathop {T^{*} }\limits^{N} }\limits_{k = 1} \left\{ {S^{*} \left\{ {1 - \mathop {\mathop {T^{*} }\limits^{n} }\limits_{i = 1} \left\{ {\mu_{{A_{i}^{k} }} \left( {\bar{x}_{i} } \right);w_{k,i}^{A} } \right\},\mu_{{B_{j}^{k} }} \left( {\bar{y}_{j,r}^{\text{def}} } \right);1,w_{j,k}^{B} } \right\};w_{k}^{\text{rule}} } \right\}} }}{{\sum\nolimits_{r = 1}^{R} {\mathop {\mathop {T^{*} }\limits^{N} }\limits_{k = 1} \left\{ {S^{*} \left\{ {1 - \mathop {\mathop {T^{*} }\limits^{n} }\limits_{i = 1} \left\{ {\mu_{{A_{i}^{k} }} \left( {\bar{x}_{i} } \right);w_{k,i}^{A} } \right\},\mu_{{B_{j}^{k} }} \left( {\bar{y}_{j,r}^{\text{def}} } \right);1,w_{j,k}^{B} } \right\};w_{k}^{\text{rule}} } \right\}} }},$$
(2)

where \(\bar{y}_{j,r}^{\text{def}} ,j = 1, \ldots ,m,r = 1, \ldots ,R\), are discretization points and R is the number of discretization points (points in Y in which the fuzzy inference from the rule base (1) is discretized; they result from, among others, the defuzzification operations typical for neuro-fuzzy systems, which allow the real value of the system output signal to be determined), and \(T^{*} \left\{ \cdot \right\}\) and \(S^{*} \left\{ \cdot \right\}\) are weighted triangular norms (see e.g. [17]). In particular, the t-norm with weights of arguments can be denoted as follows (see e.g. [17]):

$$T^{*} \left\{ {a_{1} ,a_{2} ;w_{1} ,w_{2} } \right\} = T\left\{ {1 - w_{1} \cdot \left( {1 - a_{1} } \right),1 - w_{2} \cdot \left( {1 - a_{2} } \right)} \right\}\mathop = \limits^{{{\text{e}}.{\text{g}}.}} \left( {1 - w_{1} \cdot \left( {1 - a_{1} } \right)} \right) \cdot \left( {1 - w_{2} \cdot \left( {1 - a_{2} } \right)} \right) ,$$
(3)

where the t-norm \(T\left\{ \cdot \right\}\) is a generalization of the usual two-valued logical conjunction (studied in classical logic), and \(w_{1}\) and \(w_{2} \in \left[ {0,1} \right]\) are the importance weights of the arguments \(a_{1} ,a_{2} \in \left[ {0,1} \right]\). The t-conorm with weights of arguments can be denoted analogously:

$$S^{*} \left\{ {a_{1} ,a_{2} ;w_{1} ,w_{2} } \right\} = S\left\{ {w_{1} \cdot a_{1} ,w_{2} \cdot a_{2} } \right\}\mathop = \limits^{{{\text{e}}.{\text{g}}.}} 1 - \left( {1 - w_{1} \cdot a_{1} } \right) \cdot \left( {1 - w_{2} \cdot a_{2} } \right) .$$
(4)

For more details see our previous papers, e.g. [17].
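The weighted algebraic norms (3) and (4) can be sketched directly. This is a minimal illustration (the function names are ours, not the paper's); the key property is that a weight of 0 makes an argument neutral, while a weight of 1 uses it at full strength:

```python
def weighted_t_norm(a1, a2, w1, w2):
    """Algebraic t-norm (3) with argument weights in [0, 1]: a weight of
    0 makes the argument neutral (treated as 1), 1 uses it fully."""
    return (1 - w1 * (1 - a1)) * (1 - w2 * (1 - a2))

def weighted_t_conorm(a1, a2, w1, w2):
    """Algebraic t-conorm (4) with argument weights: a weight of 0 makes
    the argument neutral (treated as 0)."""
    return 1 - (1 - w1 * a1) * (1 - w2 * a2)

# With full weights these reduce to the ordinary algebraic norms:
assert abs(weighted_t_norm(0.5, 0.8, 1.0, 1.0) - 0.4) < 1e-9
assert abs(weighted_t_conorm(0.5, 0.8, 1.0, 1.0) - 0.9) < 1e-9
# A zero weight removes the argument's influence entirely:
assert abs(weighted_t_norm(0.3, 0.9, 1.0, 0.0) - 0.3) < 1e-9
```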

2.2 Description of the Tuning Algorithm

The purpose of the algorithm described in this section is the automatic selection of the structure and parameters of the rules in form (1) (number of inputs, antecedents, consequents, rules) and of the system in form (2) (discretization points). In this process the interpretability criteria defined in Sect. 3 are used. The considered algorithm is a fusion of a genetic algorithm (which selects the structure of the system) and the imperialist competitive algorithm (which selects the parameters of the system).

Encoding of parameters and structure. The parameters of system (2) are encoded in the following individuals (Pittsburgh approach, in which a single individual of the population encodes the entire neuro-fuzzy system):

$$\begin{aligned} {\mathbf{X}}_{ch}^{\text{par}} & = \left\{ {\begin{array}{*{20}c} {\bar{x}_{1,1}^{A} ,\sigma_{1,1}^{A} , \ldots ,\bar{x}_{n,1}^{A} ,\sigma_{n,1}^{A} , \ldots \bar{x}_{1,Nmax}^{A} ,\sigma_{1,Nmax}^{A} , \ldots ,\bar{x}_{n,Nmax}^{A} ,\sigma_{n,Nmax}^{A} ,} \\ {\bar{y}_{1,1}^{B} ,\sigma_{1,1}^{B} , \ldots ,\bar{y}_{m,1}^{B} ,\sigma_{m,1}^{B} , \ldots \bar{y}_{1,Nmax}^{B} ,\sigma_{1,Nmax}^{B} , \ldots ,\bar{y}_{m,Nmax}^{B} ,\sigma_{m,Nmax}^{B} ,} \\ {w_{1,1}^{A} , \ldots ,w_{1,n}^{A} , \ldots ,w_{Nmax,1}^{A} , \ldots ,w_{Nmax,n}^{A} ,w_{1,1}^{B} , \ldots ,w_{m,1}^{B} , \ldots ,w_{1,Nmax}^{B} , \ldots ,w_{m,Nmax}^{B} ,} \\ {w_{1}^{\text{rule}} , \ldots ,w_{Nmax}^{\text{rule}} ,\bar{y}_{1,1}^{\text{def}} , \ldots ,\bar{y}_{1,Rmax}^{\text{def}} , \ldots ,\bar{y}_{m,1}^{\text{def}} , \ldots ,\bar{y}_{m,Rmax}^{\text{def}} } \\ \end{array} } \right\} \\ & = \left\{ {X_{ch,1}^{\text{par}} , \ldots ,X_{ch,L}^{\text{par}} } \right\}, \\ \end{aligned}$$
(5)

where \(L = Nmax \cdot \left( {3 \cdot n + 3 \cdot m + 1} \right) + Rmax \cdot m\) is the length of the parameters \({\mathbf{X}}_{ch}^{\text{par}} ,ch = 1, \ldots ,\mu\) for the parent population or \(ch = 1, \ldots ,\lambda\) for the temporary population, \(\left\{ {\bar{x}_{i,k}^{A} ,\sigma_{i,k}^{A} } \right\},i = 1, \ldots ,n,k = 1, \ldots ,N\), are parameters of the Gaussian membership functions \(\mu_{{A_{i}^{k} }} \left( {x_{i} } \right)\) of the input fuzzy sets \(A_{1}^{k} , \ldots ,A_{n}^{k}\) (Gaussian functions were used in our simulations), \(\left\{ {\bar{y}_{j,k}^{B} ,\sigma_{j,k}^{B} } \right\},k = 1, \ldots ,N,j = 1, \ldots ,m\), are parameters of the Gaussian membership functions \(\mu_{{B_{j}^{k} }} \left( {y_{j} } \right)\) of the output fuzzy sets \(B_{1}^{k} , \ldots ,B_{m}^{k}\), \(Nmax\) is the maximum number of rules, and \(Rmax\) is the maximum number of discretization points. The process of selecting the structure of the system is done using additional parameters \({\mathbf{X}}_{ch}^{\text{str}}\). Their genes take binary values and indicate which rules, antecedents, consequents, inputs, and discretization points are selected. The parameters \({\mathbf{X}}_{ch}^{\text{str}}\) are given by:

$$\begin{aligned} {\mathbf{X}}_{ch}^{\text{str}} & = \left\{ {\begin{array}{*{20}c} {x_{1} , \ldots ,x_{n} ,A_{1}^{1} , \ldots ,A_{n}^{1} , \ldots ,A_{1}^{Nmax} , \ldots ,A_{n}^{Nmax} ,B_{1}^{1} , \ldots ,B_{m}^{1} , \ldots ,B_{1}^{Nmax} , \ldots ,B_{m}^{Nmax} ,} \\ {{\text{rule}}_{1} , \ldots ,{\text{rule}}_{Nmax} ,\bar{y}_{1,1}^{\text{def}} , \ldots ,\bar{y}_{1,Rmax}^{\text{def}} , \ldots ,\bar{y}_{m,1}^{\text{def}} , \ldots ,\bar{y}_{m,Rmax}^{\text{def}} } \\ \end{array} } \right\} \\ & = \left\{ {X_{ch,1}^{\text{str}} , \ldots ,X_{{ch,L^{\text{str}} }}^{\text{str}} } \right\}, \\ \end{aligned}$$
(6)

where \(L^{\text{str}} = Nmax \cdot (n + m + 1) + n + Rmax \cdot m\) is the length of the parameters \({\mathbf{X}}_{ch}^{\text{str}}\). Their genes indicate which rules \(( {\text{rule}}_{k} ,k = 1, \ldots ,Nmax)\), antecedents \((A_{i}^{k} ,i = 1, \ldots ,n,k = 1, \ldots ,Nmax)\), consequents \((B_{j}^{k} ,j = 1, \ldots ,m,k = 1, \ldots ,Nmax)\), inputs \((\bar{x}_{i} ,i = 1, \ldots ,n)\), and discretization points \((\bar{y}^{r} ,r = 1, \ldots ,Rmax)\) are included in the system. We can easily notice that the number of inputs used in the system and encoded in the individual ch can be determined as follows:

$$n_{ch} = \sum\limits_{i = 1}^{n} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\}} ,$$
(7)

where \({\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\}\) denotes the parameter of the individual \({\mathbf{X}}_{ch}^{\text{str}}\) associated with the input \(x_{i}\) (as previously mentioned, if the value of the gene is 1, the associated input is taken into account during the operation of the system). The number of rules \((N_{ch} )\) used in the system and encoded in the individual ch may be determined analogously.
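The selector (7) is just a sum of binary genes. A minimal sketch, with an illustrative dict-of-genes layout rather than the paper's exact encoding:

```python
# Sketch of reading the binary structure genes X_ch^str of (6).
# The gene names ("x1", "rule1", ...) are illustrative placeholders.

def count_active(structure, prefix):
    """Count genes with value 1 whose name starts with `prefix`, as in (7)."""
    return sum(v for k, v in structure.items() if k.startswith(prefix))

structure = {
    "x1": 1, "x2": 0, "x3": 1,           # input genes
    "rule1": 1, "rule2": 1, "rule3": 0,  # rule genes
}
n_ch = count_active(structure, "x")      # number of active inputs, Eq. (7)
N_ch = count_active(structure, "rule")   # number of active rules (analogous)
assert (n_ch, N_ch) == (2, 2)
```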

Evolution of the parameters and structure. The idea of the proposed algorithm is shown in Fig. 1. In Step 1 of the algorithm, an initial population (of size \(N_{pop}\)) is created and evaluated (each individual is called a colony). It is worth mentioning that for each colony both the real-valued parameters \({\mathbf{X}}_{ch}^{\text{par}}\) and the structure parameters \({\mathbf{X}}_{ch}^{\text{str}}\) are initialized. From the initial population the N best colonies are chosen, and on the basis of each of them an empire (subpopulation) is created. The best colony in every empire is called the imperialist. The remaining \(N_{pop} - N\) colonies are spread in a specified way among the empires. In Step 2 of the algorithm the assimilation and revolution processes [which are responsible for tuning the real-valued parameters of the system (2)] are performed. The purpose of these processes is to move the colonies toward the imperialists of their empires. The extension of this step relies on using the mutation operator from the genetic algorithm, which modifies the structure of the system (2). The mutation operator has been designed to be proportional to the value of the evaluation function of the colonies (the best colony has a 0 % chance of being modified, the worst colony has a 100 % chance). In Step 3 an evaluation of the modified colonies is made. If a colony gets a better value than the imperialist of its empire, then the imperialist is replaced by this colony. It is worth mentioning that the fitness function defined in our paper promotes colonies which are characterized, among others, by the simplest structures. In Step 4 of the algorithm, an empire competition (based on the power of the empires) takes place. The empire which wins the competition (selected using the roulette wheel method on a probability calculated from the power of the empires) gets the weakest colony of the weakest empire. If an empire loses all its colonies, it is removed from the algorithm. In Step 5 a stop condition is checked (e.g. whether the number of iterations has reached its maximum value). If the stop condition is met, the algorithm ends (and the best colony of the best empire is presented); otherwise the algorithm goes back to Step 2. More details about the algorithms used in the proposed hybrid genetic-imperialist algorithm can be found e.g. in [3, 17].
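As a rough illustration of Steps 1-5, the following toy sketch runs the imperialist loop on a simple real-parameter minimisation problem. The population sizes, the assimilation coefficient, and the deterministic competition rule are our own simplifications (the paper uses roulette-wheel competition on empire power), and the genetic mutation of the binary structure genes from Step 2 is omitted:

```python
import random

def fitness(x):
    """Toy objective to minimise (stands in for the fitness function (8))."""
    return sum(v * v for v in x)

def evolve(dim=4, n_pop=30, n_emp=3, iters=200, beta=0.5, rev_rate=0.3, seed=1):
    rng = random.Random(seed)
    colonies = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_pop)]
    colonies.sort(key=fitness)
    init_fit = fitness(colonies[0])
    # Step 1: the n_emp best colonies become imperialists; the remaining
    # colonies are spread round-robin among the empires.
    empires = [{"imp": colonies[e], "cols": colonies[n_emp + e::n_emp]}
               for e in range(n_emp)]
    for _ in range(iters):
        for emp in empires:
            if not emp["cols"]:
                continue
            for col in emp["cols"]:
                # Step 2: assimilation - move the colony toward its
                # imperialist - then revolution - a random restart of
                # one coordinate.
                for d in range(dim):
                    col[d] += beta * rng.random() * (emp["imp"][d] - col[d])
                if rng.random() < rev_rate:
                    col[rng.randrange(dim)] = rng.uniform(-5, 5)
            # Step 3: a colony that beats its imperialist replaces it.
            best_col = min(emp["cols"], key=fitness)
            if fitness(best_col) < fitness(emp["imp"]):
                emp["cols"].remove(best_col)
                emp["cols"].append(emp["imp"])
                emp["imp"] = best_col
        # Step 4: empire competition, simplified to "the strongest empire
        # takes the worst colony of the weakest one".
        weakest = max(empires, key=lambda e: fitness(e["imp"]))
        strongest = min(empires, key=lambda e: fitness(e["imp"]))
        if weakest is not strongest:
            if weakest["cols"]:
                worst = max(weakest["cols"], key=fitness)
                weakest["cols"].remove(worst)
                strongest["cols"].append(worst)
            else:  # a collapsed empire is absorbed by the strongest one
                strongest["cols"].append(weakest["imp"])
                empires.remove(weakest)
    # Step 5 (stop condition) is a fixed iteration budget in this sketch.
    best = min((e["imp"] for e in empires), key=fitness)
    return best, init_fit

best, init_fit = evolve()
assert fitness(best) <= init_fit  # imperialists are only ever replaced by better colonies
```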

Fig. 1 The basic idea of the proposed hybrid algorithm

Chromosome population evaluation. Each individual \({\mathbf{X}}_{ch}\) of the parental and temporary populations is represented by a sequence of parameters \(\left\{ {{\mathbf{X}}_{ch}^{\text{par}} ,{\mathbf{X}}_{ch}^{\text{str}} } \right\}\), given by formulas (5) and (6). The first take real values, whereas the second take binary values from the set \(\left\{ {0,1} \right\}\). The system aims to minimize the value of the following fitness function:

$${\text{ff}}\left( {{\mathbf{X}}_{ch} } \right) = T^{*} \left\{ {{\text{ffaccuracy}}\left( {{\mathbf{X}}_{ch} } \right),{\text{ffinterpretability}}\left( {{\mathbf{X}}_{ch} } \right);w_{\text{ffaccuracy}} ,w_{\text{ffinterpretability}} } \right\},$$
(8)

where \(T^{*} \left\{ \cdot \right\}\) is the algebraic weighted t-norm (see e.g. [17]), \(w_{\text{ffaccuracy}} \in \left( {0,1} \right]\) is the weight of the component \({\text{ffaccuracy}}\left( {{\mathbf{X}}_{ch} } \right)\), and \(w_{\text{ffinterpretability}} \in \left( {0,1} \right]\) is the weight of the component \({\text{ffinterpretability}}\left( {{\mathbf{X}}_{ch} } \right)\). The component \({\text{ffaccuracy}}\left( {{\mathbf{X}}_{ch} } \right)\) determines the accuracy of the system (2) (in the form of a classification error). The component \({\text{ffinterpretability}}\left( {{\mathbf{X}}_{ch} } \right)\) determines the complexity-based (component \({\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right)\)) and semantics-based (components \({\text{ffint}}_{B} \left( {{\mathbf{X}}_{ch} } \right) - {\text{ffint}}_{E} \left( {{\mathbf{X}}_{ch} } \right)\)) interpretability of the system (2) encoded in the tested individual:

$$\begin{aligned} & {\text{ffinterpretability}}\left( {{\mathbf{X}}_{ch} } \right) = \\ & T^{*} \left\{ {\begin{array}{*{20}c} {{\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right),{\text{ffint}}_{B} \left( {{\mathbf{X}}_{ch} } \right),{\text{ffint}}_{C} \left( {{\mathbf{X}}_{ch} } \right),{\text{ffint}}_{D} \left( {{\mathbf{X}}_{ch} } \right),{\text{ffint}}_{E} \left( {{\mathbf{X}}_{ch} } \right);} \\ {w_{\text{ffintA}} ,w_{\text{ffintB}} ,w_{\text{ffintC}} ,w_{\text{ffintD}} ,w_{\text{ffintE}} } \\ \end{array} } \right\}, \\ \end{aligned}$$
(9)

where \(w_{\text{ffintA}} \in \left( {0,1} \right]\) denotes weight of the component \({\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right)\), etc. The individual components of the formula (9) are defined in the next section.
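The aggregation in (8) and (9) is the same algebraic weighted t-norm applied to more than two arguments. A sketch with an n-ary version of that construction (the function name and the sample values are illustrative, not taken from the paper):

```python
# n-ary algebraic weighted t-norm: prod_i (1 - w_i * (1 - a_i)).
# A weight of 0 removes a component; a weight of 1 uses it fully.

def weighted_t_norm_n(args, weights):
    result = 1.0
    for a, w in zip(args, weights):
        result *= 1 - w * (1 - a)
    return result

# Aggregating an accuracy term and an interpretability term as in (8),
# with the interpretability weight halved:
ff = weighted_t_norm_n([0.1, 0.4], [1.0, 0.5])
assert abs(ff - 0.1 * 0.7) < 1e-9  # (1 - 0.9) * (1 - 0.5 * 0.6)
```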

3 Interpretability Criteria for the Neuro-Fuzzy System for Nonlinear Classification

In this section new interpretability criteria for the neuro-fuzzy system for nonlinear classification are described. Each criterion is a component of the interpretability part (9) of the fitness function. The criteria are defined as follows:

  1. (a)

    The component \({\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right)\) determines the complexity of the system (2), i.e. the number of non-reduced (active) elements of the system (rules, input fuzzy sets, output fuzzy sets, inputs, and discretization points) in relation to the length of the parameters \({\mathbf{X}}_{ch}^{\text{str}}\) (minimizing it increases the complexity-based interpretability):

    $${\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right) = \frac{{\left( \begin{aligned} & \sum\nolimits_{i = 1}^{n} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\} \cdot \sum\nolimits_{k = 1}^{Nmax} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k} } \right\} \cdot {\mathbf{X}}_{ch}^{\text{str}} \left\{ {A_{i}^{k} } \right\}} } \\ & \quad + \sum\nolimits_{j = 1}^{m} {\sum\nolimits_{k = 1}^{Nmax} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k} } \right\} \cdot {\mathbf{X}}_{ch}^{\text{str}} \left\{ {B_{j}^{k} } \right\}} } + \sum\nolimits_{j = 1}^{m} {\sum\nolimits_{r = 1}^{Rmax} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {\bar{y}_{j,r}^{\text{def}} } \right\}} } \\ \end{aligned} \right)}}{{N_{ch} \cdot \left( {n_{ch} + m} \right) + m \cdot Rmax}},$$
    (10)

    where \({\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\}\) means a parameter of \({\mathbf{X}}_{ch}^{\text{str}}\) associated with the input \(x_{i}\), etc.

  2. (b)

    The component \({\text{ffint}}_{B} \left( {{\mathbf{X}}_{ch} } \right)\) reduces the overlapping of the input and output fuzzy sets of the system (2) encoded in the tested individual. This criterion aims at the situation where the crossover point between two neighbouring fuzzy sets has a membership value \(\mu \left( x \right)\) equal to \(c_{\text{ffint}}\) (set to 0.5), and it prevents situations where neighbouring fuzzy sets overlap each other:

    $${\text{ffint}}_{B} \left( {{\mathbf{X}}_{ch} } \right) = \frac{{\sum\nolimits_{i = 1}^{{n_{ch} }} {\sum\nolimits_{k = 1}^{{{\text{noifs}}\left( i \right) - 1}} {\left( {2\left| {c_{\text{ffintc}} - \hat{y}_{i,k}^{1} } \right| + \hat{y}_{i,k}^{2} } \right) + \sum\nolimits_{j = 1}^{{m_{ch} }} {\sum\nolimits_{k = 1}^{{{\text{noofs}}\left( j \right) - 1}} {\left( {2\left| {c_{\text{ffintc}} - \hat{y}_{j,k}^{1} } \right| + \hat{y}_{j,k}^{2} } \right)} } } } }}{{2\left( {\sum\nolimits_{i = 1}^{{n_{ch} }} ({{\text{noifs}}\left( i \right) - 1}) + \sum\nolimits_{j = 1}^{{m_{ch} }} {\left( {{\text{noofs}}\left( j \right) - 1} \right)} } \right)}},$$
    (11)

    where \({\text{noifs}}\left( i \right)\) stands for the number of active fuzzy sets of the i-th input, \({\text{noofs}}\left( j \right)\) stands for the number of active fuzzy sets of the j-th output, \(\hat{y}_{i,k}^{1} ,\hat{y}_{i,k}^{2}\) are the membership values \(\mu_{{A_{i}^{k} }} \left( x \right)\) at the crossover points between two neighbouring input fuzzy sets, and \(\hat{y}_{j,k}^{1} ,\hat{y}_{j,k}^{2}\) are the membership values \(\mu_{{B_{j}^{k} }} \left( x \right)\) at the crossover points between two neighbouring output fuzzy sets. These values can be calculated for the inputs (and analogously for the outputs) as:

    $$\hat{y}_{i,k}^{1,2} = \exp \left( { - 0.5\left( {\left( {{\mathbf{X}}_{ch}^{\text{supp}} \left\{ {\bar{x}_{i,k + 1}^{A} } \right\} - {\mathbf{X}}_{ch}^{\text{supp}} \left\{ {\bar{x}_{i,k}^{A} } \right\}} \right)/\left( {{\mathbf{X}}_{ch}^{\text{supp}} \left\{ {\sigma_{i,k}^{A} } \right\} \pm {\mathbf{X}}_{ch}^{\text{supp}} \left\{ {\sigma_{i,k + 1}^{A} } \right\}} \right)} \right)^{2} } \right),$$
    (12)

    where \({\mathbf{X}}_{ch}^{\text{supp}}\) stands for an additional set of system parameters [built temporarily on the basis of \({\mathbf{X}}_{ch}\)] with the list of non-reduced fuzzy sets sorted by the positions of their centres (for details see [5]).

  3. (c)

    The component \({\text{ffint}}_{C} \left( {{\mathbf{X}}_{ch} } \right)\) increases the integrity of the shapes of the input and output fuzzy sets associated with the inputs and outputs of the system (2) encoded in the tested individual. This criterion aims to achieve fuzzy sets of similar size for the same inputs and outputs:

    $${\text{ffint}}_{C} \left( {{\mathbf{X}}_{ch} } \right) = \frac{1}{{n_{ch} + m}}\left( \begin{aligned} \sum\nolimits_{i = 1}^{n} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\} \cdot \sum\nolimits_{k1 = 1}^{N\hbox{max} } {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k1} } \right\} \cdot {\text{shx}}\left( {{\mathbf{X}}_{ch} ,i,k1} \right)} } \\ + \sum\nolimits_{j = 1}^{m} {\sum\nolimits_{k1 = 1}^{N\hbox{max} } {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k1} } \right\} \cdot {\text{shy}}\left( {{\mathbf{X}}_{ch} ,j,k1} \right)} } \\ \end{aligned} \right),$$
    (13)

    where \({\text{shx}}\left( {{\mathbf{X}}_{ch} ,i,k1} \right)\) (and analogously \({\text{shy}}\left( {{\mathbf{X}}_{ch} ,j,k1} \right)\)) is a function calculating the proportion between the widths of the fuzzy sets, defined as follows:

    $${\text{shx}}\left( {{\mathbf{X}}_{ch} ,i,k1} \right) = 1 - \frac{{\hbox{min} \left( {{\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{i,k1}^{A} } \right\},\frac{1}{{N_{ch} }}\sum\nolimits_{k2 = 1}^{N\hbox{max} } {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k2} } \right\}{\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{i,k2}^{A} } \right\}} } \right)}}{{\hbox{max} \left( {{\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{i,k1}^{A} } \right\},\frac{1}{{N_{ch} }}\sum\nolimits_{k2 = 1}^{N\hbox{max} } {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k2} } \right\}{\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{i,k2}^{A} } \right\}} } \right)}},$$
    (14)

    where \({\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{i,k}^{A} } \right\}\) stands for the gene of the individual \({\mathbf{X}}_{ch}^{\text{par}}\) associated with the parameter \(\sigma_{i,k}^{A}\) (the width of the Gaussian function), and \({\mathbf{X}}_{ch}^{\text{par}} \left\{ {\sigma_{j,k}^{B} } \right\}\) is defined analogously for the parameter \(\sigma_{j,k}^{B}\).

  4. (d)

    The component \({\text{ffint}}_{D} \left( {{\mathbf{X}}_{ch} } \right)\) increases the complementarity (adjusting the positions of the input fuzzy sets to the data \(\bar{x}_{z,i}\), where \(Z\) denotes the number of data samples) of the system (2) encoded in the tested individual:

    $${\text{ffint}}_{D} \left( {{\mathbf{X}}_{ch} } \right) = \frac{1}{{Z \cdot n_{ch} }}\left( {\sum\limits_{z = 1}^{Z} {\sum\limits_{i = 1}^{n} {{\mathbf{X}}_{ch}^{\text{str}} } } \left\{ {x_{i} } \right\} \cdot \hbox{min} \left( {1,\left| {1 - \sum\limits_{k = 1}^{N\hbox{max} } {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k} } \right\} \cdot \mu_{{A_{i}^{k} }} \left( {\bar{x}_{z,i} } \right)} } \right|} \right)} \right).$$
    (15)
  5. (e)

    The component \({\text{ffint}}_{E} \left( {{\mathbf{X}}_{ch} } \right)\) increases the readability of the antecedents and weights of the rules of the system (2) encoded in the tested individual (it aims to reach the specified weight values 0, 0.5 and 1):

    $$\begin{aligned} {\text{ffint}}_{E} \left( {{\mathbf{X}}_{ch} } \right) & = 1 - \frac{1}{{2N_{ch} }}\left( {\frac{1}{{n_{ch} }}\sum\limits_{k = 1}^{Nmax} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k} } \right\}\sum\limits_{i = 1}^{n} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {x_{i} } \right\}\cdot\mu_{w} \left( {w_{i,k}^{A} } \right)} } } \right. \\ & \quad \quad \quad \quad \quad \quad \left. { + \sum\limits_{k = 1}^{Nmax} {{\mathbf{X}}_{ch}^{\text{str}} \left\{ {{\text{rule}}_{k} } \right\}\cdot\mu_{w} \left( {w_{k}^{\text{rule}} } \right)} } \right), \\ \end{aligned}$$
    (16)

    where \(\mu_{w} \left( {w_{i,k}^{A} } \right)\) is a function promoting concentration of the weights around the values 0, 0.5 and 1 (in the simulations we assumed \(a = 0.25,b = 0.50\) and \(c = 0.75\)). This function is described as follows:

    $$\mu_{w} \left( x \right) = \left\{ {\begin{array}{*{20}l} {\begin{array}{*{20}c} {\left( {a - x} \right)a^{ - 1} } & {\text{for}} & {x \ge 0} & {\text{and}} & {x \le a} \\ \end{array} } \hfill \\ {\begin{array}{*{20}c} {\left( {x - a} \right)\left( {b - a} \right)^{ - 1} } & {\text{for}} & {x \ge a} & {\text{and}} & {x \le b} \\ \end{array} } \hfill \\ {\begin{array}{*{20}c} {\left( {c - x} \right)\left( {c - b} \right)^{ - 1} } & {\text{for}} & {x \ge b} & {\text{and}} & {x \le c} \\ \end{array} } \hfill \\ {\begin{array}{*{20}c} {\left( {x - c} \right)\left( {1 - c} \right)^{ - 1} } & {\text{for}} & {x \ge c} & {\text{and}} & {x \le 1} \\ \end{array} } \hfill \\ \end{array} } \right..$$
    (17)
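To make the semantic criteria concrete, the crossover-point height used in (11)-(12), the width-proportion function (14), and the weight-shaping function (17) can be sketched in Python. This is a minimal transcription under our reading of those formulas: the function names are ours, `crossover_membership` follows the standard closed form for the crossing height of two Gaussians (the "+" branch of the \(\pm\) sign in (12)), and `shx` takes the widths directly instead of the encoded individual:

```python
import math

def crossover_membership(c1, s1, c2, s2):
    """Height of the crossing point of two Gaussians with centres c1, c2
    and widths s1, s2; criterion (11) targets the value c_ffint = 0.5."""
    return math.exp(-0.5 * ((c2 - c1) / (s1 + s2)) ** 2)

def shx(sigma_k, sigmas_active):
    """Width-proportion measure in the spirit of (14): 0 when sigma_k
    equals the mean width of the active sets on the same input,
    approaching 1 for strong outliers."""
    mean = sum(sigmas_active) / len(sigmas_active)
    return 1 - min(sigma_k, mean) / max(sigma_k, mean)

def mu_w(x, a=0.25, b=0.5, c=0.75):
    """Piecewise-linear function (17): equals 1 at x = 0, b and 1, and 0
    at x = a and x = c, so (16) rewards weights near 0, 0.5 and 1."""
    if x <= a:
        return (a - x) / a
    if x <= b:
        return (x - a) / (b - a)
    if x <= c:
        return (c - x) / (c - b)
    return (x - c) / (1 - c)

# Two equally wide Gaussians centred 2*sigma apart cross at height exp(-0.5):
assert abs(crossover_membership(0.0, 1.0, 2.0, 1.0) - math.exp(-0.5)) < 1e-12
assert shx(1.0, [1.0, 1.0, 1.0]) == 0.0        # identical widths
assert mu_w(0.0) == 1.0 and mu_w(0.25) == 0.0  # peak and valley of (17)
```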

4 Simulation Results

In our simulations we considered five typical problems from the field of nonlinear classification [15]: (a) the wine recognition problem, (b) the glass identification problem, (c) the Pima Indians diabetes problem, (d) the iris classification problem, and (e) the Wisconsin breast cancer problem. For each problem 10-fold cross-validation was used, and the process was repeated 10 times. Moreover, for each simulation problem seven variants of learning were applied. Each variant had a different set of weights of the fitness function (8) (see Table 1). The weights of the remaining criteria were set as follows: \(w_{\text{ffintA}} = 0.50,w_{\text{ffintB}} = 1.00,w_{\text{ffintC}} = 1.00,w_{\text{ffintD}} = 0.20,w_{\text{ffintE}} = 0.50\). The parameters of the ICA part were set as follows: number of colonies \(N_{pop} = 100\), number of empires N = 10, number of iterations 1000, and revolution rate 0.3. The mutation probability of the genetic operator was set to 0.2.

Table 1 Values of the weights of the components \({\text{ffaccuracy}}\left( {{\mathbf{X}}_{ch} } \right)\) and \({\text{ffinterpretability}}\left( {{\mathbf{X}}_{ch} } \right)\) [see formula (8)] for the variants considered in the simulations: case I–case VII

The conclusions from the simulations can be summarized as follows: (a) Using a low value of the weights (such as 0.2) for the components of the function (9) reduced the readability of the relationship between the values of the interpretability criteria and the accuracy of the system (see Fig. 3, row 4). (b) Using the extreme weight cases (case I and case VII) often has no effect on the improvement of the system (see Table 2) and can cause deterioration of the solutions (in comparison to the other cases). Solutions found for these cases may appear below the estimated Pareto front (see Fig. 3). (c) Using the proposed interpretability criteria allows semantically clear rules of the system (2) to be achieved (see Fig. 2). (d) Considering seven cases of weights allowed the estimated Pareto fronts to be determined, which makes it possible for the user to select the interpretability-accuracy trade-off (compromise) (see Fig. 3). (e) The number of reduced inputs and rules depends on the simulation problem (see Fig. 3, rows 6 and 7). For example, for classification problem (c) the system can reduce up to 3 inputs (see Fig. 2) without a significant loss in the accuracy of the system. (f) The achieved results are comparable (in terms of accuracy) with the results achieved by other authors using different methods (see Table 2). It should be emphasized that the purpose of the paper was not to achieve the best possible accuracy in comparison with the accuracy obtained by other methods. The purpose of the paper was to increase the legibility of the knowledge represented in the form of fuzzy rules while keeping acceptable accuracy of the system. It seems that this objective has been achieved.

Table 2 The accuracy (%) of the neuro-fuzzy classifier (2) for the learning phase, the testing phase and their average for the simulation variants case I–case VII
Fig. 2 Example input and output fuzzy sets of the neuro-fuzzy system (2) for the Pima Indians diabetes problem for three settings of the function (8): a case II, b case IV, c case VI. The positions of the discretization points are marked as black circles and the weights of the fuzzy sets are marked by rectangles. The degree of coverage of a rectangle corresponds to the value of the weight (a fully covered rectangle stands for weight 1, a non-covered rectangle stands for weight 0)

Fig. 3 Dependence between the accuracy (%) of the neuro-fuzzy classifier (2) (average over the learning and testing phases) and the values of the interpretability components \({\text{ffint}}_{A} \left( {{\mathbf{X}}_{ch} } \right) - {\text{ffint}}_{E} \left( {{\mathbf{X}}_{ch} } \right)\) for the considered simulation variants case I–case VII for the following simulation problems: a the wine recognition problem, b the glass identification problem, c the Pima Indians diabetes problem, d the iris classification problem, e the Wisconsin breast cancer problem

5 Conclusions

In this paper a new approach to nonlinear classification was proposed. It is based on the capabilities of a neuro-fuzzy system and a new hybrid genetic-imperialist algorithm. The purpose of this algorithm was to select both the structure and the parameters of the estimated classifier with different interpretability criteria taken into consideration. These criteria focus not only on the complexity of the system, but also on its semantics. Simulations performed for typical classification problems confirmed the correctness of the proposed approach.