Keywords

1 Introduction

The aim of nonlinear modelling is to obtain model with behaviour as close as possible to the testing object. Nonlinear modelling can concern many area of interest, such as physics, engineering, biology, chemistry, etc. and it is an important topic in the literature [16]. One of the mostly used systems for nonlinear modelling are fuzzy systems [20]. These systems can achieve high accuracy and interpretable knowledge in a form of fuzzy rules [9]. Most papers in the literature concerns selecting parameters of fuzzy system with specified structure of the system. For example, genetic algorithms [14], population-based algorithms [5], differential evolution [8], etc. are used to achieve that. From the other hand, an interesting group of methods for solving nonlinear modelling are genetic programming methods [21]. These methods allow obtaining system structures in a form of computer programs (trees). These systems can also achieve high accuracy. However, disadvantage of typical genetic programming methods is lack in interpretability.

1.1 Fuzzy Systems

Fuzzy systems [20] are based on fuzzy logic and fuzzy rules. Each fuzzy rule consists of fuzzy sets which can represent linguistic values ‘low’, ‘medium’, ‘high’, etc. Fuzzy rules take interpretable form {IF … THEN …}. The interpretability can result not only from the low number of fuzzy rules and fuzzy sets, but also from the semantic of appropriate selected parameters of fuzzy sets [9]. The semantic clarifies understanding of how the systems (models [9], classifiers [12], control systems [13]) work. It is worth to mention that in the nonlinear modelling problems the interpretability is an important issue [9]. It emerges from possibilities of understanding how the current object works and it allows us to model the typical states of the object.

1.2 Genetic Programming

Genetic programming (GP) is, in a general, a computational intelligence method for designing systems (structures) in a form of computer programs for solving optimization problems [21]. These systems can be used as controllers, models of objects, classifiers, etc. The main difference between genetic programming and other computational intelligence methods is a possibility of creation tree structures with use of mathematical operators. GP and other evolutionary algorithms (like evolutionary strategies [20], evolutionary programming [17], genetic algorithm [7], etc.) rely on a population of solutions. These methods are based on a natural evolution (using mechanisms like natural selection, inheritance, survival, etc.) which gives them an advantage over other methods used for optimization problems like analytic methods, gradient methods and random methods (see e.g. [20]).

1.3 Genetic Programming Trees

Typical genetic programming tree contains nodes and leafs (both of them are noted as tree elements in further part of this paper). Each node contains mathematical operator which decides how node works and usually has two child tree elements (see Fig. 1). Each leaf contains a real number value or connection to real number value from the system input (see Fig. 2). The whole tree structure is represented by the root (main node) and child tree elements (see Fig. 3). The output of the system (tree) is calculated recursively starting from the root node with use of defined mathematical operators. The mathematical operators are divided into: single argument operators (e.g. “\( \cos \left( \cdot \right) \)”), two argument operators (e.g. “+”) and multi arguments operators (e.g. “\( {\text{avg}}\left( \cdot \right) \)”).

Fig. 1
figure 1

GP node N and two connected tree elements E

Fig. 2
figure 2

GP leaf L connected with number value and system input

Fig. 3
figure 3

An example of GP tree with randomly assigned structure and mathematical operators and output calculated as: \( (L_{1} \cdot (L_{2} + L_{3} )) - (L_{4} + L_{5} ) \)

1.4 Fuzzy Genetic Programming

In the fuzzy genetic programming a mathematical operators are replaced by fuzzy operators (such as AND, OR, etc.) and leaves can be connected not to the system inputs but to the input of fuzzy sets. In [4] standard fuzzy operator AND was used, in [7] additional fuzzy operators: OR, NOT, ‘greater’, ‘lesser’ and ‘near’ were used. In [14] operators AND, parent operators OR and fuzzy set operator NOT were used. In paper [18] a selected group of operators was used (with multiple versions of AND and OR operators).

1.5 Paper Aim

The aim of this paper is, among the others, to present impact of used weights in proposed weighted fuzzy genetic programing algorithm and to provide accurate and interpretable fuzzy rules for considered simulation problems. The proposed method is based on fuzzy genetic programming. This method can be distinguished by: (a) use of flexible triangular norms with weights of arguments as fuzzy operators AND and OR, (b) use of weights of fuzzy rules, (c) use of operator NOT for fuzzy sets, (d) use of new encoding of the system, (e) use of fitness function with complexity of the system and new criteria of correct notations of fuzzy rules and (f) use of population algorithm adapted to the proposed system to select GP tree structure and parameters. The full description of the method and learning algorithm is presented in Sect. 2.

1.6 Paper Structure

The structure of the proposed paper consists of: Sect. 2 with description of the proposed method, Sect. 3 with presentation of simulation results and Sect. 4 with conclusions.

2 Proposed Method Description

2.1 Description of Fuzzy System

In this paper a Mamdani type fuzzy system [20] was used, where fuzzy rules can be defined as:

$$ R_{k} :\left( {\begin{array}{*{20}c} {{\text{IF}}\left( {\bar{x}_{1}\, {\text{IS}}[{\text{NOT}}]A_{1,k} } \right)\,{\text{AND/OR}}\, \ldots\, {\text{AND/OR}}\left( {\bar{x}_{n}\, {\text{IS}}[{\text{NOT}}]A_{1,k} } \right)} \\ {{\text{THEN}}\left( {y_{1} \,{\text{IS}}\,B_{1,k} } \right), \ldots ,\left( {y_{m}\, {\text{IS}}\,B_{m,k} } \right)} \\ \end{array} } \right), $$
(1)

where n is the number of inputs, m is the number of outputs, \( {\bar{\mathbf{x}}} = \left[ {\bar{x}_{1} , \ldots ,\bar{x}_{n} } \right] \in {\mathbf{X}} \), \( {\mathbf{y}} = \left[ {y_{1} , \ldots ,y_{m} } \right] \in {\mathbf{Y}} \), \( A_{1,k} , \ldots ,A_{n,k} \) are input fuzzy sets and \( B_{1,k} , \ldots ,B_{m,k} \) are output fuzzy sets. In the proposed method fuzzy rules (1) are represented by GP trees (see Fig. 4). Using genetic programming trees creates the need of using fuzzy sets’ base, which allows fuzzy sets to connect to the leaves of trees. Proposed fuzzy sets’ base C (stored for each input or output) is defined as:

$$ {\mathbf{C}} = \left\{ {A_{1,1} , \ldots ,A_{1,R} , \ldots ,A_{n,1} , \ldots ,A_{n,R} ,B_{1,1} , \ldots ,B_{1,R} , \ldots ,B_{m,1} , \ldots ,B_{m,R} } \right\}. $$
(2)
Fig. 4
figure 4

Proposed structure of: a tree, b leaf, c node

Each input fuzzy set \( A_{i,r} \) from fuzzy set base (2) is represented by the membership function \( \mu_{{A_{i,r} }} \left( {\bar{x}} \right) \), while each output fuzzy set \( B_{j,r} \) is represented by the membership function \( \mu_{{B_{j,r} }} \left( {\bar{y}} \right) \). Each element of the tree (see Table 1) is described by a set of parameters: \( l,o,i,r, \) \( w_{1} ,w_{2} ,N^{\text{L}} \) and \( N^{\text{P}} \). The parameter l decides how the element is treated (as node or leaf—see Table 1). The parameter o indicates operator of a given element: \( o = 0 \) for \( l = 1 \) stands for ‘IS’, \( o = 1 \) for \( l = 1 \) stands for ‘IS NOT’, \( o = 0 \) for \( l = 0 \) stands for ‘AND’ and \( o = 1 \) for \( l = 0 \) stands for ‘OR’ (see Table 2). The parameter \( i \) (for leaf) stands for input index of associated fuzzy set. The parameter r (for leaf) stands for index of associated fuzzy set \( A_{i,r} \) (see Fig. 4b). The parameter \( w_{1} \) stands for weight of left child node, \( w_{2} \) stands for weight of right child node (see Fig. 4c), \( N^{\text{L}} \) stands for left child node and \( N^{\text{P}} \) stands for right child node. Taking into consideration mentioned parameters the output of any element of the tree can be calculated according to Table 2 (in these calculations a \( T^{*} ( \cdot ) \) triangular t-norm with weights of arguments and \( S^{*} ( \cdot ) \) triangular t-conorm with weights of arguments [20] were used).

Table 1 Parameters of proposed tree elements (‘-’ stands for parameters not used in the current type of tree element)
Table 2 Output of proposed tree elements (leaves and nodes)

The activation (firing) level of each fuzzy rule based on the structure presented in Fig. 4 is calculated as follows:

$$ \tau_{k} \left( {{\bar{\mathbf{x}}}} \right) = \mu_{{N_{k}^{\text{root}} }} \left( {{\bar{\mathbf{x}}}} \right), $$
(3)

where \( N_{k}^{\text{root}} \) stands for root of the tree of kth fuzzy rule (\( k = 1, \ldots ,K \), K stands for the number of fuzzy rules). Crisp output values of the system for each output j can be calculated (for example) with center of area method [20]:

$$ \bar{y}_{j} \left( {{\bar{\mathbf{x}}}} \right) = \frac{{\sum\limits_{r = 1}^{R} {y_{j,r}^{B} \cdot \mathop S\limits_{k = 1}^{K} \left\{ {T^{*} \left\{ {\tau_{k} \left( {{\bar{\mathbf{x}}}} \right),\mu_{k,j} \left( {y_{j,r}^{B} } \right)} \right\};w_{k}^{\text{rule}} } \right\}} }}{{\sum\limits_{r = 1}^{R} {\mathop S\limits_{k = 1}^{K} \left\{ {T^{*} \left\{ {\tau_{k} \left( {{\bar{\mathbf{x}}}} \right),\mu_{k,j} \left( {y_{j,r}^{B} } \right)} \right\};w_{k}^{\text{rule}} } \right\}} }}, $$
(4)

where \( y_{j,r}^{B} \) are centres of output fuzzy sets \( B_{j,r} \), \( w^{\text{rule}} \) stands for fuzzy rule weight and \( \mu_{k,j} \left( {y_{j,r}^{B} } \right) \) stands for membership function value of output fuzzy set \( B_{j,k} \) calculated for discretization point \( y_{j,r}^{B} \). This value can be calculated (using proposed encoding of the system) as:

$$ \mu_{k,j} \left( {y_{j,r}^{B} } \right) = \mu_{{B_{{j,n_{j,k}^{B} }} }} \left( {y_{j,r}^{B} } \right), $$
(5)

where \( n_{j,k}^{B} \) stands for index connecting k-th fuzzy rule with jth output fuzzy set (for example \( n_{j = 1,k = 2}^{B} = 3 \) means that the second fuzzy rule is associated with the third set of the first output \( B_{1,3} \)).

2.2 Encoding of the System

In the proposed approach encoding of the system (4) is based on the encoding of a tree elements N (see Table 1 and Fig. 4) as sets of parameters:

$$ {\mathbf{N}} = \left\{ {l,o,i,r,w_{1} ,w_{2} ,{\mathbf{N}}^{\text{L}} ,{\mathbf{N}}^{\text{P}} } \right\}, $$
(6)

where \( {\mathbf{N}}^{\text{L}} \) and \( {\mathbf{N}}^{\text{P}} \) encodes recursively child tree elements (these values are set to “no value” in case of leaves). The encoding of fuzzy system (4) is defined as:

$$ {\mathbf{X}}_{ch} = \left\{ {{\mathbf{X}}_{ch}^{\text{fsets}} ,{\mathbf{X}}_{ch}^{\text{rules}} } \right\}. $$
(7)

The part \( {\mathbf{X}}_{ch}^{\text{fsets}} \) encodes parameters of fuzzy sets’ base (represented by Gaussian membership functions) (2):

$$ {\mathbf{X}}_{ch}^{\text{fsets}} = \left\{ {\begin{array}{*{20}c} {x_{1,1}^{A} ,\sigma_{1,1}^{A} , \ldots ,x_{1,R}^{A} ,\sigma_{1,R}^{A} , \ldots ,x_{n,1}^{A} ,\sigma_{n,1}^{A} , \ldots ,x_{n,R}^{A} ,\sigma_{n,R}^{A} ,} \\ {y_{1,1}^{B} ,\sigma_{1,1}^{B} , \ldots ,y_{1,R}^{B} ,\sigma_{1,R}^{B} , \ldots ,y_{m,1}^{B} ,\sigma_{m,1}^{B} , \ldots ,y_{m,R}^{B} ,\sigma_{m,R}^{B} } \\ \end{array} } \right\}, $$
(8)

thus the part \( {\mathbf{X}}_{ch}^{\text{rules}} \) encodes fuzzy rules:

$$ {\mathbf{X}}_{ch}^{\text{rules}} = \left\{ {{\mathbf{N}}_{1}^{\text{root}} ,n_{1,1}^{B} , \ldots ,n_{m,1}^{B} ,w_{1}^{\text{rule}} ,{\mathbf{N}}_{2}^{\text{root}} ,n_{1,2}^{B} , \ldots ,n_{m,2}^{B} ,w_{2}^{\text{rule}} , \ldots ,{\mathbf{N}}_{K}^{\text{root}} ,n_{1,K}^{B} , \ldots ,n_{m,K}^{B} ,w_{K}^{\text{rule}} } \right\}, $$
(9)

where \( {\mathbf{N}}_{k}^{\text{root}} \) is a root of the tree of k-th rule, \( n_{j,k}^{B} \) is an index connecting k-th fuzzy rule with fuzzy set of j-th output, \( w_{k}^{\text{rule}} \) is weight of the rule. In the proposed method the part \( {\mathbf{X}}_{ch}^{\text{fsets}} \) encoding parameters of fuzzy sets is processed by a genetic algorithm and the part \( {\mathbf{X}}_{ch}^{\text{rules}} \) encoding fuzzy rules is processed by a genetic programming (for details see Sect. 2.5).

2.3 Initialization of the System

The parameters of fuzzy sets encoded in \( {\mathbf{X}}_{ch}^{\text{fsets}} \) are initialized randomly with adjustments to the considered simulation problems. Next, the number of fuzzy rules \( K \in \left[ {K^{\hbox{min} } ,K^{\hbox{max} } } \right] \) is chosen randomly. After the number of fuzzy rules is chosen, parameters of part \( {\mathbf{X}}_{ch}^{\text{rules}} \) are initialized as follows:

$$ {\mathbf{X}}_{ch}^{\text{rules}} = \left\{ {\begin{array}{*{20}c} {init\left( {{\mathbf{N}}_{1}^{\text{root}} ,0,0} \right),U_{c} \left( {1,R} \right), \ldots ,U_{c} \left( {1,R} \right),U_{r} \left( {0,1} \right),} \\ {init\left( {{\mathbf{N}}_{2}^{\text{root}} ,0,0} \right),U_{c} \left( {1,R} \right), \ldots ,U_{c} \left( {1,R} \right),U_{r} \left( {0,1} \right), \ldots ,} \\ {init\left( {{\mathbf{N}}_{K}^{\text{root}} ,0,0} \right),U_{c} \left( {1,R} \right), \ldots ,U_{c} \left( {1,R} \right),U_{r} \left( {0,1} \right)} \\ \end{array} } \right\}, $$
(10)

where \( U_{c} \left( {a,b} \right) \) returns random integer value from the set \( \{ a, \ldots ,b\} \), \( U_{r} \left( {a,b} \right) \) returns random number value from the range \( [a,b] \), \( init({\mathbf{N}},d,e) \) (d stands for “deep lvl” of the tree—see Fig. 3, e stands for type of tree element) is an initialization function for tree elements. The function \( init( \cdot ) \) for nodes initialization (when \( e = 0 \) and \( d < d^{\hbox{max} } \), where \( d^{\hbox{max} } \) is maximum “deep lvl” of the tree) takes the following form:

$$ init({\mathbf{N}},d,0) = \left\{ {\begin{array}{*{20}c} {0,U_{c} \left( {0,1} \right),U_{c} \left( {1,n} \right),U_{c} \left( {1,R} \right),U_{r} \left( {0,1} \right),U_{r} \left( {0,1} \right),} \\ {init\left( {{\mathbf{N}}\left\{ {{\mathbf{N}}^{\text{L}} } \right\},d + 1,U_{c} \left( {0,1} \right)} \right),init\left( {{\mathbf{N}}\left\{ {{\mathbf{N}}^{\text{P}} } \right\},d + 1,U_{c} \left( {0,1} \right)} \right)} \\ \end{array} } \right\}, $$
(11)

where notation \( {\mathbf{N}}\left\{ a \right\} \) refers to gene a encoded in N. It is worth to mention that this function initializes child elements of the tree recursively with increased value of deep of the tree (d). This process can stop if new element is a leaf or when deep of the tree reaches maximum lvl \( d^{\hbox{max} } \). The function \( init( \cdot ) \) for leaves initialization (when \( e = 1 \) or \( d = d^{\hbox{max} } \)) takes the following form:

$$ \begin{array}{*{20}c} {init({\mathbf{N}},d,1) = \left\{ {1,U_{c} \left( {0,1} \right),U_{c} \left( {1,n} \right),U_{c} \left( {1,R} \right),U_{r} \left( {0,1} \right),U_{r} \left( {0,1} \right),null,null} \right\}} \\ \end{array} , $$
(12)

where null stands for genes with no assigned values.

2.4 System Evaluation

For evaluation of the system (4) the following fitness function was used:

$$ {\text{ff}}\left( {{\mathbf{X}}_{ch} } \right) = \mathop {T_{{}}^{*} }\limits_{f = 1}^{F} \, \left\{ {{\text{ffc}}_{f} \left( {{\mathbf{X}}_{ch} } \right),w_{f}^{\text{ffc}} } \right\}, $$
(13)

where F stands for the number of fitness function components, component \( {\text{ffc}}_{1} \left( {{\mathbf{X}}_{ch} } \right) = {\text{ffacc}}\left( {{\mathbf{X}}_{ch} } \right) \) specifies the accuracy of the system (4), component \( {\text{ffc}}_{2} \left( {{\mathbf{X}}_{ch} } \right) = {\text{ffcom}}\left( {{\mathbf{X}}_{ch} } \right) \) specifies complexity of the system (4), component \( {\text{ffc}}_{3} \left( {{\mathbf{X}}_{ch} } \right) = {\text{ffsam}}\left( {{\mathbf{X}}_{ch} } \right) \) stands for a penalty for using the same fuzzy set multiple times by fuzzy rules (which is non-desired), component \( {\text{ffc}}_{4} \left( {{\mathbf{X}}_{ch} } \right) = {\text{ffmul}}\left( {{\mathbf{X}}_{ch} } \right) \) stands for a penalty for using the same input multiple times by single fuzzy rule (with is non-desired), \( w_{f}^{\text{facc}} \) (\( f = 1, \ldots ,F \)) are weights of components, \( T^{*} \{ \cdot \} \) is a n-argument extension of algebraic triangular norm with weights of arguments. The components aliases (ffcom, ffacc, ffsam, ffmul) were used to increase readability of the paper and presentation of the results.

The component \( {\text{ffc}}_{1} \left( {{\mathbf{X}}_{ch} } \right) \) of function (13) is defined as:

$$ {\text{ffc}}_{ 1} \left( {\mathbf{X}} \right) = \frac{1}{m}\sum\limits_{j = 1}^{m} {\frac{{\frac{1}{Z}\sum\nolimits_{z = 1}^{Z} {\left| {d_{z,j} - \bar{y}_{z,j} } \right|} }}{{\mathop {\hbox{max} }\limits_{z = 1, \ldots ,Z} \left\{ {d_{z,j} } \right\} - \mathop {\hbox{min} }\limits_{z = 1, \ldots ,Z} \left\{ {d_{z,j} } \right\}}}} , $$
(14)

where Z is the number of rows of a learning sequence, \( d_{z,j} \) is the desired output value of output j for input vector \( z \) (\( z = 1, \ldots ,Z \)), \( \bar{y}_{z,j} \) is the real output value j calculated for the input vector \( {\bar{\mathbf{x}}}_{z} \). Equation (14) takes into account the normalization of errors at different outputs of the system (4), which allows us to use function (14) in triangular norm used in function (13) (Table 3).

Table 3 Output of the \( {\text{cm}}( \cdot ) \) function

The component \( {\text{ffc}}_{2} \left( {{\mathbf{X}}_{ch} } \right) \) of function (13) is defined as:

$$ {\text{ffc}}_{2} \left( {{\mathbf{X}}_{ch} } \right) = \frac{1}{K}\sum\limits_{k = 1}^{K} {\frac{{{\text{cm}}\left( {{\mathbf{N}}_{k}^{\text{root}} } \right)}}{{2^{{lvl^{\hbox{max} } }} - 1}}} , $$
(15)

where \( {\mathbf{N}}_{k}^{\text{root}} \) is by default a part of structure \( {\mathbf{X}}_{ch}^{\text{rules}} \) of \( {\mathbf{X}}_{ch}^{{}} \) (this notation will be used in further part of this paper), denominator stands for maximum number of tree elements (Mersenne’s number [15]), numerator stands for actual number of tree elements calculated according to the Table 4.

Table 4 Output of the \( {\text{sa}}( \cdot ) \) function

The component \( {\text{ffc}}_{3} \left( {{\mathbf{X}}_{ch} } \right) \) of function (13) is defined as:

$$ {\text{ffc}}_{3} \left( {{\mathbf{X}}_{ch} } \right) = \frac{{\left( {\begin{array}{*{20}l} {\sum\nolimits_{i = 1}^{n} {\sum\nolimits_{r = 1}^{R} {\hbox{max} \left( {0,\sum\nolimits_{k = 1}^{K} {\text{sa}} \left( {{\mathbf{N}}_{k}^{\text{root}} ,i,r} \right) - 1} \right)} } + } \hfill \\ { + \sum\nolimits_{j = 1}^{m} {\sum\nolimits_{r = 1}^{R} {\hbox{max} \left( {0,\sum\nolimits_{k = 1}^{K} {\text{sb}} \left( {n_{j,k}^{B} ,j,r} \right) - 1} \right)} } } \hfill \\ \end{array} } \right)}}{{K \cdot \left( {2^{{d^{\hbox{max} } - 1}} + m} \right)}} $$
(16)

where denominator stands for maximum number of leaves and output fuzzy sets for all K fuzzy rules, numerator stands for penalty for using specified fuzzy set more than 1 time by any fuzzy rule, function \( {\text{sa}}({\mathbf{N}}_{{\mathbf{k}}} ,i,r) \) stands for number of used input fuzzy set \( A_{i,r} \) by kth rule, function \( {\text{sb}}(n^{B} ,j,r) \) stands for number of used output fuzzy set \( B_{j,r} \) by k-th rule. The output of function \( {\text{sa}}({\mathbf{N}},i,r) \) from Eq. (16) is calculated according to Table 4 and the function \( {\text{sb}}(n^{B} ,j,r) \) output is calculated as follows:

$$ {\text{sb}}\left( {n^{B} ,j,r} \right) = \left\{ {\begin{array}{*{20}c} 1 & {\text{for}} & {n^{B} = r} \\ 0 & {\text{for}} & {n^{B} \ne r} \\ \end{array} } \right.. $$
(17)

The component \( {\text{ffc}}_{4} \left( {{\mathbf{X}}_{ch} } \right) \) of function (13) was defined with assumption that one fuzzy rule cannot use multiple fuzzy sets which are connected to the same input:

$$ {\text{ffc}}_{ 4} \left( {{\mathbf{X}}_{ch} } \right) = \frac{1}{n \cdot K}\left( {\sum\limits_{i = 1}^{n} {\hbox{max} \left( {0,\sum\limits_{k = 1}^{K} {{\text{mul}}\left( {{\mathbf{N}}_{k}^{\text{root}} ,i} \right) - 1} } \right)} } \right), $$
(18)

where function \( {\text{mul}}\left( {{\mathbf{N}},i} \right) \) stands for penalty for multiple use of fuzzy sets connected to i-th input calculated according to Table 5. The penalty resulting from using OR operator in minimization of fitness function (13) is smaller than penalty for using AND operator. Using the OR operator (see \( {\mathbf{N}}\{ o\} = 1 \) in Table 5) for the same inputs is acceptable (opposed to AND operator), but it complicates readability of fuzzy rules.

Table 5 Output of the \( {\text{mul}}( \cdot ) \) function

The aim of fitness function is to minimize values of all fitness function components which allow us to obtain accurate system (\( {\text{ffc}}_{1} \)) with simple structure (\( {\text{ffc}}_{2} \)) and consistent interpretable fuzzy rules (\( {\text{ffc}}_{3} \) and \( {\text{ffc}}_{4} \)).

2.5 Description of Learning Algorithm

The learning algorithm purpose is to select parameters of the fuzzy sets stored in base (2) and to select the structure of the fuzzy rules. The taken into consideration algorithm designed for proposed system structure and encoding works according to the following steps:

  • Step 1. In this step \( N^{\text{pop}} \) individuals of the population P are initialized according to description from Sect. 2.3.

  • Step 2. This step involves evaluation of the individuals of the population P by fitness function (13).

  • Step 3. In this step \( N^{\text{pop}} \) of child individuals are generated and stored in the temporary population P. Genes \( {\mathbf{X}}_{ch}^{\text{fsets}} \) of these individuals are initialized with use of genetic algorithm crossover operator. The individuals for crossover are selected by roulette wheel method from the population P. The genes \( {\mathbf{X}}_{ch}^{\text{rules}} \) of these individuals are initialized by choosing randomly genes \( n^{{B_{j,k} }} \), weights \( w_{k}^{\text{rule}} \) and root nodes from preselected parents.

  • Step 4. This step purpose is to mutate individuals from the population P (each individual is mutated with probability \( p_{m1} \in (0,1) \)). Genes \( {\mathbf{X}}_{ch}^{\text{fsets}} \) are mutated (with probability \( p_{m2} \in (0,1) \)) with use of standard genetic mutation operator. Genes \( {\mathbf{X}}_{ch}^{\text{rules}} \) are mutated (with probability \( p_{m3} \in (0,1) \)). This mutation is based on random changes of parameters \( {\mathbf{N}}\left\{ i \right\} \), \( {\mathbf{N}}\left\{ r \right\} \) and \( n_{j,r}^{B} \). Independent mutation probabilities \( p_{m1} \ne p_{m2} \ne p_{m3} \) (where \( p_{m1} \gg p_{m2} > p_{m3} \)) balance the mutation in the following way: (a) mutation should be processed on the greater part of the population \( {\mathbf{P}^{\prime}} \) (\( p_{m1} \)) which provides a proper diversity of the population, (b) from the other hand, genes mutation probability (\( p_{m2} \)) cannot be high due to degeneration of the population, (c) changes in connection between leafs and nodes (\( p_{m3} \)) should be rarely performed, because too intense changes in relationships between the fuzzy rules and fuzzy sets could hinder the convergence of the algorithm.

  • Step 5. Next, the individuals from the population \( {\mathbf{P}}^{\prime } \) are pruned. This process is based on replacing randomly selected node of each genetic programming tree (with probability \( p_{x} \in (0,1) \)) by randomly generated leaf (\( {\text{init}}({\mathbf{N}},0,1) \)).

  • Step 6. In this step extension of genetic programming trees from population \( {\mathbf{P}}^{\prime} \) is performed. This process is based on replacing randomly selected leaf of each genetic programming tree (with probability \( p_{l} \in (0,1) \)) by randomly generated node (\( {\text{init}}({\mathbf{N}},lvl,0) \)). The lvl stands for actual height of leaf, which prevents excessive growth of the tree.

  • Step 7. In this step for each individual from the population \( {\mathbf{P}}^{\prime } \) a new fuzzy rule is added (with probability \( p_{d} \) and only when \( K < K^{\hbox{max} } \)) or existing randomly chosen fuzzy rule is removed (with probability \( p_{u} \) and when \( K > K^{\hbox{min} } \)).

  • Step 8. After modification of individuals from the population \( {\mathbf{P}}^{\prime } \) (Steps 3–7) each individual is evaluated by fitness function (13).

  • Step 9. Next, the individuals from populations P and \( {\mathbf{P}}^{\prime } \) are merged and only \( N^{\text{pop}} \) best individuals are chosen to replace the population P.

  • Step 10. In the last step of the algorithm the purpose is to check if stop condition is met (for example if the number of executed iterations of algorithm reaches specified value). If this condition is met, the algorithm stops. Otherwise, algorithm goes back to the Step 3.

2.6 Fuzzy Rules Notation

As it was mentioned earlier, in the proposed system (4) a varied fuzzy operators were used to aggregate antecedences of fuzzy rules and to process the fuzzy sets. Due to this and using genetic programming tree structure, the notation of fuzzy rules is defined as:

$$ R_{k} :\left( {{\text{IF}}\overbrace {{{\text{zp}}\left( {{\mathbf{X}}_{ch}^{\text{rules}} \left\{ {{\mathbf{N}}_{k}^{\text{root}} } \right\}} \right)}}^{\text{definitionbyfunction}}|{\mathbf{X}}_{ch}^{\text{rules}} \left\{ {w_{k}^{\text{rule}} } \right\}{\text{THEN}}\left( {\begin{array}{*{20}l} {y_{1} {\text{IS}}B_{{1,{\mathbf{X}}_{ch}^{\text{rules}} \left\{ {n_{1,k}^{B} } \right\}}} , \ldots ,} \\ {y_{m} {\text{IS}}B_{{m,{\mathbf{X}}_{ch}^{\text{rules}} \left\{ {n_{m,k}^{B} } \right\}}} } \\ \end{array} } \right)} \right), $$
(19)

where function \( {\text{zp}}({\mathbf{N}}) \) defines antecedences of fuzzy rules according to the Table 6. It is worth to mention that values of weights \( w_{1} \) and \( w_{2} \) from Eq. (6) and \( w^{\text{rule}} \) from Eq. (9) can be replaced by their linguistic equivalents: n (not important) for values lower than 0.25, i (important) for values from the range \( [0.25,0.75] \) and v (very important) for values higher than 0.75. The fuzzy rule notation may be written as the following example:

$$ R_{1} :\left( {{\text{IF}}\left( {\left( {\begin{array}{*{20}c} {x_{4} {\text{ISNOT}}A_{4,5} |v} \\ {\text{AND}} \\ {x_{6} {\text{IS}}A_{6,2} |n} \\ \end{array} } \right)|n{\text{AND}}\left( {\begin{array}{*{20}c} {x_{1} {\text{IS}}A_{1,4} |i} \\ {\text{OR}} \\ {x_{2} {\text{IS}}A_{2,2} |n} \\ \end{array} } \right)|i} \right)|i{\text{THEN}}\left( {y_{1} {\text{IS}}B_{1,4} } \right).} \right), $$
(20)

or in a shorter form as:

$$ R_{1} :{\text{IF}}\left( {\left( {\begin{array}{*{20}c} {(x_{4} {\text{ISNOT}}A_{4,5} |v{\text{AND}}x_{6} {\text{IS}}A_{6,2} |n)|n} \\ {{\text{AND}}(x_{1} {\text{IS}}A_{1,4} |i{\text{OR}}x_{2} {\text{IS}}A_{2,2} |n)|i} \\ \end{array} } \right)|i} \right){\text{THEN}}\left( {y_{1} {\text{IS}}B_{1,4} } \right). $$
(21)
Table 6 Notation output of the \( {\text{zp}}( \cdot ) \) function

3 Simulation Results

Simulation was performed using the following benchmarks (for details see Table 7): airfoil self-noise problem [3] (ASN), Box Jenkins gas furnace problem [2] (BJG), chemical plant problem [22] (CPP), concrete slump test [23] (CST), servo data set [19] (SDS). The simulations were executed for four cases:

Table 7 Considered simulation problems
  • case 1—case without using weights (in this case all weights values \( w_{1} \), \( w_{2} \) and \( w^{\text{rule}} \) were set to 1).

  • case 2—case with using rule weights (in this case weights values \( w_{1} \), \( w_{2} \) were set to 1).

  • case 3—case with using fuzzy operators weights (in this case weights values \( w^{\text{rule}} \) were set to 1).

  • case 4—case with using rule weights and fuzzy operators weights.

This way of testing allowed precise determination of the impact of using weights in the system (4) on the results.

3.1 Simulation Parameters

Values of parameters of the algorithm were experimentally selected as follows: number of fuzzy sets in fuzzy sets’ base (2) for each input and output \( R = 5 \), minimum number of fuzzy rules \( K^{\hbox{min} } = 3 \), maximum number of fuzzy rules \( K^{\hbox{max} } = 5 \), maximum height of the tree \( lvl^{\hbox{max} } = 5 \), weights of fitness function components (13) \( w_{\text{ffacc}} = 1.0 \), \( w_{\text{ffcom}} = 0.5 \), \( w_{\text{ffsam}} = 0.2 \), \( w_{\text{ffmul}} = 0.1 \), number of individuals in population \( N^{\text{pop}} = 100 \), number of algorithm iterations \( Nstep = 1000 \), individual mutation probability \( p_{m1} = 0.7 \), genes mutation probability \( p_{m2} = 0.2 \), rules mutation probability \( p_{m3} = 0.1 \), pruning of tree probability \( p_{x} = 0.3 \), extending of tree probability \( p_{l} = 0.2 \), adding new fuzzy rule probability and removing fuzzy rule probability to \( p_{u} = 0.3 \). For each benchmark and case, simulations were repeat 100 times and results were averaged.

3.2 Obtained Results

Obtained results for all simulation problems are presented in Table 9. The normalized and averaged results for all simulation problems are presented in Fig. 5 and in Table 8. The example of obtained fuzzy rules and fuzzy sets are presented in Fig. 6 and in Table 10.

Fig. 5
figure 5

Obtained accuracy and complexity normalized and averaged for all considered simulation problems

Table 8 Obtained accuracy and complexity, normalized and averaged for all considered simulation problems (see also Fig. 5)
Fig. 6
figure 6

Fuzzy sets obtained for simulation problems corresponding to fuzzy rules presented in Table 10. Grey fuzzy sets stands for fuzzy sets from fuzzy set base not used by any fuzzy rules

3.3 Simulation Conclusions

The simulation conclusions are following: (a) using rule weights allowed us to obtain better accuracy and similar complexity in a comparison to case without using weights (see Tables 8, 9 and case 2 on Fig. 5), (b) using fuzzy operators weights allowed us to obtain lower complexity and similar accuracy in a comparison to case without using weights (see Tables 8, 9 and case 3 on Fig. 5), (c) using fuzzy operators weights and fuzzy rules weights allowed us to obtain both the lower complexity and better accuracy in a comparison to case without using weights (see Table 8, 9 and case 4 in Fig. 5), (d) obtained results do not differ from the results of other authors (see Table 9). It is worth to mention that other authors’ results are concentrated mostly on accuracy or on using more complex systems (see e.g. [1, 10]), (e) proposed approach is characterized by clear and interpretable fuzzy rules (see Fig. 6 and Table 10).

Table 9 Obtained simulation results in comparison with the best results (aimed on accuracy) obtained by other authors [1, 6, 10, 11]
Table 10 Obtained examples of fuzzy rules for all simulation problems

4 Conclusions

In this paper a weighted fuzzy genetic programming algorithm for selection of the structure and the parameters of the fuzzy systems for nonlinear modelling is presented. In presented approach fuzzy rules take the form of binary trees where nodes of these trees decide on aggregation operators (AND/OR) and the leaves of these trees are connected to the input fuzzy sets. The proposed method allows us to obtain accurate fuzzy systems with clear and interpretable fuzzy rules. The obtained accuracy is similar to the accuracy obtained by other authors, achieved using systems which usually do not take into account interpretability. The use of the system weights shown possibilities in increasing the system accuracy (with use of rule weights), decrease the system complexity (with use of fuzzy operators’ weights) or improve both accuracy and complexity (with use of both weights). The proposed approach was tested on typical nonlinear modelling benchmarks and it can be said that obtained results are satisfying.