
1 Introduction

In 1965, Zadeh [16] introduced the concept of fuzzy sets to represent imprecise data in a fuzzy pattern. Fuzzy logic aims at modelling approximate human reasoning, which is helpful in cognitive decision making. Several hybrid systems have been developed that combine fuzzy sets with other soft computing models such as artificial neural networks, expert systems and genetic algorithms [6, 12, 14, 18, 19].

Hybrid systems combining artificial neural networks with fuzzy logic have proved effective on real-world problems [6]. In 1992, Simpson [15] proposed the Fuzzy Min-Max Neural Network (FMNN) classifier based on fuzzy hyperboxes, where the union of fuzzy hyperboxes represents an individual decision class. A hyperbox is a region in n-dimensional pattern space characterized by a minimum point, a maximum point and a fuzzy membership function. The FMNN learning algorithm acquires knowledge by computing the min-max points of hyperboxes. The placement and adjustment of hyperboxes create a granular structure in pattern space that is useful for pattern classification. FMNN also possesses several salient learning properties, such as online learning, non-linear separability and non-parametric classification, which make it flexible.

FMNN has been applied successfully in diverse applications such as fault detection, lung cancer detection, medical data analysis, music classification and text classification [1, 4, 9–13, 20].

However, FMNN suffers from the constraint on the size of hyperboxes and from the contraction process, which may lead to gradation errors in classification [8, 17]. Several extensions of FMNN have been proposed to overcome these limitations and enhance classification.

In 2000, Gabrys et al. [7] proposed a generalization and extension of FMNN known as the General Fuzzy Min-Max Neural Network (GFMNN), which incorporates significant modifications to conventional FMNN through a new fuzzy membership function and new hyperbox expansion criteria. However, GFMNN retains the same contraction process as FMNN, which tampers with the acquired knowledge in the boundary region and causes gradation errors in classification.

Many researchers have sought innovative ways to eliminate the contraction process and retain the overlap information for better pattern classification. In 2004, Bargiela et al. [2] proposed a new classifier known as the Inclusion/Exclusion Fuzzy Hyperbox Classifier (EFC), which uses inclusion hyperboxes that contain input patterns belonging to the same class, and exclusion hyperboxes (erroneous hyperboxes) that contain input patterns lying in the confusion region between different classes. However, this method suffers a reduction in classification accuracy owing to the removal of the exclusion hyperboxes.

In 2007, Nandedkar et al. [8] introduced the Fuzzy Min-Max Neural Network classifier with Compensatory Neuron Architecture (FMCN). This method preserves the min-max points of the confusion (overlap) region to enhance the learning algorithm, as this information is highly significant for pattern classification. However, the method does not allow overlapped hyperboxes to expand further, which tends to increase the number of hyperboxes and thus the time and space complexity.

In 2007, Zhang et al. [17] proposed a new approach called the Data-Core-Based Fuzzy Min-Max Neural Network (DCFMN) to overcome the limitations of FMCN with the help of the geometrical centre and data core of a hyperbox, which additionally helps in handling noisy data. This method achieves higher classification accuracy than other prominent approaches such as GFMNN, FMCN, EFC and classical FMNN. In 2014, Davtalab et al. [3] proposed the Multi-Level Fuzzy Min-Max Neural Network (MLF) classifier, which employs a multi-level tree structure to classify patterns. Each level of the model operates on smaller hyperboxes to handle the confusion-region problem. MLF improved the classification accuracy in the boundary region compared with the existing approaches GFMNN, EFC, FMCN, DCFMN and FMNN.

The above-mentioned improvements to FMNN have been obtained at an increased training cost, as additional structure is added to the simple three-layer architecture of FMNN. This motivated us to explore a methodology for achieving better classification accuracy without resorting to modifications of the structure of FMNN. The proposed work introduces a hybridization of the k-Nearest Neighbour algorithm into FMNN, named the kNN-FMNN classifier, for dealing with overlapping regions without changing the neural network structure of FMNN. We perform experiments on the benchmark datasets mentioned in [3] to establish the importance of kNN-FMNN. Comparative experiments are conducted against the existing approaches GFMNN, EFC, FMCN, DCFMN and MLF to establish the relevance of the proposed approach.

The remainder of this paper is organized as follows: Sect. 2 briefly introduces the basics of FMNN for classification. Section 3 presents the proposed kNN-FMNN algorithm. Section 4 provides the experiments and the analysis of results. The paper ends with the conclusion.

2 Fuzzy Min-Max Neural Network

In 1992, Simpson [15] proposed a single-pass dynamic network structure with salient learning features such as online learning, non-linear separability and non-parametric classification, to deal with pattern classification using fuzzy systems: the fuzzy min-max neural network (FMNN). It is a supervised learning neural network that uses n-dimensional hyperbox fuzzy sets to represent pattern spaces [15]. The FMNN learning process creates and adjusts hyperboxes in n-dimensional space for all decision classes in the pattern space.

Fig. 1. (a) Hyperbox, (b) Overlapped region by two hyperboxes

Each hyperbox is determined by its min point and max point together with a corresponding fuzzy membership function, defined as:

$$\begin{aligned} B_{j} = \{ X, V_{j},W_{j},f(X, V_{j},W_{j})\} \quad \forall X \in I^n \end{aligned}$$
(1)

where X is the input pattern, \(V_{j}\) and \(W_{j}\) are the minimum and maximum points of \(B_{j}\) hyperbox. \(I^{n}\) is the n-dimensional unit pattern space.

The fuzzy membership function \((b_{j})\) is defined in Eq. (2):

$$\begin{aligned} b_j(X_h)&= \frac{1}{2n} \sum _{i=1}^{n}[ \max (0,1-\max \left( 0, \gamma \cdot \min \left( 1,x_{hi}-w_{ji} \right) \right) ) \nonumber \\&\quad + \max (0,1-\max \left( 0, \gamma \cdot \min \left( 1,v_{ji}-x_{hi} \right) \right) ) ] \end{aligned}$$
(2)

where \(X_{h} = (x_{h1},x_{h2},...,x_{hn})\) is the input pattern in n-dimensional space, and \(V_{j} = (v_{j1},v_{j2},...,v_{jn})\) and \(W_{j} = (w_{j1},w_{j2},...,w_{jn})\) are the corresponding min and max points of hyperbox \(B_{j}\). \(\gamma \) is the sensitivity parameter that regulates how fast the membership decreases as the distance between \(X_{h}\) and \(B_{j}\) increases.
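As an illustration, a minimal NumPy sketch of Eq. (2) is given below; the function and array names are our own, and the default \(\gamma = 0.4\) is the value used later in the experiments:

```python
import numpy as np

def membership(x, v, w, gamma=0.4):
    """Fuzzy membership of pattern x in hyperbox [v, w], following Eq. (2)."""
    # one term per dimension for the max-point side and the min-point side
    upper = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, x - w)))
    lower = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, v - x)))
    return float((upper + lower).sum() / (2 * len(x)))
```

A pattern lying inside the hyperbox receives a membership of exactly 1, and the membership decays linearly, at a rate set by \(\gamma \), with the distance outside the box along each dimension.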

FMNN training is a single-epoch algorithm. For each training pattern, learning involves three stages: (1) expansion, (2) overlap test and (3) contraction of hyperboxes. During the training phase, when an input pattern enters the network, the network first tries to accommodate it in an existing hyperbox of the same class that gives a full membership value. Otherwise, the network finds the closest hyperbox of the same label, i.e. the one with the highest membership degree, and attempts to expand it to include the input pattern, bounded by the expansion criterion given in Eq. (3). The user-defined parameter \(\theta \in (0, 1)\) in Eq. (3) controls the volume of a hyperbox.

$$\begin{aligned} \sum _{i=1}^{n}\left( max(w_{ji},x_{hi}) - min(v_{ji},x_{hi}) \right) \le n\theta \end{aligned}$$
(3)

When the condition in Eq. (3) is satisfied, the hyperbox expands to incorporate the input pattern by adjusting its min and max points using Eqs. (4) and (5).

$$\begin{aligned} v_{ji}^{new} = min(v_{ji}^{old}, x_{hi}) \quad \forall i=1,2,3,\dots ,n. \end{aligned}$$
(4)
$$\begin{aligned} w_{ji}^{new} = max(w_{ji}^{old}, x_{hi}) \quad \forall i=1,2,3,\dots ,n. \end{aligned}$$
(5)

If the condition in Eq. (3) is not satisfied, a point hyperbox is created whose minimum and maximum points both equal the input pattern.
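A sketch of this expansion step, under the same assumptions and naming conventions as the membership sketch above:

```python
import numpy as np

def try_expand(v, w, x, theta):
    """Expand hyperbox [v, w] to include pattern x if Eq. (3) permits.

    Returns the (possibly updated) bounds and a success flag.
    """
    if (np.maximum(w, x) - np.minimum(v, x)).sum() <= len(x) * theta:  # Eq. (3)
        return np.minimum(v, x), np.maximum(w, x), True                # Eqs. (4)-(5)
    return v, w, False  # caller then creates a point hyperbox with v = w = x
```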

After the expansion process, the overlap test [15] examines whether the expanded hyperbox overlaps with any hyperbox of another decision class. Two hyperboxes do not overlap as long as there is at least one dimension along which they do not overlap. If there is overlap in all dimensions, the test determines the dimension with the smallest overlap, and the contraction step [15] adjusts the two hyperboxes along that dimension, making them non-overlapping.

For example, in Fig. 1b both hyperboxes overlap in all dimensions. The overlap test determines that the least overlap occurs along the horizontal dimension. The contraction step adjusts the hyperboxes along this dimension; the resulting adjusted hyperboxes are drawn with a bold outline.
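A simplified sketch of the overlap test follows. Simpson's original test [15] distinguishes four containment cases per dimension; here we only compute the per-dimension overlap width, which suffices to locate the contraction dimension in the common case:

```python
import numpy as np

def min_overlap_dim(v1, w1, v2, w2):
    """Return the dimension of smallest overlap between two hyperboxes,
    or None if they are disjoint along at least one dimension."""
    overlap = np.minimum(w1, w2) - np.maximum(v1, v2)  # per-dimension overlap width
    if (overlap <= 0).any():        # separated somewhere: no contraction needed
        return None
    return int(np.argmin(overlap))  # contract along this dimension
```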

In testing an FMNN on a given test pattern x, the fuzzy membership of x in all the hyperboxes is computed, and x is assigned the decision class corresponding to the hyperbox with the highest fuzzy membership.
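With hyperboxes stored as records holding their bounds and class (a representation we choose purely for illustration), this decision rule reduces to a few lines reusing the membership sketch above:

```python
def fmnn_predict(x, hyperboxes):
    """Classify x by the class of the hyperbox with highest fuzzy membership."""
    best = max(hyperboxes, key=lambda h: membership(x, h["v"], h["w"]))
    return best["cls"]
```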

3 Proposed kNN-FMNN Algorithm

The classical FMNN algorithm [15], described in Sect. 2, produces non-overlapping hyperboxes across different classes. This loses the information present in the overlapping (boundary) region and makes it possible for objects of one class to become absolute members of hyperboxes of another class. This defuzzification of the overlapping region affects the generalizability of FMNN. The existing approaches [2, 3, 7, 8, 17] that represent the overlapping region by avoiding the overlap and contraction process all increase the complexity of the FMNN structure. The proposed kNN-FMNN approach aims at retaining the simple structure of FMNN while acquiring the ability to make decisions in overlapping regions.

In kNN-FMNN approach, the kNN classification algorithm is used for decision making when a testing pattern falls into an overlapping region.

The kNN classification algorithm has no training phase. For every test pattern, the distance between the test pattern and all training patterns is evaluated, and the k closest training patterns are selected as nearest neighbours. A vote is conducted over the classes of those k nearest neighbours, and the test pattern is assigned to the majority class. However, in the presence of large training data, kNN requires significant testing time.
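A minimal sketch of this voting rule is given below; the Euclidean metric is our assumption, as the paper does not fix a distance measure, and the default k = 3 matches the setting used in the experiments of Sect. 4:

```python
import numpy as np
from collections import Counter

def knn_predict(x, train_X, train_y, k=3):
    """Classify x by majority vote among its k nearest training patterns."""
    dist = np.linalg.norm(train_X - x, axis=1)  # distances to all training patterns
    nearest = np.argsort(dist)[:k]              # indices of the k closest
    return Counter(train_y[nearest]).most_common(1)[0][0]
```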

FMNN gives a natural way to group nearby objects into the granular structure of a hyperbox. We exploit this to restrict the space in which the k-nearest-neighbour computation needs to be performed, and employ it in the FMNN testing algorithm when dealing with overlapping regions.

The rest of this section describes the training and testing phases of the kNN-FMNN algorithm, given in Algorithms 1 and 2 respectively.

3.1 Training of kNN-FMNN Algorithm

Let DT represent the set of training patterns, and FM the FMNN model to be constructed. Initially FM is empty and, as training proceeds, hyperboxes are added to FM, extending the representation of a hyperbox H in FMNN given in Sect. 2: in our approach, the index list of the objects belonging to H is also maintained. For each input pattern x in DT, only the expansion step is performed, so as to preserve the overlapping region. Under the traditional FMNN expansion criterion in Eq. (3), a hyperbox can expand non-uniformly across dimensions, since only the cumulative width over all dimensions needs to stay below \(n\theta \). This can result in hyperboxes that are narrow strips along a few dimensions, which we found unsuitable for kNN-based decision making. To overcome this, we adopted the modified expansion criterion given by Gabrys et al. [7] in their work on the General Fuzzy Min-Max Neural Network for clustering and classification (GFMNN). For a hyperbox H with V and W as min and max points, and x as the input pattern, the modified expansion criterion is given in Eq. (6).

$$\begin{aligned} \forall _{i=1 \dots n} \quad \max (w_{ji},x_{hi}) - \min (v_{ji}, x_{hi}) \le \theta \end{aligned}$$
(6)

This modified expansion criterion bounds the width of a hyperbox along every dimension by \(\theta \), and helps generate more uniform hyperboxes, which are better suited to kNN-based decision making.
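Compared with the try_expand sketch in Sect. 2, only the test changes, now constraining every dimension individually (names again ours):

```python
import numpy as np

def try_expand_gfmnn(v, w, x, theta):
    """Expand [v, w] to include x only if every resulting width stays <= theta (Eq. (6))."""
    if ((np.maximum(w, x) - np.minimum(v, x)) <= theta).all():
        return np.minimum(v, x), np.maximum(w, x), True
    return v, w, False  # caller creates a point hyperbox instead
```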

Algorithm 1. Training of kNN-FMNN

For every training pattern x, the method Belongs(x) computes the fuzzy membership of x in all hyperboxes of the class of x using Eq. (2), and determines whether there exists a hyperbox giving x a full membership of one.

$$ \begin{aligned} Belongs(x) = \left\{ \exists h \in HBS~|~Memb_{h}(x) == 1~ \& ~class(x) == class(h) \right\} \end{aligned}$$
(7)

where HBS represents a set of hyperboxes.

If Belongs(x) is true, x is added to the hyperbox giving the full membership, without any modification of the hyperbox. Otherwise, HMemb(x) finds the hyperbox H of the same class giving the highest membership. If such an H exists and the expansion criterion is satisfied, H is expanded using Eqs. (4) and (5) and x is stored as a member of the resulting hyperbox. If the expansion criterion is not met, or no hyperbox of the corresponding class exists, a point hyperbox is created using Create(x), and x is added to it.
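Putting these pieces together, a sketch of the whole training phase (reusing membership and try_expand_gfmnn from above, with FM held as a list of records carrying bounds, class and member indices) could read:

```python
import numpy as np

def train_knn_fmnn(DT, labels, theta=0.2, gamma=0.4):
    """Training sketch: expansion only, no overlap test and no contraction."""
    FM = []  # each hyperbox: {"v", "w", "cls", "members"}
    for idx, (x, c) in enumerate(zip(DT, labels)):
        same = [h for h in FM if h["cls"] == c]
        # Belongs(x): a same-class hyperbox already gives full membership
        full = [h for h in same if membership(x, h["v"], h["w"], gamma) == 1.0]
        if full:
            full[0]["members"].append(idx)
            continue
        if same:
            # HMemb(x): same-class hyperbox with highest membership
            h = max(same, key=lambda h: membership(x, h["v"], h["w"], gamma))
            v, w, ok = try_expand_gfmnn(h["v"], h["w"], x, theta)
            if ok:
                h["v"], h["w"] = v, w
                h["members"].append(idx)
                continue
        # Create(x): point hyperbox with v = w = x
        FM.append({"v": x.copy(), "w": x.copy(), "cls": c, "members": [idx]})
    return FM
```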

3.2 Testing of kNN-FMNN Algorithm

Let DS be the set of test samples. For every test pattern x in DS, we compute its fuzzy membership value in all hyperboxes of FM. Because overlap among hyperboxes is allowed in the training phase, x may obtain an absolute membership of 1 in multiple hyperboxes. absMemb(x) returns all the hyperboxes giving full membership. If this set is empty, the test pattern does not belong to any hyperbox, and the decision is taken as in traditional FMNN testing, by assigning the decision class corresponding to the nearest hyperbox. If absMemb(x) returns a non-empty collection of hyperboxes, the purity of the collection is examined: the collection is pure if and only if all its hyperboxes correspond to a single decision class, in which case that class is assigned to the test pattern without ambiguity. In case of impurity, the objects belonging to all these hyperboxes are collected by the LocalSet function, and kNN is performed locally on them to determine the decision class of x. The functions used are described below:

\(absMemb(x) = \left\{ h \in HBS~|~Memb_{h}(x) == 1 \right\} \): Collection of hyperboxes which have full membership for the object x.

\(pure(absMemb(x)) = \left\{ \forall h_{1}, h_{2} \in absMemb(x)~|~class(h_{1}) == class(h_{2}) \right\} \): checks that all hyperboxes containing x correspond to the same decision class.

Members(h): the set of objects belonging to hyperbox h.

\(LocalSet(absMemb(x)) = \bigcup _{h \in absMemb(x)}(Members(h))\).

knnlocal(LocalSet(absMemb(x)), x): applies kNN on the selected objects.
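A sketch of the complete testing phase mirroring these functions follows; DT and labels are assumed to be NumPy arrays so that the member indices stored during training can be used directly, and knn_predict is the kNN sketch from earlier:

```python
import numpy as np

def test_knn_fmnn(x, FM, DT, labels, k=3, gamma=0.4):
    """Classify test pattern x with the trained model FM (Algorithm 2 sketch)."""
    memb = [membership(x, h["v"], h["w"], gamma) for h in FM]
    abs_memb = [h for h, m in zip(FM, memb) if m == 1.0]         # absMemb(x)
    if not abs_memb:                                             # classical FMNN decision
        return FM[int(np.argmax(memb))]["cls"]
    classes = {h["cls"] for h in abs_memb}
    if len(classes) == 1:                                        # pure(absMemb(x))
        return classes.pop()
    local = sorted({i for h in abs_memb for i in h["members"]})  # LocalSet(absMemb(x))
    return knn_predict(x, DT[local], labels[local], k)           # knnlocal(...)
```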

Algorithm 2. Testing of kNN-FMNN

4 Experiments

4.1 Performance Comparison with MLF [3] Approach

For evaluating the performance of kNN-FMNN, we adopted the experimental setup given in [3] for the MLF algorithm. In [3], MLF's performance was compared with popular variants of FMNN, namely FMNN [15], GFMNN [7], EFC [2], FMCN [8] and DCFMN [17], in terms of average misclassification, average number of hyperboxes produced and computational time in milliseconds. The experiments were conducted on synthesized and standard datasets. Furthermore, stratified 3-fold cross-validation was performed on each original dataset: the dataset was partitioned into three subsets, and in each iteration one subset was retained for testing while the remaining two were used for training the model. This validation is repeated three times, once per fold.

Table 1. Benchmark datasets

In this paper, we conducted experiments on kNN-FMNN following the same procedure as [3]. The system configuration used for our experiments is CPU: Intel i5 7500, clock speed: 3.40 GHz \(\times \) 4, RAM: 8 GB DDR4, OS: Ubuntu 16.04 LTS 64-bit, and software: RStudio version 1.1.456. The configuration used in [3] for MLF and the associated algorithms is CPU: Core 2 Duo, clock speed: 1.3 GHz and RAM: 4 GB.

The experiments were conducted on nine benchmark numeric datasets, collected from the UCI machine learning repository [5], which were used in the MLF [3] experimentation. The numeric datasets are described in Table 1. The synthesized datasets used in [3] could not be included because they are unavailable. We employed two expansion thresholds (\(\theta = 0.2\) and 0.3) for kNN-FMNN, with \(\gamma \) set to 0.4 and k in kNN set to 3.

All experimental results of kNN-FMNN and the other FMNN methods are listed in Table 2. The results for MLF and the associated algorithms are reproduced from [3] for comparison. The best result in each category for each dataset is shown in boldface.

Table 2. Comparative experiment results

4.2 Analysis of Results

The computational complexity of the FMNN training algorithm is proportional to the number of hyperboxes created. In addition to the number of hyperboxes, the cost of complex structures such as compensatory neurons, exclusion hyperboxes and hierarchical layers in algorithms like FMCN, DCFMN and MLF further increases the complexity. The computational times reported in Table 2 confirm this: kNN-FMNN completes training in much less time than MLF and the other approaches. The computational efficiency of kNN-FMNN comes from adopting only the expansion step of FMNN and from creating far fewer hyperboxes than the other approaches.

From Fig. 3, it is observed that the average number of hyperboxes created is lower than for the other FMNN methods, except on the Thyroid dataset, owing to the sparsity of that dataset. Since more hyperboxes are created for the Thyroid dataset, the training time is higher only for this dataset.

Figure 2 depicts the experimental results for classification accuracy. It is observed that on all datasets, kNN-FMNN with \(\theta \) = 0.2 or 0.3 achieved similar or better classification accuracy compared to MLF and the other approaches.

kNN-FMNN achieved significantly better classification accuracies on the Parkinson, Ozone layer, Glass and Thyroid datasets. For example, on the Parkinson dataset, kNN-FMNN with \(\theta = 0.2\) obtained 92.3% average classification accuracy with 81 hyperboxes on average, whereas MLF gave 83.49% accuracy with 111 hyperboxes.

We experimented with kNN-FMNN for several values of \(\theta \); not all results are reported due to space constraints. It is observed that for small \(\theta \) values such as 0.02 the number of hyperboxes is huge, while for large \(\theta \) values such as 0.9 the number of hyperboxes is small but at the cost of misclassification. The best results (minimizing the number of hyperboxes while maximizing classification accuracy) are obtained for \(\theta \) between 0.2 and 0.3, and this range is recommended.

Fig. 2. Comparison of average classification accuracy

Fig. 3. Comparison of average hyperbox size

5 Conclusion

Several improvements have been proposed for the Fuzzy Min-Max Neural Network to overcome the limitations arising from its contraction step. These extensions add complexity to FMNN and thus increase its training time. This work proposed kNN-FMNN, a hybridization of FMNN with kNN, to remove the contraction step of FMNN. The proposed approach builds a classification model with fewer hyperboxes and achieves good classification accuracy by applying kNN locally to disambiguate classification decisions in overlapping regions. The experimental results establish that kNN-FMNN achieves better classification accuracy than existing approaches such as MLF, FMCN, DCFMN, EFC and GFMNN in less computation time. In future work, we will attempt to develop parallel and distributed versions of kNN-FMNN to achieve scalability on large-scale decision systems.