
1 Introduction

The purpose of classification is to learn a target function that links a set of data to a set of predefined categories (classes). The constructed target functions are also called classification models.

To address the instability of some classifiers [1, 2], it is widely accepted that using an ensemble of classifiers generally gives better results than using a single classification model [3,4,5,6]. Ensembles can be either homogeneous or heterogeneous depending on the nature of the models used. Homogeneous ensembles are obtained by performing different executions of the same learning algorithm, for example by varying the parameters of the learning algorithm or by manipulating the data (input or output) [2, 7,8,9]. Heterogeneous ensembles, in contrast, are obtained by running different learning algorithms on the same dataset. These models have different views of the data, which yields diverse ensembles.

Generating a large number of predictors allows a broad exploration of the solution space, and aggregating all the predictions yields a predictor that accounts for this exploration. The number of models can be increased without risk of overfitting [10]. However, a very large number of models requires a large storage space and a long prediction time, since all the predictors must be queried.

The simplification of ensembles of classifiers, called ensemble selection, reduces the size of ensembles prior to their integration or combination [11,12,13,14,15,16]. In this paper, we use a diversity-based entropy measure combined with a genetic algorithm-based search strategy to simplify an ensemble of heterogeneous classifiers.

The rest of the paper is organized as follows. In Sect. 2, we review recent work in the ensemble selection field, especially work related to heterogeneous ensembles. Section 3 details the proposed method, describing the entropy measure and the search strategy used. In Sect. 4, we give the experimental results and a comparative study against a heterogeneous ensemble selection method from the literature. Finally, in Sect. 5, we conclude and outline future work.

2 Literature Review

Ensemble methods have attracted growing interest since their appearance and have been applied in several areas, namely statistics, pattern recognition, and machine learning. These methods consist of two main phases: a model production phase and an aggregation or combination phase. In the first phase, models of variable performance are added arbitrarily; models with reduced performance negatively affect the ensemble performance. An ensemble may also contain several similar models, which reduces its diversity.

Several recent works on ensemble selection [3, 17,18,19] have been developed in the literature. Dai [20] proposes an improvement of the same authors' earlier ensemble selection method. This method uses depth-first backtracking, which is well suited to systematically searching for solutions to large combinatorial problems. The improvement concerns the response time of the method, which is considerably reduced in this study.

Bhatnagar et al. [21] perform ensemble selection using a function based on both performance and diversity, which considers the individual performance of classifiers as well as the diversity between pairs of classifiers. A bottom-up search is performed to generate the sub-ensembles by adding pairs of classifiers with high performance.

Simplifying a set of classifiers usually involves reducing the number of trees while maximizing performance. Based on rough set theory, Guo et al. [22] propose a new framework for ensemble selection. In this context, the relationships between attributes in an approximation space are considered a priori, along with their maximum degree of dependency. This effectively reduces the search space and increases the diversity of the selected sub-ensembles. Finally, to choose the appropriate sub-ensemble, an evaluation function that balances diversity and accuracy is used. The proposed method repeatedly changes the search space of the relevant sub-ensembles and selects the next ensemble from a new search space.

Selection methods have been applied to both homogeneous and heterogeneous ensembles [16, 23]. Heterogeneous ensembles are developed for sensitive application domains. Haque et al. [24] propose a genetic algorithm-based search to find the optimal combination of classifiers forming a heterogeneous ensemble. For this, they develop an algorithm that uses 10-fold cross-validation on the training set to evaluate the quality of each candidate ensemble. The proposed method uses a random resampling approach to balance the class distribution and is particularly suited to class-imbalance problems.

Pölsterl et al. [25] use a heterogeneous ensemble of survival models to predict the survival of patients with stage 3 prostate cancer. The results demonstrate that the constructed ensemble can predict the date of death of patients with this disease.

Partalas et al. [26] propose a diversity measure and a forward selection search strategy that considers all the possible cases that may arise when adding a model hk to an ensemble. The measure, called FES (Focused Ensemble Selection), considers four cases when adding hk to the current sub-ensemble Sub:

$$ \begin{aligned} FES_{Eval}\left( h_{k}, Sub \right) = \sum_{i = 1}^{\left| Eval \right|} \Big( & \alpha \cdot I\left( y_{i} = h_{k}(x_{i}) \wedge y_{i} \ne Sub(x_{i}) \right) - \beta \cdot I\left( y_{i} \ne h_{k}(x_{i}) \wedge y_{i} = Sub(x_{i}) \right) \\ & + \beta \cdot I\left( y_{i} = h_{k}(x_{i}) \wedge y_{i} = Sub(x_{i}) \right) - \alpha \cdot I\left( y_{i} \ne h_{k}(x_{i}) \wedge y_{i} \ne Sub(x_{i}) \right) \Big). \end{aligned} $$

Eval: the evaluation (selection) sample;

\( I\left( y_{i} = h_{k}(x_{i}) \wedge y_{i} \ne Sub(x_{i}) \right) = 1 \) if the instance xi is correctly classified by the model hk but not by the current sub-ensemble Sub, and 0 otherwise.

The factors α and β represent, respectively, the number of models in the sub-ensemble Sub that correctly classify the instance (xi, yi) and the number of models that misclassify it.
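As an illustration, the following Python sketch shows how the FES score defined above could be computed on an evaluation sample. The function name fes_score and its arguments are our own illustrative choices, not part of [26].

```python
import numpy as np

def fes_score(h_k_pred, sub_pred, y_eval, sub_preds_per_model):
    """Illustrative computation of the FES evaluation defined above.

    h_k_pred: predictions of the candidate model h_k on the evaluation sample.
    sub_pred: combined (e.g. majority-vote) predictions of the current sub-ensemble Sub.
    y_eval:   true labels of the evaluation sample.
    sub_preds_per_model: array of shape (n_models_in_Sub, n_eval) with the
        predictions of each member of Sub, used to derive alpha and beta per instance.
    """
    sub_preds_per_model = np.asarray(sub_preds_per_model)
    score = 0.0
    for i, y in enumerate(y_eval):
        # alpha: number of models of Sub that classify (x_i, y_i) correctly
        alpha = int(np.sum(sub_preds_per_model[:, i] == y))
        # beta: number of models of Sub that misclassify it
        beta = sub_preds_per_model.shape[0] - alpha

        hk_ok = (h_k_pred[i] == y)
        sub_ok = (sub_pred[i] == y)

        if hk_ok and not sub_ok:        # h_k corrects the ensemble
            score += alpha
        elif (not hk_ok) and sub_ok:    # h_k contradicts a correct ensemble
            score -= beta
        elif hk_ok and sub_ok:          # both correct
            score += beta
        else:                           # both wrong
            score -= alpha
    return score
```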

3 The EMnGA Method

3.1 Entropy Measure

The key idea of this approach is to retain only classifiers that are maximally diverse (i.e., weakly correlated with each other). This is based on the principle that the generalization error of the ensemble decreases as the diversity among its members increases.

Let ΩV be a sample of individuals with their labels (classes), \( \Omega_{V} = \left\{ v_{1}, \ldots, v_{n} \right\} \), with |ΩV| = n. Each individual vj is described by m variables denoted \( x_{1j}, \ldots, x_{mj} \). Let Ci be a classifier of the classifier ensemble \( \left\{ C_{1}, \ldots, C_{i}, \ldots, C_{T} \right\} \), represented by an n-dimensional binary vector \( y_{i} = (y_{1i}, \ldots, y_{ni})^{T} \) such that yji = 1 if the classifier Ci correctly classifies the individual vj and 0 otherwise. The entropy function fE measures the diversity within an ensemble [27]. Given an individual \( x_{j} \in \Omega_{V} \), if half of the T classifiers classify xj correctly, then the other half T − T/2 necessarily misclassify it, and vice versa; in this case the diversity is maximal.

We denote by nc(xj) the number of classifiers among the T that correctly classify xj, \( nc\left( x_{j} \right) = \sum\nolimits_{i = 1}^{T} y_{ji} \). The entropy measure fE is written as:

$$ f_{E} = \frac{1}{n}\sum\limits_{j = 1}^{n} \frac{1}{T - \frac{T}{2}}\min \left\{ nc\left( x_{j} \right), T - nc\left( x_{j} \right) \right\} $$

\( f_{E} \in \left[ 0, 1 \right] \), where 1 indicates very large diversity and 0 a lack of diversity. Thus, the goal is to maximize fE.
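A minimal Python sketch of the fE computation is given below, assuming the ensemble's correct/incorrect decisions are available as a binary matrix; the function name entropy_diversity is illustrative.

```python
import numpy as np

def entropy_diversity(correct):
    """Entropy-based diversity f_E as defined above.

    correct: binary array of shape (T, n) where correct[i, j] = 1 if
             classifier C_i correctly classifies individual x_j, 0 otherwise.
    Returns f_E in [0, 1]: 1 indicates maximal diversity, 0 no diversity.
    """
    correct = np.asarray(correct)
    T, n = correct.shape
    nc = correct.sum(axis=0)          # nc(x_j): number of correct votes per individual
    norm = T - T / 2.0                # normalizing factor from the formula
    return float(np.mean(np.minimum(nc, T - nc) / norm))
```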

3.2 Genetic Algorithms

Genetic algorithms are a natural choice for selection because they are inspired by natural selection. They generate individuals that optimize an evaluation function, also called the fitness function.

A genetic algorithm is defined by [28]:

  • Individual: also called a chromosome or sequence, it represents a potential solution of the problem. In our case, a solution corresponds to a binary string of size T (the number of models composing the ensemble). A chromosome is noted \( ch = (val_{1} val_{2} \ldots val_{T}) \) where vali = 1 if the i-th model is present in the selected chromosome and 0 otherwise;

  • Population: the set of chromosomes representing all possible combinations of 1 and 0 in a binary string of size T;

  • Environment: represents the search space, \( \left| ER \right| = 2^{T} \);

  • Fitness function: corresponds to the function ffE = fE (fE is the diversity function defined above). The goal is to maximize the value of ffE.

Computing the fitness function ffE for a chromosome ch1 amounts to computing the function fE over the classifiers selected by ch1:

$$ f_{E} = \frac{1}{n}\sum\limits_{j = 1}^{n} \frac{1}{T - \frac{T}{2}}\min \left\{ nc\left( x_{j} \right), T - nc\left( x_{j} \right) \right\} $$

For instance, let n = |ΩV| = 2 and T = 2 (the classifiers corresponding to the value 1, say T1 and T3); x1 and x2 are the individuals of ΩV, classified as (1 0)t and (0 1)t respectively by T1 and T3.

nc(x1) = 1 (the number of classifiers that correctly classify the instance x1).

nc(x2) = 1 (the number of classifiers that correctly classify the instance x2).

\( f_{E} = \frac{1}{2}\left( \frac{1}{2 - \frac{2}{2}} \cdot \min \left( 1, 2 - 1 \right) + \frac{1}{2 - \frac{2}{2}} \cdot \min \left( 1, 2 - 1 \right) \right) = 1 \) and ffE = fE = 1. The fitness ffE reaches its minimum 0 when the classifiers all agree (no diversity) and its maximum 1 when the diversity is maximal; hence \( ff_{E} \in \left[ 0, 1 \right] \).
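This toy computation can be reproduced with the entropy_diversity sketch from Sect. 3.1:

```python
# Toy example from the text: two classifiers, two individuals.
# T1 classifies (x1, x2) as (correct, wrong) -> row (1, 0);
# T3 classifies them as (wrong, correct)     -> row (0, 1).
correct = [[1, 0],
           [0, 1]]
print(entropy_diversity(correct))   # -> 1.0, maximal diversity
```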

We give hereafter the EMnGA algorithm, which uses a genetic algorithm-based search strategy and an entropy-based fitness function. It is described by the following pseudo code:

(Figure a: EMnGA pseudo code.)
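Since the pseudo code figure is not reproduced here, the sketch below illustrates one plausible reading of the EMnGA loop: a standard genetic algorithm over binary chromosomes of length T (tournament selection, one-point crossover, bit-flip mutation) that maximizes ffE = fE. The population size, number of generations, and crossover and mutation rates are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def emnga_select(correct, pop_size=50, generations=100, p_cross=0.9, p_mut=0.02, seed=None):
    """Illustrative GA-based ensemble selection maximizing f_E.

    correct: (T, n) binary matrix; correct[i, j] = 1 if classifier i is right on x_j.
    Returns a binary mask of length T (the best chromosome found).
    """
    rng = np.random.default_rng(seed)
    correct = np.asarray(correct)
    T = correct.shape[0]                      # assumes T >= 2

    def fitness(ch):                          # ffE = fE over the selected classifiers
        sel = correct[ch == 1]
        t = sel.shape[0]
        if t < 2:                             # degenerate chromosome: no diversity to measure
            return 0.0
        nc = sel.sum(axis=0)
        return float(np.mean(np.minimum(nc, t - nc) / (t - t / 2.0)))

    pop = rng.integers(0, 2, size=(pop_size, T))
    best, best_fit = None, -1.0
    for _ in range(generations):
        fits = np.array([fitness(ch) for ch in pop])
        if fits.max() > best_fit:             # keep the best chromosome seen so far
            best_fit, best = fits.max(), pop[fits.argmax()].copy()
        new_pop = []
        while len(new_pop) < pop_size:
            # binary tournament selection of two parents
            i, j = rng.integers(0, pop_size, size=2)
            p1 = pop[i] if fits[i] >= fits[j] else pop[j]
            i, j = rng.integers(0, pop_size, size=2)
            p2 = pop[i] if fits[i] >= fits[j] else pop[j]
            c1, c2 = p1.copy(), p2.copy()
            if rng.random() < p_cross:        # one-point crossover
                cut = rng.integers(1, T)
                c1[cut:], c2[cut:] = p2[cut:].copy(), p1[cut:].copy()
            for c in (c1, c2):                # bit-flip mutation
                flip = rng.random(T) < p_mut
                c[flip] = 1 - c[flip]
                new_pop.append(c)
        pop = np.array(new_pop[:pop_size])
    return best
```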

4 Experiments and Results

In this section, we describe the datasets used to carry out our experiments. We experimented on 8 benchmark datasets downloaded from the UCI Repository [29], as depicted in Table 1.

Table 1. Description of datasets

All datasets contain enough data to be split into three samples: a learning sample ΩL (40% of the initial sample size), a validation or selection sample (20% of the initial sample size), and the remaining 20% used for testing.

We adopt an approach similar to that proposed in [26]. For each dataset, we generate a heterogeneous ensemble of 200 models (classifiers) containing the following (a condensed code sketch is given after the list):

  • 60 k-NN (k-nearest neighbor) classifiers: 22 values of K ranging from 1 to the size of the training sample, with three weighting schemes: no weighting, inverse-distance weighting, and similarity weighting;

  • 110 SVMs: 10 values of the complexity parameter {10−7, 10−6, 10−5, 10−4, 10−3, 10−2, 0.1, 1, 10, 100} and 10 different kernels (2 polynomial kernels of degree 2 and 3, and 8 radial kernels with gamma ∈ {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2});

  • 2 naive Bayes classifiers: one built with the default parameters and one with a kernel density estimate;

  • 4 decision trees: two values are used for the confidence factor {0.25, 0.5};

  • 24 MLPs with a single hidden layer containing 6 neurons.
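The sketch below shows, in condensed form, how such a heterogeneous pool could be assembled with scikit-learn. The parameter grids are abbreviated, and some Weka-specific options (similarity weighting, kernel-estimate naive Bayes, the J48 confidence factor) have no direct scikit-learn equivalent, so the counts here are illustrative rather than the paper's exact 200 models.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

def build_pool(k_values, C_values, gammas, n_trees=4, n_mlp=24):
    """Illustrative heterogeneous pool; not the paper's exact configuration."""
    pool = []
    # k-NN part: several values of k with two of the three weighting schemes
    # (similarity weighting would require a custom 'weights' callable).
    for k in k_values:
        for w in ("uniform", "distance"):
            pool.append(KNeighborsClassifier(n_neighbors=k, weights=w))
    # SVM part: polynomial kernels of degree 2 and 3, plus RBF kernels.
    for C in C_values:
        for d in (2, 3):
            pool.append(SVC(C=C, kernel="poly", degree=d))
        for g in gammas:
            pool.append(SVC(C=C, kernel="rbf", gamma=g))
    # Naive Bayes: Gaussian variant only (the kernel-estimate variant is a Weka option).
    pool.append(GaussianNB())
    # Decision trees: the confidence factor is a Weka J48 parameter; defaults are used here.
    pool.extend(DecisionTreeClassifier(random_state=s) for s in range(n_trees))
    # MLPs with a single hidden layer of 6 neurons.
    pool.extend(MLPClassifier(hidden_layer_sizes=(6,), random_state=s) for s in range(n_mlp))
    return pool

# Example (abbreviated grids):
# pool = build_pool(k_values=range(1, 23), C_values=[1e-3, 1, 100], gammas=[0.01, 0.5])
```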

For each dataset, the generation of the ensemble (composed of 200 models) is repeated 10 times, and majority voting is used to combine the different models.
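A simple majority-vote combiner, assuming each model outputs class labels, can be sketched as follows:

```python
import numpy as np

def majority_vote(predictions):
    """Combine model outputs by majority vote.

    predictions: array of shape (n_models, n_samples) holding each model's
    predicted class labels; returns one combined label per sample.
    """
    predictions = np.asarray(predictions)
    n_samples = predictions.shape[1]
    combined = np.empty(n_samples, dtype=predictions.dtype)
    for j in range(n_samples):
        labels, counts = np.unique(predictions[:, j], return_counts=True)
        combined[j] = labels[counts.argmax()]   # most frequent label wins
    return combined
```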

In Table 2, we give the performance results (success rates) obtained on the 8 datasets, averaged over 10 iterations for each dataset. Comparisons are made between the initial ensemble EI, the EMnGA method, and FES [26].

Table 2. Success rates obtained respectively by EI, EMnGA and FES for the 8 datasets

The EMnGA method outperforms FES on 5 of the 8 datasets, with improvements ranging from 0.5% to 10.92%. FES does better than the proposed method on 3 datasets, with improvements of 0.45% for kr-vs-kp and segment, and 1.27% for tic-tac-toe. On average, across all datasets, EMnGA improves on the performance of FES by 3% and on the initial ensemble EI by 12% (Table 3).

Table 3. Sizes of ensembles obtained by EMnGA and FES on the 8 datasets

We note that the FES method generates smaller ensembles in 6 of the 8 cases, with reductions ranging from 0.2 to 2.3. EMnGA produces smaller ensembles on the credit-g and tic-tac-toe datasets, by 0.1 and 1.8 respectively. On average, over the 8 datasets, FES yields a reduction of 0.5 compared to EMnGA.

5 Conclusion and Future Work

In this paper, we proposed an ensemble selection method using a diversity-based entropy measure combined with a genetic algorithm-based search strategy. The proposed method was compared with the FES method [26], which uses a forward selection search for heterogeneous ensembles. Based on the comparison against the criteria of performance and size of the generated ensemble, we observed that the EMnGA method generates larger ensembles than FES in most cases, mainly because its search strategy explores more possibilities, but it achieves better performance.

In future work, we will compare the measure with other methods from the literature using other search strategies, and we will use more datasets in order to formally validate our results with appropriate statistical tests. We will also experiment with a measure based on both diversity and performance in order to assess the impact of jointly using both criteria. Finally, we intend to apply the EMnGA method to random forest simplification.