Abstract
This paper presents a new extension of the AdaBoost algorithm that concerns the weights used in the algorithm: we propose a linear modification of the original weights. Our study uses boosting by reweighting, with each weak classifier based on a linear classifier. The described algorithm was tested on the Pima data set, and the obtained results are compared with those of the original AdaBoost algorithm.
1 Introduction
Boosting is an effective machine learning method for producing a very accurate classification rule by combining weak classifiers [7]. A weak classifier is defined as a classifier that is only slightly correlated with the true classification, i.e. it classifies objects better than a random classifier. In boosting, each weak classifier is learned from training examples sampled from the original learning set. The sampling procedure is based on the weight of each example, and in each iteration the weights of the examples change. The final decision of the boosting algorithm is determined by the ensemble of classifiers derived from all iterations of the algorithm. One of the fundamental problems in developing different boosting algorithms is choosing the weights and defining the combination rule for the ensemble of classifiers. In recent years, many authors have presented various concepts based on the boosting idea [6, 9]. In this article we present a new extension of the AdaBoost [5] algorithm in which a linear modification of the weights is applied.
This paper is organized as follows: Sect. 2 introduces the necessary terms of the AdaBoost algorithm. Section 3 describes our modification of this algorithm. Section 4 presents the experimental results comparing AdaBoost with our modification. Finally, some conclusions are presented.
2 AdaBoost Algorithm
Weak and strong learning algorithms were discussed in [5]. Weak algorithms classify objects only slightly better than random guessing, whereas strong algorithms classify objects accurately. Schapire formulated the first algorithm to “boost” a weak classifier. The main idea of boosting is to improve the predictions of a weak learning algorithm by combining a set of weak classifiers into a single strong classifier. The best-known and most widely applied such method is the AdaBoost algorithm. Its main steps are as follows [2] (Tables 1 and 2):
One of the main steps of the algorithm is maintaining a distribution over the training set by means of the weights. Initially, all weights of the training observations are set equally. If an observation is incorrectly classified at the current stage, its weight is increased; similarly, a correctly classified observation receives a smaller weight in the next step. For this reason the weak learner is forced to focus on the hard examples of the training set in each subsequent step of the algorithm. In each step of the AdaBoost algorithm the best weak classifier according to the current distribution of the observation weights is found. The goodness of a weak classifier is measured by its error, and based on the value of this error the coefficient \(c_t\) is calculated. The final prediction of the AdaBoost algorithm is a weighted majority vote of all weak classifiers.
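The reweighting loop described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: decision stumps stand in for the linear weak classifier the paper uses, and labels are assumed to be in {−1, +1}.

```python
import numpy as np

def stump_predict(stump, X):
    """Evaluate a decision stump (feature index, threshold, sign) on X."""
    feat, thresh, sign = stump
    return sign * np.where(X[:, feat] <= thresh, 1.0, -1.0)

def best_stump(X, y, w):
    """Find the weak classifier with the lowest weighted error under w."""
    best, best_err = None, np.inf
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for sign in (1.0, -1.0):
                pred = sign * np.where(X[:, feat] <= thresh, 1.0, -1.0)
                err = w[pred != y].sum()
                if err < best_err:
                    best, best_err = (feat, thresh, sign), err
    return best, best_err

def adaboost_fit(X, y, T=10):
    """AdaBoost by reweighting; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # all initial weights equal
    ensemble = []
    for t in range(T):
        stump, err = best_stump(X, y, w)     # best weak classifier under w
        err = min(max(err, 1e-10), 1.0 - 1e-10)
        c = 0.5 * np.log((1.0 - err) / err)  # coefficient c_t from the error
        pred = stump_predict(stump, X)
        w = w * np.exp(-c * y * pred)        # misclassified examples gain weight
        w = w / w.sum()
        ensemble.append((c, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted majority vote of all weak classifiers."""
    scores = sum(c * stump_predict(s, X) for c, s in ensemble)
    return np.sign(scores)
```

The coefficient \(c_t\) plays a double role here: it scales each classifier's vote in the final decision and it controls how sharply the weights shift towards misclassified examples, which is exactly the quantity the modification in the next section targets.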
3 AdaBoost Algorithm with Linear Modification of the Weights
One of the main factors affecting the behaviour of the AdaBoost algorithm is the selection of the weights assigned to individual elements of the learning set. We therefore propose a modification of the AdaBoost algorithm in which a linear modification of the weights is introduced: the value of the coefficient \(c_t\) is modified in step 4d, and the size of the modification depends on the iteration number t. In the experimental studies we assume that in the first iteration the value of the coefficient after modification (step 4d) is 1.25, 1.5, 1.75, 2, 2.25 or 2.5 times higher than in the original algorithm (step 4c). The steps of the proposed algorithm are presented in Table 3.
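A minimal sketch of the linear modification, under one plausible interpretation: the multiplier applied to \(c_t\) in the reweighting step decreases linearly from its first-iteration value \(c_0\) to 1 at the last iteration (this reproduces the a and b values reported in Sect. 4 for the 2-times setting; the exact Table 3 formulation may differ):

```python
import numpy as np

def lmw_scale(t, T, c0):
    """Linear multiplier for the reweighting coefficient: equals c0 at the
    first iteration (t = 1) and shrinks linearly to 1 at the last (t = T)."""
    a = (1.0 - c0) / (T - 1)  # slope of the linear modification
    b = c0 - a                # intercept, so that a*1 + b = c0
    return a * t + b

def lmw_reweight(w, c, y, pred, t, T, c0=1.5):
    """Step 4d with the linear modification: the weights are updated with
    lmw_scale(t, T, c0) * c instead of c itself (an assumed placement)."""
    w = w * np.exp(-lmw_scale(t, T, c0) * c * y * pred)
    return w / w.sum()
```

For T = 25 and c0 = 2 this yields a = −1/24 ≈ −0.041666667 and b = 2 + 1/24 ≈ 2.041666667, matching the parameters quoted for the 2-times setting in the experiments.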
In earlier work [1] we proposed changing the weights based on interval-valued fuzzy sets, and in [11] a linear combination of the upper and lower values of the weights was applied to a brain-computer interface.
4 Experiments
To test the Lmw-AdaBoost algorithm, we performed experiments on the Pima data set. A feature selection process [10] was performed to identify the four most informative features of this data set. The final results were obtained via 10-fold cross-validation.
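A 10-fold cross-validation run of this kind can be sketched with scikit-learn. Synthetic data with the Pima data set's dimensions stands in for the real CSV here, and scikit-learn's stock `AdaBoostClassifier` (built on decision stumps rather than the linear weak classifier used in the paper) stands in for the evaluated algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the Pima data set (768 samples, 8 features);
# substitute the actual data in practice.
X, y = make_classification(n_samples=768, n_features=8,
                           n_informative=4, random_state=0)

clf = AdaBoostClassifier(n_estimators=25)   # 25 boosting iterations, as in the paper
scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
mean_accuracy = scores.mean()
```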
The results for twenty-five iterations of the AdaBoost and the proposed Lmw-AdaBoost algorithms are presented in Table 4.
The best results (from the third iteration onwards) are shown in bold. In general, the results for the AdaBoost algorithm are worse than those for the proposed Lmw-AdaBoost modifications. For the first twelve iterations no clear pattern emerges. In iterations 13–19 the best algorithm is the one in which the original coefficient \(c_t\) is increased 1.5 times in the first iteration; since this coefficient is unchanged in the last iteration, the parameters a and b are equal to \(-0.020833333\) and 1.541666667, respectively. In the final iterations the best algorithm is the one with parameters a and b equal to \(-0.041666667\) and 2.041666667, respectively; with these parameters the coefficient \(c_t\) is increased 2 times in the first iteration. The obtained results show an improvement in quality for the proposed modification of the AdaBoost algorithm with respect to the original one.
5 Conclusions
In this paper we presented the new Lmw-AdaBoost algorithm. It is a modification of the AdaBoost algorithm in which the coefficient \(c_t\) is changed; consequently, the change affects the weights assigned to the individual learning objects. The changes compared to the original algorithm are linear, and the magnitude of the change is greater in the initial iterations.
The experiments were carried out on the Pima data set. Their aim was to compare the proposed algorithm with the original AdaBoost algorithm. The obtained results show an improvement in classification quality for the proposed method with respect to the original one.
Future work might include introducing the proposed modification into other boosting algorithms, such as Real AdaBoost or Gentle AdaBoost, applying the proposed methods to various practical tasks [3, 4, 8], and testing on other data sets.
References
Burduk, R.: The AdaBoost algorithm with the imprecision determine the weights of the observations. In: Asian Conference on Intelligent Information and Database Systems, pp. 110–116. Springer, Cham (2014)
Dmitrienko, A., Chuang-Stein, C., D’Agostino, R.B.: Pharmaceutical statistics using SAS: a practical guide. SAS Institute (2007)
Forczmański, P., Łabędź, P.: Recognition of occluded faces based on multi-subspace classification. In: Computer Information Systems and Industrial Management, pp. 148–157. Springer, Heidelberg (2013)
Frejlichowski, D.: An algorithm for the automatic analysis of characters located on car license plates. In: International Conference Image Analysis and Recognition, pp. 774–781. Springer, Heidelberg (2013)
Freund, Y., Schapire, R.E.: A desicion-theoretic generalization of on-line learning and an application to boosting. In: European Conference on Computational Learning Theory, pp. 23–37. Springer, Heidelberg (1995)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: ICML 1996, pp. 148–156 (1996)
Kearns, M., Valiant, L.: Cryptographic limitations on learning boolean formulae and finite automata. J. ACM (JACM) 41(1), 67–95 (1994)
Kozik, R., Choraś, M.: The HTTP content segmentation method combined with adaboost classifier for web-layer anomaly detection system. In: International Conference on EUropean Transnational Education, pp. 555–563. Springer, Heidelberg (2016)
Oza, N.C.: Boosting with averaged weight vectors. In: Multiple Classifier Systems. LNCS, vol. 2709, pp. 15–24. Springer, Heidelberg (2003)
Rejer, I.: Genetic algorithms for feature selection for brain-computer interface. Int. J. Pattern Recogn. Artif. Intell. 29(5), 1559008 (2015)
Rejer, I., Burduk, R.: Classifier selection for motor imagery brain computer interface. In: IFIP International Conference on Computer Information Systems and Industrial Management, pp. 122–130. Springer, Heidelberg (2017)
Acknowledgments
This work was supported by the statutory funds of the Department of Systems and Computer Networks, Wroclaw University of Science and Technology.
© 2018 Springer International Publishing AG
Burduk, R. (2018). The AdaBoost Algorithm with Linear Modification of the Weights. In: Choraś, M., Choraś, R. (eds) Image Processing and Communications Challenges 9. IP&C 2017. Advances in Intelligent Systems and Computing, vol 681. Springer, Cham. https://doi.org/10.1007/978-3-319-68720-9_11