Abstract
Class imbalance affects medical diagnosis, since disease cases are often outnumbered by healthy controls. When the imbalance is severe, learning algorithms fail to retrieve the rarer classes and common assessment metrics become uninformative. In this work, class imbalance is addressed using neuropsychological data, with the aim of differentiating Alzheimer’s Disease (AD) from Mild Cognitive Impairment (MCI) and predicting the conversion from MCI to AD. The effect of the imbalance on four learning algorithms is examined through the application of bagging, Bayes risk minimization, and MetaCost. Plain decision trees were always outperformed, indicating susceptibility to the imbalance. The naïve Bayes classifier was robust but suffered from a bias that was corrected through risk minimization; this strategy outperformed all other combinations of classifiers and meta-learning/ensemble methods. The tree-augmented naïve Bayes classifier also benefited from an adjustment of the decision threshold, and on the nearly balanced datasets it was further improved by bagging, suggesting that its tree structure was too strong for the actual attribute dependencies. Support vector machines were robust to the imbalance: the plain version achieved good results and was never outperformed.
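To make the risk-minimization idea concrete, the sketch below shows how Bayes risk minimization reduces to a decision-threshold adjustment on a naïve Bayes classifier: the rare class is predicted whenever its expected misclassification cost is lower, which for binary costs C_fp and C_fn gives the threshold t = C_fp / (C_fp + C_fn). This is a minimal illustration under stated assumptions, not the paper’s experimental setup; scikit-learn, the synthetic data, and the cost values are all assumptions introduced here for illustration.

```python
# Minimal sketch: Bayes risk minimization as a decision-threshold
# adjustment on a naive Bayes classifier (illustrative only; not the
# authors' pipeline or data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import recall_score

# Synthetic stand-in for an imbalanced dataset: 10% positives play the
# role of the rarer class (e.g. MCI-to-AD converters).
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

nb = GaussianNB().fit(X_tr, y_tr)
posterior = nb.predict_proba(X_te)[:, 1]  # P(rare class | x)

# Predict the rare class when P(1|x) * C_fn > (1 - P(1|x)) * C_fp,
# i.e. when the posterior exceeds t = C_fp / (C_fp + C_fn).
C_fp, C_fn = 1.0, 9.0  # assumed costs, e.g. set from inverse class priors
t = C_fp / (C_fp + C_fn)
y_hat_default = (posterior >= 0.5).astype(int)
y_hat_risk = (posterior >= t).astype(int)

print(f"sensitivity @0.5   : {recall_score(y_te, y_hat_default):.3f}")
print(f"sensitivity @t={t:.2f}: {recall_score(y_te, y_hat_risk):.3f}")
```

With unequal costs, the threshold moves below 0.5, trading some specificity for higher sensitivity on the rare class; the same adjustment applies to any classifier that outputs calibrated posteriors, including the tree-augmented naïve Bayes variant mentioned above.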
Cite this paper
Nunes, C., Silva, D., Guerreiro, M., de Mendonça, A., Carvalho, A.M., Madeira, S.C. (2013). Class Imbalance in the Prediction of Dementia from Neuropsychological Data. In: Correia, L., Reis, L.P., Cascalho, J. (eds) Progress in Artificial Intelligence. EPIA 2013. Lecture Notes in Computer Science, vol 8154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40669-0_13