Abstract
Over the years, many successful applications of case-based reasoning (CBR) systems have been developed in different areas. The performance of CBR systems depends on several factors, including case representation, similarity measure, and adaptation. Achieving good performance requires careful design, implementation, and continuous optimization of these factors. In this paper, we propose a maintenance technique that integrates an ensemble of CBR classifiers with spectral clustering and logistic regression to improve the classification accuracy of CBR classifiers on (ultra) high-dimensional biological data sets.
Our proposed method is applicable to any CBR system; in this paper, we demonstrate the improvement by applying it to \(\mathit{TA3}\), a computational framework for CBR. We evaluated the system on two publicly available microarray data sets covering leukemia and lung cancer samples. Our maintenance method improves the classification accuracy of \(\mathit{TA3}\) by approximately 20% in relative terms: from 65% to 79% on the leukemia data set and from 60% to 70% on the lung cancer data set.
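The improvement figures can be checked with a short calculation. The sketch below uses only the four accuracies quoted above and reports both the percentage-point gain and the relative gain; the function name `accuracy_gain` is illustrative and does not come from the paper.

```python
def accuracy_gain(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    points = after - before
    relative = 100.0 * points / before
    return points, relative


# Accuracies (in percent) quoted in the abstract: (before, after) maintenance.
reported = {"leukemia": (65.0, 79.0), "lung cancer": (60.0, 70.0)}

gains = {name: accuracy_gain(b, a) for name, (b, a) in reported.items()}
for name, (points, relative) in gains.items():
    print(f"{name}: +{points:.0f} points, {relative:.1f}% relative")
```

Read as relative gains, the two results come to roughly 22% and 17%, which is consistent with the "approximately 20%" stated above; in percentage points the gains are 14 and 10.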
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arshadi, N., Jurisica, I. (2004). Maintaining Case-Based Reasoning Systems: A Machine Learning Approach. In: Funk, P., González Calero, P.A. (eds) Advances in Case-Based Reasoning. ECCBR 2004. Lecture Notes in Computer Science, vol 3155. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28631-8_3
DOI: https://doi.org/10.1007/978-3-540-28631-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22882-0
Online ISBN: 978-3-540-28631-8
eBook Packages: Springer Book Archive