Abstract
We propose a new algorithm, called Prototype Generation and Filtering (PGF), which combines the strengths of instance-filtering and instance-averaging techniques. PGF generates representative prototypes while eliminating noise and exceptions. We also introduce a distance measure that incorporates the class label entropy of the prototypes. Experiments compare PGF with pure instance filtering, pure instance averaging, and state-of-the-art algorithms such as C4.5 and KNN. The results demonstrate that PGF can significantly reduce the size of the data while maintaining, and in some cases improving, classification performance.
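The abstract names three ingredients (instance averaging, instance filtering, and a distance that uses class label entropy) but gives neither the PGF procedure nor the formula. The sketch below only illustrates those ingredients on numeric data. Every function name, the dictionary layout with a `labels` key, the thresholds `min_size` and `max_entropy`, and the `d * (1 + alpha * H)` form of the distance are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels merged into one prototype."""
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def average_instances(instances):
    """Instance averaging: replace a group of numeric instances by their mean."""
    n = len(instances)
    return [sum(column) / n for column in zip(*instances)]


def entropy_weighted_distance(query, center, labels, alpha=1.0):
    """Euclidean distance inflated by the prototype's class entropy.

    ASSUMPTION: the form d * (1 + alpha * H) is hypothetical; the abstract
    only says that the measure incorporates class label entropy.
    """
    return math.dist(query, center) * (1.0 + alpha * class_entropy(labels))


def filter_prototypes(prototypes, min_size=2, max_entropy=0.5):
    """Instance filtering: drop small or impure prototypes.

    ASSUMPTION: both thresholds are illustrative; each prototype is a dict
    whose "labels" entry lists the classes of the instances it absorbed.
    """
    return [
        p for p in prototypes
        if len(p["labels"]) >= min_size
        and class_entropy(p["labels"]) <= max_entropy
    ]
```

Under these assumptions, a prototype built from instances of a single class has entropy 0 and keeps its plain Euclidean distance, while a prototype that mixes classes is pushed further away from every query and is also a candidate for removal by the filter.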
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keung, CK., Lam, W. (2000). Prototype Generation Based on Instance Filtering and Averaging. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_17
DOI: https://doi.org/10.1007/3-540-45571-X_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive