Abstract
No matter how good a tool is, applying it to the wrong problem may delay problem solving or, worse, cause harm. Matching tools with problems requires experts: not merely domain experts who know the problems they face well, but experts who are also familiar with feature selection methods. Even if such experts exist, they are rare, and the matching problem persists whether or not we can find them. The next best solution is to abstract both the tools and the problems and relate the two; this, we hope, helps solve the matching problem. Problems can be abstracted by domain experts according to the characteristics of the data; abstracting the tools requires measures that tell us how good each method is and under what circumstances it performs well. To apply feature selection methods wisely, we must first discuss their performance.
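The kind of measure the abstract alludes to can be made concrete with a small, hypothetical sketch: score competing feature selection methods by the cross-validated accuracy of a classifier trained on the features each one keeps. Everything below is an illustrative stand-in, not the chapter's own protocol: the scikit-learn dataset, the two selectors (mutual-information and chi-squared ranking), the choice of k=10, and the decision tree classifier are all assumptions made for the example.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Illustrative data; features are non-negative, as chi2 requires.
X, y = load_breast_cancer(return_X_y=True)

# Two candidate "tools": keep the 10 features ranked highest by
# mutual information, or by the chi-squared statistic.
selectors = {
    "mutual_info": SelectKBest(mutual_info_classif, k=10),
    "chi2": SelectKBest(chi2, k=10),
}

for name, selector in selectors.items():
    # Selection sits inside the pipeline, so it is refit in every
    # cross-validation fold and cannot leak test information.
    pipe = make_pipeline(selector, DecisionTreeClassifier(random_state=0))
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean 5-fold accuracy = {scores.mean():.3f}")

Repeating such a comparison across datasets with different characteristics is one way to record not only how good each method is, but under what circumstances it is good.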
Copyright information
© 1998 Springer Science+Business Media New York
About this chapter
Cite this chapter
Liu, H., Motoda, H. (1998). Evaluation and Application. In: Feature Selection for Knowledge Discovery and Data Mining. The Springer International Series in Engineering and Computer Science, vol 454. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5689-3_5
Print ISBN: 978-1-4613-7604-0
Online ISBN: 978-1-4615-5689-3