Abstract
In the previous chapters, we focused on feature subset selection. We now discuss related, and in some cases less developed, topics concerning feature transformation and dimensionality reduction. The first two sections cover feature transformation, introducing techniques from Statistics, Machine Learning, and Knowledge Discovery. The third section discusses feature discretization, which is closely related to dimensionality reduction: where subset selection allows reduction along one dimension (the number of features), feature discretization can enable reduction along two dimensions (both the number of features and the number of distinct values each feature takes). In the fourth section, we go beyond the classification model and explore feature selection without class information. Data without class information are unsupervised (unlabeled) data. As we move from supervised (labeled) data to unsupervised data, we also step into territory that is not as well explored as feature selection for classification. However, we foresee a rising need for unsupervised feature selection and expect more work to be carried out in the near future.
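To make the value-dimension reduction concrete, the following is a minimal sketch (not from the chapter itself) of equal-width discretization, one of the simplest unsupervised discretization methods surveyed in this literature. The function name and bin count are illustrative choices; the point is only that a continuous feature with many distinct values collapses to at most `k` discrete codes.

```python
import numpy as np

def equal_width_discretize(values, k=4):
    """Discretize a continuous feature into k equal-width bins.

    Returns integer bin codes in 0..k-1, so the feature's value
    dimension shrinks from len(set(values)) to at most k.
    """
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, k + 1)
    # Interior edges define the bins; digitize maps each value to a bin index.
    return np.clip(np.digitize(values, edges[1:-1]), 0, k - 1)

feature = np.array([0.1, 0.4, 0.35, 0.8, 0.95, 0.05])
codes = equal_width_discretize(feature, k=2)
# Six distinct continuous values reduce to two bin codes.
```

Supervised methods such as ChiMerge or entropy-based multi-interval discretization refine this idea by choosing cut points using class information rather than fixed-width intervals.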
© 1998 Springer Science+Business Media New York
Cite this chapter
Liu, H., Motoda, H. (1998). Feature Transformation and Dimensionality Reduction. In: Feature Selection for Knowledge Discovery and Data Mining. The Springer International Series in Engineering and Computer Science, vol 454. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5689-3_6
Print ISBN: 978-1-4613-7604-0
Online ISBN: 978-1-4615-5689-3