
Part of the book series: The Springer International Series in Engineering and Computer Science ((SECS,volume 454))

Abstract

In the previous chapters, we focused on feature subset selection. We now discuss related and less developed topics concerning feature transformation and dimensionality reduction. The first two sections are about feature transformation, introducing techniques from statistics, machine learning, and knowledge discovery. The third section discusses feature discretization, which is closely related to dimensionality reduction: where subset selection allows reduction along one dimension (the number of features), feature discretization can enable reduction along two (the number of features and the number of distinct values each feature takes). In the fourth section, we go beyond the classification model and explore feature selection without class information. Data without class information are unsupervised (unlabeled) data. As we move from supervised (labeled) data to unsupervised data, we also step into territory that is not as well explored as feature selection for classification. However, we foresee a rising need for unsupervised feature selection and expect more work to be carried out in the near future.
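The abstract's claim about two-dimensional reduction is easy to make concrete. Below is a minimal, hypothetical Python/NumPy sketch (not from the book): each feature is discretized by simple equal-width binning, standing in for the more principled supervised discretizers the chapter surveys, and any feature that collapses to a single interval is dropped, so both the number of values per feature and the number of features shrink.

```python
import numpy as np

def equal_width_discretize(x, n_bins=4):
    """Equal-width binning of a 1-D continuous feature into integer labels.

    Deliberately simple and unsupervised; used here only for illustration.
    """
    lo, hi = x.min(), x.max()
    if lo == hi:                        # constant feature -> a single bin
        return np.zeros(len(x), dtype=int)
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]   # interior cut points
    return np.digitize(x, edges)        # labels in {0, ..., n_bins - 1}

def discretize_and_select(X, n_bins=4):
    """Discretize every column, then drop columns left with one interval.

    Reduction happens along two dimensions: each surviving feature has at
    most n_bins distinct values, and degenerate features disappear.
    """
    D = np.column_stack([equal_width_discretize(X[:, j], n_bins)
                         for j in range(X.shape[1])])
    kept = [j for j in range(D.shape[1]) if np.unique(D[:, j]).size > 1]
    return D[:, kept], kept

# Tiny demonstration on synthetic data: one informative feature, one constant.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=100), np.full(100, 3.14)])
D, kept = discretize_and_select(X)
print(kept)               # [0] -- the constant feature was discretized away
print(np.unique(D[:, 0]))  # at most 4 distinct values remain
```

The same skeleton accommodates stronger discretizers: replacing equal-width binning with an entropy- or chi-square-based method changes only `equal_width_discretize`, while the selection step stays the same.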




Copyright information

© 1998 Springer Science+Business Media New York

About this chapter

Cite this chapter

Liu, H., Motoda, H. (1998). Feature Transformation and Dimensionality Reduction. In: Feature Selection for Knowledge Discovery and Data Mining. The Springer International Series in Engineering and Computer Science, vol 454. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5689-3_6


  • DOI: https://doi.org/10.1007/978-1-4615-5689-3_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7604-0

  • Online ISBN: 978-1-4615-5689-3

