Abstract
The performance of classification algorithms in machine learning depends on the features used to describe the labeled examples presented to the inducers, so the problem of feature subset selection has received considerable attention. Genetic approaches to this problem usually follow the wrapper approach: the inducer is treated as a black box that evaluates candidate feature subsets. These evaluations can take considerable time, and the traditional approach may be impractical for large data sets. This paper describes a hybrid of a simple genetic algorithm and a method based on class separability, applied to the selection of feature subsets for classification problems. The proposed hybrid was compared against each of its components and two widely used feature selection wrappers. The objective of this paper is to determine whether the proposed hybrid offers advantages over the other methods in terms of accuracy or speed on this problem. The experiments used a Naive Bayes classifier and public-domain and artificial data sets. The results suggest that the hybrid usually finds compact feature subsets that give the most accurate results, while beating the execution time of the other wrappers.
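The abstract does not give implementation details, but the following sketch illustrates the general shape of such a hybrid: a simple genetic algorithm over feature-inclusion bit strings, a per-feature class-separability score used to bias the search, and a Naive Bayes wrapper as the fitness function. Everything specific below (the Gaussian Naive Bayes inducer, the Fisher-style separability ratio, population size, operator rates, tournament selection, and elitist replacement) is an illustrative assumption, not the configuration reported in the paper.

```python
# Minimal sketch of a GA/class-separability hybrid for feature subset selection.
# All parameters and the separability measure are illustrative assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def separability_scores(X, y):
    """Per-feature separability proxy: between-class over within-class variance
    (a simple Fisher-style ratio; the paper's exact measure may differ)."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = sum((y == c).mean() * (X[y == c].mean(axis=0) - overall_mean) ** 2
                  for c in classes)
    within = sum((y == c).mean() * X[y == c].var(axis=0) for c in classes)
    return between / (within + 1e-12)

def fitness(mask, X, y):
    """Wrapper evaluation: cross-validated Naive Bayes accuracy on the subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(GaussianNB(), X[:, mask], y, cv=5).mean()

def ga_feature_selection(X, y, pop_size=20, generations=30,
                         crossover_rate=0.9, mutation_rate=None, seed=None):
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    mutation_rate = mutation_rate or 1.0 / n
    # Bias the initial population toward features with high separability.
    scores = separability_scores(X, y)
    p_include = 0.05 + 0.9 * scores / (scores.max() + 1e-12)
    pop = rng.random((pop_size, n)) < p_include
    fits = np.array([fitness(ind, X, y) for ind in pop])
    for _ in range(generations):
        # Binary tournament selection.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = pop[np.where(fits[a] > fits[b], a, b)]
        # Uniform crossover and bit-flip mutation.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                swap = rng.random(n) < 0.5
                children[i, swap] = parents[i + 1, swap]
                children[i + 1, swap] = parents[i, swap]
        children ^= rng.random((pop_size, n)) < mutation_rate
        child_fits = np.array([fitness(ind, X, y) for ind in children])
        # Elitist replacement: keep the best of parents and children.
        combined = np.vstack([pop, children])
        combined_fits = np.concatenate([fits, child_fits])
        best = np.argsort(combined_fits)[-pop_size:]
        pop, fits = combined[best], combined_fits[best]
    return pop[np.argmax(fits)], fits.max()
```

Given NumPy arrays `X` and `y`, a call such as `ga_feature_selection(X, y)` would return a boolean inclusion mask and its cross-validated accuracy; the seeding of the population with separability-biased bit strings is one plausible way to combine the filter and wrapper components.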
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cantú-Paz, E. (2004). Feature Subset Selection, Class Separability, and Genetic Algorithms. In: Deb, K. (eds) Genetic and Evolutionary Computation – GECCO 2004. GECCO 2004. Lecture Notes in Computer Science, vol 3102. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24854-5_96
Print ISBN: 978-3-540-22344-3
Online ISBN: 978-3-540-24854-5