Abstract
The purpose of this paper is to discuss feature selection methods. We present two common feature selection approaches: statistical methods and artificial intelligence techniques. Statistical methods are presented as antecedents of classification methods, with specific techniques for variable selection, because we intend to apply the feature selection techniques to classification problems. We examine the artificial intelligence approaches from different points of view. We also present the use of information theory to build decision trees; instead of Quinlan’s Gain, we discuss alternative criteria for tree construction. We introduce two new feature selection measures: the MLRelevance formula and the PRelevance. These criteria maximize the heterogeneity among elements that belong to different classes and the homogeneity among elements that belong to the same class. Finally, we compare different feature selection methods by means of the classification of two medical data sets.
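The abstract does not reproduce the MLRelevance or PRelevance formulas, but it positions them against Quinlan’s Gain, the standard information-theoretic criterion for decision-tree induction. As background, a minimal sketch of that baseline criterion follows; the function names are illustrative, not from the paper:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Quinlan's Gain: reduction in class entropy obtained by
    partitioning the examples on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for val, lab in zip(feature_values, labels) if val == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder
```

A feature that perfectly separates the classes attains the maximum gain (the full class entropy), while a feature independent of the class yields a gain near zero — the kind of relevance ranking the paper’s new measures are designed to refine.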
Keywords
- Feature Selection
- Feature Subset
- Feature Selection Method
- Irrelevant Feature
- Artificial Intelligence Technique
References
Koller, D., Sahami, M.: Toward Optimal Feature Selection. Computer Science Department, Stanford University, Stanford (1997)
Grau, R.: Estadística aplicada con ayuda de paquetes de software. Editorial Universitaria, Jalisco (1994)
Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification. Springer, Heidelberg (1994)
Bello, R.: Métodos de Solución de Problemas para la Inteligencia Artificial. Universidad Central de Las Villas, Santa Clara (1998)
Blum, A., Langley, P.: Selection of relevant features and examples in machine learning. Artificial Intelligence 97, 245–271 (1997)
John, G., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem. In: Proceedings of the 11th International Conference on Machine Learning, New Brunswick, NJ (1994)
Kira, K., Rendell, L.A.: A practical approach to feature selection. In: Proceedings of the 9th International Conference on Machine Learning, Aberdeen, Scotland (1992)
Almuallim, H., Dietterich, T.G.: Learning with many irrelevant features. In: Proceedings of AAAI 1992, MIT Press, Cambridge (1992)
Langley, P., Sage, S.: Oblivious decision trees and abstract cases. In: Working Notes of the AAAI 1994 Workshop on Case-Based Reasoning, Seattle (1994)
Quinlan, J.R.: Induction of Decision Trees. Machine Learning 1, 81–106 (1986)
Quinlan, J.R.: Improved Use of Continuous Attributes in C4.5. Journal of Artificial Intelligence Research 4, 77–90 (1996)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth, Belmont (1984)
Brender, J.: Measuring quality of medical knowledge. In: Proceeding of the Twelfth International Congress of the European Federation for Medical Informatics (1994)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)
Quinlan, J.R.: See5/C5.0 (2002)
Mántaras, R.L.: A Distance-Based Attribute Selection Measure for Decision Tree Induction. Machine Learning (1991)
Chegis, I., Yablonskii, S.: K-Testor. Trudy Matematicheskogo Instituta imeni V.A. Steklova 51, 270–360, Moscow (1958)
Zhuravlev, Y.I., Tuliaganov, S.E.: Measures to Determine the Importance of Objects in Complex Systems, vol. 12, pp. 170–184, Moscow (1972)
Aizenberg, N.N., Tsipkin, A.I.: Prime Tests. Doklady Akademii Nauk 4, 801–802 (1971)
Ruiz-Shulcloper, J., Cortés, M.L.: K-testores primos. Revista Ciencias Técnicas Físicas y Matemáticas 9, 17–55 (1991)
Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic, Dordrecht (1991)
Komorowski, J., et al.: A Rough Set Perspective on Data and Knowledge. In: Klosgen, W. (ed.) The Handbook of Data Mining and Knowledge Discovery, Oxford University Press, Oxford (1999)
Blake, C.L., Merz, C.J.: UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine (2003)
Aha, D.W.: Case-Based Learning Algorithm (1991)
Jobson, J.D.: Applied Multivariate Data Analysis. Categorical and Multivariate Methods, vol. 2. Springer, Heidelberg (1992)
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Piñero, P., Arco, L., García, M.M., Caballero, Y., Yzquierdo, R., Morales, A. (2003). Two New Metrics for Feature Selection in Pattern Recognition. In: Sanfeliu, A., Ruiz-Shulcloper, J. (eds) Progress in Pattern Recognition, Speech and Image Analysis. CIARP 2003. Lecture Notes in Computer Science, vol 2905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24586-5_60
DOI: https://doi.org/10.1007/978-3-540-24586-5_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20590-6
Online ISBN: 978-3-540-24586-5
eBook Packages: Springer Book Archive