Combined SVM-Based Feature Selection and Classification

Neumann, Julia; Schnörr, Christoph; Steidl, Gabriele

doi:10.1007/s10994-005-1505-9

Published: 11 July 2005

Machine Learning, Volume 61, pages 129–150 (2005)

Abstract

Feature selection is an important combinatorial optimisation problem in the context of supervised pattern classification. This paper presents four novel continuous feature selection approaches that directly optimise classifier performance. In particular, we include linear and nonlinear Support Vector Machine classifiers. The key ideas of our approaches are additional regularisation and embedded nonlinear feature selection. To solve our optimisation problems, we apply difference of convex functions programming, which is a general framework for non-convex continuous optimisation. Experiments with artificial data and with various real-world problems, including organ classification in computed tomography scans, demonstrate that our methods accomplish the desired feature selection and classification performance simultaneously.
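
As a rough illustration of the difference of convex functions (d.c.) idea mentioned in the abstract, the following sketch runs the basic d.c. iteration on a one-dimensional toy objective. It is not the formulation used in the paper: the objective f(x) = x^4 - 2x^2, its split into the convex parts g(x) = x^4 and h(x) = 2x^2, and the names real_cbrt and dca_toy are assumptions chosen only because the convex subproblem then has a closed-form solution.

```python
import math


def real_cbrt(v: float) -> float:
    """Real cube root that also works for negative inputs."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def dca_toy(x0: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Minimise f(x) = g(x) - h(x) with g(x) = x**4 and h(x) = 2*x**2.

    Each step linearises the concave part -h at the current iterate and
    minimises the resulting convex function g(x) - y*x, where y = h'(x_k).
    (Toy problem for illustration only, not the paper's objective.)
    """
    x = x0
    for _ in range(max_iter):
        y = 4.0 * x  # derivative of h at the current iterate
        # argmin_x x**4 - y*x satisfies 4*x**3 = y, i.e. x = cbrt(y / 4)
        x_new = real_cbrt(y / 4.0)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    for start in (0.5, -0.3, 0.0):
        print(start, dca_toy(start))
```

In this toy problem a positive start moves towards x = 1 and a negative start towards x = -1, the two minimisers of f, while a start at exactly 0 stays at 0. That last case reflects a general property of the scheme: the iteration is only guaranteed to reach a critical point of the non-convex objective, so the result depends on the initialisation.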

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Mannheim, D-68131 Mannheim, Germany
Julia Neumann, Christoph Schnörr & Gabriele Steidl

Corresponding author

Correspondence to Julia Neumann.

Additional information

Editor: Dale Schuurmans

About this article

Cite this article

Neumann, J., Schnörr, C. & Steidl, G. Combined SVM-Based Feature Selection and Classification. Mach Learn 61, 129–150 (2005). https://doi.org/10.1007/s10994-005-1505-9

Received: 17 August 2004
Revised: 16 March 2005
Accepted: 03 April 2005
Published: 11 July 2005
Issue Date: November 2005
DOI: https://doi.org/10.1007/s10994-005-1505-9
