Abstract
This paper focuses on feature selection for problems dealing with high-dimensional data. We discuss the benefits of adopting a regularized approach with L1 or L1–L2 penalties in two different applications: microarray data analysis in computational biology and object detection in computer vision. We describe general algorithmic aspects as well as architecture issues specific to the two domains. The promising results obtained show that the proposed approach can be useful in quite different fields of application.
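To make the idea of regularized feature selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear model with a squared loss plus an L1 penalty (weight `tau`) and an L2 penalty (weight `mu`), minimized by iterative soft-thresholding. The function names, the scaling of the objective, and the synthetic data are all illustrative assumptions. Features whose coefficients remain nonzero are the ones "selected".

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within [-t, t] become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_l2_select(X, y, tau, mu=0.0, n_iter=500):
    """Hypothetical helper: iterative soft-thresholding for the assumed objective
        (1 / (2n)) * ||y - X b||^2 + tau * ||b||_1 + (mu / 2) * ||b||_2^2.
    Setting mu = 0 gives a pure L1 penalty."""
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth part (loss + L2 term).
    lipschitz = np.linalg.norm(X, 2) ** 2 / n + mu
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n + mu * beta
        beta = soft_threshold(beta - grad / lipschitz, tau / lipschitz)
    return beta

# Synthetic high-dimensional example (assumed data): 200 features, 40 samples,
# only the first three features carry signal.
rng = np.random.default_rng(0)
n, p = 40, 200
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.01 * rng.standard_normal(n)

beta = l1_l2_select(X, y, tau=0.1, mu=0.01)
selected = np.flatnonzero(beta)  # indices of features with nonzero weight
print("number of selected features:", selected.size)
```

In this sketch a larger `tau` yields fewer selected features, while a positive `mu` tends to keep groups of correlated features together rather than picking one arbitrarily. Both values here are arbitrary and would normally be chosen by validation.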
Cite this article
Destrero, A., Mosci, S., De Mol, C. et al. Feature selection for high-dimensional data. Comput Manag Sci 6, 25–40 (2009). https://doi.org/10.1007/s10287-008-0070-7
DOI: https://doi.org/10.1007/s10287-008-0070-7