Abstract
In this paper, we discuss three wrapper multi-label feature selection methods based on the Random Forest paradigm. These variants differ in the way they consider label dependence within the feature selection process. To assess their performance, we conduct an extensive experimental comparison of these strategies against recently proposed approaches using seven benchmark multi-label data sets from different domains. Random Forest handles feature selection accurately in the multi-label context. Surprisingly, taking the dependence between labels into account in ensemble multi-label feature selection did not prove very effective.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gharroudi, O., Elghazel, H., Aussem, A. (2014). A Comparison of Multi-Label Feature Selection Methods Using the Random Forest Paradigm. In: Sokolova, M., van Beek, P. (eds) Advances in Artificial Intelligence. Canadian AI 2014. Lecture Notes in Computer Science, vol 8436. Springer, Cham. https://doi.org/10.1007/978-3-319-06483-3_9
DOI: https://doi.org/10.1007/978-3-319-06483-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06482-6
Online ISBN: 978-3-319-06483-3
eBook Packages: Computer Science, Computer Science (R0)