1 Introduction

In traditional supervised learning, each example consists of a single instance to which exactly one label is assigned; this setting is known as single-label classification. In many situations, however, a real-world object cannot be adequately described by a single label, and such problems are addressed by multilabel classification. The purpose of multilabel classification is to discover the set of labels for unseen instances; each input instance may be associated with multiple class labels. For example, a particular news article can simultaneously be an economics, politics, and business article.

Multilabel classification is used in several areas such as bioinformatics, text categorization, tag recommendation, image classification, direct marketing, medical diagnosis, query categorization, and protein function prediction [3, 4].

2 Related Work

Multilabel classification techniques are grouped into two categories: problem transformation methods and algorithm adaptation methods [5].

The problem transformation approach transforms a multilabel problem into a set of single-label problems and is algorithm independent. Several problem transformation methods, such as binary relevance (BR) [6], label powerset (LP) [5], and classifier chains (CC) [7], are used in multilabel classification; a minimal sketch of the BR idea is given below.
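To make the transformation concrete, the following Python sketch (an illustration, not code from the cited works) shows binary relevance: the multilabel target matrix is split into one independent binary problem per label, each of which can then be solved with any single-label classifier.

```python
import numpy as np

def binary_relevance_transform(Y):
    """Split an (n_instances, n_labels) 0/1 target matrix into one
    binary target vector per label, as binary relevance (BR) does."""
    Y = np.asarray(Y)
    return [Y[:, j] for j in range(Y.shape[1])]

# Example: 4 instances, 3 labels (hypothetical data)
Y = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0],
              [0, 0, 1]])
for j, y in enumerate(binary_relevance_transform(Y)):
    print(f"label {j}: {y}")  # each y is an ordinary binary target
```

By contrast, label powerset maps each distinct row of Y to one class of a single multiclass problem, and classifier chains train the binary problems in sequence, feeding earlier predictions into later classifiers.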

The algorithm adaptation approach extends existing machine learning algorithms so that they can handle multilabel data directly. Several algorithm adaptation methods have been proposed, including multilabel k-nearest neighbor (ML-kNN) [8], binary relevance k-nearest neighbor (BR-kNN) [9], and IBLR [10].

ML-kNN and BR-kNN do not consider label correlations, whereas IBLR considers label correlations but performs poorly in terms of hamming loss.

Liu and Cao [1] proposed the coupled k-nearest neighbor algorithm for multilabel classification (CK-STC). CK-STC extends ML-kNN with coupled label similarity [11], which enables it to handle label correlations. The advantage of CK-STC is that it considers label correlations and performs better than ML-kNN, IBLR, BSVM, and BR-kNN, but it is more computationally complex.

Like traditional classification, multilabel classification suffers from the curse of dimensionality, which can be addressed by feature selection. Feature selection chooses the relevant, most informative features from the original feature set. Traditional feature selection methods handle multilabel data by transforming the multilabel problem into a single-label problem and then applying single-label feature selection.

Lee et al. [12] proposed multivariate mutual information-based feature selection. Transforming a multilabel problem into a single-label problem damages the original label structure, which reduces classifier performance. A newer method, information gain feature selection for multilabel data [2], therefore performs feature selection on multilabel data directly.

The multilabel classification tools MEKA [13] and MULAN [14] are available for handling multilabel data. MEKA is a multilabel extension of the WEKA data mining tool, and MULAN provides a framework implementing many multilabel learning algorithms. In MULAN's feature selection methods, multilabel data is transformed into single-label data, and traditional single-label dimensionality reduction techniques are then applied to the result.

3 Implementation Details

3.1 FSVIG Multilabel Feature Selection Algorithm Details

The FSVIG multilabel feature selection algorithm is implemented using the information gain measure. Feature selection removes redundant and irrelevant features from the database; the technique used in this work is reported in [2]. This feature selection method uses a filter approach. Information gain serves as the metric for measuring the degree of correlation between features and the labels present in the database: the information gain (IG) between a feature and the entire label set quantifies the importance of that feature. IG is calculated from the label entropy H(L) and the feature entropy H(fi). A larger IG value indicates a stronger correlation between the feature and the labels. However, the information gain values for different label sets may not lie on the same measurement scale, so for comparison the information gain is normalized (SU). Figure 1 shows the pseudo-algorithm for FSVIG feature selection.

Fig. 1 Pseudo-algorithm for feature selection FSVIG
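As a concrete reference for the quantities above, the sketch below computes IG and SU for one discrete feature against the label set. It assumes the standard definitions IG(fi, L) = H(L) - H(L | fi) and the usual symmetric-uncertainty normalization SU = 2 * IG / (H(fi) + H(L)); the exact formulation used by FSVIG is the one given in [2] and may differ in detail.

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(feature, labels):
    """IG(f, L) = H(L) - H(L | f) for one discrete feature.

    `labels` holds one hashable label-set value per instance
    (e.g. a tuple of 0/1 label indicators), so the label set is
    treated as a whole rather than label by label."""
    n = len(feature)
    h_cond = 0.0
    for v in set(feature):
        subset = [labels[i] for i in range(n) if feature[i] == v]
        h_cond += len(subset) / n * entropy(subset)
    return entropy(labels) - h_cond

def symmetric_uncertainty(feature, labels):
    """Normalize IG into [0, 1] so scores of different features
    and label sets are directly comparable."""
    denom = entropy(feature) + entropy(labels)
    return 2.0 * information_gain(feature, labels) / denom if denom > 0 else 0.0
```

Features can then be ranked by SU and the top-ranked subset retained, which is how a filter method such as this is typically applied.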

3.2 CK-STC Multilabel Classification Algorithm Details

The CK-STC algorithm for multilabel classification is reported in [1, 15]. In CK-STC, coupled label similarity is estimated from the intra-coupling similarity between label values \( w_{i}^{x} \) and \( w_{j}^{y} \) and the inter-coupling label similarity between label values \( w_{i}^{x} \) and \( w_{j}^{y} \) with respect to a feature value \( w_{p}^{z} \). Inter-coupling label similarity is calculated from the co-occurrence frequency (CF), and intra-coupling similarity is calculated from the occurrence frequency (F). For every instance in the training data, the k-nearest neighbors are computed, and coupled label similarity is taken into account when calculating the prior probabilities and frequency arrays. The k-nearest neighbors of an unseen instance are then determined, and for each label, statistical information is computed from these neighbors. Finally, the labels of the unseen instance are predicted via the maximum a posteriori (MAP) rule, which is based on Bayes' theorem.

Figure 2 shows the pseudo-algorithm for CK-STC multilabel classification:

Fig. 2 Pseudo-algorithm for multilabel classification CK-STC
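To illustrate the final prediction step, here is a minimal sketch of the ML-kNN-style MAP decision that CK-STC builds on. It uses plain neighbor counts; in CK-STC itself the counts and probabilities are weighted by the coupled label similarity of [1], which this sketch omits.

```python
import numpy as np

def smoothed_prior(y_col, s=1.0):
    """Prior P(H1) that a label is relevant, with Laplace smoothing s,
    estimated from one 0/1 column of the training label matrix."""
    return (s + np.sum(y_col)) / (2.0 * s + len(y_col))

def map_decision(c, prior_1, post_1, post_0):
    """MAP rule for one label of an unseen instance.

    c       : how many of its k nearest neighbors carry the label
    prior_1 : P(H1), prior probability that the label is relevant
    post_1  : post_1[c] = P(c neighbors carry the label | H1)
    post_0  : post_0[c] = P(c neighbors carry the label | H0)
    Returns 1 iff P(H1)P(c|H1) > P(H0)P(c|H0), i.e. Bayes' theorem
    with the shared evidence term cancelled."""
    return int(prior_1 * post_1[c] > (1.0 - prior_1) * post_0[c])
```

The arrays post_1 and post_0 correspond to the "frequency arrays" mentioned above: they are estimated by counting, for each training instance, how many of its k neighbors carry the label, separately for instances that do and do not carry it themselves.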

4 Experimental Setup

Table 1 shows information about the benchmark multilabel datasets, including the number of features, the total number of labels, and the number of instances. Experiments were performed on the Genbase, Medical, and Enron datasets.

Table 1 Dataset Description

Experiments were performed on an i5 processor under the Windows 7 operating system. The CK-STC multilabel classification algorithm and the FSVIG multilabel feature selection algorithm were implemented in MS Visual Studio .NET, with Visual Studio 10 as the development tool. In the experiments, the proposed method (CK-STC with feature selection) was compared with the existing CML-kNN method. Euclidean distance was used to compute the k-nearest neighbors. The experiments were run with k = 5, 7, and 9, and the results were averaged. Tenfold cross-validation was performed on the above datasets.
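The evaluation protocol can be summarized in a short sketch. The Python below is illustrative only (the actual implementation is in Visual Studio .NET, and `fit_predict` is a hypothetical stand-in for training and applying the classifier); it shows the averaging over k in {5, 7, 9} and the tenfold split described above.

```python
import numpy as np

def kfold_indices(n, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for n_folds-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n)
    for test in np.array_split(idx, n_folds):
        yield np.setdiff1d(idx, test), test

def run_experiment(X, Y, fit_predict, metric, ks=(5, 7, 9), n_folds=10):
    """Average `metric` over the three k values and ten folds.

    fit_predict(X_tr, Y_tr, X_te, k) -> predicted label matrix for X_te.
    """
    scores = [metric(Y[te], fit_predict(X[tr], Y[tr], X[te], k))
              for k in ks
              for tr, te in kfold_indices(len(X), n_folds)]
    return float(np.mean(scores))
```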

5 Experimental Results

The performance of multilabel classification is measured in terms of hamming loss, one-error, and average precision. For each evaluation metric, "\( \downarrow \)" indicates that a smaller value is better and "\( \uparrow \)" indicates that a bigger value is better. Bold values indicate the winning classifier. Experiments were carried out in four sets, as follows:
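For reference, minimal implementations of the three metrics are sketched below under their common textbook definitions (an assumption on my part; the paper does not spell out its exact formulas). Hamming loss works on 0/1 predictions, while one-error and average precision work on real-valued label scores.

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of misclassified instance-label pairs (lower is better)."""
    return float(np.mean(Y_true != Y_pred))

def one_error(Y_true, scores):
    """Fraction of instances whose top-scored label is not relevant
    (lower is better)."""
    top = np.argmax(scores, axis=1)
    return float(np.mean(Y_true[np.arange(len(Y_true)), top] == 0))

def average_precision(Y_true, scores):
    """For each instance, average over its relevant labels of the
    fraction of labels ranked at or above it that are relevant;
    then average over instances (higher is better)."""
    per_instance = []
    for y, s in zip(Y_true, scores):
        order = np.argsort(-s)                # labels by decreasing score
        ranks = np.flatnonzero(y[order]) + 1  # 1-based ranks of relevant labels
        if ranks.size:
            per_instance.append(np.mean(np.arange(1, ranks.size + 1) / ranks))
    return float(np.mean(per_instance))
```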

(1) Multilabel classification using CK-STC without feature selection

Table 2 shows the results of the CK-STC algorithm.

Table 2 Results of CK-STC

(2) Multilabel feature selection using the FSVIG algorithm

Table 3 shows the results of FSVIG feature selection.

Table 3 Results of feature selection as per FSVIG
(3) Multilabel classification using the MULAN binary relevance attribute evaluator feature selection technique and its comparison with CK-STC with FSVIG feature selection

The results of the FSVIG multilabel feature selection algorithm and the MULAN binary relevance attribute evaluator are presented in Tables 4, 5, and 6.

Table 4 Comparisons of CK-STC, feature selection FSVIG with MULAN binary relevance attribute evaluator w.r.t. hamming loss \( \downarrow \)
Table 5 Comparisons of CK-STC, feature selection FSVIG with MULAN binary relevance attribute evaluator w.r.t. one-error \( \downarrow \)
Table 6 Comparisons of CK-STC, feature selection FSVIG with MULAN binary relevance attribute evaluator w.r.t. average precision \( \uparrow \)

Table 4 indicates that the CC algorithm with the binary relevance attribute evaluator performs better than the other algorithms w.r.t. hamming loss. Table 5 indicates that the CK-STC algorithm with the FSVIG feature selection technique performs better than the other algorithms w.r.t. one-error. Table 6 shows that the CK-STC algorithm with the FSVIG feature selection technique performs better than the other algorithms w.r.t. average precision.

(4) Multilabel classification using the MULAN multiclass attribute evaluator feature selection technique and its comparison with CK-STC with FSVIG feature selection

The results of the FSVIG multilabel feature selection algorithm and the MULAN multiclass attribute evaluator are presented in Tables 7, 8, and 9.

Table 7 Comparisons of CK-STC, feature selection FSVIG with MULAN multiclass attribute evaluator w.r.t. hamming loss \( \downarrow \)
Table 8 Comparisons of CK-STC, feature selection FSVIG with MULAN multiclass attribute evaluator w.r.t. one-error \( \downarrow \)
Table 9 Comparisons of CK-STC, feature selection FSVIG with MULAN multiclass attribute evaluator w.r.t. average precision \( \uparrow \)

Tables 7, 8, and 9 indicate that the CK-STC algorithm with the FSVIG feature selection technique performs better than the other algorithms w.r.t. hamming loss, one-error, and average precision, respectively.

Tables 6, 7, 8, and 9 indicate that the FSVIG feature selection technique with CK-STC gives better performance in terms of one-error and average precision than the feature selection algorithms provided in MULAN. Multilabel classification with FSVIG feature selection does not always provide the best performance in terms of hamming loss because of the weak connection between labels in the datasets. The feature selection algorithms in the MULAN library transform the multilabel problem into a single-label problem and apply single-label feature selection methods to the result; this transformation damages the original label structure, which reduces classifier performance.

To evaluate the statistical significance of the results, the Friedman test for paired data is used, with the level of significance set at α = 0.05.
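As an illustration of how such a test is run, the sketch below applies scipy's Friedman test to per-dataset scores of three methods. The numbers are placeholders, not values from the tables above; substitute the actual per-dataset results.

```python
from scipy.stats import friedmanchisquare

# Hypothetical average-precision values on three datasets
# (Genbase, Medical, Enron) for three competing methods.
ckstc_fsvig = [0.95, 0.88, 0.63]
mulan_br    = [0.93, 0.85, 0.60]
mulan_mc    = [0.92, 0.86, 0.61]

stat, p = friedmanchisquare(ckstc_fsvig, mulan_br, mulan_mc)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.3f}")
if p < 0.05:  # significance level alpha = 0.05
    print("Reject H0: the methods' rankings differ significantly.")
else:
    print("No significant difference at alpha = 0.05.")
```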

6 Conclusions

A lazy learning approach, CK-STC, has been reported in the literature for multilabel classification; it uses coupled similarity between labels to improve prediction performance, but irrelevant features present in the database affect the prediction accuracy of the classifier.

The FSVIG multilabel feature selection algorithm is therefore incorporated into the CK-STC algorithm. FSVIG uses information gain and shows better performance than existing multilabel feature selection algorithms when used with the ML-NB, ML-kNN, and RandSvm classifiers. This paper investigates the performance of FSVIG when used with CK-STC and compares it with other multilabel feature selection algorithms available in MULAN on standard multilabel datasets.

Experimental results show that FSVIG, when used with CK-STC, provides better performance in terms of average precision and one-error.