A New Hybrid Method for Text Feature Selection Through Combination of Relative Discrimination Criterion and Ant Colony Optimization

  • Conference paper
Proceedings of 7th International Conference on Harmony Search, Soft Computing and Applications

Abstract

Text categorization plays a significant role in many information management tasks. As the volume of documents on the Internet grows, automated text categorization has attracted increasing attention as a way to classify documents into predefined categories. A major problem in text categorization is the high dimensionality of the feature space: most features are irrelevant or redundant, which degrades classifier performance. Feature selection is therefore used to reduce the dimensionality of the feature space and increase classification efficiency. In this paper, we propose a hybrid two-stage method for text feature selection based on the Relative Discrimination Criterion (RDC) and Ant Colony Optimization (ACO). In the first stage, RDC is applied to rank features by their RDC values, and features whose values fall below a threshold are removed from the feature set. In the second stage, an ACO-based feature selection method is applied as a wrapper to eliminate redundant or irrelevant features that were not removed in the first stage. To assess the proposed method, we conducted several experiments on different datasets to demonstrate the advantages of the proposed algorithm. Our aim is a hybrid approach that is both computationally more efficient and more accurate than other embedded or wrapper methods. The results indicate that the proposed method performs well in text feature selection.
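The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `discrimination_score` is a simplified stand-in inspired by relative discrimination (it is not the exact RDC formula), `loo_accuracy` is a toy leave-one-out nearest-neighbour fitness standing in for whatever classifier the wrapper stage uses, and the function names, parameters, and toy corpus are hypothetical.

```python
import random

def discrimination_score(docs, labels, feature):
    """Simplified stand-in score (assumption): NOT the exact RDC formula."""
    pos = [d for d, y in zip(docs, labels) if y == 1]
    neg = [d for d, y in zip(docs, labels) if y == 0]
    tpr = sum(feature in d for d in pos) / max(len(pos), 1)
    fpr = sum(feature in d for d in neg) / max(len(neg), 1)
    # Smoothing term avoids division by zero for one-class terms.
    return abs(tpr - fpr) / (min(tpr, fpr) + 1.0 / len(docs))

def filter_stage(docs, labels, vocab, threshold):
    """Stage 1 (filter): drop features whose score is below the threshold."""
    return [f for f in vocab
            if discrimination_score(docs, labels, f) >= threshold]

def loo_accuracy(docs, labels, subset):
    """Toy wrapper fitness (assumption): leave-one-out 1-NN accuracy."""
    sub = set(subset)
    reduced = [d & sub for d in docs]
    correct = 0
    for i, di in enumerate(reduced):
        # Nearest neighbour = document sharing the most selected terms.
        nearest = max((j for j in range(len(docs)) if j != i),
                      key=lambda j: len(di & reduced[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(docs)

def aco_select(features, fitness, subset_size,
               n_ants=8, n_iters=15, rho=0.2, seed=0):
    """Stage 2 (wrapper): a very small ACO-style subset search."""
    rng = random.Random(seed)
    pheromone = {f: 1.0 for f in features}
    best, best_fit = [], -1.0
    for _ in range(n_iters):
        for _ in range(n_ants):
            remaining, subset = list(features), []
            # Each ant picks features with probability proportional to pheromone.
            while remaining and len(subset) < subset_size:
                weights = [pheromone[f] for f in remaining]
                pick = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(pick)
                remaining.remove(pick)
            fit = fitness(subset)
            if fit > best_fit:
                best, best_fit = sorted(subset), fit
        # Evaporate all trails, then reinforce the best subset found so far.
        for f in pheromone:
            pheromone[f] *= 1.0 - rho
        for f in best:
            pheromone[f] += best_fit
    return best, best_fit

# Toy corpus (hypothetical): documents as sets of terms, 1 = sport, 0 = finance.
docs = [{"goal", "match", "team"}, {"team", "score", "the"},
        {"match", "goal", "the"}, {"stock", "market", "the"},
        {"market", "trade", "price"}, {"stock", "price", "the"}]
labels = [1, 1, 1, 0, 0, 0]
vocab = sorted(set().union(*docs))

kept = filter_stage(docs, labels, vocab, threshold=1.0)
best, best_fit = aco_select(kept, lambda s: loo_accuracy(docs, labels, s),
                            subset_size=3)
print(kept, best, best_fit)
```

In this sketch the filter stage is cheap because it scores each term once, while the wrapper stage evaluates a classifier for every candidate subset; shrinking the vocabulary first is what keeps the second stage affordable. On the toy corpus the term "the" occurs equally often in both classes, so its score is zero and it is removed before the ACO stage runs.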

Author information

Corresponding author

Correspondence to Ehsan Bojnordi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hemmati, M., Mousavirad, S.J., Bojnordi, E., Shaeri, M. (2022). A New Hybrid Method for Text Feature Selection Through Combination of Relative Discrimination Criterion and Ant Colony Optimization. In: Kim, J.H., Deep, K., Geem, Z.W., Sadollah, A., Yadav, A. (eds) Proceedings of 7th International Conference on Harmony Search, Soft Computing and Applications. Lecture Notes on Data Engineering and Communications Technologies, vol 140. Springer, Singapore. https://doi.org/10.1007/978-981-19-2948-9_16
