A Comparison of Feature Construction Methods in the Context of Supervised Feature Selection for Classification

  • Conference paper
Computational Intelligence Methods for Green Technology and Sustainable Development (GTSD 2022)

Abstract

In supervised machine learning applications, feature construction may be used to create additional, informative features with the aim of supporting the prediction of the target output. This study investigates the impact of feature construction, specifically the use of quadratic and interaction terms, on the predictive performance of a classifier. In addition, the Yager intersection operator is applied as a feature construction method to form further interaction features. Since feature construction may also create irrelevant features, it can be combined with feature selection to maintain or even reduce the dimensionality of the feature set used for model training. In this study, the supervised feature selection method ReliefF is used to rank all features, and the k-nearest neighbor classifier is used to predict the target classes. On the seven real-world data sets included in this study, the features generated by feature construction are often among the most important features and provide competitive predictive performance.
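The feature construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min-max scaling to [0, 1], the restriction to pairwise terms, and the parameter value p = 2 are all assumptions made for illustration. The Yager intersection is written in its standard form, T_p(a, b) = 1 - min(1, ((1 - a)^p + (1 - b)^p)^(1/p)), which requires inputs in [0, 1].

```python
from itertools import combinations


def min_max_scale(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def yager_intersection(a, b, p=2.0):
    """Yager intersection of a and b, both assumed to lie in [0, 1], with p > 0."""
    return 1.0 - min(1.0, ((1.0 - a) ** p + (1.0 - b) ** p) ** (1.0 / p))


def construct_features(rows, p=2.0):
    """Extend each row with quadratic, product, and Yager interaction features.

    rows: list of equal-length lists of numeric feature values.
    Output layout per row (an assumption of this sketch):
    scaled originals, squares, pairwise products, pairwise Yager intersections.
    """
    # Scale every original feature to [0, 1] so the Yager operator is defined.
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    scaled = [list(r) for r in zip(*columns)]
    pairs = list(combinations(range(len(columns)), 2))
    out = []
    for x in scaled:
        quadratic = [v * v for v in x]
        products = [x[i] * x[j] for i, j in pairs]
        yager = [yager_intersection(x[i], x[j], p) for i, j in pairs]
        out.append(x + quadratic + products + yager)
    return out


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
    for row in construct_features(data):
        print([round(v, 3) for v in row])
```

With d original features, this layout yields 2d + d(d - 1) columns, which is why the abstract pairs construction with a ranking step such as ReliefF before a classifier is trained.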

Change history

  • 26 March 2023

    The following corrections have been made to the original version of the book:

    Chapter 2

    In Table 2, the down arrow symbol has been replaced with an up arrow symbol.

    Chapter 5

    The chapter title has been updated from “A Comparison of Feature Construction Methods in the Context of Supervised Feature Election for Classification” to “A Comparison of Feature Construction Methods in the Context of Supervised Feature Selection for Classification”.

    The corrected chapters and the book have been updated with these changes.

Author information

Corresponding author

Correspondence to Duc Duy Nguyen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Nguyen, D.D., Lohrmann, C., Luukka, P. (2023). A Comparison of Feature Construction Methods in the Context of Supervised Feature Selection for Classification. In: Huang, YP., Wang, WJ., Quoc, H.A., Le, HG., Quach, HN. (eds) Computational Intelligence Methods for Green Technology and Sustainable Development. GTSD 2022. Lecture Notes in Networks and Systems, vol 567. Springer, Cham. https://doi.org/10.1007/978-3-031-19694-2_5
