Abstract
In supervised machine learning applications, feature construction can be used to create additional, informative features with the aim of supporting the prediction of the target output. This study investigates the impact of feature construction, specifically the use of quadratic and interaction terms, on the predictive performance of a classifier. In addition, the Yager intersection operator is applied as a feature construction method to form further interaction features. Since feature construction may also create irrelevant features, it can be combined with feature selection to maintain or even reduce the dimensionality of the feature set used for model training. In this study, the supervised feature selection method ReliefF is used to rank all features, and the k-nearest neighbor classifier is used to predict the target classes. On the seven real-world data sets examined in this study, the constructed features are often among the most important features and provide competitive predictive performance.
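The three kinds of constructed features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min-max scaling to [0, 1], and the parameter value p = 2 are assumptions made for illustration. The Yager intersection is taken in its standard t-norm form, T(a, b) = 1 - min(1, ((1 - a)^p + (1 - b)^p)^(1/p)), which requires inputs in [0, 1].

```python
from itertools import combinations


def min_max_scale(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0 (assumed convention)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp > 0 else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def yager_intersection(a, b, p=2.0):
    """Yager t-norm of two values a, b in [0, 1]; p > 0 is a free parameter."""
    return 1.0 - min(1.0, ((1.0 - a) ** p + (1.0 - b) ** p) ** (1.0 / p))


def construct_features(rows, p=2.0):
    """Return scaled originals followed by quadratic, product and Yager terms."""
    scaled = min_max_scale(rows)
    out = []
    for row in scaled:
        pairs = list(combinations(row, 2))  # all unordered feature pairs
        quadratic = [v * v for v in row]
        products = [a * b for a, b in pairs]
        yager = [yager_intersection(a, b, p) for a, b in pairs]
        out.append(list(row) + quadratic + products + yager)
    return out
```

For d original features this yields d + d + 2 * d(d - 1)/2 columns, which is why the abstract pairs feature construction with a ranking step such as ReliefF before training the classifier.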
Change history
26 March 2023
In the original version of the book, the following corrections have been made:
Chapter 2
In Table 2, the down arrow symbol has been replaced with an up arrow symbol.
Chapter 5
The chapter title has been updated from “A Comparison of Feature Construction Methods in the Context of Supervised Feature Election for Classification” to “A Comparison of Feature Construction Methods in the Context of Supervised Feature Selection for Classification”.
The corrected chapters and the book have been updated with these changes.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, D.D., Lohrmann, C., Luukka, P. (2023). A Comparison of Feature Construction Methods in the Context of Supervised Feature Selection for Classification. In: Huang, YP., Wang, WJ., Quoc, H.A., Le, HG., Quach, HN. (eds) Computational Intelligence Methods for Green Technology and Sustainable Development. GTSD 2022. Lecture Notes in Networks and Systems, vol 567. Springer, Cham. https://doi.org/10.1007/978-3-031-19694-2_5
DOI: https://doi.org/10.1007/978-3-031-19694-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19693-5
Online ISBN: 978-3-031-19694-2
eBook Packages: Intelligent Technologies and Robotics (R0)