Abstract
Large volumes of data are generated in healthcare every day, and the sheer size of healthcare datasets makes proper knowledge discovery difficult. Data integration is increasingly used by healthcare data specialists to prepare information for analysis and data mining. “Which features or attributes should we use to integrate data for data warehouses?” is a difficult question to answer, because it requires deep knowledge of the problem domain. Automatic feature selection is the process of automatically selecting a subset of relevant features for later use. In this paper, we propose a hybrid method that combines four random-forest-based feature selection algorithms with domain knowledge. Experimental results show that our hybrid method can select a required number of features from a large set of attributes.
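The abstract does not say how the four random-forest-based rankings and the domain knowledge are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes each ranker has already produced an importance score per feature, averages the resulting ranks, always keeps the features a domain expert marks as required, and then fills the remaining slots until the required number of features is reached. The function name `hybrid_select`, the feature names, and all scores are hypothetical.

```python
from statistics import mean


def hybrid_select(score_tables, domain_features, k):
    """Select k features: expert-required ones first, then best average rank.

    score_tables    -- list of dicts mapping feature name -> importance score,
                       one dict per ranker (assumed precomputed, e.g. by
                       random-forest-based importance measures)
    domain_features -- set of features a domain expert marks as required
    k               -- required number of features
    """
    if not score_tables:
        raise ValueError("need at least one score table")
    features = list(score_tables[0])

    # Turn each ranker's scores into ranks (1 = most important).
    rank_tables = []
    for scores in score_tables:
        ordered = sorted(features, key=lambda f: scores[f], reverse=True)
        rank_tables.append({f: i + 1 for i, f in enumerate(ordered)})

    # Average rank across all rankers (assumed aggregation rule).
    avg_rank = {f: mean(r[f] for r in rank_tables) for f in features}

    # Domain knowledge step: required features are kept regardless of rank.
    selected = [f for f in features if f in domain_features][:k]

    # Fill the remaining slots by ascending average rank (name breaks ties).
    for f in sorted(features, key=lambda f: (avg_rank[f], f)):
        if len(selected) >= k:
            break
        if f not in selected:
            selected.append(f)
    return selected


# Hypothetical importance scores from four rankers over four attributes.
t1 = {"age": 0.30, "glucose": 0.40, "blood_pressure": 0.20, "record_no": 0.10}
t2 = {"age": 0.35, "glucose": 0.30, "blood_pressure": 0.25, "record_no": 0.10}
t3 = {"age": 0.25, "glucose": 0.45, "blood_pressure": 0.20, "record_no": 0.10}
t4 = {"age": 0.30, "glucose": 0.35, "blood_pressure": 0.15, "record_no": 0.20}

# An expert insists on keeping the (hypothetical) integration key.
selected = hybrid_select([t1, t2, t3, t4], {"record_no"}, k=3)
print(selected)  # ['record_no', 'glucose', 'age']
```

In this sketch, `record_no` scores poorly with every ranker but is retained because it is marked as required, which is the kind of case where domain knowledge overrides a purely automatic ranking.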
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Badiuzzaman Biplob, M., Khan, S.I., Sheraji, G.A., Shuvo, J.A. (2020). Hybrid Feature Selection Algorithm to Support Health Data Warehousing. In: Jain, L., Peng, SL., Alhadidi, B., Pal, S. (eds) Intelligent Computing Paradigm and Cutting-edge Technologies. ICICCT 2019. Learning and Analytics in Intelligent Systems, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-030-38501-9_10
DOI: https://doi.org/10.1007/978-3-030-38501-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38500-2
Online ISBN: 978-3-030-38501-9
eBook Packages: Intelligent Technologies and Robotics (R0)