Abstract
Artificial intelligence has made its most significant advances in the last two decades. In health care, it can be applied to a wide range of tasks. Unsupervised learning is one type of machine learning, and clustering is its best-known form. Research shows that clustering algorithms can be applied to identify different diseases. However, although many new clustering algorithms exist, k-means, hierarchical agglomerative clustering, and k-modes are still the most widely used, as they are fast and work well with specific datasets. This work gives a brief overview of machine learning, with particular attention to unsupervised machine learning and clustering. It briefly introduces the clustering methods currently used in the medical field and applies clustering methods to different medical and non-medical datasets. The results showed that different methods work best for different datasets and that no clustering method is universal across all datasets. For the E. coli dataset, the best method tested was BIRCH, whereas for the cancer dataset, the best model was the Gaussian mixture model.
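Ranking clustering methods per dataset, as described above, needs a score for how closely each method's cluster assignments match known class labels. The abstract does not say which score was used. As an assumption, the sketch below uses the adjusted Rand index, a common choice for this kind of comparison, implemented with the Python standard library only. The function name, the method names, and the toy labelings are hypothetical and are not taken from the chapter.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items.

    Assumption: this metric is illustrative; the chapter's abstract
    does not state which evaluation score the authors used.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts: items per (true label, predicted label) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs that share a cell, a true class, or a cluster.
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    total_pairs = comb(n, 2)

    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree
    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy data: reference classes and two candidate clusterings.
reference = [0, 0, 0, 1, 1, 1]
candidates = {
    "method_a": [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    "method_b": [0, 0, 1, 1, 2, 2],  # splits the items into three clusters
}
scores = {
    name: adjusted_rand_index(reference, labels)
    for name, labels in candidates.items()
}
best_method = max(scores, key=scores.get)
print(scores, best_method)
```

Because the index depends only on which items are grouped together, renaming the cluster labels does not change the score. Running the same scoring loop over each dataset separately is one way a different method can come out on top for each dataset.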
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lukauskas, M., Ruzgas, T. (2023). Review and Comparative Analysis of Unsupervised Machine Learning Application in Health Care. In: Jacob, I.J., Kolandapalayam Shanmugam, S., Izonin, I. (eds) Data Intelligence and Cognitive Informatics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-6004-8_56
DOI: https://doi.org/10.1007/978-981-19-6004-8_56
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6003-1
Online ISBN: 978-981-19-6004-8
eBook Packages: Intelligent Technologies and Robotics (R0)