Abstract
Although many clustering algorithms have been introduced, the K-means clustering algorithm remains the most widely used because of its simplicity. Several modifications have been proposed to overcome its drawbacks. Even so, the algorithm still suffers chiefly from its sensitivity to the initial choice of cluster centroids, and better initialization can improve clustering efficiency. This paper therefore puts forward a way of initializing cluster centroids so that this drawback of the earlier algorithm may be reduced. The proposed algorithm divides the data space into K equal partitions, where K is the number of clusters, and uses the center point of each partition as an initial cluster center. Compared with the original K-means clustering algorithm and several modified versions proposed by other researchers, the proposed method provides better results in terms of time complexity, space complexity, and inter-cluster distance, among other measures.
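The initialization idea can be illustrated with a minimal sketch. The abstract does not say how the data space is divided, so the code below makes an assumption: points are sorted by their first coordinate, split into K groups of near-equal size, and the mean of each group is taken as an initial center. The function name `partition_initial_centers` and the sample data are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def partition_initial_centers(points, k):
    """Split points into k near-equal groups and return each group's mean.

    Assumption (not from the paper): points are ordered by their first
    coordinate before splitting into partitions.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points, key=lambda p: p[0])
    n = len(ordered)
    centers = []
    for i in range(k):
        # Integer boundaries give partitions whose sizes differ by at most one.
        chunk = ordered[i * n // k:(i + 1) * n // k]
        # Average each coordinate across the points in this partition.
        centers.append(tuple(fmean(axis) for axis in zip(*chunk)))
    return centers


# Hypothetical sample data: six two-dimensional points, two clusters.
data = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]
centers = partition_initial_centers(data, 2)
```

The resulting centers would then seed the usual K-means assign-and-update loop in place of randomly chosen starting points. Because the split is deterministic, repeated runs on the same data start from the same centers.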
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hamidur Rahman, M., Begum, M. (2023). Partitional Technique for Searching Initial Cluster Centers in K-means Algorithm. In: Kaiser, M.S., Waheed, S., Bandyopadhyay, A., Mahmud, M., Ray, K. (eds) Proceedings of the Fourth International Conference on Trends in Computational and Cognitive Engineering. Lecture Notes in Networks and Systems, vol 618. Springer, Singapore. https://doi.org/10.1007/978-981-19-9483-8_22
DOI: https://doi.org/10.1007/978-981-19-9483-8_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-9482-1
Online ISBN: 978-981-19-9483-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)