Abstract
The K-means clustering algorithm is an unsupervised learning method that is simple in principle, easy to implement, and highly adaptable. It has three well-known weaknesses: the number of clusters is difficult to determine, the result is sensitive to the initial cluster centers, and the clustering is easily distorted by outliers. To address them, this paper proposes an improved clustering algorithm based on density selection. The algorithm compares the neighborhood density of each sample with the average density of all samples, treats samples with lower density as outliers or isolated points, and deletes them. After this pre-processing, a modified cluster validity index is minimized to obtain the optimal number of clusters, and the initial cluster centers are then chosen by a density selection strategy. Experiments show that the improved algorithm is more accurate than the traditional K-means algorithm and converges faster to the global minimum of the sum of squared errors (SSE).
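The abstract does not define the neighborhood density or the selection rule, so the following Python sketch only illustrates the general idea of density-based outlier removal and density-based seeding. Everything specific in it is an assumption made for illustration: density is taken to be the number of other samples within a radius `eps`, the hypothetical `ratio` parameter scales the average-density threshold, and the initial centers are picked greedily from the densest samples that lie more than `eps` apart. The modified cluster validity index is not reproduced here because its formula is not given in the abstract.

```python
import numpy as np


def neighborhood_density(X, eps):
    """Count, for each sample, how many other samples lie within radius eps.

    Assumption: this eps-ball count stands in for the paper's density measure.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Each sample is within eps of itself, so subtract 1 to exclude it.
    return (dist <= eps).sum(axis=1) - 1


def remove_low_density(X, eps, ratio=1.0):
    """Drop samples whose density is below ratio * (average density).

    `ratio` is a hypothetical knob; ratio=1.0 compares against the plain average.
    Returns the filtered samples and the boolean keep-mask.
    """
    dens = neighborhood_density(X, eps)
    keep = dens >= ratio * dens.mean()
    return X[keep], keep


def density_initial_centers(X, k, eps):
    """Greedily pick up to k initial centers from the densest samples.

    A sample is accepted only if it is farther than eps from every center
    chosen so far (an assumed separation rule). Fewer than k centers are
    returned if not enough well-separated samples exist.
    """
    dens = neighborhood_density(X, eps)
    order = np.argsort(-dens, kind="stable")  # densest first, ties by index
    centers = []
    for i in order:
        if all(np.linalg.norm(X[i] - c) > eps for c in centers):
            centers.append(X[i])
        if len(centers) == k:
            break
    return np.array(centers)
```

The centers returned by `density_initial_centers` could then be passed to any standard K-means implementation as its starting point, in place of random initialization. The pairwise-distance computation uses O(n²) memory, which is acceptable for a sketch but not for large data sets.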
Acknowledgements
This work was supported by the Shaanxi Higher Education Teaching Reform Research Project (19BZ030) and the Project of the Education Department of Shaanxi Province (JK190646).
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xie, W., Wang, X., Xu, B. (2021). An Improved K-Means Clustering Algorithm Based on Density Selection. In: MacIntyre, J., Zhao, J., Ma, X. (eds) The 2020 International Conference on Machine Learning and Big Data Analytics for IoT Security and Privacy. SPIOT 2020. Advances in Intelligent Systems and Computing, vol 1283. Springer, Cham. https://doi.org/10.1007/978-3-030-62746-1_88
DOI: https://doi.org/10.1007/978-3-030-62746-1_88
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62745-4
Online ISBN: 978-3-030-62746-1
eBook Packages: Intelligent Technologies and Robotics (R0)