Abstract
We consider the strongly NP-hard problem of partitioning a finite set of Euclidean points into two clusters so as to minimize the sum (over both clusters) of the weighted sums of the squared intra-cluster distances from the elements of the cluster to its center. The weights of the sums are equal to the cardinalities of the clusters. The center of one of the clusters is given as input, while the center of the other cluster is unknown and is determined as the mean value over all points in this cluster, i.e., as the geometric center (centroid). The version of the problem with constrained cardinalities of the clusters is analyzed. We construct an approximation algorithm for the problem and show that it is a fully polynomial-time approximation scheme (FPTAS) if the space dimension is bounded by a constant.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
A. V. Kel’manov and A. V. Motkova, “Exact pseudopolynomial algorithms for a balanced 2-clustering problem,” J. Appl. Indust. Math. 10 (3), 349–355 (2016).
A. V. Kel’manov and A. V. Pyatkin, “NP-hardness of some quadratic Euclidean 2-clustering problems,” Dokl. Math. 92 (2), 634–637 (2015).
A. V. Kel’manov and A. V. Pyatkin, “On the complexity of some quadratic Euclidean 2-clustering problems,” Comput. Math. Math. Phys. 56 (3), 491–497 (2015).
C. C. Aggarwal, Data Mining: The Textbook (Springer International Publishing, 2015).
C. M. Bishop, Pattern Recognition and Machine Learning (Springer Science + Business Media, New York, 2006).
R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis (John Wiley & Sons, New York, 1973; Mir, Moscow, 1976).
K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed. (Academic Press, New York, 1990).
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (Springer-Verlag, New York, 2009).
A. K. Jain, “Data clustering: 50 years beyond - means,” Pattern Recogn. Lett. 31 (8), 651–666 (2010).
G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: with Application in R (Springer Science + Business Media, New York, 2013).
T.-C. Fu, “A review on time series data mining,” Eng. Appl. Artif. Intell. 24 (1), 164–181 (2011).
J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles (Addison-Wesley, Reading, MA, 1974; Mir, Moscow, 1978).
P. Brucker, “On the complexity of clustering problems,” in Optimization and Operations Research: Proc. of the Workshop Held at University Bonn (Bonn, Germany, October 2–8, 1977), Lecture Notes Econom. Math. Syst. 157, 45–54 (1978).
S. Sahni and T. Gonzalez, “P-complete approximation problems,” J. ACM. 23, 555–566 (1976).
S. Hasegawa, H. Imai, M. Inaba, N. Katoh, and J. Nakano, “Efficient algorithms for variance-based - Clustering,” in Proc. 1st Pacific Conf. on Computer Graphics and Applications Seoul, Korea, August 30–September 2, 1993), Vol. 1 (World Scientific, River Edge, NJ, 1993), pp. 75–89.
M. Inaba, N. Katoh, and H. Imai, “Applications of weighted Voronoi diagrams and randomization to variance- based -clustering: (extended abstract),” in Proc. 10th ACM Symposium on Computational Geometry (Stony Brook, New York, USA, June 6–8, 1994), (ACM, New York, 1994), pp. 332–339.
F. de la Vega, M. Karpinski, C. Kenyon, and Y. Rabani, “Polynomial time approximation schemes for metric min-sum clustering,” Electronic Colloquium on Computational Complexity (ECCC), Report No. 25 (2002).
F. de la Vega and C. Kenyon, “A randomized approximation scheme for metric max-cut,” J. Comput. Syst. Sci. 63, 531–541 (2001).
M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman and Co., San Francisco, 1979; Mir, Moscow, 1982).
A. V. Kel’manov and S. M. Romanchenko, “An approximation algorithm for solving a problem of search for a vector subset,” J. Appl. Indust. Math. 6 (1), 90–96 (2012).
A. V. Kel’manov and S. M. Romanchenko, “An FPTAS for a vector subset search problem,” J. Appl. Indust. Math. 8 (3), 329–336 (2014).
A. V. Kel’manov and V. I. Khandeev, “Fully polynomial- time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem,” Comput. Math. Math. Phys. 56 (2), 334–341 (2016).
A. V. Kel’manov and A. V. Motkova, “A fully polynomial- time approximation scheme for a special case of a balanced 2-clustering problem,” in Discrete Optimization and Operations Research: Proc. 9th Intern. Conf. DOOR 2016 (Vladivostok, Russia, September 19–23, 2016), Lecture Notes Comp. Sci. 9869, 182–192 (2016).
N. Wirth, Algorithms + Data Structures = Programs (Prentice Hall, New Jersey, 1976; Mir, Moscow, 1985).
Author information
Authors and Affiliations
Corresponding author
Additional information
Alexander Vasilyevich Kel’manov. Born 1952. Graduated from Izhevsk State Technical University in 1974 with specialty in Applied Mathematics. Received Candidate’s Degree in Engineering Cybernetics and Information Theory in 1980 and Doctor of Sciences degree in Physics and Mathematics in 1994. Currently with Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, head of Data Analysis Laboratory. Scientific interests: data analysis, data mining, pattern recognition, clusterization, discrete optimization, NP-hard problems, efficient algorithms with performance guarantees. Author of more than 200 publications.
Anna Vladimirovna Motkova. Born 1993. Graduated from Novosibirsk State University in 2017 with specialty in Mathematics. Currently with Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, Ph.D. student in Data Analysis Laboratory. Scientific interests: data analysis, pattern recognition, clustering, discrete optimization, NP-hard problems, efficient algorithms with performance guarantees. Author of 8 publications.
Rights and permissions
About this article
Cite this article
Kel’manov, A.V., Motkova, A.V. Approximation Scheme for a Quadratic Euclidean Weighted 2-Clustering Problem. Pattern Recognit. Image Anal. 28, 17–23 (2018). https://doi.org/10.1134/S105466181801008X
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1134/S105466181801008X