Abstract
The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered. It is assumed that the center of one of the sought clusters is specified at the desired (arbitrary) point of space (without loss of generality, at the origin), while the center of the other one is unknown and determined as the mean value over all elements of this cluster. It is shown that unless P = NP, there is no fully polynomial-time approximation scheme for this problem, and such a scheme is substantiated in the case of a fixed space dimension.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
M. Bern and D. Eppstein, “Approximation algorithms for geometric problems,” in Approximation Algorithms for NP-Hard Problems, Ed. by D. S. Hochbaum (PWS, Boston, 1997), pp. 296–345.
G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning (Springer Science, New York, 2013).
M. C. Bishop, Pattern Recognition and Machine Learning (Springer Science, New York, 2006).
P. Flach, Machine Learning: The Art and Science of Algorithms That Make Sense of Data (Cambridge University Press, New York, 2012).
C. Steger, M. Ulrich, and C. Wiedemann, Machine Vision Algorithms and Applications (Cambridge Univ. Press, New York, 2010).
S. Kaiser, Biclustering: Methods, Software, and Application (Faculty of Math. Comput. Sci. Stat., Munich, 2011).
A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern Recogn. Lett. 31 8, 651–666 (2010).
A. V. Kel’manov and A. V. Pyatkin, “On the complexity of a search for a subset of “similar” vectors,” Dokl. Math 78 1, 574–575 (2008).
E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, “A posteriori detection of a quasiperiodic fragment with a given number of repetitions in a numerical sequence,” Sib. Zh. Ind. Mat. 9 1, 55–74 (2006).
E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, “A posteriori detecting a quasiperiodic fragment in a numerical sequence,” Pattern Recogn. Image Anal. 18 1, 30–42 (2008).
A. V. Kel’manov, “Off-line detection of a quasi-periodically recurring fragment in a numerical sequence,” Proc. Steklov Inst. Math. 263, Suppl. 2, 84–92 (2008).
A. V. Dolgushev and A. V. Kel’manov, “An approximation algorithm for solving a problem of cluster analysis,” J. Appl. Ind. Math. 5 4, 551–558 (2011).
A. V. Kel’manov and V. I. Khandeev, “Randomized algorithm for two-cluster partition of a set of vectors,” Comput. Math. Math. Phys. 55 2, 330–339 (2015).
D. Aloise, A. Deshpande, P. Hansen, and P. Popat, “NP-hardness of Euclidean sum-of-squares clustering,” Machine Learning 75 2, 245–248 (2009).
J. B. MacQueen, “Some methods for classification and analysis of multivariate observations,” Proceedings of the 5th Berkeley Symposium of Mathematical Statistics and Probability (Univ. of California Press, Berkeley, 1967), Vol. 1, pp. 281–297.
M. Rao, “Cluster analysis and mathematical programming,” J. Am. Stat. Assoc. 66, 622–626 (1971).
A. V. Kel’manov and A. V. Pyatkin, “Complexity of certain problems of searching for subsets of vectors and cluster analysis,” Comput. Math. Math. Phys. 49 11, 1966–1971 (2009).
A. V. Kel’manov, “On the complexity of some data analysis problems,” Comput. Math. Math. Phys. 50 11, 1941–1947 (2010).
A. V. Kel’manov, “On the complexity of some cluster analysis problems,” Comput. Math. Math. Phys. 51 11, 1983–1988 (2011).
A. E. Baburin, E. Kh. Gimadi, N. I. Glebov, and A. V. Pyatkin, “The problem of finding a subset of vectors with the maximum total weight,” J. Appl. Ind. Math. 2 1, 32–38 (2008).
A. V. Kel’manov and A. V. Pyatkin, “NP-completeness of some problems of choosing a vector subset,” J. Appl. Ind. Math. 5 3, 352–357 (2011).
A. V. Kel’manov and V. I. Khandeev, “An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors,” J. Appl. Ind. Math. 9 4, 497–502 (2015).
A. V. Kel’manov and V. I. Khandeev, “A 2-approximation polynomial algorithm for a clustering problem,” J. Appl. Ind. Math. 7 4, 515–521 (2013).
A. V. Kel’manov and A. V. Pyatkin, “On a version of the problem of choosing a vector subset,” J. Appl. Ind. Math. 3 4, 447–455 (2009).
M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979).
A. V. Dolgushev, A. V. Kel’manov, and V. V. Shenmaier, “A PTAS for a problem of cluster analysis,” Proceedings of the 9th International Conference on Intelligent Information Processing, Budva, Montenegro (Torus, Moscow, 2012), pp. 242–244.
V. V. Vazirani, Approximation Algorithms (Springer, Berlin, 2001).
A. V. Kel’manov and S. M. Romanchenko, “An FPTAS for a vector subset search problem,” J. Appl. Ind. Math. 8 3, 329–336 (2014).
I. Wirth, Algorithms + Data Structures = Programs (Prentice Hall, New Jersey, 1976).
Author information
Authors and Affiliations
Corresponding author
Additional information
Original Russian Text © A.V. Kel’manov, V.I. Khandeev, 2016, published in Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 2016, Vol. 56, No. 2, pp. 332–340.
Rights and permissions
About this article
Cite this article
Kel’manov, A.V., Khandeev, V.I. Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem. Comput. Math. and Math. Phys. 56, 334–341 (2016). https://doi.org/10.1134/S0965542516020111
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1134/S0965542516020111