Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

Kel′manov, A. V.; Panasenko, A. V.; Khandeev, V. I.

doi:10.1134/S1995423919020010

Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

Published: 06 June 2019

Volume 12, pages 105–115, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Numerical Analysis and Applications Aims and scope Submit manuscript

Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

Download PDF

A. V. Kel′manov^1,2,
A. V. Panasenko^1,2 &
V. I. Khandeev^1,2

33 Accesses
3 Citations
Explore all metrics

Abstract

We consider two related discrete optimization problems of searching for a subset in a finite set of points in Euclidean space. Both problems are induced by versions of a fundamental problem in data analysis, namely, that of selecting a subset of similar elements in a set of objects. In each problem, given an input set and a positive real number, it is required to find a cluster (i.e., a subset) of the largest size under constraints on a quadratic clusterization function. The points in the input set, which are outside the sought-for subset, are treated as a second (complementary) cluster. In the first problem, the function under the constraint is the sum over both clusters of the intracluster sums of the squared distances between the elements of the clusters and their centers. The center of the first (i.e., the sought-for) cluster is unknown and determined as a centroid, while the center of the second one is fixed at a given point in Euclidean space (without loss of generality, at the origin of coordinates). In the second problem, the function under the constraint is the sum over both clusters of the weighted intracluster sums of the squared distances between the elements of the clusters and their centers. As in the first problem, the center of the first cluster is unknown and determined as a centroid, while the center of the second one is fixed at the origin of coordinates. In this paper, we show that both problems are strongly NP-hard. Also, we present exact algorithms for the problems in which the input points have integer components. If the space dimension is bounded by some constant, the algorithms are pseudopolynomial.

Article PDF

Exact Algorithm for the One-Dimensional Quadratic Euclidean Cardinality-Weighted 2-Clustering with Given Center Problem

Exact Algorithms for the Special Cases of Two Hard to Solve Problems of Searching for the Largest Subset

Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem

Article 01 February 2016

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

References

MacQueen, J.B., Some Methods for Classification and Analysis of Multivariate Observations, Proc. of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley: Univ. of California Press, 1967, vol. 1, pp. 281–297.
MathSciNet MATH Google Scholar
Rao, M., atCluster Analysis and Mathematical Programming, J.Am. Stat. Assoc, 1971, vol. 66, pp. 622–626.
Article Google Scholar
Hansen, P., Jaumard, B., and Mladenovich, N., Minimum Sum of Squares Clustering in a Low Dimensional Space, J. Classific, 1998, vol. 15, pp. 37–55.
Article MathSciNet MATH Google Scholar
Hansen, P. and Jaumard, B., Cluster Analysis and Mathematical Programming, Math. Programming, 1997, vol. 79, pp. 191–215.
MathSciNet MATH Google Scholar
Fisher, R.A., Statistical Methods and Scientific Inference, New York: Hafner, 1956.
MATH Google Scholar
Jain, A.K., Data Clustering: 50 Years beyond A;-Means, Patt. Recog. Lett., 2010, vol. 31, iss. 8, pp. 651–666.
Article Google Scholar
Aloise, D., Deshpande, A., Hansen, P., and Popat, P., NP-Hardness of Euclidean Sum-of-Squares Clustering, Mach. Learning, 2009, vol. 75, iss. 2, pp. 245–248.
Article MATH Google Scholar
Drineas, P., Frieze, A., Kannan, R., Vempala, S., and Vinay, V. Clustering Large Graphs via the Singular Value Decomposition, Mach. Learning, 2004, vol. 56, pp. 9–33.
Article MATH Google Scholar
Dolgushev, A.V. and Kel'manov, A.V., On the Algorithmic Complexity of a Problem in Cluster Analysis, J. Appl. Ind. Math., 2011, vol. 2011, no. 5, pp. 2–191.
Google Scholar
Mahajan, M., Nimbhorkar, P., and Varadarajan, K., The Planar k;-Means Problem Is NP-Hard, Theor. Comp. Sci., 2012, vol. 442, pp. 3–21.
Article MathSciNet MATH Google Scholar
Brucker, P., On the Complexity of Clustering Problems, Led. Not. Econ. Math. Systems, 1978, vol. 157, pp. 45–54.
Article MathSciNet MATH Google Scholar
Bern, M. and Eppstein, D., Approximation Algorithms for Geometric Problems, in Approximation Algorithms for NP-Hard Problems, Boston: PWS, 1997, pp. 296–345.
Google Scholar
Indyk, P., A Sublinear Time Approximation Scheme for Clustering in Metric Space, Proc. of the 40th Ann. IEEE Symp. on Foundations of Computer Science (FOCS), 1999, pp. 154–159.
Google Scholar
De la Vega, F. and Kenyon, C., A Randomized Approximation Scheme for Metric Max-Cut, I. Comput. Syst. Set., 2001, vol. 63, pp. 531–541.
Article MathSciNet MATH Google Scholar
De la Vega, E. Karpinski, M., Kenyon, C., and Rabani, Y., Polynomial Time Approximation Schemes for Metric Min-Sum Clustering, Electronic Colloquium on Computational Complexity (ECCC), 2002, rep. no. 25.
Hasegawa, S., Imai, E., Inaba, M., Katoh, N., and Nakano, J., Efficient Algorithms for Variance-Based fc-Clustering, Proc. of the 1st Pacific Conference on Computer Graphics and Applications (Pacific Graphics'93, Seoul, K.rea), River Edge, N.: World Scientific, 1993, vol. 1, pp. 75–89.
Google Scholar
Inaba, M., Katoh, N., and Imai, E., Applications of Weighted Voronoi Diagrams and Randomization to Variance-Based k;-Clustering, SCG'94 Proc. of the Tenth Annual Symposium on Computational Geometry, Stony Brook, N., USA, 1994, pp. 332–339.
Chapter Google Scholar
Sahni, S. and Gonzalez, T. P-Complete Approximation Problems, I. ACM, 1976, vol. 23, pp. 555–566.
MathSciNet MATH Google Scholar
Ageev, A.A., Kel'manov, A.V., and Pyatkin, A.V., NP-Hardness of the Euclidean Max-Cut Problem, Dokl. Math., 2014, vol. 2014, no. 89, pp. 3–343.
MATH Google Scholar
Ageev, A.A., Kel'manov, A.V., and Pyatkin, A.V., Complexity of the Weighted Max-Cut in Euclidean Space, 7. Appl. Ind. Math., 2014, vol. 2014, no. 8, pp. 4–453.
MATH Google Scholar
Kel'manov, A.V. and Pyatkin, A.V., On the Complexity of a Search for a Subset of “Similar” Vectors, Dokl. Math., 2008, vol. 78, no. 1, pp. 574/575.
MATH Google Scholar
Kel'manov, A.V. and Pyatkin, A.V, On a Version of the Problem of Choosing a Vector Subset, I. Appl. Ind. Math., 2009, vol. 2009, no. 3, pp. 4–447.
Google Scholar
Kel'manov, A.V. and Pyatkin, A.V., NP-Hardness of Some Quadratic Euclidean 2-Clustering Problems, Dokl. Math., 2015, vol. 2015, no. 92, pp. 2–634.
MathSciNet MATH Google Scholar
Kel'manov, A.V. and Pyatkin, A.V, On the Complexity of Some Quadratic Euclidean 2-Clustering Problems, Comput. Math. Math. Phys., 2016, vol. 2016, no. 56, pp. 3–491.
MathSciNet MATH Google Scholar
Bishop, C.M., Pattern Recognition and Machine Learning, New York: Springer, 2006.
MATH Google Scholar
James, G., Witten, D., Hastie, T., and Tibshirani, R., An Introduction to Statistical Learning, New York: Springer, 2013.
Book MATH Google Scholar
Hastie, T. Tibshirani, R., and Friedman, J., The Elements of Statistical Learning, 2nd ed., Springer, 2009.
Book MATH Google Scholar
Aggarwal, C.C., Data Mining: The Textbook, Springer, 2015.
MATH Google Scholar
Goodfellow, I., Bengio, Y. and Courville, A., Deep Learning (Adaptive Computation and Machine Learning Series), MIT Press, 2017.
MATH Google Scholar
Shirkhorshidi, A.S., Aghabozorgi, S., Wah, T.Y., and Herawan, T. Big Data Clustering: A Review, LNCS, 2014, vol. 8583, pp. 707–720.
Google Scholar
Pach, J. and Agarwal, R.K., Combinatorial Geometry, New York: Wiley, 1995.
Book MATH Google Scholar
Kel'manov, A.V. and Khandeev, V.., A 2-Approximation Polynomial Algorithm for a Clustering Problem, 7. Appl. Ind. Math., 2013, vol. 2013, no. 7, pp. 4–515.
MATH Google Scholar
Gimadi, E.Kh., Kel'manov, A.V., Kel'manova, M.A., and Khamidullin, S.A., A Posteriori Detection of a Quasiperiodic Fragment in a Numerical Sequence with a Given Number of Recurrences, Sib. Zh. Ind. Mat., 2006, vol. 2006, no. 9, pp. 1–55.
MATH Google Scholar
Gimadi, E.Kh., Kel'manov, A.V., Kel'manova, M.A., and Khamidullin, S.A., A Posteriori Detecting a Quasiperiodic Fragment in a Numerical Sequence, Pattern Recogn. Image An., 2008, vol. 2008, no. 18, pp. 1–30.
MATH Google Scholar
Baburin, A.E., Gimadi, E.Kh., Glebov, N.I., and Pyatkin, A.V., The Problem of Finding a Subset of Vectors with the Maximum Total Weight, 7. Appl. Ind. Math., 2008, vol. 2008, no. 2, pp. 1–32.
MathSciNet MATH Google Scholar
Gimadi, E.Kh., Pyatkin, A.V, and Rykov, L.., On Polynomial Solvability of Some Problems of a Vector Subset Choice in a Euclidean Space of Fixed Dimension, 7. Appl. Ind. Math., 2010, vol. 2010, no. 4, pp. 1–48.
Google Scholar
Shenmaier, V.V., Solving Some Vector Subset Problems by Voronoi Diagrams, 7. Appl. Ind. Math., 2016, vol. 2016, no. 10, pp. 4–560.
MATH MathSciNet Google Scholar
Kel'manov, A.V. and Khandeev, V.., An Exact Pseudopolynomial Algorithm for a Problem of the Two-Cluster Partitioning of a Set of Vectors, 7. Appl. Ind. Math., 2015, vol. 2015, no. 9, pp. 4–497.
MATH Google Scholar
Dolgushev, A.V. and Kel'manov, A.V., An Approximation Algorithm for Solving a Problem of Cluster Analysis, 7. Appl. Ind. Math., 2011, vol. 2011, no. 5, pp. 4–551.
MathSciNet MATH Google Scholar
Dolgushev, A.V., Kel'manov, A.V., and Shenmaier, V.V., Polynomial-Time Approximation Scheme for a Problem of Partitioning a Finite Set into Two Clusters, Proc. Steklov Inst. Math., 2016, vol. 295, suppl. 1, pp. 47–56.
Article MathSciNet MATH Google Scholar
Kel'manov, A.V. and Khandeev, V.I., Fully Polynomial-Time Approximation Scheme for a Special Case of a Quadratic Euclidean 2-Clustering Problem, J. Appl. Ind. Math., 2016, vol. 2016, no. 56, pp. 2–334.
MATH Google Scholar
Kel'manov, A.V., Motkova, A.V., and Shenmaier, V.V., An Approximation Scheme for a Weighted Two-Cluster Partition Problem, LNCS, 2018, vol. 10716, pp. 323–333.
MATH Google Scholar
Kel'manov, A.V. and Khandeev, V.I., A Randomized Algorithm for Two-Cluster Partition of a Set of Vectors, Comput. Math. Math. Phys., 2015, vol. 2015, no. 55, pp. 2–330.
MathSciNet MATH Google Scholar
Kel'manov, A.V. and Motkova, A.V, Polynomial-Time Approximation Algorithm for the Problem of Cardinality-Weighted Variance-Based 2-Clustering with a Given Center, Comput. Math. Math. Phys., 2018, vol. 2018, no. 58, pp. 1–130.
MathSciNet MATH Google Scholar
Kel'manov, A.V. and Motkova, A.V, Exact Pseudopolynomial Algorithms for a Balanced 2-Clustering Problem, I. Appl. Ind. Math., 2016, vol. 2016, no. 10, pp. 3–349.
MathSciNet MATH Google Scholar
Kel'manov, A.V. and Motkova, A.V, A Fully Polynomial-Time Approximation Scheme for a Special Case of a Balanced 2-Clustering Problem, LNCS, 2016, vol. 9869, pp. 182–192.
MathSciNet MATH Google Scholar
Kel'manov, A.V, Khandeev, V.., and Panasenko, A.V, Randomized Algorithms for Some Clustering Problems, Comm. Comput. Inform. Sci., 2018, vol. CCIS-871, pp. 109–119.
Article Google Scholar

Download references

Author information

Authors and Affiliations

Sobolev Institute of Mathematics, Siberian Branch, Russian Academy of Sciences, pr. Akad. Koptyuga 4, Novosibirsk, 630090, Russia
A. V. Kel′manov, A. V. Panasenko & V. I. Khandeev
Novosibirsk State University, ul. Pirogova 1, Novosibirsk, 630090, Russia
A. V. Kel′manov, A. V. Panasenko & V. I. Khandeev

Authors

A. V. Kel′manov
View author publications
You can also search for this author in PubMed Google Scholar
A. V. Panasenko
View author publications
You can also search for this author in PubMed Google Scholar
V. I. Khandeev
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to A. V. Kel′manov, A. V. Panasenko or V. I. Khandeev.

Additional information

Russian Text © The Author(s), 2019, published in Sibirskii Zhurnal Vychislitel’noi Matematiki, 2019, Vol. 22, No. 2, pp. 121–136.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kel′manov, A.V., Panasenko, A.V. & Khandeev, V.I. Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems. Numer. Analys. Appl. 12, 105–115 (2019). https://doi.org/10.1134/S1995423919020010

Download citation

Received: 15 May 2018
Revised: 26 June 2018
Accepted: 21 January 2019
Published: 06 June 2019
Issue Date: April 2019
DOI: https://doi.org/10.1134/S1995423919020010

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

Abstract

Article PDF

Similar content being viewed by others

Exact Algorithm for the One-Dimensional Quadratic Euclidean Cardinality-Weighted 2-Clustering with Given Center Problem

Exact Algorithms for the Special Cases of Two Hard to Solve Problems of Searching for the Largest Subset

Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Exact Algorithms of Search for a Cluster of the Largest Size in Two Integer 2-Clustering Problems

Abstract

Article PDF

Similar content being viewed by others

Exact Algorithm for the One-Dimensional Quadratic Euclidean Cardinality-Weighted 2-Clustering with Given Center Problem

Exact Algorithms for the Special Cases of Two Hard to Solve Problems of Searching for the Largest Subset

Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem

References

Author information

Authors and Affiliations

Corresponding authors

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation