Abstract
Two recent breakthroughs have dramatically improved the scope and performance of k-means clustering: squared Euclidean seeding for the initialization step, and Bregman clustering for the iterative step. In this paper, we first unite the two frameworks by generalizing the former improvement to Bregman seeding — a biased randomized seeding technique using Bregman divergences — while also generalizing its important theoretical approximation guarantees. The result is a complete Bregman hard clustering algorithm that integrates the distortion at hand in both the initialization and iterative steps. Our second contribution is to further generalize this algorithm to handle mixed Bregman distortions, which smooth out the asymmetry of Bregman divergences. In contrast to some other symmetrization approaches, ours keeps the algorithm simple and allows us to carry over the theoretical guarantees of regular Bregman clustering. Preliminary experiments show that the proposed seeding, used with a suitable Bregman divergence, can help uncover the underlying structure of the data.
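To make the seeding idea concrete, the following minimal sketch (Python/NumPy) illustrates a k-means++-style seeding in which the squared Euclidean distance is replaced by a Bregman divergence: each new seed is drawn with probability proportional to its divergence to the closest seed chosen so far. The choice of generator (generalized Kullback-Leibler here) and the plain divergence-proportional sampling weights are illustrative assumptions, not necessarily the exact procedure or guarantees of the paper.

    import numpy as np

    def kl_divergence(x, c):
        # Generalized KL divergence D_F(x || c) for the generator
        # F(z) = sum z log z; x and c are assumed strictly positive.
        return np.sum(x * np.log(x / c) - x + c, axis=-1)

    def bregman_seeding(data, k, divergence=kl_divergence, rng=None):
        # Illustrative sketch of Bregman-biased seeding (k-means++ style),
        # not the paper's exact algorithm.
        rng = np.random.default_rng(rng)
        n = data.shape[0]
        centers = [data[rng.integers(n)]]  # first seed chosen uniformly at random
        for _ in range(1, k):
            # Divergence of every point to its nearest current seed.
            d = np.min(np.stack([divergence(data, c) for c in centers]), axis=0)
            probs = d / d.sum()  # biased, divergence-proportional sampling
            centers.append(data[rng.choice(n, p=probs)])
        return np.array(centers)

The seeds returned this way would then initialize the iterative (Bregman hard clustering) step with the same divergence.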
References
Ackermann, M.R., Blömer, J., Sohler, C.: Clustering for metric and non-metric distance measures. In: Proc. of the 19th ACM-SIAM Symposium on Discrete Algorithms, pp. 799–808 (2008)
Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: Proc. of the 18th ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035 (2007)
Azoury, K.S., Warmuth, M.K.: Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine Learning Journal 43(3), 211–246 (2001)
Banerjee, A., Guo, X., Wang, H.: On the optimality of conditional expectation as a Bregman predictor. IEEE Trans. on Information Theory 51, 2664–2669 (2005)
Banerjee, A., Merugu, S., Dhillon, I., Ghosh, J.: Clustering with Bregman divergences. Journal of Machine Learning Research 6, 1705–1749 (2005)
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comp. Math. and Math. Phys. 7, 200–217 (1967)
Crammer, K., Kearns, M., Wortman, J.: Learning from multiple sources. In: Advances in Neural Information Processing Systems 19, pp. 321–328. MIT Press, Cambridge (2007)
Chaudhuri, K., McGregor, A.: Finding metric structure in information-theoretic clustering. In: Proc. of the 21st Conference on Learning Theory (2008)
Deza, E., Deza, M.-M.: Dictionary of distances. Elsevier, Amsterdam (2006)
Gentile, C.: The robustness of the p-norm algorithms. Machine Learning Journal 53(3), 265–299 (2003)
Lloyd, S.: Least squares quantization in PCM. IEEE Trans. on Information Theory 28, 129–136 (1982)
Nielsen, F., Boissonnat, J.-D., Nock, R.: On Bregman Voronoi diagrams. In: Proc. of the 18th ACM-SIAM Symposium on Discrete Algorithms, pp. 746–755 (2007)
Nock, R., Nielsen, F.: Fitting the smallest enclosing Bregman ball. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 649–656. Springer, Heidelberg (2005)
Ostrovsky, R., Rabani, Y., Schulman, L.J., Swamy, C.: The effectiveness of Lloyd-type methods for the k-means problem. In: Proc. of the 47th IEEE Symposium on the Foundations of Computer Science, pp. 165–176. IEEE Computer Society Press, Los Alamitos (2006)
Veldhuis, R.: The centroid of the symmetrical Kullback-Leibler distance. IEEE Signal Processing Letters 9, 96–99 (2002)
Cite this paper
Nock, R., Luosto, P., Kivinen, J. (2008). Mixed Bregman Clustering with Approximation Guarantees. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol. 5212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87481-2_11