Probabilistic D-Clustering

Ben-Israel, Adi; Iyigun, Cem

doi:10.1007/s00357-008-9002-z

Published: 18 June 2008

Volume 25, pages 5–26, (2008)

Journal of Classification

Abstract

We present a new iterative method for probabilistic clustering of data. Given clusters, their centers and the distances of data points from these centers, the probability of cluster membership at any point is assumed inversely proportional to the distance from (the center of) the cluster in question. This assumption is our working principle.
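The working principle can be illustrated with a minimal sketch. It assumes the probabilities at each point are normalised to sum to one over the clusters, so that each is proportional to the reciprocal of the corresponding distance. The function name `membership_probabilities` and the small guard `eps` are illustrative choices, not taken from the article.

```python
import numpy as np

def membership_probabilities(distances, eps=1e-12):
    """Map an (n_points, n_clusters) array of distances to probabilities.

    Row i is normalised so that p[i, k] is proportional to 1 / distances[i, k].
    """
    # eps is a hypothetical guard against division by zero when a point
    # coincides with a center; it is not part of the stated principle.
    d = np.maximum(np.asarray(distances, dtype=float), eps)
    inv = 1.0 / d
    return inv / inv.sum(axis=1, keepdims=True)

# One point at distance 1 from the first center and 3 from the second:
# the closer center receives three times the probability of the farther one.
p = membership_probabilities([[1.0, 3.0]])
print(p)  # [[0.75 0.25]]
```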

The method is a generalization, to several centers, of the Weiszfeld method for solving the Fermat–Weber location problem. At each iteration, the distances (Euclidean, Mahalanobis, etc.) from the cluster centers are computed for all data points, and the centers are updated as convex combinations of these points, with weights determined by the above principle. Computations stop when the centers stop moving.
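The iteration described above might be sketched as follows, using Euclidean distances. The abstract does not spell out the weights, so the form used here (probability squared divided by distance, normalised per cluster) is an assumption chosen to mirror Weiszfeld-style updates. The function name `d_clustering`, the tolerance `tol`, the guard `eps`, and the toy data are likewise illustrative.

```python
import numpy as np

def d_clustering(points, centers, max_iter=100, tol=1e-6, eps=1e-12):
    """Sketch of the center-update iteration with Euclidean distances.

    points: (n, dim) data; centers: (K, dim) initial guesses.
    Returns the final centers and the memberships from the last pass.
    """
    X = np.asarray(points, dtype=float)
    C = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Distances from every point to every center, shape (n, K).
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        d = np.maximum(d, eps)  # hypothetical guard against zero distance
        # Working principle: membership inversely proportional to distance.
        inv = 1.0 / d
        p = inv / inv.sum(axis=1, keepdims=True)
        # ASSUMPTION: weights of the form p**2 / d, normalised per cluster,
        # so each new center is a convex combination of the data points.
        u = p ** 2 / d
        w = u / u.sum(axis=0, keepdims=True)
        new_C = w.T @ X
        shift = np.linalg.norm(new_C - C, axis=1).max()
        C = new_C
        if shift < tol:  # stop when the centers stop moving
            break
    return C, p

# Toy data: two unit squares, one near the origin and one near (10, 10).
data = [[0, 0], [0, 1], [1, 0], [1, 1],
        [10, 10], [10, 11], [11, 10], [11, 11]]
final_centers, probs = d_clustering(data, [[0.2, 0.3], [9.0, 9.5]])
print(final_centers)
```

The initial centers are deliberately placed away from the data points: in this sketch a center that starts exactly on a data point gives that point an overwhelming weight and stays there.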

Progress is monitored by the joint distance function, a measure of distance from all cluster centers, that evolves during the iterations, and captures the data in its low contours.
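One way to read the joint distance function, offered as a sketch rather than the article's own definition: if the working principle is written as "probability times distance is the same for every cluster at a given point", then normalising the probabilities fixes that common value. The symbols $p_k(x)$, $d_k(x)$, $D(x)$ and $K$ are notation assumed here.

```latex
p_k(x)\, d_k(x) = D(x), \quad k = 1, \dots, K,
\qquad
\sum_{k=1}^{K} p_k(x) = 1
\;\Longrightarrow\;
D(x) = \left( \sum_{k=1}^{K} \frac{1}{d_k(x)} \right)^{-1}
```

For two clusters this reduces to $D(x) = d_1(x)\, d_2(x) / (d_1(x) + d_2(x))$. Under this reading, $D(x)$ is small whenever $x$ is close to any one center, which fits the remark that the low contours capture the data.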

The method is simple, fast (requiring a small number of cheap iterations) and insensitive to outliers.

Author information

Authors and Affiliations

RUTCOR–Rutgers Center for Operations Research, Rutgers University, 640 Bartholomew Rd., Piscataway, NJ, 08854-8003, USA
Adi Ben-Israel & Cem Iyigun

Corresponding author

Correspondence to Adi Ben-Israel.

About this article

Cite this article

Ben-Israel, A., Iyigun, C. Probabilistic D-Clustering. J Classif 25, 5–26 (2008). https://doi.org/10.1007/s00357-008-9002-z

Issue Date: June 2008