Unit Volume Based Distributed Clustering Using Probabilistic Mixture Model

Lee, Keunjoon; Joo, Jinu; Yang, Jihoon; Park, Sungyong

doi:10.1007/11563983_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3735)

Included in the following conference series:

International Conference on Discovery Science

Abstract

Extracting useful knowledge from numerous distributed data repositories can be very hard when the data cannot be directly centralized or unified into a single file or database. This paper proposes practical distributed clustering algorithms that do not access the raw data, overcoming the inefficiency of centralized data clustering methods. The aim of this research is to generate a unit-volume-based probabilistic mixture model from local clustering results without moving the original data. Our results show that the method is appropriate for distributed clustering when the real data cannot be accessed or centralized.

This work is supported by grant No. R01-2004-000-10689-0 from the Basic Research Program of the Korea Science & Engineering Foundation.

Author information

Authors and Affiliations

Kookmin Bank, 27-2, Yeouido-Dong, Yeoungdeungpo-Ku, Seoul, Korea
Keunjoon Lee
Department of Computer Science and Interdisciplinary Program of Integrated Biotechnology, Sogang University, 1 Shinsoo-Dong, Mapo-Ku, Seoul, 121-742, Korea
Jinu Joo, Jihoon Yang & Sungyong Park

Editor information

Editors and Affiliations

School of Computer Science & Engineering, The University of New South Wales, Sydney, Australia
Achim Hoffmann
Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, 567-0047, Ibaraki, Osaka, Japan
Hiroshi Motoda
Max Planck Institute for Computer Science, Saarbrücken, Germany
Tobias Scheffer

About this paper

Cite this paper

Lee, K., Joo, J., Yang, J., Park, S. (2005). Unit Volume Based Distributed Clustering Using Probabilistic Mixture Model. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds) Discovery Science. DS 2005. Lecture Notes in Computer Science, vol 3735. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563983_29

DOI: https://doi.org/10.1007/11563983_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29230-2
Online ISBN: 978-3-540-31698-5
eBook Packages: Computer Science, Computer Science (R0)
