Abstract
Social tagging is an increasingly popular phenomenon with substantial impact on the way we perceive and understand the Web. For the many Web resources that are not self-descriptive, such as images, tagging is the sole way of associating them with concepts explicitly expressed in text. Consequently, users are encouraged to assign tags to Web resources, and tag recommenders are being developed to stimulate the consistent re-use of existing tags. However, a tag inevitably still expresses the personal perspective of each user on the tagged resource. This personal perspective should be taken into account when assessing the similarity of resources with the help of tags. In this paper, we focus on similarity-based clustering of tagged items, which can support several applications in social-tagging systems, such as information retrieval, recommendation, the establishment of user profiles, and the discovery of topics. We show that it is necessary to capture and exploit the multiple values of similarity reflected in the tags assigned to the same item by different users. We model the items, the tags on them, and the users who assigned the tags in a multigraph structure. To discover clusters of similar items, we extend spectral clustering, an approach successfully used for the clustering of complex data, into a method that captures multiple values of similarity between any two items. Our experiments with two real social-tagging data sets show that our new method is superior to conventional spectral clustering, which ignores the existence of multiple values of similarity among the items.
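To make the distinction between a single similarity value and multiple values concrete, the following is a minimal sketch, not the method of this paper. It assumes (user, item, tag) triples with integer ids, uses Jaccard overlap of tag sets as a hypothetical per-user similarity, and then runs a conventional spectral clustering in the style of Ng, Jordan and Weiss on the average over users, that is, the kind of baseline that discards the per-user values. The function names, the toy triples, and the choice of Jaccard and averaging are all illustrative assumptions.

```python
import numpy as np

def per_user_similarity(triples, n_items, n_users):
    """S[u, i, j] = Jaccard overlap of the tags user u assigned to items i and j.

    Jaccard is an assumption made for illustration only.
    """
    tags = {}
    for user, item, tag in triples:
        tags.setdefault((user, item), set()).add(tag)
    S = np.zeros((n_users, n_items, n_items))
    for u in range(n_users):
        for i in range(n_items):
            for j in range(i + 1, n_items):
                a = tags.get((u, i), set())
                b = tags.get((u, j), set())
                if a and b:
                    S[u, i, j] = S[u, j, i] = len(a & b) / len(a | b)
    return S

def spectral_clusters(A, k, n_iter=20):
    """Conventional spectral clustering on ONE symmetric affinity matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Normalized affinity D^{-1/2} A D^{-1/2}
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    X = vecs[:, -k:]                     # k leading eigenvectors
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    # Small k-means with deterministic farthest-point initialization
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Toy data (hypothetical): two users tag four items with different vocabularies.
triples = [
    (0, 0, "beach"), (0, 1, "beach"), (0, 2, "city"), (0, 3, "city"),
    (1, 0, "holiday"), (1, 1, "holiday"), (1, 2, "work"), (1, 3, "work"),
]
S = per_user_similarity(triples, n_items=4, n_users=2)  # one similarity matrix per user
A = S.mean(axis=0)                                      # baseline: collapse to one value per pair
labels = spectral_clusters(A, k=2)
print(labels)
```

The array `S` keeps one similarity value per user for every item pair, which is the information a multigraph over items retains. Averaging it into `A` is only one simple way to collapse those values before clustering; a method that exploits the multiple values would operate on `S` itself rather than on `A`.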
The first author gratefully acknowledges the partial co-funding of his work through the European Commission FP7 project MyMedia (www.mymediaproject.org) under the grant agreement no. 215006.
Keywords
- Singular Value Decomposition
- Spectral Cluster
- Similarity Graph
- Tensor Factorization
- Silhouette Coefficient
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nanopoulos, A., Gabriel, HH., Spiliopoulou, M. (2009). Spectral Clustering in Social-Tagging Systems. In: Vossen, G., Long, D.D.E., Yu, J.X. (eds) Web Information Systems Engineering - WISE 2009. WISE 2009. Lecture Notes in Computer Science, vol 5802. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04409-0_15
DOI: https://doi.org/10.1007/978-3-642-04409-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04408-3
Online ISBN: 978-3-642-04409-0
eBook Packages: Computer Science, Computer Science (R0)