Abstract
Tag clouds are a means of navigating and exploring information resources on the Web, provided by social Web sites. The most widely used approach to generating a tag cloud is based on the popularity of tags among the users who annotate resources with them. This approach has several limitations, however: it suppresses tags that are used infrequently but could lead to interesting resources, as well as tags that are cut off by the default limit on the number of tags shown in the cloud. In this paper we propose SimSpectrum, a similarity-based spectral clustering approach to generating a tag cloud that improves on the current state of the art with respect to these limitations. Our approach first determines the extent to which tags are related by means of a similarity calculation. Based on the resulting similarities, the spectral clustering algorithm finds clusters of tags that are strongly related to one another and only loosely related to the remaining tags. In this way we can cover part of the tags that are discarded by traditional tag cloud generation approaches and therefore present the user with more opportunities to find related, interesting Web resources. We also show that, in terms of metrics that capture the structural properties of a tag cloud, such as coverage and relevance, our method achieves significant results compared with the baseline tag cloud that relies on tag popularity. In terms of the overlap measure, our method also shows improvements over the baseline approach. The proposed approach is evaluated on the MedWorm medical article collection.
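To make the idea of similarity-based spectral clustering of tags concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which similarity measure or which spectral variant SimSpectrum uses, so the sketch assumes cosine similarity over the sets of resources each tag annotates and a single spectral bisection on the unnormalized graph Laplacian. The tag names, resource identifiers, and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two tags, each given as the set of resources it annotates."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def similarity_matrix(tag_resources):
    """Return the sorted tag list and a symmetric tag-tag similarity matrix (zero diagonal)."""
    tags = sorted(tag_resources)
    n = len(tags)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(tag_resources[tags[i]], tag_resources[tags[j]])
            S[i][j] = S[j][i] = s
    return tags, S


def fiedler_vector(S, iterations=1000):
    """Approximate the eigenvector of the second-smallest eigenvalue of L = D - S.

    Uses power iteration on (c*I - L) while projecting out the constant vector,
    which is the eigenvector of L for eigenvalue 0. Assumes at least two tags
    and a connected similarity graph.
    """
    n = len(S)
    degree = [sum(row) for row in S]
    c = 2.0 * max(degree)  # Gershgorin bound on the largest eigenvalue of L
    v = [float(i) - (n - 1) / 2.0 for i in range(n)]  # deterministic, zero-mean start
    for _ in range(iterations):
        # w = (c*I - L) v, written out row by row
        w = [(c - degree[i]) * v[i] + sum(S[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant component
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def bisect_tags(tag_resources):
    """Split tags into two clusters by the sign of the Fiedler vector."""
    tags, S = similarity_matrix(tag_resources)
    v = fiedler_vector(S)
    left = {t for t, x in zip(tags, v) if x >= 0}
    right = set(tags) - left
    return left, right


# Hypothetical toy data: each tag maps to the resources it annotates.
# Resource "r4" is shared across the two topics so the graph stays connected.
TOY_TAGS = {
    "diabetes": {"r1", "r2", "r3"},
    "insulin": {"r1", "r2"},
    "glucose": {"r2", "r3", "r4"},
    "asthma": {"r5", "r6", "r7"},
    "inhaler": {"r5", "r6"},
    "lungs": {"r6", "r7", "r4"},
}

left, right = bisect_tags(TOY_TAGS)
print(sorted(left), sorted(right))
```

In this toy example the only cross-topic link is the weak glucose/lungs similarity, so the bisection cuts there and keeps each topic together. A rarely used tag such as "inhaler" stays in its cluster because of its similarity to "asthma", not because of its popularity, which is the effect the abstract describes. Which cluster is returned first depends on the sign of the eigenvector and is arbitrary.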
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Durao, F., Dolog, P., Leginus, M., Lage, R. (2012). SimSpectrum: A Similarity Based Spectral Clustering Approach to Generate a Tag Cloud. In: Harth, A., Koch, N. (eds) Current Trends in Web Engineering. ICWE 2011. Lecture Notes in Computer Science, vol 7059. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27997-3_14
DOI: https://doi.org/10.1007/978-3-642-27997-3_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27996-6
Online ISBN: 978-3-642-27997-3
eBook Packages: Computer Science, Computer Science (R0)