Abstract
Web users rely on search engines to find useful information on the Internet. However, current web search engines return the same answers to a query regardless of the individual user's information need. Because web users with similar browsing behavior tend to seek similar information when they submit the same query, the unseen factors behind this shared behavior can be used to improve search results. In this paper we present an approach that mines these unseen factors from web logs to personalize web search. Our approach is based on probabilistic latent semantic analysis, a model-based technique used to analyze co-occurrence data. Experimental results on real data collected by the MSN search engine show improvements over traditional web search.
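To illustrate how probabilistic latent semantic analysis extracts unseen factors from co-occurrence data, the following is a minimal sketch of the standard two-mode PLSA model fitted with EM. It is not the authors' implementation: the toy click matrix, the choice of users and pages as the two observed variables, and the names `plsa`, `normalize`, and `dominant_topic` are all assumptions made for illustration, and the abstract does not say which log fields the paper actually models.

```python
import random


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit P(z), P(u|z), P(p|z) to counts[u][p] with EM (toy-scale sketch)."""
    rng = random.Random(seed)
    n_users, n_pages = len(counts), len(counts[0])
    # Random positive initialisation breaks the symmetry between factors.
    p_z = [1.0 / n_topics] * n_topics
    p_u_z = [normalize([rng.random() + 0.1 for _ in range(n_users)])
             for _ in range(n_topics)]
    p_p_z = [normalize([rng.random() + 0.1 for _ in range(n_pages)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z = [0.0] * n_topics
        acc_u = [[0.0] * n_users for _ in range(n_topics)]
        acc_p = [[0.0] * n_pages for _ in range(n_topics)]
        for u in range(n_users):
            for p in range(n_pages):
                n = counts[u][p]
                if n == 0:
                    continue
                # E-step: P(z | u, p) is proportional to P(z) P(u|z) P(p|z).
                joint = [p_z[z] * p_u_z[z][u] * p_p_z[z][p]
                         for z in range(n_topics)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_topics):
                    weight = n * joint[z] / total
                    acc_z[z] += weight
                    acc_u[z][u] += weight
                    acc_p[z][p] += weight
        # M-step: re-estimate each distribution from the expected counts.
        for z in range(n_topics):
            if acc_z[z] > 0.0:
                p_u_z[z] = [v / acc_z[z] for v in acc_u[z]]
                p_p_z[z] = [v / acc_z[z] for v in acc_p[z]]
        p_z = normalize(acc_z)
    return p_z, p_u_z, p_p_z


def dominant_topic(p_z, p_u_z, user):
    """Index of the latent factor with the highest P(z) P(u|z) for a user."""
    scores = [p_z[z] * p_u_z[z][user] for z in range(len(p_z))]
    return scores.index(max(scores))


# Hypothetical click counts: rows are users, columns are result pages.
counts = [
    [4, 2, 1, 0, 0, 0],
    [3, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 4, 2],
]

p_z, p_u_z, p_p_z = plsa(counts, n_topics=2)
for user in range(len(counts)):
    print("user", user, "-> latent factor", dominant_topic(p_z, p_u_z, user))
```

In this toy setting, users who click the same pages are assigned to the same latent factor, which is the kind of shared, unobserved preference that a personalized ranking could condition on. A real system would need sparse data structures and a held-out set to choose the number of factors.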
Keywords
- User Profile
- Latent Semantic Analysis
- Probabilistic Latent Semantic Analysis
- Search Intention
- Clickthrough Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, C., Xue, GR., Zeng, HJ., Yu, Y. (2005). Using Probabilistic Latent Semantic Analysis for Personalized Web Search. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_68
DOI: https://doi.org/10.1007/978-3-540-31849-1_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science, Computer Science (R0)