Constructing Web User Profiles: A Non-invasive Learning Approach

  • Conference paper
Web Usage Analysis and User Profiling (WebKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1836)

Abstract

Our web user profiles consist of Page Interest Estimators (PIEs) and Web Access Graphs (WAGs). We discuss a non-invasive approach to estimating the user's interest in a web page without directly asking the user. A time- and space-efficient method is proposed for locating multi-word phrases to enrich the common bag-of-words representation for text documents. PIEs are then learned to predict the user's interest in any web page. A WAG summarizes the web page access patterns of a user. We describe how a user profile can be utilized to analyze search results and recommend new and interesting pages. Our empirical results on PIEs are encouraging.
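
As a rough illustration of what enriching a bag-of-words representation with multi-word phrases can look like, the sketch below adds frequent adjacent word pairs as extra features. The abstract does not describe the paper's phrase-locating method, so this is not that method: the frequency-threshold heuristic, the function names (`tokenize`, `frequent_bigrams`, `phrase_enriched_bag`), and the `min_count` parameter are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def frequent_bigrams(docs, min_count=2):
    # Hypothetical heuristic: treat an adjacent word pair as a "phrase"
    # if it occurs at least min_count times across the collection.
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(zip(tokens, tokens[1:]))
    return {pair for pair, n in counts.items() if n >= min_count}

def phrase_enriched_bag(text, phrases):
    # Ordinary bag-of-words counts, plus one feature per known phrase.
    tokens = tokenize(text)
    bag = Counter(tokens)
    for pair in zip(tokens, tokens[1:]):
        if pair in phrases:
            bag[" ".join(pair)] += 1
    return bag

docs = [
    "machine learning for web pages",
    "machine learning on text",
    "web pages and user profiles",
]
phrases = frequent_bigrams(docs)
bag = phrase_enriched_bag("machine learning of web pages", phrases)
print(bag["machine learning"], bag["web pages"])  # prints: 1 1
```

Single words remain in the bag alongside the phrase features, so a learner built on this representation can still use either kind of evidence.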

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chan, P.K. (2000). Constructing Web User Profiles: A Non-invasive Learning Approach. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science, vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_3

  • DOI: https://doi.org/10.1007/3-540-44934-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67818-2

  • Online ISBN: 978-3-540-44934-8

  • eBook Packages: Springer Book Archive
