Abstract
This paper studies the clustering of Web usage sessions based on their access patterns. Access patterns of Web users are extracted from Web server log files and organized into sessions, each of which represents an episode of interaction between a Web user and the Web server. Using attribute-oriented induction, the sessions are then generalized according to a page hierarchy that organizes pages by their contents. The generalized sessions are finally clustered using a hierarchical clustering method. Our experiments on a large real data set show that the approach is efficient and practical for Web mining applications.
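The pipeline described in the abstract (sessions from log records, generalization along a page hierarchy, hierarchical clustering) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not the chapter's actual method: the record layout `(user, timestamp, url)`, the 30-minute session timeout, the use of URL path prefixes as a stand-in for the content-based page hierarchy, and single-link agglomerative clustering as a stand-in for the unnamed hierarchical clustering method.

```python
from collections import Counter
from itertools import combinations

def sessionize(records, timeout=1800):
    """Group (user, timestamp, url) records into sessions.
    Assumption: a gap longer than `timeout` seconds starts a new session."""
    sessions, last = [], {}  # last: user -> (last timestamp, session index)
    for user, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        prev = last.get(user)
        if prev is None or ts - prev[0] > timeout:
            sessions.append([])
            idx = len(sessions) - 1
        else:
            idx = prev[1]
        sessions[idx].append(url)
        last[user] = (ts, idx)
    return sessions

def generalize(session, level=1):
    """Replace each page by its ancestor at depth `level` and count visits.
    Assumption: the hierarchy is approximated by the URL path prefix."""
    counts = Counter()
    for url in session:
        parts = [p for p in url.split("/") if p]
        counts["/" + "/".join(parts[:level])] += 1
    return counts

def distance(a, b):
    """Euclidean distance between two generalized sessions (Counters)."""
    return sum((a[k] - b[k]) ** 2 for k in set(a) | set(b)) ** 0.5

def agglomerate(vectors, k):
    """Single-link agglomerative clustering down to k clusters.
    Returns clusters as lists of indices into `vectors`."""
    clusters = [[i] for i in range(len(vectors))]

    def link(c1, c2):
        return min(distance(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters

# Hypothetical log records: (user, timestamp in seconds, requested page).
records = [
    ("u1", 0, "/courses/cs101/syllabus.html"),
    ("u1", 60, "/courses/cs102/notes.html"),
    ("u2", 10, "/research/db/papers.html"),
    ("u2", 100, "/research/ai/index.html"),
    ("u3", 20, "/courses/cs101/hw1.html"),
    ("u1", 5000, "/research/db/index.html"),
]
sessions = sessionize(records)
generalized = [generalize(s, level=1) for s in sessions]
clusters = agglomerate(generalized, k=2)
print(clusters)
```

In this toy data, generalizing to the top level of the hierarchy maps distinct course pages to the same `/courses` node, so sessions that share no individual page can still end up in the same cluster. Raising `level` keeps more detail at the cost of sparser session vectors.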
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, Y., Sandhu, K., Shih, MY. (2000). A Generalization-Based Approach to Clustering of Web Usage Sessions. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science, vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_2
DOI: https://doi.org/10.1007/3-540-44934-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67818-2
Online ISBN: 978-3-540-44934-8
eBook Packages: Springer Book Archive