Abstract
This paper introduces the SKM algorithm, a new K-Means-type algorithm for clustering sequential data that identifies groups of objects with similar trajectories and dynamics. A simulation study illustrates the properties of the SKM algorithm. An application to real data on website users’ search patterns shows its usefulness in identifying groups with heterogeneous behavior: it yields two distinct clusters with different styles of website search.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dias, J.G., Cortinhal, M.J. (2008). The SKM Algorithm: A K-Means Algorithm for Clustering Sequential Data. In: Geffner, H., Prada, R., Machado Alexandre, I., David, N. (eds) Advances in Artificial Intelligence – IBERAMIA 2008. IBERAMIA 2008. Lecture Notes in Computer Science, vol 5290. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88309-8_18
DOI: https://doi.org/10.1007/978-3-540-88309-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88308-1
Online ISBN: 978-3-540-88309-8
eBook Packages: Computer Science, Computer Science (R0)