Abstract
We propose a model-based approach to the twofold problem of prediction and exploratory analysis of heterogeneous symbolic sequence collections. Our model is based on seeking low-entropy local representations joined together by a smooth nonlinear mixing process. Low-entropy components are desirable, as they tend to be both more interpretable and more predictable. The nonlinear mixing in turn acts as a regulariser and, in addition, creates a topographic ordering of the sequence histories, which is useful for exploratory purposes. The combination of these two modelling elements is performed within the generative probabilistic formalism, which ensures a flexible and technically sound predictive modelling framework. Unlike previous generative topographic modelling approaches for discrete data, the estimation algorithm associated with our model is designed to scale to large data sets by exploiting data sparsity. In addition, local convergence is guaranteed without the need for tuning optimisation parameters or making approximations to the non-Gaussian likelihood. These characteristics make it the first generative topographic model for discrete symbolic data with large-scale real-world applicability. We analyse and discuss the relationship of our approach to a number of models and methods. We empirically demonstrate robustness across varying sample sizes and significant improvements in predictive performance over the state of the art. Finally, we detail an application to the prediction and exploratory analysis of a large real-world web navigation sequence collection.
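To make the sparsity argument of the abstract concrete, the following is a minimal, hypothetical sketch (not the paper's actual algorithm) of one EM iteration for a mixture of multinomial components of the kind a generative topographic model for symbolic data builds on. All names, sizes, and the random toy data are illustrative assumptions; the point is that the E-step likelihood reduces to a sum over nonzero symbol counts only, which is what makes sparse data cheap to handle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (assumed, not from the paper):
# K latent grid nodes, V symbols, N sequences summarised as count vectors.
K, V, N = 16, 20, 100
X = rng.poisson(0.3, size=(N, V))           # sparse symbol counts per sequence

# Each grid node k carries a multinomial over symbols; in the full model a
# smooth nonlinear mapping would generate these -- here they are random.
theta = rng.dirichlet(np.ones(V), size=K)   # (K, V), rows sum to 1

# E-step: responsibilities r[n, k] proportional to p(x_n | theta_k), computed
# in the log domain; the matrix product touches only nonzero counts of X,
# so cost scales with the number of nonzeros, not N * V.
log_theta = np.log(theta)                   # (K, V)
log_lik = X @ log_theta.T                   # (N, K): sum_v x_nv * log theta_kv
log_r = log_lik - log_lik.max(axis=1, keepdims=True)
r = np.exp(log_r)
r /= r.sum(axis=1, keepdims=True)           # each row is a distribution over nodes

# M-step (plain mixture-style update for illustration): re-estimate the
# multinomials from responsibility-weighted counts.
counts = r.T @ X                            # (K, V)
theta_new = (counts + 1e-12) / (counts + 1e-12).sum(axis=1, keepdims=True)
```

In the topographic model described by the abstract, the mixing additionally couples neighbouring grid nodes so that similar sequence histories land on nearby nodes; that coupling is omitted here for brevity.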
Editor: Zoubin Ghahramani.
Kabán, A. Predictive Modelling of Heterogeneous Sequence Collections by Topographic Ordering of Histories. Mach Learn 68, 63–95 (2007). https://doi.org/10.1007/s10994-007-5008-8