Using Markov chain Monte Carlo and dynamic programming for event sequence data

Salmenkivi, Marko; Mannila, Heikki

doi:10.1007/s10115-004-0157-6

Published: 01 March 2005

Volume 7, pages 267–288, (2005)

Knowledge and Information Systems

Abstract

Sequences of events are a common type of data in many scientific and business applications, e.g. telecommunication network management, the study of web access logs, biostatistics and epidemiology. A natural approach to modelling event sequences is to use time-dependent intensity functions, which indicate the expected number of events per time unit. In Bayesian modelling, piecewise constant functions can be used to model continuous intensities if the number of segments is a model parameter. Reversible jump Markov chain Monte Carlo (RJMCMC) methods can then be used in the data analysis. With very large quantities of data, these approaches may be too slow. We study dynamic programming algorithms for finding the best-fitting piecewise constant intensity function, given a number of pieces. We introduce simple heuristics for pruning the number of potential change points of the functions. Empirical evidence from trials on real and artificial data sets is provided, showing that the developed methods yield high performance and can be applied to very large data sets. We also compare the RJMCMC and dynamic programming approaches and show that the results correspond closely. The methods are applied to fault-alarm sequences produced by large telecommunication networks.
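
As an illustration of the dynamic programming idea mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a Poisson process likelihood, in which a segment of length d containing n events is best fitted by the constant rate n/d, and it assumes that the candidate change points are supplied by the caller as a sorted list `boundaries` covering the observation period. The function names `segment_loglik` and `best_piecewise_constant` are hypothetical. The search costs O(k·m²) for k pieces and m elementary intervals, which is why pruning the candidate change points matters for large data sets.

```python
import math
from bisect import bisect_left


def segment_loglik(n, length):
    """Maximised Poisson log-likelihood of n events on an interval of the given length."""
    if n == 0:
        return 0.0  # best rate is 0, contributing nothing
    rate = n / length
    return n * math.log(rate) - n  # n*log(rate) - rate*length, with rate*length == n


def best_piecewise_constant(events, boundaries, k):
    """Best k-piece constant intensity whose change points lie on `boundaries`.

    Assumption: `boundaries` is sorted, strictly increasing, and spans all events.
    Returns (log-likelihood, [(start, end, rate), ...]).
    """
    events = sorted(events)
    m = len(boundaries) - 1  # number of elementary intervals
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of elementary intervals")
    # cum[i] = number of events strictly before boundaries[i]; the last one counts all.
    cum = [bisect_left(events, b) for b in boundaries]
    cum[-1] = len(events)

    neg_inf = -math.inf
    # best[j][i]: best log-likelihood covering boundaries[0..i] with exactly j pieces.
    best = [[neg_inf] * (m + 1) for _ in range(k + 1)]
    back = [[0] * (m + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, m + 1):
            for h in range(j - 1, i):  # h is where the last piece starts
                if best[j - 1][h] == neg_inf:
                    continue
                value = best[j - 1][h] + segment_loglik(
                    cum[i] - cum[h], boundaries[i] - boundaries[h]
                )
                if value > best[j][i]:
                    best[j][i] = value
                    back[j][i] = h

    # Follow the back-pointers from the end to recover the change points.
    cuts = [m]
    i = m
    for j in range(k, 0, -1):
        i = back[j][i]
        cuts.append(i)
    cuts.reverse()
    segments = [
        (boundaries[a], boundaries[b], (cum[b] - cum[a]) / (boundaries[b] - boundaries[a]))
        for a, b in zip(cuts, cuts[1:])
    ]
    return best[k][m], segments
```

For example, calling `best_piecewise_constant(events, list(range(21)), 2)` on event times observed over the period from 0 to 20 searches for a single change point on an integer grid and returns the two fitted rates together with the log-likelihood.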

Author information

Authors and Affiliations

Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, University of Helsinki, P.O. Box 26, 00014, Helsinki, Finland
Marko Salmenkivi & Heikki Mannila

Corresponding author

Correspondence to Marko Salmenkivi.

About this article

Cite this article

Salmenkivi, M., Mannila, H. Using Markov chain Monte Carlo and dynamic programming for event sequence data. Knowl Inf Syst 7, 267–288 (2005). https://doi.org/10.1007/s10115-004-0157-6

Received: 24 October 2002
Revised: 29 November 2003
Accepted: 18 December 2003
Published: 01 March 2005
Issue Date: March 2005
DOI: https://doi.org/10.1007/s10115-004-0157-6
