Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle

Tanaka, Yoshiki; Iwamoto, Kazuhisa; Uehara, Kuniaki

doi:10.1007/s10994-005-5829-2

Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle

Published: February 2005

Volume 58, pages 269–300, (2005)
Cite this article

Download PDF

Machine Learning Aims and scope Submit manuscript

Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle

Download PDF

Yoshiki Tanaka¹,
Kazuhisa Iwamoto¹ &
Kuniaki Uehara¹

6921 Accesses
156 Citations
4 Altmetric
Explore all metrics

Abstract

Recently, the research on efficient extraction of previously unknown, frequently appearing patterns in a time-series data has received much attention. These patterns are called ‘motifs’. Motifs are useful for various time-series data mining tasks. In this paper, we propose a motif discovery algorithm to extract a motif that represents a characteristic pattern of the given data based on Minimum Description Length (MDL) principle. In addition, the algorithm can extract motifs from multi-dimensional time-series data by using Principal Component Analysis (PCA). In experimental evaluation, we show the efficiency of the motif discovery algorithm, and the usefulness of extracted motifs to various data mining tasks.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

References

Akaike, H. (1969). Fitting autoregressive model for prediction. The Annals of the Institute of Statistical Mathematics, 21, 243–247.
Google Scholar
Berberidis, C., Vlahavas, I., Aref, W. G., Atallah, M., & Elmagarmid, A. K. (2002). On the discovery of weak periodicities in large time series. In Proc. of PAKDD 2002 (pp. 51–61).
Buhler, J., & Tompa, M. (2001). Finding motifs using random projections. In Proc. of 5th International Conference on Computer Molecular Biology (pp. 67–74).
Caraca-Valente, J. P., & Lopez-Chavarrias, I. (2000). Discovering similar patterns in time series. In Proc. of 6th ACM SIGMOD International Confference on Knowledge Discovery and Data Mining (pp. 126–133).
Chakrabarti, S., Sarawagi, S., & Dom, B. (1998). Mining surprising patterns using temporal description length. In Proc. of 24th International Conference on Very Large databases (pp. 606–617).
Chiu, B., Keogh, E., & Lonardi, S. (2003). Probabilistic discovery of time series motif. In Proc. of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 493–498).
Colomer, J., Melendez, J., & Gamero, F. I. (2002). Pattern recognition based on episodes and DTW, application to diagnosis of a level control system. In Proc. of 16th International Workshop on Qualitative Reasoning (pp. 37–43).
Cyril, G., Peter, T., Egill, R., Arup, N. F., & Kai, H. L. (1999). On clustering fMRI time series. NeuroImage, 9:3, 298–310.
Google Scholar
Das, G., Lin, K., Mannila, H., Renganathan, G., & Smyth, P. (1998). Rule discovery from time series. In Proc. of the 4th ACM SIGMOD International Conference on Knowledge Discovery and Data Mining (pp. 16–22).
Fradkin, D., & Madigan, D. (2003). Experiments with random projections for machine learning. In Proc. of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 517–522).
Friedman, J. H., & Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers, 23:9, 881–889.
Google Scholar
Heras, D. B., Cabaleiro, J. C., Perez, V. B., Costas, P., & Rivera, F. F. (1996). Principal component analysis on vector computers. In Proc. of Vector and Parallel Processing 1996 (pp. 416–428).
Hyvrinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13:4, 411–430.
Google Scholar
Kashino, K., Smith, G., & Murase, H. (1999). Time-series active search for quick retrieval of audio and video. In Proc. of 1999 International Conference on Acoustics, Speech and Signal Processing (pp. 2993–2996).
Keogh, E. (2004). The UCR Time Series Data Mining Archive [http://www.cs.ucr.edu/~eamonn/TSDMA/datasets. html]. Riverside CA, University of California: Computer Science & Engineering Department.
Koopman, S. J., & Ooms, M. (2003). Time-series modeling of daily tax revenues. Statistica Neerlandica, 57:4, 439–469.
Google Scholar
Levin, A. U., Leen, T. K., and Moody, J. E. (1993), Fast pruning using principal components. In Proc. of NIPS ′93 (pp. 35–42).
Google Scholar
Lin, J., Keogh, E., Lonardi, S., & Chiu, B. (2003). A symbolic representation of time series, with implications for streaming algorithms. In Proc. of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.
Lin, J., Keogh, E., Lonardi, S., & Patel, P. (2002). Finding motifs in time series. In Proc. of the 2nd Workshop on Temporal Data Mining (pp. 53–68).
Mori, T., & Uehara, K. (2001). Extraction of primitive motion and discovery of association rules from human motion. In Proc. of the 10th IEEE International Workshop on Robot and Human Communication (pp. 200–206).
Myers, C. S., & Rabiner, L. R. (1981). A comparative study of several dynamic time-warping algorithms for connected word recognition. The Bell System Technical Journal, 7:60, 1389–1409.
Google Scholar
Navarro, G., & Baeza-Yates, R., (1999). Fast multi-dimensional approximate pattern matching. In Proc. of the 10th Annual Symposium on Combinatorial Pattern Matching (pp. 243–257).
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry, Vol. 15. World Scientific Publishing.
Schwarz, G (1981). Estimating the dimension of a model. The Annals of Statistical Mathematics, 6:2, 461-464.
Google Scholar
Shannon, C. E. (1949). Communication in the presence of noise. In Proc. of 2nd Institute of Radio Engineers (pp. 10–21).
Tanaka, Y., & Uehara, K. (2003). Discover motifs in multi dimensional time-series using the principal component analysis and the MDL principle. In Proc. of 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition (pp. 252–265).
Vigario, R., Jousmaki, V., Hamalainen, M., Hari, R., and Oja, E. (1998), Independent component analysis for identification of artifacts in magnetoencephalographic recording. In Proc. of NIPS ′97 (pp. 229–235).
Google Scholar
Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., & Keogh, E. (2003). Indexing multi-dimensional time-series with support for multiple distance measure. In Proc. of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 216–225).

Download references

Author information

Authors and Affiliations

Graduate School of Science and Technology, Kobe University 1-1 Rokko-dai, Nada, Kobe, 657-8501, Japan
Yoshiki Tanaka, Kazuhisa Iwamoto & Kuniaki Uehara

Authors

Yoshiki Tanaka
View author publications
You can also search for this author in PubMed Google Scholar
Kazuhisa Iwamoto
View author publications
You can also search for this author in PubMed Google Scholar
Kuniaki Uehara
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yoshiki Tanaka.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Tanaka, Y., Iwamoto, K. & Uehara, K. Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle. Mach Learn 58, 269–300 (2005). https://doi.org/10.1007/s10994-005-5829-2

Download citation

Received: 31 March 2004
Revised: 12 October 2004
Accepted: 12 October 2004
Issue Date: February 2005
DOI: https://doi.org/10.1007/s10994-005-5829-2

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle

Abstract

Article PDF

Similar content being viewed by others

An Efficient Method for Discovering Motifs in Large Time Series

A Fast Method for Motif Discovery in Large Time Series Database under Dynamic Time Warping

Exploring variable-length time series motifs in one hundred million length scale

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Discovery of Time-Series Motif from Multi-Dimensional Data Based on MDL Principle

Abstract

Article PDF

Similar content being viewed by others

An Efficient Method for Discovering Motifs in Large Time Series

A Fast Method for Motif Discovery in Large Time Series Database under Dynamic Time Warping

Exploring variable-length time series motifs in one hundred million length scale

Explore related subjects

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation