Abstract
Similarity search is a core module of many data analysis tasks, including search by example, classification, and clustering. For time series data, Dynamic Time Warping (DTW) has been proven a very effective similarity measure, since it minimizes the effects of shifting and distortion in time. However, the quadratic cost of DTW computation to the length of the matched sequences makes its direct application on databases of long time series very expensive. We propose a technique that decomposes the sequences into a number of segments and uses cheap approximations thereof to compute fast lower bounds for their warping distances. We present several, progressively tighter bounds, relying on the existence or not of warping constraints. Finally, we develop an index and a multi-step technique that uses the proposed bounds and performs two levels of filtering to efficiently process similarity queries. A thorough experimental study suggests that our method consistently outperforms state-of-the-art methods for DTW similarity search.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Agrawal, R., Faloutsos, C., & Swami, A. (1993). Efficient similarity search in sequence databases. Proc. 4th Int. Conf. on Foundations of Data Organization and Algorithms (FODO) (pp. 69–84).
Beckmann, N., Kriegel, H.-P., Schneider, R., & Seeger, B. (1990). The R*-tree: An efficient and robust access method for points and rectangles. Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 322–331). Atlantic City, New Jersey.
Berndt, D. J. & Clifford, J. (1994). using dynamic time warping to find patterns in time series. Proc. of AAAI Workshop: Knowledge Discovery in Databases (pp. 359–370). Seattle, Washington.
Chu, S., Keogh, E., Hart, D., & Pazzani, M. (2002). Iterative deepening dynamic time warping for time series. In: Proc of SIAM International Conference on Data Mining.
Faloutsos, C., Ranganathan, M., & Manolopoulos, Y. (1994). Fast subsequence matching in time-series databases. Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 419–429). Minneapolis, Minnesota.
Goldin, D. & Kanellakis, P. (1995). On similarity queries for time-series data: constraint specification and implementation. Proc. of the 1st Intl Conference on the Principles and Practice of Constraint Programming (pp. 137–153). Cassis, France.
Hjaltason, G. & Samet, H. (1999). distance browsing in spatial databases. ACM Transactions on Database Systems 24:2, 265–318.
Kaufman, L. & Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley.
Keogh, E. (2002a). Exact indexing of dynamic time warping. Proc. of 28th International Conference on Very Large Data Bases (pp. 406–417). Hong Kong, China.
Keogh, E. (2002b). The UCR Time Series Data Mining Archive. Computer Science & Engineering Department, University of California, Riverside CA, http://www.cs.ucr.edu/~eamonn/TSDMA/index.html.
Keogh, E., Chakrabarti, K., Mehrotra, S., & Pazzani, M. (2001). Locally adaptive dimensionality reduction for indexing large time series databases. Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 151–162). Santa Barbara, California,.
Keogh, E. & Pazzani, M. (1999). scaling up dynamic time warping to massive datasets. Proc. of the Eur. Conf on Principles of Data Mining and Knowledge Discovery (PKDD) (pp. 1–11). Prague, Czech Republic.
Kim, S. W., Park, S., & Chu, W. W. (2001). An indexed-based approach for similarity search supporting time warping in large sequence databases. Proc. of the 17th International Conference on Data Engineering (pp. 607–614). Heidelberg, Germany.
Kruskall, J. B. & Liberman, M. (1983). The symmetric time warping algorithm: From continuous to discrete. In: Time Warps, String Edits and Macromolecules. Addison-Wesley.
Moon, Y. S., Whang, K. Y., & Han, W. S. (2002). GeneralMatch: A subsequence matching method in time-series databases based on generalized windows.Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 382–393). Madison, Wisconsin.
Munich, M. E. & Perona, P. (1999). continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. Proc. of the 8 th Int’l Conf. on Computer Vision (pp. 20–25). Corfu, Greece.
Rabiner, L. & Juang, B. H. (1993). Fundamentals of Speech Recognition. Prentice Hall.
Ratanamahatana, C. A. & Keogh, E. (2004). making time-series classification more accurate using learned constraints. Proc. of SIAM International Conference on Data Mining (SDM ’04) (pp. 11–22). Lake Buena Vista, Florida.
Sakoe, H. & Chiba, S. (1978). dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoustics, Speech and Signal Processing 26:1, 43–49.
Seidl, T. & Kriegel, H.-P. (1998). Optimal multi-step k-nearest neighbor search. Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 154–165). Seattle, Washington.
Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., & Keogh, E. (2003). indexing multi-dimensional time-series with support for multiple distance measures. Proc. of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 216–225). Washington, DC.
Yi, B., Jagadish, H., & Faloutsos, C. (1998). Efficient retrieval of similar time sequences under time warping. Proc. of the Fourteenth International Conference on Data Engineering. pp. 201–208, Orlando, Florida.
Zhu, Y. & Shasha, D. (2003). warping indexes with envelope transforms for query by humming. Proc. of the ACM SIGMOD Int. Conf. on Management of Data (pp. 181–192). San Diego, California.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Shou, Y., Mamoulis, N. & Cheung, D.W. Fast and Exact Warping of Time Series Using Adaptive Segmental Approximations. Mach Learn 58, 231–267 (2005). https://doi.org/10.1007/s10994-005-5828-3
Received:
Revised:
Accepted:
Issue Date:
DOI: https://doi.org/10.1007/s10994-005-5828-3