
1 Introduction

The use of dissimilarity measures between time series is critical in several data analysis tasks which range from simple querying to classification, clustering and anomaly detection. The role of dissimilarity measures in these contexts has been acknowledged by several works, e.g. [1,2,3].

Recently, in [4], we proposed a new dissimilarity measure, COMB, a convex combination of four (normalized) distance measures which offer complementary perspectives on the differences between two time series: the Euclidean distance, which captures differences in scale; a Pearson-correlation-based measure, which takes into account linear increasing and decreasing trends over time; a Periodogram-based measure, which expresses the dissimilarities between the frequencies or cyclical components of the series; and a distance between estimated autocorrelation structures, which compares the series in terms of their dependence on past observations.

COMB achieved good results when clustering electricity market price time series in European regions and also when clustering electricity load time series (Portuguese Transmission System Operator data)—[4, 5].

In this work, we conduct an experimental analysis to evaluate the comparative performance of the proposed combined distance measure.

The remainder of the paper is structured as follows: first, we present the Methodology used to compare COMB with alternative distance measures; then, the Data Analysis and Results section presents the comparative analysis; finally, we end with a Discussion and Future Research section.

2 Methodology

2.1 UCR Repository

We resort to the University of California Riverside (UCR) Time-Series Archive where we can find time series of diverse lengths and numbers of target classes, with corresponding test and train sets—[6]. The UCR time-series datasets are from a wide variety of application domains and have been used to study the comparative performance of time-series classifiers—e.g. [2]—and specifically used in comparative studies of dissimilarity measures between time series, e.g. [7].

We limit the datasets considered to 57, taking computational cost into account, a criterion that has been invoked in similar studies—e.g. [8]. In our study we found that, for example, the script routine took 18 h 45 min to run on the “GesturePebbleZ2” UCR dataset (using a PC with an Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz and 32 GB of RAM). Nevertheless, we tried to include dissimilar datasets, namely regarding the number of target classes: 28 datasets have 2 target classes while 29 have more than 2. The selected datasets are presented in the Appendix. As in previous studies—e.g. [7]—and although this can limit the analysis, z-standardization is adopted for fairness, since many of the UCR series are presented in their z-standardized form.
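For reference, z-standardization rescales each series to zero mean and unit standard deviation. A minimal sketch in Python (an illustrative transform, not the authors' code; the study itself uses R):

```python
import numpy as np

def z_standardize(x):
    """Rescale a series to zero mean and unit standard deviation."""
    x = np.asarray(x, float)
    return (x - x.mean()) / x.std()

z = z_standardize([2.0, 4.0, 6.0])  # mean 0, standard deviation 1
```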

2.2 Using the 1NN Classifier

We follow a methodology suggested in previous studies comparing several dissimilarity measures and their variants—e.g. [7]: we use a one-nearest-neighbour (1NN) classifier on labelled data to evaluate the performance of the distance measures. In fact, since the distance measure used is critical to 1NN accuracy, this indicator directly reflects the effectiveness of the dissimilarity measure used. According to [7], p. 1890, 1NN classifiers are suitable methods for distance measure evaluation for several reasons:

  1. they resemble the problem solved in time-series similarity search;

  2. they are parameter-free and easy to implement;

  3. they are dependent on the choice of distance measure;

  4. they provide an easy-to-interpret (classification) accuracy measure, capturing whether the query and its nearest neighbour belong to the same class.
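The 1NN evaluation scheme can be sketched as follows (a minimal Python illustration; the study itself uses R, and all names here are ours):

```python
import numpy as np

def one_nn_predict(train_X, train_y, test_X, dist):
    """Label each test series with the class of its nearest training series."""
    preds = []
    for q in test_X:
        d = [dist(q, x) for x in train_X]  # dissimilarity to every training series
        preds.append(train_y[int(np.argmin(d))])
    return preds

# Toy example using the Euclidean distance as the dissimilarity measure
euclid = lambda a, b: float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))
train_X = [[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
train_y = ["low", "high"]
print(one_nn_predict(train_X, train_y, [[0.2, 0.1, 0.0], [4.8, 5.1, 5.0]], euclid))
# → ['low', 'high']
```

Plugging a different `dist` function into the same loop is all that is needed to compare dissimilarity measures under identical conditions.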

2.3 Dissimilarity Measures

We compare COMB [4] with three baseline dissimilarity measures between time series: the Euclidean distance, DTW (Dynamic Time-Warping, with a Sakoe-Chiba band [9] of 20% of the time-series length) and the Complexity Invariance Distance (CID).

COMB Distance. Considering two time series \(x_{t}\) and \(y_{t}\), \(\left( t=1, \ldots ,T\right) \), the COMB distance is a convex combination of four distances: the Euclidean distance \((d_{Eucl})\), a Pearson-correlation-based measure \((d_{Pearson})\), a Periodogram-based measure \((d_{Period})\) and an autocorrelation-based measure \((d_{Autocorr})\).

The Euclidean distance, \(d_{Eucl}\), combines the pointwise differences between the pairs \(\left( x_t,y_t\right) \), capturing the differences in scale:

$$\begin{aligned} d_{Eucl}={\left( \sum _{t=1}^{T}{\left( x_t-y_t\right) ^{2}}\right) }^\frac{1}{2}. \end{aligned}$$
(1)

The Pearson-correlation-based measure takes into account linear increasing and decreasing trends over time. The following measure was suggested by [10]:

$$\begin{aligned} d_{Pearson}=\sqrt{\frac{1-r_{x_{t},y_{t}}}{2}}, \end{aligned}$$
(2)

where \(r_{x_{t},y_{t}}\) represents the Pearson correlation.

The Periodogram-based measure [11] considers the Euclidean distance between the Periodograms, which express the contribution of the various frequencies or cyclical components to the variability of the series,

$$\begin{aligned} d_{Period}=\left( \sum _{j=1}^{\left[ \frac{T}{2}\right] }\left( P_x\left( w_j\right) -P_y\left( w_j\right) \right) ^2\right) ^\frac{1}{2}, \end{aligned}$$
(3)

where \(P_x\left( w_j\right) \) is the Periodogram of time series \(x_t\) at frequencies \(w_{j}=2\pi j/T\), \(j=1, \ldots ,[T/2]\), in the range 0 to \(\pi \), with \([T/2]\) the largest integer less than or equal to \(T/2\),

$$\begin{aligned} P_x\left( w_j\right) =\left( \frac{1}{T}\right) \left| \sum _{t=1}^{T}{x_te^{-itw_j}}\right| ^2. \end{aligned}$$
(4)

The autocorrelation-based distance [12] calculates the Euclidean distance between the estimated autocorrelation structures, comparing the series in terms of their dependence on past observations,

$$\begin{aligned} d_{Autocorr}=\left( \sum _{l=1}^{L}\left( r_l\left( x_t\right) -r_{l}\left( y_t\right) \right) ^2\right) ^\frac{1}{2}, \end{aligned}$$
(5)

where \(r_l\left( x_t\right) \) and \(r_{l}\left( y_t\right) \) represent the estimated autocorrelations of lag l of \(\left( x_t\right) \) and \(\left( y_t\right) \), respectively.

In this study, we specifically use a uniform convex combination, with all four weights equal to 1/4.
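The four components and their uniform combination can be sketched in Python as follows. This is an illustrative sketch, not the authors' R implementation: the normalization of the component distances used in [4] is omitted, and the autocorrelation lag bound `max_lag` is an arbitrary choice of ours.

```python
import numpy as np

def d_eucl(x, y):                      # Eq. (1): differences in scale
    return float(np.linalg.norm(x - y))

def d_pearson(x, y):                   # Eq. (2): linear trend (dis)agreement
    r = np.corrcoef(x, y)[0, 1]
    return float(np.sqrt(max((1.0 - r) / 2.0, 0.0)))  # clamp guards rounding

def periodogram(x):                    # Eq. (4), frequencies w_j, j = 1..[T/2]
    T = len(x)
    return (np.abs(np.fft.fft(x)[1:T // 2 + 1]) ** 2) / T

def d_period(x, y):                    # Eq. (3)
    return float(np.linalg.norm(periodogram(x) - periodogram(y)))

def acf(x, max_lag):                   # estimated autocorrelations r_1..r_L
    c = x - x.mean()
    return np.array([np.dot(c[:-l], c[l:]) / np.dot(c, c)
                     for l in range(1, max_lag + 1)])

def d_autocorr(x, y, max_lag):         # Eq. (5)
    return float(np.linalg.norm(acf(x, max_lag) - acf(y, max_lag)))

def comb(x, y, weights=(0.25, 0.25, 0.25, 0.25), max_lag=3):
    """Convex combination of the four component distances (uniform by default)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    parts = (d_eucl(x, y), d_pearson(x, y),
             d_period(x, y), d_autocorr(x, y, max_lag))
    return float(np.dot(weights, parts))
```

Changing the `weights` tuple reproduces the kind of sensitivity analysis discussed later for the Herring dataset.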

Eucl—Euclidean Distance. The comparison with the Euclidean distance is unavoidable in all studies of this type: despite its simplicity, this distance can obtain surprisingly good results, especially if the training set/database is relatively large, [13], p. 281.

DTW—Dynamic Time-Warping. DTW is an elastic measure that computes the optimal alignment between two time series to minimize the sum of distances between aligned elements.

Considering two time series \(x_{t}\) and \(y_{t}\), \(\left( t=1, \ldots ,T\right) \), let M be the \(T\times T\) matrix where each element is a dissimilarity \(d_{i,j}\) (commonly the Euclidean distance is considered) between any pair of elements \(x_i\) and \(y_j\) \(\left( i,j=1, \ldots ,T\right) \).

A warping path \(P=\left( \left( i_1,j_1\right) , \left( i_2,j_2\right) , \ldots , \left( i_K,j_K\right) \right) \) is a sequence of indexes of M defining a mapping from each element of one time series to one, more than one, or even none, of the elements of the other time series. A valid path must satisfy several conditions; for example, \(i_{k+1} \ge i_{k}\) ensures the path does not go back in time. For other step pattern constraints see, e.g. [14]. For each path P through M, the total sum of the distances along it is

$$\begin{aligned} D\left( P\right) =\sum _{k=1}^{K} d_{i_k,j_k}. \end{aligned}$$
(6)

For example, the Euclidean distance corresponds to the total distance along the diagonal of M. The goal of the DTW measure is to find the path \(P^*\) that minimizes the total distance \(D\left( P\right) \):

$$\begin{aligned} P^*=\arg \min _{P} D\left( P\right) . \end{aligned}$$
(7)

To improve the efficiency of the procedure, it is a common practice to limit the time distortion (e.g. considering 20% of the time-series length). For example, the Sakoe-Chiba band [9] limits the warping path to a band of size \(T_0\) directly above and to the right of the diagonal of the matrix M, by enforcing the constraint \(|i_k-j_k|<T_0\).
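The dynamic-programming recursion with a Sakoe-Chiba band can be sketched as follows (an illustrative Python version; for scalar series we use the absolute difference as the local cost \(d_{i,j}\), which coincides with the pointwise Euclidean distance):

```python
import numpy as np

def dtw(x, y, band=None):
    """DTW distance; band enforces the Sakoe-Chiba constraint |i - j| < band."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)  # D[i, j]: best cost aligning x[:i] with y[:j]
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if band is not None and abs(i - j) >= band:
                continue  # cells outside the band stay infeasible
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return float(D[n, m])
```

Warping lets repeated values be absorbed at no cost: `dtw([1, 2, 3], [1, 2, 2, 3])` returns 0, whereas the Euclidean distance is not even defined for series of unequal length.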

CID—Complexity Invariance Distance. The CID measure was proposed by [15]. A time series’ complexity is estimated by stretching the series and measuring the length of the resulting line:

$$\begin{aligned} CID\left( x_t,y_t\right) =d_{Eucl}\cdot CF\left( x_t,y_t\right) , \end{aligned}$$
(8)

where

$$\begin{aligned} CF\left( x_t,y_t\right) =\frac{\max \left( CE\left( x_t\right) ,CE\left( y_t\right) \right) }{\min \left( CE\left( x_t\right) ,CE\left( y_t\right) \right) } \end{aligned}$$
(9)

is the Complexity Factor, and

$$\begin{aligned} CE\left( x_t\right) =\left( \sum _{t=1}^{T-1}{\left( x_t-x_{t+1}\right) ^2}\right) ^\frac{1}{2} \end{aligned}$$
(10)

is the Complexity Estimate of time series \(x_t\).
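Equations (8)–(10) translate directly into code. A minimal Python sketch (assuming neither series is constant, so both complexity estimates are nonzero):

```python
import numpy as np

def complexity_estimate(x):
    """CE, Eq. (10): length of the 'stretched-out' series."""
    return float(np.sqrt(np.sum(np.diff(x) ** 2)))

def cid(x, y):
    """CID, Eq. (8): Euclidean distance scaled by the complexity factor CF, Eq. (9)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    ce_x, ce_y = complexity_estimate(x), complexity_estimate(y)
    cf = max(ce_x, ce_y) / min(ce_x, ce_y)  # >= 1; penalizes complexity mismatch
    return float(np.linalg.norm(x - y)) * cf
```

Since \(CF \ge 1\), CID never decreases the Euclidean distance; it only inflates it when the two series differ in complexity.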

We resort to the R package “TSclust” [12], where the four distances that compose the COMB distance and the CID are implemented; for DTW we use the “dtw” package [14].

2.4 Evaluating the Classification Results

The performance of the 1NN classifiers is evaluated on the test sets of the UCR time series considered. For unbalanced sets, [6] suggests the balanced accuracy measure (the average of sensitivity and specificity). We propose using the Huberty index (HI)—e.g. [16]—as a measure of classification performance:

$$\begin{aligned} HI={\frac{p^c-p^{def}}{1-p^{def}}}, \end{aligned}$$
(11)

where \(p^c\) and \(p^{def}\) are the proportion of observations correctly classified and the proportion of observations in the modal class, respectively. This measure is clearly useful for evaluating performance on unbalanced datasets. Furthermore, it provides a fair and interpretable view of the success of a classification task, which plain accuracy can overestimate on strongly unbalanced sets: e.g. a 90% accuracy result when a target class includes 95% of the observations yields a negative Huberty index (one would do better by simply allocating all observations to the modal class). In addition, computational time is also taken into account when evaluating the 1NN results for the four dissimilarity measures considered.
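The index, and the 90%/95% example above, can be sketched as follows (illustrative Python; the function name is ours):

```python
def huberty_index(y_true, y_pred):
    """HI = (p_c - p_def) / (1 - p_def): accuracy gain over always
    predicting the modal (most frequent) class."""
    n = len(y_true)
    p_c = sum(t == p for t, p in zip(y_true, y_pred)) / n    # proportion correct
    p_def = max(y_true.count(c) for c in set(y_true)) / n    # modal-class proportion
    return (p_c - p_def) / (1.0 - p_def)

# 95% of observations in the modal class, 90% accuracy -> HI = -1
y_true = ["a"] * 19 + ["b"]
y_pred = ["b", "b"] + ["a"] * 17 + ["b"]  # 18 of 20 correct
print(huberty_index(y_true, y_pred))  # negative: worse than the modal baseline
```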

After the evaluation of aggregated results, comparisons on specific datasets are considered to gain insights into the dissimilarities’ performance. In a “closer look to specific problems”, [2] selects some time series from each target class, trying to capture the main differences between these classes on specific datasets. We propose instead using the medoids of each class, as defined by each dissimilarity measure, to obtain those insights: a class medoid is the observation that minimizes the sum of distances to all elements in the same class—[17].
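Medoid selection works with any dissimilarity function. A minimal Python sketch (names are ours):

```python
import numpy as np

def class_medoid(series_list, dist):
    """Index of the series minimizing the sum of distances to all series in the class."""
    totals = [sum(dist(s, t) for t in series_list) for s in series_list]
    return int(np.argmin(totals))

# Toy class: the middle series is the most central one
euclid = lambda a, b: float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))
print(class_medoid([[0, 0], [1, 1], [10, 10]], euclid))  # → 1
```

Because the medoid depends on the `dist` argument, each dissimilarity measure can nominate a different representative for the same class, which is precisely what the per-dataset comparisons exploit.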

3 Data Analysis and Results

3.1 General Comparisons

A brief exploratory data analysis leads to the conclusion that, in the datasets considered, DTW generally provides better classification results than the alternative distances, followed by COMB—Table 1. COMB comparative results are illustrated in Fig. 1. However, for time series with two target classes (K=2) only, COMB provides slightly better results—see Table 2. Regarding computation time, DTW clearly provides the worst results—see Table 5.

Table 1 All time-series results
Fig. 1
figure 1

Plot of Huberty index results: COMB versus Euclidean, DTW and CID

Table 2 Huberty index results: time series with two target classes versus more than two target classes
Table 3 Friedman test’s results regarding Huberty index
Table 4 Dunn’s pairwise comparison tests regarding Huberty index for data with more than two classes (“Adj. Sig” are p-values adjusted by Bonferroni correction)

According to the Friedman test’s results, there are no significant differences between the distributions of HI for the four dissimilarity measures (see Table 3). However, significant differences are found when analysing data with more than two classes (K>2); after Dunn’s pairwise comparison test, they can be attributed to the significant difference between HI.Eucl and HI.DTW (see Table 4).

The differences between computation times regarding the four dissimilarity measures are all significant according to Friedman’s test—see Table 5.

Table 5 Friedman test’s results regarding computation time

3.2 COMB “Wins” and “Loses” Examples

In an attempt to understand the data conditions that could (un)favour COMB, we looked for insights from a “COMB wins” example and a “COMB loses” example: the ToeSegmentation2 and Herring time series, respectively. ToeSegmentation2 originates from the CMU Graphics Lab Motion Capture Data, referring to right-toe movements, with target classes “Walk Normally” and “Walk Abnormally”. The Herring data refer to calcium carbonate structures from two classes of Herring: North Sea or Thames. In Table 6, we present the details of these two datasets.

Table 6 COMB “wins” and “loses” datasets

On the assumption that exploring the target classes in the test set could bring some insight into the performance of the 1NN classifier, we obtained the medoids of the target classes according to each of the four dissimilarity measures. The ToeSegmentation2 test set classes’ medoids are depicted in Fig. 2. The COMB measure reveals not only scale differences between the medoids (as the Euclidean distance does, with the poorest results) but also, for example, how the medoids’ tendencies diverge from each other; combined with the additional differences COMB captures, this results in its best performance according to the HI.

Fig. 2
figure 2

Medoids of ToeSegmentation2 test set classes, according to dissimilarity measures Eucl, DTW, CID and COMB

The Herring test set classes’ medoids coincide for all dissimilarity measures except DTW which presents slightly different medoids. Nevertheless, a negative HI was obtained for all measures (revisit Table 6).

In an attempt to explore the potential of the COMB measure in a worst-case scenario, we performed a brief sensitivity analysis manipulating COMB’s weights. After some trials, setting the weights of \(d_{Period}\) and \(d_{Autocorr}\) to nine times those of \(d_{Eucl}\) and \(d_{Pearson}\), we managed to cross the “waterline”, obtaining a slightly positive HI, which the alternative measures could not. Note, however, that a customized parametrization of DTW could eventually obtain better results as well, but we believe it would also bring a relevant increase in computation time.

4 Discussion and Future Research

We conducted experiments on 57 time-series datasets from diverse application domains to compare the proposed dissimilarity measure, COMB, with three baseline measures: Euclidean, Dynamic Time-Warping and Complexity Invariance Distance. We resorted to the 1-Nearest Neighbour classifier, using the four dissimilarities, to compare their effectiveness. The Huberty index was used as the classification metric, providing more informative results than the simple accuracy measure adopted in previous studies (which ignores prevalence). The experimental results indicate that there are no significant differences between the classification performance (Huberty index) of the four dissimilarity measures. Nevertheless, it appears COMB can produce better results on time series with two target classes. Furthermore, there is potential to improve the results obtained with COMB by changing the weights of the convex combination: an example was provided for the Herring dataset, where COMB with uniform weights provided the worst classification results while COMB with tuned weights provided the best. Regarding computation time, Dynamic Time-Warping, which appears to be the most direct competitor of COMB in classification performance, presented the (significantly) worst results. Considering the joint classification performance and runtime results, we conclude that the proposed combined measure can be seen as competitive in several settings.

In future research, we aim to extend the present analysis to all 128 UCR datasets, which will require exploring hardware-aware implementations and/or algorithmic solutions to make the measures’ implementations as efficient as possible. We also think the Complexity Invariance Distance, which proved to be a competitive measure, should definitely play a role in future similar studies (along with the unavoidable Euclidean and Dynamic Time-Warping dissimilarities and possibly other baseline measures). An investigation of the process to determine the COMB weights should also be considered. Finally, the experimental design should include additional characteristics of the time-series data besides the number of target classes; namely, we think a measure of separation between classes should be considered.