1 Introduction

Clustering of time series data is of pivotal importance in various applications [1], for example seasonality patterns in retail [2], electricity usage profiles [3], DNA microarrays [4], and fMRI brain activity mappings [5]. A crucial design decision of any clustering algorithm is the choice of (dis)similarity function [6, 7]. In many clustering applications, the underlying (dis)similarity function measures the cost of aligning time series to one another. Typical examples of such functions include the dynamic time warping (DTW) distance and the Euclidean distance [8–10].

Alignment-based (dis)similarity functions, however, seem not to be justified for applications in which two time series are considered similar if they share common or similar subsequences of variable length at arbitrary positions [11–14]. A real-life example of such an application comes from the automotive industry, where test drives of vehicles are considered similar if they share similar driving behavior patterns, i.e. engine behavior or drive maneuvers, which are described by the progression of multiple vehicle parameters over a certain period of time [15, 16]. In this scenario, it is not the order of the driving behavior patterns that matters [17], but the frequency with which the patterns occur in the compared time series.

Recent work [18] on time series distance measures suggests neglecting irrelevant and redundant time series segments and retrieving the subsequences that best characterize the real-life data. Although subsequence clustering is a tricky endeavor [19], several studies [11–14, 20] have demonstrated that, in certain circumstances, ignoring sections of extraneous data and keeping intervals with high discriminative power contributes to cluster centers that preserve the characteristics of the data sequences. Related concepts that have been shown to improve clustering results include time series motifs [11, 12], shapelets [13, 14], and discords [20].

In this contribution, we propose to adopt recurrence plots (RPs) [21–23] and related recurrence quantification analysis (RQA) [24–26] to measure the similarity between multivariate time series that contain segments of similar trajectories at arbitrary positions in time [17]. We introduce the concept of joint cross recurrence plots (JCRPs), an extension of traditional RPs, to visualize and investigate multivariate patterns that (re)occur in pairwise compared time series. Based on JCRPs and known RQA measures, such as determinism, we define a RecuRRence plot-based (RRR) distance measure, which reflects the proportion of time series segments with similar trajectories, i.e. recurring patterns.

In order to demonstrate the practicability of our proposed recurrence plot-based distance measure, we conduct experiments on both synthetic time series and real-life vehicular sensor data [15–17]. The results show that, unlike commonly used (dis)similarity functions, our proposed distance measure is able to (i) determine cluster centers that preserve the characteristics of the data sequences and, furthermore, (ii) identify prototypical time series that cover a large number of recurring patterns. The rest of the paper is organized as follows. In Sect. 2 we state the general problem being investigated. Subsequently we introduce traditional recurrence plots as well as various extensions in Sect. 3. Recurrence quantification analysis and corresponding measures are discussed in Sect. 4. Our proposed recurrence plot-based distance measure and respective evaluation criteria are introduced in Sect. 5. The experimental results are presented and discussed in Sect. 6. Finally we conclude with future work in Sect. 7.

2 Problem Statement

Car manufacturers aim to optimize the performance of newly developed engines according to operational profiles that characterize recurring driving behavior. To obtain real-life operational profiles for exhaust simulations, Volkswagen (VW) collects data from test drives for various combinations of driver, vehicle and route.

Given a set \(\fancyscript{X} = \{X_1, X_2, \ldots , X_t\}\) of \(t\) test drives, the challenge is to find a subset of \(k\) prototypical time series \(\fancyscript{Y} = \{Y_1,\ldots ,Y_k\} \subseteq \fancyscript{X}\) that best cover the recurring (driving behavior) patterns found in set \(\fancyscript{X}\). Test drives are represented as multivariate time series \(X=(x_1,\ldots ,x_n)\) of varying length \(n\), where \(x_i \in \mathbb {R}^d\) is a \(d\)-dimensional feature vector summarizing the observed measurements at time \(i\). A pattern \(S = (x_s, \ldots , x_{s+l-1})\) of \(X=(x_1,\ldots ,x_n)\) is a subsequence of \(l\) consecutive time points from \(X\), where \(l \le n\) and \(1 \le s < s+l-1 \le n\). Assuming two time series \(X=(x_1,\ldots ,x_n)\) and \(Y=(y_1,\ldots ,y_m)\) with patterns \(S = (x_s, \ldots , x_{s+l-1})\) and \(P = (y_p, \ldots , y_{p+l-1})\) of length \(l\), we say that \(S\) and \(P\) are recurring patterns of \(X\) and \(Y\) if \(d(S,P) \le \varepsilon \), where \(d:X \times X \rightarrow \mathbb {R}^+\) is a (dis)similarity function and \(\varepsilon \) is a certain similarity threshold. Note that recurring patterns of \(X\) and \(Y\) may occur at arbitrary positions and in different order.
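The definition of recurring patterns can be illustrated with a minimal sketch. The problem statement leaves the (dis)similarity function \(d\) open, so the Euclidean distance over all points of the two patterns is used here purely as an assumption; the function names `pattern_distance` and `recurring_patterns` are hypothetical and indices are 0-based.

```python
import math

def pattern_distance(s, p):
    """Euclidean distance between two equal-length multivariate patterns.

    Assumption: the paper does not fix d; this is just one possible choice.
    """
    return math.sqrt(sum((a - b) ** 2
                         for xs, ys in zip(s, p)
                         for a, b in zip(xs, ys)))

def recurring_patterns(x, y, l, eps):
    """Return all 0-based start index pairs (s, p) with d(S, P) <= eps."""
    hits = []
    for s in range(len(x) - l + 1):
        for p in range(len(y) - l + 1):
            if pattern_distance(x[s:s + l], y[p:p + l]) <= eps:
                hits.append((s, p))
    return hits

# Two univariate series (d = 1) of different length sharing the pattern 1, 2, 1.
x = [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,), (5.0,)]
y = [(7.0,), (1.0,), (2.0,), (1.0,), (9.0,)]
print(recurring_patterns(x, y, l=3, eps=0.1))  # [(1, 1)]
```

The exhaustive comparison of all window pairs is quadratic in the series lengths; it is meant to make the definition concrete, not to be efficient.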

Since we aim to identify \(k\) prototypical time series that (i) best represent the set \(\fancyscript{X}\) and (ii) are members of the set \(\fancyscript{X}\), one can employ the \(k\)-medoids clustering algorithm.
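For readers unfamiliar with medoid-based clustering, the following is a minimal sketch of an alternating \(k\)-medoids procedure on a precomputed distance matrix. It is a simple assign-and-update variant chosen for brevity, not necessarily the variant used in the experiments; the names `assign` and `k_medoids` and the parameters `max_iter` and `seed` are assumptions introduced here.

```python
import random

def assign(dist, medoids):
    """Map each medoid to the list of objects that are closest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
    return clusters

def k_medoids(dist, k, max_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix.

    Assumes distinct objects with zero self-distance, so that every
    medoid keeps at least itself in its cluster.
    """
    rng = random.Random(seed)
    medoids = sorted(rng.sample(range(len(dist)), k))  # random initialization
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # The new medoid of a cluster minimizes the summed distance to its members.
        new_medoids = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)

# Six objects on a line forming two obvious groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, clusters = k_medoids(dist, k=2)
print(medoids, clusters)
```

Because the medoids are always members of the data set, the returned prototypes are actual objects (here: indices into `points`), which is exactly requirement (ii) above.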

3 Recurrence Plots

Recurrence plots (RPs) are used to visualize and investigate recurrent states of dynamical systems or rather time series [26, 27]. Even though RPs give very vivid and impressive images of dynamical system trajectories, their implicit mathematical foundation is deceptively simple [21]:

$$\begin{aligned} R_{i,j}^{x}(\varepsilon ) = \varTheta (\varepsilon - ||x_i - x_j||) \quad x_i \in \mathbb {R}^d, i,j=1 \ldots n \end{aligned}$$
(1)

where \(x\) is a time series of length \(n\), \(||\cdot ||\) a norm and \(\varTheta \) the Heaviside function. One of the most crucial parameters of RPs is the recurrence threshold \(\varepsilon \), which influences the formation of line structures [22]. In general, the recurrence threshold should be chosen in a way that noise corrupted observations are filtered out, but at the same time a sufficient number of recurrence structures are preserved. As a rule of thumb, the recurrence rate should be approximately one percent with respect to the size of the plot. For quasi-periodic processes, it has been suggested to use the diagonal line structures to find the optimal recurrence threshold. However, changing the threshold does not preserve the important distribution of recurrence structures [26].
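Eq. 1 can be sketched in a few lines of plain Python. The sketch assumes the Euclidean norm and the convention \(\varTheta (0)=1\), so that a distance exactly equal to \(\varepsilon \) counts as a recurrence; the helper `recurrence_rate` is added only to illustrate the rule of thumb about the density of recurrence points.

```python
import math

def recurrence_plot(x, eps):
    """R[i][j] = 1 if ||x_i - x_j|| <= eps (Euclidean norm), else 0."""
    n = len(x)
    return [[1 if math.dist(x[i], x[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]

def recurrence_rate(r):
    """Fraction of recurrence points among all entries of the plot."""
    return sum(map(sum, r)) / (len(r) * len(r[0]))

# A short univariate series that alternates between two states.
x = [(0.0,), (1.0,), (0.1,), (1.05,)]
r = recurrence_plot(x, eps=0.2)
for row in r:
    print(row)
print(recurrence_rate(r))  # 0.5
```

In practice one would vary `eps` and monitor the recurrence rate until it reaches the desired density.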

A general problem with standard thresholding methods is that an inappropriate threshold or laminar states cause thick diagonal lines, which basically correspond to redundant information. Schultz et al. [27] have proposed a local minima-based thresholding approach, which can be performed without choosing any particular threshold and yields clean RPs of minimized line thickness. However, this approach comes with some side effects, e.g., bowed lines instead of straight diagonal lines.

Furthermore, it is important to discuss the definition of recurrences, because distances can be calculated using different norms [21]. Although the \(L_2\)-norm is used in most cases, the \(L_\infty \)-norm is sometimes preferred for relatively large time series with high computational demand [26].

Although traditional RPs only regard one trajectory, we can extend the concept in a way that allows us to study the dynamics of two trajectories in parallel [23]. A cross recurrence plot (CRP) shows all those times at which a state in one dynamical system occurs in a second dynamical system. In other words, the CRP reveals all the times when the trajectories of the first and second time series, \(x\) and \(y\), visit roughly the same area in the phase space. The data lengths, \(n\) and \(m\), of both systems can differ, leading to a non-square CRP matrix [22, 24].

$$\begin{aligned} CR_{i,j}^{x,y}(\varepsilon ) = \varTheta (\varepsilon - ||x_i - y_j||) \quad x_i, y_j \in \mathbb {R}^d, \, i=1 \ldots n, \, j=1 \ldots m \end{aligned}$$
(2)

For the creation of a CRP, both trajectories, \(x\) and \(y\), have to represent the same dynamical system with equal state variables, because they must lie in the same phase space. The application of CRPs to entirely different measurements, which are not observations of the same dynamical system, is rather problematic and requires careful data preprocessing [22].
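A cross recurrence plot as in Eq. 2 can be sketched analogously to a standard RP. The sketch below again assumes the Euclidean norm and \(\varTheta (0)=1\); the function name `cross_recurrence_plot` is hypothetical.

```python
import math

def cross_recurrence_plot(x, y, eps):
    """CR[i][j] = 1 if ||x_i - y_j|| <= eps, else 0; shape is len(x) by len(y)."""
    return [[1 if math.dist(xi, yj) <= eps else 0 for yj in y] for xi in x]

# Two bivariate series of different length in the same phase space.
x = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
y = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (0.0, 0.1)]
cr = cross_recurrence_plot(x, y, eps=0.5)
for row in cr:
    print(row)
```

The resulting matrix has 3 rows and 4 columns, which illustrates that a CRP need not be square. The entries (1, 0) and (2, 1) form a short diagonal line because both series pass through the states (1, 1) and (2, 2) in the same order.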

In order to test for simultaneously occurring recurrences in different systems, another multivariate extension of RPs was introduced [23]. A joint recurrence plot (JRP) shows all those times at which a recurrence in one dynamical system occurs simultaneously with a recurrence in a second dynamical system. In other words, the JRP is the Hadamard product of the RP of the first system and the RP of the second system. JRPs can be computed from more than two systems. The data length of the considered systems has to be the same [22, 24].

$$\begin{aligned} JR_{i,j}^{x,y}(\varepsilon ^{x},\varepsilon ^{y}) = \varTheta (\varepsilon ^{x} - ||x_i - x_j||) \cdot \varTheta (\varepsilon ^{y} - ||y_i - y_j||) \\ \qquad \qquad \quad x_i \in \mathbb {R}^{d1}, \;\;\; y_j \in \mathbb {R}^{d2}, \quad i,j=1 \ldots n \nonumber \end{aligned}$$
(3)

Such joint recurrence plots have the advantage that the individual measurements can represent different observables with different magnitudes or ranges. They are often used for the detection of phase synchronization [22, 24].
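The Hadamard-product view of Eq. 3 translates directly into code. The following sketch assumes the Euclidean norm and \(\varTheta (0)=1\); the sample values for speed and engine revolutions are made up for illustration only.

```python
import math

def recurrence_plot(x, eps):
    """Standard RP: 1 where ||x_i - x_j|| <= eps."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in x] for a in x]

def joint_recurrence_plot(x, y, eps_x, eps_y):
    """Element-wise (Hadamard) product of the RPs of x and y."""
    if len(x) != len(y):
        raise ValueError("JRPs require time series of equal length")
    rx, ry = recurrence_plot(x, eps_x), recurrence_plot(y, eps_y)
    return [[a * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(rx, ry)]

# Two observables with different magnitudes, each with its own threshold.
speed = [(50.0,), (80.0,), (51.0,), (79.0,)]          # made-up km/h values
revs = [(2000.0,), (3000.0,), (2900.0,), (3010.0,)]   # made-up rpm values
jr = joint_recurrence_plot(speed, revs, eps_x=2.0, eps_y=50.0)
for row in jr:
    print(row)
```

Times 0 and 2 recur in the speed signal but not in the revolution signal, so the corresponding entry of the JRP is 0; only times 1 and 3 recur jointly.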

Since this work aims at clustering test drives, which involves pairwise (dis)similarity comparisons of multivariate time series, we propose a combination of joint and cross recurrence plots, namely the joint cross recurrence plot (JCRP). A JCRP shows all those times at which a multivariate state in one dynamical system occurs simultaneously in a second dynamical system.

$$\begin{aligned} {\textit{JCR}}_{i,j}^{x,y}(\varepsilon ^1,\ldots ,\varepsilon ^k) = \varTheta (\varepsilon ^1 - ||x^1_i - y^1_j||) \cdot \ldots \cdot \varTheta (\varepsilon ^k - ||x^k_i - y^k_j||) \\ \qquad \qquad \quad x_i, y_j \in \mathbb {R}^{d}, \;\;\; i=1 \ldots n, \quad j=1 \ldots m \nonumber \end{aligned}$$
(4)

For the creation of a JCRP, both trajectories, \(x\) and \(y\), need to have the same dimensionality or number of parameters \(d\), but can have different lengths, \(n\) and \(m\). We shall see that JCRPs are very useful, because they enable us to compare two multivariate systems with the same set of observables that can have different magnitudes. In other words, the introduced JCR notation allows us to determine an \(\varepsilon \)-threshold for each individual parameter, which is advantageous for observables with different variance. A toy example for JCRPs is given in the following:

$$\begin{aligned} x&= \left\{ \begin{array}{l l} \text{ dfcghGATHERSPEEDlmknhDECELERATEghfkd } \\ \text{ rsqtpACCELERATORxywzvBRAKEPEDALtvswr } \end{array} \right. \\ y&= \left\{ \begin{array}{l l} \text{ kdhfSLOWDOWNglbkchdgfGATHERSPEEDnkml } \\ \text{ tpsBRAKEPEDALzrysxtwvACCELERATORxtwv } \end{array} \right. \end{aligned}$$

Assume two multivariate time series \(x\) and \(y\) which comprise the speed and accelerator signals recorded during different car drives. Both time series contain multivariate states, or rather string sequences, that occur in both systems. The corresponding JCRP of \(x\) and \(y\), as illustrated in Fig. 1, shows the times at which a multivariate state occurs simultaneously in both systems. Furthermore, the diagonal line structure in Fig. 1 reveals that both trajectories run through a similar region in phase space for a certain time interval. In other words, both systems contain the same multivariate pattern, which represents the driver pressing the ‘ACCELERATOR’ pedal while the vehicle simultaneously gathers speed (‘GATHERSPEED’). In Sect. 4 we discuss how to interpret single recurrence points and diagonal line structures, and explain how to use them to define a distance measure for time series with certain distortions or invariance.
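The toy example can be reproduced with a small sketch of Eq. 4. Two choices are assumptions made only for this illustration: characters are mapped to integer code points via `ord`, and each dimension is compared by absolute difference with \(\varTheta (0)=1\), so that \(\varepsilon =0\) means exact equality. The helper `longest_diagonal` is hypothetical and indices are 0-based.

```python
def jcrp(x, y, eps):
    """JCR[i][j] = 1 if every dimension d satisfies |x_i^d - y_j^d| <= eps[d]."""
    return [[1 if all(abs(a - b) <= e for a, b, e in zip(xi, yj, eps)) else 0
             for yj in y] for xi in x]

def longest_diagonal(r):
    """Return (length, i, j) of the longest diagonal line and its start cell."""
    n, m = len(r), len(r[0])
    best = (0, -1, -1)
    for i in range(n):
        for j in range(m):
            # Only start counting at the first cell of a diagonal line.
            if r[i][j] and (i == 0 or j == 0 or not r[i - 1][j - 1]):
                l = 0
                while i + l < n and j + l < m and r[i + l][j + l]:
                    l += 1
                if l > best[0]:
                    best = (l, i, j)
    return best

x_speed = "dfcghGATHERSPEEDlmknhDECELERATEghfkd"
x_pedal = "rsqtpACCELERATORxywzvBRAKEPEDALtvswr"
y_speed = "kdhfSLOWDOWNglbkchdgfGATHERSPEEDnkml"
y_pedal = "tpsBRAKEPEDALzrysxtwvACCELERATORxtwv"

x = [(ord(a), ord(b)) for a, b in zip(x_speed, x_pedal)]
y = [(ord(a), ord(b)) for a, b in zip(y_speed, y_pedal)]

r = jcrp(x, y, eps=(0, 0))
print(longest_diagonal(r))  # (11, 5, 21)
```

The longest diagonal has length 11 and starts at position 5 in \(x\) and position 21 in \(y\), which are exactly the positions where ‘GATHERSPEED’ and ‘ACCELERATOR’ co-occur in both drives.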

Fig. 1 Joint cross recurrence plot (JCRP) of sample drive \(x\) and \(y\) from our toy example, with \(\varepsilon =0\)

4 Recurrence Quantification

Recurrence quantification analysis (RQA) is used to quantify the structures observed in recurrence plots [22]. RQA is grounded in theory, but also possesses statistical utility in dissecting and diagnosing nonlinear dynamical systems across multiple fields of science [28]. The explicit mathematical definition of distinct features in recurrence plots enables us to analyze signals that are multivariate, nonlinear, non-stationary and noisy.

The global (large-scale) appearance of a RP can give hints on stationarity and regularity, whereas local (small-scale) patterns are related to dynamical properties, such as determinism [28]. Recent studies have shown that determinism, the percentage of recurrence points that form lines parallel to the main diagonal, reflects the predictability of a dynamical system [22].

Given a recurrence matrix \(R\) with \(N\,\times \,N\) entries generated by any of the introduced recurrence plot variations, such as our proposed JCRP, we can compute the determinism DET(\(\varepsilon ,l_{min}\)) for a predefined \(\varepsilon \)-threshold and a minimum diagonal line length \(l_{min}\) as follows [22, 24]:

$$\begin{aligned}&DET(\varepsilon ,l_{min}) = \frac{\sum _{l=l_{min}}^{N} l \cdot P(\varepsilon ,l)}{\sum _{i,j=1}^{N} R_{i,j}(\varepsilon )} \nonumber \\&P(\varepsilon ,l) = \sum _{i,j=1}^{N} \bigg \{\left( 1- R_{i-1,j-1} \left( \varepsilon \right) \right) \cdot \left( 1- R_{i+l,j+l} \left( \varepsilon \right) \right) \cdot \prod _{k=0}^{l-1} R_{i+k,j+k} \left( \varepsilon \right) \bigg \}\ \end{aligned}$$
(5)

where \(P(\varepsilon ,l)\) is the histogram of diagonal lines of length \(l\) with respect to a certain \(\varepsilon \) neighborhood.
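Eq. 5 can be sketched by first collecting the histogram of maximal diagonal lines and then relating the points on sufficiently long lines to all recurrence points. The sketch makes two assumptions that the formula leaves implicit: entries outside the matrix are treated as 0, and DET is set to 0 for a plot without any recurrence points. It also accepts non-square matrices, as produced by CRPs and JCRPs.

```python
def diagonal_histogram(r):
    """Return {l: count} of maximal diagonal lines of length l in matrix r."""
    n, m = len(r), len(r[0])
    hist = {}
    for i in range(n):
        for j in range(m):
            # A line starts where the upper-left neighbor is absent or 0.
            if r[i][j] and (i == 0 or j == 0 or not r[i - 1][j - 1]):
                l = 0
                while i + l < n and j + l < m and r[i + l][j + l]:
                    l += 1
                hist[l] = hist.get(l, 0) + 1
    return hist

def determinism(r, l_min):
    """Share of recurrence points that lie on diagonals of length >= l_min."""
    total = sum(map(sum, r))
    if total == 0:
        return 0.0  # assumption: no recurrence points means no determinism
    hist = diagonal_histogram(r)
    return sum(l * count for l, count in hist.items() if l >= l_min) / total

r = [[1, 0, 0, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0],
     [1, 0, 0, 0]]
print(diagonal_histogram(r))     # {3: 1, 1: 2}
print(determinism(r, l_min=2))   # 0.6
```

In this small matrix, three of the five recurrence points lie on one diagonal of length 3, and the remaining two are isolated, which gives DET = 3/5.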

In general, processes with chaotic behavior cause no or only short diagonals, whereas deterministic processes cause relatively long diagonals and fewer single, isolated recurrence points [22, 29]. With respect to JCRPs, diagonal lines usually occur when the trajectories of two multivariate time series segments are similar according to a certain threshold. Since we aim to measure the similarity between time series that contain segments of similar trajectories at arbitrary positions, which in turn cause diagonal line structures, we propose to use determinism as a similarity measure. According to the introduced JCRP approach, a high DET value indicates high similarity, i.e. a high percentage of multivariate segments with similar trajectories, whereas a relatively low DET value suggests dissimilarity, i.e. the absence of similar multivariate patterns.

However, data preprocessing such as smoothing can introduce spurious line structures in a recurrence plot that cause a high determinism value. In this case, further criteria such as the directionality of the trajectory should be considered to determine the determinism of a dynamical system, e.g. by using iso-directional and perpendicular RPs [22, 24, 26]. In contrast to traditional recurrence plots, perpendicular recurrence plots (PRPs) consider the dynamical evolution of only the neighborhoods in the direction perpendicular to each phase flow, resulting in plots with lines of similar width that do not spread out in various directions. Removing spurious widths makes it more reasonable to define line-based quantification measures, such as divergence and determinism [30]. Another solution is to estimate the entropy by looking at the distribution of the diagonal lines [26]. The entropy is based on the probability \(p(\varepsilon ,l)\) that diagonal line structures with a certain length \(l\) and similarity \(\varepsilon \) occur in the recurrence matrix [22, 24].

Recurrence plots (RPs) and corresponding recurrence quantification analysis (RQA) measures have been used to detect transitions and temporal deviations in the dynamics of time series. Since detected variations in RQA measures can easily be misinterpreted, Marwan et al. [25] have proposed to calculate a confidence level to study significant changes. They formulated the hypothesis that the dynamics of a system do not change over time, and therefore the RQA measures obtained by the sliding window technique will be normally distributed. Consequently, if the RQA measures fall outside a predefined interquantile range, an observation can be considered significant. Detecting changes in dynamics by means of RQA measures obtained from a sliding window has proven useful in real-life applications such as comparing traffic flow time series under fine and adverse weather conditions [29].

Since recurrence plot-based techniques are still a rather young field in nonlinear time series analysis, systematic research is necessary to define reliable criteria for the selection of parameters and the estimation of RQA measures [26].

5 Recurrence Plot-Based Distance

Based on our formalization of joint cross recurrence (JCR) in Eq. 4 and the definition of determinism (DET) in Eq. 5, we define our RecuRRence Plot-based (RRR) distance measure as follows:

$$\begin{aligned} {\textit{RRR}}(\varepsilon ,l_{min}) = 1 - {\textit{DET}}(\varepsilon ,l_{min}) \end{aligned}$$
(6)

Since the DET value ranges from \(0\) to \(1\), depending on the proportion of diagonal line structures found in a JCR plot, the RRR distance is \(0\) if the trajectory of both dynamical systems is identical and \(1\) if there are no similar patterns at any position in time.
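Putting Eqs. 4 to 6 together, the RRR distance between two multivariate time series can be sketched as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: each dimension is compared by absolute difference with \(\varTheta (0)=1\), entries outside the plot count as 0, and DET is taken to be 0 when the plot contains no recurrence points.

```python
def jcrp(x, y, eps):
    """Joint cross recurrence plot with one threshold per dimension."""
    return [[1 if all(abs(a - b) <= e for a, b, e in zip(xi, yj, eps)) else 0
             for yj in y] for xi in x]

def determinism(r, l_min):
    """Share of recurrence points on diagonal lines of length >= l_min."""
    n, m = len(r), len(r[0])
    total = sum(map(sum, r))
    if total == 0:
        return 0.0  # assumption: an empty plot has DET = 0
    on_lines = 0
    for i in range(n):
        for j in range(m):
            if r[i][j] and (i == 0 or j == 0 or not r[i - 1][j - 1]):
                l = 0
                while i + l < n and j + l < m and r[i + l][j + l]:
                    l += 1
                if l >= l_min:
                    on_lines += l
    return on_lines / total

def rrr_distance(x, y, eps, l_min):
    """RRR = 1 - DET, computed on the JCRP of x and y."""
    return 1.0 - determinism(jcrp(x, y, eps), l_min)

x = [(1.0,), (2.0,), (3.0,), (9.0,)]
y = [(1.0,), (2.0,), (3.0,), (5.0,), (9.0,)]
print(rrr_distance(x, y, eps=(0.1,), l_min=2))          # 0.25
print(rrr_distance(x, x, eps=(0.1,), l_min=2))          # 0.0
print(rrr_distance(x, [(100.0,)], eps=(0.1,), l_min=2)) # 1.0
```

In the first call, three of the four recurrence points form a diagonal of length 3 and one point is isolated, so DET = 0.75 and RRR = 0.25. The two further calls show the boundary cases of identical trajectories and of no shared states.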

Although our proposed RRR distance measure can be used as a subroutine for various time series mining tasks, this work primarily focuses on clustering. Our aim is to group a set of \(t\) unlabeled time series \(T\) into \(k\) clusters \(C\) with centroids \(Z\). In order to evaluate the performance of the time series clustering with respect to our RRR distance, we suggest quantifying the number of similar patterns that recur within the established clusters. Therefore, we define the following cluster validation index:

$$\begin{aligned} E(k) = \frac{1}{t-k} \sum _{z \in \{Z\}} \sum _{c \in \{C_z \setminus z\}} RRR(z,c) \end{aligned}$$
(7)

According to our problem setting, the more patterns occur jointly when comparing each centroid \(z \in \{Z\}\) with all objects \(c \in \{C_z \setminus z\}\) of the corresponding cluster, the lower \(E\), the better our clustering, and the more characteristic are the corresponding prototypes.
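Given a precomputed matrix of pairwise RRR distances and a clustering, index \(E\) from Eq. 7 reduces to an average over centroid-to-member distances. The sketch below assumes that the clustering is stored as a mapping from each centroid index to the list of its members (including the centroid itself); this data layout and the name `index_e` are assumptions made for illustration.

```python
def index_e(dist, clusters):
    """Average distance between each centroid and the other cluster members.

    dist:     matrix with dist[z][c] = RRR(z, c)
    clusters: dict mapping centroid index -> list of member indices
    """
    t = sum(len(members) for members in clusters.values())
    k = len(clusters)
    if t == k:
        raise ValueError("E is undefined when every object is its own centroid")
    total = sum(dist[z][c]
                for z, members in clusters.items()
                for c in members if c != z)
    return total / (t - k)

# Made-up distances for four time series grouped into two clusters.
dist = [[0.0, 0.2, 0.9, 0.8],
        [0.2, 0.0, 0.7, 0.9],
        [0.9, 0.7, 0.0, 0.4],
        [0.8, 0.9, 0.4, 0.0]]
clusters = {0: [0, 1], 2: [2, 3]}
print(index_e(dist, clusters))
```

Here only the two within-cluster distances 0.2 and 0.4 enter the sum, so \(E\) is approximately 0.3.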

Furthermore we are going to evaluate the clustering of time series according to the index \(I\) [31], whose value is maximized for the optimal number of clusters:

$$\begin{aligned} I(k) = \left( \frac{1}{k} \cdot \frac{E(1)}{E(k)} \cdot D_k\right) ^p \end{aligned}$$
(8)

The index \(I\) is a composition of three factors [31], namely \(1/k\), \(E(1)/E(k)\), and \(D_k\). The first factor will try to reduce index \(I\) as the number of clusters \(k\) increases. The second factor consists of the ratio of \(E(1)\), which is constant for a given dataset, and \(E(k)\), which decreases with increase in \(k\). Consequently, index \(I\) increases as \(E(k)\) decreases, encouraging more clusters that are compact in nature. Finally, the third factor, \(D_k\) (which measures the maximum separation between two clusters over all possible pairs of clusters), will increase with the value of \(k\), but is bounded by the maximum separation between two points in the dataset.

$$\begin{aligned} D_k = \max _{i,j=1,\ldots ,k} ||z_i - z_j|| \end{aligned}$$
(9)

Thus, the three factors compete with and balance each other. The power \(p\) is used to control the contrast between the different cluster configurations. Previous work [31] suggests choosing \(p=2\).
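The interplay of the three factors in Eqs. 8 and 9 can be made concrete with a short sketch. One point is an assumption rather than something stated above: since the centroids are time series of possibly different length, the separation \(||z_i - z_j||\) is read here as the pairwise distance between the centroids taken from a precomputed distance matrix. The function name `index_i` and all numeric values are hypothetical.

```python
def index_i(e_1, e_k, medoids, dist, p=2):
    """Index I = ((1/k) * (E(1)/E(k)) * D_k) ** p.

    e_1, e_k: values of index E for one cluster and for k clusters
    medoids:  indices of the k centroids
    dist:     precomputed pairwise distance matrix (assumed separation measure)
    """
    k = len(medoids)
    # D_k: maximum separation over all pairs of centroids.
    d_k = max(dist[a][b] for a in medoids for b in medoids)
    return ((1.0 / k) * (e_1 / e_k) * d_k) ** p

# Made-up values: two centroids (objects 0 and 2) with separation 0.9.
dist = [[0.0, 0.2, 0.9],
        [0.2, 0.0, 0.7],
        [0.9, 0.7, 0.0]]
print(index_i(e_1=0.8, e_k=0.4, medoids=[0, 2], dist=dist))
```

With these values the three factors are 1/2, 2 and 0.9, so the index evaluates to approximately 0.81 for \(p=2\). Evaluating the function for a range of \(k\) and picking the maximum yields the suggested number of clusters.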

The index \(I\) has been found to be consistent and reliable, irrespective of the underlying clustering technique and data dimensionality, and furthermore has been shown to outperform the Dunn and Davies-Bouldin indices [31].

6 Evaluation

The goal of our evaluation is to assess how well the RRR distance is suited for: (i) clustering time series that contain similar trajectories at arbitrary positions (in Sect. 6.1), and (ii) identifying prototypical time series that cover as many recurring patterns as possible (in Sect. 6.2).

Fig. 2 Univariate synthetic time series with artificially implanted patterns (red color) at arbitrary positions, where each time series belongs to one of three groups (Wave, YoYo, and Peak)

Fig. 3 Cluster tree (dendrogram) of univariate synthetic time series (introduced in Fig. 2) according to the DTW and RRR distance. The x-axis shows the distance between the time series being merged and the y-axis shows the corresponding name and shape of the time series

6.1 Synthetic Data

This controlled experiment aims at visualizing the clustering results of the proposed RRR distance measure compared to the DTW distance.

We generated a labeled dataset, which consists of nine time series from three different categories, called Wave, YoYo and Peak. Each category comprises three time series characterized by multiple occurrence of the same artificial patterns at arbitrary positions. The dataset consists of univariate time series of equal length, as shown in Fig. 2. To visualize the clustering results of the RRR and DTW distance, we applied agglomerative hierarchical clustering with complete linkage on the synthetic dataset.
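Agglomerative clustering with complete linkage only needs a matrix of pairwise distances, so it works with RRR and DTW alike. The following is a minimal, unoptimized sketch of the procedure; the function name `complete_linkage`, the stopping criterion by number of clusters, and the toy data are assumptions for illustration.

```python
def complete_linkage(dist, n_clusters):
    """Agglomerative clustering with complete linkage on a distance matrix.

    Returns the final clusters and the list of merges as
    (cluster_a, cluster_b, merge_distance) tuples.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance of the two farthest members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges

# Five objects on a line; the last one is far away from the rest.
points = [0, 1, 5, 6, 20]
dist = [[abs(p - q) for q in points] for p in points]
clusters, merges = complete_linkage(dist, n_clusters=2)
print(clusters)                   # [[0, 1, 2, 3], [4]]
print([d for _, _, d in merges])  # [1, 1, 6]
```

The recorded merge distances correspond to the heights at which branches join in a dendrogram.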

Figure 3 illustrates the generated hierarchical cluster trees for both examined distance measures on the synthetic time series. The first observation is that RRR perfectly recovers the cluster structure provided by the ground truth, given our knowledge that there are three categories. In contrast, the DTW distance fails and assigns time series of different categories to the same cluster at an early stage. The second observation is that RRR is able to recover the ground truth even if a large portion of the time series is noisy. The DTW distance, however, groups time series into the same cluster if they have a globally similar shape. Therefore, the noisy parts of the time series supersede or superimpose the relevant recurring patterns.

6.2 Real-Life Data

This experiment aims at assessing the time series prototypes identified by the proposed RRR distance measure compared to the DTW distance.

For our evaluation we consider the VW DRIVE dataset, which consists of 124 real-life test drives recorded by one vehicle operated by seven different individuals. Test drives are represented as multivariate time series of varying length and comprise vehicular sensor data of the same observed measurements. Since we aim to identify operational profiles that characterize recurring driving behavior, we exclusively consider accelerator, speed, and revolution measurements, which are more or less directly influenced by the driver. The complete VW DRIVE dataset contains various other measurements, such as airflow and engine temperature, and can be obtained by mailing the first author of this paper.

To measure the (dis)similarity of the VW DRIVE time series using our proposed RRR distance, we first need to determine the optimal similarity threshold \(\varepsilon \) and pattern length \(l_{min}\) for each of the considered measurements, such that a considerable amount of the recurring patterns is preserved.

Figure 4 shows the determinism value for the accelerator, speed, and revolution signals with regard to different parameter settings. We can observe that for all considered signals the DET value decreases with increasing pattern length \(l_{min}\) and decreasing similarity threshold \(\varepsilon \). Furthermore, Fig. 4 reveals that the speed signal is highly deterministic, meaning that the same patterns occur frequently, whereas the accelerator and revolution signals are less predictable and show more chaotic behavior.

Fig. 4 Determinism (DET) value for changing similarity threshold \(\varepsilon \) and minimum diagonal line length \(l_{min}\) for accelerator, speed and revolution signal; based on the cross recurrence plots (CRPs) of 10 randomly selected pairs of tours from our DRIVE dataset. Note that the DET was averaged

Since we aim to analyze all signals jointly by means of the proposed joint cross recurrence plot (JCRP) approach, we have to choose a pattern length or rather minimum diagonal line length \(l_{min}\) that is suitable for all signals. In general, we are looking for relatively long patterns with high similarity. In other words, we aim to find a parameter setting with preferably large \(l_{min}\) and small \(\varepsilon \) which results in a DET value that is above a certain threshold. To preserve the underlying characteristics or rather recurring patterns contained in examined data, at least 20 % of the recurrence points should form diagonal line structures, which corresponds to DET \(\ge 0.2\). Based on this criterion we choose \(l_{min}=5\) and \(\varepsilon =14/2/40\) for the accelerator, speed, and revolution signal respectively. Note that the individual signals were not normalized, wherefore the \(\varepsilon \)-threshold represents the accelerator pedal angle, kilometers per hour, and rotations per minute.

To identify prototypical time series using RRR and DTW distance respectively, we applied \(k\)-medoids clustering with random initialization. For evaluation purposes we computed index \(I\) and \(E\) for a varying number of \(k\) prototypes. The results of index \(I\) were normalized in a way that the highest value, which indicates the optimal number of clusters, equals one. Since index \(E\) is an average of RRR values (see Eq. 7) and RRR \(= 1 - DET\), the lower \(E\), the higher the average DET value, and the more recurring (driving behavior) patterns are covered by the prototypes identified by the respective distance measure.
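The link between index \(E\) and the average determinism follows directly from Eqs. 6 and 7. As a short worked step, let \(DET(z,c)\) denote the determinism of the JCRP of centroid \(z\) and cluster member \(c\), and let \(\overline{DET}\) denote its mean over all \(t-k\) centroid-member pairs (both symbols are introduced here only for illustration):

$$\begin{aligned} E(k) = \frac{1}{t-k} \sum _{z \in \{Z\}} \sum _{c \in \{C_z \setminus z\}} \left( 1 - DET(z,c) \right) = 1 - \overline{DET} \end{aligned}$$

For instance, an index value of \(E = 0.52\) corresponds to an average determinism of \(0.48\).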

Fig. 5 Evaluation of RRR and DTW distance for clustering (a) univariate and (b) multivariate time series of our DRIVE dataset. We compare the index \(E\) for the number of clusters \(k\) where the (normalized) index \(I\) reaches its maximum. The results are based on 1,000 runs of \(k\)-medoids clustering with random initialization

Figure 5 shows the empirical results for clustering univariate and multivariate time series of the VW DRIVE dataset using RRR and DTW distance respectively. Since the VW DRIVE dataset consists of ‘only’ 124 test drives recorded by one and the same vehicle, the optimal number of clusters for both RRR and DTW distance is rather small. However, the proposed RRR distance is able to find cluster configurations with lower index \(E\) values, i.e. prototypes with a higher number of recurring patterns, than the DTW distance. In the case of univariate time series (a), in particular speed measurements, RRR and DTW achieved an index \(E\) value of around \(0.52\) and \(0.65\) for the optimal number of clusters, which corresponds to a determinism value of \(0.48\) and \(0.35\) respectively. In the multivariate case (b), RRR and DTW reached an index \(E\) value of around \(0.74\) and \(0.84\) for the optimal number of clusters, which corresponds to a determinism value of \(0.26\) and \(0.16\) respectively. As might be expected, the results for the univariate time series are better than for the multivariate case, because the search space expands and the probability of recurring patterns decreases with an increasing number of dimensions or measurements. In both cases, however, our RRR distance performs about \(10\,\%\) better than the compared DTW distance, meaning that the identified prototypes contain \(10\,\%\) more recurring (driving behavior) patterns.

7 Conclusion

This work is a first attempt to solve time series clustering with nonlinear data analysis and modeling techniques commonly used by theoretical physicists. We adopted recurrence plots (RPs) and recurrence quantification analysis (RQA) to measure the (dis)similarity of multivariate time series that contain segments of similar trajectories at arbitrary positions and in different order.

More precisely, we introduced the concept of joint cross recurrence plots (JCRPs), a multivariate extension of traditional RPs, to visualize and investigate recurring patterns in pairwise compared time series. Furthermore, we defined a recurrence plot-based (RRR) distance measure to cluster time series with order invariance.

The proposed RRR distance was evaluated on both synthetic and real-life time series, and compared with the DTW distance. Our evaluation on synthetic data demonstrates that the RRR distance is able to establish cluster centers that preserve the characteristics of the time series. The results on real-life vehicular data show that, in terms of our cost function, RRR performs about \(10\,\%\) better than DTW, meaning that the determined prototypes contain \(10\,\%\) more recurring driving behavior patterns.

Worthwhile future work includes (1) the investigation of RQA measures which quantify recurring patterns with uniform scaling, (2) the application of speed-up techniques for RP computations, and (3) the formalization/analysis of an RP-based distance metric.