1 Introduction

The delay coordinate mapping, or sliding window embedding (Takens 1981; Nolte 2010; Kantz and Schreiber 2004; Das and Giannakis 2017), posits a time series as a sequence of observations made along trajectories in a hidden state space. Under this scheme, a one dimensional time series, which could otherwise be analyzed with more traditional linear analysis techniques such as ARMA and Fourier/Wavelet analysis, is instead turned into a geometric object via a vector of samples of the time series, which moves along the signal (Eq. 1). The shape of this geometric object provides information about the system under study. Periodic processes, for example, map to points which concentrate on a topological loop. Sliding window embeddings have been used in this context, for example, to analyze ECG signals of a beating heart (Stam 2005; Plesnik et al. 2011), to detect chatter in mechanical systems (Khasawneh and Munch 2016), to quantify repetitive motions in human activities (Frank et al. 2010; Venkataraman et al. 2016), to discover periodicity in gene expression during circadian rhythms (Perea et al. 2015), and to detect wheezing in audio signals (Emrani et al. 2014). In addition to loops, torus shapes often show up during “quasiperiodicity,” which is a state of near-chaos. Sliding window embeddings have witnessed this torus shape in such applications as vocal fold anomalies (Herzel et al. 1994), horse whinnies (Briefer et al. 2015), neural networks (Morrison et al. 2016), and oscillating cylinder flow (Glaz et al. 2017). Certain time series even concentrate on fractals after a sliding window embedding (Takens 1981; de Silva et al. 2012). Sliding window embeddings have also been used as a tool for shape analysis more generally even when an underlying model for the dynamics is unknown, such as in music structure analysis (Bello 2011; Serra et al. 2009). We direct the interested reader to Perea (2019) for a recent review on how topological data analysis can be used in the analysis of time delay embeddings.

The main theory motivating the use of sliding window embeddings in all of these applications is Takens’ delay embedding (Takens 1981) theorem, which is stated as follows:

Theorem

(Takens’ embedding theorem Takens 1981) Let M be a compact manifold of dimension m. Suppose X is a smooth vector field with flow \(\psi _t: M \rightarrow M\) and G is a smooth function on M. For \(\tau >0\), \(N \ge 2m\), and pairs (XG) it is a generic property that \(\Psi ^N_\tau : M \rightarrow \mathbb {R}^{N+1}\) defined by

$$\begin{aligned} \Psi ^N_\tau (p) = (G(p), G(\psi _\tau (p)), G(\psi _{2\tau }(p)), \ldots , G(\psi _{N\tau }(p))) \end{aligned}$$

is an embedding.

A “random” choice of X and G makes the delay coordinate mapping \(\Psi ^N_\tau \) a smooth embedding. Thus, remarkably, the state space M of a dynamical system may in general be reconstructed from a single generic observation function G,Footnote 1 which gives rise to a 1D time series. However, in practice, Takens’ result is ill-suited for computational purposes because it does not provide an explicit characterization of “genericity”.

In this work, we extend Takens’ embedding theory with a geometric characterization of observations which yield high-dimensional delay coordinate embeddings, given a particular flow on a manifold. Our main theoretical result for general compact manifolds is stated in Theorem 4.1 in Sect. 4, as follows:

Theorem

The Takens map \(\Psi _{\tau }^{N}\) is an embedding for some dimension \(N>0\) and flow time \(\tau >0\), if the following conditions hold:

  1. 1.

    For any point of \(p\in M\) there is an m-tuple \(J\in \mathbb {Z}_{\ge 0}^m\) of nonnegative integers such that the m-form

    $$\begin{aligned} \mathcal {L}_{X}^{\wedge J}dG := \bigwedge _{j\in J}\mathcal {L}_X^jdG \end{aligned}$$

    is nonzero at some point on the integral curve \(\gamma _p(s)\). Here, \({\mathcal {L}}_X^j\) denotes the \(j^{ th }\)-order Lie derivative.

  2. 2.

    For any pair of distinct points \(p, q\in M\) the observation curves \(g_p(s)\) and \(g_q(s)\) are not identical.

We first provide several examples in Sect. 3 which satisfy the conditions of our theorem. In the process, we discuss a non-example that violates condition 1 if we’re not careful (Example 3.3) and show another non-example which violates condition 2 (Example 3.2, part 2). We then prove our theorem in Sect. 4, and we explore a special case in Sect. 5 in which Fourier bases can be used to construct observation functions.Footnote 2

2 Background

In this section, we provide a more detailed overview of several concepts utilized in this work, including sliding window embeddings, persistent (co)homology, and Eilenberg–MacLane coordinates. The latter two tools will be used to empirically validate that our sliding window embeddings recover our chosen state space and the underlying dynamics.

2.1 Sliding window embeddings

We express a time series g(t) as an observation G along a dense trajectory \(\gamma \) on a manifold M, i.e.

$$\begin{aligned} g(t) = G(\gamma (t)) \end{aligned}$$

for \(\gamma : \mathbb {R}\longrightarrow M\) and \(G: M \longrightarrow \mathbb {R}\). We compute the sliding window of g as

$$\begin{aligned} {{\,\mathrm{SW}\,}}^N_\tau g(t) := \left[ \begin{array}{c} g(t) \\ g(t + \tau ) \\ g(t + 2 \tau ) \\ \vdots \\ g(t + N\tau ) \end{array} \right] \in \mathbb {R}^{N+1} \end{aligned}$$
(1)

where \(N\in \mathbb {N}\) is the number of delays, \(\tau > 0\) is the delay time, and \(N \tau \) is the window length.

We interpret the sliding window \({{\,\mathrm{SW}\,}}^N_\tau g(t)\) as the evaluation of the Takens map \(\Psi ^N_\tau \) in Theorem 1 above on an integral curve \(\psi _t(p)\) of a vector field X through a point \(p\in M\). For if \(N=2\cdot \dim {M}\) and \(\gamma (t) = \psi _t(p)\), then

$$\begin{aligned} {{\,\mathrm{SW}\,}}_\tau ^Ng(t) = \Psi ^N_\tau (\psi _t(p)). \end{aligned}$$

For sufficiently large N and small \(\tau \), \({{\,\mathrm{SW}\,}}_\tau ^Ng(t)\) densely “traces” the embedding \(\Psi ^N_\tau (M)\) for appropriate choice of observation G and vector field X.

When g is a periodic function with frequency \(\omega \in \mathbb {R}\), it readily follows that the sliding window embedding \({{\,\mathrm{SW}\,}}^N_\tau g(t)\) traces a closed curve in \(\mathbb {R}^{N+1}\). The shape of this curve is closely related to the choice of parameters N and \(\tau \), and their relation to \(\omega \) (Perea and Harer 2015). In particular, if \(\tau \) and N are chosen so that N is large enough and \(N\tau \omega \approx 1\), then the image of \({{\,\mathrm{SW}\,}}^N_\tau g\) is in fact a topological circle in \(\mathbb {R}^{N+1}\), whose shape is tightly controlled by the Fourier coefficients of g. In other words, the periodic nature of g—a spectral property—is reflected in the circularity of its sliding window, a topological feature. Quasiperiodicity is another spectral notion with a clear geometric/topological counterpart. Indeed, let \(1,\omega _1, \ldots , \omega _n\in \mathbb {R}\) be linearly independent over the rational numbers. We say that \(f: \mathbb {R}\longrightarrow \mathbb {R}\) is quasiperiodic with frequencies \(\omega _1,\ldots , \omega _n\), if it can be written as \(f(t) = F(t, \ldots , t)\) for some function \(F: \mathbb {R}^n \longrightarrow \mathbb {R}\) whose j-th marginals \(f_j(t) = F(t_1 ,\ldots , t_{j-1}, t , t_{j+1},\ldots , t_n)\) are periodic with frequency \(\omega _j\). In this case, and for appropriate N and \(\tau \), the set \({{\,\mathrm{SW}\,}}_\tau ^N f(\mathbb {Z})\) is dense in an n-dimensional torus embedded in \(\mathbb {R}^{N+1}\) (Perea 2016; Gakhar and Perea 2018).

2.2 Koopman spectra

We now review another relevant tool that goes along with sliding window embeddings. For positive flow time \(t>0\), the flow \(\psi _t\) of a vector field X on a compact manifold M defines a diffeomorphism \(\psi _t: M \rightarrow M\). Then the composition map \(U^t\), or Koopman operator (Koopman 1931; Das and Giannakis 2017; Mezić 2013) given by

$$\begin{aligned} U^tG = G \circ \psi _t, \end{aligned}$$

is a linear operator on the space of observation functions on M. The coordinates of the delay mapping are thus iterated applications of \(U^t\) on an observation G.

For certain classes of dynamical systems, the Koopman operator possesses a discrete spectrum and yields a linear expansion

$$\begin{aligned} G = \sum _{k=0}^\infty G_k\phi _k \end{aligned}$$

where \(\phi _k\) are eigenfunctions of \(U^t\) and \(G_k\) are Koopman modes. For such systems one “lifts” the dynamics on the state space to an evolution of observables. For a more comprehensive overview of Koopman theory and its applications, please refer to Arbabi (2018).

We will see in Sect. 5 that a high-dimensional delay mapping essentially recovers the Koopman modes of an observation function. We therefore characterize delay embedding observations in terms of spectral decomposition properties. We examine a special case with a Fourier basis for the Koopman operator on the Torus and Klein bottle, and show via our main Theorem 4.1 what is needed of these coefficients.

2.3 Persistent homology

In practice we evaluate the sliding window \({{\,\mathrm{SW}\,}}^N_\tau g(t)\) at a finite set of evenly sampled time points \(t_1< \cdots < t_J\). This results in a discrete collection of J vectors, referred to as a “sliding window point cloud”. The topology of a point cloud with J points is trivial; it consists of J connected components and lacks any other topological features (loops, voids, etc). However, if we use a simplicial complex (a discrete object) to approximate the underlying space from which the point cloud is sampled, then we can estimate the underlying topology via combinatorial means. A simplicial complex on a set V of vertices (e.g. a sliding window point cloud) is a collection K of nonempty subsets \(\sigma \subset V\), so that if \(\emptyset \ne \tau \subset \sigma \in K\), then \(\tau \in K\). As an example, suppose we seek a simplicial complex with topology reflecting that of the unit circle \(S^1\). Starting with the set \(V = \{a, b, c\}\) of vertices, we let \(K = \{a, b, c, \{a, b\}, \{b, c\}, \{a, c\} \}\) be the simplicial complex containing 3 edges between every pair of vertices. Like \(S^1\), K has one connected component, one loop which bounds an empty space, and no higher dimensional features (voids, etc).

So far, our description of simplicial complexes has been purely combinatorial/topological, but one can use geometry to inform their construction. An early scheme in Euclidean space is the alpha complex (Edelsbrunner and Mücke 1994), constructed as a family of subcomplexes of Delaunay triangulations at different scales. An even simpler construction, which works in any metric space, is the so-called “Vietoris-Rips” complex at scale \(\alpha \ge 0\), denoted \(R_\alpha (V)\). It is comprised of the finite subsets of V which have diameter less than \(\alpha \). Choosing the “appropriate” scale is ill-posed. For instance, Fig. 1 shows a point cloud in \({\mathbb {R}}^2\) for which it is impossible to choose an appropriate scale at which the simplicial complex contains the two empty loops that are present in the original shape.

Fig. 1
figure 1

An example of the Rips filtration on a point cloud sampled from a thickened figure eight. The Rips complex is shown at different scales, the values of \(\alpha \), producing the persistence diagram on the right

Specifically, \(R_{0.16}(V)\) contains the upper loop, but not the lower loop, and \(R_{0.24}(V)\) contains the lower loop, but the upper loop is no longer empty. In fact, it is impossible to choose an \(\alpha \) in which both loops are present and empty in \(R_{\alpha }(V)\) in this example. However, we can still summarize the multiscale topological information of any point cloud by performing a filtration of the complex. That is, we evaluate \(R_{\alpha }(V)\) as \(\alpha \) varies continuously from 0 to some maximum value, so that \(R_{\alpha _1}(V) \subset R_{\alpha _2}(V)\) if \(\alpha _1 \le \alpha _2\). Throughout this process we keep track of topological features as they appear, or are “born,” and as they are filled in, or “die”. For each such homology class, we can produce a point in a scatter plot, known as the persistence diagram of the filtration, with birth time on the x-axis and death time on the y-axis. Figure 1 shows a persistence diagram associated with our running example.Footnote 3 Intuitively, points further from the diagonal correspond to larger topological features which “persist” (stay alive) over longer intervals, and points closer to the diagram correspond to small, “noisy” features which are often artifacts of sampling (e.g. the square and pentagon loop that exist at \(\alpha = 0.12\)).

For completeness, we extend the above explanation with a brief rigorous presentation. For a more comprehensive treatment, please refer to Edelsbrunner and Harer (2008), Edelsbrunner and Harer (2010), Carlsson (2009), Ghrist (2014) and Perea (2018a). Let \((\Gamma , \preceq )\) be a partially ordered set. A \(\Gamma \)-filtered simplicial complex is a collection \({\mathcal {K}} = \{K_\alpha \}_{\alpha \in \Gamma }\) of simplicial complexes, so that \(K_\alpha \subset K_{\alpha '}\) for every \(\alpha \preceq \alpha ' \in \Gamma \). The typical examples for point cloud data are the Rips filtration, as we mentioned, and the Čech filtration motivated by the nerve lemma (Hatcher 2002). Specifically, let L be a finite subset of a metric space \((\mathbb {M},{\mathbf {d}})\). The Rips filtration of L is the \(\mathbb {R}\)-filtered simplicial complex \({\mathcal {R}}(L) = \{R_\alpha (L)\}_{\alpha \in \mathbb {R}}\). Similarly, for \(\ell \in L\) let

$$\begin{aligned} B_\alpha (\ell ) = \{b\in \mathbb {M}: {\mathbf {d}}(b,\ell ) < \alpha \} \;\;\;\;\; \text{ and } \;\;\;\;\; {\mathcal {B}}_\alpha = \{B_\alpha (\ell ) : \ell \in L\}. \end{aligned}$$

The Čech complex \({\check{C}}_\alpha (L)\) is defined as the nerve of \({\mathcal {B}}_\alpha \); that is \({\check{C}}_\alpha (L) = {\mathcal {N}}({\mathcal {B}}_\alpha )\) where

$$\begin{aligned} \sigma \in {\mathcal {N}}({\mathcal {B}}_\alpha ) \text{ if } \text{ and } \text{ only } \text{ if } \bigcap \limits _{\ell \in \sigma } B_\alpha (\ell ) \ne \emptyset \end{aligned}$$

Hence \({\check{C}}(L) = \{{\check{C}}_\alpha \}_{\alpha \in \mathbb {R}}\) is an \(\mathbb {R}\)-filtered simplicial complex, and \(R_\alpha (L) \subset {\check{C}}_\alpha (L) \subset R_{2\alpha }(L)\) for all \(\alpha \in \mathbb {R}\).

The persistent homology (resp. cohomology) of a filtered complex \({\mathcal {K}} = \{K_\alpha \}_{\alpha \in \Gamma }\), with coefficients in a field \(\mathbb {F}\), are defined, respectively, as

$$\begin{aligned} PH_n({\mathcal {K}}; \mathbb {F}) := \bigoplus \limits _{\alpha \in \Gamma } H_n(K_\alpha ;\mathbb {F}) \text{ and } PH^n({\mathcal {K}}; \mathbb {F}) := \bigoplus \limits _{\alpha \in \Gamma } H^n(K_\alpha ;\mathbb {F}) \end{aligned}$$

Let \(\iota _{\alpha ,\alpha '}: H_n(K_\alpha ; \mathbb {F}) \longrightarrow H_n(K_{\alpha '};\mathbb {F})\) and \(\jmath ^{\alpha ',\alpha }: H^n(K_{\alpha '} ; \mathbb {F}) \longrightarrow H^n(K_{\alpha };\mathbb {F})\) be the \(\mathbb {F}\)-linear maps induced by the inclusion \(K_\alpha \subset K_{\alpha '}\), \(\alpha \preceq \alpha ' \). A persistent homology (resp. cohomology) class is an element \(\bigoplus _{\alpha \in \Gamma } \nu _\alpha \in PH_n({\mathcal {K}}; \mathbb {F})\) (resp. \(\bigoplus _{\alpha \in \Gamma } \mu ^\alpha \in PH^n({\mathcal {K}}; \mathbb {F})\) ) so that \(\iota _{\alpha , \alpha '} (\nu _\alpha ) = \nu _{\alpha '}\) (resp \(\jmath ^{\alpha ', \alpha } \left( \mu ^{\alpha '}\right) = \mu ^{\alpha }\)) for every \(\alpha \preceq \alpha '\).

When \(\Gamma = \mathbb {R}\), a theorem of Crawley-Boevey (2015) contends that if each \(H_n(K_\alpha ; \mathbb {F})\) is finite-dimensional (also known in the literature as being pointwise-finite) then one can choose bases \(S^\alpha \) for each \(H_n(K_\alpha ; \mathbb {F})\), satisfying the following compatibility condition:

  1. 1.

    \(\iota _{\alpha ,\alpha '}(S^\alpha ) \subset \left( S^{\alpha '} \cup \{0\}\right) \) for every \(\alpha \le \alpha '\).

  2. 2.

    If \(\iota _{\alpha ,\alpha '}({\mathbf {v}}_j^\alpha ) = \iota _{\alpha ,\alpha '}({\mathbf {v}}_k^\alpha )\) and \(j\ne k\), then \(\iota _{\alpha ,\alpha '}({\mathbf {v}}_j^\alpha ) = 0\).

The set \(S = \bigcup \nolimits _{\alpha \in \mathbb {R}}S^\alpha \) admits a partial order \(\preceq \) given by \(S^\alpha \ni {\mathbf {v}} \preceq {\mathbf {v}}' \in S^{\alpha '}\) if and only if \(\alpha \le \alpha '\) and \(\iota _{\alpha ,\alpha '}({\mathbf {v}}) = {\mathbf {v}}'\). The maximal chains in \((S,\preceq )\) are the persistent homology classes. To each maximal chain \({\mathscr {C}} \subset S\) one can associated the point \((b_{\mathscr {C}}, d_{\mathscr {C}}) \in [-\infty , \infty ] \times [-\infty , \infty ]\) defined by

$$\begin{aligned} b_{{\mathscr {C}}} = \inf \{\alpha \in \mathbb {R}: S^\alpha \cap {\mathscr {C}}\ne \emptyset \}, \quad d_{{\mathscr {C}}} = \sup \{\alpha \in \mathbb {R}: S^\alpha \cap {\mathscr {C}}\ne \emptyset \} \end{aligned}$$

The collection of such pairs, where \({\mathscr {C}}\) runs over all maximal chains, is the persistence diagram for the persistence homology of the filtered complex \({\mathcal {K}}\).

Persistent cohomology behaves similarly. Indeed, any basis for \(H_n(K_\alpha ; \mathbb {F})\) yields a well-defined isomorphism \(H_n(K_\alpha ; \mathbb {F}) \cong H_n(K_\alpha ;\mathbb {F})^*\) with the linear dual space, and the latter is naturally isomorphic to \(H^n(K_\alpha ; \mathbb {F})\), by the universal coefficient theorem. Hence, these isomorphisms turn the \(S^\alpha \)’s into a collection of compatible bases for the cohomology groups \(H^n(K_\alpha ; \mathbb {F})\), showing that persistent homology and cohomology yield the same persistence diagrams.

2.3.1 Persistent homology of sliding window embeddings

As mentioned in the introduction, there are numerous examples in the literature of persistent homology on sliding window point clouds. For any periodic time series (\(x(t) = x(t + kT), k \in {\mathbb {Z}}\)), a sliding window embedding yields a topological loop, and there is a point of high persistence in the persistence diagram for \(PH_1\) (Perea and Harer 2015). However, the authors of Perea and Harer (2015) also show, surprisingly, that sliding window embeddings of functions like \(x(t) = \cos (t) + a \cos (2t)\), \(|a| > 1\), can lie on the boundary of an embedded Möbius strip (Perea and Harer 2015). We use this to help intuitively explain the time series we obtain for the projective plane (Example 3.4) and the Klein Bottle (Sect. 5.2). Note that this also means that field coefficients other than \({\mathbb {Z}}_2\) are needed to maximize the maximum persistence in \(PH_1\). In general, for \(\cos (t) + a \cos (kt)\), coefficients which are not prime factors of k are needed (Perea and Harer 2015; Tralie 2017). Finally, there are works which utilize both \(PH_1\) and \(PH_2\) to quantify the presence of quasiperiodicity in time series data, by estimating the toroidality of a sliding window point cloud (Perea 2016; Tralie and Perea 2018). In this work, we extend this suite of examples beyond (possibly twisted) loops and torii to other manifolds.

2.4 Eilenberg–MacLane coordinates

Though persistent homology is informative, one can further utilize it to perform nonlinear dimensionality reduction on sliding window point clouds, for visualization purposes and reconstruction of the underlying dynamics. To this end, we use “Eilenberg–MacLane coordinates”, which turn persistent cohomology classes into maps from point clouds to the circle (De Silva et al. 2011; Perea 2018c), and (real or complex) projective spaces (Perea 2018b). We present next a more detailed summary; maps to the projective plane are particularly interesting, as they allow us to “untwist” non-orientable manifolds like the Klein bottle.

More formally, if G is an abelian groupFootnote 4 and n is a positive integer, then it is possible to construct a connected CW complex K(Gn), called an Eilenberg–MacLane space, whose homotopy type is uniquely determined by two properties:

  1. 1.

    its j-th homotopy group \(\pi _j(K(G,n))\) is trivial for all \(j\ne n\)

  2. 2.

    \(\pi _n(K(G,n))\cong G\)

The Brown representability theorem (for CW complexes and singular cohomology) contends that if B is a CW complex, then there is a natural bijection

$$\begin{aligned} H^n(B;G) \cong [B, K(G,n)] \end{aligned}$$
(2)

between the n-th cohomology of B with coefficients in G, and the set of homotopy classes of maps from B to K(Gn).

The two Eilenberg–MacLane spaces we use to generate circular and projective coordinates are: \(K(\mathbb {Z},1) \simeq S^1\), and \(K(\mathbb {Z}/2, 1) \simeq \mathbb {R}\mathbf {P}^\infty = \mathbb {R}^\infty \smallsetminus \{{\mathbf {0}}\}/\sim \), respectively. Here \(\mathbb {R}^\infty \) is the collection of infinite sequences of real numbers \({\mathbf {x}} = (x_0, x_1 , \ldots )\) which are nonzero for all but finitely many \(x_j\)’s, and \({\mathbf {x}} \sim {\mathbf {y}}\) if and only if \({\mathbf {x}} = r{\mathbf {y}}\) for some \(r\in \mathbb {R}\smallsetminus \{0\}\). One can also regard \(\mathbb {R}^\infty \) as the direct limit of the system \(\mathbb {R}\subset \mathbb {R}^2 \subset \mathbb {R}^3 \subset \cdots \), where the inclusion \(\mathbb {R}^j \hookrightarrow \mathbb {R}^{j+1}\) sends \((x_0,\ldots , x_{j-1})\) to \((x_0,\ldots , x_{j-1},0)\). With this interpretation in mind, \(\mathbb {R}\mathbf {P}^\infty \) can be regarded as the direct limit of the system \(\mathbb {R}\mathbf {P}^0 \subset \mathbb {R}\mathbf {P}^1 \subset \mathbb {R}\mathbf {P}^2 \subset \cdots \), where \(\mathbb {R}\mathbf {P}^n = \mathbb {R}^{n+1}\smallsetminus \{{\mathbf {0}}\}/\sim \). Recently (Perea 2018b), it has been shown that if L is a finite subset of a metric space \((\mathbb {M},{\mathbf {d}})\), and for \(\ell \in L\) we let \(B_\alpha (\ell )\) be the open ball of radius \(\alpha \) centered at \(\ell \), then persistent cohomology classes in \(PH^1({\mathcal {R}}(L);\mathbb {Z}/2)\) can be used to define projective coordinates

$$\begin{aligned} f_\mu : \bigcup \limits _{\ell \in L} B_{\alpha }(\ell ) \longrightarrow \mathbb {R}\mathbf {P}^n \end{aligned}$$

Similarly, persistent cohomology classes in \(PH^1({\mathcal {R}}(L); \mathbb {Z}/q)\), for appropriate choices of prime \(q>2\), yield circular coordinates (Perea 2018c)

$$\begin{aligned} f_{\theta , \tau }: \bigcup \limits _{\ell \in L} B_{\alpha }(\ell ) \longrightarrow S^1 \end{aligned}$$

In both cases, the resulting coordinates mimic the properties of the bijection (2) from Brown’s representability.

2.4.1 Projective coordinates

Here is a sketch of the construction of projective coordinates from persistent cohomology classes. Let \(L = \{\ell _0, \ldots , \ell _n\} \subset \mathbb {M}\), and fix a cocycle \(\mu =\{\mu ^\alpha _{jk} \} \in Z^1(R_{2\alpha }(L); \mathbb {Z}/2)\) so that its cohomology class is not in the kernel of the homomorphism

$$\begin{aligned} \iota ^{2\alpha , \alpha }: H^1(R_{2\alpha }(L);\mathbb {Z}/2) \longrightarrow H^1(R_\alpha (L);\mathbb {Z}/2) \end{aligned}$$

induced by the inclusion \(R_\alpha (L) \subset R_{2\alpha }(L)\). Since \(R_\alpha (L) \subset {\check{C}}_\alpha (L) \subset R_{2\alpha }(L)\), then the rightmost inclusion yields a nonzero class in \(H^1({\check{C}}_\alpha (L);\mathbb {Z}/2)\). We let

$$\begin{aligned} f_\mu : \bigcup \limits _{\ell \in L} B_\alpha (\ell ) = L^{(\alpha )}\longrightarrow & {} \mathbb {R}\mathbf {P}^n \\ B_{\alpha }(\ell _j) \ni b&{\mapsto }&\left[ (-1)^{\mu _{j0}^\alpha }|\alpha {-} {\mathbf {d}}(b, \ell _0)|_{+} : \cdots : (-1)^{\mu _{jn}^\alpha }|\alpha - {\mathbf {d}}(b, \ell _n)|_{+} \right] \end{aligned}$$

where \([x_0 : \cdots : x_n] \in \mathbb {R}\mathbf {P}^n\) denotes the equivalence class of \((x_0,\ldots , x_n)\in \mathbb {R}^{n+1}\smallsetminus \{{\mathbf {0}}\}\), and \(|r|_{+}:= \max \{0,r\}\) for \(r\in \mathbb {R}\). Since \(\{\mu _{jk}^\alpha \}\) is a cocycle, it readily follows that the point \(f_\mu (b)\in \mathbb {R}\mathbf {P}^n\) is independent of the index \(j\in \{0,\ldots , n\}\) for which \(b \in B_{\alpha }(\ell _j)\). In other words, \(f_\mu \) is well defined.

Fig. 2
figure 2

An example of projective coordinates for point sampled from a flat Klein bottle obtained as a quotient of the torus via \([x, y] \sim [x + \pi , -y]\). The two coordinates are colored according to their x and y positions on the fundamental domain \([0, 2 \pi ] \times [0, \pi ]\), and we show two different stereographic projections to the plane from \(\mathbb {R}\mathbf {P}^2\). If the fundamental domain is split into distinct parts \(A = [0, 2 \pi ] \times [0, \pi /2]\) and \(B = [0, 2 \pi ] \times [\pi /2, \pi ]\), then A and B map to two distinct Möbius strips which are attached at their boundaries at \(y = \pi /2\) (medium red for the y colors), which is indeed what happens when the Klein bottle is cut down the middle (color figure online)

If \(\{\nu ^\alpha _{jk}\}\in Z^1(R_{2\alpha }(L) ;\mathbb {Z}/2)\) is cohomologous to \(\{\mu _{jk}^\alpha \}\), and \(f_\nu : L^{(\alpha )} \longrightarrow \mathbb {R}\mathbf {P}^n\) is the associated map, then \(f_\mu \simeq f_\nu \) and hence we get a well defined function

$$\begin{aligned} H^1(R_{2\alpha }(L) ; \mathbb {Z}/2 )\longrightarrow & {} \left[ L^{(\alpha )}, \mathbb {R}\mathbf {P}^n\right] \\ \left[ \mu \right]\mapsto & {} [f_\mu ] \end{aligned}$$

The metric properties of \(f_\mu \) are also determined by the cohomology class of \(\mu \). For if

$$\begin{aligned} {\mathbf {d}}_g({\mathbf {x}}, {\mathbf {y}}) := \arccos \left( \frac{ |\langle {\mathbf {x}}, {\mathbf {y}} \rangle | }{\Vert {\mathbf {x}}\Vert \cdot \Vert {\mathbf {y}}\Vert }\right) \end{aligned}$$

denotes the geodesic distance in \(\mathbb {R}\mathbf {P}^n\), then it readily follows that

$$\begin{aligned} {\mathbf {d}}_g\left( f_\nu (b), f_\nu (b')\right) = {\mathbf {d}}_g\left( f_\mu (b), f_\mu (b')\right) \end{aligned}$$

for all \(b,b'\in L^{(\alpha )}\) and \(\mu , \nu \) in the same cohomology class.

Given a finite set \(P \subset L^{(\alpha )}\), taking its image through \(f_\mu \) yields a new point cloud \(f_\mu (P) \subset \mathbb {R}\mathbf {P}^n\). A dimensionality-reduction scheme in \(\mathbb {R}\mathbf {P}^n\) referred to as principal projective component analysis is also defined in Perea (2018b). This procedure yields a sequence of maps

$$\begin{aligned} P_k : f_\mu (P) \longrightarrow \mathbb {R}\mathbf {P}^k, k = 0, \ldots , n \end{aligned}$$

minimizing an appropriate notion of (metric) distortion. In particular, \(P_k\circ f_\mu (P)\) and \(P_k \circ f_\nu (P)\) are isometric if \(\mu \) and \(\nu \) are cohomologous. The point clouds \(P_k \circ f_\mu (P) \subset \mathbb {R}\mathbf {P}^k\) are referred to as the projective coordinates of P, induced by the landmarks \(L \subset \mathbb {M}\) and the cohomology class \(\left[ \mu \right] \in H^1(R_{2\alpha }(L); \mathbb {Z}/2)\).

As an example, Fig. 2 shows the projective coordinates onto \(\mathbb {R}\mathbf {P}^2\) of points sampled from a Klein bottle \(\mathbb {K}\), using the flat metric on the torus \(\mathbb {T}\), descended onto the automorphism \(\kappa : (x,y) \mapsto (x+\pi , -y)\). We use the cocycle representative which is the sum of the representative cocycles from the two most persistent classes.

In fact, this is a 2 to 1 map, as shown in Fig. 2. Just as a torus can be obtained from gluing two annuli together at their boundary, the Klein bottle can be obtained by gluing two Möbius strips at their boundary. Each one of these Möbius strips is visible in the odd and even columns of the bottom two rows of Fig. 2, respectively. In particular, the loops \([0, 2 \pi ] \times 0\) and \([0, 2 \pi ] \times \pi \) are at the center of each Möbius strip, and the boundaries of each Möbius strip at \([0, 2 \pi ] \times \pi /2\) get identified at the center of the projective coordinates plot. We will observe similar projective coordinates for the sliding window of our Klein bottle time series in Sect. 5.2.

2.4.2 Circular coordinates

The idea of using the bijection \(H^1(B;\mathbb {Z}) \cong [B , S^1]\) to construct circle-valued functions for data, from persistent cohomology classes, was first introduced by De Silva et al. (2011). Their construction has shortcomings (not sparse, not transductive) which are addressed in Perea (2018c); the latter is the procedure we use in the paper and the one we describe next.

Let \(q>2\) be a prime so that the homomorphism

$$\begin{aligned} H^1(R_{2\alpha }(L) ;\mathbb {Z}) \longrightarrow H^1(R_{2\alpha }(L);\mathbb {Z}/q) \end{aligned}$$

induced by the projection \(\mathbb {Z}\longrightarrow \mathbb {Z}/q\), is surjective. Hence, any \(\mu \in Z^1(R_{2\alpha }(L);\mathbb {Z}/q)\) has a lift \({\tilde{\mu }}\in Z^1(R_{2\alpha }(L);\mathbb {Z})\). Moreover, if \(\iota : \mathbb {Z}\hookrightarrow \mathbb {R}\) is the inclusion homomorphism, then there are cochains \(\theta \in Z^1(R_{2\alpha }(L); \mathbb {R})\) and \(\tau \in C^0(R_{2\alpha }(L) ; \mathbb {R})\) so that \(\theta \) is the unique harmonic cocycle representative of \(\iota ^*([{\tilde{\mu }}])\) and \(\iota ^\#({\tilde{\mu }}) = \theta - \delta ^0\tau \). From this data we define

$$\begin{aligned} \begin{array}{cccl} f_{\theta ,\tau } : &{} \bigcup \limits _{\ell \in L} B_\alpha (\ell ) &{} \longrightarrow &{} S^1\subset \mathbb {C}\\ &{} B_{\alpha }(\ell _j) \ni b &{} \mapsto &{} \exp \left\{ 2\pi i \left( \tau _j + \sum \limits _{k=0}^n \theta _{jk}\varphi _k (b) \right) \right\} \end{array} \end{aligned}$$

where

$$\begin{aligned} \varphi _k (b)= \frac{|\alpha - {\mathbf {d}}(b,\ell _k)|_{+}}{\sum \nolimits _{r=0}^n |\alpha - {\mathbf {d}}(b,\ell _r)|_{+}} \end{aligned}$$

Figure 3 shows an example of this algorithm on a point cloud sampled from a torus, using 400 landmarks. In this example, the algorithm is able to find maps from the points to the inner and outer circle of the torus.

Fig. 3
figure 3

An example of the circular coordinates algorithm on a point cloud sampled from a torus in \({\mathbb {R}}^3\). The third plot shows the coordinates resulting from the representative cocycle of the largest persistence class, which goes around the large circle on the outside, while the fourth plot shows the circular coordinates resulting from the cocycle from the second largest persistence class, which wraps around the inner circle

3 Preliminary examples: distance to a point as observation function

To motivate a more general development of good observation functions on manifolds, we first explore a very specific genre of observation functions: those which arise as the distance to a specified point in the manifold. We then verify the geometric integrity of a delay coordinate mapping of the resulting time series using persistent homology and Eilenberg–Maclane coordinates on a few examples. Through these tools and a visual comparison of the time series to known examples, we will already be able to explain quite a lot, including motivating both conditions of Theorem 4.1, though a full development of the theory in Sect. 4 is needed to justify these choices of observation functions.

In the discussion below, all of our observation functions are of the form \(G(x) = d(x, {\hat{x}})\), where d is some metric chosen on the manifold and \({\hat{x}}\) is some fixed point on the manifold which is our “reference distance point.”

Example 3.1

Flat torus \(\mathbb {T}\)

We first examine the planar torus \(\mathbb {T}=\mathbb {R}^2/2\pi \mathbb {Z}^2\), parameterized by \((u, v) \in [0, 2 \pi ] \times [0, 2 \pi ]\). As our dynamics, we take the irrational winding \(\psi _t(u, v) = (u + \sqrt{2} dt, v + dt)\), and the observation G(uv) is the flat geodesic distance between (uv) and the point \({\hat{x}} = (6, \pi )\). This is shown in Fig. 4. After performing a delay embedding on the resulting time series with window length of 30 samples, we see two persistent \(H_1\) classes and 1 persistent \(H_2\) class, which is the signature of a torus. Furthermore, circular coordinates resulting from the top two persistent classes in \(H_1\) recovered the full original flow specification.

Fig. 4
figure 4

An irrational winding on the flat torus, with an observation function as the distance to the point \({\hat{x}} = (6, \pi )\), which is shown as a blue dot on the left plot. The distance from this point is indicated in gray (dark means close, light means far). The resulting time series is shown in the second plot, with a sliding window indicated with a red box. The third and fourth figures show, respectively, the persistence diagrams of the sliding window point cloud and the resulting circular coordinates. The arrows in the fourth plot are the recovered dynamics; they indicate the order on the sliding windows inherited from the time series. Colors are coordinated between the flows in the first, second, and fourth plots. Similar plotting conventions are present in Figs. 5, 6, 7, 8 and 9 (color figure online)

Fig. 5
figure 5

A winding with a very shallow slope on the fundamental domain of a flat Klein bottle, which is double covered by the flat torus by the automorphism \((x, y) \sim (x + \pi , -y)\). The observation function is then a scaled \(L^2\) distance from the point \({\hat{x}} = (4.5, 2.5)\), which descends under the automorphism

Example 3.2

Flat Klein Bottle \(\mathbb {K}\)

As in our projective coordinates example in Fig. 2, we now form a quotient on the domain of the flat torus to create a Klein bottle, via the automorphism \(\kappa : (x, y) \sim (x + \pi , -y)\). Then, the metric on the torus descends to the Klein bottle via \(\kappa \). We use a slightly modified weighted \(L^2\) flat metric as our distance measure for the observation function; that is

$$\begin{aligned} d_{\alpha , \beta }((u_1, u_1), (u_2, u_2)) = \sqrt{ \alpha ^2 (u_1 - u_2)^2 + \beta ^2 (u_1-u_2)^2} \end{aligned}$$
(3)

In this particular example, we let \(\alpha = 1\) and \(\beta = 0.5\), and we take an observation to the point \({\hat{x}} = (4.5, 2.5)\); that is, \(G(u, v) = d_{1, 0.5}((u, v), (4.5, 2.5)\). Finally, we use a flow with a very shallow slope, \(\psi _t(u, v) = (u + dt, v + 0.05dt)\), in the fundamental domain \(y<\pi \). After performing a sliding window embedding with a window length of 30 samples, we see two persistent classes in \(H_1\) and one persistent class in \(H_2\) with \({\mathbb {Z}}/2\) coefficients, but we only see one class in \(H_1\) and no classes in \(H_2\) with \({\mathbb {Z}}/3\) coefficients. This is indeed the signature of a Klein bottle. We will show projective coordinates on a similar example with a slightly different observation function in Sect. 5.2, and we will explain more intuitively visual features of the time series at that point.

Fig. 6
figure 6

Not all distance functions on the Klein bottle work. The conditions here are the same as in Fig. 5, but the point \({\hat{x}}\) from which distance is measured has been moved to \((\pi , 0)\). The sliding window embedding degenerates to a cylinder in this example

Note that not every distance function will lead to a reconstruction of the Klein bottle. For instance, if we use the same flow \(\psi _t\) but an observation function \(G(u, v) = d((u, v), (\pi , 0))\), as in Fig. 6, then the sliding window embedding of the resulting time series degenerates to a cylinder, because there exist pairs of points with the same observation curves under the flow. This motivates condition 2 in Theorem 4.1.

Fig. 7
figure 7

An observation function on the sphere which is the geodesic distance from a point \({\hat{x}}\) drawn in red. The top and bottom views of the vector field are drawn in the left two figures. 3D PCA of the sliding window embedding, which retains nearly all of the variance of the sliding window point cloud, is shown in the bottom right plot (color figure online)

Example 3.3

Sphere \({\mathcal {S}}^2\)

We now reconstruct the sphere from a given trajectory and distance function. Tralie (2017) showed empirically that a sliding window embedding of a helical trajectory, under the observation function on the sphere which is the arclength from some point on the sphere, yields an embedding of the sphere. We replicate this here. More specifically, we parameterize the unit sphere in spherical coordinates \((\phi , \theta )\) (where \(\phi \) is azimuth and \(\theta \) is elevation from the north pole), we let \(\psi _t(\phi , \theta )_{\alpha } = (\phi + dt, -\pi /2 + \theta dt)\), and the let the observation \(G(\phi , \theta )\) to a point \({\hat{x}} = ({\hat{\phi }}, {\hat{\theta }})\) be

$$\begin{aligned} G(\theta , \phi )&{=}\cos ^{-1} \left( \cos (\phi )\sin (\theta )\cos ({\hat{\phi }})\sin ({\hat{\theta }}) {+} \sin (\phi )\sin (\theta )\sin ({\hat{\phi }})\sin ({\hat{\theta }})\right. \nonumber \\&{+}\left. \cos (\theta )\cos ({\hat{\theta }}) \right) \end{aligned}$$
(4)

We repeat this here in Fig. 7. In this example, simple linear dimension reduction via PCA is able to recover the most of the geometry of the sliding window point cloud, though spherical coordinates are also possible in the Eilenberg–MacLane framework (Perea 2018b).

One pitfall in this example is that the observation point \({\hat{x}}\) cannot lie on the equator or the north or south poles; that is, \({\hat{\phi }} \notin \{ -\pi /2, 0, \pi /2 \}\). In these cases, the helix structure is flattened to a spiral,so the sliding window embedding degenerates to a disc. This motivates the “derivative rank” condition, or condition 1 in Theorem 4.1.

Fig. 8
figure 8

An observation function on \(\mathbb {R}\mathbf {P}^2\) which is the geodesic distance from a point \({\hat{x}}\) drawn in red (color figure online)

Example 3.4

Projective plane \(\mathbb {R}\mathbf {P}^2\)

We can extend the scheme that we used in Example 3.3 to the projective plane \(\mathbb {RP}^2\) by taking a flow only on the upper hemisphere and performing the antipodal identification at the equator \(x \sim -x\). The flow \(\psi _t\) is the same, but the observation function changes to

$$\begin{aligned} G(\theta , \phi )= & {} \cos ^{-1} \left| \cos (\phi )\sin (\theta )\cos ({\hat{\phi }})\sin ({\hat{\theta }}) + \sin (\phi )\sin (\theta )\sin ({\hat{\phi }})\sin ({\hat{\theta }}) \right. \nonumber \\&\left. +\cos (\theta )\cos ({\hat{\theta }}) \right| \end{aligned}$$
(5)

Figure 8 shows this result, in which a single highly persistent point is present for both \(H_1\) and \(H_2\) using \({\mathbb {Z}}/2\) coefficients, but in which none are present for \({\mathbb {Z}}/3\), which is a correct signature of \(\mathbb {R}\mathbf {P}^2\). Interestingly, the quotient identification is visible in the time series itself; the time series in Fig. 8 can be obtained from the time series in Fig. 7 by reflecting values above the line \(y = \pi /2\) across that line. This is because the maximum distance between any two points on \(\mathbb {R}\mathbf {P}^2\) is \(\pi /2\). Additionally, both the sphere time series and the Möbius loop time series (\(\cos (t) + a \cos (2t)\)) are visible in Fig. 8. The time series starts off in a spiral, which fills out a disc, and this disc transitions to a spiraling Möbius loop time series which fills out the strip. This visually reflects the fact that \(\mathbb {R}\mathbf {P}^2\) is the connected sum of a disc and the boundary of a cross-cap. We will use a similar intuition to explain the Klein bottle time series in Sect. 5.2.

Fig. 9
figure 9

An example of a time series resulting from a dense flow on the 2-holed torus, using the flat squared Euclidean distance (Zorich 2006) from an observation point (shown as a red dot) (color figure online)

Example 3.5

Genus 2 surface \(\mathbb {T}\# \mathbb {T}\)

Finally, we show a time series whose sliding window embedding lies on the two holed torus. We use an irrational flow with slope \((dt, dt \sqrt{3}/2)\), with an observation function as the squared flat metric on the fundamental domain (Zorich 2006) represented by an octagon with opposite sides identified. Figure 9 shows the result, in which four highly persistent dots are visible in \(H_1\) and a single persistent dot is visible in \(H_2\), matching what is expected of the homology of a genus 2 surface.

4 Main theorem: characterizing good observation functions

As our main theoretical contribution, we now state more general conditions for good observation functions. Let M be a compact manifold of dimension m, \(G:M \rightarrow \mathbb {R}\) a smooth function, and X a vector field with flow \(\psi _t\). Applying G to an integral curve \(\gamma _p(t) = \psi _t(p)\) through a point p yields a real-valued function

$$\begin{aligned} g_p := G\circ \gamma _p \end{aligned}$$

in t, the observation curve of p. For sufficiently nice G and X, one can recover the point p from a finite uniform sampling of \(g_p\). More precisely, the Takens map\(\Psi _\tau ^N: M \rightarrow \mathbb {R}^{N+1}\) defined by

$$\begin{aligned} \Psi _\tau ^N(p) = \big (g_p(0), g_p(\tau ), g_p(2\tau ), \ldots , g_p(N\tau )\big ) \end{aligned}$$

is an embedding for some dimension \(N>0\) and flow time \(\tau >0\). For such G and X we say G is a good observation for X.

4.1 Motivation for the approach

As a simple example, take \(M=S^1 = \mathbb {R}/2\pi \mathbb {Z}\), \(\psi _t(x) = x+t\), and \(G(x) = \cos (x)\). The point x is uniquely determined by sampling the two values \(g_x(0)=\cos (x)\) and \(g_x(\pi /2)= -\sin (x)\) and the Takens map

$$\begin{aligned} \Psi _{\pi /2}^1(x) = (\cos (x), -\sin (x)) \end{aligned}$$

is an embedding, so G is a good observation.

On the other hand, the doubly periodic function \(G(x) = \cos (2x)\) is not a good observation function. Indeed, any integral curve \(g_x(t)\) is invariant under a \(\pi \)-shift of x, as G cannot distinguish between any flow of x and \(x+\pi \). In fact, the good observation functions on \(S^1\) for the rotational dynamic are precisely ones with minimum period \(2\pi \)

In higher dimensions the task of recovering p from \(g_p\) becomes less clear. Consider the torus \(\mathbb {T}= S^1\times S^1\) and \(G:\mathbb {T}\rightarrow \mathbb {R}\) given by

$$\begin{aligned} G(x,y) = \cos (x) + \cos (y) \end{aligned}$$

and \(\psi _t\) an irrational flow

$$\begin{aligned} \psi _t(x,y) = (x+\alpha t, y+ \beta t) \end{aligned}$$

and thus for \(p = (x,y) \in \mathbb {T}\) we have the observation curve

$$\begin{aligned} g_p(t) = \cos (x+\alpha t) + \cos (y + \beta t) \end{aligned}$$

For G to be good, there must be a \(\tau \) such that each p is uniquely determined by sampling G along the integral curve \(\gamma _p\) at finitely many \(\tau \)-steps. Since we are free to shrink \(\tau \) and increase N, it is natural to examine infinitesimal changes of G along the flow \(\psi _t\). The derivatives

$$\begin{aligned} g_p(0)&= \cos (x) + \cos (y)\\ g_p'(0)&= -\alpha \sin (x) - \beta \sin (y)\\ g_p^{(2)}(0)&= -\alpha ^2\cos (x) - \beta ^2\cos (y)\\ g_p^{(3)}(0)&= \alpha ^3\sin (x) + \beta ^3\sin (y) \end{aligned}$$

up to 3rd order yield the linear equation

$$\begin{aligned} \begin{pmatrix}1 &{} 1 &{} 0 &{} 0 \\ -\alpha ^2 &{} -\beta ^2 &{} 0 &{} 0 \\ 0 &{} 0 &{} -\alpha &{} -\beta \\ 0 &{} 0 &{} \alpha ^3 &{} \beta ^3\end{pmatrix} \begin{pmatrix} \cos (x) \\ \cos (y) \\ \sin (x) \\ \sin (y)\end{pmatrix} = \begin{pmatrix} g_p(0) \\ g_p'(0) \\ g_p^{(2)}(0) \\ g_p^{(3)}(0) \end{pmatrix} \end{aligned}$$

Equivalently, over \(\mathbb {C}\), the linear system

$$\begin{aligned} \begin{pmatrix}1 &{} 1 &{} 1 &{} 1 \\ i\alpha &{} -i\alpha &{} i\beta &{} -i\beta \\ -\alpha ^2 &{} -\alpha ^2 &{} -\beta ^2 &{} -\beta ^2 \\ -i\alpha ^3 &{} i\alpha ^3 &{} -i\beta ^3 &{} i\beta ^3 \end{pmatrix} \begin{pmatrix} e^{ix} \\ e^{-ix} \\ e^{iy} \\ e^{-iy}\end{pmatrix} = \begin{pmatrix} g_p(0) \\ g_p'(0) \\ g_p^{(2)}(0) \\ g_p^{(3)}(0) \end{pmatrix} \end{aligned}$$

has invertible Vandermonde matrix and one can solve for \(e^{ix}\) and \(e^{iy}\). Therefore (xy) is uniquely determined by \(g_p^{(k)}(0)\)’s. Choosing \(\tau \) small enough so that \(g_p(\tau )\) is close to the 3rd order Taylor polynomial of \(g_p\) about 0, we see that p is uniquely determined (modulo \(2\pi \)) by a \(\tau \)-uniform finite sampling of \(g_p\).

4.2 Main theorem and proof

The above calculation illustrates our approach to determining whether G is good: by studying the Taylor coefficients \(g_p^{(k)}\). Note that \(g_p^{(k)}(t)\) is the k-fold derivation of X applied to G at \(\psi _t(p)\), i.e. in Lie derivative notation,

$$\begin{aligned} g_p^{(k)}(t) = \mathcal {L}_X^kG(\psi _t(p)). \end{aligned}$$

\(\mathcal {L}_X\) is the linear operator on tensor fields which measures infinitesimal change along X, i.e. if T is a tensor then

$$\begin{aligned} \mathcal {L}_XT(p) = \frac{d}{dt}\bigg |_{t=0}((\psi _{-t})_*T_{\psi _t(p)}) \end{aligned}$$

Writing dG for the differential of G, and \(\wedge \) for exterior product, we now state the main result:

Theorem 4.1

The Takens map \(\Psi _{\tau }^{N}\) is an embedding for some \(N>0\) and flow time \(\tau >0\), if the following conditions hold:

  1. 1.

    For any point of \(p\in M\) there is an m-tuple \(J\in \mathbb {Z}_{\ge 0}^m\) of nonnegative integers such that the m-form

    $$\begin{aligned} \mathcal {L}_{X}^{\wedge J}dG := \bigwedge _{j\in J}\mathcal {L}_X^jdG \end{aligned}$$

    is nonzero at some point on the integral curve \(\gamma _p(s)\).

  2. 2.

    For any pair of distinct points \(p, q\in M\) the observation curves \(g_p(s)\) and \(g_q(s)\) are not identical.

Proof

For \(\Psi _\tau ^N\) to be an immersion, the cotangent vectors

$$\begin{aligned} dG|_p, d(G\circ \psi _\tau )|_p, d(G\circ \psi _{2\tau })|_p, \ldots , d(G\circ \psi _{N\tau })|_p\in T^*_pM \end{aligned}$$

must span an m-dimensional space for all \(p \in M\). Equivalently, for any point \(p\in M\) there must be a strictly increasing m-tuple \(I=(i_1, i_2, \ldots , i_m)\in \mathbb {Z}_{\ge 0}^m\) of indices such that the determinant m-form \(\bigwedge d(G\circ \psi _{i_k\tau })\) does not vanish at p, i.e.

$$\begin{aligned} \omega _{p}^I(\tau ) := \bigwedge _{i_k\in I} d(G\circ \psi _{i_k\tau })|_{p} \ne 0. \end{aligned}$$

We require that I be strictly increasing because a wedge product containing identical factors is zero.

The idea is to perform a convolution of the Taylor series of the cotangent curves \(d(G\circ \psi _{i_kt})\) and to use condition 1 above to choose sufficiently small \(\tau \) so that \(\omega _p^I(\tau )\ne 0\) for some I. By compactness of M, one makes a uniform choice of small \(\tau \) so that \(\Psi _\tau \) is immersive and each observation curve is distinguished on some integer multiple of \(\tau \), thereby making \(\Psi _\tau ^N\) injective.

Let \(s\ge 0\) be a time parameter for p such that

$$\begin{aligned} \mathcal {L}_{X}^{\wedge J}dG \end{aligned}$$

is nonzero at \(\gamma _p(s)\). Write \({\tilde{p}} = \gamma _p(s)\) and \({\mathcal {J}}_n\) for the set of all strictly increasing m-tuples \(J = (j_1, j_2, \ldots , j_m)\) with degree

$$\begin{aligned} j_1 + j_2 + \cdots + j_m = n \end{aligned}$$

satisfying

$$\begin{aligned} \mathcal {L}_{X}^{\wedge J}dG|_{{\tilde{p}}} \ne 0 \end{aligned}$$

Fix \(n> 0\) to be the minimal integer for which \({\mathcal {J}}_n\) is nonempty (possible by condition 1 above).

Let A(t) be the m by \((n+1)\) matrix with (kj)th entry

$$\begin{aligned} A_{k,j}(t) = \frac{i_k^{j-1}(t-s)^{j-1}}{(j-1)!} \end{aligned}$$

and \(L:T_{{\tilde{p}}}M \rightarrow \mathbb {R}^{n+1}\) the linear map given by \(\mathcal {L}_X^{j-1}dG|_{{\tilde{p}}}\) in the jth coordinate,

$$\begin{aligned} L = \left( dG|_{{\tilde{p}}}, \mathcal {L}_X^1 dG|_{{\tilde{p}}}, \ldots , \mathcal {L}_X^n dG|_{{\tilde{p}}}\right) . \end{aligned}$$

So the kth component of the composition \(A(t)\circ L\), viewed as an m-tuple of t-dependent cotangent vectors, yields the nth order Taylor polynomial about \(t=s\) of the cotangent curve \(d(G\circ \psi _{i_kt})|_{p}\):

$$\begin{aligned} A(t) \circ L = \sum _{j=0}^n \frac{i_k^j (t-s)^j}{j!}\mathcal {L}_X^j dG|_{{\tilde{p}}}. \end{aligned}$$

By Cauchy-Binet formula applied to A(t) and L, the top exterior product

$$\begin{aligned} \omega _{p}^I(t)= \bigwedge _{i_k\in I} d(G\circ \psi _{i_kt})|_{p} \end{aligned}$$

has nth order Taylor series expansion about \(t=s\) with nth coefficient

$$\begin{aligned} C_n = \frac{\det (V)}{a_n} \sum _{J\in {\mathcal {J}}_n}|I^{J}|\cdot \mathcal {L}_{X}^{\wedge J}dG|_{{\tilde{p}}} \end{aligned}$$

where

  • \(a_n\) is a nonzero constant depending only on n

  • \(|I^J| = \prod i_k^{j_k-k+1}\)

and

$$\begin{aligned} \det (V) = \prod _{k<k'} (i_{k'}-i_k) \ne 0 \end{aligned}$$

is the nonzero determinant of the \(m\times m\) Vandermonde matrix V with (kj)th entry

$$\begin{aligned} V_{k, j}={i_k}^{j-1} \end{aligned}$$

where we take \(0^0 = 1\).

By the minimality assumption on \({\mathcal {J}}_n\), all the lower degree Taylor coefficients, which contain \(\mathcal {L}_X^K dG|_{{\tilde{p}}} = 0\) for m-tuples K with degree strictly less than n,

$$\begin{aligned} C_j = 0\quad \hbox { for}\ j<n \end{aligned}$$

are zero. So the Taylor expansion of \(\omega _{p}^I(t)\) has the form

$$\begin{aligned} \omega _{p}^I(t) = (t-s)^nC_n + R_p^n(t) \end{aligned}$$

where \(R_p^n(t)\) is the nth order Taylor error term with vanishing limit

$$\begin{aligned} \lim _{t\rightarrow s}\frac{R_p^n(t)}{(t-s)^n} = 0 \end{aligned}$$

For for suitable choice of I, there will be a dominating term in the sum over \({\mathcal {J}}_n\) such that \(C_n\) is nonzero. For \({\tilde{J}}\in {\mathcal {J}}_n\) the colexigraphically maximal element of \({\mathcal {J}}_n\), let a be the maximal index such that \(j_a < {\tilde{j}}_a\), for all \(J < {\tilde{J}}\). Choose I by making all terms right of \(a-1\) large, so that

$$\begin{aligned} |I^{J}|<< |I^{{\tilde{J}}}| \end{aligned}$$

for all \(J < {\tilde{J}}\)

$$\begin{aligned} \sum _{J\in {\mathcal {J}}_n}|I^{J}|\cdot \mathcal {L}_{X}^{\wedge J}dG|_{{\tilde{p}}} \ne 0. \end{aligned}$$

So the nth Taylor coefficient

$$\begin{aligned} C_n \ne 0 \end{aligned}$$

is nonzero. Hence we may choose a time \(\eta >s\) sufficiently close to s so that the Taylor error \(R_p^n(\eta )\) is small and the inequality

$$\begin{aligned} \omega _{q}^I(\eta ) \ne 0 \end{aligned}$$

holds for all q in a neighborhood of p, and this property remains invariant under shrinking \(\eta \) closer to s. By compactness of M there is a finite collection of triples \((I_r, \eta _r, s_r)\) such that the collection of m-forms

$$\begin{aligned} \{\omega ^{I_r}(\eta _r)\} \end{aligned}$$

do not all vanish at any given point of M and the cotangent vectors

$$\begin{aligned} \{d(G\circ \psi _{i_k\eta _r})|_q\}_{i_k \in I_r} \end{aligned}$$

specified by \(I_r\) are linearly independent. Choose \(\tau >0\) small enough so that there is an integer multiple of \(\tau \) lying in the interval \((s_r, \eta _r)\) for each r. Then the Takens map \(\Psi _\tau ^N\) is an immersion for all \(N>0\) bounding \(I_r\) and \(\eta _r/\tau \).

So \(\Psi _\tau ^N\) is locally injective and the difference map

$$\begin{aligned} \Psi _\tau ^N(p) - \Psi _\tau ^N(q) \end{aligned}$$

does not vanish for all \(p\ne q\) in an open neighborhood U of the diagonal in \(M\times M\), and this property is invariant under scaling \(N\mapsto Nd\) and \(\tau \mapsto \tau /d\) for an integer \(d>0\) (with U fixed).

For distinct \((p, q)\in M\times M\setminus U\), we may shrink \(\tau \) so that \(g_p\) and \(g_q\) are distinguished on some integer multiple of \(\tau \) and \(\Psi _\tau (p) \ne \Psi _\tau (q)\). By compactness of \(M\times M\setminus U\), there is a uniform choice of \(\tau \) and N making \(\Psi _\tau ^N\) injective, hence an embedding. \(\square \)

Remark 4.2

While one can provide a lower bound for the dimension N needed to yield a Takens embedding, the formula depends in a complicated way on G and X. In practice, choosing sufficiently large N and small \(\tau \) amounts to a dense sampling of a discrete time series.

5 An application to surfaces via fourier theory

Now that we have our theory in hand, we can examine another class of observation functions which are constructed from Fourier modes, in addition to our distance-based observation functions in Sect. 3.

5.1 The torus

We start by characterizing all smooth observations \(G:\mathbb {T}\rightarrow \mathbb {R}\) for a vector field X of irrational flow

$$\begin{aligned} \psi _t(x,y) = (x+\alpha t, y+\beta t) \end{aligned}$$

yielding toroidal delay embedding. For G write the Fourier expansion

$$\begin{aligned} G(x,y) = \sum _{(n,m)\in \mathbb {Z}^2} {\hat{G}}(n,m)\cdot \exp (i(nx+my)) \end{aligned}$$

where \({\hat{G}}(n,m)\in \mathbb {C}\) is the (nm)th Fourier coefficient of G. Set

$$\begin{aligned} {{\,\mathrm{Supp}\,}}{\hat{G}} = \{ (n,m) \in \mathbb {Z}^2 \mid {\hat{G}}(n,m) \ne 0 \} \end{aligned}$$

the support of \({\hat{G}}\).

Theorem 5.1

A smooth function \(G:\mathbb {T}\rightarrow \mathbb {R}\) is a good observation for an irrational winding if and only if the support \({{\,\mathrm{Supp}\,}}{\hat{G}}\) of the Fourier coefficients generates \(\mathbb {Z}^2\) as an abelian group.

Proof

Write \(e_{n,m} = \exp (i(nx+my))\) for the (nm)th Fourier basis element. The k-fold Lie derivative \({\mathcal {L}}_X^kG\) has Fourier coefficient

$$\begin{aligned} \widehat{{\mathcal {L}}_X^kG}(n,m) = i^k(n\alpha + m\beta )^k\cdot {\hat{G}}(n,m) \end{aligned}$$

and thus Fourier expansion

$$\begin{aligned} {\mathcal {L}}_X^kG = \sum _{(n,m) \in \mathbb {Z}^2} i^k(n\alpha + m\beta )^k{\hat{G}}(n,m)\cdot e_{n,m} \end{aligned}$$

Since \(\alpha /\beta \) is irrational, the coefficients

$$\begin{aligned} c_{n,m} = i\cdot (n\alpha + m\beta ) \end{aligned}$$

are nonvanishing and pairwise distinct. Therefore the Vandermonde matrix with \((n,m)\times j\)th entry

$$\begin{aligned} (c_{n,m}^k) \end{aligned}$$

is nonsingular and the projection

$$\begin{aligned} G*e_{n,m} = {\hat{G}}(n,m)\cdot \exp (i(nx+my)) \end{aligned}$$

can be written as an infinite sum

$$\begin{aligned} {\hat{G}}(n,m)\cdot \exp ( i(nx+my) = \sum _{j=0}^\infty b_j\mathcal {L}_X^jG \end{aligned}$$
(6)

Hence the values of \(\mathcal {L}_X^kG\) on a point \((u,v)\in \mathbb {T}\) uniquely determine

$$\begin{aligned} {\hat{G}}(n,m)\cdot e^{i(nu+mv)} \end{aligned}$$

If \({{\,\mathrm{Supp}\,}}{\hat{G}}\) generates \(\mathbb {Z}^2\), then there is some finite product

$$\begin{aligned} \prod _{(n_j,m_j)\in {{\,\mathrm{Supp}\,}}{\hat{G}}}e^{i(n_ju+m_jv)} = e^{iu} \end{aligned}$$

and thus u, and similarly v, are uniquely determined modulo \(2\pi \) by the observation curve \(G\circ \gamma _{u,v}\) and condition 2 of Theorem 4.1 above is satisfied.

If \(d\mathcal {L}_X^jG\wedge d\mathcal {L}_X^kG\) vanishes at p for all \(j,k\ge 0\), then by Eq. 6 above, the 2-form

$$\begin{aligned} d(G*e_{n,m})\wedge d(G*e_{n',m'}) = \det \begin{pmatrix}n &{} m \\ n' &{} m'\end{pmatrix}{\hat{G}}(n,m){\hat{G}}(n',m')\cdot e_{n,m} e_{n',m'} \end{aligned}$$

also vanishes at p for all pairs \((n,m), (n',m')\in \mathbb {Z}^2\). Thus

$$\begin{aligned} \det \begin{pmatrix}n &{} m \\ n' &{} m'\end{pmatrix} = 0 \text { for all }(n,m), (n',m') \in {{\,\mathrm{Supp}\,}}{\hat{G}} \end{aligned}$$

and \({{\,\mathrm{Supp}\,}}{\hat{G}}\) cannot generate \(\mathbb {Z}^2\). So condition 1 of Theorem 4.1 is satisfied if \({{\,\mathrm{Supp}\,}}{\hat{G}}\) generates \(\mathbb {Z}^2\).

Conversely, suppose \({{\,\mathrm{Supp}\,}}{\hat{G}}\) does not generate \(\mathbb {Z}^2\). By the classification of finitely generated abelian groups, there is a \(\mathbb {Z}\)-basis

$$\begin{aligned} (n_1,m_1), (n_2, m_2) \end{aligned}$$

for \(\mathbb {Z}^2\) such that \({{\,\mathrm{Supp}\,}}{\hat{G}}\) is generated by

$$\begin{aligned} a\cdot (n_1, m_1), b\cdot (n_2,m_2) \end{aligned}$$

where a and b are integers not both \(\pm 1\). Then there is some \((u,v) \notin 2\pi \mathbb {Z}^2\) such that

$$\begin{aligned} \begin{pmatrix}an_1 &{} am_1 \\ bn_2 &{} bm_2\end{pmatrix}\cdot \begin{pmatrix}u \\ v\end{pmatrix} \end{aligned}$$

takes values in \(2\pi \mathbb {Z}\), so that \(\exp (i(nu+mv)) = 1\) for all \((n,m) \in {{\,\mathrm{Supp}\,}}{\hat{G}}\). So for any point \((x,y)\in \mathbb {T}\), \((x+u,y+v)\in \mathbb {T}\) is a distinct point with the same observation curve, and no Takens map can distinguish between (xy) and \((x+u, y+v)\). \(\square \)

Remark 5.2

Theorem 5.1 can be strengthened to include rational windings. In this case one cannot expect the delay mapping to recover all Fourier modes of an observation function, but only those which are coprime to the slope of the winding.

Remark 5.3

For the irrational winding on the torus, the Koopman eigenfunctions are given by the Fourier basis. The Vandermonde inversion in Eq. (6) above shows that the Fourier modes of an observation are determined by its delay mapping. We are not aware of such a connection between Takens and Koopman, though it seems natural in this context.

By Theorem 5.1, whether or not G is good for an irrational flow depends only on the support \({{\,\mathrm{Supp}\,}}{\hat{G}}\). The quasiperiodic function

$$\begin{aligned} g(t) = \cos \sqrt{2} t + \cos t \end{aligned}$$
(7)

is the observation of \(G(x,y) = \cos (x)+\cos (y)\) along the irrational flow \((\sqrt{2}t, t)\) on the planar torus \(\mathbb {T}=\mathbb {R}^2/2\pi \mathbb {Z}^2\). A point cloud densely sampled from the sliding window \({{\,\mathrm{SW}\,}}_1^{10}g(t)\) coordinates given by 10 uniform shifts of g(t) yields a curve in \(\mathbb {R}^{10}\) with toroidal persistence (Fig. 10).

Fig. 10
figure 10

The observation function \(\cos (x) + \cos (y)\) for the same flow as Fig. 4

5.2 The Klein bottle

As in our example in Sect. 2.4.1, we write the Klein bottle \(\mathbb {K}\) as the quotient of the torus \(\mathbb {T}\) by the automorphism \(\kappa : (x,y) \mapsto (x+\pi , -y)\). The irrational flow on the \(\mathbb {T}\) is not \(\kappa \)-invariant since \(\kappa \) is orientation reversing in the y coordinate. To approximate the shallow flow in Figs. 5 and 6 above, we construct a vector field which flows cyclically along a repellor \(y=0\) and an attractor \(y=\pi \) by restricting a linear flow to the fundamental domain \([0,2\pi ]\times [0, \pi ]\) and flatten it out on the boundary circles \(y=0, \pi \). For \(\alpha ,\beta \in \mathbb {R}\) with \(0< \alpha /\beta<<1\) irrational, let \(X_\epsilon \) be a vector field on the rectangle given by

$$\begin{aligned} X_\epsilon (x,y)= {\left\{ \begin{array}{ll} (\alpha , \rho (y)) &{} 0 \le y \le \epsilon \\ (\alpha , \beta ) &{} \epsilon< y \le \pi -\epsilon \\ (\alpha , \rho (\pi +\epsilon -y)) &{} \pi -\epsilon < y \le \pi \end{array}\right. } \end{aligned}$$
(8)

where \(\rho \) is a smooth function on a neighborhood of \([0, \epsilon ]\) with \(\rho (0) = 0\), \(\rho (\epsilon ) = \beta \) making \(X_\epsilon \) smooth. For example, \(\rho = \beta \exp (1/(y/\epsilon -1)^2-1)\). Then \(X_\epsilon \) extends uniquely to a \(\kappa \)-invariant vector field on on \(\mathbb {T}\), and therefore induces a vector field on \(\mathbb {K}\).

Theorem 5.4

Let \(G:\mathbb {T}\rightarrow \mathbb {R}\) be a \(\kappa \)-invariant function on \(\mathbb {T}\). For fixed \(N\tau \), the Takens map

$$\begin{aligned} \Psi _\tau ^N: \mathbb {K}\rightarrow \mathbb {R}\end{aligned}$$

induced by G and \(X_\epsilon \) for arbitrarily small \(\epsilon \) and slope \(\alpha /\beta<<1\) is an embedding if and only if the following conditions hold:

  1. 1.

    \(G(x,\pi )\) and G(x, 0) have period \(\pi \) in x and do not differ by a shift

  2. 2.

    \({{\,\mathrm{Supp}\,}}{\hat{G}}\) generates \(\mathbb {Z}^2\)

Proof

Suppose G is good for \(X_\epsilon \). Since \(X_\epsilon \) flows horizontally at \(y=0,\pi \), condition (1) must hold so that each point is uniquely determined by its observation curve. Condition (2) must hold as well, since \(X_\epsilon \) is given by an irrational winding away from the \(\epsilon \)-neighborhood of \(y=0,\pi \) and the same argument as in Theorem 5.1 above applies for sufficiently shallow slope \(\alpha /\beta \) because \(N\tau \) is fixed.

Conversely, suppose conditions (1) and (2) hold. \(X_\epsilon \) is given by an irrational flow away from the \(\epsilon \)-neighborhood of \(y=0,\pi \). Furthermore, any point in the \(\epsilon \)-strip with \(y\ne 0,\pi \) may be flowed to a point where \(X_\epsilon \) has irrational slope. The same argument as in Theorem 5.1 shows that the Takens map restricts to an embedding on \(y\ne 0, \pi \).

By condition 1, the observation curve of a point (xy) where \(y=0,\pi \) uniquely determines x modulo \(\pi \), and is periodic and therefore distinct from any observation curve for \(y\ne 0,\pi \). So each point is uniquely determined by its observation curve as per condition (2) of Theorem 4.1.

It remains to show that the Takens map is immersive at \(y=0,\pi \). If not, then \(\frac{\partial G}{\partial y}\) vanishes on the circles \(y=0,\pi \), a neighorhood about which \(\Psi _\tau ^N\) would fail to immerse, a contradiction. \(\square \)

According to Theorem 5.4, the “simplest” \(\kappa \)-symmetric good observation is

$$\begin{aligned} G(x,y) = \cos 2x + \cos x\sin y + \cos y. \end{aligned}$$
(9)

Indeed, the Fourier coefficients of G are supported at \((\pm 2,0), (\pm 1, \pm 1), (0, \pm 1)\), which generates \(\mathbb {Z}^2\). Along the limit cycles we have \(G(x, 0) = \cos 2x + 1\) and \(G(x,0) = \cos 2x - 1\), which are distinct and doubly periodic. Let g(t) be the time series resulting from the flow from Eq. 8 composed with the observation G(xy). Then a point cloud densely sampled from the sliding window \(\hbox {SW}^{40}_1 g(t)\) yields the persistence of a Klein bottle, as shown in Fig.11. Furthermore, Fig. 12 shows that the projective coordinates of this point cloud result in two Möbius strips connected along their boundary, which is further evidence that the sliding window has recovered the Klein bottle.

Fig. 11
figure 11

The observation function G(xy) from Eq. 9 for the same flow as Fig. 5. Indeed the good observation function reproduces \({\mathbb {K}}\), as evidenced by the persistence diagram

Fig. 12
figure 12

Projective coordinates for a Klein bottle with observation function specified in Eq. 9. We plot the first half in the second subplot, which traces a Möbius strip from its core to its boundary, shown in yellow. Then, that boundary is glued to a second Möbius strip, which corresponds to the second half of the time series, as shown on the right (color figure online)

Intuitively, the \(\cos 2x\) term is responsible for delay-mapping the limit cycles \(y=0,\pi \) via a double covering. Without this term, the boundary \(G(x, 0) = 1\), \(G(x, \pi ) = -1\) along the bottom and top boundaries, respectively. Not only are these boundaries no longer identified, but they also each map to a single point, turning the Klein bottle into a sphere. The delay mapping of \(\cos x\sin y\) fills two Möbius strips in conjuction with \(\cos (2x)\), while the \(\cos (y)\) term serves to “separate” the Möbius strips, as shown in the right hand side of Fig. 12.

We can also see this by parameterizing the flow by a single variable \(t = x\) and examining the time series directly. In this case, the time series is

$$\begin{aligned} g(t) = \cos (2t) + \cos (t) \sin \left( \frac{\alpha }{\beta } t \right) + \sin \left( \frac{\alpha }{\beta } t \right) \end{aligned}$$
(10)

for \(\epsilon< \frac{\alpha }{\beta } t < \pi - \epsilon \). Over small ranges of t, the sine terms are approximately constant. The time series is then of the form \(\cos (2t) + a \cos (t), |a| < 1\); that is, its sliding window embedding locally parameterizes the boundary of a Möbius strip (Perea and Harer 2015). As it moves further along, a changes, and so it fills out the strip.

6 Discussion

It is clear what circular and toroidal observations look like in the time domain, and as we have mentioned, there are many applications that take advantage of this knowledge. The theory developed in this paper has enabled us to move beyond this and to develop examples of signals recovering other manifolds.

Also, by showing the existence of time series whose “attractors” are on twisted spaces, we also provide further motivation for TDA time series users to move beyond exclusively using \({\mathbb {Z}} / 2 {\mathbb {Z}}\) in TDA. The latter is the default option across most applications of TDA in time series analysis, but it is possible that these pipelines are blind to important features, as some of our examples show.

Moreover, just as circular and toroidal sliding window embeddings have interpretations in terms of physical phenomena, the presence of Klein bottles, Moebius strips, spheres, projective planes, etc, should also have practical meaning. It is unlikely that one could recognize the significance of these time series in the wild without such examples in hand, and being primed as such makes it more likely that we will be able to discover physical examples where non-orientable state spaces are natural.

Finally, we note that not only do we have a method for producing time series recovering other manifolds, which we have validated empirically using persistent homology and Eilenberg MacClane coordinates, but the method is backed by a theorem that indicates exactly when it will succeed/fail.