1 Introduction

The familiar parallelogram law states that for any vectors \(x\) and \(y\) in a Hilbert space \(\mathcal {H}\), we have

$$\begin{aligned} \Vert x+y\Vert ^{2} + \Vert x-y\Vert ^{2} = 2\Vert x\Vert ^{2} + 2\Vert y\Vert ^{2} \end{aligned}$$
(1.1)

If this condition is imposed on a normed space, then in fact the polarization identity

$$\begin{aligned} \langle x,y \rangle = \frac{\Vert x+y\Vert ^{2} - \Vert x-y \Vert ^{2}}{4} + i\frac{\Vert ix-y\Vert ^{2} - \Vert ix+y \Vert ^{2}}{4} \end{aligned}$$

(assuming complex scalars) determines an inner product, with \(\Vert x\Vert ^{2} = \langle x,x \rangle \) for all vectors \(x\). Thus, the parallelogram law would have to be altered or weakened in some way in order to apply to a more general normed space.
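
As a concrete illustration (added here as an aside, not part of the argument), the polarization identity is easy to verify numerically in the Hilbert space \(\mathbb {C}^{n}\). The following Python/NumPy sketch assumes the convention that the inner product is linear in its first argument, which is the convention under which the identity above recovers \(\langle x,y \rangle \).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

def nrm2(v):
    # squared Euclidean norm ||v||^2
    return np.vdot(v, v).real

# Right-hand side of the polarization identity stated above
recovered = (nrm2(x + y) - nrm2(x - y)) / 4 \
    + 1j * (nrm2(1j * x - y) - nrm2(1j * x + y)) / 4

# Inner product <x, y> = sum_k x_k * conj(y_k), linear in the first slot
direct = np.sum(x * np.conj(y))

print(abs(recovered - direct))   # agreement to machine precision
```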

One example of this is furnished by Clarkson’s inequalities, which constitute parallelogram laws of sorts for the spaces \(L^{p}=L^{p}(\Omega ,\Sigma ,\mu )\), where \(1 < p < \infty \) and \((\Omega ,\Sigma ,\mu )\) is any measure space (see, for instance, [7, p. 119]). If \(1 < p \le 2\), then

$$\begin{aligned} \Vert x+y\Vert ^{p}_p + \Vert x-y\Vert ^{p}_p \ge 2^{p-1}\left( \Vert x\Vert ^{p}_p + \Vert y\Vert ^{p}_p\right) \end{aligned}$$
(1.2)

for all \(x\) and \(y\) in \(L^{p}\); and if \(2 \le p < \infty \), then

$$\begin{aligned} \Vert x+y\Vert ^{p}_p + \Vert x-y\Vert ^{p}_p \le 2^{p-1}\left( \Vert x\Vert ^{p}_p + \Vert y\Vert ^{p}_p\right) \end{aligned}$$
(1.3)

for all \(x\) and \(y\) in \(L^{p}\). Another example comes from Bynum and Drew [5] and Bynum [4]. They discovered what they call weak parallelogram laws for \(L^{p}\):

If \(1 < p \le 2\), then

$$\begin{aligned} \Vert x+y\Vert ^{2}_p + (p-1)\Vert x-y\Vert ^{2}_p \le 2\left( \Vert x\Vert ^{2}_p + \Vert y\Vert ^{2}_p\right) \end{aligned}$$
(1.4)

for all \(x\) and \(y\) in \(L^{p}\); and if \(2 \le p < \infty \), then

$$\begin{aligned} \Vert x+y\Vert ^{2}_p + (p-1)\Vert x-y\Vert ^{2}_p \ge 2\left( \Vert x\Vert ^{2}_p + \Vert y\Vert ^{2}_p\right) \end{aligned}$$
(1.5)

for all \(x\) and \(y\) in \(L^{p}\). That is, they are able to impose a version of condition (1.1) on the space \(L^{p}\), at the cost of introducing a constant factor \((p-1)\) and weakening the equality to an inequality. In both examples, the resulting inequalities tell us something about the geometry of the space, such as smoothness and convexity properties.
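
The four inequalities above are also easy to probe numerically. The following Python/NumPy sketch (our own illustration, not drawn from [4, 5, 7]) tests (1.2)–(1.5) on random vectors in finite-dimensional \(\ell ^{p}\), which is an \(L^{p}\) space over counting measure; a random search of course cannot prove the inequalities, but it can expose an error in transcribing them.

```python
import numpy as np

rng = np.random.default_rng(1)

def lp(v, p):
    # the l^p norm, an instance of the L^p norm for counting measure
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def check(p, trials=10000, dim=6):
    for _ in range(trials):
        x, y = rng.standard_normal(dim), rng.standard_normal(dim)
        s, d, a, b = lp(x + y, p), lp(x - y, p), lp(x, p), lp(y, p)
        clarkson = s**p + d**p - 2**(p - 1) * (a**p + b**p)
        bynum = s**2 + (p - 1) * d**2 - 2 * (a**2 + b**2)
        if p <= 2:
            assert clarkson >= -1e-9 and bynum <= 1e-9   # (1.2) and (1.4)
        else:
            assert clarkson <= 1e-9 and bynum >= -1e-9   # (1.3) and (1.5)

for p in (1.3, 1.7, 2.0, 2.5, 4.0):
    check(p)
print("no violations found")
```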

Guided by these two examples, and in the interest of pursuing a parallelogram law for general normed linear spaces, let us adopt the following terminology.

Definition 1.1

Let \(C>0\) and \(1< p < \infty \). A Banach space \({\mathcal {X}}\) satisfies a \(p\)-lower weak parallelogram law with constant \(C\) if

$$\begin{aligned} \Vert x + y \Vert ^{p} + C\Vert x - y\Vert ^{p} \le 2^{p-1}\left( \Vert x\Vert ^{p} + \Vert y\Vert ^{p}\right) \end{aligned}$$
(1.6)

for all \(x\) and \(y\) in \({\mathcal {X}}\). In this case let us say that \({\mathcal {X}}\) is \(p\)-LWP(\(C\)).

Similarly, a Banach space \({\mathcal {X}}\) satisfies a \(p\)-upper weak parallelogram law with constant \(C\) if

$$\begin{aligned} \Vert x + y \Vert ^{p} + C\Vert x - y\Vert ^{p} \ge 2^{p-1}\left( \Vert x\Vert ^{p} + \Vert y\Vert ^{p}\right) \end{aligned}$$
(1.7)

for all \(x\) and \(y\) in \({\mathcal {X}}\). In this case let us say that \({\mathcal {X}}\) is \(p\)-UWP(\(C\)).

When convenient, we may suppress the parameter \(p\) or the constant \(C\). Naturally, we speak of Banach spaces satisfying LWP or UWP as weak parallelogram spaces. By this terminology, the Clarkson inequalities say that \(L^{p}\) is \(p\)-UWP(1) when \(1 < p \le 2\), and \(L^{p}\) is \(p\)-LWP(\(1\)) when \(2 \le p < \infty \). Similarly, Bynum’s inequalities assert that \(L^{p}\) is \(2\)-LWP(\(p-1\)) when \(1 < p \le 2\), and \(L^{p}\) is \(2\)-UWP(\(p-1\)) when \(2 \le p < \infty \).

In this paper, our objectives are to obtain some properties of weak parallelogram spaces, and to apply those ideas toward the prediction of certain processes. First, a complete description is obtained for the values of \(r\) and \(p\) for which the space \(L^r\) satisfies \(p\)-LWP(\(C\)) for some \(C>0\), and respectively \(p\)-UWP(\(C\)).

The geometry of weak parallelogram spaces is then explored. We note that LWP spaces are uniformly convex, and UWP spaces are uniformly smooth. We also derive Pythagorean-type theorems for weak parallelogram spaces. In this context, orthogonality is in the Birkhoff–James sense: two vectors \(x\) and \(y\) in \({\mathcal {X}}\) satisfy \(x\perp y\) if

$$\begin{aligned} \Vert x+ ay \Vert \ge \Vert x\Vert \end{aligned}$$

for all scalars \(a\). (See [1] for a recent review of Birkhoff–James orthogonality in normed linear spaces).
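
For readers who prefer a computational picture, here is a small Python sketch (an illustration only) of Birkhoff–James orthogonality in real two-dimensional \(\ell ^{p}\); the pair of vectors used, \(x = (c,c)\) and \(y = (1,-1)\), reappears in the proof of Proposition 3.6 below.

```python
import numpy as np

def lp(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

# In real 2-dimensional l^p, x = (c, c) is Birkhoff-James orthogonal to
# y = (1, -1): the norm of x + a*y is minimized at a = 0.
p = 3.0
x = np.array([2.0, 2.0])
y = np.array([1.0, -1.0])
ts = np.linspace(-5.0, 5.0, 2001)
vals = np.array([lp(x + t * y, p) for t in ts])
print(bool(vals.min() >= lp(x, p) - 1e-12))   # True: ||x + a y|| >= ||x|| for all a sampled
```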

Weak parallelogram spaces are shown to have the following properties. If \({\mathcal {X}}\) is \(p\)-LWP, then there exists a constant \(K >0\) such that

$$\begin{aligned} \Vert x+y \Vert ^{p} \ge \Vert x\Vert ^{p} + K \Vert y\Vert ^{p} \end{aligned}$$

whenever \(x \perp y\); similarly, if \({\mathcal {X}}\) is \(p\)-UWP, then there exists a constant \(K >0\) such that

$$\begin{aligned} \Vert x+y \Vert ^{p} \le \Vert x\Vert ^{p} + K \Vert y\Vert ^{p} \end{aligned}$$

whenever \(x \perp y\). Indeed, if \(p=2\), \(K=1\), and equality holds, then we recover the usual Pythagorean Theorem.

These ideas are used to study certain sequences of vectors in a Banach space. Specifically, we are interested in sequences \(\{X_{n}\}_{n=0}^{\infty }\) of unit vectors for which \(m < n\) implies that \(X_m \perp X_{n}\) in the Birkhoff–James sense. This structure is motivated by prediction problems for \(p\)-stationary processes, which are themselves extensions of prediction theory for stationary Gaussian processes (i.e., the \(p=2\) case). For examples of processes of this type, see [6, 9–11, 14]. For processes in the broader class of weak parallelogram spaces, bounds are obtained for the coefficients \(\{a_{n}\}\) whenever \(\sum _{n=0}^{\infty } a_{n} X_{n} \) converges in norm.

In addition, a Baxter-type inequality is derived for LWP spaces. This gives us a practical bound for the average error incurred when the best estimate of \(X_0\) in the linear span of \(\{X_1, X_2, X_3, \ldots \}\), truncated at the \(n\)th term, is used in place of the best estimate of \(X_0\) in the linear span of \(\{X_1, X_2, X_3, \ldots ,X_{n}\}\). The original Hilbert space version due to Baxter [2] has proved very useful in time series analysis [12, 16]. Our work extends this notion to a broader class of processes.

Finally, a number of criteria are given for such sequences \(\{X_{n}\}_{n=0}^{\infty }\) to be regular (in the sense of being purely linearly nondeterministic), if the underlying space is LWP. For instance, regularity of the sequence is shown to be equivalent to \(\{X_{n}\}_{n=0}^{\infty }\) being a conditional basis for its span.

We conclude this section with some notational and technical preliminaries. All Banach spaces being considered here are assumed to be of dimension 2 or greater, so as to avoid trivialities. The function space \(L^{p}(\Omega ,\Sigma ,\mu )\) is simply referred to as \(L^{p}\), there being no need to refer to the underlying measure space. The functions may be real or complex valued. Typically, it is assumed that \(1 < p < \infty \), so that \(L^{p}\) is a reflexive Banach space. As usual, the norm in \(L^{p}\) is written \(\Vert \cdot \Vert _p\). In this setting, let \(p'\) denote the conjugate index to \(p\): thus \((1/p) + (1/p') = 1\).

If \(x\) is a nonzero vector in a Banach space \({\mathcal {X}}\), then a norming functional for \(x\) is a norm one functional \(T\) in \({\mathcal {X}}^*\) such that \(T(x) = \Vert x\Vert \). The Hahn-Banach theorem assures the existence of such a functional. If the norming functional for \(x\) is unique, then let us call it \(T_x\). If every nonzero vector \(x\) in \({\mathcal {X}}\) has a unique norming functional, then \({\mathcal {X}}\) is said to be smooth. For example, the space \(L^{p}\) is smooth whenever \(1 < p < \infty \), but fails to be smooth when \(p=1\) or \(p=\infty \). Here are some well established facts about smoothness.

Proposition 1.2

The following statements about a Banach space \({\mathcal {X}}\) are equivalent: (i) \({\mathcal {X}}\) is smooth; (ii) the norm in \({\mathcal {X}}\) has directional derivatives at every nonzero point; and (iii) Birkhoff–James orthogonality \(\perp \) in \({\mathcal {X}}\) is linear in its second argument. Furthermore, if \({\mathcal {X}}\) is smooth, then \(x \perp y\) in \({\mathcal {X}}\) if and only if \(T_x(y) = 0\).

These properties are particularly relevant to the linear prediction problems of the later sections. For the proofs, see [3] for the equivalence of (i) and (ii); and [13] for the equivalence of (i) and (iii).

Other natural and interesting questions arise. One might consider, for example, properties of and relationships between the parameters \(p\) and \(C\) in the condition \(p\)-LWP(\(C\)) or \(p\)-UWP(\(C\)). In addition, the weak parallelogram properties exhibited by the \(L^{p}\) spaces suggest a more general duality relationship. Furthermore, connections could be established between the weak parallelogram laws and other ways of expressing a parallelogram law on a general normed space, for example, the Rademacher type and co-type properties. In order to maintain our present focus on applications, we defer these issues to another paper [8].

2 Weak parallelogram laws for \(L^{p}\)

The inequalities (1.2), (1.3), (1.4) and (1.5) show that for \(1< p < \infty \), the \(L^{p}\) spaces satisfy certain weak parallelogram laws. Our aim in this section is to identify all of the weak parallelogram laws satisfied by the \(L^{p}\) spaces.

Theorem 2.1

If \(1 < p \le 2\), and \(p'\) is the conjugate index, then \(L^{p}\) is:

$$\begin{aligned} \begin{array}{r@{\quad }c@{\quad }l} r\text {-UWP}(1) & \text {when} & 1<r \le p; \\ r\text {-LWP}\left( (p-1)^{r/2}\right) & \text {when} & 2\le r \le p'; \text { and}\\ r\text {-LWP}(1) & \text {when} & r \ge p'. \end{array} \end{aligned}$$

If \(2 \le p < \infty \), and \(p'\) is the conjugate index, then \(L^{p}\) is:

$$\begin{aligned} \begin{array}{r@{\quad }c@{\quad }l} r\text {-LWP}(1) & \text {when} & p \le r < \infty ; \\ r\text {-UWP}\left( (p-1)^{r/2}\right) & \text {when} & p' \le r \le 2; \text { and}\\ r\text {-UWP}(1) & \text {when} & r \le p'. \end{array} \end{aligned}$$

To prove this, we begin by extending Clarkson’s inequalities. There are eight cases, presented in two lemmas.

Lemma 2.2

Suppose that \(1 < p < \infty \) and \(1 < r < \infty \).

  1. (i)

    If \(r \ge \max \{p,p'\}\), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x + y\Vert ^r_p + \Vert x-y \Vert ^r_p \le 2^{r-1}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.1)
  2. (ii)

    If \(r \le \min \{p,p'\}\), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x + y\Vert ^r_p + \Vert x-y \Vert ^r_p \ge 2^{r-1} \left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.2)
  3. (iii)

    If \(1 < p \le r \le p' < \infty \), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x + y\Vert ^r_p + \Vert x-y \Vert ^r_p \le 2^{r/p}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.3)
  4. (iv)

    If \(1 < p' \le r \le p < \infty \), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x + y\Vert ^r_p + \Vert x-y \Vert ^r_p \ge 2^{r/p}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.4)

Proof

In the case \(2 \le p \le r < \infty \), we use the inequality (1.3) to get

$$\begin{aligned}&\left( \Vert x+y \Vert ^r_p + \Vert x-y \Vert ^r_p \right) ^{p/r} \\&\quad \le \Vert x+y \Vert ^{p}_p + \Vert x-y \Vert ^{p}_p \\&\quad \le 2^{p-1} \left( \Vert x \Vert ^{p}_p + \Vert y \Vert ^{p}_p \right) \\&\quad = 2^{p-1} \left( \Vert x \Vert ^{p}_p\cdot 1 + \Vert y \Vert ^{p}_p\cdot 1 \right) \\&\quad \le 2^{p-1} \left( \Vert x \Vert ^r_p + \Vert y \Vert ^r_p \right) ^{p/r}\left( 1^{ r/(r-p)} + 1^{ r/(r-p)} \right) ^{(r-p)/r} \end{aligned}$$

The last step comes from applying the Hölder inequality to two-dimensional \(\ell ^{1}\), using the exponent \(r/p\) and its conjugate. Now raise both sides to the power \(r/p\), noting that on the right hand side, 2 is raised to the power

$$\begin{aligned} \frac{r(p-1)}{p} +\frac{r(r-p)}{rp} = r-1 \end{aligned}$$

This proves part of case (i). If \(2 \le p' \le r < \infty \), then

$$\begin{aligned}&2\left( \Vert x\Vert ^r_p +\Vert y\Vert ^r_p\right) ^{p'(p-1)/r} \nonumber \\&\quad \le 2\left( \Vert x\Vert ^{p'}_p + \Vert y\Vert ^{p'}_p\right) ^{p-1}\nonumber \\&\quad \le \Vert x+y\Vert ^{p}_p + \Vert x-y\Vert ^{p}_p \nonumber \\&\quad \le \left( \Vert x+y\Vert ^r_p + \Vert x-y\Vert ^r_p\right) ^{p/r}2^{1-p/r} \end{aligned}$$
(2.5)

where the second inequality in (2.5) is yet another of Clarkson’s inequalities [7, p. 119].

It follows that

$$\begin{aligned} 2\left( \Vert x\Vert ^r_p +\Vert y\Vert ^r_p\right) \le \Vert x+y\Vert ^r_p + \Vert x-y\Vert ^r_p \end{aligned}$$

Now replace \(x\) and \(y\) with \(x+y\) and \(x-y\), respectively, to see that this is equivalent to (2.1). This completes the verification of (i). The rest is similar. \(\square \)

Indeed, when \(p=r\), these are the original Clarkson’s Inequalities, and the special case \(p=r=2\) is the parallelogram law proper. However, only cases (i) and (ii) above are weak parallelogram laws as defined here in (1.6) and (1.7). The other two cases are near-misses since the multiplicative constant \(2^{r/p}\) is of the wrong form. To obtain further results, we need the following extension of the Bynum and Drew inequalities.

Lemma 2.3

Suppose that \(1 < r < \infty \) and \(1 < p < \infty \).

  1. (v)

    If \(1 < p \le r \le 2\), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x+y \Vert ^r_p + (p-1)^{r/2}\Vert x-y \Vert ^r_p \le 2\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.6)
  2. (vi)

    If \(1 < p \le 2 \le r \le p'\), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x+y \Vert ^r_p + (p-1)^{r/2}\Vert x-y \Vert ^r_p \le 2^{r-1}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.7)
  3. (vii)

    If \(2 \le r \le p < \infty \), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x+y \Vert ^r_p + (p-1)^{r/2}\Vert x-y \Vert ^r_p \ge 2\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.8)
  4. (viii)

    If \(p' \le r \le 2 \le p < \infty \), then for all \(x\) and \(y\) in \(L^{p}\),

    $$\begin{aligned} \Vert x+y \Vert ^r_p + (p-1)^{r/2}\Vert x-y \Vert ^r_p \ge 2^{r-1}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$
    (2.9)

Proof

For (v), use Bynum’s weak parallelogram law (1.4) for \(L^{p}\), along with Hölder’s Inequality on two-dimensional \(\ell ^{2/r}\), as follows:

$$\begin{aligned}&\Vert x+y \Vert ^r_p + (p-1)^{r/2}\Vert x-y \Vert ^r_p \\&\quad \le \left( \Vert x+y \Vert ^{2}_p + (p-1)\Vert x-y \Vert ^{2}_p \right) ^{r/2} (1 + 1)^{1 - r/2}\\&\quad \le \left( 2[ \Vert x\Vert ^{2}_p + \Vert y \Vert ^{2}_p ]\right) ^{r/2} 2^{1 - r/2} \\&\quad \le 2\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$

The other assertions are similarly handled. \(\square \)

Note that when \(r=2\) we indeed get the inequalities of Bynum [4] and Bynum and Drew [5].

Cases (vi) and (viii) are weak parallelogram laws. It will turn out that for the remaining cases (v) and (vii), which are again near-misses, we can rule out any \(r\)-weak parallelogram laws in principle. (See Proposition 3.6.) This exhausts all of the possibilities for \(p\) and \(r\), and affirms Theorem 2.1. Note that \(L^r\) is \(p\)-LWP if and only if \(L^{r'}\) is \(p'\)-UWP; a more general duality relationship is established in [8]. This completes the description of \(p\)-weak parallelogram laws for the \(L^r\) spaces.
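
Although the proofs above are complete, a numerical spot-check of Theorem 2.1 may be reassuring. The Python sketch below (an illustration, using finite-dimensional \(\ell ^{p}\) as a stand-in for \(L^{p}\)) tests the middle case of the first list, namely that \(L^{p}\) is \(r\)-LWP\(((p-1)^{r/2})\) when \(1 < p \le 2\) and \(2 \le r \le p'\); a random search cannot prove the law, but any negative value returned would refute it.

```python
import numpy as np

rng = np.random.default_rng(3)

def lp(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def lwp_defect(p, r, C, x, y):
    # r-LWP(C) in l^p asserts that this quantity is nonnegative
    return (2 ** (r - 1)) * (lp(x, p) ** r + lp(y, p) ** r) \
        - lp(x + y, p) ** r - C * lp(x - y, p) ** r

def min_defect(p, r, C, trials=20000, dim=4):
    worst = np.inf
    for _ in range(trials):
        x, y = rng.standard_normal(dim), rng.standard_normal(dim)
        worst = min(worst, lwp_defect(p, r, C, x, y))
    return worst

p, r = 1.5, 2.5                                # here p' = 3, so 2 <= r <= p'
print(min_defect(p, r, (p - 1) ** (r / 2)))    # remains >= 0, up to rounding
```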

3 Geometry of weak parallelogram spaces

Let us explore some of the geometric consequences of the weak parallelogram laws. First, we note the following convexity and smoothness properties. The proofs follow those of [4] for the \(p=2\) case, mutatis mutandis.

Proposition 3.1

Let \(1 < p < \infty \). If a Banach space \(\mathcal {X}\) satisfies \(p\)-LWP(\(C\)), then \(\mathcal {X}\) is uniformly convex. If a Banach space \(\mathcal {X}\) satisfies \(p\)-UWP(\(C\)), then \(\mathcal {X}\) is uniformly smooth.

Thus an LWP space is reflexive, by the Milman–Pettis Theorem [15], and it enjoys the unique nearest point property.

Next, suppose that \({\mathcal {X}}\) is a smooth Banach space, so that for each nonzero \(x\) in \({\mathcal {X}}\), there is a unique norm one functional \(T_x\) satisfying \(T_x(x) = \Vert x\Vert \). We establish below that weak parallelogram laws on smooth Banach spaces are expressible as bounding conditions on the norming functionals. Here \(\mathfrak {R}z\) stands for the real part of a complex number \(z\).

Lemma 3.2

Let \(1 < p < \infty \).

A smooth Banach space \({\mathcal {X}}\) is \(p\)-LWP if and only if for some positive constant \(K\), and for all \(x\ne 0\) and \(y\) in \({\mathcal {X}}\),

$$\begin{aligned} \Vert x+y\Vert ^{p} \ge \Vert x\Vert ^{p} + K\Vert y\Vert ^{p} + p\Vert x\Vert ^{p-1}\mathfrak {R}(T_x(y)) \end{aligned}$$
(3.1)

A smooth Banach space \({\mathcal {X}}\) is \(p\)-UWP if and only if for some positive constant \(K\), and for all \(x\ne 0\) and \(y\) in \({\mathcal {X}}\),

$$\begin{aligned} \Vert x+y\Vert ^{p} \le \Vert x\Vert ^{p} + K\Vert y\Vert ^{p} + p\Vert x\Vert ^{p-1} \mathfrak {R}(T_x(y)) \end{aligned}$$
(3.2)

Proof

Suppose that (3.1) holds. Then

$$\begin{aligned} \Vert x\Vert ^{p} + K\Vert y\Vert ^{p} + p\Vert x\Vert ^{p-1} \mathfrak {R}( T_x(y))&\le \Vert x+y\Vert ^{p} \\ \Vert x\Vert ^{p} + K\Vert -y\Vert ^{p} + p\Vert x\Vert ^{p-1} \mathfrak {R}( T_x(-y))&\le \Vert x-y\Vert ^{p} \\ 2 \Vert x\Vert ^{p} + 2K\Vert y\Vert ^{p}&\le \Vert x+y\Vert ^{p} + \Vert x-y\Vert ^{p} \end{aligned}$$

Replace \(x\) with \(x+y\), and replace \(y\) with \(x-y\), to find that

$$\begin{aligned} \Vert x+y\Vert ^{p} + K \Vert x-y\Vert ^{p} \le 2^{p-1}\left( \Vert x\Vert ^{p} + \Vert y\Vert ^{p}\right) \end{aligned}$$
(3.3)

Thus \({\mathcal {X}}\) is \(p\)-LWP(\(K\)).

Conversely, assume that \({\mathcal {X}}\) is \(p\)-LWP(\(C\)) for some \(C>0\), and apply

$$\begin{aligned} \Vert u+v\Vert ^{p} + C \Vert u-v\Vert ^{p} \le 2^{p-1}\left( \Vert u\Vert ^{p} + \Vert v\Vert ^{p}\right) \end{aligned}$$

to the pair of vectors \(u=x\) and \(v=x+2^{-n}y\), for \(n = 0\), \(1\), \(2\), \(\ldots \) . The first step, with \(n=0\), provides that

$$\begin{aligned} \Vert x + {\scriptstyle \frac{1}{2}} y\Vert ^{p} + C\Vert {\scriptstyle \frac{1}{2}} y\Vert ^{p} \le {\scriptstyle \frac{1}{2}} \Vert x\Vert ^{p} + {\scriptstyle \frac{1}{2}} \Vert x+y\Vert ^{p} \end{aligned}$$

and subsequent steps may take the form

$$\begin{aligned} 2\Vert x + 2^{-(n+1)}y \Vert ^{p} + 2C\Vert 2^{-(n+1)} y\Vert ^{p} - \Vert x\Vert ^{p} \le \Vert x + 2^{-n}y\Vert ^{p} \end{aligned}$$
(3.4)

We then apply (3.4) repeatedly, substituting the last \(\Vert x + 2^{-n}y\Vert ^{p}\) term with the smaller quantity on the left. This yields the estimate

$$\begin{aligned} \Vert x+y \Vert ^{p}&\ge -(1 + 2 + 2^{2} + \cdots + 2^{n-1})\Vert x\Vert ^{p}\\&+\, \Big [\frac{1}{2^{p-1}} + \frac{1}{(2^{p-1})^{2}} + \frac{1}{(2^{p-1})^3} +\cdots + \frac{1}{(2^{p-1})^n}\Big ]C\Vert y\Vert ^{p}\\&+\, 2^n \Vert x + 2^{-n}y\Vert ^{p}\\&= -(2^n - 1) \Vert x\Vert ^{p} + \frac{1 - 2^{-(p-1)n}}{2^{p-1} -1} C \Vert y\Vert ^{p} + 2^n \Vert x + 2^{-n}y\Vert ^{p}\\&= \Vert x\Vert ^{p} + \frac{1 - 2^{-(p-1)n}}{2^{p-1} -1} C \Vert y\Vert ^{p} + \frac{ \Vert x + 2^{-n} y\Vert ^{p} - \Vert x \Vert ^{p}}{2^{-n}} \end{aligned}$$

Now take the limit as \(n\) tends toward infinity. Smoothness ensures that the directional derivative in the final term exists. To obtain its value, apply McShane’s Lemma [7, Lemma 11.17, p. 120] to get

$$\begin{aligned} \lim _{a\rightarrow 0+}\frac{\Vert x + ay\Vert ^{p} - \Vert x\Vert ^{p}}{a} = p\Vert x\Vert ^{p-1} \mathfrak {R}(T_x(y)) \end{aligned}$$

The conclusion is

$$\begin{aligned} \Vert x+y\Vert ^{p} \ge \Vert x\Vert ^{p} + \frac{C}{2^{p-1}-1}\Vert y\Vert ^{p} + p\Vert x\Vert ^{p-1} \mathfrak {R}(T_x(y)) \end{aligned}$$

This affirms (3.1) with \(K = C/(2^{p-1} -1)\), and an analogous argument establishes (3.2).   \(\square \)
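
The directional-derivative formula supplied by McShane’s Lemma is easy to confirm numerically in \(\ell ^{p}\). The Python sketch below is an illustration only; it uses the standard expression for the norming functional of a nonzero vector in real \(\ell ^{p}\), namely \(T_x(v) = \Vert x\Vert _p^{1-p}\sum _i |x_i|^{p-1}\mathrm {sgn}(x_i)v_i\), which we state here without proof.

```python
import numpy as np

p = 3.0
x = np.array([1.0, -2.0, 0.5])
y = np.array([0.3, 1.0, -1.0])

def lp(v):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def Tx(v):
    # norming functional of x in real l^p (standard formula, stated without proof)
    return np.dot(np.abs(x) ** (p - 1) * np.sign(x), v) / lp(x) ** (p - 1)

a = 1e-7                                        # small step for a one-sided difference
numeric = (lp(x + a * y) ** p - lp(x) ** p) / a
formula = p * lp(x) ** (p - 1) * Tx(y)
print(numeric, formula)                         # agree to roughly six digits
```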

This, in turn, gives rise to a Pythagorean Theorem for weak parallelogram spaces. Let \(1 < p < \infty \).

Theorem 3.3

If a smooth Banach space \({\mathcal {X}}\) is \(p\)-LWP(\(C\)), then there exists a positive constant \(K\) such that whenever \(x \perp y\) in \({\mathcal {X}}\),

$$\begin{aligned} \Vert x \Vert ^{p} + K\Vert y\Vert ^{p} \le \Vert x + y \Vert ^{p} \end{aligned}$$
(3.5)

If \({\mathcal {X}}\) is \(p\)-UWP(\(C\)), then there exists a positive constant \(K\) such that whenever \(x \perp y\) in \({\mathcal {X}}\),

$$\begin{aligned} \Vert x \Vert ^{p} + K\Vert y\Vert ^{p} \ge \Vert x + y \Vert ^{p} \end{aligned}$$
(3.6)

In either case, the constant \(K\) can be chosen to be \({C}/{(2^{p-1}-1)}\).

Proof

The inequalities follow from the fact that \(T_x(y) = 0\) if and only if \(x \perp y\), as noted in Proposition 1.2. The proof of Lemma 3.2 assures that the constant \(K = C/(2^{p-1}-1)\) suffices. \(\square \)

Let us refer to the constant \(K\) as a Pythagorean constant for the space \({\mathcal {X}}\).

The \(L^{p}\) spaces certainly satisfy Pythagorean inequalities. These follow directly from the weak parallelogram laws for \(L^{p}\).

Corollary 3.4

Let \(1 < r < \infty \), and \(1 < p < \infty \).

If \(1 < p \le 2 \le r < \infty \) or \(2 \le p \le r < \infty \), then there exists \(K>0\) such that

$$\begin{aligned} \Vert x \Vert ^r_p + K\Vert y\Vert ^r_p \le \Vert x + y \Vert ^r_p \end{aligned}$$

whenever \(x \perp y\) in \(L^{p}\).

If \(1 < r \le p \le 2 \) or \(1 < r \le 2 \le p < \infty \), then there exists \(K>0\) such that

$$\begin{aligned} \Vert x \Vert ^r_p + K\Vert y\Vert ^r_p \ge \Vert x + y \Vert ^r_p \end{aligned}$$

whenever \(x \perp y\) in \(L^{p}\).
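
As an aside, the lower Pythagorean inequality of Corollary 3.4 can be probed numerically. The Python sketch below (an illustration only) works in real \(\ell ^{p}\) with \(p = r = 3\), where Clarkson’s inequality makes the space \(p\)-LWP\((1)\), so Theorem 3.3 permits the constant \(K = 1/(2^{p-1}-1)\); Birkhoff–James orthogonal pairs are manufactured from the norming functional, using Proposition 1.2.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 3.0                            # with r = p = 3, l^p is p-LWP(1)
K = 1.0 / (2 ** (p - 1) - 1)       # Pythagorean constant from Theorem 3.3

def lp(v):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def T(x, v):
    # norming functional of x in real l^p (standard formula)
    return np.dot(np.abs(x) ** (p - 1) * np.sign(x), v) / lp(x) ** (p - 1)

worst = np.inf
for _ in range(20000):
    x, z = rng.standard_normal(5), rng.standard_normal(5)
    y = z - (T(x, z) / lp(x)) * x          # then T(x, y) = 0, so x is orthogonal to y
    worst = min(worst, lp(x + y) ** p - lp(x) ** p - K * lp(y) ** p)
print(worst)                               # stays nonnegative in our experiments
```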

Our first application to prediction theory is a bound on the coefficient growth for certain moving average processes. Let \(\{X_{n}\}_{n=0}^{\infty }\) be a sequence of vectors in a Banach space \({\mathcal {X}}\). Let us say that the sequence is an innovation sequence if each \(X_{n}\) is nonzero, and \(X_m \perp X_{n}\) whenever \(m < n\), where \(\perp \) is Birkhoff–James orthogonality. (This is a time reversal from the usual terminology.) In the special case where \(\{Z_{n}\}_{n=0}^{\infty }\) is an orthonormal sequence in a Hilbert space, we of course have \( \Vert \sum _{n=0}^{\infty } a_{n} Z_{n} \Vert ^{2} = \sum _{n=0}^{\infty } |a_{n}|^{2} \). This extends to Banach spaces in the following way.

Corollary 3.5

Let \(\{X_{n}\}_{n=0}^{\infty }\) be an innovation sequence of unit vectors in a smooth Banach space \({\mathcal {X}}\), and suppose that the series \(\sum _{n=0}^{\infty } a_{n} X_{n}\) converges in norm.

If \({\mathcal {X}}\) satisfies \(p\)-LWP, then

$$\begin{aligned} \Big \Vert \sum _{n=0}^{\infty } a_{n} X_{n} \Big \Vert ^{p} \ge \sum _{n=0}^{\infty } K^n |a_{n}|^{p} \end{aligned}$$
(3.7)

where \(K\) is the associated Pythagorean constant.

If \({\mathcal {X}}\) satisfies \(p\)-UWP, then

$$\begin{aligned} \Big \Vert \sum _{n=0}^{\infty } a_{n} X_{n} \Big \Vert ^{p} \le \sum _{n=0}^{\infty } K^n |a_{n}|^{p} \end{aligned}$$

where \(K\) is the associated Pythagorean constant.

This follows from applying the Pythagorean inequalities repeatedly. Note that since \({\mathcal {X}}\) is smooth, Birkhoff–James orthogonality is linear in its second argument (Proposition 1.2), and it is preserved under norm limits; hence \(a_0 X_{0} \perp \sum _{n=1}^{\infty } a_{n} X_{n}\), and Theorem 3.3 applies. Iterating the same argument on the remaining sum yields the stated bounds.

The above norm estimates furnish a sense of the growth or decay of the coefficients \(a_{n}\). For example, in the LWP case, (3.7) gives rise to the crude bound

$$\begin{aligned} |a_{n}| \le K^{-n/p}\Big \Vert \sum _{m=0}^{\infty } a_m X_m \Big \Vert \end{aligned}$$

where once again \(K\) is the associated Pythagorean constant.

Finally, let us return to the spaces \(L^{p}\) and use the Pythagorean inequalities to affirm that Theorem 2.1 indeed describes all of the \(r\)-weak parallelogram laws for \(L^{p}\). That is, for \(r\) and \(p\) outside of its conditions, a weak parallelogram law fails to exist.

Proposition 3.6

Let \(1 < r < \infty \), and \(1 < p < \infty \). If \(r > p\) or \(r > 2\), then there does not exist a positive constant \(C\) such that for all \(x\) and \(y\) in \(L^{p}\),

$$\begin{aligned} \Vert x+y \Vert ^r_p + C\Vert x-y \Vert ^r_p \ge 2^{r-1}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$

If \(r < p\) or \(r < 2\), then there does not exist a positive constant \(C\) such that for all \(x\) and \(y\) in \(L^{p}\),

$$\begin{aligned} \Vert x+y \Vert ^r_p + C\Vert x-y \Vert ^r_p \le 2^{r-1}\left( \Vert x\Vert ^r_p + \Vert y\Vert ^r_p\right) \end{aligned}$$

Proof

Our strategy is to show that the respective Pythagorean inequalities fail, and hence the corresponding weak parallelogram laws fail. Consider the example of 2-dimensional \(\ell ^{p}\) with vectors \(x = (a,a)\) and \(y = (1,-1)\). Note that \(x \perp _p y\). Assuming \(a\) is positive and large, we find that

$$\begin{aligned}&\frac{\Vert x + y\Vert ^r_p - \Vert x\Vert ^r_p}{\Vert y\Vert ^r_p} \\&\quad = \frac{\Big (|a+1|^{p} + |a-1|^{p}\Big )^{r/p}-\Big (|a|^{p} + |a|^{p}\Big )^{r/p}}{(1+1)^{r/p}}\\&\quad = a^r \Big [ 1 + \frac{p(p-1)}{2a^{2}} + O(1/a^3) \Big ]^{r/p} - a^r \\&\quad = \frac{1}{2} a^{r-2}r(p-1) + O\left( a^{r-3}\right) \end{aligned}$$

where the estimates come from the binomial series:

$$\begin{aligned} \Big (1 + \frac{1}{a} \Big )^{p} = 1 + p\Big (\frac{1}{a}\Big ) + \frac{p(p-1)}{2}\Big (\frac{1}{a}\Big )^{2} + O\Big (\Big [\frac{1}{a}\Big ]^3\Big ) \end{aligned}$$

As \(a \rightarrow \infty \) this tends to the limit \(0\) if \(1 < r < 2\); it diverges to \(\infty \) if \(2 < r < \infty \). It follows that \(\ell ^{p}\) fails to be \(r\)-LWP if \(1 < r < 2\), and that \(\ell ^{p}\) fails to be \(r\)-UWP if \(2 < r < \infty \).

Next, consider 2-dimensional \(\ell ^{p}\) with vectors \(x = (1,0)\) and \(y = (0,a)\). As before, \(x \perp _p y\). Assume that \(a\) is positive and small. Then

$$\begin{aligned}&\frac{\Vert x + y\Vert ^r_p - \Vert x\Vert ^r_p}{\Vert y\Vert ^r_p} \\&\quad = \frac{\Big (1 + a^{p}\Big )^{r/p}-1}{a^{r}}\\&\quad = \frac{r}{p}a^{p-r} + O\left( a^{2p-r}\right) \end{aligned}$$

As \(a \rightarrow 0+\), this tends to \(0\) if \(p>r\); it diverges to \(\infty \) if \(r>p\). We may conclude that \(\ell ^{p}\) fails to be \(r\)-LWP if \(p>r\), and it fails to be \(r\)-UWP if \(r>p\).

Now, every \(L^{p}\) (of dimension at least 2) contains a possibly weighted copy of 2-dimensional \(\ell ^{p}\). It is a simple matter to account for the weights, and thus extend the above conclusions to \(L^{p}\). Indeed, if the weights are \(w_1\) and \(w_2\), then the mapping \((x_1,x_2)\) \(\mapsto \) \((x_1/w_1^{1/p}, x_2/w_2^{1/p})\) is an isometry from \(\ell ^{p}\) to weighted \(\ell ^{p}\) that preserves orthogonality. \(\square \)
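
The asymptotics used in the first example are readily visualized. The following Python sketch (an illustration only) evaluates the ratio \((\Vert x+y\Vert _p^r - \Vert x\Vert _p^r)/\Vert y\Vert _p^r\) for \(x=(a,a)\) and \(y=(1,-1)\) in two-dimensional \(\ell ^{p}\) with \(p = 1.5\), and exhibits the two behaviors described above.

```python
import numpy as np

def lp(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

p = 1.5
y = np.array([1.0, -1.0])
for r in (1.5, 3.0):                 # one exponent below 2, one above
    for a in (10.0, 100.0, 1000.0):
        x = np.array([a, a])         # x is Birkhoff-James orthogonal to y
        ratio = (lp(x + y, p) ** r - lp(x, p) ** r) / lp(y, p) ** r
        print(r, a, ratio)
# For r < 2 the ratio tends to 0; for r > 2 it grows without bound,
# in line with the expansion (1/2) a^{r-2} r (p-1) + O(a^{r-3}).
```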

The results of this section amplify those of [4], where \(r=2\), as well as [10], where \(p=r\), and apply to a larger class of Banach spaces.

4 Baxter’s inequality

Our next application is an extension of Baxter’s inequality from the stationary Gaussian case to a much broader class of processes. Suppose that \(\{Z_k\}_{k = -\infty }^{\infty }\) is a centered stationary Gaussian process. Let us write \(\hat{Z}\) for the best linear least-squares estimate of \(Z_0\) based on the entire past

$$\begin{aligned} \Big \{\ldots ,Z_{-3},Z_{-2},Z_{-1}\Big \} \end{aligned}$$

and \(\hat{Z}_{n}\) for the best linear estimate of \(Z_0\) based on the finite past

$$\begin{aligned} \Big \{Z_{-n},\ldots ,Z_{-3},Z_{-2},Z_{-1}\Big \} \end{aligned}$$

Let \(\hat{Z}\) and \(\hat{Z}_{n}\) have series representations

$$\begin{aligned} \hat{Z}&= \sum _{k=1}^{\infty } a_k Z_{-k} \\ \hat{Z}_{n}&= \sum _{k=1}^{n} a_{k,n} Z_{-k} \end{aligned}$$

The original Baxter’s inequality [2] states that if the process has a positive and continuous spectral density function, then there exist constants \(B\) and \(N\) such that

$$\begin{aligned} \sum _{k=1}^{n} |a_{k,n} - a_k| \le B \sum _{k = n+1}^{\infty } |a_k| \end{aligned}$$

whenever \(n \ge N\). The importance of Baxter’s inequality is that it gives us an estimate of the error in using the truncated infinite best predictor in place of the finite best predictor, expressed in terms of the tail of the best predictor. This is useful in applications, and has been extended in a number of directions (see, for example, [12, 16]).
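
To make the classical statement concrete, here is a small numerical experiment in the Hilbert space setting (our own illustration, not from [2]): for an MA(1) process \(Z_t = \varepsilon _t + \theta \varepsilon _{t-1}\), whose spectral density is positive and continuous, the finite-predictor coefficients are obtained from a Toeplitz system and compared with the infinite-past coefficients \(a_k = -(-\theta )^{k}\); the ratio of the two sides of Baxter’s inequality stays bounded as \(n\) grows.

```python
import numpy as np

theta = 0.6

def gamma(k):
    # autocovariance of the MA(1) process Z_t = e_t + theta * e_{t-1}
    k = abs(k)
    return 1 + theta**2 if k == 0 else (theta if k == 1 else 0.0)

def a(k):
    # coefficients of the best predictor of Z_0 from the entire past
    return -(-theta) ** k

def finite_coeffs(n):
    # coefficients of the best predictor of Z_0 from Z_{-1}, ..., Z_{-n}
    Gamma = np.array([[gamma(i - j) for j in range(n)] for i in range(n)])
    rhs = np.array([gamma(k) for k in range(1, n + 1)])
    return np.linalg.solve(Gamma, rhs)

for n in (5, 10, 20):
    c = finite_coeffs(n)
    lhs = sum(abs(c[k - 1] - a(k)) for k in range(1, n + 1))
    tail = sum(abs(a(k)) for k in range(n + 1, 400))
    print(n, lhs / tail)     # remains bounded, as Baxter's inequality asserts
```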

Extending this idea to arbitrary normed spaces is challenging, since there is no notion of a spectrum in such generality. Here, however, is a version of Baxter’s inequality for LWP spaces, which evidently retain just enough Hilbert space character. Suppose that the sequence of vectors \(\{X_{n}\}_{n=0}^{\infty }\) spans the Banach space \({\mathcal {X}}\). We adopt the notation \(\hat{X}\) for the best linear estimate of \(X_0\) based on \(\{X_{1},X_{2},X_{3},\ldots \}\), and \(\hat{X}_{n}\) for the best linear estimate of \(X_0\) based on \(\{X_{1},X_{2},X_{3},\ldots ,X_{n}\}\). Let us write \(\hat{X}_{(n)}\) for the \(n\)th partial sum of \(\hat{X}\).

Theorem 4.1

Let \(\{X_{n}\}_{n=0}^{\infty }\) be a sequence of unit vectors spanning a smooth Banach space \({\mathcal {X}}\). If \({\mathcal {X}}\) satisfies \(p\)-LWP(\(C\)), and the series representation

$$\begin{aligned} \hat{X} = \sum _{n=1}^{\infty } a_{n} X_{n} \end{aligned}$$

converges in norm to \(\hat{X}\), then there exist constants \(B\) and \(N\) such that

$$\begin{aligned} \Vert \hat{X}_{n} - \hat{X}_{(n)} \Vert ^{p} \le B \Vert \hat{X} - \hat{X}_{(n)} \Vert \end{aligned}$$
(4.1)

whenever \(n \ge N\).

Proof

Since \(\hat{X}_{n}\) is the best estimate of \(X_0\) from \(\bigvee \{X_k\}_{k=1}^{n}\), the error \(X_0 - \hat{X}_{n}\) is orthogonal to every vector in \(\bigvee \{X_k\}_{k=1}^{n}\). In particular,

$$\begin{aligned} X_0 - \hat{X}_{n} \ \ \perp \ \ \hat{X}_{n} - \hat{X}_{(n)} \end{aligned}$$

It follows that

$$\begin{aligned} \Vert X_0 - \hat{X}_{n} \Vert ^{p} + K \Vert \hat{X}_{n} - \hat{X}_{(n)}\Vert ^{p} \le \Vert X_0 - \hat{X}_{(n)}\Vert ^{p} \end{aligned}$$

from the Pythagorean inequality (3.5), where \(K = C/(2^{p-1}-1)\) is the Pythagorean constant, and therefore

$$\begin{aligned} K \Vert \hat{X}_{n} - \hat{X}_{(n)}\Vert ^{p}&\le \Vert X_0 - \hat{X}_{(n)}\Vert ^{p} - \Vert X_0 - \hat{X}_{n} \Vert ^{p} \nonumber \\&\le \Vert X_0 - \hat{X}_{(n)}\Vert ^{p} - \Vert X_0 - \hat{X} \Vert ^{p} \end{aligned}$$
(4.2)

The Mean Value Theorem, applied to the function \(f(x) = x^{p}\), assures that for any positive numbers \(a > b\) there exists a number \(c\), with \(a > c > b\), such that

$$\begin{aligned} \frac{a^{p} - b^{p}}{a-b} = pc^{p-1} \end{aligned}$$

Use \(c^{p-1} \le a^{p-1}+b^{p-1}\), \(a = \Vert X_0 - \hat{X}_{(n)}\Vert \), \(b = \Vert X_0 - \hat{X} \Vert \), and the Triangle inequality

$$\begin{aligned} \Vert X_0 - \hat{X}_{(n)}\Vert - \Vert X_0 - \hat{X} \Vert \le \Vert \hat{X} - \hat{X}_{(n)} \Vert \end{aligned}$$

and then the chain of estimates continues from (4.2) with

$$\begin{aligned}&\le p\Vert \hat{X} - \hat{X}_{(n)} \Vert \left( \Vert X_0 - \hat{X}\Vert ^{p-1} + \Vert X_0 - \hat{X}_{(n)}\Vert ^{p-1}\right) \\&\le 2p \Vert \hat{X} - \hat{X}_{(n)} \Vert \end{aligned}$$

for \(n\) sufficiently large. \(\square \)

Once again, what we have in (4.1) is an error estimate for using the truncated best predictor in place of the best finite predictor, in terms of the tail of the best predictor. This is a new result even for \(L^{p}\), with \(p \ne 2\). Thus it is applicable, for example, to a random process \(\{X_{n}\}_{n=-\infty }^{\infty }\) that satisfies \(E|X_{n}|^{p} < \infty \) for all \(n\), and

$$\begin{aligned} E \bigg | \sum _{n=1}^{N} c_{n} X_{k_{n} + K} \bigg |^{p} = E \bigg | \sum _{n=1}^{N} c_{n} X_{k_{n}} \bigg |^{p} \end{aligned}$$

for all integers \(K\), positive integers \(N\), scalars \(c_1, c_2,\ldots ,c_{N}\), and indices \(k_1, k_2,\ldots , k_{N}\). Such a process is said to be \(p\)-stationary. For \(1<p\le 2\), the \(p\)-stationary processes include weakly stationary processes, harmonizable stable processes, certain S\(\alpha \)S stable moving averages, and \(p\)-th order strictly stationary processes (see [6, 9–11, 14]).

We emphasize that the Banach space norm \(\Vert \cdot \Vert \) in Theorem 4.1 need not be the \(L^{p}\) norm; rather, the requirement is for the space to satisfy \(p\)-LWP(\(C\)), which occurs for a large class of spaces.

5 Regularity

Let \(\{X_{n}\}_{n=0}^{\infty }\) be a sequence of nonzero vectors spanning a Banach space \({\mathcal {X}}\). Define for all \(n = 0\), \(1\), \(2\), ...,

$$\begin{aligned} {\mathcal {X}}_{n}&= \bigvee \Big \{X_{n}, X_{n+1}, X_{n+2}, \ldots \Big \} \\ {\mathcal {X}}_{\infty }&= \bigcap _{n} {\mathcal {X}}_{n} \end{aligned}$$

where \(\bigvee \) means the closed span under the norm topology. We say that the sequence \(\{X_{n}\}_{n=0}^{\infty }\) is regular if \(\mathcal {X}_{\infty } = (0)\). This is a linear version of the stronger condition that the sequence \(\{X_{n}\}_{n=0}^{\infty }\) has trivial tail \(\sigma \)-field. It is also analogous to a Gaussian process being purely non-deterministic. In the stationary Gaussian case, complete spectral criteria are well known for regularity (see, for example, [17]). However, in the present more general setting, there is no comparable notion of spectrum, and even an innovation sequence can fail to be regular. Our aim is to characterize the regular innovation sequences in LWP spaces.

We can attempt to define the coordinate functionals \(k_{n}\) on finite linear combinations in the obvious way:

$$\begin{aligned} k_{n}\left( \sum a_k X_k\right) = a_{n} \end{aligned}$$

It needs to be established that each \(k_{n}\) is well-defined, and extends continuously to all of \({\mathcal {X}}\). We find that a sufficient condition is for \({\mathcal {X}}\) to satisfy LWP.

Lemma 5.1

Let \(\{X_{n}\}_{n=0}^{\infty }\) be an innovation sequence of unit vectors spanning a smooth Banach space \({\mathcal {X}}\). If \({\mathcal {X}}\) satisfies \(p\)-LWP, then the coordinate functionals \(k_{n}\), for \(n=0\), \(1\), \(2\), ..., extend continuously to \({\mathcal {X}}\).

Proof

If LWP holds, then \({\mathcal {X}}\) satisfies a lower Pythagorean inequality. Thus by Corollary 3.5 there is a constant \(C\) such that

$$\begin{aligned} \Big \Vert \sum _{n=0}^{\infty } a_{n} X_{n} \Big \Vert ^{p} \ge \sum _{n=0}^{\infty } C^n |a_{n}|^{p} \end{aligned}$$

holds when all but finitely many of the coefficients \(a_{n}\) are zero. From this it follows that if two such sums on the left are equal, their corresponding coefficients must coincide. It also follows that \(k_{n}\) is a bounded linear functional, with \(\Vert k_{n}\Vert \le C^{-n/p}\). \(\square \)

Let us keep the symbol \(k_{n}\) for the extended functional.

These functionals connect to our objective by way of the following lemmas.

Lemma 5.2

Let \(\{X_{n}\}_{n=0}^{\infty }\) be an innovation sequence of unit vectors spanning a smooth Banach space \({\mathcal {X}}\), and assume that \({\mathcal {X}}\) satisfies LWP. Then \(X \in \mathcal {X}_{\infty }\) if and only if \(k_{n}(X) = 0\) for all \(n=0\), \(1\), \(2\), ....

Proof

Suppose that \(X \in \mathcal {X}_{\infty }\). Then for each positive integer \(m\), there exists \(Y_m\) belonging to the span of \(\{X_m,X_{m+1},X_{m+2},\ldots \}\) such that \(\Vert X - Y_m\Vert < 1/m\). Since \(k_{n}(Y_m) = 0\) whenever \(m > n\), the continuity of the functional \(k_{n}\) gives \(k_{n}(X) = \lim _{m\rightarrow \infty } k_{n}(Y_m) = 0\) for each \(n\).

Conversely, suppose that \(k_{n}(X) = 0\) for all \(n=0\), \(1\), \(2\), .... Choose finite linear combinations \(Y_m = \sum _{n} a_{m,n}X_{n}\) converging in norm to \(X\). Fix a positive integer \(N\) and let \(\epsilon >0\). Since \(a_{m,n} = k_{n}(Y_m) \rightarrow k_{n}(X) = 0\) as \(m \rightarrow \infty \) for each \(n < N\), we may choose \(m\) so large that \(\Vert X - Y_m\Vert < \epsilon \) and \(\Vert \sum _{n < N} a_{m,n}X_{n}\Vert < \epsilon \). Then \(X\) differs from a vector in the span of \(\{X_N,X_{N+1},X_{N+2},\ldots \}\) by less than \(2\epsilon \) in norm. This shows that \(X\) belongs to \({\mathcal {X}}_N\), and since \(N\) was arbitrary, it follows that \(X\) belongs to \({\mathcal {X}}_{\infty }\). \(\square \)

Let us write \(T_{n}\) for the norming functional of \(X_{n}\). Here is the way the norming functionals \(\{T_{n}\}\) and coordinate functionals \(\{k_{n}\}\) are related.

Lemma 5.3

Let \(\{X_{n}\}_{n=0}^{\infty }\) be an innovation sequence of unit vectors spanning a smooth Banach space \({\mathcal {X}}\). If \({\mathcal {X}}\) satisfies LWP, then the spans of \(\{k_{n}\}_{n=0}^{\infty }\) and \(\{T_{n}\}_{n=0}^{\infty }\) in \({\mathcal {X}}^*\) coincide.

Proof

We verify that

$$\begin{aligned} k_0&= T_0 \\ k_{n}&= T_{n} - \sum _{m=0}^{n-1}T_{n}(X_{m})k_{m} \end{aligned}$$

for all \(n=1\), \(2\), \(3\),.... Indeed, the equations are true when the respective functionals are evaluated at any \(X_j\); since both sides are bounded functionals and the vectors \(X_j\) span \({\mathcal {X}}\), the identities hold on all of \({\mathcal {X}}\). From this we deduce that each \(T_N\) belongs to the span of \(\{k_{n}\}_{n=0}^{N}\), and similarly each \(k_N\) belongs to the span of \(\{T_{n}\}_{n=0}^{N}\).

Thus \(\{k_{n}\}_{n=0}^{N}\) and \(\{T_{n}\}_{n=0}^{N}\) span the same subspace for all \(N\). The assertion follows. \(\square \)

Lemma 5.4

Let \(\{X_k\}_{k=0}^{\infty }\) be an innovation sequence of unit vectors spanning a smooth \(p\)-LWP space \({\mathcal {X}}\), and let \(T\) be a nonzero functional on \({\mathcal {X}}\). The following are equivalent:

  1. (i)

    \(T \perp k_{n}\) for all \(n=0\), \(1\), \(2\), ....

  2. (ii)

    \(T/\Vert T\Vert \) is the norming functional for some nonzero \(X \in {\mathcal {X}}_{\infty }\).

Proof

In any case, the \(p\)-LWP condition implies that \({\mathcal {X}}\) is uniformly convex, and hence reflexive. Furthermore, \({\mathcal {X}}^*\) must then be smooth.

If condition (i) holds, then by [13, Theorem 2.1], there exists a nonzero vector \(X\) in \({\mathcal {X}}= {\mathcal {X}}^{**}\) such that \(k_{n}(X) =0\) for all indices \(n\), and \(T(X) = \Vert T\Vert \Vert X\Vert \). By Lemma 5.2, this implies that \(X\) belongs to \({\mathcal {X}}_{\infty }\). It is readily seen that \(T/\Vert T\Vert \) is the norming functional for \(X\), and hence (ii) holds.

Conversely, assume (ii). Then for any constant \(a\) and any index \(n\), we have

$$\begin{aligned} \Vert T + ak_{n}\Vert&\ge |T(X/\Vert X\Vert ) + a k_{n}(X/\Vert X\Vert )| \\&= |T(X/\Vert X\Vert )| \\&= \Vert T\Vert \end{aligned}$$

and (i) follows. \(\square \)

Here is our characterization of regular innovation sequences in LWP spaces, in terms of the coordinate functionals, the norming functionals for the sequence, and a basis property. (See [17] for the definition of conditional basis).

Theorem 5.5

Let \(\{X_{n}\}_{n=0}^{\infty }\) be an innovation sequence of unit vectors spanning a smooth \(p\)-LWP space \({\mathcal {X}}\). The following are equivalent.

  1. (i)

    The sequence \(\{X_{n}\}_{n=0}^{\infty }\) is regular.

  2. (ii)

    The associated coordinate functionals \(\{k_{n}\}_{n=0}^{\infty }\) span \({\mathcal {X}}^*\).

  3. (iii)

    The sequence \(\{X_{n}\}_{n=0}^{\infty }\) is a conditional basis for \({\mathcal {X}}\).

  4. (iv)

    The associated norming functionals \(\{T_{n}\}_{n=0}^{\infty }\) span \({\mathcal {X}}^*\).

Proof

The equivalence of (i) and (ii) is immediate from Lemma 5.4.

Suppose that (i) holds, and for \(X\) in \({\mathcal {X}}\) we have \(k_{n}(X) = a_{n}\) for all \(n\). If the series \(\sum _{n=0}^{\infty }a_{n} X_{n}\) converges in norm to \(Y\), then \(k_{n}(X-Y) = 0\) for all \(n\). By assumption this forces \(X = Y\), and (iii) is affirmed.

Conversely, suppose (iii) holds. If \(X\) belongs to \({\mathcal {X}}_{\infty }\), then \(k_{n}(X) = 0\) for all \(n\). By assumption, the series \(\sum _{n=0}^{\infty } k_{n}(X) X_{n}\) converges to \(X\). But the left hand side is the zero vector, and we may conclude that (i) holds.

The equivalence of (ii) and (iv) follows from Lemma 5.3. \(\square \)

These results show that weak parallelogram spaces serve as a natural environment in which to pursue prediction theory, extending previous work on Gaussian and \(p\)-stationary processes.