
In recent years state-space representations and the associated Kalman recursions have had a profound impact on time series analysis and many related areas. The techniques were originally developed in connection with the control of linear systems (for accounts of this subject see Davis and Vinter 1985; Hannan and Deistler 1988). An extremely rich class of models for time series, including and going well beyond the linear ARIMA and classical decomposition models considered so far in this book, can be formulated as special cases of the general state-space model defined below in Section 9.1. In econometrics the structural time series models developed by Harvey (1990) are formulated (like the classical decomposition model) directly in terms of components of interest such as trend, seasonal component, and noise. However, the rigidity of the classical decomposition model is avoided by allowing the trend and seasonal components to evolve randomly rather than deterministically. An introduction to these structural models is given in Section 9.2, and a state-space representation is developed for a general ARIMA process in Section 9.3. The Kalman recursions, which play a key role in the analysis of state-space models, are derived in Section 9.4. These recursions allow a unified approach to prediction and estimation for all processes that can be given a state-space representation. Following the development of the Kalman recursions we discuss estimation with structural models (Section 9.5) and the formulation of state-space models to deal with missing values (Section 9.6). In Section 9.7 we introduce the EM algorithm, an iterative procedure for maximizing the likelihood when only a subset of the complete data set is available. The EM algorithm is particularly well suited for estimation problems in the state-space framework. Generalized state-space models are introduced in Section 9.8. These are Bayesian models that can be used to represent time series of many different types, as demonstrated by two applications to time series of count data. Throughout the chapter we shall use the notation

$$ \displaystyle{ \{\mathbf{W}_{t}\} \sim \mathrm{WN}(\mathbf{0},\{R_{t}\}) } $$

to indicate that the random vectors W t have mean 0 and that

$$ \displaystyle{ E\left (\mathbf{W}_{s}\mathbf{W}_{t}'\right ) = \left \{\begin{array}{@{}l@{\quad }l@{}} R_{t},\quad &\mbox{ if }s = t, \\ 0, \quad &\mbox{ otherwise}. \end{array} \right. } $$

9.1 State-Space Representations

A state-space model for a (possibly multivariate) time series {Y t , t = 1, 2, …} consists of two equations. The first, known as the observation equation, expresses the w-dimensional observation Y t as a linear function of a v-dimensional state variable X t plus noise. Thus

$$ \displaystyle{ \mathbf{Y}_{t} = G_{t}\mathbf{X}_{t} + \mathbf{W}_{t},\quad t = 1,2,\ldots, } $$
(9.1.1)

where {W t } ∼ WN(0, {R t }) and {G t } is a sequence of w × v matrices. The second equation, called the state equation, determines the state X t+1 at time t + 1 in terms of the previous state X t and a noise term. The state equation is

$$ \displaystyle{ \mathbf{X}_{t+1} = F_{t}\mathbf{X}_{t} + \mathbf{V}_{t},\quad t = 1,2,\ldots, } $$
(9.1.2)

where {F t } is a sequence of v × v matrices, {V t } ∼ WN(0, {Q t }), and {V t } is uncorrelated with {W t } (i.e., E(W t V s ′) = 0 for all s and t). To complete the specification, it is assumed that the initial state X 1 is uncorrelated with all of the noise terms {V t } and {W t }.

Remark 1.

A more general form of the state-space model allows for correlation between V t and W t (see Brockwell and Davis (1991), Chapter 12) and for the addition of a control term H t u t in the state equation. In control theory, H t u t represents the effect of applying a “control” u t at time t for the purpose of influencing X t+1. However, the system defined by (9.1.1) and (9.1.2) with \( E{\bigl (\mathbf{W}_{t}\mathbf{V}_{s}'\bigr )} = 0 \) for all s and t will be adequate for our purposes. □ 

Remark 2.

In many important special cases, the matrices F t , G t , Q t , and R t will be independent of t, in which case the subscripts will be suppressed. □ 

Remark 3.

It follows from the observation equation (9.1.1) and the state equation (9.1.2) that X t and Y t have the functional forms, for t = 2, 3, ,

$$ \displaystyle\begin{array}{rcl} \mathbf{X}_{t}& =& F_{t-1}\mathbf{X}_{t-1} + \mathbf{V}_{t-1} \\ & =& F_{t-1}(F_{t-2}\mathbf{X}_{t-2} + \mathbf{V}_{t-2}) + \mathbf{V}_{t-1} \\ & & \ \ \vdots \\ & =& (F_{t-1}\cdots F_{1})\mathbf{X}_{1} + (F_{t-1}\cdots F_{2})\mathbf{V}_{1} + \cdots + F_{t-1}\mathbf{V}_{t-2} + \mathbf{V}_{t-1} \\ & =& f_{t}(\mathbf{X}_{1},\mathbf{V}_{1},\ldots,\mathbf{V}_{t-1}) {}\end{array} $$
(9.1.3)

and

$$ \displaystyle{ \mathbf{Y}_{t} = g_{t}(\mathbf{X}_{1},\mathbf{V}_{1},\ldots,\mathbf{V}_{t-1},\mathbf{W}_{t}).\mbox{ $\square $} } $$
(9.1.4)

Remark 4.

From Remark 3 and the assumptions on the noise terms, it is clear that

$$ \displaystyle{ E\left (\mathbf{V}_{t}\mathbf{X}'_{s}\right ) = 0,\qquad E\left (\mathbf{V}_{t}\mathbf{Y}'_{s}\right ) = 0,\quad 1 \leq s \leq t, } $$

and

$$ \displaystyle{ E\left (\mathbf{W}_{t}\mathbf{X}'_{s}\right ) = 0,\quad 1 \leq s \leq t,\qquad E(\mathbf{W}_{t}\mathbf{Y}'_{s}) = 0,\quad 1 \leq s < t.\mbox{ $\square $} } $$

Definition 9.1.1

A time series {Y t } has a state-space representation if there exists a state-space model for {Y t } as specified by equations (9.1.1) and (9.1.2).

As already indicated, it is possible to find a state-space representation for a large number of time-series (and other) models. It is clear also from the definition that neither {X t } nor {Y t } is necessarily stationary. The beauty of a state-space representation, when one can be found, lies in the simple structure of the state equation (9.1.2), which permits relatively simple analysis of the process {X t }. The behavior of {Y t } is then easy to determine from that of {X t } using the observation equation (9.1.1). If the sequence {X 1, V 1, V 2, …} is independent, then {X t } has the Markov property; i.e., the distribution of X t+1 given X t , …, X 1 is the same as the distribution of X t+1 given X t . This is a property possessed by many physical systems, provided that we include sufficiently many components in the specification of the state X t (for example, we may choose the state vector in such a way that X t includes components of X t−1 for each t).

Example 9.1.1

An AR(1) Process

Let {Y t } be the causal AR(1) process given by

$$ \displaystyle{ Y _{t} =\phi Y _{t-1} + Z_{t},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ). } $$
(9.1.5)

In this case, a state-space representation for {Y t } is easy to construct. We can, for example, define a sequence of state variables X t by

$$ \displaystyle{ X_{t+1} =\phi X_{t} + V _{t},\quad t = 1,2,\ldots, } $$
(9.1.6)

where \( X_{1} = Y _{1} =\sum _{j=0}^{\infty }\phi ^{j}Z_{1-j} \) and V t  = Z t+1. The process {Y t } then satisfies the observation equation

$$ \displaystyle{ Y _{t} = X_{t}, } $$

which has the form (9.1.1) with G t  = 1 and W t  = 0.

Example 9.1.2

An ARMA(1,1) Process

Let {Y t } be the causal and invertible ARMA(1,1) process satisfying the equations

$$ \displaystyle{ Y _{t} =\phi Y _{t-1} + Z_{t} +\theta Z_{t-1},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ). } $$
(9.1.7)

Although the existence of a state-space representation for {Y t } is not obvious, we can find one by observing that

$$ \displaystyle{ Y _{t} =\theta (B)X_{t} = \left [\begin{array}{*{10}c} \theta \quad 1 \end{array} \right ]\left [\begin{array}{*{10}c} X_{t-1} \\ X_{t} \end{array} \right ], } $$
(9.1.8)

where {X t } is the causal AR(1) process satisfying

$$ \displaystyle{ \phi (B)X_{t} = Z_{t}, } $$

or the equivalent equation

$$ \displaystyle{ \left [\begin{array}{*{10}c} X_{t} \\ X_{t+1} \end{array} \right ] = \left [\begin{array}{*{10}c} 0&1\\ 0 & \phi \end{array} \right ]\left [\begin{array}{*{10}c} X_{t-1} \\ X_{t} \end{array} \right ]+\left [\begin{array}{*{10}c} 0\\ Z_{t+1} \end{array} \right ]. } $$
(9.1.9)

Noting that \( X_{t} =\sum _{j=0}^{\infty }\phi ^{j}Z_{t-j} \), we see that equations (9.1.8) and (9.1.9) for t = 1, 2, … furnish a state-space representation of {Y t } with

$$ \displaystyle{ \mathbf{X}_{t} = \left [\begin{array}{*{10}c} X_{t-1} \\ X_{t}\\ \end{array} \right ]\ \mathrm{and}\ \mathbf{X}_{1} = \left [\begin{array}{*{10}c} \sum \limits _{j=0}^{\infty }\phi ^{j}Z_{-j} \\ \sum \limits _{j=0}^{\infty }\phi ^{j}Z_{1-j} \end{array} \right ]. } $$

The extension of this state-space representation to general ARMA and ARIMA processes is given in Section 9.3.
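
As a concrete illustration (ours, not part of the text), the following minimal Python/NumPy sketch assembles the matrices of (9.1.8)–(9.1.9) for given φ and θ and simulates the resulting observations; the function names, and the simplification of starting the state at zero rather than at the stationary X 1, are our own assumptions.

```python
import numpy as np

def arma11_state_space(phi, theta):
    """State-space matrices of the ARMA(1,1) representation (9.1.8)-(9.1.9)."""
    F = np.array([[0.0, 1.0],
                  [0.0, phi]])        # state transition matrix in (9.1.9)
    G = np.array([[theta, 1.0]])      # observation row [theta  1] in (9.1.8)
    return F, G

def simulate_arma11(phi, theta, sigma2, n, seed=0):
    """Simulate Y_1,...,Y_n; the state is started at 0 instead of the stationary X_1."""
    rng = np.random.default_rng(seed)
    F, G = arma11_state_space(phi, theta)
    x = np.zeros(2)
    y = np.empty(n)
    for t in range(n):
        y[t] = float(G @ x)                           # Y_t = [theta 1] X_t (no observation noise)
        v = np.array([0.0, rng.normal(scale=np.sqrt(sigma2))])
        x = F @ x + v                                 # X_{t+1} = F X_t + V_t, V_t = (0, Z_{t+1})'
    return y

y = simulate_arma11(phi=0.5, theta=0.4, sigma2=1.0, n=200)
```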

In subsequent sections we shall give examples that illustrate the versatility of state-space models. (More examples can be found in Aoki 1987; Hannan and Deistler 1988; Harvey 1990; West and Harrison 1989.) Before considering these, we need a slight modification of (9.1.1) and (9.1.2), which allows for series in which the time index runs from −∞ to ∞. This is a more natural formulation for many time series models.

9.1.1 State-Space Models with t ∈ {0, ±1, …}

Consider the observation and state equations

$$ \displaystyle{ \mathbf{Y}_{t} = G\mathbf{X}_{t} + \mathbf{W}_{t},\qquad t = 0,\pm 1,\ldots, } $$
(9.1.10)
$$ \displaystyle{ \mathbf{X}_{t+1} = F\mathbf{X}_{t} + \mathbf{V}_{t},\qquad t = 0,\pm 1,\ldots, } $$
(9.1.11)

where F and G are v × v and w × v matrices, respectively, {V t } ∼ WN(0, Q), \( \{\mathbf{W}_{t}\} \sim \mathrm{WN}(\mathbf{0},R) \), and E(V s W t ′) = 0 for all s, and t.

The state equation (9.1.11) is said to be stable if the matrix F has all its eigenvalues in the interior of the unit circle, or equivalently if det(I − Fz) ≠ 0 for all complex z such that | z | ≤ 1. The matrix F is then also said to be stable.

In the stable case equation (9.1.11) has the unique stationary solution (Problem 9.1) given by

$$ \displaystyle{ \mathbf{X}_{t} =\sum _{ j=0}^{\infty }F^{j}\mathbf{V}_{ t-j-1}. } $$

The corresponding sequence of observations

$$ \displaystyle{ \mathbf{Y}_{t} = \mathbf{W}_{t} +\sum _{ j=0}^{\infty }GF^{j}\mathbf{V}_{ t-j-1} } $$

is also stationary.

9.2 The Basic Structural Model

A structural time series model, like the classical decomposition model defined by (1.5.1), is specified in terms of components such as trend, seasonality, and noise, which are of direct interest in themselves. The deterministic nature of the trend and seasonal components in the classical decomposition model, however, limits its applicability. A natural way in which to overcome this deficiency is to permit random variation in these components. This can be very conveniently done in the framework of a state-space representation, and the resulting rather flexible model is called a structural model. Estimation and forecasting with this model can be encompassed in the general procedure for state-space models made possible by the Kalman recursions of Section 9.4.

Example 9.2.1

The Random Walk Plus Noise Model

One of the simplest structural models is obtained by adding noise to a random walk. It is suggested by the nonseasonal classical decomposition model

$$ \displaystyle{ Y _{t} = M_{t} + W_{t},\quad \mathrm{where}\ \{W_{t}\} \sim \mathrm{WN}\left (0,\sigma _{w}^{2}\right ), } $$
(9.2.1)

and M t  = m t , the deterministic “level” or “signal” at time t. We now introduce randomness into the level by supposing that M t is a random walk satisfying

$$ \displaystyle{ M_{t+1} = M_{t} + V _{t},\quad \mathrm{and}\quad \{V _{t}\} \sim \mathrm{WN}\left (0,\sigma _{v}^{2}\right ), } $$
(9.2.2)

with initial value M 1 = m 1. Equations (9.2.1) and (9.2.2) constitute the “local level” or “random walk plus noise” model. Figure 9.1 shows a realization of length 100 of this model with M 1 = 0, σ v 2 = 4, and σ w 2 = 8. (The realized values m t of M t  are plotted as a solid line, and the observed data are plotted as square boxes.) The differenced data

$$ \displaystyle{ D_{t}:= \nabla Y _{t} = Y _{t} - Y _{t-1} = V _{t-1} + W_{t} - W_{t-1},\quad t \geq 2, } $$

constitute a stationary time series with mean 0 and ACF

$$ \displaystyle{ \rho_{D}(h) = \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{-\sigma _{w}^{2}} {2\sigma _{w}^{2} +\sigma _{ v}^{2}},\quad &\mbox{ if }\vert h\vert = 1, \\ 0, \quad &\mbox{ if }\vert h\vert > 1. \end{array} \right. } $$

Since {D t } is 1-correlated, we conclude from Proposition 2.1.1 that {D t } is an MA(1) process and hence that {Y t } is an ARIMA(0,1,1) process. More specifically,

$$ \displaystyle{ D_{t} = Z_{t} +\theta Z_{t-1},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ), } $$
(9.2.3)

where θ and σ 2 are found by solving the equations

$$ \displaystyle{ \frac{\theta } {1 +\theta ^{2}} = \frac{-\sigma _{w}^{2}} {2\sigma _{w}^{2} +\sigma _{ v}^{2}}\quad \mathrm{and}\quad \theta \sigma ^{2} = -\sigma _{ w}^{2}. } $$

For the process {Y t } generating the data in Figure 9.1, the parameters θ and σ 2 of the differenced series {D t } satisfy θ∕(1 +θ 2) = −0.4 and θ σ 2 = −8. Solving these equations for θ and σ 2, we find that θ = −0.5 and σ 2 = 16 (or θ = −2 and σ 2 = 4). The sample ACF of the observed differences D t of the realization of {Y t } in Figure 9.1 is shown in Figure 9.2.
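
The computation of θ and σ 2 can be packaged in a few lines. The sketch below (ours, not from the text) returns the invertible solution |θ| ≤ 1 for given σ v 2 and σ w 2, and reproduces θ = −0.5 and σ 2 = 16 for the parameters used above.

```python
import numpy as np

def local_level_ma1(sigma_v2, sigma_w2):
    """MA(1) parameters (theta, sigma2) of the differenced local level model.

    Solves theta/(1+theta^2) = -sigma_w2/(2*sigma_w2 + sigma_v2) and
    theta*sigma2 = -sigma_w2, returning the invertible root |theta| <= 1.
    """
    rho = -sigma_w2 / (2.0 * sigma_w2 + sigma_v2)          # lag-one ACF of D_t
    theta = (1.0 - np.sqrt(1.0 - 4.0 * rho**2)) / (2.0 * rho)
    sigma2 = -sigma_w2 / theta
    return theta, sigma2

print(local_level_ma1(sigma_v2=4.0, sigma_w2=8.0))         # approximately (-0.5, 16.0)
```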

Fig. 9.1
figure 1

Realization from a random walk plus noise model. The random walk is represented by the solid line and the data are represented by boxes

Fig. 9.2
figure 2

Sample ACF of the series obtained by differencing the data in Figure 9.1

The local level model is often used to represent a measured characteristic of the output of an industrial process for which the unobserved process level {M t } is intended to be within specified limits (to meet the design specifications of the manufactured product). To decide whether or not the process requires corrective attention, it is important to be able to test the hypothesis that the process level {M t } is constant. From the state equation, we see that {M t } is constant (and equal to m 1) when V t  = 0 or equivalently when σ v 2 = 0. This in turn is equivalent to the moving-average model (9.2.3) for {D t } being noninvertible with θ = −1 (see Problem 8.2). Tests of the unit root hypothesis θ = −1 were discussed in Section 6.3.2

The local level model can easily be extended to incorporate a locally linear trend with slope β t at time t. Equation (9.2.2) is replaced by

$$ \displaystyle{ M_{t} = M_{t-1} + B_{t-1} + V _{t-1}, } $$
(9.2.4)

where B t−1 = β t−1. Now if we introduce randomness into the slope by replacing it with the random walk

$$ \displaystyle{ B_{t} = B_{t-1} + U_{t-1},\quad \mathrm{where}\ \{U_{t}\} \sim \mathrm{WN}\left (0,\sigma _{u}^{2}\right ), } $$
(9.2.5)

we obtain the “local linear trend” model.

To express the local linear trend model in state-space form we introduce the state vector

$$ \displaystyle{ \mathbf{X}_{t} = (M_{t},B_{t})'. } $$

Then (9.2.4) and (9.2.5) can be written in the equivalent form

$$ \displaystyle{ \mathbf{X}_{t+1} = \left [\begin{array}{*{10}c} 1&1\\ 0 &1 \end{array} \right ]\mathbf{X}_{t}+\mathbf{V}_{t},\quad t = 1,2,\ldots, } $$
(9.2.6)

where V t  = (V t , U t )′. The process {Y t } is then determined by the observation equation

$$ \displaystyle{ Y _{t} = [1\quad 0]\ \mathbf{X}_{t} + W_{t}. } $$
(9.2.7)

If {X 1, U 1, V 1, W 1, U 2, V 2, W 2, } is an uncorrelated sequence, then equations (9.2.6) and (9.2.7) constitute a state-space representation of the process {Y t }, which is a model for data with randomly varying trend and added noise. For this model we have v = 2, w = 1,

$$ \displaystyle{ F = \left [\begin{array}{*{10}c} 1& 1\\ 0 &1 \end{array} \right ],\quad G = [1\quad 0],\quad Q = \left [\begin{array}{*{10}c} \sigma _{v}^{2} & 0 \\ 0 &\sigma _{u}^{2} \end{array} \right ],\quad \mathrm{and}\ R =\sigma _{ w}^{2}. } $$

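For concreteness, here is a short NumPy sketch (our own illustration) that simulates the local linear trend model (9.2.6)–(9.2.7) using the matrices F, G, Q, and R displayed above; the initial state and the function name are hypothetical choices.

```python
import numpy as np

def simulate_local_linear_trend(n, sigma_v2, sigma_u2, sigma_w2, x1=(0.0, 0.1), seed=0):
    """Simulate Y_1,...,Y_n from (9.2.6)-(9.2.7); x1 = (M_1, B_1)' is the initial state."""
    rng = np.random.default_rng(seed)
    F = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    G = np.array([1.0, 0.0])
    x = np.array(x1, dtype=float)
    y = np.empty(n)
    for t in range(n):
        y[t] = G @ x + rng.normal(scale=np.sqrt(sigma_w2))   # observation equation (9.2.7)
        v = np.array([rng.normal(scale=np.sqrt(sigma_v2)),   # V_t = (V_t, U_t)'
                      rng.normal(scale=np.sqrt(sigma_u2))])
        x = F @ x + v                                        # state equation (9.2.6)
    return y

y = simulate_local_linear_trend(n=100, sigma_v2=4.0, sigma_u2=1.0, sigma_w2=8.0)
```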
Example 9.2.2

A Seasonal Series with Noise

The classical decomposition (1.5.11) expressed the time series {X t } as a sum of trend, seasonal, and noise components. The seasonal component (with period d ) was a sequence {s t } with the properties s t+d  = s t and \( \sum _{t=1}^{d}s_{t} = 0 \). Such a sequence can be generated, for any values of s 1, s 0, …, s −d+3, by means of the recursions

$$ \displaystyle{ s_{t+1} = -s_{t} -\cdots - s_{t-d+2},\quad t = 1,2,\ldots. } $$
(9.2.8)

A somewhat more general seasonal component {Y t }, allowing for random deviations from strict periodicity, is obtained by adding a term S t to the right side of (9.2.8), where {S t } is white noise with mean zero. This leads to the recursion relations

$$ \displaystyle{ Y _{t+1} = -Y _{t} -\cdots - Y _{t-d+2} + S_{t},\quad t = 1,2,\ldots. } $$
(9.2.9)

To find a state-space representation for {Y t } we introduce the (d − 1)-dimensional state vector

$$ \displaystyle{ \mathbf{X}_{t} = (Y _{t},Y _{t-1},\ldots,Y _{t-d+2})'. } $$

The series {Y t } is then given by the observation equation

$$ \displaystyle{ Y _{t} = [1\quad 0\quad 0\ \cdots \ 0]\ \mathbf{X}_{t},\quad t = 1,2,\ldots, } $$
(9.2.10)

where {X t } satisfies the state equation

$$ \displaystyle{ \mathbf{X}_{t+1} = F\mathbf{X}_{t} + \mathbf{V}_{t},\quad t = 1,2\ldots, } $$
(9.2.11)

V t  = (S t , 0, , 0)′, and

$$ \displaystyle{ F = \left [\begin{array}{*{10}c} -1 & -1 & \cdots & -1 & -1\\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{array} \right ]. } $$
(9.2.12)
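
The (d − 1) × (d − 1) matrix (9.2.12) is easy to construct programmatically; the following sketch (ours) builds it for an arbitrary period d and prints the d = 4 case.

```python
import numpy as np

def seasonal_F(d):
    """Coefficient matrix (9.2.12) of the (d-1)-dimensional seasonal state equation."""
    m = d - 1
    F = np.zeros((m, m))
    F[0, :] = -1.0                 # first row: -1, -1, ..., -1
    F[1:, :-1] = np.eye(m - 1)     # subdiagonal of ones
    return F

print(seasonal_F(4))
# [[-1. -1. -1.]
#  [ 1.  0.  0.]
#  [ 0.  1.  0.]]
```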

Example 9.2.3

A Randomly Varying Trend with Random Seasonality and Noise

A series with randomly varying trend, random seasonality and noise can be constructed by adding the two series in Examples 9.2.1 and 9.2.2. (Addition of series with state-space representations is in fact always possible by means of the following construction. See Problem 9.9.) We introduce the state vector

$$ \displaystyle{ \mathbf{X}_{t} = \left [\begin{array}{*{10}c} \mathbf{X}_{t}^{1} \\ \mathbf{X}_{t}^{2} \end{array} \right ], } $$

where X t 1 and X t 2 are the state vectors in (9.2.6) and (9.2.11). We then have the following representation for {Y t }, the sum of the two series whose state-space representations were given in (9.2.6)–(9.2.7) and (9.2.10)–(9.2.11). The state equation is

$$ \displaystyle{ \mathbf{X}_{t+1} = \left [\begin{array}{*{10}c} F_{1} & 0 \\ 0 & F_{2} \end{array} \right ]\mathbf{X}_{t}+\left [\begin{array}{*{10}c} \mathbf{V}_{t}^{1} \\ \mathbf{V}_{t}^{2} \end{array} \right ], } $$
(9.2.13)

where F 1, F 2 are the coefficient matrices and {V t 1}, {V t 2} are the noise vectors in the state equations (9.2.6) and (9.2.11), respectively. The observation equation is

$$ \displaystyle{ Y _{t} = [1\quad 0\quad 1\quad 0\ \cdots \ 0]\,\mathbf{X}_{t} + W_{t}, } $$
(9.2.14)

where {W t } is the noise sequence in (9.2.7). If the sequence of random vectors {X 1, V 1 1, V 1 2, W 1, V 2 1, V 2 2, W 2, } is uncorrelated, then equations (9.2.13) and (9.2.14) constitute a state-space representation for {Y t }.

9.3 State-Space Representation of ARIMA Models

We begin by establishing a state-space representation for the causal AR(p) process and then build on this example to find representations for the general ARMA and ARIMA processes.

Example 9.3.1

State-Space Representation of a Causal AR(p) Process

Consider the AR(p) process defined by

$$ \displaystyle{ Y _{t+1} =\phi _{1}Y _{t} +\phi _{2}Y _{t-1} + \cdots +\phi _{p}Y _{t-p+1} + Z_{t+1},\quad t = 0,\pm 1,\ldots, } $$
(9.3.1)

where \( \{Z_{t}\} \sim \mathrm{WN}{\bigl (0,\sigma ^{2}\bigr )} \), and ϕ(z): = 1 −ϕ 1 z −⋯ −ϕ p z p is nonzero for | z | ≤ 1. To express {Y t } in state-space form we simply introduce the state vectors

$$ \displaystyle{ \mathbf{X}_{t} = \left [\begin{array}{*{10}c} Y _{t-p+1} \\ Y _{t-p+2}\\ \vdots \\ Y _{t} \end{array} \right ],\quad t = 0,\pm 1,\ldots. } $$
(9.3.2)

From (9.3.1) and (9.3.2) the observation equation is

$$ \displaystyle{ Y _{t} = [0\quad 0\quad 0\ \cdots \ 1]\mathbf{X}_{t},\quad t = 0,\pm 1,\ldots, } $$
(9.3.3)

while the state equation is given by

$$ \displaystyle{ \mathbf{X}_{t+1} = \left [\begin{array}{*{10}c} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1\\ \phi _{ p} & \phi _{p-1} & \phi _{p-2} & \cdots & \phi _{1} \end{array} \right ]\mathbf{X}_{t}+\left [\begin{array}{*{10}c} 0\\ 0\\ \vdots \\ 0\\ 1 \end{array} \right ]Z_{t+1},\quad t = 0,\pm 1,\ldots. } $$
(9.3.4)

These equations have the required forms (9.1.10) and (9.1.11) with W t  = 0 and V t  = (0, 0, …, Z t+1)′, t = 0, ±1, …. 

Remark 1.

In Example 9.3.1 the causality condition ϕ(z) ≠ 0 for | z | ≤ 1 is equivalent to the condition that the state equation (9.3.4) is stable, since the eigenvalues of the coefficient matrix in (9.3.4) are simply the reciprocals of the zeros of ϕ(z) (Problem 9.3). □ 
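
Remark 1 can be checked numerically. The sketch below (our own, with a hypothetical AR(3) coefficient vector) builds the companion matrix of (9.3.4) and verifies that every eigenvalue λ satisfies ϕ(1∕λ) = 0, so that the eigenvalues are the reciprocals of the zeros of ϕ(z); causality corresponds to all |λ| < 1.

```python
import numpy as np

phi = np.array([0.5, -0.3, 0.2])      # hypothetical AR(3) coefficients phi_1, phi_2, phi_3
p = len(phi)

# companion matrix of the state equation (9.3.4)
F = np.zeros((p, p))
F[:-1, 1:] = np.eye(p - 1)
F[-1, :] = phi[::-1]                  # bottom row: phi_p, phi_{p-1}, ..., phi_1

lam = np.linalg.eigvals(F)

# phi(z) = 1 - phi_1 z - ... - phi_p z^p, coefficients in decreasing powers of z
coeffs = np.concatenate((-phi[::-1], [1.0]))

print(np.allclose(np.polyval(coeffs, 1.0 / lam), 0.0))   # True: 1/lambda are the zeros of phi
print(np.all(np.abs(lam) < 1.0))                         # stability of (9.3.4), i.e. causality
```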

Remark 2.

If equations (9.3.3) and (9.3.4) are postulated to hold only for t = 1, 2, …, and if X 1 is a random vector such that {X 1, Z 1, Z 2, …} is an uncorrelated sequence, then we have a state-space representation for {Y t } of the type defined earlier by (9.1.1) and (9.1.2). The resulting process {Y t } is well-defined, regardless of whether or not the state equation is stable, but it will not in general be stationary. It will be stationary if the state equation is stable and if X 1 is defined by (9.3.2) with \( Y _{t} =\sum _{j=0}^{\infty }\psi _{j}Z_{t-j} \), t = 1, 0, …, 2 − p, and ψ(z) = 1∕ϕ(z), | z | ≤ 1. □ 

Example 9.3.2

State-Space Form of a Causal ARMA(p, q) Process

State-space representations are not unique. Here we shall give one of the (infinitely many) possible representations of a causal ARMA(p,q) process that can easily be derived from Example 9.3.1. Consider the ARMA(p,q) process defined by

$$ \displaystyle{ \phi (B)Y _{t} =\theta (B)Z_{t},\quad t = 0,\pm 1,\ldots, } $$
(9.3.5)

where \( \{Z_{t}\} \sim \mathrm{WN}{\bigl (0,\sigma ^{2}\bigr )} \) and \( \phi (z)\neq 0 \) for | z | ≤ 1. Let

$$ \displaystyle{ r =\max (p,q + 1),\quad \phi _{j} = 0\quad \mathrm{for}\ j > p,\quad \theta _{j} = 0\quad \mathrm{for}\ \ j > q,\quad \mathrm{and}\quad \theta _{0} = 1. } $$

If {U t } is the causal AR( p) process satisfying

$$ \displaystyle{ \phi (B)U_{t} = Z_{t}, } $$
(9.3.6)

then Y t  = θ(B)U t , since

$$ \displaystyle{ \phi (B)Y _{t} =\phi (B)\theta (B)U_{t} =\theta (B)\phi (B)U_{t} =\theta (B)Z_{t}. } $$

Consequently,

$$ \displaystyle{ Y _{t} = [\theta _{r-1}\quad \theta _{r-2}\ \cdots \ \theta _{0}]\mathbf{X}_{t}, } $$
(9.3.7)

where

$$ \displaystyle{ \mathbf{X}_{t} = \left [\begin{array}{*{10}c} U_{t-r+1} \\ U_{t-r+2}\\ \vdots \\ U_{t} \end{array} \right ]. } $$
(9.3.8)

But from Example 9.3.1 we can write

$$ \displaystyle{ \mathbf{X}_{t+1} = \left [\begin{array}{*{10}c} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1\\ \phi _{ r} & \phi _{r-1} & \phi _{r-2} & \cdots & \phi _{1} \end{array} \right ]\mathbf{X}_{t}+\left [\begin{array}{*{10}c} 0\\ 0\\ \vdots \\ 0\\ 1 \end{array} \right ]Z_{t+1},\quad t = 0,\pm 1,\ldots. } $$
(9.3.9)

Equations (9.3.7) and (9.3.9) are the required observation and state equations. As in Example 9.3.1, the observation and state noise vectors are again W t  = 0 and V t  = (0, 0, …, Z t+1)′, t = 0, ±1, ….
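
The construction of this example is easily automated. The sketch below (ours, not from the text) returns F and G of (9.3.9) and (9.3.7) for arbitrary ARMA(p, q) coefficients; applied to the ARMA(1,1) of Example 9.1.2 it recovers the matrices used there.

```python
import numpy as np

def arma_state_space(phi, theta):
    """F and G of the ARMA(p,q) representation (9.3.7) and (9.3.9).

    phi = (phi_1,...,phi_p), theta = (theta_1,...,theta_q); theta_0 = 1.
    """
    phi, theta = np.asarray(phi, float), np.asarray(theta, float)
    p, q = len(phi), len(theta)
    r = max(p, q + 1)
    phi_pad = np.concatenate((phi, np.zeros(r - p)))                  # phi_j = 0 for j > p
    theta_pad = np.concatenate(([1.0], theta, np.zeros(r - 1 - q)))   # theta_0,...,theta_{r-1}

    F = np.zeros((r, r))
    F[:-1, 1:] = np.eye(r - 1)
    F[-1, :] = phi_pad[::-1]                       # bottom row: phi_r, ..., phi_1
    G = theta_pad[::-1].reshape(1, r)              # [theta_{r-1} ... theta_0]
    return F, G

F, G = arma_state_space(phi=[0.5], theta=[0.4])    # gives F = [[0,1],[0,0.5]], G = [[0.4, 1]]
```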

Example 9.3.3

State-Space Representation of an ARIMA(p, d, q) Process

If \( \big\{Y _{t}\big\} \) is an ARIMA(p, d, q) process with {∇d Y t } satisfying (9.3.5), then by the preceding example \( \big\{\nabla ^{d}Y _{t}\big\} \) has the representation

$$ \displaystyle{ \nabla ^{d}Y _{ t} = G\mathbf{X}_{t},\quad t = 0,\pm 1,\ldots, } $$
(9.3.10)

where {X t } is the unique stationary solution of the state equation

$$ \displaystyle{ \mathbf{X}_{t+1} = F\mathbf{X}_{t} + \mathbf{V}_{t}, } $$

F and G are the coefficients of X t in (9.3.9) and (9.3.7), respectively, and V t  = (0, 0, , Z t+1)′. Let A and B be the \( d \times 1 \) and d × d matrices defined by A = B = 1 if d = 1 and

$$ \displaystyle{ A = \left [\begin{array}{*{10}c} 0\\ 0\\ \vdots \\ 0\\ 1 \end{array} \right ],\quad B = \left [\begin{array}{*{10}c} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ (-1)^{d+1}{d\choose d} & (-1)^{d}{d\choose d - 1} & (-1)^{d-1}{d\choose d - 2} & \cdots & d \end{array} \right ] } $$

if d > 1. Then since

$$ \displaystyle{ Y _{t} = \nabla ^{d}Y _{ t} -\sum _{j=1}^{d}{d\choose j}(-1)^{j}Y _{ t-j}, } $$
(9.3.11)

the vector

$$ \displaystyle{ \mathbf{Y}_{t-1}:= (Y _{t-d},\ldots,Y _{t-1})' } $$

satisfies the equation

$$ \displaystyle{ \mathbf{Y}_{t} = A\nabla ^{d}Y _{ t} + B\mathbf{Y}_{t-1} = AG\mathbf{X}_{t} + B\mathbf{Y}_{t-1}. } $$

Defining a new state vector T t by stacking X t and Y t−1, we therefore obtain the state equation

$$ \displaystyle{ \mathbf{T}_{t+1}:= \left [\begin{array}{*{10}c} \mathbf{X}_{t+1} \\ \mathbf{Y}_{t} \end{array} \right ] = \left [\begin{array}{*{10}c} F & 0\\ AG &B \end{array} \right ]\mathbf{T}_{t}+\left [\begin{array}{*{10}c} \mathbf{V}_{t}\\ \mathbf{0} \end{array} \right ],\quad t = 1,2,\ldots, } $$
(9.3.12)

and the observation equation, from (9.3.10) and (9.3.11),

$$ \displaystyle\begin{array}{rcl} Y _{t}=\left [G\,(-1)^{d+1}{d\choose d}\ (-1)^{d}{d\choose d - 1}\ (-1)^{d-1}{d\choose d - 2}\quad \cdots \quad d\right ]& & \quad \left [\begin{array}{*{10}c} \mathbf{X}_{t} \\ \mathbf{Y}_{t-1} \end{array} \right ], \\ t = 1,2,\ldots,& & {}\end{array} $$
(9.3.13)

with initial condition

$$ \displaystyle{ \mathbf{T}_{1} = \left [\begin{array}{*{10}c} \mathbf{X}_{1} \\ \mathbf{Y}_{0} \end{array} \right ] = \left [\begin{array}{*{10}c} \sum \limits _{j=0}^{\infty }F^{ j}\ \mathbf{V}_{-j} \\ \mathbf{Y}_{0} \end{array} \right ], } $$
(9.3.14)

and the assumption

$$ \displaystyle{ E(\mathbf{Y}_{0}Z_{t}') = 0,\quad t = 0,\pm 1,\ldots, } $$
(9.3.15)

where Y 0 = (Y 1−d , Y 2−d , …, Y 0)′. The conditions (9.3.15), which are satisfied in particular if Y 0 is considered to be nonrandom and equal to the vector of observed values (y 1−d , y 2−d , …, y 0)′, are imposed to ensure that the assumptions of a state-space model given in Section 9.1 are satisfied. They also imply that \( E\left (\mathbf{X}_{1}\mathbf{Y}_{0}'\right ) = 0 \) and \( E\left (\mathbf{Y}_{0}\nabla ^{d}Y _{t}\right ) = 0 \), t ≥ 1, as required earlier in Section 6.4 for prediction of ARIMA processes.

State-space models for more general ARIMA processes (e.g., {Y t } such that {∇∇12 Y t } is an ARMA(p, q) process) can be constructed in the same way. See Problem 9.4.

For the ARIMA(1, 1, 1) process defined by

$$ \displaystyle{(1 -\phi B)(1 - B)Y _{t} = (1 +\theta B)Z_{t},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ),} $$

the vectors X t and Y t−1 reduce to X t  = (X t−1, X t )′ and Y t−1 = Y t−1. From (9.3.12) and (9.3.13) the state-space representation is therefore (Problem 9.8)

$$ \displaystyle{ Y _{t} = \left [\begin{array}{*{10}c} \theta &1&1 \end{array} \right ]\left [\begin{array}{*{10}c} X_{t-1} \\ X_{t} \\ Y _{t-1} \end{array} \right ], } $$
(9.3.16)

where

$$ \displaystyle{ \left [\begin{array}{*{10}c} X_{t} \\ X_{t+1} \\ Y _{t} \end{array} \right ] = \left [\begin{array}{*{10}c} 0 & 1 & 0\\ 0 & \phi & 0 \\ \theta & 1 & 1 \end{array} \right ]\left [\begin{array}{*{10}c} X_{t-1} \\ X_{t} \\ Y _{t-1} \end{array} \right ]+\left [\begin{array}{*{10}c} 0\\ Z_{ t+1} \\ 0 \end{array} \right ],\quad t = 1,2,\ldots, } $$
(9.3.17)

and

$$ \displaystyle{ \left [\begin{array}{*{10}c} X_{0} \\ X_{1} \\ Y _{0} \end{array} \right ] = \left [\begin{array}{*{10}c} \sum \limits _{j=0}^{\infty }\phi ^{j}Z_{-j} \\ \sum \limits _{j=0}^{\infty }\phi ^{j}Z_{1-j} \\ Y _{0} \end{array} \right ]. } $$
(9.3.18)

9.4 The Kalman Recursions

In this section we shall consider three fundamental problems associated with the state-space model defined by (9.1.1) and (9.1.2) in Section 9.1. These are all concerned with finding best (in the sense of minimum mean square error) linear estimates of the state-vector X t in terms of the observations Y 1, Y 2, …, and a random vector Y 0 that is orthogonal to V t and W t for all t ≥ 1. In many cases Y 0 will be the constant vector (1, 1, …, 1)′. Estimation of X t in terms of:

  a. Y 0, …, Y t−1 defines the prediction problem,

  b. Y 0, …, Y t defines the filtering problem,

  c. Y 0, …, Y n  (n > t) defines the smoothing problem.

Each of these problems can be solved recursively using an appropriate set of Kalman recursions, which will be established in this section.

In the following definition of best linear predictor (and throughout this chapter) it should be noted that we do not automatically include the constant 1 among the predictor variables as we did in Sections 2.5 and 8.5. (It can, however, be included by choosing Y 0 = (1, 1, …, 1)′.)

Definition 9.4.1

For the random vector X = (X 1, , X v )′,

$$ \displaystyle{ P_{t}(\mathbf{X}):= (P_{t}(X_{1}),\ldots,P_{t}(X_{v}))', } $$

where P t (X i ): = P(X i  | Y 0, Y 1, , Y t ), is the best linear predictor of X i in terms of all components of Y 0, Y 1, , Y t .

Remark 1.

By the definition of the best predictor of each component X i of X, P t (X) is the unique random vector of the form

$$ \displaystyle{ P_{t}(\mathbf{X}) = A_{0}\mathbf{Y}_{0} + \cdots + A_{t}\mathbf{Y}_{t} } $$

with v × w matrices A 0, , A t such that

$$ \displaystyle{ [\mathbf{X} - P_{t}(\mathbf{X})] \perp \mathbf{Y}_{s},\quad s = 0,\ldots,t } $$

[cf. (8.5.2) and (8.5.3)]. Recall that two random vectors X and Y are orthogonal (written X ⊥ Y) if E(XY′) is a matrix of zeros. □ 

Remark 2.

If all the components of \( \mathbf{X},\mathbf{Y}_{1},\ldots,\mathbf{Y}_{t} \) are jointly normally distributed and Y 0 = (1, , 1)′, then

$$ \displaystyle{ P_{t}(\mathbf{X}) = E(\mathbf{X}\vert \mathbf{Y}_{1},\ldots,\mathbf{Y}_{t}),\quad t \geq 1.\mbox{ $\square $} } $$

Remark 3.

P t is linear in the sense that if A is any k × v matrix and X, V are two v-variate random vectors with finite second moments, then (Problem 9.10)

$$ \displaystyle{ P_{t}(A\mathbf{X}) = AP_{t}(\mathbf{X}) } $$

and

$$ \displaystyle{ P_{t}(\mathbf{X} + \mathbf{V}) = P_{t}(\mathbf{X}) + P_{t}(\mathbf{V}). } $$

 □ 

Remark 4.

If X and Y are random vectors with v and w components, respectively, each with finite second moments, then

$$ \displaystyle{ P(\mathbf{X}\vert \mathbf{Y}) = M\mathbf{Y}, } $$

where M is a v × w matrix, M = E(XY′)[E(YY′)]−1 with [E(YY′)]−1 any generalized inverse of E(YY′). (A generalized inverse of a matrix S is a matrix S −1 such that SS −1 S = S. Every matrix has at least one. See Problem 9.11.)

In the notation just developed, the prediction, filtering, and smoothing problems (a), (b), and (c) formulated above reduce to the determination of P t−1(X t ), P t (X t ), and P n (X t ) (n > t), respectively. We deal first with the prediction problem. □ 

Kalman Prediction:

For the state-space model (9.1.1)–(9.1.2), the one-step predictors \( \hat{\mathbf{X}_{t}}:= P_{t-1}(\mathbf{X}_{t}) \) and their error covariance matrices \( \Omega _{t} = E{\bigl [{\bigl (\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\bigr )}{\bigl (\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\bigr )}'\bigr ]} \) are uniquely determined by the initial conditions

$$ \displaystyle{ \hat{\mathbf{X}}_{1} = P(\mathbf{X}_{1}\vert \mathbf{Y}_{0}),\quad \Omega _{1} = E{\bigl [{\bigl (\mathbf{X}_{1} -\hat{\mathbf{X}}_{1}\bigr )}{\bigl (\mathbf{X}_{1} -\hat{\mathbf{X}}_{1}\bigr )}'\bigr ]} } $$

and the recursions, for t = 1, 2, …, 

$$ \displaystyle{ \hat{\mathbf{X}}_{t+1} = F_{t}\hat{\mathbf{X}}_{t} +\varTheta _{t}\Delta _{t}^{-1}\left (\mathbf{Y}_{ t} - G_{t}\hat{\mathbf{X}}_{t}\right ), } $$
(9.4.1)
$$ \displaystyle{ \Omega _{t+1} = F_{t}\Omega _{t}F_{t}' + Q_{t} -\varTheta _{t}\Delta _{t}^{-1}\varTheta _{ t}', } $$
(9.4.2)

where

$$ \displaystyle{\Delta _{t} = G_{t}\Omega _{t}G_{t}' + R_{t},} $$
$$ \displaystyle{\varTheta _{t} = F_{t}\Omega _{t}G_{t}',} $$

and Δ t −1 is any generalized inverse of Δ t .

Proof.

We shall make use of the innovations I t defined by I 0 = Y 0 and

$$ \displaystyle{\mathbf{I}_{t} = \mathbf{Y}_{t} - P_{t-1}\mathbf{Y}_{t} = \mathbf{Y}_{t} - G_{t}\hat{\mathbf{X}}_{t} = G_{t}\left (\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\right ) + \mathbf{W}_{t},\quad t = 1,2,\ldots.} $$

The sequence {I t } is orthogonal by Remark 1. Using Remarks 3 and 4 and the relation

$$ \displaystyle{ P_{t}(\cdot ) = P_{t-1}(\cdot ) + P(\cdot \vert \mathbf{I}_{t}) } $$
(9.4.3)

(see Problem 9.12), we find that

$$ \displaystyle\begin{array}{rcl} \hat{\mathbf{X}}_{t+1}& =& P_{t-1}(\mathbf{X}_{t+1}) + P(\mathbf{X}_{t+1}\vert \mathbf{I}_{t}) = P_{t-1}(F_{t}\mathbf{X}_{t} + \mathbf{V}_{t}) +\varTheta _{t}\Delta _{t}^{-1}\mathbf{I}_{ t} \\ & =& F_{t}\hat{\mathbf{X}}_{t} +\varTheta _{t}\Delta _{t}^{-1}\mathbf{I}_{ t}, {}\end{array} $$
(9.4.4)

where

$$ \displaystyle{\Delta _{t} = E(\mathbf{I}_{t}\;\mathbf{I}_{t}') = G_{t}\Omega _{t}G_{t}' + R_{t},} $$
$$ \displaystyle{\varTheta _{t} = E(\mathbf{X}_{t+1}\mathbf{I}_{t}') = E\left [{\bigl (F_{t}\mathbf{X}_{t} + \mathbf{V}_{t}\bigr )}\left (\left [\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\right ]'G_{t}' + \mathbf{W}_{t}'\right )\right ] = F_{t}\Omega _{t}G_{t}'.} $$

To verify (9.4.2), we observe from the definition of Ω t+1 that

$$ \displaystyle{ \Omega _{t+1} = E\left (\mathbf{X}_{t+1}\mathbf{X}_{t+1}'\right ) - E\left (\hat{\mathbf{X}}_{t+1}\hat{\mathbf{X}}_{t+1}'\right ). } $$

With (9.1.2) and (9.4.4) this gives

$$ \displaystyle{\Omega _{t+1} = F_{t}E(\mathbf{X}_{t}\mathbf{X}_{t}')F_{t}' + Q_{t} - F_{t}E\left (\hat{\mathbf{X}}_{t}\hat{\mathbf{X}}_{t}'\right )F_{t}' -\varTheta _{t}\Delta _{t}^{-1}\varTheta _{ t}'} $$
$$ \displaystyle{\qquad \,\, = F_{t}\Omega _{t}F_{t}' + Q_{t} -\varTheta _{t}\Delta _{t}^{-1}\varTheta _{ t}'.\blacksquare } $$
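
In practice the prediction recursions translate directly into a few lines of code. The following is a minimal NumPy sketch (ours, not from the text) of (9.4.1)–(9.4.2) for constant F, G, Q, R, using the Moore–Penrose pseudoinverse as the generalized inverse Δ t −1; all names are our own.

```python
import numpy as np

def kalman_predict(ys, F, G, Q, R, x1, Omega1):
    """One-step predictors X_hat_t and error covariances Omega_t, t = 1,...,n+1,
    from the recursions (9.4.1)-(9.4.2), for constant F, G, Q, R."""
    x = np.asarray(x1, dtype=float)
    Omega = np.asarray(Omega1, dtype=float)
    xs, Omegas = [x.copy()], [Omega.copy()]
    for y in ys:
        Delta = G @ Omega @ G.T + R                    # Delta_t = G Omega_t G' + R
        Theta = F @ Omega @ G.T                        # Theta_t = F Omega_t G'
        K = Theta @ np.linalg.pinv(Delta)              # pseudoinverse as a generalized inverse
        x = F @ x + K @ (np.atleast_1d(y) - G @ x)     # (9.4.1)
        Omega = F @ Omega @ F.T + Q - K @ Theta.T      # (9.4.2): K Theta' = Theta Delta^{-1} Theta'
        xs.append(x)
        Omegas.append(Omega.copy())
    return np.array(xs), np.array(Omegas)
```

For the random walk plus noise model of Example 9.2.1, for instance, one would call this with F = G = np.eye(1), Q = [[σ v 2]], and R = [[σ w 2]].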

9.4.1 h-Step Prediction of {Y t } Using the Kalman Recursions

The Kalman prediction equations lead to a very simple algorithm for recursive calculation of the best linear mean square predictors P t Y t+h , h = 1, 2, …. From (9.4.4), (9.1.1), (9.1.2), and Remark 3 in Section 9.1, we find that

$$ \displaystyle{ P_{t}\mathbf{X}_{t+1} = F_{t}P_{t-1}\mathbf{X}_{t} +\varTheta _{t}\Delta _{t}^{-1}(\mathbf{Y}_{ t} - P_{t-1}\mathbf{Y}_{t}), } $$
(9.4.5)
$$ \displaystyle{P_{t}\mathbf{X}_{t+h} = F_{t+h-1}P_{t}\mathbf{X}_{t+h-1}} $$
$$ \displaystyle{\vdots} $$
$$ \displaystyle{ = \left (F_{t+h-1}F_{t+h-2}\cdots F_{t+1}\right )P_{t}\mathbf{X}_{t+1},\quad h = 2,3,\ldots, } $$
(9.4.6)

and

$$ \displaystyle{ P_{t}\mathbf{Y}_{t+h} = G_{t+h}P_{t}\mathbf{X}_{t+h},\quad h = 1,2,\ldots. } $$
(9.4.7)

From the relation

$$ \displaystyle{\mathbf{X}_{t+h} - P_{t}\mathbf{X}_{t+h} = F_{t+h-1}(\mathbf{X}_{t+h-1} - P_{t}\mathbf{X}_{t+h-1}) + \mathbf{V}_{t+h-1},\quad h = 2,3,\ldots,} $$

we find that \( \Omega _{t}^{(h)}:= E[(\mathbf{X}_{t+h} - P_{t}\mathbf{X}_{t+h})(\mathbf{X}_{t+h} - P_{t}\mathbf{X}_{t+h})'] \) satisfies the recursions

$$ \displaystyle{ \Omega _{t}^{(h)} = F_{ t+h-1}\Omega _{t}^{(h-1)}F_{ t+h-1}' + Q_{t+h-1},\quad h = 2,3,\ldots, } $$
(9.4.8)

with Ω t (1) = Ω t+1. Then from (9.1.1) and (9.4.7), Δ t (h) := E[(Y t+h  − P t Y t+h )(Y t+h  − P t Y t+h )′] is given by

$$ \displaystyle{ \Delta _{t}^{(h)} = G_{ t+h}\Omega _{t}^{(h)}G_{ t+h}' + R_{t+h},\quad h = 1,2,\ldots. } $$
(9.4.9)
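
Continuing the sketch above (our own names), the h-step quantities (9.4.6)–(9.4.9) follow by iterating the state matrices, assuming constant F, G, Q, R.

```python
import numpy as np

def h_step_forecast(x_next, Omega_next, F, G, Q, R, h):
    """P_t Y_{t+h} and its error covariance Delta_t^{(h)} via (9.4.6)-(9.4.9),
    given P_t X_{t+1} and Omega_t^{(1)} = Omega_{t+1} (constant F, G, Q, R)."""
    x = np.asarray(x_next, dtype=float)
    Om = np.asarray(Omega_next, dtype=float)
    for _ in range(h - 1):
        x = F @ x                           # (9.4.6)
        Om = F @ Om @ F.T + Q               # (9.4.8)
    return G @ x, G @ Om @ G.T + R          # (9.4.7) and (9.4.9)
```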

Example 9.4.1.

Consider the random walk plus noise model of Example 9.2.1 defined by

$$ \displaystyle{ Y _{t} = X_{t} + W_{t},\quad \{W_{t}\} \sim \mathrm{WN}\left (0,\sigma _{w}^{2}\right ), } $$

where the local level X t follows the random walk

$$ \displaystyle{ X_{t+1} = X_{t} + V _{t},\quad \{V _{t}\} \sim \mathrm{WN}\left (0,\sigma _{v}^{2}\right ). } $$

Applying the Kalman prediction equations with Y 0: = 1, R = σ w 2, and Q = σ v 2, we obtain

$$ \displaystyle{\hat{Y }_{t+1} = P_{t}Y _{t+1} =\hat{ X}_{t} + \frac{\varTheta _{t}} {\Delta _{t}}\left (Y _{t} -\hat{ Y }_{t}\right )} $$
$$ \displaystyle{= (1 - a_{t})\hat{Y }_{t} + a_{t}Y _{t}} $$

where

$$ \displaystyle{a_{t} = \frac{\varTheta _{t}} {\Delta _{t}} = \frac{\Omega _{t}} {\Omega _{t} +\sigma _{ w}^{2}}.} $$

For a state-space model (like this one) with time-independent parameters, the solution of the Kalman recursions (9.4.2) is called a steady-state solution if Ω t is independent of t. If Ω t  = Ω for all t, then from (9.4.2)

$$ \displaystyle{ \Omega _{t+1} = \Omega = \Omega +\sigma _{ v}^{2} - \frac{\Omega ^{2}} {\Omega +\sigma _{ w}^{2}} = \frac{\Omega \sigma _{w}^{2}} {\Omega +\sigma _{ w}^{2}} +\sigma _{ v}^{2}. } $$

Solving this quadratic equation for Ω and noting that Ω ≥ 0, we find that

$$ \displaystyle{ \Omega = \frac{1} {2}\left (\sigma _{v}^{2} + \sqrt{\sigma _{ v}^{4} + 4\sigma _{v}^{2}\sigma _{w}^{2}}\right ) } $$

Since Ω t+1 − Ω t is a continuous function of Ω t on Ω t  ≥ 0, positive at Ω t  = 0, negative for large Ω t , and zero only at Ω t  = Ω, it is clear that Ω t+1 − Ω t is negative for Ω t  > Ω and positive for Ω t  < Ω. A similar argument shows (Problem 9.14) that (Ω t+1 − Ω)(Ω t  − Ω) ≥ 0 for all \( \Omega _{t} \geq 0 \). These observations imply that Ω t+1 always falls between Ω and Ω t . Consequently, regardless of the value of Ω 1, Ω t converges to Ω, the unique solution of Ω t+1 = Ω t . For any initial predictors \( \hat{Y }_{1} =\hat{ X}_{1} \) and any initial mean squared error \( \Omega _{1} = E{\bigl (X_{1} -\hat{ X}_{1}\bigr )}^{2} \), the coefficients \( a_{t}:= \Omega _{t}/\left (\Omega _{t} +\sigma _{ w}^{2}\right ) \) converge to

$$ \displaystyle{ a ={ \Omega \over \Omega +\sigma _{ w}^{2}}, } $$

and the mean squared errors of the predictors defined by

$$ \displaystyle{ \hat{Y }_{t+1} = (1 - a_{t})\hat{Y }_{t} + a_{t}Y _{t} } $$

converge to Ω +σ w 2.

If, as is often the case, we do not know Ω 1, then we cannot determine the sequence {a t }. It is natural, therefore, to consider the behavior of the predictors defined by

$$ \displaystyle{ \hat{Y }_{t+1} = (1 - a)\hat{Y }_{t} + aY _{t} } $$

with a as above and arbitrary \( \hat{Y }_{1} \). It can be shown (Problem 9.16) that this sequence of predictors is also asymptotically optimal in the sense that the mean squared error converges to Ω +σ w 2 as t → ∞.

As shown in Example 9.2.1, the differenced process D t  = Y t  − Y t−1 is the MA(1) process

$$ \displaystyle{ D_{t} = Z_{t} +\theta Z_{t-1},\ {\bigl \{Z_{t}\bigr \}} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ), } $$

where \( \theta /\left (1 +\theta ^{2}\right ) = -\sigma _{w}^{2}/\left (2\sigma _{w}^{2} +\sigma _{ v}^{2}\right ) \). Solving this equation for θ (Problem 9.15), we find that

$$ \displaystyle{ \theta = - \dfrac{1} {2\sigma _{w}^{2}}\left (2\sigma _{w}^{2} +\sigma _{ v}^{2} -\sqrt{\sigma _{ v}^{4} + 4\sigma _{v}^{2}\sigma _{w}^{2}}\right ) } $$

and that θ = a − 1.
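
A quick numerical check (ours) of these formulas for the parameters used in Figure 9.1, σ v 2 = 4 and σ w 2 = 8, confirming Ω = 8, a = 0.5, and θ = a − 1 = −0.5:

```python
import numpy as np

sigma_v2, sigma_w2 = 4.0, 8.0
root = np.sqrt(sigma_v2 ** 2 + 4.0 * sigma_v2 * sigma_w2)
Omega = 0.5 * (sigma_v2 + root)                            # steady-state error variance
a = Omega / (Omega + sigma_w2)                             # steady-state coefficient
theta = -(2.0 * sigma_w2 + sigma_v2 - root) / (2.0 * sigma_w2)
print(Omega, a, theta)                                     # 8.0  0.5  -0.5, so theta = a - 1
```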

It is instructive to derive the exponential smoothing formula for \( \hat{Y }_{t} \) directly from the ARIMA(0,1,1) structure of {Y t }. For t ≥ 2, we have from Section 6.5 that

$$ \displaystyle{ \hat{Y }_{t+1} = Y _{t} +\theta _{t1}(Y _{t} -\hat{ Y }_{t}) = -\theta _{t1}\hat{Y }_{t} + (1 +\theta _{t1})Y _{t} } $$

for t ≥ 2, where θ t1 is found by application of the innovations algorithm to an MA(1) process with coefficient θ. It follows that 1 − a t  = −θ t1, and since θ t1 → θ (see Remark 1 of Section 3.3) and a t converges to the steady-state solution a, we conclude that

$$ \displaystyle{ 1 - a =\lim _{t\rightarrow \infty }(1 - a_{t}) = -\lim _{t\rightarrow \infty }\theta _{t1} = -\theta. } $$

Example 9.4.2.

The lognormal stochastic volatility model

We can rewrite the defining equations (7.4.2) and (7.4.3) of the lognormal SV process {Z t } in the following state-space form

$$ \displaystyle\begin{array}{rcl} X_{t} =\gamma _{1}X_{t-1} +\eta _{t},& &{}\end{array} $$
(9.4.10)

and

$$ \displaystyle\begin{array}{rcl} Y _{t} = X_{t} +\varepsilon _{t},& &{}\end{array} $$
(9.4.11)

where the (one-dimensional) state and observation vectors are

$$ \displaystyle\begin{array}{rcl} X_{t} =\ell _{t} - \frac{\gamma _{0}} {1 -\gamma _{1}},& &{}\end{array} $$
(9.4.12)

and

$$ \displaystyle\begin{array}{rcl} Y _{t} =\ln Z_{t}^{2} + 1.27 - \frac{\gamma _{0}} {2(1 -\gamma _{1})}& &{}\end{array} $$
(9.4.13)

respectively. The independent white-noise sequences {η t } and {ɛ t } have zero means and variances σ 2 and 4.93 respectively.

Taking

$$ \displaystyle\begin{array}{rcl} \hat{X}_{0} = EX_{0} = 0& &{}\end{array} $$
(9.4.14)

and

$$ \displaystyle\begin{array}{rcl} \hat{\Omega }_{0} = \mathrm{Var}(X_{0}) =\sigma ^{2}/(1 -\gamma _{ 1}^{2}),& &{}\end{array} $$
(9.4.15)

we can directly apply the Kalman prediction recursions (9.4.1), (9.4.2), (9.4.6), and (9.4.8) to compute recursively the best linear predictor of X t+h in terms of {Y s , s ≤ t}, or equivalently of the log volatility ℓ t+h in terms of the observations {lnZ s 2, s ≤ t}.

Kalman Filtering:

The filtered estimates X t | t  = P t (X t ) and their error covariance matrices Ω t | t  = E[(X t  − X t | t )(X t  − X t | t )′] are determined by the relations

$$ \displaystyle{ P_{t}\mathbf{X}_{t} = P_{t-1}\mathbf{X}_{t} + \Omega _{t}G_{t}'\Delta _{t}^{-1}\left (\mathbf{Y}_{ t} - G_{t}\hat{\mathbf{X}}_{t}\right ) } $$
(9.4.16)

and

$$ \displaystyle{ \Omega _{t\vert t} = \Omega _{t} - \Omega _{t}G_{t}'\Delta _{t}^{-1}G_{ t}\Omega _{t}'. } $$
(9.4.17)

Proof.

From (9.4.3) it follows that

$$ \displaystyle{ P_{t}\mathbf{X}_{t} = P_{t-1}\mathbf{X}_{t} + M\mathbf{I}_{t}, } $$

where

$$ \displaystyle{ M = E(\mathbf{X}_{t}\ \mathbf{I}_{t}')[E(\mathbf{I}_{t}\ \mathbf{I}_{t}')]^{-1} = E{\bigl [\mathbf{X}_{ t}(G_{t}(\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}) + W_{t})'\bigr ]}\Delta _{t}^{-1} = \Omega _{ t}G_{t}'\Delta _{t}^{-1}. } $$
(9.4.18)

To establish (9.4.17) we write

$$ \displaystyle{ \mathbf{X}_{t} - P_{t-1}\mathbf{X}_{t} = \mathbf{X}_{t} - P_{t}\mathbf{X}_{t} + P_{t}\mathbf{X}_{t} - P_{t-1}\mathbf{X}_{t} = \mathbf{X}_{t} - P_{t}\mathbf{X}_{t} + M\mathbf{I}_{t}. } $$

Using (9.4.18) and the orthogonality of X t P t X t and M I t , we find from the last equation that

$$ \displaystyle{ \Omega _{t} = \Omega _{t\vert t} + \Omega _{t}G_{t}'\Delta _{t}^{-1}G_{ t}\Omega _{t}', } $$

as required.
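
As with prediction, the filtering update is a one-step correction of the predicted state. A minimal NumPy sketch (ours) of (9.4.16)–(9.4.17), taking the predicted quantities from the prediction recursions as inputs:

```python
import numpy as np

def kalman_filter_step(y, x_pred, Omega_pred, G, R):
    """Filtered estimate P_t X_t and covariance Omega_{t|t} from (9.4.16)-(9.4.17),
    given the one-step predictor X_hat_t and its error covariance Omega_t."""
    Delta = G @ Omega_pred @ G.T + R
    M = Omega_pred @ G.T @ np.linalg.pinv(Delta)           # Omega_t G_t' Delta_t^{-1}
    x_filt = x_pred + M @ (np.atleast_1d(y) - G @ x_pred)  # (9.4.16)
    Omega_filt = Omega_pred - M @ G @ Omega_pred           # (9.4.17)
    return x_filt, Omega_filt
```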

Kalman Fixed-Point Smoothing:

The smoothed estimates X t | n  = P n X t and the error covariance matrices Ω t | n  = E[(X t  − X t | n )(X t  − X t | n )′] are determined for fixed t by the following recursions, which can be solved successively for n = t, t + 1, …:

$$ \displaystyle{ P_{n}\mathbf{X}_{t} = P_{n-1}\mathbf{X}_{t} + \Omega _{t,n}G_{n}'\Delta _{n}^{-1}\left (\mathbf{Y}_{ n} - G_{n}\hat{\mathbf{X}}_{n}\right ), } $$
(9.4.19)
$$ \displaystyle{ \Omega _{t,n+1} = \Omega _{t,n}[F_{n} -\varTheta _{n}\Delta _{n}^{-1}G_{ n}]', } $$
(9.4.20)
$$ \displaystyle{ \Omega _{t\vert n} = \Omega _{t\vert n-1} - \Omega _{t,n}G_{n}'\Delta _{n}^{-1}G_{ n}\Omega _{t,n}', } $$
(9.4.21)

with initial conditions \( P_{t-1}\mathbf{X}_{t} =\hat{ \mathbf{X}}_{t} \) and Ω t, t  = Ω t | t−1 = Ω t (found from Kalman prediction).

Proof.

Using (9.4.3) we can write P n X t  = P n−1 X t + C I n , where \( \mathbf{I}_{n} = G_{n}{\bigl (\mathbf{X}_{n} -\hat{\mathbf{X}}_{n}\bigr )} + \mathbf{W}_{n} \). By Remark 4 above,

$$ \displaystyle{ C = E\left [\mathbf{X}_{t}\left (G_{n}(\mathbf{X}_{n} -\hat{\mathbf{X}}_{n}) + \mathbf{W}_{n}\right )'\right ]\left [E\left (\mathbf{I}_{n}\mathbf{I}_{n}'\right )\right ]^{-1} = \Omega _{ t,n}G_{n}'\Delta _{n}^{-1}, } $$
(9.4.22)

where \( \Omega _{t,n}:= E\big[{\bigl (\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\bigr )}{\bigl (\mathbf{X}_{n} -\hat{\mathbf{X}}_{n}\bigr )}'\big] \). It follows now from (9.1.2), (9.4.5), the orthogonality of V n and W n with \( \mathbf{X}_{t} -\hat{\mathbf{X}}_{t} \), and the definition of Ω t, n that

$$ \displaystyle{ \Omega _{t,n+1}=E\left [\left (\mathbf{X}_{t} -\hat{\mathbf{X}}_{t}\right )\left (\mathbf{X}_{n} -\hat{\mathbf{X}}_{n}\right )'\left (F_{n} -\varTheta _{n}\Delta _{n}^{-1}G_{ n}\right )'\right ]=\Omega _{t,n}\left [F_{n} -\varTheta _{n}\Delta _{n}^{-1}G_{ n}\right ]', } $$

thus establishing (9.4.20). To establish (9.4.21) we write

$$ \displaystyle{ \mathbf{X}_{t} - P_{n}\mathbf{X}_{t} = \mathbf{X}_{t} - P_{n-1}\mathbf{X}_{t} - C\mathbf{I}_{n}. } $$

Using (9.4.22) and the orthogonality of X t P n X t and I n , the last equation then gives

$$ \displaystyle{ \Omega _{t\vert n} = \Omega _{t\vert n-1} - \Omega _{t,n}G_{n}'\Delta _{n}^{-1}G_{ n}\Omega _{t,n}',\quad n = t,t + 1,\ldots, } $$

as required.

9.5 Estimation for State-Space Models

Consider the state-space model defined by equations (9.1.1) and (9.1.2) and suppose that the model is completely parameterized by the components of the vector \( \theta \). The maximum likelihood estimate of \( \theta \) is found by maximizing the likelihood of the observations Y 1, …, Y n with respect to the components of the vector \( \theta \). If the conditional probability density of Y t given Y t−1 = y t−1, …, Y 0 = y 0 is f t (⋅ | y t−1, …, y 0), then the likelihood of Y t , t = 1, …, n (conditional on Y 0), can immediately be written as

$$ \displaystyle{ L(\theta;\mathbf{Y}_{1},\ldots,\mathbf{Y}_{n}) =\prod _{ t=1}^{n}f_{ t}(\mathbf{Y}_{t}\vert \mathbf{Y}_{t-1},\ldots,\mathbf{Y}_{0}). } $$
(9.5.1)

The calculation of the likelihood for any fixed numerical value of \( \theta \) is extremely complicated in general, but is greatly simplified if Y 0, X 1 and W t , V t , t = 1, 2, , are assumed to be jointly Gaussian. The resulting likelihood is called the Gaussian likelihood and is widely used in time series analysis (cf. Section 5.2) whether the time series is truly Gaussian or not. As before, we shall continue to use the term likelihood to mean Gaussian likelihood.

If Y 0, X 1 and W t , V t , t = 1, 2, , are jointly Gaussian, then the conditional densities in (9.5.1) are given by

$$ \displaystyle{ f_{t}(\mathbf{Y}_{t}\vert \mathbf{Y}_{t-1},\ldots,\mathbf{Y}_{0}) = (2\pi )^{-w/2}\left (\det \Delta _{ t}\right )^{-1/2}\exp \left [-{1 \over 2}\mathbf{I}_{t}'\Delta _{t}^{-1}\mathbf{I}_{ t}\right ], } $$

where \( \mathbf{I}_{t}\,=\,\mathbf{Y}_{t} - P_{t-1}\mathbf{Y}_{t}\,=\,\mathbf{Y}_{t} - G\hat{\mathbf{X}_{t}} \), P t−1 Y t , and Δ t , t ≥ 1, are the one-step predictors and error covariance matrices found from the Kalman prediction recursions. The likelihood of the observations Y 1, , Y n (conditional on Y 0) can therefore be expressed as

$$ \displaystyle{ L(\theta;\mathbf{Y}_{1},\ldots,\mathbf{Y}_{n}) = (2\pi )^{-nw/2}\left (\prod _{ j=1}^{n}\det \Delta _{ j}\right )^{-1/2}\exp \left [-{1 \over 2}\sum _{j=1}^{n}\mathbf{I}_{ j}'\Delta _{j}^{-1}\mathbf{I}_{ j}\right ]. } $$
(9.5.2)

Given the observations Y 1, , Y n , the distribution of Y 0 (see Section 9.4), and a particular parameter value \( \theta \), the numerical value of the likelihood L can be computed from the previous equation with the aid of the Kalman recursions of Section 9.4. To find maximum likelihood estimates of the components of \( \theta \), a nonlinear optimization algorithm must be used to search for the value of \( \theta \) that maximizes the value of L.
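
For a univariate series the computation of the likelihood reduces to a single pass of the prediction recursions. The sketch below (ours, not from the text) accumulates −2 ln L of (9.5.2) for constant F, G, Q, R with w = 1; all names are assumptions.

```python
import numpy as np

def neg2_loglik(ys, F, G, Q, R, x1, Omega1):
    """-2 ln(Gaussian likelihood) of (9.5.2) for univariate observations (w = 1),
    accumulated along the Kalman prediction recursions with constant F, G, Q, R."""
    x = np.asarray(x1, dtype=float)
    Omega = np.asarray(Omega1, dtype=float)
    val = len(ys) * np.log(2.0 * np.pi)
    for y in ys:
        Delta = float(G @ Omega @ G.T + R)            # innovation variance Delta_t
        innov = float(y - G @ x)                      # innovation I_t = Y_t - G X_hat_t
        val += np.log(Delta) + innov**2 / Delta
        Theta = (F @ Omega @ G.T).ravel()             # Theta_t = F Omega_t G'
        x = F @ x + Theta * (innov / Delta)           # (9.4.1)
        Omega = F @ Omega @ F.T + Q - np.outer(Theta, Theta) / Delta   # (9.4.2)
    return val
```

Minimizing this function over the free parameters (for example with a general-purpose numerical optimizer) gives the Gaussian maximum likelihood estimates; this is only one possible way to organize the computation.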

Having estimated the parameter vector \( \theta \), we can compute forecasts based on the fitted state-space model and estimated mean squared errors by direct application of equations (9.4.7) and (9.4.9).

9.5.1 Application to Structural Models

The general structural model for a univariate time series {Y t } of which we gave examples in Section 9.2 has the form

$$ \displaystyle{ Y _{t} = G\mathbf{X}_{t} + W_{t},\quad \{W_{t}\} \sim \mathrm{WN}\left (0,\sigma _{w}^{2}\right ),\qquad \qquad \qquad \qquad \qquad } $$
(9.5.3)
$$ \displaystyle{ \mathbf{X}_{t+1} = F\mathbf{X}_{t} + \mathbf{V}_{t},\quad \{\mathbf{V}_{t}\} \sim \mathrm{WN}(0,Q),\qquad \qquad \qquad \qquad \qquad } $$
(9.5.4)

for t = 1, 2, , where F and G are assumed known. We set Y 0 = 1 in order to include constant terms in our predictors and complete the specification of the model by prescribing the mean and covariance matrix of the initial state X 1. A simple and convenient assumption is that X 1 is equal to a deterministic but unknown parameter \( \boldsymbol{\mu } \) and that \( \hat{\mathbf{X}}_{1} =\boldsymbol{\mu } \), so that Ω 1 = 0. The parameters of the model are then \( \boldsymbol{\mu } \), Q, and σ w 2.

Direct maximization of the likelihood (9.5.2) is difficult if the dimension of the state vector is large. The maximization can, however, be simplified by the following stepwise procedure. For fixed Q we find \( \hat{\boldsymbol{\mu }}(Q) \) and \( \hat{\sigma }_{w}^{2}(Q) \) that maximize the likelihood \( L\left (\boldsymbol{\mu },Q,\sigma _{w}^{2}\right ) \). We then maximize the “reduced likelihood” \( L\left (\hat{\boldsymbol{\mu }}(Q),Q,\hat{\sigma }_{w}^{2}(Q)\right ) \) with respect to Q.

To achieve this we define the mean-corrected state vectors, \( \mathbf{X}_{t}^{{\ast}} = \mathbf{X}_{t} - F^{t-1}\boldsymbol{\mu } \), and apply the Kalman prediction recursions to \( \{\mathbf{X}_{t}^{{\ast}}\} \) with initial condition \( \mathbf{X}_{1}^{{\ast}} = \mathbf{0} \). This gives, from (9.4.1),

$$ \displaystyle{ \hat{\mathbf{X}}_{t+1}^{{\ast}} = F\hat{\mathbf{X}}_{ t}^{{\ast}} +\varTheta _{ t}\Delta _{t}^{-1}\left (Y _{ t} - G\hat{\mathbf{X}}_{t}^{{\ast}}\right ),\quad t = 1,2,\ldots, } $$
(9.5.5)

with \( \hat{\mathbf{X}}_{1}^{{\ast}} = \mathbf{0} \). Since \( \hat{\mathbf{X}}_{t} \) also satisfies (9.5.5), but with initial condition \( \hat{\mathbf{X}}_{1} =\boldsymbol{\mu } \), it follows that

$$ \displaystyle{ \hat{\mathbf{X}}_{t} =\hat{ \mathbf{X}}_{t}^{{\ast}} + C_{ t}\boldsymbol{\mu } } $$
(9.5.6)

for some v × v matrices C t . (Note that although \( \hat{\mathbf{X}}_{t} = P(\mathbf{X}_{t}\vert Y _{0},Y _{1},\ldots,Y _{t-1}) \), the quantity \( \hat{\mathbf{X}}_{t}^{{\ast}} \) is not the corresponding predictor of \( \mathbf{X}_{t}^{{\ast}} \).) The matrices C t can be determined recursively from (9.5.5), (9.5.6), and (9.4.1). Substituting (9.5.6) into (9.5.5) and using (9.4.1), we have

$$ \displaystyle\begin{array}{rcl} \hat{\mathbf{X}}_{t+1}^{{\ast}}& =& F\left (\hat{\mathbf{X}}_{ t} - C_{t}\boldsymbol{\mu }\right ) +\varTheta _{t}\Delta _{t}^{-1}\left (Y _{ t} - G\left (\hat{\mathbf{X}}_{t} - C_{t}\boldsymbol{\mu }\right )\right ) {}\\ & =& F\hat{\mathbf{X}}_{t} +\varTheta _{t}\Delta _{t}^{-1}\left (Y _{ t} - G\hat{\mathbf{X}}_{t}\right ) -\left (F -\varTheta _{t}\Delta _{t}^{-1}G\right )C_{ t}\boldsymbol{\mu } {}\\ & =& \hat{\mathbf{X}}_{t+1} -\left (F -\varTheta _{t}\Delta _{t}^{-1}G\right )C_{ t}\boldsymbol{\mu }, {}\\ \end{array} $$

so that

$$ \displaystyle{ C_{t+1} = \left (F -\varTheta _{t}\Delta _{t}^{-1}G\right )C_{ t} } $$
(9.5.7)

with C 1 equal to the identity matrix. The quadratic form in the likelihood (9.5.2) is therefore

$$ \displaystyle{ S(\boldsymbol{\mu },Q,\sigma _{w}^{2}) =\sum _{ t=1}^{n}\frac{\left (Y _{t} - G\hat{\mathbf{X}}_{t}\right )^{2}} {\Delta _{t}} } $$
(9.5.8)
$$ \displaystyle{ =\sum _{ t=1}^{n}\frac{\left (Y _{t} - G\hat{\mathbf{X}}_{t}^{{\ast}}- GC_{ t}\boldsymbol{\mu }\right )^{2}} {\Delta _{t}}. } $$
(9.5.9)

Now let \( Q^{{\ast}}:=\sigma _{w}^{-2}Q \) and define \( L^{{\ast}} \) to be the likelihood function with this new parameterization, i.e., \( L^{{\ast}}\left (\boldsymbol{\mu },Q^{{\ast}},\sigma _{w}^{2}\right ) = L\left (\boldsymbol{\mu },\sigma _{w}^{2}Q^{{\ast}},\sigma _{w}^{2}\right ) \). Writing \( \Delta _{t}^{{\ast}} =\sigma _{w}^{-2}\Delta _{t} \) and \( \Omega _{t}^{{\ast}} =\sigma _{w}^{-2}\Omega _{t} \), we see that the predictors \( \hat{\mathbf{X}}_{t}^{{\ast}} \) and the matrices C t in (9.5.7) depend on the parameters only through Q ∗. Thus,

$$ \displaystyle{ S\left (\boldsymbol{\mu },Q,\sigma _{w}^{2}\right ) =\sigma _{ w}^{-2}S\left (\boldsymbol{\mu },Q^{{\ast}},1\right ), } $$

so that

$$ \displaystyle\begin{array}{rcl} -2\ln L^{{\ast}}\left (\boldsymbol{\mu },Q^{{\ast}},\sigma _{ w}^{2}\right )& =& n\ln (2\pi ) +\sum _{ t=1}^{n}\ln \Delta _{ t} +\sigma _{ w}^{-2}S\left (\boldsymbol{\mu },Q^{{\ast}},1\right ) {}\\ & =& n\ln (2\pi ) +\sum _{ t=1}^{n}\ln \Delta _{ t}^{{\ast}} + n\ln \sigma _{ w}^{2} +\sigma _{ w}^{-2}S\left (\boldsymbol{\mu },Q^{{\ast}},1\right ). {}\\ \end{array} $$

For Q ∗ fixed, it is easy to show (see Problem 9.18) that this function is minimized when

$$ \displaystyle{ \hat{\boldsymbol{\mu }}=\hat{\boldsymbol{\mu }} \left (Q^{{\ast}}\right ) = \left [\sum _{ t=1}^{n}\frac{C_{t}'G'GC_{t}} {\Delta _{t}^{{\ast}}} \right ]^{-1}\sum _{ t=1}^{n}\frac{C_{t}'G'\left (Y _{t} - G\hat{\mathbf{X}}_{t}^{{\ast}}\right )} {\Delta _{t}^{{\ast}}} } $$
(9.5.10)

and

$$ \displaystyle\begin{array}{rcl} \hat{\sigma }_{w}^{2} =\hat{\sigma }_{ w}^{2}\left (Q^{{\ast}}\right ) = n^{-1}\sum _{ t=1}^{n}\frac{\left (Y _{t} - G\hat{\mathbf{X}}_{t}^{{\ast}}- GC_{ t}\hat{\boldsymbol{\mu }}\right )^{2}} {\Delta _{t}^{{\ast}}}.& &{}\end{array} $$
(9.5.11)

Replacing \( \boldsymbol{\mu } \) and σ w 2 by these values in \( -2\ln L^{{\ast}} \) and ignoring constants, the reduced likelihood becomes

$$ \displaystyle{ \ell\left (Q^{{\ast}}\right ) =\ln \left (n^{-1}\sum _{ t=1}^{n}\frac{\big(Y _{t} - G\hat{\mathbf{X}}_{t}^{{\ast}}- GC_{ t}\hat{\boldsymbol{\mu }}\big)^{2}} {\Delta _{t}^{{\ast}}} \right ) + n^{-1}\sum _{ t=1}^{n}\ln \left (\det \Delta _{ t}^{{\ast}}\right ). } $$
(9.5.12)

If \( \hat{Q}^{{\ast}} \) denotes the minimizer of (9.5.12), then the maximum likelihood estimators of the parameters \( \boldsymbol{\mu },Q,\sigma _{w}^{2} \) are \( \hat{\boldsymbol{\mu }},\hat{\sigma }_{w}^{2}\hat{Q}^{{\ast}},\hat{\sigma }_{w}^{2} \), where \( \hat{\boldsymbol{\mu }} \) and \( \hat{\sigma }_{w}^{2} \) are computed from (9.5.10) and (9.5.11) with Q ∗ replaced by \( \hat{Q}^{{\ast}} \).

We can now summarize the steps required for computing the maximum likelihood estimators of \( \boldsymbol{\mu } \), Q, and σ w 2 for the model (9.5.3)–(9.5.4).

  1. For a fixed Q ∗, apply the Kalman prediction recursions with \( \hat{\mathbf{X}}_{1}^{{\ast}} = \mathbf{0} \), Ω 1 = 0, Q = Q ∗, and σ w 2 = 1 to obtain the predictors \( \hat{\mathbf{X}}_{t}^{{\ast}} \). Let Δ t ∗ denote the one-step prediction error variances produced by these recursions.

  2. Set \( \hat{\boldsymbol{\mu }}=\hat{\boldsymbol{\mu }} (Q^{{\ast}}) = \left [\sum _{t=1}^{n}C_{t}'G'GC_{t}/\Delta _{t}^{{\ast}}\right ]^{-1}\sum _{t=1}^{n}C_{t}'G'(Y _{t} - G\hat{\mathbf{X}}_{t}^{{\ast}})/\Delta _{t}^{{\ast}} \).

  3. Let \( \hat{Q}^{{\ast}} \) be the minimizer of (9.5.12).

  4. The maximum likelihood estimators of \( \boldsymbol{\mu } \), Q, and σ w 2 are then given by \( \hat{\boldsymbol{\mu }},\hat{\sigma }_{w}^{2}\hat{Q}^{{\ast}} \), and \( \hat{\sigma }_{w}^{2} \), respectively, where \( \hat{\boldsymbol{\mu }} \) and \( \hat{\sigma }_{w}^{2} \) are found from (9.5.10) and (9.5.11) evaluated at \( \hat{Q}^{{\ast}} \).

Example 9.5.1.

Random Walk Plus Noise Model

In Example 9.2.1, 100 observations were generated from the structural model

$$ \displaystyle\begin{array}{rcl} Y _{t}& =& M_{t} + W_{t},\quad \{W_{t}\} \sim \mathrm{WN}\left (0,\sigma _{w}^{2}\right ),\qquad \qquad \qquad \qquad \quad {}\\ M_{t+1}& =& M_{t} + V _{t},\quad \{V _{t}\} \sim \mathrm{WN}\left (0,\sigma _{v}^{2}\right ),\qquad \qquad \qquad \qquad \qquad \quad {}\\ \end{array} $$

with initial values μ = M 1 = 0, σ w 2 = 8, and σ v 2 = 4. The maximum likelihood estimates of the parameters are found by first minimizing (9.5.12) with \( \hat{\mu } \) given by (9.5.10). Substituting these values into (9.5.11) gives \( \hat{\sigma }_{w}^{2} \). The resulting estimates are \( \hat{\mu }= 0.906, \) \( \hat{\sigma }_{v}^{2} = 5.351 \), and \( \hat{\sigma }_{w}^{2} = 8.233 \), which are in reasonably close agreement with the true values.

Example 9.5.2.

International Airline Passengers, 1949–1960; AIRPASS.TSM

The monthly totals of international airline passengers from January 1949 to December 1960 (Box and Jenkins 1976) are displayed in Figure 9.3. The data exhibit both a strong seasonal pattern and a nearly linear trend. Since the variability of the data Y 1, , Y 144 increases for larger values of Y t , it may be appropriate to consider a logarithmic transformation of the data. For the purpose of this illustration, however, we will fit a structural model incorporating a randomly varying trend and seasonal and noise components (see Example 9.2.3) to the raw data. This model has the form

$$ \displaystyle\begin{array}{rcl} Y _{t}& =& G\mathbf{X}_{t} + W_{t},\quad \{W_{t}\} \sim \mathrm{WN}\left (0,\,\sigma _{w}^{2}\right ), {}\\ \mathbf{X}_{t+1}& =& F\mathbf{X}_{t} + \mathbf{V}_{t},\quad \{\mathbf{V}_{t}\} \sim \mathrm{WN}(0,\,Q), {}\\ \end{array} $$

where X t is a 13-dimensional state-vector,

$$ \displaystyle\begin{array}{rcl} F& =& \left [\begin{array}{*{10}c} 1 & 1 & 0 & 0 & \cdots & 0 & 0\\ 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & -1 & -1 & \cdots & -1 & -1\\ 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & 0 & \cdots & 1 & 0 \end{array} \right ], {}\\ G& =& \,\left [\begin{array}{*{10}c} 1\ &0\ &1\ &0\ &\ \cdots &\ 0 \end{array} \right ],{}\\ \end{array} $$

and

$$ \displaystyle{ Q = \left [\begin{array}{*{10}c} \sigma _{1}^{2} & 0 & 0 & 0 & \cdots & 0 \\ 0 & \sigma _{2}^{2} & 0 & 0 & \cdots & 0 \\ 0 & 0 & \sigma _{3}^{2} & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \cdots & 0 \end{array} \right ]. } $$

The parameters of the model are \( \boldsymbol{\mu },\sigma _{1}^{2},\sigma _{2}^{2},\sigma _{3}^{2} \), and σ w 2, where \( \boldsymbol{\mu }= \mathbf{X}_{1} \). Minimizing (9.5.12) with respect to Q we find from (9.5.11) and (9.5.12) that

$$ \displaystyle{\left (\hat{\sigma }_{1}^{2},\hat{\sigma }_{ 2}^{2},\hat{\sigma }_{ 3}^{2},\hat{\sigma }_{ w}^{2}\right ) = (170.63,.00000,11.338,.014179)} $$

and from (9.5.10) that \( \hat{\boldsymbol{\mu }}=\, \)(146.9, 2.171, −34.92, −34.12, −47.00, −16.98, 22.99, 53.99, 58.34, 33.65, 2.204, −4.053, −6.894)′. The first component, X t1, of the state vector corresponds to the local linear trend with slope X t2. Since \( \hat{\sigma }_{2}^{2} = 0 \), the slope at time t, which satisfies

$$ \displaystyle{ X_{t2} = X_{t-1,2} + V _{t2}, } $$

must be nearly constant and equal to \( \hat{X}_{12} = 2.171 \). The first three components of the predictors \( \hat{\mathbf{X}}_{t} \) are plotted in Figure 9.4. Notice that the first component varies like a random walk around a straight line, while the second component is nearly constant as a result of \( \hat{\sigma }_{2}^{2} \approx 0 \). The third component, corresponding to the seasonal component, exhibits a clear seasonal cycle that repeats roughly the same pattern throughout the 12 years of data. The one-step predictors \( \hat{X}_{t1} +\hat{ X}_{t3} \) of Y t are plotted in Figure 9.5 (solid line) together with the actual data (square boxes). For this model the predictors follow the movement of the data quite well.

Fig. 9.3

International airline passengers; monthly totals from January 1949 to December 1960

Fig. 9.4

The one-step predictors \( \left (\hat{X}_{t1},\hat{X}_{t2},\hat{X}_{t3}\right )' \) for the airline passenger data in Example 9.5.2

Fig. 9.5

The one-step predictors \( \hat{Y }_{t} \) for the airline passenger data (solid line) and the actual data (square boxes)

9.6 State-Space Models with Missing Observations

State-space representations and the associated Kalman recursions are ideally suited to the analysis of data with missing values, as was pointed out by Jones (1980) in the context of maximum likelihood estimation for ARMA processes. In this section we shall deal with two missing-value problems for state-space models. The first is the evaluation of the (Gaussian) likelihood based on \( \{\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\} \), where i 1, i 2, …, i r are positive integers such that 1 ≤ i 1 < i 2 < ⋯ < i r  ≤ n. (This allows for observation of the process {Y t } at irregular intervals, or equivalently for the possibility that (n − r) observations are missing from the sequence {Y 1, …, Y n }.) The solution of this problem will, in particular, enable us to carry out maximum likelihood estimation for ARMA and ARIMA processes with missing values. The second problem to be considered is the minimum mean squared error estimation of the missing values themselves.

9.6.1 The Gaussian Likelihood of \( \{\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\}, \) 1 ≤ i 1 < i 2 < ⋯ < i r  ≤ n

Consider the state-space model defined by equations (9.1.1) and (9.1.2) and suppose that the model is completely parameterized by the components of the vector \( \theta \). If there are no missing observations, i.e., if r = n and i j  = j, j = 1, , n, then the likelihood of the observations {Y 1, , Y n } is easily found as in Section 9.5 to be

$$ \displaystyle{ L(\theta;\mathbf{Y}_{1},\ldots,\mathbf{Y}_{n}) = (2\pi )^{-nw/2}\left (\prod _{ j=1}^{n}\det \Delta _{ j}\right )^{-1/2}\exp \left [-{1 \over 2}\sum _{j=1}^{n}\mathbf{I}_{ j}'\Delta _{j}^{-1}\mathbf{I}_{ j}\right ], } $$

where I j  = Y j  − P j−1 Y j , and P j−1 Y j and Δ j , j ≥ 1, are the one-step predictors and error covariance matrices found from (9.4.7) and (9.4.9) with Y 0 = 1.

To deal with the more general case of possibly irregularly spaced observations \( \{\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\} \), we introduce a new series {Y t *}, related to the process {X t } by the modified observation equation

$$ \displaystyle{ \mathbf{Y}_{t}^{{\ast}} = G_{ t}^{{\ast}}\mathbf{X}_{ t} + \mathbf{W}_{t}^{{\ast}},\quad t = 1,2,\ldots, } $$
(9.6.1)

where

$$ \displaystyle{ G_{t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} G_{t}\quad &\mbox{ if }t \in \{ i_{1},\ldots,i_{r}\}, \\ 0 \quad &\mbox{ otherwise}, \end{array} \right.\quad \mathbf{W}_{t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{W}_{t}\quad &\mbox{ if }t \in \{ i_{1},\ldots,i_{r}\}, \\ \mathbf{N}_{t} \quad &\mbox{ otherwise}, \end{array} \right. } $$
(9.6.2)

and {N t } is iid with

$$ \displaystyle{ \mathbf{N}_{t} \sim \mathrm{N}(\mathbf{0},I_{w\,\times \,w}),\quad \mathbf{N}_{s} \perp \mathbf{X}_{1},\quad \mathbf{N}_{s} \perp \left [\begin{array}{*{10}c} \mathbf{V}_{t} \\ \mathbf{W}_{t} \end{array} \right ],\quad s,t = 0,\pm 1,\ldots. } $$
(9.6.3)

Equations (9.6.1) and (9.1.2) constitute a state-space representation for the new series {Y t *}, which coincides with {Y t } at each \( t \in \{ i_{1},i_{2},\ldots,i_{r}\} \), and at other times takes random values that are independent of {Y t } with a distribution independent of \( \theta \).

Let \( L_{1}\left (\theta;\,\mathbf{y}_{i_{1}},\ldots,\mathbf{y}_{i_{r}}\right ) \) be the Gaussian likelihood based on the observed values \( \mathbf{y}_{i_{1}},\ldots,\mathbf{y}_{i_{r}} \) of \( \mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}} \) under the model defined by (9.1.1) and (9.1.2). Corresponding to these observed values, we define a new sequence, y 1 *, …, y n *, by

$$ \displaystyle{ \mathbf{y}_{t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{y}_{t}\quad &\mbox{ if }t \in \{ i_{1},\ldots,i_{r}\}, \\ \mathbf{0} \quad &\mbox{ otherwise}. \end{array} \right. } $$
(9.6.4)

Then it is clear from the preceding paragraph that

$$ \displaystyle{ L_{1}\left (\theta;\mathbf{y}_{i_{1}},\ldots,\mathbf{y}_{i_{r}}\right ) = (2\pi )^{(n-r)w/2}L_{ 2}\left (\theta;\mathbf{y}_{1}^{{\ast}},\ldots,\mathbf{y}_{ n}^{{\ast}}\right ), } $$
(9.6.5)

where L 2 denotes the Gaussian likelihood under the model defined by (9.6.1) and (9.1.2).

In view of (9.6.5) we can now compute the required likelihood L 1 of the realized values {y t , t = i 1, , i r } as follows:

  i.

    Define the sequence {y t , t = 1, , n} as in (9.6.4).

  ii.

    Find the one-step predictors \( \hat{\mathbf{Y}}_{t}^{{\ast}} \) of Y t *, and their error covariance matrices Δ t *, using Kalman prediction and equations (9.4.7) and (9.4.9) applied to the state-space representation (9.6.1) and (9.1.2) of {Y t *}. Denote the realized values of the predictors, based on the observation sequence \( \left \{\mathbf{y}_{t}^{{\ast}}\right \} \), by \( \left \{\hat{\mathbf{y}}_{t}^{{\ast}}\right \} \).

  iii.

    The required Gaussian likelihood of the irregularly spaced observations \( \{\mathbf{y}_{i_{1}},\ldots,\mathbf{y}_{i_{r}}\} \) is then, by (9.6.5),

    $$ \displaystyle{L_{1 } (\theta; \mathbf{y}_{i_{1}},\ldots,\mathbf{y}_{i_{r}}) = (2\pi )^{-rw/2}\left (\prod _{ j=1}^{n}\det \Delta _{ j}^{{\ast}}\right )^{-1/2}\exp \left \{-\frac{1} {2}\sum _{j=1}^{n}\mathbf{i}_{ j}^{{\ast}}{}'\Delta _{ j}^{{\ast}-1}\mathbf{i}_{ j}^{{\ast}}\right \},} $$

    where i j * denotes the observed innovation \( \mathbf{y}_{j}^{{\ast}}-\hat{\mathbf{y}}_{j}^{{\ast}} \), j = 1, …, n.

Example 9.6.1.

An AR(1) Series with One Missing Observation

Let {Y t } be the causal AR(1) process defined by

$$ \displaystyle{ Y _{t} -\phi Y _{t-1} = Z_{t},\quad \ \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ). } $$

To find the Gaussian likelihood of the observations y 1, y 3, y 4, and y 5 of Y 1, Y 3, Y 4, and Y 5 we follow the steps outlined above.

  i.

    Set y i * = y i ,  i = 1, 3, 4, 5, and y 2 * = 0.

  ii.

    We start with the state-space model for {Y t } from Example 9.1.1, i.e., Y t  = X t ,  X t+1 = ϕ X t + Z t+1. The corresponding model for {Y t *} is then, from (9.6.1),

    $$ \displaystyle{ Y _{t}^{{\ast}} = G_{ t}^{{\ast}}X_{ t} + W_{t}^{{\ast}},\ t = 1,2,\ldots, } $$

    where

    $$ \displaystyle\begin{array}{rcl} X_{t+1}& =& F_{t}X_{t} + V _{t},\ t = 1,2,\ldots, {}\\ F_{t}& =& \phi,\quad G_{t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} 1\quad &\mbox{ if }t\neq 2, \\ 0\quad &\mbox{ if }t = 2, \end{array} \right.\quad V _{t} = Z_{t+1},\quad W_{t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} 0 \quad &\mbox{ if }t\neq 2, \\ N_{t}\quad &\mbox{ if }t = 2, \end{array} \right. {}\\ Q_{t}& =& \sigma ^{2},\qquad R_{ t}^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} 0\quad &\mbox{ if }t\neq 2, \\ 1\quad &\mbox{ if }t = 2, \end{array} \right.\qquad S_{t}^{{\ast}} = 0, {}\\ \end{array} $$

    and \( X_{1} =\sum _{j=0}^{\infty }\phi ^{j}Z_{1-j} \). Starting from the initial conditions

    $$ \displaystyle{ \hat{X}_{1} = 0,\qquad \Omega _{1} =\sigma ^{2}/\left (1 -\phi ^{2}\right ), } $$

    and applying the recursions (9.4.1) and (9.4.2), we find (Problem 9.19) that

    $$ \displaystyle{ \varTheta _{t}\Delta _{t}^{-1} = \left \{\begin{array}{@{}l@{\quad }l@{}} \phi \quad &\mbox{ if }t = 1,3,4,5, \\ 0\quad &\mbox{ if }t = 2, \end{array} \right.\quad \Omega _{t} = \left \{\begin{array}{@{}l@{\quad }l@{}} \sigma ^{2}/\left (1 -\phi ^{2}\right )\quad &\mbox{ if }t = 1, \\ \sigma ^{2}\left (1 +\phi ^{2}\right ) \quad &\mbox{ if }t = 3, \\ \sigma ^{2} \quad &\mbox{ if }t = 2,4,5, \end{array} \right. } $$

    and

    $$ \displaystyle{\hat{X}_{1} = 0,\quad \hat{X}_{2} =\phi Y _{1},\quad \hat{X}_{3} =\phi ^{2}Y _{ 1},\quad \hat{X}_{4} =\phi Y _{3},\quad \hat{X}_{5} =\phi Y _{4}.} $$

    From (9.4.7) and (9.4.9) with h = 1, we find that

    $$ \displaystyle{\hat{Y }_{1}^{{\ast}} = 0,\quad \hat{Y }_{ 2}^{{\ast}} = 0,\quad \hat{Y }_{ 3}^{{\ast}} =\phi ^{2}Y _{ 1},\quad \hat{Y }_{4}^{{\ast}} =\phi Y _{ 3},\quad \hat{Y }_{5}^{{\ast}} =\phi Y _{ 4},} $$

    with corresponding mean squared errors

    $$ \displaystyle{ \Delta _{1 }^{{\ast} } =\sigma ^{2}/\left (1 -\phi ^{2}\right ),\quad \Delta _{ 2}^{{\ast}} = 1,\quad \Delta _{ 3}^{{\ast}} =\sigma ^{2}\left (1 +\phi ^{2}\right ),\quad \Delta _{ 4}^{{\ast}} =\sigma ^{2},\quad \Delta _{ 5}^{{\ast}} =\sigma ^{2}.} $$
  iii.

    From the preceding calculations we can now write the likelihood of the original data as

    $$ \displaystyle\begin{array}{rcl} & &L_{1 } (\phi,\sigma ^{2};\,y_{ 1},y_{3},y_{4},y_{5})=\sigma ^{-4}(2\pi )^{-2}\left [\left (1-\phi ^{2}\right )/\left (1+\phi ^{2}\right )\right ]^{1/2} {}\\ & &\quad \times \exp \left \{-{ 1 \over 2\sigma ^{2}}\left [y_{1}^{2}\left (1-\phi ^{2}\right )+\frac{(y_{3}-\phi ^{2}y_{ 1})^{2}} {1+\phi ^{2}} +(y_{4}-\phi y_{3})^{2}+(y_{ 5}-\phi y_{4})^{2}\right ]\right \}. {}\\ \end{array} $$
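The likelihood just derived can be checked numerically. The following Python sketch (our own illustration, with arbitrary made-up values for y 1, y 3, y 4, y 5) computes −2 log L 1 in two ways: by running the Kalman prediction recursions on the modified series {Y t *} as in steps (i)–(iii), and from the closed-form expression above. The two computations agree to rounding error.

```python
import numpy as np

def ar1_missing_neg2loglik_kalman(phi, sigma2, y, observed):
    """-2 log L_1 for a causal AR(1) with missing values, via the modified
    state-space model (9.6.1)-(9.1.2): at unobserved times Y_t* carries no
    information about the state, so only the prediction step is performed."""
    n = len(y)
    xhat, omega = 0.0, sigma2 / (1.0 - phi ** 2)   # Xhat_1 = 0, Omega_1 = stationary variance
    val = 0.0
    for t in range(n):
        if observed[t]:
            delta = omega                          # Delta_t = Omega_t (Y_t = X_t exactly)
            innov = y[t] - xhat
            val += np.log(2.0 * np.pi) + np.log(delta) + innov ** 2 / delta
            xhat = phi * xhat + phi * innov        # Theta_t Delta_t^{-1} = phi
            omega = sigma2                         # Omega_{t+1} = phi^2 Omega_t + sigma^2 - phi^2 Omega_t
        else:
            xhat = phi * xhat                      # no update at a missing time
            omega = phi ** 2 * omega + sigma2
    return val

def ar1_missing_neg2loglik_closed(phi, sigma2, y1, y3, y4, y5):
    """-2 log of the closed-form likelihood of Example 9.6.1."""
    quad = (y1 ** 2 * (1 - phi ** 2) + (y3 - phi ** 2 * y1) ** 2 / (1 + phi ** 2)
            + (y4 - phi * y3) ** 2 + (y5 - phi * y4) ** 2)
    return (4 * np.log(2.0 * np.pi) + 4 * np.log(sigma2)
            - np.log((1 - phi ** 2) / (1 + phi ** 2)) + quad / sigma2)

y = np.array([1.2, 0.0, -0.5, 0.8, 0.3])           # hypothetical data; y[1] plays the missing Y_2
obs = np.array([True, False, True, True, True])
print(ar1_missing_neg2loglik_kalman(0.6, 2.0, y, obs))
print(ar1_missing_neg2loglik_closed(0.6, 2.0, 1.2, -0.5, 0.8, 0.3))
```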

Remark 1.

If we are given observations \( y_{1-d},y_{2-d},\ldots,y_{0},y_{i_{1}},y_{i_{2}},\ldots,y_{i_{r}} \) of an ARIMA( p, d, q) process at times 1 − d, 2 − d, , 0, i 1, , i r , where 1 ≤ i 1 < i 2 < ⋯ < i r  ≤ n, a similar argument can be used to find the Gaussian likelihood of \( y_{i_{1}},\ldots,y_{i_{r}} \) conditional on Y 1−d  = y 1−d , Y 2−d  = y 2−d , , Y 0 = y 0. Missing values among the first d observations y 1−d , y 2−d , , y 0 can be handled by treating them as unknown parameters for likelihood maximization. For more on ARIMA series with missing values see Brockwell and Davis (1991) and Ansley and Kohn (1985). □ 

9.6.2 Estimation of Missing Values for State-Space Models

Given that we observe only \( \mathbf{Y}_{i_{1}},\mathbf{Y}_{i_{2}},\ldots,\mathbf{Y}_{i_{r}},1 \leq i_{1} < i_{2} < \cdots < i_{r} \leq n \), where {Y t } has the state-space representation (9.1.1) and (9.1.2), we now consider the problem of finding the minimum mean squared error estimators \( P\left (\mathbf{Y}_{t}\vert \mathbf{Y}_{0},\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\right ) \) of Y t , 1 ≤ t ≤ n, where Y 0 = 1. To handle this problem we again use the modified process {Y t *} defined by (9.6.1) and (9.1.2) with Y 0 * = 1. Since Y s * = Y s for s ∈ { i 1, …, i r } and Y s * ⊥ X t , Y 0 for 1 ≤ t ≤ n and s ∉ {0, i 1, …, i r }, we immediately obtain the minimum mean squared error state estimators

$$ \displaystyle{ P\left (\mathbf{X}_{t}\vert \mathbf{Y}_{0},\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\right ) = P\left (\mathbf{X}_{t}\vert \mathbf{Y}_{0}^{{\ast}},\mathbf{Y}_{ 1}^{{\ast}},\ldots,\mathbf{Y}_{ n}^{{\ast}}\right ),\quad 1 \leq t \leq n. } $$
(9.6.6)

The right-hand side can be evaluated by application of the Kalman fixed-point smoothing algorithm to the state-space model (9.6.1) and (9.1.2). For computational purposes the observed values of Y t *,  t ∉ {0, i 1, …, i r }, are quite immaterial. They may, for example, all be set equal to zero, giving the sequence of observations \( \left \{\mathbf{y}_{t}^{{\ast}}\right \} \) defined in (9.6.4).

To evaluate \( P\left (\mathbf{Y}_{t}\vert \mathbf{Y}_{0},\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\right ) \), 1 ≤ t ≤ n, we use (9.6.6) and the relation

$$ \displaystyle{ \mathbf{Y}_{t} = G_{t}\mathbf{X}_{t} + \mathbf{W}_{t}. } $$
(9.6.7)

Since \( E\left (\mathbf{V}_{t}\mathbf{W}_{t}'\right ) = S_{t} = 0,\quad t = 1,\ldots,n, \) we find from (9.6.7) that

$$ \displaystyle{ P\left (\mathbf{Y}_{t}\vert \mathbf{Y}_{0},\mathbf{Y}_{i_{1}},\ldots,\mathbf{Y}_{i_{r}}\right ) = G_{t}P\left (\mathbf{X}_{t}\vert \mathbf{Y}_{0}^{{\ast}},\mathbf{Y}_{ 1}^{{\ast}},\ldots,\mathbf{Y}_{ n}^{{\ast}}\right ). } $$
(9.6.8)

Example 9.6.2.

An AR(1) Series with One Missing Observation

Consider the problem of estimating the missing value Y 2 in Example 9.6.1 in terms of Y 0 = 1, Y 1, Y 3, Y 4, and Y 5. We start from the state-space model X t+1 = ϕ X t + Z t+1, Y t  = X t , for {Y t }. The corresponding model for {Y t *} is the one used in Example 9.6.1. Applying the Kalman smoothing equations to the latter model, we find that

$$ \displaystyle{ \begin{array}{l@{\quad }l@{\quad }l} P_{1}X_{2} =\phi Y _{1}, \quad &P_{2}X_{2} =\phi Y _{1}, \quad &P_{3}X_{2} = \dfrac{\phi (Y _{1} + Y _{3})} {(1 +\phi ^{2})}, \\ P_{4}X_{2} = P_{3}X_{2},\quad &P_{5}X_{2} = P_{3}X_{2},\quad \\ \Omega _{2,2} =\sigma ^{2}, \quad & \Omega _{2,3} =\phi \sigma ^{2}, \quad & \Omega _{2,t} = 0,\quad t \geq 4,\\ \quad \end{array} } $$

and

$$ \displaystyle{ \Omega _{2\vert 1} =\sigma ^{2},\quad \Omega _{ 2\vert 2} =\sigma ^{2},\quad \Omega _{ 2\vert t} ={ \sigma ^{2} \over (1 +\phi ^{2})},\quad t \geq 3, } $$

where P t (⋅ ) here denotes \( P\left (\cdot \vert Y _{0}^{{\ast}},\ldots,Y _{t}^{{\ast}}\right ) \) and \( \Omega _{t,n},\Omega _{t\vert n} \) are defined correspondingly. We deduce from (9.6.8) that the minimum mean squared error estimator of the missing value Y 2 is

$$ \displaystyle{ P_{5}Y _{2} = P_{5}X_{2} = \frac{\phi (Y _{1} + Y _{3})} {\left (1 +\phi ^{2}\right )}, } $$

with mean squared error

$$ \displaystyle{ \Omega _{2\vert 5} = \frac{\sigma ^{2}} {\left (1 +\phi ^{2}\right )}. } $$

Remark 2.

Suppose we have observations \( Y _{1-d},Y _{2-d},\ldots,Y _{0},Y _{i_{1}},\ldots,Y _{i_{r}} \) \( (1 \leq i_{1} < i_{2}\cdots < i_{r} \leq n) \) of an ARIMA(p, d, q) process. Determination of the best linear estimates of the missing values \( Y _{t},\,t\notin \{i_{1},\ldots,i_{r}\} \), in terms of \( Y _{t},\,t \in \{ i_{1},\ldots,i_{r}\} \), and the components of Y 0: = (Y 1−d , Y 2−d , , Y 0)′ can be carried out as in Example 9.6.2 using the state-space representation of the ARIMA series {Y t } from Example 9.3.3 and the Kalman recursions for the corresponding state-space model for {Y t } defined by (9.6.1) and (9.1.2). See Brockwell and Davis (1991) for further details. □ 

We close this section with a brief discussion of a direct approach to estimating missing observations. This approach is often more efficient than the methods just described, especially if the number of missing observations is small and we have a simple (e.g., autoregressive) model. Consider the general problem of computing E(X | Y) when the random vector (X′, Y′)′ has a multivariate normal distribution with mean 0 and covariance matrix Σ. (In the missing observation problem, think of X as the vector of the missing observations and Y as the vector of observed values.) Then the joint probability density function of X and Y can be written as

$$ \displaystyle{ f_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = f_{\mathbf{X}\vert \mathbf{Y}}(\mathbf{x}\vert \mathbf{y})f_{\mathbf{Y}}(\mathbf{y}), } $$
(9.6.9)

where \( f_{\mathbf{X}\vert \mathbf{Y}}(\mathbf{x}\vert \mathbf{y}) \) is a multivariate normal density with mean E(X | Y) and covariance matrix \( \varSigma_{\mathbf{X}\vert \mathbf{Y}} \) (see Proposition A.3.1). In particular,

$$ \displaystyle{ f_{\mathbf{X}\vert \mathbf{Y}}(\mathbf{x}\vert \mathbf{y}) = \frac{1} {\sqrt{(2\pi )^{q } \det \varSigma_{\mathbf{X} \vert \mathbf{Y}} }}\exp \left \{-\frac{1} {2}(\mathbf{x} - E(\mathbf{X}\vert \mathbf{y}))'\varSigma_{\mathbf{X}\vert \mathbf{Y}}^{-1}(\mathbf{x} - E(\mathbf{X}\vert \mathbf{y}))\right \}, } $$
(9.6.10)

where q = dim(X). It is clear from (9.6.10) that \( f_{\mathbf{X}\vert \mathbf{Y}} (\mathbf{x}\vert \mathbf{y}) \) (and also f X, Y (x, y)) is maximized when x = E(X | y). Thus, the best estimator of X in terms of Y can be found by maximizing the joint density of X and Y with respect to x. For autoregressive processes it is relatively straightforward to carry out this optimization, as shown in the following example.

Example 9.6.3.

Estimating Missing Observations in an AR Process

Suppose {Y t } is the AR( p) process defined by

$$ \displaystyle{ Y _{t} =\phi _{1}Y _{t-1} + \cdots +\phi _{p}Y _{t-p} + Z_{t},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ), } $$

and \( \mathbf{Y} = (Y _{i_{1}},\ldots,Y _{i_{r}})' \), with 1 ≤ i 1 < ⋯ < i r  ≤ n, are the observed values. If there are no missing observations in the first p observations, then the best estimates of the missing values are found by minimizing

$$ \displaystyle{ \sum _{t=p+1}^{n}(Y _{ t} -\phi _{1}Y _{t-1} -\cdots -\phi _{p}Y _{t-p})^{2} } $$
(9.6.11)

with respect to the missing values (see Problem 9.20). For the AR(1) model in Example 9.6.2, minimization of (9.6.11) is equivalent to minimizing

$$ \displaystyle{(Y _{2} -\phi Y _{1})^{2} + (Y _{ 3} -\phi Y _{2})^{2}} $$

with respect to Y 2. Setting the derivative of this expression with respect to Y 2 equal to 0 and solving for Y 2 we obtain \( E(Y _{2}\vert Y _{1},Y _{3},Y _{4},Y _{5}) =\phi (Y _{1} + Y _{3})/\left (1 +\phi ^{2}\right ) \).
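Because (9.6.11) is a quadratic function of the missing values, this direct approach amounts to solving a small linear system. The Python sketch below (our own illustration, not part of ITSM) does so by simple coordinate descent: it repeatedly sets each missing value to the minimizer of (9.6.11) with the remaining values held fixed. For a single missing value in an AR(1) series it reproduces the estimate ϕ(Y 1 + Y 3)∕(1 + ϕ²) obtained above.

```python
import numpy as np

def estimate_missing_ar(y, observed, phi, sweeps=100):
    """Fill in missing values of a (mean-corrected) AR(p) series by minimizing the
    sum of squares (9.6.11) with respect to the missing values, assuming none of
    the first p values is missing.  Coordinate descent on the quadratic objective."""
    y = np.asarray(y, dtype=float).copy()
    observed = np.asarray(observed, dtype=bool)
    phi = np.asarray(phi, dtype=float)
    p, n = len(phi), len(y)
    a = np.concatenate(([1.0], -phi))            # residual r_s = sum_k a_k y_{s-k}
    missing = np.where(~observed)[0]
    y[missing] = 0.0                             # crude starting values
    for _ in range(sweeps):
        for t in missing:
            num = den = 0.0
            for j in range(p + 1):               # residuals r_{t+j} involve y[t]
                s = t + j
                if s < p or s >= n:
                    continue
                r_without = sum(a[k] * y[s - k] for k in range(p + 1) if s - k != t)
                num -= a[j] * r_without
                den += a[j] ** 2
            y[t] = num / den
    return y

y = np.array([1.0, 0.0, 2.0, 0.5, -1.0])         # hypothetical AR(1) data, y[1] missing
filled = estimate_missing_ar(y, [True, False, True, True, True], [0.6])
print(filled[1], 0.6 * (1.0 + 2.0) / (1 + 0.6 ** 2))   # the two values coincide
```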

9.7 The EM Algorithm

The expectation-maximization (EM) algorithm is an iterative procedure for computing the maximum likelihood estimator when only a subset of the complete data set is available. Dempster et al. (1977) demonstrated the wide applicability of the EM algorithm and are largely responsible for popularizing this method in statistics. Details regarding the convergence and performance of the EM algorithm can be found in Wu (1983).

In the usual formulation of the EM algorithm, the “complete” data vector W is made up of “observed” data Y (sometimes called incomplete data) and “unobserved” data X. In many applications, X consists of values of a “latent” or unobserved process occurring in the specification of the model. For example, in the state-space model of Section 9.1, Y could consist of the observed vectors Y 1, , Y n and X of the unobserved state vectors X 1, , X n . The EM algorithm provides an iterative procedure for computing the maximum likelihood estimator based only on the observed data Y. Each iteration of the EM algorithm consists of two steps. If θ (i) denotes the estimated value of the parameter θ after i iterations, then the two steps in the (i + 1)th iteration are

$$ \displaystyle{ \mathbf{ E-step }.\qquad \qquad \mbox{ Calculate }Q(\theta \vert \theta ^{(i)}) = E_{\theta ^{ (i)}}\left [\ell(\theta;\mathbf{X},\mathbf{Y})\vert \mathbf{Y}\right ] } $$

and

$$ \displaystyle{ \mathbf{ M-step }.\qquad \qquad \mbox{ Maximize }Q(\theta \vert \theta ^{(i)})\mbox{ with respect to }\theta. } $$

Then θ (i+1) is set equal to the maximizer of Q in the M-step. In the E-step, \( \ell(\theta;\mathbf{x},\mathbf{y}) =\ln f(\mathbf{x},\mathbf{y};\theta ) \), and \( E_{\theta ^{(i)}}(\cdot \vert \mathbf{Y}) \) denotes the conditional expectation relative to the conditional density \( f{\bigl (\mathbf{x}\vert \mathbf{y};\theta ^{(i)}\bigr )} = f{\bigl (\mathbf{x},\mathbf{y};\theta ^{(i)}\bigr )}/f{\bigl (\mathbf{y};\theta ^{(i)}\bigr )} \).

It can be shown that \( \ell{\bigl (\theta ^{(i)};\mathbf{Y}\bigr )} \) is nondecreasing in i, and a simple heuristic argument shows that if θ (i) has a limit \( \hat{\theta } \) then \( \hat{\theta } \) must be a solution of the likelihood equations \( \ell'{\bigl (\hat{\theta };\mathbf{Y}\bigr )} = 0 \). To see this, observe that \( \ln f(\mathbf{x},\mathbf{y};\theta ) =\ln f(\mathbf{x}\vert \mathbf{y};\theta ) +\ell (\theta;\mathbf{y}) \), from which we obtain

$$ \displaystyle{ Q\left (\theta \vert \theta ^{(i)}\right ) =\int \left (\ln f(\mathbf{x}\vert \mathbf{Y};\theta )\right )f\left (\mathbf{x}\vert \mathbf{Y};\theta ^{(i)}\right )\,d\mathbf{x} +\ell (\theta;\mathbf{Y}) } $$

and

$$ \displaystyle{ Q'(\theta \vert \theta ^{(i)}) =\int \left [\frac{\partial } {\partial \theta }f(\mathbf{x}\vert \mathbf{Y};\theta )\right ]/f(\mathbf{x}\vert \mathbf{Y};\theta )f\left (\mathbf{x}\vert \mathbf{Y};\theta ^{(i)}\right )\,d\mathbf{x} +\ell '(\theta;\mathbf{Y}). } $$

Now replacing θ with θ (i+1), noticing that Q′(θ (i+1) | θ (i)) = 0, and letting i → ∞, we find that

$$ \displaystyle{ 0 =\int \frac{\partial } {\partial \theta }\left [\ f(\mathbf{x}\vert \mathbf{Y};\theta )\right ]_{\theta =\hat{\theta }}\,d\mathbf{x} +\ell '\left (\hat{\theta };\mathbf{Y}\right ) =\ell '\left (\hat{\theta };\mathbf{Y}\right ). } $$

The last equality follows from the fact that

$$ \displaystyle{ 0 = \frac{\partial } {\partial \theta }(1) = \frac{\partial } {\partial \theta }\left [\int f(\mathbf{x}\vert \mathbf{Y};\theta )\,d\mathbf{x}\right ]_{\theta =\hat{\theta }} =\int \left [\frac{\partial } {\partial \theta }\ f(\mathbf{x}\vert \mathbf{Y};\theta )\right ]_{\theta =\hat{\theta }}\,d\mathbf{x}. } $$

The computational advantage of the EM algorithm over direct maximization of the likelihood is most pronounced when the calculation and maximization of the exact likelihood is difficult as compared with the maximization of Q in the M-step. (There are some applications in which the maximization of Q can easily be carried out explicitly.)

9.7.1 Missing Data

The EM algorithm is particularly useful for estimation problems in which there are missing observations. Suppose the complete data set consists of Y 1, …, Y n of which r are observed and n − r are missing. Denote the observed and missing data by \( \mathbf{Y} = (Y _{i_{1}},\ldots,Y _{i_{r}})' \) and \( \mathbf{X} = (Y _{j_{1}},\ldots,Y _{j_{n-r}})' \), respectively. Assuming that W = (X′, Y′)′ has a multivariate normal distribution with mean 0 and covariance matrix Σ, which depends on the parameter \( \theta \), the log-likelihood of the complete data is given by

$$ \displaystyle{ \ell(\theta;\mathbf{W}) = -\frac{n} {2} \ln (2\pi ) -\frac{1} {2}\ln \det (\varSigma ) -\frac{1} {2}\mathbf{W}'\varSigma ^{-1}\mathbf{W}. } $$

The E-step requires that we compute the expectation of \( \ell(\theta;\mathbf{W}) \) with respect to the conditional distribution of W given Y with \( \theta =\theta ^{(i)} \). Writing \( \varSigma (\theta ) \) as the block matrix

$$ \displaystyle{ \varSigma = \left [\begin{array}{*{10}c} \varSigma _{11}\ & \varSigma _{12}\\ \varSigma _{ 21}\ & \varSigma _{22} \end{array} \right ], } $$

which is conformable with X and Y, the conditional distribution of W given Y is multivariate normal with mean \( \left[\begin{array}{c}\hat{{\mathbf{X}}}\\ \mathbf{Y}\end{array}\right] \) and covariance matrix \( \left[\begin{array}{cc}\varSigma _{11\vert 2}(\theta ) & 0\\ 0 & 0\end{array}\right] \), where \( \hat{\mathbf{X}} = \) \( E_{\theta }(\mathbf{X}\vert \mathbf{Y}) =\varSigma _{12}\varSigma _{22}^{-1}\mathbf{Y} \) and \( \varSigma _{11\vert 2}(\theta ) =\varSigma _{11} -\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21} \) (see Proposition A.3.1). Using Problem A.8, we have

$$ \displaystyle{ E_{\theta ^{(i) } } \left [(\mathbf{X}',\mathbf{Y}')\varSigma ^{-1}(\theta )(\mathbf{X}',\mathbf{Y}')'\vert \mathbf{Y}\right ] = \mbox{ trace}\left (\varSigma _{ 11\vert 2}(\theta ^{(i)})\varSigma _{ 11\vert 2}^{-1}(\theta )\right ) +\hat{ \mathbf{W}}'\varSigma ^{-1}(\theta )\hat{\mathbf{W}}, } $$

where \( \hat{\mathbf{W}} = \left (\hat{\mathbf{X}}',\mathbf{Y}'\right )' \). It follows that

$$ \displaystyle{ Q\left (\theta \vert \theta ^{(i)}\right ) =\ell \left (\theta,\hat{\mathbf{W}}\right ) -\frac{1} {2}\mathrm{trace}\left (\varSigma _{11\vert 2}\left (\theta ^{(i)}\right )\varSigma _{ 11\vert 2}^{-1}(\theta )\right ). } $$

The first term on the right is the log-likelihood based on the complete data, but with X replaced by its “best estimate” \( \hat{\mathbf{X}} \) calculated from the previous iteration. If the increments \( \theta ^{(i+1)} -\theta ^{(i)} \) are small, then the second term on the right is nearly constant ( ≈ n − r) and can be ignored. For ease of computation in this application we shall use the modified version

$$ \displaystyle{ \tilde{Q}\left (\theta \vert \theta ^{(i)}\right ) =\ell \left (\theta;\hat{\mathbf{W}}\right ). } $$

With this adjustment, the steps in the EM algorithm are as follows:

E-step:

Calculate \( E_{\theta ^{(i)}}(\mathbf{X}\vert \mathbf{Y}) \) (e.g., with the Kalman fixed-point smoother) and form \( \ \ \ell{\bigl (\theta;\hat{\mathbf{W}}\bigr )} \).

M-step:

Find the maximum likelihood estimator for the “complete” data problem, i.e., maximize \( \ell{\bigl (\theta;\hat{ \mathbf{W}}\bigr )} \). For ARMA processes, ITSM can be used directly, with the missing values replaced with their best estimates computed in the E-step.

Example 9.7.1.

The Lake Data

It was found in Example 5.2.5 that the AR(2) model

$$ \displaystyle{W_{t} - 1.0415W_{t-1} + 0.2494W_{t-2} = Z_{t},\ \ \{Z_{t}\} \sim \mathrm{WN}(0,.4790)} $$

was a good fit to the mean-corrected lake data {W t }. To illustrate the use of the EM algorithm for missing data, consider fitting an AR(2) model to the mean-corrected data assuming that there are 10 missing values at times t = 17, 24, 31, 38, 45, 52, 59, 66, 73, and 80. We start the algorithm at iteration 0 with \( \hat{\phi }_{1}^{(0)} =\hat{\phi }_{ 2}^{(0)} = 0 \). Since this initial model represents white noise, the first E-step gives, in the notation used above, \( \hat{W}_{17} = \cdots =\hat{ W}_{80} = 0 \). Replacing the “missing” values of the mean-corrected lake data with 0 and fitting a mean-zero AR(2) model to the resulting complete data set using the maximum likelihood option in ITSM, we find that \( \hat{\phi}_{1}^{(1)} = 0.7252 \), \( \hat{\phi}_{2}^{(1)} = 0.0236 \). (Examination of the plots of the ACF and PACF of this new data set suggests an AR(1) as a better model. This is also borne out by the small estimated value of ϕ 2.) The updated missing values at times t = 17, 24, , 80 are found (see Section 9.6 and Problem 9.21) by minimizing

$$ \displaystyle{ \sum _{j=0}^{2}\left (W_{ t+j} -\hat{\phi }_{1}^{(1)}W_{ t+j-1} -\hat{\phi }_{2}^{(1)}W_{ t+j-2}\right )^{2} } $$

with respect to W t . The solution is given by

$$ \displaystyle{ \hat{W}_{t} = \frac{\hat{\phi }_{2}^{(1)}(W_{t-2} + W_{t+2}) + \left (\hat{\phi }_{1}^{(1)} -\hat{\phi }_{1}^{(1)}\hat{\phi }_{2}^{(1)}\right )(W_{t-1} + W_{t+1})} {1 + \left (\hat{\phi }_{1}^{(1)}\right )^{2} + \left (\hat{\phi }_{2}^{(1)}\right )^{2}}. } $$

The M-step of iteration 1 is then carried out by fitting an AR(2) model using ITSM applied to the updated data set. As seen in the summary of the results reported in Table 9.1, the EM algorithm converges in four iterations with the final parameter estimates reasonably close to the fitted model based on the complete data set. (In Table 9.1, estimates of the missing values are recorded only for the first three.) Also notice how \( -2\ell\left (\theta ^{(i)},\mathbf{W}\right ) \) decreases at every iteration. The standard errors of the parameter estimates produced from the last iteration of ITSM are based on a “complete” data set and, as such, underestimate the true sampling errors. Formulae for adjusting the standard errors to reflect the true sampling error based on the observed data can be found in Dempster et al. (1977).

Table 9.1 Estimates of the missing observations at times t = 17, 24, 31 and the AR estimates using the EM algorithm in Example 9.7.1
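The iteration of Example 9.7.1 is easy to sketch in code. In the fragment below (our own illustration), the M-step uses an ordinary least-squares AR(2) fit as a stand-in for ITSM's exact maximum likelihood fit, and the data are simulated from the fitted lake model rather than taken from the actual lake series, so the resulting numbers will not match Table 9.1; the structure of the iteration, however, is the same.

```python
import numpy as np

def fit_ar2_ls(w):
    """Stand-in M-step: least-squares AR(2) fit (ITSM would use exact MLE)."""
    X = np.column_stack([w[1:-1], w[:-2]])
    coef, *_ = np.linalg.lstsq(X, w[2:], rcond=None)
    return coef                                        # (phi_1, phi_2)

def em_ar2_missing(w, missing, n_iter=5):
    """Alternate between refitting the AR(2) model and updating each (isolated)
    missing value with the closed-form minimizer displayed above."""
    w = np.asarray(w, dtype=float).copy()
    w[missing] = 0.0                                   # iteration 0: white-noise fit
    for _ in range(n_iter):
        phi1, phi2 = fit_ar2_ls(w)                     # M-step
        denom = 1.0 + phi1 ** 2 + phi2 ** 2            # E-step
        for t in missing:
            w[t] = (phi2 * (w[t - 2] + w[t + 2])
                    + (phi1 - phi1 * phi2) * (w[t - 1] + w[t + 1])) / denom
    return w, (phi1, phi2)

# Simulate 98 mean-corrected observations from the fitted lake model.
rng = np.random.default_rng(1)
n = 98
w = np.zeros(n)
z = rng.normal(0.0, np.sqrt(0.4790), n)
for t in range(2, n):
    w[t] = 1.0415 * w[t - 1] - 0.2494 * w[t - 2] + z[t]

missing = np.arange(16, 80, 7)          # t = 17, 24, ..., 80 in the book's numbering
w_filled, phi_hat = em_ar2_missing(w, missing)
print(phi_hat)
```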

9.8 Generalized State-Space Models

As in Section 9.1, we consider a sequence of state variables {X t , t ≥ 1} and a sequence of observations {Y t , t ≥ 1}. For simplicity, we consider only one-dimensional state and observation variables, since extensions to higher dimensions can be carried out with little change. Throughout this section it will be convenient to write Y (t) and X (t) for the t-dimensional column vectors Y (t) = (Y 1, Y 2, …, Y t )′ and X (t) = (X 1, X 2, …, X t )′.

There are two important types of state-space models, “parameter driven” and “observation driven,” both of which are frequently used in time series analysis. The observation equation is the same for both, but the state vectors of a parameter-driven model evolve independently of the past history of the observation process, while the state vectors of an observation-driven model depend on past observations.

9.8.1 Parameter-Driven Models

In place of the observation and state equations (9.1.1) and (9.1.2), we now make the assumptions that Y t given \( {\bigl (X_{t},\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\bigr )} \) is independent of \( {\bigl (\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\bigr )} \) with conditional probability density

$$ \displaystyle{ p(\,y_{t}\vert x_{t}):= p{\bigl (y_{t}\vert x_{t},\mathbf{x}^{(t-1)},\mathbf{y}^{(t-1)}\bigr )},\quad t = 1,2,\ldots, } $$
(9.8.1)

and that X t+1 given \( {\bigl (X_{t},\mathbf{X}^{(t-1)},\mathbf{Y}^{(t)}\bigr )} \) is independent of \( {\bigl (\mathbf{X}^{(t-1)},\mathbf{Y}^{(t)}\bigr )} \) with conditional density function

$$ \displaystyle{ p(x_{t+1}\vert x_{t}):= p{\bigl (x_{t+1}\vert x_{t},\mathbf{x}^{(t-1)},\mathbf{y}^{(t)}\bigr )}\quad t = 1,2,\ldots. } $$
(9.8.2)

We shall also assume that the initial state X 1 has probability density p 1. The joint density of the observation and state variables can be computed directly from (9.8.1)–(9.8.2) as

$$ \displaystyle\begin{array}{rcl} p(\,y_{1},\ldots,y_{n},x_{1},\ldots,x_{n})& =& p\left (y_{n}\vert x_{n},\mathbf{x}^{(n-1)},\mathbf{y}^{(n-1)}\right )p\left (x_{ n},\mathbf{x}^{(n-1)},\mathbf{y}^{(n-1)}\right ) {}\\ & =& p(\,y_{n}\vert x_{n})p\left (x_{n}\vert \mathbf{x}^{(n-1)},\mathbf{y}^{(n-1)}\right )p\left (\mathbf{y}^{(n-1)},\mathbf{x}^{(n-1)}\right ) {}\\ & =& p(\,y_{n}\vert x_{n})p(x_{n}\vert x_{n-1})p\left (\mathbf{y}^{(n-1)},\mathbf{x}^{(n-1)}\right ) {}\\ & =& \cdots {}\\ & =& \left (\prod _{j=1}^{n}p(\,y_{ j}\vert x_{j})\right )\left (\prod _{j=2}^{n}p(x_{ j}\vert x_{j-1})\right )p_{1}(x_{1}), {}\\ \end{array} $$

and since (9.8.2) implies that {X t } is Markov (see Problem 9.22),

$$ \displaystyle{ p(y_{1},\ldots,y_{n}\vert x_{1},\ldots,x_{n}) = \left (\prod _{j=1}^{n}p(\,y_{ j}\vert x_{j})\right ). } $$
(9.8.3)

We conclude that Y 1, , Y n are conditionally independent given the state variables X 1, , X n , so that the dependence structure of {Y t } is inherited from that of the state process {X t }. The sequence of state variables {X t } is often referred to as the hidden or latent generating process associated with the observed process.

In order to solve the filtering and prediction problems in this setting, we shall determine the conditional densities \( p\left (x_{t}\vert \mathbf{y}^{(t)}\right ) \) of X t given Y (t), and \( p\left (x_{t}\vert \mathbf{y}^{(t-1)}\right ) \) of X t given Y (t−1), respectively. The minimum mean squared error estimates of X t based on Y (t) and Y (t−1) can then be computed as the conditional expectations, \( E\left (X_{t}\vert \mathbf{Y}^{(t)}\right ) \) and \( E\left (X_{t}\vert \mathbf{Y}^{(t-1)}\right ) \).

An application of Bayes’s theorem, using the assumption that the distribution of Y t given \( \left (X_{t},\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\right ) \) does not depend on \( \left (\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\right ) \), yields

$$ \displaystyle{ p\left (x_{t}\vert \mathbf{y}^{(t)}\right ) = p(y_{ t}\vert x_{t})p\left (x_{t}\vert \mathbf{y}^{(t-1)}\right )/p\left (y_{ t}\vert \mathbf{y}^{(t-1)}\right ) } $$
(9.8.4)

and

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) =\int p\left (x_{ t}\vert \mathbf{y}^{(t)}\right )p(x_{ t+1}\vert x_{t})\,d\mu (x_{t}). } $$
(9.8.5)

(The integral relative to d μ(x t ) in (9.8.5) is interpreted as the integral relative to dx t in the continuous case and as the sum over all values of x t in the discrete case.) The initial condition needed to solve these recursions is

$$ \displaystyle{ p\left (x_{1}\vert \mathbf{y}^{(0)}\right ):= p_{ 1}(x_{1}). } $$
(9.8.6)

The factor \( p\left (y_{t}\vert \mathbf{y}^{(t-1)}\right ) \) appearing in the denominator of (9.8.4) is just a scale factor, determined by the condition \( \int p\left (x_{t}\vert \mathbf{y}^{(t)}\right )\,d\mu (x_{t}) = 1. \) In the generalized state-space setup, prediction of a future state variable is less important than forecasting a future value of the observations. The relevant forecast density can be computed from (9.8.5) as

$$ \displaystyle{ p\left (\,y_{t+1}\vert \mathbf{y}^{(t)}\right ) =\int p(\,y_{ t+1}\vert x_{t+1})p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right )\,d\mu (x_{ t+1}). } $$
(9.8.7)

Equations (9.8.1)–(9.8.2) can be regarded as a Bayesian model specification. A classical Bayesian model has two key assumptions. The first is that the data Y 1, , Y t , given an unobservable parameter (X (t) in our case), are independent with specified conditional distribution. This corresponds to (9.8.3). The second specifies a prior distribution for the parameter value. This corresponds to (9.8.2). The posterior distribution is then the conditional distribution of the parameter given the data. In the present setting the posterior distribution of the component X t of X (t) is determined by the solution (9.8.4) of the filtering problem.

Example 9.8.1.

Consider the simplified version of the linear state-space model of Section 9.1,

$$ \displaystyle{ Y _{t} = GX_{t} + W_{t},\quad \{W_{t}\} \sim \mathrm{iid\ N}(0,R), } $$
(9.8.8)
$$ \displaystyle{ X_{t+1} = FX_{t} + V _{t},\quad \{V _{t}\} \sim \mathrm{iid\ N}(0,Q), } $$
(9.8.9)

where the noise sequences {W t } and {V t } are independent of each other. For this model the probability densities in (9.8.1)–(9.8.2) become

$$ \displaystyle{ p_{1}(x_{1}) = n(x_{1};EX_{1},\mathrm{Var}(X_{1})), } $$
(9.8.10)
$$ \displaystyle{ p(y_{t}\vert x_{t}) = n(\,y_{t};Gx_{t},R), } $$
(9.8.11)
$$ \displaystyle{ p(x_{t+1}\vert x_{t}) = n(x_{t+1};Fx_{t},Q), } $$
(9.8.12)

where \( n\left (x;\mu,\sigma ^{2}\right ) \) is the normal density with mean μ and variance σ 2 defined in Example (a) of Section A.1.

To solve the filtering and prediction problems in this new framework, we first observe that the filtering and prediction densities in (9.8.4) and (9.8.5) are both normal. We shall write them, using the notation of Section 9.4, as

$$ \displaystyle{ p\left (x_{t}\vert \mathbf{Y}^{(t)}\right ) = n(x_{ t};X_{t\vert t},\Omega _{t\vert t}) } $$
(9.8.13)

and

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{Y}^{(t)}\right ) = n\left (x_{ t+1};\hat{X}_{t+1},\Omega _{t+1}\right ). } $$
(9.8.14)

From (9.8.5), (9.8.12), (9.8.13), and (9.8.14), we find that

$$ \displaystyle\begin{array}{rcl} \hat{X}_{t+1}& =& \int _{-\infty }^{\infty }x_{ t+1}p(x_{t+1}\vert \mathbf{Y}^{(t)})dx_{ t+1} {}\\ & =& \int _{-\infty }^{\infty }x_{ t+1}\int _{-\infty }^{\infty }p(x_{ t}\vert \mathbf{Y}^{(t)})p(x_{ t+1}\vert x_{t})\,dx_{t}\,dx_{t+1}\phantom{well} {}\\ & =& \int _{-\infty }^{\infty }p(x_{ t}\vert \mathbf{Y}^{(t)})\left [\int _{ -\infty }^{\infty }x_{ t+1}p(x_{t+1}\vert x_{t})\,dx_{t+1}\right ]\,dx_{t} {}\\ & =& \int _{-\infty }^{\infty }Fx_{ t}p(x_{t}\vert \mathbf{Y}^{(t)})\,dx_{ t} {}\\ & =& FX_{t\vert t} {}\\ \end{array} $$

and (see Problem 9.23)

$$ \displaystyle{ \Omega _{t+1} = F^{2}\Omega _{ t\vert t} + Q. } $$

Substituting the corresponding densities (9.8.11) and (9.8.14) into (9.8.4), we find by equating the coefficient of x t 2 on both sides of (9.8.4) that

$$ \displaystyle{ \Omega _{t\vert t}^{-1} = G^{2}R^{-1} + \Omega _{ t}^{-1} = G^{2}R^{-1} + (F^{2}\Omega _{ t-1\vert t-1} + Q)^{-1} } $$

and

$$ \displaystyle{ X_{t\vert t} =\hat{ X}_{t} + \Omega _{t\vert t}GR^{-1}\left (Y _{ t} - G\hat{X}_{t}\right ). } $$

Also, from (9.8.4) with \( p\left (x_{1}\vert \mathbf{y}^{(0)}\right ) = n(x_{1};EX_{1},\Omega _{1}) \) we obtain the initial conditions

$$ \displaystyle{ X_{1\vert 1} = EX_{1} + \Omega _{1\vert 1}GR^{-1}(Y _{ 1} - GEX_{1}) } $$

and

$$ \displaystyle{ \Omega _{1\vert 1}^{-1} = G^{2}R^{-1} + \Omega _{ 1}^{-1}. } $$

The Kalman prediction and filtering recursions of Section 9.4 give the same results for \( \hat{X}_{t} \) and X t | t , since for Gaussian systems best linear mean square estimation is equivalent to best mean square estimation.
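A minimal Python sketch of these filtering and prediction recursions for the scalar model (9.8.8)–(9.8.9) is given below; it assumes R > 0 and Var(X 1) > 0 so that the inverses appearing above exist, and it returns both the filtered quantities X t | t , Ω t | t and the predictors \( \hat{X}_{t+1} \), Ω t+1.

```python
import numpy as np

def gaussian_filter(y, F, G, Q, R, x1_mean, x1_var):
    """Filtering recursions of Example 9.8.1 for the scalar linear Gaussian model."""
    n = len(y)
    x_filt, v_filt = np.empty(n), np.empty(n)          # X_{t|t}, Omega_{t|t}
    x_pred, v_pred = np.empty(n + 1), np.empty(n + 1)  # Xhat_t, Omega_t
    x_pred[0], v_pred[0] = x1_mean, x1_var
    for t in range(n):
        v_filt[t] = 1.0 / (G ** 2 / R + 1.0 / v_pred[t])
        x_filt[t] = x_pred[t] + v_filt[t] * G / R * (y[t] - G * x_pred[t])
        x_pred[t + 1] = F * x_filt[t]                  # Xhat_{t+1} = F X_{t|t}
        v_pred[t + 1] = F ** 2 * v_filt[t] + Q         # Omega_{t+1} = F^2 Omega_{t|t} + Q
    return x_filt, v_filt, x_pred, v_pred

# Example call with arbitrary parameter values:
y = np.array([0.3, -0.1, 0.8, 0.5])
print(gaussian_filter(y, F=0.9, G=1.0, Q=0.5, R=1.0, x1_mean=0.0, x1_var=2.0)[0])
```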

Example 9.8.2.

A non-Gaussian Example

In general, the solution of the recursions (9.8.4) and (9.8.5) presents substantial computational problems. Numerical methods for dealing with non-Gaussian models are discussed by Sorenson and Alspach (1971) and Kitagawa (1987). Here we shall illustrate the recursions (9.8.4) and (9.8.5) in a very simple special case. Consider the state equation

$$ \displaystyle{ X_{t} = aX_{t-1}, } $$
(9.8.15)

with observation density

$$ \displaystyle{ p(y_{t}\vert x_{t}) = \frac{(\pi x_{t})^{ y_{t}}e^{-\pi x_{t}}} {y_{t}!},\quad y_{t} = 0,1,\ldots, } $$
(9.8.16)

where π is a constant between 0 and 1. The relationship in (9.8.15) implies that the transition density [in the discrete sense—see the comment after (9.8.5)] for the state variables is

$$ \displaystyle{ p(x_{t+1}\vert x_{t}) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &\mbox{ if }x_{t+1} = ax_{t}, \\ 0,\quad &\mbox{ otherwise}. \end{array} \right. } $$

We shall assume that X 1 has the gamma density function

$$ \displaystyle{ p_{1}(x_{1}) = g(x_{1};\alpha,\lambda ) = \frac{\lambda ^{\alpha }x_{1}^{ \alpha -1}e^{-\lambda x_{1}}} {\Gamma (\alpha )},\quad x_{1} > 0. } $$

(This is a simplified model for the evolution of the number X t of individuals at time t infected with a rare disease, in which X t is treated as a continuous rather than an integer-valued random variable. The observation Y t represents the number of infected individuals observed in a random sample consisting of a small fraction π of the population at time t.) Because the transition distribution of {X t } is not continuous, we use the integrated version of (9.8.5) to compute the prediction density. Thus,

$$ \displaystyle\begin{array}{rcl} P\left (X_{t} \leq x\vert \mathbf{y}^{(t-1)}\right )& =& \int _{ 0}^{\infty }P(X_{ t} \leq x\vert x_{t-1})p\left (x_{t-1}\vert \mathbf{y}^{(t-1)}\right )\,dx_{ t-1} {}\\ & =& \int _{0}^{x/a}p\left (x_{ t-1}\vert \mathbf{y}^{(t-1)}\right )\,dx_{ t-1}. {}\\ \end{array} $$

Differentiation with respect to x gives

$$ \displaystyle{ p\left (x_{t}\vert \mathbf{y}^{(t-1)}\right ) = a^{-1}p_{ X_{t-1}\vert \mathbf{Y}^{(t-1)}}\left (a^{-1}x_{ t}\vert \mathbf{y}^{(t-1)}\right ). } $$
(9.8.17)

Now applying (9.8.4), we find that

$$ \displaystyle\begin{array}{rcl} p(x_{1}\vert y_{1})& =& p(\,y_{1}\vert x_{1})p_{1}(x_{1})/p(\,y_{1}) {}\\ & =& \left (\frac{(\pi x_{1})^{y_{1}}e^{-\pi x_{1}}} {y_{1}!} \right )\left (\frac{\lambda ^{\alpha }x_{1}^{\alpha -1}e^{-\lambda x_{1}}} {\Gamma (\alpha )} \right )\left ( \frac{1} {p(\,y_{1})}\right ) {}\\ & =& c(\,y_{1})x_{1}^{\alpha +y_{1}-1}e^{-(\pi +\lambda )x_{1} },\quad x_{1} > 0, {}\\ \end{array} $$

where c(y 1) is an integration factor ensuring that p(⋅ | y 1) integrates to 1. Since p(⋅ | y 1) has the form of a gamma density, we deduce (see Example (d) of Section A.1) that

$$ \displaystyle{ p(x_{1}\vert y_{1}) = g(x_{1};\alpha _{1},\lambda _{1}), } $$
(9.8.18)

where α 1 = α + y 1 and λ 1 = λ +π. The prediction density, calculated from (9.8.5) and (9.8.18), is

$$ \displaystyle\begin{array}{rcl} p\left (x_{2}\vert \mathbf{y}^{(1)}\right )& =& a^{-1}p_{ X_{1}\vert \mathbf{Y}^{(1)}}\left (a^{-1}x_{ 2}\vert \mathbf{y}^{(1)}\right )\phantom{well} {}\\ & =& a^{-1}g\left (a^{-1}x_{ 2};\alpha _{1},\lambda _{1}\right ) {}\\ & =& g(x_{2};\alpha _{1},\lambda _{1}/a). {}\\ \end{array} $$

Iterating the recursions (9.8.4) and (9.8.5) and using (9.8.17), we find that for t ≥ 1,

$$ \displaystyle{ \quad p\left (x_{t}\vert \mathbf{y}^{(t)}\right ) = g(x_{ t};\alpha _{t},\lambda _{t}) } $$
(9.8.19)

and

$$ \displaystyle\begin{array}{rcl} p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right )& =& a^{-1}g\left (a^{-1}x_{ t+1};\alpha _{t},\lambda _{t}\right )\phantom{well} \\ & =& g(x_{t+1};\alpha _{t},\lambda _{t}/a), {}\end{array} $$
(9.8.20)

where α t  = α t−1 + y t  = α + y 1 + ⋯ + y t and \( \lambda _{t} =\lambda _{t-1}/a+\pi =\lambda a^{1-t} +\pi \left (1 - a^{-t}\right )/(1 - a^{-1}). \) In particular, the minimum mean squared error estimate of x t based on y (t) is the conditional expectation α t ∕λ t with conditional variance α t ∕λ t 2. From (9.8.7) the probability density of Y t+1 given Y (t) is

$$ \displaystyle\begin{array}{rcl} p(\,y_{t+1}\vert \mathbf{y}^{(t)})& =& \int _{ 0}^{\infty }\left (\dfrac{(\pi x_{t+1})^{y_{t+1}}e^{-\pi x_{t+1}}} {y_{t+1}!} \right )g(x_{t+1};\alpha _{t},\lambda _{t}/a)\,dx_{t+1} {}\\ & =& \dfrac{\Gamma (\alpha _{t} + y_{t+1})} {\Gamma (\alpha _{t})\Gamma (y_{t+1} + 1)}\left (1 - \dfrac{\pi } {\lambda _{t+1}}\right )^{\alpha _{t}}\left ( \dfrac{\pi } {\lambda _{t+1}}\right )^{y_{t+1} } {}\\ & =& nb(y_{t+1};\alpha _{t},1 -\pi /\lambda _{t+1}),\quad y_{t+1} = 0,1,\ldots, {}\\ \end{array} $$

where nb(y; α, p) is the negative binomial density defined in example (i) of Section A.1. Conditional on Y (t), the best one-step predictor of Y t+1 is therefore the mean, α t π∕(λ t+1 − π), of this negative binomial distribution. The conditional mean squared error of the predictor is Var\( \left (Y _{t+1}\vert \mathbf{Y}^{(t)}\right ) =\alpha _{t}\pi \lambda _{t+1}/(\lambda _{t+1}-\pi )^{2} \) (see Problem 9.25).
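The gamma–Poisson recursions of this example are simple enough to implement in a few lines. The Python sketch below (our own illustration, with arbitrary inputs) carries the parameters (α t , λ t ) through the filtering step (9.8.19) and the prediction step (9.8.20), and returns the one-step forecast mean and variance of Y t+1 given y (t) using the negative binomial formulas above.

```python
import numpy as np

def gamma_poisson_filter(y, a, pi, alpha1, lam1):
    """Filtering and one-step prediction for the model of Example 9.8.2.
    alpha1, lam1 are the parameters of the gamma prior p_1 of X_1."""
    alpha_prior, lam_prior = alpha1, lam1       # parameters of p(x_t | y^(t-1))
    posteriors, forecasts = [], []
    for yt in y:
        alpha_t = alpha_prior + yt              # (9.8.4): gamma prior x Poisson(pi*x) likelihood
        lam_t = lam_prior + pi                  # posterior p(x_t|y^(t)) = g(.; alpha_t, lambda_t)
        posteriors.append((alpha_t, lam_t))
        lam_next = lam_t / a                    # (9.8.20): p(x_{t+1}|y^(t)) = g(.; alpha_t, lambda_t/a)
        mean = alpha_t * pi / lam_next          # = alpha_t * pi / (lambda_{t+1} - pi)
        var = alpha_t * pi * (lam_next + pi) / lam_next ** 2
        forecasts.append((mean, var))
        alpha_prior, lam_prior = alpha_t, lam_next
    return posteriors, forecasts

posteriors, forecasts = gamma_poisson_filter([2, 0, 3, 1], a=1.1, pi=0.1, alpha1=1.0, lam1=1.0)
print(forecasts[-1])                            # forecast mean and variance of Y_5 given y^(4)
```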

Example 9.8.3.

A Model for Time Series of Counts

We often encounter time series in which the observations represent count data. One such example is the monthly number of newly recorded cases of poliomyelitis in the U.S. for the years 1970–1983 plotted in Figure 9.6. Unless the actual counts are large and can be approximated by continuous variables, Gaussian and linear time series models are generally inappropriate for analyzing such data. The parameter-driven specification provides a flexible class of models for modeling count data. We now discuss a specific model based on a Poisson observation density. This model is similar to the one presented by Zeger (1988) for analyzing the polio data. The observation density is assumed to be Poisson with mean exp{x t }, i.e.,

$$ \displaystyle{ p(\,y_{t}\vert x_{t}) = \frac{e^{ x_{t}y_{t}}e^{-e^{ x_{t}} }} {y_{t}!},\quad y_{t} = 0,1,\ldots, } $$
(9.8.21)

while the state variables are assumed to follow a regression model with Gaussian AR(1) noise. If u t  = (u t1, , u tk )′ are the regression variables, then

$$ \displaystyle{ X_{t} =\beta '\mathbf{u}_{t} + W_{t}, } $$
(9.8.22)

where \( \beta \) is a k-dimensional regression parameter and

$$ \displaystyle{ W_{t} =\phi W_{t-1} + Z_{t},\quad \{Z_{t}\} \sim \mathrm{IID\ N}\left (0,\sigma ^{2}\right ). } $$

The transition density function for the state variables is then

$$ \displaystyle{ p(x_{t+1}\vert x_{t}) = n\left (x_{t+1};\,\beta '\mathbf{u}_{t+1} +\phi \left (x_{t} -\beta '\mathbf{u}_{t}\right ),\,\sigma ^{2}\right ). } $$
(9.8.23)
Fig. 9.6

Monthly number of U.S. cases of polio, January 1970–December 1983

The case σ 2 = 0 corresponds to a log-linear model with Poisson noise.

Estimation of the parameters \( \theta = \left (\beta ',\phi,\sigma ^{2}\right )' \) in the model by direct numerical maximization of the likelihood function is difficult, since the likelihood cannot be written down in closed form. (From (9.8.3) the likelihood is the n-fold integral,

$$ \displaystyle{ \int _{-\infty }^{\infty }\cdots \int _{ -\infty }^{\infty }\exp \left \{\sum _{ t=1}^{n}{\bigl (x_{ t}y_{t} - e^{ x_{t} }\bigr )}\right \}L\left (\theta;\mathbf{x}^{(n)}\right )\,(dx_{ 1}\cdots dx_{n})\Big/\prod _{i=1}^{n}(y_{ i}!), } $$

where \( L(\theta;\mathbf{x}) \) is the likelihood based on X 1, , X n .) To overcome this difficulty, Chan and Ledolter (1995) proposed an algorithm, called Monte Carlo EM (MCEM), whose iterates θ (i) converge to the maximum likelihood estimate. To apply this algorithm, first note that the conditional distribution of Y (n) given X (n) does not depend on \( \theta \), so that the likelihood based on the complete data \( \left (\mathbf{X}^{(n)}{}',\mathbf{Y}^{(n)}{}'\right )' \) is given by

$$ \displaystyle{ L\left (\theta;\mathbf{X}^{(n)},\mathbf{Y}^{(n)}\right ) = f\left (\mathbf{Y}^{(n)}\vert \mathbf{X}^{(n)}\right )L\left (\theta;\mathbf{X}^{(n)}\right ). } $$

The E-step of the algorithm (see Section 9.7) requires calculation of

$$ \displaystyle\begin{array}{rcl} Q(\theta \vert \theta ^{(i)})& =& E_{\theta ^{ (i)}}\left (\ln L(\theta;\mathbf{X}^{(n)},\mathbf{Y}^{(n)})\vert \mathbf{Y}^{(n)}\right ) {}\\ & =& E_{\theta ^{(i)}}\left (\ln \ f(\mathbf{Y}^{(n)}\vert \mathbf{X}^{(n)})\vert \mathbf{Y}^{(n)}\right ) + E_{\theta ^{ (i)}}\left (\ln L(\theta;\mathbf{X}^{(n)})\vert \mathbf{Y}^{(n)}\right ). {}\\ \end{array} $$

We delete the first term from the definition of Q, since it is independent of \( \theta \) and hence plays no role in the M-step of the EM algorithm. The new Q is redefined as

$$ \displaystyle{ Q(\theta \vert \theta ^{(i)}) = E_{\theta ^{ (i)}}\left (\ln L(\theta;\mathbf{X}^{(n)})\vert \mathbf{Y}^{(n)}\right ). } $$
(9.8.24)

Even with this simplification, direct calculation of Q is still intractable. Suppose for the moment that it is possible to generate replicates of X (n) from the conditional distribution of X (n) given Y (n) when \( \theta =\theta ^{(i)} \). If we denote m independent replicates of X (n) by X 1 (n), , X m (n), then a Monte Carlo approximation to Q in (9.8.24) is given by

$$ \displaystyle{ Q_{m}\left (\theta \vert \theta ^{(i)}\right ) = \frac{1} {m}\sum _{j=1}^{m}\ln L\left (\theta;\mathbf{X}_{ j}^{(n)}\right ). } $$

The M-step is easy to carry out using Q m in place of Q (especially if we condition on X 1 = 0 in all the simulated replicates), since L is just the Gaussian likelihood of the regression model with AR(1) noise treated in Section 6.6. The difficult steps in the algorithm are the generation of replicates of X (n) given Y (n) and the choice of m. Chan and Ledolter (1995) discuss the use of the Gibbs sampler for generating the desired replicates and give some guidelines on the choice of m.

In their analyses of the polio data, Zeger (1988) and Chan and Ledolter (1995) included as regression components an intercept, a slope, and harmonics at periods of 6 and 12 months. Specifically, they took

$$ \displaystyle{ \mathbf{u}_{t} = (1,t/1000,\cos (2\pi t/12),\sin (2\pi t/12),\cos (2\pi t/6),\sin (2\pi t/6))'. } $$

The implementation of Chan and Ledolter’s MCEM method by Kuk and Cheng (1994) gave estimates \( \hat{\beta }=\, \)(0.247, −3.871, 0.162, −0.482, 0.414, −0.011)′, \( \hat{\phi }= 0.648 \), and \( \hat{\sigma }^{2} = 0.281 \). The estimated trend function \( \hat{\beta }'\mathbf{u}_{t} \) is displayed in Figure 9.7. The negative coefficient of t∕1000 indicates a slight downward trend in the monthly number of polio cases.

Fig. 9.7

Trend estimate for the monthly number of U.S. cases of polio, January 1970–December 1983
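The estimated trend can be reconstructed directly from these numbers. The short Python fragment below (our own illustration) builds the regression vectors u t over the 168 months of the sample and evaluates \( \hat{\beta }'\mathbf{u}_{t} \), the quantity plotted in Figure 9.7.

```python
import numpy as np

t = np.arange(1, 169)                     # January 1970 - December 1983
u = np.column_stack([np.ones_like(t, dtype=float), t / 1000.0,
                     np.cos(2 * np.pi * t / 12), np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 6),  np.sin(2 * np.pi * t / 6)])
beta_hat = np.array([0.247, -3.871, 0.162, -0.482, 0.414, -0.011])   # Kuk and Cheng (1994)
trend = u @ beta_hat                      # estimated trend beta'u_t
print(trend[:3], trend[-3:])
```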

9.8.2 Observation-Driven Models

Again we assume that Y t , conditional on \( \big(X_{t},\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\big) \), is independent of \( \big(\mathbf{X}^{(t-1)},\mathbf{Y}^{(t-1)}\big) \). These models are specified by the conditional densities

$$ \displaystyle{ p(\,y_{t}\vert x_{t}) = p\big(y_{t}\vert \mathbf{x}^{(t)},\mathbf{y}^{(t-1)}\big),\quad t = 1,2,\ldots, } $$
(9.8.25)
$$ \displaystyle{ p\big(x_{t+1}\vert \mathbf{y}^{(t)}\big) = p_{ X_{t+1}\vert \mathbf{Y}^{(t)}}\big(x_{t+1}\vert \mathbf{y}^{(t)}\big),\quad t = 0,1,\ldots, } $$
(9.8.26)

where \( p\big(x_{1}\vert \mathbf{y}^{(0)}\big):= p_{1}(x_{1}) \) for some prespecified initial density p 1(x 1). The advantage of the observation-driven state equation (9.8.26) is that the posterior distribution of X t given Y (t) can be computed directly from (9.8.4) without the use of the updating formula (9.8.5). This then allows for easy computation of the forecast function in (9.8.7) and hence of the joint density function of (Y 1, , Y n )′,

$$ \displaystyle{ p(\,y_{1},\ldots,y_{n}) =\prod _{ t=1}^{n}p\left (y_{ t}\vert \mathbf{y}^{(t-1)}\right ). } $$
(9.8.27)

On the other hand, the mechanism by which the state X t−1 makes the transition to X t is not explicitly defined. In fact, without further assumptions there may be state sequences {X t } and {X t *} with different distributions for which both (9.8.25) and (9.8.26) hold (see Example 9.8.5). Both sequences, however, lead to the same joint distribution, given by (9.8.27), for Y 1, …, Y n . The ambiguity in the specification of the distribution of the state variables can be removed by assuming that X t+1 given \( \left (\mathbf{X}^{(t)},\mathbf{Y}^{(t)}\right ) \) is independent of X (t), with conditional distribution (9.8.26), i.e.,

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{x}^{(t)},\mathbf{y}^{(t)}\right ) = p_{X_{t+1}\vert \mathbf{Y}^{(t)}} \left (x_{t+1}\vert \mathbf{y}^{(t)}\right ). } $$
(9.8.28)

With this modification, the joint density of Y (n) and X (n) is given by (cf. (9.8.3))

$$ \displaystyle\begin{array}{rcl} p\left (\mathbf{y}^{(n)},\mathbf{x}^{(n)}\right )& =& p(\,y_{ n}\vert x_{n})p\left (x_{n}\vert \mathbf{y}^{(n-1)}\right )p\left (\mathbf{y}^{(n-1)},\mathbf{x}^{(n-1)}\right ) {}\\ & =& \cdots {}\\ & =& \prod _{t=1}^{n}\left (p(\,y_{ t}\vert x_{t})p\left (x_{t}\vert \mathbf{y}^{(t-1)}\right )\right ). {}\\ \end{array} $$

Example 9.8.4.

An AR(1) Process

An AR(1) process with iid noise can be expressed as an observation driven model. Suppose {Y t } is the AR(1) process

$$ \displaystyle{ Y _{t} =\phi Y _{t-1} + Z_{t}, } $$

where {Z t } is an iid sequence of random variables with mean 0 and some probability density function f(x). Then with X t : = Y t−1 we have

$$ \displaystyle{ p(\,y_{t}\vert x_{t}) = f(\,y_{t} -\phi x_{t}) } $$

and

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &\mbox{ if }x_{t+1} = y_{t}, \\ 0,\quad &\mbox{ otherwise}. \end{array} \right. } $$

Example 9.8.5.

Suppose the observation-equation density is given by

$$ \displaystyle{ p(\,y_{t}\vert x_{t}) = \frac{x_{t}^{ y_{t}}e^{-x_{t}}} {y_{t}!},\quad y_{t} = 0,1,\ldots, } $$
(9.8.29)

and the state equation (9.8.26) is

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) = g(x_{t+1};\alpha _{t},\lambda _{t}), } $$
(9.8.30)

where α t  = α + y 1 + ⋯ + y t and λ t  = λ + t. It is possible to give a parameter-driven specification that gives rise to the same state equation (9.8.30). Let {X t *} be the parameter-driven state variables, where X t * = X t−1 * and X 1 * has a gamma distribution with parameters α and λ. (This corresponds to the model in Example 9.8.2 with π = a = 1.) Then from (9.8.19) we see that \( p\left (x_{t}^{{\ast}}\vert \mathbf{y}^{(t)}\right ) = g(x_{t}^{{\ast}};\alpha _{t},\lambda _{t}) \), which coincides with the state equation (9.8.30). If {X t } are the state variables whose joint distribution is specified through (9.8.28), then {X t } and {X t *} cannot have the same joint distributions. To see this, note that

$$ \displaystyle{ p\left (x_{t+1}^{{\ast}}\vert x_{ t}^{{\ast}}\right ) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &\mbox{ if }x_{t+1}^{{\ast}} = x_{ t}^{{\ast}}, \\ 0,\quad &\mbox{ otherwise}, \end{array} \right. } $$

while

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{x}^{(t)},\mathbf{y}^{(t)}\right ) = p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) = g(x_{t+1};\alpha _{t},\lambda _{t}). } $$

If the two sequences had the same joint distribution, then the latter density could take only the values 0 and 1, which contradicts the continuity (as a function of x t+1 ) of this density.

9.8.3 Exponential Family Models

The exponential family of distributions provides a large and flexible class of distributions for use in the observation equation. The density in the observation equation is said to belong to an exponential family (in natural parameterization) if

$$ \displaystyle{ p(\,y_{t}\vert x_{t}) =\exp \{ y_{t}x_{t} - b(x_{t}) + c(y_{t})\}, } $$
(9.8.31)

where b(⋅ ) is a twice continuously differentiable function and c(y t ) does not depend on x t . This family includes the normal, exponential, gamma, Poisson, binomial, and many other distributions frequently encountered in statistics. Detailed properties of the exponential family can be found in Barndorff-Nielsen (1978), and an excellent treatment of its use in the analysis of linear models is given by McCullagh and Nelder (1989). We shall need only the following important facts:

$$ \displaystyle{ e^{b(x_{t})} =\int \exp \{ y_{ t}x_{t} + c(y_{t})\}\,\nu (dy_{t}), } $$
(9.8.32)
$$ \displaystyle{ b'(x_{t}) = E(Y _{t}\vert x_{t}), } $$
(9.8.33)
$$ \displaystyle{ b''(x_{t}) = \mathrm{Var}(Y _{t}\vert x_{t}):=\,\int y_{t}^{2}p(y_{ t}\vert x_{t})\,\nu (dy_{t}) -\left [b'(x_{t})\right ]^{2}, } $$
(9.8.34)

where integration with respect to ν(dy t ) means integration with respect to dy t in the continuous case and summation over all values of y t in the discrete case.

Proof.

The first relation is simply the statement that p(y t  | x t ) integrates to 1. The second relation is established by differentiating both sides of (9.8.32) with respect to x t  and then multiplying through by \( e^{-b(x_{t})} \) (for justification of the differentiation under the integral sign see Barndorff-Nielsen 1978). The last relation is obtained by differentiating (9.8.32) twice with respect to x t and simplifying.

Example 9.8.6.

The Poisson Case

If the observation Y t , given X t  = x t , has a Poisson distribution of the form (9.8.21), then

$$ \displaystyle{ p(y_{t}\vert x_{t}) =\exp {\bigl \{ y_{t}x_{t} - e^{x_{t} } -\ln y_{t}!\bigr \}},\quad y_{t} = 0,1,\ldots, } $$
(9.8.35)

which has the form (9.8.31) with \( b(x_{t}) = e^{x_{t}} \) and c(y t ) = −lny t ! . From (9.8.33) we easily find that \( E(Y _{t}\vert x_{t}) = b'(x_{t}) = e^{x_{t}} \). This parameterization is slightly different from the one used in Examples 9.8.2 and 9.8.5, where the conditional mean of Y t  given x t was π x t and not \( e^{\,x_{t}} \). For this observation equation, define the family of densities

$$ \displaystyle{ f(x;\alpha,\lambda ) =\exp \{\alpha x -\lambda b(x) + A(\alpha,\lambda )\},\quad -\infty < x < \infty, } $$
(9.8.36)

where α > 0 and λ > 0 are parameters and A(α, λ) = −lnΓ(α) +αlnλ. Now consider state densities of the form

$$ \displaystyle{ p(x_{t+1}\vert \mathbf{y}^{(t)}) = f(x_{ t+1};\alpha _{t+1\vert t},\lambda _{t+1\vert t}), } $$
(9.8.37)

where α t+1 | t and λ t+1 | t are, for the moment, unspecified functions of y (t). (The subscript t + 1 | t on the parameters is a shorthand way to indicate dependence on the conditional distribution of X t+1 given Y (t).) With this specification of the state densities, the parameters α t+1 | t are related to the best one-step predictor of Y t through the formula

$$ \displaystyle{ \alpha _{t+1\vert t}/\lambda _{t+1\vert t} =\hat{ Y }_{t+1}:= E\left (Y _{t+1}\vert \mathbf{y}^{(t)}\right ). } $$
(9.8.38)

Proof.

We have from (9.8.7) and (9.8.33) that

$$ \displaystyle\begin{array}{rcl} E(Y _{t+1}\vert \mathbf{y}^{(t)})& =& \sum _{ y_{t+1}=0}^{\infty }\int _{ -\infty }^{\infty }y_{ t+1}p(y_{t+1}\vert x_{t+1})p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right )\,dx_{ t+1} {}\\ & =& \int _{-\infty }^{\infty }b'(x_{ t+1})p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right )\,dx_{ t+1}. {}\\ \end{array} $$

Addition and subtraction of α t+1 | t λ t+1 | t then gives

$$ \displaystyle\begin{array}{rcl} E(Y _{t+1}\vert \mathbf{y}^{(t)})& =& \int _{ -\infty }^{\infty }\left (b'(x_{ t+1}) -\frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}}\right )p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right )\,dx_{ t+1} + \frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}} {}\\ & =& \int _{-\infty }^{\infty }-\lambda _{ t+1\vert t}^{-1}\,p'\left (x_{ t+1}\vert \mathbf{y}^{(t)}\right )\,dx_{ t+1} + \frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}} {}\\ & =& \left [-\lambda _{t+1\vert t}^{-1}\,p\left (x_{ t+1}\vert \mathbf{y}^{(t)}\right )\right ]_{ x_{t+1}=-\infty }^{x_{t+1}=\infty } + \frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}} {}\\ & =& \frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}}. {}\\ \end{array} $$

Letting A t | t−1 = A(α t | t−1, λ t | t−1), we can write the posterior density of X t given Y (t) as

$$ \displaystyle\begin{array}{rcl} p\left (x_{t}\vert \mathbf{y}^{(t)}\right )& =& \exp \{y_{t}x_{t} - b(x_{t}) + c(y_{t})\}\exp \{\alpha _{t\vert t-1}x_{t} -\lambda _{t\vert t-1}b(x_{t}) {}\\ & & \quad + A_{t\vert t-1}\}/p\left (y_{t}\vert \mathbf{y}^{(t-1)}\right ) {}\\ & =& \exp \{\alpha _{t}x_{t} -\lambda _{t}b(x_{t}) + A(\alpha _{t},\lambda _{t})\} {}\\ & =& f(x_{t};\alpha _{t},\lambda _{t}), {}\\ \end{array} $$

where we find, by equating coefficients of x t and b(x t ), that the coefficients λ t and α t are determined by

$$ \displaystyle{ \lambda _{t} = 1 +\lambda _{t\vert t-1}, } $$
(9.8.39)
$$ \displaystyle{ \alpha _{t} = y_{t} +\alpha _{t\vert t-1}. } $$
(9.8.40)

The family of prior densities in (9.8.37) is called a conjugate family of priors for the observation equation (9.8.35), since the resulting posterior densities are again members of the same family.

As mentioned earlier, the parameters α t | t−1 and λ t | t−1 can be quite arbitrary: Any nonnegative functions of y (t−1) will lead to a consistent specification of the state densities. One convenient choice is to link these parameters with the corresponding parameters of the posterior distribution at time t − 1 through the relations

$$ \displaystyle{ \lambda _{t+1\vert t} =\delta \lambda _{t}\left (=\delta (1 +\lambda _{t\vert t-1})\right ), } $$
(9.8.41)
$$ \displaystyle{ \alpha _{t+1\vert t} =\delta \alpha _{t}\left (=\delta (y_{t} +\alpha _{t\vert t-1})\right ), } $$
(9.8.42)

where 0 < δ < 1 (see Remark 4 below). Iterating the relation (9.8.41), we see that

$$ \displaystyle\begin{array}{rcl} \lambda _{t+1\vert t} =\delta (1 +\lambda _{t\vert t-1})& =& \delta +\delta \lambda _{t\vert t-1} \\ & =& \delta +\delta (\delta +\delta \lambda _{t-1\vert t-2}) \\ & =& \cdots \\ & =& \delta +\delta ^{2} + \cdots +\delta ^{t} +\delta ^{t}\lambda _{ 1\vert 0} \\ & \rightarrow & \delta /(1-\delta ) {}\end{array} $$
(9.8.43)

as t → ∞. Similarly,

$$ \displaystyle\begin{array}{rcl} \alpha _{t+1\vert t}& =& \delta y_{t} +\delta \alpha _{t\vert t-1} \\ & =& \cdots \\ & =& \delta y_{t} +\delta ^{2}y_{ t-1} + \cdots +\delta ^{t}y_{ 1} +\delta ^{t}\alpha _{ 1\vert 0}.{}\end{array} $$
(9.8.44)

For large t, we have the approximations

$$ \displaystyle{ \lambda _{t+1\vert t} \approx \delta /(1-\delta ) } $$
(9.8.45)

and

$$ \displaystyle{ \alpha _{t+1\vert t} \approx \delta \sum _{ j=0}^{t-1}\delta ^{ j}y_{ t-j}, } $$
(9.8.46)

which are exact if λ 1 | 0 = δ∕(1 −δ) and α 1 | 0 = 0. From (9.8.38) the one-step predictors are linear and given by

$$ \displaystyle{ \hat{Y }_{t+1} = \frac{\alpha _{t+1\vert t}} {\lambda _{t+1\vert t}} = \frac{\sum _{j=0}^{t-1}\delta ^{ j}y_{t-j} +\delta ^{t-1}\alpha _{1\vert 0}} {\sum _{j=0}^{t-1}\delta ^{ j} +\delta ^{t-1}\lambda _{1\vert 0}}. } $$
(9.8.47)

Replacing the denominator with its limiting value, or starting with λ 1 | 0 = δ∕(1 −δ), we find that \( \hat{Y }_{t+1} \) is the solution of the recursions

$$ \displaystyle{ \hat{Y }_{t+1} = (1-\delta )y_{t} +\delta \hat{ Y }_{t},\quad t = 1,2,\ldots, } $$
(9.8.48)

with initial condition \( \hat{Y }_{1} = (1-\delta )\delta ^{-1}\alpha _{1\vert 0} \). In other words, under the restrictions of (9.8.41) and (9.8.42), the best one-step predictors can be found by exponential smoothing.
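
A minimal sketch of these recursions, under the assumptions of this example (Poisson observations, conjugate log-gamma state densities, and the power steady conditions (9.8.41)–(9.8.42)), is given below; the observations and the value of δ are hypothetical. With λ 1 | 0 = δ∕(1 −δ) the two functions return identical predictors, illustrating the equivalence with exponential smoothing noted above.

```python
def conjugate_poisson_predictors(y, delta, alpha10=0.0, lam10=None):
    """One-step predictors for the Poisson/log-gamma model of Example 9.8.6.

    Posterior update (9.8.39)-(9.8.40):  lam_t = 1 + lam_{t|t-1},
                                         alpha_t = y_t + alpha_{t|t-1};
    power-steady propagation (9.8.41)-(9.8.42): multiply both by delta;
    predictor (9.8.38): Yhat_{t+1} = alpha_{t+1|t} / lam_{t+1|t}.
    """
    if lam10 is None:
        lam10 = delta / (1.0 - delta)       # makes (9.8.45) exact
    alpha, lam = alpha10, lam10
    preds = [alpha / lam]                   # Yhat_1 = alpha_{1|0}/lam_{1|0}
    for yt in y:
        alpha = delta * (yt + alpha)        # alpha_{t+1|t}, (9.8.42)
        lam = delta * (1.0 + lam)           # lam_{t+1|t},  (9.8.41)
        preds.append(alpha / lam)
    return preds

def exponential_smoothing(y, delta, yhat1=0.0):
    """Recursion (9.8.48): Yhat_{t+1} = (1 - delta)*y_t + delta*Yhat_t."""
    preds = [yhat1]
    for yt in y:
        preds.append((1.0 - delta) * yt + delta * preds[-1])
    return preds

if __name__ == "__main__":
    y = [0, 2, 1, 0, 3, 1, 1, 2]            # hypothetical counts
    delta = 0.8
    print(conjugate_poisson_predictors(y, delta))
    print(exponential_smoothing(y, delta))  # identical output
```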

Remark 1.

The preceding analysis for the Poisson-distributed observation equation holds, almost verbatim, for the general family of exponential densities (9.8.31). (One only needs to take care in specifying the correct range for x and the allowable parameter space for α and λ in (9.8.37).) The relations (9.8.43)–(9.8.44), as well as the exponential smoothing formula (9.8.48), continue to hold even in the more general setting, provided that the parameters α t | t−1 and λ t | t−1 satisfy the relations (9.8.41)–(9.8.42). □ 

Remark 2.

Equations (9.8.41)–(9.8.42) are equivalent to the assumption that the prior density of X t given y (t−1) is proportional to the δ-power of the posterior distribution of X t−1 given Y (t−1), or more succinctly that

$$ \displaystyle\begin{array}{rcl} f(x_{t};\alpha _{t\vert t-1},\lambda _{t\vert t-1})& =& f(x_{t};\delta \alpha _{t-1\vert t-1},\delta \lambda _{t-1\vert t-1}) {}\\ & \propto & f^{\delta }(x_{t};\alpha _{t-1\vert t-1},\lambda _{t-1\vert t-1}). {}\\ \end{array} $$

This power relationship is sometimes referred to as the power steady model (Grunwald et al. 1993; Smith 1979). □ 
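
The proportionality is immediate from (9.8.36), since the normalizing term A(α, λ) does not involve x. A quick numerical check (an illustration with arbitrary parameter values and the Poisson choice b(x) = e^{x}) is sketched below: the ratio of the unnormalized densities is constant in x.

```python
import math

def f_unnorm(x, alpha, lam):
    """Unnormalized f(x; alpha, lambda) of (9.8.36) with b(x) = e^x."""
    return math.exp(alpha * x - lam * math.exp(x))

if __name__ == "__main__":
    alpha, lam, delta = 3.0, 2.0, 0.8       # arbitrary values
    ratios = [f_unnorm(x, delta * alpha, delta * lam) / f_unnorm(x, alpha, lam) ** delta
              for x in (-1.0, 0.0, 0.5, 1.0)]
    print(ratios)   # constant in x (equal to 1 up to rounding), so
                    # f(x; delta*alpha, delta*lam) is proportional to f(x; alpha, lam)**delta
```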

Remark 3.

The transformed state variables \( W_{t} = e^{X_{t}} \) have a gamma state density given by

$$ \displaystyle{ p\left (w_{t+1}\vert \mathbf{y}^{(t)}\right ) = g(w_{ t+1};\alpha _{t+1\vert t},\lambda _{t+1\vert t}) } $$

(see Problem 9.26). The mean and variance of this conditional density are

$$ \displaystyle{ E\left (W_{t+1}\vert \mathbf{y}^{(t)}\right ) =\alpha _{ t+1\vert t}/\lambda _{t+1\vert t}\quad \mathrm{and}\quad \mathrm{Var}\left (W_{t+1}\vert \mathbf{y}^{(t)}\right ) =\alpha _{ t+1\vert t}/\lambda _{t+1\vert t}^{2}.\mbox{ $\square $} } $$

Remark 4.

If we regard the random walk plus noise model of Example 9.2.1 as the prototypical state-space model, then from the calculations in Example 9.8.1 with G = F = 1, we have

$$ \displaystyle{ E\left (X_{t+1}\vert \mathbf{Y}^{(t)}\right ) = E\left (X_{ t}\vert \mathbf{Y}^{(t)}\right ) } $$

and

$$ \displaystyle{ \mathrm{Var}\left (X_{t+1}\vert \mathbf{Y}^{(t)}\right ) = \mathrm{Var}\left (X_{ t}\vert \mathbf{Y}^{(t)}\right ) + Q > \mathrm{Var}\left (X_{ t}\vert \mathbf{Y}^{(t)}\right ). } $$

The first of these equations implies that the best estimate of the next state is the same as the best estimate of the current state, while the second implies that the variance increases. Under conditions (9.8.41) and (9.8.42), the same is true of the state variables in the model above (see Problem 9.26). This was, in part, the rationale given by Harvey and Fernandes (1989) for imposing these conditions. □ 

Remark 5.

While the calculations work out neatly for the power steady model, Grunwald et al. (1994) have shown that such processes have degenerate sample paths for large t. In the Poisson example above, they argue that the observations Y t converge to 0 as t → ∞ (see Figure 9.12). Although such models may still be useful in practice for modeling series of moderate length, their efficacy for describing long-term behavior is doubtful. □ 

Example 9.8.7.

Goals Scored by England Against Scotland

The time series of the number of goals scored by England against Scotland in soccer matches played at Hampden Park in Glasgow is graphed in Figure 9.8. The matches have been played nearly every second year, with interruptions during the war years. We will treat the data y 1, …, y 52 as coming from an equally spaced time series model {Y t }. Since the number of goals scored is small (see the frequency histogram in Figure 9.9), a model based on the Poisson distribution might be deemed appropriate. The observed relative frequencies and those based on a Poisson distribution with mean equal to \( \bar{y}_{52} = 1.269 \) are contained in Table 9.2. The standard chi-squared goodness of fit test, comparing the observed frequencies with expected frequencies based on a Poisson model, has a p-value of 0.02. The lack of fit with a Poisson distribution is hardly unexpected, since the sample variance (1.652) is much larger than the sample mean, while the mean and variance of the Poisson distribution are equal. In this case the data are said to be overdispersed in the sense that there is more variability in the data than one would expect from a sample of independent Poisson-distributed variables. Overdispersion can sometimes be explained by serial dependence in the data.

Fig. 9.8

Goals scored by England against Scotland at Hampden Park, Glasgow, 1872–1987

Fig. 9.9

Histogram of the data in Figure 9.8

Table 9.2 Relative frequency and fitted Poisson distribution of goals scored by England against Scotland

Dependence in count data can often be revealed by estimating the probabilities of transition from one state to another. Table 9.3 contains estimates of these probabilities, computed as the relative frequencies of one-step transitions from state y t to state y t+1. If the data were independent, then within each column the entries should be nearly the same. This is certainly not the case in Table 9.3. For example, England is very unlikely to be shut out, or to score three or more goals, in the next match after scoring at least three goals in the previous encounter.

Table 9.3 Transition probabilities for the number of goals scored by England against Scotland
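
Transition probabilities of the kind shown in Table 9.3 can be estimated from any count series by tabulating the relative frequencies of observed one-step transitions. The sketch below is a hypothetical illustration (the data and the pooling of all counts of three or more into a single state are assumptions for the example, not the goal series itself).

```python
from collections import Counter

def transition_matrix(counts, max_state=3):
    """Estimate P(Y_{t+1} = j | Y_t = i) by the relative frequencies of
    one-step transitions, pooling all counts >= max_state into one state."""
    s = [min(c, max_state) for c in counts]
    pairs = Counter(zip(s[:-1], s[1:]))
    rows = {}
    for i in range(max_state + 1):
        total = sum(pairs[(i, j)] for j in range(max_state + 1))
        rows[i] = [pairs[(i, j)] / total if total else 0.0
                   for j in range(max_state + 1)]
    return rows

if __name__ == "__main__":
    data = [0, 2, 1, 1, 3, 0, 1, 2, 2, 0, 1, 4, 1, 0, 2, 1]   # hypothetical counts
    for i, row in transition_matrix(data).items():
        print(i, ["%.2f" % p for p in row])
```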

Harvey and Fernandes (1989) model the dependence in this data using an observation-driven model of the type described in Example 9.8.6. Their model assumes a Poisson observation equation and a log-gamma state equation:

$$ \displaystyle\begin{array}{rcl} p(\,y_{t}\vert x_{t})& =& \frac{\exp \{\,y_{t}x_{t} - e^{x_{t}}\}} {y_{t}!},\quad y_{t} = 0,1,\ldots, {}\\ p\left (x_{t}\vert \mathbf{y}^{(t-1)}\right )& =& f(x_{ t};\alpha _{t\vert t-1},\lambda _{t\vert t-1}),\quad -\infty < x < \infty, {}\\ \end{array} $$

for t = 1, 2, …, where f is given by (9.8.36) and α 1 | 0 = 0, λ 1 | 0 = 0. The power steady conditions (9.8.41)–(9.8.42) are assumed to hold for α t | t−1 and λ t | t−1. The only unknown parameter in the model is δ. The log-likelihood function for δ based on the conditional distribution of y 1, …, y 52 given y 1 is given by [see (9.8.27)]

$$ \displaystyle{ \ell\left (\delta,\mathbf{y}^{(n)}\right ) =\sum _{ t=1}^{n-1}\ln p\left (y_{ t+1}\vert \mathbf{y}^{(t)}\right ), } $$
(9.8.49)

where \( p\left (\,y_{t+1}\vert \mathbf{y}^{(t)}\right ) \) is the negative binomial density [see Problem 9.25(c)]

$$ \displaystyle{ p\left (y_{t+1}\vert \mathbf{y}^{(t)}\right ) = nb\left (\,y_{ t+1};\alpha _{t+1\vert t},(1 +\lambda _{t+1\vert t})^{-1}\right ), } $$

with α t+1 | t and λ t+1 | t as defined in (9.8.44) and (9.8.43). (For the goal data, y 1 = 0, which implies α 2 | 1 = 0 and hence that \( p\left (\,y_{2}\vert y^{(1)}\right ) \) is a degenerate density with unit mass at y 2 = 0. Harvey and Fernandes avoid this complication by conditioning the likelihood on y (τ), where τ is the time of the first nonzero data value.)

Maximizing this likelihood with respect to δ, we obtain \( \hat{\delta }= 0.844 \). (Starting equations (9.8.43)–(9.8.44) with α 1 | 0 = 0 and λ 1 | 0 = δ∕(1 −δ), we obtain \( \hat{\delta }= 0.732 \).) With 0.844 as our estimate of δ, the prediction density of the next observation Y 53 given y (52) is nb(y 53; α 53 | 52, (1 +λ 53 | 52)−1). The first five values of this distribution are given in Table 9.4. Under this model, the probability that England will be held scoreless in the next match is 0.471. The one-step predictors \( \hat{Y }_{1} = 0,\hat{Y }_{2},\ldots,\hat{Y }_{52} \) are graphed in Figure 9.10. (This graph can be obtained by using the ITSM option Smooth>Exponential with α = 0.154.)

Table 9.4 Prediction density of Y 53 given Y (52) for the data in Figure 9.8
Fig. 9.10

One-step predictors of the goal data
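
The maximization of (9.8.49) over δ is a one-dimensional problem and is easy to carry out numerically. The sketch below illustrates the computation; it is not the authors' or ITSM's code, and the data shown are hypothetical rather than the goal series. The predictive density is taken to be the Poisson–gamma mixture pmf Γ(y+α)∕(y! Γ(α)) (λ∕(1+λ))^α (1+λ)^{−y}, which corresponds to nb(y; α, (1+λ)^{−1}) above; the recursions (9.8.43)–(9.8.44) are started from α 1 | 0 = λ 1 | 0 = 0, terms are accumulated only after the first nonzero observation (as described above), and a simple grid search stands in for a proper optimizer.

```python
import math

def nb_logpmf(y, alpha, p):
    """log of the Poisson-gamma mixture pmf
    Gamma(y+alpha)/(y! Gamma(alpha)) * (1-p)**alpha * p**y, with p = 1/(1+lambda)."""
    out = math.lgamma(y + alpha) - math.lgamma(y + 1) - math.lgamma(alpha)
    out += alpha * math.log(1.0 - p)
    if y > 0:
        out += y * math.log(p)
    return out

def conditional_loglik(y, delta):
    """Conditional log-likelihood (9.8.49) with alpha_{1|0} = lambda_{1|0} = 0 and
    the power steady recursions (9.8.43)-(9.8.44); terms are added only once
    alpha_{t+1|t} > 0, i.e., after the first nonzero observation."""
    alpha, lam, ll = 0.0, 0.0, 0.0
    for t in range(len(y) - 1):
        alpha = delta * (y[t] + alpha)          # alpha_{t+1|t}
        lam = delta * (1.0 + lam)               # lambda_{t+1|t}
        if alpha > 0.0:
            ll += nb_logpmf(y[t + 1], alpha, 1.0 / (1.0 + lam))
    return ll

if __name__ == "__main__":
    y = [0, 1, 2, 1, 0, 3, 1, 2, 0, 1, 1, 2]    # hypothetical counts, not the goal data
    grid = [d / 100.0 for d in range(1, 100)]   # candidate values of delta in (0, 1)
    dhat = max(grid, key=lambda d: conditional_loglik(y, d))
    print("delta-hat =", dhat)
```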

Figures 9.11 and 9.12 contain two realizations from the fitted model for the goal data. The general appearance of the first realization is somewhat compatible with the goal data, while the second realization illustrates the convergence of the sample path to 0 in accordance with the result of Grunwald et al. (1994).

Fig. 9.11

A simulated time series from the fitted model to the goal data

Fig. 9.12

A second simulated time series from the fitted model to the goal data
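
Sample paths like those in Figures 9.11 and 9.12 can be generated by alternately drawing the gamma-distributed state W t+1 = e^{X t+1} given y (t) (Remark 3) and then a Poisson observation with that mean, updating the parameters by (9.8.39)–(9.8.42) after each draw. The sketch below is an illustration only; the starting values α 1 | 0 and λ 1 | 0 are assumptions. Repeated runs with δ = 0.844 typically produce paths that are eventually absorbed at 0, consistent with the result of Grunwald et al. (1994).

```python
import numpy as np

def simulate_power_steady_poisson(n, delta, alpha0=1.0, lam0=None, seed=0):
    """Simulate a path from the Poisson/log-gamma power steady model:
    draw W_{t+1} ~ Gamma(alpha_{t+1|t}, rate lambda_{t+1|t}) given y^{(t)},
    then Y_{t+1} ~ Poisson(W_{t+1}), and update the parameters by
    (9.8.39)-(9.8.42).  alpha0 and lam0 are assumed starting values."""
    rng = np.random.default_rng(seed)
    if lam0 is None:
        lam0 = delta / (1.0 - delta)
    alpha, lam = alpha0, lam0
    path = []
    for _ in range(n):
        w = rng.gamma(shape=alpha, scale=1.0 / lam)   # state on the original scale
        yt = int(rng.poisson(w))
        path.append(yt)
        alpha = delta * (yt + alpha)                  # (9.8.40) then (9.8.42)
        lam = delta * (1.0 + lam)                     # (9.8.39) then (9.8.41)
    return path

if __name__ == "__main__":
    print(simulate_power_steady_poisson(52, delta=0.844))
```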

Example 9.8.8.

The Exponential Case

Suppose Y t given X t has an exponential density with mean − 1∕X t (X t  < 0). The observation density is given by

$$ \displaystyle{ p(y_{t}\vert x_{t}) =\exp \{ y_{t}x_{t} +\ln (-x_{t})\},\quad y_{t} > 0, } $$

which has the form (9.8.31) with b(x) = −ln(−x) and c(y) = 0. The state densities corresponding to the family of conjugate priors (see (9.8.37)) are given by

$$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) =\exp \{\alpha _{ t+1\vert t}\,x_{t+1} -\lambda _{t+1\vert t}\,b(x_{t+1}) + A_{t+1\vert t}\},\quad -\infty < x < 0. } $$

(Here p(x t+1 | y (t)) is a probability density when α t+1 | t  > 0 and λ t+1 | t  > −1.) The one-step prediction density is

$$ \displaystyle\begin{array}{rcl} p\left (y_{t+1}\vert \mathbf{y}^{(t)}\right )& =& \int _{ -\infty }^{0}e^{x_{t+1}y_{t+1}+\ln (-x_{t+1})+\alpha _{t+1\vert t}x_{t+1}-\lambda _{t+1\vert t}b(x_{t+1})+A_{t+1\vert t} }\,dx_{t+1} {}\\ & =& (\lambda _{t+1\vert t} + 1)\alpha _{t+1\vert t}^{\lambda _{t+1\vert t}+1}(y_{ t+1} +\alpha _{t+1\vert t})^{-\lambda _{t+1\vert t}-2},\quad y_{ t+1} > 0 {}\\ \end{array} $$

(see Problem 9.28). While E(Y t+1 | y (t)) = α t+1 | t ∕λ t+1 | t , the conditional variance is finite if and only if λ t+1 | t  > 1. Under assumptions (9.8.41)–(9.8.42), and starting with λ 1 | 0 = δ∕(1 −δ), the exponential smoothing formula (9.8.48) remains valid.
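
As a numerical sanity check of the displayed predictive density (an illustration with arbitrary values standing in for α t+1 | t and λ t+1 | t ), one can verify that it integrates to 1 and has mean α t+1 | t ∕λ t+1 | t :

```python
import numpy as np
from scipy import integrate

def pareto_pred_density(y, alpha, lam):
    """p(y_{t+1} | y^{(t)}) = (lam + 1) * alpha**(lam + 1) * (y + alpha)**(-lam - 2), y > 0."""
    return (lam + 1.0) * alpha ** (lam + 1.0) * (y + alpha) ** (-lam - 2.0)

if __name__ == "__main__":
    alpha, lam = 2.0, 4.0   # arbitrary values of alpha_{t+1|t} and lambda_{t+1|t}
    total, _ = integrate.quad(lambda y: pareto_pred_density(y, alpha, lam), 0, np.inf)
    mean, _ = integrate.quad(lambda y: y * pareto_pred_density(y, alpha, lam), 0, np.inf)
    print(total)                 # approximately 1
    print(mean, alpha / lam)     # both approximately 0.5
```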

Problems

  1. 9.1

    Show that if all the eigenvalues of F are less than 1 in absolute value (or equivalently that \( F^{k}\! \rightarrow \! 0 \) as k → ∞), the unique stationary solution of equation (9.1.11) is given by the infinite series

    $$ \displaystyle{ \mathbf{X}_{t} =\sum _{ j=0}^{\infty }F^{ j}V _{ t-j-1} } $$

    and that the corresponding observation vectors are

    $$ \displaystyle{ \mathbf{Y}_{t} = \mathbf{W}_{t} +\sum _{ j=0}^{\infty }GF^{ j}\mathbf{V}_{ t-j-1}. } $$

    Deduce that {(X t ′, Y t ′)′} is a multivariate stationary process. (Hint: Use a vector analogue of the argument in Example 2.2.1.)

  2. 9.2

    In Example 9.2.1, show that θ = −1 if and only if σ v 2 = 0, which in turn is equivalent to the signal M t being constant.

  3. 9.3

    Let F be the coefficient of X t in the state equation (9.3.4) for the causal AR(p) process

    $$ \displaystyle{ X_{t} -\phi _{1}X_{t-1} -\cdots -\phi _{p}X_{t-p} = Z_{t},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ). } $$

    Establish the stability of (9.3.4) by showing that

    $$ \displaystyle{ \det (zI - F) = z^{p}\phi \left (z^{-1}\right ), } $$

    and hence that the eigenvalues of F are the reciprocals of the zeros of the autoregressive polynomial ϕ(z) = 1 −ϕ 1 z −⋯ −ϕ p z p.

  4. 9.4

    By following the argument in Example 9.3.3, find a state-space model for {Y t } when {∇∇12 Y t } is an ARMA(p, q) process.

  5. 9.5

    For the local linear trend model defined by equations (9.2.6)–(9.2.7), show that ∇2 Y t  = (1 − B)2 Y t is a 2-correlated sequence and hence, by Proposition 2.1.1, is an MA(2) process. Show that this MA(2) process is noninvertible if σ u 2 = 0.

  6. 9.6
    1. a.

      For the seasonal model of Example 9.2.2, show that ∇ d Y t  = Y t  − Y t−d is an MA(1) process.

    2. b.

      Show that ∇∇ d Y t is an MA(d + 1) process where {Y t } follows the seasonal model with a local linear trend as described in Example 9.2.3.

  7. 9.7

    Let {Y t } be the MA(1) process

    $$ \displaystyle{ Y _{t} = Z_{t} +\theta Z_{t-1},\ \ \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ). } $$

    Show that {Y t } has the state-space representation

    $$ \displaystyle{ Y _{t} = [1\quad 0]\mathbf{X}_{t}, } $$

    where {X t } is the unique stationary solution of

    $$ \displaystyle{ \mathbf{X}_{t+1} = \left [\begin{array}{*{10}c} 0&1\\ 0 &0 \end{array} \right ]\mathbf{X}_{t}+\left [\begin{array}{*{10}c} 1\\ \theta \end{array} \right ]Z_{t+1}. } $$

    In particular, show that the state vector X t can be written as

    $$ \displaystyle{ \mathbf{X}_{t} = \left [\begin{array}{*{10}c} 1& \theta \\ \theta &0 \end{array} \right ]\left [\begin{array}{*{10}c} Z_{t} \\ Z_{t-1} \end{array} \right ]. } $$
  8. 9.8

    Verify equations (9.3.16)–(9.3.18) for an ARIMA(1,1,1) process.

  9. 9.9

    Consider the two state-space models

    $$ \displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{X}_{t+1,1}\quad &=F_{1}\mathbf{X}_{t1} + \mathbf{V}_{t1}, \\ \mathbf{Y}_{t1} \quad &=G_{1}\mathbf{X}_{t1} + \mathbf{W}_{t1}, \end{array} \right. } $$

    and

    $$ \displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{X}_{t+1,2}\quad &=F_{2}\mathbf{X}_{t2} + \mathbf{V}_{t2}, \\ \mathbf{Y}_{t2} \quad &=G_{2}\mathbf{X}_{t2} + \mathbf{W}_{t2}, \end{array} \right. } $$

    where {(V t1′, W t1′, V t2′, W t2′)′} is white noise. Derive a state-space representation for {(Y t1′, Y t2′)′}.

  10. 9.10

    Use Remark 1 of Section 9.4 to establish the linearity properties of the operator P t stated in Remark 3.

  11. 9.11
    1. a.

      Show that if the matrix equation XS = B can be solved for X, then X = BS −1 is a solution for any generalized inverse S −1 of S.

    2. b.

      Use the result of (a) to derive the expression for P(X | Y) in Remark 4 of Section 9.4.

  12. 9.12

    In the notation of the Kalman prediction equations, show that every vector of the form

    $$ \displaystyle{ \mathbf{Y} = A_{1}\mathbf{X}_{1} + \cdots + A_{t}\mathbf{X}_{t} } $$

    can be expressed as

    $$ \displaystyle{ \mathbf{Y} = B_{1}\mathbf{X}_{1} + \cdots + B_{t-1}\mathbf{X}_{t-1} + C_{t}\mathbf{I}_{t}, } $$

    where B 1, …, B t−1 and C t are matrices that depend on the matrices A 1, …, A t . Show also that the converse is true. Use these results and the fact that E(X s I t ) = 0 for all s < t to establish (9.4.3).

  13. 9.13

    In Example 9.4.1, verify that the steady-state solution of the Kalman recursions (9.1.2) is given by \( \Omega _{t} = \left (\sigma _{v}^{2} + \sqrt{\sigma _{v }^{4 } + 4\sigma _{v }^{2 }\sigma _{w }^{2}}\right )/2 \).

  14. 9.14

    Show from the difference equations for Ω t in Example 9.4.1 that (Ω t+1 − Ω)(Ω t  − Ω) ≥ 0 for all Ω t  ≥ 0, where Ω is the steady-state solution for Ω t  given in Problem 9.13.

  15. 9.15

    Show directly that for the MA(1) model (9.2.3), the parameter θ is equal to \( -\left (2\sigma _{w}^{2} +\sigma _{ v}^{2} -\sqrt{\sigma _{v }^{4 } + 4\sigma _{v }^{2 }\sigma _{w }^{2}}\right )/\left (2\sigma _{w}^{2}\right ) \), which in turn is equal to −σ w 2∕(Ω +σ w 2), where Ω is the steady-state solution for Ω t given in Problem 9.13.

  16. 9.16

    Use the ARMA(0,1,1) representation of the series {Y t } in Example 9.4.1 to show that the predictors defined by

    $$ \displaystyle{ \hat{Y }_{n+1} = aY _{n} + (1 - a)\hat{Y }_{n},\quad n = 1,2,\ldots, } $$

    where a = Ω∕(Ω +σ w 2), satisfy

    $$ \displaystyle{ Y _{n+1} -\hat{ Y }_{n+1} = Z_{n+1} + (1 - a)^{n}\left (Y _{ 0} - Z_{0} -\hat{ Y }_{1}\right ). } $$

    Deduce that if 0 < a < 1, the mean squared error of \( \hat{Y }_{n+1} \) converges to Ω +σ w 2 for any initial predictor \( \hat{Y }_{1} \) with finite mean squared error.

  17. 9.17
    1. a.

      Using equations (9.4.1) and (9.4.16), show that \( \hat{\mathbf{X}}_{t+1} = F_{t}\mathbf{X}_{t\vert t} \).

    2. b.

      From (a) and (9.4.16) show that X t | t satisfies the recursions

      $$ \displaystyle{ \mathbf{X}_{t\vert t} = F_{t-1}\mathbf{X}_{t-1\vert t-1} + \Omega _{t}G_{t}'\Delta _{t}^{-1}(\mathbf{Y}_{ t} - G_{t}F_{t-1}\mathbf{X}_{t-1\vert t-1}) } $$

      for t = 2, 3, …, with \( \mathbf{X}_{1\vert 1} =\hat{ \mathbf{X}}_{1} + \Omega _{1}G_{1}'\Delta _{1}^{-1}\left (\mathbf{Y}_{1} - G_{1}\hat{\mathbf{X}}_{1}\right ) \).

  18. 9.18

    In Section 9.5, show that for fixed Q ∗, \( -2\ln L\left (\boldsymbol{\mu },Q^{{\ast}},\sigma _{w}^{2}\right ) \) is minimized when \( \boldsymbol{\mu } \) and σ w 2 are given by (9.5.10) and (9.5.11), respectively.

  19. 9.19

    Verify the calculation of Θ t Δ t −1 and Ω t in Example 9.6.1.

  20. 9.20

    Verify that the best estimates of missing values in an AR(p) process are found by minimizing (9.6.11) with respect to the missing values.

  21. 9.21

    Suppose that {Y t } is the AR(2) process

    $$ \displaystyle{ Y _{t} =\phi _{1}Y _{t-1} +\phi _{2}Y _{t-2} + Z_{t},\quad \{Z_{t}\} \sim \mathrm{WN}\left (0,\sigma ^{2}\right ), } $$

    and that we observe Y 1, Y 2, Y 4, Y 5, Y 6, Y 7. Show that the best estimator of Y 3 is

    $$ \displaystyle{ \left (\phi _{2}(Y _{1} + Y _{5}) + (\phi _{1} -\phi _{1}\phi _{2})(Y _{2} + Y _{4})\right )/\left (1 +\phi _{ 1}^{2} +\phi _{ 2}^{2}\right ). } $$
  22. 9.22

    Let X t be the state at time t of a parameter-driven model (see (9.8.2)). Show that {X t } is a Markov chain and that (9.8.3) holds.

  23. 9.23

    For the generalized state-space model of Example 9.8.1, show that Ω t+1 = F 2 Ω t | t + Q.

  24. 9.24

    If Y and X are random variables, show that

    $$ \displaystyle{ \mathrm{Var}(Y ) = E(\mathrm{Var}(Y \vert X)) + \mathrm{Var}(E(Y \vert X)). } $$
  25. 9.25

    Suppose that Y and X are two random variables such that the distribution of Y given X is Poisson with mean π X, 0 < π ≤ 1, and X has the gamma density g(x; α, λ).

    1. a.

      Show that the posterior distribution of X given Y also has a gamma density and determine its parameters.

    2. b.

      Compute E(X | Y ) and Var(X | Y ).

    3. c.

      Show that Y has a negative binomial density and determine its parameters.

    4. d.

      Use (c) to compute E(Y ) and Var(Y ).

    5. e.

      Verify in Example 9.8.2 that \( E\left (Y _{t+1}\vert \mathbf{Y}^{(t)}\right ) =\alpha _{t}\pi /(\lambda _{t+1}-\pi ) \) and Var\( \left (Y _{t+1}\vert \mathbf{Y}^{(t)}\right ) =\alpha _{t}\pi \lambda _{t+1}/(\lambda _{t+1}-\pi )^{2} \).

  26. 9.26

    For the model of Example 9.8.6, show that

    1. a.

      \( E\left (X_{t+1}\vert \mathbf{Y}^{(t)}\right ) = E\left (X_{t}\vert \mathbf{Y}^{(t)}\right ) \), Var\( \left (X_{t+1}\vert \mathbf{Y}^{(t)}\right ) > \) Var\( \left (X_{t}\vert \mathbf{Y}^{(t)}\right ) \), and

    2. b.

      the transformed sequence \( W_{t} = e^{X_{t}} \) has a gamma state density.

  27. 9.27

    Let {V t } be a sequence of independent exponential random variables with EV t  = 1∕t and suppose that {X t , t ≥ 1} and {Y t , t ≥ 1} are the state and observation random variables, respectively, of the parameter-driven state-space system

    $$ \displaystyle\begin{array}{rcl} X_{1}& =& V _{1}, {}\\ X_{t}& =& X_{t-1} + V _{t},\quad t = 2,3,\ldots, {}\\ \end{array} $$

    where the distribution of the observation Y t , conditional on the random variables Y 1, Y 2, …, Y t−1, X t , is Poisson with mean X t .

    1. a.

      Determine the observation and state transition density functions p(y t  | x t ) and p(x t+1 | x t ) in the parameter-driven model for {Y t }.

    2. b.

      Show, using (9.8.4)–(9.8.6), that

      $$ \displaystyle{ p(x_{1}\vert y_{1}) = g(x_{1};y_{1} + 1,2) } $$

      and

      $$ \displaystyle{ p(x_{2}\vert y_{1}) = g(x_{2};y_{1} + 2,2), } $$

      where g(x; α, λ) is the gamma density function (see Example (d) of Section A.1).

    3. c.

      Show that

      $$ \displaystyle{ p\left (x_{t}\vert \mathbf{y}^{(t)}\right ) = g(x_{ t};\alpha _{t} + t,t + 1) } $$

      and

      $$ \displaystyle{ p\left (x_{t+1}\vert \mathbf{y}^{(t)}\right ) = g(x_{ t+1};\alpha _{t} + t + 1,t + 1), } $$

      where α t  = y 1 + ⋯ + y t .

    4. d.

      Conclude from (c) that the minimum mean squared error estimates of X t  and X t+1 based on Y 1, …, Y t are

      $$ \displaystyle{ X_{t\vert t} = \frac{t + Y _{1} + \cdots + Y _{t}} {t + 1} } $$

      and

      $$ \displaystyle{ \hat{X}_{t+1} = \frac{t + 1 + Y _{1} + \cdots + Y _{t}} {t + 1}, } $$

      respectively.

  28. 9.28

    Let Y and X be two random variables such that Y given X is exponential with mean 1∕X, and X has the gamma density function

    $$ \displaystyle{ g(x;\lambda +1,\alpha ) = \frac{\alpha ^{\lambda +1}x^{\lambda }\exp \{ -\alpha x\}} {\Gamma (\lambda +1)},\quad x > 0, } $$

    where λ > −1 and α > 0.

    1. a.

      Determine the posterior distribution of X given Y.

    2. b.

      Show that Y has a Pareto distribution

      $$ \displaystyle{ p(y) = (\lambda +1)\alpha ^{\lambda +1}(y+\alpha )^{-\lambda -2},\quad y > 0. } $$
    3. c.

      Find the mean and variance of Y. Under what conditions on α and λ does the latter exist?

    4. d.

      Verify the calculation of \( p\left (y_{t+1}\vert \mathbf{y}^{(t)}\right ) \) and \( E\left (Y _{t+1}\vert \mathbf{y}^{(t)}\right ) \) for the model in Example 9.8.8.

  29. 9.29

    Consider an observation-driven model in which Y t given X t is binomial with parameters n and X t , i.e.,

    $$ \displaystyle{ p(y_{t}\vert x_{t}) ={ n\choose y_{t}}x_{t}^{y_{t} }(1 - x_{t})^{n-y_{t} },\quad y_{t} = 0,1,\ldots,n. } $$
    1. a.

    Show that the observation equation with state variable transformed by the logit transformation W t  = ln(X t ∕(1 − X t )) follows an exponential family

    $$ \displaystyle{ p(y_{t}\vert w_{t}) =\exp \{ y_{t}w_{t} - b(w_{t}) + c(y_{t})\}. } $$

    Determine the functions b(⋅ ) and c(⋅ ).

    2. b.

    Suppose that the state X t has the beta density

    $$ \displaystyle{ p(x_{t+1}\vert \mathbf{y}^{(t)}) = f(x_{ t+1};\alpha _{t+1\vert t},\lambda _{t+1\vert t}), } $$

    where

    $$ \displaystyle{ f(x;\alpha,\lambda ) = [B(\alpha,\lambda )]^{-1}x^{\alpha -1}(1 - x)^{\lambda -1},\quad 0 < x < 1, } $$

    B(α, λ): = Γ(α)Γ(λ)∕Γ(α +λ) is the beta function, and α, λ > 0. Show that the posterior distribution of X t given Y t is also beta and express its parameters in terms of y t and α t | t−1, λ t | t−1.

    3. c.

    Under the assumptions made in (b), show that \( E{\bigl (X_{t}\vert \mathbf{Y}^{(t)}\bigr )} = E{\bigl (X_{t+1}\vert \mathbf{Y}^{(t)}\bigr )} \) and Var\( {\bigl (X_{t}\vert \mathbf{Y}^{(t)}\bigr )} < \) Var\( {\bigl (X_{t+1}\vert \mathbf{Y}^{(t)}\bigr )} \).

    4. d.

    Assuming that the parameters in (b) satisfy (9.8.41)–(9.8.42), show that the one-step prediction density \( p{\bigl (y_{t+1}\vert \mathbf{y}^{(t)}\bigr )} \) is beta-binomial,

    $$ \displaystyle{ p(y_{t+1}\vert \mathbf{y}^{(t)}) = \frac{B(\alpha _{t+1\vert t} + y_{t+1},\lambda _{t+1\vert t} + n - y_{t+1})} {(n + 1)B(y_{t+1} + 1,n - y_{t+1} + 1)B(\alpha _{t+1\vert t},\lambda _{t+1\vert t})}, } $$

    and verify that \( \hat{Y }_{t+1} \) is given by (9.8.47).