Characterization of time series by means of autoregressive (AR) or moving-average (MA) processes or combined autoregressive moving-average (ARMA) processes was suggested, more or less simultaneously, by the Russian statistician and economist, E. Slutsky (1927), and the British statistician G.U. Yule (1921, 1926, 1927). Slutsky and Yule observed that if we begin with a series of purely random numbers and then take sums or differences, weighted or unweighted, of such numbers, the new series so produced has many of the apparent cyclic properties that are thought to characterize economic and other time series. Such sums or differences of purely random numbers are the basis for ARMA models of the processes by which many kinds of economic time series are assumed to be generated, and thus form the basis for recent suggestions for analysis, forecasting and control (e.g., Box and Jenkins 1970).

Let L be the lag operator such that \( {L}^k{x}_t={x}_{t-k} \). Consider the familiar pth-order linear, homogeneous, deterministic difference equation with constant coefficients common in discrete dynamic economic analysis (e.g. Chow 1975)

$$ \psi (L){y}_t=0 $$

or

$$ {y}_t-{\psi}_1{y}_{t-1}-\cdots -{\psi}_p{y}_{t-p}=0. $$
(1)

Relationships are seldom exact, however, so we introduce a serially uncorrelated random shock εt with zero mean and constant variance:

$$ {\displaystyle \begin{array}{l}E{\varepsilon}_t=0\\ {}E{\varepsilon}_t{\varepsilon}_{t^{\prime }}=\left\{\begin{array}{ll}{\sigma}^2,& t={t}^{\prime },\\ {}0,& \mathrm{otherwise}.\end{array}\right.\end{array}} $$
(2)

Thus

$$ \psi (L){y}_t={\varepsilon}_t, $$
(3)

which is the pth-order autoregressive process, AR(p), with constant coefficients studied by Yule (1927).

If the stochastic term in (3) is itself assumed to be a linear combination of current and past values of a random shock εt with properties (2), for example,

$$ \psi (L){y}_t={\mu}_t, $$
(4)

where μt = ϕ(L)εt, and ϕ(·) is a polynomial of order q, then the process is a mixed autoregressive moving-average process of order (p, q), ARMA (p, q). The process generating μt is simply a moving-average process of order q, MA(q).
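
To make the data-generating mechanism concrete, here is a minimal simulation sketch in Python, using the sign conventions of (1) and (4) and Gaussian shocks satisfying (2); the function name simulate_arma and the burn-in device are illustrative choices, not part of the original exposition.

```python
import numpy as np

def simulate_arma(psi, phi, sigma, T, burn=500, seed=0):
    """Draw T observations from
       y_t - psi_1 y_{t-1} - ... - psi_p y_{t-p}
         = e_t - phi_1 e_{t-1} - ... - phi_q e_{t-q},
       where e_t is iid N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    p, q = len(psi), len(phi)
    e = rng.normal(0.0, sigma, T + burn)
    y = np.zeros(T + burn)
    for t in range(max(p, q), T + burn):
        ar = sum(psi[i] * y[t - 1 - i] for i in range(p))
        ma = sum(phi[j] * e[t - 1 - j] for j in range(q))
        y[t] = ar + e[t] - ma
    return y[burn:]  # discard the burn-in so start-up transients die out

# an ARMA(1, 1) with psi_1 = 0.7 and phi_1 = 0.4
y = simulate_arma([0.7], [0.4], sigma=1.0, T=1000)
```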

Such dynamic processes, under appropriate conditions on the coefficients of ψ, ϕ, and the distribution of εt have found wide application in both theoretical and empirical economics. The ability of such processes to describe the evolution of a series has made ARMA models a powerful tool for forecasting economic time series and other applications, such as seasonal adjustment. Moreover, because the models can capture a wide range of stochastic properties of economic time series, they have been widely used in models involving rational expectations (e.g. Whiteman 1983).

Univariate ARMA Models

Conditions for weak stationarity and invertibility (i.e. capability of being expressed as a pure, but possibly infinite, AR) are most easily discussed in terms of the so-called z-transform or autocovariance-generating transform of the model. This is obtained for models (3) and (4) by replacing the lag operator by a complex variable z; thus, in general,

$$ B(z)=\frac{\phi (z)}{\psi (z)}, $$
(5)

where the expression on the right converges. If the roots of ψ(z) = 0 do not lie strictly outside the unit circle (i.e. some lie on or inside), the process described by (3) or (4) will not be stationary, nor will the expression on the right converge outside of a circle with radius less than one. (See Time Series Analysis.) In order to find a purely AR representation of MA and ARMA models, we require that 1/ϕ(z) converge in the same region, so that ϕ(z) = 0 must also have roots outside the unit circle. In this case, B(z) is well-defined everywhere outside the unit circle, and the model defined by (4) is both weakly stationary and invertible; the representation

$$ {y}_t=B(L){\varepsilon}_t $$
(6)

is a one-sided, infinite-order MA, with \( {\sum}_{j=0}^{\infty }{b}_j^2<\infty \).
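
These root conditions are easy to verify numerically. A small sketch, assuming polynomials written as 1 − c1z − ⋯ − ckzk (the helper name roots_outside_unit_circle is hypothetical): applied to the coefficients of ψ it checks stationarity, applied to those of ϕ, invertibility.

```python
import numpy as np

def roots_outside_unit_circle(coeffs):
    """Zeros of 1 - c_1 z - ... - c_k z^k; stationarity (for the AR part)
       or invertibility (for the MA part) requires them all to lie
       strictly outside the unit circle."""
    poly = np.concatenate(([1.0], -np.asarray(coeffs)))  # low powers first
    roots = np.polynomial.polynomial.polyroots(poly)
    return bool(np.all(np.abs(roots) > 1.0))

print(roots_outside_unit_circle([0.7]))  # True: root 1/0.7 lies outside
print(roots_outside_unit_circle([1.2]))  # False: root 1/1.2 lies inside
```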

In Wold (1938) it is shown that every discrete weakly stationary process may be decomposed into a purely linearly deterministic part (which can be predicted exactly from a sufficient past history) and a part which corresponds to (6) above. (See the discussions of stationarity and ergodicity in Time Series Analysis.)

Let the autocovariances of a stationary, zero-mean time series, xt, be given by

$$ \gamma \left(\tau \right)=E{x}_t{x}_{t-\tau }. $$
(7)

The function

$$ g(z)=\sum_{\tau =-\infty}^{\infty }{z}^{\tau}\gamma \left(\tau \right) $$
(8)

is called the autocovariance generating function. If the function g(z) is known and analytic in a certain region, it is possible to read off the autocovariances of the time series as the coefficients in a Laurent series expansion of the function there. For a linearly nondeterministic time series with one-sided MA representation (6) the autocovariance generating transform is given by

$$ {g}_{yy}(z)={\sigma}^2B(z)B\left({z}^{-1}\right). $$
(9)

This function is analytic everywhere in an annulus about the unit circle. If yt is generated by a stationary ARMA model with invertible MA component, then gyy(z) will have no zeros anywhere in this annulus. On the unit circle itself the spectral density of the series is \( {\left(2\pi \right)}^{-1} \) times the autocovariance generating transform:

$$ {f}_{yy}\left(\lambda \right)=\left(1/2\pi \right){g}_{yy}\left({e}^{i\lambda}\right),\ \ \ \ \ -\pi \le \lambda <\pi . $$
(10)

Stationary, invertible ARMA processes give rise to time series with spectral densities which are strictly positive in the interval (−π, π).
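
Equation (10) can be transcribed directly; the sketch below assumes the same polynomial conventions as above, and the name arma_spectral_density is illustrative.

```python
import numpy as np

def arma_spectral_density(psi, phi, sigma, lam):
    """f(lam) = (sigma^2 / 2 pi) |phi(e^{i lam})|^2 / |psi(e^{i lam})|^2,
       following (9) and (10), with psi(z) = 1 - psi_1 z - ... and
       phi(z) = 1 - phi_1 z - ...."""
    z = np.exp(1j * lam)
    num = 1.0 - sum(c * z ** (k + 1) for k, c in enumerate(phi))
    den = 1.0 - sum(c * z ** (k + 1) for k, c in enumerate(psi))
    return (sigma**2 / (2 * np.pi)) * np.abs(num) ** 2 / np.abs(den) ** 2

lam = np.linspace(-np.pi, np.pi, 201)
f = arma_spectral_density([0.7], [0.4], 1.0, lam)
print(f.min() > 0)  # strictly positive on (-pi, pi), as the text asserts
```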

Let 1/βj, j = 1,…, q, be the roots, not necessarily distinct, of ϕ(z) = 0, and 1/αj, j = 1,…, p, be the roots of ψ(z) = 0. For a stationary, invertible ARMA model all these roots lie outside the unit circle. The autocovariances of the time series generated by this model are

$$ {\displaystyle \begin{array}{ll}\gamma \left(\tau \right)& =\left(1/2\pi i\right){\oint}_{\left|z\right|=1}{z}^{-\tau -1}g(z)\,\mathrm{d}z\\ {}& =\left({\sigma}^2/2\pi i\right){\oint}_{\left|z\right|=1}{z}^{p+\left|\tau \right|-q-1}\frac{\prod_{j=1}^q\left(1-{\beta}_jz\right)\left(z-{\beta}_j\right)}{\prod_{k=1}^p\left(1-{\alpha}_kz\right)\left(z-{\alpha}_k\right)}\,\mathrm{d}z.\end{array}} $$
(11)

By the residue theorem, the integral on the right-hand side of (11) is 2πi times the sum of the residues enclosed by the unit circle. This fact allows a particularly simple calculation of the autocovariances of a time series generated by an ARMA model (see Nerlove et al. 1979, pp. 78–85). For example, for the general pth-order autoregression, AR(p), with distinct roots, the result is

$$ \gamma \left(\tau \right)={\sigma}^2\sum_{k=1}^p\frac{\alpha_k^{\,p+\left|\tau \right|-1}}{\prod_{j=1}^p\left(1-{\alpha}_j{\alpha}_k\right)\prod_{j=1,\ j\ne k}^p\left({\alpha}_k-{\alpha}_j\right)}, $$
(12)

and for the ARMA(1, 1) model, it is

$$ {\displaystyle \begin{array}{ll}\gamma \left(\tau \right)={\sigma}^2{\alpha}^{\left|\tau \right|}\left(1-\alpha \beta \right)\left(1-\beta /\alpha \right)/\left(1-{\alpha}^2\right),& \tau =\pm 1,\pm 2,\dots, \\ {}\gamma (0)={\sigma}^2\left(1+{\beta}^2-2\alpha \beta \right)/\left(1-{\alpha}^2\right).& \end{array}} $$
(13)
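
As a numerical sanity check, the closed form (13) may be compared with sample autocovariances from a long simulated ARMA(1, 1) path; the parameter values below are arbitrary illustrations.

```python
import numpy as np

alpha, beta, sigma = 0.7, 0.4, 1.0

def gamma_arma11(tau):
    """Closed-form autocovariance (13) of y_t - alpha y_{t-1} = e_t - beta e_{t-1}."""
    if tau == 0:
        return sigma**2 * (1 + beta**2 - 2 * alpha * beta) / (1 - alpha**2)
    return (sigma**2 * alpha ** (abs(tau) - 1) * (alpha - beta)
            * (1 - alpha * beta) / (1 - alpha**2))

# sample autocovariances from a long simulated path
rng = np.random.default_rng(0)
T = 200_000
e = rng.normal(0.0, sigma, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = alpha * y[t - 1] + e[t] - beta * e[t - 1]

for tau in range(4):
    sample = np.mean((y[tau:] - y.mean()) * (y[: T - tau] - y.mean()))
    print(tau, round(gamma_arma11(tau), 4), round(sample, 4))
```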

Formulation and Estimation of Univariate ARMA Models

The problem of formulating an ARMA model refers to determination of the orders p and q of the AR and MA components, while the estimation problem is that of determining the values of the parameters of the model, for example, the roots 1/αj, j = 1,…, p, and 1/βk, k = 1,…, q, and the variance σ2 of εt.

Box and Jenkins (1970), among others, have suggested the use of the sample autocorrelation and partial autocorrelation functions as an approach to the problem of formulating an ARMA model. It is known, however, that the estimates of these functions are poorly behaved relative to their theoretical counterparts and, thus, provide a somewhat dubious basis for model formulation (Nerlove et al. 1979, pp. 57–68, 105–106; Hannan 1960, p. 41).

More recently, information-theoretic approaches to model formulation, having a rigorous foundation in statistical information theory, have been proposed. These procedures are designed for order determination in general ARMA(p, q) models. The Akaike (1973) Information Criterion (AIC) leads to selection of the model for which the expression

$$ \mathrm{AIC}(k)=\ln\ {\widehat{\sigma}}_{\mathrm{ML}}^2+2k/T $$
(14)

is minimized, where \( {\widehat{\sigma}}_{\mathrm{ML}}^2 \) is the maximum likelihood estimate of \( {\sigma}^2 \), T is the sample size, and k = p + q. It is well known that the AIC is not consistent, in the sense that it does not lead to selection of the correct model with probability one in large samples (Shibata 1976; Hannan and Quinn 1979; Hannan 1980; Kashyap 1980). The procedure does, however, have special benefits when selecting the order of an AR model, as shown by Shibata (1980). Specifically, he shows that if the true model cannot be written as a finite AR, but an AR is fitted anyway, then use of the AIC minimizes asymptotic mean-squared prediction error within the class of AR models.

Schwarz (1978) and Rissanen (1978) develop a consistent modification of the AIC which has become known as the Schwarz Information Criterion (SIC). This criterion selects the model which minimizes:

$$ \mathrm{SIC}(k)=\ln\ {\widehat{\sigma}}_{\mathrm{ML}}^2+\frac{k\,\ln\ T}{T}, $$
(15)

and Hannan (1980) shows that this procedure identifies the true model with probability one in large samples, so long as the maximum possible orders of the AR and MA components are known.
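
For pure AR candidates, which can be fitted by ordinary least squares, tabulating (14) and (15) is straightforward. A minimal sketch, assuming the residual variance as the estimate of σ2 and the effective sample size T − p in place of T; the helper name ar_order_criteria is hypothetical.

```python
import numpy as np

def ar_order_criteria(y, max_p):
    """Fit AR(p) by OLS for p = 1..max_p and report (p, AIC, SIC) as in
       (14)-(15); sigma^2 is estimated by the residual variance and the
       effective sample size (T - p) stands in for T."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    out = []
    for p in range(1, max_p + 1):
        X = np.column_stack([y[p - i:T - i] for i in range(1, p + 1)])
        target = y[p:]
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        s2 = np.mean((target - X @ coef) ** 2)
        n = len(target)
        out.append((p, np.log(s2) + 2 * p / n, np.log(s2) + np.log(n) * p / n))
    return out

# choose the order minimizing the chosen criterion, e.g. the SIC:
# best_p = min(ar_order_criteria(y, 8), key=lambda r: r[2])[0]
```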

Once the orders p and q are determined, the problem of estimating the parameters of the ARMA model remains. Various approaches in the time domain are available, such as least squares, approximate maximum likelihood (Box and Jenkins 1970), or exact maximum likelihood (Newbold 1974; Harvey and Phillips 1979; Harvey 1981). Approximate maximum likelihood in the frequency domain is also possible (Hannan 1969b; Hannan and Nicholls 1972; Nerlove et al. 1979, pp. 132–6). The latter is based upon the asymptotic distribution of the sample periodogram ordinates.

Estimation of pure AR models (no MA component) is particularly simple, since ordinary least squares yields consistent parameter estimates. The basis of such estimation is the set of Yule–Walker equations (Yule 1927; Walker 1931). Consider the AR(p) process:

$$ {y}_t=\sum_{i=1}^p{\psi}_i{y}_{t-i}+{\varepsilon}_t. $$
(16)

Multiplying (16) by \( {y}_{t-\tau } \), τ > 0, taking expectations, and recognizing that γ(τ) = γ(−τ) gives

$$ \gamma \left(\tau \right)=\sum_{i=1}^p{\psi}_i\gamma \left(\tau -i\right),\ \ \ \tau >0. $$
(17)

Dividing (17) by the variance γ(0), we obtain the system of Yule–Walker equations:

$$ \rho \left(\tau \right)=\sum_{i=1}^p{\psi}_i\rho \left(\tau -i\right),\ \ \ \tau >0, $$
(18)

which relate the autocorrelations of the process. This pth-order linear system is easily solved for the ψi, i = 1,…, p, in terms of the first p autocorrelations. In practice, the theoretical autocorrelations are replaced by their sample counterparts, yielding estimates of the ψi, i = 1,…, p. These parameter estimates may be conveniently used as start-up values for the more sophisticated, iterative estimation procedures discussed above.
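
A sketch of that substitution: sample autocorrelations replace the theoretical ρ(τ) in (18), and the p × p linear system is solved directly (the name yule_walker is an illustrative choice).

```python
import numpy as np

def yule_walker(y, p):
    """Solve the Yule-Walker system (18) with sample autocorrelations
       in place of the theoretical ones; returns estimates of psi_1..psi_p."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    T = len(y)
    gamma = np.array([y[k:] @ y[:T - k] / T for k in range(p + 1)])
    rho = gamma / gamma[0]
    R = np.array([[rho[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, rho[1:])   # R is the Toeplitz matrix of rho

# psi_hat = yule_walker(y, 2)   # handy start-up values for iterative ML
```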

Estimation of MA or mixed models by exact maximum likelihood methods is complicated further by a tendency to obtain a local maximum of the likelihood function at a unit root of the MA component, even when no roots are close to the unit circle (Sargan and Bhargava 1983; Anderson and Takemura 1984).

Prediction

Optimal linear least squares prediction of time series generated by ARMA processes may be obtained for known parameter values by the Wiener–Kolmogorov approach (Whittle 1983). If yt is generated by a stationary, invertible ARMA model with one-sided MA representation (6), a very simple expression may be given for the linear minimum mean-square error (MMSE) prediction of yt+v at time t, \( {y}_{t+v}^{\ast } \), in terms of its own (infinite) past

$$ {y}_{t+v}^{\ast }=C(L){y}_t, $$
(19)

where

$$ C(z)=\sum_{j=0}^{\infty }{c}_j{z}^j=\frac{1}{B(z)}{\left[\frac{B(z)}{z^v}\right]}_{+}. $$

The operator [.]+ eliminates negative powers of z.
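
In practice the weights cj may be computed from the model coefficients by power-series division, since B(z) = ϕ(z)/ψ(z) and [B(z)/zv]+ simply drops the first v MA weights. A sketch under the sign conventions used throughout; both function names are hypothetical.

```python
import numpy as np

def ma_weights(psi, phi, n):
    """First n coefficients b_j of B(z) = phi(z)/psi(z) in (5), from the
       recursion b_j = sum_i psi_i b_{j-i} - phi_j, with b_0 = 1."""
    p, q = len(psi), len(phi)
    b = np.zeros(n)
    b[0] = 1.0
    for j in range(1, n):
        b[j] = sum(psi[i] * b[j - 1 - i] for i in range(min(p, j)))
        if j <= q:
            b[j] -= phi[j - 1]
    return b

def predictor_weights(psi, phi, v, n):
    """Coefficients c_j of C(z) = [B(z)/z^v]_+ / B(z): drop the first v
       MA weights, then deconvolve by B(z) (power-series division)."""
    b = ma_weights(psi, phi, n + v)
    a = b[v:]                      # the coefficients of [B(z)/z^v]_+
    c = np.zeros(n)
    for j in range(n):
        c[j] = a[j] - sum(b[i] * c[j - i] for i in range(1, j + 1))
    return c

# AR(1) with alpha = 0.7, v = 1: C(z) reduces to the single weight alpha
print(predictor_weights([0.7], [], 1, 5))   # ~[0.7, 0, 0, 0, 0]
```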

Suppose that yt is AR(1):

$$ {y}_t=\alpha {y}_{t-1}+{\varepsilon}_t,\ \ \ \mid \alpha \mid <1, $$

then \( {y}_{t+v}^{\ast }={\alpha}^v{y}_t \). If yt is AR(2):

$$ {y}_t=\left({\alpha}_1+{\alpha}_2\right){y}_{t-1}-{\alpha}_1{\alpha}_2{y}_{t-2}+{\varepsilon}_t,\ \ \ \mid {\alpha}_1\mid, \mid {\alpha}_2\mid <1, $$

then \( {y}_{t+1}^{\ast }=\left({\alpha}_1+{\alpha}_2\right){y}_t-{\alpha}_1{\alpha}_2{y}_{t-1} \). In general, the result for AR(p) as in (1) is \( {y}_{t+v}^{\ast }={\psi}_1{y}_{t+v-1}^{\ast }+\cdots +{\psi}_p{y}_{t+v-p}^{\ast } \), where \( {y}_{t-j}^{\ast }={y}_{t-j} \), for j = 0, 1,…, at time t. Thus for pure autoregression the MMSE prediction is a linear combination of only the p most recently observed values.
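
The recursion translates directly into code; a minimal sketch with the assumed helper name ar_forecast:

```python
def ar_forecast(y, psi, v):
    """MMSE predictions y*_{t+1}, ..., y*_{t+v} for an AR(p): apply the
       recursion y*_{t+s} = psi_1 y*_{t+s-1} + ... + psi_p y*_{t+s-p},
       seeding with the p most recently observed values."""
    p = len(psi)
    hist = list(y[-p:])
    preds = []
    for _ in range(v):
        nxt = sum(c * hist[-1 - i] for i, c in enumerate(psi))
        preds.append(nxt)
        hist.append(nxt)
    return preds

# AR(1) with alpha = 0.7 and y_t = 2: forecasts decay as alpha^v * y_t
print(ar_forecast([2.0], [0.7], 3))   # [1.4, 0.98, 0.686]
```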

Suppose that yt is MA(1):

$$ {y}_t={\varepsilon}_t-\beta {\varepsilon}_{t-1},\ \ \ \mid \beta \mid <1, $$

then \( {y}_{t+1}^{\ast }=-\beta {\sum}_{j=0}^{\infty }{\beta}^j{y}_{t-j} \) and \( {y}_{t+v}^{\ast }=0 \) for all v > 1. For moving-average processes, in general, predictions for a future period greater than the order of the process are zero, and those for a period less distant cannot be expressed in terms of a finite number of past observed values.

Finally, suppose that yt is ARMA(1, 1): \( {y}_t-\alpha {y}_{t-1}={\varepsilon}_t-\beta {\varepsilon}_{t-1} \), |α|, |β| < 1; then \( {y}_{t+v}^{\ast }={\alpha}^{v-1}\left(\alpha -\beta \right){\sum}_{j=0}^{\infty }{\beta}^j{y}_{t-j} \). For further examples, see Nerlove et al. (1979, pp. 89–102).
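
Since |β| < 1, the infinite sum can be truncated at the start of the observed sample with geometrically vanishing error; a small sketch (the name arma11_predict is assumed):

```python
import numpy as np

def arma11_predict(y, alpha, beta, v):
    """y*_{t+v} = alpha^{v-1} (alpha - beta) sum_j beta^j y_{t-j},
       truncating the sum at the start of the observed sample."""
    y = np.asarray(y, dtype=float)
    weights = beta ** np.arange(len(y))   # beta^0 for y_t, beta^1 for y_{t-1}, ...
    return alpha ** (v - 1) * (alpha - beta) * (weights @ y[::-1])

# setting alpha = 0 and v = 1 recovers the MA(1) predictor
# -beta * sum_j beta^j y_{t-j} given above
```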

When an infinite past is not available and the parameter values of the process are not known, the problem of optimal prediction is more complicated. The most straightforward approach is via the state-space representation of the process and the Kalman filter (Kalman 1960; Meinhold and Singpurwalla 1983).

Multivariate ARMA Processes

Let Ψ(·) and Φ(·) be K × K matrix polynomials in the lag operator, and let yt and εt be K × 1 vectors. Then the K-variate ARMA(p, q) process is defined as

$$ \varPsi (L){y}_t=\varPhi (L){\varepsilon}_t,\ \ \ {\varepsilon}_t\sim \mathrm{iid}\left(0,\Sigma \right), $$
(20)

where Ψ(L) = Ψ0 − Ψ1L − ⋯ − ΨpLp and Φ(L) = Φ0 − Φ1L − ⋯ − ΦqLq, with each Ψj and Φj, j = 0, 1,…, being a K × K matrix. The model is weakly stationary if all the zeros of det Ψ(z) lie outside the unit circle (Hannan 1970), and invertible if all the zeros of det Φ(z) also do.
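
For a pure VAR(p) with Φ(L) = Φ0 = I, the condition on det Ψ(z) is equivalent to all eigenvalues of the companion matrix lying strictly inside the unit circle, which is easy to check numerically; a sketch under that normalization (the name var_is_stationary is illustrative):

```python
import numpy as np

def var_is_stationary(Psi):
    """Weak stationarity check for a VAR(p) y_t = Psi_1 y_{t-1} + ... +
       Psi_p y_{t-p} + e_t: all eigenvalues of the companion matrix must
       lie strictly inside the unit circle (equivalently, det Psi(z) has
       no zeros on or inside it)."""
    Psi = np.asarray(Psi)          # shape (p, K, K): the list [Psi_1, ..., Psi_p]
    p, K, _ = Psi.shape
    F = np.zeros((K * p, K * p))
    F[:K, :] = np.hstack(list(Psi))
    F[K:, :-K] = np.eye(K * (p - 1))
    return bool(np.all(np.abs(np.linalg.eigvals(F)) < 1.0))

# a bivariate VAR(1) example
print(var_is_stationary([[[0.5, 0.1],
                          [0.0, 0.4]]]))   # True: eigenvalues 0.5 and 0.4
```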

In addition to the issues of formulation, estimation, and prediction, which arise in the univariate case as well, identification (in the usual econometric sense) becomes an important problem. Hannan (1969a) shows that a stationary vector AR process is identified if Φ0 is an identity matrix (i.e. no instantaneous coupling), and Ψp is nonsingular. Hannan (1971) extends the analysis to recursive systems and systems with prescribed zero restrictions.

There are three approaches to the formulation of multivariate ARMA models. Nerlove et al. (1979) and Granger and Newbold (1977) develop an augmented single-equation procedure, and Wallis (1977) and Wallis and Chan (1978) develop another procedure which involves preliminary univariate analysis.

A second approach is due to Tiao and Box (1981), who use multivariate analogues of the autocorrelation and partial autocorrelation functions as a guide to model formulation. Their approach is computationally quite simple and usually leads to models with a tractable number of parameters. Identification is achieved by allowing no instantaneous coupling among variables.

Finally, the information-theoretic model formulation procedures which were discussed above generalize to the multivariate ARMA case. Quinn (1980) shows that Schwarz’s criterion (SIC) again provides a consistent estimate of the vector AR order. In a large Monte Carlo comparison of criteria for estimating the order of a vector AR process (VAR), Lütkepohl (1985) shows the clear superiority of the SIC in medium-sized samples; the SIC chooses the correct model most often and leads to the best forecasting performance.

As in the case of univariate ARMA models, estimation in the multivariate case may be carried out in the frequency domain (Wilson 1973; Dunsmuir and Hannan 1976) or in the time domain (Hillmer and Tiao 1979). An exact likelihood function in the time domain may also be derived by the Kalman filter by casting the multivariate ARMA model in state space form. Anderson (1980) provides a good survey of estimation in both time and frequency domains.

Prediction in the multivariate case with an infinite past is a straightforward generalization of the results for the univariate case (Judge et al. 1985, pp. 659–60). When only a finite past is available and the parameters of the process must be estimated, the most straightforward approach is again through the Kalman filter (see also Yamamoto 1981).

Applications

In addition to their obvious uses in forecasting, ARMA models, especially multivariate ARMA models, have a wide range of economic and econometric application.

The use of time-series methods in formulating distributed-lag models is discussed at length in Nerlove (1972) and Nerlove et al. (1979, pp. 291–353) and applied in the latter to an analysis of US cattle production. The notion of quasi-rational expectations introduced there is that the expectations on the basis of which economic agents react may, under certain conditions, be assumed to be the statistical expectations of the variables in question, conditional on observations of past history. If these variables are generated by time-series processes, such as those discussed in this entry, time-series methods may be used to derive expressions for the MMSE forecasts for any relevant future period; these MMSE forecasts are, by a well-known result, the aforementioned conditional expectations.

An econometric definition of causality based on time-series concepts has been developed by Granger (1969) and extended by Sims (1972). Let (xt, yt) be a pair of vectors of observations on some economic time series, and let Ωt−1 be the information available up to time t − 1, which includes \( \left\{\left({x}_{t-1},{y}_{t-1}\right),\left({x}_{t-2},{y}_{t-2}\right),\dots \right\} \). Granger gives the following definitions in terms of conditional variances:

Definition 1

x causes y if and only if

$$ {\sigma}^2\left({y}_t|{\Omega}_{t-1}\right)<{\sigma}^2\left({y}_t|{\Omega}_{t-1}-\left\{{x}_{t-1},{x}_{t-2},\dots \right\}\right), $$

where \( {\Omega}_{t-1}-\left\{{x}_{t-1},{x}_{t-2},\dots \right\} \) is the information set omitting the past of the series xt.

Definition 2

x causes y instantaneously if and only if

$$ {\sigma}^2\left({y}_t|{\Omega}_{t-1},{x}_t\right)<{\sigma}^2\left({y}_t|{\Omega}_{t-1}\right). $$

It may happen both that x causes y and that y causes x; then x and y are related by a feedback system. In applications, (xt, yt) is generally assumed to be generated by a multivariate ARMA process, and Ωt is assumed to consist only of the past history of (xt, yt). Since ARMA models are applicable only to weakly stationary time series, it must further be assumed that any transformation necessary to achieve stationarity is causality preserving. Granger's (1969) test for causal association is based on a multivariate AR representation, while Sims (1972) bases his on an equivalent MA representation. Sims also introduces a related regression-based test which makes use of both future and past values of the series xt in relation to the current value of yt. Pierce and Haugh (1977) show that causality may also be tested in univariate representations of the series. Feige and Pierce (1979) and Lütkepohl (1982) show that the direction of causality so defined may be sensitive to the transformations used to achieve stationarity, and to the definition of the information set.
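
Definition 1 suggests the familiar regression implementation: regress yt on its own lags, then add lags of xt and ask whether the residual variance falls significantly. The sketch below forms the standard F statistic; the name granger_f_stat and the fixed lag length p are illustrative, and a careful application would also attend to lag selection and p-values.

```python
import numpy as np

def granger_f_stat(x, y, p):
    """Test of 'x causes y' in the sense of Definition 1: compare the
       restricted regression of y_t on its own p lags with the
       unrestricted one adding p lags of x, via the standard F statistic
       with (p, n - k) degrees of freedom."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    T = len(y)
    Y = y[p:]
    lags_y = np.column_stack([y[p - i:T - i] for i in range(1, p + 1)])
    lags_x = np.column_stack([x[p - i:T - i] for i in range(1, p + 1)])
    const = np.ones((T - p, 1))

    def ssr(X):
        coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ coef
        return resid @ resid

    ssr_r = ssr(np.hstack([const, lags_y]))           # past of y only
    ssr_u = ssr(np.hstack([const, lags_y, lags_x]))   # add the past of x
    n, k = T - p, 1 + 2 * p
    return ((ssr_r - ssr_u) / p) / (ssr_u / (n - k))
```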

Time series methods have also been applied to the analysis of the efficiency of capital markets (Fama 1970). The question is whether market prices fully reflect available information, for example, in a securities market. Efficiency requires that the relevant information set be that actually used by the market participants. Since the latter is inherently unobservable, tests of the efficiency of a market can be carried out only within the context of a particular theory of market equilibrium. Various alternatives lead to tests, based on AR or more general models, of the rates of return for different securities over time in the presence of shocks of various sorts which may or may not represent the introduction of new information (Ball and Brown 1968; Fama et al. 1969; Scholes 1972).

Finally, an important example of the use of time-series methods in econometrics has been put forth in the controversial revisionist views of Sargent and Sims (1977), and Sims (1980) on appropriate methods of econometric modelling. These views may be traced back to the work of T. C. Liu (1960) who argued that when only reliable a priori restrictions were imposed, most econometric models would turn out to be underidentified; furthermore, he argued that most of the exclusion restrictions generally employed, and the assumptions about serial correlation made to justify treating certain lagged values of endogenous variables as predetermined, were invalid; he concluded that only unrestricted reduced form estimation could be justified. The revisionist approach treats all variables as endogenous and, in general, places no restrictions on the parameters except the choice of variables to be included and lengths of lags. Attention in this approach is focused on the estimation of a general relationship among a relatively short list of variables rather than policy analysis and structural inference, which have been the emphasis of mainstream econometrics. As such, the approach has been mainly useful for data description and forecasting.

See Also