1 The Relationship Between Absolute and Relative Purchasing Power Parity

Republished with permission of MIT Press, from Review of Economics and Statistics, Vol. 60, No. 4 (November 1978): pp. 562–568; permission conveyed through Copyright Clearance Center, Inc.

The absolute purchasing power parity (PPP) theory asserts that the equilibrium exchange rate (number of units of domestic currency per unit of standard currency) is determined by the ratio of the price level of the domestic country to the price level of the standard country.1 This ratio is itself called the absolute PPP. The relative PPP theory states that the ratio of the equilibrium exchange rate in a current period (t) to the equilibrium exchange rate in a base period (o) is determined by the ratio of the domestic country’s price index in period t to the standard country’s price index in period t, where both indexes are measured relative to period o.

Suppose that the absolute PPP theory is fulfilled in both periods t and o. Then the relative PPP theory may be restated as follows: the ratio of absolute PPP in period t to absolute PPP in period o is determined by the ratio of the domestic country’s price index to the standard country’s price index, where both indexes are measured in period t relative to period o. The question immediately arises, however, whether the relative PPP theory has now become a truism. Is the current/base-period absolute-PPP ratio identically equal to the domestic/standard-country price-index ratio? It is the purpose of this article, first, to demonstrate that the restated relative PPP theory is not a truism and, second, to provide an empirical test of this interpretation of the PPP hypothesis.

The restatement of the relative PPP theory separates its validity from that of the absolute PPP theory. Relative PPP becomes concerned only with the movement from one potential exchange-rate equilibrium to another. Whether the exchange rate is actually in equilibrium (à la PPP) at the two end points of the time period becomes the purview of the absolute PPP hypothesis, and is an issue beyond the confines of this paper. But it must be shown that the restated PPP hypothesis is an operational theory rather than a truism.

1.1 Proof That the Restated PPP Theory Is Not a Truism

Consider the following notation:

  • \(P_{j}\) = absolute purchasing power parity in period j, number of units of domestic currency per unit of standard currency

  • \(L_{j}\) = price level of the domestic country in period j

  • \(L_{j}^{s}\) = price level of the standard country in period j

  • \(I_{j}\) = price index of the domestic country in period j relative to period o

  • \(I_{j}^{s}\) = price index of the standard country in period j relative to period o

  • \(p_{ij}\) = price of commodity i in the domestic country in period j

  • \(p_{ij}^{s}\) = price of commodity i in the standard country in period j

  • \(w_{ij}\) = weight of the price of commodity i in the domestic country’s price level in period j

  • \(w_{ij}^{s}\) = weight of the price of commodity i in the standard country’s price level in period j

Then \(w_{io} \left( {w_{io}^{s} } \right)\) is the weight of the price of commodity i in the domestic (standard) country’s price index for all time periods, in particular, for periods o and t.

By definition,

$$\begin{gathered} L_{j} \equiv \sum\limits_{i} w_{ij} p_{ij} \qquad j = o,t \hfill \\ L_{j}^{s} \equiv \sum\limits_{i} w_{ij}^{s} p_{ij}^{s} \qquad j = o,t \hfill \\ P_{j} \equiv L_{j} / L_{j}^{s} \qquad j = o,t \hfill \\ I_{t} \equiv \frac{\sum\limits_{i} w_{io} p_{it}}{\sum\limits_{i} w_{io} p_{io}} \hfill \\ I_{t}^{s} \equiv \frac{\sum\limits_{i} w_{io}^{s} p_{it}^{s}}{\sum\limits_{i} w_{io}^{s} p_{io}^{s}} \hfill \\ \end{gathered}$$

The restated PPP theory is a truism if and only if

$$\frac{P_{t}}{P_{o}} \equiv \frac{I_{t}}{I_{t}^{s}},$$
(5.1)

i.e., if and only if

$$\frac{\sum\limits_{i} w_{it} p_{it} \Big/ \sum\limits_{i} w_{it}^{s} p_{it}^{s}}{\sum\limits_{i} w_{io} p_{io} \Big/ \sum\limits_{i} w_{io}^{s} p_{io}^{s}} \equiv \frac{\sum\limits_{i} w_{io} p_{it} \Big/ \sum\limits_{i} w_{io} p_{io}}{\sum\limits_{i} w_{io}^{s} p_{it}^{s} \Big/ \sum\limits_{i} w_{io}^{s} p_{io}^{s}}$$

or

$$\frac{\sum\limits_{i} w_{it} p_{it}}{\sum\limits_{i} w_{it}^{s} p_{it}^{s}} \equiv \frac{\sum\limits_{i} w_{io} p_{it}}{\sum\limits_{i} w_{io}^{s} p_{it}^{s}}$$

or

$$w_{it} = w_{io} \,\,{\text{and}}\,\,w_{it}^{s} = w_{io}^{s} \;\,{\text{for}}\;\,{\text{all}}\,\,i$$
(5.2)

What is the interpretation of (5.2)? It states that, for both the domestic and standard country, the weights of the component prices in the price level of the country are the same in the current as in the base period. For the country’s price index, of course, the weighting pattern in the current period is, by definition, equal to that in the base period. But, for the price level, equal weighting patterns in the two periods would exist only in a very special case.

Since under the PPP theory a country’s price level covers its entire output of commodities, the weights of the price level reflect the production pattern of the country. Assume the absence of money illusion, so that a mere change in units of measurement makes no difference to real economic behavior. Then the only situation in which production would be distributed among all commodities in an unvarying proportion in two time periods would be the absence of any changes in relative prices in the economy. Therefore (5.2) holds if and only if, for both the domestic and standard country, all individual prices in the country in period t are a constant multiple of the prices in period o, i.e.,

$$p_{{it}} = c \cdot p_{{io}} \,\,{\text{and}}\,\,p_{{it}} ^{s} = c^{s} \cdot p_{{io}} ^{s} \,\,{\text{for}}\,\,{\text{all}}\,\,i$$
(5.3)

where c and cs are positive constants.

What has been demonstrated? Assuming the absence of money illusion, identity (5.1) is fulfilled if and only if Eqs. (5.3) hold, i.e., there is pure inflation or deflation in both the domestic and standard country. Furthermore, for (5.1) to hold, the only permissible real change in the economies is an equiproportional increase or decrease in the production of every commodity.

The condition for the restated PPP theory to be a truism is stringent indeed. In practice, prices of individual commodities do not move uniformly with the general price level, and therefore the relative production of individual commodities (the weights of the individual prices in the price level) also changes. So Eq. (5.1) in its identity form cannot be expected to be fulfilled in the real world.
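The argument of this section can be checked numerically. The following is a minimal sketch, assuming two commodities and weights that sum to one; every price and weight in it is invented for illustration and none comes from the data used later in the article. Case A imposes conditions (5.2) and (5.3), so the two sides of (5.1) coincide; Case B lets relative prices and the domestic weights change, and the two sides diverge.

```python
def price_level(weights, prices):
    """Price level: weighted sum of commodity prices."""
    return sum(w * p for w, p in zip(weights, prices))


def ppp_ratio(w_o, w_t, p_o, p_t, ws_o, ws_t, ps_o, ps_t):
    """Left side of (5.1): P_t / P_o, using each period's own weights."""
    P_t = price_level(w_t, p_t) / price_level(ws_t, ps_t)
    P_o = price_level(w_o, p_o) / price_level(ws_o, ps_o)
    return P_t / P_o


def index_ratio(w_o, p_o, p_t, ws_o, ps_o, ps_t):
    """Right side of (5.1): I_t / I_t^s, using base-period weights only."""
    I_t = price_level(w_o, p_t) / price_level(w_o, p_o)
    Is_t = price_level(ws_o, ps_t) / price_level(ws_o, ps_o)
    return I_t / Is_t


# Hypothetical base-period data (illustrative numbers only).
w_o, p_o = [0.5, 0.5], [10.0, 20.0]    # domestic country
ws_o, ps_o = [0.4, 0.6], [1.0, 3.0]    # standard country

# Case A: pure inflation (5.3) with unchanged weights (5.2).
p_t_a = [2.0 * p for p in p_o]         # c = 2
ps_t_a = [1.5 * p for p in ps_o]       # c^s = 1.5
lhs_a = ppp_ratio(w_o, w_o, p_o, p_t_a, ws_o, ws_o, ps_o, ps_t_a)
rhs_a = index_ratio(w_o, p_o, p_t_a, ws_o, ps_o, ps_t_a)

# Case B: domestic relative prices change and domestic weights shift.
p_t_b = [12.0, 40.0]
w_t_b = [0.7, 0.3]
lhs_b = ppp_ratio(w_o, w_t_b, p_o, p_t_b, ws_o, ws_o, ps_o, ps_t_a)
rhs_b = index_ratio(w_o, p_o, p_t_b, ws_o, ps_o, ps_t_a)

print(f"case A: P_t/P_o = {lhs_a:.4f}, I_t/I_t^s = {rhs_a:.4f}")
print(f"case B: P_t/P_o = {lhs_b:.4f}, I_t/I_t^s = {rhs_b:.4f}")
```

In Case B the current-weighted price level and the base-weighted index no longer move together, which is the sense in which (5.1) becomes a testable relationship rather than an identity.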

1.2 Alternative Price-Level Concepts of PPP

Let \(P \equiv P_{t} /P_{o}\) and \(D \equiv I_{t} /I_{t}^{s}\). It has been shown that the restated relative PPP theory is not an identity, that is, P is not identically equal to D. Therefore the theory may legitimately be tested empirically, and it has the general form \(P = h\left( D \right)\), where h is an increasing function.

Two separate data sets are used to generate samples of observations on P and D.2 The first data set involves a gross domestic product (GDP) price-level concept for the absolute-PPP computation (P) and, correspondingly, the GDP deflator as the price measure for constructing variable D. The United States is the standard country. The second data set employs a cost-of-living (COL) concept of PPP and the consumer price index as the price measure, with Germany as the standard country.

There are several reasons why the GDP-concept samples are deemed superior to the COL-concept samples for the purpose of testing the PPP theory. An empirical reason is that the United States, as the dominant country in the world economy, can be construed as the optimal standard country for any broad group of domestic countries. There also exist two theoretical arguments in favor of the GDP-concept data set, one on the consumption side, the other on the production side of the economy. To the extent that the PPP theory is justified by the existence of arbitrage and substitutability of commodities in consumption (broadly construed), the price concept underlying PPP should be as comprehensive as possible. Therefore a GDP measure, encompassing all output of the economy, is preferred to a COL measure, which restricts pricing to those commodities purchased by households. On the production side, it can be argued that a unit-factor-cost concept is the most appropriate methodology for absolute PPP (Houthakker 1962, pp. 293–294). Now, under certain assumptions, a unit-factor-cost concept of PPP is equivalent to a PPP based on price levels that are production-weighted averages of commodity prices in each country, implying a GDP price-level measure for PPP (Houthakker 1962, p. 296; Officer 1974, pp. 871–872; 1976, pp. 11–12). However, for equivalence with a COL concept of PPP, i.e., with the use of household consumption weights in the construction of price levels, additional—and more stringent—assumptions are required (Officer 1976, pp. 12–13).

On the other hand, the theoretical argument in Sect. 1.1 implies that the appropriate price index for the construction of variable D is base-weighted rather than current-weighted. Yet the only available GDP price index is the current-weighted deflator. In contrast, the consumer price index, used in association with the COL-concept PPP, is base-weighted, although there may be changes in the weighting pattern at discrete points in time.

While, on balance, the GDP concept may be construed as the preferred foundation for PPP computation, data availability limits the size of the samples that can be generated on this basis. Observations on P and D are collected for 8 countries in the 1950–1955 period (that is, with 1950 as the base year and 1955 as the current period), 4 countries in the 1967–1970 period, and 4 in the 1950–1970 period. Thus there are three distinct samples, and the countries composing each sample are listed in the first column of Table 5.1.

Table 5.1 Errors of strong PPP and naive models: GDP concept

Use of the COL concept enables the assembling of a much larger data set. Unfortunately, there is no uniformity in base and current periods. So samples are delineated on the basis of the duration between base and current period: (i) less than 10 years, a 15-observation sample, (ii) 10 to 19 years, 9 observations, and (iii) 20 years or more, 6 observations. For each sample, the observations are identified by country, base period, and current period in the first column of Table 5.2.

Table 5.2 Errors of strong PPP and naive models: COL concept

1.3 Empirical Analysis of the Strong PPP and Naive Models

The “strong PPP model” is obtained by the inclusion of an error term in Eq. (5.1):

$$P = D + \epsilon_{1}$$
(5.4)

In general, an error term is denoted by a subscripted ε. The strong PPP model is tested against a corresponding naive model, called the “strong naive model”:

$$P = 1 + \epsilon_{2}$$
(5.5)

The strong naive model, in effect, predicts \(P_{t}\), the absolute PPP in period t, by \(P_{o}\), the base-period PPP. Price indexes in the domestic and standard countries are assigned no role in predicting absolute PPP in the current period relative to the base period.

A comparison of the performance of the strong PPP and strong naive models can be made by calculating their percentage errors in ex post prediction. These errors are \((P - D)/P\) and \((P - 1)/P\), respectively. They are shown in Tables 5.1 and 5.2 for the samples based on the GDP and COL concepts, respectively. A positive error implies an underestimate of P, while a negative error implies an overestimate. Under the GDP concept, the PPP model tends to overestimate, while the naive model has the opposite tendency. With a total of 16 observations over the three samples, the PPP model overpredicts P in 12 cases, while the naive model overpredicts in only 3 cases. The implication is that the relative price level between a domestic and a standard country in a current period compared with a base period tends to be less than that indicated by the corresponding ratio of price indexes between the countries. The relative PPP hypothesis tends to predict too great a change in absolute PPP on the basis of changes in price indexes.
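As an illustration of the sign convention, the sketch below computes both errors for a single observation. The values P = 1.10 and D = 1.21 are hypothetical and are not taken from Tables 5.1 or 5.2; the function name is likewise introduced here only for the example.

```python
def percentage_errors(P, D):
    """Return (strong PPP error, strong naive error) as fractions of P."""
    ppp_error = (P - D) / P        # model (5.4) predicts P by D
    naive_error = (P - 1.0) / P    # model (5.5) predicts P by 1
    return ppp_error, naive_error


# Hypothetical observation (illustrative numbers only).
ppp_err, naive_err = percentage_errors(P=1.10, D=1.21)
print(f"strong PPP error:   {100 * ppp_err:+.1f}%")
print(f"strong naive error: {100 * naive_err:+.1f}%")
```

Here the PPP error is negative (D exceeds P, an overestimate) and the naive error is positive (unity falls short of P, an underestimate), the same pattern the text reports as typical of the GDP-concept samples.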

This result does not carry over to the COL-concept samples. Except for the 20-years-or-more computational period, the PPP model underestimates P at double the rate that it overestimates it, while the naive model tends to underpredict P in all samples, and with greater overall frequency. There is no apparent reason for the expected direction of the forecast error to vary with the price-level concept of PPP.

In any event, the direction of a prediction error is less relevant than the amount of the error, especially for comparison with a naive model. Consider the GDP-concept samples first. In terms of absolute percentage errors, the PPP model is superior to the naive model (i.e., has a lower percentage error) for 10 of the 16 observations over all samples; it is inferior to the naive model for 5 observations; and there is an equal percentage error for 1 observation. For both shorter time periods (1950–1955 and 1967–1970), the PPP model is superior by a three-to-one margin under this ordinal criterion. Only for the longer time period (1950–1970) does the naive model outperform the PPP model (by a two-to-one margin, with one tie).

The ordinal superiority of the PPP model is stronger for the COL concept. Over all samples, the PPP model has a lower absolute percentage error than the alternative model in 22 of 30 observations. The PPP model is superior in 8 of 9 observations for the 10-to-19-years sample, and by a two-to-one margin in the other samples.

A cardinal measure of performance of the models is the average of the absolute values of the percentage errors. For each sample, this average is shown in Tables 5.1 and 5.2. Now the PPP model is superior to the naive model in all samples for both data sets. Especially noticeable in the GDP-concept samples are the low average percentage errors of the PPP model for the shorter time periods, with the average between 2 and 3%. This result is particularly impressive when coupled with the fact that the highest absolute error is below 7% (and the second highest below 4%) for the 1950–1955 period, and below 4% for the 1967–1970 period.
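The ordinal and cardinal criteria can be stated compactly in code. This is a sketch only: the four (P, D) pairs are hypothetical stand-ins for a sample, and the dictionary keys are names chosen here for readability.

```python
def compare_models(observations):
    """Summarize strong PPP vs. strong naive errors for a list of (P, D) pairs."""
    ppp_abs = [abs((P - D) / P) for P, D in observations]
    naive_abs = [abs((P - 1.0) / P) for P, D in observations]
    n = len(observations)
    # Ordinal criterion: count observations on which each model has the lower error.
    ppp_wins = sum(a < b for a, b in zip(ppp_abs, naive_abs))
    naive_wins = sum(a > b for a, b in zip(ppp_abs, naive_abs))
    return {
        "ppp_wins": ppp_wins,
        "naive_wins": naive_wins,
        "ties": n - ppp_wins - naive_wins,
        # Cardinal criterion: average absolute percentage error of each model.
        "ppp_mean_abs_error": sum(ppp_abs) / n,
        "naive_mean_abs_error": sum(naive_abs) / n,
    }


# Hypothetical sample of (P, D) observations (illustrative numbers only).
sample = [(1.10, 1.08), (0.95, 0.97), (1.30, 1.20), (1.02, 1.10)]
summary = compare_models(sample)
for key, value in summary.items():
    print(key, round(value, 4))
```

The two criteria need not agree: a model can win most pairwise comparisons and still have the larger average error if its few losses are large, which is why both are reported.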

In contrast, an average absolute error of about 9 3/4% marks a less prominent performance of the PPP model for the 1950–1970 period, even though this result is superior to that of the naive model, which has an average error of nearly 16 1/2%. The inferior result for the 1950–1970 period is not unexpected, because with a longer time period there is greater scope for changes in the price-quantity structure underlying a country’s price level, thus reducing the applicability of the PPP model.

Turning to the COL-concept samples, the average absolute error of the PPP model is approximately 9%, 11%, and 12%, respectively, for the three samples in order of duration of the computational period. This performance is superficially inferior to that of the PPP model for the GDP-concept samples. However, the longer duration of the computational periods under the COL data set must be considered. The average computational periods for the COL samples are 6.5, 13.8, and 20.7 years, while the computational periods for the GDP-concept samples are 3, 5, and 20 years. Interestingly enough, for the COL data set, the superiority of the PPP over the naive model increases greatly when the computational period exceeds 10 years.

1.4 Empirical Analysis of the Weak PPP and Naive Models

An alternative PPP model, the “weak PPP model,” involves a general linear relationship between P and D:

$$P = \alpha + \beta D + \epsilon_{3}$$
(5.6)

where α and β are parameters, with β positive. The corresponding naive model, designated as the “weak naive model,” again ignores any information on relative price indexes in predicting the ratio of absolute PPP in the current period to absolute PPP in the base period. Therefore only the constant and the error term remain in Eq. (5.6). Thus, letting γ be a parameter, the weak naive model is as follows:

$$P = \gamma + \epsilon_{4}$$
(5.7)

The weak PPP model allows for a general linear relationship between P and D, with an error term. In contrast, the weak naive model specifies that P is equal to a constant (not necessarily unity) plus an error term. The implication of the naive model is that D can contribute nothing to the explanation of P. A significant correlation coefficient between P and D would indicate a linear association between the two variables and consequently forestall rejection of the weak PPP model.

The correlation coefficient (r) and coefficient of determination (\(r^{2}\)) between P and D for the six samples heretofore discussed are exhibited in Table 5.3. Because a negative correlation between P and D can be eliminated on theoretical grounds (according to the weak PPP model), a one-tail test of r is appropriate. Considering first the GDP-concept samples, the correlation coefficient for 1950–1955 is significantly different from zero at the 1% level. With only two degrees of freedom, the test of r for the 1967–1970 and 1950–1970 periods must be viewed with some skepticism. For neither of these samples is r significant at the 1% level; and while r is significant at the 5% level for the 1950–1970 sample, this result is largely due to an extreme observation, that for France.3
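A one-tail test of r of the kind described here can be sketched as follows. The article does not spell out its test statistic; the sketch assumes the standard conversion of r to a t statistic with n − 2 degrees of freedom. For the four-observation samples (two degrees of freedom) Student's t has a closed-form distribution function, which the last helper uses. The data are hypothetical, built so that one large observation dominates the correlation.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def one_tail_p_df2(t):
    """Upper-tail probability of Student's t with exactly 2 degrees of freedom."""
    return 0.5 - t / (2.0 * math.sqrt(2.0 + t * t))


# Hypothetical four-country sample (illustrative numbers only).
D = [1.05, 1.20, 1.10, 1.60]
P = [1.02, 1.15, 1.12, 1.58]
r = pearson_r(D, P)
t = t_statistic(r, len(P))
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, t = {t:.2f}, one-tail p = {one_tail_p_df2(t):.4f}")
```

With so few degrees of freedom the critical values of t are large (about 2.92 at the 5% level and 6.97 at the 1% level, one-tail), so even a high r can fail the stricter test.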

Table 5.3 Correlation of P with D

Results are much better for the COL-concept samples. The correlation coefficient is uniformly significant at the 1% level. Indeed, even for a two-tail test, r continues to be significant at this level for the two longer-period samples and is significant at the 5% level for the less-than-ten-years sample.

With the objective of achieving more powerful tests of r than those provided by the individual samples, observations are pooled over these samples for each data set. While within the individual samples all observations are independent, this is not so between samples. Care must be taken to exclude observations that are dependent on other observations in the data set. The objective is to achieve a maximum-size sample of independent observations for each data set. For the GDP-concept data, this criterion involves excluding the 1950–1970 observations, resulting in a maximum-size sample of 12 observations. For the COL-concept data, 8 observations must be dropped from the longer computational periods, yielding a sample of 22 observations. Excluded observations are identified by superscript a in Tables 5.1 and 5.2.

The correlation coefficient and coefficient of determination for the maximum-size samples are presented in Table 5.3. For both samples, the correlation coefficient is significantly different from zero at the 1% level, even under a two-tail test—a result distinctly favorable for the weak PPP model.

The weak PPP and naive models have been tested against one another using correlation analysis; they may also be tested by means of regression analysis. The weak PPP model is formulated as Eq. (5.6), which can be viewed as a regression model the parameters of which, α and β, can be estimated by ordinary least-squares. The maximum-size samples from the GDP-concept and COL-concept data sets provide sample sizes (n) of 12 and 22 observations, respectively, to which Eq. (5.6) is fitted. The resulting regression lines are as follows:

$$P = \underset{(.1188)}{-.0666} + \underset{(.1118)}{1.0525}\,D$$
(5.8)
$$P = \underset{(.1699)}{-.4066} + \underset{(.1487)}{1.3948}\,D$$
(5.9)

The GDP-concept data yield Eq. (5.8) and the COL-concept data produce Eq. (5.9). Numbers in parentheses are standard errors of the estimated coefficients. The standard error of estimate (s) is 0.0322 in Eq. (5.8) and 0.1145 in Eq. (5.9), while the corrected coefficient of determination is 0.89 in (5.8) and 0.81 in (5.9).
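For readers who wish to reproduce statistics of this kind, the sketch below fits Eq. (5.6) by ordinary least squares and reports the quantities quoted above: coefficient estimates, their standard errors, the standard error of estimate s, and the corrected coefficient of determination. It uses the textbook formulas for simple regression, which are assumed here to match the article's computations; the data are hypothetical, since the underlying observations are not reproduced in this section.

```python
import math


def ols_fit(D, P):
    """Fit P = alpha + beta * D by ordinary least squares."""
    n = len(D)
    d_bar, p_bar = sum(D) / n, sum(P) / n
    sdd = sum((d - d_bar) ** 2 for d in D)
    sdp = sum((d - d_bar) * (p - p_bar) for d, p in zip(D, P))
    beta = sdp / sdd
    alpha = p_bar - beta * d_bar
    rss = sum((p - alpha - beta * d) ** 2 for d, p in zip(D, P))
    tss = sum((p - p_bar) ** 2 for p in P)
    s2 = rss / (n - 2)  # squared standard error of estimate
    return {
        "alpha": alpha,
        "beta": beta,
        "se_alpha": math.sqrt(s2 * sum(d * d for d in D) / (n * sdd)),
        "se_beta": math.sqrt(s2 / sdd),
        "s": math.sqrt(s2),
        # Corrected (degrees-of-freedom adjusted) coefficient of determination.
        "adj_r2": 1.0 - (rss / (n - 2)) / (tss / (n - 1)),
    }


# Hypothetical observations on D and P (illustrative numbers only).
D = [0.90, 1.00, 1.10, 1.20, 1.40]
P = [0.88, 1.01, 1.08, 1.22, 1.41]
fit = ols_fit(D, P)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```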

The weak naive model is represented by Eq. (5.7), which may be viewed as the following regression model:

$$P = \gamma [1] + \epsilon_{4}$$
(5.10)

where [1] is a variable identically equal to unity. Equation (5.10) is a degenerate regression equation, the parameter of which, γ, may be estimated by ordinary least-squares. The resulting estimate of γ is the mean of the dependent variable \((\overline{P})\).
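That the least-squares estimate of γ is the sample mean follows in one step. Writing \(P_{k}\) for the k-th of n observations on P (the index k is notation introduced here for the derivation), the sum of squared errors is minimized where its derivative with respect to γ vanishes:

$$\frac{d}{d\gamma }\sum\limits_{k = 1}^{n} {\left( {P_{k} - \gamma } \right)^{2} } = - 2\sum\limits_{k = 1}^{n} {\left( {P_{k} - \gamma } \right)} = 0\quad \Longrightarrow \quad \hat{\gamma } = \frac{1}{n}\sum\limits_{k = 1}^{n} {P_{k} } = \overline{P}$$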

Letting \(\hat{\alpha }\), \(\hat{\beta }\), and \(\hat{\gamma }\) denote the estimates of α, β, and γ, parameter estimation of the weak PPP and naive models may be summarized as follows:

GDP-concept sample:

$$\hat{\alpha } = - .0666;\,\hat{\beta } = 1.0525;\,\hat{\gamma } = \overline{P} = 1.0488$$

COL-concept sample:

$$\hat{\alpha } = - .4066;\,\hat{\beta } = 1.3948;\,\hat{\gamma } = \overline{P} = 1.1709$$

Equation (5.6) may be used to exposit all four models that have been investigated empirically. Each of these models may be identified by the value it assigns to the parameters of Eq. (5.6), as shown in the second and third columns of Table 5.4. For the weak PPP and naive models, where the parameters lack preassigned numerical values, the estimates derived above are used.

Table 5.4 Summary of models

The estimated regression Eqs. (5.8) and (5.9), which pertain to the weak PPP model, may be used for statistical testing of the remaining three models. As thus far only the weak PPP model has been subjected to testing with a level of significance (the correlation analysis), it is appropriate to use the estimated versions of this model for econometric testing of the other models.

The weak naive, strong PPP, and strong naive models each involve a joint hypothesis on α and β, the parameters of the weak PPP model. The respective hypotheses are indicated in the second and third columns of Table 5.4. For joint testing of α and β, the test statistic

$$F = \frac{{n\left( {\hat{\alpha } - \alpha } \right)^{2} + 2n\overline{D} \left( {\hat{\alpha } - \alpha } \right)\left( {\hat{\beta } - \beta } \right) + \sum {D^{2} \left( {\hat{\beta } - \beta } \right)^{2} } }}{{2s^{2} }}$$

has the F-distribution with (2, n − 2) degrees of freedom (Johnston 1972, pp. 28–29). The values of F computed for the (α, β) hypotheses implied by the weak naive, strong PPP, and strong naive models, respectively, are listed in the fourth and fifth columns of Table 5.4.
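The statistic can be computed directly from a sample, as in the sketch below. The hypothesized (α, β) pairs follow from Eqs. (5.4), (5.5), and (5.7): (0, 1) for the strong PPP model, (1, 0) for the strong naive model, and (\(\overline{P}\), 0) for the weak naive model. The data and function names are hypothetical; the article's own values of \(\overline{D}\) and \(\sum D^{2}\) are not given in this section.

```python
def fit_line(D, P):
    """OLS estimates (alpha_hat, beta_hat) and residual variance s^2 for P on D."""
    n = len(D)
    d_bar, p_bar = sum(D) / n, sum(P) / n
    sdd = sum((d - d_bar) ** 2 for d in D)
    beta_hat = sum((d - d_bar) * (p - p_bar) for d, p in zip(D, P)) / sdd
    alpha_hat = p_bar - beta_hat * d_bar
    rss = sum((p - alpha_hat - beta_hat * d) ** 2 for d, p in zip(D, P))
    return alpha_hat, beta_hat, rss / (n - 2)


def joint_f(D, P, alpha0, beta0):
    """F statistic for the joint hypothesis alpha = alpha0 and beta = beta0."""
    n = len(D)
    d_bar = sum(D) / n
    alpha_hat, beta_hat, s2 = fit_line(D, P)
    da, db = alpha_hat - alpha0, beta_hat - beta0
    numerator = (n * da ** 2
                 + 2 * n * d_bar * da * db
                 + sum(d * d for d in D) * db ** 2)
    return numerator / (2 * s2)


# Hypothetical observations on D and P (illustrative numbers only).
D = [0.90, 1.00, 1.10, 1.20, 1.40]
P = [0.88, 1.01, 1.08, 1.22, 1.41]
p_bar = sum(P) / len(P)

f_strong_ppp = joint_f(D, P, 0.0, 1.0)
f_strong_naive = joint_f(D, P, 1.0, 0.0)
f_weak_naive = joint_f(D, P, p_bar, 0.0)
print(f"strong PPP:   F = {f_strong_ppp:.2f}")
print(f"strong naive: F = {f_strong_naive:.2f}")
print(f"weak naive:   F = {f_weak_naive:.2f}")
```

The numerator equals the increase in the residual sum of squares when the hypothesized line replaces the fitted one, so a hypothesis far from the data yields a large F.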

Results are most striking for the GDP-concept sample. With (2, 10) degrees of freedom, the critical value for the F-distribution at the 0.05% level of significance is 17.9. Both naive models are rejected at this extremely low level of significance. In contrast, the F value for the strong PPP model is such that this model cannot be rejected even at the 40% level of significance.4 Thus, using the F-test as the criterion, the strong PPP model strongly out-performs both naive models.

For the COL-concept sample, the F-distribution has (2, 20) degrees of freedom; its critical value at the 0.05% level of significance is 11.4. Again the naive models are rejected at this very low level of significance. Indeed, the F-statistics for these models are even further in the tail of the distribution than under the GDP-concept sample. A statement similar in kind applies to the F-statistic of the strong PPP model. With F-distribution critical values of 3.49 and 5.85 at the 5% and 1% levels, respectively, the strong PPP model itself is rejected at the 5% level of significance, though it cannot be rejected at the 1% level. So the strong PPP model survives the F-test more easily under the GDP-concept sample than under the COL-concept sample.
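The critical values quoted in this discussion, and the bound used in Note 4, can be checked without tables, because an F-distribution with 2 numerator degrees of freedom has a closed-form upper tail: the probability of exceeding f with (2, m) degrees of freedom is (1 + 2f/m)^(−m/2). The sketch below inverts that expression; the function names are chosen here for illustration.

```python
import math


def f2_upper_tail(f, m):
    """P(F > f) for an F-distribution with (2, m) degrees of freedom."""
    return (1.0 + 2.0 * f / m) ** (-m / 2.0)


def f2_critical(alpha, m):
    """Critical value at significance level alpha for (2, m) degrees of freedom."""
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)


def f2_critical_limit(alpha):
    """Limit as m grows without bound: half the chi-square(2) critical value."""
    return -math.log(alpha)


print(f"(2, 10), 0.05% level: {f2_critical(0.0005, 10):.2f}")
print(f"(2, 20), 0.05% level: {f2_critical(0.0005, 20):.2f}")
print(f"(2, 20), 5% level:    {f2_critical(0.05, 20):.2f}")
print(f"(2, 20), 1% level:    {f2_critical(0.01, 20):.2f}")
print(f"(2, 10), 40% level:   {f2_critical(0.40, 10):.3f}")
print(f"(2, inf), 40% level:  {f2_critical_limit(0.40):.3f}")
```

The computed values round to the figures cited in the text, and the 40% critical value for (2, 10) degrees of freedom comes out slightly above the (2, ∞) value, consistent with Note 4.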

In summary, empirical investigation supports the relative PPP hypothesis in both strong and weak versions. The findings are especially favorable to the relative PPP theory in light of the facts that (i) no secondary variables were used to increase the explanatory power of the PPP model, (ii) complicated functional forms and lagged relationships were not adopted in an effort to increase explanatory power, and (iii) several countries outside the Western industrial mode were included in the samples.

Notes

  1. The term “standard” is used in preference to “foreign” currency or country, because this country may serve as the standard of comparison for a group of “domestic” countries.

  2. Details on data are provided in the appendix.

  3. See Table 5.1. With the value of P extremely close to that of D, coupled with by far the highest values of both these variables in the 1950–1970 sample, the observation for France is dominant in the correlation.

  4. Critical values for the F-distribution at the 40% level of significance are not published in an accessible source. However, the critical value at this level for the F-distribution with (2, ∞) degrees of freedom is readily obtained as 0.916; for this degrees-of-freedom configuration reduces the F-distribution to a chi-square distribution with 2 degrees of freedom. The corresponding critical value for the F-distribution with (2, 10) degrees of freedom is necessarily greater than 0.916.