Introduction

The measurement of prices is an important area of research in economics since prices play a central role in welfare analysis and macroeconomic comparisons across time and space. While accurate figures on inflation and cost of living are required in temporal comparisons of standard of living in a country and in adjusting poverty lines over time, such information is also essential in spatial comparisons of prices within and across countries.

One popular price metric to compare economic productivity and standards of living between countries is purchasing power parity (PPP). PPP of different countries’ currencies are required in a host of cross-country calculations such as calculating global poverty rates, comparison of GDP and consumption levels between countries and examining how global inequality has changed over time. Companies and individual investors who hold stock or bonds of foreign companies sometimes use PPP figures to predict the impact of exchange-rate fluctuations on a country's economy and thus the impact on their investment. A working definition of the PPP is provided in World Bank (2013, p.19) as 'it represents the number of currency units required to purchase the amount of goods and services which can be bought with one currency unit of the base or reference or numeraire country.' The PPP concept owes its origins to early work by Balassa (1964) and Samuelson (1964). PPP is regarded as a better indicator of the strength of a country’s currency than exchange rates, both expressed in terms of a numeraire, typically, the US dollar. The PPPs, calculated from information on prices supplied by member countries, are based on a wider basket of items than exchange rates, covering both tradable and non-tradable items. The accuracy of the PPPs used in the currency conversions in converting the international poverty line (IPL) into national currencies is an essential ingredient in the production of reliable global poverty numbers. Following the World Bank procedure for calculating global poverty, the IPL is itself dependent on PPPs since it is calculated as the mean of the national poverty lines (in local currency) of the 15 poorest countries in the world, converted to the numeraire currency at PPP. As the number of countries included in the global calculations has grown over the years, so has the scale and importance of the PPP calculations undertaken by the International Comparison Program (ICP). 
The last two rounds of the ICP for calculating PPPs were carried out in 2005 (published in World Bank 2008) and in 2011 (published in World Bank 2015). With significant revision in the PPP estimates over the ICP rounds, the IPL that is used in the global poverty calculations has also been revised. For example, with the recent release of the PPP estimates from the 2011 ICP round, the IPL has been revised from $1.25 a day in 2005 PPPs to $1.90 a day in 2011 PPPs.

Most of these international comparisons treat a whole country as a single entity and ignore the spatial dimension within the country. They ignore the fact that in large countries, such as Brazil, China, Vietnam and India, there is much greater variation in prices and consumer preferences between states or provinces than there is between several of the smaller countries that figure in the ICP real income or inequality comparisons. There is now mounting evidence on spatial variation in prices within a country. The variation in the PPP of a currency inside a large country can be attributed to three related but conceptually different factors: (a) intra-national spatial heterogeneity in preferences, (b) differences in prices and (c) spatial differences in household size and composition. In large countries, the combined impact of these three factors may lead to high spatial heterogeneity in the PPP of the country’s currency. The assumption of a single PPP restricts the usefulness of the methodology adopted in such countries. Within a country, the measurement of regional differences in consumer price levels is important to policy-makers in business, government and academics. Estimates of the magnitude of regional price differences are needed in comparisons of real income, standards of living or consumer expenditure patterns across regions. Moreover, in country-wide calculations of inequality and poverty, exclusive reliance on nominal income or expenditure without correcting for price differences between regions will bias the estimates. What is required is not an aggregate price deflator that is invariant across regions since that will leave the inequality and poverty magnitudes unchanged but spatial prices as expenditure deflators. 
Spatial prices can be used in (a) constructing the spatially differentiated poverty lines that take into account spatial differences in prices when calculating the cost of meeting the subsistence needs, (b) designing the transfer of resources from the centre to the states that maintain their real (i.e. spatial price deflated) value of the transferred amount, (c) designing minimum wages that vary between states and between rural and urban areas, (d) calculating the ‘dearness allowance’, ‘house rent allowance’, etc. state-wise and (e) helping a potential migrant decide whether to move from one state to another. Spatial heterogeneity in prices is expected to have its impact on the estimates of income (expenditure) and price elasticities of items of consumption. These estimates are important from the point of view of the policy planner as they have implications in terms of imposition of commodity taxes. Price elasticity also plays a crucial role in the pricing decisions of the business firms and the government when it regulates prices. For example, for managers, a key point in the discussions of demand is what happens when they raise prices for their products and services. It is important to know the extent to which a percentage increase in unit price will affect the demand for a product. With elastic demand, total revenue will decrease if the price is raised. With inelastic demand, total revenue will increase if the price is raised.

The temporal element of price movements within a single country has always attracted the bulk of economists' attention, especially for the measurement of inflation. However, the measurement of spatial variation in prices within a country has generally proceeded separately from that of temporal movements in prices in the country as a whole. In the case of large heterogeneous countries and long time periods, the spatial and temporal aspects will interact to produce large regional differences in inflation over time.

While the PPPs discussed above provide an overall picture of purchasing power for a national/subnational region, the contribution of the items comprising the overall index is not apparent from the overall value. Yet in terms of policy implication, it may be important to identify the items that are major contributors to differential purchasing power of a country’s currency unit across its regions. One may, therefore, be interested in individual item-specific PPPs and their variations. This variation could be for a particular item over space/time (e.g. rural–urban comparison) and/or across items given space/time (e.g. food PPP may not be the same as non-food PPP). The variation in PPPs across items, if present, will result in a variation in the overall PPP between households because of variation in household expenditure patterns.

In what follows, we present a review of the literature on various aspects of price comparisons discussed above (Footnote 1). The plan of the paper is as follows. Section 2 describes the alternative methods that have been used in calculating the price indices for international comparisons and their applications; Sect. 3 discusses subnational PPPs; Sect. 4 presents some results on the spatial–temporal aspect; and Sect. 5 discusses determination of specific PPPs. Finally, Sect. 6 concludes the paper.

Alternative PPP estimation procedures and applications: international comparisons

In view of its importance, the methodologies adopted to calculate the PPP have received considerable critical scrutiny. For example, Hill (2000) and Almas (2012) analyse and quantify the PPP bias in the widely used Penn World Table of incomes of various countries. One of the most prominent methods adopted in the PPP calculations has been the Country Product Dummy Method (CPD), due to Summers (1973), that is based on the idea of hedonic price regressions and was originally proposed to deal with the problem of missing observations in international price comparisons. The CPD method has been analysed and extended by Diewert (2005) and Rao (2005). Coondoo et al. (2004) extend the CPD methodology by using it in conjunction with the idea of a ‘quality or price equation', due to Prais and Houthakker (1955), to calculate spatial prices in the Indian context. The methodology proposed by Coondoo et al. (2004) has been used in modified form in the cross-country context by Deaton et al. (2004) to calculate PPP rates between India and Indonesia. The latter study is not based on any preference consistent 'complete' demand system. In contrast, Oulton (2012) takes an expenditure function-based approach, but does not consider the spatial dimension within each country in the cross-country expenditure comparisons.

We discuss below various methods that have been used to calculate the PPPs.

The ICP methodology—GEKS Index

The ICP distinguishes between ‘below basic headings’ and ‘above basic headings’ in the procedures it uses to calculate the PPP. A full description of the ICP methodology is contained in World Bank (2013)—see, in particular, the contributions by Rao (Chapters 1, 4) and Diewert (Chapters 5, 6) in that volume. The ICP follows a hierarchical approach for estimating the PPPs. Basic headings (BH) are the lowest level at which the PPPs are estimated. The BH PPPs are then aggregated to calculate PPPs for different uses in cross-country comparisons. In this study, we will restrict ourselves to the PPP estimation procedure above the BH levels, building on the prices constructed from below the BH levels. While the unweighted CPD method (described below) is used by the ICP below the BH level to deal with the problem of missing price information, the commonly used methods of aggregation for computing PPPs for GDP and other major aggregates above the BH level are the Gini-Elteto-Koves and Szulc (GEKS), Ikle, Geary–Khamis and the Rao or weighted CPD methods.

An important principle that multilateral PPP estimation ought to satisfy is the transitivity principle which is as follows:

$${\text{PPP}}_{jk} = {\text{PPP}}_{jm } \cdot {\text{PPP}}_{mk}.$$
(1)

In words, the PPP between countries j and k can be obtained as the product of the PPP between j and m and that between m and k. This property guarantees the level of internal consistency required in international comparisons. When PPPs are based on a single product, this property is guaranteed for simple price indices such as the price relative. However, it does not hold in general when multilateral comparisons involve many products. For this reason, the ICP uses the GEKS method above the BH level.

The GEKS method is a generic method which generates transitive indexes from a matrix of binary indexes which satisfy the country reversal test but not transitivity. Let \(I_{jk}\) represent a price index (or PPP) for country k with country j as base (j, k = 1, 2, …, M) such that \(I_{jk} \cdot I_{kj} = 1\). Then, the GEKS index is given by:

$${\text{GEKS}}_{jk} = \mathop \prod \limits_{l = 1}^{M} \left( {I_{jl} \cdot I_{lk} } \right)^{\frac{1}{M}}.$$
(2)

The GEKS index can be implemented once the binary index number formula to compute \(I_{jk}\) is chosen. The Fisher binary index is the most commonly used index. It is the square root of the product of the Laspeyres and Paasche price indices and is given by (Footnote 2):

$$I_{jk} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{N} p_{ik} q_{ij} }}{{\mathop \sum \nolimits_{i = 1}^{N} p_{ij} q_{ij} }} \times \frac{{\mathop \sum \nolimits_{i = 1}^{N} p_{ik} q_{ik} }}{{\mathop \sum \nolimits_{i = 1}^{N} p_{ij} q_{ik} }}},$$
(3)

where N is the number of commodities and \({p}_{ik}\) is the price of commodity i in country k.
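Equations (2) and (3) can be illustrated with a minimal Python sketch. The price and quantity figures below are hypothetical and chosen only for illustration, and the function names `fisher` and `geks` are not taken from any ICP software. The sketch builds the matrix of Fisher binary indices and then applies the GEKS formula, so that the country reversal and transitivity properties can be checked numerically.

```python
import math


def fisher(p_j, q_j, p_k, q_k):
    """Fisher price index for country k with country j as base, as in Eq. (3)."""
    laspeyres = (sum(pk * qj for pk, qj in zip(p_k, q_j))
                 / sum(pj * qj for pj, qj in zip(p_j, q_j)))
    paasche = (sum(pk * qk for pk, qk in zip(p_k, q_k))
               / sum(pj * qk for pj, qk in zip(p_j, q_k)))
    return math.sqrt(laspeyres * paasche)


def geks(prices, quantities):
    """Return the Fisher matrix I and the GEKS matrix G of Eq. (2).

    prices[j][i] and quantities[j][i] refer to commodity i in country j.
    """
    M = len(prices)
    I = [[fisher(prices[j], quantities[j], prices[k], quantities[k])
          for k in range(M)] for j in range(M)]
    # Geometric mean over all bridge countries l of I_jl * I_lk.
    G = [[math.exp(sum(math.log(I[j][l] * I[l][k]) for l in range(M)) / M)
          for k in range(M)] for j in range(M)]
    return I, G


# Hypothetical data: 3 countries (rows), 3 commodities (columns).
prices = [[1.0, 2.0, 3.0], [10.0, 25.0, 20.0], [0.5, 0.8, 2.5]]
quantities = [[10.0, 5.0, 2.0], [4.0, 6.0, 8.0], [12.0, 3.0, 1.0]]

fisher_matrix, geks_matrix = geks(prices, quantities)

# Compare the direct GEKS index for (0, 2) with the chained one through country 1.
direct = geks_matrix[0][2]
chained = geks_matrix[0][1] * geks_matrix[1][2]
print(direct, chained)
```

The direct and chained GEKS values agree up to rounding, whereas the same comparison on `fisher_matrix` need not agree.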

The Geary–Khamis (GK) Index

Let \(p_{ij}\) and \(q_{ij}\) denote the price and quantity of commodity i for country j, i = 1, 2, …, N and j = 1, 2, …, M. Let \(P_{i}\) and \({\text{PPP}}_{j}\), respectively, denote the international price of the ith commodity and the purchasing power parity of the jth currency. The Geary–Khamis method defines the international prices and the purchasing power parities through the following system of (M + N) equations:

$$P_{i} = \mathop \sum \limits_{j = 1}^{M} \left( {\frac{{ q_{ij} }}{{\mathop \sum \nolimits_{j = 1 }^{M} q_{ij} }} \times \frac{{p_{ij} }}{{{\text{PPP}}_{j} }}} \right),\quad {\text{PPP}}_{j} = \frac{{\mathop \sum \nolimits_{i = 1}^{N} p_{ij} q_{ij} }}{{\mathop \sum \nolimits_{i = 1}^{N} P_{i} q_{ij} }}.$$
(4)

In general, the above system of equations, a set of (M + N) linear homogeneous equations in as many unknowns, has a unique positive solution for the \(P_{i}\)’s and \(PPP_{j}\)’s apart from an undetermined scalar multiplicative factor [see Geary (1958), Rao (1971) and Khamis (1972)]. As defined above, the GK method is multilateral since the ‘international price’, \({P}_{i },\) is defined in (4) as the quantity-weighted average of prices in all the countries. It is possible, however, to define a bilateral GK with the ‘international price’ defined as the weighted average of only the countries being compared. While multilateral GK is transitive, bilateral GK is not. However, multilateral GK has the disadvantage of violating the ‘characteristicity’ requirement of Drechsler (1973) that stipulates that the PPP between two countries should depend on the prices and expenditures in those two countries alone.
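One common way to solve the system in (4) numerically is to iterate between its two blocks of equations until the PPPs stop changing. The sketch below does this in plain Python. The data are hypothetical. Fixing the undetermined scalar by setting the PPP of a base country to one, and the starting values, tolerance and iteration limit, are assumptions made here for illustration rather than part of the method as described above.

```python
def geary_khamis(prices, quantities, base=0, tol=1e-12, max_iter=1000):
    """Iteratively solve the Geary-Khamis system of Eq. (4).

    prices[j][i] and quantities[j][i] refer to commodity i in country j.
    The PPP of the `base` country is normalised to one (an assumption used
    to pin down the undetermined scalar factor).
    """
    M, N = len(prices), len(prices[0])
    total_q = [sum(quantities[j][i] for j in range(M)) for i in range(N)]
    expenditure = [sum(prices[j][i] * quantities[j][i] for i in range(N))
                   for j in range(M)]

    def international_prices(ppp):
        # Quantity-weighted average of PPP-converted national prices.
        return [sum(quantities[j][i] / total_q[i] * prices[j][i] / ppp[j]
                    for j in range(M)) for i in range(N)]

    ppp = [1.0] * M
    for _ in range(max_iter):
        P = international_prices(ppp)
        new = [expenditure[j] / sum(P[i] * quantities[j][i] for i in range(N))
               for j in range(M)]
        new = [value / new[base] for value in new]
        change = max(abs(a - b) for a, b in zip(new, ppp))
        ppp = new
        if change < tol:
            break
    return international_prices(ppp), ppp


# Hypothetical data: 3 countries (rows), 3 commodities (columns).
prices = [[1.0, 2.0, 3.0], [10.0, 25.0, 20.0], [0.5, 0.8, 2.5]]
quantities = [[10.0, 5.0, 2.0], [4.0, 6.0, 8.0], [12.0, 3.0, 1.0]]

gk_prices, gk_ppp = geary_khamis(prices, quantities)
print(gk_ppp)
```

Because each country receives a single PPP, the implied bilateral index between j and k is the ratio of their two PPPs, which is transitive by construction.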

The equally weighted Geary–Khamis (EWGK) index

Given that the GK index gives greater weight to the price vectors of larger countries when determining the reference price vector, resulting in the 'Gerschenkron effect' explained above, an equally weighted variant of the index has been proposed (Footnote 3).

The equally weighted Geary–Khamis method defines the international prices and the purchasing power parities through the following system of (M + N) equations:

$$P_{i} = \mathop \sum \limits_{j = 1}^{M} \left( {\frac{{ w_{ij} }}{{\mathop \sum \nolimits_{j = 1 }^{M} w_{ij} }}\frac{{p_{ij} }}{{{\text{PPP}}_{j} }}} \right),\quad {\text{PPP}}_{j} = \frac{{\mathop \sum \nolimits_{i = 1}^{N} p_{ij} q_{ij} }}{{\mathop \sum \nolimits_{i = 1}^{N} P_{i} q_{ij} }},$$
(5)

where \(w_{ij}\) denotes the share of good i in the expenditure of country j.

The CPD PPP

The Country Product Dummy (CPD) PPPs are estimated from the following equation:

$$y_{nc} \,\, \equiv \,\ln p_{nc} = \alpha_{1}^{{}} D_{1} + \alpha_{2}^{{}} D_{2} + \cdots + \alpha_{M}^{{}} D_{M} + \eta_{1}^{{}} D_{1}^{*} + \eta_{2}^{{}} D_{2}^{*} + \cdots + \eta_{N}^{{}} D_{N}^{*} + v_{nc} ,$$
(6)

where \(D_{c}\) (c = 1, 2, …, M) and \(D_{n}^{*}\) (n = 1, 2, …, N) are, respectively, country and commodity dummy variables and \(v_{nc}\)’s are random disturbance terms which are independently and identically (normally) distributed with zero mean and variance \(\sigma^{2}\).

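Under complete price information, comparisons of price levels between two countries c and d, represented by \({\text{PPP}}_{cd}\), can be derived from the estimated country coefficients. Since Eq. (6) is specified in the logarithms of prices, the PPP is the exponential of the difference between the two coefficients, which reduces to the geometric mean of the price relatives:

$${\text{PPP}}_{cd} = \exp \left( {\hat{\alpha }_{d} - \hat{\alpha }_{c} } \right) = \prod\limits_{n = 1}^{N} {\left[ {\frac{{p_{nd} }}{{p_{nc} }}} \right]}^{1/N}.$$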
(7)

However, Rao (1995), in the spirit of the standard index number approach, proposed that a more appropriate procedure would be to find estimates of the parameters that are likely to track the more important commodities more closely. This is achieved by minimising a weighted residual sum of squares, with each observation weighted according to the expenditure share of the commodity in a given country (Footnote 4).

Thus, the generalised CPD method suggests that estimation of Eq. (6) is conducted after weighting each observation according to its value share. This is equivalent to the application of ordinary least squares after transforming the equation pre-multiplied by \(\sqrt{{w}_{nc}}\), where \({w}_{nc}\) is the budget share of item n in country c. The equation thus becomes:

$$\sqrt {w_{nc} } \ln p_{nc} = \sqrt {w_{nc} } \mathop \sum \limits_{i = 1}^{M} \alpha_{i} D_{i} + \sqrt {w_{nc} } \mathop \sum \limits_{j = 1}^{N} \eta_{j} D_{j}^{*} + u_{nc}.$$
(8)

Rao (2005) has shown that the PPPs resulting from the least squares estimation of the above weighted CPD equation are equivalent to the solution of an expenditure-share weighted log-change system. The Rao system is given by:

$$\begin{aligned} & {\text{ PPP}}_{d} = \mathop \prod \limits_{n = 1}^{N} \left( {\frac{{p_{nd} }}{{P_{n} }}} \right)^{{w_{nd} }} ,\;{\text{setting}}\;{\text{one}}\;{\text{country}}\;{\text{as}}\;{\text{the}}\;{\text{numeraire}}, \\ & \quad {\text{and}}\;P_{n} = \mathop \prod \limits_{c = 1}^{M} \left( {\frac{{p_{nc} }}{{PPP_{c} }}} \right)^{{\frac{{w_{nc} }}{{\mathop \sum \nolimits_{c = 1}^{M} w_{nc} }}}} \\ \end{aligned},$$
(9)

where \(P_{n}\), n = 1, 2, …, N, are the international average prices of commodities (in the numeraire country’s currency) and \({\text{PPP}}_{d}\) is the PPP of country d with respect to the numeraire country. Note that \(\sum_{n=1}^{N}{w}_{nd}=1,\) the sum of budget shares in country d.
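The Rao system in (9) can be solved by alternating between its two equations in logarithms, which amounts to alternating weighted least squares updates of the country and commodity effects of the weighted CPD regression. The sketch below is a minimal illustration in plain Python under stated assumptions: the prices and budget shares are hypothetical, complete price data are assumed, and the base country, tolerance and iteration limit are choices made here. With equal budget shares the result coincides with the geometric mean of price relatives from the unweighted CPD model.

```python
import math


def rao_system(prices, shares, base=0, tol=1e-12, max_iter=10000):
    """Iterate the two equations of Eq. (9) in logarithms.

    prices[c][n] and shares[c][n] refer to commodity n in country c; the
    shares of each country are assumed to sum to one.
    """
    M, N = len(prices), len(prices[0])
    ln_p = [[math.log(prices[c][n]) for n in range(N)] for c in range(M)]

    def ln_international_prices(ln_ppp):
        # Share-weighted geometric mean of PPP-converted prices.
        return [sum(shares[c][n] * (ln_p[c][n] - ln_ppp[c]) for c in range(M))
                / sum(shares[c][n] for c in range(M)) for n in range(N)]

    ln_ppp = [0.0] * M
    for _ in range(max_iter):
        ln_P = ln_international_prices(ln_ppp)
        new = [sum(shares[c][n] * (ln_p[c][n] - ln_P[n]) for n in range(N))
               for c in range(M)]
        new = [value - new[base] for value in new]  # numeraire country
        change = max(abs(a - b) for a, b in zip(new, ln_ppp))
        ln_ppp = new
        if change < tol:
            break
    ln_P = ln_international_prices(ln_ppp)
    return [math.exp(v) for v in ln_ppp], [math.exp(v) for v in ln_P]


# Hypothetical data: 3 countries (rows), 3 commodities (columns).
prices = [[1.0, 2.0, 3.0], [10.0, 25.0, 20.0], [0.5, 0.8, 2.5]]
equal_shares = [[1.0 / 3.0] * 3 for _ in range(3)]
budget_shares = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.6, 0.2, 0.2]]

ppp_equal, _ = rao_system(prices, equal_shares)
ppp_weighted, intl_prices = rao_system(prices, budget_shares)
print(ppp_equal, ppp_weighted)
```

Comparing `ppp_equal` with `ppp_weighted` shows how much the expenditure weights move the PPPs relative to the unweighted CPD case.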

The basic CPD model, given by Eq. (6) above, has the advantage that, as it is based on stochastic formulation, it allows the use of a range of econometric tools and techniques that are not normally used in the computation of PPPs. In particular, the regression approach provides estimated standard errors for all the coefficients. An added advantage is that the stochastic formulation of CPD given by (6) and (8) can be extended to allow regionally correlated price movements via admitting spatially correlated errors. The empirical literature on subnational and cross-country PPPs is generally based on the assumption that there is no interdependence between the price movements in various regions of a country or between that in various countries. There is some evidence to the contrary in early work reported by Aten (1996) on subnational PPPs and by Rao (2001) on cross-country PPPs.

The spatial CPD model is given by:

$$y_{cn} = \alpha_{1} D_{1} + \alpha_{2} D_{2} + \cdots + \alpha_{M} D_{M} + \beta_{1} D_{1}^{*} + \beta_{2} D_{2}^{*} + \cdots + \beta_{N} D_{N}^{*} + \varepsilon_{cn} ,$$
(10)

where \({D}_{c}\) and \({D}_{n}^{*}\) are, respectively, the country and commodity (product) dummy variables.

Here, \(\varepsilon\), the vector of \(\varepsilon_{cn}\)’s, is specified as follows:

$$\varepsilon = \rho S\varepsilon + \eta ,$$

where ρ is the overall spatial correlation and \(\eta_{cn}\)’s are i.i.d. with mean 0 and variance \(\sigma_{\eta}^{2}\).

S is a spatial weight matrix of order NM × NM. The spatial weight matrix can be of various types depending on the neighbourhood criteria, based on distance, in general. For example,

$$\begin{aligned} S_{ij} & = 1\quad {\text{if}}\;i\;{\text{and}}\;j\;{\text{belong}}\;{\text{to}}\;{\text{the}}\;{\text{same}}\;{\text{‘neighbourhood’}}\;{\text{and}}\;i \ne j, \\ S_{ij} & = 0\quad {\text{otherwise}}. \\ \end{aligned}$$

ρ can be estimated using maximum likelihood methods in the joint estimation of the two equations.
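To make the error structure concrete, the sketch below builds a small spatial weight matrix of order NM × NM and generates errors that satisfy ε = ρSε + η. Several choices here are assumptions for illustration only: observation (c, n) is stored at index c·N + n, two observations are neighbours when they refer to the same commodity in adjacent countries, the `neighbours` map is hypothetical, and the rows of S are normalised to sum to one (the binary matrix in the example above is not). The sketch simulates the errors and does not perform the maximum likelihood estimation of ρ.

```python
import random


def spatial_weights(M, N, neighbours):
    """Row-normalised weight matrix of order NM x NM.

    Observation (c, n) sits at index c * N + n. Observations are linked when
    they share commodity n and their countries are listed as neighbours.
    """
    size = M * N
    S = [[0.0] * size for _ in range(size)]
    for c in range(M):
        for d in neighbours[c]:
            for n in range(N):
                S[c * N + n][d * N + n] = 1.0
    for row in S:
        total = sum(row)
        if total > 0:
            for k in range(size):
                row[k] /= total
    return S


def spatial_errors(S, rho, eta, n_iter=200):
    """Solve eps = rho * S * eps + eta by fixed-point iteration (|rho| < 1)."""
    eps = list(eta)
    for _ in range(n_iter):
        eps = [rho * sum(s * e for s, e in zip(row, eps)) + h
               for row, h in zip(S, eta)]
    return eps


random.seed(0)
M, N = 4, 3                                          # countries, commodities
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # hypothetical adjacency
S = spatial_weights(M, N, neighbours)
eta = [random.gauss(0.0, 0.1) for _ in range(M * N)]
eps = spatial_errors(S, rho=0.5, eta=eta)
```

Row normalisation keeps the fixed-point iteration convergent for any |ρ| < 1; with a binary S the admissible range of ρ depends on the eigenvalues of S.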

The True Cost of Living Index (TCLI) as a PPP

The TCLI, proposed by Konus (1939), is the ratio of the minimum expenditures to obtain the same standard of living, given by the indirect utility indicator, u, in two price situations. If we denote \({p}^{1}\) and \({p}^{2}\) as the price vector in initial and given years, respectively, and c (u, p) as the cost or expenditure function, then the TCLI in year 2 with year 1 as base is given by:

$$P(p^{1} ,p^{2} ,\overline{u}) = \frac{{{\text{c }}\left( {\overline{u},{ }p^{2} } \right)}}{{{\text{c }}\left( {\overline{u},{ }p^{1} } \right)}},$$
(11)

where \(\overline{u}\) is the reference utility level. Unless preferences are homothetic, the TCLI as defined in (11) depends on the reference utility level. The TCLI was proposed in the temporal context to measure price changes over time. If, however, we define \(p^{1}\) and \(p^{2}\) as price vectors in two countries, 1 and 2, where 1 is the reference country, 2 is the comparison country and \(\overline{u}\) is the common utility level in the two countries, then we can view \(P(p^{1}, p^{2}, \overline{u})\) as the PPP of country 2 with respect to country 1. See Majumder et al. (2015a, b) for an application of the TCLI in computing the PPP between India and Vietnam.

To make Eq. (11) operational, we need to assume specific functional form for the cost or expenditure function, c(u, p). Following Coondoo et al. (2011), we assume that the underlying expenditure function is the Quadratic Logarithmic (QL) system. A specific form of the QL system is the Quadratic Almost Ideal Demand System (QAIDS) due to Banks et al. (1997). The QAIDS expenditure function is given by:

$$C\left( {u,p} \right) = a\left( p \right) \cdot \exp \left( { \frac{b\left( p \right)}{{(1/\ln u) - \lambda \left( p \right)}}} \right),$$
(12)

where p is the price vector, \(a\left( p \right)\) is a homogeneous function of degree one in prices, \(b\left( p \right)\) and \(\lambda \left( p \right)\) are homogeneous functions of degree zero in prices, and u denotes the level of utility.

The corresponding True Cost of Living Index (TCLI) in logarithmic form comparing price situation \(p^{2}\) with price situation \(p^{1}\) is given by:

$$\ln P\left( {p^{1} ,p^{2} ,u^{*} } \right) = \left[ {\ln a\left( {p^{2} } \right) - \ln a\left( {p^{1} } \right)} \right] + \left[ {\frac{{b\left( {p^{2} } \right)}}{{\frac{1}{{\ln u^{*} }} - \lambda \left( {p^{2} } \right)}} - \frac{{b\left( {p^{1} } \right)}}{{\frac{1}{{\ln u^{*} }} - \lambda \left( {p^{1} } \right)}}} \right],$$
(13)

where \(u^{*}\) is the reference utility level. Note that 'price situation' refers to the prices prevailing in a particular country in a given year.

Therefore, if we change the notation from 1 and 2 to c and d, respectively, then PPP of country d with respect to country c is given by:

$${\text{PPP}}_{cd} = \frac{{a\left( {p^{d} } \right)}}{{a\left( {p^{c} } \right)}}{\exp}\left[ {\frac{{b\left( {p^{d} } \right)}}{{\frac{1}{{\ln u^{*} }} - \lambda \left( {p^{d} } \right)}} - \frac{{b\left( {p^{c} } \right)}}{{\frac{1}{{\ln u^{*} }} - \lambda \left( {p^{c} } \right)}}} \right].$$
(14)

It is worth noting that (14), which involves binary comparison between c and d, yields a transitive index, unlike the Fisher and Tornqvist price indices. Note, also, that to make (14) operational, we need to estimate the parameters of the functions, a(p), b(p), and λ(p) from the demand systems in the two countries, c and d. This is typically done from the information on expenditures on various items by the households contained in the household expenditure surveys. Demand system estimation also requires price information at the household level that is not usually available in the expenditure surveys. Majumder et al. (2015a, b) overcome the problem by using unit values, obtained by dividing the item expenditures by item quantities with appropriate corrections for quality and demographic effects. Coondoo et al. (2011), on the other hand, circumvent the problem by proposing a three-step computation procedure that does not require any explicit price information.
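The transitivity of (14) can be checked numerically once values of a(p), b(p) and λ(p) are available for each country. The sketch below evaluates Eq. (14) in plain Python. The parameter values and the reference utility level are hypothetical numbers chosen for illustration, not estimates from any study, and the function name `tcli_ppp` is introduced here.

```python
import math


def tcli_ppp(a_c, b_c, lam_c, a_d, b_d, lam_d, u_star):
    """PPP of country d with respect to country c from Eq. (14)."""
    inv_ln_u = 1.0 / math.log(u_star)
    return (a_d / a_c) * math.exp(b_d / (inv_ln_u - lam_d)
                                  - b_c / (inv_ln_u - lam_c))


# Hypothetical values of (a(p), b(p), lambda(p)) for three countries.
# Country "c" plays the role of the base, with b = lambda = 1.
params = {"c": (1.0, 1.0, 1.0), "d": (12.0, 0.9, 1.1), "e": (3.0, 1.2, 0.8)}
u_star = 1.5  # hypothetical reference utility level

ppp_cd = tcli_ppp(*params["c"], *params["d"], u_star)
ppp_de = tcli_ppp(*params["d"], *params["e"], u_star)
ppp_ce = tcli_ppp(*params["c"], *params["e"], u_star)
print(ppp_cd * ppp_de, ppp_ce)
```

The chained and direct values coincide because each country enters (14) only through its own cost at the common utility level, so the index is a ratio of two country-specific terms.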

The Coondoo et al. (2011) Procedure

The budget share functions corresponding to the cost function (12) are of the form

$$w_{i} = a_{i} \left( p \right) + b_{i} \left( p \right)\ln \left( {\frac{x}{a\left( p \right)}} \right) + \frac{{\lambda_{i} \left( p \right)}}{b\left( p \right)}\left( {\ln \frac{x}{a\left( p \right)}} \right)^{2} ,\quad i = 1,2, \ldots N.$$
(15)

where \(x\) denotes nominal per capita expenditure and i denotes the item of expenditure.

The procedure for estimating PPP for M countries, taking country 0 as base, involves three stages.

Stage 1: A set of item-specific Engel curves relating budget shares to the logarithm of income is estimated for each country d = 0, 1, 2, …, M as follows:

$$w_{ij}^{d} = a_{i}^{d} + b_{i}^{d} \ln x_{j}^{d} + c_{i}^{d} \left( {\ln x_{j}^{d} } \right)^{2} + \varepsilon_{ij}^{d} ,$$
(16)

where i denotes item, \(j\) denotes income category (or household), \(\varepsilon_{ij}^{d}\) is a random disturbance term, and \(a_{i}^{d} ,b_{i}^{d} ,c_{i}^{d}\) are parameters that contain the price information on item i in country d.

Stage 2: \(a(p^{d})\), d = 0, 1, 2, …, M, is estimated from the following equation obtained by equating Eqs. (15) and (16):

$$\hat{b}_{i}^{d} - \hat{b}_{i}^{0} = \ln a\left( {p^{0} } \right)\left( {2\hat{c}_{i}^{0} } \right) - \ln a\left( {p^{d} } \right)\left( {2\hat{c}_{i}^{d} } \right) + { }e_{i}^{d} ;\quad d = 1, 2, \ldots ,M.$$
(17)

Here, \(e_{i}^{d}\) is a composite error term, which is a linear combination of the individual errors of estimation of the parameters \(a_{i}^{d} ,b_{i}^{d} ,c_{i}^{d}\) and \(p^{0}\) denotes the price vector of the base country.
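The step from Eqs. (15) and (16) to Eq. (17) can be spelled out as follows. This is a sketch that assumes the coefficient \(b_{i}(p)\) takes the same value in the base and comparison countries, as it does when it is a price-independent constant in QAIDS; that assumption is made here for illustration and is not stated explicitly above. Expanding the powers of \(\ln (x/a(p))\) in (15) gives

$$w_{i} = \left[ {a_{i} (p) - b_{i} (p)\ln a(p) + \frac{{\lambda_{i} (p)}}{b(p)}\left( {\ln a(p)} \right)^{2} } \right] + \left[ {b_{i} (p) - 2\frac{{\lambda_{i} (p)}}{b(p)}\ln a(p)} \right]\ln x + \frac{{\lambda_{i} (p)}}{b(p)}\left( {\ln x} \right)^{2} .$$

Matching the coefficients of \(\ln x\) and \((\ln x)^{2}\) with those of the Engel curve (16) for country d yields

$$c_{i}^{d} = \frac{{\lambda_{i} (p^{d} )}}{{b(p^{d} )}},\quad b_{i}^{d} = b_{i} (p^{d} ) - 2c_{i}^{d} \ln a(p^{d} ).$$

Subtracting the corresponding expression for the base country gives

$$b_{i}^{d} - b_{i}^{0} = \left[ {b_{i} (p^{d} ) - b_{i} (p^{0} )} \right] + 2c_{i}^{0} \ln a(p^{0} ) - 2c_{i}^{d} \ln a(p^{d} ).$$

With the first bracket equal to zero under the stated assumption, replacing the Engel curve parameters by their Stage 1 estimates gives Eq. (17), with the estimation errors collected in \(e_{i}^{d}\). For each d, the equation can then be treated as a regression across items i with \(\ln a(p^{0})\) and \(\ln a(p^{d})\) as the unknown coefficients.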

Stage 3: Using the normalisation \(b\left( {p^{0} } \right) = \lambda \left( {p^{0} } \right) = 1\), the money metric utility \(u_{j}^{0}\) of the jth income group of the base country that has nominal per capita income \(x_{j}^{0} \left( { = C\left( {u_{j}^{0} ,p^{0} } \right)} \right)\) is obtained from (12) as:

$$\frac{1}{{ \ln u_{j}^{0} }} = \frac{1}{{\ln \frac{{x_{j}^{0} }}{{a\left( {p^{0} } \right)}}}} + 1.$$
(18)

Again, using the expression in (12) for country d and income group j, together with (18), \(b\left( {p^{d} } \right)\) and \(\lambda \left( {p^{d} } \right)\), d = 1, 2, …, M, are estimated from the following regression equation (Footnote 5):

$$\frac{1}{{\ln \left( {\frac{{x_{j}^{d} }}{{\widehat{{a\left( {p^{d} } \right)}}}}} \right)}} = \frac{1}{{b\left( {p^{d} } \right)}}\left( {\frac{1}{{\ln \frac{{x_{j}^{0} }}{{\widehat{{a\left( {p^{0} } \right)}}}}}} + 1} \right) - \frac{{\lambda \left( {p^{d} } \right)}}{{b\left( {p^{d} } \right)}} + error.$$
(19)

To estimate (19), we take j as decile (percentile) group so that the data are ordinally comparable across countries.
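The Stage 3 regression in (19) is a simple linear regression whose slope is 1/b(p^d) and whose intercept is −λ(p^d)/b(p^d). The sketch below illustrates the recovery of b(p^d) and λ(p^d) on noise-free synthetic decile data in plain Python. All numbers are hypothetical, the values of a(p^0) and a(p^d) are taken as given (they would come from Stage 2), and the function names are introduced here for illustration.

```python
import math


def ols_line(x, y):
    """Intercept and slope of a simple least squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def stage3(x_base, x_d, a_base, a_d):
    """Estimate b(p^d) and lambda(p^d) from Eq. (19).

    x_base[j] and x_d[j] are per capita expenditures of the same decile j in
    the base country and in country d; a_base and a_d are Stage 2 estimates.
    """
    # Right-hand side regressor: 1 / ln u_j^0 from Eq. (18), which relies on
    # the normalisation b(p^0) = lambda(p^0) = 1.
    inv_ln_u = [1.0 / math.log(x0 / a_base) + 1.0 for x0 in x_base]
    response = [1.0 / math.log(xd / a_d) for xd in x_d]
    intercept, slope = ols_line(inv_ln_u, response)
    b_d = 1.0 / slope
    lam_d = -intercept * b_d
    return b_d, lam_d


# Hypothetical setting: known parameters used to generate country d's data.
a_base, a_d = 10.0, 50.0
true_b, true_lam = 0.8, 1.2
x_base = [20.0 + 10.0 * j for j in range(10)]  # ten decile means
x_d = [a_d * math.exp(true_b / (1.0 / math.log(x0 / a_base) + 1.0 - true_lam))
       for x0 in x_base]

b_hat, lam_hat = stage3(x_base, x_d, a_base, a_d)
print(b_hat, lam_hat)
```

With real data the relationship holds only approximately, so the estimates depend on how well the decile groups are matched across countries.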

The PPPs are then estimated as TCLIs from Eq. (14) for a given reference level of utility \(u^{* }\) (taken to be the one corresponding to the median level income of the base country). It may be emphasised that \(a\left( {p^{d} } \right), b\left( {p^{d} } \right)\) and \(\lambda \left( {p^{d} } \right){ }\) are estimated as composite variables and no explicit algebraic forms for these functions are assumed. This confers the advantage that the estimated PPPs are not dependent on a priori specified particular functional forms such as the specification proposed by Banks et al. (1997).

Applications (Cross-Country PPPs)

The literature on PPP is large and growing. Froot and Rogoff (1995), Rogoff (1996), Sarno and Taylor (2002), Lan and Ong (2003), Taylor and Taylor (2004), Taylor (2006), Chen et al. (2007) and Clements et al. (2010) are a subset of available literature reviews on the matter. Given the crucial role that PPPs play in international comparisons, there has been considerable controversy on the PPP values that should be used as deflators. While Clements et al. (2006a, b) provide a method of comparison of consumption patterns between countries that is free of currency units, the requirement of PPP is, in general, unavoidable in most cross-country comparisons. Recent examples of international comparisons of real income or real expenditure include Hill (2004), Neary (2004) and Feenstra et al. (2009). Oulton (2012) sets out a preference-based algorithm for comparing living standards across countries.

The cross-country PPPs, with India as base, are reported in Table 1. These relate to the ICP round, 2011. The PPPs from applying the CPD and TCLI procedures were estimated in a study by Majumder et al. (2017), and the ICP PPPs are the ones reported in World Bank (2015), with a change in base country from the USA to India.

Table 1 Alternative purchasing power parities (PPPs) for 2011 (Numeraire: Indian Rupee)

The TCLI-based procedure requires expenditure information disaggregated by expenditure classes for the demand estimation. Since such information is publicly available only for a select group of countries, in the interest of comparison, in Table 2 the CPD and ICP PPPs are reported and compared for a limited set of countries rather than the full set of 200 or so economies featuring in the ICP, 2011 exercise (Majumder et al. 2017).

Table 2 TCLI-based PPPs from Household Level Data: (Base Country: India), 2011

Subnational PPPs

Within a country, the measurement of regional differences in consumer price levels is important to policy-makers in business, government and academia. As noted earlier, estimates of the magnitude of regional price differences are needed in comparisons of real income, standards of living or consumer expenditure patterns across regions. In large federal countries with considerable heterogeneity in preferences, quality of items and household characteristics between regions, the calculation of regional price differentials therefore acquires considerable importance.

A significant bottleneck in the calculation of spatial prices has been the absence of detailed (i.e. item-wise) price information across various regions. ‘Spatial comparisons of consumer prices pose specific problems because of the non-overlapping nature of the consumption baskets, major differences in the quality of items priced in different regions, and the non-availability of crucial data on region-specific expenditure patterns. These problems require the development of new analytical techniques that can handle major differences in quality' (ILO and Others, 2004).

To address the problem of non-availability of prices, various methods have been proposed to compute proxies for prices. Some of these methods are listed below.

(1) Unit values: These are computed by dividing expenditures by quantities at the household level obtained from the unit records in the household expenditure surveys. Since unit values are endogenous and depend on the household’s consumption decisions, they are not true measures of exogenously determined prices. Computation of quality-adjusted unit values has been proposed by Cox and Wohlgenant (1986), Deaton (1988) and Hoang (2009). However, information on unit values is restricted to food items, and hence, the estimated spatial price indices are limited to only a subset of items in household spending.

(2) Pseudo unit values: The procedure for estimating spatial prices in the absence of unit values or price information at the item level is due to Lewbel (1989). This procedure is based on the concept of generalised Barten (1964) equivalence scales, where the generalisation allows the scales to depend on the exact mix of goods that comprise each group of items, as well as on demographic variables. The Lewbel procedure exploits the variation in household size and composition in a single household expenditure survey to construct what Atella et al. (2004) call ‘pseudo unit values’ (PUVs) that can be used as a proxy for the missing prices (Footnote 6).

A third alternative to get around the problem of prices is to use the TCLI-based PPP index (discussed in an earlier Section) that does not require item-specific price/unit value data.

The methods used for calculating the subnational PPPs are mostly the ones that have been used in the context of international comparisons. Examples of studies on spatial prices within a country include Aten and Menezes (2002) and Deaton and Dupriez (2011) on Brazil; De Carli (2010), Biggeri et al. (2008, 2010) and Montero et al. (2019) on Italy; Weinand and von Auer (2020) on Germany; Dikhanov et al. (2011) on the Philippines; Coondoo, Majumder and Chattopadhyay (2011), Majumder, Ray and Sinha (2012), Majumder et al. (2012, 2015a) and Deaton and Dupriez (2011) on India; Majumder et al. (2015b) on Vietnam; Mishra and Ray (2014) on Australia; Brandt and Holz (2006) and Biggeri et al. (2017) on China; and Gomez-Tello et al. (2018) on Spain.

Recently, Costa et al. (2019) proposed a new method for estimating PPPs at the subnational level for OECD countries using publicly available data from the OECD and the US Bureau of Economic Analysis. The method is based on the Balassa-Samuelson (1964) hypothesis, which states that countries with a higher level of income per capita tend to have higher price levels. They estimated regional prices for a period of more than ten years (2000–2016) and for more than 300 large OECD regions (Territorial Level 2 regions; see Footnote 7). The process involves three steps. In the first step, the relationship between state prices and state household disposable income per capita in the USA, together with the industrial composition of state GDP, is specified through the following regression equation:

$$\ln P_{it} = \beta_{0} + \beta_{1} \ln HDIpc_{it} + \beta_{2} Ind_{it} + \beta_{3} Serv_{it} + u_{it},$$
(20)

where \(P_{it}\) is the price level, in purchasing power parities, of state i in period t, \(HDIpc_{it}\) is the corresponding household disposable income per capita, and \(Ind_{it}\) and \(Serv_{it}\) are the shares of industry and services, respectively, in the GDP of state i in period t.

In the second step, OECD regional prices are estimated based on the derived relationship from Step 1. For region h that belongs to country J in period t, the predicted prices are:

$$\hat{p}_{Jht} = \exp\left(\hat{\beta}_{0} + \hat{\beta}_{1} \ln HDIpc_{Jht} + \hat{\beta}_{2} Ind_{Jht} + \hat{\beta}_{3} Serv_{Jht}\right).$$
(21)

To ensure that the weighted sum of the regional price levels matches the reference national price levels, these prices are adjusted as:

$$\hat{p}^{*}_{Jht} = \tau_{Jt} \hat{p}_{Jht},$$
(22)

where \(\tau_{Jt} = \frac{p_{Jt}}{\sum_{h} w_{Jht} \hat{p}_{Jht}}\) is the adjustment factor, \(p_{Jt}\) refers to the price level of country J in period t and \(w_{Jht}\) refers to the weight of the GDP in region h over GDP in country J in period t.

In the third and final step, OECD regional price parity indices (regional PPPs) are estimated by using the adjusted regional prices derived from Step 2 and the PPPs at national level as

$$PPP_{Jht} = \hat{p}^{*}_{Jht} \cdot PPP_{Jt},$$
(23)

where \(PPP_{Jht}\) refers to the PPPs at regional level (in $) and \(PPP_{Jt}\) refers to the PPPs at national level (in $).
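Steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of Costa et al. (2019): the Step 1 coefficients in `BETA_HAT`, the two regions and all numerical inputs are hypothetical placeholders, and the function names are introduced here only for the example.

```python
import math

# Hypothetical Step 1 coefficients (beta_0, beta_1, beta_2, beta_3) of Eq. (20).
# They are placeholders, not estimates reported by Costa et al. (2019).
BETA_HAT = (-3.0, 0.3, -0.1, 0.2)


def predicted_price(hdi_pc, ind, serv, beta=BETA_HAT):
    """Eq. (21): predicted regional price from the Step 1 regression."""
    b0, b1, b2, b3 = beta
    return math.exp(b0 + b1 * math.log(hdi_pc) + b2 * ind + b3 * serv)


def regional_ppps(regions, national_price, national_ppp, beta=BETA_HAT):
    """Eqs. (22)-(23): rescale predicted prices and convert them to regional PPPs."""
    raw = {
        name: predicted_price(r["hdi_pc"], r["ind"], r["serv"], beta)
        for name, r in regions.items()
    }
    # tau_Jt makes the GDP-weighted sum of regional prices equal p_Jt.
    tau = national_price / sum(regions[n]["gdp_share"] * raw[n] for n in raw)
    adjusted = {n: tau * p for n, p in raw.items()}
    ppps = {n: p * national_ppp for n, p in adjusted.items()}
    return adjusted, ppps


# Hypothetical regions of one country; the GDP shares sum to one.
regions = {
    "A": {"hdi_pc": 30000.0, "ind": 0.25, "serv": 0.70, "gdp_share": 0.6},
    "B": {"hdi_pc": 20000.0, "ind": 0.35, "serv": 0.55, "gdp_share": 0.4},
}
adjusted, ppps = regional_ppps(regions, national_price=1.0, national_ppp=0.8)
print(adjusted, ppps)
```

By construction, the GDP-weighted sum of the adjusted regional prices reproduces the national price level passed in, which is the purpose of the adjustment factor in Eq. (22).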

Applications (subnational PPPs)

Table 3 reproduces subnational PPPs for two selected countries, India and Italy, as calculated in Deaton and Dupriez (2011) and Menon et al. (2019), respectively.

Table 3 Subnational PPPs

Figure 1 reproduces, from Majumder et al. (2019), the spatial price maps (see Footnote 8) for the 19 major states in India in rural and urban areas, respectively, with the shades of colour representing the bands into which the price indices fall. Figure 2 reports the corresponding spatial price map at the district level. In both cases, the Fisher price index formula has been used to compute the spatial price indices. Spatial heterogeneity in prices within India is evident from Fig. 1. Figure 2 shows that there is a lot more price heterogeneity between districts than is evident at the level of states. In several states, there is heterogeneity between districts within the state, as seen from the frequent changes of colour within a state in Fig. 2.
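For readers unfamiliar with the Fisher formula, the following sketch shows how a spatial Fisher index of a state relative to the all-India level could be computed as the geometric mean of the Laspeyres and Paasche indices. The prices, quantities and function names are hypothetical and are not taken from Majumder et al. (2019).

```python
import math


def laspeyres(p_base, p_region, q_base):
    """Cost of the base-area basket at regional prices relative to base prices."""
    return sum(p * q for p, q in zip(p_region, q_base)) / sum(
        p * q for p, q in zip(p_base, q_base)
    )


def paasche(p_base, p_region, q_region):
    """Cost of the regional basket at regional prices relative to base prices."""
    return sum(p * q for p, q in zip(p_region, q_region)) / sum(
        p * q for p, q in zip(p_base, q_region)
    )


def fisher(p_base, q_base, p_region, q_region):
    """Fisher index: geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(
        laspeyres(p_base, p_region, q_base) * paasche(p_base, p_region, q_region)
    )


# Hypothetical prices and quantities for two items.
p_all_india, q_all_india = [10.0, 20.0], [5.0, 3.0]
p_state, q_state = [12.0, 18.0], [4.0, 4.0]

index = fisher(p_all_india, q_all_india, p_state, q_state)
print(round(index, 4))
```

A convenient property of this formula for spatial work is that swapping the base and the comparison area gives the reciprocal index, so the result does not depend on which area is treated as the base.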

Fig. 1 State-level spatial price indices (All-India = 1): NSS 68th Round (2011–2012)

Fig. 2 District-level spatial price indices (All-India = 1): NSS 68th Round (2011–2012). Source: Majumder et al. (2019)

Table 4 presents the subnational PPPs for some selected OECD countries for the year 2016, taken from the tables presented in Costa et al. (2019). The estimates underline the importance of accounting for price differentials when assessing regional economic disparities.

Table 4 Estimated Regional Price Parity Index for some OECD Countries, 2016 (US $ = 100)

Subnational PPPs with Spatial Autoregressive (SAR) Error structure

The literatures on spatial and temporal prices have generally moved in parallel: spatial studies look at differences in prices faced by a cross section of units in a single time period, while temporal studies concentrate on price changes faced by a single unit over time. When price movements are measured over a long period for a large, heterogeneous country such as India, the spatial and temporal aspects interact to produce large spatial differences in inflation. This interaction was recognised early in the studies on India by Bhattacharyya et al. (1980), Bhattacharya et al. (1988) and Coondoo and Saha (1990). Recent examples of studies that investigate the spatial and temporal aspects of price movements in a unified framework include Hill’s (2004) study on the European Union and Almas et al.’s (2013) study on India. Hill (2004) proposes 'a general taxonomy of panel price index methods' (p. 1379) to compute spatial and temporal price indexes and investigates whether there was convergence in price levels and relative prices across the European Union. Hill’s (2004) methodology requires panel data sets, which are not often available. As he explains, 'One reason why panel comparisons have not received more attention in the index number literature is the lack of suitable data sets' (p. 1379). In contrast, Almas et al. (2013) propose a methodology that can be implemented on available data sets: spatial prices in India are calculated from an estimated budget share equation for food, specified as a linear function of nominal household expenditure and a set of household-specific control variables. Because the two literatures have moved in parallel, a single unified framework that allows for both sets of calculations has largely been absent.

The approach in Coondoo et al. (2004) to modelling both aspects rests on two building blocks. The first is the quality equation due to Prais and Houthakker (1971), in which the price/unit value paid by a household for a commodity is taken to measure the quality of the commodity group consumed, and hence is postulated to be an increasing function of the household’s level of living. The second is the Country Product Dummy (CPD) model due to Summers (1973). Majumder and Ray (2017) extend this model to the household context by introducing household demographics; the extended model has been called the ‘Household Regional Product Dummy’ (HRPD) model. It has been further modified by Chakrabarty et al. (2018) into the ‘Dynamic Household Regional Product Dummy’ (DHRPD) model, which has the following features: (a) it allows the movement in the spatial price indices to be correlated over time, and (b) it allows interdependence between price indices in neighbouring states or regions in a country (see Footnote 9).

The model is given by

$$p_{jrht} - \pi_{rt} = \alpha_{jt} + \sum_{i = 1}^{4} \delta_{ijt} n_{irht} + \left( \lambda_{jt} + \eta_{jrt} \right)\left( y_{rht} - \pi_{rt} \right) + \varepsilon_{jrht},$$
(24)

Here, \(\alpha_{jt}\), \(\delta_{ijt}\), \(\lambda_{jt}\), \(\eta_{jrt}\) and \(\pi_{rt}\) are the parameters of the model. \(p_{jrht}\) denotes the natural logarithm of the nominal price/unit value for the jth commodity (\(j = 1, \ldots ,N\)) paid by the hth sample household of region r (\(r = 0, \ldots ,R\)) at time t (\(t = 1, \ldots ,T\)), and \(y_{rht}\) denotes the natural logarithm of the nominal per capita income/per capita expenditure (PCE) of the hth sample household in region \(r\) at time \(t\). In principle, the \(\pi_{rt}\)’s may be interpreted as the natural logarithm of the value of a reference basket of commodities purchased at the prices of region r in time t. The left-hand side of Eq. (24) thus measures the logarithm of the price/unit value paid in real terms, and \(\left( y_{rht} - \pi_{rt} \right)\) on the right-hand side measures the logarithm of real PCE. The parameters \(\left( \pi_{rt} - \pi_{0t} \right)\), with r = 1,…,R and t = 1,…,T, form a set of logarithmic price index numbers measuring the price level of each region relative to that of the reference (numeraire) region (r = 0) at time t, and the spatial price index is given by \(\exp\left( \pi_{rt} - \pi_{0t} \right)\).
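To make the role of the \(\pi_{rt}\) parameters concrete, the sketch below converts a set of log price levels into spatial price indices and evaluates the systematic part of Eq. (24) for a single household. It does not estimate the model: every parameter value is a hypothetical placeholder, the demographic variables \(n_{irht}\) are passed in as a plain list of four numbers, and the function names are introduced here for illustration only.

```python
import math


def spatial_price_indices(pi, reference=0):
    """Map log price levels pi[t][r] to indices exp(pi_rt - pi_0t) for each period."""
    return {
        t: {r: math.exp(level - by_region[reference]) for r, level in by_region.items()}
        for t, by_region in pi.items()
    }


def systematic_log_price(pi_rt, alpha_jt, delta_jt, n_rht, lam_jt, eta_jrt, y_rht):
    """Systematic part of Eq. (24), solved for the nominal log price p_jrht."""
    demographics = sum(d * n for d, n in zip(delta_jt, n_rht))
    log_real_pce = y_rht - pi_rt
    return pi_rt + alpha_jt + demographics + (lam_jt + eta_jrt) * log_real_pce


# Hypothetical pi_rt for one period and three regions (region 0 is the reference).
pi_hat = {1: {0: 4.00, 1: 4.00 + math.log(1.2), 2: 4.00 + math.log(0.9)}}
indices = spatial_price_indices(pi_hat)
print(indices[1])
```

One implication of the specification that the second function makes visible: if the regional price level and the household's nominal PCE both rise by the same proportion, the nominal unit value paid rises by that proportion too, because real PCE is unchanged.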

Tables 5 and 6 present estimates of state-wise spatial and temporal PPPs, respectively, for India based on the DHRPD model.

Table 5 Estimates of Spatial Price Indices (AR(1) Model) with Dependence on Neighbouring States of India: 55th–68th rounds
Table 6 Estimates of Temporal Price Indices (AR(1) Model) with Dependence on Neighbouring States of India: 61st–68th rounds (Index = 1 for each state for 55th Round)

Item-specific subnational PPPs

The variation in PPPs across items, if present, will result in a variation in the overall PPP between households because of variation in household expenditure patterns. This is consistent with the argument of Reddy and Pogge (2007) that in converting national poverty lines into a common currency, one should use PPP rates that are relevant for the poor. Majumder, Ray and Sinha (2012) proposed a methodology for the calculation of PPP between rural and urban areas in the context of a large heterogeneous country such as India. The proposed procedure is based on an idea that is similar to the idea of quasi-price demographic effects in the Barten (1964) model that is used to estimate the general equivalence scale as a function of the item-specific equivalence scales. The proposed procedure is rooted in utility maximising demand models and generalises the conventional framework to allow commodity-specific PPPs between rural and urban areas. The extended framework is more policy friendly by enabling the calculation of item-specific rural–urban differential in prices and allows a simple test of the idea of commodity-invariant PPP underlying the conventional calculations. In modifying the prices faced by a household in the Barten (1964) model, the commodity-specific equivalence scales perform a role that is similar to that played by the item-specific PPP rates in the framework that is proposed here. While household size and composition effects work through the equivalence scales in the Barten model, spatial prices work through the PPP parameters.

Table 7 presents estimates of overall PPPs, where the proposed methodology is benchmarked against the conventional procedures by comparing the calculated rural–urban price differentials with those obtained using the Laspeyres price index (Clements and Izan 1981; Selvanathan 1991) and the Country Product Dummy (CPD) method (Summers 1973; Rao 2005). Table 7 also presents the corresponding item-specific PPPs.
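The CPD benchmark can be illustrated with a small sketch. In its basic unweighted form, the CPD method regresses log prices on item dummies and region dummies; with only rural and urban areas, the exponential of the urban dummy coefficient gives the urban PPP (rural = 1). The prices below are hypothetical, and the small linear solver is a helper written for this example so that it runs with the Python standard library alone; it is not the estimation code behind Table 7.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def cpd_urban_ppp(prices):
    """OLS of log price on item dummies and an urban dummy; returns exp(urban coef)."""
    items = sorted(prices)
    k = len(items) + 1  # one dummy per item plus the urban dummy
    rows, y = [], []
    for j, item in enumerate(items):
        for region in ("rural", "urban"):
            x = [0.0] * k
            x[j] = 1.0
            x[-1] = 1.0 if region == "urban" else 0.0
            rows.append(x)
            y.append(math.log(prices[item][region]))
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    return math.exp(beta[-1])


# Hypothetical item prices in rural and urban areas.
prices = {
    "rice": {"rural": 20.0, "urban": 22.0},
    "milk": {"rural": 30.0, "urban": 36.0},
    "pulses": {"rural": 50.0, "urban": 55.0},
}
print(cpd_urban_ppp(prices))
```

When every item is priced in both areas, as here, this unweighted CPD estimate coincides with the geometric mean of the item-level urban to rural price ratios, which is why it yields a single commodity-invariant PPP.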

Table 7 Estimates of All India Urban PPPs (Rural = 1): NSS 55th and 61st Rounds

Conclusion

In this paper, we have attempted to provide a brief account of the various price indices (PPPs) used in the context of international and subnational comparisons of cost of living, welfare and poverty. A major problem in calculating these PPPs is the non-availability of comparable price data. In the international context, the problem arises from the non-overlapping nature of the consumption baskets, major differences in the quality of items priced in different regions and the non-availability of crucial data on region-specific expenditure patterns. In the subnational context, household surveys frequently record only expenditure information, and the lack of information about quantities purchased precludes the derivation of household-specific unit values. The aggregate price indexes derived from sources exogenous to the household survey are often not sufficient to identify all parameters and to provide plausible estimates.

However, new data gathering techniques, often referred to as 'Big Data', are emerging (Cavallo and Rigobon 2016) and have the potential to improve statistics and empirical research in macro- and international economics by using the vast number of online prices displayed on the web. The Billion Prices Project at MIT is an academic initiative that uses prices collected from hundreds of online retailers around the world on a daily basis to construct daily price indexes and real-time inflation metrics in multiple countries. With these new data gathering techniques, it is hoped that future studies on spatial and temporal price indices will be based on actual price information.