
Residuals are often regarded as troublesome noise in spatial (or, for that matter, non-spatial) econometric models. Current practice in spatial econometrics is to set up a spatial error model, more often than not with an exogenous spatial weight matrix W, in order to improve the efficiency of the estimators.

Looking closely at the residuals is less common practice. Yet residuals can be extremely valuable building blocks for further work, as other disciplines have shown. Around 1850 the British chemists Mansfield and Perkin had the idea, strange for the chemistry of that era, of analyzing the composition of tar, until then used exclusively to improve the surfacing of roads (John Loudon McAdam had his name attached to that technique, tarmacadam); the result of their investigation was the roaring development of a whole branch of (industrial) chemistry: carbochemistry.

In the next section, a simple spatial econometric example will be treated, after which further analysis and more results will be presented.

1 Residuals

Tables 14.1 and 14.2 present the degrees of contiguity for the Belgian regional units (BRU), the maximum degree being 3, and their gross regional products (GRP; 1995, in 10^5 euros of 2000); the entries of the two tables follow the same order.

Table 14.1 Degrees of contiguity between Belgian regions
Table 14.2 Gross regional products for the Belgian units, 1995

Figure 14.1 reproduces the map of those regions.

Fig. 14.1 Regional map of Belgium

The regions are the following. From West to East, northern slice: West-Flanders, East-Flanders, Antwerp, Limburg; same, southern slice: Hainaut, Namur, Luxembourg, Liège (slightly upwards); right in the middle, from north to south: Flemish Brabant and Walloon Brabant, with the Brussels Capital region sticking out.

First the products of 1995 were analyzed. The idea was to investigate the effects on a given GRP, y_i, of the (average) products of the regions at each degree of contiguity (1, 2, 3). Hence the equation

$$y_i = a\,y_{1i} + b\,y_{2i} + c\,y_{3i} + d + \varepsilon_i,$$
(14.1)

where y_1i, y_2i and y_3i are the average products of the regions at contiguity degrees 1, 2 and 3 from region i, d is a constant, and ε_i is the residual.
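The regressors of Eq. (14.1) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `lagged_averages`, the four-region chain and the product values are hypothetical and only stand in for Tables 14.1 and 14.2. A region with no neighbor at a given degree gets `None` for that lag.

```python
def lagged_averages(degree, y, max_degree=3):
    """For each region i, average y over the regions at contiguity degree 1..max_degree.

    degree[i][j] is the degree of contiguity between regions i and j
    (hypothetical stand-in for Table 14.1); y[j] is the product of region j.
    """
    n = len(y)
    rows = []
    for i in range(n):
        row = []
        for c in range(1, max_degree + 1):
            values = [y[j] for j in range(n) if j != i and degree[i][j] == c]
            # None marks a missing spatial lag of order c for region i.
            row.append(sum(values) / len(values) if values else None)
        rows.append(row)
    return rows


# Hypothetical chain of four regions A-B-C-D (assumption, not the Belgian data).
degree = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
y = [10.0, 20.0, 30.0, 40.0]

lags = lagged_averages(degree, y)
for i, (y1, y2, y3) in enumerate(lags):
    print(i, y1, y2, y3)
```

Each row of `lags` supplies the triple (y_1i, y_2i, y_3i) that enters the regression for region i.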

Table 14.3 presents the OLS estimation results.

Table 14.3 First results for model (14.1)

Obviously the results are far from satisfactory. The residual spatial correlation coefficients r_c (c = 1, 2, 3, the observed degrees of contiguity) are –0.2619, –0.1161 and –0.2426, respectively. None is significant, but they suggest that the residual field is not completely random.
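The text does not spell out how the r_c are defined. One plausible reading, offered here only as an assumption, is the Pearson correlation between each residual and the mean residual of its neighbors at contiguity degree c. The names `pearson` and `residual_correlation`, the chain layout and the residual values are all hypothetical.

```python
from math import sqrt


def pearson(u, v):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def residual_correlation(degree, resid, c):
    """Correlate residuals with the mean residual of their degree-c neighbors.

    Assumed definition of r_c; regions without degree-c neighbors are skipped.
    """
    own, lagged = [], []
    n = len(resid)
    for i in range(n):
        neighbors = [resid[j] for j in range(n) if j != i and degree[i][j] == c]
        if neighbors:
            own.append(resid[i])
            lagged.append(sum(neighbors) / len(neighbors))
    return pearson(own, lagged)


# Hypothetical chain of four regions with alternating residual signs.
degree = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
resid = [1.0, -1.0, 1.0, -1.0]

r1 = residual_correlation(degree, resid, 1)
r2 = residual_correlation(degree, resid, 2)
print(r1, r2)
```

In this toy layout first-order neighbors always carry the opposite sign and second-order neighbors the same sign, so the two coefficients sit at the extremes of the admissible range.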

Accordingly, further analysis is in order.

2 Multiple Regimes

Table 14.4 lists the residuals of this exercise (column 1) alongside the growth rates (averages over 1995–2004; column 2) and the GRP levels (column 3).

Table 14.4 Comparing residuals

The following scatter plots (Figs. 14.2 and 14.3) picture the partial relations.

Fig. 14.2 Residuals and growth rates from Table 14.4

Fig. 14.3 Residuals and GRP from Table 14.4

The Kendall τ (Kendall, 1955) between the signs of the residuals (+ or –) and the growth rates (above or below their average, 0.0217) is close to zero (0.0910), whereas between the residuals and the GRPs it is 0.4546; further investigation is still required.
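A rank correlation on dichotomized data can be sketched as below. The variant used here, τ_a (concordant minus discordant pairs over all pairs), is an assumption, since the text does not say which version of Kendall's statistic was applied; the data are hypothetical. With the eleven Belgian regions the denominator would be 11·10/2 = 55 pairs.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_tau_a(u, v):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs contribute zero, which matters for dichotomized data.
    """
    n = len(u)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            score += sign(u[i] - u[j]) * sign(v[i] - v[j])
    return score / (n * (n - 1) / 2)


# Hypothetical residuals and growth rates for five regions.
resid = [0.4, -0.2, 0.1, -0.5, 0.3]
growth = [0.030, 0.015, 0.025, 0.010, 0.018]

# Dichotomize: sign of the residual, growth above or below its average.
resid_sign = [1 if e > 0 else -1 for e in resid]
mean_growth = sum(growth) / len(growth)
growth_side = [1 if g > mean_growth else -1 for g in growth]

tau = kendall_tau_a(resid_sign, growth_side)
print(tau)
```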

To prepare the latter, a complexity index has been computed (Getis and Paelinck, 2004), derived from a fourth degree polynomial

$$\begin{array}{ll} e_i = a + b\,r_i + c\,y_i + d\,r_i^2 + e\,y_i^2 + f\,r_i y_i + g\,r_i^3 + h\,y_i^3 + i\,r_i^2 y_i + j\,r_i y_i^2 \\ \quad + k\,r_i^2 y_i^2, \end{array}$$
(14.2)

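in which the y_i are again the GRPs and the r_i the growth rates. With eleven coefficients (a through k) and eleven regions, the polynomial passes exactly through the observed residuals. Table 14.5 presents the interpolated coefficients of Eq. (14.2).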

Table 14.5 Polynomial coefficients from Eq. (14.2)

Coefficients b, d, and g are extremely high, but they relate to growth rates that are small numbers. Excluding the relatively small coefficients (smaller than one), the complexity coefficient can be computed as

$$C = (v - 1)/(n - 1) = 0.6,$$
(14.3)

where v is the number of maintained coefficients, and n their maximal number (i.e., the number of observations).
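The arithmetic of Eq. (14.3) can be illustrated as follows. The coefficient values are hypothetical (they are not those of Table 14.5), and the function name `complexity_index` is an assumption. The sketch takes n as the length of the coefficient list, which matches the number of observations only because the interpolating polynomial has as many coefficients as there are regions.

```python
def complexity_index(coefficients, threshold=1.0):
    """Complexity coefficient C = (v - 1) / (n - 1) of Eq. (14.3).

    v counts the coefficients kept, i.e. those with absolute value of at
    least `threshold`; n is the maximal number of coefficients.
    """
    n = len(coefficients)
    v = sum(1 for c in coefficients if abs(c) >= threshold)
    return (v - 1) / (n - 1)


# Eleven hypothetical coefficients a..k, seven of which exceed one in size.
coeffs = [-3.2, 540.0, 0.004, -12000.0, 4.1, 0.7, 91000.0, 0.0, 2.5, -0.01, 1.8]

C = complexity_index(coeffs)
print(C)  # (7 - 1) / (11 - 1)
```

With n = 11, the reported value C = 0.6 corresponds to v = 7 maintained coefficients.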

Given the rule followed, this is a relatively high value (0 ≤ C ≤ 1), and invites rethinking the model that generated the observed residuals, since that model is a very simple one.

The revealed complexity suggests the need for a possible correction by r_i and y_i, but the first correction would not be complete, as shown above, and using y_i would be trivial. A plausible alternative would be to introduce two separate regimes, leading to the following specification (see Chap. 12):

$$y_i = \lambda_i (a\,y_{1i} + b\,y_{2i} + c\,y_{3i} + d) + (1 - \lambda_i)(\alpha\,y_{1i} + \beta\,y_{2i} + \gamma\,y_{3i} + \delta) + \varepsilon_i,$$
(14.4)

where the λ_i are the binary variables qualifying the spatial regimes (λ_i = 1 assigns region i to the first regime, λ_i = 0 to the second).
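The switching mechanism of Eq. (14.4) can be made concrete with a short sketch that evaluates the systematic part of the equation for one region. It does not reproduce the estimation procedure of Chap. 12; the function name `regime_prediction` and all numerical values are hypothetical.

```python
def regime_prediction(lam, lags, first, second):
    """Systematic part of Eq. (14.4) for a single region.

    lam    -- regime allocator, 1 (first regime) or 0 (second regime)
    lags   -- (y1, y2, y3), average products at contiguity degrees 1, 2, 3
    first  -- (a, b, c, d), coefficients of the first regime
    second -- (alpha, beta, gamma, delta), coefficients of the second regime
    """
    y1, y2, y3 = lags
    a, b, c, d = first
    alpha, beta, gamma, delta = second
    regime_one = a * y1 + b * y2 + c * y3 + d
    regime_two = alpha * y1 + beta * y2 + gamma * y3 + delta
    return lam * regime_one + (1 - lam) * regime_two


# Hypothetical lags and coefficients, for illustration only.
lags = (2.0, 3.0, 4.0)
first = (1.0, 0.5, 0.0, 10.0)
second = (0.0, 1.0, 1.0, -1.0)

in_first = regime_prediction(1, lags, first, second)
in_second = regime_prediction(0, lags, first, second)
print(in_first, in_second)
```

Setting λ_i = 1 for every region collapses the specification to the single-regime model of Eq. (14.1).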

This produced the results of Table 14.6 hereafter.

Table 14.6 Results with two regimes, Eq. (14.4)

The residual spatial correlation coefficients for contiguities 1, 2 and 3 are –0.1793, –0.8029 and 0.2938, respectively, showing that some specific spatial autocorrelation remains, especially of order 2 (coefficient significant at the 0.995 level). Some changes occurred over nine years, as the third column of Table 14.6 shows; the same holds for the r_c (–0.6702, –0.8223 and 0.6428 in 2004, all significant at the 0.995 level), with second-order spatial autocorrelation still dominating. But the overall fit is satisfactory, and OLS can be replaced by other estimation methods (see Sect. 11.1.3).

To test the general properties of the residual fields, two statistics have been computed:

  • a generalized τ-statistic between all residuals (i.e., 55 cross-products are involved); for 1995 and 2004 they amount to non-significant values –0.1542 and –0.1523, respectively, excluding any general correlation; and,

  • the C-statistics of Eq. (14.3); in both cases they equal 0.9, showing a high degree of Chaitin-Wolfram complexity (as independent variables, the numbers from 1 through 12 were used in the test polynomial).

The problem now is that, although overall randomness seems to be present, the spatial r_c show specific dependency, so further investigation is in order.

Table 14.7 presents the two vectors of λ_i estimates.

Table 14.7 Regime allocators for two distant years

The pattern is remarkably stable: most regions (7) belong to the same regime, only the center-south deviating from this.

3 Spatial Interpolation

Because the coefficients reported in Table 14.6 vary considerably, the question arises whether computing local coefficients could give more insight into this phenomenon. One possibility is to interpolate the parameters from groups of (possibly neighboring) spatial units. Where more than four regions presented themselves as candidates, the nearest neighbors, in terms of distance and/or political/linguistic proximity, were selected.
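Since Eq. (14.1) has four parameters, a group of four regions determines them exactly. The sketch below assumes that this is what interpolation means here and solves the resulting 4×4 linear system by Gaussian elimination. The function names `solve` and `local_parameters` and the data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: choose another group of regions")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def local_parameters(group):
    """Interpolate (a, b, c, d) of Eq. (14.1) from four regions.

    group is a list of four tuples (y, y1, y2, y3): the region's own product
    and its average products at contiguity degrees 1, 2 and 3.
    """
    A = [[y1, y2, y3, 1.0] for (_, y1, y2, y3) in group]
    rhs = [y for (y, _, _, _) in group]
    return solve(A, rhs)


# Hypothetical group generated from a = 0.5, b = -0.25, c = 1.0, d = 2.0.
group = [
    (2.5, 1.0, 0.0, 0.0),
    (1.75, 0.0, 1.0, 0.0),
    (3.0, 0.0, 0.0, 1.0),
    (3.25, 1.0, 1.0, 1.0),
]

params = local_parameters(group)
print(params)
```

Because the fit is exact, the interpolated parameters carry no residual; their variability across groups is the only diagnostic available.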

Table 14.8 presents the results. The parameters are those of Eq. (14.1).

Table 14.8 Results from spatial interpolation to compute the parameters of Eq. (14.1)

The remarkable finding, again, is that the orders of magnitude are the same for the two years, with some exceptions for the constant d. But the variability is large between regions (as shown by the coefficients of variation in the last row of Table 14.8), which suggests the need for further analysis of the available data to complete the picture.

4 Composite Parameters

Because time series from 1995 through 2004 are available, composite parameters can be computed (Ancot et al., 1978); for instance, parameter a in Eq. (14.1) can be expanded as

$$\hat{a} = a + a_r + a_t,$$
(14.5)

where a is the generic parameter, a_r is the region-specific parameter, and a_t is the time-specific parameter. For reasons of identifiability, one spatial unit should be selected as a kernel (not affected by a_r or a_t); the first region, Antwerp, was picked for this purpose, but any other region would have done.
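The bookkeeping behind Eq. (14.5) can be sketched as a simple lookup. The sketch follows the literal reading that the kernel region carries only the generic parameter, and it assumes that 1995 serves as the base year with a zero time effect; both conventions, the function name `composite` and the numbers are assumptions for illustration.

```python
def composite(generic, region_effects, time_effects, region, year, kernel="Antwerp"):
    """Composite parameter a + a_r + a_t of Eq. (14.5).

    region_effects and time_effects are dictionaries; a missing key means a
    zero effect. The kernel region keeps the generic parameter alone.
    """
    if region == kernel:
        return generic
    return generic + region_effects.get(region, 0.0) + time_effects.get(year, 0.0)


# Hypothetical values, not taken from Tables 14.9 and 14.10.
a = 0.8
a_r = {"Limburg": -0.3}
a_t = {1996: 0.02}

kernel_value = composite(a, a_r, a_t, "Antwerp", 1996)
base_year_value = composite(a, a_r, a_t, "Limburg", 1995)
full_value = composite(a, a_r, a_t, "Limburg", 1996)
print(kernel_value, base_year_value, full_value)
```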

Table 14.9 presents the coefficients; the first row, as said above, contains the generic ones, and the following rows the region-specific ones. In some rows the region-specific coefficient c_r is absent because the region has no third-order spatial lag.

Table 14.9 Generic and region-specific coefficients according to model (14.5)

The entries of Table 14.9 display a large variability of the region-specific parameters. No measure of it has been computed this time, but a comparison with Table 14.8 confirms this variability.

Table 14.10 presents the time-specific parameter estimates for 1996 through 2004.

Table 14.10 Time-specific parameters for model (14.5)

The coefficients are of a much smaller order of magnitude, which confirms a previous remark about the relative constancy of the parameters through time, as opposed to their interregional variability.

Finally, Table 14.11 presents the partial and global pseudo-R² values; they are called pseudo because the parameters were computed by least absolute deviations, to limit the influence of outliers.
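The text does not give the formula behind the pseudo-R² values. A common choice, assumed here, is to apply the usual ratio 1 − SSR/SST to fitted values obtained from a non-OLS estimator; applied to one region's series it would give a partial value, and applied to the pooled data a global one. The function name `pseudo_r2` and the data are hypothetical.

```python
def pseudo_r2(observed, fitted):
    """1 - SSR/SST computed on fitted values from any estimator.

    Unlike the OLS R-squared it is not guaranteed to lie between 0 and 1,
    hence the label "pseudo".
    """
    mean_obs = sum(observed) / len(observed)
    ssr = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ssr / sst


# Hypothetical observed and fitted values for one region.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 2.0, 3.0, 3.5]

r2 = pseudo_r2(observed, fitted)
print(r2)
```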

Table 14.11 Pseudo-R² values

The result is remarkably high for 60 df, with a local exception for Hainaut.

5 Conclusion

The doggy-bag principle (“never throw away your leftovers”) has given insight into a possibly appropriate specification of the spatial econometric models investigated. This is in line with the clear warning that has been issued for time series analysis (G. Mizon, A Simple Message for Autocorrelation Correctors: Don’t, Journal of Econometrics, 1995, 69, pp. 267–288).

More research is in order, especially for very large models. But considering residuals as informative should transcend the usual practice of trying to neutralize them. Meanwhile, pure spatial “randomness” also could be interpreted as spatial complexity, and might encourage continued analysis rather than finishing it by discussing “ideal” parameter properties.

In the Belgian case, this has led to deeper insights into the spatio-temporal properties of a static model. Indeed, it appears that each spatial unit possesses its own reaction coefficients, with great stability over time. The problem, however, is to find out how much of that interregional divergence is due to system heterogeneity, and how much to spatial aggregation. The latter problem is taken up in Chap. 17.