1 Introduction

Jackson (1984) and Malizia and Ke (1993) note that interest in regional instability goes back at least 60 years, with a recurrent theme in this literature being the relationship between the sectoral composition of regional employment and regional economic growth and instability. Within this literature lies, in many cases, the implicit assumption that a diverse regional economy will enjoy a stable employment growth rate, with the diversity acting to shield the regional economy from fluctuations in the market for its products.

Support for this hypothesis is mixed, with several studies including Kort (1981), Brewer and Moomaw (1985) and Malizia and Ke (1993) finding that increased industrial diversification is associated with reduced regional instability. Other researchers such as Jackson (1984) have found only limited support for this relationship, while Attaran (1986) concluded that instability is not related to regional industrial diversity. Wagner and Deller (1998) consider that the principal causes of the empirical inconsistency include the use of highly aggregated data sets, theoretically poor measures of diversity and overly simplistic statistical methods.

Malizia and Ke (1993) note that previous research has used data from different units of geography, ranging from metropolitan areas to states and Bureau of Economic Analysis (BEA) economic areas, and suggest that only functional economic units should be used in the analysis. Furthermore, there has been considerable debate about the most appropriate measure of regional diversification, with a plethora of measures being suggested. These have ranged from measures that assume diversity implies equal shares of activity in all sectors of the economy, through the portfolio measure of diversification, to measures based on input–output systems as proposed by Siegel et al. (1995) and Wagner and Deller (1998).

Another limitation of much of the empirical work has been that it has relied on the use of inappropriate statistical techniques, with much of the early empirical work being confined to the use of bivariate techniques. Malizia and Ke (1993) note that such an approach will provide biased estimates of the effect of diversity on regional instability; for this reason, appropriate control variables need to be incorporated in the analysis. In addition, the statistical techniques that are used must be appropriate for the data sets being studied, and developments in the field of spatial data analysis suggest that techniques that take account of spatial dependence in the residual of estimated equations must be used. These techniques can also be used to explore the role of geographic location on regional performance.

Studies that have used multivariate techniques to study the cross-regional variation in employment instability include Smith and Gibson (1988), Wundt (1992), Malizia and Ke (1993) and, more recently, Izraeli and Murphy (2003). Smith and Gibson (1988), using the log share measure of diversity, concluded that stable industries are more important for regional instability than is industrial diversification. Wundt (1992) was concerned with the relationship between industrial diversity and regional economic instability over the business cycle and concluded that diversification acted to reduce regional instability, with the portfolio measure providing more accurate results.

Malizia and Ke (1993) adopted a methodology similar to that of Smith and Gibson (1988) and, using the entropy measure of diversity, concluded that more diversity leads to lower unemployment rates and less employment instability. Izraeli and Murphy (2003) were concerned with the effect of industrial diversity on state unemployment and income and conducted a study of US states using a pooled rather than a cross-sectional database. Using the Herfindahl index of industrial diversification, Izraeli and Murphy (2003) found a strong link between industrial diversification and reduced unemployment, while the link between diversification and per capita income was found to be much weaker. These authors also considered the role of spatial spillovers in their analysis but appear more concerned with problems associated with spatially autocorrelated disturbances than with the role of spatial interaction in their model.

This current work can be seen as an attempt to clarify the underlying causes of regional instability, particularly the role of industrial diversification. In the light of the limitations of much of the previous research in this area, this paper can be seen as extending the literature in three directions. Firstly, this study applies a cross-sectional regression approach to a new data set, the Local Government Areas (LGAs) of Queensland, which comprise sparsely settled rural regions and densely settled urban areas. Thus the results from this analysis will be applicable to a wider range of regions than those from studies based on metropolitan areas only. Secondly, these regions more closely correspond to functional economic units than larger geographic units such as state economies which have been used in much of the previous analysis. Finally, the study applies spatial econometric techniques; these techniques seem more appropriate when using geographically related data and allow the exploration of the role of regional spillover effects in determining regional instability.

The following section develops the empirical model which is used to explore the diversity–instability relationship, while Sect. 3 provides an outline of the spatial data techniques used in this study. Section 4 presents the results derived from the estimation of our model of regional instability, with a brief conclusion provided in Sect. 5.

2 An empirical model of regional instability

In much of the early literature, bivariate techniques were used to explore the relationship between economic instability and regional diversification. This paper uses a wider approach, with instability (INSTAB) hypothesised to be a function of regional demographic and labour market variables (L); the structure of a region’s industrial base (I), comprising its diversity, growth and structural change; and spatially lagged variables (S), capturing the economic performance of neighbouring regions. The basic model is expressed as:

$$ INSTAB = \beta _{0} + \beta _{1} L + \beta _{2} I + \beta _{3} S + \varepsilon . $$
(1)

In this equation the β are regression coefficients to be estimated and ε is an error term.Footnote 1

The data used in this study are taken from the 1996 Census of Population and Housing conducted by the Australian Bureau of Statistics (ABS) in addition to the small area labour market data provided by the Department of Employment and Workplace Relations (DEWR).Footnote 2 The small area labour market data are used to derive the measure of regional instability, while the data provided by the ABS are used to derive the covariates used to explain economic instability, including the measure of regional industrial diversification. The 1996 census data were chosen because these are the closest to the starting point for the DEWR small area labour force data, which have been collected from the June Quarter 1994. Thus, the model can be broadly interpreted as measuring the influence of these variables on regional economic instability given the demographic and economic structure that existed in the base year of 1996.

In this study, regional instability has been derived using the measure of instability presented in Eq. 2.

$$ INSTAB = \frac{1}{T}\sum\limits_{t = 1}^{T} \left[ \frac{E_{it} - E^{Tr}_{it}}{E^{Tr}_{it}} \right]^{2}, $$
(2)

where INSTAB is the index of regional instability, \( E_{it} \) is employment at time t in region i, \( E^{{Tr}}_{{it}} \) is the level of employment in region i at time t predicted by a linear time trend equation and T is the time-span over which the trend line is estimated. Kort (1981) notes that this definition of regional economic instability rests on the idea that an economic time-series is made up of four components: random, seasonal, trend and cyclical. In this context, the purpose of the index of instability is to isolate and measure the cyclical component of this time-series.
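As an illustration of Eq. 2, the following Python sketch fits a linear time trend to a single region's employment series by ordinary least squares and averages the squared proportional deviations from that trend. The function name `instability_index` and the two example series are hypothetical and are not taken from the DEWR data used in this study.

```python
import numpy as np

def instability_index(employment):
    """Mean squared proportional deviation of employment from its linear trend."""
    e = np.asarray(employment, dtype=float)
    t = np.arange(1, e.size + 1)             # time index t = 1, ..., T
    slope, intercept = np.polyfit(t, e, 1)   # OLS linear time trend
    trend = intercept + slope * t            # predicted employment E^Tr_it
    return float(np.mean(((e - trend) / trend) ** 2))

# Hypothetical employment series, for illustration only
steady = [100, 102, 104, 106, 108, 110]      # lies exactly on a linear trend
volatile = [100, 110, 98, 112, 101, 115]     # fluctuates around its trend

instab_steady = instability_index(steady)
instab_volatile = instability_index(volatile)
print(instab_steady, instab_volatile)
```

A series that lies exactly on its trend line yields an index of essentially zero, while fluctuations around the trend raise the index.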

In the study of the effect of regional diversification on regional instability, many authors have noted that the regional population is likely to be an important variable in the explanation of the cross-sectional variation in regional instability (see, for example, Kort 1981; Brewer and Moomaw 1985; Begovic 1992). This work suggests that the variance of the measure of instability will be wider for smaller regional economies than it is for large regional economies. For this reason this study incorporates the log of regional population (LPOP) in the estimated equation.

The variables used to capture differences in the demographic structure of the regional labour market comprised the proportion of the labour force that are females (PERFEM) and the proportion of the population made up of indigenous persons (PERIND). Variables incorporated in the model to account for differences in the quality of the labour force across the regional economies comprised the proportion of the population with a bachelor or higher level qualification (PERBACH), the proportion of the labour force that are employed as managers (PERMGR) and the percentage employed in the lowest-skilled occupations (LOWSKILL), consisting of labourers and elementary, sales and service workers.

Data from the 2001 census indicate that indigenous persons in Queensland are strongly represented in low-skilled jobs. For this reason, it would be expected that higher proportions of indigenous persons in the labour force would be associated with higher measured regional instability. For women, the result is much less clear, and Malizia and Ke (1993) hypothesise that areas with more women may experience lower instability because they offer stable, more educated jobs. However, it is also possible that women, being strongly represented in part-time and casual employment, are contributing to regional instability through the unstable nature of these types of jobs. Finally, within labour economics, it is generally believed that low-skilled workers tend to have higher rates of job turnover than high-skilled workers (see, for example, Le and Miller 2000). If this is the case, it would also be expected that the variable LOWSKILL will be positively associated with regional instability, while the opposite would be expected for PERMGR and PERBACH.

The preliminary stages of analysis suggested that PERFEM and PERIND were insignificant in all estimated equations and so were dropped in the final model. PERBACH was found to be significantly correlated with PERMGR and LOWSKILL, and its incorporation was perhaps resulting in collinearity and imprecision in the estimated model. As a result of this and its marginal significance in only a few of the early versions of the model, it was omitted from the final version of the model.

The growth that a region experiences is also likely to influence the amount of instability experienced, with regions that are growing relatively quickly likely to experience higher levels of instability. In this study average annual regional growth is used (GROWTH); this growth relates to the 1996 to 2001 period, corresponding to the available census data. Apart from the amount of growth a region experiences, the amount of structural change that a region experiences may also be an important determinant of regional instability. In this study structural change has been measured as:

$$ SCHANGE = \frac{1}{2}\sum\limits_{j = 1} \left| s_{jt} - s_{jt - 1} \right|. $$
(3)

Here, \( s_{jt} \) is the share of regional employment in industry j in region i in 2001 and \( s_{jt - 1} \) is the corresponding share in 1996. Structural change is thus derived as half the sum, over all industries, of the absolute value of the change in each industry's share of regional employment. Higher levels of structural change and growth would be expected to result in higher levels of instability within the regional economy.
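The calculation in Eq. 3 can be sketched in Python as follows, reading the summand as the absolute change in each industry's employment share, as described in the text. The function name `structural_change` and the two-industry employment counts are hypothetical.

```python
import numpy as np

def structural_change(emp_start, emp_end):
    """Half the sum of absolute changes in industry employment shares."""
    start = np.asarray(emp_start, dtype=float)
    end = np.asarray(emp_end, dtype=float)
    s_start = start / start.sum()   # industry shares in the earlier census
    s_end = end / end.sum()         # industry shares in the later census
    return float(0.5 * np.abs(s_end - s_start).sum())

# Hypothetical region with two industries: shares move from 50/50 to 70/30
schange = structural_change([50, 50], [70, 30])
no_change = structural_change([50, 50], [100, 100])
print(schange, no_change)
```

Because shares sum to one in both years, the factor of one half keeps the index between 0 (no change in the industry mix) and 1 (complete replacement of the industry mix).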

The most important variable in the past literature on regional instability is the measure of regional diversification, and in this study, the entropy measure is applied. This index is derived as:

$$ ENTROPY = \frac{1}{\log k}\sum\limits_{j = 1}^{k} \frac{E_{ij}}{E_{i}} \log \left( \frac{E_{i}}{E_{ij}} \right). $$
(4)

Here, i stands for the ith area and j is the jth industry, k is the total number of industries in the ith area, \( E_{ij} \) is employment in the jth industry in area i and \( E_{i} \) is total employment in area i.

The entropy measure is one of a number of measures that assume that an ideally diversified economy is one that has equal levels of employment in all industries. The greater the concentration of employment in a few industries, the less diversified or more specialised the economy and the smaller the entropy index of diversification. The measure, as expressed in Eq. 4, ranges from 0 for an economy with all employment in one industry to 1 for a perfectly diversified economy.
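A minimal Python sketch of the entropy index follows. It assumes the normalised form of the measure, in which the sum of share-weighted log inverse shares is divided by log k; this is the form consistent with the stated range of 0 to 1. The function name `entropy_index` and the employment counts are hypothetical, and industries with zero employment are assumed to contribute nothing to the sum.

```python
import numpy as np

def entropy_index(industry_employment):
    """Entropy index of diversification, normalised by log(k) to lie in [0, 1]."""
    e = np.asarray(industry_employment, dtype=float)
    k = e.size                            # number of industries (must exceed 1)
    shares = e / e.sum()                  # E_ij / E_i
    nonzero = shares[shares > 0]          # empty industries contribute zero
    return float(np.sum(nonzero * np.log(1.0 / nonzero)) / np.log(k))

# Hypothetical regions with four industries each
diversified = entropy_index([25, 25, 25, 25])   # equal employment in all industries
specialised = entropy_index([100, 0, 0, 0])     # all employment in one industry
intermediate = entropy_index([70, 10, 10, 10])
print(diversified, specialised, intermediate)
```

The two extreme cases reproduce the bounds described above, and any uneven distribution of employment falls strictly between them.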

A number of concerns have been expressed in the literature relating to the use of this and related measures. On a theoretical basis, Conroy (1974) has suggested that the selection of an equal distribution of activities across sectors as a reference point for diversity may be arbitrary. Additionally, Wasylenko and Erickson (1978) have found that several regions, defined as highly specialised using the entropy measure, were in fact characterised by relatively stable economies, while a further concern, raised by Wagner and Deller (1998), is that these measures do not account for any form of inter-industry linkages.

In response to the first concern, Kort (1981) notes that Eq. 4 provides a definition of the entropy measure of diversification; it is not a behavioural equation. Kort (1981) goes on to note that the entropy measure does not imply that a region should have all activities distributed equally across all sectors; rather, it implies that if employment is equally distributed across all sectors, then further diversification is impossible. With regard to the criticism by Wasylenko and Erickson (1978), Kort (1981) suggests that it is the presence of heteroscedasticity and the failure to account for the relationship among industrial diversification, economic instability and region size that result in incomplete or inconsistent analysis. Additionally, in incorporating the variable in an equation explaining the cross-sectional variation of regional economic instability, we are looking for a significant relationship and acknowledge that there may be regions which have a narrow industrial base, as indicated by the entropy index, which have enjoyed a stable employment level.

Alternate methodologies such as the portfolio selection model introduced into regional science by Conroy (1974) and the proposals of Siegel et al. (1995), Wagner and Deller (1998) and Wagner (2000) require additional data. In the case of the portfolio selection model, a detailed time-series of employment by industry is required, while the input–output approach of Siegel et al. (1995), Wagner and Deller (1998) and Wagner (2000) requires input–output tables for each region being studied. For the 125 LGAs of Queensland being studied in this paper, the only available disaggregated employment data are limited to a five yearly census. For this reason, portfolio selection or input–output based methods, which include various forms of industry linkages, are not a viable alternative.

Finally, spatially lagged variables may be important in determining the instability experienced by a regional economy. In particular the instability a region experiences may affect neighbouring regions; this seems likely in cases where regional economies are linked through inter-regional trade or population flows. In this study this hypothesis is formally tested through the use of spatial data analysis techniques.

3 Econometric methodology

Regional science has always recognised the role of space in determining regional economic performance; space is also increasingly being recognised in empirical modelling through the use of techniques that formally incorporate a role for geographic location. These techniques allow the specification and testing of models that incorporate geographic spillover effects or specify a dependence between observations at different points in geographic space.

In this study two types of spatial econometric models were considered, the spatial lag and error models.

The spatial lag model takes the form

$$ Y = \rho WY + X\beta + \varepsilon , $$
(5)

while the spatial error model is defined as

$$ \begin{array}{rcl} Y & = & X\beta + \varepsilon \\ \varepsilon & = & \lambda W\varepsilon + \mu \end{array} $$
(6)

where Y is a vector of N observations of the dependent variable, X is an N×K matrix of observations of the explanatory variables, β is a vector of regression coefficients, ε is a vector of residuals, μ is an independently and normally distributed error term with constant variance, and W is an N×N spatial weight matrix. Anselin (2002) notes that these models require specialised estimation techniques, such as maximum likelihood or instrumental variables. In this study, maximum likelihood techniques implemented through the R statistical software package have been used.
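To indicate what maximum likelihood estimation of the spatial lag model involves, the following Python sketch maximises the concentrated log-likelihood over a grid of values of ρ. It is not the R code used in this study: the function name `fit_spatial_lag`, the grid search, the ring-shaped neighbour structure and the simulated data are all assumptions made for illustration, and no standard errors are computed.

```python
import numpy as np

def fit_spatial_lag(y, X, W, grid=None):
    """Grid-search ML estimate of rho and beta in Y = rho*W*Y + X*beta + eps."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 191)
    n = y.size
    Wy = W @ y
    # Concentration step: regress Y and WY on X separately
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]
    bL = np.linalg.lstsq(X, Wy, rcond=None)[0]
    e0, eL = y - X @ b0, Wy - X @ bL
    identity = np.eye(n)
    best_rho, best_ll = None, -np.inf
    for rho in grid:
        # Log-determinant of the Jacobian term |I - rho*W|
        sign, logdet = np.linalg.slogdet(identity - rho * W)
        resid = e0 - rho * eL
        ll = logdet - 0.5 * n * np.log(resid @ resid / n)  # constants omitted
        if sign > 0 and ll > best_ll:
            best_rho, best_ll = float(rho), ll
    beta = b0 - best_rho * bL   # beta implied by the chosen rho
    return best_rho, beta

# Simulated example (hypothetical): 400 regions on a ring, two neighbours each
rng = np.random.default_rng(0)
n = 400
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5     # row-standardised weights
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_rho, true_beta = 0.5, np.array([1.0, 2.0])
y = np.linalg.solve(np.eye(n) - true_rho * W, X @ true_beta + rng.normal(size=n))

rho_hat, beta_hat = fit_spatial_lag(y, X, W)
print(round(rho_hat, 2), np.round(beta_hat, 2))
```

The log-determinant term is what distinguishes this estimator from OLS applied to an equation that includes WY as a regressor; omitting it would ignore the simultaneity between a region's outcome and those of its neighbours.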

The weight matrix W shows the interconnectedness of the areas in the sample; each element \( w_{ij} \) in W tells us the strength of interaction between the pair of regions i and j. Generally, it is expected that neighbouring areas would have a stronger interaction (larger \( w_{ij} \)) compared to geographically distant areas.

In this study a first-order spatial weight matrix has been used. In this case, a symmetric matrix is defined by having the element (i,j) set equal to 1 if i and j are neighbours and 0 otherwise. By convention, the diagonal elements are set to zero, i.e. \( w_{ii} = 0 \). Before use in estimation, the weight matrix is row-standardised, denoted by the superscript s, with each of the non-zero elements being defined as \( w^{s}_{ij} = w_{ij} / \sum _{j} w_{ij} \). In this matrix, the elements of each row sum to 1. This manipulation facilitates the interpretation of the weights as an averaging of neighbouring values and also ensures the comparability between models of the spatial parameters in many spatial stochastic processes (Anselin and Bera 1998).
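The construction of the row-standardised weight matrix can be sketched in Python as follows. The four-region chain of neighbours and the function name `row_standardise` are hypothetical; regions without neighbours, should any exist, are assumed to keep a row of zeros.

```python
import numpy as np

def row_standardise(adjacency):
    """Divide each row of a 0/1 neighbour matrix by its row sum."""
    w = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(w, 0.0)                          # a region is not its own neighbour
    row_sums = w.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)      # avoid dividing by zero
    return w / safe

# Hypothetical chain of four regions: 0-1, 1-2 and 2-3 are neighbours
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
W = row_standardise(adjacency)

values = np.array([10.0, 20.0, 30.0, 40.0])
spatial_lag = W @ values    # each entry is the mean of that region's neighbours
print(W.sum(axis=1), spatial_lag)
```

Multiplying a variable by the row-standardised matrix therefore returns, for each region, the average value of that variable among its neighbours. Note that row-standardisation generally destroys the symmetry of the original 0/1 matrix.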

The spatial lag model, shown in Eq. 5, is related to the distributed lag interpretation of time-series econometrics. The lagged dependent variable, WY, can be seen as equivalent to the sum of a power series of lagged dependent variables stepping out across a map, with the impact of spillovers declining with successively higher powers of ρ. This may be termed a structural autoregressive relationship, and one would expect it to be based on economic processes. An alternate specification might be the spatial error model, shown in Eq. 6. This model presupposes a shared spatial process affecting all variables. This spatial process is frequently interpreted as indicating missing variables. In this model, λ is the residual spatial autocorrelation coefficient and represents unmodelled shocks. These sorts of effects include regional characteristics that are not part of the model but affect neighbouring regions similarly. Anselin (1999) notes that this type of regression incorporates a special case of a non-spherical error term. In this situation, OLS remains unbiased, but it is no longer efficient, and the classical estimation of standard errors will be biased.
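The power series interpretation can be checked numerically. Solving the spatial lag model for the dependent variable gives a multiplier matrix equal to the inverse of (I − ρW), which for |ρ| < 1 and a row-standardised W equals the sum I + ρW + ρ²W² + …. The Python sketch below compares the two for a hypothetical four-region chain and an assumed value of ρ = 0.4.

```python
import numpy as np

# Hypothetical row-standardised weights for a chain of four regions
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho = 0.4   # assumed spatial autoregressive coefficient

# Exact spatial multiplier from the reduced form of the lag model
exact = np.linalg.inv(np.eye(4) - rho * W)

# Truncated power series I + rho*W + rho^2*W^2 + ...
series = np.zeros((4, 4))
term = np.eye(4)
for _ in range(60):
    series += term
    term = rho * (term @ W)

# Column 0 holds the effect on every region of a unit shock in region 0
shock_effects = exact[:, 0]
print(np.round(shock_effects, 3))
```

The first column of the multiplier shows a shock in region 0 reaching its immediate neighbour most strongly and fading with each further step along the chain.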

From this discussion it can be seen that the inclusion of spatial effects into an applied econometric model is typically motivated either on theoretical grounds, following the formal specification of spatial interaction in an economic model, or on practical grounds, due to the peculiarities of the data. In the empirical model represented by Eq. 1, it was hypothesised that the amount of instability experienced by a region will affect the instability of neighbouring regions, i.e. our model of regional instability formally incorporates spatial effects on theoretical grounds. Consequently, for this hypothesis to be accepted, the data must support the spatial lag model.

4 Model estimation and evaluation

The first step in modelling involved the estimation of the model using conventional OLS techniques. This approach has been used by Smith and Gibson (1988), Wundt (1992) and Malizia and Ke (1993). Specialised techniques have been developed to study geographically related data, and the next stage of the investigation involved determining whether spatial autocorrelation was present in the residuals of the equation estimated and, if so, whether it is best represented by a spatial lag or spatial error model.

A series of tests have been developed, and the results derived from the application of some of these tests are presented in Table 1. The tests used here comprise the Moran I statistic, Lagrange Multiplier (LM) error and LM lag tests and robust versions of these tests.Footnote 3 The results of applying these tests to the residuals of the OLS version of the model shown in Table 2 are presented in Table 1.

Table 1 Tests for residual spatial autocorrelation
Table 2 OLS, spatial lag and spatial error model estimates

The results of the Moran I test suggest that we can reject the hypothesis of spatial independence due to the small marginal probability and conclude that the residuals exhibit spatial dependence. The Moran I test is perhaps the most commonly used specification test for spatial autocorrelation. Anselin et al. (1996) note that this test consistently outperforms other tests in terms of power in simulation results. A limitation of the test, however, is that it provides no indication of whether the spatial autocorrelation present in the residuals is due to a true spatial process, best represented by a spatial lag model, or an error process, best represented by a spatial error model. On the other hand, the LM lag and error tests have been designed to determine the characteristic of spatial dependence and whether it is best represented by a spatial lag or error process. However, the LM lag and error tests may be affected by the presence of the alternative form of spatial dependence. For this reason the robust forms of these tests have also been developed.
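For reference, the Moran I statistic for a vector of regression residuals e is (N/S₀)(e′We)/(e′e), where S₀ is the sum of all elements of W; with a row-standardised matrix S₀ equals N. The Python sketch below computes the statistic only, not its marginal probability. The function name `morans_i`, the weight matrix and the residual vectors are hypothetical.

```python
import numpy as np

def morans_i(residuals, W):
    """Moran's I for residuals e: (N / S0) * (e'We) / (e'e)."""
    e = np.asarray(residuals, dtype=float)   # assumed to have zero mean (OLS with intercept)
    W = np.asarray(W, dtype=float)
    s0 = W.sum()                             # sum of all weights
    return float((e.size / s0) * (e @ W @ e) / (e @ e))

# Hypothetical row-standardised weights for a chain of four regions
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)     # similar residuals are adjacent
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)   # neighbours have opposite signs
print(clustered, alternating)
```

Positive values indicate that regions with similar residuals tend to be neighbours, which is the pattern the test is designed to detect in the OLS residuals.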

The LM error and lag tests shown in Table 1 seem to suggest that the spatial error model is the most appropriate for this data set. However, the LM lag test is also significant and suggests that the data will support a spatial lag model, while the robust forms of both tests have returned insignificant results, although the robust form of the spatial error test is only marginally insignificant at conventional levels of significance. For this reason, both spatial lag and error models were estimated and the results, along with those of the OLS estimation, are presented in Table 2. The first three columns in this table provide the results from the model estimated using OLS, while columns 4 through 6 and 7 through 9 provide the coefficients, z values and associated probabilities of the spatial lag and error models, respectively.

It is apparent from this table that the hypothesis that regional instability is a function of the instability of neighbouring regions is supported by the spatial lag model: ρ, the coefficient of the spatially lagged dependent variable, is significant at the 5% level, and the LR error statistic, which tests for spatial autocorrelation in the residuals of this model, marginally fails to detect remaining spatial dependence at the 5% level. However, the log likelihood (LL) and Akaike Information Criterion (AIC) marginally favour the choice of the error model, as do the test results shown in Table 1. It is also apparent from Table 2 that heteroscedasticity does not appear to be a problem, with the Goldfeld–Quandt test insignificant in all versions of the model (critical value 1.84).

The results from all of these models indicate that regional instability is negatively associated with the size of the region, with the negative coefficient of LPOP significant at the 5% level in all specifications. This supports the idea that larger regions have a greater ability to soak up adverse economic shocks than smaller regions by virtue of the size of the regional labour market and the broader range of industries supported by their regional economies. Higher levels of skill, as captured by the variable PERMGR, are also found to be associated with greater regional instability, while the coefficient of LOWSKILL is negative. This is contrary to commonly held views that lower-skilled employment tends to be more unstable (see, for example, Le and Miller 2000) yet is similar to the finding of Trendle and Tunny (2003) where it was found, for the Queensland public service, that higher-skilled occupations had higher rates of job turnover than lower-skilled occupational categories.

The variables incorporated to capture the effect of regional industrial and economic performance on regional instability are significant and have the expected sign in all three estimated equations with the exception of GROWTH. GROWTH is significant at the 10% level in the OLS and spatial lag versions of the model and has a positive sign, suggesting that higher rates of growth are associated with higher levels of regional instability, although the support for this is only weak, and is non-existent in the spatial error version of the model. The positive coefficient of SCHANGE indicates that greater structural change results in higher levels of regional economic instability. This result is not surprising as structural change is likely to result in jobs being either destroyed or created, or both simultaneously, leading to discontinuities within regional labour markets. Such factors are likely to destabilise regional labour markets as evidenced by the positive and significant coefficient on SCHANGE. Finally, the coefficient of ENTROPY, the index of industrial diversification, is negative and significant in all versions of the model, supporting the hypothesis that increased regional diversification is associated with increased regional stability.

5 Conclusions

The spatial error model seems to be favoured over the spatial lag model in terms of the diagnostics presented in Table 2. If the intention of the study were to provide a precise estimate of the equation represented in Table 2, the diagnostics provided in Tables 1 and 2 would drive the model selection process. In this case a highly significant LM lag or error test could be used to inform model selection. Unfortunately, the results presented in Table 1, although tending to favour a spatial error model, provide no clear direction for model selection. In the present case, however, the hypothesis outlined in Sect. 2 included the proposal that a region’s employment instability is, in part, determined by the instability of its geographic neighbours. Model selection is therefore driven as much by theoretical concerns as by the diagnostics, and on theoretical grounds the spatial lag model is the most appropriate specification. Thus, while the residual tests and the model diagnostics tend to favour the spatial error specification, the spatial lag formulation is also found to be supported by the data. In addition, the LR test for residual spatial autocorrelation is insignificant at the 5% level, suggesting that this specification of the model has overcome the problem of residual spatial autocorrelation.

The results of the spatial lag model, by and large, confirm the conclusions of earlier studies that found that regional diversification acts to reduce regional instability. However, this study widens the implications of earlier work in three directions. Firstly, the regions used in this analysis more closely correspond to functional economic units than larger geographic units, such as state economies. Secondly, where functional economic units have been used in previous research, they have been confined to metropolitan areas; in contrast, the regions incorporated in this study are more diverse, with the results being applicable to both densely settled urban centres with relatively high levels of manufacturing and services employment and sparsely settled rural regions heavily dependent on agriculture and mining. Finally, spatial econometric techniques have been used to explore the role of geographic location on regional economic instability.

The spatial data techniques used in this study allow the formal incorporation of regional spillover effects. These techniques are particularly useful in the analysis of geographic data sets such as those used in this study. While the analysis undertaken in this study suggests that the spatial error specification is the most appropriate, suggesting that spatial spillovers in the form of ignored or omitted variables are important, the model diagnostics also support the spatial lag model, and this model provides evidence that spatial spillover effects are significant, with the instability of regions affecting the instability of neighbouring regions.