1 Introduction

The purpose of this study is to construct a multi-dimensional Rural Development Index (RDI) as a proxy (a composite indicator, CI) describing the overall level of regional development and the quality of life in individual rural regions at NUTS-4 level. Given the growing demand for composite development indicators in applied policy analysis (e.g. in the evaluation of rural development/structural programmes), the potential gains from having a multi-dimensional regional/rural development index are straightforward. As a composite indicator, the proposed RDI can be applied to the analysis of the main determinants of rural/regional development in individual rural areas as well as to the assessment (i.e. the measurement of the impact) of cohesion policy and RD/structural programmes at various regional levels (Michalek 2007, 2009).

2 Application of an RDI to Policy Analysis of Rural Development

2.1 A Measure of a Sustainable Rural Development at Regional/local Level

A full understanding of the economic and social dimensions of sustainable development in rural areas remains one of the chief policy issues (Bryden 2003). Given the multiple dimensions (e.g. economic, social, environmental) of rural development, there is great interest among policy makers in learning more about the magnitude of, and trends in, overall welfare in rural regions. There is also a desire to learn about the importance of factors fostering the overall growth and convergence of individual regions. Typically, GDP per capita (calculated at NUTS-2 or NUTS-3 level) is used as: (a) a standard measure of the regional level of welfare, (b) a basic eligibility criterion for EU funding under structural funds, and (c) the main quantitative indicator of the effectiveness of pursued policies. Yet numerous deficiencies of this indicator make its application to the measurement of the overall level of socio-economic development of individual rural areas problematic:

  1. Regional GDP per capita as a measure of local welfare largely ignores many important aspects of the regional/local quality of life, e.g. education, health, intra-regional income variation, environmental quality, etc.;

  2. Regional GDP per capita does not take into account the price variation and purchasing power within a country;

  3. Regional GDP per capita can be biased due to interregional imbalances in commuting; and

  4. GDP per capita is usually not available at lower regional levels (i.e. NUTS-4 and NUTS-5 levels, etc.).

Deficiencies of the GDP measure and the need to take into consideration various economic, social and environmental aspects of development in individual rural areas have long stimulated a search for alternative and more objective measures of overall rural development, e.g. the concept of well-being, a multi-dimensional concept of regional performance, or regional quality of life, etc.Footnote 1 While work in this area is still progressing, relevant policy questions in this context are:

  • Can the overall development (beyond GDP) and individual performance of complex rural systems, including their economic, social and environmental domains, be objectively measured and compared across rural regions?

  • If yes, how big are “the real” differences between individual rural regions/areas leading/lagging behind in terms of their overall development?

  • To what extent have specific domains of rural development (e.g. production, employment, education, environment, etc.) contributed to the overall development of individual rural regions?

  • Have pursued economic, social and environmental policies resulted in regional divergence or convergence of individual rural territories?

Clearly, answers to the above questions can be used not only in a “standard” regional analysis, but also in the evaluation of policies and programmes targeting specific rural areas, e.g. by quantitatively measuring the net effect of rural development/structural policies, programmes or specific RD measures (Michalek 2009; EC 2010).

2.2 Problems with the Use of Partial Indicators

Typically, basic knowledge about the performance of rural economies is obtained on the basis of partial indicatorsFootnote 2 (PI) available from secondary statistics for a specific individual region. Though PI are widely used, their applicability as impact indicators for assessing the success/failure of pursued RD policies is limited. Firstly, it is very difficult to select a “right” proxy for any broader rural development domain (e.g. rural education, environmental condition or health situation). Secondly, interpreting development on the basis of a large number of PI can be especially problematic when opposite or dissimilar trends are observed in the same area.Footnote 3 Thirdly, the direct use of PI for an analysis of the overall (i.e. economic, social and environmental) development of rural areas is challenging if the weights of these indicators in overall rural/regional development are not known.

2.3 A Composite Index Approach

A composite index approach may offer a possible solution to the above problems. The expected advantages of using a composite development index in policy analysis include comprehensiveness, multi-dimensionality and the ability to reduce empirical sets of hundreds or thousands of available indicators to one synthetic measure (Saisana and Tarantola 2002; OECD 2005).

Ideally, a composite development indicator, e.g. a Rural Development Index (RDI), should measure multi-dimensional concepts of rural growth/decline by embracing the performance of the most important rural development domains, e.g. economic output (incl. agriculture, food industry, rural tourism, etc.), investment, employment, poverty, education, health, housing conditions, crime, environment, urbanization and land use, etc. A good RDI should be able to aggregate the above domains into a one-dimensional indicator using objective and statistically verifiable weights. As a composite indicator (CI), an RDI should also fulfil a number of general conditions (Hagerty et al. 2001; OECD 2005), e.g. it should be based on a sound theoretical framework; the selection of variables should take into consideration their relevance, analytical soundness, accessibility, etc.; construction of the index should follow an exploratory analysis investigating the overall structure of the indicators used; the index should be reported as a single number but could be broken down into components/domains, etc.

The review of various empirical studies concerned with the construction of a composite index for policy analysis shows that its creators have to cope with numerous methodological issues, of which the most crucial are (Berger-Schmitt and Noll 2000; Deutsch et al. 2001; Henderson and Black 1999; Ontario Social Development Council 2001; Rahman et al. 2005; Kaufmann et al. 2007):

  • Selection of appropriate variables/coefficients and balancing between objective vs. subjective indicators;

  • Weighting the variables/indicators according to their relative importance;

  • Application of unbiased aggregation techniques; and

  • Making the index useful for policy purposes (i.e. in programme evaluation).

A comprehensive description of various methodologies and problems linked to a derivation of a meaningful QOL/RDI in policy analysis is provided in Kaufmann et al. (2007). The authors showed that in order to be relevant for an empirical policy analysis (e.g. policy evaluations) a composite QOL/RDI index should meet a number of general (e.g. efficiency, effectiveness, relevance, etc.) and specific (e.g. regionality, rurality, simplicity, etc.) policy criteria. Given the above criteria, Kaufmann et al. (2007) suggest some practical consequences for the construction of a composite RD index at disaggregated level, such as:

  • The RDI should be built on either an indirect approach (i.e. using available secondary data) or a hybrid approach (i.e. combining secondary data with direct surveys on various aspects of quality of life in rural areas). A solely direct approach (i.e. interviewing the population living in the area) is not adequate due to high costs, low frequency of data collection and a high level of subjectivity.Footnote 4

  • The RDI should be based on a method that allows empirical derivation of the weights from an econometric model.

  • The form of the Index should be as simple as possible (e.g. a one equation model) to be better understood by the broader public.

  • Data for the index must be available cheaply or freely at the regional level over time with the possibility of rural–urban distinctions.

In the following chapters we show how an RDI can draw on these considerations and how it can be used for practical policy analysis.

2.4 Overview of Methodological Approaches Applied to Construction of a Composite Development Index

Assuming that the overall level of rural development is closely related to the concept of the quality of life of the population living in a given area, the most well-known methodological approaches recently applied to construct an index measuring overall development and/or quality of life at regional level are: (1) direct or expert approach;Footnote 5 (2) factor analysis;Footnote 6 (3) structural equation modelling approach;Footnote 7 (4) hedonic price approach;Footnote 8 (5) structural models of growth;Footnote 9 (6) efficiency transformation approach;Footnote 10 or (7) market/residence approach, spatial equilibrium approach and compensating differentials.Footnote 11 An in-depth review of these approaches would go beyond the scope of this paper. Yet, the main problems identified in constructing a quality of life index with these approaches are: (1) arbitrary selection of a proxy serving as a natural identification of a direct equivalence of the quality of life in a specific geographical area (or an amenity’s capitalization), e.g. wages/incomes, house prices, rents, land prices, net-migration, business location decisions, etc.; (2) the assumption that, within a given geographic area/region, the particular proxy (e.g. land or housing prices) remains homogeneous and the quality of life can be expressed in this one-dimensional space; (3) numerous problems with assigning objective weights to selected socio-economic proxies/indicators.
Regarding the latter, the major difficulties associated with the construction of weights can be summarized as follows: (1) in the vast majority of relevant studies the choice/selection of “the most representative” socio-economic indicators was carried out arbitrarily, leaving other available indicators unused or downgraded as “non-representative”; (2) experts’ weights of selected indicators often appear extremely subjective and are not directly transferable from one geographic area to another; (3) different normalizations of variables can result in different weights; (4) some weights become inconsistent when a larger number of indicators/coefficients/variables is analyzed; (5) weights based on a purely statistical analysis of factors (e.g. on factor loadings) appear to miss an appropriate welfare (social utility) context; (6) many assigned weights tend to be region-specific, so they are not applicable to other regions even in the same country.

In the following section, we directly address the above issues from both a methodological and a practitioner’s perspective.

3 Construction of the RDI

3.1 Empirical Studies on Quality of Life and Migration

Assuming equivalence between the level of rural development and rural quality of life, the methodology used in our study for the derivation and construction of a composite RDI draws upon research on the relationship between quality of life and migration.Footnote 12 In migration studies incorporating characteristics of origin and destination regions, the most frequently reported motives for in-migration flows into destination areas (pull factors) included a higher probability of obtaining employment, better housing, nicer neighbourhood, more pleasant community, lower pollution, lower crime rates, better health service, better educational facilities, more favourable human-made and natural environments, etc. Among the factors found to determine out-migration from origin areas (push factors), the most important were poor location amenities, poor public transportation, lack of good medical facilities, unemployment, economic and environmental distress, etc. (Williams and McMillen 1980; Roseman 1977; Michalos 2003).Footnote 13 The “pull–push” approach assumes that numerous objective indicators describing various regions (e.g. unemployment, crime rate, infant mortality, level of prices, etc.) can be transformed into a subjective judgement of the overall quality of life on which any migration decision is based.Footnote 14 An extension of pure origin–destination migration models can be found in gravity, modified gravity or spatial interaction models (Tinbergen 1962; Anderson 1979; Sen and Smith 1995), which forecast migration flows as a function of distance, the population size of the respective areas and differences in the characteristics of both areas (Greenwood 1997; Andrienko and Guriev 2003).Footnote 15

From the perspective of our study, a particularly interesting class of migration models comprises those which forecast the probability of migration by incorporating information on the relative frequency of non-migration (e.g. probit or logit models), thus providing a natural transition from the gravity model to the more behaviourally grounded modified gravity models.Footnote 16 The modelling of the migration decision also depends on the type of data available. For example, in models in which data are available in the form of a full origin–destination matrix of migration flows, the migration decision may be modelled using spatial econometrics (Ibarra and Soloaga 2005; Frazier and Kockelman 2005; Ashby 2007; Lundberg 2002; Verkade and Vermeulen 2005). Irrespective of the selected object of such analysis (individual or household) and the chosen methodological approach (non-spatial vs. spatial econometrics), the major determinants of a migration decision appear to be variables describing:

  • Differences in factors determining the quality of life in origin and destination regions, and

  • Transaction costs related to such a decision.

In the simplest form, the incorporation of transaction costs into modified gravity models involves distance as a proxy for the costs of moving.Footnote 17 Although the importance of some direct costs related to distance may diminish over time, e.g. as transportation and communication systems become relatively cheaper and more accessible, other important costs remain high and directly proportional to distance, e.g. psychological costs, direct costs of moving, some search costs, etc.Footnote 18

3.2 Derivation of Weights in the RDI

The methodological approach applied in our study for the derivation of weights in the RDI draws on the supposition that quality of life and migration are closely linked (e.g. Greenwood et al. 1991; Douglas and Wall 1993, 2000). Greenwood et al. (1991) estimated compensating income differentials between the states of the US on the basis of net-migration rates.Footnote 19 In Douglas and Wall (1993) the QOL index was derived directly as proportional to the positive scores computed for each province on the basis of net-migration coefficients across all destination provinces. Douglas and Wall (2000) applied regression techniques to identify the portion of migration flows that was correlated with income opportunities in order to compute a measure of the relative levels of living standards in different regions.Footnote 20

The approach applied in our study to the derivation of weights in the RDI builds upon Tiebout (1956) and Douglas and Wall (1993, 1999, 2000), who argue that cross-migration rates provide the richest and most reliable source of data on the relative attractiveness of different locations. Yet, contrary to previous studies, the approach used in our study neither implies equivalence between quality of life and migration, nor expresses the quality of life as a parameter that is independent of the individual characteristics of a given location.Footnote 21 In fact, as we show below, the method proposed in this study allows for the computation of the quality of life/rural development index even in regions exhibiting zero in- or out-migration.

3.3 The Model

Using the notation of Douglas and Wall (1993), we assume that the individual perception of quality of life (QL) of each person l living in region i can be expressed as a real-valued function q that captures the common component of the utility function across individuals, with region-specific characteristics \( {\mathbf{Z}}_{i} \) as arguments (Eq. 1):

$$ QL_{i}^{l} = {\mathbf{q}}\left( {{\mathbf{Z}}_{{\mathbf{i}}} } \right) + \varepsilon^{{\mathbf{l}}}_{{\mathbf{i}}} $$
(1)

where l is an individual person; q is a real-valued function that captures the common component of the utility function across individuals; \( {\mathbf{Z}}_{i} \) is the vector of characteristics of region i; and \( \varepsilon_{i}^{l} \) is a stochastic element capturing factors unique to individual l.

In this approach \( QL_{i}^{l} \), which is individual l’s perception of his/her own quality of life in region i, has to be distinguished from \( {\mathbf{q}}_{i} \) in (2), which stands for the “objective” quality of life in region i and is expressed as a function of the vector of characteristics Z generally available in region i.

$$ {\mathbf{q}}_{{\mathbf{i}}} = {\mathbf{q}} \, ({\mathbf{Z}}_{{\mathbf{i}}} ) $$
(2)

where \( {\mathbf{q}}_{i} \) is the “objective” quality of life in region i.

Following Douglas and Wall (1993), we define the cost of moving from region i to region j as \( {\mathbf{C}}_{ij} \) and denote the decision of individual l regarding migration from region i to region j by \( mig_{ij}^{l} \), such that \( mig_{ij}^{l} = 1 \) if individual l migrates from i to j and \( mig_{ij}^{l} = 0 \) otherwise.

Douglas and Wall (1993) showed that if an individual l decides to move from region i to region j, the quality of life in region j, i.e. \( QL_{j}^{l} \), less the costs of moving from i to j must be higher than the quality of life in region i (\( QL_{i}^{l} \)).

Formally,

$$ mig_{ij}^{l} = \left\{ 1 \right\}\;{\text{if}}\;QL_{j}^{l} - {\mathbf{C}}_{{{\mathbf{ij}}}} > QL_{i}^{l} $$
(3)

Given (3), a decision of an individual l to move from i to a region j depends on the relative quality of life in all possible destination regions n less costs of moving to regions n compared with the quality of life in the origin region i.

Thus,

$$ QL_{j}^{l} - {\mathbf{C}}_{{{\mathbf{ij}}}} > QL_{i}^{l} \quad {\text{and}}\quad QL_{j}^{l} - {\mathbf{C}}_{{{\mathbf{ij}}}} > QL_{n}^{l} - {\mathbf{C}}_{{{\mathbf{in}}}} \quad {\text{for all}}\;n \ne j $$
(4)

In terms of utility maximization, all else being equal, it is expected that individuals will move to a new location j if the perceived utility (corrected for respective transaction costs/moving costs) from doing so is greater than the utility of moving to any other location (corrected for respective transaction costs/moving costs) or not moving at all.
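The decision rule in Eqs. (3) and (4) can be illustrated with a minimal Python sketch. The function name `choose_region`, the array layout and all numbers are hypothetical and serve only as an illustration. The sketch assumes that the cost of staying is zero and that an individual moves only when the net gain is strictly positive.

```python
import numpy as np


def choose_region(ql, cost, origin):
    """Return the region chosen by one individual under Eqs. (3) and (4).

    ql[n]      : the individual's perceived quality of life in region n
    cost[i, n] : cost of moving from region i to region n (assumed 0 for i == n)
    origin     : index of the region where the individual currently lives
    """
    net = ql - cost[origin]            # QL_n - C_in for every candidate n
    best = int(np.argmax(net))         # most attractive region net of costs
    # Move only if the net quality of life strictly exceeds that of staying.
    return best if net[best] > ql[origin] else origin


# Hypothetical values for three regions
ql = np.array([5.0, 7.0, 6.5])
cost = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 1.0],
                 [3.0, 1.0, 0.0]])

destination = choose_region(ql, cost, origin=0)
mig = np.zeros(len(ql), dtype=int)     # mig[j] plays the role of mig_0j
if destination != 0:
    mig[destination] = 1
print(destination, mig)
```

In this example region 2 offers a higher perceived quality of life than the origin, but the higher moving cost makes region 1 the preferred destination.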

The “real” QOL in a possible destination relative to an individual’s current residence is the prime determinant of the probability that the individual will moveFootnote 22 to that location. In this sense, migration is a better measure of utility improvement than any other measure of well-beingFootnote 23 (preferences are manifested through revealed action; Ashby 2007).

The migration rate is defined as in Eq. (5):

$$ {\mathbf{MR}}_{{{\mathbf{ij}}}} = \Upsigma mig_{ij}^{l} /\left( {{\mathbf{P}}_{{\mathbf{i}}} *{\mathbf{P}}_{{\mathbf{j}}} } \right) $$
(5)

where \( {\mathbf{MR}}_{ij} \) is the rate of migration between regions i and j;Footnote 24 \( \Upsigma mig_{ij}^{l} \) is the total inflow of individuals l who migrate from region i to region j; and \( {\mathbf{P}}_{i} \), \( {\mathbf{P}}_{j} \) are the populations of regions i and j (only those who are at risk of migration).
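As a minimal illustration of Eq. (5), the following Python sketch computes migration rates from a matrix of gross flows and a vector of populations. The function name and the numbers are hypothetical; marking the diagonal as missing is an assumption of this sketch, since Eq. (5) concerns flows between different regions.

```python
import numpy as np


def migration_rates(flows, population):
    """MR_ij = (number of movers from i to j) / (P_i * P_j)."""
    flows = np.asarray(flows, dtype=float)
    population = np.asarray(population, dtype=float)
    rates = flows / np.outer(population, population)
    np.fill_diagonal(rates, np.nan)    # assumption: no within-region rate
    return rates


# Hypothetical gross flows (rows: origin i, columns: destination j)
flows = [[0, 20, 5],
         [10, 0, 0],
         [4, 8, 0]]
population = [1000, 2000, 500]         # population at risk of migration

mr = migration_rates(flows, population)
print(mr)
```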

Given Eqs. (3)–(5), an econometrically estimable form of \( E({\mathbf{MR}}_{ij}) \) can be expressed in terms of a function f, with \( {\mathbf{Z}}_{k}^{i} \) and \( {\mathbf{C}}_{ij} \) as the main arguments. In Michalek (2009) various forms of f are discussed and separately estimated using appropriate econometric methods.

In contrast to previous studies, the synthetic rural development index (RDI) is calculated in our study according to Eq. (6) on the basis of regional characteristics \( {\mathbf{Z}}_{i} \) and their individual weights \( \beta_{k} \), which are derived from the estimated migration function (with \( M_{ij} \) or \( MR_{ij} \) as the dependent variable).Footnote 25 In our model, the estimated weights \( \beta_{k} \) represent the relative “importance” or “social value” assigned by society (composed of those who migrated and those who stayed) to each of the characteristics \( {\mathbf{Z}}_{k}^{i} \) representing various aspects of the quality of life in all origin and destination regions i.

Formally, the RDI in region i can be expressed as a linear function of the region-specific characteristics \( {\mathbf{Z}}_{k}^{i} \) and their weights \( \beta_{k} \) (Eq. 6):

$$ {\mathbf{RDI}}_{{\mathbf{i}}} = h({\varvec{\upbeta}}_{{\mathbf{k}}} , {\mathbf{Z}}_{{\mathbf{k}}}^{{\mathbf{i}}} ) = \Upsigma_{k} {\varvec{\upbeta}}_{{\mathbf{k}}} \times {\mathbf{Z}}_{{\mathbf{k}}}^{{\mathbf{i}}} $$
(6)

where \( {\mathbf{RDI}}_{i} \) is the rural development index (an equivalent of the quality of life index) in region i; \( {\mathbf{Z}}_{k}^{i} \) are the measurable characteristics k of region i; and \( \beta_{k} \) are the weights for each characteristic \( {\mathbf{Z}}_{k} \) derived from a given migration model (see Sect. 4).
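Eq. (6) amounts to a matrix-vector product, as the following minimal Python sketch illustrates. The factor scores and weights are invented for illustration and are not estimates from this study; the function name is likewise hypothetical. Because the index needs only the regional characteristics and the weights, it can also be evaluated for a region with no recorded migration.

```python
import numpy as np


def rural_development_index(Z, beta):
    """RDI_i = sum_k beta_k * Z_ik, for Z of shape (regions, factors)."""
    Z = np.asarray(Z, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return Z @ beta


# Hypothetical factor scores for three regions and two factors
Z = [[1.0, -0.5],
     [0.0, 2.0],
     [-1.0, 0.5]]
beta = [0.4, 0.1]                      # hypothetical weights

rdi = rural_development_index(Z, beta)
ranking = np.argsort(-rdi)             # regions ordered from highest to lowest RDI
print(rdi, ranking)
```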

In empirical work, due to the multidimensionality of relevant data, a particular importance is to be assigned to:

  1. An appropriate selection (or estimation) of \( {\mathbf{Z}}_{k}^{i} \) describing major attributes of the overall development and the quality of life in individual rural areas.

  2. An appropriate estimation of the “social” weights \( \beta_{k} \).

In our study \( {\mathbf{Z}}_{k}^{i} \) are constructed empirically using the factorization method applied to all relevant partial indicators (coefficients and variables) \( VAR_{i} \) available in a given country at regional level. The latter are nested in \( {\mathbf{Z}}_{k}^{i} \) (i.e. RD domains) and describe in detail various specific aspects of rural development in each individual area i (e.g. number of enterprises, employment coefficients, water/air pollution coefficients, schools, health facilities, etc., per km² or per capita). Since the basic objective of this intermediate analysis is to reduce the dimensionality of the analysis, \( {\mathbf{Z}}_{k}^{i} \) are empirically estimated using the principal-component factor method.Footnote 26 The number (k) of extracted factors \( {\mathbf{Z}}_{k} \) to be used in the construction of the RDI is usually unknown, so various criteria are commonly applied in empirical studies to determine k, e.g. eigenvalues larger than 1 (Kaiser criterion), a fixed number of factors, etc. In our study the optimal k is determined endogenously by ensuring that the derived factors/principal components \( {\mathbf{Z}}_{k} \) (number and values) also guarantee the best fit of the estimated migration model (see Sect. 4). Thus, given that both the RDI and the estimated migration function share several common arguments (\( {\mathbf{Z}}_{k} \)), the “optimal” number of factors/principal components is empirically derived using an iterative procedure: (1) starting from an arbitrary k, performing the factorization, deriving \( {\mathbf{Z}}_{k} \) and estimating the respective migration function; (2) iterating over k and repeating all steps in (1); and (3) selecting the optimal k (the result of the factor/principal component analysis and the estimation of a given migration model) and the vector of \( {\mathbf{Z}}_{k} \) that maximize the likelihood function or any other relevant criterion applied in the econometric estimation of the respective migration model.

Given the estimates of \( \beta_{k} \) (“social” weights) for all individual factors \( {\mathbf{Z}}_{k}^{i} \) and the factor loadings of each observable individual rural development attribute (coefficient/variable) \( VAR_{a} \) in all \( {\mathbf{Z}}_{k} \) (factorization using the principal component method), the “social importance”, i.e. the rank showing the relative contribution of each individual attribute/variable/coefficient/partial indicatorFootnote 27 (\( {\mathbf{R}}_{a} \)) to overall rural development (at the country level), can be computed from Eq. (7).

$$ {\mathbf{R}}_{{\mathbf{a}}} = {\varvec{\Upsigma}}_{{\mathbf{k}}} {\varvec{\upbeta}}_{{\mathbf{k}}} \times {\mathbf{LV}}_{{\mathbf{a}}}^{{\mathbf{k}}} $$
(7)

where \( {\mathbf{R}}_{a} \) is the relative importance (rank) of an individual regional attribute (\( VAR_{a} \)) in overall rural development (at the country level); \( \beta_{k} \) is the “social” weight of a given factor (principal component) \( {\mathbf{Z}}_{k} \) obtained from a relevant migration model; \( {\mathbf{LV}}_{a}^{k} \) is the factor loading of an individual attribute/variable/coefficient (\( VAR_{a} \)) in a given factor (component) \( {\mathbf{Z}}_{k} \); and the sum runs over the k selected factors/principal components.

By applying the above method, the social value of each selected partial rural development attribute \( VAR_{a} \) (i.e. the contribution of an individual partial indicator \( VAR_{a} \) to the overall quality of life and development level) can be measured at the country level. It is equal to the sum, over all k selected factors/principal components \( {\mathbf{Z}}_{k} \), of the attribute’s factor loadings (\( {\mathbf{LV}}_{a}^{k} \)) weighted by \( \beta_{k} \). The combination of the highest factor loadings and the highest “social” weights (in absolute terms) is decisive for the rank obtained by a given variable \( VAR_{a} \) (see Sect. 8).
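The computation in Eq. (7) can be sketched in Python as follows. The loadings, the weights and the indicator names are hypothetical and chosen only to show the mechanics; they are not results from this study.

```python
import numpy as np

# Hypothetical factor loadings LV (rows: attributes VAR_a, columns: factors Z_k)
names = ["enterprises_per_capita", "unemployment_rate", "air_pollution"]
loadings = np.array([[0.9, 0.1],
                     [0.2, 0.8],
                     [-0.7, 0.3]])
beta = np.array([0.5, -0.2])           # hypothetical "social" weights of the factors

# Eq. (7): R_a = sum_k beta_k * LV_a^k
R = loadings @ beta

# Order the attributes from the largest to the smallest contribution
order = np.argsort(-R)
for a in order:
    print(f"{names[a]}: {R[a]:+.2f}")
```

An attribute obtains a large value of R when it loads heavily on factors that themselves carry large weights with the same sign.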

4 Econometric Model Used for Estimation of Weights in RDI

Depending on the availability of data and the research hypothesis, the econometric estimation of the weights \( \beta_{k} \) in the RDI (Eq. 6) can be carried out on the basis of various models (Michalek 2009). The migration model applied for the derivation of weights in the RDI in this study was selected from many alternative modelling approaches, e.g. a net-migration model; a spatial dependence migration model (i.e. the general spatial model, the spatial lag model or the spatial error regression model); a net-migration model (i.e. multi-level mixed-effect or nested error component regression model); and a gross-flow migration model (i.e. multi-level mixed-effect or nested error component regression model), using the selection criteria described in Michalek (2009).Footnote 28 As a result of this model selection, the weights \( \beta_{k} \) in the RDI (Eq. 6) were estimated on the basis of a panel regression model with gross migration flows between rural regions (in a given country) as the dependent variable (Eqs. 8a, 8b). The selected model allows for pair-wise observations of gross migration inflows between each pair of regions i and j, and postulates that these inflows depend both on the differences, as observed by individual migrants, between factor k in region i and the respective factor k in region j (\( \Updelta F_{ij,k} \)) and on the transaction costs of moving from region i to j.

It is important to note that the introduction of transaction costs into the migration model brings about a formal separation of the RDI (consisting of individual factors and related estimated coefficients) from migration. Footnote 29 This is because transaction costs do not enter the index itself, but are used to explain a part of the overall variance in the migration model. In the current version, transaction costs are modelled as a time-invariant variable consisting of two elements, i.e. the distance matrix D and the squared distance matrix D², the latter reflecting the curvature of transaction costs (a quadratic function). As all observable migration inflows between regions are either zero or positive, model (8) can be estimated as a logistic functionFootnote 30 (cf. Schultz 1982; Ashby 2007), whereby the dependent variable reflects the probability distribution of migration from one region to another (it is closely related to the modelling of the microeconomic behaviour of an individual willing to migrate).

Important features of this model are: (1) a comprehensive treatment of all basic- and region-specific characteristics (e.g. economic, social and environmental, etc.) assumed to affect the quality of life at the regional level (and thus intra-regional migrations); (2) introduction of variables representing transaction costs in a migration decision of moving between regions i and j; and (3) a better approximation of the micro-foundation of a migration decision compared with other approaches (e.g. in comparison to a net-migration model).

Model (8) can be estimated in two alternative forms: a) as a panel regression that allows the choice between fixed or random effect models (specification 8a); or b) as a multi-level mixed-effect regression model (8b).

$$ \log (m)_{{{\text{ID}},t}} = \alpha_{0} + D_{\text{ID}} \cdot \delta_{1} + D_{\text{ID}}^{2} \cdot \delta_{2} + \Updelta F_{{{\text{ID}}Kt}} \cdot \beta_{K} + v_{\text{ID}} + \varepsilon_{{{\text{ID}}t}} $$
(8a)

where

M: migration matrix

D: distance matrix

F: factor/principal component matrix

n: number of regions

k: number of factors/principal components

T: number of years

a: index for individual rural development attributes VAR, a = 1…m

i, j: index for regions, i, j = 1…n

p, q: index for factors/principal components, p, q = 1…k

ID: index for region pairs, ID = 1…\( A_{i}^{n} \) (= n(n − 1))

t: index for years, t = 1…T

log (m): \( \log \left( {{\frac{\text{mrate}}{{1 - {\text{mrate}}}}}} \right) \)

mrate: inflows from region i to j divided by (population in i multiplied by population in j)

\( D_{\text{ID}} \): distance between regions i and j

\( D_{\text{ID}}^{2} \): squared distance between regions i and j

\( \Updelta F_{{{\text{ID}}Kt}} \): differences in factors k between regions i and j (for each ID)

\( v_{\text{ID}} \): random intercept at the pair-wise ID level

\( \varepsilon_{{{\text{ID}}t}} \): residual with “usual” properties (mean zero, uncorrelated with itself, uncorrelated with D and F, uncorrelated with v and homoscedastic)

$$ \varepsilon \sim N(0,\sigma_{\varepsilon }^{2} ) $$
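The dependent variable log (m) defined above can be sketched in Python as follows. The function name is hypothetical, and the treatment of zero flows (left as missing, because the log-odds are undefined at a rate of zero) is an assumption of this sketch rather than the procedure documented in the study.

```python
import numpy as np


def log_odds_of_migration(mrate):
    """Return log(mrate / (1 - mrate)); entries outside (0, 1) become NaN."""
    mrate = np.asarray(mrate, dtype=float)
    log_m = np.full(mrate.shape, np.nan)
    valid = (mrate > 0) & (mrate < 1)  # the log-odds exist only on (0, 1)
    log_m[valid] = np.log(mrate[valid] / (1.0 - mrate[valid]))
    return log_m


# Hypothetical migration rates for two regions (rows: origin, columns: destination)
mrate = np.array([[0.0, 1e-5],
                  [2.5e-6, 0.0]])
log_m = log_odds_of_migration(mrate)
print(log_m)
```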

As a random effect model, Model 8a assumes that the random effects occur at the level of the pair-wise migration flows between all regions ij (region as a group variable). Model 8a is thus estimated as a random effect linear regression model with a group variable at the level of ij (ID) by using the GLS random effects estimator (a matrix-weighted average of the between and within estimators).Footnote 31

Version b of the model (Eq. 8b) controls for the possibility of a nested error structure within a region i. In our model the pair-wise data on gross migration flows between regions (ID) form a panel (observed over t years). Since ID can be specific within regions, the model also allows for the specificity/similarity of gross flows (ID) within a given region (i).

$$ \log (m)_{{{\text{ID}}t}} = \hat{\alpha}_{0} + D_{\text{ID}} \cdot \hat{\delta}_{1} + D_{\text{ID}}^{2} \cdot \hat{\delta}_{2} + \Updelta F_{{K,{\text{ID}},t}} \cdot \hat{\beta} _{K} + v_{i}^{(1)} + v_{\text{ID}}^{(2)} + \varepsilon_{{{\text{ID}},t}} $$
(8b)

where

log (m): \( \log \left( {{\frac{\text{mrate}}{{1 - {\text{mrate}}}}}} \right) \)

mrate: inflows from region i to j divided by (population in i multiplied by population in j)

\( D_{ij} \): matrix of distances between regions i, j

\( D_{ij}^{2} \): matrix of squared distances between regions i, j

\( \Updelta F_{ij,k} \): matrix of the differences in factors k between regions i, j

\( v_{i}^{(1)} \): random intercept at the region i level

\( v_{\text{ID}}^{(2)} \): random intercept at the gross migration flow (pair-wise) level nested within the region i level

\( \varepsilon \sim N(0,\sigma_{\varepsilon }^{2} ) \): the residual with “usual” properties (mean zero, uncorrelated with itself, uncorrelated with D and F, uncorrelated with \( v \) and homoscedastic)

Model 8b has two random effect equations. The first is a random intercept at the regional level, and the second is a random intercept at the ID level. Model 8b can be estimated using a restricted maximum likelihood (REML) estimator.

5 Synthesis of the Methodological Approach

The estimation of the RDI was carried out by taking the following steps:

  1. 1.

    Defining relevant rural development domains to be taken into consideration prior to the assessment of the overall impact of the RD programme;

  2. 2.

    Defining variables describing each rural development domain in all regions i;

  3. 3.

    Translating the above variables into meaningful coefficients (e.g. per capita, per km², etc.) in all regions i;

  4. 4.

    Converting those coefficients into region specific factors f i (principal component method) in order to reduce the dimension of the analysis (factor analysis);

  5. 5.

    Deriving weights for each individual factor/principal component f (embracing variables in each rural development domain) to be applied in the construction of the RDI from econometrically estimated migration function (Model 8).

  6. 6.

    Computing for each rural region i a synthetic index RDI i. The latter is defined as a weighted sum of factors (variables, domains) with β k derived from a selected inter- and intra-regional migration function according to Eq. 6 (the optimal number of factors k selected for the construction of the RDI was derived from the maximization of the restricted likelihood function used in the estimation of the intra-regional migration model).

In practice, steps 4 and 5 were performed jointly using an iterative procedure: starting from the minimal number of factors/principal components, this number was increased (through re-estimation of the factor and migration models) until convergence was achieved, with the maximization of the restricted likelihood function of the migration model applied as the main criterion (given the set of estimated factors/principal components); see Sect. 8.
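
The iterative procedure can be sketched as a simple loop. This is only an illustration under stated assumptions: `restricted_loglik` is a hypothetical callable standing in for "re-estimate the factors with k components, re-fit the migration model, and return its restricted log-likelihood", and the stopping rule (stop at the first k that fails to improve the likelihood) is one possible reading of the convergence criterion, not necessarily the exact rule used in the study.

```python
from typing import Callable

def select_number_of_factors(
    restricted_loglik: Callable[[int], float],
    k_min: int = 1,
    k_max: int = 30,
) -> int:
    """Increase k from k_min and stop once the restricted log-likelihood no longer improves."""
    best_k = k_min
    best_ll = restricted_loglik(k_min)
    for k in range(k_min + 1, k_max + 1):
        ll = restricted_loglik(k)
        if ll <= best_ll:
            break  # no further improvement: treat as converged
        best_k, best_ll = k, ll
    return best_k

# Toy stand-in for the factor and migration model re-estimation;
# the likelihood values are invented for illustration only.
toy_loglik = {1: -120.0, 2: -95.0, 3: -90.5, 4: -91.0, 5: -89.0}
chosen_k = select_number_of_factors(lambda k: toy_loglik[k], k_min=1, k_max=5)
```

With this stopping rule the search ends at the first local maximum; a full grid search over all candidate k would be an alternative design.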

6 Domains of an RDI

Generally speaking, the existing literature does not provide a definite answer to the question of which domains and which variables/proxies should be selected into a synthetic/composite index measuring the overall level of economic and social development/quality of life (Jones and Riseborough 2002; Kazana and Kazaklis 2008; Erikson 1993; Johansson 2002; Grasso and Canova 2007). While in international comparison studies some consensus has been achieved concerning the inclusion of specific domains in such an index (the list of an index’s components includes various important quality of life aspects linked to, e.g. democracy, health conditions, etc.),Footnote 32 a similar consensus regarding the appropriate list of welfare components (quality of life domains) in regional economic analysis appears problematic and difficult to reach.Footnote 33

In order to meet relevant policy criteria (e.g. objectivity, transparency and simplicity) and ensure full data comparability across all regions within a given country, an indirect approach was applied in our study. In this approach a country’s available secondary regional statistics (objectively verifiable indicators) representing various aspects of quality of life were used, instead of subjective indicators derived on the basis of sporadic interviews with individuals in selected regions (NUTS-4). An important advantage of this approach is the explicit consideration of all aspects of regional/rural development available from secondary statistics on a regional basis (i.e. economic, social, environmental, infrastructural, administrative, etc.), thus avoiding an arbitrary pre-selection of “the most important” partial indicators based on subjective judgments as to their “social relevance” and “representativeness”. Furthermore, the applied method allows for the assessment of the “social importance” of all individual partial indicators collected at regional level (see Eq. 7). The list of domains linked to various important aspects of rural development in individual regions, together with examples of indicatorsFootnote 34 used in our study, is shown in Table 1.

Table 1 Overview of domains and examples of 991 indicators/coefficients applied in empirical construction of the RDI index (Poland)

While all the above domains and relevant socio-economic indicators show different aspects of rural development and some of them are typically more crucial than others, it can be expected that any change in variables/coefficients representing these domains ceteris paribus will have a positive, neutral or negative impact on the overall level of rural development measured in a specific locality.Footnote 35

Following this approach, the rural development domains discussed above are represented in our study by hundreds of partial socio-economic indicators/variables (e.g. 991 variables/indicators describing various aspects of rural development at NUTS-4 level in Poland; 340 variables/indicators at NUTS-4 level in Slovakia; see Sect. 7). The constructed RDI thus combines all selected economic, environmental and social indicators and links them under a consistent theoretical framework.

7 Data

The multi-dimensional character of the quality of life (level of development) of rural areas in various countries calls for the use of objectively verifiable statistical secondary data on variables/indicators reflecting various important aspects of rural development (e.g. economic, social, environmental, etc.). These may be calculated either directly for rural regions (at NUTS-4 level) or collected at NUTS-5 level and aggregated to a higher NUTS-4 level. The approach applied in our study to the territorial delimitation of rural areas excludes from available data large cities but acknowledges the importance of small towns located in rural areas as being a significant component of rural economy in most parts of Europe (“sub-poles” in rural economic and social development).

Poland: The data used for the calculation of the RDI for Poland originate from the Regional Data Bank of the Polish Statistical Office (NUTS-4 level), as well as from the Ministry of Finance (e.g. distribution of personal income) and the Ministry of Interior (e.g. crimes), and were collected at NUTS-4 level for the years 2002–2005. Of 379 NUTS-4 regions in Poland, 314 rural Powiats (NUTS-4) are included in the analysis (84.2% of all NUTS4-regions), which excludes 65 big cities. The database for Poland covers all relevant rural development dimensions available in regional statistics at NUTS-4 level and consists of 991 coefficients/indicators collected/calculated either directly at NUTS-4 level or aggregated from NUTS-5 level (approximately 2500 Polish gminas) to NUTS-4 level.

Slovakia: The database for Slovakia originates from the Slovak Statistical Office; 337 indicators/variables collected for 72 Okres regions (NUTS-4) in the years 2002–2005 are used for the construction of the RDI.

In both countries data cleaning was performed using linear interpolation if less than 10% of the data were missing, whereas the expectation–maximization (EM) method was applied if data for one whole year were missing. EM estimates the means, the covariance matrix, and the correlation of quantitative variables with missing values, using an iterative process. Overall, imputations were done for approximately 2–3% of variables.
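
As a minimal sketch of the linear interpolation step (the EM step is not shown), the following fills interior gaps in a yearly series. The function name and the example values are hypothetical, and the treatment of gaps at the start or end of a series (left unfilled here) is an assumption, since the text does not specify it.

```python
def interpolate_missing(values):
    """Fill interior gaps (None) by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical indicator observed in 2002-2005 with the 2003 value missing.
series = [10.0, None, 14.0, 15.0]
cleaned = interpolate_missing(series)
```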

8 Results

8.1 Factor Analyses

In both Poland and Slovakia the number of variables characterizing various aspects of RD in individual rural regions was large, and assorted regional indicators/coefficients were expected to be linearly dependent. Therefore, at the first stage a factor analysis (principal component method)Footnote 36 was carried out with the main objectives of:Footnote 37

  • Reducing the database necessary for computation of the RDI (explaining variability among observed random variables describing various aspects of rural development in terms of fewer unobserved random variables called factors), and

  • Detecting data structure that would allow a clear interpretation of obtained results.

The number of retained factors in Slovakia was determined using the Kaiser criterion (factors with eigenvalues greater than 1 were retained). In contrast to this procedure, the final number of selected factors in Poland was determined iteratively by selecting the number of factors that maximized the restricted likelihood function used in the selected model (see Sect. 8.2), which served as the convergence criterion. As an outcome of the factor analysis (2002–2005), 337 original variables/indicators in 72 Slovak NUTS-4 regions were converted into 21 factors characterising various aspects (domains) of rural/regional development in Slovakia; 991 variables/coefficients in 314 rural NUTS-4 regions were converted into 17 factors in Poland. Estimated factor values in both countries are region and time specific. For each region and year, estimated factor values were z-normalized, thus indicating the relative position (with respect to factor endowment) of a given region (in the respective country) in comparison to the country’s average (years 2002–2005). Positive factor values reflect a positive deviation from the country’s average (for a given domain); negative values mean the opposite. The respective labelling patterns of factor domains draw on the major loading components.
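
The Kaiser criterion mentioned above can be sketched as follows, assuming the eigenvalues are taken from the correlation matrix of the indicators (the usual setting for this rule). The function name and the toy data are hypothetical; NumPy is used only for the correlation matrix and its eigenvalues.

```python
import numpy as np

def kaiser_retained(data):
    """Return the eigenvalues of the correlation matrix (descending) and how many exceed 1."""
    corr = np.corrcoef(data, rowvar=False)        # columns are variables
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    n_retained = int(np.sum(eigenvalues > 1.0))
    return eigenvalues, n_retained

# Toy data: 4 regions x 4 indicators, built so that two pairs of indicators
# are perfectly correlated and the pairs are uncorrelated with each other.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, -1.0, -1.0, 1.0])
toy = np.column_stack([a, 2 * a + 5, b, 3 * b])
eigenvalues, n_factors = kaiser_retained(toy)
```

In this toy case the four indicators carry only two independent dimensions, so two eigenvalues are close to 2 and two are close to 0, and the criterion retains two factors.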

8.2 Estimated Migration Functions

An econometric estimation of weights in the RDI was carried out separately in both countries on the basis of Eqs. 8a and 8b. As all observable migration inflows between regions are either zero or positive, Model 8 was estimated as a logistic functionFootnote 38 reflecting a probability distribution of migration from one region to another (71 × 72 × 4 = 20,448 data observations in Slovakia, and 313 × 314 × 4 = 393,128 data observations in Poland). Model 8 was estimated in two versions: version 8a as a panel regression that allows a between-, fixed- or random-effects specification (estimated as a random effects linear regression model with a group variable at the level of ID [GLS regression estimate]), and version 8b as a multi-level mixed-effect regression model (mixed-effects REML regression) that additionally allows for the possibility of a nested error structure within a region i.Footnote 39 The estimation results of Model 8 for Slovakia and Poland are presented in Table 2 and Fig. 10a, b in the Annex.

Table 2 Estimated coefficients (Models 8a and 8b)

Results based on Models 8a and 8b for Slovakia (see Table 2) are very similar. As Model 8b is more general, our final estimation results (both for Slovakia and Poland) are based on this version.

In Slovakia approximately 67% and in Poland approximately 75% of estimated coefficients are significant at the 0.01–0.05 level. In both Slovakia and Poland approximately half the extracted factors/principal components were found to contribute positively to in-migration flows and thus to the RDI. Concerning the sign and magnitude of coefficients representing the contribution of individual rural development domains (factors/principal components) to the overall RDI, the respective values in Slovakia ranged from the highest +0.121 (Factor f4, i.e. agriculture and natural endowment) to the lowest −0.107 (Factor f2, i.e. low spatial availability of social services and technical infrastructure [high value per capita]). In Poland the respective values ranged from +0.086 (Factor 4, i.e. high income groups and availability of dwellings) to −0.015 (Factor 11, i.e. energy sector and specific deaths structure).Footnote 40 Concerning the impact of transaction costs on migration, both coefficients (dist and dist2) included in the estimated migration models (Model 8b) in Slovakia and Poland have the expected signs and are highly significant (at the 0.01 level). This empirical outcome confirms that the probability of migration between regions initially decreases as the distance between regions increases, but only up to a particular threshold, after which it increases again. For example, the estimated value of this threshold/radius in Poland in the years 2002–2005 was found to be equal to 44 km. This means that the rural population in Poland was able to gain from local quality of life attributes (incl. amenities, cultural heritage etc.) provided the latter were located within a radius of 44 km from the place of residence. Beyond this threshold/radius, further access to attributes increasing the quality of life was in principle only possible via out-migration. Clearly, such a threshold is usually time-/country-specific, and it may change as a result of structural adjustments in transportation and communication networks.
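
As a short worked step, the threshold can be read off the quadratic distance terms of Eq. 8b by holding all other terms fixed and setting the derivative with respect to distance to zero. The symbol \( D^{*} \) for the threshold distance is introduced here only for illustration, and the signs of the coefficients are inferred from the pattern described above (first decreasing, then increasing) rather than taken from Table 2:

$$ \frac{\partial \log (m)_{{{\text{ID}}t}}}{\partial D_{\text{ID}}} = \hat{\delta}_{1} + 2\hat{\delta}_{2} D_{\text{ID}} = 0 \quad \Rightarrow \quad D^{*} = - \frac{\hat{\delta}_{1}}{2\hat{\delta}_{2}} $$

With \( \hat{\delta}_{1} < 0 \) and \( \hat{\delta}_{2} > 0 \) the second derivative \( 2\hat{\delta}_{2} \) is positive, so \( D^{*} \) marks the distance at which the migration probability is lowest; under this reading it corresponds to the 44 km reported for Poland.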

8.3 Individual Components of the RDI

8.3.1 Ranking of Partial Indicators

Information about the relative importance (i.e. contribution of respective coefficient to the overall rural development) of each partial indicator describing various aspects of rural development in a given country was obtained on the basis of Eq. 7.

Among the top 10 variables/coefficients positively contributing to quality of life in rural regions in Poland the most important were:

  • Personal income − highest income group (social weight = 0.07);

  • Availability and quality of new residential buildings (social weight = 0.06/0.07);

  • Access to selected technical infrastructure, e.g. gas consumption from gas-line system per capita (social weight = 0.05/0.06);

  • The share (high) of the private sector in the service sector (social weight = 0.05/0.06);

  • Spatial accessibility of rural enterprises (social weight = 0.05)

In Slovakia, the most important variables/coefficients positively contributing to local rural development were those associated with:

  • Population structure (e.g. high share of population at a productive age within the total population) (social weight = 0.17/0.18)

  • The share (high) of private enterprises and natural persons in total legal units (social weight = 0.17)

  • Level of consumption (high), e.g. municipal waste disposal per capita (social weight = 0.16)

  • Spatial access of rural population to social infrastructure, e.g. swimming pools, sport stadia, telephone lines, post offices, local communication, etc., per km² (social weight = 0.1/0.12)

  • The structure of local business; share (high) of enterprises in areas: financial mediation, real estate, rental and business activities in total enterprises (social weight = 0.12)

  • Variables/coefficients associated with favourable climate and nature, e.g. high share of vineyard in agricultural land (social weight = 0.10)

Among the 10 variables/coefficients that had a particularly negative impact on the quality of life and rural development, the most important in Poland were:

  • Low personal income − low income groups (social weight = −0.07)

  • The share (high) of the public sector in the service sector (social weight = −0.06)

  • Disproportion in the gender structure of the rural population, i.e. over-proportional share of males of working age (=> low share of females of working age) (social weight = −0.04)

  • The share (high) of legal units in the public administration and security sectors (social weight = −0.04)

  • The share (high) of young unemployed (25–34 years) of the total registered unemployed (social weight = −0.04)

  • Level of subsidies received at gmina level (NUTS-5) (social weight = −0.03). Yet, the latter may also merely represent society’s response to a low development level in the regions.

Respective variables/coefficients that were particularly negatively associated with the quality of life and the level of rural development in Slovakia were:

  • The over-proportional share of NGOs, contributory organisations, other non-profit organisations in the structure of legal units registered in a given region (weight = −0.17). Yet, this variable (along with a number of other response variables, e.g. a high percentage of social expenditures) may also represent the policy’s response to a low local development level in the past.

  • The share (high) of women among unemployed persons (weight = −0.16)

  • The share (high) of urban territory in the total area of municipality (weight = −0.13)

  • The share (high) of agricultural units in the total number of legal subjects registered on a given territory (weight = −0.12)

  • The share (high) of cooperatives in total enterprises (weight = −0.12)

Beyond these two extreme groups, a third group of variables/coefficients was found to have a neutral impact on rural development (social weight approximately equal to zero). In Poland these were variables showing, e.g., the share of commercial companies in the public sector; the share of overnight stays of foreign tourists in total overnight stays; and the share of publicly-owned entities in sectors G (trade and retail), I (transport and communication) and H (hotels and restaurants). Among the respective “neutral” variables in Slovakia were: the number of tax offices per capita; the number of secondary school-children per school; the number of cable TV connections per capita, etc.

While the above results of this ranking seem highly plausible, they also show that an assessment of the level of rural development using specific partial per capita indicators (used as a measure of a region’s development level) may be misleading. Indeed, the results of this study indicate that many per capita indicators, e.g. those apparently showing a high availability of social and infrastructural goods/services per capita, may be negatively linked with the overall quality of life and may merely reflect a low density of rural population in those regions; i.e. they ignore an important aspect of spatial accessibility to these goods/services.

8.3.2 Ranking Importance of Individual RD Domains

Assessment of the relative importance of various rural development domains was carried out in two steps: firstly, all partial coefficients/variables describing various aspects of rural development were allocatedFootnote 41 into six main areas:

  • Economic (292 variables in Poland; 102 variables in Slovakia)

  • Social (337 variables in Poland; 187 variables in Slovakia)

  • Environmental (199 variables in Poland; 20 variables in Slovakia)

  • Demographic (70 variables in Poland; 13 variables in Slovakia)

  • Administrative (122 variables in Poland; 13 variables in Slovakia)

  • Infrastructural (69 variables in Poland; 19 variables in Slovakia)

Secondly, given information on the relative individual importance of variables entering a particular RD domain (Eq. 7), the social weight of each of the above RD domains was calculated as the sum of these values (for the variables included in the specific RD domain) divided by the number of variables in that domain. The resulting valuations of RD domains are presented in Table 3a (Poland) and Table 3b (Slovakia).
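
The second step amounts to averaging the variable-level social weights within each domain. A minimal sketch follows; the variable names, their weights and their allocation to domains are invented for illustration and do not reproduce the allocation used in the study.

```python
def domain_social_weights(variable_weights, variable_domain):
    """Average the social weights of the variables allocated to each RD domain."""
    totals, counts = {}, {}
    for name, weight in variable_weights.items():
        domain = variable_domain[name]
        totals[domain] = totals.get(domain, 0.0) + weight
        counts[domain] = counts.get(domain, 0) + 1
    return {d: totals[d] / counts[d] for d in totals}

# Invented weights and allocations, for illustration only.
weights = {"high_income_share": 0.07, "public_service_share": -0.06, "new_dwellings": 0.06}
domains = {"high_income_share": "economic", "public_service_share": "economic", "new_dwellings": "social"}
result = domain_social_weights(weights, domains)
```

Because positive and negative variable weights offset each other within a domain, a domain average near zero does not imply that its individual variables are unimportant.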

Table 3 Social weights of individual RD domains: (a) Poland, (b) Slovakia

The results of the above rankings show that the demographic and social domains (in Poland) and the environmental and infrastructural domains (in Slovakia) had the highest individual impact on the level of rural development. On the other hand, a relatively low or even negative impact on RD was found in the case of administrative variables. Economic and infrastructural domains are closely linked to each other; taken together, they had the highest impact on the level of rural development and the population’s quality of life in rural areas in both countries.

8.3.3 Individual RDI Components

The main individual components (C) of an estimated RDI (Eq. 6) were calculated (for each country, regional unit, and year) as a product of z-standardized factor’s value F k (average in 2002–2005 = 0) and its respective weight β k (from Model 8b).Footnote 42
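
The calculation of components and their aggregation into the index can be sketched as below. The factor labels, factor values and weights are invented for a single hypothetical region and year; only the structure (component = factor value times weight, RDI = sum of components) follows the text.

```python
def rdi_components(factor_values, betas):
    """Component C_k = F_k * beta_k for one region and year."""
    return {k: factor_values[k] * betas[k] for k in betas}

def rdi(factor_values, betas):
    """RDI as the sum of the individual components."""
    return sum(rdi_components(factor_values, betas).values())

# Invented z-standardized factor values and weights for a single region-year.
F = {"f1": 1.2, "f2": -0.5, "f3": 0.0}
beta = {"f1": 0.10, "f2": -0.04, "f3": 0.08}
components = rdi_components(F, beta)
index = rdi(F, beta)
```

Note that a below-average endowment (negative F) with a negatively weighted factor yields a positive component, as for f2 in this toy example.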

The most important RDI components found to improve the quality of life in rural regions in Slovakia were: SL-C4 (more developed agriculture compared with a country’s average) and SL-C12 (higher than average density of accommodation facilities). On the other hand, domains that negatively affected the quality of life in rural regions were: SL-C7 (high share of public enterprises), SL-C14 (low availability of retail infrastructure) and SL-C8 (over-endowment with vocational secondary schools). Yet, the importance of particular terms regarding their impact on the rural development changed slightly between the years 2002 and 2005.

In Poland, the most important domains positively contributing to the local development level (country average) in 2005 were: PL-C12 (natural population growth, high share of population of pre-working age); PL-C4 (highest income groups and housing availability); PL-C6 (Population’s structure, high percentage of population in productive age). Among the most disadvantageous ones (country average) in 2005 were: PL-C11 (partial indicators: energy sectors and deaths, an over-proportionally high share of male deaths; extensive agriculture with a high share of pasture land; high exposure to industry, e.g. heat supply, energy sales, etc.), PL-C16 (structure of local budget, lower than average expenditures from rural poviats’ budget on investment, properties, communication and transport; lower than average share of newly registered entities in total public sector), and PL-C2 (lowest income groups and own budgetary resources, high share of local budget revenues from personal income tax in total local budget revenues; high share of local budget expenditures on health care; high level of appropriated budget allocations from the national budget).Footnote 43

8.4 Rural Development Index

8.4.1 Poland

8.4.1.1 Ranking of Regions

The RDI in Poland involving 991 regional indicators was calculated for 314 rural regions (NUTS-4) according to Eq. 6 as the sum of 17 individual components (PL-C) (i.e. component = product of a given factor’s value and its respective coefficient from the estimated migration function using Model 8b). The distribution of the RDI by NUTS-4 regions in years 2002–2005 is shown in Fig. 1.

Fig. 1 Poland: ranking of regions. RDI Index by regions (NUTS-4, 314 regions)

During the years 2002–2005 the estimated value of the RDI in Poland ranged between −0.13 and 0.57 (in 2002) and between −0.11 and 0.62 (in 2005), i.e. the disparity between extreme regions slightly increased (the RDI range grew by 0.03 points). In the largest group of regions (46.5%) the overall level of rural development was similar to the country’s average (RDI varied between −0.03 and 0.03). While 31.5% of all rural regions can be characterised as better or well developed (RDI > 0.03), 22.6% of all rural regions in Poland can be classified as less or least developed (RDI < −0.03). The geographical distribution of the RDI in Poland is shown in Fig. 2a, b.
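
The three-way grouping by the ±0.03 band can be sketched as follows. The group labels and the RDI values are hypothetical; only the thresholds are taken from the text.

```python
def classify_region(rdi_value, band=0.03):
    """Assign a region to one of the three groups used in the text."""
    if rdi_value > band:
        return "better developed"
    if rdi_value < -band:
        return "less developed"
    return "near country average"

def group_shares(rdi_values, band=0.03):
    """Share of regions falling in each group, in percent."""
    counts = {"better developed": 0, "near country average": 0, "less developed": 0}
    for value in rdi_values:
        counts[classify_region(value, band)] += 1
    return {group: 100.0 * n / len(rdi_values) for group, n in counts.items()}

# Invented RDI values for five regions.
shares = group_shares([0.40, 0.01, -0.02, -0.09, 0.05])
```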

Fig. 2 a Poland: average RDI (by regions and years 2002–2005). b Poland: distribution of the RDI by NUTS-4 (2002–2005)

As expected, the highest values of the RDI (higher than 0.18) were found in the suburban rural areas of the big cities Warsaw, Poznan, and Gdansk, thus supporting the thesis of a strong positive influence of the economically and socially most developed urban regions (cities) on the development of neighbouring rural areas. On the other hand, the lowest RDIs (lower than −0.08) were found in remote regions situated in south-eastern Poland, i.e. hrubieszowski (border with Ukraine), bierunsko-ledzinski (post heavy industrial complex in south Poland), chelmski (border with Ukraine), bieszczadzki (remote region bordering Ukraine and Slovakia). The results confirm a clear typological division of Poland, based on the performance of individual regions, into a well-performing western and central part and a poorly performing eastern part (north-eastern and south-eastern Poland).

8.4.1.2 Regional Disparities

Regarding the level of regional disparities, our analysis shows that in Poland these are very large and especially concern the best developed regions. Indeed, the difference in the estimated level of quality of life measured in terms of the RDI between the best developed regions in Poland and the country’s average was found in 2005 to be much higher than the difference in the RDI between the country’s average and the least-developed regions (i.e. South-East Poland).

Comparison of the RDI in the 10 best and 10 least developed rural regionsFootnote 44 also reveals that discrepancies in the development of these two extreme groups of regions increased during the examined period (2002–2005): the RDI in the 10 best developed regions increased from 0.36 to 0.40, whereas in the 10 least developed regions the RDI dropped from −0.09 to −0.10 (see Tables 4, 5).

Table 4 Poland: Highest developed rural regions: 2002–2005
Table 5 Poland: Lowest developed rural regions: 2002–2005

Both groups of regions differed significantly in their endowments with specific factors/principal components determining the overall quality of life. The most significant differences concerned endowments with factors: F1 (employment by sectors), F4 (highest income groups and housing availability), F6 (structure of population), F11 (primary sector: energy, structure of deaths), F12 (natural population growth), and F16 (structure of expenditures in local budgets).

Generally, identification of the most and the least developed regions in Poland by means of the RDI proved very robust. Comparison of both groups of regions, i.e. the most developed (Group 1) and the least developed regions (Group 2), using partial indicators confirms the existence of numerous differences in various important attributes and domains of rural development. The largest differences between both groups were found in:

  • Natural population increase (high rate of growth in grouping 1 vs. negative rate in grouping 2);

  • Share of state-owned and public-owned enterprises in total enterprises; (=>very low shares in grouping 1 vs. high shares in grouping 2);

  • Availability of housing and living space (New two-dwelling and multi-dwelling buildings, number of buildings per km²; usable floor space of dwellings; number of building permits per km², number of dwellings per km²) => high shares in grouping 1 vs. low in grouping 2.

  • Environmental pollution (“Air protection, capacity of the installed facilities to arrest pollutants; particulate pollutants in t/year per 1,000 population”; “Area of waste management total, disposal sites, per total land”; “particulate pollutants per km²”; “gaseous pollutants per km²”); => low values in grouping 1 vs. very high values in grouping 2;

  • Protected landscape areas (“Legally environmentally protected areas in ha of which: protected landscape areas of which those established under gmina council resolutions per protected landscape areas”) => high values in grouping 1 vs. very low values in grouping 2.

Additionally, both groups of regions were found to differ considerably in a number of other important coefficients: e.g. region (gmina) own revenues per capita (high value in grouping 1 vs. low value in grouping 2); share of population with high income (high share in grouping 1 and low share in grouping 2); share of enterprises in sectors: public administration, national defence and social security to total enterprises (low share in grouping 1 vs. high in grouping 2); infant deaths per 1,000 live births (low share in grouping 1 vs. higher share in grouping 2); rate of unemployment (lower rate in grouping 1 vs. higher in grouping 2); number of job offers per total unemployed (higher value in grouping 1 vs. low in grouping 2), etc.

8.4.1.3 Dynamics in Spatial Inequalities

During the years 2002–2005 the estimated mean value of the RDI in Poland for 314 rural regions dropped slightly from 0.020 (2002) to 0.018 (2005), showing some fluctuation over the years. Yet, the regional inequality pattern observed in 2002 strengthened. The quality of life (RDI) in the best developed regions of rural Poland further improved (compared to the country average), whereas in less developed regions it deteriorated. The number of powiats with negative RDIs (i.e. those below the average level of development) increased from 154 (2002) to 160 (2005). In the same period the overall level of rural development improved in 135 regions, but it deteriorated in another 179 regions (Fig. 3).

Fig. 3 Change of RDI in 2005 in comparison to 2002 (absolute values)

The majority of regions which improved their absolute level of RDI were located close to bigger cities and in western and south-western Poland (probably due to stronger socio-economic ties with Germany and other “old” EU member states); those where the quality of life deteriorated were located mostly in north-eastern and eastern Poland (close to the borders with Russia, Belarus and Ukraine) and partly in central Poland (far from bigger cities).

The statistical analysis of changes in RDI shows that during 2002–2005 the level of regional disparities increased (Table 6).

Table 6 Poland: RDI index (2002–2005), descriptive statistics

Over the whole period 2002–2005 the RDI range grew from 0.703 to 0.734, and the variance of the RDI increased from 0.007 to 0.009. Regional disparities grew particularly strongly between 2002 and 2003 (the RDI range increased from 0.703 to 0.851; variance increased from 0.007 to 0.010) and then dropped in 2004 and 2005. Interestingly, although the mean RDI dropped significantly in 2004, i.e. in the year of Poland’s accession to the EU, the strong regional divergence that occurred between 2002 and 2003 was halted in 2004 (between 2004 and 2005 the mean RDI increased from −0.065 to 0.018, while the range dropped from 0.834 to 0.734 and the variance remained unchanged).
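
The descriptive statistics used to track disparities (mean, range and variance per year) can be sketched as follows. The RDI values are invented, and the use of the population variance is an assumption, since the text does not state which variance formula was applied.

```python
from statistics import mean, pvariance

def disparity_summary(rdi_by_year):
    """Mean, range (max - min) and variance of the RDI for each year."""
    summary = {}
    for year, values in rdi_by_year.items():
        summary[year] = {
            "mean": mean(values),
            "range": max(values) - min(values),
            "variance": pvariance(values),  # assumption: population variance
        }
    return summary

# Invented RDI values for four regions in two years.
toy = {2002: [-0.1, 0.0, 0.1, 0.4], 2005: [-0.2, 0.0, 0.1, 0.5]}
stats = disparity_summary(toy)
```

The range reacts only to the two extreme regions, whereas the variance reflects the whole distribution, which is why the two measures can move in different directions.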

8.4.1.4 Stability of Rural Development

The stability of rural development over time was measured using the Pearson correlation coefficient matrix (higher values stand for higher stability), the Euclidean distance matrix (lower values stand for higher stability over time) and quartile stability matrices. The similarity/dissimilarity matrices show that the highest stability in rural development occurred between the years 2004 and 2005. The quartile development matrix shows that in the period 2002–2005 as many as 96 regions (31% of all) changed their group membership (in both directions, i.e. positive and negative). The highest number of changes took place in the 3rd quartile (second worst regions in terms of RDI), followed by the 2nd (second best). Regarding the overall level of development, the most stable were regions included in quartiles 1 and 4 (i.e. the groups of the most and the least developed regions).
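
The three stability measures can be sketched for a pair of years as below. The RDI values are invented, the function names are hypothetical, and the rank-based quartile assignment (quartile 1 = highest RDI) is an assumption consistent with the description above; ties are not handled specially.

```python
import math

def pearson(x, y):
    """Pearson correlation between two years of RDI values (same region order)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def euclidean(x, y):
    """Euclidean distance between two years of RDI values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def quartile_groups(values):
    """Quartile 1 = highest RDI ... quartile 4 = lowest, assigned by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = rank * 4 // len(values) + 1
    return groups

def quartile_changes(x, y):
    """Number of regions whose quartile differs between the two years."""
    return sum(g1 != g2 for g1, g2 in zip(quartile_groups(x), quartile_groups(y)))

# Invented RDI values for eight regions in two years.
rdi_2002 = [0.40, 0.30, 0.10, 0.05, 0.00, -0.02, -0.05, -0.10]
rdi_2005 = [0.45, 0.28, 0.11, 0.00, 0.06, -0.03, -0.06, -0.11]
changed = quartile_changes(rdi_2002, rdi_2005)
```

In this toy example two regions swap places between the second and third quartile, while the extreme quartiles stay unchanged.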

Detailed information about the geographical location of regions that changed their position in the years 2002–2005 can be obtained from Fig. 4.

Fig. 4 Poland quartiles-change 2002–2005

It shows that most of the changes (positive and negative) concerned regions located in Central and South-West Poland, while, for example, in Eastern Poland (consisting to a great extent of the least developed regions) the relative position of rural regions remained unchanged.

8.4.2 Slovakia

8.4.2.1 Ranking of Regions

The RDI constructed for Slovakia consists of 21 terms and involves 337 regional indicators calculated and weighted according to Eq. 6. The territorial and geographical distribution of the RDI in Slovakia (by NUTS-4 regions) in years 2002–2005 is shown in Figs. 5 and 6.

Fig. 5 Slovakia: distribution of RDI (by NUTS-4 regions) in years 2002–2005

Fig. 6 Slovakia: distribution of RDI (average 2002–2005)

During the years 2002–2005 the estimated value of the RDI ranged from −0.51 to +0.91 (regional discrepancies were therefore higher than in Poland). As expected, the highest values of RDI were found in regions located in West Slovakia (e.g. Senec, Pezinok, Dunajska Streda, Galanta, etc.), while regions of Eastern Slovakia and Central Slovakia (e.g. Gelnica, Stropkov, Namestovo, Kezmarok, Stara Lubovna) exhibited the lowest RDI values.

8.4.2.2 Statistical Distribution of RDI Index

The results of the analysis show that the statistical distribution of the 72 rural regions in Slovakia with regard to their development level was close to normal (approximately the same number of rural regions belonged to the high and low performing groups). The results also confirm a clear typological division of Slovakia into western, central and eastern sub-areas based on the performance of individual regions, and back up the general opinion that the level of rural development decreases from West to East.

8.4.2.3 Regional Disparities and Development Dynamics

The change in the RDI across Slovak regions over the years 2002–2005 is illustrated in Fig. 7. The figure shows that the general pattern of development (i.e. western regions have higher RDI values than east-Slovak regions) persisted throughout the years 2002–2005. Particularly interesting, however, was the improvement of the RDI in regions located in West and Central Slovakia, which can be interpreted as a considerable spill-over effect transmitting economic and social development from better developed regions (Western Slovakia) to less developed regions (Central Slovakia).

Fig. 7

Slovakia: change in RDI by region (2002–2005)

During the years 2002–2005, the range of RDI values in Slovak regions shrank from 1.45 to 1.39, i.e. the absolute difference between the two extreme regions decreased over this period. At the same time, a general improvement in the level of development took place across rural regions: the number of regions with a negative RDI decreased from 42 (2002) to 31 (2005), and the number with a positive RDI increased from 30 (2002) to 41 (2005). Yet this encouraging development was accompanied by an increasing variance in RDI values (see Table 7), which indicates progressing regional divergence.

Table 7 Slovakia: RDI 2002–2005 descriptive-statistics
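The descriptive statistics discussed above (range, number of regions with negative and positive values, variance) can be sketched as follows. The two data vectors are invented to reproduce the qualitative pattern described in the text, a shrinking range together with a growing variance; they are not the values of Table 7. The sample variance is used here as an assumption, since the text does not say which variance estimator was applied.

```python
import statistics

def describe(rdi):
    """Summary statistics for one year's RDI values."""
    return {
        "min": min(rdi),
        "max": max(rdi),
        "range": max(rdi) - min(rdi),
        "n_negative": sum(v < 0 for v in rdi),
        "n_positive": sum(v > 0 for v in rdi),
        "variance": statistics.variance(rdi),  # sample variance (assumption)
    }

# Hypothetical RDI values for five regions in two years (illustrative only).
rdi_start = [-0.50, -0.05, -0.02, 0.03, 0.50]
rdi_end = [-0.45, -0.40, 0.35, 0.40, 0.45]

stats_start = describe(rdi_start)
stats_end = describe(rdi_end)

# The extremes move closer together while the values in between spread out.
range_shrank = stats_end["range"] < stats_start["range"]
variance_grew = stats_end["variance"] > stats_start["variance"]
```

The example shows why a smaller range and a larger variance are compatible: the range depends only on the two extreme regions, whereas the variance reflects the dispersion of all regions.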

When looking at the geographical distribution of changes in RDI by regions (Figs. 7, 8a, b) our results show that most regions with an improved RDI were located in Western Slovakia and in the northern part of Central Slovakia.

Fig. 8

a Slovakia RDI 2002. b Slovakia RDI 2005

At the same time, the level of development deteriorated in some regions located in the southern part of Central Slovakia and in Eastern Slovakia. Especially problematic is the apparently continuous deterioration in the level of rural development observed in the region of Velky Krtis (region 48), located at the border with Hungary.

An analysis of the geographical distribution of RDI values confirms a dichotomy in the development of Slovak regions (i.e. a clear pattern of higher-than-average rural development in Western Slovakia and lower-than-average development in regions located in Eastern Slovakia). Yet, in contrast to declared policy and efforts towards greater regional convergence (one of the main objectives of EU regional and rural policies), our analysis shows that discrepancies in the level of rural development between Western and Eastern Slovakia were reinforced over the years 2002–2005, i.e. the average increase of the RDI in Western Slovakia was approximately 50% higher than in Eastern Slovakia.

In Slovakia, the most significant differences between well and poorly performing regions concerned endowments with factors F2 (availability of social services and technical infrastructure per capita), F3 (social and living environment including availability of housing), F10 (special schools), F4 (agriculture), F13 (public facilities) and F14 (availability of retail infrastructure). A high endowment with social and technical infrastructure calculated per capita (F2) was not found to contribute to a higher quality of life in individual rural regions (high values of regional coefficients computed per capita may reflect a region’s low population density, and therefore usually do not provide reliable information about the spatial availability of a given service). Well-performing regions were found to have above-country-average endowments with factors F3 (social and living environment, incl. availability of housing), F4 (agriculture), F13 (public facilities) and F14 (availability of retail infrastructure).

The analysis of regions with the highest and lowest RDI (2002–2005) also shows that both the five most developed regions (i.e. Senec, Pezinok, Dunajska Streda, Galanta and Piestany) and the five least developed regions (Stara Lubovna, Kezmarok, Namestovo, Stropkov and Gelnica) maintained their rank over time (i.e. high stability). While both groups of regions experienced a positive trend in their development (the sum of RDI values calculated for the five highest and five lowest RDI regions increased over time), in the case of the five best regions this trend stopped in 2004, i.e. the level of development in the great majority of the best regions deteriorated in 2005 compared with 2004 (except for the leading region, Senec). The highest improvement of the RDI among the five least developed regions occurred in Eastern Slovakia: Stropkov (40%) and Kezmarok (23%).

In the most developed regions, i.e. regions with an RDI higher than 0.3 (5 regions in 2002; 10 regions in 2003; 11 regions in 2004; 17 regions in 2005), the components with the most positive impact on rural development were: SL-C4 (agriculture), SL-C2 (availability of social and technical infrastructure per capita), and SL-C14 (availability of retail infrastructure per capita). In all these cases the shares of the above components in the overall index value were among the highest, and the estimated coefficients were statistically significant at the 1% level.

On the other hand, in the least developed regions, i.e. regions with an RDI lower than −0.3 (15 regions in 2002; 10 regions in 2003; 9 regions in 2004; 7 regions in 2005), the components that contributed most to the low value of the RDI were: T13 (inadequate public facilities), T4 (less intensive agriculture) and T2 (social and technical infrastructure per capita).

Quartile stability. The quartile development matrix shows that in the period 2002–2005 only 12–15% of all regions in Slovakia changed their group membership (in both directions, i.e. positive and negative). As in Poland, most changes took place in the middle quartiles: the highest number occurred in the 2nd quartile (second-best regions in terms of the RDI), followed by the 3rd and 1st quartiles. The most stable were the regions in quartile 4 (i.e. the group of the least developed regions; see Footnote 45). Most of the observed changes (positive as well as negative) concerned regions located in Central Slovakia (see Fig. 9).

Fig. 9

Slovakia: quartile change during years 2002–2005
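A quartile development (transition) matrix of the kind used in the quartile stability analysis can be sketched as below. Regions are assigned to quartiles by their RDI rank in the first and the last year, and the matrix counts how many regions moved from each quartile to each other quartile. The function names and the eight RDI values are hypothetical, and ties between regions are not handled specially in this sketch.

```python
def quartile_of(values):
    """Quartile label per region by rank: 1 = highest RDI, 4 = lowest."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // n + 1
    return labels

def transition_matrix(start, end):
    """m[a][b] = number of regions moving from quartile a+1 to quartile b+1."""
    m = [[0] * 4 for _ in range(4)]
    for a, b in zip(quartile_of(start), quartile_of(end)):
        m[a - 1][b - 1] += 1
    return m

def share_changed(m):
    """Share of regions that are off the diagonal, i.e. changed quartile."""
    total = sum(sum(row) for row in m)
    stayed = sum(m[k][k] for k in range(4))
    return (total - stayed) / total

# Hypothetical RDI values for eight regions (illustrative only).
rdi_first = [0.9, 0.7, 0.5, 0.3, 0.1, -0.1, -0.3, -0.5]
rdi_last = [0.9, 0.7, 0.5, 0.05, 0.2, -0.1, -0.3, -0.5]

matrix = transition_matrix(rdi_first, rdi_last)
changed = share_changed(matrix)
```

In this toy example two regions swap places between the 2nd and 3rd quartiles, so the diagonal entries for quartiles 1 and 4 stay full, mirroring the pattern in which the top and bottom groups are the most stable.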

9 Conclusions

The main purpose of this research was to construct a multi-dimensional (composite) index that objectively measures the overall (synthetic) level of rural development and quality of life in all individual rural regions of a given EU country at a highly disaggregated level (e.g. NUTS-4). In the proposed RDI the rural development domains are represented by hundreds of partial territorial, socio-economic, environmental, infrastructural and administrative indicators/variables calculated from secondary regional statistics. The weights of the various domains entering the RDI are derived empirically for a given country from an econometrically estimated intra- and inter-regional migration model, which inter alia takes into consideration the preferences of both migrants and those who stayed; the weights can therefore be viewed as representative of the whole population in a given time period. Applying the RDI to rural economies allows for an analysis of the importance of specific economic, social and environmental factors affecting rural development at the local level; the measurement of real regional disparities in overall regional development (beyond GDP); the ranking of all rural areas with respect to their overall (synthetic) level of development, etc.

An empirical analysis of the overall development and performance of rural regions (NUTS-4 level) using an RDI in Slovakia and Poland in the period 2002–2005 shows a number of important common trends: (1) considerable diversity in the level of regional/rural development among rural regions in both countries; (2) positive spill-overs of development from better developed regions to neighbouring less developed regions; (3) growing regional disparities between the most and the least developed regions over time; (4) the particular importance of specific economic, social and environmental indicators (e.g. high income, availability of housing, lack of pollution, a high share of the private sector, a high share of working-age population and of women in the population structure, etc.) contributing to a high overall level of development in rural areas.

The main methodological conclusions are:

  • An RDI allows for a comprehensive analysis of various rural development domains (economic, social, environmental, etc.) and their impact on the overall quality of life in rural regions and is powerful at NUTS 2–5 or even village levels;

  • The index is not constant over time, is easily adjustable and allows for an easy inclusion of additional relevant variables/coefficients representing various aspects of the overall quality of life/rural development;

  • The weights applied in the construction of the RDI represent society’s valuation of endowments and socio-economic trends observable at local/regional levels. They are also representative of society as a whole (they reflect the decisions of both the migrating population and the population that stays in the region). The weights are empirically derived and statistically verified (in the current version the estimated weights are kept constant over time);

  • The inclusion of transaction costs to the model allows for a technical separation of quality of life from migration;

  • Data: an RDI is data hungry.

The main policy conclusion of this study is that, due to its comprehensiveness and high reliability, the RDI is suitable both for an analysis of the overall level of development of rural areas and for a quantitative evaluation of the impacts of given RD and structural programmes at regional levels. Examples of the latter (with the RDI as an impact indicator and applying matching methods, e.g. binary and generalized propensity score matching in Poland and Slovakia) can be found in Michalek (2007, 2009).