1 Introduction

In research on vulnerability to natural hazards and its driving factors, many scholars recognize that social characteristics, equity, population dynamics, and economic development are important factors in the formation and development of vulnerability (Brooks 2003; Zou and Wei 2009, 2010; Füssel 2007; Brenkert and Malone 2005; Downing and Patwardhan 2004). Acting together, these factors influence the sustainability of the socio-economic system, its resilience to natural hazards, and its exposure and sensitivity to hazard shocks, which in turn shape the vulnerability of the whole system. However, the mechanisms and extent of these influences, the associations among the factors, and the feedbacks of vulnerability on them all remain unclear.

The methodologies employed in this research vary. Broadly, there are two approaches to assessment: expert-opinion-based and historical-data-based. The first is mainly used to assess impacts on non-economic factors, such as the psychological health and welfare of certain social groups (Crighton et al. 2003) and the interaction between different sectors (Vazquez et al. 2001). Expert opinion is suited to indicators or factors that are not easily quantified. Its main disadvantage is the difficulty of ensuring that the experts’ assessments are impartial and reasonably objective. Assessment based on historical data reduces this subjectivity by using historical secondary data and models, but it usually rests on assumptions that are sometimes not stated explicitly. For example, Klein and Nicholls (1999) analyzed computed sea-level rise data, assessed the impact on society, and simulated future scenarios. They concluded that although sea level rises very slowly, it still has significant consequences for the national economy, for political issues, and for people’s patterns of living. In another example, Torres-Vera and Canas (2003) used a grid method to assess the vulnerability of lifelines in Barcelona, Spain. Anbalagan and Singh (1996) employed a special zoning method to assess the economic loss from landslides in parts of India.

Although vulnerability is studied in many research fields with many methodologies and approaches, very few studies examine how interactions between different socio-economic factors act on vulnerability. Zou and Wei (2010) undertook a comprehensive, systematic analysis of the scientific literature on coastal hazards to identify the factors contributing to hazard vulnerability, to determine the relationships between them, and to review recommendations for vulnerability reduction. Using meta-analysis, they identified 361 socio-economic factors affecting vulnerability, as well as the complex causal relationships between them. However, meta-analysis is suitable only for observable factors. A factor that does not appear in the literature cannot be uncovered by it. Therefore, any latent factors acting on vulnerability are hard to extract.

Structural equation modeling (SEM) is a statistical technique for verifying causal relationships with linear equations based on pre-assumed theories. The aim of SEM is to discover causal relationships and present them as path diagrams (Kline 1998). SEM handles multiple variables and complicated data, and it estimates latent variables and the parameters of complex independent and response variables simultaneously. SEM is therefore suitable for analyzing driving factors and their relationships to vulnerability. The method is widely used in social and psychological research, but it has seen very few applications in the study of vulnerability to natural hazards.

The aim of this study is to analyze the factors affecting vulnerability to natural hazards and their interactions. It uses SEM both to estimate and verify the construct of driving factors of social vulnerability and to analyze the interrelationships among multiple factors. The conclusions are expected to clarify the key factors in vulnerability and to help make management practice more accurate and scientific.

2 Review of the driving factors of social vulnerability

Research on vulnerability started from synthesis research on the risks and impacts of natural hazards. As the problems deepened and the fields broadened, vulnerability research developed into a typical cross-cutting discipline (Zou and Wei 2009). A widely accepted concept of vulnerability comes from Turner et al. (2003), in which “vulnerability” refers to the capacity to be wounded, i.e. the degree to which a system is likely to experience harm due to exposure to a hazard. In recent years, many other conceptual frameworks and related models have been developed (Downing and Patwardhan 2004; Füssel 2007; Kasperson et al. 2005; O’Brien et al. 2007; Leary et al. 2008). Some studies argue that frameworks should also integrate the social and biophysical dimensions of vulnerability to climate change (Klein and Nicholls 1999; Turner et al. 2003).

Although some studies focus on the measurement of vulnerability and have proposed methods (Ionescu et al. 2005; Metzger et al. 2006), approaches to analyzing and assessing vulnerability are still preliminary. For example, the report of the Intergovernmental Panel on Climate Change (IPCC) mentions using the climate change impact, adaptation and vulnerability (CCIAV) assessment methodology to study vulnerability. The CCIAV framework, although it includes scenario-driven impact approaches and vulnerability-based approaches, still pays more attention to the risks themselves, and to a large extent these approaches are mixed with the assessment of adaptation.

In their framework, Adger and Kelly (1998) described vulnerability to climate change in terms of poverty, inequity, and institutional change, and pointed out that these characteristics are to a large extent closely related to the political-economic preferences of markets and decision makers. In a study of the tornado disaster in Bangladesh in 1991, the researchers concluded that migration was among the important reasons for the high mortality rate (Mushtaque et al. 1993). They argued that because migrants lacked local hazard knowledge and experience, they were short of the necessary coping strategies. Other studies found that migration is usually motivated by the hope of gaining more natural resources, and that immigration brings overexploitation and then becomes a cause of conflict (Cutter 1995; Thompson and Sultana 1996). Environmental degradation caused by immigration is also a reason for high losses in natural hazards (Cutter 1995; Bandyopadhyay 1997; Chan 1997; Imamura and To 1997). As regional trading associations around the world become ever more closely connected, social development is also affected by processes at larger scales, which in turn act on vulnerability within each scale (Adger et al. 2005; Lindskog et al. 2005).

There is as yet no consistent view on the driving or impacting factors. Adger et al. (2004) held that vulnerability interacts with adaptation and identified two kinds of impacting processes, local-scale processes (e.g. household or community) and processes at higher scales, listing 10 indices in each category. Tol and Yohe (2007) listed eight underlying determinants of vulnerability and adaptation and pointed out that these determinants are closely bound to particular places and times. Alberini et al. (2006) considered income per capita, equity of income, universal health care, and access to information to be the determinants of adaptation. They proposed 34 indices of adaptation categorized into five groups (institutions, religion, culture, economics, and education) and six indices of vulnerability: fraction of people affected by natural disasters, infant mortality, life expectancy at birth, average calorie supply per person per day, percentage of people with access to improved sanitation, and percentage of people with access to an improved source of drinking water.

Different researchers have built various indexes for measuring vulnerability. Table 1 lists several typical measurement indexes.

Table 1 Main measurement indexes for vulnerability (1970–2007)

In this study, combining previous research and existing knowledge, income allocation, social progress, and industrialization are taken as the most important impacting factors besides environmental conditions and natural processes. Following different existing vulnerability frameworks, social vulnerability can be expressed through environmental sustainability, resilience to natural hazards, and the stability of the social structure. This paper discusses the relationships among these six aspects in depth.

Based on the above review of the driving factors and the presentation indices of vulnerability, this study selects 10 indices in three categories as the driving factors and the same number of proxy indices for presentation, as shown in Table 2.

Table 2 Driving factors and presentation indices to social vulnerability to natural hazards

3 Methodology and data

3.1 Conceptual framework and hypotheses on the impacting factors to vulnerability

Because vulnerability cannot be measured directly, this study adopts the following hypotheses on the impacting factors and their causal relationships to vulnerability, based on the review above and the indices in Table 2:

  • H1: The more equal the income allocation, the more resilient the socio-economic system is to the shocks of natural hazards;

  • H2: The more advanced the social progress, the more sustainable the system, the more resilient it is to hazards, and the more reasonable its social structure;

  • H3: The more industrialized the economic system, the more stable the social structure and the higher the development level;

  • H4: The stability of society affects resilience to natural hazards: the more stable the society, the more resilient it is;

  • H5: The more sustainable the system, the more resilient it is to the shocks of natural hazards.

Based on H1 to H5 and the measurement indices, the construct relationship framework of the factors affecting vulnerability is shown in Fig. 1.

Fig. 1 Construct relationship framework of impacting factors to vulnerability

3.2 Method of analyzing the impact factors to vulnerability: an SEM model

The structural equation model (SEM) is also called the latent variable model (LVM). It was developed in the 1970s in the work of Jöreskog and Goldberger (1972). In its early years, SEM was widely used in psychology and sociology (e.g. Williams et al. 2005; Fyhri and Klæboe 2009; Roesch and Weiner 2001), and it was later applied in ecological and environmental research (e.g. Chen and Lin 2010). In recent years, SEM has gradually been adopted in management and economics (e.g. Golob 2003; Ülengin et al. 2010). One advantage of SEM is that it can analyze unobserved variables through measured observable variables. Since its introduction, SEM has been extended in many ways, including multiple indicators and multiple causes models (e.g. Gertler 1988; Chou and Bentler 2002; Iacobucci 2010; Tang et al. 2009; Curran and Hussong 2002).

Compared with other multivariate statistical methods, SEM can better test causal relationships between variables because it models measurement error (Chen and Lin 2010). A complete SEM includes a measurement model and a structural model. The measurement model describes the relationships between observed variables and latent factors, while the structural model describes the relationships between latent variables.

In this study, the measurement model is presented as Eqs. 1 and 2, and the structural model is presented as Eq. 3:

$$ Y = \Lambda_{y} \eta + \varepsilon $$
(1)
$$ X = \Lambda_{x} \xi + \delta $$
(2)
$$ \eta = B\eta + \Gamma \xi + \zeta , $$
(3)

where X is the q × 1 vector of exogenous observed variables, comprising the 10 indices in the 3 driving categories; Y is the p × 1 vector of observed responses, here the 10 feature indices of the 3 endogenous variables of vulnerability (environmental sustainability, resilience to natural hazards, and social structure); ξ is an n × 1 vector of latent exogenous variables; η is an m × 1 vector of latent endogenous (dependent) variables; Λx is the q × n matrix of regression coefficients of X on ξ; Λy is the p × m matrix of regression coefficients of Y on η; and δ and ε are the q × 1 and p × 1 vectors of measurement errors in X and Y, respectively.

In this study, the exogenous variables (ξ) are income allocation (ξ1), development level (ξ2), and industrialization level (ξ3). The endogenous variables (η) are environmental sustainability (η1), resilience to natural hazards (η2), and social structure (η3).

The measurement model captures the relationships between observed and latent variables, but it does not reflect the causal relationships between latent variables. The structural model serves this purpose through its coefficient matrices. In the structural model (Eq. 3), B is the m × m matrix of structural coefficients among the endogenous variables η, Γ is the m × n matrix of structural coefficients for the effects of the exogenous variables ξ on η, and \( \zeta \) is the vector of errors, which is uncorrelated with ξ.
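To make the roles of the measurement and structural models concrete, the following minimal sketch simulates data from Eqs. 1 to 3 with NumPy. All numerical values, the sample count, and the assignment of 3 + 3 + 4 indicators to the latent variables are illustrative assumptions, not estimates or design choices taken from this study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs = 500      # hypothetical number of samples
m, n = 3, 3      # endogenous (eta) and exogenous (xi) latent variables
p, q = 10, 10    # observed Y and X indicators, as in the text

# Hypothetical structural coefficients (not estimates from the study).
B = np.zeros((m, m))
B[1, 2] = 0.3                        # eta3 -> eta2
Gamma = np.array([[0.0, 0.4, 0.1],
                  [0.5, 0.3, 0.0],
                  [0.0, 0.2, 0.2]])

# Hypothetical loadings: each indicator loads on exactly one latent variable.
block = np.repeat(np.arange(3), [3, 3, 4])   # assumed 3 + 3 + 4 split
Lambda_y = np.zeros((p, m))
Lambda_y[np.arange(p), block] = 0.8
Lambda_x = np.zeros((q, n))
Lambda_x[np.arange(q), block] = 0.8

xi = rng.normal(size=(n_obs, n))             # latent exogenous variables
zeta = 0.5 * rng.normal(size=(n_obs, m))     # structural errors

# Reduced form of Eq. 3: eta = (I - B)^{-1} (Gamma xi + zeta),
# written here for row vectors (one row per sample).
A = np.linalg.inv(np.eye(m) - B)
eta = (xi @ Gamma.T + zeta) @ A.T

Y = eta @ Lambda_y.T + 0.3 * rng.normal(size=(n_obs, p))   # Eq. 1
X = xi @ Lambda_x.T + 0.3 * rng.normal(size=(n_obs, q))    # Eq. 2
```

Only X and Y would be available to the analyst; the latent ξ and η are recovered indirectly through the loading matrices, which is why the structural coefficients cannot be estimated by ordinary regression on observed variables.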

The hypothesized causal relationships to be estimated take the functional form:

$$ \eta_{1} = \eta_{1} \left( {\xi_{2} ,\xi_{3} } \right) $$
(4)
$$ \eta_{2} = \eta_{2} \left( {\xi_{1} ,\xi_{2} ,\eta_{3} } \right) $$
(5)
$$ \eta_{3} = \eta_{3} \left( {\xi_{2} ,\xi_{3} } \right). $$
(6)
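Read as linear relations, Eqs. 4 to 6 determine which entries of B and Γ in Eq. 3 are free and which are fixed to zero. As a sketch, assuming that only the paths written in Eqs. 4 to 6 are estimated, and using β and γ with subscripts as illustrative labels for the free coefficients:

$$ B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \beta_{23} \\ 0 & 0 & 0 \end{pmatrix} , \quad \Gamma = \begin{pmatrix} 0 & \gamma_{12} & \gamma_{13} \\ \gamma_{21} & \gamma_{22} & 0 \\ 0 & \gamma_{32} & \gamma_{33} \end{pmatrix} $$

Row i of each matrix collects the effects on η_i. H5 additionally posits a path from η1 to η2, which would correspond to a further free entry β21 in the second row of B if that path is included in the estimated model.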

The model was estimated using LISREL maximum likelihood procedures.
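For orientation, maximum likelihood estimation of a covariance structure model is commonly described as minimizing a discrepancy between the sample covariance matrix S and the model-implied covariance matrix Σ. The sketch below computes that standard textbook discrepancy; it is an illustration of the criterion and is not taken from the LISREL runs of this study. The function name `ml_discrepancy` is hypothetical.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """Return F_ML = ln|Sigma| + tr(S Sigma^-1) - ln|S| - k.

    S is the sample covariance matrix, Sigma the model-implied covariance
    matrix, and k the number of observed variables.
    """
    S = np.asarray(S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    k = S.shape[0]
    sign_s, logdet_s = np.linalg.slogdet(S)
    sign_m, logdet_m = np.linalg.slogdet(Sigma)
    if sign_s <= 0 or sign_m <= 0:
        raise ValueError("S and Sigma must have positive determinants")
    return logdet_m + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - k
```

The value is zero when the implied matrix reproduces the sample matrix exactly and grows as the two diverge, so an optimizer would adjust the free entries of the coefficient matrices to drive it down.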

3.3 Data source

Natural hazards in Chinese provinces are taken as the study cases to investigate the latent factors affecting vulnerability and their interrelationships. SEM places requirements on sample size (Bayard and Jolly 2007; Nunnally 1974; Herbert and Bell 1997; Jöreskog and Sörbom 2001; Iacobucci 2010). In many sociological studies, questionnaires are the common approach to data collection (e.g. Chen and Lin 2010; Fyhri and Klæboe 2009; Krishnakumar and Ballon 2008). In this study, the data are statistics on floods, storms, and earthquakes from 1950 to 2009. The disaster frequency, the affected population, and the damage losses are shown in Figs. 2, 3 and 4, respectively. In total, 498 disasters were recorded from 1950 to 2009.

Fig. 2 The annual frequencies of the natural disasters in China (1950–2009) (data source: EM-DAT: The OFDA/CRED international disaster database)

Fig. 3 Total affected population in natural disasters in China (1950–2009) (data source: EM-DAT: The OFDA/CRED international disaster database)

Fig. 4 The estimated damage losses from natural disasters in China (1950–2009) (data source: EM-DAT: The OFDA/CRED international disaster database)

Each disaster is taken as a sample, and the province is the scale of analysis. The latent and observable variables for a given sample are taken from the statistics of that province in the corresponding year. For statistical accuracy and comparability of disaster magnitudes, only disasters with more than 2000 affected people or economic losses over 5 million RMB (current-year prices) are included. After excluding cases with missing data, 348 disaster samples are obtained. The disaster statistics and other provincial statistics are from the yearbooks of 28 mainland provinces, excluding Tibet and Hainan. The Gini coefficients are calculated with the method of Schader and Schmid (1994).
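As a reminder of what the Gini coefficient measures, the sketch below computes it from a list of individual incomes using the common rank-based formula. This is a generic illustration only: it is not the method of Schader and Schmid (1994) used in this study, and the function name `gini` is hypothetical.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of non-negative incomes via the rank-based formula.

    G = sum_i (2i - n - 1) x_i / (n * sum_i x_i), with x sorted ascending
    and i running from 1 to n.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * x) / (n * total))
```

With this formula a perfectly equal distribution gives 0, and a distribution in which one of n people holds all income gives (n − 1)/n, which approaches 1 as n grows.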

3.4 Consistency reliability test on the factors

To test the invariance of causal structures across samples, a cluster analysis based on Euclidean distances is used to categorize the samples. The 348 samples fall into two categories, with GDP per capita as the pivotal variable separating the clusters. GDP per capita is higher in the first category than in the second, and the two categories contain 97 and 251 samples, respectively.
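The text does not state which clustering algorithm was used. As one possible illustration of a two-group split by Euclidean distance, the sketch below runs a plain two-cluster k-means (Lloyd's algorithm); the choice of k-means, the deterministic initialization, and the function name `two_means` are all assumptions made for this example.

```python
import numpy as np

def two_means(data, n_iter=100):
    """Split samples into two clusters by Euclidean distance (k-means, k = 2)."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]          # treat a 1-D input as one feature
    # Deterministic start: rows with the smallest and largest first feature.
    start = [int(np.argmin(data[:, 0])), int(np.argmax(data[:, 0]))]
    centers = data[start].copy()
    for _ in range(n_iter):
        # Distance of every sample to both centers, then nearest-center labels.
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for k in (0, 1):
            members = data[labels == k]
            if len(members):          # keep the old center if a cluster empties
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

When several variables enter the distance, they would normally be standardized first so that a variable with large units, such as GDP per capita, does not dominate the result by scale alone.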

A factor analysis is also carried out on the samples. The sample data are grouped into six parts, corresponding to the six exogenous and endogenous variables. The results show that the coefficients of internal consistency (Cronbach’s α) are all above 0.5, meeting the reliability standard of Nunnally (1974). The loading scores of each of the six main factors are all above 0.45.
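Cronbach’s α for one group of indices can be computed directly from the item variances and the variance of the summed score. The following sketch shows the standard formula; the function name `cronbach_alpha` and the layout of the input (rows as samples, columns as indices of one latent variable) are assumptions for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a samples-by-items array.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of row sums)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1.0 - item_variances / total_variance))
```

Applied once per latent variable, this would give the six coefficients that the text compares against the 0.5 threshold.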

The hypotheses in Eqs. 4, 5, and 6 are tested with all the samples, and then the two clusters are cross-validated separately. To test the consistency of the structural paths, the method of Bayard and Jolly (2007) is employed. The structural paths are first constrained to be equal across groups, and then, in the unconstrained models, the equality assumption on the causal paths is successively relaxed. The unconstrained models are then compared with the constrained model using the chi-square difference test. Model fit is assessed with the comparative fit index (CFI) and the root mean square error of approximation (RMSEA) (Jöreskog and Sörbom 2001). The values CFI = 0.924 and RMSEA = 0.048 both indicate that the model is suitable for verifying and analyzing the hypotheses.
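Both fit indices are simple functions of the model chi-square and its degrees of freedom. The sketch below uses the usual textbook definitions; the chi-square inputs shown in the usage note are hypothetical, since the paper reports only the resulting CFI and RMSEA, and some software divides by N rather than N − 1 in the RMSEA.

```python
import math

def rmsea(chi2, df, n_samples):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n_samples - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index of a model against a baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    if d_base == 0.0:
        return 1.0                 # neither model shows any misfit
    return 1.0 - d_model / d_base
```

For example, with hypothetical values chi2 = 100 on 50 degrees of freedom and 201 samples, `rmsea(100, 50, 201)` is about 0.071, and against a baseline with chi2 = 1050 on 50 degrees of freedom `cfi(100, 50, 1050, 50)` is 0.95.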

4 Results and discussion

4.1 Testing results of the hypotheses

Maximum likelihood (ML) is used as the estimation method, and the interrelationships between the factors in the model are tested. The results are shown in Table 3.

Table 3 Testing results of the structural equation model

Table 3 shows that for the path from income allocation to resilience to natural hazards, the standardized path coefficient is 0.544 with a p value of 0.061. Income allocation and resilience are positively correlated, and the path is significant at the 0.1 level. H1 is supported.

For the path from development level to environmental sustainability, the standardized path coefficient is 0.332 with a p value of 0.005. Development level and environmental sustainability are positively correlated, and the path is significant at the 0.01 level. H2 is supported.

For the path from industrialization level to social structure, the standardized path coefficient is 0.163 with a p value of 0.047. Industrialization level and social structure are positively correlated, and the path is significant at the 0.05 level. H3 is supported.

Notably, for the path from industrialization to environmental sustainability, the standardized path coefficient is 0.009 with a p value of 0.903 > 0.1, so the path is not significant even at the 0.1 level. This indicates that there is no significant relationship between industrialization and environmental sustainability in this model.

For the path from social structure to resilience to natural hazards, the p value is 0.170. The positive correlation is not significant, and H4 is not supported.

For the path from environmental sustainability to resilience, H5 is supported: the more sustainable the socio-economic system, the more resilient it is to natural hazards.
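The decisions above follow one rule: a path counts as significant when its p value is below the chosen level, with 0.1 as the loosest level used in the text. A small sketch that restates the reported figures under that rule is given below. The values are copied from the text, `None` marks a coefficient the text does not report, and H5 is omitted because no figures are given for it; the names `reported_paths` and `is_significant` are hypothetical.

```python
# (label, path, standardized coefficient, p value) as reported in the text.
reported_paths = [
    ("H1", "income allocation -> resilience", 0.544, 0.061),
    ("H2", "development level -> environmental sustainability", 0.332, 0.005),
    ("H3", "industrialization -> social structure", 0.163, 0.047),
    ("-", "industrialization -> environmental sustainability", 0.009, 0.903),
    ("H4", "social structure -> resilience", None, 0.170),
]

def is_significant(p_value, level=0.1):
    """True when the reported p value falls below the chosen level."""
    return p_value < level

for label, path, coef, p_value in reported_paths:
    verdict = "significant" if is_significant(p_value) else "not significant"
    print(f"{label:>3}  {path:<52} p = {p_value:.3f}  {verdict}")
```

Tightening the level to 0.05 with `is_significant(p, level=0.05)` would leave only the H2 and H3 paths significant, which shows how much the support for H1 depends on the 0.1 threshold.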

4.2 The allocation of income is an important factor to regions of different economic development levels

The full sample is split into two groups by GDP per capita, based on the cluster analysis. In the models for the two groups, the factors affecting environmental sustainability, resilience, and social structure act differently. In the lower GDP per capita model, the Gini coefficient has the largest effect on income allocation. As the Gini coefficient rises, resilience to natural hazards falls, and the affected population and the economic loss grow. This indicates that when the income of the affected areas or population groups is relatively low, raising income and living standards becomes the basic need, and the protection of sustainability and related issues is relatively overlooked.

The degree of disparity in income or wealth is an important determinant of vulnerability. China’s Gini coefficient in 2009 was close to 0.5. It is higher in some western provinces, such as Gansu, Shanxi, and Inner Mongolia, than in the eastern provinces. The model results show that provinces with larger Gini coefficients also suffer larger human losses in natural disasters, while provinces with relatively equal incomes, such as Jiangsu, Zhejiang, and Shandong, suffer relatively smaller human losses. In other words, the equality of income allocation reflects access to social and natural resources (Bhagavan and Virgin 2004), as well as control over resources at different levels (Lebel et al. 2005).

4.3 Industrialization level has large impacts on vulnerability in both direct and indirect ways

Industrialization is closely related to human activities and the construction of infrastructure. The models in this study describe the industrialization level through energy use per unit of GDP, energy use per capita, and the share of industrial added value in GDP. China is a developing country undergoing rapid industrialization, and economic growth is one of its priority national strategies. At the same time, the social system is changing fast. Traditional small-scale agriculture is being replaced by large-scale commercial agriculture. Landscapes are being substantially altered by the land use changes of urbanization and construction, which together induce deep changes in geographical and environmental features. The building of infrastructure, such as plants, power stations, hydro dams, roads, and tourism zones, influences the environment and affects the ecosystem and its functions. All of these act directly on vulnerability in place.

In addition, industrialization acts on vulnerability indirectly through social structure. In the lower GDP per capita model, social structure is sensitive to changes in the industrialization level. When industrial added value accounts for a large share of GDP, immigration and emigration increase, and the annual population growth rate increases too. This is consistent with other research (Cutter 1995; Thompson and Sultana 1996). Considering the interrelationships between factors, social structure not only affects the formation and development of vulnerability directly but also acts on other aspects, such as population dynamics and living standards, including income, literacy, and life expectancy. These exogenous factors then act on vulnerability in turn.

5 Conclusions

In this paper, we propose a new technical method for detecting vulnerability through structural equation models that take into account the interactions and causal factors affecting vulnerability. The models are applied in an empirical context to study the driving factors and the manifest features of vulnerability to natural hazards in different provinces of China.

The results show that the allocation of income is an essential determinant of vulnerability. In a sense, the allocation of income indicates the allocation of social resources. It reflects a kind of “routine” in the social system, behind which lies the equity of access to power and entitlement. This conclusion is consistent with earlier cross-national research by Zou and Wei (2010). The results thus support the view that access to power and entitlement is a common, essential factor affecting vulnerability. Although decision makers should take this point seriously when designing vulnerability reduction strategies, putting it into practice will take considerable effort. The level of industrialization also has different impacts on vulnerability in different contexts. In regions with lower economic development levels, when industry accounts for a large share of the economy, the region tends to be more sensitive to the shocks of natural hazards. This suggests that in underdeveloped regions, reducing vulnerability may require a slower pace of industrialization.

For China, where development is very unbalanced across regions and social forms vary among provinces, policies and measures for vulnerability management and reduction should be made much more carefully than in smaller countries. Policy making should take into account the local situations of the different western and eastern provinces. Not only local hazards should be considered, but also the specific development level, the local economic environment, household incomes and their differences, and even population structures and knowledge levels. Only provincial statistics are used in the models of this study, but smaller scales are believed to face the same situation and may need even more detailed consideration. In any case, vulnerability is an “in place” issue, characterized by places and scales.

In this study, the driving factors and manifest features are described broadly, and some other important issues are not included. For example, among the socio-economic factors, gender issues, senior citizens, and vulnerable groups should be considered carefully and comprehensively in their interaction with the whole system. Different types of natural hazards also call for different management patterns, which influence the vulnerability of the social system. Other important elements, such as early warning systems, play a large role in managing and reducing vulnerability. Because of limited data access, these issues are not analyzed or discussed in this paper, but they are by no means unimportant. More detailed investigations and analyses should be carried out in future work.