1 Introduction

Water is an indispensable and fundamental natural resource for human welfare and social development (Gleick 1996), especially in China, which has more than twenty percent of the world’s population but less than seven percent of its freshwater resources (Todd 2010). Water scarcity is a question of means rather than resource availability (Sullivan 2002; Rijsberman 2006), so social, environmental and economic factors need to be integrated into the water resource management model (Jemmali and Matoussi 2013). The theory of water poverty expands the analysis of water scarcity from the field of hydrology to integrated water resources management (Feitelson and Chenoweth 2002; Rijsberman 2006). However, the calculation method of the Water Poverty Index (WPI) shows some shortcomings when applied to the Chinese case. The aims of this paper are therefore to revise the method for calculating the WPI and then to use it to analyze the underlying complexities of water issues in China.

Water poverty is defined as not having sufficient water to cover basic needs. It may be caused by water unavailability, income poverty or other reasons for inadequate access (Lawrence et al. 2002; Komnenic et al. 2009). The measurement of water poverty enables water resource policymakers and managers to track progress and evaluate the effectiveness of actions. The Water Poverty Index (WPI) was created by Sullivan (2002) to reconcile measures of water availability with measures of people’s capacity to access water. There are four methods for calculating the WPI: the conventional composite index approach, the gap approach, the matrix approach and the simple time-analysis approach (Sullivan 2002). The first is the focus of this paper because it is based on identifying the multi-dimensional nature of water poverty. The underlying framework was developed by Lawrence et al. (2002) and covers five key components: water availability, access to water service, capacity for sustaining access, use of water, and the environmental factors influencing water quality and the ecology. Each component is made up of a number of sub-components identified to capture a wide range of water problems. The components are first calculated by min–max normalization of the initial variables followed by a weighted arithmetic average. The components are then combined using the same weighted average method to obtain the final value of the WPI (Sullivan 2002; Lawrence et al. 2002).
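The two-level aggregation of the conventional composite approach can be illustrated with a minimal sketch. The sub-component scores and the equal weights below are hypothetical values chosen for illustration; they are not taken from Sullivan (2002) or Lawrence et al. (2002), and the scores are assumed to be already min–max normalized.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic average of already normalized scores."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical normalized sub-component scores (0 to 1) for one region.
sub_scores = {
    "Resources": [0.40, 0.60],
    "Access": [0.80, 0.70, 0.90],
    "Capacity": [0.50, 0.50],
    "Use": [0.30, 0.50],
    "Environment": [0.60],
}

# Level 1: sub-components -> components (equal weights are an assumption).
component_scores = {
    name: weighted_mean(scores, [1.0] * len(scores))
    for name, scores in sub_scores.items()
}

# Level 2: components -> composite index, using the same averaging rule.
wpi = weighted_mean(list(component_scores.values()), [1.0] * len(component_scores))
print(component_scores)
print(round(wpi, 3))
```

Because both levels are additive, a low score on one component can be offset by high scores on others, which is the compensability issue raised in the criticisms of the index.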

The composite WPI method has been successfully applied, through its final value or in the form of its components, as a monitoring tool to express the water situation at the national scale (Lawrence et al. 2002; Komnenic et al. 2009), regional scale (Heidecke 2006), local scale (Sullivan et al. 2003; Sullivan et al. 2007; Giné Garriga and Pérez Foguet 2011) and basin scale (Pérez Foguet and Giné Garriga 2011). However, the index has been criticized for several weaknesses involving the quality of data, arbitrariness of weights, high correlations between dimensions and variables, and loss of information in the aggregation. Sullivan et al. (2003, 2007) encouraged the use of existing data to reduce costs and promote the calculation of the index. Molle and Mollinga (2003) criticized the index for conflating disparate and correlated pieces of information with arbitrary weights. Nardo et al. (2005) argued that the subjective weights assigned to WPI components affect the statistical properties and interpretability of the final values of the index, and that additive aggregation implies the possibility of offsetting poor performance in some indicators by sufficiently high values of other indicators. Heidecke (2006) recommended further investigation of the equal weighting scheme because it is inadequately explained.

On the basis of previous criticism, principal component analysis (PCA) was applied by Cho et al. (2010), Giné Garriga and Pérez Foguet (2010, 2011), Jemmali and Matoussi (2013) and Jemmali and Sullivan (2014) to revise the method of calculation of the composite WPI. Cho et al. (2010) used PCA to simplify the WPI from five components (Resources, Access, Capacity, Use and Environment) to three components (Access, Capacity and Environment). Giné Garriga and Pérez Foguet (2010) and Jemmali and Sullivan (2014) adopted PCA to calculate each component of the WPI, and then aggregated the components to obtain the final value of the WPI. Jemmali and Matoussi (2013) used PCA to give more weight to components with large variance and to discard components with smaller variance. Their results revealed that PCA can determine the weights objectively and avoid the problem of ambiguity among components. However, in these studies PCA was used only within the calculation of each component and was not applied to all indicators together. Ambiguity may therefore still exist across the full set of indicators. Moreover, water is highly variable on both spatial and temporal scales (Sullivan et al. 2007), which adds two dimensions to the analysis of water poverty, so that the variable matrix tends to be very large and dimension reduction by traditional PCA performs poorly. To strengthen the interpretability of PCA and overcome the limitations of traditional PCA, the centralized logarithm transformation of the initial variables and the methodology of holistic and dynamic PCA are proposed in Sect. 2. In Sect. 3, a step-by-step procedure for developing the composite WPI for the Chinese case is provided. In Sect. 4, the improved holistic and dynamic PCA is applied to analyze regional water poverty trends and influential factors in China. A discussion of the policy implications derived from the results and a conclusion constitute the final section.

2 Improved Holistic and Dynamic PCA

2.1 Limitations of Traditional PCA

Principal component analysis (PCA) is used to transform a large set of correlated variables into a smaller set of uncorrelated components, called principal components, which account for most of the variation in the original set of variables (Dunteman 1989). The complex relationships among the initial variables can thus be simplified because the principal components are uncorrelated. Traditional PCA is primarily oriented to static data, which consist of indicators and samples, and it fails to reveal the evolutionary trend of the data over time.

Traditional PCA is a linear dimension reduction technique because each principal component is a linear combination of the initial variables, with coefficients given by the characteristic vectors of the correlation or covariance matrix (Jolliffe 1986; Johnson and Wichern 2007). In the process of PCA, the initial data need to be normalized in order to eliminate the influence of differing units of measure and scales of parameters, especially for the multidimensional indicators of the WPI in this study. The conventional normalization methods have some limitations. For example, the mean–variance normalization method fails to retain the information on variance among the initial data, and min–max normalization, minimum normalization, maximum normalization, and mean normalization hinder comparability of the results because the maximum, minimum and mean values are highly dependent on the sample. Moreover, these methods are based on the coefficient matrix of the correlation or covariance subject to the two-dimensional uniform distribution (0, 1), and are not adequate when the dimension of time is added to the data, as the relationship between indicators and sample data is nonlinear over time (Ye 2001; Liu et al. 2012).

Consider a set of \( p \)-dimensional component data with \( \sum\nolimits_{i = 1}^{p} x_{i} = 1 \). Then, for each component \( x_{i} \),

$$ \sum\limits_{l = 1}^{p} {Cov(x_{i} ,x_{l} )} = 0,\sum\limits_{i \ne l} {Cov(x_{i} ,x_{l} )} = - Var(x_{i} ) $$
(1)

Because \( Var(x_{i} ) > 0 \) (with \( Var(x_{i} ) = 0 \) only when \( x_{i} \) is constant), at least one of the \( p - 1 \) covariances \( Cov(x_{i} ,x_{l} ),i \ne l \), is negative. In other words, there are at least \( p \) negative covariances in the theoretical covariance matrix \( V = (Cov(x_{i} ,x_{l} )),(i \ne l) \). Their correlation coefficients do not comply with the uniform distribution of (0, 1). For the two-dimensional case \( (x_{1} ,x_{2} ) \), \( Cov(x_{1} ,x_{2} ) = - Var(x_{1} ) = - Var(x_{2} ) \), so the correlation coefficient between \( x_{1} \) and \( x_{2} \) is

$$ \rho = Cov(x_{1} ,x_{2} )\left\{ {Var(x_{1} )Var(x_{2} )} \right\}^{{ - \frac{1}{2}}} = - 1 $$
(2)

This indicates that the correlation coefficient of two-dimensional component data is certain to be −1, rather than fitting the uniform distribution of (0, 1). More generally, the negative skewness of the coefficient matrix of the correlation or covariance is so significant in component data that it is difficult to explain p-dimensional component data by using traditional normalized PCA, such as min–max normalized PCA, which is based on the coefficient matrix of correlation or covariance under the hypothesis of the two-dimensional uniform distribution of (0, 1).
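The closure effect described by Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch on hypothetical compositions (the numbers are illustrative, not data from this study): for two parts that sum to one the sample correlation equals −1, and for three parts each row of the sample covariance matrix sums to zero.

```python
def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)

def correlation(u, v):
    """Pearson correlation built from the sample covariance."""
    return covariance(u, v) / (covariance(u, u) * covariance(v, v)) ** 0.5

# Hypothetical two-part compositions: x1 + x2 = 1 in every observation.
x1 = [0.2, 0.35, 0.5, 0.65, 0.9]
x2 = [1.0 - a for a in x1]
rho_two_part = correlation(x1, x2)

# Hypothetical three-part compositions; every row sums to 1.
rows = [(0.2, 0.3, 0.5), (0.5, 0.1, 0.4), (0.3, 0.6, 0.1), (0.25, 0.25, 0.5)]
cols = list(zip(*rows))
# Eq. (1): the covariances of x_i with all parts (itself included) sum to 0.
row_sums = [sum(covariance(ci, cl) for cl in cols) for ci in cols]

print(rho_two_part)
print(row_sums)
```

Since each variance on the diagonal is positive, a zero row sum forces at least one negative off-diagonal covariance in every row, which is the constraint the text refers to.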

2.2 The p-Dimensional Component Data Transformation of Centralized Logarithm

According to Zhang and Chen (1996), Ye (2001) and Liu et al. (2012), assuming there is a p-dimensional component data vector \( X = (x_{1} ,x_{2} , \ldots ,x_{p} ) \), the method of centralized logarithm can be adopted to transform the vector into

$$ y_{i} = \ln (x_{i} /g(x)),\quad g(x) = (x_{1} x_{2} \ldots x_{p} )^{{\tfrac{1}{p}}} $$
(3)

The random vector \( Y = (y_{1} ,y_{2} , \ldots ,y_{p} ) \) lies in the p-dimensional real space \( R^{p} \), because the space \( \left\{ {(x_{1} ,x_{2} , \ldots ,x_{p} ):\sum\nolimits_{i = 1}^{p} x_{i} = 1,\;x_{i} > 0} \right\} \) corresponds with \( R^{p} \) under the transformation of the centralized logarithm. The negative skewness of the covariance matrix of the component data is eliminated by entering the real space \( R^{p} \). The principal component analysis of \( X = (x_{1} ,x_{2} , \ldots ,x_{p} ) \) is thus transformed into the analysis of \( Y = (y_{1} ,y_{2} , \ldots ,y_{p} ) \).
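The transformation in Eq. (3) can be sketched in a few lines. This sketch assumes that \( g(x) \) is the geometric mean of the parts, the usual choice for a centered log-ratio; the input vector is a hypothetical composition.

```python
import math

def clr(x):
    """Centered log-ratio: ln(x_i / g(x)), with g(x) the geometric mean."""
    if any(v <= 0 for v in x):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)  # equals ln g(x)
    return [lv - mean_log for lv in logs]

x = [0.2, 0.3, 0.5]  # hypothetical three-part composition
y = clr(x)
print(y)
print(sum(y))  # transformed parts sum to zero up to rounding
```

Working with the mean of the logarithms avoids forming the product of all parts directly, which can underflow when p is large.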

The principal component analysis of \( Y = (y_{1} ,y_{2} , \ldots ,y_{p} ) \) focuses on finding the maximum value of the variance \( Var(a^{\prime}Y) \) subject to \( a^{\prime}a = 1 \), where \( a = (a_{1} ,a_{2} , \ldots ,a_{p} ) \). It is easy to obtain

$$ Var(a^{\prime}Y) = Var(a^{\prime}\ln (x/g(x))) = a^{\prime}\tau a $$
(4)

where the covariance matrix of the centralized logarithm is

$$ \tau = \left[ {Cov(\ln (x_{i} /g(x)),\ln (x_{j} /g(x)))} \right] $$
(5)

Assuming that the p characteristic roots of \( \tau \) are \( \lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{p} \), that the corresponding characteristic vectors are \( a_{1} ,a_{2} , \ldots ,a_{p} \), and that \( (\tau - \lambda_{i} I)a_{i} = 0,\quad i = 1,2, \ldots ,p \), then the \( i{\text{th}} \) principal component is \( a_{i}^{\prime } \ln (x/g(x)) \), which is a nonlinear combination of x.
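The maximization of \( a^{\prime}\tau a \) subject to \( a^{\prime}a = 1 \) is solved by the characteristic vector of the largest characteristic root. A minimal sketch using power iteration is given below; power iteration is a swapped-in numerical technique for illustration (the source does not state how the roots were computed), and the 2 × 2 matrix is hypothetical.

```python
def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue and unit eigenvector
    of a symmetric positive semi-definite matrix."""
    p = len(matrix)
    a = [1.0 / p ** 0.5] * p  # start vector with a'a = 1
    for _ in range(iterations):
        b = [sum(matrix[i][j] * a[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in b) ** 0.5
        if norm == 0.0:
            break  # start vector lies in the null space
        a = [v / norm for v in b]
    # Rayleigh quotient a' tau a: the variance captured along direction a.
    lam = sum(a[i] * matrix[i][j] * a[j] for i in range(p) for j in range(p))
    return lam, a

tau = [[4.0, 1.0], [1.0, 2.0]]  # hypothetical covariance matrix
lam1, a1 = leading_eigenpair(tau)
print(lam1, a1)
```

The sketch would fail to converge to the leading direction if the start vector were exactly orthogonal to it; production work would normally rely on a tested eigen-solver instead.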

2.3 Holistic and Dynamic Principal Component Analysis

In traditional PCA, the static data \( R_{n \times p} \) consist of samples \( (e_{1} ,e_{2} , \ldots ,e_{n} ) \) and indicators \( (x_{1} ,x_{2} , \ldots ,x_{p} ) \). When time is taken into consideration, \( X_{t} \) is the array of time-series data of \( R_{n \times p} \), which can be denoted as \( K = \{ X_{t} \in R_{n \times p} ,\;t = 1,2, \ldots ,T\} \). Assuming that, in the time-series data \( X_{t} \), \( N_{t} \) are the values of the indicators \( (x_{1} ,x_{2} , \ldots ,x_{p} ) \) of the samples \( (e_{1} ,e_{2} , \ldots ,e_{n} ) \) at time t, then \( N = \bigcup\nolimits_{t = 1}^{T} N_{t} \) is the cluster of the values of the indicators \( (x_{1} ,x_{2} , \ldots ,x_{p} ) \) of the samples \( (e_{1} ,e_{2} , \ldots ,e_{n} ) \) over the period \( T \). The cluster \( N \) is the object of the holistic and dynamic principal component analysis.

Based on the discussion in Sect. 2.2, assuming that the p-dimensional component vector \( X = (x_{1} ,x_{2} , \ldots ,x_{p} ) \) has the initial variable \( (x_{ijt} )_{p \times n \times T} \), which can be simplified as \( (x_{ik} )_{p \times nT} \), the holistic and dynamic PCA involves four key steps:

Firstly, transform the initial variable into the centralized logarithm

$$ y_{ik} = \ln x_{ik} - \frac{1}{nT}\sum\limits_{k = 1}^{nT} {\ln x_{ik} } $$
(6)

Secondly, compute the covariance matrix of the centralized logarithm sample

$$ S = (S_{ef} )_{p \times p} $$
(7)

where \( S_{ef} = \frac{1}{nT - 1}\sum\limits_{k = 1}^{nT} {(y_{ek} - \overline{{y_{e} }} )} (y_{fk} - \overline{{y_{f} }} ) \) , \( \overline{{y_{e} }} = \frac{1}{nT}\sum\limits_{k = 1}^{nT} {y_{ek} } ,\overline{{y_{f} }} = \frac{1}{nT}\sum\limits_{k = 1}^{nT} {y_{fk} } \).
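The first two steps, Eqs. (6) and (7), can be sketched on a small pooled data set. The numbers below are hypothetical (p = 2 indicators, n = 2 regions, T = 3 years, pooled into nT = 6 observations per indicator), and the function names are introduced here only for illustration.

```python
import math

def centered_logs(series):
    """Eq. (6): log of each pooled observation minus the mean log."""
    logs = [math.log(v) for v in series]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]

def covariance_matrix(rows):
    """Eq. (7): p x p sample covariance of p indicators over nT observations."""
    k = len(rows[0])
    means = [sum(r) / k for r in rows]
    return [
        [
            sum((row_e[i] - mean_e) * (row_f[i] - mean_f) for i in range(k)) / (k - 1)
            for row_f, mean_f in zip(rows, means)
        ]
        for row_e, mean_e in zip(rows, means)
    ]

# Hypothetical pooled observations: one row per indicator, one column per
# region-year combination (k = 1, ..., nT).
x = [
    [1.0, 2.0, 4.0, 8.0, 2.0, 4.0],
    [3.0, 3.0, 9.0, 27.0, 9.0, 3.0],
]
y = [centered_logs(series) for series in x]
S = covariance_matrix(y)
print(S)
```

Pooling all region-year observations before centering is what places every year on a common scale, so that scores from different years remain comparable.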

Thirdly, assuming that \( \lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{p} \) are the p characteristic roots of S, and \( a_{1} ,a_{2} , \ldots ,a_{p} \) are the corresponding characteristic vectors, then the value of the \( i{\text{th}} \) principal component is

$$ F_{i} = \sum\limits_{j = 1}^{p} {a_{ij} \ln x_{ij} } $$
(8)

Finally, the principal components are weighted with the corresponding proportion of variance in the original set of variables explained by that particular principal component, and the comprehensive index is computed using the following formula:

$$ Z = \frac{{\sum\limits_{i = 1}^{m} {\omega_{i} F_{i} } }}{{\sum\limits_{i = 1}^{m} {\omega_{i} } }} $$
(9)

where Z is the value of the comprehensive index, m is the number of retained principal components, and \( \omega_{i} \) is the proportion of variance explained by the \( i{\text{th}} \) principal component. The larger the value of Z, the poorer the regional water situation, and vice versa.
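Eq. (9) is a variance-weighted average of the retained component scores. A minimal sketch follows; the weights and scores are hypothetical values for one region in one year, not results from this study.

```python
def composite_index(scores, weights):
    """Eq. (9): variance-weighted average of principal component scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * f for w, f in zip(weights, scores)) / sum(weights)

# Hypothetical example with m = 3 retained components.
omega = [0.45, 0.15, 0.10]  # proportions of variance explained
F = [1.2, -0.4, 0.5]        # component scores for one region-year
Z = composite_index(F, omega)
print(round(Z, 4))
```

Dividing by the sum of the weights matters because the retained components explain less than the full variance, so the proportions do not sum to one.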

3 Developing the Water Poverty Index for the Chinese Case

3.1 Study Area

With rapid economic growth, the gap between water resource availability and water demand for ecological protection is gradually widening in China. The Chinese government has been aware of the water scarcity problem and began to reform its water resource management system in the late 1990s. China’s Water Law has been modified several times, and related water management regulations number in the hundreds or even thousands. Yet the problems of water shortages and degraded water quality remain severe. The annual per capita freshwater resource is 2156 m³, which is less than a quarter of the global average (Jiang 2009). The direct impact of water pollution costs China about 1 percent of its GDP each year (OECD 2007). Water resource management is a top priority in the government’s policy agenda to promote natural resource conservation (SC 2006).

China has 22 provinces, four municipalities directly under the control of the central government (Beijing, Tianjin, Shanghai and Chongqing) and five autonomous regions (Guangxi, Xinjiang, Inner Mongolia, Ningxia, and Tibet). In the national statistical process, the provinces, autonomous regions and municipalities directly under the control of the central government are investigated as observations at the same level because they have equal administrative status in China. The study area covers all provinces, municipalities directly under the control of the central government and autonomous regions except Tibet, which is excluded because of insufficient data from 2004 to 2012. The endowment of natural resources and the economic characteristics of the study areas are presented based on the eight regions proposed by the State Council in 2006 (SC 2006), which are Northeast, North Coast, East Coast, South Coast, Central Yellow River Delta, Central Yangtze River Delta, Great Southwest and Great Northwest (shown in Table 1).

Table 1 Eight regions in China

3.2 Selection of the Indicators

To minimize the potential for ad hoc selection of indicators (Cho et al. 2010), a comprehensive review of indicators used in regional water poverty studies was first carried out, which provided relevant indicators to represent the five components of the WPI adopted in the study (Table 2). Second, the China Statistical Yearbook and the China Environmental Yearbook were inspected to select appropriate indicators for the measurement of each dimension of the WPI and to confirm the availability, reliability, understandability and regular updatability (Feitelson and Chenoweth 2002; Zhou et al. 2006; Nardo et al. 2005) of the panel data for 2004–2012. As shown in Table 3, 26 indicators were obtained.

Table 2 Basic steps in composite index design
Table 3 Specification of indicators used in the WPI framework

The third step is to classify appropriate indicators based on the WPI framework of Lawrence et al. (2002), because its five components are considered to integrate the complexity of the water sector. The indicators of Resources include the amount of groundwater and surface water resources in order to assess total water availability. The indicators of Access cover access to drinking water and to sewage disposal. The indicators of Capacity focus on water institutional capacity by combining a set of regional development indicators. The indicators of Use combine the stress of water use and the efficiency of water use in agricultural and industrial production; domestic water use per capita and industrial water use per capita are discarded because increasing water use is good up to a certain point but becomes bad if inefficiency results from waste water (Lawrence et al. 2002). The indicators of Environment cover aquatic environmental pollution and aquatic environmental protection. The 26 indicators were classified as shown in Table 3.

The fourth step is to pretreat the initial data by the min–max normalization method, which is widely used in WPI studies (adapted from Lawrence et al. 2002; Sullivan 2005; Komnenic et al. 2009). When a higher value of the variable indicates less regional water poverty, the variables are transformed into indicators using the following formula:

$$ x_{i}^{ * } = \frac{{x_{\hbox{max} } - x_{i} }}{{x_{\hbox{max} } - x_{\hbox{min} } }} $$
(10)

When a higher value of the variable indicates a poorer regional water situation, the variables are transformed using the following equation:

$$ x_{i}^{ * } = \frac{{x_{i} - x_{\hbox{min} } }}{{x_{\hbox{max} } - x_{\hbox{min} } }} $$
(11)
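Eqs. (10) and (11) can be sketched as a single direction-aware function, so that a larger normalized value always indicates a poorer water situation. The indicator values and the flag name `higher_is_poorer` are hypothetical and introduced only for illustration.

```python
def min_max(values, higher_is_poorer):
    """Eq. (11) when higher raw values mean a poorer water situation,
    Eq. (10) otherwise, so that larger outputs always mean poorer."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant; its range is zero")
    if higher_is_poorer:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]

discharge = [10.0, 20.0, 40.0]  # hypothetical pollution indicator
access = [50.0, 75.0, 100.0]    # hypothetical access rate (%)

poorer_scores = min_max(discharge, higher_is_poorer=True)
access_scores = min_max(access, higher_is_poorer=False)
print(poorer_scores)
print(access_scores)
```

Both outputs depend on the sample minimum and maximum, which is the sample dependence noted earlier as a limitation of this normalization.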

The descriptive statistical analysis of the 26 indicators, covering 22 provinces, four municipalities directly under the control of the central government and four autonomous regions from 2004 to 2012, indicated a negatively skewed distribution for most of the data, which supports the conclusion in Sect. 2.1 that the correlation coefficient of p-dimensional component data is negative, rather than fitting the uniform distribution of (0, 1).

Thus the initial variables were rescaled using the centralized logarithm presented in Eq. (6) when a higher value of the variable indicates a poorer regional water situation. The reciprocal of the initial variable was used in the centralized logarithm when a higher value of the variable indicates less regional water poverty.

The fifth step is to filter the indicators for least eclipsing by comparing the correlation coefficient and significance level for every pair of indicators. Based on correlation coefficients above 0.7 and significance levels below 0.01, seven redundant indicators were removed: water resource per capita, proportion of environmental protection investment, GDP per capita, income ratio between urban and rural households, water consumption per 10,000 yuan of GDP, water consumption per 10,000 yuan of industrial value added, and proportion of wetland in the total area of the territory. For instance, two variables, GDP per capita and proportion of tertiary industry in GDP, are used to measure economic capacity, but GDP per capita is highly correlated with the number of college students per 10,000 people, Engel’s coefficient of households, rural tap water access rate, urban water access rate, and percentage of urban sewage disposed (the correlation coefficient is above 0.70 and the significance level is <0.01). The variable GDP per capita was therefore discarded, and proportion of tertiary industry in GDP was retained in the sub-component of economic capacity.
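The pairwise screening in this step can be sketched as follows. The data are hypothetical, the indicator names are shortened labels, and two simplifications are assumptions of this sketch: the absolute value of the correlation is compared with the 0.7 threshold, and the significance test is omitted.

```python
def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5

def correlated_pairs(indicators, threshold=0.7):
    """Return pairs of indicator names with |r| above the threshold."""
    names = list(indicators)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(indicators[first], indicators[second])
            if abs(r) > threshold:
                pairs.append((first, second, round(r, 3)))
    return pairs

# Hypothetical values for three indicators across five observations.
data = {
    "gdp_per_capita": [1.0, 2.0, 3.0, 4.0, 5.0],
    "college_students": [1.1, 1.9, 3.2, 3.9, 5.1],
    "water_per_area": [2.0, 5.0, 1.0, 4.0, 3.0],
}
flagged = correlated_pairs(data)
print(flagged)
```

The function only flags candidate pairs; which member of a pair to discard remains a judgment call, as in the GDP per capita example in the text.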

Bartlett’s test of sphericity on the retained indicators is significant at the 0.01 level (\( \chi^{2} \) = 171; P value = 0.000), indicating significant correlations. The Kaiser–Meyer–Olkin Measure of Sampling Adequacy is 0.685, which falls in the acceptable range. These test results indicate that the remaining 19 indicators are suitable for PCA. Thus, the indicators were finally selected such that they were measurable, reliable, valid and comparable, remained independent of each other, and could be associated with the practices and policies of water resources management in China. The specification of the parameters and indicators used in the WPI framework is presented in Table 3.

3.3 Construction of the Composite Index

As described in Sect. 2.3, the covariance matrix S of the centralized logarithm sample is computed according to Eq. (7). Then, assuming that \( \lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{p} \) are the p characteristic roots of S and \( a_{1} ,a_{2} , \ldots ,a_{p} \) are the corresponding characteristic vectors, the value of the \( i{\text{th}} \) principal component is calculated according to Eq. (8). Finally, the principal components are weighted by the proportion of variance in the original set of variables explained by each principal component, and the comprehensive index is computed using Eq. (9). The basic steps of composite index design are presented in Table 2.

3.4 Comparison of CLHDPCA and MMHDPCA

Factors contributing to the Water Poverty Index based on the min–max normalized holistic and dynamic PCA (MMHDPCA) and the centralized logarithm normalized holistic and dynamic PCA (CLHDPCA) were constructed using SPSS 17.0 for Windows (Table 4).

Table 4 Total variance explained by centralized logarithm (CL) holistic and dynamic PCA and min–max normalized (MM) holistic and dynamic PCA

The cumulative variance explained by four components in CLHDPCA is 76.90 %, compared to 57.72 % in MMHDPCA. Four components in CLHDPCA are enough to account for more than 75 % of the variation while seven components are required in MMHDPCA, demonstrating the greater dimensional reduction in CLHDPCA compared to MMHDPCA. Therefore in the following discussion, four components are extracted, and based on CLHDPCA the comprehensive index (Z) is computed using Eq. (9) where m = 4.
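The comparison above rests on the cumulative share of variance explained by the leading components. A minimal sketch of that calculation is shown below; the eigenvalues are hypothetical and are not those reported in Table 4.

```python
def components_needed(eigenvalues, target=0.75):
    """Smallest number of leading components whose cumulative share
    of total variance reaches the target, with the share achieved."""
    total = sum(eigenvalues)
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for count, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= target:
            return count, cumulative
    return len(ordered), cumulative

eigs = [4.0, 2.0, 1.0, 0.5, 0.5]  # hypothetical characteristic roots
m, share = components_needed(eigs)
print(m, share)
```

A normalization that concentrates variance in fewer components reaches the target with a smaller m, which is the sense in which CLHDPCA achieves greater dimension reduction than MMHDPCA here.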

4 Empirical Results

4.1 Water Poverty Trends in China

The WPI results based on the centralized logarithm normalized holistic and dynamic PCA for China’s provinces, autonomous regions, and municipalities directly under the control of the central government are exhibited in Table 5 and Fig. 1. In Table 5, \( Z_{2012} \) and \( Z_{2004} \) represent the WPI values in 2012 and 2004, respectively, while \( Z_{2004-2012} \) represents the median value between 2004 and 2012. \( R_{2012} \) and \( R_{2004} \) represent the rank of the WPI in 2012 and 2004, respectively, while \( R_{2004-2012} \) represents the rank of the median value between 2004 and 2012. Taking Shanghai as an example, the 19 initial variables are normalized using the centralized logarithm, four principal components are extracted, and the final values of the WPI from 2004 to 2012 are calculated according to Eq. (9). The results are \( Z_{2004} = -0.11 \), \( Z_{2004-2012} = -0.86 \) and \( Z_{2012} = 0.03 \). Shanghai’s rank (R) in each year from 2004 to 2012 (closer to the top rank means more water poverty) can be obtained once the final values of the comprehensive index of water poverty have been computed for all other regions. The ranks for Shanghai are \( R_{2004} = 17 \), \( R_{2004-2012} = 28 \) and \( R_{2012} = 30 \). Comparing the values in 2004 with those in 2012 (Table 5), the value of the comprehensive index (Z) of water poverty increased in nearly every region (except Tianjin), which reveals that the general situation of water poverty was worsening.
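The ranking convention used in Table 5, in which rank 1 is assigned to the region with the largest Z and therefore the most water poverty, can be sketched as follows. The region labels and Z values are hypothetical.

```python
def rank_descending(z_by_region):
    """Rank 1 goes to the largest Z, i.e. the most water-poor region."""
    ordered = sorted(z_by_region, key=z_by_region.get, reverse=True)
    return {region: position for position, region in enumerate(ordered, start=1)}

z_values = {"A": 1.4, "B": 0.03, "C": 0.9}  # hypothetical index values
ranks = rank_descending(z_values)
print(ranks)
```

Ties are broken here by input order, which is an arbitrary choice of this sketch; the source does not describe how ties were handled.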

Table 5 Water poverty index (Z) and its rank (R) based on centralized logarithm holistic and dynamic PCA
Fig. 1 Regional WPI values for 2012 and median values for 2004–2012

Although the general trend of regional water poverty is similar, the value of the comprehensive index (Z) follows differing trajectories in different regions. Some provinces and autonomous regions located in the Central Yellow River Delta, Great Northwest and South Coast, including Inner Mongolia, Qinghai, Ningxia, Gansu, Xinjiang, Shaanxi, Shanxi, and Hainan, have remained in the worst water poverty in recent years, whereas the level of water poverty in Xinjiang was the lowest in 2004. Four provinces and two municipalities directly under the control of the central government, located in the East Coast, South Coast and North Coast, namely Beijing, Shanghai, Fujian, Zhejiang, Jiangsu, and Guangdong, had the lowest water poverty in 2012. The level of water poverty in Tianjin, Chongqing, Hubei, Shandong, and Hebei first declined and then rose. In Heilongjiang, Jilin, Liaoning, Hunan, Guizhou, Henan, Sichuan, Anhui, Yunnan, Guangxi, and Jiangxi, the level of water poverty has increased steadily in recent years.

The results are presented further according to the ranks of the comprehensive scores of the WPI in different regions between 2004 and 2012, as shown in Table 5. In 2004, the regions with the worst water poverty (ranking from 1 to 8) were scattered across the Great Northwest (such as Qinghai and Ningxia), Central Yellow River Delta (such as Inner Mongolia and Shanxi), South Coast (such as Hainan), and Great Southwest (such as Chongqing). Over the following years, the municipalities directly under the control of the central government (Beijing, Tianjin, Chongqing, and Shanghai) and some developed provinces of the southeastern coast, such as Guangdong, Fujian, Zhejiang, and Jiangsu, dropped in the ranks of the WPI. In 2012, the areas with the worst water poverty (ranking from 1 to 8) were clustered in the Great Northwest, covering Qinghai, Ningxia, Xinjiang and Gansu, and in the Central Yellow River Delta, such as Inner Mongolia and Shanxi. The changing trajectories of the regions with the worst water poverty (ranking from 1 to 8) are shown as the darkest areas in Fig. 2, which indicates that regional water poverty has become more spatially clustered in the Great Northwest in recent years.

Fig. 2 The comprehensive ranks of WPI in different regions between 2004 and 2012

4.2 Influential Factors Associated with Water Poverty

Which factors drive the WPI for China? The principal components whose eigenvalues are greater than one, which explain the majority of the variance in the original data, are the primary influential factors producing the final results. Three principal components have eigenvalues greater than one and are analyzed here, as shown in Tables 4 and 6. Among them, component 1 (Table 4, CLHDPCA) was characterized by high positive loadings for rate of ammonia nitrogen discharged (loading of 0.965, Table 6) and rate of chemical oxygen demand discharged (loading of 0.914, Table 6). This component was interpreted as the “pollution factor (P)” and explained 43.69 % of the variance in the data (Table 4). Water pollution is therefore the current dominant factor contributing to regional water poverty in China. From the regional scores of the principal components in Table 7, the score range of the water pollution dimension shifts from [−11.19, 2.99] in 2004 to [0.72, 3.09] in 2012, with [−1.87, 1.48] as the range for the median. The lower limit of the range keeps rising, which indicates that the general level of regional water pollution is increasing. As water pollution is the component explaining the largest share of variance, these results illustrate that aquatic environmental pollution is the main cause of China’s water poverty and that it has been worsening over the past 9 years. Furthermore, as shown by the darkest areas in Fig. 3, the changing trajectories of the regions with the heaviest water pollution (ranking from 1 to 8) indicate that regional water pollution has become more spatially clustered in recent years.

Table 6 Factor loading pattern
Table 7 Factors’ scores (S) and ranks (R)
Fig. 3 The ranks of the water pollution factor in different regions between 2004 and 2012

Component 2 was characterized by high positive loadings for water resource per unit area and water resource production coefficient. This component was interpreted as the “resource availability factor (R)” and explained 14.30 % of the variance in the data (Table 4). This result reveals that water resource endowment is still an important endogenous factor in regional water poverty in China. In Table 7, the score range of the water resource availability dimension narrows from [−2.67, 2.85] in 2004 to [−2, 1.92] in 2012, with [−2.65, 2.05] as the range of the median over these 9 years, which indicates that the regional imbalance in the distribution of water resources is decreasing through the reform of the water resource distribution system, including the South-to-North Water Diversion Project, interregional water rights transfer, and integrated watershed management. Beijing, Tianjin, Hebei, Shaanxi, Liaoning, and many other northern areas have benefited directly from the South-to-North Water Diversion Project. Interregional water rights transfer and the integrated watershed management strategy are gradually moving forward, especially in the East Coast and Great Northwest. Nevertheless, the trajectories of the regions with the poorest water resource availability (ranking from 1 to 8) indicate that the scarcest water resources are still concentrated in northern China, as shown by the darkest areas in Fig. 4.

Fig. 4 The ranks of the water resource availability factor in different regions between 2004 and 2012

Component 3 was characterized by high positive loadings for the number of college students per 10,000 people, rural tap water access rate, urban drainage pipe access rate, water-saving irrigation rate, percentage of urban sewage disposed, rate of recycling use for industrial water, and proportion of tertiary industry in GDP. This component was interpreted as the “capacity-access-use efficiency factor (C)”, which explained 11.82 % of the variance in the data (Table 4). The indicators with high negative loadings in each principal component are also informative. There are high negative loadings for proportion of tertiary industry in GDP and the number of college students per 10,000 people on component 1, suggesting a negative relation between pollution and a more educated, service-oriented workforce. There are high negative loadings for Engel’s coefficient, fertilizer and pesticide amount per unit cultivated land, and water-saving irrigation rate on component 2, suggesting that water resource endowment is negatively related to poverty, agricultural chemical input use, and water-saving technology. There is a high negative loading for fertilizer and pesticide amount per unit cultivated land on component 3, suggesting that capacity-access is negatively related to agricultural chemical input use. Furthermore, as shown by the darkest areas in Fig. 5, the trajectories of the regions with the poorest capacity-access-use efficiency (ranking from 1 to 8) indicate that these areas have shifted from the eastern coast to the southwestern coast in recent years.

Fig. 5 The ranks of the capacity-access-use efficiency factor in different regions between 2004 and 2012

4.3 Categories of Water Poverty

Based on the performance of the three principal components, the study areas are sorted and analyzed in order to understand the factors driving the results for each region (Table 8). The categories of component performance were defined as best (ranking 21st through 30th), second best (ranking 11th through 20th), and worst (ranking 1st through 10th). If one or more components were in the worst category, those components were included in the region’s overall description. If all three components were in the second best category, the component(s) which ranked the lowest would be included in the overall description. If a component was in the best category, that component was left out. Where a region was in the best category for every component, that region was not included.

Table 8 Categories of regional water poverty

For example, in 2012 Gansu ranked 3rd, 4th and 7th on the three principal components, placing it in the worst category for each component, and thus its primary causes of water poverty are water pollution (P), water resource availability (R) and capacity-access-use efficiency (C). Hainan’s ranks in 2012 were 5th (worst) for P, 14th (second best) for R and 23rd (best) for C; therefore component P was included in the overall description as its primary cause of water poverty. Liaoning’s ranks in 2004 were 16th for P, 11th for R and 13th for C, placing it in the second best category for each component. Component R ranked the lowest and is labeled as its primary cause of water poverty. Fujian’s ranks in 2012 were 25th, 27th and 25th, placing it in the best category for each component. In this case, the region was not included in Table 8.
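The categorization rule and the four examples above can be sketched in a few lines of Python. This is only an illustrative reading of the rule as stated in the text, not the authors' implementation: the function names, the hyphen-joined label format and the assumption of exactly 30 ranked regions are ours. The text does not say how to label a region whose components are a mix of second best and best with none in the worst category, so the sketch returns no label in that case, which is an assumption.

```python
from typing import Optional

# P = water pollution, R = water resource availability,
# C = capacity-access-use efficiency (labels taken from the text).
COMPONENTS = ("P", "R", "C")


def category(rank: int) -> str:
    """Map a rank (1 = poorest performance, 30 = best) to its category."""
    if not 1 <= rank <= 30:
        raise ValueError(f"rank must be between 1 and 30, got {rank}")
    if rank <= 10:
        return "worst"
    if rank <= 20:
        return "second best"
    return "best"


def water_poverty_label(p: int, r: int, c: int) -> Optional[str]:
    """Return a label such as 'P-R-C', or None if the region is not listed."""
    ranks = dict(zip(COMPONENTS, (p, r, c)))
    cats = {name: category(rank) for name, rank in ranks.items()}

    # Rule 1: every component in the worst category enters the label.
    worst = [name for name in COMPONENTS if cats[name] == "worst"]
    if worst:
        return "-".join(worst)

    # Rule 2: all three second best -> the lowest-ranked component only.
    if all(cat == "second best" for cat in cats.values()):
        return min(COMPONENTS, key=lambda name: ranks[name])

    # Rule 3: all best -> region not included. A mix of second best and
    # best is not covered by the text; returning None here is an assumption.
    return None


if __name__ == "__main__":
    examples = {
        "Gansu 2012": (3, 4, 7),
        "Hainan 2012": (5, 14, 23),
        "Liaoning 2004": (16, 11, 13),
        "Fujian 2012": (25, 27, 25),
    }
    for region, ranks in examples.items():
        print(region, water_poverty_label(*ranks))
```

Applied to the ranks quoted above, the sketch labels Gansu (2012) as P-R-C, Hainan (2012) as P and Liaoning (2004) as R, and returns no label for Fujian (2012), which matches the worked examples in the text.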

The first category is P water poverty, characterized by high pollution emissions to water bodies. Hainan is typical of this type, having suffered high water pollution in recent years. Hainan (latitude 3°30N–20°7N and longitude 108°15E–120°5E) is one of five Special Economic Zones that, since the early 1980s, have been given priority in attracting foreign capital by exempting such investments from taxes and regulations. From 1980 to 2012, Hainan’s GDP increased ten-fold with an annual growth rate of 7.45 %. As a tropical island province, Hainan’s marine fishery and marine aquaculture industries contribute greatly to local economic development but also to pollution, placing severe stress on the aquatic environment. Owing to a lack of environmental protection and wastewater treatment facilities, the annual rate of ammonia nitrogen discharged, the rate of chemical oxygen demand discharged, and the fertilizer and pesticide amount per unit cultivated land are 0.64, 2.16 and 0.09, respectively, much higher than the national averages of 0.26, 1.75 and 0.04. There is an urgent need for water pollution control and natural ecosystem restoration in Hainan.

The second category is R water poverty, characterized by poor water resource availability and mainly covering the water-scarce North Coast, Central Yellow River Delta and Great Northwest. However, this category also includes some water-abundant developed regions in the East Coast and South Coast, including Shanghai, Jiangsu and Fujian, where rapid economic growth has increased the demand for water and stressed water resource availability. Shanghai is located downstream on the Yangtze River and in the Lake Tai basin, and receives polluted water from both upstream and local sources. Jiangsu and Fujian face similar problems. Water poverty in these areas derives not only from limited water resource availability relative to increasing requirements, but also from poor quality that renders water unusable.

The third category is C water poverty, which covers some water-abundant regions in the Great Southwest, Central Yangtze River Delta, and Northeast. The low capacity-access-use efficiency reflects the fact that these regions do not use available water resources efficiently. For example, in Guangdong (latitude 23°02N–23°38N and longitude 116°14E–117°19E) the average water-saving irrigation rate and the average rate of recycling use for industrial water are 6.1 % and 37 %, respectively, compared with national averages of 24.61 % and 60 %. Improvements in water saving are urgently needed to forestall future scarcities as water needs rise.

The fourth category, P–R water poverty, is characterized by high pollution and poor availability of water resources, and includes Xinjiang (latitude 42°45N–44°08N and longitude 86°37E–88°58E) in northwestern China. In 2000, Xinjiang was included in the West China Development Program (WCDP), which was launched to stimulate economic growth in western areas. The region’s GDP grew at an annual rate of 14.6 % from 2000 to 2012. Meanwhile, Xinjiang has suffered serious environmental degradation owing to its natural-resource-focused industries. The underground water has been polluted by oil and natural gas overexploitation and poor waste disposal. Xinjiang’s R water poverty in 2004 turned into P–R water poverty in 2012 as a consequence of the increasing contamination of underground water.

The fifth category is P–C water poverty, characterized by high pollution and low capacity-access-use efficiency. Chongqing (latitude 28°10N–32°13N and longitude 105°11E–110°11E) is located upstream of the Three Gorges Reservoir Area. Treating water pollution in Chongqing is critical for the safety of the Three Gorges Reservoir Area and the middle and lower Yangtze River. In 2001, the Water Pollution Prevention and Control Planning of the Three Gorges Reservoir Area was enacted, and over 10 years more than 21 billion yuan from the central government was invested in water pollution treatment, including the construction of urban sewage disposal plants, refuse landfills, and sewage pipe networks. As a result, Chongqing has moved from the P–C to the C category since 2011.

The sixth category, R–C water poverty, is characterized by poor water resources and weak capacity-access-use efficiency. For example, Gansu (latitude 35°5N–38°N and longitude 102°30E–104°30E), located on the upper reaches of the Yellow River in the inland arid area of northwestern China, has water resources per capita of less than one-third of the national average and less than one-eighth of the world average. Underground water in Gansu has been overexploited in recent years because of the scarcity of surface water. In 2000, Gansu was included in the WCDP for development of its abundant reserves of minerals and coal. Gansu’s underground water and ecological environment were further damaged by heavy industry development. As a result, water poverty in Gansu deteriorated from R–C in 2004 to P–R–C in 2012.

Finally, P–R–C water poverty features heavy pollution, poor water resource availability and low capacity-access-use efficiency. P–R–C water poverty is largely found in water-scarce and undeveloped northern China. For instance, Qinghai (latitude 31°9N–39°19N and longitude 89°35E–103°04E) remained in this category throughout 2004–2012. A fragile ecological environment, heavy water pollution, poor availability of water resources, weak access to water supply and sanitation facilities, and relatively low capacity to manage water resources and use them efficiently are mainly responsible for the high water poverty of Qinghai.

5 Discussion and Conclusion

The WPI results have some implications for water resource planning and management. China has experienced an unprecedented economic boom, with real GDP rising at an annual average rate of nearly ten percent over the past three decades. The WPI results obtained in this study with the improved holistic and dynamic principal component analysis show that economic development has improved water resource availability, physical access to water, water-saving facilities, sewage treatment utilities, and the regional distribution of water resources, but that it does not automatically translate into less water poverty, because of rising water consumption and increasingly serious pollution discharge. The comprehensive WPI scores and ranks derived from the panel data demonstrate that the general situation of water poverty is worsening in China, and that the water poverty areas have become more spatially clustered in the Great Northwest in recent years. We highlight that policy-makers should treat the impact on the aquatic environment as a top priority when setting any strategy for economic growth, because water scarcity and poor water quality will directly threaten Chinese food security, economic development and quality of life. Government policies are believed to play a critical role in driving sustainable water resource development in China, and problem-specific policy interventions and planning would help improve the regional water poverty situation.

Environmental pollution is the most important driver of water poverty (P water poverty). Increased agricultural fertilizer and pesticide application, livestock waste, domestic sewage and industrial wastewater discharge contribute to the accelerated eutrophication of the main rivers and lakes in China. Meanwhile, water scarcity and poor water quality interact with each other. Contaminated water threatens water availability even in some water-abundant areas by rendering the water unusable. The similarity between the trajectories of the regions with the poorest water resource availability and those with the heaviest water pollution in recent years (shown as the darkest areas in Figs. 3 and 4) might be a case in point. Social and economic development is also under threat because households, industries and agriculture are forced to cut back their water use when clean water is lacking. In addition, cancer mortality rates related to poor water quality in China have been well above the world average (WB 2007). Each year, an estimated 190 million people fall ill and about 60,000 die from water pollution (Qiu 2011). Thus there is an urgent need to improve public awareness of environmental protection, devise sufficient incentives for investment in water environment protection and sewage treatment facilities, and formulate effective regulations to control wastewater discharge and diverse pollution sources.

R water poverty, characterized by a lack of water resource endowment, has spread from northern China to the South Coast and the East Coast, indicating that limited water resource availability cannot satisfy rising demand, even in regions that were once rich in water resources. New approaches and strategies will be required to use and manage water resources more effectively. These strategies should include ways to slow the growth in water consumption, popularize water-saving technologies, and promote the trade of virtual water to relieve the pressure on water resources by importing water-intensive products and exporting water-extensive products and services.

Southwestern China has the best endowment of water resources of all the study areas, but the region suffers from water poverty characterized as C, owing to a lack of socio-economic capacity, weak water use efficiency and heavy water pollution. Policies should stimulate public and private investment in developing water-saving, socially inclusive innovations and technologies to drive sustainable development in the region.

It would be beneficial if further management interventions were prioritized as indicated by the study results. For instance, in the regions characterized as P–R, the main priorities should be constructing more sewage networks, introducing more effective regulations and/or incentives aimed at reducing pollution, and further improving the efficiency of water use. Regions with P–C water poverty should focus on aquatic environmental pollution control, adoption of water-saving technologies, and economic and technological development aimed at improving water access. Regions characterized as R–C need to strengthen socio-economic capacity and improve access to water and the efficiency of water use. In regions characterized as P–R–C, the provision of safe water and hygienic sewage disposal facilities, together with people’s ability and capacity to sustain access to clean water, are all needed to diminish water poverty.

To analyze the underlying complexities of water issues in China, the calculation method of WPI is revised through the centralized logarithm transformation of the initial variables and holistic and dynamic PCA. The comparison between the traditional PCA and the improved PCA shows that the proposed method achieves drastic dimension reduction. Although we use the improved holistic and dynamic PCA to assess the water poverty status in China, this study is based on the WPI constructed for the Chinese case and on the reliable data available for 2004–2012, which may impose certain limitations on the analysis of our results. In particular, some important indicators of water poverty were not captured in the study because of a lack of reliable data and multi-collinearity among the variables. For instance, the variable of biodiversity is replaced by public green areas per capita, the Gini coefficient of income distribution is replaced by the household income ratio between urban and rural areas, and three variables (precipitation variance, mortality rate of children under five, and rate of soil erosion by area) are left out of the analysis because of incomplete and insufficient data. Seven variables (water resource per capita, proportion of environmental protection investment, GDP per capita, water use per 10,000 yuan GDP, water consumption per 10,000 yuan of industrial value added, percentage of investment in anti-pollution projects, and household income ratio between urban and rural areas) are discarded owing to multi-collinearity among the variables. Notwithstanding these limitations, this study has produced some important findings, which might help improve the calculation method of the Water Poverty Index, raise public and government awareness of aquatic environmental protection and sustainable water resource development in China, and devise better policies to alleviate regional water poverty.

It would be advantageous if future studies could identify the impacts of the omitted variables and examine the interrelationships among water poverty, urbanization, industrialization and human welfare. Such analysis would further help policy makers and planners to capture the possible impacts of alternative planning and development interventions in China.