Introduction

Human activities associated with land uses, including agricultural activities, land development, industrial discharge, residential sewage, and urban runoff can cause surface water degradation, so land use planning and management is very important for water environment protection. A good understanding of the relationships between land use and water quality is necessary for effective and efficient land use planning and management. The relationships between land use and water quality have been studied around the world (Woli and others 2004; Williams and others 2005; Schoonover and others 2005; Conway 2007; Tu and others 2007; Tu and Xia 2008; Li and others 2009; Liu and others 2009; Kang and others 2010; Tran and others 2010). They found that percentages of land use types related to human activities and economic development, such as agricultural land and urban land including residential, commercial, transportation, and industrial lands, usually have positive relationships with concentrations of water pollutants. On the other hand, percentages of undeveloped land (e.g., natural forest) have negative relationships with water pollutants. In other words, higher percentages of urban or agricultural lands are usually associated with worse water quality, while higher percentages of forest are related to better water quality.

The relationships are usually analyzed by using traditional statistical methods, such as ordinary least squares (OLS) regression and Spearman’s rank correlation analysis, with the concentrations of water quality variables from sampling sites as dependent variables, and the percentages of different land use types for the drainage areas of the sampling sites as independent variables. These methods are global statistics that analyze the average situation for the whole study area, and so they assume that relationships are stationary or constant over the whole study area, even though pollution sources might change across the study area, especially an area including different types of watersheds, such as forested watersheds, agriculture-dominated watersheds, and urban watersheds.

However, in reality, the relationships between land use and water quality are not always consistent in the studies performed in different areas, because natural and anthropogenic characteristics of watersheds, including physical environment, economic activities, pollution sources, and policies are not constant over space. For example, a study in the watersheds in the State of Wisconsin, USA, found that agricultural land has significant positive relationships with many water quality indicators including conductivity, TP (Total phosphorus), SO4, Cl, Na, Ca, and Mg (Liu and others 2009). In contrast, a study covering the watersheds of eastern Massachusetts, USA found that agricultural land has significant negative relationships with conductivity, Cl, Na, Ca, and Mg, but non-significant relationships with TP and SO4 using OLS (Tu and Xia 2008). However, another study in a watershed also in eastern Massachusetts found that agricultural land has significant positive relationships with SO4, Cl, Na, Ca, and Mg, and a non-significant relationship with TP (Williams and others 2005). Thus, the comparison of the results among these studies shows that the relationship between a land use indicator and a water quality indicator might change in different regions, and it might even vary across different watersheds in the same study area. In other words, a spatial non-stationarity, which means the spatial variation in the relationships between independent and dependent variables across watersheds, usually exists in the relationships between land use and water quality. Global statistics (e.g., OLS) are unable to examine this spatial variation, so the relationships found using traditional statistical methods in the previous studies reflect the impact of land use on water quality on the scale of the whole study area, but might hide the local variations of the impact.

In recent years, a local spatial statistical technique called geographically weighted regression (GWR) has been developed to explore the spatial variations in relationships between independent and dependent variables (Fotheringham and others 2002). GWR attempts to capture spatial variations by allowing regression model parameters to change over space. This technique has been applied in various fields, including ecology (Shi and others 2006; Harms and others 2009), sociology (Farrow and others 2005; Malczewski and Poetz 2005), urban studies (Helbich and Leitner 2009; Luo and Wei 2009), and natural resource management (Windle and others 2010; Jaimes and others 2010).

GWR has also been applied to study the relationships between land use and water quality in the watersheds of eastern Massachusetts (Tu and Xia 2008; Tu 2010; Tu 2011a). These earlier studies found great spatial variations in the relationships between land use and water quality and showed that GWR is a more powerful tool than OLS for detecting and modeling the spatial variation. However, to my knowledge, GWR has never been applied to study the impact of land use on water quality in any other area. It is unclear whether the findings were obtained by chance or are valid only for eastern Massachusetts. More studies on this topic in other regions are necessary to confirm the findings from Massachusetts.

Thus, in the current study, GWR is applied to examine the spatial variations in the relationships between land use and water quality in the watersheds of northern Georgia by generally following the method of the earlier study in eastern Massachusetts. The purpose is to replicate the earlier study to test whether: (1) GWR has advantages over OLS in investigating the relationships between land use and water quality; (2) the relationships between land use and water quality vary over space in response to the urbanization levels of watersheds; and (3) the spatial patterns in the varying relationships associated with urbanization are consistent for these two regions.

Study Area

The study area covers the U.S. Environmental Protection Agency (USEPA) Level III Ecoregions of Piedmont, Blue Ridge, Ridge and Valley, and Southwestern Appalachians in the state of Georgia. It is located in northern Georgia and includes metropolitan Atlanta and its surrounding areas (Fig. 1). This area contains mountains, ridges, and valleys of the Appalachians and the transitional area between the Appalachians and the relatively flat Coastal Plain. Its natural environment is significantly different from the Ecoregions of Southeastern Plains and Southern Coastal Plain in the rest of Georgia. An ecoregion is defined as an area that generally has similar patterns of natural and anthropogenic factors, such as vegetation, geology, soils, physiography, water resources, climate, and land use (Omernik 1987). Thus, the selection of the study area based on ecoregions can minimize the impact of natural variability and allow the study to focus on the spatial variation in water quality associated with land use patterns.

Fig. 1 Study area in northern Georgia, USA

The study area is about 59,400 km2 with a population of about 5.8 million. It is more densely urbanized than southern Georgia. The metropolitan Atlanta area is primarily urban and suburban. However, the area beyond the metropolitan area is mainly rural with dense forest and some agricultural lands (Fig. 1). Similar to the earlier study in eastern Massachusetts that includes the Boston Metropolitan area (Tu and Xia 2008), this study area also contains watersheds with different levels of urbanization. The level of urbanization generally decreases outward from the city of Atlanta. The watersheds within the metropolitan area are usually highly-urbanized with high percentages of urban land, while those in the rest of the study area are less-urbanized with high percentages of forest. Thus, a clear urbanization gradient exists from Atlanta to other places in the study area in all directions. Furthermore, the study area has been experiencing land use changes caused by urban sprawl over decades. Forest and agricultural lands have been rapidly converted into urban land, especially residential land (Tu 2011b). With the great pressure of urban sprawl and the strong spatial variability in the types of watersheds, the study area is an ideal place for examining the relationships between land use and water quality and how the relationships vary over space with the urbanization level of watersheds.

Water resources are a pressing issue for the state of Georgia and even for the entire southeastern region of the USA due to the increasing demand for freshwater caused by urban sprawl, population growth, and economic development. This issue is more urgent in northern Georgia since it contains metropolitan Atlanta, which is one of the fastest growing and most sprawling areas in the U.S. Total water use in this area increased from 3,970 to 4,230 million gallons per day during 1980–2000 (Martin and others 2005). In addition, northern Georgia contains many headwaters flowing into neighboring states, such as Alabama, Florida, and South Carolina. The water availability and quality in northern Georgia have a strong influence on these states. Georgia has been at the center of some major interstate conflicts over water availability and rights (Martin and others 2005). Therefore, a better understanding of the relationships between land use and water quality and how the relationships are affected by urbanization in northern Georgia is very important for local and regional water resource management and conservation.

Data Sources and Methods

Water Quality Indicators

Water quality data from 2000 to 2009 were retrieved on-line from the USGS (United States Geological Survey) National Water Information System Web (NWISWeb; URL http://waterdata.usgs.gov/nwis/). NWISWeb contains water quality data collected by various projects ranging from national programs to projects in small watersheds. It is an important and widely used public water quality data source for research and administration in the US. The methods of field sampling and laboratory analysis and their quality assurance and quality control (QA/QC) are regulated by USGS (Friedman and Erdmann 1982; Fishman and Friedman 1989; USGS variously dated). The sampling sites, frequencies, methods, and water quality indicators were not designed for this study, so the study had to rely on the existing data. Based on data availability, 42 USGS water quality sampling sites with eight water quality indicators were selected (Fig. 1). The water quality indicators are specific conductance (SC), dissolved oxygen (DO), dissolved organic carbon (OC), and five dissolved nutrient parameters, including total nitrogen (TN), organic nitrogen (ON), ammonia plus organic nitrogen (also known as Kjeldahl nitrogen, KN), nitrate plus nitrite nitrogen (NO3–N + NO2–N), and phosphorus (P). Unlike the previous study in eastern Massachusetts (Tu and Xia 2008), this study did not use dissolved ions and solids due to poor data availability for these parameters. However, SC can be used to represent them since it is a measure of the ability of water to conduct electrical current and reflects the concentrations of dissolved ions or solids in water, and it was considered one of the best general water quality indicators affected by land use change in many studies (Wang and Yin 1997; Dow and Zampella 2000).

Dissolved ions, solids, nutrients, and organic carbon are pollutants that affect aquatic ecosystems and can be contributed by human activities in urban land and agricultural land, including discharges of residential, municipal, and industrial sewage, mining, urban and road runoff, fertilizer applications on urban lawns and agricultural land, and livestock raising. High concentrations of dissolved ions and solids are toxic to freshwater aquatic life. High concentrations of dissolved nutrients can cause algal blooms that may result in death of fish and reduction of diversity and growth of aquatic life (Enger and Smith 2010). Dissolved organic carbon can affect the pH, color, and transparency of water, and also affects the toxicity and bioaccumulation of metals in water (Porcal and others 2009). Higher values of SC and higher concentrations of OC and nutrients suggest worse water quality. In contrast, a higher concentration of DO indicates better water quality since DO is essential for the survival of aquatic life. Thus, all the above water quality parameters are good indicators to assess water quality associated with land use.

The average concentration of each water quality indicator over the period of 2000–2009 at each sampling site was calculated. Not all the sampling sites had the data available for all the eight water quality indicators, so the number of sampling sites for different water quality indicators is different. The number of sampling sites ranges from 32 to 42 for the eight water quality indicators (Table 1).
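The site-level averaging described above can be sketched as follows. This is a minimal illustration rather than the processing actually used in the study; the record layout (site id, year, indicator, value) and all sample values are hypothetical.

```python
from collections import defaultdict

def site_means(records, start=2000, end=2009):
    """Average each indicator at each site over [start, end] (inclusive).

    records: iterable of (site_id, year, indicator, value) tuples.
    This layout is a hypothetical stand-in for a table exported from NWISWeb.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for site, year, indicator, value in records:
        if start <= year <= end:
            sums[(site, indicator)] += value
            counts[(site, indicator)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Made-up values for illustration only.
records = [
    ("site_A", 2001, "SC", 100.0),
    ("site_A", 2005, "SC", 140.0),
    ("site_A", 1999, "SC", 500.0),   # outside the period, ignored
    ("site_B", 2003, "DO", 8.0),
]
means = site_means(records)
```

A site with no samples for an indicator simply has no entry in the result, which mirrors why the number of sites differs among indicators.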

Table 1 Statistical summary of land use and water quality indicators of studied watersheds

Land Use Indicators

Land use data for 2005 were obtained from the website of the Georgia GIS Data Clearinghouse (URL http://gis1.state.ga.us/). The land use data were originally interpreted from Landsat imagery by the Natural Resources Spatial Analysis Laboratory at the University of Georgia. Four land use types (urban land, forest, agricultural land, and wetland) were aggregated from more detailed land use categories in the original land use data set by combining several similar land use types into one broad category. For example, the amount of urban land is the sum of low intensity urban and high intensity urban in the original data set. Urban land includes residential, commercial, industrial, and transportation lands.

Water quality at a sampling site is affected by the physical characteristics and human activities including land use in its upstream drainage area (watershed) rather than within the limits of an administrative region. Water quality at a sampling site could be used to represent the water quality of its watershed. Land use in a watershed is considered as the land use at the outlet of the watershed. Therefore, water quality indicators for sampling sites and land use indicators for the watersheds of the sampling sites were linked to analyze how water quality was associated with land use. This method is commonly used in studies of the relationship between land use and water quality (Woli and others 2004; Conway 2007; Kang and others 2010). The watershed for each of the 42 water quality sampling sites was delineated from digital elevation data provided by the USGS National Elevation Dataset (NED) 1 Arc Second (about 30 meter resolution, URL http://seamless.usgs.gov/website/Seamless/) using ArcGIS spatial analysis tools, and then used for the calculation of land use indicators. Four land use indicators, which are percentages of urban land, forest, agricultural land, and wetland, were calculated for each watershed in ArcGIS by overlaying the land use layers on the delineated watershed layer. No two sampling sites were located along the same stream, so all the delineated watersheds were mutually exclusive, which avoids the potential influence of upstream sites on downstream sites. The delineated watersheds were distributed throughout the study area with different levels of urbanization (Fig. 1).
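The overlay step can be illustrated with a small raster sketch. This is not the ArcGIS workflow used in the study; it only shows the arithmetic of a land use percentage under assumed inputs. The class codes, the `classes` mapping, and the function name are hypothetical.

```python
import numpy as np

def land_use_percentages(class_raster, watershed_mask, classes):
    """Percentage of watershed cells falling in each aggregated land use type.

    class_raster: 2-D integer array of land use codes (hypothetical codes).
    watershed_mask: 2-D boolean array, True for cells inside the watershed.
    classes: dict mapping a label to the set of raster codes it aggregates.
    """
    cells = class_raster[watershed_mask]
    total = cells.size
    return {
        label: 100.0 * np.isin(cells, list(codes)).sum() / total
        for label, codes in classes.items()
    }

# Hypothetical codes: 1-2 urban, 3 forest, 4 agriculture, 5 wetland, 9 other.
classes = {"urban": {1, 2}, "forest": {3}, "agricultural": {4}, "wetland": {5}}
raster = np.array([[1, 2, 3, 3],
                   [3, 3, 4, 5],
                   [3, 3, 3, 9]])
mask = np.ones_like(raster, dtype=bool)
mask[2, 3] = False  # this cell lies outside the delineated watershed
pct = land_use_percentages(raster, mask, classes)
```

Aggregating two detailed urban codes into one "urban" label follows the same idea as summing the detailed urban categories of the original data set.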

Methods

In order to compare the model performance and results between OLS and GWR, both models were fitted using water quality indicators as dependent variables and land use indicators as independent variables. Significant correlations exist among the land use indicators, including a significant negative correlation between percentage of urban land and percentage of forest (r = −0.89, P < 0.01), a significant negative correlation between percentage of urban land and percentage of agricultural land (r = −0.53, P < 0.01), and a significant positive correlation between percentage of agricultural land and percentage of wetland (r = 0.42, P < 0.01). Multivariate regression analysis involving multiple land use indicators is therefore not appropriate for this study due to the potential multicollinearity among independent variables. Thus, each of the OLS and GWR models used only one land use indicator as the independent variable to analyze its association with each water quality indicator.
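A screening of this kind can be sketched by computing the Pearson correlation matrix of the candidate predictors. The percentages below are made up for illustration and are not the study's data; the function name is an assumption.

```python
import numpy as np

def correlation_matrix(columns):
    """Pearson correlation matrix for a dict of equal-length sequences."""
    names = list(columns)
    data = np.array([columns[n] for n in names], dtype=float)
    return names, np.corrcoef(data)   # rows are treated as variables

# Made-up land use percentages for five watersheds, for illustration only.
land_use = {
    "urban":        [5.0, 20.0, 40.0, 60.0, 85.0],
    "forest":       [80.0, 60.0, 45.0, 30.0, 10.0],
    "agricultural": [12.0, 15.0, 10.0, 6.0, 2.0],
}
names, r = correlation_matrix(land_use)
urban_forest_r = r[names.index("urban"), names.index("forest")]
```

A strongly negative off-diagonal entry, such as the urban and forest pair here, is the kind of signal that argues against placing both indicators in one regression.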

As is well known, an OLS model can be stated as:

$$ y = \beta_{0} + \sum\limits_{i = 1}^{p} {\beta_{i} x_{i} } + \varepsilon $$
(1)

where y is the dependent variable, β0 is the intercept, βi is the parameter estimate (regression coefficient) for independent variable xi, p is the number of independent variables, ε is the error term.

OLS and other traditional regression methods are global statistics, which assume that the relationship under study is constant over space, so each parameter is estimated to be the same across the study area, and the model results are applied to the whole study area.
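For the single-predictor case used in this study, Eq. 1 reduces to one intercept and one slope. The following is a minimal sketch of such a global fit, assuming NumPy rather than the SPSS software used in the study; the data values and the function name are hypothetical.

```python
import numpy as np

def ols_fit(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares; return (b0, b1, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # intercept column + one predictor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta[0], beta[1], 1.0 - ss_res / ss_tot

# Made-up data: percentage of urban land vs. a pollutant concentration.
urban_pct = [5.0, 20.0, 40.0, 60.0, 85.0]
conc = [0.5, 0.9, 1.6, 2.0, 3.1]
b0, b1, r2 = ols_fit(urban_pct, conc)
```

The single pair (b0, b1) is applied to every watershed, which is exactly the stationarity assumption that GWR relaxes.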

In contrast, GWR extends OLS to local statistics by allowing local rather than global parameters to be estimated. It assumes the model results including model parameter and R2 (coefficient of determination) vary over space. For each regression point, GWR produces a set of local regression results including local parameter estimates, the values of t-test on the local parameter estimates, the local R2 values, and the local residuals. Thus, the spatial variations in the relationships between independent and dependent variables and the model ability can be explored. Further analyses on the spatial variations can help provide some understanding of hidden possible causes of the variations (Fotheringham and others 2002). In addition, GWR also produces a global R2 to show the overall performance of the GWR model.

The GWR model can be written as:

$$ y_{j} = \beta_{0} (u_{j} ,v_{j} ) + \sum\limits_{i = 1}^{p} {\beta_{i} (u_{j} ,v_{j} )x_{ij} } + \varepsilon_{j} $$
(2)

where uj and vj are the coordinates for the location of observation j, β0 (uj, vj) is the intercept for observation j, βi (uj, vj) is the local parameter estimate (regression coefficient) for independent variable xi at the location j.

In this study, the dependent variable is the water quality indicators at sampling sites (points), while the independent variables are the land use indicators for watersheds of the sampling sites (polygons). This point to polygon spatial scale transformation may raise some questions about its validity, especially considering that the locations of the sampling sites are used in GWR. However, the land use in a watershed is considered as the land use at the outlet (sampling site) of the watershed for regression purpose since the water quality at the sampling site is affected by the land use in the watershed. Thus, this transformation is a feasible and acceptable way to analyze the relationship between land use and water quality.

The regression on an observation is performed by weighting all other observations around it using a distance decay function, assuming the other observations closer to the location of the observation have higher impact on the local parameter estimate for the observation.

The distance decay function of weighting can be stated as:

$$ w_{ij} = \exp ( - d_{ij}^{2} /b^{2} ) $$
(3)

where wij is the weight of observation j for observation i, dij is the distance between observations i and j, and b is the kernel bandwidth. When the distance is greater than the kernel bandwidth, the weight rapidly approaches zero. The value of b can be set using either a fixed or an adaptive kernel bandwidth in GWR. A fixed kernel has a constant bandwidth over space, while an adaptive kernel bandwidth varies spatially according to the density of observations, so that bandwidths are larger where observations are sparse and smaller where observations are dense. In this study, the locations of observations are the water quality sampling sites. Their density varies over the study area (Fig. 1), so the adaptive kernel bandwidth was used. The optimal bandwidth was determined by minimizing the corrected Akaike Information Criterion (AICc) as described in Fotheringham and others (2002). More detailed descriptions of the GWR technique can be found in the literature (Brunsdon and others 1998; Fotheringham and others 2002; Shi and others 2006; Jaimes and others 2010).
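Equations 2 and 3 can be combined into a short sketch of the local fitting step. This is an illustration under stated assumptions, not the GWR 3 implementation used in the study: the adaptive bandwidth at each site is taken as the distance to its k-th nearest neighbour, k is fixed by hand instead of being chosen by AICc, the regression point itself is included with weight 1, and the data are synthetic. The function name `gwr_fit` is hypothetical.

```python
import numpy as np

def gwr_fit(coords, x, y, k):
    """Local weighted least squares at every observation (single predictor).

    coords: (n, 2) array of site coordinates (u, v).
    k: number of nearest neighbours defining the adaptive bandwidth
       (an assumed fixed value here; the study selects it by minimizing AICc).
    Returns an (n, 2) array of local [intercept, slope] estimates.
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    # Pairwise Euclidean distances between sites.
    d = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2))
    betas = np.empty((n, 2))
    for i in range(n):
        b = np.sort(d[i])[k]               # distance to the k-th nearest neighbour
        w = np.exp(-d[i] ** 2 / b ** 2)    # Gaussian distance decay, Eq. 3
        XtW = X.T * w                      # equivalent to X.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic example: the true slope changes sign between west and east.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(40, 2))
x = rng.uniform(0, 90, size=40)
true_slope = np.where(coords[:, 0] < 50, 0.05, -0.05)
y = 1.0 + true_slope * x
betas = gwr_fit(coords, x, y, k=10)
```

Each row of `betas` is a separate weighted regression centred on one site, so mapping the slope column gives the kind of spatially varying parameter surface that the global model cannot produce.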

After the results of OLS and GWR were obtained, the model performance of OLS and GWR was compared using the model R2 and an F-test. A higher R2 means that the independent variable can explain more variance in the dependent variable. The F-test can determine whether the GWR models have a statistically significant improvement over the OLS models (Fotheringham and others 2002).

Afterwards, the local regression results from GWR models, including the local parameter estimates and the values of t-test on the local parameter estimates were further interpreted using spatial and statistical analyses to examine the spatial variations in the relationships between water quality and land use across the urbanization gradient in the study area. The average percentage of urban land among watersheds with different types of relationships (negatively significant, non-significant, and positively significant) was also compared to study if the spatially varying relationships were affected by the urbanization level of watersheds. OLS models were performed using SPSS 13, GWR analyses with GWR 3 software package, and all GIS analyses were made using ArcGIS 9.3.
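The grouping step can be sketched as follows. The cut-off |t| > 1.96 (roughly the 5% level) is an assumption made for this illustration, since the threshold is not stated here, and the t-values and urban percentages are made up.

```python
from statistics import mean

def urban_by_significance(t_values, urban_pct, t_crit=1.96):
    """Mean percentage of urban land per significance class of local t-values.

    t_crit is an assumed threshold, not a value taken from the study.
    """
    groups = {"negative": [], "non-significant": [], "positive": []}
    for t, u in zip(t_values, urban_pct):
        if t > t_crit:
            groups["positive"].append(u)
        elif t < -t_crit:
            groups["negative"].append(u)
        else:
            groups["non-significant"].append(u)
    # Classes with no watersheds are reported as None rather than averaged.
    return {label: (mean(vals) if vals else None) for label, vals in groups.items()}

# Made-up local t-values and urban percentages for six watersheds.
t_values = [3.1, 2.4, 0.5, -0.2, 1.0, -2.8]
urban_pct = [4.0, 10.0, 55.0, 70.0, 61.0, 80.0]
summary = urban_by_significance(t_values, urban_pct)
```

Comparing the class means is only the descriptive part of the comparison; testing whether they differ significantly would need a separate two-sample test.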

Results and Discussion

Spatial Variations in Land Use and Water Quality Indicators

Table 1 shows the statistical summary of land use and water quality variables of the 42 studied watersheds. Great spatial variations are found in land use indicators among the studied watersheds. The dominant land use types for most of the watersheds are forest and urban land. Percentage of forest ranges from 10.35 to 99.53%, with an average of 45.91%. Percentage of urban land varies from 0.33 to 86.95%, with an average of 33.38%. The watersheds also contain different levels of agricultural land, which ranges from 0.05 to 30.54%, with an average of 11.61%. Percentage of wetland is quite low, with an average of 2.29%. The ranges of land use indicators show the great variability in the urbanization level of watersheds. Both less- and highly-urbanized watersheds are included in this study.

The water quality variables also show great spatial variations among the studied watersheds. For example, among the sampling sites, SC ranges from 14.3 to 687.5 μS/cm, TN concentration ranges from 0.367 to 9.945 mg/l, and OC ranges from 1.06 to 8.0 mg/l.

Relationships Between Land Use and Water Quality Obtained by OLS

Table 2 shows the Pearson correlations between water quality and land use indicators obtained from OLS models. Percentage of urban land has significant positive correlations with all the dissolved nutrients and OC (r = 0.343–0.699; P from <0.001 to 0.044), and a marginally significant positive correlation with SC (r = 0.303; P = 0.052), but a non-significant negative relationship with DO (r = −0.280; P = 0.073). This result indicates that a higher percentage of urban land, usually involving more human activities, is associated with higher concentrations of water pollutants and a lower concentration of dissolved oxygen. Various human and economic activities associated with urban land, including discharges of residential, municipal, and industrial sewage, fertilizer and pesticide use on lawns, and applications of road deicers, contribute to the concentrations of water pollutants in natural waters, so significant positive correlations are usually found between percentage of urban land and concentrations of water pollutants. This result is consistent with the findings from many previous studies around the world (Woli and others 2004; Schoonover and others 2005; Conway 2007; Tu and others 2007; Tu and Xia 2008; Liu and others 2009).

Table 2 Pearson correlations between water quality and land use indicators

Conversely, percentage of forest shows relationships with water quality opposite to those of percentage of urban land. It has significant negative relationships with all the dissolved nutrients and OC (r ranges from −0.416 to −0.614; P from <0.001 to 0.018), and a marginally significant negative correlation with SC (r = −0.299; P = 0.054), but a significant positive relationship with DO (r = 0.451; P < 0.01). This result indicates that a higher percentage of forest is related to better water quality and that it is a good predictor of water quality. This result is also similar to the findings in various previous studies (Woli and others 2004; Schoonover and others 2005; Conway 2007; Tu and others 2007; Tu and Xia 2008; Liu and others 2009).

Percentage of agricultural land is found to have a significant relationship with only KN (r = −0.387; P = 0.018). This result indicates that, under OLS, agricultural land is only weakly associated with water quality in the study area, which is different from the findings in many previous studies. It is well known that agricultural activities including fertilizer application and livestock farming can degrade water quality, so agricultural land is usually considered an important non-point pollution source, especially in agriculture-dominated watersheds. Significant positive relationships are often found between percentage of agricultural land and concentrations of water pollutants, especially nutrients (Woli and others 2004; Mehaffey and others 2005; Stutter and others 2007; Liu and others 2009). The weak relationships in the study area might be due to the great variability in land use and urbanization levels of the studied watersheds. As found in the earlier study in eastern Massachusetts, significant positive relationships usually exist between percentage of agricultural land and concentrations of water pollutants in less-urbanized watersheds, while significant negative correlations are generally observed in highly-urbanized watersheds. This is because the contribution of agricultural land to water degradation is masked by urban pollution sources, and a lower percentage of agricultural land usually means a higher percentage of urban land in highly-urbanized watersheds (Tu 2011a). If a study area consists of watersheds with great variability in urbanization levels, the different types of relationships between agricultural land and water quality in different watersheds are mixed together by global statistical methods (e.g., OLS), causing non-significant correlations for the whole study area (Tu 2011a). The GWR analysis will explore whether the relationships between percentage of agricultural land and water quality indicators also vary over space in response to the urbanization level of watersheds in this study area.

Percentage of wetland has no significant relationships with any water quality indicators, which might be due to the low amount of wetland in the studied watersheds. Thus, water quality is not strongly associated with wetland in the study area. The analyses of the GWR results will focus on urban land, forest, and agricultural land.

Comparison of Model Performances Between OLS and GWR

Table 3 shows the comparison of global R2 between OLS and GWR models for each pair of water quality indicator and land use indicator. A great improvement in R2 of GWR over OLS is observed for every pair of water quality and land use indicators. The R2 values of the OLS models for percentage of urban land range from 0.078 to 0.489, suggesting that percentage of urban land can explain only 7.8–48.9% of the variances in water quality indicators. In comparison, the global R2 values of the GWR models for percentage of urban land vary from 0.200 to 0.844, indicating that percentage of urban land can explain 20–84.4% of the variances in water quality indicators using GWR models. Especially for SC and DO, no significant correlations were found between them and percentage of urban land using OLS, but their R2 values increase from 0.092 and 0.078 to 0.844 and 0.769 using GWR, respectively.
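As a quick consistency check (a worked illustration, not part of the original analysis): for an OLS model with a single independent variable, R2 equals the square of the Pearson correlation r, so the correlations reported for percentage of urban land reproduce the OLS R2 values quoted above.

$$ R^{2} = r^{2}: \quad 0.303^{2} \approx 0.092\ (\text{SC}), \quad ( - 0.280)^{2} \approx 0.078\ (\text{DO}), \quad 0.699^{2} \approx 0.489 $$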

Table 3 Comparison of coefficient of determination (R2) between OLS and GWR models

Similar improvements in R2 of GWR over OLS are also found for forest, agricultural land, and wetland. The R2 values for the models of percentage of forest and water quality indicators improve from 0.090–0.377 in OLS to 0.260–0.778 in GWR. The global R2 values for percentage of agricultural land in GWR range from 0.065 to 0.711, an improvement over the range of 0.003 to 0.150 in OLS. The R2 values for percentage of wetland also increase from the range of 0.003 to 0.150 in OLS to that of 0.074 to 0.583 in GWR (Table 3). This result indicates that percentage of agricultural land and percentage of wetland can explain up to about 71% and 58%, respectively, of the variance in some water quality indicators using GWR when the model parameters are allowed to vary over space, even though they show weak correlations with water quality using OLS.

The higher values of the global R2 from GWR compared with the R2 from OLS indicate the improvement in model performance of GWR over OLS. The statistical significance of the improvement is also tested by an F-test. Table 4 shows the statistical test results for improvement in model fit of GWR over OLS. Three water quality indicators (TN, NO3–N + NO2–N, and OC) show non-significant improvement in model fit of GWR over OLS, indicating no statistically significant difference in model performance between OLS and GWR for them. However, the other five water quality indicators (SC, DO, ON, KN, and P) show significant improvements in model fit of GWR over OLS. Thus, considering the improvement of R2 in GWR over OLS and the F-test results, the variance in water quality is generally better explained by land use using GWR than OLS. This result is consistent with the comparison of GWR and OLS in the earlier study in eastern Massachusetts (Tu and Xia 2008).

Table 4 P-values for the F-test for the improvement in model fit of GWR over OLS

Spatial Variations in the Relationships Between Water Quality and Land Use Explored by GWR

In addition to the better model fit of GWR over OLS, GWR also has the advantage of exploring the spatial variations in the relationships between dependent and independent variables. GWR results including local parameter estimates and the values of the t-test can be analyzed to discuss how the relationships between dependent and independent variables, and the abilities of independent variables to explain dependent variables, change over space. The ranges of local parameter estimates are summarized in Table 5. A local parameter estimate (regression coefficient) for an independent variable is the change in a dependent variable in response to a unit change in the independent variable at a regression point (a water quality sampling site in this study). It can be used to reflect the relationship between the independent variable and the dependent variable at a sampling site. Thus, the spatial variations in the local parameter estimates in a GWR model for a water quality indicator and a land use indicator can represent the variations in their relationships among different sampling sites. The t-test in GWR is not exactly the same as the one used in OLS for formal hypothesis testing because it has problems of multiple comparisons when making statistical inferences across multiple locations (Wimberly and others 2008). However, the t-test has still been widely used in GWR applications to analyze the significance of parameter estimates as a purely exploratory tool (Malczewski and Poetz 2005; Wimberly and others 2008; Harms and others 2009; Helbich and Leitner 2009; Luo and Wei 2009; Jaimes and others 2010). Thus, the spatial variation in the t-values for parameter estimates is used to analyze how the significance of the relationships between land use and water quality changes over space in this study.

Table 5 Ranges of local parameter estimates from GWR models

Relationships Between Urban Land and Water Quality

As shown in Table 5, percentage of urban land has positive relationships from GWR with most water quality indicators, including TN, KN, NO3–N + NO2–N, P, and OC, which is consistent with the OLS results (Table 2). However, both positive and negative relationships are found between percentage of urban land and each of SC, DO, and ON. The t-values of the local parameter estimates for percentage of urban land show that not all the relationships are significant (Fig. 2).

Fig. 2 Results of t-test on the parameter estimates for percentage of urban land from the GWR models

In contrast to the slightly significant correlation between percentage of urban land and SC found using OLS, their relationships explored by GWR exhibit great spatial non-stationarity. All three types of significance (positive, negative, and non-significant) in the relationships between percentage of urban land and SC are observed in the studied watersheds using GWR. The spatial variation in the significance is associated with the urbanization levels of watersheds. Table 6 compares the average urbanization level among watersheds with different significances in the relationships between land use and water quality indicators. Percentage of urban land is used to represent the urbanization level of a watershed. The average urbanization level for the watersheds with non-significant relationships between urban land and SC is significantly higher than that for the watersheds with significant positive relationships (P = 0.027). The t-value map also shows that significant positive relationships are mainly found in the watersheds in the northern part of the study area, which has a high density of forest (Fig. 2). This result indicates that percentage of urban land has a stronger impact on SC in less-urbanized watersheds than in highly-urbanized watersheds. SC is a measure of the concentrations of dissolved ions or solids in water. These pollutants might be contributed mainly by human activities associated with urbanization, such as urban runoff, road deicer use, and discharges of residential, municipal, and industrial sewage. This result is the same as the finding in eastern Massachusetts, indicating that the same degree of increase in urban land will contribute more dissolved ions and solids in less-urbanized areas than in highly-urbanized areas (Tu 2010).

Table 6 Comparison of average urbanization level among watersheds with different significances in the relationships between water quality and land use indicators
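The kind of group comparison summarized in Table 6 can be illustrated with a two-sample test on the percentage of urban land of watersheds grouped by the significance of their local relationship. The text does not state which test was used, so the sketch below assumes Welch's unequal-variance t-test; the function name is hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom.

    group_a, group_b: urbanization levels (percentage of urban land) of
    the watersheds in two significance classes, e.g. sites with
    significant versus non-significant local relationships.
    """
    va = variance(group_a) / len(group_a)  # squared standard error of mean A
    vb = variance(group_b) / len(group_b)  # squared standard error of mean B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df
```

The P-value follows from the t distribution with `df` degrees of freedom. A rank-based test would be a reasonable alternative if urbanization levels are strongly skewed.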

In contrast to the non-significant negative correlation between percentage of urban land and concentration of DO found by OLS, the relationship obtained using GWR is significant at some sites in both highly-urbanized and less-urbanized watersheds (Fig. 2 and Table 6). This result suggests that the urbanization level of watersheds does not affect the association between DO and urban land, and that a higher percentage of urban land can be related to a lower concentration of DO in watersheds with different levels of urbanization.

Consistent with the positive correlations found by OLS, all the significant relationships from GWR are positive for all the dissolved nutrients, indicating that urban land is an important source of nutrients in the studied watersheds, although the relationships are not significant at every sampling site (Fig. 2 and Table 6). A clear spatial pattern can be observed for ON, KN, and P. As shown in Table 6, the average urbanization level for the watersheds with significant relationships is significantly higher than that for the watersheds with non-significant relationships for these three dissolved nutrients (P values from <0.001 to 0.031). The average urbanization level for ON, KN, and P is 62.3%, 51.2%, and 43.7%, respectively, in the watersheds with significant relationships, compared to 27.3%, 28.0%, and 12.8%, respectively, in the watersheds without significant relationships.

The maps of t-values also show that the watersheds with significant relationships between urban land and the three nutrients are concentrated in the Atlanta metropolitan area (Fig. 2). The maps illustrate that the relationships for TN, NO3–N + NO2–N, and OC are stronger in less-urbanized watersheds than in highly-urbanized watersheds. However, the difference in the urbanization level among watersheds with different types of relationships for these three indicators is not statistically significant (Table 6). Therefore, the results for the relationships between percentage of urban land and dissolved nutrients agree with those in eastern Massachusetts (Tu 2010): the relationships are stronger in highly-urbanized watersheds than in less-urbanized watersheds, as opposed to the finding that the relationships between urban land and dissolved ions and solids are stronger in less-urbanized areas. This difference in the spatial patterns of the relationships can be explained by the difference in pollution sources between nutrients and dissolved ions and solids. Whereas dissolved ions and solids are contributed mainly by human activities associated with urbanization, dissolved nutrients are also largely contributed by agricultural activities, such as fertilizer application and animal waste, in addition to urban sources. Thus, the sources of dissolved nutrients might differ largely between highly- and less-urbanized watersheds.

In highly-urbanized watersheds with few or no agricultural activities, dissolved nutrients are primarily contributed by urban sources, and thus a significant positive relationship exists between dissolved nutrients and percentage of urban land. In contrast, dissolved nutrients in less-urbanized watersheds are contributed by both agricultural and urban sources, so the contribution of urban sources might be masked by that of agricultural activities, and a relatively weaker relationship is found between dissolved nutrients and percentage of urban land. Unlike dissolved nutrients, dissolved ions and solids (represented by SC in this study) are mainly contributed anthropogenically by urban sources in both highly- and less-urbanized areas. Their correlations with percentage of urban land are not affected by agricultural sources and even get stronger in less-urbanized areas.

Relationships Between Forest and Water Quality

As in the OLS results, the GWR results show that percentage of forest has negative relationships with the concentrations of most water pollutants, including SC, TN, KN, NO3–N + NO2–N, and OC, indicating that forest is associated with good water quality and that percentage of forest is a good predictor of water quality (Table 5). However, GWR also reveals spatial variations. Unlike the OLS results, the relationships from the GWR models are not significant at every sampling site for all the water quality indicators. Clear spatial patterns in the spatially varying relationships can be identified from the maps of t-values and by comparing the urbanization levels of watersheds with different types of significance; these patterns are almost opposite to the spatial patterns in the relationships between percentage of urban land and water quality (Fig. 3 and Table 6).

Fig. 3 Results of t-test on the parameter estimates for percentage of forest from the GWR models

In contrast to the slightly significant correlation between percentage of forest and SC found by OLS, their relationships from the GWR models show clear spatial non-stationarity, and the spatial pattern is associated with the urbanization level of watersheds. As shown in the maps of t-values, most significant negative relationships between percentage of forest and SC are located in the less-urbanized area (Fig. 3). The average urbanization level of the watersheds with significant relationships for SC is significantly lower than that of the watersheds without significant relationships (P = 0.001; Table 6).

Both positive and negative relationships between percentage of forest and DO are found by GWR, but none of the negative relationships is significant. The average urbanization level of the watersheds with significant relationships is significantly lower than that of the watersheds without significant relationships (P = 0.035; Table 6).

For all the dissolved nutrients and OC, all the significant relationships with percentage of forest from the GWR models are negative. As shown in the t-value maps, most significant relationships for ON and P are located in the highly-urbanized watersheds in the Atlanta metropolitan area (Fig. 3). In addition, the average urbanization level of the watersheds with significant negative relationships is higher than that of the watersheds with non-significant relationships for all the dissolved nutrients and OC, whether or not the difference is statistically significant (Table 6).

The spatially varying relationships between forest and water quality indicators in this study are consistent with the results of the earlier study in eastern Massachusetts (Tu 2011a). The relationships between percentage of forest and SC become stronger as the urbanization levels of watersheds decrease. In contrast, the relationships between percentage of forest and concentrations of dissolved nutrients get stronger as the urbanization levels of watersheds increase. Thus, the ability of forest to predict water quality varies over space with the urbanization level of watersheds and depends on the water quality indicator being predicted.

Relationships Between Agricultural Land and Water Quality

Unlike the OLS results, in which percentage of agricultural land has a significant correlation with only one water quality indicator (KN), the relationships between agricultural land and water quality obtained using GWR are significant at many sampling sites for six of the eight water quality indicators (Fig. 4). Only TN and NO3–N + NO2–N have no significant relationships at any sampling site. Clear spatial patterns in the varying relationships can also be identified from the t-value maps and by comparing the urbanization levels of watersheds with different types of significance; these patterns are similar to the spatial patterns in the relationships between percentage of forest and water quality (Fig. 4 and Table 6).

Fig. 4 Results of t-test on the parameter estimates for percentage of agricultural land from the GWR models (the t-values for TN and NO3–N + NO2–N are not significant at any sampling site, so they are not included in the figure)

As shown in Fig. 4, there are three sampling sites with significant positive relationships between percentage of agricultural land and concentration of SC. They are all located in less-urbanized watersheds with an average urbanization level of 7.8% (Table 6).

Both significant positive and negative relationships between percentage of agricultural land and DO are found by GWR. All the significant positive relationships are found in highly-urbanized watersheds in the Atlanta metropolitan area with an average urbanization level of 72.6%, while all the significant negative relationships are found in less-urbanized watersheds with an average urbanization level of 7.5%, and the other watersheds without significant relationships have an average urbanization level of 31.8% (Fig. 4 and Table 6). This result indicates that agricultural land is an important pollution source in less-urbanized areas, but is usually associated with good water quality and is a good predictor of water quality in highly-urbanized areas.

The significant negative relationships between percentage of agricultural land and dissolved nutrients are found in highly-urbanized watersheds, mainly located in the Atlanta metropolitan area (Fig. 4). As shown in Table 6, the average urbanization level of the watersheds with significant relationships is significantly higher than that of the watersheds without significant relationships for ON (P = 0.002), KN (P = 0.002), and P (P < 0.001).

Although both positive and negative relationships are found between percentage of agricultural land and dissolved organic carbon using GWR, none of the negative relationships is significant. Significant positive relationships are observed at four sampling sites located in less-urbanized areas (Fig. 4). This result indicates that a higher percentage of agricultural land is related to a higher concentration of OC in less-urbanized watersheds. It can be explained by the anthropogenic sources of dissolved organic carbon. Animal feedlots and composting facilities associated with agricultural land can contribute OC to natural water (Hopple and others 2006). This result, along with the significant negative relationships between percentage of agricultural land and DO, suggests again that agricultural land is an important pollution source in less-urbanized watersheds.

The scatter plots of the urbanization level of watersheds and the parameter estimates of percentage of agricultural land from GWR models illustrate more clearly how the relationships between percentage of agricultural land and concentrations of water quality indicators vary in response to the urbanization level of watersheds (Fig. 5). The parameter estimate for DO has a significant positive correlation with the urbanization level, indicating that the mainly positive relationship between DO and percentage of agricultural land becomes stronger as the urbanization level increases. In contrast, the parameter estimates for SC and dissolved nutrients have significant negative correlations with the urbanization level of watersheds, suggesting that the mainly negative relationships between percentage of agricultural land and SC and dissolved nutrients also get stronger as the urbanization level increases. However, the spatial pattern of the relationship between percentage of agricultural land and OC is opposite to those for SC and dissolved nutrients. The mainly positive relationship between agricultural land and OC decreases as the urbanization level of watersheds increases.
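The association between urbanization level and the local parameter estimates can be quantified with a correlation coefficient. The text does not say which coefficient was used, so this sketch assumes Pearson's r together with the usual t statistic for testing whether r differs from zero; both function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between urbanization level (x) and the
    local GWR parameter estimates for agricultural land (y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom.

    Undefined for |r| = 1, where the denominator is zero.
    """
    return r * math.sqrt((n - 2) / (1 - r * r))
```

A significant positive r for DO and a significant negative r for SC and the dissolved nutrients would correspond to the patterns described above.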

Fig. 5 Scatter plots of urbanization level and parameter estimates from the GWR models for agricultural land and water quality

The GWR results for the relationships between agricultural land and water quality indicators agree with the findings of the earlier study in eastern Massachusetts (Tu 2011a). Agricultural land might be an important pollution source in less-urbanized, especially agriculture-dominated, watersheds. In highly-urbanized watersheds, however, it might be associated with good water quality and serve as a good predictor of water quality, because its contribution to pollution is negligible and usually masked by urban sources. Its ability to predict water quality increases as the urbanization level of watersheds increases.

The similar results between this study in northern Georgia and the earlier study in eastern Massachusetts might be due to the fact that a similar urbanization gradient exists in both regions. The percentages of urban land in the studied watersheds in eastern Massachusetts range from 7.3% to 100%, with an average of 44.6%, and decrease outward from the city of Boston. The percentage of agricultural land is also low in most of the studied watersheds in eastern Massachusetts, ranging from 0% to 22.4%, with an average of 22.4% (Tu 2011a). These variations in the urbanization level of watersheds and percentage of agricultural land are very similar to those in this study area, as described in Sect. 4.1. Both studies show that agricultural land has significant positive relationships with some pollutants in less-urbanized watersheds due to the contribution of agricultural activities to water pollution. In highly-urbanized watersheds, however, it may have significant negative relationships with pollutants, since the contribution of agricultural land to water pollution is usually masked by urban sources and a lower percentage of agricultural land usually means a higher percentage of urban land. If conventional statistical methods (e.g., OLS) are applied to an area that contains watersheds with a wide range of urbanization levels, such as eastern Massachusetts and northern Georgia, no significant relationships between agricultural land and water quality may be found (Tu 2011a). However, it is necessary to study the actual agricultural practices, such as agricultural type (livestock raising, organic farming, or traditional farming), crop types, and fertilizer application, to understand more clearly the influence of the urbanization level of watersheds on the relationships between agricultural land and water quality. Neither this study nor the earlier studies collected this kind of information. Thus, further research is needed to interpret the underlying causes of the spatial variations in the relationships between land use and water quality.

Besides the limitation arising from using agricultural land without considering actual agricultural practices, the other three land use indicators (urban land, forest, and wetland) are also too broad to reveal the actual underlying causes of the relationships between land use and water quality. Urban land includes different types of use (e.g., commercial, residential, industrial, and transportation) and different intensities. Both forest and wetland can also be classified into more detailed uses. These more detailed land uses may have different effects on water quality. Thus, in order to identify more clearly the changing causes of water pollution at different sampling sites, further research is also needed to analyze how more detailed land uses in the watersheds of the sampling sites are related to water quality.

Conclusions

This study examined the relationships between four land use indicators and eight water quality indicators across an urbanization gradient in the watersheds of northern Georgia using both OLS and GWR models. The results obtained using OLS show that percentage of urban land has significant positive correlations with all the dissolved nutrients and organic carbon, a slightly significant positive correlation with specific conductance, and a slightly significant negative correlation with dissolved oxygen. For each water quality indicator, percentage of forest has a significant relationship opposite in sign to that of percentage of urban land. Agricultural land and wetland have no significant relationships with water quality in the study area. The results of OLS indicate that urban land is associated with water pollution, forest is related to good water quality, and agricultural land is not an important pollution source in the study area.

However, the results from GWR analyses show that the relationships between land use and water quality actually vary over space. The spatial variations in the relationships are affected by the urbanization level of watersheds. The significant positive relationships between percentage of urban land and dissolved nutrients are mainly observed in highly-urbanized watersheds, while those between percentage of urban land and SC are mainly found in less-urbanized watersheds.

Similar to the results of OLS, the GWR results also show that percentage of forest is a good predictor of water quality, since significant negative relationships with all the studied water pollutants and significant positive relationships with DO are observed for percentage of forest at many sampling sites. However, the relationships from GWR analyses are not significant at every sampling site; there are great spatial variations in the relationships, which are also affected by the urbanization level of watersheds. Percentage of forest has stronger relationships with SC in less-urbanized areas, while it has stronger relationships with dissolved nutrients in highly-urbanized areas.

Different from the weak relationships between agricultural land and water quality found using OLS, the results of GWR analyses show that significant relationships for agricultural land can be observed at many sampling sites. Percentage of agricultural land has significant negative relationships with most dissolved nutrients and a significant positive relationship with DO in highly-urbanized watersheds, but a significant negative relationship with DO and a significant positive relationship with OC and SC in less-urbanized watersheds. The results indicate that agricultural land might be an important pollution source in less-urbanized rural areas, but might be associated with good water quality as a predictor of water quality in highly-urbanized central cities.

All the spatial patterns in the spatially varying relationships between land use and water quality are consistent with the findings of the earlier study using GWR in the watersheds of eastern Massachusetts (Tu 2010, 2011a). The results of the comparison of model performance between OLS and GWR are also similar between these two studies (Tu and Xia 2008). GWR models show great improvements in model performance over their corresponding OLS models, as supported by the F-test and the comparisons of model R2 from GWR and OLS. A dramatic improvement in R2 of GWR over OLS is observed for every pair of water quality and land use indicators. The similar results from these two study areas might be caused by the similar land use pattern. Both study areas contain watersheds with a great variation in urbanization level, and both have a clear urbanization gradient with percentage of urban land decreasing outward from the central cities. The spatial variation in the urbanization level of watersheds causes the contributions of pollution sources associated with different land uses to change over space, so that spatially varying relationships between land use and water quality can be found.

The major contribution of this study is that it replicates the findings of the earlier research in eastern Massachusetts, which was a pilot study of the application of GWR to the water environment. This study confirms that GWR, as a spatial statistical tool, has advantages for exploring the spatially varying relationships between land use and water quality, and that the spatial variations in the relationships are affected by the urbanization level of watersheds. Thus, these findings are unlikely to have been obtained by chance and are not limited to a single study area.

Combined with the findings of the earlier study in eastern Massachusetts, this study has important implications for both water quality research and watershed management. First, the significance of the relationship between water quality and land use differs among sampling sites because pollution sources (e.g., agricultural or urban activities) usually vary with the urbanization levels of watersheds. Thus, land use and watershed management policies should be adapted to different parts of a study area based on the local pollution sources revealed by GWR. For example, a nutrient might be the primary pollutant in a watershed and show a significant positive relationship with the percentage of agricultural land of the watershed; in that case, it is important to control the contribution of the nutrient from agricultural activities in the watershed. However, the nutrient might show a significant negative relationship with percentage of agricultural land and a significant positive relationship with percentage of urban land in another watershed; there, it is more important to control the nutrient from urban activities (e.g., residential sewage and street runoff), and it is not necessary to control the agricultural activities.

Second, the impact of urbanization on different water pollutants might be different. As shown in these two studies, the relationships between urban land and dissolved ions and solids (represented by SC in the current study) become weaker as the urbanization level of watersheds increases. This indicates that the same degree of increase in urban land will contribute more dissolved ions and solids in less-urbanized areas than in highly-urbanized areas. In contrast, the relationships between urban land and dissolved nutrients get stronger as the urbanization levels of watersheds increase. This indicates that the same degree of increase in urban land will contribute more nutrients in highly-urbanized areas than in less-urbanized areas. Thus, in order to reduce water degradation, land use planning should be adjusted based on the primary pollutants in watersheds. For example, for an area where dissolved ions and solids are the primary pollutants that cause water degradation, it might be better to control low-density suburban development but to encourage compact central city development. Conversely, for an area where nutrients are the primary pollutants, the restriction on suburban development might be eased, while central city development might be better controlled. However, this land use planning idea is proposed based on the results from only eastern Massachusetts and northern Georgia, which have a similar land use pattern. Further studies on more regions with various land use patterns, and deeper investigations into the underlying causes of the spatially varying relationships between land use and water quality, are necessary to establish the reliability and feasibility of this idea.

Furthermore, GWR techniques can also be extended to other fields of environmental studies and management. Environmental scientists and agencies are concerned about the interrelations among the various natural and human factors that affect the environment, such as soil, climate, land use, human activities, and policy. All these factors change over space, and thus their relationships might have great spatial variations. These spatial variations are not easy to study using traditional global statistics, but they can be explored by GWR techniques. Therefore, GWR is a very useful tool in water environmental research and also has the potential to be applied to other fields of environmental studies and management.