Introduction

Groundwater systems in urban areas show greater differences in physical and chemical characteristics than natural groundwater systems. A new hydrogeological system has been established in urban areas in Korea because of the artificial water supply (recharge) and discharge as well as a decreased groundwater recharge rate caused by increasing coverage of pavement (Ministry of Science and Technology 1997).

Seoul, the capital of South Korea (Fig. 1), has a population of approximately 10 million in an area of 605.7 km² (Seoul Metropolitan City 2001a). The quantity and quality of groundwater in this metropolitan area, which has expanded rapidly since 1970, have been greatly affected by factories, sewer lines, waste landfills and groundwater exploitation, among other factors (Kim and others 2001). Since the late 1980s, wells have been drilled deeper because groundwater quality in shallow wells has deteriorated. As a result, groundwater in the deep aquifer has been exposed to an increased risk of contamination. The annual precipitation in the Seoul area is 1,360 mm/year. A thin topsoil layer and steep slopes in the watershed cause rapid runoff of rainfall (Seoul Metropolitan City 1996). The aquifers of the Seoul area consist of thin alluvium and fractured crystalline bedrock of granite, gneiss and schist. Alluvium is mainly distributed near the Han River and is composed of coarse- and fine-grained sediments that result in high permeability. The bedrock is mainly biotite granite and banded biotite gneiss, which cover nearly 60% of the study area (Fig. 2). Gneiss areas show severe weathering. Groundwater systems are extremely distorted by large buildings, subway systems, water-supply and sewage systems and various underground constructions (Seoul Metropolitan City 1996). About 15,000 wells are currently in use in the Seoul area, and about 4.045×10⁷ m³/year of groundwater is pumped from domestic, agricultural and industrial wells (Kim and Lee 2001). Both groundwater depletion and serious groundwater contamination have been reported in some parts of the area (Kim and others 2000). Leakage of water from the sewerage systems has a significant impact on some characteristics of the groundwater.
To solve these problems, the factors contributing to groundwater contamination in urban areas have to be analyzed and systematic management plans need to be established (Custodio 1997; Foster 2001).

Fig. 1. Location of Seoul, Korea

Fig. 2. Geological map of the study area

Recently, many research projects have examined the relationship between specific land-use patterns, the corresponding pollutant emissions and the resulting groundwater quality (Trauth and Xanthopoulos 1997; Lasserre and others 1999; Hong and Rosen 2001). The objective of this study is to analyze the relationship between land uses and NO3-N, a well-known contaminant in urban groundwater, and subsequently to develop statistical models for the assessment of nitrate contamination.

Materials and methods

This study was carried out in two steps: first, data collection and database construction using a Geographic Information System (GIS), and second, statistical analysis to investigate the characteristics of NO3-N according to well depth, rainfall distribution and land use. The SPSS statistical package for Windows was used for the statistical analysis. The P values from the statistical tests were evaluated at the 0.05 significance level (95% confidence level).

Data collected and used to construct the GIS database include monthly rainfall records, well locations, groundwater quality, land-use data from references and reports and related GIS data.

Figure 3 shows the monthly rainfall distribution from 1991 to 2000. The average annual rainfall is 1,360 mm/year (Seoul Metropolitan City 2001a). Based on the 10 years of rainfall data, the dry season was identified as December to February and the rainy season as July to September. More than 70% of the rainfall occurs during the four summer months from June to September, resulting from seasonal rain fronts in June and July and typhoons in August and September. In this study, the characteristics of NO3-N in groundwater were analyzed according to the rainfall distribution.

Fig. 3. Rainfall distribution in the Seoul area (1991–2000)

The Seoul Metropolitan City (2001b) constructed land-use data, which are based on 1:1,000-scale digital topographic and cadastral maps, and the land uses were classified at three levels (Table 1). As a first-order classification, land uses are classified into urban and open space, which occupy about 58% and about 42% of the total Seoul area, respectively (Seoul Metropolitan City 2001b). In the second-order classification, urban areas are divided into residential areas, commercial and business areas, mixed residential and business areas, industrial areas, public facilities, transportation facilities, urban infrastructure facilities, denuded areas and inaccessible areas. Open spaces include grasslands, field crop areas, forest areas, managed green areas, and rivers and streams (Fig. 4). In the third-order classification, residential areas are further classified into detached houses, apartment houses, and traditional houses. Transportation facilities are subdivided into railroad, road, airport and related facilities. Urban infrastructure facilities, the potential contamination sources, are classified into sewage treatment facilities, storm-water retention ponds, water supply reservoirs, waste landfills, power plants, waste incineration facilities and waste transfer stations.

Table 1. Land uses in the Seoul area (Seoul Metropolitan City 2001b)

Fig. 4. Land uses in the Seoul area

A database of well locations and NO3-N concentrations was constructed in 2001. Presently, about 30,000 wells, which include in-use and abandoned wells, constitute the database (Seoul Metropolitan City 2002). Results from a total of 1,988 wells, monitored in January, February, July, August, September and December of the year 2000, were used for statistical analysis.

The well depth was classified based on the maximum depth of weathered bedrocks, which is 40 m, in the Seoul area (Seoul Metropolitan City 1996). If the borehole depth was less than 40 m, the well was considered a "shallow well." Wells deeper than 40 m were identified as "deep wells." Figure 5 shows the location of wells used in this study, which are classified according to the well depth in the dry season: (a) 306 shallow wells and (b) 225 deep wells. Figure 6 shows the location of the wells in the rainy season: (a) 849 shallow wells and (b) 608 deep wells.
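The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names and sample records are hypothetical, and because the text does not say how a borehole of exactly 40 m is classified, the sketch assumes it counts as deep.

```python
from collections import Counter

# Season months follow the definitions given in the text.
DRY_MONTHS = {12, 1, 2}
RAINY_MONTHS = {7, 8, 9}
WEATHERED_BEDROCK_DEPTH_M = 40.0  # maximum depth of weathered bedrock


def depth_class(borehole_depth_m):
    """Return 'shallow' below 40 m, otherwise 'deep'.

    Assumption: a borehole of exactly 40 m is treated as deep.
    """
    return "shallow" if borehole_depth_m < WEATHERED_BEDROCK_DEPTH_M else "deep"


def season_class(month):
    """Return 'dry', 'rainy', or None for months outside both seasons."""
    if month in DRY_MONTHS:
        return "dry"
    if month in RAINY_MONTHS:
        return "rainy"
    return None


def group_counts(records):
    """Count (depth class, season) groups for (depth_m, month) records."""
    counts = Counter()
    for depth_m, month in records:
        season = season_class(month)
        if season is not None:
            counts[(depth_class(depth_m), season)] += 1
    return counts


# Hypothetical records: (borehole depth in m, sampling month)
sample = [(25.0, 1), (120.0, 8), (39.9, 7), (40.0, 12), (60.0, 4)]
print(group_counts(sample))
```

Records sampled outside the six monitored months are dropped, which mirrors the restriction of the analysis to the dry and rainy seasons.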

Fig. 5. Distribution of wells monitored in 2000. a Shallow wells in the dry season, b deep wells in the dry season

Fig. 6. Distribution of wells monitored in 2000. a Shallow wells in the rainy season, b deep wells in the rainy season

Figure 7 shows a flowchart for developing the statistical models for contamination assessment. First, a test for normality is performed to evaluate the distribution of NO3-N concentration in each group defined by well depth and rainfall distribution. Second, a test is performed to determine whether NO3-N concentrations in groundwater differ between the groups. Third, the radius of influence (ROI) is evaluated to identify the area over which a certain land use could affect NO3-N concentration in wells. Finally, correlation and linear regression analyses are performed to analyze the relationship between land uses within the specific ROI and NO3-N concentrations. Before these statistical methods are applied, two GIS processes are used to calculate the area of each land use within the radius of influence: buffering around the well center, and overlaying the land-use data with the buffer layer.
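The buffering and overlay steps can be illustrated with a simplified raster version. The sketch below assumes the land-use map has been rasterized into square cells, which is not how the source describes its vector GIS layers; the grid, cell size and function name are hypothetical.

```python
import math


def land_use_area_in_buffer(grid, cell_size_m, well_xy, radius_m):
    """Sum the area of each land-use code whose cell center lies in the buffer.

    grid[row][col] holds a land-use code; the center of a cell is at
    ((col + 0.5) * cell_size_m, (row + 0.5) * cell_size_m).
    """
    well_x, well_y = well_xy
    cell_area = cell_size_m ** 2
    areas = {}
    for row, codes in enumerate(grid):
        center_y = (row + 0.5) * cell_size_m
        for col, code in enumerate(codes):
            center_x = (col + 0.5) * cell_size_m
            # Overlay step: keep only cells inside the circular buffer.
            if math.hypot(center_x - well_x, center_y - well_y) <= radius_m:
                areas[code] = areas.get(code, 0.0) + cell_area
    return areas


# Hypothetical 400 m x 400 m map of 10 m cells: west half LN3, east half LN11.
grid = [["LN3"] * 20 + ["LN11"] * 20 for _ in range(40)]
areas = land_use_area_in_buffer(grid, 10.0, (200.0, 200.0), 200.0)
print(areas)
```

The resulting areas in m2 per land-use code are the kind of independent variables that the correlation and regression steps take as input.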

Fig. 7. Developing statistical models for contamination assessment of NO3-N

Results

Present NO3-N distribution

To estimate the extent of contamination from each type of land use, NO3-N concentrations were analyzed based on the number of wells and ratios exceeding the drinking water standard of 10 mg/L (OECD 1986). The mean NO3-N concentration for the 1,988 wells was 6.33 mg/L. Among these wells, 463 wells (23.29%) exceeded the drinking water standard. Table 2 shows the number of wells located in various second-order land uses and the number of contaminated wells with NO3-N >10 mg/L in each category.

Table 2. Range of nitrate-N concentrations in wells according to each land use. (LN1 residential area, LN2 commercial and business area, LN3 mixed residential and business area, LN4 industrial area, LN5 public facilities area, LN6 transportation facilities area, LN7 urban infrastructure facilities area, LN8 denuded area, LN9 inaccessible area, LN10 grassland area, LN11 field crop area, LN12 forest area, LN13 managed green area)

The land use with the highest ratio of NO3-N concentrations exceeding 10 mg/L is the field crop area, where 70 (38.46%) of the 182 wells exceeded 10 mg/L of NO3-N. The next highest is the mixed residential and business area (28.73%), followed by grassland areas (22.22%), industrial areas (20.37%), residential areas (19.95%) and commercial and business areas (19.10%). NO3-N in groundwater is generally derived from urban wastewater, agricultural fertilizers and livestock wastes in farming areas (Hem 1992; Huber and others 2000).

In Seoul, field crop areas and mixed residential and business areas show a higher percentage of NO3-N contaminated wells than the average ratio (23.29%) of all the wells exceeding the standard, indicating that general causes of nitrate contamination could be related to the land uses. The land uses represented by the lowest ratios are inaccessible areas, urban infrastructure facilities, managed green areas and forest (Table 2).
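The exceedance ratios quoted above are simple proportions. As a small check, the sketch below recomputes the two ratios whose counts are given in the text (463 of 1,988 wells overall and 70 of 182 wells in field crop areas); the function names are illustrative only.

```python
DRINKING_WATER_STANDARD_MG_L = 10.0  # NO3-N standard cited in the text


def exceedance_percent(n_exceeding, n_total):
    """Percentage of wells above the standard, rounded to two decimals."""
    return round(100.0 * n_exceeding / n_total, 2)


def count_exceeding(concentrations_mg_l, standard=DRINKING_WATER_STANDARD_MG_L):
    """Count wells whose NO3-N concentration is strictly above the standard."""
    return sum(1 for c in concentrations_mg_l if c > standard)


all_wells = exceedance_percent(463, 1988)
field_crop = exceedance_percent(70, 182)
print(all_wells, field_crop)
```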

A normality test was performed on the NO3-N concentrations to determine whether parametric or non-parametric test procedures should be employed. Table 3 shows the results, which indicate non-normality in every data group. For example, the Kolmogorov-Smirnov value for NO3-N concentration in shallow wells in the rainy season is 3.667, with a P value of <0.001. Because this P value is smaller than the significance level of 0.05, the hypothesis of a normal distribution is rejected. The P values for the other groups are also smaller than 0.05, demonstrating non-normal distributions.
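For readers who want to reproduce this kind of test outside SPSS, the following is a minimal standard-library sketch of a one-sample Kolmogorov-Smirnov normality check. It assumes that the reported value (3.667) is the statistic D scaled by the square root of the sample size, which the text does not state, and it uses the asymptotic Kolmogorov distribution for the P value. Because the mean and standard deviation are estimated from the same data, this P value is only approximate.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def kolmogorov_p(z):
    """Asymptotic two-sided P value for z = sqrt(n) * D."""
    if z < 0.27:  # the series converges poorly here; P is effectively 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                for k in range(1, 101))
    return min(max(2.0 * total, 0.0), 1.0)


def ks_normality(values):
    """Return (D, sqrt(n) * D, approximate P) against a fitted normal."""
    xs = sorted(values)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # Largest gap between the empirical and the fitted normal CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    z = math.sqrt(n) * d
    return d, z, kolmogorov_p(z)


# Hypothetical, strongly right-skewed sample (not data from the study).
skewed = [math.exp(0.05 * i) for i in range(200)]
d, z, p = ks_normality(skewed)
print(d, z, p)
```

Under the stated assumption, `kolmogorov_p(3.667)` is far below 0.001, which agrees with the P value reported for shallow wells in the rainy season.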

Table 3. Results of testing for normality in the distribution of NO3-N concentration by well depth and season

A Mann-Whitney U-test was performed to verify whether the NO3-N concentrations in the groups of wells come from different populations. In the comparison of rainy and dry seasons (Table 4a, b), shallow and deep wells show Z values of −5.068 and −2.888, respectively. These Z values indicate that NO3-N concentrations differ significantly between seasons.

Table 4. Results of the Mann-Whitney U-test: analyzing the difference in NO3-N concentration by comparing the well depth and rainfall distribution

In the comparison of shallow and deep wells, the Z values for the rainy and dry seasons were −6.591 and −4.089, respectively, showing statistically significant differences in NO3-N concentration with depth (Table 4c, d).

During the same season, shallow wells tend to have higher concentrations than deep wells (Table 4a vs. b), indicating that groundwater quality in shallow wells has deteriorated more. For wells of the same depth, the NO3-N values were higher in the dry season than the rainy season (Fig. 8, Table 4c vs. d). Consequently, the results of the Mann-Whitney U-test show that the differences of concentration of NO3-N due to well depth and seasonal rainfall are significant at the 95% confidence level.
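The Z values above come from the normal approximation to the Mann-Whitney U statistic. A minimal standard-library sketch is given below; it omits the tie correction to the variance that statistical packages usually apply, and it takes the smaller of the two U values so that Z is negative, as in the reported results. Both choices are assumptions about how the published values were produced.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def two_sided_p(z):
    """Two-sided P value of a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def mann_whitney(a, b):
    """Return (U, Z, P) using the normal approximation without tie correction."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return u, z, two_sided_p(z)


# Hypothetical NO3-N samples in mg/L (not data from the study).
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
u, z, p = mann_whitney(group_a, group_b)
print(u, z, p)
```

Applying `two_sided_p` to the smallest reported magnitude, Z = −2.888, gives a P value below 0.05, in line with the conclusion that all four comparisons are significant.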

Fig. 8. Mean and median concentration of NO3-N according to well depth and rainfall distribution. a Shallow wells in the rainy season, b deep wells in the rainy season, c shallow wells in the dry season, and d deep wells in the dry season

Establishment of radius of influence

The radius of influence (ROI) is the range over which a particular land use has the major influence on groundwater quality. Establishing the optimum ROI is essential when analyzing the relationship between land use and groundwater quality. If the ROI is too small, land-use characteristics are not properly reflected in the groundwater quality. Conversely, if the ROI is too large, unrelated land uses may appear to influence the groundwater quality. The ROI concept ignores regional and local groundwater flow and any contamination moving into the ROI area of a well. The hydrogeological conditions in the Seoul area are extremely disturbed by influences such as well pumping, subway pumping, sewer system leaks, the municipal water-supply system and impermeable pavement. Regional groundwater flow is therefore difficult to model in the Seoul area, and detailed groundwater flow modeling within the individual ROI areas even more so. Thus, although the ROI concept has shortcomings, it reflects the effects of contaminant sources, such as specific land uses, in the local area. Figure 9a shows the land uses within specific ROIs from a well located in an urban area, which is composed mainly of residential areas, commercial and business areas and mixed residential and business areas. Figure 9b shows the same information for green and open-space areas, mainly forest areas, field crop areas and grassland areas.

Fig. 9. Land uses within the 200 m ROI. a Land uses in the urban area, b land uses in the green and open-space area

In this study, to establish the optimum ROI, we considered various distances from the well center (50 to 400 m in 50-m increments) and then performed multiple regression analysis between specific land uses and concentrations of NO3-N at each ROI. As previously mentioned, land uses that correlate with the higher NO3-N concentrations in wells are mixed residential and business areas and field crop areas. Therefore, these two land uses were used as independent variables for the regression analysis. The optimum ROI was determined based on the highest regression coefficient.

Figure 10 shows the regression coefficients between the two land uses and the NO3-N concentration as a function of the ROI radius. In the dry season, the regression coefficients for NO3-N increased from a radius of 50 m to 200 m and then decreased continuously beyond 250 m in both shallow and deep wells. Therefore, in the dry season, 200 m from a well is considered to be the optimum ROI for capturing the land-use characteristics that affect the concentration of NO3-N in groundwater. In the rainy season, an ROI of 250 m has the highest regression coefficient in both shallow and deep wells. Consequently, ROIs of 200 and 250 m for the dry and rainy seasons, respectively, were selected to evaluate the statistical models for contamination assessment. This result is consistent with the study by Chon and Ahn (1998), which concluded that the ROI for NO3-N from potential point sources would be 200 m in the Guro-Ku area, a western part of the Seoul metropolitan area.
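The selection rule amounts to scanning the candidate radii and keeping the one with the highest regression coefficient. The sketch below shows only that selection step. The scoring function is a hypothetical placeholder with a peak at 200 m, standing in for a regression fitted at each radius, and its values are not results from the study.

```python
def candidate_radii(start_m=50, stop_m=400, step_m=50):
    """Radii from 50 to 400 m in 50-m increments, as described in the text."""
    return list(range(start_m, stop_m + 1, step_m))


def optimum_roi(score_for_radius, radii):
    """Return the radius with the highest score and all scores by radius."""
    scores = {radius: score_for_radius(radius) for radius in radii}
    best_radius = max(scores, key=scores.get)
    return best_radius, scores


def placeholder_score(radius_m):
    """Hypothetical stand-in for the regression coefficient at one radius."""
    return 0.3 - abs(radius_m - 200) / 1000.0


radii = candidate_radii()
best, scores = optimum_roi(placeholder_score, radii)
print(best)
```

In practice `score_for_radius` would recompute the land-use areas within each buffer and refit the regression, so the scan costs one buffer-overlay-regression cycle per radius.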

Fig. 10. Regression coefficient variation for NO3-N by increment of radius of influence. a Shallow wells in the dry season, b shallow wells in the rainy season, c deep wells in the dry season, and d deep wells in the rainy season

Development of statistical models

Correlation analysis expresses the relationship between two variables. The results of the Mann-Whitney U-test show that NO3-N concentrations differ according to well depth and rainfall distribution. Therefore, correlation analysis was performed separately for each group to see how strongly each land use is related to the NO3-N concentration. Because the NO3-N concentrations were non-normally distributed in every group defined by well depth and rainfall distribution, Spearman's rho test was used.
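Spearman's rho is the Pearson correlation computed on ranks, which is why it suits non-normal data. A minimal standard-library sketch follows; the land-use areas and concentrations in the example are hypothetical, and the significance test that accompanies rho in statistical packages is not included.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical field crop area within the ROI (m2) and NO3-N (mg/L) per well.
field_crop_area = [0.0, 1500.0, 4000.0, 9000.0, 20000.0]
no3n = [2.1, 3.4, 3.0, 8.8, 14.2]
rho = spearman_rho(field_crop_area, no3n)
print(rho)
```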

The land uses considered within the ROI are residential, commercial and business, mixed residential and business, industrial, public facilities, transportation facilities, urban infrastructure facilities, denuded, inaccessible, grassland, field crop, forest, managed green, and river and stream.

Table 5 shows the results of the Spearman's rho test between land uses within the ROI and the NO3-N concentration at the well in the center of each ROI. In the dry season, the land use showing a positive coefficient at the significance level of 0.05 is the field crop area (LN11) in deep wells. Field crop areas (LN11) show notably higher correlation coefficients than other land uses, implying that the occurrence of NO3-N in wells is related to field crop areas. In the rainy season, residential areas (LN1), mixed residential and business areas (LN3) and public facilities areas (LN5) show positive coefficients in shallow wells at the significance level of 0.05. In deep wells, mixed residential and business areas (LN3) show a positive coefficient at the significance level of 0.05.

Table 5. Results of the Spearman's rho test between land use and NO3-N concentration. (LN1 residential area, LN2 commercial and business area, LN3 mixed residential and business area, LN4 industrial area, LN5 public facilities area, LN6 transportation facilities area, LN7 urban infrastructure facilities area, LN8 denuded area, LN9 inaccessible area, LN10 grassland area, LN11 field crop area, LN12 forest area, LN13 managed green area, LN14 river and stream)

Correlation coefficients are generally low, implying that the groundwater quality was affected by factors other than the land uses. The negative correlations suggest that NO3-N concentrations would be expected to be lower where the corresponding land uses occupy a larger share of the area.

Multiple regression analysis was used to draw a relationship between the NO3-N concentration and land use. The regression equation is expressed in the following form (Davis 1986):

$$ Y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p $$
(1)

where Yi is the NO3-N concentration as the dependent variable. The independent variables x1, x2, ..., xp represent the areas of each land use; β1, β2, ..., βp are regression coefficients and β0 is a constant. The results of the regression analysis for NO3-N in the dry season are summarized in Table 6: (a) for shallow and (b) for deep wells. Among the land uses used to establish the equation, the mixed residential and business area (LN3), grassland area (LN10) and field crop area (LN11) have positive coefficients in shallow wells at the significance level of 0.05. In deep wells, grassland area (LN10) and field crop area (LN11) have significant positive coefficients. The land uses with positive coefficients tend to increase the NO3-N concentration. The field crop area, which affects the NO3-N concentration most strongly, appears in the equations for both shallow and deep wells. Standardized coefficients for field crop areas are 0.289 and 0.523 in shallow and deep wells, respectively.
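To make the form of Eq. (1) concrete, the following standard-library sketch fits the coefficients by ordinary least squares through the normal equations and reports R2. It is an illustration under simplifying assumptions, not the SPSS procedure used in the study: it performs no significance testing or variable selection, and the data are hypothetical and exactly linear so that the recovered coefficients are easy to check.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_ols(rows, y):
    """Return ([b0, b1, ..., bp], R2) for y = b0 + b1*x1 + ... + bp*xp."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the constant
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical predictors (x1, x2) and a response built as 2 + 0.5*x1 - 0.25*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0), (4.0, 1.0)]
y = [2.0 + 0.5 * x1 - 0.25 * x2 for x1, x2 in rows]
beta, r_squared = fit_ols(rows, y)
print(beta, r_squared)
```

With real land-use areas in m2, which can reach tens of thousands, rescaling the predictors before solving the normal equations would be a sensible precaution against poor numerical conditioning.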

Table 6. Regression analysis results for NO3-N in the dry season. a Shallow wells, b deep wells. (LN1 residential area, LN3 mixed residential and business area, LN10 grassland area, LN11 field crop area)

Two regression equations for NO3-N in shallow (Eq. 2) and deep (Eq. 3) wells were derived and found significant at the 0.05 level by the F-test. The units of NO3-N concentration and land-use area are mg/L and m², respectively. For example, in Eq. (3), if the field crop area increases by 1 m², the NO3-N concentration increases by 1.091E-04 mg/L. In Eqs. (2) and (3), LN3 is the mixed residential and business area, LN10 is the grassland area and LN11 is the field crop area.

$$ \mathrm{NO_3\text{-}N\ (mg/L)} = 5.340 + 5.844\mathrm{E}{-}05 \times \mathrm{LN3\ (m^2)} + 2.641\mathrm{E}{-}04 \times \mathrm{LN10\ (m^2)} + 8.268\mathrm{E}{-}05 \times \mathrm{LN11\ (m^2)} $$
(2)
$$ \mathrm{NO_3\text{-}N\ (mg/L)} = 4.540 + 2.346\mathrm{E}{-}04 \times \mathrm{LN10\ (m^2)} + 1.091\mathrm{E}{-}04 \times \mathrm{LN11\ (m^2)} $$
(3)

The results of the regression analysis for NO3-N in the rainy season are summarized in Table 7. In shallow wells, the mixed residential and business area (LN3) has a positive coefficient, indicating a contribution to the increase of NO3-N concentration. In deep wells, the residential area (LN1), the mixed residential and business area (LN3), the industrial area (LN4), the denuded area (LN8) and the field crop area (LN11) have positive coefficients. Among these land uses, the mixed residential and business area has the highest regression coefficients, suggesting that the mixed residential and business area (LN3) would cause the largest increase of NO3-N concentration. Unlike the results for the dry season, the R2 value for shallow wells in the rainy season (R2 = 0.055) is higher than that for deep wells (R2 = 0.048). This is probably a result of infiltration of surface runoff and subsequent recharge processes. The R2 values in the rainy season (Table 7: R2 = 0.055 and 0.048 for shallow and deep wells, respectively) are lower than those in the dry season (Table 6: R2 = 0.094 and 0.332 for shallow and deep wells, respectively). This result is interpreted as showing that the hydrogeological system becomes more complicated in the rainy season than in the dry season, so that the effects of land use could be masked by other processes such as recharge.

Table 7. Regression analysis results for NO3-N in the rainy season. (LN1 residential area, LN3 mixed residential and business area, LN4 industrial area, LN6 transportation facilities, LN8 denuded area, LN11 field crop area, LN12 forest area, LN14 river and stream)

The equations for the rainy season were also evaluated with the F-test. The equations for shallow and deep wells are presented in Eqs. (4) and (5), respectively. In these equations, LN1 is the residential area, LN3 the mixed residential and business area, LN4 the industrial area, LN6 the transportation facilities, LN8 the denuded area, LN11 the field crop area, LN12 the forest area and LN14 the river and stream.

$$ \begin{aligned} \mathrm{NO_3\text{-}N\ (mg/L)} = {} & 7.910 + 7.953\mathrm{E}{-}06 \times \mathrm{LN3\ (m^2)} - 5.208\mathrm{E}{-}05 \times \mathrm{LN6\ (m^2)} \\ & - 2.518\mathrm{E}{-}05 \times \mathrm{LN12\ (m^2)} - 2.009\mathrm{E}{-}04 \times \mathrm{LN14\ (m^2)} \end{aligned} $$
(4)
$$ \begin{aligned} \mathrm{NO_3\text{-}N\ (mg/L)} = {} & 2.579 + 1.532\mathrm{E}{-}05 \times \mathrm{LN1\ (m^2)} + 2.219\mathrm{E}{-}05 \times \mathrm{LN3\ (m^2)} \\ & + 2.508\mathrm{E}{-}05 \times \mathrm{LN4\ (m^2)} + 3.855\mathrm{E}{-}05 \times \mathrm{LN8\ (m^2)} + 5.718\mathrm{E}{-}05 \times \mathrm{LN11\ (m^2)} \end{aligned} $$
(5)
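The four regression equations can be collected into a small lookup table and evaluated for a given set of land-use areas. The sketch below transcribes the constants and coefficients of Eqs. (2) to (5); the function name and the example areas are hypothetical. Given the low R2 values reported for these models, the outputs should be read as rough tendencies rather than predictions for an individual well.

```python
# (season, depth class) -> (constant in mg/L, {land-use code: coefficient per m2})
MODELS = {
    ("dry", "shallow"): (5.340, {"LN3": 5.844e-05, "LN10": 2.641e-04,
                                 "LN11": 8.268e-05}),
    ("dry", "deep"): (4.540, {"LN10": 2.346e-04, "LN11": 1.091e-04}),
    ("rainy", "shallow"): (7.910, {"LN3": 7.953e-06, "LN6": -5.208e-05,
                                   "LN12": -2.518e-05, "LN14": -2.009e-04}),
    ("rainy", "deep"): (2.579, {"LN1": 1.532e-05, "LN3": 2.219e-05,
                                "LN4": 2.508e-05, "LN8": 3.855e-05,
                                "LN11": 5.718e-05}),
}


def estimate_no3n(season, depth, areas_m2):
    """Evaluate the regression equation for one season and depth class.

    areas_m2 maps land-use codes to areas within the ROI; codes that are
    missing from the mapping are treated as zero area.
    """
    constant, coefficients = MODELS[(season, depth)]
    return constant + sum(coef * areas_m2.get(code, 0.0)
                          for code, coef in coefficients.items())


# Hypothetical deep well in the dry season with 10,000 m2 of field crop area.
baseline = estimate_no3n("dry", "deep", {})
with_field_crop = estimate_no3n("dry", "deep", {"LN11": 10000.0})
print(baseline, with_field_crop)
```

Keeping the coefficients in one table makes it straightforward to compare the constants across the four groups or to swap in coefficients from a refitted model.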

From the results of the four regression analyses, we identified the grassland area (LN10) and the field crop area (LN11) as the major land uses affecting NO3-N concentration in the dry season, whereas the mixed residential and business area (LN3) was a major source in the rainy season. The constants representing the Y intercept are higher in shallow wells than in deep wells, which also indicates that shallow wells are more vulnerable to contamination than deep wells. This is consistent with the higher mean NO3-N concentration in shallow wells than in deep wells.

Discussion

In this study, NO3-N concentrations in urban wells were classified according to the rainfall distribution (as the dry and the rainy season) and the depth of the well (as shallow and deep). The NO3-N concentrations were analyzed using GIS techniques and statistical methods to draw correlations with land uses.

The NO3-N concentrations in the rainy season were lower than those in the dry season. This could be attributed to rainfall recharge and resulting dilution effects on the NO3-N concentration. The NO3-N concentrations are higher in shallow wells than in deep wells in both dry and rainy seasons. This indicates that NO3-N sources are mainly located at the surface and these contaminants flow into the groundwater system by infiltration and recharge processes.

The optimum radius of influence (ROI), within which land use shows the strongest effect on NO3-N concentration in wells, is about 200 m in the dry season and 250 m in the rainy season for both shallow and deep wells. The land uses that affect NO3-N concentration are the grassland area and the field crop area in the dry season, and the mixed residential and business area in the rainy season. Therefore, to manage groundwater quality with respect to NO3-N, it is necessary to manage land use according to the rainfall distribution.

In the statistical methods, we used land-use areas as the non-point sources that affect NO3-N in wells. Correlations between NO3-N concentration and land uses are generally weak, even where they are significant at the 0.05 level. The R2 values from the regression analysis are also very low. These results suggest that many contaminant sources exist in addition to the land uses, and that hydrogeological conditions are severely disturbed in urban areas. As a next step, to obtain better results, the database needs to be extended to include population, sewer lines and sewage disposal facilities, geological conditions and soil characteristics.