Development of Multiple Linear Regression Models for Predicting the Stormwater Quality of Urban Sub-Watersheds

Arora, Amarpreet S.; Reddy, Akepati S.

doi:10.1007/s00128-013-1160-y


Published: 24 November 2013

Volume 92, pages 36–43, (2014)

Bulletin of Environmental Contamination and Toxicology


Abstract

Stormwater management at the urban sub-watershed level has been envisioned to include stormwater collection, treatment, and disposal of the treated stormwater through groundwater recharge. Sizing, operation and control of stormwater management systems require information on the quantities and characteristics of the stormwater generated. Stormwater characteristics depend upon the dry spell between two successive rainfall events, the intensity of rainfall and the watershed characteristics. However, sampling and analysis of stormwater spanning only a few rainfall events provides insufficient information on these characteristics. In the present study, an attempt has been made to assess stormwater characteristics through regression modeling. Stormwater from five sub-watersheds of Patiala city was sampled and analyzed. The results were related to the antecedent dry period and to the intensity of the rainfall event through regression modeling. The regression models obtained were used to assess the stormwater quality for various antecedent dry periods and rainfall event intensities.


Stormwater runoff from urban areas can be highly polluted with various materials, indicating a significant non-point source (NPS) of pollution to receiving water bodies (Brezonik and Stadelmann 2002; Kim et al. 2005). Site and event parameters (total event rainfall, cumulative seasonal rainfall, drainage area, annual average daily traffic and antecedent dry period) were found to have significant influences on urban runoff. Irish et al. (1998) and Brezonik and Stadelmann (2002) determined that loads for each constituent are dependent upon a unique subset of variables. Also, the processes responsible for the generation, accumulation, and wash-off of urban runoff pollutants are constituent-specific. Geographic and physical factors such as the type and intensity of urban land use, degree of imperviousness, tree cover, soil type and slope are also important parameters that impact the quality of urban runoff (Graves et al. 2004; Kayhanian et al. 2007).

Due to the impacts on receiving waters and the expense involved in obtaining monitoring data on nonpoint source pollution, interest has grown in analyzing existing/measured data to develop estimation models for urban stormwater loads and concentrations (Thomson et al. 1997; Phillips and Thompson 2002). Such models are very helpful in estimating concentrations for unmonitored watersheds. Consequently, parameters that are relatively easy to monitor can serve as indicators for other constituents, reducing labour and time. Efforts have therefore been made along similar lines to develop regression models for a few selected sub-watersheds of Patiala city, Punjab. Patiala (between 29°49′ and 30°47′ north latitude and 75°58′ and 76°54′ east longitude), a city in Punjab (Northern India), has no provision for stormwater drainage. As a result, even modest rainfall events produce severe flooding in many parts of the city (Arora and Reddy 2012). Avoiding discharge of this polluted stormwater into the river Ghaggar, preventing flooding within the city and maintaining the groundwater table should be considered of utmost importance while planning a stormwater management system for the city.

This research was conducted to develop estimation models of pollutant concentrations in urban stormwater runoff, as a function of rainfall variables, using multiple regression analysis. The variability of average rainfall and antecedent dry period was explored using 6 years of data gathered by the Indian Meteorological Department. Stormwater characteristics were then predicted for five sub-watersheds of Patiala city through regression modeling for best, average and worst case conditions. The models developed were validated using the Chi square test applied to six additional storm events monitored during the next rainy season.

Materials and Methods

Five urban sub-watersheds were identified within Patiala city to serve as sample catchments for the present study (Fig. 1). These represent the diversity of urban watershed characteristics and conditions across the city. The catchments differed in the forms of land development activity and in housing density. The Civil Lines area is a predominantly residential catchment. Manjit Nagar, located near Seona village, is a mix of rural and residential acreage. Preet Nagar contains mixed urban development. Model Town, near Tagore theatre, is a predominantly commercial area. Lastly, the Bus Stand area is a heavily travelled and polluted catchment.

The study catchments were chosen within a 5 km radius so as to have uniformity in the geological, topographical, and climatic variables.

For each catchment, detailed background information was collected on:

Watershed area and land use patterns within the catchments
Slope and hydrological path lengths towards the point of flooding
Demographic features of the watersheds
Socioeconomic and other activities in the watersheds

Information on the watershed characteristics was drawn from physical surveys of the catchments, interviews with local people, topographical sheets and local zoning maps. Google Earth Pro was used for demarcating the catchments and assessing the area contributing to the runoff. Data on land use types were obtained from the Patiala municipal council and the Department of Town and Country Planning, Punjab, and data on storm/rainfall events were obtained from the Indian Meteorological Department.

For each of the five sample catchments, grab samples of stormwater were collected from the stormwater accumulation ditch immediately after the rainfall event. This research aims at designing sub-watershed level stormwater treatment systems that first collect or store the stormwater and subsequently treat it. Hence, grab sampling was preferred over flow-weighted measurements. A total of six storm events were sampled during the period from April 2010 to January 2012. Rain events averaging less than 2.5 mm yielded an insufficient volume of runoff to sample. A minimum gap of 4 days between any two successive sampled storm events was maintained, as per the guidelines specified in the Caltrans Stormwater Monitoring Protocol Guidance Manual (Caltrans 2000). Information on the rainfall event, such as the depth of rainfall and the number of antecedent dry days (ADD), was recorded for each of the events sampled. The samples were analyzed in the Environmental Laboratories of Thapar University for pH, conductivity, Total Suspended Solids (TSS), Total Dissolved Solids (TDS), 5-day Biochemical Oxygen Demand at 20 °C (BOD₅), Chemical Oxygen Demand (COD), Total Kjeldahl Nitrogen (TKN), Total Phosphorus (TP), oil and grease, total coliform count, fecal coliform count and the heavy metals Zn, Cd, Ni, Pb, Fe and Cu. Field and laboratory blanks (ultrapure water) were maintained throughout the sampling, preparation and analytical steps. The techniques given in APHA (1999) were used for analysis of the samples.

Statistical tools were used to assess and formulate relationships between stormwater quality characteristics and average rainfall and antecedent dry period. Multiple regression analysis was undertaken with pollutant concentration as the dependent variable and ADD and rainfall as the two independent variables. The regression models were built using Minitab version 16, and regression equations were formulated for all the identified significant parameters for each sub-watershed separately.

The general multiple linear regression (MLR) equation used to develop estimation equations for pollutant concentrations is shown as:

$$ \text{Pollutant Conc. or Load} = f\left( \text{ADD}, \text{Rainfall} \right) $$

$$ \text{Pollutant Conc. or Load} = \alpha \pm \beta \times \text{ADD} \pm \gamma \times \text{Rainfall} \qquad (1) $$

where α is a constant (the intercept); β and γ are the regression coefficients of the two independent variables; and ADD (days) and Rainfall (mm) are the input variables.
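As an illustration of how Eq. (1) can be fitted, the following minimal sketch estimates α, β and γ by ordinary least squares through the normal equations and reports R². It is not the Minitab procedure used in the study; the function names (`solve_linear`, `fit_mlr`) and the six-event data set are hypothetical and chosen only so that the example runs on its own.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(add_days, rainfall_mm, conc):
    """Fit conc = alpha + beta * ADD + gamma * Rainfall; return the coefficients and R^2."""
    rows = [[1.0, float(a), float(r)] for a, r in zip(add_days, rainfall_mm)]
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(rows, conc)) for i in range(3)]
    alpha, beta, gamma = solve_linear(xtx, xty)
    fitted = [alpha + beta * a + gamma * r for a, r in zip(add_days, rainfall_mm)]
    mean_y = sum(conc) / len(conc)
    ss_res = sum((y - f) ** 2 for y, f in zip(conc, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in conc)
    return (alpha, beta, gamma), 1.0 - ss_res / ss_tot


# Hypothetical six-event data set (NOT from the study): the concentrations are
# generated from alpha = 20, beta = 0.5, gamma = -0.1, so the fit should recover them.
add_days = [5, 12, 20, 35, 60, 90]
rainfall_mm = [110, 80, 64, 40, 21, 13]
conc = [20 + 0.5 * a - 0.1 * r for a, r in zip(add_days, rainfall_mm)]

(alpha, beta, gamma), r2 = fit_mlr(add_days, rainfall_mm, conc)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, gamma={gamma:.3f}, R2={r2:.3f}")
```

With only six events and three fitted coefficients, such a fit leaves three residual degrees of freedom, so in practice the coefficients are sensitive to individual events.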

To confirm the accuracy of the regression equations obtained and to assess their power in predicting pollutant concentrations, Chi square tests were applied to evaluate the goodness of fit at the 0.05 level of significance.

The Chi square value obtained for a particular parameter (BOD, COD, TSS, etc.) is compared with the tabulated value of the Chi square distribution at the α = 0.05 significance level and n−1 degrees of freedom, i.e. 5 in the present study. If the calculated value is less than the tabulated value, the null hypothesis is accepted, i.e. the hypothesis that the regression equations can predict the pollutant concentrations with 95 % accuracy and 5 % standard error is taken to hold.
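The comparison described above can be sketched as follows. The text does not spell out the test statistic, so this example assumes the common Pearson form, χ² = Σ (O − E)² / E, with O the measured and E the model-predicted concentration. The critical value 11.07 is the standard tabulated value for α = 0.05 and 5 degrees of freedom; the observed and predicted values are hypothetical.

```python
# Tabulated Chi square value for alpha = 0.05 and df = 5
CHI2_CRITICAL_DF5 = 11.07


def chi_square_statistic(observed, predicted):
    """Pearson Chi square statistic (assumed form): sum of (O - E)^2 / E."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, predicted))


def model_accepted(observed, predicted, critical=CHI2_CRITICAL_DF5):
    """True when the statistic is below the tabulated value (null hypothesis accepted)."""
    return chi_square_statistic(observed, predicted) < critical


# Hypothetical concentrations in mg/L for six validation events (NOT from the study)
observed = [42.0, 55.0, 61.0, 38.0, 70.0, 49.0]
predicted = [40.0, 58.0, 60.0, 41.0, 66.0, 50.0]

stat = chi_square_statistic(observed, predicted)
accepted = model_accepted(observed, predicted)
print(f"chi-square = {stat:.3f}, accepted = {accepted}")
```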

Models are often used to predict pollutant concentrations. Different models have been used, including regression, stochastic, and deterministic simulations (Irish et al. 1998). The main difference between these models lies in the mechanisms they consider for the generation and transport of the pollutants. To demonstrate the appropriateness of the regression model developed, a sensitivity analysis was undertaken for an organic parameter (BOD) and one of the metal pollutants (Zn) at each of the stormwater sites. Model sensitivity was assessed by varying one of the model input parameters (e.g. ADD) through a range of values (mean ± 2SD) while holding the other (e.g. rainfall) constant, and vice versa. Sensitivity of the model framework to these varying inputs was evaluated by comparing the coefficients of variation (CV) of the model output. Model sensitivity to input parameter variance was deemed high for changes in CV > 50 %, moderate for changes in CV between 20 % and 50 %, and low for changes in output CV < 20 %.
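One possible reading of this procedure is sketched below: one input is swept over mean ± 2SD while the other is held constant, the model output is evaluated at each step, and the CV of the outputs is classified with the thresholds given above. The regression coefficients, the ADD statistics and the number of sweep steps are hypothetical, and the exact CV comparison used in the study may differ.

```python
from statistics import mean, stdev


def sweep(mean_value, sd, steps=9):
    """Evenly spaced values from mean - 2*SD to mean + 2*SD (step count is an assumption)."""
    low = mean_value - 2 * sd
    step = 4 * sd / (steps - 1)
    return [low + i * step for i in range(steps)]


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def classify(cv_percent):
    """Sensitivity classes as defined in the text."""
    if cv_percent > 50:
        return "high"
    if cv_percent >= 20:
        return "moderate"
    return "low"


# Hypothetical regression coefficients and input statistics (NOT from the study)
ALPHA, BETA, GAMMA = 20.0, 0.5, -0.1
ADD_MEAN, ADD_SD = 30.0, 10.0
RAINFALL_CONSTANT = 56.0

# Vary ADD while holding rainfall constant
outputs = [ALPHA + BETA * add + GAMMA * RAINFALL_CONSTANT for add in sweep(ADD_MEAN, ADD_SD)]
cv_add = coefficient_of_variation(outputs)
print(f"CV when varying ADD: {cv_add:.1f} % ({classify(cv_add)})")
```

The same sweep can be repeated with rainfall varied and ADD held constant, and the two CVs compared to identify the primary sensitivity parameter.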

Results and Discussion

A comparison of key runoff quality constituents reported in other countries with those of this study is given in Table 1. Most of the studies listed were carried out in urban areas, except for those in France and at some sites in the USA, which were carried out in urban edge areas.

Table 1 Comparison of stormwater pollutant concentrations obtained in this study and previous publications


Inspection of the table shows that, at almost all the sites, mean concentrations of the listed constituents are higher in Patiala than those reported for the USA (Smullen et al. 1999) and China (Gan et al. 2008), except for TSS, which is the lowest among the compared areas. Runoff from the catchments in the USA is less polluted than that of the others. The studied areas in Paris and Isfahan were urban watersheds, where the quality of stormwater runoff was closely related to the layout of land uses, the drainage system and the environmental background. In general, concentrations of all the parameters except Cu were much higher in Paris than in our study (Kafi et al. 2008). Mean values of COD and TSS in the present study (168 and 114 mg/L respectively) were much lower than those reported for Iran (561 and 161 mg/L respectively), whereas the concentrations of TKN, TP and Zn observed in our study were higher than those for Iran (Taebi and Droste 2004).

The low or negligible concentrations of Pb observed in the present study, compared with the studied areas of other countries, can be attributed to the fact that the present study was conducted an appreciable time after the ban on leaded gasoline. The differences among the studies demonstrate the uncertainty and randomness of stormwater pollution and underline the need for site-specific observations before planning a stormwater management system.

Besides the quality, the quantity of stormwater generated from these sub-watersheds was estimated using the US-SCS CN method. The Curve Number (CN) for the catchments was assessed with the help of the hydrologic model WIN TR-55 Small Watershed Hydrology software. This was then used to calculate the runoff volume generated for different intensities of rainfall, i.e. 12 mm (70th percentile), 21 mm (80th percentile), 34 mm (90th percentile), 64, 122, and 157 mm (Table 2).
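For orientation, the runoff estimate behind such a calculation can be sketched with the standard SCS curve number relations in metric units, S = 25400/CN − 254 and Q = (P − Iₐ)² / (P − Iₐ + S), using the conventional initial abstraction Iₐ = 0.2S. The curve number and catchment area below are hypothetical; the study derived its CN values with WIN TR-55, and the values of Table 2 are not reproduced here.

```python
def scs_runoff_depth_mm(rainfall_mm, curve_number):
    """Runoff depth Q (mm) from rainfall depth P (mm) by the SCS curve number method."""
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    s = 25400.0 / curve_number - 254.0   # potential maximum retention, mm
    ia = 0.2 * s                         # initial abstraction (conventional assumption)
    if rainfall_mm <= ia:
        return 0.0                       # all rainfall is abstracted, no runoff
    return (rainfall_mm - ia) ** 2 / (rainfall_mm + 0.8 * s)


def runoff_volume_m3(rainfall_mm, curve_number, area_ha):
    """Runoff volume (m^3) for a catchment of the given area (ha)."""
    depth_m = scs_runoff_depth_mm(rainfall_mm, curve_number) / 1000.0
    return depth_m * area_ha * 10_000.0  # 1 ha = 10,000 m^2


# Hypothetical catchment (NOT from the study): CN = 90, area = 10 ha
for p in (12, 21, 34, 64, 122, 157):     # rainfall depths listed in the text, mm
    print(p, round(runoff_volume_m3(p, 90, 10.0), 1))
```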

Table 2 Volume of runoff for urban sub-watersheds


One can estimate the volume of stormwater to be managed for a particular storm event, but it is practically very difficult to analyse every event to determine its quality. In this context, regression models can help predict pollutant concentrations, which largely depend on both build-up and wash-off processes. Build-up depends on the ADD, land use, wind speed and traffic. Wash-off is a function of rainfall intensity and other factors (Kim et al. 2005, 2006). A few researchers have determined the effect of the antecedent dry period on stormwater discharge (Saget et al. 1996; Lee et al. 2002) and have even tried to derive a build-up model using ADD (Ball et al. 1998). The modeling in this study was carried out with this in mind.

Among all the parameters, BOD, COD, TSS, and oil and grease, along with the heavy metals Zn, Fe and Cu, showed a strong dependence on ADD and on wash-off resulting from the amount of rainfall. The regression equations obtained for these parameters, with the respective coefficients for each sub-watershed, are listed in Table 3.

Table 3 Results of the multiple regression analysis


A large variation in pollutant concentrations was observed between sites, which can be attributed to a complex interaction among many factors. ADD, in general, positively influenced the concentrations of the pollutants, in the sense that the more dry days, the higher the pollutant concentration. It was also observed that in the smaller catchments, such as Preet Nagar, the measured end-points showed less correlation with the ADD. This is evident from the negative correlations in the case of COD and the low regression coefficients for the other parameters in that sub-watershed. In contrast, the Bus Stand sub-watershed has a larger area and traffic density and showed positive correlations between the antecedent dry days and the measured end-points. The degree and extent to which these two variables affect stormwater pollutant concentrations in different sub-watersheds are difficult to determine on account of the influence of variables such as the rainfall pattern, traffic density, type of pavement, existence of curbs, maintenance practices, and the land use and size of the surrounding catchment (FHWA 1996).

Parameters for which the regression showed coefficients of determination (R²) below 0.5 for almost all the sub-watersheds were dropped. R² values for some of the parameters are greater than 0.8, suggesting that the equations and the data are well matched. To confirm the accuracy of the regression equations obtained using six storm events, and to validate and test the predictive power of the calibrated regression model, Chi square tests were applied to six additional samples collected during the next rainy season, and the goodness of fit was evaluated at a level of significance of α = 0.05 (Table 3). Thus, a total of 12 storm events were monitored: 6 were used for model development and the remaining 6 for validation. The Chi square values were compared with the tabulated values at n−1 degrees of freedom, i.e. 5, for all the sub-watersheds. In most cases the null hypothesis was valid, as the Chi square values obtained were less than the tabulated value (11.07 at df = 5).

The results indicate that the pollutant load can largely be explained by two major factors, namely ADD and rainfall, and that appropriate regression equations can be formulated for different sub-watersheds to aid in predicting the concentrations of those pollutant parameters that fit well within the model. Most models commonly use concentrations or loads of pollutants as variables that are dependent upon runoff volume, rainfall intensity, traffic intensity, antecedent dry days, surrounding land use, etc. Generally, it is difficult to consider all factors because many different site-specific conditions exist, such as the presence or absence of street sweeping, soil saturation, wind direction, etc. The present research, however, sought to develop a model that takes into account only those factors that are uniform across sites, so that it can be applied to sub-watersheds with similar characteristics. Hence the study confined its approach to the major factors ADD and rainfall.

The results of the sensitivity analysis are presented in Table 4.

Table 4 Results of the sensitivity analysis performed for BOD and Zn


This analysis showed that in most cases the model input ADD generated greater output variability (higher CV) than rainfall, and ADD can thus be regarded as the primary model sensitivity parameter. Rainfall, on the other hand, showed moderate to low variability except in the case of BOD in Manjit Nagar and Bus Stand. In these two sub-watersheds rainfall is the primary sensitivity parameter.

Multiple regression models were also used for stormwater quality assessment under three sets of conditions.

Worst case conditions for stormwater quality: minimum rainfall (13 mm) and maximum dry period (98 days)
Average or moderate conditions: average rainfall (56 mm) and average length of dry period (30 days)
Best case conditions for stormwater quality: maximum rainfall (115 mm) and minimum dry days (5 days).
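The three sets of conditions listed above can be applied to a fitted regression equation as in the following minimal sketch. The rainfall and dry-period values are those given in the text; the coefficients are hypothetical placeholders and do not correspond to any entry of Table 3.

```python
# Scenario inputs as stated in the text (rainfall in mm, dry period in days)
SCENARIOS = {
    "worst": {"add_days": 98, "rainfall_mm": 13},
    "average": {"add_days": 30, "rainfall_mm": 56},
    "best": {"add_days": 5, "rainfall_mm": 115},
}


def predict(alpha, beta, gamma, add_days, rainfall_mm):
    """Evaluate conc = alpha + beta * ADD + gamma * Rainfall."""
    return alpha + beta * add_days + gamma * rainfall_mm


# Hypothetical coefficients for one parameter at one sub-watershed (NOT from Table 3)
alpha, beta, gamma = 20.0, 0.5, -0.1

predictions = {
    name: predict(alpha, beta, gamma, **inputs) for name, inputs in SCENARIOS.items()
}
for name, value in predictions.items():
    print(f"{name}: {value:.1f} mg/L")
```

With a positive ADD coefficient and a negative rainfall coefficient, as assumed here, the predicted concentration falls from the worst case to the best case.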

Table 5 lists the values of the various stormwater quality parameters obtained for the above sets of conditions.

Table 5 Stormwater quality of the sub-watersheds predicted using regression models


The values were obtained from the regression equations in cases where the R² value exceeds 0.68, i.e. the 1-sigma value of a normal distribution. It is believed that these criteria will distinguish strong, moderate and weak relationships in terms of R² values and indicate the importance and reliability of the results obtained through regression modeling. Where R² values are less than 0.5, weak or no relationships are assumed, and the regression equations would provide misleading results. In such cases the maximum, minimum and average of the measured values are reported, as in the case of total and fecal coliform counts. Assessment through the regression model helps to predict stormwater quality well before the water is actually analyzed, and thus aids in selecting the treatment schemes most suitable for handling the stormwater quality even under worst case conditions.

The regression models generated had better estimation ability than the empirical models reported in previous studies in most, but not all, cases. However, there is still a need to understand the mechanisms of generation and transport of pollution in urban systems during a storm event, as these involve many media and many space and time scales. Nevertheless, the regression models are useful in predicting future trends in the loads and concentrations of pollutants in stormwater for sub-watersheds, and hence aid in deciding on the most appropriate and cost-effective treatment schemes for managing the stormwater.

References

APHA (1999) Standard methods for the examination of water and wastewater, 20th ed. American Public Health Association, Washington, DC
Arora AS, Reddy AS (2012) Multivariate analysis for assessing the quality of stormwater from different urban surfaces of the Patiala city, Punjab (India). Urban Water J. doi:10.1080/1573062X.2012.739629
Ball JE, Jenks R, Aubourg D (1998) An assessment of the availability of pollutant constituents on road surfaces. Sci Total Environ 209:243–254
Brezonik PL, Stadelmann TH (2002) Analysis and predictive models of stormwater runoff volumes, loads, and pollutant concentrations from watersheds in the Twin Cities metropolitan area, Minnesota, USA. Water Res 36:1743–1757
Caltrans (2000) Guidance manual: stormwater monitoring protocols, Report No. CTSW-RT-00-005. Sacramento, CA: California Department of Transportation
FHWA, Federal Highway Administration (1996) Evaluation and management of highway runoff water quality. In: Report No. FHWA-PD-96-032, U.S. Department of Transportation, 480
Gan HY, Zhuo MN, Li DQ, Zhou YZ (2008) Quality characterization and impact assessment of highway runoff in urban and rural area of Guangzhou, China. Environ Monit Assess 140(1–3):147–159
Graves G, Wan Y, Fike L (2004) Water quality characteristics of stormwater from major land uses in South Florida. J Am Water Resour Assoc 40:1405–1419
Irish LB, Barrett ME, Malina JF, Charbeneau RJ (1998) Use of regression models for analyzing highway storm-water loads. J Environ Eng 124(10):987–993
Kafi M, Gasperi J, Moilleron R, Gromaire MC, Chebbo G (2008) Spatial variability of the characteristics of combined wet weather pollutant loads in Paris. Water Res 42:539–549
Kayhanian M, Suverkropp C, Ruby A, Tsay K (2007) Characterization and prediction of highway runoff constituent event mean concentration. J Environ Manage 85:279–295
Kim LH, Kayhanian M, Lau SL, Stenstrom MK (2005) A new modeling approach in estimating first flush metal mass loading. Water Sci Technol 51(3–4):159–167
Kim LH, Zoh KD, Jeong S, Kayhanian M, Stenstrom MK (2006) Estimating pollutant mass accumulation on highways during dry periods. J Environ Eng 132(9):985–993
Lee JH, Bang KW, Ketchum LH, Choe JS, Yu MJ (2002) First flush analysis of urban storm runoff. Sci Total Environ 193:163–175
Phillips BC, Thompson G (2002) Virtual stormwater management planning in the 21st century. Ninth International Conference on Urban Drainage, Portland, Oregon
Saget A, Chebbo G, Bertrand-Krajewski JL (1996) The first flush in sewer systems. Water Sci Technol 33(9):101–108
Smullen JT, Shallcross AL, Cave KA (1999) Updating the U.S. nationwide urban runoff quality date base. Water Sci Technol 39(12):9–16
Taebi A, Droste RL (2004) Pollution loads in urban runoff and sanitary wastewater. Sci Total Environ 327:175–184
The Environment (Protection) Rules (1986) Schedule-VI, General standards for discharge of environmental pollutants part A: effluents
Thomson NR, Mcbean EA, Snodgrass W, Monstrenko IB (1997) Highway stormwater runoff quality: development surrogate parameter relationships. Water Air Soil Pollut 94:307–347


Acknowledgments

The authors would like to thank Thapar University, Patiala for providing the laboratory facilities to undertake the task of testing and analysis.

Author information

Authors and Affiliations

School of Energy and Environment, Thapar University, Patiala, 147004, India
Amarpreet S. Arora & Akepati S. Reddy


Corresponding author

Correspondence to Amarpreet S. Arora.


About this article

Cite this article

Arora, A.S., Reddy, A.S. Development of Multiple Linear Regression Models for Predicting the Stormwater Quality of Urban Sub-Watersheds. Bull Environ Contam Toxicol 92, 36–43 (2014). https://doi.org/10.1007/s00128-013-1160-y


Received: 26 April 2013
Accepted: 15 November 2013
Published: 24 November 2013
Issue Date: January 2014
DOI: https://doi.org/10.1007/s00128-013-1160-y
