Sensitivity Analysis of Empirical and Data-Driven Models on Longitudinal Dispersion Coefficient in Streams

Nezaratian, Hosein; Zahiri, Javad; Kashefipour, Seyed Mahmood

doi:10.1007/s40710-018-0334-3

Original Article
Published: 14 November 2018

Volume 5, pages 833–858 (2018)
Environmental Processes

Hosein Nezaratian¹,
Javad Zahiri ORCID: orcid.org/0000-0001-9681-1606¹ &
Seyed Mahmood Kashefipour²

Abstract

The longitudinal dispersion coefficient (LDC) is a key element in pollutant transport modeling in streams. Several empirical and data-driven models have been proposed to evaluate this parameter. In this study, sensitivity analysis was performed on four key parameters affecting the LDC: channel width, flow depth, mean flow velocity and shear velocity. In addition, Monte Carlo simulation was used to generate new datasets and to evaluate the performance of LDC estimation methods under uncertainty in the input parameters. Sensitivity indices of the input parameters in the selected empirical equations and the differential evolution model follow almost the same trend: mean flow velocity is the most sensitive input parameter, and prediction accuracy depends heavily on its value. In these models, the sensitivity index of shear velocity was negative, indicating an inverse effect on LDC estimation. Channel width and mean flow velocity have the highest sensitivity in the M5 model for narrow and wide streams, respectively. Based on the sensitivity indices, the efficiency of the empirical and data-driven models under different conditions, according to uncertainties in the input parameters, has been investigated. LDC estimates based on the Monte Carlo data showed that most LDC estimation models have high uncertainty for high LDC values.

1 Introduction

In recent years much attention has been paid to the environment, especially river and lake pollution. Rivers and streams commonly receive the discharge of sewage systems, which may cause pollutant levels to rise (Haghiabi 2016). Pollutant dispersion is a key element in water quality modeling (Antonopoulos et al. 2015), and the longitudinal dispersion coefficient (LDC) is an important factor in stream pollution modeling due to its effect on pollutant mixing intensity. The pollution is affected by advective and dispersive processes, and is dispersed longitudinally, transversely and vertically (Seo and Cheong 1998). Most of the experimental studies on the dispersion coefficient in streams are based on routing tracer concentration along the river (Atkinson and Davis 2000; Davis et al. 2000; Velísková et al. 2014; Disley et al. 2015; Parsaie and Haghiabi 2017). Direct estimation of the dispersion coefficient using tracers in rivers is very difficult, expensive and time-consuming. Mixing is a three-dimensional process near the pollution point source. After complete mixing far from the injection point, only longitudinal dispersion is used to describe the dispersion phenomenon (Chatila 1997; Velísková et al. 2014; Haghiabi 2017). Therefore, to predict longitudinal dispersion, researchers have developed different equations based on experimental and field measurements. These equations use hydraulic and geometric parameters such as channel width, mean flow velocity, shear velocity and flow depth (Alizadeh et al. 2017). Several researchers have used the ratio of channel width to flow depth (W/H) and the ratio of mean velocity to shear velocity (U/U_∗) as input parameters to estimate the LDC, based on their correlation with the dimensionless dispersion coefficient (K_x ∕ HU_∗) (Noori et al. 2017). Table 1 presents some empirical equations for LDC estimation.
These equations have been derived using different methods and, in some cases, different datasets, and their performance varies with stream conditions. In this regard, investigators have recently used data-driven models to estimate the LDC. Data-driven models, including support vector machine, M5 tree algorithm, differential evolution, genetic algorithms and genetic programming, have been used by Azamathulla and Wu (2011), Etemad-Shahidi and Taghipour (2012), Li et al. (2013), Sahay and Dutta (2009) and Sattar and Gharabaghi (2015), respectively. Because the LDC is used in devising water diversion strategies, designing treatment plants, intakes and outfalls, and studying the environment (Ho et al. 2002), an important step is the validation of LDC estimation models under different conditions. It should be noted that almost all empirical and data-driven models predict the longitudinal dispersion coefficient under simplifying assumptions, which could affect the accuracy of the model results (Sahin 2014). For accurate estimation of water quality parameters, uncertainty and sensitivity analysis must be performed along with water quality modeling (Nakhaei and Etemad-Shahidi 2012). Quantification of the error in water quality models could be used as a first step in risk assessment for water resources management and planning. The sources of uncertainty in water quality management can be classified as model-input, model-structure, model-parameter and measurement uncertainty (Radwan et al. 2004). Monte Carlo simulation is widely used for uncertainty analysis in hydrologic models (Mishra 2009). This method enables hydrologic modelers to study the effect of input parameter sets on the uncertainty of water quality parameters (Pasha and Lansey 2010).
Along with uncertainty analysis, sensitivity analysis helps modelers to evaluate the output range based on the range of the input parameters, which can be used to determine the most influential parameters (Nakhaei and Etemad-Shahidi 2012). Non-dimensional sensitivity coefficients are used by hydrological and environmental scientists to analyze multivariable models (McCuen 1974; Saxton 1975; Rana and Katerji 1998; Hupet and Vanclooster 2001; Gong et al. 2006).

Table 1 Empirical and data-driven models for estimating LDC

This study presents the performance of empirical equations and data-driven models using statistical analysis. The novelty of this study lies in the sensitivity analysis of the more accurate empirical and data-driven models, to determine the effect of each parameter on LDC estimation. The sensitivity indices show which parameter in each model plays the main role in LDC estimation. In addition, uncertainty analysis was performed on non-repeating random data series produced by Monte Carlo simulation to investigate the behavior of the empirical and data-driven models. This makes it possible to assess the performance of each model under uncertainty in the input parameters.

2 Theory and Previous Studies

Transverse shear velocity and transverse mixing reach equilibrium after a certain timescale (Taylor 1954). The simplified 1-D advection-dispersion equation for steady flow conditions has the following form (Etemad-Shahidi and Taghipour 2012):

$$ \frac{\partial C}{\partial t}+U\frac{\partial C}{\partial x}={K}_x\frac{\partial^2C}{\partial {x}^2} $$

(1)

where C is the cross-sectional average concentration (kg/m³), U is the mean flow velocity (m/s), x is the direction of the mean flow (m), t is the time (s), and K_x is the longitudinal dispersion coefficient (m²/s) in the flow direction. Based on Rutherford (1994), important features of tracer profiles in laboratory and river channels can be illustrated using Eq. (1).
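To make the role of K_x in Eq. (1) concrete, the following minimal sketch evaluates the well-known Gaussian solution of the 1-D advection-dispersion equation for an instantaneous release of mass M at x = 0 in a uniform channel. This solution and all numerical values (tracer mass, cross-sectional area, velocity, K_x) are illustrative assumptions and are not taken from the study's datasets.

```python
import math

def slug_concentration(x, t, mass, area, U, Kx):
    """Cross-sectional average concentration (kg/m^3) at distance x (m) and
    time t (s) after an instantaneous release of `mass` (kg) at x = 0, t = 0.

    Gaussian solution of Eq. (1) for constant U, Kx and cross-sectional area.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    spread = math.sqrt(4.0 * math.pi * Kx * t)
    return mass / (area * spread) * math.exp(-(x - U * t) ** 2 / (4.0 * Kx * t))

# Hypothetical reach: 10 kg of tracer, 20 m^2 section, U = 0.5 m/s, Kx = 30 m^2/s
mass, area, U, Kx = 10.0, 20.0, 0.5, 30.0
t = 3600.0                      # one hour after release
peak_x = U * t                  # the peak travels at the mean flow velocity
peak_c = slug_concentration(peak_x, t, mass, area, U, Kx)
```

A larger K_x lowers the peak and widens the cloud while the peak position U·t stays the same, which is why errors in the estimated LDC translate directly into errors in predicted peak concentration.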

The LDC in streams is affected by a range of parameters. The most important parameters are density, viscosity, channel width, flow depth, mean flow velocity, shear velocity, bed slope, bed roughness, horizontal stream curvature (i.e., sinuosity), and bed shape factor (Seo and Cheong 1998; Guymer 1998; Etemad-Shahidi and Taghipour 2012). Because some of these parameters are difficult to measure, researchers have generally used the hydraulic and geometric parameters that have important effects on the LDC: channel width W, flow depth H, flow velocity U and shear velocity U_∗. Based on the equilibrium between longitudinal shear velocity and vertical turbulent diffusion, Elder (1959) extended Taylor’s results for pipes to open channels and derived the following equation to estimate the LDC (Deng et al. 2001):

$$ {K}_x=5.93H{U}_{\ast } $$

(2)

where H and U_∗ represent the flow depth (m) and shear velocity (m/s), respectively.

Fischer (1967) suggested that the transverse profile of the velocity is more important than the vertical profile for dispersion in natural streams and developed the following integral relation for the dispersion coefficient in natural streams having large width to depth ratios (Sahay 2011):

$$ {K}_x=-\frac{1}{A}{\int}_0^W H{u}^{\prime }{\int}_0^y\frac{1}{\varepsilon_t H}{\int}_0^y H{u}^{\prime }\, dy\, dy\, dy $$

(3)

where A is the cross-sectional area (m²); y is the coordinate in the lateral direction (m); u^′ is the deviation of the local depth-averaged velocity from the cross-sectional mean velocity (m/s); W is the channel width (m); and ε_t is the transverse turbulent diffusion coefficient (m²/s). Due to the complexity of Eq. (3), Fischer (1967) developed the following simple and practical equation:

$$ {K}_x=0.011\left(\frac{U^2{W}^2}{H{U}_{\ast }}\right) $$

(4)

Seo and Cheong (1998) proposed an empirical expression based on the one-step method developed by Huber (1981), a robust regression method that gives reasonably good estimates even in the presence of moderately bad leverage points. Seo and Cheong (1998) used 59 sets of data from 26 U.S. streams to develop the following equation and showed its superiority over existing expressions:

$$ {K}_x=5.915{\left(\frac{W}{H}\right)}^{0.62}{\left(\frac{U}{U_{\ast }}\right)}^{1.428}\left(H{U}_{\ast}\right) $$

(5)

Deng et al. (2001) developed an analytical method based on Fischer’s triple integral expression for estimation of the LDC in rivers. They assumed that the uniform-flow formula is valid for local depth-averaged parameters. Their equation is theoretically based and clarifies the dispersion mechanism. According to Deng et al. (2001), the velocity is the most sensitive of all input parameters in Eq. (6); a change of 10% in this parameter causes significant variation in the LDC:

$$ {\displaystyle \begin{array}{l}{K}_x=\left(\frac{0.15}{8\ {\varepsilon}_{t_0}}\right){\left(\frac{W}{H}\right)}^{5/3}{\left(\frac{U}{U_{\ast }}\right)}^2\left(H{U}_{\ast}\right)\\ {}{\varepsilon}_{t_0}=0.145+\frac{1}{3520}\left(\frac{U}{U_{\ast }}\right){\left(\frac{W}{H}\right)}^{1.38}\end{array}} $$

(6)

Kashefipour and Falconer (2002) established an equation for predicting the LDC in natural channels using 81 sets of field data in the U.S., by relating this process through dimensional and regression analysis to the main hydraulic parameters such as river depth, width, velocity and shear velocity. Kashefipour and Falconer (2002) applied multiple regression between parameter combinations, and a best fit simple equation was derived, as follows:

$$ {K}_x=10.612\ (HU)\left(\frac{U}{U_{\ast }}\right) $$

(7)

Kashefipour and Falconer (2002) used a linear combination of Eq. (7) and Seo and Cheong’s (1998) formulation to develop Eq. (8), which led to a further improved equation for predicting the LDC in streams:

$$ {K}_x=\left[7.428+1.775{\left(\frac{W}{H}\right)}^{0.62}{\left(\frac{U_{\ast }}{U}\right)}^{0.572}\right](HU)\left(\frac{U}{U_{\ast }}\right) $$

(8)

Zeng and Huai (2014) showed that the product of water depth and cross-sectional mean flow velocity has a higher linear correlation with the LDC than the product of water depth and shear velocity. Therefore, with combination of the product of H and U and other two non-dimensional parameters, a new equation for longitudinal dispersion coefficient was proposed:

$$ {K}_x=5.4\ {\left(\frac{W}{H}\right)}^{0.7}{\left(\frac{U}{U_{\ast }}\right)}^{0.13} HU $$

(9)

Sahin (2014) proposed an equation based on dimensional and least squares analysis, using 128 field data sets measured in 41 rivers in the U.S. as follows:

$$ {K}_x=48\ {\left(\frac{U}{U_{\ast }}\right)}^{0.47}{R}_hU $$

(10)

where R_h is the hydraulic radius (m), which was calculated assuming a rectangular channel section due to the lack of data on cross-section shape (Sahin 2014). Disley et al. (2015) developed an equation to estimate the LDC using combined datasets from five steeper headwater streams and 24 milder and larger rivers. This equation relates the LDC to hydraulic and geometric parameters of the stream and was developed using multiple regression analysis:

$$ {K}_x=3.563\ {\left(\frac{U}{\sqrt{gH}}\right)}^{-0.4117}{\left(\frac{W}{H}\right)}^{0.6776}{\left(\frac{U}{U_{\ast }}\right)}^{1.0132}H{U}_{\ast } $$

(11)

where g is the gravitational acceleration (m/s²), so that U/√(gH) is the dimensionless Froude number.

Data-driven models have been widely used by researchers to estimate the LDC in streams. Sahay and Dutta (2009) developed an equation to estimate the LDC using the datasets of Deng et al. (2001) and a genetic algorithm:

$$ {K}_x=2{\left(\frac{W}{H}\right)}^{0.96}{\left(\frac{U}{U_{\ast }}\right)}^{1.25}\left(H{U}_{\ast}\right) $$

(12)

Etemad-Shahidi and Taghipour (2012) derived two interpretable equations to estimate the LDC using the M5 tree algorithm and 149 datasets from rivers around the world:

$$ \text{If } \frac{W}{H}\le 30.6:\quad {K}_x=15.49{\left(\frac{W}{H}\right)}^{0.78}{\left(\frac{U}{U_{\ast }}\right)}^{0.11}\left(H{U}_{\ast}\right) $$

(13-a)

$$ \text{If } \frac{W}{H}>30.6:\quad {K}_x=14.12{\left(\frac{W}{H}\right)}^{0.61}{\left(\frac{U}{U_{\ast }}\right)}^{0.85}\left(H{U}_{\ast}\right) $$

(13-b)

Table 1 summarizes selected equations and models for LDC estimation.
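As an illustration of how the closed-form models above are evaluated, the sketch below codes several of them (Eqs. 2, 4, 5, 8, 9, 12 and 13) as plain Python functions. The function names, the argument name `Us` for the shear velocity U_∗, and the example reach are hypothetical choices made for this sketch; only the coefficients come from the equations in this section.

```python
def elder(W, H, U, Us):                       # Eq. (2)
    return 5.93 * H * Us

def fischer(W, H, U, Us):                     # Eq. (4)
    return 0.011 * U**2 * W**2 / (H * Us)

def seo_cheong(W, H, U, Us):                  # Eq. (5)
    return 5.915 * (W / H)**0.62 * (U / Us)**1.428 * H * Us

def kashefipour_falconer(W, H, U, Us):        # Eq. (8)
    bracket = 7.428 + 1.775 * (W / H)**0.62 * (Us / U)**0.572
    return bracket * H * U * (U / Us)

def zeng_huai(W, H, U, Us):                   # Eq. (9)
    return 5.4 * (W / H)**0.7 * (U / Us)**0.13 * H * U

def sahay_dutta(W, H, U, Us):                 # Eq. (12)
    return 2.0 * (W / H)**0.96 * (U / Us)**1.25 * H * Us

def m5(W, H, U, Us):                          # Eqs. (13-a) and (13-b)
    if W / H <= 30.6:
        return 15.49 * (W / H)**0.78 * (U / Us)**0.11 * H * Us
    return 14.12 * (W / H)**0.61 * (U / Us)**0.85 * H * Us

MODELS = {
    "Elder (1959)": elder,
    "Fischer": fischer,
    "Seo and Cheong (1998)": seo_cheong,
    "Kashefipour and Falconer (2002)": kashefipour_falconer,
    "Zeng and Huai (2014)": zeng_huai,
    "Sahay and Dutta (2009)": sahay_dutta,
    "M5": m5,
}

# Hypothetical reach (SI units): W = 30 m, H = 1 m, U = 0.5 m/s, U* = 0.05 m/s
reach = dict(W=30.0, H=1.0, U=0.5, Us=0.05)
estimates = {name: model(**reach) for name, model in MODELS.items()}  # m^2/s
```

Giving every model the same four-argument signature, even when an argument is unused (for example W in Eq. 2), keeps the models interchangeable in sensitivity and Monte Carlo loops.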

3 Materials and Methods

3.1 Data and Statistical Analysis

A collection of distinct datasets measured in different streams was used in this study (Fischer 1968; Yotsukura et al. 1970; Godfrey and Frederick 1970; McQuivey and Keefer 1974; Nordin and Sabol 1974; Rutherford 1994; Graf 1995; Seo and Cheong 1998; Disley et al. 2015). The datasets contain geometric and hydraulic characteristics: channel width, flow depth, mean flow velocity, shear velocity and longitudinal dispersion coefficient (Appendix Table 8). Histograms of W, H, U, U_∗, K_x, W/H, and U/U_∗ are illustrated in Fig. 1. The histogram of W/H shows that the studied cases range from narrow rivers (W/H < 10) to very wide rivers (W/H > 100). The friction term in the form of U/U_∗ (Seo and Cheong 1998) can be considered as the hydrodynamic characteristic of the river bed (Etemad-Shahidi and Taghipour 2012). The statistical values of the parameters are presented in Table 2.

Table 2 Statistics of parameters used in this study

3.2 Sensitivity Analysis

Geometric and hydraulic characteristics such as channel width, flow depth, mean flow velocity and shear velocity may carry some uncertainty in their estimated values. Poor estimation procedures, tracer loss, or measurements made in the advective zone are examples of sources of such uncertainty in K_x values (Etemad-Shahidi and Taghipour 2012).

Sensitivity analysis was employed in order to identify which parameters have more influence on the dimensionless longitudinal dispersion coefficient. Model sensitivity is the rate of change in one factor as output with respect to change in another factor as input while the other parameters are kept constant (McCuen 1973), or how the variation in the output of a model (numerical or other) can be apportioned, qualitatively or quantitatively, to different sources of variation of input parameters (Saltelli et al. 2004).

A logical step in model development is the determination of the most important parameters affecting the model results. A ‘sensitivity analysis’ of these parameters can help future studies (Hamby 1994). The use of computer models in hydraulic engineering has increased, but this has not been accompanied by a corresponding increase in the sophistication of sensitivity analysis (Hall et al. 2009). Risk estimation by coupling hydrodynamic, structural reliability and impact models provides additional motivation for improved sensitivity analysis (Dawson et al. 2005). Without a systematic method for exploring the model response to input changes, model developers cannot gain reliable insight into the model behavior and interactions (Hall et al. 2009). Sensitivity analyses have been used to determine which parameter has the most effect on reducing output uncertainty, which parameters are negligible and can be eliminated from the final model, which inputs contribute most to output change, which parameters are strongly correlated with the output, and what results follow from changing each input parameter (Hamby 1994).

For the sensitivity analysis of the LDC models, the input parameters were set to their average values; one parameter was then changed within a defined domain, and this process was repeated for each of the remaining parameters. In this way, the output variability was estimated for small modifications of each input parameter, and the model sensitivity to the variation of each parameter was determined. A general LDC model can be defined as follows:

$$ {K}_x=f\left({V}_1,{V}_2,\dots {V}_n\right) $$

(14)

where V_i represents input parameters. Based on Beven (1979), the variation of K_x can be written as:

$$ {K}_x+\Delta {K}_x=f\left({V}_1+\Delta {V}_1,{V}_2+\Delta {V}_2,\dots, {V}_n+\Delta {V}_n\right) $$

(15)

Expanding Eq. (15) in a Taylor series, and ignoring second- and higher-order terms, leads to:

$$ \Delta {K}_x=\frac{\partial {K}_x}{\partial {V}_1}\Delta {V}_1+\frac{\partial {K}_x}{\partial {V}_2}\Delta {V}_2+\dots +\frac{\partial {K}_x}{\partial {V}_n}\Delta {V}_n $$

(16)

where the differentials $ \frac{\partial {K}_x}{\partial {V}_i} $ define the sensitivity of the estimated output to each model parameter. Let us set:

$$ {A}_S=\frac{\partial {K}_x}{\partial {V}_i}\approx \frac{\Delta {K}_x}{\Delta {V}_i} $$

(17)

where A_S represents the absolute sensitivity of the output estimate to each input parameter. Differential analysis is typically much more demanding to implement than other sensitivity methods and yet provides only comparable results. Using sensitivity analysis in partial derivative form is impractical due to its complexity (Gardner et al. 1981). In addition, this method is valid only for small changes in parameter values, so it becomes impractical when parameter variability takes realistic values (Hamby 1994).

The magnitudes of the parameters in the LDC equations differ; therefore, the absolute sensitivity values from Eq. (17) are unsuitable for comparing the sensitivity of different input parameters. Relative sensitivity values were therefore used for this comparison (Mount et al. 2013), in the form:

$$ {R}_s=\frac{\Delta {K}_x}{\Delta {V}_i}\frac{V_i}{K_x} $$

(18)
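As a worked illustration (not part of the original analysis), note that many of the LDC models in Section 2 are power laws in the inputs. For any model of that form, the differential limit of Eq. (18) reduces to the exponent of the parameter in question:

$$ {K}_x=a\prod_j{V}_j^{b_j}\kern1em \Rightarrow \kern1em \frac{\partial {K}_x}{\partial {V}_i}\frac{V_i}{K_x}={b}_i $$

For example, collecting the powers of each input in Eq. (9) gives

$$ {K}_x=5.4\ {W}^{0.7}{H}^{0.3}{U}^{1.13}{U}_{\ast}^{-0.13} $$

so the limiting relative sensitivities of Eq. (9) to W, H, U and U_∗ are 0.7, 0.3, 1.13 and −0.13, respectively. The negative exponent of U_∗ corresponds to an inverse effect of shear velocity on the estimated LDC. When a finite step ΔV_i is used instead of the derivative, the computed R_s differs slightly from these exponents.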

Relative changes or errors can be defined as in Saxton (1975):

$$ {R}_E=\frac{\Delta {K}_x}{K_x} $$

(19)

In Eq. (18), R_s is a dimensionless coefficient that expresses the fraction of a relative change in an input parameter that is transmitted to the relative change in the output. It may be regarded as the sensitivity coefficient. For example, a sensitivity coefficient of 0.2 means that a 10% change in the input parameter V_i would cause a 2% change in the LDC (ΔK_x/K_x) (Saxton 1975).
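The one-at-a-time calculation of Eqs. (18) and (19) can be sketched as follows, using Eq. (9) as the example model and a 10% step. The input values and the helper name `relative_sensitivity` are hypothetical and serve only to illustrate the procedure.

```python
def zeng_huai(W, H, U, Us):
    """Eq. (9); Us stands for the shear velocity U*."""
    return 5.4 * (W / H)**0.7 * (U / Us)**0.13 * H * U

def relative_sensitivity(model, params, name, step=0.1):
    """Return (R_s, R_E) for one input, all other inputs held constant.

    R_s follows Eq. (18) and R_E follows Eq. (19), with dV = step * V.
    """
    base = model(**params)
    dv = step * params[name]
    perturbed = dict(params)
    perturbed[name] = params[name] + dv
    dK = model(**perturbed) - base
    R_E = dK / base                          # relative error, Eq. (19)
    R_s = (dK / dv) * (params[name] / base)  # relative sensitivity, Eq. (18)
    return R_s, R_E

# Hypothetical input values (SI units), not the averages of Table 4
params = dict(W=50.0, H=1.5, U=0.6, Us=0.07)
table = {name: relative_sensitivity(zeng_huai, params, name) for name in params}
```

With this definition R_E equals R_s multiplied by the relative step, so for a 10% step the two columns differ by a factor of 0.1.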

LDC models are affected by four input parameters that vary widely in nature, and a large amount of real data is needed to investigate the performance of LDC estimation. The sensitivity and error analysis of the empirical and data-driven models was conducted for the mean values of the input and output parameters, on the assumption that the interaction between input parameters is negligible (Deng et al. 2001). The performance of the selected models was evaluated using two approaches: changing each input individually, and changing all inputs together, in both cases with random, non-repeating datasets. In addition, a global sensitivity analysis based on Saltelli et al. (2008) was performed to investigate the interaction between input parameters, and the first-order sensitivity index (S_i) was estimated for each input parameter. If the sum of all S_i equals 1, the model is additive and there is no interaction between input parameters (Saltelli et al. 2008).
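The first-order index can be approximated by Monte Carlo sampling. The sketch below uses a standard pick-and-freeze estimator with two independent sample matrices A and B; the study does not state which estimator, sampling distribution or sample size was used, so the uniform ±20% ranges, the sample size, the mean values and the choice of Eq. (9) as the model are all assumptions of this example.

```python
import random

def zeng_huai(W, H, U, Us):
    """Eq. (9); Us stands for the shear velocity U*."""
    return 5.4 * (W / H)**0.7 * (U / Us)**0.13 * H * U

def first_order_indices(model, means, spread=0.2, n=10000, seed=0):
    """Estimate first-order indices S_i with a pick-and-freeze scheme.

    Inputs are sampled independently and uniformly within +/- spread of
    their mean values (an assumption of this sketch).
    """
    rng = random.Random(seed)
    names = list(means)

    def sample():
        return {k: means[k] * rng.uniform(1 - spread, 1 + spread) for k in names}

    A = [sample() for _ in range(n)]
    B = [sample() for _ in range(n)]
    fA = [model(**a) for a in A]
    fB = [model(**b) for b in B]
    mean = sum(fA + fB) / (2 * n)
    var = sum((y - mean) ** 2 for y in fA + fB) / (2 * n - 1)

    indices = {}
    for k in names:
        acc = 0.0
        for a, b, ya, yb in zip(A, B, fA, fB):
            mixed = dict(a)
            mixed[k] = b[k]          # row of A with input k taken from B
            acc += (yb - mean) * (model(**mixed) - ya)
        indices[k] = acc / n / var   # V_i / V(Y)
    return indices

# Hypothetical mean inputs (SI units)
means = dict(W=50.0, H=1.5, U=0.6, Us=0.07)
S = first_order_indices(zeng_huai, means)
S_total = sum(S.values())
```

A value of `S_total` close to 1 indicates a nearly additive response over the sampled range, which is the criterion described above.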

In this study, for each input parameter, 100 random, non-repeating datasets were produced within ±10% and ±20% of each input parameter for each available data series. To investigate the effect of all input parameters on LDC estimation, further random data series were produced based on ±10% and ±20% changes in all parameters. These datasets were used with the selected models, and the minimum and maximum LDC estimates for each data series of every dataset were derived to analyze the performance of the models and to construct the uncertainty curves. Note that, for the sensitivity and uncertainty analysis of each individual input parameter, the other input parameters were kept constant at their average values.
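A minimal sketch of this perturbation scheme is given below for a single data record, using Eq. (8) as the model. The uniform sampling distribution, the random seed, the function name `perturbation_envelope` and the example record are assumptions; the study specifies only the ±10% and ±20% ranges and the 100 non-repeating draws.

```python
import random

def kashefipour_falconer(W, H, U, Us):
    """Eq. (8); Us stands for the shear velocity U*."""
    bracket = 7.428 + 1.775 * (W / H)**0.62 * (Us / U)**0.572
    return bracket * H * U * (U / Us)

def perturbation_envelope(model, record, spread, n=100, seed=0, vary=None):
    """Return (min, max) of the model output over n non-repeating random
    perturbations within +/- spread of the record's values.

    `vary` lists the inputs to perturb; by default all inputs are perturbed.
    """
    rng = random.Random(seed)
    names = list(vary) if vary else list(record)
    seen = set()
    outputs = []
    while len(outputs) < n:
        factors = tuple(rng.uniform(1 - spread, 1 + spread) for _ in names)
        if factors in seen:          # discard repeated draws
            continue
        seen.add(factors)
        trial = dict(record)
        for k, f in zip(names, factors):
            trial[k] = record[k] * f
        outputs.append(model(**trial))
    return min(outputs), max(outputs)

# Hypothetical record (SI units)
record = dict(W=50.0, H=1.5, U=0.6, Us=0.07)
lo10, hi10 = perturbation_envelope(kashefipour_falconer, record, 0.10)
lo20, hi20 = perturbation_envelope(kashefipour_falconer, record, 0.20)
only_U = perturbation_envelope(kashefipour_falconer, record, 0.20, vary=["U"])
```

Repeating this for every record and plotting the minimum and maximum against the records sorted by LDC gives an envelope of the kind used for the uncertainty curves.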

3.3 Model Validation

The performance of the LDC models was evaluated using statistical measures: the mean absolute error (MAE), the root mean square error (RMSE), the discrepancy ratio (DR) and the related accuracy. DR was defined by White et al. (1973) to evaluate the difference between measured and predicted values. If DR = 0, the predicted and measured values of the dispersion coefficient are identical; the model overestimates the measured dispersion coefficient when DR > 0 and underestimates it when DR < 0. Accuracy is defined as the proportion of data points with DR between −0.3 and 0.3 in the total number of data points (Seo and Cheong 1998):

$$ \mathrm{MAE}=\frac{1}{N}\sum \left|D{R}_i\right| $$

(20)

$$ RMSE=\sqrt{\frac{1}{N}\sum {\left(D{R}_i\right)}^2} $$

(21)

$$ DR=\log \frac{K_{x_p}}{K_{x_m}} $$

(22)

where $ {K}_{x_p} $ and $ {K}_{x_m} $ are the predicted and measured LDC, respectively.
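Eqs. (20) to (22) and the accuracy measure can be computed as in the sketch below. A base-10 logarithm and an inclusive ±0.3 band are assumed here, and the predicted and measured values are made-up numbers chosen so that the results are easy to check by hand.

```python
import math

def discrepancy_ratios(predicted, measured):
    """DR_i = log10(K_xp / K_xm), Eq. (22); base 10 is an assumption."""
    return [math.log10(p / m) for p, m in zip(predicted, measured)]

def mae(dr):
    """Mean absolute error of the discrepancy ratios, Eq. (20)."""
    return sum(abs(d) for d in dr) / len(dr)

def rmse(dr):
    """Root mean square error of the discrepancy ratios, Eq. (21)."""
    return math.sqrt(sum(d * d for d in dr) / len(dr))

def accuracy(dr, band=0.3):
    """Percentage of cases with -band <= DR <= band."""
    return 100.0 * sum(1 for d in dr if -band <= d <= band) / len(dr)

# Made-up values (m^2/s): exact, 10x over, 10x under, 1.5x over
predicted = [10.0, 100.0, 1.0, 15.0]
measured = [10.0, 10.0, 10.0, 10.0]
dr = discrepancy_ratios(predicted, measured)
summary = {"MAE": mae(dr), "RMSE": rmse(dr), "accuracy_%": accuracy(dr)}
```

Because DR is logarithmic, a tenfold overestimate and a tenfold underestimate contribute equally to MAE and RMSE, and the ±0.3 band corresponds to predictions within roughly a factor of two of the measurement.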

4 Results and Discussion

Sensitivity analysis, in addition to statistical analysis, helps researchers to understand the limitations and advantages of LDC models. The statistical measures MAE, RMSE and DR for the empirical and data-driven models are given in Table 3. Histograms of the DR values are illustrated in Fig. 2 for better comparison between the models.

Table 3 Comparison of the performance of the various models

Table 3 shows that the Elder (1959) equation has the maximum error and minimum accuracy. This equation is suitable for streams with no transverse shear, and its poor accuracy illustrates the importance of transverse variation (Etemad-Shahidi and Taghipour 2012). About 98% of its DR values are below −0.3, which demonstrates that the Elder equation underestimates the LDC. The McQuivey and Keefer (1974) model, with an RMSE of 2.04 and an accuracy of 9.15%, generally overestimates the LDC in streams, with 89% of DR values above 0.3. The error criteria for Fischer (1967) are lower than those for Elder (1959) and McQuivey and Keefer (1974), and its accuracy is higher. Sahin (2014) has the highest accuracy among all empirical models, followed by Zeng and Huai (2014), Liu (1977) and Kashefipour and Falconer (2002). The Disley et al. (2015), Seo and Cheong (1998) and Deng et al. (2001) models, with accuracies of 48.17%, 46.34% and 45.12%, respectively, give relatively accurate estimates of the LDC. Zeng and Huai (2014) has the lowest RMSE among all empirical formulas. The estimation error of the Kashefipour and Falconer (2002) model is larger than that of some of the other empirical models, but the symmetry between its under- and overestimates makes this model suitable for LDC estimation (Etemad-Shahidi and Taghipour 2012). For Liu (1977), Seo and Cheong (1998) and Deng et al. (2001), overpredicted cases outnumber underpredicted cases by factors of 2.15, 2.14 and 1.57, respectively. For Kashefipour and Falconer (2002), however, the numbers of overpredicted and underpredicted cases are equal, which improves the performance of this formula in LDC estimation. This result is consistent with the findings of Etemad-Shahidi and Taghipour (2012).

The genetic algorithm has the lowest accuracy among the data-driven models; therefore, this model was excluded from the sensitivity analysis. Based on Table 3, the M5 model has the highest accuracy and lowest RMSE among all LDC estimation models. Finally, based on the statistical analysis, three empirical equations (Kashefipour and Falconer 2002; Sahin 2014; Zeng and Huai 2014) and three data-driven models (M5, GE and DE) were selected for sensitivity analysis.

As mentioned above, model sensitivity is the rate of change in the LDC with respect to a change in one input parameter while the other parameters are kept constant; in other words, to investigate the direct effect of one parameter on the LDC, the effect of the other parameters is removed by keeping their values constant. The average values of the input parameters for LDC estimation were calculated from the existing datasets and are presented in Table 4.

Table 4 Average value of input parameters used for LDC estimation

The results of the global sensitivity analysis based on Saltelli et al. (2008) are presented in Table 5, which lists the first-order sensitivity index (S_i) estimated for all selected models. The sum of S_i for the LDC models is close to 1, which implies that these models are nearly additive, with only weak interaction between input parameters.

Table 5 First-order sensitivity index (S_i) of input parameters for selected models

In this study, ΔV_i = 0.1V_i was used to estimate the relative sensitivity coefficient and the relative error. The sensitivity analysis of the selected empirical and data-driven models is presented in Table 6. The change in the LDC for ±10% and ±20% changes in each input parameter, with the other parameters held constant, is illustrated in Fig. 3. For the M5 model, which contains two equations, the dataset was divided into two domains according to the model criterion, and each domain was used with the corresponding equation. R_s is one of the most important sensitivity indicators for multivariable models: parameters with a large R_s have a large effect on the LDC. The error in the LDC caused by changing each parameter is given by R_E (Table 6).

Table 6 Sensitivity analysis of selected empirical and data driven models

Mean flow velocity has the maximum R_s in the Kashefipour and Falconer (2002) equation: a 10% increase in velocity increases the LDC by about 18%, as shown in Table 6. Shear velocity has an inverse effect on the LDC, where a 10% increase in shear velocity leads to a decrease of about 6.7% in the LDC. The effect of input parameter changes on the LDC computed by the Kashefipour and Falconer (2002) equation is presented in Fig. 3a. According to Table 6, mean flow velocity and channel width have the highest and lowest effect on the LDC, respectively, in the Sahin (2014) model. Fig. 3b shows that channel width has no influence on the LDC in this model. In the Zeng and Huai (2014) model, mean flow velocity, channel width and flow depth have the highest R_s values, in that order (Table 6; Fig. 3c). This model has the lowest sensitivity to shear velocity among all empirical and data-driven models.

The M5 algorithm produced two piecewise equations for LDC estimation. For this reason, the sensitivity analysis was performed separately for the two nonlinear equations of this model. The splitting value for W/H is approximately 30, close to the value obtained by Papadimitrakis and Orphanos (2004). In narrow rivers with W/H ≤ 30.6, shear velocity and channel width are more important than flow depth and velocity; therefore W/H is more important than U/U_∗ for LDC estimation. In wider rivers, where W/H > 30.6, mean flow velocity has the highest value of R_s; hence U/U_∗ has the main effect on the LDC (Table 6). A possible interpretation is that K_x may be less influenced by the W/H ratio in very wide rivers than in narrow rivers. This result is in agreement with the findings of Papadimitrakis and Orphanos (2004) and Etemad-Shahidi and Taghipour (2012). In addition, Fig. 3d, e and g for the M5 and GE models show that the shear velocity has a direct effect on the LDC, which is not consistent with the empirical equations. The DE model developed by Li et al. (2013) behaves similarly to the empirical models, with the mean flow velocity having the highest effect and the shear velocity an inverse effect on LDC estimation (Table 6; Fig. 3f). It was also found that in the GE model, the most influential parameters on the LDC were the flow depth, the mean flow velocity, the channel width and the shear velocity, in descending order of importance. The shear velocity in the GE model has an uncommon effect on the LDC, as shown in Fig. 3g, where an increase and a decrease of this parameter have the same impact on the LDC. It should be noted that the models developed by GE have a complicated structure, which makes them harder to interpret than the other models.

Table 7 and Fig. 4 present the R_s values of the selected empirical and data-driven models for better comparison. Mean flow velocity is the most sensitive parameter in the empirical equations and in some of the data-driven models (M5 Eq. 13-b, and DE), which is consistent with the findings of Deng et al. (2001) and Haghiabi (2017). Therefore, the performance of these formulas depends heavily on the velocity value. Comparison of the two equations developed by the M5 model shows that the role of mean flow velocity is more pronounced in relatively wide rivers than in narrow rivers, which is in agreement with the findings of Rutherford (1994). In addition, shear velocity is the most sensitive of the four input parameters in narrow streams. This is consistent with Tayfur and Singh (2005), who stated that an ANN model can yield satisfactory predictions of the LDC in narrow streams if the shear velocity is used as the only input parameter.

Table 7 Relative sensitivity (R_s) of all models

Table 7 compares the models parameter by parameter. Sensitivity to channel width is highest in M5 Eq. (13-a), while Kashefipour and Falconer (2002) has the least sensitivity to channel width. Flow depth has the most impact on the Sahin (2014) formula, while M5 Eq. (13-a) has the least sensitivity to flow depth among all LDC estimation models. Velocity has the highest effect on Kashefipour and Falconer (2002) and the lowest effect on M5 Eq. (13-a). M5 Eq. (13-a) and Zeng and Huai (2014) have the highest and lowest sensitivity to shear velocity, respectively.

Uncertainty curves for the empirical and data-driven models are presented in Figs. 5, 6, 7, 8, 9 and 10. They are based on Monte Carlo simulation, which was used to produce new datasets with ±10% and ±20% changes in each input parameter. For interpretation of the figures, all data were arranged in descending order of LDC value.

According to the Monte Carlo simulation (Figs. 5, 6, 7, 8, 9 and 10), the Zeng and Huai (2014) equation had the lowest sensitivity to parameter changes and showed a smooth curve. The uncertainty of this equation, for all input parameters, was negligible in comparison with the other models. The two other empirical models, those of Kashefipour and Falconer (2002) and Sahin (2014), showed high uncertainty for higher LDC values. For high LDC values, the LDC estimated with ±20% changes in all input parameters exceeded the LDC calculated from the original data by about 80% and 60% for Kashefipour and Falconer (2002) and Sahin (2014), respectively. This was also observed for the DE and GE models. The M5 models are very sensitive to parameter changes, as shown in Fig. 8. The W/H ratio changes with variation in flow depth and channel width, and in some cases the stream switches from narrow to wide or vice versa. In such cases, the other equation of the M5 model is used for the new dataset, which produces a large difference between the LDC estimated for the original dataset and that estimated for the data generated from it by the Monte Carlo simulation (a sudden jump). The GE model has a non-smooth uncertainty curve for high LDC values due to the model’s complexity.
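The sudden jump of the M5 model can be reproduced by evaluating Eqs. (13-a) and (13-b) on either side of the W/H = 30.6 threshold, as in the sketch below. The flow depth, velocity and shear velocity are hypothetical values, so the size of the jump shown here is illustrative only.

```python
def m5(W, H, U, Us):
    """Piecewise M5 model, Eqs. (13-a) and (13-b); Us is the shear velocity U*."""
    if W / H <= 30.6:
        return 15.49 * (W / H)**0.78 * (U / Us)**0.11 * H * Us
    return 14.12 * (W / H)**0.61 * (U / Us)**0.85 * H * Us

# Hypothetical stream close to the threshold (SI units), with U/U* = 10
H, U, Us = 1.0, 0.5, 0.05
below = m5(30.5 * H, H, U, Us)   # W/H = 30.5, uses Eq. (13-a)
above = m5(30.7 * H, H, U, Us)   # W/H = 30.7, uses Eq. (13-b)
jump_ratio = above / below       # change caused by a <1% change in width
```

For these inputs a change in width of less than 1% more than doubles the estimated LDC. The size and direction of the jump depend on U/U_∗, because the two branches have different exponents on that ratio.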

The results of the uncertainty analysis based on Monte Carlo simulation showed that the Zeng and Huai (2014) equation exhibits less uncertainty under input parameter changes and is more reliable for LDC estimation than the other models.

5 Conclusions

Because measuring the longitudinal dispersion coefficient is complex and tracer studies are time-consuming, many researchers have developed empirical and data-driven models to estimate it for use in mathematical water quality models. Based on statistical analysis, the M5 algorithm has the best accuracy and the least computational error among the studied empirical and data-driven models. Sensitivity analysis of the selected models showed that the Kashefipour and Falconer (2002) equation is the least sensitive to channel width, so it could be used for rivers with variable width or where width is measured with uncertainty. In such cases, uncertainty in this parameter would have the least effect on LDC estimation. M5-Eq. (13-a), DE and Zeng and Huai (2014) may be used for conditions with fluctuations in flow depth. M5 is also the least sensitive to velocity for narrow rivers, which makes this model suitable for narrow streams with fluctuations in depth and velocity, such as meandering rivers. The results of the Monte Carlo simulation showed that the uncertainty of the Kashefipour and Falconer (2002), Sahin (2014), DE and GE models is very high for high LDC values. The M5 model results show some sudden jumps for W/H ratios around 30.6. It seems that, when a perturbed W/H ratio crosses this threshold (W/H = 30.6), the M5 model is forced to use different equations (Eqs. 13) for the original and the new dataset, which creates a significant difference between the LDC estimated for the two datasets. Some jumps also occurred in the GE model for high LDC values due to the complexity of its equation. These jumps could have a negative effect on the performance of LDC estimation. However, the available measured data form a relatively small dataset for properly describing nonlinear processes. The effect of key factors will be better captured by empirical models as measurements of longitudinal dispersion continue in a wide range of streams (Gharabaghi and Sattar 2017).

References

Alizadeh MJ, Ahmadyar D, Afghantoloee A (2017) Improvement on the existing equations for predicting longitudinal dispersion coefficient. Water Resour Manag 31:1777–1794
Antonopoulos VZ, Georgiou PE, Antonopoulos ZV (2015) Dispersion coefficient prediction using empirical models and ANNs. Environmental Processes 2:379–394. https://doi.org/10.1007/s40710-015-0074-6
Atkinson TC, Davis PM (2000) Longitudinal dispersion in natural channels: 1. experimental results from the river Severn, U.K. Hydrol Earth Syst Sci Discuss 4:345–353
Azamathulla HM, Wu F-C (2011) Support vector machine approach for longitudinal dispersion coefficients in natural streams. Appl Soft Comput 11:2902–2905
Beven K (1979) A sensitivity analysis of the Penman-Monteith actual evapotranspiration estimates. J Hydrol 44:169–190. https://doi.org/10.1016/0022-1694(79)90130-6
Chatila GJ (1997) Modeling of pollutant transfer in compound open channels. PhD dissertation. University of Ottawa, Ontario
Davis PM, Atkinson TC, Wigley TML (2000) Longitudinal dispersion in natural channels: 2. the roles of shear flow dispersion and dead zones in the river Severn, U.K. Hydrol Earth Syst Sci Discuss 4:355–371
Dawson R, Hall J, Sayers P, Bates P, Rosu C (2005) Sampling-based flood risk analysis for fluvial dike systems. Stoch Env Res Risk A 19:388–402
Deng Z-Q, Singh VP, Bengtsson L (2001) Longitudinal dispersion coefficient in straight rivers. J Hydraul Eng 127:919–927. https://doi.org/10.1061/(ASCE)0733-9429(2001)127:11(919)
Disley T, Gharabaghi B, Mahboubi AA, McBean EA (2015) Predictive equation for longitudinal dispersion coefficient. Hydrol Process 29:161–172. https://doi.org/10.1002/hyp.10139
Elder JW (1959) The dispersion of marked fluid in turbulent shear flow. J Fluid Mech 5:544–560. https://doi.org/10.1017/S0022112059000374
Etemad-Shahidi A, Taghipour M (2012) Predicting longitudinal dispersion coefficient in natural streams using M5' model tree. J Hydraul Eng 138:542–554
Fischer HB (1967) The mechanics of dispersion in natural streams. J Hydraul Div 93(6):187–216
Fischer HB (1968) Dispersion predictions in natural streams. J Sanit Eng Div 94:927–944
Gardner RH, O'Neill RV, Mankin JB, Carney JH (1981) A comparison of sensitivity analysis and error analysis based on a stream ecosystem model. Ecol Model 12(3):173–190
Gharabaghi B, Sattar AMA (2017) Empirical models for longitudinal dispersion coefficient in natural streams. J Hydrol. https://doi.org/10.1016/j.jhydrol.2017.01.022
Godfrey RG, Frederick BJ (1970) Stream dispersion at selected sites, US Government printing office, Washington
Gong L, Xu C-Y, Chen D, Halldin S, Chen YD (2006) Sensitivity of the Penman–Monteith reference evapotranspiration to key climatic variables in the Changjiang (Yangtze River) basin. J Hydrol 329:620–629. https://doi.org/10.1016/j.jhydrol.2006.03.027
Graf JB (1995) Measured and predicted velocity and longitudinal dispersion at steady and unsteady flow, Colorado River, Glen Canyon Dam to Lake Mead. J Am Water Resour Assoc 31:265–281. https://doi.org/10.1111/j.1752-1688.1995.tb03379.x
Guymer I (1998) Longitudinal dispersion in sinuous channel with changes in shape. J Hydraul Eng 124:33–40. https://doi.org/10.1061/(ASCE)0733-9429(1998)124:1(33)
Haghiabi AH (2016) Prediction of longitudinal dispersion coefficient using multivariate adaptive regression splines. J Earth Syst Sci 125:985–995
Haghiabi AH (2017) Modeling river mixing mechanism using data driven model. Water Resour Manag 31:811–824
Hall JW, Boyce SA, Wang Y, Dawson RJ, Tarantola S, Saltelli A (2009) Sensitivity analysis for hydraulic models. J Hydraul Eng 135:959–969
Hamby DM (1994) A review of techniques for parameter sensitivity analysis of environmental models. Environ Monit Assess 32:135–154. https://doi.org/10.1007/bf00547132
Ho DT, Schlosser P, Caplow T (2002) Determination of longitudinal dispersion coefficient and net advection in the tidal Hudson river with a large-scale, high resolution SF6 tracer release experiment. Environ Sci Technol 36:3234–3241
Huber PJ (1981) Robust Statistics. John Wiley & Sons, Inc., New York
Hupet F, Vanclooster M (2001) Effect of the sampling frequency of meteorological variables on the estimation of the reference evapotranspiration. J Hydrol 243:192–204
Kashefipour SM, Falconer RA (2002) Longitudinal dispersion coefficients in natural channels. Water Res 36:1596–1608. https://doi.org/10.1016/S0043-1354(01)00351-7
Li X, Liu H, Yin M (2013) Differential evolution for prediction of longitudinal dispersion coefficients in natural streams. Water Resour Manag 27:5245–5260. https://doi.org/10.1007/s11269-013-0465-2
Liu H (1977) Predicting dispersion coefficient of streams. J Environ Eng Div 103:59–69
McCuen RH (1973) The role of sensitivity analysis in hydrologic modeling. J Hydrol 18:37–53. https://doi.org/10.1016/0022-1694(73)90024-3
McCuen RH (1974) A sensitivity and error analysis of procedures used for estimating evaporation. J Am Water Resour Assoc 10:486–497
McQuivey RS, Keefer TN (1974) Simple method for predicting dispersion in streams. J Environ Eng Div 100:997–1011
Mishra S (2009) Uncertainty and sensitivity analysis techniques for hydrologic modeling. J Hydroinf 11:282–296
Mount NJ, Dawson CW, Abrahart RJ (2013) Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework. Hydrol Earth Syst Sci 17:2827–2843
Nakhaei N, Etemad-Shahidi A (2012) Applying Monte Carlo and classification tree sensitivity analysis to the Zayandehrood River. J Hydroinf 14:236–250
Noori R, Ghiasi B, Sheikhian H, Adamowski JF (2017) Estimation of the dispersion coefficient in natural rivers using a granular computing model. J Hydraul Eng 143(5):04017001
Nordin CF, Sabol GV (1974) Empirical data on longitudinal dispersion in rivers. US Geological Survey, Water Resources Investigations, Report No. 74-20, p 332
Papadimitrakis I, Orphanos I (2004) Longitudinal dispersion characteristics of rivers and natural streams in Greece. Water Air Soil Poll Focus 4:289–305. https://doi.org/10.1023/b:wafo.0000044806.98243.97
Parsaie A, Haghiabi AH (2017) Computational modeling of pollution transmission in rivers. Appl Water Sci 7:1213–1222. https://doi.org/10.1007/s13201-015-0319-6
Pasha M, Lansey K (2010) Effect of parameter uncertainty on water quality predictions in distribution systems-case study. J Hydroinf 12:1–21
Radwan M, Willems P, Berlamont J (2004) Sensitivity and uncertainty analysis for river quality modelling. J Hydroinf 6:83–99
Rana G, Katerji N (1998) A measurement based sensitivity analysis of the Penman-Monteith actual evapotranspiration model for crops of different height and in contrasting water status. Theor Appl Climatol 60:141–149. https://doi.org/10.1007/s007040050039
Rutherford JC (1994) River Mixing. John Wiley & Sons, Ltd, Chichester
Sahay RR (2011) Prediction of longitudinal dispersion coefficients in natural rivers using artificial neural network. Environ Fluid Mech 11:247–261. https://doi.org/10.1007/s10652-010-9175-y
Sahay RR, Dutta S (2009) Prediction of longitudinal dispersion coefficients in natural rivers using genetic algorithm. Hydrol Res 40:544–552. https://doi.org/10.2166/nh.2009.014
Sahin S (2014) An empirical approach for determining longitudinal dispersion coefficients in rivers. Environmental Processes 1:277–285. https://doi.org/10.1007/s40710-014-0018-6
Saltelli A, Tarantola S, Campolongo F, Ratto M (2004) Sensitivity analysis in diagnostic modelling: Monte Carlo filtering and regionalised sensitivity analysis, Bayesian uncertainty estimation and global sensitivity analysis. In: Sensitivity Analysis in Practice. John Wiley & Sons, Ltd, pp 151–192. https://doi.org/10.1002/0470870958.ch6
Saltelli A, Ratto M, Andres T, Campolongo F, Cariboni J, Gatelli D, Saisana M, Tarantola S (2008) Global sensitivity analysis: the primer. John Wiley & Sons, Ltd, Chichester
Sattar AMA, Gharabaghi B (2015) Gene expression models for prediction of longitudinal dispersion coefficient in streams. J Hydrol 524:587–596. https://doi.org/10.1016/j.jhydrol.2015.03.016
Saxton KE (1975) Sensitivity analyses of the combination evapotranspiration equation. Agric Meteorol 15:343–353. https://doi.org/10.1016/0002-1571(75)90031-X
Seo IW, Cheong TS (1998) Predicting longitudinal dispersion coefficient in natural streams. J Hydraul Eng 124:25–32. https://doi.org/10.1061/(ASCE)0733-9429(1998)124:1(25)
Tayfur G, Singh VP (2005) Predicting longitudinal dispersion coefficient in natural streams by artificial neural network. J Hydraul Eng 131(11):991–1000
Taylor G (1954) The dispersion of matter in turbulent flow through a pipe. Proc R Soc Lond A Mat Sci 223:446–468. https://doi.org/10.1098/rspa.1954.0130
Velísková Y, Sokáč M, Halaj P, Koczka Bara M, Dulovičová R, Schügerl R (2014) Pollutant spreading in a small stream: a case study in Mala Nitra canal in Slovakia. Environmental Processes 1:265–276
White W, Milli H, Crabbe A (1973) Sediment transport: an appraisal of available methods, Vol. 2: Performance of theoretical methods when applied to flume and field data. Hydr Res Station Rep. N. IT 119, Wallingford
Yotsukura N, Fischer HB, Sayre WW (1970) Measurement of mixing characteristics of the Missouri River between Sioux City, Iowa, and Plattsmouth, Nebraska. No. 1899-G. USGPO
Zeng Y, Huai W (2014) Estimation of longitudinal dispersion coefficient in rivers. J Hydro Environ Res 8:2–8

Author information

Authors and Affiliations

Department of Water Engineering, Agricultural Sciences and Natural Resources University of Khuzestan, Ahvaz, Iran
Hosein Nezaratian & Javad Zahiri
Department of Water Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
Seyed Mahmood Kashefipour

Corresponding author

Correspondence to Javad Zahiri.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Appendix

Table 8 Geometric, hydraulic and dispersion coefficient datasets used in this study

About this article

Cite this article

Nezaratian, H., Zahiri, J. & Kashefipour, S.M. Sensitivity Analysis of Empirical and Data-Driven Models on Longitudinal Dispersion Coefficient in Streams. Environ. Process. 5, 833–858 (2018). https://doi.org/10.1007/s40710-018-0334-3

Received: 25 May 2018
Accepted: 27 September 2018
Published: 14 November 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s40710-018-0334-3
