Abstract
Accurate prediction of the chemical constituents in major river systems is a necessary task for water quality management, aquatic life well-being and the overall healthcare planning of river systems. In this study, the capability of a newly proposed hybrid forecasting model based on the firefly algorithm (FFA) as a metaheuristic optimizer, integrated with the multilayer perceptron (MLP-FFA), is investigated for the prediction of monthly water quality in Langat River basin, Malaysia. The predictive ability of the MLP-FFA model is assessed against the MLP-based model. To validate the proposed MLP-FFA model, monthly water quality data over a 10-year duration (2001–2010) for two different hydrological stations (1L04 and 1L05) provided by the Irrigation and Drainage Ministry of Malaysia are used to predict the biochemical oxygen demand (BOD) and dissolved oxygen (DO). The input variables are the chemical oxygen demand (COD), total phosphate (PO4), total solids, potassium (K), sodium (Na), chloride (Cl), electrical conductivity (EC), pH and ammonia nitrogen (NH4-N). The proposed hybrid model is then evaluated in accordance with statistical metrics such as the correlation coefficient (r), root-mean-square error, % root-mean-square error and Willmott’s index of agreement. Analysis of the results shows that MLP-FFA outperforms the equivalent MLP model. Also, in this research, the uncertainty of an MLP neural network model is analyzed in relation to the predictive ability of the MLP model. To assess the uncertainties within the MLP model, the percentage of observed data bracketed by 95 percent predicted uncertainties (95PPU) and the band width of 95 percent confidence intervals (d-factors) are selected. The effect of input variables on BOD and DO prediction is also investigated through sensitivity analysis. About 77.7% and 72.2% of the data for BOD and 72.2% and 91.6% of the data for DO are bracketed by the 95PPU at the 1L04 and 1L05 stations, respectively. The d-factors are 1.648 and 2.269 for BOD and 1.892 and 3.480 for DO at the 1L04 and 1L05 stations, respectively. Based on the values at both stations for the 95PPU and d-factor, it is concluded that the neural network model has an acceptably low degree of uncertainty when applied to BOD and DO simulations. The findings of this study can have important implications for error assessment in artificial intelligence-based predictive models applied for water resources management and the assessment of the overall health in major river systems.
Introduction
Pollution control in river systems via the modeling of water quality parameters is one of the primary components that warrant special focus in management planning. In some cases, pollution in an aquatic system is evaluated by two indices, namely biochemical oxygen demand (BOD) and dissolved oxygen (DO). BOD, an important parameter that must be estimated accurately, reflects the materials present that can be oxidized by chemical processes and aerobic organisms, as well as the abundance and activity of the oxidizing organisms, and is deemed one of the primary criteria required for any aquatic system. Since an inverse relationship exists between BOD and DO, a higher value of BOD is symptomatic of a deficiency of dissolved oxygen. Moreover, the concentration of DO in water should be known before measuring the value of BOD. Hence, both criteria should be determined simultaneously. However, determining these values under laboratory conditions is both time-consuming and costly. This clearly warrants the application of indirect methods to predict these values (Singh et al. 2009).
In recent years, there have been a number of statistical and deterministic models developed for modeling water quality; however, most of the existing models for water quality parameters are very complex and require a significant amount of field data to support the analysis (Chen et al. 2003; Kurunç et al. 2005). Many statistical-based water quality models assume that the relationship between response variable and prediction variable is linear and distributed normally. However, as water quality can be affected by many factors, the traditional data processing methods are no longer efficient enough for solving the problem, as such factors encompass a complicated nonlinear relation to the variables of water quality forecast (Wu et al. 2000; Xiang et al. 2006). Therefore, utilizing statistical approaches usually does not possess high precision. In recent decades, promising results have been reported by several studies that investigated water quality modeling problems using artificial intelligence (AI) techniques (Diamantopoulou et al. 2005; Sengorur et al. 2006; Palani et al. 2008; Hore et al. 2008; Najah et al. 2009; Singh et al. 2009; Dogan et al. 2009; Najah et al. 2011; Kim and Seo 2015; Sarkar and Pandey 2015; Salami and Ehteshami 2015). In most of these studies, the monthly parameters of water quality have been used for the simulation of water quality parameters (Diamantopoulou et al. 2005; Sengorur et al. 2006; Palani et al. 2008; Singh et al. 2009; Dogan et al. 2009; Sarkar and Pandey 2015).
Despite being relatively successful, these research works have covered a comprehensive range of forecasting accuracy which varied significantly owing to the environmental features of locations. Hence, the significance of input data to forecast the river water quality became evident. Therefore, so far, a comprehensive model capable of simulating various environments has been out of reach. Yet, an inspiring question for researchers in the area of water resources management is: Does a hybrid intelligent model integrated with optimization algorithm enhance the predictive model’s precision? (Fahimi et al. 2016).
Different combinations of input (or predictor) data have been shown to govern the predictive accuracy of a model (Abbot and Marohasy 2014; Deo et al. 2016; Galelli and Castelletti 2013; Quilty et al. 2016). Moreover, a novel methodology that extracts the information concealed in the input dataset to yield an accurate artificial intelligence model is strongly needed in the field of river systems engineering. Quite often, a stand-alone model lacks a suitable optimization procedure for extracting features from the input data, which is a prerequisite for enhancing the performance of the predictive model (Long and Meesad 2013; Quilty et al. 2016; Sedki and Ouazar 2010). The firefly algorithm (FFA) used in this paper acts as an optimization tool for artificial intelligence models and has recently been applied in the area of predictive modeling (Ghorbani et al. 2017). The conceptual basis of FFA is the flashing pattern of fireflies, as stated by Yang (2010). FFA has performed favorably in many different fields (Kavousi-Fard et al. 2014; Fu et al. 2015; Emary et al. 2015; Kazemzadeh-Parsi 2014; Nascimento et al. 2013; Talatahari et al. 2014; Yang 2010). In estimation problems, the FFA method has led to considerable improvements in solution procedures, and modified FFA models have been reported to perform very efficiently in comparison with other optimization algorithms.
Considering all the arguments above, different results obtained from AI techniques for all sets of input data can render the optimal evaluation of results to be impossible, thereby causing uncertainty in the output results (Kim and Seo 2015). Consequently, uncertainty in AI-based models is considered to be one of the most important restrictions especially in using the well-established ANN technique applied to develop strategies for appropriate control and management of water quality (Noori et al. 2015b).
Despite numerous publications in the area of uncertainty assessment in qualitative models applied in water research (Canale and Seo 1996; Wagener and Gupta 2005; Gupta et al. 2006; Sin et al. 2009; Cea et al. 2011; Srivastava et al. 2014), the number of studies in uncertainty analyses based on AI techniques (e.g., ANN) is insufficient and thus requires further research (Noori et al. 2015b). Out of the many research works by different authors, uncertainty analysis of AI techniques was also performed by Aqil et al. (2007) who have investigated the uncertainty of output values in a neuro-fuzzy model applied for the prediction of weekly flows of a river. Using a Monte Carlo method, they found that the method was suitable for assessing uncertainty of a neuro-fuzzy model. Noori et al. (2010) investigated the uncertainty within ANN and adaptive neuro-fuzzy inference system (ANFIS) models applied to predict carbon monoxide’s (CO) concentration in the atmospheric region of Tehran. In another work, the uncertainty within ANFIS and ANN models was investigated by Noori et al. (2013a, b) to predict the value of BOD5 in Sefidrood River. Jiang et al. (2013) used a new and efficient model based on an ANN and the Monte Carlo method (ANN-MCS) to analyze the uncertainty in the prediction of COD pollution hazard within the Yellow River located in Lanzhou.
A study by Dehghani et al. (2013) examined the uncertainties within the multilayer feedforward artificial neural network (FFANN) model using the Monte Carlo method and applied the model to predict hydrological drought in Karoon River located in southwest Iran. Antanasijevic et al. (2014) evaluated the uncertainty of the general regression neural network (GRNN) in predicting the DO parameter in Danube River. Noori et al. (2015a) assessed the uncertainty of ANN, support vector machine (SVM) and ANFIS techniques to predict the longitudinal dispersion coefficient (LDC) in natural rivers, while Noori et al. (2015b) investigated the uncertainty of the SVM model to estimate the 5-day BOD in Sefidrood River. Recently, Ghorbani et al. (2016) examined the uncertainties of SVM, radial basis function (RBF) and MLP models in predicting the monthly flow of Zarrinehrud River.
In this paper, an artificial intelligence model, namely the multilayer perceptron (MLP), is integrated with FFA for river water quality (BOD and DO) modeling. In general, the MLP model is a common artificial neural network architecture (Ghorbani et al. 2017). The aims of this study are: (I) to investigate the applicability of an MLP model for BOD and DO prediction in Langat River, (II) to combine an MLP model with FFA to create a hybrid MLP-FFA model, (III) to assess the predictive precision of MLP-FFA using a number of visual and statistical criteria that compare observed and predicted BOD and DO, and (IV) to assess the uncertainty of the MLP model and compare the results. To the best of the authors’ knowledge, there is no prior research in the literature that has investigated the ability of a multilayer perceptron-based FFA hybrid model for river water quality prediction.
In the current paper, the code used for the MLP model is readily available code, while the codes used for the hybrid MLP-FFA method and for the uncertainty assessment have been developed by the authors as their original contribution to this research. The structure of the proposed hybrid MLP-FFA model is shown in Fig. 1. The list of abbreviations used is given at the end of the article.
Methodology
Multilayer perceptron neural network (MLP)
In recent decades, efforts have been undertaken to simulate a natural neuron in a way that captures the characteristics of a biological system and is also easy to implement in NN models. Modeled networks are also known as “paradigms of the NN.” ANN models were first introduced by McCulloch and Pitts (Govindaraju 2000). An ANN is a mathematical structure designed to simulate the data processing of brain neurons (Hinton 1992; Jensen 1994).
In this study, we have applied a popular and widely used neural network model known as the multilayer perceptron (MLP). The MLP model is a variant of the classical ANN model and has been widely used in the current era of big data analytics (Gardner and Dorling 1998; Ay and Kisi 2011). The basic MLP model comprises three layers: (I) input layer, (II) hidden layer and (III) output layer. The input layer receives the set of input data, the processing of the features is performed in the hidden layer(s), and the output layer delivers the predicted results. Figure 2 illustrates a sample of a three-layer perceptron neural network.
The most accurate architecture of the ANN-MLP model is identified by trial and error, and its final structure is determined by the number of hidden layers and neurons. In the structure of the MLP model, the inputs to the \( i \)th neuron (\( x_{1} \) to \( x_{j} \)) are multiplied by their assigned weights (\( w_{i1} \) to \( w_{ij} \)) and then summed up. A threshold (\( b_{i} \)) is added to this sum to give the net input (\( {\text{Net}}_{i} \)):
\[ {\text{Net}}_{i} = \sum\limits_{m = 1}^{j} w_{im} x_{m} + b_{i} \]
The weights indicate the strength of the connections between neurons and are optimized through the learning process.
Subsequently, the transfer (activation) function receives the net input and passes its output to the next layer. Sigmoid functions are often used in artificial neural networks to introduce nonlinearity in the model (Maier and Dandy 2000).
In this research, all datasets were normalized and divided into training and testing sets. To predict water quality, a tangent sigmoid transfer function was used to map information from the input layer to the hidden layer, and a linear transfer function was used to map from the hidden layer to the output layer. The network was trained with the Levenberg–Marquardt algorithm (LMA), which is among the fastest methods for training feedforward neural networks (Adamowski et al. 2012; Deo and Şahin 2016). LMA is a simple and robust tool that solves the minimization problem with respect to the function variables. In the present study, the optimum number of neurons in the hidden layer was obtained by trial and error, varying the number of hidden neurons from 1 to 20.
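As an illustration of the network structure described above, the forward pass of a three-layer perceptron with a tangent sigmoid hidden layer and a linear output node can be sketched as follows. This is a minimal sketch and not the code used in the study; the function name `mlp_forward`, its argument names and the numerical values in the example call are hypothetical.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer perceptron with a single output.

    x        : list of (normalized) input values
    w_hidden : one list of input weights per hidden neuron
    b_hidden : one threshold per hidden neuron
    w_out    : one weight per hidden neuron for the output node
    b_out    : threshold of the output node
    """
    hidden = []
    for weights, bias in zip(w_hidden, b_hidden):
        # Net input: weighted sum of the inputs plus the threshold
        net = sum(w * xi for w, xi in zip(weights, x)) + bias
        # Tangent sigmoid transfer function (equivalent to tanh)
        hidden.append(math.tanh(net))
    # Linear transfer function at the output node
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical example: two inputs, two hidden neurons, one output
y = mlp_forward([0.5, -0.2],
                [[0.1, 0.4], [-0.3, 0.2]],
                [0.0, 0.1],
                [0.7, -0.5],
                0.05)
```

Training then amounts to choosing the weights and thresholds so that the outputs match the observed values as closely as possible.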
Hybrid MLP-FFA model
It should be noted that the Levenberg–Marquardt training algorithm, known as a robust training algorithm in terms of speed and efficiency (e.g., Tiwari and Adamowski 2013; Deo and Şahin 2016, 2017), converges to a local minimum, which is not necessarily the global minimum within the feature dataset. This indicates that there is room for further improvement in the MLP model’s performance. In our study, this has been achieved by applying the FFA as an optimization tool, following our earlier studies (e.g., Ghorbani et al. 2017). In general, the FFA is an optimization algorithm whose better performance is attributed to its ability to identify the global minimum within the feature datasets.
The nature-inspired FFA procedure was first introduced by Yang (2010) as an extension of the swarm intelligence optimization method, relying on the movement of fireflies. In this approach, a solution to an optimization problem can be regarded as an agent, i.e., a firefly that shines in proportion to its quality. As a result, each brighter firefly is able to attract its partners, regardless of their sex, which renders the exploration of the search space more effective (Lukasik and Zak 2009).
As fireflies are attracted toward light, the whole swarm moves toward the brightest firefly. In this case, the attractiveness of the fireflies is proportional to their brightness, and the brightness relies on the intensity of the agent (Kayarvizhy et al. 2014). The main issues in the firefly algorithm are the construction of its objective function and the variation of the light intensity.
The variables of the FFA are the light intensity I(r), the attractiveness \( \left( \beta \right) \), and the Cartesian distance between any two fireflies i and j at \( x_{i} \;{\text{and}}\; x_{j} \), respectively, which can be expressed (Yang 2010) as:
\[ I\left( r \right) = I_{O} \exp \left( { - \gamma r^{2} } \right) \]
\[ \beta \left( r \right) = \beta_{O} \exp \left( { - \gamma r^{2} } \right) \]
\[ r_{ij} = \left\| {x_{i} - x_{j} } \right\| = \sqrt {\sum\limits_{k = 1}^{d} {\left( {x_{i,k} - x_{j,k} } \right)^{2} } } \]
where \( x_{i,k} \) is the kth component of the spatial coordinate \( x_{i} \) of the ith firefly, \( \gamma \) is the light absorption coefficient, d is the dimensionality of the given problem, I(r) and \( I_{O} \) are the light intensity at distance r and the initial light intensity of a firefly, and \( \beta \left( r \right)\; {\text{and}}\; \beta_{O} \) are the attractiveness \( \beta \) at distance r and at r = 0, respectively.
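The next movement of firefly i toward a brighter firefly j can be represented as (Yang 2010):
\[ x_{i}^{t + 1} = x_{i}^{t} + \Delta x_{i} \]
\[ \Delta x_{i} = \beta_{O} \exp \left( { - \gamma r_{ij}^{2} } \right)\left( {x_{j}^{t} - x_{i}^{t} } \right) + \alpha \epsilon_{\text{i}} \quad (7) \]
Here, the first term of formula (7) represents the attraction, whereas the second term represents the randomization. The parameter \( \alpha \), which ranges between 0 and 1, controls the randomization, and \( \epsilon_{\text{i}} \) is a random number drawn from a Gaussian distribution (Ch et al. 2014).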
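To make the attraction and randomization steps concrete, a generic firefly search that minimizes an arbitrary objective function can be sketched as follows. This is an illustrative sketch of the standard algorithm of Yang (2010), not the implementation developed for this study. The function name `firefly_minimize`, the default parameter values (swarm size, number of iterations, \( \beta_{O} \), \( \gamma \), \( \alpha \), search bounds) and the test function are assumptions chosen only for demonstration.

```python
import math
import random


def firefly_minimize(objective, dim, n_fireflies=15, n_iter=100,
                     beta0=1.0, gamma=1.0, alpha=0.2,
                     bounds=(-1.0, 1.0), seed=0):
    """Minimize `objective` with a basic firefly algorithm.

    A lower objective value is treated as a brighter firefly.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions of the swarm
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter than firefly i
                    # Squared Cartesian distance between fireflies i and j
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    # Attractiveness decays with distance
                    beta = beta0 * math.exp(-gamma * r2)
                    # Attraction toward j plus Gaussian randomization
                    swarm[i] = [a + beta * (b - a) + alpha * rng.gauss(0.0, 1.0)
                                for a, b in zip(swarm[i], swarm[j])]
                    cost[i] = objective(swarm[i])

    best = min(range(n_fireflies), key=lambda k: cost[k])
    return swarm[best], cost[best]


# Hypothetical usage: minimize a simple two-dimensional quadratic function
best_x, best_cost = firefly_minimize(lambda x: sum(v * v for v in x), dim=2)
```

In this sketch the brightest firefly does not move within a sweep, so the best objective value found by the swarm never deteriorates.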
In this paper, a novel contribution to the prediction of BOD and DO is made where a newly constructed MLP-FFA hybrid model is attained. The MLP-FFA hybrid model was generated by integrating the traditional MLP model with the FFA that is a popular optimization tool used in data-driven modeling. Figure 3 shows the procedure of obtaining optimal MLP weights with FFA.
The simulation procedure of the MLP-FFA model involves determining the combination of input parameters with regard to the correlation coefficient between the input and output (target) variables. Afterward, the firefly algorithm is supplied with a selection of the best inputs based on their congruence with the target variable, as assessed by the objective function, and the chosen inputs are utilized in the MLP-FFA model to generate the prediction of BOD and DO.
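One common way to couple a firefly search with an MLP is to let each firefly position hold all network weights and thresholds as a flat vector and to use the training error as the brightness criterion. The sketch below illustrates this encoding only. The flat-vector layout, the mean square error objective and the names `n_parameters`, `unpack_weights` and `mse_objective` are assumptions for illustration and are not taken from the authors' code.

```python
import math


def n_parameters(n_in, n_hidden):
    """Number of weights and thresholds in an (n_in, n_hidden, 1) network."""
    return n_in * n_hidden + n_hidden + n_hidden + 1


def unpack_weights(position, n_in, n_hidden):
    """Split a flat firefly position into MLP weights and thresholds."""
    k = 0
    w_hidden = []
    for _ in range(n_hidden):
        w_hidden.append(position[k:k + n_in])  # input weights of one neuron
        k += n_in
    b_hidden = position[k:k + n_hidden]        # hidden-layer thresholds
    k += n_hidden
    w_out = position[k:k + n_hidden]           # hidden-to-output weights
    k += n_hidden
    b_out = position[k]                        # output threshold
    return w_hidden, b_hidden, w_out, b_out


def mse_objective(position, inputs, targets, n_hidden):
    """Mean square error of the MLP encoded by `position` on a dataset."""
    n_in = len(inputs[0])
    w_hidden, b_hidden, w_out, b_out = unpack_weights(position, n_in, n_hidden)
    total = 0.0
    for x, target in zip(inputs, targets):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
                  for ws, b in zip(w_hidden, b_hidden)]
        y = sum(w * h for w, h in zip(w_out, hidden)) + b_out
        total += (y - target) ** 2
    return total / len(targets)
```

Under this encoding, a network with eight inputs, nine hidden nodes and one output corresponds to a search space of 8 × 9 + 9 + 9 + 1 = 91 dimensions.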
Uncertainty analysis
Uncertainty is a result-dependent factor that describes the range of values a modeling result can attain. It also represents the probability that the measured value falls within the specified range. This research paper aims to estimate the uncertainty of the neural network output. Here, we apply the method recommended by Abbaspour et al. (2007) and Noori et al. (2015b), which was used to analyze the uncertainty of river quality prediction. In this method, the percentage of measured data bracketed by the 95 percent predicted uncertainties (95PPU) is considered. To obtain this value, the 2.5% (\( X_{L} \)) and 97.5% (\( X_{U} \)) levels of the empirical cumulative distribution are determined from 1000 predictions.
The appropriate confidence level is the level in which two requirements are met: (1) The 95PPU band brackets “most of the observations” and (2) the average distance between the upper (at 97.5% level) and the lower (at 2.5% level) parts of the 95PPU is “small.” Quantifications of the two requirements are problem dependent to an extent.
Abbaspour et al. (2007) reported that 80–100% of the measured data should fall within the 95PPU band, provided that the data are of good quality. In regions where the data are not of good quality, having 50% of the data within the 95PPU band would suffice.
For the second requirement, it is essential that the average distance between the upper and the lower 95PPU be smaller than the standard deviation of the measured data (Abbaspour et al. 2007). We utilize the above two measures to quantify the strength of calibration, accounting for the combined parameter, model, and input uncertainties.
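To evaluate the average width of the confidence interval band, the band width indicator (d-factor) was suggested by Abbaspour et al. (2007) as follows:
\[ d{\text{-factor}} = \frac{{\overline{{d_{X} }} }}{{\sigma_{X} }} \]
where \( \sigma_{X} \) is the standard deviation of observed data and \( \overline{{d_{X} }} \) is the confidence band’s average width which is defined as follows:
\[ \overline{{d_{X} }} = \frac{1}{k}\sum\limits_{l = 1}^{k} {\left( {X_{U}^{l} - X_{L}^{l} } \right)} \]
The percentage of the data within the confidence band of 95% is determined as follows:
\[ {\text{Bracketed by 95PPU}} = \frac{1}{k}\,{\text{count}}\left( {j\left| {X_{L}^{l} \le X_{\text{reg}}^{l} \le X_{U}^{l} } \right.} \right) \times 100 \]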
where 95PPU indicates 95% predicted uncertainty; \( k \) is the number of observed data; \( l \) is the current month which changes from 1 to \( k \); \( X_{L}^{l} \) and \( X_{U}^{l} \) are, respectively, the lower and the upper bands of uncertainty; and \( X_{\text{reg}}^{l} \) is the current month’s registered data.
Whenever the recorded data for the present month (\( l \)) are placed in the uncertainty range, one unit is added to the counter (\( j \)) and the maximum amount of \( j \) will occur when \( l = k \). If all the recorded amounts are within the lower and the upper band, then the maximum amount of “Bracketed by 95PPU” will be 100.
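The two uncertainty indices can be computed from an ensemble of repeated predictions as sketched below. This is a minimal sketch under stated assumptions: the percentile is obtained by linear interpolation, the sample standard deviation is used for \( \sigma_{X} \), and the names `percentile` and `uncertainty_indices` are hypothetical. The source does not specify these details.

```python
import statistics


def percentile(sorted_values, q):
    """Percentile (q in [0, 100]) of a pre-sorted list, linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def uncertainty_indices(observed, ensemble):
    """Return (percentage bracketed by 95PPU, d-factor).

    observed : list of k registered values, one per month
    ensemble : list of k lists; ensemble[l] holds the repeated
               predictions (e.g., 1000 of them) for month l
    """
    widths = []
    bracketed = 0  # counter j in the text
    for obs, predictions in zip(observed, ensemble):
        ordered = sorted(predictions)
        x_lower = percentile(ordered, 2.5)   # lower band, 2.5% level
        x_upper = percentile(ordered, 97.5)  # upper band, 97.5% level
        widths.append(x_upper - x_lower)
        if x_lower <= obs <= x_upper:
            bracketed += 1
    # Average band width divided by the standard deviation of observations
    d_factor = statistics.mean(widths) / statistics.stdev(observed)
    return 100.0 * bracketed / len(observed), d_factor
```

With the 36 test months used in this study, for example, a value of about 77.7% is consistent with 28 of the 36 observations falling inside the band.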
Model performance criteria
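In order to assess the accuracy of the model’s results and the model fitness, the correlation coefficient (r), root-mean-square error (RMSE) (Willmott and Matsuura 2005), % root-mean-square error (%RMSE) and Willmott’s index of agreement (WI) (Willmott 1981, 1984) are used.
The correlation coefficient (r) is defined as the correlation between the observed and modeled data:
\[ r = \frac{{\sum\nolimits_{i = 1}^{n} {\left( {O_{i} - \bar{O}} \right)\left( {P_{i} - \bar{P}} \right)} }}{{\sqrt {\sum\nolimits_{i = 1}^{n} {\left( {O_{i} - \bar{O}} \right)^{2} } \sum\nolimits_{i = 1}^{n} {\left( {P_{i} - \bar{P}} \right)^{2} } } }} \]
RMSE is defined as:
\[ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {P_{i} - O_{i} } \right)^{2} } } \]
WI is defined as:
\[ {\text{WI}} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {\left( {P_{i} - O_{i} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {\left| {P_{i} - \bar{O}} \right| + \left| {O_{i} - \bar{O}} \right|} \right)^{2} } }} \]
%RMSE is defined as:
\[ \% {\text{RMSE}} = \frac{\text{RMSE}}{{\bar{O}}} \times 100 \]
where n is the number of data points, \( O_{i} \) and \( P_{i} \) are, respectively, the measured and the predicted values of the \( i \)th element, and \( \bar{O} \) and \( \bar{P} \) are the averages of the measured and predicted values within the testing dataset.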
In this study, the optimal model’s accuracy was considered to be excellent when the %RMSE < 10%; good if 10% < %RMSE < 20%; fair if 20% < %RMSE < 30%; and poor if %RMSE > 30% (Heinemann et al. 2012; Jamieson et al. 1991).
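The four criteria and the %RMSE rating can be evaluated together as in the following sketch. It uses the standard definitions of r, RMSE and Willmott's index and normalizes %RMSE by the mean of the observations. The function names `performance` and `rating`, and the assignment of the exact boundary values (10, 20 and 30%) to the next lower class, are assumptions, since the text leaves the boundaries open.

```python
import math


def performance(observed, predicted):
    """Return r, RMSE, %RMSE and WI for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    pairs = list(zip(observed, predicted))

    cov = sum((o - o_mean) * (p - p_mean) for o, p in pairs)
    var_o = sum((o - o_mean) ** 2 for o in observed)
    var_p = sum((p - p_mean) ** 2 for p in predicted)
    sq_err = sum((p - o) ** 2 for o, p in pairs)
    potential = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in pairs)

    rmse = math.sqrt(sq_err / n)
    return {
        "r": cov / math.sqrt(var_o * var_p),   # undefined for constant series
        "RMSE": rmse,
        "%RMSE": 100.0 * rmse / o_mean,        # relative to the observed mean
        "WI": 1.0 - sq_err / potential,
    }


def rating(pct_rmse):
    """Qualitative rating of a model from its %RMSE."""
    if pct_rmse < 10.0:
        return "excellent"
    if pct_rmse < 20.0:
        return "good"
    if pct_rmse < 30.0:
        return "fair"
    return "poor"


# Hypothetical values, for illustration only
scores = performance([2.0, 4.0, 6.0], [3.0, 5.0, 7.0])
label = rating(scores["%RMSE"])
```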
Study area and model development data
In this research, the monthly water quality datasets of two stations in Langat River for the period 2001–2010 are utilized. Langat River is one of the most important rivers in Malaysia, and it is located between 2° 40′ 15″ N and 3° 16′ 15″ N latitude and between 101° 19′ 20″ E and 102° 1′ 10″ E longitude. The catchment of Langat River is an important catchment which provides water and other facilities for some 1.2 million people. Its total area is 1815 km². Major towns that receive their water from this catchment include Cheras, Kajang, Bangi and the government center of Putrajaya. The main river course is about 141 km long and lies about 40 km east of Kuala Lumpur.
The basin of Langat River is located in southern and southeastern parts of the Selangor Darul Ehsan state. Langat River originates from Pahang–Selangor border in which highlands are 1500 m above sea level. It drains westward to the Straits of Malacca. Figure 4 provides detailed geographical features and water quality control stations of the Langat River. Sets of data are divided in two categories: Monthly data of the first 7 years from 2001 to 2007 (84 sets or 70% of the whole dataset) are used for training and monthly data of the last 3 years from 2008 to 2010 (36 sets or 30% of the whole data) are used for testing.
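The chronological division of the record described above can be reproduced with a simple index-based split, sketched below. The helper name `chronological_split` is hypothetical, and the list of (year, month) pairs merely stands in for the monthly water quality records.

```python
def chronological_split(records, n_train_years=7, months_per_year=12):
    """Split monthly records into training and testing sets by time order."""
    n_train = n_train_years * months_per_year
    return records[:n_train], records[n_train:]


# Placeholder monthly index for 2001-2010 (120 months)
months = [(year, month) for year in range(2001, 2011) for month in range(1, 13)]
train, test = chronological_split(months)
```

Keeping the time order intact ensures that the model is tested only on months that follow the training period.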
Models proposed by researchers with regard to the prediction of the BOD and DO using various input parameters are summarized in Table 1. Based on Table 1 and parameters utilized by other researchers in their studies, in this study the following parameters are used: chemical oxygen demand (COD), total phosphate (PO4), total solids (TS), potassium (K), sodium (Na), chloride (Cl), electrical conductivity (EC), pH and ammonia nitrogen (NH4-N).
A key reason why monthly data have been used is the discrete nature of the daily or hourly data for the case study, which are highly challenging to acquire. Besides this, there is enough evidence in the literature to support this choice, as explained in the Introduction and Table 1. It is noteworthy that, due to insufficient water quality data in the study area, only the aforementioned parameters are utilized for BOD and DO prediction in this research. The statistical parameters of the water quality data of the Langat River at the two considered stations are shown in Table 2.
Results and discussion
Combination of input parameters
Table 3 lists the most suitable combination of input parameters. Using SPSS software, correlation coefficients of parameters, previously mentioned in Table 2, are calculated. Subsequently, in Fig. 5, the correlation map of parameters for both stations was drawn based on a color scale such that the closeness of values to 1 or -1 indicates high correlation. Based on Table 3, all 9 input parameters were used in the first combination. However, in combinations 2, 3 and 4, PO4, pH and TS which had the lowest correlation with BOD and DO were eliminated, respectively (based on Fig. 5), and the best combination was obtained for input parameters.
It should be noted that BOD and DO indicators are dependent variables, while the rest are independent variables.
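The correlation-based screening of inputs can be illustrated with the sketch below, which ranks candidate inputs by the absolute value of their Pearson correlation with the target so that the weakest can be removed first. The study itself used SPSS for this step; the function names `pearson` and `rank_inputs`, the input labels and the numbers are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    var_x = sum((a - x_mean) ** 2 for a in x)
    var_y = sum((b - y_mean) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)  # undefined for constant series


def rank_inputs(inputs, target):
    """Return input names ordered from weakest to strongest |r| with target."""
    strength = {name: abs(pearson(values, target))
                for name, values in inputs.items()}
    return sorted(strength, key=strength.get)


# Made-up series, for illustration only
target = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "input_a": [1.1, 1.9, 3.2, 3.9],   # closely follows the target
    "input_b": [1.0, -1.0, 1.0, -1.0], # weakly related to the target
}
order = rank_inputs(candidates, target)
```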
A total of four models with various input combinations have been developed. Both non-optimized (MLP) and hybrid (MLP-FFA) models were constructed and tested in order to specify the optimum number of nodes in the hidden layer and the transfer functions. Selecting an appropriate number of nodes in the hidden layer is of paramount importance, as a large number of nodes may result in over-fitting, while a small number may not capture the information adequately (Singh et al. 2009). The optimum number of neurons was determined based on the minimum value of the mean square error (MSE) of the training dataset. The network was trained for 1000 epochs with a learning rate of 0.0013 and a momentum coefficient of 0.9. Tables 4 and 5 show the r, RMSE, %RMSE and WI values obtained from the BOD and DO simulations for both the training and testing data, along with the optimal number of neurons. Of the four sets of input data, the set with the highest r and WI values and the lowest RMSE and %RMSE during testing is chosen as the best set.
Tables 4 and 5 also represent the best network structure and their respective function criteria. Based on the results of non-optimized MLP model (Tables 4, 5), the performance criteria reveal that the models designated as ANN(8,9,1) and ANN(8,6,1) are the best models to predict BOD and DO in 1L04 station, respectively (Table 4), and ANN(7,8,1) and ANN(9,13,1) are the best models to predict BOD and DO in 1L05 station, respectively (Table 5). The structure of ANN(8,9,1) consists of one input layer with eight input variables, one hidden layer with nine nodes and one output layer with one output variable.
A relatively low correlation coefficient between the measured and model output variables (BOD and DO) in the present study, especially at 1L05 station, may be due to the heterogeneous nature of the water quality (input and output) variables as these were measured over a span of 10 years in two sampling sites (as shown in Fig. 4). Moreover, relatively higher correlations between measured and model (NN) computed values of BOD and DO in various aquatic systems (Sengorur et al. 2006; Soyupak et al. 2003; Ying et al. 2007; Dogan et al. 2009) may be ascribed to the limited number of the input variables used. To fix this problem we investigated the model’s precision with respect to firefly optimizer algorithm.
In the MLP-FFA hybrid models, the multilayer perceptron model and firefly algorithm were integrated (Fig. 3). Tables 4 and 5 show the results of this study. It is obvious that the prediction performance of the MLP-FFA-based models in terms of r, RMSE, %RMSE and WI for training and testing periods is higher compared to the non-optimized models. That is, the MLP-FFA model displayed the smallest value of RMSE and %RMSE and the highest value of r and WI in the testing set (Tables 4, 5). In general, based on results, the MLP-FFA model is a powerful tool in predicting the water quality of rivers.
Also based on Tables 4 and 5, and results obtained from both MLP and MLP-FFA models, a relatively better performance (r between measured and computed values) of the BOD model as compared to that of the DO model shows that the selected influential factors (input variables) have relatively greater impact on BOD than on DO. Also, selection of the influential factors might affect the model output considerably (Ying et al. 2007).
Figures 6 and 7 show the comparison between MLP and MLP-FFA results and the observed data for the set of monthly test data. It is clear that the MLP-FFA model results are closer to the observed water quality values compared to MLP model. Moreover, BOD and DO parameters in station 1L04 show a higher correlation than that of the 1L05.
As previously discussed, one of the aims of this study is the uncertainty analysis of the multilayer perceptron neural network using two criteria, namely 95PPU and d-factor. A larger share of observed data within the 95PPU band and a smaller average distance between the upper and lower bands (smaller than the standard deviation of the measured data) indicate lower uncertainty. In this section, the optimal model structures identified in the previous sections are used. The uncertainty indices (95PPU and d-factor) for the testing datasets are given in Table 6. As shown in Figs. 6, 7 and Table 6, about 77.7% and 72.2% of the data for BOD and 72.2% and 91.6% of the data for DO are bracketed by the 95PPU at the 1L04 and 1L05 stations, respectively. Furthermore, the d-factor is 1.648 and 2.269 for BOD and 1.892 and 3.480 for DO at the 1L04 and 1L05 stations, respectively.
Based on the values obtained at both stations for the 95PPU and d-factor indices, it can be concluded that more than 50% of the observed data fall within the 95PPU band and that a reasonable extent of uncertainty is achieved in simulating both BOD and DO. Simulation results for BOD are better than those for DO, since the average distance between the upper and lower values of the 95PPU (d-factor) is smaller than the standard deviation (SD) of the measured data for BOD (2.269 < 3.55, 1.648 < 4.02), while this is not the case for DO. Moreover, the uncertainty of the MLP model in modeling BOD is lower than in modeling DO, as indicated by the smaller d-factor.
In general, there are three types of uncertainties in all simulation processes. The first type involves the uncertainties associated with the simulator model. The second type involves uncertainties arising from data. The third type involves the local knowledge. Hence, the level of uncertainties varies significantly with the problem type. In this research, the uncertainty originates from the ANN model, local knowledge and the data, which stem from human and machine errors and some other unknown problems.
Sensitivity analysis
To evaluate the effective input parameters, two criteria (r and RMSE) are used to determine the most effective variables on the output. Based on Tables 4 and 5, and the obtained results, owing to relatively better performance, second (8 input variables) and third (7 input variables) combinations for BOD and second (8 input variables) and first (9 input variables) combinations for DO were used for sensitivity analyses in both stations. The analyses consisted of the comparison of overall 9 and 8 networks for BOD and 9 and 10 networks for DO in stations 1L04 and 1L05, respectively. Each one demonstrated to what extent the eliminated parameter would affect the network accuracy.
Obviously, the precision of MLP would become higher if all the suggested parameters were used as the input to the model for the testing dataset. Next, the most influential parameters were selected after determining the networks with reduced accuracy (lower r and higher RMSE) after the elimination of a parameter in testing stage compared to first network (all input parameters). Taking above arguments into consideration along with the results presented in Tables 7 and 8, in both stations the BOD parameter is more sensitive to Na, Cl and NH4-N, while DO parameter is more sensitive to COD, pH and NH4-N in station 1L04, and to K, pH and NH4-N in station 1L05.
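The elimination procedure described above can be sketched as a leave-one-out loop over the input variables. The callback `evaluate` is a hypothetical stand-in for training a network on the given inputs and returning its testing RMSE, and the function name `leave_one_out_sensitivity` is likewise an assumption. With eight inputs the loop evaluates 1 + 8 = 9 networks, in line with the counts reported above.

```python
def leave_one_out_sensitivity(input_names, evaluate):
    """Rank inputs by the increase in testing RMSE when each is removed.

    input_names : list of input variable names
    evaluate    : callable taking a list of names and returning the
                  testing RMSE of a model trained on those inputs
    """
    baseline = evaluate(list(input_names))  # network with all inputs
    increase = {}
    for name in input_names:
        reduced = [n for n in input_names if n != name]
        # A larger RMSE increase marks a more influential input
        increase[name] = evaluate(reduced) - baseline
    return sorted(increase.items(), key=lambda item: item[1], reverse=True)


# Toy stand-in for model training: each missing input adds a fixed penalty
penalty = {"a": 0.5, "b": 0.1, "c": 0.3}
toy_evaluate = lambda names: 1.0 + sum(v for k, v in penalty.items()
                                       if k not in names)
ranking = leave_one_out_sensitivity(["a", "b", "c"], toy_evaluate)
```

The same loop can be repeated with r as the criterion, in which case a larger decrease marks a more influential input.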
Conclusion
In this paper, a multilayer perceptron (MLP) forecasting model integrated with a firefly (FF) optimizer algorithm (MLP-FFA) was used for forecasting a river water quality “i.e., BOD and DO.” The case study was Langat River basin in Malaysia. By applying correlation coefficient to water quality data, a set of four input combinations were deemed suitable for prediction of BOD and DO.
Accordingly, several forecasting models, including the traditional MLP and the integrated MLP-FFA, were developed using data over a 10-year duration (2001–2010) for stations 1L04 and 1L05. The results were assessed with several statistical and visual criteria: the correlation coefficient (r) between forecasted and observed water quality, the root-mean-square error (RMSE), the % root-mean-square error (%RMSE) and Willmott’s index of agreement (WI). On these criteria, the MLP-FFA model was more efficient than the MLP model. The MLP-FFA models with (8,9,1) and (8,6,1) structures for predicting BOD and DO, respectively, in station 1L04, and with (7,8,1) and (9,13,1) structures for predicting BOD and DO, respectively, in station 1L05, were more accurate than their counterparts, which underlines the value of the firefly algorithm as an optimizer for improving the accuracy of conventional models.
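For readers who want to reproduce the error-based criteria, the sketch below computes RMSE, %RMSE and WI. The exact formulas used in the study are not restated in this section, so two assumptions are made and labeled in the code: %RMSE is taken as the RMSE expressed as a percentage of the mean observed value, and WI follows the commonly cited form of Willmott's index of agreement. The sample values are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def percent_rmse(obs, pred):
    """Assumption: %RMSE = 100 * RMSE / mean of the observed values."""
    mean_obs = sum(obs) / len(obs)
    return 100.0 * rmse(obs, pred) / mean_obs

def willmott_index(obs, pred):
    """Assumption: WI = 1 - sum((P-O)^2) / sum((|P-Obar| + |O-Obar|)^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical observed and forecasted values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
forecast = [2.5, 3.5, 6.5, 7.5]

print("RMSE  =", rmse(observed, forecast))
print("%RMSE =", percent_rmse(observed, forecast))
print("WI    =", willmott_index(observed, forecast))
```

Under these definitions a perfect forecast gives RMSE = 0, %RMSE = 0 and WI = 1, so lower RMSE and %RMSE and a WI closer to 1 indicate a better model.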
The results of this study suggest that the firefly optimizer algorithm is a useful add-on tool for improving the accuracy of forecasting models applied to water quality prediction. This research also gives credence to the effectiveness of the hybrid model, which is applicable to other engineering problems where historical data can provide features for developing a predictive model.
Despite the good performance of the MLP-FFA model attained in this study, the study has limitations that demand further research. Further improvement in accuracy may be possible by including more informative variables in the learning process of the predictive model, since this study was limited to the data available. For simulation purposes, it is therefore important to include other relevant variables, such as discharge, temperature, T-Alk, T-Hard and NO3-N, as well as datasets containing factors that may help to predict BOD and DO. Future research could also apply the model to short-term prediction of water quality (e.g., daily or hourly parameters). Such a study is likely to yield a more thorough model for operational use, but it was beyond the scope of this paper and awaits a separate investigation.
Additionally, the reliability of the MLP model predictions was assessed through uncertainty estimation. Based on the 95PPU and d-factor values for both stations, it is concluded that the neural network model has an acceptably low degree of uncertainty in the BOD and DO simulations. A comparison of the uncertainty results for the MLP model also showed a lower degree of uncertainty in simulating BOD than DO, as indicated by the smaller d-factor. In future work, other robust methods of uncertainty analysis could be applied to overcome the above-mentioned limitations, which may in turn improve the results and reduce uncertainty.
Finally, the sensitivity analysis of the input variables showed that, in both stations, BOD was more sensitive to the Na, Cl and NH4-N data, while DO was more sensitive to the COD, pH and NH4-N data in station 1L04 and to the K, pH and NH4-N data in station 1L05.
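The two uncertainty indices can be illustrated with a short sketch. It assumes the definitions commonly used for these indices, which this section does not restate: the 95PPU band at each time step runs from the 2.5th to the 97.5th percentile of an ensemble of model predictions, the bracketed percentage counts the observations falling inside that band, and the d-factor is the mean band width divided by the sample standard deviation of the observations. The ensemble values and function names are hypothetical.

```python
import statistics

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def uncertainty_indices(observed, ensemble):
    """ensemble[k][i] is the prediction of model run k at time step i."""
    lower, upper = [], []
    for i in range(len(observed)):
        column = sorted(run[i] for run in ensemble)
        lower.append(percentile(column, 2.5))
        upper.append(percentile(column, 97.5))
    # Share of observations bracketed by the 95PPU band, in percent.
    inside = sum(1 for o, l, u in zip(observed, lower, upper) if l <= o <= u)
    bracketed = 100.0 * inside / len(observed)
    # Assumed d-factor: mean band width over the sample std of observations.
    mean_width = sum(u - l for l, u in zip(lower, upper)) / len(observed)
    d_factor = mean_width / statistics.stdev(observed)
    return bracketed, d_factor

# Hypothetical ensemble: five runs scattered around a central prediction.
observed = [3.0, 5.0, 9.0, 6.0]
centres = [3.2, 5.5, 7.0, 6.1]
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
ensemble = [[c + off for c in centres] for off in offsets]

bracketed, d_factor = uncertainty_indices(observed, ensemble)
print("bracketed by 95PPU: %.1f%%" % bracketed)
print("d-factor: %.3f" % d_factor)
```

Under these assumed definitions, a high bracketed percentage combined with a small d-factor indicates a narrow band that still captures most observations, which is the sense in which a smaller d-factor signals lower uncertainty.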
Abbreviations
- ANN: Artificial neural network
- BOD: Biochemical oxygen demand
- Cl: Chloride
- COD: Chemical oxygen demand
- EC: Electrical conductivity
- DO: Dissolved oxygen
- FFA: Firefly algorithm
- K: Potassium
- MLP: Multilayer perceptron
- Na: Sodium
- NH4-N: Ammonia nitrogen
- PO4: Total phosphate
- 95PPU: 95 percent predicted uncertainty
- TS: Total solids
References
Abbaspour KC, Yang J, Maximov I, Siber R, Bogner K, Mieleitner J, Zobrist J, Srinivasan R (2007) Modelling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. J Hydrol 333(2–4):413–430
Abbot J, Marohasy J (2014) Input selection and optimisation for monthly rainfall forecasting in Queensland, Australia, using artificial neural networks. Atmos Res 138:166–178
Adamowski J, Fung Chan H, Prasher SO, Ozga-Zielinski B, Sliusarieva A (2012) Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour Res 48:1–14
Antanasijević D, Pocajt V, Perić-Grujić A, Ristić M (2014) Modelling of dissolved oxygen in the Danube River using artificial neural networks and Monte Carlo simulation uncertainty analysis. J Hydrol 519:1895–1907
Aqil M, Kita I, Yano A, Nishiyama S (2007) Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool. J Environ Manag 85(1):215–223
Ay M, Kisi O (2011) Modeling of dissolved oxygen concentration using different neural network techniques in Foundation Creek, El Paso County, Colorado. J Environ Eng 138(6):654–662
Canale RP, Seo DI (1996) Performance, reliability and uncertainty of total phosphorus models for lakes—II. Stochastic analyses. Water Res 30(1):95–102
Cea L, Bermúdez M, Puertas J (2011) Uncertainty and sensitivity analysis of a depth-averaged water quality model for evaluation of Escherichia Coli concentration in shallow estuaries. Environ Model Softw 26(12):1526–1539
Ch S, Sohani S, Kumar D, Malik A, Chahar B, Nema A, Panigrahi BK, Dhiman RC (2014) A support vector machine-firefly algorithm based forecasting model to determine malaria transmission. Neurocomputing 129:279–288
Chen WB, Liu WC (2014) Artificial neural network modeling of dissolved oxygen in reservoir. J Environ Monit Assess 186(2):1203–1217
Chen JC, Chang N, Shieh W (2003) Assessing wastewater reclamation potential by neural network model. Eng Appl Artif Intell 16(2):149–157
Dehghani M, Saghafian B, Nasiri Saleh F, Farokhnia A, Noori R (2013) Uncertainty analysis of streamflow drought forecast using artificial neural networks and Monte-Carlo simulation. Int J Climatol 34(4):1169–1180
Deo RC, Şahin M (2016) An extreme learning machine model for the simulation of monthly mean streamflow water level in eastern Queensland. Environ Monit Assess 188:1–24
Deo RC, Şahin M (2017) Forecasting long-term global solar radiation with an ANN algorithm coupled with satellite-derived (MODIS) land surface temperature (LST) for regional locations in Queensland. Renew Sustain Energy Rev 72:828–848
Deo RC, Kisi O, Singh VP (2016) Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model. Atmos Res 184:149–175
Diamantopoulou MJ, Papamichail DM, Antonopoulos VZ (2005) The use of a neural network technique for the prediction of water quality parameters. Oper Res 5(1):115–125
Dogan E, Sengorur B, Koklu R (2009) Modeling biological oxygen demand of the Melen River in Turkey using an artificial neural network technique. J Environ Manag 90(2):1229–1235
Emary E, Zawbaa HM, Ghany KKA, Hassanien AE, Parv B (2015) Firefly optimization algorithm for feature selection. Paper presented at the proceedings of the 7th Balkan conference on informatics conference. ACM
Fahimi F, Yaseen ZM, El-shafie A (2016) Application of soft computing based hybrid models in hydrological variables modeling: a comprehensive review. Theor Appl Climatol. doi:10.1007/s00704-016-1735-8
Fu Q, Jiang R, Wang Z, Li T (2015) Optimization of soil water characteristic curves parameters by modified firefly algorithm. Trans Chin Soc Agric Eng 31(11):117–122
Galelli S, Castelletti A (2013) Tree-based iterative input variable selection for hydrological modeling. Water Resour 49(7):4295–4310
Gardner MW, Dorling S (1998) Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmos Environ 32(14):2627–2636
Ghorbani MA, Zadeh HA, Isazadeh M, Terzi O (2016) A comparative study of artificial neural network (MLP, RBF) and support vector machine models for river flow prediction. Environ Earth Sci 75(6):1–14
Ghorbani MA, Shamshirband S, Haghi DZ, Azani A, Bonakdari H, Ebtehaj I (2017) Application of firefly algorithm-based support vector machines for prediction of field capacity and permanent wilting point. Soil Tillage Res 172:32–38
Govindaraju RS (2000) Artificial neural networks in hydrology. I: preliminary concepts. J Hydrol Eng 5(2):115–123
Gupta HV, Beven KJ, Wagener T (2006) Model calibration and uncertainty estimation. Encyclopedia hydrol sci 11:131. doi:10.1002/0470848944.hsa138
Heinemann AB, van Oort PA, Fernandes DS, Maia AdHN (2012) Sensitivity of APSIM/ORYZA model due to estimation errors in solar radiation. Bragantia Campinas 71(4):572–582
Hinton GE (1992) How neural networks learn from experience. Sci Am 267(3):145–151
Hore A, Dutta S, Datta S, Bhattacharjee C (2008) Application of an artificial neural network in wastewater quality monitoring: prediction of water quality index. Int J Nucl Desalin 3(2):160–174
Jamieson P, Porter J, Wilson D (1991) A test of the computer simulation model ARCWHEAT1 on wheat crops grown in New Zealand. Field Crops Res 27(4):337–350
Jensen B (1994) Expert systems-neural networks. Instrument engineers’ handbook, 3rd edn. Chilton, Radnor
Jiang Y, Nan Z, Yang S (2013) Risk assessment of water quality using Monte Carlo simulation and artificial neural network method. J Environ Manag 122:130–136
Kavousi-Fard A, Samet H, Marzbani F (2014) A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting. Expert Syst Appl 41(13):6047–6056
Kayarvizhy N, Kanmani S, Uthariaraj R (2014) ANN models optimized using swarm intelligence algorithms. WSEAS Trans Comput 13(45):501–519
Kazemzadeh-Parsi M (2014) A modified firefly algorithm for engineering design optimization problems. Iran J Sci Technol IJST Trans Mech Eng 38(M2):403
Kim SE, Seo IW (2015) Artificial neural network ensemble modeling with conjunctive data clustering for water quality prediction in rivers. J Hydro-environ Res 9(3):325–339
Kurunç A, Yürekli K, Çvik O (2005) Performance of two stochastic approaches for forecasting water quality and streamflow data from Yeşilιrmak River, Turkey. Environ Model Softw 20(9):1195–1200
Long NC, Meesad P (2013) Meta-heuristic algorithms applied to the optimization of type-1 and type-2 TSK fuzzy logic systems for sea water level prediction. In: 2013 IEEE 6th international workshop computational intelligence and applications IWCIA 2013—proceedings, pp 69–74. doi:10.1109/IWCIA.2013.6624787
Łukasik S, Żak S (2009) Firefly algorithm for continuous constrained optimization tasks. Firefly Algorithm Contin Constrained Optim Tasks 5796:97–106. doi:10.1007/978-3-642-04441-0_8
Maier HR, Dandy GC (2000) Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ Model Softw 15(1):101–124
Najah A, Elshafie A, Karim OA, Jaffar O (2009) Prediction of Johor River water quality parameters using artificial neural networks. Eur J Sci Res 28(3):422–435
Najah A, El-Shafie A, Karim OA, Jaafar O, El-Shafie AH (2011) An application of different artificial intelligences techniques for water quality prediction. Int J Phys Sci 6(22):5298–5308
Nascimento Z, Sadok D, Fernandes S (2013) Comparative study of a hybrid model for network traffic identification and its optimization using firefly algorithm. In: 2013 IEEE symposium on computers and communications, pp 000862–000867. doi:10.1109/ISCC.2013.6755057
Noori R, Hoshyaripour G, Ashrafi K, Araabi BN (2010) Uncertainty analysis of developed ANN and ANFIS models in prediction of carbon monoxide daily concentration. Atmos Environ 44(4):476–482
Noori R, Safavi S, Shahrokni SAN (2013a) A reduced-order adaptive neuro-fuzzy inference system model as a software sensor for rapid estimation of five-day biochemical oxygen demand. J Hydrol 495:175–185
Noori R, Karbassi A, Ashrafi K, Ardestani M, Mehrdadi N (2013b) Development and application of reduced-order neural network model based on proper orthogonal decomposition for BOD5 monitoring: active and online prediction. Environ Prog Sustain Energy 32(1):120–127
Noori R, Deng Z, Kiaghadi A, Kachoosangi FT (2015a) How reliable are ANN, ANFIS, and SVM techniques for predicting longitudinal dispersion coefficient in natural rivers? J Hydraul Eng 142(1):04015039
Noori R, Yeh HD, Abbasi M, Kachoosangi FT, Moazami S (2015b) Uncertainty analysis of support vector machine for online prediction of five-day biochemical oxygen demand. J Hydrol 527:833–843
Palani S, Liong SY, Tkalich P (2008) An ANN application for water quality forecasting. Mar Pollut Bull 56(9):1586–1597. doi:10.1016/j.marpolbul.2008.05.021
Quilty J, Adamowski J, Khalil B, Rathinasamy M (2016) Bootstrap rank-ordered conditional mutual information (broCMI): a nonlinear input variable selection method for water resources modeling. Water Resour Res 52:2299–2326. doi:10.1002/2015WR016959
Salami E, Ehteshami M (2015) Simulation, evaluation and prediction modeling of river water quality properties (case study: Ireland Rivers). Int J Environ Sci Technol 12(10):3235–3242
Sarkar A, Pandey P (2015) River water quality modelling using artificial neural network technique. Aquat Procedia 4:1070–1077
Sedki A, Ouazar D (2010) Hybrid particle swarm and neural network approach for streamflow forecasting. Math Model Nat Phenom 5:132–138. doi:10.1051/mmnp/20105722
Sengorur B, Dogan E, Koklu R, Samandar A (2006) Dissolved oxygen estimation using artificial neural network for water quality control. Fres Environ Bull 15(9):1064–1067
Sin G, Gernaey KV, Neumann MB, van Loosdrecht MC, Gujer W (2009) Uncertainty analysis in WWTP model applications: a critical discussion using an example from design. Water Res 43(11):2894–2906
Singh KP, Basant A, Malik A, Jain G (2009) Artificial neural network modeling of the river water quality—a case study. Ecol Model 220(6):888–895. doi:10.1016/j.ecolmodel.2009.01.004
Soyupak S, Karaer F, Gürbüz H, Kivrak E, Sentürk E, Yazici A (2003) A neural network-based approach for calculating dissolved oxygen profiles in reservoirs. Neural Comput Appl 12:166–172
Srivastava PK, Han D, Rico-Ramirez MA, Islam T (2014) Sensitivity and uncertainty analysis of mesoscale model downscaled hydro-meteorological variables for discharge prediction. Hydrol Process 28(15):4419–4432
Talatahari S, Gandomi AH, Yun GJ (2014) Optimum design of tower structures using firefly algorithm. Struct Des Tall Spec Build 23:350–361. doi:10.1002/tal.1043
Tiwari MK, Adamowski J (2013) Urban water demand forecasting and uncertainty assessment using ensemble wavelet-bootstrap-neural network models. Water Resour Res 49(10):6486–6507
Wagener T, Gupta HV (2005) Model identification for hydrological forecasting under uncertainty. Stoch Environ Res Risk A 19(6):378–387
Wen X, Fang J, Diao M, Zhang C (2013) Artificial neural network modeling of dissolved oxygen in the Heihe River, Northwestern China. Environ Monit Assess 185(5):4361–4371
Willmott CJ (1981) On the validation of models. Phys Geogr 2(2):184–194
Willmott CJ (1984) On the evaluation of model performance in physical geography. Spat Stat Models Springer 40:443–460
Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30(1):79–82
Wu HJ, Lin ZY, Gao SL (2000) The application of artificial neural networks in the resources and environment. Resour Environ Yangtze Basin 9(2):241–246
Xiang S, Liu Z, Ma L (2006) Study of multivariate linear regression analysis model for ground water quality prediction. Guizhou Sci 24(1):60–62
Yang XS (2010) Firefly algorithm, stochastic test functions and design optimization. Int J Bio-Inspired Comput 2(2):78–84. doi:10.1504/IJBIC.2010.032124
Ying Z, Jun N, Fuyi C, Liang G (2007) Water quality forecast through application of BP neural network at Yuquio reservoir. J Zhejjang Univ Sci A 8:1482–1487
Acknowledgements
The authors wish to thank the Department of Irrigation and Drainage in Malaysia for providing the required data for this research. The authors would also like to thank the anonymous reviewers for their valuable comments.
Cite this article
Raheli, B., Aalami, M.T., El-Shafie, A. et al. Uncertainty assessment of the multilayer perceptron (MLP) neural network model with implementation of the novel hybrid MLP-FFA method for prediction of biochemical oxygen demand and dissolved oxygen: a case study of Langat River. Environ Earth Sci 76, 503 (2017). https://doi.org/10.1007/s12665-017-6842-z