Abstract
This chapter develops intelligent data analytic approaches for predicting dissolved oxygen (DO) concentration in rivers, comparing the extremely randomized tree with random forest, MLPNN and MLR models. DO concentration in rivers, lakes and streams can be measured directly in situ. However, mathematical models based on intelligent data analytic techniques offer a reasonably good alternative by linking several water quality variables to DO concentration at different time scales. Recent studies conducted worldwide have demonstrated that such models can estimate dissolved oxygen accurately. Here, we applied the extremely randomized tree (ERT) to develop a robust and computationally simple model for predicting DO concentration in rivers. Results obtained using the proposed ERT were compared to those obtained using the random forest (RF), the multilayer perceptron neural network (MLPNN) and the standard multiple linear regression (MLR). The models were developed using several input variables, namely water temperature, specific conductance, water pH and phycocyanin pigment concentration. Several input combinations were considered and compared to find the best input variables for predicting DO. All the models were applied and compared using data collected from two rivers located in the USA. Model accuracy was evaluated using the coefficient of correlation (R), Nash–Sutcliffe efficiency (NSE), root mean squared error (RMSE) and mean absolute error (MAE). Across the input combinations considered, the RF provided the most accurate estimation of DO concentration among all the proposed models, the ERT ranked second with accuracy slightly below that of the RF, the MLPNN ranked third and the MLR model was the least accurate.
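The four accuracy measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of R, NSE, RMSE and MAE and is not taken from the chapter itself; the function name `evaluate` and the sample DO values are assumptions made up for illustration only.

```python
import math


def evaluate(observed, predicted):
    """Return R, NSE, RMSE and MAE for paired observed/predicted values.

    Hypothetical helper using the standard definitions of the metrics;
    it assumes neither series is constant (otherwise R and NSE are undefined).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    # Sums of squared deviations from the mean and the cross-product term
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cross = sum((o - mean_obs) * (p - mean_pred)
                for o, p in zip(observed, predicted))

    # Sum of squared prediction errors
    sse = sum(e ** 2 for e in errors)

    return {
        "R": cross / math.sqrt(ss_obs * ss_pred),  # Pearson correlation
        "NSE": 1.0 - sse / ss_obs,                 # Nash-Sutcliffe efficiency
        "RMSE": math.sqrt(sse / n),                # root mean squared error
        "MAE": sum(abs(e) for e in errors) / n,    # mean absolute error
    }


# Made-up DO values in mg/L, for illustration only (not data from the chapter)
observed = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 8.5, 10.5, 10.5]
scores = evaluate(observed, predicted)
print(scores)
```

Higher R and NSE and lower RMSE and MAE indicate a better fit, which is the basis on which models such as RF, ERT, MLPNN and MLR can be ranked against one another.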
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Heddam, S. (2021). Intelligent Data Analytics Approaches for Predicting Dissolved Oxygen Concentration in River: Extremely Randomized Tree Versus Random Forest, MLPNN and MLR. In: Deo, R., Samui, P., Kisi, O., Yaseen, Z. (eds) Intelligent Data Analytics for Decision-Support Systems in Hazard Mitigation. Springer Transactions in Civil and Environmental Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-15-5772-9_5
DOI: https://doi.org/10.1007/978-981-15-5772-9_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5771-2
Online ISBN: 978-981-15-5772-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)