7.1 Introduction

Increased concern about environmental issues has encouraged specialists to develop modeling tools that can describe bioremediation and biodegradation processes (Nasr and Ismail 2015). These processes involve a high degree of nonlinearity in their physical, biological, and chemical reactions, as well as a large number of microorganisms (Gernaey et al. 2004). In addition, some aquatic systems are subjected to significant variations in wastewater flow rate, concentration, and composition, which require highly complex models to describe them (Nasr et al. 2012). The biological treatment of wastewater is also modeled to account for the multiple activities of carbonaceous removal, nitrification, and denitrification (Henze et al. 1987). The term “modeling” refers to expressing a particular system in mathematical language as a set of equations and variables (Nasr et al. 2017a). The developed models should simulate, predict, and control complex environmental systems, as well as the mechanisms of microbial processes, with reasonable accuracy (Jeppsson 1996). In addition, modeling of the treatment process is used to maintain the effluent quality within regulation-specified limits. Hence, modeling of environmental systems is considered an attractive and essential area of study.

Any environmental system can be described by either a white-box model (also known as a deterministic model), which is built from process knowledge, or a black-box model, which is used when process knowledge is lacking (Dreiseitl and Ohno-Machado 2002). For example, the activated sludge models, viz., ASM1, ASM2, ASM2d, and ASM3, are white-box models in which the process variables are translated into a set of differential equations (Gernaey et al. 2004). However, this type of model has some limitations concerning the validity of assumptions, availability of data for calibration, and prediction accuracy (Dreiseitl and Ohno-Machado 2002). Black-box models use only input-output data when process knowledge is insufficient to develop a white-box model (Yurtsever et al. 2015). For instance, artificial intelligence (AI) methods such as artificial neural networks (ANNs) and fuzzy inference systems (FISs) can be used as black-box models to provide accurate predictive tools for nonlinear and nonstationary processes (Fawzy et al. 2016). In practice, AI methods and white-box models can be combined in a hybrid scheme to provide a more reliable description of an individual system. AI tools can maximize the information obtained from data and operator experience, and this knowledge is then applied to enhance the system performance. For example, an ANN was employed to model the errors between simulated responses from the white-box model and the corresponding experimental data (Cote et al. 1995). Multivariate analysis is another appropriate black-box modeling tool, used as a statistical technique for process monitoring, assessment, recognition, and isolation (Molaie et al. 2014).

Other modeling methodologies, such as stochastic gray-box and hybrid models, are useful in bioremediation applications for estimating biomass activities and predicting effluent quality parameters (Hijosa-Valsero et al. 2011). In addition, Box-Jenkins models were employed to predict primary settler performance in a wastewater treatment plant (WWTP) (El-Din and Smith 2002). Multiple sub-models, such as a hydraulic model, an oxygen transfer model, and a clarifier model, can interact with the process knowledge included in white-box models to provide an accurate description of a full-scale WWTP. Several simulator environments, viz., AQUASIM, EFOR, GPS-X, MATLAB/Simulink, SIMBA, STOAT, and WEST, can be used for reliable simulation of a wastewater treatment process (Nasr et al. 2011).

An accurate model can be constructed by following these steps: objective identification (e.g., design, simulation, or control), model selection (e.g., activated sludge model, hydraulic model, or settler model), data collection and processing, calibration of the model parameters, model validation, and scenario evaluation (Khataee et al. 2010a, b). An incomplete understanding of the system behavior can be related to multiple sources, including influent and effluent data, physical properties, operational settings, system performance, and model structure (Jeppsson 1996). Model calibration is an iterative step in which the model parameters are adjusted until the model gives an acceptable description of a certain set of data (i.e., an improved fit) (Alves et al. 2014). The default parameters from the literature can be set as a starting point for calibration. Model calibration is affected by several factors, including the composition of influent wastewater, the degree of inhibition by toxic compounds, operational conditions, plant configuration, and population dynamics (Nasr and Ismail 2015). The number of inputs and outputs and the initial conditions for state variables should be defined, as they might influence the model parameters during calibration. In validation, a different data set (i.e., one that is not used for calibration) is compared with the responses of the developed model to confirm that its predictions are reliable (Alves et al. 2014).

Hence, the objective of this chapter is to cover different white-box and black-box models that have recently been employed for environmental bioremediation and biodegradation. Activated sludge models and adsorption isotherm studies are presented as white-box systems. On the other hand, multivariate statistical analysis and artificial intelligence are introduced as black-box models.

7.2 Biological Activity in Environmental Bioremediation

An adequate model of a biological treatment process is essential to provide a predictive tool that can minimize the operation costs and sustain the environmental balance (Gernaey et al. 2004). In biological treatment processes, microorganisms, mainly bacteria, are adapted to consume organics (substrate) in wastewater for their nutrition and metabolism (Eq. 7.1).

$$ \mathrm{Substrate}+\mathrm{microbial}\ \mathrm{cells}\to \mathrm{biomass}\ \mathrm{growth}+\mathrm{energy}\ \mathrm{production}+\mathrm{byproducts} $$
(7.1)

7.3 Growth of Bacterial Population

Figure 7.1 shows a batch processing system that contains an initial substrate concentration, So, and a biomass concentration, X. The system is operated under a completely mixed and aerobic condition, and thus, the dissolved oxygen (DO) level is not a limiting factor for microbial growth. Microbial cells utilize substrate for biomass growth, energy production, and by-product formation (Nasr et al. 2014a). Hence, during the course of an experiment, the substrate decreases (negative dS/dt) along with an increase in the biomass concentration (positive dX/dt).

Fig. 7.1 Substrate utilization and biomass growth in a completely mixed batch reactor supplied with oxygen

As displayed in Fig. 7.2, a plot of biomass concentration against time results in a growth curve that is composed of four distinct phases. These stages can be defined as follows (Nasr et al. 2017b):

  1. The lag phase, which occurs directly after inoculation of bacteria and lasts until the cells are adapted (acclimated) to the new environment.

  2. The exponential phase, in which the biomass concentration increases rapidly, representing the period of optimum growth.

  3. The stationary phase, which occurs when essential substrates, nutrients, or dissolved oxygen are depleted to certain limits, so that the population is neither growing nor declining.

  4. The death phase, in which some cells are destroyed by lysis, so that the net growth of biomass becomes negative.

Fig. 7.2 Typical growth curve for a batch system

7.4 Monod-Type Expression

The Monod equation (Eq. 7.2) is a kinetic model used to simulate microbial growth as a functional relationship between the specific growth rate and the substrate concentration (Rieger et al. 2001). The model can also be employed to predict the amount of substrate removal in treatment plants.

$$ \mu ={\mu}_{\mathrm{max}}\left(\frac{S}{S+{K}_S}\right) $$
(7.2)

where $\mu$ is the specific growth rate (d−1), $\mu_{\mathrm{max}}$ is the maximum specific growth rate (d−1), $S$ is the limiting substrate concentration (mg L−1), and $K_S$ is the half-saturation constant, i.e., the substrate concentration at which $\mu = 0.5\,\mu_{\mathrm{max}}$ (mg L−1). Once the relationship between $\mu$ and $S$ is determined, the bio-kinetic growth constants (i.e., $\mu_{\mathrm{max}}$ and $K_S$) can be computed statistically or graphically.
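One common graphical route to the two constants is the double-reciprocal linearization of Eq. 7.2, in which 1/μ is plotted against 1/S. The sketch below illustrates this idea only; the linearization choice, the helper names (`monod`, `linear_fit`, `fit_monod`), and the synthetic data are assumptions made for illustration and are not taken from the cited studies.

```python
def monod(S, mu_max, K_S):
    """Specific growth rate from the Monod expression (Eq. 7.2)."""
    return mu_max * S / (S + K_S)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_monod(S, mu):
    """Estimate mu_max and K_S from the double-reciprocal form
    1/mu = (K_S / mu_max) * (1/S) + 1/mu_max."""
    slope, intercept = linear_fit([1.0 / s for s in S], [1.0 / m for m in mu])
    mu_max = 1.0 / intercept
    K_S = slope * mu_max
    return mu_max, K_S


# Synthetic, noise-free data generated with assumed constants
# (mu_max = 4 1/d, K_S = 20 mg/L); these values are hypothetical.
S_data = [5.0, 10.0, 20.0, 50.0, 100.0]           # mg/L
mu_data = [monod(s, 4.0, 20.0) for s in S_data]   # 1/d
mu_max_hat, K_S_hat = fit_monod(S_data, mu_data)
```

With real measurements, the reciprocal transform amplifies the error of low-concentration points, so a nonlinear regression on Eq. 7.2 may be preferable.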

As seen in Eq. 7.3, the Monod formula can also be employed to estimate the bacterial growth rate (Gujer et al. 1999).

$$ \frac{\mathrm{dX}}{\mathrm{dt}}=\mu X $$
(7.3)

where dX/dt is the biomass growth rate (mg L−1 d−1) and X is the concentration of mixed liquor volatile suspended solids (mg L−1).

The stoichiometry between the utilized substrate and generated biomass can be expressed in Eq. 7.4 (Nasr and Ismail 2015).

$$ \frac{\mathrm{dX}}{\mathrm{dt}}=Y\frac{\mathrm{dS}}{\mathrm{dt}}-{k}_dX $$
(7.4)

where $Y$ is the cell yield coefficient (dimensionless), $k_d$ is the endogenous decay rate (d−1), and dS/dt denotes the rate of substrate utilization, taken here as a positive quantity.

As seen in Eq. 7.5, a plot of $\mu$ versus $U$ should give a straight line with a slope of $Y$ and an intercept of $-k_d$.

$$ \mu =Y\times U-{k}_d $$
(7.5)

where U is the specific substrate utilization rate (d−1).
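The interplay of Eqs. 7.2 to 7.4 in the batch reactor of Fig. 7.1 can be sketched numerically. The example below is a minimal forward-Euler simulation under stated assumptions: the Monod rate is treated as the gross specific growth rate, the substrate utilization rate is taken as μX/Y (a positive quantity), and all parameter values are hypothetical. The function name `simulate_batch` is illustrative only.

```python
def simulate_batch(S0, X0, mu_max, K_S, Y, k_d, t_end, dt=0.001):
    """Forward-Euler integration of substrate (S) and biomass (X) in a batch reactor.

    Returns a list of (t, S, X) tuples. Times are in days, concentrations in mg/L.
    """
    S, X, t = S0, X0, 0.0
    history = [(t, S, X)]
    for _ in range(int(round(t_end / dt))):
        mu = mu_max * S / (S + K_S)        # Eq. 7.2
        r_su = mu * X / Y                  # substrate utilization rate (assumed form)
        dX_dt = Y * r_su - k_d * X         # Eq. 7.4 with utilization taken as positive
        S = max(S - r_su * dt, 0.0)        # substrate cannot become negative
        X = X + dX_dt * dt
        t += dt
        history.append((t, S, X))
    return history


# Hypothetical parameter values chosen only to produce a readable curve.
run = simulate_batch(S0=200.0, X0=50.0, mu_max=4.0, K_S=20.0,
                     Y=0.5, k_d=0.05, t_end=2.0)
final_t, final_S, final_X = run[-1]
peak_X = max(x for _, _, x in run)
```

In this run the substrate is exhausted early, after which the biomass declines slowly through endogenous decay, which mirrors the later phases of the growth curve.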

7.5 Activated Sludge Models

In activated sludge processes, aerobic microorganisms are used to convert a significant portion of organic compounds into inorganic components, carbon dioxide, nitrogen, and new cells (Henze et al. 1999). The biomass consumes organic matters as a substrate in the presence of oxygen or other types of electron acceptors (e.g., nitrate). A portion of the precipitated suspended solids in the secondary settler is recycled to the bioreactor, while the remaining percentage is disposed of as waste sludge. An earlier study by Jeppsson (1996) has presented a review of the historical development of the activated sludge system.

7.5.1 Activated Sludge Model No. 1

Activated sludge model no. 1 (ASM1) was established as a simple mathematical tool to simulate, predict, and control the biological activities in wastewater treatment systems (Henze et al. 1987). The model includes the processes of carbon oxidation, nitrification, and denitrification, and it can be used to present a good description of the sludge production. The carbonaceous organic matter in ASM1 is defined as COD, and it is classified into biodegradable, non-biodegradable, and biomass (Gujer et al. 1999).

The biodegradable COD is divided into a readily biodegradable substrate ($S_S$) and a slowly biodegradable substrate ($X_S$). The microorganisms can directly metabolize the soluble molecules of $S_S$ for cellular growth and maintenance. However, the complex organic molecules of $X_S$ require enzymatic breakdown before assimilation and utilization (Cote et al. 1995). In practice, $X_S$ may be soluble, although it is incorporated into the model as a colloidal component.

The non-biodegradable COD is divided into soluble inert COD ($S_I$) and particulate inert COD ($X_I$). These components are deemed to be unaffected by the biological activities of microorganisms. $S_I$ leaves the biological system with the effluent untreated, whereas $X_I$ is enmeshed in the biomass and then removed from the system as excess sludge.

The active biomass is divided into heterotrophic biomass ($X_{B,H}$) and autotrophic biomass ($X_{B,A}$). Based on the death-regeneration model, the products resulting from microorganism decay are termed inert particulate products ($X_P$).

Hence, the total COD balance of ASM1 can be presented as Eq. 7.6.

$$ {\mathrm{COD}}_{\mathrm{tot}}={S}_I+{S}_S+{X}_I+{X}_S+{X}_{B,H}+{X}_{B,A}+{X}_P $$
(7.6)
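The balance in Eq. 7.6 is a simple sum, but writing it out helps when checking a wastewater characterization for missing fractions. The following sketch assumes the fractions are supplied as a dictionary in mg COD L−1; the component keys and the numerical values are hypothetical.

```python
ASM1_COD_COMPONENTS = ("S_I", "S_S", "X_I", "X_S", "X_BH", "X_BA", "X_P")


def total_cod(fractions):
    """Total COD as the sum of the seven ASM1 COD fractions (Eq. 7.6)."""
    missing = [name for name in ASM1_COD_COMPONENTS if name not in fractions]
    if missing:
        raise KeyError(f"missing COD fractions: {missing}")
    return sum(fractions[name] for name in ASM1_COD_COMPONENTS)


# Hypothetical characterization (mg COD/L), for illustration only.
sample = {"S_I": 30.0, "S_S": 60.0, "X_I": 40.0, "X_S": 120.0,
          "X_BH": 100.0, "X_BA": 5.0, "X_P": 10.0}
cod_tot = total_cod(sample)
biodegradable_share = (sample["S_S"] + sample["X_S"]) / cod_tot
```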

The nitrogen components in ASM1 are divided into nitrate and nitrite ($S_{NO}$), ammonia nitrogen ($S_{NH}$), soluble biodegradable organic nitrogen ($S_{ND}$), and particulate biodegradable organic nitrogen ($X_{ND}$). Dissolved oxygen ($S_O$) and alkalinity ($S_{ALK}$) are other components described in ASM1. $X_{B,A}$ consumes $S_O$ during aerobic growth for the oxidation of $S_{NH}$ to $S_{NO}$, whereas $X_{B,H}$ can grow under both aerobic and anoxic conditions. The aerobic and anoxic conditions are modeled through the uptake of $S_O$ and $S_{NO}$ as electron acceptors, respectively.

7.5.2 Activated Sludge Models No. 2 and 2d

Activated sludge model no. 2 (ASM2) was established to improve ASM1 by adding the phosphorus removal mechanism (Henze et al. 1995). ASM2 added a new group of biomass known as phosphorus-accumulating organisms (PAOs), which are capable of accumulating and storing phosphorus in their cells. Hence, the model can be used to simulate the performance of biological nutrient removal in activated sludge systems. Activated sludge model no. 2d (ASM2d) is based on ASM2, but it addresses the ability of PAOs ($X_{PAO}$) to utilize the products of internal cell organic storage for denitrification (Henze et al. 1999). Hence, ASM2d allows for a better explanation of the dynamics of phosphate and nitrate, and it can be successfully used to simulate biological phosphorus removal with a simultaneous nitrification-denitrification process.

7.5.3 Activated Sludge Model No. 3

Activated sludge model no. 3 (ASM3) added some updates to ASM1, such as the inclusion of storage polymers in the heterotrophic activated sludge conversions and the use of the growth-endogenous respiration model to represent biomass decay (Gujer et al. 1999). ASM3 assumes that the readily biodegradable substrate ($S_S$) is first stored in an internal cell structure as $X_{STO}$, which is followed by biomass growth. ASM3 can predict oxygen consumption, nitrification, denitrification, and sludge production of activated sludge systems. In addition, the ASM3 model can be extended for simulation of biological phosphorus removal (Rieger et al. 2001).

7.6 Adsorption in Environmental Bioremediation

Adsorption is a physicochemical process in which a solid material (the adsorbent) takes up metal ions (the adsorbate) from aqueous solution. The adsorption process can be described by the following models:

7.7 Langmuir Adsorption Isotherm

The Langmuir isotherm is used to quantitatively describe the transfer of metal ions between the solid phase and liquid phase at an equilibrium state. The Langmuir equation was formulated based on the following assumptions (Langmuir and Waugh 1940): (1) maximum monolayer coverage, i.e., a single layer of adsorbate is formed on the outer surface of the adsorbent; (2) the surface of the solid phase contains a finite number of vacant sites, and each site can hold only one molecule; (3) the surface of the solid is homogeneous, i.e., adsorption sites are identical in size and shape, and they have similar affinity for adsorbate molecules; (4) after adsorption, there is no transmigration of adsorbate in the plane of the surface and no interaction between adsorbate molecules; and (5) coverage-independent binding energy, i.e., the heat of adsorption is the same (uniform) for each site.

As seen in Eq. 7.7, a linear plot of $C_e/q_e$ versus $C_e$ provides a slope of $1/q_m$ and an intercept of $1/(K_L q_m)$.

$$ \frac{C_e}{q_e}=\left(\frac{1}{q_m}\right){C}_e+\frac{1}{K_L{q}_m} $$
(7.7)

where $C_e$ is the equilibrium concentration of adsorbate (mg L−1), $q_e$ is the amount of metal adsorbed per gram of the adsorbent at equilibrium (mg g−1), $q_m$ is the maximum monolayer coverage capacity (mg g−1), and $K_L$ is the Langmuir isotherm constant (L mg−1).

Based on the Langmuir-type adsorption process, the isotherm shape can be classified by a dimensionless constant separation factor (Eq. 7.8).

$$ r=\frac{1}{1+{K}_L{C}_o} $$
(7.8)

where $r$ is a dimensionless separation factor, $K_L$ is the Langmuir constant (L mg−1), and $C_o$ is the initial concentration (mg L−1).

The factor “r” describes the shape of the isotherm according to the following classification: r > 1 “unfavorable,” r = 1 “linear,” 0 < r < 1 “favorable,” and r = 0 “irreversible.”
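The linear Langmuir plot (Eq. 7.7) and the separation factor (Eq. 7.8) can be evaluated together. The sketch below is one possible implementation under stated assumptions: the helper names are hypothetical, the equilibrium data are synthetic and noise-free, and the classification thresholds follow the list given above.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_langmuir(Ce, qe):
    """Fit Ce/qe versus Ce (Eq. 7.7): slope = 1/q_m, intercept = 1/(K_L q_m)."""
    slope, intercept = linear_fit(Ce, [c / q for c, q in zip(Ce, qe)])
    q_m = 1.0 / slope
    K_L = slope / intercept
    return q_m, K_L


def separation_factor(K_L, C_o):
    """Dimensionless separation factor r (Eq. 7.8)."""
    return 1.0 / (1.0 + K_L * C_o)


def classify_isotherm(r):
    """Isotherm shape according to the value of r."""
    if math.isclose(r, 0.0, abs_tol=1e-12):
        return "irreversible"
    if math.isclose(r, 1.0):
        return "linear"
    if 0.0 < r < 1.0:
        return "favorable"
    if r > 1.0:
        return "unfavorable"
    raise ValueError("r must be non-negative")


# Synthetic data from assumed constants q_m = 50 mg/g and K_L = 0.1 L/mg.
Ce_data = [1.0, 5.0, 10.0, 25.0, 50.0]
qe_data = [50.0 * 0.1 * c / (1.0 + 0.1 * c) for c in Ce_data]
q_m_hat, K_L_hat = fit_langmuir(Ce_data, qe_data)
r_value = separation_factor(K_L_hat, 100.0)   # assumed C_o = 100 mg/L
shape = classify_isotherm(r_value)
```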

7.8 Freundlich Adsorption Isotherm

The Freundlich model is based on the adsorption process for a single-solute system on heterogeneous surfaces (Freundlich 1906). The model describes the distribution of a solute between the solid phase and aqueous phase at equilibrium (Ng et al. 2002). The model presumes an exponential variation in site energies and that the surface adsorption is not a rate-limiting step (Mattson and Mark 1971). In addition, the Freundlich model does not follow Henry’s law of ideal dilute solutions. As seen in Eq. 7.9, a plot of $\log(q_e)$ versus $\log(C_e)$ gives a straight line with a slope of $1/n$ and an intercept of $\log(K_F)$.

$$ \log \left({q}_e\right)=\left(\frac{1}{n}\right)\log \left({C}_e\right)+\log \left({K}_F\right) $$
(7.9)

where $K_F$ is the Freundlich constant related to the adsorption capacity ((mg g−1)(L mg−1)$^{1/n}$) and $1/n$ is a dimensionless exponent that describes the adsorption intensity or surface heterogeneity.

The Freundlich exponent “1/n” indicates the type of isotherm; i.e., the isotherm is favorable at 0 < 1/n < 1, irreversible at 1/n = 0, and unfavorable at 1/n > 1 (Saruchi and Kumar 2016). In addition, the value 1/n < 1 implies the chemisorption process, whereas 1/n > 1 indicates a cooperative process. Moreover, if 1/n is close to zero, the sorbent surface is more heterogeneous; otherwise, the more homogeneous the surface, the closer 1/n value is to unity.
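The log-log regression of Eq. 7.9 can be sketched in a few lines. The example assumes base-10 logarithms, uses hypothetical helper names, and relies on synthetic data generated from assumed constants rather than on measurements from the cited studies.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_freundlich(Ce, qe):
    """Fit log(qe) versus log(Ce) (Eq. 7.9): slope = 1/n, intercept = log(K_F)."""
    slope, intercept = linear_fit([math.log10(c) for c in Ce],
                                  [math.log10(q) for q in qe])
    K_F = 10.0 ** intercept
    return K_F, slope


# Synthetic data from assumed constants K_F = 2 and 1/n = 0.5.
Ce_data = [1.0, 4.0, 9.0, 16.0, 25.0]
qe_data = [2.0 * c ** 0.5 for c in Ce_data]
K_F_hat, inv_n_hat = fit_freundlich(Ce_data, qe_data)
```

Here the recovered exponent lies between 0 and 1, which corresponds to the favorable case in the classification above.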

7.9 Dubinin-Radushkevich (D–R) Isotherm Model

The D–R isotherm corresponds to the Gaussian energy distribution multiplied by the first power of the adsorption energy (Rudziński et al. 1974). The model expresses the overall adsorption isotherm in the sub-monolayer coverage region for a variety of heterogeneous surfaces. This isotherm is temperature dependent, and it is valid for physical adsorption processes involving van der Waals forces (Boparai et al. 2011). As observed in Eq. 7.10, a linear plot of $\ln(q_e)$ versus $\varepsilon^2$ gives a slope of $-\beta$ and an intercept of $\ln(q_o)$.

$$ \ln \left({q}_e\right)=\ln \left({q}_o\right)-{\beta \varepsilon}^2 $$
(7.10)

where $q_o$ is the D–R constant related to the saturation capacity (mg g−1), $\beta$ is the activity coefficient (mol2 kJ−2), and $\varepsilon$ is the Polanyi potential (kJ mol−1).

The value of $\varepsilon$ is calculated from Eq. 7.11.

$$ \varepsilon =RT\ln \left[1+\frac{1}{C_e}\right] $$
(7.11)

where $R$ is the gas constant, equal to 8.314 J mol−1 K−1 (i.e., $8.314 \times 10^{-3}$ kJ mol−1 K−1 when $\varepsilon$ is expressed in kJ mol−1), and $T$ is the absolute temperature (K).

The value of β is used to determine the mean sorption energy, as seen in Eq. 7.12.

$$ E=\frac{1}{\sqrt{2\beta }} $$
(7.12)

where $E$ is the mean sorption energy (kJ mol−1).
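Equations 7.10 to 7.12 chain together naturally: the Polanyi potential is computed first, the regression then yields β and q_o, and β finally gives E. The sketch below assumes that R is expressed in kJ mol−1 K−1 so that β comes out in mol2 kJ−2 and E in kJ mol−1. The helper names, temperature, and synthetic data are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K), so that epsilon is in kJ/mol


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def polanyi_potential(Ce, T):
    """Polanyi potential (Eq. 7.11)."""
    return R_KJ * T * math.log(1.0 + 1.0 / Ce)


def fit_dubinin_radushkevich(Ce, qe, T):
    """Fit ln(qe) versus epsilon^2 (Eq. 7.10) and derive E (Eq. 7.12)."""
    eps_sq = [polanyi_potential(c, T) ** 2 for c in Ce]
    slope, intercept = linear_fit(eps_sq, [math.log(q) for q in qe])
    beta = -slope
    q_o = math.exp(intercept)
    E = 1.0 / math.sqrt(2.0 * beta)
    return q_o, beta, E


# Synthetic data from assumed constants q_o = 40 mg/g and beta = 0.005 mol^2/kJ^2.
T_K = 298.15
Ce_data = [1.0, 5.0, 10.0, 50.0]
qe_data = [40.0 * math.exp(-0.005 * polanyi_potential(c, T_K) ** 2) for c in Ce_data]
q_o_hat, beta_hat, E_hat = fit_dubinin_radushkevich(Ce_data, qe_data, T_K)
```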

7.10 Temkin Isotherm Model

The Temkin isotherm model describes the effect of adsorbent–adsorbate interactions (Temkin 1941). The Temkin isotherm presumes that the heat of adsorption decreases linearly rather than logarithmically with coverage, while ignoring extremely low and very high concentration values (Aljeboree et al. 2014). In addition, the model suggests that adsorption is characterized by a uniform distribution of binding energies, up to a maximum binding energy. As seen in Eq. 7.13, the linear plot of $q_e$ versus $\ln(C_e)$ gives a slope of $RT/b$ and an intercept of $(RT/b)\ln(A)$ (Boparai et al. 2011).

$$ {q}_e=\frac{\mathrm{RT}}{b}\ln (A)+\frac{\mathrm{RT}}{b}\ln \left({C}_e\right) $$
(7.13)

where $b$ is the Temkin constant related to the heat of adsorption (J mol−1) and $A$ is the equilibrium constant associated with the maximum binding energy (L g−1).
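The Temkin constants follow directly from the slope and intercept of Eq. 7.13. The sketch below shows one way to recover them; the helper names, temperature, and synthetic data are assumptions for illustration.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_temkin(Ce, qe, T):
    """Fit qe versus ln(Ce) (Eq. 7.13): slope = RT/b, intercept = (RT/b) ln(A)."""
    slope, intercept = linear_fit([math.log(c) for c in Ce], qe)
    b = R * T / slope
    A = math.exp(intercept / slope)
    return b, A


# Synthetic data from assumed constants b = 500 J/mol and A = 2.
T_K = 298.15
Ce_data = [1.0, 5.0, 10.0, 25.0, 50.0]
qe_data = [(R * T_K / 500.0) * (math.log(2.0) + math.log(c)) for c in Ce_data]
b_hat, A_hat = fit_temkin(Ce_data, qe_data, T_K)
```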

7.11 Pseudo-First-Order Kinetic

Equation 7.14 expresses the linear form of the pseudo-first-order model. A plot of $\ln(q_e - q_t)$ against $t$ gives a straight line with a slope of $-k_1$ and an intercept of $\ln(q_e)$ (Ho and McKay 1999).

$$ \ln \left({q}_e-{q}_t\right)=-{k}_1\times t+\ln \left({q}_e\right) $$
(7.14)

where $q_e$ and $q_t$ are the amounts of adsorbed solute at equilibrium and at time $t$, respectively, and $k_1$ is the pseudo-first-order rate constant.

7.12 Pseudo-Second-Order Kinetic

The applicability of the pseudo-second-order model indicates that the adsorption process follows a chemical interaction, also known as chemisorption (Atkins 1995). The linear form of the pseudo-second-order model can be expressed as Eq. 7.15 (Fawzy et al. 2016). The values of $t/q_t$ are computed from the kinetic data and plotted against time, which provides a straight line with a slope of $1/q_e$ and an intercept of $1/(k_2 q_e^2)$.

$$ \frac{t}{q_t}=\frac{t}{q_e}+\frac{1}{k_2\cdot {q}_e^2} $$
(7.15)

where $k_2$ is the pseudo-second-order kinetic rate constant.
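Both linearized kinetic models (Eqs. 7.14 and 7.15) reduce to a straight-line fit, so they can be sketched side by side. In the example below, the pseudo-first-order fit needs an experimentally determined equilibrium uptake (`qe_exp`), whereas the pseudo-second-order fit estimates the equilibrium uptake from the slope. The function names, time units (min), and synthetic data are assumptions.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_pseudo_first_order(t, qt, qe_exp):
    """Fit ln(qe - qt) versus t (Eq. 7.14); every qt must be below qe_exp."""
    slope, intercept = linear_fit(t, [math.log(qe_exp - q) for q in qt])
    k1 = -slope
    qe_fit = math.exp(intercept)
    return k1, qe_fit


def fit_pseudo_second_order(t, qt):
    """Fit t/qt versus t (Eq. 7.15): slope = 1/qe, intercept = 1/(k2 qe^2)."""
    slope, intercept = linear_fit(t, [ti / q for ti, q in zip(t, qt)])
    qe_fit = 1.0 / slope
    k2 = slope ** 2 / intercept
    return k2, qe_fit


t_data = [5.0, 10.0, 20.0, 40.0, 60.0]  # min

# Synthetic first-order data (assumed qe = 20 mg/g, k1 = 0.05 1/min).
qt_first = [20.0 * (1.0 - math.exp(-0.05 * ti)) for ti in t_data]
k1_hat, qe1_hat = fit_pseudo_first_order(t_data, qt_first, qe_exp=20.0)

# Synthetic second-order data (assumed qe = 20 mg/g, k2 = 0.01 g/(mg min)).
qt_second = [0.01 * 20.0 ** 2 * ti / (1.0 + 0.01 * 20.0 * ti) for ti in t_data]
k2_hat, qe2_hat = fit_pseudo_second_order(t_data, qt_second)
```

With experimental data, the model whose linear plot gives the higher coefficient of determination is usually taken as the better description of the kinetics.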

7.13 Statistical Analysis in Environmental Bioremediation

7.13.1 Design of Experiments

The design of experiments is a statistical technique used to determine the effects of several input factors on a response. Experimental design can be classified into three approaches, namely, one-factor-at-a-time, full factorial, and fractional factorial (Khataee et al. 2010a, b). In the one-factor-at-a-time approach, only one factor (variable) is varied at a time while the other inputs are held constant. In contrast, a factorial design examines two or more factors simultaneously. The factorial design is developed for optimization purposes, as it describes the effect of each factor on the response as well as the interaction effects between factors (Nasr et al. 2017a). Hence, this method can reduce the number of experiments, save time and cost, and give accurate outputs.

A full factorial design with $n$ factors and $m$ levels for each factor is denoted as an $m^n$ factorial experiment (Elhalil et al. 2016). For example, a full factorial design denoted as $2^3$ has three factors, each with two levels; hence, the number of experiments is $2^3 = 8$. Similarly, a $3^2$ factorial design has two factors, each with three levels, and $3^2 = 9$ experimental runs. In this context, a full factorial design covers all possible combinations of factor levels in a single experiment, and it can be used to build a response surface.
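The run list of a full factorial design is simply the Cartesian product of the factor levels. The sketch below enumerates a $2^3$ and a $3^2$ design in coded units; the factor names and the function `full_factorial` are hypothetical.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """Return every combination of factor levels as a list of dicts (one per run)."""
    names = list(levels_per_factor)
    combos = product(*(levels_per_factor[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# 2^3 design: three factors at two coded levels each.
design_2_3 = full_factorial({"A": [-1, 1], "B": [-1, 1], "C": [-1, 1]})

# 3^2 design: two factors at three coded levels each.
design_3_2 = full_factorial({"factor_1": [-1, 0, 1], "factor_2": [-1, 0, 1]})
```

In practice, the run order would normally be randomized before the experiments are carried out.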

The fractional factorial experiment includes only the most relevant combinations of the variables (Cristóvão et al. 2015). This approach is useful when the number of factors is large, as it provides information about the main effects and the most important interactions with a reduced number of experiments. The design of experiments can also be performed using central composite, Box-Behnken, Plackett-Burman, and Taguchi designs.

Results from the design of experiments can be graphed using a box plot, which describes each variable by four components as follows (Nasr et al. 2012): (1) a central line in each box is the sample median, which represents the 50th percentile of the data; (2) a box represents the variability around the median, where the bottom and top of each box are the 25th and 75th percentiles of the sample, respectively; (3) the whiskers are lines extending above and below each box, which express the range of the variable; and (4) outliers, marked with a + sign beyond the top or bottom of the whiskers, are values that lie more than 1.5 times the interquartile range away from the top or bottom of the box.
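The box plot components can be computed directly from a sample. The sketch below uses the standard-library `statistics.quantiles` function with the inclusive method; other software may use slightly different quartile definitions, so the numbers can differ marginally between tools. The function name `box_plot_summary` and the sample values are hypothetical.

```python
import statistics


def box_plot_summary(data):
    """Median, quartiles, whisker ends, and outliers using the 1.5 x IQR rule."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # lowest value that is not an outlier
        "whisker_high": max(inside),   # highest value that is not an outlier
        "outliers": outliers,
    }


# Hypothetical sample with one unusually high value.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
```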

Cristóvão et al. (2015) employed a $3^2$ factorial design to determine the effect of hydraulic retention time (HRT) and initial dissolved organic carbon (DOCi) on DOC removal by activated sludge treatment. The HRT had three levels of 4.20 h (−1), 6.15 h (0), and 8.10 h (+1), whereas the three levels of DOCi were 200 mg L−1 (−1), 500 mg L−1 (0), and 800 mg L−1 (+1). Results from the factorial design revealed that the optimum HRT and DOCi were 6.42 h and 406.2 mg L−1, respectively, which achieved a DOC removal of 88.0%. The proposed model achieved an $r^2$ of 0.98463 and an adjusted $r^2$ of 0.95902.

Elhalil et al. (2016) developed a $2^4$ full factorial experimental design to estimate the effects of four factors, viz., malachite green dye (10 and 20 mg L−1), Fe2+ (5 and 10 mM), H2O2 (25.6 and 51.2 mM), and temperature (27 and 40 °C), on the degradation of the dye by a Fenton process. Results from the factorial design indicated that the optimum malachite green dye, Fe2+, H2O2, and temperature were 10 mg L−1, 10 mM, 25.6 mM, and 40 °C, respectively, which attained a removal efficiency of 93.83%. The model validity and practicability were confirmed by an $r^2$ of 0.986 and an adjusted $r^2$ of 0.889.

A study by Khataee et al. (2010a, b) developed a central composite design (CCD) to determine the influences of operational factors on biological dye removal in the presence of Chara culture. Their study indicated that the optimum conditions that achieved the maximum decolorization efficiency were an initial pH of 6.8, a dye concentration of 9.7 mg L−1, an algae weight of 3.9 g, and a contact time of 75 min. The predicted results were in good agreement with experimental data ($r^2$: 0.982 and adjusted $r^2$: 0.966).

7.14 Clustering

Clustering analysis attempts to classify a set of records into a number of meaningful groups (Ferati et al. 2015). A reliable clustering result produces a high similarity between the observations inside the same cluster and a low similarity between the data in different groups. Kohonen networks, also known as self-organizing feature maps (SOFMs), are used to realize maximum separation between the data in different clusters using nonlinear activation functions (Kuo et al. 2005). K-means clustering is another algorithm, which splits data points into a fixed number (k) of clusters based on the centroid of each cluster. Recently, clustering analysis has been employed for monitoring and assessment in environmental studies, including natural resource management (e.g., agriculture, fisheries, and forests) and pollution risks to living organisms.
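The centroid-based idea behind K-means can be sketched as an alternation between assigning each point to its nearest centroid and recomputing the centroids. The example below is a minimal version under stated assumptions: the initial centroids are supplied by the caller, the distance is Euclidean, and the two-dimensional sample points are hypothetical (they could stand for two scaled water quality parameters at six sites).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iter=100):
    """Return (labels, centroids) after iterating assignment and update steps."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)),
                      key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: mean of the members of each cluster.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical scaled measurements at six sampling sites.
sites = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = k_means(sites, initial_centroids=[sites[0], sites[3]])
```

The result of K-means depends on the initial centroids, so several starts are commonly compared in practice.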

Kamble and Vijay (2011) applied cluster analysis for assessment of water quality in the coastal region of Mumbai, India. Six water quality parameters, i.e., turbidity, DO, BOD, NH3-N, PO4, and FC, were measured at 17 sampling sites during post-monsoon, winter, and pre-monsoon. The sampling areas were classified into three major groups, viz., cluster-I “less polluted sites,” cluster-II “moderately polluted sites,” and cluster-III “highly polluted sites.” Based on seawater standards, Mahim was the worst-affected beach because of an incoming organic load from the Mithi River, and thus, it was grouped in cluster-III.

Ferati et al. (2015) applied cluster analysis to handle a large dataset of eight heavy metals, i.e., As, Cd, Cr, Co, Cu, Ni, Pb, and Zn, collected from six locations along the Trepça and Sitnica rivers during April–July 2014. Water and sediment samples were collected from each site and analyzed for heavy metal concentrations. Cluster analysis specified two major distinct clusters with three groups, suggesting that the metal contamination resulted from anthropogenic sources.

7.15 Decision Trees

The decision tree is a hierarchical model used to represent classifications from a set of independent variables (called attributes) by following a structure of nodes, branches (links), and leaves (D’heygere et al. 2003). Decision trees are also flexible in handling both categorical and numerical data by finding the set of decision rules. A decision tree signifies each input attribute as a node, where the branches going downward from each node are the possible values that the data can be assigned (Hijosa-Valsero et al. 2011). A root node is the topmost decision node in the tree, which has no incoming edges. Leaf nodes, also known as terminal nodes, are situated at the bottom of the tree, and they predict outcomes expressed as class labels.
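The node, branch, and leaf structure can be illustrated with a small hand-built tree. The sketch below only shows how a prediction is obtained by walking from the root to a leaf; the attributes, thresholds, and class labels are invented for illustration and do not come from any of the cited studies.

```python
# Each decision node tests one numeric attribute against a threshold;
# each leaf carries a class label. All values here are hypothetical.
TREE = {
    "attribute": "dissolved_oxygen",      # root node
    "threshold": 5.0,
    "low": {"label": "absent"},           # leaf
    "high": {
        "attribute": "conductivity",
        "threshold": 800.0,
        "low": {"label": "present"},      # leaf
        "high": {"label": "absent"},      # leaf
    },
}


def predict(node, sample):
    """Follow the branches from the root until a leaf is reached."""
    while "label" not in node:
        value = sample[node["attribute"]]
        node = node["low"] if value <= node["threshold"] else node["high"]
    return node["label"]


prediction = predict(TREE, {"dissolved_oxygen": 7.5, "conductivity": 450.0})
```

In a learned tree, the attributes and thresholds would be selected from data, for example by maximizing the purity of the resulting subsets at each split.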

D’heygere et al. (2003) built a decision tree model to predict the absence or presence of benthic macroinvertebrate taxa in the non-navigable watercourses of Flanders using 15 physical-chemical, structural, and ecotoxicological variables as independent variables. The study found that conductivity and dissolved oxygen were the most relevant input attributes, as they were placed at the top of the tree.

Hijosa-Valsero et al. (2011) applied a decision tree method to predict the removal efficiency of pharmaceuticals and personal care products (PPCPs) and organics from wastewater using constructed wetlands. Their study indicated that DO, temperature, pH, conductivity, and redox potential influenced the removal of the investigated compounds.

7.16 Artificial Intelligence in Environmental Bioremediation

7.16.1 Artificial Neural Network

An artificial neural network (ANN) is a computer-based system proposed to mimic the learning process of nerve cells in the human brain (Nasr and Zahran 2014). An ANN is quite robust, since it can achieve a high degree of prediction accuracy even when it receives erroneous and noisy datasets. It can also be employed to organize, classify, and summarize data sets, owing to its ability to approximate nonlinear functions and to capture complex relationships (Yurtsever et al. 2015). An ANN contains a large number of interconnected neurons (also called nodes), which are organized in layers comprising a first (input) layer, hidden layers, and a last (output) layer. Adaptive weights, biases, and transfer functions are used to interconnect the neurons and layers within a network (Yurtsever et al. 2015). The predicted output is obtained by passing the ANN through the steps of training, validation, and testing. The optimal number of neurons and layers and the adjusted network parameters (weights and biases) are determined according to the minimum mean squared error (MSE) of the training and validation sets (Nasr et al. 2012).
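The roles of weights, biases, transfer functions, and the MSE can be seen in a single forward pass. The sketch below assumes one hidden layer with a tanh transfer function and a linear output neuron; the 2-3-1 structure, the weight values, and the data are hypothetical, and no training step is shown.

```python
import math


def forward(x, W1, b1, W2, b2):
    """Forward pass: inputs -> tanh hidden layer -> single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2


def mean_squared_error(predicted, observed):
    """Average of the squared differences between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical 2-3-1 network (two inputs, three hidden neurons, one output).
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]   # hidden-layer weights
b1 = [0.0, 0.1, -0.1]                          # hidden-layer biases
W2 = [0.7, -0.5, 0.2]                          # output-layer weights
b2 = 0.05                                      # output-layer bias

inputs = [[0.2, 0.9], [0.5, 0.5], [0.8, 0.1]]  # scaled input patterns
observed = [0.30, 0.25, 0.20]                  # scaled target values
predicted = [forward(x, W1, b1, W2, b2) for x in inputs]
mse = mean_squared_error(predicted, observed)
```

Training would adjust `W1`, `b1`, `W2`, and `b2` iteratively (for example by back-propagation) so that the MSE on the training and validation sets decreases.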

Recently, several articles have attempted to apply ANNs for modeling the biological reactions in the area of environmental engineering. For example, Khataee et al. (2009) used an ANN model to predict the removal efficiency of BG4 dye using inputs of dye concentration (2.5–15.0 mg L−1), temperature (5–45 °C), pH (2–11), reaction time (0–420 min), and algal concentration ($1.5 \times 10^6$–$9.0 \times 10^6$ mg L−1). The study used a three-layered feed-forward back-propagation neural network with a structure of 5-16-1. The proposed ANN provided a reasonable predictive performance with a coefficient of determination ($r^2$) of 0.979.

Khataee et al. (2010a, b) developed a three-layer ANN with a back-propagation algorithm to predict the biological degradation efficiency of Malachite Green (MG). The input factors were reaction time (0–420 min), pH (1.5–8.5), temperature (5–45 °C), dye concentration (2.5–17.5 mg L−1), and algae dosage (0.5–6.0 g), whereas the output was dye removal efficiency (0–100%). Their study found that the network (5-12-1) attained a coefficient of determination ($r^2$) of 0.970, indicating that the model could reliably predict the dye removal efficiency.

Prakash et al. (2008) applied an ANN to predict the biosorption efficiency of sawdust for the removal of Cu(II) ions. The input attributes were Cu(II) ion concentration, 50–80 mg L−1; pH, 3–6; temperature, 25–40 °C; and particle size, 50–200 μm, while the output parameter was the percent sorption efficiency. The network used a simple back-propagation recurrent algorithm with three hidden layers, equivalent to a structure of 4-50-40-27-1. The ANN model closely tracked the experimental data and achieved an average MSE of 0.002139579.

Yurtsever et al. (2015) employed a fast ANN to predict the Cd(II) ion adsorption rate using six inputs: initial pH, temperature, agitation speed, particle size, Cd(II) concentration, and reaction time. The model architecture was composed of four layers (6-25-5-1), viz., an input layer, a first hidden layer with 25 neurons, a second hidden layer with five neurons, and an output layer. The resulting model achieved an accurate prediction of Cd(II) ion removal with an $r^2$ of 0.999. In addition, the proposed ANN model was found to be more promising for modeling the Cd(II) adsorption than conventional isotherm and kinetic studies.

7.17 Fuzzy Inference System

A fuzzy inference system (FIS) is used to represent a nonlinear mapping from an input space to an output space (Zadeh 1997). A FIS is composed of four major parts, namely, fuzzification, “If-then” rules, an inference engine, and defuzzification. The engine that handles these components can be Mamdani fuzzy inference, Sugeno fuzzy inference, or Tsukamoto fuzzy inference (Alalm et al. 2016). Figure 7.3 displays the general architecture and components of the FIS, which can be described as follows (Nasr et al. 2014b):

Fig. 7.3 A fuzzy inference system

A crisp set of input data is collected, prepared, and then converted into a fuzzy set using linguistic variables, linguistic terms, and membership functions. These steps are known as fuzzification, in which a crisp (numeric) value is converted into a fuzzy input (Giusti and Marsili-Libelli 2010). For example, a fuzzy variable (e.g., temperature) can be defined in terms of linguistic concepts such as low, medium, and high. Each linguistic concept is expressed graphically as a membership function, such as a linear, Gaussian, or trapezoidal fuzzy set.
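The fuzzification step can be sketched as follows. The membership-function shapes are standard, but the breakpoints chosen for the "low", "medium", and "high" temperature terms are hypothetical values selected for illustration and are not taken from any of the cited studies.

```python
import math


def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising linearly to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), equal to 1 on the plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, mean, sigma):
    """Gaussian membership centered at mean with spread sigma."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def fuzzify_temperature(t):
    """Map a crisp temperature (deg C) to degrees of membership (hypothetical breakpoints)."""
    return {
        "low": trapezoidal(t, 0.0, 5.0, 15.0, 25.0),
        "medium": triangular(t, 15.0, 25.0, 35.0),
        "high": trapezoidal(t, 25.0, 35.0, 45.0, 50.0),
    }


memberships = fuzzify_temperature(20.0)
```

Under these assumed breakpoints, a crisp reading of 20 °C belongs partly to "low" and partly to "medium", which is the graded overlap that distinguishes a fuzzy set from a crisp classification.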

Subsequently, an inference engine is organized using a set of “If-then” rules. A single fuzzy “If-then” rule has the form “If x is A Then y is B.” If x and y are the variables “temperature” and “dissolved oxygen,” respectively, and A and B are the linguistic terms “low” and “high,” respectively, then the “If-then” rule takes the form “If Temperature is Low Then Dissolved Oxygen is High.” Note that the outputs of the individual rules are aggregated when the rule base contains more than one “If-then” rule (Fawzy et al. 2016).

Finally, a defuzzification step is used to convert the resulting fuzzy output into a single crisp number. Several defuzzification methods are available in the literature, such as max-membership, center of gravity, weighted average, mean-max, and center of sums (Gupta et al. 2017).
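Rule evaluation, aggregation, and center-of-gravity defuzzification can be sketched together in a small Mamdani-type example. The first rule is the one quoted above ("If Temperature is Low Then Dissolved Oxygen is High"); the second rule, the membership breakpoints, and the 0–10 mg L−1 output range are assumptions added only so that aggregation has more than one rule to combine.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def temp_low(t):
    """Degree to which temperature (deg C) is 'low': 1 below 15, 0 above 30 (assumed)."""
    return max(0.0, min(1.0, (30.0 - t) / 15.0))


def temp_high(t):
    """Degree to which temperature (deg C) is 'high': 0 below 20, 1 above 35 (assumed)."""
    return max(0.0, min(1.0, (t - 20.0) / 15.0))


def do_low(y):
    """Output set 'low' dissolved oxygen on an assumed 0-10 mg/L universe."""
    return tri(y, 0.0, 2.5, 5.0)


def do_high(y):
    """Output set 'high' dissolved oxygen on an assumed 0-10 mg/L universe."""
    return tri(y, 5.0, 7.5, 10.0)


# Rule base: (antecedent on temperature, consequent on dissolved oxygen).
RULES = [
    (temp_low, do_high),   # If Temperature is Low Then Dissolved Oxygen is High
    (temp_high, do_low),   # hypothetical complementary rule
]


def infer_dissolved_oxygen(t, n_points=1001):
    """Mamdani inference: min implication, max aggregation, centroid defuzzification."""
    ys = [10.0 * i / (n_points - 1) for i in range(n_points)]
    strengths = [(antecedent(t), consequent) for antecedent, consequent in RULES]
    # Clip each consequent at its rule's firing strength, then take the pointwise maximum.
    aggregated = [max(min(w, cons(y)) for w, cons in strengths) for y in ys]
    total = sum(aggregated)
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    # Center of gravity of the aggregated fuzzy set, on a discretized universe.
    return sum(y * m for y, m in zip(ys, aggregated)) / total


crisp_cold = infer_dissolved_oxygen(10.0)
crisp_warm = infer_dissolved_oxygen(40.0)
```

At a temperature where only one rule fires, the crisp output sits at the centroid of that rule's consequent; where both rules fire, the output moves between the two according to their firing strengths.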

Giusti and Marsili-Libelli (2010) developed a Sugeno fuzzy model to control the maximum in-cycle temperature (output) by adjusting airflow (input) in a composting process. The model consisted of three fuzzy rules corresponding to three possible in-cycle temperature trends (varying between mesophilic and thermophilic phases). Results from their study revealed that the proposed model could explain 95.46% of the observed variance, indicating good agreement with the studied cycles.

Gupta et al. (2017) employed a FIS to predict the extraction yields of lipids, carbohydrates, and proteins from microalgae using input factors of specific growth rate, carrying capacity, and physiological health. The study used a Sugeno-type FIS with eight IF-THEN rules, in which the membership functions of the input and output variables were Gaussian and linear, respectively. The fuzzy model provided high coefficients of determination (r² > 0.98), and it was concluded that an increase in microalgae concentration resulted in higher lipid and carbohydrate extractions but lower protein yields.
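A Sugeno-type FIS with Gaussian input membership functions and linear rule outputs can be sketched as below. This is not the model of Gupta et al. (2017): for brevity it uses two normalized inputs and four rules rather than three inputs and eight rules, and every membership parameter and consequent coefficient is a hypothetical value.

```python
import math


def gaussian(x, mean, sigma):
    """Gaussian membership centered at mean with spread sigma."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Linguistic terms for each input, as (mean, sigma) on a normalized 0-1 scale (assumed).
TERMS = {"low": (0.0, 0.4), "high": (1.0, 0.4)}

# One rule per combination of terms; each consequent is linear:
# z = c0 + c1 * x1 + c2 * x2 (coefficients are hypothetical).
RULES = {
    ("low", "low"): (0.0, 0.2, 0.2),
    ("low", "high"): (0.1, 0.2, 0.4),
    ("high", "low"): (0.1, 0.4, 0.2),
    ("high", "high"): (0.2, 0.4, 0.4),
}


def sugeno(x1, x2):
    """First-order Sugeno inference: firing-strength-weighted average of linear outputs."""
    numerator = 0.0
    denominator = 0.0
    for (term1, term2), (c0, c1, c2) in RULES.items():
        # Product is used as the AND operator between the two antecedents.
        w = gaussian(x1, *TERMS[term1]) * gaussian(x2, *TERMS[term2])
        z = c0 + c1 * x1 + c2 * x2
        numerator += w * z
        denominator += w
    return numerator / denominator


prediction = sugeno(0.5, 0.5)
```

Because the rule outputs are already crisp, the Sugeno scheme needs no separate centroid computation; the weighted average plays the role of defuzzification.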

Nasr et al. (2014b) employed a fuzzy logic control (FLC) to maintain the dissolved oxygen level at 2 mg L−1 in the last aerobic tank of Benchmark model.1, which comprises anoxic/aerobic tanks. The study used a fuzzy inference system with Mamdani’s method that was constructed from five IF-THEN rules. The input variable of soluble oxygen had five fuzzy linguistic sets, viz., very low, low, medium, high, and very high. The model output (i.e., oxygen transfer coefficient) was classified into close-fast, close-slow, no-change, open-slow, and open-fast. Results revealed that the fuzzy model was able to handle variations in the influent wastewater characteristics. In addition, the FLC was able to self-adapt the aeration supply, and hence, little aeration energy was consumed. The effluent wastewater characteristics were within the allowable limits of BOD ≤10 mg L−1, COD ≤100 mg L−1, ammonium ≤4 mg L−1, and total nitrogen ≤18 mg L−1.
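A five-rule Mamdani controller of this kind might look like the sketch below. The five input and five output terms are those named in the text, but their pairing into rules (e.g., very low oxygen leads to open-fast), the membership breakpoints around the 2 mg L−1 setpoint, and the normalized −1 to +1 action scale are assumptions and do not reproduce the published controller.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def left_shoulder(x, full, zero):
    """Membership equal to 1 at or below 'full', falling linearly to 0 at 'zero'."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def right_shoulder(x, zero, full):
    """Membership equal to 0 at or below 'zero', rising linearly to 1 at 'full'."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


# Input sets on dissolved oxygen (mg/L), centered on the 2 mg/L setpoint (assumed breakpoints).
DO_SETS = {
    "very low": lambda x: left_shoulder(x, 0.5, 1.25),
    "low": lambda x: tri(x, 0.5, 1.25, 2.0),
    "medium": lambda x: tri(x, 1.25, 2.0, 2.75),
    "high": lambda x: tri(x, 2.0, 2.75, 3.5),
    "very high": lambda x: right_shoulder(x, 2.75, 3.5),
}

# Output sets on a normalized change of the oxygen transfer coefficient (assumed scale).
ACTION_PEAKS = {
    "close-fast": -1.0, "close-slow": -0.5, "no-change": 0.0,
    "open-slow": 0.5, "open-fast": 1.0,
}

# Assumed rule base: the lower the oxygen level, the more the aeration is opened.
RULES = {
    "very low": "open-fast", "low": "open-slow", "medium": "no-change",
    "high": "close-slow", "very high": "close-fast",
}


def action_membership(name, u):
    """Triangular output set of half-width 0.5 around the action's peak."""
    peak = ACTION_PEAKS[name]
    return tri(u, peak - 0.5, peak, peak + 0.5)


def control_action(do_mg_per_l, n_points=401):
    """Mamdani inference (min, max, centroid) returning a crisp action in [-1, 1]."""
    us = [-1.0 + 2.0 * i / (n_points - 1) for i in range(n_points)]
    strengths = {RULES[term]: mf(do_mg_per_l) for term, mf in DO_SETS.items()}
    aggregated = [
        max(min(w, action_membership(action, u)) for action, w in strengths.items())
        for u in us
    ]
    return sum(u * m for u, m in zip(us, aggregated)) / sum(aggregated)


at_setpoint = control_action(2.0)
```

In this sketch a positive output means more aeration and a negative output means less, so the controller leaves the aeration unchanged at the setpoint and reacts progressively as the measured oxygen drifts away from it.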

7.18 Conclusions

This chapter presented several modeling and statistical methods that have recently been applied in environmental engineering studies. Activated sludge models (i.e., ASM1, ASM2, ASM2d, and ASM3) used for the prediction of biological nitrogen and phosphorus removal, as well as the degradation of organic carbonaceous matter, were covered. The application of adsorption isotherm models such as Langmuir, Freundlich, Dubinin-Radushkevich, and Temkin to the removal of metal ions from aqueous solutions was also demonstrated. In addition, this chapter included multivariate analysis, viz., principal component analysis, clustering, dendrograms, and decision trees, for the evaluation and assessment of water quality in aquatic environments. Artificial intelligence tools such as ANN and fuzzy logic were also introduced as black-box models for predicting treatment performance. Finally, this work reviewed literature studies that have applied innovative methods within the field of modeling environmental processes. Future studies should focus on the application of stochastic gray-box and hybrid models for the prediction of treatment and bioremediation performance, as well as for the estimation of microorganism activities.