8.1 Introduction

The term “bioremediation” is used to define the reduction, degradation, detoxification, and mineralization of pollutants via biological mechanisms (Olawoyin 2016). The objective of bioremediation is to transform contaminants into less harmful substances using microorganisms and biomass. Bioremediation technologies are classified as in situ or ex situ, depending on several factors such as the source and concentration of pollutants, site characteristics and type, and cost (Sanusi et al. 2016). The bioremediation process is highly nonlinear with respect to its physical, chemical, and biological reactions. In addition, bioremediation is influenced by several factors such as medium pH, temperature, aeration rate, agitation speed, and substrate-to-inoculum ratio (San-Valero et al. 2015). Hence, significant effort should be devoted to developing adequate modeling techniques that can describe the performance of bioremediation.

The term “modeling” is used to describe a particular system using mathematical language that comprises a set of factors, variables, and equations (Bas and Boyacı 2007). The designed models should be able to simulate, predict, and control the behavior of the system under study with reasonable accuracy. Modeling of the bioremediation process is an essential procedure for reactor design and performance prediction (Nasr et al. 2017). The bioremediation process can be described either by a white-box model (also known as a deterministic, physically based, or knowledge-driven model) or, when process information is missing, by a black-box model (Pakravan et al. 2015). In white-box systems, the process variables are transformed into a number of mathematical equations. However, this type of model requires all necessary information and many assumptions to improve the prediction accuracy (Zadeh 1997). Black-box models are used to provide an adequate description of a system when the process knowledge is insufficient. Artificial intelligence, which is regarded as a black-box approach, can be used for the prediction of nonlinear and complex systems. Multivariate analysis is another reliable black-box modeling technique that can be employed as a statistical tool for isolation, monitoring, and assessment (Alalm et al. 2016). Other modeling methods such as hybrid and stochastic gray-box systems have been employed in bioremediation studies for the determination of microorganisms’ activities.

Sequential steps should be conducted to develop a reliable model. The procedure includes (a) problem identification, (b) model selection, (c) data collection and preparation, (d) model calibration and parameter estimation, (e) model validation, and (f) testing and scenario evaluation (Nasr et al. 2014). Model calibration is an important step, which is used to find a reliable explanation of a particular set of data. During calibration, the model parameters are adjusted to improve fitting accuracy (Fawzy et al. 2017). The parameters used as initial conditions can be obtained from the literature. In the validation procedure, the readings not used for calibration are compared with the model outputs to confirm the reliability of the model. Model inadequacy can result from different sources such as input and output data, physical properties and configuration of the system, operational conditions, and model structure (Panja et al. 2017).

Several hazardous pollutants can result from sewage, hydrocarbons, dyes, agrochemicals, chlorinated compounds, and heavy metals. Hydrocarbons are considered a major cause of environmental damage and of several health risks (Nwadiogbu et al. 2016). Most studies on bioremediation have focused on hydrocarbons due to their toxic impact on soil and groundwater. In addition, aquatic systems receive wastewater discharges that vary significantly in flow and composition and may contain multiple hydrocarbon contaminants (Srinivasan and Viraraghavan 2010a, b).

This chapter covers different modeling and statistical techniques that have recently been employed for describing the bioremediation of hydrocarbons. Different statistical and artificial intelligence methods are used to represent the highly complex processes involved in bioremediation. The application of adsorption isotherm models such as Temkin, Freundlich, Langmuir, and Dubinin-Radushkevich is also demonstrated. In addition, this work covers literature studies that have employed reliable techniques within the field of hydrocarbon bioremediation.

8.2 Stoichiometry and Kinetics of Bacterial Activity

Several microorganisms have been found to have important applications in the bioremediation of hydrocarbon-contaminated water (Nasr and Ismail 2015). The biological activities of these organisms are influenced by various physicochemical and environmental parameters. Some mathematical models are based on the theory that microorganisms can utilize hydrocarbons from the aqueous medium. Other models are used to couple mass transfer with Monod or first-order kinetics for hydrocarbon biodegradation (Boparai et al. 2011).

The Monod equation, as expressed by Eq. (8.1), is a kinetic model employed to describe microbial growth via the correlation between substrate concentration and specific growth rate (Ateia et al. 2015). Monod-type models are also used to predict substrate removal efficiencies in large-scale bioremediation applications.

$$ \mu ={\mu}_{\mathrm{max}}\left(\frac{S}{S+{K}_S}\right) $$
(8.1)

where $\mu$ is the specific growth rate (1/day), $\mu_{\mathrm{max}}$ is the maximum specific growth rate (1/day), $S$ is the limiting substrate concentration (mg/L), and $K_S$ is the half-saturation constant (mg/L), i.e., the substrate concentration at which $\mu = 0.5\,\mu_{\mathrm{max}}$. The correlation between $\mu$ and $S$ is used to estimate the bio-kinetic growth constants (i.e., $\mu_{\mathrm{max}}$ and $K_S$) by either a statistical or a graphical technique.

The Monod equation can be employed to calculate the bacterial growth rate, as given by Eq. (8.2).

$$ \frac{dX}{dt}=\mu X $$
(8.2)

where dX/dt is biomass growth rate (mg/L/d) and X is biomass concentration (mg/L).

The stoichiometric correlation between the consumed substrate and produced biomass can be presented by Eq. (8.3).

$$ \frac{dX}{dt}=Y\frac{dS}{dt}-{k}_dX $$
(8.3)

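where $Y$ is the biomass yield coefficient (dimensionless, i.e., mass of biomass produced per mass of substrate consumed) and $k_{\mathrm{d}}$ is the rate constant of cell decay (1/day). In Eqs. (8.3) and (8.4), $dS/dt$ denotes the rate of substrate utilization and is taken as a positive quantity.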

The specific substrate utilization rate is calculated by Eq. (8.4).

$$ U=\frac{dS}{X\cdot dt} $$
(8.4)

where U is the specific substrate utilization rate (1/day) and dS/dt is substrate utilization rate (mg/L/d).

Dividing Eq. (8.3) by $X$ and substituting Eqs. (8.2) and (8.4) gives Eq. (8.5). Accordingly, a plot of $\mu$ versus $U$ results in a straight line having a slope of $Y$ and an intercept of $-k_{\mathrm{d}}$.

$$ \mu =Y\times U-{k}_{\mathrm{d}} $$
(8.5)

8.3 Bacterial Behavior in a Controlled Batch System

Figure 8.1 displays a batch reactor that contains an initial substrate concentration ($S_{\mathrm{o}}$) and a biomass concentration ($X$). The reactor is operated under aerobic and completely mixed conditions; thus, the concentration of dissolved oxygen (DO) is not a limiting factor for bacterial growth. During batch experiments, microorganisms utilize substrate for synthesis of new cells, energy generation, and formation of by-products (Eq. 8.6). Hence, as time proceeds, the substrate concentration decreases (the concentration $S$ falls with time) along with an increase in the microorganism concentration (positive $dX/dt$).

$$ \mathrm{Substrate}+\mathrm{Biomass}\to \mathrm{New}\;\mathrm{cells}+\mathrm{Energy}+\mathrm{Byproducts} $$
(8.6)

Fig. 8.1 Mixed batch reactor supplied with substrate, inoculum, and dissolved oxygen

Figure 8.2 shows a plot of biomass concentration versus time, resulting in a growth curve that contains four distinct phases. The phases can be described as follows (Gupta et al. 2017):

  (a) Lag phase, which occurs directly after bacterial inoculation and ends once the cells are acclimated (adapted) to the environmental conditions.

  (b) Exponential phase, where the biomass concentration increases exponentially due to the utilization of substrate for growth.

  (c) Stationary phase, which occurs when essential substrates (e.g., carbon and nutrient species) and/or DO reach a threshold level. Under this condition, the bacterial population is neither growing nor decreasing.

  (d) Death phase, in which bacterial cells are lost through death and lysis. Under this condition, the net biomass growth becomes negative.

Fig. 8.2 Typical biomass growth curve in a batch system
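The growth curve can be reproduced qualitatively by integrating Eqs. (8.1) to (8.3) for a batch reactor. The sketch below uses a simple explicit Euler scheme; the function name, the step size, and all parameter values are assumptions chosen for illustration. Because the Monod model contains no acclimation term, the simulation shows the exponential, stationary, and death phases but not the lag phase.

```python
def simulate_batch(S0, X0, mu_max, K_S, Y, k_d, t_end, dt=0.01):
    """Euler integration of substrate S and biomass X (mg/L) over t_end days."""
    t, S, X = 0.0, S0, X0
    history = [(t, S, X)]
    for _ in range(int(round(t_end / dt))):
        mu = mu_max * S / (S + K_S)      # Eq. (8.1)
        dS = -(mu / Y) * X * dt          # substrate consumed for growth
        dX = (mu - k_d) * X * dt         # growth minus decay, from Eqs. (8.2)-(8.3)
        S = max(S + dS, 0.0)             # substrate cannot become negative
        X = X + dX
        t += dt
        history.append((t, S, X))
    return history


# Hypothetical parameter values, for illustration only
history = simulate_batch(S0=500.0, X0=10.0, mu_max=2.0, K_S=25.0,
                         Y=0.5, k_d=0.05, t_end=10.0)

peak_t, _, peak_X = max(history, key=lambda row: row[2])
final_t, final_S, final_X = history[-1]
print(f"peak biomass {peak_X:.1f} mg/L at day {peak_t:.2f}")
print(f"day {final_t:.0f}: S = {final_S:.3f} mg/L, X = {final_X:.1f} mg/L")
```

Once the substrate is exhausted, the growth term vanishes and only the decay term remains, which produces the declining branch of the curve.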

8.4 Mathematical Modeling of Trickling Filter for Bioremediation Application

San-Valero et al. (2015) developed a mathematical model for estimating the removal of hydrophilic volatile organic compounds by biotrickling filters. Their study reported that the mass balance of the pollutant (or DO) in the gas phase could be represented by Eq. (8.7).

$$ {\theta}_{\mathrm{G}}\frac{\partial {C}_{\mathrm{G}}}{\partial t}=-{v}_{\mathrm{G}}\frac{\partial {C}_{\mathrm{G}}}{\partial z}-{K}_{\mathrm{L}}a\left(\frac{C_{\mathrm{G}}}{H}-{C}_{\mathrm{L}}\right) $$
(8.7)

where $C_{\mathrm{G}}$ and $C_{\mathrm{L}}$ are the concentrations in the gas and liquid phases, respectively (mg/L), $K_{\mathrm{L}}a$ is the mass transfer coefficient (1/s), $H$ is Henry’s law constant (dimensionless), $t$ is time (s), $z$ is the vertical distance from the bottom of the reactor (m), $v_{\mathrm{G}}$ is the superficial air velocity (m/s), and $\theta_{\mathrm{G}}$ is the porosity of the bed (dimensionless).

The mass balance of the mobile liquid phase is given by Eq. (8.8).

$$ {\theta}_{\mathrm{L}}\frac{\partial {C}_{\mathrm{L}}}{\partial t}=-{v}_{\mathrm{L}}\frac{\partial {C}_{\mathrm{L}}}{\partial z}+{K}_{\mathrm{L}}a\left(\frac{C_{\mathrm{G}}}{H}-{C}_{\mathrm{L}}\right)-\frac{D\cdot a}{\beta}\left({C}_{\mathrm{L}}-S\right) $$
(8.8)

where $D$ is the diffusion coefficient of the contaminant (or DO) in water (m²/s), $a$ is the specific surface area of the packing medium (m²/m³), $\beta$ is the thickness of the liquid-biofilm interface (m), $S$ is the concentration of the pollutant (or DO) at the biofilm interface (mg/L), and $v_{\mathrm{L}}$ is the superficial liquid velocity (m/s).

The mass balance of the biofilm is represented by Eq. (8.9).

$$ \frac{\partial S}{\partial t}=f\left({X}_{\mathrm{v}}\right)D\frac{\partial^2S}{\partial {x}^2}-\frac{\mu_{\mathrm{max}}{X}_{\mathrm{v}}}{Y}\frac{S_{\mathrm{O}}}{S_{\mathrm{O}}+{K}_{\mathrm{O}}}\frac{S_{\mathrm{P}}}{S_{\mathrm{P}}+{K}_{\mathrm{P}}} $$
(8.9)

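where $S$ is the concentration inside the biofilm (mg/L), with $S_{\mathrm{O}}$ and $S_{\mathrm{P}}$ denoting the oxygen and pollutant concentrations, respectively; $x$ is the distance into the biofilm (m); $f(X_{\mathrm{v}})$ is the correction factor of diffusivity in solution due to biomass (dimensionless); $X_{\mathrm{v}}$ is the concentration of biomass (mg/L); $\mu_{\mathrm{max}}$ is the maximum specific growth rate (1/s); $K_{\mathrm{O}}$ and $K_{\mathrm{P}}$ are the half-saturation constants of oxygen and pollutant, respectively (mg/L); and $Y$ is the yield coefficient (dimensionless).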
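The reaction term of Eq. (8.9) is a dual-Monod expression, which can be evaluated on its own to see how oxygen and pollutant jointly limit degradation. The sketch below is only that term, not the full biotrickling filter model; it reads $S_{\mathrm{O}}$ and $S_{\mathrm{P}}$ as the oxygen and pollutant concentrations in the biofilm, and the function name and numerical values are hypothetical.

```python
def dual_monod_rate(S_O, S_P, mu_max, X_v, Y, K_O, K_P):
    """Consumption rate (mg/L/s) given by the sink term of Eq. (8.9)."""
    oxygen_limitation = S_O / (S_O + K_O)
    pollutant_limitation = S_P / (S_P + K_P)
    return (mu_max * X_v / Y) * oxygen_limitation * pollutant_limitation


# Hypothetical values: both species at their half-saturation concentrations,
# so each Monod factor equals 0.5 and the rate is a quarter of its maximum.
rate_half = dual_monod_rate(S_O=0.5, S_P=2.0, mu_max=1e-4, X_v=1000.0,
                            Y=0.5, K_O=0.5, K_P=2.0)
rate_max = 1e-4 * 1000.0 / 0.5
print(rate_half / rate_max)
```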

8.5 Adsorption Models

Adsorption isotherms and kinetics are appropriate models for investigating the removal of hydrocarbons from water bodies (Fawzy et al. 2016a). The most common isotherm models are Langmuir, Freundlich, Dubinin-Radushkevich (D-R), and Temkin. In addition, pseudo-first-order and pseudo-second-order models are used to describe the adsorption kinetics.

8.5.1 Langmuir Adsorption Isotherm

The Langmuir isotherm is applied to quantitatively describe the transfer of pollutants from the aqueous solution to the solid surface at equilibrium. The Langmuir model has been developed according to the following assumptions (Langmuir and Waugh 1940):

  (a) Monolayer coverage; i.e., the outer surface of the adsorbent is covered by a single layer of adsorbate.

  (b) The solid surface contains a finite number of vacant sites, where each site holds one molecule, and no interaction occurs among adsorbate species.

  (c) The solid surface is homogeneous; i.e., adsorption sites are identical (equal size and shape), and the heat of adsorption is uniform for each site.

The Langmuir model in Eq. (8.10) demonstrates that a linear plot of $C_{\mathrm{e}}/q_{\mathrm{e}}$ against $C_{\mathrm{e}}$ gives a slope of $1/Q_{\mathrm{m}}$ and an intercept of $1/(K_{\mathrm{L}} \cdot Q_{\mathrm{m}})$.

$$ \frac{C_{\mathrm{e}}}{q_{\mathrm{e}}}=\left(\frac{1}{Q_{\mathrm{m}}}\right){C}_{\mathrm{e}}+\frac{1}{K_{\mathrm{L}}\cdot {Q}_{\mathrm{m}}} $$
(8.10)

where $C_{\mathrm{e}}$ is the concentration of adsorbate at equilibrium (mg/L), $q_{\mathrm{e}}$ is the mass of adsorbate per gram of adsorbent at equilibrium (mg/g), $Q_{\mathrm{m}}$ is the maximum monolayer capacity (mg/g), and $K_{\mathrm{L}}$ is the Langmuir isotherm constant (L/mg).

The Langmuir-type adsorption is then used to determine the isotherm shape in terms of a separation factor (Li et al. 2010); see Eq. (8.11).

$$ r=\frac{1}{1+{K}_{\mathrm{L}}\cdot {C}_{\mathrm{o}}} $$
(8.11)

where $r$ is the separation factor (dimensionless), $K_{\mathrm{L}}$ is the Langmuir constant (L/mg), and $C_{\mathrm{o}}$ is the initial adsorbate concentration (mg/L).

The factor “r” is used to evaluate the favorability of Langmuir adsorption based on the following classifications: “unfavorable” at r > 1, “linear” at r = 1, “favorable” at 0 < r < 1, and “irreversible” at r = 0.

8.5.2 Freundlich Adsorption Isotherm

The Freundlich isotherm model was developed to describe the adsorption of a single solute onto heterogeneous surfaces (Freundlich 1906). The model describes the distribution of adsorbate between the solid and liquid phases, assuming an exponential distribution of adsorption energies. The Freundlich formula in Eq. (8.12) implies that a plot of $\log(q_{\mathrm{e}})$ against $\log(C_{\mathrm{e}})$ results in a straight line with a slope of $1/n$ and an intercept of $\log(K_{\mathrm{F}})$.

$$ \log \left({q}_{\mathrm{e}}\right)=\left(\frac{1}{n}\right)\log \left({C}_{\mathrm{e}}\right)+\log \left({K}_{\mathrm{F}}\right) $$
(8.12)

where $K_{\mathrm{F}}$ is the Freundlich constant indicating the adsorption capacity ((mg/g)·(L/mg)$^{1/n}$), and $1/n$ represents surface heterogeneity or adsorption intensity; the sorbent surface is more heterogeneous as $1/n$ approaches zero.

The Freundlich exponent “1/n” reveals the type of isotherm, which is “unfavorable” at 1/n > 1, “favorable” at 0 < 1/n < 1, and “irreversible” at 1/n = 0 (Saruchi and Kumar 2016). The value of 1/n lower than 1 implies a chemisorption process, whereas 1/n > 1 indicates a cooperative process.

8.5.3 Dubinin-Radushkevich (D-R) Isotherm

The Dubinin-Radushkevich (D-R) isotherm model assumes that sorption proceeds by pore filling rather than layer-by-layer surface coverage. The model is applied to structurally homogeneous systems, i.e., micropores having similar dimensions (Hutson and Yang 1997). In addition, this isotherm is temperature dependent and valid for physical adsorption processes involving van der Waals forces (Boparai et al. 2011).

The linearized form of the D-R isotherm equation is expressed by Eq. (8.13). A plot of $\ln(q_{\mathrm{e}})$ versus $\varepsilon^2$ results in a straight line having a slope of $-K_{\mathrm{DR}}$ and an intercept of $\ln(q_{\mathrm{s}})$.

$$ \ln \left({q}_{\mathrm{e}}\right)=-{K}_{\mathrm{DR}}\left({\varepsilon}^2\right)+\ln \left({q}_{\mathrm{s}}\right) $$
(8.13)

where $q_{\mathrm{e}}$ is the mass of adsorbate per gram of adsorbent at equilibrium (mg/g), $q_{\mathrm{s}}$ is the theoretical isotherm saturation capacity (mg/g), $K_{\mathrm{DR}}$ is the D-R isotherm constant related to the adsorption energy (mol²/kJ²), and $\varepsilon$ is the Polanyi potential (kJ/mol).

The Polanyi potential can be calculated by Eq. (8.14).

$$ \varepsilon = RT\ln \left[1+\frac{1}{C_{\mathrm{e}}}\right] $$
(8.14)

where $R$ is the gas constant (8.314 J/mol/K, i.e., 8.314 × 10⁻³ kJ/mol/K when $\varepsilon$ is expressed in kJ/mol) and $T$ is the temperature (K).

As seen in Eq. (8.15), the value of $K_{\mathrm{DR}}$ is used to determine the mean sorption energy.

$$ E=\frac{1}{\sqrt{2{K}_{DR}}} $$
(8.15)

where E is the mean sorption energy (kJ/mol).

8.5.4 Temkin Isotherm

The Temkin isotherm model describes the interaction effect of adsorbent/adsorbate in terms of the binding heterogeneity (Temkin 1941). The model assumes that the heat of adsorption of the molecules in a particular layer decreases linearly rather than logarithmically with coverage, while disregarding extremely low and extremely high concentrations (Aljeboree et al. 2014). In this isotherm, adsorption is characterized by a uniform distribution of binding energies up to a certain extent. The Temkin isotherm is presented by Eq. (8.16), which shows that a plot of $q_{\mathrm{e}}$ versus $\ln(C_{\mathrm{e}})$ gives a straight line with a slope of $B$ and an intercept of $B \cdot \ln(A_{\mathrm{T}})$ (Boparai et al. 2011).

$$ {q}_{\mathrm{e}}=B\cdot \ln \left({C}_{\mathrm{e}}\right)+B\cdot \ln \left({A}_{\mathrm{T}}\right) $$
(8.16)

where $A_{\mathrm{T}}$ is the equilibrium binding constant corresponding to the maximum binding energy (L/mol) and $B$ is a constant related to the heat of sorption (J/mol), equal to $R \cdot T/b_{\mathrm{T}}$, in which $R$ is the universal gas constant (8.314 J/mol/K), $T$ is the adsorption temperature (K), and $b_{\mathrm{T}}$ is the Temkin isotherm constant.

8.5.5 Pseudo-First-Order Kinetic

The pseudo-first-order model assumes that the interaction between sorbate and sorbent occurs due to hydrogen bonds and/or van der Waals forces, suggesting that the reaction is possibly physisorption (Saruchi and Kumar 2016). Equation (8.17) presents the linear form of the pseudo-first-order model. A plot of $\ln(q_{\mathrm{e}} - q_{\mathrm{t}})$ versus $t$ gives a straight line with a slope of $-k_1$ and an intercept of $\ln(q_{\mathrm{e}})$.

$$ \ln \left({q}_{\mathrm{e}}-{q}_{\mathrm{t}}\right)=-{k}_1\times t+\ln \left({q}_{\mathrm{e}}\right) $$
(8.17)

where $q_{\mathrm{e}}$ and $q_{\mathrm{t}}$ are the masses of adsorbate per gram of adsorbent at equilibrium and at time $t$, respectively (mg/g), and $k_1$ is the pseudo-first-order rate constant (1/min).

8.5.6 Pseudo-Second-Order Kinetic

The pseudo-second-order model assumes that electrons are covalently exchanged between adsorbate and adsorbent via chemical interaction, also known as chemisorption (Fawzy et al. 2016b). According to Eq. (8.18), a plot of $t/q_{\mathrm{t}}$ against $t$ results in a linear relationship with a slope and an intercept of $1/q_{\mathrm{e}}$ and $1/(k_2 \times q_{\mathrm{e}}^2)$, respectively.

$$ \frac{t}{q_{\mathrm{t}}}=\left(\frac{1}{q_{\mathrm{e}}}\right)t+\frac{1}{k_2\times {q}_{\mathrm{e}}^2} $$
(8.18)

where $k_2$ is the rate constant of second-order adsorption (g/mg/min).

8.5.7 Application of Adsorption for Hydrocarbon Remediation

Okiel et al. (2011) investigated the adsorption of oil from oil-contaminated effluents using deposited carbon (DC), bentonite, and powdered activated carbon (PAC). Their study found that at an initial oil concentration of 1000 mg/L and a contact time of 30 min, the adsorption capacities were 250, 244, and 150 mg/g for DC, bentonite, and PAC, respectively. In addition, the Freundlich isotherm provided a better description of the adsorption data than the Langmuir model.

Rasheed et al. (2016) investigated the removal of polycyclic aromatic hydrocarbons, namely, anthracene and pyrene, from wastewater using PAC. Their study found that the removal efficiency of hydrocarbons was above 99% after an adsorption time of 4 h. The experimental data fitted the Elovich model well, suggesting that chemisorption was dominant during the adsorption process.

Nwadiogbu et al. (2016) investigated the treatment of oil spill through adsorption onto an agricultural waste of corncobs. Their study indicated that the adsorption process was described by surface reaction and intraparticle diffusion mechanisms. In addition, Langmuir isotherm provided a better fit to the adsorption data than the Freundlich model, and the maximum monolayer sorption capacities ranged between 0.0043 mg/g and 0.0768 mg/g.

Li et al. (2010) studied the application of coal for remediation of oily wastewater. The experimental factors were medium pH, oil concentration, coal type, particle size distribution, and contact time. Their results indicated that the equilibrium time was 1.5 h and the adsorption process followed the Freundlich isotherm. The adsorption capacities were 23.8 and 840.0 mg/g at initial oil concentrations of 160.5 and 1023.6 mg/L, respectively. The adsorption mechanism comprised physical and chemical processes.

Angelova et al. (2011) revealed that rice husks could be used as a promising environmental material for the remediation of water contaminated with oil and oil products. Their study created a correlation between morphology and surface functional groups of the sorbent and adsorption mechanisms of the material.

Srinivasan and Viraraghavan (2010a, b) investigated the application of different types of biomaterials, i.e., Mucor rouxii and Absidia coerulea cultured in chitosan and walnut shell media, for the removal of oil from aqueous solutions. The selected oil types were cutting oil, standard mineral oil, and vegetable oil, achieving adsorption capacities of 84.0, 77.2, and 92.5 mg/g, respectively. The treatment efficiencies of oil-contaminated water by the fungal biomass of Mucor rouxii ranged between 77% and 93% at pH of 5.0.

Srinivasan and Viraraghavan (2008) investigated the adsorption of oil from aqueous solutions by walnut shell media. The findings showed that the sorption capacities were 580 mg/g for Bright-Edge oil, 300 mg/g for standard mineral oil, and 510 mg/g for vegetable oil.

Ibrahim et al. (2010) examined the remediation of wastewater contaminated with emulsified oil using barley straw, an agricultural waste. The experimental factors were solution pH, temperature, loading of adsorbent, and particle size. Results revealed that adsorption was favorable in a neutral pH environment. The Langmuir model described the experimental data well, and the monolayer adsorption capacity was 576.0 ± 0.3 mg/g at 25 °C.

8.6 Design of Experiments

Design of experiments is a statistical approach employed to estimate the influences of multiple independent factors on a response variable. Experiments can be optimized using different techniques such as one-factor-at-a-time and factorial design (Elhalil et al. 2016). The results of an experimental design can be graphically displayed using response surface methodology (RSM). RSM comprises different mathematical and statistical techniques for optimizing, predicting, and improving a process of interest (Bas and Boyacı 2007). It can be applied to define the effects of multiple independent variables on chemical and biochemical processes. In addition, RSM can be employed for the determination of enzyme stability and kinetic constants.

8.7 One-Factor-at-a-Time Statistical Method

In a one-factor-at-a-time method, only one variable (or factor) is changed in each set of experimental runs, whereas the other inputs are kept constant. The optimum value found for the first variable is then used in the subsequent experimental runs, and this step is repeated for the other variables. However, this method fails to consider the interaction effects between factors.

8.8 Factorial Design Statistical Method

Factorial design is used to simultaneously determine the effects of two or multiple factors on output. In addition, factorial design can be developed to describe the interaction effects between the independent variables (Nasr et al. 2017). This technique can predict accurate outputs with a minimum number of experiments and reduced time. Factorial design is classified into full-factorial and fractional-factorial.

A full-factorial design that contains $n$ factors, each with $m$ levels, is termed an $m^n$ factorial experiment. For instance, a full-factorial design denoted as $2^3$ describes three factors (e.g., pH, time, and temperature) with two levels for each factor (e.g., minimum and maximum); hence, the number of experiments is $2^3 = 8$. Similarly, a $3^2$ factorial design has two factors, each with three levels (e.g., minimum, average, and maximum), and $3^2 = 9$ experimental runs. A full-factorial design therefore represents all possible combinations/interactions among factors, which can then be displayed in a single interface.
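The run list of a full-factorial design is the Cartesian product of the factor levels, which can be enumerated as in the sketch below. The function name and the level values are hypothetical and serve only to reproduce the run counts of the two examples.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Three factors at two levels each (hypothetical minimum and maximum values)
design_2_3 = full_factorial({
    "pH": (5, 9),
    "time_h": (1, 24),
    "temperature_C": (20, 30),
})

# Two factors at three levels each (hypothetical minimum, average, maximum)
design_3_2 = full_factorial({
    "pH": (5, 7, 9),
    "temperature_C": (18, 22, 26),
})

print(len(design_2_3), len(design_3_2))
for run in design_2_3:
    print(run)
```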

However, when the number of input variables is large (e.g., more than four factors), the full-factorial design becomes time-consuming. Under this condition, fractional-factorial design, which investigates the most important correlations between factors using a minimum number of experiments, becomes preferable (Srinivasan and Viraraghavan 2010a, b). Fractional-factorial design can be performed using central composite and Box-Behnken methods.

8.9 Application of Design of Experiments for Hydrocarbon Remediation

Srinivasan and Viraraghavan (2010a, b) employed a factorial design analysis to describe the removal of oil from aqueous solution by a fungal biomass, namely, Mucor rouxii. The selected oil types were cutting oil, standard mineral oil, and canola oil. The experimental factors were solution pH (3–9), temperature (5–30 °C), sorbent mass (0.05–0.5 g), initial oil concentration (50–350 mg/L), and mixing speed (100–200 rpm). The results of their study revealed that the medium pH was the most influential factor, in which the removal efficiencies ranged between 80% and 99% at pH of 3.0.

Tansel and Regula (2008) conducted a 2 × 2 × 3 factorial design experiment to determine the impacts of operational factors on the remediation of water contaminated with petroleum hydrocarbons (PHC). The input attributes were oil concentration (150 ppm “low” and 3000 ppm “high”), coagulant type (Cat floc K-10, Cat floc T-2, and no coagulant), and the source of water (pond and brackish). The model outputs were (a) turbidity removal and (b) petroleum hydrocarbon removal. Results indicated that the highest turbidity removal of 93.53% was obtained at “low” oil concentration, “pond” water source, and “Cat floc K-10” coagulant. In addition, a PHC removal of 92.53% was achieved at “low” oil concentration, “brackish” water source, and “Cat floc K-10” coagulant.

Sivagurunathan et al. (2003) investigated the effects of medium pH (5, 7, and 9), temperature (18, 22, and 26 °C), and agitation speed (50, 150, and 250 rpm) on the bioremediation of water containing hydrocarbon residues using Pseudomonas fluorescens. A face-centered cube design having three factors and three levels for each factor (i.e., $3^3$) was employed. An RSM was applied to plot the results of a quadratic equation having linear, second-order, and interaction terms. Results indicated that the optimum condition was temperature 22.48 °C, pH 7.31, and agitation speed 206 rpm, achieving a total biodegradation of toluene ($R^2$-value 0.98).

8.10 Artificial Intelligence Modeling

Artificial intelligence (AI) is the development of computer-based systems able to achieve tasks that involve human intelligence, including decision-making, speech recognition, translation between languages, and visual perception (Fawzy et al. 2017). AI can be employed for organization and classification of large datasets, as well as for capturing complex relationships. It is a black-box model that uses machine learning such as artificial neural networks (ANN) and fuzzy logic concepts.

8.11 Artificial Neural Network

An artificial neural network (ANN) is a computer-based approach that implements learning procedures similar to those of the nervous system of the human brain (Nasr et al. 2017). An ANN is composed of a large number of interconnected nodes, also known as neurons, which are organized in layers. The input layer receives experimental data and transfers results to the last layer through successive hidden layers. The neurons in a particular layer are fully connected to those in the subsequent layer through weights and biases (Panja et al. 2017). In addition, activation functions are used to transfer results between successive layers. During training, the weights and biases are adjusted until the mean squared error (MSE) between the simulated outputs and the actual results is minimized. A back-propagation method with a Levenberg-Marquardt (trainlm) algorithm has been widely used for network training (Olawoyin 2016). In this method, the training process comprises two phases: in the forward phase, the external signals are propagated from the input layer to the output layer, whereas in the backward phase, the error between the predicted and observed values at the output layer is propagated backward to modify the weights and biases (Pakravan et al. 2015). These iterations are repeated until the minimum MSE is achieved. After network training, a portion of the data is used for validation and testing to prevent overfitting and examine the stability of the trained network. These procedures result in a high degree of prediction accuracy even when the ANN receives noisy and/or erroneous input data (Sanusi et al. 2016).
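The forward and backward phases can be illustrated with a very small network written in plain Python. This is a minimal sketch under several assumptions: it uses one hidden layer with tanh activation and stochastic gradient descent instead of the Levenberg-Marquardt algorithm mentioned above, the function names are hypothetical, and the training data are synthetic values from an arbitrary smooth function of two scaled inputs, not experimental results.

```python
import math
import random


def init_network(n_in, n_hidden, seed=0):
    """Small random weights; biases start at zero."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return W1, b1, W2, 0.0


def forward(x, net):
    """Forward phase: input -> tanh hidden layer -> linear output."""
    W1, b1, W2, b2 = net
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return h, sum(w * hi for w, hi in zip(W2, h)) + b2


def mse(X, Y, net):
    return sum((forward(x, net)[1] - y) ** 2 for x, y in zip(X, Y)) / len(X)


def train(X, Y, net, lr=0.05, epochs=500):
    """Backward phase: propagate the output error to update weights and biases."""
    W1, b1, W2, b2 = net
    for _ in range(epochs):
        for x, y in zip(X, Y):
            h, y_hat = forward(x, (W1, b1, W2, b2))
            err = y_hat - y
            for j in range(len(W2)):
                grad_h = err * W2[j] * (1.0 - h[j] ** 2)  # uses W2 before its update
                W2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(len(x)):
                    W1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
    return W1, b1, W2, b2


# Synthetic data: two inputs scaled to [0, 1] and an arbitrary smooth response
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
X = [[a, b] for a in grid for b in grid]
Y = [0.5 + 0.3 * a - 0.2 * b * b for a, b in X]

net = init_network(n_in=2, n_hidden=4, seed=0)
mse_before = mse(X, Y, net)
net = train(X, Y, net)
mse_after = mse(X, Y, net)
print(f"MSE before training: {mse_before:.4f}, after training: {mse_after:.6f}")
```

In practice, part of the data would be held out for validation and testing, as described above, rather than evaluating the error on the training set alone.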

8.12 Fuzzy Inference System

A fuzzy inference system (FIS) is employed to describe nonlinear and complex relations between a number of input factors and one (or more) output (Zadeh 1997). As shown in Fig. 8.3, an FIS is built in three major steps: (a) fuzzification, (b) “if-then” rules, and (c) defuzzification (Gupta et al. 2017). During fuzzification, crisp (numeric) values are converted into fuzzy inputs using fuzzy linguistic variables, fuzzy linguistic terms, and membership functions. For example, an input factor such as hydrocarbon concentration can be converted according to linguistic concepts into “low,” “medium,” and “high.” Each linguistic expression can be graphically represented as a membership function, e.g., linear, trapezoidal, and Gaussian fuzzy sets (Nasr et al. 2014). After that, an inference engine is applied using a set of “if-then” rules, in which a single fuzzy rule has the form “if x is A, then y is B.”

Fig. 8.3 A fuzzy inference system

Assume x and y are the variables “bioremediation” and “residual hydrocarbon,” respectively, whereas A and B are the linguistic terms “high” and “low,” respectively. The “if-then” rule will have the form “if bioremediation is high, then residual hydrocarbon is low.”

The aggregation of rules is undertaken when the rule base comprises several “if-then” rules. Finally, a defuzzification step is carried out to convert the fuzzy result into a crisp output (Zadeh 1997). The widely used defuzzification methods in the literature are center of gravity, mean-max, max-membership, weighted average, and center of sums. Mamdani, Sugeno, and Tsukamoto are different types of fuzzy inference that have been widely used to implement fuzzy logic procedures (Nasr et al. 2014).
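The three steps can be sketched for the “bioremediation” and “residual hydrocarbon” example using a Mamdani-type inference with center-of-gravity defuzzification. Everything specific in the sketch is an assumption made for illustration: both variables are expressed on a 0 to 100% scale, the three triangular membership functions are evenly spaced, and the rule base contains two additional rules beyond the one quoted in the text.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms on a 0-100 % scale (used for input and output)
TERMS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}

# "If bioremediation is <first>, then residual hydrocarbon is <second>"
RULES = [("high", "low"), ("medium", "medium"), ("low", "high")]


def infer(bioremediation, n_points=101):
    """Return a crisp residual hydrocarbon level (%) for a crisp input (%)."""
    x = min(max(bioremediation, 0.0), 100.0)
    # (a) Fuzzification: degree to which the input belongs to each term
    strength = {term: tri(x, *TERMS[term]) for term, _ in RULES}
    numerator = denominator = 0.0
    for k in range(n_points):
        y = 100.0 * k / (n_points - 1)
        # (b) Rules: clip each consequent (min) and aggregate them (max)
        mu = max(min(strength[inp], tri(y, *TERMS[out])) for inp, out in RULES)
        # (c) Defuzzification: accumulate the center of gravity
        numerator += mu * y
        denominator += mu
    return numerator / denominator


for level in (10.0, 50.0, 90.0):
    print(f"bioremediation {level:.0f}% -> residual hydrocarbon {infer(level):.1f}%")
```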

8.13 Application of Artificial Intelligence for Hydrocarbon Remediation

Sanusi et al. (2016) applied an ANN model for optimizing the degradation performance of total petroleum hydrocarbon by Paspalum scrobiculatum L. Hack, obtaining an $R^2$-value over 0.95. Their study found that the optimum condition achieving a removal efficiency of 85.5% was an aeration rate of 1.02 L/min, diesel concentration of 3%, and 72 sampling days.

Olawoyin (2016) proposed an ANN model with the Levenberg-Marquardt back-propagation training algorithm for the prediction of potential toxicity of polycyclic aromatic hydrocarbons in soils. The input parameters were treated soil (I and IV), pH (5.02–7.25), electrical conductivity (54–195 mS/cm), and dissolved organic carbon (31.18–62.96 mg/L). The model achieved a high accuracy with an $R^2$-value above 0.99.

Panja et al. (2017) developed an ANN with a structure of 8–14–3 to predict the production of hydrocarbon from shales. The eight input factors were bottom hole pressure (500, 1000, and 1500 psi), gas relative permeability (1, 2, and 3 ng), hydraulic fracture spacing (60, 180, and 300 ft), initial dissolved gas-oil ratio (800, 1900, and 3000), initial reservoir pressure (4000, 5250, and 6500 psi), reservoir permeability (10, 225, and 5000 nD), rock compressibility (4 × 10⁻⁶, 4 × 10⁻⁵, and 4 × 10⁻⁴ 1/psi), and slope of gas-oil ratio (0.50, 0.65, and 0.80). The input factors were distributed according to a Box-Behnken design of experiments. The model outputs were oil recovery, gas recovery, and gas-oil ratio. The model showed a high predictive accuracy in terms of the coefficient of determination ($R^2$-value) and the normalized root mean square error.

Vaferi et al. (2014) applied an ANN model to predict the treatment efficiency of wastewater contaminated with aromatic hydrocarbons. The ANN structure was a multilayer perceptron with one hidden layer containing 15 neurons. The input attributes were contact time (0–1440 min), initial concentration of H2O2 (0–1942 mg/L), pollutant concentration (200–840 mg/L), pH (3.1–11.6), temperature (25–86 °C), and UV intensity (225–304 nm). The output variable was the final pollutant concentration (4–840 mg/L). It was found that the optimum experimental factors were illumination with three UV lights and an acidic pH of 3.1. The proposed model predicted the degradation of aromatic hydrocarbons with a mean square error of 5 × 104 (i.e., high accuracy).

Pakravan et al. (2015) investigated the effects of pH (1.5–10.5), initial COD (200–800 mg/L), concentration of H2O2 (2.2–15.4 mM), and contact time (45–135 min) on the treatment of petroleum refinery wastewater. The data were obtained from 30 experimental runs, i.e., a central composite design of $2^4$ factorial points + 6 center points + 8 star points. It was found that at an initial COD concentration of 300 mg/L, the optimum condition was pH 5, H2O2 8.8 mM, and time 120 min. An RSM, along with a quadratic regression model, was employed to describe the relationship between the input factors and the output variable (i.e., COD removal efficiency). The findings of the statistical modeling technique were compared to those obtained from an ANN (as a typical artificial intelligence method). For this purpose, a feed-forward back-propagation ANN model with a structure of 4–5–1 was applied for the prediction of COD removal efficiency. Results indicated that the ANN ($R^2$, 0.96; adj-$R^2$, 0.96) provided a higher predictive capability than the RSM ($R^2$, 0.94; adj-$R^2$, 0.91). A sensitivity analysis using the network weights was employed, which indicated that the initial COD concentration was the most dominant factor.
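The count of 30 runs follows from the structure of a four-factor central composite design: 16 factorial points, 8 star (axial) points, and 6 center points. The sketch below builds such a design in coded units. The function name is hypothetical, and the axial distance of 2 is an assumption (the rotatable choice for four factors), since the value used in the cited study is not stated here.

```python
from itertools import product


def central_composite(n_factors, n_center, alpha):
    """Return the coded design points of a central composite design."""
    # Factorial part: every combination of the -1 and +1 levels
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Star part: two axial points per factor, at -alpha and +alpha
    star = []
    for i in range(n_factors):
        for value in (-alpha, alpha):
            point = [0.0] * n_factors
            point[i] = value
            star.append(point)
    # Center part: replicated runs at the center of the design
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + star + center


# Four factors (pH, initial COD, H2O2 concentration, contact time) in coded units
design = central_composite(n_factors=4, n_center=6, alpha=2.0)
print(len(design))
```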

8.14 Conclusion

This chapter presented recent applications of white-box and black-box models that have been used for the prediction of bioremediation performances of hydrocarbon pollutants. In addition, this chapter described various physicochemical and environmental parameters that affect the biological activities of hydrocarbon degradation. Isotherm and kinetic studies that have been employed for the adsorption of hydrocarbon contaminants from aqueous solution were reviewed. The sorbent materials used in the literature were deposited carbon, bentonite, powdered activated carbon, coal, microorganisms, and agricultural waste (e.g., corncobs, rice husks, and barley straw). Factorial design experiments were demonstrated to determine the effects of several factors, viz., culture pH, temperature, pollutant concentration, mixing speed, and reaction time, on hydrocarbon removal efficiencies. Artificial neural network and fuzzy inference systems were also applied for modeling, developing, controlling, and simulating hydrocarbon remediation processes.