1 Introduction

Toxic heavy metal ions are highly soluble in aquatic environments and can be absorbed by living organisms. Once they enter the food chain, heavy metal ions, including chromium ions, may accumulate in the human body at high concentrations [10]. Chromium is one of the most important metals present in the effluents of different industries. In the environment, chromium in its trivalent and hexavalent forms may become toxic when its concentration rises above certain levels; at high concentrations, Cr(VI) can be more toxic than Cr(III). Industries are therefore responsible for controlling the concentration of chromium ions in factory effluent and for finding methods to purify it [13, 25]. Different methods, including sedimentation, ion exchange, immersion, liquid–liquid extraction, reverse osmosis, electrochemical processes, biological processes, and surface adsorption, have been developed for removing these metal ions from wastewater [6, 27]. A comparison of these methods indicates the superiority of surface adsorption over the other water treatment methods, because surface adsorption is simple, feasible, and economical, especially if cheap adsorbents are used [26, 20].

Most adsorbents are highly porous materials, and surface adsorption occurs mostly on the pore walls or at specific sites inside the particles [30, 22]. Adsorbents that are widely used in various industries include activated carbon, activated alumina, zeolites, and metal oxide nanoparticles [30, 22, 11]. Among these adsorbents, metal oxide nanoparticles have recently attracted a lot of attention because of their special features. One of the most important transition metal oxides is nickel oxide, whose nanoparticles are of interest for their catalytic, electronic, and magnetic properties. The porous nature of NiO nanoparticles allows metal cations to be adsorbed on their surface. Response surface methodology (RSM) is a set of statistical and mathematical techniques that are useful for process development, improvement, and optimization; it makes it possible to optimize several variables simultaneously using the least possible amount of resources and quantitative data.

RSM is an effective method for process optimization. Its main advantage is that it reduces the number of experimental runs needed to assess multiple parameters and their interactions.

Neural networks are composed of simple processing units (neurons) that operate in parallel and are inspired by biological neural systems. An artificial structure can be built to mimic natural networks, and the relationship between the neurons is determined by adjusting the strength of each connection, called the connection weight. One or more neurons together form a layer, and a network can consist of one or more such layers.

After the neural network has been adjusted, or trained, a particular input leads to a specific response. As shown in Fig. 1, the network is adjusted by comparing its output with the target output, and adjustments continue until the two agree. Generally, a large number of such input and target pairs are used to train the network in this iterative process, known as supervised learning.

Fig. 1 Compatibility between the input and output

Fig. 2 Optimal structure of artificial neural network

Neural networks are used to perform complicated functions in different fields. These include pattern recognition, for example, the classification of benign and malignant tumours based on information such as cell size, tumour thickness, and mitosis (with a data set of 699 samples, each containing 9 tumour characteristics), a problem that can be solved from the command line or with the graphical interface of the nprtool pattern recognition tool (Kia, Soft Computing in MATLAB). Further applications are identification; speech processing, where general regression neural networks (GRNNs) have been applied to phoneme identification and isolated word recognition in clean speech [3]; and image processing, where the combination of neural networks, image processing, and a CAD-based environment facilitates automatic road extraction and vectorization from high-resolution satellite images [1]. Neural networks are also successful in controlling dynamic systems. The universal approximation capability of multilayer perceptron networks makes them a suitable option for modelling nonlinear systems and implementing control systems.

One of the desirable and important features of a neural network is its ability to learn from the environment in order to improve its performance. Learning is a dynamic, iterative process that modifies the network parameters in response to the signals that the network receives from its environment [9]. In other words, artificial neural networks (ANNs) are processors that are trained to perform particular tasks. Therefore, coupling a computational ANN with a simulated affective system is necessary in order to explore the interaction between the two. In most topologies, learning leads to a change in synaptic efficiency; that is, it changes the connections among the neurons in the layers. After the weights and biases have been initialized, the network is trained. Training requires a set of examples of the behaviour expected from the network, consisting of network inputs and targets. During training, the weights and biases are adjusted so that the network performance (error) function is minimized.

In order to determine the optimal number of hidden nodes, a set of topologies is used in which the number of nodes is varied. Network performance is measured by the mean squared error (MSE) using Eq. 1 [2, 15].

$${\text{MSE}} = \frac{\sum\nolimits_{i = 1}^{N} \left( Y_{i,{\text{pred}}} - Y_{i,\exp} \right)^{2}}{N}$$
(1)

in which N is the number of data points, \(Y_{i,{\text{pred}}}\) is the network prediction, \(Y_{i,\exp}\) is the experimental response, and i is the index of the data.
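As an illustration of Eq. 1, the following minimal Python sketch computes the MSE for a small set of predicted and experimental responses. The function name and the numerical values are hypothetical and are not taken from the present study.

```python
def mean_squared_error(y_pred, y_exp):
    """Eq. 1: average of the squared differences between prediction and experiment."""
    if len(y_pred) != len(y_exp) or len(y_pred) == 0:
        raise ValueError("y_pred and y_exp must be non-empty and of equal length")
    n = len(y_pred)
    return sum((p - e) ** 2 for p, e in zip(y_pred, y_exp)) / n


# Hypothetical removal percentages, for illustration only
predicted = [90.0, 85.0, 70.0]
experimental = [92.0, 84.0, 70.0]
mse_value = mean_squared_error(predicted, experimental)
print(f"MSE = {mse_value:.4f}")
```

In a topology search, this value would be computed for each candidate number of hidden nodes, and the topology with the smallest MSE would be retained.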

Artificial neural networks are typically trained using a suitable gradient-based algorithm such as the conjugate gradient, quasi-Newton, or Levenberg–Marquardt algorithm. The main purpose of training is to minimize the error function by searching for a set of connection weights and biases that causes the network to produce output values near or equal to the target values. In feed-forward neural networks, the input neurons receive the information from external signals, and this information is propagated forward to give the signal at the output neuron. In the backward phase, the connection strengths are changed according to the differences between the predicted and observed values at the output neuron [12].

If the transfer function in the hidden layer is sigmoidal, all samples must be scaled to the range 0.2–0.8. Therefore, each data value \(X_{i}\) is converted to a new value \(A_{i}\) as shown in Eq. 2 [15, 12, 24]:

$$A_{i} = 0.2 + \frac{0.6\,\left( X_{i} - \min (X_{i}) \right)}{\max (X_{i}) - \min (X_{i})}$$
(2)
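The scaling in Eq. 2, together with its inverse for returning predictions to the original scale, can be sketched in Python as follows. The function names and the example pH values are assumptions made for illustration only.

```python
def scale_to_range(values):
    """Eq. 2: map each value linearly onto the interval [0.2, 0.8]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("Eq. 2 is undefined when all values are equal")
    return [0.2 + 0.6 * (x - lo) / (hi - lo) for x in values]


def unscale(scaled, lo, hi):
    """Inverse of Eq. 2: recover values on the original scale."""
    return [lo + (a - 0.2) * (hi - lo) / 0.6 for a in scaled]


# Hypothetical pH levels, for illustration only
ph_levels = [2.0, 4.0, 6.0, 8.0, 10.0]
scaled_ph = scale_to_range(ph_levels)
restored_ph = unscale(scaled_ph, min(ph_levels), max(ph_levels))
print(scaled_ph)
```

The smallest value maps to 0.2 and the largest to 0.8, which keeps the inputs away from the flat tails of the sigmoid.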

To avoid chance correlations caused by the random initial weights, each topology is trained several times. The number of nodes in the hidden layer is fixed once the network performance has stabilized. Neural networks are widely applied in the modelling of chemical processes; for example, Daneshvar et al. used neural networks to model the removal of basic yellow 28 by electrocoagulation. They studied the effect of parameters such as current density, initial pH of the solution, electrolysis time, initial dye concentration, distance between the electrodes, and solution conductivity. The results predicted by the proposed ANN model agreed with the laboratory data [7].

Padma Sree et al. used neural networks and response surface methodology to model and optimize Cr(VI) adsorption from wastewater using ragi husk powder. Their ANN model for Cr(VI) adsorption was a feed-forward back-propagation network with a single hidden layer of 14 neurons, chosen to obtain the minimum mean squared error [18]. Behin et al. performed a comparative study of ANN and RSM. They investigated the effects of four independent variables using a three-level, four-factor central composite experimental design, which was used to train a feed-forward multilayer perceptron network with a back-propagation algorithm. A comparison between the model results and the experimental data indicated high correlation coefficients and showed that both models were able to predict the removal of reactive red 33 by the O3/UV process [4].

In the present study, nickel oxide nanoparticles synthesized by the sol–gel method were used as the adsorbent to remove Cr(VI), a toxic pollutant present in industrial effluent. Based on the optimized effects of the operational parameters obtained from the experimental design, the process was then modelled using an artificial neural network.

2 Experiments

2.1 Materials and methods

All materials used in this study were of analytical grade and were purchased from Merck. The nickel oxide nanoparticles were synthesized by the sol–gel method. A Cr(VI) solution was used as the pollutant, and 0.1 M HCl and 0.1 M NaOH were used to adjust the pH [32].

2.2 Adsorption experiments

Adsorption experiments were carried out by adding different amounts of the adsorbent and pollutant at a temperature of \(20 \pm 1\,^\circ {\text{C}}\), with different pH levels and contact times. The reaction vessels were placed on a stirrer at a constant speed of 1000 rpm. The solution was then centrifuged (Hettich EBA, Kirchlengern, Germany) at 4000 rpm for 5 min to separate the adsorbent. The supernatant was analysed with a UV/Vis spectrophotometer at the maximum absorption wavelength of Cr(VI) (350 nm). Finally, the percentage of adsorption was calculated using Eq. 3:

$$R\% = \frac{C_{0} - C_{i}}{C_{0}} \times 100$$
(3)

where \(C_{0}\) and \(C_{i}\) are the initial and the final concentrations of Cr(VI), respectively.
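Eq. 3 can be evaluated with a few lines of Python, as in the sketch below. The function name and the concentrations are hypothetical; the units only need to be the same for both concentrations.

```python
def removal_percent(c0, c_final):
    """Eq. 3: percentage of Cr(VI) removed from solution."""
    if c0 <= 0:
        raise ValueError("initial concentration must be positive")
    return (c0 - c_final) / c0 * 100.0


# Hypothetical initial and final concentrations in the same units, illustration only
removal = removal_percent(c0=50.0, c_final=5.0)
print(f"R% = {removal:.1f}")
```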

2.3 Designing the experiment by RSM

Response surface methodology (with DX 7 software) was used to obtain the main and interaction effects of the independent variables on the response in the adsorption of Cr(VI) on NiO nanoparticles. The study used a central composite design (CCD) and a quadratic model. The effect of four independent factors on the response was studied: the dosage of NiO adsorbent, the initial concentration of Cr(VI), pH, and contact time [32, 28]. The ranges and levels of these factors are shown in Table 1.

Table 1 Factors and their levels in designing the experiments by RSM
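A central composite design for four factors can be generated in coded units as in the following Python sketch. The axial distance of 2 (the rotatable value for four factors) and the six centre points are assumptions chosen to be consistent with the five levels per factor and the 30 design runs mentioned in this paper; the actual settings used in DX 7 are not stated here.

```python
from itertools import product


def central_composite_design(n_factors, alpha, n_center):
    """Return CCD runs in coded units: factorial, axial, and centre points."""
    # Full two-level factorial part: every combination of -1 and +1
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial (star) points: one factor at -alpha or +alpha, all others at 0
    axial = []
    for k in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[k] = sign * alpha
            axial.append(point)
    # Replicated centre points
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center


# Assumed settings: alpha = 2 and 6 centre points (16 + 8 + 6 = 30 runs)
design = central_composite_design(n_factors=4, alpha=2.0, n_center=6)
coded_levels = sorted({value for run in design for value in run})
print(len(design), coded_levels)
```

Each coded level would then be mapped to a real value of NiO dosage, Cr(VI) concentration, pH, or contact time using the ranges in Table 1.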

2.4 Modelling the results of Cr(VI) adsorption process by NiO nanoparticles by an artificial neural network

The results of the Cr(VI) adsorption process on NiO nanoparticles were modelled with an artificial neural network (in MATLAB 2010), using the results of the experimental design. Optimization of the ANN topology was the next important step. The number of neurons in the hidden layer was determined according to the minimum prediction error of the network and can therefore be regarded as the main design parameter of the network. To determine it, different topologies were examined in which the number of hidden nodes varied from 2 to 15. Figure 3 shows that the network MSE is at its minimum with 10 nodes in the hidden layer. Therefore, based on the mean square error (MSE), a three-layer feed-forward back-propagation artificial neural network with a 4:10:1 topology was used (Fig. 2): 4 neurons in the input layer, 10 neurons in the hidden layer, and one neuron in the output layer.

Fig. 3 Variation of mean square error vs number of neurons in hidden layer

The input variables are shown in Fig. 2. In order to determine the optimal number of neurons in the hidden layer, experiments were conducted with 1–10 neurons; the correlation coefficient was highest with 10 neurons. In the present study, a sigmoidal transfer function was selected for the hidden layer and a linear transfer function for the output layer. The ranges of the variables used in modelling the surface adsorption of Cr(VI) by NiO nanoparticles are shown in Table 2.

Table 2 Ranges of the variables in modelling surface adsorption of Cr(VI) by NiO nanoparticles
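The forward pass of a 4:10:1 network with a sigmoidal hidden layer and a linear output neuron can be sketched in plain Python as follows. This is not the MATLAB implementation used in the study: the weights below are random placeholders rather than trained values, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic transfer function used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Forward pass: sigmoidal hidden layer followed by a linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


n_inputs, n_hidden = 4, 10
rng = random.Random(0)  # fixed seed so the placeholder weights are reproducible
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
b2 = rng.uniform(-1.0, 1.0)

# Four inputs already scaled to the 0.2-0.8 range (hypothetical values)
x_scaled = [0.5, 0.2, 0.8, 0.35]
y_out = forward(x_scaled, w1, b1, w2, b2)
print(y_out)
```

Training would adjust `w1`, `b1`, `w2`, and `b2` to minimize the MSE; only the network structure is shown here.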

One of the main problems of ANN modelling is the need for a large amount of experimental data. In the present study, the results from RSM and the mathematical equation obtained by this method were used to train the ANN for the first time. With 4 parameters at 5 levels, RSM gave a total of 625 data sets, of which 535 were selected for training, validating, and testing the network; for simulation, the 30 samples used in the RSM design were used. Of the 535 data sets, 60% were randomly selected for training, 20% for validation, and 20% for testing.
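The data partition described above can be sketched in Python as follows. Four factors at five coded levels give 5^4 = 625 combinations. How the 535 sets were chosen from the 625 is not stated, so random sampling is assumed here; the coded levels and the seed are also assumptions. The response for each combination would come from the fitted RSM equation, which is not reproduced in this sketch.

```python
import random
from itertools import product

# All combinations of 4 factors at 5 coded levels (assumed coding): 5**4 = 625
levels = (-2, -1, 0, 1, 2)
grid = list(product(levels, repeat=4))

# Assumption: the 535 sets are drawn at random from the 625 combinations
rng = random.Random(42)
selected = rng.sample(grid, 535)

# 60% training, 20% validation, 20% testing; the sample is already in random order
n_train = round(0.6 * len(selected))
n_val = round(0.2 * len(selected))
train_set = selected[:n_train]
val_set = selected[n_train:n_train + n_val]
test_set = selected[n_train + n_val:]
print(len(grid), len(train_set), len(val_set), len(test_set))
```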

3 Results and discussion

3.1 Characterization of NiO nanoparticles

The XRD pattern of the NiO nanoparticles shows five primary peaks at 2θ = 37°, 43°, 63°, 75°, and 79°. The crystallite size was estimated with the Debye–Scherrer equation; the average crystallite size of the NiO nanoparticles prepared by the sol–gel method was about 7 nm. Figure 4 shows the typical XRD pattern of the NiO nanoparticles prepared by the sol–gel method, and a typical TEM image is shown in Fig. 5. The TEM and XRD results for the particle size are in good agreement: the average size of the NiO nanoparticles measured by TEM was below 10 nm. The SEM micrograph in Fig. 6 reveals that the NiO nanoparticles have a uniform size distribution.
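For reference, the Debye–Scherrer equation is conventionally written as shown below. The symbols are the usual ones and are not defined in the original text: D is the crystallite size, K is a dimensionless shape factor (often taken as about 0.9), λ is the X-ray wavelength, β is the full width at half maximum of the diffraction peak in radians, and θ is the Bragg angle. The values of K, λ, and β used in this study are not reported here.

$$D = \frac{K\lambda }{\beta \cos \theta }$$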

Fig. 4 X-ray diffraction of NiO nanoparticles prepared by sol–gel method

Fig. 5 TEM micrograph of NiO nanoparticles prepared by sol–gel method

Fig. 6 SEM image of NiO nanoparticles prepared by the sol–gel method

Fig. 7 shows the EDX spectra of the NiO nanoparticles prepared by the sol–gel method, in which the peaks of Ni and O are clearly visible.

Fig. 7 EDX spectra of NiO nanoparticles prepared by sol–gel method

The BET and BJH plots of nitrogen adsorption onto the NiO nanoparticles are shown in Fig. 8. The specific surface area and total pore volume of the NiO nanoparticles were 109.36 m² g⁻¹ and 0.1993 cm³ g⁻¹, respectively. The pore size distribution (Fig. 9) was determined by the BJH method from the desorption branch of the isotherm. The pore size of the NiO nanoparticles obtained from this distribution was 3.09 nm.

Fig. 8 BET plot of NiO nanoparticles prepared by sol–gel method

Fig. 9 BJH plot of NiO nanoparticles prepared by sol–gel method

3.2 Designing the experiment by RSM

In this method, the effects of 4 independent factors were studied. One of these factors is the contact time, which matters because the equilibrium characteristics, described by adsorption isotherms, show how the adsorbate interacts with the adsorbent. Previous research indicates that the contact time must be long enough for the process to reach equilibrium [21, 23]. The designed experiments and the respective experimental results are shown in Table 3.

Table 3 The results of the experiments designed by RSM

The results of the experiments designed by response surface methodology and the results predicted by this method are given in Table 4.

Table 4 Comparison of the results predicted by RSM with the laboratory data

As evident in Fig. 10, there is good agreement between the experimental values and the values predicted by RSM. Fig. 10 also includes the normal probability plot of the residuals, which was drawn to check the validity of the experiments. The normal distribution of the residuals indicates the adequacy of the model.

Fig. 10 Agreement between the experimental data and the responses from the response surface method

The responses were analysed by analysis of variance (ANOVA), the results of which are given in Table 5. The correlation coefficient (R²) of this analysis is noteworthy; a value close to 1 is desirable. The P and F values can also be considered. The P value is the probability that the observed effect of a term is due to random variability alone, and it is used to judge the significance of each term. The F value compares the variance of a term with the variance of the residual.

Table 5 Analysis of variance from response modelling procedure

The value of R² shows that 95.97% of the variation in removal efficiency is explained by the independent variables; the model fails to account for only 4.03% of the variation.
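The interpretation of R² as the fraction of explained variation follows from its definition, one minus the ratio of the residual sum of squares to the total sum of squares. A minimal Python sketch is given below; the function name and the data are hypothetical and are not the values in Table 4.

```python
def r_squared(y_exp, y_pred):
    """R^2 = 1 - SS_res / SS_tot for experimental and predicted responses."""
    mean_exp = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all experimental values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical removal percentages, for illustration only
experimental = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 69.0, 81.0, 88.0]
r2 = r_squared(experimental, predicted)
print(f"R^2 = {r2:.4f}")
```

With these illustrative numbers, 98% of the variation is explained and 2% is left in the residuals.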

3.3 Modelling the surface adsorption of Cr(VI) using artificial neural network (ANN)

Nowadays, RSM and ANN approaches are widely applied to process modelling and optimization [5, 8, 14, 16, 19, 29]. A comparison of the predictive and generalization capabilities, sensitivity analysis, and optimization abilities of the ANN and RSM techniques revealed that the ANN model fitted the data better and had a higher predictive capability than RSM, even with a limited number of experiments. In order to calculate the training, validation, and testing errors, all the data were transformed back to the original scale so that they could be compared with the original values. Fig. 11 shows the best validation performance versus the number of training epochs of the network.

Fig. 11 Best validation performance versus the number of epochs of the network

Figure 12 shows the comparison between the values obtained from RSM and the calculated values of the output variable using ANN with 10 neurons in the hidden layer.

Fig. 12 Comparison of the data from RSM and ANN for the set of experiments

The results in Fig. 12 show that, given the large number of input data, the ANN could be trained with the data from RSM: the correlation coefficient (R²) and the slope of the fitted line are both equal to one, and the intercept of the fitted line is almost zero. The close agreement between the experimental results and the results from the ANN indicates appropriate training of the network. To confirm this, 30 samples that were not used in training the ANN were used for simulation; comparisons of the ANN output with the RSM results and with the experimental results are shown in Figs. 13 and 14, respectively. The value of R² close to 0.9 indicates agreement between the experimental data and the data predicted by the ANN. These results are similar to those of previous studies. Khayet developed an artificial neural network model for desalination by sweeping gas membrane distillation and examined the agreement between the targets (experimental observations) and the network outputs (predictions) for the training, validation, and test sets. The overall correlation coefficient, including training, validation, and test, was R² = 0.80, while the correlation coefficient for validation was higher (R² = 0.93). These values reveal a satisfactory prediction of the experimental data by the developed MLP-type ANN (3:9:1) [16]. Similarly, Yetilmezsoy used an ANN approach to model Pb(II) adsorption from aqueous solution by Antep pistachio (Pistacia vera L.) shells; the proposed ANN model gave a precise and effective prediction of the experimental data, with a satisfactory correlation coefficient of 0.93 for five operating variables [31].
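The slope and intercept of such a parity line can be obtained by ordinary least squares, as in the Python sketch below. The function name and the paired values are hypothetical and do not reproduce the data in Figs. 12–14.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical RSM responses and ANN outputs, for illustration only
rsm_values = [55.0, 70.0, 85.0, 95.0]
ann_values = [55.5, 69.5, 85.5, 94.5]
slope, intercept = fit_line(rsm_values, ann_values)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

A slope close to one together with an intercept close to zero means that the ANN output reproduces the reference values without systematic bias.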

Fig. 13 Simulation of the data from RSM by ANN

Fig. 14 Simulation of the experimental data by ANN

After appropriate training of the ANN had been confirmed, the matrix of the network weights was obtained; it is reported in Table 6, in which W1 denotes the weights between the input and hidden layers and W2 denotes the weights between the hidden and output layers. The weights are the coefficients between the artificial neurons and act like the synaptic strength between the axons and dendrites of real biological neurons. Each weight therefore determines what proportion of the input signal enters the body of the neuron. In this way, the weight matrix of the neural network specifies the significance of the relationship between the different input variables and the output variable.

Table 6 Matrix of the network weights

4 Conclusion

In the present study, the adsorption efficiency of nickel oxide nanoparticles synthesized by the sol–gel method in the removal of pollutants from aqueous solutions was studied. The BET analysis showed that the specific surface area and total pore volume of the NiO nanoparticles were 109.36 m² g⁻¹ and 0.1993 cm³ g⁻¹, respectively. Owing to its high surface area and natural porosity, NiO can interact with heavy metal ions and is a good candidate for the adsorption process. The effect of different operational parameters, including the initial concentration of Cr(VI), the dosage of NiO adsorbent, contact time, and pH, on the removal of Cr(VI) by sol–gel-synthesized NiO at a temperature of \(20 \pm 1\,^\circ {\text{C}}\) was studied. Based on the results of the experimental design, the efficiency of the surface adsorption process was modelled with an artificial neural network. One of the influential parameters in the artificial neural network model is the number of neurons in the hidden layer; the minimum error was obtained with 10 neurons in the hidden layer. The results showed that the data obtained from the ANN fit well with the data from RSM and the experimental data.