Calcium treatment is widely employed in the production of Al-killed steels to modify Al2O3 inclusions and prevent nozzle clogging.[1,2,3,4,5,6,7,8,9] Adding calcium to the molten steel also benefits desulfurization, control of the sulfide inclusion morphology, and steel performance.[1,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24] However, the yield of calcium is difficult to control stably during calcium treatment because calcium has a low solubility in molten steel and a low boiling point. The calcium yield is therefore an important issue for stabilizing steel performance, improving production efficiency, and reducing production costs. Previous studies mainly focused on the thermodynamic and kinetic mechanisms governing the evolution of the morphology, size, composition, and number of inclusions during calcium treatment.[1,6,25,26,27,28,29,30] The yield of calcium itself has received little attention, and because it is affected by many factors, stable control of the calcium treatment process remains difficult. Related studies are summarized in Table I. The yield of calcium was found to increase with temperature.[15,31,32,33,34] It was also closely related to the type of calcium wire and the steel composition.[31,32,33,35,36,37]

An artificial neural network, commonly called a neural network, is a data processing model inspired by biological neural networks.[38,39,40,41,42,43,44,45,46,47,48] It models the relationship between input and output data by adjusting the weights of its neurons. The basic unit of a neural network is the artificial neuron, whose output is described by Eq. [1], with the output value obtained through a sigmoid function. A series of such nodes, each with simple processing ability, are connected by weights, and these connection weights are adjusted during training so that the network produces more precise predictions.

Table I Related Studies on the Yield of Calcium Wire.
$$ y_{i} = f\left( {\sum\limits_{i = 1}^{n} {w_{i} x_{i} - \theta_{i} } } \right) $$
(1)

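where \(x_{i}\) are the input values, \(w_{i}\) are the weights, \(\theta_{i}\) is the threshold value, \(n\) is the number of inputs, and \(f\) is the activation function.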
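As an illustration of Eq. [1], the output of a single artificial neuron can be sketched as follows. The function names and the numerical values are hypothetical and are not taken from the plant dataset; the logistic form of the sigmoid is assumed.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, w, theta):
    """Eq. [1]: weighted sum of the inputs minus the threshold, passed through f."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return sigmoid(z)


# Illustrative inputs, weights, and threshold (assumed values).
y = neuron_output(x=[0.5, -1.0, 0.25], w=[0.4, 0.1, -0.8], theta=0.1)
```

Changing any weight or the threshold shifts the weighted sum and therefore the output, which is the quantity that training adjusts.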

In the current study, neural network models including shallow neural network (SNN), deep neural network (DNN), and neural network optimized by genetic algorithm (GA-BP) were established and compared to predict the yield of calcium during the calcium treatment process and improve the efficiency of calcium treatment.

The workflow of the calcium yield prediction is shown in Figure 1. First, the data were acquired, analyzed, and preprocessed. To improve the accuracy of the prediction, three neural network models (SNN, DNN, and GA-BP) were then established. The SNN is the traditional model, the DNN increases the depth of the network relative to the SNN, and the GA-BP uses a genetic algorithm to optimize the weights and thresholds of the network. The accuracy and efficiency of the three models were compared through training and testing. Finally, the optimal model was chosen and applied to predict the yield of calcium.

Fig. 1 The workflow of the research on the prediction of calcium yield

Before training, the input and target variables were scaled to a specified range using Eq. [2]; here, all variables were scaled to the range from −1 to 1. The root mean square error (RMSE), calculated using Eq. [3], was used to describe the average error between predicted and experimental values and thus to evaluate the accuracy of the models.

$$ y = (y_{\max } - y_{\min } ) \times \frac{{x - x_{\min } }}{{x_{\max } - x_{\min } }} + y_{\min } $$
(2)
$$ {\text{RMSE}} = \sqrt {\frac{{\sum {(Y_{{{\text{Cal}}{.}}} - Y_{{{\text{Exp}}{.}}} )^{2} } }}{n}} $$
(3)

where \(y_{\max}\) is 1 and \(y_{\min}\) is −1 in the current model, \(x_{\min}\) and \(x_{\max}\) are the minimum and maximum values of the variable \(x\), \(Y_{\text{Cal.}}\) is the predicted calcium yield, \(Y_{\text{Exp.}}\) is the experimental result, and \(n\) is the number of tested data points.
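Eqs. [2] and [3] can be sketched in a few lines. The function names and the sample temperatures below are hypothetical; only the formulas themselves come from the text.

```python
import math


def scale_minmax(x, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Eq. [2]: map x linearly from [x_min, x_max] onto [y_min, y_max]."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min


def rmse(predicted, measured):
    """Eq. [3]: root mean square error between predicted and measured values."""
    if not predicted or len(predicted) != len(measured):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(predicted))


# Hypothetical steel temperatures in deg C, scaled to the range [-1, 1].
temps = [1560.0, 1580.0, 1600.0]
scaled = [scale_minmax(t, min(temps), max(temps)) for t in temps]
```

The minimum and maximum used for scaling would normally be taken from the training data and reused unchanged for the test data, which is consistent with the test data lying within the range of the training inputs.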

The structure of the deep neural network is shown in Figure 2(a), where X represents the input variables and Y the output results, \(n_{j}\) denotes the jth hidden layer, and circles represent neurons connected by lines carrying weight values \(W_{j}\). The DNN extends the SNN by adding more middle layers. Its operation is considered closer to that of a human neural system, which gives more accurate predictions.[52,53] In a BP neural network, the values of weights and thresholds are selected randomly, so learning may converge to a local minimum rather than the global minimum of the mean square error, leaving a large deviation between predicted and actual results. To address this problem, the GA-BP model searches for optimal weights and thresholds through a series of selection, crossover, and mutation operations, as shown in Figure 2(b).

Fig. 2 Schematic diagrams of neural network: (a) traditional model, (b) genetic algorithm optimized model
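The selection, crossover, and mutation steps of a genetic algorithm can be illustrated with a minimal sketch. It is not the GA-BP implementation of this study: the single-neuron model, tournament selection, one-point crossover, Gaussian mutation, elitism, population size, rates, and toy data are all assumptions chosen for brevity.

```python
import math
import random


def predict(genes, x):
    """Single sigmoid neuron; genes = [w_1, ..., w_n, theta]."""
    z = sum(w * xi for w, xi in zip(genes[:-1], x)) - genes[-1]
    return 1.0 / (1.0 + math.exp(-z))


def mse(genes, data):
    """Fitness: mean square error over (inputs, target) pairs; lower is better."""
    return sum((predict(genes, x) - t) ** 2 for x, t in data) / len(data)


def tournament(pop, data, rng, k=3):
    """Selection: the best of k randomly drawn individuals."""
    return min(rng.sample(pop, k), key=lambda g: mse(g, data))


def crossover(a, b, rng):
    """One-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(genes, rng, rate=0.2, sigma=0.3):
    """Mutation: add Gaussian noise to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g for g in genes]


def evolve(data, n_genes, pop_size=20, generations=30, seed=0):
    """Return the best gene vector and the best error of every generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(pop, key=lambda g: mse(g, data))
        history.append(mse(best, data))
        children = [
            mutate(crossover(tournament(pop, data, rng),
                             tournament(pop, data, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [best] + children  # elitism: the best individual always survives
    best = min(pop, key=lambda g: mse(g, data))
    history.append(mse(best, data))
    return best, history


# Toy data (assumed): two scaled inputs, target is 1 only when both are high.
toy = [([-1.0, -1.0], 0.0), ([-1.0, 1.0], 0.0), ([1.0, -1.0], 0.0), ([1.0, 1.0], 1.0)]
best_genes, history = evolve(toy, n_genes=3)
```

In a GA-BP scheme, a gene vector found in this way would typically serve as the starting point for back-propagation training instead of a random initialization. Every generation evaluates the whole population, which is why such a search costs more computation than a single random start.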

The production route of the steels was basic oxygen furnace (BOF) → ladle furnace (LF) refining → Ruhrstahl Heraeus (RH) refining → calcium treatment → continuous casting. The carburant, slag modifier, slag deoxidizer, and lime were added onto the surface of the molten steel during the LF refining process. Carburant and desulfurizer were added during the RH refining process. The composition and temperature of the molten steel were measured during the RH refining. The calcium wire was injected into the molten steel at the end of the RH refining. The yield of calcium was calculated using Eq. [4].

$$ \eta = \frac{{W(\omega [{\text{Ca}}]_{{\text{T}}} - \omega [{\text{Ca}}]_{{\text{O}}} )}}{\chi \beta \mu } \times 10^{3} $$
(4)

where \(\eta\) means the yield of calcium, pct; W means the weight of molten steel, t; \(\omega [{\text{Ca}}]_{{\text{T}}}\) means T.Ca content after the calcium addition, ppm; \(\omega [{\text{Ca}}]_{{\text{O}}}\) means T.Ca content before the calcium addition, ppm; χ means the length of the injected calcium wire, m; β means the calcium content in calcium wire, pct; μ means the mass of calcium wire per meter, kg/m.
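The quantity defined by Eq. [4] is the ratio of the calcium retained in the steel to the calcium injected with the wire. The sketch below writes this ratio as an explicit mass balance with every unit conversion spelled out, rather than with a lumped prefactor. The function name, the treatment of ppm as a mass fraction of 10^-6, and the example values are assumptions and do not come from the plant data.

```python
def calcium_yield_pct(steel_t, ca_before_ppm, ca_after_ppm,
                      wire_m, wire_kg_per_m, wire_ca_pct):
    """Calcium retained in the steel as a percentage of the calcium injected.

    Assumed units: steel mass in t, T.Ca in ppm by mass, wire length in m,
    wire mass per unit length in kg/m, calcium content of the wire in pct.
    """
    # Calcium picked up by the steel, kg (1 t = 1000 kg, 1 ppm = 1e-6).
    retained_kg = steel_t * 1000.0 * (ca_after_ppm - ca_before_ppm) * 1e-6
    # Calcium carried by the injected wire, kg.
    injected_kg = wire_m * wire_kg_per_m * wire_ca_pct / 100.0
    if injected_kg <= 0.0:
        raise ValueError("injected calcium mass must be positive")
    return 100.0 * retained_kg / injected_kg


# Hypothetical heat: 200 t of steel, T.Ca rising from 5 to 25 ppm,
# 100 m of wire at 0.2 kg/m with 100 pct Ca.
eta = calcium_yield_pct(200.0, 5.0, 25.0, 100.0, 0.2, 100.0)
```

For these assumed values, 4 kg of calcium is retained out of 20 kg injected, which gives a yield of about 20 pct.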

The histograms of the 23 variables, with their mean values, are shown in Figure 3. The dataset includes the composition and temperature of the liquid steel, the amounts of slag additions, and operation parameters that strongly affect the calcium yield. In total, 511 sets of data were collected from a steel plant. Of these, 461 sets were used as training data, and the remaining 50 sets were used as test data; the test data lay within the ranges of the input variables of the training data. The input layer had 22 units and the output layer had 1 unit. The maximum number of training epochs was set to 1500 and the learning rate to 0.2. The number of units in each middle layer was determined by an empirical formula, shown as Eq. [5].

$$ M = \sqrt {n + m} + a $$
(5)

where M is the unit number of each middle layer, n is the unit number of the input layer, m is the unit number of the output layer, and a is a constant between 1 and 10. In the current study, the unit number in each middle layer was set to 6. The accuracy and operation efficiency of the various models were compared to optimize the prediction model.
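The range given by Eq. [5] and the resulting network size can be checked with a short sketch. The helper names are hypothetical, and the choice of three middle layers is only an example, since the number of middle layers is varied in the study.

```python
import math


def hidden_units_range(n_inputs, n_outputs, a_min=1, a_max=10):
    """Eq. [5]: bounds on the middle-layer size for a between a_min and a_max."""
    base = math.sqrt(n_inputs + n_outputs)
    return base + a_min, base + a_max


def layer_sizes(n_inputs, n_hidden_layers, units_per_layer, n_outputs):
    """Unit count of every layer, from the input layer to the output layer."""
    return [n_inputs] + [units_per_layer] * n_hidden_layers + [n_outputs]


def parameter_count(sizes):
    """Number of weights plus thresholds in a fully connected network."""
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


low, high = hidden_units_range(22, 1)   # roughly 5.8 to 14.8
sizes = layer_sizes(22, 3, 6, 1)        # three middle layers: assumed example
n_params = parameter_count(sizes)
```

With 22 inputs and 1 output, the formula allows roughly 5.8 to 14.8 units per middle layer, so the chosen value of 6 lies near the lower end of that range.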

Fig. 3 Histograms of the 23 variables in the final dataset

The variation of the RMSE and running time of the models with the number of middle layers is shown in Figure 4. As the number of middle layers increased, the RMSE first decreased and then increased, while the calculation time increased only slightly, within one second. Predicted results of the different network models are shown in Figure 5, where the red line in each panel is the linear regression between predicted and experimental results. Predicted results were close to actual results. Of the three models, the DNN and GA-BP models achieved smaller errors. The RMSE and running time of the three neural network models are listed in Table II. The DNN and GA-BP models gave more accurate predictions, but the running time of the GA-BP model was longer: it applies a series of selection, crossover, and mutation operations to obtain the optimal weights and thresholds, whereas these values are selected randomly in the DNN model.

Fig. 4 Variation of RMSE values with the middle layer number

Fig. 5 Prediction results with different neural network models: (a) SNN, (b) DNN, (c) GA-BP

Table II Comparison of Three Neural Network Models

Comparing the DNN and GA-BP models, the runtime of the GA-BP model was dozens of times longer than that of the DNN model, while the prediction accuracy was not significantly improved. Moreover, in steelmaking the calcium yield must be calculated promptly so that the calcium content can be controlled accurately during the calcium treatment operation. The runtime of the GA-BP model was too long to meet the needs of industrial production. Thus, the DNN model was used to predict the yield of calcium in the current study.

The effects of the C, Si, Mn, and Ca concentrations in the steel on the calcium yield are shown in Figure 6. The relationship between the T.Ca content before calcium addition and the calcium yield is shown in Figure 6(a). The calcium yield decreased with increasing initial T.Ca content. This trend was also reported in a previous study and attributed to the low solubility of Ca in molten steel.[51] In Figure 6(b), the carbon content has a negative effect on the calcium yield. During production, the calcium yield of high carbon steel was generally lower than that of low carbon steel since more CaO inclusions will be formed under higher temperatures and the desired casting temperature for low carbon steel is higher.[31] Song[37] reported that Si and Mn in the molten steel reduce the activity of calcium, thus increasing the calcium solubility in the molten steel and the calcium yield. The effects of the Si and Mn contents are shown in Figures 6(c) and (d); the results predicted by the current DNN model exhibit a trend similar to the reported one. The influence of the temperature and of the feeding speed of the calcium wire on the calcium yield is shown in Figures 6(e) and (f). A higher temperature improved the calcium yield, as reported in previous studies.[15,31,32,33,34] The calcium yield decreased as the feeding speed of the calcium wire increased. Based on the current dataset, lowering the feeding speed to 90 m/min is suggested to increase the calcium yield.

Fig. 6 Impact factors of molten steel on calcium yield: (a) T.Ca, (b) C, (c) Si, (d) Mn, (e) temperature, (f) feeding rate

Figure 7 shows the predicted calcium yield for an automobile structure steel and a tool steel at various feeding speeds and initial T.Ca contents. The calcium yield decreased with increasing T.Ca content before calcium addition, and the feeding speed had a negative effect on the calcium yield within the current range of feeding speeds. The calcium yield of the automobile structure steel was higher than that of the tool steel because of its lower carbon content. With such predictions, the T.Ca content in the molten steel after calcium treatment can be controlled more stably, and the efficiency of calcium treatment can be improved.

Fig. 7 Prediction of Ca yield on (a) automobile structure steel, (b) tool steel

In the current study, three types of artificial neural network models were established to predict the calcium yield during the calcium treatment process. The following conclusions were drawn:

  1. Three types of neural network models, SNN, DNN, and GA-BP, were established to predict the yield of calcium during the calcium treatment process, and their accuracy and running efficiency were compared. The accuracy of the DNN and GA-BP models was higher than that of the SNN model. The DNN model ran more efficiently than the GA-BP model and was therefore selected to predict the yield of calcium for practical application.

  2. It was predicted that the calcium yield decreased with the increase of calcium and carbon contents in the molten steel, while higher Si and Mn contents increased the calcium yield. A higher temperature was beneficial to the calcium yield. Based on the current dataset, lowering the feeding speed to 90 m/min is suggested to increase the calcium yield.

  3. For given calcium treatment parameters, including the steel composition, temperature, and feeding speed of the calcium wire, the developed model can be used to improve the calcium yield and the stability of the calcium content during the calcium treatment process.