1 Introduction

Laminar diffusion flames are ideal research targets for soot studies, since fuel pyrolysis, soot inception, growth, and oxidation can be readily identified along the flame height. A few systematic diffusion flame datasets have been established by soot research groups worldwide [1,2,3,4]. Accurate and detailed flame temperature fields are essential to address the remaining issues in soot inception and oxidation processes and to refine the corresponding soot submodels. Some non-intrusive optical techniques can probe the total flame temperature field whether or not soot is present in the flame, for example, Coherent anti-Stokes Raman Spectroscopy (CARS) [5], Filtered Rayleigh scattering [7], and Two Line Atomic Fluorescence (TLAF) [6]. These optical techniques are based on different temperature-dependent principles, and one common feature is that the temperature retrieval relies on behavior at the atomic or molecular level. Normally, these techniques require laser sources, sophisticated optical arrangements and post-processing procedures, which limits their application in practical combustion scenarios.

Soot radiation-based thermometry for soot temperature retrieval is widely applied in various non-intrusive optical techniques, e.g., two-color ratio pyrometry [8], three-color pyrometry [9], spectral soot emission (SSE) [10], modulated absorption emission (MAE) [11, 12], two-color Laser-induced incandescence (LII) [13], etc., because it takes advantage of the simple correlation between soot thermal emission intensity and soot temperature, and offers rapid time response, non-intrusiveness, low cost, and easy setup. One major drawback is that in regions where soot particles are absent or scarce, the flame (gas) temperature cannot be retrieved.

To complete the flame temperature field in the non-sooting region, some non-intrusive optical techniques are available. Tunable diode laser absorption spectroscopy (TDLAS) [14] is a promising one, in which the flame gas temperature is derived from molecular gas spectra and is, therefore, not affected by the absence of soot. However, it is more widely used for line-of-sight-averaged measurements because of its simple optical arrangement. An alternative is a variant of Rayleigh scattering thermometry, namely Filtered Rayleigh scattering (FRS) [7]. It typically places a molecular filter (molecular iodine vapor) in front of the ICCD to reject stray light and preserve the correlation between the temperature and the Rayleigh scattering intensity; the elimination of Mie scattering interference from soot is therefore key to the temperature measurement accuracy. The temperature field measurement was attempted in a lightly sooting flame [15]; however, the accuracy and potential of FRS in more hostile combustion environments still need to be explored. On the other hand, an intrusive method, thin filament pyrometry (TFP), was developed to obtain the flame temperature in non-sooting regions [16]. Basically, a thin SiC fibre is inserted into the flame and its radiative emission is recorded. The flame temperature is then obtained by calibrating the radiative intensity against the thermocouple-derived temperature. Its principal advantages over the thermocouple are a lower susceptibility to soot interference and the ability to sustain long measurements, e.g., 10 min. A high spatial and temporal resolution of 42 \(\upmu \)m and 0.66 \(\upmu \)s, respectively, was also reported [16].

Machine learning has recently emerged in the combustion field and has proved to be a promising method. It extracts information from data automatically by computational and statistical methods to find relations between inputs and outputs even if the dependent and independent variables are not clear [17]. García-Cuesta et al. [18] attempted to retrieve hot gas temperature profiles from infrared spectra of CO\(_{2}\) and H\(_{2}\)O in the exhaust gas plume of a micro-jet engine via artificial neural network approaches. Most recently, Ren et al. [19] developed an inverse radiation model based on the Multi-Layer Perceptron (MLP) neural network method to retrieve temperature and gas species volume fraction distributions from infrared spectral emission measurements for combustion gas mixtures. The predicted temperature fields were in excellent agreement with temperatures deduced from Rayleigh scattering thermometry. Furthermore, the simultaneous prediction of soot volume fraction and temperature fields in ethylene laminar flames from infrared soot emission through a modified MLP neural network was explored in Ref. [20] for N\(_{2}\) and CO\(_{2}\) diluted flames. The results were compared with those obtained from the MAE technique, and good temperature prediction precision was found. However, the predicted flame temperature fields were still confined to the soot-containing region of the flames. For the flames studied, the soot particles are small and in the Rayleigh limit, so the soot temperature and gas temperature are assumed to be the same. The retrieved soot temperatures are therefore just one part of the flame temperatures. Using the information provided by the soot temperature distributions, it might be possible to recover the complete flame temperature distributions.

In this paper, a novel two-step MLP approach is proposed to complete the flame temperature field from the soot temperature field previously measured from the soot thermal radiative intensity ratio, i.e., by the MAE technique. The feasibility of this approach is first numerically assessed by recovering the complete flame temperature field, and its robustness is investigated by incorporating Gaussian random noise into the training and testing data. The method is then experimentally validated by recovering the complete flame temperature field of a standard Santoro flame that was previously measured by thermocouple. Additionally, the prediction accuracy of the two-step MLP is detailed.

2 The machine learning approach

The MLP neural network procedures applied to retrieve flame gas species volume fraction and temperature distributions, or soot temperature and volume fraction fields, were detailed in previous studies [19, 20]. The major features of the MLP method are briefly recalled here. The MLP consists of an input layer, one or more hidden layers, and an output layer. Each layer comprises several nodes called neurons. Neurons of one layer are directly connected to those of the next layer by weights. Each neuron in the hidden layers transforms the values from the previous layer with a weighted linear summation, followed by a nonlinear activation function (see Ref. [19] for how data are processed within neurons).

An important part of modeling with neural networks is the so-called training of the network (learning). Training adjusts the weights \(\mathbf{W }\) between neurons to minimize a cost function that compares the output values generated by the network with the corresponding actual values. The cost function is,

$$\begin{aligned} F \left( \mathbf{X }_p,\mathbf{X },\mathbf{W } \right) =\displaystyle \Vert \mathbf{X }_p -\mathbf{X } \Vert ^2 +\alpha \Vert \mathbf{W } \Vert ^2 \end{aligned}$$
(1)

where \(\mathbf{X }_p\) denotes the values predicted by the neural network, \(\mathbf{X }\) the corresponding actual values, and \(\alpha \) the regularization parameter, which helps to avoid overfitting by penalizing weights with large magnitudes. Learning is an iterative process that uses a relatively large number of samples; these should contain information spread evenly over the entire range of the system so that a sufficiently low value of the cost function can be reached. After training, the model can be directly used to predict new outputs from new inputs. In the context of completing the flame temperature field, the training inputs are incomplete flame temperature (soot temperature) fields and the outputs are complete ones, both generated from numerical simulations. After training, the MAE-measured soot temperature fields can be fed into the trained model to predict the corresponding flame temperature fields. The present study uses the scikit-learn Python library [21], and the MLP training uses the stochastic gradient-based optimizer proposed by Kingma and Ba [22].
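As a minimal sketch of Eq. (1) exactly as written, the cost can be evaluated in plain Python as the squared residual norm plus \(\alpha \) times the squared weight norm. The function name `cost` and all numerical values are hypothetical, chosen only to make the arithmetic easy to follow; library implementations may scale the two terms differently.

```python
def cost(predicted, actual, weights, alpha):
    """Eq. (1): squared residual norm plus an L2 penalty on the weights."""
    residual = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    penalty = alpha * sum(w ** 2 for w in weights)
    return residual + penalty


# Hypothetical values, for illustration only.
x_pred = [1500.0, 1600.0]   # predicted temperatures (K)
x_true = [1510.0, 1580.0]   # actual temperatures (K)
w = [0.5, -0.5]             # flattened network weights

# Residual term: 10^2 + 20^2 = 500; penalty term: 1000 * (0.25 + 0.25) = 500.
print(cost(x_pred, x_true, w, alpha=1000.0))  # 1000.0
```

With a large \(\alpha \), the penalty term can weigh as much as the residual term, which is how large weights are discouraged.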

2.1 Two-step MLP model

Machine learning with artificial neural networks requires a large set of training data [19, 20, 23]. These data can come from experimental measurements, numerical simulations, or both. In the present study, the training data are based on the numerical temperature field of one standard Santoro flame (ethylene fuel flow 0.231 L/min, air coflow 43 L/min) from Ref. [24]. To complete the non-sooting region temperature of sooting flames, total flame temperature fields are required as a priori information for training the model. However, the numerical temperature field is the only available source. The training data are therefore generated solely from the simulated temperature field of the standard Santoro flame, combined with a series of soot temperature field “shapes” of the kind normally obtained from the MAE technique. For example, for the diluted Santoro flames, the temperature field region detectable by the MAE technique shrank considerably with increasing N\(_{2}\) or CO\(_2\) fraction [4, 20]. The training inputs are generated by mapping the “shapes” of the MAE soot temperature fields onto the numerical temperature field and keeping only the domain where MAE has values. In total, there are 14 MAE soot temperature “shapes”: one flame without any dilution, 7 flames with different levels of N\(_2\) dilution and 6 flames with different levels of CO\(_2\) dilution.
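The mapping of an MAE “shape” onto the numerical field can be sketched as a simple masking operation. This is only an illustration under stated assumptions: the function name `apply_shape`, the use of `None` for pixels without an MAE value, and the small 2 \(\times \) 3 fields are all hypothetical, and the encoding of missing pixels actually fed to the network is not specified here.

```python
def apply_shape(numerical_field, mae_field, missing=None):
    """Keep numerical temperatures only where the MAE field has a value."""
    return [
        [t_num if t_mae is not None else missing
         for t_num, t_mae in zip(num_row, mae_row)]
        for num_row, mae_row in zip(numerical_field, mae_field)
    ]


# Hypothetical fields (K); None marks pixels where MAE detects no soot.
numerical = [[1800.0, 1700.0, 900.0],
             [1900.0, 1750.0, 1000.0]]
mae_shape = [[None, 1650.0, None],
             [1850.0, 1700.0, None]]

training_input = apply_shape(numerical, mae_shape)
print(training_input)
# [[None, 1700.0, None], [1900.0, 1750.0, None]]
```

Only the shape of the MAE field is used; the temperature values kept in the training input are the numerical ones.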

Two models are built in this novel MLP neural network approach, and both perform 1-D soot temperature to 1-D flame temperature predictions. One model (MLP1) recovers the soot temperature field horizontally and the other (MLP2) further recovers it vertically, resulting in a complete 2-D flame temperature field. Since our MAE soot temperature field has a dimension of 915 \(\times \) 80 pixels, each available MAE flame measurement can be used to generate 915 pairs of horizontal datasets for training MLP1 and 80 pairs of vertical datasets for training MLP2. During the training process, we first perturbed the temperature field with Gaussian random noise, then randomly assigned 90% of the data to the training set and held out the remaining 10% for cross-validation to test the models. The models were trained and tested with 5% and 10% of Gaussian random noise in the training and testing data. The architecture of an MLP neural network is defined by a list of parameters called hyperparameters, such as the number of hidden layers, the number of neurons in each hidden layer and the regularization parameter \(\alpha \). There is no specific approach to determine the number of hidden layers and their neurons for different problems. The choice of the optimal hyperparameters remains more of an art than a science and is usually made by trial and error. The criterion is to select the hyperparameter combination that maximizes the training and testing scores. Both scores are the \(R^2\) score, also known as the coefficient of determination, which is defined as,
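The slicing of a 2-D field into horizontal and vertical 1-D training pairs, followed by a random 90%/10% split, might look as follows. This is a sketch under assumptions: the helper names `transpose`, `line_pairs` and `split_90_10` are hypothetical, a 20 \(\times \) 4 stand-in field replaces the 915 \(\times \) 80 one, and missing pixels are encoded as 0.0 purely for illustration.

```python
import random


def transpose(field):
    """Swap rows and columns of a 2-D list."""
    return [list(column) for column in zip(*field)]


def line_pairs(inputs, targets, direction):
    """Pair each 1-D input line with the matching 1-D target line."""
    if direction == "vertical":
        inputs, targets = transpose(inputs), transpose(targets)
    return list(zip(inputs, targets))


def split_90_10(pairs, seed=0):
    """Shuffle the pairs; return (90% training set, 10% testing set)."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.9 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Tiny stand-in fields: 20 rows x 4 columns instead of 915 x 80 pixels.
full = [[1000.0 + 10 * i + j for j in range(4)] for i in range(20)]
# Assumption: pixels outside the soot region are encoded as 0.0.
masked = [[t if j < 2 else 0.0 for j, t in enumerate(row)] for row in full]

mlp1_pairs = line_pairs(masked, full, "horizontal")  # 20 pairs of length 4
# In the two-step scheme the MLP2 input would be the MLP1 output;
# the masked field is reused here only to show the slicing.
mlp2_pairs = line_pairs(masked, full, "vertical")    # 4 pairs of length 20
train, test = split_90_10(mlp1_pairs)
print(len(mlp1_pairs), len(mlp2_pairs), len(train), len(test))  # 20 4 18 2
```

For a 915 \(\times \) 80 field the same slicing gives 915 horizontal pairs of length 80 and 80 vertical pairs of length 915.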

$$\begin{aligned} R^2=1-\frac{\sum \left( y - y _p\right) ^2}{\sum \left( y -\bar{ y }\right) ^2} \end{aligned}$$
(2)

where \( y _p\) is the value predicted by the neural network, \( y \) is the actual value and \(\bar{ y }\) is the mean of the actual values; the best possible score is 100%. After trial and error, we found that 4 hidden layers with 400 neurons in each hidden layer and \(\alpha \) = 1000, for both MLP1 and MLP2, give the best scores, and these settings are used as the optimal neural network architectures.
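Equation (2) can be checked by hand on a few values. The following sketch computes the \(R^2\) score for a single output; the function name `r2_score` is chosen here for illustration and the temperatures are hypothetical.

```python
def r2_score(actual, predicted):
    """Eq. (2): coefficient of determination for one output variable."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - yp) ** 2 for y, yp in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical temperatures (K).
actual = [1000.0, 1500.0, 2000.0]
predicted = [1100.0, 1500.0, 1900.0]

# Residual sum of squares: 20000; total sum of squares: 500000,
# so the score is 1 - 0.04, i.e. about 96%.
score = r2_score(actual, predicted)
print(round(100 * score, 2), "%")
```

A perfect prediction gives a score of exactly 1 (100%), and a model that always predicts the mean of the actual values gives 0.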

Fig. 1 A representative process of temperature completion from the soot temperature field (b) to the final predicted entire flame temperature field (d)

Figure 1 demonstrates a representative process of flame temperature completion from the soot temperature field by the two-step model. Only the right half of the diffusion flame temperature field is shown since the flame is axisymmetric. Figure 1b displays the representative soot temperature field, whose shape is similar to that of the measured MAE soot temperature field, and which is used as the input of the MLP neural network. It is the numerical temperature field restricted to the domain shape in which soot temperature is experimentally detectable. The output of the first step, MLP1, is shown in Fig. 1c and serves as the input of the second step, MLP2. The final flame temperature field output by the two-step MLP model is shown in Fig. 1d. As anticipated, the two-step MLP neural networks predict the “experimentally” missing parts of the flame temperature. Ideally, the field in Fig. 1d should be identical to that in Fig. 1a, which is the original numerical temperature field from Ref. [24]. During the training process, the data in Fig. 1b, c can be artificially generated from Fig. 1a and are then used to train the two models MLP1 and MLP2.
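The data flow of the two steps (rows first, then columns) can be illustrated independently of the trained networks. In the sketch below, `fill_from_neighbour` is a deliberately crude, hypothetical stand-in for a trained model that simply copies the nearest known value along a line; it is not the MLP, and `complete_field` only shows how a row-wise model and a column-wise model would be chained.

```python
def complete_field(field, row_model, column_model):
    """Apply row_model to every row, then column_model to every column."""
    intermediate = [row_model(row) for row in field]          # step 1 (role of MLP1)
    columns = [list(col) for col in zip(*intermediate)]
    completed = [column_model(col) for col in columns]        # step 2 (role of MLP2)
    return [list(row) for row in zip(*completed)]


def fill_from_neighbour(line):
    """Hypothetical stand-in model: copy the nearest known value."""
    known = [(i, v) for i, v in enumerate(line) if v is not None]
    if not known:
        return list(line)  # nothing to extrapolate from on this line
    return [min(known, key=lambda kv: abs(kv[0] - i))[1] if v is None else v
            for i, v in enumerate(line)]


# Hypothetical soot temperature field (K); None marks soot-free pixels.
soot_field = [[None, None, None],
              [None, 1700.0, None],
              [1800.0, 1750.0, None]]

result = complete_field(soot_field, fill_from_neighbour, fill_from_neighbour)
print(result)
# [[1700.0, 1700.0, 1700.0], [1700.0, 1700.0, 1700.0], [1800.0, 1750.0, 1750.0]]
```

In this toy case the first row contains no soot data at all, so the horizontal step leaves it empty and only the vertical step can fill it, which illustrates one reason a second, vertical pass is useful.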

2.2 Robustness of the MLP model

Fig. 2 Comparison between the ideal and the MLP recovered flame temperature fields with 0%, 5% and 10% of Gaussian random noise added to both training and testing data

To show the robustness of the MLP model, the training temperature fields are perturbed with Gaussian random noise at two levels, 5% and 10%. After the MLP neural network is trained, an artificially generated soot temperature field with the same level of noise is fed into the network to predict new temperature distributions. Figure 2 compares the ideal (noise-free) temperature field with those recovered from the MLP neural network predictions. As shown in the figure, even with 10% of random noise in the training and testing data, the temperature field is recovered very well.
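One way to perturb a temperature field with Gaussian random noise is sketched below. It assumes that a noise level of 5% means a relative standard deviation of 0.05 applied multiplicatively to each pixel; this interpretation, the function name `add_relative_noise` and the uniform 1500 K test field are assumptions made for illustration.

```python
import random


def add_relative_noise(field, level, rng):
    """Multiply each temperature by (1 + e), with e drawn from N(0, level)."""
    return [[t * (1.0 + rng.gauss(0.0, level)) for t in row] for row in field]


rng = random.Random(42)                      # fixed seed for repeatability
ideal = [[1500.0] * 50 for _ in range(40)]   # hypothetical uniform field (K)
noisy = add_relative_noise(ideal, 0.05, rng)

deviations = [abs(n - i)
              for noisy_row, ideal_row in zip(noisy, ideal)
              for n, i in zip(noisy_row, ideal_row)]
mean_deviation = sum(deviations) / len(deviations)
print(round(mean_deviation, 1), round(max(deviations), 1))
```

Under this assumption, a 5% level at 1500 K corresponds to a standard deviation of 75 K per pixel, so individual noisy values can lie well over 100 K from the ideal ones.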

Fig. 3 Comparison between the ideal and the MLP predicted temperatures with 5% and 10% of Gaussian random noise added to both training and testing data

The correlations between the MLP predicted temperatures and the ideal ones for the 5% and 10% cases are shown in the upper two frames of Fig. 3, and the recovered temperature profiles at a height above the burner of z = 40 mm are shown in the lower two frames. The noisy training data are also presented for comparison, where the shaded band represents the standard deviation interval. As indicated in the figure, the discrepancies between the recovered and ideal values do not depend on the location within the flame. Despite the relatively large noise in the training and testing data in the 5% case, almost all the recovered temperatures are within 80 K of the ideal values. When the noise level increases to 10%, the training temperatures can be as much as 400 K away from the ideal values, yet the MLP neural network models still recover the ideal temperatures quite well, with discrepancies well within 100 K, as indicated in Fig. 3.

3 Experimental validation

3.1 Experimental Santoro flame

Fig. 4 Left sub-images: a representative process of temperature completion from the experimentally measured MAE soot temperature field (a) to the final predicted entire flame temperature field (c). Right sub-images: comparison of flame temperature profiles at different heights above the burner; black asterisk: experimental data from MAE [4]; red triangle: output from two-step MLP model; blue square: experimental data from McEnally et al. [25]; yellow stars: experimental data from Santoro et al. [26]; gray diamond: experimental data from McEnally et al. [27]; pink triangle: experimental data from Santoro et al. [28]

With the two-step MLP neural network available, the temperature completion is further carried out on a real experimental temperature field. Figure 4a displays the soot temperature field of the standard Santoro flame measured by the MAE technique (see Ref. [4] for details). Figure 4b, c show the intermediate and final flame temperature fields output by the two-step MLP models, respectively. In general, the two-step model correctly recovers the trend of the experimental temperature field; however, the recovered temperature is slightly lower than the numerical one (Fig. 1a). This is mainly because the original experimental soot temperature from MAE is lower than the numerical one.

For further validation, the radial flame temperature profiles at three heights of 20, 40 and 50 mm are compared with datasets from the literature, as shown by the right sub-images in Fig. 4. The asterisks (\(*\)) stand for the experimental temperature profiles obtained by the MAE technique (in Fig. 4a). The red triangles (\(\triangle \)) represent the temperature profiles output by the two-step MLP model (in Fig. 4c). The blue squares, yellow stars, gray diamonds and pink triangles represent the temperature profiles measured by thermocouple by McEnally et al. [25], Santoro et al. [26], McEnally et al. [27] and Santoro et al. [28], respectively. The temperature profiles recovered by the MLP models are consistent with the experimental MAE ones in the soot region. More importantly, in the soot-free region highlighted by the shaded areas, the MLP models predict the temperature variation trends well, and the predicted profiles match the thermocouple measurements.

3.2 Estimation of prediction temperature uncertainties

Since the initial parameters, i.e., the initial weights, and the stochastic objective function in the Adam optimizer are assigned randomly during MLP model training, the predicted temperature field differs slightly between individually trained models. Thus, ten independent models were trained, and the error of each predicted flame temperature field was evaluated by Eq. (3).

$$ T_{{{\text{ave}}}}^{t} = \frac{1}{N}\sum\limits_{{i = n_{1} ,\;j = n_{2} }}^{{i = n_{3} ,\;j = n_{4} }} \left| {T_{{{\text{PRE}}}}^{t} (x_{i} ,y_{j} ) - T_{{{\text{MAE}}}} (x_{i} ,y_{j} )} \right| $$
(3)

\(T_{\text {PRE}}^{t}(x_{i},y_{j})\) is the \(t^{\text {th}}\) predicted flame temperature field, restricted to the soot-containing positions that can be measured by the MAE technique, while \(T_{\text {MAE}}(x_{i},y_{j})\) is the experimental soot temperature field probed by the MAE technique and N is the total number of pixels within the MAE-probed soot temperature field. In the present study, N is estimated as 21,967.
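The averaged absolute error of Eq. (3) can be sketched as a mean over the pixels where an MAE value exists. The function name `mean_abs_error`, the use of `None` for pixels outside the MAE-probed region, and the small fields below are hypothetical.

```python
def mean_abs_error(predicted, mae_field):
    """Eq. (3): average |T_PRE - T_MAE| over pixels where MAE has a value."""
    diffs = [abs(p - m)
             for pred_row, mae_row in zip(predicted, mae_field)
             for p, m in zip(pred_row, mae_row)
             if m is not None]
    return sum(diffs) / len(diffs)   # len(diffs) plays the role of N


# Hypothetical fields (K); None marks pixels outside the MAE-probed region.
predicted = [[1650.0, 1720.0, 1200.0],
             [1810.0, 1690.0, 1100.0]]
mae = [[None, 1700.0, None],
       [1800.0, 1700.0, None]]

# Three pixels contribute: |1720-1700| + |1810-1800| + |1690-1700| = 40 K.
error = mean_abs_error(predicted, mae)
print(round(error, 2))  # 13.33
```

Predicted values outside the MAE-probed region do not enter the average, since no experimental reference exists there.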

Two parameters are then calculated to assess the prediction uncertainty. The average of the ten mean absolute temperature errors and the corresponding sample standard deviation are computed by Eqs. (4) and (5), respectively:

$$\begin{aligned} \begin{aligned} \overline{T_{\text {ave}}}=\frac{1}{10}\displaystyle \sum _{t=1}^{t=10}T_{\text {ave}}^{t} \end{aligned} \end{aligned}$$
(4)
$$\begin{aligned} \begin{aligned} S=\sqrt{\frac{1}{9}\displaystyle \sum _{t=1}^{t=10}(T_{\text {ave}}^{t}-\overline{T_{\text {ave}}})^{2}} \end{aligned} \end{aligned}$$
(5)
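Equations (4) and (5) correspond to the ordinary mean and the sample standard deviation (denominator 9 for ten models). The sketch below uses Python's `statistics` module and compares it with the formula written out explicitly; the ten error values are hypothetical and are not the values reported in this study.

```python
import math
import statistics

# Hypothetical averaged absolute errors (K) of ten independently trained models.
errors = [30.0, 32.0, 31.0, 29.0, 33.0, 30.0, 31.0, 32.0, 30.0, 32.0]

mean_error = statistics.mean(errors)       # Eq. (4)
sample_std = statistics.stdev(errors)      # Eq. (5), n - 1 denominator

# The same standard deviation written out as in Eq. (5).
manual_std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / 9)

print(mean_error, round(sample_std, 3))  # 31.0 1.247
```

Note that `statistics.stdev` uses the n - 1 denominator, whereas `statistics.pstdev` would divide by n and would not match Eq. (5).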

Table 1 summarizes the average of the ten mean absolute temperature errors and the corresponding sample standard deviation for the two-step MLP models. During the two-step MLP prediction, the temperature field in the soot region is predicted twice. Thus, the absolute errors and the corresponding sample standard deviations of the two steps are reported separately. The first-step prediction performed better than the second (final) step, which can be attributed to the accumulation of the first-step prediction errors in the second model prediction. Nevertheless, the final-step prediction still exhibited low prediction uncertainty and strong prediction stability. Besides, since the reported temperature measurement uncertainty of the MAE technique is ± 50 K [12], the total two-step MLP prediction uncertainty can be estimated as ± 85.5 K.

Table 1 Statistical averaged absolute temperature errors and sample standard deviation in the two-step MLP models

3.3 Efficiency

Machine learning carries out the time-consuming parts beforehand, i.e., training data generation, model training and validation. Once the model is ready, the recovery process is very efficient: MLP1 takes about 0.43 s and MLP2 about 0.70 s of CPU time to recover a temperature field of 915 \(\times \) 80 pixels on an Intel Xeon Gold 6130 processor. The higher computational cost of MLP2 is due to the larger number of neurons in both its input and output layers.
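CPU time for a single recovery step can be measured with `time.process_time`, as in the following sketch. The helper `time_prediction` and the trivial `stand_in_predict` function are hypothetical; the latter merely copies the field and is not a trained model, so the measured time has no relation to the values quoted above.

```python
import time


def time_prediction(predict, field):
    """Return the prediction and the CPU time (s) spent computing it."""
    start = time.process_time()
    output = predict(field)
    return output, time.process_time() - start


def stand_in_predict(field):
    """Hypothetical placeholder for a trained model: copies the field."""
    return [list(row) for row in field]


field = [[1500.0] * 80 for _ in range(915)]   # 915 x 80 pixels
output, cpu_seconds = time_prediction(stand_in_predict, field)
print(f"{cpu_seconds:.4f} s of CPU time")
```

Unlike wall-clock timers, `time.process_time` counts only the CPU time of the current process, which is the quantity reported in this section.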

4 Discussion

The proposed two-step MLP model was trained on a numerical temperature field because total temperature field sources for the standard Santoro flame are limited in the literature. Nevertheless, the feasibility and robustness of the method were demonstrated. Furthermore, this two-step approach could be adapted to other soot radiation-based thermometry techniques, whether the input total flame temperature field comes from simulation or experiment. Even though the current model only works for a Santoro-type burner within certain ranges of flow conditions, the proposed method can be applied to other types of burners and other flames as well. New models can be trained for different flame conditions, e.g., different fuels, flow rates, dilutions, etc.

In fact, a more widely applicable model is the target of our future work. For example, we could obtain a series of total temperature fields of N\(_{2}\) diluted flames from numerical simulations and then train a new model to complete the temperature field, probed by soot radiation-based thermometry, of a flame at any N\(_{2}\) dilution fraction. A universal model that could predict full temperature fields for different fuels, flow rates, dilutions, etc. is our ultimate goal. However, additional experimental or numerical data sources are required as a priori information for model training, testing and validation. With these, the applicability scope of the model could be significantly extended.

In addition, an MLP approach based on experimental data to recover the non-sooting region temperature in sooting flames deserves further investigation. For example, in Ref. [19], the flame gas temperature could be retrieved from the mid-infrared flame radiation of CO\(_{2}\) or H\(_{2}\)O molecules through an MLP approach. Therefore, the total sooting flame temperature field could theoretically be obtained from simultaneous infrared and mid-infrared flame radiation measurements.

5 Conclusion

This paper presents an original two-step Multi-Layer Perceptron (MLP) neural network method that completes the missing parts of the flame temperature field obtained by soot radiation-based thermometry, i.e., the MAE technique. The two-step MLP model is trained on the numerical temperature field of one standard Santoro flame. The feasibility of this approach is verified by completing artificially generated “experimental” soot temperature fields shaped like those from the MAE technique. Furthermore, the robustness of the approach is assessed by introducing 5% and 10% Gaussian random noise into the training and testing temperature fields. The recovered temperatures are found to be within 80 K and 100 K of the ideal values, respectively. Moreover, the temperature profiles predicted by the two-step MLP models in the soot-free region are validated against independent thermocouple results. A consistent and more complete flame temperature field is obtained by the MLP method, compared to the soot temperature field probed via MAE. The two-step MLP model exhibits low prediction uncertainty and strong prediction stability, and the total prediction uncertainty is estimated as ± 85.5 K. Finally, it is worth mentioning that the proposed two-step MLP method could help all kinds of soot radiation-based thermometry to retrieve the complete flame temperature field, provided that a total temperature field source is available as a priori information.