Abstract
High-speed planar imaging of key combustion species, such as the hydroxyl radical (OH), is crucial for understanding the complex chemistry–turbulence interactions in turbulent flames. However, conducting high-speed (kHz-rate) diagnostics is challenging due to the requirements on advanced optical systems, including both fast lasers and cameras. In this paper, we report a computational imaging method to artificially achieve higher diagnostic rates based on experimental data acquired at relatively low rates. Sequences of planar laser-induced fluorescence (PLIF) images of OH recorded at 100 kHz in a turbulent flame were first downsampled to 50 kHz, 33.3 kHz and 20 kHz, respectively, and then used as a data source to train several networks. The accuracy of the models was assessed by comparing the predicted images with those from laser measurements. It was found that, among the models tested, the convolutional long short-term memory network (Conv-LSTM) provides the best predictions and is reliable in predicting consecutive images at higher repetition rates. The model can also generate consecutive OH-PLIF images at 200 kHz based on the 100 kHz experimental data. This work sheds light on hybridizing deep learning-based computational methods with conventional high-speed laser diagnostic techniques, which can potentially increase the temporal resolution of planar optical measurements in turbulent flows and reduce the demands on high-speed hardware.
1 Introduction
High-speed planar laser-based diagnostics has been widely applied to experimentally study reacting and non-reacting flows [1,2,3,4,5,6,7,8,9]. Several laboratories have developed high-frequency laser diagnostic facilities for experimentally studying turbulent flows and flames. For example, Slipchenko et al. [10] developed a burst-mode laser whose output frequency can be as high as 1 MHz. The laser was used for planar laser-induced fluorescence (PLIF) measurement of formaldehyde (CH2O) in a lifted methane diffusion flame at 20 kHz. The fundamental frequency (1064 nm) of this laser can also be used for high-speed planar laser-induced incandescence (PLII) measurement of soot concentration. In addition, Fu et al. [11] carried out simultaneous PLIF and three-dimensional particle image velocimetry (PIV) measurements at 20 kHz in an ethylene diffusion flame with acoustic excitation. Michael et al. [12] optimized the burst-mode laser system and successfully carried out a 100 kHz CH2O PLIF measurement on a lifted diffusion flame. In addition to CH2O imaging, the burst-mode laser has also achieved 100 kHz in other measurements, including Rayleigh-scattering temperature measurement [13, 14], coherent anti-Stokes Raman scattering (CARS) temperature measurement [15], and PIV measurement of turbulent flow fields [16]. Furthermore, Philo et al. performed 100 kHz PIV in a liquid-fueled gas turbine swirl combustor at 1 MPa, demonstrating the validity of high-repetition-rate measurements in practical combustion systems [17]. It is also worth mentioning that researchers have used multi-pulsed Nd:YAG systems to generate short bursts for LIF measurements at 20–40 kHz in practical combustion systems such as spark-ignition engines [18].
While most of the measurements discussed above applied the harmonics of the Nd:YAG laser, many laser-based techniques require other excitation frequencies, e.g. PLIF measurement of hydroxyl (OH) radicals, for which the Nd:YAG laser needs to be combined with either a dye laser [19,20,21] or an optical parametric oscillator (OPO) [22]. Sjöholm et al. [23] used this experimental approach for optical diagnostics of other species such as CH, CH2O and toluene. Wang et al. [24] reported the first ultra-high-speed diagnostic technique that simultaneously probes the OH and formaldehyde distributions in a highly turbulent flame at a repetition rate of 50 kHz. Using fast Nd:YAG lasers and frequency-extension units, OH-PLIF imaging has been conducted at 50 kHz by Miller et al. [25] in a H2–air diffusion flame. However, further increasing the speed of imaging measurements is very challenging due to the limited repetition rates of laser systems and cameras, as well as difficulties in storing and transferring large volumes of data. Therefore, either the number of consecutive images is significantly reduced, or the spatial resolution and field of view must be sacrificed to maintain image quality.
Yet these issues can potentially be addressed by combining laser-based imaging with computational methods to artificially accelerate the imaging frequency, inspired by previous works [26,27,28]. As an effective approach to computational imaging, machine learning architectures, particularly deep neural networks, have seen explosive growth, drawing on similar progress in mathematical optimization and computing hardware. While these developments have largely been to the benefit of image interpretation and machine vision, only recently has it become evident that deep neural networks can be effective for computational image formation, aside from interpretation [29].
Generating high-speed imaging from low-frequency measurements essentially requires a temporal sequence prediction model. Such a model takes the reference image sequence (low-speed diagnostics) as the input and generates the temporal interpolation (usually more than one frame) as the output. What is more challenging is that each image in the sequence needs to have high spatial resolution to reconstruct turbulence structures. In machine learning, the recurrent neural network (RNN) [30] and its variants such as long short-term memory (LSTM) [31] and the gated recurrent unit (GRU) [32] are designed to deal with this type of multi-sequence input and multi-sequence output problem, while the spatial features of each image can be resolved by a convolutional neural network (CNN) [33,34,35]. Therefore, it is of great significance to explore the feasibility of such a methodology, namely computing high-frequency imaging from relatively low-frequency experimental data, thus reducing the dependence on high-speed laser–camera setups.
Many recent works have demonstrated the effectiveness of deep learning in generating a sequence of images with high spatial resolution. For example, Hong et al. [36] used a specific combination of skip connections and a convolutional sequence-to-sequence auto-encoder to predict future weather situations from previous satellite images with high accuracy. However, the model only predicted the next weather states, which were scalars instead of image matrices. Shi et al. [37] and Kim et al. [38] predicted future precipitation from historical multichannel radar reflectivity images with a convolutional long short-term memory (Conv-LSTM) network, which has become a seminal model in this area. Afterwards, Finn et al. [39] constructed a network based on this model for predicting the transformation in the next frame from a previous image sequence. Lotter et al. [40] built a predictive model upon Conv-LSTM, mainly focusing on increasing the prediction quality of the next frame. However, these models only predict the next single frame from a sequence of input frames, whereas in this research a sequence of images (equal to or greater than one) needs to be predicted from a single input image, which is more challenging. Patraucean et al. [41] modified RNNs by introducing optical flow to model temporal dynamics, but this methodology is difficult to apply due to the high additional computational costs. Such learning problems, regardless of their exact applications, are nontrivial in the first place due to the high dimensionality of the spatio-temporal sequences, especially when multi-step predictions have to be made. Moreover, building an effective prediction model for high-speed imaging in turbulent flames is even more challenging due to the unsteady nature of turbulent combustion.
This paper adopts the Conv-LSTM network as the computational model, because its convolutional layers can extract spatial features while the LSTM captures temporal characteristics of high-speed imaging. As such, a prediction with sufficient spatial and temporal resolution is expected from Conv-LSTM. Among all the intermediate species of combustion, OH is a critical radical, whose formation is commonly interpreted as a marker of the flame reaction zone. Furthermore, excitation of OH radicals cannot be directly achieved by the harmonics of the commonly employed Nd:YAG lasers [5]. Therefore, this work chose OH-PLIF as an example to explore the feasibility of developing a computational method to artificially increase the diagnostic repetition rate of high-speed imaging in turbulent flames.
2 Experimental data
The experimental data in the present work was collected using an experimental setup that can be found in a previous paper [42], with the details of the optical system summarized in Table 1. An ultra-high-speed laser (Quasimodo, Spectral Energy, LLC) was used, which is similar to, but not identical with, the system developed by Slipchenko et al. [10]. This laser system was employed to pump an optical parametric oscillator (OPO, GWU, PremiScan/MB) to generate the excitation radiation for OH radicals in the flame. The pump beam size was reduced to approximately 4 mm in diameter using a telescope before the β-barium borate (BBO) crystal. The OH excitation scan is not presented here for brevity; more information can be found in Ref. [24]. After the OH excitation scan, the laser radiation was tuned to 283.93 nm (A2Σ+−X2Π, 1–0 transition). The emission from the A–X (0, 0) transition was collected at around 308 nm through a bandpass filter (λT = 310 ± 10 nm) and a UV lens (B. Halle, f# = 2, f = 100 mm) mounted in front of a high-speed intensifier (LaVision HS-IRO) and a CMOS camera (Photron Fastcam SA-Z). A Pellin-Broca prism was used to separate the 284 nm beam from the 568 nm beam after the doubling crystal. A 20 mm high laser sheet was formed by a cylindrical lens (f = −40 mm) and a spherical lens (f = +200 mm). The resulting pulse energy of the 283.93 nm laser at 100 kHz was about 150 µJ/pulse, which generated sufficient SNR in this application. The resolution of the CMOS camera for the OH-PLIF measurement was set to 600 × 240 pixels at 100 kHz, resulting in a field of view of 18 × 7 mm.
The burner employed in this study was a mixed porous-plug/jet burner (LUPJ burner), the details of which can be found in previous works [42,43,44,45]. The main components of the burner are a porous sintered stainless-steel plug with a diameter of 61 mm and a central nozzle with a diameter of 1.5 mm. A premixed CH4/air mixture formed a jet flame through the central nozzle. The jet flow speed is 66 m/s, corresponding to an exit Reynolds number of 6300 [46] and a turbulent Reynolds number of 95 at y/d = 30 [42]. The equivalence ratio of the jet flow is 1.0. The gas flow rates were regulated by mass flow controllers (Bronkhorst), which were calibrated at 300 K with an accuracy better than 98.5%.
In the present work, all the OH-PLIF images were preprocessed with binarization to enhance the discrimination of OH regions from those without OH radicals (Fig. 1). A threshold m was applied, set to the average intensity of all pixels in the image: pixels with an intensity below m are assigned a value of 0, while pixels with an intensity equal to or greater than m are assigned the maximum value of 255.
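As an illustration, the thresholding step above can be sketched in a few lines of NumPy; the function name `binarize` is ours, not from the authors' code, and 8-bit grayscale frames are assumed:

```python
import numpy as np

def binarize(img: np.ndarray) -> np.ndarray:
    """Binarize an 8-bit OH-PLIF frame against its own mean intensity.

    Pixels below the mean m are set to 0 (no OH signal); pixels at or
    above m are set to 255 (OH present), as described in the text.
    """
    m = img.mean()
    return np.where(img >= m, 255, 0).astype(np.uint8)
```

Applied frame by frame, this yields the binary images used for training and for the IoU evaluation described later.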
3 Deep learning methodology
3.1 Convolutional LSTM
A modified version of the Conv-LSTM model reported by Shi et al. [37] was used in the present paper. To enable the hidden layers to learn from longer time scales, two extra parameters were introduced, steps and effective steps, representing the length of the OH-PLIF image sequence fed to the hidden layers and the length of the interpolated OH-PLIF image sequence, respectively. Besides, we used the ELU [47] activation function in the input layer and the SELU [48] activation function in the output layer instead of ReLU [49]. The OH-PLIF images were generated as a sequence of images with sufficient temporal and spatial resolution, a task that can be solved under the general sequence-to-sequence learning framework proposed by Sutskever et al. [50].
The main formulas of the Conv-LSTM used in this paper are given as follows:

$${i}_{t}=\sigma \left({W}_{xi}*{\mathcal{X}}_{t}+{W}_{hi}*{\mathcal{H}}_{t-1}+{W}_{ci}\circ {\mathcal{C}}_{t-1}+{b}_{i}\right)$$

$${f}_{t}=\sigma \left({W}_{xf}*{\mathcal{X}}_{t}+{W}_{hf}*{\mathcal{H}}_{t-1}+{W}_{cf}\circ {\mathcal{C}}_{t-1}+{b}_{f}\right)$$

$${\mathcal{C}}_{t}={f}_{t}\circ {\mathcal{C}}_{t-1}+{i}_{t}\circ \mathrm{tanh}\left({W}_{xc}*{\mathcal{X}}_{t}+{W}_{hc}*{\mathcal{H}}_{t-1}+{b}_{c}\right)$$

$${o}_{t}=\sigma \left({W}_{xo}*{\mathcal{X}}_{t}+{W}_{ho}*{\mathcal{H}}_{t-1}+{W}_{co}\circ {\mathcal{C}}_{t}+{b}_{o}\right)$$

$${\mathcal{H}}_{t}={o}_{t}\circ \mathrm{tanh}\left({\mathcal{C}}_{t}\right)$$
where ‘\(*\)’ represents the convolution operation, ‘\(\circ\)’ is the Hadamard product and \(\sigma\) is the logistic sigmoid function. Here, \({i}_{t}\), \({f}_{t}\) and \({o}_{t}\) of the Conv-LSTM represent the input gate, forget gate and output gate, respectively; they are all 3D tensors whose last two dimensions are spatial (rows and columns). \({\mathcal{X}}_{t}, {\mathcal{H}}_{t}\,\,\mathrm{ and }\,\,{\mathcal{C}}_{t}\) are the input, hidden state and memory cell output at timestamp \(t\), respectively. Here, the memory cell output \({\mathcal{C}}_{t}\) acts as an accumulator of the state information. The cell is accessed, written, and cleared by several self-parameterized controlling gates. When a new input arrives, its information is accumulated into the cell if the input gate is activated. Likewise, the past cell status \({\mathcal{C}}_{t-1}\) can be “forgotten” in this process if the forget gate \({f}_{t}\) is on. Whether the latest cell output \({\mathcal{C}}_{t}\) is propagated to the final state \({\mathcal{H}}_{t}\) is further controlled by the output gate \({o}_{t}\). One advantage of using the memory cell and gates to control the information flow is that the gradient is trapped in the cell (also known as the constant error carousel) and prevented from vanishing too quickly, which is a critical problem for the vanilla recurrent neural network (RNN) model.
If the states are viewed as the hidden representations of moving objects, a Conv-LSTM with a larger transitional kernel should be able to capture faster motions, while one with a smaller kernel can capture slower motions. This trend was confirmed in our model training process. Also, as discussed in [38], the inputs, cell outputs and hidden states of the traditional FC-LSTM may be seen as 3D tensors whose last two dimensions equal 1. In this sense, FC-LSTM is essentially a special case of Conv-LSTM with all features standing on a single cell.
To ensure that the states have the same number of rows and columns as the inputs, padding is needed before applying the convolution operation. Here, padding of the hidden states at the boundary points can be viewed as using the area outside the photosensitive range. Before the first input is fed to the network, all the states of the LSTM are initialized to zero, which corresponds to “total ignorance” of the future. Similarly, zero-padding (used in this paper) of the hidden states effectively sets the state of the outside world to zero, assuming no prior knowledge about the zone beyond the camera's photosensitive range.
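A minimal sketch of one Conv-LSTM cell along these lines is given below, in PyTorch (the framework used in this work). It follows the common implementation that computes all gate pre-activations with a single 'same'-padded convolution over the concatenated input and hidden state, and it omits the Hadamard peephole terms \(W_{c}\circ \mathcal{C}\) for brevity; the class and parameter names are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """One Conv-LSTM cell (after Shi et al. [37]), peepholes omitted.

    Gate pre-activations come from a single zero-padded ('same')
    convolution over [X_t, H_{t-1}], so H_t keeps the spatial size of
    X_t, matching the padding discussion in Sec. 3.1.
    """

    def __init__(self, in_ch: int, hidden_ch: int, kernel: int = 5):
        super().__init__()
        self.hidden_ch = hidden_ch
        self.conv = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                              kernel, padding=kernel // 2)

    def forward(self, x, state=None):
        if state is None:  # "total ignorance": zero-initialized H and C
            h = torch.zeros(x.size(0), self.hidden_ch, *x.shape[2:],
                            device=x.device)
            c = torch.zeros_like(h)
        else:
            h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, o, g = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)   # memory cell accumulator C_t
        h = o * torch.tanh(c)           # gated hidden state H_t
        return h, (h, c)
```

A stack of three such cells with 16, 16 and 32 hidden states and 5 × 5 kernels would correspond to the configuration selected in Sect. 3.5.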
3.2 Encoding-forecasting architecture
Assume a spatio-temporal sequence \(({S}_{1},{S}_{2},{S}_{3},\dots ,{S}_{n})\) was measured at a frequency of \(f\). To increase the measurement frequency by a factor of \(K\), at every timestamp \(t\) the model needs to generate a \(K\)-step prediction based on the previous observation, i.e. from \({S}_{t}\) to \((\widehat{{S}_{t+1}},\widehat{{S}_{t+2}},\dots ,\widehat{{S}_{t+K}})\). Our encoding-forecasting network first encodes the observation into n layers of RNN states: \({\mathcal{H}}_{t}^{1},{\mathcal{H}}_{t}^{2},\dots ,{\mathcal{H}}_{t}^{n}=h({S}_{t})\), and then uses another n layers of RNNs with the same weights to generate the predictions from these encoded states: \(\widehat{{S}_{t+1}},\widehat{{S}_{t+2}},\dots ,\widehat{{S}_{t+K}}=p({\mathcal{H}}_{t}^{1},{\mathcal{H}}_{t}^{2},\dots ,{\mathcal{H}}_{t}^{n})\). Figure 2 illustrates the encoding-forecasting structure for \(n = 3\) and \(K = 3\). The target images (ground truth) serve as the supervision of the network and are used to evaluate how close the predictions are to the ground truth. A formulation of the sequential prediction similar to that described in [37] was adopted, but with a significant difference in input size.
As shown in Fig. 2, the Conv-LSTM encoder compresses the whole input sequence into a hidden state tensor, and the Conv-LSTM forecaster unfolds this hidden state to form the final prediction. This structure is also similar to the LSTM future predictor model proposed by Srivastava et al. [51], except that our input and output elements are all 3D tensors that preserve all the spatial information. Since the network has multiple stacked Conv-LSTM layers, it is able to predict sequences in complex dynamic systems, similar to the model applied to the precipitation nowcasting problem by Shi et al. [37].
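The encode-then-unroll logic can be sketched as follows. This is an illustrative PyTorch reading of the structure, not the authors' implementation: a minimal peephole-free Conv-LSTM cell is repeated here so the sketch is self-contained, the stack is unrolled K times with the same weights, and a hypothetical 1 × 1 convolution head maps hidden states back to a single-channel frame:

```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """Minimal Conv-LSTM cell (peepholes omitted) used below."""
    def __init__(self, in_ch, hid_ch, k=5):
        super().__init__()
        self.hid = hid_ch
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, hc):
        h, c = hc
        i, f, o, g = self.conv(torch.cat([x, h], 1)).chunk(4, 1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class EncoderForecaster(nn.Module):
    """Encode one observed frame S_t into stacked Conv-LSTM states,
    then unroll the same stack K times to emit K predicted frames."""
    def __init__(self, channels=(16, 16, 32), k_steps=3):
        super().__init__()
        self.k_steps = k_steps
        in_chs = (1,) + channels[:-1]
        self.cells = nn.ModuleList(Cell(i, h) for i, h in zip(in_chs, channels))
        self.head = nn.Conv2d(channels[-1], 1, 1)  # back to a 1-channel image

    def forward(self, s_t):
        b, _, hgt, wid = s_t.shape
        states = [(torch.zeros(b, c.hid, hgt, wid),
                   torch.zeros(b, c.hid, hgt, wid)) for c in self.cells]
        # Encoding: push the observation through the stack once.
        x = s_t
        for n, cell in enumerate(self.cells):
            h, c = cell(x, states[n])
            states[n] = (h, c)
            x = h
        # Forecasting: unroll K steps, feeding back the last prediction.
        preds, frame = [], torch.sigmoid(self.head(x))
        for _ in range(self.k_steps):
            x = frame
            for n, cell in enumerate(self.cells):
                h, c = cell(x, states[n])
                states[n] = (h, c)
                x = h
            frame = torch.sigmoid(self.head(x))
            preds.append(frame)
        return torch.stack(preds, dim=1)  # (batch, K, 1, H, W)
```

The key design point mirrored from the text is that the forecaster reuses the encoder's layer states, so the K interpolated frames inherit the spatial information of the observed frame.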
3.3 Model description
Before training the network, we first split the OH-PLIF sequences into a training set (80% of the data) and a test set (20% of the data). The training set is used to train the Conv-LSTM model, optimizing the parameters through backpropagation [52]. The test set is reserved for calculating and analyzing the errors of the model optimized on the training set.
All the models in this study were trained by minimizing the mean square error (MSE) loss using back-propagation through time (BPTT) [53] and Adam [54] with a learning rate of \({10}^{-4}\). MSE is defined as follows:

$$\mathrm{MSE}=\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}{\left({f}_{ij}-{g}_{ij}\right)}^{2}$$
where \({f}_{ij}\) and \({g}_{ij}\) represent the pixel intensity of image \(f\) and \(g\) at row \(i\) and column \(j\), respectively; \(M\) and \(N\) represent the number of rows and columns, respectively.
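In NumPy, this loss reduces to a one-liner (the function name is ours):

```python
import numpy as np

def mse(f: np.ndarray, g: np.ndarray) -> float:
    """Mean square error between two M x N images: the average of the
    squared per-pixel intensity differences."""
    return float(np.mean((f.astype(float) - g.astype(float)) ** 2))
```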
The models were implemented in Python under the PyTorch framework, and all experiments were run on eight NVIDIA 2080 Ti GPUs. To investigate the predictive capability of the deep learning model, we interpolated 1, 2 and 4 images in between the frames of the experimental sequence, corresponding to generating 100 kHz OH-PLIF images from 50 kHz, 33.3 kHz and 20 kHz data, respectively. In addition, we also artificially generated 200 kHz OH-PLIF images from the 100 kHz experimental sequence, a rate which has not yet been captured by real-world PLIF. These predictions are denoted as P50-100 kHz, P33.3-100 kHz, P20-100 kHz and P100-200 kHz, respectively, in the following sections of this paper. The models were trained on 1791 OH-PLIF sequences and tested on 100 sequences, with each sequence containing 15 images.
3.4 Quantitative evaluation of model accuracy
In addition to the MSE mentioned in the last section, the structural similarity index (SSIM) [55] was also used in this study to quantify the degree of similarity between the predicted image P and the experimental supervision T, both of which are binary images. SSIM is defined as:

$$\mathrm{SSIM}\left(P,T\right)=l\left(P,T\right)\cdot c\left(P,T\right)\cdot s\left(P,T\right)$$
where the luminance \(l\), contrast \(c\) and structure \(s\) terms are further defined as follows:

$$l\left(P,T\right)=\frac{2{\mu }_{P}{\mu }_{T}+{C}_{1}}{{\mu }_{P}^{2}+{\mu }_{T}^{2}+{C}_{1}},\qquad c\left(P,T\right)=\frac{2{\sigma }_{P}{\sigma }_{T}+{C}_{2}}{{\sigma }_{P}^{2}+{\sigma }_{T}^{2}+{C}_{2}},\qquad s\left(P,T\right)=\frac{{\sigma }_{PT}+{C}_{3}}{{\sigma }_{P}{\sigma }_{T}+{C}_{3}}$$
where \({\mu }_{P}\;\mathrm{ and}\; {\mu }_{T}\) represent the mean value of pixel intensity for image \(P\) and \(T\); \({\sigma }_{P}\), \({\sigma }_{T}\) represent the standard deviation of image \(P\) and \(T\); \({\sigma }_{PT}\) represents the covariance of image \(P\) and T. \({C}_{1}\), \({C}_{2}\) and \({C}_{3}\) are constants to avoid zero denominator. The constants are determined as follows: \({C}_{1}={({K}_{1}\times L)}^{2}\), \({C}_{2}={({K}_{2}\times L)}^{2}\), \({C}_{3}=\frac{{C}_{2}}{2}\); \({ K}_{1}=0.01\),\({K}_{2}=0.03\), \(L=255.\)
Furthermore, another index, Correlation, was also employed to quantify the similarity between prediction and measurement. Correlation is defined as follows:

$$\mathrm{Correlation}\left(P,T\right)=\frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left({P}_{ij}-{\mu }_{P}\right)\left({T}_{ij}-{\mu }_{T}\right)}{\sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N}{\left({P}_{ij}-{\mu }_{P}\right)}^{2}}\sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N}{\left({T}_{ij}-{\mu }_{T}\right)}^{2}}+\mathcal{E}}$$
where \({P}_{ij}\) and \({T}_{ij}\) represent the pixel intensity of image \(P\) and \(T\) at row \(i\) and column \(j\), and \(\mathcal{E}={10}^{-9}\).
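This Pearson-type index can be sketched as follows (function name ours), with \(\mathcal{E}\) guarding against a zero denominator for uniform images:

```python
import numpy as np

def correlation(P: np.ndarray, T: np.ndarray, eps: float = 1e-9) -> float:
    """Pearson-type correlation between two images; eps avoids a zero
    denominator when an image has no intensity variation."""
    P, T = P.astype(float), T.astype(float)
    dp, dt = P - P.mean(), T - T.mean()
    denom = np.sqrt((dp**2).sum()) * np.sqrt((dt**2).sum()) + eps
    return float((dp * dt).sum() / denom)
```

Identical images score close to 1, and sign-inverted images close to -1.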
For the predicted image \(P\) and its ground truth \(T\), the corresponding binary images \(\widehat{P}\) and \(\widehat{T}\) can be acquired using the binarization threshold, which in this study is the mean value of the image intensity. The intersection over union (IoU) is then calculated by overlapping the two binary images:

$$\mathrm{IoU}=\frac{\left|\widehat{P}\cap \widehat{T}\right|}{\left|\widehat{P}\cup \widehat{T}\right|}$$

IoU is used to quantify the prediction accuracy of signal occurrence; specifically, the difference between the prediction and the ground truth can be identified, as shown by the red region in Fig. 3. It should be noted that sparse noise remains after binarization. To reduce the influence of this noise while keeping the major parts of the OH cluster, a 10 \(\times\) 10 pixel (0.31 mm \(\times\) 0.31 mm) smoothing window was applied to the binary images before calculating the IoU.
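A bare-bones version of this metric is sketched below; for brevity it omits the 10 × 10 smoothing window the paper applies to suppress sparse binarization noise, and the function name is ours:

```python
import numpy as np

def iou(P: np.ndarray, T: np.ndarray) -> float:
    """IoU of two images binarized at their respective mean intensities."""
    p = P >= P.mean()  # binary mask of predicted OH signal
    t = T >= T.mean()  # binary mask of measured OH signal
    union = np.logical_or(p, t).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(p, t).sum() / union)
```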
3.5 Architecture optimization and parametric study
To evaluate the performance of different architectures and parameters of the deep-learning (DL) model, the modified Conv-LSTM model employed in the current work was compared with the original Conv-LSTM model, as well as other mainstream DL architectures such as FC-LSTM and Conv-GRU. The comparison was performed for the case of P50-100 kHz. For the FC-LSTM network, the same structure was utilized as the unconditional model in [51], with two 2048-node LSTM layers. For Conv-LSTM and Conv-GRU, the convolutional kernel size and the number of hidden layers were fixed to directly compare their performance: the kernel sizes are all 5 × 5, with 3 hidden layers and 16, 16 and 32 hidden states, respectively. The results show that the Conv-LSTM model employed in the current work yields the lowest MSE among all the models studied. These models were trained on 1791 image sequences and tested on 100 image sequences, with the length of each sequence being 15. The images were resized from [1024, 400] to [512, 200].
Table 2 shows that the FC-LSTM network results in a relatively large MSE, mainly because a fully connected network is poor at capturing spatial correlations, whereas sequential OH-PLIF images present strong spatial correlation, i.e., the motion of the flame is highly consistent within a local region. Thus, this model is unlikely to capture these local consistencies. Also, the Conv-LSTM model with ELU and SELU activation functions outperforms the Conv-GRU and the original Conv-LSTM, mainly for two reasons. First, Conv-LSTM with ELU and SELU activation functions can learn more nonlinear relationships between input and output than the original Conv-LSTM and Conv-GRU, whose activation function is ReLU. Second, the LSTM structure consists of three gates, which is more complex than the GRU. It takes about 36 h to train the model; once trained, the model can generate one interpolated frame in a few seconds.
The performance of the DL model with different structures/configurations of the network was also tested and is summarized in Table 3. Therein, the first (5 × 5) term represents the input-to-state kernel size, and the following 5 × 5 terms represent the state-to-state kernel sizes. The values ‘16’, ‘32’ and ‘64’ represent the number of hidden states in each hidden layer. The ‘Step’ value represents the length of the temporal image sequence fed into the network during training. The images were resized from [1024, 400] to [256, 100] in consideration of time and computational cost.
The test MSE of the different model configurations is presented in Table 3. The 1-layer network contains one Conv-LSTM layer with 64 hidden states, the 2-layer network has two Conv-LSTM layers with 32 hidden states each, and the 3-layer network has 16, 16 and 32 hidden states in its three Conv-LSTM layers. All the input-to-state and state-to-state kernels are of size 5 × 5. The deeper model (3 Conv-LSTM layers) results in the lowest MSE loss. To investigate the relationship between the time step of the model and the characteristic time scales of the flow itself, we tried steps of 5, 10, 15, 20 and 25, which are on the order of the Taylor time scale (100 µs, i.e., 10 frames at 100 kHz), as well as 45 images, which corresponds to the integral time scale (447 µs). The results are presented in Table 3. The test MSE values for different steps are quite close once the step exceeds 10; the statistically best performance is achieved at a step of 45, i.e., longer sequences provide a lower MSE loss. However, we prefer not to choose a long sequence length as the training parameter when a very similar prediction capability can be guaranteed with a shorter one. Additionally, increasing the step size reduces the number of sequences available for training and testing, which means a large step size demands an even bigger dataset for the model to be properly trained. For these reasons, we finally chose a step size of 15 for the rest of this work.
In addition to the comparison of model configurations, the influence of the input-to-state and state-to-state convolutional kernel size was also evaluated, as shown in Table 4. This was achieved by comparing the model performance at different convolutional kernel sizes: 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11. For kernel sizes larger than 5 × 5, the training MSE loss decreases but the test MSE loss increases, which means the model overfits at larger kernel sizes. The reason is that a larger kernel introduces more parameters into the model; if the training set is not large enough to support this number of parameters, the model tends to overfit, showing a low training MSE loss and a high test MSE loss. Moreover, further increasing the kernel size would sacrifice the spatial resolution of the predicted results and is therefore not preferred. Consequently, a convolutional kernel size of 5 × 5 was chosen for the rest of the study.
In short, the network in the present work was trained for 100 epochs with 3 hidden layers containing 16, 16 and 32 hidden states, respectively. Both the input-to-state and state-to-state convolutional kernel sizes were 5 × 5, and the length of the OH-PLIF sequences fed to the network was 15. The activation function of the input layer was ELU, and that of the output layer was SELU.
4 Results and discussion
4.1 Transient performance of the DL model
The performance of the model under the three different conditions is presented in Fig. 3. In this figure, the frames enclosed in the red dashed square are from the DL prediction, while those not enclosed are experimental results. The DL model has the capacity to generate high-speed OH-PLIF images from their low-frequency measured counterparts, especially for the case of P50-100 kHz, for which the profile and fluidity are particularly similar to the experimental data. Comparing the different cases studied, it can be found that the performance of the model deteriorates as the number of prediction steps increases. A quantitative evaluation of the model is summarized in Tables 5 and 6. Here we defined three indices to quantify the accuracy of the model, the mathematical descriptions of which are given in Sect. 3.4. The values shown in Tables 5 and 6 were calculated from 100 image pairs. The statistics in the tables are consistent with what is observed in Fig. 3, in that the performance of our model decreases with the number of interpolation steps but still maintains a relatively satisfactory accuracy. For example, the average correlation (a similarity index) between the time-averaged ground truth and the DL prediction is calculated to be 0.917, 0.854 and 0.752 for the cases of P50-100 kHz, P33.3-100 kHz and P20-100 kHz, respectively.
Furthermore, the IoU (intersection over union) of the predicted and measured binary images is also presented in Fig. 3, making it easier to identify which regions are correctly predicted and which are not. From the IoU, it can be found that the deviation between prediction and measurement mainly lies in the transition (or switching) regions, and the deviation increases with the number of inserted frames, as shown in the last column of Tables 5 and 6. According to previous research, IoU and SSIM values of 0.65 and above [26, 56] suggest that the model has reconstructed the images with high similarity.
To interpret the difference in model performance under the three conditions, it is important to discuss the characteristic time scales of the turbulent structures. As described above, the jet flow speed is 66 m/s for the flame studied; hence the integral time scale is approximately 447 µs [45], the Kolmogorov time scale is 46 µs, and the Taylor time scale is about 100 µs at y/d = 30 [57]. The Taylor time scale indicates the characteristic time over which a small vortex is correlated with itself, or in this case, the time over which flame structures are correlated with themselves. Here, a Taylor time scale of 100 µs is a reasonable measure of the flame self-correlation time because, as can be seen from Figs. 3 and 7, within around 50 µs the flame has moved upwards by half of the frame length. With a Taylor time scale on the order of 100 µs, the prediction for the case of P50-100 kHz can reasonably be made, since a flame structure correlates highly with itself in the next frame. However, as more time steps are skipped in the interpolation, for example in the extreme case of P20-100 kHz, the correlation between flame structures is statistically much lower, making it difficult for the model to accurately predict the intermediate steps, as most of the turbulent structures have disappeared by the next ground-truth image. Therefore, the prediction capability of the model is related to the characteristic time scales of the flow itself.
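The relation between repetition rate and inter-frame displacement can be checked with simple arithmetic using the values quoted in this paper (66 m/s jet exit velocity, 18 mm axial field of view imaged onto 600 pixels). This is a mean-flow estimate only; the local flame displacement seen in Fig. 3 can differ:

```python
U = 66.0                  # jet exit velocity, m/s (Sec. 2)
FOV_MM, NPX = 18.0, 600   # axial field of view and pixel count (Sec. 2)

def convection_px(rate_khz: float) -> float:
    """Pixels a structure convected at the jet exit velocity travels
    between consecutive frames at the given repetition rate."""
    dt_s = 1e-3 / rate_khz          # inter-frame time, s
    dx_mm = U * dt_s * 1e3          # convection distance per frame, mm
    return dx_mm * NPX / FOV_MM

for f in (100.0, 50.0, 33.3, 20.0):
    print(f"{f:5.1f} kHz -> {convection_px(f):6.1f} px/frame")
```

At 100 kHz a structure moves roughly 22 pixels (0.66 mm) per frame, while at 20 kHz it moves about 110 pixels (3.3 mm), consistent with the larger frame-to-frame decorrelation that degrades the P20-100 kHz predictions.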
Nevertheless, it is worth noting that the above discussion was made under the condition that the models were trained on 1791 image sequences, with each sequence being 15 frames long. It is anticipated that further increasing the training data size will improve model performance, because this is a supervised model, meaning the ground truth (experimental data) at 100 kHz was fed into the model for parameter optimization. As such, with enough training data, the model is expected to capture the temporal-spatial correlation among consecutive frames even when consecutive ground truths are only weakly correlated with each other.
Figure 4 presents the probability density functions of MSE, SSIM, Correlation and IoU calculated for 100 pairs of testing images under the three conditions. The SSIM values mostly range from 0.65 to 0.9, which indicates that the DL model has the potential to predict the signal appearance and structure of OH-PLIF from low-speed experimental data. Another obvious trend is that the statistics of SSIM and Correlation decrease with the number of temporal interpolations, indicating that increasing the prediction steps compromises the model accuracy. It is worth noting that MSE presents the opposite trend to SSIM and Correlation, as expected, because the former quantifies the difference while the latter two evaluate the degree of similarity.
Figure 5a and b present the binarized OH-PLIF images from the experiments and the DL model, respectively. It can be seen that the proposed DL method is able to reconstruct the OH profile. In addition, the perimeter of the OH signal in each image was calculated to evaluate how well the profile of the signals was reproduced. To minimize the influence of noise in the image, the image morphological operations of erosion and dilation [58] were applied to smooth the image, as shown in Fig. 5c and d; then a “marching squares” method [59] was used to compute the contours of the input 2D array at a particular signal intensity. The matrix of pixel intensities is linearly interpolated to provide better precision for the output contours. To obtain the connected area for the perimeter calculation, we used the array-based union-find method proposed by Wu et al. [60], which is reported to be effective in removing noise. It is worth noting that the perimeter was calculated based on the contour enclosed within the dashed box in Fig. 5e, f, i.e., only vertical profiles within the dashed boxes were evaluated, to avoid interference from the ‘horizontal’ edges caused by the low excitation power at the edges of the laser sheet. Figure 5g shows the IoU of the experimental and DL-predicted binary images, which is presented here for comparison.
Figure 6 presents the perimeters of the measured and predicted OH signal at various times under different modeling conditions. The average perimeter errors between ground truth and prediction were also calculated for all conditions studied: 2.05% for P50–100 kHz, 4.83% for P33.3–100 kHz and 5.42% for P20–100 kHz, respectively. Consistent with the results shown in Fig. 3, these results indicate that the model performs well for P50–100 kHz, but that longer interpolated sequences reduce the prediction accuracy. It is also worth noting that the model consistently underestimates the perimeter of the OH boundary. This is because the DL model was constructed with multiple 5 × 5 kernels (filters); after these filtering operations, some subtle features on the boundary of the OH cluster are lost, so the total perimeter of the DL prediction is shorter than the experimental one.
The Conv-LSTM model was also used to generate 200 kHz OH-PLIF images based on the 100 kHz experimental data, as shown in Fig. 7. These predictions are not validated by experiments, as no such measurements are available yet. Nevertheless, the predicted flame structure appears reasonable, and the flame motion is also reflected to some extent. Specifically, the predicted images preserve most of the spatial structures of the experimental data, while the overall structure moves upwards by a reasonable extent at each time step. For example, the target area marked by the red circle moves up 452 pixels in total from 0 \(\mathrm{\mu s}\) to 45 \(\mathrm{\mu s}\), i.e., about 50 pixels every 5 \(\mathrm{\mu s}\).
4.2 Time-averaged predictions of the DL model
While the indices described above quantify the similarity between prediction and measurement for instantaneous images, the similarity of time-averaged images is also worth evaluating. Here, we present the time-averaged OH-PLIF images for the measured and DL-predicted results, with each sub-figure averaged over 100 instantaneous images, as shown in Fig. 8. It can be seen that the signal profiles and intensity distributions of prediction and measurement are similar, but the degree of similarity slightly decreases as the number of interpolations increases. Specifically, the SSIM values between the ground truth and DL prediction are 0.937, 0.920 and 0.915 for the P50–100 kHz, P33.3–100 kHz and P20–100 kHz cases, respectively. This is again consistent with the results shown in Figs. 4 and 6: longer temporal interpolation deteriorates the performance of the DL model.
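The time-averaged comparison amounts to averaging each stack of instantaneous frames before computing SSIM. A minimal sketch, again using the simplified single-window SSIM instead of the windowed form of [55]:

```python
import numpy as np

def time_averaged_ssim(measured, predicted):
    """Average two (N, H, W) stacks of instantaneous frames over time, then
    compare the resulting mean images with the simplified global SSIM
    (standard stabilizing constants, dynamic range L = 1)."""
    a, b = measured.mean(axis=0), predicted.mean(axis=0)
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    cov = ((a - a.mean()) * (b - b.mean())).mean()
    return ((2 * a.mean() * b.mean() + c1) * (2 * cov + c2)) / (
        (a.mean() ** 2 + b.mean() ** 2 + c1) * (a.var() + b.var() + c2))
```

Because averaging suppresses frame-to-frame turbulent fluctuations, the time-averaged SSIM is naturally higher than the instantaneous values reported in Fig. 4.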
5 Conclusion
In this paper, we artificially accelerated the high-speed planar imaging of turbulent flames by building a deep learning-based computational imaging model. An end-to-end trainable model for accelerating OH-PLIF imaging was established by incorporating a Conv-LSTM network into an encoding–forecasting structure. The model was found to be capable of generating 100 kHz OH-PLIF images from 50 kHz, 33.3 kHz and 20 kHz experimental data. The prediction accuracy was quantified with similarity indices: the mean SSIM between the ground truth and DL prediction was 0.833, 0.804 and 0.732 for the 50–100 kHz, 33.3–100 kHz and 20–100 kHz cases, respectively. The accuracy of the model for longer interpolation sequences is expected to improve with larger training data sets. Furthermore, 200 kHz OH-PLIF images were also generated by the DL model from the 100 kHz experimental results, with reasonable spatial structure and flow continuity.
References
K. Kohse-Höinghaus et al., Combustion at the focus: laser diagnostics and control. Proc. Combust. Inst. 30(1), 89–123 (2005)
V. Sick, High speed imaging in fundamental and applied combustion research. Proc. Combust. Inst. 34(2), 3509–3530 (2013)
R.K. Hanson, J.M. Seitzman, P.H. Paul, Planar laser-fluorescence imaging of combustion gases. Appl. Phys. B 50(6), 441–454 (1990)
G. Grünefeld, M. Schütte, P. Andresen, Simultaneous multiple-line Raman/Rayleigh/LIF measurements in combustion. Appl. Phys. B 70(2), 309–313 (2000)
U. Retzer et al., Burst-mode OH/CH2O planar laser-induced fluorescence imaging of the heat release zone in an unsteady flame. Opt. Express 26(14), 18105–18114 (2018)
M.N. Slipchenko, T.R. Meyer, S. Roy, Advances in burst-mode laser diagnostics for reacting and nonreacting flows. Proc. Combust. Inst. 38(1), 1533–1560 (2021)
M.E. Smyser et al., Compact burst-mode Nd:YAG laser for kHz–MHz bandwidth velocity and species measurements. Opt. Lett. 43(4), 735–738 (2018)
C. Yang, H. Tang, G. Magnotti, High-speed 1D Raman analyzer for temperature and major species measurements in a combustion environment. Opt. Lett. 45(10), 2817–2820 (2020)
S. Roy et al., 100-ps-pulse-duration, 100-J burst-mode laser for kHz–MHz flow diagnostics. Opt. Lett. 39(22), 6462–6465 (2014)
M.N. Slipchenko et al., Quasi-continuous burst-mode laser for high-speed planar imaging. Opt. Lett. 37(8), 1346–1348 (2012)
C. Fu et al., Experimental investigation on an acoustically forced flame with simultaneous high-speed LII and stereo PIV at 20 kHz. Appl. Opt. 58(10), C104–C111 (2019)
J.B. Michael et al., 100 kHz thousand-frame burst-mode planar imaging in turbulent flames. Opt. Lett. 39(4), 739–742 (2014)
T.A. McManus et al., Spatio-temporal characteristics of temperature fluctuations in turbulent non-premixed jet flames. Proc. Combust. Inst. 35(2), 1191–1198 (2015)
R.A. Patton et al., Multi-kHz temperature imaging in turbulent non-premixed flames using planar Rayleigh scattering. Appl. Phys. B 108(2), 377–392 (2012)
S. Roy et al., 100-kHz-rate gas-phase thermometry using 100-ps pulses from a burst-mode laser. Opt. Lett. 40(21), 5125–5128 (2015)
J.D. Miller et al., Spatiotemporal analysis of turbulent jets enabled by 100-kHz, 100-ms burst-mode particle image velocimetry. Exp. Fluids 57(12), 192 (2016)
J.J. Philo, M.D. Frederick, C.D. Slabaugh, 100 kHz PIV in a liquid-fueled gas turbine swirl combustor at 1 MPa. Proc. Combust. Inst. 38(1), 1571–1578 (2021)
B. Peterson et al., An experimental study of the detailed flame transport in a SI engine using simultaneous dual-plane OH-LIF and stereoscopic PIV. Combust. Flame 202, 16–32 (2019)
I. Boxx et al., High-speed laser diagnostics for the study of flame dynamics in a lean premixed gas turbine model combustor. Exp. Fluids 52(3), 555–567 (2012)
R. Wellander, M. Richter, M. Aldén, Time-resolved (kHz) 3D imaging of OH PLIF in a flame. Exp. Fluids 55(6), 1764 (2014)
S.D. Hammack et al., CH PLIF and PIV implementation using C-X (0,0) and intra-vibrational band filtered detection. Appl. Phys. B (2018). https://doi.org/10.1007/s00340-017-6883-8
J. Sjöholm et al., Ultra-high-speed pumping of an optical parametric oscillator (OPO) for high-speed laser-induced fluorescence measurements. Meas. Sci. Technol. 20(2), 025306 (2009)
J. Sjöholm et al., Simultaneous visualization of OH, CH, CH2O and toluene PLIF in a methane jet flame with varying degrees of turbulence. Proc. Combust. Inst. 34(1), 1475–1482 (2013)
Z. Wang et al., Ultra-high-speed PLIF imaging for simultaneous visualization of multiple species in turbulent flames. Opt. Express 25(24), 30214–30228 (2017)
J.D. Miller et al., Ultrahigh-frame-rate OH fluorescence imaging in turbulent flames using a burst-mode optical parametric oscillator. Opt. Lett. 34(9), 1309–1311 (2009)
W. Zhang et al., Generating planar distributions of soot particles from luminosity images in turbulent flames using deep learning. Appl. Phys. B (2021). https://doi.org/10.1007/s00340-020-07571-9
C.-S. Liu, R.-C. Song, S.-J. Fu, Design of a laser-based autofocusing microscope for a sample with a transparent boundary layer. Appl. Phys. B 125(11), 199 (2019)
W. Zhang et al., 100 kHz CH2O imaging realized by lower speed planar laser-induced fluorescence and deep learning. Opt. Express 29(19), 30857–30877 (2021)
T. Li, Z. Zhang, H. Chen, Predicting the combustion state of rotary kilns using a Convolutional Recurrent Neural Network. J. Process Control 84, 207–214 (2019)
Z.C. Lipton, J. Berkowitz, C. Elkan, A critical review of recurrent neural networks for sequence learning, 2015. arXiv:1506.00019
S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
J. Chung et al., Empirical evaluation of gated recurrent neural networks on sequence modeling, 2014. arXiv:1412.3555
A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
L. Zhang et al., Restoration of single pixel imaging in atmospheric turbulence by Fourier filter and CGAN. Appl. Phys. B (2021). https://doi.org/10.1007/s00340-021-07596-8
J. Li et al., Object identification in computational ghost imaging based on deep learning. Appl. Phys. B (2020). https://doi.org/10.1007/s00340-020-07514-4
S. Hong et al., PSIque: Next sequence prediction of satellite images using a convolutional sequence-to-sequence network, 2017. arXiv:1711.10644
X. Shi et al., Convolutional LSTM network: a machine learning approach for precipitation nowcasting, 2015. arXiv:1506.04214
S. Kim et al., DeepRain: ConvLSTM network for precipitation prediction using multichannel radar data, 2017. arXiv:1711.02316
C. Finn, I. Goodfellow, S. Levine, Unsupervised learning for physical interaction through video prediction, 2016. arXiv:1605.07157
W. Lotter, G. Kreiman, D. Cox, Deep predictive coding networks for video prediction and unsupervised learning, 2016. arXiv:1605.08104
V. Patraucean, A. Handa, R. Cipolla, Spatio-temporal video autoencoder with differentiable memory, 2015. arXiv:1511.06309
Z. Wang et al., Investigation of OH and CH2O distributions at ultra-high repetition rates by planar laser induced fluorescence imaging in highly turbulent jet flames. Fuel 234, 1528–1540 (2018)
R. Hanson, Combustion diagnostics: planar imaging techniques. Symp. (Int.) Combust. 21, 1677–1691 (1988)
J. Rosell et al., Multi-species PLIF study of the structures of turbulent premixed methane/air jet flames in the flamelet and thin-reaction zones regimes. Combust. Flame 182, 324–338 (2017)
B. Zhou et al., Distributed reactions in highly turbulent premixed methane/air flames: Part I. Flame structure characterization. Combust. Flame 162(7), 2937–2953 (2015)
B. Zhou et al., Simultaneous multi-species and temperature visualization of premixed flames in the distributed reaction zone regime. Proc. Combust. Inst. 35(2), 1409–1416 (2015)
D.-A. Clevert, T. Unterthiner, S. Hochreiter, Fast and accurate deep network learning by exponential linear units (ELUs), 2015. arXiv:1511.07289
G. Klambauer et al., Self-normalizing neural networks, in 31st Annual Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 2017
V. Nair, G.E. Hinton, Rectified linear units improve restricted Boltzmann machines, in 27th International Conference on Machine Learning (ICML 2010), Haifa, Israel, 2010
I. Sutskever, O. Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, in 28th Annual Conference on Neural Information Processing Systems (NIPS 2014), Montreal, QC, Canada, 2014
N. Srivastava, E. Mansimov, R. Salakhutdinov, Unsupervised learning of video representations using LSTMs, in 32nd International Conference on Machine Learning (ICML 2015), Lille, France, 2015
D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436–444 (2015)
D.P. Kingma, J. Ba, Adam: A Method for Stochastic Optimization, 2014. arXiv:1412.6980
W. Zhou et al., Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
J. Redmon et al., You Only Look Once: unified, real-time object detection, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
H. Belmabrouk, M. Michard, Taylor length scale measurement by laser Doppler velocimetry. Exp. Fluids 25(1), 69–76 (1998)
J.I. Liang, J. Piper, J.Y. Tang, Erosion and dilation of binary images by arbitrary structuring elements using interval coding. Pattern Recogn. Lett. 9(3), 201–209 (1989)
W. Lorensen, H. Cline, Marching cubes: a high resolution 3D surface construction algorithm. ACM SIGGRAPH Comput. Graph. 21, 163 (1987)
K. Wu, E. Otoo, A. Shoshani, Optimizing connected component labeling algorithms, in Proc. SPIE, 2005
Acknowledgements
This research was financially supported by the National Natural Science Foundation of China under Grant No. 52006137, as well as Shanghai Sailing Program, under Grant No. 19YF1423400.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Guo, H., Zhang, W., Nie, X. et al. High-speed planar imaging of OH radicals in turbulent flames assisted by deep learning. Appl. Phys. B 128, 52 (2022). https://doi.org/10.1007/s00340-021-07742-2