
1 Introduction

Modeling of turbulent flow combustion is central to the development of new combustion technologies in aviation, automotive and power generation [6]. Turbulent flow combustion combines two nonlinear and multi-scale phenomena: turbulent flow and chemical reactions. Coupling the chemical kinetics equations with the Navier-Stokes flow equations results in a problem that is too complex to be solved at full resolution with current computational means. Even for a simple fuel such as methane, the combustion chemistry mechanism involves 53 species and 325 chemical reactions [19], and these numbers grow with fuel complexity. Solving the details of such mechanisms during the flow simulation can consume up to 75% of the solution time [4].

In most cases, the large scale separation between the combustion chemistry/flame (typically sub-millimeter/microsecond scales) and the characteristic turbulent flow (typically centimeter-to-meter/minute-to-hour scales) allows simplifying assumptions that increase computational efficiency by (re)solving chemistry and flow separately [16]. In this paper, we focus on approximate methods for handling the chemistry, in particular methods based on laminar flames [15]. Here the 1-D or single-species flame reactions are solved a priori and stored. During the flow simulation, these stored solutions are looked up to estimate the high-dimensional thermochemical state of the system, as shown in Fig. 1.

Fig. 1. (Re)solving systems separately

Most models developed for increased computational efficiency rely on the existence of a theoretical low-dimensional thermochemical state-space manifold onto which the combustion chemistry can be mapped [11]. The central question, then, is how to efficiently model low-dimensional thermochemical manifolds that capture the relevant physics of the problem, and how to parametrize and approximate these manifolds so that they can be accessed during turbulent flow simulations.

While existing approaches (collectively referred to as state-space parametrization [16, 17]) have been successful, they have primarily solved the two sub-problems independently: progress variable generation to characterize the manifold, and manifold approximation to perform the lookup at run time. This can result in sub-optimal solutions, because progress variables learnt using methods such as Principal Component Analysis (PCA) [2, 20] are not necessarily optimized for the run-time lookup. Similarly, while traditional lookup approaches based on tabulation, and the recently proposed neural network based data-driven alternatives [1], enable efficient lookups, the construction of the underlying data structure or machine learning model is not informed by the learning of the progress variables.

Our main hypothesis is that by simultaneously learning the progress variables and the manifold approximation (lookup model), we can achieve higher accuracy in terms of the estimation of the thermochemical state at run–time. But how does one combine the progress variable learning, an inherently linear mapping task, with a highly non–linear lookup model, while ensuring that the components influence each other during the learning phase? To that end, we propose a framework called ChemTab, in which the learning of these two components is formulated as a joint optimization task. An implementation of ChemTab, using a novel deep learning architecture, is proposed. The joint optimization includes a set of mathematical constraints that ensure that the progress variable learning is approximately similar to a PCA–type linear reduction, and, at the same time, can also predict the thermochemical state using a non–linear predictive component.

The deep learning implementation of ChemTab is shown to reduce the error by 73%, compared with an existing tabulation based framework, in predicting one of the key thermochemical terms, the source energy, when applied to flame data for a methane-air fuel-oxidizer combination generated using the GRI-Mech 3.0 mechanism. Moreover, the proposed architecture of ChemTab is shown to outperform a recently proposed state-of-the-art decoupled PCA+neural network solution by 24%.

2 Related Work

In this section we provide a brief overview of existing work in low-dimensional thermochemical manifold modeling, focusing on data-driven methods. We note that physics-driven machine learning models have been used to solve other physics problems [10, 23]; however, these methods generally address simpler physics and are not necessarily applicable in the domain of turbulent combustion.

Common approaches to low-dimensional thermochemical manifold modeling are combustion chemistry mechanism reduction and thermochemical state-space parametrization [18, 20]. Chemistry mechanism reduction does not generalize well, and in the recent past state-space parametrization has been the dominant approach, comprising two phases: progress variable generation and manifold approximation. For progress variable generation, existing methods have used either domain models or numerical methods.

Domain models like the steady Laminar Flamelet Method (SLFM) [15], Flamelet-Generated Manifold (FGM) [21, 22], Flamelet Progress Variable approach (FPVA) [7, 17] and Flamelet-Prolongation of ILDM model (FPI) [5] theorize that a multi-dimensional flame can be considered as an ensemble of multiple one-dimensional locally laminar flames (flamelets). These flamelets are parametrized by a combination of conserved and reactive scalars [3, 17, 21, 22]. Much of the research in this area builds on the principles laid out in [9] for progress variable regularization; however, the fundamental problem of generating an adequate number of progress variables that capture the underlying physics is still open.

Numerical methods, like PCA, have shown significant promise for parametrization of the thermochemical state. PCA provides a way to generate reaction progress variables from the flamelet solutions; the state-space variables remain nonlinear functions of these progress variables, and a nonlinear regression is learned to approximate the state-space manifold [2, 12, 13, 20]. This purely numerical parametrization lacks interpretability and may not generalize well, because maximizing captured variance can overfit numerical errors in the data. Linear autoencoders have also been suggested [14], but this formulation lacks a principled approach to progress variable generation and thus may not be generalizable.
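As a concrete illustration of this decoupling (not the exact pipelines of [2, 12, 13, 20]), a PCA-plus-regressor baseline can be sketched with scikit-learn; file names, component counts, and layer sizes below are assumptions:

```python
# Illustrative sketch of the decoupled baseline: PCA-generated progress
# variables followed by an independently trained non-linear regressor.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

Y = np.load("mass_fractions.npy")        # (n_points, n_species) flamelet solutions (assumed file)
S_energy = np.load("source_energy.npy")  # (n_points,) target thermochemical term (assumed file)

# Phase 1: progress variable generation, unaware of the downstream lookup task.
pca = PCA(n_components=4)
progress_vars = pca.fit_transform(Y)

# Phase 2: manifold approximation, unaware of how the progress variables were chosen.
regressor = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
regressor.fit(progress_vars, S_energy)
```

Because the two phases never exchange gradients, the PCA basis is fixed before the regressor ever sees a prediction error; restoring this coupling is precisely the motivation for ChemTab (Sect. 3).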

While domain based models have traditionally relied on tabular lookup, such tables do not scale. The tabulated data occupies a large portion of the available memory on every node on which the flow simulation runs, and searching and retrieving the pre-tabulated data becomes increasingly expensive in higher-dimensional spaces. For example, assuming a standard 3-progress-variable discretization (200, 100, 50) with, say, 15 tabulated thermochemical state variables, we obtain a pre-computed combustion table of 120 MB. Adding a variable such as enthalpy with a very coarse discretization of 20 points brings the size of the table to 2.4 GB. To address the tabulation problem, researchers such as [1, 24] build on the work of [8] and investigate neural networks for manifold approximation, replacing the tabulation: the mapping between the progress variables (reduced dimensionality) and the thermochemical state variables obtained from the flamelet solutions is learnt by a neural network. However, due to the highly non-linear, knotted and discontinuous nature of the lower-dimensional manifolds formed by progress variables generated a priori, the accuracy achieved by such a neural network is not satisfactory.
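Assuming 8-byte double-precision table entries (the storage precision is not stated in the original), the quoted sizes follow directly:

$$\begin{aligned} 200 \times 100 \times 50 \times 15 \times 8\ \text {bytes}&= 1.2\times 10^{8}\ \text {bytes} = 120\ \text {MB},\\ 120\ \text {MB} \times 20\ \text {(enthalpy points)}&= 2.4\ \text {GB}. \end{aligned}$$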

3 ChemTab: Joint Learning Progress Variables and Manifold Approximation

To reduce the computational effort in coupled simulations, state-space parametrization approaches follow a two-phase strategy. First, parametrize and tabulate a priori the scalar evolution of a reactive turbulent environment by a few progress variables that govern the scalar evolution in a laminar flame. Second, use a tabular lookup at run time to determine the high-dimensional chemical state required by the CFD solver. For instance, the FGM approach replaces all species and temperature by a mixture fraction and a single reaction progress variable or reaction progress parameter. In this study, we focus on state-space parametrization using Unsteady Flamelet Generated Manifolds (Unsteady FGMs) [3]. We modify this approach in three ways: the progress variable generation is different, the manifold is not tabulated and, lastly, progress variable generation and manifold approximation are performed jointly.

3.1 Background: Unsteady FGM

FGM is a widely used tabulated chemistry method that can deal with a range of complicated conditions. The FGM model shares its theoretical basis with flamelet approaches [15], in which a multi-dimensional flame is considered as an ensemble of multiple one-dimensional flames. The FGM model used for combustion modeling generally follows three steps:

  1. Calculation of the representative 1-D flamelets.

  2. Transformation of the 1-D flamelet solutions to progress variable space.

  3. Retrieval of thermo-chemical variables from the FGM tables according to FGM control variables from CFD simulations.

Table 1. Definitions for terms used in Sect. 3.1

Governing Equations. Conservation equations for mass, species, momentum and energy for 1-D, fully compressible, viscous flames are given by:

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \frac{\partial \left( \rho u_x \right) }{\partial x}&= 0 \end{aligned}$$
(1)
$$\begin{aligned} \frac{\partial \left( \rho Y_i \right) }{\partial t} + \frac{\partial \left( \rho u_x Y_i \right) }{\partial x}&= \frac{\partial }{\partial x}\left( \rho \mathcal {D}_i \frac{\partial Y_i}{\partial x} \right) + \dot{S}_i \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial \left( \rho u_x \right) }{\partial t} + \frac{\partial \left( \rho u_x^2 \right) }{\partial x}&= -\frac{\partial p}{\partial x} + \frac{\partial }{\partial x} \left( \mu \frac{\partial u_x}{\partial x} \right) \end{aligned}$$
(3)
$$\begin{aligned} \frac{\partial \left( \rho e_t \right) }{\partial t} + \frac{\partial }{\partial x} \left( \rho u_x H_t \right)&= \frac{\partial }{\partial x} \left( u_x \mu \frac{\partial u_x}{\partial x} \right) + \mu \frac{c_p}{Pr} \left( 1 - \frac{1}{Le} \right) \frac{dT}{dx} \\&\quad + \frac{1}{Sc} \frac{dh}{dx} - \sum \dot{S}_i h^o_{f,i} \end{aligned}$$
(4)

where the different terms are defined in Table 1.

We simplify the above equations using some well-known assumptions. In 1-D Cartesian coordinates, the steady state solution to (1)-(4) is obtained only when the total mass flux is zero, i.e., the velocity field is zero \((u_x = 0)\), and so the four equations reduce to:

$$\begin{aligned}&\frac{\partial }{\partial x} \left( \rho \mathcal {D}_{i} \frac{\partial Y_i}{\partial x} \right) + \dot{S_i} = 0 \end{aligned}$$
(5)
$$\begin{aligned}&\frac{\partial }{\partial x} \left( \kappa \frac{\partial T}{\partial x} + \sum \rho \mathcal {D}_{i} \frac{\partial Y_i}{\partial x} h_i \right) - \sum \dot{S_i} h^o_{f,i} = 0. \end{aligned}$$
(6)

In (6), the final term of the energy equation is the sum over species of the product of each source term and its heat of formation; it is collectively called the source energy. The source energy is one of the crucial parameters in combustion simulation, and an accurate chemistry description is required to define it. The prediction error of this term is used as the basis of comparison between our method and the other state-of-the-art methods.

Flamelet Solutions. The data is generated by solving the 1-D steady state flamelet equations (5)-(6) using a finite volume PDE solver. The species mass fractions Y, the source terms \(\dot{S}\), and the mixture fraction \(Z_{mix}\) are obtained from the solver:

$$\begin{aligned} Y = \begin{bmatrix} Y_{11} &{} \cdots &{} Y_{1s}\\ \vdots &{} \ddots &{} \vdots \\ Y_{n1} &{} \cdots &{} Y_{ns} \end{bmatrix},\quad \dot{S} = \begin{bmatrix} \dot{S}_{11} &{} \cdots &{} \dot{S}_{1s}\\ \vdots &{} \ddots &{} \vdots \\ \dot{S}_{n1} &{} \cdots &{} \dot{S}_{ns} \end{bmatrix},\quad Z_{mix} = \begin{bmatrix} Z_{mix_{1}}\\ \vdots \\ Z_{mix_{n}} \end{bmatrix} \end{aligned}$$
(7)
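The paper does not name its flamelet solver; the following is a minimal sketch of how one flamelet's contribution to Y, \(\dot{S}\) and \(Z_{mix}\) in (7) could be produced with Cantera and GRI-Mech 3.0, with all boundary conditions chosen purely for illustration:

```python
# Hedged sketch of flamelet data generation; the paper's actual solver and
# boundary conditions are not specified, so everything below is illustrative.
import numpy as np
import cantera as ct

gas = ct.Solution("gri30.yaml")                       # GRI-Mech 3.0: 53 species, 325 reactions
flame = ct.CounterflowDiffusionFlame(gas, width=0.02)
flame.fuel_inlet.X = "CH4:1"                          # methane fuel stream
flame.oxidizer_inlet.X = "O2:0.21, N2:0.79"           # air oxidizer stream
flame.fuel_inlet.mdot = 0.24                          # kg/(m^2 s), illustrative strain rate
flame.oxidizer_inlet.mdot = 0.72
flame.solve(loglevel=0, auto=True)

# One flamelet's rows of the matrices in Eq. (7).
Y = flame.Y.T                                         # (n_grid, n_species) mass fractions
S = np.zeros_like(Y)                                  # species source terms, kg/(m^3 s)
for j, T in enumerate(flame.T):
    gas.TPY = T, flame.P, Y[j]
    S[j] = gas.net_production_rates * gas.molecular_weights
Z_mix = flame.mixture_fraction("Bilger")              # (n_grid,), available in recent Cantera versions
```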

3.2 ChemTab

In ChemTab, the unsteady FGM approach is replaced with the following three steps:

  1. Calculation of the representative 1-D flamelets (data generation).

  2. Joint generation of progress variables (encoder) and manifold approximation (regressor) from the generated data using ChemTab.

  3. Retrieval of thermo-chemical variables from the ChemTab regressor according to the progress variables from CFD simulations.

Formulation. The generated data described in (7) is then used by ChemTab. Conceptually the following equations summarize the relationships:

$$\begin{aligned} \dot{S} = \phi (Y)\end{aligned}$$
(8)
$$\begin{aligned} S_{energy} = - \sum _{i=1}^{s} h_{f,i}^{0} \, \dot{S}_{i} \end{aligned}$$
(9)
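As a minimal numerical illustration of (9), with array names assumed:

```python
# Eq. (9) in code: source energy as the negative sum of species source terms
# weighted by their heats of formation. Array names are illustrative; h_f0
# must be ordered consistently with the columns of S_dot.
import numpy as np

def source_energy(S_dot: np.ndarray, h_f0: np.ndarray) -> np.ndarray:
    """S_dot: (n_points, n_species) source terms; h_f0: (n_species,) heats of formation."""
    return -(S_dot * h_f0).sum(axis=1)
```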

The two sub-problems of state-space parametrization are formulated as a joint optimization problem as follows:

$$\begin{aligned} \min \Big (\sum _{i=1}^{k}\sum _{j=1}^{n} \Vert \dot{S}_{ij} - \zeta _{i}(\widehat{Y}_{j}) \Vert _{t} + \sum _{j=1}^{n} \Vert S_{energy} - \psi (\widehat{Y}, Z_{mix})\Vert _{t}\Big ) \end{aligned}$$
(10)
$$\begin{aligned} \text {s.t.} \quad t \in R \end{aligned}$$
(11)
$$\begin{aligned} \widehat{Y} = Y \times W, \quad W \in R^{s \times p} \end{aligned}$$
(12)
$$\begin{aligned} p \ll s \end{aligned}$$
(13)
$$\begin{aligned} \Vert W\Vert =1 \end{aligned}$$
(14)
$$\begin{aligned} W^{T} \times W = I \end{aligned}$$
(15)
$$\begin{aligned} (\widehat{Y} \oplus Z_{mix})^{T} \times (\widehat{Y} \oplus Z_{mix}) = I \end{aligned}$$
(16)
$$\begin{aligned} \dot{S} \approx \widehat{\dot{S}} = \zeta (\widehat{Y}, Z_{mix}) \end{aligned}$$
(17)
$$\begin{aligned} S_{energy} \approx \widehat{S_{energy}} = \psi (\widehat{Y}, Z_{mix}) \end{aligned}$$
(18)

The formulation described in Eq. (10) learns the optimal reactive scalars \(C_{pv}\)s (described by the embedding \(Y \times W\)) that, along with \(Z_{mix}\), form the progress variables. This is a linear dimensionality reduction problem in which the new basis must retain the inherent higher-dimensional physics described by the non-linear relation between Y and \(\dot{S}\). To facilitate the development of transport equations using the progress variables, the embedding of the variables in the low-dimensional space must be linear. The constraints on the linear embedding are inspired by the work of [9] and by key ideas from PCA.

Fig. 2. ChemTab architecture

Table 2. Symbols used in Sect. 3.2

Implementation. The joint optimization problem is solved using a deep neural architecture. ChemTab jointly optimizes two neural networks for the tasks of reaction progress variable generation (encoder) and manifold approximation (regressor). The encoder network performs linear dimensionality reduction and creates a linear embedding of the input. The regressor network learns the manifold approximation: a regression function whose input is the linear embedding and whose outputs are the desired thermo-chemical state variables (Fig. 2 and Table 2).

$$\begin{aligned} \begin{aligned} f_{\theta }(y) = W^{[L-1]}\sigma (W^{[L-2]}\sigma (\dots (W^{[1]}\sigma (W^{[0]}y + b^{[0]}) + b^{[1]})\dots ) + b^{[L-2]}) + b^{[L-1]}\\ \text {where} \quad W^{[l]} \in R^{m_{l+1} \, \times \, m_{l}}, \quad b^{[l]} \in R^{m_{l+1}}, \quad m_{0} = d_{in} = d, \quad m_{L} = d_{out} \end{aligned} \end{aligned}$$
(19)

As described by (19), a deep neural network can be conceptualized as a series of layer-wise operations. The input to the network is the data for each of the species, for each flame, at each axial coordinate.

$$\begin{aligned} \begin{aligned} f_{\theta }^{[0]}(y)&= y \\ f_{\theta }^{[1]}(y)&= W^{[0]} f_{\theta }^{[0]}(y) \\ f_{\theta }^{[2]}(y)&= f_{\theta }^{[1]}(y) \oplus Z_{mix} \\ f_{\theta }^{[l]}(y)&= \sigma \left( W^{[l-1]} f_{\theta }^{[l-1]}(y) + b^{[l-1]} \right) \quad \forall \, l \ \text {s.t.} \ 3 \le l \le L-1 \\ f_{\theta }(y) = f_{\theta }^{[L]}(y)&= \sigma \left( W^{[L-1]} f_{\theta }^{[L-1]}(y) + b^{[L-1]} \right) \end{aligned} \end{aligned}$$
(20)

As described by (20) the network is a layer–wise composition. The input of the network is reduced at the first layer linearly: this creates the linear embedding/reacting scalars (\(C_{pv}s\)). The next layer concatenates the conserved scalar \(Z_{mix}\) with the reacting scalars. These progress variables are then fed to the next layer. The subsequent layers together make up the regressor that learns a non–linear function between the progress variables and the thermo–chemical state variables.
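Below is a minimal Keras sketch of this composition; the layer widths, activations, and the number of \(C_{pv}\) are assumptions, not the tuned architecture reported later in Table 3.

```python
# Minimal Keras sketch of the composition in Eq. (20): linear bias-free
# encoder, concatenation of Z_mix, then a non-linear regressor.
import tensorflow as tf

n_species, n_cpv = 53, 4   # GRI-Mech 3.0 species count; number of C_pv assumed

Y_in = tf.keras.Input(shape=(n_species,), name="species_mass_fractions")
Z_in = tf.keras.Input(shape=(1,), name="Z_mix")

# First layer: linear, bias-free reduction Y x W -> reacting scalars (C_pv).
cpv = tf.keras.layers.Dense(n_cpv, use_bias=False, activation=None, name="encoder_W")(Y_in)

# Second layer: concatenate the conserved scalar Z_mix -> progress variables.
pv = tf.keras.layers.Concatenate(name="progress_variables")([cpv, Z_in])

# Remaining layers: non-linear regressor from progress variables to source energy.
h = tf.keras.layers.Dense(64, activation="relu")(pv)
h = tf.keras.layers.Dense(64, activation="relu")(h)
s_energy = tf.keras.layers.Dense(1, name="source_energy")(h)

model = tf.keras.Model([Y_in, Z_in], s_energy)
model.compile(optimizer="adam", loss="mae")   # MAE objective of Eq. (21)
```

Keeping the encoder a single bias-free linear Dense layer means its kernel is exactly the matrix W of Sect. 3.2, which preserves the interpretability discussed in the conclusion.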

$$\begin{aligned} \begin{aligned} \arg \min _{\mathcal {\theta }} \quad |f_{\mathcal {\theta }}(y) - \mathcal {S} |\\ s.t. \quad W^{[0]T}W^{[0]} = I\\ \Vert W^{[0]} \Vert = 1\\ f_{\mathcal {\theta }}^{[2]}(y)^Tf_{\mathcal {\theta }}^{[2]}(y) = I \end{aligned} \end{aligned}$$
(21)

As described by (21), ChemTab minimizes the Mean Absolute Error in predicting the thermo–chemical state variables (Source Energy in the current work) while ensuring that the linear embedding conforms to the following constraints:

  1. Embedding weights W learnt are unit norm (UN).

  2. Embedding weights W learnt for the species mass fractions \(Y_i\)s are uncorrelated/orthogonal (WO).

  3. The reaction progress variables are uncorrelated/orthogonal (AR).

The constraints in (21) are also added to the objective, together with the prediction errors of key source terms corresponding to a few important species, which serve as physics constraints.
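One hedged way to impose these constraints in a Keras implementation is a unit-norm kernel constraint for UN and a soft orthogonality penalty for WO added to the loss; the paper does not specify its exact mechanism, so the sketch below is illustrative only.

```python
# Hedged sketch: UN as a hard kernel constraint and WO as a soft orthogonality
# penalty on the encoder weights; the penalty strength is an assumption. The AR
# constraint on the progress variables could analogously be imposed through an
# activity-based penalty on the embedding outputs.
import tensorflow as tf

class OrthogonalityPenalty(tf.keras.regularizers.Regularizer):
    """Penalizes ||W^T W - I||^2 so the C_pv weight columns stay orthogonal."""
    def __init__(self, strength=1e-2):
        self.strength = strength
    def __call__(self, W):
        gram = tf.matmul(W, W, transpose_a=True)
        eye = tf.eye(W.shape[1], dtype=W.dtype)
        return self.strength * tf.reduce_sum(tf.square(gram - eye))

encoder = tf.keras.layers.Dense(
    4, use_bias=False,
    kernel_constraint=tf.keras.constraints.UnitNorm(axis=0),  # UN: unit-norm columns
    kernel_regularizer=OrthogonalityPenalty(1e-2),            # WO: orthogonal columns
    name="encoder_W")
```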

Extensions. The current framework and implementation can easily be extended to include the prediction of additional thermochemical state variables and a projection of the embedding back to the high-dimensional mass fractions. These can be implemented as two additional neural networks whose prediction errors are added to the objective function.

4 Experimentation and Results

In this section we describe the dataset used, the training strategies, the impact of the number of \(C_{pv}\), comparisons with the existing framework and relevant machine learning methods, and the performance of the best model in the context of the multiple objectives.

4.1 Dataset

The training data was generated by solving the 1-D steady state flamelet differential equations using a finite volume PDE solver. GRI-Mech 3.0 is one of the most widely used methane mechanisms for modeling reaction kinetics; it consists of 53 chemical species and 325 reactions.

The flamelet solver discretizes the domain into 200 grid points (200 observations on the axial coordinate) between the fuel and air boundaries, and 100 flames are solved to steady state. To train the model, 20,000 data points (100 flames with 200 grid points each) for a single pressure setting are used. Some of the generated data representing extinguished flames were discarded, which excluded approximately 3,500 data points.

We experiment with model training and evaluation using two strategies, sketched in code after the list:

  1. 50% Flamelets – Train using data from 50% of the flamelets selected randomly and test using data from the remaining 50% of the flamelets, and,

  2. 50% Data points – Train using 50% of the data points selected randomly, and test on the remaining 50% of the data.
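Both strategies can be sketched as follows, assuming the flamelet solutions are stored in a tabular structure with a column identifying the originating flamelet (a layout we assume for illustration):

```python
# Sketch of the two train/test strategies; the DataFrame layout (a "flame_id"
# column identifying which flamelet each grid point belongs to) is assumed.
import numpy as np
import pandas as pd

df = pd.read_csv("flamelet_solutions.csv")   # assumed file layout
rng = np.random.default_rng(0)

# 1. 50% Flamelets: hold out entire flames (all 200 grid points of each).
flame_ids = df["flame_id"].unique()
train_flames = rng.choice(flame_ids, size=len(flame_ids) // 2, replace=False)
train_by_flame = df[df["flame_id"].isin(train_flames)]
test_by_flame = df[~df["flame_id"].isin(train_flames)]

# 2. 50% Data points: hold out individual grid points regardless of flame.
point_mask = rng.random(len(df)) < 0.5
train_by_point, test_by_point = df[point_mask], df[~point_mask]
```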

4.2 Evaluation

We use the Mean Absolute Error of the Source Energy across the entire dataset as the metric to compare the performance as described in Eq. (21).

4.3 Implementation and Settings

We implemented ChemTab using TensorFlow 2.3.0, Keras, and the Adam optimizer. Models were trained on a server with an Nvidia Quadro RTX 5000 GPU, cuDNN 8.0 and CUDA 11.0. We performed a coarse grid search over the hyperparameters (dropout, learning rate, early stopping, batch size) and the standard model architecture (number of layers, number of nodes per layer, activation functions). After this initial architecture and hyperparameter search, all models in the subsequent studies were trained for 500 epochs. Results are reported as averages over 10 runs (Table 3).
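An illustrative training call, continuing the Keras sketch from Sect. 3.2 (the training arrays `Y_train`, `Z_train`, `S_energy_train` and all values below are assumptions, not the settings found by the grid search):

```python
# Illustrative training setup for the Sect. 3.2 model sketch; learning rate,
# patience, batch size, and validation split are placeholders.
import tensorflow as tf

callbacks = [tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=50, restore_best_weights=True)]

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mae")
model.fit([Y_train, Z_train], S_energy_train,
          validation_split=0.1, batch_size=256, epochs=500, callbacks=callbacks)
```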

Table 3. Model parameters
Table 4. ChemTab architectural variants

4.4 Compared Methods

We compare seven variants of ChemTab, distinguished by the constraints applied to the linear embedding and the progress variables, against a series of state-of-the-art baselines for source energy prediction (see Sect. 2).

Table 5. Current state of the art methods and ChemTab variants

4.5 Results

Current Framework Comparison. The current framework uses FGM based progress variables, conformal mapping based tabulation, and Lagrange polynomial interpolation based lookup. The tabulation was generated using the entire dataset. The best MAE that the framework achieved on the dataset was 2.243E+09. The best ChemTab model, trained on 50% of the data, showed a 73% reduction in error. This reduction, although large, partly stems from the current framework's limitation to at most two progress variables and their realization through conformal mapping. We present a more principled comparison with the state-of-the-art methods in the next section.

Other Baseline Comparisons. We include DNN-PVG(NL)-DNN as a reference, although it cannot be used in practice because its embedding is non-linear. Similarly, we did not consider Gaussian Processes, as there are several challenges with operationalizing them in our context; we therefore focus on benchmarking against the relevant DNN based approaches (Table 5).

Fig. 3. MAE for source energy: data set split strategy

Figure 3 shows the results of an ablation study for both sampling strategies. When trained using sampled points, all models consistently do better than when trained using sampled flamelets. Essentially, the flame is considered as an ensemble of multiple one-dimensional flamelets, each of which captures only part of the highly nonlinear state space, and hence almost all models struggle in the sampled-flamelet training regime. ChemTab models still perform better, and we attribute this to our constraints aiding generalization. Because our dataset is limited, we restrict training to only 50% of the data.

Fig. 4. MAE for source energy: Cpv ablation

As we increase the number of \(C_{pv}\), the computational time of the flow simulation goes up, so we want to use the smallest number of \(C_{pv}\) that still captures the essential physics. Figure 4 shows that the MAE decreases with an increasing number of \(C_{pv}\) and then starts to increase again. As more \(C_{pv}\) are added, the embedding has too many degrees of freedom and hence may start to diverge.

Fig. 5. MAE for key source terms – best model

Table 6. Constraints – best model

Best Model Performance. Table 6 shows how well the constraints of Eq. (10) are satisfied. The first tabulation shows conformity with constraint (14), the second with constraint (15), and the third with constraint (16). Constraint (16) is also adequately satisfied, with conformity measured through the covariance (Fig. 5).

Best Model Long Run Performance. We trained the best model architecture with the 50% Data Points strategy for a long run of 20,000 epochs, obtaining an MAE of 1.80E+08.

5 Conclusion

We propose ChemTab, a novel framework for jointly learning the progress variables and the manifold approximation. ChemTab follows the principles of physics-guided neural networks [10]; however, no existing solutions of this kind directly benefit the combustion community. ChemTab outperforms the state-of-the-art state-space parametrization methods in combustion. Crucially, the reaction progress variables generated by ChemTab can be interpreted by examining the weight matrix W and thus allow physical insight into the systems being modeled. Incorporation of ChemTab into a flow simulation will be explored in future work.