Towards High-Fidelity Multiphase Simulations: On the Use of Modern Data Structures on High Performance Computers

Föll, Fabian; Hitz, Timon; Keim, Jens; Munz, Claus-Dieter

doi:10.1007/978-3-030-66792-4_25

Abstract

Compressible multi-phase simulations in the homogeneous equilibrium limit are generally based on real equations of state (EOS). The direct evaluation of such EOS is typically too expensive. Look-up tables based on modern data structures significantly reduce the computation time, at the cost of increased memory requirements during the simulation. In the context of binary mixtures and large-scale simulations, this trade-off is even more important due to the limited memory resources available on high performance computers. Therefore, in this work we propose an extension of our tabulation approach to shared memory trees based on MPI 3.0. The benefits and drawbacks of the shared memory and the non-shared memory data structures are analysed in detail. A second research topic is the diffuse interface model of the isothermal Navier–Stokes–Korteweg equations. A parabolic relaxation model is implemented in the open-source code FLEXI, and 3D simulations of binary head-on collisions for various model parameters are shown.

1 Introduction

Typical technical applications, in which multiphase processes can be found, are fuel injection systems such as rocket combustion chambers. The problems inherently contain multiple scales. First, the liquid fuel is injected as a jet with a liquid core. Over time, the jet breaks up and ligaments and droplets form. At the surface of the liquid interface, phase change occurs and the gaseous environment is mixed with evaporated fuel. This mixture is then ignited.

In this project, we aim to understand the mixing processes leading up to the burning of the fuel/oxidizer mixture. Due to the multiscale character, we split the investigation into large-scale jet simulations and more detailed simulations of single droplets. These processes take place under extreme ambient conditions that often exceed the critical state of the fuel. In these regimes, the liquid phase can no longer be described as incompressible and we have to consider the full coupling of hydrodynamics and thermodynamics, which requires the fully compressible flow equations.

The macroscopic modelling for jet simulations is based on the Homogeneous Equilibrium Model (HEM) [22], which considers a mixture of saturated liquid and saturated vapor under full thermodynamic equilibrium. For binary mixtures, the intrinsic HEM assumption of vapor-liquid equilibrium is extended by a nested procedure of tangent plane distance (TPD) function analysis [15] and a classical TPn-flash calculation [16]. These methods only require modifications of the underlying equations of state (EOS). Especially with more than one species, the evaluation of the EOS becomes very costly. Therefore, we use look-up tables, which shift the evaluation costs into a pre-processing step, while during runtime only the look-up in an octree data structure is required [9]. For binary mixtures, the look-up tables become huge in storage size, which causes problems if the size exceeds the memory available to the CPU. Therefore, in this paper we propose a shared memory parallelization of look-up tables based on the MPI 3.0 standard. We provide performance results on benchmark test cases and show its practical use with the simulation of a binary mixing layer.

This project also investigates modelling strategies of phase interfaces, e.g. for droplets, such as sharp and diffuse interface models. As an example of the latter, we use a parabolic relaxation model of the isothermal Navier–Stokes–Korteweg (NSK) equations to simulate the collision of two droplets at varying model parameters. Numerical experiments were conducted using an extension of the open source code FLEXI.^{Footnote 1} It is based on a high order nodal discontinuous Galerkin spectral element method (DGSEM) [12].

The outline of the paper is as follows. In the next section, the governing equations are presented. This is followed by the description of the numerical methods, the thermodynamic modelling and the look-up table approach. We then present results on the performance of the look-up tables. Finally, numerical experiments are shown for a two-component mixing layer at super-critical conditions using the Peng-Robinson EOS, and for two colliding droplets using the NSK diffuse interface model.

2 Governing Equations

2.1 The Compressible Navier–Stokes System for Multi-components

The compressible Navier–Stokes equations with multiple components are given by

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho \varvec{u}\right)&= 0\,, \end{aligned}$$

(1)

$$\begin{aligned} \frac{\partial \rho Y_{k}}{\partial t} + \nabla \cdot \left( \rho Y_{k} \varvec{u} \right)&= \nabla \cdot \left( -\varvec{J}_{k} \right) \,, \end{aligned}$$

(2)

$$\begin{aligned} \frac{\partial \rho \varvec{u}}{\partial t} + \nabla \cdot \left( \rho \varvec{u} \otimes \varvec{u}+p \underline{\varvec{I}}\right)&= \nabla \cdot \left( \underline{\varvec{\tau }}\right) \,, \end{aligned}$$

(3)

$$\begin{aligned} \frac{\partial E}{\partial t} + \nabla \cdot \left[ \left( E + p\right) \varvec{u} \right]&= \nabla \cdot \left( \underline{\varvec{\tau }} \cdot \varvec{u} - \varvec{q}\right) \,, \end{aligned}$$

(4)

with

$$\begin{aligned} \underline{\varvec{\tau }}=\underbrace{2\mu \underline{\varvec{S}}-\frac{2}{3}\mu \left( \nabla \cdot \varvec{u}\right) \underline{\varvec{I}}}_{\mathrm {Stokes~law}} ,\quad \underline{\varvec{S}}=\frac{1}{2} \left( \nabla \varvec{u}+\left( \nabla \varvec{u}\right) ^\text {T}\right) , \end{aligned}$$

(5)

where $\rho $ is the density, $\varvec{u}=(u,v,w)^{\mathrm {T}}$ is the velocity vector, p is the static pressure, E is the total energy per unit volume, $\underline{\varvec{I}}$ is the unit tensor. By considering $N_k$ species, the system is extended by $N_k-1$ concentration equations where $\varvec{Y}=(Y_1,Y_2,~\dots ~,Y_{N_k-1})^\text {T}$ with $Y_k = \frac{\rho _k}{\rho }$ is defined as the mass fraction of each species. For multi-component simulations, the heat flux is usually comprised of $\varvec{q}=\varvec{q}^{\mathrm {f}}+\varvec{q}^{\mathrm {d}}+\varvec{q}^{\mathrm {c}}$, where

$$\begin{aligned} \varvec{q}^{\mathrm {f}}=-\lambda \nabla T \end{aligned}$$

(6)

is the specific heat flux according to Fourier's law with thermal conductivity $\lambda $ and temperature T. The second term is the inter-species energy flux due to diffusion

$$\begin{aligned} \varvec{q}^{\mathrm {d}}=\sum _k h_{k} \varvec{J}_{k}. \end{aligned}$$

(7)

Here, $\varvec{q}^{\mathrm {c}}$ denotes cross-effects, like the Dufour effect, which are not considered in this paper. The viscous stress tensor $\underline{\varvec{\tau }}$ with the strain rate tensor $\underline{\varvec{S}}$ and the dynamic viscosity $\mu $ is defined for a Newtonian fluid. The concentration diffusion flux is usually comprised of $\varvec{J}_{k}=\varvec{J}_{k}^{\mathrm {f}} + \varvec{J}_{k}^{\mathrm {c}} + \varvec{J}_{k}^{\mathrm {b}} $, where

$$\begin{aligned} \varvec{J}_k^{\mathrm {f}} = -\rho D_k \nabla Y_k\,, \quad k=1,\dots ,N_k-1, \end{aligned}$$

(8)

is the concentration diffusion flux according to Fick's law and $D_{k}$ is the species diffusion coefficient. Here, $\varvec{J}_{k}^{\mathrm {c}}$ denotes cross-effects, like the Soret effect, which are also neglected in this paper. The third term,

$$\begin{aligned} \varvec{J}_{k}^{\mathrm {b}} = \rho Y_k \sum _{j=1}^{N_k}\left( D_j \nabla Y_j \right) \,, \quad k=1,\dots ,N_k-1, \end{aligned}$$

(9)

is a correction for the mass balance and recovers $\sum _{k=1}^{N_k}\varvec{J}_{k}=0$ to guarantee conservation in cases where the species diffusion fluxes are large [6]. Properties for the last species can be calculated via the following relations

$$\begin{aligned} \sum ^{N_k}_{k=1} Y_k = 1\,, ~ ~ ~ ~ ~ ~ ~ \sum ^{N_k}_{k=1} \rho _k = \rho \,. \end{aligned}$$

(10)

Since there are $5+(N_k-1)$ unknown variables, a closure relation is required between the variables pressure, density, specific internal energy per mass, $\epsilon $, and the species composition,

$$\begin{aligned} E=\rho \epsilon +\frac{1}{2}\rho \varvec{u} \cdot \varvec{u}\,, ~~~~\epsilon&=\epsilon (\rho ,p,\varvec{Y})\,,~~~~p=p(\rho ,\epsilon ,\varvec{Y}) \,. \end{aligned}$$

(11)

Such a functional relation is called an equation of state, more precisely a caloric EOS, and defines the thermodynamic relations between the state variables. For the temperature, a thermal EOS

$$\begin{aligned} T&=T(\rho ,p,\varvec{Y}) \end{aligned}$$

(12)

also has to be considered.

2.2 The Navier–Stokes–Korteweg Equations

The Navier–Stokes–Korteweg (NSK) equations are an extension of the Navier–Stokes equations where an interfacial stress is added that approximates capillary effects in phase interfaces of finite thickness. The NSK equations are given in the isothermal case for $T \equiv T_{\mathrm {ref}}$ by

$$\begin{aligned} \rho _t + \nabla \cdot \left( \rho \mathbf {u}\right)&= 0 , \end{aligned}$$

(13)

$$\begin{aligned} \left( \rho \mathbf {u}\right)_t + \nabla \cdot \left( \rho \mathbf {u}\otimes \mathbf {u}+ p \underline{\varvec{I}}\right)&= \nabla \cdot \underline{\varvec{\tau }}+ \nabla \cdot \underline{\varvec{\tau }}_{\mathrm {K}}. \end{aligned}$$

(14)

The NSK equations are non-dimensionalized such that the Stokes stress tensor, $\underline{\varvec{\tau }}\in {\mathbb R}^{d\times d}$, and the Korteweg stress tensor, $\underline{\varvec{\tau }}_{\mathrm {K}}\in {\mathbb R}^{d\times d}$, are given by

$$\begin{aligned} \underline{\varvec{\tau }}&= \frac{1}{\mathrm {Re}} \left( \nabla \mathbf {u}+ ( \nabla \mathbf {u})^\text {T} - \frac{2}{3} \nabla \cdot \mathbf {u}\underline{\varvec{I}}\right) , \end{aligned}$$

(15)

$$\begin{aligned} \underline{\varvec{\tau }}_{\mathrm {K}}&= \frac{1}{\mathrm {We}} \left( \rho \Delta \rho + \frac{1}{2}\left|\nabla \rho \right|^2 \right) \underline{\varvec{I}}- \frac{1}{\mathrm {We}} \nabla \rho \otimes \nabla \rho . \end{aligned}$$

(16)

The Reynolds number, $\mathrm {Re}$, and Weber number, $\mathrm {We}$, are expressed in terms of the numbers $\epsilon _{\mathrm {K}}>0$ and $\gamma _{\mathrm {K}}>0$,

$$\begin{aligned} \frac{1}{\mathrm {Re}} = \epsilon _{\mathrm {K}}, \quad \frac{1}{\mathrm {We}} = \epsilon _{\mathrm {K}}^2 \gamma _{\mathrm {K}}. \end{aligned}$$

(17)

Due to the capillary stress, Eq. (16), the momentum equation is a third order diffusion-dispersion equation. The system is closed by the pressure function of the van der Waals law [24],

$$\begin{aligned} p = \frac{\rho RT_{\mathrm {ref}}}{1-b\rho } - a\rho ^2 , \end{aligned}$$

(18)

where $a,b,R$ are material parameters. In reduced, non-dimensional form, they are $a=3$, $b=1/3$, $R=8/3$. For subcritical temperatures, the pressure in Eq. (18) is non-monotone in $\rho $ (the associated free energy is non-convex) and the eigenvalues of the hyperbolic flux Jacobian of the NSK equations may become imaginary. The NSK system is therefore of hyperbolic-elliptic type, and numerical methods that rely on the strict hyperbolicity of the conservation system can no longer be applied directly. To overcome these challenges, Corli et al. [7] proposed a parabolic relaxation scheme for diffusion-dispersion equations, which is extended to the isothermal NSK equations as

$$\begin{aligned} \rho _t^{\alpha } + \nabla \cdot \left( \rho ^{\alpha } \mathbf {u}^{\alpha } \right)&= 0 ,\end{aligned}$$

(19)

$$\begin{aligned} \left( \rho ^{\alpha } \mathbf {u}^{\alpha } \right)_t + \nabla \cdot \left( \rho ^{\alpha } \mathbf {u}^{\alpha } \otimes \mathbf {u}^{\alpha } + p^{\alpha } \underline{\varvec{I}}\right)&= \nabla \cdot \underline{\varvec{\tau }}^{\alpha } + \alpha \rho ^{\alpha } \nabla \left( c_{\mathrm {K}}^{\alpha } - \rho ^{\alpha } \right) ,\end{aligned}$$

(20)

$$\begin{aligned} \beta \left(c_{\mathrm {K}}^{\alpha }\right)_t - \epsilon _{\mathrm {K}}^2 \gamma _{\mathrm {K}}\Delta c_{\mathrm {K}}^{\alpha }&= \alpha \left( \rho ^{\alpha } - c_{\mathrm {K}}^{\alpha } \right) . \end{aligned}$$

(21)

An additional unknown, the relaxation variable $c_{\mathrm {K}}$, satisfies a linear parabolic evolution equation with constant relaxation parameters $\alpha ,\beta >0$. The system is of second order and of mixed parabolic-hyperbolic type. For $\alpha \rightarrow \infty $, the solution of the parabolic relaxation model approaches the solution of the classical NSK equations, i.e. $(\rho ^{\alpha },\mathbf {u}^{\alpha }) \rightarrow (\rho ,\mathbf {u})$. The total energy of the relaxation system is given by

$$\begin{aligned} \mathcal {E}^{\alpha }[\rho ] = \int _{\Omega } \left( \frac{1}{2}\rho \left| \mathbf {u}\right|^2 + W(\rho ) + \frac{\alpha }{2} \left( \rho -c_{\mathrm {K}}\right)^2 + \frac{1}{2}\epsilon _{\mathrm {K}}^2 \gamma _{\mathrm {K}}\left| \nabla c_{\mathrm {K}}\right|^2 \right) {\text {d}}\mathbf {x}. \end{aligned}$$

(22)

Admissible solutions to Eqs. (19)–(21) are minimizers of Eq. (22).
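As a quick numerical illustration of the non-monotone pressure law, the following minimal Python sketch evaluates Eq. (18) in reduced form with $a=3$, $b=1/3$, $R=8/3$ and samples the density range for states with negative $\partial p/\partial \rho $. The function names and the sampling resolution are illustrative assumptions and are not part of FLEXI.

```python
# Reduced van der Waals parameters from the text: a = 3, b = 1/3, R = 8/3.
A, B, R = 3.0, 1.0 / 3.0, 8.0 / 3.0


def pressure(rho, t_ref):
    """Van der Waals pressure, Eq. (18), valid for 0 < rho < 1/b."""
    return rho * R * t_ref / (1.0 - B * rho) - A * rho**2


def dp_drho(rho, t_ref):
    """Analytical derivative of Eq. (18) with respect to the density."""
    return R * t_ref / (1.0 - B * rho) ** 2 - 2.0 * A * rho


def elliptic_densities(t_ref, n=2000):
    """Return sampled densities with dp/drho < 0 (hypothetical helper).

    In this range the squared sound speed is negative, i.e. the
    first-order part of the system is elliptic rather than hyperbolic.
    """
    rho_max = 1.0 / B
    samples = [rho_max * (i + 0.5) / n for i in range(n)]
    return [rho for rho in samples if dp_drho(rho, t_ref) < 0.0]


subcritical = elliptic_densities(0.85)   # non-empty: elliptic region exists
supercritical = elliptic_densities(1.1)  # empty: pressure is monotone
print(len(subcritical) > 0, len(supercritical) == 0)
```

Evaluating `pressure` at the saturation densities $\rho _{\mathrm {vap}}=0.3197$ and $\rho _{\mathrm {liq}}=1.8071$ quoted in Sect. 4.3.1 for $T_{\mathrm {ref}}=0.85$ gives two closely matching values, as expected for Maxwell states.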

3 Numerical Methods

The multiphase solver is comprised of several building blocks. The bulk solver is based on a high order discontinuous Galerkin spectral element method (DGSEM). We use an efficient look up table to incorporate real gas equations of state. For the modelling of the phase interface we apply diffuse interface methods. In the Homogeneous Equilibrium Model (HEM), we rely on the EOS to describe phase transition. In the NSK model, capillarity effects are resolved in a phase interface of finite thickness.

3.1 Discontinuous Galerkin Method

The compressible Navier–Stokes equations and the parabolic relaxation model for the NSK equations are discretized by a discontinuous Galerkin spectral element method as described by [11, 12, 14]. The approach is suitable for general systems of conservation equations. In this paper we restrict ourselves to conservation equations of the form

$$\begin{aligned} \mathbf {U}_t + \nabla _{x} \cdot \underline{\varvec{F}}(\mathbf {U},\nabla _{x} \mathbf {U}) = \varvec{Q} \,, \end{aligned}$$

(23)

where $\mathbf {U}$ is the vector of the solution unknowns, $\underline{\varvec{F}}$ is the corresponding flux containing the convective and the diffusive fluxes, and $\varvec{Q}$ is the source term of the NSK relaxation model. The divergence operator in the physical space is defined as $\nabla _{x}=\left( \frac{\partial }{\partial x} ,\frac{\partial }{\partial y},\frac{\partial }{\partial z} \right) ^T$.

In a three-dimensional domain we subdivide the computational space into non-overlapping hexahedral elements. Each element is mapped onto the reference cube element $E:=[-1,1]^3$ by a mapping $\varvec{x}(\varvec{\xi })$, where $\varvec{\xi }=(\xi ,\eta ,\zeta )^\text {T}$ is the coordinate vector of the reference element. The mapping onto the reference element E transforms Eq. (23) to the system

$$\begin{aligned} \varvec{J} \mathbf {U}_t + \nabla _{\xi } \cdot \underline{\varvec{\mathcal {F}}}\left( \mathbf {U},\nabla _{\xi } \mathbf {U}\right) = \varvec{J}\varvec{Q} \,, \end{aligned}$$

(24)

with the Jacobian $\varvec{J}$ and the divergence operator in the reference space $\nabla _{\xi }=\left( \frac{\partial }{\partial \xi } ,\frac{\partial }{\partial \eta },\frac{\partial }{\partial \zeta } \right) ^\text {T}$. In each element, the solution and the fluxes are then approximated as polynomials

$$\begin{aligned} \mathbf {U}_h = \sum _{i,j,k=0}^{N} \varvec{\hat{U}}_{ijk} \psi _{ijk}(\varvec{\xi }) \quad \text {and} \quad \varvec{\mathcal {F}}^m_h = \sum _{i,j,k=0}^{N} \varvec{\hat{\mathcal {F}}}^m_{ijk} \psi _{ijk}(\varvec{\xi }) \,, \end{aligned}$$

(25)

where the superscript $m=\{1,2,3\}$ denotes the flux in the direction of the Cartesian coordinates. The basis function $\psi _{ijk}(\varvec{\xi })=l_i(\xi )l_j(\eta )l_k(\zeta )$ is built by the tensor product of one-dimensional Lagrange polynomials l of degree N. As interpolation nodes we choose Gauss-Legendre points. Due to the nodal character of the Lagrange basis, the degrees of freedom $\varvec{\hat{U}}_{ijk}$ and $\varvec{\hat{\mathcal {F}}}^m_{ijk}$ are values of the approximations of the solution and the flux vectors at the interpolation nodes. To obtain the discontinuous Galerkin formulation, the approximations (25) are inserted into (24) which is then multiplied by a test function $\phi $, identical to the basis function $\psi $, and then integrated in space. Integration by parts of the volume integral of the flux yields the weak formulation

$$\begin{aligned} \underbrace{ \frac{\partial }{\partial t}\int _{\Omega } \left( \varvec{J} \mathbf {U}_h \phi \right) \text {d} \varvec{\xi }}_{a} - \underbrace{ \int _{\Omega } \left( \underline{\varvec{\mathcal {F}}}_h \cdot \nabla _{\xi } \phi \right) \text {d} \varvec{\xi }}_{b} + \underbrace{ \int _{\partial \Omega } \left( \left[ \underline{\varvec{\mathcal {F}}}_h \cdot \varvec{n} \right] \phi \right) \text {d} \varvec{S}}_{c} = \int _{\Omega } \left( \varvec{J} \varvec{Q}_h \phi \right) \text {d} \varvec{\xi } \,. \end{aligned}$$

(26)

We identify three contributing parts: the volume integral of the time derivative of the solution (a), a volume integral of the fluxes (b) and a surface integral of the fluxes (c). The integrals are evaluated by Gauss-Legendre quadrature. To obtain an approximation of the flux $ \underline{\varvec{\mathcal {F}}}_h \cdot \varvec{n}$ at the element surface, a numerical flux function $\varvec{\mathcal {G}} = \varvec{\mathcal {G}}(\mathbf {U}_L,\mathbf {U}_R)$ is introduced. It depends on the states left and right of the interface, $\mathbf {U}_L$ and $\mathbf {U}_R$, respectively. For the viscous and heat conduction fluxes, the gradients are needed in addition. For the numerical flux, we use standard approximate Riemann solvers of the HLL and Lax–Friedrichs families [23]. For the viscous fluxes, the approach of Bassi and Rebay [3, 4] is used. The semi-discrete formulation (26) is integrated in time using explicit third- or fourth-order Runge–Kutta (RK) schemes [13].

The DG method with high order accuracy is favourable in smooth parts of the flow. At discontinuities or strong gradients we apply the shock capturing of Sonntag and Munz [20, 21]. We switch locally to a second order accurate finite volume (FV) scheme, where the interpolation nodes of the DG polynomials are reorganized as an equidistant sub-grid on which the solution is stored as integral mean values. A modal Persson indicator [19] is used to switch between DG and FV cells.
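The nodal building blocks of Eq. (25) can be illustrated in one dimension. The sketch below, written in plain Python under the assumption of a single reference interval $[-1,1]$, computes Gauss-Legendre nodes and weights by Newton iteration, evaluates the Lagrange basis $l_i$, and uses both for interpolation and quadrature. All function names are illustrative and do not correspond to FLEXI routines.

```python
import math


def legendre(n, x):
    """Return P_n(x) and P_n'(x) using the three-term recurrence."""
    if n == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def gauss_legendre(n_points):
    """Nodes and weights of the n_points Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(n_points):
        # Standard initial guess for the i-th root, refined by Newton steps.
        x = math.cos(math.pi * (i + 0.75) / (n_points + 0.5))
        for _ in range(100):
            p, dp = legendre(n_points, x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        _, dp = legendre(n_points, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes[::-1], weights[::-1]  # ascending order


def lagrange_basis(nodes, i, x):
    """Evaluate the i-th Lagrange polynomial l_i(x) on the given nodes."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != i:
            value *= (x - xj) / (nodes[i] - xj)
    return value


def interpolate(nodes, nodal_values, x):
    """Evaluate the nodal expansion sum_i U_i l_i(x)."""
    return sum(u * lagrange_basis(nodes, i, x)
               for i, u in enumerate(nodal_values))


# Polynomial degree N = 3 means N + 1 = 4 nodes per direction.
nodes, weights = gauss_legendre(4)
nodal_values = [s**3 - 2.0 * s + 1.0 for s in nodes]
print(interpolate(nodes, nodal_values, 0.3))
print(sum(w * s**6 for s, w in zip(nodes, weights)))
```

In three dimensions the basis $\psi _{ijk}$ is the product of three such one-dimensional evaluations, and the quadrature weights multiply in the same way.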

3.2 Equation of State and Thermodynamic Equilibrium

As thermodynamic coupling relation for the Navier–Stokes equations the cubic Peng-Robinson (PR) EOS [18] is used

$$\begin{aligned} p&= \frac{R_m T}{\frac{M}{\rho }-b}-\frac{a}{\left( \frac{M}{\rho } + \delta _1 b \right) \left( \frac{M}{\rho } + \delta _2 b \right) }, \end{aligned}$$

(27)

with the universal gas constant $R_m$ and the molar weight of the mixture M. The parameter a accounts for intermolecular attraction forces, b is the co-volume, and the PR-specific parameters are $(\delta _1,\delta _2) = (1+\sqrt{2},1-\sqrt{2})$. The transformation of the pressure-explicit thermal EOS to a caloric one is provided by a residual function ansatz [17]. Two-phase phenomena are modelled thermodynamically with the HEM approach. The underlying assumption of thermodynamic equilibrium is defined by

$$\begin{aligned} T_v&= T_l, \end{aligned}$$

(28)

$$\begin{aligned} p_v&= p_l, \end{aligned}$$

(29)

$$\begin{aligned} \mu _v^k&= \mu _l^k, \end{aligned}$$

(30)

where the subscripts v and l denote the vapor and liquid side, respectively, $\mu $ is the chemical potential, and Eq. (30) has to hold for all $N_k$ species. For single-species systems, the vapor-liquid equilibrium (VLE) calculation is performed with the algorithm presented by [1]; for mixtures, a combined approach of TPD analysis and multi-species VLE calculation is used. The TPD function is defined in mole fraction space $\mathbf {z}$ and given by

$$\begin{aligned} TPD(\mathbf {z}^{trial}) = \sum _{k=1}^{N_k} z_k^{trial} \left[ \mu _k(\mathbf {z}^{trial},T,p) - \mu _k(\mathbf {z}^{test},T,p) \right] . \end{aligned}$$

(31)

The superscript $(\cdot )^{test}$ denotes the feed composition, which is provided by the flow solver, and $(\cdot )^{trial}$ denotes all other possible molar compositions that fulfill the condition $\sum _{k=1}^{N_k} z_k^{trial} = 1$. The TPD analysis is based on the idea of direct evaluation of the Gibbs free energy surface [2] and checks for a global minimum in Gibbs free energy at the present feed composition. If the TPD function is non-negative for all trial compositions, the feed is stable; negative values indicate an unstable state. For the analysis of the TPD function, the local minimization method with multiple initial guesses presented by [15] is used. The thermodynamically consistent modeling of the states in the two-phase region is provided by the HEM approach with

$$\begin{aligned} \epsilon ^{EQ} = x_v \epsilon _v + (1-x_v) \epsilon _l, \end{aligned}$$

(32)

where the specific internal energy $\epsilon $ serves as a placeholder for any caloric state variable, the subscripts v and l denote the saturated vapor and liquid states, and $x_v$ is the vapor mass fraction defined by

$$\begin{aligned} x_v = \frac{1 / \rho - 1 / \rho _l}{ 1 / \rho _v - 1 / \rho _l}. \end{aligned}$$

(33)

Due to the loss of hyperbolicity inside the spinodal region with real gas EOS, the sound speed in the two-phase region in the HEM approach is modeled with the relation presented by [25]

$$\begin{aligned} \frac{1}{\rho a^2} = \frac{\alpha _v}{\rho _v a_v^2}+\frac{1-\alpha _v}{\rho _l a_l^2}, \end{aligned}$$

(34)

where a is the sound speed and $\alpha _v$ the volumetric vapor fraction given by

$$\begin{aligned} \alpha _v = \frac{x_v \rho }{\rho _v}. \end{aligned}$$

(35)
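The HEM mixture relations can be sketched in a few lines of Python. The functions below evaluate the vapor mass fraction of Eq. (33), the volumetric vapor fraction of Eq. (35), the mass-weighted mixture value of Eq. (32), and a two-phase sound speed. For the sound speed, the sketch assumes the Wood-type form with the phase densities $\rho _v$ and $\rho _l$ in the respective denominators; the saturation values in the usage lines are made-up numbers for illustration only.

```python
import math


def vapor_mass_fraction(rho, rho_v, rho_l):
    """Eq. (33): vapor mass fraction from mixture and saturation densities."""
    return (1.0 / rho - 1.0 / rho_l) / (1.0 / rho_v - 1.0 / rho_l)


def vapor_volume_fraction(rho, rho_v, rho_l):
    """Eq. (35): volumetric vapor fraction."""
    return vapor_mass_fraction(rho, rho_v, rho_l) * rho / rho_v


def equilibrium_value(rho, rho_v, rho_l, phi_v, phi_l):
    """Eq. (32) for any caloric per-mass quantity phi (e.g. internal energy)."""
    x_v = vapor_mass_fraction(rho, rho_v, rho_l)
    return x_v * phi_v + (1.0 - x_v) * phi_l


def mixture_sound_speed(rho, rho_v, rho_l, a_v, a_l):
    """Two-phase sound speed; Wood-type form with phase densities (assumed)."""
    alpha_v = vapor_volume_fraction(rho, rho_v, rho_l)
    inv_bulk = alpha_v / (rho_v * a_v**2) + (1.0 - alpha_v) / (rho_l * a_l**2)
    return math.sqrt(1.0 / (rho * inv_bulk))


# Hypothetical saturation state, not taken from the paper.
rho_v, rho_l, a_v, a_l = 1.0, 1000.0, 340.0, 1500.0
rho = 500.5  # equal volumes of vapor and liquid
print(vapor_volume_fraction(rho, rho_v, rho_l))
print(mixture_sound_speed(rho, rho_v, rho_l, a_v, a_l))
```

A useful consistency check is that $\alpha _v \rho _v + (1-\alpha _v)\rho _l$ reproduces the mixture density, and that the sound speed recovers the pure-phase values in the limits $\alpha _v \rightarrow 0$ and $\alpha _v \rightarrow 1$.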

3.3 Look up Tables and Extension to Shared Memory Trees

The current Cray machine Hazel Hen has 185,088 cores in its current expansion stage, distributed over 7712 nodes with 24 cores each. Each node is equipped with 128 GB of memory. The next expansion stage, Hawk, which is planned for spring 2020, will have approximately 640,000 cores on 5000 nodes. The ratio of cores to nodes will accordingly more than quintuple, from $N_{cores}/N_{nodes}=24$ at present to $N_{cores}/N_{nodes}=128$. It is important to note that the memory available on a node is not increased, so that only 1 GB per core will be available. Scalable and highly efficient CFD codes for high-performance computing (HPC) systems, which are perfectly adapted to old architectures, have to keep pace with such developments. To maintain efficiency, the algorithms have to be modified. Examples are memory-consuming algorithms, which can be found in multi-phase and multi-component simulations in combination with so-called look-up table approaches [9, 10]. Today, these tables are built on modern data structures such as quadtrees or octrees, see Figs. 1 and 2. Quadtrees and octrees exploit properties of so-called space-filling curves for fast data localization. Here, the Morton curve is popular because it allows the data to be accessed via bit operations,

$$\begin{aligned} \text {data position}=f(\text {bit number}), \end{aligned}$$

(36)

see Fig. 1. Despite the use of such modern data structures, today’s CFD simulations may quickly reach the memory limits when large-scale, high-fidelity simulations are performed. In this context, this paper discusses the implementation and application of tree structures on high-performance computers in combination with MPI 3.0 shared memory (Fig. 3).
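The idea behind Eq. (36) can be made concrete with a small Python sketch. It assumes the common bit-interleaving construction of the Morton code for integer cell indices in three dimensions; the function names and the choice of x as the lowest interleaved bit are illustrative conventions and not necessarily those of the tabulation framework.

```python
def morton_encode(ix, iy, iz, levels):
    """Interleave the lowest `levels` bits of (ix, iy, iz) into one code."""
    code = 0
    for bit in range(levels):
        code |= ((ix >> bit) & 1) << (3 * bit)
        code |= ((iy >> bit) & 1) << (3 * bit + 1)
        code |= ((iz >> bit) & 1) << (3 * bit + 2)
    return code


def morton_decode(code, levels):
    """Invert morton_encode and return the integer cell indices."""
    ix = iy = iz = 0
    for bit in range(levels):
        ix |= ((code >> (3 * bit)) & 1) << bit
        iy |= ((code >> (3 * bit + 1)) & 1) << bit
        iz |= ((code >> (3 * bit + 2)) & 1) << bit
    return ix, iy, iz


def child_index(code, depth, levels):
    """Octant (0..7) selected at tree depth `depth`, with 0 being the root."""
    return (code >> (3 * (levels - 1 - depth))) & 0b111


# Cell (5, 2, 7) on a 3-level octree, i.e. an 8 x 8 x 8 grid of leaves.
code = morton_encode(5, 2, 7, levels=3)
path = [child_index(code, depth, 3) for depth in range(3)]
print(code, path, morton_decode(code, 3))
```

Each group of three bits in the code selects one of the eight children on the way from the root to the leaf, so the tree descent reduces to shifts and masks.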

In the last period, we have extended our tabulation framework in order to use the look-up tables as efficiently as possible on future high performance computers. We first give some information about the parallelization strategy of the CFD solver FLEXI [8]. FLEXI is based on domain decomposition, which distributes the computational grid among the MPI processes depending on the number of cores used, see Fig. 4. For the domain decomposition, again a space-filling curve, more precisely the Hilbert curve, is used. This curve has the special property of distributing the MPI regions optimally with respect to the volume/surface ratio, even on unstructured grids. To ensure that each MPI process can access the data in the table, each MPI process has to initialize and allocate its own table when standard MPI features are used. With MPI 3.0 features such as shared memory windows, the number of tables can be reduced to one per node.

However, modern data structures generally consist of chained pointer lists, which cannot be used directly with the MPI 3.0 shared memory feature. The reason is that each MPI process has its own virtual address space, see Fig. 5, so pointers stored by one process are not valid in another. This has consequences for the way in which the tree structure has to be read in and accessed during the simulation on HPC systems. Figure 6 depicts the standard approach to store and access the tree data. Here, each branch of the tree stores a small portion of the whole data, and each MPI process reads and allocates the data during IO. Figure 7 depicts the alternative approach based on an MPI 3.0 shared memory window. Here, each branch of the tree only stores two integer IDs that define a range in the global shared memory array. An important aspect is that during IO only one MPI process per node is allowed to read and allocate the data. Nevertheless, each MPI process has to read and store the empty tree, because each process still needs to know the path to the unique IDs in the last branch. With this approach it is possible to maintain the efficient tree data structure while being able to store and access orders of magnitude more data.
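The index-range layout described above can be mimicked in a short, serial Python sketch. A plain list stands in for the node-wide shared memory array, which in an actual MPI 3.0 code would be allocated through a shared memory window (for example with `MPI_Win_allocate_shared` on a communicator obtained from `MPI_Comm_split_type`). The quadtree, the class `Node` and the helper names are illustrative assumptions and do not reproduce the FLEXI implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Quadtree node: inner nodes hold children, leaves hold an ID range."""
    children: Optional[List["Node"]] = None
    data_range: Optional[Tuple[int, int]] = None  # [start, stop) in flat array


def build_empty_tree(depth, n_per_leaf, counter):
    """Build a uniformly refined tree whose leaves store only index ranges."""
    if depth == 0:
        start = counter[0]
        counter[0] += n_per_leaf
        return Node(data_range=(start, start + n_per_leaf))
    children = [build_empty_tree(depth - 1, n_per_leaf, counter)
                for _ in range(4)]
    return Node(children=children)


def find_range(root, x, y):
    """Descend the tree for a point in [0, 1)^2 and return its ID range."""
    node, x0, y0, size = root, 0.0, 0.0, 1.0
    while node.children is not None:
        size *= 0.5
        ix = int(x >= x0 + size)
        iy = int(y >= y0 + size)
        x0 += ix * size
        y0 += iy * size
        node = node.children[ix + 2 * iy]
    return node.data_range


# Every process keeps the small "empty" tree; the data lives in one array.
counter = [0]
tree = build_empty_tree(depth=2, n_per_leaf=3, counter=counter)
shared_data = [float(i) for i in range(counter[0])]  # stand-in for the window
start, stop = find_range(tree, 0.9, 0.9)
coefficients = shared_data[start:stop]
print(start, stop, coefficients)
```

Because the leaves contain integers instead of pointers, the tree remains valid in every process regardless of where the shared array is mapped in its virtual address space.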

4 Results

4.1 Performance Comparison of Tree Data Structures with and Without MPI 3.0 Shared Memory

In this section we investigate the different data structures in terms of performance and memory usage.

For the comparison we use the performance index

$$\begin{aligned} \text {PID} = \frac{\text {wall-clock-time} \cdot \#\text {processors}}{\#\text {DOF} \cdot \#\text {time steps} \cdot \#\text {RK-stages}} . \end{aligned}$$

(37)

The results are obtained with the open source code FLEXI in combination with octree tables. Note that the IO of FLEXI is based on the HDF5 standard. To ensure a fair comparison, we have chosen a simple test case, the standard lid-driven cavity in two dimensions, see [5]; the numerical setup is given in Table 1. We choose a binary mixture of two different ideal gases instead of performing a one-component simulation, as is typically done in the literature. First, we look at the performance of both data structures defined in Sect. 3.3. We perform each simulation six times and average the measurements to reduce hardware influences. The comparison was done on 8 nodes with 192 processes. The tree data was refined up to 7 levels, resulting in a memory size of $\approx 2.8$ GB. Each octant represents the data in a three-dimensional polynomial basis of degree 4. The first two lines of Table 2 list the results of the performance test. We notice a slightly higher PID for the MPI 3.0 implementation, which is most likely due to the additional index mapping used to get the position in the global shared memory array. In the third line we compare the time needed to read and allocate the data before the simulation starts. The IO of the MPI 3.0 implementation differs in that we do not read the whole tree from the HDF5 file at once. With the shared memory option, we read, allocate and deallocate each octant successively from the HDF5 file to save as much memory as possible. Here, we notice a non-negligibly longer IO time for the MPI 3.0 implementation: the shared memory approach takes about 6 times longer than the standard approach.
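The performance index of Eq. (37) and the averaging over repeated runs can be written as a short Python helper. The wall-clock times, the number of degrees of freedom and the step counts in the usage lines are placeholder values chosen for illustration; they are not the measurements reported in Table 2.

```python
from statistics import mean


def performance_index(wall_clock_time, n_procs, n_dof, n_time_steps,
                      n_rk_stages):
    """Eq. (37): core time per degree of freedom, time step and RK stage."""
    return wall_clock_time * n_procs / (n_dof * n_time_steps * n_rk_stages)


def averaged_pid(wall_clock_times, n_procs, n_dof, n_time_steps, n_rk_stages):
    """Average the PID over repeated runs of the same setup."""
    return mean(performance_index(t, n_procs, n_dof, n_time_steps, n_rk_stages)
                for t in wall_clock_times)


# Six hypothetical runs on 192 processes (placeholder timings in seconds).
times = [10.2, 9.9, 10.1, 10.0, 9.8, 10.0]
print(averaged_pid(times, n_procs=192, n_dof=1_000_000,
                   n_time_steps=100, n_rk_stages=5))
```

Since the PID is normalized by the number of processes and the problem size, values from different runs can be compared directly as long as the setup per degree of freedom is the same.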

Table 1 Numerical setup of the Lid driven cavity test problem

Table 2 Performance comparison for the different data structures

Tables 3 and 4 contain memory comparisons.

Table 3 Memory usage depending on tree level, here we tabulated a binary mixture of Helium/Air

Table 4 Theoretical memory usage by using non-shared memory and shared memory data structures on different architectures

Here, we notice the huge improvement with the MPI 3.0 shared memory implementation. For the planned architecture Hawk we will (theoretically) be able to store and access about 128 times more memory than with the old algorithm.
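The factor of 128 follows from simple arithmetic on the node layout given in Sect. 3.3. The Python sketch below is an upper-bound estimate that assumes the table is the only data in memory and neglects both the solver data and the empty tree that every process still stores. The value of 128 GB per node for Hawk is inferred from the stated 1 GB per core at 128 cores per node.

```python
def table_capacity_gb(node_memory_gb, cores_per_node, shared):
    """Largest table that fits on a node under the stated idealization.

    Without shared memory every process holds its own copy, so the node
    memory is divided by the number of processes (one per core).
    """
    if shared:
        return node_memory_gb
    return node_memory_gb / cores_per_node


def shared_memory_gain(node_memory_gb, cores_per_node):
    """Ratio of shared to non-shared table capacity."""
    return (table_capacity_gb(node_memory_gb, cores_per_node, True)
            / table_capacity_gb(node_memory_gb, cores_per_node, False))


# Node layouts from the text: (memory per node in GB, cores per node).
machines = {"Hazel Hen": (128.0, 24), "Hawk": (128.0, 128)}
for name, (memory, cores) in machines.items():
    print(name, table_capacity_gb(memory, cores, False),
          shared_memory_gain(memory, cores))
```

In this idealization the gain is simply the number of cores per node, which is why the benefit grows from 24 on Hazel Hen to 128 on Hawk.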

4.2 Navier–Stokes Multi-component Simulations

The multi-component Navier–Stokes model was used to compare simulations based on direct evaluation of the EOS with simulations based on tables of different refinement levels. As test case, a two-dimensional shear layer of nitrogen and n-dodecane on the domain $[0,0.2] \times [-0.15,0.15]$ m² was investigated. The initial states of the pure species are summarized in Table 5. As initial condition, a base flow in x-direction superposed with a y-velocity disturbance was used, given by

$$\begin{aligned} u_{N_2}&= 2 M_{c,0} a_{N_2} \left[ 1 + \left( \frac{a_{N_2}}{a_{C_7H_{16}}} \right) \sqrt{\frac{\rho _{N_2} Z_{N_2}}{\rho _{C_7H_{16}} Z_{C_7H_{16}}}} \right] ^{-1}, \end{aligned}$$

(38)

$$\begin{aligned} u_{C_7H_{16}}&= - \sqrt{\frac{\rho _{N_2} Z_{N_2}}{\rho _{C_7H_{16}} Z_{C_7H_{16}}}} u_{N_2}, \end{aligned}$$

(39)

$$\begin{aligned} u (x,t=0)&= u_0 \bigg |erf \left( \frac{\sqrt{\pi } y}{\delta _{\omega ,0}} \right) \bigg |, \end{aligned}$$

(40)

$$\begin{aligned} Y_{C_7H_{16}} (x,t=0)&= 1 - Y_{N_2}, \end{aligned}$$

(41)

$$\begin{aligned} Y_{N_2} (x,t=0)&= 0.5+0.5 \; erf \left( \frac{\sqrt{\pi } y}{\delta _{\omega ,0}} \right) , \end{aligned}$$

(42)

$$\begin{aligned} v (x,y,t=0)&= 0.1 \; max \left( u_0 \right) \sin \left( \frac{8 \pi x}{\delta _{\omega ,0}} \right) \exp \left\{ - \left( \frac{y}{\delta _{\omega ,0}} \right) ^2 \right\} \end{aligned}$$

(43)

and

$$\begin{aligned} \rho = \rho (T,p,\mathbf {Y}). \end{aligned}$$

(44)

Table 5 Specified initial conditions of the base flow for the pure species of the mixing layer test case

Here, Z is the compressibility factor, $M_{c,0}$ is the Mach number, which was set to 0.4, and $\delta _{\omega ,0}$ is the initial blending thickness between the two species with $\delta _{\omega ,0} = 6.859 \cdot 10^{-3}$ m.
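The species profiles of Eqs. (41) and (42) and the velocity disturbance of Eq. (43) are straightforward to evaluate. The following Python sketch uses the blending thickness given in the text; the function names and the value passed for $\max (u_0)$ are illustrative assumptions, since the base-flow velocities depend on the states in Table 5.

```python
import math

DELTA_OMEGA_0 = 6.859e-3  # initial blending thickness in m, from the text


def mass_fraction_n2(y, delta=DELTA_OMEGA_0):
    """Eq. (42): error-function blending of the nitrogen mass fraction."""
    return 0.5 + 0.5 * math.erf(math.sqrt(math.pi) * y / delta)


def mass_fraction_fuel(y, delta=DELTA_OMEGA_0):
    """Eq. (41): the second species follows from the sum of mass fractions."""
    return 1.0 - mass_fraction_n2(y, delta)


def v_disturbance(x, y, u0_max, delta=DELTA_OMEGA_0):
    """Eq. (43): sinusoidal y-velocity disturbance with Gaussian decay in y."""
    return (0.1 * u0_max * math.sin(8.0 * math.pi * x / delta)
            * math.exp(-((y / delta) ** 2)))


# Sample the profiles across the layer; u0_max = 50.0 is a placeholder value.
for y in (-0.15, -DELTA_OMEGA_0, 0.0, DELTA_OMEGA_0, 0.15):
    print(y, mass_fraction_n2(y), mass_fraction_fuel(y),
          v_disturbance(DELTA_OMEGA_0 / 16.0, y, u0_max=50.0))
```

The disturbance has its largest amplitude, 10% of $\max (u_0)$, on the centre line $y=0$ and decays over a few blending thicknesses, so it only seeds the instability inside the mixing region.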

The results are visualized in Figs. 8 and 9. In both snapshots we can observe some differences between the three computations. This is because the chosen Kelvin–Helmholtz test problem is a highly sensitive initial value problem, so that the different thermodynamic approximations quickly lead to different results. In summary, the tabulation approach is suitable for multi-component simulations in the super-critical regime; nevertheless, further investigations are necessary.

4.3 Navier–Stokes–Korteweg

The parabolic relaxation model for the NSK equations was used to investigate head on collisions of two droplets.

4.3.1 Simulation Setup

The initial conditions were

$$\begin{aligned} \rho (\mathbf {x},t=0)&= \rho _{\mathrm {vap}} + \frac{\rho _{\mathrm {vap}}-\rho _{\mathrm {liq}}}{2} \sum _{i=1}^2 \left(\mathrm {tanh} \left( \frac{d_i-r_i}{2\sqrt{\gamma _{\mathrm {K}}\epsilon _{\mathrm {K}}^2}} \right) \right) \end{aligned}$$

(45)

$$\begin{aligned} u(\mathbf {x},t=0)&= {\left\{ \begin{array}{ll} \frac{v_{\mathrm {ini}}}{2} + \left( 1 - \mathrm {tanh} \left(\frac{d_1-r_{\mathrm {d}}}{2\sqrt{\gamma _{\mathrm {K}}\epsilon _{\mathrm {K}}^2}} \right) \right) \quad &{} \text {if} \quad x<0.5, \\ \frac{- v_{\mathrm {ini}}}{2} + \left( 1 - \mathrm {tanh} \left(\frac{d_2-r_{\mathrm {d}}}{2\sqrt{\gamma _{\mathrm {K}}\epsilon _{\mathrm {K}}^2}} \right) \right) \quad &{} \text {if} \quad x\ge 0.5, \\ \end{array}\right. }\end{aligned}$$

(46)

$$\begin{aligned} v(\mathbf {x},t=0)&= 0, \end{aligned}$$

(47)

$$\begin{aligned} w(\mathbf {x},t=0)&= 0, \end{aligned}$$

(48)

where $\rho _{\mathrm {vap}}=0.3197$ and $\rho _{\mathrm {liq}}=1.8071$ are the Maxwellian densities at $T_{\mathrm {ref}}=0.85$. The droplet radii were $r_1=r_2=0.5$, and the distance was given by

$$\begin{aligned} d_i = \parallel \mathbf {x}- \mathbf {x}_{0,i} \parallel , \end{aligned}$$

(49)

where $\mathbf {x}_{0,1}=(0.3,0.5,0.5)^{\top }$ and $\mathbf {x}_{0,2}=(0.7,0.5,0.5)^{\top }$ are the initial positions of the droplets. Four cases were investigated in which the number, position, and size of the droplets remained the same, while the model parameters and initial velocities were varied. The parameters are summarized in Table 6. The computational domain was $\Omega =[0,1]^3$, discretized by 64 elements in each direction. The polynomial degree was $N=3$, which yielded $256^3$ degrees of freedom (DOF). Time integration was performed implicitly with $\mathrm {CFL}=100$ using a fourth-order ESDIRK scheme with six stages. The simulations were performed on the Hazel Hen supercomputer at HLRS using 200 nodes.
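The resolution figures and the distance of Eq. (49) can be reproduced with a few lines. This is an illustrative sketch under two assumptions: each element carries $N+1$ nodes per direction, which is consistent with the quoted $256^3$ DOF, and the norm in Eq. (49) is the Euclidean norm. The function names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

N_ELEMS_PER_DIR = 64   # elements per direction, from the text
POLY_DEGREE = 3        # polynomial degree N, from the text

# Initial droplet centres x_{0,1} and x_{0,2}, from the text
DROPLET_CENTRES = ((0.3, 0.5, 0.5), (0.7, 0.5, 0.5))


def dof_per_direction(n_elems, degree):
    """Assumes degree + 1 nodes per element and direction."""
    return n_elems * (degree + 1)


def distance_to_centre(x, centre):
    """Eq. (49), assuming the Euclidean norm."""
    return math.dist(x, centre)


if __name__ == "__main__":
    n = dof_per_direction(N_ELEMS_PER_DIR, POLY_DEGREE)
    print(n, n ** 3)  # 256 points per direction, 256^3 in total
    print(distance_to_centre((0.5, 0.5, 0.5), DROPLET_CENTRES[0]))
```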

Table 6 Parameters for head on droplet collision simulations

4.3.2 Simulation Results

The isocontour of the mean density, $\rho _{\mathrm {mean}}=1.0634$, of the solution of Case A is shown in Fig. 10 at different time instances. The two droplets were pushed towards each other and coalesced. A flat disc formed for $t>0.12$, which broke up into a ring and a small droplet in the centre at $t\approx 0.24$. Both the ring and the central droplet evaporated, and for $t \rightarrow \infty $ only vapour remained, since the average density was in the stable vapour phase.
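The quoted isocontour level is consistent with the arithmetic mean of the two Maxwellian densities given in the simulation setup. The short check below assumes that this is how $\rho _{\mathrm {mean}}$ was defined; the text does not state the definition explicitly.

```python
RHO_VAP = 0.3197  # Maxwellian vapour density at T_ref = 0.85, from the setup
RHO_LIQ = 1.8071  # Maxwellian liquid density at T_ref = 0.85, from the setup

# Assumption: the isocontour level is the arithmetic mean of the two densities.
rho_mean = 0.5 * (RHO_VAP + RHO_LIQ)

print(f"{rho_mean:.4f}")  # prints 1.0634
```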

In Case B, $\epsilon _{\mathrm {K}}$ was increased and $\gamma _{\mathrm {K}}$ was decreased, such that different phenomena were observed. The isocontour of the mean density is shown in Fig. 11. Again, the two droplets merged and a disc formed. The disc flattened and break-up occurred at its centre; however, no droplet was formed and only a ring remained. Eventually, the ring evaporated and the domain was filled by a stable vapour phase.

Case C reduced $\gamma _{\mathrm {K}}$ further, which led to a thinner phase interface. The isocontour is shown in Fig. 12. After coalescence, the disc formed again, but no break-up occurred and the disc remained in that form until it evaporated completely.

Case D used the same parameters as Case C but with an increased initial velocity of the droplets. The isocontour is shown in Fig. 13. The higher momentum of the droplets led to a stronger impact, such that the disc quickly broke up and a ring and a central droplet remained.

The total energy, Eq. (22), was calculated in each time step. As seen in Fig. 14, the total energy decreased monotonically until a minimum was reached. Hence, the solutions produced by the relaxation model were admissible.
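A monotonicity criterion of this kind can be checked on a recorded energy history as a post-processing step. The sketch below is a hypothetical helper, not part of FLEXI. It treats "monotonically decreasing" as non-increasing, with an optional tolerance for round-off, and the sample values are made up for illustration.

```python
def is_monotonically_decreasing(values, tol=0.0):
    """Return True if no sample exceeds its predecessor by more than tol."""
    return all(later <= earlier + tol
               for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    # Made-up energy samples for illustration only, not data from Fig. 14.
    energy_history = [1.00, 0.97, 0.95, 0.95, 0.94]
    print(is_monotonically_decreasing(energy_history))  # prints True
```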

5 Summary and Conclusions

In this work, we investigated the use of modern data structures on high performance computers. In this context, a new implementation strategy for shared memory look-up tables for binary mixtures was introduced. We were able to show that a change in hardware architecture on high performance computers, e.g. from Hazel Hen to Hawk, has a great impact on the old algorithms. With the new implementation, we are able to store and access about 128 times more memory than with the old algorithm. The simulation and comparison of a multi-component real gas shear layer with the exact EOS and the tabulation approach led to reasonable results; however, further investigations are necessary.
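One way to read the quoted factor is in terms of table copies per node. The sketch below assumes (this is not stated in this section) that the old algorithm keeps a private copy of the table on every MPI rank, that the shared memory variant keeps one copy per node, and that 128 ranks run on each node. The function name, the rank count, and the table size are all hypothetical.

```python
def table_footprint_per_node(table_bytes, ranks_per_node, shared):
    """Memory used on one node by the look-up table.

    shared=False: every MPI rank holds a private copy.
    shared=True:  one copy per node is shared by all ranks on that node.
    """
    copies = 1 if shared else ranks_per_node
    return copies * table_bytes


if __name__ == "__main__":
    RANKS_PER_NODE = 128         # assumption, chosen to match the quoted factor
    TABLE_BYTES = 1 * 1024 ** 3  # hypothetical 1 GiB table

    private = table_footprint_per_node(TABLE_BYTES, RANKS_PER_NODE, shared=False)
    shared = table_footprint_per_node(TABLE_BYTES, RANKS_PER_NODE, shared=True)
    print(private // shared)  # prints 128
```

Under these assumptions, a fixed amount of node memory accommodates a table that is larger by a factor equal to the number of ranks per node.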

In addition, 3D simulations of colliding droplets were carried out using a parabolic relaxation model of the Navier–Stokes–Korteweg diffuse interface model. A variation of model parameters produced a variation in the coalescence behaviour. Future research aims at validation with experimental results.

Notes

1. http://www.flexi-project.org

Acknowledgements

We gratefully acknowledge the support of the Deutsche Forschungsgemeinschaft (DFG) through SFB-TRR 40 “Fundamental Technologies for the Development of Future Space-Transport-System Components under High Thermal and Mechanical Loads” and SFB-TRR 75 “Droplet dynamics under extreme ambient conditions”. Computational resources have been provided by the Bundes-Höchstleistungsrechenzentrum Stuttgart (HLRS).

Author information

Authors and Affiliations

Institute of Aerodynamics and Gas Dynamics, University of Stuttgart, Pfaffenwaldring 21, 70569, Stuttgart, Germany
Fabian Föll, Timon Hitz, Jens Keim & Claus-Dieter Munz

Corresponding author

Correspondence to Fabian Föll.

Editor information

Editors and Affiliations

Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH), Technische Universität Dresden, Dresden, Germany
Wolfgang E. Nagel
Abteilung für Angewandte Mathematik, Universität Freiburg, Freiburg, Germany
Dietmar H. Kröner
Höchstleistungsrechenzentrum Stuttgart (HLRS), Universität Stuttgart, Stuttgart, Germany
Michael M. Resch

About this paper

Cite this paper

Föll, F., Hitz, T., Keim, J., Munz, CD. (2021). Towards High-Fidelity Multiphase Simulations: On the Use of Modern Data Structures on High Performance Computers. In: Nagel, W.E., Kröner, D.H., Resch, M.M. (eds) High Performance Computing in Science and Engineering '19. Springer, Cham. https://doi.org/10.1007/978-3-030-66792-4_25

DOI: https://doi.org/10.1007/978-3-030-66792-4_25
Published: 30 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66791-7
Online ISBN: 978-3-030-66792-4
eBook Packages: Mathematics and Statistics (R0)
