CFD/CAA Simulations on HPC Systems

Schlottke-Lakemper, Michael; Klemp, Fabian; Cheng, Hsun-Jen; Lintermann, Andreas; Meinke, Matthias; Schröder, Wolfgang

doi:10.1007/978-3-319-46735-1_12


Abstract

In this paper, a highly scalable numerical method is presented that allows the aerodynamic sound from a turbulent flow field to be computed on HPC systems. A hybrid CFD-CAA method is used to compute the flow and the acoustic field, in which the two solvers run in parallel to avoid expensive I/O operations for the acoustic source terms. Herein, the acoustic perturbation equations are solved by a high-order discontinuous Galerkin scheme using the acoustic source terms obtained from an approximate solution of the Navier-Stokes equations. Both solvers run simultaneously and operate on differently refined hierarchical Cartesian grids. This direct-hybrid method is validated by monopole and pressure pulse simulations and is used for performance measurements on current HPC systems. The results highlight the limitations of classic hybrid methods and show that the new approach is suitable for highly parallel simulations.


1 Introduction

One of the major challenges of today’s aircraft development is noise reduction, which is also one of the central aims in European aircraft policy. The perceived noise levels of flying aircraft are to be reduced by 65 % by 2050 compared to the year 2000 [25]. Many sound-generating components of aircraft need to be assessed in sufficient detail to be able to improve their design, such as the optimization of the jet nozzle geometry to lower noise emissions at take-off without sacrificing the thrust efficiency. To achieve such optimizations, efficient, fully parallelized algorithms are needed to predict the flow field and the far-field noise of jet engines.

A hybrid method combining large-eddy simulation (LES) with computational aeroacoustics (CAA) for large-scale aeroacoustics simulations has been successfully applied in [7, 18]. It uses LES to determine the turbulent flow field for external flow configurations. From this solution, noise-generating source terms are extracted and used in a CAA simulation, where the acoustic field is predicted using the acoustic perturbation equations (APE) [6]. This scheme has been applied successfully to different problems in computational aeroacoustics, such as trailing edge noise [7], jet noise [12], or combustion noise [4, 11]. However, it suffers from the exchange of large data volumes for the acoustic source terms via I/O operations, which limits the efficiency of such a two-step approach especially on high-performance computing (HPC) systems.

To circumvent this bottleneck, the direct-hybrid method presented in this work combines the LES and CAA solvers in a single framework such that both solvers can run in parallel. The LES solver used for the prediction of the flow field is based on a finite-volume method, while the CAA approach makes use of a high-order discontinuous Galerkin (DG) method to solve the APE for the acoustic field. DG methods were first described by Reed and Hill [24] and were subsequently applied to various physical problems, such as incompressible and compressible flow [2, 22], magnetohydrodynamics [29], and aeroacoustics [1, 3].

The LES and CAA computations are performed on a joint Cartesian mesh. Based on a coloring scheme, cells are associated with different weights for the LES and CAA solution and a space-filling curve is used for the domain decomposition. The coupling mechanism between both simulations only requires memory transfer operations. That is, no additional communication between the subdomains is necessary, leading to an efficient algorithm to be used on massively parallel systems. Furthermore, this direct-hybrid approach allows a more fine-grained control over the coupling process itself, since the LES results are not obtained separately from the acoustic field anymore. This means that, e.g., the time step size or the grid size can be adapted during the simulation to account for time-dependent changes in the resolution requirements of both solvers, enabling in situ optimizations of the simulation process.

In this paper, the coupling approach for the direct-hybrid LES-CAA simulation is presented and results for performance measurements are shown. A CAA code is developed and integrated with an existing LES solver. After the governing equations are introduced in Sect. 2, the numerical methods are described in Sect. 3. In Sect. 4, the coupling strategy is discussed in detail. The CAA solver is validated in Sect. 5, before it is used for strong scaling experiments on two state-of-the-art HPC systems. In Sect. 6, the presented methods and the obtained results are summarized.

2 Governing Equations

In this hybrid CFD-CAA method, two sets of governing equations are utilized. One solely describes the generation and propagation of acoustic waves, while the other set of equations predicts the physics of the underlying flow field. Here, the acoustic perturbation equations are used for the acoustic field and the Navier-Stokes equations for the flow field. Both are briefly summarized in the following.

2.1 Navier-Stokes Equations

The Navier-Stokes equations in non-dimensional, conservative form are given by

$$\begin{aligned} \begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot \left( \rho \varvec{u} \right)&= 0, \\ \frac{\partial \rho \varvec{u}}{\partial t} + \nabla \cdot \left( \rho \varvec{u} \varvec{u} + p \varvec{I} + \frac{\varvec{\tau }}{\text {Re}_0} \right)&= 0, \\ \frac{\partial \rho e}{\partial t} + \nabla \cdot \left( (\rho e + p) \varvec{u} + \frac{1}{\text {Re}_0} (\varvec{\tau } \varvec{u} + \varvec{q}) \right)&= 0. \end{aligned} \end{aligned}$$

(1)

The quantity $\rho $ represents the fluid density, $\varvec{u}$ the velocity vector, $e$ the total specific energy, and $\varvec{I}$ the identity tensor. The system in Eq. (1) is closed by the definition of the total specific energy for a perfect gas

$$\begin{aligned} \rho e = \frac{p}{\gamma - 1} + \frac{1}{2} \rho (\varvec{u} \cdot \varvec{u}), \end{aligned}$$

(2)

where p is the pressure and $\gamma $ is the specific heat ratio. For non-dimensionalization, the stagnation state is employed, which is denoted by the subscript 0. The Reynolds number based on the stagnation state is defined by

$$\begin{aligned} \text {Re}_0 = \frac{\rho _0 c_0 L}{\mu _0}, \end{aligned}$$

(3)

where L is a reference length and $\rho _0$, $c_0$, and $\mu _0$ are the stagnation density, the speed of sound, and the dynamic viscosity. A Newtonian fluid is assumed such that the components $\tau _{ij}$ of the stress tensor $\varvec{\tau }$ can be written as

$$\begin{aligned} \tau _{ij} = -2\mu S_{ij} + \frac{2}{3} \mu S_{kk} \delta _{ij}, \end{aligned}$$

(4)

where $S_{ij} = \frac{1}{2} \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) $ is the rate of strain tensor. The dynamic viscosity $\mu $ is calculated by using Sutherland’s law and the vector of heat conduction $\varvec{q}$ is determined by Fourier’s law

$$\begin{aligned} \varvec{q} = - \frac{k}{\text {Pr}(\gamma - 1)} \nabla T, \end{aligned}$$

(5)

where T is the static temperature. The Prandtl number is defined with the specific heat at constant pressure $c_p$ by $\text {Pr} = \frac{\mu _0 c_p}{k_0}$. For a constant Prandtl number, the relation $k(T) = \mu (T)$ holds for the thermal conductivity.
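As a small illustration of the closure in Eq. (2), the following sketch converts between pressure and total energy for a perfect gas. The function names and the default value $\gamma = 1.4$ (air) are assumptions made for this example and are not taken from the solver described here.

```python
def total_energy(rho, velocity, pressure, gamma=1.4):
    """Eq. (2): rho*e = p / (gamma - 1) + 0.5 * rho * (u . u)."""
    kinetic = 0.5 * rho * sum(c * c for c in velocity)
    return pressure / (gamma - 1.0) + kinetic


def pressure_from_conservative(rho, momentum, rho_e, gamma=1.4):
    """Invert Eq. (2) for p, given the conservative variables rho, rho*u, rho*e."""
    kinetic = 0.5 * sum(m * m for m in momentum) / rho
    return (gamma - 1.0) * (rho_e - kinetic)


# Round trip for an arbitrary non-dimensional state (values are illustrative only).
rho, velocity, pressure = 1.0, (0.3, 0.0, 0.0), 0.7
rho_e = total_energy(rho, velocity, pressure)
momentum = tuple(rho * c for c in velocity)
recovered = pressure_from_conservative(rho, momentum, rho_e)
```

The inverse relation is what a finite-volume solver in conservative form would typically evaluate whenever the pressure is needed for the flux computation.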

2.2 Acoustic Perturbation Equations

The acoustic perturbation equations (APE) were introduced in [6] and are used to predict the acoustic field for flow-induced noise. They are derived from the linearized Euler equations and modified to retain only acoustic modes without generating vorticity or entropy modes. Neglecting all viscous, non-linear and entropy-related contributions, the APE-4 system reads [6]

$$\begin{aligned} \frac{\partial \varvec{u}'}{\partial t} + \varvec{\nabla } \left( \bar{\varvec{u}} \cdot \varvec{u}'\right) + \varvec{\nabla } \left( \frac{p'}{\bar{\rho }} \right)&= \varvec{q_m}, \end{aligned}$$

(6)

$$\begin{aligned} \frac{\partial p'}{\partial t} + \bar{c}^2 \varvec{\nabla } \cdot \left( \bar{\rho }\varvec{u}' + \bar{\varvec{u}} \frac{p'}{\bar{c}^2}\right)&= 0, \end{aligned}$$

(7)

where the source term $\varvec{q_m}$ is the linear Lamb vector

$$\begin{aligned} \varvec{q}_m = - (\varvec{\omega } \times \varvec{u})' = - (\varvec{\omega }' \times \bar{\varvec{u}} + \bar{\varvec{\omega }} \times \varvec{u}'), \end{aligned}$$

(8)

with $\varvec{\omega }$ as the vorticity vector. The variables of the APE are perturbed quantities denoted by prime $(\cdot )'$ and are defined by $\phi ' := \phi - \bar{\phi }$, where the bar $(\bar{\cdot })$ denotes time-averaged quantities.

In the present work, the non-dimensional form of Eqs. (6) and (7) is used. As for the Navier-Stokes equations, the stagnation state is used for the definition of reference values. Furthermore, it is assumed here that the time-averaged values for the speed of sound and density are constant and equal to the stagnation state, i.e., $\bar{c} = c_0$ and $\bar{\rho } = \rho _0$, which is only valid in the low-Mach number regime. By using the following non-dimensional variables,

$$\begin{aligned} \tilde{t} = \frac{t c_0}{L}, \qquad \tilde{x} = \frac{x}{L}, \qquad \tilde{\varvec{u}} = \frac{\varvec{u}}{c_0}, \qquad \tilde{p} = \frac{p}{\rho _0 c_0^2}, \end{aligned}$$

(9)

the APE can be written as

$$\begin{aligned} \frac{\partial \tilde{\varvec{u}}'}{\partial \tilde{t}} + \tilde{\varvec{\nabla }} (\tilde{\bar{\varvec{u}}} \cdot \tilde{\varvec{u}}' + \tilde{p}')&= \tilde{\varvec{q}}_m,\end{aligned}$$

(10)

$$\begin{aligned} \frac{\partial \tilde{p}'}{\partial \tilde{t}} + \tilde{\varvec{\nabla }} \cdot (\tilde{\varvec{u}}' + \tilde{\bar{\varvec{u}}} \tilde{p}')&= 0. \end{aligned}$$

(11)

The non-dimensional source term is given by $\tilde{\varvec{q}}_m = \frac{\varvec{q}_m}{c_0^2/L}$. For convenience, in the following discussion the tilde is dropped from the non-dimensional quantities.

3 Numerical Methods

In this section, the meshing process and the domain decomposition are outlined. Furthermore, the numerical methods for the acoustic perturbation equations and the Navier-Stokes equations are briefly described.

3.1 Hierarchical Mesh Topology

Both the LES solver and the CAA solver operate on a joint hierarchical Cartesian mesh. The cells of the grid are organized in a tree structure (2D: quadtree, 3D: octree), with parent-child relationships between different levels and neighbor relationships within a level. The discretization process follows the method described in [21] and starts with a single square/cube cell which encloses the whole computational domain.

This zero-level cell is then refined uniformly until the desired refinement level is reached (see Fig. 1a). A cell to be refined is isotropically subdivided into $2^d$ square/cube cells, with d being the number of spatial dimensions and with the original cell becoming the parent cell of the new child cells. Individual regions of the mesh can be further refined to meet resolution requirements, e.g., in areas with small-scale physical features such as wall-bounded shear layers or to accurately resolve boundaries (see Fig. 1b). A smoothing algorithm ensures that the level difference between neighboring cells does not exceed one, i.e., each cell has at most $2^{d-1}$ neighbor cells in each spatial direction. Special treatment is necessary for cells that are intersected by the body geometry. In this paper, only non-intersected cells are considered. During grid generation, the zero-level cell is homogeneously refined to a minimum level $l_\alpha $ and all coarser cells are discarded [21]. These cells at level $l_\alpha $ become the roots of their subtrees and are further subdivided until the required refinement level is reached.
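The isotropic subdivision into $2^d$ children can be sketched in a few lines. The `Cell` class and `refine_uniformly` helper below are hypothetical names chosen for illustration; they cover only uniform refinement and omit local refinement, level smoothing, and the treatment of boundaries.

```python
class Cell:
    """One square (2D) or cube (3D) cell of a hierarchical Cartesian mesh."""

    def __init__(self, center, size, level=0, parent=None):
        self.center = tuple(center)
        self.size = size
        self.level = level
        self.parent = parent
        self.children = []

    def refine(self):
        """Split the cell isotropically into 2**d children and return them."""
        dim = len(self.center)
        quarter = self.size / 4.0
        for index in range(2 ** dim):
            # Bit `axis` of `index` selects the lower or upper half along that axis.
            center = tuple(
                c + (quarter if (index >> axis) & 1 else -quarter)
                for axis, c in enumerate(self.center)
            )
            self.children.append(Cell(center, self.size / 2.0, self.level + 1, self))
        return self.children


def refine_uniformly(root, target_level):
    """Refine an unrefined root until all leaves sit at `target_level`."""
    leaves = [root]
    for _ in range(target_level - root.level):
        leaves = [child for cell in leaves for child in cell.refine()]
    return leaves
```

Refining a two-dimensional root cell to level 3, for instance, yields $(2^2)^3 = 64$ leaf cells, each with one eighth of the root's side length.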

For the domain decomposition, a Hilbert space-filling curve [26] is used to map the grid at level $l_\alpha $ to the interval [0, 1]. Each cell at level $l_\alpha $ is assigned a load that depends on the number of cells in its subtree and on the type of the cells, i.e., whether they are LES or CAA cells. Load balancing is achieved by taking into account these load values when distributing the cells among the processes and for each $l_\alpha $ cell the entire subtree is placed on the same rank (see Fig. 2). By consecutively placing $l_\alpha $ cells and their subtrees on the MPI ranks according to their position on the Hilbert curve, spatial compactness is ensured, reducing the overall communication cost.
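A minimal sketch of the load-based distribution is given below. It assumes that the $l_\alpha $ cells are already sorted along the space-filling curve and that each carries a scalar load; the weights `w_les` and `w_caa` and both function names are placeholders, not values or interfaces from the actual framework.

```python
from itertools import accumulate


def subtree_load(n_les_cells, n_caa_cells, w_les=1.0, w_caa=1.0):
    """Load of one subtree as a weighted cell count (weights are assumptions)."""
    return n_les_cells * w_les + n_caa_cells * w_caa


def partition_along_curve(loads, num_ranks):
    """Assign each curve-ordered subtree to a rank in contiguous chunks.

    A subtree goes to the rank whose share of the total load contains the
    midpoint of that subtree's load interval, so rank indices never decrease
    along the curve and each subtree stays on a single rank.
    """
    total = sum(loads)
    ranks = []
    for load, end in zip(loads, accumulate(loads)):
        midpoint = end - 0.5 * load
        ranks.append(min(int(midpoint / total * num_ranks), num_ranks - 1))
    return ranks
```

Because the assignment is monotone along the curve, each rank receives a contiguous curve segment, which is what preserves spatial compactness. With very uneven loads some ranks may stay empty in this simple variant.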

3.2 Discontinuous Galerkin Approximation of the APE

A discontinuous Galerkin spectral element method (DGSEM) is used to determine the acoustic field. The DGSEM was proposed by Kopriva et al. [19] and has since been used extensively [9, 17]. Since it was derived for quadrilateral/hexahedral mesh elements, it is well suited for use on hierarchical Cartesian grids. Furthermore, its compact formulation allows a very efficient parallelization when explicit time stepping is used, and the parallel efficiency is independent of the chosen order of the scheme.

Since the DGSEM elements correspond to cells in a finite-volume context, the words cell or element will be used interchangeably. In the following, the main components of the DGSEM are outlined. First, the system of equations is mapped to a reference element for efficiency reasons. The derivation of the DG formulation then starts with the weak formulation, choosing Lagrange polynomials to represent the solution within each element. This gives rise to an integral equation, which is approximately solved using Gauss quadrature. Finally, the discrete DG operator is integrated in time using a Runge-Kutta scheme.

A general system of hyperbolic conservation equations in three dimensions reads

$$\begin{aligned} \frac{\partial \varvec{U}}{\partial t} + \varvec{\nabla } \cdot \varvec{f}(\varvec{U}) = 0, \end{aligned}$$

(12)

where $\varvec{U} = \varvec{U}(\varvec{x}, t)$ is the vector of conservative variables $\{u_i\}_{i=1}^{n_v}$ and $\varvec{f}$ is the flux vector. For efficiency reasons, the differential equation is mapped to a reference element E, which is in three dimensions given by a cube of size $[-1,1]\times [-1,1]\times [-1,1]$. Introducing the reference coordinate vector $\varvec{\xi } = (\xi ^1, \xi ^2, \xi ^3)^\intercal $, the final transformed equation reads [17]

$$\begin{aligned} \hat{J}\varvec{U}_t + \varvec{\nabla _\xi } \cdot \varvec{f} = 0, \end{aligned}$$

(13)

where $\hat{J}$ is the Jacobian, which for cube-to-cube transformations is just $\frac{h}{2}$, h being the side length of the cube, and $\varvec{U}_t$ is the time derivative of the vector of conservative variables.

The derivation of the DG method starts with the weak form of the equation. Therefore, Eq. (13) is multiplied by a test function $\phi = \phi (\varvec{\xi })$ and integrated over the reference element E

$$\begin{aligned} \int _E \left( \hat{J}\varvec{U}_t + \varvec{\nabla _\xi } \cdot \varvec{f} \right) \phi \,\mathrm {d}\varvec{\xi } = 0. \end{aligned}$$

(14)

Using integration by parts on the flux term, the weak formulation of the differential equation is obtained

$$\begin{aligned} \int _E \hat{J}\varvec{U}_t\phi \,\mathrm {d}\varvec{\xi } + \int _{\partial E} (\varvec{f} \cdot \varvec{n})^* \phi \,\mathrm {d}\varvec{s} - \int _E \varvec{f} \cdot \varvec{\nabla _\xi } \phi \,\mathrm {d}\varvec{\xi } = 0, \end{aligned}$$

(15)

where $\varvec{n}$ is the surface normal vector in the reference system. Similar to the finite-volume approach, the value for the normal flux $\varvec{f} \cdot \varvec{n}$ is not uniquely defined on the element boundaries $\partial E$, since the solutions in the left $\varvec{U}^-$ and right $\varvec{U}^+$ elements are discontinuous. Therefore, a numerical flux $(\varvec{f} \cdot \varvec{n})^* = \varvec{g}(\varvec{U}^+,\varvec{U}^-, \varvec{n})$ is chosen that combines the values from both sides into a single flux. In this work, the local Lax-Friedrichs flux formulation is used,

$$\begin{aligned} \varvec{g}(\varvec{U}^+,\varvec{U}^-, \varvec{n}) = \frac{1}{2} \left( \varvec{f}(\varvec{U}^+) + \varvec{f}(\varvec{U}^-)\right) \cdot \varvec{n} + \frac{1}{2} \left( \max _{\varvec{U} \in [\varvec{U}^+,\varvec{U}^-]}|\varvec{a}(\varvec{U}) \cdot \varvec{n}| (\varvec{U}^+ - \varvec{U}^-) \right) , \end{aligned}$$

(16)

where $\varvec{a}$ is the vector of eigenvalues of the flux Jacobian. The solution $\varvec{U}$ is approximated by a polynomial basis

$$\begin{aligned} \varvec{U}(\varvec{\xi }, t) \approx \sum _{i,j,k=0}^N \varvec{\bar{u}_{ijk}}(t) \psi _{ijk}(\varvec{\xi }), \qquad \psi _{ijk}(\varvec{\xi }) = l_i(\xi ^1) l_j(\xi ^2) l_k(\xi ^3), \end{aligned}$$

(17)

where the basis functions $\psi _{ijk}$ are the product of one-dimensional Lagrange polynomials $l$ of degree $N$ in each spatial direction and $\varvec{\bar{u}_{ijk}}(t)$ are the coefficients to be determined. The nodal basis is defined on a set of interpolation points $\{\xi _i\}_{i=0}^N$ on the interval $\xi \in [-1, 1]$, which in this work are the Legendre-Gauss nodes (Fig. 3). For the fluxes, the same approach is used for the approximation.
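To make the numerical flux of Eq. (16) concrete, the sketch below evaluates it for one-dimensional scalar linear advection, $f(U) = aU$. It assumes that $\varvec{U}^+$ is the state in the element whose outward normal is $\varvec{n}$; this reading is an interpretation, under which the formula reduces to simple upwinding. The function name and arguments are illustrative.

```python
def lax_friedrichs_flux(u_plus, u_minus, normal, speed):
    """Local Lax-Friedrichs flux, Eq. (16), for f(U) = speed * U in 1D.

    `u_plus` is taken to be the state inside the element whose outward
    normal is `normal` (+1 or -1); `u_minus` is the neighbor state.
    """
    central = 0.5 * (speed * u_plus + speed * u_minus) * normal
    # For a linear flux the largest wave speed across the face is |speed * normal|.
    dissipation = 0.5 * abs(speed * normal) * (u_plus - u_minus)
    return central + dissipation


# Flow leaving the element through the face: the interior state is upwind.
outflow = lax_friedrichs_flux(u_plus=3.0, u_minus=5.0, normal=1.0, speed=2.0)
# Flow entering the element through the face: the neighbor state is upwind.
inflow = lax_friedrichs_flux(u_plus=3.0, u_minus=5.0, normal=1.0, speed=-2.0)
```

In both cases the result equals the normal flux of the upwind state, and for equal states on both sides the dissipation term vanishes, so the flux is consistent with the physical flux.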

The three integrals in Eq. (15) are approximated by Gauss quadrature. Generally, the Gauss quadrature of an arbitrary function f(x) on the interval [a, b] with $N+1$ nodes can be written as

$$\begin{aligned} \int \limits _a^b f(x)\,\mathrm {d} x \approx \sum _{i=0}^N \omega _i f(x_i), \end{aligned}$$

(18)

where the weights $\omega _i$ and the integration nodes $x_i$ are specific to the chosen quadrature type. These weights are pre-calculated and stored to make the algorithm efficient. With the interpolation points $\{\xi _i\}$ collocated at the Gauss nodes, all sums collapse into single values, yielding the discrete DG operator $\varvec{\mathscr {L}}(\varvec{U}, t) = \varvec{U}_t$ [17]. In the next step, the semi-discrete formulation is integrated in time to obtain the solution at the next time step, for which a low-storage fourth-order Runge-Kutta scheme is used [5].
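The quadrature rule of Eq. (18) can be sketched with the standard library alone. The example computes the Legendre-Gauss nodes and weights by Newton iteration on the Legendre polynomial and uses them to integrate a function on $[a, b]$. It is an illustration of the general technique rather than the precomputation used in the solver, and all function names are assumptions.

```python
import math


def legendre_with_derivative(n, x):
    """Return P_n(x) and P_n'(x) for n >= 1 and |x| < 1 (three-term recurrence)."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def gauss_legendre(num_nodes):
    """Nodes and weights of the Legendre-Gauss rule on [-1, 1], ascending."""
    nodes, weights = [], []
    for i in range(num_nodes):
        # Cosine initial guess for the i-th root, refined by Newton's method.
        x = -math.cos(math.pi * (i + 0.75) / (num_nodes + 0.5))
        for _ in range(100):
            p, dp = legendre_with_derivative(num_nodes, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        _, dp = legendre_with_derivative(num_nodes, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def integrate(f, a, b, num_nodes):
    """Approximate the integral of f over [a, b] with `num_nodes` Gauss nodes."""
    nodes, weights = gauss_legendre(num_nodes)
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in zip(nodes, weights))
```

With $N+1$ nodes the rule is exact for polynomials up to degree $2N+1$, so for $N = 3$ a call such as `integrate(lambda x: x**6, -1.0, 1.0, 4)` reproduces $2/7$ up to rounding.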

3.3 Finite-Volume Method for the Flow Simulation

A second-order finite-volume method is used to solve the unsteady Navier-Stokes equations for compressible flow as given in Sect. 2.1. The solver has been extensively validated and used for various flow problems previously [15, 16]. A detailed description of the method can be found in [13, 15, 16, 28].

4 Coupling Strategy

To solve the acoustic perturbation equations, the averaged quantities $\bar{\varvec{u}}$ and $\bar{c}$ and the source term $\varvec{q}_m$ have to be determined first. The flow solution is advanced without coupling until the averaged quantities are statistically converged. The coupling process for each time step of the LES reads:

1. Advance the LES solution.
2. Calculate the source terms from instantaneous and averaged quantities.
3. Advance the CAA solution.

The actual coupling takes place via the source terms computed from the LES solution, which are then used to solve the APE. This means that there is a one-way coupling from the flow solution to the acoustic field, while the flow solution is not influenced by the acoustic field.

In the direct-hybrid method described here, the LES and the CAA simulation are both performed within a single simulation framework and by using the same grid topology. This makes certain aspects of the coupling process more efficient and allows a more fine-grained control over the interface between the two solvers. In the following, some details of the method are presented.

4.1 Spatial Coupling

The instantaneous variables of the source term $\varvec{q}_m$ are available after each time step from the flow simulation. They have to be transferred, however, from the LES to the acoustic grid. Since both simulations typically operate on different levels of the same grid, identification of corresponding cells is possible by traversing the octree constituting the hierarchical Cartesian mesh. While LES and CAA leaf cells can generally be of different size, the coupling always happens within a single subtree. Since the domain decomposition algorithm distributes entire subtrees on different processes (see also Sect. 3.1), no additional inter-rank communication is required for the exchange of data between CFD and CAA cells.

This type of mesh also guarantees that there are no partially overlapping cells, i.e., a smaller cell is always fully contained inside a larger cell. Note that the DG elements are generally of higher order than the finite-volume cells. Depending on the resolution of the fluid and acoustics problems, four types of transformations are possible.

In the simplest case, one fluid cell corresponds exactly to one acoustics cell (Fig. 4a). That is, the source term is calculated once in the finite-volume part and the same value is used at all Gauss nodes of the DG element. This approach is used exclusively in the present work, i.e., no spatial interpolation is performed. Similar to the one-to-one mapping, the source term is calculated once and then used at all Gauss nodes of all elements if one fluid cell is mapped to multiple acoustics cells (Fig. 4b).
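The one-to-one and one-to-many mappings amount to copying a single cell value to every Gauss node of the associated elements. The sketch below shows this under the assumption that each DG element is identified by an id and holds $(N+1)^d$ nodes; the function name and data layout are hypothetical.

```python
def map_source_to_elements(fv_source, element_ids, poly_degree, dim):
    """Copy one finite-volume source value to all Gauss nodes of each element.

    `fv_source` is the source term of a single flow cell (e.g. a tuple of
    vector components); `element_ids` lists the DG elements covered by it.
    """
    nodes_per_element = (poly_degree + 1) ** dim
    return {eid: [fv_source] * nodes_per_element for eid in element_ids}


# One flow cell feeding a single 2D element with N = 3 (one-to-one case).
one_to_one = map_source_to_elements((0.1, -0.2), ["e0"], poly_degree=3, dim=2)
# The same flow cell feeding four child elements (one-to-many case).
one_to_many = map_source_to_elements((0.1, -0.2), ["c0", "c1", "c2", "c3"], 3, 2)
```

Since no interpolation is involved, the cost of this transfer is linear in the number of DG nodes.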

Having multiple finite-volume cells mapped onto one DG element (Fig. 4c) requires the values at the Gauss nodes to be interpolated from several flow cells. A natural choice would be to interpret the finite-volume cells as equidistant nodes of a polynomial and to obtain the values at the Gauss nodes through projection. This, however, can lead to spurious oscillations if the number of finite-volume cells and thus the polynomial degree is high, especially in regions with large flow gradients. Other possibilities are weighted least squares methods, nearest neighbor interpolation, or inverse distance weighting. Which approach is best depends on a number of factors. A practical consideration is the computational cost of the chosen method, e.g., whether the effort scales linearly with the number of degrees of freedom or worse, since the interpolation has to take place at each flow simulation time step. The smoothness of the interpolated function is also important, especially in high-gradient zones. Furthermore, it is desirable to have a conservative interpolation scheme, such as the one proposed by Farrell and Maddison [8], to avoid distorting the source terms.

If there are regions without either a flow or acoustics grid, no coupling is performed. If only acoustic cells exist, far-field values for the averaged quantities $\bar{c}$ and $\varvec{\bar{u}}$ have to be specified for the APE, e.g., the freestream values from the flow field. The source term $\varvec{q}_m$ is set to zero with a smooth transition from non-zero to zero values.

4.2 Temporal Coupling

The coupling between the flow and the acoustics simulations has to be realized at each time step. Due to the explicit global time stepping it is possible that the time step size differs between the two solvers. In this case, at each time step the source term from the LES solution needs to be interpolated to the simulation time of the CAA solver.

Depending on the features of the geometry, the time step for the aeroacoustics simulation may be smaller than that for the flow simulation or vice versa, and thus the source terms have to be interpolated between two flow time steps. As for the spatial coupling, there are many different interpolation methods to choose from. Linear interpolation is the most straightforward approach, but it can yield inferior results. Several temporal interpolation methods suitable for hybrid aeroacoustics simulations are compared and evaluated by Geiser et al. [10], and least-squares optimized interpolators were found to have the best properties with respect to broadband error reduction.
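For reference, linear interpolation of a source term between two flow time levels can be sketched as follows. This is the straightforward variant mentioned above, not the least-squares optimized interpolators of [10]; the function name and argument layout are assumptions.

```python
def interpolate_source(t, t0, q0, t1, q1):
    """Linearly interpolate source-term components between times t0 and t1.

    `q0` and `q1` hold the components at the two flow time levels; `t` is
    the time at which the acoustics solver needs the source term.
    """
    if not t0 <= t <= t1:
        raise ValueError("t must lie between the two flow time levels")
    if t1 == t0:
        return list(q0)
    weight = (t - t0) / (t1 - t0)
    return [(1.0 - weight) * a + weight * b for a, b in zip(q0, q1)]


# Source term at a quarter of the way between two flow time steps.
q_interp = interpolate_source(0.25, 0.0, [0.0, 2.0], 1.0, [4.0, 2.0])
```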

The simplest approach is using the same time step for both simulations, which requires no interpolation between the two datasets. In this case, the next time step based on the CFL condition is determined for the CFD and the CAA method and the minimum of both methods is used, which is also the procedure that is used in this work.
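Combining the three coupling steps listed at the start of this section with a common time step gives a loop of the following shape. The `StubSolver` class is a hypothetical stand-in that only tracks simulation time and call order; the real solvers, their interfaces, and the CFL estimates are not reproduced here.

```python
class StubSolver:
    """Hypothetical solver stand-in: tracks time and records each advance."""

    def __init__(self, name, cfl_time_step, log):
        self.name = name
        self.cfl_time_step = cfl_time_step
        self.time = 0.0
        self.log = log

    def stable_time_step(self):
        # A real solver would evaluate its CFL condition here.
        return self.cfl_time_step

    def advance(self, dt, source=None):
        self.time += dt
        self.log.append(self.name)


def run_coupled(les, caa, compute_source_terms, t_end):
    """One-way coupled loop: LES step, source terms, CAA step, same dt."""
    steps = 0
    while les.time < t_end:
        dt = min(les.stable_time_step(), caa.stable_time_step())
        dt = min(dt, t_end - les.time)  # do not step past the end time
        les.advance(dt)
        source = compute_source_terms(les)
        caa.advance(dt, source)
        steps += 1
    return steps


log = []
les = StubSolver("LES", 0.5, log)
caa = StubSolver("CAA", 0.25, log)


def compute_source_terms(flow_solver):
    log.append("source")
    return 0.0  # placeholder value for the illustration


steps = run_coupled(les, caa, compute_source_terms, t_end=1.0)
```

In this example the acoustics solver has the smaller stable time step, so both solvers take four steps of size 0.25 and remain synchronized without any temporal interpolation.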

4.3 Data Transfer

There exist two options for transferring data between the flow solution and the acoustics solution: via data files written to disk, i.e., offline coupling as used in standard hybrid approaches, or through in-memory data access, i.e., online coupling as done in the new direct-hybrid approach. Both methods are discussed in the following.

In offline coupling, the processes of obtaining the flow solution and running the aeroacoustics simulation are completely separated. At first, the flow solution is obtained and the source term $\varvec{q}_m$ is written to a file at certain time intervals. During the acoustics simulation, the source terms are determined from the files by interpolation in time. Conceptually, this is the simplest approach, since except for the I/O routines nothing has to be changed inside the two simulations. However, the large amount of data that has to be transferred to and from the disk makes this method expensive in terms of computational cost, especially for large-scale simulations on thousands of cores. Nevertheless, it is also the first step towards a simulation which makes use of online coupling as outlined next.

In online coupling, the flow and the acoustics simulations are fully integrated and run synchronously. Typically, the flow solution will be advanced by one time step and the acoustics solution has to be updated until they are both synchronized. Since no files have to be written to disk, this approach is more efficient than offline coupling. If the acoustics cells are kept on the same computational core as the corresponding flow cells, the acoustics simulation can directly access the relevant information by simple memory transfer operations. This locality of data is achieved by the specific subdomain decomposition, which operates on the joint LES-CAA grid. On the other hand, the increased memory consumption makes it necessary to use more computational cores. Furthermore, due to the different number of operations for the finite-volume and the DG operator, paired with different numbers of flow cells per acoustics cell, load balancing between the cores becomes mandatory to achieve reasonable parallel efficiency. This is accomplished by assigning appropriate loads to the fluid and acoustics cells.

5 Results

The CFD solver has already been extensively tested and used in the past, e.g., in [13, 15, 16, 23, 28]. Thus in Sect. 5.1, only the new CAA solver is validated. Additionally, parallel performance results for the CAA solver are presented in Sect. 5.2.

5.1 Validation of the Aeroacoustics Solver

The DG method described in Sect. 3.2 is validated by solving the acoustic perturbation equations for several generic problems. It is demonstrated that the solver is able to correctly predict the acoustic pressure field for sheared mean flow, for acoustic reflection at a solid wall, and for sound waves emanating from a boundary layer.

5.1.1 Monopole in Sheared Mean Flow

Figure 5 shows the results for wave propagation in a sheared mean flow. The example was chosen since mixing layer-type flow configurations with sheared mean flow are typical for noise generation, e.g., for turbulent jets. An S-shaped velocity profile is prescribed for the mean velocity,

$$\begin{aligned} \bar{\varvec{u}} = \frac{1}{2} \tanh \left( \frac{2 y}{\delta _w}\right) , \end{aligned}$$

(19)

where the shear-layer thickness is set to $\delta _w = 50$ and an analytical source term is used to generate an acoustic monopole [6]. The domain was discretized using $200 \times 200$ elements with a polynomial degree $N=3$. Figure 5 shows the result in comparison to the perturbed pressure field obtained in [6] from the linearized Euler equations (LEE). It can be seen that the DG results agree well with the reference solution.

5.1.2 Acoustic Reflection at a Solid Wall

A pressure pulse impinging on a plane wall in the presence of a uniform mean flow was simulated to validate the wall boundary conditions. The wall is located at $y=0$ and the initial conditions at time $t=0$ are

$$\begin{aligned} u' = v' = 0,\qquad p' = \exp \left\{ - (\ln 2) \frac{x^2 + (y-25)^2}{25} \right\} . \end{aligned}$$

(20)

The mean flow is prescribed parallel to the wall by setting $\bar{u} = 0.5$, $\bar{v} = 0.0$. Both the setup and the analytical values are taken from [14]. The square-shaped computational domain with side length $l = 200$ was discretized using 256 elements in each spatial direction with a polynomial degree of $N=5$. In Fig. 6, results for the acoustic pressure field of the reflected pulse are shown. They confirm that the CAA solver is able to correctly predict the reflection of acoustic waves from a solid wall.

5.1.3 Monopole in a Boundary Layer

In this case, a plane sound wave is assumed to travel through a small channel and to exit through a small orifice in a plane wall. Due to the small size of the channel, the emanating wave is an approximation for a singular monopole at the wall [3]. The domain is defined by $x \in [-25.6, 25.6]$ in the x-direction and $y \in [0.0, 20.0]$ in the y-direction, and it is discretized by $400,\!000$ elements with a polynomial degree of $N=3$. The monopole has a size of $\epsilon = 0.1$ and is located at the origin. It is created by enforcing a sinusoidal boundary state by setting

$$\begin{aligned} u' = 0, \qquad v' = p' = \sin (2\pi t). \end{aligned}$$

(21)

In addition to the monopole at the wall, a non-zero mean velocity is prescribed, which decreases to zero in the boundary layer region:

$$\begin{aligned} \bar{u} = {\left\{ \begin{array}{ll} M_x (2 y - 2 y^3 + y^4), &{} \text {if } 0 \le y \le 1,\\ M_x, &{} \text {if } y > 1, \end{array}\right. }\qquad \bar{v} = 0, \end{aligned}$$

(22)

where the Mach number is set to $M_x = 0.3$. Figure 7 shows a contour plot of the resulting pressure field. In Fig. 8 the results are compared to those in [3]. The DG-CAA solution is virtually indistinguishable from the reference solution, which demonstrates that the DG-APE method is able to adequately capture the refraction and reflection of sound waves in flow fields with velocity gradients, both in the channel region at $\theta < 10^\circ $ and in the shadow region at $140^\circ< \theta < 180^\circ $.

5.2 Parallel Performance Analysis

To assess the parallel performance of the newly developed aeroacoustics solver, a strong scaling experiment with two setups was performed on HPC systems. In each setup, the three-dimensional domain is cube-shaped. To obtain meaningful error measures, a manufactured solution approach was used, i.e., an auxiliary source term was added to the system of equations such that the analytical initial conditions, which are based on trigonometric functions, fulfill the system of equations exactly. In the first setup, a grid with 16.8 million cells and a polynomial degree $N = 3$ was used (low-order case). For the second setup, the number of cells was reduced to 2.1 million and the polynomial degree was set to $N = 7$ (high-order case). This yields the same global number of degrees of freedom for both cases (1.1 billion). The setups were chosen to be representative of typical large-scale aeroacoustics simulations under realistic conditions.
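The equality of the degree-of-freedom counts can be checked directly, since each three-dimensional element of degree $N$ carries $(N+1)^3$ nodes. The sketch assumes that the quoted cell counts are the powers of two $256^3$ and $128^3$, which round to 16.8 million and 2.1 million; the exact grid sizes are not stated in the text.

```python
def degrees_of_freedom(num_cells, poly_degree, dim=3):
    """Number of solution nodes: cells times (N + 1)**dim nodes per cell."""
    return num_cells * (poly_degree + 1) ** dim


# Assumed grid sizes that round to the quoted 16.8 and 2.1 million cells.
low_order = degrees_of_freedom(256 ** 3, poly_degree=3)   # N = 3
high_order = degrees_of_freedom(128 ** 3, poly_degree=7)  # N = 7
```

Under this assumption both setups have $2^{30} \approx 1.07 \cdot 10^9$ nodes, consistent with the stated 1.1 billion degrees of freedom: halving the cell count per direction is offset exactly by doubling the number of nodes per direction.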

Figure 9 shows the strong scaling results for both setups on two state-of-the-art supercomputers, i.e., the Cray XC 40 of the High-Performance Computing Center Stuttgart and the BlueGene/Q of the Forschungszentrum Jülich. On both machines, the simulations were executed with one MPI rank per core and two OpenMP threads per rank. On the Cray system, the low-order case reaches a parallel efficiency of $79\,\%$ on $93,\!600$ cores, and the high-order case reaches $98\,\%$. On the BlueGene/Q, the efficiency for the low-order case on the full machine is $80\,\%$. From these results, it can be concluded that the CAA solver is highly scalable and well suited for large-scale aeroacoustics simulations. Furthermore, the comparison of the two setups on the Cray XC 40 shows that using a higher-order approximation in the DG scheme benefits the parallel efficiency.
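To put the largest Cray run into perspective, the average load per core can be estimated from the numbers given above. This is a rough sketch that assumes a perfectly even distribution of cells over the $93,\!600$ cores and, as before, $(N+1)^3$ nodes per cell; the function name `load_per_core` is illustrative.

```python
CRAY_CORES = 93_600  # largest core count reported for the Cray XC 40


def load_per_core(n_cells, poly_degree, n_cores, n_dims=3):
    """Average cells and degrees of freedom per core for an even split."""
    cells = n_cells / n_cores
    dofs = cells * (poly_degree + 1) ** n_dims
    return cells, dofs


for label, n_cells, degree in (("low-order", 16_800_000, 3),
                               ("high-order", 2_100_000, 7)):
    cells, dofs = load_per_core(n_cells, degree, CRAY_CORES)
    print(f"{label:10s}: {cells:6.1f} cells/core, {dofs:8.1f} DOFs/core")
```

Under these assumptions, both setups carry roughly 11,500 degrees of freedom per core, while the high-order case distributes them over about 22 cells per core instead of about 180.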

To highlight the necessity of developing a new coupling approach for hybrid CFD-CAA simulations, another scaling experiment was conducted. In this case, a CAA simulation of a two-dimensional mixing layer was performed with offline coupling, i.e., the source term information was read from data files [27]. Figure 10 shows the speedup and the absolute wall-clock time for a single-threaded scaling from 32 to $4,\!096$ cores. The left figure shows the speedup twice: once for the overall simulation, which reaches an efficiency of $61\,\%$ at $4,\!096$ cores, and once excluding the I/O time, i.e., the time spent reading the source term data from disk, for which the efficiency improves to $92\,\%$. The reason for this behavior can be understood when looking at the wall-clock time for computation and I/O separately (see right figure): while the time for computation continuously decreases when using higher core counts, the curve for I/O time flattens out when going from $2,\!048$ to $4,\!096$ cores. This means that the I/O component ceases to scale beyond a certain number of cores, effectively turning the I/O into a bottleneck for the overall simulation.
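The two efficiencies can be translated into speedups, and together they bound the share of the run time spent on I/O. The sketch below assumes that efficiency is measured relative to the 32-core run and that the wall-clock time is the sum of computation and I/O; under these assumptions the I/O share at $4,\!096$ cores is at least $1 - E_\text{total}/E_\text{compute}$. The function names are illustrative.

```python
def speedup_from_efficiency(efficiency, ref_cores, n_cores):
    """Speedup relative to the reference run for a given parallel efficiency."""
    return efficiency * n_cores / ref_cores


def min_io_share(eff_total, eff_compute):
    """Lower bound on the I/O fraction of the wall-clock time.

    Assumes wall time = compute time + I/O time and that both efficiencies
    refer to the same pair of core counts.
    """
    return 1.0 - eff_total / eff_compute


REF_CORES, MAX_CORES = 32, 4096

print(speedup_from_efficiency(0.61, REF_CORES, MAX_CORES))  # overall, ideal is 128
print(speedup_from_efficiency(0.92, REF_CORES, MAX_CORES))  # computation only
print(min_io_share(0.61, 0.92))                             # about one third
```

With the values quoted above, the overall speedup is about 78 out of an ideal 128, compared to about 118 for the computation alone, and at least about a third of the wall-clock time at $4,\!096$ cores is attributable to I/O.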

The degradation of the parallel efficiency for offline-coupled simulations due to I/O performance limits can be further substantiated by examining the I/O bandwidth on current HPC systems. In Fig. 11, the measured maximum write speed for a single 63 GiB file is shown at increasing numbers of cores. The numbers were obtained with the Parallel netCDF library [20] using collective I/O and one MPI rank per core. On both machines, i.e., a Cray XC 40 with a Lustre file system (left figure) and a BlueGene/Q with a GPFS file system (right figure), the I/O bandwidth peaks at a certain number of cores and actually decreases for higher core counts. These results strongly suggest the need for an online coupling approach, where the CFD and the CAA solvers do not have to rely on the file I/O system to exchange data.
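The effect of a saturating I/O bandwidth on the overall efficiency can be illustrated with a toy model. All numbers in the sketch below except the 63 GiB file size are hypothetical and chosen only for illustration; they are not measurements from Fig. 11. The model assumes perfectly scaling computation and an aggregate bandwidth that grows linearly with the core count up to a fixed cap, which is more optimistic than the measured behavior, where the bandwidth decreases again at high core counts.

```python
FILE_GIB = 63.0          # file size quoted in the text
REF_CORES = 32           # reference core count (hypothetical choice)
COMPUTE_REF_S = 1000.0   # hypothetical compute time on REF_CORES, in seconds


def capped_bandwidth(n_cores, per_core_gib_s=0.01, cap_gib_s=10.0):
    """Hypothetical aggregate bandwidth in GiB/s: linear growth up to a cap."""
    return min(per_core_gib_s * n_cores, cap_gib_s)


def wall_time(n_cores):
    """Ideally scaling compute time plus the time to transfer the file once."""
    compute = COMPUTE_REF_S * REF_CORES / n_cores
    io = FILE_GIB / capped_bandwidth(n_cores)
    return compute + io


def efficiency(n_cores):
    """Strong scaling efficiency relative to the REF_CORES run."""
    return wall_time(REF_CORES) * REF_CORES / (wall_time(n_cores) * n_cores)


for p in (32, 1024, 2048, 4096):
    print(f"{p:5d} cores: efficiency = {efficiency(p):.3f}")
```

Once the bandwidth cap is reached, the I/O time stays constant while the compute time keeps shrinking, so the efficiency drops with every further doubling of the core count even though the computation itself scales ideally in this model.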

6 Conclusions

A direct-hybrid method suitable for large-scale aeroacoustic simulations has been presented. The flow field is predicted using an LES solver based on the finite-volume method. For the CAA solution, a nodal DG method is used to solve the acoustic perturbation equations for the determination of the acoustic field. In the novel approach, both solvers use the same hierarchical Cartesian grid, enabling an efficient data exchange between the two solvers. Appropriate strategies for the spatial and temporal coupling are described.

The CAA method is shown to correctly predict the acoustic pressure field for a monopole in sheared mean flow, acoustic reflection at a solid wall, and a monopole in a boundary layer. In addition, the parallel performance of the new scheme is investigated in several strong scaling experiments. They show that the new DG-CAA solver is capable of efficiently running simulations on hundreds of thousands of cores. Furthermore, while the hybrid method with offline coupling, which involves disk I/O, scales well up to a 128-fold increase in MPI ranks, the I/O operations necessary for reading the source terms from disk are identified as a bottleneck towards extreme scaling. This observation is further corroborated by an analysis of the I/O bandwidth on two current HPC systems, which emphasizes the need for the online coupling approach.

Overall, the proposed direct-hybrid method has been shown to be a good candidate for efficient, highly parallel CAA simulations. As a next step, spatial as well as temporal interpolation schemes need to be investigated to relax the resolution requirements in space and time. A dynamic load balancing scheme will be developed to further improve the parallel performance for moving geometries.

References

1. Atkins, H.L.: Continued development of the discontinuous Galerkin method for computational aeroacoustic applications. AIAA Paper (97-1581) (1997)
2. Bassi, F., Rebay, S.: A high-order accurate discontinuous finite element method for the numerical solution of the compressible Navier-Stokes equations. J. Comput. Phys. 131(2), 267–279 (1997)
3. Bauer, M., Dierke, J., Ewert, R.: Application of a discontinuous Galerkin method to discretize acoustic perturbation equations. AIAA J. 49(5), 898–908 (2011)
4. Bui, T.Ph., Schröder, W., Meinke, M.: Numerical analysis of the acoustic field of reacting flows via acoustic perturbation equations. Comput. Fluids 37(9), 1157–1169 (2008)
5. Carpenter, M.H., Kennedy, C.: Fourth-order 2N-storage Runge-Kutta schemes. NASA Report TM 109112, NASA Langley Research Center (1994)
6. Ewert, R., Schröder, W.: Acoustic perturbation equations based on flow decomposition via source filtering. J. Comput. Phys. 188, 365–398 (2003)
7. Ewert, R., Schröder, W.: On the simulation of trailing edge noise with a hybrid LES/APE method. J. Sound Vibr. 270(3), 509–524 (2004)
8. Farrell, P., Maddison, J.: Conservative interpolation between volume meshes by local Galerkin projection. Comput. Meth. Appl. Mech. Eng. 200(1–4), 89–100 (2011)
9. Flad, D., Frank, H., Beck, A.D., Munz, C.D.: A discontinuous Galerkin spectral element method for the direct numerical simulation of aeroacoustics. AIAA Paper (2014-2740) (2014)
10. Geiser, G., Marinc, D., Schröder, W.: Comparison of source reconstruction methods for hybrid aeroacoustic predictions. International Journal of Aeroacoustics 12(7–8), 639–662 (2014)
11. Geiser, G., Schlimpert, S., Schröder, W.: Thermoacoustical noise induced by laminar flame annihilation at varying flame thicknesses. In: 18th AIAA/CEAS Aeroacoustics Conference (33rd AIAA Aeroacoustics Conference), 04–06 June 2012, Colorado Springs, CO, AIAA 2012–2093 (2012)
12. Gröschel, E., Schröder, W., Renze, P., Meinke, M., Comte, P.: Noise prediction for a turbulent jet using different hybrid methods. Comput. Fluids 37(4), 414–426 (2008)
13. Günther, C., Meinke, M., Schröder, W.: A flexible level-set approach for tracking multiple interacting interfaces in embedded boundary methods. Comput. Fluids 102, 182–202 (2014)
14. Hardin, J., Ristorcelli, J.R., Tam, C.K.W. (eds.): ICASE/LaRC Workshop on Benchmark Problems in Computational Aeroacoustics (CAA), vol. NASA Conference Publication 3000. NASA (1995)
15. Hartmann, D., Meinke, M., Schröder, W.: An adaptive multilevel multigrid formulation for Cartesian hierarchical grid methods. Comput. Fluids 37, 1103–1125 (2008)
16. Hartmann, D., Meinke, M., Schröder, W.: A strictly conservative Cartesian cut-cell method for compressible viscous flows on adaptive grids. Comput. Meth. Appl. Mech. Eng. 200, 1038–1052 (2011)
17. Hindenlang, F., Gassner, G.J., Altmann, C., Beck, A., Staudenmaier, M., Munz, C.D.: Explicit discontinuous Galerkin methods for unsteady problems. Comput. Fluids 61, 86–93 (2012)
18. Koh, S., Schröder, W., Meinke, M.: Turbulence and heat excited noise sources in single and coaxial jets. J. Sound Vibr. 329, 786–803 (2010)
19. Kopriva, D., Woodruff, S., Hussaini, M.: Discontinuous spectral element approximation of Maxwell’s equations. In: B. Cockburn, G. Karniadakis, C.W. Shu (eds.) Proceedings of the International Symposium on Discontinuous Galerkin Methods. Springer (2000)
20. Li, J., Zingale, M., Liao, W.k., Choudhary, A., Ross, R., Thakur, R., Gropp, W., Latham, R., Siegel, A., Gallagher, B.: Parallel netCDF: a high-performance scientific I/O interface. In: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing - SC ’03, p. 39. ACM Press, New York, USA (2003)
21. Lintermann, A., Schlimpert, S., Grimmen, J.H., Günther, C., Meinke, M., Schröder, W.: Massively parallel grid generation on HPC systems. Comput. Meth. Appl. Mech. Eng. 277, 131–153 (2014)
22. Liu, J.G., Shu, C.W.: A high-order discontinuous Galerkin method for 2D incompressible flows. J. Comput. Phys. 160(2), 577–596 (2000)
23. Pogorelov, A., Meinke, M., Schröder, W.: Cut-cell method based large-eddy simulation of tip-leakage flow. Phys. Fluids 27(7), 075106 (2015)
24. Reed, W., Hill, T.: Triangular mesh methods for the neutron transport equation. Tech. Rep. LA-UR-73-479, Los Alamos Scientific Laboratory (1973)
25. Directorate-General for Research, Innovation European Union: Flightpath 2050: Europe’s Vision for Aviation: Maintaining Global Leadership and Serving Society’s Needs. Office for Official Publications of the European Communities (2011)
26. Sagan, H.: Space-filling curves, 1st edn. In: Universitext. Springer, New York (1994)
27. Schlottke, M., Cheng, H.J., Lintermann, A., Meinke, M., Schröder, W.: A direct-hybrid method for computational aeroacoustics. In: AIAA Aviation, 22–26 June 2015, Dallas, TX, 21st AIAA/CEAS Aeroacoustics Conference, AIAA-2015–3133 (2015)
28. Schneiders, L., Hartmann, D., Meinke, M., Schröder, W.: An accurate moving boundary formulation in cut-cell methods. J. Comput. Phys. 235, 786–809 (2013)
29. Yakovlev, S., Xu, L., Li, F.: Locally divergence-free central discontinuous Galerkin methods for ideal MHD equations. J. Comput. Sci. 4(1–2), 80–91 (2013)

Acknowledgements

This work has been performed with support from the JARA-HPC SimLab Fluids & Solids Engineering of the RWTH Aachen University, Germany, and the Forschungszentrum Jülich, Germany. The authors gratefully acknowledge the allocation of supercomputing time as well as the technical support by the High-Performance Computing Center Stuttgart of the University of Stuttgart, Germany, and by the Jülich Supercomputing Centre of the Forschungszentrum Jülich, Germany.

Author information

Authors and Affiliations

Jülich Aachen Research Alliance - High Performance Computing, RWTH Aachen University, Aachen, Germany
Michael Schlottke-Lakemper & Andreas Lintermann
Institute of Aerodynamics, RWTH Aachen University, Aachen, Germany
Fabian Klemp, Hsun-Jen Cheng, Matthias Meinke & Wolfgang Schröder

Corresponding author

Correspondence to Michael Schlottke-Lakemper.

Editor information

Editors and Affiliations

Höchstleistungsrechenzentrum, Universität Stuttgart, Stuttgart, Baden-Württemberg, Germany
Michael M. Resch
Europe GmbH, NEC High Performance Computing Europe GmbH, Düsseldorf, Nordrhein-Westfalen, Germany
Wolfgang Bez
Europe GmbH, NEC High Performance Computing Europe GmbH, Stuttgart, Germany
Erich Focht
High Performance Computing, University of Stuttgart, Stuttgart, Germany
Nisarg Patel
Cyberscience Center, Tohoku University, Sendai, Japan
Hiroaki Kobayashi

About this paper

Cite this paper

Schlottke-Lakemper, M., Klemp, F., Cheng, HJ., Lintermann, A., Meinke, M., Schröder, W. (2016). CFD/CAA Simulations on HPC Systems. In: Resch, M., Bez, W., Focht, E., Patel, N., Kobayashi, H. (eds) Sustained Simulation Performance 2016. Springer, Cham. https://doi.org/10.1007/978-3-319-46735-1_12

DOI: https://doi.org/10.1007/978-3-319-46735-1_12
Published: 02 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46734-4
Online ISBN: 978-3-319-46735-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
