1 Introduction and motivation

Over several decades, significant progress has been made towards efficient and accurate high-fidelity numerical simulation of physically complex fluid flows. Developments in computational fluid dynamics (CFD) and advances in high-performance parallel computing have together enabled the solution of a wide range of fluid dynamics problems with fundamental and practical applications in many areas of science and engineering. Nevertheless, the high-fidelity prediction of compressible viscous flows remains a considerable computational challenge, as the resources required to accurately resolve the features of such detailed and often intricate flows can be very large. In particular, the large disparity of physical length scales associated with unsteady high-speed flows containing thermal and viscous boundary layers and/or shock waves and their reflections has prompted researchers to devise improved numerical methods that can accurately predict the complex structures of such flows. This continued demand for reliable and robust high-fidelity numerical methods capable of accurately and efficiently treating unsteady compressible viscous flow problems with shocks is the driving motivation behind this research.

The unsteady reflection of shock waves at oblique incidence to a ramp is of concern here. For a planar shock wave of strength \(M_\mathrm{s}\) propagating in a gas and striking a rigid, non-porous inclined surface at an angle of \(\theta _\mathrm{w}\) to the direction of the flow field, a variety of multi-shock reflection configurations can result that are pertinent to the validation of the proposed parallel, fully implicit, anisotropic block-based AMR finite-volume framework. A two-shock regular reflection pattern can occur when the incident (S\(_\text {i}\)) and reflected (S\(_\text {r}\)) shock waves are connected through a reflection point on the surface of the wedge, as illustrated in Fig. 1a. Three- or four-shock Mach reflection configurations, presented in Fig. 1b, c, respectively, can occur when the reflection point moves gradually off and away from the surface of the wedge, becoming a triple point that connects the incident and reflected shock waves to an emerging shock wave known formally as the Mach stem (S\(_\text {m}\)), and leaves behind it a slipstream that separates two different regions of the flow. In general, for a given value of \(M_\mathrm{s}\), a two-shock regular reflection pattern will occur at large values of \(\theta _\mathrm{w}\), whereas a more complex multi-shock Mach reflection pattern will occur when \(\theta _\mathrm{w}\) is small.

Fig. 1

Unsteady oblique shock reflection patterns arising from the interaction of a planar, rightward-moving, incident shock wave with a rigid, non-porous, inclined wedge

The numerical simulation of two-dimensional, unsteady, oblique shock wave reflection phenomena has a rather long history going back more than thirty years. Early research focused on the prediction of such flows using Godunov-type finite-volume solution methods [1] for the Euler equations governing inviscid compressible flows on simple uniform Cartesian meshes. This early research includes the studies of Colella and Glaz [2], Woodward and Colella [3] and Glaz et al. [4, 5]. Subsequent work by Colella and Glaz [6] as well as Colella and Henderson [7] considered the application of patch-based mesh adaptation techniques on a Cartesian mesh with localized mesh refinement in designated regions of the flow field. Shortly thereafter, other possibly more general solution-directed adaptive mesh refinement (AMR) schemes were proposed and developed by Fursenko, Timofeev, Voinovich and co-researchers [8–12], Sun and Takayama [13] and Henderson et al. [14] for compressible, inviscid, unsteady flow applications dealing with shock waves and their complex interactions.

More recently, the extension of upwind-based finite-volume schemes with AMR to the solution of the Navier–Stokes equations governing unsteady, compressible, viscous flows with shocks has permitted researchers to examine the significance of molecular transport properties on the behavior of a range of simple and complex compressible flow phenomena. For example, Colella and co-researchers [15, 16], Timofeev, Ofengeim, Voinovich and co-researchers [17–20], as well as Henderson and co-researchers [21, 22] have all considered and/or proposed AMR schemes for the solution of viscous flows associated with unsteady oblique reflections of shock waves. Additionally, Graves et al. [23] have proposed a Cartesian mesh AMR scheme for the solution of the compressible Navier–Stokes equations with an embedded boundary treatment.

In spite of these successes, AMR gridding strategies have to date not permitted high-fidelity numerical solutions of viscous, unsteady flows with shock waves in which the internal shock structure is fully resolved over a wide range of Reynolds numbers. Apart from the numerical investigations of Henderson et al. [24], as well as Ivanov and co-researchers [25, 26], who each used very fine, uniform computational meshes to obtain fully resolved, unsteady and steady computations, respectively, there is a scarcity of published numerical studies with fully resolved internal shock structures. In turn, this has not allowed a full evaluation of the effects of micro-scale molecular transport on oblique reflection processes.

In the present study, the anisotropic block-based AMR finite-volume scheme of Zhang and Groth [27] is extended to solutions of two-dimensional, laminar, compressible, viscous flows governed by the Navier–Stokes equations, and the spatial discretization scheme is coupled with a parallel, fully implicit, time-marching scheme based on Newton’s method [28–30]. The former mitigates the inherently large computational memory and storage requirements associated with the use of the very fine spatial resolution needed for fully resolved viscous simulations of shocks, whereas the latter provides unconditional stability of the algorithm and the freedom to select the physical time step for unsteady shock reflection problems based solely on a consideration of solution accuracy, not stability constraints. Details of the proposed parallel finite-volume AMR scheme are given and the benefits, capabilities, and parallel performance of the method are demonstrated for unsteady oblique shock reflection problems in which the internal shock structure is fully resolved.

The remainder of the paper is structured as follows. In Sect. 2, the governing conservation equations for laminar, compressible, viscous, fluid flows are reviewed. In Sect. 3, details of the proposed finite-volume spatial discretization method are given. This is followed in Sects. 4, 5, and 6 by descriptions of the anisotropic AMR scheme, implicit time-marching scheme, and parallel implementation used herein, respectively. Numerical verification and validation of the computational framework are reported, alongside previously published experimental results, in Sect. 7. Finally, numerical solutions of fully resolved oblique shock reflection simulations are presented in Sect. 8. Section 9 provides a brief summary of the findings from this study.

2 Governing conservation equations

A parallel implicit AMR finite-volume scheme is considered herein for the solution of the Navier–Stokes equations governing two-dimensional, laminar, compressible, viscous, unsteady, gaseous flows. As the influence of the unsteady transition to turbulence and turbulent flow on the shock reflection process, which should occur far from the vicinity of the confluent shocks and slipstreams, is expected to be minimal, the assumption of laminar flow is deemed sufficient for the present simulations. The conservation form of the Navier–Stokes equations can be expressed using matrix-vector notation as

$$\begin{aligned} \frac{\partial \mathbf {U}}{\partial t} + {\vec {\nabla }}\!\cdot {\vec {\mathbf {H}}} = 0, \end{aligned}$$
(1)

which, for a two-dimensional Cartesian coordinate system (x, y), can be written as

$$\begin{aligned} \frac{\partial \mathbf {U}}{\partial t} + \frac{\partial \mathbf {F}}{\partial x} + \frac{\partial \mathbf {G}}{\partial y} = \frac{\partial \mathbf {F_v}}{\partial x} + \frac{\partial \mathbf {G_v}}{\partial y}, \end{aligned}$$
(2)

where \({\vec {\mathbf {H}}} = (\mathbf {F} - \mathbf {F_v}, \mathbf {G} - \mathbf {G_v})\) is the total solution flux dyad. In Eqs. (1) and (2), the vector of conserved variables, \(\mathbf {U}\), the inviscid flux vectors, \(\mathbf {F}\) and \(\mathbf {G}\), and the viscous flux vectors, \(\mathbf {F_v}\) and \(\mathbf {G_v}\), are given by

$$\begin{aligned}&\mathbf {U} = \left[ \begin{array}{c} \rho , \rho u, \rho v, \rho E \end{array} \right] ^\text {T}, \end{aligned}$$
(3)
$$\begin{aligned}&\mathbf {F} = \left[ \begin{array}{c} \rho u, \rho u^2 + p, \rho u v, \rho u \left( E + \frac{p}{\rho }\right) \end{array} \right] ^\text {T}, \end{aligned}$$
(4)
$$\begin{aligned}&\mathbf {G} = \left[ \begin{array}{c} \rho v, \rho u v, \rho v^2 + p, \rho v \left( E + \frac{p}{\rho }\right) \end{array} \right] ^\text {T}, \end{aligned}$$
(5)
$$\begin{aligned}&\mathbf {F_v} = \left[ \begin{array}{c} 0, \tau _{xx}, \tau _{xy}, -q_x + u \tau _{xx} + v \tau _{xy} \end{array}\right] ^\text {T}, \end{aligned}$$
(6)
$$\begin{aligned}&\mathbf {G_v} = \left[ \begin{array}{c} 0, \tau _{xy}, \tau _{yy}, -q_y + u \tau _{xy} + v \tau _{yy} \end{array}\right] ^\text {T}, \end{aligned}$$
(7)

respectively, and t is the physical time. The conserved solution variables are expressed in terms of the fluid density, \(\rho \), the x- and y-direction velocity components, u and v, respectively, and the specific total energy, \(E=e + \frac{1}{2}\left( u^2 + v^2 \right) \), where e is the specific internal energy. The solution fluxes involve the fluid pressure, p, the non-zero elements of the viscous stress tensor, \(\tau _{xx}\), \(\tau _{yy}\) and \(\tau _{xy}\), and the x- and y-components of the heat flux vector, \(q_x\) and \(q_y\), respectively. The gas is assumed to behave as a polytropic gas satisfying the ideal-gas equation of state \(p=\rho R T\), where R is the specific gas constant and T is the fluid temperature. The specific internal energy and enthalpy in this case have the forms \(e=c_\mathrm{v} T\) and \(h=c_\mathrm{p} T\), where \(c_\mathrm{v}=R/(\gamma -1)\) and \(c_\mathrm{p}=\gamma R/(\gamma -1)\) are specific heats at constant volume and pressure, respectively, and \(\gamma =c_\mathrm{p}/c_\mathrm{v}\) is the ratio of specific heats. The non-zero elements of the viscous stress tensor are given by

$$\begin{aligned} \tau _{xx}&= \mu \left( \frac{4}{3} \frac{\partial u}{\partial x} - \frac{2}{3} \frac{\partial v}{\partial y} \right) , \end{aligned}$$
(8)
$$\begin{aligned} \tau _{yy}&= \mu \left( \frac{4}{3} \frac{\partial v}{\partial y} - \frac{2}{3} \frac{\partial u}{\partial x}\right) ,\end{aligned}$$
(9)
$$\begin{aligned} \tau _{xy}&= \mu \left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right) , \end{aligned}$$
(10)

where \(\mu \) is the dynamic viscosity of the gas. The x- and y-components of the heat flux vector, \(q_x\) and \(q_y\), respectively, are given by

$$\begin{aligned} q_x = - \kappa \frac{\partial T}{\partial x}, \quad \quad q_y = - \kappa \frac{\partial T}{\partial y}, \end{aligned}$$
(11)

where \(\kappa \) is the thermal conductivity coefficient for heat transfer. Note that both the heat transfer by radiation and the heat addition from the external surrounding environment are ignored in this research. Expressions governing the molecular transport properties, including the dynamic viscosity and thermal conductivity, for pure species and their mixtures follow from the tabulated empirical database and corresponding multi-component mixture formulations outlined by Gordon et al. [31, 32].
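As a minimal sketch of how Eqs. (3)–(11) fit together, the following Python functions assemble the conserved state and the inviscid and viscous flux vectors from primitive variables and their gradients. The function names, the default values of R and \(\gamma \) (approximate values for nitrogen), and the constant \(\mu \) and \(\kappa \) passed in by the caller are assumptions made for illustration only; the tabulated transport database of Gordon et al. is not reproduced here.

```python
def conserved_and_fluxes(rho, u, v, T, R=296.8, gamma=1.4):
    """Return U, F, G of Eqs. (3)-(5) for a polytropic ideal gas.

    R and gamma default to approximate values for N2 (illustrative assumption).
    """
    p = rho * R * T                       # ideal-gas equation of state
    cv = R / (gamma - 1.0)                # specific heat at constant volume
    E = cv * T + 0.5 * (u * u + v * v)    # specific total energy, e + |u|^2 / 2
    H = E + p / rho                       # E + p/rho, as it appears in Eqs. (4)-(5)
    U = [rho, rho * u, rho * v, rho * E]
    F = [rho * u, rho * u * u + p, rho * u * v, rho * u * H]
    G = [rho * v, rho * u * v, rho * v * v + p, rho * v * H]
    return U, F, G


def viscous_fluxes(u, v, grad_u, grad_v, grad_T, mu, kappa):
    """Return F_v, G_v of Eqs. (6)-(7) using Eqs. (8)-(11).

    grad_* are (d/dx, d/dy) pairs; mu and kappa are supplied by the caller
    (hypothetical constant values in this sketch).
    """
    ux, uy = grad_u
    vx, vy = grad_v
    Tx, Ty = grad_T
    txx = mu * (4.0 / 3.0 * ux - 2.0 / 3.0 * vy)   # Eq. (8)
    tyy = mu * (4.0 / 3.0 * vy - 2.0 / 3.0 * ux)   # Eq. (9)
    txy = mu * (uy + vx)                           # Eq. (10)
    qx, qy = -kappa * Tx, -kappa * Ty              # Eq. (11)
    Fv = [0.0, txx, txy, -qx + u * txx + v * txy]
    Gv = [0.0, txy, tyy, -qy + u * txy + v * tyy]
    return Fv, Gv
```

For a gas at rest the inviscid fluxes reduce to the pressure terms alone, which gives a quick consistency check on the momentum components.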

3 Finite-volume spatial discretization

The governing Navier–Stokes equations of Eqs. (1) and (2) are in differential form. The cell-centered, finite-volume, spatial discretization procedure utilized herein is applied to the integral form of these equations, which can be obtained by integrating over a two-dimensional control area, A, in (x, y) space and applying the divergence theorem to a closed path, \(\varGamma \), surrounding this control area. The following integral form of Eq. (1) is then obtained:

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _A \mathbf {U}\,\mathrm {d}A + \oint _\varGamma {\vec {n}}\cdot {\vec {\mathbf {H}}}\,\mathrm {d}\varGamma = 0, \end{aligned}$$
(12)

where \({\vec {n}}\) is the outward unit vector that is normal to the closed contour. Subsequent application of the finite-volume method to Eq. (12) results in the following semi-discrete form of the conservation equations for an arbitrary cell (i, j) of a two-dimensional, multi-block mesh composed of quadrilateral computational cells:

$$\begin{aligned} \frac{\mathrm {d}{\bar{\mathbf{U}}}_{ij}}{\mathrm {d}t} = - \frac{1}{A_{ij}} \sum _{k=1}^{N_f} \left( {\vec {n}}_k \cdot {\vec {\mathbf {H}}} \,\Delta l_k\right) _{ij} = \mathbf {R}_{ij}(\mathbf {U}), \end{aligned}$$
(13)

where \({\bar{\mathbf{U}}}\) is the cell-averaged value of the conserved solution vector given by

$$\begin{aligned} {\bar{\mathbf{U}}} = \frac{1}{A} \int _{A} \mathbf {U}\,\mathrm {d}A. \end{aligned}$$
(14)

In Eq. (13), each quadrilateral cell has \(N_f\!=\!4\) faces, \(\Delta l\) is the length of the cell edge and \(\mathbf {R}\) is the physical time residual vector. Standard mid-point rule quadrature has been used to evaluate the solution fluxes through each cell face k. Limited piecewise linear reconstruction is applied within each cell to ensure solution monotonicity near discontinuities while maintaining second-order accuracy in smooth regions of the flow field. The slope limiter of Venkatakrishnan [33] is employed. The inviscid, or hyperbolic, components of the numerical flux at cell interfaces are evaluated using the approximate Riemann solver of Harten et al. [34] with contributions from Einfeldt [35]. The viscous, or elliptic, components of the numerical flux are calculated using a central scheme with a diamond-path reconstruction technique for determining the solution gradients, as described by Coirier and Powell [36].
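The face summation of Eq. (13) with mid-point quadrature can be sketched for a single quadrilateral cell as follows. This is only an illustration of the geometric bookkeeping: the flux is supplied by a hypothetical callback `flux_at(x, y)` evaluated at each face mid-point, and the limited reconstruction, approximate Riemann solver, and diamond-path gradient evaluation described above are deliberately omitted.

```python
import math


def cell_area(verts):
    """Area of a polygon with counter-clockwise vertices (shoelace formula)."""
    s = 0.0
    for k in range(len(verts)):
        x0, y0 = verts[k]
        x1, y1 = verts[(k + 1) % len(verts)]
        s += x0 * y1 - x1 * y0
    return 0.5 * s


def cell_residual(verts, flux_at):
    """R = -(1/A) * sum_k (n_k . H) dl_k, one mid-point evaluation per face.

    flux_at(x, y) returns (Hx, Hy), the x- and y-components of the total flux,
    each a list with one entry per conserved variable (hypothetical interface).
    """
    A = cell_area(verts)
    total = None
    for k in range(len(verts)):
        (x0, y0), (x1, y1) = verts[k], verts[(k + 1) % len(verts)]
        dl = math.hypot(x1 - x0, y1 - y0)             # face length
        nx, ny = (y1 - y0) / dl, -(x1 - x0) / dl      # outward unit normal (CCW ordering)
        Hx, Hy = flux_at(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        face = [(nx * hx + ny * hy) * dl for hx, hy in zip(Hx, Hy)]
        total = face if total is None else [t + f for t, f in zip(total, face)]
    return [-t / A for t in total]
```

Two simple checks follow from the divergence theorem: a spatially uniform flux gives a zero residual on any closed cell, and a flux with unit divergence, such as \(\vec{H} = (x, 0)\), gives a residual of \(-1\).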

4 Anisotropic block-based AMR

Block-based AMR methods have been developed previously using both Cartesian and body-fitted, multi-block meshes for fluid flows involving a wide variety of complicated physical and chemical phenomena, as well as complex flow geometries, by Berger and co-researchers [37–39], De Zeeuw and Powell [40], Powell et al. [41], Quirk and Hanebutte [42], as well as Groth et al. [43] and Groth and co-researchers [44–48], amongst others. Despite the success of this previous research, one major limitation of these isotropic AMR gridding strategies has been the accurate and efficient treatment of multi-scale anisotropic physics. Recently, Zhang and Groth [27] proposed a treatment that addresses this important challenge by considering a parallel anisotropic block-based AMR method for solutions of a model linear advection–diffusion equation as well as the fully non-linear Euler equations governing two-dimensional, compressible, inviscid, gaseous flows.

The finite-volume scheme outlined above is used in conjunction with the anisotropic block-based AMR scheme of Zhang and Groth [27], which has been extended in this research to include applications to two-dimensional, laminar, compressible, viscous flows governed by the Navier–Stokes equations. Solution of the coupled non-linear ordinary differential equations (ODEs) given by Eq. (13) yields area-averaged solution quantities defined within quadrilateral computational cells. In the proposed multi-block AMR scheme, these cells are embedded in structured, body-fitted grid blocks, and a flexible block-based hierarchical binary tree data structure is used to facilitate automatic and local solution-directed mesh adaptation of the individual grid blocks. The refinement procedure can be performed independently in each of the \(\xi \) and \(\zeta \) local computational coordinate directions for the body-fitted grid block or domain of interest when dealing with strong anisotropic solution features. In regions requiring increased mesh resolution, a single parent block can be partitioned into two children blocks, with each new child block having the same number of cells as its parent block. The resolution in the coordinate direction of refinement is thereby doubled, while remaining unchanged in the other direction. Conversely, coarsening takes place by combining two children blocks into one parent block. This process is elucidated in Fig. 2, where the advantages of anisotropic AMR become apparent, in comparison to a traditional isotropic AMR approach, for dealing with flows exhibiting large solution gradients in one direction but not necessarily in the other. To ensure a smooth variation in the overall solution, mesh refinement ratios are limited to 2:1 between adjacent grid blocks and the minimum mesh resolution of the computational domain is limited to that of the initial, i.e., coarsest, mesh.

Fig. 2

Refinement and coarsening of an \(8 \times 8\) cell block during (i) anisotropic AMR in the \(\xi \)-direction, (ii) anisotropic AMR in the \(\zeta \)-direction and (iii) isotropic AMR
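The binary division of a parent block into two children can be sketched with a small tree node, as below. The `Block` class and its fields are hypothetical and stand in for the hierarchical binary tree data structure only in spirit: each child keeps the cell count of its parent and covers half of the parent's extent in the chosen direction, so the resolution doubles in that direction alone.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical grid block: fixed cell counts over an extent in (xi, zeta)."""
    xi: tuple                      # (xi_min, xi_max)
    zeta: tuple                    # (zeta_min, zeta_max)
    n_xi: int = 8
    n_zeta: int = 8
    children: list = field(default_factory=list)

    def refine(self, direction):
        """Split into two children along 'xi' or 'zeta'; each keeps n_xi x n_zeta cells."""
        if direction not in ("xi", "zeta"):
            raise ValueError("direction must be 'xi' or 'zeta'")
        if self.children:
            raise ValueError("block is already refined")
        lo, hi = getattr(self, direction)
        mid = 0.5 * (lo + hi)
        for span in ((lo, mid), (mid, hi)):
            extent = {"xi": self.xi, "zeta": self.zeta}
            extent[direction] = span
            self.children.append(Block(n_xi=self.n_xi, n_zeta=self.n_zeta, **extent))
        return self.children

    def coarsen(self):
        """Merge the two children back into this parent block."""
        self.children = []

    def cell_size(self):
        """Cell spacing in the xi- and zeta-directions."""
        return ((self.xi[1] - self.xi[0]) / self.n_xi,
                (self.zeta[1] - self.zeta[0]) / self.n_zeta)
```

In this sketch, refining a unit block in \(\xi \) yields two blocks with half the \(\xi \)-spacing and an unchanged \(\zeta \)-spacing. Reaching the same \(\xi \)-resolution isotropically would require refining each child again in \(\zeta \), giving four blocks instead of two.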

At regular intervals during the computation, the coarsening and/or refinement of blocks within the flow field is directed using multiple physics-based refinement criteria. User-defined percentage thresholds are specified to refine blocks with criteria above the refinement threshold and to coarsen blocks with criteria below the coarsening threshold. This technique is implemented to treat flows with disparate spatial and temporal scales and to properly detect important flow features such as shock fronts, triple-shock confluence and reflection points, contact surfaces, as well as both thermal and viscous boundary and shear layers, while limiting the number of necessary computational cells required to accurately resolve these complex flow configurations. For any given solution variable of interest, u, the direction-dependent refinement criteria for anisotropic AMR are based on the measures \({\varepsilon _\xi = \vec {\nabla } u \cdot \Delta \tilde{\mathbf{X }} / |u|}\) and \({\varepsilon _\zeta = \vec {\nabla } u \cdot \Delta \tilde{\mathbf{Y }} / |u|}\) where \(\Delta \tilde{\mathbf{X }}\) and \(\Delta \tilde{\mathbf{Y }}\) are the vector differences between the mid-points of the cell faces in each of the logical coordinate directions. These indicators provide a representative measure of the total solution change across individual cells in each coordinate direction and regulate mesh adaptation in regions containing strong anisotropic characteristics of the flow. In this research, the direction-dependent refinement criteria for anisotropic AMR have been specified using both the gradient of density as well as the gradient of specific entropy, unless stated otherwise, in an effort to ensure that the internal structures of shock waves present in the flow field are properly resolved.
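A short sketch of the direction-dependent measures and the threshold test is given below. Two details are assumptions not stated in the text: the magnitude of the directional change is taken so that the measure is non-negative, and the thresholds are interpreted as fractions of the largest measure found over all blocks. The default threshold values are merely placeholders.

```python
def refinement_measures(grad_u, dX, dY, u):
    """Return (eps_xi, eps_zeta) = |grad(u) . dX| / |u|, |grad(u) . dY| / |u|.

    dX and dY are the vector differences between opposite face mid-points.
    Taking the absolute value of the numerator is an assumption of this sketch.
    """
    eps_xi = abs(grad_u[0] * dX[0] + grad_u[1] * dX[1]) / abs(u)
    eps_zeta = abs(grad_u[0] * dY[0] + grad_u[1] * dY[1]) / abs(u)
    return eps_xi, eps_zeta


def flag_blocks(eps_values, refine_threshold=0.125, coarsen_threshold=0.075):
    """Flag each block from its measure in one direction.

    Measures are scaled by the largest value over all blocks before being
    compared with the thresholds (an assumed interpretation of the
    user-defined percentage thresholds).
    """
    peak = max(eps_values)
    flags = []
    for eps in eps_values:
        scaled = eps / peak if peak > 0.0 else 0.0
        if scaled > refine_threshold:
            flags.append("refine")
        elif scaled < coarsen_threshold:
            flags.append("coarsen")
        else:
            flags.append("keep")
    return flags
```

For a feature whose gradient is aligned with the \(\xi \)-direction, only \(\varepsilon _\xi \) is non-zero, so a block would be flagged for refinement in \(\xi \) while remaining a candidate for coarsening in \(\zeta \).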

5 Implicit time-marching via Newton’s method

A fully implicit Newton–Krylov–Schwarz (NKS) method, as developed by Groth and Northrup [28–30], is utilized to reliably and efficiently integrate the semi-discrete form of the system of the conservation equations given by Eq. (13). This implicit time-marching scheme is particularly well suited for obtaining highly resolved numerical solutions for cases in which the stability limits of an explicit time-marching method would likely result in severe limitations on the maximum allowable physical time step, as dictated by the smallest cells in the mesh. When used in combination with the previously described anisotropic AMR technique, the scheme provides significant computational savings for the calculation of fully resolved oblique shock reflection problems.

5.1 Steady-state computations

Steady-state solutions of Eq. (13) satisfy

$$\begin{aligned} \mathbf {R}(\mathbf {U}) = \frac{\mathrm {d} \mathbf {U}}{\mathrm {d}t} =0, \end{aligned}$$
(15)

the solution of which requires the solution of a large, coupled, non-linear system of algebraic equations. Newton’s method is used here to determine the solution of Eq. (15). Starting with an initial estimate, \(\mathbf {U}^{(0)}\), successively improved estimates of the solution at each iteration level, m, of Newton’s method can be obtained by solving the linear system

$$\begin{aligned} \left( \frac{\partial \mathbf {R}}{\partial \mathbf {U}}\right) ^{(m)} \Delta \mathbf {U}^{(m)} = \mathbf {J}^{(m)} \Delta \mathbf {U}^{(m)} = -\mathbf {R}\left( \mathbf {U}^{(m)}\right) \end{aligned}$$
(16)

where \(\mathbf {J} = \partial \mathbf {R}/\partial \mathbf {U}\) is the Jacobian of the residual vector with respect to the conserved solution vector. Improved approximations of the solution are then given by

$$\begin{aligned} \mathbf {U}^{(m+1)} = \mathbf {U}^{(m)} + \Delta \mathbf {U}^{(m)}, \end{aligned}$$
(17)

and the iterative procedure is repeated until a desired reduction in an appropriate norm of the solution residual vector is achieved, that is, until \(||\mathbf {R}(\mathbf {U}^{(m)})||_2 < \epsilon ||\mathbf {R}(\mathbf {U}^{(0)})||_2\), where \(\epsilon \) is some small convergence tolerance typically in the range of \(\epsilon \approx 10^{-7}\!-\!10^{-5}\) for the steady-state computations presented herein.

Each iteration level in Newton’s method requires the solution of a large, sparse, and non-symmetric system of linear equations given by Eq. (16). This system is of the general form \(\mathbf {J} \mathbf {x} = \mathbf {b}\), where \(\mathbf {x}\) and \(\mathbf {b}\) designate the solution and residual vectors, respectively. To solve for such non-symmetric linear systems, the present algorithm employs a class of Krylov subspace iterative methods known as generalized minimum residual (GMRES) methods [49] with an additive Schwarz global preconditioner. The application of the GMRES method within Newton’s method results in an overall solution algorithm that consists of a nested iterative procedure: inner-loop iterations to determine a solution of the linear system at each Newton step using the GMRES method and outer-loop iterations to solve the non-linear problem using Newton’s method. For improved performance, an inexact Newton method is adopted wherein the GMRES method is only partially converged at each iteration level of Newton’s method, i.e., the inner-loop iterations are deemed complete when \(||\mathbf {R}^{(m)}+\mathbf {J}^{(m)}\Delta \mathbf {U}^{(m)}||_2 \le \zeta ||\mathbf {R}^{(m)}||_2\), where \(\zeta \) is some small convergence tolerance (typically, \(\zeta \approx 0.01\!-\!0.5\) herein).
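The nested structure of the inexact Newton procedure can be illustrated on a small model problem. In the sketch below, a simple Jacobi iteration stands in for the preconditioned GMRES solver (it is not a Krylov method and is used only to keep the example short), and the two-equation residual is a hypothetical test function with a diagonally dominant Jacobian. The inner loop stops once \(||\mathbf {R}+\mathbf {J}\Delta \mathbf {U}||_2 \le \zeta ||\mathbf {R}||_2\), and the outer loop stops once the residual norm has dropped by the factor \(\epsilon \).

```python
import math


def residual(U):
    """Hypothetical two-equation non-linear residual R(U)."""
    u0, u1 = U
    return [4.0 * u0 + u0 ** 3 - u1 - 1.0,
            4.0 * u1 + u1 ** 3 - u0 - 2.0]


def jacobian(U):
    """Analytical Jacobian dR/dU of the model residual."""
    u0, u1 = U
    return [[4.0 + 3.0 * u0 * u0, -1.0],
            [-1.0, 4.0 + 3.0 * u1 * u1]]


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


def inner_solve(J, R, zeta, max_iter=200):
    """Approximately solve J dU = -R; stop when ||R + J dU|| <= zeta ||R||.

    Jacobi iteration is a stand-in for preconditioned GMRES in this sketch.
    """
    dU = [0.0] * len(R)
    target = zeta * norm2(R)
    for it in range(max_iter):
        JdU = [sum(a * b for a, b in zip(row, dU)) for row in J]
        lin_res = [r + j for r, j in zip(R, JdU)]
        if norm2(lin_res) <= target:
            return dU, it
        dU = [dU[i] - lin_res[i] / J[i][i] for i in range(len(R))]
    return dU, max_iter


def inexact_newton(U0, eps=1e-7, zeta=0.1, max_newton=50):
    """Outer Newton loop, Eqs. (16)-(17), with a partially converged inner solve."""
    U = list(U0)
    r0 = norm2(residual(U))
    history = []
    for _ in range(max_newton):
        R = residual(U)
        history.append(norm2(R))
        if history[-1] < eps * r0:
            break
        dU, _ = inner_solve(jacobian(U), R, zeta)
        U = [u + d for u, d in zip(U, dU)]
    return U, history
```

Calling `inexact_newton([0.0, 0.0])` returns the converged state together with the residual history, which can be inspected to see how the choice of \(\zeta \) trades inner-loop work against the number of Newton steps.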

5.2 Dual-time-stepping-like approach for time-accurate computations

For the solution of time-dependent or unsteady problems, such as those encountered in the study of oblique shock reflections, the aforementioned implicit NKS method can be extended by adopting a dual-time-stepping-like procedure [30, 50]. In the implicit dual-time-stepping method, a pseudo time, \(\tau \), and pseudo time derivative of \(\mathbf {U}\) are introduced, resulting in a modified semi-discrete form of the governing equations given by

$$\begin{aligned} \frac{\mathrm {d}\mathbf {U}}{\mathrm {d}\tau } + \mathbf {R}^*(\mathbf {U}) = 0, \end{aligned}$$
(18)

where the vector \(\mathbf {R}^*(\mathbf {U})\) is the dual-time residual given by \(\mathbf {R}^*(\mathbf {U}) = \mathrm {d}\mathbf {U} / \mathrm {d}t + \mathbf {R}(\mathbf {U})\). Steady-state solutions in pseudo time of Eq. (18) are sought by applying an unconditionally stable implicit second-order backward differencing formula (BDF2) to the temporal discretization of the physical time derivative, yielding

$$\begin{aligned} \mathbf {R}^*(\mathbf {U}^{(n+1)}) ={ }&\frac{3\mathbf {U}^{(n+1)} - 4\mathbf {U}^{(n)} + \mathbf {U}^{(n-1)}}{2\Delta t}\nonumber \\&+ \mathbf {R}\left( \mathbf {U}^{(n+1)}\right) = 0. \end{aligned}$$
(19)

Although numerous time-marching schemes are compatible for use in a dual-time stepping approach, the BDF2 exhibits favorable stability properties [50] and has been used quite successfully to facilitate computations for a variety of practical flow applications, such as those studied by Northrup and Groth [29, 30], Isono and Zingg [51], as well as Tabesh and Zingg [52].

Solution of the modified non-linear system of algebraic equations given by Eq. (19) is again obtained via Newton’s method and requires the solution of the following linear system of equations at each Newton step:

$$\begin{aligned} \Bigg [ \left( \frac{3}{2\Delta t}\right) \mathbf {I} + \left( \frac{\partial \mathbf {R}}{\partial \mathbf {U}}\right) ^{(n+1,m)}\Bigg ]&\Delta \mathbf {U}^{(n+1,m)} \nonumber \\&= \mathbf {J}^*\Delta \mathbf {U}^{(n+1,m)}\nonumber \\&= -\mathbf {R}^* \left( \mathbf {U}^{(n+1,m)}\right) . \end{aligned}$$
(20)

Here, \(\Delta \mathbf {U}^{(n+1,m)}\) is the mth Newton estimate for the solution change at physical time level n. Successively improved estimates for the solution in physical time are given by

$$\begin{aligned} \mathbf {U}^{(n+1,m)} = \mathbf {U}^{(n)} + \Delta \mathbf {U}^{(n+1,m)}. \end{aligned}$$
(21)

In Eq. (20), \(\mathbf {I}\) denotes the identity matrix and \(\mathbf {J}^* = \partial \mathbf {R}^*/\partial \mathbf {U}\) is the Jacobian of the modified residual vector. The physical time step, \(\Delta t\), is determined by considering both the inviscid Courant–Friedrichs–Lewy (CFL) stability condition as well as the viscous von Neumann stability criterion, by means of \(\Delta t = \text {CFL} \cdot \min \left[ \Delta l / (|\vec {u}| + a), \rho \Delta l^2/\mu \right] \) in each coordinate direction, where a is the sound speed. In the dual-time-stepping approach, the iterative procedure is repeated until \(||\mathbf {R}^*(\mathbf {U}^{(n+1,m)})||_2 < \epsilon ||\mathbf {R}^*(\mathbf {U}^{(n)})||_2\), where a value of \(\epsilon \approx 10^{-3}\!-\!10^{-2}\) was found sufficient for the time-accurate computations presented herein.
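The BDF2 dual-time procedure of Eqs. (19) and (20) can be sketched for a scalar model equation. The sketch adopts the sign convention of Eq. (19), that is, \(\mathrm {d}U/\mathrm {d}t + R(U) = 0\), applies each Newton update to the current iterate, and takes the start-up value \(U^{(n-1)}\) from the caller. These choices, the function names, and the scalar setting are assumptions made for illustration. A helper for the time-step formula quoted above is included, evaluated for a single cell length \(\Delta l\).

```python
def physical_time_step(cfl, dl, speed, a, rho, mu):
    """dt = CFL * min(dl / (|u| + a), rho * dl^2 / mu) for one cell length dl."""
    return cfl * min(dl / (speed + a), rho * dl * dl / mu)


def bdf2_dual_time(R, dRdU, U0, U_prev, dt, n_steps, eps=1e-3, max_newton=20):
    """March dU/dt + R(U) = 0 with BDF2, solving Eq. (19) by Newton's method.

    U0 is the state at the current time level and U_prev the state one step
    earlier (supplied by the caller in this sketch).
    """
    Unm1, Un = U_prev, U0
    for _ in range(n_steps):
        def R_star(U):
            # Dual-time residual of Eq. (19)
            return (3.0 * U - 4.0 * Un + Unm1) / (2.0 * dt) + R(U)

        U = Un                              # initial Newton guess: previous time level
        r0 = abs(R_star(U))
        for _ in range(max_newton):
            r = R_star(U)
            if abs(r) <= eps * r0:          # relative drop in the dual-time residual
                break
            J_star = 3.0 / (2.0 * dt) + dRdU(U)   # scalar form of the matrix in Eq. (20)
            U = U - r / J_star
        Unm1, Un = Un, U
    return Un
```

As a usage example, with \(R(U) = U^2\) and \(U(0) = 1\) the exact solution is \(U(t) = 1/(1+t)\). Calling `bdf2_dual_time(lambda U: U * U, lambda U: 2.0 * U, 1.0, 1.0 / (1.0 - 0.01), 0.01, 100)` advances the solution to \(t = 1\), where the result can be compared with the exact value of 0.5.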

6 Parallel implementation

Parallel implementation of the anisotropic block-based AMR finite-volume scheme has been carried out using the C++ programming language and the Message Passing Interface (MPI) library. The block-based AMR and NKS methods are ideally suited to the parallel implementation of the algorithm via block-based domain decomposition in which the grid blocks are distributed among the available processors, with more than one block permitted on each processor core. For homogeneous multi-processor architectures, the self-similar blocks are distributed and treated equally amongst the processors; for heterogeneous systems, a weighted distribution of the blocks is adopted to allocate more blocks to faster processors and fewer blocks to slower ones. In this investigation, all computations were performed using a large-scale, high-performance IBM System x iDataPlex dx360 M2 computational cluster based on Intel’s Nehalem architecture, built using 3780 nodes in total, with two quad-core 2.53 GHz Intel Xeon E5540 Nehalem x86-64 processors and 16 GB of main memory per node. A highly scalable and efficient algorithm results.
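One simple way to realize a weighted distribution of equally sized blocks is a greedy assignment in which each block goes to the processor that would finish it earliest. The sketch below is an assumed strategy for illustration, not the distribution algorithm of the actual implementation; `speeds` holds hypothetical relative processor speeds.

```python
def distribute_blocks(n_blocks, speeds):
    """Return the number of blocks assigned to each processor.

    Each block is given to the processor with the smallest projected
    completion time, (blocks_already_assigned + 1) / speed. Ties go to the
    lowest processor index.
    """
    counts = [0] * len(speeds)
    for _ in range(n_blocks):
        p = min(range(len(speeds)), key=lambda i: (counts[i] + 1) / speeds[i])
        counts[p] += 1
    return counts
```

With equal speeds the blocks are shared evenly, for example `distribute_blocks(8, [1.0, 1.0])` gives `[4, 4]`, whereas a processor that is twice as fast receives twice as many blocks: `distribute_blocks(9, [2.0, 1.0])` gives `[6, 3]`.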

7 Verification and validation

Prior to carrying out the fully resolved oblique shock reflection simulations presented in Sect. 8, an effort was made to first verify and validate the solutions of the proposed parallel, implicit, anisotropic, block-based AMR, finite-volume scheme. The validity of the numerical solutions and mesh resolution requirements for the prediction of steady one-dimensional planar shock structure was first assessed. In particular, resolution requirements for mesh-independent predictions of steady shock structure were explored via comparison to one-dimensional ODE solutions. Additionally, a direct comparison of anisotropic and isotropic block-based AMR strategies was made for oblique shock reflections with under-resolved internal shock structures, in an effort to forecast the anticipated computational savings of the anisotropic AMR approach when applied to the fully resolved case. Lastly, the predictive capabilities of the proposed solution method were assessed for a range of oblique shock reflection problems considered in other previous studies. In particular, the present numerical predictions were compared to the experimental results of Henderson and Gray [53] for the diffraction of strong incident shock waves over rigid concave corners. The latter provides evidence of the validity of the proposed numerical framework in the prediction of unsteady oblique shock reflection processes.

7.1 Mesh resolution study for one-dimensional stationary shock structure

To establish the resolution requirements of the proposed solution method for the prediction of fully resolved shocks, a mesh resolution study was conducted in which the parallel, implicit, finite-volume scheme with AMR was applied to the prediction of the one-dimensional stationary shock structure where the working gas was diatomic nitrogen (N\(_2\)). The predictions of the finite-volume scheme were compared to several ODE results for shock-front transitions [54, 55]. Such comparisons provide a validation of the high-fidelity CFD solution method in terms of its ability to compute accurate and highly resolved internal shock structures.

Stationary solutions for a shock in nitrogen with a shock Mach number of \(M_\mathrm{s} = 1.95\) were considered. A simple rectangular domain was used in which the initial mesh was 10 \(\times \) 10 cell blocks. The shock jump conditions were imposed as initial data and supersonic inflow boundary conditions and subsonic outflow boundary conditions were imposed at the upstream and downstream boundaries, respectively, so as to ensure that the shock remains centered indefinitely within the computational domain. The implicit NKS method described in Sect. 5 for steady flows was used to quickly and efficiently converge the solution of this problem to steady state. Smoothing of the solution on the coarse initial mesh was achieved by performing 10 steps of an explicit multi-stage time-marching scheme with optimal smoothing, at which point a steady-state solution was computed directly on the same mesh using the NKS method with limiter-freezing enabled to assist in solution convergence. Once the solution on the initial grid was fully converged to steady state, the process was repeated following the application of a single level of anisotropic AMR. This process was then successively repeated until additional levels of mesh refinement ceased to affect the variation of flow properties through the shock front. The refinement and coarsening thresholds for the anisotropic refinement were set to values of 0.125 and 0.075, respectively, encouraging refinement of the grid with each additional level of refinement in regions with strong gradients in entropy.

Fig. 3

Mesh resolution study illustrating the smooth but rapid transition of specific entropy and density profiles through a one-dimensional, planar shock wave of strength \(M_\mathrm{s} = 1.95\) in gaseous N\(_2\). The inset diagram highlights the convergence of specific entropy profiles at their maximum peak value within the shock front

The predicted steady-state solutions for the stationary shock are compared in Fig. 3, which illustrates the asymptotic convergence of the predicted profiles of specific entropy and density through the shock wave for 16 through 23 levels of anisotropic AMR. The corresponding convergence histories for the computations on each successively refined mesh are presented in Fig. 4, where it can be seen that the residual is reduced by at least 5 orders of magnitude in an average of approximately 25 GMRES iterations per Newton step with 6–9 Newton iterations on each grid. It is evident from the results of the mesh resolution study presented in Fig. 3 that a total of 23 levels of anisotropic AMR are required to accurately capture and fully resolve the transition of flow properties through the shock front, although as few as 20 levels appear to provide sufficient accuracy for many applications. Note that in the case of 23 levels of refinement, the finest cells in the mesh are more than \(8 \times 10^6\) (or \(2^{23}\)) times smaller than the coarsest cells present in the computational domain. As detailed in the inset of Fig. 3, the predicted peak in the specific entropy profiles through the shock front nearly coincides for the solutions with 22 and 23 levels of anisotropic AMR, signifying a mesh-independent solution has been achieved.
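Because each level of refinement halves the cell size in the refined direction, the ratio of coarsest to finest cell size after a given number of levels is a power of two. The short helper below (a hypothetical name, for illustration) reproduces the arithmetic behind the \(2^{23}\) figure quoted above.

```python
def refinement_ratio(levels):
    """Ratio of coarsest to finest cell size after `levels` binary refinements."""
    return 2 ** levels


# Ratios for the lowest and highest levels considered and the 20-level case
for levels in (16, 20, 23):
    print(levels, refinement_ratio(levels))
```

For 23 levels the ratio is 8,388,608, which is consistent with the statement that the finest cells are more than \(8 \times 10^6\) times smaller than the coarsest ones.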

Fig. 4

Steady-state convergence history with anisotropic AMR, corresponding to the preceding results of the mesh resolution study presented in Fig. 3, for a one-dimensional, planar shock wave of strength \(M_\mathrm{s} =1.95\) in gaseous N\(_2\)

The predicted shock wave thickness obtained using the parallel, implicit, finite-volume scheme with anisotropic AMR was found to be accurate to within 0.07 % of the results obtained from standard ODE solutions [54, 55], when 23 levels of anisotropic AMR were used. In this research, the shock thickness was calculated from the velocity profile according to the method presented by Taylor and Maccoll [56]. It is believed that this minor discrepancy in the computed thickness is largely attributable to small differences in the physical modeling adopted in the present finite-volume and ODE solutions. The good agreement between the ODE solutions and finite-volume predictions provides a strong indication of the mesh densities required for accurately predicting shock structure. A detailed listing that compares the accuracy of the predicted shock thickness results obtained using the anisotropic AMR mesh with 16 through 23 levels of refinement is presented in Table 1. In particular, the convergence of shock properties, including the maximum peak specific entropy value, \(s_{\text {max}}\), as well as the shock thickness, \(\Delta x\), is given as a function of the number of levels of refinement. It is evident from the results of the table that, to ensure recovery of fully resolved, mesh-independent shock transitions in nitrogen under standard atmospheric conditions, the minimum cell sizes, \(\Delta l_{\text {min}}\), must be about \(10^{-9}\) m, or 1 nm. Depending on the strength of the shock wave, this translates to requiring approximately 100–200 cells to reside within the shock transition structure.

Table 1 Overview of the numerical results of the mesh resolution study for a one-dimensional, planar shock wave of strength \(M_\mathrm{s} = 1.95\) in gaseous N\(_2\)

7.2 Computational domain, boundary and initial conditions

For all of the unsteady oblique shock reflection computations that follow, the two-dimensional computational domain used with the proposed finite-volume scheme is illustrated schematically in Fig. 5. The numerical simulations are initiated on a coarse mesh consisting of 2 adjacent \(10 \times 10\) cell blocks. For any given wedge angle, \(\theta _\mathrm{w}\), the non-inclined block of cells has a 0.10 m length in both the x- and y-directions and the inclined block of cells has a length of 0.10 m along the wedge surface, as illustrated in the figure. This particular grid sizing has been chosen to be large enough to closely match the dimensions of conventional experimental shock tube facilities and to contain the entire structure of both the incident and reflected shock waves.

Fig. 5

Grid geometry (\(10 \times 10\) cell blocks) exemplifying initial mesh refinement near the vertically oriented, rightward-moving, incident shock wave and near the shock wave-trailing boundary layer on the lower horizontal surface

The initial conditions for the reflection problem consist of an incident planar shock wave of a given strength propagating rightwards in the positive x-direction through a quiescent region of gas at standard atmospheric conditions.

The region behind (downstream of) the incident shock wave was defined by the Rankine–Hugoniot relations across a shock wave. The boundary conditions governing the fluid flow within this computational domain include a fixed inlet on the left-hand vertical boundary; a constant extrapolation outlet on the right-hand vertical boundary; viscous, isothermal walls on the lower horizontal and inclined boundaries; and an inviscid, adiabatic wall spanning the entire upper horizontal boundary.
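As an illustration of how the post-shock initial state can be obtained, the following minimal sketch evaluates the standard Rankine–Hugoniot relations for a planar shock moving into gas at rest. It assumes a calorically perfect gas with \(\gamma = 1.4\); the function name, the gas constant for N\(_2\), and the quiescent-gas pressure and temperature are illustrative assumptions and are not taken from the solver described here.

```python
import math

GAMMA = 1.4        # assumed ratio of specific heats for N2 (calorically perfect)
R_N2 = 296.8       # assumed specific gas constant for N2, J/(kg K)
P_ATM = 101325.0   # assumed quiescent pressure, Pa
T_ATM = 288.15     # assumed quiescent temperature, K


def moving_shock_state(ms, p1, rho1, gamma=GAMMA):
    """Return (p2, rho2, u2) behind a shock of Mach number ms moving into gas at rest.

    p1 and rho1 are the pressure and density of the quiescent gas ahead of the
    shock; u2 is the gas velocity induced behind the shock in the laboratory frame.
    """
    a1 = math.sqrt(gamma * p1 / rho1)  # sound speed ahead of the shock
    m2 = ms * ms
    p2 = p1 * (1.0 + 2.0 * gamma / (gamma + 1.0) * (m2 - 1.0))
    rho2 = rho1 * (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    u2 = ms * a1 * (1.0 - rho1 / rho2)  # from mass conservation across the shock
    return p2, rho2, u2


if __name__ == "__main__":
    rho_atm = P_ATM / (R_N2 * T_ATM)
    p2, rho2, u2 = moving_shock_state(1.732, P_ATM, rho_atm)
    print(f"p2/p1 = {p2 / P_ATM:.4f}, rho2/rho1 = {rho2 / rho_atm:.4f}, u2 = {u2:.1f} m/s")
```

The state ahead of the shock is left unchanged, so the pair of states returned here and the quiescent state define a piecewise-constant initial condition.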

7.3 Anisotropic versus isotropic AMR

The performance benefits of the anisotropic block-based AMR procedure were characterized herein by assessing the total reductions in mesh size provided using the anisotropic approach, as opposed to the usual isotropic method, for an unsteady oblique shock reflection problem. The particular case examined corresponds to the single Mach reflection flow in N\(_2\) examined previously by Henderson and Gray [53] with an incident shock Mach number of \(M_\mathrm{s} = 1.732\) and a wedge angle of \(\theta _\mathrm{w} = 36.90^{\circ }\). The simulations have been carried out using both AMR strategies (isotropic and anisotropic) with adapted meshes having refinement levels ranging from 7 to 10. A fixed physical time step of \(\Delta t = 1.25 \times 10^{-7}\) s was used. The AMR procedure was applied every 7 physical time steps and criteria based on the gradient of density with refinement and coarsening thresholds of 0.125 and 0.075, respectively, were used throughout the mesh refinement process.
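The role of the two thresholds can be sketched as follows. This is only a hypothetical illustration: it assumes that a refinement indicator, normalized to lie between 0 and 1, is available for each block in each computational coordinate direction, and the function names and the rule used for the isotropic case are assumptions rather than the criteria implemented in the present scheme.

```python
REFINE_THRESHOLD = 0.125   # refinement threshold quoted in the text
COARSEN_THRESHOLD = 0.075  # coarsening threshold quoted in the text


def flag_direction(indicator):
    """Classify one normalized directional indicator (hypothetical measure)."""
    if indicator > REFINE_THRESHOLD:
        return "refine"
    if indicator < COARSEN_THRESHOLD:
        return "coarsen"
    return "hold"  # band between the thresholds avoids refine/coarsen cycling


def flag_block(indicator_xi, indicator_eta, anisotropic=True):
    """Return the (xi, eta) flags for a block.

    With anisotropic AMR each direction is decided independently; in the
    isotropic case (an assumed rule) both directions follow the larger indicator.
    """
    if anisotropic:
        return flag_direction(indicator_xi), flag_direction(indicator_eta)
    flag = flag_direction(max(indicator_xi, indicator_eta))
    return flag, flag


if __name__ == "__main__":
    # A block crossed by a shock aligned with the eta direction: strong
    # variation in xi only.
    print(flag_block(0.5, 0.01, anisotropic=True))
    print(flag_block(0.5, 0.01, anisotropic=False))
```

In this sketch, a block straddling a grid-aligned shock is refined in one direction only under the anisotropic rule, which is the source of the cell-count savings examined in this section.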

The predicted distributions of the density for the single Mach reflection flow obtained using both isotropic and anisotropic AMR methods with refinement levels ranging from 7 to 10 are depicted in Fig. 6. Each of the predicted results is shown at a solution time of \(t=9.34 \times 10^{-5}\) s after the incident shock wave has passed the corner of the wedge and the oblique shock reflection process has ensued. The grid blocks for the refined isotropic and anisotropic AMR meshes are overlaid onto distributions of the density field and the plots reveal the regions of the domain where large density gradients exist and, as a result, the mesh concentrations are highest. The latter correspond to regions near the incident and reflected shocks, Mach stem, viscous shear layers, and thermal boundary layers, as expected. The total number of computational cells, \(N_{\text {cells}}\), as well as the refinement efficiency, \(\eta \), is listed in each case. Here, the refinement efficiency for both isotropic and anisotropic AMR is defined as \(\eta = 1 - N_{\text {cells}}/N_{\text {uniform}}\), where \(N_{\text {uniform}}\) denotes the total number of computational cells that would exist on a uniform, isotropic mesh whose maximum refinement level equals the highest level of refinement in any computational coordinate direction on the current mesh.

Fig. 6

Comparison of isotropic and anisotropic AMR methods for simulating an unsteady single Mach reflection problem (\(M_\mathrm{s} = 1.732\) and \(\theta _\mathrm{w} = 36.90^{\circ }\)) in gaseous N\(_2\) at \(t=9.34 \times 10^{-5}\) s after the initial interaction of the incident shock wave with the wedge corner. a 7 levels of isotropic AMR with \(N_{\text {cells}} = 105,200\) and \(\eta = 0.87158\). b 7 levels of anisotropic AMR with \(N_{\text {cells}} = 47,100\) and \(\eta = 0.94251\). c 8 levels of isotropic AMR with \(N_{\text {cells}} = 218,000\) and \(\eta = 0.93347\). d 8 levels of anisotropic AMR with \(N_{\text {cells}} = 92,000\) and \(\eta = 0.97192\). e 9 levels of isotropic AMR with \(N_{\text {cells}} = 476,300\) and \(\eta = 0.96366\). f 9 levels of anisotropic AMR with \(N_{\text {cells}} = 138,400\) and \(\eta = 0.98944\). g 10 levels of isotropic AMR with \(N_{\text {cells}} = 854,600\) and \(\eta = 0.98370\). h 10 levels of anisotropic AMR with \(N_{\text {cells}} = 189,600\) and \(\eta = 0.99638\)

It is evident from Fig. 6 that the application of the anisotropic AMR scheme for the single Mach reflection pattern of interest provides reductions in the mesh size of up to 78 % when using up to 10 levels of refinement, when compared to the isotropic AMR method, while still achieving the same overall solution accuracy. Assuming that the computational memory and storage requirements scale linearly with the mesh size, this translates to computational savings of a factor of about 4.5 (854,600 versus 189,600 cells at 10 levels). It is estimated that similar or perhaps even slightly higher levels of computational savings could be expected in the simulation of fully resolved, mesh-independent oblique shock reflections, wherein flow features such as shock structures as well as the viscous and thermal boundary layers lend themselves quite naturally to resolution via an anisotropic AMR approach. The results indicate that anisotropic AMR is markedly more effective than its isotropic counterpart when dealing with flows having strong anisotropic features.
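The refinement efficiencies listed in the caption of Fig. 6 can be checked with a short calculation. The sketch below assumes that \(N_{\text {uniform}} = 200 \times 4^{L-1}\) for \(L\) levels, i.e. that the initial mesh of 2 blocks of \(10 \times 10\) cells is counted as the first level and each further level quadruples the cell count. This counting convention is an inference, adopted because it reproduces the tabulated values; the function names are illustrative.

```python
def uniform_cell_count(levels, initial_blocks=2, cells_per_block=100):
    """Cells on a uniform isotropic mesh with the given number of levels.

    Assumes the initial mesh counts as level 1, so each of the remaining
    (levels - 1) refinements multiplies the two-dimensional cell count by 4.
    """
    return initial_blocks * cells_per_block * 4 ** (levels - 1)


def refinement_efficiency(n_cells, levels):
    """eta = 1 - N_cells / N_uniform, as defined in the text."""
    return 1.0 - n_cells / uniform_cell_count(levels)


if __name__ == "__main__":
    # (levels, isotropic cell count, anisotropic cell count) from the Fig. 6 caption
    cases = [(7, 105200, 47100), (8, 218000, 92000),
             (9, 476300, 138400), (10, 854600, 189600)]
    for levels, n_iso, n_aniso in cases:
        eta_iso = refinement_efficiency(n_iso, levels)
        eta_aniso = refinement_efficiency(n_aniso, levels)
        reduction = 1.0 - n_aniso / n_iso  # mesh-size saving of anisotropic AMR
        print(f"L={levels}: eta_iso={eta_iso:.5f}, eta_aniso={eta_aniso:.5f}, "
              f"reduction={100.0 * reduction:.1f} %")
```

The last column gives the relative reduction in mesh size of the anisotropic meshes with respect to the isotropic ones at each level.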

7.4 Comparisons with experiments

The predictive capabilities of the proposed parallel, implicit, finite-volume scheme with AMR have also been assessed herein by comparing numerical predictions to published experimental measurements for several unsteady oblique shock reflection flows. The previous experimental results of Henderson and Gray [53] were again considered pertaining to the diffraction of shocks over rigid concave corners in N\(_2\). In addition to the single Mach reflection case outlined in Sect. 7.3, a relatively simple regular reflection pattern with \(M_\mathrm{s} = 1.721\) and \(\theta _\mathrm{w} = 52.36^{\circ }\), as well as a more complex double Mach reflection configuration with \(M_\mathrm{s} = 2.391\) and \(\theta _\mathrm{w} = 46.17^{\circ }\), were investigated. Numerical predictions for each one of these three flows were performed using 10 levels of anisotropic AMR based on the gradient of density, with refinement and coarsening thresholds set to 0.10 and 0.05, respectively. A CFL number of 0.2 was imposed for each case and anisotropic AMR was applied every 10 physical time steps.

The simulated or numerical schlieren images based on the predicted contours of the density gradient for each of the three oblique reflection cases are illustrated in Fig. 7 and compared to the actual experimental schlieren images. The regular, single Mach, and double Mach reflection computations are shown at physical solution times of \(t=5.13 \times 10^{-5}\), \(6.67 \times 10^{-5}\) and \(4.19 \times 10^{-5}\) s after the incident shock wave has passed the wedge corner, respectively. These times correspond to the incident shock wave having propagated an equal distance of 0.05 m up the surface of the inclined wedge in each case. It is evident that the predicted solutions of the shock reflection process are very similar to those of the experiments in each case. Flow features such as boundary layers and slipstreams, as well as locations of planar incident, curved reflected, and resultant Mach stem shock waves are reproduced with excellent accuracy. Moreover, the good agreement between the numerical results and previous experimental images provides strong evidence of the validity of the proposed algorithm for predicting unsteady, oblique shock reflections in gaseous media.
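The quoted solution times can be related to the 0.05 m travel distance by simple kinematics: a planar incident shock advancing in the x-direction at speed \(M_\mathrm{s} a_1\) sweeps a distance \(s\) along the wedge surface when it has advanced \(s \cos \theta _\mathrm{w}\) in x. The sketch below evaluates this under the assumption of quiescent N\(_2\) at 288.15 K with \(\gamma = 1.4\); these gas properties and the function name are assumptions. Under them, the three quoted times are recovered to within roughly 0.2 %.

```python
import math

GAMMA = 1.4    # assumed ratio of specific heats for N2
R_N2 = 296.8   # assumed specific gas constant for N2, J/(kg K)
T1 = 288.15    # assumed temperature of the quiescent gas, K


def time_up_wedge(ms, theta_w_deg, distance=0.05):
    """Time for the incident shock to sweep `distance` metres along the wedge."""
    a1 = math.sqrt(GAMMA * R_N2 * T1)                       # sound speed ahead of shock
    dx = distance * math.cos(math.radians(theta_w_deg))     # advance in x-direction
    return dx / (ms * a1)


if __name__ == "__main__":
    cases = {"regular": (1.721, 52.36),
             "single Mach": (1.732, 36.90),
             "double Mach": (2.391, 46.17)}
    for name, (ms, theta_w) in cases.items():
        print(f"{name}: t = {time_up_wedge(ms, theta_w):.3e} s")
```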

Fig. 7

Numerical schlieren images (right) replicating the oblique shock reflection configurations studied in the experiments of Henderson and Gray [53] (left) for the diffraction of shock waves in gaseous N\(_2\) over rigid concave corners. Experimental photographs reprinted from Henderson and Gray [53] with permission from Royal Society Publishing. a Regular reflection pattern with \(M_\mathrm{s} = 1.721\) and \(\theta _\mathrm{w} = 52.36^{\circ }\). b Single Mach reflection pattern with \(M_\mathrm{s} = 1.732\) and \(\theta _\mathrm{w} = 36.90^{\circ }\). c Double Mach reflection pattern with \(M_\mathrm{s} = 2.391\) and \(\theta _\mathrm{w} = 46.17^{\circ }\)

8 Numerical solution of fully resolved shocks

To demonstrate the capabilities of the proposed parallel, implicit, anisotropic AMR finite-volume scheme for predicting the physics of unsteady oblique shock reflection processes, a fully resolved simulation of the single Mach reflection configuration studied in the previous section of the paper is considered. The process of choosing an appropriate physical time step for use in the implicit BDF2 approach is first described and, then, the results of the numerical simulation with fully resolved internal shock-front transition profiles are presented and discussed.

8.1 Physical time step selection

For the time-accurate computation of unsteady compressible flows, implicit time-marching schemes such as the BDF2 can provide the opportunity to achieve much larger physical time steps than those permissible with the capabilities of an explicit time-marching method. However, the use of increasingly larger physical time steps tends to progressively degrade the overall accuracy of a numerical solution. As a result, it is important to first ensure that reasonably accurate numerical results can still be attained when sufficiently large physical time steps are used.
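The trade-off between stability and time step size can be illustrated on a scalar model problem. The following sketch applies the standard BDF2 formula, \((3u^{n+1} - 4u^{n} + u^{n-1})/(2\Delta t) = R(u^{n+1})\), and an explicit two-stage Runge–Kutta (Heun) method to \(\mathrm{d}u/\mathrm{d}t = -\lambda u\). The model problem, the backward Euler start-up step, and the function names are assumptions chosen for illustration and do not represent the Navier–Stokes solver described in this paper.

```python
def bdf2_decay(lam, dt, n_steps, u0=1.0):
    """Integrate du/dt = -lam*u with BDF2, using backward Euler for the first step."""
    u_prev = u0
    u_curr = u0 / (1.0 + lam * dt)  # backward Euler start-up step
    for _ in range(n_steps - 1):
        # (3 u_next - 4 u_curr + u_prev) / (2 dt) = -lam * u_next, solved for u_next
        u_next = (4.0 * u_curr - u_prev) / (3.0 + 2.0 * lam * dt)
        u_prev, u_curr = u_curr, u_next
    return u_curr


def rk2_decay(lam, dt, n_steps, u0=1.0):
    """Integrate du/dt = -lam*u with the explicit two-stage Heun (RK2) method."""
    u = u0
    for _ in range(n_steps):
        k1 = -lam * u
        k2 = -lam * (u + dt * k1)
        u = u + 0.5 * dt * (k1 + k2)
    return u


if __name__ == "__main__":
    # lam*dt = 10 lies well outside the RK2 stability region but BDF2 remains stable.
    print("BDF2, large step:", bdf2_decay(1000.0, 0.01, 10))
    print("RK2,  large step:", rk2_decay(1000.0, 0.01, 10))
    # With a small step both methods approximate exp(-1).
    print("BDF2, small step:", bdf2_decay(1.0, 0.001, 1000))
    print("RK2,  small step:", rk2_decay(1.0, 0.001, 1000))
```

With the large step the explicit solution grows without bound while the BDF2 solution decays, although the BDF2 result is no longer accurate in time, which mirrors the accuracy concern raised above.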

To examine the effects of time step selection on solution accuracy, an unsteady shock transition problem was studied, consisting of a one-dimensional planar shock wave with a strength of \(M_\mathrm{s} = 1.732\) propagating in the positive x-direction through an N\(_2\)-filled shock tube with straight walls (\(\theta _\mathrm{w} = 0^{\circ }\)). Numerical simulations were carried out using a variety of fixed physical time steps with the BDF2 scheme and the solutions for the propagating shock were determined for a total solution time of \(4.0\times 10^{-9}\) s. Based on the mesh density study of Sect. 7.1, 20 levels of anisotropic AMR were employed, which was deemed sufficient to capture the full transition of flow properties through the shock. For comparative purposes, similar results were also obtained using a standard, explicit second-order Runge–Kutta (RK2) time-marching method. The results obtained using each time-marching method were compared to a reference solution that was obtained using a similarly dense mesh and the explicit, fourth-order Runge–Kutta (RK4) time-marching method with a small time step. The solution error was estimated as the root-mean-square difference between the predicted temporal history of density at a selected point of interest and that of the reference solution, as defined by

$$\begin{aligned} \bar{\rho }_{\text {error}} = \sqrt{\frac{1}{N_t} \sum _{i=1}^{N_t} \left( \rho _i - \rho _{i,\text {ref}}\right) ^2} \end{aligned}$$
(22)

where \(\rho _i\) and \(\rho _{i,\text {ref}}\) represent the densities of the predicted solution and of the reference RK4 solution, respectively, at physical time step i, and \(N_t\) denotes the total number of physical time steps taken to reach the final time. The reference solution was assumed to contain negligible error as it was calculated with a very small fixed physical time step.
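A minimal sketch of the error measure of Eq. (22) is given below. It assumes that the predicted and reference density histories have already been sampled at the same \(N_t\) time levels (for example by interpolating the reference solution); the function name and the sample values are hypothetical.

```python
import math


def rms_density_error(rho, rho_ref):
    """Root-mean-square difference between two density histories, Eq. (22).

    Both sequences must hold the density at the same N_t physical time levels.
    """
    if len(rho) != len(rho_ref) or len(rho) == 0:
        raise ValueError("histories must be non-empty and of equal length")
    squared = sum((r - r_ref) ** 2 for r, r_ref in zip(rho, rho_ref))
    return math.sqrt(squared / len(rho))


if __name__ == "__main__":
    # Hypothetical density samples in kg/m^3, for illustration only.
    predicted = [1.18, 1.50, 2.30, 2.66]
    reference = [1.18, 1.55, 2.35, 2.66]
    print(f"rms error = {rms_density_error(predicted, reference):.4f} kg/m^3")
```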

The predicted temporal variations in the density at a selected point of interest produced by the passage of the shock, as obtained using the implicit BDF2 scheme with various fixed physical time steps, are compared in Fig. 8a to the reference solution obtained using the explicit RK4 time-marching method with a very small time step. While it is evident that the BDF2 scheme allows the use of large physical time steps, the results shown in the figure indicate that the time step must not be so large as to corrupt the numerical accuracy of the predicted solution. From Fig. 8a, it appears that a physical time step of \(\Delta t = 5 \times 10^{-11}\) s provides the BDF2 scheme with a reasonably large time step with a minimal loss in accuracy, in comparison to the reference RK4 solution, when 20 levels of anisotropic AMR are used. Moreover, the inviscid CFL number is not exceedingly large for this value of the physical time step and falls in the range from 2 to 3 for all cells in the computational domain.

Fig. 8

An evaluation of second-order time-marching methods for use in the computation of unsteady shock wave problems with fully resolved internal structures. a Temporal variation of density due to the passage of a shock wave of strength \(M_\mathrm{s} = 1.732\) in N\(_2\) gas, captured using various fixed physical time steps for the fully implicit BDF2 approach. b A comparison of the computational efficiency for the fully implicit BDF2 scheme and the explicit RK2 method, for a variety of different fixed physical time steps

The computational savings provided by employing the fully implicit BDF2 approach over the conditionally stable RK2 method are summarized in Fig. 8b. The figure depicts the estimated error in the predicted temporal variation of the density, as given by Eq. (22), as a function of the total computational or central processing unit (CPU) time required to obtain the propagating shock solution for several different values of the physical time step. From this plot, it can be seen that computational savings of 42 % are provided using the BDF2 scheme with \(\Delta t = 5 \times 10^{-11}\) s, in comparison to using the largest possible fixed physical time step that is stable with the RK2 approach, without significantly compromising the global accuracy of the solution. Overall, the slight loss of accuracy incurred in using the fully implicit BDF2 scheme is thought to be outweighed by the computational savings afforded by the use of a larger time step. Note that if a greater level of mesh adaptation were considered, the savings offered by the implicit treatment are expected to be even more significant for, as the mesh spacing becomes smaller, the more restrictive Neumann or diffusive stability limit generally dominates and dictates the time step selection for explicit time-marching schemes. Hence, the implicit BDF2 approach is proposed as a preferred method for the simulation of oblique shock reflection problems with fully resolved internal shock structures.

8.2 Computation of a fully resolved single Mach reflection

Numerical predictions of the unsteady single Mach reflection pattern in N\(_2\) gas with \(M_\mathrm{s}=1.732\) and \(\theta _\mathrm{w} = 36.90^{\circ }\) were obtained using the anisotropic AMR technique alongside the fully implicit BDF2 time-marching method. The simulation was first initiated on a coarse mesh using 2 adjacent \(10 \times 10\) cell blocks and then started on an ensuing refined mesh following 20 levels of initial anisotropic AMR. Unsteady AMR was carried out once every 7 physical time steps, wherein refinement and coarsening were governed by thresholds of 0.125 and 0.075, respectively. This led to the refinement of the mesh with each additional level of anisotropic AMR in regions where strong density and entropy gradients existed within the flow field. The fixed physical time step used in the time-marching of the solution was \(\Delta t = 5 \times 10^{-11}\) s, as per the findings presented in Sect. 8.1.

The numerical results of the fully resolved simulation are illustrated in Fig. 9 at a solution time of \(t = 5.14 \times 10^{-7}\) s after the initial reflection of the incident shock wave from the wedge surface. At this time, a total of 39,106 \(10 \times 10\) cell blocks with 3,910,600 computational cells are present in the simulation with 20 levels of anisotropic AMR, yielding a refinement efficiency of 0.99999993. The internal structures of the incident, reflected and Mach stem shock waves are well resolved. Furthermore, as with the under-resolved solution presented earlier, the predicted fully resolved solution obtained using the proposed parallel implicit anisotropic block-based AMR scheme agrees very well with the experimental images of Henderson and Gray [53].

Fig. 9

Predicted density contours with overlaid \(10 \times 10\) cell blocks of an unsteady single Mach reflection problem (\(M_\mathrm{s} = 1.732\) and \(\theta _\mathrm{w} = 36.90^{\circ }\)) in gaseous N\(_2\) at \(t = 5.14 \times 10^{-7}\) s after the incident shock wave strikes the corner of the wedge. The internal structures of the incident, reflected and Mach stem shock waves are fully resolved; the inset diagrams illustrate the smooth but rapid transitions of specific entropy and density measured along the dashed distances \(x'_\text {i}\), \(x'_\text {r}\) and \(x'_\text {m}\), aligned in the directions normal to each of these respective shock waves

Note that the application of the anisotropic AMR scheme was found to provide a reduction of 78 % in the mesh size when compared to the isotropic AMR case, while still achieving the same solution accuracy. Additionally, a further 42 % in computational savings was attained by exploiting the ability of the proposed implicit BDF2 scheme to take a larger time step than that permissible with the conditionally stable explicit RK2 time-marching method without significantly compromising accuracy. It then follows that the overall speedup provided by the combination of the anisotropic AMR and parallel fully implicit BDF2 time-marching methods for this fully resolved case is estimated to be a factor of more than \(10^{10}\), when using 256 cores and assuming a 50 % parallel efficiency due to latency and interprocessor communication, compared to a serial computation performed using the explicit RK2 scheme on a uniform mesh with a resolution equal to that of the finest mesh blocks.

9 Concluding remarks

A parallel, fully implicit, anisotropic block-based AMR finite-volume scheme has been described for solving two-dimensional, laminar, compressible, viscous, unsteady, gaseous flows governed by the Navier–Stokes equations. The combination of anisotropic AMR and parallel implicit time-marching techniques adopted in the proposed method has been shown to readily enable time-accurate numerical simulations of complex multi-shock interaction phenomena, as represented by unsteady oblique shock reflection processes with fully resolved internal shock structures.